IPRSE_parse: Parse a text string against a grammar
IPRSE_parse matches an input string against an input grammar and produces a structure that pairs elements of the grammar with the corresponding elements of the input string. It is intended primarily for parsing z/TPF commands in real-time segments written in the C language.
IPRSE_parse returns the parsed parameters and values through a pointer to a struct IPRSE_output, declared in the tpfparse.h header.
Format
#include <tpf/tpfparse.h>
int IPRSE_parse(char *string, const char *grammar,
struct IPRSE_output *result, int options,
const char *errheader);
- string
- The input string, which must be a standard C string terminated by a zero byte ('\0') or by an EOM character if the IPRSE_EOM option is specified. If the IPRSE_EOM option is specified, the maximum length of the string is 4095 characters.
- grammar
- The grammar describing acceptable input strings. The grammar must end in a zero byte ('\0').
- result
- The tokenized parameter list in the following form:
- result.IPRSE_parameter
- The parameter name as specified by the grammar
- result.IPRSE_value
- The value of the parameter as specified by the input string. Note: This value is translated to uppercase when the IPRSE_MIXED_CASE option is specified and the corresponding grammar parameter ends with a less-than sign (<).
- result.IPRSE_next
- The pointer to the next entry in the output parameter list.
- options
- The following options control error messages, storage allocation for the output structure, and end-of-message handling:
- IPRSE_PRINT
- Print all error messages.
- IPRSE_NOPRINT
- Suppress all error messages. This is the default if IPRSE_PRINT is not specified.
- IPRSE_ALLOC
- Obtain storage dynamically for the output structure. Always code this option.
- IPRSE_NOALLOC
- This is the default if IPRSE_ALLOC is not specified. Do not use IPRSE_NOALLOC except to facilitate migration of old code that uses the IPRSE_bldprstr function to initialize preallocated storage. Using the IPRSE_ALLOC option is both more efficient and less likely to cause errors.
- IPRSE_EOM
- If the input string ends with an EOM character (+) instead of an EOS character ('\0'), the IPRSE_parse function replaces the EOM character with an EOS character. Because the first EOM character in the input string is the one that is replaced, the input string cannot contain any other '+' characters when the IPRSE_EOM option is specified. The EOM character must be in the first 4095 characters of the input string.
- IPRSE_NOEOM
- This is the default if IPRSE_EOM is not specified. The input string must end with an EOS character ('\0').
The following four options control how the IPRSE_parse function parses the input string. All four of these options can also be specified at the beginning of the grammar (see Specifying parser options in the grammar). Options specified in the grammar parameter override options specified in the options parameter:
- IPRSE_STRICT
- Accept only spaces as token separators, and only dashes (-) as separators between keywords and values in the input string. Use this option when the input parameters can contain commas (,), slashes (/), or equal signs (=).
- IPRSE_NOSTRICT
- Accept spaces, commas (,) or slashes (/) as token separators, and dashes (-) or equal signs (=) as separators between keywords and values. This is the default if IPRSE_STRICT is not specified.
- IPRSE_MIXED_CASE
- Accept lowercase or uppercase letters in the input string.
- IPRSE_NOMIXED_CASE
- Accept only uppercase letters in the input string. This is the default if IPRSE_MIXED_CASE is not specified.
- IPRSE_QUOTE
- Use the dollar sign ($) or single quote (') character to delimit quoted parameters. The returned value is the contents of the quoted parameter without the delimiters. To include the delimiter character in the value, double it; for example, 2 single quotes ('') return 1 single quote (').
- IPRSE_NOQUOTE
- This is the default if IPRSE_QUOTE is not specified. The IPRSE_NOQUOTE option specifies that the dollar sign ($) and single quote (') characters have no special meaning.
Multiple options can be ORed together; for example, IPRSE_ALLOC | IPRSE_PRINT.
- errheader
- A string that identifies the program calling the parser. This string is printed out as part of the error message text if the IPRSE_PRINT option is specified.
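The IPRSE_EOM handling described above can be pictured with a small portable sketch. This is an illustration only, not the z/TPF implementation: the function name replace_eom and the EOM_CHAR and EOM_MAX_LEN constants are hypothetical, and only the '+' character and the 4095-character limit come from the text. How the real parser treats a '\0' that appears before any EOM character is an assumption here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names for illustration; they are not part of tpfparse.h. */
#define EOM_CHAR    '+'   /* EOM character as described in the text       */
#define EOM_MAX_LEN 4095  /* EOM must appear within this many characters  */

/*
 * Replace the first EOM character in buf with '\0'.
 * Returns the length of the resulting string, or -1 if no EOM character
 * is found within the first EOM_MAX_LEN characters (assumption: a '\0'
 * that appears before any EOM character is also treated as "not found").
 */
int replace_eom(char *buf)
{
    size_t i;

    assert(buf != NULL);
    for (i = 0; i < EOM_MAX_LEN; i++) {
        if (buf[i] == EOM_CHAR) {
            buf[i] = '\0';      /* the first '+' becomes the terminator */
            return (int)i;
        }
        if (buf[i] == '\0') {
            return -1;          /* string ended without an EOM character */
        }
    }
    return -1;                  /* EOM not within the allowed range */
}
```

Because the first '+' ends the string, any later text is cut off, which is why the input cannot contain other '+' characters when IPRSE_EOM is specified.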
Normal return
IPRSE_parse returns the number of parameters that have been parsed and put in the result structure. For example, a return code of 3 would mean that 3 parameters were parsed from an input string that contained 3 parameters.
Error return
- Error in Input String
- Return Codes
- -1
- The input string is a question mark (?) or HELP. This return code is represented by the symbolic IPRSE_HELP.
- 0
- The input string does not meet the requirements of the grammar. This return code is represented by the symbolic IPRSE_BAD.
- Error Messages
- The IPRSE_parse function issues the following messages. In each message, cccc represents the errheader parameter, which is printed after the message header and shows which function or program was calling IPRSE_parse when the error occurred. All messages are sent by the wtopc function without chaining.
- PRSE0001E
- cccc - TOO MANY PARAMETERS ENTERED
- PRSE0004E
- cccc - INVALID USE OF PERIOD
- PRSE0005E
- cccc - INVALID ALPHANUMERIC CHARACTER
- PRSE0006E
- cccc - INVALID DECIMAL CHARACTER
- PRSE0007E
- cccc - INVALID CHARACTER
- PRSE0008E
- cccc - INVALID HEXADECIMAL CHARACTER
- PRSE0009E
- cccc - MANDATORY PARAMETER NOT GIVEN
If the system can determine the last parameter in error, the message indicates the last parameter in error by adding the PARAMETER IN ERROR IS text and the parameter value.
- PRSE0011E
- cccc - INVALID INPUT PARAMETER
If the system can determine the last valid parameter, the message indicates the last valid parameter by adding text that states LAST VALID PARAMETER IS and the parameter value. If a keyword was found to be in error, text will be added that states ERROR IN KEYWORD and the keyword.
- PRSE0014E
- cccc - TOO MANY CHARACTERS ENTERED
- PRSE0015E
- cccc - TOO FEW CHARACTERS ENTERED FOR PARAMETER
- Error in Grammar
- 00006F system error messages are displayed on the console or in a dump when the grammar syntax is in error.
- 0007B system error messages occur when the parser is unable to obtain needed heap storage.
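The return values described above (a positive count, 0, or -1) can be sorted into three cases by the caller. The following is a minimal sketch under stated assumptions: the enum parse_outcome and the function classify_parse_rc are hypothetical names, and the IPRSE_HELP and IPRSE_BAD definitions are stand-ins that use the values listed above so that the sketch compiles without tpfparse.h.

```c
#include <assert.h>

/* Stand-ins for the symbolics named in the text. On z/TPF they come  */
/* from <tpf/tpfparse.h>; the values follow the return codes above.   */
#ifndef IPRSE_HELP
#define IPRSE_HELP (-1)
#endif
#ifndef IPRSE_BAD
#define IPRSE_BAD 0
#endif

/* Hypothetical classification used only in this sketch. */
enum parse_outcome { PARSE_OK, PARSE_HELP_REQUESTED, PARSE_REJECTED };

/* Map an IPRSE_parse return value onto one of the three documented cases. */
enum parse_outcome classify_parse_rc(int rc)
{
    assert(rc >= IPRSE_HELP);   /* no return value below -1 is documented */

    if (rc > 0) {
        return PARSE_OK;              /* rc parameters were parsed        */
    }
    if (rc == IPRSE_HELP) {
        return PARSE_HELP_REQUESTED;  /* input was ? or HELP              */
    }
    return PARSE_REJECTED;            /* IPRSE_BAD: input did not match   */
}
```

A caller would typically walk the result list only in the PARSE_OK case, issue its own help text for PARSE_HELP_REQUESTED, and do nothing further for PARSE_REJECTED when IPRSE_PRINT is set, because the parser has already issued a message.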
Programming considerations
- Always code the IPRSE_ALLOC option. The IPRSE_NOALLOC option and the IPRSE_bldprstr function are supported only for code that was written before the IPRSE_ALLOC option was available.
- For information on creating a grammar, see Defining a grammar.
Examples
A series of examples follows, the first of which is a complete program for creating and parsing with a grammar. All of the other examples show a grammar, its input string, and the IPRSE_output structure.
The number of parameters found is returned if the string complies with the grammar conventions. See Defining a grammar for additional information.

Example 1: Coding example for grammar and parser
- Parses input using a specific grammar (IPRSE_parse)
- Uses the parsed output (process_parm, defined in the example).
/*====================================================================*/
/* This example shows a segment that parses */
/* a message in MI0MI format on data level D0. */
/* This code example includes calls to: */
/* - parse input using a specific grammar (IPRSE_parse) */
/* - use the parsed output (process_parm, defined in this segment) */
/*====================================================================*/
#include <tpf/tpfeq.h>
#include <tpf/tpfapi.h>
#include <string.h>
#include <stdlib.h>
#include <tpf/tpfparse.h>
/*--------------------------------------------------------------------*/
/* Define the grammar for the command handled by this segment, where:*/
/* Positional is a positional parameter. */
/* d+++ is a positional parameter that represents 1 to 4 digits. */
/* a.a is a positional parameter that represents a regular list of */
/* alphanumeric characters (character type a). */
/* (xx)* is an optional positional parameter that represents a */
/* wildcard list of hexadecimal digits (character type x). */
/* (NO)SELFdef is a self-defining keyword that returns a Y (yes */
/* value) or N (no value). */
/* Key-w is an optional regular keyword parameter that can have */
/* an alphanumeric value (character type w). */
/* List-cc.cc is an optional regular keyword parameter that can */
/* have a value that consists of a regular list of uppercase */
/* letters (character type c). */
/*--------------------------------------------------------------------*/
#define XMP_GRAMMAR "{ Positional " \
"| d+++ a.a [(xx)*] " \
"| (NO)SELFdef [Key-w List-cc.cc] " \
"}"
/*--------------------------------------------------------------------*/
/* Declare an interface to functions that will process the parsed */
/* command parameters. */
/*--------------------------------------------------------------------*/
enum parm1_type { POSITIONAL_NOT_SPECIFIED, POSITIONAL_SPECIFIED };
enum parm5_type { SELFDEF_NOT_SPECIFIED, SELFDEF_NO, SELFDEF_YES };
struct xmp_interface
{
enum parm1_type parm1_value; /* "Positional" */
int parm2_value; /* "d+++" */
char *parm3_first; /* first "a" */
char *parm3_second; /* second "a" */
char *parm4_string; /* "(xx)*" */
enum parm5_type parm5_value; /* "(NO)SELFdef" */
char parm6_value; /* "Key-w" */
char *parm7_first; /* first "cc" */
char *parm7_second; /* second "cc" */
};
#define XMP_DEFAULTS { POSITIONAL_NOT_SPECIFIED, -1, NULL, NULL, NULL, \
SELFDEF_NOT_SPECIFIED, '\0', NULL, NULL }
/*--------------------------------------------------------------------*/
/* Declare internal function called by this segment. */
/*--------------------------------------------------------------------*/
static void process_parm(struct xmp_interface *xi , char *p, char *v);
/**********************************************************************/
/* Function ____ completes the parsing of the "Zxxxx" functional */
/* message contained in the core block on data level D0. */
/**********************************************************************/
void ____(void)
{
/*--------------------------------------------------------------------*/
/* Define variables for accessing the command text in the */
/* core block on D0. */
/*--------------------------------------------------------------------*/
struct mi0mi *block_ptr; /* pointer to core block */
char *input_ptr; /* pointer to message text */
char *eom_ptr; /* pointer to _EOM character */
/* (to be replaced by '\0') */
/*--------------------------------------------------------------------*/
/* Define variables for the parser results. */
/*--------------------------------------------------------------------*/
struct IPRSE_output parse_results;
int num_parms; /* For saving the IPRSE_parse */
/* return code. */
/*--------------------------------------------------------------------*/
/* Define a moving pointer for traversing the parse results, a wtopc */
/* header for the help message, and an interface variable for the */
/* parsed parameter values. */
/*--------------------------------------------------------------------*/
struct IPRSE_output *pr_ptr;
struct wtopc_header msg_header;
struct xmp_interface parm_values = XMP_DEFAULTS;
/*--------------------------------------------------------------------*/
/* Access the command block on level D0, point to the */
/* beginning of the parameters by skipping over "Zxxxx", and replace */
/* _EOM with '\0'. */
/*--------------------------------------------------------------------*/
block_ptr = ecbptr()->ce1cr0;
input_ptr = block_ptr->mi0acc + strlen("Zxxxx");
eom_ptr = (char *)&block_ptr->mi0ln0 + block_ptr->mi0cct - 1;
*eom_ptr = '\0';
/*--------------------------------------------------------------------*/
/* Call the parser. */
/*--------------------------------------------------------------------*/
num_parms = IPRSE_parse(input_ptr, XMP_GRAMMAR, &parse_results,
IPRSE_ALLOC | IPRSE_PRINT, "cpp_tppc_test");
/*--------------------------------------------------------------------*/
/* Check if the command meets the grammar's requirements. */
/*--------------------------------------------------------------------*/
if (num_parms > 0) /* The parse was successful; num_parms */
/* parameters from the command */
/* matched parameters specified in the */
/* grammar (XMP_GRAMMAR). */
{
pr_ptr = &parse_results; /* point to the first result */
do
{
process_parm(&parm_values, pr_ptr->IPRSE_parameter,
pr_ptr->IPRSE_value);
pr_ptr = pr_ptr->IPRSE_next;
} while (--num_parms);
/* call additional functions to further process the input */
}
else
if (num_parms == IPRSE_HELP)
{
wtopc_insert_header(&msg_header, "cpp_tppc_test", 99, 'I',
WTOPC_SYS_TIME);
wtopc("EXAMPLE HELP MESSAGE", 0, WTOPC_NO_CHAIN,
&msg_header);
}
else ; /* IPRSE_parse has already written an error message. */
exit(0); /* Command processing is completed. */
}
/**********************************************************************/
/* Function process_parm sets the appropriate interface variable */
/* field to the value corresponding to the matched parameter. */
/**********************************************************************/
static void process_parm(struct xmp_interface *xi , char *p, char *v)
{
/*--------------------------------------------------------------------*/
/* Define the value that strcmp returns when the two strings passed */
/* to it are equal. */
/*--------------------------------------------------------------------*/
#define STRCMP_EQUAL 0
/*--------------------------------------------------------------------*/
/* Define a local variable to point to the dot in a list. */
/*--------------------------------------------------------------------*/
char *dot_ptr;
/*--------------------------------------------------------------------*/
/* Determine which parameter was matched and set up the appropriate */
/* interface field(s) with the matching values. */
/*--------------------------------------------------------------------*/
if (strcmp(p, "Positional") == STRCMP_EQUAL)
{
xi->parm1_value = POSITIONAL_SPECIFIED;
}
else if (strcmp(p, "d+++") == STRCMP_EQUAL)
{
xi->parm2_value = atoi(v);
}
else if (strcmp(p, "a.a") == STRCMP_EQUAL)
{
dot_ptr = strchr(v, '.'); /* Point to the dot separating */
/* the two list sub-parameters. */
*dot_ptr = '\0'; /* Divide the list parameter */
/* into two sub-strings. */
xi->parm3_first = v;
xi->parm3_second = dot_ptr + 1;
}
else if (strcmp(p, "xx") == STRCMP_EQUAL)
{
xi->parm4_string = v;
}
else if (strcmp(p, "(NO)SELFdef") == STRCMP_EQUAL)
{
xi->parm5_value =
*v == 'Y' ? SELFDEF_YES : SELFDEF_NO;
}
else if (strcmp(p, "Key-w") == STRCMP_EQUAL)
{
xi->parm6_value = *v;
}
else if (strcmp(p, "List-cc.cc") == STRCMP_EQUAL)
{
dot_ptr = strchr(v, '.');
*dot_ptr = '\0';
xi->parm7_first = v;
xi->parm7_second = dot_ptr + 1;
}
return;
}

Given the grammar:
"{ Positional | d+++ a.a [(xx)*]
| (NO)SELFdef [Key-w List-cc.cc]}"
And the input string:
NOSELF K-A L-DD.AA
The string meets the grammar requirements. IPRSE_parse returns a return code of 3 because it parsed 3 parameters from the input string. The results in the arrays are shown as follows (result 0 is the first parameter parsed, and so on):
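As a rough illustration of how a caller might read such a result list, the sketch below looks up a value by its grammar parameter name. It is portable C, not z/TPF code: struct parse_entry is a stand-in that carries only the three fields named in this section (the real struct IPRSE_output in tpfparse.h may differ), and find_value and build_sample are hypothetical helpers. The three entries that build_sample creates are an assumption inferred from the parameter names and values that process_parm in Example 1 expects; they are not captured parser output.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for struct IPRSE_output: only the fields named in the text. */
struct parse_entry {
    const char *IPRSE_parameter;     /* parameter name from the grammar  */
    const char *IPRSE_value;         /* value taken from the input       */
    struct parse_entry *IPRSE_next;  /* next entry in the list           */
};

/* Return the value for grammar parameter `name` among the first        */
/* `count` entries, or NULL if that parameter was not parsed.           */
const char *find_value(const struct parse_entry *head, int count,
                       const char *name)
{
    const struct parse_entry *e = head;

    assert(name != NULL);
    while (count-- > 0 && e != NULL) {
        if (strcmp(e->IPRSE_parameter, name) == 0) {
            return e->IPRSE_value;
        }
        e = e->IPRSE_next;
    }
    return NULL;
}

/* Assumed results for the input "NOSELF K-A L-DD.AA" (not parser output). */
void build_sample(struct parse_entry r[3])
{
    r[0].IPRSE_parameter = "(NO)SELFdef";
    r[0].IPRSE_value     = "N";
    r[0].IPRSE_next      = &r[1];
    r[1].IPRSE_parameter = "Key-w";
    r[1].IPRSE_value     = "A";
    r[1].IPRSE_next      = &r[2];
    r[2].IPRSE_parameter = "List-cc.cc";
    r[2].IPRSE_value     = "DD.AA";
    r[2].IPRSE_next      = NULL;
}
```

Bounding the walk by the returned count, as Example 1 does, avoids relying on how the last IPRSE_next pointer is set.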

char *grammar = "cccc PRoc-ccxx [IS-d ]";
char *string = "DDDD PRO-AB23";

char *grammar = "cccc PR {Deac | Reac}";
char *string = "DDDD PR DEA ";

char *grammar = "cccc (wwww)* ";
char *string = "DDDD AAAB.CC* ";

char *grammar = "(NO)RETURNcode";
char *string = "NORETURN";

char *grammar = IPRSE_QUOTE_GRAMMAR "u*";
char *string = "'MY QUOTED INPUT WITH SPACES'";

char *grammar = IPRSE_QUOTE_GRAMMAR "u*";
char *string = "$MY 'QUOTED' INPUT WITH SPACES$";
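The quoting rule described for the IPRSE_QUOTE option (drop the outer delimiters and treat a doubled delimiter as one literal character) can be modeled with a short portable function. This is a sketch of the rule as stated in this section, not the parser's own code: the function name unquote and its error conventions are hypothetical, and it assumes that the whole input is a single quoted token, as in the two examples above.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Copy the contents of a quoted token into out, dropping the outer
 * delimiter ($ or ') and collapsing each doubled delimiter into one.
 * Returns the number of characters written (excluding the '\0'), or -1
 * if in is not a well-formed quoted token or out is too small.
 */
int unquote(const char *in, char *out, size_t out_size)
{
    char delim;
    size_t n = 0;
    const char *p;

    assert(in != NULL && out != NULL);
    delim = in[0];
    if (delim != '\'' && delim != '$') {
        return -1;                      /* token is not quoted */
    }
    for (p = in + 1; *p != '\0'; p++) {
        if (*p == delim) {
            if (p[1] == delim) {
                p++;                    /* doubled delimiter: keep one */
            } else {
                if (p[1] != '\0' || n >= out_size) {
                    return -1;          /* text after the closing delimiter, or no room */
                }
                out[n] = '\0';
                return (int)n;          /* closing delimiter reached */
            }
        }
        if (n + 1 >= out_size) {
            return -1;                  /* leave room for the terminator */
        }
        out[n++] = *p;
    }
    return -1;                          /* no closing delimiter found */
}
```

Under this model the first input above yields MY QUOTED INPUT WITH SPACES, and the second yields MY 'QUOTED' INPUT WITH SPACES, because a single quote has no special meaning inside a token delimited by dollar signs.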