Using the yacc grammar file
A yacc grammar file consists of the following sections:
- Declarations
- Rules
- Programs
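Two adjacent percent signs (%%) separate the sections. To make the file easier to read, put each %% on a line by itself. A complete grammar file has the following layout:
declarations
%%
rules
%%
programs
The declarations section can be empty. If you omit the programs section, also omit the second %%. The smallest yacc grammar file is therefore:
%%
rules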
The yacc command ignores blanks, tabs, and new line characters in the grammar file. Therefore, use these characters to make the grammar file easier to read. Do not, however, use blanks, tabs or new line characters in names or reserved symbols.
Using comments
To explain what the program is doing, put comments in the grammar file. You can put comments anywhere in the grammar file that you can put a name. However, to make the file easier to read, put the comments on lines by themselves at the beginning of functional blocks of rules. A comment in a yacc grammar file looks the same as a comment in a C language program. The comment is enclosed between /* (slash, asterisk) and */ (asterisk, slash). For example:
/* This is a comment on a line by itself. */
Using literal strings
A literal string is one or more characters enclosed in '' (single quotes). As in the C language, the \ (backslash) is an escape character within literals, and all the C language escape codes are recognized. Thus, the yacc command accepts the symbols in the following table:
Symbol | Definition |
---|---|
'\a' | Alert |
'\b' | Backspace |
'\f' | Form-feed |
'\n' | New line |
'\r' | Return |
'\t' | Tab |
'\v' | Vertical tab |
'\'' | Single quote (') |
'\"' | Double quote (") |
'\?' | Question mark (?) |
'\\' | Backslash (\) |
'\Digits' | The character whose encoding is represented by the one-, two-, or three-digit octal integer specified by the Digits string. |
'\xDigits' | The character whose encoding is represented by the sequence of hexadecimal characters specified by the Digits string. |
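Because these are the C language escape codes, a C compiler can confirm which character a given literal denotes. The following is a minimal sketch and not part of the yacc command: the function name escapes_agree is hypothetical, and the comparisons assume an ASCII character set.

```c
/* Hypothetical helper, for illustration only.
 * Returns 1 if each escaped literal denotes the expected character.
 * All comparisons assume an ASCII character set. */
int escapes_agree(void)
{
    return '\101' == 'A'     /* three-digit octal form of A */
        && '\x41' == 'A'     /* hexadecimal form of A */
        && '\12'  == '\n'    /* two-digit octal form of new line */
        && '\7'   == '\a';   /* one-digit octal form of alert */
}
```

On an ASCII system, calling escapes_agree() returns 1, which shows that the octal and hexadecimal forms are alternative spellings of the same characters.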
Because its ASCII code is zero, the null character ('\0') must not be used in grammar rules. The yylex subroutine returns 0 to signify end of input, so the parser cannot tell a null character token apart from the end of the input.
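The end-of-input convention can be sketched as a small hand-written yylex subroutine. This is an illustration under stated assumptions, not the lexical analyzer of any particular program: the input_pos variable and the set_input helper are hypothetical, and the input is read from a C string rather than a file so that the end of input coincides with the string's terminating null character.

```c
/* Hypothetical input position; a real lexical analyzer usually reads a file. */
static const char *input_pos = "";

/* Hypothetical helper that points the analyzer at a new string. */
void set_input(const char *text)
{
    input_pos = text;
}

/* Returns the next character as its token number, skipping blanks and tabs.
 * Returns 0 when the string is exhausted, which the parser treats as
 * end of input. */
int yylex(void)
{
    while (*input_pos == ' ' || *input_pos == '\t')
        input_pos++;
    if (*input_pos == '\0')
        return 0;                          /* end of input */
    return (unsigned char)*input_pos++;    /* single-character token */
}
```

After set_input("a b"), successive calls to yylex return 'a', then 'b', then 0. Every later call also returns 0, so a token whose number is 0 could never be told apart from the end of the input.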
Formatting the grammar file
The following guidelines help make the yacc grammar file more readable:
- Use uppercase letters for token names, and use lowercase letters for nonterminal symbol names.
- Put grammar rules and actions on separate lines to allow changing either one without changing the other.
- Put all rules with the same left side together. Enter the left side once, and use the vertical bar to begin the rest of the rules for that left side.
- For each set of rules with the same left side, enter the semicolon once on a line by itself following the last rule for that left side. You can then add new rules easily.
- Indent rule bodies by two tab stops and action bodies by three tab stops.
Errors in the grammar file
The yacc command cannot produce a parser for all sets of grammar specifications. If the grammar rules contradict themselves or require matching techniques that are different from what the yacc command provides, the yacc command will not produce a parser. In most cases, the yacc command provides messages to indicate the errors. To correct these errors, redesign the rules in the grammar file, or provide a lexical analyzer (input program to the parser) to recognize the patterns that the yacc command cannot.
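One way to move work out of the grammar is to let the lexical analyzer group characters into tokens before the parser sees them. The sketch below is an assumption-laden example, not a prescribed fix: it collects a run of digits into a single NUMBER token so that the grammar needs no digit-level rules. The token number 257, the input_pos variable, and the set_input helper are hypothetical, and yylval is defined in the sketch only so that it compiles on its own (the generated parser normally defines it).

```c
#include <ctype.h>

#define NUMBER 257   /* hypothetical token number; normally generated by yacc */

int yylval;          /* defined here only for the sketch; the generated
                        parser normally provides this variable */

/* Hypothetical input position and helper, as in a string-based test. */
static const char *input_pos = "";

void set_input(const char *text)
{
    input_pos = text;
}

/* Groups consecutive digits into one NUMBER token and stores the value
 * in yylval. Any other character is returned as its own token number,
 * and 0 is returned at end of input. */
int yylex(void)
{
    while (*input_pos == ' ' || *input_pos == '\t')
        input_pos++;
    if (*input_pos == '\0')
        return 0;
    if (isdigit((unsigned char)*input_pos)) {
        yylval = 0;
        while (isdigit((unsigned char)*input_pos))
            yylval = yylval * 10 + (*input_pos++ - '0');
        return NUMBER;
    }
    return (unsigned char)*input_pos++;
}
```

With the input "12 + 345", this yylex returns NUMBER with yylval set to 12, then '+', then NUMBER with yylval set to 345, then 0. The grammar can then be written in terms of NUMBER alone.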