yacc rules
The rules section of the grammar file contains one or more grammar rules. Each rule describes a structure and gives it a name. A grammar rule has the following form:
A : BODY;
where A is a nonterminal name, and BODY is a sequence of 0 or more names, literals, and semantic actions that can optionally be followed by precedence rules. Only the names and literals are required to form the grammar. Semantic actions and precedence rules are optional. The colon and the semicolon are required yacc punctuation.
Semantic actions let you specify code that runs each time the parser recognizes a rule in the input. An action consists of C statements enclosed in braces and can therefore perform input or output, call subprograms, or alter external variables. Actions can also refer to the actions of the parser; for example, shift and reduce.
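As a minimal sketch of a semantic action, the following grammar fragment adds numbers as each rule is recognized. The token name NUMBER and the default int value type are assumptions made for this example, and the lexical analyzer, yyerror, and main are assumed to be supplied elsewhere.

```yacc
%{
#include <stdio.h>
%}

%token NUMBER   /* hypothetical token; its value is set by the lexical analyzer */

%%

expr : expr '+' NUMBER
         {
           /* $1 is the value of expr, $3 the value of NUMBER */
           $$ = $1 + $3;
           printf("sum so far: %d\n", $$);
         }
     | NUMBER
         { $$ = $1; }
     ;
```

Each action runs when the parser reduces by the alternative it is attached to, so the first action runs once for every '+' in the input.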
Precedence rules are defined by the %prec keyword and change the precedence level associated with a particular grammar rule. The reserved symbol %prec can appear immediately after the body of the grammar rule and can be followed by a token name or a literal. The construct causes the precedence of the grammar rule to become that of the token name or literal.
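As an illustration of %prec, the following sketch uses the common unary-minus idiom. The token names NUMBER and UMINUS are assumptions made for this example; UMINUS is a placeholder that the lexical analyzer never returns and exists only to carry a precedence level.

```yacc
%token NUMBER          /* hypothetical token name */
%left '+' '-'          /* lowest precedence */
%left '*' '/'
%right UMINUS          /* highest precedence; placeholder used only by %prec */

%%

expr : expr '+' expr
     | expr '-' expr
     | expr '*' expr
     | expr '/' expr
     | '-' expr %prec UMINUS   /* rule takes the precedence of UMINUS, not of '-' */
     | NUMBER
     ;
```

Without %prec, the rule '-' expr would take the precedence of '-', the last token in its body, so an input such as - 2 * 3 would group as -(2 * 3) rather than (-2) * 3.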
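Repeating nonterminal names
Several grammar rules can have the same nonterminal name on the left side. For example, the following three rules:
A : B C D ;
A : E F ;
A : G ;
can be written as a single rule by separating the alternative bodies with the vertical bar (|) and ending the whole group with one semicolon:
A : B C D
| E F
| G
;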
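Using recursion in a grammar file
Recursion is the use of a rule's own name in its definition. In language definitions, recursive rules normally take the following form:
rule : EndCase
| rule EndCase
;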
Therefore, the simplest case of the rule is the EndCase, but rule can also consist of more than one occurrence of EndCase. The entry in the second line that uses rule in the definition of rule is the recursion. The parser cycles through the input until the stream is reduced to the final EndCase.
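When you use recursion, put the reference to the rule as the leftmost entry in the body, as in the preceding example. If the reference appears later in the body, as in the following right-recursive form, the parser must hold every EndCase on its stack before it can reduce, and it may run out of stack space on long input:
rule : EndCase
| EndCase rule
;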
The following example defines the lines rule as one or more occurrences of line, where each line is a string followed by a newline character (\n):
lines : line
| lines line
;
line : string '\n'
;
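Empty string
To define a nonterminal symbol that matches the empty string, leave one alternative of the rule body empty. For example, either of the following forms defines a symbol named empty that matches either the empty string or x:
empty : ;
| x;
or
empty :
| x
;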
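An empty alternative is often combined with left recursion to describe a list of zero or more elements. The following fragment is a sketch of that pattern; the token name ITEM is hypothetical.

```yacc
%token ITEM    /* hypothetical token name */

%%

items : /* empty */       /* matches no input at all */
      | items ITEM        /* or a shorter list followed by one more ITEM */
      ;
```

The comment in the empty alternative is optional, but it makes clear to a reader that the body was left empty on purpose.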
End-of-input marker
When the lexical analyzer reaches the end of the input stream, it sends an end-of-input marker to the parser. This marker is a special token called endmarker, which has a token value of 0. When the parser receives an end-of-input marker, it checks to see that it has assigned all input to defined grammar rules and that the processed input forms a complete unit (as defined in the yacc grammar file). If the input is a complete unit, the parser stops. If the input is not a complete unit, the parser signals an error and stops.
The lexical analyzer must send the end-of-input marker at the appropriate time, such as the end of a file, or the end of a record.
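The following small grammar file sketches how a hand-written lexical analyzer might send the end-of-input marker. The token name ANYCHAR, the one-character-per-token scheme, and the yyerror and main routines are assumptions made for this example and are not part of the yacc command itself.

```yacc
%{
#include <stdio.h>
int yylex(void);
void yyerror(const char *msg);
%}

%token ANYCHAR   /* hypothetical token: any single input character */

%%

input : /* empty */
      | input ANYCHAR
      ;

%%

/* Minimal lexical analyzer: one token per input character. */
int yylex(void)
{
    int c = getchar();

    if (c == EOF)
        return 0;        /* end-of-input marker: token value 0 */
    return ANYCHAR;
}

void yyerror(const char *msg)
{
    fprintf(stderr, "%s\n", msg);
}

int main(void)
{
    /* yyparse returns 0 when the input forms a complete unit */
    return yyparse();
}
```

A lexical analyzer that treats each line as a record could instead return 0 when it reads a newline character, so that the parser checks each record as a complete unit.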