Parser operation generated by the yacc command

The yacc command converts the grammar file to a C language program.

That program, when compiled and executed, parses the input according to the grammar specification provided.

The parser is a finite state machine with a stack. The parser can read and remember the look-ahead token. The current state is always the state at the top of the stack. The states of the finite state machine are represented by small integers. Initially, the machine is in state 0, the stack contains only 0, and no look-ahead token has been read.

The machine can perform one of the following actions:
Action Description
shift State The parser pushes the current state onto the stack, makes State the current state, and clears the look-ahead token.
reduce Rule When the parser finds a string defined by Rule (a rule number) in the input stream, the parser replaces that string with Rule in the output stream.
accept The parser looks at all input, matches it to the grammar specification, and recognizes the input as satisfying the highest-level structure (defined by the start symbol). This action appears only when the look-ahead token is the endmarker and indicates that the parser has successfully done its job.
error The parser cannot continue processing the input stream and still successfully match it with any rule defined in the grammar specification. The input tokens that the parser looked at, together with the look-ahead token, cannot be followed by anything that would result in valid input. The parser reports an error and attempts to recover the situation and resume parsing.

The parser performs the following actions during one process step:

  1. Based on its current state, the parser decides whether it needs a look-ahead token to determine the action to be taken. If the parser needs a look-ahead token and does not have one, it calls the yylex subroutine to obtain the next token.
  2. Using the current state, and the look-ahead token if needed, the parser decides on its next action and carries it out. As a result, states may be pushed onto or popped off the stack, and the look-ahead token may be processed or left alone.

Shift action

The shift action is the most common action that the parser takes. Whenever the parser does a shift, a look-ahead token always exists. For example, consider the following grammar specification rule:
IF shift 34
If the parser is in the state that contains this rule and the look-ahead token is IF, the parser:
  1. Pushes the current state down on the stack.
  2. Makes state 34 the current state (puts it at the top of the stack).
  3. Clears the look-ahead token.

Reduce action

The reduce action keeps the stack from growing too large. The parser uses reduce actions after matching the right side of a rule with the input stream. The parser is then ready to replace the characters in the input stream with the left side of the rule. The parser may have to use the look-ahead token to decide if the pattern is a complete match.

Reduce actions are associated with individual grammar rules. Because grammar rules also have small integer numbers, it is easy to confuse the meanings of the numbers in the two actions, shift and reduce. For example, the following action refers to grammar rule 18:
. reduce 18
The following action refers to state 34:
IF shift 34
For example, to reduce the following rule, the parser pops off the top three states from the stack:
A  :  x y z ;

The number of states popped equals the number of symbols on the right side of the rule. These states are the ones put on the stack while recognizing x, y, and z. After popping these states, a state is uncovered, which is the state that the parser was in before beginning to process the rule; that is, the state that needed to recognize rule A to satisfy its rule. Using this uncovered state and the symbol on the left side of the rule, the parser performs an action called goto, which is similar to a shift of A. A new state is obtained, pushed onto the stack, and parsing continues.

The goto action is different from an ordinary shift of a token. The look-ahead token is cleared by a shift but is not affected by a goto action. When the three states are popped, the uncovered state contains an entry such as the following:
A  goto 20

This entry causes state 20 to be pushed onto the stack and become the current state.

The reduce action is also important in the treatment of user-supplied actions and values. When a rule is reduced, the parser executes the code that you included in the rule before adjusting the stack. Another stack running in parallel with the stack holding the states holds the values returned from the lexical analyzer and the actions. When a shift takes place, the yylval external variable is copied onto the stack holding the values. After executing the code that you provide, the parser performs the reduction. When the parser performs the goto action, it copies the yylval external variable onto the value stack. The yacc keywords that begin with $ refer to the value stack.