Lexical analyzer

The lexical analyzer yylex() reads input and breaks it into tokens; in fact, it determines what constitutes a token. For example, some lexical analyzers may return numbers one digit at a time, whereas others collect numbers in their entirety before passing them to the parser.
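The difference between the two approaches can be sketched in C. This is a minimal illustration, not the lexer of any particular program. The token names DIGIT and NUMBER, their values, and the helper functions lex_digit() and lex_number() are assumptions made for the example. Input is taken from a string cursor rather than standard input to keep the sketch short.

```c
#include <ctype.h>

/* Hypothetical token codes; in a real program yacc generates these
   from the %token declarations. */
#define DIGIT  257
#define NUMBER 258

/* Strategy 1: return one DIGIT token per digit character.
   *cursor is advanced past the character consumed. */
int lex_digit(const char **cursor)
{
    const char *p = *cursor;

    if (*p == '\0')
        return 0;                      /* end of input */
    *cursor = p + 1;
    return isdigit((unsigned char)*p) ? DIGIT : (unsigned char)*p;
}

/* Strategy 2: collect a whole run of digits into a single NUMBER token. */
int lex_number(const char **cursor)
{
    const char *p = *cursor;

    if (*p == '\0')
        return 0;                      /* end of input */
    if (isdigit((unsigned char)*p)) {
        while (isdigit((unsigned char)*p))
            p++;
        *cursor = p;
        return NUMBER;
    }
    *cursor = p + 1;
    return (unsigned char)*p;          /* other characters stand for themselves */
}
```

For the input 123, lex_digit() yields three DIGIT tokens and leaves the grammar to assemble them into a number, whereas lex_number() yields a single NUMBER token.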

Similarly, some lexical analyzers recognize keywords such as if or while and tell the parser that an if token or a while token has been found. Others are not designed to recognize keywords, so it is up to the parser itself to distinguish between keywords and other things, such as variable names.
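A lexical analyzer of the first kind usually collects a whole word and then looks it up in a keyword table. The following is a minimal sketch of that lookup only. The token names IF, WHILE, and NAME, their values, and the function classify_word() are hypothetical and do not come from any particular grammar.

```c
#include <string.h>

/* Hypothetical token codes standing in for yacc-generated constants. */
#define IF    257
#define WHILE 258
#define NAME  259

/* Keyword table: the spelling and the token code reported to the parser. */
static const struct {
    const char *text;
    int token;
} keywords[] = {
    { "if",    IF    },
    { "while", WHILE },
};

/* Classify a word that the lexer has already collected. */
int classify_word(const char *word)
{
    size_t i;

    for (i = 0; i < sizeof keywords / sizeof keywords[0]; i++)
        if (strcmp(word, keywords[i].text) == 0)
            return keywords[i].token;
    return NAME;    /* anything else is an ordinary name */
}
```

A lexer of the second kind would skip the table and return NAME for every word, leaving the parser to decide what each one means.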

yacc defines each token named in the declarations section of its input as a C constant. The value of the first token named is 257, the value of the next is 258, and so on. You can also set your own value for a token by placing a positive integer after the first appearance of that token in the declarations section. For example:
%token AA 56
assigns the value 56 to the token symbol AA. This mechanism is seldom needed, and you should avoid it whenever possible.
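To make the numbering concrete, the sketch below shows the constants that would result from a small, hypothetical declarations section. The token names NUMBER and NAME are assumptions, and the exact form of the generated definitions may differ between yacc implementations.

```c
/* Declarations section of a hypothetical grammar:
 *
 *     %token NUMBER NAME
 *     %token AA 56
 */

/* Constants set up for those tokens, following the rules above. */
#define NUMBER 257   /* first token named */
#define NAME   258   /* next token named */
#define AA     56    /* value given explicitly in the declaration */
```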

yylex() has few other requirements. Its return value tells the parser the type of the token. If the function must also pass the value of the token, it assigns that value to the external variable yylval before returning. By default, yylval has type int, but it can also be used to hold other types of values. For more information, see the description of %union in Types.
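The division of labor between the return value and yylval can be sketched as follows. This is an illustration under stated assumptions rather than a complete lexical analyzer. The token NUMBER and its value are hypothetical, set_input() is a helper invented for the example so that the lexer can scan a string, and yylval is defined here only to make the sketch self-contained (in a real program the yacc-generated parser defines it).

```c
#include <ctype.h>

#define NUMBER 257          /* assumed: first token named in the grammar */

int yylval;                 /* normally supplied by the yacc-generated parser */

static const char *input = "";   /* hypothetical input cursor */

/* Hypothetical helper: point the lexer at a string to scan. */
void set_input(const char *text)
{
    input = text;
}

int yylex(void)
{
    /* Skip blanks between tokens. */
    while (*input == ' ' || *input == '\t' || *input == '\n')
        input++;

    if (*input == '\0')
        return 0;           /* zero tells the parser the input has ended */

    if (isdigit((unsigned char)*input)) {
        yylval = 0;
        while (isdigit((unsigned char)*input))
            yylval = yylval * 10 + (*input++ - '0');
        return NUMBER;      /* type is returned; value is left in yylval */
    }

    return (unsigned char)*input++;   /* any other character is its own token */
}
```

After set_input("12 + 345"), successive calls return NUMBER with yylval set to 12, then the character '+', then NUMBER with yylval set to 345, and finally 0.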