Using yylex()

The structure of lex programs is influenced by what yacc requires of its lexical analyzer.

To begin with, the lexical analyzer is named yylex() and takes no parameters. It is expected to return a token number (of type int), where that number is determined by yacc. The token number for a single character is its value as a C character constant. yacc can also define token names, using the %token statement. When yacc is run with the -d option, it writes C definitions of these names to the file y.tab.h, which defines each token name as its token number.
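The numbering convention can be illustrated with a hand-written stand-in for yylex(). This is only a sketch, not what lex generates: the token names NUMBER and IDENT, their values, and the helper set_input() are assumptions made for the example. In a real build the token numbers come from y.tab.h.

```c
#include <ctype.h>

/* Placeholder token numbers; a real build takes them from y.tab.h. */
#define NUMBER 257
#define IDENT  258

/* Hypothetical input source: the scanner reads from a C string. */
static const char *input_cursor = "";

void set_input(const char *text) { input_cursor = text; }

int yylex(void)
{
    /* Skip blanks between tokens. */
    while (*input_cursor == ' ' || *input_cursor == '\t')
        input_cursor++;

    if (*input_cursor == '\0')
        return 0;                       /* end of input */

    if (isdigit((unsigned char)*input_cursor)) {
        while (isdigit((unsigned char)*input_cursor))
            input_cursor++;
        return NUMBER;                  /* named token */
    }
    if (isalpha((unsigned char)*input_cursor)) {
        while (isalnum((unsigned char)*input_cursor))
            input_cursor++;
        return IDENT;                   /* named token */
    }
    /* Any other character is its own token number. */
    return (unsigned char)*input_cursor++;
}
```

Called repeatedly on the input "x = 42;", this sketch returns IDENT, '=', NUMBER, ';' and then 0.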

yacc also allows yylex() to pass a value to the yacc action routines, by assigning that value to the external variable yylval. The type of yylval is int by default, but this can be changed with the yacc %union statement. lex assumes that the programmer defines yylval correctly; yacc writes a definition for yylval to the file y.tab.h if the %union statement is used.

For compatibility with yacc, lex provides a lexical analyzer named yylex(), which interprets tables formed from the lex program, and which returns token numbers from the actions it performs. The actions may include assignments to yylval (or its components, if it is a union of types), so that use with yacc is straightforward.
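As a rough illustration of passing values through yylval, the following C sketch fills in a union member before returning each token. It is hand-written rather than lex output. The YYSTYPE definition stands in for what a %union with members ival and name would produce, and those member names, the token numbers, and set_input() are all assumptions made for the example.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Placeholder token numbers; a real build takes them from y.tab.h. */
#define NUMBER 257
#define IDENT  258

/* Stand-in for the type yacc would derive from a %union statement. */
typedef union {
    int  ival;       /* value of a NUMBER token */
    char name[32];   /* text of an IDENT token  */
} YYSTYPE;

YYSTYPE yylval;

/* Hypothetical input source: the scanner reads from a C string. */
static const char *input_cursor = "";

void set_input(const char *text) { input_cursor = text; }

int yylex(void)
{
    const char *start;
    size_t len;

    while (*input_cursor == ' ' || *input_cursor == '\t' ||
           *input_cursor == '\n')
        input_cursor++;
    if (*input_cursor == '\0')
        return 0;                           /* end of input */

    start = input_cursor;
    if (isdigit((unsigned char)*input_cursor)) {
        while (isdigit((unsigned char)*input_cursor))
            input_cursor++;
        yylval.ival = atoi(start);          /* atoi stops at the first non-digit */
        return NUMBER;
    }
    if (isalpha((unsigned char)*input_cursor)) {
        while (isalnum((unsigned char)*input_cursor))
            input_cursor++;
        len = (size_t)(input_cursor - start);
        if (len >= sizeof yylval.name)
            len = sizeof yylval.name - 1;   /* truncate over-long names */
        memcpy(yylval.name, start, len);
        yylval.name[len] = '\0';
        return IDENT;
    }
    return (unsigned char)*input_cursor++;  /* literal character token */
}
```

A yacc action that receives NUMBER would then read the value from the ival member, and one that receives IDENT would read name.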

In the absence of a return statement in an action, yylex() does not return but continues to look for further matches. If some computation is performed entirely by the lexical analyzer with no normal return from any action, a suitable main program is:
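#include <stdio.h>

int yylex(void);	/* lexical analyzer generated by lex */

int main(void)
{
	/* yylex() returns 0 at end-of-file, which becomes the exit status */
	return yylex();
}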
yylex() returns the value 0 (zero) at end-of-file, so a normal run exits with status 0. Any nonzero value returned from an action is passed back to the program's caller as an error return. The lex library (linked with -ll) supplies such a main program.
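The behavior of actions that never return can be mimicked in plain C. In the sketch below, a hand-written yylex() does all of its work internally (counting characters and lines) and returns only at end of input. The counters, set_input(), and run_scanner() are hypothetical names introduced for the example, and run_scanner() plays the role of the main program above.

```c
/* Hypothetical input source and counters for the example. */
static const char *input_cursor = "";
int line_count = 0;
int char_count = 0;

void set_input(const char *text)
{
    input_cursor = text;
    line_count = 0;
    char_count = 0;
}

int yylex(void)
{
    for (;;) {
        int c = (unsigned char)*input_cursor;

        if (c == '\0')
            return 0;        /* end of input: the only return */

        input_cursor++;
        char_count++;        /* "action" without a return: keep scanning */
        if (c == '\n')
            line_count++;
    }
}

/* Plays the role of main(): hands yylex()'s result to the caller. */
int run_scanner(const char *text)
{
    set_input(text);
    return yylex();
}
```

After run_scanner("one\ntwo\n") the result is 0, line_count is 2, and char_count is 8.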