lex programs

A lex program consists of three sections, in this order: definitions, translations, and functions. This layout is similar to that of a yacc program.

Throughout a lex program, you can freely use newlines and C-style comments; they are treated as white space. Lines starting with a blank or tab are copied through to the lex output file. Blanks and tabs are otherwise ignored, except where they separate names from definitions or expressions from actions.
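The copy-through rule can be sketched as follows. This is a minimal illustration, not a definitive layout: it assumes the indented lines are placed in the definition section, and the counter name line_count is hypothetical.

```lex
    /* Each line above the %% starts with a blank, so it is copied to the output file. */
    #include <stdio.h>
    int line_count = 0;   /* hypothetical counter used by the rule below */
%%
\n      { line_count++; printf("lines so far: %d\n", line_count); }
.       ;
```

The blanks between each expression and its action are significant here; they mark where the expression ends. This sketch has no function section, so it assumes you link with a lex library that supplies main and yywrap.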

The definition section is separated from the following section by a line consisting only of %%. In this section, you can define named regular expressions and then use those names in the translation section in place of common subexpressions, which makes that section more readable. The definition section can be empty, but the %% separator is still required.
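As an illustration of named regular expressions, here is a minimal sketch. The names DIGIT and LETTER are hypothetical; each name is referenced in the translation section by enclosing it in braces.

```lex
    /* Indented, so this line and the next are copied to the output file. */
    #include <stdio.h>
DIGIT   [0-9]
LETTER  [A-Za-z_]
%%
{DIGIT}+                        printf("number: %s\n", yytext);
{LETTER}({LETTER}|{DIGIT})*     printf("identifier: %s\n", yytext);
.|\n                            ;
```

Without the definitions, the identifier rule would have to be written as [A-Za-z_]([A-Za-z_]|[0-9])*, which is harder to read. As written, the sketch assumes you link with a lex library that supplies main and yywrap.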

The translation section follows the definition section and contains regular expressions paired with actions. Each action describes what the lexical analyzer does when it finds a match for the corresponding regular expression. The first unescaped space or tab on a line in the translation section signals the start of the action. Actions are further described in later sections of this topic.
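The following sketch shows how the first unescaped blank divides a translation line into an expression and an action. It is only an example under stated assumptions: the phrase and the messages are invented, and the spaces inside the first expression are escaped with a backslash so that they remain part of the expression.

```lex
    /* Indented, so this line and the next are copied to the output file. */
    #include <stdio.h>
%%
end\ of\ input  printf("matched the three-word phrase\n");
[0-9]+          { printf("integer: %s\n", yytext); }
.|\n            ;
```

If the backslashes were removed from the first line, the expression would end after "end", and the rest of the line would be taken as the action. The sketch assumes you link with a lex library that supplies main and yywrap.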

You can omit the function section; if it is present, it is separated from the translation section by a line containing only %%. This section can contain any text, because lex simply appends it to the end of the lex output file.
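A minimal sketch of a program that uses all three sections might look like this. The counter word_count is hypothetical, and supplying your own main and yywrap in the function section is one common choice rather than a requirement; some lex implementations provide both in a library.

```lex
    /* Indented, so these lines are copied to the output file. */
    #include <stdio.h>
    int word_count = 0;   /* hypothetical counter */
%%
[A-Za-z]+       word_count++;
.|\n            ;
%%
/* Everything after the second %% is appended to the lex output file. */

int yywrap(void)
{
    return 1;   /* report that there is no further input */
}

int main(void)
{
    yylex();    /* run the analyzer until end of input */
    printf("words: %d\n", word_count);
    return 0;
}
```

Because the function section is appended after the generated analyzer, main can call yylex and use word_count without further declarations.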