Lex Programming Tool

Lex is a program that generates lexical analyzers ("scanners" or "lexers"). It is commonly used together with the yacc parser generator. Lex, originally written by Eric Schmidt and Mike Lesk, is the standard lexical analyzer generator on many UNIX systems, and its behavior is specified as part of the POSIX standard. A popular free version of lex is flex, the "fast lexical analyzer generator". Lex reads an input file that specifies the lexical analyzer and outputs source code implementing the lexer in the C programming language.

Structure of a lex file

The structure of a lex file is intentionally similar to that of a yacc file. A file is divided into three parts: a definition section, a rules section, and a C code section. Sections are separated by lines that contain only two percent signs:

  %%

The definition section is the place to define macros using regular expressions, and also to import header files written in C.

The rules section is the most important section; it associates rules with C statements. When lex sees text in its input that matches the pattern of a given rule, it executes the associated C code. The patterns are regular expressions, which may use the macros defined in the definition section.

The C code section contains C statements and functions that are copied verbatim to the generated source file. These typically contain code called by the rules in the rules section. In large programs it is more convenient to place this code in a separate file and link it in at compile time.

Example flex file

The following is an example input file for the flex version of lex. It recognizes strings of digits (integers) in the input and ignores everything else. Given the input "abc123z.!&*2ghj6", the program will print:
  Saw an integer: 123
  Saw an integer: 2
  Saw an integer: 6

  /*
   * Example lexical analyzer for flex
   *
   * Picks out strings of digits (integers) from the input.
   */

  /*** Definition section ***/

  %{
  /*
   * Some C code to include the C standard I/O library.
   * Everything inside the %{ %} brackets is inserted
   * verbatim into the generated file.
   */
  #include <stdio.h>
  %}

  /* Macros; regular expressions */
  DIGIT       [0-9]
  INTEGER     {DIGIT}+

  /* This tells flex to read only one input file */
  %option noyywrap

  %%

      /*
       * Rules section
       *
       * Comments in this section must be indented
       * so lex won't mistake them for regular expressions.
       */

  {INTEGER}   {
                  /*
                   * This rule prints integers from the input.
                   * yytext is a string containing the matched text.
                   */
                  printf("Saw an integer: %s\n", yytext);
              }

  .|\n        { /* Ignore all other characters, including newlines. */ }

  %%
  /*** C Code section ***/

  /*
   * The main program.
   *
   * Call the lexer. Quit when done.
   */
  int main(void)
  {
      /*
       * yyin is where lex reads from. Set it to the standard input.
       * yyin is a global declared by the generated scanner, so it is
       * assigned here rather than redeclared as a local variable.
       */
      yyin = stdin;

      /* Call the lexer. */
      yylex();
      return 0;
  }

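For comparison, the behavior of the two rules in the example can be sketched by hand in plain C. This is only an illustration of what the rules mean, not the code that lex or flex generates (a generated scanner is table-driven and reads from yyin). The function name scan_integers and its buffer-based interface are assumptions made for this sketch; they are not part of lex, flex, or the example above.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>

/*
 * scan_integers is a hypothetical helper, not part of lex or flex.
 *
 * For every maximal run of decimal digits in input it appends one
 * "Saw an integer: N\n" line to out; every other character is skipped.
 * It returns the number of integers reported and stops early if out
 * is too small to hold the next complete line.
 */
size_t scan_integers(const char *input, char *out, size_t out_size)
{
    size_t count = 0;   /* integers reported so far */
    size_t used = 0;    /* bytes of out already filled */

    assert(input != NULL && out != NULL && out_size > 0);
    out[0] = '\0';

    const char *p = input;
    while (*p != '\0') {
        if (!isdigit((unsigned char)*p)) {
            p++;        /* like the catch-all rule: ignore the character */
            continue;
        }

        /* Like {INTEGER}: consume the longest possible run of digits. */
        const char *start = p;
        while (isdigit((unsigned char)*p))
            p++;

        int len = (int)(p - start);
        int n = snprintf(out + used, out_size - used,
                         "Saw an integer: %.*s\n", len, start);
        if (n < 0 || (size_t)n >= out_size - used)
            break;      /* the line did not fit: stop scanning */

        used += (size_t)n;
        count++;
    }
    return count;
}
```

Calling scan_integers with the input "abc123z.!&*2ghj6" and a sufficiently large buffer fills the buffer with the same three lines shown earlier and returns 3. The inner loop mirrors the "longest match" behavior of a lex pattern such as {DIGIT}+, which is why "123" is reported once rather than as three separate digits.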
See also: the flex lexical analyser

 
