Lexical analyzer / Lex document

Published: May 11, 2010
Lex – A Lexical Analyzer Generator
M. E. Lesk and E. Schmidt
Bell Laboratories, Murray Hill, New Jersey 07974


Lex helps write programs whose control flow is directed by instances of regular expressions in the input stream. It is well suited for editor-script type transformations and for segmenting input in preparation for a parsing routine.

Lex source is a table of regular expressions and corresponding program fragments. The table is translated to a program which reads an input stream, copying it to an output stream and partitioning the input into strings which match the given expressions. As each such string is recognized the corresponding program fragment is executed. The recognition of the expressions is performed by a deterministic finite automaton generated by Lex. The program fragments written by the user are executed in the order in which the corresponding regular expressions occur in the input stream.

The lexical analysis programs written with Lex accept ambiguous specifications and choose the longest match possible at each input point. If necessary, substantial lookahead is performed on the input, but the input stream will be backed up to the end of the current partition, so that the user has general freedom to manipulate it.

Lex can generate analyzers in either C or Ratfor, a language which can be translated automatically to portable Fortran. It is available on the PDP-11 UNIX, Honeywell GCOS, and IBM OS systems. This manual, however, will only discuss generating analyzers in C on the UNIX system, which is the only supported form of Lex under UNIX Version 7. Lex is designed to simplify interfacing with Yacc, for those with access to this compiler-compiler system.

July 21, 1975


Table of Contents

Introduction
Lex Source
Lex Regular Expressions
Lex Actions
Ambiguous Source Rules
Lex Source Definitions
Usage
Lex and Yacc
Examples
Left Context Sensitivity
Character Set
Summary of Source Format
Caveats and Bugs
Acknowledgments
References

1. Introduction.

Lex is a program generator designed for lexical processing of character input streams. It accepts a high-level, problem oriented specification for character string matching, and produces a program in a general purpose language which recognizes regular expressions. The regular expressions are specified by the user in the source specifications given to Lex. The Lex written code recognizes these expressions in an input stream and partitions the input stream into strings matching the expressions. At the boundaries between strings program sections provided by the user are executed. The Lex source file associates the regular expressions and the program fragments. As each expression appears in the input to the program written by Lex, the corresponding fragment is executed.

The user supplies the additional code beyond expression matching needed to complete his tasks, possibly including code written by other generators. The program that recognizes the expressions is generated in the general purpose programming language employed for the user's program fragments. Thus, a high level expression language is provided to write the string expressions to be matched while the user's freedom to write actions is unimpaired. This avoids forcing the user who wishes to use a string manipulation language for input analysis to write processing programs in the same and often inappropriate string handling language.

Lex is not a complete language, but rather a generator representing a new language feature which can be added to different programming languages, called "host languages." Just as general purpose languages can produce code to run on different computer hardware, Lex can write code in different host languages. The host language is used for the output...
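
The rule table and longest-match behaviour described above can be illustrated with a small hand-written sketch in C. This is not code produced by Lex, and it makes several simplifying assumptions: the names `rule`, `matcher`, `match_int`, `match_real`, `match_word`, and `scan` are hypothetical, each regular expression is replaced by a hand-coded matcher function rather than a generated deterministic finite automaton, and each program fragment is reduced to a label wrapped around the matched text. Input that no rule matches is copied to the output unchanged. Ties between rules of equal match length go to the earlier rule, which is how the full manual resolves them.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical names throughout; none of this is part of Lex itself. */

/* A matcher returns the length of the match at the start of s, or 0. */
typedef size_t (*matcher)(const char *s);

struct rule {
    matcher match;      /* stands in for a regular expression */
    const char *label;  /* stands in for the user's program fragment */
};

/* Plays the role of [0-9]+ */
static size_t match_int(const char *s) {
    size_t n = 0;
    while (isdigit((unsigned char)s[n])) n++;
    return n;
}

/* Plays the role of [0-9]+"."[0-9]+ */
static size_t match_real(const char *s) {
    size_t n = match_int(s);
    if (n == 0 || s[n] != '.') return 0;
    size_t frac = match_int(s + n + 1);
    return frac ? n + 1 + frac : 0;
}

/* Plays the role of [a-z]+ */
static size_t match_word(const char *s) {
    size_t n = 0;
    while (islower((unsigned char)s[n])) n++;
    return n;
}

/* The rule table: ambiguous on purpose, since "3.14" starts with an INT. */
static const struct rule rules[] = {
    { match_int,  "INT"  },
    { match_real, "REAL" },
    { match_word, "WORD" },
};

/* Partition input by the rules, writing the result to out.
   Returns the number of characters written, excluding the NUL. */
size_t scan(const char *input, char *out, size_t out_size) {
    assert(input != NULL && out != NULL && out_size > 0);
    size_t pos = 0;
    out[0] = '\0';
    while (*input != '\0' && pos + 1 < out_size) {
        const struct rule *best = NULL;
        size_t best_len = 0;
        /* Try every rule; keep the longest match (earlier rule wins ties). */
        for (size_t i = 0; i < sizeof rules / sizeof rules[0]; i++) {
            size_t len = rules[i].match(input);
            if (len > best_len) { best_len = len; best = &rules[i]; }
        }
        int written;
        if (best != NULL) {
            /* "Action": emit the label and the matched text. */
            written = snprintf(out + pos, out_size - pos, "<%s:%.*s>",
                               best->label, (int)best_len, input);
            input += best_len;
        } else {
            /* No rule matched: copy one character through unchanged. */
            written = snprintf(out + pos, out_size - pos, "%c", *input);
            input += 1;
        }
        if (written < 0) break;
        if ((size_t)written >= out_size - pos) { pos = out_size - 1; break; }
        pos += (size_t)written;
    }
    return pos;
}
```

Calling `scan` on the string `x = 3.14 + 42` with a sufficiently large buffer fills it with `<WORD:x> = <REAL:3.14> + <INT:42>`. The `3.14` is reported as one REAL rather than an INT followed by leftover characters because the REAL matcher consumes more input at that point. A real Lex-generated analyzer reaches the same decision in a single pass over the input instead of trying each rule in turn.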