Vxquery

Pages: 7 (1696 words) Published: January 2, 2013
1 XQuery Grammar
1.1 Lexical structure
A character is an atomic unit of text as specified by ISO/IEC 10646 [ISO10646] (see also [ISO10646-2000]). Legal characters are those allowed in the [XML] recommendation. A lexical pattern is a rule that describes how a sequence of characters can match a grammar unit. A lexeme is the smallest meaningful unit in the grammar that has a syntactic interpretation. A token is a symbol that matches lexemes, and is the output of the lexical analyzer. A token symbol is the symbolic name given to that token. A single token may be composed of one or more lexemes. If there is more than one lexeme, they may be separated by whitespace or punctuation. For instance, a token AxisDescendantOrSelf might have two lexemes, "descendant-or-self" and "::".

Pattern                    Lexeme(s)                    Token Names (for example)
"or"                       "or"                         Or
"="                        "="                          Equals
(Prefix ':')? LocalPart    "p" ":" "foo"                QName
                           "descendant-or-self" "::"    AxisDescendantOrSelf
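The lexeme-to-token grouping described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not the normative tokenizer: the function names `lexemes` and `tokens` are hypothetical, names are restricted to ASCII, and "or" is always treated as the keyword because lexical states are ignored.

```python
import re

# Assumed lexeme shapes for this sketch only (not the normative rules):
# "::", "=", ":", or an ASCII name that may contain "-", "_" and ".".
LEXEME = re.compile(r"::|=|:|[A-Za-z_][A-Za-z0-9_.-]*")
NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_.-]*")


def is_name(lexeme):
    """True if the lexeme looks like a name under the assumed ASCII rule."""
    return NAME.fullmatch(lexeme) is not None


def lexemes(text):
    """Split text into lexemes, skipping whitespace between them."""
    out, pos = [], 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
            continue
        m = LEXEME.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
        out.append(m.group())
        pos = m.end()
    return out


def tokens(lexs):
    """Group lexemes into (token_name, lexemes) pairs."""
    out, i = [], 0
    while i < len(lexs):
        if lexs[i] == "descendant-or-self" and lexs[i + 1:i + 2] == ["::"]:
            # Two lexemes combined into one token.
            out.append(("AxisDescendantOrSelf", lexs[i:i + 2]))
            i += 2
        elif lexs[i] == "or":
            out.append(("Or", [lexs[i]]))
            i += 1
        elif lexs[i] == "=":
            out.append(("Equals", [lexs[i]]))
            i += 1
        elif (is_name(lexs[i]) and lexs[i + 1:i + 2] == [":"]
              and i + 2 < len(lexs) and is_name(lexs[i + 2])):
            # Prefix ":" LocalPart: three lexemes, one QName token.
            out.append(("QName", lexs[i:i + 3]))
            i += 3
        elif is_name(lexs[i]):
            # Unprefixed QName: a single lexeme.
            out.append(("QName", [lexs[i]]))
            i += 1
        else:
            raise ValueError(f"cannot form a token from {lexs[i]!r}")
    return out
```

For example, `tokens(lexemes("p:foo = y or z"))` groups the three lexemes "p", ":" and "foo" into one QName token, followed by Equals, QName, Or and QName.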

When patterns are simple string matches, the strings are embedded directly into the BNF. In other cases, token symbols are used when the pattern is a more complex regular expression (the major cases of these are NCName, QName, and Number and String literals). It is up to an implementation to decide on the exact tokenization strategy, which may be different depending on the parser construction. For example, an implementation may decide that a token named For is composed of only "for", or may decide that it is composed of ("for" "$"). In the first case the implementation may decide to use lexical lookahead to distinguish the "for" lexeme from a QName that has the lexeme "for". In the second case, the implementation may decide to combine the two lexemes into a single "long" token. In either case, the end grammatical result will be the same. In the BNF, the notation "< ... >" is used to indicate and delimit a sequence of lexemes that must be recognized using lexical lookahead or some equivalent means.

This grammar implies lexical states, which are lexical constraints on the tokenization process based on grammatical positioning. The exact structure of these states is left to the implementation, but the normative rules for calculating these states are given in the 1.1.2 Lexical Rules section. When tokenizing, the longest possible match that is valid in the current lexical state is preferred.

For readability, Whitespace may be used in most expressions even though not explicitly notated in the BNF. Whitespace may be freely added between lexemes, except in a few cases where whitespace is needed to disambiguate the token. For instance, in XML, "-" is a valid character in an element or attribute name. When used as an operator after the characters of a name, it must be separated from the name, e.g. by using whitespace or parentheses.

Special whitespace notation is specified with the BNF productions, when it is different from the default rules. "ws: explicit" means that the places where whitespace is allowed must be explicitly notated in the BNF. "ws: significant" means that whitespace is significant as value content. For XQuery, Whitespace is not freely allowed in the non-computed Constructor productions, but is specified explicitly in the grammar, in order to be more consistent with XML. The lexical states where whitespace must have explicit specification are as follows: START_TAG, END_TAG, ELEMENT_CONTENT, XML_COMMENT, PROCESSING_INSTRUCTION, PROCESSING_INSTRUCTION_CONTENT, CDATA_SECTION, QUOT_ATTRIBUTE_CONTENT, and APOS_ATTRIBUTE_CONTENT.

All keywords are case sensitive.

1.1.1 Syntactic Constructs

Character Classes

The following basic tokens are defined in [XML].
1. Letter
2. BaseChar
3. Ideographic
4. CombiningChar
5. Digit
6. Extender

Identifiers

The following identifier components are defined in [XMLNAMES].
1. NCName
2. NCNameChar
3. QName
4. Prefix
5. LocalPart

String Literals and Numbers

[1] IntegerLiteral ::= ...
[2] DecimalLiteral ::= ...
[3] DoubleLiteral ::= ...
[4] StringLiteral ::= ... (ws: significant)
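The longest-match preference and the "-" disambiguation rule from section 1.1 can be illustrated with a small sketch. It is not the normative algorithm: the token names Name, Minus, LParen and RParen and the function `tokenize` are hypothetical, the name pattern is an ASCII-only approximation of the [XMLNAMES] rules, and lexical states are not modelled at all.

```python
import re

# Assumed token shapes for this sketch only. The name pattern is an
# ASCII approximation; the real NCName definition is much broader.
TOKEN_PATTERNS = [
    ("Name", re.compile(r"[A-Za-z_][A-Za-z0-9_.-]*")),
    ("Minus", re.compile(r"-")),
    ("LParen", re.compile(r"\(")),
    ("RParen", re.compile(r"\)")),
]


def tokenize(text):
    """Return (token_name, lexeme) pairs, preferring the longest match."""
    result, pos = [], 0
    while pos < len(text):
        if text[pos].isspace():
            # Whitespace between lexemes is skipped in this sketch.
            pos += 1
            continue
        # Try every pattern at this position and keep the longest match.
        best_name, best_end = None, pos
        for name, pattern in TOKEN_PATTERNS:
            m = pattern.match(text, pos)
            if m and m.end() > best_end:
                best_name, best_end = name, m.end()
        if best_name is None:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
        result.append((best_name, text[pos:best_end]))
        pos = best_end
    return result
```

Under these assumptions, `tokenize("a-b")` yields a single Name token because "-" is a legal name character, whereas `tokenize("a - b")` and `tokenize("(a)-b")` both produce a separate Minus token. This is why the operator has to be set off from a preceding name by whitespace or parentheses.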