Module rustc_expand::mbe::macro_parser
This is an NFA-based parser, which calls out to the main Rust parser for named non-terminals (which it commits to fully when it hits one in a grammar). There’s a set of current NFA threads and a set of next ones. Instead of NTs, we have a special case for Kleene star. The big-O, in pathological cases, is worse than traditional use of NFA or Earley parsing, but it’s an easier fit for Macro-by-Example-style rules.
(In order to prevent the pathological case, we’d need to lazily construct the resulting NamedMatches at the very end. It’d be a pain, and require more memory to keep around old items, but it would also save overhead.)
We don’t say this parser uses the Earley algorithm, because it’s unnecessarily inaccurate. The macro parser restricts itself to the features of finite state automata. Earley parsers can be described as an extension of NFAs with completion rules, prediction rules, and recursion.
Quick intro to how the parser works:
A ‘position’ is a location in the middle of a matcher, usually represented as a dot (·). For example, · a $( a )* a b is a position, as is a $( · a )* a b.
The parser walks through the input one token at a time, maintaining a list of threads consistent with the current position in the input: cur_items.
As it processes them, it fills up eof_items with threads that would be valid if the macro invocation is now over, bb_items with threads that are waiting on a Rust non-terminal like $e:expr, and next_items with threads that are waiting on a particular token. Most of the logic concerns moving the · through the repetitions indicated by Kleene stars. The rules for moving the · without consuming any input are called epsilon transitions. It only advances or calls out to the real Rust parser when no cur_items threads remain.
Example:
Start parsing a a a a b against [· a $( a )* a b].
Remaining input: a a a a b
next: [· a $( a )* a b]
- - - Advance over an a. - - -
Remaining input: a a a b
cur: [a · $( a )* a b]
Descend/Skip (first item).
next: [a $( · a )* a b] [a $( a )* · a b].
- - - Advance over an a. - - -
Remaining input: a a b
cur: [a $( a · )* a b] [a $( a )* a · b]
Follow epsilon transition: Finish/Repeat (first item)
next: [a $( a )* · a b] [a $( · a )* a b] [a $( a )* a · b]
- - - Advance over an a. - - - (this looks exactly like the last step)
Remaining input: a b
cur: [a $( a · )* a b] [a $( a )* a · b]
Follow epsilon transition: Finish/Repeat (first item)
next: [a $( a )* · a b] [a $( · a )* a b] [a $( a )* a · b]
- - - Advance over an a. - - - (this looks exactly like the last step)
Remaining input: b
cur: [a $( a · )* a b] [a $( a )* a · b]
Follow epsilon transition: Finish/Repeat (first item)
next: [a $( a )* · a b] [a $( · a )* a b] [a $( a )* a · b]
- - - Advance over a b. - - -
Remaining input: ''
eof: [a $( a )* a b ·]
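The trace above can be reproduced with a small toy simulation. This is only a sketch under stated assumptions, not the code in this module: the names Tt, Item, seq_at, process, advance and matcher_accepts are hypothetical, tokens are plain strings, there are no non-terminals (so no bb_items), and a position is stored as a path of indices rather than as a MatcherPos.

```rust
use std::collections::HashSet;

/// Toy matcher element (hypothetical): a literal token or a `$( ... )*` repetition.
#[derive(Debug, Clone)]
enum Tt {
    Tok(&'static str),
    Star(Vec<Tt>),
}

/// A thread: a path of indices locating the dot. `[1, 0]` means "inside the
/// repetition at top-level index 1, just before its element 0".
type Item = Vec<usize>;

/// Returns the sequence that contains the dot, given the path without its last index.
fn seq_at<'a>(ms: &'a [Tt], prefix: &[usize]) -> &'a [Tt] {
    let mut seq = ms;
    for &i in prefix {
        match &seq[i] {
            Tt::Star(body) => seq = body.as_slice(),
            Tt::Tok(_) => unreachable!("paths only descend into repetitions"),
        }
    }
    seq
}

/// Follows epsilon transitions until `cur` is empty. Returns `(next, eof)`:
/// threads waiting on a token, and threads whose dot is after the whole matcher.
fn process(ms: &[Tt], mut cur: Vec<Item>) -> (Vec<Item>, Vec<Item>) {
    let (mut next, mut eof) = (Vec::new(), Vec::new());
    let mut seen: HashSet<Item> = HashSet::new();
    while let Some(item) = cur.pop() {
        if !seen.insert(item.clone()) {
            continue; // this position was already handled in this step
        }
        let (&idx, prefix) = item.split_last().expect("paths are never empty");
        let seq = seq_at(ms, prefix);
        if idx == seq.len() {
            if prefix.is_empty() {
                eof.push(item);
            } else {
                // End of a repetition body. Finish: step past the `$( ... )*`.
                let mut finish = prefix.to_vec();
                *finish.last_mut().unwrap() += 1;
                // Repeat: move the dot back to the start of the body.
                let mut repeat = prefix.to_vec();
                repeat.push(0);
                cur.push(finish);
                cur.push(repeat);
            }
        } else {
            match &seq[idx] {
                Tt::Tok(_) => next.push(item),
                Tt::Star(_) => {
                    // Skip: zero iterations. Descend: enter the body.
                    let mut skip = prefix.to_vec();
                    skip.push(idx + 1);
                    let mut descend = item.clone();
                    descend.push(0);
                    cur.push(skip);
                    cur.push(descend);
                }
            }
        }
    }
    (next, eof)
}

/// Moves the dot over `tok` in every waiting thread that expects exactly that token.
fn advance(ms: &[Tt], next: &[Item], tok: &str) -> Vec<Item> {
    let mut cur = Vec::new();
    for item in next {
        let (&idx, prefix) = item.split_last().expect("paths are never empty");
        if let Tt::Tok(expected) = &seq_at(ms, prefix)[idx] {
            if *expected == tok {
                let mut moved = item.clone();
                *moved.last_mut().unwrap() += 1;
                cur.push(moved);
            }
        }
    }
    cur
}

/// True if at least one thread reaches the end of the matcher exactly at end of input.
fn matcher_accepts(ms: &[Tt], input: &[&str]) -> bool {
    let mut cur: Vec<Item> = vec![vec![0]];
    for tok in input {
        let (next, _eof) = process(ms, cur);
        cur = advance(ms, &next, tok);
        if cur.is_empty() {
            return false; // no thread survived this token
        }
    }
    let (_next, eof) = process(ms, cur);
    !eof.is_empty()
}
```

With the matcher built as vec![Tt::Tok("a"), Tt::Star(vec![Tt::Tok("a")]), Tt::Tok("a"), Tt::Tok("b")], calling matcher_accepts on the tokens a a a a b follows the same steps as the trace. The seen set is an addition of this sketch to guard against revisiting a position; it is not something the description above mentions.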
Structs
Represents a single “position” (aka “matcher position”, aka “item”), as described in the module documentation.
An unzipping of TokenTrees… see the stack field of MatcherPos.
Enums
NamedMatch is a pattern-match result for a single token::MATCH_NONTERMINAL: so it is associated with a single ident in a parse, and all MatchedNonterminals in the NamedMatch have the same non-terminal type (expr, item, etc). Each leaf in a single NamedMatch corresponds to a single token::MATCH_NONTERMINAL in the TokenTree that produced it.
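As a rough illustration of the leaf structure described above, here is a simplified sketch. It assumes a tree in which each repetition level is a sequence node and each leaf is one matched non-terminal. ToyMatch, leaves, depth and sample are hypothetical names, and leaves hold plain source text rather than parsed non-terminals.

```rust
/// Hypothetical, simplified stand-in for a match tree.
#[derive(Debug, Clone, PartialEq)]
enum ToyMatch {
    /// One matched non-terminal, kept here as its source text.
    Leaf(String),
    /// One level of `$( ... )*` repetition, one child per iteration.
    Seq(Vec<ToyMatch>),
}

/// Collects the leaves from left to right.
fn leaves(m: &ToyMatch) -> Vec<&str> {
    match m {
        ToyMatch::Leaf(text) => vec![text.as_str()],
        ToyMatch::Seq(items) => items.iter().flat_map(|child| leaves(child)).collect(),
    }
}

/// Number of repetition levels above the deepest leaf.
fn depth(m: &ToyMatch) -> usize {
    match m {
        ToyMatch::Leaf(_) => 0,
        ToyMatch::Seq(items) => 1 + items.iter().map(|child| depth(child)).max().unwrap_or(0),
    }
}

/// What a metavar `$x` nested in two repetitions, as in `$( $( $x:expr ),* );*`,
/// might bind to for the input `1, 2; 3` under the assumptions above.
fn sample() -> ToyMatch {
    let leaf = |s: &str| ToyMatch::Leaf(s.to_string());
    ToyMatch::Seq(vec![
        ToyMatch::Seq(vec![leaf("1"), leaf("2")]),
        ToyMatch::Seq(vec![leaf("3")]),
    ])
}
```

In this sketch every leaf of one tree comes from the same metavar, which mirrors the statement that all leaves share one non-terminal type.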
Represents the possible results of an attempted parse.
Either a sequence of token trees or a single one. This is used as the representation of the sequence of tokens that make up a matcher.
Functions
Count how many metavars are named in the given matcher ms.
len Vecs (initially shared and empty) that will store matches of metavars.
Generates the top-level matcher position in which the “dot” is before the first token of the matcher ms.
Process the matcher positions of cur_items until it is empty. In the process, this will produce more items in next_items, eof_items, and bb_items.
Takes a sequence of token trees ms representing a matcher which successfully matched input and an iterator of items that matched input and produces a NamedParseResult.
Use the given sequence of token trees (ms) as a matcher. Match the token stream from the given parser against it and return the match.
Performs a token equality check, ignoring syntax context (that is, an unhygienic comparison).
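To make “ignoring syntax context” concrete, here is a toy sketch. ToyIdent and name_eq are hypothetical and cover only identifiers; the function described above works on real tokens and may handle more cases.

```rust
/// Hypothetical stand-in for an identifier token: a name plus an opaque
/// syntax-context id that carries the hygiene information.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct ToyIdent<'a> {
    name: &'a str,
    ctxt: u32,
}

/// Unhygienic comparison: only the names are compared, `ctxt` is ignored.
fn name_eq(a: ToyIdent<'_>, b: ToyIdent<'_>) -> bool {
    a.name == b.name
}
```

Two identifiers spelled x but carrying different contexts compare equal under name_eq, while the derived == still tells them apart.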
Type Definitions
A ParseResult where the Success variant contains a mapping of MacroRulesNormalizedIdents to NamedMatches. This represents the mapping of metavars to the token trees they bind to.
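The shape of such an alias can be sketched as follows. This is an assumption-laden toy: ToyParseResult, ToyNamedParseResult and bound_to are hypothetical names, the Failure variant and its String payload are invented for the example, and metavars and matches are plain strings instead of MacroRulesNormalizedIdent and NamedMatch.

```rust
use std::collections::HashMap;

/// Hypothetical result type, generic over what a successful parse yields.
#[derive(Debug)]
enum ToyParseResult<T> {
    Success(T),
    /// Assumed failure case carrying a message; not taken from the source.
    Failure(String),
}

/// The alias fixes the success payload to a map from metavar name to its matches.
type ToyNamedParseResult = ToyParseResult<HashMap<String, Vec<String>>>;

/// Looks up what a metavar was bound to, if the parse succeeded.
fn bound_to<'a>(result: &'a ToyNamedParseResult, metavar: &str) -> Option<&'a [String]> {
    match result {
        ToyParseResult::Success(bindings) => bindings.get(metavar).map(|v| v.as_slice()),
        ToyParseResult::Failure(_) => None,
    }
}
```

Keeping the result type generic and naming the common instantiation through an alias lets callers that only care about success or failure ignore the map type.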