This is a simple s-expression parser written in Python 3. It understands symbols and numbers and uses tuples to represent the data internally.
First, the tokenizer adds padding to left and right parentheses. Then it splits the raw token stream by space characters. As a result empty items will appear, since `(` will turn into ` ( `, which will be split into `['', '(', '']`. That is why the surrounding `filter` discards empty items using `bool`. Finally, the tokenizer turns all numeric tokens into `int`s. It returns the resulting token stream as a list.
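The tokenizer steps described above could look roughly like the following. This is a minimal sketch, not necessarily the original code: the names `tokenize` and `to_atom` are assumptions, and "numeric" is taken to mean anything that `int()` accepts.

```python
def to_atom(token):
    """Return an int for numeric tokens, otherwise the token unchanged.

    Hypothetical helper: treats whatever int() accepts as numeric.
    """
    try:
        return int(token)
    except ValueError:
        return token


def tokenize(text):
    """Turn s-expression source text into a list of tokens."""
    # Pad parentheses so that they become separate items when splitting.
    padded = text.replace("(", " ( ").replace(")", " ) ")
    # Splitting on a single space yields empty strings wherever two
    # spaces are adjacent; filter(bool, ...) discards them.
    raw_tokens = filter(bool, padded.split(" "))
    # Convert numeric tokens to int and return the stream as a list.
    return [to_atom(token) for token in raw_tokens]


print(tokenize("(add 1 (mul 2 3))"))
# ['(', 'add', 1, '(', 'mul', 2, 3, ')', ')']
```

Because the split is on the space character only, this sketch assumes the input contains no tabs or newlines.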
The token stream can now be parsed.
A valid s-expression is either an atom (int or symbol) or a list of s-expressions. Since we operate on a token stream, the parser has to peek at the current token and then either parse a list or parse an atom.
A recursive descent parser lends itself to this type of recursive grammar.
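A recursive descent parser for this grammar might be sketched as follows. The function names `parse`, `parse_expr` and `parse_list` and the error messages are assumptions made for illustration. The sketch works on a token list in which `"("` and `")"` are strings, numbers are `int`s and symbols are strings, and it represents lists as tuples.

```python
def parse_expr(tokens, pos):
    """Parse one s-expression starting at tokens[pos].

    Returns the parsed value and the position of the next unread token.
    """
    if pos >= len(tokens):
        raise SyntaxError("unexpected end of input")
    token = tokens[pos]  # peek at the current token
    if token == "(":
        return parse_list(tokens, pos + 1)
    if token == ")":
        raise SyntaxError("unexpected ')'")
    return token, pos + 1  # an atom: int or symbol


def parse_list(tokens, pos):
    """Parse list items up to the matching ')'; pos points just past '('."""
    items = []
    while pos < len(tokens) and tokens[pos] != ")":
        item, pos = parse_expr(tokens, pos)
        items.append(item)
    if pos >= len(tokens):
        raise SyntaxError("missing ')'")
    return tuple(items), pos + 1  # skip the closing ')'


def parse(tokens):
    """Parse a complete token stream into a single s-expression."""
    expr, pos = parse_expr(tokens, 0)
    if pos != len(tokens):
        raise SyntaxError("unexpected tokens after expression")
    return expr


print(parse(["(", "add", 1, "(", "mul", 2, 3, ")", ")"]))
# ('add', 1, ('mul', 2, 3))
```

Passing the position explicitly instead of consuming a shared iterator keeps each function free of side effects, at the cost of returning a pair from every call.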