This is a simple s-expression parser written in Python 3. It understands symbols and numbers and uses tuples to represent the data internally.
First, the tokenizer adds padding to left and right parentheses. Then it splits the raw token stream by space characters. As a result, empty items will appear, since "(" will turn into " ( ", which will be split into an empty string, "(", and another empty string. That is why the surrounding filter discards empty items using bool. Finally, the tokenizer turns all numeric tokens into ints. It returns the resulting token stream as a list.
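The tokenizer steps described above can be sketched roughly as follows. This is a minimal illustration, not necessarily the original code: the names tokenize and to_atom are hypothetical, and treating "numeric" as "anything int() accepts" is an assumption.

```python
def to_atom(token):
    """Turn a numeric token into an int; leave every other token as a string.

    Assumption: a token counts as numeric when int() accepts it.
    """
    try:
        return int(token)
    except ValueError:
        return token


def tokenize(text):
    """Split raw s-expression text into a list of tokens."""
    # Pad parentheses so they become separate items when splitting on spaces.
    padded = text.replace("(", " ( ").replace(")", " ) ")
    # Splitting on a single space leaves empty strings behind;
    # filter(bool, ...) discards them because "" is falsy.
    raw_tokens = filter(bool, padded.split(" "))
    # Convert numeric tokens to ints and return the stream as a list.
    return [to_atom(token) for token in raw_tokens]


if __name__ == "__main__":
    print(tokenize("(add 1 (mul 2 3))"))
```

Because the sketch splits on the space character only, tabs and newlines would stay attached to neighbouring tokens; calling split() without an argument would be one way to handle arbitrary whitespace.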
The token stream can now be parsed.
A valid s-expression is either an atom (int or symbol) or a list of s-expressions. Since we operate on a token stream, the parser has to peek at the current token and then either parse a list or parse an atom.
A recursive descent parser lends itself to this type of recursive grammar.
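As a rough sketch of that idea, the parser below peeks at the current token and dispatches to a list parser or an atom parser. The function names (parse, parse_expr, parse_list, parse_atom), the position-passing style, and the error messages are assumptions made for illustration; only the use of tuples for lists comes from the description above.

```python
def parse(tokens):
    """Parse a complete token list into a single s-expression."""
    expr, pos = parse_expr(tokens, 0)
    if pos != len(tokens):
        raise SyntaxError("unexpected trailing tokens")
    return expr


def parse_expr(tokens, pos):
    """Peek at the current token and parse either a list or an atom."""
    if pos >= len(tokens):
        raise SyntaxError("unexpected end of input")
    if tokens[pos] == "(":
        return parse_list(tokens, pos)
    return parse_atom(tokens, pos)


def parse_list(tokens, pos):
    """Parse '(' followed by zero or more s-expressions and a closing ')'."""
    pos += 1  # skip the opening parenthesis
    items = []
    while pos < len(tokens) and tokens[pos] != ")":
        item, pos = parse_expr(tokens, pos)  # recursion handles nesting
        items.append(item)
    if pos >= len(tokens):
        raise SyntaxError("missing closing parenthesis")
    return tuple(items), pos + 1  # skip the closing parenthesis


def parse_atom(tokens, pos):
    """An atom is an int or a symbol; the tokenizer already converted ints."""
    token = tokens[pos]
    if token == ")":
        raise SyntaxError("unexpected closing parenthesis")
    return token, pos + 1


if __name__ == "__main__":
    print(parse(["(", "add", 1, "(", "mul", 2, 3, ")", ")"]))
```

Each parsing function returns the parsed value together with the position of the next unread token, so no shared state is needed between the recursive calls.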
You are more than welcome to share your thoughts via email.