This project was done as coursework for the formal languages course in the sophomore year at HSE SPb. Its goal is to implement a pushdown automaton together with supporting components: a concrete syntax for context-free grammars, a grammar lexer and parser, and a grammar conversion algorithm.
- Install `python` and `pip`.
- Set up the virtual environment:

      > pip3 install virtualenv
      > git clone https://github.com/Giga-Chad-LLC/pushdown-automata.git && cd pushdown-automata
      > python3 -m venv .venv
      > source .venv/bin/activate # on windows: .venv\Scripts\activate.bat

- Install project dependencies (make sure that the virtual environment is enabled):

      (.venv)> pip install -r requirements.txt

- Make sure that the python version inside the virtual environment is at least `3.9.x` (if not, see how to upgrade it):

      (.venv)> python --version
      Python 3.9.6 # Python 3.9.x or above

- Lexer: `python ./lexer.py <path/to/file/with/grammar>` - saves the lexing results into a file with the same name plus the suffix `.out`.
- Parser: `python ./parser.py <path/to/file/with/grammar>` - saves the parsing results into a file with the same name plus the suffix `.out`.
- Interpreter: `python ./interpreter.py <path/to/file/with/grammar>` - loads the grammar rules from the specified file and then starts the interpreter. The interpreter waits for the user to input a single string and checks whether it is recognized by the provided grammar. The results of the analysis are written to a file with the same name plus the suffix `.out`.
- Tests: `python ./test_transformation.py` - runs tests with the `unittest` python library.
- Implementation of the algorithm that converts an arbitrary context-free grammar into weak Greibach form: created the algorithm that removes left recursion from the grammar according to the article, completed the conversion algorithm with the help of other team members, and added various utility code.
- Simulation of the pushdown automaton: created the class `Interpreter`, which traverses the states of the automaton (represented as pairs `{ string, stack }`, where `string` is the suffix of the input that still has to be covered by the grammar symbols stored in `stack`) and evaluates whether the string is recognized by the grammar.
- Took part in covering the codebase with tests.
- Collaborated with other team members in VS Code via the Live Share extension and helped other team members.
- Created concrete syntax for an abstract syntax: non-terminals, terminals, rules.
- Implemented removal of epsilon-generating rules from the grammar using the queue-based modification described in the article.
- Developed the algorithm that evaluates the user's input string and created the output printing and formatting functionality.
- Organized and maintained the collaborative work in the VS Code editor via the Live Share extension.
- Implementation of the lexer: created tokenization of the input context-free grammar description.
- Implementation of the context-free grammar parser: created the class `Grammar` and other classes that represent a context-free grammar in a form convenient for simulating a pushdown automaton.
- Wrote tests that check the correctness of the pushdown automaton simulation. The tests cover the following functionality:
  - Removing left recursion
  - Removing epsilon productions in a context-free grammar
  - Removing the start non-terminal
  - Checking the correctness of the reduction to the weak Greibach form
- Collaborated with other team members in VS Code via the Live Share extension and took part in implementing and discussing the code.
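
As a rough illustration of the `{ string, stack }` traversal mentioned above, here is a minimal sketch in plain Python. It is not the project's `Interpreter` class: the function name `recognizes`, the dict-based grammar representation, and the single-character terminals are assumptions made for this example. It also assumes the grammar is already in weak Greibach form (every non-empty rule starts with a terminal); for other grammars the search may not terminate.

```python
def recognizes(grammar, start, text):
    """Return True if `text` can be derived from `start`.

    `grammar` maps each non-terminal to a list of rule bodies (tuples of
    symbols). Any symbol that is not a key of `grammar` is treated as a
    terminal. An empty tuple stands for the empty string.
    This layout is hypothetical, chosen only for the sketch.
    """
    initial = (text, (start,))
    pending = [initial]   # configurations still to be explored
    seen = {initial}      # configurations already scheduled
    while pending:
        rest, stack = pending.pop()
        if not stack:
            if not rest:
                return True   # input consumed and stack empty: accept
            continue          # stack empty but input left: dead end
        top, below = stack[0], stack[1:]
        if top in grammar:
            # Non-terminal on top: replace it with each of its rule bodies.
            successors = [(rest, tuple(body) + below) for body in grammar[top]]
        elif rest.startswith(top):
            # Terminal on top that matches the input: consume it.
            successors = [(rest[len(top):], below)]
        else:
            successors = []   # terminal mismatch: dead end
        for config in successors:
            if config not in seen:
                seen.add(config)
                pending.append(config)
    return False


# Palindromes over { a, b }, written in the hypothetical dict layout.
palindromes = {"S": [("a", "S", "a"), ("b", "S", "b"), ("a",), ("b",), ()]}
print(recognizes(palindromes, "S", "abba"))  # True
```

The search is exhaustive, so it explores every way of expanding a non-terminal. The `seen` set only avoids revisiting a configuration that was already scheduled.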
- Non-terminals format: `🤯...🤯`
- Terminals format: `🥵...🥵`
- Empty string (aka `ε`) format: `😵`
- Enumeration terminator format: `🗿`
- Start non-terminal format: `start=🤯...🤯`. Start must be included only once and must not be followed by the enumeration terminator `🗿`.
- A non-terminal is followed by the binding arrow `👉`, which is followed by the enumeration of terminals and non-terminals separated by the `🤌` sign.
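
To make the syntax above concrete, here is a small tokenizer sketch that uses only Python's standard `re` module. It is not the project's `ply.lex` lexer: the token names are borrowed from the grammar table in this README, while the regular expressions and the `tokenize` function are assumptions made for illustration.

```python
import re

# Token names follow the grammar table in this README; the patterns
# themselves are assumptions, not the project's actual ply.lex rules.
TOKEN_SPEC = [
    ("START", r"start=🤯[^🤯]+🤯"),
    ("NON_TERMINAL", r"🤯[^🤯]+🤯"),
    ("TERMINAL", r"🥵[^🥵]+🥵"),
    ("EMPTY", r"😵"),
    ("ARROW", r"👉"),
    ("SEPARATOR", r"🤌"),
    ("END", r"🗿"),
    ("SKIP", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(text):
    """Return a list of (token_name, lexeme) pairs, skipping whitespace."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at offset {pos}")
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


for token in tokenize("start=🤯S🤯 🤯S🤯 👉 🥵(🥵 🤯S🤯 🥵)🥵 🤌 😵 🗿"):
    print(token)
```
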
- Palindromes over the alphabet `{ a, b }`:

      start=🤯str🤯
      🤯str🤯 👉 🥵a🥵 🤯str🤯 🥵a🥵 🤌 🥵b🥵 🤯str🤯 🥵b🥵 🤌 🥵a🥵 🤌 🥵b🥵 🤌 😵 🗿

- Balanced bracket sequences over the alphabet `{ (, ) }`:

      start=🤯S🤯
      🤯S🤯 👉 🥵(🥵 🤯S🤯 🥵)🥵 🤌 🥵(🥵 🤯S🤯 🥵)🥵 🤯S🤯 🤌 😵 🗿

- Basic arithmetic over the digits `{ 0, 1, 2, 3 }`:

      start=🤯expr🤯
      🤯expr🤯 👉 🤯expr🤯 🥵+🥵 🤯term🤯 🤌 🤯expr🤯 🥵-🥵 🤯term🤯 🤌 🤯term🤯🗿
      🤯term🤯 👉 🤯term🤯 🥵*🥵 🤯mult🤯 🤌 🤯term🤯 🥵/🥵 🤯mult🤯 🤌 🤯mult🤯🗿
      🤯mult🤯 👉 🥵(🥵 🤯expr🤯 🥵)🥵 🤌 🥵0🥵 🤌 🥵1🥵 🤌 🥵2🥵 🤌 🥵3🥵🗿

We used the `ply.lex` and `ply.yacc` python libraries to implement the lexer and parser. The abstract syntax tree that we generate when parsing the grammar is based on the following grammar-describing language:
Grammar Entity | Ruleset |
---|---|
Single | EMPTY |
Single | NON_TERMINAL |
Single | TERMINAL |
Multiple | Single |
Multiple | Multiple Single |
Description | Multiple |
Description | Description SEPARATOR Multiple |
Rule | NON_TERMINAL ARROW Description END |
Ruleset | Rule |
Ruleset | Ruleset Rule |
Start | START |
Root | Start Ruleset |
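
The table can also be read as a recipe for a hand-written parser. The sketch below follows it rule by rule in plain Python instead of `ply.yacc`, so it only approximates what the project does. The `parse` function, the `(name, lexeme)` token pairs, and the returned `(start, rules)` structure are assumptions made for this example.

```python
def parse(tokens):
    """Parse a list of (token_name, lexeme) pairs into (start, rules).

    `rules` maps each non-terminal lexeme to a list of alternatives, and
    each alternative is a list of (token_name, lexeme) symbols.
    """
    pos = 0

    def peek():
        return tokens[pos][0] if pos < len(tokens) else None

    def expect(name):
        nonlocal pos
        if peek() != name:
            raise SyntaxError(f"expected {name}, found {peek() or 'end of input'}")
        lexeme = tokens[pos][1]
        pos += 1
        return lexeme

    # Root -> Start Ruleset
    start = expect("START")[len("start="):]
    rules = {}
    while True:  # Ruleset -> Rule | Ruleset Rule
        # Rule -> NON_TERMINAL ARROW Description END
        head = expect("NON_TERMINAL")
        expect("ARROW")
        alternatives = rules.setdefault(head, [])
        while True:  # Description -> Multiple | Description SEPARATOR Multiple
            symbols = []
            # Multiple -> Single | Multiple Single
            while peek() in ("EMPTY", "NON_TERMINAL", "TERMINAL"):
                symbols.append(tokens[pos])
                pos += 1
            if not symbols:
                raise SyntaxError(f"expected a symbol, found {peek() or 'end of input'}")
            alternatives.append(symbols)
            if peek() != "SEPARATOR":
                break
            pos += 1  # consume SEPARATOR
        expect("END")
        if peek() is None:
            break
    return start, rules
```

Left-recursive entries such as `Ruleset -> Ruleset Rule` are handled with loops here, which accepts the same token sequences but does not build the nested tree that a `ply.yacc` grammar would.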
We got our asses 🔥burned down🔥.