Replies: 1 comment
Update: I have since figured out how to keep GC enabled during parsing without swapping out the current parser. If/when we do finally switch away from yacc, I expect it to be for a hand-written parser that allows for more control over things like error handling and user-extensible syntax.
I have hacked up a change (emphasis on the "hack" -- it is not at all PR-quality code right now) which enables garbage collection during the course of parsing. In particular, it leaves the GC enabled while the shell is collecting input to be parsed. This should, at least in theory, make it possible to run user-defined es code during input, enabling things like programmable completions (or, more extremely, I'd love to see if it were possible to define an all-es readline alternative.) The catch is that implementing this required changing parser generator engines entirely: I swapped out the current yacc/bison parser for the lemon parser generator.
Lemon is the parser generator used in the SQLite project. It is an LALR(1) parser generator like yacc and friends, and its grammar syntax is largely the same as well (I had to change a lot of lines of the grammar file, but those changes were generally mechanical). It is also public domain, written in mostly-library-less C89, and intended to be included directly in the source tree of other projects, which means it shouldn't add any meaningful portability burden.
The features which make lemon compelling are these:

1. Generated lemon parsers are "push-style" instead of "pull-style". This means that rather than calling `yyparse()` and letting it "pull" tokens from the input as it needs, the caller instead takes tokens from the input itself, and then "pushes" those tokens into the parser in a loop until either the tokens run out or the parser signals that a full valid statement has been given. This means that input happens without any live parser code in the stack -- all parser state is encapsulated into a single `parser` object.
2. Lemon enables a high degree of control of the generated code. Generated logic from the grammar file is injected into a template file `lempar.c` which can be supplied by the caller. This means that the `parser` object from the first item can be inspected and even augmented with application-specific things -- for instance, garbage collection routines, which allow the `parser` to be `Ref`d.

These two things together make it so the parser-calling bit of code in `input.c` can look, roughly, like a plain loop that fetches a token and pushes it into the parser, and `yylex()` (and the `fill` functions that it calls to fetch input) can run with the GC enabled!

I think this is very exciting. Interactive features in general have become a major Achilles' heel for es in the age of friendly, interactive shells. However, this is a big change, and switching from what is essentially The Standard parser generator to one that is significantly more obscure is a decision that should be made carefully. So, I'm curious what people think about it.
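For anyone who hasn't seen a push-style parser before, here is a toy sketch of the general shape of the loop described above. To be clear, this is neither the actual es code nor a Lemon-generated parser: `toy_parser`, `toy_push`, `toy_lex` and `parse_line` are made-up names, and the "grammar" is just a run of words ended by a newline. The point is only that the lexer call happens while no parser frames are on the C stack, and that everything the parser knows lives in one struct.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Token kinds for a toy grammar:  statement := WORD* NEWLINE  */
enum tok { TOK_WORD, TOK_NEWLINE, TOK_EOF };

/* All parser state lives in this object; nothing is kept on the C stack
 * between pushes. Hypothetical stand-in for a generated push parser. */
struct toy_parser {
    int nwords;  /* WORD tokens seen in the current statement */
    int done;    /* set once a complete statement has been pushed */
    int error;   /* set when input ends in the middle of a statement */
};

static void toy_init(struct toy_parser *p) {
    p->nwords = 0;
    p->done = 0;
    p->error = 0;
}

/* Push one token into the parser; returns as soon as it is consumed. */
static void toy_push(struct toy_parser *p, enum tok t) {
    assert(p != NULL);
    if (p->done || p->error)
        return;
    switch (t) {
    case TOK_WORD:    p->nwords++;  break;
    case TOK_NEWLINE: p->done = 1;  break;
    case TOK_EOF:     p->error = 1; break;
    }
}

/* Toy lexer: whitespace-separated words, '\n' ends a statement. */
static enum tok toy_lex(const char **src) {
    const char *s = *src;
    while (*s != '\n' && isspace((unsigned char)*s))
        s++;                          /* skip blanks, but not the newline */
    if (*s == '\0') { *src = s;     return TOK_EOF; }
    if (*s == '\n') { *src = s + 1; return TOK_NEWLINE; }
    while (*s != '\0' && !isspace((unsigned char)*s))
        s++;                          /* consume one word */
    *src = s;
    return TOK_WORD;
}

/* Driver loop: returns the word count of the first statement in src,
 * or -1 if the input ends before a newline is seen. */
int parse_line(const char *src) {
    struct toy_parser p;
    toy_init(&p);
    while (!p.done && !p.error) {
        enum tok t = toy_lex(&src);   /* no parser frames are live here */
        toy_push(&p, t);
    }
    return p.error ? -1 : p.nwords;
}
```

Calling `parse_line("echo hello world\n")` pushes three word tokens and a newline. In a real shell, the `toy_lex` call is where input collection (and, in principle, the GC and user-defined code) would run, since the parser is just a dormant object at that point.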