Incremental and error-resilient parser #286
Comments
But until then, we must make the most of Ohm's parser, including its incremental parsing capabilities, and try to improve the clarity of its syntax error reporting, either locally or in its upstream repo. It actually seems to be not that far from giving a structured error message instead of a pre-formed string (although if worst comes to worst, we could parse the resulting string, however brittle this may seem). Note that it's not strictly necessary to eventually make an incremental parser, but it's a must to make it error-resilient and have it produce nice errors (see: #183).
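As a rough illustration of the "parse the resulting string" fallback, here is a minimal sketch. It assumes the parser's short error message looks like `Line 1, col 5: expected "(" or a digit`; that format, the `SyntaxErrorInfo` shape, and the `parseShortMessage` name are all assumptions made for this example, not a documented contract.

```typescript
// Hypothetical shape for a structured syntax error; not part of Ohm's API.
interface SyntaxErrorInfo {
    line: number;
    column: number;
    expected: string[];
}

// Assumed message shape: `Line 1, col 5: expected "(" or a digit`.
// If the wording differs, this returns null rather than guessing.
function parseShortMessage(message: string): SyntaxErrorInfo | null {
    const m = /^Line (\d+), col (\d+): expected (.+)$/.exec(message.trim());
    if (m === null) {
        return null;
    }
    const rest = m[3] ?? "";
    // Split on ", ", " or " and ", or ". This breaks if an expected item
    // itself contains one of those separators, which is part of the brittleness.
    const expected = rest
        .split(/,? or |, /)
        .map((s) => s.trim())
        .filter((s) => s.length > 0);
    return { line: Number(m[1]), column: Number(m[2]), expected };
}
```

Returning `null` on any unrecognized message keeps the fallback honest: callers can still show the original string when the structured form is unavailable.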
hey @novusnota this is a great idea and a huge boost for
The target language is assumed to be
This is definitely out of scope for the parser project. There are two static analyzer projects supported by the grants and bounties program:
@0xGeorgii this is a big project, but we don't need to rush out the implementation before we prototype some more with Ohm and check off some points from #314 with it. Moreover, #183 has to be addressed with the current Ohm parser we have. That said, Ohm won't be going anywhere even when we have a new parser: it's always nice to have a formal specification/reference, so Ohm's
Now, to your questions:
I think just being faster than Ohm would do, but concrete metrics are up for discussion.
Yes, that's the point :) Oh, and incrementality: the lexer/parser combo should be able to start from a specified rule and not re-lex/re-parse the whole tree. This is very important for editor/IDE environments where the source code changes frequently. (We may introduce a debounce or something similar here, just to keep things simple, but incrementality is still worth keeping in mind.) At the moment, I lean towards a combined LL (with some tricks) + Pratt (for expressions) parser architecture. But this is not set in stone :)
Not sure here, but it would be nice.
Yes, and it also must be error-resilient — just marking invalid tokens and going forward should be enough.
That's what Ohm's grammar is for: syntax will first be tested in it, then those changes would be matched by the new parser. At least that's how I view it. Some important questions (IMHO) weren't asked:
Summoning @anton-trunov to correct me about our plans and/or my points here :)
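To make the "Pratt for expressions" and "just marking invalid tokens and going forward" points a bit more concrete, here is a small sketch of both ideas together. It is not the planned implementation: the token kinds, the `BINDING_POWER` table, and the `lex`/`parse` names are made up for this example, and the grammar is a toy one (integers and four left-associative operators).

```typescript
type Token =
    | { kind: "num"; value: number }
    | { kind: "op"; value: string }
    | { kind: "invalid"; text: string }
    | { kind: "eof" };

type Expr =
    | { kind: "num"; value: number }
    | { kind: "binary"; op: string; left: Expr; right: Expr }
    | { kind: "error"; message: string };

// Illustrative binding powers only, not Tact's actual precedence table.
const BINDING_POWER: Record<string, number> = { "+": 10, "-": 10, "*": 20, "/": 20 };

// The lexer never fails: unknown characters become "invalid" tokens.
function lex(src: string): Token[] {
    const tokens: Token[] = [];
    const re = /(\d+)|([-+*\/])|(\S)/g;
    let m: RegExpExecArray | null;
    while ((m = re.exec(src)) !== null) {
        if (m[1] !== undefined) tokens.push({ kind: "num", value: Number(m[1]) });
        else if (m[2] !== undefined) tokens.push({ kind: "op", value: m[2] });
        else tokens.push({ kind: "invalid", text: m[0] });
    }
    tokens.push({ kind: "eof" });
    return tokens;
}

function parse(src: string): { expr: Expr; errors: string[] } {
    const tokens = lex(src);
    const errors: string[] = [];
    let pos = 0;
    const peek = (): Token => tokens[pos] ?? { kind: "eof" };

    // Record and drop invalid tokens so that parsing can keep going.
    function skipInvalid(): void {
        let t = peek();
        while (t.kind === "invalid") {
            errors.push(`unexpected '${t.text}'`);
            pos++;
            t = peek();
        }
    }

    function parsePrimary(): Expr {
        skipInvalid();
        const t = peek();
        if (t.kind === "num") {
            pos++;
            return { kind: "num", value: t.value };
        }
        // Missing operand: report it and return a placeholder without consuming input.
        const message = "expected a number";
        errors.push(message);
        return { kind: "error", message };
    }

    // Pratt loop: keep extending `left` while the next operator binds tighter than minPower.
    function parseExpr(minPower: number): Expr {
        let left = parsePrimary();
        for (;;) {
            skipInvalid();
            const t = peek();
            if (t.kind !== "op") break;
            const power = BINDING_POWER[t.value] ?? 0;
            if (power <= minPower) break;
            pos++;
            const right = parseExpr(power); // same power on the right gives left associativity
            left = { kind: "binary", op: t.value, left, right };
        }
        return left;
    }

    const expr = parseExpr(0);
    skipInvalid();
    if (peek().kind !== "eof") errors.push("unexpected trailing input");
    return { expr, errors };
}
```

With this sketch, `parse("1 + @ 2 * 3")` yields the same tree as `parse("1 + 2 * 3")` plus one recorded error, and `parse("1 +")` yields a `binary` node whose right side is an `error` placeholder. A real parser would also carry source positions on tokens and errors, which are omitted here for brevity.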
Would greatly help with #183.
The basic idea is to move from the current Ohm-generated parser to a standalone one, which would be suitable for tools like a Language Server. Additionally, it would be nice to keep the existing grammar.ohm spec file: it will allow us to check the implementation of the new parser against an Ohm-generated one.
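One possible way to do that check is a differential test over a corpus of source files. The sketch below is only an assumption about how it could look: both parsers are modeled as plain functions from source text to an AST value that throw on rejection, and `ParseFn`, `Mismatch`, and `compareParsers` are hypothetical names, not existing project APIs. An error-resilient parser would need a thin wrapper that throws when it has recorded any errors.

```typescript
// Assumed interface: returns a JSON-serializable AST, throws on a syntax error.
type ParseFn = (source: string) => unknown;

interface Mismatch {
    source: string;
    reference: string;
    candidate: string;
}

// Reduce a parse attempt to a comparable string.
function outcome(parse: ParseFn, source: string): string {
    try {
        return "ok:" + JSON.stringify(parse(source));
    } catch {
        // Only the fact of rejection is compared; error wording may legitimately differ.
        return "error";
    }
}

// Run both parsers over the corpus and collect every input they disagree on.
function compareParsers(reference: ParseFn, candidate: ParseFn, corpus: string[]): Mismatch[] {
    const mismatches: Mismatch[] = [];
    for (const source of corpus) {
        const ref = outcome(reference, source);
        const cand = outcome(candidate, source);
        if (ref !== cand) {
            mismatches.push({ source, reference: ref, candidate: cand });
        }
    }
    return mismatches;
}
```

Comparing serialized ASTs assumes both parsers emit the same node shapes. If the new parser keeps extra data (for example, trivia or error nodes), a normalization step would be needed before comparison.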
List of ideas, papers and useful blog-posts (organized by year, in decreasing order):
Some more things: