Replace the current RegExp-based brittle parser with a new prototype #26

novusnota · 2024-05-15T15:36:53Z

Problems with current RegExp-based parsing

it's too brittle
it's incomplete and won't be able to compete with proper parsing algorithms (although it somehow manages to provide better completions than the current VSCode plugin :)
it's rather slow: completion delays start at around 1,000 LoC of Tact or even earlier, a size that large-enough contracts can certainly reach

Proposed solution

The idea is to quickly iterate on the new LL+Pratt parser in this plugin, so as not to disturb the main compiler's workflow. It may initially come without comment handling or very nice error messages, but it will be robust and already capable of a lot. It should also be easier to maintain than the RegExp one.
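To illustrate the Pratt half of an LL+Pratt design, here is a minimal sketch of precedence-climbing expression parsing. It is not the plugin's parser: the toy grammar (integers, identifiers, `+ - * /`, unary minus, parentheses), the binding-power table, and all names such as `tokenize`, `Parser`, and `parse` are assumptions made for the example.

```python
# Hypothetical toy grammar, NOT Tact: integers, identifiers, + - * /,
# unary minus and parentheses. All names here are illustrative.

# (left, right) binding powers; right > left makes an operator left-associative.
BINDING_POWER = {"+": (1, 2), "-": (1, 2), "*": (3, 4), "/": (3, 4)}
PREFIX_POWER = 5  # unary minus binds tighter than any binary operator


def tokenize(src):
    """Hand-written scanner: returns a flat list of token strings."""
    tokens, i = [], 0
    while i < len(src):
        ch = src[i]
        if ch.isspace():
            i += 1
        elif ch.isdigit():
            j = i
            while j < len(src) and src[j].isdigit():
                j += 1
            tokens.append(src[i:j])
            i = j
        elif ch.isalpha() or ch == "_":
            j = i
            while j < len(src) and (src[j].isalnum() or src[j] == "_"):
                j += 1
            tokens.append(src[i:j])
            i = j
        elif ch in "+-*/()":
            tokens.append(ch)
            i += 1
        else:
            raise SyntaxError(f"unexpected character {ch!r} at offset {i}")
    return tokens


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def parse_expr(self, min_bp=0):
        # Prefix position: literal, name, parenthesised group or unary minus.
        tok = self.advance()
        if tok is None:
            raise SyntaxError("unexpected end of input")
        if tok == "(":
            left = self.parse_expr(0)
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
        elif tok == "-":
            left = ("neg", self.parse_expr(PREFIX_POWER))
        elif tok.isdigit():
            left = int(tok)
        elif tok[0].isalpha() or tok[0] == "_":
            left = tok
        else:
            raise SyntaxError(f"unexpected token {tok!r}")

        # Infix loop: keep consuming operators that bind at least as
        # tightly as the caller requires.
        while True:
            op = self.peek()
            if op not in BINDING_POWER:
                break
            left_bp, right_bp = BINDING_POWER[op]
            if left_bp < min_bp:
                break
            self.advance()
            left = (op, left, self.parse_expr(right_bp))
        return left


def parse(src):
    parser = Parser(tokenize(src))
    tree = parser.parse_expr()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return tree
```

In such a design, statements and declarations would typically be handled by ordinary recursive-descent (LL) functions that call `parse_expr` wherever an expression is expected; adding an operator then only means adding a row to the binding-power table.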

Furthermore, its main aim is to lay sufficient groundwork for implementing a production-grade parser for the Tact compiler and the language server. Until then and beyond, we'll continue to hone the compiler APIs as they are, of course.

Additionally, it's worth mentioning that it's already possible to use the extracted language server alongside this plugin, and that will remain possible in the future, even when the new parser arrives there as well.

Goals to hit

Error-resilience: the parser should be able to work with broken code, which is the primary state of code in editors :)
Incremental parsing: only the changed parts should be re-parsed, not the whole tree every single time
Good-enough error reporting: the main thing for editors is fast and up-to-date error reporting, not its accuracy. Accuracy will be worked on later in the compiler.
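One common way to get error resilience together with fast, rough diagnostics is panic-mode recovery: on a syntax error, record a diagnostic, skip ahead to a synchronisation token, and carry on. The sketch below assumes a made-up statement form `let NAME = NUMBER;` (not Tact syntax) and hypothetical names (`Diagnostic`, `ParseError`, `parse_let`, `parse_statements`); it is meant only to show the shape of the technique.

```python
from dataclasses import dataclass


@dataclass
class Diagnostic:
    token_index: int  # index of the offending token
    message: str


class ParseError(Exception):
    def __init__(self, index, message):
        super().__init__(message)
        self.index = index
        self.message = message


def tokenize(src):
    # Crude scanner for the toy grammar: pad punctuation, split on whitespace.
    for punct in "=;":
        src = src.replace(punct, f" {punct} ")
    return src.split()


def parse_let(tokens, pos):
    """Parse one `let NAME = NUMBER ;` starting at pos.

    Returns (next_pos, (name, value)) or raises ParseError.
    """
    def at(i):
        return tokens[i] if i < len(tokens) else "<end of input>"

    if at(pos) != "let":
        raise ParseError(pos, f"expected 'let', found {at(pos)!r}")
    if not at(pos + 1).isidentifier():
        raise ParseError(pos + 1, f"expected a name, found {at(pos + 1)!r}")
    if at(pos + 2) != "=":
        raise ParseError(pos + 2, f"expected '=', found {at(pos + 2)!r}")
    if not at(pos + 3).isdigit():
        raise ParseError(pos + 3, f"expected a number, found {at(pos + 3)!r}")
    if at(pos + 4) != ";":
        raise ParseError(pos + 4, f"expected ';', found {at(pos + 4)!r}")
    return pos + 5, (tokens[pos + 1], int(tokens[pos + 3]))


def parse_statements(tokens):
    """Parse as many statements as possible, collecting errors on the way."""
    decls, errors = [], []
    pos = 0
    while pos < len(tokens):
        try:
            pos, decl = parse_let(tokens, pos)
            decls.append(decl)
        except ParseError as err:
            errors.append(Diagnostic(err.index, err.message))
            # Panic-mode recovery: resume just past the next ';'.
            pos = err.index
            while pos < len(tokens) and tokens[pos] != ";":
                pos += 1
            pos += 1
    return decls, errors
```

For example, `parse_statements(tokenize("let a = 1; let b = ; let c = 3;"))` keeps the declarations of `a` and `c` and reports a single diagnostic for the broken middle statement. The trade-off is the one named in the goals: the diagnostic is quick to produce but coarse, and anything between the error and the synchronisation token is skipped.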

P.S.: It's easier to start by moving the symbol indexing out of the auto-completion and using some standard for it, like LSIF or SCIP (yay!): https://github.com/sourcegraph/scip. Also, debounce updates of that index and do them in the background.
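The debouncing part could look roughly like the sketch below. It assumes a hypothetical `DebouncedIndexer` that the editor integration feeds with edits and polls from a timer or background worker; timestamps are passed in explicitly so the logic stays deterministic, and the `rebuild` callback stands in for whatever actually regenerates the symbol index.

```python
from typing import Callable, Optional


class DebouncedIndexer:
    """Coalesces bursts of edits into a single index rebuild.

    Hypothetical helper: `rebuild` is whatever regenerates the symbol
    index; `delay` is the quiet period (in seconds) required before it runs.
    """

    def __init__(self, rebuild: Callable[[str], None], delay: float = 0.3):
        self.rebuild = rebuild
        self.delay = delay
        self._pending_text: Optional[str] = None
        self._deadline: Optional[float] = None

    def on_edit(self, text: str, now: float) -> None:
        # Every edit replaces the pending text and restarts the quiet period.
        self._pending_text = text
        self._deadline = now + self.delay

    def poll(self, now: float) -> bool:
        """Run the rebuild if the quiet period has elapsed; True if it ran."""
        if self._deadline is None or now < self._deadline:
            return False
        text = self._pending_text
        self._pending_text = None
        self._deadline = None
        self.rebuild(text)
        return True
```

With a delay of 0.3 s, edits arriving at 0.0 s and 0.25 s followed by polls at 0.5 s and 0.6 s trigger exactly one rebuild, at the second poll, using the text of the latest edit. Completion requests can keep reading the previous index in the meantime.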

novusnota self-assigned this May 15, 2024

novusnota mentioned this issue May 15, 2024

Idea: Use of tree-sitter-tact or Ohm's grammar #10

Closed
