Replace the current RegExp-based brittle parser with a new prototype #26

novusnota opened this issue May 15, 2024 · 0 comments
novusnota commented May 15, 2024

Problems with current RegExp-based parsing

  • it's too brittle
  • it's incomplete and won't be able to compete with proper parsing algorithms (although it somehow manages to provide better completions than the current VSCode plugin :)
  • it's rather slow: completion delays start at around 1,000 LoC of Tact or even earlier, a size that large-enough contracts can certainly reach

Proposed solution

The idea is to quickly iterate on the new LL+Pratt parser in this plugin, so as not to disturb the main compiler's workflow. It may come without comment handling or very nice error messages, but it will be robust and already capable of a lot. It would also be easier to maintain than the RegExp one.
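To make the "Pratt" half of that idea concrete, here is a minimal sketch of precedence-climbing expression parsing. It is not Tact's grammar: the operator set, the binding powers, and the names `tokenize`, `Parser`, and `parse` are all assumptions made up for illustration.

```python
# Hypothetical binding powers for a toy expression grammar (not Tact's real one).
BINDING_POWER = {"+": 10, "-": 10, "*": 20, "/": 20}
UNARY_MINUS_POWER = 30  # assumed to bind tighter than any binary operator


def tokenize(src):
    """Split source into number/identifier tokens and single-character operators."""
    tokens, i = [], 0
    while i < len(src):
        ch = src[i]
        if ch.isspace():
            i += 1
        elif ch.isalnum() or ch == "_":
            start = i
            while i < len(src) and (src[i].isalnum() or src[i] == "_"):
                i += 1
            tokens.append(src[start:i])
        elif ch in "+-*/()":
            tokens.append(ch)
            i += 1
        else:
            raise SyntaxError(f"unexpected character {ch!r} at offset {i}")
    return tokens


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def parse_expression(self, min_bp=0):
        # Prefix position: an atom, a parenthesized expression, or unary minus.
        tok = self.advance()
        if tok is None:
            raise SyntaxError("unexpected end of input")
        if tok == "(":
            left = self.parse_expression(0)
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
        elif tok == "-":
            left = ("neg", self.parse_expression(UNARY_MINUS_POWER))
        elif tok in BINDING_POWER or tok == ")":
            raise SyntaxError(f"unexpected token {tok!r}")
        else:
            left = tok
        # Infix loop: keep consuming operators that bind tighter than min_bp.
        while True:
            op = self.peek()
            bp = BINDING_POWER.get(op)
            if bp is None or bp <= min_bp:
                break
            self.advance()
            left = (op, left, self.parse_expression(bp))
        return left


def parse(src):
    parser = Parser(tokenize(src))
    tree = parser.parse_expression()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return tree


print(parse("1 + 2 * 3"))  # ('+', '1', ('*', '2', '3'))
```

Adding an operator is a one-line change to the binding-power table, which is part of why this style tends to be easier to maintain than a pile of regular expressions.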

Furthermore, its main aim is to provide sufficient groundwork for implementing a production-grade parser for the Tact compiler and the language server. Until then and beyond, we'll continue to hone the compiler APIs as they are, of course.

Additionally, it's worth mentioning that it's already possible to use the extracted language server alongside this plugin, and that will remain possible even once the new parser arrives there as well.

Goals to hit

  • Error-resilience: the parser should be able to work with broken code, which is the primary state of code in editors :)
  • Incremental parsing: only the changed parts should get re-parsed, not the whole tree every single time
  • Good-enough error reporting: the main thing for editors is fast and up-to-date error reporting, not its accuracy. Accuracy will be worked on later in the compiler.
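As a rough illustration of the error-resilience goal, the sketch below uses panic-mode recovery: when a statement fails to parse, it records a diagnostic, emits an error node, skips to the next `;`, and carries on. The `let <name> = <number> ;` statement shape and the helper names `expect` and `parse_statements` are hypothetical, chosen only to keep the example small; the real parser may well recover differently.

```python
def expect(tokens, i, predicate, what):
    """Return tokens[i] if it satisfies predicate, otherwise raise SyntaxError."""
    if i >= len(tokens) or not predicate(tokens[i]):
        got = tokens[i] if i < len(tokens) else "end of input"
        raise SyntaxError(f"expected {what}, got {got!r} at token {i}")
    return tokens[i]


def parse_statements(tokens):
    """Parse toy `let <name> = <number> ;` statements without stopping on errors.

    Returns (nodes, errors); a broken statement becomes an ("error", start) node.
    """
    nodes, errors, i = [], [], 0
    while i < len(tokens):
        try:
            expect(tokens, i, lambda t: t == "let", "'let'")
            name = expect(tokens, i + 1, str.isidentifier, "a name")
            expect(tokens, i + 2, lambda t: t == "=", "'='")
            value = expect(tokens, i + 3, str.isdigit, "a number")
            expect(tokens, i + 4, lambda t: t == ";", "';'")
            nodes.append(("let", name, int(value)))
            i += 5
        except SyntaxError as err:
            errors.append(str(err))
            nodes.append(("error", i))
            # Panic-mode recovery: skip to just past the next ';'.
            while i < len(tokens) and tokens[i] != ";":
                i += 1
            i += 1
    return nodes, errors


nodes, errors = parse_statements("let x = ; let y = 2 ;".split())
print(nodes)   # [('error', 0), ('let', 'y', 2)]
print(errors)  # ["expected a number, got ';' at token 3"]
```

The second statement still yields a usable node even though the first is broken, which is the property completions need while the user is mid-edit.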

P.S.: It's easier to start by moving the symbol indexing out of the auto-completion and using some standard, like LSIF or SCIP (yay!): https://github.com/sourcegraph/scip. Also, debounce updates of that index and do them in the background.
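The debouncing part could look roughly like the sketch below, which uses only the standard library's `threading.Timer`. The `Debouncer` class and the `reindex` callback are hypothetical names, and the 0.3 second delay is an arbitrary assumption; how the plugin actually hooks into editor events is not covered here.

```python
import threading


class Debouncer:
    """Run `callback` once, `delay` seconds after the most recent `trigger()`."""

    def __init__(self, delay, callback):
        self.delay = delay
        self.callback = callback
        self._timer = None
        self._lock = threading.Lock()

    def trigger(self, *args):
        with self._lock:
            # A newer edit supersedes any pending run.
            if self._timer is not None:
                self._timer.cancel()
            self._timer = threading.Timer(self.delay, self.callback, args)
            self._timer.daemon = True  # do not keep the process alive for an index update
            self._timer.start()


def reindex(path):
    # Placeholder for the real work: rebuilding the symbol index for `path`.
    print(f"re-indexing {path}")


# Usage: call trigger() on every edit; reindex runs on a background thread
# only after edits have paused for 0.3 seconds.
index_updates = Debouncer(0.3, reindex)
```

Because the callback runs on a timer thread, the completion path never waits on indexing; it only reads whatever index was last completed.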
