-
-
Notifications
You must be signed in to change notification settings - Fork 33
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
recursion causes stackoverflow and crash #368
Comments
I can't reproduce this on my dev machine with the example definition of I thought that in practice we'd be safe until expressions got much more deeply nested. The flisp parser uses exactly the same recursive structure but it's run in an interpreter so I guess it might not be using the native stack... |
A nesting depth of 1700 is sufficient to cause stack overflow on my machine. For example:
Typing such huge expressions is unusual and probably cursed. But if someone is generating them it's disturbingly easy to hit the limit here. |
I think I saw this error message locally when doing a precompile and it also involved this code |
I've been seeing this a lot too.
|
(Aside: why does these errors keep printing out a weird |
This code is from Tokenize I guess so we could change that code as an immidiate workaround I guess. Or is it even from this package? |
It's from Tokenize. In JuliaSyntax I've changed the way that function is defined for efficiency and maintainability. |
Ok, I'll port that and update it. |
This is still on my mind. I fear it's possibly the primary thing that can hold us back from deleting the flisp parser in the near future (please discuss!?) Ironically flisp is interpreted so its stack is actually on the heap (probably? I think?) - so it has less of a problem with deep recursion than the JuliaSyntax parser. One option may be to go toward Pratt parsing - or the iterative equivalent - the shunting yard algorithm: More interesting discussion here: It's not clear to me whether that would accommodate all the recursive descent hacks we've already got - investigation needed. Also it's not clear whether we can keep the code readable if we go to an explicit stack... |
We are now getting a fair number of crash reports from the VS Code extension that look like this bug (at the moment it takes the top spot of "most bugs reported by the LS"), so in the wild this seems to still be a real problem. @c42f idea about Pratt parsing sounds like a big lift? Is that in the cards in the near to mid-term? Any other ideas on how to solve this? I have more crash reports with stack overflow messages that I could share if that is useful. |
This is very much a work in progress, but shows some promise for *very greatly* reducing our recursion depth. The idea is to use the non recursive shunting-yard algorithm for parsing operators and grouping-parentheses but to delegate back to the existing recursive formulation for other constructs. This will likely solve #368 in all practical cases - I expect deeply recursive constructs only for huge chains of operators and parentheses. Currently our operator parsing consumes maybe 15 or so stack frames every time a grouping parenthesis nested in combination with arithmetic. This quickly leads to absurdly deep program stacks and stack overflow. Moving to a system like a Pratt parser where we skip non-used precedence levels would make this a single stack frame. Moving to the shunting yard algorithm makes it zero stack frames, provided we can also use it to treat grouping parentheses (not an entirely simple thing, because parentheses in Julia are *very* syntactically overloaded.) The biggest challenge here is to ensure we exactly reproduce all of Julia's operator precedence rules, which have many complicated special cases. The demo here doesn't cover many special cases, but it does show how a few of these can be dealt with quite simply in the non-recursive context. For example, chains of `+` and `*` need to parse into a single n-ary call, and it was reasonably easy to add this special case.
@davidanthoff I've started prototyping non-recursive operator precedence parsing over at #524 (basically equivalent to Pratt, but no recursion at all) and it looks very promising. However you're right that doing it in full is a relatively big job which I'd very roughly estimate will take a month full time if we count the testing of all the many subtle operator precedence rules and integration with It may be possible to get some of the benefits with a bit less work but that will require some subtle targeting in the right part of the operator precedence stack. I won't have time to work further on this for the next month unless Jeff is willing to delay the end date of the JuliaLowering project (I might message Jeff and ask about this as a potential project to slot in somewhere in the near future).
Only if you feel they show something distinct, I think I understand the underlying issue here quite well and further stack traces are unlikely to be useful. However, if VSCode users supply example code that would be rather useful. |
observed in https://s3.amazonaws.com/julialang-reports/nanosoldier/pkgeval/by_hash/5a9df24_vs_3250804/ODE.primary.log
In general, a compiler/parser commonly must not use recursion since it is surprisingly easy for the user to generate inputs that cause it to crash.
The text was updated successfully, but these errors were encountered: