recursion causes stackoverflow and crash #368

vtjnash · 2023-10-15T18:35:31Z

observed in https://s3.amazonaws.com/julialang-reports/nanosoldier/pkgeval/by_hash/5a9df24_vs_3250804/ODE.primary.log

In general, a compiler/parser commonly must not use recursion since it is surprisingly easy for the user to generate inputs that cause it to crash.

is_operator_start_char(u::UInt32) = u == 0x00000021 || (u == 0x00000024 || (u == 0x00000025 || (u == 0x00000026 || (u == 0x00000027 || (u == 0x0000002a || (u == 0x0000002b || (u == 0x0000002d || (u == 0x0000002e || (u == 0x0000002f || (u == 0x0000003a || (u == 0x0000003c || (u == 0x0000003d || (u == 0x0000003e || (u == 0x0000003f || (u == 0x0000005c || (u == 0x0000005e || (u == 0x00000069 || (u == 0x00000077 || (u == 0x0000007c || (u == 0x0000007e || (u == 0x000000ac || (u == 0x000000b1 || (u == 0x000000d7 || (u == 0x000000f7 || (u == 0x00002026 || (u == 0x0000205d || (u == 0x0000214b || (u == 0x00002190 || (u == 0x00002191 || (u == 0x00002192 || (u == 0x00002193 || (u == 0x00002194 || (u == 0x0000219a || (u == 0x0000219b || (u == 0x000021a0 || (u == 0x000021a3 || (u == 0x000021a6 || (u == 0x000021ae || (u == 0x000021ce || (u == 0x000021cf || (u == 0x000021d2 || (u == 0x000021d4 || (u == 0x000021f4 || (u == 0x000021f5 || (u == 0x000021f6 || (u == 0x000021f7 || (u == 0x000021f8 || (u == 0x000021f9 || (u == 0x000021fa || (u == 0x000021fb || (u == 0x000021fc || (u == 0x000021fd || (u == 0x000021fe || (u == 0x000021ff || (u == 0x00002208 || (u == 0x00002209 || (u == 0x0000220a || (u == 0x0000220b || (u == 0x0000220c || (u == 0x0000220d || (u == 0x00002213 || (u == 0x00002214 || (u == 0x00002217 || (u == 0x00002218 || (u == 0x00002219 || (u == 0x0000221a || (u == 0x0000221b || (u == 0x0000221c || (u == 0x0000221d || (u == 0x00002224 || (u == 0x00002225 || (u == 0x00002226 || (u == 0x00002227 || (u == 0x00002228 || (u == 0x00002229 || (u == 0x0000222a || (u == 0x00002237 || (u == 0x00002238 || (u == 0x0000223a || (u == 0x0000223b || (u == 0x0000223d || (u == 0x0000223e || (u == 0x00002240 || (u == 0x00002241 || (u == 0x00002242 || (u == 0x00002243 || (u == 0x00002244 || (u == 0x00002245 || (u == 0x00002246 || (u == 0x00002247 || (u == 0x00002248 || (u == 0x00002249 || (u == 0x0000224a || (u == 0x0000224b || (u == 0x0000224c || (u == 0x0000224d || (u == 0x0000224e || (u == 0x0000224f || (u == 0x00002250 || (u == 0x00002251 || (u == 0x00002252 || (u == 0x00002253 || (u == 0x00002254 || (u == 0x00002255 || (u == 0x00002256 || (u == 0x00002257 || (u == 0x00002258 || (u == 0x00002259 || (u == 0x0000225a || (u == 0x0000225b || (u == 0x0000225c || (u == 0x0000225d || (u == 0x0000225e || (u == 0x0000225f || (u == 0x00002260 || (u == 0x00002261 || (u == 0x00002262 || (u == 0x00002263 || (u == 0x00002264 || (u == 0x00002265 || (u == 0x00002266 || (u == 0x00002267 || (u == 0x00002268 || (u == 0x00002269 || (u == 0x0000226a || (u == 0x0000226b || (u == 0x0000226c || (u == 0x0000226d || (u == 0x0000226e || (u == 0x0000226f || (u == 0x00002270 || (u == 0x00002271 || (u == 0x00002272 || (u == 0x00002273 || (u == 0x00002274 || (u == 0x00002275 || (u == 0x00002276 || (u == 0x00002277 || (u == 0x00002278 || (u == 0x00002279 || (u == 0x0000227a || (u == 0x0000227b || (u == 0x0000227c || (u == 0x0000227d || (u == 0x0000227e || (u == 0x0000227f || (u == 0x00002280 || (u == 0x00002281 || (u == 0x00002282 || (u == 0x00002283 || (u == 0x00002284 || (u == 0x00002285 || (u == 0x00002286 || (u == 0x00002287 || (u == 0x00002288 || (u == 0x00002289 || (u == 0x0000228a || (u == 0x0000228b || (u == 0x0000228d || (u == 0x0000228e || (u == 0x0000228f || (u == 0x00002290 || (u == 0x00002291 || (u == 0x00002292 || (u == 0x00002293 || (u == 0x00002294 || (u == 0x00002295 || (u == 0x00002296 || (u == 0x00002297 || (u == 0x00002298 || (u == 0x00002299 || (u == 0x0000229a || (u == 0x0000229b || (u == 0x0000229c || (u == 0x0000229e || (u == 0x0000229f || (u == 0x000022a0 || (u == 0x000022a1 || (u == 0x000022a2 || (u == 0x000022a3 || (u == 0x000022a9 || (u == 0x000022ac || (u == 0x000022ae || (u == 0x000022b0 || (u == 0x000022b1 || (u == 0x000022b2 || (u == 0x000022b3 || (u == 0x000022b4 || (u == 0x000022b5 || (u == 0x000022b6 || (u == 0x000022b7 || (u == 0x000022bb || (u == 0x000022bc || (u == 0x000022bd || (u == 0x000022c4 || (u == 0x000022c5 || (u == 0x000022c6 || (u == 0x000022c7 || (u == 0x000022c9 || (u == 0x000022ca || (u == 0x000022cb || (u == 0x000022cc || (u == 0x000022cd || (u == 0x000022ce || (u == 0x000022cf || (u == 0x000022d0 || (u == 0x000022d1 || (u == 0x000022d2 || (u == 0x000022d3 || (u == 0x000022d5 || (u == 0x000022d6 || (u == 0x000022d7 || (u == 0x000022d8 || (u == 0x000022d9 || (u == 0x000022da || (u == 0x000022db || (u == 0x000022dc || (u == 0x000022dd || (u == 0x000022de || (u == 0x000022df || (u == 0x000022e0 || (u == 0x000022e1 || (u == 0x000022e2 || (u == 0x000022e3 || (u == 0x000022e4 || (u == 0x000022e5 || (u == 0x000022e6 || (u == 0x000022e7 || (u == 0x000022e8 || (u == 0x000022e9 || (u == 0x000022ea || (u == 0x000022eb || (u == 0x000022ec || (u == 0x000022ed || (u == 0x000022ee || (u == 0x000022ef || (u == 0x000022f0 || (u == 0x000022f1 || (u == 0x000022f2 || (u == 0x000022f3 || (u == 0x000022f4 || (u == 0x000022f5 || (u == 0x000022f6 || (u == 0x000022f7 || (u == 0x000022f8 || (u == 0x000022f9 || (u == 0x000022fa || (u == 0x000022fb || (u == 0x000022fc || (u == 0x000022fd || (u == 0x000022fe || (u == 0x000022ff || (u == 0x000025b7 || (u == 0x000027c2 || (u == 0x000027c8 || (u == 0x000027c9 || (u == 0x000027d1 || (u == 0x000027d2 || (u == 0x000027d5 || (u == 0x000027d6 || (u == 0x000027d7 || (u == 0x000027f0 || (u == 0x000027f1 || (u == 0x000027f5 || (u == 0x000027f6 || (u == 0x000027f7 || (u == 0x000027f9 || (u == 0x000027fa || (u == 0x000027fb || (u == 0x000027fc || (u == 0x000027fd || (u == 0x000027fe || (u == 0x000027ff || (u == 0x00002900 || (u == 0x00002901 || (u == 0x00002902 || (u == 0x00002903 || (u == 0x00002904 || (u == 0x00002905 || (u == 0x00002906 || (u == 0x00002907 || (u == 0x00002908 || (u == 0x00002909 || (u == 0x0000290a || (u == 0x0000290b || (u == 0x0000290c || (u == 0x0000290d || (u == 0x0000290e || (u == 0x0000290f || (u == 0x00002910 || (u == 0x00002911 || (u == 0x00002912 || (u == 0x00002913 || (u == 0x00002914 || (u == 0x00002915 || (u == 0x00002916 || (u == 0x00002917 || (u == 0x00002918 || (u == 0x0000291d || (u == 0x0000291e || (u == 0x0000291f || (u == 0x00002920 || (u == 0x00002944 || (u == 0x00002945 || (u == 0x00002946 || (u == 0x00002947 || (u == 0x00002948 || (u == 0x00002949 || (u == 0x0000294a || (u == 0x0000294b || (u == 0x0000294c || (u == 0x0000294d || (u == 0x0000294e || (u == 0x0000294f || (u == 0x00002950 || (u == 0x00002951 || (u == 0x00002952 || (u == 0x00002953 || (u == 0x00002954 || (u == 0x00002955 || (u == 0x00002956 || (u == 0x00002957 || (u == 0x00002958 || (u == 0x00002959 || (u == 0x0000295a || (u == 0x0000295b || (u == 0x0000295c || (u == 0x0000295d || (u == 0x0000295e || (u == 0x0000295f || (u == 0x00002960 || (u == 0x00002961 || (u == 0x00002962 || (u == 0x00002963 || (u == 0x00002964 || (u == 0x00002965 || (u == 0x00002966 || (u == 0x00002967 || (u == 0x00002968 || (u == 0x00002969 || (u == 0x0000296a || (u == 0x0000296b || (u == 0x0000296c || (u == 0x0000296d || (u == 0x0000296e || (u == 0x0000296f || (u == 0x00002970 || (u == 0x000029b7 || (u == 0x000029b8 || (u == 0x000029bc || (u == 0x000029be || (u == 0x000029bf || (u == 0x000029c0 || (u == 0x000029c1 || (u == 0x000029e1 || (u == 0x000029e3 || (u == 0x000029e4 || (u == 0x000029e5 || (u == 0x000029f4 || (u == 0x000029f6 || (u == 0x000029f7 || (u == 0x000029fa || (u == 0x000029fb || (u == 0x00002a07 || (u == 0x00002a08 || (u == 0x00002a1d || (u == 0x00002a22 || (u == 0x00002a23 || (u == 0x00002a24 || (u == 0x00002a25 || (u == 0x00002a26 || (u == 0x00002a27 || (u == 0x00002a28 || (u == 0x00002a29 || (u == 0x00002a2a || (u == 0x00002a2b || (u == 0x00002a2c || (u == 0x00002a2d || (u == 0x00002a2e || (u == 0x00002a30 || (u == 0x00002a31 || (u == 0x00002a32 || (u == 0x00002a33 || (u == 0x00002a34 || (u == 0x00002a35 || (u == 0x00002a36 || (u == 0x00002a37 || (u == 0x00002a38 || (u == 0x00002a39 || (u == 0x00002a3a || (u == 0x00002a3b || (u == 0x00002a3c || (u == 0x00002a3d || (u == 0x00002a40 || (u == 0x00002a41 || (u == 0x00002a42 || (u == 0x00002a43 || (u == 0x00002a44 || (u == 0x00002a45 || (u == 0x00002a4a || (u == 0x00002a4b || (u == 0x00002a4c || (u == 0x00002a4d || (u == 0x00002a4e || (u == 0x00002a4f || (u == 0x00002a50 || (u == 0x00002a51 || (u == 0x00002a52 || (u == 0x00002a53 || (u == 0x00002a54 || (u == 0x00002a55 || (u == 0x00002a56 || (u == 0x00002a57 || (u == 0x00002a58 || (u == 0x00002a5a || (u == 0x00002a5b || (u == 0x00002a5c || (u == 0x00002a5d || (u == 0x00002a5e || (u == 0x00002a5f || (u == 0x00002a60 || (u == 0x00002a61 || (u == 0x00002a62 || (u == 0x00002a63 || (u == 0x00002a66 || (u == 0x00002a67 || (u == 0x00002a6a || (u == 0x00002a6b || (u == 0x00002a6c || (u == 0x00002a6d || (u == 0x00002a6e || (u == 0x00002a6f || (u == 0x00002a70 || (u == 0x00002a71 || (u == 0x00002a72 || (u == 0x00002a73 || (u == 0x00002a74 || (u == 0x00002a75 || (u == 0x00002a76 || (u == 0x00002a77 || (u == 0x00002a78 || (u == 0x00002a79 || (u == 0x00002a7a || (u == 0x00002a7b || (u == 0x00002a7c || (u == 0x00002a7d || (u == 0x00002a7e || (u == 0x00002a7f || (u == 0x00002a80 || (u == 0x00002a81 || (u == 0x00002a82 || (u == 0x00002a83 || (u == 0x00002a84 || (u == 0x00002a85 || (u == 0x00002a86 || (u == 0x00002a87 || (u == 0x00002a88 || (u == 0x00002a89 || (u == 0x00002a8a || (u == 0x00002a8b || (u == 0x00002a8c || (u == 0x00002a8d || (u == 0x00002a8e || (u == 0x00002a8f || (u == 0x00002a90 || (u == 0x00002a91 || (u == 0x00002a92 || (u == 0x00002a93 || (u == 0x00002a94 || (u == 0x00002a95 || (u == 0x00002a96 || (u == 0x00002a97 || (u == 0x00002a98 || (u == 0x00002a99 || (u == 0x00002a9a || (u == 0x00002a9b || (u == 0x00002a9c || (u == 0x00002a9d || (u == 0x00002a9e || (u == 0x00002a9f || (u == 0x00002aa0 || (u == 0x00002aa1 || (u == 0x00002aa2 || (u == 0x00002aa3 || (u == 0x00002aa4 || (u == 0x00002aa5 || (u == 0x00002aa6 || (u == 0x00002aa7 || (u == 0x00002aa8 || (u == 0x00002aa9 || (u == 0x00002aaa || (u == 0x00002aab || (u == 0x00002aac || (u == 0x00002aad || (u == 0x00002aae || (u == 0x00002aaf || (u == 0x00002ab0 || (u == 0x00002ab1 || (u == 0x00002ab2 || (u == 0x00002ab3 || (u == 0x00002ab4 || (u == 0x00002ab5 || (u == 0x00002ab6 || (u == 0x00002ab7 || (u == 0x00002ab8 || (u == 0x00002ab9 || (u == 0x00002aba || (u == 0x00002abb || (u == 0x00002abc || (u == 0x00002abd || (u == 0x00002abe || (u == 0x00002abf || (u == 0x00002ac0 || (u == 0x00002ac1 || (u == 0x00002ac2 || (u == 0x00002ac3 || (u == 0x00002ac4 || (u == 0x00002ac5 || (u == 0x00002ac6 || (u == 0x00002ac7 || (u == 0x00002ac8 || (u == 0x00002ac9 || (u == 0x00002aca || (u == 0x00002acb || (u == 0x00002acc || (u == 0x00002acd || (u == 0x00002ace || (u == 0x00002acf || (u == 0x00002ad0 || (u == 0x00002ad1 || (u == 0x00002ad2 || (u == 0x00002ad3 || (u == 0x00002ad4 || (u == 0x00002ad5 || (u == 0x00002ad6 || (u == 0x00002ad7 || (u == 0x00002ad8 || (u == 0x00002ad9 || (u == 0x00002adb || (u == 0x00002af7 || (u == 0x00002af8 || (u == 0x00002af9 || (u == 0x00002afa || (u == 0x00002b30 || (u == 0x00002b31 || (u == 0x00002b32 || (u == 0x00002b33 || (u == 0x00002b34 || (u == 0x00002b35 || (u == 0x00002b36 || (u == 0x00002b37 || (u == 0x00002b38 || (u == 0x00002b39 || (u == 0x00002b3a || (u == 0x00002b3b || (u == 0x00002b3c || (u == 0x00002b3d || (u == 0x00002b3e || (u == 0x00002b3f || (u == 0x00002b40 || (u == 0x00002b41 || (u == 0x00002b42 || (u == 0x00002b43 || (u == 0x00002b44 || (u == 0x00002b47 || (u == 0x00002b48 || (u == 0x00002b49 || (u == 0x00002b4a || (u == 0x00002b4b || (u == 0x00002b4c || (u == 0x0000ffe9 || (u == 0x0000ffea || (u == 0x0000ffeb || u == 0x0000ffec)))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))

Error: JuliaSyntax parser failed — falling back to flisp!
exception = StackOverflowError()
 [parse_with_chains(ps::Base.JuliaSyntax.ParseState, down::typeof(Base.JuliaSyntax.parse_rational), is_op::typeof(Base.JuliaSyntax.is_prec_times), chain_ops::Tuple{Base.JuliaSyntax.Kind}) at parser.jl:973]
 [parse_term(ps::Base.JuliaSyntax.ParseState) at parser.jl:965]
 [parse_with_chains(ps::Base.JuliaSyntax.ParseState, down::typeof(Base.JuliaSyntax.parse_term), is_op::typeof(Base.JuliaSyntax.is_prec_plus), chain_ops::Tuple{Base.JuliaSyntax.Kind, Base.JuliaSyntax.Kind}) at parser.jl:973]
 [parse_expr(ps::Base.JuliaSyntax.ParseState) at parser.jl:958]
 [parse_invalid_ops(ps::Base.JuliaSyntax.ParseState) at parser.jl:943]
 [parse_range(ps::Base.JuliaSyntax.ParseState) at parser.jl:853]
 [parse_LtoR(ps::Base.JuliaSyntax.ParseState, down::typeof(Base.JuliaSyntax.parse_range), is_op::typeof(Base.JuliaSyntax.is_prec_pipe_gt)) at parser.jl:367]
 [parse_pipe_gt(ps::Base.JuliaSyntax.ParseState) at parser.jl:840]
 [parse_RtoL(ps::Base.JuliaSyntax.ParseState, down::typeof(Base.JuliaSyntax.parse_pipe_gt), is_op::typeof(Base.JuliaSyntax.is_prec_pipe_lt), self::typeof(Base.JuliaSyntax.parse_pipe_lt)) at parser.jl:382]
 [parse_pipe_lt(ps::Base.JuliaSyntax.ParseState) at parser.jl:833]
 [parse_comparison(ps::Base.JuliaSyntax.ParseState, subtype_comparison::Bool) at parser.jl:791]
 [parse_comparison(ps::Base.JuliaSyntax.ParseState) at parser.jl:783]
 [parse_lazy_cond(ps::Base.JuliaSyntax.ParseState, down::typeof(Base.JuliaSyntax.parse_comparison), is_op::typeof(Base.JuliaSyntax.is_prec_lazy_and), self::typeof(Base.JuliaSyntax.parse_and)) at parser.jl:748]
 [parse_and(ps::Base.JuliaSyntax.ParseState) at parser.jl:776]
 [parse_lazy_cond(ps::Base.JuliaSyntax.ParseState, down::typeof(Base.JuliaSyntax.parse_and), is_op::typeof(Base.JuliaSyntax.is_prec_lazy_or), self::typeof(Base.JuliaSyntax.parse_or)) at parser.jl:748]
 [parse_or(ps::Base.JuliaSyntax.ParseState) at parser.jl:767]
 [parse_arrow(ps::Base.JuliaSyntax.ParseState) at parser.jl:724]
 [parse_cond(ps::Base.JuliaSyntax.ParseState) at parser.jl:662]
 [parse_RtoL(ps::Base.JuliaSyntax.ParseState, down::typeof(Base.JuliaSyntax.parse_cond), is_op::typeof(Base.JuliaSyntax.is_prec_pair), self::typeof(Base.JuliaSyntax.parse_pair)) at parser.jl:382]
 [parse_pair(ps::Base.JuliaSyntax.ParseState) at parser.jl:653]
 [parse_assignment at parser.jl:589 [inlined], parse_eq_star(ps::Base.JuliaSyntax.ParseState) at parser.jl:580]
 [parse_brackets(after_parse::Base.JuliaSyntax.var"#89#90"{Bool}, ps::Base.JuliaSyntax.ParseState, closing_kind::Base.JuliaSyntax.Kind) at parser.jl:3124]
 [parse_paren(ps::Base.JuliaSyntax.ParseState, check_identifiers::Bool) at parser.jl:3033]
 [parse_atom(ps::Base.JuliaSyntax.ParseState, check_identifiers::Bool) at parser.jl:3547]
 [parse_atom at parser.jl:3399 [inlined], parse_unary_prefix(ps::Base.JuliaSyntax.ParseState) at parser.jl:1446]
 [parse_call(ps::Base.JuliaSyntax.ParseState) at parser.jl:1407]
 [parse_factor(ps::Base.JuliaSyntax.ParseState) at parser.jl:1355]
 [parse_unary(ps::Base.JuliaSyntax.ParseState) at parser.jl:1181]
 [parse_juxtapose(ps::Base.JuliaSyntax.ParseState) at parser.jl:1104]
 [parse_where(ps::Base.JuliaSyntax.ParseState, down::typeof(Base.JuliaSyntax.parse_juxtapose)) at parser.jl:1093]
 [parse_unary_subtype(ps::Base.JuliaSyntax.ParseState) at parser.jl:1056]
 [parse_LtoR(ps::Base.JuliaSyntax.ParseState, down::typeof(Base.JuliaSyntax.parse_unary_subtype), is_op::typeof(Base.JuliaSyntax.is_prec_bitshift)) at parser.jl:367]
 [parse_shift(ps::Base.JuliaSyntax.ParseState) at parser.jl:1025]
 [parse_LtoR(ps::Base.JuliaSyntax.ParseState, down::typeof(Base.JuliaSyntax.parse_shift), is_op::typeof(Base.JuliaSyntax.is_prec_rational)) at parser.jl:367]
 ⋮

The text was updated successfully, but these errors were encountered:

c42f · 2023-10-16T06:36:18Z

I can't reproduce this on my dev machine with the example definition of is_operator_start_char(). Could the nanosolider machine have a small max stack size?

I thought that in practice we'd be safe until expressions got much more deeply nested. The flisp parser uses exactly the same recursive structure but it's run in an interpreter so I guess it might not be using the native stack...

c42f · 2023-10-16T06:59:24Z

A nesting depth of 1700 is sufficient to cause stack overflow on my machine. For example:

julia> make_expr(N) = N == 0 ? UInt32(0) : :(u == $(UInt32(N)) || $(make_expr(N-1)))

julia> parsestmt(Expr, string(make_expr(1700)));
ERROR: StackOverflowError:
...

Typing such huge expressions is unusual and probably cursed. But if someone is generating them it's disturbingly easy to hit the limit here.

KristofferC · 2023-10-31T11:36:05Z

I think I saw this error message locally when doing a precompile and it also involved this code is_operator_start_char(u::UInt32) = u == 0x00000021 || (u == 0x00000024.... So it might be that that function is on the border of overflowing?

oscardssmith · 2023-11-02T13:58:44Z

I've been seeing this a lot too.

Precompiling project...
  221 dependencies successfully precompiled in 330 seconds. 41 already precompiled.
  1 dependency had output during precompilation:
┌ Tokenize
│  ┌ Error: JuliaSyntax parser failed — falling back to flisp!
│  │   exception = (StackOverflowError(), Union{Ptr{Nothing}, Base.InterpreterIP}[Ptr{Nothing} @0x00007f178ebd056a, Ptr{Nothing} @0x00007f178e91c08c, Ptr{Nothing} @0x00007f178e91c643, Ptr{Nothing} @0x00007f178eca7a19, Ptr{Nothing} @0x00007f178ead8d2f, Ptr{Nothing} @0x00007f178ee75f33, Ptr{Nothing} @0x00007f178ec82f81, Ptr{Nothing} @0x00007f178ec8303e, Ptr{Nothing}
...

vtjnash · 2023-11-02T15:19:04Z

(Aside: why does these errors keep printing out a weird backtrace()-looking thing instead of the usual catch_stacktrace()?)

KristofferC · 2023-11-02T17:07:02Z

#375

KristofferC · 2023-11-02T17:13:45Z

This code is from Tokenize I guess so we could change that code as an immidiate workaround I guess. Or is it even from this package?

c42f · 2023-11-03T20:38:22Z

It's from Tokenize. In JuliaSyntax I've changed the way that function is defined for efficiency and maintainability.

KristofferC · 2023-11-03T21:23:40Z

Ok, I'll port that and update it.

KristofferC · 2023-11-07T14:24:39Z

JuliaLang/Tokenize.jl#208

c42f · 2024-07-19T03:12:37Z

This is still on my mind. I fear it's possibly the primary thing that can hold us back from deleting the flisp parser in the near future (please discuss!?) Ironically flisp is interpreted so its stack is actually on the heap (probably? I think?) - so it has less of a problem with deep recursion than the JuliaSyntax parser.

One option may be to go toward Pratt parsing - or the iterative equivalent - the shunting yard algorithm:
https://matklad.github.io/2020/04/15/from-pratt-to-dijkstra.html

More interesting discussion here:
https://www.reddit.com/r/rust/comments/g0eusf/blog_post_simple_but_powerful_pratt_parsing/

It's not clear to me whether that would accommodate all the recursive descent hacks we've already got - investigation needed. Also it's not clear whether we can keep the code readable if we go to an explicit stack...

davidanthoff · 2024-12-22T20:44:21Z

We are now getting a fair number of crash reports from the VS Code extension that look like this bug (at the moment it takes the top spot of "most bugs reported by the LS"), so in the wild this seems to still be a real problem.

@c42f idea about Pratt parsing sounds like a big lift? Is that in the cards in the near to mid-term? Any other ideas on how to solve this? I have more crash reports with stack overflow messages that I could share if that is useful.

This is very much a work in progress, but shows some promise for *very greatly* reducing our recursion depth. The idea is to use the non recursive shunting-yard algorithm for parsing operators and grouping-parentheses but to delegate back to the existing recursive formulation for other constructs. This will likely solve #368 in all practical cases - I expect deeply recursive constructs only for huge chains of operators and parentheses. Currently our operator parsing consumes maybe 15 or so stack frames every time a grouping parenthesis nested in combination with arithmetic. This quickly leads to absurdly deep program stacks and stack overflow. Moving to a system like a Pratt parser where we skip non-used precedence levels would make this a single stack frame. Moving to the shunting yard algorithm makes it zero stack frames, provided we can also use it to treat grouping parentheses (not an entirely simple thing, because parentheses in Julia are *very* syntactically overloaded.) The biggest challenge here is to ensure we exactly reproduce all of Julia's operator precedence rules, which have many complicated special cases. The demo here doesn't cover many special cases, but it does show how a few of these can be dealt with quite simply in the non-recursive context. For example, chains of `+` and `*` need to parse into a single n-ary call, and it was reasonably easy to add this special case.

c42f · 2025-01-03T03:16:23Z

Pratt parsing sounds like a big lift?

@davidanthoff I've started prototyping non-recursive operator precedence parsing over at #524 (basically equivalent to Pratt, but no recursion at all) and it looks very promising.

However you're right that doing it in full is a relatively big job which I'd very roughly estimate will take a month full time if we count the testing of all the many subtle operator precedence rules and integration with Base. For the longer term I think this is worth doing because I feel it's the last remaining big issue before we can think seriously about retiring the flisp parser.

It may be possible to get some of the benefits with a bit less work but that will require some subtle targeting in the right part of the operator precedence stack.

I won't have time to work further on this for the next month unless Jeff is willing to delay the end date of the JuliaLowering project (I might message Jeff and ask about this as a potential project to slot in somewhere in the near future).

I have more crash reports with stack overflow messages that I could share if that is useful.

Only if you feel they show something distinct, I think I understand the underlying issue here quite well and further stack traces are unlikely to be useful. However, if VSCode users supply example code that would be rather useful.

KristofferC mentioned this issue Nov 7, 2023

tweak is_operator_start_char to not use so heavily nested syntax JuliaLang/Tokenize.jl#208

Merged

c42f mentioned this issue Mar 16, 2024

Stack overflow when parsing 20k consecutive + signs #415

Closed

KristofferC mentioned this issue May 31, 2024

JuliaSyntax parser failed - falling back to flisp! #435

Closed

c42f mentioned this issue Dec 22, 2024

StackOverflowError from VS Code extension telemetry #520

Closed

c42f mentioned this issue Jan 3, 2025

WIP: Non-recursive shunting yard algorithm for expression parsing #524

Draft

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

recursion causes stackoverflow and crash #368

recursion causes stackoverflow and crash #368

vtjnash commented Oct 15, 2023

c42f commented Oct 16, 2023

c42f commented Oct 16, 2023

KristofferC commented Oct 31, 2023

oscardssmith commented Nov 2, 2023

vtjnash commented Nov 2, 2023

KristofferC commented Nov 2, 2023

KristofferC commented Nov 2, 2023

c42f commented Nov 3, 2023

KristofferC commented Nov 3, 2023

KristofferC commented Nov 7, 2023

c42f commented Jul 19, 2024

davidanthoff commented Dec 22, 2024

c42f commented Jan 3, 2025

recursion causes stackoverflow and crash #368

recursion causes stackoverflow and crash #368

Comments

vtjnash commented Oct 15, 2023

c42f commented Oct 16, 2023

c42f commented Oct 16, 2023

KristofferC commented Oct 31, 2023

oscardssmith commented Nov 2, 2023

vtjnash commented Nov 2, 2023

KristofferC commented Nov 2, 2023

KristofferC commented Nov 2, 2023

c42f commented Nov 3, 2023

KristofferC commented Nov 3, 2023

KristofferC commented Nov 7, 2023

c42f commented Jul 19, 2024

davidanthoff commented Dec 22, 2024

c42f commented Jan 3, 2025