-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
No size limit #22
Comments
If we can agree on a limit I agree that would be good. cc @chihweitung |
Also, relatedly, it would be good to decide what to do if the limit is exceeded (Just pass the body along anyways, I assume, rather than fail the request / eat the body?) |
We would need some telemetry, but ideally we would fail in that case. Sites can specify |
If we want to collect the telemetry for the sizes of responses, we need to be careful since it may cause fingerprinting issues. |
I am not sure if it helps with fingerprinting concerns, but I want to observe that we don't necessarily need to track sizes accurate to a every single byte. Counting responses falling into broad, logarithmic buckets should be sufficient (e.g. 256-511 kB, 512-1023 kB, 1024-2047 kB, 2048-4095 kB, etc.). |
If the parser needs to look at the whole response, then I agree with #22 (comment) that the secure fallback would be to block responses bigger than X MB. OTOH, maybe there is some wiggle room here to mitigate some of the risk from the initial launch of ORB. Some (half-baked) ideas:
/cc @csreis |
I don't think it's possible to get a reliable answer from the JavaScript parser for partial input in a way that can be standardized without a ton effort, but perhaps it's good to get a second opinion on that. @syg @littledan thoughts? In particular, how realistic is it to determine if a part of a file (say the first 4 kibibytes) is JavaScript and get that standardized? |
Hm, my hunch is that it's kind of hard at the spec level. For the encoding, are you imagining this would require a particular encoding? UTF8, 16? What if the partial payload falls on a surrogate pair boundary, for example? For parsing, one problem is that the specification is specified with "cover grammars", where an ambiguous production is parsed into a tree (e.g. is it the start of an arrow function or is it a comma expression?) that's disambiguated (from the spec's perspective) after the whole tree has been parsed. Not sure how we would refactor the grammar so there's more useful information for partial input if we run into a cover grammar. Additionally, what kind of implementation latitude are you looking for for error timing? If it's acceptable that an implementation might report certain class of early errors only after the entire file is parsed, even if they can be detected earlier, that'd probably be much easier to spec (and implement). |
Decoding is tracked by #7 and I think I have a solution that works. And yeah, those are the kind of problems I heard about when I asked others about this. What we'd basically want is a function that takes x kibibytes of JavaScript and returns probably okay/not okay. The x kibibytes are known to maybe EOF at an unusual point because it's (sometimes) the start of a file. For cases where you'd need the entire file to determine if there's an error, we would make a judgment call of sorts. As we'd have to decide whether to allow/deny based on the x kibibytes alone. (The alternative is to always parse the whole file or some kind of hybrid approach as @anforowicz outlines above. That's the plan of record.) |
Unless there's credible alternative to looking at the whole response, I suggest we mark this as something to look into in the future. It's not clear to me it has to block the mvp. |
The algorithm doesn't seem give a size limit on how much body is loaded into memory to try and parse as JSON. Caching arbitrary amounts of data in memory to run through a parser, particularly outside of the renderer process, seems not great, particularly when we're not sure if the data is JSON or JS or not.
Admittedly, I don't have full context here, so I may be missing something.
The text was updated successfully, but these errors were encountered: