Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Checking the output size #9

Open
kripken opened this issue Jul 22, 2015 · 10 comments
Open

Checking the output size #9

kripken opened this issue Jul 22, 2015 · 10 comments

Comments

@kripken
Copy link
Contributor

kripken commented Jul 22, 2015

I'm not entirely clear on the purpose of CHECKED_OUTPUT_SIZE. It looks like in one mode, we unpack the entire thing to see what the output size is (in calculate_unpacked_size); in the other, we use an unpacked size that we store in the binary.

The second mode is more efficient, but troublesome because unpacking is not deterministic, due to doubles (#7). To avoid snprintf issues we now use the browser's double printing, which is different than at least my OS libc's printing, but even without that, the libc in emscripten (musl) might print differently than the glibc that my OS uses to do the packing, etc.

If that is correct, it seems like the options are

  • Don't check the output size strictly. Give an estimate, just enough to allocate a buffer known to be big enough. zlib does this, for example.
  • Bundle a full double printer in the unpacker, like dtoa.c.
@BrendanEich
Copy link

ND is the enemy; this isn't the last time double sprinting ND will bite, I bet (based on decades of bitter experience). I know dtoa.c is ugly old code, and I haven't kept up with its maintenance log or many forks, but it looks to me like the winning option.

/be

@lukewagner
Copy link
Owner

Is the text output of JavaScript's ToString(number) deterministic?

@BrendanEich
Copy link

@lukewagner Yes! See http://www.ecma-international.org/ecma-262/6.0/#sec-tostring-applied-to-the-number-type.

/be

P.S. Recall we had Guy Steele on Ecma TC39 TG1 in 1996-7 courtesy Sun, and he and David M. Gay co-authored the dtoa.c paper.

@lukewagner
Copy link
Owner

Cool. Could we then build our double printing in terms of FFI calls to use JS ToString(number) (and then apply my hack on top to insert a '.' if there isn't one)?

@kripken
Copy link
Contributor Author

kripken commented Jul 23, 2015

That is what we do in web builds now - are you suggesting that we run the packer in JS as well, instead of natively (relying on deterministic ToNumber)? Sounds like a reasonable third option alongside the 2 mentioned before. A downside is it means no native packing/unpacking, and a little slower packing.

A fourth option is I suppose to run the packer on NULL, finding out the output size first, then doing the unpacking (as in the other CHECKED_OUTPUT_SIZE option). This would increase unpacking time, but would still allow native packing/unpacking.

@kripken
Copy link
Contributor Author

kripken commented Jul 23, 2015

It seems like optimizing the web unpacker is the most important thing, so option 3 as I think @lukewagner proposed sounds best. I implemented that. Ran into #10 along the way, but other than that a basic test now works. Packing speed clearly suffers, though.

@lukewagner
Copy link
Owner

Oh, I forgot that we need to run in both packer and unpacker. Requiring the packer to run as JS is a bit annoying. Since there aren't many double literals, I'm inclined back to what you suggested earlier of just over-allocating at the start. We already over-allocate and copy a Blob at the end (b/c the asm.js heap needs to contain the input, the output, and temporary heap usage), so this isn't a big deal.

@kripken
Copy link
Contributor Author

kripken commented Jul 24, 2015

The over-allocation would need to know how many doubles are in the code, though, because the worst-case of the size change is quite large. For example, off the top of my head, 1.1e20 vs
110000000000000000000..

I'm inclined to just do the packing in JS. This seems to be slow only because emscripten runs the packer in node by default, but in spidermonkey it should be quite snappy.

@lukewagner
Copy link
Owner

I'm just a bit reticent to introduce a dependency on JS. At the moment, it's been nice that you can run pack/unpack-asmjs as native or asm.js.

@lukewagner
Copy link
Owner

OTOH, it would help keep the unpacker small (compared to importing dtoa.c) so I guess this seems fine for now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants