Skip to content
This repository has been archived by the owner on Apr 27, 2023. It is now read-only.

struct.pack('l', -2) gives wrong result #3

Open
EnTerr opened this issue Aug 21, 2015 · 13 comments
Open

struct.pack('l', -2) gives wrong result #3

EnTerr opened this issue Aug 21, 2015 · 13 comments
Labels

Comments

@EnTerr
Copy link

EnTerr commented Aug 21, 2015

struct.pack('l', -2)
"\0\0\0\0\0\0\0\0"    -- should be "\254\255\255\255\255\255\255\255"

I guess i was wrong that leaving this was okay:

      if val < 0 then
        val = val + 2 ^ (n * 8)
      end

Comment it out and conversion works

@iryont
Copy link
Owner

iryont commented Aug 21, 2015

You are correct. It's safe to remove that part.

However, I do not believe we can read 8-bytes long signed value because of the following part in unpack:

      if signed and val >= 2 ^ (n * 8 - 1) then
        val = val - 2 ^ (n * 8)
      end

Check this out:

local val = (2 ^ 64 - 1) - 2 ^ 64
print(val)

In general it should return -1, but it does return 0. It's probably due to overflow of built-in lua number variable.

@iryont iryont added the bug label Aug 21, 2015
@EnTerr
Copy link
Author

EnTerr commented Aug 21, 2015

Right you are, unpack misbehaves! That's probably different though (if numbers are 8-byte double in Lua, they have only 7 bytes for precision?), should split as new issue

>>> 2^64, 2^64-1
1.844674407371e+19  1.844674407371e+19

@iryont
Copy link
Owner

iryont commented Aug 21, 2015

Actually 8-bytes integer can hold only up to 2^64 - 1, while 2^64 is an overflow. I have no idea (yet) what to do about it.

The problem is that following equation works:

local val = (2 ^ 53 - 1) - 2 ^ 53
print(val)

This one doesn't:

local val = (2 ^ 54 - 1) - 2 ^ 54
print(val)

The only thing I can think of is an overflow of double precision (53 bits).

@EnTerr
Copy link
Author

EnTerr commented Aug 21, 2015

Let's do a quick check which numbers are "endangered". In the Lua i am using just now:

>>> for i = 0, 63 do x= 2^i; if struct.unpack('l', struct.pack('l', x)) ~= x then print(i, x); end; end
63  9.2233720368548e+18
>>> for i = 0, 63 do x= - 2^i; if struct.unpack('l', struct.pack('l', x)) ~= x then print(i, x); end; end
0   -1
1   -2
2   -4
3   -8
4   -16
5   -32
6   -64
7   -128
8   -256
9   -512
10  -1024

2^63 is not an issue, since maxint is 2^63-1. But for negatives it's for numbers with absolutes <= 2^10, that is a concern.

I am new to Lua, so did not know that but now i see RTFM says

The type number uses two internal representations, or two subtypes, one called integer and the other called float. Lua has explicit rules about when each representation is used, but it also converts between them automatically as needed (see §3.4.3). Therefore, the programmer may choose to mostly ignore the difference between integers and floats or to assume complete control over the representation of each number. Standard Lua uses 64-bit integers and double-precision (64-bit) floats, ...

So internally Lua likely can use 64-bit integer where we need it, question is to nudge it

@randomeizer
Copy link
Contributor

FYI, this seems to be solvable by using math.floor around any exponential operation. Eg:

val = val + byte * math.floor((2 ^ ((j - 1) * 8)))

@smarek
Copy link

smarek commented Aug 12, 2020

Hey guys, does not seem to be fixed yet, and we'd like to use in kaitai-struct, and reading signed 8byte number currently does not work stable, nor cross platform (5.1, 5.2, 5.3 and LuaJit)

Is there anyone interested in this ticket, who knows the solution, please?

@Qix-
Copy link

Qix- commented Aug 12, 2020

It's probably due to overflow of built-in lua number variable.

This is correct. Lua uses double (8-byte IEEE 754 double precision). You cannot represent a 64-bit number in Lua without risking e.g. NaN - also because some implementations of Lua use NaN tagging for typing.

This is why e.g. LOVE's implementation of unpack uses cdata (since it's built on top of LuaJIT) for 64-bit numbers - meaning, you cannot manipulate them directly from Lua, but you can pass them back to the C level as properly typed values. Useful in some cases, but not so much to Lua code directly.

I have no idea how you'd do 64-bit numbers in lua without a custom bignumber type and some extensive metatable handling - but then you'll have to turn all numbers into metatables since the metatable methods for arithmetic require both operands to be tables IIRC. One of the pitfalls of Lua metamethods (and one that has bitten me time and time again).

This isn't a solvable issue unless you want to approximate - after a certain point, you will get wrong results.

If you have a solution to this problem that isn't a hack, the gods of information theory would love to have a word 👍

@iryont
Copy link
Owner

iryont commented Aug 12, 2020

This is why I never fixed this issue. I might as well just remove any 8-byte integer handling code from lua-struct and leave maximum 4-byte long integer values (which can be used to handle bigger numbers in parts anyway, at least in a hacky way).

@Qix-
Copy link

Qix- commented Aug 12, 2020

That's what we did in our own packer/unpacker - it's unfortunately really the only solution here.

@smarek
Copy link

smarek commented Aug 21, 2020

And could we, possibly, detect if the operation (pack/unpack/math) works/produces with number, that is too large to be reliably represented, and just error-out in such cases? That would be, in my opinion, comfortable trade-off between not supporting 64-bit numbers at all, and not being able to handle numerical operations at all, even if the 64-bit number represents number in much smaller range (let's say 48-bit).

I think if the error said something as Lua cannot represent number this big, it'd be better than letting the runtime overflow and get unreliable results across the environment.

And third option, mabye, would be to explicitly declare that using 64bit unpack would produce eg. bigint.lua or lua-nums BigNum and packing could be done using the same

@Qix-
Copy link

Qix- commented Aug 21, 2020

The problem with that is that now you have a completely input-dependent side effect, not at all evident from written code. It's not clear which cases will succeed or fail and it'll make things way way more difficult to reason about when it comes to error handling and the like.

Plus, in that case, just read a 32-bit and then a 16-bit number and use the bit[32] package to re-construct it.

@smarek
Copy link

smarek commented Aug 22, 2020

@Qix- yes, input-dependent side effect, that is precisely what I had in mind. If the choices about handling the issue are roughly this
kaitai-io/kaitai_struct_lua_runtime#5 (comment)

(a.) expect the target runtime to fail hard if the requested data type cannot be represented/worked with correctly (s8le/be, u8le/be methods will fail, regardless of the read data size)
(b.) expect the target runtime to fail hard, only if the read value cannot be represented correctly (ie. returning uint32/int32 if the read value fits, and failing otherwise)
(c.) expect the runtime to provide, if necessary, additional data type to handle such data (java BigInteger, lua BigNum, etc.), even if that means introducing new runtime dependency or target OS/platform constraints

I'd take choice (b) over others mentioned any time, because that allows the user/developer to adapt to situation, and instead of reading 64-bit number, read two 32-bit numbers and do math, and does not make the library (struct or kaitai) look faulty

Also the 48-bit was an example, it was mentioned earlier here that the limit might be 53-bits, which would be, if choice (b) applies, the maximum handled by struct correctly, and if more bits were to represent a number, struct would fail hard and force the user/developer to work around the runtime limits

@smarek
Copy link

smarek commented Apr 15, 2023

Hey, almost 3 years later, does anybody interested in this topic have some solution for the others?

I guess, this project was not updated (this bug was not fixed), because adding bc/bigint, would introduce low-level build/runtime dependencies, which you don't want to introduce, since this project is so far pure-lua

I guess, maybe the best would be to do a library split?
lua-struct would hard fail on any operation with 64bit numbers, since it cannot provide such feature reliably
and
lua-struct-bigint would be drop-in replacement, that maybe uses more dependencies or has more complex install, and under the hood uses bc/bigint for all operations (not just the problematic ones)

Those projects API input will be compatible (eg. take two int64 arguments), API output will be not (eg. difference between returning Lua number and returning bigint from lua-bigint)

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

No branches or pull requests

5 participants