Symbolize literal integer constants #1
Comments
Oh no, I didn't plan to write a book explaining all the tricks, because for me that would take more time than writing this code did. You can ask me to explain particular places. The most common trick is to use [...] For x86_64, I avoid frequent use of the r8..r15 registers, because using them costs an extra byte for the REX prefix. The same goes for using 64-bit registers.
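To make the REX cost concrete, here is a minimal sketch that encodes a register-to-register `mov` (opcode `0x89 /r`) and shows when the extra prefix byte appears. The helper `encode_mov` and the register tables are hypothetical, written only for illustration; they are not part of micro-lzmadec.

```python
# Hypothetical illustration: byte encoding of `mov dst, src` between registers.
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"] + [f"r{i}d" for i in range(8, 16)]
REG64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"] + [f"r{i}" for i in range(8, 16)]


def encode_mov(dst: str, src: str) -> bytes:
    """Encode `mov dst, src` (opcode 0x89 /r) for two registers of equal width."""
    if dst in REG64 and src in REG64:
        regs, wide = REG64, True
    elif dst in REG32 and src in REG32:
        regs, wide = REG32, False
    else:
        raise ValueError("both operands must be 32-bit or both 64-bit registers")
    d, s = regs.index(dst), regs.index(src)
    # REX layout is 0100WRXB: W = 64-bit operand, R extends ModRM.reg (src),
    # B extends ModRM.rm (dst). X is unused without a SIB byte.
    rex = 0x40 | (int(wide) << 3) | ((s >> 3) << 2) | (d >> 3)
    modrm = 0xC0 | ((s & 7) << 3) | (d & 7)
    prefix = bytes([rex]) if rex != 0x40 else b""
    return prefix + bytes([0x89, modrm])


print(encode_mov("eax", "ebx").hex())   # 89d8    (2 bytes, no REX)
print(encode_mov("r8d", "ebx").hex())   # 4189d8  (REX.B for r8d)
print(encode_mov("rax", "rbx").hex())   # 4889d8  (REX.W for 64-bit operands)
```

Both the extended register and the 64-bit operand size add one byte to an otherwise two-byte instruction, which is why sticking to eax..edi with 32-bit operands keeps the code small.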
I didn't understand the question: what are the "parameter values"?
The offsets in the probability model tables have been rearranged (to save some bytes in the code) as follows:
Name: offset, size
original: IsMatch: 0, 192
micro-lzmadec: IsMatch: 0, 192
Just follow the code; it's not that bad if you already know the LZMA decompression algorithm, and there are many hints around. Many of the offsets in the table are bound to each other, but I don't see why anyone would change them unless that person understands the code and has found a way to make it even smaller. So, here 192/2 is used to make three constants:
cdq
...
mov dl, 192/2
lea ebx, [rsi+rdx*2] ; IsRep0Long, 192
lea esi, [rax+rdx*4] ; IsRep, 384 + state
...
jmp _case_len
...
mov dl, 154 ; 154*9 = 1386 = 1332+54
...
_case_len:
lea esi, [rdx*8+rdx]
I think that, if necessary, this code could be made about a hundred bytes larger but much faster.
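For anyone checking the arithmetic in the `lea` snippet above, here is a small sketch that models the scaled addressing and confirms the constants. The function `lea32` and the zero values chosen for the table base and the state are assumptions made purely to show the offsets; the real register contents depend on the surrounding code.

```python
MASK32 = 0xFFFFFFFF


def lea32(base: int, index: int, scale: int) -> int:
    """Model `lea r32, [base + index*scale]`: address arithmetic truncated to 32 bits."""
    assert scale in (1, 2, 4, 8), "x86 only allows these scale factors"
    return (base + index * scale) & MASK32


# mov dl, 192/2  (the preceding cdq clears edx, assuming eax is non-negative)
rdx = 192 // 2
rsi = 0      # assumption: table base taken as 0 so results read as offsets
state = 0    # assumption: rax holds the state; 0 chosen for illustration

is_rep0_long = lea32(rsi, rdx, 2)   # lea ebx, [rsi+rdx*2]
is_rep = lea32(state, rdx, 4)       # lea esi, [rax+rdx*4]

# mov dl, 154 followed by lea esi, [rdx*8+rdx] multiplies by 9
rdx = 154
len_offset = lea32(rdx, rdx, 8)

print(192 // 2, is_rep0_long, is_rep)   # 96 192 384
print(len_offset, 1332 + 54)            # 1386 1386
```

A single one-byte immediate (96) therefore yields 96, 192 and 384 without loading three separate constants, and 154 yields 1386 through the `*9` form of `lea`.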
While the small size is remarkable, there is much work to be done before some other project might be willing to use or adapt this code.
Looking at the current HEAD 9fd3a26 at Tue Jun 7 10:14:56 2022 +0700, in file lzmadec.x86_64.asm there are many bare literal integer constants that do not have symbolic names. In contrast, the reference implementation by Igor Pavlov in LZMA SDK 4.40 and successors uses dozens of symbolic names with designated inter-relationships. This makes it hard to compare the two versions. Also, the micro-lzmadec code lacks documentation of strategy, explanation of coding tricks, and comments in general. Which specializations of parameter values has micro-lzmadec assumed? What relationships do the numeric constants in micro-lzmadec have to each other? If it becomes necessary or desirable to change one value, then how are the others affected?
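As a rough illustration of what "symbolic names with designated inter-relationships" means, the sketch below derives a few probability-table offsets from two base parameters instead of writing bare numbers. The names and values follow my recollection of the reference decoder's conventions (12 states, at most 4 position bits) and should be treated as assumptions, not as a quotation of the SDK source. It models only the reference layout, not the rearranged micro-lzmadec layout.

```python
# Assumed base parameters, named after the reference decoder's style.
K_NUM_STATES = 12
K_NUM_POS_BITS_MAX = 4

# Each offset is defined in terms of the previous one, so changing a base
# parameter moves every dependent table automatically.
IS_MATCH = 0
IS_MATCH_SIZE = K_NUM_STATES << K_NUM_POS_BITS_MAX   # 12 * 16
IS_REP = IS_MATCH + IS_MATCH_SIZE
IS_REP_G0 = IS_REP + K_NUM_STATES

print(IS_MATCH, IS_MATCH_SIZE)   # 0 192
print(IS_REP, IS_REP_G0)         # 192 204
```

The derived size of 192 matches the IsMatch entry quoted in this thread; a bare `192` in assembly carries the same value but hides where it comes from.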