-
Notifications
You must be signed in to change notification settings - Fork 224
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove dependency on capstone for PowerPC disassembly #6292
base: dev
Are you sure you want to change the base?
Conversation
This commit just brings it in and makes sure it builds, but doesn't change anything in the arch plugin to use it yet.
Decoding instructions don't need these, but it's useful for the architecture plugin.
We take a divergence from capstone: instead of treating every single branch pseudo-op as its own distinct instruction ID, we just group all of the BCx, BCLRx, and BCCTRx in a group, and just change the mnemonic depending on the value of BI and BO. This will drastically simplify the arch plugin for things like getting instruction info and lifting.
Note that this hasn't been tested against capstone 5.0.3 yet: I said "ooh it looks like binaryninja supports SPE now", added it, then when I went to test it, it looks like capstone 5.0.3 doesn't support it yet (`CS_MODE_SPE` isn't yet in the allowed bitmask for powerpc in the `arch_configs` table in `cs.c`). There's enough work that I'm leaving it in here for now.
This was getting unwieldly.
noone seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account. You have signed the CLA already but the status is still pending? Let us recheck it. |
A while ago (August), I was informed on Twitter that a Rust-based PPC arch plugin may be coming: https://x.com/vector35/status/1823830788976021840. Unsure if there's any news on that... |
This PR removes the dependency on capstone for disassembling powerpc.
Capstone has worked admirably up until now, but it's nice to have something that you can have a little more control over. Some of its drawbacks:
cmp*
instructions in powerpc, where the condition register is omitted from the disassembly text if it'scr0
. For program analysis purposes (ie lifting IL), it's cleaner to include the impliedcr0
in the internal representation. But with capstone, the first operand has to be checked to see if it's is a condition register or a GPR register to figure out if it's an impliedcr0
or a specified condition register.This "homegrown" version addresses those:
The natural question is "ok so this version fixes the bugs in capstone, but how do you know this doesn't just introduce new/different bugs ?" There are only 4 billion possible instructions, so it's really not that bad to just do
for (i = 0; i < 0xffffffff; ++i)
, decode with this homegrown version, and decode with capstone, and compare them. I did that (seecapstone_compare_test.c
), so that if there are any bugs in this homegrown implementation, at least they're already present, so this is strictly a step forwards with some bugs and not a step backwards with any others, known or unknown.NOTES:
tsr
,tbegin
) that I believe to be bugs in capstone, but didn't add special cases for those in the script.capstone/suite/synctools/tablegen/PPC
didn't reveal anything, so I'm not sure if there are any cases like this.capstone_compare_test.c
) doesn't compile out-of-the-box, since there's a namespace collision with the capstone's register enum. See the comments in the build script,build_capstone_compare.sh
; it's not too bad, it's just a find/replace in a few files, that has to be reverted back for before building the powerpc binary architecture.