File Taint: Adding ARM for Linux and Wildcard for Files, with PANDA logging #1616
Pull request overview
This pull request adds ARM architecture support (32-bit and 64-bit) for Linux file tainting in the file_taint plugin and replaces exact filename matching with wildcard pattern matching using POSIX fnmatch.
Key Changes:
- Adds ARM register handling (R0 for 32-bit ARM, X0 for 64-bit ARM) to extract syscall return values for read operations
- Replaces substring-based filename matching with fnmatch-based wildcard pattern matching to support multiple files
- Includes numerous code style improvements (brace formatting consistency)
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| panda/plugins/file_taint/file_taint.cpp | Adds ARM support for Linux read syscalls, replaces filename matching with wildcard pattern matching using fnmatch, adds empty filename validation |
| panda/plugins/file_taint/README.md | Updates documentation to describe new wildcard matching behavior with examples and usage guidance |
| panda/plugins/taint2/taint_api.cpp | Code formatting improvements and adds debug print statement in taint2_label_ram |
| panda/plugins/taint2/taint2.cpp | Code style improvements (brace formatting) and adds informational print statements |
| panda/plugins/taint2/taint2_hypercalls.cpp | Code formatting improvement for multi-line function call |
| panda/debian/setup.sh | Adds TARGET_LIST build argument hardcoded to x86_64-softmmu |
Your checklist for this pull request
Detailed description
This PR completes two changes:
For 32-bit ARM, the system calls are identical to i386
https://github.com/panda-re/panda/blob/dev/panda/plugins/syscalls2/generated/syscalls_ext_typedefs_arm64.h#L1377-L1384
https://github.com/panda-re/panda/blob/dev/panda/plugins/syscalls2/generated/syscalls_ext_typedefs_arm.h#L1589-L1596
For 64-bit ARM, the system calls are identical to x86-64
https://github.com/panda-re/panda/blob/dev/panda/plugins/syscalls2/generated/syscalls_ext_typedefs_arm64.h#L1277-L1284
https://github.com/panda-re/panda/blob/dev/panda/plugins/syscalls2/generated/syscalls_ext_typedefs_arm64.h#L1373-L1380
I'm also assuming that I need to check register 0 on both ARM architectures to get the number of bytes read.
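To make the register assumption concrete, here is a minimal sketch of reading a `read` syscall's return value per architecture. It does not use PANDA's real CPU state types: `GuestArch`, `MockCpuState`, and `syscall_return_value` are hypothetical names invented for this illustration, and the register arrays are stand-ins for whatever the plugin actually reads.

```cpp
#include <cstdint>

// Hypothetical architecture tag; not a PANDA type.
enum class GuestArch { I386, X86_64, ARM, AARCH64 };

// Hypothetical stand-in for guest CPU state; not PANDA's CPUArchState.
struct MockCpuState {
    uint64_t x86_regs[16];    // index 0 stands in for EAX/RAX
    uint32_t arm_regs[16];    // index 0 stands in for R0
    uint64_t arm64_xregs[32]; // index 0 stands in for X0
};

// Returns the syscall return value (bytes read, or a negative errno),
// sign-extended to 64 bits so 32-bit error codes stay negative.
int64_t syscall_return_value(GuestArch arch, const MockCpuState &cpu) {
    switch (arch) {
    case GuestArch::I386:
        return static_cast<int32_t>(static_cast<uint32_t>(cpu.x86_regs[0]));
    case GuestArch::X86_64:
        return static_cast<int64_t>(cpu.x86_regs[0]);
    case GuestArch::ARM:
        return static_cast<int32_t>(cpu.arm_regs[0]);
    case GuestArch::AARCH64:
        return static_cast<int64_t>(cpu.arm64_xregs[0]);
    }
    return -1; // unreachable for valid enum values
}
```

The 32-bit cases go through `int32_t` first so that a failed `read` is not mistaken for a very large byte count.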
fnmatch seems to be the best fit for supporting flexibility on file names, matching how shells do file matching.
https://man7.org/linux/man-pages/man3/fnmatch.3.html
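As a rough illustration of the matching behavior, the sketch below wraps POSIX `fnmatch` in a helper. The helper name `matches_taint_pattern` is hypothetical, and passing `0` for the flags is an assumption; the flags the plugin actually uses may differ.

```cpp
#include <fnmatch.h>
#include <string>

// Hypothetical helper: true when `filename` matches the shell-style `pattern`.
// With flags == 0, '*' may also match '/' characters; passing FNM_PATHNAME
// instead would stop wildcards from crossing directory separators.
bool matches_taint_pattern(const std::string &pattern,
                           const std::string &filename) {
    return fnmatch(pattern.c_str(), filename.c_str(), 0) == 0;
}
```

With this helper, the pattern `./toy/inputs/*` matches `./toy/inputs/small-1.bin` but not a file under a different directory such as `./toy/other/small-1.bin`.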
...
Test plan
With LAVA, I will test running with a file pattern such as ./toy/inputs/*, which should taint files such as ./toy/inputs/small-1.bin and ./toy/inputs/small-2.bin.
I can confirm, based on LAVA logs, that two files, testbig.bin and testsmall.bin, were tainted using the wildcard. Additionally, it appears that taint2 works on both files. I added a debug print on label_ram JUST to make sure.
When testing originally, I saw the taint2 hypercall warning, hence that debug message, but I guess I was needlessly spooked.
bug_mining.log
Also, I added PANDA logging for each newly tainted file.

This is useful when a log captures multiple file taints: you can figure out which taints belong to which files.
PSA: If you use Python to convert the PANDA log to JSON, you MUST use the updated version of PyPanda built from this PR; only then will you be able to see the FileMatchTaint instance.
To check, search for "file_taint_match" in pandare/plog_pb2.py.
...
Closing issues
N/A
...