display analysis information to users #2111
This commit deals with:
- displaying the capabilities for files matching a file limitations rule
- informing users about potential false positives when few library functions are recognized
- WIP: reporting the number of API calls made, and informing the user if the number is low
capa/capabilities/static.py
if isinstance(feature, API):
    # declare a global variable (a set) and append to it here?
    pass
The goal here is to count how many API calls are made. I am thinking of declaring a global variable here (i.e. Set[API]) and adding to it?
once we have all the features collected and merged at the file level, we could enumerate them once to collect the API features and count them. We shouldn't need a distinct variable for this.
around line 231.
Though it's possible we don't construct the massive set like we do at lower scopes, I can't quite remember.
> once we have all the features collected and merged at the file level, we could enumerate them once to collect the API features and count them. We shouldn't need a distinct variable for this.

You're right, we don't. We collect all the features from the smaller blocks, but for functions we only return len(function_features).
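A minimal sketch of the "enumerate once, no global variable" idea discussed above. The Feature, API, and String classes and the FeatureSet alias are hypothetical stand-ins so the example runs on its own; they are not capa's actual classes, and it assumes the merged features are available as a mapping from feature to the addresses where it was seen.

```python
from dataclasses import dataclass
from typing import Dict, Set


# Hypothetical stand-ins for the real feature classes (assumption).
@dataclass(frozen=True)
class Feature:
    value: str


@dataclass(frozen=True)
class API(Feature):
    pass


@dataclass(frozen=True)
class String(Feature):
    pass


# Assumed shape of the merged feature set: feature -> addresses where it occurs.
FeatureSet = Dict[Feature, Set[int]]


def count_api_features(features: FeatureSet) -> int:
    """Count distinct API features in an already-merged feature set."""
    return sum(1 for feature in features if isinstance(feature, API))


def count_api_call_sites(features: FeatureSet) -> int:
    """Count distinct addresses at which any API feature was observed."""
    return len(
        {
            addr
            for feature, addresses in features.items()
            if isinstance(feature, API)
            for addr in addresses
        }
    )


merged: FeatureSet = {
    API("CreateFileA"): {0x401000, 0x401020},
    API("WriteFile"): {0x401020},
    String("hello"): {0x402000},
}
print(count_api_features(merged), count_api_call_sites(merged))
```

Both counts are derived from the merged mapping in a single pass each, so nothing needs to be accumulated in module-level state during extraction.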
  file_extractors = get_file_extractors_from_cli(args, input_format)
- found_file_limitation = find_file_limitations_from_cli(args, rules, file_extractors)
+ _ = find_file_limitations_from_cli(args, rules, file_extractors)
The only reason I am keeping find_file_limitations_from_cli is that it prints a warning to the user if the sample is packed.
I'll have to do more thinking/testing on how to handle this. On the one hand the limitations are very valid; on the other hand users often still want to see the results. Ideally, we find a solution that helps with both.
capa/render/default.py
n_libs: int = len(doc.meta.analysis.library_functions)
if n_libs <= MIN_LIBFUNCS_COUNT:
    ostream.write(
        "Few library functions recognized by FLIRT signatures, results may contain false positives\n\n"
Let's add a new color (orange?) to display such warnings.
- Inform users if few library functions were recognized by FLIRT signatures (this should be around 40-50% of all functions)
- Inform users if there are very few API calls (<10); this could indicate that the sample is packed, corrupted, or tiny
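The two checks listed above could be sketched roughly as follows. The constant names, the function names, the exact 0.4 ratio, and the wording of the second message are assumptions for illustration; only the "40-50%" and "<10" figures and the FLIRT message come from the discussion. The orange shade uses ANSI 256-color code 208, which is also an assumption about how such a warning color might be rendered.

```python
from typing import List

# Assumed thresholds, taken loosely from the figures quoted in the discussion.
MIN_LIBFUNC_RATIO = 0.4  # lower end of the "40-50% of all functions" estimate
MIN_API_CALLS = 10       # the "<10" API call threshold

# ANSI 256-color escape for an orange shade, plus reset (assumed rendering choice).
ORANGE = "\x1b[38;5;208m"
RESET = "\x1b[0m"


def analysis_warnings(n_functions: int, n_library_functions: int, n_api_calls: int) -> List[str]:
    """Return warning messages for suspiciously sparse analysis results."""
    warnings: List[str] = []
    # Guard against division by zero when no functions were recovered at all.
    if n_functions > 0 and n_library_functions / n_functions < MIN_LIBFUNC_RATIO:
        warnings.append(
            "Few library functions recognized by FLIRT signatures, results may contain false positives"
        )
    if n_api_calls < MIN_API_CALLS:
        warnings.append("Few API calls found, the sample may be packed, corrupted, or tiny")
    return warnings


def render_warning(message: str, use_color: bool = True) -> str:
    """Wrap a warning in the orange escape sequence unless color is disabled."""
    return f"{ORANGE}{message}{RESET}" if use_color else message


for warning in analysis_warnings(n_functions=200, n_library_functions=12, n_api_calls=4):
    print(render_warning(warning, use_color=False))
```

Keeping the checks separate from the rendering means the same warnings could be emitted by other renderers without duplicating the thresholds.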
I think it'd be great if we can render this information in render.py. If we'd like to do that, we need to introduce new metadata fields in
that's exactly the strategy i'd recommend. (also requires updating the protobuf and (de)serialization code). If you can strictly add fields then we can make the changes in a minor version. If you change or delete existing fields then we'd release the breaking changes with the next major release.
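To illustrate why strictly adding fields can be a non-breaking change, here is a small sketch using plain dataclasses. AnalysisMeta, its field names, and load_analysis_meta are hypothetical and stand in for the project's real result-document and protobuf types, which are not shown in this thread.

```python
from dataclasses import dataclass, fields
from typing import Any, Dict, Optional


@dataclass
class AnalysisMeta:
    # Field assumed to exist in documents written by older versions (hypothetical name).
    library_function_count: int
    # Newly added fields: defaults keep older documents loadable.
    api_call_count: Optional[int] = None
    import_count: Optional[int] = None


def load_analysis_meta(raw: Dict[str, Any]) -> AnalysisMeta:
    """Build AnalysisMeta from a decoded document, ignoring unknown keys."""
    known = {f.name for f in fields(AnalysisMeta)}
    return AnalysisMeta(**{key: value for key, value in raw.items() if key in known})


old_document = {"library_function_count": 12}
new_document = {"library_function_count": 12, "api_call_count": 7, "import_count": 31}

print(load_analysis_meta(old_document))
print(load_analysis_meta(new_document))
```

Because the new fields are optional, a document produced before the change still loads, and a renderer can treat None as "not recorded". Renaming or removing library_function_count would break older documents, which matches the minor versus major release distinction described above.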
    if isinstance(feature, API)
    for addr in addresses
}
api_calls += len(call_addresses)
in addition we should count/report the imports
and it would be neat to generate some stats for various features / test samples
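One possible way to gather such per-feature-type stats, imports included, is a single tally over the collected features. The Feature, API, and Import classes below are hypothetical stand-ins rather than capa's real classes, and feature_stats is an illustrative name.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, Set


# Hypothetical stand-ins for the real feature classes (assumption).
@dataclass(frozen=True)
class Feature:
    value: str


@dataclass(frozen=True)
class API(Feature):
    pass


@dataclass(frozen=True)
class Import(Feature):
    pass


def feature_stats(features: Dict[Feature, Set[int]]) -> Counter:
    """Tally distinct features by their type name, e.g. 'API' or 'Import'."""
    return Counter(type(feature).__name__ for feature in features)


sample_features: Dict[Feature, Set[int]] = {
    API("CreateFileA"): {0x401000},
    API("WriteFile"): {0x401020},
    Import("kernel32.CreateFileA"): {0x403000},
}
stats = feature_stats(sample_features)
print(stats)
```

Running the same tally across a set of test samples would give a rough baseline for what "few API calls" or "few imports" means in practice.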
did you close this on purpose?
@mr-tz Yep, deleted my fork, and will open a new PR with the approach discussed above. (this will involve updating protobuf structures)
Closes #857.