Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

display analysis information to users #2111

Closed
wants to merge 6 commits into from
Closed

display analysis information to users #2111

wants to merge 6 commits into from

Conversation

fariss
Copy link
Collaborator

@fariss fariss commented May 30, 2024

Closes #857.

This PR deals with:

  • displaying the capabilities for files matching a file limitations rule
  • inform users about potential false positive in case of few library functions found
  • report the number of api calls made, and inform the user if the number is low

Checklist

  • No CHANGELOG update needed
  • No new tests needed
  • No documentation update needed

fariss added 2 commits May 30, 2024 04:04
This commit deals with:
- displaying the capabilities for files matching a file limitations rule
- inform users about potential false positive due to few library
  functions
- wip: report the number of api calls made, and inform the user if the
  number is low
Comment on lines 122 to 124
if isinstance(feature, API):
# delcare a global variable (a set) and append to it here?
pass
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The goal here is to count how many API calls are made. I am thinking of declaring a global variable here (i.e. Set[API]) and appending to it?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

once we have all the features collected and merged at the file level, we could enumerate them once to collect the API features and count them. We shouldn't need a distinct variable for this.

Copy link
Collaborator

@williballenthin williballenthin May 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

around line 231.

Though it's possible we don't construct the massive set like we do at lower scopes, I can't quite remember.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

once we have all the features collected and merged at the file level, we could enumerate them once to collect the API features and count them. We shouldn't need a distinct variable for this.

You're right, we don't. We collect all the features from the smaller blocks, but for functions we only return the len(function_features).

capa/capabilities/common.py Show resolved Hide resolved
Comment on lines 799 to +800
file_extractors = get_file_extractors_from_cli(args, input_format)
found_file_limitation = find_file_limitations_from_cli(args, rules, file_extractors)
_ = find_file_limitations_from_cli(args, rules, file_extractors)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only reason I am keeping find_file_limitation_from_cli is that it prints a warning to the user if the sample is packed.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll have to do more thinking/testing on how to handle this. One the one hand the limitations are very valid, on the other hand often users still want to see the results. Ideally, we find a solution that helps with both.

capa/render/default.py Outdated Show resolved Hide resolved
n_libs: int = len(doc.meta.analysis.library_functions)
if n_libs <= MIN_LIBFUNCS_COUNT:
ostream.write(
"Few library functions recognized by FLIRT signatures, results may contain false positives\n\n"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's add a new color (orange?) to display such warnings.

@fariss fariss changed the title Issue/857 display analysis information to users May 30, 2024
fariss added 4 commits June 6, 2024 16:32
- Inform users if few library functions were recognized by FLIRT
signatures (this should be around ~40-50% of all functions)

- Inform users if there are very few API calls(<10), this could
indicate that it is packed, corrupted or tiny
@fariss
Copy link
Collaborator Author

fariss commented Jun 6, 2024

I think it'd be great if we can render this information in render.py. If we'd like to that, we need to introduce new metadata fields in StaticAnalysis and references them in render.py using doc.meta.analysis.<field>. This was my intial approach, but I didn't go this route because that would involve modifying the result_document structure.

@williballenthin
Copy link
Collaborator

that's exactly the strategy i'd recommend. (also requires updating the protobuf and (de)serialization code). If you can strictly add fields then we can make the changes in a minor version. If you change or delete existing fields then we'd release the breaking changes with the next major release.

if isinstance(feature, API)
for addr in addresses
}
api_calls += len(call_addresses)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in addition we should count/report the imports

and it would be neat to generate some stats for various features / test samples

I

@fariss fariss closed this by deleting the head repository Jun 7, 2024
@mr-tz
Copy link
Collaborator

mr-tz commented Jun 7, 2024

did you close this on purpose?

@fariss
Copy link
Collaborator Author

fariss commented Jun 7, 2024

@mr-tz Yep, deleted my fork, and will open a new PR with approach discussed above. (this will involve updating protobuf structures)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Display analysis information to user
3 participants