Get objects performance optimization #297
saaj
left a comment
Nice!
),
)
version_filters.append(
    max_versions_subq == taxii2models.STIXObject.version
I'd prefer a JOIN to a correlated subquery. Some databases are good at rewriting such queries automatically, but I wouldn't bet on all the supported ones doing it well. JOINs are also easier to read (at the SQL level). But I'm not sure how well a JOIN lends itself to the given version filtering code. So it's up to you.
I assume you mean JOIN. The problem I saw is that TAXII2 allows returning both the first and last versions at the same time, or any specific version. In the first case, that would add two disjoint joins that would return no result.
Motivation
Getting objects from a collection with `enterprise-attack` data was very slow. It took more than 12s on my computer while there are only ~20k objects. Also, adding a `limit=1` did not improve the performance.
Optimizations
Query to find last/first
Note: `last` is the default if no option is provided on the API.
Initially the query was:
with plan:
I updated the query to only look for the versions of the current row, not all:
with new plan:
The query is also much simpler.
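To illustrate the idea of comparing each row only against the versions of its own object, here is a minimal sketch using Python's built-in `sqlite3` module. It is not the SQLAlchemy code from this PR: the `stixobject` table, its reduced column set, and the sample rows are assumptions made for the example, and SQLite stands in for whatever database is actually used. It also shows how the same shape can cover the "first and last together" case raised in the review by OR-ing two per-row subqueries.

```python
import sqlite3

# Hypothetical, simplified table: the real STIX object table has more columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE stixobject ("
    "pk INTEGER PRIMARY KEY, id TEXT, collection_id TEXT, version TEXT)"
)
conn.executemany(
    "INSERT INTO stixobject (id, collection_id, version) VALUES (?, ?, ?)",
    [
        ("indicator--a", "c1", "2020-01-01T00:00:00Z"),
        ("indicator--a", "c1", "2021-01-01T00:00:00Z"),
        ("indicator--a", "c1", "2022-01-01T00:00:00Z"),
        ("malware--b", "c1", "2020-06-01T00:00:00Z"),
    ],
)

# "last": keep a row only if its version equals the maximum version
# among rows that share its id and collection (correlated subquery).
LAST = """
SELECT o.id, o.version
FROM stixobject AS o
WHERE o.collection_id = ?
  AND o.version = (
      SELECT MAX(v.version) FROM stixobject AS v
      WHERE v.id = o.id AND v.collection_id = o.collection_id
  )
ORDER BY o.id, o.version
"""

# "first" and "last" together: OR the two per-row conditions.
FIRST_AND_LAST = """
SELECT o.id, o.version
FROM stixobject AS o
WHERE o.collection_id = ?
  AND (
      o.version = (
          SELECT MAX(v.version) FROM stixobject AS v
          WHERE v.id = o.id AND v.collection_id = o.collection_id
      )
      OR o.version = (
          SELECT MIN(v.version) FROM stixobject AS v
          WHERE v.id = o.id AND v.collection_id = o.collection_id
      )
  )
ORDER BY o.id, o.version
"""

last = conn.execute(LAST, ("c1",)).fetchall()
first_and_last = conn.execute(FIRST_AND_LAST, ("c1",)).fetchall()
print(last)
print(first_and_last)
```

In this sketch the version strings are ISO 8601 timestamps in the same format, so plain string comparison orders them chronologically. An object with a single version satisfies both conditions but is still returned once, because the filter selects rows rather than joining them.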
The index is not compatible with collection id filter
Change `ix_opentaxii_stixobject_date_added_id` to `ix_opentaxii_stixobject_col_date_added_id` by adding `collection_id` as the first column of the index.
New plan:
Do not count more than necessary
The API includes a `more` boolean in the response to indicate whether pagination could return more results. Initially, it counted all elements regardless of the limit. Now it counts at most limit + 1 objects.
Results