Use faster solr faceting for dashboard stats #6865
Merged
Summary
The main dashboard statistics load more slowly as the number of works, work types, and resource type values in Solr grows. In severe cases this can lead to timeouts.
Guidance for testing, such as acceptance criteria or new user interface behaviors:
/dashboard
Type of change (for release notes)
notes-bugfix
Bug Fixes
Detailed Description
The stats graphic on the dashboard was built using large Solr responses and several calls to looping methods (`#each`, `#group_by`, `#transform_values`). These loops get slow over large arrays and hashes. Solr can do this sort of data massaging for us with its facet API, so the application no longer has to iterate over every work.

This change uses Solr facets to calculate the number of works grouped by any Solr field, in particular `human_readable_type_sim` for work types and `resource_type_sim` for resource types. Note this uses `*_sim` rather than `*_tesim` so the keys aren't mangled (string fields are indexed verbatim, while text fields are tokenized). This can be done on a 0-row Solr query (as opposed to the arbitrary 100k-row limit, which breaks on repos with more than 100k works).

The most interesting line:
Hash[*response['facet_counts']['facet_fields'][query]]
is a technique to turn the Solr facet response, an array in the form `['key', value, 'key', value, ...]`, into the appropriate hash `{key: value, key: value, ...}`. This is also resilient to `nil` or some empty values if the facet response isn't right.

Changes proposed in this pull request:
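For reviewers who want to try the technique in isolation, here is a minimal sketch of the facet-only query and the array-to-hash conversion described above. It is not the code from this pull request: the helper names `facet_params` and `facet_counts`, the `q` value, and the `facet.limit` setting are assumptions made for illustration, and the sample response is hand-written rather than taken from a real Solr instance.

```ruby
# Hypothetical helpers for illustration only; not the code in this PR.

# Parameters for a facet-only Solr request. With rows set to 0, Solr
# returns no documents, only the counts for the requested facet field.
def facet_params(field)
  {
    'q' => '*:*',           # assumption: count across all documents
    'rows' => 0,            # no documents in the response body
    'facet' => true,
    'facet.field' => field,
    'facet.limit' => -1     # assumption: return every facet value
  }
end

# Convert Solr's flat facet array (value, count, value, count, ...)
# into a Hash of value => count.
def facet_counts(response, field)
  # dig returns nil when any level of the response is missing.
  flat = response.dig('facet_counts', 'facet_fields', field)
  # Splatting nil or an empty array yields no arguments, so Hash[]
  # returns an empty hash instead of raising.
  Hash[*flat]
end

# Hand-written sample in the shape of Solr's default facet response.
sample = {
  'facet_counts' => {
    'facet_fields' => {
      'human_readable_type_sim' => ['Generic Work', 12, 'Image', 3]
    }
  }
}

works_by_type = facet_counts(sample, 'human_readable_type_sim')
missing = facet_counts({}, 'human_readable_type_sim')
```

Here `works_by_type` maps `'Generic Work'` to 12 and `'Image'` to 3, and `missing` is an empty hash. Note that `Hash[*flat]` raises an `ArgumentError` if the array has an odd number of elements, so the `nil` resilience covers a missing response, not a malformed one.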
@samvera/hyrax-code-reviewers