Add EML Award Elements to Solr Index to Complement Existing Funding Information #149

Open
clnsmth opened this issue Sep 10, 2024 · 0 comments
Labels: feature, medium priority

Description

Currently, we index the funding element in Solr to allow users to search for funding information, which is useful for generating reports, such as listing the total number of citations for data packages tied to a specific project.

In EML version 2.2.0, a new award element was introduced to provide a more structured way of recording project information. The funding element remains for backward compatibility, but this creates potential discrepancies in how this information is recorded in an EML document and made searchable through EDI's Solr search engine.

To reconcile these differences and make all relevant information searchable, we propose indexing select fields from the award element to complement those already indexed from the funding element.

Award Elements and Considerations for Indexing

  • funderName (required): Allows identification of all data packages associated with a particular funder's name. Since this field is free text, search results may be affected by variability in naming conventions. However, it is a required element, so it’s likely to appear in metadata records using the award element.

  • funderIdentifier: Enables precise identification of funders via a unique identifier, which is ideal for search accuracy. However, it is not a required field, so its presence in metadata records using the award element may be inconsistent.

  • awardNumber: Supports searching for data packages associated with a specific award number. Because award numbers are typically unique only within a given funder, combining this element with funderName improves search precision.

  • title (required): Allows searches by project title, providing an alternative way to identify related data packages. As this is a free text field, search results may vary due to differences in how titles are entered. However, its required status ensures it is present in EML documents using the award element.

  • awardUrl: Enables searching by a project's award URL. While similar to awardNumber in its utility, this may be a less intuitive search method.
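As a rough illustration of what pulling these fields out of an EML document could look like, here is a minimal Python sketch using only the standard library. The sample document is a trimmed, made-up fragment (not schema-valid, with placeholder values), and `extract_awards` is a hypothetical helper, not part of any existing indexing code.

```python
import xml.etree.ElementTree as ET

# Made-up, trimmed EML 2.2.0 fragment for illustration only (placeholder values).
SAMPLE_EML = """<eml:eml xmlns:eml="https://eml.ecoinformatics.org/eml-2.2.0">
  <dataset>
    <project>
      <title>Example project</title>
      <funding>Example funding statement</funding>
      <award>
        <funderName>National Science Foundation</funderName>
        <funderIdentifier>https://example.org/funder/1</funderIdentifier>
        <awardNumber>0000000</awardNumber>
        <title>Example award title</title>
        <awardUrl>https://example.org/award/0000000</awardUrl>
      </award>
    </project>
  </dataset>
</eml:eml>"""

# Child elements of <award> discussed in this issue.
AWARD_FIELDS = ("funderName", "funderIdentifier", "awardNumber", "title", "awardUrl")


def extract_awards(eml_text):
    """Return one dict per <award> element, keeping only non-empty child fields."""
    root = ET.fromstring(eml_text)
    awards = []
    # EML child elements are unqualified, so a plain tag name matches them.
    for award in root.iter("award"):
        record = {}
        for name in AWARD_FIELDS:
            child = award.find(name)
            if child is not None and child.text and child.text.strip():
                record[name] = child.text.strip()
        awards.append(record)
    return awards


if __name__ == "__main__":
    for record in extract_awards(SAMPLE_EML):
        print(record)
```

Optional fields that are absent are simply left out of the resulting dict, which mirrors the concern above that funderIdentifier and awardUrl may be inconsistently present.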

Proposal

Considering the role of the award elements and the likelihood of their inclusion in metadata records, we recommend indexing the following fields: funderName, awardNumber, and title. While funderIdentifier and awardUrl provide additional value, their optional status and less intuitive use reduce their overall impact on search capabilities.
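To make the intended search use concrete, here is a small Python sketch of how a client might build a Solr query string once these fields are indexed. The Solr field names (`funderName`, `awardNumber`, `awardTitle`) and the returned field `packageid` are assumptions for illustration; the actual schema names would be decided during implementation. Only the standard Solr `q`, `fl`, and `rows` parameters are used.

```python
from urllib.parse import urlencode


def build_award_query(funder_name=None, award_number=None, title=None, rows=10):
    """Build a URL-encoded Solr query string from optional award criteria.

    Field names below are hypothetical placeholders for the proposed index fields.
    """
    criteria = (
        ("funderName", funder_name),
        ("awardNumber", award_number),
        ("awardTitle", title),
    )
    clauses = []
    for field, value in criteria:
        if value:
            # Escape backslashes and double quotes so the value stays one phrase.
            escaped = value.replace("\\", "\\\\").replace('"', '\\"')
            clauses.append(f'{field}:"{escaped}"')
    # Match everything when no criteria are given.
    q = " AND ".join(clauses) if clauses else "*:*"
    return urlencode({"q": q, "fl": "packageid", "rows": rows})


if __name__ == "__main__":
    print(build_award_query(funder_name="National Science Foundation",
                            award_number="0000000"))
```

Combining funderName with awardNumber in a single query reflects the point made above: an award number alone may not be unique across funders.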

Feedback

We welcome thoughts and comments on this proposal.

Projects
Status: ToDo