implement schema.org metadata for every dataset #2182

Open
14 tasks
only1chunts opened this issue Jan 27, 2025 · 0 comments

only1chunts commented Jan 27, 2025

User story

As a website user
I want to see the dataset metadata in schema.org
So that I can easily find the datasets I'm interested in through search engines

Acceptance criteria

Given a dataset is public
When I search for it by title in Google or another general-purpose web search engine
Then I should find it

Given a dataset is public
When I validate its schema.org markup, e.g. https://validator.schema.org/#url=https%3A%2F%2Fgigadb.org%2Fdataset%2F102651
Then I should see all the correct and relevant dataset-specific metadata in the results

Additional Info

For info on why schema.org is worthwhile: https://www.schemaapp.com/schema-markup/benefits-of-schema-markup/

To validate a URL against schema.org: https://validator.schema.org/
e.g. for dataset 102651
https://validator.schema.org/#url=https%3A%2F%2Fgigadb.org%2Fdataset%2F102651
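The validator link above is the dataset page URL percent-encoded into the `#url=` fragment. A minimal sketch of building that link for any dataset, assuming every public dataset page follows the `https://gigadb.org/dataset/<id>` pattern seen in the example (the helper name `validator_url` is hypothetical):

```python
from urllib.parse import quote

VALIDATOR_BASE = "https://validator.schema.org/#url="


def validator_url(dataset_id: str) -> str:
    """Return the schema.org validator link for one dataset page."""
    # Assumption: dataset pages live at https://gigadb.org/dataset/<id>.
    page = f"https://gigadb.org/dataset/{dataset_id}"
    # safe="" forces ":" and "/" to be percent-encoded as well.
    return VALIDATOR_BASE + quote(page, safe="")


if __name__ == "__main__":
    print(validator_url("102651"))
```

This could be looped over a sample of dataset IDs to produce a checklist of links for manual validation.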

We should use the Dataset attributes (https://schema.org/Dataset) to build a schema.org profile for each dataset.
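As a rough illustration of what a per-dataset profile could look like, the sketch below builds a JSON-LD document of type `Dataset`. The property names (`name`, `description`, `url`, `identifier`, `keywords`) are real schema.org Dataset properties, but the input record shape, the function name, and all values are placeholders, not GigaDB's actual data model:

```python
import json


def dataset_jsonld(record: dict) -> str:
    """Serialise one dataset record as schema.org Dataset JSON-LD."""
    # Assumption: `record` is a hypothetical dict with these keys;
    # the real field names depend on how datasets are stored.
    doc = {
        "@context": "https://schema.org",
        "@type": "Dataset",
        "name": record["title"],
        "description": record["description"],
        "url": f"https://gigadb.org/dataset/{record['id']}",
        "identifier": record["doi"],
    }
    # Only emit optional properties that actually have a value.
    if record.get("keywords"):
        doc["keywords"] = record["keywords"]
    return json.dumps(doc, indent=2)


if __name__ == "__main__":
    # Placeholder values for illustration only.
    example = {
        "id": "000000",
        "title": "Example dataset title",
        "description": "Example dataset description.",
        "doi": "placeholder-doi",
        "keywords": ["example", "placeholder"],
    }
    print(dataset_jsonld(example))
```

The resulting string would typically be embedded in the dataset page inside a `<script type="application/ld+json">` element, which is the form the validator reads.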

This is linked to #514 work.

Note: Currently every dataset page includes schema.org metadata for the "DataCatalog", i.e. GigaDB's general information. I don't know whether keeping that will conflict with the new Dataset metadata, so some investigation may be required to decide whether we need to remove that part from the individual dataset pages.
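One option to consider during that investigation, offered as a sketch rather than a confirmed fix: schema.org's Dataset type defines an `includedInDataCatalog` property, so the catalog information could be nested under each dataset instead of appearing as a separate top-level item. The function name and the catalog values below are hypothetical:

```python
import json


def link_to_catalog(dataset: dict, catalog_name: str, catalog_url: str) -> dict:
    """Return a copy of a Dataset JSON-LD dict that points at its catalog."""
    linked = dict(dataset)  # shallow copy so the input is left unchanged
    linked["includedInDataCatalog"] = {
        "@type": "DataCatalog",
        "name": catalog_name,
        "url": catalog_url,
    }
    return linked


if __name__ == "__main__":
    # Placeholder dataset; real values would come from the dataset record.
    dataset = {
        "@context": "https://schema.org",
        "@type": "Dataset",
        "name": "Example dataset title",
    }
    print(json.dumps(link_to_catalog(dataset, "GigaDB", "https://gigadb.org"), indent=2))
```

Whether search engines and the validator treat this better than two separate top-level items would still need to be checked against a real dataset page.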

Product Backlog Item Ready Checklist

  • Business value is clearly articulated
  • Item is understood enough by the IT team so it can make an informed decision as to whether it can complete this item
  • Dependencies are identified and no external dependencies would block this item from being completed
  • At the time of the scheduled sprint, the IT team has the appropriate composition to complete this item
  • This item is estimated and small enough to comfortably be completed in one sprint
  • Acceptance criteria are clear and testable
  • Performance criteria, if any, are defined and testable
  • The Scrum team understands how to demonstrate this item at the sprint review

Product Backlog Item Done Checklist

  • Item(s) in increment pass all Acceptance Criteria
  • Code is refactored to best practices and coding standards
  • Documentation is updated as needed
  • Data security has not been compromised (with particular reference to the personal information we hold in GigaDB)
  • No deviation from the team technology stack and software architecture has been introduced
  • The product is in a releasable state (i.e. the increment has not broken anything)
Projects
Status: To Estimate