implement schema.org metadata for every dataset #2182

only1chunts · 2025-01-27T14:27:29Z

User story

As a website user
I want to see the dataset metadata in schema.org
So that I can easily find the datasets I'm interested in through search engines

Acceptance criteria

Given a dataset is public
When I search for it by title in Google or other generalist web search engines
Then I should find it

Given a dataset is public
When I validate its schema.org markup, e.g. https://validator.schema.org/#url=https%3A%2F%2Fgigadb.org%2Fdataset%2F102651
Then I should see all the correct and relevant dataset-specific metadata in the results
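
As a rough sketch of how the second criterion could be checked automatically, the snippet below pulls JSON-LD blocks out of a page's HTML and lists their `@type` values. It assumes the markup is embedded as `<script type="application/ld+json">` (an assumption, the issue does not say which syntax is used), and the sample page and its placeholder title are hypothetical. Documents that wrap items in `@graph` are not handled.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collects the parsed contents of every application/ld+json script tag."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.documents = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        # Script content may arrive in several chunks, so buffer it.
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            self.documents.append(json.loads("".join(self._buffer)))


def schema_types(html: str) -> list:
    """Return the @type values of all top-level JSON-LD items in the page."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    types = []
    for doc in parser.documents:
        items = doc if isinstance(doc, list) else [doc]
        for item in items:
            declared = item.get("@type")
            if declared is not None:
                types.extend(declared if isinstance(declared, list) else [declared])
    return types


# Hypothetical page: one catalogue-level block and one dataset-level block.
SAMPLE_PAGE = """
<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "DataCatalog", "name": "GigaDB"}
</script>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Dataset", "name": "Placeholder title"}
</script>
</head><body></body></html>
"""

print(schema_types(SAMPLE_PAGE))
```

A test for this story could then assert that `"Dataset"` appears in the returned list for every public dataset page.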

Additional Info

For info on why schema.org is worthwhile: https://www.schemaapp.com/schema-markup/benefits-of-schema-markup/

To validate a URL against schema.org: https://validator.schema.org/
e.g. for dataset 102651
https://validator.schema.org/#url=https%3A%2F%2Fgigadb.org%2Fdataset%2F102651
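
For checking several datasets, the validator link can be built from the dataset page URL. This is a minimal sketch that only reproduces the URL pattern shown above; the helper name `validator_url` is hypothetical.

```python
from urllib.parse import quote

VALIDATOR_BASE = "https://validator.schema.org/#url="


def validator_url(page_url: str) -> str:
    """Build a validator.schema.org link for the given page URL."""
    # safe="" makes quote() percent-encode ':' and '/' as well,
    # matching the encoded form used in the example link.
    return VALIDATOR_BASE + quote(page_url, safe="")


print(validator_url("https://gigadb.org/dataset/102651"))
```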

We should use the Dataset type and its properties (https://schema.org/Dataset) to build a schema.org profile for each dataset.
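
To illustrate what a per-dataset profile might look like, here is a minimal sketch that maps a dataset record onto a few schema.org Dataset properties and wraps the result in a JSON-LD script tag. The record field names (`title`, `doi`, `authors`, and so on), the choice of properties, and all sample values are assumptions for illustration only; they are not GigaDB's actual data model or real metadata.

```python
import json


def dataset_jsonld(record: dict) -> dict:
    """Map a (hypothetical) dataset record onto schema.org Dataset properties."""
    doc = {
        "@context": "https://schema.org",
        "@type": "Dataset",
        "name": record["title"],
        "description": record["description"],
        "url": record["url"],
        "identifier": record.get("doi"),
        "keywords": record.get("keywords", []),
        "license": record.get("license"),
        "datePublished": record.get("publication_date"),
        "creator": [
            {"@type": "Person", "name": name} for name in record.get("authors", [])
        ],
    }
    # Leave out properties we have no value for rather than emitting nulls.
    return {key: value for key, value in doc.items() if value not in (None, [], "")}


def as_script_tag(doc: dict) -> str:
    """Serialise the document as a JSON-LD script tag for the page head."""
    # Escape "</" so metadata text cannot close the script element early.
    body = json.dumps(doc, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{body}\n</script>'


# Placeholder values only; not real metadata for any dataset.
example_record = {
    "title": "Placeholder dataset title",
    "description": "Placeholder description of the dataset.",
    "url": "https://gigadb.org/dataset/102651",
    "authors": ["Placeholder Author"],
    "keywords": ["placeholder keyword"],
}

print(as_script_tag(dataset_jsonld(example_record)))
```

Which properties are required or recommended would need to be confirmed against the validator and the search engines' own guidelines.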

This is linked to the work in #514.

Note: Currently every dataset page includes schema.org metadata for the "DataCatalog", i.e. GigaDB's general information. I don't know whether keeping that will cause conflicts, so some investigation may be required as to whether we need to remove that part from the individual dataset pages.
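
One option that could be evaluated during that investigation is to reference the catalogue from the dataset through the Dataset property `includedInDataCatalog`, instead of emitting it as a separate top-level item. The sketch below only shows the shape of such a document; the helper name is hypothetical, the values are placeholders, and it does not settle whether the current markup actually causes a conflict.

```python
def link_to_catalog(dataset: dict, catalog: dict) -> dict:
    """Return a copy of the Dataset document with the catalogue nested inside it."""
    # The nested item inherits @context from the enclosing document.
    nested = {key: value for key, value in catalog.items() if key != "@context"}
    return {**dataset, "includedInDataCatalog": nested}


catalog = {
    "@context": "https://schema.org",
    "@type": "DataCatalog",
    "name": "GigaDB",
    "url": "https://gigadb.org",
}

# Placeholder dataset document, not real metadata.
dataset = {
    "@context": "https://schema.org",
    "@type": "Dataset",
    "name": "Placeholder dataset title",
    "url": "https://gigadb.org/dataset/102651",
}

combined = link_to_catalog(dataset, catalog)
print(combined["includedInDataCatalog"]["@type"])
```

With this shape the page carries a single top-level Dataset item, which may be easier to read in the validator output.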

Product Backlog Item Ready Checklist

Business value is clearly articulated
Item is understood enough by the IT team so it can make an informed decision as to whether it can complete this item
Dependencies are identified and no external dependencies would block this item from being completed
At the time of the scheduled sprint, the IT team has the appropriate composition to complete this item
This item is estimated and small enough to comfortably be completed in one sprint
Acceptance criteria are clear and testable
Performance criteria, if any, are defined and testable
The Scrum team understands how to demonstrate this item at the sprint review

Product Backlog Item Done Checklist

Item(s) in increment pass all Acceptance Criteria
Code is refactored to best practices and coding standards
Documentation is updated as needed
Data security has not been compromised (with particular reference to the personal information we hold in GigaDB)
No deviation from the team technology stack and software architecture has been introduced
The product is in a releasable state (i.e. the increment has not broken anything)

only1chunts added the backlog:Story label Jan 27, 2025

only1chunts added this to the B.2.Search Engine Optimisation milestone Jan 27, 2025

only1chunts added this to Backlog: GigaDB Database Jan 27, 2025

rija moved this to To Estimate in Backlog: GigaDB Database Feb 3, 2025

rija added the asa:WebsiteUser label Feb 3, 2025

rija mentioned this issue Feb 10, 2025

improve SEO to enable GigaDB datasets to be found by google/bing/beidu searches #514

Open

17 tasks

only1chunts commented Jan 27, 2025 • edited by rija
