Post-Publication Replication Linkage for Data Packages #151

Open
clnsmth opened this issue Nov 7, 2024 · 0 comments
Labels
feature New feature low priority Low priority

clnsmth commented Nov 7, 2024

Description

A common use case has emerged: a data package is first published to EDI and assigned a DOI, then replicated to another repository, where it receives a different PID. To establish a clear provenance link between the authoritative source and its replica, we propose a mechanism for defining this relationship after publication.

Proposed Solution

  1. Web Services: Create a set of services similar to the PASTA Data Package Manager Journal Citation Services for managing this replication metadata.
  2. User Interface: Implement a user interface within the Data Portal that allows users to:
    a. Select an existing published data package.
    b. Specify the PID of the replicated data package.
    c. Select the relationship to the replicated data package.
    d. Provide additional context or metadata related to the replication.
  3. Database Storage: Store this replication information in a database, associating it with the original data package.
  4. Metadata Update: Consider updating the original data package metadata with a reference to the replication.
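
To make steps 2 and 3 concrete, here is a minimal sketch of what a replication record and its storage could look like. Everything in it is an assumption for discussion, not an existing PASTA or Data Portal API: the `ReplicationLink` class, the `replication_link` table, and the helper functions are hypothetical names, the package ID and PID values are placeholders, and the relationship vocabulary simply borrows a few DataCite `relationType` names as one possible choice. SQLite is used only to keep the example self-contained.

```python
import sqlite3
from dataclasses import dataclass
from typing import List, Optional

# Assumed vocabulary: a few DataCite relationType names. The issue does not
# specify which relationships should be selectable.
ALLOWED_RELATIONS = {"IsIdenticalTo", "IsVariantFormOf", "IsDerivedFrom"}


@dataclass(frozen=True)
class ReplicationLink:
    """Hypothetical record linking a published package to a replica."""
    package_id: str              # identifier of the original package (placeholder format)
    replica_pid: str             # PID assigned by the other repository
    relation: str                # relationship of the replica to the original
    note: Optional[str] = None   # free-text context supplied by the user

    def __post_init__(self) -> None:
        # Reject records the user interface should never have produced.
        if self.relation not in ALLOWED_RELATIONS:
            raise ValueError(f"unsupported relation: {self.relation!r}")
        if not self.replica_pid.strip():
            raise ValueError("replica_pid must not be empty")


# Hypothetical table; one row per (original package, replica) pair.
SCHEMA = """
CREATE TABLE IF NOT EXISTS replication_link (
    package_id  TEXT NOT NULL,
    replica_pid TEXT NOT NULL,
    relation    TEXT NOT NULL,
    note        TEXT,
    PRIMARY KEY (package_id, replica_pid)
)
"""


def init_db(conn: sqlite3.Connection) -> None:
    """Create the link table if it does not exist yet."""
    conn.execute(SCHEMA)


def add_link(conn: sqlite3.Connection, link: ReplicationLink) -> None:
    """Store one link; a duplicate pair raises sqlite3.IntegrityError."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO replication_link VALUES (?, ?, ?, ?)",
            (link.package_id, link.replica_pid, link.relation, link.note),
        )


def links_for(conn: sqlite3.Connection, package_id: str) -> List[ReplicationLink]:
    """Return all replicas recorded for one original package."""
    rows = conn.execute(
        "SELECT package_id, replica_pid, relation, note "
        "FROM replication_link WHERE package_id = ? ORDER BY replica_pid",
        (package_id,),
    ).fetchall()
    return [ReplicationLink(*row) for row in rows]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    init_db(conn)
    add_link(conn, ReplicationLink(
        "edi.0.1", "doi:10.0000/placeholder", "IsIdenticalTo",
        "Replicated to another repository",
    ))
    for link in links_for(conn, "edi.0.1"):
        print(link)
```

Keying the table on the pair (original package, replica PID) lets one package have several replicas while preventing the same link from being entered twice. Whether the link should also be written back into the package metadata (step 4) is left open here.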

Benefits

  • Enhanced Data Provenance: Clearly establish the lineage of data packages and their replicas.
  • Improved Data Discovery: Facilitate the discovery of related data packages and their relationships.
  • Better Data Management: Track the lifecycle of data packages and their derivatives.
Projects
Status: ToDo