improve ingestion workflow #150

Open
boegel opened this issue May 23, 2023 · 1 comment

@boegel
Contributor

boegel commented May 23, 2023

The current workflow triggers a lot of requests to GitHub, and does a lot of lookup operations in S3.

We could:

  • restructure the S3 bucket so it's clear which tarballs have already been ingested:
    • by adding "ingested" and "new" folders below the EESSI version for the metadata files, plus a single "tarballs" directory that holds all the tarballs;
    • only the metadata files would then move between folders, mainly because moving the tarballs seems to be quite time-consuming;
  • clean up tarballs in the S3 bucket more actively (or move them to S3 Glacier?);
  • check out the PR branch of the staging repo to limit requests to the GitHub API.
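
A minimal sketch of how the restructured layout could be queried, assuming keys of the form `<version>/new/...`, `<version>/ingested/...` and `<version>/tarballs/...`. The function names, the `.meta.txt` suffix and the example version are assumptions for illustration, not the actual bucket contents or the existing scripts:

```python
# Hypothetical layout, not the actual bucket structure:
#   <version>/tarballs/<tarball>
#   <version>/new/<tarball>.meta.txt
#   <version>/ingested/<tarball>.meta.txt
META_SUFFIX = ".meta.txt"  # assumed suffix of the metadata files


def metadata_key(version, tarball, state):
    """Return the key of a tarball's metadata file; state is 'new' or 'ingested'."""
    if state not in ("new", "ingested"):
        raise ValueError(f"unknown state: {state}")
    return f"{version}/{state}/{tarball}{META_SUFFIX}"


def tarball_key(version, tarball):
    """Return the key of a tarball; tarballs never move in this layout."""
    return f"{version}/tarballs/{tarball}"


def pending_tarballs(keys, version):
    """List tarball keys whose metadata file is still under <version>/new/."""
    prefix = f"{version}/new/"
    return sorted(
        tarball_key(version, key[len(prefix):-len(META_SUFFIX)])
        for key in keys
        if key.startswith(prefix) and key.endswith(META_SUFFIX)
    )


def mark_ingested(version, tarball):
    """Return (source, destination) keys for moving only the small metadata file."""
    return (
        metadata_key(version, tarball, "new"),
        metadata_key(version, tarball, "ingested"),
    )
```

With this layout, a single listing of the `<version>/new/` prefix would be enough to find what still needs to be ingested, and marking a tarball as ingested only touches its metadata file.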
@bedroge
Collaborator

bedroge commented May 23, 2023

Just checked the current lifecycle settings of this bucket; we currently have the following:

Current version actions:

  • Day 0: objects uploaded
  • Day 30: objects move to Standard-IA
  • Day 60: objects move to Glacier Flexible Retrieval (formerly Glacier)
  • Day 180: objects move to Glacier Deep Archive
  • Day 365: objects expire
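
For reference, the schedule above could be written down roughly as follows. This is only a sketch: the rule ID and the empty prefix filter are assumptions, and it assumes objects are uploaded in the STANDARD storage class. The dictionary follows the shape that boto3's `put_bucket_lifecycle_configuration` accepts as `LifecycleConfiguration`, but nothing here talks to S3:

```python
# (days since upload, storage class), using the storage class names of the S3 API
TRANSITIONS = [(30, "STANDARD_IA"), (60, "GLACIER"), (180, "DEEP_ARCHIVE")]
EXPIRE_AFTER_DAYS = 365


def lifecycle_configuration(rule_id="tarball-lifecycle"):
    """Build a lifecycle configuration dict; rule_id is a hypothetical name."""
    return {
        "Rules": [
            {
                "ID": rule_id,
                "Status": "Enabled",
                "Filter": {"Prefix": ""},  # assumed: rule applies to the whole bucket
                "Transitions": [
                    {"Days": days, "StorageClass": storage_class}
                    for days, storage_class in TRANSITIONS
                ],
                "Expiration": {"Days": EXPIRE_AFTER_DAYS},
            }
        ]
    }


def storage_class_at(age_days):
    """Return the storage class of an object of the given age, or None if expired."""
    if age_days >= EXPIRE_AFTER_DAYS:
        return None
    current = "STANDARD"  # assumed storage class at upload
    for days, storage_class in TRANSITIONS:
        if age_days >= days:
            current = storage_class
    return current
```

So a tarball that has not been ingested within 60 days would already sit in Glacier Flexible Retrieval, which matters for any workflow that still needs to read it.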
