Tell StorageProviders where to write inputs #3905

Closed
rossjones opened this issue Nov 28, 2023 · 1 comment
Labels
comp/storage Issues related to storage input th/config Theme: Related to configuration files and settings across the project

Comments

@rossjones
Contributor

rossjones commented Nov 28, 2023

Currently, each StorageProvider implementation decides in its `New*()` function which local directory it will write inputs to. It does this by finding the configured storage directory and then creating a temporary directory named after the type of StorageProvider. This means the structure becomes:

/data 
/data/bacalhau-s3/*
/data/bacalhau-ipfs/*
/data/bacalhau-local/*

Because this decision is made in the `New*()` function when the provider is created, there is no opportunity to tell the provider where we want it to save the data it downloads. Ideally we would pass a directory to the StorageProvider's PrepareStorage method, so that its signature would change as follows:

// Current
func (sp *StorageProvider) PrepareStorage(
    ctx context.Context,
    storageSpec models.InputSource,
) (storage.StorageVolume, error) {

// Proposed
func (sp *StorageProvider) PrepareStorage(
    ctx context.Context,
    localDirectory string,
    storageSpec models.InputSource,
) (storage.StorageVolume, error) {

This would:

  • Make testing easier and more explicit
  • Allow us to index stored data by job/execution rather than by the provider's name
  • Allow data to be shared across executions in the same job
  • Remove the need for providers to generate a path from config and store it
  • Allow other components to take responsibility for cleanup, removing the need for the CleanupManager in StorageProviders
@rossjones changed the title from "Fix where StorageProviders choose to write inputs" to "Tell StorageProviders where to write inputs" on Nov 28, 2023
@wdbaruni wdbaruni transferred this issue from another repository Apr 21, 2024
@wdbaruni
Member

@frrist planned to improve this as part of #4014

@wdbaruni moved this from Inbox to Next in Engineering Planning on Jun 27, 2024
@wdbaruni added the comp/executor/plugins (Pluggable executors related work) label on Jun 27, 2024
@wdbaruni moved this from Next to Backlog in Engineering Planning on Aug 12, 2024
@wdbaruni added the th/config (Theme: Related to configuration files and settings across the project) and comp/storage (Issues related to storage input) labels and removed the comp/executor/plugins (Pluggable executors related work) label on Oct 13, 2024
@github-project-automation (bot) moved this from Backlog to Done in Engineering Planning on Dec 18, 2024
Projects
Status: Done