Skip to content

Conversation

@kirkkwang
Copy link
Contributor

This commit will add logic so users can set a file set row in their CSVs and have granular control over adding metadata to file sets.

An example CSV could look like:

file,model,source_identifier,parents,title,creator,rights_statement,visibility
,GenericWork,gw-123,,"Custom Generic Work Title",KKW,http://rightsstatements.org/vocab/NKC/1.0/,open
dummy.pdf,FileSet,fs-123,gw-123,Custom File Set Title,KKW,,restricted

This above would still need to be zipped with the dummy.pdf file in the files directory.

You can update it as well with a CSV like:

model,source_identifier,title,visibility
FileSet,fs-12346,Updated Custom File Set Title,open

For remote files you would need to include the URL in the remote_files column. An example CSV could look like:

remote_files,model,source_identifier,parents,title,creator,rights_statement,visibility
,GenericWork,gw-remote-123,,"Custom Remote File Generic Work Title",KKW,http://rightsstatements.org/vocab/NKC/1.0/,open
https://upload.wikimedia.org/wikipedia/commons/a/a9/Example.jpg,FileSet,fs-remote-123,gw-remote-123,Custom Remote File Set Title,KKW,,open

And you could update it with a CSV like:

model,source_identifier,title,creator,visibility
FileSet,fs-remote-1234,Custom Remote File Set Title (updated),Kirk,restricted

This commit will add logic so users can set a file set row in their CSVs
and have granular control over adding metadata to file sets.

An example CSV could look like:
```csv
file,model,source_identifier,parents,title,creator,rights_statement,visibility
,GenericWork,gw-123,,"Custom Generic Work Title",KKW,http://rightsstatements.org/vocab/NKC/1.0/,open
dummy.pdf,FileSet,fs-123,gw-123,Custom File Set Title,KKW,,restricted
```

This above would still need to be zipped with the `dummy.pdf` file in
the `files` directory.

You can update it as well with a CSV like:
```csv
model,source_identifier,title,visibility
FileSet,fs-12346,Updated Custom File Set Title,open
```

For remote files you would need to include the URL in the `remote_files`
column.  An example CSV could look like:
```csv
remote_files,model,source_identifier,parents,title,creator,rights_statement,visibility
,GenericWork,gw-remote-123,,"Custom Remote File Generic Work Title",KKW,http://rightsstatements.org/vocab/NKC/1.0/,open
https://upload.wikimedia.org/wikipedia/commons/a/a9/Example.jpg,FileSet,fs-remote-123,gw-remote-123,Custom Remote File Set Title,KKW,,open
```

And you could update it with a CSV like:
```csv
model,source_identifier,title,creator,visibility
FileSet,fs-remote-1234,Custom Remote File Set Title (updated),Kirk,restricted
```
@kirkkwang kirkkwang force-pushed the create-and-update-file-sets branch from a27ed00 to dadc894 Compare September 20, 2025 20:16
@kirkkwang kirkkwang added the minor-ver for release notes label Sep 20, 2025
@ShanaLMoore ShanaLMoore changed the title 🎁 Implement create and update for file sets 🎁 Implement create and update for file set rows Sep 22, 2025
Copy link
Contributor

@maxkadel maxkadel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are just a few things I'm not clear on.

I'm thinking I want to move away from the "return [nil] unless [foo]" pattern. I think it's hiding a lot of problems, which then causes more problems. I think we should either return something that is of the same class that the method would otherwise return, or raise an error, or something like that, but not constantly return nil.

I'm definitely open to a wider conversation on that, but I think that's going to help us dig our way out of the "it fails and we don't know why and have to go into the console to recreate it and find the problem" situation

nil
end

# rubocop:disable Metrics/AbcSize
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we break this method up, rather than disabling the cop?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes by all means!

uploaded_file_id = attrs[:uploaded_files].first # currently assuming one file per FileSet
Hyrax::UploadedFile.find(uploaded_file_id)

return Hyrax.query_service.find_by(id: file.file_set_uri) if file.file_set_uri.present?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused - what is "file" here if we're setting it to the result of this method on line 344?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, what happens if file.file_set_uri.present? is false - won't it just return nil? Is that definitely what we want to do?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm on mobile so I can't see it all but I think file should be a Hyrax::UploadedFile

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah good question about the nil

'file_set.save_acl' => { permissions_params: change_set.input_params["permissions"] }
).call(change_set).value_or { false }

result if result
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if there's no result? Will the method just return nil? Is that what we really want?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup that looks like what it would do

add_collection_type_gid if defined?(::Hyrax)
# add any additional collection metadata methods here
elsif factory_class == Bulkrax.file_model_class
elsif factory_class == Bulkrax.file_model_class && @new_record
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if it's a FileSet and not a new record? Does it actually hit the update code?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah it should since an update should only happen on an already existing record

attrs = clean_attrs(attrs)
file_set_attrs = attrs.slice(*object.attributes.keys) || {}

file = if attrs[:uploaded_files].present?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we could split the file-finding chunk into its own pair of methods? One for uploaded files and one for remote files?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup absolutely, this is just a quick draft to get started, feel free to change as much as you need

end
end

def update_file_set(attrs)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When does this update_file_set get called vs. the FileFactoryInnerWorkings#update_file_set version? Should we actually have both?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good question, I didn't know the other one existed

@kirkkwang
Copy link
Contributor Author

That sounds good to me, I like raising errors personally so feel free to change it up as much or even start completely over, this was meant to be a proof of concept and mainly just focused on making it the happy path work.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

minor-ver for release notes

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants