Allow passing vocabulary files to the pods #135

Open
egabancho opened this issue Dec 3, 2024 · 5 comments · May be fixed by #137
@egabancho
Member

We have several instances that share the same Docker image. It'd be nice to configure, through the Helm chart, which vocabularies each one loads.

Maybe something like this:

invenio:
  ...
  vocabularies:
    resource_types.yaml: |
        - id: publication
          icon: file alternate
          props:
            csl: report
            datacite_general: Text
            datacite_type: ""
            openaire_resourceType: "0017"
            openaire_type: publication
            eurepo: info:eu-repo/semantics/other
            schema.org: https://schema.org/CreativeWork
            subtype: ""
            type: publication
            marc21_type: publication
            marc21_subtype: ""
          title:
            en: Publication
          tags:
            - depositable
            - linkable
          ...
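
To make the proposal concrete, here is a minimal Python sketch of how such a `vocabularies` mapping could translate into a ConfigMap, with one `data` key per vocabulary file. The function name `vocabularies_configmap` and the ConfigMap name `invenio-vocabularies` are assumptions for illustration only; they are not taken from the chart or from the linked pull request.

```python
import json


def vocabularies_configmap(values: dict, name: str = "invenio-vocabularies") -> dict:
    """Build a ConfigMap manifest from Helm-style values.

    Hypothetical helper: `name` and the function itself are illustrative.
    Each key under `invenio.vocabularies` becomes a file name in `data`,
    and each value becomes that file's contents.
    """
    vocabularies = values.get("invenio", {}).get("vocabularies", {})
    return {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {"name": name},
        "data": dict(vocabularies),
    }


if __name__ == "__main__":
    values = {
        "invenio": {
            "vocabularies": {
                "resource_types.yaml": "- id: publication\n  icon: file alternate\n",
            }
        }
    }
    # JSON is also valid YAML, so the output can be inspected as a manifest.
    print(json.dumps(vocabularies_configmap(values), indent=2))
```

Mounting the resulting ConfigMap as a volume would expose each key as a file inside the pod, which is what lets instances sharing one image load different vocabularies.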
egabancho added a commit to egabancho/helm-invenio that referenced this issue Dec 3, 2024
* Adds the option to customize vocabularies (closes inveniosoftware#135)
@egabancho egabancho linked a pull request Dec 3, 2024 that will close this issue
@lindhe
Contributor

lindhe commented Dec 4, 2024

I think loading vocabs via ConfigMaps may make sense, but I believe they are sometimes too big to fit in a ConfigMap. Do you know if Invenio has any kind of dynamic loading of resources that we could use, for example loading vocabs from S3 after container startup?

@egabancho
Member Author

They can be large, but they don't have to be. These are the "default" ones, to give you an idea: https://github.com/inveniosoftware/invenio-rdm-records/tree/master/invenio_rdm_records/fixtures/data/vocabularies

Honestly, it was the only way I could think of to load them into the pods. What I mean by that is that I am open to suggestions 😂

> Do you know if Invenio has any kind of dynamic loading of resources that we could use, for example loading vocabs from S3 after container startup?

This is actually a good idea. Still, we will have to allow configuring the vocabularies.yaml file itself (a much smaller file).
There is something called SimpleHTTPReader, so maybe ... I'll explore that and come back with the findings!

@egabancho egabancho self-assigned this Dec 4, 2024
@egabancho
Member Author

There is indeed a 1 MiB limit on the size of a ConfigMap (see https://kubernetes.io/docs/concepts/configuration/configmap/#motivation and https://stackoverflow.com/questions/53012798/kubernetes-configmap-size-limitation).

Still, I believe this is a far simpler solution for smaller vocabularies, which is most cases. Larger data could also be fetched from a service (e.g., ORCID authors) or from a file stored on S3. The two approaches shouldn't be mutually exclusive.
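
As a rough way to tell which vocabularies fall on which side of that limit, the Python sketch below sums the UTF-8 size of each file name and body and compares it against 1 MiB. The helper names and the `headroom` margin are assumptions made for illustration, not part of the chart. The limit applies to the ConfigMap's stored data as a whole, so this is only an approximation.

```python
# Hypothetical helpers, not part of helm-invenio: estimate whether a set of
# vocabulary files is small enough to ship in a single ConfigMap.

CONFIGMAP_LIMIT_BYTES = 1024 * 1024  # 1 MiB, per the Kubernetes ConfigMap docs


def configmap_payload_size(files: dict[str, str]) -> int:
    """Approximate payload size: UTF-8 bytes of every key and value."""
    return sum(
        len(name.encode("utf-8")) + len(body.encode("utf-8"))
        for name, body in files.items()
    )


def fits_in_configmap(files: dict[str, str], headroom: int = 16 * 1024) -> bool:
    """Return True if the files fit under the limit.

    `headroom` is an assumed safety margin (in bytes) for metadata and
    serialization overhead; tune or drop it as needed.
    """
    return configmap_payload_size(files) + headroom <= CONFIGMAP_LIMIT_BYTES


if __name__ == "__main__":
    small = {"resource_types.yaml": "- id: publication\n"}
    large = {"names.yaml": "x" * (2 * 1024 * 1024)}  # stand-in for a big vocabulary
    print(fits_in_configmap(small))
    print(fits_in_configmap(large))
```

Anything that fails this check would be a candidate for the fetch-from-a-service or S3 route instead.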

@lindhe
Contributor

lindhe commented Dec 5, 2024

Yes, I think this is viable for small vocabularies. I know ours at KTH are many GB in size, but that's maybe not the case for everyone. If we have this method in place, maybe that's helpful sometimes.

And I know @Samk13 mentioned to me that there's work (or plans, at least) to implement some dynamic loading of vocabularies from outside sources directly in the application. That's probably the way to go for large vocabularies.

@Samk13
Member

Samk13 commented Dec 5, 2024

You can check the progress here

egabancho added a commit to egabancho/helm-invenio that referenced this issue Dec 5, 2024
* Adds the option to customize vocabularies (closes inveniosoftware#135)
egabancho added a commit to egabancho/helm-invenio that referenced this issue Dec 10, 2024
* Adds the option to customize vocabularies (closes inveniosoftware#135)