Allow passing vocabulary files to the pods #135

egabancho · 2024-12-03T10:52:36Z

We have several instances that share the same docker image. It'd be nice to configure the vocabularies each one loads using the helm chart.

Maybe something like this:

invenio:
  ...
  vocabularies:
    resource_types.yaml: |
        - id: publication
          icon: file alternate
          props:
            csl: report
            datacite_general: Text
            datacite_type: ""
            openaire_resourceType: "0017"
            openaire_type: publication
            eurepo: info:eu-repo/semantics/other
            schema.org: https://schema.org/CreativeWork
            subtype: ""
            type: publication
            marc21_type: publication
            marc21_subtype: ""
          title:
            en: Publication
          tags:
            - depositable
            - linkable
          ...
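
To make the idea concrete, here is a rough sketch of what the chart could render from those values: a ConfigMap with one key per vocabulary file, mounted into the pods as a directory. Nothing here exists in the chart yet. The ConfigMap name `invenio-vocabularies`, the pod and image names, and the mount path are all assumptions for illustration, and the vocabulary entry is shortened.

```yaml
# Hypothetical rendered output; names and paths are illustrative assumptions.
apiVersion: v1
kind: ConfigMap
metadata:
  name: invenio-vocabularies  # assumed name
data:
  # Each key becomes a file inside the mounted directory.
  resource_types.yaml: |
    - id: publication
      title:
        en: Publication
      tags:
        - depositable
        - linkable
---
# Stand-in for the chart's web/worker pods, showing only the mount.
apiVersion: v1
kind: Pod
metadata:
  name: web-example  # hypothetical pod
spec:
  containers:
    - name: web
      image: example/invenio:latest  # placeholder image
      volumeMounts:
        - name: vocabularies
          # Assumed location of the instance's vocabulary files.
          mountPath: /opt/invenio/var/instance/app_data/vocabularies
          readOnly: true
  volumes:
    - name: vocabularies
      configMap:
        name: invenio-vocabularies
```

One thing to keep in mind with this approach: mounting a ConfigMap volume over a directory hides whatever files the image already ships at that path, so each instance would need to provide every vocabulary file it expects to find there.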


lindhe · 2024-12-04T11:54:08Z

I think loading vocabs via ConfigMaps may make sense, but I believe they are sometimes too big to fit in a ConfigMap. Do you know if Invenio has any kind of dynamic loading of resources that we could use, for example loading vocabs from S3 after container startup?

egabancho · 2024-12-04T21:31:02Z

They can be large, but they don't have to be. To give you an idea, these are the "default" ones: https://github.com/inveniosoftware/invenio-rdm-records/tree/master/invenio_rdm_records/fixtures/data/vocabularies

Honestly, it was the only way I could think of to load them into the pods. What I mean by that is that I am open to suggestions 😂

> Do you know if Invenio has any kind of dynamic loading of resources that we could use, for example loading vocabs from S3 after container startup?

This is actually a good idea. Still, we will have to allow configuring the vocabularies.yaml file itself (a much smaller file).
There is something called SimpleHTTPReader, so maybe ... I'll explore that and come back with the findings!
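
For context, vocabularies.yaml is small because it only maps each vocabulary to its data file. A minimal sketch, assuming the same layout as the default fixtures linked above (the key names and the PID type are recalled from memory and should be checked against that directory):

```yaml
# app_data/vocabularies.yaml (sketch; verify the keys against the default fixtures)
resourcetypes:
  pid-type: rsrct
  data-file: vocabularies/resource_types.yaml
```

If that layout holds, an instance that only overrides resource types would need just this one entry plus the data file it points to.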

egabancho · 2024-12-05T11:25:40Z

There is indeed a 1 MiB limit on the size of a ConfigMap (see https://kubernetes.io/docs/concepts/configuration/configmap/#motivation and https://stackoverflow.com/questions/53012798/kubernetes-configmap-size-limitation).

Still, I believe this is a far simpler solution for smaller vocabularies (most of the cases). It is also doable to fetch larger data from a service, e.g., ORCID authors, or even from a file stored on S3. They shouldn't be mutually exclusive.

lindhe · 2024-12-05T13:10:45Z

Yes, I think this is viable for small vocabularies. I know ours at KTH are many GB in size, but that may not be the case for everyone. Having this method in place could still be helpful in some cases.

And I know @Samk13 mentioned to me that there's work (or plans, at least) to implement some dynamic loading of vocabularies from outside sources directly in the application. That's probably the way to go for large vocabularies.

Samk13 · 2024-12-05T13:42:32Z

You can check the progress here

egabancho added a commit to egabancho/helm-invenio that referenced this issue Dec 3, 2024

invenio: add vocabularies

45b30fb

* Adds the option to customize vocabularies (closes inveniosoftware#135)

egabancho linked a pull request Dec 3, 2024 that will close this issue

Custom vocabularies #137

Open

egabancho self-assigned this Dec 4, 2024

egabancho added a commit to egabancho/helm-invenio that referenced this issue Dec 5, 2024

invenio: add vocabularies

73d1a37

* Adds the option to customize vocabularies (closes inveniosoftware#135)

egabancho added a commit to egabancho/helm-invenio that referenced this issue Dec 5, 2024

invenio: add vocabularies

006a018

* Adds the option to customize vocabularies (closes inveniosoftware#135)

egabancho added a commit to egabancho/helm-invenio that referenced this issue Dec 10, 2024

invenio: add vocabularies

9362600

* Adds the option to customize vocabularies (closes inveniosoftware#135)
