Standard.csv merge service #206

Closed
ColmDC opened this issue Jun 5, 2023 · 5 comments
Comments

@ColmDC
Contributor

ColmDC commented Jun 5, 2023

Is your feature request related to a problem? Please describe.
Data about co-ops from different sources can be merged in many ways. The effort we have put into standards for the data we publish makes merging easier; however, we have not written any data-merging functionality for the data factory. We publish only one map that demonstrates merging of co-op data sets, and there the merging was carried out within Airtable.

Describe the solution you'd like
A function which can be easily called from the data factory that can take the following files:

  • a primary standards csv file;
  • a secondary standards csv file;
  • a list which matches IDs from the primary file with entries in the secondary;
  • a list of the fields in the secondary, not present in the primary, that we wish to add;

and merges the two CSV files so that the resulting CSV file has the following characteristics:

  • it includes data for all the co-ops present in the primary data set, and no other;
  • it includes all the fields in the primary source and the specified new fields from the secondary;
  • using the list of matches, it takes the values of the additional fields from matched co-ops in the secondary file and populates those fields in the corresponding primary co-op entry.
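
To make the request concrete, here is a minimal sketch of the merge described above, using only Ruby's standard `csv` library. It is an illustration under assumptions, not a proposed final design: the function name `merge_standard_csv`, the `Identifier` ID column, the use of a Hash for the list of matches, and the sample data are all hypothetical.

```ruby
require "csv"

# Hypothetical helper (not part of the data factory or SeOpenData).
# primary_csv / secondary_csv: CSV text with a header row.
# matches: Hash mapping primary IDs to secondary IDs (assumed shape).
# extra_fields: names of secondary fields to add to the output.
# id_field: assumed name of the ID column in both files.
def merge_standard_csv(primary_csv, secondary_csv, matches, extra_fields, id_field: "Identifier")
  primary = CSV.parse(primary_csv, headers: true)
  secondary = CSV.parse(secondary_csv, headers: true)

  # Index the secondary rows by ID so each lookup is a Hash access.
  secondary_by_id = secondary.each_with_object({}) do |row, index|
    index[row[id_field]] = row
  end

  # Only add fields the primary does not already have.
  new_fields = extra_fields - primary.headers
  out_headers = primary.headers + new_fields

  CSV.generate do |out|
    out << out_headers
    # Iterate over primary rows only, so no other co-ops appear in the output.
    primary.each do |row|
      match = secondary_by_id[matches[row[id_field]]]
      values = primary.headers.map { |h| row[h] }
      # Unmatched co-ops get empty values for the added fields.
      values += new_fields.map { |f| match && match[f] }
      out << values
    end
  end
end

# Made-up sample data for illustration.
primary = <<~PRIMARY
  Identifier,Name
  p1,Alpha Co-op
  p2,Beta Co-op
PRIMARY

secondary = <<~SECONDARY
  Identifier,Name,Website
  s9,Alpha Cooperative,https://alpha.example
  s7,Gamma Co-op,https://gamma.example
SECONDARY

merged = merge_standard_csv(primary, secondary, { "p1" => "s9" }, ["Website"])
puts merged
# Identifier,Name,Website
# p1,Alpha Co-op,https://alpha.example
# p2,Beta Co-op,
```

Passing the matches and the field list as plain Ruby values keeps the core independent of how those lists are stored, which may help if more generic merges are needed later.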

Describe alternatives you've considered
We have considered merging on a co-op by co-op basis, as the user selects a co-op, with a SPARQL-enabled microservice, but we believe this may not be very responsive.

Additional context
For ease of incorporation into the Data Factory, the function should be written in Ruby, but there are other factors to consider, so we should consider the options.
There are many other types of merge we can imagine needing in the near future, so it may be worth considering functionality a little more generic than strictly required for this type of merge, provided the cost is not too high.

@ColmDC
Contributor Author

ColmDC commented Jun 11, 2023

At dev.data.solidarityeconomy.coop/coops-uk/standard.csv I can find the CSV published for the Co-ops UK data, but
there is nothing at https://dev.data.solidarityeconomy.coop/dotcoop/standard.csv

Are the paths different for some reason? @wu-lee

@ColmDC
Contributor Author

ColmDC commented Jun 13, 2023

Where can I find the standard.csv for current dotcoop data? @wu-lee

@wu-lee
Contributor

wu-lee commented Jun 14, 2023

It should be there now. The SeOpenData gem was an older version which didn't publish this file, so I updated it.

@ColmDC
Contributor Author

ColmDC commented Sep 8, 2023

Dropping this out of the two Epics relating to the Turtle Island conferences, as it is a proposed merge solution that is unlikely to be used now.

@ColmDC
Contributor Author

ColmDC commented Jan 26, 2024

Closing this as obsolete @wu-lee

@ColmDC ColmDC closed this as completed Jan 27, 2024