IPFS CID pin orchestration through cluster into private swarm #86

Open
3 tasks
benhylau opened this issue Apr 5, 2022 · 8 comments

Comments

@benhylau
Contributor

benhylau commented Apr 5, 2022

Task Summary

📅 Due date: N/A
🎯 Success criteria: Be able to take encrypted archives and pin them across private swarm in staging.

[@YurkoWasHere to fill]

Files to store:

  • /mnt/integrity_store/starling/internal/{org_id}/{collection_id}/action-archive/*.encrypted (NOT *.zip)

To Do

  • Research ipfs-cluster and how pins are orchestrated
  • Implement ipfs/ipfs-cluster pinning in backend
  • Start thinking about documentation on how to operate a "node" with audience: archiving org, news org, etc.
    • why and how
    • what is ipfs, filecoin, storj
    • security
      • private swarm key
      • ipfs gateway
      • data residency and deletion (who stores the stuff in each network)
    • pros and cons, risks
    • costs
    • ansibles
    • see: https://www.sucho.org
@benhylau
Contributor Author

benhylau commented Apr 6, 2022

Here are several links Storj shared in our last call that'd be helpful to go over as we investigate cross-pinning onto their storage network:

@galargh
Contributor

galargh commented Apr 11, 2022

Here's the script that @YurkoWasHere shared during the last sync:

diff --new-line-format="" --unchanged-line-format="" <(ipfs pin ls --type=recursive | cut -f1 -d' ' | ipfs cid format -b base32 -v 1 | sort | uniq) <(ipfs-cluster-ctl pin ls | cut -f1 -d' ' | ipfs cid format -b base32 -v 1 | sort | uniq) | parallel ipfs-cluster-ctl pin add --no-status {}

If I understand correctly, the script finds objects that are pinned to local storage but not in the cluster and pins them in the cluster. Could you let me know where we intend to use it? I couldn't find it myself in the repos I have access to.
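
For readers who find the one-liner hard to follow, its core is a set difference: pins known to the local ipfs node minus pins known to the cluster. Below is a minimal Python sketch of just that step. The function names are hypothetical, and it assumes the CIDs in both listings are already normalized to the same form (the shell version does this with `ipfs cid format -b base32 -v 1`).

```python
from typing import Iterable, List, Set


def first_fields(lines: Iterable[str]) -> Set[str]:
    """Return the first whitespace-separated field of every non-empty line.

    Mirrors the `cut -f1 -d' '` step: the CID is the first column of both
    `ipfs pin ls` and `ipfs-cluster-ctl pin ls` output.
    """
    return {line.split()[0] for line in lines if line.strip()}


def missing_from_cluster(
    local_pin_lines: Iterable[str], cluster_pin_lines: Iterable[str]
) -> List[str]:
    """CIDs pinned locally but absent from the cluster, sorted.

    Assumption: both inputs use the same CID encoding, so plain string
    comparison is enough to match them.
    """
    return sorted(first_fields(local_pin_lines) - first_fields(cluster_pin_lines))


if __name__ == "__main__":
    # Placeholder identifiers, not real CIDs.
    local = ["cid-a recursive", "cid-b recursive"]
    cluster = ["cid-b | | PINNED"]
    print(missing_from_cluster(local, cluster))
```

Each CID returned would then be handed to `ipfs-cluster-ctl pin add`, as the last stage of the pipeline does.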

If it is to make sure that everything we add to ipfs is pinned in ipfs-cluster as well then I think replacing ipfs add with ipfs-cluster-ctl add should do the trick since the latter adds stuff to ipfs, pins it there and pins it in the cluster as well (btw, I couldn't find where we add to ipfs either so any links would be helpful - unless that's not implemented yet in which case I can try jumping on it).
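
A rough sketch of what calling `ipfs-cluster-ctl add` from the Python backend could look like. The function names and the injectable `run` parameter are hypothetical and not taken from the repository; the sketch assumes `ipfs-cluster-ctl` is on `PATH` and can reach a running cluster peer. It relies on the `--quieter` option, which prints only the final hash, and `--cid-version 1` to get a CIDv1.

```python
import subprocess
from typing import Callable, List


def build_cluster_add_command(file_path: str) -> List[str]:
    """Build the argv for adding (and cluster-pinning) a single file."""
    return ["ipfs-cluster-ctl", "add", "--quieter", "--cid-version", "1", file_path]


def add_to_cluster(file_path: str, run: Callable = subprocess.run) -> str:
    """Add a file through the cluster and return the CID it reports.

    `run` is injectable so the function can be exercised without a live
    cluster; by default it is subprocess.run.
    """
    result = run(
        build_cluster_add_command(file_path),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout.strip()
```

Keeping the command construction separate from the subprocess call makes it easy to unit-test the former and to swap the latter for the cluster REST API later, if that turns out to be a better fit.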

@YurkoWasHere
Contributor

If I understand correctly, the script finds objects that are pinned to local storage but not in the cluster and pins them in the cluster.

Yes, this essentially syncs pins on ipfs into an ipfs-cluster. It started off as 3 lines and got turned into this crazy thing :) I was just looking at what is possible and how ipfs-cluster actually works.

If it is to make sure that everything we add to ipfs is pinned in ipfs-cluster as well then I think replacing ipfs add with ipfs-cluster-ctl add should do the trick since the latter adds stuff to ipfs, pins it there and pins it in the cluster as well

This is my understanding as well.

(btw, I couldn't find where we add to ipfs either so any links would be helpful - unless that's not implemented yet in which case I can try jumping on it).

This function does not exist yet. Currently the only time we invoke IPFS is to generate the CID:

def digest_cidv1(self, file_path):

For example
https://github.com/starlinglab/integrity-backend/blob/dev/integritybackend/actions.py#L117

I would imagine the add to IPFS would happen at the bottom of the action:
https://github.com/starlinglab/integrity-backend/blob/dev/integritybackend/actions.py#L161

@benhylau ^^ is there another action you're thinking of pinning, or is this good?

@YurkoWasHere
Contributor

YurkoWasHere commented Apr 12, 2022

It may be in Archive Action
#67

Never mind, the above is the archive action.

@galargh
Contributor

galargh commented Apr 26, 2022

TODO @galargh:

  • Check out https://demo.storj-ipfs.com/
  • Describe findings from publicly and privately available storj resources
  • Write up what it means for stuff to be on IPFS/storj

@galargh
Contributor

galargh commented Apr 28, 2022

Axes to consider for the documentation:

- Geofencing
  - 1st level being US vs. EU (via IP address or KYC)
  - 2nd level being by country (via IP address or KYC)
- Whitelisting of specific storage providers
  - 1st level being that nodes have a persistent identity on the network, can be pseudonymous
  - 2nd level being we have contact information via some KYC process, so that we can “call a node operator” to sort out issues
  - 3rd level being established orgs (like Internet Archive, etc.) are the only node operators (similar to how Lit imagines their validator network)
- Access
  - 1st level being data everywhere, encryption key is only protection
  - 2nd level being a static shared key to gate access of actual encrypted content (e.g. private ipfs swarm key)
  - 3rd level being a dynamically provisioned, ACL-based, or revocable key to gate access of content (so leaked keys have a remedy)
- Erasure
  - a programmatic way to signal an erasure of content on the network

@galargh
Contributor

galargh commented May 10, 2022

Because of holiday in-between I redirected my efforts a bit. I caught up on previous Slack convos with Storj and set up a repository for hosting documentation: the repository, the documentation site hosted with GitHub Pages.

Now that the form is ready, my plan is to get back to my TODO list.

@galargh
Contributor

galargh commented May 10, 2022

Sync follow-up:

Because of holiday in-between I redirected my efforts a bit. I caught up on previous Slack convos with Storj and set up a repository for hosting documentation: the repository, the documentation site hosted with GitHub Pages.

I'll move my markdowns to wiki instead 🚀 and I'll work on them there 🥳


@benhylau I think as far as IPFS goes, all we need is to decide where exactly to do ipfs-cluster-ctl add (i.e. the answer to this #86 (comment)) + make sure that we init ipfs and ipfs-cluster on the machines.
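
One piece of initializing ipfs for a private swarm is the shared swarm key that every node needs in its repo directory (as `swarm.key`) before it can join. As a hedged illustration, the sketch below generates a key in the pre-shared-key file format that ipfs expects: a header line, an encoding line, and 32 random bytes as hex. The function name is hypothetical, and how the key is distributed to the machines (for example via the ansibles mentioned in the task list) is left open here.

```python
import secrets


def generate_swarm_key() -> str:
    """Return the contents of a swarm.key file for a private ipfs swarm.

    Format: protocol header, encoding line, then a 256-bit key encoded as
    64 hexadecimal characters.
    """
    key_hex = secrets.token_hex(32)  # 32 random bytes -> 64 hex characters
    return "/key/swarm/psk/1.0.0/\n/base16/\n" + key_hex + "\n"


if __name__ == "__main__":
    # Print to stdout; redirect into the ipfs repo directory as swarm.key.
    print(generate_swarm_key(), end="")
```

The same file has to be present on every node in the swarm, so it should be treated as a secret: anyone holding it can join the private network.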
