
Investigate possible use of OpenEBS #23

Open
rconway opened this issue Nov 14, 2024 · 2 comments
rconway (Contributor) commented Nov 14, 2024

OpenEBS - https://openebs.io/

Investigate whether use of OpenEBS can be suggested in the Deployment Guide as a possible approach to establishing ReadWriteMany storage on any cluster.

This could support the EOEPCA Prerequisites as requested by #21.
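
For reference, regardless of which backend ends up being recommended, the consumer side is simply a PersistentVolumeClaim requesting the ReadWriteMany access mode. Below is a minimal sketch of that request using the official Kubernetes Python client; the storage class name openebs-rwx is a placeholder assumption, and the real name would depend on how OpenEBS (or any other provisioner) is installed on the cluster.

```python
# Minimal sketch: requesting a ReadWriteMany (RWX) volume via the Kubernetes Python client.
# The storage class name "openebs-rwx" is a placeholder assumption; the real name
# depends on how the chosen provisioner (OpenEBS or otherwise) is configured.
from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() when running in-cluster

pvc = client.V1PersistentVolumeClaim(
    metadata=client.V1ObjectMeta(name="eoepca-shared-data"),
    spec=client.V1PersistentVolumeClaimSpec(
        access_modes=["ReadWriteMany"],      # the RWX requirement under discussion
        storage_class_name="openebs-rwx",    # hypothetical storage class name
        resources=client.V1ResourceRequirements(requests={"storage": "10Gi"}),
    ),
)

client.CoreV1Api().create_namespaced_persistent_volume_claim(namespace="default", body=pvc)
```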

jdries (Contributor) commented Nov 14, 2024

We would appreciate this information very much.
Some explanation of operational cost and maintenance would also be helpful, so that we can assess how realistic it is to support RWX volumes via such a component.
I also understand that this is used for performance, to avoid moving data across the network, but I am wondering how this works on a big cluster. If I write data on one node, does it somehow stay there until another node tries to read it, or how does this high-performance kind of replication work?

spinto commented Nov 14, 2024

Hi @jdries, that's a good point about having more information on production (ops & maintenance), and I think it applies not only to RWX volumes but to all the EOEPCA prerequisites (e.g. even the K8s cluster itself). I have put a note in #21 to add more information on what is recommended for demo/test and what for production.

About the other points: I cannot speak about OpenEBS, as I have never used it, but supporting ReadWriteMany volumes in K8s with IBM Spectrum Scale, LUSTRE or GlusterFS is very realistic. These systems are used in production in many HPC centres (e.g. CERN), and I have myself used both GlusterFS and IBM Spectrum Scale at EUMETSAT on a large 2000-CPU-core K8s cluster for multi-mission bulk data processing.

About how it works: for most of these systems the backend is a distributed file system, which stores the data in multiple replicas on storage nodes "as close as possible" to the compute nodes, and uses multi-tier "intelligent" caching (based on read/write statistics on files and folders) to improve overall performance and stability.

Over-simplifying (and here I hope no HTC/HPC expert reads this, otherwise they will kill me :) ): if you write something into a folder from node A, some bits may land on node A and, in parallel (and in the background), also go to nodes B and C. When you then read from node D, you will read in parallel from A, B and C, while also caching on D. At some point, if you always write on A and read from D, D will become the reference, your future writes from A will go directly to D, and the cached bits on B and C will be deleted.
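
To make that over-simplified picture a bit more concrete, here is a toy model in plain Python of the write-replicate / read-cache / promote behaviour described above. It is purely illustrative and bears no relation to the actual internals of Spectrum Scale, LUSTRE, GlusterFS or OpenEBS; the node names, replica count and "promote after 3 reads" threshold are all made up.

```python
# Toy illustration of the over-simplified write-replicate / read-cache / promote
# behaviour described above. Purely illustrative: no relation to the real internals
# of Spectrum Scale, LUSTRE, GlusterFS or OpenEBS. All thresholds are made up.
class ToyDistributedFS:
    def __init__(self, nodes, replicas=3, promote_after=3):
        self.nodes = list(nodes)
        self.replicas = replicas
        self.promote_after = promote_after
        self.copies = {}  # path -> set of nodes holding a replica
        self.reads = {}   # path -> {node: read count}

    def write(self, writer, path):
        # Keep one replica local to the writer; replicate the rest "in the background".
        others = [n for n in self.nodes if n != writer][: self.replicas - 1]
        self.copies[path] = {writer, *others}

    def read(self, reader, path):
        # Read in parallel from all replicas; the per-node read count stands in
        # for the "intelligent" read-statistics / local caching described above.
        sources = set(self.copies[path])
        counts = self.reads.setdefault(path, {})
        counts[reader] = counts.get(reader, 0) + 1
        # If the same node keeps reading this path, promote it to the reference
        # replica and drop the now-redundant copies elsewhere.
        if counts[reader] >= self.promote_after:
            self.copies[path] = {reader}
        return sources


fs = ToyDistributedFS(nodes=["A", "B", "C", "D"])
fs.write("A", "/folder/file")     # bits land on A and, in the background, on B and C
for _ in range(3):
    fs.read("D", "/folder/file")  # D reads in parallel from A, B, C and caches locally
print(fs.copies["/folder/file"])  # after repeated reads from D, D becomes the reference
```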
