Integrating Cortx and PyTorch

PyTorch

PyTorch is an open-source deep learning framework that provides graph-based execution, distributed training, mobile deployment, quantization, and tools for developing neural network models.

Cortx

CORTX is a distributed object storage system designed for great efficiency, massive capacity, and high HDD-utilization. CORTX is 100% Open Source.

Why Integrate PyTorch to Cortx

Applications of deep learning have grown rapidly in the 21st century, in finance, healthcare, self-driving cars, and other fields. Most of these applications need massive datasets to perform well, and hosting that data becomes expensive. With Cortx, you can host massive data from different sources at a lower cost.

Integration Process.

For this integration, the deep learning model is trained on the CPU, so I use a small sample dataset with 5 classes and train for 1 epoch. The data is stored in Cortx-S3 and read through a custom PyTorch Dataset loader, which feeds a convolutional neural network for multi-class classification. The data on S3 can be viewed using Cyberduck.

Step 1: Install the Cortx OVA on VirtualBox following these guidelines:

Step 2: Set up a Cortx-S3 account via the Cortx GUI dashboard to generate the access key and secret key.

Step 3: Data Preprocessing:

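Follow the instructions in this notebook (Cortx-PyTroch Integration - 1-Data Preprocessing.ipynb) to create, list, and delete a bucket and to upload a file, using the connection settings below.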

endpoint_url="http://ens34 ip address"
aws_access_key_id = "provided on Step 2"
aws_secret_access_key = "provided on Step 2"
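The settings above can be passed to a boto3 S3 client, which then covers the bucket operations named in this step. The following is a minimal sketch, not the notebook's code: the helper names, bucket name, and file paths are hypothetical, and it assumes the Cortx-S3 endpoint accepts the standard `list_objects_v2` call.

```python
def make_s3_client(endpoint_url, access_key, secret_key):
    """Create a boto3 S3 client that talks to a custom (non-AWS) endpoint."""
    import boto3  # imported lazily so the helpers below work with any S3-like client

    return boto3.client(
        "s3",
        endpoint_url=endpoint_url,
        aws_access_key_id=access_key,
        aws_secret_access_key=secret_key,
    )


def list_bucket_names(client):
    """Return the names of all buckets visible to this account."""
    return [b["Name"] for b in client.list_buckets().get("Buckets", [])]


def list_object_keys(client, bucket):
    """Return object keys in `bucket` (a single page, at most 1000 keys)."""
    response = client.list_objects_v2(Bucket=bucket)
    return [obj["Key"] for obj in response.get("Contents", [])]


def create_and_upload(client, bucket, local_path, key):
    """Create a bucket (assumed not to exist yet) and upload one local file."""
    client.create_bucket(Bucket=bucket)
    client.upload_file(local_path, bucket, key)


def delete_bucket(client, bucket):
    """Empty the bucket first, since S3 refuses to delete a non-empty bucket."""
    for key in list_object_keys(client, bucket):
        client.delete_object(Bucket=bucket, Key=key)
    client.delete_bucket(Bucket=bucket)
```

A typical session would call `make_s3_client(endpoint_url, aws_access_key_id, aws_secret_access_key)` once and pass the resulting client to the other helpers.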

Step 4: Loading the data from Cortx-S3 into a PyTorch Dataset Loader:

Follow the setup in this notebook (Cortx-PyTroch Integration - 2, Loading Data from Cotrx-S3 and Train the model.ipynb).
This is the main step of the PyTorch integration. Currently, PyTorch does not have a pre-existing Dataset loader that fetches data from S3, so you need to create a custom Dataset class:

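    class ImageDataset(Dataset):
        def __init__(self): ...          # store the bucket details and the list of images

        def __len__(self): ...           # return the number of images

        def __getitem__(self, idx): ...  # fetch one image from S3 and return it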

Use boto3 to fetch each image from the bucket and OpenCV to decode it, then convert it to a PIL image, the format that torchvision transforms expect.

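   # cv2.imdecode expects an IMREAD_* flag; the BGR-to-RGB conversion is a separate step
   image = cv2.imdecode(np.asarray(bytearray(img_name), dtype=np.uint8), cv2.IMREAD_COLOR)
   image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
   image = Image.fromarray(image)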
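Putting the pieces of this step together, a custom Dataset might look like the sketch below. It is an illustration under assumptions, not the notebook's implementation: the class name `S3ImageDataset`, its parameters, and the `(key, label)` sample format are hypothetical. The default decoder uses Pillow instead of OpenCV to keep dependencies small; an OpenCV-based function can be passed through `decode` instead.

```python
import io

try:
    from torch.utils.data import Dataset
except ImportError:  # lets the sketch be read and tried without PyTorch installed
    Dataset = object


def decode_with_pil(data):
    """Decode raw image bytes into an RGB PIL image."""
    from PIL import Image  # imported lazily; only this default decoder needs it

    return Image.open(io.BytesIO(data)).convert("RGB")


class S3ImageDataset(Dataset):
    """Map-style dataset that reads each image from an S3 bucket on demand."""

    def __init__(self, client, bucket, samples, transform=None, decode=decode_with_pil):
        self.client = client          # boto3 S3 client (anything with get_object works)
        self.bucket = bucket
        self.samples = list(samples)  # (object key, integer label) pairs
        self.transform = transform    # optional callable applied to the decoded image
        self.decode = decode          # turns raw bytes into an image

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        key, label = self.samples[idx]
        body = self.client.get_object(Bucket=self.bucket, Key=key)["Body"]
        image = self.decode(body.read())
        if self.transform is not None:
            image = self.transform(image)
        return image, label
```

An instance can be wrapped in `torch.utils.data.DataLoader` like any other map-style dataset, with a transform that ends in a tensor conversion so that batches can be collated.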

Use the s3fs library for directory-style operations such as listing folders and files. S3 buckets have no real folder structure, only object keys that share prefixes; s3fs makes it easier to work with a folder-like layout.
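To illustrate how a folder-like layout sits on top of flat object keys, the sketch below lists every object path with s3fs and groups the paths by class folder. The layout `bucket/<split>/<class>/<file>` and both function names are assumptions made for this example; the notebook may organise its data differently.

```python
from collections import defaultdict


def list_paths_with_s3fs(bucket, endpoint_url, access_key, secret_key):
    """Return every object path under `bucket`, e.g. 'bucket/train/cat/1.jpg'."""
    import s3fs  # imported lazily so group_by_class below works without s3fs

    fs = s3fs.S3FileSystem(
        key=access_key,
        secret=secret_key,
        client_kwargs={"endpoint_url": endpoint_url},
    )
    return fs.find(bucket)


def group_by_class(paths, split):
    """Group 'bucket/<split>/<class>/<file>' paths by their class folder."""
    groups = defaultdict(list)
    for path in paths:
        parts = path.split("/")
        # Keep only paths shaped exactly like: bucket, split, class name, file name.
        if len(parts) == 4 and parts[1] == split:
            groups[parts[2]].append(path)
    return {name: sorted(files) for name, files in sorted(groups.items())}
```

The sorted class names give a stable mapping from folder name to integer label, which is what a multi-class classifier needs.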

Step 5: Save the trained model directly to Cortx-S3:

I do not save the model to my local machine first and then upload it; I save it directly to S3. The aim is to use as little local storage as possible.
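One way to save straight to the bucket is to serialize the model into an in-memory buffer and upload that buffer. The sketch below assumes a boto3 S3 client; the function name `save_state_to_s3`, its parameters, and the bucket and key names are hypothetical, and the `save_fn` hook is an addition made here so the upload path can be tried without PyTorch.

```python
import io


def save_state_to_s3(client, bucket, key, state, save_fn=None):
    """Serialize `state` in memory and upload it, without writing to local disk."""
    if save_fn is None:
        import torch  # imported lazily; torch.save accepts a file-like object

        save_fn = torch.save
    buffer = io.BytesIO()
    save_fn(state, buffer)
    buffer.seek(0)  # rewind so the upload starts at the first byte
    client.upload_fileobj(buffer, bucket, key)
```

For a trained model this would be called as `save_state_to_s3(client, "models", "cnn.pth", model.state_dict())`.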

Step 6: Load the saved model from Cortx-S3 - Inference.

Follow the guidelines in this notebook (Cortx-PyTroch Integration - 3, Load trained model from Cotrx-S3 for Inference .ipynb).
After training any machine learning model, the next step is to evaluate how it performs on the test data. For this setup, the test images are on local storage. You can save them to Cortx-S3 and load them directly from there, as we did with the training and validation images.
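Loading the saved model can also bypass local storage by downloading the object into an in-memory buffer. This is a minimal sketch assuming a boto3 S3 client; the function name `load_state_from_s3`, the bucket and key names, and the `load_fn` hook (added so the download path can be tried without PyTorch) are all hypothetical.

```python
import io


def load_state_from_s3(client, bucket, key, load_fn=None):
    """Download an object into memory and deserialize it, skipping local disk."""
    if load_fn is None:
        import torch  # imported lazily; torch.load accepts a file-like object

        load_fn = torch.load
    buffer = io.BytesIO()
    client.download_fileobj(bucket, key, buffer)
    buffer.seek(0)  # rewind before reading what was just downloaded
    return load_fn(buffer)
```

For inference, the result would be passed to `model.load_state_dict(...)`, followed by `model.eval()` before running predictions.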

Step 7: Experiment!!!:

Integrate different PyTorch models to Cortx and see how they perform.

Demo

Contributors:

Rose Wambui
