This project demonstrates how you can build and deliver data apps quickly, using technologies such as:
To accelerate the delivery of the apps, we offer a CI/CD pipeline model built on modern, reliable AWS tools. The pipeline responds to every code change by building new versions of the Docker images and deploying them to the Kubernetes cluster.
The dataset used in this project was processed with the myanimelist-data-collector project, a web scraper capable of downloading data for thousands of anime titles. It transforms the data from JSON to Parquet to offer the best performance on analytics tasks.
The application, built in Python with AWS Data Wrangler and Streamlit, is a sample of how easily and quickly you can build applications that guide your users and customers through data visualization, data analysis, and ML/AI solutions.
It doesn't require expertise in front-end technologies: thanks to Streamlit's resources, you can build both the back end and the front end using the framework alone.
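A minimal Streamlit app might look like the sketch below. The file path, column names, and widget labels are placeholders, not the project's actual code.

```python
# Minimal Streamlit data-app sketch (hypothetical names throughout).
# Launch with: streamlit run app.py
import pandas as pd


def top_rated(df: pd.DataFrame, n: int = 10) -> pd.DataFrame:
    """Return the n highest-scored titles."""
    return df.sort_values("score", ascending=False).head(n)


def main() -> None:
    # Imported inside the function so the helper above stays usable
    # in environments without Streamlit installed.
    import streamlit as st

    st.title("MyAnimeList Explorer")  # placeholder title
    df = pd.read_parquet("anime.parquet")  # placeholder path
    n = st.slider("How many titles?", 5, 50, 10)
    st.dataframe(top_rated(df, n))

# In the real app, main() would be called at module scope so that
# `streamlit run` renders the page on each interaction.
```

Each widget interaction reruns the script top to bottom, which is what lets Streamlit serve as both back end and front end.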
If you liked what you've read so far, please check the Streamlit page to learn more about this amazing solution.
To run our applications we'll need a Kubernetes cluster. Amazon Elastic Kubernetes Service (Amazon EKS) accelerates the way we build and manage a Kubernetes cluster, and eksctl, the official CLI for Amazon EKS, helps us create the cluster and its resources using simple commands and config files.
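As an illustration, an eksctl config file for a small cluster could look roughly like this. The cluster name, region, and node-group sizing below are placeholders, not this project's actual settings.

```yaml
# Hypothetical eksctl ClusterConfig sketch; values are placeholders.
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: data-apps        # placeholder cluster name
  region: us-east-1      # placeholder region
nodeGroups:
  - name: ng-1
    instanceType: m5.large
    desiredCapacity: 2
```

With a file like this, `eksctl create cluster -f cluster.yaml` provisions the cluster and its supporting resources in one command.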
The pipeline is composed of four AWS services:
Service Name | Purpose |
---|---|
AWS CodeCommit | hosts our source code repository |
AWS CodeBuild | builds the Docker images and runs the deployment |
AWS CodePipeline | orchestrates the two services above |
Amazon ECR | stores our Docker images |
In short, the workflow consists of the following steps:
- Code is pushed to the AWS CodeCommit repository
- AWS CodeBuild pulls the updates from the branch and builds a new Docker image, following the buildspec.yml file
- The image is stored in the Amazon ECR repository
- AWS CodeBuild takes the new image and deploys it to the EKS cluster, following the buildspec_deployment.yml and deployment.yaml files
- The Pods are created in the cluster, along with the Service (ELB) and the Horizontal Pod Autoscaler
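The build stage of the workflow above can be sketched as a buildspec.yml like the one below. This is an illustrative sketch, not the project's actual file; the `$IMAGE_REPO` variable is an assumption, while `$AWS_DEFAULT_REGION` and `$CODEBUILD_RESOLVED_SOURCE_VERSION` are standard CodeBuild environment variables.

```yaml
# Hypothetical buildspec.yml sketch for the image-build stage.
version: 0.2
phases:
  pre_build:
    commands:
      # Authenticate Docker against the ECR registry
      - aws ecr get-login-password --region $AWS_DEFAULT_REGION |
        docker login --username AWS --password-stdin $IMAGE_REPO
  build:
    commands:
      # Tag the image with the commit hash that triggered the build
      - docker build -t $IMAGE_REPO:$CODEBUILD_RESOLVED_SOURCE_VERSION .
  post_build:
    commands:
      - docker push $IMAGE_REPO:$CODEBUILD_RESOLVED_SOURCE_VERSION
```

Tagging images with the commit hash keeps every deployed version traceable back to the change that produced it.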
Please follow the link to understand how to proceed.
I'd like to say a special thanks to Ian and Carlos for the reviews made on the project :)