Prepare AWS, GCP, or Azure cloud environments to run VectorMPP in Kubernetes.
Before installing VectorMPP on a Kubernetes cluster, two key preparations must be completed:
- Cloud Credentials Setup
- Kubernetes Cluster Provisioning
We provide a single public GitHub repository containing Terraform configurations organized by platform:
vectormpp-cloud-prep/
├── README.md # (this file)
├── airline-sample-data/ # sample data which can be preloaded
├── aws/
│ ├── credentials/ # IAM roles & policies
│ ├── cluster/ # EKS setup (reference)
│ ├── post.sh # post actions
│ └── README.md # AWS-specific instructions
├── gcp/
│ ├── credentials/ # service accounts & roles
│ ├── cluster/ # GKE setup (reference)
│ ├── yaml/storageclass.yaml # Customized GCP Filestore storage class
│ └── README.md # GCP-specific instructions
└── azure/
  ├── credentials/ # Azure AD roles & identities
  ├── cluster/ # AKS setup (reference)
  ├── yaml/storageclass.yaml # Customized Azure Files storage class
  └── README.md # Azure-specific instructions
⚠️ Note: Not everything in these folders is mandatory.
Some resources are critical and required, while others are examples or optional, depending on your cloud setup.
The critical components for each cloud provider are clearly marked in the platform-specific doc or annotated in the Terraform code comments.
Remote Terraform backends are not preconfigured; setting one up is recommended for production use.
Prerequisites:
- A cloud account (AWS, GCP, or Azure)
- CLI tools:
  - Cloud CLI (aws, gcloud, or az)
  - terraform
  - kubectl
  - helm
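Before going further, it can help to confirm that the tools listed above are actually on your PATH. The following is a minimal sketch, not part of the repository: the function names `missing_tools` and `cloud_cli_for` are hypothetical helpers introduced here for illustration.

```shell
# Print the name of every given tool that is NOT found on PATH.
# (Hypothetical helper; not shipped with vectormpp-cloud-prep.)
missing_tools() {
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      printf '%s\n' "$tool"
    fi
  done
}

# Map a platform directory name (aws, gcp, azure) to its cloud CLI binary.
cloud_cli_for() {
  case "$1" in
    aws)   echo aws ;;
    gcp)   echo gcloud ;;
    azure) echo az ;;
    *)     echo "unknown platform: $1" >&2; return 1 ;;
  esac
}
```

For example, `missing_tools "$(cloud_cli_for gcp)" terraform kubectl helm` prints nothing when everything needed for GCP is installed, and otherwise lists the missing tools one per line.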
- Clone this repository:
    git clone https://github.com/ActianCorp/vectormpp-cloud-prep.git
    cd vectormpp-cloud-prep
- Choose your platform and open its README: aws/README.md, gcp/README.md, or azure/README.md.
- Follow the steps there to:
  - Set up required credentials
  - Provision a Kubernetes cluster (or adapt to your own)
- Proceed to the main VectorMPP installation guide once your environment is ready.
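As a rough sketch of what the credential and cluster steps typically look like with Terraform, the helper below prints (but does not run) the usual init, plan, and apply sequence for one platform. It assumes that `credentials/` and `cluster/` are independent Terraform root modules applied in that order; this is an assumption, and the platform-specific README remains the authoritative source. The function name `print_terraform_steps` is hypothetical.

```shell
# Print the Terraform commands for one platform directory without executing them.
# Assumption: credentials/ and cluster/ are separate root modules,
# and credentials/ is applied first.
print_terraform_steps() {
  platform="$1"
  case "$platform" in
    aws|gcp|azure) ;;
    *) echo "platform must be aws, gcp, or azure" >&2; return 1 ;;
  esac
  for dir in credentials cluster; do
    echo "terraform -chdir=$platform/$dir init"
    echo "terraform -chdir=$platform/$dir plan"
    echo "terraform -chdir=$platform/$dir apply"
  done
}
```

Running `print_terraform_steps aws` from the repository root lists six commands to review before executing any of them. If you bring your own cluster, the `cluster` lines can simply be skipped.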
VectorMPP can optionally preload a sample airline dataset for demonstration and testing purposes.
This data is stored in airline-sample-data/ and is based on the U.S. DOT Bureau of Transportation Statistics On-Time Performance dataset.
If you enable the sample data preload feature in VectorMPP, you must first download the data and upload it to your cloud object storage (S3, GCS, or Azure Blob).
- Generate the sample data locally:
    cd airline-sample-data
    chmod +x download.sh
    ./download.sh
  This downloads 2018 monthly flight performance data from BTS into airline-sample-data/data/.
- Verify the data (optional but recommended):
    python3 verify.py
  If everything matches the lookup tables, you should see all green output.
- Upload the prepared data to your cloud storage:
  - AWS → S3 bucket
      aws s3 cp data/ s3://<your-sample-data-bucket>/ --recursive
  - GCP → GCS bucket
      gsutil -m cp -r data/* gs://<your-sample-data-bucket>/
  - Azure → Blob container
      az storage blob upload-batch -d <container-name> -s data/
- Configure VectorMPP to point to the location of your uploaded sample data when enabling preload.
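The upload step above can be wrapped in a small guard so that an empty or missing data directory is caught before anything is sent to cloud storage. This is a minimal sketch: `has_data_files` and `upload_command` are hypothetical helper names, and the bucket or container name is a placeholder you supply. The printed commands mirror the ones listed above and are not executed by the helper.

```shell
# Succeed only if the directory exists and contains at least one regular file.
has_data_files() {
  [ -d "$1" ] && [ -n "$(find "$1" -type f | head -n 1)" ]
}

# Print (do not run) the upload command for a provider.
# $1 = aws | gcp | azure; $2 = bucket or container name (placeholder).
upload_command() {
  case "$1" in
    aws)   echo "aws s3 cp data/ s3://$2/ --recursive" ;;
    gcp)   echo "gsutil -m cp -r data/* gs://$2/" ;;
    azure) echo "az storage blob upload-batch -d $2 -s data/" ;;
    *)     echo "unknown provider: $1" >&2; return 1 ;;
  esac
}
```

From inside airline-sample-data/, something like `has_data_files data && upload_command gcp <your-sample-data-bucket>` prints the command to run only when the download step has produced files.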