Loading Parquet Data into StarRocks from Local Disk or S3 Bucket #22777
Replies: 1 comment
A small mistake: at the first line, you should change "routine load" to "broker load".
This tutorial describes how to load Parquet data from a remote object store or from local disk (#22782) using the "broker load" feature.
Prerequisites
For this tutorial you need:
A StarRocks or CelerData database cluster
Setting up the cluster is out of scope for this tutorial.
Downloading the Sample Data File
To download the sample Parquet data file, click cities.parquet.
The Parquet data file includes sample continent data. The following is a representative example:
Access an Object Store and Upload the Parquet Data File
Setting up the object store is out of scope for this tutorial. Note the URI of the file and any credentials needed to access it; you will need them in the load steps.
Create a Database, a Table and Query the Data
The following commands create objects specifically for use with this tutorial. When you have completed the tutorial, you can drop these objects.
Step 0: Log In to the Database
To log in to the database, you'll need the server name, port, username and password.
Step 1: Create Database
Run the create database command.
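A minimal sketch of this step. The database name `mydatabase` is a placeholder chosen for illustration, not necessarily the name used in the original post:

```sql
-- Create the tutorial database (placeholder name) and make it the current database.
CREATE DATABASE IF NOT EXISTS mydatabase;
USE mydatabase;
```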
Step 2: Create Table
Issue: StarRocks does not support Snowflake's "variant" type at this time, so the table uses the "json" type instead. GitHub Issue #22781
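The table definition might look like the sketch below. The schema is an assumption inferred from the column names mentioned in this tutorial (`continent`, `country`, `city`); the table name `cities`, the column types, the key and distribution choices, and the single replica are all illustrative and should be adapted to your cluster:

```sql
-- Hypothetical schema: continent and country as strings, city as JSON
-- (JSON stands in for Snowflake's VARIANT type).
-- replication_num = 1 assumes a single-node test cluster.
CREATE TABLE IF NOT EXISTS mydatabase.cities (
    continent VARCHAR(65533) NULL,
    country   VARCHAR(65533) NULL,
    city      JSON NULL
)
DUPLICATE KEY (continent)
DISTRIBUTED BY HASH (continent) BUCKETS 1
PROPERTIES (
    "replication_num" = "1"
);
```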
Step 3A: Load Table with "broker load"
Issue: There is an outstanding GitHub issue affecting bucket names that contain ".". GitHub Issue #22711
Issue: StarRocks does not support Snowflake's "variant" type at this time, so the load does not import into the "city" column. GitHub Issue #22781
AWS S3 and Minio
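A hedged sketch of a broker load from S3. The label, bucket, path, credentials and region are placeholders, and the column list `(continent, country)` assumes the Parquet file exposes columns with those names that fit the target column types; adjust it to the actual file schema. The commented MinIO properties are the usual substitutes for a self-hosted endpoint:

```sql
-- All names and credentials below are placeholders.
LOAD LABEL mydatabase.cities_s3_load
(
    DATA INFILE("s3a://my-bucket/path/cities.parquet")
    INTO TABLE cities
    FORMAT AS "parquet"
    (continent, country)
)
WITH BROKER
(
    "aws.s3.use_instance_profile" = "false",
    "aws.s3.access_key" = "<access_key>",
    "aws.s3.secret_key" = "<secret_key>",
    "aws.s3.region" = "<region>"
    -- For MinIO, replace the region line with settings along these lines:
    -- "aws.s3.endpoint" = "http://<minio_host>:9000",
    -- "aws.s3.enable_ssl" = "false",
    -- "aws.s3.enable_path_style_access" = "true"
)
PROPERTIES
(
    "timeout" = "3600"
);
```

The bucket name here deliberately avoids "." because of the issue noted above.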
Google Cloud Storage
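For Google Cloud Storage the statement has the same shape as any other broker load, with a `gs://` path and service-account credentials. This is a sketch only: the property names below are written from memory of the StarRocks GCS authentication options and should be checked against the documentation for your version, and every value, the label, and the column list are placeholders:

```sql
-- Placeholder bucket, label and credentials; verify property names for your version.
LOAD LABEL mydatabase.cities_gcs_load
(
    DATA INFILE("gs://my-bucket/path/cities.parquet")
    INTO TABLE cities
    FORMAT AS "parquet"
    (continent, country)
)
WITH BROKER
(
    "gcp.gcs.service_account_email" = "<service_account_email>",
    "gcp.gcs.service_account_private_key_id" = "<private_key_id>",
    "gcp.gcs.service_account_private_key" = "<private_key>"
);
```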
Azure Object Store
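For Azure Blob Storage, a sketch assuming shared-key authentication. The `wasbs://` path format and the property names are written from memory of the StarRocks Azure authentication options and should be verified for your version; the container, storage account, key, label and column list are placeholders:

```sql
-- Placeholder container, account, label and key; verify property names for your version.
LOAD LABEL mydatabase.cities_azure_load
(
    DATA INFILE("wasbs://<container>@<storage_account>.blob.core.windows.net/path/cities.parquet")
    INTO TABLE cities
    FORMAT AS "parquet"
    (continent, country)
)
WITH BROKER
(
    "azure.blob.storage_account" = "<storage_account>",
    "azure.blob.shared_key" = "<shared_key>"
);
```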
Check the status of the "broker load" by executing the "show" command.
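For example, assuming the hypothetical database `mydatabase` and a load submitted under the placeholder label `cities_s3_load`:

```sql
-- Inspect the load job; the State column moves through values such as
-- PENDING and LOADING before ending in FINISHED or CANCELLED.
SHOW LOAD FROM mydatabase WHERE LABEL = "cities_s3_load";
```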
If you make a mistake, just delete the "load".
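One way to do this, assuming "delete" means cancelling a load job that has not yet finished (the database name and label are placeholders):

```sql
-- Cancel an unfinished broker load job identified by its label.
CANCEL LOAD FROM mydatabase WHERE LABEL = "cities_s3_load";
```

A job that has already finished cannot be cancelled this way; in that case the loaded rows would have to be removed from the table itself.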
Step 3B: Load Table using files()
As of the 3.1 release of StarRocks, you can use the files() table function instead of a broker load. See https://docs.starrocks.io/en-us/latest/sql-reference/sql-functions/table-functions/files for details.
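A minimal sketch of a load through files() from S3. The path, credentials and region are placeholders, and the selected columns assume the Parquet file exposes `continent` and `country` in a form compatible with the hypothetical `mydatabase.cities` table:

```sql
-- Read the Parquet file directly and insert the selected columns.
INSERT INTO mydatabase.cities (continent, country)
SELECT continent, country
FROM FILES(
    "path" = "s3://my-bucket/path/cities.parquet",
    "format" = "parquet",
    "aws.s3.access_key" = "<access_key>",
    "aws.s3.secret_key" = "<secret_key>",
    "aws.s3.region" = "<region>"
);
```

Running the `SELECT ... FROM FILES(...)` part on its own first is a convenient way to check the file's columns before inserting.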
Step 3C: Load Table from Local Disk
Here's an example of the SQL statements needed. You'll need an environment with the broker in it. In situations with docker, you may need to copy the file into the container using docker cp.
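A hedged sketch of a local-disk load. It assumes the file has been placed at `/tmp/cities.parquet` on the host (or container) where the broker runs, and that the broker name is taken from the output of `SHOW BROKER`; the label, path and column list are placeholders:

```sql
-- List the brokers registered in the cluster to find the broker name.
SHOW BROKER;

-- Load from the broker's local file system using a file:// path.
LOAD LABEL mydatabase.cities_local_load
(
    DATA INFILE("file:///tmp/cities.parquet")
    INTO TABLE cities
    FORMAT AS "parquet"
    (continent, country)
)
WITH BROKER "<broker_name>"
PROPERTIES
(
    "timeout" = "3600"
);
```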
Step 4: Run Query
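A simple check query, assuming the hypothetical `mydatabase.cities` table with `continent` and `country` columns:

```sql
-- Look at a few loaded rows.
SELECT continent, country
FROM mydatabase.cities
ORDER BY continent, country
LIMIT 10;
```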
The results should look like the following:
Step 5: Clean Up
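To remove the tutorial objects, statements along these lines can be used (the table and database names are the placeholders assumed in this sketch):

```sql
-- Drop the table first, then the database that contains it.
DROP TABLE IF EXISTS mydatabase.cities;
DROP DATABASE IF EXISTS mydatabase;
```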