How to query data and write data in Apache Iceberg using StarRocks #23427
-
Could you provide the full docker yaml file?
-
Hi, does SR support the REST catalog? There is no documentation on the official site. Many thanks.
-
I followed the recipe above, combining the tabulario set of Docker containers with an additional starrocks/allin1-ubuntu container. I used the notebook from the spark-iceberg container. Subsequently I tried StarRocks:
...but when I run a SELECT on any of the tables that were created in, and work in, the notebook, I get:
StarRocks > select * from taxis limit 100;
ERROR 1064 (HY000): BE access S3 file failed, SdkResponseCode=301, SdkErrorType=100, SdkErrorMessage=Unable to parse ExceptionName: PermanentRedirect Message: The bucket you are attempting to access must be addressed using the specified endpoint. Please send all future requests to this endpoint.: file = s3://warehouse/nyc/taxis/data/tpep_pickup_datetime_day=2022-02-25/00000-106-d2cd1139-8fee-4124-bbd2-b92f1fa23b6a-00005.parquet
I will greatly appreciate any suggestions.
-
logs from a working docker-compose up with SR 3.1.5 and Iceberg 1.4.2
-
Hi @alberttwong, thanks for this tutorial. Since StarRocks supports a shared-data architecture with FE and CN nodes, are there any tutorials on deploying a shared-data StarRocks cluster and querying Iceberg/Delta/Hudi tables?
-
@alberttwong Noob question: why do you need Spark here? Can't you do everything from StarRocks?
-
I saw a ClickHouse and StarRocks benchmark comparison.
-
This tutorial describes how you can query data in Apache Iceberg from StarRocks, using the Apache Iceberg Quickstart environment.
Note
The last time we ran this tutorial, we used StarRocks 3.1.5 and Apache Iceberg 1.4.2.
Prerequisites
For this tutorial you need:
A MySQL client
Installing one is out of scope for the tutorial.
A StarRocks or CelerData database cluster
Setting one up is out of scope for the tutorial. You must use StarRocks 3.0.3 or higher.
An Apache Iceberg environment
Setting one up is out of scope for the tutorial. However, we are using the Apache Iceberg Docker container for this tutorial.
Configure Apache Iceberg
Follow the Apache Iceberg Docker Quickstart at https://iceberg.apache.org/spark-quickstart/ .
Modify the docker-compose.yml to include the lines:
You will also have to add a network alias to the rest image.
Here's the output of cat on my modified Docker Compose YAML file.
After this, you can then run
docker-compose up
and follow the rest of the Apache Iceberg quickstart. Specifically, create the table, insert data, and query the table to confirm the data exists.
Create an external catalog and query the data
Now we're going to create an external catalog in StarRocks.
Log in to the StarRocks container that runs within the Iceberg Docker Compose setup.
Note
Usually, you don't need to put in the "client.factory" setting. It is there to support the special Apache Iceberg Quickstart environment.
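As a rough illustration of the step above, here is a minimal sketch of what the catalog definition and a first query might look like from the MySQL client. Everything specific in it is an assumption rather than the original author's exact statement: the catalog name `iceberg` is hypothetical, and the REST URI `http://rest:8181`, the MinIO endpoint `http://minio:9000`, and the `admin` / `password` credentials are taken from the Apache Iceberg Quickstart defaults and may differ in your compose file. The `nyc.taxis` table is the one created by the quickstart notebook.

```sql
-- Sketch only: catalog name, hostnames, and credentials are assumptions
-- based on the Apache Iceberg Quickstart defaults. Adjust to your setup.
CREATE EXTERNAL CATALOG iceberg
PROPERTIES
(
    "type" = "iceberg",
    "iceberg.catalog.type" = "rest",
    "iceberg.catalog.uri" = "http://rest:8181",
    "aws.s3.access_key" = "admin",
    "aws.s3.secret_key" = "password",
    "aws.s3.endpoint" = "http://minio:9000",
    "aws.s3.enable_path_style_access" = "true",
    "client.factory" = "com.starrocks.connector.iceberg.IcebergAwsClientFactory"
);

-- Confirm the catalog is registered, then switch to it.
SHOW CATALOGS;
SET CATALOG iceberg;

-- Browse what the quickstart notebook created and read a few rows.
SHOW DATABASES;
USE nyc;
SHOW TABLES;
SELECT * FROM taxis LIMIT 10;
```

Pointing `aws.s3.endpoint` at MinIO and enabling path-style access matters here because the quickstart stores data in a local MinIO bucket rather than in AWS S3. If the backend tries to reach the bucket through the default AWS endpoint, S3 redirect errors are one likely symptom.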
Output should look like this:
Write data into the Iceberg tables backed by MinIO
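A minimal sketch of a write from StarRocks, assuming an external Iceberg catalog named `iceberg` (a hypothetical name) already exists and that your StarRocks version supports creating and writing Iceberg tables (to my knowledge this requires 3.1 or later). The database `sr_demo` and table `trips_copy` are made-up names for illustration. The table is deliberately unpartitioned to keep the example simple.

```sql
-- Sketch only: `iceberg`, `sr_demo`, and `trips_copy` are hypothetical names.
SET CATALOG iceberg;

-- Create a scratch database and an unpartitioned Iceberg table.
CREATE DATABASE IF NOT EXISTS sr_demo;
USE sr_demo;

CREATE TABLE IF NOT EXISTS trips_copy (
    trip_id BIGINT,
    fare    DOUBLE,
    note    STRING
);

-- Write two rows; the data files should land in the MinIO-backed warehouse.
INSERT INTO trips_copy VALUES
    (1, 12.5, 'written from StarRocks'),
    (2, 7.0,  'second row');

-- Read the rows back to confirm the write.
SELECT * FROM trips_copy ORDER BY trip_id;
```

Because the table lives in the shared Iceberg catalog, the same rows should also be readable from the Spark notebook in the quickstart, which is a convenient way to check that the write went through the catalog rather than into StarRocks-native storage.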
You'll also see this in the logs.