Chat with your database or your datalake (SQL, CSV, parquet). PandasAI makes data analysis conversational using LLMs and RAG.
Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
The world's fastest open query engine for sub-second analytics both on and off the data lakehouse. With the flexibility to support nearly any scenario, StarRocks provides best-in-class performance for multi-dimensional analytics, real-time analytics, and ad-hoc queries. A Linux Foundation project.
Database for AI. Store Vectors, Images, Texts, Videos, etc. Use with LLMs/LangChain. Store, query, version, & visualize any AI data. Stream data in real-time to PyTorch/TensorFlow. https://activeloop.ai
Postgres for Search and Analytics
Upserts, Deletes And Incremental Processing on Big Data.
lakeFS - Data version control for your data lake | Git for data
Dinky is a real-time data development platform based on Apache Flink, enabling agile data development, deployment and operation.
LakeSoul is an end-to-end, realtime and cloud native Lakehouse framework with fast data ingestion, concurrent update and incremental data analytics on cloud storages for both BI and AI applications.
The LeoFS Storage System
World's most powerful open data catalog for building a high-performance, geo-distributed and federated metadata lake.
Scalable identity resolution, entity resolution, data mastering and deduplication using ML
A collection of resources related to Apache Hudi
A free-to-use dbt package for creating and loading Data Vault 2.0 compliant data warehouses (powered by dbt, an open source data engineering tool and registered trademark of dbt Labs)
DuckDB-powered data lake analytics from Postgres
Open Control Plane for Tables in Data Lakehouse
Use SQL to build ELT pipelines on a data lakehouse.
A curated list of open source tools used in analytics platforms and data engineering ecosystem