This is a template for creating a fully functional dbt project for teaching, learning, writing, demoing, or any other scenario where you need a basic project with a synthesized jaffle shop business.
- Data ingestion - Airbyte
- Data transformation - dbt
- Data orchestration - Dagster
- CI/CD - TBC
This will create a new repository exactly like this one, and navigate you there. Make sure to execute the next instructions in that repo.
This will create a new codespace, a sandboxed devcontainer with everything you need for a dbt project. Once the codespace is finished setting up, you'll be ready to run a dbt build.
After the container is built and connected to, VSCode will run a few clean up commands and then a postCreateCommand, a set of commands run after the container is set up. This is where we install our dependencies, such as dbt, the duckdb adapter, and other necessities, as well as run dbt deps to install the dbt packages we want to use. That screen will look something like the above. When it's completed, it will close and leave you in a fresh terminal prompt. From there you're ready to do some analytics engineering!
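The post-create step described above can be sketched as a small shell script. This is an assumption about its shape, not this repository's actual postCreateCommand: the function names `run` and `post_create`, the `DRY_RUN` switch, and the exact package list are illustrative (`dbt-core` and `dbt-duckdb` are the PyPI names for dbt and its DuckDB adapter).

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a postCreateCommand script; the real one may differ.

# Print the command instead of running it when DRY_RUN is set.
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    echo "$*"
  else
    "$@"
  fi
}

post_create() {
  # Install dbt and the DuckDB adapter (package list is an assumption).
  run python -m pip install dbt-core dbt-duckdb || return 1
  # Install the dbt packages declared in packages.yml.
  run dbt deps
}
```

Calling `DRY_RUN=1 post_create` prints the two commands without executing them, which is a cheap way to check the sequence before rebuilding the container.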
- requires Docker and an understanding of dev containers
- get Docker up and running and use VS Code's Dev Containers extension to reopen the workspace in a dev container.
task dagit
See notes on how to run locally in orchestration/readme.md
- running dbt sql in vscode (ctrl + enter)
- generate model from source (click source in .yml and click model)
- compile sql (ctr +)
- execute tests
- go to definition (click a ref() or macro/util)
- auto complete
See notes on how to run locally in orchestration/readme.md
duckdb ./reports/jaffle_shop.duckdb
.tables
select * exclude (subtotal) from analytics.orders;
from information_schema.tables;
.exit
Limitation: there is a package conflict, because dbt-metricflow[duckdb] doesn't support dbt-duckdb 1.7 yet.
dbt run -s orders && data-diff --dbt
cd reports
npm run dev
Then add a database called jaffle_shop.duckdb via the UI.
- sqlfluff (TODO: sqlfluff lint is not working)
See ./orchestration/readme.md for more details.
see task temp_limitation for workaround
duckdb ./reports/jaffle_shop.duckdb
Error: unable to open database "./reports/jaffle_shop.duckdb": IO Error: Could not set lock on file "./reports/jaffle_shop.duckdb": Resource temporarily unavailable
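This error usually means another process (for example another duckdb shell or a running dbt job) already has the file open for writing; DuckDB allows only one read-write process per database file. Closing that process is the real fix. As a stopgap, a command can be retried until the lock is released. A minimal sketch; `retry_locked` and its arguments are hypothetical and not part of this project.

```shell
# Hypothetical helper: retry a command a fixed number of times, pausing between tries.
# Usage: retry_locked <attempts> <delay_seconds> <command...>
retry_locked() {
  local attempts="$1" delay="$2"
  shift 2
  local i=1
  while [ "$i" -le "$attempts" ]; do
    "$@" && return 0            # command succeeded, stop retrying
    [ "$i" -lt "$attempts" ] && sleep "$delay"
    i=$((i + 1))
  done
  return 1                      # every attempt failed
}

# Example (not executed here):
#   retry_locked 5 2 duckdb ./reports/jaffle_shop.duckdb -c "select 1;"
```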
Airbyte currently pins its alpha DuckDB connection to v0.6.0. More recent versions of DuckDB are available, and the pinned version is not aligned with dbt-duckdb. For now, Airbyte writes to duckdb_airbyte.duckdb (older version of DuckDB), and Dagster then passes the data to duckdb.duckdb (newer version of DuckDB).
https://duckdb.org/internals/storage.html
Waiting for v1 of DuckDB to become stable.
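One general way to move data between two DuckDB versions without depending on the storage format is to export with the old version and import with the new one. This is a sketch of that technique, not how the Dagster job in this repo does it. The binary names `duckdb_old` and `duckdb_new`, the environment variables, and the helper `copy_between_versions` are hypothetical; `EXPORT DATABASE` and `IMPORT DATABASE` are DuckDB SQL statements. It assumes the target database is empty.

```shell
# Hypothetical helper: copy a database written by an older DuckDB into a newer one.
# OLD_DUCKDB / NEW_DUCKDB name the two CLI binaries (assumed installed side by side).
copy_between_versions() {
  local src="$1" dst="$2" dump_dir="$3"
  # Dump schema and data to Parquet files with the old binary.
  "${OLD_DUCKDB:-duckdb_old}" "$src" -c "EXPORT DATABASE '$dump_dir' (FORMAT PARQUET);" || return 1
  # Load the dump with the new binary.
  "${NEW_DUCKDB:-duckdb_new}" "$dst" -c "IMPORT DATABASE '$dump_dir';"
}

# Example (not executed here):
#   copy_between_versions duckdb_airbyte.duckdb duckdb.duckdb /tmp/airbyte_dump
```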