feat: Add DuckDB Parquet tutorial notebook #116
Create interactive marimo notebook demonstrating Parquet file analysis with DuckDB. Features remote file querying, table creation, and Airbnb stock price visualization using Plotly.
- Direct FROM clause queries on remote Parquet files
- read_parquet() function for optimized column selection
- SQL-based time series analysis with reactive cells
I've implemented the DuckDB Parquet tutorial notebook as discussed in #48. The notebook demonstrates:
The implementation uses the Airbnb stock dataset from Hugging Face as suggested in the issue. All dependencies are properly declared using marimo's sandbox format, and the content focuses on practical, beginner-friendly examples. Would appreciate your review when you have a chance. Happy to make any adjustments based on your feedback. Thanks!
@thliang01 Thanks a lot for the PR!
Will get to reviewing it shortly.
Haleshot
left a comment
Solid notebook & first notebook contrib covering DuckDB's Parquet capabilities 🎉; the progression from direct querying to read_parquet to persistent tables is great (the flow makes sense). The relevant Airbnb stock data analysis also helps with understanding (practical example).
Left some comments as part of the PR review; some minor nits/corrections.
- Add author attribution to notebook header
- Add sqlglot dependency for future SQL parsing capabilities
- Use consistent table references via variables instead of string literals
- Remove unused pyarrow import
- Improve markdown formatting for better readability

The notebook now properly references the created airbnb_stock table through variables, making the code more maintainable and reducing the risk of typos in table names.
@Haleshot - I've addressed the review comments and pushed a new commit. Main changes include adding author attribution and refactoring the SQL queries to use proper table variables. Ready for re-review when you have time!
Haleshot
left a comment
Great; thanks for addressing the comments so quickly and for the notebook contribution. LGTM 🚀