Projects done in the Data Engineer Nanodegree Program by Udacity.com
Updated Dec 8, 2022 - Jupyter Notebook
Documentation for getting up and running with indexed.xyz data
A repository for the files and notebooks produced throughout my Udacity Data Engineering Nanodegree program.
A Semantic Data Reservoir for Heterogeneous Datasets
Codebase and data for our paper - Pylon: Semantic Table Union Search in Data Lakes.
A Search Join is a join operation which extends a user-provided table with additional attributes based on a large corpus of heterogeneous data originating from the Web or corporate intranets.
Repository for discussion of the DTF software architecture
Follow along with materials in the book "Modern Data Architectures with Python: A practical guide to building and deploying data pipelines, data warehouses and data lakes" (Lipp, 2023)
Data Engineering Nanodegree Program