As part of my training, I was assigned the role of a Data Engineer working on a data pipeline/ETL project. My main task was to extract data from a website, process it, and store it in a PostgreSQL database.
For this project, I built a web scraping tool to gather product data from Lazada, specifically focusing on running shoes, which are currently trending due to the growing interest in running and fitness.
This project helped me understand the real-world workflow of a Data Engineer, from data extraction and cleaning to storage and analysis.
- Scrape product data related to running shoes from Lazada.
- Clean and process the collected data.
- Store the structured data in a PostgreSQL database using pgAdmin4.
- Perform basic analysis to understand product distribution and popularity.
- Python: Main programming language
- Pandas: Data manipulation and analysis
- BeautifulSoup: HTML parsing for scraping static content
- Selenium: Automating browser actions and scraping dynamic content
- PostgreSQL: Database for storing the cleaned data
- pgAdmin4: GUI for PostgreSQL database management
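To illustrate how Selenium and BeautifulSoup work together here: Selenium renders the dynamic Lazada page and exposes its HTML via `driver.page_source`, and BeautifulSoup then parses that HTML into structured rows. The sketch below uses a saved HTML snippet in place of a live page, and the CSS class names are hypothetical placeholders, not Lazada's real markup.

```python
from bs4 import BeautifulSoup

# Stand-in for driver.page_source after Selenium has rendered the page.
# The class names (product-card, product-name, ...) are assumptions for
# illustration only.
page_source = """
<div class="product-card">
  <span class="product-name">Nike Pegasus 40</span>
  <span class="price">₱5,495</span>
  <span class="location">Metro Manila</span>
</div>
<div class="product-card">
  <span class="product-name">Adidas Adizero SL</span>
  <span class="price">₱4,200</span>
  <span class="location">Cebu</span>
</div>
"""

soup = BeautifulSoup(page_source, "html.parser")
products = []
for card in soup.select("div.product-card"):
    products.append({
        "Product_Name": card.select_one(".product-name").get_text(strip=True),
        "Price": card.select_one(".price").get_text(strip=True),
        "Seller Location": card.select_one(".location").get_text(strip=True),
    })

print(len(products))  # 2
```

The same loop applies unchanged to a real `driver.page_source`; only the selectors need to match the live site.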
The scraped data covers 10 pages of search results, resulting in 400 rows and 6 columns:
- Product_Name
- Price
- Seller Location
- Sold
- Rating
- Review
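Columns such as Price and Sold arrive as display strings and need cleaning before they can be stored or analyzed. A minimal Pandas sketch of that step is below; the raw string formats ("₱5,495", "1.2K sold") are assumptions about how Lazada displays these values, not guaranteed formats.

```python
import pandas as pd

# Toy rows mimicking the scraped schema; string formats are assumed.
raw = pd.DataFrame({
    "Product_Name": ["Nike Pegasus 40", "Adidas Adizero SL"],
    "Price": ["₱5,495", "₱4,200"],
    "Sold": ["1.2K sold", "350 sold"],
})

def parse_sold(text: str) -> int:
    """Turn '1.2K sold' / '350 sold' into an integer count."""
    value = text.replace("sold", "").strip()
    if value.endswith("K"):
        return int(float(value[:-1]) * 1000)
    return int(value)

# Strip the currency symbol and thousands separator, then cast to float.
raw["Price"] = (raw["Price"]
                .str.replace("₱", "", regex=False)
                .str.replace(",", "", regex=False)
                .astype(float))
raw["Sold"] = raw["Sold"].map(parse_sold)

print(raw["Price"].iloc[0])  # 5495.0
```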
By the end of this project, I was able to simulate a real-world ETL (Extract, Transform, Load) process and gain hands-on experience in:
- Building web scrapers with Selenium & BeautifulSoup
- Structuring and cleaning data with Pandas
- Using PostgreSQL for data storage
- Understanding the workflow of a data engineering project
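The load step of the ETL flow above can be sketched with `DataFrame.to_sql`. The project targets PostgreSQL managed through pgAdmin4; SQLite is used here only so the example runs standalone. For PostgreSQL you would instead pass a SQLAlchemy engine, e.g. `create_engine("postgresql://user:pass@localhost:5432/dbname")` with your own credentials.

```python
import sqlite3
import pandas as pd

# Cleaned sample row; column names follow the dataset schema above.
df = pd.DataFrame({
    "Product_Name": ["Nike Pegasus 40"],
    "Price": [5495.0],
    "Rating": [4.8],
})

# In-memory SQLite stands in for the PostgreSQL connection.
conn = sqlite3.connect(":memory:")
df.to_sql("running_shoes", conn, if_exists="replace", index=False)

rows = conn.execute("SELECT COUNT(*) FROM running_shoes").fetchone()[0]
print(rows)  # 1
```

`if_exists="replace"` recreates the table on each run, which suits repeated notebook experiments; use `"append"` when accumulating scraped batches.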
Check the notebooks folder for the Jupyter Notebook.
View the data folder for the raw and cleaned datasets.
This project is for educational purposes only. It complies with Lazada's terms of use and was not used for commercial purposes.