web scraping online store lazada.co.id, search running shoes

rifkyiqbal52/data-analytics-projects

Web Scraping & Analysis: Running Shoes on Lazada

As part of my training, I was assigned the role of a Data Engineer working on a data pipeline/ETL project. My main task was to extract data from a website, process it, and store it in a PostgreSQL database.

For this project, I built a web scraping tool to gather product data from Lazada, specifically focusing on running shoes, which are currently trending due to the growing interest in running and fitness.

This project helped me understand the real-world workflow of a Data Engineer, from data extraction and cleaning to storage and analysis.


🎯 Objectives

  • Scrape product data related to running shoes from Lazada.
  • Clean and process the collected data.
  • Store the structured data in a PostgreSQL database using pgAdmin4.
  • Perform basic analysis to understand product distribution and popularity.

πŸ› οΈ Tools

  • Python: Main programming language
  • Pandas: Data manipulation and analysis
  • BeautifulSoup: HTML parsing for scraping static content
  • Selenium: Automating browser actions and scraping dynamic content
  • PostgreSQL: Database for storing the cleaned data
  • pgAdmin4: GUI for PostgreSQL database management

📈 Collected Data

I scraped up to 10 pages of search results, which produced 400 rows and 6 columns:

  • Product_Name
  • Price
  • Seller Location
  • Sold
  • Rating
  • Review
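
Scraped fields typically arrive as display text rather than numbers, so the cleaning step has to convert them before loading. Below is a minimal sketch of that conversion in plain Python. The raw formats (`Rp1.250.000`, `1.2K sold`, `(120)`) and the helper names `parse_price`, `parse_count`, and `clean_row` are assumptions for illustration and are not taken from the notebook.

```python
import re


def parse_price(text):
    """Convert a price label such as 'Rp1.250.000' to an integer (rupiah)."""
    digits = re.sub(r"\D", "", text)  # drop the currency prefix and thousands dots
    return int(digits) if digits else None


def parse_count(text):
    """Convert a count label such as '1.2K sold' or '(120)' to an integer."""
    match = re.search(r"(\d+(?:\.\d+)?)\s*([Kk]\b)?", text)
    if match is None:
        return None
    value = float(match.group(1))
    if match.group(2):  # a 'K' suffix is assumed to mean thousands
        value *= 1000
    return round(value)


def clean_row(raw):
    """Map one scraped record (all strings) to typed values."""
    rating = raw["Rating"].strip()
    return {
        "Product_Name": raw["Product_Name"].strip(),
        "Price": parse_price(raw["Price"]),
        "Seller Location": raw["Seller Location"].strip(),
        "Sold": parse_count(raw["Sold"]),
        "Rating": float(rating) if rating else None,
        "Review": parse_count(raw["Review"]),
    }


# Hypothetical record, made up for this example.
sample = {
    "Product_Name": "  Sepatu Lari Pria ",
    "Price": "Rp1.250.000",
    "Seller Location": "Jakarta",
    "Sold": "1.2K sold",
    "Rating": "4.8",
    "Review": "(120)",
}
print(clean_row(sample))
```

With Pandas, helpers like these could be applied column by column, for example `df["Price"].map(parse_price)`. Note that `parse_count` reads a dot as a decimal point, so it would need adjusting if the site shows counts with a dot as a thousands separator.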

🚀 Outcome

By the end of this project, I was able to simulate a real-world ETL (Extract, Transform, Load) process and gain hands-on experience in:

  1. Building web scrapers with Selenium & BeautifulSoup
  2. Structuring and cleaning data with Pandas
  3. Using PostgreSQL for data storage
  4. Understanding the workflow of a data engineering project
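
The load step of such a pipeline can be sketched without a running database server. The example below deliberately uses Python's built-in `sqlite3` as a stand-in for PostgreSQL so that it runs anywhere. The table name `running_shoes`, the column types, and the `load_rows` helper are assumptions, not the project's actual schema. With PostgreSQL, a driver such as psycopg2 would replace `sqlite3`, and its query placeholders are `%s` instead of `?`.

```python
import sqlite3

# Assumed snake_case column names for the six scraped fields.
COLUMNS = ["product_name", "price", "seller_location", "sold", "rating", "review"]


def load_rows(conn, rows):
    """Create the table if needed, insert the rows, and return the row count."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS running_shoes (
            product_name    TEXT NOT NULL,
            price           INTEGER,
            seller_location TEXT,
            sold            INTEGER,
            rating          REAL,
            review          INTEGER
        )
        """
    )
    placeholders = ", ".join("?" for _ in COLUMNS)
    insert_sql = (
        f"INSERT INTO running_shoes ({', '.join(COLUMNS)}) VALUES ({placeholders})"
    )
    # Parameterized inserts keep scraped text from being interpreted as SQL.
    conn.executemany(insert_sql, [tuple(row[c] for c in COLUMNS) for row in rows])
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM running_shoes").fetchone()[0]


# Hypothetical cleaned records, made up for this example.
rows = [
    {"product_name": "Sepatu Lari Pria", "price": 1250000,
     "seller_location": "Jakarta", "sold": 1200, "rating": 4.8, "review": 120},
    {"product_name": "Sepatu Lari Wanita", "price": 499000,
     "seller_location": "Bandung", "sold": 350, "rating": 4.6, "review": 42},
]

conn = sqlite3.connect(":memory:")
print(load_rows(conn, rows))
```

Keeping the insert parameterized matters more for scraped data than for hand-entered data, since product names can contain quotes and other characters that would break a string-built query.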

πŸ“ Check the notebooks folder for the Jupyter Notebook.

πŸ“‚ View data folder for raw and cleaned datasets.

📌 Note

This project is for educational purposes only. It complies with Lazada's terms of use and was not used for commercial purposes.
