This Python project demonstrates how to scrape specific information, such as headlines or prices, from a webpage using the `requests` and `BeautifulSoup` libraries. The program fetches the page, parses its HTML, and extracts the desired data by targeting the HTML tags that contain it.
- Fetches HTML content from a webpage using the `requests` library.
- Parses the HTML content using `BeautifulSoup`.
- Extracts specific information (e.g., headlines, prices) by identifying and targeting HTML tags.
- Outputs the extracted data in a readable format.
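
As a rough illustration of the extraction and output features listed above, the sketch below pulls name and price pairs out of an inline HTML string, so it runs without network access. The sample markup, the `product` and `price` class names, and the helper names `extract_products` and `format_products` are all assumptions made for this example; a real page will need its own tags and classes.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for a downloaded page; the tag and
# class names are illustrative assumptions, not taken from a real site.
SAMPLE_HTML = """
<html><body>
  <div class="product"><h2>Blue Mug</h2><span class="price">$8.50</span></div>
  <div class="product"><h2>Red Mug</h2><span class="price">$9.00</span></div>
</body></html>
"""


def extract_products(html):
    """Return (name, price) pairs for each product block in the HTML."""
    soup = BeautifulSoup(html, "html.parser")
    products = []
    for block in soup.find_all("div", class_="product"):
        name = block.find("h2")
        price = block.find("span", class_="price")
        # Skip blocks missing either piece rather than failing on None.
        if name is None or price is None:
            continue
        products.append((name.get_text(strip=True), price.get_text(strip=True)))
    return products


def format_products(products):
    """Render the pairs as left-aligned, human-readable lines."""
    return "\n".join(f"{name:<10} {price}" for name, price in products)


if __name__ == "__main__":
    print(format_products(extract_products(SAMPLE_HTML)))
```

Keeping extraction and formatting in separate functions means the selectors can be adjusted for a different page without touching the output code.
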
- The `requests.get()` method downloads the HTML content from the specified URL.
- The program parses the HTML using `BeautifulSoup` to create a searchable structure.
- The `find()` or `find_all()` methods are used to extract the desired data (such as headlines within `<h2>` tags).
- The extracted data is displayed in the console.
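
The steps above can be sketched roughly as follows. This is a minimal outline and not necessarily how the project's own script is organized: the function names `fetch_html`, `extract_headlines`, and `first_headline`, the 10-second timeout, and the placeholder URL `https://example.com` are assumptions chosen for illustration.

```python
import requests
from bs4 import BeautifulSoup


def fetch_html(url, timeout=10):
    """Download a page with requests.get() and return its HTML text."""
    response = requests.get(url, timeout=timeout)
    # Raise on 4xx/5xx responses instead of parsing an error page.
    response.raise_for_status()
    return response.text


def extract_headlines(html, tag="h2"):
    """Return the text of every matching tag, using find_all()."""
    soup = BeautifulSoup(html, "html.parser")
    return [element.get_text(strip=True) for element in soup.find_all(tag)]


def first_headline(html, tag="h2"):
    """Return the text of the first matching tag, using find(), or None."""
    soup = BeautifulSoup(html, "html.parser")
    element = soup.find(tag)
    return element.get_text(strip=True) if element is not None else None


if __name__ == "__main__":
    URL = "https://example.com"  # placeholder; replace with the target page
    try:
        page = fetch_html(URL)
    except requests.RequestException as exc:
        print(f"Request failed: {exc}")
    else:
        # Display the extracted data in the console, one headline per line.
        for number, headline in enumerate(extract_headlines(page), start=1):
            print(f"{number}. {headline}")
```

`find()` returns only the first match (or `None` when nothing matches), while `find_all()` returns every match, which is why the single-headline helper checks for `None` before reading the text.
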
- Python 3.x
- Libraries:
  - `requests`
  - `beautifulsoup4`
You can install the required libraries using `pip`:

    pip install requests beautifulsoup4
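
One way to confirm the installation worked is to import both packages and print their versions. The helper name `installed_versions` is an assumption for this sketch; an `ImportError` when running it means one of the packages is missing from the active environment.

```python
import bs4
import requests


def installed_versions():
    """Return the installed version strings of the two required packages."""
    return {
        "requests": requests.__version__,
        "beautifulsoup4": bs4.__version__,
    }


if __name__ == "__main__":
    for package, version in installed_versions().items():
        print(f"{package}: {version}")
```

Note that the package is installed as `beautifulsoup4` but imported as `bs4`.
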