Python repo to scrape Form 3 and Form 4 off the SEC website.

hmcguinn/sec-scraper

SEC Scraper

Python repo to scrape data from the SEC's EDGAR database (https://www.sec.gov/edgar.shtml). It is currently configured to pull Form 3 and Form 4 filings (initial statements of, and changes in, beneficial ownership of securities), but it can be adapted to pull other forms and data.
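As a rough illustration of what "adapting to other forms" could look like, the sketch below builds a filing-index URL with the form type as a parameter. It is not taken from this repo: the function name `build_filing_index_url`, the `browse-edgar` endpoint, and its query parameters are assumptions for illustration, and the CIK shown is a placeholder.

```python
from urllib.parse import urlencode

# Assumption: EDGAR's company browse endpoint. This repo may build its
# URLs differently.
EDGAR_BROWSE_URL = "https://www.sec.gov/cgi-bin/browse-edgar"


def build_filing_index_url(cik, form_type="4", count=40):
    """Return a URL listing filings of `form_type` for one company.

    `cik` is the company's Central Index Key as a string.
    `form_type` is the form to request, e.g. "3" or "4".
    `count` is the number of results requested per page.
    """
    params = {
        "action": "getcompany",
        "CIK": cik,
        "type": form_type,   # swap this value to target a different form
        "owner": "include",  # include ownership (insider) filings
        "count": count,
    }
    return f"{EDGAR_BROWSE_URL}?{urlencode(params)}"


# Hypothetical placeholder CIK; substitute a real one.
form3_url = build_filing_index_url("0000000000", form_type="3")
form4_url = build_filing_index_url("0000000000", form_type="4")
```

Keeping the form type as an argument means switching from Form 4 to another form is a one-value change rather than an edit to the scraping logic.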

Dependencies

  • BeautifulSoup4
  • pandas / NumPy

Rate-Limiting

The SEC asks developers to limit automated requests to no more than 10 per second. Respecting this limit slows the scraper down, but exceeding it will get you temporarily blocked from the site. The rate-limiting code lives in https://github.com/hmcguinn/secScraper/blob/a40ec6b4d088f6ab6279b545cc42b34ec9683478/multiThreading/opensPerSecond.py
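For readers who want the general idea without opening that file, here is a minimal sketch of one way to throttle requests across threads. It is not the code in `opensPerSecond.py`: the `RateLimiter` class and its parameters are hypothetical, and it uses only the standard library. It spaces calls roughly evenly, so the long-run rate stays at or below the configured limit.

```python
import threading
import time


class RateLimiter:
    """Space calls so that about `max_calls` happen per `period` seconds.

    Hypothetical helper for illustration; safe to share between threads.
    """

    def __init__(self, max_calls=10, period=1.0):
        # Minimum spacing between two consecutive request slots.
        self.min_interval = period / max_calls
        self._lock = threading.Lock()
        self._next_allowed = time.monotonic()

    def wait(self):
        """Block until the caller's reserved request slot arrives."""
        with self._lock:
            now = time.monotonic()
            # Reserve the earliest free slot, then push the next one back.
            slot = max(now, self._next_allowed)
            self._next_allowed = slot + self.min_interval
        # Sleep outside the lock so other threads can reserve their slots.
        delay = slot - now
        if delay > 0:
            time.sleep(delay)


limiter = RateLimiter(max_calls=10, period=1.0)

for _ in range(3):
    limiter.wait()
    # Issue one HTTP request to EDGAR here.
```

Each worker thread calls `limiter.wait()` immediately before making a request. Because slots are reserved under a lock, adding more threads does not raise the overall request rate.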
