Data Dumps #21

Closed
firmai opened this issue Nov 24, 2024 · 3 comments

Comments


firmai commented Nov 24, 2024

It might be worth transitioning from Dropbox to Wasabi or DigitalOcean Spaces. Wasabi, while slower, is appealing due to its lack of ingress/egress costs, and both platforms are significantly more stable than Dropbox. If you need preliminary code for any of these, I’d be happy to help.

I mention this because I’ve been encountering issues with the 13Fs fetcher:
downloader.download_dataset(dataset='13f_information_table').

As you’ve likely noticed, the direct download approach often results in frequent timeouts. It’s far more efficient for one person to fetch the data once and make it accessible to others.
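Until a shared copy exists, one common way to soften intermittent timeouts is to retry with exponential backoff. The following is only a minimal sketch: `retry_with_backoff` and its parameters are hypothetical names that are not part of datamule, and it assumes the failing call surfaces `TimeoutError` or `ConnectionError`.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    fetch: Callable[[], T],
    attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fetch() until it succeeds or the attempts run out.

    Waits base_delay, 2 * base_delay, 4 * base_delay, ... between tries.
    Only timeout and connection errors are retried; anything else propagates.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return fetch()
        except (TimeoutError, ConnectionError):
            if attempt == attempts - 1:
                raise  # out of retries, surface the last error
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")
```

Any zero-argument callable can be passed as `fetch`, for example a `lambda` wrapping the download call. Retries only mask the problem, so a hosted archive is still the better long-term fix.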

Lastly, does your downloader authenticate with headers in the SEC’s preferred format (e.g., name and email), or does the targeted API not require it?
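For reference, this is roughly what a request carrying a name-and-email identifier looks like. It is a sketch under assumptions: the SEC's fair-access guidance, as I understand it, asks automated clients to declare a `User-Agent` containing a name and a contact email. `build_sec_request` is a hypothetical helper, and nothing here reflects how the downloader in this project actually sends its requests.

```python
import urllib.request


def build_sec_request(url: str, name: str, email: str) -> urllib.request.Request:
    """Build a request whose User-Agent identifies the caller by name and email."""
    if not name.strip() or "@" not in email:
        raise ValueError("a name and a contact email are required")
    # Assumed format: "<name> <email>", e.g. "Jane Doe jane@example.com".
    headers = {"User-Agent": f"{name.strip()} {email.strip()}"}
    return urllib.request.Request(url, headers=headers)


# Placeholder identity; replace with your own name and email.
req = build_sec_request("https://www.sec.gov/", "Jane Doe", "jane@example.com")
```

The request can then be sent with `urllib.request.urlopen(req, timeout=30)`. Building it separately makes the header easy to inspect before any traffic goes out.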


firmai commented Nov 24, 2024

Here is the error I get when running the code above:

[image: screenshot of the error, not preserved]

john-friedman (Owner) commented

Hi @firmai, I'm going to bypass downloader issues by hosting my own SEC archive. Should be up next week.

Cost looks to be $10-30/month using Backblaze B2 + Cloudflare for free egress. I'm definitely interested in finding sponsorships to defray the cost and help keep me going.

john-friedman (Owner) commented

Archive is up. Depending on your hardware and internet speed, each year of data should take about 1-2 minutes to download. I'm working on making this 10-100x faster.

Here's a quick guide:
https://github.com/john-friedman/datamule-python/blob/main/examples/13F-HR_information_tables.ipynb
