"asyncio.run() cannot be called from a running event loop" #23

Open
debalee101 opened this issue Nov 24, 2024 · 3 comments

Comments

@debalee101

CIK.xlsx

I am trying to download 10Ks from a list of CIKs [Sample list attached]. I am using the following code:

pip install datamule[all]

import pandas as pd
import datamule as dm

df_CIK = pd.read_excel('CIK.xlsx', sheet_name='Sheet1')
CIKlist = df_CIK['CIK'].tolist()

downloader = dm.Downloader()
downloader.set_limiter('www.sec.gov', 5)
downloader.set_limiter('efts.sec.gov', 5)

output_dir = '10-K'
metadata_csv = 'metadata.csv'

for CIK in CIKlist:
    try:
        print(f"Downloading 10-K forms for {CIK}...")
        downloader.download(
            form='10-K',
            cik=CIK,
            output_dir=output_dir,
            date=('2004-01-01', '2024-01-31'),
            save_metadata=True
        )
        print(f"Completed downloading for {CIK}")
    except Exception as e:
        print(f"Failed to download for {CIK}: {e}")

I have also tried:

async def download_10k():
    for CIK in CIKlist:
        try:
            print(f"Downloading 10-K forms for {CIK}...")
            await downloader.download(
                form='10-K',
                cik=CIK,
                output_dir=output_dir,
                date=('2004-01-01', '2024-01-31'),
                save_metadata=True
            )
            print(f"Completed downloading for {CIK}")
        except Exception as e:
            print(f"Failed to download for {CIK}: {e}")
try:
    loop = asyncio.get_running_loop()
    task = loop.create_task(download_10k())
    await task
except RuntimeError:
    asyncio.run(download_10k())

Or is it a good idea to download all available 10Ks within the date range, and then filter by the list of CIKs?

@john-friedman
Owner

You can pass your ciklist directly into the downloader.

downloader.download(
    form='10-K',
    cik=CIKlist,
    output_dir=output_dir,
    date=('2004-01-01', '2024-01-31'),
    save_metadata=True)

I'm not sure whether the asyncio error is coming from your first code block or your second. If it's the second: the downloader is already async and meant to work right out of the box, which is why the error would occur. If it's the first, I'm not sure why, and I would love to know whether you're using Jupyter notebooks or Colab.
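
For anyone landing here from the error message, here is a minimal standard-library sketch of how it arises. It does not use datamule at all: `sync_api` and `inner` are hypothetical stand-ins for a synchronous wrapper that starts its own event loop, which is an assumption about the general pattern and not a description of the downloader's actual internals.

```python
import asyncio


async def inner():
    # Stand-in for any coroutine a library might run internally.
    return "done"


def sync_api():
    # Hypothetical synchronous wrapper that starts its own event loop.
    coro = inner()
    try:
        return asyncio.run(coro)
    except RuntimeError as exc:
        coro.close()  # the coroutine never ran; close it to avoid a warning
        return f"RuntimeError: {exc}"


async def caller():
    # A loop is already running here (as is typical in Jupyter or Colab),
    # so the nested asyncio.run() call is rejected.
    return sync_api()


# No loop running: the wrapper works.
outside = sync_api()
# Loop already running: the wrapper fails with the error from this issue.
inside = asyncio.run(caller())
print(outside)
print(inside)
```

Calling the wrapper from plain synchronous code succeeds, while calling it from inside a coroutine (or a notebook cell that already sits on a running loop) does not.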

@debalee101
Author

I added a block that solved the error:

!pip install nest_asyncio
import nest_asyncio
nest_asyncio.apply()
import asyncio

I have a question:

Does 'filing' not work when the file is .htm? I have tried the following:

dfs = []

for file in Path(output_dir).iterdir():
    print(file)

for file in Path(output_dir).iterdir():
    filing = Filing(str(file), '10-K')
    dfs.append(pd.DataFrame(filing))

df = pd.concat(dfs)
df.to_json('10k.json')

I get a 'document' error. I am not able to figure out what I am doing wrong here.

@john-friedman
Owner

Hi @debalee101. Please share what environment you are running your code in. nest_asyncio is enabled by default for Jupyter notebooks in this package, so if it's not working for you by default, I would like to fix it.
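
If it helps with reporting the environment, something like the sketch below could be pasted into the same session. It uses only the standard library; the function names and the choice of which modules to look for (`IPython`, `google.colab`) are my own assumptions about what is useful to report, not anything datamule provides.

```python
import asyncio
import sys


def loop_is_running() -> bool:
    """Return True if an asyncio event loop is running in this thread."""
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        return False
    return True


def describe_environment() -> dict:
    """Collect a few facts worth pasting into a bug report."""
    return {
        "python": sys.version.split()[0],
        # These only check whether the modules are already imported.
        "ipython_loaded": "IPython" in sys.modules,
        "colab_loaded": "google.colab" in sys.modules,
        "loop_running": loop_is_running(),
    }


print(describe_environment())
```

In a plain script `loop_running` should come out False; in a notebook cell it is typically True, which is the situation where a nested `asyncio.run()` fails.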

Also nice catch! What causes the error is 10-K/A files being mixed in with your 10-Ks. This is because the downloader by default downloads root forms.

Try emptying your directory and rerunning the downloader with file_types=['10-K'] so that only 10-K files are selected. (Works on my machine.)
downloader.download(ticker=['MSFT','TSLA','AAPL'],output_dir=output_dir,form='10-K',file_types=['10-K'])

Note that this behavior of the code is annoying, and I will be making it simpler in future versions, so this is very helpful.
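
In case the directory cannot easily be emptied, a rough way to find out which downloaded files are the ones that fail is to parse them one at a time and collect the failures instead of stopping at the first one. This is only a sketch: `parse_each` is a hypothetical helper, and `fake_parse` is a stand-in for `Filing(str(path), '10-K')` that simulates a failure so the example runs without datamule. The exception type it raises is a guess, not what the library necessarily throws.

```python
from pathlib import Path
from typing import Any, Callable, Iterable


def parse_each(
    paths: Iterable[Path],
    parse: Callable[[Path], Any],
) -> tuple[dict[str, Any], dict[str, str]]:
    """Apply `parse` to every path, keeping successes and failures apart."""
    parsed: dict[str, Any] = {}
    failed: dict[str, str] = {}
    for path in sorted(paths):
        try:
            parsed[path.name] = parse(path)
        except Exception as exc:  # broad on purpose: we only want a report
            failed[path.name] = f"{type(exc).__name__}: {exc}"
    return parsed, failed


def fake_parse(path: Path) -> str:
    # Hypothetical stand-in for Filing(str(path), '10-K'). It rejects names
    # containing "amended" purely to simulate a parser error.
    if "amended" in path.name:
        raise KeyError("document")
    return path.stem.upper()


paths = [Path("a.htm"), Path("b_amended.htm"), Path("c.htm")]
parsed, failed = parse_each(paths, fake_parse)
print(parsed)
print(failed)
```

With the real parser, `parse_each(Path(output_dir).iterdir(), lambda p: Filing(str(p), '10-K'))` would give the list of file names to inspect in `failed`.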
