I have 17,000 historical reports in a folder in o365, which I'm trying to process with parsedmarc.
This seems to require watch=true to do more than one batch, but with watch=true it works through these at an extremely slow rate (less than 1/second).
It appears this is because a batch_size param is passed to connector.fetch_messages() on the first call but not subsequent ones, so the first batch is reasonably fast and every one after that is running a slow, expensive query to list all 17,000 items in the mailbox.
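To make the cost difference concrete, here is a toy model of the behaviour described above. It is only a sketch, not parsedmarc code: `drain` and `limit_every_call` are hypothetical names, and it assumes each pass processes `batch_size` messages and that an unlimited listing call touches every message still in the folder.

```python
def drain(total, batch_size, limit_every_call):
    """Count how many mailbox items get listed while emptying a folder.

    Hypothetical model: the first listing call is always limited to
    batch_size; later calls are limited only if limit_every_call is True.
    """
    remaining = total
    listed = 0
    first = True
    while remaining > 0:
        batch = min(batch_size, remaining)
        limited = first or limit_every_call
        # A limited call lists one batch; an unlimited call lists everything left.
        listed += batch if limited else remaining
        remaining -= batch
        first = False
    return listed


if __name__ == "__main__":
    print("limit passed on every call:", drain(17_000, 10, True))
    print("limit passed on first call only:", drain(17_000, 10, False))
```

In this model the listing work grows roughly quadratically with the folder size when the limit is dropped after the first call, which would be consistent with the slowdown seen here.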
Bad workaround: Set batch_size=250 or so, so the expensive query is only run about 68 times instead of 1,700, and deal with the duplicates if it crashes between starting processing and moving/deleting the emails. (Also, don't use watch=true; see #416.)
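For anyone picking a batch size, the number of expensive listing queries is just the message count divided by the batch size, rounded up. A quick sketch (the helper name `listing_queries` is made up for illustration, and the batch size of 10 is an assumption inferred from the 1,700 figure above):

```python
import math


def listing_queries(total_messages, batch_size):
    """Number of batches (and so listing queries) needed to cover all messages."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return math.ceil(total_messages / batch_size)


if __name__ == "__main__":
    # 10 is an assumed default, inferred from the 1,700 figure in this issue.
    for size in (10, 250, 500):
        print(f"batch_size={size}: {listing_queries(17_000, size)} queries")
```

Larger batches mean fewer queries, at the cost of more duplicates to clean up if a run dies partway through a batch.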
If you are only working with DMARC reports you may be interested in trying out nhairs/parsedmarc-fork which is designed to be more stable for lots of reports.
Note that it is a WIP so it might not have all the functionality you need (do let me know though!).