Scripts to Get, Correct, and Set Docket, Document, and Comment values #194

Merged: 14 commits, Oct 29, 2024

@jpappel (Collaborator) commented Oct 17, 2024

This PR aims to solve the issues of #193 while providing greater flexibility.

Summary

  • get_counts.py
    • Gets docket, document, and comment counts as JSON from regulations.gov, a mirrulations dashboard, or a mirrulations Redis instance.
    • When using regulations.gov, a timestamp can be given so that all dockets, documents, and comments before that timestamp are counted as downloaded.
  • correct_counts.py
    • Corrects possible errors in a counts JSON file generated by get_counts.py.
  • set_counts.py
    • Sets values in a mirrulations Redis instance using JSON generated by get_counts.py.

All of the scripts above share a common JSON format:

get_counts.py common format
{
  "creation_timestamp": "2024-10-16 15:00:00",
  "dockets": {
    "downloaded": 253807,
    "jobs": 0,
    "total": 253807,
    "last_timestamp": "2024-10-13 04:04:18"
  },
  "documents": {
    "downloaded": 1843774,
    "jobs": 0,
    "total": 1843774,
    "last_timestamp": "2024-10-13 04:04:18"
  },
  "comments": {
    "downloaded": 22240501,
    "jobs": 10,
    "total": 22240511,
    "last_timestamp": "2024-10-13 04:04:18"
  }
}
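
For anyone consuming this format from their own tooling, here is a minimal sketch of a shape check. It is inferred only from the sample above and is not taken from the PR's code: `validate_counts`, `ENTITIES`, `COUNT_FIELDS`, and `TIMESTAMP_FORMAT` are hypothetical names, and the timestamp pattern is an assumption based on the sample values.

```python
from datetime import datetime

# Assumptions drawn from the sample JSON only, not from the scripts themselves.
ENTITIES = ("dockets", "documents", "comments")
COUNT_FIELDS = ("downloaded", "jobs", "total")
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"


def _is_timestamp(value) -> bool:
    """True if value is a string in the sample's timestamp format."""
    try:
        datetime.strptime(value, TIMESTAMP_FORMAT)
        return True
    except (TypeError, ValueError):
        return False


def validate_counts(counts: dict) -> list[str]:
    """Return a list of problems; an empty list means the shape looks right."""
    problems = []
    if not _is_timestamp(counts.get("creation_timestamp")):
        problems.append("creation_timestamp: missing or malformed")
    for entity in ENTITIES:
        section = counts.get(entity)
        if not isinstance(section, dict):
            problems.append(f"{entity}: missing section")
            continue
        for field in COUNT_FIELDS:
            value = section.get(field)
            # bool is a subclass of int, so exclude it explicitly.
            if not isinstance(value, int) or isinstance(value, bool) or value < 0:
                problems.append(f"{entity}.{field}: expected a non-negative integer")
        if not _is_timestamp(section.get("last_timestamp")):
            problems.append(f"{entity}.last_timestamp: missing or malformed")
    return problems


# Usage with values copied from the sample above.
sample = {
    "creation_timestamp": "2024-10-16 15:00:00",
    "dockets": {"downloaded": 253807, "jobs": 0, "total": 253807,
                "last_timestamp": "2024-10-13 04:04:18"},
    "documents": {"downloaded": 1843774, "jobs": 0, "total": 1843774,
                  "last_timestamp": "2024-10-13 04:04:18"},
    "comments": {"downloaded": 22240501, "jobs": 10, "total": 22240511,
                 "last_timestamp": "2024-10-13 04:04:18"},
}
print(validate_counts(sample))
```

Returning a list of problems rather than raising on the first one makes it easier to log everything that is wrong with a file in a single pass.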

Examples

Cap Docket, Document, and Comment downloaded counts by the counts from Regulations.gov

./get_counts.py redis | ./correct_counts.py | ./set_counts.py -y
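
The capping step in the pipeline above can be pictured roughly as follows. This is a sketch of the idea, not the logic in correct_counts.py: `cap_downloaded` is a hypothetical helper, and it assumes that "capping" means a section's `downloaded` value may never exceed its `total`.

```python
import copy

ENTITIES = ("dockets", "documents", "comments")


def cap_downloaded(counts: dict) -> dict:
    """Return a copy of counts where downloaded never exceeds total.

    Assumption: total reflects the Regulations.gov count for each entity.
    """
    capped = copy.deepcopy(counts)  # leave the caller's data untouched
    for entity in ENTITIES:
        section = capped[entity]
        section["downloaded"] = min(section["downloaded"], section["total"])
    return capped


# Usage: dockets are over-counted, the other two are left as they were.
counts = {
    "dockets": {"downloaded": 120, "jobs": 0, "total": 100},
    "documents": {"downloaded": 50, "jobs": 0, "total": 80},
    "comments": {"downloaded": 7, "jobs": 0, "total": 7},
}
print(cap_downloaded(counts)["dockets"]["downloaded"])
```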

Set Docket, Document, Comment downloaded counts while jobs are in the queue

./get_counts.py dashboard | ./correct_counts.py --ignore-queue --strategy diff_total_with_jobs | ./set_counts.py -y
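
The PR description does not spell out what the `diff_total_with_jobs` strategy does. One plausible reading of the name, offered here purely as an assumption, is that the downloaded count is derived as the total minus the jobs still queued. The helper below is hypothetical and only illustrates that reading; it happens to agree with the sample comments section in the common format, where 22240511 total and 10 jobs correspond to 22240501 downloaded.

```python
def diff_total_with_jobs(section: dict) -> int:
    """Hypothetical reading of the strategy: downloaded = total - jobs.

    Clamped at zero so a stale job count cannot produce a negative value.
    """
    return max(section["total"] - section["jobs"], 0)


# Usage with the comments values from the sample JSON.
comments = {"downloaded": 0, "jobs": 10, "total": 22240511}
comments["downloaded"] = diff_total_with_jobs(comments)
print(comments["downloaded"])
```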

Download Counts for a Certain Time from Regulations.gov

./get_counts.py --api-key $API_KEY -o aug_6_2024.json -t 2024-08-06T06:20:50Z

export API_KEY=<REGULATIONS.GOV_API_KEY>
./get_counts.py regulations -o oct_01_2024.json --last-timestamp 2024-10-01T15:30:10Z
./set_counts.py -i oct_01_2024.json

@jpappel (Collaborator, Author) commented Oct 17, 2024

Looking for feedback on whether set_counts.py should output JSON when values are replaced.

@OnToNothing (Collaborator) commented

I've checked the scripts and they are working well. Having set_counts output JSON would work well for logging purposes.

@jpappel (Collaborator, Author) commented Oct 21, 2024

this needs to check the length of the queue, not the individual jobs waiting

@OnToNothing OnToNothing merged commit 06a22fe into MoravianUniversity:main Oct 29, 2024
2 checks passed