Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Split scraper_utils.py, legistar_utils.py #95

Open
Tracked by #90
dphoria opened this issue Apr 30, 2022 · 7 comments
Open
Tracked by #90

Split scraper_utils.py, legistar_utils.py #95

dphoria opened this issue Apr 30, 2022 · 7 comments
Labels
enhancement New feature or request

Comments

@dphoria
Copy link
Collaborator

dphoria commented Apr 30, 2022

Feature Description

Take out top-level utility functions out of scraper_utils.py and legistar_utils.py into files under cdp_scrapers/utils. The files have become too large. i.e. Those files will contain implementations for IngestionModelScraper and LegistarScraper only.

Solution

Restructure so we will do something like

from cdp_scrapers.utils import reduced_list
from cdp_scrapers.utils.legistar import get_legistar_body

Alternatives

Functionally, the current state is completely adequate. I think the change will make for more manageable code.

@dphoria dphoria added the enhancement New feature or request label Apr 30, 2022
@dphoria dphoria mentioned this issue Apr 30, 2022
3 tasks
@dphoria
Copy link
Collaborator Author

dphoria commented Apr 30, 2022

LegistarScraper.find_time_zone() should be a base utils member. i.e.

from cdp_scrapers.utils import find_time_zone

@mdahwireng
Copy link

How can one identify top-level utility functions?

@dphoria
Copy link
Collaborator Author

dphoria commented Sep 28, 2022

Hello @mdahwireng . Thank you so much for your interest. That is a great question and I will add some more definition to this issue later today.

@dphoria
Copy link
Collaborator Author

dphoria commented Sep 29, 2022

As far as I can tell these are the functions defined at the file root level in those 2 files.

get_legistar_body
get_legistar_person
get_legistar_events_for_timespan
get_legistar_content_uris

reduced_list
str_simplified
parse_static_person
parse_static_file
sanitize_roles

@mdahwireng
Copy link

@dphoria Thanks. I will have a look at it

@mdahwireng
Copy link

LegistarScraper.find_time_zone() should be a base utils member. i.e.

from cdp_scrapers.utils import find_time_zone

@dphoria LegistarScraper does not have a find_time_zone() method. Is there something I am missing?

@dphoria
Copy link
Collaborator Author

dphoria commented Sep 30, 2022

No, not at all. You are correct; the code has changed since when this issue was created. Sorry about that confusion.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants