Rewrite download_changesets.sh #31

Draft
wants to merge 3 commits into
base: master

grischard
Collaborator

@grischard grischard commented Apr 3, 2023

  • Use curl instead of wget to fetch the list of changesets.
  • Use mktemp instead of a hardcoded filename for the temporary file.
  • Allow the output directory to be specified via a command-line argument.
  • Add logging with detailed messages, including timestamps and errors.
  • Implement retries with exponential backoff for failed downloads.
  • Implement a dry-run mode to display the changesets that would be downloaded without actually downloading them.
  • Allow users to specify a custom log file location.
  • Output a summary report of the downloaded changesets, including total count and size.
  • Implement a help message that displays the available options and their descriptions.

This hopefully makes download_changesets more flexible, robust, and user-friendly than the original.
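
A rough sketch of how a few of the items above (command-line options, `mktemp`, timestamped logging, dry-run mode, help message) could fit together. The option letters (`-o`, `-l`, `-n`, `-h`) and every function and variable name are assumptions made for illustration; this thread does not show the ones the PR actually uses.

```shell
#!/usr/bin/env bash
# Illustrative sketch only: option letters and all names are hypothetical.

usage() {
    cat <<'EOF'
Usage: download_changesets.sh [-o DIR] [-l LOGFILE] [-n] [-h]
  -o DIR      directory to store downloaded changesets (default: .)
  -l LOGFILE  also append log messages to LOGFILE
  -n          dry run: report what would be downloaded, download nothing
  -h          show this help
EOF
}

# Print a timestamped message to stderr, and to the log file if one is set.
log() {
    local msg
    msg="$(date '+%Y-%m-%d %H:%M:%S') $*"
    echo "$msg" >&2
    if [ -n "$log_file" ]; then
        echo "$msg" >> "$log_file"
    fi
}

# Parse options into the globals out_dir, log_file and dry_run.
parse_args() {
    out_dir=.
    log_file=
    dry_run=0
    OPTIND=1
    while getopts "o:l:nh" opt; do
        case "$opt" in
            o) out_dir=$OPTARG ;;
            l) log_file=$OPTARG ;;
            n) dry_run=1 ;;
            h) usage; exit 0 ;;
            *) usage >&2; exit 1 ;;
        esac
    done
}

# Create a unique temporary file for the changeset list and remove it on exit.
make_tmp_list() {
    tmp_list=$(mktemp) || return 1
    trap 'rm -f "$tmp_list"' EXIT
}

# In dry-run mode, only report the changeset that would be fetched.
handle_changeset() {
    if [ "$dry_run" -eq 1 ]; then
        log "would download changeset $1 to $out_dir"
    else
        log "downloading changeset $1 to $out_dir"
        # The actual download step is omitted from this sketch.
    fi
}
```

A script built this way would call `parse_args "$@"` and `make_tmp_list` once at startup, then `handle_changeset` for each ID in the list.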

@woodpeck
Owner

woodpeck commented Apr 3, 2023

Thank you. (Unsure how curl is better than wget but hey...) Note that exponential backoffs may not always solve a download issue - I have encountered changesets so big that they could not be downloaded no matter how long you waited. For these I had to resort to @pnorman's make_changeset.py that synthesizes a changeset from minutely diffs.
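
To illustrate the point about backoff not being a cure-all, here is a minimal sketch of a bounded retry loop that gives up after a fixed number of attempts and records the failing changeset ID, so it can be dealt with some other way (for example with make_changeset.py). The function names, the attempt count and delay, the `.osc` output naming, and the "failed list" file are all assumptions, not taken from the PR.

```shell
#!/usr/bin/env bash
# Illustrative sketch only: all names and defaults here are hypothetical.

# Retry a command with exponentially growing delays, then give up.
# Usage: retry_with_backoff MAX_ATTEMPTS BASE_DELAY_SECONDS COMMAND [ARGS...]
retry_with_backoff() {
    local max_attempts="$1"
    local delay="$2"
    local attempt=1
    shift 2
    while true; do
        if "$@"; then
            return 0
        fi
        if [ "$attempt" -ge "$max_attempts" ]; then
            return 1
        fi
        sleep "$delay"
        delay=$(( delay * 2 ))
        attempt=$(( attempt + 1 ))
    done
}

# Fetch one changeset; on final failure, note its ID instead of retrying forever.
# Assumes the script uses the OSM API 0.6 changeset download call.
fetch_changeset() {
    local id="$1"
    local out_dir="$2"
    local failed_list="$3"
    local url="https://api.openstreetmap.org/api/0.6/changeset/${id}/download"
    if ! retry_with_backoff 5 2 curl --fail --silent --show-error \
            --output "${out_dir}/${id}.osc" "$url"; then
        echo "$id" >> "$failed_list"
        return 1
    fi
}
```

With these values a changeset is attempted five times, with waits of 2, 4, 8 and 16 seconds in between; anything still failing ends up in the failed list rather than blocking the run.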

@AntonKhorev
Collaborator

Why does the "since" date default to 2013-11-01? Wouldn't you want a date from before OSM existed?

@SomeoneElseOSM
Collaborator

Is this planned to cover #30 ("Added download_changesets_txt.sh") and #25 ("download_changesets_uid.sh")?

@mmd-osm

mmd-osm commented Jun 2, 2023

> I have encountered changesets so big that they could not be downloaded no matter how long you waited.

@woodpeck : This is a bit unexpected. Do you happen to have a download URL to demonstrate the issue?

As always, if you encounter similar issues in the future, please report them in the repo implementing the respective endpoint (rather than some random place on the internet).

@woodpeck
Owner

@pnorman do you by any chance remember why you had to build your python "changeset synthesizer"? regarding @mmd's question above. Possibly that was at a time when changesets still had 50k objects and were maybe not even served by cgimap?

@pnorman
Contributor

pnorman commented Jun 21, 2023

> @pnorman do you by any chance remember why you had to build your python "changeset synthesizer"? regarding @mmd's question above. Possibly that was at a time when changesets still had 50k objects and were maybe not even served by cgimap?

Yes, it was before cgimap handled changesets, and they could time out for large ones. But I mainly built it because it was interesting.
