[Feature] Optimize ObjectRefresh for lower memory usage and better performance #4971

Closed
2 tasks done
smdsbz opened this issue Jan 21, 2025 · 0 comments · Fixed by #4980
Labels
enhancement New feature or request

Comments

Contributor

smdsbz commented Jan 21, 2025

Search before asking

  • I searched in the issues and found nothing similar.

Motivation

The current implementation of ObjectRefresh first collects a list of all files under object-location, then writes them out to the table. This requires the driver node to have enough memory to hold the entire object listing.

Another problem is that the current implementation generates a new commit for each file in the listing. This can result in an enormous number of snapshots and poor refresh performance.

Solution

  1. Use FileIO#listFilesIterative to load the file listing into memory in batches.
    The actual memory savings depend on the FileIO implementation, but in the worst case it falls back to the current behavior.
  2. Commit periodically, once the pending writes reach a certain batch size.
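
The two steps above could be sketched roughly as follows. This is only an illustration, not the actual ObjectRefresh code: the class `BatchedRefreshSketch`, the method `refreshInBatches`, and the `commitBatch` callback are hypothetical names, and a plain `Iterator<String>` stands in for whatever FileIO#listFilesIterative returns.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch: names below are illustrative and not part of the project.
public class BatchedRefreshSketch {

    /**
     * Drains a lazily produced file listing in fixed-size batches and hands
     * each batch to commitBatch, so at most batchSize entries are buffered
     * here at any time. Returns the number of commits issued.
     */
    public static int refreshInBatches(
            Iterator<String> listing, int batchSize, Consumer<List<String>> commitBatch) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int commits = 0;
        List<String> batch = new ArrayList<>(batchSize);
        while (listing.hasNext()) {
            batch.add(listing.next());
            if (batch.size() == batchSize) {
                commitBatch.accept(batch); // one commit per full batch, not per file
                commits++;
                batch = new ArrayList<>(batchSize); // fresh buffer; committed batch stays intact
            }
        }
        if (!batch.isEmpty()) {
            commitBatch.accept(batch); // flush the final partial batch
            commits++;
        }
        return commits;
    }

    public static void main(String[] args) {
        // Stand-in for an iterative listing of five objects.
        Iterator<String> listing = List.of("f1", "f2", "f3", "f4", "f5").iterator();
        int commits = refreshInBatches(listing, 2, b -> System.out.println("commit " + b));
        System.out.println(commits + " commits");
    }
}
```

With a batch size of 2, five listed files produce three commits instead of five, and the sketch never buffers more than two entries itself. Whether the listing itself is held in memory still depends on the underlying iterator.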

Anything else?

No response

Are you willing to submit a PR?

  • I'm willing to submit a PR!
@smdsbz smdsbz added the enhancement New feature or request label Jan 21, 2025
@smdsbz smdsbz changed the title from "[Feature] Lower memory usage of ObjectRefresh" to "[Feature] Optimize ObjectRefresh for lower memory usage and better performance" Jan 21, 2025