Garbage collect shards in SQS Filesource #5339

Merged
merged 14 commits into from
Sep 19, 2024

Conversation

rdettai
Contributor

@rdettai rdettai commented Aug 23, 2024

Description

Trigger the prune_shards endpoint from the SQS file sources.

How was this PR tested?

Added integration test.


github-actions bot commented Aug 23, 2024

On SSD:

Average search latency is 1.0x that of the reference (lower is better).
Ref run id: 3537, ref commit: c9dfb6d
Link

On GCS:

Average search latency is 1.3x that of the reference (lower is better).
Ref run id: 3538, ref commit: c9dfb6d
Link

@rdettai rdettai force-pushed the sqs-stale-shards branch 2 times, most recently from ce78565 to 66c1ca8 Compare September 2, 2024 14:16
Base automatically changed from sqs-stale-shards to main September 2, 2024 14:29
@rdettai rdettai marked this pull request as ready for review September 3, 2024 13:17
Contributor Author

rdettai commented Sep 4, 2024

There is one footgun here that I would like to avoid: a shard might be garbage collected right before it is published (e.g. when deduplication_window_max_messages is small compared to the rate at which messages are processed). Possible solutions:

  • GC only EOF shards. The drawback is that if many junk shards that couldn't be published accumulate, we might end up overwhelming the metastore with a huge shard table.
  • Set a grace period of x*commit_timeout, which reduces the situations where a shard could be garbage collected while it is still being worked on. This is not bulletproof, as indexing might be interrupted for some reason and resume after the grace period, but failing the commit in those rare cases should be tolerable.
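
To make the second option concrete, here is a minimal sketch of a grace-period check. It is not the code from this PR: the `Shard` struct, `grace_period`, `prunable_shards`, and the `window_size` parameter are all hypothetical names, and the sketch assumes the deduplication window can be approximated as "the N most recently updated shards".

```rust
use std::time::Duration;

/// Hypothetical, simplified view of a shard (not the metastore's actual type).
#[derive(Debug, Clone)]
pub struct Shard {
    pub id: String,
    /// Time elapsed since the shard was last updated.
    pub age: Duration,
}

/// Grace period expressed as a multiple `factor` of the commit timeout
/// (the `x*commit_timeout` from the comment above).
pub fn grace_period(commit_timeout: Duration, factor: u32) -> Duration {
    commit_timeout.saturating_mul(factor)
}

/// Returns the IDs of shards that may be garbage collected: shards that fall
/// outside the deduplication window (assumed here to be the `window_size`
/// most recently updated shards) AND that are at least as old as `grace`.
pub fn prunable_shards(shards: &[Shard], window_size: usize, grace: Duration) -> Vec<String> {
    let mut by_recency: Vec<&Shard> = shards.iter().collect();
    // Youngest shards first, so that `skip` keeps the most recent ones.
    by_recency.sort_by_key(|shard| shard.age);
    by_recency
        .into_iter()
        .skip(window_size)
        .filter(|shard| shard.age >= grace)
        .map(|shard| shard.id.clone())
        .collect()
}
```

With a window of one shard and a grace period of 3 times a 60-second commit timeout, a shard that fell out of the window 100 seconds ago is kept, while one last updated 400 seconds ago is returned as prunable. As noted above, this only narrows the race: an indexer that stalls for longer than the grace period can still see its commit fail.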

@rdettai rdettai force-pushed the sqs-garbage-collection branch 2 times, most recently from 7c543ee to d1885af Compare September 5, 2024 16:12
@rdettai rdettai enabled auto-merge (squash) September 19, 2024 14:03
@rdettai rdettai merged commit 565becd into main Sep 19, 2024
5 checks passed
@rdettai rdettai deleted the sqs-garbage-collection branch September 19, 2024 14:22

2 participants