feat: Add orphan checkpoint retention policy #28

Open
wants to merge 7 commits into
base: main

Parthiba-Hazra
Contributor

This PR introduces the orphan checkpoint retention policy, allowing users to control whether orphaned checkpoints are retained or deleted.

  • The new retainOrphan field is added to the global, namespace, pod, and container policies.
  • Orphan checkpoints are retained by default; setting retainOrphan to false deletes them instead.
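
To illustrate the default described above, here is a minimal Python sketch of how an effective retainOrphan value could be resolved when the field may be set at several policy levels. The function name `resolve_retain_orphan` is hypothetical, and the precedence order (most specific level wins) is an assumption; this PR description does not state how the levels interact.

```python
from typing import Optional, Sequence


def resolve_retain_orphan(levels: Sequence[Optional[bool]]) -> bool:
    """Return the effective retainOrphan value.

    `levels` holds the setting at each policy level, ordered from most
    specific to least specific (assumed order: container, pod, namespace,
    global). None means the field was not set at that level.
    """
    for value in levels:
        if value is not None:
            # First explicitly set value wins (assumed precedence).
            return value
    # Nothing set anywhere: orphan checkpoints are retained by default.
    return True
```

With this sketch, leaving the field unset everywhere yields `True`, which matches the default stated in the description.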

@rst0git
Member

rst0git commented Aug 13, 2024

@Parthiba-Hazra Would it be possible to rebase this pull request on the main branch?

- Introduced global, container, pod, and namespace-level
  policies for checkpoint retention, based on storage/size limits.
- Updated CRD definitions to store the storage/size based policies.
- Updated the sample configuration of CheckpointRestoreOperator with
  storage/checkpoint-size based policies

Signed-off-by: Parthiba-Hazra <[email protected]>
- Enhance generate_checkpoint_tar.sh to optionally
  generate tar files larger than 5MB
- Update GitHub Actions workflow to test storage quota
  garbage collection policies

Signed-off-by: Parthiba-Hazra <[email protected]>
- To implement the orphan checkpoint retention policy, the manager
  requires permissions to watch and get resources. This allows the
  manager pod to watch the relevant resources and retrieve the
  necessary resource information when applying the policies.

Signed-off-by: Parthiba-Hazra <[email protected]>
- Added support for orphan retention policies at the global,
  namespace, pod, and container levels.
- Introduced the `retainOrphan` field in each policy type to
  control the retention of orphan checkpoints.
- Updated the policy application logic to delete all orphan
  checkpoints when `retainOrphan` is set to false.
- Implemented a PodWatcher to monitor pod deletions and apply
  policies immediately when a resource is deleted.

Signed-off-by: Parthiba-Hazra <[email protected]>
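
The deletion rule in the commit message above can be sketched roughly as follows. This is an illustrative Python model, not the operator's code: the `Checkpoint` record, the `select_orphans_to_delete` function, and the idea of identifying a checkpoint's owner by a (namespace, pod) pair are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set, Tuple


@dataclass(frozen=True)
class Checkpoint:
    """Hypothetical record describing one checkpoint archive."""
    path: str
    namespace: str
    pod: str


def select_orphans_to_delete(
    checkpoints: Iterable[Checkpoint],
    live_pods: Set[Tuple[str, str]],
    retain_orphan: bool = True,
) -> List[Checkpoint]:
    """Return the checkpoints that should be deleted as orphans.

    A checkpoint counts as an orphan when its (namespace, pod) pair is
    no longer in `live_pods`. When `retain_orphan` is True (the default),
    nothing is selected for deletion.
    """
    if retain_orphan:
        return []
    return [c for c in checkpoints if (c.namespace, c.pod) not in live_pods]
```

In this model, a watcher that observes a pod deletion would remove the pod from `live_pods` and call the function again, so that the policy is applied as soon as the resource disappears.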
- Add `test_orphan_retention_policy` test and update
  GitHub Actions workflow to test orphan retention policy

Signed-off-by: Parthiba-Hazra <[email protected]>
@@ -51,21 +57,35 @@ A sample configuration file is available [here](/config/samples/_v1_checkpointre
- `checkpointDirectory`: Specifies the directory where checkpoints are stored.
- `applyPoliciesImmediately`: If set to `true`, the policies are applied immediately. If `false` (default value), they are applied after new checkpoint creation.
- `globalPolicy`: Defines global checkpoint retention limits.
- `retainOrphan`: If set to `true` (default), orphan checkpoints (checkpoints whose associated resources have been deleted) will be retained. If set to `false`, orphan checkpoints will be automatically deleted.
Member

Suggested change
- `retainOrphan`: If set to `true` (default), orphan checkpoints (checkpoints whose associated resources have been deleted) will be retained. If set to `false`, orphan checkpoints will be automatically deleted.
- `retainOrphan`: If set to `true` (default), orphan checkpoints (checkpoints whose associated resources have been deleted) will be retained. If set to `false`, orphan checkpoints will be automatically deleted. This is particularly useful for transient checkpoints used to recover from errors by replacing 'container restart' with 'container restore'.

2 participants