Audit and reconcile ArgoCD manifests vs Ansible deploy roles #153

Merged
SRF-Audio merged 12 commits into main from copilot/reconcile-argocd-manifests
Dec 31, 2025

Conversation

Contributor

Copilot AI commented Dec 31, 2025

Audit ArgoCD Application manifests vs legacy <app>_deploy roles

Completed Work ✅

  • Create comprehensive audit markdown document at docs/argo_vs_ansible_deploy_audit.md
  • Delete redundant roles (2): crafty_controller_deploy, nfs_provisioner_deploy
  • Create missing ArgoCD Application manifests (3): Frigate, NetBox, Velero
  • Address all PR feedback:
    • Frigate: Use nfs-synology-retain storage, Homepage on ingress only, Tailscale on service only
    • NetBox: Use existing postgres/redis, infra-netbox namespace, proper annotations
    • NetBox: Move onepassword files to k8s/netbox/
    • Velero: No storage class changes needed (no PVCs)

Summary

  • 2 roles deleted: crafty_controller_deploy, nfs_provisioner_deploy
  • 3 ArgoCD apps created: Frigate, NetBox, Velero (all with proper configuration)
  • 4 roles kept: homepage_deploy, omada_deploy, paperless_ngx_deploy, tailscale_operator_deploy

Annotation Strategy

  • Homepage annotations: On ingress only (used for service discovery)
  • Tailscale annotations: On ClusterIP service only (for network exposure)
  • Ingress: "Dummy" ingress with empty className for Homepage discovery
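
The annotation rules above can be checked mechanically. The following is a minimal sketch, not part of this PR: it assumes manifests are already loaded as Python dicts, and the annotation prefixes `gethomepage.dev/` and `tailscale.com/` are assumptions that should be confirmed against the Homepage and Tailscale operator documentation. The function name `annotation_violations` is hypothetical.

```python
from typing import Any

# Assumed annotation prefixes (not stated in this PR); verify before relying on them.
HOMEPAGE_PREFIX = "gethomepage.dev/"
TAILSCALE_PREFIX = "tailscale.com/"


def annotation_violations(manifest: dict[str, Any]) -> list[str]:
    """Return rule violations for a single Service or Ingress manifest."""
    kind = manifest.get("kind", "")
    annotations = manifest.get("metadata", {}).get("annotations") or {}
    problems: list[str] = []
    for key in annotations:
        # Homepage discovery annotations belong on the Ingress only.
        if key.startswith(HOMEPAGE_PREFIX) and kind != "Ingress":
            problems.append(f"{kind}: Homepage annotation {key} belongs on the Ingress")
        # Tailscale exposure annotations belong on the ClusterIP Service only.
        if key.startswith(TAILSCALE_PREFIX) and kind != "Service":
            problems.append(f"{kind}: Tailscale annotation {key} belongs on the Service")
    # The "dummy" ingress must not be claimed by an ingress controller.
    if kind == "Ingress" and manifest.get("spec", {}).get("ingressClassName"):
        problems.append("Ingress: ingressClassName should be empty")
    return problems
```

An empty list means the manifest follows the strategy; each string in a non-empty list names one misplaced annotation or a non-empty class name.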

NetBox Configuration

  • Namespace: infra-netbox (matches infrastructure naming convention)
  • Database: Uses existing postgres at postgres-postgresql.db-postgres.svc.cluster.local
  • Redis: Uses existing redis at redis-master.db-redis.svc.cluster.local
  • Storage: nfs-synology-retain for persistence
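
Both hostnames above follow the in-cluster service DNS form `<service>.<namespace>.svc.cluster.local`. As a small sketch (assuming the default `cluster.local` cluster domain; the helper name `split_service_dns` is hypothetical), the service and namespace can be recovered like this:

```python
def split_service_dns(host: str) -> tuple[str, str]:
    """Split '<service>.<namespace>.svc.cluster.local' into (service, namespace)."""
    suffix = ".svc.cluster.local"  # assumes the default cluster domain
    if not host.endswith(suffix):
        raise ValueError(f"not an in-cluster service name: {host}")
    service, _, namespace = host[: -len(suffix)].partition(".")
    # Both parts must be present, and the namespace must be a single DNS label.
    if not service or not namespace or "." in namespace:
        raise ValueError(f"unexpected service name shape: {host}")
    return service, namespace


print(split_service_dns("postgres-postgresql.db-postgres.svc.cluster.local"))
print(split_service_dns("redis-master.db-redis.svc.cluster.local"))
```

This makes it easy to confirm that NetBox points at the `db-postgres` and `db-redis` namespaces rather than at chart-local instances.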
Original prompt

This section details the original issue you should resolve

<issue_title>Reconcile ArgoCD Application manifests vs legacy <app>_deploy roles</issue_title>
<issue_description>### Goal

Audit argocd/** ArgoCD Application manifests against legacy Ansible “deploy roles” and:

  1. Delete any <app>_deploy role that is fully redundant with the Argo manifests (i.e., it only rendered/applied an Argo Application and does not do anything that must remain in Ansible).
  2. Create missing Argo Application manifests when an <app>_deploy role exists but there is no corresponding Argo app manifest today.
  3. Do not touch any roles that are not clearly “deploy an app” roles (anything not matching the scope rules below).

This work is repo-hygiene + migration completeness: ensure Argo is the single source of truth for deploying these Kubernetes apps, and Ansible only retains what Argo cannot own (e.g., 1Password secret templating, bootstrap steps, kubeconfig plumbing, etc.).


In scope

Argo app manifests (source of truth)

Under:

  • argocd/apps/apps/*.yml
  • argocd/apps/operators/*.yml
  • argocd/apps/platform/*.yml
  • argocd/projects/*.yml
  • argocd/root.yml

Candidate legacy roles to evaluate (only deploy roles)

Under ansible/roles/**, only roles whose names match:

  • *_deploy

Examples explicitly in tree:

  • crafty_controller_deploy
  • homepage_deploy
  • nfs_provisioner_deploy
  • omada_deploy
  • paperless_ngx_deploy
  • tailscale_operator_deploy
  • frigate_deploy
  • netbox_deploy
  • velero_deploy

Exclusions:
These two deploy roles are required to bootstrap the cluster in the first place. Do not touch them.

  • argocd_deploy (special-case: bootstrap, not an “app deploy role” in the same sense)
  • onepassword_operator_deploy

Out of scope (hard guardrails)

  • Any role not ending with _deploy (examples: argocd_api_auth, kubeconfig_manager, k3s_kubeconfig_retriever, op_*, proxmox_*, k8s_*, etc.).
  • Any changes to actual Helm values, kustomize overlays, chart sources, cluster primitives manifests, PV/PVC content, etc. (This spec is about deployment mechanism parity, not redesign.)
  • Any refactors to playbooks invoking these roles (unless required to remove deleted roles; keep that as a minimal follow-up change only if tests/lint require it).

Definitions

“Corresponding Argo app exists”

A deploy role is considered “covered by Argo” if there is an Argo Application manifest matching the same logical app (by obvious name mapping).

Expected mappings from the tree provided (initial set):

  • crafty_controller_deploy → argocd/apps/apps/crafty-controller.yml
  • paperless_ngx_deploy → argocd/apps/apps/paperless-ngx.yml AND argocd/apps/apps/paperless-ngx-secrets.yml
  • homepage_deploy → argocd/apps/platform/homepage.yml
  • nfs_provisioner_deploy → argocd/apps/platform/nfs-provisioner.yml
  • omada_deploy → argocd/apps/platform/omada-controller.yml
  • tailscale_operator_deploy → argocd/apps/operators/tailscale-operator.yml AND argocd/apps/operators/tailscale-operator-secrets.yml
  • argocd_deploy → argocd/root.yml + argocd/projects/*.yml + argocd/apps/operators/argocd-projects.yml (treat as bootstrap, not a deletable deploy role)

Other deploy roles in repo likely do not have Argo apps yet (based on your Argo tree):

  • frigate_deploy
  • netbox_deploy
  • velero_deploy
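
The "obvious name mapping" can be sketched as a small lookup. This is an illustrative assumption, not the method used in the PR: it strips the `_deploy` suffix, swaps underscores for hyphens, and looks for `<app>.yml` and `<app>-secrets.yml` in the manifest paths listed above. The `OVERRIDES` table and the function name `expected_manifests` are hypothetical; the override is needed because `omada_deploy` maps to `omada-controller.yml`.

```python
# Manifest paths copied from the expected mappings above.
ARGO_TREE = {
    "argocd/apps/apps/crafty-controller.yml",
    "argocd/apps/apps/paperless-ngx.yml",
    "argocd/apps/apps/paperless-ngx-secrets.yml",
    "argocd/apps/platform/homepage.yml",
    "argocd/apps/platform/nfs-provisioner.yml",
    "argocd/apps/platform/omada-controller.yml",
    "argocd/apps/operators/tailscale-operator.yml",
    "argocd/apps/operators/tailscale-operator-secrets.yml",
}

# Hypothetical table for names that do not follow the simple rule.
OVERRIDES = {"omada": "omada-controller"}


def expected_manifests(role: str, tree: set[str]) -> list[str]:
    """Return the Argo manifest paths covering a *_deploy role, or [] if missing."""
    if not role.endswith("_deploy"):
        raise ValueError(f"out of scope, not a deploy role: {role}")
    app = role[: -len("_deploy")].replace("_", "-")
    app = OVERRIDES.get(app, app)
    wanted = {f"{app}.yml", f"{app}-secrets.yml"}
    return sorted(path for path in tree if path.rsplit("/", 1)[-1] in wanted)


for role in ("paperless_ngx_deploy", "omada_deploy", "frigate_deploy"):
    print(role, expected_manifests(role, ARGO_TREE) or "MISSING")
```

Roles that come back empty, such as `frigate_deploy`, are the candidates for a new Argo Application manifest.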

“Totally redundant”

A deploy role is “totally redundant” only if:

  • It only templates or copies an Argo Application manifest (or a small set of manifests) and then applies them with kubernetes.core.k8s / kubectl-style tasks,

  • And it does not:

    • template sensitive material that must remain outside Git (ex: 1Password Connect credentials, bootstrap tokens),
    • create/update required external resources (ex: 1Password items, GitHub repo, DNS records, certs),
    • perform cluster bootstrap prerequisites not owned by Argo,
    • do migrations/backups/restores.

If any of those exist, the role is not “totally redundant”; instead it needs a decision:

  • Keep role (if it performs required non-Argo work), OR
  • Split it (the Argo Application stays in GitOps). If the other bits are just secret templating, those should instead become onepassword CRDs. Refer to existing Argo *_secrets apps to see examples.
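
One possible reading of these rules, written as a decision function. This is a sketch under stated assumptions: the `RoleFindings` fields and the `decide` function are hypothetical names, the findings would have to be filled in by hand after reading each role's tasks, and "SPLIT ROLE" is an illustrative label that is not one of the audit table's actions.

```python
from dataclasses import dataclass


@dataclass
class RoleFindings:
    """Manual findings for one *_deploy role (hypothetical structure)."""
    only_applies_manifests: bool      # only templates/copies and applies manifests
    templates_secrets: bool           # templates sensitive material kept outside Git
    manages_external_resources: bool  # 1Password items, DNS records, certs, ...
    does_bootstrap: bool              # cluster prerequisites not owned by Argo
    does_migrations_or_backups: bool  # migrations, backups, restores


def decide(f: RoleFindings) -> str:
    """Map findings to an action, following one reading of the rules above."""
    if f.manages_external_resources or f.does_bootstrap or f.does_migrations_or_backups:
        return "KEEP ROLE"   # required non-Argo work
    if f.templates_secrets:
        return "SPLIT ROLE"  # move secret templating to onepassword CRDs
    # Redundant only when the role does nothing beyond applying manifests.
    return "DELETE ROLE" if f.only_applies_manifests else "KEEP ROLE"


print(decide(RoleFindings(True, False, False, False, False)))
```

The ordering is a design choice: any non-Argo work keeps the role even if it also templates secrets, so a role is only split when secret templating is the sole blocker.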

Tasks

1) Inventory: build an app coverage matrix (must be committed)

Create a markdown file:

  • docs/argo_vs_ansible_deploy_audit.md

Include a table with columns:

  • Deploy role name
  • Role path
  • Corresponding Argo manifest(s) path(s) (or “MISSING”)
  • What the role does (1–3 bullets, derived from tasks/templates)
  • Redundant? (YES/NO)
  • Action (DELETE ROLE / CREATE ARGO APP / KEEP ROLE)
  • Notes / follow-ups (only if needed)

Rules:

  • Only include roles ending in _deploy.
  • Only include Argo apps inside the provided argocd/** tree.
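
A minimal sketch of how the table could be rendered, assuming each audit row is a plain dict keyed by column name. The function name `render_audit_table` is hypothetical, the example role path is assumed, and the "What the role does" cell is a placeholder because that text must come from reading the role.

```python
# Column headings taken from the list above (the last one shortened).
COLUMNS = [
    "Deploy role name",
    "Role path",
    "Corresponding Argo manifest(s)",
    "What the role does",
    "Redundant?",
    "Action",
    "Notes",
]


def render_audit_table(rows: list[dict[str, str]]) -> str:
    """Render audit rows as a markdown table; missing cells are left empty."""
    header = "| " + " | ".join(COLUMNS) + " |"
    divider = "| " + " | ".join("---" for _ in COLUMNS) + " |"
    body = [
        "| " + " | ".join(row.get(col, "") for col in COLUMNS) + " |"
        for row in rows
    ]
    return "\n".join([header, divider, *body])


example = [{
    "Deploy role name": "nfs_provisioner_deploy",
    "Role path": "ansible/roles/nfs_provisioner_deploy",  # assumed path
    "Corresponding Argo manifest(s)": "argocd/apps/platform/nfs-provisioner.yml",
    "What the role does": "(fill in from tasks/templates)",
    "Redundant?": "YES",
    "Action": "DELETE ROLE",
}]
print(render_audit_table(example))
```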

2) For each _deploy ...


Copilot AI self-assigned this Dec 31, 2025
Copilot AI and others added 7 commits December 31, 2025 01:52
Co-authored-by: SRF-Audio <16975040+SRF-Audio@users.noreply.github.com>
Co-authored-by: SRF-Audio <16975040+SRF-Audio@users.noreply.github.com>
…sioner_deploy

Co-authored-by: SRF-Audio <16975040+SRF-Audio@users.noreply.github.com>
…Velero

Co-authored-by: SRF-Audio <16975040+SRF-Audio@users.noreply.github.com>
Co-authored-by: SRF-Audio <16975040+SRF-Audio@users.noreply.github.com>
… heading

Co-authored-by: SRF-Audio <16975040+SRF-Audio@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Reconcile ArgoCD application manifests with legacy deploy roles" to "Audit and reconcile ArgoCD manifests vs Ansible deploy roles" Dec 31, 2025
Copilot AI requested a review from SRF-Audio December 31, 2025 02:07
Owner

@copilot I already have a postgres and redis instance in cluster, so I don't want this chart to deploy duplicate instances. Instead, it should be pointed at the existing instances, which you can see under:

  • argocd/apps/platform/postgres.yml
  • argocd/apps/platform/redis.yml

Owner

@copilot also, you need to make the same performance determination for these volumes as before. If it's not I/O bound, then let's map it directly to the retain NFS class. Only put it on local retain when it is a performance sensitive mount.

Owner

@copilot the ingress should be used exclusively as a "dummy" for Homepage service discovery annotations. Tailscale annotations should actually go on the cluster IP service.

Contributor Author

Configured NetBox to use existing postgres (postgres-postgresql.db-postgres.svc.cluster.local) and redis (redis-master.db-redis.svc.cluster.local) instances. Added OnePasswordItem CRDs for the external database and redis credentials. Persistence set to nfs-synology-retain (non-performance-sensitive). Moved Tailscale annotations to ClusterIP service, ingress now only used for Homepage discovery. Files moved from k8s/netbox/onepassword/ to k8s/netbox/. Commit: a3a20a7

Owner

@copilot please reference my other comments from this review about storage classes and performance, and let's fix this file as well.

Contributor Author

Velero doesn't use PVCs for its own data - it only manages backup storage locations which are configured externally (S3/cloud). No storage class changes were needed for this manifest. Commit: a3a20a7

Copilot AI and others added 2 commits December 31, 2025 05:48
… netbox files

- Frigate: Replace longhorn with nfs-synology-retain for config and media
- NetBox: Use existing postgres/redis instances, fix storage classes, move Tailscale to service
- NetBox: Move onepassword files from subdirectory to k8s/netbox/
- NetBox: Add separate OnePasswordItem CRDs for database and redis credentials

Co-authored-by: SRF-Audio <16975040+SRF-Audio@users.noreply.github.com>
Co-authored-by: SRF-Audio <16975040+SRF-Audio@users.noreply.github.com>
Copilot AI and others added 2 commits December 31, 2025 12:55
- Frigate: Remove Tailscale annotations from ingress, keep only Homepage annotations
- Frigate: Remove ingressClassName to make it a "dummy" ingress
- Frigate: Tailscale annotations remain on ClusterIP service only
- NetBox: Change namespace from 'netbox' to 'infra-netbox' across all manifests
- NetBox: Update all OnePasswordItem CRDs to use infra-netbox namespace

Co-authored-by: SRF-Audio <16975040+SRF-Audio@users.noreply.github.com>
@SRF-Audio SRF-Audio marked this pull request as ready for review December 31, 2025 13:16
@SRF-Audio SRF-Audio merged commit 287a33b into main Dec 31, 2025
3 of 12 checks passed
@SRF-Audio SRF-Audio deleted the copilot/reconcile-argocd-manifests branch December 31, 2025 13:16
Development

Successfully merging this pull request may close these issues.

Reconcile ArgoCD Application manifests vs legacy <app>_deploy roles

2 participants