@thomastaylor312
Contributor

Ok, so this one is a doozy, and full disclaimer, I hate it. But it is what is going to have to work.

Background

Due to the upcoming removal of the gogo proto dependency, we have to enable a build flag to re-enable the V1 proto message marker method ProtoMessage so things keep working with the Connect API. However, because the types no longer have a Descriptor method, we would get a panic when marshaling any message containing a Kubernetes type that uses bare map types, because there is no registered type for them. We already work around this in our own code with special protobuf_key and protobuf_val tags on our maps. Even worse, the k8s ObjectMeta type mostly works just fine, but only because of a quirk of message handling (detailed later).
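To make the tag workaround concrete, here is a minimal sketch using only the standard library. Both struct types and the helper `mapFieldsMissingEntryTags` are hypothetical names made up for illustration; the tag strings follow the layout protoc-gen-go typically emits for map fields, and the untagged variant is only meant to resemble how the Kubernetes types declare their maps.

```go
package main

import (
	"fmt"
	"reflect"
)

// taggedMessage is a hypothetical type whose map field carries the extra
// protobuf_key / protobuf_val tags describing the map entry's two fields.
type taggedMessage struct {
	Data map[string]string `protobuf:"bytes,1,rep,name=data,proto3" protobuf_key:"bytes,1,opt,name=key,proto3" protobuf_val:"bytes,2,opt,name=value,proto3"`
}

// bareMessage is a hypothetical type with the same map but only the
// top-level protobuf tag, i.e. a "bare" map in the sense used above.
type bareMessage struct {
	Data map[string]string `protobuf:"bytes,1,rep,name=data"`
}

// mapFieldsMissingEntryTags returns the names of map fields on the struct
// (or pointer to struct) v that lack protobuf_key or protobuf_val.
func mapFieldsMissingEntryTags(v any) []string {
	var missing []string
	t := reflect.TypeOf(v)
	if t == nil {
		return missing
	}
	for t.Kind() == reflect.Ptr {
		t = t.Elem()
	}
	if t.Kind() != reflect.Struct {
		return missing
	}
	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)
		if f.Type.Kind() != reflect.Map {
			continue
		}
		_, hasKey := f.Tag.Lookup("protobuf_key")
		_, hasVal := f.Tag.Lookup("protobuf_val")
		if !hasKey || !hasVal {
			missing = append(missing, f.Name)
		}
	}
	return missing
}

func main() {
	fmt.Println(mapFieldsMissingEntryTags(taggedMessage{}))
	fmt.Println(mapFieldsMissingEntryTags(&bareMessage{}))
}
```

A check like this only reports which fields are untagged; it says nothing about whether the protobuf runtime will actually reach them, which is what the quirk described later turns on.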

What this actually does

I tried messing around with the underlying Codec used by ConnectRPC, but the main problem is that you can't update a descriptor at runtime. As far as I could tell, that rules out using protoreflect to get around this, and proto.Marshal doesn't work like JSON marshaling, which recurses into each type in a struct and calls its MarshalJSON method. So the easiest (and very, very hacky) solution was to copy the types we had problems with (Secret and ConfigMap, as well as ObjectMeta) from the k8s proto files and add them to our service. We do have ServiceAccount, but it isn't really used by the UI and I don't think it has any top-level maps other than labels (which may work in this case?).
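For reviewers who want a feel for what copying a type implies, here is a rough sketch of the kind of conversion it requires. Everything in it is a stand-in: `k8sConfigMap`, `serviceConfigMap`, and `toServiceConfigMap` are hypothetical names, not the actual types or functions in this PR, and the real messages have more fields.

```go
package main

import "fmt"

// k8sConfigMap is a hypothetical stand-in for the upstream Kubernetes type.
type k8sConfigMap struct {
	Name       string
	Namespace  string
	Data       map[string]string
	BinaryData map[string][]byte
}

// serviceConfigMap is a hypothetical stand-in for a copied message type
// that lives in our own service definition.
type serviceConfigMap struct {
	Name       string
	Namespace  string
	Data       map[string]string
	BinaryData map[string][]byte
}

// toServiceConfigMap copies every field across, cloning maps and byte
// slices so the result shares no memory with the input.
func toServiceConfigMap(in *k8sConfigMap) *serviceConfigMap {
	if in == nil {
		return nil
	}
	out := &serviceConfigMap{Name: in.Name, Namespace: in.Namespace}
	if in.Data != nil {
		out.Data = make(map[string]string, len(in.Data))
		for k, v := range in.Data {
			out.Data[k] = v
		}
	}
	if in.BinaryData != nil {
		out.BinaryData = make(map[string][]byte, len(in.BinaryData))
		for k, v := range in.BinaryData {
			out.BinaryData[k] = append([]byte(nil), v...)
		}
	}
	return out
}

func main() {
	src := &k8sConfigMap{Name: "cfg", Namespace: "demo", Data: map[string]string{"k": "v"}}
	dst := toServiceConfigMap(src)
	src.Data["k"] = "changed" // does not affect dst because the map was cloned
	fmt.Println(dst.Name, dst.Data["k"])
}
```

The obvious cost of this approach is that every copied type needs a conversion in each direction, and those conversions have to be kept in sync with upstream by hand.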

Please note that our types work purely because of the quirk described below, where labels and annotations are just skipped and assumed empty when returned as a protobuf type. This doesn't affect our UI because we use the RawFormat option to get the YAML bytes back, which returns the full object.

The gnarly technical quirk, here there be dragons

I asked Claude to help me summarize it and its breakdown was perfect, so I present it here in its entirety:

Summary: Why ObjectMeta Works but ConfigMap Doesn't

The Key Discovery

The root cause is in how aberrantLoadMessageDescReentrant handles different Go types:

func aberrantLoadMessageDescReentrant(t reflect.Type, name protoreflect.FullName) protoreflect.MessageDescriptor {
    // ...
    if t.Kind() != reflect.Ptr || t.Elem().Kind() != reflect.Struct {
        return md  // Returns EMPTY descriptor for non-pointer types!
    }
    // ... process fields only for *struct types ...
}

For pointer types (*struct): The function processes all struct fields, including map fields that need protobuf_key/protobuf_val tags.

For value types (struct, not pointer): The function returns an empty descriptor without processing any fields.
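The guard quoted above can be tried out in isolation with the standard reflect package. This is only a sketch of the condition itself, not of the protobuf runtime: `objectMeta` and `fieldsWouldBeInspected` are hypothetical names used for illustration.

```go
package main

import (
	"fmt"
	"reflect"
)

// objectMeta is a hypothetical stand-in for a struct with map fields.
type objectMeta struct {
	Labels map[string]string
}

// fieldsWouldBeInspected applies the same kind check as the quoted guard:
// only a pointer to a struct gets past it. Anything else corresponds to
// the early return with an empty descriptor.
func fieldsWouldBeInspected(v any) bool {
	t := reflect.TypeOf(v)
	if t == nil {
		return false
	}
	return t.Kind() == reflect.Ptr && t.Elem().Kind() == reflect.Struct
}

func main() {
	fmt.Println(fieldsWouldBeInspected(&objectMeta{})) // pointer to struct
	fmt.Println(fieldsWouldBeInspected(objectMeta{}))  // plain struct value
}
```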

How This Affects ConfigMap vs Stage

ConfigMap (*v1.ConfigMap):

  1. ConfigMap is referenced in service.pb.go as a pointer type
  2. aberrantLoadMessageDescReentrant processes ConfigMap's fields
  3. ConfigMap's Data field is a map[string]string without protobuf_key/protobuf_val tags
  4. This creates a malformed map entry descriptor (fields have number 0 instead of 1 and 2)
  5. During needsInitCheck, processing the Data field hits this malformed descriptor and panics

Stage (*v1alpha1.Stage):

  1. Stage is referenced as a pointer type
  2. aberrantLoadMessageDescReentrant processes Stage's fields
  3. Stage's fields are:
  • ObjectMeta - embedded as value type (struct, not pointer)
  • Spec StageSpec - embedded as value type
  • Status StageStatus - embedded as value type
  4. For each embedded struct field, when aberrantLoadMessageDescReentrant is called, the type is a struct (not pointer), so it short-circuits and returns an empty descriptor
  5. These empty descriptors have no fields
  6. During needsInitCheck, when recursing into ObjectMeta, Spec, or Status, there are no fields to iterate
  7. ObjectMeta's Labels and Annotations map fields are never seen → No panic

Visual Comparison

ConfigMap (fails):
├── metadata (ObjectMeta) - struct type → empty descriptor ✓
├── Data map[string]string - map type → MALFORMED map entry → PANIC ✗
└── BinaryData map[string][]byte - map type → MALFORMED map entry → PANIC ✗

Stage (works):
├── ObjectMeta - struct type → empty descriptor (hides Labels/Annotations maps) ✓
├── Spec (StageSpec) - struct type → empty descriptor ✓
└── Status (StageStatus) - struct type → empty descriptor ✓
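The comparison above can be reproduced with a toy model built on the reflect package. This is not the protobuf runtime's code: it only imitates the behavior described in this summary (fields are walked for pointer-to-struct types, and any other type yields nothing). All type and function names are hypothetical, and the walker assumes the types are not self-referential.

```go
package main

import (
	"fmt"
	"reflect"
)

// Hypothetical stand-ins shaped like the types in the comparison above.
type objectMeta struct {
	Labels      map[string]string
	Annotations map[string]string
}

type configMap struct {
	Metadata   objectMeta
	Data       map[string]string
	BinaryData map[string][]byte
}

type stageSpec struct {
	Vars map[string]string
}

type stage struct {
	objectMeta // embedded by value, like ObjectMeta in Stage
	Spec       stageSpec
}

// visibleMapFields lists the map fields the toy walker reaches. A type that
// is not a pointer to a struct is the "empty descriptor" case: none of its
// fields are looked at, so maps nested inside it stay hidden.
func visibleMapFields(t reflect.Type) []string {
	var seen []string
	if t == nil || t.Kind() != reflect.Ptr || t.Elem().Kind() != reflect.Struct {
		return seen
	}
	st := t.Elem()
	for i := 0; i < st.NumField(); i++ {
		f := st.Field(i)
		if f.Type.Kind() == reflect.Map {
			seen = append(seen, st.Name()+"."+f.Name)
			continue
		}
		seen = append(seen, visibleMapFields(f.Type)...)
	}
	return seen
}

// visibleMapFieldsOf is a convenience wrapper taking a value.
func visibleMapFieldsOf(v any) []string {
	return visibleMapFields(reflect.TypeOf(v))
}

func main() {
	fmt.Println(visibleMapFieldsOf(&configMap{}))
	fmt.Println(visibleMapFieldsOf(&stage{}))
}
```

In this model the walker reaches Data and BinaryData on the ConfigMap stand-in but reaches no maps at all on the Stage stand-in, because every map there sits behind a struct value.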

The Irony

ObjectMeta works not because it's special, but because:

  1. It's always embedded as a value type (not pointer)
  2. The aberrant loading code accidentally skips non-pointer types
  3. This bug/quirk hides ObjectMeta's problematic map fields from the init check

ConfigMap fails because it has direct map fields at the top level of its struct. These maps don't have protobuf_key/protobuf_val tags, and since ConfigMap is processed as a pointer type, these fields are encountered and their malformed descriptors cause the panic.

@thomastaylor312 thomastaylor312 requested review from a team as code owners January 10, 2026 23:16
@netlify

netlify bot commented Jan 10, 2026

Deploy Preview for docs-kargo-io ready!

Name Link
🔨 Latest commit d0a0380
🔍 Latest deploy log https://app.netlify.com/projects/docs-kargo-io/deploys/6966eab4182c9b0008677b87
😎 Deploy Preview https://deploy-preview-5562.docs.kargo.io

// "master" is still the default branch name for a new repository unless
// you configure it otherwise.
&AddWorkTreeOptions{Ref: "master"},
// Don't assume a default branch name ("main" vs "master").
Contributor Author

I kept running into issues with the test here, so I made this more flexible. It shouldn't change what the tests themselves verify, but let me know if I missed something.

@thomastaylor312
Contributor Author

Please note that the backend is ready to go and I've spot-checked functionality, but the UI checks are obviously failing and I still need to do a more thorough manual test. Otherwise, it is pretty much ready for review as is.

@codecov

codecov bot commented Jan 10, 2026

Codecov Report

❌ Patch coverage is 7.91367% with 128 lines in your changes missing coverage. Please review.
✅ Project coverage is 55.75%. Comparing base (539cc4a) to head (d0a0380).

Files with missing lines                        Patch %   Lines
api/service/v1alpha1/service_conversions.go     0.00%     105 Missing ⚠️
pkg/cli/cmd/get/credentials.go                  0.00%     6 Missing ⚠️
pkg/cli/cmd/get/tokens.go                       0.00%     6 Missing ⚠️
pkg/cli/cmd/update/credentials.go               0.00%     2 Missing ⚠️
pkg/server/list_api_tokens_v1alpha1.go          0.00%     2 Missing ⚠️
pkg/server/list_repo_credentials_v1alpha1.go    0.00%     2 Missing ⚠️
pkg/cli/cmd/create/credentials.go               0.00%     1 Missing ⚠️
pkg/cli/cmd/create/token.go                     0.00%     1 Missing ⚠️
pkg/server/create_api_token_v1alpha1.go         0.00%     1 Missing ⚠️
pkg/server/get_api_token_v1alpha1.go            0.00%     1 Missing ⚠️
... and 1 more
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #5562      +/-   ##
==========================================
- Coverage   55.95%   55.75%   -0.20%     
==========================================
  Files         423      424       +1     
  Lines       32088    32203     +115     
==========================================
+ Hits        17954    17955       +1     
- Misses      13084    13198     +114     
  Partials     1050     1050              


@thomastaylor312 thomastaylor312 force-pushed the feat/k8s_upgrade_and_fix branch from 97090b2 to 55a2e3a Compare January 10, 2026 23:25
@krancour
Member

krancour commented Jan 11, 2026

✅ Build flags

❓ Is the hack of defining our own knock-off corev1 types worth it if we plan to transition away from protobufs? I'm not against it, but want to understand the risks.

@thomastaylor312
Contributor Author

Putting this context here from an offline convo: turning on just the build flags isn't sufficient, because anything involving k8s objects with maps panics.

This updates all k8s libraries to 1.35 and enables the back compat build
flag for generated protobuf files. We plan on adding a REST-ish API and
deprecating our use of Connect RPC in 1.10

Signed-off-by: Taylor Thomas <[email protected]>
@thomastaylor312 thomastaylor312 force-pushed the feat/k8s_upgrade_and_fix branch from 4ca2fd8 to d0a0380 Compare January 14, 2026 01:00