Refactor base stages #92

UdeshyaDhungana · 2025-08-29T11:55:45Z

@rohitpaulk Requesting an early review.

The refactor is not yet complete in the following respects:

serializer_legacy is still being used in base stages
Encoder interface is yet to be refactored (the hexdump printing part can be incorporated into encoding, it seems)

As of now, the only major changes I've made are in decoder/decoder.go and api_versions_response.go.

Also, I found myself duplicating a lot of files, so I'm not sure if I'm correctly following the approach.

rohitpaulk

Let's remove all unused code and focus on the parts that we actually need - simpler to review, and makes it more intentional when we add more features

rohitpaulk · 2025-08-29T18:45:31Z

protocol/decoder/decoder.go

+	logger             *logger.Logger
+	backupLogger       *logger.Logger
+	indentationLevel   int
+	currentSectionName string


Where is this used? I see it set but never see it being read. If this is actually being used, then there are a bunch of error cases - for example, calling StartSubSection("section_1") followed by EndSubSection() doesn't reset currentSectionName, it still remains as section_1.

rohitpaulk · 2025-08-29T18:50:57Z

protocol/decoder/decoder.go

+	currentSectionName string
+}
+
+func (d *Decoder) Init(bytes []byte, logger *logger.Logger) {


Rather than Decoder.Init (which means users have to create a decoder and then call Init on it), do NewDecoder() instead - so it's clear the object needs to be created using a specific method

rohitpaulk · 2025-08-29T18:53:56Z

protocol/decoder/decoder.go

+
+/* Basic Types */
+
+func (d *Decoder) GetInt8(variableName string) (int8, error) {


Suggested change

-func (d *Decoder) GetInt8(variableName string) (int8, error) {
+func (d *Decoder) ReadInt8(variableName string) (int8, error) {

Let's name all of these ReadXYZ instead of GetXYZ - a bit more suited for the task + indicates that there is some mutation happening

rohitpaulk · 2025-08-29T18:54:49Z

protocol/decoder/decoder.go

+	return nil
+}
+
+func (d *Decoder) Remaining() int {


Suggested change

-func (d *Decoder) Remaining() int {
+func (d *Decoder) UnreadByteCount() int {

(Ties to the name Read, making it more clear what "remaining" means + adds Count() making it more clear what the type is)
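
The naming suggestions so far (NewDecoder, ReadXYZ, UnreadByteCount) can be sketched together as below. This is a minimal illustration, not the PR's actual decoder: the logger argument is left out as an assumption to keep the example self-contained, and the error message format is made up.

```go
package main

import "fmt"

// Decoder is a simplified stand-in for the decoder in this PR; the logger is
// left out so the sketch stays self-contained.
type Decoder struct {
	bytes  []byte
	offset int
}

// NewDecoder is the single entry point for creating a Decoder, replacing the
// create-then-Init pattern.
func NewDecoder(bytes []byte) *Decoder {
	return &Decoder{bytes: bytes}
}

// UnreadByteCount returns how many bytes the ReadXYZ calls have not consumed yet.
func (d *Decoder) UnreadByteCount() int {
	return len(d.bytes) - d.offset
}

// ReadInt8 consumes one byte and advances the offset, which is the mutation
// the Read prefix is meant to signal.
func (d *Decoder) ReadInt8(variableName string) (int8, error) {
	if d.UnreadByteCount() < 1 {
		return 0, fmt.Errorf("expected 1 byte for %s, got 0", variableName)
	}
	value := int8(d.bytes[d.offset])
	d.offset++
	return value, nil
}

func main() {
	d := NewDecoder([]byte{0x7f, 0xff})
	first, _ := d.ReadInt8("first")
	second, _ := d.ReadInt8("second")
	_, err := d.ReadInt8("third")
	fmt.Println(first, second, d.UnreadByteCount(), err)
}
```

Because the fields are unexported and the constructor is the only way in, a caller cannot end up holding a decoder that was never initialized.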

rohitpaulk · 2025-08-29T18:55:12Z

protocol/decoder/decoder.go

+	return len(d.bytes) - d.offset
+}
+
+// ConsumeRawBytes is parallels version of GetRawBytes


What is GetRawBytes? Don't see that anywhere. Also, spelling mistake - "parallels"

rohitpaulk · 2025-08-29T18:59:07Z

protocol/decoder/decoder.go

+	}
+}
+
+func (d *Decoder) MuteLogger() {


I don't see this used at the moment either - is there a specific place where we're using this? Particularly interested to see if there are cases where we need to mute a specific part of decoding, or if it's that we want to mute the decoder entirely (which could be done by just passing in a quiet logger to NewDecoder)

rohitpaulk · 2025-08-29T18:59:55Z

protocol/decoder/decoder.go

+	bytes              []byte
+	offset             int
+	logger             *logger.Logger
+	backupLogger       *logger.Logger


Not a great name - you want to optimize for a person understanding what this is used for without having to look at the implementation. The only way I'd know what this is used for currently is by searching for usages and figuring out why we need a "backup"

rohitpaulk · 2025-08-29T19:01:50Z

protocol/encoder/encoder.go

+	offset int
+}
+
+func (re *Encoder) Init(raw []byte) {


Same note as for Decoder - use NewEncoder

rohitpaulk · 2025-08-29T19:03:26Z

protocol/encoder/encoder.go

+	re.offset = 0
+}
+
+func (re *Encoder) PutRawBytesAt(in []byte, offset int, length int) {


I don't see any of these used at the moment, let's stick to the functions we need at the moment so it's easier to review along w/ usages

rohitpaulk · 2025-08-29T19:03:55Z

protocol/encoder/encoder.go

+
+// primitives
+
+func (re *Encoder) PutInt8(in int8) {


Suggested change

-func (re *Encoder) PutInt8(in int8) {
+func (re *Encoder) WriteInt8(in int8) {

(Let's use WriteXYZ instead of PutXYZ - parallel to Decoder's Read)

rohitpaulk · 2025-09-02T05:23:17Z

Some notes on the functionality overall:

Looks like in some cases we're printing empty field names:

Looking at the diff from old to new, the older style of using "- " for indents actually seems more clear, can you stick to that format please?

rohitpaulk · 2025-09-02T05:31:44Z

protocol/encoder/encoder.go

+	return re.offset
+}
+
+func (re *Encoder) AllBytes() []byte {


The interface for this could be far simpler. I wouldn't expose Offset and AllBytes - no caller is going to require these values, they'll only need the "written" bytes, and doing len(bytes) will give them the offset.

I'd remove both of these and rename EncodedBytes to just Bytes.

rohitpaulk · 2025-09-02T05:31:59Z

protocol/encoder/encoder.go

+	offset int
+}
+
+func NewEncoder(bytes []byte) *Encoder {


Suggested change

-func NewEncoder(bytes []byte) *Encoder {
+func NewEncoder() *Encoder {

Don't think we need bytes to be passed in? Could just instantiate an array here

rohitpaulk · 2025-09-02T05:32:23Z

protocol/encoder/encoder.go

+)
+
+type Encoder struct {
+	bytes  []byte


I think you can use bytes.Buffer here and not have to track offset/bytes etc. on your own: https://pkg.go.dev/bytes#Buffer
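
A minimal sketch of what the encoder comments add up to, assuming the encoder only needs to append values: bytes.Buffer tracks the written bytes, NewEncoder takes no arguments, and Bytes is the only accessor. WriteInt16 is included purely for illustration; the big-endian byte order follows the Kafka wire format.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// Encoder appends values to an internal buffer; there is no separate offset
// to keep in sync because bytes.Buffer tracks the written length itself.
type Encoder struct {
	buf bytes.Buffer
}

// NewEncoder needs no arguments: the zero value of bytes.Buffer is ready to use.
func NewEncoder() *Encoder {
	return &Encoder{}
}

// WriteInt8 appends a single byte.
func (e *Encoder) WriteInt8(v int8) {
	e.buf.WriteByte(byte(v))
}

// WriteInt16 appends two bytes in big-endian (network) order.
func (e *Encoder) WriteInt16(v int16) {
	var tmp [2]byte
	binary.BigEndian.PutUint16(tmp[:], uint16(v))
	e.buf.Write(tmp[:])
}

// Bytes returns everything written so far; len(e.Bytes()) doubles as the offset.
func (e *Encoder) Bytes() []byte {
	return e.buf.Bytes()
}

func main() {
	e := NewEncoder()
	e.WriteInt8(-1)
	e.WriteInt16(18)
	fmt.Printf("% x\n", e.Bytes())
}
```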

rohitpaulk · 2025-09-02T05:36:33Z

protocol/kafkaapi/api_versions_request.go

+
+func (r ApiVersionsRequestBody) encode(enc *encoder.Encoder) {
+
+	if r.Version < 3 {


Suggested change

-if r.Version < 3 {
+if r.Version != 3 {

(We support exactly one version if I'm not wrong, so let's make an exact check)

rohitpaulk · 2025-09-02T05:37:26Z

protocol/kafkaapi/api_versions_request.go

+	enc.WriteEmptyTagBuffer()
+}
+
+func (r ApiVersionsRequestBody) Encode() []byte {


Let's merge Encode and encode, don't need the split, doesn't really make it any more readable

rohitpaulk · 2025-09-02T05:40:51Z

protocol/kafkaapi/api_versions_response.go

+	ThrottleTimeMs int32
+}
+
+func (r *ApiVersionsResponseBody) Decode(d *decoder.Decoder, version int16, logger *logger.Logger, indentation int) (err error) {


Doesn't look like we're using logger and indentation anywhere here.

rohitpaulk · 2025-09-02T05:47:52Z

protocol/kafkaapi/api_versions_response.go

+		panic("CodeCrafters Internal Error: ApiVersionsResponseBody.Version is not initialized")
+	}
+
+	if err := r.Body.Decode(decoder, r.Body.Version, logger, 1); err != nil {


We're validating that r.Body.Version is not equal to zero, and then we're passing that same thing into r.Body.Decode - doesn't r.Body.Decode already have access to Version?

rohitpaulk · 2025-09-02T05:49:48Z

protocol/kafkaapi/api_versions_response.go

+// ApiKeyEntry contains the APIs supported by the broker.
+type ApiKeyEntry struct {
+	// Version defines the protocol version to use for encode and decode
+	Version int16


Let's try to get rid of this "Version" field both on ApiKeyEntry and ResponseBody and move it to the top-level ApiVersionsResponse. This isn't actual "state", it's just being passed down for convenience and this can instead be done by passing down a variable to the Decode functions.

UdeshyaDhungana

@rohitpaulk Sure. I think we should only move it as far up as ResponseBody, because it seems like different body versions can co-exist with the same header version.

For example, both body versions (1 and 2) co-exist with the same header version. So I think the version should be tied to the body instead of the response. I'll remove the Version property and just pass it down to any sub-fields of the body.
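
To illustrate passing the version down instead of storing it as state, here is a hedged sketch of an entry decoder that receives the version as a parameter. It uses a plain bytes.Reader rather than the PR's Decoder, decodeEntry is a hypothetical helper, and it assumes that from version 3 onwards each entry is followed by an empty tag buffer (a single 0x00 byte) that can simply be skipped.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// ApiKeyEntry holds only decoded data; the protocol version is not stored on it.
type ApiKeyEntry struct {
	ApiKey     int16
	MinVersion int16
	MaxVersion int16
}

// Decode reads the three int16 fields and, for flexible versions, skips the
// trailing tag buffer. The version arrives as an argument from the caller.
func (e *ApiKeyEntry) Decode(r *bytes.Reader, version int16) error {
	if err := binary.Read(r, binary.BigEndian, e); err != nil {
		return fmt.Errorf("decoding ApiKeyEntry: %w", err)
	}
	if version >= 3 {
		// Assumption: the tag buffer is empty, i.e. exactly one byte.
		if _, err := r.ReadByte(); err != nil {
			return fmt.Errorf("decoding ApiKeyEntry tag buffer: %w", err)
		}
	}
	return nil
}

// decodeEntry is a hypothetical helper that returns the entry and the number
// of bytes left unread.
func decodeEntry(data []byte, version int16) (ApiKeyEntry, int, error) {
	r := bytes.NewReader(data)
	var entry ApiKeyEntry
	err := entry.Decode(r, version)
	return entry, r.Len(), err
}

func main() {
	entry, unread, err := decodeEntry([]byte{0, 18, 0, 0, 0, 4, 0}, 3)
	fmt.Println(entry, unread, err)
}
```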

rohitpaulk · 2025-09-02T05:50:38Z

protocol/kafkaapi/api_versions_response.go

Overall this file still feels kind of complex, let's spend some extra time and try to make it as minimal as possible

rohitpaulk · 2025-09-02T05:51:28Z

protocol/kafkaapi/headers/request_header.go

+}
+
+// encode v2
+func (h RequestHeader) encode(enc *encoder.Encoder) {


Let's merge encode and Encode, the split doesn't really help with readability - also the "encode v2" comment here doesn't make a lot of sense

rohitpaulk · 2025-09-02T05:53:48Z

@UdeshyaDhungana let's get the tests passing too (so we can see what the exact changes to fixtures are)

rohitpaulk · 2025-09-02T05:55:28Z

> Encoder interface is yet to be refactored (the hexdump printing part can be incorporated into encoding, it seems)

Whoops, just saw this! Okay feel free to ignore my comments on encoder if these will be part of a separate PR

UdeshyaDhungana · 2025-09-02T10:53:42Z

Applied suggested fixes
Tried to make api_versions_response.go a bit cleaner
Refactored encoder in the same PR
Log format is now the same as old (with - and .)
CI: regenerated fixtures
Renamed serializer -> files_manager
Base stages now only use the function for writing metadata and configs, not the one for log dirs.

rohitpaulk

@UdeshyaDhungana there's too much going on in a single PR here, let's split please - hard to review properly. For example the files_manager stuff can be split into a separate PR

UdeshyaDhungana · 2025-09-03T05:06:27Z

> @UdeshyaDhungana there's too much going on in a single PR here, let's split please - hard to review properly. For example the files_manager stuff can be split into a separate PR

I've removed the files_manager part for now. I'll move that into a separate PR.

rohitpaulk

Once the major issues (like printing log file details to CLI) are resolved, let's merge this in and open smaller PRs to tackle the rest of stuff. Let's try to keep PRs under 300 lines

rohitpaulk · 2025-09-03T06:34:49Z

internal/test_helpers/fixtures/base/pass

@@ -1,18 +1,25 @@
 Debug = true

 [33m[tester::#PV1] [0m[94mRunning tests for Stage #PV1 (pv1)[0m
+[33m[tester::#PV1] [Serializer] [0m[36mWriting log files to: /tmp/kraft-combined-logs[0m


@UdeshyaDhungana these changes don't seem to be intentional - we don't want to print things like log files in early stages

rohitpaulk · 2025-09-03T06:37:16Z

internal/test_helpers/fixtures/base/pass

+[33m[tester::#PV1] [Decoder] [0m[36m  - .ApiVersions Response Body[0m
+[33m[tester::#PV1] [Decoder] [0m[36m    - .error_code (0)[0m
+[33m[tester::#PV1] [Decoder] [0m[36m    - .ApiKeys.Length (62)[0m
+[33m[tester::#PV1] [Decoder] [0m[36m    - .ApiKeys[0][0m


Can be tackled in a separate PR since this one is already pretty large, but we need some kind of standard around the casing we use for things here. I see three distinct types of casing used here, i.e. the same field could've been named "Error Code", "error_code", or "ErrorCode" because there's no consistency among how we name fields.

rohitpaulk · 2025-09-03T06:38:07Z

internal/stage_2.go

-	if err != nil {
+	stageLogger := stageHarness.Logger
+
+	if err := serializer_legacy.GenerateLogDirs(stageHarness.Logger, true); err != nil {


Related to the comment below, think we still need to pass in QuietLogger here.

rohitpaulk · 2025-09-03T06:41:51Z

protocol/decoder/decoder.go

+	}
+}
+
+func (d *Decoder) Offset() int {


Can probably be avoided, the only usage I see is in FormatDetailedError. I'd imagine that all other public usages would only care about UnreadBytesCount and won't need the exact offset.

rohitpaulk · 2025-09-03T06:45:28Z

protocol/decoder/decoder.go

+	d.unindentLog()
+}
+
+func (d *Decoder) logDecodedValue(variableName string, value any) {


Yep, definitely not clear from usage. I'd suggest just creating separate versions of the methods. Something like:

ReadInt16WithoutLogging() ReadInt16WithLogging("variableName")

And the non-silent version can re-use the other internally. And if the "silent" versions aren't used externally we can even make them private (readInt16WithoutLogging).
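
A rough sketch of that split, assuming a simplified Decoder whose logger is just a callback (the real one takes a logger object): the logging variant wraps the private silent one, so the read logic lives in exactly one place. The log line format here only approximates the fixture output.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// Decoder is a simplified stand-in; log is a plain callback instead of the
// PR's logger type so the sketch stays self-contained.
type Decoder struct {
	bytes  []byte
	offset int
	log    func(line string)
}

// readInt16WithoutLogging does the actual read and never logs.
func (d *Decoder) readInt16WithoutLogging() (int16, error) {
	if len(d.bytes)-d.offset < 2 {
		return 0, errors.New("expected 2 bytes for int16")
	}
	value := int16(binary.BigEndian.Uint16(d.bytes[d.offset:]))
	d.offset += 2
	return value, nil
}

// ReadInt16WithLogging reuses the silent variant and logs the decoded value.
func (d *Decoder) ReadInt16WithLogging(variableName string) (int16, error) {
	value, err := d.readInt16WithoutLogging()
	if err != nil {
		return 0, err
	}
	d.log(fmt.Sprintf("- .%s (%d)", variableName, value))
	return value, nil
}

func main() {
	d := &Decoder{
		bytes: []byte{0x00, 0x23, 0x00, 0x01},
		log:   func(line string) { fmt.Println(line) },
	}
	d.ReadInt16WithLogging("error_code") // logs the value
	d.readInt16WithoutLogging()          // reads silently
}
```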

rohitpaulk · 2025-09-03T07:01:03Z

protocol/decoder/decoder.go

+		return 0, err
+	}
+
+	if decodedInteger == 0 {


Is it intended that we read a value here and don't log it? Also, we're essentially treating the values 0 and 1 the same (here we return 0 directly, below we do 1-1 which is also 0)
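
For context on why 0 and 1 should not collapse, assuming the value being read is the length prefix of a Kafka compact array: the prefix is an unsigned varint holding N+1, where 0 means a null array and 1 means an empty one. The sketch below keeps the two cases apart; readCompactArrayLength is a hypothetical standalone function, not the PR's method.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// readCompactArrayLength decodes the unsigned varint length prefix of a
// compact array. A raw value of 0 means "null array"; any other raw value N
// means the array has N-1 elements. bytesRead is the size of the prefix.
func readCompactArrayLength(data []byte) (length int, isNull bool, bytesRead int, err error) {
	raw, n := binary.Uvarint(data)
	if n <= 0 {
		return 0, false, 0, errors.New("invalid or truncated unsigned varint")
	}
	if raw == 0 {
		return 0, true, n, nil
	}
	return int(raw - 1), false, n, nil
}

func main() {
	for _, prefix := range [][]byte{{0x00}, {0x01}, {0x3f}} {
		length, isNull, _, err := readCompactArrayLength(prefix)
		fmt.Println(length, isNull, err)
	}
}
```

Returning the null flag separately lets the caller decide whether a null array is acceptable for a given field, instead of silently treating it as empty.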

rohitpaulk · 2025-09-03T07:03:49Z

protocol/decoder/decoder.go

+
+	if err != nil {
+		if decodingErr, ok := err.(*errors.PacketDecodingError); ok {
+			return -1, decodingErr.AddContexts("ARRAY_LENGTH", variableName)


This Contexts thing seems to account for a lot of code here. It would look a lot simpler if this wasn't needed. I'd think a bit more about exactly what kind of error message we're trying to construct and see if what's needed could be stored as state on Decoder.

Can be a separate PR, might need some thinking re: approach

rohitpaulk · 2025-09-03T07:06:03Z

protocol/kafka_interface/interface.go

+	GetHeader() headers.RequestHeader
+}
+
+type Encodable interface {


Let's remove this - isn't used

rohitpaulk · 2025-09-03T07:07:08Z

protocol/kafkaapi/api_versions_response.go

+	decoder.BeginSubSection("ApiVersions Response")
+
+	defer func() {
+		decoder.EndCurrentSubSection()


Let's move this up, directly below BeginSubSection, as defer decoder.EndCurrentSubSection() - defer calls are easier to read when they're placed close to the action they're reversing
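
A small sketch of the suggested placement, using a stand-in sectionLogger type (an assumption, not the PR's Decoder) that just records events. The deferred call still runs last even though it is written on the line right after BeginSubSection.

```go
package main

import "fmt"

// sectionLogger is a stand-in that records begin/end events in order.
type sectionLogger struct {
	events []string
}

func (s *sectionLogger) BeginSubSection(name string) {
	s.events = append(s.events, "begin "+name)
}

func (s *sectionLogger) EndCurrentSubSection() {
	s.events = append(s.events, "end")
}

// decodeResponse pairs the defer with the call it reverses, so the two are
// read together and cannot drift apart as the function grows.
func decodeResponse(s *sectionLogger) {
	s.BeginSubSection("ApiVersions Response")
	defer s.EndCurrentSubSection()

	s.events = append(s.events, "decode body")
}

func main() {
	s := &sectionLogger{}
	decodeResponse(s)
	fmt.Println(s.events)
}
```

In the PR the deferred closure also wraps the error with context, so that part would need to move elsewhere for the one-line defer to work.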

rohitpaulk · 2025-09-03T07:08:45Z

protocol/kafkaapi/api_versions_response.go

+		decoder.EndCurrentSubSection()
+
+		if decodingErr, ok := err.(*errors.PacketDecodingError); ok {
+			detailedError := decodingErr.AddContexts("ApiVersions Response")


I think a lot of complexity for this comes from the ordering - AddContexts is only done when the error is constructed, instead of before the decoding is done. Might be worth looking into merging the ideas of "SubSection" and "Context" (again, can be separate PR)
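
One possible shape for merging the two ideas, offered as a sketch rather than a proposal for the exact API: the decoder keeps a stack of section names that is pushed before decoding starts, and any error built while decoding reads its context from that stack. The path field, the errorf helper, and the message format are all hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// Decoder keeps the current section path as state, so error context is
// known before a failure happens rather than attached afterwards.
type Decoder struct {
	bytes  []byte
	offset int
	path   []string
}

func (d *Decoder) BeginSubSection(name string) {
	d.path = append(d.path, name)
}

func (d *Decoder) EndCurrentSubSection() {
	if len(d.path) > 0 {
		d.path = d.path[:len(d.path)-1]
	}
}

// errorf builds an error that already carries the full section path.
func (d *Decoder) errorf(message string) error {
	return fmt.Errorf("%s (while decoding %s)", message, strings.Join(d.path, " > "))
}

// ReadInt8 treats the variable name as the innermost section for the
// duration of the read.
func (d *Decoder) ReadInt8(variableName string) (int8, error) {
	d.BeginSubSection(variableName)
	defer d.EndCurrentSubSection()

	if d.offset >= len(d.bytes) {
		return 0, d.errorf("expected 1 byte, got 0")
	}
	value := int8(d.bytes[d.offset])
	d.offset++
	return value, nil
}

func main() {
	d := &Decoder{}
	d.BeginSubSection("ApiVersions Response")
	defer d.EndCurrentSubSection()

	_, err := d.ReadInt8("error_code")
	fmt.Println(err)
}
```

With this shape, callers no longer need a type assertion plus AddContexts at every level; the same stack could also drive log indentation.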

cursor · 2025-09-03T14:47:36Z

internal/stage_4.go


-	errorCode, err := decoder.GetInt16()
+	errorCode, err := decoder.ReadInt16("error_code")
+


Bug: Error Handling Regression: Missing Context and Hexdump

It looks like we lost some detailed error handling for decoding failures across these stages. Previously, decoding errors included context and formatted hexdump details, which were really helpful for debugging. Now, we're just returning raw errors, making it tougher to diagnose issues, especially in error-testing scenarios like stage 4.

Additional Locations (2)

internal/stage_2.go#L67-L70

internal/stage_3.go#L75-L78

UdeshyaDhungana added 19 commits August 29, 2025 13:09

Rename decoder -> decoder_legacy

0977102

Suffix modules with *_legacy

2e38b0d

Rename all usages

a492573

Rename all remaining usages

9d07a5e

Rename errors -> errors_legacy

be0cb21

assertions package is now legacy

a7ba1d5

Add new decoder and errors package

a18ba77

Rename builder to builder_legacy

c3f9b6d

Rename builder_legacy usages

9a570ff

Merge branch 'rename-modules' into temp

27f017f

migrate upto stage 4

c8d1f04

Add remaining files

f292d88

client and interface moved to legacy

3969d77

Resolve merge conflict

6d88217

move encoder to legacy

10411d7

Merge branch 'rename-modules' into temp

830cd46

serializer -> serializer_legacy

2bc71fa

Resolve merge conflict

8c8175b

First round of refactor

9171cb6

UdeshyaDhungana requested a review from rohitpaulk August 29, 2025 11:56

rohitpaulk requested changes Aug 29, 2025

Apply suggested fixes

bafa2da

UdeshyaDhungana requested a review from rohitpaulk September 2, 2025 04:08

rohitpaulk requested changes Sep 2, 2025

Remove the usage of serializer_legacy in base stages

88869fd

UdeshyaDhungana added 6 commits September 2, 2025 12:29

Minor refactor

3c4cecf

Refactor files manager

70533ee

Encoder interface changed

cf33239

Remove logger and indentation in ApiVersionsResponseBody.Decode()

6a52ba2

Change decoder printing format and refactor api_versions_response.go

7662386

Refactor and regenerate fixtures

8bc159e

Refactor base stages and dependencies

8b662c7

UdeshyaDhungana added 2 commits September 2, 2025 16:43

Refactor decoder

97cf7fa

Refactor decoder

17468a4

UdeshyaDhungana requested a review from rohitpaulk September 2, 2025 11:02

rohitpaulk requested changes Sep 2, 2025

Remove files manager for refactor-base-stages

ac4ba7e

UdeshyaDhungana requested a review from rohitpaulk September 3, 2025 05:06

rohitpaulk approved these changes Sep 3, 2025

Reove printing log files details to CLI in base stages

d9bf369

UdeshyaDhungana added 2 commits September 3, 2025 17:37

Merge with main

0d6c84c

Add missing files after git merge

b488f96

UdeshyaDhungana changed the base branch from rename-modules to main September 3, 2025 14:26

UdeshyaDhungana added 2 commits September 3, 2025 20:27

Minor decoder refactor

76723a5

CI: Re-record Fixtures

391be3d

cursor bot reviewed Sep 3, 2025

UdeshyaDhungana merged commit c2e71c3 into main Sep 3, 2025
3 checks passed

