-
Notifications
You must be signed in to change notification settings - Fork 2
Refactor base stages #92
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's remove all unused code and focus on the parts that we actually need - simpler to review, and makes it more intentional when we add more features
protocol/decoder/decoder.go
Outdated
logger *logger.Logger | ||
backupLogger *logger.Logger | ||
indentationLevel int | ||
currentSectionName string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where is this used? I see it set but never see it being read. If this is actually being used, then there are bunch of error cases - for example, calling StartSubSection("section_1")
followed by EndSubSection()
doesn't reset currentSectionName
, it still remains as section_1
.
protocol/decoder/decoder.go
Outdated
currentSectionName string | ||
} | ||
|
||
func (d *Decoder) Init(bytes []byte, logger *logger.Logger) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather than Decoder.Init
(which means users have to create a decoder and then call Init
on it), do NewDecoder()
instead - so it's clear the object needs to be created using a specific method
protocol/decoder/decoder.go
Outdated
|
||
/* Basic Types */ | ||
|
||
func (d *Decoder) GetInt8(variableName string) (int8, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
func (d *Decoder) GetInt8(variableName string) (int8, error) { | |
func (d *Decoder) ReadInt8(variableName string) (int8, error) { |
Let's name all of these ReadXYZ
instead of GetXYZ
- a bit more suited for the task + indicates that there is some mutation happening
protocol/decoder/decoder.go
Outdated
return nil | ||
} | ||
|
||
func (d *Decoder) Remaining() int { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
func (d *Decoder) Remaining() int { | |
func (d *Decoder) UnreadByteCount() int { |
(Ties to the name Read
, making it more clear what "remaining" means + adds Count()
making it more clear what the type is)
protocol/decoder/decoder.go
Outdated
return len(d.bytes) - d.offset | ||
} | ||
|
||
// ConsumeRawBytes is parallels version of GetRawBytes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is GetRawBytes
? Don't see that anywhere. Also, spelling mistake - "parallels"
protocol/decoder/decoder.go
Outdated
} | ||
} | ||
|
||
func (d *Decoder) MuteLogger() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see this used at the moment either - is there a specific place where we're using this? Particularly interested to see if there are cases where we need to mute a specific part of decoding, or if it's that we want to mute the decoder entirely (which could be done by just passing in a quiet logger to NewDecoder
)
protocol/decoder/decoder.go
Outdated
bytes []byte | ||
offset int | ||
logger *logger.Logger | ||
backupLogger *logger.Logger |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a great name - you want to optimize for a person understanding what this is used for without having to look at the implementation. The only way I'd know what this is used for currently is by searching for usages and figuring out what why need a "backup"
protocol/encoder/encoder.go
Outdated
offset int | ||
} | ||
|
||
func (re *Encoder) Init(raw []byte) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same note as for Decoder
- use NewEncoder
protocol/encoder/encoder.go
Outdated
re.offset = 0 | ||
} | ||
|
||
func (re *Encoder) PutRawBytesAt(in []byte, offset int, length int) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see any of these used at the moment, let's stick to the functions we need at the moment so it's easier to review along w/ usages
protocol/encoder/encoder.go
Outdated
|
||
// primitives | ||
|
||
func (re *Encoder) PutInt8(in int8) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
func (re *Encoder) PutInt8(in int8) { | |
func (re *Encoder) WriteInt8(in int8) { |
(Let's use WriteXYZ
instead of PutXYZ
- parallel to Decoder's Read
)
protocol/encoder/encoder.go
Outdated
return re.offset | ||
} | ||
|
||
func (re *Encoder) AllBytes() []byte { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The interface for this could be far simpler. I wouldn't expose Offset
and AllBytes
- no caller is going to require these values, they'll only need the "written" bytes, and doing len(bytes)
will give them the offset.
Would remove these both and make EncodedBytes
just Bytes
.
protocol/encoder/encoder.go
Outdated
offset int | ||
} | ||
|
||
func NewEncoder(bytes []byte) *Encoder { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
func NewEncoder(bytes []byte) *Encoder { | |
func NewEncoder() *Encoder { |
Don't think we need bytes
to be passed in? Could just instantiate an array here
protocol/encoder/encoder.go
Outdated
) | ||
|
||
type Encoder struct { | ||
bytes []byte |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you can use bytes.Buffer here and not have to track offset/bytes etc. on your own: https://pkg.go.dev/bytes#Buffer
|
||
func (r ApiVersionsRequestBody) encode(enc *encoder.Encoder) { | ||
|
||
if r.Version < 3 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if r.Version < 3 { | |
if r.Version != 3 { |
(We support exactly one version if I'm not wrong, so let's make an exact check)
enc.WriteEmptyTagBuffer() | ||
} | ||
|
||
func (r ApiVersionsRequestBody) Encode() []byte { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's merge Encode
and encode
, don't need the split, doesn't really make it any more readable
ThrottleTimeMs int32 | ||
} | ||
|
||
func (r *ApiVersionsResponseBody) Decode(d *decoder.Decoder, version int16, logger *logger.Logger, indentation int) (err error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't look like we're using logger
and indentation
anywhere here.
panic("CodeCrafters Internal Error: ApiVersionsResponseBody.Version is not initialized") | ||
} | ||
|
||
if err := r.Body.Decode(decoder, r.Body.Version, logger, 1); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're validating that r.Body.Version
is not equal to zero, and then we're passing that same thing into r.Body.Decode
- doesn't r.Body.Decode
already have access to Version
?
// ApiKeyEntry contains the APIs supported by the broker. | ||
type ApiKeyEntry struct { | ||
// Version defines the protocol version to use for encode and decode | ||
Version int16 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's try to get rid of this "Version" field both on ApiKeyEntry
and ResponseBody
and move it to the top-level ApiVersionsResponse
. This isn't actual "state", it's just being passed down for convenience and this can instead be done by passing down a variable to the Decode
functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rohitpaulk , Sure. I think we should only move it as up as ResponseBody
because it seems like different body versions can co-exist with same header version.
For eg, both body versions (1 and 2) co-exist with same header version. So, I think the version should be tagged with body instead of the response. I'll remove the versions property and just pass down for any sub-fields of body.


There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall this file still feels kind of complex, let's spend some extra time and try to make it as minimal as possible
} | ||
|
||
// encode v2 | ||
func (h RequestHeader) encode(enc *encoder.Encoder) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's merge encode
and Encode
, the split doesn't really help with readability - also the "encode v2" comment here doesn't make a lot of sense
@UdeshyaDhungana let's get the tests passing too (so we can see what the exact changes to fixtures are) |
Whoops, just saw this! Okay feel free to ignore my comments on encoder if these will be part of a separate PR |
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@UdeshyaDhungana there's too much going on in a single PR here, let's split please - hard to review properly. For example the files_manager stuff can be split into a separate PR
I've removed the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Once the major issues (like printing log file details to CLI) are resolved, let's merge this in and open smaller PRs to tackle the rest of stuff. Let's try to keep PRs under 300 lines
@@ -1,18 +1,25 @@ | |||
Debug = true | |||
|
|||
[33m[tester::#PV1] [0m[94mRunning tests for Stage #PV1 (pv1)[0m | |||
[33m[tester::#PV1] [Serializer] [0m[36mWriting log files to: /tmp/kraft-combined-logs[0m |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@UdeshyaDhungana these changes don't seem to be intentional - we don't want to print things like log files in early stages
[33m[tester::#PV1] [Decoder] [0m[36m - .ApiVersions Response Body[0m | ||
[33m[tester::#PV1] [Decoder] [0m[36m - .error_code (0)[0m | ||
[33m[tester::#PV1] [Decoder] [0m[36m - .ApiKeys.Length (62)[0m | ||
[33m[tester::#PV1] [Decoder] [0m[36m - .ApiKeys[0][0m |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can be tackled in a separate PR since this one is already pretty large, but we need some kind of standard around the casing we use for things here. I see three distinct types of casing used here, i.e. the same field could've been named "Error Code", "error_code", or "ErrorCode" because there's no consistency among how we name fields.
internal/stage_2.go
Outdated
if err != nil { | ||
stageLogger := stageHarness.Logger | ||
|
||
if err := serializer_legacy.GenerateLogDirs(stageHarness.Logger, true); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Related to the comment below, think we still need to pass in QuietLogger
here.
} | ||
} | ||
|
||
func (d *Decoder) Offset() int { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can probably be avoided, the only usage I see is in FormatDetailedError
. I'd imagine that all other public usages would only care about UnreadBytesCount
and won't need the exact offset.
d.unindentLog() | ||
} | ||
|
||
func (d *Decoder) logDecodedValue(variableName string, value any) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, definitely not clear from usage. I'd suggest just creating separate versions of the methods. Something like:
ReadInt16WithoutLogging()
ReadInt16WithLogging("variableName")
And the non-silent version can re-use the other internally. And if the "silent" versions aren't used externally we can even make them private (readInt16WithoutLogging
).
protocol/decoder/decoder.go
Outdated
return 0, err | ||
} | ||
|
||
if decodedInteger == 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it intended that here we read a value and don't log it? Also, we're essentially treating the value 1 and 0 the same (here we return 0 directly, below we do 1-1
which is also 0)
|
||
if err != nil { | ||
if decodingErr, ok := err.(*errors.PacketDecodingError); ok { | ||
return -1, decodingErr.AddContexts("ARRAY_LENGTH", variableName) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This Contexts
thing seems to account for a lot of code here. Would look a lot more simpler if this wasn't needed. I'd think a bit more about exactly what kind of error message we're trying to construct and see if what's needed could be stored as state on Decoder
.
Can be a separate PR might need some thinking re: approach
GetHeader() headers.RequestHeader | ||
} | ||
|
||
type Encodable interface { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's remove this - isn't used
decoder.BeginSubSection("ApiVersions Response") | ||
|
||
defer func() { | ||
decoder.EndCurrentSubSection() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's place this above as defer decoder.EndCurrentSubSection
, defer
calls are easier to read when they're placed close to the action they're reversing
decoder.EndCurrentSubSection() | ||
|
||
if decodingErr, ok := err.(*errors.PacketDecodingError); ok { | ||
detailedError := decodingErr.AddContexts("ApiVersions Response") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think a lot of complexity for this comes from the ordering - AddContexts
is only done when the error is constructed, instead of before the decoding is done. Might be worth looking into merging the ideas of "SubSection" and "Context" (again, can be separate PR)
|
||
errorCode, err := decoder.GetInt16() | ||
errorCode, err := decoder.ReadInt16("error_code") | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: Error Handling Regression: Missing Context and Hexdump
It looks like we lost some detailed error handling for decoding failures across these stages. Previously, decoding errors included context and formatted hexdump details, which were really helpful for debugging. Now, we're just returning raw errors, making it tougher to diagnose issues, especially in error-testing scenarios like stage 4.
@rohitpaulk Requesting for an early review.
The refactors are not yet complete in following aspects
As of now, only major change I've made is in the
decoder/decoder.go
andapi_versions_response.go
.Also, I found myself duplicating a lot of files, so I'm not sure if I'm correctly following the approach.