-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[v2][WIP] Span hash sanitizer and enhance span hash adjuster #6499
base: main
Are you sure you want to change the base?
[v2][WIP] Span hash sanitizer and enhance span hash adjuster #6499
Conversation
Signed-off-by: chahatsagarmain <[email protected]>
hashStr := hex.EncodeToString(buf.Bytes()) | ||
span.Tags = append(span.Tags, model.KeyValue{ | ||
Key: "span.hash", | ||
VType: model.ValueType_STRING, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hard to evaluate this PR without seeing how it will be used in the storage backends. For example, is it really a string we want, or a number?
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #6499 +/- ##
==========================================
- Coverage 96.27% 96.17% -0.10%
==========================================
Files 372 373 +1
Lines 21282 21332 +50
==========================================
+ Hits 20490 20517 +27
- Misses 605 622 +17
- Partials 187 193 +6
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
Signed-off-by: chahatsagarmain <[email protected]>
model/span.go
Outdated
func (s *Span) HashHashTag() bool { | ||
if _, ok := KeyValues(s.Tags).FindByKey(SpanHashKey); ok { | ||
return true | ||
} | ||
return false | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
func (s *Span) HashHashTag() bool { | |
if _, ok := KeyValues(s.Tags).FindByKey(SpanHashKey); ok { | |
return true | |
} | |
return false | |
} |
redundant. Someone can always call _, ok := GetHashTag
return span | ||
} | ||
|
||
func uint64ToInt64Bits(value uint64) int64 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
duplication. I would move this to Span, e.g. span.SetHashTag()
Signed-off-by: chahatsagarmain <[email protected]>
Signed-off-by: chahatsagarmain <[email protected]>
spanHash, _ := model.HashCode(span) | ||
spanHash, found := span.GetHashTag() | ||
if !found { | ||
tempSpam := *span |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do we need to make a copy?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wanted a review on this , calling SetHashTag
sets the span.hash for the span tag and unit tests fail because of unexpected value .
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, I would expect the test to fail if we add a new tag, depending on how the test are implemented. We need to work around that.
model/span.go
Outdated
if tag, ok := KeyValues(s.Tags).FindByKey(SpanHashKey); ok { | ||
return tag.GetVInt64(), true | ||
} | ||
return -1, false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return -1, false | |
return 0, false |
I don't thin -1
conveys any more meaning than 0, so I would rather return default value, this would match behavior of a get from standard map[string]int64
model/span.go
Outdated
return -1, err | ||
} | ||
spanHash := uint64ToInt64Bits(hCode) | ||
s.Tags = append(s.Tags, KeyValue{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we have helper methods for creating typed instances of KeyValue, please use that instead of explicit creation.
@@ -16,6 +16,7 @@ type SanitizeSpan func(span *model.Span) *model.Span | |||
func NewStandardSanitizers() []SanitizeSpan { | |||
return []SanitizeSpan{ | |||
NewEmptyServiceNameSanitizer(), | |||
NewHashingSanitizer(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
note: we also have v2 sanitizers, need to add this logic that as well.
./cmd/jaeger/internal/sanitizer
Signed-off-by: chahatsagarmain <[email protected]>
model/keyvalue.go
Outdated
@@ -27,6 +27,7 @@ const ( | |||
BinaryType = ValueType_BINARY | |||
|
|||
SpanKindKey = "span.kind" | |||
SpanHashKey = "span.hash" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's move it into a separate group and use value @jaeger@hash
- see ##6522
span.CopyTo(dedupedSpans.AppendEmpty()) | ||
continue | ||
|
||
var spanHash int64 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this should be a helper function (or two) in jptrace
package, similar to how you added them to model/
@chahatsagarmain what's the status of this? |
@yurishkuro I was busy with ci issues (mostly due to cache action ) , Now i can get back these ones . |
Signed-off-by: chahatsagarmain <[email protected]>
Signed-off-by: chahatsagarmain <[email protected]>
) | ||
|
||
// NewHashingSanitizer creates a sanitizer to add hash field to spans | ||
func NewHashingSanitizer() SanitizeSpan { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
func NewHashingSanitizer() SanitizeSpan { | |
func AddHashTag() SanitizeSpan { |
and similarly file name add_hash_tag
internal/jptrace/spanhash.go
Outdated
protoMarshaler := &ptrace.ProtoMarshaler{} | ||
return &SpanHasher{ | ||
marshaler: protoMarshaler, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
protoMarshaler := &ptrace.ProtoMarshaler{} | |
return &SpanHasher{ | |
marshaler: protoMarshaler, | |
return &SpanHasher{ | |
marshaler: &ptrace.ProtoMarshaler{}, |
} | ||
} | ||
|
||
func (s *SpanHasher) SpanHash(trace ptrace.Traces) (int64, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ptrace.Traces
is a collection of spans, it doesn't make sense to ask for a single hash. It should be taking ptrace.Span
as input.
hash, _ := model.HashCode(span) | ||
if _, ok := spansByHash[hash]; !ok { | ||
spansByHash[hash] = span | ||
spanHash, found := span.GetHashTag() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this approach going to work if the span undergoes multiple transformations? E.g. we start with sanitizers that can mutate it, and we can add the tag as the last step in sanitizers (so probably ok for the write path). But then on the read path we also have adjusters that can change the trace, with this deduper being somewhere in the middle, is it going to get consistent behavior? I suspect yes because as a deduper it's job is really to filter out identical spans as ingested.
@@ -82,6 +86,7 @@ func (c converter) toDomain(dbSpan *Span) (*model.Span, error) { | |||
if err != nil { | |||
return nil, err | |||
} | |||
tags = append(tags, model.Int64(jptrace.HashAttribute, dbSpan.SpanHash)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wouldn't we expect hash tag to already be stored?
Signed-off-by: chahatsagarmain <[email protected]>
Signed-off-by: chahat sagar <[email protected]>
Signed-off-by: chahatsagarmain <[email protected]>
Which problem is this PR solving?
Description of the changes
How was this change tested?
Checklist
jaeger
:make lint test
jaeger-ui
:npm run lint
andnpm run test