[feat][storage] Add Span Kind support for ES/OS #6399

Open
wants to merge 3 commits into
base: main
from

Conversation

Manik2708
Contributor

Which problem is this PR solving?

Fixes: #1923

Description of the changes

  • GetOperations can now filter operations by span kind. When kind is left empty, operations for spans of all kinds are returned.
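
As a rough illustration of the intended semantics (not the PR's actual code), the sketch below filters an in-memory list. The `Operation` type and `filterByKind` function are hypothetical names introduced only for this example; the real change works against ES/OS queries.

```go
package main

import "fmt"

// Operation is a hypothetical, simplified stand-in for a stored operation record.
type Operation struct {
	Name     string
	SpanKind string // empty for records written before kind was stored
}

// filterByKind returns every operation when kind is empty,
// otherwise only the operations whose SpanKind matches exactly.
func filterByKind(ops []Operation, kind string) []Operation {
	if kind == "" {
		return ops
	}
	var out []Operation
	for _, op := range ops {
		if op.SpanKind == kind {
			out = append(out, op)
		}
	}
	return out
}

func main() {
	ops := []Operation{
		{Name: "GET /users", SpanKind: "server"},
		{Name: "query-db", SpanKind: "client"},
		{Name: "legacy-op"}, // old record without a kind
	}
	fmt.Println(len(filterByKind(ops, "")))       // prints 3
	fmt.Println(len(filterByKind(ops, "server"))) // prints 1
}
```

Note that the record without a kind is only returned by the unfiltered call, which is the backward-compatibility case discussed later in the thread.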

How was this change tested?

  • Unit and E2E tests


codecov bot commented Dec 24, 2024

Codecov Report

Attention: Patch coverage is 82.35294% with 12 lines in your changes missing coverage. Please review.

Project coverage is 96.20%. Comparing base (976a36e) to head (7129344).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| plugin/storage/es/spanstore/service_operation.go | 80.00% | 8 Missing and 4 partials ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #6399      +/-   ##
==========================================
- Coverage   96.24%   96.20%   -0.04%     
==========================================
  Files         373      373              
  Lines       21389    21429      +40     
==========================================
+ Hits        20585    20616      +31     
- Misses        612      618       +6     
- Partials      192      195       +3     
| Flag | Coverage Δ |
|---|---|
| badger_v1 | 10.62% <0.00%> (-0.04%) ⬇️ |
| badger_v2 | 2.77% <0.00%> (-0.02%) ⬇️ |
| cassandra-4.x-v1-manual | 16.55% <0.00%> (-0.07%) ⬇️ |
| cassandra-4.x-v2-auto | 2.70% <0.00%> (-0.02%) ⬇️ |
| cassandra-4.x-v2-manual | 2.70% <0.00%> (-0.02%) ⬇️ |
| cassandra-5.x-v1-manual | 16.55% <0.00%> (-0.07%) ⬇️ |
| cassandra-5.x-v2-auto | 2.70% <0.00%> (-0.02%) ⬇️ |
| cassandra-5.x-v2-manual | 2.70% <0.00%> (-0.02%) ⬇️ |
| elasticsearch-6.x-v1 | 20.45% <39.70%> (+0.05%) ⬆️ |
| elasticsearch-7.x-v1 | 20.52% <39.70%> (+0.05%) ⬆️ |
| elasticsearch-8.x-v1 | 20.67% <39.70%> (+0.03%) ⬆️ |
| elasticsearch-8.x-v2 | 2.76% <0.00%> (-0.02%) ⬇️ |
| grpc_v1 | 12.13% <0.00%> (-0.05%) ⬇️ |
| grpc_v2 | 9.01% <0.00%> (-0.03%) ⬇️ |
| kafka-3.x-v1 | 10.30% <0.00%> (-0.04%) ⬇️ |
| kafka-3.x-v2 | 2.77% <0.00%> (-0.01%) ⬇️ |
| memory_v2 | 2.76% <0.00%> (-0.02%) ⬇️ |
| opensearch-1.x-v1 | 20.57% <39.70%> (+0.05%) ⬆️ |
| opensearch-2.x-v1 | 20.57% <39.70%> (+0.04%) ⬆️ |
| opensearch-2.x-v2 | 2.76% <0.00%> (-0.02%) ⬇️ |
| tailsampling-processor | 0.51% <0.00%> (-0.01%) ⬇️ |
| unittests | 95.05% <79.41%> (-0.05%) ⬇️ |


Member

@yurishkuro yurishkuro left a comment


overall lgtm, but I have a question about whether the whole thing could be even simpler by treating Kind as always present (but maybe blank).

plugin/storage/es/spanstore/dbmodel/model.go Outdated Show resolved Hide resolved
@@ -124,8 +124,9 @@ func getSpanAndServiceIndexFn(p SpanWriterParams) spanAndServiceIndexFn {
func (s *SpanWriter) WriteSpan(_ context.Context, span *model.Span) error {
spanIndexName, serviceIndexName := s.spanServiceIndex(span.StartTime)
jsonSpan := s.spanConverter.FromDomainEmbedProcess(span)
kind, _ := span.GetSpanKind()
Member


Why not add Kind field to the dbmodel.Span and not pass it around separately?

Contributor Author

@Manik2708 Manik2708 Dec 26, 2024


It will also be present as a tag in the JSON model! I tried fetching kind from the original span but failed because of the AllTagsAsFields bool. Since it will already be present in the JSON model, should we save it again as a separate kind field?

plugin/storage/es/spanstore/service_operation.go Outdated Show resolved Hide resolved
@Manik2708
Contributor Author

Manik2708 commented Dec 26, 2024

overall lgtm, but I have a question about whether the whole thing could be even simpler by treating Kind as always present (but maybe blank).

Then we can't read old data! In old data, Kind was not present. In the original issue, when it was asked what would happen to old data, the answer was that it should remain accessible when kind is not present. That is why I introduced a whole new struct with kind and also saved the data without kind.
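
To make the old-data concern concrete, here is a minimal sketch of how a document written before kind existed decodes next to a new one. The `serviceDoc` type and its JSON field names are assumptions made for this example, not the PR's dbmodel, and the `omitempty` tag is just one possible way to keep the field out of documents that have no kind.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// serviceDoc is a hypothetical document shape; field names are assumed.
type serviceDoc struct {
	ServiceName   string `json:"serviceName"`
	OperationName string `json:"operationName"`
	SpanKind      string `json:"spanKind,omitempty"` // omitted when empty
}

// decode parses a stored JSON document; a missing spanKind decodes to "".
func decode(raw string) (serviceDoc, error) {
	var d serviceDoc
	err := json.Unmarshal([]byte(raw), &d)
	return d, err
}

// encode serializes a document, leaving out spanKind when it is empty.
func encode(d serviceDoc) (string, error) {
	b, err := json.Marshal(d)
	return string(b), err
}

func main() {
	old, err := decode(`{"serviceName":"frontend","operationName":"GET /"}`)
	if err != nil {
		panic(err)
	}
	fmt.Printf("old doc kind=%q\n", old.SpanKind)

	out, err := encode(serviceDoc{ServiceName: "frontend", OperationName: "GET /"})
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```

With this shape, a reader cannot tell "kind was never written" apart from "kind was blank" after decoding, which is the trade-off the reviewer's suggestion accepts.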

@Manik2708 Manik2708 requested a review from yurishkuro December 26, 2024 02:05
@Manik2708
Contributor Author

Manik2708 commented Dec 28, 2024

@yurishkuro Committed as per your suggestions. I tried many ways but couldn't merge the EmptyKinds and WithoutKinds cases into a single aggregation: the two bool queries contradict each other, so the combined query does not behave as expected. The only option is to handle them in separate aggregations, which increases complexity in the operations query. So we have to choose between complexity when writing spans and complexity when fetching operations.

@yurishkuro
Member

please update the branch, I am not sure if CI is failing because of that or because of your changes. If it's the latter, how are you testing this change? Did you run e2e tests locally?

@Manik2708
Contributor Author

Manik2708 commented Dec 29, 2024

please update the branch, I am not sure if CI is failing because of that or because of your changes. If it's the latter, how are you testing this change? Did you run e2e tests locally?

I tried the empty-kind approach, but it failed because the query in GetOperations becomes more complex, so I have reverted that commit.

@Manik2708
Contributor Author

Manik2708 commented Jan 2, 2025

@yurishkuro Saving empty strings in ES is not optimal. Please see: elastic/elasticsearch#7515. I have been trying various ways of using empty or null kinds but am not getting results. The probable reason is that the query is a bit complex and unconventional. I think it would be better not to store empty or null kinds and instead handle them separately!

@yurishkuro
Member

I don't have a strong opinion, but the issue you linked talks about searching for empty strings, which I don't think is the case for our scenario, we just need to write it.

@Manik2708
Contributor Author

I don't have a strong opinion, but the issue you linked talks about searching for empty strings, which I don't think is the case for our scenario, we just need to write it.

But while reading we also have to search for operations that have an empty kind or no kind! I am facing problems when fetching spans of all kinds: the search query behaves weirdly if I introduce a filter on "". There are no problems in writing the service, only in fetching it.

@yurishkuro
Member

Searching for "all kinds" to me means not specifying any filter for the "kind" field. How would it even work if you say search for kind="" when the kind is actually not empty?
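
A minimal sketch of this idea, assuming the field names `serviceName` and `spanKind` and a hypothetical `buildOperationsQuery` helper: the kind term filter is added only when a kind is requested, so "all kinds" is simply the absence of that filter. It builds a plain Elasticsearch-style JSON body rather than using the project's client abstraction.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// buildOperationsQuery returns a bool query as JSON. The service filter is
// always present; the kind filter is appended only when kind is non-empty.
func buildOperationsQuery(service, kind string) (string, error) {
	filters := []map[string]any{
		{"term": map[string]any{"serviceName": service}},
	}
	if kind != "" {
		filters = append(filters, map[string]any{
			"term": map[string]any{"spanKind": kind},
		})
	}
	body := map[string]any{
		"query": map[string]any{
			"bool": map[string]any{"filter": filters},
		},
	}
	b, err := json.Marshal(body)
	return string(b), err
}

func main() {
	all, err := buildOperationsQuery("frontend", "")
	if err != nil {
		panic(err)
	}
	fmt.Println(all)

	onlyServer, err := buildOperationsQuery("frontend", "server")
	if err != nil {
		panic(err)
	}
	fmt.Println(onlyServer)
}
```

Because the unfiltered query never mentions `spanKind`, it matches documents with any kind, a blank kind, or no kind field at all.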

@Manik2708
Contributor Author

Manik2708 commented Jan 2, 2025

Searching for "all kinds" to me means not specifying any filter for the "kind" field. How would it even work if you say search for kind="" when the kind is actually not empty?

I think there is some sort of misunderstanding of my approach. Let me first explain problems:

  1. We want to get two fields (operationName and kind) from our search query to ES. To achieve this, we have 3 options:
    a) Use FetchSource(true): this fetches all the fields, which is not optimal, and it is not available in our abstraction.
    b) Use FetchSourceWithContext: not available in our abstraction.
    c) Composite filters: not possible, because they are applied to a field, and kind can be absent in old data.
    So I took a different approach: I ran the query only on service name and applied filters on kind (the set of kinds is limited, so this is feasible), using the filter name to read the kind from the named buckets. But when this filter is applied to an empty kind, it gives no results. You are absolutely right that when no kind is in the query we should just query on service name, but we have to fetch two fields, and that is what creates the problem with querying on service alone!
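
For readers unfamiliar with named buckets, the approach described above can be sketched with Elasticsearch's `filters` aggregation, where each bucket is keyed by the filter's name. This is an illustrative reconstruction, not the PR's code: the function names, the aggregation names `kinds` and `operations`, and the field names `spanKind` and `operationName` are all assumptions.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// namedKindFilters builds one named term filter per span kind, so that each
// response bucket is keyed by the kind it matched.
func namedKindFilters(kinds []string) map[string]any {
	filters := make(map[string]any, len(kinds))
	for _, k := range kinds {
		filters[k] = map[string]any{"term": map[string]any{"spanKind": k}}
	}
	return filters
}

// operationsByKindAggs wraps the named filters in a "filters" aggregation
// with a nested terms aggregation on the operation name.
func operationsByKindAggs(kinds []string) map[string]any {
	return map[string]any{
		"aggs": map[string]any{
			"kinds": map[string]any{
				"filters": map[string]any{"filters": namedKindFilters(kinds)},
				"aggs": map[string]any{
					"operations": map[string]any{
						"terms": map[string]any{"field": "operationName"},
					},
				},
			},
		},
	}
}

func main() {
	kinds := []string{"server", "client", "producer", "consumer", "internal"}
	out, err := json.MarshalIndent(operationsByKindAggs(kinds), "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

The sketch deliberately omits a bucket for documents with a blank or missing kind, since that is the case the comment reports as not returning results.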

@yurishkuro
Member

"Not available in our abstraction" is an odd argument since we own the abstraction and can change it at will. Re fetchsource, what is the source here - is it the whole span? Or do we write separate entries just for service/operation?

@Manik2708
Contributor Author

Manik2708 commented Jan 2, 2025

"Not available in our abstraction" is an odd argument since we own the abstraction and can change it at will. Re fetchsource, what is the source here - is it the whole span? Or do we write separate entries just for service/operation?

Service, Operation and Kind (if present), but I have to investigate the indices to check whether any other information is being extracted.

@Manik2708
Contributor Author

Will try to clean this PR as soon as possible

@Manik2708
Contributor Author

"Not available in our abstraction" is an odd argument since we own the abstraction and can change it at will. Re fetchsource, what is the source here - is it the whole span? Or do we write separate entries just for service/operation?

I have verified! It's only the service model. I have committed with FetchSourceContext.

@Manik2708
Contributor Author

@yurishkuro I have committed with FetchSourceContext, please review!

Member

@yurishkuro yurishkuro left a comment


sorry for delay

plugin/storage/es/spanstore/service_operation.go Outdated Show resolved Hide resolved
plugin/storage/es/spanstore/service_operation.go Outdated Show resolved Hide resolved
plugin/storage/es/spanstore/service_operation.go Outdated Show resolved Hide resolved
Query(serviceQuery).
IgnoreUnavailable(true).
Aggregation(operationsAggregation, serviceFilter)

FetchSourceContext(elastic.NewFetchSourceContext(true).Include(spanKind, operationNameField)).
Member


I don't really understand this, can you please point to documentation of how this impacts the query behavior?

Contributor Author


https://www.elastic.co/guide/en/elasticsearch/reference/current/search-fields.html#field-retrieval-methods. This part of the docs suggests using fields over source, but I used FetchSourceContext because:

  1. Weirdly, olivere doesn't have Fields in its SearchService (maybe because it's deprecated), but rather in ExplainService.
  2. Technically we need the whole source, as the Service mapping has only three fields: Service, Kind, Operation Name.
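
To show what source filtering changes in the request, here is a minimal sketch of the raw search body with a `_source` include list. It does not use the olivere client; `sourceFilteredSearch` is a hypothetical helper, and the field names `serviceName`, `spanKind`, and `operationName` are assumed for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// sourceFilteredSearch builds a search body that matches one service and asks
// Elasticsearch to return only the listed fields from each hit's _source.
func sourceFilteredSearch(service string, fields ...string) (string, error) {
	body := map[string]any{
		"_source": map[string]any{"includes": fields},
		"query": map[string]any{
			"term": map[string]any{"serviceName": service},
		},
	}
	b, err := json.Marshal(body)
	return string(b), err
}

func main() {
	out, err := sourceFilteredSearch("frontend", "spanKind", "operationName")
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```

Source filtering only trims what each hit returns; it does not change which documents match, so hits without a `spanKind` field still come back, just without that key.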

Signed-off-by: Manik2708 <[email protected]>
@Manik2708 Manik2708 requested a review from yurishkuro January 21, 2025 14:33
Signed-off-by: Manik2708 <[email protected]>
@Manik2708 Manik2708 changed the title FEAT: Add Span Kind support for ES/OS [feat][storage] Add Span Kind support for ES/OS Jan 21, 2025
Development

Successfully merging this pull request may close these issues.

ES storage plugin: query service to support spanKind when retrieve operations for a given service
2 participants