-
Notifications
You must be signed in to change notification settings - Fork 806
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support ruler to retrieve proto format query response #6345
base: master
Are you sure you want to change the base?
Support ruler to retrieve proto format query response #6345
Conversation
872120e
to
5f93620
Compare
6915258
to
fb61f5a
Compare
@@ -176,7 +183,7 @@ func (instantQueryCodec) EncodeResponse(ctx context.Context, res tripperware.Res | |||
return nil, httpgrpc.Errorf(http.StatusInternalServerError, "invalid response format") | |||
} | |||
|
|||
b, err := json.Marshal(a) | |||
contentType, b, err := marshalResponse(a, req.Header.Get("Accept")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we only try to return non JSON response format if we know the request comes from Ruler?
If the request doesn't come from Ruler I feel we want to enforce JSON response even though they specify protobuf in their protocol like Accept: protobuf, json
. To reduce impact as this protobuf format is Cortex specific.
But tbh Idk if we can identify ruler requests or not. We probably have to do what you are doing here.
WDYT? @alanprot @danielblando
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@yeya24
One way is to define a new MIME type like application/x-protobuf-ruler
(just example) and then use it to accept header.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah.. maybe we should define our internal MIME type so external calls will only get that format if they really wanna.
I would not add "ruler" in the format though. Maybe something like application/x-cortex.query.v1+proto
(idk the latest best practices to define those custom Mime types :P )
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This sounds worth trying. Let's hear from other maintainers, too
fb61f5a
to
80d5e87
Compare
80d5e87
to
ec03c22
Compare
{ | ||
name: "proto type", | ||
acceptHeader: "application/x-protobuf", | ||
expectedContentType: "application/json", | ||
}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can enforce json marshal if we get the other types except the application/x-cortex-query+proto
.
ec03c22
to
6c4fe46
Compare
6c4fe46
to
0b4bf23
Compare
86a0a52
to
27d40bd
Compare
@yeya24 |
@@ -154,7 +161,7 @@ func (c instantQueryCodec) EncodeRequest(ctx context.Context, r tripperware.Requ | |||
} | |||
} | |||
|
|||
tripperware.SetRequestHeaders(h, c.defaultCodecType, c.compression) | |||
tripperware.SetRequestHeaders(h, c.defaultCodecType, c.compression, c.rulerQueryResponseFormat) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will always ask for cortex internal response format in instant query even though the requests are not initiated from Ruler?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh.. I see.
We can utilize the User-Agent
header value at DecodeRequest
and EncodeRequest
.
194aaa3
to
5a369e5
Compare
5a369e5
to
26ee074
Compare
if strings.EqualFold(h, header) { | ||
result.Headers[h] = hv | ||
break | ||
if isSourceRuler(&r.Header) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: We can initialize isSourceRuler(&r.Header)
as a var.
return tripperware.ApplicationJson, b, err | ||
} | ||
|
||
func isSourceRuler(h *http.Header) bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like that we identified query traffic and ruler traffic now.
Can we also add this to frontend/transport/handler.go
? We can have additional logging about the query source and update metrics (like queries from rule or API, etc).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean to add a label (source) to query stats metrics?
h.queryScannedSamples = promauto.With(reg).NewCounterVec(prometheus.CounterOpts{
Name: "cortex_query_samples_scanned_total",
Help: "Number of samples scanned to execute a query.",
}, []string{"user", "source"})
And add a log field to ServeHTTP
. right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean to add a label (source) to query stats metrics?
We don't need to add this for all metrics. Only the counter for number of requests should be good enough
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't have a counter metric in the handler.
Should I add it?
FYI. We have a cortex_ruler_query_frontend_request_duration_seconds
histogram.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I add a User-Agent
field to logger.
logger := util_log.WithUserAgent(userAgent, util_log.WithContext(r.Context(), f.log))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Umm, we actually already have one called queriesPerTenant
but it is in roundtrip.go
. Sorry for the wrong info.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks!
I add a source
label to cortex_query_frontend_queries_total
.
The label value is ruler
or others
.
WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we call it API
and Rule
? This aligns with https://github.com/cortexproject/cortex/blob/master/pkg/cortexpb/cortex.proto#L14
7c686bf
to
d988c9e
Compare
d988c9e
to
d3f99c0
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is in good shape now... Some nites only.
Would be good to have more people to review this change.
pkg/ruler/frontend_decoder.go
Outdated
@@ -18,8 +20,18 @@ const ( | |||
) | |||
|
|||
type JsonDecoder struct{} | |||
type ProtobufDecoder struct{} | |||
type Warning []string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Warnings?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you mean is we should return nil instead of the resp.Warnings
at the ProtobufDecoder
decode func since the error
is only decoded by the json codec?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No... Just the naming thing. Better to call it Warnings
instead of Warning
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, fixed it.
d3f99c0
to
437a332
Compare
@rajagopalanand @rapphil Maybe you can help review this PR as well. Thanks |
31f2cce
to
af3e05b
Compare
@alanprot |
@@ -147,6 +149,7 @@ func TestProtobufCodec_Encode(t *testing.T) { | |||
}, | |||
}, | |||
{ | |||
name: "matrix", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: missing name for this matrix test https://github.com/cortexproject/cortex/pull/6345/files#diff-9c2062cda65af13cf43cc9371cae0424744499ad5ca83597ea2bf4df2437f4d8R106
seems like it is the same as https://github.com/cortexproject/cortex/pull/6345/files#diff-9c2062cda65af13cf43cc9371cae0424744499ad5ca83597ea2bf4df2437f4d8R186?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah.. let's add a name and delete the duplicate one. Thanks!
@@ -29,6 +31,9 @@ var ( | |||
SortMapKeys: true, | |||
ValidateJsonRawMessage: false, | |||
}.Froze() | |||
|
|||
rulerMIMEType = v1.MIMEType{Type: "application", SubType: tripperware.QueryResponseCortexMIMESubType} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we do rulerMIMEType = tripperware.QueryResponseCortexMIMEType
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The tripperware.QueryResponseCortexMIMEType
is a string
, so it is hard to use it to MIMEType
directly.
{ | ||
name: "empty accept header", | ||
acceptHeader: "", | ||
expectedContentType: "application/json", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: can we do expectedContentType: tripperware.ApplicationJson
instead?
@@ -763,7 +769,8 @@ func SetRequestHeaders(h http.Header, defaultCodecType CodecType, compression Co | |||
} | |||
|
|||
func UnmarshalResponse(r *http.Response, buf []byte, resp *PrometheusResponse) error { | |||
if r.Header != nil && r.Header.Get("Content-Type") == ApplicationProtobuf { | |||
contentType := r.Header.Get("Content-Type") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we check that r.Header != nil
before getting the contentType
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for catching!
@@ -230,7 +230,7 @@ func (c prometheusCodec) DecodeResponse(ctx context.Context, r *http.Response, _ | |||
return &resp, nil | |||
} | |||
|
|||
func (prometheusCodec) EncodeResponse(ctx context.Context, res tripperware.Response) (*http.Response, error) { | |||
func (prometheusCodec) EncodeResponse(ctx context.Context, _ *http.Request, res tripperware.Response) (*http.Response, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems weird that we need to pass in a request here if it's not being used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is just for the interface
.
It is not used in the range query
, since the ruler only does instant query
.
@@ -156,7 +156,8 @@ func NewQueryTripperware( | |||
now := time.Now() | |||
userStr := tenant.JoinTenantIDs(tenantIDs) | |||
activeUsers.UpdateUserTimestamp(userStr, now) | |||
queriesPerTenant.WithLabelValues(op, userStr).Inc() | |||
source := getSource(r.Header.Get("User-Agent")) | |||
queriesPerTenant.WithLabelValues(op, userStr, source).Inc() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We probably shouldn't have the user-agent as a possible label value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it can be set by the user, then there's a possibility of cardinality explosion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call out. I think source
is good enough and we don't need the user agent here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The label values would be ruler
or api
. Don't we?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nvm ignore my comment. We are not using user agent value directly as label value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we add a test here to make sure the user agent is either ruler or api to ensure we don't allow values other than those?
Signed-off-by: SungJin1212 <[email protected]>
af3e05b
to
da4a4da
Compare
isSourceRuler := strings.Contains(r.Header.Get("User-Agent"), tripperware.RulerUserAgent) | ||
if isSourceRuler { | ||
// When the source is the Ruler, then forward whole headers | ||
result.Headers = r.Header |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Consider returning here vs else block
In #6151, we can retrieve query results from Query frontends when the rules are evaluated. The limitation was, since we only support
json
format, we cannot retrieve query results including thenative histogram
.To address this concern, this PR adds
-ruler.query-response-format
flag, which specifies which format to use when retrieving query results from the Query frontend. The supported values areprotobuf
andjson
, so we can retrieve proto format query results when we set the value toprotobuf
.Which issue(s) this PR fixes:
Fixes #
Checklist
CHANGELOG.md
updated - the order of entries should be[CHANGE]
,[FEATURE]
,[ENHANCEMENT]
,[BUGFIX]