[PLAT-416] ensure bounded time range for mv query tool call#8887
[PLAT-416] ensure bounded time range for mv query tool call#8887begelundmuller merged 9 commits intomainfrom
Conversation
begelundmuller
left a comment
There was a problem hiding this comment.
Two thoughts:
- It might be nice to be able to disable this for projects that run into problems. So maybe add it as a
rill.ai.nameinstance-level flag? So it can be overridden at runtime if someone has problems. - I'm wondering if the AI will just get the full time range using the summary tool call and then use that, which will have similar bad performance. Do you think it would make sense to instead support a bounding by a number of days, e.g.
rill.ai.max_time_range_days? So e.g. users with large clusters could bound AI queries to max 30 days of data.
|
IMO we should definitely ensure that a valid time range is set, absence of it just means LLM is not able to some how use it, may be this can be also behind a flag and default is ensure time range, what do you think? I agree we can have |
| return nil | ||
| } | ||
|
|
||
| if qry.TimeRange.IsZero() { |
There was a problem hiding this comment.
put this behind feature flag?
There was a problem hiding this comment.
I'm okay with keeping it required for AI initially – but try running the evals in analyst_agent_test.go and check they still pass
There was a problem hiding this comment.
They pass (has to disable enforcing of time range for tests directly doing query_metrics_view tool call) except from CanvasContext test which fails here and I think its a genuine failure as I am not sure why app_or_site = 'App' this filter would be present in where clause.
There was a problem hiding this comment.
Then it's flaky. The filter is because the contextual canvas component has it:
rill/runtime/testruntime/projects.go
Line 215 in dae94e6
runtime/ai/metrics_view_query.go
Outdated
| - Prefer using absolute 'start' and 'end' times in 'time_range' and 'comparison_time_range' if available. | ||
| Otherwise, use 'expression' for relative time ranges, when specifying 'expression' make sure no other time range fields should be set as its not supported. | ||
| - Prefer using absolute 'start' and 'end' time range parameters in 'time_range' and 'comparison_time_range' if available. | ||
| Otherwise, use 'time_range.expression' for relative time ranges, when specifying 'time_range.expression' make sure no other time range fields other than 'time_range.time_dimension' is set as its not supported. |
There was a problem hiding this comment.
The problem with this is that there is also comparison_time_range.expression (and expressions are usually most useful for comparisons)
runtime/metricsview/query.go
Outdated
|
|
||
| QueryLimit *QueryLimits `json:"-" mapstructure:"query_limits"` |
There was a problem hiding this comment.
Is there a reason not to expose this?
| QueryLimit *QueryLimits `json:"-" mapstructure:"query_limits"` | |
| QueryLimit *QueryLimits `json:"limits,omitempty" mapstructure:"query_limits"` |
There was a problem hiding this comment.
Yes, this will be used only during AI tool calls and resolvers use mapstructure to decode the args, also query limits are not exposed through any proto requests so didn't felt need to expose it.
There was a problem hiding this comment.
I think we should just expose it. I'm not sure, but I think we may have some places that use JSON instead of mapstructure for serialization/deserialization, or if not, we might add that in the future. So since it would not be harmful to expose, I think it might just be simpler/safer.
There was a problem hiding this comment.
ok why use limits as key instead of query_limits, seems confusing with already present limit key.
There was a problem hiding this comment.
Was just typing quickly, query_limits also seems fine by me
| // QueryLimits represents limits that should be applied to a query, such as requiring a time range or limiting the maximum time range for interactive queries. These are not part of the Query json itself because they are not intrinsic to the query, but rather are constraints that may be applied to the query before execution. | ||
| type QueryLimits struct { | ||
| RequireTimeRange bool `json:"-" mapstructure:"require_time_range"` | ||
| MaxTimeRangeDays int `json:"-" mapstructure:"max_time_range_days"` | ||
| } |
There was a problem hiding this comment.
Same question about exposing it.
Is it a problem to consider it as part of the query? From the metricsview package perspective, I guess it is part of the query? It is only from the API/tool perspective that it may overwrite it (i.e. not allow the user to set it).
To prevent LLMs to run unbounded queries
Checklist: