TraceQL: support mixed-type attribute querying (int/float) #4391

Open

ndk
Contributor

@ndk ndk commented Nov 27, 2024

What this PR does:
Below is my understanding of the current limitations. Please feel free to correct me if I’ve misunderstood or overlooked something.

Attributes of the same type are stored in the same column. For example, integers are stored in one column and floats in another.

Querying operates in two stages:

  • Predicate Creation: Predicates are created based on the operand types.
  • Chunk Scanning: Chunks are scanned, and spans are filtered using the predicates.

The issue arises because predicates are generated based on the operand type. If an attribute is stored as a float but the operand is an integer, the predicate evaluates against the integers column instead of the floats column. This results in incorrect behavior.

Proposed Solution
The idea is to generate predicates for both integers and floats, allowing both columns to be scanned for the queried attribute.
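To make the limitation and the proposed fix concrete, here is a minimal sketch. It is a toy model, not Tempo code: `attrColumns`, `matchIntOnly`, and `matchBoth` are hypothetical names, and the "columns" are plain slices rather than parquet column chunks.

```go
package main

import "fmt"

// attrColumns is a toy stand-in for a block that stores numeric attribute
// values in separate, type-specific columns (hypothetical, not Tempo's types).
type attrColumns struct {
	ints   []int64
	floats []float64
}

// matchIntOnly mimics the current behavior for an integer operand in
// "attr > operand": only the integer column is scanned.
func matchIntOnly(c attrColumns, operand int64) int {
	n := 0
	for _, v := range c.ints {
		if v > operand {
			n++
		}
	}
	return n
}

// matchBoth mimics the proposed behavior: the same condition is evaluated
// against both columns, converting the operand for the float column.
func matchBoth(c attrColumns, operand int64) int {
	n := matchIntOnly(c, operand)
	for _, v := range c.floats {
		if v > float64(operand) {
			n++
		}
	}
	return n
}

func main() {
	// One span stored the attribute as int 500, another as float 503.5.
	c := attrColumns{ints: []int64{500}, floats: []float64{503.5}}
	fmt.Println(matchIntOnly(c, 400)) // the float-valued span is missed
	fmt.Println(matchBoth(c, 400))    // both spans are counted
}
```

The cost of the second variant is one extra column scan per numeric condition, which is the trade-off discussed in the rest of this thread.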

In this PR, I’ve created a proof-of-concept by copying the existing createAttributeIterator function to createAttributeIterator2. This duplication is intentional, as the original function is used in multiple places, and I want to avoid introducing unintended side effects until the approach is validated.

case traceql.TypeInt:
	// Integer operand: scan the integer column with the operand as-is.
	{
		pred, err := createIntPredicate(cond.Op, cond.Operands)
		if err != nil {
			return nil, fmt.Errorf("creating attribute predicate: %w", err)
		}
		attrIntPreds = append(attrIntPreds, pred)
	}

	// Also scan the float column, with the operand converted to float64.
	{
		if i, ok := cond.Operands[0].Int(); ok {
			operands := traceql.Operands{traceql.NewStaticFloat(float64(i))}
			pred, err := createFloatPredicate(cond.Op, operands)
			if err != nil {
				return nil, fmt.Errorf("creating attribute predicate: %w", err)
			}
			attrFltPreds = append(attrFltPreds, pred)
		}
	}

case traceql.TypeFloat:
	// Float operand: scan the integer column with the operand truncated
	// to int (proof of concept: the fractional part is dropped here).
	{
		operands := traceql.Operands{traceql.NewStaticInt(int(cond.Operands[0].Float()))}
		pred, err := createIntPredicate(cond.Op, operands)
		if err != nil {
			return nil, fmt.Errorf("creating attribute predicate: %w", err)
		}
		attrIntPreds = append(attrIntPreds, pred)
	}

	// Scan the float column with the operand as-is.
	{
		pred, err := createFloatPredicate(cond.Op, cond.Operands)
		if err != nil {
			return nil, fmt.Errorf("creating attribute predicate: %w", err)
		}
		attrFltPreds = append(attrFltPreds, pred)
	}

WDYT? :)

Which issue(s) this PR fixes:
Fixes #4332

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

@ndk ndk changed the title WIP: Proposal to address mixed-type attribute querying limitations TraceQL: Proposal to address mixed-type attribute querying limitations Nov 27, 2024
@ndk ndk changed the title TraceQL: Proposal to address mixed-type attribute querying limitations WIP: Proposal to address mixed-type attribute querying limitations Dec 19, 2024
@joe-elliott
Member

I apologize for taking so long to get to this. Your analysis is correct! We do generate predicates per column and, since we store integers and floats independently, we only scan one of the columns. Given how small int and float columns tend to be (compared to string columns), I think the performance hit of doing this is likely acceptable in exchange for the nicer behavior.

What is the behavior in this case? I'm pretty sure this will work b/c the engine will request all values for the two attributes and do the work itself. I believe the engine layer will compare ints and floats correctly but I'm not 100% sure.

{ span.intAttr > span.floatAttr }

Tests should also be added here for the new behavior. These tests build a block and then search for a known trace using a large range of traceql queries. If you add tests here and they pass it means that your changes work from the parquet file all the way up through the engine.

This will also break the "allConditions" optimization if the user types any query with a number comparison:

https://github.com/grafana/tempo/pull/4391/files#diff-a201423ab0b50d4455a497bf1804b1a9f596394413c28b7702710f89237c49c1R2815-R2821

I would like to preserve the allConditions behavior in this case b/c it's such a nice optimization and number queries are common. I'm not quite sure why the len(valueIters) == 1 condition exists, so we'd need to do some research into it.

@ndk ndk force-pushed the mixed-type-attr-query branch 2 times, most recently from 3d8f31d to 812d768 Compare January 10, 2025 16:41
@ndk
Contributor Author

ndk commented Jan 12, 2025

I apologize for taking so long to get to this. Your analysis is correct! We do generate predicates per column and, since we store integers and floats independently, we only scan one of the columns. Given how small int and float columns tend to be (compared to string columns), I think the performance hit of doing this is likely acceptable in exchange for the nicer behavior.

Thank you for confirming the approach and pointing out the allConditions optimization. Right now, the fix scans both integer and float columns for attributes that might be either type. I’ve also adjusted how float comparisons work for integer fields, taking into account the fraction part and the comparison operator.

What is the behavior in this case? I'm pretty sure this will work b/c the engine will request all values for the two attributes and do the work itself. I believe the engine layer will compare ints and floats correctly but I'm not 100% sure.

{ span.intAttr > span.floatAttr }

I verified that { span.intAttr > span.floatAttr } behaves as expected. Want me to add a test to cover this case?

Tests should also be added here for the new behavior. These tests build a block and then search for a known trace using a large range of traceql queries. If you add tests here and they pass it means that your changes work from the parquet file all the way up through the engine.

Done. Let me know if I missed something.

This will also break the "allConditions" optimization if the user types any query with a number comparison:

https://github.com/grafana/tempo/pull/4391/files#diff-a201423ab0b50d4455a497bf1804b1a9f596394413c28b7702710f89237c49c1R2815-R2821

I would like to preserve the allConditions behavior in this case b/c it's such a nice optimization and number queries are common. I'm not quite sure why the len(valueIters) == 1 condition exists, so we'd need to do some research into it.

Regarding the allConditions block, the optimization is lost because we generate two predicates (one for int, one for float) under the same attribute name, triggering a LeftJoinIterator instead of a JoinIterator. Possible workarounds I’m considering:

  • Creating a variant of JoinIterator that uses logical OR rather than AND.
  • Exploring parquet.multiRowGroup, parquetquery.UnionIterator, or parquetquery.KeyValueGroupPredicate to see if they can unify the int/float search without losing the optimization.
  • Refactoring the single ColumnChunk into a multi-chunk one.

Given my limited exposure to Tempo’s internals, I’d appreciate any guidance on whether these routes are viable or if there’s a simpler approach to preserve allConditions.

P.S. Do we care about comparisons with negative values? Should it also be covered?

@ndk ndk changed the title WIP: Proposal to address mixed-type attribute querying limitations Mixed-type attribute querying (int/float) Jan 12, 2025
@ndk ndk marked this pull request as ready for review January 12, 2025 11:42
@ndk ndk force-pushed the mixed-type-attr-query branch from 812d768 to 171afab Compare January 12, 2025 12:42
@ndk ndk changed the title Mixed-type attribute querying (int/float) TraceQL: support mixed-type attribute querying (int/float) Jan 12, 2025
@ndk ndk force-pushed the mixed-type-attr-query branch 3 times, most recently from 4970fbd to 50f5ae5 Compare January 14, 2025 16:09
@joe-elliott
Member

joe-elliott commented Jan 14, 2025

This is a really cool change. I ran benchmarks and found no major regressions. Nice tests added in ./tempodb. We try to keep those as comprehensive as possible given the complexity of the language.

I verified that { span.intAttr > span.floatAttr } behaves as expected. Want me to add a test to cover this case?

This case is covered in the ./pkg/traceql tests so I wouldn't worry about it. It occurred to me that this case causes two "OpNone" conditions to the fetch layer and the condition itself is evaluated in the engine, so your changes will not impact it.

I’ve also adjusted how float comparisons work for integer fields, taking into account the fraction part and the comparison operator.

Nice improvements here. I like falling back to integer comparison (or nothing) based on if the float has a fractional part.
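One way to read "fall back to integer comparison (or nothing)" is sketched below. This is an illustration only and not necessarily how the PR implements it: `op`, `kind`, `intPred`, and `intPredFromFloat` are hypothetical names, and the sketch assumes the float operand is finite and within the int64 range.

```go
package main

import (
	"fmt"
	"math"
)

// op is a hypothetical comparison operator, not Tempo's traceql.Operator.
type op int

const (
	opEq op = iota
	opNeq
	opGt
	opGte
	opLt
	opLte
)

// kind says how the integer column should be treated.
type kind int

const (
	compare   kind = iota // evaluate "value <op> operand" on the column
	matchNone             // no integer can satisfy the condition
	matchAll              // every integer satisfies the condition
)

// intPred is the integer-column equivalent of a float condition.
type intPred struct {
	kind    kind
	op      op
	operand int64
}

// intPredFromFloat rewrites "intAttr <o> f" into a condition on integers.
// Assumes f is finite and fits in int64.
func intPredFromFloat(o op, f float64) intPred {
	if f == math.Trunc(f) {
		// No fractional part: plain integer comparison.
		return intPred{compare, o, int64(f)}
	}
	switch o {
	case opEq:
		return intPred{kind: matchNone} // an int never equals 1.5
	case opNeq:
		return intPred{kind: matchAll} // an int always differs from 1.5
	case opGt, opGte:
		// x > 1.5 is x >= 2; x > -1.5 is x >= -1. Ceil handles both signs.
		return intPred{compare, opGte, int64(math.Ceil(f))}
	case opLt, opLte:
		// x < 1.5 is x <= 1; x < -1.5 is x <= -2. Floor handles both signs.
		return intPred{compare, opLte, int64(math.Floor(f))}
	}
	return intPred{kind: matchNone}
}

func main() {
	fmt.Printf("%+v\n", intPredFromFloat(opGt, 1.5))
	fmt.Printf("%+v\n", intPredFromFloat(opLt, -1.5))
	fmt.Printf("%+v\n", intPredFromFloat(opEq, 1.5))
}
```

Using `math.Ceil` and `math.Floor` rather than truncation is what keeps negative operands correct, since `int64(-1.5)` truncates toward zero.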

Regarding the allConditions block, the optimization ...

The right choice would be a UnionOperator on the columns. It would be interesting to compare the performance of that against what you have currently written. I'm less concerned about allConditions than I was previously b/c the root iterators will still behave as if allConditions is true, which is what really drives performance. The benchmarks show your changes are not causing a regression. I'm fine with what you have now, but feel free to experiment with union if you want.

Also, if you're interested, plug your queries into this test and run it. It will dump the iterator structure and you can see how your changes have impacted the hierarchy.

P.S. Do we care about comparisons with negative values? Should it also be covered?

Yes, are they not already? Reviewing your code, I think they would work fine.

I think my primary ask at this point would be to keep the int and float switch cases symmetrical. Even though it's trivial can you create a createFloatPredicateFromInt()? If these two cases read the same line by line it will be easier for others to understand what was done here in the future.

I'm a bit impressed you're taking this on. I wouldn't have guessed someone outside of Grafana would have had the time and patience to find this.

benches
goos: darwin
goarch: arm64
pkg: github.com/grafana/tempo/tempodb/encoding/vparquet4
cpu: Apple M3 Pro
                                                    │ before.txt  │             after.txt              │
                                                    │   sec/op    │   sec/op     vs base               │
BackendBlockTraceQL/spanAttValMatch-11                94.58m ± 0%   95.15m ± 1%  +0.60% (p=0.007 n=10)
BackendBlockTraceQL/spanAttValNoMatch-11              4.913m ± 1%   4.979m ± 1%  +1.34% (p=0.001 n=10)
BackendBlockTraceQL/spanAttIntrinsicMatch-11          71.23m ± 0%   72.79m ± 1%  +2.18% (p=0.000 n=10)
BackendBlockTraceQL/spanAttIntrinsicNoMatch-11        4.940m ± 0%   5.031m ± 0%  +1.83% (p=0.000 n=10)
BackendBlockTraceQL/resourceAttValMatch-11            408.4m ± 1%   413.0m ± 1%  +1.13% (p=0.000 n=10)
BackendBlockTraceQL/resourceAttValNoMatch-11          5.070m ± 1%   5.181m ± 1%  +2.20% (p=0.000 n=10)
BackendBlockTraceQL/resourceAttIntrinsicMatch-11      37.06m ± 0%   37.80m ± 1%  +2.00% (p=0.000 n=10)
BackendBlockTraceQL/resourceAttIntrinsicMatch#01-11   4.891m ± 1%   4.954m ± 1%  +1.29% (p=0.001 n=10)
BackendBlockTraceQL/traceOrMatch-11                   238.5m ± 0%   241.3m ± 0%  +1.17% (p=0.000 n=10)
BackendBlockTraceQL/traceOrNoMatch-11                 238.7m ± 1%   241.4m ± 0%  +1.14% (p=0.000 n=10)
BackendBlockTraceQL/mixedValNoMatch-11                179.1m ± 0%   179.2m ± 0%       ~ (p=0.190 n=10)
BackendBlockTraceQL/mixedValMixedMatchAnd-11          4.949m ± 0%   5.030m ± 0%  +1.64% (p=0.000 n=10)
BackendBlockTraceQL/mixedValMixedMatchOr-11           148.9m ± 1%   148.6m ± 0%       ~ (p=0.529 n=10)
BackendBlockTraceQL/count-11                          340.5m ± 3%   341.2m ± 0%       ~ (p=0.218 n=10)
BackendBlockTraceQL/struct-11                         432.6m ± 2%   432.8m ± 3%       ~ (p=0.796 n=10)
BackendBlockTraceQL/||-11                             165.8m ± 0%   166.1m ± 0%       ~ (p=0.089 n=10)
BackendBlockTraceQL/mixed-11                          28.86m ± 1%   29.08m ± 0%       ~ (p=0.123 n=10)
BackendBlockTraceQL/complex-11                        4.918m ± 4%   4.969m ± 5%       ~ (p=0.123 n=10)
BackendBlockTraceQL/select-11                         4.918m ± 0%   4.999m ± 0%  +1.64% (p=0.000 n=10)
geomean                                               42.28m        42.72m       +1.06%

                                                    │  before.txt  │              after.txt              │
                                                    │     B/s      │     B/s       vs base               │
BackendBlockTraceQL/spanAttValMatch-11                236.8Mi ± 0%   235.4Mi ± 1%  -0.60% (p=0.007 n=10)
BackendBlockTraceQL/spanAttValNoMatch-11              343.9Mi ± 1%   339.4Mi ± 1%  -1.32% (p=0.001 n=10)
BackendBlockTraceQL/spanAttIntrinsicMatch-11          327.0Mi ± 0%   320.0Mi ± 1%  -2.14% (p=0.000 n=10)
BackendBlockTraceQL/spanAttIntrinsicNoMatch-11        501.5Mi ± 0%   492.4Mi ± 0%  -1.80% (p=0.000 n=10)
BackendBlockTraceQL/resourceAttValMatch-11            53.81Mi ± 1%   53.21Mi ± 1%  -1.12% (p=0.000 n=10)
BackendBlockTraceQL/resourceAttValNoMatch-11          177.4Mi ± 1%   173.6Mi ± 1%  -2.16% (p=0.000 n=10)
BackendBlockTraceQL/resourceAttIntrinsicMatch-11      594.1Mi ± 0%   582.4Mi ± 1%  -1.96% (p=0.000 n=10)
BackendBlockTraceQL/resourceAttIntrinsicMatch#01-11   190.9Mi ± 1%   188.5Mi ± 1%  -1.28% (p=0.001 n=10)
BackendBlockTraceQL/traceOrMatch-11                   7.010Mi ± 0%   6.928Mi ± 0%  -1.16% (p=0.000 n=10)
BackendBlockTraceQL/traceOrNoMatch-11                 7.005Mi ± 1%   6.924Mi ± 0%  -1.16% (p=0.000 n=10)
BackendBlockTraceQL/mixedValNoMatch-11                11.01Mi ± 0%   11.00Mi ± 0%       ~ (p=0.303 n=10)
BackendBlockTraceQL/mixedValMixedMatchAnd-11          180.4Mi ± 0%   177.5Mi ± 0%  -1.61% (p=0.000 n=10)
BackendBlockTraceQL/mixedValMixedMatchOr-11           18.52Mi ± 1%   18.56Mi ± 0%       ~ (p=0.492 n=10)
BackendBlockTraceQL/count-11                          64.51Mi ± 2%   64.39Mi ± 0%       ~ (p=0.197 n=10)
BackendBlockTraceQL/struct-11                         12.62Mi ± 2%   12.62Mi ± 3%       ~ (p=0.837 n=10)
BackendBlockTraceQL/||-11                             133.1Mi ± 0%   132.9Mi ± 0%       ~ (p=0.085 n=10)
BackendBlockTraceQL/mixed-11                          740.2Mi ± 1%   734.8Mi ± 0%       ~ (p=0.123 n=10)
BackendBlockTraceQL/complex-11                        183.0Mi ± 4%   181.1Mi ± 5%       ~ (p=0.123 n=10)
BackendBlockTraceQL/select-11                         183.0Mi ± 0%   180.0Mi ± 0%  -1.61% (p=0.000 n=10)
geomean                                               98.15Mi        97.12Mi       -1.05%

                                                    │ before.txt  │              after.txt               │
                                                    │  MB_io/op   │  MB_io/op    vs base                 │
BackendBlockTraceQL/spanAttValMatch-11                 23.48 ± 0%    23.48 ± 0%       ~ (p=1.000 n=10) ¹
BackendBlockTraceQL/spanAttValNoMatch-11               1.772 ± 0%    1.772 ± 0%       ~ (p=1.000 n=10) ¹
BackendBlockTraceQL/spanAttIntrinsicMatch-11           24.43 ± 0%    24.43 ± 0%       ~ (p=1.000 n=10) ¹
BackendBlockTraceQL/spanAttIntrinsicNoMatch-11         2.598 ± 0%    2.598 ± 0%       ~ (p=1.000 n=10) ¹
BackendBlockTraceQL/resourceAttValMatch-11             23.04 ± 0%    23.04 ± 0%       ~ (p=1.000 n=10) ¹
BackendBlockTraceQL/resourceAttValNoMatch-11          943.2m ± 0%   943.2m ± 0%       ~ (p=1.000 n=10) ¹
BackendBlockTraceQL/resourceAttIntrinsicMatch-11       23.09 ± 0%    23.09 ± 0%       ~ (p=1.000 n=10) ¹
BackendBlockTraceQL/resourceAttIntrinsicMatch#01-11   979.0m ± 0%   979.0m ± 0%       ~ (p=1.000 n=10) ¹
BackendBlockTraceQL/traceOrMatch-11                    1.753 ± 0%    1.753 ± 0%       ~ (p=1.000 n=10) ¹
BackendBlockTraceQL/traceOrNoMatch-11                  1.753 ± 0%    1.753 ± 0%       ~ (p=1.000 n=10) ¹
BackendBlockTraceQL/mixedValNoMatch-11                 2.067 ± 0%    2.067 ± 0%       ~ (p=1.000 n=10) ¹
BackendBlockTraceQL/mixedValMixedMatchAnd-11          936.1m ± 0%   936.1m ± 0%       ~ (p=1.000 n=10) ¹
BackendBlockTraceQL/mixedValMixedMatchOr-11            2.893 ± 0%    2.893 ± 0%       ~ (p=1.000 n=10) ¹
BackendBlockTraceQL/count-11                           23.03 ± 0%    23.03 ± 0%       ~ (p=1.000 n=10) ¹
BackendBlockTraceQL/struct-11                          5.726 ± 0%    5.726 ± 0%       ~ (p=1.000 n=10) ¹
BackendBlockTraceQL/||-11                              23.14 ± 0%    23.14 ± 0%       ~ (p=1.000 n=10) ¹
BackendBlockTraceQL/mixed-11                           22.40 ± 0%    22.40 ± 0%       ~ (p=1.000 n=10) ¹
BackendBlockTraceQL/complex-11                        943.7m ± 0%   943.7m ± 0%       ~ (p=1.000 n=10) ¹
BackendBlockTraceQL/select-11                         943.7m ± 0%   943.7m ± 0%       ~ (p=1.000 n=10) ¹
geomean                                                4.351         4.351       +0.00%
¹ all samples are equal

@ndk ndk force-pushed the mixed-type-attr-query branch from b0ffcde to 10b04c5 Compare January 15, 2025 01:25
@ndk
Contributor Author

ndk commented Jan 15, 2025

I'm fine with what you have now, but feel free to experiment with union if you want.

I'm not sure if it's worth it. I'd rather rely on your opinion here.

P.S. Do we care about comparisons with negative values? Should it also be covered?

Yes, are they not already? Reviewing your code, I think they would work fine.

Actually, it turned out they didn't work correctly with negative values. I've updated the shifting logic to fix this. Also, another edge case raises questions: what happens if a float hits MaxInt/MinInt? In some cases, it might cause jumps between MaxInt and MinInt.

I think my primary ask...

Done! Let me know if this aligns with what you had in mind.

Plus, I've added some tests in a separate commit. Feel free to let me know if they look odd or need adjustments.

I'm a bit impressed you're taking this on. I wouldn't have guessed someone outside of Grafana would have had the time and patience to find this.

Haha, thanks! Honestly, it's just curiosity. Tempo is a fascinating system, and I've wanted to dive into something challenging like this. It's fun to learn from real-world systems and see how they tackle performance and scalability. :)

@ndk ndk force-pushed the mixed-type-attr-query branch from 10b04c5 to c05b608 Compare January 15, 2025 02:19
@ndk ndk force-pushed the mixed-type-attr-query branch from c05b608 to a679803 Compare January 17, 2025 14:01
@joe-elliott
Member

Also, another edge case raises questions: what happens if a float hits MaxInt/MinInt? In some cases, it might cause jumps between MaxInt and MinInt.

We could try to get tricky here. Like if you do { span.IntCol > IntMaxAsFloat } then we just don't do the int comparison. { span.IntCol < IntMaxAsFloat } would just return all values from the fetch layer. But I'm also fine with the easy path of just not attempting the float/int comparison if the float is outside the bounds of the int column. It feels like an acceptable edge case as long as we document it.

Done! Let me know if this aligns with what you had in mind.

Yup, I think this communicates better to a future reader what's going on. Thanks for the change.

Ok, I was running your branch on Friday to test and we do have one final thing to figure out. This query does not work:

{ span.http.status_code = 200.0 }

The reason is b/c we handle this special column here:

if entry, ok := wellKnownColumnLookups[cond.Attribute.Name]; ok && entry.level != traceql.AttributeScopeResource {
	if cond.Op == traceql.OpNone {
		addPredicate(entry.columnPath, nil) // No filtering
		columnSelectAs[entry.columnPath] = cond.Attribute.Name
		continue
	}
	// Compatible type?
	if entry.typ == operandType(cond.Operands) {
		pred, err := createPredicate(cond.Op, cond.Operands)
		if err != nil {
			return nil, fmt.Errorf("creating predicate: %w", err)
		}
		addPredicate(entry.columnPath, pred)
		columnSelectAs[entry.columnPath] = cond.Attribute.Name
		continue
	}
}

All well known and dedicated columns are strings ... except this one unfortunately. To do this correctly we have to scan both the well known column as well as the general float attribute column if the static value being compared against http status code is a float. To do this performantly I think we will need to build a UnionIterator that joins two sub iterators. One that scans the well known column and one that scans the float attribute column with the appropriate predicate.
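For readers unfamiliar with the idea, a union of two sub-iterators can be sketched as a two-way merge. This is a toy model and does not use the real parquetquery iterator interface: `rowIter`, `sliceIter`, and `unionIter` are hypothetical, rows are plain ints, and both inputs are assumed to yield row numbers in ascending order.

```go
package main

import "fmt"

// rowIter yields matching row numbers in ascending order (hypothetical
// interface, much simpler than Tempo's parquetquery iterators).
type rowIter interface {
	Next() (row int, ok bool)
}

// sliceIter stands in for one column scan, e.g. the well-known column
// or the generic float attribute column.
type sliceIter struct {
	rows []int
	pos  int
}

func (s *sliceIter) Next() (int, bool) {
	if s.pos >= len(s.rows) {
		return 0, false
	}
	v := s.rows[s.pos]
	s.pos++
	return v, true
}

// unionIter merges two sorted iterators and emits each row once,
// i.e. a logical OR over the two column scans.
type unionIter struct {
	a, b     rowIter
	av, bv   int
	aok, bok bool
}

func newUnion(a, b rowIter) *unionIter {
	u := &unionIter{a: a, b: b}
	u.av, u.aok = a.Next()
	u.bv, u.bok = b.Next()
	return u
}

func (u *unionIter) Next() (int, bool) {
	switch {
	case !u.aok && !u.bok:
		return 0, false
	case !u.bok || (u.aok && u.av < u.bv):
		v := u.av
		u.av, u.aok = u.a.Next()
		return v, true
	case !u.aok || u.bv < u.av:
		v := u.bv
		u.bv, u.bok = u.b.Next()
		return v, true
	default: // same row matched in both columns: emit once, advance both
		v := u.av
		u.av, u.aok = u.a.Next()
		u.bv, u.bok = u.b.Next()
		return v, true
	}
}

// collect drains an iterator into a slice.
func collect(it rowIter) []int {
	var out []int
	for v, ok := it.Next(); ok; v, ok = it.Next() {
		out = append(out, v)
	}
	return out
}

func main() {
	wellKnown := &sliceIter{rows: []int{1, 4, 7}} // rows matching in the dedicated column
	floats := &sliceIter{rows: []int{2, 4, 9}}    // rows matching in the float column
	fmt.Println(collect(newUnion(wellKnown, floats)))
}
```

Because the merge is streaming, neither column has to be materialized, which is the property that would make a union cheaper than falling back to an unfiltered fetch.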

@ndk
Contributor Author

ndk commented Jan 22, 2025

We could try to get tricky here. Like if you do { span.IntCol > IntMaxAsFloat } then we just don't do the int comparison. { span.IntCol < IntMaxAsFloat } would just return all values from the fetch layer.

Sounds like a plan. Will do it later. :)

{ span.http.status_code = 200.0 }

...
All well known and dedicated columns are strings ... except this one unfortunately. To do this correctly we have to scan both the well known column as well as the general float attribute column if the static value being compared against http status code is a float. To do this performantly I think we will need to build a UnionIterator that joins two sub iterators. One that scans the well known column and one that scans the float attribute column with the appropriate predicate.

Oh, that's a nice catch! But before rushing into handling this case, I want to address one quick concern. If a user specifies span.http.status_code = 200.0, isn't that likely just a typo? Automatically converting floats to ints might hide the mistake instead of surfacing it. Even though status codes are technically numbers, they're more like categorical values. 199 isn't "slightly less successful" than 200. It's a completely different outcome.

Anyway, if you see real value in covering this edge case, I'm happy to implement it. Let me know what you think!

Development

Successfully merging this pull request may close these issues.

TraceQL: Ints can't be compared to floats
2 participants