sizegen: do not ignore type aliases #17556

vmg · 2025-01-16T14:58:39Z

Description

Important fix for the memory-size calculations of in-memory data structures. The bug was causing OOMs in production.

In Go 1.23, go/types started producing a new type alias node (*types.Alias); see golang/go#63223. This initially caused sizegen generation to crash, which @systay fixed in #16650, but that fix was not complete. Since that change, the generator has printed a ton of warnings that were never addressed:

2025/01/16 15:51:18 unhandled type: *types.Alias
2025/01/16 15:51:18 unhandled type: *types.Alias
2025/01/16 15:51:18 unhandled type: *types.Alias
2025/01/16 15:51:18 unhandled type: *types.Alias
2025/01/16 15:51:18 unhandled type: *types.Alias
2025/01/16 15:51:18 unhandled type: *types.Alias
2025/01/16 15:51:18 unhandled type: *types.Alias
2025/01/16 15:51:18 unhandled type: *types.Alias
2025/01/16 15:51:18 unhandled type: *types.Alias
2025/01/16 15:51:18 unhandled type: *types.Alias
2025/01/16 15:51:18 unhandled type: *types.Alias
2025/01/16 15:51:18 unhandled type: *types.Alias
2025/01/16 15:51:18 unhandled type: *types.Alias
2025/01/16 15:51:18 unhandled type: *types.Alias
2025/01/16 15:51:18 unhandled type: *types.Alias
2025/01/16 15:51:18 unhandled type: *types.Alias
2025/01/16 15:51:18 unhandled type: *types.Alias
2025/01/16 15:51:18 unhandled type: *types.Alias
2025/01/16 15:51:18 unhandled type: *types.Alias
2025/01/16 15:51:18 unhandled type: *types.Alias
2025/01/16 15:51:18 WARNING: size of external type regexp.Regexp cannot be fully calculated
2025/01/16 15:51:18 unhandled type: *types.Alias
2025/01/16 15:51:18 unhandled type: *types.Alias
2025/01/16 15:51:18 WARNING: size of external type math/big.Int cannot be fully calculated
2025/01/16 15:51:18 unhandled type: *types.Alias
2025/01/16 15:51:18 unhandled type: *types.Alias
2025/01/16 15:51:18 unhandled type: *types.Alias
2025/01/16 15:51:18 unhandled type: *types.Alias
2025/01/16 15:51:18 unhandled type: *types.Alias
2025/01/16 15:51:18 unhandled type: *types.Alias
2025/01/16 15:51:18 unhandled type: *types.Alias
2025/01/16 15:51:18 WARNING: size of external type regexp.Regexp cannot be fully calculated
2025/01/16 15:51:18 unhandled type: *types.Alias
2025/01/16 15:51:18 unhandled type: *types.Alias
2025/01/16 15:51:18 unhandled type: *types.Alias
2025/01/16 15:51:18 unhandled type: *types.Alias
2025/01/16 15:51:18 unhandled type: *types.Alias
2025/01/16 15:51:18 unhandled type: *types.Alias
2025/01/16 15:51:18 unhandled type: *types.Alias
2025/01/16 15:51:18 unhandled type: *types.Alias
2025/01/16 15:51:18 unhandled type: *types.Alias
2025/01/16 15:51:18 unhandled type: *types.Alias
2025/01/16 15:51:18 saved '/Users/dirkjan/code/vitessio/vitess/go/vt/key/cached_size.go'
2025/01/16 15:51:18 saved '/Users/dirkjan/code/vitessio/vitess/go/pools/smartconnpool/cached_size.go'
2025/01/16 15:51:18 saved '/Users/dirkjan/code/vitessio/vitess/go/vt/vtgate/engine/cached_size.go'
2025/01/16 15:51:18 saved '/Users/dirkjan/code/vitessio/vitess/go/mysql/collations/colldata/cached_size.go'
2025/01/16 15:51:18 saved '/Users/dirkjan/code/vitessio/vitess/go/vt/tableacl/cached_size.go'
2025/01/16 15:51:18 saved '/Users/dirkjan/code/vitessio/vitess/go/vt/schema/cached_size.go'
2025/01/16 15:51:18 saved '/Users/dirkjan/code/vitessio/vitess/go/sqltypes/cached_size.go'
2025/01/16 15:51:19 saved '/Users/dirkjan/code/vitessio/vitess/go/vt/vtgate/evalengine/cached_size.go'
2025/01/16 15:51:19 saved '/Users/dirkjan/code/vitessio/vitess/go/vt/vttablet/tabletserver/cached_size.go'
2025/01/16 15:51:19 saved '/Users/dirkjan/code/vitessio/vitess/go/vt/proto/query/cached_size.go'
2025/01/16 15:51:19 saved '/Users/dirkjan/code/vitessio/vitess/go/vt/srvtopo/cached_size.go'
2025/01/16 15:51:19 saved '/Users/dirkjan/code/vitessio/vitess/go/vt/sqlparser/cached_size.go'
2025/01/16 15:51:19 saved '/Users/dirkjan/code/vitessio/vitess/go/vt/vtenv/cached_size.go'
2025/01/16 15:51:19 saved '/Users/dirkjan/code/vitessio/vitess/go/vt/proto/topodata/cached_size.go'
2025/01/16 15:51:19 saved '/Users/dirkjan/code/vitessio/vitess/go/mysql/collations/cached_size.go'
2025/01/16 15:51:19 saved '/Users/dirkjan/code/vitessio/vitess/go/mysql/json/cached_size.go'
2025/01/16 15:51:19 saved '/Users/dirkjan/code/vitessio/vitess/go/mysql/decimal/cached_size.go'
2025/01/16 15:51:19 saved '/Users/dirkjan/code/vitessio/vitess/go/vt/vtgate/vindexes/cached_size.go'
2025/01/16 15:51:19 saved '/Users/dirkjan/code/vitessio/vitess/go/vt/vttablet/tabletserver/rules/cached_size.go'
2025/01/16 15:51:19 saved '/Users/dirkjan/code/vitessio/vitess/go/vt/vttablet/tabletserver/planbuilder/cached_size.go'
2025/01/16 15:51:19 saved '/Users/dirkjan/code/vitessio/vitess/go/vt/vttablet/tabletserver/schema/cached_size.go'
2025/01/16 15:51:19 saved '/Users/dirkjan/code/vitessio/vitess/go/vt/proto/vttime/cached_size.go'

This might have looked innocent, but it certainly is not. The problem we found shows up specifically in the vttablet consolidation logic, which needs these sizes to account for its cache properly. Row is a type alias for []Value, the actual row storage, and sizegen now ignores it.

This means that the in-use cache size for the consolidator did not count the row data at all, which is the main thing it needs to measure. This in turn leads to excessive memory usage by the consolidator.

This also applies to the normal query consolidator, the query cache, and any other code that depends on CachedSize being accurate in order to limit vttablet memory usage. We're hoping this will cut down on the number of OOMs we see in production.

cc @dbussink

Related Issue(s)

Fixes Bug Report: Sizegen is broken for cached results on v21 and later #17555

Checklist

"Backport to:" labels have been added if this change should be back-ported to release branches
If this change is to be back-ported to previous releases, a justification is included in the PR description
Tests were added or are not required
Did the new or modified tests pass consistently locally and on CI?
Documentation was added or is not required

Deployment Notes

Signed-off-by: Vicent Marti <[email protected]>

vitess-bot · 2025-01-16T14:58:42Z

dbussink · 2025-01-16T14:59:44Z

go/tools/sizegen/sizegen.go

@@ -163,6 +163,8 @@ func (sizegen *sizegen) generateTyp(tt types.Type) {
 		sizegen.generateKnownType(tt)
 	case *types.Alias:
 		sizegen.generateTyp(types.Unalias(tt))
+	default:
+		panic(fmt.Sprintf("unhandled type: %v (%T)", tt, tt))


We now make sure we panic here on unexpected values as well.

dbussink · 2025-01-16T15:00:06Z

go/tools/sizegen/sizegen.go

 	default:
-		log.Printf("unhandled type: %T", node)
-		return nil, 0
+		panic(fmt.Sprintf("unhandled type: %v (%T)", node, node))


We should not just print a warning here, but actively panic. That way, if other new types show up in the future, we can't ignore them again and have to actively investigate.

vmg · 2025-01-16T15:02:08Z

go/sqltypes/cached_size.go

+		for _, elem := range cached.Rows {
+			{
+				size += hack.RuntimeAllocSize(int64(cap(elem)) * int64(32))
+				for _, elem := range elem {
+					size += elem.CachedSize(false)
+				}
+			}
+		}


This was the big missing memory calculation for the size of in-memory query results.

GuptaManan100

LGTM! Excellent find!

codecov · 2025-01-16T16:15:49Z

Codecov Report

Attention: Patch coverage is 0% with 5 lines in your changes missing coverage. Please review.

Project coverage is 67.72%. Comparing base (b103492) to head (755e0e1).
Report is 1 commits behind head on main.

Files with missing lines	Patch %	Lines
go/tools/sizegen/sizegen.go	0.00%	5 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main   #17556      +/-   ##
==========================================
+ Coverage   67.68%   67.72%   +0.03%     
==========================================
  Files        1584     1584              
  Lines      254718   254721       +3     
==========================================
+ Hits       172414   172508      +94     
+ Misses      82304    82213      -91

deepthi · 2025-01-16T16:26:42Z

Isn't there a unit test that could have caught this? We should add one.

dbussink · 2025-01-16T16:28:07Z

> Isn't there a unit test that could have caught this? We should add one.

That's why we changed this to panic instead of only printing a warning. That way we'd hard crash / fail instead of it being possible to ignore the warning.

sizegen: do not ignore type aliases

755e0e1

Signed-off-by: Vicent Marti <[email protected]>

vmg requested review from deepthi, harshit-gangal, mattlord, frouioui, systay and shlomi-noach as code owners January 16, 2025 14:58

dbussink reviewed Jan 16, 2025

dbussink approved these changes Jan 16, 2025

vmg commented Jan 16, 2025

github-actions bot added this to the v22.0.0 milestone Jan 16, 2025

systay approved these changes Jan 16, 2025

GuptaManan100 approved these changes Jan 16, 2025

dbussink merged commit 71ccd6d into main Jan 16, 2025
212 checks passed

dbussink deleted the vmg/sizegen-fix branch January 16, 2025 15:46

vitess-bot pushed a commit that referenced this pull request Jan 16, 2025

sizegen: do not ignore type aliases (#17556)

dd97694

Signed-off-by: Vicent Marti <[email protected]>

vitess-bot bot mentioned this pull request Jan 16, 2025

[release-21.0] sizegen: do not ignore type aliases (#17556) #17557

Merged

dbussink pushed a commit that referenced this pull request Jan 16, 2025

[release-21.0] sizegen: do not ignore type aliases (#17556) (#17557)

16f03ed

Signed-off-by: Vicent Marti <[email protected]> Co-authored-by: vitess-bot[bot] <108069721+vitess-bot[bot]@users.noreply.github.com>

dbussink mentioned this pull request Jan 20, 2025

Improve sizegen to handle more types #17583

Merged
