Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Potential performance issue with LoadTypes #2183

Open
bradleygore opened this issue Dec 5, 2024 · 6 comments
Open

Potential performance issue with LoadTypes #2183

bradleygore opened this issue Dec 5, 2024 · 6 comments
Labels

Comments

@bradleygore
Copy link

Describe the bug
Recently updated from v5.5.5 to v5.7.1 and used the LoadTypes func to load up custom types en masse instead of 1-by-1. We hit some hiccups in that our cluster of 5 servers would spawn up and then our postgres server would crash. Upon troubleshooting the logs showed a pretty massive recursive CTE which led us to the buildLoadDerivedTypesSQL func. We rolled back our release (which had more than just pgx upgrade) and it became stable again. To be clear, I'm not saying this caused the crash - haven't replicated it on any other dev/test environment to figure out for sure. Just wanted to share these findings below.

We have 53 custom types that we load up in an after connect hook with the driver. I then cloned the pgx repo and made use of the test setup to benchmark both ways. I'm not expert at benchmarking, but if this is relatively correct there's a fairly major penalty for using LoadTypes as opposed to iterating the custom types and one-by-one using LoadType. My findings are roughly 20x slower to use the recursive CTE pathway.

To Reproduce
I added tests like this to the derived_types_test.go file against custom types I already have defined in my db.

var myCTypes = []string{
  // all custom type names here
}

func BenchmarkLoadMyCustomTypesBulk(b *testing.B) {
	for i := 0; i < b.N; i++ {
		testRunner := pgxtest.DefaultConnTestRunner()
		testRunner.CreateConfig = func(ctx context.Context, t testing.TB) *pgx.ConnConfig {
			config, err := pgx.ParseConfig("postgres://pguser:pgpass@localhost:5432/pgdb?connect_timeout=30&sslmode=disable")
			require.NoError(t, err)
			return config
		}
		testRunner.RunTest(context.Background(), b, func(ctx context.Context, t testing.TB, conn *pgx.Conn) {
			_, err := conn.LoadTypes(ctx, myCTypes)
			require.NoError(t, err)
		})
	}
}

func BenchmarkLoadMyCustomTypesSerial(b *testing.B) {
	for i := 0; i < b.N; i++ {
		testRunner := pgxtest.DefaultConnTestRunner()
		testRunner.CreateConfig = func(ctx context.Context, t testing.TB) *pgx.ConnConfig {
			config, err := pgx.ParseConfig("postgres://pguser:pgpass@localhost:5432/pgdb?connect_timeout=30&sslmode=disable")
			require.NoError(t, err)
			return config
		}
		testRunner.RunTest(context.Background(), b, func(ctx context.Context, t testing.TB, conn *pgx.Conn) {
			for _, ct := range myCTypes {
				cTyp, err := conn.LoadType(context.Background(), ct)
				require.NoError(t, err)
				conn.TypeMap().RegisterType(cTyp)
			}
		})
	}
}

Then I execute the benchmark tests for 30 seconds and get the result:

go test github.com/jackc/pgx/v5 -run MyCustomTypes -bench=MyCustomTypes -benchtime 30s -benchmem -v

BenchmarkLoadPOSAPICustomTypesBulk
BenchmarkLoadPOSAPICustomTypesBulk-10                 26        1330200502 ns/op          140919 B/op        964 allocs/op
BenchmarkLoadPOSAPICustomTypesSerial
BenchmarkLoadPOSAPICustomTypesSerial-10              553          63574098 ns/op          195709 B/op       1896 allocs/op

Expected behavior
Bulk loading, especially via such a massive recursive CTE, would ideally provide some benefit. Otherwise, the bulk operation could just iterate over the slice of strings and do LoadType for you and register it.

Actual behavior
Seems to be significantly less performant, and potentially risky combination of recursive CTE and dynamic sql 😅

Version

  • Go: go version go1.22.9 darwin/arm64
  • PostgreSQL: PostgreSQL 15.5 (Debian 15.5-1.pgdg110+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 10.2.1-6) 10.2.1 20210110, 64-bit
  • pgx: v5.7.1
@bradleygore bradleygore added the bug label Dec 5, 2024
@jackc
Copy link
Owner

jackc commented Dec 5, 2024

The performance gain by using LoadTypes is primarily saving on the network round trips. If you are on a fast link, it may not make much difference. But I still would not have expected it to be slower.

@nicois, who originally contributed this feature, may have additional thoughts on this.

@bradleygore
Copy link
Author

@jackc Thanks for the insight. My testing was on local machine with db running locally, so it's possible the comparisons would be closer if testing against remote db.

I also added some logging around our wiring up of the db connection, and found that we are repeatedly registering types. I'm wondering if there isn't a better way to wire this up or something.

What we're doing now is using OptionAfterConnect:

//stdlib here is pgx/stdlib
sqldb := stdlib.OpenDB(*pgxCfg, stdlib.OptionAfterConnect(func(_ context.Context, c *pgx.Conn) error {
	return PgxRegisterCustomPgTypes(c)
}))

then inside of that register func, I'm checking to see if the connection's type map already has the first custom type registered or not and logging it out:

cTyps := []string{"custom-type-1", "custom-type-2"}
typMap := c.TypeMap()

_, customTypeExists := typMap.TypeForName(cTyps[0])
logger.Info("PgxRegisterCustomPgTypes called", log15.Ctx{
	"customTypeExists": customTypeExists,
})

I'm seeing this log entry getting called a lot, and customTypeExists is always false. So I'm thinking that maybe I've not wired this up the best way or something. I went back through the wiki and nothing jumped out - apologies if I'm missing something.

Alternatively, if caching custom type registrations to use across connection pool connections isn't something built into pgx, would there be any reason I shouldn't cache the *pgtype.Type values myself and reuse them to cut down on calls to LoadType?

Thanks again 😄

@jackc
Copy link
Owner

jackc commented Dec 14, 2024

I'm seeing this log entry getting called a lot, and customTypeExists is always false. So I'm thinking that maybe I've not wired this up the best way or something. I went back through the wiki and nothing jumped out - apologies if I'm missing something.

Each connection has its own type map, so this will get called every time a new connection is created.

Alternatively, if caching custom type registrations to use across connection pool connections isn't something built into pgx,

It is not built in.

would there be any reason I shouldn't cache the *pgtype.Type values myself and reuse them to cut down on calls to LoadType?

This is not safe for some types because not all Codecs are concurrency safe. In particular, EnumCodec is not as it interns the strings it receives. It's arguable whether this is a bug in EnumCodec. The Codec contract requires that a Codec not be mutated after it is registered with a Map. But that requirement was meant to mean that the Map took ownership, not necessarily that the Codec couldn't modify itself.

Anyway, it's an open question if or what to do about reusing Types, but at the moment it's not safe. Also, it is/was hoped that loading types could be made quick enough that reuse or caching wouldn't be necessary.

@bradleygore
Copy link
Author

Thank you for taking the time to go into the weeds a bit to explain to this level of detail - very much appreciated!

I did a proof of concept to cache the Types just to see what happened, and I figured concurrency would be a concern with connection pooling, so I mutex'd my own map. However, I didn't realize that EnumCodec had its own map for caching bytes vs string and as you say, that won't be concurrency safe 🤔

As for speed, doing them 1x1 with LoadType is working fine. We didn't notice any issue until we tried using LoadTypes - so it probably is fast enough.

If, however, you're open to the idea of caching these...

Some of the types in pgx have a Copy func (e.g. the *Config and such). What if Codec interface had a Copy() Codec func added to it and there was an internal mutex-protected cache of map[string]Type (or even a sync.Map)? Then when LoadType is called consult that first. If an entry found return &Type{Name: found.Name, OID: found.OID, Codec: found.Codec.Copy()} otherwise look up from db and store it.

I'd be happy to put up a PR if that seems reasonable.

All in all, pgx is a really well done body of work! It's a bit lower-level than I normally work at and I've learned a decent bit reading through it 😄 Thanks again!

@jackc
Copy link
Owner

jackc commented Dec 16, 2024

This approach would work in the overwhelming majority of cases, but there is the potential for some nasty edge cases.

  1. DDL that is executed after the pool was created may not be properly handled.
  2. When the pool is connecting to multiple servers that are synced through logical replication there is no guarantee that the same OIDs will be used for custom types.

We have no guarantee that caching is safe.

@bradleygore
Copy link
Author

Oh wow - I didn't realize that OIDs could be different across a replication set 🤯

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants