Umang01-hash
Member

@Umang01-hash Umang01-hash commented Aug 7, 2025

Pull Request Template

Description:

What is DBResolver?

  • Adds a DBResolver module to GoFr, which provides automatic read/write splitting for SQL databases.

  • Read queries (e.g., SELECT) are routed to read replicas, and write queries (e.g., INSERT, UPDATE) are routed to the primary database.

  • Seamlessly wraps the existing SQL datasource: does not require any application code changes for existing queries.

  • Developers interact with c.SQL exactly as before; all routing and failover are fully transparent.

Motivation & Benefits

  • Scalable horizontal read performance with multiple replicas
  • Reduced load on primary database
  • Fault-tolerant: automatic fallback to primary if all replicas fail
  • Clean metrics and tracing support for operational visibility

Example Usage:

import (
    "gofr.dev/pkg/gofr"

    "gofr.dev/pkg/gofr/datasource/dbresolver"
)

func main() {
    a := gofr.New()

    // Enable DBResolver with round-robin strategy and fallback
    resolver := dbresolver.NewProvider("round-robin", true)
    a.AddDBResolver(resolver)

    // Continue as usual: all routes and SQL logic unchanged
    a.GET("/db/read", DBReadHandler)
    a.POST("/db/write", DBWriteHandler)
    a.Run()
}

Configuration Example:

DB_HOST=localhost
DB_USER=root
DB_PASSWORD=rootpassword
DB_NAME=testdb
DB_PORT=3306
DB_DIALECT=mysql
DB_MAX_IDLE_CONNECTION=2
DB_MAX_OPEN_CONNECTION=0

# Replica hosts (comma-separated, e.g., on ports 3307 and 3308)
DB_REPLICA_HOSTS=localhost:3307,localhost:3308
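
For illustration only, here is one way a comma-separated value such as DB_REPLICA_HOSTS could be split into host/port pairs. This is a sketch, not the parsing code from this PR; parseReplicaHosts and replicaAddr are hypothetical names introduced for the example.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// replicaAddr is a hypothetical type for this sketch; it is not part of GoFr.
type replicaAddr struct {
	Host string
	Port string
}

// parseReplicaHosts splits a comma-separated "host:port" list, such as the
// value of DB_REPLICA_HOSTS, into individual addresses.
func parseReplicaHosts(raw string) ([]replicaAddr, error) {
	var addrs []replicaAddr

	for _, entry := range strings.Split(raw, ",") {
		entry = strings.TrimSpace(entry)
		if entry == "" {
			continue // tolerate trailing commas and blank entries
		}

		host, port, err := net.SplitHostPort(entry)
		if err != nil {
			return nil, fmt.Errorf("invalid replica host %q: %w", entry, err)
		}

		addrs = append(addrs, replicaAddr{Host: host, Port: port})
	}

	return addrs, nil
}

func main() {
	addrs, err := parseReplicaHosts("localhost:3307,localhost:3308")
	if err != nil {
		fmt.Println("error:", err)
		return
	}

	for _, a := range addrs {
		fmt.Printf("replica host=%s port=%s\n", a.Host, a.Port)
	}
}
```

An entry without a port (for example just "localhost") is rejected in this sketch; whether the real module applies a default port instead is not stated here.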

Testing Strategy:

  • Primary and Replicas launched with docker-compose (primary:3306, replicas:3307/3308)

  • Replication automated by setup scripts; seed data from SQL dump and app endpoints

  • Load Testing: Performed with Apache JMeter simulating high-concurrency API requests (both reads and writes)

Results:

  • Zero error rate; throughput and latency showed no regression compared to the baseline.

  • Read/write split is fully performant, and scaling is achieved without application change.

Checklist:

  • I have formatted my code using goimports and golangci-lint.
  • All new code is covered by unit tests.
  • This PR does not decrease the overall code coverage.
  • I have reviewed the code comments and documentation for clarity.

Thank you for your contribution!

Umang01-hash and others added 17 commits July 28, 2025 16:36
Bumps [go.opentelemetry.io/otel/exporters/prometheus](https://github.com/open-telemetry/opentelemetry-go) from 0.59.0 to 0.59.1.
- [Release notes](https://github.com/open-telemetry/opentelemetry-go/releases)
- [Changelog](https://github.com/open-telemetry/opentelemetry-go/blob/main/CHANGELOG.md)
- [Commits](open-telemetry/opentelemetry-go@exporters/prometheus/v0.59.0...exporters/prometheus/v0.59.1)

---
updated-dependencies:
- dependency-name: go.opentelemetry.io/otel/exporters/prometheus
  dependency-version: 0.59.1
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>
Bumps [google.golang.org/api](https://github.com/googleapis/google-api-go-client) from 0.243.0 to 0.244.0.
- [Release notes](https://github.com/googleapis/google-api-go-client/releases)
- [Changelog](https://github.com/googleapis/google-api-go-client/blob/main/CHANGES.md)
- [Commits](googleapis/google-api-go-client@v0.243.0...v0.244.0)

---
updated-dependencies:
- dependency-name: google.golang.org/api
  dependency-version: 0.244.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
Bumps [gofr.dev](https://github.com/gofr-dev/gofr) from 1.42.4 to 1.42.5.
- [Release notes](https://github.com/gofr-dev/gofr/releases)
- [Commits](v1.42.4...v1.42.5)

---
updated-dependencies:
- dependency-name: gofr.dev
  dependency-version: 1.42.5
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>
@gizmo-rt
Contributor

@Umang01-hash Can you add more details on the sequences of R (read) and W (write) operations under which the read and write replicas would be selected?

@Umang01-hash
Member Author

@Umang01-hash Can you add more details on the sequences of R (read) and W (write) operations under which the read and write replicas would be selected?

Hey @gizmo-rt, we first determine whether a query is a read or a write. For a read query, we check whether a healthy replica is available: if one is, we send the read to it; if not, we send it to the primary database. Methods like QueryContext and Select have this logic. Methods like Exec and Prepare use the primary database directly. Transaction methods (Begin, BeginTx) are always routed to the primary.

If all replicas are unhealthy at time of a read query and fallback is enabled, the read falls back to the primary. If fallback is disabled, the read fails with an error.
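
As a rough illustration of the routing rules described above (not the actual resolver code in this PR), a minimal Go sketch might look like the following. The names isReadQuery, route, and errNoHealthyReplica are hypothetical, and the SELECT-prefix check is a simplifying assumption about how reads are detected.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// errNoHealthyReplica is a hypothetical error used only in this sketch.
var errNoHealthyReplica = errors.New("no healthy replica and fallback disabled")

// isReadQuery is a deliberately naive classifier (assumption): any statement
// that starts with SELECT counts as a read, everything else as a write.
func isReadQuery(query string) bool {
	q := strings.ToUpper(strings.TrimSpace(query))
	return strings.HasPrefix(q, "SELECT")
}

// route decides where a query goes, given how many replicas are currently
// healthy and whether fallback to the primary is enabled.
func route(query string, healthyReplicas int, fallback bool) (string, error) {
	switch {
	case !isReadQuery(query):
		return "primary", nil // writes always go to the primary
	case healthyReplicas > 0:
		return "replica", nil // reads prefer a healthy replica
	case fallback:
		return "primary", nil // no healthy replica: fall back to the primary
	default:
		return "", errNoHealthyReplica // fallback disabled: the read fails
	}
}

func main() {
	queries := []string{
		"SELECT * FROM users",
		"INSERT INTO users (name) VALUES ('a')",
	}

	// Simulate the worst case: no healthy replicas and fallback disabled.
	for _, q := range queries {
		target, err := route(q, 0, false)
		fmt.Printf("%q -> target=%q err=%v\n", q, target, err)
	}
}
```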

If a replica experiences multiple failures, its circuit breaker opens and the replica is temporarily skipped for queries. The timeout period is 30 seconds by default, and we allow 5 failures per replica before the circuit breaker opens.
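
To make the circuit breaker behaviour concrete, here is a small sketch under stated assumptions. It is not the implementation in this PR: the type and method names are hypothetical, it assumes the breaker opens on the fifth recorded failure, and it simply closes again once the timeout has elapsed (no half-open probing).

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// Defaults taken from the description above: 5 failures, 30 second timeout.
const (
	defaultMaxFailures = 5
	defaultTimeout     = 30 * time.Second
)

// circuitBreaker is a hypothetical per-replica breaker for this sketch.
type circuitBreaker struct {
	mu          sync.Mutex
	failures    int
	maxFailures int
	timeout     time.Duration
	open        bool
	openedAt    time.Time
}

func newCircuitBreaker(maxFailures int, timeout time.Duration) *circuitBreaker {
	return &circuitBreaker{maxFailures: maxFailures, timeout: timeout}
}

// allow reports whether the replica may be used at time now. Once the timeout
// has elapsed, the breaker closes and the failure count is reset.
func (cb *circuitBreaker) allow(now time.Time) bool {
	cb.mu.Lock()
	defer cb.mu.Unlock()

	if !cb.open {
		return true
	}

	if now.Sub(cb.openedAt) >= cb.timeout {
		cb.open = false
		cb.failures = 0

		return true
	}

	return false
}

// recordFailure counts a failure and opens the breaker at the threshold.
func (cb *circuitBreaker) recordFailure(now time.Time) {
	cb.mu.Lock()
	defer cb.mu.Unlock()

	cb.failures++
	if cb.failures >= cb.maxFailures {
		cb.open = true
		cb.openedAt = now
	}
}

// recordSuccess resets the failure count after a successful query.
func (cb *circuitBreaker) recordSuccess() {
	cb.mu.Lock()
	defer cb.mu.Unlock()

	cb.failures = 0
}

func main() {
	cb := newCircuitBreaker(defaultMaxFailures, defaultTimeout)
	start := time.Now()

	for i := 0; i < defaultMaxFailures; i++ {
		cb.recordFailure(start)
	}

	fmt.Println("allowed right after opening:", cb.allow(start))
	fmt.Println("allowed after the timeout:", cb.allow(start.Add(defaultTimeout)))
}
```

Passing the current time in explicitly keeps the sketch easy to test without sleeping; a real resolver would more likely read the clock internally.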

For an example sequence like R,R,W,R:

1st R: Routed to replica

2nd R: Routed to the next replica (which one depends on the chosen strategy: random or round-robin)

1st W: Routed to primary

3rd R: Routed to next replica
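
The R,R,W,R sequence above can be traced with a small round-robin sketch. This is an illustration only, assuming two healthy replicas and the round-robin strategy; roundRobin and routeSequence are hypothetical names, not the types added in this PR.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// roundRobin hands out replica indexes in a repeating cycle.
type roundRobin struct {
	next  atomic.Uint64
	count int
}

// pick returns the index of the next replica; it is safe for concurrent use.
func (rr *roundRobin) pick() int {
	n := rr.next.Add(1) - 1
	return int(n % uint64(rr.count))
}

// routeSequence maps a string of operations ('R' or 'W') to targets.
// Writes go to the primary; reads rotate across the replicas. With no
// replicas configured, reads go to the primary as well in this sketch.
func routeSequence(ops string, replicas int) []string {
	rr := &roundRobin{count: replicas}
	targets := make([]string, 0, len(ops))

	for _, op := range ops {
		switch {
		case op == 'R' && replicas > 0:
			targets = append(targets, fmt.Sprintf("replica-%d", rr.pick()))
		case op == 'R' || op == 'W':
			targets = append(targets, "primary")
		}
	}

	return targets
}

func main() {
	// Prints: [replica-0 replica-1 primary replica-0]
	fmt.Println(routeSequence("RRWR", 2))
}
```

Note that the write does not advance the rotation here, so the third read wraps back to the first replica.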

Contributor

@ccoVeille ccoVeille left a comment

I let the discussion keep going as I wasn't sure where it would land.

So functionally, I'm not sure whether it's OK.

But the code seems OK

@akshat-kumar-singhal
Contributor

@Umang01-hash Hope we are routing the reads within a transaction to the primary? I couldn't find the corresponding code for it.

@Umang01-hash
Member Author

Hope we are routing the reads within a transaction to the primary? I couldn't find the corresponding code for it.

Yes @akshat-kumar-singhal, reads within transactions are always routed to the primary database because the transaction itself is created on the primary connection. Since Begin() returns a transaction object from the primary database, all subsequent operations on this transaction (including reads) automatically use the primary connection.

// Begin always routes to primary (transactions).
func (r *Resolver) Begin() (*gofrSQL.Tx, error) {
    r.stats.totalQueries.Add(1)
    return r.primary.Begin()
}

Contributor

@akshat-kumar-singhal akshat-kumar-singhal left a comment

Try to break the files into smaller logical groups - readability is a bit low right now.

Comment on lines 60 to 80
func (r *ResolverWrapper) Build(primary container.DB, replicas []container.DB) (container.DB, error) {
    if primary == nil {
        return nil, errPrimaryNil
    }

    // Create options slice
    var opts []Option

    // Default to round-robin
    strategy := NewRoundRobinStrategy(len(replicas))

    if r.strategyName == "random" {
        strategy = NewRandomStrategy()
    }

    // Add options.
    opts = append(opts, WithStrategy(strategy), WithFallback(r.readFallback))

    // Create and return the resolver.
    return NewResolver(primary, replicas, r.logger, r.metrics, opts...), nil
}
Member

what if we use the existing Connect method?

@Umang01-hash Umang01-hash self-assigned this Aug 29, 2025
@gofr-dev gofr-dev deleted a comment from coolwednesday Aug 29, 2025