Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

readwrite-splitting data sources is incorrect when tableless #34387

Open
JoshuaChen opened this issue Jan 18, 2025 · 7 comments · May be fixed by #34340
Open

readwrite-splitting data sources is incorrect when tableless #34387

JoshuaChen opened this issue Jan 18, 2025 · 7 comments · May be fixed by #34340

Comments

@JoshuaChen
Copy link
Contributor

JoshuaChen commented Jan 18, 2025

Bug Report

When using the following configuration, the value passed to rule.findDataSourceGroupRule(logicDataSourceName) is write_ds or read_ds_0, which fails to locate the correct dataSourceGroups (e.g., joshua in the configuration). This causes the ReadwriteSplitting logic to be skipped. This PR fixes the issue by ensuring the logicDataSourceName parameter passed is set to joshua.

mode:
  type: Standalone
  repository:
    type: JDBC

dataSources:
  write_ds:
    dataSourceClassName: com.zaxxer.hikari.HikariDataSource
    url: jdbc:postgresql://127.0.0.1:5432/mydatabase
    username: myuser
    password: secret
  read_ds_0:
    dataSourceClassName: com.zaxxer.hikari.HikariDataSource
    url: jdbc:postgresql://127.0.0.1:5432/mydatabase
    username: myuser
    password: secret
    readOnly: true

rules:
  - !READWRITE_SPLITTING
    dataSourceGroups:
      "joshua":
        writeDataSourceName: write_ds
        readDataSourceNames:
          - read_ds_0
        transactionalReadQueryStrategy: PRIMARY
        loadBalancerName: baseAA
    loadBalancers:
      baseAA: # Load balance algorithm name
        type: ROUND_ROBIN
  - !SINGLE
    tables:
      - "*.*.*"

props:
  sql-show: true

Which version of ShardingSphere did you use?

5.5.0/5.5.1/5.5.2-SNAPSHOT triggers BUG
5.4.1 works normally

Which project did you use? ShardingSphere-JDBC or ShardingSphere-Proxy?

ShardingSphere-JDBC

Expected behavior

Tableless write operations have transactions
Tableless write operations point to write_ds

Actual behavior

Tableless write operations have transactions
Tableless write operations do not always point to write_ds, but sometimes point to read_ds_0

Reason analyze (If you can)

#34340

@strongduanmu
Copy link
Member

What do you mean of tableless write operations? Can you give an example?

@JoshuaChen
Copy link
Contributor Author

sorry, is the following explanation ok?
Or do you need me to provide a demo app?

read operations:

SELECT version();

Expected behavior:
The final choice should be: read_ds_0


Write operations do not include transactions:

SELECT nextval('users_id_seq')

Expected behavior:
The final choice should be: read_ds_0


Write operations include transactions:

BEGIN;
SELECT nextval('users_id_seq')
COMMIT;

Expected behavior:
The final choice should be: write_ds

@strongduanmu
Copy link
Member

@JoshuaChen Thank you for your feedback. To solve this problem, we may need to convert the physical data source in connectionContext.getUsedDataSourceNames() into a logical data source.

@JoshuaChen
Copy link
Contributor Author

JoshuaChen commented Jan 18, 2025

@strongduanmu I don't think so.

If there are aggregatedDataSources with other rules in the configuration, the data in aggregatedDataSources should be taken first.
If there is no aggregateDataSources, then get the logical data source,

If we convert the physical data source in connectionContext.getUsedDataSourceNames() into a logical data source, then when the data source is obtained randomly in the list, the logical data source will still be obtained, and READWRITE_SPLITTING will still be skipped

aggregatedDataSources has a higher priority.

In addition to the READWRITE_SPLITTING rule, we have other rules.

A more complex example would be:

dataSources:
  write_ds:
    dataSourceClassName: com.zaxxer.hikari.HikariDataSource
    url: jdbc:postgresql://127.0.0.1:5432/mydatabase
    username: myuser
    password: secret
  read_ds_0:
    dataSourceClassName: com.zaxxer.hikari.HikariDataSource
    url: jdbc:postgresql://127.0.0.1:5432/mydatabase
    username: myuser
    password: secret
    readOnly: true
  write_ds_2:
    dataSourceClassName: com.zaxxer.hikari.HikariDataSource
    url: jdbc:postgresql://127.0.0.1:5432/mydatabase2
    username: myuser
    password: secret
  read_ds_2:
    dataSourceClassName: com.zaxxer.hikari.HikariDataSource
    url: jdbc:postgresql://127.0.0.1:5432/mydatabase2
    username: myuser
    password: secret
    readOnly: true

rules:
  - !READWRITE_SPLITTING
    dataSourceGroups:
      "joshua":
        writeDataSourceName: write_ds
        readDataSourceNames:
          - read_ds_0
        transactionalReadQueryStrategy: PRIMARY
        loadBalancerName: baseAA
      "joshua2":
        writeDataSourceName: write_ds_2
        readDataSourceNames:
          - read_ds_2
        transactionalReadQueryStrategy: PRIMARY
        loadBalancerName: baseAA
    loadBalancers:
      baseAA:
        type: ROUND_ROBIN
  - !SINGLE
    tables:
      - joshua.public.t_config
      - joshua2.public.t_config2
      

@JoshuaChen
Copy link
Contributor Author

JoshuaChen commented Jan 18, 2025

I updated my PR and added unit tests to try to explain my changes.

I found that BroadcastRoute also does not support it.
So now it can become:

dataSources:
  mydatabase_write_ds:
    dataSourceClassName: com.zaxxer.hikari.HikariDataSource
    url: jdbc:postgresql://127.0.0.1:5432/mydatabase
    username: myuser
    password: secret
  mydatabase_read_ds_0:
    dataSourceClassName: com.zaxxer.hikari.HikariDataSource
    url: jdbc:postgresql://127.0.0.1:5432/mydatabase
    username: myuser
    password: secret
    readOnly: true
  mydatabase_read_ds_1:
    dataSourceClassName: com.zaxxer.hikari.HikariDataSource
    url: jdbc:postgresql://127.0.0.1:5432/mydatabase
    username: myuser
    password: secret
    readOnly: true
  mydatabase2_write_ds:
    dataSourceClassName: com.zaxxer.hikari.HikariDataSource
    url: jdbc:postgresql://127.0.0.1:5432/mydatabase2
    username: myuser
    password: secret
  mydatabase2_read_ds_0:
    dataSourceClassName: com.zaxxer.hikari.HikariDataSource
    url: jdbc:postgresql://127.0.0.1:5432/mydatabase2
    username: myuser
    password: secret
    readOnly: true

rules:
  - !READWRITE_SPLITTING
    dataSourceGroups:
      "dsg_1":
        writeDataSourceName: mydatabase_write_ds
        readDataSourceNames:
          - mydatabase_read_ds_0
          - mydatabase_read_ds_1
        transactionalReadQueryStrategy: PRIMARY
        loadBalancerName: baseAA
      "dsg_2":
        writeDataSourceName: mydatabase2_write_ds
        readDataSourceNames:
          - mydatabase2_read_ds_0
        transactionalReadQueryStrategy: PRIMARY
        loadBalancerName: baseAA
    loadBalancers:
      baseAA:
        type: ROUND_ROBIN

  - !BROADCAST
    tables:
      - t_address

If I'm wrong, please correct me..

@strongduanmu
Copy link
Member

I will spend some time investigating this issue. getUsedDataSourceNames is designed to make use of cached DataSource as much as possible to improve performance. We cannot give it up easily.

@JoshuaChen
Copy link
Contributor Author

Thank you very much, I'd be happy to contribute if there's anything I can help with.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants