Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Properly support materialized views restoration #192

Merged
merged 5 commits into from
Mar 21, 2024

Conversation

Khatskevich
Copy link
Contributor

@Khatskevich Khatskevich commented Mar 15, 2024

This pr enables the restoration of databases with materialized views when:

  1. Source table was deleted before backup.
  2. Source table was not backed up by Astacus.

This is achieved by using ATTACH instead of CREATE, which performs
fewer checks at the moment of creation.

Misc: add a test for basic views to make sure they are restored properly
in similar case.

[DDB-890]

@Khatskevich Khatskevich force-pushed the khatskevich-materialized-views branch 3 times, most recently from 0a84969 to 278b323 Compare March 17, 2024 17:26
@codecov-commenter
Copy link

codecov-commenter commented Mar 17, 2024

Codecov Report

Attention: Patch coverage is 45.45455% with 66 lines in your changes are missing coverage. Please review.

Project coverage is 89.10%. Comparing base (2712e4f) to head (587ecf4).

Files Patch % Lines
...tion/coordinator/plugins/clickhouse/test_plugin.py 25.30% 62 Missing ⚠️
...gration/coordinator/plugins/clickhouse/conftest.py 33.33% 4 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #192      +/-   ##
==========================================
- Coverage   89.58%   89.10%   -0.48%     
==========================================
  Files         143      144       +1     
  Lines       10126    10216      +90     
==========================================
+ Hits         9071     9103      +32     
- Misses       1055     1113      +58     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@Khatskevich Khatskevich force-pushed the khatskevich-materialized-views branch from 278b323 to 3156fc6 Compare March 17, 2024 17:43
@Khatskevich Khatskevich marked this pull request as ready for review March 17, 2024 17:51
@@ -188,7 +188,6 @@ async def setup_cluster_content(clients: Sequence[HttpClickHouseClient], use_nam
await clients[0].execute(
b"CREATE TABLE default.mysql (a Int) ENGINE = MySQL('https://host:1234', 'database', 'table', 'user', 'password')"
)
await clients[0].execute(b"CREATE TABLE default.s3 (a Int) ENGINE = S3('http://bucket.s3.amazonaws.com/key.json')")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you make that dependent on the presence of the engine instead of disabling it ?

(By using SELECT name FROM system.table_engines)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have fixed this.
But I have added incomplete set of table engines and did not used it in Table. Is it fine? (not sure if enum should not be used in Table because some new engines can be easily added to Clickhouse)

During test run several Clickhouse clusters are created.
Each Clickhouse instance preallocates many thread pools with many
threads each, especially on machines with many available cores.

This commit decreases the number of threads each Clickhouse instance
spawns during tests.

`setting` helper was necessary to avoid too long lines, because
Clickhouse does not support `\n` in variable declaration.
Comment on lines 410 to 461
await restored_cluster[0].execute(
b"CREATE TABLE default.source_table_for_view_deleted (thekey UInt32, thedata String) "
b"ENGINE = ReplicatedMergeTree ORDER BY (thekey)"
)
await restored_cluster[0].execute(b"INSERT INTO default.source_table_for_view_deleted VALUES (11, '11')")
await restored_cluster[2].execute(b"INSERT INTO default.source_table_for_view_deleted VALUES (17, '17')")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I didn't notice earlier.

This test will mutate the restored_cluster that has a module scope, this will affect other tests.

This needs a separate function_restored_cluster fixture which does the same as the module fixture but function-scoped and without the steps parametrization. (We don't want to make all tests function scoped, since the restore is slow and this test doesn't need to check the partial-restore-and-retry that we use for other tests)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I did it intentionally, I thought it is fine. But probably it is better to not touch state which looks immutable.
The fix looks a bit ugly, but fine. I am not exactly sure what this test is testing, but let it better be there, as someone will want to undelete some table one day.

@Khatskevich Khatskevich force-pushed the khatskevich-materialized-views branch 3 times, most recently from 4a78dd3 to 72574ec Compare March 19, 2024 00:45
@Khatskevich Khatskevich force-pushed the khatskevich-materialized-views branch from 72574ec to 8226a4a Compare March 20, 2024 01:59
@Khatskevich Khatskevich marked this pull request as draft March 20, 2024 07:49
@Khatskevich Khatskevich marked this pull request as ready for review March 21, 2024 00:12
@Khatskevich Khatskevich force-pushed the khatskevich-materialized-views branch from 536bf2e to fa1e44e Compare March 21, 2024 00:13
This commit enables the restoration of databases with materialized
views when:
1. Source table was deleted before backup.
2. Source table was not backed up by Astacus.

This is achieved by using `ATTACH` instead of `CREATE`, which performs
fewer checks at the moment of creation.

Misc: add a test for basic views to make sure they are restored properly
in similar case.

[DDB-890]
1. Deduplicate sorting logic.
2. Extend type checks.
Astacus can be tested against different builds and versions
of Clickhouse.
This commit skips tests with table engines which are likely to be
disabled while supported by Astacus.
@Khatskevich Khatskevich force-pushed the khatskevich-materialized-views branch from fa1e44e to 587ecf4 Compare March 21, 2024 07:51
This commit makes replicas in Clickhouse tests synchronize before
checking if expected data is present. It eliminates situation when
one replica did not pass a fresh data to another one.

[DDB-906]
@Khatskevich Khatskevich force-pushed the khatskevich-materialized-views branch from 587ecf4 to a8ebc86 Compare March 21, 2024 08:28
@kmichel-aiven kmichel-aiven merged commit 7a4b1ae into main Mar 21, 2024
3 checks passed
@kmichel-aiven kmichel-aiven deleted the khatskevich-materialized-views branch March 21, 2024 23:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants