
PS-9391 Fixes replication break error because of HASH_SCAN #5528

Draft · wants to merge 1 commit into base: 8.0

Conversation

VarunNagaraju
Contributor

https://perconadev.atlassian.net/browse/PS-9391

Problem

When a replica's slave_rows_search_algorithms is set to HASH_SCAN,
replication may break with HA_ERR_KEY_NOT_FOUND.

Analysis

When a replica's slave_rows_search_algorithms is set to HASH_SCAN,
the applier prepares a unique key list for all the rows in a particular
Rows_event. The same unique key list is later used to retrieve from the
storage engine all tuples associated with each key in the list. When
multiple updates target the same row, as shown in the testcase, this
unique key list may be filled with entries that do not yet exist in the
table. This becomes a problem when an intermediate update changes the
value of the index column to a smaller value than the original entry,
and that changed value is then used in another update, as shown in the
second part of the testcase. It is an issue because the unique key list
is a std::set, which keeps its entries sorted. Because of this sorting,
the first entry of the list can be a value that does not exist in the
table yet, and when it is searched for in the next_record_scan() method,
the lookup fails with HA_ERR_KEY_NOT_FOUND.
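
A minimal standalone sketch of this ordering effect (the integer keys
are hypothetical stand-ins for the key images in the testcase):

#include <cassert>
#include <set>
#include <vector>

int main() {
  // Keys in binlog (insertion) order: the existing key 10 first, then
  // the smaller key 5 that an intermediate update only creates later.
  std::vector<int> binlog_order = {10, 5};
  std::set<int> distinct_keys(binlog_order.begin(), binlog_order.end());

  // std::set iterates in sorted order, so the first key probed is 5,
  // which does not exist in the table yet -> HA_ERR_KEY_NOT_FOUND.
  assert(*distinct_keys.begin() == 5);
  return 0;
}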

Solution

Instead of using a std::set to store the distinct keys, a combination
of an unordered_set and a list is used, which preserves the original
order of the updates while avoiding duplicates at the same time, thus
preventing the side effects of sorting.
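
A minimal standalone sketch of this insertion-order-preserving
deduplication (simplified to int keys; the actual patch stores uchar *
key images with custom hash and equality functors):

#include <cassert>
#include <list>
#include <unordered_set>

// Append a key only if it has not been seen before; the list keeps the
// keys in their original (binlog) order instead of sorted order.
static void add_distinct_key(std::unordered_set<int> &seen,
                             std::list<int> &keys, int key) {
  if (seen.insert(key).second) keys.push_back(key);
}

int main() {
  std::unordered_set<int> seen;
  std::list<int> keys;
  for (int k : {10, 5, 10, 5}) add_distinct_key(seen, keys, k);

  // Duplicates dropped, insertion order preserved: the first key probed
  // is 10, which already exists in the table.
  assert(keys.size() == 2 && keys.front() == 10);
  return 0;
}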


@github-actions github-actions bot left a comment


⚠️ Clang-Tidy found issue(s) with the introduced code (1/1)

sql/log_event.cc (outdated, resolved)
@VarunNagaraju
Contributor Author

@@ -2898,14 +2898,34 @@ class Rows_log_event : public virtual binary_log::Rows_event, public Log_event {
   Key_compare(KEY **ki = nullptr) : m_key_info(ki) {}
   bool operator()(uchar *k1, uchar *k2) const {
     return key_cmp2((*m_key_info)->key_part, k1, (*m_key_info)->key_length,
-                    k2, (*m_key_info)->key_length) < 0;
+                    k2, (*m_key_info)->key_length) == 0;
Collaborator


Rename this helper class to Key_equal.
Compare may be misleading, as it suggests a comparator, which is no longer the case in your scenario. All it does now is check for equality.

Contributor Author


Done
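
For context, the renamed functor presumably looks along these lines (a
sketch reconstructed from the diff above; KEY, uchar, and key_cmp2 are
server-internal types/functions, so this is not standalone-compilable):

// Equality functor over two key images of the same key definition;
// replaces the ordering comparator previously named Key_compare.
class Key_equal {
 public:
  Key_equal(KEY **ki = nullptr) : m_key_info(ki) {}
  bool operator()(uchar *k1, uchar *k2) const {
    return key_cmp2((*m_key_info)->key_part, k1, (*m_key_info)->key_length,
                    k2, (*m_key_info)->key_length) == 0;
  }

 private:
  KEY **m_key_info;
};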

class Key_hash {
public:
Key_hash(KEY **ki = nullptr) : m_key_info(ki) {}
size_t operator()(uchar* ptr) const {
Collaborator


Maybe just create a std::string_view from ptr and (*m_key_info)->key_length and calculate a hash of this string_view using the standard hash function:

std::string_view sv{reinterpret_cast<const char *>(ptr), (*m_key_info)->key_length};
return std::hash<std::string_view>{}(sv);

Contributor Author


Thanks for the suggestion. Done.
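
As a standalone illustration of the suggested approach (hashing a raw
key buffer through std::string_view; the buffer contents here are
hypothetical):

#include <cstddef>
#include <functional>
#include <string_view>

using uchar = unsigned char;

// Hash `len` bytes starting at `ptr` with the standard string_view hasher.
static size_t hash_key(const uchar *ptr, size_t len) {
  std::string_view sv{reinterpret_cast<const char *>(ptr), len};
  return std::hash<std::string_view>{}(sv);
}

int main() {
  const uchar key[] = {0x01, 0x02, 0x03, 0x04};
  size_t h = hash_key(key, sizeof(key));
  (void)h;  // the exact hash value is implementation-defined
  return 0;
}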

sql/log_event.cc (outdated)
@@ -9142,9 +9141,10 @@ int Rows_log_event::add_key_to_distinct_keyset() {
   DBUG_TRACE;
   assert(m_key_index < MAX_KEY);
   key_copy(m_distinct_key_spare_buf, m_table->record[0], m_key_info, 0);
-  std::pair<std::set<uchar *, Key_compare>::iterator, bool> ret =
+  std::pair<std::unordered_set<uchar *, Key_hash, Key_compare>::iterator, bool> ret =
Collaborator


Maybe just auto?

Contributor Author


Done

@@ -9088,14 +9090,11 @@ int Rows_log_event::open_record_scan() {
 
   if (m_key_index < MAX_KEY) {
     if (m_rows_lookup_algorithm == ROW_LOOKUP_HASH_SCAN) {
-      /* initialize the iterator over the list of distinct keys that we have */
-      m_itr = m_distinct_keys.begin();
Collaborator


Shouldn't we change this to m_distinct_key_id = 0?

Contributor Author


m_distinct_key_idx is already set to 0 when it is declared.
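
That is, via an in-class default member initializer, presumably along
these lines (the size_t type is an assumption):

// Starts at 0 at its declaration, so open_record_scan() does not need
// an explicit reset.
size_t m_distinct_key_idx = 0;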

sql/log_event.cc (outdated)
@@ -9068,8 +9068,10 @@ int Rows_log_event::next_record_scan(bool first_read) {
   if ((error = table->file->ha_index_read_map(
           table->record[0], m_key, HA_WHOLE_KEY, HA_READ_KEY_EXACT))) {
     DBUG_PRINT("info", ("no record matching the key found in the table"));
-    if (!is_trx_retryable_upon_engine_error(error))
+    if (!is_trx_retryable_upon_engine_error(error)) {
+      sleep(10);
Collaborator


Is this intentional or just for debugging?

Contributor Author


It was just for debugging. Removed it.
