-
Notifications
You must be signed in to change notification settings - Fork 263
Description
Describe the bug
There is a bug in the reverse-link formation logic that can produce duplicate neighbor entries in HNSW graphs.
In form_reverse_links_, when the neighbor list size is below connectivity_max, the new slot is appended without checking whether it already exists:
USearch/include/usearch/index.hpp
Lines 3899 to 3904 in 7306bb4
| // If `new_slot` is already present in the neighboring connections of `close_slot` | |
| // then no need to modify any connections or run the heuristics. | |
| if (close_header.size() < connectivity_max) { | |
| close_header.push_back(new_slot); | |
| continue; | |
| } |
In incremental update scenarios(frequent update + remove), the same slot may be inserted multiple times. (despite the comment stating that nothing should be done if the slot already exists.)
For reference, It is likely that this issue does not occur when the number of connections of a neighbor node exceeds M (or 2M). In such cases, the refinement(heuristic) process(refine_) is expected to naturally eliminate duplicate slots.
Fix
Adding a uniqueness check in this branch prevents duplicate neighbor entries.
if (close_header.size() < connectivity_max) {
if (std::all_of(close_header.begin(), close_header.end(),
[new_slot](compressed_slot_t slot) {
return slot != new_slot;
})) {
close_header.push_back(new_slot);
}
continue;
}I have submitted a bugfix PR(#699) for this issue. I would appreciate it if you could review it.
Steps to reproduce
Perform repeated update(with different vector values) and remove operations on the same slot in index_dense_gt. This would leads to a situation where the same slot can be inserted multiple times during reverse-link formation.
(In my case, I reproduced the issue by implementing deletion manually on top of index_gt, but it is expected to occur in index_dense_gt as well.)
Expected behavior
No duplicate slots should exist in the neighbor list.
USearch version
v2.23.0
Operating System
Red Hat Enterprise Linux release 8.6
Hardware architecture
x86
Which interface are you using?
C++ implementation
Contact Details
Are you open to being tagged as a contributor?
- I am open to being mentioned in the project
.githistory as a contributor
Is there an existing issue for this?
- I have searched the existing issues
Code of Conduct
- I agree to follow this project's Code of Conduct