sparse: disable refinement by default #1033

Merged
merged 1 commit into from
Jan 16, 2025

Conversation

sparknack
Contributor

@sparknack sparknack commented Jan 15, 2025

Because the forward index was removed, refinement during search is now
slower than before. To prevent performance degradation when
drop_ratio_search is non-zero, disable refinement by default.


mergify bot commented Jan 15, 2025

@sparknack 🔍 Important: PR Classification Needed!

For efficient project management and a seamless review process, it's essential to classify your PR correctly. Here's how:

  1. If you're fixing a bug, label it as kind/bug.
  2. For small tweaks (less than 20 lines without altering any functionality), please use kind/improvement.
  3. Significant changes that don't modify existing functionalities should be tagged as kind/enhancement.
  4. Adjusting APIs or changing functionality? Go with kind/feature.

For any PR outside the kind/improvement category, ensure you link to the associated issue using the format: “issue: #”.

Thanks for your efforts and contribution to the community!

@sparknack sparknack changed the title from "sparse: make the default value of refine_factor to 1" to "sparse: set the default value of refine_factor to 1" Jan 15, 2025

codecov bot commented Jan 15, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 73.15%. Comparing base (3c46f4c) to head (db09d51).
Report is 292 commits behind head on main.

Additional details and impacted files


@@            Coverage Diff            @@
##           main    #1033       +/-   ##
=========================================
+ Coverage      0   73.15%   +73.15%     
=========================================
  Files         0       82       +82     
  Lines         0     7479     +7479     
=========================================
+ Hits          0     5471     +5471     
- Misses        0     2008     +2008     

see 82 files with indirect coverage changes

@mergify mergify bot added the ci-passed label Jan 15, 2025
@alexanderguzhva
Collaborator

lgtm, but should it affect existing users?

@sparknack
Contributor Author

> lgtm, but should it affect existing users?

Yes. On the current main branch, enabling refinement causes a significant performance degradation, which hurts usability across the whole range of drop_ratio_search values. Turning refinement off by default does lower recall at the same drop_ratio_search, but search remains usable. Users can still use a small drop_ratio_search such as 0.1.
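
To make the trade-off above concrete, here is a minimal sketch of a two-phase sparse search. It assumes that refinement means fetching `k * refine_factor` candidates with a pruned query (the `drop_ratio_search` fraction of smallest-magnitude query entries removed) and then re-scoring those candidates with the full query. The names `SparseVec`, `InnerProduct`, `DropSmallest`, `TopK`, and `Search` are hypothetical and are not Knowhere's API, and the brute-force scoring stands in for the real inverted index.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <unordered_map>
#include <utility>
#include <vector>

// Toy sparse vector: (dimension id, value) pairs. Hypothetical type for this sketch.
using SparseVec = std::vector<std::pair<int, float>>;

// Full inner product between two sparse vectors.
float InnerProduct(const SparseVec& a, const SparseVec& b) {
    std::unordered_map<int, float> lookup(a.begin(), a.end());
    float sum = 0.0f;
    for (const auto& entry : b) {
        auto it = lookup.find(entry.first);
        if (it != lookup.end()) {
            sum += it->second * entry.second;
        }
    }
    return sum;
}

// Remove the `drop_ratio` fraction of query entries with the smallest |value|.
SparseVec DropSmallest(SparseVec query, float drop_ratio) {
    assert(drop_ratio >= 0.0f && drop_ratio < 1.0f);
    std::sort(query.begin(), query.end(),
              [](const std::pair<int, float>& x, const std::pair<int, float>& y) {
                  return std::fabs(x.second) > std::fabs(y.second);
              });
    std::size_t dropped =
        static_cast<std::size_t>(drop_ratio * static_cast<float>(query.size()));
    dropped = std::min(dropped, query.size());
    query.resize(query.size() - dropped);
    return query;
}

// Indices of the k highest scores, best first.
std::vector<std::size_t> TopK(const std::vector<float>& scores, std::size_t k) {
    std::vector<std::size_t> ids(scores.size());
    for (std::size_t i = 0; i < ids.size(); ++i) ids[i] = i;
    k = std::min(k, ids.size());
    std::partial_sort(ids.begin(), ids.begin() + static_cast<std::ptrdiff_t>(k), ids.end(),
                      [&scores](std::size_t a, std::size_t b) { return scores[a] > scores[b]; });
    ids.resize(k);
    return ids;
}

// Two-phase search. refine_factor == 1 skips phase 2 (assumed meaning of "refinement disabled").
std::vector<std::size_t> Search(const std::vector<SparseVec>& docs, const SparseVec& query,
                                std::size_t k, float drop_ratio_search, int refine_factor) {
    assert(refine_factor >= 1);

    // Phase 1: approximate scores using the pruned query, over-fetching candidates.
    const SparseVec pruned = DropSmallest(query, drop_ratio_search);
    std::vector<float> approx(docs.size());
    for (std::size_t i = 0; i < docs.size(); ++i) {
        approx[i] = InnerProduct(docs[i], pruned);
    }
    std::vector<std::size_t> candidates = TopK(approx, k * static_cast<std::size_t>(refine_factor));
    if (refine_factor == 1) {
        return candidates;
    }

    // Phase 2: re-score only the candidates with the full query and keep the best k.
    std::vector<float> exact(candidates.size());
    for (std::size_t j = 0; j < candidates.size(); ++j) {
        exact[j] = InnerProduct(docs[candidates[j]], query);
    }
    std::vector<std::size_t> result;
    for (std::size_t pos : TopK(exact, k)) {
        result.push_back(candidates[pos]);
    }
    return result;
}
```

In this sketch, `refine_factor = 1` returns the phase-1 ranking as is, so a document whose score depends on the dropped query entries can be missed. Larger values recover such documents at the cost of extra full-query scoring, which is the cost this PR avoids by default.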

 KNOWHERE_CONFIG_DECLARE_FIELD(refine_factor)
     .description("refine factor")
-    .set_default(10)
+    .set_default(1)
     .set_range(1, 10, true, true)
     .for_search()
     .for_range_search();
Collaborator

remove for_range_search

 KNOWHERE_CONFIG_DECLARE_FIELD(refine_factor)
     .description("refine factor")
-    .set_default(10)
+    .set_default(1)
     .set_range(1, 10, true, true)
Collaborator

no need to set an upper limit

@sparknack
Contributor Author

issue: #1035

Because the forward index was removed, refinement during search is now
slower than before. To prevent performance degradation when
drop_ratio_search is non-zero, disable refinement by default.

Signed-off-by: Shawn Wang <[email protected]>
@zhengbuqian
Collaborator

/lgtm

@zhengbuqian
Collaborator

/approve

@sre-ci-robot
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: sparknack, zhengbuqian

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@mergify mergify bot added the ci-passed label Jan 16, 2025
@sparknack sparknack changed the title from "sparse: set the default value of refine_factor to 1" to "sparse: disable refinement by default" Jan 16, 2025
@sparknack
Contributor Author

/kind improvement

@sre-ci-robot sre-ci-robot merged commit 8332487 into zilliztech:main Jan 16, 2025
13 of 14 checks passed