@veloman-yunkan veloman-yunkan commented Sep 26, 2025

This PR is a less ambitious version of #994 intended to deliver a new feature in a more limited form as soon as possible.

Fixes #731 (will open other issues for future improvements)

@veloman-yunkan veloman-yunkan changed the base branch from main to suggestions_cleanup September 26, 2025 15:58
This is a prototype version of spelling correction attempting to mirror
the client's implementation at
https://github.com/gremid/xapian-spelling-suggestions/

For an unknown reason the new unit test fails as follows:

[ RUN      ] Suggestion.spellingSuggestions
Resolve redirect
set index
test/suggestion.cpp:835: Failure
Expected equality of these values:
  getSpellingSuggestions(a, "Tsunge", 1)
    Which is: {}
  std::vector<std::string> ({"Zunge"})
    Which is: { "Zunge" }
test/suggestion.cpp:841: Failure
Expected equality of these values:
  getSpellingSuggestions(a, "Lax", 1)
    Which is: {}
  std::vector<std::string> ({"Lachs"})
    Which is: { "Lachs" }
test/suggestion.cpp:842: Failure
Expected equality of these values:
  getSpellingSuggestions(a, "Mont", 1)
    Which is: {}
  std::vector<std::string> ({"Mond"})
    Which is: { "Mond" }
test/suggestion.cpp:845: Failure
Expected equality of these values:
  getSpellingSuggestions(a, "Trok", 1)
    Which is: {}
  std::vector<std::string> ({"Trog"})
    Which is: { "Trog" }
test/suggestion.cpp:850: Failure
Expected equality of these values:
  getSpellingSuggestions(a, "Son", 1)
    Which is: {}
  std::vector<std::string> ({"Sohn"})
    Which is: { "Sohn" }
test/suggestion.cpp:852: Failure
Expected equality of these values:
  getSpellingSuggestions(a, "Grahl", 1)
    Which is: { "Stuhl" }
  std::vector<std::string> ({"Gral"})
    Which is: { "Gral" }
test/suggestion.cpp:861: Failure
Expected equality of these values:
  getSpellingSuggestions(a, "aba", 1)
    Which is: {}
  std::vector<std::string> ({"aber"})
    Which is: { "aber" }
test/suggestion.cpp:880: Failure
Expected equality of these values:
  getSpellingSuggestions(a, "Füreschein", 1)
    Which is: {}
  std::vector<std::string> ({"Führerschein"})
    Which is: { "F\xC3\xBChrerschein"
    As Text: "Führerschein" }
[  FAILED  ] Suggestion.spellingSuggestions (280 ms)
@veloman-yunkan veloman-yunkan force-pushed the limited_spelling_correction branch from 4c7c178 to 04d82ad Compare September 27, 2025 13:29
@kelson42 kelson42 linked an issue Sep 28, 2025 that may be closed by this pull request
codecov bot commented Sep 28, 2025

Codecov Report

❌ Patch coverage is 59.25926% with 33 lines in your changes missing coverage. Please review.
✅ Project coverage is 58.14%. Comparing base (5d00100) to head (d99b77b).
⚠️ Report is 11 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/suggestion.cpp | 58.22% | 3 Missing and 30 partials ⚠️ |

❌ Your patch status has failed because the patch coverage (59.25%) is below the target coverage (90.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1007      +/-   ##
==========================================
+ Coverage   58.13%   58.14%   +0.01%     
==========================================
  Files         101      102       +1     
  Lines        5384     5462      +78     
  Branches     2197     2234      +37     
==========================================
+ Hits         3130     3176      +46     
- Misses        795      798       +3     
- Partials     1459     1488      +29     

... and fixed a bug in the test data.

This reduced the count of failures in the Suggestion.spellingSuggestions
unit test from 8 to 1:

[ RUN      ] Suggestion.spellingSuggestions
Resolve redirect
set index
test/suggestion.cpp:841: Failure
Expected equality of these values:
  getSpellingSuggestions(a, "Lax", 1)
    Which is: {}
  std::vector<std::string> ({"Lachs"})
    Which is: { "Lachs" }
[  FAILED  ] Suggestion.spellingSuggestions (260 ms)

The spelling correction "Lax -> Lachs" is not returned because the max
edit distance is capped at (length(query_word) - 1), which reduces the
value we pass for the max edit distance argument from 3 to 2. "Lachs" is
3 edits away from "Lax", so it falls outside the reduced limit.

This problem disappears if the version of libxapian found on Ubuntu
22.04 (libxapian.so.30.11.0) is used instead of the one that we build
ourselves as a base dependency (libxapian.so.30.12.4).
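
The arithmetic behind the cap can be illustrated with a small standalone sketch. It does not call Xapian and is not Xapian's code: `editDistance`, `effectiveMaxEditDistance` and `isReachable` are hypothetical helpers, the distance is a plain byte-wise Levenshtein distance (adequate for the ASCII words used here), and the cap is modelled exactly as described in the commit message above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Plain Levenshtein distance computed over bytes. This is an assumption made
// for illustration; the library's own metric may differ in details.
std::size_t editDistance(const std::string& a, const std::string& b)
{
  std::vector<std::size_t> prev(b.size() + 1), cur(b.size() + 1);
  for (std::size_t j = 0; j <= b.size(); ++j)
    prev[j] = j;  // turning "" into b[0..j) takes j insertions

  for (std::size_t i = 1; i <= a.size(); ++i) {
    cur[0] = i;  // turning a[0..i) into "" takes i deletions
    for (std::size_t j = 1; j <= b.size(); ++j) {
      const std::size_t subst = prev[j - 1] + (a[i - 1] == b[j - 1] ? 0 : 1);
      cur[j] = std::min({prev[j] + 1, cur[j - 1] + 1, subst});
    }
    std::swap(prev, cur);
  }
  return prev[b.size()];
}

// Hypothetical model of the cap: the requested limit is reduced to
// (length(query_word) - 1).
std::size_t effectiveMaxEditDistance(const std::string& word,
                                     std::size_t requested)
{
  if (word.empty())
    return 0;
  return std::min(requested, word.size() - 1);
}

// A candidate can only be suggested if it lies within the capped limit.
bool isReachable(const std::string& query, const std::string& candidate,
                 std::size_t requested)
{
  return editDistance(query, candidate)
         <= effectiveMaxEditDistance(query, requested);
}

// Walks through the "Lax" case from the failing test.
void demoLaxCase()
{
  assert(editDistance("Lax", "Lachs") == 3);         // x->c, +h, +s
  assert(effectiveMaxEditDistance("Lax", 3) == 2);   // 3 is capped to 2
  assert(!isReachable("Lax", "Lachs", 3));           // hence no suggestion
  assert(isReachable("Son", "Sohn", 3));             // 1 edit, cap is 2
}
```

Under this model, any three-letter query can reach only candidates that are at most 2 edits away, regardless of the limit requested by the caller.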
@veloman-yunkan veloman-yunkan force-pushed the limited_spelling_correction branch from b530190 to b325bff Compare September 28, 2025 15:42
Base automatically changed from suggestions_cleanup to main September 29, 2025 12:35
@kelson42 (Contributor)

@veloman-yunkan Can we close this, since kiwix/libkiwix#1230 supersedes it? Is there anything of interest left here that has not been carried over to kiwix/libkiwix#1230?

@veloman-yunkan (Collaborator, Author)

Superseded by kiwix/libkiwix#1230

Development

Successfully merging this pull request may close these issues.

Offer search term spelling corrections

3 participants