OCR spelling mistakes #136
Comments
SymSpell takes two factors into account when ranking correction candidates:
SymSpell does not take into account:
Using those factors with weighted Damerau-Levenshtein edit distance for the ranking of correction candidates could significantly improve correction quality.
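To make the idea of weighted re-ranking concrete, here is a minimal sketch. It assumes, purely for illustration, that the weighting factor is the visual similarity of characters. The `SIMILAR` table, its costs, and the function names are hypothetical and not part of SymSpell. The distance is the restricted (optimal string alignment) variant of Damerau-Levenshtein, and the candidates would come from whatever lookup step you already use.

```python
# Hypothetical cost table: visually similar pairs are cheaper to substitute.
# The pairs and weights are assumptions; tune them against real OCR output.
SIMILAR = {
    frozenset("l1"): 0.3,
    frozenset("il"): 0.3,
    frozenset("o0"): 0.3,
    frozenset("ce"): 0.5,
}


def sub_cost(a: str, b: str) -> float:
    """Substitution cost: 0 for equal characters, reduced for look-alikes."""
    if a == b:
        return 0.0
    return SIMILAR.get(frozenset((a, b)), 1.0)


def weighted_osa(s: str, t: str) -> float:
    """Optimal string alignment distance with weighted substitutions."""
    rows, cols = len(s) + 1, len(t) + 1
    d = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = float(i)  # delete all of s[:i]
    for j in range(cols):
        d[0][j] = float(j)  # insert all of t[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            d[i][j] = min(
                d[i - 1][j] + 1.0,  # deletion
                d[i][j - 1] + 1.0,  # insertion
                d[i - 1][j - 1] + sub_cost(s[i - 1], t[j - 1]),  # substitution
            )
            # Transposition of two adjacent characters.
            if i > 1 and j > 1 and s[i - 1] == t[j - 2] and s[i - 2] == t[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1.0)
    return d[-1][-1]


def rerank(word: str, candidates: list[str]) -> list[str]:
    """Order candidates by weighted distance to the OCR token."""
    return sorted(candidates, key=lambda c: weighted_osa(word, c))
```

With this table, `rerank("c1ear", ["cheer", "clear"])` puts "clear" first, because substituting "1" for "l" costs far less than an arbitrary substitution. Note that per-character weights cannot express one-to-many confusions such as "m" read for "in"; those need a different mechanism.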
Thank you! I guess I was hoping someone already made a “low tech” solution in the form of a dictionary for all the kerning and ligature issues.
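A "low tech" dictionary of that kind could be sketched roughly as follows. This is only an illustration under assumptions: the `CONFUSIONS` table is hypothetical and covers just the two examples from this issue, and `vocabulary` stands in for whatever word list you already have. The code expands each character into its possible OCR readings and keeps the variants that are real words.

```python
from itertools import product

# Hypothetical confusion table: what a character in the OCR output may
# originally have been when letters were merged by poor kerning.
CONFUSIONS = {
    "m": ["in", "rn"],
    "n": ["ri"],
}


def ocr_variants(word: str) -> set[str]:
    """All strings obtained by optionally expanding each confusable character."""
    options = [[ch] + CONFUSIONS.get(ch, []) for ch in word]
    return {"".join(parts) for parts in product(*options)}


def correct(word: str, vocabulary: set[str]) -> list[str]:
    """Variants of the OCR token that appear in the vocabulary, sorted."""
    return sorted(v for v in ocr_variants(word) if v in vocabulary)


vocabulary = {"information", "formation", "writing", "with", "a"}
print(correct("mformation", vocabulary))
print(correct("wntmg", vocabulary))
```

The number of variants grows exponentially with the count of confusable characters in a word, so for long tokens it would be sensible to cap the expansion or only run it on tokens that the normal lookup fails to resolve.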
What is the recommended practice for OCR typos that come from, say, poor kerning? Examples below.
mformation --> information
wntmg --> writing
The problem I have is that SymSpell `lookup_compound` seems to suggest other words such as "formation" or "with a" instead of the correct terms. Apologies if this is not the right place to ask.