French mistakenly detected as Portuguese #63
Comments
Thanks for trying out PeARS! I am not too sure why that page comes up as Portuguese. We rely on an external library for language detection (langdetect) and it is obviously misbehaving there. I tried that page on the Federated version of PeARS, which has better language support, and for me that page comes up as Italian :) It's a mystery, because other pages from the same site do come up as French as they should...

NB: the Orchard version of PeARS is very brittle. It is the one with the most compact representations, but also the most unreliable ones. We are actively working on Federated right now and have a different indexing system, which can potentially be set up in a way that things don't crash entirely when the page language is unreliably recognised. So thanks for reporting this, it shows that we really have to get this sorted out!
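For illustration only, here is a minimal sketch of one way an indexer could avoid dropping a page when the top language guess is not installed: look at all candidate languages and keep the most probable one that is actually installed. This is not how PeARS does it. The helper names `pick_installed_language` and `detect_candidates`, the `min_prob` threshold, and the probabilities in the usage note are all hypothetical. The only langdetect calls assumed are `detect_langs` (which returns candidates with `.lang` and `.prob`) and `DetectorFactory.seed`.

```python
from typing import Iterable, List, Optional, Tuple


def pick_installed_language(
    candidates: Iterable[Tuple[str, float]],
    installed: Iterable[str],
    min_prob: float = 0.2,  # hypothetical cut-off, not a PeARS setting
) -> Optional[str]:
    """Return the most probable candidate language that is installed.

    Returns None when no installed language reaches min_prob, so the
    caller can skip the page with a clear message instead of crashing.
    """
    installed_set = set(installed)
    best_lang = None
    best_prob = min_prob
    for lang, prob in candidates:
        if lang in installed_set and prob >= best_prob:
            best_lang, best_prob = lang, prob
    return best_lang


def detect_candidates(text: str) -> List[Tuple[str, float]]:
    """Get (language, probability) pairs from langdetect.

    The import is done lazily so pick_installed_language can be used
    and tested without langdetect installed.
    """
    from langdetect import DetectorFactory, detect_langs

    # langdetect is non-deterministic unless a seed is set.
    DetectorFactory.seed = 0
    return [(c.lang, c.prob) for c in detect_langs(text)]
```

With made-up probabilities such as `[("pt", 0.57), ("fr", 0.43)]` and installed languages `["simple", "fr"]`, `pick_installed_language` returns `"fr"`. Note that langdetect reports English as `en`, so mapping that code to the `simple` language would need an extra step that is not shown here.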
I followed the Installation and Setup section to install PeARS-orchard, and also installed 'fr' in addition to 'simple'.
I went to Indexer, then to Index a single URL.
The page is not saved, and in the terminal I get:
Language for https://forum-auto.caradisiac.com/topic/21001-mon-ex-megane-2-19-dci-130-confort-expression/#comments : pt
After installing the pt language, the page is indeed saved.