The chunker needs punctuation to work properly #29
Comments
Yeah, this is a known issue documented here: https://github.com/dakrone/clojure-opennlp#known-issues It's something that the OpenNLP library does, not clojure-opennlp.
Seems fair - thanks for the reply. And sorry for not spotting that disclaimer.
Would be nice if you could report this to OpenNLP, so it can be fixed in the next version.
I think OpenNLP 1.7.2, the version this project is using right now, has fixed the punctuation problem. So maybe we can include the end punctuation? Also, I notice that OpenNLP produces the phrase tag "O", whereas clojure-opennlp does not include "O" in its output.
Using the definitions of tokenize, pos-tag, and chunker from the readme, and the 1.5.1 versions of the model files, the following behaviour is observed:
The pos-tag output seems correct, however.
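For anyone trying to reproduce this, here is a minimal sketch of the setup described above. The model file paths and the sample sentence are assumptions for illustration (they are not taken from the original report), and the models have to be downloaded separately. It runs the same tokens through the chunker with and without a trailing period so the two results can be compared side by side.

```clojure
;; Assumes clojure-opennlp is on the classpath and the 1.5.x English
;; model files live under models/ (hypothetical paths; adjust to your setup).
(require '[opennlp.nlp :refer [make-tokenizer make-pos-tagger]]
         '[opennlp.treebank :refer [make-treebank-chunker]])

(def tokenize (make-tokenizer "models/en-token.bin"))
(def pos-tag  (make-pos-tagger "models/en-pos-maxent.bin"))
(def chunker  (make-treebank-chunker "models/en-chunker.bin"))

;; Hypothetical sample sentence, with and without end punctuation.
(def with-period    "The quick brown fox jumps over the lazy dog.")
(def without-period "The quick brown fox jumps over the lazy dog")

(defn chunk-sentence
  "Tokenizes, POS-tags, and chunks a single sentence string."
  [s]
  (chunker (pos-tag (tokenize s))))

;; Print both results; compare the last phrase of each to see
;; whether the missing period changes what the chunker returns.
(doseq [s [with-period without-period]]
  (println s)
  (doseq [phrase (chunk-sentence s)]
    (println "  " phrase)))
```

Printing the pos-tag output for the same two strings alongside this should show whether the difference comes from the tagger or only appears at the chunking step.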