Feature/cleanup fuzzyreplace #123
base: master
Conversation
jaderabbit
commented
Jul 21, 2020
- Replaced fuzzywuzzy with the C++ based rapidfuzz
- Added a progress bar for fuzzywuzzy
- Moved all the installations to the top cell
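For readers unfamiliar with the swap, here is a minimal sketch of what a fuzzy lookup with rapidfuzz might look like. It is not the notebook's actual code: the helper name `best_match` and the cutoff of 90 are illustrative assumptions, and the `difflib` fallback is there only so the snippet still runs where rapidfuzz is not installed.

```python
from difflib import SequenceMatcher

try:
    # rapidfuzz is the C++-backed library this PR switches to.
    from rapidfuzz import fuzz, process
except ImportError:
    # Assumption: fall back to the standard library if rapidfuzz is absent.
    fuzz = process = None


def best_match(query, choices, cutoff=90.0):
    """Return (choice, score) for the closest choice scoring >= cutoff, else None.

    `best_match` and the default cutoff are hypothetical, chosen for illustration.
    """
    if process is not None:
        # extractOne returns (choice, score, index), or None below score_cutoff.
        hit = process.extractOne(
            query, choices, scorer=fuzz.ratio, score_cutoff=cutoff
        )
        return None if hit is None else (hit[0], hit[1])

    # Pure-Python fallback: similar in spirit, but scores are not guaranteed
    # to equal rapidfuzz's for non-identical strings.
    best = None
    for choice in choices:
        score = 100.0 * SequenceMatcher(None, query, choice).ratio()
        if score >= cutoff and (best is None or score > best[1]):
            best = (choice, score)
    return best
```

A call such as `best_match(sentence, test_sentences)` would then return the near-duplicate, if any, that should be filtered out.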
This looks awesome, @JADE.
Did we get the time to benchmark it?
How long does it take on a medium-size dataset?
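One way to answer the benchmarking question is a small timing harness like the sketch below. Everything in it is an assumption for illustration: `time_pairwise` and `placeholder_scorer` are hypothetical names, and the `difflib` scorer is only a stand-in so the snippet runs without extra installs. In practice one would pass `fuzzywuzzy.fuzz.ratio` and `rapidfuzz.fuzz.ratio` in turn and compare the timings on the real dataset.

```python
import time
from difflib import SequenceMatcher


def placeholder_scorer(a, b):
    """Stand-in scorer (assumption); swap in the library scorer under test."""
    return 100.0 * SequenceMatcher(None, a, b).ratio()


def time_pairwise(scorer, queries, choices, repeats=3):
    """Return the best wall-clock time in seconds, over `repeats` runs,
    for scoring every (query, choice) pair with `scorer`."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        for query in queries:
            for choice in choices:
                scorer(query, choice)
        best = min(best, time.perf_counter() - start)
    return best
```

Taking the minimum over several repeats reduces noise from other processes; the number to report is the time on a dataset of realistic size, not on toy input.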
pandas
p_tqdm
rapidfuzz
joeynmt
@jaderabbit and @juliakreutzer, is this the right way to install joeynmt, i.e. with pip install joeynmt?
The last time I tried it, it gave me headaches and I was forced to install it from GitHub
using pip install git+https://github.com/joeynmt/joeynmt.git
Also, the version is missing.
@espoirMur I added it to PyPI a few weeks ago, so this should be fine now - I still need to update the folder locations further down so we don't need to install joeynmt by cloning it.
Yes, I should do versions for all the requirements - thanks for the push!
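As a possible starting point for pinning, the sketch below reads the versions installed in the current environment and formats them as requirement lines. It is an illustration only: `pinned_requirements` is a hypothetical helper, it relies on the standard-library `importlib.metadata` (Python 3.8+), and the versions it prints are whatever happens to be installed, not versions this PR has validated.

```python
from importlib.metadata import PackageNotFoundError, version


def pinned_requirements(names):
    """Return one 'name==version' line per installed package in `names`.

    Packages that are not installed yield a comment line instead, so the
    gap is visible rather than silently dropped.
    """
    lines = []
    for name in names:
        try:
            lines.append(f"{name}=={version(name)}")
        except PackageNotFoundError:
            lines.append(f"# {name}: not installed in this environment")
    return lines


# The four packages listed in the requirements under review.
for line in pinned_requirements(["pandas", "p_tqdm", "rapidfuzz", "joeynmt"]):
    print(line)
```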
If it is on PyPI, that is cool.
@espoirMur I'm actually going to change it.
LGTM
If that is the speed we got, it's really, really nice.
That's an awesome improvement, thank you @jaderabbit! 💯
@jaderabbit shall we merge this? Or not because of the progress bar?