Problem with custom word list #33
Comments
Hey @SamDc73. Could you check that your word list satisfies these assumptions?
There's no bias given to words that occur at the beginning; a random word is chosen uniformly. It's possible that the words at the end of the file are being skipped because they don't satisfy the assumptions.
Could you please test it (if you have the time, of course)?
@SamDc73 thank you for posting the word list! First, the word list was not sorted alphabetically. It needs to be sorted according to the 3rd assumption in the list, so I had to sort it myself using:
Then, some of the words in the file had trailing spaces. I removed them using:
(you can use any text editor to do it too, but this is automated). Here's the fixed word list for reference.
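For anyone who wants to do both cleanup steps in one go, here is a minimal sketch in Python. It is not necessarily the commands used above; `normalize_word_list` and `normalize_file` are hypothetical helper names, and the sketch assumes one word per line with all words in the same letter case (a plain `sorted` puts uppercase before lowercase).

```python
def normalize_word_list(lines):
    """Strip surrounding whitespace, drop blank lines, and sort alphabetically."""
    # strip() removes trailing spaces, tabs and newlines (and leading ones too).
    words = [line.strip() for line in lines]
    # Drop lines that were empty or whitespace-only.
    words = [w for w in words if w]
    return sorted(words)


def normalize_file(path):
    # Hypothetical helper: rewrites the file at `path` in place, one word per line.
    with open(path, encoding="utf-8") as f:
        words = normalize_word_list(f)
    with open(path, "w", encoding="utf-8") as f:
        f.write("\n".join(words) + "\n")
```

Calling `normalize_file("words")` would rewrite the `words` file in place, so keep a backup if you want the original ordering.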
After fixing the file, I found a bug in toipe. I was using complete word lists (which had words for all 26 letters), so the code assumed that too. Because of this, a bias was introduced when a list doesn't cover every letter. I will be fixing this by using a better word selection algorithm that doesn't require as many assumptions and isn't so hyper-optimized :) Thank you again for bringing this to my notice and sending the word list.
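To illustrate what selection with fewer assumptions could look like, here is a minimal sketch in Python. This is not toipe's actual code or the planned fix; `pick_words` is a hypothetical name. It loads the whole list into memory and picks each word uniformly by index, so it does not matter whether the list is sorted or covers all 26 letters.

```python
import random


def pick_words(words, n, rng=None):
    """Return n words chosen uniformly at random, with replacement."""
    # A caller can pass a seeded random.Random for reproducible output.
    rng = rng or random.Random()
    # rng.choice picks every position with equal probability, so words at
    # the end of the list are as likely as words at the beginning.
    return [rng.choice(words) for _ in range(n)]
```

The trade-off is memory: the full list has to be read up front, which is fine for a 60-word file but may be what the original, more optimized approach was trying to avoid.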
I have a test file with 60 words in it, and when I run
toipe -f words -n 10
some words occur multiple times, and the words at the end of the text file don't show up at all (the words on the first line or two occur the most).
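A side note on the repeated words: if each word is drawn independently and uniformly (an assumption about how the words are picked, not something confirmed from the code), some repetition is expected even without any bug. The sketch below computes the chance that at least one word repeats; `prob_repeat` is a hypothetical helper name.

```python
def prob_repeat(total_words, draws):
    """Probability that at least one word repeats in `draws` uniform picks."""
    p_all_distinct = 1.0
    for i in range(draws):
        # After i distinct words, (total_words - i) unused words remain.
        p_all_distinct *= (total_words - i) / total_words
    return 1.0 - p_all_distinct


print(prob_repeat(60, 10))
```

For 60 words and 10 draws this comes out a little over one half, so seeing a duplicate in a single run is not by itself a sign of bias. Words at the end of the file never appearing across many runs is a different matter.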