Add Yiddish support to Yomitan #1567

Draft
wants to merge 5 commits into
base: master


ThatsItForTheOtherOne commented Nov 5, 2024

This pull request adds support for Yiddish.

It is not perfect, and more or better transforms could be added, but I believe this is sufficient for now. The preprocessor strips nikudes, so the dictionary needs to have them stripped as well to prevent matching issues. The postprocessor, however, makes sure ligatures are not an issue.
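For readers unfamiliar with the two steps, here is a minimal sketch of what stripping nikudes and handling ligatures can look like. It is not the code from this PR: the function names are hypothetical, it does not use Yomitan's actual preprocessor API, and it assumes ligatures are split into their two-letter spellings (the PR may handle them in the other direction or generate both variants).

```typescript
// Hypothetical helpers for illustration only; not taken from the PR.

// Hebrew points used as nikudes: U+05B0-U+05BD, rafe (U+05BF),
// shin/sin dots (U+05C1, U+05C2), upper/lower dots (U+05C4, U+05C5),
// and qamats qatan (U+05C7). Punctuation such as maqaf (U+05BE) is kept.
const NIKUDES = /[\u05B0-\u05BD\u05BF\u05C1\u05C2\u05C4\u05C5\u05C7]/g;

// Yiddish ligature code points and their two-letter spellings.
const LIGATURES: Record<string, string> = {
    '\u05F0': '\u05D5\u05D5', // tsvey vovn -> vov + vov
    '\u05F1': '\u05D5\u05D9', // vov yud    -> vov + yud
    '\u05F2': '\u05D9\u05D9', // tsvey yudn -> yud + yud
};

// NFD first, so precomposed forms such as U+FB2F (komets alef)
// are broken into base letter + point before the points are removed.
function stripNikudes(text: string): string {
    return text.normalize('NFD').replace(NIKUDES, '');
}

// Replace each ligature code point with its two-letter spelling.
function splitLigatures(text: string): string {
    return text.replace(/[\u05F0-\u05F2]/g, (ch) => LIGATURES[ch] ?? ch);
}
```

With these definitions, `splitLigatures(stripNikudes('\uFB1F'))` (pasekh tsvey yudn) yields two plain yuds, `'\u05D9\u05D9'`. A dictionary processed the same way would then match the same lookup text.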

My code is a bit ugly; I have never worked with TypeScript, so if anyone can suggest ways to make it nicer, that would be greatly appreciated. Thank you!


capt-v commented Nov 5, 2024

This is the dictionary we would use to test. It was generated with kty, with nikudes-less versions of the entries added after them.
kty-yi-en.zip

Kuuuube added the kind/enhancement and area/linguistics labels Nov 6, 2024
@jamesmaa
Collaborator

What's the status of this PR? I haven't reviewed it since it's still in draft

@ThatsItForTheOtherOne
Author

It works, but I wanted to run it by a Yiddish speaker or a linguist first.

@jamesmaa
Collaborator

I can make a call out to the Yiddish subreddit for review if you need help finding folks.

@ThatsItForTheOtherOne
Author

ThatsItForTheOtherOne commented Nov 30, 2024

Feel free. @capt-v and I just went through and added a few missing plural forms. I'm contemplating a few more changes though, tbh. I wonder if it would be better to remove ende letters in the preprocessor and just assume there are no ende letters at all, since the same transforms would then also work on Soviet orthography (which removes ende letters).

Soviet orthography is very rare but if a relatively small change adds that much more, it shouldn't be a terrible idea. Wondering also if I'm overcomplicating all of this.
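As a rough illustration of the idea above, a preprocessor that folds each ende (final-form) letter into its regular form could look like the sketch below. The function name is hypothetical and this is not code from the PR; it assumes the mapping is applied to every occurrence, so the dictionary terms would need the same treatment to match.

```typescript
// Hypothetical helper for illustration only; not taken from the PR.

// Each ende (final-form) letter mapped to its regular form.
const ENDE_TO_REGULAR: Record<string, string> = {
    '\u05DA': '\u05DB', // langer khof  -> khof
    '\u05DD': '\u05DE', // shlos mem    -> mem
    '\u05DF': '\u05E0', // langer nun   -> nun
    '\u05E3': '\u05E4', // langer fey   -> fey
    '\u05E5': '\u05E6', // langer tsadek -> tsadek
};

// Replace every ende letter with its regular counterpart, so that
// standard and Soviet spellings collapse to the same string.
function removeEndeLetters(text: string): string {
    return text.replace(
        /[\u05DA\u05DD\u05DF\u05E3\u05E5]/g,
        (ch) => ENDE_TO_REGULAR[ch] ?? ch,
    );
}
```

For example, `removeEndeLetters('\u05E9\u05DC\u05D5\u05DD')` returns `'\u05E9\u05DC\u05D5\u05DE'`: the closing shlos mem becomes a regular mem, which is how the Soviet spelling would already be written.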

…lef and komets alef, and demutation of vov yud to ayin
@jamesmaa
Collaborator

jamesmaa commented Dec 2, 2024

Honestly, I would be fine with minimal review for this PR. As we get more Yiddish users, we can follow up with future fixes and more conjugation coverage.

@ThatsItForTheOtherOne
Author

Should I then make it not a draft?

@jamesmaa
Collaborator

If you feel like it's shippable and you're not going to improve it in the near future, then yeah
