[FEATURE] Multilanguage Heisig data #213

Open
vincentbohlen opened this issue Aug 19, 2024 · 1 comment
Labels
idea New or something to think about new feature New feature or request requires discussion Either complex, underdeveloped, or possibly contentious

Comments

@vincentbohlen

I am not a native English speaker. Most of the Kanji learning material and advice available on the Internet is in English or references English material. While I have no problem understanding English, when applying the RTK method I run into trouble with Heisig's sometimes outlandish choice of keywords, and with his use of synonyms, which makes it important to memorize nuances. This is already difficult in one's native language, and even more difficult when the nuances have not been internalized in a second language.
Since a translated and adapted version of RTK is available in different languages, I prefer using the keywords used in the version published in my native language.

I played around with how to use the Kanji GOD add-on in German. Entering the translations into the custom keyword / custom primitive columns would be a possible solution, but it would be a lot of manual work.
I had an almost complete list of keywords and primitives on my PC, so I wrote some basic SQL to alter my local Kanji.db and overwrote the English keywords/primitives with the German ones. This works great for me.
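For anyone who wants to try a similar local overwrite, here is a minimal sketch of the idea in Python with `sqlite3`. It is not the SQL used above. The table and column names (`characters`, `character`, `keyword`) are assumptions made for illustration and will probably differ from the real Kanji.db schema, and the German keywords are sample data only.

```python
import sqlite3

# Hypothetical schema: the real Kanji.db table and column names may differ.
TABLE = "characters"
KANJI_COL = "character"
KEYWORD_COL = "keyword"


def overwrite_keywords(conn: sqlite3.Connection, translations: dict[str, str]) -> int:
    """Replace the keyword of each listed kanji; return the number of rows changed."""
    changed = 0
    with conn:  # commits on success, rolls back if an error is raised
        for kanji, keyword in translations.items():
            cur = conn.execute(
                f"UPDATE {TABLE} SET {KEYWORD_COL} = ? WHERE {KANJI_COL} = ?",
                (keyword, kanji),
            )
            changed += cur.rowcount  # 0 when the kanji is not in the table
    return changed


if __name__ == "__main__":
    # Demo on an in-memory database rather than a real Kanji.db.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        f"CREATE TABLE {TABLE} ({KANJI_COL} TEXT PRIMARY KEY, {KEYWORD_COL} TEXT)"
    )
    conn.executemany(
        f"INSERT INTO {TABLE} VALUES (?, ?)", [("一", "one"), ("二", "two")]
    )
    # Sample translations; a kanji missing from the table is simply skipped.
    print(overwrite_keywords(conn, {"一": "eins", "二": "zwei", "三": "drei"}))
```

Run anything like this against a copy of the database, not the file the add-on is currently using.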
I am now thinking about also adding Heisig's stories and comments, but I personally don't necessarily need them anymore.
I assume that there are other Japanese learners who would benefit from having a version of Kanji GOD which aligns with the localized version of RTK they might be using. This is not about customizing the data but providing the "official" localized set of keywords as a different base set.

Providing the data for different languages would be a one-time effort. Users could replace the DB manually, but a language selection that loads the matching data may be the nicer solution. Migaku already allows selecting dictionaries for different languages, which removes the mental work of translating from English. Adjusting the Kanji GOD data in the same way would make for a seamless experience.

@mjuhanne
Contributor

mjuhanne commented Aug 21, 2024

@vincentbohlen
Actually, the groundwork for this is already done. There is a fork of Kanji GOD (https://github.com/mjuhanne/Migaku-Kanji-Addon/tree/test_storydb) which contains a bunch of improvements that I haven't yet tried to merge into the main branch.

One of the improvements is Story DB: it moves the stories (Heisig, Koohi) out of Kanji DB into a separate Story DB. In this database, each row contains a set of data for one kanji (source name, keyword, story, primitives). The source here refers to Heisig / Koohi / RRTK / Wanikani / "crowd-sourced". The RRTK and Wanikani data is gathered from a couple of Anki decks, and the crowd-sourced data is a mixture of the best Koohi stories and keywords (manually checked so they don't conflict with the Heisig ones) plus some of my own mnemonics and keywords.

What you (and maybe other users for other languages) would want to do is add another "source" to Story DB (for example heisig_de for German Heisig keywords). The process would be:

  1. create a tab-separated file, in which each row would consist of source name + kanji + keywords
  2. merge those changes into Story DB with a separate Python script
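To make the two steps above concrete, here is a rough sketch of what such a merge could look like. This is not the actual script from the fork. The `stories` table, its columns, and the `merge_tsv` function are hypothetical names chosen for illustration, and the real Story DB layout may differ.

```python
import csv
import io
import sqlite3

# Hypothetical Story DB layout; the fork's real schema and merge script may differ.
SCHEMA = """
CREATE TABLE IF NOT EXISTS stories (
    source   TEXT NOT NULL,
    kanji    TEXT NOT NULL,
    keywords TEXT,
    PRIMARY KEY (source, kanji)
)
"""


def merge_tsv(conn: sqlite3.Connection, tsv_text: str) -> int:
    """Insert or update one row per TSV line of the form: source, kanji, keywords."""
    conn.execute(SCHEMA)
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    # Keep only well-formed rows; blank or malformed lines are skipped.
    rows = [tuple(r) for r in reader if len(r) == 3]
    with conn:  # one transaction for the whole merge
        conn.executemany(
            "INSERT OR REPLACE INTO stories (source, kanji, keywords) VALUES (?, ?, ?)",
            rows,
        )
    return len(rows)


if __name__ == "__main__":
    # Sample rows for a new "heisig_de" source (illustrative data only).
    sample = "heisig_de\t一\teins\nheisig_de\t二\tzwei\n"
    conn = sqlite3.connect(":memory:")
    print(merge_tsv(conn, sample))
```

Keying on (source, kanji) means a corrected TSV can be merged again without creating duplicate rows, and the existing Heisig / Koohi / RRTK / Wanikani rows stay untouched.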

If you'd like to try this approach, let me know and I can walk you through it.

The test branch contains a bunch of other improvements, so you might want to take a look at it anyway:

  • Editing mode for editing actual keywords, stories and primitives list in the Kanji Lookup tool
  • Updated Koohi stories for all available kanji (Kanken 1.5 - 1)
  • More non-Heisig primitives
  • Primitives for all 6000 Kanken kanji
  • Marking conflicting keywords (in Wanikani/RRTK vs Heisig) using strike-through text (see the image below)
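As a rough illustration of the last item, the sketch below flags a keyword from another source when Heisig uses the same keyword for a different kanji, and wraps it in `<s>` tags. That definition of "conflict" is an assumption, the function names are hypothetical, and the "other source" keywords are made-up sample data. The add-on's actual logic may work differently.

```python
import html


def find_conflicts(heisig: dict[str, str], other: dict[str, str]) -> set[str]:
    """Return kanji whose keyword in `other` is Heisig's keyword for a different kanji."""
    # Map each Heisig keyword (case-insensitive) to the kanji it belongs to.
    owner = {kw.lower(): kanji for kanji, kw in heisig.items()}
    return {
        kanji
        for kanji, kw in other.items()
        # No conflict if Heisig doesn't use the keyword, or uses it for this same kanji.
        if owner.get(kw.lower(), kanji) != kanji
    }


def render_keyword(kanji: str, keyword: str, conflicts: set[str]) -> str:
    """Return HTML for the keyword, struck through if it conflicts with Heisig."""
    text = html.escape(keyword)
    return f"<s>{text}</s>" if kanji in conflicts else text


if __name__ == "__main__":
    heisig = {"川": "stream", "河": "river"}
    other = {"川": "river", "河": "river"}  # made-up keywords from another source
    conflicts = find_conflicts(heisig, other)
    for kanji, kw in other.items():
        print(kanji, render_keyword(kanji, kw, conflicts))
```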

I've used it like this for the past 8 months or so now, so it should be fairly stable. If you want to try it, make sure you use the right branch and DON'T FORGET TO BACK UP your previous Kanji GOD directory and Anki decks :)

Here are some screenshots of the current status:

Heisig, RRTK and Wanikani sources:
Screenshot 2024-08-21 at 21 47 21
Screenshot 2024-08-21 at 21 47 31

Koohi and Crowd-sourced:
Screenshot 2024-08-21 at 21 43 37
Screenshot 2024-08-21 at 21 43 59

Edit mode (here editing the crowd-sourced primitives list):
Screenshot 2024-08-21 at 22 23 09
