Non-unicode primitive support and major Kanji database update #191
…sentation. They are instead referenced with [primitive_name] tags in primitives and primitive_of lists. The tags are then converted into image links when constructing the visual representation. Included are .svg files for the non-Unicode primitives, gathered from the https://github.com/cyphar/heisig-rtk-index repository.
…imitives. The file contains modifications to a few select existing kanji and adds references to these missing primitives. Also included is a tool script that merges kanji-ext.tsv into kanji.db. It also recalculates primitive_of references.
- Fix creating cards that have non-Unicode primitives
- Modify Production and Recognition HTML files to use images for these primitives
- Copy primitive .svg files to media collection directory
…ts using [primitive] tag. Tags also work correctly now in primitives and primitive-of lists and popups. Updated all the .svg files (removed width and height attributes so CSS can adjust their size properly).
… and primitives as accurately as possible:
- Add all the missing non-Unicode character primitives
- Fix many mix-ups between primitives ('good luck' vs 'lidded crock', 'chop-seal' vs 'stamps' among others)
- Separate 'hand' and 'fingers' primitives for clarity
- Primitives for 'flowers', 'city walls' and 'pinnacle' now reference the more common radicals instead of weird archaic ones
- Fix many erroneous primitive references (such as 'water' being used instead of 'rice grains')
- Fix many Heisig stories and comments. Also add italic and bold text when referencing keywords

Changes to the original kanji database (as of 7/2023) are read from kanji-ext.tsv. Included is the updated kanji.db. A log of changes can be found in db_merge_log.md in Markdown format.
…ed correctly with white color when used in buttons
- Remove references to odd archaic primitives without a Heisig keyword
- Add many missing references to primitives and fix numerous wrong ones
- Keep a few alternative primitives separate from their main counterparts because they are so distinct visually ('cloak' vs 'garment', 'scarf' vs 'garment')
- Remove erroneous alternative primitive keywords
- Add commentary to many Heisig stories and comments, adding references to similar primitives and kanji to lessen confusion
…ences, comments and links. Minor update to the db_merge tool. Changes can be found in kanji-ext3.tsv and, in more readable form, in db_merge_log_3.md.
See #175 where I have done some simple testing.
…Migaku Kanji database Excel sheet. Updated the .tsv merge tool and added a new script to extract data from user-modified fields into a .tsv patch file.
Just downloaded the latest changes you made. A few notes I've been collecting on entries that still seem to need fixing:
It's possible some of these are leftover garbage cards after rebuilding the deck that I just need to suspend, but reporting just in case.
@calculuschild Thank you for testing!
Corrected
I think there's some confusion about what Heisig actually means by the 'arrowhead'. It's not listed as a separate primitive but is mentioned in the comments of the 'arrow', 'request' and 'dog tag' primitives. As I see it, the arrow is still there in those primitives, but the body is straightened out and merged with other primitives (as happens with many of the kanji in his stories). That's why I haven't created yet another sub-primitive and keep referring to the 'arrow'.
That's a non-Heisig primitive (shared by 換 and 喚), but it will be listed as a secondary/alternate new primitive after the next batch of updates. Later there will be an option to learn with these advanced (non-Heisig) primitives. I'm also adding a feature that lets the user modify the primitive list of each kanji (as well as the Heisig story and comments), to facilitate a crowdsourcing effort to go through all the kanji that are not included in Heisig's books. I'm now in the middle of cross-checking the database against RTK3, but for the rarer stuff (kanji frequency 3000+ to ~12000) I obviously don't have time to do it myself. I'm not a paid worker of the Migaku project, just a volunteer doing this in my free time :)
巛 is now listed as an alternative to 川, and the database should reflect all the changes now.
Please note that not all fixes are committed yet. The next batch should be ready in a few days, and hopefully the database for the 3000 most common kanji should be fairly complete by then.
… RTK3 (kanjis 2000-3000). Minor fixes. Added 'I-ching', 'futon' and 'butchers meeting' non-Unicode primitives
…ds'. When characters are added manually, the scan process will search for new sub-primitives even when the main character is already in the stack. Also show a 'Cancel' button in the KanjiConfirmDialog.
Closing for now. I'll make a new PR that uses the new refactored code (in the other PR) and clean this up a bit too.
Add capability to use primitives that have no Unicode character representation. They are instead referenced with [primitive_name] tags in primitives and primitive_of lists as well as Heisig stories and comments.
The tags are then converted into image links when constructing their visual representation.
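As a rough illustration of the tag-to-image step described above, here is a minimal sketch. It is not the add-on's actual code: the function name `primitive_tags_to_images`, the `primitive_` file prefix, the underscore file naming and the `primitive` CSS class are all assumptions made for the example.

```python
import html
import re

# Matches a [primitive_name] tag. The exact tag grammar is an assumption.
TAG_RE = re.compile(r"\[([^\[\]]+)\]")


def primitive_tags_to_images(text: str, media_prefix: str = "primitive_") -> str:
    """Replace every [name] tag in `text` with an <img> link to an .svg file.

    The file naming scheme (prefix + name with spaces replaced by
    underscores) is hypothetical and only serves the illustration.
    """

    def repl(match: "re.Match[str]") -> str:
        name = match.group(1).strip()
        filename = media_prefix + name.replace(" ", "_") + ".svg"
        # No width/height is set, so CSS on the (assumed) "primitive"
        # class can control the rendered size.
        return '<img class="primitive" src="{}" alt="{}">'.format(
            html.escape(filename, quote=True),
            html.escape(name, quote=True),
        )

    return TAG_RE.sub(repl, text)
```

Ordinary Unicode primitives pass through untouched, so a mixed list such as `氵[lidded crock]` keeps the character and only swaps the tag for an image.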
Major update to kanji.db, trying to replicate Heisig's keywords and primitives as accurately as possible:
Changes to the original kanji database (as of 7/2023) are merged from kanji-ext.tsv files. Included is the updated kanji.db. There are three different batches of updates present in this PR. A log of changes can be found in the corresponding db_merge_log_X.md files in Markdown format (the log and tsv files have now been removed for a cleaner add-on directory, but can be found in the individual commits below if needed).
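The merge step (apply a .tsv patch to the database, then recalculate the primitive_of references) could look roughly like the sketch below. It assumes a simplified, hypothetical schema: a single `kanji` table with `character`, `primitives` and `primitive_of` columns, where lists are stored as comma-separated strings. The real kanji.db layout and the real db_merge tool may differ; the sample rows are demo data only.

```python
import csv
import io
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with the hypothetical schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE kanji ("
        "character TEXT PRIMARY KEY, "
        "primitives TEXT DEFAULT '', "
        "primitive_of TEXT DEFAULT '')"
    )
    conn.executemany(
        "INSERT INTO kanji (character, primitives) VALUES (?, ?)",
        [("日", ""), ("月", ""), ("明", "日")],  # 明 is deliberately incomplete
    )
    return conn


def merge_tsv(conn: sqlite3.Connection, tsv_text: str) -> None:
    """Update existing rows from the patch and insert the ones that are new."""
    for row in csv.DictReader(io.StringIO(tsv_text), delimiter="\t"):
        cur = conn.execute(
            "UPDATE kanji SET primitives = ? WHERE character = ?",
            (row["primitives"], row["character"]),
        )
        if cur.rowcount == 0:
            conn.execute(
                "INSERT INTO kanji (character, primitives) VALUES (?, ?)",
                (row["character"], row["primitives"]),
            )


def recalc_primitive_of(conn: sqlite3.Connection) -> None:
    """Rebuild primitive_of as the reverse mapping of the primitives lists."""
    reverse = {}
    rows = conn.execute(
        "SELECT character, primitives FROM kanji ORDER BY rowid"
    ).fetchall()
    for character, primitives in rows:
        for prim in filter(None, (primitives or "").split(",")):
            reverse.setdefault(prim, []).append(character)
    conn.execute("UPDATE kanji SET primitive_of = ''")
    for prim, users in reverse.items():
        conn.execute(
            "UPDATE kanji SET primitive_of = ? WHERE character = ?",
            (",".join(users), prim),
        )


if __name__ == "__main__":
    db = make_demo_db()
    merge_tsv(db, "character\tprimitives\n明\t日,月\n晶\t日\n")
    recalc_primitive_of(db)
    for line in db.execute("SELECT character, primitives, primitive_of FROM kanji"):
        print(line)
```

Recomputing the reverse mapping from scratch after every merge, rather than patching it incrementally, keeps primitive_of consistent with the primitives lists even when a patch removes a reference.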