This repository has been archived by the owner on Feb 25, 2023. It is now read-only.

how to convert {{w_xxxxx}} and {{n_xxxxx}} to unicode #6

Open
deathrush opened this issue Nov 17, 2018 · 3 comments

Comments

@deathrush

According to the 外字Unicodeマップ (gaiji Unicode map) page at http://ebstudio.info/manual/EBWin4_man/0_4_5.html,
a map file entry looks like "hA121 u00E0"; there is no 'w' or 'n' prefix.

@FooSoft
Owner

FooSoft commented Nov 20, 2018

Those are indices into the character map for the given dictionary. Yomichan-Import has code to parse these entries; you can check it out here: https://github.com/FooSoft/yomichan-import/blob/master/epwing.go#L172

Character tables have to be created for every EPWING dictionary, since certain 外字 have glyphs that would normally be rendered inside the text.
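
To illustrate the lookup described above, here is a minimal Python sketch. It is not the Yomichan-Import code (that is the Go file linked above). It assumes the markers have the form {{w_NNNNN}} or {{n_NNNNN}}, that the number is a decimal gaiji code, and that w/n select a wide or narrow table. The names `replace_gaiji`, `WIDE`, and `NARROW` are hypothetical, and the two table entries are only sample data taken from this thread.

```python
import re

# Hypothetical per-dictionary tables: decimal gaiji code -> replacement text.
# Assumption: the decimal number in the marker equals the hex code used in
# EBWin4-style map files (0xA577 == 42359, 0xA121 == 41249).
WIDE = {42359: "\u95bd"}    # sample entry: zA577 -> u95BD
NARROW = {41249: "\u00e0"}  # sample entry: hA121 -> u00E0

# Matches {{w_12345}} or {{n_12345}}.
MARKER = re.compile(r"\{\{([wn])_(\d+)\}\}")


def replace_gaiji(text, wide=WIDE, narrow=NARROW, fallback="\ufffd"):
    """Replace gaiji markers with table entries; unknown codes get `fallback`."""
    def substitute(match):
        table = wide if match.group(1) == "w" else narrow
        return table.get(int(match.group(2)), fallback)

    return MARKER.sub(substitute, text)
```

Codes missing from the table are replaced with U+FFFD here so that gaps stay visible; a real importer might prefer to keep the original marker instead.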

@epistularum

> Character tables have to be created for every EPWING dictionary

Is this what you mean by a character table?

zA577	u95BD		#	閽
zA578	u8772		#	蝲
zA579	u6A1D		#	樝
zA57B	u95AB		#	閫
zA57C	u95D0		#	闐
zA57D	u9F97		#	龗
zA57E	u5B7D		#	孽
zA621	u97DB		#	韛
zA622	u65F0		#	旰
zA623	u74EB		#	瓫

If so, installing EBWin4 and browsing to C:\Users\username\AppData\Roaming\EBWin4\GAIJI gives you many such tables.
There are tables for kojien, wadai, meikyou, daijirin, ...
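
For reading tables like the one above programmatically, a rough Python sketch follows. It assumes each line is whitespace-separated, with a key of h or z plus a hex code, a value of u plus one hex code point, and an optional comment after #. Treating h as narrow and z as wide is my assumption, and `parse_map_line` and `load_tables` are hypothetical names. Lines in any other shape are simply skipped.

```python
def parse_map_line(line):
    """Parse one map line such as 'zA577<TAB>u95BD<TAB><TAB>#<TAB>...'.

    Returns (kind, code, char), where kind is 'n' or 'w', or None if the
    line does not match the assumed layout.
    """
    fields = line.split("#", 1)[0].split()  # drop the trailing comment
    if len(fields) < 2:
        return None
    key, value = fields[0], fields[1]
    if key[0] not in "hz" or not value.startswith("u"):
        return None
    try:
        code = int(key[1:], 16)
        char = chr(int(value[1:], 16))
    except ValueError:
        return None
    # Assumption: 'h' marks narrow (half-width) and 'z' wide (full-width) gaiji.
    kind = "n" if key[0] == "h" else "w"
    return kind, code, char


def load_tables(lines):
    """Build {'n': {code: char}, 'w': {code: char}} from an iterable of lines."""
    tables = {"n": {}, "w": {}}
    for line in lines:
        parsed = parse_map_line(line)
        if parsed is not None:
            kind, code, char = parsed
            tables[kind][code] = char
    return tables
```

The resulting integer keys (for example 0xA577 = 42359) could then be compared against the number in a {{w_xxxxx}} marker, if that number is indeed the decimal form of the same code.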

@playHing

@FooSoft
I noticed that the 外字 for several major dictionaries have already been OCR'd in yomichan-import.
Would you mind suggesting or sharing how the OCR can be done in batch?
I would like to contribute to the repo, but I am stuck on the OCR part...

Thanks in advance!
