This dataset contains about 1,000 (so far) handwritten Ukrainian symbols as 200x200 PNG images. They are listed in `glyphs.csv`, which uses the following columns:
- `label` (string): the lowercase Cyrillic character that the picture represents.
- `transliter_kmu2010` (string): transliteration of the `label` according to the Resolution of the Cabinet of Ministers of Ukraine of 27 January 2010. You may find the transliteration table here. The `'` symbol is used for the soft sign. Both и and й are represented as `y` in this system, so it is better not to use it for hashing.
- `name` (string): unique name of the symbol. `soft_sign` is used for the soft sign.
- `is_uppercase` (bool): whether the letter is uppercase.
- `type` (category): one of `italic` or `printed`. There are two types of Cyrillic handwriting: italic (курсив) and one that imitates printed serif symbols (друковані літери).
- `is_alternate` (category): whether a less common manner of writing was used.
- `top`, `bottom`, `left`, `right` (number): precise boundaries of the glyph on the 200x200 picture, in pixels. You may need these numbers if you want to crop the images.
- `height`, `width` (number): size of the glyph in pixels.
- `filename` (string): path to the picture, in the format `glyphs/[name]-[number].png`.
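As a rough illustration of how the columns above might be used, here is a minimal sketch that parses rows in the documented layout and crops a glyph to its bounding box. The sample rows, their values, and the column order are hypothetical, and the helper names `read_glyph_rows` and `crop_glyph` are made up for this example. It also assumes that `top`/`left` are inclusive and `bottom`/`right` are exclusive; adjust the slices if the CSV uses inclusive bounds.

```python
import csv
import io

# Hypothetical sample in the documented column layout; real values will differ.
SAMPLE_CSV = """label,transliter_kmu2010,name,is_uppercase,type,is_alternate,top,bottom,left,right,height,width,filename
а,a,a,False,italic,False,60,140,70,130,80,60,glyphs/a-1.png
ь,',soft_sign,False,italic,False,50,150,80,120,100,40,glyphs/soft_sign-1.png
"""

INT_COLUMNS = ("top", "bottom", "left", "right", "height", "width")


def read_glyph_rows(text):
    """Parse CSV text into dicts, converting the pixel columns to int."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        for column in INT_COLUMNS:
            row[column] = int(row[column])
        rows.append(row)
    return rows


def crop_glyph(pixels, row):
    """Crop a 2D list of pixel values to the glyph's bounding box.

    Assumption: top/left are inclusive, bottom/right are exclusive.
    """
    lines = pixels[row["top"]:row["bottom"]]
    return [line[row["left"]:row["right"]] for line in lines]


rows = read_glyph_rows(SAMPLE_CSV)
blank = [[255] * 200 for _ in range(200)]  # stand-in for a loaded 200x200 image
cropped = crop_glyph(blank, rows[0])
print(len(cropped), len(cropped[0]))
```

With the real dataset you would read `glyphs.csv` from disk and load each `filename` with an image library of your choice instead of using the blank stand-in.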
The following should appear in the future:
- Set of all ✔️ lowercase and uppercase italic letters
- Alternate forms and handwritten printed letters
- Possible couplings between letters
My respect for those who reached this paragraph. In order to contribute, you may do the following:
- Write a bunch of letters on a piece of paper.
- Take a clear, well-lit photo.
- Open it with your favorite editor (like GIMP or Photoshop).
- Tune brightness/contrast and apply a threshold to the photo. Your photo should contain only black and white pixels.
- Crop the letters and place each one on a white 200x200 canvas.
- Add the images to the `glyphs/` folder, named `[letter name]-[number of the last letter + 1].png`.
- Add a row for each image to `glyphs.csv`.
- Send me a pull request.
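If you would rather script the threshold, canvas, and naming steps than do them by hand, the following is a minimal pure-Python sketch. It is not the author's workflow. The helper names, the threshold value of 128, and centering the glyph on the canvas are all assumptions made for illustration. It also assumes that numbering is tracked separately for each letter name.

```python
import re

CANVAS_SIZE = 200
WHITE, BLACK = 255, 0


def binarize(gray, threshold=128):
    """Map every grayscale value to pure black or pure white."""
    return [[BLACK if value < threshold else WHITE for value in line] for line in gray]


def center_on_canvas(glyph, size=CANVAS_SIZE):
    """Paste a cropped glyph into the middle of a white size x size canvas."""
    height, width = len(glyph), len(glyph[0])
    if height > size or width > size:
        raise ValueError("glyph does not fit on the canvas")
    top, left = (size - height) // 2, (size - width) // 2
    canvas = [[WHITE] * size for _ in range(size)]
    for y, line in enumerate(glyph):
        canvas[top + y][left:left + width] = line
    return canvas


def next_filename(name, existing):
    """Build glyphs/[name]-[last number + 1].png from the files already present.

    Assumption: numbers are counted per letter name, starting from 1.
    """
    pattern = re.compile(r"^glyphs/" + re.escape(name) + r"-(\d+)\.png$")
    numbers = [int(m.group(1)) for m in map(pattern.match, existing) if m]
    return f"glyphs/{name}-{max(numbers, default=0) + 1}.png"


gray = [[30, 200], [220, 90]]  # tiny stand-in for a cropped grayscale letter
canvas = center_on_canvas(binarize(gray))
new_name = next_filename("a", ["glyphs/a-1.png", "glyphs/a-2.png", "glyphs/b-7.png"])
print(new_name)
```

Saving the canvas as a PNG is left to whichever image library you prefer.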
If that looks complicated to you, here's how I did it. It's as simple as a few clicks. Feel free to use my notebook and add any comments.