A morphological parser for Russian that splits words into morphemes: prefixes, roots, infixes and postfixes.

konverner/morpholog

Morpholog

Morpholog is a tool for working with the morphological structure of Russian words.

Get started

Installation:


pip install morpholog

Documentation

  1. Tokenize a word into morphemes:

from morphological_tokenizer import Morpholog

morph = Morpholog()
morph.tokenize('ДОИМПЕРИАЛИСТИЧЕСКИМ')
['до-', 'империал', '-ист-', '-ическ-', '+им']

token- : prefix

token : root

-token- : infix

+token : ending

-token : postfix
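
The marker notation above can be decoded mechanically. The following is a minimal sketch, not part of Morpholog: `classify_token` and `strip_markers` are hypothetical helper names, and the code assumes only the five marker shapes listed in the legend. It works on a plain list of strings, so it does not need the library installed.

```python
def classify_token(token: str) -> str:
    """Return the morpheme type implied by the marker notation in the legend."""
    if token.startswith('+'):
        return 'ending'            # +token
    if token.startswith('-') and token.endswith('-'):
        return 'infix'             # -token-
    if token.startswith('-'):
        return 'postfix'           # -token
    if token.endswith('-'):
        return 'prefix'            # token-
    return 'root'                  # token (no markers)


def strip_markers(token: str) -> str:
    """Remove the leading and trailing '-' / '+' markers from a token."""
    return token.strip('-+')


# Token list copied from the tokenize example above.
tokens = ['до-', 'империал', '-ист-', '-ическ-', '+им']

labelled = [(strip_markers(t), classify_token(t)) for t in tokens]
roots = [strip_markers(t) for t in tokens if classify_token(t) == 'root']
word = ''.join(strip_markers(t) for t in tokens)

print(labelled)
print(roots)
print(word)
```

Checking for the `-token-` shape before the one-sided shapes matters, because an infix also satisfies the prefix and postfix tests on their own.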

  2. Get the roots of a word:

from morphological_tokenizer import Morpholog

morph = Morpholog()
morph.get_roots('картограф')
['карт', 'граф']


  3. Find words that share a root with the given word:

morph.root_words('город', print_root=True)

ROOT:  город
['выгородить',
 'выгородиться',
 'городить',
 
  ...
 
 'по-городски',
 'подгородить',
 'подгородный',
 'полгорода',
 'пригород',
 'пригородить',
 'пригородный',
 'разгородить',
 'разгородиться']

  4. Convert a verbal noun into a verb:

morph.noun2verb('оформление')

'оформить'

  5. Convert a participle into a verb:

morph.ptcp2verb('отправленный')

'отправить'

What about neologisms?

Many neologisms are absent from the dictionary, so Morpholog makes a guess about their structure:

morph.tokenize('чилить')

[['чил', '+ить']]

morph.noun2verb('хайп')

'хайпить'

morph.noun2verb('хардкор')

'хардкорить'
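
The two `tokenize` outputs shown in this README differ in shape: the dictionary word yields a flat list, while the neologism 'чилить' yields a list nested one level deep. Whether that reflects the library's actual behaviour or only an inconsistency in the examples is not established here. If downstream code should accept both, one option is a small normalizing helper. This is a sketch under that assumption; `normalize_tokens` is a hypothetical name, not a Morpholog function.

```python
from typing import List, Sequence, Union


def normalize_tokens(result: Sequence[Union[str, Sequence[str]]]) -> List[str]:
    """Flatten one level of nesting so both output shapes become a flat list."""
    flat: List[str] = []
    for item in result:
        if isinstance(item, (list, tuple)):
            flat.extend(item)      # nested shape, e.g. [['чил', '+ить']]
        else:
            flat.append(item)      # flat shape, e.g. ['до-', 'империал', ...]
    return flat


# Both literals are copied from the examples in this README.
print(normalize_tokens([['чил', '+ить']]))
print(normalize_tokens(['до-', 'империал', '-ист-', '-ическ-', '+им']))
```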
