After training parameters on a corpus and a dictionary, neither of which is particularly big, I try to generate the dictionary from the resulting model (the CRF parameter file) as follows:
F/seed$ mecab-dict-gen -m csj_f.mdl -o ../
csj_f.mdl is not a binary model. reopen it as text mode...
reading ./unk.def ... 36
reading ./csj_dic.csv ... 35243
emitting ../left-id.def/ ../right-id.def
emitting ../unk.def ... 36
emitting ../csj_dic.csv ... 35243
emitting matrix : 3% |#
but it fails: the process dies with nothing more than the message 'killed'.
The parameter file is 352M with 5 million lines, while the dictionary is 2M with 40 thousand items. Running mecab-dict-gen takes a long time, about 5 minutes for every 1% of progress, and frustratingly, at around 50%, i.e. after 8 hours, it gets killed.
First of all, I wonder what makes it take so long and whether there is a way to investigate or debug this. Perhaps the parameter file is unusually big? Also, if there is any recipe for avoiding this type of problem, please advise. If you need more info, please get back to me.
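One thing I could check on my side is whether memory is the problem. This is only a guess: a bare 'killed' is often what the shell prints when the kernel's out-of-memory killer ends a process, but nothing in the output above confirms that. Below is a minimal sketch for logging the resident memory of the running process. The helper names `rss_kb` and `watch_rss` are hypothetical (they are not part of MeCab), and the sketch assumes a Linux-like system with `/proc` or a `ps` that supports `-o rss=`.

```shell
# Print the resident memory (in kB) of process $1, or nothing if it is gone.
rss_kb() {
    if [ -r "/proc/$1/status" ]; then
        awk '/^VmRSS:/ { print $2 }' "/proc/$1/status"
    else
        ps -o rss= -p "$1" 2>/dev/null | tr -d ' '
    fi
}

# Log the memory of process $1 every $2 seconds (default 60)
# until no resident memory can be read for it any more.
watch_rss() {
    pid=$1
    interval=${2:-60}
    while rss=$(rss_kb "$pid"); [ -n "$rss" ]; do
        printf '%s %s kB\n' "$(date +%H:%M:%S)" "$rss"
        sleep "$interval"
    done
}

# Assumed workflow (not run here):
#   mecab-dict-gen -m csj_f.mdl -o ../ &
#   watch_rss $! 60 | tee rss.log
# After the process dies, the kernel log may say whether it was an OOM kill:
#   dmesg | grep -i -E 'out of memory|killed process'
```

If the logged figure climbs steadily toward the machine's total RAM before the kill, memory exhaustion would be the likely cause. If it stays flat, something else is ending the process.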