Cannot use trained model for validation or prediction #5

khusainovaidar · 2022-02-15T07:11:18Z

Hi, thank you for this repo! I'm trying to reproduce the results for a different language, so I'm using multilingual-bert fine-tuned on a dataset in my language.
Everything goes well during preprocessing and training; the results are comparable with those for English and French (97-99% for case and punctuation).

But when I try to use the trained model, it gives very poor results, even for sentences from the training dataset. It runs, and it sometimes adds capital letters or dots, but that is rare; mostly the model fails to restore case and punctuation.
Also, when I try to evaluate the model with the command from the README (I also tried it on the validation sets already used, for instance with the command python recasepunc.py eval bertugan_casepunc.24000 valid.case+punc.x valid.case+punc.y), it gives this error:

File "recasepunc.py", line 220, in batchify
x = x[:(len(x) // max_length) * max_length].reshape(-1, max_length)
TypeError: unhashable type: 'slice'

Sorry for raising two different problems in one issue, but I thought there might be one common mistake behind both.

khusainovaidar · 2022-02-15T08:00:44Z

It seems that line 388 should be this code (with the punctuation label added):

print(cased_token + punctuation_syms[punc_label], end='')

Maybe this is also the case for #3
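
To illustrate the change proposed above, here is a minimal sketch of appending a punctuation symbol to each cased token. The contents of punctuation_syms, the render helper, and the sample tokens and labels are assumptions made for illustration; the real table and printing loop live in recasepunc.py and may differ.

```python
# Hypothetical label-to-symbol table; the real punctuation_syms is defined
# in recasepunc.py and may use different symbols or ordering.
punctuation_syms = ["", ",", ".", "?"]


def render(tokens, punc_labels, with_punctuation=True):
    """Join cased tokens, optionally appending each token's punctuation symbol."""
    pieces = []
    for cased_token, punc_label in zip(tokens, punc_labels):
        suffix = punctuation_syms[punc_label] if with_punctuation else ""
        pieces.append(cased_token + suffix)
    return " ".join(pieces)


tokens = ["Hello", "world"]  # made-up, already-cased tokens
labels = [1, 2]              # made-up predicted punctuation labels

print(render(tokens, labels, with_punctuation=False))  # symbols dropped
print(render(tokens, labels))                          # symbols appended
```

If the printing step leaves out the punctuation symbol, the predicted labels never reach the output, which would look like the model rarely producing punctuation.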

benob · 2022-02-21T17:24:04Z

Thanks for reporting those issues.

For the first one, could you be more specific about the sequence of commands you ran and the data you used? Do you observe the same problem with the released models?

For the second one, the error is cryptic. It looks like x is not a tensor. Can you add a line with print(type(x)) or print(x) to check the type of x?

caiolibanio · 2022-08-09T12:37:35Z

> For the second one, the error is cryptic. It looks like x is not a tensor. Can you add a line with print(type(x)) or print(x) to check the type of x?

Hello! Thanks for this repository! I'm facing the same issue when trying to evaluate the model. The code reports the following error:
"...in batchify
x = x[:(len(x) // max_length) * max_length].reshape(-1, max_length)
TypeError: unhashable type: 'slice'"

When I print x, I find that it is of type <class 'dict'>. Any suggestions regarding this issue?
Thanks again!
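
For context, a minimal sketch of why a dict reaching the slicing step in batchify fails this way. The try_slice helper and the stand-in values are hypothetical; the actual contents of the model file are not shown in this thread. Slicing a dict asks Python to use a slice object as a key, which raises TypeError: unhashable type: 'slice' on Python versions before 3.12; on newer versions, where slice objects are hashable, the same lookup raises KeyError instead.

```python
def try_slice(x, max_length=4):
    """Apply the same slicing expression as batchify and report the outcome."""
    try:
        x[:(len(x) // max_length) * max_length]
    except (TypeError, KeyError) as exc:
        # TypeError before Python 3.12, KeyError from 3.12 onwards.
        return type(exc).__name__
    return "ok"


# Stand-ins for illustration only: a dict (like the x printed above)
# and a sequence that supports slicing.
dict_like = {"a": 1, "b": 2}
sequence_like = list(range(10))

print(type(dict_like), try_slice(dict_like))
print(type(sequence_like), try_slice(sequence_like))
```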

CoffeePerry · 2022-09-13T14:03:15Z

@khusainovaidar and @caiolibanio It could be an argument order problem; try rewriting the command like this:

python recasepunc.py eval valid.case+punc.x valid.case+punc.y bertugan_casepunc.24000

If that works, please tell me so I can fix the eval usage example in README.md.

caiolibanio · 2022-09-14T04:49:27Z

> @khusainovaidar and @caiolibanio It could be an argument order problem; try rewriting the command like this:
> python recasepunc.py eval valid.case+punc.x valid.case+punc.y bertugan_casepunc.24000
> If that works, please tell me so I can fix the eval usage example in README.md.

Yes! That works! It was an argument order problem, as you said. Thanks!
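
Since the swapped arguments only surfaced as a cryptic slicing error, an early type check could make the mistake easier to spot. The following is a sketch under assumptions, not code from recasepunc.py: ensure_sliceable is a hypothetical helper, and the idea is simply to reject a mapping before it reaches the slicing step.

```python
from collections.abc import Mapping


def ensure_sliceable(x, name):
    """Raise a readable error if x is a mapping rather than sliceable data."""
    if isinstance(x, Mapping):
        raise ValueError(
            f"{name} is a {type(x).__name__}, not sliceable data; "
            "check the order of the command-line arguments"
        )
    return x


# Made-up stand-ins: a list passes through, a dict is rejected early.
print(ensure_sliceable([1, 2, 3], "x"))
try:
    ensure_sliceable({"a": 1}, "x")
except ValueError as exc:
    print(exc)
```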

CoffeePerry mentioned this issue Sep 14, 2022

Update README.md #11

Merged
