No quantization of parameter with name ending in _Wt #747
Description
`marian-conv` does not work properly with models trained with `tied-embeddings` and `tied-embeddings-all` both set to false. This PR removes quantization of parameters whose names end in `_Wt`, the naming used for the output vocab matrices when `tied-embeddings` and `tied-embeddings-all` are both set to false (see the logic around `tiedParam_` in `mlp::Output::lazyConstruct()` in `src/layers/generic.cpp`, and, for transformer models, `DecoderTransformer::lazyCreateOutputLayer()` in `src/models/transformer.h`). If either `tied-embeddings` or `tied-embeddings-all` is set to true, the vocab parameter name is instead `Wemb` or ends in `_Wemb`; those parameters are already excluded from quantization.

This PR fixes a bug: issue #683.
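The exclusion rule described above can be sketched as a small name-suffix check. This is an illustrative helper, not the actual Marian source; the function name `shouldQuantize` is hypothetical.

```cpp
#include <cassert>
#include <string>

// Hypothetical sketch of the skip logic: a parameter is excluded from
// quantization if its name is "Wemb", ends in "_Wemb" (tied embeddings),
// or ends in "_Wt" (untied output vocab matrices, added by this PR).
static bool endsWith(const std::string& s, const std::string& suffix) {
  return s.size() >= suffix.size() &&
         s.compare(s.size() - suffix.size(), suffix.size(), suffix) == 0;
}

bool shouldQuantize(const std::string& paramName) {
  if(paramName == "Wemb" || endsWith(paramName, "_Wemb"))
    return false;  // vocab matrices under tied embeddings
  if(endsWith(paramName, "_Wt"))
    return false;  // untied output vocab matrices (this PR)
  return true;     // everything else is quantized as before
}
```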
List of changes:
Parameters with names ending in `_Wt` are now included in the logic that excludes parameters from quantization.
Added dependencies: none
How to test
```
marian-conv -f model.npz -t model.bin -g packed8avx512
echo 'test' | marian-decoder -b <beam-size> --cpu-threads 1 -m model.bin -v vocab.src.spm vocab.trg.spm
```
Without this PR, the error message is
`Error: Actual pathScore (-inf) is lower than INVALID_PATH_SCORE (-3.40282e+38)??`
when the beam size is 2 or 3, and
`Error: No hypotheses in n-best list??`
when the beam size is 1. With this PR, translation proceeds normally. Also compare decode results with and without this PR for models generated with
`marian-conv -f model.npz -t model.bin -g packed8avx2`
and
`marian-conv -f model.npz -t model.bin -g packed16`
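The before/after comparison can be scripted; a minimal sketch, assuming the decode output of each model build has been captured to a file (the `compare_decodes` helper and file names are illustrative, not part of Marian):

```shell
#!/bin/sh
# Compare decoder output from two model binaries, e.g. one converted
# before this PR and one after. Arguments are files holding the decode
# output of each build.
compare_decodes() {
  if diff -u "$1" "$2" > /dev/null; then
    echo "outputs match"
  else
    echo "outputs differ"
  fi
}
```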
Describe how you have tested your code, including OS and the cmake command.
Linux:
```
cmake -DUSE_SENTENCEPIECE:BOOL=ON -DCOMPILE_CPU:BOOL=ON -DUSE_FBGEMM:BOOL=ON ..
```
Checklist