This repository was archived by the owner on Aug 30, 2024. It is now read-only.

Commit 4824186

zhenwei-intel authored and VincyZhang committed
fix starcoder quantization bug (#159)
Signed-off-by: zhenwei-intel <[email protected]>
1 parent 75f3409 commit 4824186

1 file changed: +1 -2 lines changed


graph/models/starcoder/starcoder_utils.cpp

Lines changed: 1 addition & 2 deletions
@@ -199,8 +199,7 @@ class starcoder_quant_layer : public quant_layer_base {
   virtual quant_params_internal get_layer_config(std::string layername, std::vector<int64_t> ne,
                                                  ne_type type) override {
     bool quantize = layername.rfind("w") == layername.size() - 1;  // ends with 'weight'?
-    if (layername == "model/wte") quantize = true;
-    if (layername == "model/lm_head") {
+    if (layername == "model/wte") {
       // special layer process, can be loaded by config file
       return quant_params_internal();  // return q4_0 to cover the usage of getrow
     }
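
The effect of the change can be illustrated in isolation. The following is a minimal sketch, not the repository's code: `LayerDecision`, `ends_with`, and `decide_layer` are hypothetical names introduced here, the enum stands in for `quant_params_internal` (whose definition is not part of this diff), and the handling of layers other than "model/wte" is an assumption, since the rest of `get_layer_config` is cut off by the hunk. It only mirrors the branch order after the commit: "model/wte" takes the early-return path, and "model/lm_head" no longer does.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the three outcomes a layer can have in this sketch.
// The real function returns quant_params_internal, which is not shown in the diff.
enum class LayerDecision { kDefaultParams, kQuantize, kKeep };

// Returns true when `name` ends with `suffix`. Unlike comparing rfind() with
// size() - 1, this does not rely on unsigned wrap-around for an empty name.
inline bool ends_with(const std::string& name, const std::string& suffix) {
  return name.size() >= suffix.size() &&
         name.compare(name.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Mirrors the post-commit branch order: "model/wte" is checked first and gets
// default-constructed parameters (q4_0 according to the diff's comment).
// ASSUMPTION: every other layer is decided purely by the "w" suffix test; the
// code that consumes the `quantize` flag is not visible in the hunk.
inline LayerDecision decide_layer(const std::string& layername) {
  if (layername == "model/wte") return LayerDecision::kDefaultParams;
  return ends_with(layername, "w") ? LayerDecision::kQuantize : LayerDecision::kKeep;
}
```

Under these assumptions, `decide_layer("model/wte")` yields `kDefaultParams`, while `decide_layer("model/lm_head")` falls through to the suffix test and yields `kKeep`, because the name does not end in "w". Before the commit, the special-case branch was keyed on "model/lm_head" instead.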
