Describe the bug
We are presently unable to parse a huggingface model and call gen-settings via the optimum-cli flow.
Expected behaviors
We should be able to parse a huggingface model directly and then run gen-settings on it.
Steps to reproduce the bug
Download a model from huggingface via the optimum-cli call. We download the tiny-gpt2 model, which is fairly small (about 500 KB) and should be usable on most devices running ezkl.
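For reference, the export step looks roughly like the following (the exact model id, sshleifer/tiny-gpt2, is an assumption here; the export writes a model.onnx into the output directory, which is the file gen-settings is pointed at below):

optimum-cli export onnx --model sshleifer/tiny-gpt2 tiny_gpt2_onnx/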
ezkl gen-settings -M model.onnx
[E] [2024-12-10 07:15:32:345, ezkl] - [graph] [tract] Undetermined symbol in expression:sequence_length
Hardcoding the values in the ONNX file directly resulted in more problems, suggesting a deeper incompatibility with tract.
Examples of errors
[E] [2024-12-10 07:15:48:138, ezkl] - [graph] [tract] Can not broadcast 128 against 256
[E] [2024-12-10 07:18:30:675, ezkl] - [graph] [tract] Failed analyse for node #96 "/transformer/h.0/attn/Concat_3" InferenceConcat
Script used to perform surgery on onnx
import onnx
from onnx import helper
import numpy as np


def print_tensor_shapes(model, prefix=""):
    print(f"\n{prefix} Tensor shapes:")
    for input in model.graph.input:
        print(f"Input {input.name}: {[dim.dim_value if dim.HasField('dim_value') else dim.dim_param for dim in input.type.tensor_type.shape.dim]}")
    for output in model.graph.output:
        print(f"Output {output.name}: {[dim.dim_value if dim.HasField('dim_value') else dim.dim_param for dim in output.type.tensor_type.shape.dim]}")


def hardcode_sequence_lengths(model_path, past_sequence_length, sequence_length, batch_size, output_path):
    """Modify an ONNX model to replace past_sequence_length and sequence_length with fixed values.

    Args:
        model_path: Path to input ONNX model
        past_sequence_length: Integer value to replace past_sequence_length
        sequence_length: Integer value to replace sequence_length
        batch_size: Integer value for batch size
        output_path: Path to save modified model
    """
    # Load the model
    model = onnx.load(model_path)

    # Print original shapes
    print_tensor_shapes(model, "Before modification")

    # Update input shapes
    for input in model.graph.input:
        tensor_type = input.type.tensor_type
        # Handle different input types
        if input.name == 'input_ids':
            tensor_type.shape.dim[0].dim_value = batch_size
            tensor_type.shape.dim[1].dim_value = sequence_length
        elif input.name == 'attention_mask':
            tensor_type.shape.dim[0].dim_value = batch_size
            tensor_type.shape.dim[1].dim_value = sequence_length + past_sequence_length
        elif input.name == 'position_ids':
            tensor_type.shape.dim[0].dim_value = batch_size
            tensor_type.shape.dim[1].dim_value = sequence_length
        elif 'past_key_values' in input.name:
            tensor_type.shape.dim[0].dim_value = batch_size
            # dim[1] is num_heads (2)
            tensor_type.shape.dim[2].dim_value = past_sequence_length
            tensor_type.shape.dim[3].dim_value = 64  # head dimension

    # Update output shapes
    for output in model.graph.output:
        tensor_type = output.type.tensor_type
        if output.name == 'logits':
            tensor_type.shape.dim[0].dim_value = batch_size
            tensor_type.shape.dim[1].dim_value = sequence_length
            # dim[2] is vocab_size (50257)
        elif 'present' in output.name:
            tensor_type.shape.dim[0].dim_value = batch_size
            # dim[1] is num_heads (2)
            tensor_type.shape.dim[2].dim_value = sequence_length + past_sequence_length
            tensor_type.shape.dim[3].dim_value = 64  # head dimension

    # Print modified shapes
    print_tensor_shapes(model, "After modification")

    # Check model validity
    onnx.checker.check_model(model)

    # Save the modified model
    onnx.save(model, output_path)


if __name__ == "__main__":
    sequence_length = 128       # Your desired sequence length
    past_sequence_length = 128  # Your desired past sequence length
    batch_size = 1              # Your desired batch size
    hardcode_sequence_lengths(
        "model.onnx",
        past_sequence_length,
        sequence_length,
        batch_size,
        "surgery.onnx"
    )
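The patched model is then fed back into gen-settings, which is what produces the broadcast/analyse errors listed above:

ezkl gen-settings -M surgery.onnx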