Open
Description
Could you provide a script similar to inference-example.py that uses run_generation.py? That is, instead of command-line execution such as
python src/run_generation.py --model_type llama --model_name_or_path meta-llama/Llama-2-13b-chat-hf \
    --prefix "<s>[INST] <<SYS>>\n You are a helpful assistant. Answer with detailed responses according to the entire instruction or question. \n<</SYS>>\n\n Summarize the following book: " \
    --prompt example_inputs/harry_potter_full.txt \
    --suffix " [/INST]" --test_unlimiformer --fp16 --length 200 --layer_begin 16 \
    --index_devices 1 --datastore_device 1
I would like to load the model and run inference from within a Python script.
Thanks in advance!
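For context, the closest stopgap I can think of is to drive run_generation.py from Python by rebuilding the same argument list and executing the script in the current interpreter. The sketch below uses only the standard library; `build_argv` and `run_in_process` are hypothetical helper names of my own, and it assumes run_generation.py reads its options from `sys.argv` and is runnable as a top-level script, which I have not verified. Raw strings keep the `\n` sequences literal, as the shell passes them in the command above.

```python
import runpy
import sys


def build_argv(script, prompt_path, model="meta-llama/Llama-2-13b-chat-hf"):
    """Return the argv list equivalent to the command line in this issue."""
    # Raw strings: the shell passes "\n" through as a literal backslash + n.
    prefix = (
        r"<s>[INST] <<SYS>>\n You are a helpful assistant. "
        r"Answer with detailed responses according to the entire "
        r"instruction or question. \n<</SYS>>\n\n "
        r"Summarize the following book: "
    )
    return [
        script,
        "--model_type", "llama",
        "--model_name_or_path", model,
        "--prefix", prefix,
        "--prompt", prompt_path,
        "--suffix", " [/INST]",
        "--test_unlimiformer",
        "--fp16",
        "--length", "200",
        "--layer_begin", "16",
        "--index_devices", "1",
        "--datastore_device", "1",
    ]


def run_in_process(argv):
    """Run the script in this interpreter as if it were launched from the shell.

    Assumption: the script parses sys.argv and may guard its entry point
    with `if __name__ == "__main__"`, hence run_name="__main__".
    """
    saved_argv = sys.argv
    sys.argv = list(argv)
    try:
        return runpy.run_path(argv[0], run_name="__main__")
    finally:
        # Restore the caller's argv even if the script raises or exits.
        sys.argv = saved_argv
```

Calling `run_in_process(build_argv("src/run_generation.py", "example_inputs/harry_potter_full.txt"))` from the repository root should then behave like the command above. It still does not hand back a reusable model object, though, which is what a script in the style of inference-example.py would provide.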