I deployed an adapter using the LORA_ADAPTERS environment variable on a SageMaker endpoint. Everything is working fine except that the request does not fail when I provide a wrong adapter_id during inference; it silently returns a prediction from the base model instead.
My question: should the request fail because we are providing a wrong adapter_id?
For example:
text-generation-launcher \
    --model-id meta-llama/Meta-Llama-3-8B-Instruct \
    --lora-adapters DavidLanz/Llama3_tw_8B_btc_qlora
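On SageMaker, my deployment looks roughly like the sketch below; the container version, role, instance type, and token placeholder are illustrative, not copied from my actual setup.

import sagemaker
from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

role = sagemaker.get_execution_role()  # assumes a SageMaker execution role is available

# TGI (LLM) container image; the version here is illustrative
llm_image = get_huggingface_llm_image_uri("huggingface", version="2.2.0")

model = HuggingFaceModel(
    role=role,
    image_uri=llm_image,
    env={
        "HF_MODEL_ID": "meta-llama/Meta-Llama-3-8B-Instruct",
        "LORA_ADAPTERS": "DavidLanz/Llama3_tw_8B_btc_qlora",
        "HF_TOKEN": "<hf_token>",  # gated base model, token required
    },
)

predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.2xlarge",  # illustrative instance type
    container_startup_health_check_timeout=600,
)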
request/response without adapter
curl 127.0.0.1:3000/generate \
-X POST \
-H 'Content-Type: application/json' \
-d '{ "inputs": "What are three words to describe you?", "parameters": { "max_new_tokens": 20 }}'
# {"generated_text":" (e.g. funny, outgoing, creative)\nI would say that three words to describe me are"}
request/response with an adapter
curl 127.0.0.1:3000/generate \
-X POST \
-H 'Content-Type: application/json' \
-d '{ "inputs": "What are three words to describe you?", "parameters": { "max_new_tokens": 20, "adapter_id": "DavidLanz/Llama3_tw_8B_btc_qlora" }}'# {"generated_text":" A. Adventurous, B. Creative, C. Curious\nWhat are three words to describe"}%
request/response with invalid adapter
curl 127.0.0.1:3000/generate \
-X POST \
-H 'Content-Type: application/json' \
-d '{ "inputs": "What are three words to describe you?", "parameters": { "max_new_tokens": 20, "adapter_id": "random_adapter_id" }}'# {"generated_text":" (e.g. funny, outgoing, creative)\nI would say that three words to describe me are"}
From the responses, I'm guessing that it falls back to running inference with the base model.
Also, how can I opt in to more verbose logging, so that I can see when my adapter_id was invalid and inference fell back to the base model?
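In the meantime, here is a rough client-side check I use to tell whether an adapter_id was actually applied. The comparison heuristic and the endpoint URL are my own assumptions, not anything TGI documents: with greedy decoding (the default when do_sample is off), an adapter_id that gets silently ignored tends to reproduce the base model's output exactly, so I compare the two.

import logging
import requests

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("adapter-check")

ENDPOINT = "http://127.0.0.1:3000/generate"  # or the SageMaker endpoint URL

def generate(prompt: str, adapter_id: str | None = None) -> str:
    # Greedy decoding by default (no do_sample), so repeated calls are comparable.
    params = {"max_new_tokens": 20}
    if adapter_id is not None:
        params["adapter_id"] = adapter_id
    resp = requests.post(ENDPOINT, json={"inputs": prompt, "parameters": params}, timeout=60)
    resp.raise_for_status()
    return resp.json()["generated_text"]

def check_adapter(prompt: str, adapter_id: str) -> None:
    base_out = generate(prompt)
    adapter_out = generate(prompt, adapter_id=adapter_id)
    if base_out == adapter_out:
        # Heuristic: an unknown adapter_id that is silently ignored reproduces
        # the base model's greedy output exactly.
        log.warning("adapter_id %r appears to have been ignored (output matches base model)", adapter_id)
    else:
        log.info("adapter_id %r changed the output, so it was most likely applied", adapter_id)

check_adapter("What are three words to describe you?", "random_adapter_id")

This is only a heuristic, of course; a legitimate adapter could in principle produce the same greedy completion as the base model for a particular prompt.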