Model/Dataset Capabilities #13

acon96 · 2024-01-14T17:03:02Z

acon96
Jan 14, 2024
Maintainer

Hey please use this area to discuss future capabilities for the model to avoid creating a ton of noise in the issues section.

colino17 · 2024-02-13T01:51:56Z

colino17
Feb 13, 2024

There are a few service calls I'd like to see implemented, if possible:

todo.add_item which needs entity_id (ex: "todo.shopping_list") and "item" (ex: "milk") service data attributes
timer.start which needs entity_id and duration service data attributes
timer.cancel which needs entity_id service data attribute
vacuum.start which needs entity_id service data attribute
vacuum.pause which needs entity_id service data attribute
vacuum.stop which needs entity_id service data attribute
vacuum.return_to_base which needs entity_id service data attribute
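
As a rough illustration of the list above, the required service data for each call can be written as a small lookup table and checked before a call is issued. The table contents come straight from the list; the names `REQUIRED_FIELDS` and `missing_fields`, and the `timer.kitchen` entity, are hypothetical and not part of the project.

```python
# Required service data per service, taken from the list above.
REQUIRED_FIELDS = {
    "todo.add_item": ("entity_id", "item"),
    "timer.start": ("entity_id", "duration"),
    "timer.cancel": ("entity_id",),
    "vacuum.start": ("entity_id",),
    "vacuum.pause": ("entity_id",),
    "vacuum.stop": ("entity_id",),
    "vacuum.return_to_base": ("entity_id",),
}


def missing_fields(service: str, data: dict) -> list:
    """Return the required fields that are absent from `data`.

    Raises KeyError for a service that is not in the table.
    """
    return [field for field in REQUIRED_FIELDS[service] if field not in data]


if __name__ == "__main__":
    # "timer.kitchen" is a made-up entity used only for illustration.
    print(missing_fields("todo.add_item", {"entity_id": "todo.shopping_list", "item": "milk"}))
    print(missing_fields("timer.start", {"entity_id": "timer.kitchen"}))
```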

Out of curiosity, what is the basic process for implementing new capabilities? Is it a matter of generating new dataset "piles" for the additional services and retraining, or are additional code changes required elsewhere on your end?

3 replies

acon96 Feb 14, 2024
Maintainer Author

Those all seem very doable. I'll take a look at it for the next model.

Also the process to add more capabilities is to add more examples to the pile CSVs (specifically the device_actions, templated_requests, and responses piles). I need to write proper documentation about how they work still and will post back here when it's done.

colino17 Feb 17, 2024

I forked the repo and added more examples to those three files related to the domains mentioned above. I tried to adhere to the structure and schema of the previous examples. Added a pull request. Let me know if there are any issues or anything that I structured improperly.

acon96 Feb 23, 2024
Maintainer Author

Just published a new model that supports these: https://huggingface.co/acon96/Home-3B-v3-GGUF

tannisroot · 2024-03-02T16:30:58Z

tannisroot
Mar 2, 2024

I've noticed that at least fixt/home-3b-v3:q8_0 (with the default settings and prompt, apart from setting the prompt format to Zephyr, and with around 20 exposed entities) will either always or sometimes use the wrong services to control devices. For example:

calling non-existent cover.open and cover.close services instead of cover.open_cover and cover.close_cover for my "Bedroom Curtains" entity.
And while I don't think the button entities are listed as supported (which would be very nice!), they do work from time to time. But when they don't, it's again because the LLM hallucinates services.
Here it creates button.suspend and button.turn_on services for my button entities called "Suspend my PC" and "Turn On my PC":

6 replies

tannisroot Mar 3, 2024

As per the documentation, I also set the temperature to 0.1 through a modelfile, like this:

FROM fixt/home-3b-v3:q8_0
PARAMETER num_ctx 2048
PARAMETER temperature 0.1
PARAMETER top_k 40
PARAMETER top_p 0.95
PARAMETER stop "<|endoftext|>"
PARAMETER stop "<|im_end|>"

LLAMA conversation is set to use the model defined by this modelfile, not the source model. In Home Assistant, the LLAMA conversation config for this model also has the temperature set to 0.1 (which is the default).
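
One way to double-check that the sampling settings actually reach the model is to pass them explicitly with a request: Ollama's REST API accepts a per-request `options` object whose keys mirror the Modelfile `PARAMETER` lines. The sketch below only builds such a request body for the `/api/chat` endpoint and does not send it. The function name `build_chat_request` and the model name `my-home-3b` are assumptions for illustration, and this is not necessarily how the integration talks to Ollama.

```python
import json

# Sampling options mirroring the PARAMETER lines of the Modelfile above.
OPTIONS = {
    "num_ctx": 2048,
    "temperature": 0.1,
    "top_k": 40,
    "top_p": 0.95,
    "stop": ["<|endoftext|>", "<|im_end|>"],
}


def build_chat_request(model: str, system_prompt: str, user_text: str) -> dict:
    """Build a request body for Ollama's /api/chat endpoint (not sent here)."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_text},
        ],
        "options": dict(OPTIONS),  # copy so callers cannot mutate the defaults
        "stream": False,
    }


if __name__ == "__main__":
    # "my-home-3b" is a hypothetical model name created from the Modelfile.
    body = build_chat_request("my-home-3b", "You are 'Al'...", "Close the bedroom curtains")
    print(json.dumps(body, indent=2))
```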

acon96 Mar 3, 2024
Maintainer Author

OK cool. Thanks for ruling that out.

I think for the next model I want to experiment more with making sure the model is pulling the service name from the context and not just having enough examples in the fine-tuning dataset that it is able to convert "turn on the lights" to light.turn_on. I think that might be what is happening, because it would explain why it isn't able to generalize to grabbing the correct service names for the button entity type without having examples of it in the fine-tuning dataset. I'm still pretty new to training models so I think there is a lot of room for improvement on that front.
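
Until the model reliably pulls service names from the context, a guard outside the model can at least catch hallucinated services before they are called. The following is a minimal sketch, assuming the list of available services is known to the caller; `check_service` is a hypothetical helper, not something the integration provides. It uses the standard-library `difflib` to suggest the closest real service name.

```python
import difflib
from typing import List, Optional, Tuple


def check_service(service: str, available: List[str]) -> Tuple[bool, Optional[str]]:
    """Return (is_valid, suggestion).

    `suggestion` is the closest available service name when `service`
    is not valid, or None when nothing is similar enough.
    """
    if service in available:
        return True, None
    close = difflib.get_close_matches(service, available, n=1, cutoff=0.6)
    return False, (close[0] if close else None)


if __name__ == "__main__":
    services = ["cover.open_cover", "cover.close_cover", "light.turn_on", "light.turn_off"]
    print(check_service("cover.open", services))
    print(check_service("cover.close_cover", services))
```

Whether to auto-correct to the suggestion or simply refuse the call is a design choice; refusing is the safer default because a near match is not guaranteed to be the intended service.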

tannisroot Mar 3, 2024

For someone who is new to training models, you still managed to create by far the most consistently usable LLM conversation stack for Home Assistant! Personally I couldn't get any other non-fine tuned solutions to be a reliable conversation agent, and those models were far bigger.

xrami247 Mar 19, 2024

Just want to chime in, having pretty much the exact same issue with my curtains.

I have a temporary workaround that works sometimes: I created scenes that handle the behaviour and call those in chat. For example, instead of "blinds close", I say "privacy" (my scene).

It works great for my lights but it would be awesome if this was fixed as well.

Great work so far!

xrami247 Mar 19, 2024

Update: I made a small modification to my prompt and now it's opening/closing my blinds successfully 100% of the time!

My modified prompt (just the last line is new):

You are 'Al', a helpful AI Assistant that controls the devices in a house. Complete the following task ask instructed with the information provided only.
Services: {{ services }}
Devices:
{{ devices }}

Only use devices and services listed above.

Another important note: it looks like this fix only works for the q4_k_m version of home-3b-v3; with anything larger I still have the issue. It's almost like it becomes too smart for its own good.

Edit: q2_k also struggles even with the fix

Edit: Looks like after Ollama restarted it's back to being broken. If someone else can test the prompt, let me know if it makes any difference for you.

tannisroot · 2024-03-02T16:38:41Z

tannisroot
Mar 2, 2024

It would also be nice if the model could handle cases where speech-to-text fails and transcribes only the command, not the device to control. Right now, if you tell it just "Turn On", "Start", "Close", etc., it will try to issue the service to a random entity, sometimes a real one and sometimes not.

3 replies

acon96 Mar 3, 2024
Maintainer Author

Yes, one of the things on the TODO list is to have the model properly respond to requests that don't make sense, such as asking about devices that don't exist or trying to call a service that doesn't exist.
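
Until the model handles this itself, a crude check before the request reaches the model can reject bare commands. This is a sketch under the assumption that the friendly names of the exposed devices are available as plain strings; `mentions_known_device` and `guard` are hypothetical names.

```python
from typing import List, Optional


def mentions_known_device(utterance: str, device_names: List[str]) -> bool:
    """True if any exposed device name appears in the transcribed text."""
    text = utterance.casefold()
    return any(name.casefold() in text for name in device_names)


def guard(utterance: str, device_names: List[str]) -> Optional[str]:
    """Return an error message for bare commands, or None to let the request through."""
    if not mentions_known_device(utterance, device_names):
        return "No device specified."
    return None


if __name__ == "__main__":
    names = ["Bedroom Curtains", "Kitchen Lights"]
    print(guard("Turn On", names))
    print(guard("turn on the kitchen lights", names))
```

This is only a heuristic: it would also reject follow-ups such as "turn it off", which rely on conversation history rather than naming a device, so it cannot replace training the model to refuse such requests.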

tannisroot Mar 6, 2024

Have you considered fine-tuning the new Gemma Instruct model as a base for the v4? During my experiments with it, even without fine-tuning it seemed quite a bit more capable than phi or StableLM; with just a custom prompt it managed to generate responses that looked like the ones your fine-tuned model generates. It also wouldn't execute the request if I just asked it to "Turn on", although that's probably because there was no fine-tuning.

acon96 Mar 6, 2024
Maintainer Author

I tried but had some issues with the model code acting weird and not properly adding new tokens to the dataset. I'll probably try again soon since I have seen claims that it is an even better multi-language base model than anything else in that size range.

fhteagle · 2024-04-14T01:37:22Z

fhteagle
Apr 14, 2024

Just getting my feet wet with this, but it appears that the Home-3B model is not good at comprehending calendar entities and associated items. I have had exactly 0% success with around a dozen variations/iterations of each of the below (and received plenty of false "I did what you asked" responses, to boot!):

Asking what or when the next calendar entry is
Asking details about a specific upcoming event: it seems to find the event, but often reports wrong details, such as end time transposed with start time, hallucinated end time, "all day" events that start in a week but end today, etc.
List of upcoming events
Count of upcoming events
Add an item to the calendar (tries to craft a service response but it's never once been successful)
Updating an existing calendar item
Deleting an existing calendar item

To be fair, GPT-4 also fails the same tests spectacularly.

Also, exposing zero calendar entities to the assistant results in hallucinations, when it should result in "I have no information about that".

3 replies

tannisroot Apr 14, 2024

I don't believe it was fine-tuned on calendar stuff

fhteagle Apr 14, 2024

I would absolutely believe that. Consider the above as a starting list of skills to train for "future capabilities" regarding calendar entities, then.

acon96 Apr 14, 2024
Maintainer Author

Can confirm calendar is NOT in the training dataset. Will add that to the list of entities to train for the next model. I'm currently working on proper multi-language support but can add some basic calendar examples as well. It will just be for adding events to calendar since doing proper calendar support will require more work to look up events using the get_events service.
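
For the read-only questions (next event, list and count of upcoming events), the lookup itself can be done deterministically once events have been fetched, leaving the model to only phrase the answer. The sketch below assumes the events arrive as a list of dicts with `summary`, `start`, and `end` keys holding ISO 8601 timestamps with a UTC offset, which is roughly the kind of data an event lookup returns; the exact response shape and the helper name `upcoming` are assumptions.

```python
from datetime import datetime, timezone


def upcoming(events, now):
    """Return the events that start at or after `now`, soonest first.

    Assumes every "start" value is an ISO 8601 string with a UTC offset,
    so it can be compared against a timezone-aware `now`.
    """
    parsed = [(datetime.fromisoformat(event["start"]), event) for event in events]
    parsed.sort(key=lambda pair: pair[0])
    return [event for start, event in parsed if start >= now]


if __name__ == "__main__":
    # Made-up events, used only for illustration.
    events = [
        {"summary": "Dentist", "start": "2024-04-16T09:00:00+00:00", "end": "2024-04-16T10:00:00+00:00"},
        {"summary": "Standup", "start": "2024-04-15T08:30:00+00:00", "end": "2024-04-15T08:45:00+00:00"},
    ]
    now = datetime(2024, 4, 14, 12, 0, tzinfo=timezone.utc)
    todo = upcoming(events, now)
    print(len(todo), "upcoming; next:", todo[0]["summary"] if todo else "none")
```

An empty result maps naturally onto the "I have no information about that" answer requested above.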

fhteagle · 2024-04-14T15:08:47Z

fhteagle
Apr 14, 2024

Another currently missing skill / future capability to train on: being able to report the currently selected item, and set the correct item for an input_select.

Home3B appears to intermittently be able to use the input_select.select_next service call, maybe 10% of the time, but I have had 0% success picking a specific option. Additionally, the model remembers setting the input_select to something (even though the service call failed in reality), and will then sometimes incorrectly report that as the value of that item in future query/response pairs.

A better behavior for this, and really for all interactions, would be to recheck the current value of the entity after the model thinks it has accomplished something ("check its own work before claiming it's done"), but I have no idea if the conversation pipeline even allows for this...?
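
The "check its own work" idea can be sketched independently of any particular pipeline. In the example below, `call_service` and `get_state` are hypothetical callables standing in for whatever the integration would use to execute a call and read an entity's state; the thread does not confirm that the conversation pipeline exposes such hooks.

```python
from typing import Callable


def call_and_verify(
    call_service: Callable[[], None],
    get_state: Callable[[], str],
    expected: str,
) -> bool:
    """Run the service call, then confirm the entity reached the expected state."""
    call_service()
    return get_state() == expected


if __name__ == "__main__":
    # In-memory stand-in for the state machine; the entity name is made up.
    states = {"input_select.house_mode": "Home"}

    def broken_call() -> None:
        pass  # simulates a service call that silently fails

    ok = call_and_verify(broken_call, lambda: states["input_select.house_mode"], "Away")
    print("Done." if ok else "Sorry, that did not work.")
```

In a real setup the state may update with some delay, so a short wait or a retry before the comparison would likely be needed.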

0 replies

tannisroot · 2024-04-19T03:54:05Z

tannisroot
Apr 19, 2024

I was playing with llama3:instruct recently, and I have to say I am very impressed with the 8b model.
I'm running it through Ollama, and if you tweak the system prompt a bit to provide it with better examples, it controls devices and answers questions without a hitch.
Here's an example of my conversation with it, and in every case it responded like I want it to, despite making some mistakes (changed the state when I wanted to, responded with the state when asked about it without executing a command). Though at the end, for some reason, the beginning of the response contains <|im_start|>assistant.
I really hope the next finetuned model will be based off llama3.

Hey are kitchen lights on?
The kitchen lights are off.
What about lights in the bedroom?
The bedroom lights are off.
Ok can you turn both of those on?
Turning on the kitchen lights. Turning on the bedroom lights.
Ok then them back off.
Turning off the kitchen lights.Turning off the bedroom lights.
Can you tell me if bedroom TV is on?
The bedroom TV is off.
Can you turn it on?
Turning on the bedroom TV.
Turn it off now.
<|im_start|>assistantTurning off the bedroom TV.
Set boiler to 24 degrees please.
<|im_start|>assistantSetting the temperature.
What's the temperature of boiler?
The boiler is set to 24C.
Can you set it to 24 now?
Failed to parse call from '{"to_say": "Temperature already set.", "service": "", "target_device": "climate.boiler"}'!
Sorry, can you set it to 23?
<|im_start|>assistantSetting the temperature.
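
Two of the glitches in the transcript above, the stray `<|im_start|>assistant` prefix and the parse failure on an empty `"service"`, could in principle be handled when the reply is parsed. The following is a minimal sketch, assuming the raw reply is either plain text or a single JSON object in the format shown in the error message; `parse_reply` is a hypothetical helper and not the integration's actual parser.

```python
import json

PREFIX = "<|im_start|>assistant"


def parse_reply(raw: str):
    """Return (to_say, call); `call` is None when there is nothing to execute."""
    text = raw.strip()
    if text.startswith(PREFIX):
        # Drop the leaked chat-template marker.
        text = text[len(PREFIX):].lstrip()
    try:
        payload = json.loads(text)
    except json.JSONDecodeError:
        return text, None  # plain-text answer, e.g. a state report
    if not isinstance(payload, dict):
        return text, None
    to_say = payload.pop("to_say", "")
    service = payload.pop("service", "")
    if not service:
        return to_say, None  # speech only, e.g. "Temperature already set."
    return to_say, {"service": service, **payload}


if __name__ == "__main__":
    print(parse_reply('{"to_say": "Temperature already set.", "service": "", "target_device": "climate.boiler"}'))
    print(parse_reply("<|im_start|>assistantThe bedroom TV is off."))
```

Treating an empty service as "say this, do nothing" matches what the model appears to intend in the "Temperature already set." case.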

My custom system prompt:

You are 'Al', a helpful AI Assistant that controls the devices in a house. Complete the following task ask instructed with the information provided only.
Services: {{ services }}
Devices:
{{ devices }}

Respond to the following user instruction by responding in the same format as the following examples:
{"to_say": "Turning on the light.", "service": "light.turn_on", "target_device": "light.example"}

{"to_say": "Turning on the switch.", "service": "switch.turn_on", "target_device": "switch.example"}

{"to_say": "Setting the temperature.", "service": "climate.set_temperature", "target_device": "climate.example", "temperature": 22}

{"to_say": "Setting the mode.", "service": "fan.set_preset_mode", "target_device": "fan.example", "preset_mode": "Sleep"}

{"to_say": "Activating the script.", "service": "script.example"}

If asked a question about the state of a device, for example:
"Is example off?"

respond like this:
The example is off.

If asked to do something without referencing a device, for example:
"Turn on"

respond like this and ONLY like this:

No device specified.

0 replies
