Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add gladys assistant #1044

Draft
wants to merge 6 commits into
base: master
Choose a base branch
from
Draft

Conversation

cicoub13
Copy link

@cicoub13 cicoub13 commented Apr 12, 2024

App Submission

Gladys Assistant

https://gladysassistant.com/

https://github.com/gladysassistant/gladys

256x256 SVG icon

https://gladysassistant.com/fr/img/logo.svg

Gallery images

https://github.com/getumbrel/umbrel-apps/files/14981508/umbrel_gladys_screenshot_design.zip

I have tested my app on:

  • umbrelOS on a Raspberry Pi
  • umbrelOS on an Umbrel Home
  • umbrelOS on Linux VM - Umbrel OS 1.0.4

@cicoub13 cicoub13 marked this pull request as draft April 12, 2024 09:53
@Pierre-Gilles
Copy link

@cicoub13 Here are 3 screenshots (1440/900px as you asked)

umbrel_gladys_screenshot_design.zip

I've put all screenshots in English as Umbrel community is mainly international

@cicoub13 cicoub13 marked this pull request as ready for review April 15, 2024 16:08
@nmfretz
Copy link
Contributor

nmfretz commented Apr 23, 2024

Thanks for submitting Gladys Assistant @cicoub13! Stoked to try this out. I'll be able to review late this week. Thanks for your patience.

@cicoub13
Copy link
Author

cicoub13 commented Apr 24, 2024

I wanted to test in real device (RPI4 with SSD + Umbrel OS 1.1).

I'm facing an issue where dockers are not able to contact localhost
ping localhost from Gladys container returns ping: bad address 'localhost'

@highghlow
Copy link
Contributor

I have tried to ttest the app on my Umbrel 0.5.4 device and I get this error during installation:

The Compose file '/home/umbrel/umbrel/app-data/gladys-assistant/docker-compose.yml' is invalid because:
Unsupported config option for services.web: 'cgroup'

I guess it is an option that was added in docker-compose v2, which is only present in Umbrel 1.0.

@cicoub13
Copy link
Author

I guess it is an option that was added in docker-compose v2, which is only present in Umbrel 1.0.

I've targeted only Umbrel OS 1.X (as it's the default documentation). Do I need to make it compatible with previous versions ?

@highghlow
Copy link
Contributor

I've targeted only Umbrel OS 1.X (as it's the default documentation). Do I need to make it compatible with previous versions ?

Since Umbrel v1 is now available on custom hardware installs, no.


services:
web:
image: gladysassistant/gladys:v4
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
image: gladysassistant/gladys:v4
image: gladysassistant/gladys:v4@sha256:f694d49bf57426cfc37590eb71a19b0e183cd3e8cbae966ef093bb3d199d318b

It would be better to add a digest, so that the exact Dockerfile contents are pinned, not just the tag name.

Copy link
Contributor

@nmfretz nmfretz May 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @highghlow 🙏.

@cicoub13 the main reason to do this is that tags can be "overwritten" on docker hub by just pushing another image to the same tag (e.g., v4). If for some reason a newer image that breaks something gets accidentally or maliciously pushed to the same tag, then Docker will download this new image when new users install the app or when existing users restart their app.

Docker gives the highest priority to the digest, which locks-in the exact image referenced by the digest. So even if the v4 tag changes at some point, Docker will still download the image associated with the exact digest we have tested here.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the recommendation is to fix the image with a hash, then do Pull Requests to update the image, I will follow it.

Copy link
Contributor

@nmfretz nmfretz May 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @cicoub13. I'll add this for you and do this one as well to make things easier: #1044 (comment)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

multi-arch image digest added here: c20b585

An easy way to grab the digest when you know your image is multi-architecture is to run

docker pull <image>:<tag>

and you will see the multi-arch digest in the output. For example:

$ sudo docker pull gladysassistant/gladys:v4
v4: Pulling from gladysassistant/gladys
...
Digest: sha256:f694d49bf57426cfc37590eb71a19b0e183cd3e8cbae966ef093bb3d199d318b
Status: Downloaded newer image for gladysassistant/gladys:v4
docker.io/gladysassistant/gladys:v4

network_mode: host
cgroup: host
volumes:
- ${APP_DATA_DIR}/gladysassistant:/var/lib/gladysassistant
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- ${APP_DATA_DIR}/gladysassistant:/var/lib/gladysassistant
- ${APP_DATA_DIR}/data/gladysassistant:/var/lib/gladysassistant

This is where app data is usually located

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

${APP_DATA_DIR} points to the gladys-assistant parent directory shown here:
image

The data directory under gladysassistant is at ${APP_DATA_DIR}/data/ as @highghlow shows.

If we want to bind /var/lib/gladysassistant to the host filesystem at ${APP_DATA_DIR}/data/gladysassistant (which I think is a good idea), then you will need to add a gladysassistant directory under the data directory and add a .gitkeep file to it.

This will future proof the data directory in the event that you add other containers to this app in the future, or you want to bind other directories in the future.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me check and fix that

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can do this and push a commit @cicoub13

Copy link
Contributor

@nmfretz nmfretz May 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added in 3669552 and 9b4af0a

restart: on-failure
stop_grace_period: 1m
privileged: true
network_mode: host
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider specifying only the ports you need to be accessible or using app-auth

Copy link
Contributor

@nmfretz nmfretz May 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I think for now the Gladys container will need to run in host network mode. This is because Gladys needs to have network device discovery capabilities and access to multicast UDP traffic (e.g., for UPnP) is that correct @cicoub13?

We have some ideas on how to get this to work with bridge networks in the future, but there isn't really a way to do this on umbrelOS right now.

Copy link
Author

@cicoub13 cicoub13 May 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, exactly. We need host for because we are doing network discovery in local network https://demo.gladysassistant.com/dashboard/integration/device/lan-manager/config
Home Assistant is working the same way. If you find a way to work differently, we could adapt yes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good 👌

image: gladysassistant/gladys:v4
restart: on-failure
stop_grace_period: 1m
privileged: true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you need it to be privileged?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cicoub13 do you happen to know if all of the following are required for Gladys functionality:
priveleged: true
cgroup: host
- /var/run/docker.sock:/var/run/docker.sock

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In particular, I'm wondering how important binding the Docker socket is:
/var/run/docker.sock:/var/run/docker.sock

This gives the Gladys container complete control of the Docker daemon, which we can't allow because Gladys can then issue Docker commands to the host's Docker daemon, where other app containers are running.

Copy link
Author

@cicoub13 cicoub13 May 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the Docker daemon binding, it's needed from our side to create, restart other containers when a user adds an integration like Zigbee2Mqtt / MQTT / Node-Red.
Gladys will automatically pull, configure and start a new container
We try to be as smooth for the users but still managed integrations cleanly by using docker each time it's needed.

For the cgroup and privileged, let me confirm it with the community/developers.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We use the cgroup in Gladys to get Gladys current containerId in Docker ( https://github.com/GladysAssistant/Gladys/blob/master/server/lib/system/system.getGladysContainerId.js#L32 )

We use this to check if the container is in the correct state (network_mode = host for example)

privileged is used for some integration like Bluetooth to be able to scan for Bluetooth devices

Home Assistant Umbrel integration uses it too: https://github.com/getumbrel/umbrel-apps/blob/master/home-assistant/docker-compose.yml#L8

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks very much for the explanations @cicoub13 and @Pierre-Gilles. Let me run this by Luke to figure out the best path forward.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cicoub13 @Pierre-Gilles - understood regarding cgroup and priveleged: true. These are fine to keep to maintain app functionality. Thanks for confirming their necessity.

I have added some thoughts on the docker socket mount here: #1044 (comment)

volumes:
- ${APP_DATA_DIR}/gladysassistant:/var/lib/gladysassistant
- /var/run/docker.sock:/var/run/docker.sock
- /dev:/dev
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is really weird.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Binding the entire /dev directory is something we probably have to do here so that Gladys Assistant can communicate with any USB that may need to be plugged into the user's device (e.g., a USB dongle for zigbee support).

We can't know in advance what devices a user does or does not have plugged in, so we need to mount the entire /dev directory. We're definitely open to suggestions on how to get around this though, as the goal should always be to restrict a container's access and permissions as much as possible.

This device issue is pretty tricky though. For example, let's just assume that the only device a user ever needs to plug in to work with Gladys gets connected as /dev/ttyACM0. The logical solution in that case would seem to be to use this bind mount:
/dev/ttyACM0:/dev/ttyACM0

But there is a huge problem with this in Docker, which is that if the user does not have a device plugged in at /dev/ttyACM0 then Docker will error out and exit when trying to bring up the container because the /dev/ttyACM0 file doesn't exist. This means that users who install Gladys without a usb installed at that file will get a broken install (Gladys won't even start), and any user that starts up Gladys after removing that USB device will also have a broken installation.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exactly. We need to connect to existing or new plugged usb devices and there is no other way to do it.

Home Assistant is doing the same here https://github.com/getumbrel/umbrel-apps/blob/master/home-assistant/docker-compose.yml#L11

@nmfretz
Copy link
Contributor

nmfretz commented May 20, 2024

Taking a look at this now @cicoub13.
And thanks for diving into this @highghlow, much appreciated.

@nmfretz
Copy link
Contributor

nmfretz commented May 27, 2024

@cicoub13 @Pierre-Gilles, thanks very much for outlining the requirement for binding the docker daemon socket:

For the Docker daemon binding, it's needed from our side to create, restart other containers when a user adds an integration like Zigbee2Mqtt / MQTT / Node-Red.
Gladys will automatically pull, configure and start a new container
We try to be as smooth for the users but still managed integrations cleanly by using docker each time it's needed.

We looked into this, and binding the host's docker daemon socket is incompatible with umbrelOS's current architecture, unfortunately. Even if we made the security tradeoff for better app compatibility, mounting the host Docker socket will end up breaking things.

Here's the conflicting code:

We essentially do a Docker cleanup on boot so that Docker is always in a clean state initially, even if something didn't shut down properly in the last boot.

  1. https://github.com/getumbrel/umbrel/blob/5a042334e36e65128058f512c29e7af422f4be0c/packages/umbreld/source/index.ts#L147-L148
  2. https://github.com/getumbrel/umbrel/blob/5a042334e36e65128058f512c29e7af422f4be0c/packages/umbreld/source/modules/apps/apps.ts#L28-L52

If an app has control of the main Docker daemon and can do whatever it wants, then any containers that it sets to run on boot will trigger the logic in the code linked above every single boot. This will have two consequences:

  1. any containers that were brought up will be destroyed
  2. the entire umbrelOS startup will be delayed while cleanup is happening.

There are a few potential workarounds we can try to get around this if you'd consider trying them:

Option 1:

Add a Docker in Docker (DinD) container alongside Gladys, and have Gladys mount the DinD docker socket. This is effectively what we've done with the Portainer app here: https://github.com/getumbrel/umbrel-apps/blob/master/portainer/docker-compose.yml

Option 2:

Gladys could add the existing Portainer app from the app store as a dependency (or a new Docker app instead) and use that socket

Option 3:

Install Docker inside the Gladys image

Let me know if any of these sound okay to you and I can help orchestrate.

@nmfretz nmfretz marked this pull request as draft November 8, 2024 00:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants