Platform Request: Hetzner #1324

Open
1 of 11 tasks
aral opened this issue Oct 19, 2022 · 48 comments

Comments

@aral

aral commented Oct 19, 2022

In order to implement support for a new cloud platform in Fedora CoreOS, we need to know several things about the platform. Please try to answer as many questions as you can.

Hetzner: https://www.hetzner.com/cloud

  • Why is the platform important? Who uses it?

“Hetzner Online GmbH is a company and data center operator based in Gunzenhausen, Germany.” – https://en.wikipedia.org/wiki/Hetzner

According to Enlyft, over 180,000 companies use their services.

In 2021, they apparently had over 200,000 servers in just one of their data centres (https://www.youtube.com/watch?v=5eo8nz_niiM).

Personally, I find that they offer the fastest instance creation times I’ve seen and an excellent API, and their prices are among the lowest available. All of this makes them perfect for the small web. Unfortunately, since they don’t support CoreOS, I’ll likely have to build the small web stuff on Ubuntu to start with, which is less than ideal, as I’d love for the instances to be auto-updating with a minimum of maintenance required. (The closest thing to that currently is Ubuntu LTS with automatic security updates enabled, but that doesn’t, of course, cover major version updates.)

  • What is the official name of the platform? Is there a short name that's commonly used in client API implementations?

Hetzner.

  • How can the OS retrieve instance userdata? What happens if no userdata is provided?

It currently uses cloud-init, as far as I know (at least for Ubuntu instances). If no userdata is provided, no customisation occurs.

  • Does the platform provide a way to configure SSH keys for the instance? How can the OS retrieve them? What happens if none are provided?

Yes (through their interface/API). If none are provided, it sets a root password and emails it to the person.

  • How can the OS retrieve network configuration? Is DHCP sufficient, or is there some other network-accessible metadata service?

I don’t know, sorry.

  • In particular, how can the OS retrieve the system hostname?

All regular hostname commands appear to work. Not sure if that’s what you’re asking though.

  • Does the platform require the OS to have a specific console configuration?

I don’t know, sorry.

  • Is there a mechanism for the OS to report to the platform that it has successfully booted? Is the mechanism required?

Not sure.

  • Does the platform have an agent that runs inside the instance? Is it required? What does it do? What language is it implemented in, and where is the source code repository?

I don’t believe so. I haven’t encountered it in the instances I’ve set up, at least.

  • How are VM images uploaded to the platform and published to other users? Is there an API? What disk image format is expected?

Online interface + API. I haven’t used this personally.

  • Are there any other platform quirks we should know about?

Likely, but I haven’t encountered any in my use of their services :)

@bgilbert
Contributor

Thanks for filing this. Note that we'll need these answers from the OS perspective, not the user perspective. E.g., how does the OS fetch userdata? How does it learn its own hostname?

@jlebon removed the meeting label (topics for meetings) Oct 19, 2022
@jlebon
Member

jlebon commented Oct 19, 2022

We discussed this in today's community meeting:

13:10:54 < jlebon> #agreed we would like to add support for Hetzner. we are looking
                   for volunteers to pick it up and push it forward.

@aral
Author

aral commented Oct 19, 2022

Happy to hear it. Would you like me to ask around to see if I can find some contacts there or do you already have folks you can talk to?

@bgilbert
Contributor

Yes, that would be helpful, thanks!

@aral
Author

aral commented Oct 21, 2022

Quick update: I’ve been in touch with an engineer at Hetzner:

“I passed it to the responsible people but they are on vacation for the next two weeks … I will force it to be answered then :)”

@der-On

der-On commented Oct 25, 2022

At least for resetting passwords, a "QEMU Guest Agent" is running on the OS.

@bgilbert
Contributor

@der-On That might be the standard QEMU one, discussed in #74. We generally avoid shipping third-party agents (and reimplement pieces of them when necessary to avoid it), and don't currently ship the QEMU one.

@asciiprod

So, how could we help to get CoreOS running on Hetzner Cloud?

@bgilbert
Contributor

bgilbert commented Nov 8, 2022

@asciiprod Thanks for joining in! We could use some help answering the questions at the top of this issue. We have some answers already in the old Container Linux PRs, but it'd be good to make sure our understanding is up to date.

@asciiprod

asciiprod commented Nov 8, 2022

Sure, I'll try to answer them as well as I can:

  • What is the official name of the platform? Is there a short name that's commonly used in client API implementations?
    That's already a tricky one. It is the Cloud product of the company Hetzner Online GmbH. It is usually just called Hetzner. That is also the name used for the cloud-init datasource. Other implementations like terraform or ansible use hcloud. Since ignition is more like cloud-init, I guess it should be hetzner.

  • How can the OS retrieve instance userdata? What happens if no userdata is provided?
    We provide a metadata/userdata endpoint at http://169.254.169.254/hetzner/v1. Userdata is optional; metadata is not. If the endpoint were absent, cloud-init would still run to a certain degree using DMI information, but it would fail to retrieve essential data such as SSH keys.

  • Does the platform provide a way to configure SSH keys for the instance? How can the OS retrieve them? What happens if none are provided?
    Yes, via metadata endpoint. If no SSH key is selected at instance creation, a password hash is provided. If neither is retrieved/configured, the instance has no fallback login password.

  • How can the OS retrieve network configuration? Is DHCP sufficient, or is there some other network-accessible metadata service?
    IPv4 configuration is provided via DHCP, IPv6 currently via metadata service only.

  • In particular, how can the OS retrieve the system hostname?
    via metadata service.

  • Is there a mechanism for the OS to report to the platform that it has successfully booted? Is the mechanism required?
    No, there isn't one, so it isn't required.

  • Does the platform have an agent that runs inside the instance? Is it required? What does it do? What language is it implemented in, and where is the source code repository?
    There is the qemu guest agent, which is used to reset passwords. There is also a package called hc-utils (https://github.com/hetznercloud/hc-utils). This contains bash scripts/systemd services/udev rules to automatically mount additional blockstorage volumes or start a DHCP client for new or unconfigured network interfaces. These are purely for ease of use and not required.

  • How are VM images uploaded to the platform and published to other users? Is there an API? What disk image format is expected?
    Currently there is no direct upload option. Standard images are provided and updated by Hetzner on a regular basis. Customers who want to deploy their own images have two options: an ISO or the rescue system. The latter is a Debian-based live Linux that allows writing anything to the virtual disk. After taking a snapshot of that disk, a customer can use it as an image to create new instances.
    If CoreOS worked out of the box, we could add it to the list of standard images, as we already have Fedora there.

  • Are there any other platform quirks we should know about?
    Intel-based instances currently use i440fx and AMD-based ones use Q35. Both boot in legacy BIOS mode. UEFI is possible, but not exposed via the API (yet). There is no Secure Boot.
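
For anyone who wants to poke at the endpoint described above from inside an instance, here is a minimal Python sketch. It is not how Afterburn or cloud-init do it. Only the base URL comes from this thread; the `metadata/hostname`, `metadata/public-keys` and `userdata` sub-paths, the one-key-per-line response format, and all function names are assumptions made for illustration and should be checked against the Hetzner docs.

```python
import urllib.request
from typing import Callable, Optional

# Base URL quoted in this thread; the sub-paths used below are assumed.
BASE_URL = "http://169.254.169.254/hetzner/v1"


def endpoint(path: str) -> str:
    """Join a relative path onto the metadata base URL."""
    return f"{BASE_URL}/{path.lstrip('/')}"


def http_get(url: str, timeout: float = 5.0) -> Optional[str]:
    """Fetch a URL and return its body, or None if the request fails."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.read().decode("utf-8")
    except OSError:
        # urllib.error.URLError and socket timeouts are OSError subclasses.
        return None


def read_instance_info(fetch: Callable[[str], Optional[str]] = http_get) -> dict:
    """Collect hostname, SSH keys and optional userdata (hypothetical paths)."""
    hostname = fetch(endpoint("metadata/hostname"))
    # Assumption: keys are returned as plain text, one per line.
    keys = fetch(endpoint("metadata/public-keys")) or ""
    # Userdata is optional, so a missing or empty response is not an error.
    userdata = fetch(endpoint("userdata"))
    return {
        "hostname": hostname.strip() if hostname else None,
        "public_keys": [k.strip() for k in keys.splitlines() if k.strip()],
        "userdata": userdata or None,
    }
```

Passing the fetch function in as a parameter keeps the sketch testable off-platform: `read_instance_info()` uses real HTTP inside an instance, while a stub can stand in for the metadata service elsewhere.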

@lucab
Contributor

lucab commented Nov 8, 2022

@asciiprod thanks for the detailed feedback! Some additional thoughts from my side:

  • it would be great to have an official Hetzner URL as the canonical documentation for the userdata/metadata endpoint and its API
  • for the default user setup, I'd personally prefer to stay away from the "fallback to password hash". Ignition can still set up SSH keys or other credentials from its configuration, without having to transmit password hashes over HTTP.
  • as we are starting from scratch, it would be great to opt in to "UEFI only" for Fedora CoreOS images (even without Secure Boot). That would spare us many future troubles.
  • what disk format would you need for importing our images? Is that documented somewhere? Is your import flow adaptable to our release process (three parallel streams, each one releasing at least every two weeks)?
  • are there ARM aarch64 instances too? Do they work the same way?

@asciiprod

The canonical documentation is available at:
https://docs.hetzner.cloud

If we want to include the image on the platform, it must support passing a password hash as instance creation does not force selecting an SSH key.

Using UEFI-only for a given image is something that is currently not implemented. I'd have to check internally whether we could do that.

The internal workflow for images does not import any external disk images. Hetzner Cloud images are generated by automated installations (e.g. kickstart/subiquity) from distribution ISOs using packer & ansible. This produces a compressed (zstd) raw disk image, which is uploaded as an image snapshot and used to test and validate the new build on the platform. That is the point where it could be possible to import a pre-built external disk image. However, whether it is acceptable to open this process up to third-party-generated images would have to be discussed internally.

From a release and support point of view, I think we could only support the stable version.

There are currently no aarch64 Cloud instances, but since we offer Ampere dedicated servers, that's something I would keep on the list. I'd expect them to work the same way (probably UEFI-only).

@lucab
Contributor

lucab commented Nov 8, 2022

Ah great, thanks. The page I was looking for is https://docs.hetzner.cloud/#server-metadata (though it doesn't currently cover the userdata part).

@bgilbert
Contributor

bgilbert commented Nov 8, 2022

Thanks for the detailed info, this is very helpful!

> If we want to include the image on the platform, it must support passing a password hash as instance creation does not force selecting an SSH key.

I don't think we should support this. Fedora CoreOS tries to encourage the use of best practices, and passwords aren't that. On other platforms, Fedora CoreOS instances are usually configured with an SSH key passed in the Ignition config.

> From a release and support point of view, I think we could only support the stable version.

We always recommend that users run some testing and next instances alongside their stable instances to help us catch regressions before they're promoted to a stable release. Thus, those streams are an important part of any Fedora CoreOS deployment strategy. It's entirely reasonable for Hetzner not to provide customer support for those streams, but it's important that they be available alongside stable. If that isn't possible, I think we shouldn't pursue adding stable either, and either only document the custom deployment flow or not document Hetzner Cloud at all.
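
As an illustration of the "SSH key passed in the Ignition config" approach mentioned above, here is a minimal Python sketch that emits such a config. The spec version `3.4.0` and the default `core` user are assumptions (use whatever spec version your Fedora CoreOS release supports), the helper name is made up for this example, and in practice most people write Butane and transpile it rather than producing the JSON by hand.

```python
import json


def ignition_with_ssh_key(public_key: str, user: str = "core") -> str:
    """Return a minimal Ignition config (JSON) authorizing one SSH key."""
    config = {
        # Assumed spec version; adjust to what the target release supports.
        "ignition": {"version": "3.4.0"},
        "passwd": {
            "users": [
                {"name": user, "sshAuthorizedKeys": [public_key]},
            ]
        },
    }
    return json.dumps(config, indent=2)
```

The resulting JSON would be supplied as the instance's userdata, so no password hash ever needs to be fetched over HTTP.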

@asciiprod

I totally agree that SSH keys should be used and we also strongly recommend it during instance creation. But we do offer a password fallback for the existing OS images. So if CoreOS does not support it, we would need to enforce it.

In any case the more CoreOS specific changes we would need to make, the more difficult it becomes to adopt it for Hetzner Cloud.

@jlebon
Member

jlebon commented Nov 16, 2022

> I totally agree that SSH keys should be used and we also strongly recommend it during instance creation. But we do offer a password fallback for the existing OS images. So if CoreOS does not support it, we would need to enforce it.
>
> In any case the more CoreOS specific changes we would need to make, the more difficult it becomes to adopt it for Hetzner Cloud.

IIUC from the docs, it seems like the password hash is injected into the user-data, which is assumed to be a cloud-init config. Is that correct? I derived this from the fact that there's no entry for it in the Server Metadata section. (Aside: it seems like that section is missing an entry for public-keys, no?)

If that's the case, that logic would have to learn to support Ignition configs too. Password authentication is disabled by default on FCOS, so it would have to inject a drop-in for it. Also, the default sshd config (at least on Fedora) prohibits password authentication for the root user so it would have to undo that too.

What happens if no SSH keys are provided and the user-data isn't a cloud-init config? Does the API return an error because it doesn't know how to inject a root password? That seems like acceptable behaviour for the time being and avoids adding anything FCOS-specific.

@asciiprod

The metadata API provides either the user-selected SSH key or a randomly generated password hash if no SSH key is selected, so instance creation will always succeed.
I have to apologize for the incomplete docs. The metadata service does of course have a field/path for public-keys and network-config.
Please correct me if I am wrong, but as far as I understand it, the only thing currently missing to make CoreOS work on our platform is an Afterburn provider. If that is correct, having it would enable us and anyone else to start using and testing it.
It would also allow us to resolve the other questions (password support, UEFI-only, releases) separately, step by step.

@lucab
Contributor

lucab commented Nov 17, 2022

Yes, if we want to start making incremental progress on this, then the next immediate things to sort out on the FCOS side are:

  1. pick up and document a platform identifier (see https://coreos.github.io/ignition/supported-platforms/)
  2. make Afterburn and Ignition aware of this new platform (see afterburn#125, "providers: add support for hetzner cloud", and ignition#1262, "🇩🇪 Add support for Hetzner cloud")

@aral
Author

aral commented Dec 9, 2022

Hey everyone (@asciiprod, @lucab), any progress on this?

It would be really amazing to be able to boot up a Fedora CoreOS instance on Hetzner in under a minute (that’s how fast the supported instances boot up; it’s a game-changer for Small Web use) :)

@prestist added the jira label (for syncing to jira) Dec 12, 2022
@aral
Author

aral commented Feb 12, 2023

Hey folks, any updates on this? Would still love to see it happen. Has communication between Fedora and Hetzner stalled? If so, how do we get it going again? :)

@bgilbert
Contributor

@aral I think this thread has all of the needed information now, or at least most of it. There are some old Afterburn and Ignition PRs that'll need a rebase and an update based on the information here. I don't think anyone is currently working on that, but feel free to run with it if you'd like!

@travier
Member

travier commented Sep 4, 2023

I've created a PR with the "simplified" steps to add a new platform: #1562

@travier
Member

travier commented Sep 4, 2023

Ignition PR: coreos/ignition#1707
Afterburn PR: coreos/afterburn#996

@travier
Member

travier commented Oct 4, 2023

Folks interested in initial support for this platform in Fedora CoreOS should open an issue with the emerging platform template and follow the steps there. Thanks!

@aral
Author

aral commented Feb 10, 2024

Any updates on this for 2024?

I can’t imagine how launching a CoreOS installation on Hetzner’s cloud in under a minute would be bad for either Fedora or Hetzner. (Not to mention that this would have the Small Web launch on CoreOS instead of Ubuntu, which is really the only other option I see at the moment for an affordable platform with instance creation measured in seconds.)

Anyone know what’s blocking this and how we can try and route around it?

@nachtjasmin

@aral There's a really good guide by @swick that explains how to install Fedora CoreOS on Hetzner servers. It's not as easy as the other operating systems provided by Hetzner, but it's a good enough workaround until they provide official support.

@aral
Author

aral commented May 21, 2024

@nachtjasmin Thanks, Jasmin, that is a good guide indeed. Sadly, for my needs (we will eventually have thousands of servers), that isn’t good enough so I’ve decided to go with AlmaLinux on Hetzner instead. It doesn’t automatically update like CoreOS, sadly, which would have been my first choice, but eight years of security updates should give us enough time to either implement a major version update system or transition to a transactional OS later.

@aral
Author

aral commented May 21, 2024

Since this doesn’t look like it’s going to be implemented and since I’m moving ahead with using a different OS, I’m closing this. Please feel free to reopen if anything changes.

@aral closed this as completed May 21, 2024
@thomasaull

@aral Why not just leave it open, since it’s not solved yet?

@aral
Author

aral commented May 21, 2024

@thomasaull I’ll leave that decision to the Fedora CoreOS folks. They can reopen it if they decide to work on it. It’s been open for over a year and a half; there’s no reason to keep it open longer, in my view.

@thomasaull

@aral Got it. Just out of curiosity: What exactly is the issue with the snapshot approach? Boot duration too long?

@aral
Author

aral commented May 21, 2024

@thomasaull It’s too convoluted and specific to Hetzner. I don’t want to tie Domain so closely to one provider, even if Hetzner is the one we’re initially going to support, or to have a hacky workaround be the core way that servers are deployed for the Small Web.

Also, hopefully, we (Small Technology Foundation) won’t be the only ones running Domain instances. Other organisations around the world will too, so it’s just not feasible to base such a system on a workaround.

(Boot duration isn’t the issue as Domain now uses prewarmed instances.)

In the future, once we have more resources, etc., we can maybe review the decision.

Hope that helps give some insight into my, admittedly rather unconventional, needs :)

@thomasaull

@aral Thanks for the insights! I'll read up on Domain/Small Web

@aral
Author

aral commented May 21, 2024

@thomasaull If you’re going to, the end-to-end encrypted Kitten chat (https://ar.al/2023/02/20/end-to-end-encrypted-kitten-chat/) and Streaming HTML (https://ar.al/2024/03/08/streaming-html/) posts/videos should give you a good idea of where everything is. It’s a new stack, specifically for a peer-to-peer web (Small Web) 💕

@Manawyrm

> Since this doesn’t look like it’s going to be implemented

One of the biggest roadblocks currently is the UEFI requirement.
We would love to have UEFI by default for everyone as well, but rolling it out for the current products would break things for existing customers (new VMs created from existing OS images might not boot in UEFI mode if the image was created on a BIOS machine).

We also don't really want to offer a server image that doesn't boot in legacy BIOS mode (which then wouldn't boot on the older machine types).
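
For testers who want to confirm which firmware mode an instance actually booted in, a small Python sketch follows. It relies on the fact that Linux exposes `/sys/firmware/efi` only when booted via UEFI; the function name and the overridable `sysfs_root` parameter are made up for this example.

```python
from pathlib import Path


def boot_mode(sysfs_root: str = "/sys") -> str:
    """Return "uefi" if the kernel exposes EFI firmware data, else "bios"."""
    efi_dir = Path(sysfs_root) / "firmware" / "efi"
    return "uefi" if efi_dir.is_dir() else "bios"
```

Running `boot_mode()` on an instance should report `bios` on the legacy machine types described in this thread.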

@travier
Member

travier commented Jul 9, 2024

With support in Afterburn and Ignition now in stable, it should be possible to convert a QEMU FCOS image to a Hetzner one using the script in coreos/fedora-coreos-docs#651 and use it to set up FCOS on Hetzner.

Testing welcome! If successful, we should document that in the docs.

travier added a commit to travier/fedora-coreos-docs that referenced this issue Jul 16, 2024
@travier
Member

travier commented Jul 16, 2024

While we do not yet provide ready-made images for Hetzner, I've written documentation on how to set up Fedora CoreOS on Hetzner with what we have available right now: coreos/fedora-coreos-docs#654

Testing and feedback welcome!

@dustymabe
Member

> While we do not yet provide ready-made images for Hetzner

What's preventing this last piece ^^?

Looks like from the docs PR you are just changing the platform ID, is that it?

@travier
Member

travier commented Jul 17, 2024

> Looks like from the docs PR you are just changing the platform ID, is that it?

Yes, that's the only bit missing. If there are no objections then we could start building those and that would definitely make it easier to provision an instance.
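
To illustrate what "just changing the platform ID" amounts to: Fedora CoreOS images carry an `ignition.platform.id=<platform>` kernel argument, and the conversion swaps `qemu` for the new platform's identifier. The Python sketch below shows only that string rewrite. It is not the script from the docs PR, it does not locate or edit the arguments inside a disk image, and the function name and the `hetzner` default are assumptions for illustration.

```python
import re

# Matches the Ignition platform kernel argument, e.g. "ignition.platform.id=qemu".
_PLATFORM_ARG = re.compile(r"(?<!\S)ignition\.platform\.id=\S+")


def set_platform_id(kernel_args: str, platform: str = "hetzner") -> str:
    """Replace the ignition.platform.id kernel argument, adding it if absent."""
    new_arg = f"ignition.platform.id={platform}"
    if _PLATFORM_ARG.search(kernel_args):
        return _PLATFORM_ARG.sub(new_arg, kernel_args)
    # No existing argument: append one, avoiding a leading space on empty input.
    return f"{kernel_args} {new_arg}".strip()
```

For example, `set_platform_id("rw ignition.platform.id=qemu console=ttyS0")` leaves the other arguments untouched and changes only the platform value.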

@travier added the meeting label (topics for meetings) Jul 17, 2024
@yasminvalim removed the meeting label (topics for meetings) Jul 24, 2024
@yasminvalim
Contributor

We discussed this issue in the FCOS community meeting today and agreed that we will start producing Hetzner images for Fedora CoreOS.

@gaufde

gaufde commented Sep 9, 2024

> We discussed this issue in the FCOS community meeting today and agreed that we will start producing Hetzner images for Fedora CoreOS.

Does this mean that FCOS will become a 1-click install option on Hetzner Cloud?

@jbtrystram
Contributor

> Does this mean that FCOS will become a 1-click install option on Hetzner Cloud?

No, we will produce disk images that you will have to upload to Hetzner; that's the best we can do.

@gaufde

gaufde commented Sep 9, 2024

> > Does this mean that FCOS will become a 1-click install option on Hetzner Cloud?
>
> No, we will produce disk images that you will have to upload to Hetzner; that's the best we can do.

No worries! That still sounds easier than the recovery mode workarounds I keep seeing.
