This repository has been archived by the owner on Nov 23, 2021. It is now read-only.

LetsEncrypt #54

Closed
tobeycodes opened this issue Apr 8, 2016 · 43 comments

@tobeycodes
Contributor

I noticed DebOps has ACME integration. Is this something that can be integrated easily?

https://github.com/debops/ansible-pki/blob/master/docs/acme-integration.rst
https://github.com/debops/ansible-nginx/blob/master/docs/acme-support.rst
http://docs.debops.org/en/latest/ansible/roles/ansible-pki/docs/acme-integration.html

@carlalexander
Owner

I've just started going over all of this recently. For now, I was going to disable the ACME challenge in the vhost config until I could figure things out. I might look at that next while I wait for the Xenial release.

@carlalexander
Owner

Here's another relevant issue related to this.

@carlalexander
Owner

Just saw this today. Is this something we could use @drybjed?

@drybjed

drybjed commented May 16, 2016

@carlalexander CertBot is just the official Let's Encrypt client with a changed name.

The official Let's Encrypt way of handling things is not sufficient for what different applications require. With debops.pki, you also get OCSP support, custom Diffie-Hellman parameters, support for external and internal certificates in addition to Let's Encrypt, and so on. I mean, have you at least tried it? :-)

@carlalexander
Owner

No, I wasn't sure how this fit in the overall picture lol

@drybjed

drybjed commented May 16, 2016

Well, for a super short explanation: when you add a host to [debops_all_hosts] with the default DebOps playbooks, you should get working internal certificates (not signed by any "trusted" CA), as long as you have a properly configured domain (ansible_domain not empty).

If you enable the debops.nginx role on a host, on its next run debops.pki will try to acquire a set of ACME certificates for the ansible_domain and www.{{ ansible_domain }} hostnames. As long as the DNS points to the correct host, that should be sufficient to get signed ACME certificates on it. This isn't done with the default PKI realm, though, so you would most likely create a separate one for the webserver.

Everything that you need is explained in the documentation linked above by @schrapel. If anything else is missing or not explained well enough, let me know.

@carlalexander
Owner

Ok, I'll look into all of that. Will the fact that we use these options cause an issue?

pki_internal: False
pki_authorities: []

@drybjed

drybjed commented May 17, 2016

Well, that depends. First, the normal way to get ACME certificates, starting from a Debian base install on a host with debops.nginx enabled, would be:

  1. Run the common playbook, PKI with internal certificates is set up
  2. Install nginx with internal certificates, enable support for ACME HTTP-01 verification
  3. Re-run the common playbook (or at least debops.pki) which then requests and obtains the ACME certificates

The problem with PKI is that at the moment it's either enabled or not. If debops.pki is enabled on a host, debops.nginx expects some certificates and private keys to "be there" in place. You could set up debops.pki without an internal CA like above, but you are then expected to provide some kind of external certificates, either directly or using some script. Without that, debops.nginx configures nginx with SSL enabled and points it to the usual place where the certificates are expected to be. There are none, so nginx won't restart, the HTTP server is broken, and ACME requests cannot be processed.

It's basically a chicken and egg situation. The current DebOps defaults are designed to avoid that problem. I would suggest that you try that on a development server first to see what happens.
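The deadlock described above can be spotted before nginx is restarted. What follows is only a minimal sketch: it assumes the realm keeps its active certificate and key as default.crt and default.key under /etc/pki/realms/<realm>/. Those file names and the realm_has_certs helper are assumptions made for illustration, not something confirmed in this thread.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if a realm directory holds a non-empty
# certificate and a non-empty private key.
# The file names default.crt and default.key are assumed, not confirmed.
realm_has_certs() {
    realm_dir="$1"
    [ -s "$realm_dir/default.crt" ] && [ -s "$realm_dir/default.key" ]
}
```

Calling realm_has_certs /etc/pki/realms/<realm> before enabling SSL in the vhost would show whether nginx has anything to load. If it fails, nginx cannot start, and the HTTP-01 challenge cannot be served either.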

@carlalexander
Owner

carlalexander commented May 22, 2016

So having read through everything tonight, I'm still not 100% sure how to proceed. The first step would be to create a PKI realm with ACME enabled. Right now, we use the default domain one which has it disabled. What's the correct naming convention for PKI realms? Should it be {{ ansible_fqdn }}?

Once that's done, I should just need to change nginx_pki_realm to this new realm. We also would need to re-enable ACME for the reverse proxy vhost that handles SSL termination. It's currently disabled.

I assume there'll be other snags, but this is a good starting point.

@drybjed

drybjed commented May 22, 2016

Let's make a simple example case. You have a domain blogname.com which you want to use for your blog. This is an "apex domain", like 'wordpress.com', so you want everything to be inside it, on a subdomain. Your host will then be named, say, alpha.blogname.com, and you want your blog to be available at blogname.com and www.blogname.com for convenience, and you have pointed all three to your host in the DNS. In Ansible terms it would look like this:

ansible_hostname: 'alpha'
ansible_domain: 'blogname.com'
ansible_fqdn: 'alpha.blogname.com'

The domain PKI realm would have covered each case, since it uses a wildcard *.blogname.com, but that's not trusted by everybody. On the other hand, ACME does not allow wildcard domains at the moment, so you need to specify which ones you want included. In that case you can add this to the inventory for that host:

pki_host_realms:
  - name: '{{ ansible_domain }}'
    subdomains: [ 'www' ]
    acme_subdomains: [ 'www' ]

This should be enough for debops.pki to set up the ACME certificates once debops.nginx is configured on the host. You don't need to specifically enable ACME since it's enabled by default (the default domain PKI realm has it explicitly disabled).

When debops.nginx configures a server, it checks the list of configured names against the list of available PKI realms on a given host. This means that if you set up a server, say,

nginx_servers:
  - name: [ 'blogname.com' ]

the role will check if blogname.com PKI realm exists, and if yes, use that realm by default. So you don't need to change nginx_pki_realm for this to happen.
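The lookup described above can be modeled roughly as follows. This is a sketch, not the role's actual logic: pick_realm is a hypothetical helper, and it assumes a realm "exists" when a directory with that name is present under the realms root, and that the first matching server name wins.

```shell
#!/bin/sh
# Hypothetical model: print the first server name that has a matching realm
# directory, otherwise print the fallback realm.
#   $1 = realms root (for example /etc/pki/realms)
#   $2 = fallback realm name (what nginx_pki_realm would otherwise point to)
#   remaining arguments = server names
pick_realm() {
    realms_root="$1"
    fallback="$2"
    shift 2
    for name in "$@"; do
        if [ -d "$realms_root/$name" ]; then
            echo "$name"
            return 0
        fi
    done
    echo "$fallback"
}
```

With a blogname.com realm in place, pick_realm /etc/pki/realms domain blogname.com prints blogname.com. Without it, the call prints the fallback, domain.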

@carlalexander
Owner

This is exactly what I needed! Thank you!

@carlalexander
Owner

Started running some tests tonight. Ran into an issue with the Download private realm contents by host task. It fails when I create my new realm because DebOps doesn't create a directory for it in the secret/pki/realms/by-host directory.

Not sure if this is a valid debops.pki issue or not, but I don't want to force anyone to create that directory. It doesn't even look like it's needed.

@drybjed

drybjed commented May 23, 2016

The directories can be created automatically by the debops.secret role. To feed it proper data, you need to add the debops.pki/env role before it. Check the debops.pki playbook to see what it looks like.

@carlalexander
Owner

carlalexander commented May 25, 2016

Alright, so debops.pki/env works. Tested it tonight. I can't seem to get acme_subdomains to work. Here's a test realm config:

wordpress_pki_default_realm:
  name: 'carlalexander.ca'
  subdomains: [ 'dev' ]
  acme_subdomains: [ 'dev' ]

With this config, I'm just looking for a certificate for dev.carlalexander.ca. It doesn't seem to be working.

I get an error.log every time in /etc/pki/realms/carlalexander.ca/acme/. The error says couldn't download http://www.carlalexander.ca/.well-known/acme-challenge/D8iflQNI7fPn8-t2cUCsEmfcFPKMcOoXGYvFBxPgGeU. I'd expect the challenge to be http://dev.carlalexander.ca/.well-known/acme-challenge/D8iflQNI7fPn8-t2cUCsEmfcFPKMcOoXGYvFBxPgGeU.

Did I understand that wrong and the top level domain needs to be available at all times for challenges?

@drybjed

drybjed commented May 25, 2016

The www.carlalexander.ca was added because the www. subdomain is set in pki_acme_default_subdomains variable. To disable this you can either overwrite that variable with [] or add item.acme_default_subdomains: [] to the realm configuration.
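To make the effect of the default subdomains concrete, here is a simplified model of how the list of requested names might be assembled. It is not the role's actual code: acme_names is a hypothetical helper, and it assumes the realm name itself is always requested, followed by the default subdomains and then the per-realm acme_subdomains.

```shell
#!/bin/sh
# Simplified, assumed model of the hostnames an ACME request would cover.
#   $1 = realm name
#   $2 = space-separated default subdomains (pki_acme_default_subdomains)
#   $3 = space-separated per-realm acme_subdomains
acme_names() {
    realm="$1"
    echo "$realm"
    # Word splitting on $2 and $3 is intentional: each word is one subdomain.
    for sub in $2 $3; do
        echo "$sub.$realm"
    done
}
```

Under this model, acme_names carlalexander.ca "www" "dev" lists three names, including www.carlalexander.ca, which matches the unexpected challenge URL in the error log. acme_names dev.carlalexander.ca "" "" lists only dev.carlalexander.ca.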

@carlalexander
Owner

carlalexander commented May 25, 2016

So I got it to work with a realm config like this:

wordpress_pki_default_realm:
  name: 'dev.carlalexander.ca'
  acme_default_subdomains: []
  default_subdomains: []

I'll try something like this later:

wordpress_pki_default_realm:
  name: 'carlalexander.ca'
  acme_default_subdomains: [ 'dev' ]
  default_subdomains: [ 'dev' ]

But I have a bit of a chicken-and-egg problem. I'm still getting the error.log with the same error that it can't download the challenge. Meanwhile, nginx won't start because the SSL certificates aren't present.

Shouldn't we get certificates first before the challenges? Are you assuming that self-signed certificates are present? Wondering if it's an issue because I set pki_internal: False and pki_authorities: [].

@drybjed

drybjed commented May 25, 2016

Yes, this is the chicken and egg issue which should be solved by self-signed certificates. You could also provide external certificates as files, or via a script; internal certificates just fix the problem automatically.

@carlalexander
Owner

carlalexander commented Jun 3, 2016

I finally got Let's Encrypt to work (somewhat). I got it to work by hand, but there are still some issues left to address. I'll document them and work on them next.

By default, nginx doesn't have access to the challenge folder. This was the cause of the redirects to acme.{{ ansible_domain }}. acme-tiny puts a file there with mode 640. This makes it impossible for nginx to read the challenge file.

My workaround was to add www-data to the pki-acme group. I don't think this is a good solution to the problem. I'm not sure what makes the challenge file be 640, but it should be 644. I'll try to find a solution to make that work. I think that should unlock everything.

There's also a possibility that the acme-tiny script will still run too early. If that's the case, I'll need to delete the error.log and run the pki-realm-scheduler again in the wordpress role.

Thanks again for the help @drybjed! (Enjoy your vacation!)

@carlalexander
Owner

carlalexander commented Jun 3, 2016

More testing today. I've created a wordpress_ssl_provider variable like the one in Trellis. selfsigned and letsencrypt work. I have to rework the manual certificate upload to co-exist with debops.pki better.

I wasn't able to find a workaround for the file permission issue. There's an issue and a thread about it, but they all involve modifying the acme-tiny code. That's not a valid solution. I went with adding www-data to the pki-acme group and restarting nginx. Works without an issue that way.

During testing, I realized that there's no way to update a PKI realm once we create it. That means that, if you configure a server with one SSL provider, there's no way to change it to another. Not sure if it goes against debops.pki or not. We'll see in a few weeks.

@drybjed

drybjed commented Jun 3, 2016

Why wait a few weeks? :-)

For the file/directory permission issue: the /srv/www/sites/acme/public/.well-known/acme-challenge/ directory is created by the debops.pki role with 0755 permissions, so the www-data user should be able to read its contents. You can also check this by running the command:

root@host:~# namei -m /srv/www/sites/acme/public/.well-known/acme-challenge 
f: /srv/www/sites/acme/public/.well-known/acme-challenge
 drwxr-xr-x /
 drwxr-xr-x srv
 drwxr-xr-x www
 drwxr-xr-x sites
 drwxr-xr-x acme
 drwxr-xr-x public
 drwxr-xr-x .well-known
 drwxr-xr-x acme-challenge

Another matter are the files created by the acme-tiny script. I'm not sure at the moment, but they should be world-readable, although I would try to check this by disabling the removal in the script and checking the permissions.

The PKI realms managed by debops.pki are intentionally not modifiable. This comes from the fact that it's not you who is making the changes, but a third party. Let's break it down. The remote host creates a private and public key, and generates a CSR with the requested CN, domain, etc. Then the CSR is submitted to the Certificate Authority, which signs it and generates a certificate. The signature is placed on the specific contents of that certificate. What would you then want to change?

  • you cannot modify the private key, because then you would decouple it from the signed certificate and it would be useless;
  • you cannot change the public key in the CSR, because it needs to be signed by a third party for the change to be effective;
  • you cannot change the signed certificate, because then signature becomes invalid;

As you can see, there's nothing to change in the PKI realm itself. A common way to update the PKI realm is to just remove the entire /etc/pki/realms/<realm>/ directory and re-run debops.pki. If you use all of the mechanisms in the role (config variables, secret directories), all of the information needed to recreate a PKI realm is available and will be used to recreate it.

If by changing the SSL provider you mean switching from internal CA certificates to ACME certificates, debops.pki is written in such a way that different "certificate authorities" (external script, ACME, internal CA, self-signed certificates) keep their files in separate directories, with the private key being the same in each case. Each PKI realm uses a list in the above order to determine which set of private key + authority to use at a given moment, based on the existence of a signed and valid certificate. That means that when you create a PKI realm with an internal CA certificate, or even a self-signed certificate, enabled, and after that activate the ACME certificate by installing debops.nginx, the debops.pki role will automatically switch the active certificate to the one provided by the ACME authority. You can verify that by checking which set of certificates is symlinked in the public/ directory of a particular PKI realm.
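The selection rule described above can be sketched like this. It is an illustration under stated assumptions: active_authority is a hypothetical helper, the per-authority directory names and the cert.pem file name are assumed, and "signed and valid" is reduced to "the file exists and is non-empty".

```shell
#!/bin/sh
# Sketch: walk the authorities in the preference order given above and print
# the first one that has a certificate file in its own directory.
# Directory names and cert.pem are assumptions; no validity check is done.
active_authority() {
    realm_dir="$1"
    for authority in external acme internal selfsigned; do
        if [ -s "$realm_dir/$authority/cert.pem" ]; then
            echo "$authority"
            return 0
        fi
    done
    return 1
}
```

Under this model a realm that starts with only a self-signed certificate reports selfsigned, and switches to acme as soon as an ACME certificate shows up, without touching the realm configuration. On a real host, ls -l /etc/pki/realms/<realm>/public/ shows where the symlinks actually point.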

@carlalexander
Owner

Another matter are the files created by the acme-tiny script. I'm not sure at the moment, but they should be world-readable, although I would try to check this by disabling the removal in the script and checking the permissions.

This is what I did yesterday. acme-tiny creates the challenge with 640 as the file permissions. Doesn't matter if the folder was 755. The ticket and thread discuss this issue. The acme-tiny author said you can edit the script if you want, but he won't change it.

As for the realm issue, I figured this is what you wanted to do. The only issue is this: let's say someone is using a manual provider. I'm going to turn ACME off since they don't need it. If I want to switch to ACME later, I can't turn it on anymore. Even if I turned ACME on for external, I still wouldn't be able to remove the external certificate because I can't change the realm config.

I'm not sure how often it'll happen in practice. I did your workaround and it works fine. It's not a deal breaker anyhow. 😄

@drybjed

drybjed commented Jun 3, 2016

OK, I've looked at the Let's Encrypt thread, and the issue might be with the restrictive umask set by the pki-realm script. This might be changed for the acme-tiny execution around here. If you want to test it, add similar code to remember the current umask and restore it after the acme-tiny script is finished. Changing the umask to a less restrictive one should make the resulting files world-readable. And the acme-tiny script doesn't need to be changed at all.
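A minimal sketch of the save-and-restore idea, assuming a POSIX shell. with_umask is a hypothetical helper, not code from pki-realm. In the real script the wrapped command would be the acme-tiny invocation.

```shell
#!/bin/sh
# Sketch: run one command under a different umask, then restore the old one.
#   $1 = umask to use for the command
#   remaining arguments = the command and its arguments
with_umask() {
    new_mask="$1"
    shift
    old_mask="$(umask)"   # remember the current umask
    umask "$new_mask"
    "$@"
    status=$?
    umask "$old_mask"     # restore it regardless of the command's result
    return "$status"
}
```

For example, with_umask 022 touch challenge-file creates the file world-readable even when the surrounding script runs with a stricter umask. Files created afterwards still get the stricter mode.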

I'm not sure I understand the PKI realm issue. What's the manual provider? Do you mean using external certificates provided manually via the secret/ subdirectories, and that you cannot then replace them with ACME certificates? You can either stop using external certificates altogether if you plan to use ACME, since self-signed certificates should ensure that nginx starts correctly, or you could try changing the order of the pki_authority_preference list to put the acme authority first, making it the preferred one.

@carlalexander
Owner

manual is just what I want to use to upload a custom certificate using private_files. I'll look at the umask stuff and report back!

@carlalexander
Owner

Made changes to the old code for the manual provider. I made it work with the new WordPress PKI realm that the role creates. I just need to test the umask change for the pki-realm script. What umask would you use @drybjed?

@drybjed

drybjed commented Jun 6, 2016

@carlalexander You can try umask 022.
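The arithmetic behind that suggestion: a new file is normally requested with mode 666, and the umask clears bits from it. The small helper below (mode_for_umask is a hypothetical name) prints the resulting mode. The 027 value is only an assumed example of a restrictive umask that would produce the 640 files seen earlier, because the thread does not say what pki-realm actually sets.

```shell
#!/bin/sh
# Print the mode a regular file gets when created with mode 666 under the
# given umask: 666 AND NOT umask, shown in octal.
mode_for_umask() {
    printf '%o\n' $(( 0666 & ~0$1 ))
}

mode_for_umask 022   # prints 644
mode_for_umask 027   # prints 640 (027 is an assumed example value)
```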

@aelsharawi

Great job on this, everyone. Any ETA on the Let's Encrypt implementation?

@carlalexander
Owner

I'm 90% done, but I'm leaving on vacation Friday. Not sure I'll have time to tackle it before then.

@carlalexander
Owner

Just pushed the reworked integration with debops.pki. This includes support for ACME (e.g. Let's Encrypt) and selfsigned certificates. If you set wordpress_ssl: True, the new default is to create a Let's Encrypt certificate.

Only thing left is to update all the documentation in the wiki. I'll try to tackle it this week.

@tobeycodes
Contributor Author

@carlalexander I'm getting ERROR! 'pki_env_secret_directories' is undefined when using wordpress_ssl: True

@carlalexander
Owner

Is DebOps up to date? It's supposed to be generated by pki/env on the previous step.

@tobeycodes
Contributor Author

@carlalexander I ran sudo debops-update before trying this. I'll try it again

@tobeycodes
Contributor Author

tobeycodes commented Jul 4, 2016

Nope, didn't work.

This shows up in the output of debops -u root, but there's no ok or anything else below it:

TASK [debops.secret : Create secret directories on Ansible Controller] *********

@carlalexander
Owner

Is it when you run debops -u root or debops wordpress -u root?

@tobeycodes
Contributor Author

@carlalexander it fails on debops wordpress -u root

debops -u root reports no errors

@carlalexander
Owner

What version of Ansible are you using?

@tobeycodes
Contributor Author

2.0.1.0

@carlalexander
Owner

Can you try updating it and seeing if it works? I'm at 2.0.2.0. Not sure if that's the issue. I committed everything so I'm not sure what's going on.

@tobeycodes
Contributor Author

That seems to have solved the issue. So many dependencies to keep updated. I lose track of it all haha

@carlalexander
Owner

I know, sorry!

@tobeycodes
Contributor Author

@carlalexander how do you suggest handling www here? If it points to the same server, the certificate is only valid for the non-www domain from the hosts file.

@carlalexander
Owner

If ansible_domain matches wordpress_domain, the www subdomain is added by default. Otherwise, you need to add it yourself.

@tobeycodes
Contributor Author

Thanks. I've added it myself

@carlalexander
Owner

Just finished updating the wiki. Closing this. Thanks again for all the help @drybjed!
