more notes on Letsencrypt | technotes.seastrom.com

Mon 21 March 2016
misc

Always cautious and keeping in mind the story of Mabel, which was one of the tales behind "always mount a scratch monkey", I had the foresight to use a scratch domain name for experimenting with Letsencrypt.

The whole idea behind letsencrypt is to automate the process of issuing certs and make them free. This provides a great opportunity to integrate with one's autoprovisioning scripts such as Ansible, Puppet, etc. but it is worth knowing about the limitations up front as it will inform one's choices...

Letsencrypt has some well-documented rate limits for issuance of certificates. I skimmed them and correctly identified a personal pinch point of "limited to 5 certificates per domain per week" and figured I'd be good to go if I iterated among sandbox37.seastrom.org, sandbox38.seastrom.org, sandbox39.seastrom.org, etc.

What I missed was "public suffix". What's that, you ask? Don't feel bad, I count as a DNS SME in some circles and this was a new one on me. Apparently spurred by the Mozilla Foundation among others, the Public Suffix List is a list of domain names under which there might be a zone cut to someone who pays the registrar money. Or something like that (I don't think .edu and .gov result in cash changing hands but you get the idea).

To be fair, this is off-label use of the public suffix list, which was originally intended to be used mainly for scoping what cookies are allowed to be set in a browser. The maintainers specifically warn against using it "to determine what is a valid domain name and what isn't", citing the proliferation of new gTLDs.

So, in fact the limitation is not 5 certificates issued per week for sandbox37.seastrom.org, sandbox38.seastrom.org, sandbox39.seastrom.org, etc., but rather 5 certificates per week for seastrom.org. That's troublesome for me at home given the hosting of mail domains that I do on different lightweight servers for friends. It's more troublesome for me at work, since I had been contemplating using it for machine.lab.dayjob.com. And I bet it's a whole ball of fun at , which, by the way is not on the public suffix list despite very broad and uncoordinated fanout at least one level down and perhaps two. I bet a similar situation exists at most research universities.
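To make the grouping concrete, here is a minimal Python sketch of how hostnames collapse onto a single registered domain when counted against a public suffix. The `PUBLIC_SUFFIXES` set is a tiny hand-picked stand-in for the real Public Suffix List (which also has wildcard and exception rules that this ignores), and `registered_domain` is a name made up for illustration. It shows the idea only; I make no claim that this is how Letsencrypt implements the check.

```python
from collections import Counter

# Hypothetical, tiny stand-in for the real Public Suffix List.
PUBLIC_SUFFIXES = {"org", "com", "edu", "co.uk"}


def registered_domain(hostname):
    """Return the longest matching public suffix plus one more label."""
    labels = hostname.lower().rstrip(".").split(".")
    # Walk from the longest candidate suffix to the shortest, so that a
    # multi-label suffix such as "co.uk" is matched before a shorter one.
    for i in range(1, len(labels)):
        if ".".join(labels[i:]) in PUBLIC_SUFFIXES:
            return ".".join(labels[i - 1:])
    raise ValueError("no known public suffix in %r" % hostname)


hosts = [
    "sandbox37.seastrom.org",
    "sandbox38.seastrom.org",
    "sandbox39.seastrom.org",
    "machine.lab.dayjob.com",
]

# Count how many certificate requests land on each registered domain.
per_domain = Counter(registered_domain(h) for h in hosts)
for domain, count in sorted(per_domain.items()):
    print(domain, count)
```

All three sandbox names land in the same seastrom.org bucket, which is exactly why iterating through sandboxNN hostnames does not dodge the weekly limit.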

I'm sure that the fix is a complex ~~bit of bikeshedding~~ balancing act and nobody asked me, so I'm not going to offer unsolicited advice on the Right Thing here. cough cough someday Jekyll will get GFM but I digress.

I don't know what the rate limits are for their staging environment but one must assume that the limit is not "infinity".

Here's another limitation - one can do SAN certificates (with the usual limit of 100 Subject Alternative Names) but not wildcard certificates. The latter makes perfect sense; letsencrypt issues DV certs and does the validation via a nonce-in-a-magic-directory handshake over http on port 80 to the hostname for which one is issuing certs. In order to do wildcard certificates "correctly", one would have to support the "put some magic stuff in the DNS" model, which requires more state on the letsencrypt end and is a bit of a corner case compared to the intended audience - Joe Sixpack putting a cert on his e-commerce or blog site.
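The nonce-in-a-magic-directory handshake can be sketched roughly as follows. This assumes the `/.well-known/acme-challenge/` path used by the HTTP challenge and the "token, dot, account key thumbprint" file contents that acme-tiny writes. The function name, the temporary webroot, and the sample token and thumbprint are placeholders for illustration; a real client receives the token from the ACME server and derives the thumbprint from its account key.

```python
import os
import tempfile

# Directory (relative to the web root) that the validation server fetches from.
CHALLENGE_DIR = os.path.join(".well-known", "acme-challenge")


def publish_challenge(webroot, token, thumbprint):
    """Write the challenge response file where the web server will serve it."""
    directory = os.path.join(webroot, CHALLENGE_DIR)
    os.makedirs(directory, exist_ok=True)
    path = os.path.join(directory, token)
    with open(path, "w") as f:
        # File body is the token, a dot, and the account key thumbprint.
        f.write("%s.%s" % (token, thumbprint))
    return path


# Placeholder values; a throwaway directory stands in for the real web root.
webroot = tempfile.mkdtemp()
path = publish_challenge(webroot, "example-token", "example-thumbprint")
print(path)
```

Since the check is a plain fetch over port 80 of one file per hostname, there is no obvious way for it to prove control of every name a wildcard would cover.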

I suspect based on what I've been reading so far that the SAN handshake mechanism is iterative for every SAN that is published, and that there is a bit of an assumption that all the SAN names actually go to the same box and so the configuration is centralized. Again, correct in the standard case, but not in mine where I might be doing this to get certificates issued for *.hmail.seastrom.com, where all of the mail-hosting VMs live.

By the way, for people who like a bit more transparency into what's going on under the hood (and it's a good way to understand the protocol), my friend Jon pointed me at acme-tiny - 200 lines of python and calls to the openssl command line utility.

Things to test/evaluate/ask others:

It already seems based on my experimentation that letsencrypt does not care about the http hostname of the server. Can I run haproxy on all of the hmail VMs and just proxy back to a centralized place where I'm generating up (a|b|c|d|e...).hmail.seastrom.com certs and we're good?
It is documented that revoking a cert does not decrement the certs/week counter.
So long as I have the account key for an account that has been previously authorized for a claimed identifier (== DNS hostname), can I run the ACME client from a different IP address for renewal purposes? This is all but spelled out in the ACME spec as a separate and decoupled step: "Once the client has authorized an account key pair for an identifier, it can use the key pair to authorize the issuance of certificates for the identifier." but may be implementation-dependent in the case of letsencrypt.
Is there a preferred method or workflow for pulling the certs back to a central (Ansible) location once they're generated? Maybe the right workflow is to push the centrally generated account-per-vm RSA keys to the VM, run the authentication there (once), and then generate the certs on the Ansible host using that account information, then push them out. Decisions, decisions.
Does a certificate renewal count against my 5 certificates per week? If so, that would seem to limit one to (12 week cert life * 5 certs/week) = 60 live certs on hosts under a single domain name given optimal timing. Is renewal "special" in some way or is it just another cert generation?
Does generating a new certificate with a superset of preexisting SANs increment the counter? (I'm betting so).
Can I get certs for IPv6 single-stacked hosts? There are AAAA records for acme-v01.api.letsencrypt.org and acme-staging.api.letsencrypt.org (which seem to be Akamai-hosted) so there is some hope here.
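The back-of-the-envelope ceiling from the renewal question above can be checked with a small simulation. This is a sketch under stated assumptions: every issuance, renewals included, counts against the weekly limit, certs live for a round 12 weeks, and issuance is perfectly timed. The function name and its parameters are invented for illustration.

```python
def steady_state_live_certs(lifetime_weeks=12, certs_per_week=5, weeks=52):
    """Simulate issuing the weekly maximum and count certs still valid."""
    live = []  # expiry week of every cert that is still valid
    for week in range(weeks):
        # Drop certs that have reached their expiry week.
        live = [expiry for expiry in live if expiry > week]
        # Issue as many new certs as the weekly limit allows.
        live.extend([week + lifetime_weeks] * certs_per_week)
    return len(live)


print(steady_state_live_certs())
```

Under those assumptions the count levels off at 12 * 5 = 60 and never climbs past it, however long the simulation runs.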