
SSL certs for GO services (HTTPS) #53

Closed
cmungall opened this issue Mar 19, 2015 · 48 comments

Comments

@cmungall
Member

We need SSL certificates for geneontology.org.

@kltm
Member

kltm commented Mar 19, 2015

The idea would be to get a blanket cert for geneontology.org and all subdomains. More expensive, but future-proof.

@kltm
Member

kltm commented Mar 19, 2015

This would be necessary to support sites and services that consume GO data over HTTPS. Current (and future) versions of some web browsers prevent mixing HTTP and HTTPS content.

@stuartmiyasato

I am planning to move forward with this now. As I mentioned in a previous email, we can get a three-year AlphaSSL wildcard cert for geneontology.org and *.geneontology.org for $120. This is the cert we are using for the ENCODE project and it seems to be working well. If we want subdomains of geneontology.org (e.g. www.subdomain.geneontology.org) we will need to get a separate cert for that subdomain.

Do we want to discuss specifics about implementation (e.g. redirecting HTTP to HTTPS, specific services, etc.) in this ticket or open new tickets for each topic?

@kltm
Member

kltm commented Jun 1, 2015

Just to clarify: does the wildcard not allow for subdomains, or are you referring to sub-subdomains with your example?

Ugh, this is going to be messy. Essentially, because there can be no mixing for services (http/s considered x-domain), the only way to do this that I am aware of would be to make everything go over to one or the other.

I guess the first thing would be to:

  • set everything up (get certs, etc.)
  • pilot it
  • switch for single domain entities (geneontology.org, etc.)
  • then switch everything over for sites that need more coordination (AmiGO)

@stuartmiyasato

Hopefully the following will clear up the cert issue for subdomains. The cert I would get would cover these two cases:

  1. geneontology.org
  2. *.geneontology.org where the * is a value that doesn't include a dot. :)

For values of * that do include a dot, we'd need to get a separate cert. If we have two subdomains, one called sd1 and one called sd2, we'd need the following.

  1. for sd1.geneontology.org and *.sd1.geneontology.org, we'd need a separate cert.
  2. for sd2.geneontology.org and *.sd2.geneontology.org, we'd need a separate cert.

Again where * does not contain a dot.

Hopefully this clears up the subdomain issue. I'm not sure that I'm following the rest of the conversation just yet, but I think that will come in time, perhaps with more specific examples.

@kltm
Member

kltm commented Jun 2, 2015

Okay, it looks like first-level subdomains are fine then.

For the rest, it looks like some of it can be wheeled out a bit at a time (https://developer.mozilla.org/en-US/docs/Web/Security/Same-origin_policy), with only minor hiccups in debuggability, etc. I think there are a couple of cases that will run afoul of the x-domain stuff, but as long as everything is set up to work either way, we can probably fix them on a case-by-case basis.

@stuartmiyasato

I have the cert for geneontology.org and *.geneontology.org in hand. I got the AlphaSSL wildcard SSL cert from www.ssl2buy.com. Cost was $120 for three years. The private key and CSR are in user ubuntu's home directory on geneontology.org.
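If anyone wants to double-check what the cert covers once it is installed, here is a quick way to list the names; the filename below is just an example, not the actual path on the server:

```sh
# Inspect a local copy of the issued cert (filename is hypothetical)
openssl x509 -in geneontology_org_wildcard.crt -noout -text | grep -A1 "Subject Alternative Name"

# Or pull the cert straight off a running server and do the same
openssl s_client -connect geneontology.org:443 -servername geneontology.org </dev/null 2>/dev/null \
  | openssl x509 -noout -text | grep -A1 "Subject Alternative Name"
```

It should list both geneontology.org and *.geneontology.org.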

@kltm
Member

kltm commented Jun 3, 2015

Okay, great--cheaper than I remember. I'm going to be tied down for the next little bit, but you should go ahead with any experiments that don't require any coordination.

@stuartmiyasato

I set up an AWS instance to test the SSL cert. The URL is https://www-oregon.geneontology.org/. I basically cloned the existing geneontology.org site and moved the clone to a different region (US-West-2 Oregon). I changed the Apache configuration to use HTTPS. I also set up a virtual host on port 80 that simply redirects the same query to port 443. It seems to work transparently from my testing. @kltm, do you have a test suite that we can point at this server?

At the least, this shows that the cert is valid so we don't have to worry about that.
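For reference, the shape of that Apache setup is roughly as follows; this is a minimal sketch with illustrative paths and cert filenames rather than the exact config on the instance:

```apache
# Port 80: hand the same request over to HTTPS
<VirtualHost *:80>
    ServerName www-oregon.geneontology.org
    Redirect permanent / https://www-oregon.geneontology.org/
</VirtualHost>

# Port 443: serve the cloned site over TLS using the wildcard cert
<VirtualHost *:443>
    ServerName www-oregon.geneontology.org
    DocumentRoot /var/www/geneontology
    SSLEngine on
    SSLCertificateFile      /etc/ssl/certs/wildcard.geneontology.org.crt
    SSLCertificateKeyFile   /etc/ssl/private/wildcard.geneontology.org.key
    SSLCertificateChainFile /etc/ssl/certs/alphassl-intermediate.crt
</VirtualHost>
```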

@kltm
Member

kltm commented Jun 5, 2015

Great.

We have no unit tests (yet) for the main site besides the spider. However, the likely problem areas are few. For example:

https://www-oregon.geneontology.org/page/download-annotations

Is failing with:

Blocked loading mixed active content "http://viewvc.geneontology.org/viewvc/GO-SVN/trunk/gene-associations/go_annotation_metadata.all.js"

Which is pretty much as we expected--we'll have to get viewvc over to https there for this ugly hack to work. But it does illustrate the type of failures I'd expect for the various AmiGO bits as well.

@stuartmiyasato

The various sites are so intertwined on the meatloaf.stanford.edu filesystem that I think this HTTPS transition is likely to be an all-or-nothing proposition. (With the exception of AmiGO and GOlr, which are more like a standalone group.) What do you all think of starting that process of migrating the meatloaf-based services up to AWS? (The services largely consisting of SVN/CVS repos and their viewers, the current non-AWS portion of geneontology.org, the anonymous FTP server, the database archives, and the wiki.)

@cmungall
Member Author

cmungall commented Jun 5, 2015

Seems reasonable


@kltm
Member

kltm commented Jun 5, 2015

Is the idea that the configurations of the various systems and how they relate will be simplified by the migration to AWS?
Otherwise, it seems like just getting HTTPS available on all of them first would allow us to move forward in a more piecemeal fashion.

@stuartmiyasato

I would love to simplify the configurations, but I'm not sure if that's really feasible (or even possible) due to the tangled web of symlinks that is the filesystem. But if I can sort it out, I'd love to simplify it.

The main reasons for my wanting to deploy the copy in AWS are:

  1. I want to deploy both the HTTPS site and concurrently have the HTTP site redirect to the HTTPS site. We can't test that with the existing systems without simply cutting over -- we'd be testing in production. I'd like to test in an isolated environment to give us time to debug any issues that crop up, and I think it's a good bet that issues will indeed crop up as part of the migration.
  2. If our long-term goals are to move services into the cloud anyway, this would give us a head start down that road as a by-product.

@kltm, I suspect this doesn't really answer your question. Hopefully it gives you an idea of my current mindset, but please do elaborate on your questions if I didn't give you the answer you are looking for.

@kltm
Member

kltm commented Jun 5, 2015

I guess my gut feeling would be that adding HTTPS to the current servers and then testing service-by-service would be, while possibly a little wobbly as things cut over and back, probably less work than getting everything in the cloud, testing with different names, then trying to flip the switch and debugging everything at once. At least it would be less stressful from my POV; x-domain resources can go over to HTTPS first, as they are pretty much optional and would cause minimal disruption while testing, and once those are cleared the whole domain could forward.

However, YMMV, and you will be doing the heavy lifting in either case.

For immediate use cases, I don't think anything particularly needs to be HTTPS in the foreseeable future with the exception of AmiGO and GOlr, since some institutions that want to use GOlr need to be able to use HTTPS for policy reasons.

@kltm
Member

kltm commented Jul 1, 2015

From an email with @stuartmiyasato earlier today, a lot of progress has been made on this front.

@kltm
Member

kltm commented Feb 5, 2019

Now using certbot pretty generally. We have tested it out for the main site and are pretty happy.
The work in operations for geneontology/geneontology.github.io#101 has given us most of what we need. I think the easiest might be to support both HTTP and HTTPS for a while, with the latter just a proxied HTTP, without the forced redirect.

@kltm kltm changed the title SSL certs for GO services SSL certs for GO services (HTTPS) Apr 10, 2021
@kltm
Member

kltm commented Apr 10, 2021

Okay, thinking about a bit more of a step-by-step roadmap. For starters:

@kltm
Member

kltm commented Oct 10, 2022

Tracking some internal parts at: https://github.com/berkeleybop/bbops/issues/27

@kltm
Member

kltm commented Nov 23, 2022

@abessiari From our conversation on Monday, I wanted to clarify some possible ways forward for the HTTP/HTTPS transition wrt AmiGO and GOlr.

As we transition, unless we do everything all at once (hard), we'll need to walk services over due to web browsers not allowing mixed URL schemes. Working our way up from our most primitive services, we want to switch over in such a way that APIs making direct calls don't error out. A path forward for any given AmiGO instance would be:

  1. amigo and golr are both http
  2. make the solr/golr part of an amigo instance respond to both http and https calls (two vhosts at the proxy); this way https web apps can still use Solr in either mode, and AmiGO can continue on
  3. once we're sure that all needed APIs are available on https, switch the amigo instance over to upgrading to https--this requires both a proxy change and a config change in amigo.yaml
  4. at some point in the future, change solr over to upgrading to https as well

For our purposes, I think we can do 1-3 all at once, but if there is something we need that is not available on https, we'd have to stop at step 2 until we fix it.

Does this make sense?
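A rough sketch of what step 2 could look like at the proxy; hostnames, the backend Solr port, and cert paths here are assumptions rather than our actual config, and the amigo.yaml change from step 3 isn't shown since the exact key depends on the instance:

```apache
# Same GOlr/Solr backend exposed on both schemes during the transition
<VirtualHost *:80>
    ServerName golr.geneontology.org
    ProxyPass        / http://localhost:8983/solr/
    ProxyPassReverse / http://localhost:8983/solr/
</VirtualHost>

<VirtualHost *:443>
    ServerName golr.geneontology.org
    SSLEngine on
    SSLCertificateFile    /etc/letsencrypt/live/geneontology.org/fullchain.pem
    SSLCertificateKeyFile /etc/letsencrypt/live/geneontology.org/privkey.pem
    ProxyPass        / http://localhost:8983/solr/
    ProxyPassReverse / http://localhost:8983/solr/
</VirtualHost>
```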

@kltm
Member

kltm commented Nov 23, 2022

@abessiari TL;DR:
golr: http and https; amigo: http or https

@tgbugs

tgbugs commented Jul 27, 2023

We import http://geneontology.org/formats/oboInOwl in the NIF-Ontology, and not having it resolve breaks the import chain, since it is imported in nif_backend: https://raw.githubusercontent.com/SciCrunch/NIF-Ontology/master/ttl/nif_backend.ttl.

If this is not fixed, it means that all old versions of the NIF-Ontology are dead in the water and cannot be loaded in Protégé without significant effort from the user. (In the immediate moment, it is blocking a data release because we cannot run robot on anything in the NIF-Ontology.)

@cmungall
Member Author

Hi @tgbugs

We will continue to support both

  1. oboInOwl annotation property URIs
  2. resolving the oboInOwl vocabulary (as a whole, and also hash-based annotation property URIs)

Apologies for the temporary downtime. Some temporary workarounds follow. I am sure you know these already and it's just hard slotting them into the existing workflows, but I'm including them for reference for anyone who ends up here:

  • for robot and Protégé, you can always include a catalog-v001.xml file and have them resolve networked vocabularies locally (I recommend this in general for all external dependencies; see the sketch after this list)
  • you should move to importing OMO, which should include the APs you need from oboInOwl (I realize this would likely have to be part of a bigger NIF-Ontology refactor)
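For reference, a minimal catalog-v001.xml along the lines of the first bullet, mapping the oboInOwl IRI to a local copy (the local path is just an example):

```xml
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<catalog prefer="public" xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
    <!-- Resolve the oboInOwl vocabulary from a local file instead of over the network -->
    <uri name="http://geneontology.org/formats/oboInOwl"
         uri="local-imports/oboinowl.owl"/>
</catalog>
```

Protégé will pick this up automatically when it sits next to the ontology file; robot can be pointed at it with --catalog if it doesn't find it on its own.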

@kltm
Member

kltm commented Jul 27, 2023

@suzialeksander @cmungall

After a lot of attempts, I'm not seeing a way forward with GH Pages or Cloudflare for supporting HTTPS and forwards/redirects, without changing DNS provider for the root domain (which causes other devops problems that I'd rather not deal with right now). The fundamental issue is that AWS Route 53 does not allow CNAMEs/aliases for apex domains (although other providers like Cloudflare are less picky about https://datatracker.ietf.org/doc/html/rfc1034).

I'm going to revert to the original GO proxy with the result that we cannot do HTTPS through a third party and must provide it through our own proxy: TODO. This will be a little fiddly, but hopefully doable. As part of this, we'll try and reduce the number of redirects that apache uses to the bare minimum, hopefully making this easier during future migrations. This means that we'll keep the changes that we made so far with this attempt and try and continue with other methods.

@stuartmiyasato

@kltm At the risk of intruding where I don't really belong (having not really followed the conversation to this point): while Route 53 does not support CNAMEs on domain apex records, it does offer its own ALIAS record type. Perhaps this does not do what you need, but in case it's useful down the line:

https://docs.aws.amazon.com/Route53/latest/DeveloperGuide/resource-record-sets-choosing-alias-non-alias.html
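For what it's worth, an apex alias in Route 53 is expressed as an A record with an AliasTarget rather than a CNAME. A sketch with placeholder IDs (the example targets a CloudFront distribution, since alias targets have to be AWS endpoints or other records in the same hosted zone):

```sh
# Hypothetical change batch: alias the apex to an AWS endpoint (all IDs are placeholders)
aws route53 change-resource-record-sets --hosted-zone-id Z0000000000000 --change-batch '{
  "Changes": [{
    "Action": "UPSERT",
    "ResourceRecordSet": {
      "Name": "geneontology.org.",
      "Type": "A",
      "AliasTarget": {
        "HostedZoneId": "Z2FDTNDATAQYW2",
        "DNSName": "d111111abcdef8.cloudfront.net.",
        "EvaluateTargetHealth": false
      }
    }
  }]
}'
```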

(Goes back into hiding...)

@kltm
Member

kltm commented Jul 27, 2023

@stuartmiyasato You are always welcome on anything that you run across here! :)

I started looking into various alias things that could be done in Route 53, but they mostly seem to be geared towards other AWS systems. The one thing I was hopeful about ("Another Route 53 record of the same type in the same hosted zone") only allowed pointing at other A records, which kind of defeated what I wanted to do. I started exploring a trick where I would point to an S3 bucket and then set up uniform forwarding from that bucket, but it seemed like it might be too far down the rabbit hole.
I may have missed something in there, but fundamentally I need the apex domain to point to a named subdomain in Cloudflare; I've not seen anything non-exotic yet that would support that...

@kltm
Member

kltm commented Aug 15, 2023

As I don't see an alternative way forward, I'm going to proceed with updating the Ansible playbooks for the GO homepage to support some kind of HTTPS. This will likely be with the wildcard certs from Let's Encrypt that @abdellilah put together, but with a custom one-off grabber and cronjob to keep them up-to-date (which will also allow layering over Cloudflare later on if we choose to do so).
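Roughly the kind of one-off grabber and cron entry meant here, sketched with an assumed DNS plugin and reload hook (wildcard issuance needs a DNS-01 challenge; certbot-dns-route53 and the email address are assumptions/placeholders):

```sh
# One-off: grab a wildcard cert via a DNS-01 challenge answered in Route 53
certbot certonly --dns-route53 \
  -d 'geneontology.org' -d '*.geneontology.org' \
  --non-interactive --agree-tos -m admin@example.org

# Cron entry: keep the cert fresh and reload the proxy only when it actually renews
0 3 * * 1 certbot renew --quiet --deploy-hook "systemctl reload apache2"
```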

@kltm
Member

kltm commented Aug 18, 2023

After some frustration with versions, I have added a wildcard cert getter. Noting that currently this is assumed to be in place, with certs available, before the proxies are set up.

@kltm
Member

kltm commented Aug 18, 2023

This should now be safe on the first run and on all subsequent runs.

@kltm
Member

kltm commented Aug 18, 2023

I now have a testing site available at https://test.geneontology.org. This should be more-or-less identical to what we do when we make the final switch.

@kltm
Member

kltm commented Aug 18, 2023

@suzialeksander @balhoff At our leisure, we can start testing now with https://test.geneontology.org.
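A couple of quick smoke tests that can be run from anywhere (just examples; whether plain HTTP redirects depends on how the proxy ends up configured):

```sh
# Does HTTPS answer and does the cert validate?
curl -sSI https://test.geneontology.org/ | head -n 5

# What does plain HTTP do: serve directly, or redirect to HTTPS?
curl -sSI http://test.geneontology.org/ | grep -iE '^(HTTP|Location)'
```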

@kltm
Member

kltm commented Aug 19, 2023

From #53 (comment), we no longer need to worry about /go-cam, as we are still using the proxy for the foreseeable future.

kltm added a commit to geneontology/geneontology.github.io that referenced this issue Aug 19, 2023
@kltm
Member

kltm commented Aug 19, 2023

@balhoff An attempt to update ontology URLs geneontology/geneontology.github.io#472

@kltm
Member

kltm commented Aug 30, 2023

Noting work in progress here: https://github.com/berkeleybop/bbops/issues/30

@kltm kltm self-assigned this Aug 30, 2023
@kltm
Member

kltm commented Aug 30, 2023

Talking to @suzialeksander, we'll go ahead with transitioning the home site ASAP.

kltm added a commit to geneontology/geneontology.github.io that referenced this issue Aug 30, 2023
kltm added a commit to geneontology/geneontology.github.io that referenced this issue Aug 30, 2023
@kltm
Member

kltm commented Sep 1, 2023

The GO homepage proxy setup has now transitioned over to HTTPS.

kltm added a commit to geneontology/geneontology.github.io that referenced this issue Sep 1, 2023
kltm added a commit to geneontology/geneontology.github.io that referenced this issue Sep 1, 2023
@kltm
Member

kltm commented Sep 1, 2023

Browsing through the GO site, I believe the only remaining items may be Noctua and models.geneontology.org.

These are now dealt with in geneontology/web-gocam#23 (comment) and https://github.com/berkeleybop/bbops/issues/12 . Overall, we're done here.

@kltm kltm closed this as completed Sep 1, 2023
@kltm kltm reopened this Oct 12, 2023
@kltm
Member

kltm commented Oct 12, 2023

@kltm kltm closed this as completed Oct 12, 2023