This document describes the current deployment architecture for the Diamond/SGC version of Fragalysis. Part of the work funded by Janssen aims to make our deployment infrastructure more generic and to use only open-source deployment software (such as Travis and Kubernetes).
High-level overview of architecture
Fig. 1 shows the current deployment model for the Fragalysis stack, which is explained in simple(ish) terms below.
Data upload follows a slightly different process. When new data is pushed to a directory on our cluster, we notify Jenkins by sending it a curl request. Jenkins then builds a new loader image containing the new data and pushes it to OpenShift. The Fragalysis scientific code then processes this data, populating the back-end database. When the loader has finished, its container stops. In this way, we can add new data to Fragalysis without having to rebuild the full stack.
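As an illustration of the notification step, the sketch below builds (but does not send) the kind of HTTP request a curl call to Jenkins would make. It assumes the job uses Jenkins' standard "trigger builds remotely" endpoint with an authentication token; the server URL, job name and token shown are hypothetical and are not taken from our deployment.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request


def build_trigger_request(jenkins_url, job_name, token):
    """Build (without sending) a POST request that asks Jenkins to start a job.

    Assumes the job has "Trigger builds remotely" enabled with `token`.
    """
    query = urlencode({"token": token})
    url = f"{jenkins_url.rstrip('/')}/job/{quote(job_name)}/build?{query}"
    return Request(url, method="POST")


# Hypothetical values, for illustration only.
request = build_trigger_request("https://jenkins.example.org", "loader-image", "SECRET")
print(request.get_method(), request.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send the request. A real server may also require user credentials, which this sketch leaves out.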
Geeky deployment info
The main components
Cluster hardware for Diamond deployment
A “bastion” node, currently providing DNS/name resolution
A “master” and separate “infrastructure” node
A hi-space “graph” node, and “application” and “gpu” nodes
An OpenShift Fragalysis TEST project that contains:
The “test” Fragalysis Web application
A “loader” that updates the Web application’s media directory
An hourly/daily/weekly MySQL backup process with an rsync connection to a Diamond server
An OpenShift-endorsed Jenkins CI/CD server.
“Fragalysis (CI/CD)”, which contains: -
A Jenkins CI/CD deployment
Various image streams written-to by the automated build jobs
Fragalysis (Development, Production, and “Ric”) projects, which each contain: -
Fragalysis Stack deployment
A MySQL database
An hourly/daily/weekly MySQL backup process with an rsync connection to a Diamond server (excluding the “Ric” project)
Fragalysis (Graph 1), which contains: -
A small workable/example graph database
Fragalysis (Graph 2), which contains: -
The MolPort graph database
Fragalysis (Staging), a project used for database recovery experiments that contains: -
A MySQL database
The ACME Controller, which contains: -
The OpenShift ACME controller certificate generation project
The TEST/PROD CI/CD workflow
The approach to migrating images between test and production projects will be similar to that documented in the Red Hat article Using Image Tags Through the SDLC.
The basic workflow is described below in two stages, one for TEST and one for PRODUCTION. A simple schematic overview is included at the end of this section.
Actions influencing TEST image roll-outs
A change is committed to one of the Fragalysis Stack code repositories in Git.
This ultimately triggers an automated Fragalysis Web Image build in Jenkins, creating a pair of new Docker images: one tagged “latest” and one with a build-specific tag (using the Jenkins Web Image build number). The first build might result in Web images “latest” and “latest-1”. The images are pushed to the OpenShift built-in registry.
Each new build adds a new image with a build-specific tag (e.g. “latest-2”) while also replacing the current “latest” image.
The TEST project’s Fragalysis Web Image OpenShift DeploymentConfig is set to redeploy automatically as each new “latest” image becomes available (i.e. when it is pushed to the built-in OpenShift registry). So, any change to any repository used by the Web Image results in a new deployment of the Image in the TEST project.
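The tagging scheme described above can be sketched as a small simulation. This is an illustration only, not the Jenkins job's code, and the function names are hypothetical. It shows how “latest” moves to each new build while the build-specific tags accumulate.

```python
def image_tags(build_number):
    """Return the pair of tags applied to the image produced by one build."""
    if build_number < 1:
        raise ValueError("build numbers are expected to start at 1")
    return ["latest", f"latest-{build_number}"]


def registry_after_builds(build_count):
    """Map each tag to the build it points at after `build_count` builds."""
    tags = {}
    for number in range(1, build_count + 1):
        for tag in image_tags(number):
            # "latest" is overwritten each time; "latest-N" is a new entry.
            tags[tag] = number
    return tags


print(registry_after_builds(2))
```

After two builds the registry holds three tags: “latest-1” for the first build, and both “latest-2” and “latest” for the second.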
A change is committed to one of the Fragalysis Loader code repositories in Git.
This ultimately triggers an automated Fragalysis Loader Image build in Jenkins, creating a pair of new Docker images: one tagged “latest” and one with a build-specific tag (using the Jenkins Loader Image build number). The first build might result in Loader images “latest” and “latest-1”. The images are pushed to the OpenShift built-in registry.
After each new Loader Image is created, the Jenkins Job automatically launches the Loader Image as an OpenShift Job in the TEST project, after removing any prior completed Jobs, so that the Web Image media data is updated accordingly.
After each new Loader Image is created, jobs are also automatically launched in the PRODUCTION and RIC projects resulting in new loader deployments in all projects that contain a Fragalysis Stack.
In the unusual situation where a prior Loader Image Job is still running, Jenkins does not launch the new Job and instead marks the build as unstable. An attempt to deploy the Loader will be made when the next Loader Image is built.
As well as responding to code changes, the Loader Image build supports remote triggering via cURL. This will allow new builds to be triggered when new data is deposited in the django_data directory.
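The Loader launch rule described above (remove prior completed Jobs, but do not launch while a prior Job is still running) can be sketched as a small decision function. The state names, action names and Job names are hypothetical. The sketch also assumes that nothing is deleted when the launch is skipped, which the description above does not specify.

```python
def plan_loader_launch(existing_jobs):
    """Decide what to do with a freshly built Loader Image.

    `existing_jobs` maps a Job name to its state, "running" or "completed"
    (hypothetical state names). Returns (action, jobs_to_delete).
    """
    if any(state == "running" for state in existing_jobs.values()):
        # A prior Loader Job is still busy: skip the launch, flag the build.
        return ("mark-unstable", [])
    completed = sorted(
        name for name, state in existing_jobs.items() if state == "completed"
    )
    # Prior completed Jobs are removed before the new Job is launched.
    return ("launch", completed)


print(plan_loader_launch({"loader-1": "completed", "loader-2": "running"}))
print(plan_loader_launch({"loader-1": "completed"}))
```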
Actions influencing PRODUCTION image roll-outs
When an operator is happy with a Web Image or Loader Image deployment in the TEST project, they can run (or remotely execute via cURL) a “Promote Image to Production” job in Jenkins that will manage the deployment of the chosen TEST image to the PRODUCTION project. This parameterised Job will require the user to identify, through parameters in the Jenkins Job definition, the build to promote and the version by which it will be known.
Using specific tags also allows the operator to select an earlier version, so that the application can, in theory, be deployed from any stored version. The practicality of this will depend on whether the application and any external service on which it relies (e.g. a database) can be rolled forwards or backwards.
The “Promote Test Image to Production” job will run code that re-tags the named version (e.g. “latest-2”) in the TEST project as “stable” in the PRODUCTION project.
The PRODUCTION project’s Fragalysis Web Image is configured to redeploy automatically as each new promoted “stable” image becomes available, and promoting the Loader Image will result in a new Loader Image Job being launched. As in TEST, the Loader Image Job will not run if a prior Loader Image Job is still running in PRODUCTION.
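One way the re-tagging step could be expressed is with the OpenShift `oc tag` command, which copies a tag from one image stream to another across projects. The sketch below only builds the command's argument list. Whether the promotion job actually uses `oc tag` is an assumption, and the project and image stream names are hypothetical.

```python
def promote_command(image_stream, version_tag, test_project, production_project):
    """Build an `oc tag` argument list that copies a TEST tag to "stable"."""
    source = f"{test_project}/{image_stream}:{version_tag}"
    target = f"{production_project}/{image_stream}:stable"
    return ["oc", "tag", source, target]


# Hypothetical project and image stream names, for illustration only.
command = promote_command("web-stream", "latest-2", "fragalysis-test", "fragalysis-prod")
print(" ".join(command))
```

Because the target tag is always “stable”, promoting an earlier build uses the same call with a different `version_tag`.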
The developer is responsible for identifying the builds that are to be promoted to production. The build number of each Web Image and Loader Image is present in the corresponding Docker image as a label named build.number.
Registry space will limit the number of TEST builds that can be kept, but we anticipate that builds from at least several prior working days will remain available for promotion.
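To help identify a build for promotion, the build.number label mentioned above can be read from the JSON printed by `docker inspect`. The following is a minimal sketch, assuming the usual `docker inspect` layout in which labels sit under `Config.Labels`; the function name and the sample input are hypothetical.

```python
import json


def build_number_from_inspect(inspect_json):
    """Return the build.number label from `docker inspect` JSON, or None."""
    records = json.loads(inspect_json)
    if not records:
        return None
    # Labels may be absent or null when an image carries no labels.
    labels = records[0].get("Config", {}).get("Labels") or {}
    return labels.get("build.number")


# Hypothetical, heavily trimmed inspect output.
sample = '[{"Config": {"Labels": {"build.number": "2"}}}]'
print(build_number_from_inspect(sample))
```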