Designing a Project Lifecycle with Contracts and Constraints
Note: this blog post is adapted from a lightning talk given at a meeting of Boston Devops on Thursday, March 24th, 2016, the slides from which are available.
Why contracts and constraints?
Put simply, it’s possible to deliver software in a rapid, predictable manner by establishing contracts with developers, and constraining what projects should look like in order to be considered supportable. This blog post describes a design pattern, and an implementation of that pattern using several common DevOps tools, each of which fits into a different stage of a continuously deployable workflow.
From the eyes of operations, a new project is “born” when developers first commit code (or for those practicing test-driven development, when QA first commits tests). From the moment of this first checkin, there is a productivity gap: development and QA aren’t having any automation run against their code. Operationally, this code does not yet exist, as it hasn’t been onboarded into any workflow. The primary goal of this design pattern is to minimize that productivity gap by minimizing the time taken to build automation around a new project.
There are four goals which drive the pattern:
- Defaults should be sane. Developers should not have to worry that a default deployment will give them insufficient resources to run their application. On the other end of the spectrum, developers shouldn’t come to operations with a request that their project be run on extra-large cloud instances “just in case” performance turns out to be subpar.
- Componentize everything. Any logic implemented for one particular project, in a way which cannot be easily reused by other projects, is probably being implemented incorrectly.
- Deployments should look and feel consistent. This makes code reuse relatively straightforward. Additionally, it helps to make infrastructure supportable by reducing the amount of knowledge sharing and cross-training required to effectively participate in a project lifecycle.
- Ultimately, development should be able to own their lifecycle. This simply isn’t possible if deployment and maintenance activities require developers to moonlight as UNIX sysadmins and release engineers.
Building code
The following line should look familiar to many - given a Maven project, it will first remove any local project state from previous builds (the clean target), then compile code, run unit tests, and upload the output of the build somewhere that it can later be retrieved (the deploy target):
[root@app01 ~]$ mvn clean deploy
Other build systems don’t have builtins like clean and deploy, but cleaning and deploying projects are fairly common build lifecycle activities. Running these same goals on a Grunt or Rake project should be no more difficult than one of:
[root@app01 ~]$ grunt clean deploy
[root@app01 ~]$ rake clean deploy
Contracts are established by defining what a project’s build system should be able to do. In this case, the contract for building code is:
- The clean target should remove any state left over from earlier builds: all code should be rebuilt, and nothing from a previous build cycle should be included in a subsequent build. This prevents, for example, a deployable bundle which doubles in size every build due to inclusion of the previous deployable bundle within the bundle.
- The deploy target should compile code, run unit tests, create a bundle of ready-to-execute code with its dependencies, and upload that bundle to a server such as Archiva or Artifactory.
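With this contract in place, a CI step no longer needs per-project build logic; it only needs to map a build file to a tool and invoke the same two targets. The following is a minimal sketch of that idea in plain Ruby - the file-to-tool mapping is an assumption for illustration, not part of any standard:

```ruby
# Sketch: resolve which build tool fulfills the clean+deploy contract
# for a project, based on which build file is present in its root.
BUILD_FILES = {
  'pom.xml'      => 'mvn',   # Maven
  'Gruntfile.js' => 'grunt', # Grunt
  'Rakefile'     => 'rake'   # Rake
}.freeze

# Returns the command satisfying the contract, or nil if the project
# uses a build system this automation doesn't know about.
def build_command(dir)
  file = BUILD_FILES.keys.find { |f| File.exist?(File.join(dir, f)) }
  file && "#{BUILD_FILES[file]} clean deploy"
end
```

Because every project honors the same targets, onboarding a new project into this automation requires no new code at all.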
Running Integration Tests
A slightly more complicated exercise is running integration tests. In the same three build systems, this may look like:
[root@app01 ~]$ mvn integration-test -Dtest-endpoint=(some url)
[root@app01 ~]$ grunt integration-test --test-endpoint=(some url)
[root@app01 ~]$ TEST_ENDPOINT=(some url) rake integration-test
Things are slightly less consistent here due to each build system’s eccentricities. In order to integration-test a project, the build tool needs to know what to test - here, the test-endpoint. This is passed as a define in Maven, a parameter in Grunt, and an environment variable in Rake (as Rake doesn’t support passing named parameters directly). Ultimately, however, the contract can be described as: “By calling integration-test with the named parameter test-endpoint, the project should run automated integration tests against the test endpoint”.
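A thin shim can hide those eccentricities behind a single interface, so the rest of the workflow never needs to know which build system is underneath. This is only a sketch; the tool names and flags mirror the examples above:

```ruby
# Sketch: produce the integration-test command for a given build tool,
# smoothing over each tool's parameter-passing style.
def integration_test_command(tool, endpoint)
  case tool
  when :maven then "mvn integration-test -Dtest-endpoint=#{endpoint}"
  when :grunt then "grunt integration-test --test-endpoint=#{endpoint}"
  when :rake  then "TEST_ENDPOINT=#{endpoint} rake integration-test"
  else raise ArgumentError, "no integration-test contract for #{tool}"
  end
end
```

Failing loudly on an unknown tool is deliberate: a project that doesn’t honor the contract shouldn’t silently pass through the pipeline.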
Integrating with Chef
Disturbingly few publicly-available Chef cookbooks support deploying an application into a container - here, a common solution is to wrap a community cookbook with a deploy resource. The following, for instance, installs Tomcat and deploys version 1.0.0 of org.my:myapp into it:
include_recipe 'tomcat'

tomcat_deploy 'my application' do
  artifact_id 'myapp'
  group_id 'org.my'
  version '1.0.0'
end
Maybe somewhere in its lifecycle, this project becomes more enterprise-y, and requires deployment into a Wildfly container:
include_recipe 'wildfly'

wildfly_deploy 'my application' do
  artifact_id 'myapp'
  group_id 'org.my'
  version '1.0.0'
end
Not every deployment is this simple, so it will be necessary to provide some more LWRPs to, for instance, make the org.apache logger output at the DEBUG log level, provide a managed PostgreSQL datasource, and ship Wildfly’s application logs to Splunk:
include_recipe 'wildfly'

wildfly_deploy 'my application' do
  artifact_id 'myapp'
  group_id 'org.my'
  version '1.0.0'
end

wildfly_logger 'org.apache' do
  level 'DEBUG'
end

wildfly_datasource 'myDS' do
  username 'user'
  password 'user'
  url 'jdbc:postgresql://some-server/some-database'
  type 'XA'
end

wildfly_splunk 'logs' do
  index 'myapp'
  sourcetype 'wildfly_production'
end
For completeness’ sake, deploying an Angular application into Apache may look like this:
include_recipe 'apache'

apache_deploy 'my UI' do
  artifact_id 'myui'
  group_id 'org.my'
  version '1.0.0'
end
Deploying a Node.js application with entry point at index.js using Forever may look like this:
include_recipe 'nodejs'

nodejs_deploy 'my node app' do
  artifact_id 'mynodeapp'
  group_id 'org.my'
  version '1.0.0'
end
Or, maybe the application needs to be started via start_server.js instead:
include_recipe 'nodejs'

nodejs_deploy 'my node app' do
  app_start 'start_server.js'
  artifact_id 'mynodeapp'
  group_id 'org.my'
  version '1.0.0'
end
Ultimately, simple deployments end up being automated in six lines of code. Parameters are key, as with the app_start parameter in the Node example: maybe an Apache deployment should be located at /ui rather than /, or maybe an EAR is being deployed into Wildfly instead of a WAR. Moving this logic to the wrapper cookbooks ensures that any automation supporting a project can be quickly reused by other projects.
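What these wrapper resources have in common can be sketched in plain Ruby: a small set of sane defaults, merged with whatever the caller overrides, with unknown parameters rejected loudly. The attribute names below (app_start, context_path) are illustrative, not any cookbook’s real API:

```ruby
# Sketch: the sane-defaults-with-overrides pattern behind a wrapper
# deploy resource. Callers specify only what differs from the default.
class DeployResource
  DEFAULTS = {
    version:      'latest',
    app_start:    'index.js',
    context_path: '/'
  }.freeze

  attr_reader :params

  def initialize(overrides = {})
    unknown = overrides.keys - DEFAULTS.keys
    raise ArgumentError, "unknown parameters: #{unknown}" unless unknown.empty?
    @params = DEFAULTS.merge(overrides)
  end
end
```

The six-line deployments above are exactly this pattern: the wrapper carries the defaults, and each project states only its deviations.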
Integrating with Packer
Packer templates are ugly: they’re repetitive, and they’re hard to keep in sync. The Chef run list in a Packer template is one line, but the rest of the template may amount to upwards of fifty lines. Luckily, Racker makes it easy to abstract out defaults, minimizing boilerplate. The following three lines of code, for instance, could be a macro for “converge my_cookbook on a t2.micro running CentOS 6 with an 8GB boot volume”:
Racker::Processor.register_template do |t|
  t.provisioners[100]['chef-client']['run_list'] = ['my_cookbook']
end
Here’s the same Racker template, but running against a t2.small instead of a t2.micro:
Racker::Processor.register_template do |t|
  t.builders['amazon-ebs']['instance_type'] = 't2.small'
  t.provisioners[100]['chef-client']['run_list'] = ['my_cookbook']
end
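The mechanics behind this are worth a quick sketch in plain Ruby: hold the boilerplate in one shared defaults hash, and let each template overlay only the fields that differ. The field names below mirror Packer’s amazon-ebs builder, and the AMI id is a placeholder:

```ruby
# Sketch: org-wide Packer defaults, overlaid per template. This models
# the idea behind Racker, not its actual API.
DEFAULTS = {
  'builders' => [{
    'type'          => 'amazon-ebs',
    'instance_type' => 't2.micro',
    'source_ami'    => 'ami-00000000', # placeholder CentOS 6 AMI id
    'launch_block_device_mappings' => [{ 'volume_size' => 8 }]
  }]
}

# Deep-copy the defaults, then overlay the first builder's fields,
# so templates never mutate the shared DEFAULTS hash.
def template_with(overrides = {})
  t = Marshal.load(Marshal.dump(DEFAULTS))
  t['builders'][0].merge!(overrides)
  t
end
```

With fifty lines of boilerplate captured once, the per-project template shrinks to the overlay alone - the three-line macro above.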
Maybe a particular deployment needs to do something completely questionable: it needs to reach out to a RESTful security service, obtain a deployment token, and then set the Base64 of that token as the instance’s userdata. While ops and dev should likely have a conversation about this behavior, Racker is just Ruby code, and this is fully possible:
require 'base64'
require 'json'
require 'net/http'

userdata = Base64.encode64(
  JSON.parse(
    Net::HTTP.get('security.service.local', '/api/v1/token')
  )['token'] # JSON.parse returns a Hash, so the token is a key, not a method
)

Racker::Processor.register_template do |t|
  t.builders['amazon-ebs']['user_data'] = userdata
  t.provisioners[100]['chef-client']['run_list'] = ['my_cookbook']
end
Integrating with CloudFormation
As with Packer, CloudFormation templates are ugly. Often, they’re even more repetitive than Packer templates: a stack may consist of multiple micro-services, each requiring its own set of redundant, load-balanced instances. The total number of resources in such a stack works out to (number of services) * (number of service components), quickly becoming unwieldy.
It’s possible, however, to create CloudFormation stacks from CloudFormation stacks. Suddenly, a micro-service deployment with n*m resources becomes a deployment with just n resources, by abstracting out into a separate template what a micro-service looks like:
{
  "Type" : "AWS::CloudFormation::Stack",
  "Properties" : {
    "Parameters": {
      "AMI": { "Ref": "AMI" },
      "Branch": { "Ref": "Branch" },
      "FriendlyName": "My Cloud Application",
      "ProjectBaseURL": "my.org",
      "ServiceName": "myapp"
    },
    "TemplateURL": "https://s3.amazonaws.com/b/universal-container-1.json"
  }
}
By passing in an AMI from a workflow and a branch name, this universal container template could take the place of four resources:
- Create a launch configuration for t2.small instances of the given AMI.
- Create an auto-scale group based off that launch configuration, set to maintain two instances.
- Point a load balancer at those instances on port 80, with a health check at /ping.
- Create a DNS name at (ServiceName).(Branch).(ProjectBaseURL), pointed at the load balancer.
Parameters are again key. Perhaps a service doesn’t have its health check endpoint at /ping:
{
  "Type" : "AWS::CloudFormation::Stack",
  "Properties" : {
    "Parameters": {
      "AMI": { "Ref": "AMI" },
      "Branch": { "Ref": "Branch" },
      "FriendlyName": "My Cloud Application",
      "HealthCheckEndpoint": "/api/v1/ping",
      "ProjectBaseURL": "my.org",
      "ServiceName": "myapp"
    },
    "TemplateURL": "https://s3.amazonaws.com/b/universal-container-1.json"
  }
}
Perhaps it listens on port 8080, rather than port 80:
{
  "Type" : "AWS::CloudFormation::Stack",
  "Properties" : {
    "Parameters": {
      "AMI": { "Ref": "AMI" },
      "Branch": { "Ref": "Branch" },
      "FriendlyName": "My Cloud Application",
      "InstancePort": "8080",
      "HealthCheckEndpoint": "/api/v1/ping",
      "ProjectBaseURL": "my.org",
      "ServiceName": "myapp"
    },
    "TemplateURL": "https://s3.amazonaws.com/b/universal-container-1.json"
  }
}
Perhaps performance does turn out to be subpar, and the service needs to be run on four c4.xlarge instances:
{
  "Type" : "AWS::CloudFormation::Stack",
  "Properties" : {
    "Parameters": {
      "AMI": { "Ref": "AMI" },
      "Branch": { "Ref": "Branch" },
      "FriendlyName": "My Cloud Application",
      "InstancePort": "8080",
      "Instances": "4",
      "InstanceType": "c4.xlarge",
      "HealthCheckEndpoint": "/api/v1/ping",
      "ProjectBaseURL": "my.org",
      "ServiceName": "myapp"
    },
    "TemplateURL": "https://s3.amazonaws.com/b/universal-container-1.json"
  }
}
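Under a scheme like this, generating the per-service resources becomes mechanical. A hypothetical Ruby helper (names and TemplateURL taken from the snippets above) might emit one nested-stack resource per micro-service, merging in whatever overrides a given service needs:

```ruby
require 'json'

# Sketch: emit the nested-stack resource for one micro-service. Only
# the friendly name, service name, and any overrides vary per service.
def service_stack(friendly_name, service_name, extra_params = {})
  {
    'Type' => 'AWS::CloudFormation::Stack',
    'Properties' => {
      'Parameters' => {
        'AMI'            => { 'Ref' => 'AMI' },
        'Branch'         => { 'Ref' => 'Branch' },
        'FriendlyName'   => friendly_name,
        'ProjectBaseURL' => 'my.org',
        'ServiceName'    => service_name
      }.merge(extra_params),
      'TemplateURL' => 'https://s3.amazonaws.com/b/universal-container-1.json'
    }
  }
end
```

All three variants shown above collapse into one call each; for instance, the c4.xlarge variant is just `service_stack('My Cloud Application', 'myapp', 'InstanceType' => 'c4.xlarge', 'Instances' => '4', 'InstancePort' => '8080', 'HealthCheckEndpoint' => '/api/v1/ping')`.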
Depending on the depth of AWS usage, the universal container may need to take on progressively more parameters to, say, enable fronting SSL on an ELB, or to skip building an ELB entirely for a set of instances which produce and consume over SQS rather than communicating RESTfully.
Conclusion
A consistent, simple, sane-but-overrideable infrastructure pays strong returns by minimizing the time operations needs to onboard a new project. Additionally, it serves as an important stepping stone toward developer self-service. While this post only describes several specific implementations, the general idea should be clear: the default should be short, simple, and sane, yet as extensible as possible to allow reuse with more complex projects.