TDD kata with serverless services in AWS
Last week, I re-read “Test Driven Development: By Example” by Kent Beck. I was amazed by the simplicity of his process, which consists of small, pragmatic steps. So I decided to put the process to the test in an unfamiliar domain.
The kata
In this kata, I am going to develop a serverless service in AWS using a Lambda and the API Gateway. I chose this task for two reasons. First, it contains a fair amount of infrastructure code, which is considered hard to test. Second, I wanted a task that is more abstract and closer to a business requirement than to a technical requirement like ‘deploy an AWS HTTP API Gateway’.
My goal is to understand whether TDD for infrastructure is possible and what the trade-offs are.
The TDD process
I am going to follow the TDD process as closely as possible to how the book describes it. That means sticking to the Red-Green-Refactor cycle.
- Red: Start by writing a failing test. It may not even compile at that time.
- Green: Make the test pass, committing any sin necessary in the process.
- Refactor: Eliminate any duplication introduced to make the tests green.
One important piece of the process is having a To-Do list. It is going to help me keep track of what is left to do and help me discover what to work on next.
To-Do List
==========
* Publish sales API
What is new for me is an appreciation of TDD as a process to manage my uncertainty and fear, rather than a process to write tests. You can read more about this in my previous blog post, here.
Tech stack
Being cognizant of the uncertainty, I decided to use familiar tools as much as possible.
- Terraform for infrastructure provisioning.
- JavaScript for the Lambda.
- Jest as a test runner.
There are probably better tools than Terraform when it comes to testability. What I want to demonstrate is that the TDD process is independent of the tools you have to work with.
There are also aspects of this task that I am not familiar with:
- Using the API Gateway
- Using TDD in an infrastructure-heavy task
I will have to be careful to tackle them in small increments, so that I don’t get overwhelmed.
Putting Terraform under test harness
‘Publish sales API’ is a very big task to do in one step. So, let’s look for a more achievable intermediate step to start with. If I can have a test which applies a minimal terraform configuration, I will at least know that I can put terraform under test harness, which is a prerequisite to the TDD cycle.
To-Do List
==========
* Publish sales API
* Run terraform in test
Test snippet
The test uses the simplest approach I can think of to drive terraform. It does what I would do in the command line.
const {execSync} = require('child_process');
const https = require('https');
const AWS = require('aws-sdk');
describe('serverless', () => {
  test('run terraform', () => {
    const modulePath = './src';
    // Drive terraform the same way as on the command line.
    execSync('terraform init', {cwd: modulePath});
    const applyResp = execSync('terraform apply -auto-approve', {cwd: modulePath});
    expect(applyResp.toString()).toContain('Apply complete!');
  });
});
Code snippet
provider "aws" {
  region = "eu-central-1"
}
Writing this test was hard work but making it green required only 3 lines of boilerplate code! Still, it is an important milestone. It demonstrates that it is possible to use jest to drive terraform to act and then also assert on the outcome of the operation.
To-Do List
==========
* Publish sales API
☑️ Run terraform in test
Deploying the Lambda
Even with terraform under test harness, deploying the sales API in one step is still too big a task. What I find especially challenging about the task ahead is doing all the infrastructure automation in one go, especially since I am unfamiliar with the API Gateway.
A smaller task that I feel comfortable undertaking is to deploy the AWS Lambda with my application code and make sure I can invoke it using the aws-sdk. Let’s update the To-Do List with our next steps.
To-Do List
==========
* Publish sales API
☑️ Run terraform in test
* Deploy the sales Lambda
Test snippet
I’ve refactored the test code from before and extracted a beforeAll block where the terraform-related code now lives.
The test itself uses the aws-sdk to invoke a Lambda function and asserts that the function returned the expected values. The name of the function comes from a terraform output.
const {execSync} = require('child_process');
const https = require('https');
const AWS = require('aws-sdk');
describe('lambda', () => {
  // terraform apply is slow, so raise the default Jest timeout.
  jest.setTimeout(20000);
  let lambda_name;
  beforeAll(() => {
    const modulePath = './src';
    execSync('terraform init', {cwd: modulePath});
    const applyResp = execSync('terraform apply -auto-approve', {cwd: modulePath});
    expect(applyResp.toString()).toContain('Apply complete!');
    // Read the deployed function name from the terraform outputs.
    const resp = JSON.parse(execSync('terraform output -json', {cwd: modulePath}));
    lambda_name = resp.lambda_name.value;
  });
  test('have a lambda', async () => {
    const lambda = new AWS.Lambda({apiVersion: '2015-03-31', region: 'eu-central-1'});
    const resp = await lambda.invoke({
      FunctionName: lambda_name,
    }).promise();
    expect(resp.StatusCode).toBe(200);
    expect(resp.Payload).toContain('{ sales: [] }');
  });
});
Code snippet
data "archive_file" "example" {
  type        = "zip"
  source_file = "${path.module}/example/index.js"
  output_path = "${path.module}/files/example.zip"
}
resource "aws_lambda_function" "example" {
  function_name                  = "serverless_example"
  # The handler is a string of the form "<file>.<exported function>".
  handler                        = "index.handler"
  role                           = aws_iam_role.lambda_exec.arn
  runtime                        = "nodejs12.x"
  filename                       = data.archive_file.example.output_path
  source_code_hash               = filebase64sha256(data.archive_file.example.output_path)
  reserved_concurrent_executions = 1
  timeout                        = 10
  publish                        = true
}
# Trust policy that lets the Lambda service assume the execution role.
data "aws_iam_policy_document" "lambda_exec" {
  statement {
    actions = ["sts:AssumeRole"]
    principals {
      identifiers = ["lambda.amazonaws.com"]
      type        = "Service"
    }
    effect = "Allow"
  }
}
resource "aws_iam_role" "lambda_exec" {
  name               = "serverless_example_lambda"
  assume_role_policy = data.aws_iam_policy_document.lambda_exec.json
}
# Exposed so the test can look up the function to invoke.
output "lambda_name" {
  value = aws_lambda_function.example.function_name
}
There was a bit more code I had to write to make this test green. Fortunately, I was able to use jest and work through the failures one by one until my Lambda was properly deployed.
I had to make the lambda_name an output of terraform to have it available in the test.
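To illustrate how the test reads that output: `terraform output -json` prints one object per output, with the actual value under a `value` key. The following is only a minimal sketch; `parseTerraformOutputs` is a hypothetical helper name, and the sample JSON is hand-written to resemble the output for the configuration above rather than captured from a real run.

```javascript
// Hypothetical helper: turns the JSON text printed by `terraform output -json`
// into a plain {name: value} object, dropping the type/sensitive metadata.
function parseTerraformOutputs(json) {
  const raw = JSON.parse(json);
  const values = {};
  for (const [name, output] of Object.entries(raw)) {
    values[name] = output.value;
  }
  return values;
}

// Hand-written sample shaped like `terraform output -json` (an assumption,
// not captured from a real apply).
const sample = JSON.stringify({
  lambda_name: {sensitive: false, type: 'string', value: 'serverless_example'},
});

const outputs = parseTerraformOutputs(sample);
console.log(outputs.lambda_name);
```

In the test above, the same effect is achieved inline with `resp.lambda_name.value`; a helper like this only starts to pay off once several outputs are read.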
I don’t provide the JS code of the Lambda. I don’t think there is any educational value in it.
To-Do List
==========
* Publish sales API
☑️ Run terraform in test
☑️ Deploy the sales Lambda
Publishing the sales API
Now, I think I can go back and tackle the original task.
To-Do List
==========
* Publish sales API
☑️ Run terraform in test
☑️ Deploy the sales Lambda
Test snippet
const {execSync} = require('child_process');
const https = require('https');
const AWS = require('aws-sdk');
describe('simple http api', () => {
  jest.setTimeout(20000);
  let tf_output = {};
  beforeAll(() => {
    const modulePath = './src';
    execSync('terraform init', {cwd: modulePath});
    const applyResp = execSync('terraform apply -auto-approve', {cwd: modulePath});
    expect(applyResp.toString()).toContain('Apply complete!');
    tf_output = JSON.parse(execSync('terraform output -json', {cwd: modulePath}));
  });
  // Inspect the created resource through the AWS API.
  test('API', async () => {
    const apigatewayv2 = new AWS.ApiGatewayV2({apiVersion: '2018-11-29', region: 'eu-central-1'});
    const resp = await apigatewayv2.getApi({
      ApiId: tf_output.simple_http_api.value.id
    }).promise();
    expect(resp.ApiEndpoint).toBeTruthy();
    expect(resp.ApiId).toEqual(tf_output.simple_http_api.value.id);
  });
  // Call the published endpoint over HTTPS, like a real client would.
  test('Get response from API', (done) => {
    const req = https.request(
      tf_output.simple_http_api.value.api_endpoint,
      (res) => {
        let data = '';
        res.setEncoding('utf8');
        res.on('data', (chunk) => {
          data += chunk;
        });
        res.on('end', () => {
          expect(res.statusCode).toBe(200);
          expect(data).toContain('{sales: []}');
          done();
        });
      });
    req.on('error', (e) => {
      console.error(e);
      done(e);
    });
    req.end();
  });
});
These two tests demonstrate two different approaches to writing assertions.
- The first one uses the aws-sdk to inspect whether the necessary resource has been created.
- The second one uses a completely outside-in approach without any knowledge of the infrastructure. It makes an HTTP request to the endpoint, demonstrating that our API is published and working.
Both tests depend on output from terraform.
Code snippet
resource "aws_apigatewayv2_api" "example" {
  name          = "simple_http_example"
  protocol_type = "HTTP"
  target        = aws_lambda_function.example.arn
}
# Allow the API Gateway to invoke the Lambda.
resource "aws_lambda_permission" "apigw" {
  action        = "lambda:InvokeFunction"
  function_name = aws_lambda_function.example.function_name
  principal     = "apigateway.amazonaws.com"
  source_arn    = "${aws_apigatewayv2_api.example.execution_arn}/*/*"
}
output "simple_http_api" {
  value = aws_apigatewayv2_api.example
}
With that, our main task is done!
To-Do List
==========
☑️ Publish sales API
☑️ Run terraform in test
☑️ Deploy the sales lambda
Conclusion
All in all, I wrote 3 tests which take around 15 seconds to run, including the terraform apply.
This is about one order of magnitude slower than what I am used to when I write tests for classic applications. Still, it is one of the fastest feedback cycles I’ve experienced doing infrastructure.
I hope I demonstrated that TDD is a viable approach for developing infrastructure code. Of course, dedicated tooling makes it easier to write tests. However, you can use simple tools to start today and reap the benefits of TDD.
Practical concerns
- You will need an AWS account to apply the resources to. Either you need one account per developer, or you tweak the resources' names so that the resources of different developers do not conflict. Alternatively, tools like localstack may help.
- You may want an afterAll hook that destroys the resources in the end.
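As a rough sketch of such a teardown hook, assuming the same ./src module path used in the tests above: `destroyInfrastructure` and its injectable `run` parameter are hypothetical names introduced here for illustration. The injection only exists so the helper can be exercised without Terraform installed.

```javascript
const {execFileSync} = require('child_process');

// Hypothetical helper: runs `terraform destroy` non-interactively in the
// given module directory and returns whatever terraform printed.
function destroyInfrastructure(modulePath, run = execFileSync) {
  const out = run('terraform', ['destroy', '-auto-approve'], {cwd: modulePath});
  return out.toString();
}

// Only registers the hook when running under Jest, where afterAll and
// expect are globals. './src' is the module path assumed from the tests above.
if (typeof afterAll === 'function') {
  afterAll(() => {
    expect(destroyInfrastructure('./src')).toContain('Destroy complete!');
  });
}
```

Destroying after every run keeps the account clean, but it also means every run pays for a full apply, so it trades feedback speed for isolation.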
Lessons learned
Tests force you to write testable code. If I compare the terraform code I wrote for this kata with the code I usually write, I see a few differences:
- I definitely used more terraform outputs than I would otherwise.
- Terraform apply can get quite slow depending on the number and type of resources to provision. To keep the tests running fast, I would be forced to decompose my terraform code into smaller, independently deployable components sooner than otherwise.
Testability should be one of your major considerations when designing and building software systems. Be careful with technologies which do not make it easy for you. When it comes to serverless, make sure to weigh the potential benefits against the difficulty of testing.