Test Driven Development for AWS CDK in Python

A Comprehensive Guide

Background

I often hear people say that we should apply TDD to our code because TDD is so much better. But what is TDD, why is it better, and how can it be applied to, for example, CDK? If you want to know the why, read on.

Basic knowledge of CDK is required, as I will not explain what CDK is. If you need an introduction, follow the workshops provided by AWS.

TDD

TDD stands for Test Driven Development, a software development methodology that emphasizes writing tests before writing the actual code.

The main goal of TDD is to create a clear, reliable, and maintainable codebase. This is achieved by writing tests first so that developers can ensure that their code meets the required specifications and behaves as expected.

The TDD process consists of three main steps, often referred to as "Red, Green and Refactor":

1. Red (Write a failing test): In this step, developers should write a test for a specific functionality or feature before implementing the code. The test should initially fail, as the corresponding code has not yet been written.

2. Green (Write the code to pass the test): Developers then write the minimal code necessary to pass the test. The focus is on making the test pass, not on creating an optimized or complete solution.

3. Refactor (Improve the code): Once the test passes, developers can refactor the code to improve its design, readability, and efficiency, while ensuring that the test still passes. This step is crucial to maintaining a clean and maintainable codebase.

These steps are repeated iteratively for each piece of functionality, resulting in a comprehensive suite of tests that cover the entire application.
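The three steps can be sketched in miniature with plain Python. This is only an illustration of the cycle: the function name `is_encrypted` and the property dictionary are made up for the example and are not part of CDK.

```python
# Step 1 (Red): write the test first. If you run it before the function
# below exists, it fails with a NameError.
def test_queue_is_encrypted():
    assert is_encrypted({"KmsMasterKeyId": "alias/example"}) is True
    assert is_encrypted({}) is False


# Step 2 (Green): the smallest implementation that makes the test pass.
# Step 3 (Refactor): tidy up (here: a type hint and a docstring) and
# re-run the test to confirm that it still passes.
def is_encrypted(queue_properties: dict) -> bool:
    """Return True when the hypothetical queue properties name a KMS key."""
    return "KmsMasterKeyId" in queue_properties


test_queue_is_encrypted()
```

The test is deliberately tiny: it pins down one behaviour, which keeps the failure message meaningful when something breaks later.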

So what are the benefits, you might ask? TDD helps you with several things:

- Improved code quality: Writing tests before the actual code encourages you to think critically about the desired behaviour and design, leading to cleaner and more reliable code.

- Easier debugging: When a test fails, it's much easier to identify and fix the issue, as the test is focused on specific functionality.

- Faster development: By catching errors early in the development process, TDD helps prevent costly and time-consuming debugging sessions later on.

- Better collaboration: A well-tested codebase is easier for other team members to understand and modify, facilitating collaboration and reducing the likelihood of introducing new bugs.

- Enhanced maintainability: A comprehensive test suite serves as a safety net, ensuring that future changes to the codebase do not inadvertently break existing functionality.

So all in all, quite nice. But how in the world can we apply that to CDK and building infrastructure in AWS?

Real-world scenario

As a real-world scenario, I want to create a simple, secure Simple Queue Service (SQS) queue that follows AWS best practices. The guidelines for the SQS queue are:
1. The queue must use KMS encryption.
2. The queue must have a dead letter queue that catches messages which can't be processed.

Later on, we can add a Lambda function that reads from the queue and stores the messages in, for example, S3.

Go Build

Now the fun stuff begins. Let's start with initializing a new CDK project.

➜  Hashnode mkdir secure_sqs
➜  Hashnode cd secure_sqs
➜  secure_sqs cdk init app --language=python                                                                                                   
Applying project template app for python

# Welcome to your CDK Python project!

This is a blank project for CDK development with Python.
<...SNIPPIT...>
✅ All done!
➜  secure_sqs git:(main) source .venv/bin/activate
(.venv) ➜  secure_sqs git:(main)

In the code block above, I created a new, empty CDK project in the directory secure_sqs.

Looking at the CDK directory structure, you see that there is a directory called tests. This is the place where we will add our first test.

Write a failed test (RED)

We will use the assertions library which comes with CDK. More on this can be found in the AWS CDK docs.

Open the file tests/unit/test_secure_sqs_stack.py in your favourite code editor. The file already contains the boilerplate code below. I've uncommented the last block, which checks whether an SQS queue is created:

import aws_cdk as core
import aws_cdk.assertions as assertions

from secure_sqs.secure_sqs_stack import SecureSqsStack

# example tests. To run these tests, uncomment this file along with the example resource in secure_sqs/secure_sqs_stack.py
def test_sqs_queue_created():
    app = core.App()
    stack = SecureSqsStack(app, "secure-sqs")
    template = assertions.Template.from_stack(stack)

    template.has_resource_properties("AWS::SQS::Queue", {
        "VisibilityTimeout": 300
    })

When you run the test, it checks whether the synthesized CloudFormation template contains a resource of type AWS::SQS::Queue with a VisibilityTimeout of 300 seconds. As we have not written any stack code yet, pytest should fail. Let's test this:

(.venv) ➜  secure_sqs git:(main) pytest                                                                                              <aws:abn>
============================================================= test session starts =============================================================
platform darwin -- Python 3.9.16, pytest-7.2.0, pluggy-1.0.0
rootdir: /Users/yvthepief/Code/Hashnode/secure_sqs
plugins: black-0.3.12, typeguard-2.13.3, cov-4.0.0, syrupy-3.0.5
collected 1 item                                                                                                                              

tests/unit/test_secure_sqs_stack.py F                                                                                                   [100%]

================================================================== FAILURES ===================================================================
___________________________________________________________ test_sqs_queue_created ____________________________________________________________
jsii.errors.JavaScriptError: 
  @jsii/kernel.RuntimeError: Error: Template has 0 resources with type AWS::SQS::Queue.
  No matches found

<...SNIPPIT...>

=========================================================== short test summary info ===========================================================
FAILED tests/unit/test_secure_sqs_stack.py::test_sqs_queue_created - RuntimeError: Error: Template has 0 resources with type AWS::SQS::Queue.
============================================================== 1 failed in 3.25s ==============================================================

As you can see, the test fails because the template has 0 resources of type AWS::SQS::Queue. This is correct, as we haven't written a single line of code yet. In the real-world scenario section, we decided that the queue must be encrypted and must have a dead letter queue, so let's create tests for that.
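To make the idea behind has_resource_properties more concrete, here is a simplified stand-in written in plain Python. It is not how CDK implements the assertion (the real matcher also handles nested objects, arrays and Match helpers); the function and the template fragments are assumptions made up for illustration.

```python
def has_resource_properties(template: dict, resource_type: str, expected: dict) -> bool:
    """Return True if any resource of the given type has all expected properties.

    Simplified stand-in for illustration only, not the CDK implementation.
    """
    for resource in template.get("Resources", {}).values():
        if resource.get("Type") != resource_type:
            continue
        actual = resource.get("Properties", {})
        # Partial match: every expected key must be present with an equal
        # value; extra properties on the resource are ignored.
        if all(key in actual and actual[key] == value for key, value in expected.items()):
            return True
    return False


# A stack without resources synthesizes to a template without any queue.
empty_template = {"Resources": {}}

# Hypothetical template fragment with one queue (the logical ID is made up).
queue_template = {
    "Resources": {
        "ExampleQueue": {
            "Type": "AWS::SQS::Queue",
            "Properties": {"VisibilityTimeout": 300, "QueueName": "example"},
        }
    }
}

expected = {"VisibilityTimeout": 300}
print(has_resource_properties(empty_template, "AWS::SQS::Queue", expected))  # False
print(has_resource_properties(queue_template, "AWS::SQS::Queue", expected))  # True
```

Note that the second check passes even though the queue has an extra QueueName property. The test only asserts on the properties it names.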

KMS Encrypted Queue

As the assertions library matches against the CloudFormation template, a handy approach is to look at the official CloudFormation docs for the AWS::SQS::Queue resource. There you can see that adding KMS encryption to an SQS queue requires a KmsMasterKeyId.

So start by creating a test that checks whether the queue has a KmsMasterKeyId. As we don't know the value of the key, we can use the Match class from the assertions library.

from aws_cdk.assertions import Match

def test_sqs_queue_is_encrypted():
    app = core.App()
    stack = SecureSqsStack(app, "secure-sqs")
    template = assertions.Template.from_stack(stack)
    template.has_resource_properties(
        "AWS::SQS::Queue", {"KmsMasterKeyId": Match.any_value()}
    )

The code above checks whether the CloudFormation template contains a resource of type "AWS::SQS::Queue" that is created with a "KmsMasterKeyId".

Let's run pytest, which will fail, of course. By the way, I've added -v for verbose output and --tb=no to disable the traceback:
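The behaviour of Match.any_value() can be pictured as a value that compares equal to anything that is not null. The sketch below mimics that in plain Python. The AnyValue class, the properties_match helper and the property dictionaries are hypothetical and only meant to show the idea, not the library's implementation.

```python
class AnyValue:
    """Compares equal to every value except None (illustrative stand-in)."""

    def __eq__(self, other: object) -> bool:
        return other is not None

    def __repr__(self) -> str:
        return "AnyValue()"


def properties_match(actual: dict, expected: dict) -> bool:
    """True if every expected key exists in actual with a matching value."""
    return all(key in actual and expected[key] == actual[key] for key in expected)


# Assumed shape of the synthesized properties: the key is referenced through
# an intrinsic function, so its final ARN is unknown when the test runs.
encrypted = {"KmsMasterKeyId": {"Fn::GetAtt": ["ExampleKey", "Arn"]}}
unencrypted = {"VisibilityTimeout": 300}

print(properties_match(encrypted, {"KmsMasterKeyId": AnyValue()}))    # True
print(properties_match(unencrypted, {"KmsMasterKeyId": AnyValue()}))  # False
```

This is why the test asserts on the presence of the property rather than on a concrete key ID.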

(.venv) ➜  secure_sqs git:(main) ✗ pytest -v --tb=no                                                                                 <aws:abn>
============================================================= test session starts =============================================================
platform darwin -- Python 3.9.16, pytest-7.2.0, pluggy-1.0.0 -- /opt/homebrew/opt/python@3.9/bin/python3.9
cachedir: .pytest_cache
rootdir: /Users/yvthepief/Code/Hashnode/secure_sqs
plugins: black-0.3.12, typeguard-2.13.3, cov-4.0.0, syrupy-3.0.5
collected 2 items                                                                                                                             

tests/unit/test_secure_sqs_stack.py::test_sqs_queue_created FAILED                                                                      [ 50%]
tests/unit/test_secure_sqs_stack.py::test_sqs_queue_is_encrypted FAILED                                                                 [100%]

----------------------------------------------------------- snapshot report summary -----------------------------------------------------------

=========================================================== short test summary info ===========================================================
FAILED tests/unit/test_secure_sqs_stack.py::test_sqs_queue_created - RuntimeError: Error: Template has 0 resources with type AWS::SQS::Queue.
FAILED tests/unit/test_secure_sqs_stack.py::test_sqs_queue_is_encrypted - RuntimeError: Error: Template has 0 resources with type AWS::SQS::Queue.
============================================================== 2 failed in 3.22s ==============================================================

Now add one more test for the dead letter queue.

Dead letter queue attached to Queue

In the context of SQS, a dead letter queue (DLQ) is a queue that stores messages from the main queue that could not be processed after a certain number of attempts. These are typically messages that failed due to issues such as incorrect message formatting or unhandled exceptions in the processing code, for example a Lambda function. By using a DLQ, you can isolate problematic messages and investigate the root cause of the issue. This can help you identify and fix issues in your application, and prevent the same errors from occurring in the future. It also gives you the option to reprocess the failed messages.

Looking at the CloudFormation documentation of "AWS::SQS::Queue" again, we can see that you need to provide a RedrivePolicy with a deadLetterTargetArn pointing to the queue which acts as the dead letter queue. So create the test:
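The retry-then-park behaviour can be sketched with a toy model in plain Python. This is not how SQS works internally and it uses no AWS API; the drain function, the handler and the message bodies are all made up to illustrate what a maximum receive count does.

```python
from collections import deque


def drain(messages, handler, max_receive_count=3):
    """Process messages; move any that keep failing to a dead letter list."""
    queue = deque((body, 0) for body in messages)  # (body, receive count)
    processed, dead_letters = [], []
    while queue:
        body, receives = queue.popleft()
        receives += 1
        try:
            handler(body)
            processed.append(body)
        except ValueError:
            if receives >= max_receive_count:
                dead_letters.append(body)  # retries exhausted: park the message
            else:
                queue.append((body, receives))  # make it available for a retry
    return processed, dead_letters


def handler(body):
    """Hypothetical consumer that only accepts numeric message bodies."""
    int(body)


processed, dead_letters = drain(["1", "oops", "2"], handler)
print(processed)     # ['1', '2']
print(dead_letters)  # ['oops']
```

The malformed message is attempted three times and then set aside, so it no longer blocks or pollutes the main queue and can be inspected later.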

def test_sqs_queue_has_dead_letter_queue():
    app = core.App()
    stack = SecureSqsStack(app, "secure-sqs")
    template = assertions.Template.from_stack(stack)
    template.has_resource_properties(
        "AWS::SQS::Queue", {"RedrivePolicy": {"deadLetterTargetArn": Match.any_value()}}
    )

Again we use Match.any_value() because we do not know the ARN of the dead letter queue at this moment.
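To see why a literal ARN cannot be asserted here, it helps to look at the shape the RedrivePolicy is likely to have in the synthesized template. The fragment below is an assumption written as a Python dictionary: the logical ID "ExampleDeadLetterQueue" and the maxReceiveCount value are placeholders, not output copied from this project.

```python
import json

# Assumed shape of the synthesized queue properties (placeholder values).
queue_properties = {
    "RedrivePolicy": {
        "deadLetterTargetArn": {"Fn::GetAtt": ["ExampleDeadLetterQueue", "Arn"]},
        "maxReceiveCount": 5,  # example value only
    }
}

target = queue_properties["RedrivePolicy"]["deadLetterTargetArn"]

# The ARN is not a plain string at synth time. It is an intrinsic function
# that CloudFormation resolves during deployment.
print(isinstance(target, str))  # False
print(json.dumps(target))
```

Because the value is a reference rather than a string, matching on "any value" keeps the test independent of generated logical IDs.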

Running pytest should fail 3 tests now:

FAILED tests/unit/test_secure_sqs_stack.py::test_sqs_queue_created - RuntimeError: Error: Template has 0 resources with type AWS::SQS::Queue.
FAILED tests/unit/test_secure_sqs_stack.py::test_sqs_queue_is_encrypted - RuntimeError: Error: Template has 0 resources with type AWS::SQS::Queue.
FAILED tests/unit/test_secure_sqs_stack.py::test_sqs_queue_has_dead_letter_queue - RuntimeError: Error: Template has 0 resources with type AWS::SQS::Queue.

Write the code to pass the test (GREEN)

Creating SQS queue with CDK

As we finished our basic tests, we now need to write the actual code to pass the tests. So let's create an encrypted SQS queue with a dead letter queue attached.

Open the file secure_sqs/secure_sqs_stack.py. This file contains boilerplate as well, but we are going to adjust it.

from aws_cdk import (
    Duration,
    Stack,
    aws_kms,
    aws_sqs,
)
from constructs import Construct


class SecureSqsStack(Stack):
    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)

        # Create key with rotation and alias
        key = aws_kms.Key(
            self,
            "SecureQueueKmsKey",
            alias="/kms/secure_queue_key",
            enable_key_rotation=True,
            description="Key for encrypting SQS queue",
        )
        # Create secure encrypted queue with 
        # visibility timeout of 300 seconds
        queue = aws_sqs.Queue(
            self,
            "SecureQueue",
            queue_name="secure_queue",
            encryption=aws_sqs.QueueEncryption.KMS,
            encryption_master_key=key,
            enforce_ssl=True,
            visibility_timeout=Duration.seconds(300),
        )

Looking at the code above, we create a customer managed KMS key with an alias and key rotation enabled. This key will be used to encrypt the SQS queue. We also set enforce_ssl, which adds a queue policy that only allows secure transport (SSL/TLS). Lastly, we set the visibility timeout to 300 seconds, as specified in the tests created earlier.
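The queue policy for secure transport could be checked along the following lines. This is a plain-Python sketch under assumptions: the policy document below is a hand-written example of a deny statement on aws:SecureTransport, and the ARN and account ID are placeholders, not output taken from this stack.

```python
def denies_insecure_transport(policy_document: dict) -> bool:
    """Return True if a statement denies requests made without TLS."""
    for statement in policy_document.get("Statement", []):
        condition = statement.get("Condition", {}).get("Bool", {})
        if (
            statement.get("Effect") == "Deny"
            and condition.get("aws:SecureTransport") == "false"
        ):
            return True
    return False


# Hand-written example policy (placeholder account ID and queue ARN).
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Deny",
            "Principal": {"AWS": "*"},
            "Action": "sqs:*",
            "Resource": "arn:aws:sqs:eu-west-1:111122223333:secure_queue",
            "Condition": {"Bool": {"aws:SecureTransport": "false"}},
        }
    ],
}

print(denies_insecure_transport(policy))             # True
print(denies_insecure_transport({"Statement": []}))  # False
```

A check like this is a natural candidate for a follow-up test once the three basic tests are green.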

So with the queue created, let's have a look at how the tests are running:

(.venv) ➜  secure_sqs git:(main) ✗ pytest -v --tb=no
================================================================== test session starts ==================================================================
platform darwin -- Python 3.11.3, pytest-6.2.5, py-1.11.0, pluggy-1.0.0 -- /Users/yvthepief/Code/Hashnode/secure_sqs/.venv/bin/python3.11
cachedir: .pytest_cache
rootdir: /Users/yvthepief/Code/Hashnode/secure_sqs
plugins: typeguard-2.13.3
collected 3 items                                                                                                                                       

tests/unit/test_secure_sqs_stack.py::test_sqs_queue_created PASSED                                                                                [ 33%]
tests/unit/test_secure_sqs_stack.py::test_sqs_queue_is_encrypted PASSED                                                                           [ 66%]
tests/unit/test_secure_sqs_stack.py::test_sqs_queue_has_dead_letter_queue FAILED                                                                  [100%]

================================================================ short test summary info ================================================================
FAILED tests/unit/test_secure_sqs_stack.py::test_sqs_queue_has_dead_letter_queue - RuntimeError: Error: Template has 1 resources with type AWS::SQS::Q...
============================================================== 1 failed, 2 passed in 5.01s ==============================================================

2 out of 3 tests pass already, but that is still not the 100% mark we are aiming for. To make test_sqs_queue_has_dead_letter_queue pass as well, we need to add a dead letter queue. Add the dead letter queue between the key and the queue, and reference it from the queue:

from aws_cdk import (
    Duration,
    Stack,
    aws_kms,
    aws_sqs,
)
from constructs import Construct


class SecureSqsStack(Stack):
    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)

        # Create key with rotation and alias
        key = aws_kms.Key(
            self,
            "SecureQueueKmsKey",
            alias="/kms/secure_queue_key",
            enable_key_rotation=True,
            description="Key for encrypting SQS queue",
        )

        # Create secure encrypted dead letter queue with 
        # visibility timeout of 300 seconds
        dead_letter_queue = aws_sqs.Queue(
            self,
            "SecureDeadLetterQueue",
            queue_name="secure_dead_letter_queue",
            encryption=aws_sqs.QueueEncryption.KMS,
            encryption_master_key=key,
            enforce_ssl=True,
            visibility_timeout=Duration.seconds(300),
        )

        # Create secure encrypted queue with 
        # visibility timeout of 300 seconds and refer to the dlq
        queue = aws_sqs.Queue(
            self,
            "SecureQueue",
            queue_name="secure_queue",
            encryption=aws_sqs.QueueEncryption.KMS,
            encryption_master_key=key,
            enforce_ssl=True,
            visibility_timeout=Duration.seconds(300),
            dead_letter_queue=aws_sqs.DeadLetterQueue(
                max_receive_count=5, queue=dead_letter_queue
            ),
        )

Will the tests pass now?

(.venv) ➜  secure_sqs git:(main) ✗ pytest -v --tb=no
================================================================== test session starts ==================================================================
platform darwin -- Python 3.11.3, pytest-6.2.5, py-1.11.0, pluggy-1.0.0 -- /Users/yvthepief/Code/Hashnode/secure_sqs/.venv/bin/python3.11
cachedir: .pytest_cache
rootdir: /Users/yvthepief/Code/Hashnode/secure_sqs
plugins: typeguard-2.13.3
collected 3 items                                                                                                                                       

tests/unit/test_secure_sqs_stack.py::test_sqs_queue_created PASSED                                                                                [ 33%]
tests/unit/test_secure_sqs_stack.py::test_sqs_queue_is_encrypted PASSED                                                                           [ 66%]
tests/unit/test_secure_sqs_stack.py::test_sqs_queue_has_dead_letter_queue PASSED                                                                  [100%]

=================================================================== 3 passed in 4.98s ===================================================================
(.venv) ➜  secure_sqs git:(main) ✗

AWSome, it passes! Now we can build further.

As you can see in the examples, I have only created 3 tests, but looking at the CloudFormation output you can see that 4 AWS resources are created. To be compliant and have all the resources tested, it is wise to add tests for these resources as well.

Extra options would be, for example, creating a test for a Lambda function that processes the messages in the secure queue, or adding tests for the SQS queue policy. But that is all up to you! Now Go Build!
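A quick way to see which resource types still lack a test is to count them in the synthesized template. The sketch below does this in plain Python on a hypothetical, abbreviated template with made-up logical IDs. The real output of cdk synth may contain more resources. Inside a test, the assertions library offers Template.resource_count_is for the same kind of check.

```python
from collections import Counter


def count_resource_types(template: dict) -> Counter:
    """Count how many resources of each type a template contains."""
    return Counter(
        resource["Type"] for resource in template.get("Resources", {}).values()
    )


# Hypothetical, abbreviated template (logical IDs are placeholders).
template = {
    "Resources": {
        "ExampleKey": {"Type": "AWS::KMS::Key"},
        "ExampleKeyAlias": {"Type": "AWS::KMS::Alias"},
        "ExampleDeadLetterQueue": {"Type": "AWS::SQS::Queue"},
        "ExampleQueue": {"Type": "AWS::SQS::Queue"},
    }
}

counts = count_resource_types(template)
for resource_type, count in sorted(counts.items()):
    print(resource_type, count)
```

Any type that shows up here without a matching assertion in the test file is a candidate for the next red test.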

Summary

In this post, I showed how Test Driven Development works with CDK. Start small and go step by step. It is not wise to write tests up front for every resource in your end-goal application, as this will only clutter your tests. The most important thing is that you start with tests and keep them small. Iteration is the most important factor here!

Did you find this article valuable?

Support Yvo van Zee by becoming a sponsor. Any amount is appreciated!