Upgrading CDK from CDKv1 to CDKv2 in existing project

Yvo van Zee
·Jan 21, 2022·

Table of contents

  • Background
  • Prerequisites
  • Installation
  • Real World Scenario
  • Go Build
  • Try Yourself

Background

On December 2nd, 2021, AWS announced the general availability of CDK version 2. The main reason the CDK team released version 2 was to deal with the so-called dependency hell. CDK and all stable constructs are now combined in one package/module. Experimental modules still need to be installed one by one.

When you work as a cloud consultant for an enterprise in the financial sector, being secure and patching your software is a must, and yes, CDK can be seen as software too. So patching CDK to version 2 is inevitable, especially as CDK version 1 will be retired as of the 1st of June 2023; see the AWS maintenance policy.

So it was time to upgrade our code base as well. As there are multiple guides on the internet, including the official one from AWS, this blog will not go into detail on how a migration to CDK version 2 should take place. It describes the road I took to move from CDK version 1 to version 2.

Prerequisites

As this blog describes the patching of CDK, knowledge of CDK is required. Luckily, AWS created workshops on this topic. Please check them out if you want to try out CDK and CDK Pipelines.

Access to an AWS Account with proper rights to deploy resources is needed.

Installation

Just grab a cup of coffee and keep on reading...

Real World Scenario

At my current assignment, a data platform is being built in AWS. The platform ingests data from several sources, processes the data and makes it available for end users. The services used are DataSync, S3, Glue, EMR, Athena and Managed Workflows for Apache Airflow (MWAA). Besides these services, a lot of other so-called supporting services are used as well.

At the start of the project, CDK version 1 was the only major version available. When working with CDK version 1, every package/module needed by the CDK app has to be installed separately. For this application, which is a Python CDK app, a requirements.txt is used.

The data platform itself is created in a so-called restricted AWS environment, which means that there is no public internet connectivity available. Every connection to the internet is routed over Direct Connect to on-premises firewalls, where the traffic is scanned and inspected.

Because there is no direct internet connectivity to retrieve the build packages configured in the requirements.txt file, an on-premises package manager has to be configured. And due to the restrictive policies in place, every package listed in the requirements.txt, and each of its dependencies, has the potential to be blocked by the package manager. As this data platform is created for an enterprise, releasing such a package takes time: a ticket needs to be raised, the security team has to be consulted, and the package has to be approved.

With CDK version 2 the downloading of packages is limited to a single package, the aws-cdk-lib package. OK, actually you need the constructs package as well. This makes the requirements.txt file simpler and the potential whitelisting of the aws-cdk-lib package easier.

Go Build

Old CDK version 1

When the data platform application was built, it used CDK version 1.136.0. The requirements.txt file looked like this:

aws-cdk.aws-athena==1.136.0
aws-cdk.aws-certificatemanager==1.136.0
aws-cdk.aws-codecommit==1.136.0
aws-cdk.aws-codebuild==1.136.0
aws-cdk.aws-codepipeline==1.136.0
aws-cdk.aws-codepipeline-actions==1.136.0
aws-cdk.custom-resources==1.136.0
aws-cdk.aws-datasync==1.136.0
aws-cdk.aws-dynamodb==1.136.0
aws-cdk.aws-ec2==1.136.0
aws-cdk.aws-elasticloadbalancingv2==1.136.0
aws-cdk.aws-elasticloadbalancingv2-targets==1.136.0
aws-cdk.aws-emr==1.136.0
aws-cdk.aws-glue==1.136.0
aws-cdk.aws-iam==1.136.0
aws-cdk.aws-lambda-event-sources==1.136.0
aws-cdk.aws-lambda-python==1.136.0
aws-cdk.aws-lambda==1.136.0
aws-cdk.aws-logs==1.136.0
aws-cdk.aws-mwaa==1.136.0
aws-cdk.aws-route53==1.136.0
aws-cdk.aws-route53-targets==1.136.0
aws-cdk.aws-s3==1.136.0
aws-cdk.aws-s3-deployment==1.136.0
aws-cdk.aws-s3-notifications==1.136.0
aws-cdk.aws-secretsmanager==1.136.0
aws-cdk.aws-ssm==1.136.0
aws-cdk.pipelines==1.136.0
jsonschema<=3.2.0
boto3
pytest
-e .

As you can see above, each package is installed separately and could potentially be blocked by the package manager.
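To get a feel for how many packages would each need approval, a small script can count the CDK v1 module packages in a requirements file. This is only a sketch: the function name `cdk_v1_packages` and the shortened sample are my own, and it assumes every v1 module package name starts with `aws-cdk.` as in the list above.

```python
import re
from typing import List

# A shortened copy of the v1 requirements.txt shown above (illustration only).
SAMPLE_REQUIREMENTS = """\
aws-cdk.aws-athena==1.136.0
aws-cdk.aws-s3==1.136.0
aws-cdk.pipelines==1.136.0
jsonschema<=3.2.0
boto3
pytest
-e .
"""


def cdk_v1_packages(requirements_text: str) -> List[str]:
    """Return the names of all 'aws-cdk.*' packages in a requirements file."""
    names = []
    for raw_line in requirements_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        # Skip blank lines and pip options such as '-e .'.
        if not line or line.startswith("-"):
            continue
        # The package name is everything before the first version specifier.
        name = re.split(r"[=<>!~\[; ]", line, maxsplit=1)[0]
        if name.startswith("aws-cdk."):
            names.append(name)
    return names


if __name__ == "__main__":
    packages = cdk_v1_packages(SAMPLE_REQUIREMENTS)
    print(f"{len(packages)} CDK v1 packages need to be approved:")
    for package in packages:
        print(f"  {package}")
```

Run against the full file above, every `aws-cdk.*` line shows up as a separate package that the package manager could block.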

Our cdk.json file which holds information on CDK looked like this:

{
  "app": "python app.py",
  "context": {
    "@aws-cdk/aws-apigateway:usagePlanKeyOrderInsensitiveId": true,
    "@aws-cdk/core:enableStackNameDuplicates": "true",
    "aws-cdk:enableDiffNoFail": "true",
    "@aws-cdk/core:stackRelativeExports": "true",
    "@aws-cdk/aws-ecr-assets:dockerIgnoreSupport": true,
    "@aws-cdk/aws-secretsmanager:parseOwnedSecretName": true,
    "@aws-cdk/aws-kms:defaultKeyPolicies": true,
    "@aws-cdk/aws-s3:grantWriteWithoutAcl": true,
    "@aws-cdk/aws-ecs-patterns:removeDefaultDesiredCount": true,
    "@aws-cdk/aws-rds:lowercaseDbIdentifier": true,
    "@aws-cdk/aws-efs:defaultEncryptionAtRest": true,
    "@aws-cdk/aws-lambda:recognizeVersionProps": true,
    "@aws-cdk/core:newStyleStackSynthesis": true
  }
}

Lastly our setup.py file:

import setuptools


with open("README.md") as fp:
    long_description = fp.read()


setuptools.setup(
    name="hashnode",
    version="0.0.1",

    description="Data Platform Application",
    long_description=long_description,
    long_description_content_type="text/markdown",

    author="Yvo van Zee",
    author_email="yvo@yvovanzee.nl",

    package_dir={"": "hashnode"},
    packages=setuptools.find_packages(where="hashnode"),

    install_requires=[
        "aws-cdk.core==1.136.0",
    ],

    python_requires=">=3.6",

    classifiers=[
        "Development Status :: 4 - Beta",

        "Intended Audience :: Developers",

        "Programming Language :: JavaScript",
        "Programming Language :: Python :: 3 :: Only",
        "Programming Language :: Python :: 3.6",
        "Programming Language :: Python :: 3.7",
        "Programming Language :: Python :: 3.8",

        "Topic :: Software Development :: Code Generators",
        "Topic :: Utilities",

        "Typing :: Typed",
    ],
)

Move to CDK version 2

As this is a Python-based CDK project, packages can be installed in a virtual environment. A virtual environment is part of the standard setup when you first initialise a CDK project. For the migration to CDK version 2, an extra virtual environment was created to install the CDK version 2 packages in.

➜  hashnode git:(main) python3 -m venv .cdkv2
➜  hashnode git:(main) source .cdkv2/bin/activate
(.cdkv2) ➜  hashnode git:(main)

To make changes easier, a separate branch was created to track all changes.

(.cdkv2) ➜  hashnode git:(main) git checkout -b feature/cdk_version_2          
Switched to a new branch 'feature/cdk_version_2'
(.cdkv2) ➜  hashnode git:(feature/cdk_version_2)

Now that everything is set, the fun stuff begins. The CDK version 2 packages need to be configured. I've chosen to move these packages to the setup.py, which is installed through the -e . option in the requirements.txt. So the requirements.txt file can be cleaned up. The new version looks like this:

boto3
pytest
-e .

In the setup.py file the aws-cdk-lib and constructs packages are added, and aws-cdk.core==1.136.0 is removed. The version of the aws-cdk-lib package is pinned to 2.2.0. The reason for this is that the lib package was initially put in quarantine by the package manager, because the PyYAML package had a security vulnerability. Once the PyYAML package had been assessed by the security team, it got whitelisted.

import setuptools


with open("README.md") as fp:
    long_description = fp.read()


setuptools.setup(
    name="hashnode",
    version="0.0.1",

    description="Data Platform Application",
    long_description=long_description,
    long_description_content_type="text/markdown",

    author="Yvo van Zee",
    author_email="yvo@yvovanzee.nl",

    package_dir={"": "hashnode"},
    packages=setuptools.find_packages(where="hashnode"),

    install_requires=[
        "aws-cdk-lib==2.2.0",
        "constructs>=10.0.0,<11.0.0",
    ],

    python_requires=">=3.6",

    classifiers=[
        "Development Status :: 4 - Beta",

        "Intended Audience :: Developers",

        "Programming Language :: JavaScript",
        "Programming Language :: Python :: 3 :: Only",
        "Programming Language :: Python :: 3.6",
        "Programming Language :: Python :: 3.7",
        "Programming Language :: Python :: 3.8",

        "Topic :: Software Development :: Code Generators",
        "Topic :: Utilities",

        "Typing :: Typed",
    ],
)

As described in the official migration guide from AWS, the cdk.json needs to be adjusted as well. A lot of the options used in CDK version 1 are now obsolete, which simplifies our cdk.json to:

{
  "app": "python app.py",
  "context": {
    "@aws-cdk/aws-apigateway:usagePlanKeyOrderInsensitiveId": false,
    "@aws-cdk/aws-cloudfront:defaultSecurityPolicyTLSv1.2_2021": false,
    "@aws-cdk/aws-rds:lowercaseDbIdentifier": false,
    "@aws-cdk/core:stackRelativeExports": false
  }
}
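To see at a glance which context flags disappeared between the two files, the two cdk.json versions can be compared with a few lines of Python. This is a sketch only: the helper name `context_keys` is made up for illustration, and both JSON documents are copies of the files in this post, written as strict JSON (no trailing commas) because `json.loads` rejects those.

```python
import json

# The cdk.json used with CDK v1, as shown earlier in this post.
V1_CDK_JSON = """
{
  "app": "python app.py",
  "context": {
    "@aws-cdk/aws-apigateway:usagePlanKeyOrderInsensitiveId": true,
    "@aws-cdk/core:enableStackNameDuplicates": "true",
    "aws-cdk:enableDiffNoFail": "true",
    "@aws-cdk/core:stackRelativeExports": "true",
    "@aws-cdk/aws-ecr-assets:dockerIgnoreSupport": true,
    "@aws-cdk/aws-secretsmanager:parseOwnedSecretName": true,
    "@aws-cdk/aws-kms:defaultKeyPolicies": true,
    "@aws-cdk/aws-s3:grantWriteWithoutAcl": true,
    "@aws-cdk/aws-ecs-patterns:removeDefaultDesiredCount": true,
    "@aws-cdk/aws-rds:lowercaseDbIdentifier": true,
    "@aws-cdk/aws-efs:defaultEncryptionAtRest": true,
    "@aws-cdk/aws-lambda:recognizeVersionProps": true,
    "@aws-cdk/core:newStyleStackSynthesis": true
  }
}
"""

# The simplified cdk.json used with CDK v2.
V2_CDK_JSON = """
{
  "app": "python app.py",
  "context": {
    "@aws-cdk/aws-apigateway:usagePlanKeyOrderInsensitiveId": false,
    "@aws-cdk/aws-cloudfront:defaultSecurityPolicyTLSv1.2_2021": false,
    "@aws-cdk/aws-rds:lowercaseDbIdentifier": false,
    "@aws-cdk/core:stackRelativeExports": false
  }
}
"""


def context_keys(cdk_json_text: str) -> set:
    """Parse a cdk.json document and return the names of its context flags."""
    return set(json.loads(cdk_json_text).get("context", {}))


# Flags that were present for v1 but no longer appear in the v2 file.
removed = sorted(context_keys(V1_CDK_JSON) - context_keys(V2_CDK_JSON))
# Flags that only appear in the v2 file.
added = sorted(context_keys(V2_CDK_JSON) - context_keys(V1_CDK_JSON))

if __name__ == "__main__":
    print("Removed:", *removed, sep="\n  ")
    print("Added:", *added, sep="\n  ")
```

In a real project you would read both versions from git history instead of pasting them in as strings.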

That is it for the configuration part, so it is time to install the requirements.txt in our virtual environment:

(.cdkv2) ➜  hashnode git:(main) pip install -r requirements.txt           
Obtaining file:///Users/yvthepief/Code/hashnode (from -r requirements.txt (line 3))
  Preparing metadata (setup.py) ... done
Collecting boto3
  Downloading boto3-1.20.35-py3-none-any.whl (131 kB)
     |████████████████████████████████| 131 kB 3.8 MB/s            
Collecting pytest
  Using cached pytest-6.2.5-py3-none-any.whl (280 kB)
Collecting botocore<1.24.0,>=1.23.35
  Downloading botocore-1.23.35-py3-none-any.whl (8.5 MB)
     |████████████████████████████████| 8.5 MB 6.9 MB/s            
Collecting jmespath<1.0.0,>=0.7.1
  Using cached jmespath-0.10.0-py2.py3-none-any.whl (24 kB)
Collecting s3transfer<0.6.0,>=0.5.0
  Using cached s3transfer-0.5.0-py3-none-any.whl (79 kB)
Collecting pluggy<2.0,>=0.12
  Using cached pluggy-1.0.0-py2.py3-none-any.whl (13 kB)
Collecting toml
  Using cached toml-0.10.2-py2.py3-none-any.whl (16 kB)
Collecting py>=1.8.2
  Using cached py-1.11.0-py2.py3-none-any.whl (98 kB)
Collecting packaging
  Using cached packaging-21.3-py3-none-any.whl (40 kB)
Collecting attrs>=19.2.0
  Downloading attrs-21.4.0-py2.py3-none-any.whl (60 kB)
     |████████████████████████████████| 60 kB 9.9 MB/s             
Collecting iniconfig
  Using cached iniconfig-1.1.1-py2.py3-none-any.whl (5.0 kB)
Collecting aws-cdk-lib==2.2.0
  Using cached aws_cdk_lib-2.2.0-py3-none-any.whl (57.6 MB)
Collecting constructs<11.0.0,>=10.0.0
  Downloading constructs-10.0.33-py3-none-any.whl (50 kB)
     |████████████████████████████████| 50 kB 10.1 MB/s            
Collecting publication>=0.0.3
  Using cached publication-0.0.3-py2.py3-none-any.whl (7.7 kB)
Collecting jsii<2.0.0,>=1.47.0
  Downloading jsii-1.52.1-py3-none-any.whl (382 kB)
     |████████████████████████████████| 382 kB 11.6 MB/s            
Collecting python-dateutil<3.0.0,>=2.1
  Using cached python_dateutil-2.8.2-py2.py3-none-any.whl (247 kB)
Collecting urllib3<1.27,>=1.25.4
  Downloading urllib3-1.26.8-py2.py3-none-any.whl (138 kB)
     |████████████████████████████████| 138 kB 8.9 MB/s            
Collecting pyparsing!=3.0.5,>=2.0.2
  Using cached pyparsing-3.0.6-py3-none-any.whl (97 kB)
Collecting typing-extensions<5.0,>=3.7
  Using cached typing_extensions-4.0.1-py3-none-any.whl (22 kB)
Collecting cattrs<1.11,>=1.8
  Using cached cattrs-1.10.0-py3-none-any.whl (29 kB)
Collecting six>=1.5
  Using cached six-1.16.0-py2.py3-none-any.whl (11 kB)
Installing collected packages: six, attrs, typing-extensions, python-dateutil, cattrs, urllib3, publication, jsii, jmespath, pyparsing, constructs, botocore, toml, s3transfer, py, pluggy, packaging, iniconfig, aws-cdk-lib, pytest, boto3, aac-cdp
  Running setup.py develop for aac-cdp
Successfully installed aac-cdp-0.0.1 attrs-21.4.0 aws-cdk-lib-2.2.0 boto3-1.20.35 botocore-1.23.35 cattrs-1.10.0 constructs-10.0.33 iniconfig-1.1.1 jmespath-0.10.0 jsii-1.52.1 packaging-21.3 pluggy-1.0.0 publication-0.0.3 py-1.11.0 pyparsing-3.0.6 pytest-6.2.5 python-dateutil-2.8.2 s3transfer-0.5.0 six-1.16.0 toml-0.10.2 typing-extensions-4.0.1 urllib3-1.26.8
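A quick way to double-check that the new virtual environment really contains CDK version 2, and no leftover version 1 core package, is to ask Python's package metadata. This is a minimal sketch, assuming Python 3.8 or newer for `importlib.metadata`; the helper names `installed_version` and `is_cdk_v2_environment` are hypothetical.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def is_cdk_v2_environment() -> bool:
    """True when aws-cdk-lib 2.x is installed and the v1 core package is absent."""
    lib_version = installed_version("aws-cdk-lib")
    core_version = installed_version("aws-cdk.core")
    return (
        lib_version is not None
        and lib_version.startswith("2.")
        and core_version is None
    )


if __name__ == "__main__":
    print("aws-cdk-lib :", installed_version("aws-cdk-lib"))
    print("constructs  :", installed_version("constructs"))
    print("aws-cdk.core:", installed_version("aws-cdk.core"))
    print("CDK v2 only :", is_cdk_v2_environment())
```

Run it with the .cdkv2 environment activated; in the old environment the last line should come out as False.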

Configuring code to use new CDK version 2 libraries

Now that we have set up our environment, it is time to actually change our imports as well. These are still pointing to the aws_cdk.core library. The app.py file, for example, was using the following imports (CDKv1):

#!/usr/bin/env python3
import os

from aws_cdk import core as cdk

from pipeline_resources.cdkpipeline import CdkPipelineStack
from pipeline_resources.repository import RepositoryStack
from utilities.permission_boundary import PermissionBoundaryAspect
from utilities.tagging import add_tags

To make use of CDK version 2, change the imports to:

#!/usr/bin/env python3
import os

from aws_cdk import (
    App,
    Aspects,
    Aws,
    Environment,
)

from pipeline_resources.cdkpipeline import CdkPipelineStack
from pipeline_resources.repository import RepositoryStack
from utilities.permission_boundary import PermissionBoundaryAspect
from utilities.tagging import add_tags

This looks like more lines of code, so why is that? In CDK version 1 the core module was imported as a complete module, which meant you could access everything inside it, such as App, Stack and Construct, through that one name. With CDK version 2 a bit of code cleaning was done as well: these classes are now imported directly from aws_cdk. So where we previously referred to the App class by cdk.App(), it is now done by just App().
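Finding out which names to import can be automated. The sketch below uses Python's `ast` module to collect every `cdk.Something` reference in a v1 source file and prints the matching v2 import statement. The sample source and the function names are made up for illustration, and it assumes the core module was imported under the alias `cdk`. Construct is handled separately because in version 2 it comes from the constructs package.

```python
import ast
from typing import List

# Hypothetical v1-style source, loosely modelled on an app.py.
V1_SOURCE = '''
from aws_cdk import core as cdk

app = cdk.App()
env = cdk.Environment(account=cdk.Aws.ACCOUNT_ID, region=cdk.Aws.REGION)
cdk.Aspects.of(app)
'''


def core_names_used(source: str, alias: str = "cdk") -> List[str]:
    """Return the sorted names accessed directly on the core alias (cdk.X)."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Attribute)
            and isinstance(node.value, ast.Name)
            and node.value.id == alias
        ):
            names.add(node.attr)
    return sorted(names)


def v2_import_statements(names: List[str]) -> str:
    """Build v2 import statements; Construct moves to the constructs package."""
    from_aws_cdk = [name for name in names if name != "Construct"]
    body = "".join(f"    {name},\n" for name in from_aws_cdk)
    statements = [f"from aws_cdk import (\n{body})"]
    if "Construct" in names:
        statements.append("from constructs import Construct")
    return "\n".join(statements)


if __name__ == "__main__":
    print(v2_import_statements(core_names_used(V1_SOURCE)))
```

The script only suggests the import block; replacing `cdk.App()` by `App()` in the code itself is still a manual (or search-and-replace) step.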

For the pipeline stack the updated imports look like the following:

from aws_cdk import (
    Aspects,
    Stack,
    aws_codecommit as codecommit,
    aws_codebuild as codebuild,
    aws_ec2 as ec2,
    aws_iam as iam,
    aws_kms as kms,
    aws_secretsmanager as secretsmanager,
    pipelines,
)
from constructs import (
    Construct,
)

class CdkPipelineStack(Stack):
    def __init__(self, scope: Construct, construct_id: str, named_environments: dict, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)
        # <SNIP>

As you can see above, we basically import two classes (Aspects and Stack) which were previously in the core module. Furthermore, the Construct class is now imported from the constructs package. It is used as the type of the scope parameter in the stack's __init__ method. Basically everything which was cdk.xxx or core.xxx, depending on your type of import, is replaced by a direct import from the aws_cdk module. All other imports, such as aws_ec2 as ec2, remained untouched.

Some handy documentation to check what should end up where is the RFC 0192 of CDK.

Testing the CDK version 2, synthesizing

The first attempt at synthesizing the templates failed. This was because, with all the SecureBucket resources (a demo on how to create a secure bucket construct can be found in this blog), a KMS key was created as well. For this KMS key we had trust_account_identities set to True, but this property is no longer supported for KMS keys in CDKv2.
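In a larger code base it helps to locate such arguments before running a synth. The following sketch parses a source file with `ast` and reports the line numbers of every call that passes a given keyword argument. The sample source is hypothetical (it is not the real SecureBucket construct) and the function name `calls_using_keyword` is my own.

```python
import ast
from typing import List

# Hypothetical v1-style snippet; only parsed, never executed.
SAMPLE_SOURCE = '''
key = kms.Key(
    scope,
    "BucketKey",
    enable_key_rotation=True,
    trust_account_identities=True,
)
'''


def calls_using_keyword(source: str, keyword: str) -> List[int]:
    """Return the line numbers of calls that pass the given keyword argument."""
    line_numbers = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call) and any(
            kw.arg == keyword for kw in node.keywords
        ):
            line_numbers.append(node.lineno)
    return sorted(line_numbers)


if __name__ == "__main__":
    for line_number in calls_using_keyword(SAMPLE_SOURCE, "trust_account_identities"):
        print(f"line {line_number}: call still passes trust_account_identities")
```

Because the check works on the syntax tree, it also finds calls that are spread over several lines, which a plain text search for a single line can miss.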

So on to the second try, and yet another failure. This time it was the VPC selection of the isolated subnets. In the Glue stack, one of the security requirements is that we create a CfnConnection for Glue so that it uses the VPC. We select the first private subnet ID:

subnet_id=vpc.select_subnets(
    subnet_type=ec2.SubnetType.ISOLATED
).subnet_ids[0]

This fails with an AttributeError:

  File "/Users/yvthepief/Code/Hashnode/devtest/repository/application_resources/Ingestion/glue.py", line 262, in __init__
    private_subnets = vpc.vpc.select_subnets(subnet_type=ec2.SubnetType.ISOLATED, one_per_az=True)
  File "/usr/local/Cellar/python@3.9/3.9.10/Frameworks/Python.framework/Versions/3.9/lib/python3.9/enum.py", line 429, in __getattr__
    raise AttributeError(name) from None
AttributeError: ISOLATED
Subprocess exited with error 1

What this actually means is that ISOLATED no longer exists. It has been renamed to PRIVATE_ISOLATED in CDK version 2. The other options are PUBLIC and PRIVATE_WITH_NAT.
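If the old name is used in many stacks, the rename can be applied with a small regular expression. This is a minimal sketch that only covers the ISOLATED rename described above; the function name `rename_isolated` is hypothetical. Because the pattern requires `ISOLATED` to follow `SubnetType.` directly, running it twice does not touch references that are already `PRIVATE_ISOLATED`.

```python
import re

# Matches 'SubnetType.ISOLATED' but not 'SubnetType.PRIVATE_ISOLATED'.
_ISOLATED = re.compile(r"\bSubnetType\.ISOLATED\b")


def rename_isolated(source: str) -> str:
    """Replace the v1 SubnetType.ISOLATED member with its v2 name."""
    return _ISOLATED.sub("SubnetType.PRIVATE_ISOLATED", source)


# The subnet selection from the Glue stack, as shown above.
V1_SNIPPET = """\
subnet_id=vpc.select_subnets(
    subnet_type=ec2.SubnetType.ISOLATED
).subnet_ids[0]
"""

if __name__ == "__main__":
    print(rename_isolated(V1_SNIPPET))
```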

Finally, third time's the charm! CDK synthesizes correctly, which means all templates in the cdk.out folder are now rendered via CDK version 2.

The last thing to do was to check the code in to the newly created branch with a proper commit message, followed by a pull request. This pull request was reviewed by a colleague (the so-called four-eyes principle) and merged into the main branch, after which CDK Pipelines worked its magic.

Try Yourself

As the code is from an enterprise repository, I am not allowed to share it here.

If you want to try it yourself, you can use my cdkpipeline_with_cfn_nag repository on GitHub. This is a CDK version 1 application with CDK Pipelines; a blog on this can be found here. Clone or fork it, and try to upgrade it to CDK version 2 following the steps in this blog.
