Building a sophisticated CodePipeline on AWS with AWS CDK

Leverage AWS CDK to build a DevOps Suite entirely on AWS

You got hired as a DevOps engineer and have experience building CI/CD pipelines with GitLab CI and GitHub Actions. But the new company requires you to run CI/CD entirely on AWS and to define the infrastructure as code.

CodePipeline, CodeBuild, CodeCommit! These are the DevOps services of AWS.

Your first reaction is, why? This looks so ugly! Where should I start? But the code is already on CodeCommit. It’s a monorepo.

Your research finds an interesting article on creating sophisticated pipelines entirely on AWS with AWS CDK. That blog post demonstrates each pipeline for different scenarios and how to organize your codebase. Let's have a look together.

Simple Pipeline

The simple pipeline includes a CodePipeline for orchestrating your CI/CD workflow.

A CodePipeline consists of stages, and each stage contains actions. An action is a unit of work, such as fetching the source or running a build. An action can produce an output artifact that later stages consume as input. Every stage after the source stage typically takes such an artifact as input, and a stage can define multiple actions that run in parallel or sequentially.
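
To make the parallel-versus-sequential point concrete, here is a minimal sketch in plain TypeScript with no AWS dependency. CodePipeline orders the actions inside a stage by their runOrder value: actions that share a value run in parallel, and lower values run first. The ActionSpec type and the groupByRunOrder helper are hypothetical names used only for illustration.

```typescript
// Hypothetical, simplified model of the actions inside one stage.
interface ActionSpec {
  actionName: string;
  runOrder?: number; // a missing runOrder counts as 1
}

// Group actions into "waves": the actions of one wave run in parallel,
// and the waves themselves run one after another.
function groupByRunOrder(actions: ActionSpec[]): string[][] {
  const byOrder = new Map<number, string[]>();
  for (const action of actions) {
    const order = action.runOrder ?? 1;
    const wave = byOrder.get(order) ?? [];
    wave.push(action.actionName);
    byOrder.set(order, wave);
  }
  return Array.from(byOrder.entries())
    .sort(([a], [b]) => a - b)
    .map(([, names]) => names);
}

const stageWaves = groupByRunOrder([
  { actionName: "Lint", runOrder: 1 },
  { actionName: "UnitTest", runOrder: 1 },
  { actionName: "Deploy", runOrder: 2 },
]);
console.log(JSON.stringify(stageWaves)); // [["Lint","UnitTest"],["Deploy"]]
```

Here Lint and UnitTest would run side by side, and Deploy would only start once both have finished.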

Simple Pipeline Diagram

Below is the AWS CDK code for the simple pipeline. For brevity, the import statements are omitted.

// Create the pipeline
const pipeline = new Pipeline(stack, "Pipeline", {
  pipelineName: "aws-cdk-pipeline-demo",
});
// Create the artifact to store outputs of the pipeline
const artifact = new Artifact();
// 1.1. Best Practice to Import Repository
const repository = Repository.fromRepositoryName(
  stack,
  "Repository",
  "aws-cdk-pipeline-demo",
);
// 1.2. Add the source action to the pipeline
pipeline.addStage({
  stageName: "Source",
  actions: [
    new CodeCommitSourceAction({
      actionName: "Source",
      output: artifact,
      repository,
      branch: "main",
    }),
  ],
});
// 2. Stage : CodeBuild
pipeline.addStage({
  stageName: "Build",
  // A stage can hold many actions,
  // but a single CodeBuild action can run all of our commands, so one is enough here
  actions: [
    new CodeBuildAction({
      actionName: "Build",
      input: artifact,
      project: new PipelineProject(stack, "BuildProject", {
        projectName: "aws-cdk-build-project",
        // You can also create your own buildspec.yml file and reference it here instead of using the inline buildspec
        // https://docs.aws.amazon.com/codebuild/latest/userguide/build-spec-ref.html#build-spec-ref-syntax
        buildSpec: BuildSpec.fromObject({
          version: "0.2",
          phases: {
            install: {
              commands: ["npm install"],
            },
            build: {
              commands: [
                "npm run lint",
                "npm run test:unit",
                "npm run deploy",
                "npm run test:integration",
              ],
            },
          },
          artifacts: {
            "base-directory": ".",
            files: ["**/*"],
          },
        }),
      }),
    }),
  ],
});

However, the Simple Pipeline does not meet your requirements, because you also want to deploy to different AWS accounts.

Cross-Account Pipeline

You continue reading, and indeed, there is a section about deploying to different AWS accounts. You learn that CodePipeline was designed with that in mind. Below, you see the diagram for deploying across the DEV, STAGING, and PROD accounts.

Cross Account Pipeline Diagram

As depicted in the diagram, the Build stage is now shorter. You learn that the following stages can reuse the artifact it produces. Yes, that makes sense.

You go further and realize that CDK shines as Infrastructure as Code because it is written in TypeScript, which lets you apply object-oriented programming paradigms. It is time to get your hands dirty, and you put the “Simple Pipeline” idea into an abstract class.

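/**
 * Props for PipelineStack: the standard StackProps plus the branch to build.
 */
export interface PipelineStackProps extends StackProps {
  branch?: string; // falls back to "main" when omitted
}

/**
 * Abstract class for building multiple pipelines.
 * It is abstract so that every concrete pipeline shares the same base.
 */
export abstract class PipelineStack extends Stack {
  public readonly pipeline: Pipeline;
  public readonly artifact: Artifact;

  constructor(scope: Construct, id: string, props: PipelineStackProps) {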
    super(scope, id, props);

    // 0. Stage: CodePipeline (Pre-Requisite)
    // 0.1 Create the pipeline
    this.pipeline = new Pipeline(this, "Pipeline", {
      pipelineName: id,
    });

    // 0.2 Create the artifact to store outputs of the pipeline
    this.artifact = new Artifact("SourceArtifact");

    // 1. Stage : CodeCommit
    // We are importing the repository because it's a stateful resource
    // and should be deployed in a stateful stack
    const repository = Repository.fromRepositoryName(
      this,
      "Repository",
      "aws-cdk-pipeline-demo",
    );

    // 1.2. Add the source action to the pipeline
    this.pipeline.addStage({
      stageName: "Source",
      actions: [
        new CodeCommitSourceAction({
          actionName: "Source",
          output: this.artifact,
          repository,
          branch: props.branch ?? "main",
        }),
      ],
    });
    // 2. Stage: CodeBuild
    // 2.1. Build Stage
    this.pipeline.addStage({
      stageName: "Build",
      // A stage can hold many actions,
      // but a single CodeBuild action can run all of our commands, so one is enough here
      actions: [
        new CodeBuildAction({
          actionName: "Build",
          input: this.artifact,
          project: new PipelineProject(this, "BuildProject", {
            projectName: "aws-cdk-build-project-demo",
            // You can also create your own buildspec.yml file and reference it here instead of using the inline buildspec
            // https://docs.aws.amazon.com/codebuild/latest/userguide/build-spec-ref.html#build-spec-ref-syntax
            buildSpec: BuildSpec.fromObject({
              version: "0.2",
              phases: {
                install: {
                  commands: ["npm install"],
                },
                build: {
                  commands: [
                    "npm run lint",
                    "npm run test:unit",
                    "npm run build",
                  ],
                },
              },
              artifacts: {
                "base-directory": "dist",
                files: ["**/*"],
              },
            }),
          }),
        }),
      ],
    });

    // 2.2. Deploy Stage
    // This part is handled in the concrete class that inherits from this one.
    // To keep the subclasses simple, this class provides the helper methods
    // createPipelineProject() and addStageToPipeline() below.
  }

  /**
   * The function creates a pipeline project with specified commands and build specifications.
   * @param {Construct} scope - The scope parameter is the parent construct that the pipeline project
   * will be created under. It defines the scope or context in which the project will exist.
   * @param {string} name - The name of the pipeline project. It is used as the project name and also as
   * the name of the construct in the AWS CDK.
   * @param {Command} commands - The commands to execute in the install, pre_build, build, and
   * post_build phases of the build. See the `Command` interface for its structure.
   * @returns a new instance of the `PipelineProject` class.
   */
  public createPipelineProject(
    scope: Construct,
    name: string,
    commands: Command,
  ): PipelineProject {
    return new PipelineProject(scope, name, {
      projectName: name,
      environment: {
        buildImage: LinuxBuildImage.STANDARD_7_0,
        computeType: ComputeType.LARGE,
      },
      buildSpec: BuildSpec.fromObject({
        version: "0.2",
        phases: {
          pre_build: {
            commands: commands.preBuild,
          },
          install: {
            commands: commands.install,
          },
          build: {
            commands: commands.build,
          },
          post_build: {
            commands: commands.postBuild,
          },
        },
        artifacts: {
          "base-directory": ".",
          files: ["**/*"],
        },
      }),
    });
  }

  /**
   * The function adds a stage to a pipeline with a given stage name and a list of actions.
   * @param {string} stageName - A string representing the name of the stage to be added to the
   * pipeline.
   * @param {Action[]} actions - The `actions` parameter is an array of `Action` objects. Each `Action`
   * object represents a specific action or task that needs to be performed in the pipeline stage.
   */
  public addStageToPipeline(stageName: string, actions: Action[]): void {
    this.pipeline.addStage({
      stageName,
      actions,
    });
  }
}

The beauty of this code is that the abstract class offers simplified helper methods to the concrete class. You create a method called createPipelineProject, which takes a third parameter, commands, of type Command:

interface Command {
  preBuild?: string[];
  install?: string[];
  build?: string[];
  postBuild?: string[];
}

The concrete class uses this method whenever it creates a CodeBuild action, like this:

this.createPipelineProject(this, `Deploy-${account.stage}`, {
  install: ["./scripts/assume-role.sh"],
  build: [`npx sst deploy --stage ${account.stage}`],
  postBuild: ["npm run test:integration"],
});
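
One detail is worth a closer look: every field of Command is optional, so a call like the one above leaves preBuild undefined, and createPipelineProject still writes a pre_build phase whose commands value is undefined. A minimal sketch of a helper that only emits the phases that actually have commands could look like this. The toPhases name and the Phases type are hypothetical and not part of the original code; the phase keys are the ones the buildspec format defines.

```typescript
interface Command {
  preBuild?: string[];
  install?: string[];
  build?: string[];
  postBuild?: string[];
}

type Phases = Record<string, { commands: string[] }>;

// Hypothetical helper: keep only the phases that have at least one command.
function toPhases(commands: Command): Phases {
  const candidates: [string, string[] | undefined][] = [
    ["install", commands.install],
    ["pre_build", commands.preBuild],
    ["build", commands.build],
    ["post_build", commands.postBuild],
  ];
  const phases: Phases = {};
  for (const [phase, list] of candidates) {
    if (list && list.length > 0) {
      phases[phase] = { commands: list };
    }
  }
  return phases;
}

const deployPhases = toPhases({
  install: ["./scripts/assume-role.sh"],
  build: ["npx sst deploy --stage dev"],
  postBuild: ["npm run test:integration"],
});
console.log(Object.keys(deployPhases).join(", ")); // install, build, post_build
```

The result could then be passed as the phases value of BuildSpec.fromObject inside createPipelineProject.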

Wow, creating a project is much simpler now that the details are abstracted away in the parent, you say. Now, you implement CrossAccountPipelineStack:

interface Account {
  stage: string;
  number: string;
  region: string; // or to narrow it:  "eu-central-1" | "us-east-1"
}

const accounts: Account[] = [
  {
    number: "111111111111",
    region: "eu-central-1",
    stage: "dev",
  },
  {
    number: "222222222222",
    region: "eu-central-1",
    stage: "staging",
  },
  {
    number: "333333333333",
    region: "eu-central-1",
    stage: "prod",
  },
];

export class CrossAccountPipelineStack extends PipelineStack {
  constructor(scope: Construct, id: string, props: StackProps) {
    super(scope, id, props);

    for (const account of accounts) {
      // Create CodeBuild Project
      const deploy = this.createPipelineProject(
        this,
        `DeployTo${account.stage}`,
        {
          install: ["./scripts/assume-role.sh"],
          build: [
            `npx sst deploy --stage ${account.stage}`,
          ],
          postBuild: ["npm run test:integration"],
        },
      );

      // Create Action for Codepipeline
      const actions = [
        new CodeBuildAction({
          actionName: `Deploy-${account.stage.toUpperCase()}`,
          input: this.artifact,
          project: deploy,
          runOrder: 2,
        }),
      ];

      // Add Action to Pipeline
      this.addStageToPipeline(
        `Deploy-${account.stage.toUpperCase()}`,
        account.stage !== "dev"
          ? [
              new ManualApprovalAction({
                actionName: "ManualApproval",
                additionalInformation: `Review Before Deploy ${account.stage}`,
                runOrder: 1,
              }),
              ...actions,
            ]
          : actions,
      );
    }
  }
}

The CrossAccountPipelineStack gets the Source and the Build stage out of the box; what follows is a loop over the accounts. You can even decide per stage which actions to add, as the conditional on line 56 does: it prepends a ManualApprovalAction to every stage except dev.
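
The loop can also be read as a pure function from accounts to stage definitions, which is easy to check without deploying anything. The following is a sketch under that reading, in plain TypeScript without CDK. PlannedAction, PlannedStage, and planDeployStages are hypothetical names that only mirror the structure of the loop above.

```typescript
interface Account {
  stage: string;
  number: string;
  region: string;
}

interface PlannedAction {
  actionName: string;
  runOrder: number;
}

interface PlannedStage {
  stageName: string;
  actions: PlannedAction[];
}

// Mirrors the loop in CrossAccountPipelineStack: every stage deploys,
// and every stage except "dev" is gated by a manual approval first.
function planDeployStages(accounts: Account[]): PlannedStage[] {
  return accounts.map((account) => {
    const stageName = `Deploy-${account.stage.toUpperCase()}`;
    const deploy: PlannedAction = { actionName: stageName, runOrder: 2 };
    const approval: PlannedAction = { actionName: "ManualApproval", runOrder: 1 };
    return {
      stageName,
      actions: account.stage !== "dev" ? [approval, deploy] : [deploy],
    };
  });
}

const plannedStages = planDeployStages([
  { number: "111111111111", region: "eu-central-1", stage: "dev" },
  { number: "222222222222", region: "eu-central-1", stage: "staging" },
  { number: "333333333333", region: "eu-central-1", stage: "prod" },
]);
for (const planned of plannedStages) {
  const names = planned.actions.map((action) => action.actionName);
  console.log(`${planned.stageName}: ${names.join(" -> ")}`);
}
// Deploy-DEV: Deploy-DEV
// Deploy-STAGING: ManualApproval -> Deploy-STAGING
// Deploy-PROD: ManualApproval -> Deploy-PROD
```

Because the approval has runOrder 1 and the deployment has runOrder 2, the deployment waits until someone approves.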

You wonder now how you can add that to your Monorepo project.

Pipeline in a Monorepo setup

Your project has a backend and a frontend inside the packages folder.

Here is where it can get complex.

When to trigger the pipeline? Will all packages be deployed? How can I only deploy the packages related to the files which have been changed?

You think you can only solve this with “Multiple Pipelines”.

Multiple Pipelines

Okay, you think that each package needs its own CodePipeline. Since the setup is the same, you copy the CodePipeline from the diagram above.

But when should the pipeline be triggered? After researching, you find out that almost all services emit events. Thus, you can use EventBridge with Lambda. You are going to note that down.

AWS emits an event every time something changes in the CodeCommit repository. You can catch it with EventBridge. The event pattern looks like this:

{
  "detail-type": ["CodeCommit Repository State Change"],
  "resources": ["repository.repositoryArn"],
  "source": ["aws.codecommit"],
  "detail": {
    "referenceType": ["branch"],
    "event": ["referenceCreated","referenceUpdated"],
    "referenceName": ["main"]
  }
}

You add that to your abstract class PipelineStack and create the respective Lambda, together with a policy that lets it read the differences from CodeCommit and start a CodePipeline execution.

// MultiPipelineStack: Lambda to trigger the correct Pipeline
// Because it's event-based, we need to understand the event pattern.
const eventPattern = {
  "detail-type": ["CodeCommit Repository State Change"],
  resources: [repository.repositoryArn],
  source: ["aws.codecommit"],
  detail: {
    referenceType: ["branch"],
    event: ["referenceCreated", "referenceUpdated"],
    referenceName: [props.branch ?? "main"],
  },
};

// Policy for the Lambda
const initialPolicy = [
  new PolicyStatement({
    actions: ["codecommit:GetDifferences"],
    resources: [repository.repositoryArn],
  }),
  new PolicyStatement({
    actions: ["codepipeline:StartPipelineExecution"],
    resources: [`arn:aws:codepipeline:${this.region}:${this.account}:*`],
  }),
];

// Function.fromFunctionName() only builds a reference and never returns
// undefined, so it cannot serve as an existence check. Instead, the first
// pipeline creates the Lambda and every later pipeline reuses it.
// Assumes PipelineStack declares: private static trigger?: IFunction;
PipelineStack.trigger ??= new Function(this, "CustomLambdaPipelineTrigger", {
  handler: "packages/core/src/customLambdaPipelineTrigger.handler",
  description:
    "Trigger the matching pipeline when a commit is pushed to the branch",
  functionName: "customLambdaPipelineTrigger",
  initialPolicy,
});
const customLambdaPipelineTrigger = PipelineStack.trigger;

// Route matching CodeCommit events to the Lambda instead of starting the pipeline directly
this.pipeline.onEvent("EventTrigger", {
  eventPattern,
  description: "Trigger the pipeline when a commit is pushed to the branch",
  target: new LambdaFunction(customLambdaPipelineTrigger),
});

You realize that you need to check whether the Lambda function already exists; otherwise, every new pipeline would create its own Lambda.

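Now, the pipeline will not be triggered immediately. Instead, the event is sent to EventBridge, which forwards it to the Lambda. The Lambda then decides which pipeline to start. Note that CodeCommitSourceAction starts the pipeline on every push by default, so for this to hold, the source action's own trigger has to be switched off (for example with trigger: CodeCommitTrigger.NONE).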
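
To see why only pushes to the configured branch reach the Lambda, here is a deliberately simplified model of how an event pattern is evaluated, written in plain TypeScript. It covers exact-value matching only: every key in the pattern must be present in the event, and the event's value must be one of the listed values. EventBridge's richer operators (prefix matching, anything-but, and so on) are left out, and matchesPattern, PatternNode, and EventNode are hypothetical names for this sketch.

```typescript
type PatternNode = { [key: string]: string[] | PatternNode };
type EventNode = { [key: string]: string | string[] | EventNode };

// Simplified exact-match evaluation of an event pattern.
function matchesPattern(pattern: PatternNode, candidate: EventNode): boolean {
  return Object.entries(pattern).every(([key, expected]) => {
    const actual = candidate[key];
    if (actual === undefined) return false;
    if (Array.isArray(expected)) {
      // Leaf: the event value (or any element of an event array) must be listed.
      const allowed: string[] = expected;
      if (typeof actual === "string") return allowed.includes(actual);
      return Array.isArray(actual) && actual.some((value) => allowed.includes(value));
    }
    // Nested object: the event must have a matching nested object.
    if (typeof actual === "string" || Array.isArray(actual)) return false;
    return matchesPattern(expected, actual);
  });
}

const branchPattern: PatternNode = {
  "detail-type": ["CodeCommit Repository State Change"],
  source: ["aws.codecommit"],
  detail: {
    referenceType: ["branch"],
    event: ["referenceCreated", "referenceUpdated"],
    referenceName: ["main"],
  },
};

const pushToMain: EventNode = {
  "detail-type": "CodeCommit Repository State Change",
  source: "aws.codecommit",
  detail: {
    event: "referenceUpdated",
    referenceType: "branch",
    referenceName: "main",
    repositoryName: "aws-cdk-pipeline-demo",
  },
};
const pushToFeature: EventNode = {
  ...pushToMain,
  detail: { event: "referenceUpdated", referenceType: "branch", referenceName: "feature/login" },
};

console.log(matchesPattern(branchPattern, pushToMain)); // true
console.log(matchesPattern(branchPattern, pushToFeature)); // false
```

Fields that appear only in the event, such as repositoryName, are ignored; only the keys named in the pattern are checked.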

The CustomLambdaPipelineTrigger Lambda

Okay, now you need to write a Lambda and decide to go with TypeScript. AWS CDK comes with an EventPattern interface, which your event type can extend.

import type { EventPattern } from 'aws-cdk-lib/aws-events';

interface CodeCommitStateChangeEvent extends EventPattern {
  detail: {
    callerUserArn: string;
    commitId: string;
    oldCommitId: string;
    event: string;
    referenceFullName: string;
    referenceName: string;
    referenceType: string;
    repositoryId: string;
    repositoryName: string;
  };
}

Because the Lambda compares the file differences and starts the CodePipeline, you add the @aws-sdk/client-codecommit and @aws-sdk/client-codepipeline npm packages.

import { CodeCommitClient, GetDifferencesCommand } from '@aws-sdk/client-codecommit';
import { CodePipelineClient, StartPipelineExecutionCommand } from '@aws-sdk/client-codepipeline';

const codecommitClient = new CodeCommitClient({});
const codepipelineClient = new CodePipelineClient({});

This is the structure of your monorepo:

├── packages
│   ├── backend
│   ├── core
│   └── frontend
├── stacks
│   ├── BackendStack.ts
│   ├── FrontendStack.ts
│   └── PipelineStack

The core is a shared package between backend and frontend. In total, you need two pipelines. Furthermore, you define the paths for the respective pipeline.

const stacksPath = "stacks";
const packages = "packages";
const commonFolder = ["stacks/PipelineStack", `${packages}/core`];

const PipelinePath = {
  backend: [
    `${packages}/backend`,
    `${stacksPath}/BackendStack.ts`,
    ...commonFolder,
  ],
  frontend: [
    `${packages}/frontend`,
    `${stacksPath}/FrontendStack.ts`,
    ...commonFolder,
  ],
};

const Pipeline = {
  frontend: "FrontendStackPipeline",
  backend: "BackendStackPipeline",
};

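If something changes in the commonFolder, every pipeline is triggered. Otherwise, it depends on the path of the package. It is essential that the names in the Pipeline object match the pipelineName of the deployed pipelines (the abstract class uses the construct id as the pipelineName).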
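
The mapping from changed files to pipelines can be sketched as a small pure function, which makes the rule easy to try out before wiring it into a Lambda. The pipelinesFor helper is a hypothetical name; the path and pipeline definitions are copied from above, and the matching uses the same substring check as the handler that follows.

```typescript
const stacksPath = "stacks";
const packages = "packages";
const commonFolder = ["stacks/PipelineStack", `${packages}/core`];

const PipelinePath = {
  backend: [`${packages}/backend`, `${stacksPath}/BackendStack.ts`, ...commonFolder],
  frontend: [`${packages}/frontend`, `${stacksPath}/FrontendStack.ts`, ...commonFolder],
};

const Pipeline = {
  frontend: "FrontendStackPipeline",
  backend: "BackendStackPipeline",
};

type PipelineKey = keyof typeof PipelinePath;

// Hypothetical helper: which pipelines should run for a set of changed files?
function pipelinesFor(changedFiles: string[]): string[] {
  return (Object.keys(PipelinePath) as PipelineKey[])
    .filter((key) =>
      PipelinePath[key].some((watched) =>
        changedFiles.some((file) => file.includes(watched)),
      ),
    )
    .map((key) => Pipeline[key]);
}

console.log(pipelinesFor(["packages/backend/src/api.ts"]).join(", "));
// BackendStackPipeline
console.log(pipelinesFor(["packages/core/src/util.ts"]).join(", "));
// BackendStackPipeline, FrontendStackPipeline
console.log(pipelinesFor(["README.md"]).length); // 0
```

Since includes matches anywhere in the path, a folder such as packages/core-utils would also count as a change to core; comparing with startsWith and a trailing slash would be stricter.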

After you have set up the prerequisites, you write the Lambda handler:

export async function handler(event: CodeCommitStateChangeEvent) {
  // Use the SDK to get the difference from the new commit and previous commit
  const getDifferences = new GetDifferencesCommand({
    repositoryName: event.detail.repositoryName,
    afterCommitSpecifier: event.detail.commitId,
    beforeCommitSpecifier: event.detail.oldCommitId,
  });
  const codecommit = await codecommitClient.send(getDifferences);

  // iterate over the paths in PipelinePath to check which pipeline should be triggered
  // e.g. if a changed path includes packages/backend, then trigger the BackendStackPipeline
  for (const path in PipelinePath) {
    const typePath = path as keyof typeof Pipeline;
    const pipeline = Pipeline[typePath];

    if (codecommit.differences) {
      const check = codecommit.differences.some((difference) => {
        return PipelinePath[typePath].some((substring) => {
          // Check if the path includes the substring
          return difference.afterBlob?.path?.includes(substring);
        });
      });

      if (check) {
        const triggerPipelineCommand = new StartPipelineExecutionCommand({
          name: pipeline,
        });
        await codepipelineClient.send(triggerPipelineCommand);
      }
    }
  }
}

  1. It gets the differences between the two commits with GetDifferencesCommand (Lines 3-8).

  2. The for-in loop goes through each pipeline key in PipelinePath (Lines 12-31).

  3. It casts the key to a key of the Pipeline object and looks up the matching pipeline name (Lines 13-14).

  4. If GetDifferencesCommand returned differences, it uses the some method, which tests whether at least one element in the array passes the provided test function. In this case, it checks whether any of the differences has a file path that includes one of the substrings in PipelinePath[typePath] (Lines 16-22).

  5. If the check in step 4 is true, it starts the matching pipeline through the SDK (Lines 24-29).
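
Two details are worth keeping in mind when reading differences: GetDifferences returns its results in pages, with a token that points to the next page, and a deleted file has no afterBlob, only a beforeBlob. The handler above reads a single response and only afterBlob. The sketch below models both cases in plain TypeScript against a hypothetical fetchPage function rather than the real SDK client, so DifferencePage, FetchPage, and collectChangedPaths are illustrative names and not SDK APIs.

```typescript
interface BlobRef {
  path?: string;
}
interface Difference {
  beforeBlob?: BlobRef;
  afterBlob?: BlobRef;
}
interface DifferencePage {
  differences?: Difference[];
  nextToken?: string; // hypothetical field name for the pagination token
}
type FetchPage = (token?: string) => Promise<DifferencePage>;

// Follow the pagination token until it is empty and collect every changed path.
async function collectChangedPaths(fetchPage: FetchPage): Promise<string[]> {
  const paths = new Set<string>();
  let token: string | undefined;
  do {
    const page = await fetchPage(token);
    for (const difference of page.differences ?? []) {
      // Deleted files only carry a beforeBlob, so fall back to it.
      const path = difference.afterBlob?.path ?? difference.beforeBlob?.path;
      if (path) paths.add(path);
    }
    token = page.nextToken;
  } while (token);
  return Array.from(paths);
}

// Fake two-page result: one modified file, then one deleted file.
const fakePages: Record<string, DifferencePage> = {
  first: {
    differences: [{ afterBlob: { path: "packages/backend/src/api.ts" } }],
    nextToken: "second",
  },
  second: {
    differences: [{ beforeBlob: { path: "packages/frontend/old.tsx" } }],
  },
};
const fakeFetch: FetchPage = async (token) => fakePages[token ?? "first"] ?? {};

collectChangedPaths(fakeFetch).then((paths) => console.log(paths.join(", ")));
// packages/backend/src/api.ts, packages/frontend/old.tsx
```

In a real handler, fetchPage would wrap codecommitClient.send with a GetDifferencesCommand and pass the token along.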

MultiPipelineStack

Along the way, you find an excellent framework, SST, which is much faster than CDK and provides a better developer experience (DX). Here is how you then define your MultiPipelineStack.

import { StackContext } from "sst/constructs";
import { CrossAccountPipelineStack } from "./CrossAccountPipelineStack";
import { accounts } from "./accounts";

export function MultiPipelineStack({ stack }: StackContext) {
  new CrossAccountPipelineStack(stack, "FrontendPipelineStack", {
    accounts,
    purpose: "frontend",
  });

  new CrossAccountPipelineStack(stack, "BackendPipelineStack", {
    accounts,
    purpose: "backend",
  });
}

Since you have defined a CrossAccountPipelineStack, you can put several of them in that stack. You distinguish them by adding a purpose to the props type, which you rename to CrossAccountPipelineStackProps. The purpose is then passed to the deployment as the PURPOSE environment variable.

// in CrossAccountPipelineStack.ts
type CrossAccountPipelineStackProps = PipelineStackProps & {
  accounts: Account[];
  purpose: string;
};

export class CrossAccountPipelineStack extends PipelineStack {
  constructor(scope: Construct, id: string, props: CrossAccountPipelineStackProps) {
                                                    // ^ This is changed
    super(scope, id, props);

    for (const account of props.accounts) {
      // Create CodeBuild Project
      const deploy = this.createPipelineProject(this, `DeployTo${account.stage}`, {
        install: ['./scripts/assume-role.sh'],
        // Prefixing the command with PURPOSE=... sets the environment variable for that command
        build: [`PURPOSE=${props.purpose} npx sst deploy --stage ${account.stage}`],
        postBuild: ['npm run test:integration'],
      });
      // ...
    }
  }
}

The deployment is controlled by the PURPOSE environment variable inside the sst.config.ts file.

import { SSTConfig } from "sst";
import { BackendStack } from "./stacks/BackendStack";
import { FrontendStack } from "./stacks/FrontendStack";

export default {
  config(_input) {
    return {
      name: "aws-ug-codepipeline-demo",
      region: "eu-central-1",
    };
  },
  stacks(app) {
    switch (process.env.PURPOSE) {
      case "backend":
        app.stack(BackendStack);
        break;
      case "frontend":
        app.stack(FrontendStack);
        break;
      default:
        throw new Error('PURPOSE environment variable must be "backend" or "frontend"');
    }
  },
} satisfies SSTConfig;
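
Because the whole deployment hinges on one environment variable, it can help to validate it in one place and let the compiler know which values are possible. A minimal sketch, assuming a hypothetical parsePurpose helper that is not part of the original code:

```typescript
type Purpose = "backend" | "frontend";
const PURPOSES: readonly Purpose[] = ["backend", "frontend"];

// Hypothetical helper: narrow an arbitrary string to a known purpose or fail loudly.
function parsePurpose(value: string | undefined): Purpose {
  const match = PURPOSES.find((purpose) => purpose === value);
  if (!match) {
    throw new Error(
      `PURPOSE must be one of ${PURPOSES.join(", ")}, got ${value ?? "nothing"}`,
    );
  }
  return match;
}

console.log(parsePurpose("backend")); // backend

try {
  parsePurpose(undefined);
} catch (error) {
  console.log((error as Error).message);
  // PURPOSE must be one of backend, frontend, got nothing
}
```

Inside sst.config.ts, the switch could then operate on parsePurpose(process.env.PURPOSE), and a typo such as "fronted" fails with a message that names the allowed values.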

Conclusion

Phew, you have written a lot of code, but you’re satisfied with the work and could even impress the other engineers, especially your CTO.

You are proud that you have leveraged AWS CDK to make your CodePipeline suck less. You used OOP to reuse code: an abstract class gives every concrete class a pipeline foundation. The setup covers cross-account deployment and triggers the correct pipeline when a monorepo project has multiple pipelines.

For now, the backend team can push to the main branch, which triggers the backend pipeline, and the frontend pipeline deploys the Next.js application whenever a file changes in the frontend package.

However, you know there are ways to improve the pipeline by adding notifications and caches for CodeBuild and triggering a pipeline whenever a Pull Request has been created.