Building a sophisticated CodePipeline on AWS with AWS CDK
Leverage AWS CDK to build a DevOps Suite entirely on AWS
You got hired as a DevOps engineer and have experience in building CI/CD pipelines with GitLab CI and GitHub Actions. But the new company requires you to run CI/CD entirely on AWS and to define the infrastructure as code.
CodePipeline, CodeBuild, CodeCommit! These are the DevOps services of AWS.
Your first reaction is, why? This looks so ugly! Where should I start? But the code is already on CodeCommit. It’s a monorepo.
Your research finds an interesting article on creating sophisticated pipelines entirely on AWS with AWS CDK. That blog post demonstrates each pipeline for different scenarios and how to organize your codebase. Let's have a look together.
Simple Pipeline
The simple pipeline includes a CodePipeline for orchestrating your CI/CD workflow.
A CodePipeline consists of stages, and each stage contains actions. An action is a command which is executed. Each stage produces an artifact resulting from an action that can be passed to the next stage. A stage always needs an artifact as an input, and it can define multiple actions that can run in parallel or sequentially.
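To make "in parallel or sequentially" concrete: within a stage, CodePipeline orders actions by their runOrder value. Actions that share a value run in parallel, and lower values run first. Below is a minimal sketch of that ordering in plain TypeScript. `SketchAction` and `planStage` are hypothetical names used only for illustration; they are not part of the CDK.

```typescript
// Hypothetical, simplified model of a stage action (not the CDK type).
interface SketchAction {
  actionName: string;
  runOrder?: number; // treated as 1 when omitted
}

// Group actions into waves: each inner array runs in parallel,
// and the waves run one after another in ascending runOrder.
function planStage(actions: SketchAction[]): string[][] {
  const waves: Record<number, string[]> = {};
  for (const action of actions) {
    const order = action.runOrder ?? 1;
    waves[order] = waves[order] ?? [];
    waves[order].push(action.actionName);
  }
  return Object.keys(waves)
    .map(Number)
    .sort((a, b) => a - b)
    .map((order) => waves[order]);
}

const plan = planStage([
  { actionName: "Lint" },
  { actionName: "UnitTest" },
  { actionName: "Deploy", runOrder: 2 },
]);
console.log(plan); // [["Lint", "UnitTest"], ["Deploy"]]
```

In this sketch, Lint and UnitTest would run side by side, and Deploy would start only after both have finished.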
Below is the AWS CDK code for the simple pipeline. For brevity, the import statements are left out.
// Create the pipeline
const pipeline = new Pipeline(stack, "Pipeline", {
pipelineName: "aws-cdk-pipeline-demo",
});
// Create the artifact to store outputs of the pipeline
const artifact = new Artifact();
// 1.1. Best Practice to Import Repository
const repository = Repository.fromRepositoryName(
stack,
"Repository",
"aws-cdk-pipeline-demo",
);
// 1.2. Add the source action to the pipeline
pipeline.addStage({
stageName: "Source",
actions: [
new CodeCommitSourceAction({
actionName: "Source",
output: artifact,
repository,
branch: "main",
}),
],
});
// 2. Stage : CodeBuild
pipeline.addStage({
stageName: "Build",
// A stage can hold many actions,
// but one CodeBuild action is enough here because it can run all our commands
actions: [
new CodeBuildAction({
actionName: "Build",
input: artifact,
project: new PipelineProject(stack, "BuildProject", {
projectName: "aws-cdk-build-project",
// You can also create your own buildspec.yml file and reference it here instead of using the inline buildspec
// https://docs.aws.amazon.com/codebuild/latest/userguide/build-spec-ref.html#build-spec-ref-syntax
buildSpec: BuildSpec.fromObject({
version: "0.2",
phases: {
install: {
commands: ["npm install"],
},
build: {
commands: [
"npm run lint",
"npm run test:unit",
"npm run deploy",
"npm run test:integration",
],
},
},
artifacts: {
"base-directory": ".",
files: ["**/*"],
},
}),
}),
}),
],
});
However, the Simple Pipeline does not meet your requirements, because you also want to deploy to different AWS accounts.
Cross-Account Pipeline
You continue reading, and indeed, there is a section about deploying to different AWS accounts. You learn that CodePipeline is made with that in mind. Below, you see the diagram for cross-account deployment to the DEV, STAGING, and PROD accounts.
As depicted in the diagram, the Build stage is now shorter. You learn that the following stages can reuse the artifact resulting from it. Yes, that makes sense.
You go further and think that CDK shines as Infrastructure as Code because it is written in TypeScript, which allows you to apply object-oriented programming paradigms. It is time to get your hands dirty, and you put the “Simple Pipeline” idea into an abstract class.
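/**
* Props shared by all pipeline stacks
*/
export interface PipelineStackProps extends StackProps {
// Branch that triggers the pipeline; defaults to "main"
branch?: string;
}
/**
* Abstract Class for building multiple Pipelines
* Make it abstract to give the Pipeline a base
*/
export abstract class PipelineStack extends Stack {
public readonly pipeline: Pipeline;
public readonly artifact: Artifact;
constructor(scope: Construct, id: string, props: PipelineStackProps) {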
super(scope, id, props);
// 0. Stage: CodePipeline (Pre-Requisite)
// 0.1 Create the pipeline
this.pipeline = new Pipeline(this, "Pipeline", {
pipelineName: id,
});
// 0.2 Create the artifact to store outputs of the pipeline
this.artifact = new Artifact("SourceArtifact");
// 1. Stage : CodeCommit
// We are importing the repository because it's a stateful resource
// and should be deployed in a stateful stack
const repository = Repository.fromRepositoryName(
this,
"Repository",
"aws-cdk-pipeline-demo",
);
// 1.2. Add the source action to the pipeline
this.pipeline.addStage({
stageName: "Source",
actions: [
new CodeCommitSourceAction({
actionName: "Source",
output: this.artifact,
repository,
branch: props.branch ?? "main",
}),
],
});
// 2. Stage: CodeBuild
// 2.1. Build Stage
this.pipeline.addStage({
stageName: "Build",
// A stage can hold many actions,
// but one CodeBuild action is enough here because it can run all our commands
actions: [
new CodeBuildAction({
actionName: "Build",
input: this.artifact,
project: new PipelineProject(this, "BuildProject", {
// Derive the name from the stack id so that several pipelines do not collide
projectName: `${id}-build`,
// You can also create your own buildspec.yml file and reference it here instead of using the inline buildspec
// https://docs.aws.amazon.com/codebuild/latest/userguide/build-spec-ref.html#build-spec-ref-syntax
buildSpec: BuildSpec.fromObject({
version: "0.2",
phases: {
install: {
commands: ["npm install"],
},
build: {
commands: [
"npm run lint",
"npm run test:unit",
"npm run build",
],
},
},
artifacts: {
"base-directory": "dist",
files: ["**/*"],
},
}),
}),
}),
],
});
// 2.2. Deploy Stage
// This part is now handled in the Concrete class which inherits this
// However, we can simplify the inheritance and provide helper functions below:
// createPipelineProject() and addStageToPipeline()
}
/**
* The function creates a pipeline project with specified commands and build specifications.
* @param {Construct} scope - The scope parameter is the parent construct that the pipeline project
* will be created under. It defines the scope or context in which the project will exist.
* @param {string} name - The name of the pipeline project. It is used as the project name and also as
* the name of the construct in the AWS CDK.
* @param {Command} commands - The `commands` parameter is an object that contains the commands to be
* executed in different phases of the build process (see the `Command` interface).
* @returns a new instance of the `PipelineProject` class.
*/
public createPipelineProject(
scope: Construct,
name: string,
commands: Command,
): PipelineProject {
return new PipelineProject(scope, name, {
projectName: name,
environment: {
buildImage: LinuxBuildImage.STANDARD_7_0,
computeType: ComputeType.LARGE,
},
buildSpec: BuildSpec.fromObject({
version: "0.2",
phases: {
pre_build: {
commands: commands.preBuild,
},
install: {
commands: commands.install,
},
build: {
commands: commands.build,
},
post_build: {
commands: commands.postBuild,
},
},
artifacts: {
"base-directory": ".",
files: ["**/*"],
},
}),
});
}
/**
* The function adds a stage to a pipeline with a given stage name and a list of actions.
* @param {string} stageName - A string representing the name of the stage to be added to the
* pipeline.
* @param {Action[]} actions - The `actions` parameter is an array of `Action` objects. Each `Action`
* object represents a specific action or task that needs to be performed in the pipeline stage.
*/
public addStageToPipeline(stageName: string, actions: Action[]): void {
this.pipeline.addStage({
stageName,
actions,
});
}
}
The beauty of the code is that it gives the concrete class simple methods to work with. You create a method called createPipelineProject, which takes a third parameter commands of type Command:
interface Command {
preBuild?: string[];
install?: string[];
build?: string[];
postBuild?: string[];
}
The concrete class uses this method whenever it creates a CodeBuild action, for example:
this.createPipelineProject(this, `Deploy-${account.stage}`, {
install: ["./scripts/assume-role.sh"],
build: [`npx sst deploy --stage ${account.stage}`],
postBuild: ["npm run test:integration"],
});
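Because every field of Command is optional, some phases can end up without any commands. The following is a minimal sketch, under that assumption, of a helper that keeps only the phases that actually have commands. `toPhases` is a hypothetical name; the phase keys (install, pre_build, build, post_build) are the ones used in the buildspec above.

```typescript
interface Command {
  preBuild?: string[];
  install?: string[];
  build?: string[];
  postBuild?: string[];
}

type Phases = Record<string, { commands: string[] }>;

// Map the camelCase Command fields to buildspec phase names and
// skip every phase that has no commands.
function toPhases(commands: Command): Phases {
  const mapping: Array<[string, string[] | undefined]> = [
    ["install", commands.install],
    ["pre_build", commands.preBuild],
    ["build", commands.build],
    ["post_build", commands.postBuild],
  ];
  const phases: Phases = {};
  for (const [phase, list] of mapping) {
    if (list && list.length > 0) {
      phases[phase] = { commands: list };
    }
  }
  return phases;
}

const phases = toPhases({
  install: ["./scripts/assume-role.sh"],
  build: ["npx sst deploy --stage dev"],
});
console.log(Object.keys(phases)); // ["install", "build"]
```

The result could be passed as the phases object of an inline buildspec, so the generated buildspec only lists the phases that do something.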
“Wow, the method is simple to use because the details are now abstracted away in the parent,” you say. Now, you implement CrossAccountPipelineStack:
interface Account {
stage: string;
number: string;
region: string; // or to narrow it: "eu-central-1" | "us-east-1"
}
const accounts: Account[] = [
{
number: "111111111111",
region: "eu-central-1",
stage: "dev",
},
{
number: "222222222222",
region: "eu-central-1",
stage: "staging",
},
{
number: "333333333333",
region: "eu-central-1",
stage: "prod",
},
];
export class CrossAccountPipelineStack extends PipelineStack {
constructor(scope: Construct, id: string, props: StackProps) {
super(scope, id, props);
for (const account of accounts) {
// Create CodeBuild Project
const deploy = this.createPipelineProject(
this,
`DeployTo${account.stage}`,
{
install: ["./scripts/assume-role.sh"],
build: [
`npx sst deploy --stage ${account.stage}`,
],
postBuild: ["npm run test:integration"],
},
);
// Create Action for Codepipeline
const actions = [
new CodeBuildAction({
actionName: `Deploy-${account.stage.toUpperCase()}`,
input: this.artifact,
project: deploy,
runOrder: 2,
}),
];
// Add Action to Pipeline
this.addStageToPipeline(
`Deploy-${account.stage.toUpperCase()}`,
account.stage !== "dev"
? [
new ManualApprovalAction({
actionName: "ManualApproval",
additionalInformation: `Review Before Deploy ${account.stage}`,
runOrder: 1,
}),
...actions,
]
: actions,
);
}
}
}
The CrossAccountPipelineStack gets the Source and Build stages out of the box; what follows is a loop through accounts that adds one deploy stage per account. You can even decide per stage which actions to add, as the loop does when it puts a ManualApprovalAction in front of the deployment for every stage except dev.
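The approval logic from the loop can be looked at in isolation. Below is a minimal sketch in plain TypeScript, assuming the same rule as above (no approval for dev, approval first for everything else). `StageAction` and `deployStageActions` are hypothetical stand-ins, not CDK types.

```typescript
// Hypothetical, simplified stand-in for a CodePipeline action.
interface StageAction {
  actionName: string;
  runOrder: number;
}

// Build the actions of one deploy stage: every stage except "dev"
// gets a manual approval (runOrder 1) before the deploy (runOrder 2).
function deployStageActions(stage: string): StageAction[] {
  const deploy: StageAction = {
    actionName: `Deploy-${stage.toUpperCase()}`,
    runOrder: 2,
  };
  if (stage === "dev") {
    return [deploy];
  }
  return [{ actionName: "ManualApproval", runOrder: 1 }, deploy];
}

for (const stage of ["dev", "staging", "prod"]) {
  console.log(stage, deployStageActions(stage).map((a) => a.actionName));
}
```

Keeping the deploy action at runOrder 2 in every stage means the approval can be added or removed without touching the deploy action itself.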
You wonder now how you can add that to your Monorepo project.
Pipeline in a Monorepo setup
Your project has a backend and a frontend inside the packages folder.
Here is where it can get complex.
When to trigger the pipeline? Will all packages be deployed? How can I only deploy the packages related to the files which have been changed?
You think you can only solve this with “Multiple Pipelines”.
Multiple Pipelines
Okay, you think that each package needs to create its own CodePipeline. Since the setup is the same, you copy the CodePipeline from the diagram above.
But when should the pipeline be triggered? After researching, you find out that almost all services emit events. Thus, you can use EventBridge with Lambda. You are going to note that down.
AWS emits an event every time something changes in our CodeCommit repository. We can catch that with EventBridge. The event pattern looks like this:
{
"detail-type": ["CodeCommit Repository State Change"],
"resources": ["repository.repositoryArn"],
"source": ["aws.codecommit"],
"detail": {
"referenceType": ["branch"],
"event": ["referenceCreated","referenceUpdated"],
"referenceName": ["main"]
}
}
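To get a feeling for what this pattern lets through, here is a deliberately simplified matcher in plain TypeScript. It only covers exact-value matching, where every key in the pattern must be present and the event value must be one of the listed values; EventBridge itself supports more operators. `Pattern`, `matches`, and the sample events are hypothetical, and the resources key is left out because the repository ARN is a placeholder.

```typescript
// A pattern node is either a list of accepted values or a nested pattern.
type Pattern = { [key: string]: string[] | Pattern };

// Simplified matcher: every key in the pattern must exist in the event,
// and the event value must be one of the listed values. For array-valued
// event fields, one overlapping element is enough.
function matches(pattern: Pattern, event: Record<string, unknown>): boolean {
  return Object.keys(pattern).every((key) => {
    const expected = pattern[key];
    const actual = event[key];
    if (Array.isArray(expected)) {
      const values: unknown[] = Array.isArray(actual) ? actual : [actual];
      return values.some((value) => expected.indexOf(value as string) !== -1);
    }
    return (
      typeof actual === "object" &&
      actual !== null &&
      matches(expected, actual as Record<string, unknown>)
    );
  });
}

const pattern: Pattern = {
  "detail-type": ["CodeCommit Repository State Change"],
  source: ["aws.codecommit"],
  detail: {
    referenceType: ["branch"],
    event: ["referenceCreated", "referenceUpdated"],
    referenceName: ["main"],
  },
};

const pushToMain = {
  "detail-type": "CodeCommit Repository State Change",
  source: "aws.codecommit",
  detail: { referenceType: "branch", event: "referenceUpdated", referenceName: "main" },
};
const pushToFeature = {
  ...pushToMain,
  detail: { ...pushToMain.detail, referenceName: "feature-x" },
};

console.log(matches(pattern, pushToMain)); // true
console.log(matches(pattern, pushToFeature)); // false
```

So a push to main passes the filter, while a push to any other branch never reaches the Lambda.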
You add that to your abstract class PipelineStack and create the respective Lambda together with a policy that allows it to read the differences from CodeCommit and to start CodePipeline executions.
// MultiPipelineStack: Lambda to trigger the correct Pipeline
// Because it's event-based, we need to understand the event pattern.
const eventPattern = {
"detail-type": ["CodeCommit Repository State Change"],
resources: [repository.repositoryArn],
source: ["aws.codecommit"],
detail: {
referenceType: ["branch"],
event: ["referenceCreated", "referenceUpdated"],
referenceName: [props.branch ?? "main"],
},
};
// Policy for the Lambda
const initialPolicy = [
new PolicyStatement({
actions: ["codecommit:GetDifferences"],
resources: [repository.repositoryArn],
}),
new PolicyStatement({
actions: ["codepipeline:StartPipelineExecution"],
resources: [`arn:aws:codepipeline:${this.region}:${this.account}:*`],
}),
];
// We only need one trigger Lambda, no matter how many pipelines exist.
// Function.fromFunctionName() cannot serve as an existence check: it always returns
// a reference and never looks the function up, so a `??` fallback would never run.
// Instead, cache the construct in a static class member, declared on PipelineStack as
// `private static triggerFunction?: Function;`, so only the first pipeline creates it.
PipelineStack.triggerFunction ??= new Function(this, "CustomLambdaPipelineTrigger", {
handler: "packages/core/src/customLambdaPipelineTrigger.handler",
description:
"Trigger the pipeline when a commit is pushed to the branch",
functionName: "customLambdaPipelineTrigger",
initialPolicy,
});
const customLambdaPipelineTrigger = PipelineStack.triggerFunction;
// Route matching CodeCommit events to the Lambda instead of starting the pipeline directly
repository.onEvent("EventTrigger", {
eventPattern,
description: "Trigger the pipeline when a commit is pushed to the branch",
target: new LambdaFunction(customLambdaPipelineTrigger),
});
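You realize that the Lambda function must be created only once; otherwise, every new pipeline would create its own copy.
Now, the pipeline will not be triggered immediately. Instead, the event is sent to EventBridge, which forwards it to Lambda. Lambda then decides which pipeline to start based on the changed files. For this to hold, the source action's own trigger has to be switched off (trigger: CodeCommitTrigger.NONE on CodeCommitSourceAction); by default, the action starts the pipeline on every push by itself.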
The CustomLambdaPipelineTrigger Lambda
Okay, now you need to write a Lambda and decide to go with TypeScript. AWS CDK comes with an interface EventPattern, which your event type can extend.
import type { EventPattern } from 'aws-cdk-lib/aws-events';
interface CodeCommitStateChangeEvent extends EventPattern {
detail: {
callerUserArn: string;
commitId: string;
oldCommitId: string;
event: string;
referenceFullName: string;
referenceName: string;
referenceType: string;
repositoryId: string;
repositoryName: string;
};
}
Because it will compare the file differences and execute the CodePipeline, you add the @aws-sdk/client-codecommit and @aws-sdk/client-codepipeline npm packages.
import { CodeCommitClient, GetDifferencesCommand } from '@aws-sdk/client-codecommit';
import { CodePipelineClient, StartPipelineExecutionCommand } from '@aws-sdk/client-codepipeline';
const codecommitClient = new CodeCommitClient({});
const codepipelineClient = new CodePipelineClient({});
This is the structure of your monorepo:
├── packages
│ ├── backend
│ ├── core
│ └── frontend
├── stacks
│ ├── BackendStack.ts
│ ├── FrontendStack.ts
│ └── PipelineStack
The core package is shared between backend and frontend. In total, you need two pipelines. Furthermore, you define the paths that belong to each pipeline.
const stacksPath = "stacks";
const packages = "packages";
const commonFolder = ["stacks/PipelineStack", `${packages}/core`];
const PipelinePath = {
backend: [
`${packages}/backend`,
`${stacksPath}/BackendStack.ts`,
...commonFolder,
],
frontend: [
`${packages}/frontend`,
`${stacksPath}/FrontendStack.ts`,
...commonFolder,
],
};
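// The values must match the pipelineName of each pipeline,
// which PipelineStack sets to the stack id
const Pipeline = {
frontend: "FrontendPipelineStack",
backend: "BackendPipelineStack",
};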
If something changes in the commonFolder, every pipeline is triggered. Otherwise, it depends on the path inside packages. It is essential that the names in the Pipeline object match the actual pipeline names.
After you set up the prerequisites, you write the Lambda handler:
export async function handler(event: CodeCommitStateChangeEvent) {
// Use the SDK to get the difference from the new commit and previous commit
const getDifferences = new GetDifferencesCommand({
repositoryName: event.detail.repositoryName,
afterCommitSpecifier: event.detail.commitId,
beforeCommitSpecifier: event.detail.oldCommitId,
});
const codecommit = await codecommitClient.send(getDifferences);
// iterate over the paths in PipelinePath to check which pipeline should be triggered
// e.g. if a changed path includes packages/backend, then trigger the backend pipeline
for (const path in PipelinePath) {
const typePath = path as keyof typeof Pipeline;
const pipeline = Pipeline[typePath];
if (codecommit.differences) {
const check = codecommit.differences.some((difference) => {
return PipelinePath[typePath].some((substring) => {
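// Check if the path includes the substring; deleted files only have a beforeBlob
const changedPath = difference.afterBlob?.path ?? difference.beforeBlob?.path;
return changedPath?.includes(substring) ?? false;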
});
});
if (check) {
const triggerPipelineCommand = new StartPipelineExecutionCommand({
name: pipeline,
});
await codepipelineClient.send(triggerPipelineCommand);
}
}
}
}
The handler works in five steps:
1. It gets the differences between the two commits with GetDifferencesCommand.
2. The for-in loop goes through each key of PipelinePath, one per pipeline.
3. It casts the key to a key of Pipeline and looks up the matching pipeline name.
4. If GetDifferencesCommand returned differences, it uses the some method, which tests whether at least one element in the array passes the test implemented by the provided function. In this case, it checks whether any of the differences has a path that includes any of the substrings in PipelinePath[typePath].
5. If step 4 is true, it starts the pipeline stored in pipeline by using the SDK.
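The path check is the part most worth testing on its own, so here is a minimal sketch of it as a pure function, without any SDK calls. It is a variation, not the handler above: instead of a substring test it matches on path prefixes, so that a folder such as packages/core-utils would not accidentally count as packages/core. `pipelinePaths`, `isUnder`, and `pipelinesToStart` are hypothetical names.

```typescript
// Paths that belong to each pipeline, mirroring the folders listed above.
const common = ["stacks/PipelineStack", "packages/core"];
const pipelinePaths: Record<string, string[]> = {
  backend: ["packages/backend", "stacks/BackendStack.ts", ...common],
  frontend: ["packages/frontend", "stacks/FrontendStack.ts", ...common],
};

// True if `path` is the given file or lies inside the given folder.
function isUnder(path: string, prefix: string): boolean {
  return path === prefix || path.indexOf(`${prefix}/`) === 0;
}

// Return the keys of all pipelines affected by the changed file paths.
function pipelinesToStart(changedPaths: string[]): string[] {
  return Object.keys(pipelinePaths).filter((pipeline) =>
    changedPaths.some((path) =>
      pipelinePaths[pipeline].some((prefix) => isUnder(path, prefix)),
    ),
  );
}

console.log(pipelinesToStart(["packages/frontend/app/page.tsx"])); // ["frontend"]
console.log(pipelinesToStart(["packages/core/src/index.ts"])); // ["backend", "frontend"]
console.log(pipelinesToStart(["README.md"])); // []
```

A handler could call such a function with the paths collected from GetDifferencesCommand and then start one pipeline execution per returned key.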
MultiPipelineStack
Along the way, you find an excellent framework, SST, which you find much faster than plain CDK and which provides a better developer experience (DX). Here is how you then define your MultiPipelineStack.
import { StackContext } from "sst/constructs";
import { CrossAccountPipelineStack } from "./CrossAccountPipelineStack";
import { accounts } from "./accounts";
export function MultiPipelineStack({ stack }: StackContext) {
new CrossAccountPipelineStack(stack, "FrontendPipelineStack", {
accounts,
purpose: "frontend",
});
new CrossAccountPipelineStack(stack, "BackendPipelineStack", {
accounts,
purpose: "backend",
});
}
Since you have defined a CrossAccountPipelineStack, you put multiple of them in that stack. You distinguish them by adding a purpose to the StackProps, which you then rename to CrossAccountPipelineStackProps. The purpose is needed to set the PURPOSE environment variable.
// in CrossAccountPipelineStack.ts
type CrossAccountPipelineStackProps = PipelineStackProps & {
accounts: Account[];
purpose: string;
};
export class CrossAccountPipelineStack extends PipelineStack {
constructor(scope: Construct, id: string, props: CrossAccountPipelineStackProps) {
// ^ This is changed
super(scope, id, props);
for (const account of props.accounts) {
// Create CodeBuild Project
// Prefix with the purpose so that project names stay unique across pipelines
const deploy = this.createPipelineProject(this, `${props.purpose}-DeployTo${account.stage}`, {
install: ['./scripts/assume-role.sh'],
// This is how you create a new environment variable
build: [`PURPOSE=${props.purpose} npx sst deploy --stage ${account.stage}`],
postBuild: ['npm run test:integration'],
});
// ...
}
}
}
The deployment is controlled by the PURPOSE environment variable inside sst.config.ts.
import { SSTConfig } from "sst";
import { BackendStack } from "./stacks/BackendStack";
import { FrontendStack } from "./stacks/FrontendStack";
export default {
config(_input) {
return {
name: "aws-ug-codepipeline-demo",
region: "eu-central-1",
};
},
stacks(app) {
switch (process.env.PURPOSE) {
case "backend":
app.stack(BackendStack);
break;
case "frontend":
app.stack(FrontendStack);
break;
default:
throw new Error('PURPOSE environment variable must be "backend" or "frontend"');
}
},
} satisfies SSTConfig;
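The switch only works if PURPOSE holds one of the expected values. As a minimal sketch, the check could live in a small helper that narrows the raw string to a union type and fails with a clear message otherwise. `purposes`, `Purpose`, and `parsePurpose` are hypothetical names, not part of SST.

```typescript
// The values the pipeline is expected to pass in via PURPOSE.
const purposes = ["backend", "frontend"] as const;
type Purpose = (typeof purposes)[number];

// Narrow an arbitrary string (or undefined) to a known purpose,
// or throw an error that names the accepted values.
function parsePurpose(value: string | undefined): Purpose {
  const match = purposes.filter((purpose) => purpose === value)[0];
  if (match === undefined) {
    throw new Error(
      `PURPOSE must be one of ${purposes.join(", ")}, got "${value ?? ""}"`,
    );
  }
  return match;
}

console.log(parsePurpose("backend")); // "backend"
```

Inside the config, such a helper would be called as parsePurpose(process.env.PURPOSE), and a typo such as PURPOSE=backnd would then fail before any stack is deployed.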
Conclusion
Phew, you have written a lot of code, but you’re satisfied with the work and could even impress the other engineers, especially your CTO.
You are proud that you have leveraged AWS CDK to make CodePipeline suck less. You used OOP to reuse code: an abstract class gives every concrete class a pipeline foundation. The result covers cross-account deployment and triggers the correct pipeline when a monorepo project has multiple pipelines.
For now, the backend team can push to the main branch, which triggers the BackendPipeline, and the FrontendPipeline deploys the Next.js application whenever a file changes in the frontend package.
However, you know there are ways to improve the pipeline by adding notifications and caches for CodeBuild and triggering a pipeline whenever a Pull Request has been created.