Serverless Tekton Pipelines on AWS EKS Fargate

Building an on-demand, serverless Continuous Delivery automation solution using Tekton and AWS EKS Fargate

Continuous Delivery is hard business, especially if you're dealing with microservices. Jenkins works pretty well up to a certain scale if you create shared libraries for common builds. After a while, though, when you're running your SaaS on microservices like we do at Digité, managing the builds and the infrastructure for CI/CD can get cumbersome. We considered moving to Tekton for two reasons: optimized cloud infrastructure usage and the ability to easily write and maintain CD pipelines.

Having said that, reserving two extra-large VMs for the "what if there are too many jobs running in parallel?" scenario does not seem natural to me, so I set out to make Tekton work on Fargate. The reasoning behind Fargate is the ease of serverless: it lets us concentrate on managing our CI/CD pipelines without having to manage the infrastructure for them. In this post, I'll share my experience of getting a serverless CI/CD infrastructure for Tekton up and running quickly via Terraform.

Setup

Let's start by creating a Terraform module for installing Tekton on Fargate. You can refer to this article for creating a basic EKS Fargate cluster setup. Assuming you have that in place, the next steps are as follows.

Fargate Profiles

We'll first create the Fargate profile for running Tekton, Tekton Dashboard and Tekton Triggers in the tekton-pipelines namespace:

resource "aws_eks_fargate_profile" "tekton-dashboard-profile" {
  cluster_name           = module.eks.cluster_id
  fargate_profile_name   = "tekton-dashboard-profile"
  pod_execution_role_arn = module.eks.fargate_iam_role_arn
  subnet_ids             = module.vpc.private_subnets
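  # A selector matches a pod only if ALL of its labels match, and a map cannot
  # hold the same key twice, so each component gets its own selector block.
  selector {
    namespace = "tekton-pipelines"
    labels = {
      "app.kubernetes.io/part-of" = "tekton-dashboard"
    }
  }
  selector {
    namespace = "tekton-pipelines"
    labels = {
      "app.kubernetes.io/part-of" = "tekton-triggers"
    }
  }
  selector {
    namespace = "tekton-pipelines"
    labels = {
      # Assumption: the core Tekton Pipelines pods carry this label value.
      "app.kubernetes.io/part-of" = "tekton-pipelines"
    }
  }
  depends_on = [module.eks]
  tags = {
    Environment = var.environment
    Cost        = var.cost_tag
  }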
}
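
The profile above targets the Tekton components themselves. The pods that Tekton creates for task runs also need a matching profile in whichever namespace the pipelines run in. Here is a minimal sketch, assuming the pipelines run in a namespace called tekton-builds (a hypothetical name, not part of this setup) and that Tekton labels the pods it creates with app.kubernetes.io/managed-by = tekton-pipelines. Verify the labels on your own task pods before relying on it.

```hcl
# Sketch only: the profile name and the "tekton-builds" namespace are
# hypothetical; the label is assumed to be set by Tekton on task pods.
resource "aws_eks_fargate_profile" "tekton-taskrun-profile" {
  cluster_name           = module.eks.cluster_id
  fargate_profile_name   = "tekton-taskrun-profile"
  pod_execution_role_arn = module.eks.fargate_iam_role_arn
  subnet_ids             = module.vpc.private_subnets

  selector {
    namespace = "tekton-builds"
    labels = {
      "app.kubernetes.io/managed-by" = "tekton-pipelines"
    }
  }

  depends_on = [module.eks]
}
```

If your cluster already has a Fargate profile that selects that whole namespace without labels, a separate profile like this one is not needed.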

EFS Setup

AWS recommends EFS for mounting persistent volumes (PVs) on Fargate nodes, so we'll add the EFS configuration in the next steps.

It's good practice to restrict EFS access to the VPC running the EKS cluster, plus your internal network so that IAM-controlled users can reach it over the AWS CLI. To restrict access, declare a security group with an ingress rule for each subnet CIDR of the VPC running EKS Fargate.

module "efs-access-security-group" {
  source  = "terraform-aws-modules/security-group/aws"
  version = "4.3.0"
  create  = true

  name        = "efs-${var.cluster_title}-${var.environment}-security-group"
  description = "Security group for pipeline tekton EFS, created via terraform"
  vpc_id      = module.vpc.vpc_id

  # One rule per subnet CIDR. NFS (used by EFS) only needs TCP port 2049,
  # so the port range is narrowed to that single port.
  ingress_with_cidr_blocks = [
    {
      cidr_blocks = "172.18.1.0/24"
      from_port   = 2049
      to_port     = 2049
      protocol    = "tcp"
    },
    {
      cidr_blocks = "172.18.3.0/24"
      from_port   = 2049
      to_port     = 2049
      protocol    = "tcp"
    },
    # All Subnet CIDRs...
  ]
  ingress_with_self = [{
    from_port   = 0
    to_port     = 0
    protocol    = -1
    self        = true
    description = "Ingress with Self"
  }]

  egress_with_cidr_blocks = [{
    cidr_blocks = "0.0.0.0/0"
    from_port   = 0
    to_port     = 0
    protocol    = -1
  }]
}
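
Listing every subnet CIDR by hand gets tedious as the VPC grows. Here is a minimal sketch of generating the ingress rules instead, assuming module.vpc is the terraform-aws-modules VPC module and exposes a private_subnets_cidr_blocks output. The local name efs_nfs_ingress_rules is made up for illustration.

```hcl
# Sketch only: builds one NFS (TCP 2049) ingress rule per private subnet CIDR.
# Assumes module.vpc exposes private_subnets_cidr_blocks.
locals {
  efs_nfs_ingress_rules = [
    for cidr in module.vpc.private_subnets_cidr_blocks : {
      cidr_blocks = cidr
      from_port   = 2049
      to_port     = 2049
      protocol    = "tcp"
      description = "NFS from ${cidr}"
    }
  ]
}
```

The security group module can then take ingress_with_cidr_blocks = local.efs_nfs_ingress_rules in place of the hand-written list.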

While Fargate auto-installs the EFS CSI driver, we still have to declare an IAM policy for the cluster's EFS access. Here's how to do it in our Terraform module:

resource "aws_iam_policy" "efs-csi-driver-policy" {
  name        = "TektonEFSCSIDriverPolicy"
  description = "EFS CSI Driver Policy"

  policy = jsonencode({
    "Version" : "2012-10-17",
    "Statement" : [
      {
        "Effect" : "Allow",
        "Action" : [
          "elasticfilesystem:DescribeAccessPoints",
          "elasticfilesystem:DescribeFileSystems"
        ],
        "Resource" : "*"
      },
      {
        "Effect" : "Allow",
        "Action" : [
          "elasticfilesystem:CreateAccessPoint"
        ],
        "Resource" : "*",
        "Condition" : {
          "StringLike" : {
            "aws:RequestTag/efs.csi.aws.com/cluster" : "true"
          }
        }
      },
      {
        "Effect" : "Allow",
        "Action" : "elasticfilesystem:DeleteAccessPoint",
        "Resource" : "*",
        "Condition" : {
          "StringEquals" : {
            "aws:ResourceTag/efs.csi.aws.com/cluster" : "true"
          }
        }
      }
    ]
  })
}

With that done, we'll define the cluster IAM role for EFS access. First comes the policy document that holds the trust (assume-role) policy statement for the role:

data "aws_iam_policy_document" "efs-iam-assume-role-policy" {

  statement {
    actions = ["sts:AssumeRoleWithWebIdentity"]
    effect  = "Allow"
    condition {
      test     = "StringEquals"
      variable = "${replace(aws_iam_openid_connect_provider.tekton-main.url, "https://", "")}:sub"
      values   = ["system:serviceaccount:tekton-pipelines:tekton-efs-serviceaccount"]
    }
    principals {
      identifiers = [aws_iam_openid_connect_provider.tekton-main.arn]
      type        = "Federated"
    }
  }
  depends_on = [
    aws_iam_policy.efs-csi-driver-policy
  ]
}
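
The policy document above refers to aws_iam_openid_connect_provider.tekton-main, which has to exist somewhere in the module. If yours does not define it yet, here is a minimal sketch, assuming the EKS module exposes a cluster_oidc_issuer_url output and that the hashicorp/tls provider is available. The data source name eks-oidc is made up for illustration.

```hcl
# Sketch only: registers the cluster's OIDC issuer with IAM so that
# service accounts can assume roles (IAM roles for service accounts).
# Assumes module.eks exposes cluster_oidc_issuer_url.
data "tls_certificate" "eks-oidc" {
  url = module.eks.cluster_oidc_issuer_url
}

resource "aws_iam_openid_connect_provider" "tekton-main" {
  url             = module.eks.cluster_oidc_issuer_url
  client_id_list  = ["sts.amazonaws.com"]
  thumbprint_list = [data.tls_certificate.eks-oidc.certificates[0].sha1_fingerprint]
}
```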

Then we add the role and attach the EFS policy to it:

resource "aws_iam_role" "efs-service-account-iam-role" {
  assume_role_policy = data.aws_iam_policy_document.efs-iam-assume-role-policy.json
  name               = "tekton-efs-service-account-role"
}

resource "aws_iam_role_policy_attachment" "efs-csi-driver-policy-attachment" {
  role       = aws_iam_role.efs-service-account-iam-role.name
  policy_arn = aws_iam_policy.efs-csi-driver-policy.arn
}

And then we map it to a service account, along with a Kubernetes role and role binding:

resource "kubernetes_service_account" "efs-service-account" {
  metadata {
    name      = "tekton-efs-serviceaccount"
    namespace = "tekton-pipelines"
    labels = {
      "app.kubernetes.io/name" = "tekton-efs-serviceaccount"
    }
    annotations = {
      # This annotation is only used when running on EKS which can use IAM roles for service accounts.
      "eks.amazonaws.com/role-arn" = aws_iam_role.efs-service-account-iam-role.arn
    }
  }
  depends_on = [
    aws_iam_role_policy_attachment.efs-csi-driver-policy-attachment
  ]
}

resource "kubernetes_role" "efs-kube-role" {
  metadata {
    name = "efs-kube-role"
    labels = {
      "name" = "efs-kube-role"
    }
  }

  rule {
    api_groups = [""]
    resources  = ["persistentvolumeclaims", "persistentvolumes"]
    verbs      = ["create", "get", "list", "update", "watch", "patch"]
  }

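  rule {
    # The API group for these storage resources is "storage.k8s.io", not "storage".
    api_groups = ["", "storage.k8s.io"]
    resources  = ["nodes", "pods", "events", "csidrivers", "csinodes", "csistoragecapacities", "storageclasses"]
    verbs      = ["get", "list", "watch"]
  }
  # This attachment is defined elsewhere in the module (not shown in this post).
  depends_on = [aws_iam_role_policy_attachment.alb-ingress-policy-attachment]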
}

resource "kubernetes_role_binding" "efs-role-binding" {
  depends_on = [
    kubernetes_service_account.efs-service-account
  ]
  metadata {
    name = "tekton-efs-role-binding"
    labels = {
      "app.kubernetes.io/name" = "tekton-efs-role-binding"
    }
  }

  role_ref {
    api_group = "rbac.authorization.k8s.io"
    kind      = "Role"
    name      = "efs-kube-role"
  }
  subject {
    kind      = "ServiceAccount"
    name      = "tekton-efs-serviceaccount"
    namespace = "tekton-pipelines"
  }
}

With the IAM-linked service account in place, we'll define the EFS file system:

resource "aws_efs_file_system" "eks-efs" {
  creation_token = "tekton-eks-efs"
  encrypted      = true
  tags = {
    Name = "tekton-eks-efs"
    Cost = var.cost_tag
  }
  depends_on = [
    kubernetes_role_binding.efs-role-binding
  ]
}

Next come its mount targets, access point and storage class:

resource "aws_efs_mount_target" "eks-efs-private-subnet-mnt-target" {
  count           = length(module.vpc.private_subnets)
  file_system_id  = aws_efs_file_system.eks-efs.id
  subnet_id       = module.vpc.private_subnets[count.index]
  security_groups = [module.efs-access-security-group.security_group_id]
}

resource "aws_efs_access_point" "eks-efs-tekton-access-point" {
  file_system_id = aws_efs_file_system.eks-efs.id
  root_directory {
    path = "/workspace"
    creation_info {
      owner_gid   = 1000
      owner_uid   = 1000
      permissions = 755
    }
  }
  posix_user {
    gid = 1000
    uid = 1000
  }
  tags = {
    Name        = "eks-efs-tekton-access-point"
    Cost        = var.cost_tag
    Environment = "${var.environment}"
  }
}

resource "kubernetes_storage_class" "eks-efs-storage-class" {
  metadata {
    name = "eks-efs-storage-class"
  }
  storage_provisioner = "efs.csi.aws.com"
  reclaim_policy      = "Retain"
}

Note the EFS and access point IDs in the Terraform output when applying these changes; they'll be used in the PV and PVC definitions. My scripts gave the output:

fs-8a7eXXXX::fsap-0f60de28766XXXXXX
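
One way to surface that value is a Terraform output that joins the two IDs in the same format. This is a minimal sketch; the output name tekton_efs_volume_handle is made up for illustration.

```hcl
# Sketch only: prints "<file-system-id>::<access-point-id>" after apply,
# the same shape as the value shown above.
output "tekton_efs_volume_handle" {
  description = "EFS file system ID and access point ID, joined for use as a CSI volume handle"
  value       = "${aws_efs_file_system.eks-efs.id}::${aws_efs_access_point.eks-efs-tekton-access-point.id}"
}
```

Running terraform output tekton_efs_volume_handle after an apply then prints the combined ID.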

Installing Tekton

It's pretty simple from here on; the following command installs Tekton:

kubectl apply --filename https://storage.googleapis.com/tekton-releases/pipeline/latest/release.yaml

followed by the Tekton Dashboard (read-only install):

curl -sL https://raw.githubusercontent.com/tektoncd/dashboard/main/scripts/release-installer | \
   bash -s -- install latest --read-only

or, after downloading the read-only YAML from this GitHub link:

kubectl apply --filename tekton-dashboard-readonly.yaml

Next we set up the persistent volume. Refer to the EFS IDs generated by the Terraform run in your PV definition. Here's an example of a PV and PVC that will be used by a Maven task when running a Tekton pipeline:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: piglet-source-pv
  labels:
    type: piglet-source-pv
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: eks-efs-storage-class
  csi:
    driver: efs.csi.aws.com
    volumeHandle: fs-8a7eXXXX::fsap-0f60de28766XXXXXX
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: piglet-source-pvc
spec:
  selector:
    matchLabels:
      type: piglet-source-pv
  storageClassName: eks-efs-storage-class
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Gi
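
If you would rather not copy the IDs by hand, the same PV and PVC can be declared with the Terraform Kubernetes provider, which lets the volume handle reference the EFS resources directly. This is a minimal sketch under that assumption, not part of the original setup; the Terraform resource names are made up for illustration.

```hcl
# Sketch only: Terraform equivalents of the YAML above, with the volume
# handle built from the EFS resources instead of pasted IDs.
resource "kubernetes_persistent_volume" "piglet-source-pv" {
  metadata {
    name = "piglet-source-pv"
    labels = {
      type = "piglet-source-pv"
    }
  }
  spec {
    capacity = {
      storage = "1Gi"
    }
    access_modes                     = ["ReadWriteMany"]
    persistent_volume_reclaim_policy = "Retain"
    storage_class_name               = kubernetes_storage_class.eks-efs-storage-class.metadata[0].name
    persistent_volume_source {
      csi {
        driver        = "efs.csi.aws.com"
        volume_handle = "${aws_efs_file_system.eks-efs.id}::${aws_efs_access_point.eks-efs-tekton-access-point.id}"
      }
    }
  }
}

resource "kubernetes_persistent_volume_claim" "piglet-source-pvc" {
  metadata {
    name = "piglet-source-pvc"
  }
  spec {
    access_modes       = ["ReadWriteMany"]
    storage_class_name = kubernetes_storage_class.eks-efs-storage-class.metadata[0].name
    resources {
      requests = {
        storage = "1Gi"
      }
    }
    selector {
      match_labels = {
        type = "piglet-source-pv"
      }
    }
  }
  # Create the PV first so the claim has something to bind to.
  depends_on = [kubernetes_persistent_volume.piglet-source-pv]
}
```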

Conclusion

While the Tekton installation itself doesn't change (you're using a kubectl apply command as always), we have to be aware of how Fargate profiles are applied to workloads running on EKS Fargate. That means provisioning a Fargate profile that uses the existing Tekton labels as its selectors so that our tasks can run on Fargate. Other than that, we have to provision and configure a PV and PVC via EFS for tasks to use at runtime.

With those in place, we have a working Tekton installation on EKS Fargate with a truly on-demand way of running builds and CI/CD pipelines.