title: Dynamic Environment Per Branch with ArgoCD
date: 2023-02-25T14:00:00+01:00
draft: false
ShowToc: true
cover:
  image: cover.png
  caption: Dynamic Environment Per Branch with ArgoCD
  relative: false
  responsiveImages: false

[Do you remember?]({{< ref "dont-use-argocd-for-infrastructure" >}})

And using helmfile, I will install ArgoCD to my clusters, of course, because it's an awesome tool, without any doubt. But don't manage your infrastructure with it, because it's a part of your infrastructure, and it's a service that you provide to other teams. And I'll talk about it in one of the next posts.

Yes, I have written 4 posts where I was almost absolutely negative about ArgoCD. But I was talking about infrastructure then. I've got some ideas about how to describe it in a better way, but I think I will write another post about it.

Here, I want to talk about dynamic (preview) environments, and I'm going to describe how to create them using my blog as an example. My blog is a pretty simple application. From the Kubernetes perspective, it's just a container with some static content. And here you can already notice that static is the opposite of dynamic, so that's the first problem I'll have to tackle: turning static content into dynamic. My blog consists of markdown files that hugo uses to generate a web page.

Initially I was using hugo server to serve the static content, but it needs way more resources than nginx, so I've decided in favor of nginx.

I think that I'll write 2 or 3 posts about it, because it's too much to cover in only one. So here, I'll share how I prepared my blog for dynamic environments.

This is what my workflow looked like before I decided to use dynamic environments:

  • I'm editing hugo content while using hugo server locally
  • Pushing changes to a non-main branch
  • When everything is ready, I'm uploading pictures to the minio storage
  • And merging the non-main branch to main
  • Drone-CI downloads images from minio and builds a docker image with the latest tag
    • The first step is to generate static content with hugo
    • The second step is to put that static content into an nginx container
  • Drone-CI pushes the new image to my registry
  • Keel spots that the image was updated and pulls it
  • The pod with the static content is recreated, and I have my blog with new content

What don't I like about it? I can't test anything unless it's in production. And when I started to work on adding comments (which is still WIP), I understood that I'd like to have a real environment where I can test everything before firing the main pipeline. Even though a static development environment would be fine for me, because I'm the only one who does the development here, I don't like the concept of static envs, and I want to be able to work on different posts at the same time. Also, adding a new static environment for development purposes is about the same amount of work as implementing a solution for deploying them dynamically.

Before I can start deploying them, I have to prepare the application for that. At first glance, the required changes look like this:

  1. Container must not contain any static content
  2. I can't use only latest tags anymore
  3. Helm chart has a lot of stuff that's hard-coded
  4. CI pipelines must be adjusted
  5. Deployment process should be rethought

Static Container

Static content doesn't play well with dynamic environments. I'd even say it doesn't play at all. So at the very least I must stop defining the hostname for my blog at the build stage. One container should be able to run anywhere with the same result. So I've decided that instead of putting the generated static content into the nginx container at the build stage, I need to ship a container with the source code to Kubernetes, generate the static content there, and pass it to a container with nginx. Before, my deployment looked like this:

spec:
  containers:
    - image: git.badhouseplants.net/allanger/badhouseplants-net:latest
      imagePullPolicy: Always
      name: badhouseplants-net

And it was enough. Now it looks like this:

      containers:
      - image: nginx:latest
        imagePullPolicy: Always
        name: nginx
        ports:
        - containerPort: 80
          name: http
          protocol: TCP
        resources: {}
        volumeMounts:
        - mountPath: /var/www
          name: public-content
          readOnly: true
        - mountPath: /etc/nginx/conf.d
          name: nginx-config
          readOnly: true
      initContainers:
      - args:
        - --baseURL
        - https://dynamic-charts-dev.badhouseplants.net/
        image: git.badhouseplants.net/allanger/badhouseplants-net:d727a51c0443eb4194bdaebf8ab0e94c0f228b06
        imagePullPolicy: Always
        name: badhouseplants-net
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /src/static
          name: s3-data
          readOnly: true
        - mountPath: /src/public
          name: public-content
      restartPolicy: Always
      volumes:
      - emptyDir:
          sizeLimit: 1Gi
        name: public-content
      - configMap:
          defaultMode: 420
          name: nginx-config
        name: nginx-config

So in the init container I'm generating the static content (the --baseURL flag is templated with Helm) and putting the result into a directory that is mounted as an emptyDir volume. Later I'm mounting this folder into the container with nginx. Now I can use my docker image wherever I'd like with the same result. It doesn't depend on a hostname that was fixed during the build.

No more latest

Since I want to have my envs updated on each commit, I can't push only latest anymore. So I've decided to use the commit SHA as the tag for my images. But it means that I'll have a lot of them now, and having 300 MB of images and other media in each one becomes very painful. That means that I need to stop putting images directly into the container during the build. So instead of using rclone to get data from minio in a drone pipeline, I'm adding another init container to my deployment.

      initContainers:
      - args:
        - -c
        - rclone copy -P badhouseplants-public:/badhouseplants-static /static
        command:
        - sh
        env:
        - name: RCLONE_CONFIG
          value: /tmp/rclone.conf
        image: rclone/rclone:latest
        imagePullPolicy: Always
        name: rclone
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /tmp
          name: rclone-config
          readOnly: true
        - mountPath: /static
          name: s3-data
      volumes:
      - name: rclone-config
        secret:
          defaultMode: 420
          secretName: rclone-config
      - emptyDir:
          sizeLimit: 1Gi
        name: s3-data

And also, I'm mounting the s3-data volume to the hugo container, so it can generate my blog with all images.

Helm chart should be more flexible

I had to find all the values that should differ between environments. And it turned out there are not a lot of them:

  1. Istio VirtualServices hostnames (Or Ingress hostname, if you don't use Istio)
  2. Image tag for the container with the source code
  3. And a hostname that should be passed to hugo as a base URL
  4. Preview environments should display pages that are still drafts

So I've put all of that into values.yaml:

istio:
  hosts:
    - badhouseplants.net
hugo:
  image:
    tag: $COMMIT_SHA
  baseURL: https://badhouseplants.net/
  buildDrafts: false
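
To make the mapping from these values to the hugo call a bit more concrete, here is a minimal shell sketch. It assumes that the chart turns hugo.baseURL and hugo.buildDrafts into command-line arguments for the init container; the function name hugo_build_command is made up for illustration, while --baseURL and --buildDrafts are regular hugo flags.

```shell
#!/bin/sh
# Hypothetical helper: print the hugo command for one environment.
#   $1 - base URL of the environment (hugo.baseURL in values.yaml)
#   $2 - "true" to render draft pages (hugo.buildDrafts), defaults to "false"
hugo_build_command() {
  base_url="$1"
  build_drafts="${2:-false}"
  cmd="hugo --baseURL $base_url"
  # Preview environments should also show pages that are still drafts
  if [ "$build_drafts" = "true" ]; then
    cmd="$cmd --buildDrafts"
  fi
  printf '%s\n' "$cmd"
}
```

With the production values the function prints a plain hugo call with the production base URL, and with buildDrafts set to true it adds the drafts flag, so the same image can serve both kinds of environments.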

CI pipelines

Now I need to push a new image on each commit instead of pushing only once the code has made it to the main branch. But I also don't want to have something completely broken in my registry, because I'm self-hosting and therefore I care about storage. So before building and pushing an image, I need to test it:

# ---------------------------------------------------------------
# -- My Dockerfile is very small and easy, so it's not a problem 
# --  to duplicate its logic in a job. But I think that 
# --  a better way to implement this, would be to build an image
# --  with Dockerfile, run it, and push, if everything is fine
# ---------------------------------------------------------------
- name: Test a build
  image: klakegg/hugo
  commands:
    - hugo

- name: Build and push the docker image 
  image: plugins/docker
  settings: 
    registry: git.badhouseplants.net
    username: allanger
    password: 
      from_secret: GITEA_TOKEN
    repo: git.badhouseplants.net/allanger/badhouseplants-net
    tags: ${DRONE_COMMIT_SHA}

Now, if my code is not really broken, I'll have an image for each commit. And when I merge my branch to main, I can use the tag from the latest preview build for the production instance. So I'm almost sure that what I've tested before is what a visitor will see.

But with this kind of setup I've reached the docker pull limit pretty fast, so I've decided that I need to have a builder image in my registry too. Of course, it must be an automated action, but right off the bat I've just pushed the hugo image to my registry with the latest tag and created an issue to fix it later:

docker pull klakegg/hugo
docker tag klakegg/hugo git.badhouseplants.net/badhouseplants/hugo-builder
docker push git.badhouseplants.net/badhouseplants/hugo-builder

And I've updated my Dockerfile to look like this:

FROM git.badhouseplants.net/badhouseplants/hugo-builder
WORKDIR /src
COPY . /src
ENTRYPOINT ["hugo"]

How to deploy

Previously I was using the same helmfile that I use for everything else in my k8s cluster. It was fine for static envs, but when I need to deploy them dynamically, it's not an option anymore. And here ArgoCD enters the room. I'm creating an ApplicationSet that looks like this:

apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: badhouseplants-net
  namespace: argo-system
spec:
  generators:
    - list:
        elements:
          - name: application # just not to lose backward compatibility with the previous setup
            app: badhouseplants
            branch: main
            chart_version: 0.3.6
            # Image that is latest now, we'll get there later
            value: |
              hugo:
                image:
                  tag: latest              
            # And this is an example of an environment that I want to be created.
          - name: dynamic-charts
            app: badhouseplants
            branch: dynamic-charts
            chart_version: 0.3.6
            value: |
              istio:
                hosts:
                  - dynamic-charts-dev.badhouseplants.net
              hugo:
                image:
                  tag: 5d742a71731320883db698432303c92aee4d68a1
                baseURL: https://dynamic-charts-dev.badhouseplants.net/
                buildDrafts: true              
  template:
    metadata:
      name: "{{ app }}-{{ name }}"
      namespace: argo-system
    spec:
      project: "default"
      source:
        helm:
          valueFiles:
            - values.yaml
          values: "{{ value }}"
        repoURL: https://git.badhouseplants.net/api/packages/allanger/helm
        targetRevision: "{{ chart_version }}"
        chart: badhouseplants-net
      destination:
        server: "https://kubernetes.default.svc"
        namespace: "{{ app }}-{{ name }}"
      syncPolicy:
        syncOptions:
          - CreateNamespace=true

But I don't like the idea of storing something like that in the repository. So in git I'm putting something like this:

apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: badhouseplants-net
  namespace: argo-system
spec:
  generators:
    - list:
        elements:
          - name: application 
            app: badhouseplants
            branch: main
            chart_version: 0.3.6
            value: |
              hugo:
                image:
                  tag: $ARGO_IMAGE_TAG              
...

Since I'm not using latest anymore, I need to use a new tag every time a new image is pushed. But let's test with the preview env first:

# ./kube/template.yaml
...
- name: $ARGO_APP_BRANCH
  app: badhouseplants
  branch: $ARGO_APP_BRANCH
  chart_version: $ARGO_APP_CHART_VERSION
  value: |
    istio:
      hosts:
        - $ARGO_APP_HOSTNAME
    hugo:
      image:
        tag: $ARGO_APP_IMAGE_TAG
      baseURL: https://$ARGO_APP_HOSTNAME/
      buildDrafts: true    
...

And the logic that I would like to have in my setup is:

  • In the git repo there is only an ApplicationSet with the main instance (production)
  • After a new image is pushed to the registry, I'm getting this ApplicationSet as YAML and appending a new element to its list generator
  • Applying the new ApplicationSet and syncing the application using the argocd CLI tool

First, let's set environment variables:

- $ARGO_APP_BRANCH = $DRONE_BRANCH (I don't want to use $DRONE_BRANCH directly, in case I want to stop using Drone)
- $ARGO_APP_CHART_VERSION should be taken from the `./chart/Chart.yaml` file: `cat chart/Chart.yaml | yq '.version'`
- $ARGO_APP_HOSTNAME should look like this: "$DRONE_BRANCH-dev.badhouseplants.net"
- $ARGO_APP_IMAGE_TAG = $DRONE_COMMIT_SHA
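
One thing to watch out for with the hostname: a branch name like feature/New_Post is not a valid DNS label. Here is a small sketch, assuming you want to normalize the branch name before building the hostname. The preview_hostname helper is hypothetical and not part of the pipeline shown in this post.

```shell
#!/bin/sh
# Hypothetical helper: build a DNS-safe preview hostname from a branch name.
#   $1 - branch name (e.g. $DRONE_BRANCH)
#   $2 - base domain, defaults to badhouseplants.net
preview_hostname() {
  branch="$1"
  domain="${2:-badhouseplants.net}"
  # Lowercase, replace everything except a-z and 0-9 with '-',
  # collapse repeated dashes and trim dashes at both ends
  label=$(printf '%s' "$branch" \
    | tr '[:upper:]' '[:lower:]' \
    | tr -c 'a-z0-9' '-' \
    | sed -e 's/--*/-/g' -e 's/^-//' -e 's/-$//')
  printf '%s-dev.%s\n' "$label" "$domain"
}
```

A branch called dynamic-charts stays as it is, while feature/New_Post becomes feature-new-post-dev.badhouseplants.net. The sketch doesn't handle the 63 character limit of a DNS label, so very long branch names would still need extra care.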

So after setting all these variables, I can use envsubst < ./kube/template.yaml to create a correct generator element. After that, I only need to append it to the ApplicationSet that is already in k8s, and not append it if it's already there.
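
envsubst replaces unset variables with an empty string without complaining, so a typo in a variable name would silently produce a broken element. A minimal guard could look like the sketch below. require_vars is a hypothetical helper, not something the pipeline in this post uses.

```shell
#!/bin/sh
# Hypothetical helper: fail if any of the named variables is unset or empty.
# Usage: require_vars VAR_NAME [VAR_NAME ...]
require_vars() {
  missing=0
  for name in "$@"; do
    # Indirect lookup of the variable called $name (POSIX sh has no ${!name})
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

In the pipeline it could be called right before envsubst, for example: require_vars ARGO_APP_BRANCH ARGO_APP_CHART_VERSION ARGO_APP_HOSTNAME ARGO_APP_IMAGE_TAG || exit 1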

So my pipeline for a non-main branch looks like this:

- name: Deploy a preview ApplicationSet
  image: alpine/k8s:1.24.10
  when: 
    branch:
      exclude:
        - main
  environment:
    KUBECONFIG_CONTENT:
      from_secret: KUBECONFIG_CONTENT
  commands: 
    - mkdir $HOME/.kube
    - echo $KUBECONFIG_CONTENT | base64 -d > $HOME/.kube/config
    - apk update --no-cache && apk add yq gettext
    - export ARGO_APP_CHART_VERSION=`cat chart/Chart.yaml | yq '.version'`
    - export ARGO_APP_BRANCH=$DRONE_BRANCH
    - export ARGO_APP_HOSTNAME="${DRONE_BRANCH}-dev.badhouseplants.net"
    - export ARGO_APP_IMAGE_TAG=$DRONE_COMMIT_SHA
    - kubectl get -f ./kube/applicationset.yaml -o yaml  > /tmp/old_appset.yaml
    - yq "del(.spec.generators[].list.elements[] | select(.name == \"$ARGO_APP_BRANCH\"))" /tmp/old_appset.yaml > /tmp/clean_appset.yaml
    - envsubst  < ./kube/template.yaml > /tmp/elements.yaml
    - yq '.spec.generators[].list.elements += load("/tmp/elements.yaml")' /tmp/clean_appset.yaml > /tmp/new_appset.yaml
    - kubectl apply -f /tmp/new_appset.yaml

And even though it's very ugly, I already like it. Because it works.

Drone pipeline result

I would like to move the whole pipeline logic out of the .drone.yml file. But I will do it later.

After our ApplicationSet is deployed, we need to update the application that is created by it. I would like to use the argocd CLI tool for that. To sync a specific app, we need to use selectors, and I'd like to go with labels. So let's first add labels to our ApplicationSet:

...
  template:
    metadata:
      name: "{{ app }}-{{ name }}"
      namespace: argo-system
      labels:
        branch: "{{ name }}"
        application: "{{ app }}"
...

And now let's create a job like this:

- name: Sync application 
  image: argoproj/argocd
  environment:
    ARGOCD_SERVER: 
      from_secret: ARGOCD_SERVER
    ARGOCD_AUTH_TOKEN:
      from_secret: ARGOCD_AUTH_TOKEN
  commands:
    # The selector uses the label keys defined in the ApplicationSet template
    - argocd app sync -l application=badhouseplants,branch=$DRONE_BRANCH
    - argocd app wait -l application=badhouseplants,branch=$DRONE_BRANCH

And the last step is to remove an application when its branch is merged. It could be easy with GitLab, because there you can use environments and triggers for removing a branch (as far as I remember). But with Drone it seems to be harder, because Drone won't be triggered by a removed branch. Maybe a pull request trigger could be used for that, but I've found another way, which is obviously not necessarily the best one.

I've enabled only fast-forward merges to main, which means that after merging a pull request the commit will have the same SHA. So when merging to the main branch, I can use the commit hash to remove a generator element. It also means that if I have one commit deployed to several environments, I will remove more than I want. But I don't think that it will be a problem in my case. If you're not a lonely developer but a team, you may need to choose something else.

So I've added a new field to the preview generator element: commit_sha: $ARGO_APP_IMAGE_TAG, and then this command will do the trick: yq -i "del(.spec.generators[].list.elements[] | select(.commit_sha == \"$ARGO_APP_IMAGE_TAG\"))" /tmp/appset.yaml

I've created a file ./kube/main-template.yaml that looks like this:

- name: application
  app: badhouseplants
  branch: main
  chart_version: $ARGO_APP_CHART_VERSION
  value: |
    hugo:
      image:
        tag: $ARGO_APP_IMAGE_TAG    

And a job:

- name: Deploy a main ApplicationSet
  image: alpine/k8s:1.24.10
  when: 
    branch:
      - main
  environment:
    KUBECONFIG_CONTENT:
      from_secret: KUBECONFIG_CONTENT
  commands: 
    - mkdir $HOME/.kube
    - echo $KUBECONFIG_CONTENT | base64 -d > $HOME/.kube/config
    - apk update --no-cache && apk add yq gettext
    - export ARGO_APP_CHART_VERSION=`cat chart/Chart.yaml | yq '.version'`
    - export ARGO_APP_BRANCH=$DRONE_BRANCH
    - export ARGO_APP_IMAGE_TAG=$DRONE_COMMIT_SHA
    - kubectl get -f ./kube/applicationset.yaml -o yaml  > /tmp/old_appset.yaml
    - yq "del(.spec.generators[].list.elements[] | select(.name == \"$ARGO_APP_BRANCH\"))" /tmp/old_appset.yaml > /tmp/clean_appset1.yaml
    - yq "del(.spec.generators[].list.elements[] | select(.commit_sha == \"$ARGO_APP_IMAGE_TAG\"))" /tmp/clean_appset1.yaml > /tmp/clean_appset.yaml
    - envsubst  < ./kube/main-template.yaml > /tmp/elements.yaml
    - yq '.spec.generators[].list.elements += load("/tmp/elements.yaml")' /tmp/clean_appset.yaml > /tmp/new_appset.yaml
    - kubectl apply -f /tmp/new_appset.yaml

Also, I've found out that ArgoCD won't remove a namespace if it was created by a SyncPolicy, so I've added the namespace to the helm chart and added a new value to provide its name.

And a little bit more

  1. Since my storage capacity is a bit limited, I need to care about it. Hence, I can't store all the images there, and I had to come up with a cleanup solution. Removing images that are older than X days didn't seem to be an option, because if I'm not pushing to the registry for quite some time, my production image will be gone, and I won't be able to bring the blog back up quickly. So I've decided that I only need to store images with tags that still exist in the repo as a commit SHA. If I have a feature branch with 100 commits, I'll have 100 images, but when I squash it before merging, I will be left with only one. So when it's merged to main, I won't have to store 100 images forever.

     And I've decided to write a script for that. A Perl script. Why Perl? Because I like it and I didn't want to forget it completely. Also, bash seems a little bit too primitive for that, compiled languages (go, rust) seem to be overkill, and python I hate. So why not Perl?

     The initial plan was to create a scheduled job that gets all commit hashes from git, compares them to docker tags, and removes every tag that points to a non-existent commit. But the problem is that the Gitea package registry (at the time of writing, I am using Gitea 1.18.3) doesn't have an API to list all tags for an image (or I'm too dumb to find it). So I've decided to use the Drone API: get all commits and all Drone builds, compare builds to commits, and remove images from the registry for non-existent SHAs. But the problem is that Drone doesn't return all the builds, only the recent ones (and again, maybe I couldn't find how to do it). So the scheduled job may not work if I'm being very productive. So I've added a new step to the job instead. After syncing an Argo Application I'm running this script:

{{< details "In case you want to read it here:" >}}

#!/usr/bin/perl 
    
use strict; 
use warnings; 
# --------------------------------------
# -- Drone variables
# --------------------------------------
my $drone_url="$ENV{'DRONE_SYSTEM_PROTO'}://$ENV{'DRONE_SYSTEM_HOST'}";
my $drone_project=$ENV{'DRONE_REPO'};
my $drone_api="$drone_url/api/repos/$drone_project/builds";
# --------------------------------------
# -- Gitea variables
# --------------------------------------
my $gitea_url=$ENV{'GITEA_URL'} || 'https://git.badhouseplants.net/api/v1';
my $gitea_org=$ENV{'GITEA_ORG'} || 'badhouseplants';
my $gitea_package=$ENV{'GITEA_PACKAGE'} || 'badhouseplants-net';
my $gitea_api="$gitea_url/packages/$gitea_org/container/$gitea_package";
my $gitea_token=$ENV{'GITEA_TOKEN'};
my $gitea_user=$ENV{'GITEA_USER'} || $ENV{'DRONE_COMMIT_AUTHOR'};
# ---------------------------------------
# -- Get recent builds from drone-ci
# ---------------------------------------
my $builds = "curl -X 'GET' $drone_api -H 'accept: application/json' | jq -r '.[].after'";
my @builds_out = `$builds`;
chomp @builds_out;
# ---------------------------------------
# -- Get a list of all commits + 'latest'
# ---------------------------------------
my $commits = "git log --format=format:%H --all";
my @commits_out = `$commits`;
chomp @commits_out;
push @commits_out, 'latest';

# ---------------------------------------
# -- Compare builds to commits 
# -- And remove obsolete images from
# --  registry
# ---------------------------------------
foreach my $line (@builds_out)
{
    if ( ! grep( /^$line$/, @commits_out ) ) {
      my $cmd = "curl -X 'DELETE' -s \"$gitea_api/$line\"  -H 'accept: application/json' -u $gitea_user:$gitea_token || true";
      print "Removing ${line}\n\n";
      my $output = `$cmd`;
      print "$output \n";
    } 
}

{{< /details >}}

It's far from being perfect, but it works, and I like that I was able to finally use Perl somewhere.
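
The core of the script is a set difference: tags known to Drone minus commits known to git. For comparison, here is the same idea as a shell sketch. It only prints the tags that would be removed and doesn't call any API. The obsolete_tags function and both input files are assumptions made for illustration, and it uses grep -F -x (literal, whole-line match) instead of the regex match from the Perl version.

```shell
#!/bin/sh
# Print every tag from builds_file that is neither a known commit nor 'latest'.
#   $1 - builds_file:  one commit SHA per line (e.g. the '.after' field of Drone builds)
#   $2 - commits_file: one commit SHA per line (e.g. `git log --format=format:%H --all`)
obsolete_tags() {
  builds_file="$1"
  commits_file="$2"
  keep_file=$(mktemp)
  # awk 1 guarantees a trailing newline, so 'latest' lands on its own line
  { awk 1 "$commits_file"; echo latest; } > "$keep_file"
  # -F literal strings, -x whole line, -v invert: keep only tags without a commit
  grep -F -x -v -f "$keep_file" "$builds_file" | sort -u
  rm -f "$keep_file"
}
```

The output could then be fed line by line into the DELETE request from the Perl script. Printing first and deleting later also makes it easy to review what is going to disappear from the registry.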

  2. I want to have a manifest that I can apply in case of some kind of disaster recovery. It means that the ApplicationSet should contain enough information to deploy a production instance of my blog right off the bat. But I don't want to keep it up to date with every new commit hash. So I've decided to keep pushing latest to the registry, but only on main builds. So I can use the latest tag in the ApplicationSet stored in git, but during the application lifetime I'll keep using SHAs as tags. The only static hard-coded value in the ApplicationSet is the version of the Helm chart. I don't know how to automate it yet, but I'm sure that I will do it somehow. I know that it's a very common practice to store all Argo resources in git. But I don't see any sense in storing manifests for temporary environments that can be recreated by clicking a button in Drone or by pushing a new commit.

  3. Some more static data that I found later. I realized that I'm using a badge on the [About page]({{< ref "about" >}}), and it statically points only to the main branch, which doesn't make a lot of sense on envs built from other branches. But fortunately, big ups to the devs, hugo can use environment variables for setting site parameters. I've updated the badge, so it looks like this:

[![Build Status](https://drone.badhouseplants.net/api/badges/badhouseplants/badhouseplants-net/status.svg?ref=refs/heads/{{< param GitBranch >}})](https://drone.badhouseplants.net/badhouseplants/badhouseplants-net)

![Build Status](https://drone.badhouseplants.net/api/badges/badhouseplants/badhouseplants-net/status.svg?ref=refs/heads/{{< param GitBranch >}})

And then I'm setting the env var HUGO_PARAMS_GITBRANCH. And now the badge looks at its own branch.

What's not done yet

  1. I'm using Minio as storage for pictures, and currently all pictures (and other files) are stored in one folder regardless of the environment. I would like to have something like this:
    • On the first commit to a branch, sync pictures from the main dir to a new one
    • On the next commits, if pictures are added, copy them to the new dir only
    • When the branch is merged, pictures from the branch should be synced to the main dir

  2. Since I don't really have static content anymore, I can't be 100% sure that the content generated at run time is what I expect. So I'd like to add a UI test that is executed after the pod with nginx is started and that is used as a startupProbe. If the content doesn't pass the test, the pod never becomes ready and traffic keeps going to the older version.

  3. A lot of logic that is put into the .drone.yml file should be moved out of it. Maybe to scripts, or to a Makefile. But I don't think it's an important thing for this post, so I've decided not to care about it now.

Some kind of conclusion

Even though my application is just a simple blog, I still believe that creating dynamic environments is a great idea that should totally replace static dev'n'stages. And my blog is not the only thing I've created dynamic envs for. The two biggest pains, I think, are static content and persistent data (there are more, but these two are the most obvious). I've already shown an example of how you can handle the first one, and the second is also a big pain in the ass. In my case this data is the one coming from Minio, and I'm not doing anything about it yet, but I'll write one more post when it's solved. Another, in my opinion more obvious, example is databases. You need a database to contain all the data that's required for testing, but you may also want it not to be huge, and it most probably should not contain any sensitive personal data. So maybe you could stream a database from production through some kind of anonymizer and clean it up so it's not too big. And that already doesn't sound easy. But if I ever have to add something like that to my blog, I'll try to describe it.

Thanks,

Oi!