Rendering the TRUE Argo CD diff on your PRs

dag-andersen

Dag Andersen

Posted on May 23, 2024

TL;DR — The safest way to make changes to your Helm Charts and Kustomize Overlays is to let Argo CD render them for you. This can be done by spinning up an ephemeral cluster in your automated pipelines. This article presents a tool (argocd-diff-preview) for rendering manifest changes on pull requests. The rendered output is similar to what Atlantis creates for Terraform.

Problem

In the Kubernetes world, we often use templating tools like Kustomize and Helm to generate our Kubernetes manifests. These tools make maintaining and streamlining configuration easier across applications and environments. However, they also make it harder to visualize the application's actual configuration in the cluster.

Mentally parsing Helm templates and Kustomize patches is hard without rendering the actual output. Thus, making mistakes while modifying an application's configuration is relatively easy.

In the field of GitOps and infrastructure as code, all configurations are checked into Git and modified through PRs. The code changes in the PR are reviewed by a human, who needs to understand the changes made to the configuration. This is hard when the configuration is generated through templating tools like Kustomize and Helm.

If you are interested in a more detailed walkthrough of this problem, I recommend watching Nicholas Morey's talk at KubeCon 2024: "The Rendered Manifests Pattern: Reveal Your True Desired State".

This article introduces the tool argocd-diff-preview that solves this problem by rendering manifest changes directly on pull requests.

... but first, let's go through two simple examples where not rendering manifests can result in misconfiguration:

Helm misconfiguration example

Here we see an example of a developer trying to override the replica count on an Argo CD application:

helm misconfiguration example

This PR may look correct, but as a reviewer, you do not know if the value specified in the Helm Chart is named replicas: or replicaCount:. The code change has no effect if the value name is incorrect. Without rendering the Helm templates, the likelihood of these errors going to production is high.
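To make the failure mode concrete, here is a minimal sketch in Python. It is a simplified model of how overrides are merged on top of a chart's default values, not Helm's actual implementation, and the chart, its defaults, and the helper names (`merge_values`, `render_replicas`) are hypothetical. It assumes the chart's template reads `replicaCount`.

```python
# Simplified model of value merging; NOT Helm's real implementation.
# The chart defaults and the template lookup below are hypothetical.

def merge_values(defaults: dict, overrides: dict) -> dict:
    """Recursively merge user overrides on top of chart defaults."""
    merged = dict(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_values(merged[key], value)
        else:
            merged[key] = value
    return merged


def render_replicas(values: dict) -> int:
    """Stand-in for a template line like `replicas: {{ .Values.replicaCount }}`."""
    return values["replicaCount"]


chart_defaults = {"replicaCount": 1}

# The PR sets `replicas`, but this hypothetical chart only reads `replicaCount`.
wrong_override = {"replicas": 5}
right_override = {"replicaCount": 5}

rendered_wrong = render_replicas(merge_values(chart_defaults, wrong_override))
rendered_right = render_replicas(merge_values(chart_defaults, right_override))

print(rendered_wrong)  # the unknown key is merged in but never read
print(rendered_right)
```

In this model the misspelled key produces no error at all. The override is accepted and silently ignored, which is why only the rendered output reveals the mistake.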

Kustomize misconfiguration example

Here we see an example of a developer trying to set the replica count for both staging and production:

kustomize misconfiguration example

Again, this PR may look correct because the change happens in a base folder, so the change applies to all overlays (production and staging). But as a reviewer, you do not know if this value is overridden later down the chain of overlays.

~/someApp
├── base
│   ├── deployment.yaml        ⬅️ File changed in Pull Request
│   ├── kustomization.yaml
│   └── service.yaml
└── overlays
    ├── staging
    │   ├── cpu_count.yaml
    │   └── kustomization.yaml
    └── production
        ├── cpu_count.yaml
        ├── kustomization.yaml
        └── replica_count.yaml ⬅️ replicaCount overwritten here

This unintended result might not have been caught without rendering the final output for staging and production.
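The effect of the overlay chain can be sketched in a few lines of Python. This is a toy model that assumes patches are applied in order on top of the base and that the last one wins. It is not how Kustomize is implemented, and the field values and the helper `render_overlay` are made up for illustration.

```python
import copy

# Toy model of base + overlay patches; NOT Kustomize's real merge logic.
# All values below are hypothetical.

def render_overlay(base: dict, patches: list) -> dict:
    """Apply each patch on top of a copy of the base, in order (last one wins)."""
    rendered = copy.deepcopy(base)
    for patch in patches:
        rendered.update(patch)
    return rendered


# The PR changes the replica count in the base.
base = {"replicas": 5, "image": "my-app:1.0"}

overlays = {
    "staging": [{"cpu": "500m"}],                  # cpu_count.yaml
    "production": [{"cpu": "2"}, {"replicas": 3}],  # cpu_count.yaml, replica_count.yaml
}

rendered = {env: render_overlay(base, patches) for env, patches in overlays.items()}

for env, manifest in rendered.items():
    print(env, manifest["replicas"])
```

Under these assumptions staging picks up the new value from the base, while production keeps the value from its own patch, so the PR only changes one of the two environments.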

Other solutions to the problem

This problem has been pointed out many times in articles and tech talks about GitOps and infrastructure as code.

If you are interested in different approaches to solving the problem and their limitations, check out Kostis Kapelonis's article on the topic.

argocd-diff-preview is not the first tool that tries to tackle this problem. Other open-source repos include quizlet/argocd-diff-action and zapier/kubechecks.

  • quizlet/argocd-diff-action generates an Argo CD diff between the current PR and the current state of the cluster using the argocd app diff command. Thus, this tool needs the Argo CD applications to already be in sync with Git to be helpful. Applications that are out-of-sync on the Argo CD instance will be rendered as a diff on every PR. Additionally, you need to provide your CI pipeline with credentials to your Argo CD server, which may not be possible or desirable.

  • zapier/kubechecks is a system that you install on your cluster, which may not be desirable for organizations with strict security restrictions. The tool is complex but has many interesting features. Again, this tool requires access to your running Argo CD instance, which may not be possible or desirable.

argocd-diff-preview was created to avoid installing a tool directly on a cluster or providing it with credentials to your live Argo CD instance.


New solution: argocd-diff-preview

Goal

Create a tool that works like Atlantis for Terraform but for Argo CD. The tool should render a reliable diff of the configuration changes directly on the PR. Additionally, it should work without needing access to your existing infrastructure.

Instead of creating some scripts that try to mimic how Argo CD would render the manifests, why not let Argo CD render the manifests itself? This would ensure that the rendered manifests are exactly how Argo CD would render the manifests.

How it works

argocd-diff-preview spins up a local cluster, installs Argo CD, applies your Argo CD applications to the cluster, extracts the rendered manifests from Argo CD, and compares them to the manifests rendered from the main branch.

This tool runs an ephemeral local cluster inside Docker, so it does not need access to your infrastructure. It only needs read access to the Git repository and your Helm Charts (either stored in Git or a registry).

In other words, it follows these 10 steps:

  1. Start a local cluster
  2. Install Argo CD
  3. Add the required credentials (Git credentials, image pull secrets, etc.)
  4. Fetch all Argo CD application files on your PR branch
    • Point their targetRevision to the Pull Request branch
    • Remove the syncPolicy from the application (to prevent the application from syncing locally)
  5. Apply the modified applications to the cluster
  6. Let Argo CD do its magic
  7. Extract the rendered manifests from the Argo CD server
  8. Repeat steps 4–7 for the base branch (main branch)
  9. Create a diff between the manifests rendered from each branch
  10. Display the diff in the PR

The flow visualized:
the flow visualized
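Step 4 in the list above can be sketched as a small transformation on an Application manifest. The following Python is an illustration under stated assumptions, not the tool's actual code: the function name `patch_application` and the example repository URL are hypothetical, and limiting the change to sources that point at your own repository is my assumption about how to avoid touching external Helm Chart versions.

```python
import copy

# Illustrative sketch only; the tool's real patching logic may differ.

def patch_application(app: dict, repo_url: str, branch: str) -> dict:
    """Return a copy of the Application that renders `branch` and never auto-syncs."""
    patched = copy.deepcopy(app)
    spec = patched["spec"]

    # Single-source apps use spec.source, multi-source apps use spec.sources.
    sources = list(spec.get("sources", []))
    if "source" in spec:
        sources.append(spec["source"])

    for source in sources:
        # Assumption: only repoint sources that live in the repository under review.
        if source.get("repoURL") == repo_url:
            source["targetRevision"] = branch

    # Without a syncPolicy, Argo CD renders the manifests but does not apply them.
    spec.pop("syncPolicy", None)
    return patched


repo = "https://github.com/example-org/example-repo.git"  # hypothetical
app = {
    "apiVersion": "argoproj.io/v1alpha1",
    "kind": "Application",
    "metadata": {"name": "my-app"},
    "spec": {
        "source": {"repoURL": repo, "path": "apps/my-app", "targetRevision": "HEAD"},
        "syncPolicy": {"automated": {"prune": True}},
    },
}

patched = patch_application(app, repo, "my-feature-branch")
print(patched["spec"])
```

The original manifest is left untouched, so the same function can be called a second time with the base branch name to produce the other side of the comparison.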

Example

If you are asked for a review on a PR that looks like this:

pull request example

Then you can verify that it is configured correctly by checking the output generated by argocd-diff-preview. The output would look similar to this:

tool output example

Pros

  • Always renders the correct difference between branches because it is rendered by Argo CD itself.
  • Fully ephemeral cluster.
  • Does not access any of your existing infrastructure. It only requires read access to the Git repository and your Helm Charts.
  • Can be run locally before you open the pull request.
  • Supports multi-source applications.
  • Supports Argo CD Config Management Plugins (CMP).
  • Renders changes in resources from external sources (e.g., Helm Charts). For example, when you update the Helm Chart version of nginx, you can see what exactly changed - PR example.

Cons

  • It is slow. Spinning up a cluster and installing Argo CD takes a few minutes each run (see table below).

Comparing desired states - Not actual state

An important point to understand is that, unlike Atlantis or the argocd app diff CLI command, this approach doesn't compare the desired state in Git with the actual state in Kubernetes. Instead, it compares the desired states of two branches stored in Git. I would argue that this is better than comparing Git with the actual state in Kubernetes, because the actual state can change between runs, resulting in non-deterministic output. The actual state in Kubernetes can temporarily go out-of-sync with Git, and we don't want this to be highlighted in our diff preview. Developers who work with Atlantis experience this a lot: each time you run atlantis plan, it may produce a different result if the infrastructure changes often.
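The determinism argument can be illustrated with a short Python sketch. It uses the standard library's `difflib` to compare two rendered manifests. The manifests and the helper `render_diff` are hypothetical, and the tool itself may produce its diff differently.

```python
import difflib

# Sketch: the diff is a pure function of the two rendered outputs.
# The manifests below are made-up examples.

def render_diff(base: str, target: str,
                base_name: str = "main", target_name: str = "pull-request") -> str:
    """Return a unified diff between two rendered manifests."""
    lines = difflib.unified_diff(
        base.splitlines(keepends=True),
        target.splitlines(keepends=True),
        fromfile=base_name,
        tofile=target_name,
    )
    return "".join(lines)


base_manifest = """\
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 2
"""
target_manifest = base_manifest.replace("replicas: 2", "replicas: 5")

diff = render_diff(base_manifest, target_manifest)
print(diff)
```

Because both inputs come from Git, running the comparison again on the same two commits yields the same diff, regardless of what is currently happening in the live cluster.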

How to use it in GitHub Actions

Here is an example of how you would trigger argocd-diff-preview on your pull requests in GitHub Actions:

name: Argo CD Diff Preview

on:
  pull_request:
    branches:
      - main

jobs:
  render-diff:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      pull-requests: write

    steps:
      - uses: actions/checkout@v4
        with:
          path: pull-request

      - uses: actions/checkout@v4
        with:
          ref: main
          path: main

      - name: Generate Diff
        run: |
          docker run \
            --network=host \
            -v /var/run/docker.sock:/var/run/docker.sock \
            -v $(pwd)/main:/base-branch \
            -v $(pwd)/pull-request:/target-branch \
            -v $(pwd)/output:/output \
            -e TARGET_BRANCH=${{ github.head_ref }} \
            -e REPO=${{ github.repository }} \
            dagandersen/argocd-diff-preview:v0.0.23

      - name: Post diff as comment
        run: |
          gh pr comment ${{ github.event.number }} --repo ${{ github.repository }} --body-file output/diff.md --edit-last || \
          gh pr comment ${{ github.event.number }} --repo ${{ github.repository }} --body-file output/diff.md
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

Handling credentials

In the simple code example above, I do not provide argocd-diff-preview with any credentials, which only works if the Helm Chart registry and the Git repository are public. If you want to use this tool in a private repository, you need to provide the tool with the required credentials. More details on this can be found in the GitHub repository.

Output

On a successful run, the tool prints the following output:

✨ Running with:
✨ - base-branch: main
✨ - target-branch: helm-example-3
✨ - repo: dag-andersen/argocd-diff-preview
✨ - timeout: 180
🚀 Creating cluster...
🚀 Cluster created successfully
🦑 Installing Argo CD...
...
🤖 Patching applications for branch: main
🤖 Patching applications for branch: helm-example-3
🌚 Getting resources for base-branch
🌚 Getting resources for target-branch
...
🔮 Generating diff between main and helm-example-3
🙏 Please check the ./output/diff.md file for differences

If something is wrong with your configuration, it prints the Argo CD Application error message:

...
🤖 Patching 4 Argo CD Application[Sets] for branch: helm-example-3
🌚 Getting resources for target-branch
⏳ Waiting for 4 out of 4 applications to become 'OutOfSync'. Retrying in 5 seconds. Timeout in 180 seconds...
❌ Failed to process application, my-app, with error:
    Failed to load target state: failed to generate manifest for source 2 of 2: rpc error: code = Unknown desc = authentication required

Speed

The table below shows how the number of applications correlates with the time it takes to render them all:

Number of applications | 1  | 50  | 250 | 500
Seconds**              | 80 | 100 | 210 | 330

graph showing how the number of applications correlates with the time it takes to render them

Creating a cluster and installing Argo CD on it takes around 1 minute, which is why rendering a single application takes over a minute.

**The speed can vary depending on the distribution between applications used with Kustomize, Helm, and raw manifests. This test's result is based on a codebase mainly filled with Helm Charts.
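As a rough worked example, the numbers in the table can be split into a fixed startup cost and a marginal cost per application. The Python below only does arithmetic on the four measurements above. The helper name is hypothetical, and the result is a back-of-the-envelope estimate rather than a benchmark.

```python
# Measurements from the table: (number of applications, seconds).
measurements = [(1, 80), (50, 100), (250, 210), (500, 330)]


def marginal_seconds_per_app(points):
    """Extra seconds per additional application between neighboring measurements."""
    return [
        (t2 - t1) / (n2 - n1)
        for (n1, t1), (n2, t2) in zip(points, points[1:])
    ]


rates = marginal_seconds_per_app(measurements)
print([round(r, 2) for r in rates])  # [0.41, 0.55, 0.48]

# Average marginal cost across the whole measured range.
(first_n, first_t), (last_n, last_t) = measurements[0], measurements[-1]
overall = (last_t - first_t) / (last_n - first_n)
print(round(overall, 2))  # 0.5
```

In these measurements each additional application adds roughly half a second, so for small repositories the run time is dominated by the fixed cost of creating the cluster and installing Argo CD.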

Speeding up the rendering process

Rendering the manifests generated by all applications in the repository for each pull request can be slow. The tool offers various options to limit the number of applications rendered on each PR. You can choose applications based on label selectors, file paths, or by tracking specific file changes. For more information: [docs]
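To give an idea of what such filtering amounts to, here is a conceptual Python sketch. It does not use the tool's actual options or matching rules (see the docs for those). The application records, the `select_applications` helper, and the prefix-matching rule are all hypothetical.

```python
# Conceptual sketch of limiting which applications get rendered.
# This is NOT the tool's real selection logic or option syntax.

def select_applications(apps, changed_files, label_selector=None):
    """Keep apps that match every label in the selector and whose source
    path contains at least one of the changed files."""
    selected = []
    for app in apps:
        labels = app.get("labels", {})
        if label_selector and any(labels.get(k) != v for k, v in label_selector.items()):
            continue
        prefix = app["path"].rstrip("/") + "/"
        if any(changed.startswith(prefix) for changed in changed_files):
            selected.append(app["name"])
    return selected


apps = [
    {"name": "frontend", "path": "apps/frontend", "labels": {"team": "web"}},
    {"name": "frontend-admin", "path": "apps/frontend-admin", "labels": {"team": "web"}},
    {"name": "backend", "path": "apps/backend", "labels": {"team": "api"}},
]
changed_files = ["apps/frontend/values.yaml", "apps/backend/values.yaml"]

print(select_applications(apps, changed_files))
print(select_applications(apps, changed_files, label_selector={"team": "web"}))
```

A naive path match like this one misses changes in shared folders, such as a Kustomize base used by several overlays, so any real filter needs to be chosen with care.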


Conclusion

In conclusion, accurately visualizing Kubernetes configuration changes is essential in GitOps workflows, because reviewers can only catch the misconfigurations they can actually see.

argocd-diff-preview works like Atlantis for Terraform. The tool lets you render the diff on PRs, making it easier to review the changes made to the configuration. Since the diff is rendered by Argo CD itself, it is as accurate as possible.

In contrast to other existing solutions, argocd-diff-preview works without direct access to your infrastructure, which can be desirable for organizations with strict security requirements.

If you experience any issues with the tool, please open an issue on the repository.
