Argo CD Architecture

Argo CD is a multi-component GitOps platform for Kubernetes. The Application Controller continuously reconciles desired state (from Git) against live cluster state, the Repository Server generates manifests from various configuration management tools, and the API Server exposes management endpoints to the Web UI and CLI. Redis provides caching, and Dex handles SSO authentication. The system is designed for high availability with horizontal scaling via controller sharding.

High-Level Architecture

graph TD
  User[User / CI System]
  CLI[argocd CLI]
  UI[Web UI<br/>React SPA]
  API[API Server<br/>server/server.go<br/>REST + gRPC]
  Dex[Dex OIDC<br/>SSO Provider]
  Redis[Redis<br/>Cache Layer]
  AC[Application Controller<br/>controller/appcontroller.go]
  RS[Repo Server<br/>reposerver/server.go]
  GE[GitOps Engine<br/>gitops-engine/]
  Git[Git Repositories]
  Helm[Helm Repos / OCI]
  K8sTarget[Target Clusters]
  K8sHost[Host Cluster<br/>CRDs: Application, AppProject]
  ASC[ApplicationSet Controller<br/>applicationset/]
  NC[Notification Controller<br/>notification_controller/]

  User -->|kubectl / PR| Git
  User --> CLI
  User --> UI
  CLI -->|REST/gRPC| API
  UI -->|REST/gRPC| API
  API -->|Auth| Dex
  API -->|Session/Cache| Redis
  API -->|Watch/Update| K8sHost
  AC -->|Watch CRDs| K8sHost
  AC -->|GenerateManifest| RS
  AC -->|Diff/Sync| GE
  AC -->|Cache| Redis
  GE -->|Apply/Watch| K8sTarget
  RS -->|Clone/Pull| Git
  RS -->|Fetch Charts| Helm
  RS -->|Cache| Redis
  ASC -->|Create Applications| K8sHost
  ASC -->|Watch AppSets| K8sHost
  NC -->|Watch Apps| K8sHost
  NC -->|Send Alerts| User

This diagram illustrates the core flow: users interact via the CLI/UI through the API Server, while the Application Controller orchestrates reconciliation, fetching manifests from the Repo Server and applying them via the GitOps Engine. All Argo CD state (Applications, AppProjects) is stored as Kubernetes CRDs on the host cluster; there is no separate database.

Component Breakdown

Application Controller

Responsibility: The heart of Argo CD. Watches Application and AppProject CRDs, computes the difference between desired (Git) and live (cluster) state, executes sync operations, and updates Application status. Uses multiple rate-limited work queues for different operation types (refresh, sync, hydration).

Key Files/Directories:

Interfaces: Watches K8s CRDs via client-go informers, calls Repo Server via gRPC for manifest generation, delegates diff/apply to GitOps Engine, stores cache in Redis.
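
The event-to-queue pattern can be sketched in plain Go. This is a simplified stand-in for client-go's rate-limited workqueue, not Argo CD's actual types: the point is that duplicate informer events for the same Application collapse into a single pending refresh.

```go
package main

import "fmt"

// dedupQueue collapses duplicate enqueues of the same key while it is
// pending, mirroring the behavior of the client-go workqueue used by the
// Application Controller (simplified sketch; not the real API).
type dedupQueue struct {
	pending map[string]bool
	order   []string
}

func newDedupQueue() *dedupQueue {
	return &dedupQueue{pending: map[string]bool{}}
}

func (q *dedupQueue) Add(key string) {
	if q.pending[key] {
		return // already queued: one refresh will cover all events
	}
	q.pending[key] = true
	q.order = append(q.order, key)
}

func (q *dedupQueue) Get() (string, bool) {
	if len(q.order) == 0 {
		return "", false
	}
	key := q.order[0]
	q.order = q.order[1:]
	delete(q.pending, key)
	return key, true
}

func main() {
	q := newDedupQueue()
	// Informer events: three updates to the same Application, one to another.
	q.Add("argocd/guestbook")
	q.Add("argocd/guestbook")
	q.Add("argocd/billing")
	q.Add("argocd/guestbook")
	for key, ok := q.Get(); ok; key, ok = q.Get() {
		fmt.Println("refresh", key) // stand-in for processAppRefreshQueueItem
	}
}
```

Only two refreshes are processed despite four events, which is what keeps the controller's reconcile load bounded under event storms.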

Repository Server

Responsibility: Stateless manifest generation service. Clones Git repositories, runs configuration management tools (Helm template, Kustomize build, Jsonnet), and returns rendered Kubernetes manifests. Implements aggressive caching to avoid redundant generation.

Key Files/Directories:

Interfaces: Accepts gRPC requests from the Application Controller. Interacts with Git (via the git CLI and go-git), Helm registries, and OCI registries. Uses Redis for distributed caching across replicas.
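
The caching idea can be illustrated with a deterministic cache key. This is a hypothetical sketch: the real repo-server key also folds in tool settings (Helm values, Kustomize options, etc.), but the principle is that identical (repo, revision, path) inputs hit the cache while any new revision misses.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// manifestCacheKey derives a deterministic cache key for rendered manifests.
// Hypothetical sketch: the actual repo-server key also includes app source
// parameters so two apps with different Helm values never share an entry.
func manifestCacheKey(repoURL, revision, path string) string {
	h := sha256.New()
	fmt.Fprintf(h, "%s|%s|%s", repoURL, revision, path)
	return fmt.Sprintf("mfst|%x", h.Sum(nil)[:8])
}

func main() {
	k1 := manifestCacheKey("https://github.com/argoproj/argocd-example-apps", "a1b2c3d", "guestbook")
	k2 := manifestCacheKey("https://github.com/argoproj/argocd-example-apps", "a1b2c3d", "guestbook")
	k3 := manifestCacheKey("https://github.com/argoproj/argocd-example-apps", "ffffff0", "guestbook")
	fmt.Println(k1 == k2, k1 == k3) // same inputs share a key; a new revision does not
}
```

Because the key is content-addressed, any repo-server replica can serve a hit from the shared Redis cache, which is what makes the service safely stateless and horizontally scalable.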

API Server

Responsibility: User-facing gateway. Multiplexes HTTP REST and gRPC on a single port via cmux. Handles authentication (Dex SSO, local users, JWT), authorization (Casbin RBAC via AppProjects), and serves the React Web UI as static assets.

Key Files/Directories:

Interfaces: Serves CLI and Web UI via REST/gRPC. Authenticates via Dex (OIDC) or local users. Reads/writes Application CRDs to the host Kubernetes cluster. Proxies some requests to the Application Controller.

GitOps Engine

Responsibility: Reusable library providing the core GitOps primitives: resource caching, diff computation (three-way merge), sync execution with waves/hooks, and health assessment. Developed as a separate Go module for potential reuse by other GitOps tools.

Key Files/Directories:

Interfaces: Used as a Go library by the Application Controller. Talks directly to target Kubernetes clusters via client-go for watching resources and applying manifests.
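
Why a three-way (rather than two-way) diff matters can be shown with flat string maps standing in for nested Kubernetes objects. This simplified sketch is not the GitOps Engine's real implementation, but it captures the core logic: the last-applied state is what lets the engine distinguish "field we set that Git dropped" (prune it) from "field someone else set out-of-band" (leave it alone).

```go
package main

import "fmt"

// threeWayDiff compares desired (Git), live (cluster), and lastApplied
// (previous sync) state. Flat string maps stand in for nested Kubernetes
// objects (simplified sketch of the three-way merge idea).
func threeWayDiff(desired, live, lastApplied map[string]string) (set map[string]string, del []string) {
	set = map[string]string{}
	for k, v := range desired {
		if live[k] != v {
			set[k] = v // field drifted or is new
		}
	}
	for k := range lastApplied {
		if _, still := desired[k]; !still {
			del = append(del, k) // we set it before; Git no longer wants it
		}
	}
	return set, del
}

func main() {
	desired := map[string]string{"replicas": "3", "image": "app:v2"}
	live := map[string]string{"replicas": "3", "image": "app:v1", "node-selector": "ops-added"}
	lastApplied := map[string]string{"replicas": "3", "image": "app:v1", "legacy-label": "x"}
	set, del := threeWayDiff(desired, live, lastApplied)
	// image must be updated; legacy-label must be pruned; node-selector
	// (added out-of-band, never managed by us) is left untouched.
	fmt.Println(set, del)
}
```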

ApplicationSet Controller

Responsibility: Watches ApplicationSet CRDs and generates Application CRDs from templates. Supports generators (Git directory, cluster list, PR events, SCM providers, matrix/merge combinators) for dynamic, policy-driven Application creation.

Key Files/Directories:

Interfaces: Watches ApplicationSet CRDs, creates/updates/deletes Application CRDs. Uses Git, SCM APIs, and cluster information as generator inputs.
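
The matrix combinator reduces to a cartesian product over parameter sets. A minimal sketch (real generators emit richer parameters and feed a full Application template, which this omits):

```go
package main

import "fmt"

// matrixGenerator combines two parameter lists into their cartesian product,
// the way the ApplicationSet matrix generator combines two child generators
// (simplified sketch; field names here are illustrative).
func matrixGenerator(a, b []map[string]string) []map[string]string {
	var out []map[string]string
	for _, pa := range a {
		for _, pb := range b {
			merged := map[string]string{}
			for k, v := range pa {
				merged[k] = v
			}
			for k, v := range pb {
				merged[k] = v
			}
			out = append(out, merged)
		}
	}
	return out
}

func main() {
	clusters := []map[string]string{{"cluster": "us-east"}, {"cluster": "eu-west"}}
	apps := []map[string]string{{"path": "guestbook"}, {"path": "billing"}}
	for _, p := range matrixGenerator(clusters, apps) {
		// Each parameter set is rendered into one Application CRD.
		fmt.Printf("Application %s-%s\n", p["cluster"], p["path"])
	}
}
```

Two clusters times two paths yields four generated Applications, which is the "policy-driven" fan-out the section describes.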

Notification Controller

Responsibility: Watches Application state changes and sends notifications (Slack, email, webhook, etc.) based on configurable triggers and templates.

Key Files/Directories:

Interfaces: Watches Application CRDs for status changes. Sends notifications via configured backends (ConfigMap-driven).

Data Flow

Typical Application sync cycle from Git commit to cluster deployment:

sequenceDiagram
  participant Git as Git Repository
  participant RS as Repo Server
  participant AC as Application Controller
  participant GE as GitOps Engine
  participant Cache as Redis Cache
  participant K8s as Target Cluster

  Note over AC: Periodic resync (default 180s)<br/>or webhook trigger
  AC->>Cache: Check cached manifest revision
  alt Cache miss or stale
    AC->>RS: GenerateManifest(repo, revision, path)
    RS->>Git: git clone/fetch (shallow)
    Git-->>RS: Source files
    RS->>RS: helm template / kustomize build
    RS-->>AC: Target manifests
    RS->>Cache: Store result
  else Cache hit
    Cache-->>AC: Cached manifests
  end
  AC->>GE: CompareAppState(target, live)
  GE->>K8s: Get live resources (from cache)
  GE-->>AC: Diff result (synced/out-of-sync)
  AC->>AC: Update Application.Status
  alt Sync requested (manual or auto-sync)
    AC->>GE: Sync(target manifests)
    GE->>GE: Plan waves + hooks
    loop For each sync wave
      GE->>K8s: Apply resources (kubectl apply)
      K8s-->>GE: Result
      GE->>GE: Wait for health
    end
    GE-->>AC: Sync complete
    AC->>AC: Update Operation status
  end

On Application/AppProject changes: CRD update -> informer event -> work queue -> controller processing -> status update -> API Server serves to UI.
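
The wave planning step in the sequence above (Plan waves + hooks) amounts to bucketing resources by their argocd.argoproj.io/sync-wave annotation and applying the buckets in ascending order. A simplified sketch of that grouping:

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
)

// resource is a minimal stand-in for a manifest carrying the
// argocd.argoproj.io/sync-wave annotation (missing annotation means wave 0).
type resource struct {
	name string
	anns map[string]string
}

func wave(r resource) int {
	n, err := strconv.Atoi(r.anns["argocd.argoproj.io/sync-wave"])
	if err != nil {
		return 0
	}
	return n
}

// groupByWave buckets resources into ascending waves; each wave is applied
// and awaited for health before the next begins (simplified sketch of the
// GitOps Engine's sync planning, hooks omitted).
func groupByWave(rs []resource) [][]resource {
	buckets := map[int][]resource{}
	var waves []int
	for _, r := range rs {
		w := wave(r)
		if _, seen := buckets[w]; !seen {
			waves = append(waves, w)
		}
		buckets[w] = append(buckets[w], r)
	}
	sort.Ints(waves)
	var out [][]resource
	for _, w := range waves {
		out = append(out, buckets[w])
	}
	return out
}

func main() {
	rs := []resource{
		{"deployment/app", map[string]string{}},
		{"job/db-migrate", map[string]string{"argocd.argoproj.io/sync-wave": "-1"}},
		{"service/app", map[string]string{}},
	}
	for i, w := range groupByWave(rs) {
		fmt.Printf("wave %d: ", i)
		for _, r := range w {
			fmt.Print(r.name, " ")
		}
		fmt.Println()
	}
}
```

The migration Job's wave of -1 puts it in an earlier bucket than the unannotated Deployment and Service, so it runs and completes before they are applied.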

Key Design Decisions

  • Git as Source of Truth: All desired state lives in Git, providing audit trail, rollback (revert commit), and access control (PR reviews). Trade-off: Requires all configuration to be declarative; imperative operations (e.g., kubectl edit) create drift that Argo CD will detect and optionally revert.

  • Level-Triggered Reconciliation: The controller re-examines the full state on every cycle (not just reacting to events). Clever: Tolerant of missed events, network partitions, and controller restarts. Trade-off: Higher baseline resource usage compared to pure event-driven systems, mitigated by caching and configurable resync intervals.

  • Multi-Component Microservices: Separate processes for API, controller, repo-server, and Redis enable independent scaling. Repo-server scales horizontally for manifest generation throughput; controller scales via sharding. Trade-off: Operational complexity (5+ processes), mitigated by Helm chart and operator packaging.

  • Controller Sharding (controller/sharding/): Applications are distributed across controller replicas by cluster or round-robin. Dynamic redistribution on replica changes. Pattern: Consistent hashing for stable assignment. Trade-off: Split-brain risk if sharding metadata is stale.
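
A hash-based shard assignment can be sketched as follows. This is an illustrative simplification, not Argo CD's exact algorithm: the real controller supports several distribution strategies (including round-robin), but the core idea is that a cluster deterministically maps to one replica.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// shardForCluster assigns a cluster to a controller replica by hashing its
// server URL modulo the replica count (sketch of hash-based distribution;
// Argo CD also offers round-robin and other sharding algorithms).
func shardForCluster(server string, replicas int) int {
	h := fnv.New32a()
	h.Write([]byte(server))
	return int(h.Sum32() % uint32(replicas))
}

func main() {
	clusters := []string{
		"https://kubernetes.default.svc",
		"https://prod-us.example.com",
		"https://prod-eu.example.com",
	}
	for _, c := range clusters {
		// Each replica only reconciles Applications whose destination
		// cluster hashes to its own shard number.
		fmt.Printf("%-35s -> shard %d\n", c, shardForCluster(c, 3))
	}
}
```

Because the mapping is a pure function of the cluster identity and replica count, every replica computes the same assignment independently; the split-brain risk mentioned above arises when replicas disagree about the replica count or cluster list.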

  • Pluggable Manifest Generation: Repo Server supports Helm, Kustomize, Jsonnet, plain YAML, and custom Config Management Plugins (CMP). Pattern: Strategy pattern with tool-specific adapters. Trade-off: CMP sidecar model adds pod complexity but isolates untrusted plugins.

  • Casbin RBAC via AppProjects: Fine-grained access control (which repos, clusters, namespaces, resource kinds) using Casbin policy engine. Clever: Projects are CRDs, so RBAC policies are version-controlled. Trade-off: Complex policy expressions can be hard to debug.
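
To make the policy shape concrete, here is a toy matcher for a Casbin-style rule such as p, role:dev, applications, sync, myproj/*. This is a hand-rolled sketch, not Casbin's engine: real Casbin evaluates a configurable matcher expression and richer glob patterns.

```go
package main

import (
	"fmt"
	"strings"
)

// policy is one simplified rule: subject, resource, action, object,
// e.g. {"role:dev", "applications", "sync", "myproj/*"}.
type policy struct {
	subject, resource, action, object string
}

// globMatch supports only a trailing "*" wildcard (toy version of
// Casbin's keyMatch-style matchers).
func globMatch(pattern, value string) bool {
	if strings.HasSuffix(pattern, "*") {
		return strings.HasPrefix(value, strings.TrimSuffix(pattern, "*"))
	}
	return pattern == value
}

// allowed returns true if any policy grants the request (deny-by-default).
func allowed(pols []policy, subject, resource, action, object string) bool {
	for _, p := range pols {
		if p.subject == subject &&
			p.resource == resource &&
			globMatch(p.action, action) &&
			globMatch(p.object, object) {
			return true
		}
	}
	return false
}

func main() {
	pols := []policy{{"role:dev", "applications", "sync", "myproj/*"}}
	fmt.Println(allowed(pols, "role:dev", "applications", "sync", "myproj/guestbook"))
	fmt.Println(allowed(pols, "role:dev", "applications", "delete", "myproj/guestbook"))
}
```

The deny-by-default loop is the part worth noting: a request is permitted only if some policy line explicitly matches it, which is also why overly broad globs are the usual source of hard-to-debug policies.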

  • Resource Health via Lua Scripts: Health assessment for custom resources is defined in Lua scripts under resource_customizations/, loaded at runtime. Clever: Extensible without recompilation, community-maintained. Trade-off: Lua execution overhead (minimal) and potential for incorrect health scripts.

This design prioritizes reliability (level-triggered, idempotent), auditability (Git-based), and extensibility (pluggable tools, Lua health). For deep dives, trace processAppRefreshQueueItem in the controller or GenerateManifest in the repo server.