Application Controller Deep-Dive
What This Component Does
The Application Controller is the core reconciliation engine of Argo CD. It continuously watches Application CRDs, compares the desired state (defined in Git repositories) against the live state in target Kubernetes clusters, and executes sync operations to bring them into alignment. It is the component that makes Argo CD a GitOps tool rather than just a deployment dashboard.
The controller manages the full lifecycle of Application resources: periodic state comparison, health assessment, auto-sync with self-healing, sync operation execution with waves and hooks, and status reporting. It communicates with the Repo Server (via gRPC) for manifest generation and delegates diff/apply operations to the embedded GitOps Engine.
Deployment notes: Runs as a Deployment in the `argocd` namespace. Scale horizontally by increasing replicas with controller sharding enabled (`--sharding-algorithm`). Configure the resync interval via `--app-resync` (default 120s) and self-heal retries via `--self-heal-timeout` (default 5m).
How It Works
The controller uses the Kubernetes informer pattern with multiple specialized work queues. Each queue handles a different concern (refresh, sync, project updates, hydration), with rate limiters to prevent thundering herd. The main reconciliation function is processAppRefreshQueueItem, which fetches manifests, computes diffs, and updates status.
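The deduplication behavior of these queues matters: any number of informer events for the same application collapse into a single pending item, so a burst of updates triggers one reconciliation, not many. A minimal sketch of that semantic in plain Go (the real controller uses client-go's `workqueue` package; this toy queue only illustrates the coalescing behavior):

```go
package main

import (
	"fmt"
	"sync"
)

// dedupQueue illustrates the semantic the controller relies on from
// client-go's workqueue: adding a key that is already pending is a no-op,
// so each application is processed at most once per burst of events.
type dedupQueue struct {
	mu      sync.Mutex
	pending map[string]bool
	order   []string
}

func newDedupQueue() *dedupQueue {
	return &dedupQueue{pending: map[string]bool{}}
}

// Add enqueues key unless it is already waiting to be processed.
func (q *dedupQueue) Add(key string) {
	q.mu.Lock()
	defer q.mu.Unlock()
	if q.pending[key] {
		return // duplicate event: coalesced into the pending item
	}
	q.pending[key] = true
	q.order = append(q.order, key)
}

// Get pops the oldest pending key; ok is false when the queue is empty.
func (q *dedupQueue) Get() (key string, ok bool) {
	q.mu.Lock()
	defer q.mu.Unlock()
	if len(q.order) == 0 {
		return "", false
	}
	key = q.order[0]
	q.order = q.order[1:]
	delete(q.pending, key)
	return key, true
}

func main() {
	q := newDedupQueue()
	q.Add("argocd/guestbook")
	q.Add("argocd/guestbook") // second informer event for the same app
	q.Add("argocd/helm-app")
	for {
		key, ok := q.Get()
		if !ok {
			break
		}
		fmt.Println("process", key)
	}
}
```

The production queues add rate limiting and retry backoff on top of this; the key-level coalescing shown here is what prevents a thundering herd of redundant reconciliations.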
Internal Flow Diagram
```mermaid
flowchart TD
    A[Informer Event<br/>Application CRD changed] --> B{Event Type}
    B -->|Spec changed| C[appRefreshQueue<br/>Rate-limited enqueue]
    B -->|Operation set| D[appOperationQueue<br/>Rate-limited enqueue]
    B -->|Project changed| E[projectRefreshQueue]
    C --> F[processAppRefreshQueueItem]
    F --> G[Load AppProject<br/>Validate permissions]
    G --> H{Cache hit?<br/>Redis: app+rev+params}
    H -->|Yes| I[Use cached comparison]
    H -->|No| J[Call Repo Server<br/>gRPC: GenerateManifest]
    J --> K[Receive target manifests]
    K --> L[GitOps Engine<br/>CompareAppState<br/>Three-way diff]
    I --> L
    L --> M[Compute health status<br/>Per-resource + aggregate]
    M --> N[Update Application.Status<br/>SyncStatus, HealthStatus,<br/>ResourceStatuses]
    N --> O{Auto-sync enabled?<br/>SyncPolicy.Automated}
    O -->|Yes + OutOfSync| P[Enqueue to<br/>appOperationQueue]
    O -->|No| Q[Done - wait for<br/>next resync cycle]
    D --> R[processAppOperationQueueItem]
    R --> S[Read Operation from CRD]
    S --> T[Call Repo Server<br/>GenerateManifest]
    T --> U[GitOps Engine: Sync<br/>Plan waves + hooks]
    U --> V[Apply resources<br/>per wave to cluster]
    V --> W[Update Operation status<br/>succeeded/failed]
    W --> X[Clear Operation field]
    style F fill:#f9f
    style R fill:#f9f
```
Step-by-Step Process (Refresh Cycle):
- Event Detection: Kubernetes informers detect Application CRD changes (create, update, delete), or the periodic resync timer fires (configurable via `--app-resync`, default 120s).
- Queue Routing: Events are routed to the appropriate work queue based on type. Keyed deduplication ensures each application is processed at most once at a time.
- Project Validation: The controller loads the referenced AppProject to validate that the Application's source repo, destination cluster/namespace, and resource kinds are permitted.
- Cache Check: Checks the Redis cache for a previous comparison result matching the current `(app, revision, parameters)` tuple. On a hit, manifest generation is skipped.
- Manifest Generation: On a cache miss, calls the Repo Server via gRPC `GenerateManifest`. The repo server clones the Git repo (or fetches from cache) and runs the configured tool (Helm, Kustomize, etc.).
- State Comparison: Passes target manifests and live cluster state to the GitOps Engine's `CompareAppState`. Uses a three-way merge diff to identify additions, modifications, and deletions.
- Health Assessment: Computes per-resource health using built-in rules or custom Lua scripts from `resource_customizations/`, then aggregates to application-level health.
- Status Update: Writes comparison results to `Application.Status`: `SyncStatus` (Synced/OutOfSync), `HealthStatus` (Healthy/Degraded/Progressing), and per-resource statuses.
- Auto-Sync Decision: If `SyncPolicy.Automated` is set and the app is OutOfSync, enqueues a sync operation.
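Two of the decision points above are simple enough to sketch directly: the cache key that lets the controller skip manifest generation, and the final auto-sync check. This is an illustrative sketch, not Argo CD's actual key format or function names:

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// comparisonKey sketches the idea behind the Redis cache check: if the
// (app, revision, parameters) tuple is unchanged, the previous comparison
// result can be reused and the repo-server gRPC call skipped. The key
// layout here is hypothetical.
func comparisonKey(app, revision, params string) string {
	sum := sha256.Sum256([]byte(app + "|" + revision + "|" + params))
	return fmt.Sprintf("cmp|%s|%x", app, sum[:8])
}

// shouldAutoSync mirrors the decision at the end of the refresh cycle:
// enqueue a sync operation only when automated sync is configured and the
// comparison reported OutOfSync.
func shouldAutoSync(automated bool, syncStatus string) bool {
	return automated && syncStatus == "OutOfSync"
}

func main() {
	k1 := comparisonKey("guestbook", "abc123", "env=prod")
	k2 := comparisonKey("guestbook", "abc123", "env=prod")
	fmt.Println("cache hit:", k1 == k2)
	fmt.Println("auto-sync:", shouldAutoSync(true, "OutOfSync"))
}
```

Because the revision and parameters are part of the key, any Git commit or Helm/Kustomize parameter change naturally invalidates the cached comparison.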
Key Code Paths
ApplicationController Struct
controller/appcontroller.go - The central struct managing all controller state:
```go
type ApplicationController struct {
	// Kubernetes clients
	kubeClientset        kubernetes.Interface
	applicationClientset appclientset.Interface

	// Informers for CRD watching
	appInformer  cache.SharedIndexInformer
	projInformer cache.SharedIndexInformer

	// Work queues (rate-limited)
	appRefreshQueue     workqueue.RateLimitingInterface
	appOperationQueue   workqueue.RateLimitingInterface
	projectRefreshQueue workqueue.RateLimitingInterface

	// Service dependencies
	repoClientset apiclient.Clientset       // gRPC to repo server
	db            db.ArgoDB                 // cluster/repo credentials
	settingsMgr   *settings.SettingsManager
	stateCache    statecache.LiveStateCache // gitops-engine cluster cache

	// Sharding
	clusterSharding sharding.ClusterShardingAlgorithm

	// Configuration
	statusRefreshTimeout     time.Duration // Default: 120s
	statusHardRefreshTimeout time.Duration // Default: 0 (disabled)
	selfHealTimeout          time.Duration // Default: 5m
}
```
Comparison Flow
controller/state.go - AppStateManager interface:
The CompareAppState method orchestrates the full comparison:
- Resolves the Git revision (latest HEAD or pinned)
- Calls repo server for target manifests
- Gets live resources from the GitOps Engine cluster cache
- Runs the three-way diff (`gitops-engine/pkg/diff`)
- Computes health per resource (`gitops-engine/pkg/health`)
- Returns a `CompareStateResult` with per-resource diffs
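The per-resource classification that comes out of the comparison can be sketched as follows. This is a deliberately simplified model: real Argo CD runs a structured three-way merge from `gitops-engine/pkg/diff` that accounts for the last-applied configuration, whereas this sketch uses plain equality:

```go
package main

import (
	"fmt"
	"reflect"
)

// resourceDiff classifies each target resource against live state, the way
// the comparison summarizes per-resource results: missing resources and
// drifted resources are OutOfSync, extraneous live resources are prune
// candidates. Resources are keyed by a "Kind/name" string for brevity.
func resourceDiff(target, live map[string]map[string]string) map[string]string {
	status := map[string]string{}
	for name, want := range target {
		got, exists := live[name]
		switch {
		case !exists:
			status[name] = "OutOfSync" // missing: needs creation
		case reflect.DeepEqual(want, got):
			status[name] = "Synced"
		default:
			status[name] = "OutOfSync" // drifted: needs apply
		}
	}
	for name := range live {
		if _, wanted := target[name]; !wanted {
			status[name] = "OutOfSync" // extraneous: prune candidate
		}
	}
	return status
}

func main() {
	target := map[string]map[string]string{
		"Deployment/web": {"image": "web:v2"},
	}
	live := map[string]map[string]string{
		"Deployment/web": {"image": "web:v1"},
		"Service/old":    {"port": "80"},
	}
	fmt.Println(resourceDiff(target, live))
}
```

The three-way (rather than two-way) diff matters in practice: by also consulting the last-applied configuration, the engine can ignore fields set by admission webhooks or other controllers instead of reporting them as drift.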
Sync Execution
controller/sync.go - Sync orchestration:
- Creates a `SyncContext` from the GitOps Engine
- Sets sync options: prune, dry-run, force, server-side apply
- The engine plans execution order using sync waves (`argocd.argoproj.io/sync-wave` annotations)
- Runs hooks in order: PreSync -> Sync -> PostSync (or SyncFailed on error)
- Per wave: apply resources, wait for health, proceed to next wave
- Reports result back to Application status
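The wave-ordering step can be sketched as grouping manifests by their `argocd.argoproj.io/sync-wave` annotation (a missing annotation means wave 0) and applying the groups in ascending order. This shows the ordering only; the real planner also sorts within a wave by kind (namespaces and CRDs before workloads) and interleaves hooks:

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
)

// manifest is a stripped-down resource carrying only the annotation the
// sync planner reads for wave ordering.
type manifest struct {
	Name        string
	Annotations map[string]string
}

// wave parses the sync-wave annotation; absent or malformed means wave 0.
func wave(m manifest) int {
	w, err := strconv.Atoi(m.Annotations["argocd.argoproj.io/sync-wave"])
	if err != nil {
		return 0
	}
	return w
}

// orderByWave groups manifests into the ordered waves the engine applies
// one at a time, waiting for health between waves.
func orderByWave(ms []manifest) [][]manifest {
	byWave := map[int][]manifest{}
	for _, m := range ms {
		byWave[wave(m)] = append(byWave[wave(m)], m)
	}
	waves := make([]int, 0, len(byWave))
	for w := range byWave {
		waves = append(waves, w)
	}
	sort.Ints(waves) // negative waves run first
	out := make([][]manifest, 0, len(waves))
	for _, w := range waves {
		out = append(out, byWave[w])
	}
	return out
}

func main() {
	plan := orderByWave([]manifest{
		{Name: "Deployment/app", Annotations: map[string]string{"argocd.argoproj.io/sync-wave": "1"}},
		{Name: "ConfigMap/app-config"}, // no annotation: wave 0
		{Name: "Job/db-migrate", Annotations: map[string]string{"argocd.argoproj.io/sync-wave": "-1"}},
	})
	for i, w := range plan {
		for _, m := range w {
			fmt.Printf("group %d: %s\n", i, m.Name)
		}
	}
}
```

Because the engine waits for each wave to become healthy before starting the next, waves give you ordering guarantees (e.g. run a migration Job before rolling the Deployment) without any imperative pipeline logic.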
Sharding
controller/sharding/ - Multi-replica distribution:
- Legacy: Hash the target cluster's ID modulo replica count
- Round-Robin: Distribute evenly across replicas
- Consistent Hashing: Stable assignment using hash ring, minimizes redistribution on scale events
- The controller watches its own Deployment to detect replica count changes and re-shard dynamically
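The legacy algorithm's core idea fits in a few lines: hash an identifier and take it modulo the replica count, giving a deterministic but reshuffle-prone assignment. The exact hash function below (FNV-1a) is an assumption for illustration, not a claim about Argo CD's implementation:

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// legacyShard sketches hash-modulo sharding: every controller replica runs
// the same function over each cluster ID and only handles clusters whose
// result matches its own shard number. Deterministic, but changing the
// replica count can reassign most clusters at once -- the weakness that
// consistent hashing addresses.
func legacyShard(clusterID string, replicas int) int {
	h := fnv.New32a()
	h.Write([]byte(clusterID))
	return int(h.Sum32() % uint32(replicas))
}

func main() {
	for _, c := range []string{"in-cluster", "prod-us-east", "prod-eu-west"} {
		fmt.Printf("%s -> shard %d of 3\n", c, legacyShard(c, 3))
	}
}
```

Consistent hashing replaces the modulo with a hash ring, so scaling from N to N+1 replicas moves only roughly 1/(N+1) of the clusters instead of nearly all of them.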
Architecture Diagram
```mermaid
graph TD
    subgraph "Application Controller Pod"
        Informers[K8s Informers<br/>App + Project + Cluster]
        RefreshQ[appRefreshQueue<br/>Rate-limited]
        OpQ[appOperationQueue<br/>Rate-limited]
        ProjQ[projectRefreshQueue]
        Workers[Worker Goroutines<br/>Configurable parallelism]
        StateManager[AppStateManager<br/>controller/state.go]
        SyncManager[Sync Manager<br/>controller/sync.go]
        Sharding[Cluster Sharding<br/>controller/sharding/]
    end
    subgraph "External Dependencies"
        RS[Repo Server<br/>gRPC]
        Redis[Redis<br/>Cache]
        GE[GitOps Engine<br/>Diff/Sync/Health]
        HostK8s[Host Cluster<br/>CRDs]
        TargetK8s[Target Clusters]
    end
    Informers -->|Events| RefreshQ
    Informers -->|Events| OpQ
    Informers -->|Events| ProjQ
    RefreshQ --> Workers
    OpQ --> Workers
    Workers --> StateManager
    Workers --> SyncManager
    StateManager -->|GenerateManifest| RS
    StateManager -->|CompareAppState| GE
    StateManager -->|Get/Set cache| Redis
    SyncManager -->|Sync + Apply| GE
    GE -->|Watch/Apply| TargetK8s
    Workers -->|Update Status| HostK8s
    Sharding -->|Filter| Informers
```
Configuration
| Flag | Default | Description |
|---|---|---|
| `--app-resync` | 120s | Application resync period |
| `--app-hard-resync` | 0 (disabled) | Hard resync period (bypasses all caches) |
| `--self-heal-timeout` | 5m | Self-heal retry timeout |
| `--sync-timeout` | 180s | Maximum sync operation duration |
| `--repo-server` | argocd-repo-server:8081 | Repo server address |
| `--redis` | argocd-redis:6379 | Redis address for caching |
| `--kubectl-parallelism-limit` | 20 | Max concurrent kubectl operations |
| `--sharding-algorithm` | legacy | Sharding algorithm (legacy/round-robin/consistent-hashing) |
| `--server-side-diff` | false | Use kubectl server-side apply for diffs |
| `--status-processors` | 20 | Number of app status refresh workers |
| `--operation-processors` | 10 | Number of sync operation workers |
Key Design Decisions
- Multiple work queues: Separate queues for refresh vs. sync operations prevent slow syncs from blocking status updates. Each queue has independent rate limiters and worker counts.
- Level-triggered reconciliation: The controller re-evaluates full state on every cycle rather than relying on event ordering. Handles missed events, network partitions, and restarts gracefully.
- Redis-backed comparison cache: Avoids expensive repo server calls and diff computations when nothing has changed. Cache key includes revision, parameters, and tool versions.
- Sharding by cluster: Applications are assigned to controller replicas based on their target cluster, ensuring all apps for a cluster are processed by the same replica (cache locality).
- Jitter on resync: `statusRefreshJitter` prevents all applications from refreshing simultaneously, smoothing API server and repo server load.