Application Controller Deep-Dive
What This Component Does
The Application Controller is the core reconciliation engine of Argo CD. It continuously watches Application CRDs, compares the desired state (defined in Git repositories) against the live state in target Kubernetes clusters, and executes sync operations to bring them into alignment. It is the component that makes Argo CD a GitOps tool rather than just a deployment dashboard.
The controller manages the full lifecycle of Application resources: periodic state comparison, health assessment, auto-sync with self-healing, sync operation execution with waves and hooks, and status reporting. It communicates with the Repo Server (via gRPC) for manifest generation and delegates diff/apply operations to the embedded GitOps Engine.
Deployment notes: Runs as a Deployment in the `argocd` namespace. Scale horizontally by increasing replicas with controller sharding enabled (`--sharding-algorithm`). Configure the resync interval via `--app-resync` (default 120s) and self-heal retries via `--self-heal-timeout` (default 5m).
How It Works
The controller uses the Kubernetes informer pattern with multiple specialized work queues. Each queue handles a different concern (refresh, sync, project updates, hydration), with rate limiters to prevent thundering herd. The main reconciliation function is processAppRefreshQueueItem, which fetches manifests, computes diffs, and updates status.
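The deduplication behavior of these queues matters: any number of informer events for the same application collapse into a single pending item, so a burst of updates triggers one reconciliation, not many. A minimal sketch of that semantic in plain Go (the real controller uses client-go's `workqueue` package; this toy queue only illustrates the coalescing behavior):

```go
package main

import (
	"fmt"
	"sync"
)

// dedupQueue illustrates the semantic the controller relies on from
// client-go's workqueue: adding a key that is already pending is a no-op,
// so each application is processed at most once per burst of events.
type dedupQueue struct {
	mu      sync.Mutex
	pending map[string]bool
	order   []string
}

func newDedupQueue() *dedupQueue {
	return &dedupQueue{pending: map[string]bool{}}
}

// Add enqueues key unless it is already waiting to be processed.
func (q *dedupQueue) Add(key string) {
	q.mu.Lock()
	defer q.mu.Unlock()
	if q.pending[key] {
		return // duplicate event: coalesced into the pending item
	}
	q.pending[key] = true
	q.order = append(q.order, key)
}

// Get pops the oldest pending key; ok is false when the queue is empty.
func (q *dedupQueue) Get() (key string, ok bool) {
	q.mu.Lock()
	defer q.mu.Unlock()
	if len(q.order) == 0 {
		return "", false
	}
	key = q.order[0]
	q.order = q.order[1:]
	delete(q.pending, key)
	return key, true
}

func main() {
	q := newDedupQueue()
	q.Add("argocd/guestbook")
	q.Add("argocd/guestbook") // second informer event for the same app
	q.Add("argocd/helm-app")
	for {
		key, ok := q.Get()
		if !ok {
			break
		}
		fmt.Println("process", key)
	}
}
```

The production queues add rate limiting and retry backoff on top of this; the key-level coalescing shown here is what prevents a thundering herd of redundant reconciliations.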
Internal Flow Diagram
```mermaid
flowchart TD
    A[Informer Event<br/>Application CRD changed] --> B{Event Type}
    B -->|Spec changed| C[appRefreshQueue<br/>Rate-limited enqueue]
    B -->|Operation set| D[appOperationQueue<br/>Rate-limited enqueue]
    B -->|Project changed| E[projectRefreshQueue]
    C --> F[processAppRefreshQueueItem]
    F --> G[Load AppProject<br/>Validate permissions]
    G --> H{Cache hit?<br/>Redis: app+rev+params}
    H -->|Yes| I[Use cached comparison]
    H -->|No| J[Call Repo Server<br/>gRPC: GenerateManifest]
    J --> K[Receive target manifests]
    K --> L[GitOps Engine<br/>CompareAppState<br/>Three-way diff]
    I --> L
    L --> M[Compute health status<br/>Per-resource + aggregate]
    M --> N[Update Application.Status<br/>SyncStatus, HealthStatus,<br/>ResourceStatuses]
    N --> O{Auto-sync enabled?<br/>SyncPolicy.Automated}
    O -->|Yes + OutOfSync| P[Enqueue to<br/>appOperationQueue]
    O -->|No| Q[Done - wait for<br/>next resync cycle]
    D --> R[processAppOperationQueueItem]
    R --> S[Read Operation from CRD]
    S --> T[Call Repo Server<br/>GenerateManifest]
    T --> U[GitOps Engine: Sync<br/>Plan waves + hooks]
    U --> V[Apply resources<br/>per wave to cluster]
    V --> W[Update Operation status<br/>succeeded/failed]
    W --> X[Clear Operation field]
    style F fill:#f9f
    style R fill:#f9f
```
Step-by-Step Process (Refresh Cycle):
- Event Detection: Kubernetes informers detect Application CRD changes (create, update, delete), or the periodic resync timer fires (configurable via `--app-resync`, default 120s).
- Queue Routing: Events are routed to the appropriate work queue based on type. Keyed deduplication ensures each application is processed at most once at a time.
- Project Validation: The controller loads the referenced AppProject to validate that the Application's source repo, destination cluster/namespace, and resource kinds are permitted.
- Cache Check: Checks the Redis cache for a previous comparison result matching the current `(app, revision, parameters)` tuple. On a hit, manifest generation is skipped.
- Manifest Generation: On a cache miss, calls the Repo Server via gRPC `GenerateManifest`. The repo server clones the Git repo (or fetches from cache) and runs the configured tool (Helm, Kustomize, etc.).
- State Comparison: Passes target manifests and live cluster state to the GitOps Engine's `CompareAppState`. Uses a three-way merge diff to identify additions, modifications, and deletions.
- Health Assessment: Computes per-resource health using built-in rules or custom Lua scripts from `resource_customizations/`, then aggregates to application-level health.
- Status Update: Writes comparison results to `Application.Status`: `SyncStatus` (Synced/OutOfSync), `HealthStatus` (Healthy/Degraded/Progressing), and per-resource statuses.
- Auto-Sync Decision: If `SyncPolicy.Automated` is set and the app is OutOfSync, enqueues a sync operation.
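Two of the decision points above are simple enough to sketch directly: the cache key that lets the controller skip manifest generation, and the final auto-sync check. This is an illustrative sketch, not Argo CD's actual key format or function names:

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// comparisonKey sketches the idea behind the Redis cache check: if the
// (app, revision, parameters) tuple is unchanged, the previous comparison
// result can be reused and the repo-server gRPC call skipped. The key
// layout here is hypothetical.
func comparisonKey(app, revision, params string) string {
	sum := sha256.Sum256([]byte(app + "|" + revision + "|" + params))
	return fmt.Sprintf("cmp|%s|%x", app, sum[:8])
}

// shouldAutoSync mirrors the decision at the end of the refresh cycle:
// enqueue a sync operation only when automated sync is configured and the
// comparison reported OutOfSync.
func shouldAutoSync(automated bool, syncStatus string) bool {
	return automated && syncStatus == "OutOfSync"
}

func main() {
	k1 := comparisonKey("guestbook", "abc123", "env=prod")
	k2 := comparisonKey("guestbook", "abc123", "env=prod")
	fmt.Println("cache hit:", k1 == k2)
	fmt.Println("auto-sync:", shouldAutoSync(true, "OutOfSync"))
}
```

Because the revision and parameters are part of the key, any Git commit or Helm/Kustomize parameter change naturally invalidates the cached comparison.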
Key Code Paths
ApplicationController Struct
controller/appcontroller.go - The central struct managing all controller state:
```go
type ApplicationController struct {
	// Kubernetes clients
	kubeClientset        kubernetes.Interface
	applicationClientset appclientset.Interface

	// Informers for CRD watching
	appInformer  cache.SharedIndexInformer
	projInformer cache.SharedIndexInformer

	// Work queues (rate-limited)
	appRefreshQueue     workqueue.RateLimitingInterface
	appOperationQueue   workqueue.RateLimitingInterface
	projectRefreshQueue workqueue.RateLimitingInterface

	// Service dependencies
	repoClientset apiclient.Clientset       // gRPC to repo server
	db            db.ArgoDB                 // cluster/repo credentials
	settingsMgr   *settings.SettingsManager
	stateCache    statecache.LiveStateCache // gitops-engine cluster cache

	// Sharding
	clusterSharding sharding.ClusterShardingAlgorithm

	// Configuration
	statusRefreshTimeout     time.Duration // Default: 120s
	statusHardRefreshTimeout time.Duration // Default: 0 (disabled)
	selfHealTimeout          time.Duration // Default: 5m
}
```
Comparison Flow
controller/state.go - AppStateManager interface:
The CompareAppState method orchestrates the full comparison:
- Resolves the Git revision (latest HEAD or pinned)
- Calls repo server for target manifests
- Gets live resources from the GitOps Engine cluster cache
- Runs the three-way diff (`gitops-engine/pkg/diff`)
- Computes health per resource (`gitops-engine/pkg/health`)
- Returns a `CompareStateResult` with per-resource diffs
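The per-resource classification that comes out of the comparison can be sketched as follows. This is a deliberately simplified model: real Argo CD runs a structured three-way merge from `gitops-engine/pkg/diff` that accounts for the last-applied configuration, whereas this sketch uses plain equality:

```go
package main

import (
	"fmt"
	"reflect"
)

// resourceDiff classifies each target resource against live state, the way
// the comparison summarizes per-resource results: missing resources and
// drifted resources are OutOfSync, extraneous live resources are prune
// candidates. Resources are keyed by a "Kind/name" string for brevity.
func resourceDiff(target, live map[string]map[string]string) map[string]string {
	status := map[string]string{}
	for name, want := range target {
		got, exists := live[name]
		switch {
		case !exists:
			status[name] = "OutOfSync" // missing: needs creation
		case reflect.DeepEqual(want, got):
			status[name] = "Synced"
		default:
			status[name] = "OutOfSync" // drifted: needs apply
		}
	}
	for name := range live {
		if _, wanted := target[name]; !wanted {
			status[name] = "OutOfSync" // extraneous: prune candidate
		}
	}
	return status
}

func main() {
	target := map[string]map[string]string{
		"Deployment/web": {"image": "web:v2"},
	}
	live := map[string]map[string]string{
		"Deployment/web": {"image": "web:v1"},
		"Service/old":    {"port": "80"},
	}
	fmt.Println(resourceDiff(target, live))
}
```

The three-way (rather than two-way) diff matters in practice: by also consulting the last-applied configuration, the engine can ignore fields set by admission webhooks or other controllers instead of reporting them as drift.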
Sync Execution
controller/sync.go - Sync orchestration:
- Creates a `SyncContext` from the GitOps Engine
- Sets sync options: prune, dry-run, force, server-side apply
- The engine plans execution order using sync waves (`argocd.argoproj.io/sync-wave` annotations)
- Runs hooks in order: PreSync -> Sync -> PostSync (or SyncFailed on error)
- Per wave: apply resources, wait for health, proceed to next wave
- Reports result back to Application status
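The wave-ordering step can be sketched as grouping manifests by their `argocd.argoproj.io/sync-wave` annotation (a missing annotation means wave 0) and applying the groups in ascending order. This shows the ordering only; the real planner also sorts within a wave by kind (namespaces and CRDs before workloads) and interleaves hooks:

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
)

// manifest is a stripped-down resource carrying only the annotation the
// sync planner reads for wave ordering.
type manifest struct {
	Name        string
	Annotations map[string]string
}

// wave parses the sync-wave annotation; absent or malformed means wave 0.
func wave(m manifest) int {
	w, err := strconv.Atoi(m.Annotations["argocd.argoproj.io/sync-wave"])
	if err != nil {
		return 0
	}
	return w
}

// orderByWave groups manifests into the ordered waves the engine applies
// one at a time, waiting for health between waves.
func orderByWave(ms []manifest) [][]manifest {
	byWave := map[int][]manifest{}
	for _, m := range ms {
		byWave[wave(m)] = append(byWave[wave(m)], m)
	}
	waves := make([]int, 0, len(byWave))
	for w := range byWave {
		waves = append(waves, w)
	}
	sort.Ints(waves) // negative waves run first
	out := make([][]manifest, 0, len(waves))
	for _, w := range waves {
		out = append(out, byWave[w])
	}
	return out
}

func main() {
	plan := orderByWave([]manifest{
		{Name: "Deployment/app", Annotations: map[string]string{"argocd.argoproj.io/sync-wave": "1"}},
		{Name: "ConfigMap/app-config"}, // no annotation: wave 0
		{Name: "Job/db-migrate", Annotations: map[string]string{"argocd.argoproj.io/sync-wave": "-1"}},
	})
	for i, w := range plan {
		for _, m := range w {
			fmt.Printf("group %d: %s\n", i, m.Name)
		}
	}
}
```

Because the engine waits for each wave to become healthy before starting the next, waves give you ordering guarantees (e.g. run a migration Job before rolling the Deployment) without any imperative pipeline logic.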
Sharding
controller/sharding/ - Multi-replica distribution:
- Legacy: Hash the target cluster's ID modulo replica count
- Round-Robin: Distribute evenly across replicas
- Consistent Hashing: Stable assignment using hash ring, minimizes redistribution on scale events
- The controller watches its own Deployment to detect replica count changes and re-shard dynamically
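The legacy algorithm's core idea fits in a few lines: hash an identifier and take it modulo the replica count, giving a deterministic but reshuffle-prone assignment. The exact hash function below (FNV-1a) is an assumption for illustration, not a claim about Argo CD's implementation:

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// legacyShard sketches hash-modulo sharding: every controller replica runs
// the same function over each cluster ID and only handles clusters whose
// result matches its own shard number. Deterministic, but changing the
// replica count can reassign most clusters at once -- the weakness that
// consistent hashing addresses.
func legacyShard(clusterID string, replicas int) int {
	h := fnv.New32a()
	h.Write([]byte(clusterID))
	return int(h.Sum32() % uint32(replicas))
}

func main() {
	for _, c := range []string{"in-cluster", "prod-us-east", "prod-eu-west"} {
		fmt.Printf("%s -> shard %d of 3\n", c, legacyShard(c, 3))
	}
}
```

Consistent hashing replaces the modulo with a hash ring, so scaling from N to N+1 replicas moves only roughly 1/(N+1) of the clusters instead of nearly all of them.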
Architecture Diagram
```mermaid
graph TD
    subgraph "Application Controller Pod"
        Informers[K8s Informers<br/>App + Project + Cluster]
        RefreshQ[appRefreshQueue<br/>Rate-limited]
        OpQ[appOperationQueue<br/>Rate-limited]
        ProjQ[projectRefreshQueue]
        Workers[Worker Goroutines<br/>Configurable parallelism]
        StateManager[AppStateManager<br/>controller/state.go]
        SyncManager[Sync Manager<br/>controller/sync.go]
        Sharding[Cluster Sharding<br/>controller/sharding/]
    end
    subgraph "External Dependencies"
        RS[Repo Server<br/>gRPC]
        Redis[Redis<br/>Cache]
        GE[GitOps Engine<br/>Diff/Sync/Health]
        HostK8s[Host Cluster<br/>CRDs]
        TargetK8s[Target Clusters]
    end
    Informers -->|Events| RefreshQ
    Informers -->|Events| OpQ
    Informers -->|Events| ProjQ
    RefreshQ --> Workers
    OpQ --> Workers
    Workers --> StateManager
    Workers --> SyncManager
    StateManager -->|GenerateManifest| RS
    StateManager -->|CompareAppState| GE
    StateManager -->|Get/Set cache| Redis
    SyncManager -->|Sync + Apply| GE
    GE -->|Watch/Apply| TargetK8s
    Workers -->|Update Status| HostK8s
    Sharding -->|Filter| Informers
```
Configuration
| Flag | Default | Description |
|---|---|---|
| `--app-resync` | 120s | Application resync period |
| `--app-hard-resync` | 0 (disabled) | Hard resync period (bypasses all caches) |
| `--self-heal-timeout` | 5m | Self-heal retry timeout |
| `--sync-timeout` | 180s | Maximum sync operation duration |
| `--repo-server` | argocd-repo-server:8081 | Repo server address |
| `--redis` | argocd-redis:6379 | Redis address for caching |
| `--kubectl-parallelism-limit` | 20 | Max concurrent kubectl operations |
| `--sharding-algorithm` | legacy | Sharding algorithm (legacy/round-robin/consistent-hashing) |
| `--server-side-diff` | false | Use kubectl server-side apply for diffs |
| `--status-processors` | 20 | Number of app status refresh workers |
| `--operation-processors` | 10 | Number of sync operation workers |
Key Design Decisions
- Multiple work queues: Separate queues for refresh vs. sync operations prevent slow syncs from blocking status updates. Each queue has independent rate limiters and worker counts.
- Level-triggered reconciliation: The controller re-evaluates full state on every cycle rather than relying on event ordering. Handles missed events, network partitions, and restarts gracefully.
- Redis-backed comparison cache: Avoids expensive repo server calls and diff computations when nothing has changed. Cache key includes revision, parameters, and tool versions.
- Sharding by cluster: Applications are assigned to controller replicas based on their target cluster, ensuring all apps for a cluster are processed by the same replica (cache locality).
- Jitter on resync: `statusRefreshJitter` prevents all applications from refreshing simultaneously, smoothing API server and repo server load.