Cilium Agent Deep-Dive

1. What This Component Does

The Cilium Agent (also referred to as the “Daemon”) is the per-node userspace process that orchestrates all Cilium functionality on a host. It runs as a long-lived daemon on every node in the cluster (e.g., Kubernetes worker nodes) and is responsible for:

  • Programming eBPF programs into the kernel for L3/L4/L7 networking, security policies, load balancing, and observability.
  • Managing cluster state: Allocating security identities, enforcing NetworkPolicies, handling service translations, and monitoring endpoint lifecycle (pod/container add/remove).
  • Interfacing with orchestrators: Integrates with Kubernetes via API server watches, CRDs, and CNI plugins.
  • Exposing APIs and metrics: Runs gRPC/HTTP servers for health checks, Hubble (observability), and operator coordination.

When/why use it? Deploy it on every node for eBPF-powered CNI networking in Kubernetes (or Nomad and other orchestrators). It is what enables kube-proxy replacement, transparent encryption (WireGuard/IPsec), and scaling to clusters with 10k+ pods without performance cliffs. Without it, service handling falls back to kube-proxy's iptables rules, which degrade as rule counts grow.

2. How It Works

The Cilium Agent follows a startup → initialization → watch-loop model (sketched in Go after the list):

  1. Parse config/flags and validate environment (eBPF FS mount, kernel headers).
  2. Instantiate Daemon struct (NewDaemon), which wires up 20+ subcomponents (datapath, identity allocator, policy repo, endpoint manager).
  3. Initialize core subsystems serially: Load eBPF templates, create maps, attach XDP/TC programs.
  4. Start background controllers/watchers in goroutines: K8s watches for pods/services/policies, identity GC, status reporters.
  5. Enter main loop: Serve REST/gRPC APIs, handle signals (SIGHUP reload), health checks, and process queued events (e.g., endpoint regen).
  6. Packet flow (async): Userspace queues events (e.g., “sync endpoint”) and controllers update eBPF maps atomically; the kernel then makes per-packet decisions via eBPF tail calls, with no syscalls on the packet path.
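
As a rough illustration of steps 1–6, the runnable Go sketch below mirrors the startup → init → watch-loop shape. All names (Daemon, NewDaemon, initSubsystems, startWatchers) are illustrative stand-ins, not Cilium's actual API:

```go
// Hypothetical sketch of the agent's startup -> init -> watch-loop shape.
package main

import (
	"context"
	"log"
	"os"
	"os/signal"
	"syscall"
)

type Daemon struct{} // the real struct wires up 20+ subcomponents

func NewDaemon(ctx context.Context) (*Daemon, error) {
	// Steps 1-2: validate the environment, then wire subcomponents.
	if _, err := os.Stat("/sys/fs/bpf"); err != nil {
		return nil, err // eBPF FS must be mounted before anything else
	}
	return &Daemon{}, nil
}

// Step 3: load eBPF templates, create maps, attach programs (elided).
func (d *Daemon) initSubsystems() error { return nil }

// Step 4: K8s informers, identity GC, status reporters (elided).
func (d *Daemon) startWatchers(ctx context.Context) {}

// Step 5: main loop serving APIs, reacting to signals and queued events.
func (d *Daemon) run(ctx context.Context) error {
	sighup := make(chan os.Signal, 1)
	signal.Notify(sighup, syscall.SIGHUP)
	for {
		select {
		case <-sighup:
			log.Println("SIGHUP: reloading configuration")
		case <-ctx.Done():
			return ctx.Err()
		}
	}
}

func main() {
	ctx, stop := signal.NotifyContext(context.Background(), syscall.SIGINT, syscall.SIGTERM)
	defer stop()

	d, err := NewDaemon(ctx)
	if err != nil {
		log.Fatal(err)
	}
	if err := d.initSubsystems(); err != nil {
		log.Fatal(err)
	}
	d.startWatchers(ctx)
	log.Println(d.run(ctx))
}
```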

Key algorithm: Event-driven with eventual consistency. Changes propagate via pub/sub (channels, notifiers) into eBPF maps. Identities are reference-counted and allocated lazily from a global allocator synced via the KV store (etcd) or CRDs.
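
A minimal, self-contained sketch of that reference-counting scheme. Types here are hypothetical; the real allocator in pkg/identity/cache additionally coordinates numeric IDs through the KV store or CRDs:

```go
// Reference-counted, lazily allocated identities keyed by a label set.
package main

import (
	"fmt"
	"sync"
)

type Identity struct {
	ID     int
	Labels string // canonical, sorted label string in the real thing
	refs   int
}

type Allocator struct {
	mu     sync.Mutex
	nextID int
	byKey  map[string]*Identity
}

func NewAllocator() *Allocator {
	return &Allocator{nextID: 1, byKey: map[string]*Identity{}}
}

// Allocate returns the existing identity for the label set or lazily mints one.
func (a *Allocator) Allocate(labels string) *Identity {
	a.mu.Lock()
	defer a.mu.Unlock()
	if id, ok := a.byKey[labels]; ok {
		id.refs++
		return id
	}
	id := &Identity{ID: a.nextID, Labels: labels, refs: 1}
	a.nextID++
	a.byKey[labels] = id
	return id
}

// Release drops a reference; the identity is freed when the count hits zero.
func (a *Allocator) Release(id *Identity) {
	a.mu.Lock()
	defer a.mu.Unlock()
	if id.refs--; id.refs == 0 {
		delete(a.byKey, id.Labels)
	}
}

func main() {
	alloc := NewAllocator()
	a := alloc.Allocate("app=web")
	b := alloc.Allocate("app=web") // same labels -> same identity
	fmt.Println(a.ID == b.ID)      // true
	alloc.Release(a)
	alloc.Release(b) // last reference: the identity is garbage-collected
}
```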

Internal Flow Diagram:

```mermaid
graph TD
    A["cilium-agent Entry<br/>daemon/cmd/daemon.go"] --> B["Parse Flags/Config<br/>daemon/options.ParseFlags()"]
    B --> C["NewDaemon<br/>daemon/daemon.go:NewDaemon()"]
    C --> D["InitDatapath<br/>pkg/datapath/datapath.go:InitDatapath()"]
    D --> E["Init IdentityAllocator<br/>pkg/identity/cache/local.go"]
    E --> F["Init PolicyRepository<br/>pkg/policy/repository.go"]
    F --> G["Init EndpointManager<br/>pkg/endpointmanager/manager.go"]
    G --> H["Start K8s Watchers<br/>pkg/k8s/watchers/"]
    H --> I["Start Controllers<br/>pkg/controller/manager.go:Run()"]
    I --> J["Main Serve Loop<br/>Daemon.Run():<br/>Health, API, Metrics Servers"]
    J --> K["Event Queue Loop<br/>Endpoint sync, Policy update<br/>→ eBPF Map Writes"]
    K -->|Kernel| L["eBPF Tailcalls:<br/>XDP/TC/LB/Policy/Encrypt"]
    L -.->|Metrics/Logs| J
    style L fill:#ff9999
```

Step-by-step process:

  • Bootstrap: Ensures /sys/fs/bpf is mounted; loads eBPF objects, each of which must pass the kernel verifier.
  • State sync: Pulls identities/policies from KV store; watches K8s resources via informers.
  • Endpoint lifecycle: On pod-add → allocate identity → derive L7 policy → regenerate BPF → attach at TC/XDP hooks.
  • Regeneration: Idempotent BPF reloads via endpoint.Regenerate(), lock-free where possible (see the sketch after this list).
  • Trade-off: Control logic is centralized in userspace for simplicity, but scales via sharded (per-CPU) maps and batched updates. A clever touch: “bpfgen” templates allow runtime policy injection without full recompiles.
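
A minimal sketch of the idempotence property behind endpoint.Regenerate(): recompute the desired state and touch the datapath only when it differs from what is already realized. Types and the desired-state computation are invented for illustration:

```go
// Idempotent endpoint regeneration: repeated triggers converge without
// redundant BPF work.
package main

import (
	"fmt"
	"reflect"
)

type DatapathState struct {
	AllowedPeers []int // peer security identities permitted by policy
}

type Endpoint struct {
	ID       int
	realized DatapathState // what the eBPF maps currently encode
}

// computeDesired would consult the policy repository in the real agent.
func computeDesired(ep *Endpoint) DatapathState {
	return DatapathState{AllowedPeers: []int{100, 200}}
}

// Regenerate is safe to call repeatedly: it is a no-op when desired == realized.
func (ep *Endpoint) Regenerate() bool {
	desired := computeDesired(ep)
	if reflect.DeepEqual(desired, ep.realized) {
		return false // already up to date
	}
	// Here the real agent compiles/loads BPF and updates maps atomically.
	ep.realized = desired
	return true
}

func main() {
	ep := &Endpoint{ID: 42}
	fmt.Println(ep.Regenerate()) // true: first run programs the datapath
	fmt.Println(ep.Regenerate()) // false: idempotent, nothing to do
}
```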

3. Key Code Paths

Main Files (as referenced in the flow diagram above):

  • daemon/cmd/daemon.go: entry point, signal handling (handleSignals).
  • daemon/daemon.go: the Daemon struct and NewDaemon() wiring.
  • pkg/datapath/datapath.go: eBPF datapath initialization.
  • pkg/identity/cache/local.go: identity allocator.
  • pkg/policy/repository.go: policy repository.
  • pkg/endpointmanager/manager.go: endpoint lifecycle management.
  • pkg/k8s/watchers/: Kubernetes informer-based watchers.
  • pkg/controller/manager.go: background controller manager.

Key Functions (with explanations):

  • ParseFlags() (daemon/options): parses and validates config/flags.
  • NewDaemon() (daemon/daemon.go): wires up the 20+ subcomponents (datapath, identity allocator, policy repo, endpoint manager).
  • InitDatapath() (pkg/datapath/datapath.go): loads eBPF templates, creates maps, attaches XDP/TC programs.
  • Daemon.Run(): main serve loop for the health, API, and metrics servers.
  • endpoint.Regenerate(): idempotent per-endpoint BPF regeneration.

Hot path: endpoint regeneration (~1-10 ms) → endpoint.Regenerate() → eBPF map lookups/writes.
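
For a concrete feel of the “map lookups/writes” step, here is a standalone example using the github.com/cilium/ebpf Go library (requires root and a BPF-capable kernel); the key/value layout is made up for illustration:

```go
// Userspace -> kernel handoff: policy/service state becomes eBPF map entries.
package main

import (
	"log"

	"github.com/cilium/ebpf"
)

func main() {
	// Create an anonymous hash map; the agent pins its maps under /sys/fs/bpf
	// so they survive agent restarts.
	m, err := ebpf.NewMap(&ebpf.MapSpec{
		Type:       ebpf.Hash,
		KeySize:    4, // e.g. a security identity
		ValueSize:  4, // e.g. a verdict/flags word
		MaxEntries: 1024,
	})
	if err != nil {
		log.Fatal(err)
	}
	defer m.Close()

	// A single atomic update; in-kernel programs observe it on their next
	// lookup, with no syscall on the packet path.
	if err := m.Put(uint32(100), uint32(1)); err != nil {
		log.Fatal(err)
	}

	var v uint32
	if err := m.Lookup(uint32(100), &v); err != nil {
		log.Fatal(err)
	}
	log.Printf("identity 100 -> verdict %d", v)
}
```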

4. Configuration

Configuration is loaded in precedence order: CLI flags (highest), then /var/run/cilium/agent-flags.config, environment variables, and finally the ConfigMap (on Kubernetes).

Key Options (from daemon/options/options.go):

  • --bpf-root=/sys/fs/bpf: eBPF FS mount path.
  • --kube-proxy-replacement=true: replace kube-proxy with eBPF service handling.
  • --identity-allocation-mode=kvstore: identity sync backend (crd or kvstore).
  • --enable-l7-proxy=true: embedded Envoy for L7 policy.
  • --enable-wireguard=true: node-to-node WireGuard encryption (IPsec is available via --enable-ipsec).
  • Env vars: flags map to CILIUM_-prefixed variables, e.g. CILIUM_K8S_API_SERVER=..., CILIUM_ENABLE_IPV4=false.

Defaults: pkg/defaults/defaults.go. Reload at runtime via SIGHUP, handled in daemon/cmd/daemon.go:handleSignals.

K8s-specific: Helm values → ConfigMap → downward API env vars.
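
As a toy illustration of that precedence chain (a flag set on the command line beats an environment variable, which beats the built-in default; the config-file and ConfigMap layers are omitted for brevity), with hypothetical helper names:

```go
// Precedence sketch: CLI flag > CILIUM_* env var > default.
package main

import (
	"flag"
	"fmt"
	"os"
)

// resolve returns the flag value if the flag was explicitly set, otherwise
// the environment variable, otherwise the default.
func resolve(fs *flag.FlagSet, name, envVar, def string) string {
	set := false
	fs.Visit(func(f *flag.Flag) {
		if f.Name == name {
			set = true
		}
	})
	if set {
		return fs.Lookup(name).Value.String()
	}
	if v, ok := os.LookupEnv(envVar); ok {
		return v
	}
	return def
}

func main() {
	fs := flag.NewFlagSet("cilium-agent", flag.ExitOnError)
	fs.String("identity-allocation-mode", "crd", "identity sync backend")
	fs.Parse(os.Args[1:])

	mode := resolve(fs, "identity-allocation-mode", "CILIUM_IDENTITY_ALLOCATION_MODE", "crd")
	fmt.Println("identity-allocation-mode =", mode)
}
```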

5. Extension Points

Cilium is highly modular; extend via interfaces and hooks:
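
Purely for illustration, a narrow, policy-focused interface of the kind meant here might look like this in Go (PolicyHook and allowDNS are invented, not real Cilium types; consult the pkg/ tree for the actual hook points):

```go
// A hypothetical narrow extension point: one question, one answer.
package main

import "fmt"

// Verdict is the outcome of a policy decision.
type Verdict int

const (
	Deny Verdict = iota
	Allow
)

// PolicyHook is deliberately minimal: implementers decide a verdict for a
// (source identity, destination identity, destination port) triple.
type PolicyHook interface {
	Decide(srcIdentity, dstIdentity int, dstPort uint16) Verdict
}

// allowDNS is a toy implementation that only admits traffic to port 53.
type allowDNS struct{}

func (allowDNS) Decide(_, _ int, dstPort uint16) Verdict {
	if dstPort == 53 {
		return Allow
	}
	return Deny
}

func main() {
	var h PolicyHook = allowDNS{}
	fmt.Println(h.Decide(100, 200, 53)) // 1 (Allow)
	fmt.Println(h.Decide(100, 200, 80)) // 0 (Deny)
}
```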

Trade-offs: Interfaces are narrow (policy-focused); deeper changes require eBPF expertise. Test via cilium-dbg or the TestDaemon* unit tests.