Cilium Architecture
Cilium is an eBPF-powered platform for Kubernetes networking, security, load balancing, and observability. The agent runs as a DaemonSet on each node, programming the kernel’s eBPF datapath directly for high-performance packet processing. A dedicated operator reconciles cluster-wide state, while modular components handle policies, identities, and multi-cluster federation.
High-Level Architecture
```mermaid
graph TD
K8s[Kubernetes API Server]
CRDs["Custom Resources<br/>(CNP, CEP, CIDR, etc.)"]
Op["Cilium Operator<br/>pkg/identity, pkg/cidr"]
Agent[Cilium Agent<br/>daemon/main.go<br/>pkg/hive]
Datapath[eBPF Datapath<br/>bpf/, pkg/datapath]
Proxy[Envoy Proxy<br/>L7 Policies<br/>pkg/proxy]
Endpoints[Endpoints/Pods]
KV["KVStore<br/>(etcd/consul)"]
Hubble[Hubble Relay/UI<br/>hubble/]
ClusterMesh[ClusterMesh<br/>pkg/clustermesh]
K8s -->|Watch CRDs| Op
K8s -->|Watch Endpoints/Svcs| Agent
Op -->|Allocate IDs, CIDRs| KV
Op -.->|Reconcile| Agent
Agent -->|Load/Control| Datapath
Agent -->|Access Maps| KV
Agent -->|Configure| Proxy
Endpoints <-->|Packets| Datapath
Datapath <-->|Flows/Metrics| Hubble
Datapath -->|L7 Redirect| Proxy
Agent <-->|Federation| ClusterMesh
ClusterMesh <-->|Remote KV| KV
```
This diagram illustrates the core flow: Kubernetes resources drive the operator and agent, which program the eBPF datapath. Observability feeds into Hubble, and proxies handle L7.
Component Breakdown
Cilium Agent
Responsibility: Node-local controller that manages endpoints, loads eBPF programs and maps, enforces policies, handles service load balancing (replacing kube-proxy), and coordinates with the operator via the KVStore. Uses a cell-based architecture (pkg/hive) for a modular lifecycle (start/stop/dependencies).
Key Files/Directories:
- daemon/main.go: Entry point, initializes Hive cells.
- daemon/cilium.go: Core agent loop, endpoint regeneration.
- pkg/endpoint: Manages pod lifecycle, syncs identities/policies.
- pkg/datapath: Loads eBPF, configures TC/XDP hooks.
Interfaces: Watches K8s resources (informers), reads/writes to KVStore (pkg/kvstore), controls eBPF via github.com/cilium/ebpf (pinned maps in BPF FS), redirects to Envoy.
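To make the cell idea concrete, here is a minimal, self-contained sketch of the lifecycle pattern that pkg/hive implements: components append start/stop hooks and the hive runs them in dependency order. It deliberately avoids the real cilium/hive API, and the names identityAllocator and endpointManager are illustrative stand-ins for actual cells.

```go
// Conceptual sketch of the cell/lifecycle pattern behind pkg/hive
// (illustrative only; the real cilium/hive API differs in names and detail).
package main

import (
	"context"
	"fmt"
	"log"
)

// Hook is one component's lifecycle contribution: a start and a stop callback.
type Hook struct {
	Name    string
	OnStart func(ctx context.Context) error
	OnStop  func(ctx context.Context) error
}

// Lifecycle collects hooks; Start runs them in registration (dependency)
// order, Stop runs them in reverse.
type Lifecycle struct{ hooks []Hook }

func (lc *Lifecycle) Append(h Hook) { lc.hooks = append(lc.hooks, h) }

func (lc *Lifecycle) Start(ctx context.Context) error {
	for _, h := range lc.hooks {
		if err := h.OnStart(ctx); err != nil {
			return fmt.Errorf("start %s: %w", h.Name, err)
		}
	}
	return nil
}

func (lc *Lifecycle) Stop(ctx context.Context) {
	for i := len(lc.hooks) - 1; i >= 0; i-- {
		_ = lc.hooks[i].OnStop(ctx)
	}
}

func main() {
	lc := &Lifecycle{}

	// "identity-allocator" must be running before "endpoint-manager",
	// mirroring a dependency edge in the hive DAG (hypothetical cells).
	lc.Append(Hook{
		Name:    "identity-allocator",
		OnStart: func(context.Context) error { fmt.Println("identity allocator ready"); return nil },
		OnStop:  func(context.Context) error { return nil },
	})
	lc.Append(Hook{
		Name:    "endpoint-manager",
		OnStart: func(context.Context) error { fmt.Println("endpoint manager syncing endpoints"); return nil },
		OnStop:  func(context.Context) error { return nil },
	})

	ctx := context.Background()
	if err := lc.Start(ctx); err != nil {
		log.Fatal(err)
	}
	lc.Stop(ctx)
}
```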
eBPF Datapath
Responsibility: Kernel-space packet processing for L3/L4 forwarding, NAT, encryption (WireGuard/IPsec), and identity-aware filtering. Bypasses iptables/conntrack for scalability, using hash maps for policy and service lookups.
Key Files/Directories:
- bpf/: eBPF C programs (e.g., l3.c for IPv4/IPv6, cgroup.c for sock ops).
- pkg/datapath/linux: Go loader for BPF objects, map population.
- bpf/bpftool/: Custom helpers for map generation.
Interfaces: Agent populates maps (e.g., cilium_ipcache, cilium_policy) via bpf.Map.Update(). Hooks into TC (qdisc), XDP (NIC), cgroups (egress), tracepoints (syscalls). Exports metrics to agent via perf rings.
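As a small illustration of the map-as-API boundary, the sketch below opens one of the agent's pinned maps with github.com/cilium/ebpf and prints its metadata. It assumes a node running Cilium with cilium_ipcache pinned under /sys/fs/bpf/tc/globals (the map name and path mentioned in this document) and requires root.

```go
// Minimal sketch: open a Cilium-pinned BPF map from the BPF filesystem and
// print its metadata. Run as root on a node where the cilium agent is running.
package main

import (
	"fmt"
	"log"

	"github.com/cilium/ebpf"
)

func main() {
	// Pin path as described in the surrounding text; adjust if your datapath
	// pins maps elsewhere.
	const pin = "/sys/fs/bpf/tc/globals/cilium_ipcache"

	m, err := ebpf.LoadPinnedMap(pin, nil)
	if err != nil {
		log.Fatalf("open pinned map: %v", err)
	}
	defer m.Close()

	info, err := m.Info()
	if err != nil {
		log.Fatalf("map info: %v", err)
	}

	// Key/value sizes tell you the binary layout the agent marshals into the map.
	fmt.Printf("name=%s type=%s key=%dB value=%dB max=%d\n",
		info.Name, m.Type(), m.KeySize(), m.ValueSize(), m.MaxEntries())
}
```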
Cilium Operator
Responsibility: Cluster-wide reconciliation for identities, CIDR allocation, background policy compilation, and CRD status updates. Scales horizontally, stateless.
Key Files/Directories:
- operator/main.go: K8s controller-runtime setup.
- pkg/identity/controller.go: Allocates numerical identities.
- pkg/cidr: Manages external CIDR IPs.
Interfaces: Uses client-go informers on CRDs like CiliumNetworkPolicy. Publishes to KVStore for agent consumption. Leader election via leases.
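The sketch below shows the informer mechanism in isolation: a client-go dynamic informer watching CiliumNetworkPolicy objects and logging add/update events. The kubeconfig handling, resync period, and event handlers are simplifying assumptions; the real operator wires informers into its controller framework.

```go
// Sketch: watch CiliumNetworkPolicy (cilium.io/v2) with a dynamic informer.
package main

import (
	"fmt"
	"time"

	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
	"k8s.io/apimachinery/pkg/runtime/schema"
	"k8s.io/client-go/dynamic"
	"k8s.io/client-go/dynamic/dynamicinformer"
	"k8s.io/client-go/tools/cache"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Assumes a local kubeconfig; in-cluster config would be used in a real controller.
	cfg, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	client, err := dynamic.NewForConfig(cfg)
	if err != nil {
		panic(err)
	}

	cnpGVR := schema.GroupVersionResource{
		Group:    "cilium.io",
		Version:  "v2",
		Resource: "ciliumnetworkpolicies",
	}

	factory := dynamicinformer.NewDynamicSharedInformerFactory(client, 10*time.Minute)
	informer := factory.ForResource(cnpGVR).Informer()

	informer.AddEventHandler(cache.ResourceEventHandlerFuncs{
		AddFunc: func(obj interface{}) {
			u := obj.(*unstructured.Unstructured)
			fmt.Println("CNP added:", u.GetNamespace()+"/"+u.GetName())
		},
		UpdateFunc: func(_, newObj interface{}) {
			u := newObj.(*unstructured.Unstructured)
			fmt.Println("CNP updated:", u.GetNamespace()+"/"+u.GetName())
		},
	})

	stop := make(chan struct{})
	defer close(stop)
	factory.Start(stop)
	factory.WaitForCacheSync(stop)
	select {} // block forever; a real controller would tie this to its lifecycle
}
```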
Envoy Proxy (L7 Integration)
Responsibility: Handles L7 policy enforcement (HTTP/gRPC/Kafka) and transparent proxying via eBPF redirects.
Key Files/Directories:
- pkg/proxy: Manages Envoy instances, generates configs.
- Depends on github.com/cilium/proxy (go.mod dependency): Envoy control plane integration.
Interfaces: eBPF redirects connections to Envoy listeners. The agent configures Envoy via xDS (gRPC) using go-control-plane.
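For orientation, this is roughly the shape of an xDS control plane built on go-control-plane: a snapshot cache plus a gRPC ADS server that Envoy connects to. It is a hedged sketch, not Cilium's pkg/proxy wiring; the listen address and nil callbacks are illustrative, and no configuration snapshot is populated.

```go
// Sketch: minimal ADS (aggregated xDS) server using go-control-plane.
package main

import (
	"context"
	"log"
	"net"

	discoverygrpc "github.com/envoyproxy/go-control-plane/envoy/service/discovery/v3"
	cachev3 "github.com/envoyproxy/go-control-plane/pkg/cache/v3"
	serverv3 "github.com/envoyproxy/go-control-plane/pkg/server/v3"
	"google.golang.org/grpc"
)

func main() {
	ctx := context.Background()

	// Snapshot cache: the control plane writes per-node config snapshots here,
	// and connected Envoy instances receive them over ADS.
	snapshots := cachev3.NewSnapshotCache(false, cachev3.IDHash{}, nil)

	srv := serverv3.NewServer(ctx, snapshots, nil)

	grpcServer := grpc.NewServer()
	discoverygrpc.RegisterAggregatedDiscoveryServiceServer(grpcServer, srv)

	lis, err := net.Listen("tcp", "127.0.0.1:18000") // illustrative address
	if err != nil {
		log.Fatal(err)
	}
	log.Println("xDS server listening on", lis.Addr())
	log.Fatal(grpcServer.Serve(lis))
}
```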
Hubble (Observability)
Responsibility: Collects eBPF flow logs, metrics, and L7 visibility data. The Relay aggregates flows cluster-wide and serves them to the UI and CLI.
Key Files/Directories:
- hubble/: Relay and API server.
- pkg/monitor/: Agent-side perf ring consumers.
Interfaces: Each agent exposes flows over a node-local Hubble gRPC server, which the Relay connects to and aggregates. Uses statedb for efficient querying.
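A consumer of this pipeline can be sketched with the Observer gRPC API that Hubble Relay exposes (the same API the hubble CLI uses). The relay address, insecure transport, and flow count below are assumptions for a local port-forward.

```go
// Sketch: fetch recent flows from Hubble Relay over the Observer gRPC API.
package main

import (
	"context"
	"fmt"
	"log"

	observerpb "github.com/cilium/cilium/api/v1/observer"
	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials/insecure"
)

func main() {
	// Assumes Relay reachable on localhost:4245, e.g. via kubectl port-forward.
	conn, err := grpc.Dial("localhost:4245", grpc.WithTransportCredentials(insecure.NewCredentials()))
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()

	client := observerpb.NewObserverClient(conn)
	stream, err := client.GetFlows(context.Background(), &observerpb.GetFlowsRequest{
		Number: 20, // last 20 flows, then the stream ends (Follow is false)
	})
	if err != nil {
		log.Fatal(err)
	}

	for {
		resp, err := stream.Recv()
		if err != nil {
			break // io.EOF once the requested flows have been delivered
		}
		if flow := resp.GetFlow(); flow != nil {
			fmt.Printf("%s -> %s verdict=%s\n",
				flow.GetIP().GetSource(), flow.GetIP().GetDestination(), flow.GetVerdict())
		}
	}
}
```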
KVStore & ClusterMesh
Responsibility: Shared state (identities, services) across agents/nodes/clusters. ClusterMesh enables global policies.
Key Files/Directories:
- pkg/kvstore: Abstraction over etcd/consul.
- pkg/clustermesh: Remote KV federation.
Interfaces: etcd watches trigger agent regenerations. Optimistic concurrency via key revisions.
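The watch-and-revision mechanics can be shown with a plain etcd client: subscribe to a key prefix and print each event's ModRevision, which is what optimistic-concurrency checks compare. The endpoint and the identities key prefix are illustrative assumptions, not Cilium's exact key layout.

```go
// Sketch: watch a Cilium-style key prefix in etcd and observe revisions.
package main

import (
	"context"
	"fmt"
	"log"
	"time"

	clientv3 "go.etcd.io/etcd/client/v3"
)

func main() {
	cli, err := clientv3.New(clientv3.Config{
		Endpoints:   []string{"127.0.0.1:2379"}, // assumed local etcd
		DialTimeout: 5 * time.Second,
	})
	if err != nil {
		log.Fatal(err)
	}
	defer cli.Close()

	// Each update carries a monotonically increasing ModRevision; agents react
	// to these watch events by regenerating affected endpoints.
	watch := cli.Watch(context.Background(), "cilium/state/identities/", clientv3.WithPrefix())
	for resp := range watch {
		for _, ev := range resp.Events {
			fmt.Printf("%s %s rev=%d\n", ev.Type, ev.Kv.Key, ev.Kv.ModRevision)
		}
	}
}
```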
Data Flow
Typical pod-to-pod policy-enforced packet flow (sequence diagram):
```mermaid
sequenceDiagram
participant PodA as Pod A (TX)
participant TC as TC Ingress (Kernel)
participant BPF as eBPF Prog<br/>bpf/l3.c
participant Maps as BPF Maps<br/>(ipcache, policy)
participant Agent as Cilium Agent
participant PodB as Pod B (RX)
PodA->>TC: send packet (dst IP)
TC->>BPF: tail call entrypoint
BPF->>Maps: lookup src/dst security identity
alt No identity
Maps->>Agent: async notify (ipcache miss)
Agent->>Maps: populate from KVStore/CRDs
end
Maps->>BPF: policy verdict (allow/deny/log)
alt Denied
BPF->>BPF: drop/redirect
else Allowed
BPF->>BPF: decap/NAT/forward
BPF->>PodB: deliver
Note over BPF: Perf ring sample -> Agent -> Hubble
end
```
On identity/policy changes: K8s CNP -> Operator -> KVStore -> Agent watch -> endpoint regeneration -> map updates -> BPF programs pick up the new state on their next map lookups.
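The "perf ring sample -> Agent" step can be sketched with the perf reader from github.com/cilium/ebpf: open the datapath's event map and read raw samples. The pin path and map name (cilium_events) are assumptions; Cilium's pkg/monitor adds decoding and aggregation on top of this raw stream.

```go
// Sketch: consume raw datapath events from a pinned perf event array.
package main

import (
	"log"
	"os"

	"github.com/cilium/ebpf"
	"github.com/cilium/ebpf/perf"
)

func main() {
	// Open the perf event array the datapath writes trace/drop notifications to
	// (map name and pin path are assumptions, see the lead-in).
	events, err := ebpf.LoadPinnedMap("/sys/fs/bpf/tc/globals/cilium_events", nil)
	if err != nil {
		log.Fatalf("open events map: %v", err)
	}
	defer events.Close()

	// Per-CPU ring buffers; 16 pages per CPU is an arbitrary but workable size.
	rd, err := perf.NewReader(events, os.Getpagesize()*16)
	if err != nil {
		log.Fatalf("perf reader: %v", err)
	}
	defer rd.Close()

	for {
		rec, err := rd.Read()
		if err != nil {
			log.Printf("read: %v", err)
			return
		}
		if rec.LostSamples > 0 {
			log.Printf("lost %d samples", rec.LostSamples)
		}
		// In Cilium these raw bytes are monitor events that get decoded and
		// forwarded to Hubble; here we only report their size and origin CPU.
		log.Printf("sample: %d bytes from CPU %d", len(rec.RawSample), rec.CPU)
	}
}
```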
Key Design Decisions
- eBPF Datapath (Kernel/User-Space Split): The Go agent in user space programs kernel eBPF bytecode and maps for zero-copy, high-scale processing (millions of policy entries). Trade-off: kernel version lock-in (4.9+), but it enables kube-proxy replacement with O(1) map lookups instead of linear iptables chains. Clever: generic helpers (call_policy()) allow hot-loading policy updates without full program reloads.
- Hive Cell Architecture (pkg/hive): The agent is composed of modular "cells" with declared dependencies (a DAG-ordered lifecycle). E.g., the EndpointManager cell depends on the IdentityAllocator. Pattern: dependency injection plus lifecycle hooks. Trade-off: abstraction overhead vs. hot-reloadability (cells can be restarted independently).
- Identity-Based Security: Policies match on labels/identities rather than IPs, with identities allocated cluster-wide by the operator (see the conceptual sketch after this list). Clever: revocable and multi-cluster friendly. Trade-off: requires KVStore coordination (etcd latency).
- StateDB (github.com/cilium/statedb): Embedded MVCC store for efficient diffs and queries (e.g., for Hubble). Pattern: event-sourced state. Avoids thundering-herd load on the full K8s API.
- Monolith-per-Node with Microservices: The agent is monolithic for performance but cell-modular; the operator scales out. There is no central database; the KVStore provides loose coupling. Trade-off: strong consistency via watches vs. eventual consistency for performance.
- Multi-Cluster (ClusterMesh): KVStore federation without VPNs. Notable: global service discovery and policies via identity sharing.
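To make the identity-based model concrete, here is a small, self-contained sketch: label sets resolve to numeric identities, and the policy verdict compares identities rather than IPs. The allocator, numbering scheme, and policy shape are illustrative only, not Cilium's implementation.

```go
// Conceptual sketch of identity-based policy: one numeric identity per label
// set, and verdicts keyed on (source identity, destination identity).
package main

import (
	"fmt"
	"sort"
	"strings"
)

// identityAllocator hands out one numeric identity per distinct label set,
// loosely mimicking the operator's cluster-wide allocation.
type identityAllocator struct {
	next uint32
	ids  map[string]uint32
}

func newAllocator() *identityAllocator {
	return &identityAllocator{next: 1000, ids: map[string]uint32{}}
}

func (a *identityAllocator) lookupOrAllocate(labels []string) uint32 {
	sort.Strings(labels)
	key := strings.Join(labels, ",")
	if id, ok := a.ids[key]; ok {
		return id
	}
	a.next++
	a.ids[key] = a.next
	return a.next
}

// policy allows traffic from one source identity to one destination identity.
type policy map[[2]uint32]bool

func main() {
	alloc := newAllocator()

	frontend := alloc.lookupOrAllocate([]string{"app=frontend", "env=prod"})
	backend := alloc.lookupOrAllocate([]string{"app=backend", "env=prod"})
	scraper := alloc.lookupOrAllocate([]string{"app=scraper"})

	// Allow frontend -> backend; everything else is denied by default.
	allow := policy{{frontend, backend}: true}

	for _, pair := range [][2]uint32{{frontend, backend}, {scraper, backend}} {
		verdict := "DENY"
		if allow[pair] {
			verdict = "ALLOW"
		}
		fmt.Printf("src=%d dst=%d -> %s\n", pair[0], pair[1], verdict)
	}
}
```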
This design prioritizes performance and scalability (eBPF), Kubernetes-native integration (CRDs/informers), and extensibility (maps as APIs). For deep dives, trace hive.Lifecycle in agent startup or the BPF map pins under /sys/fs/bpf/tc/globals.