Envoy Architecture
Envoy is a high-performance L7 proxy written in C++ that processes network traffic through a modular filter chain architecture. It runs as a single process with a main thread handling management tasks and multiple worker threads processing connections independently. Configuration is delivered dynamically via the xDS API family, allowing seamless updates without restarts.
High-Level Architecture
```mermaid
graph TD
    CP[Control Plane<br/>Istio / xDS Server]
    Bootstrap[Bootstrap Config<br/>envoy.yaml / --config-path]
    Main[Main Thread<br/>source/server/server.cc]
    LM[Listener Manager<br/>source/server/listener_manager_impl.cc]
    CM[Cluster Manager<br/>source/common/upstream/cluster_manager_impl.cc]
    W1[Worker Thread 1<br/>source/server/worker_impl.cc]
    W2[Worker Thread 2]
    WN[Worker Thread N]
    L[Listeners<br/>TCP/UDP Sockets]
    FC[Filter Chains<br/>Network + HTTP Filters]
    HCM[HTTP Connection Manager<br/>source/extensions/filters/network/http_connection_manager]
    Router[Router Filter<br/>source/common/router/router.cc]
    UP[Upstream Connections<br/>Connection Pools]
    HC[Health Checkers<br/>source/extensions/health_checkers]
    OD[Outlier Detection<br/>source/common/upstream/outlier_detection_impl.cc]
    Stats[Stats / Tracing / Logging<br/>source/common/stats]
    xDS[xDS Subscriptions<br/>source/common/config]
    CP -->|gRPC Stream| xDS
    Bootstrap --> Main
    Main --> LM
    Main --> CM
    Main --> xDS
    LM --> W1
    LM --> W2
    LM --> WN
    W1 --> L
    L --> FC
    FC --> HCM
    HCM --> Router
    Router --> CM
    CM --> UP
    CM --> HC
    CM --> OD
    FC --> Stats
    xDS -->|LDS/RDS| LM
    xDS -->|CDS/EDS| CM
```
This diagram shows the main thread managing listeners and clusters, with worker threads handling connections through filter chains. The xDS layer feeds dynamic configuration to both the listener and cluster managers.
Component Breakdown
Server & Main Thread
Responsibility: Process lifecycle, signal handling, admin interface, hot restart coordination, and orchestrating all other components. The main thread owns the listener manager, cluster manager, and xDS subscription machinery.
Key Files/Directories:
- `source/exe/main.cc`: Process entry point, creates `MainCommon`.
- `source/exe/main_common.cc`: `MainCommon` initializes options, creates `Server::InstanceImpl`.
- `source/server/server.cc`: `InstanceImpl` boots the server: loads bootstrap config, creates managers, starts workers.
- `source/server/admin/`: Admin HTTP interface (`/stats`, `/clusters`, `/config_dump`, `/ready`).
- `source/server/hot_restart_impl.cc`: Shared memory IPC for zero-downtime binary upgrades.
Interfaces: Server::Instance (lifecycle), Admin (admin endpoint), HotRestart (binary upgrade IPC via Unix domain sockets and shared memory).
Listener Manager
Responsibility: Creates, updates, and drains listeners. Manages filter chain matching (by SNI, ALPN, destination port/IP) and distributes accepted connections to worker threads. Handles listener discovery (LDS) from xDS.
Key Files/Directories:
- `source/server/listener_manager_impl.cc`: Core listener lifecycle, LDS integration.
- `source/server/listener_impl.cc`: Individual listener with socket, filter chains.
- `source/server/filter_chain_manager_impl.cc`: Filter chain selection logic using a trie-based matcher.
- `source/server/worker_impl.cc`: Per-worker event loop, accepts connections.
Interfaces: ListenerManager, Listener, FilterChainManager. Workers run on separate threads, each with their own Dispatcher (event loop). Connection handoff uses the kernel’s SO_REUSEPORT for load balancing across workers.
Cluster Manager
Responsibility: Manages upstream clusters (groups of endpoints), connection pools, load balancing, health checking, circuit breaking, and outlier detection. Integrates with CDS/EDS for dynamic endpoint discovery.
Key Files/Directories:
- `source/common/upstream/cluster_manager_impl.cc`: Central manager, owns all clusters.
- `source/common/upstream/upstream_impl.cc`: `ClusterInfoImpl` holds cluster config, stats, LB policy.
- `source/common/upstream/load_balancer_impl.cc`: Round Robin, Least Request, Random, Ring Hash, Maglev implementations.
- `source/common/upstream/health_checker_impl.cc`: Active health checking (HTTP, TCP, gRPC).
- `source/common/upstream/outlier_detection_impl.cc`: Passive failure tracking, host ejection.
Interfaces: ClusterManager, ThreadLocalCluster, LoadBalancer, HealthChecker, OutlierDetector. Thread-local cluster instances provide lock-free access to host sets and connection pools on worker threads.
HTTP Connection Manager (HCM)
Responsibility: The most critical network filter. Manages HTTP protocol handling (HTTP/1.1, HTTP/2, HTTP/3 codecs), request/response lifecycle, HTTP filter chain execution, access logging, and tracing integration.
Key Files/Directories:
- `source/extensions/filters/network/http_connection_manager/`: HCM filter factory and config.
- `source/common/http/conn_manager_impl.cc`: Core HCM logic: codec callbacks, stream management.
- `source/common/http/http1/codec_impl.cc`: HTTP/1.1 codec (http-parser based).
- `source/common/http/http2/codec_impl.cc`: HTTP/2 codec (nghttp2).
Interfaces: ConnectionManagerImpl implements Network::ReadFilter (receives raw bytes from downstream), Http::ServerConnectionCallbacks (codec events). Each HTTP stream runs through the HTTP filter chain (decode -> route -> encode).
xDS Configuration Subsystem
Responsibility: Manages subscriptions to discovery services (LDS, RDS, CDS, EDS, SDS, ECDS, etc.) via gRPC streaming, REST, or filesystem. Handles config versioning, ACK/NACK, delta updates, and config validation.
Key Files/Directories:
- `source/common/config/`: Subscription factories, gRPC mux, utility helpers.
- `source/common/config/grpc_mux_impl.cc`: SotW (state-of-the-world) xDS multiplexer.
- `source/common/config/new_grpc_mux_impl.cc`: Delta xDS multiplexer.
- `api/envoy/service/`: Protobuf service definitions for all xDS APIs.
Interfaces: Config::SubscriptionFactory, Config::GrpcMux, Config::SubscriptionCallbacks. Components register callbacks for specific resource types and receive updates asynchronously.
Event Loop & Threading
Responsibility: Non-blocking I/O via libevent wrappers. Each worker thread runs its own event dispatcher. The main thread has a separate dispatcher for admin and management tasks.
Key Files/Directories:
- `source/common/event/dispatcher_impl.cc`: Event loop abstraction over libevent.
- `source/common/event/libevent.cc`: libevent initialization.
- `source/common/thread_local/thread_local_impl.cc`: TLS slot mechanism for cross-thread data sharing.
Interfaces: Event::Dispatcher (timers, file events, deferred callbacks), ThreadLocal::Instance (allocates TLS slots, posts updates from main thread to all workers).
Data Flow
Typical HTTP request lifecycle from downstream client to upstream service:
```mermaid
sequenceDiagram
    participant Client as Downstream Client
    participant Worker as Worker Thread<br/>Dispatcher
    participant Listener as Listener<br/>Filter Chain Match
    participant NF as Network Filters<br/>(TLS, HCM)
    participant HCM as HTTP Conn Manager<br/>conn_manager_impl.cc
    participant HF as HTTP Filters<br/>(Auth, Rate Limit, Router)
    participant CM as Cluster Manager<br/>cluster_manager_impl.cc
    participant LB as Load Balancer
    participant Upstream as Upstream Host
    Client->>Worker: TCP connect (SO_REUSEPORT)
    Worker->>Listener: Accept connection
    Listener->>Listener: Match filter chain (SNI, ALPN)
    Listener->>NF: Create network filter chain
    NF->>NF: TLS handshake (if configured)
    NF->>HCM: onData() - raw bytes
    HCM->>HCM: HTTP codec decode (HTTP/1.1 or HTTP/2)
    HCM->>HF: decodeHeaders() through filter chain
    HF->>HF: Auth filter, rate limit, etc.
    HF->>CM: route match -> select cluster
    CM->>LB: chooseHost() (Round Robin, Maglev, etc.)
    LB->>Upstream: Connect (or reuse from pool)
    Upstream-->>HCM: Response headers + body
    HCM->>HF: encodeHeaders() through filter chain
    HF-->>Client: HTTP response
    Note over Worker,HCM: Access log written after response complete
```
On configuration change: xDS update -> main thread validates -> post to all workers via TLS slots -> workers apply atomically (no locks on hot path).
Key Design Decisions
- Single-threaded-per-worker model: Each worker thread runs an independent event loop and processes connections without locks. Data shared across threads uses TLS (thread-local storage) slots, where the main thread posts updates and workers apply them in their event loop. Trade-off: eliminates lock contention on the hot path but means configuration updates propagate asynchronously (millisecond-level delay). Connection count per worker depends on `SO_REUSEPORT` kernel balancing.
- Filter chain composition: Both L4 (network) and L7 (HTTP) processing use a chain-of-responsibility pattern. Filters are instantiated per-connection (network) or per-stream (HTTP) from factories registered at startup. Pattern: decode/encode callbacks returning `FilterStatus::Continue` or `StopIteration`. Trade-off: flexibility and testability at the cost of per-stream allocation overhead, mitigated by arena allocation and filter reuse.
- xDS dynamic configuration: All runtime config (listeners, routes, clusters, endpoints, secrets) can be updated without restart. The gRPC streaming protocol with ACK/NACK/version semantics ensures consistency. Clever: delta xDS sends only changed resources, reducing control plane load in large meshes (10K+ endpoints). Trade-off: complexity in subscription management, and config staleness during network partitions (stale config is served until reconnection).
- Protobuf-first API: Configuration types are defined as protobuf messages in `api/envoy/`, generating C++ types, validation, and JSON/YAML support automatically. Trade-off: strong typing and cross-language compatibility at the cost of protobuf build complexity and generated code bloat.
- Extension registry pattern: Filters, transport sockets, and other plugins register via typed factory macros (`REGISTER_FACTORY`). The build system (`source/extensions/extensions_build_config.bzl`) controls which extensions are compiled in. Pattern: enables custom Envoy builds with only the needed filters, reducing binary size. Trade-off: requires Bazel build knowledge for customization.
- Hot restart: Envoy supports zero-downtime binary upgrades via shared memory and Unix domain socket IPC between old and new processes (`source/server/hot_restart_impl.cc`). The new process drains listeners from the old one. Trade-off: complex IPC protocol and shared-memory layout constraints, but seamless upgrades without connection drops.
- Connection pooling: Per-cluster, per-worker connection pools with HTTP/1.1 (connection-per-request or keep-alive) and HTTP/2 (multiplexed streams over fewer connections). Clever: the upstream protocol can differ from downstream (e.g., downstream HTTP/1.1 -> upstream HTTP/2). Trade-off: pool management complexity vs. reduced upstream connection count and latency.
This architecture prioritizes performance (lock-free hot path, zero-copy where possible), extensibility (filter chains, typed registries), and operational safety (xDS, hot restart, graceful drain). For deep dives, trace InstanceImpl::initialize() in source/server/server.cc for the startup sequence, or follow a request through ConnectionManagerImpl::onData().