Envoy Code Walkthrough

Envoy is a high-performance C++ L7 proxy built with Bazel. The codebase follows a modular architecture with clear separation between core server infrastructure (source/server/), shared libraries (source/common/), and pluggable extensions (source/extensions/). Configuration types are defined as Protocol Buffers in api/envoy/, and nearly all components implement abstract interfaces that enable dependency injection and testing. The codebase extensively uses the factory pattern, RAII, smart pointers, and a single-threaded event loop model per worker.

1. Where Execution Starts

Envoy builds into a single binary. The startup sequence initializes the server, loads bootstrap configuration, creates managers, starts worker threads, and begins accepting connections.

Primary Entry Points

| Binary | Purpose | Entry Function | Startup Flow |
| --- | --- | --- | --- |
| envoy | Main proxy binary | source/exe/main.cc: main | main() -> MainCommon() -> InstanceImpl::initialize() -> workers start |
| envoy (stripped) | Custom builds | source/exe/envoy_static.cc | Same flow, links statically |

Startup Deep Dive

  1. main() in source/exe/main.cc creates MainCommon with CLI args.
  2. MainCommon constructor in source/exe/main_common.cc parses options (--config-path, --config-yaml, --concurrency, etc.), initializes logging, and creates Server::InstanceImpl.
  3. InstanceImpl::initialize() in source/server/server.cc is the core bootstrap:
    • Loads bootstrap protobuf config (envoy::config::bootstrap::v3::Bootstrap).
    • Creates the main thread Dispatcher (event loop).
    • Initializes ThreadLocal::InstanceImpl (TLS slot allocator).
    • Creates ClusterManagerImpl (upstream management).
    • Creates ListenerManagerImpl (downstream listeners).
    • Initializes xDS subscriptions (ADS, or per-resource-type).
    • Creates admin interface (/stats, /clusters, /config_dump).
    • Starts worker threads (WorkerImpl, one per --concurrency).
  4. InstanceImpl::run() enters the main thread event loop (libevent event_base_loop).
  5. Workers independently accept connections via SO_REUSEPORT and process them in their own event loops.
flowchart TD
    A[main.cc: main()] --> B[MainCommon()<br/>Parse CLI args, init logging]
    B --> C[InstanceImpl::initialize()<br/>source/server/server.cc]
    C --> D[Load bootstrap config<br/>Bootstrap protobuf]
    D --> E[Create Dispatcher<br/>Main thread event loop]
    E --> F[Init ThreadLocal slots<br/>TLS mechanism]
    F --> G[Create ClusterManager<br/>Upstream clusters + health checks]
    G --> H[Create ListenerManager<br/>Downstream listeners]
    H --> I[Init xDS subscriptions<br/>LDS, CDS, RDS, EDS, SDS]
    I --> J[Start Admin interface<br/>localhost:9901]
    J --> K[Start Workers<br/>N threads, SO_REUSEPORT]
    K --> L[InstanceImpl::run()<br/>Main event loop]
    style A fill:#f9f
    style L fill:#9f9

Shutdown: SIGTERM triggers InstanceImpl::shutdown() -> drain listeners (configurable drain period) -> close connections -> stop workers -> exit. Hot restart mode transfers listeners to a new process before draining.

2. Core Abstractions

Envoy’s design revolves around interfaces defined in header files, factory-based extension creation, and a threading model that avoids locks on the hot path.

Key Types/Interfaces

  • Server::Instance: Top-level server interface. Owns dispatcher, cluster manager, listener manager, admin, and runtime.
  • Event::Dispatcher: Event loop abstraction wrapping libevent. Provides timers, file events, deferred callbacks, and DNS resolution. Each thread has exactly one.
  • ThreadLocal::Instance: Allocates TLS “slots” for cross-thread data. Main thread posts lambda updates that execute on each worker’s dispatcher. Critical for lock-free config updates.
  • Network::FilterManager: Manages a chain of ReadFilter/WriteFilter instances on a connection. Drives L4 filter callbacks.
  • Http::FilterChainFactory: Creates HTTP filters for each stream. Filters implement StreamDecoderFilter (request) and/or StreamEncoderFilter (response).
  • Upstream::ClusterManager: Thread-safe cluster registry. Workers access thread-local ThreadLocalCluster for lock-free host selection and connection pooling.
  • Upstream::LoadBalancer: Selects a host from a HostSet. Implementations: RoundRobin, LeastRequest, Random, RingHash, Maglev.
classDiagram
    class ServerInstance {
        +initialize()
        +run()
        +shutdown()
        +dispatcher() Dispatcher
        +clusterManager() ClusterManager
        +listenerManager() ListenerManager
    }
    class Dispatcher {
        +createTimer(cb) Timer
        +createFileEvent(fd, cb) FileEvent
        +post(cb) void
        +run(type) void
    }
    class ListenerManager {
        +addOrUpdateListener(config) bool
        +removeListener(name) bool
        +listeners() vector~Listener~
    }
    class ClusterManager {
        +getThreadLocalCluster(name) ThreadLocalCluster
        +addOrUpdateCluster(config) bool
        +httpConnPoolForCluster() ConnPool
    }
    class ConnectionManager {
        +onData(buffer) FilterStatus
        +newStream() ActiveStream
    }
    class StreamDecoderFilter {
        +decodeHeaders(headers, end) FilterStatus
        +decodeData(data, end) FilterStatus
    }
    class StreamEncoderFilter {
        +encodeHeaders(headers, end) FilterStatus
        +encodeData(data, end) FilterStatus
    }
    class LoadBalancer {
        +chooseHost(context) Host
    }
    ServerInstance --> Dispatcher
    ServerInstance --> ListenerManager
    ServerInstance --> ClusterManager
    ConnectionManager ..|> StreamDecoderFilter : creates chain
    ConnectionManager ..|> StreamEncoderFilter : creates chain
    ClusterManager --> LoadBalancer

Clever Patterns:

  • TLS Slot Mechanism: The ThreadLocal::Instance allocates numbered slots. Main thread calls set(slot, [](Event::Dispatcher&) { return data; }) which posts to every worker. Workers read slots synchronously in their loop—zero contention. This is how cluster membership updates, route table changes, and config updates reach workers without locks.
  • Factory Registry: Extensions register via the REGISTER_FACTORY macro, which instantiates a static registration object that inserts the factory into the global registry under its name. At config load, the factory is looked up by name from protobuf typed_config.type_url. Enables compile-time customization and runtime safety.
  • RAII everywhere: Connections, streams, timers, and buffers use std::unique_ptr or intrusive ref counting. Cleanup is deterministic, critical for a proxy handling millions of concurrent connections.
  • FilterStatus flow control: Filters return Continue (pass to next filter) or StopIteration (pause chain, e.g., waiting for async auth response). The filter manager resumes when the filter calls continueDecoding().

3. Request/Operation Lifecycle

Example: HTTP/1.1 request from downstream to upstream cluster.

  1. Accept: Worker’s SO_REUSEPORT socket accepts the TCP connection via libevent callback in source/server/worker_impl.cc.
  2. Filter Chain Match: FilterChainManagerImpl selects the matching filter chain based on connection properties (destination IP/port, SNI, ALPN) in source/server/filter_chain_manager_impl.cc.
  3. TLS Handshake: If TLS is configured, SslSocket performs the handshake (source/extensions/transport_sockets/tls/). ALPN negotiation selects HTTP version.
  4. Network Filter Chain: Raw bytes flow through L4 filters. The HTTP connection manager (HCM, ConnectionManagerImpl) is typically the terminal network filter.
  5. HTTP Decode: HCM invokes the HTTP codec (HTTP/1.1 via http-parser or HTTP/2 via nghttp2) in source/common/http/conn_manager_impl.cc. Creates an ActiveStream per request.
  6. HTTP Filter Chain (Decode): decodeHeaders() and decodeData() traverse the HTTP filter chain. Example filters: RBAC auth, JWT validation, rate limiting. Each can stop iteration for async work.
  7. Route Match: The Router filter (source/common/router/router.cc) matches the request against the route table (from RDS). Selects target cluster, applies retry/timeout policies.
  8. Cluster Selection: Router calls ClusterManager::getThreadLocalCluster() -> LoadBalancer::chooseHost() to pick an upstream endpoint.
  9. Connection Pool: Gets or creates an upstream connection from the per-host pool. HTTP/2 pools multiplex streams; HTTP/1.1 pools use connection-per-request or keep-alive.
  10. Upstream Request: Sends headers/body to the upstream via the codec.
  11. HTTP Filter Chain (Encode): Response traverses encodeHeaders() / encodeData() filters in reverse order.
  12. Response: HCM writes the response to the downstream connection via the codec. Access log is written after response completion.
sequenceDiagram
    participant Client as Downstream
    participant W as Worker Thread
    participant FCM as Filter Chain<br/>Manager
    participant TLS as TLS Socket
    participant HCM as HTTP Conn Manager
    participant Codec as HTTP Codec<br/>(http-parser/nghttp2)
    participant HF as HTTP Filters<br/>(Auth, RateLimit)
    participant Router as Router Filter
    participant CM as Cluster Manager
    participant Pool as Connection Pool
    participant Upstream as Upstream Host

    Client->>W: TCP SYN (SO_REUSEPORT)
    W->>FCM: Match filter chain
    FCM->>TLS: TLS handshake
    TLS->>HCM: onData(raw bytes)
    HCM->>Codec: Parse HTTP request
    Codec->>HCM: onHeaders / onData callbacks
    HCM->>HF: decodeHeaders(headers)
    HF->>HF: Auth check, rate limit
    HF->>Router: decodeHeaders (terminal filter)
    Router->>Router: Route table match
    Router->>CM: getThreadLocalCluster(name)
    CM->>CM: LoadBalancer::chooseHost()
    CM->>Pool: Get/create connection
    Pool->>Upstream: Forward request
    Upstream-->>Pool: Response
    Pool-->>Router: Upstream response
    Router-->>HF: encodeHeaders (reverse)
    HF-->>HCM: encodeHeaders
    HCM-->>Codec: Serialize HTTP response
    Codec-->>Client: HTTP response bytes
    Note over HCM: Access log after response

4. Reading Order

Prioritize build -> server -> HCM -> upstream. Familiarity with C++17, Bazel, and protobuf is assumed.

  1. Build System (30min): BUILDING.md, bazel/. Understand bazel build //source/exe:envoy-static. Review source/extensions/extensions_build_config.bzl for extension registry.

  2. API Definitions (30min): api/envoy/config/bootstrap/v3/bootstrap.proto for the top-level config shape. Skim api/envoy/config/listener/v3/ and api/envoy/config/cluster/v3/ for listener/cluster configs.

  3. Entry Point & Server (1hr): source/exe/main.cc -> source/exe/main_common.cc -> source/server/server.cc. Follow InstanceImpl::initialize() to see all component creation.

  4. Threading & Event Loop (1hr): source/common/event/dispatcher_impl.cc, source/common/thread_local/thread_local_impl.cc. Understanding the TLS slot mechanism is critical for grasping config updates.

  5. Listener Manager (1hr): source/server/listener_manager_impl.cc, source/server/worker_impl.cc. See how listeners are created/drained and connections dispatched.

  6. HTTP Connection Manager (2hr): source/common/http/conn_manager_impl.cc. This is the largest and most important file. Trace onData() -> codec -> ActiveStream -> filter chain.

  7. Router & Upstream (2hr): source/common/router/router.cc, source/common/upstream/cluster_manager_impl.cc. Follow a request from route matching to upstream connection pool.

  8. xDS Subscriptions (1hr): source/common/config/. Understand GrpcMuxImpl (SotW) and NewGrpcMuxImpl (delta) for dynamic config delivery.

  9. Extensions (ongoing): Pick a filter in source/extensions/filters/http/ (e.g., router, rbac, jwt_authn) to understand the filter factory pattern.

5. Common Patterns

  • Interface + Factory: Every extension implements an abstract interface (e.g., Http::StreamDecoderFilter) and a NamedHttpFilterConfigFactory. Config is a protobuf TypedExtensionConfig. Factory creates instances from proto. Pattern: REGISTER_FACTORY(MyFactory) macro in a .cc file registers with the global Registry.

  • Thread-Local Storage (TLS): The main thread owns canonical state (clusters, routes). It serializes updates as lambdas posted to workers via ThreadLocal::Instance::runOnAllThreads(). Workers apply in their event loop—no locks. Idiom: tls_slot_->set([data](Event::Dispatcher&) { return std::make_shared<ThreadLocalData>(data); }).

  • FilterStatus Flow Control: Filters return Continue or StopIteration. StopIteration pauses the chain (e.g., waiting for an external auth response). The filter calls decoder_callbacks_->continueDecoding() when ready. Watermarking handles backpressure.

  • Protobuf Config Validation: Every config message has a validate() method (auto-generated from validate.proto annotations). Misconfigurations are caught at load time, not at runtime. Pattern: MessageUtil::downcastAndValidate<MyProto>(config).

  • Stats/Metrics Everywhere: Components declare stat names in *_stats.h structs generated by COUNTER/GAUGE/HISTOGRAM macro expansions. Stats are allocated from a thread-safe Stats::Store and are counter/gauge/histogram types. The admin /stats endpoint dumps all metrics.

  • Connection Pool Abstraction: Http::ConnectionPool::Instance provides newStream(callbacks). HTTP/1.1 and HTTP/2 pools share this interface but differ internally (connection-per-request vs. multiplexed). Protocol auto-detection uses ALPN.

  • Buffer/Watermark Pattern: Buffer::Instance manages read/write data. High/low watermarks trigger onAboveWriteBufferHighWatermark() / onBelowWriteBufferLowWatermark() callbacks for backpressure propagation through the filter chain.

  • Testing: Extensive use of Google Test + Google Mock. Mocks for all interfaces in test/mocks/. Integration tests use IntegrationTest fixtures with real listeners/upstream servers. Fuzz tests in test/fuzz/. Run with bazel test //test/....

This covers the core flow and patterns. For advanced topics, explore Wasm filter support in source/extensions/filters/http/wasm/, QUIC/HTTP3 in source/common/quic/, or the access log subsystem in source/common/access_log/.