cert-manager Architecture

This document provides a deep dive into the architecture of cert-manager, the standard X.509 certificate management controller for Kubernetes. The goal is to help experienced developers understand the internal workings, major components, data flow, and key design decisions of the cert-manager system.

High-Level Architecture

cert-manager follows the Kubernetes controller pattern. Many specialized controllers run together inside a single controller process, while the webhook and CA injector run as separate deployments. The system extends the Kubernetes API with Custom Resource Definitions (CRDs) and uses reconciliation loops to drive certificate lifecycle automation.

graph TD
    subgraph Control Plane
        API[Kubernetes API Server]
        ETCD[(etcd)]
    end

    subgraph cert-manager
        CTRL[cert-manager Controller]
        WH[Webhook]
        CAI[CA Injector]
    end

    subgraph External
        ACME[ACME Server<br>e.g. Let's Encrypt]
        VAULT[HashiCorp Vault]
        VENAFI[Venafi]
    end

    subgraph Cluster Resources
        ING[Ingress / Gateway]
        SEC[TLS Secrets]
        CERT[Certificate CRs]
        ISS[Issuer / ClusterIssuer CRs]
    end

    CTRL -->|Watches & Reconciles| API
    WH -->|Validates & Mutates| API
    CAI -->|Injects CA Bundles| API
    API --> ETCD

    CTRL -->|ACME Protocol| ACME
    CTRL -->|Vault API| VAULT
    CTRL -->|Venafi API| VENAFI

    CTRL -->|Creates/Updates| SEC
    CTRL -->|Manages| CERT
    ING -->|Triggers via Annotations| CTRL
    ISS -->|Configures| CTRL

This diagram shows the three cert-manager deployments (controller, webhook, cainjector) interacting with the Kubernetes API Server. The controller communicates with external certificate authorities to obtain signed certificates, which are stored as Kubernetes Secrets.
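The reconciliation loops mentioned above can be illustrated with a minimal sketch. It is written in Python for brevity (cert-manager itself is written in Go), and the `reconcile` and `apply_actions` functions and the dictionary-based "cluster" are hypothetical stand-ins rather than cert-manager APIs. The only point is that a controller compares desired state with observed state and acts on the difference.

```python
def reconcile(desired: dict, observed: dict) -> list:
    """Return the actions needed to move observed state toward desired state."""
    actions = []
    for name, spec in desired.items():
        if name not in observed:
            actions.append(("create", name))
        elif observed[name] != spec:
            actions.append(("update", name))
    for name in observed:
        if name not in desired:
            actions.append(("delete", name))
    return actions


def apply_actions(actions: list, desired: dict, observed: dict) -> None:
    """Apply the computed actions to the (in-memory, hypothetical) observed state."""
    for verb, name in actions:
        if verb == "delete":
            del observed[name]
        else:
            observed[name] = desired[name]
```

Running `reconcile` again after `apply_actions` returns an empty list, which is the property that makes the loop safe to repeat.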

Component Breakdown

1. cert-manager Controller (cmd/controller)

The core component that runs all certificate lifecycle controllers. It watches for cert-manager CRDs and Kubernetes resources, reconciling them toward the desired state.

  • Responsibility: Certificate issuance and renewal. Manages the full lifecycle of Certificate, CertificateRequest, Order, and Challenge resources.
  • Key Controllers:
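    • certificates-trigger: Detects when a Certificate needs issuance or renewal and marks it by setting the Issuing condition.
    • certificates-request-manager: Creates CertificateRequest resources for Certificates that have been marked for issuance.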
    • certificates-issuing: Tracks CertificateRequest completion and updates the target Secret with the signed certificate.
    • certificates-key-manager: Manages private key generation and rotation for Certificates.
    • certificates-readiness: Updates Certificate status conditions based on the state of the issued certificate.
    • certificaterequests-issuer-*: Per-issuer controllers that process CertificateRequests (e.g., certificaterequests-issuer-acme, certificaterequests-issuer-ca).
    • orders: Manages ACME Order resources, coordinating challenge solving.
    • challenges: Solves individual ACME challenges via HTTP-01 or DNS-01 validation.
    • ingress-shim: Watches Ingress resources for cert-manager annotations and creates Certificate resources automatically.
    • gateway-shim: Watches Gateway API resources and creates Certificate resources automatically.
  • Key Directories:
    • pkg/controller/certificates/: Certificate lifecycle controllers
    • pkg/controller/certificaterequests/: CertificateRequest processing
    • pkg/controller/acmeorders/: ACME Order management
    • pkg/controller/acmechallenges/: ACME Challenge solving

2. Webhook (cmd/webhook)

A Kubernetes admission webhook that validates and mutates cert-manager resources.

  • Responsibility: Validates that cert-manager resources conform to their schemas and sets sensible defaults via mutation. Prevents invalid configurations from being persisted.
  • Key Behaviors:
    • Validates Certificate resources (e.g., ensures DNS names are specified, duration is valid).
    • Validates Issuer and ClusterIssuer configurations.
    • Mutates resources to set default values (e.g., default certificate duration of 90 days).
    • Performs API version conversion between cert-manager API versions.
  • Key Directories:
    • internal/webhook/: Webhook server implementation
    • pkg/webhook/: Validation and defaulting logic
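The defaulting and validation behavior described above can be sketched as two small functions. This is an illustration only: the function names and the specific rules are simplified assumptions, not the webhook's actual rule set, and the spec is represented as a plain dictionary.

```python
from datetime import timedelta

DEFAULT_DURATION = timedelta(days=90)  # default duration stated in the text above


def apply_defaults(spec: dict) -> dict:
    """Return a copy of the spec with unset fields defaulted (the 'mutate' step)."""
    out = dict(spec)
    out.setdefault("duration", DEFAULT_DURATION)
    return out


def validate(spec: dict) -> list:
    """Return a list of error strings; an empty list means the spec is admitted.

    The rules below are hypothetical simplifications chosen for illustration.
    """
    errors = []
    if not spec.get("secretName"):
        errors.append("spec.secretName is required")
    if not (spec.get("commonName") or spec.get("dnsNames")):
        errors.append("at least one of commonName or dnsNames must be set")
    duration = spec.get("duration")
    renew_before = spec.get("renewBefore")
    if duration is not None and renew_before is not None and renew_before >= duration:
        errors.append("spec.renewBefore must be shorter than spec.duration")
    return errors
```

Keeping defaulting separate from validation mirrors the mutate-then-validate order of Kubernetes admission, so validation always sees a fully populated object.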

3. CA Injector (cmd/cainjector)

Injects CA certificate bundles into Kubernetes resources that need them.

  • Responsibility: Watches for resources annotated with cert-manager.io/inject-ca-from and injects the CA certificate from the referenced Certificate’s Secret. This ensures webhook configurations, CRDs, and API services trust the correct CA.
  • Key Targets:
    • ValidatingWebhookConfiguration
    • MutatingWebhookConfiguration
    • CustomResourceDefinition (conversion webhooks)
    • APIService (aggregated API servers)
  • Key Directories:
    • pkg/controller/cainjector/: CA injection controller logic
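A rough sketch of the injection step, assuming the annotation value has the form `namespace/certificate-name` and that the CA is stored under the `ca.crt` key of the Certificate's Secret. The `inject_ca` function and the dictionary lookups keyed by `(namespace, name)` are hypothetical; a real controller would read these objects from the Kubernetes API.

```python
import base64

INJECT_ANNOTATION = "cert-manager.io/inject-ca-from"


def inject_ca(config: dict, certificates: dict, secrets: dict) -> bool:
    """Copy the CA from the referenced Certificate's Secret into each webhook.

    Returns True if the webhook configuration was modified.
    """
    ref = config.get("metadata", {}).get("annotations", {}).get(INJECT_ANNOTATION)
    if not ref:
        return False  # resource has not opted in to injection
    namespace, name = ref.split("/", 1)
    secret_name = certificates[(namespace, name)]["spec"]["secretName"]
    ca_pem = secrets[(namespace, secret_name)]["ca.crt"]
    bundle = base64.b64encode(ca_pem).decode("ascii")

    changed = False
    for webhook in config.get("webhooks", []):
        client_config = webhook.setdefault("clientConfig", {})
        if client_config.get("caBundle") != bundle:
            client_config["caBundle"] = bundle
            changed = True
    return changed
```

Returning whether anything changed lets the caller skip the API update when the bundle is already current.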

4. ACME Solver (cmd/acmesolver)

A lightweight HTTP server used for solving ACME HTTP-01 challenges.

  • Responsibility: Serves the ACME challenge token at the well-known HTTP path (/.well-known/acme-challenge/<token>) during HTTP-01 domain validation.
  • Lifecycle: Runs as a temporary Pod created by the challenges controller. The Pod is created when a challenge needs solving and deleted after completion.
  • Key Files:
    • cmd/acmesolver/main.go: Entry point for the solver binary
    • pkg/acme/webhook/: ACME webhook solver interface for DNS-01 providers (separate from the HTTP-01 solver binary)
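To make the HTTP-01 responsibility concrete, here is a minimal sketch of a server that answers only the well-known challenge path. It is not the acmesolver implementation: `respond` and `make_handler` are hypothetical names, and the token and key authorization are passed in directly, whereas the real solver receives them from the challenges controller.

```python
from http.server import BaseHTTPRequestHandler

CHALLENGE_PREFIX = "/.well-known/acme-challenge/"


def respond(path: str, token: str, key_authorization: str):
    """Return (status, body) for a GET request on `path`."""
    if path == CHALLENGE_PREFIX + token:
        return 200, key_authorization.encode("ascii")
    return 404, b"not found"


def make_handler(token: str, key_authorization: str):
    """Build a request handler class bound to a single challenge."""

    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            status, body = respond(self.path, token, key_authorization)
            self.send_response(status)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    return Handler

# To try it locally (hypothetical values):
#   from http.server import HTTPServer
#   HTTPServer(("127.0.0.1", 8089), make_handler("tok", "tok.thumbprint")).serve_forever()
```

Serving exactly one token per process fits the temporary-Pod lifecycle described above: the Pod exists for one challenge and is removed afterwards.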

Certificate Lifecycle

The certificate lifecycle is the core workflow of cert-manager. Below is the flow from a user creating a Certificate to the signed certificate being stored in a Secret.

sequenceDiagram
    participant User
    participant K8s as Kubernetes API
    participant Trigger as certificates-trigger
    participant KeyMgr as certificates-key-manager
    participant ReqMgr as certificates-request-manager
    participant Issuing as certificates-issuing
    participant CRCtrl as certificaterequests-issuer
    participant CA as Certificate Authority

    User->>K8s: Create Certificate
    Trigger->>K8s: Watch Certificate
    Trigger->>K8s: Set Issuing condition on Certificate
    KeyMgr->>K8s: Generate private key
    KeyMgr->>K8s: Store key in Secret
    ReqMgr->>K8s: Create CertificateRequest (with CSR)
    CRCtrl->>K8s: Watch CertificateRequest
    CRCtrl->>CA: Submit CSR to CA
    CA-->>CRCtrl: Return signed certificate
    CRCtrl->>K8s: Update CertificateRequest (Ready)
    Issuing->>K8s: Watch CertificateRequest completion
    Issuing->>K8s: Update Secret with signed cert
    Issuing->>K8s: Update Certificate status (Ready)

Certificate States

Issuing → Ready
   ↓
 Failed (retries with backoff)
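The "retries with backoff" behavior can be sketched as an exponential delay that doubles with each consecutive failure up to a cap. The one-hour base and 32-hour cap below are illustrative assumptions, not values taken from this document, and `retry_delay` is a hypothetical helper.

```python
from datetime import timedelta


def retry_delay(
    failed_attempts: int,
    base: timedelta = timedelta(hours=1),   # assumed base delay
    cap: timedelta = timedelta(hours=32),   # assumed maximum delay
) -> timedelta:
    """Return how long to wait before the next issuance attempt."""
    if failed_attempts < 1:
        return timedelta(0)
    # Clamp the exponent so very large attempt counts cannot overflow timedelta.
    exponent = min(failed_attempts - 1, 20)
    return min(base * (2 ** exponent), cap)
```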

ACME-Specific Flow

For ACME issuers (e.g., Let’s Encrypt), additional resources are involved:

graph LR
    CERT[Certificate] --> CR[CertificateRequest]
    CR --> ORD[Order]
    ORD --> CH1[Challenge: HTTP-01]
    ORD --> CH2[Challenge: DNS-01]
    CH1 -->|Solved| ORD
    CH2 -->|Solved| ORD
    ORD -->|Finalized| CR
    CR -->|Signed cert| CERT

ACME Order States:

pending → ready → valid (success)
   ↓               ↓
 invalid          errored

ACME Challenge States:

pending → processing → valid (success)
              ↓
           invalid (failure)
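The Order and Challenge state diagrams above can be encoded as a small transition check. This is a sketch under one stated assumption: any non-terminal state may move to a failure state, since the diagrams do not pin down exactly which states the failure arrows leave from. `can_transition` is a hypothetical helper.

```python
ORDER_PATH = ["pending", "ready", "valid"]
ORDER_FAILURES = {"invalid", "errored"}

CHALLENGE_PATH = ["pending", "processing", "valid"]
CHALLENGE_FAILURES = {"invalid"}


def can_transition(path: list, failures: set, current: str, new: str) -> bool:
    """Return True if moving from `current` to `new` is allowed."""
    if current in failures or current == path[-1]:
        return False  # terminal states never change
    if new in failures:
        return True  # assumption: failure is reachable from any non-terminal state
    # Otherwise only the next step on the success path is allowed.
    return new == path[path.index(current) + 1]
```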

Ingress and Gateway Integration

cert-manager includes “shim” controllers that watch Ingress and Gateway API resources for annotations, automatically creating Certificate resources.

Ingress Shim Flow

graph LR
    ING[Ingress with annotation<br>cert-manager.io/cluster-issuer] -->|ingress-shim watches| CTRL[Controller]
    CTRL -->|Creates| CERT[Certificate]
    CERT -->|triggers| CR[CertificateRequest]
    CR -->|issues| SEC[TLS Secret]
    SEC -->|referenced by| ING

Key annotations:

  • cert-manager.io/issuer - reference a namespace-scoped Issuer
  • cert-manager.io/cluster-issuer - reference a ClusterIssuer
  • cert-manager.io/common-name - set the certificate common name
  • cert-manager.io/duration - override default certificate duration
  • cert-manager.io/renew-before - override renewal window
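As a sketch of what the ingress-shim does with these annotations, the function below turns an Ingress (as a plain dictionary) into Certificate manifests, one per TLS block. The function name and the choice to name each Certificate after its `secretName` are assumptions made for this illustration.

```python
OPTIONAL_ANNOTATIONS = {
    "cert-manager.io/common-name": "commonName",
    "cert-manager.io/duration": "duration",
    "cert-manager.io/renew-before": "renewBefore",
}


def certificates_for_ingress(ingress: dict) -> list:
    """Build Certificate manifests for an annotated Ingress."""
    meta = ingress["metadata"]
    annotations = meta.get("annotations", {})
    if "cert-manager.io/cluster-issuer" in annotations:
        issuer_ref = {"kind": "ClusterIssuer", "name": annotations["cert-manager.io/cluster-issuer"]}
    elif "cert-manager.io/issuer" in annotations:
        issuer_ref = {"kind": "Issuer", "name": annotations["cert-manager.io/issuer"]}
    else:
        return []  # the Ingress has not opted in

    certificates = []
    for tls in ingress.get("spec", {}).get("tls", []):
        spec = {
            "secretName": tls["secretName"],
            "dnsNames": list(tls.get("hosts", [])),
            "issuerRef": issuer_ref,
        }
        # Copy optional overrides from annotations into the Certificate spec.
        for annotation, field in OPTIONAL_ANNOTATIONS.items():
            if annotation in annotations:
                spec[field] = annotations[annotation]
        certificates.append({
            "apiVersion": "cert-manager.io/v1",
            "kind": "Certificate",
            "metadata": {"name": tls["secretName"], "namespace": meta.get("namespace", "default")},
            "spec": spec,
        })
    return certificates
```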

Renewal Mechanism

cert-manager continuously monitors certificate expiration and automatically triggers renewal:

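  1. The certificates-readiness controller calculates when renewal should occur from spec.renewBefore. If renewBefore is unset, it defaults to 1/3 of the certificate's duration, so renewal starts 2/3 of the way through the certificate's lifetime.
  2. When the renewal time is reached, certificates-trigger marks the Certificate for re-issuance, which results in a new CertificateRequest.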
  3. The standard issuance flow proceeds, and the Secret is updated with the new certificate.
  4. The existing certificate stays in the Secret and remains usable until the new one replaces it, so rotation does not leave a gap in which no valid certificate is available.
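The renewal-time calculation in step 1 can be sketched as follows, assuming the default renewal point falls two thirds of the way through the certificate's lifetime and that an unusable `renewBefore` (one that is not shorter than the lifetime) falls back to that default. `renewal_time` is a hypothetical helper, not a cert-manager function.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def renewal_time(
    not_before: datetime,
    not_after: datetime,
    renew_before: Optional[timedelta] = None,
) -> datetime:
    """Return the moment at which renewal should be triggered."""
    lifetime = not_after - not_before
    if renew_before is None or renew_before >= lifetime:
        # Default: renew with one third of the lifetime remaining.
        renew_before = lifetime / 3
    return not_after - renew_before
```

For a 90-day certificate with no `renewBefore`, this yields a renewal time 60 days after issuance.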

Key Design Decisions

1. Multi-Controller Architecture

  • Pattern: Rather than a single monolithic controller, cert-manager splits certificate management across multiple focused controllers (trigger, key-manager, issuing, readiness). Each controller owns a specific phase of the lifecycle.
  • Trade-offs:
    • Pros: Clear separation of concerns, easier debugging of specific lifecycle phases, and the ability to change or test one phase without touching the others.
    • Cons: Coordination between controllers relies on status conditions and annotations on shared resources, adding complexity to state management.
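The coordination through status conditions can be illustrated with three tiny functions that never call each other and only communicate through fields on a shared Certificate object. This is a deliberately simplified sketch: the function bodies, the `issued` field, and the dictionary representation are hypothetical, and only the condition names Issuing and Ready come from cert-manager.

```python
def set_condition(cert: dict, ctype: str, status: str) -> None:
    cert.setdefault("conditions", {})[ctype] = status


def trigger(cert: dict) -> None:
    """Phase 1: flag the Certificate when issuance is needed."""
    conditions = cert.get("conditions", {})
    if cert.get("issued") is None and conditions.get("Issuing") != "True":
        set_condition(cert, "Issuing", "True")


def issuing(cert: dict, signed) -> None:
    """Phase 2: once a signed certificate exists, store it and clear the flag."""
    if cert.get("conditions", {}).get("Issuing") == "True" and signed is not None:
        cert["issued"] = signed
        set_condition(cert, "Issuing", "False")


def readiness(cert: dict) -> None:
    """Phase 3: report whether a usable certificate is present."""
    set_condition(cert, "Ready", "True" if cert.get("issued") is not None else "False")
```

Because each function reads and writes only the shared object, any of them can run at any time and in any order, which is the property the trade-off above describes.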

2. CertificateRequest as an Abstraction Layer

  • Pattern: Rather than having Certificate controllers directly interact with CAs, cert-manager introduces the CertificateRequest resource as an intermediary. This decouples the certificate lifecycle from the issuer implementation.
  • Trade-offs:
    • Pros: New issuer types can be added by implementing a CertificateRequest controller without modifying core certificate logic. External issuers can be built as separate projects.
    • Cons: Adds an extra resource and reconciliation hop to the issuance flow, increasing latency slightly.
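A minimal sketch of why this decoupling makes issuers pluggable: signers register themselves under an issuer group and kind, and a request whose issuer is not registered is simply left untouched for some other controller. The registry, the `register_signer` decorator, and the toy signer are hypothetical; the field names `spec.issuerRef`, `spec.request`, and `status.certificate` follow the CertificateRequest resource.

```python
SIGNERS = {}  # maps (group, kind) to a signing function


def register_signer(group: str, kind: str):
    """Decorator that registers a signing function for an issuer group/kind."""
    def decorator(fn):
        SIGNERS[(group, kind)] = fn
        return fn
    return decorator


@register_signer("cert-manager.io", "Issuer")
def toy_ca_signer(csr: str) -> str:
    # Placeholder for real signing; returns a marker string for illustration.
    return "signed-by-ca:" + csr


def process(request: dict) -> dict:
    """Sign the request if a matching signer is registered, else leave it alone."""
    ref = request["spec"]["issuerRef"]
    signer = SIGNERS.get((ref.get("group", "cert-manager.io"), ref["kind"]))
    if signer is None:
        return request  # not ours; another (possibly external) controller may handle it
    request.setdefault("status", {})["certificate"] = signer(request["spec"]["request"])
    return request
```

Adding a new issuer type in this model means registering one more function; `process` and everything upstream of it stay unchanged.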

3. Issuer as Configuration, Not Controller

  • Pattern: Issuer and ClusterIssuer resources are primarily configuration objects. Their controllers only perform setup (for example, registering an ACME account) and report a Ready condition; they never sign anything themselves. Signing is done by the CertificateRequest controllers, which read the referenced Issuer as configuration for how to contact a CA.
  • Trade-offs:
    • Pros: Simplifies the issuer model and keeps issuer reconciliation lightweight, since the configuration is mostly static.
    • Cons: Issuer health checking is limited; many invalid issuer configurations are only detected when a CertificateRequest attempts to use them.

4. ACME as a First-Class Citizen

  • Pattern: While other issuer types are relatively simple (submit CSR, get certificate), ACME requires a multi-step flow with Orders and Challenges. cert-manager models this explicitly with dedicated CRDs and controllers.
  • Trade-offs:
    • Pros: Full visibility into the ACME flow through Kubernetes resources. Users can inspect Order and Challenge states for debugging.
    • Cons: Adds two additional CRDs and controllers specifically for ACME, increasing the surface area of the system.

5. Temporary Solver Pods for HTTP-01

  • Pattern: For HTTP-01 ACME challenges, cert-manager creates temporary Pods running the acmesolver binary and corresponding Services/Ingresses to route challenge validation traffic.
  • Trade-offs:
    • Pros: Self-contained challenge solving without modifying existing application infrastructure.
    • Cons: Requires the ability to create Pods and Ingresses in the target namespace, and the solver Pod must be routable from the internet.

Conclusion

cert-manager’s architecture is a well-executed application of the Kubernetes controller pattern, extending the API with purpose-built CRDs that model the certificate lifecycle declaratively. The multi-controller design, CertificateRequest abstraction, and first-class ACME support make it both flexible and operationally transparent. Understanding these architectural decisions is key to navigating the codebase effectively and building on top of cert-manager.