cert-manager Architecture Documentation
This document provides a deep dive into the architecture of cert-manager, the standard X.509 certificate management controller for Kubernetes. The goal is to help experienced developers understand the internal workings, major components, data flow, and key design decisions of the cert-manager system.
High-Level Architecture
cert-manager follows the Kubernetes controller pattern and is deployed as three separate components: the controller, which runs many specialized control loops in a single process, the webhook, and the cainjector. The system extends the Kubernetes API with Custom Resource Definitions (CRDs) and uses reconciliation loops to drive certificate lifecycle automation.
graph TD
subgraph Control Plane
API[Kubernetes API Server]
ETCD[(etcd)]
end
subgraph cert-manager
CTRL[cert-manager Controller]
WH[Webhook]
CAI[CA Injector]
end
subgraph External
ACME[ACME Server<br>e.g. Let's Encrypt]
VAULT[HashiCorp Vault]
VENAFI[Venafi]
end
subgraph Cluster Resources
ING[Ingress / Gateway]
SEC[TLS Secrets]
CERT[Certificate CRs]
ISS[Issuer / ClusterIssuer CRs]
end
CTRL -->|Watches & Reconciles| API
WH -->|Validates & Mutates| API
CAI -->|Injects CA Bundles| API
API --> ETCD
CTRL -->|ACME Protocol| ACME
CTRL -->|Vault API| VAULT
CTRL -->|Venafi API| VENAFI
CTRL -->|Creates/Updates| SEC
CTRL -->|Manages| CERT
ING -->|Triggers via Annotations| CTRL
ISS -->|Configures| CTRL
This diagram shows the three cert-manager deployments (controller, webhook, cainjector) interacting with the Kubernetes API Server. The controller communicates with external certificate authorities to obtain signed certificates, which are stored as Kubernetes Secrets.
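The reconciliation loop at the heart of this pattern can be illustrated with a small Python sketch. This is not cert-manager code (cert-manager itself is written in Go); the `reconcile` and `run_queue` functions and the dictionary-based object store are hypothetical stand-ins for a controller worker and the Kubernetes API.

```python
from collections import deque


def reconcile(key, desired, actual):
    """Drive one object toward its desired state; return True if something changed."""
    want = desired.get(key)
    if want is None:
        # The object no longer exists: remove whatever was created for it.
        return actual.pop(key, None) is not None
    if actual.get(key) == want:
        return False  # Already converged, nothing to do.
    actual[key] = want
    return True


def run_queue(keys, desired, actual):
    """Process a work queue of object keys, as a controller worker would."""
    queue = deque(keys)
    changes = 0
    while queue:
        if reconcile(queue.popleft(), desired, actual):
            changes += 1
    return changes


desired = {"default/web-cert": {"dnsNames": ["example.com"]}}
actual = {"default/stale-cert": {"dnsNames": ["old.example.com"]}}
keys = ["default/web-cert", "default/stale-cert"]
changed = run_queue(keys, desired, actual)
```

Running the queue a second time makes no further changes, which is the idempotency property that reconciliation loops rely on.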
Component Breakdown
1. cert-manager Controller (cmd/controller)
The core component that runs all certificate lifecycle controllers. It watches for cert-manager CRDs and Kubernetes resources, reconciling them toward the desired state.
- Responsibility: Certificate issuance, renewal, and revocation. Manages the full lifecycle of Certificate, CertificateRequest, Order, and Challenge resources.
- Key Controllers:
- certificates-trigger: Detects when a Certificate needs issuance or renewal and creates CertificateRequest resources.
- certificates-issuing: Tracks CertificateRequest completion and updates the target Secret with the signed certificate.
- certificates-key-manager: Manages private key generation and rotation for Certificates.
- certificates-readiness: Updates Certificate status conditions based on the state of the issued certificate.
- certificaterequests-issuer-*: Per-issuer controllers that process CertificateRequests (e.g., certificaterequests-issuer-acme, certificaterequests-issuer-ca).
- orders: Manages ACME Order resources, coordinating challenge solving.
- challenges: Solves individual ACME challenges via HTTP-01 or DNS-01 validation.
- ingress-shim: Watches Ingress resources for cert-manager annotations and creates Certificate resources automatically.
- gateway-shim: Watches Gateway API resources and creates Certificate resources automatically.
- Key Directories:
- pkg/controller/certificates/: Certificate lifecycle controllers
- pkg/controller/certificaterequests/: CertificateRequest processing
- pkg/controller/acmeorders/: ACME Order management
- pkg/controller/acmechallenges/: ACME Challenge solving
2. Webhook (cmd/webhook)
A Kubernetes admission webhook that validates and mutates cert-manager resources.
- Responsibility: Validates that cert-manager resources conform to their schemas and sets sensible defaults via mutation. Prevents invalid configurations from being persisted.
- Key Behaviors:
- Validates Certificate resources (e.g., ensures DNS names are specified, duration is valid).
- Validates Issuer and ClusterIssuer configurations.
- Mutates resources to set default values (e.g., default certificate duration of 90 days).
- Performs API version conversion between cert-manager API versions.
- Key Directories:
- internal/webhook/: Webhook server implementation
- pkg/webhook/: Validation and defaulting logic
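The validate-then-default behavior can be sketched as follows. The rules are illustrative only: the 90-day default comes from the text above, while the one-hour minimum and the exact error messages are assumptions made for this example, not the webhook's actual rule set.

```python
from datetime import timedelta

DEFAULT_DURATION = timedelta(days=90)  # default stated in the text above
MIN_DURATION = timedelta(hours=1)      # assumption for this sketch


def default_certificate(spec):
    """Mutation step: return a copy of the spec with defaults filled in."""
    out = dict(spec)
    out.setdefault("duration", DEFAULT_DURATION)
    return out


def validate_certificate(spec):
    """Validation step: return a list of human-readable errors (empty if valid)."""
    errors = []
    if not spec.get("secretName"):
        errors.append("spec.secretName: required")
    if not spec.get("dnsNames") and not spec.get("commonName"):
        errors.append("spec: at least one of dnsNames or commonName is required")
    duration = spec.get("duration")
    if duration is not None and duration < MIN_DURATION:
        errors.append("spec.duration: must be at least one hour")
    renew_before = spec.get("renewBefore")
    if duration is not None and renew_before is not None and renew_before >= duration:
        errors.append("spec.renewBefore: must be shorter than spec.duration")
    return errors


good = default_certificate({"secretName": "web-tls", "dnsNames": ["example.com"]})
bad = default_certificate({"secretName": "web-tls", "dnsNames": ["example.com"],
                           "renewBefore": timedelta(days=120)})
```

Here `validate_certificate(good)` returns an empty list, while `bad` is rejected because its renewal window is longer than the defaulted duration.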
3. CA Injector (cmd/cainjector)
Injects CA certificate bundles into Kubernetes resources that need them.
- Responsibility: Watches for resources annotated with cert-manager.io/inject-ca-from and injects the CA certificate from the referenced Certificate’s Secret. This ensures webhook configurations, CRDs, and API services trust the correct CA.
- Key Targets:
- ValidatingWebhookConfiguration
- MutatingWebhookConfiguration
- CustomResourceDefinition (conversion webhooks)
- APIService (aggregated API servers)
- Key Directories:
- pkg/controller/cainjector/: CA injection controller logic
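A rough Python sketch of the injection step for a webhook configuration is shown below. The annotation value is a `namespace/name` reference to a Certificate. The `ca_pem_by_certificate` lookup table is a hypothetical simplification of reading the CA from the Certificate's Secret, and the function names are invented for this example.

```python
import base64
import copy

INJECT_ANNOTATION = "cert-manager.io/inject-ca-from"


def parse_inject_ca_from(value):
    """Split a 'namespace/name' reference to a Certificate."""
    namespace, sep, name = value.partition("/")
    if not sep or not namespace or not name or "/" in name:
        raise ValueError("expected 'namespace/name', got %r" % value)
    return namespace, name


def inject_ca_bundle(webhook_config, ca_pem_by_certificate):
    """Return a copy of the configuration with caBundle set on every webhook."""
    annotations = webhook_config.get("metadata", {}).get("annotations", {})
    ref = annotations.get(INJECT_ANNOTATION)
    if ref is None:
        return webhook_config  # Not annotated, leave untouched.
    ca_pem = ca_pem_by_certificate[parse_inject_ca_from(ref)]
    # caBundle is carried as base64-encoded bytes in the JSON representation.
    bundle = base64.b64encode(ca_pem).decode("ascii")
    patched = copy.deepcopy(webhook_config)
    for webhook in patched.get("webhooks", []):
        webhook.setdefault("clientConfig", {})["caBundle"] = bundle
    return patched


config = {
    "kind": "ValidatingWebhookConfiguration",
    "metadata": {"annotations": {INJECT_ANNOTATION: "cert-manager/webhook-ca"}},
    "webhooks": [{"name": "validate.example.com"}],
}
cas = {("cert-manager", "webhook-ca"): b"-----BEGIN CERTIFICATE-----\n...placeholder...\n"}
patched = inject_ca_bundle(config, cas)
```

The sketch patches a copy rather than the input, mirroring the read, modify, update cycle a controller performs against the API server.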
4. ACME Solver (cmd/acmesolver)
A lightweight HTTP server used for solving ACME HTTP-01 challenges.
- Responsibility: Serves the ACME challenge token at the well-known HTTP path (/.well-known/acme-challenge/<token>) during HTTP-01 domain validation.
- Lifecycle: Runs as a temporary Pod created by the challenges controller. The Pod is created when a challenge needs solving and deleted after completion.
- Key Files:
- cmd/acmesolver/main.go: Entry point for the solver binary
- pkg/acme/webhook/: ACME webhook solver interface for DNS-01 providers
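The request handling that an HTTP-01 solver performs can be sketched as a pure function. Under the ACME protocol (RFC 8555) the response body is the key authorization, which is the token joined to the account key thumbprint with a dot. The function name, the host check, and the sample values below are assumptions for illustration and are not taken from the acmesolver source.

```python
CHALLENGE_PREFIX = "/.well-known/acme-challenge/"


def solve_http01(path, host, expected_host, token, key_authorization):
    """Return (status, body) for a single challenge validation request."""
    if host != expected_host:
        return 404, ""  # Request is for a domain this solver does not serve.
    if not path.startswith(CHALLENGE_PREFIX):
        return 404, ""
    if path[len(CHALLENGE_PREFIX):] != token:
        return 404, ""  # Unknown token.
    return 200, key_authorization


token = "abc123"                            # hypothetical token
key_auth = token + "." + "thumbprintXYZ"    # hypothetical account key thumbprint
ok = solve_http01(CHALLENGE_PREFIX + token, "example.com", "example.com", token, key_auth)
wrong_token = solve_http01(CHALLENGE_PREFIX + "other", "example.com", "example.com", token, key_auth)
```

Because each solver Pod serves exactly one token for one domain, it can stay this small and be discarded once the challenge completes.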
Certificate Lifecycle
The certificate lifecycle is the core workflow of cert-manager. Below is the flow from a user creating a Certificate to the signed certificate being stored in a Secret.
sequenceDiagram
participant User
participant K8s as Kubernetes API
participant Trigger as certificates-trigger
participant KeyMgr as certificates-key-manager
participant Issuing as certificates-issuing
participant CRCtrl as certificaterequests-issuer
participant CA as Certificate Authority
User->>K8s: Create Certificate
Trigger->>K8s: Watch Certificate
Trigger->>K8s: Detect issuance needed
KeyMgr->>K8s: Generate private key
KeyMgr->>K8s: Store key in Secret
Trigger->>K8s: Create CertificateRequest (with CSR)
CRCtrl->>K8s: Watch CertificateRequest
CRCtrl->>CA: Submit CSR to CA
CA-->>CRCtrl: Return signed certificate
CRCtrl->>K8s: Update CertificateRequest (Ready)
Issuing->>K8s: Watch CertificateRequest completion
Issuing->>K8s: Update Secret with signed cert
Issuing->>K8s: Update Certificate status (Ready)
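The hand-off between controllers in the sequence above can be mimicked with a toy in-memory model. Each function stands in for one controller step from the diagram. The `store` dictionary, the placeholder key and CSR strings, and `fake_ca` are all hypothetical; no real cryptography or Kubernetes API calls are involved.

```python
def needs_issuance(store, name):
    """certificates-trigger: issue when the target Secret has no certificate yet."""
    return "tls.crt" not in store["secrets"].get(name, {})


def generate_key(store, name):
    """certificates-key-manager: store a (placeholder) private key in the Secret."""
    store["secrets"].setdefault(name, {})["tls.key"] = "KEY-" + name


def create_request(store, name):
    """certificates-trigger: create a CertificateRequest carrying a CSR."""
    dns_names = store["certificates"][name]["dnsNames"]
    store["requests"][name] = {
        "csr": "CSR(" + ",".join(dns_names) + ")",
        "certificate": None,
        "ready": False,
    }


def sign_request(store, name, ca):
    """certificaterequests-issuer: submit the CSR to the CA and record the result."""
    request = store["requests"][name]
    request["certificate"] = ca(request["csr"])
    request["ready"] = True


def finish_issuance(store, name):
    """certificates-issuing: copy the signed certificate into the Secret."""
    request = store["requests"].get(name)
    if request is None or not request["ready"]:
        return False
    store["secrets"][name]["tls.crt"] = request["certificate"]
    store["certificates"][name]["ready"] = True
    return True


def fake_ca(csr):
    """Stand-in for a certificate authority."""
    return "SIGNED(" + csr + ")"


store = {
    "certificates": {"web": {"dnsNames": ["example.com"], "ready": False}},
    "requests": {},
    "secrets": {},
}
if needs_issuance(store, "web"):
    generate_key(store, "web")
    create_request(store, "web")
    sign_request(store, "web", fake_ca)
    finish_issuance(store, "web")
```

In the real system these steps are not called in sequence by one caller. Each controller reacts independently to changes on the shared resources, which is why the diagram shows every participant watching the Kubernetes API.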
Certificate States
Issuing → Ready
↓
Failed (retries with backoff)
ACME-Specific Flow
For ACME issuers (e.g., Let’s Encrypt), additional resources are involved:
graph LR
CERT[Certificate] --> CR[CertificateRequest]
CR --> ORD[Order]
ORD --> CH1[Challenge: HTTP-01]
ORD --> CH2[Challenge: DNS-01]
CH1 -->|Solved| ORD
CH2 -->|Solved| ORD
ORD -->|Finalized| CR
CR -->|Signed cert| CERT
ACME Order States:
pending → ready → valid (success)
↓ ↓
invalid errored
ACME Challenge States:
pending → processing → valid (success)
↓
invalid (failure)
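The two state diagrams can be encoded as transition tables. The sketch below uses one reading of the arrows in the simplified diagrams above (failure branches leaving `pending` and `ready` for Orders, and leaving `processing` for Challenges). It is an illustration of the state-machine idea, not an exhaustive list of the states cert-manager or the ACME protocol define.

```python
ORDER_TRANSITIONS = {
    "pending": {"ready", "invalid"},
    "ready": {"valid", "errored"},
    "valid": set(),
    "invalid": set(),
    "errored": set(),
}

CHALLENGE_TRANSITIONS = {
    "pending": {"processing"},
    "processing": {"valid", "invalid"},
    "valid": set(),
    "invalid": set(),
}


def advance(transitions, current, target):
    """Move to the target state, rejecting transitions the table does not allow."""
    if target not in transitions[current]:
        raise ValueError("illegal transition %s -> %s" % (current, target))
    return target


def is_terminal(transitions, state):
    """A state with no outgoing transitions is final."""
    return not transitions[state]


state = "pending"
for target in ("ready", "valid"):
    state = advance(ORDER_TRANSITIONS, state, target)
```

A controller can use `is_terminal` to decide whether a resource still needs to be requeued for another reconciliation pass.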
Ingress and Gateway Integration
cert-manager includes “shim” controllers that watch Ingress and Gateway API resources for annotations, automatically creating Certificate resources.
Ingress Shim Flow
graph LR
ING[Ingress with annotation<br>cert-manager.io/cluster-issuer] -->|ingress-shim watches| CTRL[Controller]
CTRL -->|Creates| CERT[Certificate]
CERT -->|triggers| CR[CertificateRequest]
CR -->|issues| SEC[TLS Secret]
SEC -->|referenced by| ING
Key annotations:
- cert-manager.io/issuer: reference a namespace-scoped Issuer
- cert-manager.io/cluster-issuer: reference a ClusterIssuer
- cert-manager.io/common-name: set the certificate common name
- cert-manager.io/duration: override default certificate duration
- cert-manager.io/renew-before: override renewal window
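A sketch of how these annotations could be translated into Certificate manifests follows. The output uses the `cert-manager.io/v1` Certificate fields (`secretName`, `dnsNames`, `issuerRef`, and so on). The function name is hypothetical, and naming each Certificate after its TLS secret is an assumption of this example.

```python
OPTIONAL_FIELDS = {
    "cert-manager.io/common-name": "commonName",
    "cert-manager.io/duration": "duration",
    "cert-manager.io/renew-before": "renewBefore",
}


def certificates_from_ingress(ingress):
    """Build one Certificate manifest per TLS entry of an annotated Ingress."""
    meta = ingress["metadata"]
    annotations = meta.get("annotations", {})
    if "cert-manager.io/cluster-issuer" in annotations:
        issuer_ref = {"kind": "ClusterIssuer", "name": annotations["cert-manager.io/cluster-issuer"]}
    elif "cert-manager.io/issuer" in annotations:
        issuer_ref = {"kind": "Issuer", "name": annotations["cert-manager.io/issuer"]}
    else:
        return []  # No issuer annotation: the shim ignores this Ingress.

    certificates = []
    for tls in ingress.get("spec", {}).get("tls", []):
        spec = {
            "secretName": tls["secretName"],
            "dnsNames": list(tls.get("hosts", [])),
            "issuerRef": dict(issuer_ref),
        }
        for annotation, field in OPTIONAL_FIELDS.items():
            if annotation in annotations:
                spec[field] = annotations[annotation]
        certificates.append({
            "apiVersion": "cert-manager.io/v1",
            "kind": "Certificate",
            "metadata": {"name": tls["secretName"], "namespace": meta["namespace"]},
            "spec": spec,
        })
    return certificates


ingress = {
    "metadata": {
        "name": "web",
        "namespace": "default",
        "annotations": {
            "cert-manager.io/cluster-issuer": "letsencrypt",
            "cert-manager.io/duration": "2160h",
        },
    },
    "spec": {"tls": [{"hosts": ["example.com"], "secretName": "web-tls"}]},
}
certs = certificates_from_ingress(ingress)
```

The resulting Certificate then goes through the same issuance flow as one created by hand, which is why the shim itself stays thin.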
Renewal Mechanism
cert-manager continuously monitors certificate expiration and automatically triggers renewal:
- The certificates-readiness controller calculates when renewal should occur based on spec.renewBefore. If renewBefore is unset, it defaults to 1/3 of the certificate’s duration, so renewal is due once 2/3 of the lifetime has elapsed.
- When the renewal time is reached, certificates-trigger creates a new CertificateRequest.
- The standard issuance flow proceeds, and the Secret is updated with the new certificate.
- The old certificate remains valid until the new one replaces it, ensuring zero-downtime rotation.
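The renewal-time arithmetic is simple enough to show directly. In this sketch the renewal time is the certificate's expiry minus the renewal window, and by default renewal is due once two thirds of the lifetime has elapsed. Falling back to the default when the configured window is not shorter than the lifetime is an assumption of this example, as is the function name.

```python
from datetime import datetime, timedelta, timezone


def renewal_time(not_before, not_after, renew_before=None):
    """Return the moment at which renewal should be triggered."""
    lifetime = not_after - not_before
    if renew_before is None or renew_before >= lifetime:
        # Default window: the last third of the certificate's lifetime.
        renew_before = lifetime / 3
    return not_after - renew_before


issued = datetime(2024, 1, 1, tzinfo=timezone.utc)
expires = issued + timedelta(days=90)

default_renewal = renewal_time(issued, expires)
custom_renewal = renewal_time(issued, expires, renew_before=timedelta(days=15))
```

For a 90-day certificate, `default_renewal` falls 60 days after issuance, and a 15-day `renew_before` moves it to day 75.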
Key Design Decisions
1. Multi-Controller Architecture
- Pattern: Rather than a single monolithic controller, cert-manager splits certificate management across multiple focused controllers (trigger, key-manager, issuing, readiness). Each controller owns a specific phase of the lifecycle.
- Trade-offs:
- Pros: Clear separation of concerns, easier to debug specific lifecycle phases, and independent scaling of controller logic.
- Cons: Coordination between controllers relies on status conditions and annotations on shared resources, adding complexity to state management.
2. CertificateRequest as an Abstraction Layer
- Pattern: Rather than having Certificate controllers directly interact with CAs, cert-manager introduces the CertificateRequest resource as an intermediary. This decouples the certificate lifecycle from the issuer implementation.
- Trade-offs:
- Pros: New issuer types can be added by implementing a CertificateRequest controller without modifying core certificate logic. External issuers can be built as separate projects.
- Cons: Adds an extra resource and reconciliation hop to the issuance flow, increasing latency slightly.
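The decoupling can be sketched as a dispatch on the request's `issuerRef`. Each signer handles only requests whose issuer group and kind match its own and ignores the rest. The registry, the signer functions, and the `example.com`/`DemoIssuer` external issuer are hypothetical. The defaults of `cert-manager.io` and `Issuer` for an omitted group and kind follow the cert-manager API, and the `ready` flag is a simplification of the Ready status condition.

```python
DEFAULT_GROUP = "cert-manager.io"
DEFAULT_KIND = "Issuer"
SIGNERS = {}


def register_signer(group, kind):
    """Register a signing function for one issuer group/kind pair."""
    def decorator(sign):
        SIGNERS[(group, kind)] = sign
        return sign
    return decorator


@register_signer("cert-manager.io", "Issuer")
def sign_builtin(csr):
    return "builtin-signed:" + csr


@register_signer("example.com", "DemoIssuer")  # hypothetical external issuer
def sign_demo(csr):
    return "demo-signed:" + csr


def process_request(request):
    """Sign the request if a matching signer exists; return whether it was handled."""
    ref = request["spec"]["issuerRef"]
    key = (ref.get("group", DEFAULT_GROUP), ref.get("kind", DEFAULT_KIND))
    sign = SIGNERS.get(key)
    if sign is None:
        return False  # Not ours: leave it for some other controller.
    request["status"] = {"certificate": sign(request["spec"]["request"]), "ready": True}
    return True


builtin = {"spec": {"request": "csr-1", "issuerRef": {"name": "ca-issuer"}}}
external = {"spec": {"request": "csr-2",
                     "issuerRef": {"group": "example.com", "kind": "DemoIssuer", "name": "demo"}}}
unknown = {"spec": {"request": "csr-3",
                    "issuerRef": {"group": "other.test", "kind": "Nope", "name": "x"}}}
results = [process_request(r) for r in (builtin, external, unknown)]
```

Adding a new issuer type in this model means registering one more signer; nothing in the Certificate handling code needs to change.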
3. Issuer as Configuration, Not Controller
- Pattern: Issuer and ClusterIssuer resources are purely configuration objects. They do not have dedicated controllers that reconcile them. Instead, they are read by CertificateRequest controllers as configuration for how to contact a CA.
- Trade-offs:
- Pros: Simplifies the issuer model and avoids unnecessary reconciliation loops for static configuration.
- Cons: Issuer health checking is limited; invalid issuer configurations are only detected when a CertificateRequest attempts to use them.
4. ACME as a First-Class Citizen
- Pattern: While other issuer types are relatively simple (submit CSR, get certificate), ACME requires a multi-step flow with Orders and Challenges. cert-manager models this explicitly with dedicated CRDs and controllers.
- Trade-offs:
- Pros: Full visibility into the ACME flow through Kubernetes resources. Users can inspect Order and Challenge states for debugging.
- Cons: Adds two additional CRDs and controllers specifically for ACME, increasing the surface area of the system.
5. Temporary Solver Pods for HTTP-01
- Pattern: For HTTP-01 ACME challenges, cert-manager creates temporary Pods running the acmesolver binary and corresponding Services/Ingresses to route challenge validation traffic.
- Trade-offs:
- Pros: Self-contained challenge solving without modifying existing application infrastructure.
- Cons: Requires the ability to create Pods and Ingresses in the target namespace, and the solver Pod must be routable from the internet.
Conclusion
cert-manager’s architecture is a well-executed application of the Kubernetes controller pattern, extending the API with purpose-built CRDs that model the certificate lifecycle declaratively. The multi-controller design, CertificateRequest abstraction, and first-class ACME support make it both flexible and operationally transparent. Understanding these architectural decisions is key to navigating the codebase effectively and building on top of cert-manager.