cert-manager Webhook Deep Dive

The cert-manager webhook is a Kubernetes admission webhook server that validates and defaults cert-manager resources before they are persisted to etcd. It also converts resources between cert-manager API versions, which preserves backward compatibility during upgrades.

1. What This Component Does

The webhook serves three critical functions in the cert-manager deployment:

  • Validating Admission: Rejects invalid cert-manager resources before they are stored, preventing misconfigured Certificates, Issuers, and other resources from entering the system.
  • Mutating Admission: Sets default values on cert-manager resources (e.g., default certificate duration, default private key algorithm) to reduce boilerplate in user configurations.
  • Conversion Webhook: Converts resources between cert-manager API versions (e.g., v1alpha2 to v1) to support rolling upgrades and backward compatibility.

Why a Webhook?

Without the webhook, invalid configurations would be accepted by the Kubernetes API server and only discovered when the controller attempts to process them. This would lead to confusing error messages in controller logs rather than immediate, clear rejection at submission time. The webhook provides fast feedback to users via kubectl error messages.

2. How It Works

The webhook runs as a separate Deployment in the cert-manager namespace, listening for admission review requests from the Kubernetes API server.

graph LR
    USER[kubectl / API Client] -->|1. Create Certificate| API[Kubernetes API Server]
    API -->|2. AdmissionReview| WH[cert-manager Webhook]
    WH -->|3. Validate + Default| WH
    WH -->|4. AdmissionResponse| API
    API -->|5a. Accepted → etcd| ETCD[(etcd)]
    API -->|5b. Rejected → Error| USER

Request Processing Flow

  1. A user submits a cert-manager resource (e.g., Certificate) to the Kubernetes API server.
  2. The API server identifies the resource as matching a registered webhook configuration and sends an AdmissionReview request to the cert-manager webhook.
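  3. The webhook deserializes the resource and runs it through the appropriate defaulting or validation logic. The API server calls mutating webhooks before validating webhooks, so defaults are applied before the resource is validated.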
  4. The webhook returns an AdmissionResponse indicating whether the request is allowed or denied, along with any mutations (defaults).
  5. If allowed, the API server persists the resource. If denied, the error message is returned to the user.
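Steps 4 and 5 can be illustrated with a minimal Python sketch of the response envelope. It is not cert-manager's Go implementation: the function name `build_admission_response` and the 422 status code are assumptions made for illustration, while the `uid`, `allowed`, `status`, `patchType`, and `patch` fields follow the Kubernetes `admission.k8s.io/v1` AdmissionReview format.

```python
import base64
import json


def build_admission_response(review, errors, patch_ops):
    """Wrap a validation or defaulting result in an AdmissionReview response.

    review    -- the incoming AdmissionReview as a dict
    errors    -- list of validation error strings (empty means allowed)
    patch_ops -- list of JSON Patch operations produced by defaulting
    """
    response = {
        # The uid must be echoed back so the API server can match the reply.
        "uid": review["request"]["uid"],
        "allowed": not errors,
    }
    if errors:
        # Denied: this message is what the API client sees.
        # The status code is a choice made for this sketch.
        response["status"] = {"code": 422, "message": "; ".join(errors)}
    elif patch_ops:
        # Allowed with mutations: a base64-encoded JSON Patch document.
        response["patchType"] = "JSONPatch"
        response["patch"] = base64.b64encode(
            json.dumps(patch_ops).encode("utf-8")
        ).decode("ascii")
    return {
        "apiVersion": "admission.k8s.io/v1",
        "kind": "AdmissionReview",
        "response": response,
    }
```

A denied request carries only `allowed: false` and a status message; an allowed request carries a patch only when defaulting produced changes.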

3. Validation Rules

The webhook enforces a comprehensive set of validation rules for each cert-manager resource type.

Certificate Validation

Key validation rules for Certificate resources:

  • spec.secretName must be specified and non-empty.
  • spec.issuerRef must reference a valid issuer name, kind (Issuer or ClusterIssuer), and group.
  • At least one of spec.dnsNames, spec.ipAddresses, spec.uris, or spec.emailAddresses must be specified.
  • spec.duration must be greater than spec.renewBefore (the certificate must be valid longer than the renewal window).
  • spec.privateKey.algorithm must be one of RSA, ECDSA, or Ed25519.
  • spec.privateKey.size must be valid for the chosen algorithm (e.g., 2048/4096/8192 for RSA, 256/384/521 for ECDSA).
  • spec.privateKey.rotationPolicy must be Never or Always.
  • spec.usages must contain valid key usage values.
  • ACME-specific: If the issuer is ACME, spec.duration may not be honored, because the ACME server controls the lifetime of the certificates it issues.
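Several of the rules above can be expressed as a small checker. The following is a Python sketch rather than the actual Go validation code: `validate_certificate` and `parse_duration` are hypothetical names, the duration parser handles only `h`, `m`, and `s` units, and the RSA range check (2048 to 8192) is an assumption that generalizes the example sizes listed above.

```python
import re

VALID_ALGORITHMS = {"RSA", "ECDSA", "Ed25519"}
ECDSA_SIZES = {256, 384, 521}
_UNIT_SECONDS = {"h": 3600, "m": 60, "s": 1}


def parse_duration(text):
    """Parse a Go-style duration such as '2160h' or '1h30m' into seconds."""
    parts = re.findall(r"(\d+)([hms])", text)
    # Reject input with anything left over besides number+unit pairs.
    if not parts or "".join(n + u for n, u in parts) != text:
        raise ValueError("invalid duration: %r" % text)
    return sum(int(n) * _UNIT_SECONDS[u] for n, u in parts)


def validate_certificate(spec):
    """Return a list of error strings; an empty list means the spec passed."""
    errors = []
    if not spec.get("secretName"):
        errors.append("spec.secretName: must be specified")
    if not spec.get("issuerRef", {}).get("name"):
        errors.append("spec.issuerRef.name: must be specified")
    names = ("dnsNames", "ipAddresses", "uris", "emailAddresses")
    if not any(spec.get(field) for field in names):
        errors.append("spec: at least one of %s is required" % ", ".join(names))
    if "duration" in spec and "renewBefore" in spec:
        if parse_duration(spec["duration"]) <= parse_duration(spec["renewBefore"]):
            errors.append("spec.duration: must be greater than spec.renewBefore")

    private_key = spec.get("privateKey", {})
    algorithm = private_key.get("algorithm", "RSA")
    if algorithm not in VALID_ALGORITHMS:
        errors.append("spec.privateKey.algorithm: unsupported value %r" % algorithm)
    size = private_key.get("size")
    if size is not None:
        if algorithm == "RSA" and not 2048 <= size <= 8192:
            errors.append("spec.privateKey.size: invalid for RSA")
        if algorithm == "ECDSA" and size not in ECDSA_SIZES:
            errors.append("spec.privateKey.size: invalid for ECDSA")
    if private_key.get("rotationPolicy", "Never") not in {"Never", "Always"}:
        errors.append("spec.privateKey.rotationPolicy: must be Never or Always")
    return errors
```

Collecting every error instead of stopping at the first one lets the user fix all problems in a single round trip.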

Issuer Validation

Key validation rules for Issuer and ClusterIssuer resources:

  • Exactly one issuer type must be configured (ACME, CA, SelfSigned, Vault, or Venafi).
  • ACME: spec.acme.server must be a valid URL. spec.acme.privateKeySecretRef must be specified. At least one solver must be configured.
  • CA: spec.ca.secretName must reference a Secret containing a CA certificate and key.
  • Vault: spec.vault.server must be specified. spec.vault.path must be specified. Exactly one auth method must be configured.
  • Venafi: Either cloud or TPP configuration must be provided, but not both.
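The "exactly one" rules above share a common shape, sketched here in Python. This is an illustration under assumptions, not the real validation code: `validate_issuer` is a hypothetical name, the URL check only requires an http or https scheme and a host, and the Vault auth check simply counts the keys under `auth` without naming specific methods.

```python
from urllib.parse import urlparse

ISSUER_TYPES = ("acme", "ca", "selfSigned", "vault", "venafi")


def validate_issuer(spec):
    """Return a list of error strings for an Issuer or ClusterIssuer spec."""
    configured = [kind for kind in ISSUER_TYPES if kind in spec]
    if len(configured) != 1:
        return ["spec: exactly one issuer type must be configured, found %d"
                % len(configured)]

    kind = configured[0]
    config = spec[kind] or {}
    errors = []
    if kind == "acme":
        url = urlparse(config.get("server", ""))
        if url.scheme not in ("http", "https") or not url.netloc:
            errors.append("spec.acme.server: must be a valid URL")
        if not config.get("privateKeySecretRef", {}).get("name"):
            errors.append("spec.acme.privateKeySecretRef: must be specified")
        if not config.get("solvers"):
            errors.append("spec.acme.solvers: at least one solver is required")
    elif kind == "ca":
        if not config.get("secretName"):
            errors.append("spec.ca.secretName: must be specified")
    elif kind == "vault":
        if not config.get("server"):
            errors.append("spec.vault.server: must be specified")
        if not config.get("path"):
            errors.append("spec.vault.path: must be specified")
        if len(config.get("auth") or {}) != 1:
            errors.append("spec.vault.auth: exactly one auth method is required")
    elif kind == "venafi":
        # Valid only when exactly one of the two keys is present.
        if ("cloud" in config) == ("tpp" in config):
            errors.append("spec.venafi: configure either cloud or tpp, not both")
    return errors
```

A SelfSigned issuer has no required fields, so an empty `selfSigned: {}` block passes once the type count check succeeds.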

CertificateRequest Validation

  • spec.request must contain a valid PEM-encoded CSR.
  • spec.issuerRef must be specified.
  • The spec is immutable after creation (updates that change it are rejected).
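A rough Python sketch of these checks follows. It is deliberately shallow and built on assumptions: `looks_like_pem_csr` and `validate_certificate_request` are hypothetical names, the CSR check only verifies the PEM armor, the base64 body, and the leading DER SEQUENCE tag (it does not parse or verify the request), and the function takes the request as PEM text even though the Kubernetes API transports this byte field base64-encoded.

```python
import base64
import binascii
import re

_CSR_PEM = re.compile(
    r"\A-----BEGIN CERTIFICATE REQUEST-----\s+"
    r"([A-Za-z0-9+/=\s]+?)\s*"
    r"-----END CERTIFICATE REQUEST-----\s*\Z"
)


def looks_like_pem_csr(pem_text):
    """Shallow check: PEM armor, decodable base64, DER SEQUENCE tag."""
    match = _CSR_PEM.match(pem_text)
    if not match:
        return False
    try:
        der = base64.b64decode("".join(match.group(1).split()), validate=True)
    except binascii.Error:
        return False
    # A DER-encoded CSR is an ASN.1 SEQUENCE, which starts with byte 0x30.
    return len(der) > 0 and der[0] == 0x30


def validate_certificate_request(new_spec, old_spec=None):
    """Validate on create (old_spec is None) or on update (old_spec given)."""
    if old_spec is not None and new_spec != old_spec:
        # Updates that alter the spec are rejected outright.
        return ["spec: cannot be changed after creation"]
    errors = []
    if not looks_like_pem_csr(new_spec.get("request", "")):
        errors.append("spec.request: must contain a PEM-encoded CSR")
    if not new_spec.get("issuerRef", {}).get("name"):
        errors.append("spec.issuerRef: must be specified")
    return errors
```

A production check would fully parse the CSR; the standard library has no CSR parser, so this sketch stops at the structural level.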

4. Defaulting Logic

The mutating webhook sets sensible defaults to reduce configuration verbosity:

| Field | Default Value |
| --- | --- |
| Certificate.spec.duration | 2160h (90 days) |
| Certificate.spec.renewBefore | 720h (30 days) |
| Certificate.spec.privateKey.algorithm | RSA |
| Certificate.spec.privateKey.size | 2048 (RSA) or 256 (ECDSA) |
| Certificate.spec.privateKey.rotationPolicy | Never |
| Certificate.spec.privateKey.encoding | PKCS1 |
| Certificate.spec.revisionHistoryLimit | 1 |
| CertificateRequest.spec.issuerRef.group | cert-manager.io |
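A mutating webhook typically expresses defaults as JSON Patch operations that touch only unset fields. The Python sketch below applies the Certificate defaults listed in this section; `default_certificate` is a hypothetical name, and the assumption that Ed25519 keys receive no default size is mine, since only RSA and ECDSA sizes are given above.

```python
DEFAULT_KEY_SIZES = {"RSA": 2048, "ECDSA": 256}


def default_certificate(spec):
    """Return JSON Patch ops that fill unset fields of a Certificate spec.

    User-supplied values are never overwritten.
    """
    ops = []
    top_level = (
        ("duration", "2160h"),
        ("renewBefore", "720h"),
        ("revisionHistoryLimit", 1),
    )
    for field, value in top_level:
        if field not in spec:
            ops.append({"op": "add", "path": "/spec/" + field, "value": value})

    private_key = spec.get("privateKey")
    algorithm = (private_key or {}).get("algorithm", "RSA")
    key_defaults = {
        "algorithm": "RSA",
        "rotationPolicy": "Never",
        "encoding": "PKCS1",
    }
    if algorithm in DEFAULT_KEY_SIZES:
        key_defaults["size"] = DEFAULT_KEY_SIZES[algorithm]

    if private_key is None:
        # The parent object is missing, so add it in one operation.
        ops.append({"op": "add", "path": "/spec/privateKey", "value": key_defaults})
    else:
        for field, value in key_defaults.items():
            if field not in private_key:
                ops.append({"op": "add",
                            "path": "/spec/privateKey/" + field,
                            "value": value})
    return ops
```

Adding the whole `privateKey` object in one operation matters because a JSON Patch `add` fails when the parent of the target path does not exist.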

5. TLS Bootstrap

The webhook itself needs a TLS certificate to serve HTTPS requests from the Kubernetes API server. cert-manager handles this bootstrapping problem through a self-signed CA:

graph TD
    WH[Webhook Deployment] -->|Generates| SK[Self-Signed CA]
    SK -->|Signs| WC[Webhook Serving Certificate]
    SK -->|Stored in| SEC[Secret: cert-manager-webhook-ca]
    CAI[CA Injector] -->|Reads CA from| SEC
    CAI -->|Injects into| VWH[ValidatingWebhookConfiguration]
    CAI -->|Injects into| MWH[MutatingWebhookConfiguration]
    API[API Server] -->|Trusts CA via caBundle| VWH
    API -->|Trusts CA via caBundle| MWH

  1. The webhook generates a self-signed CA on startup and uses it to sign its own serving certificate.
  2. The CA certificate is stored in a Kubernetes Secret.
  3. The CA injector reads this Secret and patches the caBundle field in the webhook configurations.
  4. The Kubernetes API server uses the caBundle to verify the webhook’s TLS certificate when sending admission reviews.

This self-bootstrapping approach avoids circular dependencies (the webhook does not need cert-manager to issue its own certificate).
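The injection step can be pictured as a pure function over the webhook configuration object. The Python sketch below is illustrative only: `inject_ca_bundle` is a hypothetical name and the real CA injector is a Kubernetes controller, but the target field, `webhooks[].clientConfig.caBundle`, and its base64 encoding in JSON follow the Kubernetes webhook configuration schema.

```python
import base64
import copy


def inject_ca_bundle(webhook_config, ca_pem):
    """Return a copy of a webhook configuration with caBundle set on every hook.

    webhook_config -- a ValidatingWebhookConfiguration or
                      MutatingWebhookConfiguration as a dict
    ca_pem         -- the CA certificate as PEM bytes, as read from the Secret
    """
    patched = copy.deepcopy(webhook_config)  # leave the input untouched
    bundle = base64.b64encode(ca_pem).decode("ascii")
    for hook in patched.get("webhooks", []):
        hook.setdefault("clientConfig", {})["caBundle"] = bundle
    return patched
```

Because the function is idempotent, running it again with the same CA produces an identical object, which is the property a reconciling controller needs.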

6. API Version Conversion

cert-manager has evolved through multiple API versions (v1alpha2, v1alpha3, v1beta1, v1). The webhook handles conversion between these versions:

  • Hub Version: v1 is the internal hub version that all other versions convert to/from.
  • Spoke Versions: Older API versions are “spokes” that convert to the hub for storage and back for serving.
  • Conversion Logic: Defined in pkg/apis/certmanager/ and pkg/apis/acme/ with Convert_* functions that map fields between versions.

This enables users to continue using older API versions during upgrades while the API server stores resources in the latest version.
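The hub-and-spoke pattern can be sketched in Python with one pair of functions per spoke. The field mappings used here (`keyAlgorithm`, `keySize`, `keyEncoding`, and `organization` on the older side) are recalled from older cert-manager API versions and should be treated as illustrative assumptions, as should the function names; the authoritative mappings are the Convert_* functions mentioned above.

```python
import copy

# Older flat field name -> field name under spec.privateKey in the hub.
_KEY_FIELDS = {"keyAlgorithm": "algorithm", "keySize": "size", "keyEncoding": "encoding"}


def v1alpha2_to_v1(spec):
    """Convert a spoke (v1alpha2-style) Certificate spec to the hub shape."""
    hub = copy.deepcopy(spec)
    private_key = {}
    for old, new in _KEY_FIELDS.items():
        if old in hub:
            value = hub.pop(old)
            # Assumed convention: enum strings are upper-case in the hub.
            private_key[new] = value.upper() if isinstance(value, str) else value
    if private_key:
        hub["privateKey"] = private_key
    if "organization" in hub:
        hub["subject"] = {"organizations": hub.pop("organization")}
    return hub


def v1_to_v1alpha2(spec):
    """Convert a hub Certificate spec back to the spoke shape.

    Hub-only fields with no older equivalent are dropped in this sketch.
    """
    spoke = copy.deepcopy(spec)
    private_key = spoke.pop("privateKey", {})
    for old, new in _KEY_FIELDS.items():
        if new in private_key:
            value = private_key[new]
            spoke[old] = value.lower() if isinstance(value, str) else value
    subject = spoke.pop("subject", {})
    if "organizations" in subject:
        spoke["organization"] = subject["organizations"]
    return spoke
```

With N versions, routing every conversion through the hub needs only 2(N-1) functions instead of one for every ordered pair of versions.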

7. Key Code Paths

  • Entry Point: cmd/webhook/main.go - Starts the webhook server with TLS configuration.
  • Server Setup: internal/webhook/server.go - Configures the HTTP server, registers handlers for validation, mutation, and conversion.
  • Certificate Validation: internal/webhook/validation/certificate.go - Validation rules for Certificate resources.
  • Issuer Validation: internal/webhook/validation/issuer.go - Validation rules for Issuer and ClusterIssuer resources.
  • Defaulting: internal/webhook/defaults/ - Defaulting logic for all cert-manager resource types.
  • Conversion: pkg/apis/certmanager/v1alpha2/conversion.go (and similar for other versions) - API version conversion functions.

8. Configuration

The webhook is configured primarily through command-line flags:

  • --secure-port: HTTPS serving port (default: 10250).
  • --dynamic-serving-ca-secret-namespace: Namespace for the self-signed CA Secret.
  • --dynamic-serving-ca-secret-name: Name of the self-signed CA Secret.
  • --dynamic-serving-dns-names: DNS names for the serving certificate (must match the Service DNS name).
  • --healthz-port: Port for health check endpoints.
  • --feature-gates: Feature gate flags for experimental functionality.

Helm Values

When deployed via Helm, key webhook configuration values include:

webhook:
  replicaCount: 1
  timeoutSeconds: 30
  securePort: 10250
  hostNetwork: false
  serviceType: ClusterIP
  resources:
    requests:
      cpu: 50m
      memory: 64Mi

9. Operational Considerations

  • High Availability: Run the webhook with multiple replicas in production so that it does not become a single point of failure for cert-manager resource operations.
  • Failure Policy: The webhook configurations are set with failurePolicy: Fail by default, so if the webhook is unreachable, creates and updates of cert-manager resources are rejected. This keeps invalid resources out of the cluster but makes webhook availability a hard requirement.
  • Network Policies: The Kubernetes API server must be able to reach the webhook Pod on the configured secure port. Ensure network policies do not block this traffic.
  • Resource Limits: The webhook is lightweight but should have appropriate resource requests/limits to ensure it can handle admission review traffic under load.