Summary

A CRD extends the Kubernetes API by defining a new resource type (its schema, versions, validation, etc.). Once a CRD is installed, the API server can accept, validate, and store objects of that new type.

A CR is an instance of that new type—an actual object created by users or tools that carries the desired state for your operator to realize. The API server treats it like any other object once the CRD exists.

The API server validates your Custom Resource (CR) against its Custom Resource Definition (CRD) and, if valid, stores it in the etcd database. The controller is a separate process that reacts to this event.
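
To make the validation step more concrete, here is a toy sketch in Python (standard library only). It checks an object against a small subset of an OpenAPI-style schema. The `validate` helper and `AGENT_SPEC_SCHEMA` are hypothetical names made up for illustration, and only `type`, `required`, `properties`, `items`, `enum`, and `minimum` are handled. The real API server does considerably more than this.

```python
# Toy validator: NOT the API server's implementation, just the general idea.
TYPE_MAP = {"object": dict, "array": list, "string": str, "integer": int}


def validate(obj, schema, path=""):
    """Return a list of error strings; an empty list means the object is valid."""
    errors = []
    expected = schema.get("type")
    if expected in TYPE_MAP:
        wrong_type = not isinstance(obj, TYPE_MAP[expected])
        # bool is a subclass of int in Python, so reject it for "integer".
        if wrong_type or (expected == "integer" and isinstance(obj, bool)):
            return [f"{path or '.'}: expected {expected}"]
    if "enum" in schema and obj not in schema["enum"]:
        errors.append(f"{path}: must be one of {schema['enum']}")
    if "minimum" in schema and isinstance(obj, (int, float)) and obj < schema["minimum"]:
        errors.append(f"{path}: must be >= {schema['minimum']}")
    if expected == "object":
        for name in schema.get("required", []):
            if name not in obj:
                errors.append(f"{path}.{name}: required")
        for name, sub_schema in schema.get("properties", {}).items():
            if name in obj:
                errors.extend(validate(obj[name], sub_schema, f"{path}.{name}"))
    if expected == "array" and "items" in schema:
        for index, item in enumerate(obj):
            errors.extend(validate(item, schema["items"], f"{path}[{index}]"))
    return errors


# Hypothetical schema fragment, loosely modelled on a CR "spec".
AGENT_SPEC_SCHEMA = {
    "type": "object",
    "required": ["image"],
    "properties": {
        "image": {"type": "string"},
        "replicas": {"type": "integer", "minimum": 0},
    },
}

print(validate({"image": "ghcr.io/myco/agent:1.0.0"}, AGENT_SPEC_SCHEMA))  # no errors
print(validate({"replicas": -1}, AGENT_SPEC_SCHEMA))  # missing image, negative replicas
```

An object that fails this kind of check is rejected before it is ever stored, so the controller never sees it.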

Relationship between Controller and Reconciler

They are fundamentally linked. The reconciler is the core piece of logic inside the controller.

* Controller: The controller is the part of the running operator program that "watches" the Kubernetes API server for changes to specific resources (like the creation of your CR). An operator can run one or several controllers in the same process. The controller handles the boilerplate work of connecting to the cluster, managing caches of objects, and putting object keys into a work queue.

* Reconciler: The reconciler is the specific function or method that the controller calls to process a key from the work queue. It contains the actual business logic of your operator. Its one and only goal is to make the current state of the world match the desired state defined in a resource.

Think of the controller as the engine and the reconciler as the specific set of instructions the engine executes for a given task.
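
The split between the two can be sketched as a toy model in Python (standard library only). This is not how any real operator framework is implemented. The `Controller` class, its methods, and the `max_retries` parameter are hypothetical names for illustration, and retries here happen immediately, whereas real work queues typically apply rate limiting and backoff.

```python
from collections import deque


class Controller:
    """Toy controller: owns the queue and calls the reconciler once per key."""

    def __init__(self, reconcile, max_retries=3):
        self.reconcile = reconcile      # the reconciler: business logic only
        self.max_retries = max_retries
        self._queue = deque()
        self._pending = set()           # keys currently waiting in the queue
        self._failures = {}             # key -> number of failed attempts

    def enqueue(self, key):
        # Keys are deduplicated: several notifications for the same object
        # collapse into a single queue item.
        if key not in self._pending:
            self._pending.add(key)
            self._queue.append(key)

    def process_all(self):
        while self._queue:
            key = self._queue.popleft()
            self._pending.discard(key)
            try:
                self.reconcile(key)
                self._failures.pop(key, None)
            except Exception:
                attempts = self._failures.get(key, 0) + 1
                self._failures[key] = attempts
                if attempts < self.max_retries:
                    self.enqueue(key)   # try again later


seen = []
controller = Controller(reconcile=seen.append)
controller.enqueue("default/agent-minimal")
controller.enqueue("default/agent-minimal")  # duplicate, collapsed
controller.enqueue("default/agent-full")
controller.process_all()
print(seen)
```

The controller knows nothing about what an Agent is. Everything specific to your resource lives in the function passed as `reconcile`.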

The Flow: From CR Creation to Reconciliation

Here is a detailed, step-by-step explanation of the flow:

1. CR Creation: You (or another program) send a manifest for a new Custom Resource to the Kubernetes API server (e.g., via kubectl apply).

2. API Server & etcd: The API server authenticates and authorizes the request, then validates the object against the schema in the CRD. If it's valid, the API server persists the new CR object as a record in the etcd database. At this point, the object exists in the cluster as the "desired state."

3. The Watcher (Informer): Your running controller has established a "watch" on the API server for the type of CR you just created, usually through a component called an Informer. The API server notifies the Informer that a new object has been created.

4. The Work Queue: The Informer's event handler receives this notification and places a key for the object (typically in the format namespace/name) into a work queue. Because the queue holds keys rather than individual events, several rapid changes to the same object may be collapsed into one item. The queue ensures that every changed object is eventually processed, even if the operator is busy, and handles retries on failure.

5. The Reconcile Loop: The controller has a loop that constantly pulls keys from the work queue. When it pulls the key for your new CR, it directly calls the Reconciler function, passing it that key.

6. Reconciliation Logic: Inside the Reconciler function, the following happens:

* It fetches the full CR object from the cluster using the provided key.

* It observes the actual state of the world (e.g., checks if a Pod, a ConfigMap, or an external resource it's supposed to manage already exists).

* It compares the desired state (from the CR's spec) with the actual state.

* It takes action to converge the actual state toward the desired state. For a new CR, this usually means creating new Kubernetes objects (like Deployments, Services, etc.) or interacting with an external API.

* Finally, it often updates the status field of your CR to report on the actual state of the resources it now manages.
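
The reconciliation steps above can be sketched as a toy model in Python (standard library only). The "cluster" here is just a dictionary, and the `reconcile` function, its return values, and the single managed Deployment are assumptions made for illustration. A real reconciler would talk to the API server through a client library instead.

```python
def reconcile(cluster, key):
    """Drive the Deployment for one Agent toward the Agent's spec (toy model)."""
    # 1. Fetch the CR by key. It may have been deleted since it was queued.
    agent = cluster.get(("Agent", key))
    if agent is None:
        return "gone"

    # 2. Work out the desired state from the spec, and observe the actual state.
    desired = {
        "image": agent["spec"]["image"],
        "replicas": agent["spec"].get("replicas", 1),
    }
    actual = cluster.get(("Deployment", key))

    # 3. Compare, and 4. act only if there is a difference.
    if actual is None:
        cluster[("Deployment", key)] = dict(desired)
        action = "created"
    elif actual != desired:
        actual.update(desired)
        action = "updated"
    else:
        action = "unchanged"

    # 5. Report back through the status field.
    agent["status"] = {"phase": "Ready" if action == "unchanged" else "Reconciling"}
    return action


cluster = {("Agent", "default/agent-minimal"): {"spec": {"image": "ghcr.io/myco/agent:1.0.0"}}}
print(reconcile(cluster, "default/agent-minimal"))  # first pass creates the Deployment
print(reconcile(cluster, "default/agent-minimal"))  # second pass finds nothing to do
```

Note that the function is idempotent: it only looks at the current desired and actual state, so calling it again for the same key is always safe. That is what lets the work queue retry or collapse items freely.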

CRDs can be categorized into two main types

1. Active CRDs (The kind you are thinking of)

This is the most common pattern. The CR represents a desired, active state in the cluster. You create the CR because you want the operator to do something and create other resources.

* Example: Agent
* You want: A running agent pod, a Kubernetes Service, a ConfigMap, etc.
* How it works: You kubectl apply an Agent manifest. The Agent controller sees it, reads its spec, and creates a Deployment, Service, etc., to make your wish a reality. The controller's job is to reconcile the Agent object.

2. Passive / Configuration CRDs (The exception)

This is a less common but powerful pattern. The CR does not represent an active resource, but rather a reusable piece of configuration, a template, or a policy. It doesn't do anything on its own. It just sits in the API server, holding data for other controllers to read and use.

* Example: ServingRuntime
* You want: A reusable blueprint for how agent pods should be configured (e.g., what container image to use, what environment variables are needed).
* How it works: An administrator runs kubectl apply on a ServingRuntime manifest. Nothing visible happens in the cluster: no pods are created, and the CR is just stored as data until another controller (for example, the one reconciling Agent) reads it.
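
How a passive CR gets used can be sketched in Python (standard library only). The `effective_pod_config` helper is hypothetical, and so is the precedence rule it applies: values set on the Agent override the defaults from the ServingRuntime. The post does not specify how the two are combined, so treat this purely as one plausible design. Field names (`envDefaults`, `podAnnotations`, `env`, `image`) follow the example manifests in this post.

```python
from typing import Optional


def effective_pod_config(agent_spec: dict, runtime_spec: Optional[dict]) -> dict:
    """Merge a passive ServingRuntime's defaults into an Agent's spec (toy model)."""
    runtime_spec = runtime_spec or {}

    # Start from the runtime's defaults, then let the Agent override by name.
    # ASSUMPTION: Agent values win; this precedence is not stated in the post.
    env = {e["name"]: e.get("value", "") for e in runtime_spec.get("envDefaults", [])}
    env.update({e["name"]: e.get("value", "") for e in agent_spec.get("env", [])})

    return {
        "image": agent_spec.get("image") or runtime_spec.get("image"),
        "env": env,
        "annotations": dict(runtime_spec.get("podAnnotations", {})),
    }


runtime = {
    "image": "ghcr.io/myco/agent:stable",
    "envDefaults": [{"name": "TZ", "value": "UTC"}, {"name": "LOG_FORMAT", "value": "json"}],
    "podAnnotations": {"prometheus.io/scrape": "true"},
}
agent = {
    "image": "ghcr.io/myco/agent:1.2.3",
    "env": [{"name": "LOG_LEVEL", "value": "info"}],
}
config = effective_pod_config(agent, runtime)
print(config["image"])
print(config["env"])
```

The ServingRuntime itself is never reconciled here. It is only read as input while some other object is being reconciled.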

Example CRD, CR


Active CRD - CRD - `Agent`

Defines the schema your controller watches and reconciles.


apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: agents.example.mycompany.io
spec:
  group: example.mycompany.io
  scope: Namespaced
  names:
    plural: agents
    singular: agent
    kind: Agent
    shortNames: [ag]
    categories: [all, agents]
  versions:
    - name: v1alpha1
      served: true
      storage: true
      additionalPrinterColumns:
        - name: Image
          type: string
          jsonPath: .spec.image
        - name: Replicas
          type: integer
          jsonPath: .spec.replicas
        - name: Phase
          type: string
          jsonPath: .status.phase
      subresources:
        status: {}
      schema:
        openAPIV3Schema:
          type: object
          description: Agent represents a running agent workload managed by the operator.
          properties:
            spec:
              type: object
              required: ["image"]
              properties:
                image:
                  type: string
                  description: Container image for the agent.
                replicas:
                  type: integer
                  minimum: 0
                  default: 1
                env:
                  type: array
                  description: Environment variables injected into the agent pod(s).
                  items:
                    type: object
                    required: ["name"]
                    properties:
                      name:
                        type: string
                      value:
                        type: string
                config:
                  type: object
                  description: Free-form configuration passed via a ConfigMap.
                  properties:
                    configMapRef:
                      type: object
                      properties:
                        name:
                          type: string
                service:
                  type: object
                  description: Exposes the agent via a Service.
                  properties:
                    type:
                      type: string
                      enum: ["ClusterIP","NodePort","LoadBalancer"]
                      default: "ClusterIP"
                    port:
                      type: integer
                      minimum: 1
                      maximum: 65535
                      default: 8080
                resources:
                  type: object
                  description: Pod resource requests/limits.
                  properties:
                    requests:
                      type: object
                      additionalProperties:
                        type: string
                    limits:
                      type: object
                      additionalProperties:
                        type: string
                servingRuntimeRef:
                  type: object
                  description: Optional reference to a passive ServingRuntime config.
                  properties:
                    name:
                      type: string
                    namespace:
                      type: string
            status:
              type: object
              properties:
                phase:
                  type: string
                  enum: ["Pending","Reconciling","Ready","Error"]
                readyReplicas:
                  type: integer
                  minimum: 0
                endpoints:
                  type: array
                  items:
                    type: string
                conditions:
                  type: array
                  items:
                    type: object
                    properties:
                      type: { type: string }
                      status: { type: string }
                      reason: { type: string }
                      message: { type: string }
                      lastTransitionTime: { type: string, format: date-time }

CRs - `Agent`

Minimal:


apiVersion: example.mycompany.io/v1alpha1
kind: Agent
metadata:
  name: agent-minimal
spec:
  image: ghcr.io/myco/agent:1.0.0
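
To see what the minimal manifest turns into once the schema's `default` values are applied, here is a toy sketch in Python (standard library only). The `apply_defaults` helper and `SPEC_SCHEMA` are hypothetical, arrays are not handled, and the sketch assumes that a default is only filled in when the parent object is present, which is my understanding of how the API server behaves.

```python
import copy


def apply_defaults(obj: dict, schema: dict) -> dict:
    """Return a copy of obj with missing fields filled from schema defaults (toy model)."""
    result = copy.deepcopy(obj)
    for name, sub_schema in schema.get("properties", {}).items():
        if name not in result and "default" in sub_schema:
            result[name] = copy.deepcopy(sub_schema["default"])
        # Recurse only into objects that exist; absent parents are left absent.
        if isinstance(result.get(name), dict):
            result[name] = apply_defaults(result[name], sub_schema)
    return result


# Fragment of the Agent spec schema from the CRD above.
SPEC_SCHEMA = {
    "type": "object",
    "properties": {
        "image": {"type": "string"},
        "replicas": {"type": "integer", "default": 1},
        "service": {
            "type": "object",
            "properties": {
                "type": {"type": "string", "default": "ClusterIP"},
                "port": {"type": "integer", "default": 8080},
            },
        },
    },
}

minimal_spec = {"image": "ghcr.io/myco/agent:1.0.0"}
print(apply_defaults(minimal_spec, SPEC_SCHEMA))
print(apply_defaults({"image": "ghcr.io/myco/agent:1.0.0", "service": {}}, SPEC_SCHEMA))
```

Under these assumptions the minimal Agent ends up with `replicas: 1` but no `service` block, because nothing in the manifest asked for a Service. Supplying an empty `service: {}` is enough to pick up the `ClusterIP` and `8080` defaults.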

Full (references a ConfigMap and exposes a Service):


apiVersion: example.mycompany.io/v1alpha1
kind: Agent
metadata:
  name: agent-full
spec:
  image: ghcr.io/myco/agent:1.2.3
  replicas: 3
  env:
    - name: LOG_LEVEL
      value: info
  config:
    configMapRef:
      name: agent-full-config
  service:
    type: ClusterIP
    port: 8080
  resources:
    requests:
      cpu: "200m"
      memory: "256Mi"
    limits:
      cpu: "1"
      memory: "512Mi"
  servingRuntimeRef:
    name: default-serving-runtime
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: agent-full-config
data:
  agent.yaml: |
    featureFlags:
      coolThing: true
    endpoints:
      metrics: /metrics

Passive / Configuration CRD - CRD - `ServingRuntime`

A reusable config object that doesn’t create pods by itself; your controller reads it when reconciling Agent.


apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: servingruntimes.example.mycompany.io
spec:
  group: example.mycompany.io
  scope: Namespaced
  names:
    plural: servingruntimes
    singular: servingruntime
    kind: ServingRuntime
    shortNames: [sr]
    categories: [all, config]
  versions:
    - name: v1alpha1
      served: true
      storage: true
      additionalPrinterColumns:
        - name: Image
          type: string
          jsonPath: .spec.image
      schema:
        openAPIV3Schema:
          type: object
          description: Passive config describing how agents should be configured.
          properties:
            spec:
              type: object
              required: ["image"]
              properties:
                image:
                  type: string
                  description: Default container image to use for agents.
                envDefaults:
                  type: array
                  items:
                    type: object
                    required: ["name","value"]
                    properties:
                      name: { type: string }
                      value: { type: string }
                ports:
                  type: array
                  items:
                    type: object
                    properties:
                      name: { type: string }
                      containerPort: { type: integer }
                podAnnotations:
                  type: object
                  additionalProperties:
                    type: string

CRs - `ServingRuntime`


apiVersion: example.mycompany.io/v1alpha1
kind: ServingRuntime
metadata:
  name: default-serving-runtime
spec:
  image: ghcr.io/myco/agent:stable
  envDefaults:
    - name: TZ
      value: UTC
    - name: LOG_FORMAT
      value: json
  ports:
    - name: http
      containerPort: 8080
  podAnnotations:
    prometheus.io/scrape: "true"
    prometheus.io/port: "8080"

REF

https://kubernetes.io/docs/concepts/extend-kubernetes/operator/

https://book.kubebuilder.io/reference/watching-resources
https://aws-controllers-k8s.github.io/community/docs/community/overview/

SLQ notes

October 14, 2025

Kubernetes - operator
