Review a Kubernetes manifest for production readiness

advancedClaude OpusIT & SecurityDevopskubernetescontainer-securitydevopscode-reviewpod-security

Use case

Use this prompt before deploying a new workload to a shared cluster, when adopting a chart from a third party, or as part of a periodic sweep of namespace manifests. It catches the issues that don't show up in `kubectl apply` but cause incidents at 3am.

The prompt

You are a senior platform engineer reviewing Kubernetes manifests for production. Audit the manifests below against:

1. **Security context**: runAsNonRoot, readOnlyRootFilesystem, drop ALL capabilities, seccomp profile, allowPrivilegeEscalation
2. **PodSecurity admission**: which level passes (privileged / baseline / restricted) and what blocks `restricted`
3. **Resources**: requests and limits set, sane ratios, ephemeral-storage limits, no `cpu: 100` typos
4. **Probes**: liveness, readiness, startup; correct timeouts; not pointing at the same endpoint
5. **Availability**: replicas > 1, PodDisruptionBudget, topologySpreadConstraints or anti-affinity, rollingUpdate maxUnavailable
6. **Networking**: NetworkPolicy presence, service type, sensible port mapping
7. **Secrets**: no inline secrets, no env-from of unscoped configmaps, projected serviceaccount tokens
8. **Observability**: Prometheus annotations or ServiceMonitor, structured logging assumed
9. **Cluster-specific**: image pull policy, imagePullSecrets, nodeSelector / tolerations correctness

Manifests:
```yaml
{{manifests}}
```

Cluster context:
- K8s version: {{k8s_version}}
- PodSecurity profile applied: {{podsecurity}}
- Service mesh: {{mesh}}
- Ingress controller: {{ingress}}
- Threat model: {{threat}}

Output:
- **Critical** (block merge)
- **High**
- **Medium / nits**
- **Patched manifest** with all critical and high fixes

Cite specific PodSecurity restrictions or version-specific behavior where relevant.

Variables

{{{{manifests}}}}Replace with your {{manifests}}
{{{{k8s_version}}}}Replace with your {{k8s version}}
{{{{podsecurity}}}}Replace with your {{podsecurity}}
{{{{mesh}}}}Replace with your {{mesh}}
{{{{ingress}}}}Replace with your {{ingress}}
{{{{threat}}}}Replace with your {{threat}}

Example

Input

manifests: |
  apiVersion: apps/v1
  kind: Deployment
  metadata:
    name: orders-api
  spec:
    replicas: 1
    selector:
      matchLabels: { app: orders-api }
    template:
      metadata:
        labels: { app: orders-api }
      spec:
        containers:
        - name: app
          image: orders-api:latest
          ports: [{ containerPort: 8080 }]
          env:
          - name: DB_PASSWORD
            value: "hunter2"
k8s_version: 1.29
podsecurity: restricted (enforced at namespace level)
mesh: Istio 1.21 with strict mTLS
ingress: nginx-ingress
threat: Customer-facing API, processes payment events

Output

## Critical (block merge)

- **Inline DB_PASSWORD**: secrets in plain `value:` fields end up in etcd, in `kubectl describe` output, and in any GitOps repo. Move to a Kubernetes Secret referenced via `valueFrom.secretKeyRef`. Better: External Secrets Operator pulling from your secret manager.
- **No securityContext, will fail PodSecurity `restricted` admission** on K8s 1.29. The pod will be rejected at apply time. Add at the container level: `runAsNonRoot: true`, `runAsUser: 1000`, `allowPrivilegeEscalation: false`, `readOnlyRootFilesystem: true`, `capabilities.drop: [ALL]`, `seccompProfile.type: RuntimeDefault`.
- **`image: orders-api:latest`**: unpinned tag is non-reproducible and can pull a different image on pod restart. Pin to digest: `orders-api@sha256:...` or a versioned tag enforced by your registry policy.
- **`replicas: 1`** for a customer-facing payment service: one node drain or rolling update = outage. Minimum 2 replicas plus a PodDisruptionBudget.

## High

- **No probes**: K8s can't tell if the container is alive or ready. With Istio strict mTLS, readiness needs to handle the sidecar startup. Add `startupProbe` with generous failureThreshold, then `readinessProbe` and `livenessProbe`.
- **No resource requests/limits**: scheduler can't pack the node, and noisy-neighbor risk is high on shared payment workload. Set both based on observed usage; for a cold start use `requests: {cpu: 100m, memory: 256Mi}, limits: {cpu: 1000m, memory: 512Mi}` and tune from data.
- **No NetworkPolicy**: with strict mTLS you have transport security but no L3/L4 policy. Add a default-deny NetworkPolicy in the namespace and an explicit allow from the ingress namespace and any upstream services.
- **No topology spread**: 2 replicas can both land on the same node. Add `topologySpreadConstraints` on `kubernetes.io/hostname` with `maxSkew: 1, whenUnsatisfiable: DoNotSchedule`.

## Medium / nits

- No PodMonitor/ServiceMonitor; metrics scraping won't auto-discover.
- No `revisionHistoryLimit`; defaults to 10 which is fine but worth being explicit.
- Missing `lifecycle.preStop` for graceful shutdown — Istio sidecar interactions on terminate are easier with a `sleep 5` preStop on the app and a longer terminationGracePeriodSeconds.

## Patched manifest

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: orders-api-db
type: Opaque
stringData:
  password: "REPLACE_VIA_EXTERNAL_SECRETS_OPERATOR"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: orders-api
  labels: { app: orders-api }
spec:
  replicas: 3
  revisionHistoryLimit: 5
  strategy:
    type: RollingUpdate
    rollingUpdate: { maxUnavailable: 0, maxSurge: 1 }
  selector:
    matchLabels: { app: orders-api }
  template:
    metadata:
      labels: { app: orders-api }
    spec:
      terminationGracePeriodSeconds: 45
      topologySpreadConstraints:
      - maxSkew: 1
        topologyKey: kubernetes.io/hostname
        whenUnsatisfiable: DoNotSchedule
        labelSelector: { matchLabels: { app: orders-api } }
      securityContext:
        runAsNonRoot: true
        runAsUser: 1000
        fsGroup: 1000
        seccompProfile: { type: RuntimeDefault }
      containers:
      - name: app
        image: registry.example.com/orders-api@sha256:abc...
        imagePullPolicy: IfNotPresent
        ports: [{ containerPort: 8080, name: http }]
        env:
        - name: DB_PASSWORD
          valueFrom:
            secretKeyRef: { name: orders-api-db, key: password }
        resources:
          requests: { cpu: 100m, memory: 256Mi, ephemeral-storage: 100Mi }
          limits:   { cpu: 1000m, memory: 512Mi, ephemeral-storage: 500Mi }
        securityContext:
          allowPrivilegeEscalation: false
          readOnlyRootFilesystem: true
          capabilities: { drop: [ALL] }
        startupProbe:
          httpGet: { path: /health/startup, port: http }
          failureThreshold: 30
          periodSeconds: 2
        readinessProbe:
          httpGet: { path: /health/ready, port: http }
          periodSeconds: 5
        livenessProbe:
          httpGet: { path: /health/live, port: http }
          periodSeconds: 10
          failureThreshold: 3
        lifecycle:
          preStop: { exec: { command: ["/bin/sh", "-c", "sleep 5"] } }
        volumeMounts:
        - { name: tmp, mountPath: /tmp }
      volumes:
      - { name: tmp, emptyDir: {} }
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata: { name: orders-api }
spec:
  minAvailable: 2
  selector: { matchLabels: { app: orders-api } }
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata: { name: orders-api }
spec:
  podSelector: { matchLabels: { app: orders-api } }
  policyTypes: [Ingress, Egress]
  ingress:
  - from:
    - namespaceSelector: { matchLabels: { name: ingress-nginx } }
    ports: [{ port: 8080 }]
  egress:
  - to:
    - namespaceSelector: { matchLabels: { name: kube-system } }
    ports: [{ port: 53, protocol: UDP }]
```

Tips for best results

1Run `kubectl apply --dry-run=server` against a real cluster with the same PodSecurity level — it surfaces admission failures the linter misses.
2Pair this with `kubeaudit` or `kubescape` output for a more complete picture, especially around RBAC and ClusterRole bindings.
3If you're on a service mesh, factor in sidecar resources — they routinely add 100m CPU and 100Mi memory per pod and break naive capacity plans.
4AI assistance is not a replacement for security review by qualified professionals. Have a platform or security engineer sign off on NetworkPolicy and RBAC changes — these are easy to get subtly wrong.

Related prompts

Review a Dockerfile for security and best practices

intermediate

Audit a Dockerfile for security vulnerabilities, image bloat, layer inefficiency, and production-readiness issues.

IT & Securitydockercontainer-securitydevops

Review a Terraform module for issues and best practices

advanced

Audit a Terraform module for security misconfigurations, drift risk, naming, state hygiene, and reusability issues.

IT & Securityterraformiaccloud-security

Document a deployment strategy (blue-green, canary, rolling)

intermediate

Produce a written deployment strategy document with rationale, mechanics, rollback procedure, and risk tradeoffs for a specific service.

IT & Securitydeploymentcanaryblue-green

Need help implementing this prompt in your workflow?

Book a call

You are a senior platform engineer reviewing Kubernetes manifests for production. Audit the manifests below against: 1. **Security context**: runAsNonRoot, readOnlyRootFilesystem, drop ALL capabilities, seccomp profile, allowPrivilegeEscalation 2. **PodSecurity admission**: which level passes (privileged / baseline / restricted) and what blocks `restricted` 3. **Resources**: requests and limits set, sane ratios, ephemeral-storage limits, no `cpu: 100` typos 4. **Probes**: liveness, readiness, startup; correct timeouts; not pointing at the same endpoint 5. **Availability**: replicas > 1, PodDisruptionBudget, topologySpreadConstraints or anti-affinity, rollingUpdate maxUnavailable 6. **Networking**: NetworkPolicy presence, service type, sensible port mapping 7. **Secrets**: no inline secrets, no env-from of unscoped configmaps, projected serviceaccount tokens 8. **Observability**: Prometheus annotations or ServiceMonitor, structured logging assumed 9. **Cluster-specific**: image pull policy, imagePullSecrets, nodeSelector / tolerations correctness Manifests: ```yaml {{manifests}} ``` Cluster context: - K8s version: {{k8s_version}} - PodSecurity profile applied: {{podsecurity}} - Service mesh: {{mesh}} - Ingress controller: {{ingress}} - Threat model: {{threat}} Output: - **Critical** (block merge) - **High** - **Medium / nits** - **Patched manifest** with all critical and high fixes Cite specific PodSecurity restrictions or version-specific behavior where relevant.

Example

Input

manifests: |
  apiVersion: apps/v1
  kind: Deployment
  metadata:
    name: orders-api
  spec:
    replicas: 1
    selector:
      matchLabels: { app: orders-api }
    template:
      metadata:
        labels: { app: orders-api }
      spec:
        containers:
        - name: app
          image: orders-api:latest
          ports: [{ containerPort: 8080 }]
          env:
          - name: DB_PASSWORD
            value: "hunter2"
k8s_version: 1.29
podsecurity: restricted (enforced at namespace level)
mesh: Istio 1.21 with strict mTLS
ingress: nginx-ingress
threat: Customer-facing API, processes payment events

Output

## Critical (block merge)

- **Inline DB_PASSWORD**: secrets in plain `value:` fields end up in etcd, in `kubectl describe` output, and in any GitOps repo. Move to a Kubernetes Secret referenced via `valueFrom.secretKeyRef`. Better: External Secrets Operator pulling from your secret manager.
- **No securityContext, will fail PodSecurity `restricted` admission** on K8s 1.29. The pod will be rejected at apply time. Add at the container level: `runAsNonRoot: true`, `runAsUser: 1000`, `allowPrivilegeEscalation: false`, `readOnlyRootFilesystem: true`, `capabilities.drop: [ALL]`, `seccompProfile.type: RuntimeDefault`.
- **`image: orders-api:latest`**: unpinned tag is non-reproducible and can pull a different image on pod restart. Pin to digest: `orders-api@sha256:...` or a versioned tag enforced by your registry policy.
- **`replicas: 1`** for a customer-facing payment service: one node drain or rolling update = outage. Minimum 2 replicas plus a PodDisruptionBudget.

## High

- **No probes**: K8s can't tell if the container is alive or ready. With Istio strict mTLS, readiness needs to handle the sidecar startup. Add `startupProbe` with generous failureThreshold, then `readinessProbe` and `livenessProbe`.
- **No resource requests/limits**: scheduler can't pack the node, and noisy-neighbor risk is high on shared payment workload. Set both based on observed usage; for a cold start use `requests: {cpu: 100m, memory: 256Mi}, limits: {cpu: 1000m, memory: 512Mi}` and tune from data.
- **No NetworkPolicy**: with strict mTLS you have transport security but no L3/L4 policy. Add a default-deny NetworkPolicy in the namespace and an explicit allow from the ingress namespace and any upstream services.
- **No topology spread**: 2 replicas can both land on the same node. Add `topologySpreadConstraints` on `kubernetes.io/hostname` with `maxSkew: 1, whenUnsatisfiable: DoNotSchedule`.

## Medium / nits

- No PodMonitor/ServiceMonitor; metrics scraping won't auto-discover.
- No `revisionHistoryLimit`; defaults to 10 which is fine but worth being explicit.
- Missing `lifecycle.preStop` for graceful shutdown — Istio sidecar interactions on terminate are easier with a `sleep 5` preStop on the app and a longer terminationGracePeriodSeconds.

## Patched manifest

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: orders-api-db
type: Opaque
stringData:
  password: "REPLACE_VIA_EXTERNAL_SECRETS_OPERATOR"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: orders-api
  labels: { app: orders-api }
spec:
  replicas: 3
  revisionHistoryLimit: 5
  strategy:
    type: RollingUpdate
    rollingUpdate: { maxUnavailable: 0, maxSurge: 1 }
  selector:
    matchLabels: { app: orders-api }
  template:
    metadata:
      labels: { app: orders-api }
    spec:
      terminationGracePeriodSeconds: 45
      topologySpreadConstraints:
      - maxSkew: 1
        topologyKey: kubernetes.io/hostname
        whenUnsatisfiable: DoNotSchedule
        labelSelector: { matchLabels: { app: orders-api } }
      securityContext:
        runAsNonRoot: true
        runAsUser: 1000
        fsGroup: 1000
        seccompProfile: { type: RuntimeDefault }
      containers:
      - name: app
        image: registry.example.com/orders-api@sha256:abc...
        imagePullPolicy: IfNotPresent
        ports: [{ containerPort: 8080, name: http }]
        env:
        - name: DB_PASSWORD
          valueFrom:
            secretKeyRef: { name: orders-api-db, key: password }
        resources:
          requests: { cpu: 100m, memory: 256Mi, ephemeral-storage: 100Mi }
          limits:   { cpu: 1000m, memory: 512Mi, ephemeral-storage: 500Mi }
        securityContext:
          allowPrivilegeEscalation: false
          readOnlyRootFilesystem: true
          capabilities: { drop: [ALL] }
        startupProbe:
          httpGet: { path: /health/startup, port: http }
          failureThreshold: 30
          periodSeconds: 2
        readinessProbe:
          httpGet: { path: /health/ready, port: http }
          periodSeconds: 5
        livenessProbe:
          httpGet: { path: /health/live, port: http }
          periodSeconds: 10
          failureThreshold: 3
        lifecycle:
          preStop: { exec: { command: ["/bin/sh", "-c", "sleep 5"] } }
        volumeMounts:
        - { name: tmp, mountPath: /tmp }
      volumes:
      - { name: tmp, emptyDir: {} }
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata: { name: orders-api }
spec:
  minAvailable: 2
  selector: { matchLabels: { app: orders-api } }
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata: { name: orders-api }
spec:
  podSelector: { matchLabels: { app: orders-api } }
  policyTypes: [Ingress, Egress]
  ingress:
  - from:
    - namespaceSelector: { matchLabels: { name: ingress-nginx } }
    ports: [{ port: 8080 }]
  egress:
  - to:
    - namespaceSelector: { matchLabels: { name: kube-system } }
    ports: [{ port: 53, protocol: UDP }]
```

Tips for best results

1Run `kubectl apply --dry-run=server` against a real cluster with the same PodSecurity level — it surfaces admission failures the linter misses.

2Pair this with `kubeaudit` or `kubescape` output for a more complete picture, especially around RBAC and ClusterRole bindings.

3If you're on a service mesh, factor in sidecar resources — they routinely add 100m CPU and 100Mi memory per pod and break naive capacity plans.

4AI assistance is not a replacement for security review by qualified professionals. Have a platform or security engineer sign off on NetworkPolicy and RBAC changes — these are easy to get subtly wrong.