
Microservice Deployment Settings

Most of our microservices are Spring Boot Java applications with Actuator monitoring enabled. This guide describes recommended CPU and memory settings, probe configuration, monitoring practices with Grafana, and troubleshooting steps with kubectl.


CPU Settings

Principles

  • Limit of up to 4 CPUs – helps startup time (faster JIT warm-up and class loading). ⚠️ If all services start in parallel this may throttle the node, but in production we always use rolling updates, so this is not a problem.

  • Constrained CPU (1 vCPU) – works fine, but you must extend the startupProbe period/failureThreshold, because startup will be slower (see the sketch after the CPU example below).

  • Normal operation – microservices typically consume very little CPU most of the time. → Set requests.cpu: 100m.

Kubernetes Example

resources:
  requests:
    cpu: "100m"   # low steady consumption
  limits:
    cpu: "4"      # allow faster startup

Memory Settings

What Java Memory Consists Of

  • Heap (-Xmx) – application objects.
  • Metaspace – typically ~300 MB.
  • Thread stacks – ~1 MB per thread (~100 MB for ~100 threads).
  • JVM internals & native buffers – ~200 MB (GC structures, NIO, Netty, glibc arenas).

👉 Rule of thumb: Pod memory limit = Xmx + ~600 MB.
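
A worked example with the figures above, assuming a hypothetical service configured with -Xmx1g:

Pod memory limit ≈ 1024 MB (heap) + ~300 MB (Metaspace) + ~100 MB (thread stacks) + ~200 MB (JVM internals) ≈ 1.6 GB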

Guidelines

  • Set both -Xms and -Xmx to the same value to avoid dynamic heap resizing.
  • Use at least 200 MB of heap for small microservices, and 300 MB for microservices with a process engine.
  • Use at most 1–5 GB of heap for large workloads; for even larger workloads, scale horizontally (add more replicas).
  • Always leave headroom above -Xmx for Metaspace, threads, and native overhead.
  • You should never see OOMKilled. If it happens, either increase the memory limit or even decrease -Xmx.
  • Use Grafana monitoring to calibrate real usage.

Kubernetes Example

resources:
  requests:
    memory: "900Mi"    # Example: Xmx (300Mi) + ~600Mi overhead
  limits:
    memory: "1200Mi"   # 300Mi headroom above the request to absorb spikes and avoid OOMKilled

JVM Options Example

env:
  - name: JAVA_TOOL_OPTIONS
    value: -Xms300m -Xmx300m

Probes (Startup, Liveness, Readiness)

Spring Boot with Actuator provides endpoints for Kubernetes probes.

Recommendations

  • Startup Probe: /actuator/health – allows slow startup without failing the pod. Increase failureThreshold if CPU is constrained.
  • Liveness Probe: /actuator/health/liveness – detects JVM deadlocks or crash states.
  • Readiness Probe: /actuator/health/readiness – ensures the pod only receives traffic when it is ready.

Typical Settings

  • Startup Probe: initialDelaySeconds: 20, periodSeconds: 10, failureThreshold: 30 (the resulting startup window is worked out below)
  • Liveness/Readiness Probe: periodSeconds: 5, timeoutSeconds: 5, failureThreshold: 5
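
With the typical startup probe values above, the time a pod is given to start before Kubernetes restarts it is roughly:

initialDelaySeconds + periodSeconds × failureThreshold = 20 + 10 × 30 = 320 s (~5 minutes)

On constrained CPU (1 vCPU), extend this window by raising failureThreshold and/or periodSeconds rather than initialDelaySeconds, so a fast-starting pod is not penalized.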

Example Deployment Descriptor

A typical Spring Boot microservice (tsm-catalog) Deployment:

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: tsm-catalog
  name: tsm-catalog
  namespace: tsm-datalite
spec:
  replicas: 1
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  selector:
    matchLabels:
      app: tsm-catalog
  template:
    metadata:
      annotations:
        linkerd.io/inject: disabled
      labels:
        app: tsm-catalog
    spec:
      containers:
        - name: tsm-catalog
          image: registry.datalite.cz/tsm/tsm-catalog:2.3
          imagePullPolicy: Always
          env:
            - name: JAVA_TOOL_OPTIONS
              value: >
                -Xms300m -Xmx300m
            - name: spring.config.import
              value: configserver:http://tsm-config-server
            - name: MALLOC_ARENA_MAX
              value: "2"
          ports:
            - containerPort: 8099
              protocol: TCP
          resources:
            requests:
              cpu: 100m
              memory: 800Mi
            limits:
              cpu: 3000m
              memory: 1100Mi
          livenessProbe:
            httpGet:
              path: /actuator/health/liveness
              port: 8099
            initialDelaySeconds: 0
            periodSeconds: 5
            timeoutSeconds: 5
            successThreshold: 1
            failureThreshold: 5
          readinessProbe:
            httpGet:
              path: /actuator/health/readiness
              port: 8099
            initialDelaySeconds: 0
            periodSeconds: 5
            timeoutSeconds: 5
            successThreshold: 1
            failureThreshold: 5
          startupProbe:
            httpGet:
              path: /actuator/health
              port: 8099
            initialDelaySeconds: 40
            periodSeconds: 10
            timeoutSeconds: 1
            failureThreshold: 30
      imagePullSecrets:
        - name: regcred
      securityContext:
        runAsUser: 1001
        runAsGroup: 1001
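
To roll out this descriptor, apply it and watch the rollout (assuming the manifest is saved as tsm-catalog.yaml):

kubectl apply -f tsm-catalog.yaml
kubectl rollout status deployment/tsm-catalog -n tsm-datalite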

Monitoring with Grafana

  • JVM metrics (via Actuator/Micrometer):

    • jvm.memory.used{area=heap} vs. jvm.memory.max{area=heap}
    • jvm.memory.used{area=nonheap}
    • jvm.threads.live
  • Pod metrics (via cAdvisor / kubelet):

    • Pod memory usage (RSS / working set)
    • Pod CPU usage and throttling

👉 Always compare JVM vs Pod metrics: if Pod RSS ≫ JVM heap+nonheap → overhead from threads, native memory, or allocator arenas.
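
A hypothetical illustration of such a comparison (the numbers are assumed, not measured):

  JVM (Grafana):   heap used ≈ 300 MB, non-heap ≈ 350 MB → ≈ 650 MB accounted for by the JVM
  Pod (cAdvisor):  working set ≈ 950 MB
  Difference:      ≈ 300 MB → thread stacks, native buffers, or glibc arenas (see MALLOC_ARENA_MAX and NMT below)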


Troubleshooting with kubectl

Check resource usage

kubectl top pod <pod> -n <namespace>

Inspect pod state and events

kubectl describe pod <pod> -n <namespace>
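
If a container was restarted, the describe output shows the previous termination reason (e.g. OOMKilled) under Last State. The same field can be read directly; the jsonpath below assumes a single-container pod:

kubectl get pod <pod> -n <namespace> -o jsonpath='{.status.containerStatuses[0].lastState.terminated.reason}'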

Access logs

kubectl logs <pod> -n <namespace>

Exec into pod

kubectl exec -it <pod> -n <namespace> -- sh

Check memory details

cat /proc/1/smaps_rollup

If you need native memory tracking (NMT) for troubleshooting, add -XX:NativeMemoryTracking=summary (or =detail for a per-call-site breakdown) to the JVM options and restart the pod.

Then query it inside the container with JDK tools (use detail if you enabled detail tracking):

jcmd 1 VM.native_memory summary
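
A minimal sketch of enabling NMT through the deployment env, reusing the JAVA_TOOL_OPTIONS variable from the JVM options example above (heap sizes are only illustrative):

env:
  - name: JAVA_TOOL_OPTIONS
    value: >
      -Xms300m -Xmx300m
      -XX:NativeMemoryTracking=summary

After the pod restarts with this option, the jcmd command above reports how much native memory the JVM itself has committed, which helps separate JVM overhead from off-heap or allocator growth.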

Checklist

  • CPU requests = 100m, limits = 4 (or 1 if constrained, with the startup probe adjusted).
  • Memory limit = Xmx + ~600 MB overhead.
  • JVM options include -Xms/-Xmx and, optionally, NMT.
  • Optionally set MALLOC_ARENA_MAX=2 to reduce native memory overhead.
  • Probes configured (startup, readiness, liveness).
  • Monitor JVM and Pod metrics in Grafana.
  • Never accept OOMKilled events.