
Microservice Deployment Settings

Most of our microservices are Spring Boot Java applications with Actuator monitoring enabled. This guide describes recommended CPU and memory settings, probe configuration, monitoring practices with Grafana, and troubleshooting steps with kubectl.


CPU Settings

Principles

  • Limit of up to 4 CPUs – helps startup time (faster JIT warm-up and class loading). ⚠️ If all services start in parallel this may throttle the node, but in production we always use rolling updates, so this is not a problem.

  • Constrained CPU (1 vCPU) – works fine, but you must extend the startupProbe period/failureThreshold, because startup will be slower (see the sketch after the CPU example below).

  • Normal operation – microservices typically consume very little CPU most of the time. → Set requests.cpu: 100m.

Kubernetes Example

resources:
  requests:
    cpu: "100m"   # low steady consumption
  limits:
    cpu: "4"      # allow faster startup

Memory Settings

What Java Memory Consists Of

  • Heap (-Xmx) – application objects.
  • Metaspace – typically ~300 MB.
  • Thread stacks – ~1 MB per thread (~100 MB for ~100 threads).
  • JVM internals & native buffers – ~200 MB (GC structures, NIO, Netty, glibc arenas).

👉 Rule of thumb: Pod memory limit = Xmx + ~600 MB.
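
A worked example with the figures above, assuming a hypothetical service configured with -Xmx1g:

Pod memory limit ≈ 1024 MB (heap) + ~300 MB (Metaspace) + ~100 MB (thread stacks) + ~200 MB (JVM internals) ≈ 1.6 GB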

Guidelines

  • Set both -Xms and -Xmx to the same value to avoid dynamic heap resizing.
  • Use at least 200 MB of heap for small microservices, and 300 MB for microservices with a process engine.
  • Use at most 1–5 GB of heap for large workloads; for even larger workloads, scale horizontally (add more replicas).
  • Always leave headroom above -Xmx for Metaspace, threads, and native overhead.
  • You should never see OOMKilled. If it happens, either increase the memory limit or even decrease -Xmx.
  • Use Grafana monitoring to calibrate real usage.

Kubernetes Example

resources:
  requests:
    memory: "900Mi"    # Example: Xmx (300Mi) + ~600Mi overhead
  limits:
    memory: "1200Mi"   # 300Mi headroom above the request to absorb spikes and avoid OOMKilled

JVM Options Example

env:
  - name: JAVA_TOOL_OPTIONS
    value: -Xms300m -Xmx300m

Probes (Startup, Liveness, Readiness)

Spring Boot with Actuator provides endpoints for Kubernetes probes.

Recommendations

  • Startup Probe: /actuator/health – allows slow startup without failing the pod. Increase failureThreshold if CPU is constrained.
  • Liveness Probe: /actuator/health/liveness – detects JVM deadlocks or crash states.
  • Readiness Probe: /actuator/health/readiness – ensures the pod only receives traffic when it is ready.

Typical Settings

  • Startup Probe: initialDelaySeconds: 20, periodSeconds: 10, failureThreshold: 30 (the resulting startup window is worked out below)
  • Liveness/Readiness Probe: periodSeconds: 5, timeoutSeconds: 5, failureThreshold: 5
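
With the typical startup probe values above, the time a pod is given to start before Kubernetes restarts it is roughly:

initialDelaySeconds + periodSeconds × failureThreshold = 20 + 10 × 30 = 320 s (~5 minutes)

On constrained CPU (1 vCPU), extend this window by raising failureThreshold and/or periodSeconds rather than initialDelaySeconds, so a fast-starting pod is not penalized.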

Example Deployment Descriptor

A typical Spring Boot microservice (tsm-catalog) Deployment:

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: tsm-catalog
  name: tsm-catalog
  namespace: tsm-datalite
spec:
  replicas: 1
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  selector:
    matchLabels:
      app: tsm-catalog
  template:
    metadata:
      annotations:
        linkerd.io/inject: disabled
      labels:
        app: tsm-catalog
    spec:
      containers:
        - name: tsm-catalog
          image: registry.datalite.cz/tsm/tsm-catalog:2.3
          imagePullPolicy: Always
          env:
            - name: JAVA_TOOL_OPTIONS
              value: >
                -Xms300m -Xmx300m
            - name: spring.config.import
              value: configserver:http://tsm-config-server
            - name: MALLOC_ARENA_MAX
              value: "2"
          ports:
            - containerPort: 8099
              protocol: TCP
          resources:
            requests:
              cpu: 100m
              memory: 800Mi
            limits:
              cpu: 3000m
              memory: 1100Mi
          livenessProbe:
            httpGet:
              path: /actuator/health/liveness
              port: 8099
            initialDelaySeconds: 0
            periodSeconds: 5
            timeoutSeconds: 5
            successThreshold: 1
            failureThreshold: 5
          readinessProbe:
            httpGet:
              path: /actuator/health/readiness
              port: 8099
            initialDelaySeconds: 0
            periodSeconds: 5
            timeoutSeconds: 5
            successThreshold: 1
            failureThreshold: 5
          startupProbe:
            httpGet:
              path: /actuator/health
              port: 8099
            initialDelaySeconds: 40
            periodSeconds: 10
            timeoutSeconds: 1
            failureThreshold: 30
      imagePullSecrets:
        - name: regcred
      securityContext:
        runAsUser: 1001
        runAsGroup: 1001
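
To roll out this descriptor, apply it and watch the rollout (assuming the manifest is saved as tsm-catalog.yaml):

kubectl apply -f tsm-catalog.yaml
kubectl rollout status deployment/tsm-catalog -n tsm-datalite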

Monitoring with Grafana

  • JVM metrics (via Actuator/Micrometer):

    • jvm.memory.used{area=heap} vs. jvm.memory.max{area=heap}
    • jvm.memory.used{area=nonheap}
    • jvm.threads.live
  • Pod metrics (via cAdvisor / kubelet):

    • Pod memory usage (RSS / working set)
    • Pod CPU usage and throttling

👉 Always compare JVM vs Pod metrics: if Pod RSS ≫ JVM heap+nonheap → overhead from threads, native memory, or allocator arenas.
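
A hypothetical illustration of such a comparison (the numbers are assumed, not measured):

  JVM (Grafana):   heap used ≈ 300 MB, non-heap ≈ 350 MB → ≈ 650 MB accounted for by the JVM
  Pod (cAdvisor):  working set ≈ 950 MB
  Difference:      ≈ 300 MB → thread stacks, native buffers, or glibc arenas (see MALLOC_ARENA_MAX and NMT below)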


Troubleshooting with kubectl

Check resource usage

kubectl top pod <pod> -n <namespace>

Inspect pod state and events

kubectl describe pod <pod> -n <namespace>
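
If a container was restarted, the describe output shows the previous termination reason (e.g. OOMKilled) under Last State. The same field can be read directly; the jsonpath below assumes a single-container pod:

kubectl get pod <pod> -n <namespace> -o jsonpath='{.status.containerStatuses[0].lastState.terminated.reason}'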

Access logs

kubectl logs <pod> -n <namespace>

Exec into pod

kubectl exec -it <pod> -n <namespace> -- sh

Check memory details

cat /proc/1/smaps_rollup

If you need native memory tracking (NMT) for troubleshooting, add -XX:NativeMemoryTracking=summary (or =detail for a per-call-site breakdown) to the JVM options and restart the pod.

Then query it inside the container with JDK tools (use detail if you enabled detail tracking):

jcmd 1 VM.native_memory summary
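
A minimal sketch of enabling NMT through the deployment env, reusing the JAVA_TOOL_OPTIONS variable from the JVM options example above (heap sizes are only illustrative):

env:
  - name: JAVA_TOOL_OPTIONS
    value: >
      -Xms300m -Xmx300m
      -XX:NativeMemoryTracking=summary

After the pod restarts with this option, the jcmd command above reports how much native memory the JVM itself has committed, which helps separate JVM overhead from off-heap or allocator growth.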

Checklist

  • CPU requests = 100m, limits = 4 (or 1 if constrained, with the startup probe adjusted).
  • Memory limit = Xmx + ~600 MB overhead.
  • JVM options include -Xms/-Xmx and, optionally, NMT.
  • Optionally set MALLOC_ARENA_MAX=2 to reduce native memory overhead.
  • Probes configured (startup, readiness, liveness).
  • Monitor JVM and Pod metrics in Grafana.
  • Never accept OOMKilled events.