Spring Boot Performance Optimization & Profiling

Introduction
Your Spring Boot application works. Tests pass. Users are signing up. Then one day, response times spike from 200ms to 3 seconds, the database connection pool is exhausted, and your monitoring dashboard turns red.
Performance problems don't announce themselves during development — they appear in production under real load. The difference between a slow application and a fast one isn't luck — it's measurement, profiling, and targeted optimization.
This guide teaches you to find and fix performance bottlenecks systematically. No guessing. No premature optimization. Every change backed by data.
What You'll Learn
✅ Identify performance bottlenecks using a measurement-first approach
✅ Profile CPU, memory, and threads with Java Flight Recorder and VisualVM
✅ Monitor runtime metrics with Spring Boot Actuator and Micrometer
✅ Optimize database queries, connection pools, and batch operations
✅ Accelerate HTTP responses with compression, caching, and async handling
✅ Tune JVM memory and garbage collection for production workloads
✅ Speed up application startup with lazy initialization and AOT compilation
✅ Establish performance baselines and detect regressions with JMH
Prerequisites
- Spring Boot fundamentals (Getting Started)
- Database integration (JPA & PostgreSQL)
- Basic understanding of caching (Redis Caching)
- Testing fundamentals (Advanced Testing)
- Java 17+ and Docker installed
1. Understanding Performance Bottlenecks
Before optimizing anything, you need to understand where time is spent. Premature optimization wastes effort and often makes code harder to maintain.
The Golden Rule: Measure First
"We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil." — Donald Knuth
Four Categories of Bottlenecks
| Category | Symptoms | Common Causes | Tools |
|---|---|---|---|
| CPU | High CPU usage, slow computations | Inefficient algorithms, excessive logging, serialization | JFR, VisualVM, top |
| Memory | OutOfMemoryError, frequent GC pauses | Memory leaks, large object graphs, wrong heap size | Heap dumps, MAT, JFR |
| I/O (Database) | Slow queries, connection pool exhaustion | N+1 queries, missing indexes, no caching | Slow query log, EXPLAIN ANALYZE |
| Network | High latency, timeout errors | Uncompressed responses, no HTTP caching, blocking calls | Wireshark, Actuator metrics |
Where Time Is Typically Spent
In most Spring Boot applications, database queries account for the largest share of request time. That is why query optimization and caching tend to deliver the biggest wins, and why this series covered Redis caching and JPA optimization earlier.
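Before reaching for a profiler, a rough per-stage timing can already show which part of a request dominates. The following is a minimal sketch using only the JDK; the class name `StageTimer` and the stage names are hypothetical, and the sleeps stand in for real work.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Accumulates wall-clock time per named stage so the dominant stage is visible. */
final class StageTimer {
    private final Map<String, Long> nanosByStage = new LinkedHashMap<>();

    /** Runs the work, adds its elapsed time to the given stage, and returns its result. */
    <T> T time(String stage, Supplier<T> work) {
        long start = System.nanoTime();
        try {
            return work.get();
        } finally {
            nanosByStage.merge(stage, System.nanoTime() - start, Long::sum);
        }
    }

    /** Adds an already-measured duration, e.g. one taken from a log line. */
    void add(String stage, long nanos) {
        nanosByStage.merge(stage, nanos, Long::sum);
    }

    /** Returns each stage's share of the total recorded time, in percent. */
    Map<String, Double> shares() {
        long total = nanosByStage.values().stream().mapToLong(Long::longValue).sum();
        Map<String, Double> result = new LinkedHashMap<>();
        for (Map.Entry<String, Long> e : nanosByStage.entrySet()) {
            result.put(e.getKey(), total == 0 ? 0.0 : 100.0 * e.getValue() / total);
        }
        return result;
    }

    public static void main(String[] args) {
        StageTimer timer = new StageTimer();
        timer.time("database", () -> sleep(30));      // stand-in for a query
        timer.time("serialization", () -> sleep(5));  // stand-in for JSON rendering
        timer.shares().forEach((stage, pct) ->
                System.out.printf("%-14s %5.1f%%%n", stage, pct));
    }

    private static long sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return millis;
    }
}
```

This is only a first approximation: it measures wall-clock time on one thread and says nothing about why a stage is slow, which is where the profiling tools in the next section come in.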
2. JVM Profiling with VisualVM & Java Flight Recorder
Profiling reveals exactly where your application spends time and memory. Two tools cover most needs.
Java Flight Recorder (JFR)
JFR is built into the JDK (11+) and is designed for low overhead with its default settings, which makes it practical for production use.
Enable JFR at application startup:
java -XX:StartFlightRecording=duration=60s,filename=recording.jfr \
-jar target/myapp.jar
Enable JFR with continuous recording (production):
java -XX:StartFlightRecording=disk=true,maxsize=500m,maxage=1d \
-XX:FlightRecorderOptions=stackdepth=256 \
-jar target/myapp.jar
Trigger a recording on a running application with jcmd:
# Find the PID
jps -l
# Start recording
jcmd <PID> JFR.start name=profile duration=120s filename=profile.jfr
# Stop recording manually
jcmd <PID> JFR.stop name=profile
# Dump current recording
jcmd <PID> JFR.dump name=profile filename=dump.jfr
Analyzing JFR Recordings
Open .jfr files with JDK Mission Control (JMC), a free GUI tool. It shipped with Oracle JDK through version 10 and is a separate download for JDK 11 and later:
# Launch JMC (if installed)
jmc
# Or use the jfr command-line tool (JDK 12+)
jfr print --events jdk.CPULoad recording.jfr
jfr summary recording.jfr
Key events to look for:
| JFR Event | What It Reveals | Action |
|---|---|---|
| jdk.CPULoad | Process and system CPU usage | Identify CPU-bound code |
| jdk.GarbageCollection / jdk.GCPhasePause | GC pause duration and frequency | Tune GC settings |
| jdk.ObjectAllocationSample | Where objects are allocated | Reduce allocations |
| jdk.JavaMonitorEnter / jdk.JavaMonitorWait | Threads blocked on locks or waiting in Object.wait() | Fix lock contention |
| jdk.FileRead / jdk.FileWrite | File I/O operations | Optimize file access |
| jdk.SocketRead / jdk.SocketWrite | Network I/O | Identify slow external calls |
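Besides the built-in events in the table, the `jdk.jfr` API lets application code emit its own events and read recordings back programmatically. The sketch below assumes a JDK that ships the `jdk.jfr` module; the event name `demo.SlowOperation` and the class names are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import jdk.jfr.Event;
import jdk.jfr.Label;
import jdk.jfr.Name;
import jdk.jfr.Recording;
import jdk.jfr.consumer.RecordedEvent;
import jdk.jfr.consumer.RecordingFile;

final class JfrCustomEventDemo {

    /** Hypothetical application-level event; it appears in JMC next to the JDK events. */
    @Name("demo.SlowOperation")
    @Label("Slow Operation")
    static class SlowOperationEvent extends Event {
        @Label("Items processed")
        int items;
    }

    /** Records the given number of custom events to a temp file and counts them on read-back. */
    static long recordAndCount(int operations) {
        try {
            Path file = Files.createTempFile("demo", ".jfr");
            try (Recording recording = new Recording()) {
                // Enable the custom event explicitly and record it regardless of duration
                recording.enable(SlowOperationEvent.class).withoutThreshold();
                recording.start();
                for (int i = 0; i < operations; i++) {
                    SlowOperationEvent event = new SlowOperationEvent();
                    event.begin();
                    event.items = i;   // the work being measured would run here
                    event.commit();    // ends the event and writes it to the recording
                }
                recording.stop();
                recording.dump(file);
            }
            List<RecordedEvent> events = RecordingFile.readAllEvents(file);
            Files.deleteIfExists(file);
            return events.stream()
                    .filter(e -> e.getEventType().getName().equals("demo.SlowOperation"))
                    .count();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Custom events recorded: " + recordAndCount(5));
    }
}
```

Custom events like this make business operations visible on the same timeline as GC pauses and socket reads, which helps when correlating a slow request with JVM activity.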
VisualVM for Development Profiling
VisualVM provides a visual interface for profiling during development:
# Install VisualVM (macOS)
brew install --cask visualvm
# Launch
visualvm
CPU Profiling workflow:
- Start your Spring Boot app
- Open VisualVM → attach to the running JVM
- Go to Sampler → click CPU
- Trigger the slow operation (e.g., hit the slow API endpoint)
- Click Snapshot to capture the results
- Sort by Self Time to find hotspots
Creating a Profiling Endpoint (Development Only)
Add a diagnostic endpoint for easy profiling during development:
@RestController
@RequestMapping("/api/debug")
@Profile("dev") // Only available in dev profile
public class ProfilingController {
@Autowired
private MeterRegistry meterRegistry;
@GetMapping("/jfr/start")
public ResponseEntity<String> startRecording()
throws Exception {
ProcessBuilder pb = new ProcessBuilder(
"jcmd",
String.valueOf(ProcessHandle.current().pid()),
"JFR.start",
"name=debug",
"duration=60s",
"filename=debug-recording.jfr"
);
pb.redirectErrorStream(true);
Process process = pb.start();
String output = new String(
process.getInputStream().readAllBytes());
return ResponseEntity.ok(
"Recording started: " + output);
}
@GetMapping("/memory")
public Map<String, Object> memoryInfo() {
Runtime runtime = Runtime.getRuntime();
return Map.of(
"maxMemory", formatBytes(runtime.maxMemory()),
"totalMemory",
formatBytes(runtime.totalMemory()),
"freeMemory",
formatBytes(runtime.freeMemory()),
"usedMemory", formatBytes(
runtime.totalMemory()
- runtime.freeMemory())
);
}
private String formatBytes(long bytes) {
return String.format("%.2f MB",
bytes / (1024.0 * 1024.0));
}
}
3. Spring Boot Actuator for Runtime Metrics
Spring Boot Actuator exposes production-ready monitoring endpoints. Combined with Micrometer, it provides the metrics foundation for performance monitoring.
Setting Up Actuator
<!-- pom.xml -->
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-actuator</artifactId>
</dependency>
<!-- Micrometer Prometheus registry (for Prometheus/Grafana) -->
<dependency>
<groupId>io.micrometer</groupId>
<artifactId>micrometer-registry-prometheus</artifactId>
</dependency>
# application.yml
management:
  endpoints:
    web:
      exposure:
        include: health,info,metrics,prometheus,env
  endpoint:
    health:
      show-details: when-authorized
  metrics:
    tags:
      application: ${spring.application.name}
    distribution:
      percentiles-histogram:
        http.server.requests: true
      # "slo" replaced the older "sla" key in Spring Boot 2.3
      slo:
        http.server.requests: 50ms,100ms,200ms,500ms,1s
Key Metrics to Monitor
HTTP request metrics:
# All HTTP request metrics
curl localhost:8080/actuator/metrics/http.server.requests
# Filter by endpoint
curl "localhost:8080/actuator/metrics/http.server.requests?tag=uri:/api/users"
# Filter by status
curl "localhost:8080/actuator/metrics/http.server.requests?tag=status:200"
JVM metrics:
# Memory usage
curl localhost:8080/actuator/metrics/jvm.memory.used
# GC pause time
curl localhost:8080/actuator/metrics/jvm.gc.pause
# Thread count
curl localhost:8080/actuator/metrics/jvm.threads.live
# CPU usage
curl localhost:8080/actuator/metrics/process.cpu.usage
Database metrics:
# HikariCP active connections
curl localhost:8080/actuator/metrics/hikaricp.connections.active
# Connection wait time
curl localhost:8080/actuator/metrics/hikaricp.connections.acquire
# Connection pool usage
curl localhost:8080/actuator/metrics/hikaricp.connections.usage
Custom Metrics with Micrometer
Track business-specific performance metrics:
@Service
public class OrderService {
private final Counter orderCounter;
private final Timer orderProcessingTimer;
private final DistributionSummary orderValueSummary;
public OrderService(MeterRegistry meterRegistry) {
this.orderCounter = Counter.builder("orders.created")
.description("Total orders created")
.tag("type", "all")
.register(meterRegistry);
this.orderProcessingTimer = Timer
.builder("orders.processing.time")
.description("Order processing duration")
.publishPercentiles(0.5, 0.95, 0.99)
.register(meterRegistry);
this.orderValueSummary = DistributionSummary
.builder("orders.value")
.description("Order values distribution")
.baseUnit("dollars")
.publishPercentiles(0.5, 0.95)
.register(meterRegistry);
}
public Order createOrder(OrderRequest request) {
return orderProcessingTimer.record(() -> {
Order order = processOrder(request);
orderCounter.increment();
orderValueSummary.record(
order.getTotalAmount().doubleValue());
return order;
});
}
private Order processOrder(OrderRequest request) {
// Business logic
return new Order(/* ... */);
}
}
Gauge for tracking active state:
@Component
public class QueueMetrics {
private final BlockingQueue<Task> taskQueue;
public QueueMetrics(MeterRegistry registry,
BlockingQueue<Task> taskQueue) {
this.taskQueue = taskQueue;
Gauge.builder("task.queue.size", taskQueue,
BlockingQueue::size)
.description("Current task queue size")
.register(registry);
Gauge.builder("task.queue.remaining",
taskQueue,
BlockingQueue::remainingCapacity)
.description("Remaining queue capacity")
.register(registry);
}
}
Prometheus Endpoint
With the Prometheus registry, all metrics are available at /actuator/prometheus in Prometheus format:
curl localhost:8080/actuator/prometheus
Output:
# HELP http_server_requests_seconds
# TYPE http_server_requests_seconds histogram
http_server_requests_seconds_bucket{method="GET",uri="/api/users",status="200",le="0.05"} 142
http_server_requests_seconds_bucket{method="GET",uri="/api/users",status="200",le="0.1"} 198
http_server_requests_seconds_bucket{method="GET",uri="/api/users",status="200",le="0.2"} 210
http_server_requests_seconds_count{method="GET",uri="/api/users",status="200"} 215
http_server_requests_seconds_sum{method="GET",uri="/api/users",status="200"} 18.432
4. Database Query Optimization
Database access is the #1 bottleneck in most Spring Boot applications. This section focuses on runtime optimization beyond what we covered in Advanced JPA Optimization.
Enable Slow Query Logging
Hibernate slow query log (database-agnostic, so it also covers PostgreSQL):
# application.yml
spring:
  jpa:
    properties:
      hibernate:
        generate_statistics: true
        session.events.log.LOG_QUERIES_SLOWER_THAN_MS: 100
  datasource:
    hikari:
      leak-detection-threshold: 30000 # 30 seconds
Hibernate statistics logging:
logging:
  level:
    org.hibernate.stat: DEBUG
    org.hibernate.SQL: DEBUG
    # Hibernate 6 (Spring Boot 3); on Hibernate 5 use org.hibernate.type.descriptor.sql.BasicBinder
    org.hibernate.orm.jdbc.bind: TRACE
This outputs query counts per session, a fast way to spot N+1 problems:
Session Metrics {
726313 nanoseconds spent acquiring 1 JDBC connections;
326040 nanoseconds spent releasing 1 JDBC connections;
3524968 nanoseconds spent preparing 12 JDBC statements; ← 12 queries!
42803543 nanoseconds spent executing 12 JDBC statements;
0 nanoseconds spent executing 0 JDBC batches;
}
HikariCP Connection Pool Tuning
HikariCP is Spring Boot's default connection pool. Misconfigured pools are a silent performance killer.
spring:
  datasource:
    hikari:
      # Pool sizing
      maximum-pool-size: 20 # Max connections
      minimum-idle: 5 # Min idle connections
      # Timeouts
      connection-timeout: 10000 # 10s to get connection
      idle-timeout: 300000 # 5min idle before removal
      max-lifetime: 1800000 # 30min max connection age
      # Validation
      validation-timeout: 5000 # 5s for validation query
      # Leak detection
      leak-detection-threshold: 60000 # Warn if held > 60s
      # Metrics
      register-mbeans: true
Pool sizing formula:
optimal_pool_size = (core_count * 2) + effective_spindle_count
Example: database server with 4 cores and one SSD (counted as 1 spindle)
optimal_pool_size = (4 * 2) + 1 = 9
Important: More connections ≠ better performance. Too many connections cause context-switching overhead on the database server. Start small (10-20) and increase only if metrics show connection wait times.
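The sizing formula is easy to encode as a starting-point calculator. This is a minimal sketch; the class and method names are hypothetical, and the result is only a first guess to validate against connection wait metrics.

```java
/** Starting-point pool size from the formula (core_count * 2) + effective_spindle_count. */
final class PoolSizing {

    static int optimalPoolSize(int coreCount, int effectiveSpindleCount) {
        if (coreCount < 1 || effectiveSpindleCount < 0) {
            throw new IllegalArgumentException("coreCount must be >= 1 and spindles >= 0");
        }
        return coreCount * 2 + effectiveSpindleCount;
    }

    public static void main(String[] args) {
        // 4 cores and one SSD counted as a single spindle
        System.out.println(optimalPoolSize(4, 1)); // prints 9
    }
}
```

Treat the number as a baseline for `maximum-pool-size`, then adjust based on measured acquire times rather than intuition.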
Monitor connection pool health:
@Scheduled(fixedRate = 30000) // Every 30 seconds
public void logPoolMetrics() {
HikariPoolMXBean poolProxy = hikariDataSource
.getHikariPoolMXBean();
log.info("HikariCP - Active: {}, Idle: {}, " +
"Waiting: {}, Total: {}",
poolProxy.getActiveConnections(),
poolProxy.getIdleConnections(),
poolProxy.getThreadsAwaitingConnection(),
poolProxy.getTotalConnections());
}
Batch Operations
Individual INSERT/UPDATE statements are slow. Batch operations reduce round trips dramatically.
spring:
  jpa:
    properties:
      hibernate:
        jdbc:
          batch_size: 50
          batch_versioned_data: true
        order_inserts: true
        order_updates: true
@Service
@Transactional
public class ProductImportService {
@PersistenceContext
private EntityManager entityManager;
public int importProducts(
List<ProductDto> products) {
int batchSize = 50;
for (int i = 0; i < products.size(); i++) {
Product product = mapToEntity(products.get(i));
entityManager.persist(product);
// Flush and clear after every full batch of batchSize entities
// (i is zero-based, so the 50th entity has index 49)
if ((i + 1) % batchSize == 0) {
entityManager.flush();
entityManager.clear();
}
}
entityManager.flush();
entityManager.clear();
return products.size();
}
private Product mapToEntity(ProductDto dto) {
Product product = new Product();
product.setName(dto.getName());
product.setPrice(dto.getPrice());
product.setCategory(dto.getCategory());
return product;
}
}
Performance comparison:
| Approach | 10,000 Records | Round Trips |
|---|---|---|
| Individual inserts | ~45 seconds | 10,000 |
| Batch size 50 | ~3 seconds | 200 |
| Batch size 100 | ~2.5 seconds | 100 |
| JDBC executeBatch() | ~1.5 seconds | 100 |
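The round-trip column in the table follows from simple arithmetic: the number of records divided by the batch size, rounded up. The sketch below (hypothetical class and method names) shows that calculation together with the zero-based index check for flushing after each full batch.

```java
/** Arithmetic behind batched writes: round trips and flush points. */
final class BatchMath {

    /** Number of database round trips needed to send all records in batches. */
    static int roundTrips(int records, int batchSize) {
        if (records < 0 || batchSize < 1) {
            throw new IllegalArgumentException("records must be >= 0 and batchSize >= 1");
        }
        return (records + batchSize - 1) / batchSize; // integer ceiling division
    }

    /** True when a flush should follow the item at this zero-based index. */
    static boolean shouldFlushAfter(int index, int batchSize) {
        return (index + 1) % batchSize == 0;
    }

    public static void main(String[] args) {
        System.out.println(roundTrips(10_000, 50));   // prints 200
        System.out.println(roundTrips(10_000, 100));  // prints 100
        System.out.println(shouldFlushAfter(49, 50)); // prints true
    }
}
```

Flushing on the index of the last item in each batch keeps the persistence context aligned with the configured JDBC batch size, so no batch is split by one stray entity.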
Read-Only Query Optimization
Mark read-only transactions to avoid dirty checking overhead:
@Service
public class ReportService {
@Transactional(readOnly = true)
public List<SalesReport> generateMonthlyReport(
YearMonth month) {
// Hibernate skips dirty checking for read-only
// transactions, saving CPU time
return reportRepository
.findByMonth(month.getMonthValue(),
month.getYear());
}
}
// Repository with query hints
public interface ProductRepository
extends JpaRepository<Product, Long> {
@QueryHints({
@QueryHint(
name = "org.hibernate.readOnly",
value = "true"),
@QueryHint(
name = "org.hibernate.fetchSize",
value = "50")
})
@Query("SELECT p FROM Product p WHERE p.active = true")
List<Product> findAllActive();
}
5. HTTP & REST API Performance
Optimize the request-response cycle between your API and clients.
Response Compression (GZIP)
Enable GZIP compression to reduce response payload size by 60-80%:
server:
  compression:
    enabled: true
    min-response-size: 1024 # Compress responses > 1KB
    mime-types:
      - application/json
      - application/xml
      - text/html
      - text/xml
      - text/plain
      - application/javascript
      - text/css
Verify compression is working:
# Without compression
curl -s -o /dev/null -w "%{size_download}" \
http://localhost:8080/api/products
# Output: 45230
# With compression
curl -s -o /dev/null -w "%{size_download}" \
-H "Accept-Encoding: gzip" \
http://localhost:8080/api/products
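# Output: 8124 (82% smaller)
HTTP Caching with ETags
ETags let clients skip downloading unchanged responses. ShallowEtagHeaderFilter computes the ETag from the rendered response body, so it saves bandwidth but not server-side processing: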
@Configuration
public class WebConfig implements WebMvcConfigurer {
@Bean
public FilterRegistrationBean<ShallowEtagHeaderFilter>
shallowEtagHeaderFilter() {
FilterRegistrationBean<ShallowEtagHeaderFilter>
filterBean = new FilterRegistrationBean<>(
new ShallowEtagHeaderFilter());
filterBean.addUrlPatterns("/api/*");
filterBean.setName("etagFilter");
return filterBean;
}
}
Custom Cache-Control headers for specific endpoints:
@RestController
@RequestMapping("/api/products")
public class ProductController {
@GetMapping("/{id}")
public ResponseEntity<Product> getProduct(
@PathVariable Long id) {
Product product = productService.findById(id);
return ResponseEntity.ok()
.cacheControl(CacheControl
.maxAge(Duration.ofMinutes(10))
.mustRevalidate())
.eTag(String.valueOf(
product.getUpdatedAt().hashCode()))
.body(product);
}
@GetMapping("/categories")
public ResponseEntity<List<String>> getCategories() {
List<String> categories =
productService.getAllCategories();
// Categories rarely change — cache for 1 hour
return ResponseEntity.ok()
.cacheControl(CacheControl
.maxAge(Duration.ofHours(1))
.cachePublic())
.body(categories);
}
}
Async Request Handling
For I/O-bound operations, async handling frees up Tomcat threads:
@RestController
@RequestMapping("/api/reports")
public class ReportController {
@Autowired
private ReportService reportService;
// Synchronous — blocks Tomcat thread
@GetMapping("/sync/{id}")
public Report getReportSync(@PathVariable Long id) {
return reportService.generateReport(id);
}
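// Async with CompletableFuture: frees the Tomcat thread.
// Note: supplyAsync() without an executor runs on ForkJoinPool.commonPool();
// for blocking I/O, pass a dedicated executor as the second argument.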
@GetMapping("/async/{id}")
public CompletableFuture<Report> getReportAsync(
@PathVariable Long id) {
return CompletableFuture.supplyAsync(
() -> reportService.generateReport(id));
}
// Streaming large responses
@GetMapping(value = "/stream",
produces = MediaType.APPLICATION_NDJSON_VALUE)
public Flux<Report> streamReports() {
return reportService.streamAllReports();
}
}
Configure Tomcat thread pool:
server:
  tomcat:
    threads:
      max: 200 # Max worker threads
      min-spare: 20 # Min idle threads
    max-connections: 8192
    accept-count: 100 # Queue size when all threads busy
    connection-timeout: 20000
JSON Serialization Optimization
Jackson is powerful but can be slow with large objects. Optimize it:
spring:
  jackson:
    serialization:
      WRITE_DATES_AS_TIMESTAMPS: false
    deserialization:
      FAIL_ON_UNKNOWN_PROPERTIES: false
    default-property-inclusion: non_null # Skip null fields
@Configuration
public class JacksonConfig {
@Bean
public Jackson2ObjectMapperBuilderCustomizer
jacksonCustomizer() {
return builder -> builder
.featuresToDisable(
SerializationFeature
.WRITE_DATES_AS_TIMESTAMPS,
DeserializationFeature
.FAIL_ON_UNKNOWN_PROPERTIES)
.featuresToEnable(
DeserializationFeature
.READ_UNKNOWN_ENUM_VALUES_AS_NULL)
.serializationInclusion(
JsonInclude.Include.NON_NULL);
}
}
Use @JsonView to return only needed fields:
public class Views {
public static class Summary {}
public static class Detail extends Summary {}
}
@Entity
public class Product {
@JsonView(Views.Summary.class)
private Long id;
@JsonView(Views.Summary.class)
private String name;
@JsonView(Views.Summary.class)
private BigDecimal price;
@JsonView(Views.Detail.class)
private String fullDescription;
@JsonView(Views.Detail.class)
private List<Review> reviews;
}
@RestController
@RequestMapping("/api/products")
public class ProductController {
// Returns only id, name, price
@GetMapping
@JsonView(Views.Summary.class)
public List<Product> listProducts() {
return productService.findAll();
}
// Returns all fields including description, reviews
@GetMapping("/{id}")
@JsonView(Views.Detail.class)
public Product getProduct(@PathVariable Long id) {
return productService.findById(id);
}
}
6. Memory Optimization & GC Tuning
Memory issues cause the most dramatic failures — OutOfMemoryError crashes your entire application.
Analyzing Heap Usage
Take a heap dump:
# From a running application
jcmd <PID> GC.heap_dump /tmp/heapdump.hprof
# Automatically on OutOfMemoryError (add to JVM args)
java -XX:+HeapDumpOnOutOfMemoryError \
-XX:HeapDumpPath=/var/log/app/ \
-jar myapp.jar
Analyze with Eclipse MAT (Memory Analyzer Tool):
- Open the .hprof file in MAT
- Run Leak Suspects Report: MAT automatically identifies likely leaks
- Check the Dominator Tree — shows which objects retain the most memory
- Look at Histogram — counts of each object type
Common Memory Leaks in Spring Boot
1. Unbounded caches:
// ❌ Memory leak — grows forever
private final Map<String, Object> cache =
new HashMap<>();
public Object getData(String key) {
return cache.computeIfAbsent(key,
k -> expensiveQuery(k));
}
// ✅ Bounded cache with eviction
private final Map<String, Object> cache =
Collections.synchronizedMap(
new LinkedHashMap<>(100, 0.75f, true) {
@Override
protected boolean removeEldestEntry(
Map.Entry<String, Object> eldest) {
return size() > 1000;
}
}
);
// ✅ Better: Use Spring Cache with TTL (see Redis post)
@Cacheable(value = "data", key = "#key")
public Object getData(String key) {
return expensiveQuery(key);
}
2. Non-closed resources:
// ❌ Stream not closed — database connection leaked
public List<User> getActiveUsers() {
return userRepository.streamAll()
.filter(User::isActive)
.collect(Collectors.toList());
}
// ✅ Try-with-resources closes the stream
@Transactional(readOnly = true)
public List<User> getActiveUsers() {
try (Stream<User> stream =
userRepository.streamAll()) {
return stream
.filter(User::isActive)
.collect(Collectors.toList());
}
}
3. Event listener accumulation:
// ❌ Listener registered but never removed
@Component
public class NotificationListener {
@Autowired
private ApplicationEventPublisher publisher;
@PostConstruct
public void init() {
// Each hot-reload in dev adds another listener!
publisher.publishEvent(
new ListenerRegisteredEvent(this));
}
}
// ✅ Use @EventListener (Spring manages lifecycle)
@Component
public class NotificationListener {
@EventListener
public void onOrderCreated(OrderCreatedEvent event) {
// Spring handles registration/cleanup
sendNotification(event.getOrder());
}
}
GC Algorithm Selection
| GC Algorithm | Best For | Flag | Pause Target |
|---|---|---|---|
| G1GC | General purpose (default since JDK 9) | -XX:+UseG1GC | 200ms |
| ZGC | Ultra-low latency (<10ms pauses) | -XX:+UseZGC | <10ms |
| Shenandoah | Low latency (OpenJDK) | -XX:+UseShenandoahGC | <10ms |
| Parallel GC | Maximum throughput (batch processing) | -XX:+UseParallelGC | N/A |
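To confirm which collector a running JVM actually selected, the standard management beans can be queried from inside the application. This is a small sketch using only `java.lang.management`; the class name is hypothetical.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.List;
import java.util.stream.Collectors;

/** Lists the garbage collectors registered in the current JVM. */
final class ActiveGc {

    static List<String> collectorNames() {
        return ManagementFactory.getGarbageCollectorMXBeans().stream()
                .map(GarbageCollectorMXBean::getName)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // With G1 the names typically include "G1 Young Generation"
        collectorNames().forEach(System.out::println);
    }
}
```

Checking this at startup is a cheap guard against a flag being silently dropped, for example when a container image overrides the JVM options.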
Recommended JVM flags for a typical Spring Boot API:
# -Xms512m / -Xmx2g: initial and maximum heap
# -XX:+UseG1GC with -XX:MaxGCPauseMillis=200: G1 collector with a 200 ms pause target
# -XX:+HeapDumpOnOutOfMemoryError with -XX:HeapDumpPath: write a heap dump on OOM
# -XX:+ExitOnOutOfMemoryError: exit on OOM so the container can restart the app
# (comments cannot follow a trailing backslash, so they are listed here)
java \
-Xms512m \
-Xmx2g \
-XX:+UseG1GC \
-XX:MaxGCPauseMillis=200 \
-XX:+HeapDumpOnOutOfMemoryError \
-XX:HeapDumpPath=/var/log/app/ \
-XX:+ExitOnOutOfMemoryError \
-Xlog:gc*:file=/var/log/app/gc.log:time,uptime:filecount=5,filesize=10m \
-jar myapp.jar
For low-latency APIs (ZGC):
# -XX:+ZGenerational enables generational ZGC (JDK 21+)
java \
-Xms1g -Xmx4g \
-XX:+UseZGC \
-XX:+ZGenerational \
-Xlog:gc*:file=gc.log:time \
-jar myapp.jar
Monitoring GC via Actuator
# GC pause time
curl localhost:8080/actuator/metrics/jvm.gc.pause
# Memory pool usage
curl "localhost:8080/actuator/metrics/jvm.memory.used?tag=area:heap"
# GC count
curl localhost:8080/actuator/metrics/jvm.gc.pause \
| jq '.measurements[] | select(.statistic=="COUNT")'
GC health indicators:
| Metric | Healthy | Warning | Critical |
|---|---|---|---|
| GC pause p99 | < 200ms | 200-500ms | > 500ms |
| GC overhead | < 5% | 5-15% | > 15% |
| Heap after GC | < 70% | 70-85% | > 85% |
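The GC overhead row can be approximated from inside the JVM as accumulated collection time divided by uptime. The sketch below uses the standard management beans; the class name and the string labels are hypothetical, and for concurrent collectors the reported collection time is only a rough proxy for pause time.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

/** Approximates GC overhead and maps it to the thresholds in the table above. */
final class GcHealth {

    /** GC time as a percentage of JVM uptime. */
    static double overheadPercent(long gcMillis, long uptimeMillis) {
        if (uptimeMillis <= 0) {
            throw new IllegalArgumentException("uptime must be positive");
        }
        return 100.0 * gcMillis / uptimeMillis;
    }

    /** Below 5% is healthy, 5-15% is a warning, above 15% is critical. */
    static String classify(double overheadPercent) {
        if (overheadPercent < 5.0) return "healthy";
        if (overheadPercent <= 15.0) return "warning";
        return "critical";
    }

    /** Overhead of the current JVM since startup. */
    static double currentOverheadPercent() {
        long gcMillis = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long time = gc.getCollectionTime(); // -1 when the collector does not report it
            if (time > 0) gcMillis += time;
        }
        long uptime = Math.max(1, ManagementFactory.getRuntimeMXBean().getUptime());
        return overheadPercent(gcMillis, uptime);
    }

    public static void main(String[] args) {
        double overhead = currentOverheadPercent();
        System.out.printf("GC overhead: %.2f%% (%s)%n", overhead, classify(overhead));
    }
}
```

Because this averages over the whole uptime, a recent spike can hide behind a long quiet period; for alerting, a rate over a sliding window from the `jvm.gc.pause` metric is usually more useful.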
7. Application Startup Optimization
Slow startup increases deployment time, CI/CD feedback loops, and cold-start latency in serverless/container environments.
Measure Startup Time
# application.yml
spring:
  main:
    log-startup-info: true
logging:
  level:
    org.springframework.boot.autoconfigure: DEBUG
Startup step timings:
Spring Boot does not print a per-class startup report from a command-line flag. To see where startup time goes, set a BufferingApplicationStartup on the SpringApplication before calling run(), expose the startup Actuator endpoint, and query it:
curl -X POST localhost:8080/actuator/startup
The response lists each recorded startup step (for example spring.beans.instantiate) with its duration, which shows which beans and auto-configurations are slow to initialize.
Lazy Initialization
Only initialize beans when first used — not at startup:
spring:
  main:
    lazy-initialization: true # Global lazy init
Selective lazy initialization (recommended for production):
// Eager (default) — loaded at startup
@Service
public class PaymentService {
// Needed immediately for health checks
}
// Lazy — loaded on first use
@Service
@Lazy
public class ReportGenerationService {
// Only needed when someone requests a report
}
Trade-off: Lazy init reduces startup time but moves initialization cost to the first request. For APIs, this means the first request to a lazy bean is slower.
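The trade-off can be seen in a few lines of plain Java. The sketch below is not how Spring implements @Lazy; it is a simplified illustration with a hypothetical `LazyRef` class that defers an expensive factory call until the first `get()`.

```java
import java.util.function.Supplier;

/** Holds a value that is created on first access and reused afterwards. */
final class LazyRef<T> implements Supplier<T> {
    private final Supplier<T> factory;
    private T value;
    private boolean initialized;

    LazyRef(Supplier<T> factory) {
        this.factory = factory;
    }

    /** The first call pays the initialization cost; later calls return the cached value. */
    @Override
    public synchronized T get() {
        if (!initialized) {
            value = factory.get();
            initialized = true;
        }
        return value;
    }

    synchronized boolean isInitialized() {
        return initialized;
    }

    public static void main(String[] args) {
        LazyRef<String> service = new LazyRef<>(() -> {
            System.out.println("initializing (slow)...");
            return "report-service";
        });
        System.out.println("started, initialized=" + service.isInitialized());
        System.out.println(service.get()); // triggers initialization
        System.out.println(service.get()); // served from the cached value
    }
}
```

Startup finishes before the factory runs, and the first caller absorbs the delay. That is the same shape of cost shift a lazy bean introduces for its first request.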
Exclude Unnecessary Auto-Configuration
@SpringBootApplication(exclude = {
DataSourceAutoConfiguration.class, // If not using DB
MongoAutoConfiguration.class, // If not using Mongo
RedisAutoConfiguration.class, // If not using Redis
MailSenderAutoConfiguration.class, // If not sending email
SecurityAutoConfiguration.class, // If not using security
FlywayAutoConfiguration.class // If not using Flyway
})
public class MyApplication {
public static void main(String[] args) {
SpringApplication.run(MyApplication.class, args);
}
}
Find what auto-configurations are active:
java -jar myapp.jar --debug 2>&1 | grep "matched"
Spring AOT & GraalVM Native Images
For the fastest possible startup, compile to a native image:
<!-- pom.xml — Spring Boot 3.x -->
<plugin>
<groupId>org.graalvm.buildtools</groupId>
<artifactId>native-maven-plugin</artifactId>
</plugin>
# Build native image
mvn -Pnative native:compile
# Run (starts in milliseconds)
./target/myapp
Startup comparison:
| Mode | Startup Time | Memory | Best For |
|---|---|---|---|
| JVM (default) | 3-8 seconds | 200-500 MB | Long-running services |
| JVM + lazy init | 1-4 seconds | 200-500 MB | Long-running services |
| Native image | 50-200 ms | 50-100 MB | Serverless, CLI tools |
Trade-off: Native images have faster startup but slower peak throughput than JIT-compiled JVM. For long-running servers that handle sustained load, the JVM usually wins.
Optimize Component Scanning
// ❌ Scans everything under com.example
@SpringBootApplication
@ComponentScan("com.example")
public class MyApplication {}
// ✅ Scan only what's needed
@SpringBootApplication
@ComponentScan(basePackages = {
"com.example.myapp.controller",
"com.example.myapp.service",
"com.example.myapp.repository",
"com.example.myapp.config"
})
public class MyApplication {}
8. Load Testing & Benchmarking
Performance optimization without measurement is guessing. This section covers micro-benchmarking with JMH and establishing baselines.
JMH (Java Microbenchmark Harness)
JMH is the standard tool for measuring small code performance. It handles JVM warm-up, dead-code elimination, and constant folding — problems that make naive benchmarks unreliable.
<dependency>
<groupId>org.openjdk.jmh</groupId>
<artifactId>jmh-core</artifactId>
<version>1.37</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.openjdk.jmh</groupId>
<artifactId>jmh-generator-annprocess</artifactId>
<version>1.37</version>
<scope>test</scope>
</dependency>
Benchmark example comparing serialization approaches:
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MICROSECONDS)
@State(Scope.Benchmark)
@Warmup(iterations = 3, time = 1)
@Measurement(iterations = 5, time = 1)
@Fork(2)
public class SerializationBenchmark {
private ObjectMapper objectMapper;
private Product product;
@Setup
public void setup() {
objectMapper = new ObjectMapper();
objectMapper.registerModule(
new JavaTimeModule());
product = new Product();
product.setId(1L);
product.setName("Wireless Keyboard");
product.setPrice(new BigDecimal("79.99"));
product.setDescription(
"Ergonomic wireless keyboard with " +
"Bluetooth connectivity");
product.setCategories(
List.of("Electronics", "Accessories"));
product.setCreatedAt(Instant.now());
}
@Benchmark
public String jacksonSerialization()
throws Exception {
return objectMapper.writeValueAsString(product);
}
@Benchmark
public byte[] jacksonSerializationBytes()
throws Exception {
return objectMapper.writeValueAsBytes(product);
}
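// Reuse one Gson instance, like objectMapper, so construction cost is not measured
private final Gson gson = new Gson();
@Benchmark
public String gsonSerialization() {
return gson.toJson(product);
}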
public static void main(String[] args)
throws Exception {
Options opt = new OptionsBuilder()
.include(
SerializationBenchmark.class
.getSimpleName())
.resultFormat(ResultFormatType.JSON)
.result("benchmark-results.json")
.build();
new Runner(opt).run();
}
}
Output:
Benchmark Mode Cnt Score Error Units
SerializationBenchmark.jacksonSerializationBytes avgt 10 1.245 ± 0.034 us/op
SerializationBenchmark.jacksonSerialization avgt 10 1.512 ± 0.041 us/op
SerializationBenchmark.gsonSerialization avgt 10 3.847 ± 0.112 us/op
Establishing Performance Baselines
Create baselines for your critical API endpoints. Store them alongside your code:
@SpringBootTest(
webEnvironment = SpringBootTest.WebEnvironment
.RANDOM_PORT)
public class PerformanceBaselineTest {
@Autowired
private TestRestTemplate restTemplate;
@Test
void getUser_shouldRespondWithin200ms() {
long start = System.nanoTime();
ResponseEntity<User> response = restTemplate
.getForEntity("/api/users/1", User.class);
long duration = (System.nanoTime() - start)
/ 1_000_000;
assertThat(response.getStatusCode())
.isEqualTo(HttpStatus.OK);
assertThat(duration)
.as("GET /api/users/1 should complete " +
"within 200ms, took %dms", duration)
.isLessThan(200);
}
@Test
void listProducts_shouldRespondWithin500ms() {
long start = System.nanoTime();
ResponseEntity<List> response = restTemplate
.getForEntity("/api/products?page=0&size=20",
List.class);
long duration = (System.nanoTime() - start)
/ 1_000_000;
assertThat(response.getStatusCode())
.isEqualTo(HttpStatus.OK);
assertThat(duration)
.as("GET /api/products should complete " +
"within 500ms, took %dms", duration)
.isLessThan(500);
}
}
Integrating with Gatling
For full load testing, use Gatling as covered in Advanced Testing. The workflow ties together:
- JMH finds slow methods and algorithms
- Optimize the identified bottleneck
- Gatling validates the fix under realistic load
- JFR digs deeper if Gatling results still don't meet SLA
9. Production Performance Checklist
Use this checklist when deploying a Spring Boot application to production.
JVM Configuration
# Production JVM flags template
JAVA_OPTS="\
-server \
-Xms${HEAP_SIZE} \
-Xmx${HEAP_SIZE} \
-XX:+UseG1GC \
-XX:MaxGCPauseMillis=200 \
-XX:+HeapDumpOnOutOfMemoryError \
-XX:HeapDumpPath=/var/log/app/heapdump.hprof \
-XX:+ExitOnOutOfMemoryError \
-Xlog:gc*:file=/var/log/app/gc.log:time:filecount=5,filesize=10m \
-Djava.security.egd=file:/dev/./urandom \
-Dspring.profiles.active=prod \
"
java $JAVA_OPTS -jar myapp.jar
Dockerfile with Performance Flags
FROM eclipse-temurin:21-jre-alpine
WORKDIR /app
COPY target/*.jar app.jar
ENV JAVA_OPTS="\
-server \
-Xms512m -Xmx512m \
-XX:+UseG1GC \
-XX:MaxGCPauseMillis=200 \
-XX:+HeapDumpOnOutOfMemoryError \
-XX:HeapDumpPath=/tmp \
-XX:+ExitOnOutOfMemoryError"
EXPOSE 8080
# exec replaces the shell with the JVM so it receives SIGTERM directly on container stop
ENTRYPOINT ["sh", "-c", \
"exec java $JAVA_OPTS -jar app.jar"]
Quick Reference: Performance Settings
| Setting | Development | Production |
|---|---|---|
| Heap size | -Xmx512m | -Xms2g -Xmx2g |
| GC | Default (G1) | G1 or ZGC |
| Hibernate SQL logging | DEBUG | WARN |
| Hibernate stats | true | false |
| HikariCP pool | 5 | 10-20 |
| Tomcat threads | 200 | 200-400 |
| GZIP compression | Optional | Enabled |
| Spring lazy init | true | Selective |
| Actuator exposure | All | Health + Prometheus |
| Log level | DEBUG | INFO |
Performance Monitoring Strategy
Key alerts to configure:
| Metric | Warning | Critical |
|---|---|---|
| p99 response time | > 500ms | > 2s |
| Error rate (5xx) | > 1% | > 5% |
| CPU usage | > 70% | > 90% |
| Heap usage | > 70% | > 85% |
| HikariCP wait time | > 1s | > 5s |
| GC pause p99 | > 200ms | > 500ms |
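The response-time row of the alert table can be sketched as a percentile calculation followed by a threshold check. This example uses the nearest-rank method on raw samples, which is an assumption for illustration (Micrometer and Prometheus estimate percentiles from histograms instead); the class and label names are hypothetical.

```java
import java.util.Arrays;

/** Computes a latency percentile and maps p99 to the alert levels in the table above. */
final class LatencyAlert {

    /** Nearest-rank percentile: the smallest sample with at least p percent of samples at or below it. */
    static long percentile(long[] samplesMillis, double p) {
        if (samplesMillis.length == 0 || p <= 0 || p > 100) {
            throw new IllegalArgumentException("need samples and 0 < p <= 100");
        }
        long[] sorted = samplesMillis.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p * sorted.length / 100.0);
        return sorted[Math.max(rank, 1) - 1];
    }

    /** p99 above 2s is critical, above 500ms is a warning, otherwise ok. */
    static String classifyP99(long p99Millis) {
        if (p99Millis > 2000) return "critical";
        if (p99Millis > 500) return "warning";
        return "ok";
    }

    public static void main(String[] args) {
        long[] samples = {120, 95, 180, 640, 110, 105, 98, 130, 2400, 101};
        long p99 = percentile(samples, 99);
        System.out.println("p99=" + p99 + "ms -> " + classifyP99(p99));
    }
}
```

With only ten samples the p99 is simply the slowest request, which is why percentile alerts need a reasonably large window before they mean anything.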
Summary and Key Takeaways
✅ Measure before optimizing — use JFR, VisualVM, and Actuator to identify real bottlenecks, not guessed ones
✅ Database queries dominate response time — optimize queries, tune HikariCP pool size, and use batch operations
✅ Spring Boot Actuator + Micrometer provides production-ready metrics — expose Prometheus endpoint and build Grafana dashboards
✅ GZIP compression reduces payload by 60-80% — enable it for JSON APIs with minimal CPU cost
✅ HTTP caching with ETags and Cache-Control prevents unnecessary data transfer between client and server
✅ G1GC is the safe default, ZGC for ultra-low latency — set -Xms equal to -Xmx to avoid heap resizing
✅ Lazy initialization speeds startup but shifts cost to first request — use selectively in production
✅ GraalVM native images start in milliseconds but trade peak throughput — ideal for serverless and CLI tools
✅ JMH benchmarks prevent optimization regressions — track serialization, algorithm, and query performance over time
✅ Always dump heap on OOM (-XX:+HeapDumpOnOutOfMemoryError) — you can't debug a crash you can't reproduce
What's Next?
Now that you can profile, measure, and optimize performance, continue building production-ready applications:
Continue the Spring Boot Series
- GraphQL with Spring for GraphQL: Build flexible query APIs that let clients request exactly the data they need — reducing over-fetching
- Docker & Kubernetes Deployment: Containerize your optimized app and configure resource limits, health probes, and horizontal pod autoscaling
- Monitoring with Actuator, Prometheus & Grafana: Build real-time dashboards for the metrics you instrumented in this guide
Related Spring Boot Posts
- Caching with Redis — the companion post for Phase 5 (caching strategies)
- Advanced JPA Optimization — deep dive into N+1 solutions and query tuning
- Advanced Testing: Contract & Performance — Gatling load testing for validating optimizations
- Async Processing & Scheduled Tasks — async patterns for non-blocking performance
- REST API Advanced Patterns — API design that scales
Foundation Posts
- Getting Started with Spring Boot — project setup and fundamentals
- Database Integration with JPA — data layer foundations
Part of the Spring Boot Learning Roadmap series