Spring Boot Performance Optimization & Profiling

Introduction

Your Spring Boot application works. Tests pass. Users are signing up. Then one day, response times spike from 200ms to 3 seconds, the database connection pool is exhausted, and your monitoring dashboard turns red.

Performance problems don't announce themselves during development — they appear in production under real load. The difference between a slow application and a fast one isn't luck — it's measurement, profiling, and targeted optimization.

This guide teaches you to find and fix performance bottlenecks systematically. No guessing. No premature optimization. Every change backed by data.

What You'll Learn

✅ Identify performance bottlenecks using a measurement-first approach
✅ Profile CPU, memory, and threads with Java Flight Recorder and VisualVM
✅ Monitor runtime metrics with Spring Boot Actuator and Micrometer
✅ Optimize database queries, connection pools, and batch operations
✅ Accelerate HTTP responses with compression, caching, and async handling
✅ Tune JVM memory and garbage collection for production workloads
✅ Speed up application startup with lazy initialization and AOT compilation
✅ Establish performance baselines and detect regressions with JMH

Prerequisites


1. Understanding Performance Bottlenecks

Before optimizing anything, you need to understand where time is spent. Premature optimization wastes effort and often makes code harder to maintain.

The Golden Rule: Measure First

"We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil." — Donald Knuth

Four Categories of Bottlenecks

| Category | Symptoms | Common Causes | Tools |
| --- | --- | --- | --- |
| CPU | High CPU usage, slow computations | Inefficient algorithms, excessive logging, serialization | JFR, VisualVM, top |
| Memory | OutOfMemoryError, frequent GC pauses | Memory leaks, large object graphs, wrong heap size | Heap dumps, MAT, JFR |
| I/O (Database) | Slow queries, connection pool exhaustion | N+1 queries, missing indexes, no caching | Slow query log, EXPLAIN ANALYZE |
| Network | High latency, timeout errors | Uncompressed responses, no HTTP caching, blocking calls | Wireshark, Actuator metrics |

Where Time Is Typically Spent

In most Spring Boot applications, database queries account for the largest share of request time. This means query optimization and caching deliver the biggest wins, which is why we covered Redis caching and JPA optimization earlier in this series.


2. JVM Profiling with VisualVM & Java Flight Recorder

Profiling reveals exactly where your application spends time and memory. Two tools cover most needs.

Java Flight Recorder (JFR)

JFR is built into the JDK (11+). With its default settings the overhead is low (typically about 1% or less), which makes it safe for production use.

Enable JFR at application startup:

java -XX:StartFlightRecording=duration=60s,filename=recording.jfr \
     -jar target/myapp.jar

Enable JFR with continuous recording (production):

java -XX:StartFlightRecording=disk=true,maxsize=500m,maxage=1d \
     -XX:FlightRecorderOptions=stackdepth=256 \
     -jar target/myapp.jar

Trigger a recording on a running application with jcmd:

# Find the PID
jps -l
 
# Start recording
jcmd <PID> JFR.start name=profile duration=120s filename=profile.jfr
 
# Stop recording manually
jcmd <PID> JFR.stop name=profile
 
# Dump current recording
jcmd <PID> JFR.dump name=profile filename=dump.jfr
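
The same kind of recording can also be driven from inside the JVM with the jdk.jfr API, which can be handy in tests or one-off diagnostics. The following is a minimal sketch rather than a recommended setup: the class name JfrSketch, the 100 ms sampling period, and the busy-loop workload are assumptions made for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Duration;
import java.util.List;
import jdk.jfr.Recording;
import jdk.jfr.consumer.RecordedEvent;
import jdk.jfr.consumer.RecordingFile;

// Hypothetical helper; the class name and workload are illustrative only.
final class JfrSketch {

    /** Records JFR events while the task runs and dumps them to a temp file. */
    static Path recordWhile(Runnable task) {
        try (Recording recording = new Recording()) {
            Path file = Files.createTempFile("sketch-", ".jfr");
            // Sample process and system CPU load every 100 ms (assumed period).
            recording.enable("jdk.CPULoad").withPeriod(Duration.ofMillis(100));
            recording.start();
            task.run();
            recording.stop();
            recording.dump(file);
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = recordWhile(() -> {
            // Burn CPU for roughly half a second so there is something to sample.
            long end = System.nanoTime() + Duration.ofMillis(500).toNanos();
            while (System.nanoTime() < end) {
                Math.sqrt(System.nanoTime());
            }
        });
        List<RecordedEvent> events = RecordingFile.readAllEvents(file);
        System.out.println(events.size() + " events written to " + file);
    }
}
```

The resulting file can be opened like any other .jfr recording.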

Analyzing JFR Recordings

Open .jfr files with JDK Mission Control (JMC), a free GUI tool. Oracle JDK bundled it only through JDK 10; for JDK 11 and later, download it separately:

# Launch JMC (if installed)
jmc
 
# Or use jfr command-line tool (JDK 12+)
jfr print --events jdk.CPULoad recording.jfr
jfr summary recording.jfr

Key events to look for:

| JFR Event | What It Reveals | Action |
| --- | --- | --- |
| jdk.CPULoad | Process and system CPU usage | Identify CPU-bound code |
| jdk.GCPhasePause | GC pause duration and frequency | Tune GC settings |
| jdk.ObjectAllocationSample (JDK 16+) | Where objects are allocated | Reduce allocations |
| jdk.JavaMonitorEnter / jdk.JavaMonitorWait | Threads blocked on locks or waiting on monitors | Fix lock contention |
| jdk.FileRead / jdk.FileWrite | File I/O operations | Optimize file access |
| jdk.SocketRead / jdk.SocketWrite | Network I/O | Identify slow external calls |

VisualVM for Development Profiling

VisualVM provides a visual interface for profiling during development:

# Install VisualVM (macOS)
brew install --cask visualvm
 
# Launch
visualvm

CPU Profiling workflow:

  1. Start your Spring Boot app
  2. Open VisualVM → attach to the running JVM
  3. Go to Sampler → click CPU
  4. Trigger the slow operation (e.g., hit the slow API endpoint)
  5. Click Snapshot to capture the results
  6. Sort by Self Time to find hotspots

Creating a Profiling Endpoint (Development Only)

Add a diagnostic endpoint for easy profiling during development:

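import java.util.Map;

import org.springframework.context.annotation.Profile;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
@RequestMapping("/api/debug")
@Profile("dev") // Only available in dev profile
public class ProfilingController {
 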
    @GetMapping("/jfr/start")
    public ResponseEntity<String> startRecording()
            throws Exception {
        ProcessBuilder pb = new ProcessBuilder(
            "jcmd",
            String.valueOf(ProcessHandle.current().pid()),
            "JFR.start",
            "name=debug",
            "duration=60s",
            "filename=debug-recording.jfr"
        );
        pb.redirectErrorStream(true);
        Process process = pb.start();
        String output = new String(
            process.getInputStream().readAllBytes());
        return ResponseEntity.ok(
            "Recording started: " + output);
    }
 
    @GetMapping("/memory")
    public Map<String, Object> memoryInfo() {
        Runtime runtime = Runtime.getRuntime();
        return Map.of(
            "maxMemory", formatBytes(runtime.maxMemory()),
            "totalMemory",
                formatBytes(runtime.totalMemory()),
            "freeMemory",
                formatBytes(runtime.freeMemory()),
            "usedMemory", formatBytes(
                runtime.totalMemory()
                    - runtime.freeMemory())
        );
    }
 
    private String formatBytes(long bytes) {
        return String.format("%.2f MB",
            bytes / (1024.0 * 1024.0));
    }
}

3. Spring Boot Actuator for Runtime Metrics

Spring Boot Actuator exposes production-ready monitoring endpoints. Combined with Micrometer, it provides the metrics foundation for performance monitoring.

Setting Up Actuator

<!-- pom.xml -->
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-actuator</artifactId>
</dependency>
 
<!-- Micrometer Prometheus registry (for Prometheus/Grafana) -->
<dependency>
    <groupId>io.micrometer</groupId>
    <artifactId>micrometer-registry-prometheus</artifactId>
</dependency>

# application.yml
management:
  endpoints:
    web:
      exposure:
        include: health,info,metrics,prometheus,env
  endpoint:
    health:
      show-details: when-authorized
  metrics:
    tags:
      application: ${spring.application.name}
    distribution:
      percentiles-histogram:
        http.server.requests: true
      slo:
        http.server.requests: 50ms,100ms,200ms,500ms,1s

Key Metrics to Monitor

HTTP request metrics:

# All HTTP request metrics
curl localhost:8080/actuator/metrics/http.server.requests
 
# Filter by endpoint
curl "localhost:8080/actuator/metrics/http.server.requests?tag=uri:/api/users"
 
# Filter by status
curl "localhost:8080/actuator/metrics/http.server.requests?tag=status:200"

JVM metrics:

# Memory usage
curl localhost:8080/actuator/metrics/jvm.memory.used
 
# GC pause time
curl localhost:8080/actuator/metrics/jvm.gc.pause
 
# Thread count
curl localhost:8080/actuator/metrics/jvm.threads.live
 
# CPU usage
curl localhost:8080/actuator/metrics/process.cpu.usage

Database metrics:

# HikariCP active connections
curl localhost:8080/actuator/metrics/hikaricp.connections.active
 
# Connection wait time
curl localhost:8080/actuator/metrics/hikaricp.connections.acquire
 
# Connection usage time (how long connections are held)
curl localhost:8080/actuator/metrics/hikaricp.connections.usage

Custom Metrics with Micrometer

Track business-specific performance metrics:

@Service
public class OrderService {
 
    private final Counter orderCounter;
    private final Timer orderProcessingTimer;
    private final DistributionSummary orderValueSummary;
 
    public OrderService(MeterRegistry meterRegistry) {
        this.orderCounter = Counter.builder("orders.created")
            .description("Total orders created")
            .tag("type", "all")
            .register(meterRegistry);
 
        this.orderProcessingTimer = Timer
            .builder("orders.processing.time")
            .description("Order processing duration")
            .publishPercentiles(0.5, 0.95, 0.99)
            .register(meterRegistry);
 
        this.orderValueSummary = DistributionSummary
            .builder("orders.value")
            .description("Order values distribution")
            .baseUnit("dollars")
            .publishPercentiles(0.5, 0.95)
            .register(meterRegistry);
    }
 
    public Order createOrder(OrderRequest request) {
        return orderProcessingTimer.record(() -> {
            Order order = processOrder(request);
            orderCounter.increment();
            orderValueSummary.record(
                order.getTotalAmount().doubleValue());
            return order;
        });
    }
 
    private Order processOrder(OrderRequest request) {
        // Business logic
        return new Order(/* ... */);
    }
}

Gauge for tracking active state:

@Component
public class QueueMetrics {
 
    private final BlockingQueue<Task> taskQueue;
 
    public QueueMetrics(MeterRegistry registry,
            BlockingQueue<Task> taskQueue) {
        this.taskQueue = taskQueue;
 
        Gauge.builder("task.queue.size", taskQueue,
                BlockingQueue::size)
            .description("Current task queue size")
            .register(registry);
 
        Gauge.builder("task.queue.remaining",
                taskQueue,
                BlockingQueue::remainingCapacity)
            .description("Remaining queue capacity")
            .register(registry);
    }
}

Prometheus Endpoint

With the Prometheus registry, all metrics are available at /actuator/prometheus in Prometheus format:

curl localhost:8080/actuator/prometheus

Output:

# HELP http_server_requests_seconds
# TYPE http_server_requests_seconds histogram
http_server_requests_seconds_bucket{method="GET",uri="/api/users",status="200",le="0.05"} 142
http_server_requests_seconds_bucket{method="GET",uri="/api/users",status="200",le="0.1"} 198
http_server_requests_seconds_bucket{method="GET",uri="/api/users",status="200",le="0.2"} 210
http_server_requests_seconds_count{method="GET",uri="/api/users",status="200"} 215
http_server_requests_seconds_sum{method="GET",uri="/api/users",status="200"} 18.432
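
The histogram lines above can be read with simple arithmetic: buckets are cumulative, so the count for le="0.1" is the number of requests that finished within 100 ms, and _sum divided by _count gives the mean. Here is a small sketch of that calculation using the numbers from the sample output; the class and method names are hypothetical.

```java
import java.util.Locale;

// Hypothetical helper for reading Prometheus histogram samples by hand.
final class HistogramSketch {

    /** Mean latency in seconds: the _sum sample divided by the _count sample. */
    static double meanSeconds(double sumSeconds, long count) {
        return sumSeconds / count;
    }

    /** Share of requests at or below a bucket boundary (buckets are cumulative). */
    static double fractionWithin(long bucketCount, long totalCount) {
        return (double) bucketCount / totalCount;
    }

    public static void main(String[] args) {
        long total = 215;          // http_server_requests_seconds_count
        double sum = 18.432;       // http_server_requests_seconds_sum
        long within100ms = 198;    // bucket with le="0.1"

        System.out.println(String.format(Locale.ROOT,
            "mean = %.1f ms", meanSeconds(sum, total) * 1000));
        System.out.println(String.format(Locale.ROOT,
            "within 100 ms = %.1f%%", fractionWithin(within100ms, total) * 100));
        // prints "mean = 85.7 ms" and "within 100 ms = 92.1%"
    }
}
```

In practice Prometheus computes these with rate() and histogram_quantile(), but doing it once by hand makes the bucket semantics easier to remember.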

4. Database Query Optimization

Database access is the #1 bottleneck in most Spring Boot applications. This section focuses on runtime optimization beyond what we covered in Advanced JPA Optimization.

Enable Slow Query Logging

Hibernate slow query log (works with PostgreSQL and any other JDBC database):

# application.yml
spring:
  jpa:
    properties:
      hibernate:
        generate_statistics: true
        session.events.log.LOG_QUERIES_SLOWER_THAN_MS: 100
  datasource:
    hikari:
      leak-detection-threshold: 30000  # 30 seconds

Hibernate statistics logging:

logging:
  level:
    org.hibernate.stat: DEBUG
    org.hibernate.SQL: DEBUG
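    # Bind parameters: Hibernate 5 logger
    org.hibernate.type.descriptor.sql.BasicBinder: TRACE
    # Bind parameters: Hibernate 6 logger (Spring Boot 3.x)
    org.hibernate.orm.jdbc.bind: TRACE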

This outputs query counts per session — a fast way to spot N+1 problems:

Session Metrics {
    726313 nanoseconds spent acquiring 1 JDBC connections;
    326040 nanoseconds spent releasing 1 JDBC connections;
    3524968 nanoseconds spent preparing 12 JDBC statements;  ← 12 queries!
    42803543 nanoseconds spent executing 12 JDBC statements;
    0 nanoseconds spent executing 0 JDBC batches;
}

HikariCP Connection Pool Tuning

HikariCP is Spring Boot's default connection pool. Misconfigured pools are a silent performance killer.

spring:
  datasource:
    hikari:
      # Pool sizing
      maximum-pool-size: 20       # Max connections
      minimum-idle: 5             # Min idle connections
 
      # Timeouts
      connection-timeout: 10000   # 10s to get connection
      idle-timeout: 300000        # 5min idle before removal
      max-lifetime: 1800000       # 30min max connection age
 
      # Validation
      validation-timeout: 5000    # 5s for validation query
 
      # Leak detection
      leak-detection-threshold: 60000  # Warn if held > 60s
 
      # Metrics
      register-mbeans: true

Pool sizing formula:

optimal_pool_size = (core_count * 2) + effective_spindle_count
 
Example: 4-core server with SSD
optimal_pool_size = (4 * 2) + 1 = 9
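
As a starting point, the formula is easy to encode so the value can be derived from configuration instead of being hard-coded. This is only a sketch: the class name is hypothetical, and treating an SSD as one effective spindle follows the example above rather than any measured value.

```java
// Hypothetical helper encoding the pool sizing formula shown above.
final class PoolSizing {

    /**
     * (core_count * 2) + effective_spindle_count.
     * Treat the result as a starting point to validate against metrics.
     */
    static int optimalPoolSize(int coreCount, int effectiveSpindleCount) {
        if (coreCount <= 0 || effectiveSpindleCount < 0) {
            throw new IllegalArgumentException("invalid hardware description");
        }
        return coreCount * 2 + effectiveSpindleCount;
    }

    public static void main(String[] args) {
        // 4 cores with an SSD counted as 1 spindle, as in the example above.
        System.out.println(optimalPoolSize(4, 1)); // prints 9
    }
}
```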

Important: More connections ≠ better performance. Too many connections cause context switching overhead on the database server. Start small (10-20) and increase only if metrics show connection wait times.

Monitor connection pool health:

@Scheduled(fixedRate = 30000) // Every 30 seconds
public void logPoolMetrics() {
    HikariPoolMXBean poolProxy = hikariDataSource
        .getHikariPoolMXBean();
 
    log.info("HikariCP - Active: {}, Idle: {}, " +
             "Waiting: {}, Total: {}",
        poolProxy.getActiveConnections(),
        poolProxy.getIdleConnections(),
        poolProxy.getThreadsAwaitingConnection(),
        poolProxy.getTotalConnections());
}

Batch Operations

Individual INSERT/UPDATE statements are slow. Batch operations reduce round trips dramatically.

spring:
  jpa:
    properties:
      hibernate:
        jdbc:
          batch_size: 50
          batch_versioned_data: true
        order_inserts: true
        order_updates: true

@Service
@Transactional
public class ProductImportService {
 
    @PersistenceContext
    private EntityManager entityManager;
 
    public int importProducts(
            List<ProductDto> products) {
        int batchSize = 50;
 
        for (int i = 0; i < products.size(); i++) {
            Product product = mapToEntity(products.get(i));
            entityManager.persist(product);
 
            // Flush and clear after every full batch
            // (i is zero-based, so test i + 1)
            if ((i + 1) % batchSize == 0) {
                entityManager.flush();
                entityManager.clear();
            }
        }
        entityManager.flush();
        entityManager.clear();
 
        return products.size();
    }
 
    private Product mapToEntity(ProductDto dto) {
        Product product = new Product();
        product.setName(dto.getName());
        product.setPrice(dto.getPrice());
        product.setCategory(dto.getCategory());
        return product;
    }
}

Performance comparison:

| Approach | 10,000 Records | Round Trips |
| --- | --- | --- |
| Individual inserts | ~45 seconds | 10,000 |
| Batch size 50 | ~3 seconds | 200 |
| Batch size 100 | ~2.5 seconds | 100 |
| JDBC executeBatch() | ~1.5 seconds | 100 |
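
The round-trip column follows directly from the batch size: the number of batches is the record count divided by the batch size, rounded up. The sketch below shows that partitioning in plain Java, without a database; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper showing how a batch size maps to round trips.
final class BatchSketch {

    /** Splits items into consecutive batches of at most batchSize elements. */
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int from = 0; from < items.size(); from += batchSize) {
            int to = Math.min(from + batchSize, items.size());
            batches.add(items.subList(from, to));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Integer> records = new ArrayList<>();
        for (int i = 0; i < 10_000; i++) {
            records.add(i);
        }
        // One flush (one round trip) per batch.
        System.out.println(partition(records, 50).size());  // prints 200
        System.out.println(partition(records, 100).size()); // prints 100
    }
}
```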

Read-Only Query Optimization

Mark read-only transactions to avoid dirty checking overhead:

@Service
public class ReportService {
 
    @Transactional(readOnly = true)
    public List<SalesReport> generateMonthlyReport(
            YearMonth month) {
        // Hibernate skips dirty checking for read-only
        // transactions, saving CPU time
        return reportRepository
            .findByMonth(month.getMonthValue(),
                         month.getYear());
    }
}

// Repository with query hints
public interface ProductRepository
        extends JpaRepository<Product, Long> {
 
    @QueryHints({
        @QueryHint(
            name = "org.hibernate.readOnly",
            value = "true"),
        @QueryHint(
            name = "org.hibernate.fetchSize",
            value = "50")
    })
    @Query("SELECT p FROM Product p WHERE p.active = true")
    List<Product> findAllActive();
}

5. HTTP & REST API Performance

Optimize the request-response cycle between your API and clients.

Response Compression (GZIP)

Enable GZIP compression; text payloads such as JSON typically shrink by 60-80%:

server:
  compression:
    enabled: true
    min-response-size: 1024      # Compress responses > 1KB
    mime-types:
      - application/json
      - application/xml
      - text/html
      - text/xml
      - text/plain
      - application/javascript
      - text/css

Verify compression is working:

# Without compression
curl -s -o /dev/null -w "%{size_download}" \
  http://localhost:8080/api/products
# Output: 45230
 
# With compression
curl -s -o /dev/null -w "%{size_download}" \
  -H "Accept-Encoding: gzip" \
  http://localhost:8080/api/products
# Output: 8124 (82% smaller)
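
To get a feel for why JSON compresses well and why a minimum response size makes sense, the effect can be reproduced with the JDK's own GZIP streams. This is a sketch with made-up data: the class name and the sample payload are assumptions, and real ratios depend on your responses.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical demo of GZIP on a repetitive JSON payload.
final class GzipSketch {

    static byte[] gzip(byte[] raw) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    static byte[] gunzip(byte[] compressed) {
        try (GZIPInputStream gz =
                 new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            return gz.readAllBytes();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Builds a made-up JSON array of n product objects. */
    static String sampleJson(int n) {
        StringBuilder sb = new StringBuilder("[");
        for (int i = 0; i < n; i++) {
            if (i > 0) sb.append(',');
            sb.append("{\"id\":").append(i)
              .append(",\"name\":\"Product ").append(i)
              .append("\",\"active\":true}");
        }
        return sb.append(']').toString();
    }

    public static void main(String[] args) {
        byte[] raw = sampleJson(500).getBytes(StandardCharsets.UTF_8);
        byte[] compressed = gzip(raw);
        System.out.println(raw.length + " bytes -> " + compressed.length + " bytes");

        // Tiny bodies grow: GZIP adds a fixed header and trailer.
        byte[] tiny = "{}".getBytes(StandardCharsets.UTF_8);
        System.out.println(tiny.length + " bytes -> " + gzip(tiny).length + " bytes");
    }
}
```

The second line of output is the reason for min-response-size: below a certain size, compression costs CPU and makes the payload larger.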

HTTP Caching with ETags

ETags let clients skip downloading unchanged responses:

@Configuration
public class WebConfig implements WebMvcConfigurer {
 
    @Bean
    public FilterRegistrationBean<ShallowEtagHeaderFilter>
            shallowEtagHeaderFilter() {
        FilterRegistrationBean<ShallowEtagHeaderFilter>
            filterBean = new FilterRegistrationBean<>(
                new ShallowEtagHeaderFilter());
        filterBean.addUrlPatterns("/api/*");
        filterBean.setName("etagFilter");
        return filterBean;
    }
}

Custom Cache-Control headers for specific endpoints:

@RestController
@RequestMapping("/api/products")
public class ProductController {
 
    @GetMapping("/{id}")
    public ResponseEntity<Product> getProduct(
            @PathVariable Long id) {
        Product product = productService.findById(id);
 
        return ResponseEntity.ok()
            .cacheControl(CacheControl
                .maxAge(Duration.ofMinutes(10))
                .mustRevalidate())
            .eTag(String.valueOf(
                product.getUpdatedAt().hashCode()))
            .body(product);
    }
 
    @GetMapping("/categories")
    public ResponseEntity<List<String>> getCategories() {
        List<String> categories =
            productService.getAllCategories();
 
        // Categories rarely change — cache for 1 hour
        return ResponseEntity.ok()
            .cacheControl(CacheControl
                .maxAge(Duration.ofHours(1))
                .cachePublic())
            .body(categories);
    }
}
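
Conceptually, ETag revalidation is a hash comparison: the server derives a tag from the current representation and answers 304 Not Modified when the client's If-None-Match value matches it. The sketch below shows that logic with the JDK only. It is not how Spring's filter is implemented; the class name, the choice of SHA-256, and the method names are assumptions.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical illustration of ETag-based revalidation.
final class EtagSketch {

    /** Strong ETag: the quoted SHA-256 hex digest of the response body. */
    static String etagFor(byte[] body) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256").digest(body);
            return "\"" + String.format("%064x", new BigInteger(1, digest)) + "\"";
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 not available", e);
        }
    }

    /** 304 when If-None-Match equals the current ETag, otherwise 200. */
    static int statusFor(String ifNoneMatch, byte[] body) {
        return etagFor(body).equals(ifNoneMatch) ? 304 : 200;
    }

    public static void main(String[] args) {
        byte[] v1 = "{\"id\":1,\"price\":79.99}".getBytes(StandardCharsets.UTF_8);
        byte[] v2 = "{\"id\":1,\"price\":74.99}".getBytes(StandardCharsets.UTF_8);

        String cachedTag = etagFor(v1);
        System.out.println(statusFor(null, v1));      // first request: 200
        System.out.println(statusFor(cachedTag, v1)); // unchanged: 304
        System.out.println(statusFor(cachedTag, v2)); // changed: 200
    }
}
```

Note that a body hash still requires the server to build the full response; it saves bandwidth, not server-side work.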

Async Request Handling

For I/O-bound operations, async handling frees up Tomcat threads:

@RestController
@RequestMapping("/api/reports")
public class ReportController {
 
    @Autowired
    private ReportService reportService;
 
    // Synchronous — blocks Tomcat thread
    @GetMapping("/sync/{id}")
    public Report getReportSync(@PathVariable Long id) {
        return reportService.generateReport(id);
    }
 
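    // Async with CompletableFuture: frees the Tomcat thread.
    // Without an Executor argument, supplyAsync runs on
    // ForkJoinPool.commonPool(); pass a dedicated executor
    // for blocking I/O.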
    @GetMapping("/async/{id}")
    public CompletableFuture<Report> getReportAsync(
            @PathVariable Long id) {
        return CompletableFuture.supplyAsync(
            () -> reportService.generateReport(id));
    }
 
    // Streaming large responses
    @GetMapping(value = "/stream",
        produces = MediaType.APPLICATION_NDJSON_VALUE)
    public Flux<Report> streamReports() {
        return reportService.streamAllReports();
    }
}
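
When the work behind a CompletableFuture blocks on I/O, it usually makes sense to run it on a dedicated, bounded pool instead of the shared common pool. The following is a plain-JDK sketch of that idea; the class name, pool size, and the "report-io-" thread prefix are illustrative assumptions, and the report generation is replaced by a string.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: running blocking work on a dedicated executor.
final class AsyncSketch {

    /** Fixed-size pool with recognizable thread names for profiling. */
    static ExecutorService newReportExecutor(int threads) {
        AtomicInteger counter = new AtomicInteger();
        return Executors.newFixedThreadPool(threads, runnable -> {
            Thread t = new Thread(runnable,
                "report-io-" + counter.incrementAndGet());
            t.setDaemon(true);
            return t;
        });
    }

    /** Stand-in for a slow report; returns the thread that did the work. */
    static CompletableFuture<String> generateReportAsync(
            long id, ExecutorService executor) {
        return CompletableFuture.supplyAsync(
            () -> "report-" + id + " on "
                + Thread.currentThread().getName(),
            executor);
    }

    public static void main(String[] args) {
        ExecutorService executor = newReportExecutor(4);
        try {
            System.out.println(generateReportAsync(42, executor).join());
        } finally {
            executor.shutdown();
        }
    }
}
```

Named threads also make the pool easy to spot in JFR recordings and thread dumps.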

Configure Tomcat thread pool:

server:
  tomcat:
    threads:
      max: 200          # Max worker threads
      min-spare: 20     # Min idle threads
    max-connections: 8192
    accept-count: 100   # Queue size when all threads busy
    connection-timeout: 20000

JSON Serialization Optimization

Jackson is fast for typical payloads, but large object graphs still cost CPU time and bandwidth. Configure it to skip what clients do not need:

spring:
  jackson:
    serialization:
      WRITE_DATES_AS_TIMESTAMPS: false
    deserialization:
      FAIL_ON_UNKNOWN_PROPERTIES: false
    default-property-inclusion: non_null  # Skip null fields

@Configuration
public class JacksonConfig {
 
    @Bean
    public Jackson2ObjectMapperBuilderCustomizer
            jacksonCustomizer() {
        return builder -> builder
            .featuresToDisable(
                SerializationFeature
                    .WRITE_DATES_AS_TIMESTAMPS,
                DeserializationFeature
                    .FAIL_ON_UNKNOWN_PROPERTIES)
            .featuresToEnable(
                DeserializationFeature
                    .READ_UNKNOWN_ENUM_VALUES_AS_NULL)
            .serializationInclusion(
                JsonInclude.Include.NON_NULL);
    }
}

Use @JsonView to return only needed fields:

public class Views {
    public static class Summary {}
    public static class Detail extends Summary {}
}
 
@Entity
public class Product {
 
    @JsonView(Views.Summary.class)
    private Long id;
 
    @JsonView(Views.Summary.class)
    private String name;
 
    @JsonView(Views.Summary.class)
    private BigDecimal price;
 
    @JsonView(Views.Detail.class)
    private String fullDescription;
 
    @JsonView(Views.Detail.class)
    private List<Review> reviews;
}
 
@RestController
@RequestMapping("/api/products")
public class ProductController {
 
    // Returns only id, name, price
    @GetMapping
    @JsonView(Views.Summary.class)
    public List<Product> listProducts() {
        return productService.findAll();
    }
 
    // Returns all fields including description, reviews
    @GetMapping("/{id}")
    @JsonView(Views.Detail.class)
    public Product getProduct(@PathVariable Long id) {
        return productService.findById(id);
    }
}

6. Memory Optimization & GC Tuning

Memory issues cause the most dramatic failures: an OutOfMemoryError can leave the entire application unresponsive or crash it outright.

Analyzing Heap Usage

Take a heap dump:

# From a running application
jcmd <PID> GC.heap_dump /tmp/heapdump.hprof
 
# Automatically on OutOfMemoryError (add to JVM args)
java -XX:+HeapDumpOnOutOfMemoryError \
     -XX:HeapDumpPath=/var/log/app/ \
     -jar myapp.jar

Analyze with Eclipse MAT (Memory Analyzer Tool):

  1. Open the .hprof file in MAT
  2. Run Leak Suspects Report — MAT automatically identifies likely leaks
  3. Check the Dominator Tree — shows which objects retain the most memory
  4. Look at Histogram — counts of each object type

Common Memory Leaks in Spring Boot

1. Unbounded caches:

// ❌ Memory leak — grows forever
private final Map<String, Object> cache =
    new HashMap<>();
 
public Object getData(String key) {
    return cache.computeIfAbsent(key,
        k -> expensiveQuery(k));
}
 
// ✅ Bounded cache with eviction
private final Map<String, Object> cache =
    Collections.synchronizedMap(
        new LinkedHashMap<>(100, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(
                    Map.Entry<String, Object> eldest) {
                return size() > 1000;
            }
        }
    );
 
// ✅ Better: Use Spring Cache with TTL (see Redis post)
@Cacheable(value = "data", key = "#key")
public Object getData(String key) {
    return expensiveQuery(key);
}
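
The eviction behavior of the LinkedHashMap approach is easy to verify in isolation. Below is a minimal sketch of an access-ordered LRU cache; the class name LruCache and the capacity of 2 are chosen only for the demo, and the class is not thread-safe on its own (wrap it with Collections.synchronizedMap as shown above if it is shared).

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical minimal LRU cache built on LinkedHashMap's access order.
final class LruCache<K, V> extends LinkedHashMap<K, V> {

    private static final long serialVersionUID = 1L;
    private final int maxEntries;

    LruCache(int maxEntries) {
        super(16, 0.75f, true); // true = order entries by access, not insertion
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called after each put; returning true evicts the least recently used entry.
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        LruCache<String, Integer> cache = new LruCache<>(2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.get("a");    // "a" becomes the most recently used entry
        cache.put("c", 3); // evicts "b", the least recently used entry
        System.out.println(cache.keySet()); // prints [a, c]
    }
}
```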

2. Non-closed resources:

// ❌ Stream not closed — database connection leaked
public List<User> getActiveUsers() {
    return userRepository.streamAll()
        .filter(User::isActive)
        .collect(Collectors.toList());
}
 
// ✅ Try-with-resources closes the stream
@Transactional(readOnly = true)
public List<User> getActiveUsers() {
    try (Stream<User> stream =
             userRepository.streamAll()) {
        return stream
            .filter(User::isActive)
            .collect(Collectors.toList());
    }
}

3. Event listener accumulation:

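// ❌ Listener added on every call but never removed
// (assumes OrderCreatedEvent extends ApplicationEvent)
@Component
public class NotificationListener {
 
    @Autowired
    private ConfigurableApplicationContext context;
 
    public void watchOrders() {
        // The context keeps every added listener, and
        // everything it references, until shutdown
        context.addApplicationListener(
            new ApplicationListener<OrderCreatedEvent>() {
                @Override
                public void onApplicationEvent(
                        OrderCreatedEvent event) {
                    sendNotification(event.getOrder());
                }
            });
    }
}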
 
// ✅ Use @EventListener (Spring manages lifecycle)
@Component
public class NotificationListener {
 
    @EventListener
    public void onOrderCreated(OrderCreatedEvent event) {
        // Spring handles registration/cleanup
        sendNotification(event.getOrder());
    }
}

GC Algorithm Selection

| GC Algorithm | Best For | Flag | Pause Target |
| --- | --- | --- | --- |
| G1GC | General purpose (default since JDK 9) | -XX:+UseG1GC | 200ms |
| ZGC | Ultra-low latency (<10ms pauses) | -XX:+UseZGC | <10ms |
| Shenandoah | Low latency (OpenJDK) | -XX:+UseShenandoahGC | <10ms |
| Parallel GC | Maximum throughput (batch processing) | -XX:+UseParallelGC | N/A |

Recommended JVM flags for a typical Spring Boot API:

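# -Xms512m / -Xmx2g: initial and maximum heap
# -XX:+UseG1GC: G1 garbage collector
# -XX:MaxGCPauseMillis=200: target GC pause
# -XX:+HeapDumpOnOutOfMemoryError, -XX:HeapDumpPath: heap dump on OOM
# -XX:+ExitOnOutOfMemoryError: exit on OOM (let container restart)
# Comments must stay outside the command: text after a trailing
# backslash breaks the line continuation.
java \
  -Xms512m \
  -Xmx2g \
  -XX:+UseG1GC \
  -XX:MaxGCPauseMillis=200 \
  -XX:+HeapDumpOnOutOfMemoryError \
  -XX:HeapDumpPath=/var/log/app/ \
  -XX:+ExitOnOutOfMemoryError \
  -Xlog:gc*:file=/var/log/app/gc.log:time,uptime:filecount=5,filesize=10m \
  -jar myapp.jar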

For low-latency APIs (ZGC):

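# -XX:+ZGenerational enables generational ZGC on JDK 21 and 22;
# generational mode is the default from JDK 23, so omit the flag there.
java \
  -Xms1g -Xmx4g \
  -XX:+UseZGC \
  -XX:+ZGenerational \
  -Xlog:gc*:file=gc.log:time \
  -jar myapp.jar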

Monitoring GC via Actuator

# GC pause time
curl localhost:8080/actuator/metrics/jvm.gc.pause
 
# Memory pool usage
curl "localhost:8080/actuator/metrics/jvm.memory.used?tag=area:heap"
 
# GC count
curl localhost:8080/actuator/metrics/jvm.gc.pause \
  | jq '.measurements[] | select(.statistic=="COUNT")'

GC health indicators:

| Metric | Healthy | Warning | Critical |
| --- | --- | --- | --- |
| GC pause p99 | < 200ms | 200-500ms | > 500ms |
| GC overhead | < 5% | 5-15% | > 15% |
| Heap after GC | < 70% | 70-85% | > 85% |
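
GC overhead is the share of wall-clock time the JVM spends collecting. A rough, process-lifetime version of that number can be read from the standard management beans, as in this sketch. The class name and the classification helper are hypothetical; the thresholds mirror the table above, and for concurrent collectors the reported collection time may include work that does not pause the application.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.Locale;

// Hypothetical sketch: cumulative GC overhead since JVM start.
final class GcOverheadSketch {

    /** Total reported GC time divided by JVM uptime, as a fraction (0.0 to 1.0). */
    static double gcOverhead() {
        long gcMillis = 0;
        for (GarbageCollectorMXBean gc
                : ManagementFactory.getGarbageCollectorMXBeans()) {
            long time = gc.getCollectionTime(); // -1 when undefined
            if (time > 0) {
                gcMillis += time;
            }
        }
        long uptimeMillis = ManagementFactory.getRuntimeMXBean().getUptime();
        return uptimeMillis <= 0 ? 0.0 : (double) gcMillis / uptimeMillis;
    }

    /** Maps an overhead fraction to the bands used in the table above. */
    static String classify(double overhead) {
        if (overhead < 0.05) return "healthy";
        if (overhead <= 0.15) return "warning";
        return "critical";
    }

    public static void main(String[] args) {
        double overhead = gcOverhead();
        System.out.println(String.format(Locale.ROOT,
            "GC overhead: %.2f%% (%s)", overhead * 100, classify(overhead)));
    }
}
```

For alerting, a windowed rate from jvm.gc.pause is more useful than this lifetime average, which flattens short bursts.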

7. Application Startup Optimization

Slow startup increases deployment time, CI/CD feedback loops, and cold-start latency in serverless/container environments.

Measure Startup Time

# application.yml
spring:
  main:
    log-startup-info: true
 
logging:
  level:
    org.springframework.boot.autoconfigure: DEBUG

Startup step timings (Spring Boot 2.4+):

The Actuator startup endpoint reports how long each startup step took. It requires a BufferingApplicationStartup set on the SpringApplication (via setApplicationStartup) and startup added to management.endpoints.web.exposure.include:

curl -X POST localhost:8080/actuator/startup

The response is a JSON timeline of startup steps (bean instantiation, context refresh, and so on), each with its duration, so you can see which beans and auto-configurations dominate startup time.

Lazy Initialization

Only initialize beans when first used — not at startup:

spring:
  main:
    lazy-initialization: true  # Global lazy init

Selective lazy initialization (recommended for production):

// Eager (default) — loaded at startup
@Service
public class PaymentService {
    // Needed immediately for health checks
}
 
// Lazy — loaded on first use
@Service
@Lazy
public class ReportGenerationService {
    // Only needed when someone requests a report
}

Trade-off: Lazy init reduces startup time but moves initialization cost to the first request. For APIs, this means the first request to a lazy bean is slower.

Exclude Unnecessary Auto-Configuration

@SpringBootApplication(exclude = {
    DataSourceAutoConfiguration.class,       // If not using DB
    MongoAutoConfiguration.class,            // If not using Mongo
    RedisAutoConfiguration.class,            // If not using Redis
    MailSenderAutoConfiguration.class,       // If not sending email
    SecurityAutoConfiguration.class,         // If not using security
    FlywayAutoConfiguration.class            // If not using Flyway
})
public class MyApplication {
    public static void main(String[] args) {
        SpringApplication.run(MyApplication.class, args);
    }
}

Find what auto-configurations are active:

java -jar myapp.jar --debug 2>&1 | grep "matched"

Spring AOT & GraalVM Native Images

For the fastest possible startup, compile to a native image:

<!-- pom.xml — Spring Boot 3.x -->
<plugin>
    <groupId>org.graalvm.buildtools</groupId>
    <artifactId>native-maven-plugin</artifactId>
</plugin>

# Build native image
mvn -Pnative native:compile
 
# Run — starts in milliseconds
./target/myapp

Startup comparison:

| Mode | Startup Time | Memory | Best For |
| --- | --- | --- | --- |
| JVM (default) | 3-8 seconds | 200-500 MB | Long-running services |
| JVM + lazy init | 1-4 seconds | 200-500 MB | Long-running services |
| Native image | 50-200 ms | 50-100 MB | Serverless, CLI tools |

Trade-off: Native images have faster startup but slower peak throughput than JIT-compiled JVM. For long-running servers that handle sustained load, the JVM usually wins.

Optimize Component Scanning

// ❌ Scans everything under com.example
@SpringBootApplication
@ComponentScan("com.example")
public class MyApplication {}
 
// ✅ Scan only what's needed
@SpringBootApplication
@ComponentScan(basePackages = {
    "com.example.myapp.controller",
    "com.example.myapp.service",
    "com.example.myapp.repository",
    "com.example.myapp.config"
})
public class MyApplication {}

8. Load Testing & Benchmarking

Performance optimization without measurement is guessing. This section covers micro-benchmarking with JMH and establishing baselines.

JMH (Java Microbenchmark Harness)

JMH is the standard tool for measuring small code performance. It handles JVM warm-up, dead-code elimination, and constant folding — problems that make naive benchmarks unreliable.

<dependency>
    <groupId>org.openjdk.jmh</groupId>
    <artifactId>jmh-core</artifactId>
    <version>1.37</version>
    <scope>test</scope>
</dependency>
<dependency>
    <groupId>org.openjdk.jmh</groupId>
    <artifactId>jmh-generator-annprocess</artifactId>
    <version>1.37</version>
    <scope>test</scope>
</dependency>

Benchmark example — comparing serialization approaches:

@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MICROSECONDS)
@State(Scope.Benchmark)
@Warmup(iterations = 3, time = 1)
@Measurement(iterations = 5, time = 1)
@Fork(2)
public class SerializationBenchmark {
 
    private ObjectMapper objectMapper;
    private Product product;
 
    @Setup
    public void setup() {
        objectMapper = new ObjectMapper();
        objectMapper.registerModule(
            new JavaTimeModule());
 
        product = new Product();
        product.setId(1L);
        product.setName("Wireless Keyboard");
        product.setPrice(new BigDecimal("79.99"));
        product.setDescription(
            "Ergonomic wireless keyboard with " +
            "Bluetooth connectivity");
        product.setCategories(
            List.of("Electronics", "Accessories"));
        product.setCreatedAt(Instant.now());
    }
 
    @Benchmark
    public String jacksonSerialization()
            throws Exception {
        return objectMapper.writeValueAsString(product);
    }
 
    @Benchmark
    public byte[] jacksonSerializationBytes()
            throws Exception {
        return objectMapper.writeValueAsBytes(product);
    }
 
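    // Reuse one Gson instance, as with ObjectMapper, so the
    // benchmark measures serialization and not construction
    private final Gson gson = new Gson();
 
    @Benchmark
    public String gsonSerialization() {
        return gson.toJson(product);
    }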
 
    public static void main(String[] args)
            throws Exception {
        Options opt = new OptionsBuilder()
            .include(
                SerializationBenchmark.class
                    .getSimpleName())
            .resultFormat(ResultFormatType.JSON)
            .result("benchmark-results.json")
            .build();
        new Runner(opt).run();
    }
}

Output:

Benchmark                                          Mode  Cnt  Score   Error  Units
SerializationBenchmark.jacksonSerializationBytes   avgt   10  1.245 ± 0.034  us/op
SerializationBenchmark.jacksonSerialization        avgt   10  1.512 ± 0.041  us/op
SerializationBenchmark.gsonSerialization           avgt   10  3.847 ± 0.112  us/op

Establishing Performance Baselines

Create baselines for your critical API endpoints. Store them alongside your code:

@SpringBootTest(
    webEnvironment = SpringBootTest.WebEnvironment
        .RANDOM_PORT)
public class PerformanceBaselineTest {
 
    @Autowired
    private TestRestTemplate restTemplate;
 
    @Test
    void getUser_shouldRespondWithin200ms() {
        long start = System.nanoTime();
 
        ResponseEntity<User> response = restTemplate
            .getForEntity("/api/users/1", User.class);
 
        long duration = (System.nanoTime() - start)
            / 1_000_000;
 
        assertThat(response.getStatusCode())
            .isEqualTo(HttpStatus.OK);
        assertThat(duration)
            .as("GET /api/users/1 should complete " +
                "within 200ms, took %dms", duration)
            .isLessThan(200);
    }
 
    @Test
    void listProducts_shouldRespondWithin500ms() {
        long start = System.nanoTime();
 
        ResponseEntity<List> response = restTemplate
            .getForEntity("/api/products?page=0&size=20",
                List.class);
 
        long duration = (System.nanoTime() - start)
            / 1_000_000;
 
        assertThat(response.getStatusCode())
            .isEqualTo(HttpStatus.OK);
        assertThat(duration)
            .as("GET /api/products should complete " +
                "within 500ms, took %dms", duration)
            .isLessThan(500);
    }
}
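
A single timed request is sensitive to JIT warm-up and GC pauses, so tests like the ones above can fail intermittently on a cold JVM. One possible refinement, sketched here under that assumption, is to discard a few warm-up calls and assert on a percentile instead of one sample. `LatencyProbe` and its method names are hypothetical helpers, not part of Spring or AssertJ.

```java
import java.util.Arrays;

final class LatencyProbe {

    /**
     * Runs the action `warmup` times without timing it, then `samples` times
     * with timing. Returns the measured durations in milliseconds, sorted
     * ascending.
     */
    static long[] measureMillis(Runnable action, int warmup, int samples) {
        for (int i = 0; i < warmup; i++) {
            action.run(); // untimed: lets the JIT and caches settle
        }
        long[] durations = new long[samples];
        for (int i = 0; i < samples; i++) {
            long start = System.nanoTime();
            action.run();
            durations[i] = (System.nanoTime() - start) / 1_000_000;
        }
        Arrays.sort(durations);
        return durations;
    }

    /** Nearest-rank percentile over an ascending-sorted array. */
    static long percentile(long[] sorted, double percentile) {
        if (sorted.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        int rank = (int) Math.ceil(percentile / 100.0 * sorted.length);
        return sorted[Math.max(0, rank - 1)];
    }

    // Possible use inside the baseline test (restTemplate and User as above):
    //
    //   long[] d = LatencyProbe.measureMillis(
    //       () -> restTemplate.getForEntity("/api/users/1", User.class),
    //       5, 20);
    //   assertThat(LatencyProbe.percentile(d, 95)).isLessThan(200);
}
```

Asserting on p95 rather than the maximum tolerates an occasional slow outlier while still catching a sustained slowdown.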

Integrating with Gatling

For full load testing, use Gatling as covered in Advanced Testing. The workflow ties together:

  1. JMH finds slow methods and algorithms
  2. Optimize the identified bottleneck
  3. Gatling validates the fix under realistic load
  4. JFR digs deeper if Gatling results still don't meet SLA

9. Production Performance Checklist

Use this checklist when deploying a Spring Boot application to production.

JVM Configuration

# Production JVM flags template (set HEAP_SIZE first, e.g. HEAP_SIZE=2g)
JAVA_OPTS="\
  -server \
  -Xms${HEAP_SIZE} \
  -Xmx${HEAP_SIZE} \
  -XX:+UseG1GC \
  -XX:MaxGCPauseMillis=200 \
  -XX:+HeapDumpOnOutOfMemoryError \
  -XX:HeapDumpPath=/var/log/app/heapdump.hprof \
  -XX:+ExitOnOutOfMemoryError \
  -Xlog:gc*:file=/var/log/app/gc.log:time:filecount=5,filesize=10m \
  -Djava.security.egd=file:/dev/./urandom \
  -Dspring.profiles.active=prod \
"
 
java $JAVA_OPTS -jar myapp.jar
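
It can be worth confirming that the flags actually reached the JVM, since a typo in a wrapper script or an unset variable can cause flags to be silently dropped or changed. The following sketch prints the effective settings using only standard JDK management APIs. The class name `JvmSettingsReport` is hypothetical; in a Spring Boot app the same call could be logged once at startup.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.List;
import java.util.stream.Collectors;

final class JvmSettingsReport {

    /** Builds a one-line summary of the heap limit, CPUs, collectors, and JVM arguments. */
    static String describe() {
        long maxHeapMb = Runtime.getRuntime().maxMemory() / (1024 * 1024);
        int processors = Runtime.getRuntime().availableProcessors();

        // Collector names reveal which GC is active, e.g. names starting with "G1"
        String collectors = ManagementFactory.getGarbageCollectorMXBeans().stream()
            .map(GarbageCollectorMXBean::getName)
            .collect(Collectors.joining(", "));

        // The arguments passed to the JVM itself (not the application arguments)
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();

        return "maxHeap=" + maxHeapMb + "MB"
            + ", processors=" + processors
            + ", collectors=[" + collectors + "]"
            + ", jvmArgs=" + jvmArgs;
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Inside a container, comparing `maxHeap` and `processors` with the configured resource limits is a quick check that the JVM sees the limits you intended.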

Dockerfile with Performance Flags

FROM eclipse-temurin:21-jre-alpine
 
WORKDIR /app
COPY target/*.jar app.jar
 
ENV JAVA_OPTS="\
  -server \
  -Xms512m -Xmx512m \
  -XX:+UseG1GC \
  -XX:MaxGCPauseMillis=200 \
  -XX:+HeapDumpOnOutOfMemoryError \
  -XX:HeapDumpPath=/tmp \
  -XX:+ExitOnOutOfMemoryError"
 
EXPOSE 8080
 
# exec replaces the shell, so the JVM receives SIGTERM and can shut down gracefully
ENTRYPOINT ["sh", "-c", \
  "exec java $JAVA_OPTS -jar app.jar"]

Quick Reference: Performance Settings

| Setting | Development | Production |
|---|---|---|
| Heap size | -Xmx512m | -Xms2g -Xmx2g |
| GC | Default (G1) | G1 or ZGC |
| Hibernate SQL logging | DEBUG | WARN |
| Hibernate stats | true | false |
| HikariCP pool | 5 | 10-20 |
| Tomcat threads | 200 | 200-400 |
| GZIP compression | Optional | Enabled |
| Spring lazy init | true | Selective |
| Actuator exposure | All | Health + Prometheus |
| Log level | DEBUG | INFO |

Performance Monitoring Strategy

Key alerts to configure:

| Metric | Warning | Critical |
|---|---|---|
| p99 response time | > 500ms | > 2s |
| Error rate (5xx) | > 1% | > 5% |
| CPU usage | > 70% | > 90% |
| Heap usage | > 70% | > 85% |
| HikariCP wait time | > 1s | > 5s |
| GC pause p99 | > 200ms | > 500ms |
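
In practice these alerts usually live in the monitoring system, but the two-level logic is simple enough to sketch in code. The example below is illustrative only: `AlertThresholds`, `Threshold`, and `Severity` are hypothetical names, and it assumes a metric value has already been read, for instance from Micrometer.

```java
final class AlertThresholds {

    enum Severity { OK, WARNING, CRITICAL }

    /** Warning and critical limits for one metric; higher values are worse. */
    record Threshold(double warning, double critical) {
        Severity classify(double value) {
            if (value > critical) {
                return Severity.CRITICAL;
            }
            if (value > warning) {
                return Severity.WARNING;
            }
            return Severity.OK;
        }
    }

    // Limits copied from the alert table; the unit is part of each name.
    static final Threshold P99_RESPONSE_MS = new Threshold(500, 2_000);
    static final Threshold ERROR_RATE_PERCENT = new Threshold(1, 5);
    static final Threshold HEAP_USAGE_PERCENT = new Threshold(70, 85);
    static final Threshold GC_PAUSE_P99_MS = new Threshold(200, 500);

    public static void main(String[] args) {
        System.out.println(P99_RESPONSE_MS.classify(350));    // OK
        System.out.println(P99_RESPONSE_MS.classify(750));    // WARNING
        System.out.println(HEAP_USAGE_PERCENT.classify(91));  // CRITICAL
    }
}
```

Keeping warning and critical limits together per metric makes it harder for the two levels to drift apart when thresholds are tuned later.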

Summary and Key Takeaways

  • Measure before optimizing: use JFR, VisualVM, and Actuator to identify real bottlenecks, not guessed ones
  • Database queries dominate response time: optimize queries, tune HikariCP pool size, and use batch operations
  • Spring Boot Actuator + Micrometer provides production-ready metrics: expose the Prometheus endpoint and build Grafana dashboards
  • GZIP compression reduces payload by 60-80%: enable it for JSON APIs with minimal CPU cost
  • HTTP caching with ETags and Cache-Control prevents unnecessary data transfer between client and server
  • G1GC is the safe default, ZGC for ultra-low latency; set -Xms equal to -Xmx to avoid heap resizing
  • Lazy initialization speeds startup but shifts cost to the first request: use selectively in production
  • GraalVM native images start in milliseconds but trade peak throughput: ideal for serverless and CLI tools
  • JMH benchmarks prevent optimization regressions: track serialization, algorithm, and query performance over time
  • Always dump heap on OOM (-XX:+HeapDumpOnOutOfMemoryError): you can't debug a crash you can't reproduce


What's Next?

Now that you can profile, measure, and optimize performance, continue building production-ready applications:

Continue the Spring Boot Series

  • GraphQL with Spring for GraphQL: Build flexible query APIs that let clients request exactly the data they need — reducing over-fetching
  • Docker & Kubernetes Deployment: Containerize your optimized app and configure resource limits, health probes, and horizontal pod autoscaling
  • Monitoring with Actuator, Prometheus & Grafana: Build real-time dashboards for the metrics you instrumented in this guide


Part of the Spring Boot Learning Roadmap series
