🚀 10,000 Threads and a Dream: Virtual Threads and Concurrent Microservices

A default Tomcat thread pool tops out at 200 platform threads, each reserving up to about 1 MB of stack, or roughly 200 MB in total. Java virtual threads can handle 10,000+ concurrent blocking operations at roughly 1 KB each. This article covers the ASYNC framework: virtual thread configuration, CompletableFuture composition, parallel fetching, and async bulk processing.

Java virtual threads — concurrent microservices with Spring Boot

📚 Series Navigation:
Previous: Part 8 - Fail Gracefully
👉 You are here: Part 9 - 10,000 Threads and a Dream
Next: Part 10 - Can You See Me Now? →


📋 Introduction

Traditional Java web servers have a dirty little secret: they run out of threads way before they run out of anything else. A typical Tomcat instance caps its worker pool at 200 platform threads by default. Each thread reserves about 1MB of stack memory. Under load, you hit the thread ceiling, and new requests wait in a queue while existing threads are blocked on I/O: waiting for database responses, API calls, file reads.

Your CPU is 5% utilized. Your memory is fine. But you're out of threads, so everything's slow.

Java 21 changed everything with virtual threads (Project Loom). A virtual thread consumes about 1KB of memory. You can create 10,000 of them. Or 100,000. They're cheap, they're plentiful, and they yield automatically when blocked on I/O. Your 200-thread bottleneck? Gone.

In this article, we'll explore how the Weather Microservice harnesses virtual threads for massive concurrency. We'll see three executor configurations, CompletableFuture-based parallel fetching, and async bulk processing with batching. ☕


⚡ The ASYNC Framework: Five Pillars of Virtual Thread Architecture

Meet ASYNC — five principles for concurrent microservices:

| Letter | Principle | What It Means |
|--------|-----------|---------------|
| A | Asynchronous Composition | CompletableFuture for parallel, composable operations |
| S | Scalable Threading | Virtual threads for 10,000+ concurrent operations |
| Y | Yielding I/O | Virtual threads automatically yield on blocking I/O |
| N | Non-blocking Batching | Bulk operations process items in parallel batches |
| C | Cancellation/Timeout | Every async operation has a timeout and error handler |

🧵 Three Executors for Three Purposes

The Weather Microservice configures three distinct executors:

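import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import org.springframework.boot.context.properties.ConfigurationProperties;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.scheduling.annotation.EnableAsync;
import org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor;
import org.springframework.validation.annotation.Validated;

@Configuration
@EnableAsync
@ConfigurationProperties(prefix = "thread-pool.platform")
@Validated
public class AsyncConfig {

  // Bound from thread-pool.platform.* via the setters below
  private int corePoolSize;
  private int maxPoolSize;
  private int queueCapacity;
  private int awaitTerminationSeconds;

  public void setCorePoolSize(int corePoolSize) { this.corePoolSize = corePoolSize; }
  public void setMaxPoolSize(int maxPoolSize) { this.maxPoolSize = maxPoolSize; }
  public void setQueueCapacity(int queueCapacity) { this.queueCapacity = queueCapacity; }
  public void setAwaitTerminationSeconds(int seconds) { this.awaitTerminationSeconds = seconds; }

  // Default executor for @Async methods: one new virtual thread per task
  @Bean(name = "taskExecutor")
  public Executor taskExecutor() {
    return Executors.newVirtualThreadPerTaskExecutor();
  }

  // Virtual-thread executor for CompletableFuture composition, closed on shutdown
  @Bean(name = "compositeExecutor", destroyMethod = "close")
  public ExecutorService compositeExecutor() {
    return Executors.newVirtualThreadPerTaskExecutor();
  }

  // Bounded platform-thread pool for CPU-bound work
  @Bean(name = "platformExecutor")
  public Executor platformExecutor() {
    ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
    executor.setCorePoolSize(corePoolSize);
    executor.setMaxPoolSize(maxPoolSize);
    executor.setQueueCapacity(queueCapacity);
    executor.setThreadNamePrefix("platform-async-");
    executor.setWaitForTasksToCompleteOnShutdown(true);
    executor.setAwaitTerminationSeconds(awaitTerminationSeconds);
    executor.initialize();
    return executor;
  }
}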

Why Three?

| Executor | Type | Purpose | When to Use |
|----------|------|---------|-------------|
| taskExecutor | Virtual | Default @Async executor | General async operations |
| compositeExecutor | Virtual | Parallel data fetching | CompletableFuture.supplyAsync() |
| platformExecutor | Platform | CPU-bound fallback | Heavy computation, thread pinning |

The key difference between taskExecutor and compositeExecutor:

// compositeExecutor returns ExecutorService (has shutdown/close)
@Bean(name = "compositeExecutor", destroyMethod = "close")
public ExecutorService compositeExecutor() {
    return Executors.newVirtualThreadPerTaskExecutor();
}

The compositeExecutor bean returns ExecutorService instead of Executor. This matters because CompletableFuture.supplyAsync() only needs an Executor, but graceful shutdown needs ExecutorService.close(). The destroyMethod = "close" attribute tells Spring to call close() when the context shuts down; close() stops accepting new tasks and waits for in-flight tasks to finish before the application stops.
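To make the shutdown behaviour concrete, here is a minimal JDK-only sketch (no Spring involved) of what calling close() on a virtual-thread executor does. The class name GracefulCloseDemo and the 100 ms sleep standing in for an I/O call are assumptions made for illustration; they are not part of the Weather Microservice.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

class GracefulCloseDemo {

  /** Submits short blocking tasks, closes the executor, and returns how many tasks finished. */
  static int runAndClose(int tasks) {
    AtomicInteger finished = new AtomicInteger();
    ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor();
    for (int i = 0; i < tasks; i++) {
      executor.submit(() -> {
        try {
          Thread.sleep(100); // stand-in for a slow I/O call
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
          return;
        }
        finished.incrementAndGet();
      });
    }
    // close() (Java 19+) rejects new tasks and blocks until the submitted ones are done.
    executor.close();
    return finished.get();
  }

  public static void main(String[] args) {
    System.out.println("finished = " + runAndClose(5));
  }
}
```

Because close() blocks until the submitted tasks are done, the counter is read only after every task has incremented it. This is the same method Spring invokes through destroyMethod = "close".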

Virtual Threads vs. Platform Threads

Platform Thread (traditional):
+-------------------------------------------------+
| ~1MB stack memory | 1:1 with OS thread | Limited|
+-------------------------------------------------+

Virtual Thread (Project Loom):
+----------------------------------------------+
| ~1KB memory | M:N with OS threads | Unlimited|
+----------------------------------------------+

| Aspect | Platform Threads | Virtual Threads |
|--------|------------------|-----------------|
| Memory per thread | ~1MB | ~1KB |
| Max concurrent | ~200 (typical) | 10,000+ |
| Blocking I/O | Blocks OS thread | Yields, frees OS thread |
| Thread pool needed | Yes (carefully tuned) | No (create freely) |
| CPU-bound work | Efficient | Same as platform |
| Context switch cost | Expensive (OS) | Cheap (JVM) |
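The "create freely" claim is easy to try for yourself. The following is a minimal sketch, assuming Java 21 and nothing but the JDK: it starts one virtual thread per task, each blocking for a short time, and counts how many complete. The class name ManyVirtualThreadsDemo and the sleep standing in for blocking I/O are illustrative assumptions.

```java
import java.time.Duration;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

class ManyVirtualThreadsDemo {

  /** Runs taskCount blocking tasks, one virtual thread each, and returns how many completed. */
  static int runBlockingTasks(int taskCount, Duration blockFor) {
    AtomicInteger completed = new AtomicInteger();
    try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
      for (int i = 0; i < taskCount; i++) {
        executor.submit(() -> {
          // While sleeping, the virtual thread is unmounted and its carrier thread is free.
          Thread.sleep(blockFor);
          return completed.incrementAndGet();
        });
      }
    } // leaving the try block calls close(), which waits for all submitted tasks
    return completed.get();
  }

  public static void main(String[] args) {
    int done = runBlockingTasks(10_000, Duration.ofMillis(50));
    System.out.println("completed tasks: " + done);
  }
}
```

Running the same workload on a fixed pool of 200 platform threads would process the tasks in waves of 200; here all of them block at once.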

🔀 Parallel Data Fetching with CompletableFuture

The CompositeWeatherService demonstrates the power of parallel fetching:

Two-Way Parallel Fetch

@Service
public class CompositeWeatherService {

  private final WeatherService weatherService;
  private final ForecastService forecastService;
  private final LocationService locationService;
  private final ExecutorService virtualExecutor;

  public CompositeWeatherService(
      WeatherService weatherService,
      ForecastService forecastService,
      LocationService locationService,
      @Qualifier("compositeExecutor") ExecutorService virtualExecutor) {
    this.weatherService = weatherService;
    this.forecastService = forecastService;
    this.locationService = locationService;
    this.virtualExecutor = virtualExecutor;
  }

  public record WeatherWithForecast(
      WeatherDto weather, List<ForecastDto> forecasts) {}

  @Observed(name = "composite.weather.forecast")
  public WeatherWithForecast getWeatherWithForecast(
      String location, int days, boolean save) {

    CompletableFuture<WeatherDto> weatherFuture =
        CompletableFuture.supplyAsync(
            () -> weatherService.getCurrentWeather(location, save),
            virtualExecutor)
        .orTimeout(30, TimeUnit.SECONDS);

    CompletableFuture<List<ForecastDto>> forecastFuture =
        CompletableFuture.supplyAsync(
            () -> forecastService.getForecast(location, days, save),
            virtualExecutor)
        .orTimeout(30, TimeUnit.SECONDS);

    try {
      CompletableFuture.allOf(weatherFuture, forecastFuture).join();
      return new WeatherWithForecast(weatherFuture.join(), forecastFuture.join());
    } catch (Exception ex) {
      weatherFuture.cancel(true);
      forecastFuture.cancel(true);
      throw ex;
    }
  }
}

The flow:

Request arrives
    |
    +--- Virtual Thread 1: getCurrentWeather("London", true)
    |         +-- Call external API
    |         +-- Save to database
    |         +-- Return WeatherDto
    |
    +--- Virtual Thread 2: getForecast("London", 7, true)
              +-- Call external API
              +-- Save forecasts to database
              +-- Return List<ForecastDto>
    |
    +-- Both complete → Combine into WeatherWithForecast
    +-- Either fails → Cancel the other, throw exception

Without virtual threads, these would consume 2 platform threads from your limited pool while waiting for external API responses. With virtual threads, they yield during I/O and consume nearly zero resources while waiting.

Key Patterns

1. Timeout on every future:

.orTimeout(30, TimeUnit.SECONDS)

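No caller waits forever. If the API doesn't respond in 30 seconds, the future completes exceptionally with a TimeoutException. Keep in mind that orTimeout() only completes the future; it does not interrupt the task behind it, so the underlying HTTP or database call should carry its own timeout as well.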
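To see what orTimeout() does and does not do, here is a minimal JDK-only sketch. It is an illustration under assumptions, not code from the Weather Microservice: the class OrTimeoutDemo, the Outcome record, and the sleep standing in for a slow API call are all hypothetical.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicBoolean;

class OrTimeoutDemo {

  record Outcome(boolean timedOut, boolean taskFinished) {}

  static Outcome run(long taskMillis, long timeoutMillis) {
    AtomicBoolean taskFinished = new AtomicBoolean();
    boolean timedOut = false;
    try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
      CompletableFuture<String> future = CompletableFuture
          .supplyAsync(() -> {
            sleep(taskMillis); // stand-in for a slow external API call
            taskFinished.set(true);
            return "weather";
          }, executor)
          .orTimeout(timeoutMillis, TimeUnit.MILLISECONDS);
      try {
        future.join();
      } catch (CompletionException ex) {
        // join() wraps the TimeoutException in a CompletionException
        timedOut = ex.getCause() instanceof TimeoutException;
      }
    } // close() waits for the task, which the timeout never interrupted
    return new Outcome(timedOut, taskFinished.get());
  }

  private static void sleep(long millis) {
    try {
      Thread.sleep(millis);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(run(300, 50));
  }
}
```

With a 300 ms task and a 50 ms timeout, the caller sees the timeout, yet the task still runs to completion in the background. That is why a client-level timeout on the HTTP call itself is worth having in addition to orTimeout().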

2. Cancel on failure:

catch (Exception ex) {
    weatherFuture.cancel(true);
    forecastFuture.cancel(true);
    throw ex;
}

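If either future fails, the other is cancelled so that nothing downstream keeps waiting on it. Two caveats apply. First, allOf() completes only once every input future has finished, so by the time the catch block runs, both calls have normally already completed or timed out. Second, CompletableFuture.cancel(true) marks the future as cancelled but does not interrupt the thread running the task.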
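If the goal is to react as soon as the first call fails, rather than after both have finished, a small helper can sit in front of allOf(). The sketch below is one possible approach using only the JDK; the helper name allOrFirstFailure is hypothetical and not part of CompletableFuture or the Weather Microservice. It uses manually completed futures so the behaviour is deterministic.

```java
import java.util.concurrent.CompletableFuture;

class FailFastDemo {

  /** Completes normally when all inputs succeed, or exceptionally as soon as any input fails. */
  static CompletableFuture<Void> allOrFirstFailure(CompletableFuture<?>... futures) {
    CompletableFuture<Void> result = new CompletableFuture<>();
    for (CompletableFuture<?> future : futures) {
      future.whenComplete((value, error) -> {
        if (error != null) {
          result.completeExceptionally(error); // first failure wins, later ones are ignored
        }
      });
    }
    CompletableFuture.allOf(futures).whenComplete((ignored, error) -> {
      if (error == null) {
        result.complete(null);
      }
    });
    return result;
  }

  public static void main(String[] args) {
    CompletableFuture<String> weather = new CompletableFuture<>(); // still "in flight"
    CompletableFuture<String> forecast =
        CompletableFuture.failedFuture(new IllegalStateException("forecast API down"));

    boolean allOfDone = CompletableFuture.allOf(weather, forecast).isDone();
    boolean failedFast = allOrFirstFailure(weather, forecast).isCompletedExceptionally();
    System.out.println("allOf done: " + allOfDone + ", fail-fast done: " + failedFast);

    // cancel(true) only marks the future as cancelled; it does not interrupt a running task.
    weather.cancel(true);
    System.out.println("weather cancelled: " + weather.isCancelled());
  }
}
```

To actually stop the work behind a cancelled future, the task itself has to cooperate, for example through an HTTP client timeout or by submitting it with ExecutorService.submit() and cancelling the returned Future.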

3. No @Transactional on the orchestrator:

// No @Transactional here — deliberate!
public WeatherWithForecast getWeatherWithForecast(...)

The service documentation explains why:

Adding @Transactional here would create a single transaction spanning both parallel operations. This defeats the purpose of parallel execution because transactions are thread-local — the parallel operations would need their own transactions anyway. Each service method (getCurrentWeather, getForecast) manages its own transaction.


📦 Async Bulk Processing with Batching

The AsyncBulkWeatherService handles processing multiple items concurrently:

@Service
public class AsyncBulkWeatherService {

  private final WeatherService weatherService;
  private final ForecastService forecastService;
  private final MeterRegistry meterRegistry;
  private final int timeoutSeconds;

  // 6 counters + 3 timers for per-operation metrics
  private final Counter weatherSuccessCounter;
  private final Counter weatherFailureCounter;
  private final Timer weatherTimer;
  // ... (similar for forecasts and updates)
}

Bulk Weather Fetch

The service processes multiple locations in parallel with per-item error handling:

public CompletableFuture<BulkOperationResult<WeatherDto>> bulkFetchWeather(
    List<String> locations, boolean save) {

  // Note: supplyAsync(supplier) without an executor argument runs on
  // ForkJoinPool.commonPool(). Pass a virtual-thread executor as the second
  // argument to run these tasks on virtual threads.
  return CompletableFuture.supplyAsync(() -> {
    // Start one timed fetch per location; all fetches run concurrently
    List<CompletableFuture<WeatherDto>> futures = locations.stream()
        .map(location -> CompletableFuture.supplyAsync(
            () -> weatherTimer.record(() ->
                weatherService.getCurrentWeather(location, save)))
            .exceptionally(ex -> {
                weatherFailureCounter.increment();
                log.warn("Failed to fetch weather for {}: {}", location, ex.getMessage());
                return null;  // Individual failures don't kill the batch
            }))
        .toList();

    // Wait for every fetch and drop the ones that failed (mapped to null above)
    List<WeatherDto> results = futures.stream()
        .map(CompletableFuture::join)
        .filter(Objects::nonNull)
        .toList();

    weatherSuccessCounter.increment(results.size());
    return new BulkOperationResult<>(results, locations.size(), results.size());
  });
}
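The listing above submits every location at once. If you want the batching that the N in ASYNC describes, one way to sketch it with the JDK alone is to partition the input and run each batch in parallel before moving on to the next. The class BatchingDemo, its method names, and the batch size parameter are assumptions for illustration; the actual service may batch differently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;

class BatchingDemo {

  /** Splits items into consecutive sublists of at most batchSize elements. */
  static <T> List<List<T>> partition(List<T> items, int batchSize) {
    if (batchSize <= 0) {
      throw new IllegalArgumentException("batchSize must be positive");
    }
    List<List<T>> batches = new ArrayList<>();
    for (int start = 0; start < items.size(); start += batchSize) {
      batches.add(items.subList(start, Math.min(start + batchSize, items.size())));
    }
    return batches;
  }

  /** Items inside a batch run in parallel on virtual threads; batches run one after another. */
  static <T, R> List<R> processInBatches(List<T> items, int batchSize, Function<T, R> task) {
    List<R> results = new ArrayList<>();
    try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
      for (List<T> batch : partition(items, batchSize)) {
        List<CompletableFuture<R>> futures = batch.stream()
            .map(item -> CompletableFuture.supplyAsync(() -> task.apply(item), executor))
            .toList();
        // Joining here bounds the number of in-flight calls to batchSize.
        futures.forEach(future -> results.add(future.join()));
      }
    }
    return results;
  }

  public static void main(String[] args) {
    List<String> locations = List.of("London", "Paris", "Berlin", "Rome", "Oslo");
    System.out.println(processInBatches(locations, 2, String::length));
  }
}
```

Virtual threads themselves are cheap, so the batch size is less about thread cost and more about protecting whatever sits behind the calls: a rate-limited API or a database connection pool.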

Key Design Decisions

1. Individual failure handling:

.exceptionally(ex -> {
    weatherFailureCounter.increment();
    log.warn("Failed to fetch weather for {}: {}", location, ex.getMessage());
    return null;
})

If London fails but Paris and Berlin succeed, you get results for Paris and Berlin. The failure is logged and metered, but it doesn't kill the entire batch.
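Returning null keeps the batch alive but loses track of which location failed and why. As an alternative, here is a hedged JDK-only sketch that uses handle() to keep one outcome per item. The Outcome record, the fetchWeather stub, and the class name are hypothetical stand-ins, not the service's real types.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class BulkOutcomeDemo {

  /** One entry per requested location: either a value or an error message. */
  record Outcome(String location, String value, String error) {
    boolean succeeded() {
      return error == null;
    }
  }

  // Hypothetical stand-in for weatherService.getCurrentWeather(...)
  static String fetchWeather(String location) {
    if (location.isBlank()) {
      throw new IllegalArgumentException("blank location");
    }
    return location + ": 18C";
  }

  static List<Outcome> bulkFetch(List<String> locations) {
    try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
      List<CompletableFuture<Outcome>> futures = locations.stream()
          .map(location -> CompletableFuture
              .supplyAsync(() -> fetchWeather(location), executor)
              .handle((value, ex) -> ex == null
                  ? new Outcome(location, value, null)
                  : new Outcome(location, null, rootMessage(ex))))
          .toList();
      // handle() never rethrows, so join() cannot fail here
      return futures.stream().map(CompletableFuture::join).toList();
    }
  }

  // supplyAsync wraps exceptions thrown by the task in a CompletionException
  private static String rootMessage(Throwable ex) {
    Throwable cause =
        (ex instanceof CompletionException && ex.getCause() != null) ? ex.getCause() : ex;
    return cause.getMessage();
  }

  public static void main(String[] args) {
    bulkFetch(List.of("London", " ", "Berlin")).forEach(System.out::println);
  }
}
```

The caller can then report partial success per location instead of only a success count.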

2. Per-operation metrics:

Counter weatherSuccessCounter = Counter.builder("async.bulk.weather.success")
    .description("Number of successful bulk weather fetches")
    .register(meterRegistry);

Six counters (success/failure for weather, forecast, update) and three timers let you monitor:

  • How many operations succeed vs. fail?
  • How long does each operation type take?
  • Which operation type has the highest failure rate?

3. Input validation at the controller:

@PostMapping("/weather/bulk")
public CompletableFuture<BulkOperationResult<WeatherDto>> bulkFetchWeather(
    @RequestBody
    @NotEmpty(message = "Locations list cannot be empty")
    @Size(max = 100, message = "Maximum 100 locations per request")
    List<String> locations,
    @RequestParam(defaultValue = "false") boolean save) {
  return asyncBulkWeatherService.bulkFetchWeather(locations, save);
}

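The @Size(max = 100) constraint prevents abuse: nobody should send 10,000 locations in one request. The limit is enforced before any processing starts. Constraint annotations placed directly on a controller parameter are only checked when method validation is active, which means Spring Framework 6.1+ or @Validated on the controller class.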


🎮 The AsyncBulkController: CompletableFuture Returns

@RestController
@RequestMapping("/api/async")
public class AsyncBulkController {

  @GetMapping("/weather")
  public CompletableFuture<List<WeatherDto>> getAsyncWeather(
      @RequestParam @NotEmpty List<String> locations,
      @RequestParam(defaultValue = "false") boolean save) {
    return asyncBulkWeatherService.bulkFetchWeather(locations, save)
        .thenApply(BulkOperationResult::results);
  }
}

When a Spring MVC controller returns CompletableFuture, Spring:

  1. Releases the request thread immediately (non-blocking)
  2. Waits for the future to complete asynchronously
  3. Writes the response when the future resolves
  4. Returns an error if the future fails

This means the request thread is free to handle other requests while the async operation runs on virtual threads.


⚙️ Virtual Threads Everywhere

The Weather Microservice uses virtual threads at three levels:

Level 1: Tomcat Request Handling

spring:
  threads:
    virtual:
      enabled: true

Every incoming HTTP request runs on a virtual thread (the property requires Spring Boot 3.2+ running on Java 21+). Tomcat's thread pool limit is effectively removed.

Level 2: HTTP Client

HttpClient httpClient = HttpClient.newBuilder()
    .executor(Executors.newVirtualThreadPerTaskExecutor())
    .build();

Outgoing HTTP calls to the weather API use virtual threads. When the API call blocks waiting for a response, the virtual thread yields and the OS thread handles other work.

Level 3: Async Operations

@Bean(name = "taskExecutor")
public Executor taskExecutor() {
    return Executors.newVirtualThreadPerTaskExecutor();
}

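Both @Async methods and CompletableFuture.supplyAsync() calls that are handed one of these executors run on virtual threads. A supplyAsync() call without an executor argument falls back to ForkJoinPool.commonPool(), which uses platform threads.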
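A quick way to check which kind of thread a supplyAsync() task lands on is to ask the thread itself. This is a minimal JDK-only sketch; the class and method names are made up for the example.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class ExecutorChoiceDemo {

  /** supplyAsync without an executor: runs on the default async pool (platform threads). */
  static boolean defaultRunsOnVirtualThread() {
    return CompletableFuture
        .supplyAsync(() -> Thread.currentThread().isVirtual())
        .join();
  }

  /** supplyAsync with a virtual-thread executor passed explicitly. */
  static boolean explicitRunsOnVirtualThread() {
    try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
      return CompletableFuture
          .supplyAsync(() -> Thread.currentThread().isVirtual(), executor)
          .join();
    }
  }

  public static void main(String[] args) {
    System.out.println("default executor virtual?  " + defaultRunsOnVirtualThread());
    System.out.println("explicit executor virtual? " + explicitRunsOnVirtualThread());
  }
}
```

The takeaway is that the executor argument decides where the work runs; declaring a virtual-thread bean has no effect on calls that never receive it.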

The result: virtual threads from request to response, at every layer of the stack.


🚫 Why No @Transactional on Orchestrators

This is a subtle but important architectural decision:

// CompositeWeatherService — no @Transactional
public WeatherWithForecast getWeatherWithForecast(
    String location, int days, boolean save) {
    // Parallel calls to weatherService and forecastService
}

Adding @Transactional would:

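  1. Open a transaction bound to the calling thread only; Spring's transaction context is thread-local, so the parallel tasks running on other threads would not join it
  2. Typically hold a database connection for the entire duration (weather + forecast) while the orchestrator does nothing but wait
  3. Suggest atomicity across both operations that does not actually exist
  4. Risk long-running transactions that hold locks unnecessarily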

Instead, each service method manages its own transaction:

  • weatherService.getCurrentWeather() → @Transactional(propagation = REQUIRED)
  • forecastService.getForecast() → @Transactional(propagation = REQUIRED)

Each parallel operation gets its own transaction, its own database connection, and its own rollback boundary. If the weather save fails, it doesn't roll back the forecast save.
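The thread-local point can be illustrated without Spring at all. Spring keeps the current transaction's resources in ThreadLocal storage (via TransactionSynchronizationManager), so a plain ThreadLocal is a reasonable stand-in. The sketch below is an analogy under that assumption; CURRENT_TX and the class name are invented for the example.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class ThreadBoundContextDemo {

  // Stand-in for the thread-bound state Spring keeps for the current transaction
  private static final ThreadLocal<String> CURRENT_TX = new ThreadLocal<>();

  /** Returns the "transaction" a worker thread sees while the caller has one open. */
  static String txSeenByWorker() {
    CURRENT_TX.set("tx-1"); // the orchestrator "opens a transaction" on the calling thread
    try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
      return CompletableFuture
          .supplyAsync(() -> {
            String tx = CURRENT_TX.get(); // read on a different (virtual) thread
            return tx == null ? "none" : tx;
          }, executor)
          .join();
    } finally {
      CURRENT_TX.remove();
    }
  }

  public static void main(String[] args) {
    System.out.println("worker sees transaction: " + txSeenByWorker());
  }
}
```

The worker thread sees no transaction even though the caller set one, which is why each service method needs to open its own.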


📊 When to Use What

| Pattern | Use Case | Example |
|---------|----------|---------|
| @Async | Fire-and-forget operations | Send notification after save |
| CompletableFuture.supplyAsync | Parallel fetch with results | Weather + forecast together |
| CompletableFuture.allOf | Wait for multiple operations | All parallel fetches complete |
| .orTimeout() | Prevent hanging operations | 30-second API call limit |
| .exceptionally() | Per-item error handling | Bulk operation resilience |
| .thenApply() | Transform results | Extract from BulkOperationResult |
| virtualExecutor | I/O-bound parallel work | API calls, DB queries |
| platformExecutor | CPU-bound work | Heavy computation |

✅ Virtual Threads Checklist

  • [ ] spring.threads.virtual.enabled: true for Tomcat virtual thread handling
  • [ ] Virtual thread executor on HTTP client for non-blocking API calls
  • [ ] compositeExecutor with destroyMethod = "close" for graceful shutdown
  • [ ] platformExecutor as fallback for CPU-bound tasks
  • [ ] .orTimeout() on every CompletableFuture — no operation runs forever
  • [ ] .exceptionally() for per-item failure handling in bulk operations
  • [ ] Cancel on failure — if one parallel operation fails, cancel the others
  • [ ] No @Transactional on orchestrators — let each service manage its own transaction
  • [ ] @Size(max = 100) on bulk inputs — prevent abuse
  • [ ] Per-operation metrics — counters and timers for success/failure tracking
  • [ ] CompletableFuture return from controllers for non-blocking request handling

🎓 Conclusion: Threads Are Cheap Now

Virtual threads fundamentally change how you think about concurrency in Java. The key takeaways:

  1. The ASYNC framework (Asynchronous composition, Scalable threading, Yielding I/O, Non-blocking batching, Cancellation/timeout) guides concurrent design
  2. Virtual threads (~1KB each) replace platform threads (~1MB each) for 10,000+ concurrent operations
  3. Three executors serve different purposes: general async, composite operations, and CPU-bound fallback
  4. CompletableFuture.supplyAsync() with virtual thread executors enables parallel data fetching
  5. .orTimeout() and .exceptionally() provide timeout and per-item error handling
  6. No @Transactional on orchestrators — parallel operations need independent transactions
  7. @Size(max = 100) on bulk inputs prevents abuse at the controller level
  8. Controllers returning CompletableFuture free request threads immediately

Virtual threads are arguably the biggest concurrency improvement in Java's history. The Weather Microservice puts them everywhere, from Tomcat to HTTP clients to async operations, and the result is a service designed to handle thousands of concurrent requests with minimal resource usage.

Coming Next Week:
Part 10: Can You See Me Now? - The Three Pillars of Microservice Observability 🔭


📚 Series Progress

✅ Part 1: The Blueprint Before the Build
✅ Part 2: Spring Boot Alchemy
✅ Part 3: REST Assured
✅ Part 4: The Data Foundation
✅ Part 5: When the World Breaks
✅ Part 6: Cache Me If You Can
✅ Part 7: Guarding the Gates
✅ Part 8: Fail Gracefully
✅ Part 9: 10,000 Threads and a Dream ← You just finished this!
⬜ Part 10: Can You See Me Now?
⬜ Part 11: Trust, But Verify
⬜ Part 12: Ship It
⬜ Part 13: To Production and Beyond


Happy coding, and remember — the best thread pool is the one you don't have to tune.


Robert Marcel Saveanu

Software engineer with 15 years in testing, architecture, and the art of surviving corporate dysfunction. Writing about code, quality, and the humans behind both.