Caching Strategies for Spring Microservices

📚 Series Navigation:
← Previous: Part 5 - When the World Breaks
👉 You are here: Part 6 - Cache Me If You Can
Next: Part 7 - Guarding the Gates →

📋 Introduction

Your weather microservice is humming along nicely. The architecture is clean, the resilience patterns are solid, the database is performing well. Then marketing sends an email blast and suddenly you have 10,000 users all checking the weather in London at the same time.

Without caching, that's 10,000 identical API calls to your external weather provider (goodbye, API quota), 10,000 identical database queries (hello, connection pool exhaustion), and a latency spike that makes your P99 look like a phone number.

But caching isn't just "put everything in a HashMap and call it a day." Get it wrong and you'll serve stale data, exhaust memory, create subtle race conditions, or — the classic — forget to invalidate the cache when data changes and spend three hours debugging why updates don't appear.

In this article, we'll explore how the Weather Microservice implements smart caching with Caffeine, Spring's cache abstraction, and a custom meta-annotation that keeps cache invalidation sane. ☕

⏱️ The TEMPO Framework: Five Principles of Smart Caching

Meet TEMPO — five principles for effective caching:

Letter	Principle	What It Means
T	TTL-based Expiration	Every cached entry has a time-to-live appropriate to its data type
E	Eviction Strategy	Size-bounded caches with intelligent eviction policies
M	Multiple Regions	Separate caches for different data types with independent TTLs
P	Pattern-based Keys	Cache keys follow predictable patterns using SpEL expressions
O	Operation-aware Invalidation	Write operations automatically evict affected cache entries

🏗️ Cache Architecture: Three Regions, Three TTLs

The Weather Microservice doesn't use a single monolithic cache. It creates three distinct cache regions, each tuned for its data characteristics:

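import java.util.Arrays;
import java.util.concurrent.TimeUnit;

import com.github.benmanes.caffeine.cache.Caffeine;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.cache.CacheManager;
import org.springframework.cache.caffeine.CaffeineCache;
import org.springframework.cache.support.SimpleCacheManager;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

// Assumes @EnableCaching is declared elsewhere, for example on the application class.
@Configuration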
public class CacheConfig {

  private static final int DEFAULT_CACHE_SIZE = 500;

  @Value("${weather.api.cache.current-weather-ttl:300}")
  private long currentWeatherTtl;

  @Value("${weather.api.cache.forecast-ttl:3600}")
  private long forecastTtl;

  @Value("${weather.api.cache.location-ttl:900}")
  private long locationTtl;

  @Bean
  public CacheManager cacheManager() {
    SimpleCacheManager cacheManager = new SimpleCacheManager();

    cacheManager.setCaches(Arrays.asList(
        buildCache("currentWeather", currentWeatherTtl),
        buildCache("forecasts", forecastTtl),
        buildCache("locations", locationTtl)));

    cacheManager.initializeCaches();
    return cacheManager;
  }

  private CaffeineCache buildCache(String name, long ttlSeconds) {
    return new CaffeineCache(
        name,
        Caffeine.newBuilder()
            .maximumSize(DEFAULT_CACHE_SIZE)
            .expireAfterWrite(ttlSeconds, TimeUnit.SECONDS)
            .recordStats()
            .build());
  }
}

Why Three Separate Caches?

Cache Region	TTL	Rationale
`currentWeather`	5 min (300s)	Weather changes frequently but not every second
`forecasts`	1 hour (3600s)	Forecasts change less frequently than current weather
`locations`	15 min (900s)	Location data is mostly static but might be updated

Different data has different staleness tolerances. Current weather from 5 minutes ago is perfectly fine. An hour-old forecast is equally acceptable. But serving an hour-old current temperature when it's raining? That's a bad user experience.

Cache Configuration Deep Dive

Caffeine.newBuilder()
    .maximumSize(DEFAULT_CACHE_SIZE)      // Max 500 entries
    .expireAfterWrite(ttlSeconds, TimeUnit.SECONDS)  // TTL from write time
    .recordStats()                         // Enable hit/miss metrics
    .build()

Setting	Value	Purpose
`maximumSize(500)`	500 entries	Prevents unbounded memory growth
`expireAfterWrite`	Variable	Entries expire N seconds after creation or last update
`recordStats()`	Enabled	Exposes hit rate, eviction count for monitoring

🔥 Critical Insight: expireAfterWrite vs. expireAfterAccess is a crucial distinction. expireAfterWrite means entries expire N seconds after being written, regardless of how often they're read. expireAfterAccess resets the timer on every read. For weather data, expireAfterWrite is correct — you want fresh data after 5 minutes even if the cache entry was just accessed.
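
To make the difference concrete, here is a minimal plain-Java sketch of the two expiry policies. It is a toy model with a hand-advanced clock, not Caffeine's implementation; the class `TtlSketch`, its `Policy` enum, and the sample value `12C` are all made up for illustration. It writes one entry with a 5-minute TTL, reads it at minute 4, and reads it again at minute 8.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy TTL cache with a manually advanced clock; not Caffeine's implementation. */
class TtlSketch {
    enum Policy { AFTER_WRITE, AFTER_ACCESS }

    private static final class Entry {
        final String value;
        long timerStart; // second at which the TTL countdown last (re)started

        Entry(String value, long timerStart) {
            this.value = value;
            this.timerStart = timerStart;
        }
    }

    private final Map<String, Entry> entries = new HashMap<>();
    private final Policy policy;
    private final long ttlSeconds;
    private long now = 0; // simulated clock, in seconds

    TtlSketch(Policy policy, long ttlSeconds) {
        this.policy = policy;
        this.ttlSeconds = ttlSeconds;
    }

    void advance(long seconds) { now += seconds; }

    void put(String key, String value) { entries.put(key, new Entry(value, now)); }

    /** Returns the cached value, or null if the entry is absent or expired. */
    String get(String key) {
        Entry e = entries.get(key);
        if (e == null) return null;
        if (now - e.timerStart >= ttlSeconds) {
            entries.remove(key);
            return null;
        }
        if (policy == Policy.AFTER_ACCESS) e.timerStart = now; // reads reset the timer
        return e.value;
    }

    /** Writes once, reads at minute 4, and returns what the read at minute 8 sees. */
    static String readAtMinuteEight(Policy policy) {
        TtlSketch cache = new TtlSketch(policy, 300); // 5-minute TTL
        cache.put("weather:byName:London", "12C");
        cache.advance(240);
        cache.get("weather:byName:London"); // minute 4: a hit under both policies
        cache.advance(240);
        return cache.get("weather:byName:London"); // minute 8
    }

    public static void main(String[] args) {
        System.out.println("expireAfterWrite:  " + readAtMinuteEight(Policy.AFTER_WRITE));
        System.out.println("expireAfterAccess: " + readAtMinuteEight(Policy.AFTER_ACCESS));
    }
}
```

Under the write-based policy the minute-8 read misses, so the service would fetch fresh weather. Under the access-based policy the minute-4 read restarted the timer, so the 8-minute-old value is still served, and a popular city could in principle stay stale for as long as people keep asking for it.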

Why SimpleCacheManager Instead of CaffeineCacheManager?

Spring provides CaffeineCacheManager, which by default applies one shared Caffeine configuration to every cache it creates. The Weather Microservice uses SimpleCacheManager with individually configured CaffeineCache instances because each cache needs a different TTL. This is more work to set up but gives precise control over each cache region.

🔑 Cache Key Design with SpEL

The Weather Microservice uses Spring Expression Language (SpEL) to generate cache keys:

Weather Cache Keys

// WeatherService.java
@Cacheable(
    value = "currentWeather",
    key = "'weather:byName:' + #locationName",
    unless = "#result == null")
public WeatherDto getCurrentWeather(String locationName, boolean saveToDatabase) {
    // ...
}

@Cacheable(
    value = "currentWeather",
    key = "'weather:byId:' + #locationId",
    unless = "#result == null")
public WeatherDto getCurrentWeatherByLocationId(Long locationId, boolean saveToDatabase) {
    // ...
}

Generated keys:

weather:byName:London — Weather for London by name lookup
weather:byId:42 — Weather for location ID 42

Location Cache Keys

// LocationService.java
@Cacheable(value = "locations", key = "'location:byId:' + #id")
public LocationDto getLocationById(Long id) { ... }

@Cacheable(value = "locations", key = "'location:all'")
public List<LocationDto> getAllLocations() { ... }

@Cacheable(
    value = "locations",
    key = "'location:page:' + #pageable.pageNumber + ':' + #pageable.pageSize")
public Page<LocationDto> getAllLocations(Pageable pageable) { ... }

Generated keys:

location:byId:42 — Single location by ID
location:all — All locations list
location:page:0:20 — Page 0, size 20 of locations

Key Design Patterns

Pattern	Example	When to Use
`type:byField:value`	`weather:byName:London`	Single entity lookup
`type:all`	`location:all`	Full collection
`type:page:N:M`	`location:page:0:20`	Paginated results

The prefix convention (weather:, location:) makes it easy to identify what data a key represents when debugging. It also prevents key collisions between different data types in the same cache region.
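
The key patterns above are plain string concatenation, which makes them easy to mirror in a small helper for unit tests or debugging tools. The `CacheKeys` class below is hypothetical; the service itself builds its keys in SpEL, and nothing in the article suggests such a helper exists in the codebase.

```java
/** Hypothetical helper mirroring the SpEL key patterns; not part of the service. */
final class CacheKeys {
    private CacheKeys() {}

    static String weatherByName(String locationName) {
        return "weather:byName:" + locationName;
    }

    static String weatherById(long locationId) {
        return "weather:byId:" + locationId;
    }

    static String locationById(long id) {
        return "location:byId:" + id;
    }

    static String allLocations() {
        return "location:all";
    }

    static String locationPage(int pageNumber, int pageSize) {
        return "location:page:" + pageNumber + ":" + pageSize;
    }

    public static void main(String[] args) {
        System.out.println(weatherByName("London")); // weather:byName:London
        System.out.println(locationPage(0, 20));     // location:page:0:20
        // Same numeric ID, different prefix, so the two keys cannot collide.
        System.out.println(weatherById(42).equals(locationById(42))); // false
    }
}
```

One thing the sketch makes visible: the name is used verbatim, so `London` and `london` would produce two separate entries unless the caller normalizes the input first.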

The `unless` Guard

@Cacheable(value = "currentWeather", key = "...", unless = "#result == null")

The unless = "#result == null" clause prevents caching null results. Without this, a failed API call that returns null would be cached, and subsequent requests would get null from cache instead of retrying the API call.
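
The effect of the guard can be sketched without Spring. The toy cache below is an assumption-laden stand-in for the cache abstraction: `NullGuardSketch`, the `skipNullResults` flag, and the flaky upstream lambda are invented for illustration. The first upstream call returns null (a simulated failure) and every later call succeeds.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Toy cache-aside lookup showing why null results should not be stored. */
class NullGuardSketch {
    private final Map<String, String> store = new HashMap<>(); // HashMap permits null values
    private final boolean skipNullResults;

    NullGuardSketch(boolean skipNullResults) {
        this.skipNullResults = skipNullResults;
    }

    String get(String key, Function<String, String> loader) {
        if (store.containsKey(key)) {
            return store.get(key); // may be a cached null
        }
        String result = loader.apply(key);
        if (result != null || !skipNullResults) {
            store.put(key, result);
        }
        return result;
    }

    /** The first upstream call fails with null; returns what the second request sees. */
    static String secondRequest(boolean skipNullResults) {
        NullGuardSketch cache = new NullGuardSketch(skipNullResults);
        int[] upstreamCalls = {0};
        Function<String, String> flakyApi = name -> upstreamCalls[0]++ == 0 ? null : "12C";
        cache.get("weather:byName:London", flakyApi); // first request: upstream returns null
        return cache.get("weather:byName:London", flakyApi);
    }

    public static void main(String[] args) {
        System.out.println("with guard:    " + secondRequest(true));  // 12C
        System.out.println("without guard: " + secondRequest(false)); // null
    }
}
```

With the guard, the second request retries the upstream call and gets a real value. Without it, the null from the first failure is served from the cache until the entry expires or is evicted.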

🧹 Cache Invalidation: The Hard Problem

"There are only two hard things in Computer Science: cache invalidation and naming things." — Phil Karlton

The Weather Microservice solves cache invalidation with a custom meta-annotation:

The @CacheEvictingOperation Meta-Annotation

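import java.lang.annotation.Documented;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

import org.springframework.cache.annotation.CacheEvict;
import org.springframework.core.annotation.AliasFor;
import org.springframework.transaction.annotation.Transactional;

@Target(ElementType.METHOD)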
@Retention(RetentionPolicy.RUNTIME)
@Documented
@Transactional
@CacheEvict
public @interface CacheEvictingOperation {

  @AliasFor(annotation = CacheEvict.class, attribute = "value")
  String[] cacheNames() default {};

  @AliasFor(annotation = CacheEvict.class, attribute = "key")
  String key() default "";

  @AliasFor(annotation = CacheEvict.class, attribute = "allEntries")
  boolean allEntries() default false;

  @AliasFor(annotation = CacheEvict.class, attribute = "beforeInvocation")
  boolean beforeInvocation() default false;
}

This single annotation combines @Transactional and @CacheEvict. Every write operation that modifies cached data uses it:

// LocationService.java
@CacheEvictingOperation(cacheNames = "locations", allEntries = true)
public LocationDto createLocation(CreateLocationRequest request) {
    // Create location in database
    // The locations cache is cleared once this method returns without throwing
}

@CacheEvictingOperation(cacheNames = "locations", allEntries = true)
public LocationDto updateLocation(Long id, CreateLocationRequest request) {
    // Update location in database
    // The locations cache is cleared once this method returns without throwing
}

@CacheEvictingOperation(cacheNames = "locations", allEntries = true)
public void deleteLocation(Long id) {
    // Delete location from database
    // The locations cache is cleared once this method returns without throwing
}

Why `allEntries = true`?

All three write operations use allEntries = true to evict the entire locations cache. The LocationService documentation explains why:

/**
 * Cache Strategy: Uses allEntries = true because this operation affects:
 * - getAllLocations() - the new location will appear in the full list
 * - All paginated queries - the new location may appear on any page
 * - Cannot selectively evict paginated caches as page keys are dynamic
 */

Consider what happens when you create a new location:

location:all is now stale (missing the new location)
location:page:0:20 might be stale (new location might be on page 0)
location:page:1:20 might be stale (new location might push an entry to page 2)

You can't know which paginated cache entries are affected without recalculating all of them. Evicting everything is the safe, correct approach.
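
A small plain-Java sketch shows how far a single insert can ripple. It assumes pages are cut from an alphabetically sorted list, which the article does not state; the class `PageShiftSketch` and the city names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Counts how many pages change when one location is added (assumes alphabetical order). */
class PageShiftSketch {

    /** Returns one page of the alphabetically sorted names; page numbers start at 0. */
    static List<String> page(List<String> names, int pageNumber, int pageSize) {
        List<String> sorted = new ArrayList<>(names);
        Collections.sort(sorted);
        int from = Math.min(pageNumber * pageSize, sorted.size());
        int to = Math.min(from + pageSize, sorted.size());
        return new ArrayList<>(sorted.subList(from, to));
    }

    /** Counts pages whose content differs before and after adding one name. */
    static int stalePages(List<String> names, String added, int pageSize) {
        List<String> after = new ArrayList<>(names);
        after.add(added);
        int pageCount = (after.size() + pageSize - 1) / pageSize;
        int stale = 0;
        for (int p = 0; p < pageCount; p++) {
            if (!page(names, p, pageSize).equals(page(after, p, pageSize))) {
                stale++;
            }
        }
        return stale;
    }

    public static void main(String[] args) {
        List<String> cities = Arrays.asList("Berlin", "London", "Madrid", "Paris", "Rome", "Vienna");
        System.out.println(stalePages(cities, "Amsterdam", 2)); // 4: every page shifts
        System.out.println(stalePages(cities, "Zurich", 2));    // 1: only a new last page
    }
}
```

Which pages go stale depends on where the new row sorts, and the cache has no way to know that from the keys alone. That is the argument for clearing the whole region rather than guessing.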

@AliasFor: The Spring Magic

@AliasFor(annotation = CacheEvict.class, attribute = "value")
String[] cacheNames() default {};

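The @AliasFor annotation forwards attributes from the custom annotation to the Spring annotation it is built on. When you write @CacheEvictingOperation(cacheNames = "locations"), Spring treats it as @CacheEvict(value = "locations"). (Strictly speaking, Spring's documentation calls @CacheEvictingOperation a composed annotation, and @CacheEvict and @Transactional its meta-annotations.)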

This is what makes composed annotations possible in Spring — they can combine multiple annotations while exposing their attributes through a unified interface.

Why Not `beforeInvocation = true`?

boolean beforeInvocation() default false;

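The default is false, meaning the cache is evicted only after the method returns without throwing. If the database write fails and throws an exception, the cache keeps its current (correct) data. With beforeInvocation = true, you'd evict the cache and then fail the database write, leaving an empty cache and forcing subsequent requests to hit the database unnecessarily. One caveat: "after the method returns" is not necessarily "after the transaction commits"; the relative order of eviction and commit depends on how the caching and transaction advisors are ordered in your application.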
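
The timing difference described above can be modelled in a few lines of plain Java. This is a sketch of the ordering only, not of Spring's interceptor; `EvictTimingSketch` and its simulated failing write are invented for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Models evict-before versus evict-after when the write itself fails. */
class EvictTimingSketch {

    /** Runs a write that always fails and reports how many cache entries survive. */
    static int entriesLeftAfterFailedWrite(boolean beforeInvocation) {
        Map<String, String> cache = new HashMap<>();
        cache.put("location:byId:42", "London");
        cache.put("location:all", "[London]");
        try {
            if (beforeInvocation) {
                cache.clear(); // evict first, then attempt the write
            }
            failingWrite();
            if (!beforeInvocation) {
                cache.clear(); // reached only if the write returned normally
            }
        } catch (IllegalStateException e) {
            // The write failed, so the database still holds the old data.
        }
        return cache.size();
    }

    private static void failingWrite() {
        throw new IllegalStateException("simulated constraint violation");
    }

    public static void main(String[] args) {
        System.out.println(entriesLeftAfterFailedWrite(false)); // 2: cache untouched
        System.out.println(entriesLeftAfterFailedWrite(true));  // 0: cache emptied for nothing
    }
}
```

In both cases the data that was cached was still correct, because the write never happened. Only the evict-before variant throws it away.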

📊 Cache Strategy Per Service Method

Here's the complete picture of caching across the Weather Microservice:

WeatherService Cache Strategy

Method	Cache	Key Pattern	Eviction
getCurrentWeather	currentWeather	weather:byName:{name}	TTL (5 min)
getCurrentWeatherByLocationId	currentWeather	weather:byId:{id}	TTL (5 min)
getWeatherHistory	None	-	N/A (always fresh)
getWeatherHistoryByDateRange	None	-	N/A (always fresh)

Historical data isn't cached because:

Date range queries have too many possible key combinations
Historical data doesn't change (it's already recorded)
The database query is fast with proper indexes

LocationService Cache Strategy

Method	Cache	Key Pattern	Eviction
`getLocationById`	`locations`	`location:byId:{id}`	TTL (15 min) + write ops
`getAllLocations`	`locations`	`location:all`	TTL (15 min) + write ops
`getAllLocations(pageable)`	`locations`	`location:page:{n}:{size}`	TTL (15 min) + write ops
`searchLocationsByName`	None	—	N/A (search results vary)
`createLocation`	`locations`	all entries (`allEntries = true`)	Evicts on write
`updateLocation`	`locations`	all entries (`allEntries = true`)	Evicts on write
`deleteLocation`	`locations`	all entries (`allEntries = true`)	Evicts on write

Search results aren't cached because the search query is unpredictable — there are too many possible name fragments to cache effectively.

🏎️ Why Caffeine?

Caffeine is the de facto standard for in-process Java caching. Here's why the Weather Microservice uses it:

Feature	Caffeine	ConcurrentHashMap	Guava Cache
Eviction policy	Window TinyLfu	None	LRU
Hit rate	Near-optimal	N/A	Good
Thread safety	Lock-free	Yes	Yes
Statistics	Built-in	No	Built-in
Async loading	Yes	No	No
Spring integration	First-class	Manual	Limited

Caffeine's Window TinyLfu eviction policy consistently achieves near-optimal hit rates in benchmarks. It combines recency (LRU) and frequency (LFU) information to make better eviction decisions than either alone.
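
To see why recency alone is not enough, consider a pure LRU cache built on the JDK's access-ordered `LinkedHashMap`. This sketch is not Caffeine and does not implement Window TinyLfu; it only demonstrates the weakness that a frequency-aware policy is designed to address. The class name, the capacity of 3, and the city data are made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Pure LRU cache: a burst of one-off keys pushes out a very popular entry. */
class LruScanSketch {

    /** Access-ordered LinkedHashMap that drops the least recently used entry past capacity. */
    static <K, V> Map<K, V> lru(int capacity) {
        return new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > capacity;
            }
        };
    }

    /** Returns whether the popular key is still cached after three one-off requests. */
    static boolean hotKeySurvivesScan() {
        Map<String, String> cache = lru(3);
        cache.put("weather:byName:London", "12C");
        for (int i = 0; i < 1000; i++) {
            cache.get("weather:byName:London"); // requested a thousand times
        }
        // Three cities that are each requested exactly once.
        cache.put("weather:byName:Tromso", "-4C");
        cache.put("weather:byName:Ushuaia", "3C");
        cache.put("weather:byName:Nuuk", "-7C");
        return cache.containsKey("weather:byName:London");
    }

    public static void main(String[] args) {
        System.out.println(hotKeySurvivesScan()); // false: LRU forgot the hot key
    }
}
```

LRU evicts London because it only remembers when an entry was last touched, not how often. A policy that also tracks frequency has the information it needs to prefer the entry with a thousand hits over three entries with none.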

The recordStats() Call

Caffeine.newBuilder()
    .recordStats()  // ← This enables monitoring
    .build()

With recordStats(), Caffeine tracks:

Hit count — How many requests were served from cache
Miss count — How many requests went to the database/API
Eviction count — How many entries were evicted
Load time — How long cache misses take to resolve

These stats integrate with Micrometer (Part 10) for Prometheus/Grafana dashboards.
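
The most useful number derived from these counters is the hit rate, hits divided by total requests. The sketch below is a toy counter in plain Java, not Caffeine's statistics implementation; `StatsSketch` and the burst simulation are invented for illustration. In the real service the equivalent figure would come from the native Caffeine cache via `stats().hitRate()`.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Toy cache that counts hits and misses; a stand-in for what recordStats() collects. */
class StatsSketch {
    private final Map<String, String> store = new HashMap<>();
    private long hits;
    private long misses;

    String get(String key, Function<String, String> loader) {
        String cached = store.get(key);
        if (cached != null) {
            hits++;
            return cached;
        }
        misses++;
        String loaded = loader.apply(key);
        store.put(key, loaded);
        return loaded;
    }

    /** Fraction of requests served from the cache; 1.0 when nothing was requested yet. */
    double hitRate() {
        long requests = hits + misses;
        return requests == 0 ? 1.0 : (double) hits / requests;
    }

    /** Simulates a burst of identical requests and returns the resulting hit rate. */
    static double hitRateForBurst(int requests) {
        StatsSketch cache = new StatsSketch();
        for (int i = 0; i < requests; i++) {
            cache.get("weather:byName:London", name -> "12C");
        }
        return cache.hitRate();
    }

    public static void main(String[] args) {
        // 10,000 users asking for London: 1 miss, 9,999 hits.
        System.out.println(hitRateForBurst(10_000)); // 0.9999
    }
}
```

For the email-blast scenario from the introduction, that is one upstream call instead of ten thousand. A hit rate that stays low in production is a signal that the TTL is too short, the cache is too small, or the keys are too fine-grained.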

✅ Caching Checklist

[ ] Separate cache regions for data with different staleness tolerances
[ ] TTLs match business requirements — Current weather (5min), forecasts (1hr), locations (15min)
[ ] Size-bounded caches — maximumSize prevents memory exhaustion
[ ] expireAfterWrite for time-sensitive data (not expireAfterAccess)
[ ] SpEL cache keys follow type:byField:value convention
[ ] unless = "#result == null" prevents caching failed lookups
[ ] @CacheEvictingOperation combines @Transactional + @CacheEvict
[ ] allEntries = true for write operations that affect collections/pagination
[ ] beforeInvocation = false to preserve cache on transaction failure
[ ] recordStats() enabled for monitoring cache effectiveness
[ ] Search results NOT cached — too many key combinations
[ ] Historical data NOT cached — already fresh from indexed database queries

🎓 Conclusion: Cache Smart, Not Hard

Caching done well is invisible to users and transformative for performance. Done poorly, it serves stale data and creates debugging nightmares. The principles that keep it on the right side:

The TEMPO framework (TTL-based expiration, Eviction strategy, Multiple regions, Pattern-based keys, Operation-aware invalidation) guides caching decisions
Three separate cache regions with independent TTLs match the staleness tolerance of each data type
Caffeine provides near-optimal hit rates with Window TinyLfu eviction and built-in statistics
SpEL cache keys ('weather:byName:' + #locationName) create predictable, debuggable key patterns
@CacheEvictingOperation is a custom meta-annotation combining @Transactional and @CacheEvict via @AliasFor
allEntries = true is the safe choice for write operations that affect paginated caches
Not everything should be cached — searches and historical data are better served directly
recordStats() enables monitoring so you can measure cache effectiveness in production

Caching is one of those things that's easy to add and hard to get right. The Weather Microservice takes a measured approach — cache what benefits from it, invalidate correctly, and monitor constantly.

Coming Next Week:
Part 7: Guarding the Gates - Security Fundamentals for Microservices 🔒

📚 Series Progress

✅ Part 1: The Blueprint Before the Build
✅ Part 2: Spring Boot Alchemy
✅ Part 3: REST Assured
✅ Part 4: The Data Foundation
✅ Part 5: When the World Breaks
✅ Part 6: Cache Me If You Can ← You just finished this!
⬜ Part 7: Guarding the Gates
⬜ Part 8: Fail Gracefully
⬜ Part 9: 10,000 Threads and a Dream
⬜ Part 10: Can You See Me Now?
⬜ Part 11: Trust, But Verify
⬜ Part 12: Ship It
⬜ Part 13: To Production and Beyond

Happy coding, and remember — the fastest request is the one you never make. ☕

Robert Marcel Saveanu
