⚡ Cache Me If You Can: Smart Caching Strategies for Microservices

Naive caching causes stale data, memory exhaustion, and race conditions. Smart caching solves all three. This article introduces the TEMPO framework — TTL-based expiration, eviction strategies, multiple cache regions, and SpEL-driven cache keys using Caffeine and Spring.

📚 Series Navigation:
Previous: Part 5 - When the World Breaks
👉 You are here: Part 6 - Cache Me If You Can
Next: Part 7 - Guarding the Gates →


📋 Introduction

Your weather microservice is humming along nicely. The architecture is clean, the resilience patterns are solid, the database is performing well. Then marketing sends an email blast and suddenly you have 10,000 users all checking the weather in London at the same time.

Without caching, that's 10,000 identical API calls to your external weather provider (goodbye, API quota), 10,000 identical database queries (hello, connection pool exhaustion), and a latency spike that makes your P99 look like a phone number.

But caching isn't just "put everything in a HashMap and call it a day." Get it wrong and you'll serve stale data, exhaust memory, create subtle race conditions, or — the classic — forget to invalidate the cache when data changes and spend three hours debugging why updates don't appear.

In this article, we'll explore how the Weather Microservice implements smart caching with Caffeine, Spring's cache abstraction, and a custom meta-annotation that keeps cache invalidation sane. ☕


⏱️ The TEMPO Framework: Five Principles of Smart Caching

Meet TEMPO — five principles for effective caching:

Letter | Principle | What It Means
T | TTL-based Expiration | Every cached entry has a time-to-live appropriate to its data type
E | Eviction Strategy | Size-bounded caches with intelligent eviction policies
M | Multiple Regions | Separate caches for different data types with independent TTLs
P | Pattern-based Keys | Cache keys follow predictable patterns using SpEL expressions
O | Operation-aware Invalidation | Write operations automatically evict affected cache entries

🏗️ Cache Architecture: Three Regions, Three TTLs

The Weather Microservice doesn't use a single monolithic cache. It creates three distinct cache regions, each tuned for its data characteristics:

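import java.util.Arrays;
import java.util.concurrent.TimeUnit;

import com.github.benmanes.caffeine.cache.Caffeine;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.cache.CacheManager;
import org.springframework.cache.caffeine.CaffeineCache;
import org.springframework.cache.support.SimpleCacheManager;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration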
public class CacheConfig {

  private static final int DEFAULT_CACHE_SIZE = 500;

  @Value("${weather.api.cache.current-weather-ttl:300}")
  private long currentWeatherTtl;

  @Value("${weather.api.cache.forecast-ttl:3600}")
  private long forecastTtl;

  @Value("${weather.api.cache.location-ttl:900}")
  private long locationTtl;

  @Bean
  public CacheManager cacheManager() {
    SimpleCacheManager cacheManager = new SimpleCacheManager();

    cacheManager.setCaches(Arrays.asList(
        buildCache("currentWeather", currentWeatherTtl),
        buildCache("forecasts", forecastTtl),
        buildCache("locations", locationTtl)));

    cacheManager.initializeCaches();
    return cacheManager;
  }

  private CaffeineCache buildCache(String name, long ttlSeconds) {
    return new CaffeineCache(
        name,
        Caffeine.newBuilder()
            .maximumSize(DEFAULT_CACHE_SIZE)
            .expireAfterWrite(ttlSeconds, TimeUnit.SECONDS)
            .recordStats()
            .build());
  }
}

Why Three Separate Caches?

Cache Region | TTL | Rationale
currentWeather | 5 min (300s) | Weather changes frequently but not every second
forecasts | 1 hour (3600s) | Forecasts change less frequently than current weather
locations | 15 min (900s) | Location data is mostly static but might be updated

Different data has different staleness tolerances. Current weather from 5 minutes ago is perfectly fine, and an hour-old forecast is equally acceptable. But serving an hour-old current temperature when it's raining? That's a bad user experience.
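To make the staleness argument concrete, here's a small sketch in plain Java with no Spring or Caffeine involved. RegionFreshness and isFresh are hypothetical names used only for illustration; the TTL values are the defaults from CacheConfig.

```java
import java.time.Duration;
import java.util.Map;

final class RegionFreshness {

    // TTLs mirror the defaults in CacheConfig (300s, 3600s, 900s).
    static final Map<String, Duration> TTL = Map.of(
        "currentWeather", Duration.ofSeconds(300),
        "forecasts", Duration.ofSeconds(3600),
        "locations", Duration.ofSeconds(900));

    /** True if an entry of the given age is still inside its region's TTL. */
    static boolean isFresh(String region, Duration age) {
        Duration ttl = TTL.get(region);
        if (ttl == null) {
            throw new IllegalArgumentException("Unknown cache region: " + region);
        }
        return age.compareTo(ttl) < 0;
    }

    public static void main(String[] args) {
        Duration tenMinutes = Duration.ofMinutes(10);
        System.out.println(isFresh("currentWeather", tenMinutes)); // false
        System.out.println(isFresh("forecasts", tenMinutes));      // true
        System.out.println(isFresh("locations", tenMinutes));      // true
    }
}
```

The same 10-minute-old entry is already too old as current weather but still fine as a forecast or a location, which is the case for giving each data type its own region.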

Cache Configuration Deep Dive

Caffeine.newBuilder()
    .maximumSize(DEFAULT_CACHE_SIZE)      // Max 500 entries
    .expireAfterWrite(ttlSeconds, TimeUnit.SECONDS)  // TTL from write time
    .recordStats()                         // Enable hit/miss metrics
    .build()

Setting | Value | Purpose
maximumSize(500) | 500 entries | Prevents unbounded memory growth
expireAfterWrite | Variable | Entries expire N seconds after they are written (created or replaced)
recordStats() | Enabled | Exposes hit rate, eviction count for monitoring

🔥 Critical Insight: expireAfterWrite vs. expireAfterAccess is a crucial distinction. expireAfterWrite means entries expire N seconds after being written, regardless of how often they're read. expireAfterAccess resets the timer on every read. For weather data, expireAfterWrite is correct — you want fresh data after 5 minutes even if the cache entry was just accessed.
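The difference is easiest to see with a clock you control. The sketch below is a simplified, single-threaded stand-in written in plain Java, not Caffeine's implementation; TtlCache, its resetOnRead flag, and the manual now field are all assumptions made for the demo.

```java
import java.util.HashMap;
import java.util.Map;

final class ExpiryDemo {

    /** Tiny single-threaded cache driven by a manual clock, in seconds. */
    static final class TtlCache {
        private final Map<String, String> values = new HashMap<>();
        private final Map<String, Long> deadlines = new HashMap<>();
        private final long ttlSeconds;
        private final boolean resetOnRead; // true mimics expireAfterAccess
        long now = 0;                      // advanced by hand in the demo

        TtlCache(long ttlSeconds, boolean resetOnRead) {
            this.ttlSeconds = ttlSeconds;
            this.resetOnRead = resetOnRead;
        }

        void put(String key, String value) {
            values.put(key, value);
            deadlines.put(key, now + ttlSeconds);
        }

        String get(String key) {
            Long deadline = deadlines.get(key);
            if (deadline == null) {
                return null;
            }
            if (now >= deadline) { // expired: drop it and report a miss
                values.remove(key);
                deadlines.remove(key);
                return null;
            }
            if (resetOnRead) {     // access-based expiry restarts the timer
                deadlines.put(key, now + ttlSeconds);
            }
            return values.get(key);
        }
    }

    /** Reads one key at minute 4 and minute 8; true if it is still cached at minute 8. */
    static boolean stillCachedAtMinuteEight(boolean resetOnRead) {
        TtlCache cache = new TtlCache(300, resetOnRead);
        cache.put("weather:byName:London", "12C, rain");
        cache.now = 240;
        cache.get("weather:byName:London");
        cache.now = 480;
        return cache.get("weather:byName:London") != null;
    }

    public static void main(String[] args) {
        System.out.println("write-based:  " + stillCachedAtMinuteEight(false)); // false
        System.out.println("access-based: " + stillCachedAtMinuteEight(true));  // true
    }
}
```

With access-based expiry, a popular city that gets read every few minutes would never refresh, which is exactly the wrong behavior for weather.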

Why SimpleCacheManager Instead of CaffeineCacheManager?

Spring provides CaffeineCacheManager, which by default applies one shared Caffeine configuration to every cache it creates. The Weather Microservice instead uses SimpleCacheManager with individually configured CaffeineCache instances, because each region needs its own TTL. This is more work to set up but gives precise control over each cache region.


🔑 Cache Key Design with SpEL

The Weather Microservice uses Spring Expression Language (SpEL) to generate cache keys:

Weather Cache Keys

// WeatherService.java
@Cacheable(
    value = "currentWeather",
    key = "'weather:byName:' + #locationName",
    unless = "#result == null")
public WeatherDto getCurrentWeather(String locationName, boolean saveToDatabase) {
    // ...
}

@Cacheable(
    value = "currentWeather",
    key = "'weather:byId:' + #locationId",
    unless = "#result == null")
public WeatherDto getCurrentWeatherByLocationId(Long locationId, boolean saveToDatabase) {
    // ...
}

Generated keys:

  • weather:byName:London — Weather for London by name lookup
  • weather:byId:42 — Weather for location ID 42

Location Cache Keys

// LocationService.java
@Cacheable(value = "locations", key = "'location:byId:' + #id")
public LocationDto getLocationById(Long id) { ... }

@Cacheable(value = "locations", key = "'location:all'")
public List<LocationDto> getAllLocations() { ... }

@Cacheable(
    value = "locations",
    key = "'location:page:' + #pageable.pageNumber + ':' + #pageable.pageSize")
public Page<LocationDto> getAllLocations(Pageable pageable) { ... }

Generated keys:

  • location:byId:42 — Single location by ID
  • location:all — All locations list
  • location:page:0:20 — Page 0, size 20 of locations

Key Design Patterns

Pattern | Example | When to Use
type:byField:value | weather:byName:London | Single entity lookup
type:all | location:all | Full collection
type:page:N:M | location:page:0:20 | Paginated results

The prefix convention (weather:, location:) makes it easy to identify what data a key represents when debugging. It also prevents key collisions between different data types in the same cache region.
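If you want the same convention outside of SpEL strings (in tests or log messages, say), the three patterns fit in a tiny helper. This is only a sketch: CacheKeys and its methods are hypothetical and not part of the Weather Microservice.

```java
final class CacheKeys {

    private CacheKeys() {
    }

    /** type:byField:value, e.g. weather:byName:London */
    static String byField(String type, String field, Object value) {
        if (field.isEmpty()) {
            throw new IllegalArgumentException("field must not be empty");
        }
        String capitalized = Character.toUpperCase(field.charAt(0)) + field.substring(1);
        return type + ":by" + capitalized + ":" + value;
    }

    /** type:all, e.g. location:all */
    static String all(String type) {
        return type + ":all";
    }

    /** type:page:N:M, e.g. location:page:0:20 */
    static String page(String type, int pageNumber, int pageSize) {
        return type + ":page:" + pageNumber + ":" + pageSize;
    }

    public static void main(String[] args) {
        System.out.println(byField("weather", "name", "London")); // weather:byName:London
        System.out.println(byField("location", "id", 42L));       // location:byId:42
        System.out.println(all("location"));                      // location:all
        System.out.println(page("location", 0, 20));              // location:page:0:20
    }
}
```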

The unless Guard

@Cacheable(value = "currentWeather", key = "...", unless = "#result == null")

The unless = "#result == null" clause prevents caching null results. Without this, a failed API call that returns null would be cached, and subsequent requests would get null from cache instead of retrying the API call.
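Stripped of the annotations, the guard boils down to "only store what isn't null". Here's a plain-Java sketch of that idea; NullSkippingCache and its loaderCalls counter are made up for the demo and say nothing about how Spring implements unless internally.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

final class NullSkippingCache {

    private final Map<String, String> entries = new HashMap<>();
    int loaderCalls = 0; // how often we had to go to the "API"

    String get(String key, Function<String, String> loader) {
        String cached = entries.get(key);
        if (cached != null) {
            return cached;
        }
        loaderCalls++;
        String result = loader.apply(key);
        if (result != null) { // the equivalent of unless = "#result == null"
            entries.put(key, result);
        }
        return result;
    }

    public static void main(String[] args) {
        NullSkippingCache cache = new NullSkippingCache();

        // A failing lookup is retried on the next request instead of being cached.
        cache.get("weather:byName:Atlantis", key -> null);
        cache.get("weather:byName:Atlantis", key -> null);
        System.out.println(cache.loaderCalls); // 2

        // A successful lookup is loaded once and then served from the cache.
        cache.get("weather:byName:London", key -> "12C, rain");
        cache.get("weather:byName:London", key -> "12C, rain");
        System.out.println(cache.loaderCalls); // 3
    }
}
```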


🧹 Cache Invalidation: The Hard Problem

"There are only two hard things in Computer Science: cache invalidation and naming things." — Phil Karlton

The Weather Microservice solves cache invalidation with a custom meta-annotation:

The @CacheEvictingOperation Meta-Annotation

@Target(ElementType.METHOD)
@Retention(RetentionPolicy.RUNTIME)
@Documented
@Transactional
@CacheEvict
public @interface CacheEvictingOperation {

  @AliasFor(annotation = CacheEvict.class, attribute = "value")
  String[] cacheNames() default {};

  @AliasFor(annotation = CacheEvict.class, attribute = "key")
  String key() default "";

  @AliasFor(annotation = CacheEvict.class, attribute = "allEntries")
  boolean allEntries() default false;

  @AliasFor(annotation = CacheEvict.class, attribute = "beforeInvocation")
  boolean beforeInvocation() default false;
}

This single annotation combines @Transactional and @CacheEvict. Every write operation that modifies cached data uses it:

// LocationService.java
@CacheEvictingOperation(cacheNames = "locations", allEntries = true)
public LocationDto createLocation(CreateLocationRequest request) {
    // Create location in database
    // Cache is automatically evicted after successful transaction
}

@CacheEvictingOperation(cacheNames = "locations", allEntries = true)
public LocationDto updateLocation(Long id, CreateLocationRequest request) {
    // Update location in database
    // Cache is automatically evicted after successful transaction
}

@CacheEvictingOperation(cacheNames = "locations", allEntries = true)
public void deleteLocation(Long id) {
    // Delete location from database
    // Cache is automatically evicted after successful transaction
}

Why allEntries = true?

All three write operations use allEntries = true to evict the entire locations cache. The LocationService documentation explains why:

/**
 * Cache Strategy: Uses allEntries = true because this operation affects:
 * - getAllLocations() - the new location will appear in the full list
 * - All paginated queries - the new location may appear on any page
 * - Cannot selectively evict paginated caches as page keys are dynamic
 */

Consider what happens when you create a new location:

  • location:all is now stale (missing the new location)
  • location:page:0:20 might be stale (new location might be on page 0)
  • location:page:1:20 might be stale (new location might push an entry to page 2)

You can't know which paginated cache entries are affected without recalculating all of them. Evicting everything is the safe, correct approach.
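A quick experiment shows how far one insert can ripple. The sketch below assumes locations are listed in alphabetical order (the sort order is an assumption for this demo) and counts how many cached pages would be wrong after adding a single city. PageShiftDemo and its helpers are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class PageShiftDemo {

    /** Returns the given page of an already sorted list (empty if past the end). */
    static List<String> page(List<String> sorted, int pageNumber, int pageSize) {
        int from = Math.min(pageNumber * pageSize, sorted.size());
        int to = Math.min(from + pageSize, sorted.size());
        return new ArrayList<>(sorted.subList(from, to));
    }

    /** Copy of the list with one more name, re-sorted alphabetically. */
    static List<String> withNew(List<String> sorted, String name) {
        List<String> copy = new ArrayList<>(sorted);
        copy.add(name);
        Collections.sort(copy);
        return copy;
    }

    /** Counts pages whose content differs between the old and the new list. */
    static int stalePages(List<String> before, List<String> after, int pageSize) {
        int pages = (after.size() + pageSize - 1) / pageSize;
        int stale = 0;
        for (int p = 0; p < pages; p++) {
            if (!page(before, p, pageSize).equals(page(after, p, pageSize))) {
                stale++;
            }
        }
        return stale;
    }

    public static void main(String[] args) {
        List<String> before =
            Arrays.asList("Berlin", "London", "Madrid", "Paris", "Rome", "Vienna");

        // A name that sorts first shifts every page.
        System.out.println(stalePages(before, withNew(before, "Amsterdam"), 2)); // 4

        // A name that sorts last only touches the final page.
        System.out.println(stalePages(before, withNew(before, "Zurich"), 2));    // 1
    }
}
```

The number of stale pages depends entirely on where the new row lands, and the write method has no cheap way to know that up front.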

@AliasFor: The Spring Magic

@AliasFor(annotation = CacheEvict.class, attribute = "value")
String[] cacheNames() default {};

The @AliasFor annotation forwards attributes from the meta-annotation to the underlying Spring annotation. When you write @CacheEvictingOperation(cacheNames = "locations"), Spring treats it as @CacheEvict(value = "locations").

This is what makes composed annotations possible in Spring — they can combine multiple annotations while exposing their attributes through a unified interface.

Why Not beforeInvocation = true?

boolean beforeInvocation() default false;

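The default is false, meaning the cache is evicted only after the method returns without throwing. If the database write fails and throws an exception, the cache keeps its current (correct) data. With beforeInvocation = true, you'd evict the cache and then fail the database write, leaving an empty cache for subsequent requests to hit the database unnecessarily. One caveat: whether the eviction lands before or after the transaction actually commits depends on how the caching and transaction advisors are ordered, so if commit-time failures matter to you, Spring's TransactionAwareCacheManagerProxy is worth a look.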
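Here is the same trade-off with the framework stripped away. This plain-Java sketch simulates a write that always fails and shows what is left in the cache under each setting; EvictionTimingDemo, runFailingWrite, and failingWrite are hypothetical, and a HashMap stands in for the locations cache.

```java
import java.util.HashMap;
import java.util.Map;

final class EvictionTimingDemo {

    /** Runs a write that always fails and returns the cache contents afterwards. */
    static Map<String, String> runFailingWrite(boolean beforeInvocation) {
        Map<String, String> cache = new HashMap<>();
        cache.put("location:byId:42", "London");
        try {
            if (beforeInvocation) {
                cache.clear(); // evict first, then attempt the write
            }
            failingWrite();
            if (!beforeInvocation) {
                cache.clear(); // only reached when the write succeeded
            }
        } catch (IllegalStateException expected) {
            // The write failed, so the database still holds the old data.
        }
        return cache;
    }

    private static void failingWrite() {
        throw new IllegalStateException("simulated database failure");
    }

    public static void main(String[] args) {
        System.out.println(runFailingWrite(false)); // {location:byId:42=London}
        System.out.println(runFailingWrite(true));  // {}
    }
}
```

In the first case the cache still holds data that matches the database. In the second, the still-valid entries are gone and the next reads pay for a failure that changed nothing.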


📊 Cache Strategy Per Service Method

Here's the complete picture of caching across the Weather Microservice:

WeatherService Cache Strategy

Method | Cache | Key Pattern | Eviction
getCurrentWeather | currentWeather | weather:byName:{name} | TTL (5 min)
getCurrentWeatherByLocationId | currentWeather | weather:byId:{id} | TTL (5 min)
getWeatherHistory | None | - | N/A (always fresh)
getWeatherHistoryByDateRange | None | - | N/A (always fresh)

Historical data isn't cached because:

  • Date range queries have too many possible key combinations
  • Historical data doesn't change (it's already recorded)
  • The database query is fast with proper indexes
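To put a number on "too many possible key combinations": a start/end pair drawn from a window of n days gives n(n+1)/2 distinct ranges. The sketch below just does that arithmetic; the 365-day window is an assumption for illustration, and DateRangeKeys is a hypothetical name.

```java
final class DateRangeKeys {

    /** Number of distinct [start, end] day ranges inside a window of the given length. */
    static long distinctRanges(long days) {
        return days * (days + 1) / 2;
    }

    public static void main(String[] args) {
        // One year of history, per location.
        System.out.println(distinctRanges(365)); // 66795
    }
}
```

That's 66,795 possible keys for a single location against a cache bounded at 500 entries, so most range queries would miss anyway.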

LocationService Cache Strategy

Method | Cache | Key Pattern | Eviction
getLocationById | locations | location:byId:{id} | TTL (15 min) + write ops
getAllLocations | locations | location:all | TTL (15 min) + write ops
getAllLocations(pageable) | locations | location:page:{n}:{size} | TTL (15 min) + write ops
searchLocationsByName | None | - | N/A (search results vary)
createLocation | - | allEntries evict | On write
updateLocation | - | allEntries evict | On write
deleteLocation | - | allEntries evict | On write

Search results aren't cached because the search query is unpredictable — there are too many possible name fragments to cache effectively.


🏎️ Why Caffeine?

Caffeine is the de facto standard for in-process Java caching. Here's why the Weather Microservice uses it:

Feature | Caffeine | ConcurrentHashMap | Guava Cache
Eviction policy | Window TinyLfu | None | LRU
Hit rate | Near-optimal | N/A | Good
Thread safety | Yes (mostly non-blocking) | Yes | Yes
Statistics | Built-in | No | Built-in
Async loading | Yes | No | No
Spring integration | First-class | Manual | Limited

Caffeine's Window TinyLfu eviction policy consistently achieves near-optimal hit rates in benchmarks. It combines recency (LRU) and frequency (LFU) information to make better eviction decisions than either alone.
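For contrast, here's what recency alone looks like. This is a minimal LRU sketch built on the JDK's LinkedHashMap; it is a stand-in for illustration, not Caffeine's algorithm, and LruCache is a hypothetical name.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class LruCache<K, V> extends LinkedHashMap<K, V> {

    private final int maxEntries;

    LruCache(int maxEntries) {
        super(16, 0.75f, true); // accessOrder = true: iteration runs least to most recently used
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called after every put; returning true drops the least recently used entry.
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        LruCache<String, String> cache = new LruCache<>(2);
        cache.put("hot", "London");
        for (int i = 0; i < 100; i++) {
            cache.get("hot"); // read constantly
        }
        // Two one-off keys arrive back to back.
        cache.put("scan1", "x");
        cache.put("scan2", "y");
        System.out.println(cache.containsKey("hot")); // false
    }
}
```

A hundred reads didn't save the hot entry from two one-off keys. Recency alone has no memory of how often something was used, which is the gap that frequency information fills.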

The recordStats() Call

Caffeine.newBuilder()
    .recordStats()  // ← This enables monitoring
    .build()

With recordStats(), Caffeine tracks:

  • Hit count — How many requests were served from cache
  • Miss count — How many requests went to the database/API
  • Eviction count — How many entries were evicted
  • Load time — How long cache misses take to resolve
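The number most dashboards start with is the hit rate, which is just hits divided by total requests. The sketch below computes it by hand using the email-blast scenario from the introduction; HitRateDemo is a hypothetical name, and treating "no requests yet" as 1.0 is a convention chosen for this sketch.

```java
final class HitRateDemo {

    /** Fraction of requests served from cache; 1.0 when there have been no requests. */
    static double hitRate(long hits, long misses) {
        long requests = hits + misses;
        return requests == 0 ? 1.0 : (double) hits / requests;
    }

    public static void main(String[] args) {
        // 10,000 London lookups inside one TTL window: the first misses, the rest hit.
        long misses = 1;
        long hits = 9_999;
        System.out.println(hitRate(hits, misses));
    }
}
```

In that scenario the external provider sees one call instead of 10,000. A hit rate that sits far below what you'd expect is usually a sign that the TTLs or the key design need another look.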

These stats integrate with Micrometer (Part 10) for Prometheus/Grafana dashboards.


✅ Caching Checklist

  • [ ] Separate cache regions for data with different staleness tolerances
  • [ ] TTLs match business requirements — Current weather (5min), forecasts (1hr), locations (15min)
  • [ ] Size-bounded caches: maximumSize prevents memory exhaustion
  • [ ] expireAfterWrite for time-sensitive data (not expireAfterAccess)
  • [ ] SpEL cache keys follow type:byField:value convention
  • [ ] unless = "#result == null" prevents caching failed lookups
  • [ ] @CacheEvictingOperation combines @Transactional + @CacheEvict
  • [ ] allEntries = true for write operations that affect collections/pagination
  • [ ] beforeInvocation = false to preserve cache on transaction failure
  • [ ] recordStats() enabled for monitoring cache effectiveness
  • [ ] Search results NOT cached — too many key combinations
  • [ ] Historical data NOT cached — already fresh from indexed database queries

🎓 Conclusion: Cache Smart, Not Hard

Caching done well is invisible to users and transformative for performance. Done poorly, it serves stale data and creates debugging nightmares. The principles that keep it on the right side:

  1. The TEMPO framework (TTL-based expiration, Eviction strategy, Multiple regions, Pattern-based keys, Operation-aware invalidation) guides caching decisions
  2. Three separate cache regions with independent TTLs match the staleness tolerance of each data type
  3. Caffeine provides near-optimal hit rates with Window TinyLfu eviction and built-in statistics
  4. SpEL cache keys ('weather:byName:' + #locationName) create predictable, debuggable key patterns
  5. @CacheEvictingOperation is a custom meta-annotation combining @Transactional and @CacheEvict via @AliasFor
  6. allEntries = true is the safe choice for write operations that affect paginated caches
  7. Not everything should be cached — searches and historical data are better served directly
  8. recordStats() enables monitoring so you can measure cache effectiveness in production

Caching is one of those things that's easy to add and hard to get right. The Weather Microservice takes a measured approach — cache what benefits from it, invalidate correctly, and monitor constantly.

Coming Next Week:
Part 7: Guarding the Gates - Security Fundamentals for Microservices 🔒


📚 Series Progress

✅ Part 1: The Blueprint Before the Build
✅ Part 2: Spring Boot Alchemy
✅ Part 3: REST Assured
✅ Part 4: The Data Foundation
✅ Part 5: When the World Breaks
✅ Part 6: Cache Me If You Can ← You just finished this!
⬜ Part 7: Guarding the Gates
⬜ Part 8: Fail Gracefully
⬜ Part 9: 10,000 Threads and a Dream
⬜ Part 10: Can You See Me Now?
⬜ Part 11: Trust, But Verify
⬜ Part 12: Ship It
⬜ Part 13: To Production and Beyond


Happy coding, and remember — the fastest request is the one you never make.


Robert Marcel Saveanu

Software engineer with 15 years in testing, architecture, and the art of surviving corporate dysfunction. Writing about code, quality, and the humans behind both.
