📚 Series Navigation:
← Previous: Part 3 - REST Assured
👉 You are here: Part 4 - The Data Foundation
Next: Part 5 - When the World Breaks →
📋 Introduction
Every microservice eventually needs to remember things. Weather data that was fetched an hour ago. The list of locations your users track. Forecast records spanning the last two weeks. Without persistence, your service has the memory of a goldfish — brilliant in the moment, useless five seconds later.
But getting the data layer right is harder than it looks. You need to design entities that map cleanly to your domain, manage schema changes without downtime, handle relationships efficiently, and keep transactions correct under concurrency. One wrong annotation and you're looking at N+1 query problems. One missing migration and your production database is out of sync.
In this article, we'll explore how the Weather Microservice builds its data foundation. We'll cover JPA entity design, Flyway migrations, repository patterns, and transaction management — everything you need to build a data layer that's both powerful and predictable. ☕
🔨 The FORGE Framework: Five Pillars of Data Layer Design
Meet FORGE — five principles for building solid data foundations:
| Letter | Principle | What It Means |
|---|---|---|
| F | Flyway Evolution | Schema changes through versioned, immutable migrations |
| O | ORM Mapping | Entities map cleanly to tables with explicit annotations |
| R | Repository Abstractions | Data access through Spring Data interfaces |
| G | Guarded Entities | Entities protect their invariants, use audit trails |
| E | Entity Relationships | Relationships are explicit, fetch strategies intentional |
🏛️ Entity Design: The Location Entity
Let's start with the Location entity — the foundation of the Weather Microservice's data model:
@Entity
@Table(
name = "locations",
indexes = {
@Index(name = "idx_location_name", columnList = "name"),
@Index(name = "idx_location_country", columnList = "country")
},
uniqueConstraints = {
@UniqueConstraint(
name = "uk_location_name_country",
columnNames = {"name", "country"})
})
@Getter
@NoArgsConstructor
@AllArgsConstructor
@Builder
@ToString
@EqualsAndHashCode(of = "id", callSuper = false)
@Auditable
@EntityListeners(AuditableEntityListener.class)
public class Location extends BaseAuditableEntity {
@Id
@GeneratedValue(strategy = GenerationType.IDENTITY)
private Long id;
@Column(nullable = false, length = 100)
private String name;
@Column(nullable = false, length = 100)
private String country;
@Column(nullable = false)
private Double latitude;
@Column(nullable = false)
private Double longitude;
@Column(name = "region", length = 100)
private String region;
}
There's a lot packed into this entity, so we'll go through it piece by piece.
Table-Level Annotations
Indexes speed up the most common queries:
@Index(name = "idx_location_name", columnList = "name")
@Index(name = "idx_location_country", columnList = "country")
These correspond to the queries findByNameContaining() and findByCountry() in the repository. Without indexes, these queries would scan the entire table.
Unique Constraint prevents duplicate locations:
@UniqueConstraint(
name = "uk_location_name_country",
columnNames = {"name", "country"})
"London, UK" and "London, Canada" are different locations. The composite unique constraint on (name, country) allows this while preventing true duplicates.
💡 Pro Tip: Always name your constraints explicitly. uk_location_name_country tells you exactly what it protects. Auto-generated names like UK_4xf3a2b are debugging nightmares.
No Setter — By Design
Notice the comment: // No @Setter - updates go through LocationMapper for controlled modifications. The entity uses @Getter but not @Setter. Updates happen through the mapper layer, which controls what fields can change and validates the new values.
This is defensive entity design. By removing setters, you prevent code like:
// This is impossible without setters — and that's the point
location.setLatitude(999.0); // ← Won't compile!
Instead, updates go through the proper channel:
// LocationMapper handles the update with validation
Location updatedLocation = locationMapper.updateEntityFromRequest(location, request);
Equality by ID Only
@EqualsAndHashCode(of = "id", callSuper = false)
Two Location objects are equal if they have the same id. This is critical for JPA because:
- Entities in
Setcollections need consistent equality - Hibernate's dirty checking relies on proper
equals()/hashCode() - Business fields (name, latitude) can change — ID can't
ToString Exclusions
The @ToString annotation doesn't exclude anything on Location, but look at WeatherRecord:
@ToString(exclude = {"location"})
This prevents infinite recursion: if Location's toString() includes WeatherRecords, and WeatherRecord's toString() includes Location, you get a StackOverflowError in your logs.
📜 The Auditable Base Entity
Every entity in the Weather Microservice extends BaseAuditableEntity:
@MappedSuperclass
@Getter
public abstract class BaseAuditableEntity {
@Column(name = "created_at", nullable = false, updatable = false)
private LocalDateTime createdAt;
@Column(name = "updated_at", nullable = false)
private LocalDateTime updatedAt;
public void setCreatedAt(LocalDateTime createdAt) {
this.createdAt = createdAt;
}
public void setUpdatedAt(LocalDateTime updatedAt) {
this.updatedAt = updatedAt;
}
}
Key annotations:
@MappedSuperclass— This class isn't an entity itself, but its fields are inherited by child entitiesupdatable = falseoncreated_at— Once set, creation timestamp can never change- Setters only for the listener — The
AuditableEntityListeneruses these to set timestamps
The @Auditable annotation and AuditableEntityListener automatically set createdAt on persist and updatedAt on every save. No manual timestamp management needed.
🔗 Entity Relationships: WeatherRecord
The WeatherRecord entity demonstrates a well-designed JPA relationship:
@Entity
@Table(
name = "weather_records",
indexes = {
@Index(name = "idx_weather_location_id", columnList = "location_id"),
@Index(name = "idx_weather_timestamp", columnList = "timestamp")
})
@Getter
@NoArgsConstructor
@AllArgsConstructor
@Builder
@ToString(exclude = {"location"})
@EqualsAndHashCode(of = "id", callSuper = false)
@Auditable
@EntityListeners(AuditableEntityListener.class)
public class WeatherRecord extends BaseAuditableEntity {
@Id
@GeneratedValue(strategy = GenerationType.IDENTITY)
private Long id;
@ManyToOne(fetch = FetchType.EAGER)
@JoinColumn(name = "location_id", nullable = false)
private Location location;
@Column(nullable = false)
private Double temperature;
@Column(name = "feels_like")
private Double feelsLike;
@Column(nullable = false)
private Integer humidity;
@Column(name = "wind_speed", nullable = false)
private Double windSpeed;
@Column(name = "wind_direction")
private String windDirection;
@Column(nullable = false, length = 100)
private String condition;
@Column(nullable = false)
private LocalDateTime timestamp;
@PrePersist
protected void onCreate() {
if (timestamp == null) {
timestamp = LocalDateTime.now();
}
}
}
The EAGER Fetch Decision
@ManyToOne(fetch = FetchType.EAGER)
@JoinColumn(name = "location_id", nullable = false)
private Location location;
The javadoc explains this deliberate choice: weather records are almost always displayed with their location name. Using EAGER prevents N+1 queries when loading lists of weather records.
| Fetch Type | When | Trade-off |
|---|---|---|
| EAGER | Always loaded with parent | More data per query, no N+1 |
| LAZY | Loaded on first access | Less data per query, risk of N+1 |
For this relationship, EAGER is correct because:
- The
WeatherMapper.toDto()always accessesweatherRecord.getLocation().getName() - Location is a small entity (6 fields)
- Loading weather records without their location is never useful
🔥 Critical Insight: EAGER is a permanent decision — you can't make it lazy per-query. Choose LAZY as the default for most relationships, and only use EAGER when you're sure the related entity is always needed.
The @PrePersist Safety Net
@PrePersist
protected void onCreate() {
if (timestamp == null) {
timestamp = LocalDateTime.now();
}
}
This JPA lifecycle callback ensures the timestamp is never null when saving. Normally, the timestamp comes from the API response. But if something goes wrong in the mapping pipeline, this callback prevents a database constraint violation.
It's defensive coding for the data layer — a safety net that shouldn't be needed but prevents catastrophic failures when edge cases arise.
📦 Flyway Migrations: Schema as Code
The Weather Microservice uses Flyway for all database schema changes. Here's the initial migration:
-- V1__Initial_Schema.sql
-- Locations table
CREATE TABLE locations (
id BIGINT AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(100) NOT NULL,
country VARCHAR(100) NOT NULL,
latitude DOUBLE NOT NULL,
longitude DOUBLE NOT NULL,
region VARCHAR(100),
created_at TIMESTAMP NOT NULL,
updated_at TIMESTAMP NOT NULL
);
CREATE INDEX idx_location_name ON locations(name);
CREATE INDEX idx_location_country ON locations(country);
-- Weather records table
CREATE TABLE weather_records (
id BIGINT AUTO_INCREMENT PRIMARY KEY,
location_id BIGINT NOT NULL,
temperature DOUBLE NOT NULL,
feels_like DOUBLE,
humidity INT NOT NULL,
wind_speed DOUBLE NOT NULL,
wind_direction VARCHAR(10),
condition VARCHAR(100) NOT NULL,
description VARCHAR(500),
pressure_mb DOUBLE,
precipitation_mm DOUBLE,
cloud_coverage INT,
uv_index DOUBLE,
timestamp TIMESTAMP NOT NULL,
created_at TIMESTAMP NOT NULL,
CONSTRAINT fk_weather_location
FOREIGN KEY (location_id) REFERENCES locations(id) ON DELETE CASCADE
);
CREATE INDEX idx_weather_location_id ON weather_records(location_id);
CREATE INDEX idx_weather_timestamp ON weather_records(timestamp);
-- Forecast records table
CREATE TABLE forecast_records (
id BIGINT AUTO_INCREMENT PRIMARY KEY,
location_id BIGINT NOT NULL,
forecast_date DATE NOT NULL,
max_temperature DOUBLE NOT NULL,
min_temperature DOUBLE NOT NULL,
avg_temperature DOUBLE,
max_wind_speed DOUBLE,
avg_humidity INT,
condition VARCHAR(100) NOT NULL,
description VARCHAR(500),
precipitation_mm DOUBLE,
precipitation_probability INT,
uv_index DOUBLE,
sunrise_time VARCHAR(10),
sunset_time VARCHAR(10),
created_at TIMESTAMP NOT NULL,
CONSTRAINT fk_forecast_location
FOREIGN KEY (location_id) REFERENCES locations(id) ON DELETE CASCADE
);
CREATE INDEX idx_forecast_location_id ON forecast_records(location_id);
CREATE INDEX idx_forecast_date ON forecast_records(forecast_date);
Flyway Migration Rules
| Rule | Example | Why |
|---|---|---|
| Version prefix | V1__, V2__ |
Ensures execution order |
| Descriptive name | Initial_Schema |
Documents what the migration does |
| Never modify applied migrations | — | Checksums prevent tampering |
| Forward-only | Add columns, don't remove | Backward compatibility |
| Indexes in migrations | CREATE INDEX |
Not in entity annotations alone |
Flyway Configuration
spring:
flyway:
enabled: true
baseline-on-migrate: true
baseline-version: 0
locations: classpath:db/migration
validate-on-migrate: true
baseline-on-migrate: true— Works with existing databases by creating a baselinevalidate-on-migrate: true— Verifies migration checksums haven't been tampered with
💡 Pro Tip: Why does the migration match the JPA entity annotations? Because Hibernate's ddl-auto: validate checks at startup. If your migration creates a VARCHAR(100) column but your entity declares @Column(length = 50), Hibernate will fail with a schema validation error. The migration and entities must agree.
🏪 Repository Design: Spring Data JPA
The Weather Microservice uses Spring Data JPA repositories — interfaces that Spring implements at runtime:
@Repository
public interface LocationRepository extends JpaRepository<Location, Long> {
// Derived query - Spring generates SQL from method name
Optional<Location> findByNameAndCountry(String name, String country);
// Derived query with collection return
List<Location> findByCountry(String country);
// Overloaded for pagination
Page<Location> findByCountry(String country, Pageable pageable);
// Custom JPQL query for case-insensitive search
@Query("SELECT l FROM Location l WHERE LOWER(l.name) LIKE LOWER(CONCAT('%', :name, '%'))")
List<Location> findByNameContaining(@Param("name") String name);
// Same query with pagination
@Query("SELECT l FROM Location l WHERE LOWER(l.name) LIKE LOWER(CONCAT('%', :name, '%'))")
Page<Location> findByNameContaining(@Param("name") String name, Pageable pageable);
// Existence check - faster than loading the entity
boolean existsByNameAndCountry(String name, String country);
}
Derived Queries vs. JPQL
Derived queries (Spring generates SQL from method names):
Optional<Location> findByNameAndCountry(String name, String country);
// → SELECT * FROM locations WHERE name = ? AND country = ?
JPQL queries (explicit SQL-like syntax):
@Query("SELECT l FROM Location l WHERE LOWER(l.name) LIKE LOWER(CONCAT('%', :name, '%'))")
List<Location> findByNameContaining(@Param("name") String name);
When to use each:
| Approach | Use When | Example |
|---|---|---|
| Derived query | Simple conditions | findByNameAndCountry |
@Query JPQL |
Complex logic, functions | Case-insensitive LIKE |
| Native SQL | Database-specific features | @Query(nativeQuery = true) |
The existsBy Pattern
boolean existsByNameAndCountry(String name, String country);
This generates SELECT COUNT(*) > 0 instead of loading the full entity. For checks like "does this location already exist?", it's significantly more efficient than findByNameAndCountry(...).isPresent().
🔄 Transaction Management
The Weather Microservice uses transactions strategically, not as a blanket default:
Read-Only Transactions
@Transactional(readOnly = true)
public Page<WeatherDto> getWeatherHistory(Long locationId, Pageable pageable) {
locationService.getLocationEntityById(locationId);
return weatherRecordRepository.findByLocationId(locationId, pageable)
.map(weatherMapper::toDto);
}
Setting readOnly = true tells Hibernate to:
- Skip dirty checking (performance boost)
- Use read-only database connections (from connection pool)
- Potentially route to read replicas in distributed setups
Standard Transactions
@Transactional(propagation = Propagation.REQUIRED)
public WeatherDto getCurrentWeather(String locationName, boolean saveToDatabase) {
WeatherApiResponse apiResponse = weatherApiClient.getCurrentWeather(locationName);
if (saveToDatabase) {
saveWeatherRecord(apiResponse);
}
return weatherMapper.toDtoFromApi(apiResponse);
}
The Propagation.REQUIRED setting is the default — join an existing transaction or create a new one. This ensures the API call, location creation, and weather record save all happen in the same transaction.
Isolated Transactions with REQUIRES_NEW
@Transactional(propagation = Propagation.REQUIRES_NEW)
public long deleteOldWeatherRecords(LocalDateTime cutoffDate) {
log.info("Deleting weather records before: {}", cutoffDate);
Long deletedCount = weatherRecordRepository.deleteByTimestampBefore(cutoffDate);
log.info("Successfully deleted {} weather records", deletedCount);
return deletedCount;
}
Using REQUIRES_NEW creates a fresh, independent transaction. The cleanup operation doesn't hold locks on records that concurrent weather fetches might need. If the cleanup fails, it doesn't roll back the caller's transaction.
| Propagation | Behavior | Use Case |
|---|---|---|
REQUIRED |
Join or create | Default — most operations |
REQUIRES_NEW |
Always create new | Independent cleanup, logging |
SUPPORTS |
Join if exists, else none | Optional transactional context |
Race Condition Handling
// LocationService.java
@Transactional(propagation = Propagation.REQUIRED)
public Location findOrCreateLocation(String name, String country,
Double latitude, Double longitude, String region) {
Optional<Location> existing = locationRepository.findByNameAndCountry(name, country);
if (existing.isPresent()) {
return existing.get();
}
try {
Location newLocation = Location.builder()
.name(name).country(country)
.latitude(latitude).longitude(longitude)
.region(region).build();
return locationRepository.save(newLocation);
} catch (DataIntegrityViolationException e) {
// Another thread created it concurrently — retry the query
log.debug("Concurrent location creation detected, retrying query");
return locationRepository.findByNameAndCountry(name, country)
.orElseThrow(() -> new IllegalStateException(
"Location creation failed and retry found nothing"));
}
}
This is the check-then-create pattern with a race condition safety net:
- Check — Does the location exist?
- Create — If not, try to save it
- Catch — If another thread created it simultaneously, the unique constraint throws
DataIntegrityViolationException - Retry — Query again; the record must exist now
This pattern is essential in concurrent microservices where multiple requests might try to create the same location simultaneously.
🗺️ The Data Model Overview
+-----------------------------------------+
| locations |
+-----------------------------------------+
| id BIGINT (PK, AUTO_INCREMENT) |
| name VARCHAR(100) NOT NULL |
| country VARCHAR(100) NOT NULL |
| latitude DOUBLE NOT NULL |
| longitude DOUBLE NOT NULL |
| region VARCHAR(100) |
| created_at TIMESTAMP NOT NULL |
| updated_at TIMESTAMP NOT NULL |
| UNIQUE(name, country) |
| INDEX(name), INDEX(country) |
+-------------+---------------------------+
| 1
|
| *
+-------------+---------------------------+
| weather_records |
+-----------------------------------------+
| id BIGINT (PK) |
| location_id BIGINT (FK) NOT NULL |
| temperature DOUBLE NOT NULL |
| feels_like DOUBLE |
| humidity INT NOT NULL |
| wind_speed DOUBLE NOT NULL |
| wind_direction VARCHAR(10) |
| condition VARCHAR(100) NOT NULL |
| timestamp TIMESTAMP NOT NULL |
| ON DELETE CASCADE |
| INDEX(location_id), INDEX(timestamp) |
+-----------------------------------------+
Key design decisions:
- CASCADE delete — When a location is deleted, all its weather records go too
- Indexes on foreign keys —
location_idis indexed for fast joins and lookups - Timestamp indexes — Support date range queries efficiently
- Nullable vs NOT NULL — Core fields (
temperature,humidity) are required; supplementary fields (feels_like,description) are optional
✅ Data Layer Checklist
- [ ] Entities use
@EqualsAndHashCode(of = "id")— Equality based on business identity - [ ] No
@Setteron entities — Updates through mappers for controlled modifications - [ ]
@ToString(exclude = ...)— Prevent circular reference in bidirectional relationships - [ ]
BaseAuditableEntity— AutomaticcreatedAt/updatedAttimestamps - [ ] Indexes on frequently queried columns — Defined in both
@Tableand migration scripts - [ ] Named constraints —
uk_location_name_country, not auto-generated names - [ ] Flyway migrations — All schema changes in versioned SQL files
- [ ]
ddl-auto: validate— Hibernate validates but never modifies schema - [ ] Derived queries for simple lookups,
@Queryfor complex operations - [ ]
existsByfor existence checks instead of loading entities - [ ]
readOnly = trueon read-only service methods - [ ]
REQUIRES_NEWfor independent cleanup operations - [ ] Race condition handling in
findOrCreateLocationwith retry on constraint violation
🎓 Conclusion: Data Is the Foundation
The data layer is where your application meets reality. Get it right and everything else flows smoothly:
- The FORGE framework (Flyway Evolution, ORM Mapping, Repository Abstractions, Guarded Entities, Entity Relationships) guides data layer design
- JPA entities use explicit annotations for indexes, constraints, and column mappings — no guessing
BaseAuditableEntitywith@MappedSuperclassprovides automatic audit timestamps for all entities- EAGER vs. LAZY fetch is a deliberate decision based on how entities are accessed
@PrePersistcallbacks provide safety nets for required fields- Flyway migrations are the single source of truth for database schema — immutable, versioned, validated
- Spring Data repositories provide powerful abstractions — derived queries, JPQL, pagination — through interfaces
- Transaction propagation (
REQUIRED,REQUIRES_NEW,readOnly) controls isolation and performance - Race condition handling with
DataIntegrityViolationExceptioncatch-and-retry protects concurrent operations
Your data layer is the foundation everything else builds on. Get it wrong, and every layer above it suffers. Get it right, and your service has a solid base to grow from.
Coming Next Week:
Part 5: When the World Breaks - External API Integration and Resilience Patterns 🛡️
📚 Series Progress
✅ Part 1: The Blueprint Before the Build
✅ Part 2: Spring Boot Alchemy
✅ Part 3: REST Assured
✅ Part 4: The Data Foundation ← You just finished this!
⬜ Part 5: When the World Breaks
⬜ Part 6: Cache Me If You Can
⬜ Part 7: Guarding the Gates
⬜ Part 8: Fail Gracefully
⬜ Part 9: 10,000 Threads and a Dream
⬜ Part 10: Can You See Me Now?
⬜ Part 11: Trust, But Verify
⬜ Part 12: Ship It
⬜ Part 13: To Production and Beyond
Happy coding, and remember — treat your database like a house foundation: measure twice, migrate once. ☕