JPA, Hibernate & Flyway in Spring Boot

📚 Series Navigation:
← Previous: Part 3 - REST Assured
👉 You are here: Part 4 - The Data Foundation
Next: Part 5 - When the World Breaks →

📋 Introduction

Every microservice eventually needs to remember things. Weather data that was fetched an hour ago. The list of locations your users track. Forecast records spanning the last two weeks. Without persistence, your service has the memory of a goldfish — brilliant in the moment, useless five seconds later.

But getting the data layer right is harder than it looks. You need to design entities that map cleanly to your domain, manage schema changes without downtime, handle relationships efficiently, and keep transactions correct under concurrency. One wrong annotation and you're looking at N+1 query problems. One missing migration and your production database is out of sync.

In this article, we'll explore how the Weather Microservice builds its data foundation. We'll cover JPA entity design, Flyway migrations, repository patterns, and transaction management — everything you need to build a data layer that's both powerful and predictable. ☕

🔨 The FORGE Framework: Five Pillars of Data Layer Design

Meet FORGE — five principles for building solid data foundations:

Letter	Principle	What It Means
F	Flyway Evolution	Schema changes through versioned, immutable migrations
O	ORM Mapping	Entities map cleanly to tables with explicit annotations
R	Repository Abstractions	Data access through Spring Data interfaces
G	Guarded Entities	Entities protect their invariants, use audit trails
E	Entity Relationships	Relationships are explicit, fetch strategies intentional

🏛️ Entity Design: The Location Entity

Let's start with the Location entity — the foundation of the Weather Microservice's data model:

@Entity
@Table(
    name = "locations",
    indexes = {
      @Index(name = "idx_location_name", columnList = "name"),
      @Index(name = "idx_location_country", columnList = "country")
    },
    uniqueConstraints = {
      @UniqueConstraint(
          name = "uk_location_name_country",
          columnNames = {"name", "country"})
    })
@Getter
@NoArgsConstructor
@AllArgsConstructor
@Builder
@ToString
@EqualsAndHashCode(of = "id", callSuper = false)
@Auditable
@EntityListeners(AuditableEntityListener.class)
public class Location extends BaseAuditableEntity {

  @Id
  @GeneratedValue(strategy = GenerationType.IDENTITY)
  private Long id;

  @Column(nullable = false, length = 100)
  private String name;

  @Column(nullable = false, length = 100)
  private String country;

  @Column(nullable = false)
  private Double latitude;

  @Column(nullable = false)
  private Double longitude;

  @Column(name = "region", length = 100)
  private String region;
}

There's a lot packed into this entity, so we'll go through it piece by piece.

Table-Level Annotations

Indexes speed up the most common queries:

@Index(name = "idx_location_name", columnList = "name")
@Index(name = "idx_location_country", columnList = "country")

These correspond to the queries findByNameContaining() and findByCountry() in the repository. Without indexes, these queries would scan the entire table.

Unique Constraint prevents duplicate locations:

@UniqueConstraint(
    name = "uk_location_name_country",
    columnNames = {"name", "country"})

"London, UK" and "London, Canada" are different locations. The composite unique constraint on (name, country) allows this while preventing true duplicates.

💡 Pro Tip: Always name your constraints explicitly. uk_location_name_country tells you exactly what it protects. Auto-generated names like UK_4xf3a2b are debugging nightmares.

No Setter — By Design

Notice the comment: // No @Setter - updates go through LocationMapper for controlled modifications. The entity uses @Getter but not @Setter. Updates happen through the mapper layer, which controls what fields can change and validates the new values.

This is defensive entity design. By removing setters, you prevent code like:

// This is impossible without setters — and that's the point
location.setLatitude(999.0);  // ← Won't compile!

Instead, updates go through the proper channel:

// LocationMapper handles the update with validation
Location updatedLocation = locationMapper.updateEntityFromRequest(location, request);

Equality by ID Only

@EqualsAndHashCode(of = "id", callSuper = false)

Two Location objects are equal if they have the same id. This is critical for JPA because:

Entities in Set collections need consistent equality
Hibernate's dirty checking relies on proper equals()/hashCode()
Business fields (name, latitude) can change — ID can't

ToString Exclusions

The @ToString annotation doesn't exclude anything on Location, but look at WeatherRecord:

@ToString(exclude = {"location"})

This prevents infinite recursion: if Location's toString() includes WeatherRecords, and WeatherRecord's toString() includes Location, you get a StackOverflowError in your logs.

📜 The Auditable Base Entity

Every entity in the Weather Microservice extends BaseAuditableEntity:

@MappedSuperclass
@Getter
public abstract class BaseAuditableEntity {

  @Column(name = "created_at", nullable = false, updatable = false)
  private LocalDateTime createdAt;

  @Column(name = "updated_at", nullable = false)
  private LocalDateTime updatedAt;

  public void setCreatedAt(LocalDateTime createdAt) {
    this.createdAt = createdAt;
  }

  public void setUpdatedAt(LocalDateTime updatedAt) {
    this.updatedAt = updatedAt;
  }
}

Key annotations:

@MappedSuperclass — This class isn't an entity itself, but its fields are inherited by child entities
updatable = false on created_at — Once set, creation timestamp can never change
Setters only for the listener — The AuditableEntityListener uses these to set timestamps

The @Auditable annotation and AuditableEntityListener automatically set createdAt on persist and updatedAt on every save. No manual timestamp management needed.

🔗 Entity Relationships: WeatherRecord

The WeatherRecord entity demonstrates a well-designed JPA relationship:

@Entity
@Table(
    name = "weather_records",
    indexes = {
      @Index(name = "idx_weather_location_id", columnList = "location_id"),
      @Index(name = "idx_weather_timestamp", columnList = "timestamp")
    })
@Getter
@NoArgsConstructor
@AllArgsConstructor
@Builder
@ToString(exclude = {"location"})
@EqualsAndHashCode(of = "id", callSuper = false)
@Auditable
@EntityListeners(AuditableEntityListener.class)
public class WeatherRecord extends BaseAuditableEntity {

  @Id
  @GeneratedValue(strategy = GenerationType.IDENTITY)
  private Long id;

  @ManyToOne(fetch = FetchType.EAGER)
  @JoinColumn(name = "location_id", nullable = false)
  private Location location;

  @Column(nullable = false)
  private Double temperature;

  @Column(name = "feels_like")
  private Double feelsLike;

  @Column(nullable = false)
  private Integer humidity;

  @Column(name = "wind_speed", nullable = false)
  private Double windSpeed;

  @Column(name = "wind_direction")
  private String windDirection;

  @Column(nullable = false, length = 100)
  private String condition;

  @Column(nullable = false)
  private LocalDateTime timestamp;

  @PrePersist
  protected void onCreate() {
    if (timestamp == null) {
      timestamp = LocalDateTime.now();
    }
  }
}

The EAGER Fetch Decision

@ManyToOne(fetch = FetchType.EAGER)
@JoinColumn(name = "location_id", nullable = false)
private Location location;

The javadoc explains this deliberate choice: weather records are almost always displayed with their location name. Using EAGER prevents N+1 queries when loading lists of weather records.

Fetch Type	When	Trade-off
EAGER	Always loaded with parent	More data per query, no N+1
LAZY	Loaded on first access	Less data per query, risk of N+1

For this relationship, EAGER is correct because:

The WeatherMapper.toDto() always accesses weatherRecord.getLocation().getName()
Location is a small entity (6 fields)
Loading weather records without their location is never useful

🔥 Critical Insight: EAGER is a permanent decision — you can't make it lazy per-query. Choose LAZY as the default for most relationships, and only use EAGER when you're sure the related entity is always needed.

The @PrePersist Safety Net

@PrePersist
protected void onCreate() {
  if (timestamp == null) {
    timestamp = LocalDateTime.now();
  }
}

This JPA lifecycle callback ensures the timestamp is never null when saving. Normally, the timestamp comes from the API response. But if something goes wrong in the mapping pipeline, this callback prevents a database constraint violation.

It's defensive coding for the data layer — a safety net that shouldn't be needed but prevents catastrophic failures when edge cases arise.

📦 Flyway Migrations: Schema as Code

The Weather Microservice uses Flyway for all database schema changes. Here's the initial migration:

-- V1__Initial_Schema.sql

-- Locations table
CREATE TABLE locations (
    id BIGINT AUTO_INCREMENT PRIMARY KEY,
    name VARCHAR(100) NOT NULL,
    country VARCHAR(100) NOT NULL,
    latitude DOUBLE NOT NULL,
    longitude DOUBLE NOT NULL,
    region VARCHAR(100),
    created_at TIMESTAMP NOT NULL,
    updated_at TIMESTAMP NOT NULL
);

CREATE INDEX idx_location_name ON locations(name);
CREATE INDEX idx_location_country ON locations(country);

-- Weather records table
CREATE TABLE weather_records (
    id BIGINT AUTO_INCREMENT PRIMARY KEY,
    location_id BIGINT NOT NULL,
    temperature DOUBLE NOT NULL,
    feels_like DOUBLE,
    humidity INT NOT NULL,
    wind_speed DOUBLE NOT NULL,
    wind_direction VARCHAR(10),
    condition VARCHAR(100) NOT NULL,
    description VARCHAR(500),
    pressure_mb DOUBLE,
    precipitation_mm DOUBLE,
    cloud_coverage INT,
    uv_index DOUBLE,
    timestamp TIMESTAMP NOT NULL,
    created_at TIMESTAMP NOT NULL,
    CONSTRAINT fk_weather_location
        FOREIGN KEY (location_id) REFERENCES locations(id) ON DELETE CASCADE
);

CREATE INDEX idx_weather_location_id ON weather_records(location_id);
CREATE INDEX idx_weather_timestamp ON weather_records(timestamp);

-- Forecast records table
CREATE TABLE forecast_records (
    id BIGINT AUTO_INCREMENT PRIMARY KEY,
    location_id BIGINT NOT NULL,
    forecast_date DATE NOT NULL,
    max_temperature DOUBLE NOT NULL,
    min_temperature DOUBLE NOT NULL,
    avg_temperature DOUBLE,
    max_wind_speed DOUBLE,
    avg_humidity INT,
    condition VARCHAR(100) NOT NULL,
    description VARCHAR(500),
    precipitation_mm DOUBLE,
    precipitation_probability INT,
    uv_index DOUBLE,
    sunrise_time VARCHAR(10),
    sunset_time VARCHAR(10),
    created_at TIMESTAMP NOT NULL,
    CONSTRAINT fk_forecast_location
        FOREIGN KEY (location_id) REFERENCES locations(id) ON DELETE CASCADE
);

CREATE INDEX idx_forecast_location_id ON forecast_records(location_id);
CREATE INDEX idx_forecast_date ON forecast_records(forecast_date);

Flyway Migration Rules

Rule	Example	Why
Version prefix	`V1__`, `V2__`	Ensures execution order
Descriptive name	`Initial_Schema`	Documents what the migration does
Never modify applied migrations	—	Checksums prevent tampering
Forward-only	Add columns, don't remove	Backward compatibility
Indexes in migrations	`CREATE INDEX`	Not in entity annotations alone

Flyway Configuration

spring:
  flyway:
    enabled: true
    baseline-on-migrate: true
    baseline-version: 0
    locations: classpath:db/migration
    validate-on-migrate: true

baseline-on-migrate: true — Works with existing databases by creating a baseline
validate-on-migrate: true — Verifies migration checksums haven't been tampered with

💡 Pro Tip: Why does the migration match the JPA entity annotations? Because Hibernate's ddl-auto: validate checks at startup. If your migration creates a VARCHAR(100) column but your entity declares @Column(length = 50), Hibernate will fail with a schema validation error. The migration and entities must agree.

🏪 Repository Design: Spring Data JPA

The Weather Microservice uses Spring Data JPA repositories — interfaces that Spring implements at runtime:

@Repository
public interface LocationRepository extends JpaRepository<Location, Long> {

  // Derived query - Spring generates SQL from method name
  Optional<Location> findByNameAndCountry(String name, String country);

  // Derived query with collection return
  List<Location> findByCountry(String country);

  // Overloaded for pagination
  Page<Location> findByCountry(String country, Pageable pageable);

  // Custom JPQL query for case-insensitive search
  @Query("SELECT l FROM Location l WHERE LOWER(l.name) LIKE LOWER(CONCAT('%', :name, '%'))")
  List<Location> findByNameContaining(@Param("name") String name);

  // Same query with pagination
  @Query("SELECT l FROM Location l WHERE LOWER(l.name) LIKE LOWER(CONCAT('%', :name, '%'))")
  Page<Location> findByNameContaining(@Param("name") String name, Pageable pageable);

  // Existence check - faster than loading the entity
  boolean existsByNameAndCountry(String name, String country);
}

Derived Queries vs. JPQL

Derived queries (Spring generates SQL from method names):

Optional<Location> findByNameAndCountry(String name, String country);
// → SELECT * FROM locations WHERE name = ? AND country = ?

JPQL queries (explicit SQL-like syntax):

@Query("SELECT l FROM Location l WHERE LOWER(l.name) LIKE LOWER(CONCAT('%', :name, '%'))")
List<Location> findByNameContaining(@Param("name") String name);

When to use each:

Approach	Use When	Example
Derived query	Simple conditions	`findByNameAndCountry`
`@Query` JPQL	Complex logic, functions	Case-insensitive LIKE
Native SQL	Database-specific features	`@Query(nativeQuery = true)`

The `existsBy` Pattern

boolean existsByNameAndCountry(String name, String country);

This generates SELECT COUNT(*) > 0 instead of loading the full entity. For checks like "does this location already exist?", it's significantly more efficient than findByNameAndCountry(...).isPresent().

🔄 Transaction Management

The Weather Microservice uses transactions strategically, not as a blanket default:

Read-Only Transactions

@Transactional(readOnly = true)
public Page<WeatherDto> getWeatherHistory(Long locationId, Pageable pageable) {
    locationService.getLocationEntityById(locationId);
    return weatherRecordRepository.findByLocationId(locationId, pageable)
        .map(weatherMapper::toDto);
}

Setting readOnly = true tells Hibernate to:

Skip dirty checking (performance boost)
Use read-only database connections (from connection pool)
Potentially route to read replicas in distributed setups

Standard Transactions

@Transactional(propagation = Propagation.REQUIRED)
public WeatherDto getCurrentWeather(String locationName, boolean saveToDatabase) {
    WeatherApiResponse apiResponse = weatherApiClient.getCurrentWeather(locationName);
    if (saveToDatabase) {
        saveWeatherRecord(apiResponse);
    }
    return weatherMapper.toDtoFromApi(apiResponse);
}

The Propagation.REQUIRED setting is the default — join an existing transaction or create a new one. This ensures the API call, location creation, and weather record save all happen in the same transaction.

Isolated Transactions with REQUIRES_NEW

@Transactional(propagation = Propagation.REQUIRES_NEW)
public long deleteOldWeatherRecords(LocalDateTime cutoffDate) {
    log.info("Deleting weather records before: {}", cutoffDate);
    Long deletedCount = weatherRecordRepository.deleteByTimestampBefore(cutoffDate);
    log.info("Successfully deleted {} weather records", deletedCount);
    return deletedCount;
}

Using REQUIRES_NEW creates a fresh, independent transaction. The cleanup operation doesn't hold locks on records that concurrent weather fetches might need. If the cleanup fails, it doesn't roll back the caller's transaction.

Propagation	Behavior	Use Case
`REQUIRED`	Join or create	Default — most operations
`REQUIRES_NEW`	Always create new	Independent cleanup, logging
`SUPPORTS`	Join if exists, else none	Optional transactional context

Race Condition Handling

// LocationService.java
@Transactional(propagation = Propagation.REQUIRED)
public Location findOrCreateLocation(String name, String country,
    Double latitude, Double longitude, String region) {

  Optional<Location> existing = locationRepository.findByNameAndCountry(name, country);
  if (existing.isPresent()) {
    return existing.get();
  }

  try {
    Location newLocation = Location.builder()
        .name(name).country(country)
        .latitude(latitude).longitude(longitude)
        .region(region).build();
    return locationRepository.save(newLocation);
  } catch (DataIntegrityViolationException e) {
    // Another thread created it concurrently — retry the query
    log.debug("Concurrent location creation detected, retrying query");
    return locationRepository.findByNameAndCountry(name, country)
        .orElseThrow(() -> new IllegalStateException(
            "Location creation failed and retry found nothing"));
  }
}

This is the check-then-create pattern with a race condition safety net:

Check — Does the location exist?
Create — If not, try to save it
Catch — If another thread created it simultaneously, the unique constraint throws DataIntegrityViolationException
Retry — Query again; the record must exist now

This pattern is essential in concurrent microservices where multiple requests might try to create the same location simultaneously.

🗺️ The Data Model Overview

+-----------------------------------------+
|              locations                  |
+-----------------------------------------+
| id          BIGINT (PK, AUTO_INCREMENT) |
| name        VARCHAR(100) NOT NULL       |
| country     VARCHAR(100) NOT NULL       |
| latitude    DOUBLE NOT NULL             |
| longitude   DOUBLE NOT NULL             |
| region      VARCHAR(100)                |
| created_at  TIMESTAMP NOT NULL          |
| updated_at  TIMESTAMP NOT NULL          |
| UNIQUE(name, country)                   |
| INDEX(name), INDEX(country)             |
+-------------+---------------------------+
              | 1
              |
              | *
+-------------+---------------------------+
|          weather_records                |
+-----------------------------------------+
| id              BIGINT (PK)             |
| location_id     BIGINT (FK) NOT NULL    |
| temperature     DOUBLE NOT NULL         |
| feels_like      DOUBLE                  |
| humidity        INT NOT NULL            |
| wind_speed      DOUBLE NOT NULL         |
| wind_direction  VARCHAR(10)             |
| condition       VARCHAR(100) NOT NULL   |
| timestamp       TIMESTAMP NOT NULL      |
| ON DELETE CASCADE                       |
| INDEX(location_id), INDEX(timestamp)    |
+-----------------------------------------+

Key design decisions:

CASCADE delete — When a location is deleted, all its weather records go too
Indexes on foreign keys — location_id is indexed for fast joins and lookups
Timestamp indexes — Support date range queries efficiently
Nullable vs NOT NULL — Core fields (temperature, humidity) are required; supplementary fields (feels_like, description) are optional

✅ Data Layer Checklist

[ ] Entities use @EqualsAndHashCode(of = "id") — Equality based on business identity
[ ] No @Setter on entities — Updates through mappers for controlled modifications
[ ] @ToString(exclude = ...) — Prevent circular reference in bidirectional relationships
[ ] BaseAuditableEntity — Automatic createdAt/updatedAt timestamps
[ ] Indexes on frequently queried columns — Defined in both @Table and migration scripts
[ ] Named constraints — uk_location_name_country, not auto-generated names
[ ] Flyway migrations — All schema changes in versioned SQL files
[ ] ddl-auto: validate — Hibernate validates but never modifies schema
[ ] Derived queries for simple lookups, @Query for complex operations
[ ] existsBy for existence checks instead of loading entities
[ ] readOnly = true on read-only service methods
[ ] REQUIRES_NEW for independent cleanup operations
[ ] Race condition handling in findOrCreateLocation with retry on constraint violation

🎓 Conclusion: Data Is the Foundation

The data layer is where your application meets reality. Get it right and everything else flows smoothly:

The FORGE framework (Flyway Evolution, ORM Mapping, Repository Abstractions, Guarded Entities, Entity Relationships) guides data layer design
JPA entities use explicit annotations for indexes, constraints, and column mappings — no guessing
BaseAuditableEntity with @MappedSuperclass provides automatic audit timestamps for all entities
EAGER vs. LAZY fetch is a deliberate decision based on how entities are accessed
@PrePersist callbacks provide safety nets for required fields
Flyway migrations are the single source of truth for database schema — immutable, versioned, validated
Spring Data repositories provide powerful abstractions — derived queries, JPQL, pagination — through interfaces
Transaction propagation (REQUIRED, REQUIRES_NEW, readOnly) controls isolation and performance
Race condition handling with DataIntegrityViolationException catch-and-retry protects concurrent operations

Your data layer is the foundation everything else builds on. Get it wrong, and every layer above it suffers. Get it right, and your service has a solid base to grow from.

Coming Next Week:
Part 5: When the World Breaks - External API Integration and Resilience Patterns 🛡️

📚 Series Progress

✅ Part 1: The Blueprint Before the Build
✅ Part 2: Spring Boot Alchemy
✅ Part 3: REST Assured
✅ Part 4: The Data Foundation ← You just finished this!
⬜ Part 5: When the World Breaks
⬜ Part 6: Cache Me If You Can
⬜ Part 7: Guarding the Gates
⬜ Part 8: Fail Gracefully
⬜ Part 9: 10,000 Threads and a Dream
⬜ Part 10: Can You See Me Now?
⬜ Part 11: Trust, But Verify
⬜ Part 12: Ship It
⬜ Part 13: To Production and Beyond

Happy coding, and remember — treat your database like a house foundation: measure twice, migrate once. ☕

🗄️ The Data Foundation: JPA, Hibernate, and the Database Migration Playbook

📋 Introduction

🔨 The FORGE Framework: Five Pillars of Data Layer Design

🏛️ Entity Design: The Location Entity

Table-Level Annotations

No Setter — By Design

Equality by ID Only

ToString Exclusions

📜 The Auditable Base Entity

🔗 Entity Relationships: WeatherRecord

The EAGER Fetch Decision

The @PrePersist Safety Net

📦 Flyway Migrations: Schema as Code

Flyway Migration Rules

Flyway Configuration

🏪 Repository Design: Spring Data JPA

Derived Queries vs. JPQL

The `existsBy` Pattern

🔄 Transaction Management

Read-Only Transactions

Standard Transactions

Isolated Transactions with REQUIRES_NEW

Race Condition Handling

🗺️ The Data Model Overview

✅ Data Layer Checklist

🎓 Conclusion: Data Is the Foundation

Robert Marcel Saveanu

Read next

🗄️ The Data Foundation: JPA, Hibernate, and the Database Migration Playbook

📋 Introduction

🔨 The FORGE Framework: Five Pillars of Data Layer Design

🏛️ Entity Design: The Location Entity

Table-Level Annotations

No Setter — By Design

Equality by ID Only

ToString Exclusions

📜 The Auditable Base Entity

🔗 Entity Relationships: WeatherRecord

The EAGER Fetch Decision

The @PrePersist Safety Net

📦 Flyway Migrations: Schema as Code

Flyway Migration Rules

Flyway Configuration

🏪 Repository Design: Spring Data JPA

Derived Queries vs. JPQL

The existsBy Pattern

🔄 Transaction Management

Read-Only Transactions

Standard Transactions

Isolated Transactions with REQUIRES_NEW

Race Condition Handling

🗺️ The Data Model Overview

✅ Data Layer Checklist

🎓 Conclusion: Data Is the Foundation

Robert Marcel Saveanu

Read next

🌐 REST Assured: Designing APIs Developers Actually Want to Use

⚙️ Spring Boot Alchemy: Turning Configuration into a Running Service

🏗️ The Blueprint Before the Build: Thinking in Layers

The `existsBy` Pattern