Introduction: Why MongoDB with Spring Boot?
MongoDB is an open-source cross-platform document-oriented database. Belonging to the family of NoSQL database solutions, it provides impressive scalability and flexibility to data-driven applications dealing with real-time analytics, IoT, operational intelligence, or e-commerce to name a few.
Spring developers can benefit from integrating MongoDB into their projects without leaving the familiar grounds of the framework. The secret ingredient is Spring Data MongoDB, which combines the powerful features of MongoDB and the Spring-based programming model. In addition, Spring Data MongoDB abstracts away boilerplate code and integrates smoothly with Spring Boot auto-configuration.
In this guide, I will walk you through setting up MongoDB with Spring Boot, using a variety of its features, from indexes to aggregation. I will also demonstrate how to integrate Mongock for reliable database migrations.
This guide is beginner-friendly but also includes some advanced topics, so, feel free to navigate to the section you are most interested in.
The code used in the tutorial is available on GitHub.
Table of Contents
- Setting Up MongoDB with Spring Boot
- Defining Your Data Models
- Creating Mongo Repositories
- Basic CRUD Operations with MongoDB
- Using MongoTemplate and @Query for Custom Logic
- Projections and DTOs
- Aggregations with Spring Data MongoDB
- MongoDB Migrations with Mongock
- Testing MongoDB Applications with @DataMongoTest and Testcontainers
- MongoDB Alternatives
- Conclusion
Setting Up MongoDB with Spring Boot
The application we are building is called NeuroWatch. It is a cyberpunk-themed app that collects data on civilians and the cyberware they have implanted, as well as live reports sent by the implants, and aggregates that data to monitor implant health over time.
So much more fun than typical Student, User, and Author entities, right?
Prerequisites:
- Spring Boot 3+
- JDK 24, or at least JDK 17, the minimum supported by Spring Boot 3.x. I’m using Liberica JDK, the distribution recommended by Spring.
- Your favorite IDE
- Docker and Docker Compose
First, let’s create a new Spring Boot project. Head over to Spring Initializr and select three dependencies: Docker Compose, Spring Data MongoDB, and Testcontainers.
Creating a new project
Hit ‘Generate’ and open it in the IDE.
Go to the main application class and add the @EnableMongoRepositories annotation.
@SpringBootApplication
@EnableMongoRepositories
public class MongodbDemoApp {
public static void main(String[] args) {
SpringApplication.run(MongodbDemoApp.class, args);
}
}
The next essential step is to configure the MongoDB connection.
You can either run a MongoDB server locally or in a container. I will spin up a MongoDB container using Docker Compose. Here’s the compose.yml file:
services:
  mongodb:
    image: mongo:latest
    container_name: mongodb
    restart: unless-stopped
    environment:
      - MONGO_INITDB_DATABASE=neurowatch
    networks:
      - neurowatch_net
    ports:
      - "27017:27017"
    volumes:
      - mongo_data:/data/db
    command: "mongod --quiet --logpath /dev/null"
    healthcheck:
      test: [ "CMD", "mongosh", "--eval", "db.adminCommand('ping')" ]
      interval: 10s
      timeout: 5s
      retries: 5
      start_period: 10s

volumes:
  mongo_data:

networks:
  neurowatch_net:
Let’s briefly see what is going on here:
- We specify the image we want to use (mongo:latest).
- restart: unless-stopped restarts the container automatically unless it was explicitly stopped by the user.
- We specify the database name that Mongo should create upon the first start (MONGO_INITDB_DATABASE=neurowatch).
- We provide the connection details such as the network and port.
- The healthcheck part helps to verify that the container is functioning properly.
- In the volumes part, we map a named volume to the container’s data directory so that your data is preserved when you stop and restart the container.
Next, specify the connection details in the application.properties file:
spring.data.mongodb.host=localhost
spring.data.mongodb.port=27017
spring.data.mongodb.database=neurowatch
spring.data.mongodb.auto-index-creation=true
Here, we specify the host, port, and database name. By default, user and password are not required. We also enable automatic index creation (more on indexes below).
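If you prefer a single connection string instead of separate properties, the same configuration can also be expressed with the spring.data.mongodb.uri property (an equivalent alternative, not an extra requirement):

spring.data.mongodb.uri=mongodb://localhost:27017/neurowatch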
Before running the application, you need to start the mongodb container:
docker-compose up -d
That’s it, we are all set up for writing the actual application!
Defining Your Data Models
SQL vs NoSQL approach
MongoDB is a NoSQL, document-based database, which means that its approach to storing and retrieving data is different from that of relational databases.
With traditional relational databases, we need to create a database schema beforehand and map the relationships between entities using foreign keys and joins. The data is stored in tables as rows and columns.
MongoDB stores entities in JSON-like documents. It lets developers nest arrays and sub-documents without first declaring a rigid schema. Related information is stored together in one document and can be indexed for convenient and rapid access. There’s no need to predefine a database schema; MongoDB takes care of it. At the same time, data can be efficiently queried, sorted, aggregated, and filtered thanks to the MongoDB Query API.
Keeping that in mind, let’s see how to create entities MongoDB-style!
Basic Modelling
Let’s start with basic modelling. We will have two collections: civilians, mapped to the Civilian model, and implant_logs, mapped to the ImplantMonitoringLog model. A collection in MongoDB is used to store data and is similar to a table in a relational database.
The Implant class doesn’t need a separate collection: implants will be stored inside the Civilian documents.
Let’s define a Civilian class annotated with @Document. This annotation tells Spring Data that the class represents a MongoDB document.
A MongoDB document is a self-contained “record” written in a JSON-like syntax that supports more data types than plain JSON. A document can hold any set of key-value pairs: numbers, text, dates, arrays, or even nested documents.
You can define the name of the collection in the annotation if it differs from the name of the class:
@Document(collection = "civilians")
public class Civilian {
@Id
private String id;
private String legalName;
private String nationalId;
private LocalDate birthDate;
private boolean criminalRecord;
private boolean underSurveillance;
private final LocalDateTime registeredInSystemAt;
private List<Implant> implants = new ArrayList<>();
// getters, setters, etc. are omitted for brevity
}
For now, we have only one annotated field. The @Id annotation specifies the primary key. In MongoDB, the id should be of type String, ObjectId, or UUID.
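To make this more tangible, here is roughly how a stored Civilian could look in the civilians collection (the values are illustrative, dates are shown simplified, and MongoDB generates the _id automatically if you don’t set one):

{
  "_id": "665f1c2ab7e4a12d9c0b1a77",
  "legalName": "Aarav Das",
  "nationalId": "Ni-96751543-BP",
  "birthDate": "1965-05-02",
  "criminalRecord": true,
  "underSurveillance": false,
  "registeredInSystemAt": "2025-03-21T10:15:30",
  "implants": []
}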
If you want to define a custom name for a field, you can use the @Field annotation:
@Field(name = "registered_at")
private final LocalDateTime registeredInSystemAt;
Now, let’s create the Implant class:
public class Implant {
private String type;
private String model;
private String version;
private String manufacturer;
private String serialNumber;
private int lotNumber;
private LocalDate installedAt;
private final LocalDateTime registeredInSystemAt;
// getters, setters, etc. are omitted for brevity
}
As you can see, it doesn’t have a separate collection in the database.
You can also embed documents into other documents if required. Imagine we made Implant a separate Document for some reason and gave it an id and a collection. Then, you would just add the list of implants to Civilian:
@Document
public class Implant {
@Id
private String id;
}
@Document
public class Civilian {
@Id
private String id;
private List<Implant> implants;
}
Instead of embedding the document, you can reference it using the @DBRef annotation. Such a reference will be eagerly resolved:
@Document
public class Implant {
@Id
private String id;
}
@Document
public class Civilian {
@Id
private String id;
@DBRef
private List<Implant> implants;
}
You can also use @DocumentReference instead of @DBRef to define more flexibly which fields of the referenced document should be loaded (the id field is referenced by default):
@Document
public class Implant {
@Id
private String id;
}
@Document
public class Civilian {
@Id
private String id;
@DocumentReference
private List<Implant> implants;
}
Either way, you should be cautious when embedding documents or referencing them. Updating deeply nested documents requires rewriting the entire document. In addition, eagerly fetched documents with @DBRef can impact application performance, whereas lazy loading with @DBRef may complicate debugging.
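For reference, lazy resolution is opted into through the annotation’s lazy attribute; a minimal sketch:

@Document
public class Civilian {
    @Id
    private String id;

    // Resolved on first access instead of eagerly when the Civilian is loaded
    @DBRef(lazy = true)
    private List<Implant> implants;
}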
Advanced Annotations
Moving on to more advanced annotations!
Let’s start with indexing. MongoDB indexes are special data structures that facilitate querying data. They are similar to a book index that helps to find required content without reading each page. MongoDB indexes help to avoid a full collection scan to find necessary data.
Indexes are created for frequently queried fields such as nationalId in Civilian. With an index, you can also enforce the uniqueness of the field if necessary (although the uniqueness of a national id is debatable in the real world, let’s suppose it is a unique number for the sake of this small demo):
@Document(collection = "civilians")
public class Civilian {
@Indexed(unique = true)
private String nationalId;
}
Indexes can be used not only with documents but with any entity whose data is preserved in the database. Let’s also create indexes for Implant:
public class Implant {
@Indexed(unique = true)
private String serialNumber;
@Indexed
private int lotNumber;
}
Apart from single-field indexes, MongoDB supports compound (@CompoundIndex), text (@TextIndexed), geospatial (@GeoSpatialIndexed), and hashed (@HashIndexed) indexes.
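As a quick illustration (these two are not used further in our demo), the field-level annotations for text and hashed indexes look like this:

public class Implant {
    // Adds the field to the collection's full-text index
    @TextIndexed
    private String model;

    // Creates a hashed index, useful, for example, for hash-based sharding
    @HashIndexed
    private String manufacturer;
}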
Compound indexes can help improve the performance of queries that use criteria on multiple fields. They are defined at the class level. Let’s create a compound index on implantSerialNumber in ascending order and timestamp in descending order for ImplantMonitoringLog:
@Document(collection = "implant_logs")
@CompoundIndex(name = "implant_ts_idx",
def = "{'implantSerialNumber': 1, 'timestamp': -1}")
public class ImplantMonitoringLog {
}
Geospatial queries are an exciting MongoDB feature that enables you to find documents within a given distance. We can’t pass this one by, so let’s see it in action!
Let’s add one more field to our ImplantMonitoringLog that defines the location where the log was created:
@GeoSpatialIndexed(type = GeoSpatialIndexType.GEO_2DSPHERE)
private Point location;
Here,
- Point is a class from org.springframework.data.geo;
- the @GeoSpatialIndexed(type = GeoSpatialIndexType.GEO_2DSPHERE) annotation creates an index of type GEO_2DSPHERE that enforces usage of the $nearSphere operator when fetching the data. This operator takes into account the curvature of the Earth and performs a spherical, geodesic distance calculation. It enables you to search for documents within a given radius and work with realistic distances on a globe.
After adding this annotation, we can create repository methods for fetching ImplantMonitoringLog documents within a certain distance. I’ll show you how to do that in the next section.
Another interesting set of annotations is the audit annotations @CreatedDate and @LastModifiedDate. They help keep track of the document lifecycle, which is useful for maintaining data traceability and enabling analytics or data versioning.
Let’s add these annotations to our Civilian and ImplantMonitoringLog classes:
@Document(collection = "civilians")
public class Civilian {
@CreatedDate
private final LocalDateTime registeredInSystemAt;
}
@Document(collection = "implant_logs")
public class ImplantMonitoringLog {
@CreatedDate
private final LocalDateTime timestamp;
}
Our models are ready, moving on to setting up repositories!
Creating Mongo Repositories
Spring Data MongoDB provides the MongoRepository interface, which, in turn, extends CrudRepository, QueryByExampleExecutor, and PagingAndSortingRepository.
Therefore, it provides basic methods for CRUD operations on the entities, paging and sorting capabilities, and MongoDB-specific methods.
Let’s create two repository interfaces annotated with @Repository: CivilianRepository and ImplantMonitoringLogRepository. Make them extend MongoRepository and specify the entity class and id type:
@Repository
public interface CivilianRepository extends MongoRepository<Civilian, String> { }
@Repository
public interface ImplantMonitoringLogRepository extends MongoRepository<ImplantMonitoringLog, String> { }
At this point, we can already perform CRUD operations with methods provided internally by Spring repositories.
In addition, we can use queries by convention. This feature enables developers to define queries through method names that follow an established pattern.
For instance, we can define a method for searching for a civilian by nationalId as follows:
@Repository
public interface CivilianRepository extends MongoRepository<Civilian, String> {
Optional<Civilian> findByNationalId(String nationalId);
}
We can also define methods that return a List of ImplantMonitoringLog objects by implant serial number and a timestamp after or between specified dates:
@Repository
public interface ImplantMonitoringLogRepository extends MongoRepository<ImplantMonitoringLog, String> {
List<ImplantMonitoringLog> findByImplantSerialNumber(String implantSerialNumber);
List<ImplantMonitoringLog> findByImplantSerialNumberAndTimestampAfter(String implantSerialNumber,
LocalDateTime timestamp);
List<ImplantMonitoringLog> findByImplantSerialNumberAndTimestampBetween(String implantSerialNumber,
LocalDateTime timestampFrom,
LocalDateTime timestampTo);
}
Spring automatically parses these methods and generates the corresponding MongoDB queries.
Remember we enabled geospatial queries with MongoDB? Let’s add a corresponding method to the ImplantMonitoringLog repository:
List<ImplantMonitoringLog> findByLocationNear(Point point, Distance distance);
The Near keyword in the method name enables you to fetch all ImplantMonitoringLog documents within a given distance from the specified point.
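Calling it could look like this (a quick sketch, assuming an injected ImplantMonitoringLogRepository named logRepository; Point, Distance, and Metrics come from org.springframework.data.geo, and Point takes longitude first, then latitude):

// All logs created within 2 km of central Amsterdam
Point center = new Point(4.899, 52.372);
List<ImplantMonitoringLog> nearby = logRepository.findByLocationNear(center, new Distance(2, Metrics.KILOMETERS));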
Of course, this method is for demonstration purposes only: in our application, there could be thousands of logs, which may seriously affect query performance. What you can do here is:
- Specify additional parameters in the method to limit the number of search results. For instance, you can include a time window:
List<ImplantMonitoringLog> findByLocationNearAndTimestampBetween(
Point point, Distance distance, LocalDateTime from, LocalDateTime to);
- Create an aggregation pipeline to filter and sort fetched documents.
For such complicated queries, or for more fine-grained control, we can use the @Query annotation or MongoTemplate. We will discuss these approaches further on in the article.
For now, let’s see how to perform basic CRUD operations with MongoDB.
Basic CRUD Operations with MongoDB
Create the CivilianService class and annotate it with @Service. This class will be responsible for communicating with the data access layer:
@Service
public class CivilianService {
private final CivilianRepository civilianRepository;
public CivilianService(CivilianRepository civilianRepository) {
this.civilianRepository = civilianRepository;
}
}
We can save a new Civilian document using the save() or insert() method.
The save() method will insert a new document if it doesn't exist, or update the existing one if the id matches. Therefore, it can also be used for updating an entity.
public Civilian saveCivilian(Civilian civilian) {
return civilianRepository.save(civilian);
}
public Civilian updateCivilian(String id, Implant implant) {
Civilian civilian = civilianRepository.findById(id).orElseThrow();
civilian.getImplants().add(implant);
return civilianRepository.save(civilian);
}
On the other hand, insert() only adds a new document and will fail if the document already exists. The method can only be used for creating documents, but at the same time, it protects against accidental overwrites:
public Civilian saveCivilian(Civilian civilian) {
return civilianRepository.insert(civilian);
}
To find civilians, we can use the built-in methods and the ones we defined in the interface:
public Civilian getCivilianById(String id) {
return civilianRepository.findById(id).orElseThrow();
}
public Civilian getCivilianByNationalId(String nationalId) {
return civilianRepository.findByNationalId(nationalId).orElseThrow();
}
public List<Civilian> getAllCivilians() {
return civilianRepository.findAll();
}
Finally, we can delete civilians using the built-in repository methods:
public void deleteCivilian(Civilian civilian) {
civilianRepository.delete(civilian);
}
public void deleteAllCivilians() {
civilianRepository.deleteAll();
}
For ImplantMonitoringLog, the process is similar.
This is all very well, but let’s see how MongoDB shines in more complex querying cases.
Using MongoTemplate and @Query for Custom Logic
Sometimes, the method naming approach is not enough. You may want to query array elements, create complex filters, or avoid unreadable method names.
If you want to go beyond the MongoRepository capabilities and define custom data querying and processing, you can use MongoTemplate or the @Query annotation.
The @Query annotation allows you to define custom MongoDB queries directly in repository interfaces. Using it is straightforward: you annotate a relevant repository method with @Query and write a query using the BSON-based MongoDB query language.
For instance, suppose we want to find all civilians with implants whose lot number is greater than or equal to N:
@Query("{ 'implants': { $elemMatch: { 'lotNumber': { $gte: ?0 } } } }")
List<Civilian> findAllByImplantLotNumberGreaterThanEqual(int lotNumber);
Here,
- implants refers to the embedded implants array inside Civilian;
- $elemMatch filters for objects in the array that match the condition;
- $gte: ?0 checks for a lotNumber greater than or equal to the first method parameter.
Refer to the official documentation for more details on the syntax.
What about MongoTemplate?
MongoTemplate is an API that provides an abstraction layer over MongoDB operations. It enables you to create complex queries and aggregations without leaving the Spring programming model.
When using MongoTemplate, we work with Query, Criteria, and Aggregation objects to define queries programmatically. Then, MongoTemplate translates these objects into BSON query documents.
How do we use MongoTemplate? Well, we need to perform a few additional steps to enjoy programmatic querying.
First, we need to define a custom repository interface, say, CivilianRepositoryCustom, where we declare custom query methods:
public interface CivilianRepositoryCustom {
List<Civilian> findAllByImplantLotNumberGreaterThanEqual(int lotNumber);
}
After that, create a CivilianRepositoryCustomImpl class; the ‘Impl’ suffix is super important, as Spring Data uses it to locate the custom implementation. This class will implement CivilianRepositoryCustom (also make CivilianRepository extend the custom interface so that these methods are available through it). Annotate the class with @Repository, inject the MongoTemplate bean, and override the interface methods:
@Repository
public class CivilianRepositoryCustomImpl implements CivilianRepositoryCustom {
private final MongoTemplate mongoTemplate;
public CivilianRepositoryCustomImpl(MongoTemplate mongoTemplate) {
this.mongoTemplate = mongoTemplate;
}
@Override
public List<Civilian> findAllByImplantLotNumberGreaterThanEqual(int lotNumber) {
return null;
}
}
Alright, the most interesting part begins here!
We need to create a new Query object inside this method. The Query accepts a Criteria, which builds the required condition: in our case, we search the implants array for an implant whose lot number is greater than or equal to the passed parameter. Finally, we pass this Query to MongoTemplate, which runs it with its find(...) method and returns all matching civilians from the database.
The code looks like this:
@Override
public List<Civilian> findAllByImplantLotNumberGreaterThanEqual(int lotNumber) {
Query query = new Query(Criteria.where("implants.lotNumber").gte(lotNumber));
return mongoTemplate.find(query, Civilian.class);
}
The explanation is longer than the actual implementation 🙂
You can also write type-safe queries. One approach is to use the FluentMongo wrapper.
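Independently of any wrapper, Spring Data’s own fluent MongoTemplate API also gives you a more readable, type-aware way to run the same query; a minimal sketch:

// The same lot-number query expressed with the fluent template API
List<Civilian> result = mongoTemplate.query(Civilian.class)
        .matching(new Query(Criteria.where("implants.lotNumber").gte(lotNumber)))
        .all();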
As we used the same method for both examples, you can now compare the @Query-based approach with MongoTemplate.
The question is, when to use which? Both methods are suitable for writing complex custom queries, so it is a matter of taste, really.
Later on in the article we will also compare both approaches when writing aggregation pipelines.
Up next: Projections and DTOs!
Projections and DTOs
MongoDB projection is the process of retrieving only those fields of the document that were specified instead of fetching the entire document. Projections help to reduce network traffic related to over-fetching and can protect against accidental data exposure.
For instance, say we want to retrieve only the legalName and nationalId of civilians. We can specify these fields using the @Query annotation:
@Query(value = "{}", fields = "{ legalName : 1, nationalId : 1 }")
List<Civilian> findAllLegalNamesAndIds();
This method returns a list of Civilian objects with data in the legalName and nationalId fields only; all the other fields will be null.
A more sophisticated approach is to use interface-based projections. In this case, you define an interface specifying the fields you need, and Spring will automatically map query results to this projection.
The example above can be adjusted as follows. Create an interface CivilianSummary with two getter methods, for legalName and nationalId:
public interface CivilianSummary {
String getLegalName();
String getNationalId();
}
Now, define a method in the CivilianRepository:
List<CivilianSummary> findAllByUnderSurveillance(boolean underSurveillance);
As a result, Spring Data will return CivilianSummary projections directly from the database.
Interface-based projections bear a striking resemblance to DTOs, wouldn't you agree? Actually, they are not the same.
DTOs or class-based projections are classes that you write manually, with fields, constructors, or even computed fields and custom logic. DTOs are more flexible as they let you define custom logic and easily handle deeply nested objects.
On the other hand, interface-based projections are interfaces with getters for required fields. They are most suitable when you don’t need any custom logic in the resulting object. They are also associated with smaller overhead than DTOs because only the requested fields are fetched.
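For comparison, a class-based (DTO) projection can also be returned straight from a derived query; a small sketch, assuming we only need the names of civilians with a criminal record (the record and method below are illustrative and not part of the demo repository):

public record CivilianName(String legalName, String nationalId) {
}

// Added to CivilianRepository: Spring Data instantiates the record from the matching document fields
List<CivilianName> findAllByCriminalRecord(boolean criminalRecord);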
Good news is that we can combine DTOs with projections and get the best of two worlds!
Suppose we want to gather statistics on implant performance. We must calculate the average indicators for power usage, CPU usage, and neural latency over a given period of time, so it would be better to create a DTO with the corresponding fields.
So, let’s create a record MonitoringStats:
public record MonitoringStats(String implantSerialNumber,
double avgPowerUsageUw,
double avgCpuUsagePct,
double avgNeuralLatencyMs) {
}
We can now use it to hold the data we fetched and calculated.
There’s just a small nuisance: to perform such a task, we need to master another powerful MongoDB tool, which is aggregations.
So, follow me to the next section!
Aggregations with Spring Data MongoDB
Aggregation in MongoDB is the process of running a series of operations, like filtering, grouping, or transforming data, against a collection of documents directly on the database side.
Aggregations are extremely useful when you need to perform analytics, gather statistics, or craft reports because they help to reduce data transfer and client-side computations.
You can use out-of-the-box single-purpose aggregations like distinct() or countDocuments(). In case you need to perform more complex computations, you can build your own aggregation pipeline.
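For instance, with MongoTemplate the single-purpose operations look roughly like this (a sketch against the collections from our demo):

// How many civilians are currently under surveillance?
long underSurveillance = mongoTemplate.count(new Query(Criteria.where("underSurveillance").is(true)), Civilian.class);

// Distinct manufacturers across all embedded implants
List<String> manufacturers = mongoTemplate.findDistinct(new Query(), "implants.manufacturer", Civilian.class, String.class);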
An aggregation pipeline is a series of operations performed on a set of data. You can create aggregation pipelines:
- declaratively, in a repository interface, using the @Aggregation annotation (a short sketch follows below), or
- programmatically, with the help of MongoTemplate and Criteria.
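Here is what the declarative variant can look like; a minimal sketch, assuming we only want the average power usage of one implant (the annotation is the repository-level @Aggregation from org.springframework.data.mongodb.repository, and the pipeline stages are plain MongoDB JSON):

// In ImplantMonitoringLogRepository: average power usage of one implant, computed by the database
@Aggregation(pipeline = {
        "{ $match: { implantSerialNumber: ?0 } }",
        "{ $group: { _id: null, total: { $avg: '$powerUsageUw' } } }"
})
Double averagePowerUsage(String implantSerialNumber);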
In our small demo, we already have two perfect use cases for an aggregation pipeline:
- We want to filter ImplantMonitoringLogs by time and distance, group them by implantSerialNumber, and transform them into a Map where implantSerialNumber is the key and a List of ImplantMonitoringLogs is the value.
- We want to calculate average values for implant performance metrics for a given implantSerialNumber and a time window, and return a MonitoringStats DTO as the query result.
Well, what are we waiting for? Let’s get on to the tasks at hand!
First, let’s prepare the ground. We need to create the ImplantMonitoringLogRepositoryCustom interface with custom methods aggregateStats() and findLogsByAreaAndTimeGrouped():
public interface ImplantMonitoringLogRepositoryCustom {
MonitoringStats aggregateStats(String serialNumber, LocalDateTime from, LocalDateTime to);
Map<String, List<ImplantMonitoringLog>> findLogsByAreaAndTimeGrouped(
Point center, double maxDistanceMeters, LocalDateTime from, LocalDateTime to);
}
Don’t forget to make the ImplantMonitoringLogRepository extend the new interface:
public interface ImplantMonitoringLogRepository extends MongoRepository<ImplantMonitoringLog, String>, ImplantMonitoringLogRepositoryCustom {
}
Next, create an ImplantMonitoringLogRepositoryCustomImpl class that implements the ImplantMonitoringLogRepositoryCustom interface and overrides its methods. In addition, inject the MongoTemplate bean:
@Repository
public class ImplantMonitoringLogRepositoryCustomImpl implements ImplantMonitoringLogRepositoryCustom {
private final MongoTemplate mongoTemplate;
public ImplantMonitoringLogRepositoryCustomImpl(MongoTemplate mongoTemplate) {
this.mongoTemplate = mongoTemplate;
}
}
Shall we start with the aggregateStats() method?
Once again, what we need to do is:
- Filter the logs and fetch only the documents that match the given criteria: implant serial number and a time window;
- Calculate the average of certain fields: power usage, CPU usage, and neural latency;
- Create a projection that will contain only four fields: the implant serial number and the calculated metrics;
- Return a MonitoringStats object from the database.
For the first operation, we need the MatchOperation class that will hold the given Criteria:
MatchOperation match = Aggregation.match(Criteria.where("implantSerialNumber").is(serialNumber)
.and("timestamp").gte(from).lte(to));
To calculate the averages, we need the GroupOperation class. Here, we group logs by implantSerialNumber, calculate the average for each group, and give a new name to each resulting metric:
GroupOperation group = Aggregation.group("implantSerialNumber")
.avg("powerUsageUw").as("avgPowerUsageUw")
.avg("cpuUsagePct").as("avgCpuUsagePct")
.avg("neuralLatencyMs").as("avgNeuralLatencyMs");
Next, we need the ProjectionOperation class to rename _id to implantSerialNumber and round each average metric to 2 decimal places:
ProjectionOperation project = Aggregation.project().and("_id").as("implantSerialNumber")
.and(ArithmeticOperators.Round.roundValueOf("avgPowerUsageUw").place(2)).as("avgPowerUsageUw")
.and(ArithmeticOperators.Round.roundValueOf("avgCpuUsagePct").place(2)).as("avgCpuUsagePct")
.and(ArithmeticOperators.Round.roundValueOf("avgNeuralLatencyMs").place(2)).as("avgNeuralLatencyMs");
Finally, we combine all three stages into a single aggregation pipeline and ‘feed’ it to MongoTemplate, which executes the pipeline against the implant_logs collection and maps the result to a MonitoringStats object.
Aggregation aggregation = Aggregation.newAggregation(match, group, project);
AggregationResults<MonitoringStats> results = mongoTemplate.aggregate(
aggregation, "implant_logs", MonitoringStats.class);
Full method implementation:
@Override
public MonitoringStats aggregateStats(String serialNumber, LocalDateTime from, LocalDateTime to) {
MatchOperation match = Aggregation.match(Criteria.where("implantSerialNumber").is(serialNumber)
.and("timestamp").gte(from).lte(to));
GroupOperation group = Aggregation.group("implantSerialNumber")
.avg("powerUsageUw").as("avgPowerUsageUw")
.avg("cpuUsagePct").as("avgCpuUsagePct")
.avg("neuralLatencyMs").as("avgNeuralLatencyMs");
ProjectionOperation project = Aggregation.project()
.and("_id").as("implantSerialNumber")
.and(ArithmeticOperators.Round.roundValueOf("avgPowerUsageUw").place(2)).as("avgPowerUsageUw")
.and(ArithmeticOperators.Round.roundValueOf("avgCpuUsagePct").place(2)).as("avgCpuUsagePct")
.and(ArithmeticOperators.Round.roundValueOf("avgNeuralLatencyMs").place(2)).as("avgNeuralLatencyMs");
Aggregation aggregation = Aggregation.newAggregation(match, group, project);
AggregationResults<MonitoringStats> results = mongoTemplate.aggregate(
aggregation, "implant_logs", MonitoringStats.class);
return results.getUniqueMappedResult();
}
This aggregation pipeline is ready, one more to go!
A quick reminder: we want to find all ImplantMonitoringLogs within a given radius and time window, group them by implant serial number, and return a map with implantSerialNumber as the key and List<ImplantMonitoringLog> as the value.
The first operation is already familiar to us: we use MatchOperation to filter logs by distance and time:
MatchOperation match = Aggregation.match(
Criteria.where("location").nearSphere(center)
.maxDistance(maxDistanceMeters)
.and("timestamp").gte(from).lte(to));
Then, we use GroupOperation to group logs by implantSerialNumber and push each matching document into the logs array for that group:
GroupOperation group = Aggregation.group("implantSerialNumber")
.push(Aggregation.ROOT).as("logs");
After that, we create the aggregation pipeline and execute it with MongoTemplate, which returns the raw results as Document objects:
Aggregation aggregation = Aggregation.newAggregation(match, group);
AggregationResults<Document> results = mongoTemplate.aggregate(
aggregation, "implant_logs", Document.class);
The next step is to iterate over each Document and get the implantSerialNumber and the list of log Documents. These logs can then be converted to the ImplantMonitoringLog class using the mongoTemplate.getConverter() method. Finally, we put a new entry into the Map, with implantSerialNumber as the key and List<ImplantMonitoringLog> as the value:
Map<String, List<ImplantMonitoringLog>> grouped = new HashMap<>();
for (Document doc : results.getMappedResults()) {
String serialNumber = doc.getString("_id");
List<Document> logsDocs = (List<Document>) doc.get("logs");
List<ImplantMonitoringLog> logs = logsDocs.stream()
.map(d -> mongoTemplate.getConverter()
.read(ImplantMonitoringLog.class, d))
.toList();
grouped.put(serialNumber, logs);
}
Full method implementation:
@Override
public Map<String, List<ImplantMonitoringLog>> findLogsByAreaAndTimeGrouped(Point center,
double maxDistanceMeters,
LocalDateTime from,
LocalDateTime to) {
MatchOperation match = Aggregation.match(
Criteria.where("location").nearSphere(center)
.maxDistance(maxDistanceMeters)
.and("timestamp").gte(from).lte(to));
GroupOperation group = Aggregation.group("implantSerialNumber")
.push(Aggregation.ROOT).as("logs");
Aggregation aggregation = Aggregation.newAggregation(match, group);
AggregationResults<Document> results = mongoTemplate.aggregate(
aggregation, "implant_logs", Document.class);
Map<String, List<ImplantMonitoringLog>> grouped = new HashMap<>();
for (Document doc : results.getMappedResults()) {
String serialNumber = doc.getString("_id");
List<Document> logsDocs = (List<Document>) doc.get("logs");
List<ImplantMonitoringLog> logs = logsDocs.stream()
.map(d -> mongoTemplate.getConverter()
.read(ImplantMonitoringLog.class, d))
.toList();
grouped.put(serialNumber, logs);
}
return grouped;
}
That was awesome! You can now proceed to writing the Controllers if you want to build a web application and take advantage of the logic we have written.
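If you do go that route, a controller is a thin layer on top of the service; a rough sketch, assuming you add spring-boot-starter-web and reuse the CivilianService from earlier (the paths and method names are illustrative):

@RestController
@RequestMapping("/api/civilians")
public class CivilianController {
    private final CivilianService civilianService;

    public CivilianController(CivilianService civilianService) {
        this.civilianService = civilianService;
    }

    @GetMapping("/{id}")
    public Civilian getById(@PathVariable String id) {
        return civilianService.getCivilianById(id);
    }

    @PostMapping
    public Civilian create(@RequestBody Civilian civilian) {
        return civilianService.saveCivilian(civilian);
    }
}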
As for this tutorial, there are two more topics left to discuss: testing and database migrations.
MongoDB Migrations with Mongock
Why use a migration tool with NoSQL
Using a database migration tool such as Liquibase or Flyway with SQL databases is justified, as we need to write the schema and update it explicitly as needed. With MongoDB, the schema is generated automatically, and changes to the @Document class fields can be applied with a single write command.
So, why bother with a migration tool?
Even if MongoDB doesn’t enforce a rigid schema, your application needs one. The indexes, validation rules, and conventions must be correct and consistent across dev, CI, and prod. However, relying on an auto-created schema may lead to:
- Missing indexes resulting in slow queries,
- Incomplete validation rules leading to corrupted documents.
You might also want to modify the data, which, in the case of automatic updates, may result in mismatching schema versions in production.
Therefore, a migration tool enables you to create version-controlled and idempotent change sets so that every environment starts with the same collections, indexes, and validations.
What is Mongock
Mongock is an open-source Java-based migration tool for NoSQL databases. It offers a code-first approach to schema generation, meaning that you can write migration scripts in Java/Kotlin and ship them with your app. With Mongock, you can:
- Version changelogs,
- Create indexes and validation rules,
- Be sure of idempotent execution of change sets,
- Split documents,
- Seed sample data.
Mongock is natively compatible with Spring/Spring Boot, so adding it to your app is just a matter of two dependencies and one annotation.
Why not use the familiar tools such as Liquibase or Flyway?
Liquibase/Flyway are tailored to relational databases. They have very limited and/or experimental support for MongoDB and BSON and don’t work with many Mongo-specific features such as geo-indexes.
Therefore, Liquibase and Flyway remain gold standards for relational databases, whereas Mongock is a perfect fit for MongoDB and other non-relational DBs.
Setting Up Mongock with Spring Boot
Let’s add the dependencies for the Mongock Spring Boot runner and the Spring Data MongoDB driver to the pom.xml (their versions can be managed by importing the Mongock BOM in dependencyManagement or by specifying them explicitly):
<dependency>
<groupId>io.mongock</groupId>
<artifactId>mongock-springboot</artifactId>
</dependency>
<dependency>
<groupId>io.mongock</groupId>
<artifactId>mongodb-springdata-v4-driver</artifactId>
</dependency>
Now, add the @EnableMongock annotation to the main application class. This annotation triggers the Mongock runner upon application start to run the migrations:
@EnableMongock
@SpringBootApplication
@EnableMongoRepositories
public class MongodbDemoApp {
public static void main(String[] args) {
SpringApplication.run(MongodbDemoApp.class, args);
}
}
Finally, add a new property to application.properties pointing to the location of changelog files:
mongock.migration-scan-package=dev.cyberjar.migration
It is important to note that if you want to create the schema manually with Mongock, you have to remove all @Indexed annotations from the classes; otherwise Spring Data will attempt to create the indexes automatically, and the migration will fail with an error saying that such an index already exists.
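If you go down that road in our demo, it also makes sense to turn off the automatic index creation we enabled earlier, so that index management is left entirely to the migration (an assumption of this setup, adjust it to your needs):

spring.data.mongodb.auto-index-creation=false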
Create Changelogs and Apply Updates
Note that this guide is applicable to Mongock version 5.x — some methods and flows were changed as compared to the previous major Mongock version.
Create the SchemaDataInitializerChangeUnit class. Annotate it with @ChangeUnit and specify:
- the id, which will be stored in the ChangeUnit history collection,
- the execution order,
- the author (optional).
@ChangeUnit(id = "schema-and-test-data", order = "001", author = "cyberjar")
public class SchemaDataInitializerChangeUnit {
private final MongoTemplate mongoTemplate;
public SchemaDataInitializerChangeUnit(MongoTemplate mongoTemplate) {
this.mongoTemplate = mongoTemplate;
}
}
First, let’s take care of creating the collections and indices.
ChangeUnit classes can contain methods annotated with:
- @BeforeExecution (optional) for executing operations such as DDL before the actual migration,
- @RollbackBeforeExecution (obligatory if @BeforeExecution is present) for reverting the changes made in the @BeforeExecution method,
- @Execution for the main migration method,
- @RollbackExecution for reverting changes made in the execution method.
The creation of collections and indexes is a DDL operation and should be performed in the @BeforeExecution method.
First, let’s create the collections using the createCollection() method of MongoTemplate:
@BeforeExecution
public void beforeExecution() {
mongoTemplate.createCollection("civilians");
mongoTemplate.createCollection("implant_logs");
}
In the same method, create the indices. For that, we need an IndexOperations object that MongoTemplate binds to a specific collection. Using this object, we can create indices for that collection:
IndexOperations civilianOps = mongoTemplate.indexOps("civilians");
civilianOps.createIndex(
new Index().on("nationalId", Sort.Direction.ASC).unique());
Note that the new Index accepts the field name and sorting direction. You can also specify additional properties of the index, for instance, whether it is unique or not.
In a similar way, let’s create the indices for the implants. As we don’t have a separate collection for them, we bind the index operations to the class:
IndexOperations implantOps = mongoTemplate.indexOps(Implant.class);
implantOps.createIndex(
new Index().on("serialNumber", Sort.Direction.ASC).unique());
implantOps.createIndex(
new Index().on("lotNumber", Sort.Direction.ASC));
As for the ImplantMonitoringLogs, we need to create a GeospatialIndex instead of the standard Index for the location field and specify its type.
We can also make the timestamp index a TTL index. MongoDB uses such indices to automatically remove documents from the collection after a specified amount of time. In the case of logs, a TTL index is especially beneficial because otherwise the number of logs in the collection may grow beyond sensible limits.
IndexOperations logOps = mongoTemplate.indexOps("implant_logs");
logOps.createIndex(
new Index().on("implantSerialNumber", Sort.Direction.ASC));
logOps.createIndex(
new Index().on("timestamp", Sort.Direction.DESC));
logOps.createIndex(
new GeospatialIndex("location")
.typed(GeoSpatialIndexType.GEO_2DSPHERE));
logOps.createIndex(
new Index()
.on("timestamp", Sort.Direction.ASC)
.expire(Duration.ofDays(90)));
We also need a @RollbackBeforeExecution method in case something goes wrong during schema creation:
@RollbackBeforeExecution
public void rollbackBeforeExecution() {
mongoTemplate.dropCollection("civilians");
mongoTemplate.dropCollection("implant_logs");
}
We can now move on to the @Execution method to seed some test data into our database:
@Execution
public void seedDatabase(MongoTemplate mongoTemplate) {
List<Implant> implants = new ArrayList<>();
implants.add(new Implant("limb", "Model-Dvb688", "2.2", "MechaMed", 536, "742669", "2025-03-21"));
implants.add(new Implant("ocular", "Model-SiT679", "1.5", "MechaMed", 434, "306310", "2025-06-08"));
implants.add(new Implant("limb", "Model-Jtv413", "1.3", "MechaMed", 536, "470917", "2025-04-03"));
List<Civilian> civilians = new ArrayList<>();
civilians.add(new Civilian(null, "Aarav Das", "Ni-96751543-BP", "1965-05-02", true, false, List.of(implants.get(0))));
civilians.add(new Civilian(null, "Paula Lin", "NP-59909166-Wg", "1998-11-01", false, false, List.of(implants.get(1))));
civilians.add(new Civilian(null, "Aelita Fang", "gQ-01247486-nk", "1989-12-01", true, false, List.of(implants.get(2))));
mongoTemplate.insert(civilians, Civilian.class);
String implantSerialNum = implants.getFirst().getSerialNumber();
String civilianNationalId = civilians.getFirst().getNationalId();
double powerUsage = 1.5;
double cpuUsage = 1.0;
double neuralLatency = 0.5;
List<ImplantMonitoringLog> logs = new ArrayList<>();
for (int i = 0; i < 30; i++) {
ImplantMonitoringLog implantMonitoringLog = new ImplantMonitoringLog(null, implantSerialNum, civilianNationalId, LocalDateTime.now().minusHours(i), powerUsage + i, cpuUsage + i, neuralLatency + i, new Point(4.899, 52.372)); //Coordinates for Amsterdam longitude/latitude
logs.add(implantMonitoringLog);
}
mongoTemplate.insert(logs, ImplantMonitoringLog.class);
}
Finally, the @RollbackExecution method will revert the changes made by the execution method if necessary. You can delete all documents from the collections or simply drop the collections:
@RollbackExecution
public void rollbackExecution() {
mongoTemplate.dropCollection("civilians");
mongoTemplate.dropCollection("implant_logs");
}
Testing MongoDB Applications with @DataMongoTest and Testcontainers
Our small demo app is ready, it’s time to test it!
Testing Data Layer with @DataMongoTest and Testcontainers
Spring Boot provides the @DataMongoTest annotation for testing the data layer of the application without the full auto-configuration. When applied at the test class level, it searches for MongoDB-specific beans like repositories and documents and configures only them, leaving services, controllers, etc. out of the equation.
By default, tests annotated with @DataMongoTest use an embedded MongoDB database. But we will override this behavior and use dockerized MongoDB, which we will spin up with the help of Testcontainers.
Why would we want to do that? Unlike an embedded database, Testcontainers provides an actual database instance in a Docker container, meaning that your tests run against a realistic, production-like environment.
We already added Testcontainers support when we created the project. We need to add one more dependency, for JUnit integration. All three dependencies:
<dependency>
<groupId>org.testcontainers</groupId>
<artifactId>junit-jupiter</artifactId>
<version>1.21.0</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.testcontainers</groupId>
<artifactId>mongodb</artifactId>
<version>1.21.0</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-testcontainers</artifactId>
<scope>test</scope>
</dependency>
Now, let’s create a test class for CivilianRepository and annotate it with @Testcontainers, which delegates the lifecycle of containers to Testcontainers, and with @DataMongoTest:
@Testcontainers
@DataMongoTest
class CivilianRepositoryTest {
}
Now, we need to create an instance of MongoDBContainer using the specified Docker image. The @ServiceConnection annotation allows the MongoDB-related beans to communicate with MongoDB inside the Docker container.
Also, autowire the CivilianRepository bean that we will test and the MongoTemplate bean that will be responsible for adding test data to the database.
@Testcontainers
@DataMongoTest
class CivilianRepositoryTest {
@Container
@ServiceConnection
static MongoDBContainer mongoDBContainer = new MongoDBContainer("mongo");
@Autowired
private CivilianRepository repository;
@Autowired
private MongoTemplate mongoTemplate;
}
The final step is to disable Mongock for this set of tests. You can and should test migrations separately, but in other tests, Mongock is not necessary and will only complicate the setup.
Create the test.properties file and add one property to disable Mongock:
mongock.enabled=false
Now, specify the path to the test.properties file with the class-level @TestPropertySource annotation:
@Testcontainers
@DataMongoTest
@TestPropertySource(locations = "classpath:test.properties")
class CivilianRepositoryTest {
}
Excellent. Now, let’s add some sample data to the containerized database.
The best practice is to isolate tests from one another so that they don’t interfere with each other’s results. To achieve that, we can insert fresh data before each test and clean up the database after each test, using the @BeforeEach and @AfterEach annotations:
@BeforeEach
void populateWithData() {
mongoTemplate.createCollection("civilians");
List<Implant> implants = new ArrayList<>();
implants.add(new Implant("limb", "Model-Dvb688", "2.2", "MechaMed", 536, "742669", "2025-03-21"));
implants.add(new Implant("ocular", "Model-SiT679", "1.5", "MechaMed", 434, "306310", "2025-06-08"));
implants.add(new Implant("limb", "Model-Jtv413", "1.3", "MechaMed", 536, "470917", "2025-04-03"));
List<Civilian> civilians = new ArrayList<>();
civilians.add(new Civilian(null, "Rin Morse", "fI-88901036-kD", "1985-08-01", true, false, List.of(implants.get(0))));
civilians.add(new Civilian(null, "Heather Huang", "YD-99086969-CP", "1994-04-16", false, true, List.of(implants.get(1))));
civilians.add(new Civilian(null, "Amir Morgan", "MP-66879496-vg", "1975-06-26", false, true, List.of(implants.get(2))));
mongoTemplate.insert(civilians, Civilian.class);
}
@AfterEach
void cleanUp() {
mongoTemplate.dropCollection("civilians");
}
In the example above, we added the data manually. Another approach is to create a JSON file with all the required data and add a Jackson2RepositoryPopulatorFactoryBean to a configuration class. This bean will populate the database with data when the application context starts, provided we import the configuration into the test class:
@Configuration
public class PopulatorConfig {
@Bean
public Jackson2RepositoryPopulatorFactoryBean populator() {
var bean = new Jackson2RepositoryPopulatorFactoryBean();
bean.setResources(new Resource[] {
new ClassPathResource("mydata.json")
});
return bean;
}
}
@DataMongoTest
@Import(PopulatorConfig.class)
class DataMongoTestWithJson {
// Repos and tests as needed
}
Finally, we can use the familiar flow to write some tests:
@Test
void shouldFindCivilianByNationalId() {
Optional<Civilian> civilian = repository.findByNationalId("fI-88901036-kD");
String name = "Rin Morse";
assertEquals(name, civilian.get().getLegalName());
}
@Test
void shouldFindCiviliansByLotNumber() {
List<Civilian> civilians = repository.findAllByImplantLotNumber(536);
int expected = 2;
assertEquals(expected, civilians.size());
}
Integration Testing with Testcontainers
After we have tested all data classes in isolation, we can move on to integration testing. Integration testing is aimed at verifying that different parts of the application work correctly together.
Let’s create a test class for our ImplantMonitoringLogService. We need the @Testcontainers annotation, and the @SpringBootTest annotation instead of @DataMongoTest, to load the whole application context.
The container setup can be copied from the previous test class:
@Testcontainers
@SpringBootTest(classes = MongodbDemoApp.class)
@TestPropertySource(locations = "classpath:test.properties")
class ImplantMonitoringLogServiceTest {
@Container
@ServiceConnection
static MongoDBContainer mongoDBContainer = new MongoDBContainer("mongo");
@Autowired
private ImplantMonitoringLogService monitoringLogService;
@Autowired
private MongoTemplate mongoTemplate;
}
Let’s add some test data to the database. Again, we are populating and dropping the database for each test:
@BeforeEach
void populateWithData() {
mongoTemplate.createCollection("implant_logs");
String implantSerialNum = "123456qw";
String civilianNationalId = "rtfg5674-98";
double powerUsage = 1.5;
double cpuUsage = 1.0;
double neuralLatency = 0.5;
List<ImplantMonitoringLog> logs = new ArrayList<>();
for (int i = 0; i < 30; i++) {
ImplantMonitoringLog implantMonitoringLog = new ImplantMonitoringLog(null,
implantSerialNum, civilianNationalId,
LocalDateTime.now().minusHours(i),
powerUsage + i,
cpuUsage + i,
neuralLatency + i,
new Point(4.899, 52.372)); //Coordinates for Amsterdam
logs.add(implantMonitoringLog);
}
mongoTemplate.insert(logs, ImplantMonitoringLog.class);
}
@AfterEach
void cleanUp() {
mongoTemplate.dropCollection("implant_logs");
}
Finally, you can write corresponding methods to test the application logic:
@Test
void shouldGatherStatsForImplantLogs() {
MonitoringStats stats = monitoringLogService.aggregateStatsForImplantForPeriod(
"123456qw",
LocalDateTime.now().minusDays(7),
LocalDateTime.now()
);
double expectedAvgPowerUsage = 16.0;
assertEquals(expectedAvgPowerUsage, stats.avgPowerUsageUw());
}
MongoDB Alternatives
MongoDB is a powerful and flexible NoSQL solution, but it is not the only one on the market. Here are some MongoDB alternatives that may fit your needs better:
- Couchbase is a distributed NoSQL database platform that offers built-in caching for lower latency and an SQL-like query language (N1QL) for JSON documents.
- Apache Cassandra is a highly scalable NoSQL solution designed for high availability and big data management. It boasts multi-datacenter support and fault tolerance by default.
- Amazon DynamoDB is a fully-managed NoSQL solution that is tightly integrated with the AWS ecosystem.
I would recommend studying carefully what each solution offers and trying them out to find which one fits your needs best.
Conclusion
A quick summary? If you’ve read this far, you can now:
- Set up MongoDB with Spring Boot and perform CRUD operations
- Use the @Query annotation and MongoTemplate for more complex business logic
- Use MongoDB-specific annotations for a robust and efficient schema
- Use projections and build aggregation pipelines
- Write migration scripts with Mongock
- Perform data layer and integration testing of MongoDB apps with @DataMongoTest and Testcontainers
If you’d like to deepen your knowledge of MongoDB, you can further explore the official docs.
And of course, subscribe to our newsletter for a deep dive into other cool technologies around Java