Introduction: Why MongoDB with Spring Boot?
MongoDB is an open-source cross-platform document-oriented database. Belonging to the family of NoSQL database solutions, it provides impressive scalability and flexibility to data-driven applications dealing with real-time analytics, IoT, operational intelligence, or e-commerce to name a few.
Spring developers can benefit from integrating MongoDB into their projects without leaving the familiar grounds of the framework. The secret ingredient is Spring Data MongoDB, which combines the powerful features of MongoDB and the Spring-based programming model. In addition, Spring Data MongoDB abstracts away boilerplate code and integrates smoothly with Spring Boot auto-configuration.
In this guide, I will walk you through setting up MongoDB with Spring Boot, using a variety of its features, from indexes to aggregation. I will also demonstrate how to integrate Mongock for reliable database migrations.
This guide is beginner-friendly but also includes some advanced topics, so, feel free to navigate to the section you are most interested in.
The code used in the tutorial is available on GitHub.
Table of Contents
- Setting Up MongoDB with Spring Boot
- Defining Your Data Models
- Creating Mongo Repositories
- Basic CRUD Operations with MongoDB
- Using MongoTemplate and @Query for Custom Logic
- Projections and DTOs
- Aggregations with Spring Data MongoDB
- MongoDB Migrations with Mongock
- Testing MongoDB Applications with @DataMongoTest and Testcontainers
- MongoDB Alternatives
- Conclusion
Setting Up MongoDB with Spring Boot
The application we are building is called NeuroWatch. It is a cyberpunk-themed app that collects data on civilians and the cyberware they have implanted, as well as live reports sent by the implants, and aggregates that data to monitor implant health over time.
So much more fun than typical Student, User, and Author entities, right?
Prerequisites:
- Spring Boot 3+
- JDK 24, or at least JDK 17, the minimum supported by Spring Boot 3.x. I’m using Liberica JDK, the distribution recommended by Spring.
- Your favorite IDE
- Docker and Docker Compose
First, let’s create a new Spring Boot project. Head over to Spring Initializr and select three dependencies: Docker Compose, Spring Data MongoDB, and Testcontainers.
Creating a new project
Hit ‘Generate’ and open it in the IDE.
Go to the main application class and add the @EnableMongoRepositories annotation.
@SpringBootApplication
@EnableMongoRepositories
public class MongodbDemoApp {
public static void main(String[] args) {
SpringApplication.run(MongodbDemoApp.class, args);
}
}
The next essential step is to configure the MongoDB connection.
You can either run a MongoDB server locally or in a container. I will spin up a MongoDB container using Docker Compose. Here’s the compose.yml file:
services:
  mongodb:
    image: mongo:latest
    container_name: mongodb
    restart: unless-stopped
    environment:
      - MONGO_INITDB_DATABASE=neurowatch
    networks:
      - neurowatch_net
    ports:
      - "27017:27017"
    volumes:
      - mongo_data:/data/db
    command: "mongod --quiet --logpath /dev/null"
    healthcheck:
      test: [ "CMD", "mongosh", "--eval", "db.adminCommand('ping')" ]
      interval: 10s
      timeout: 5s
      retries: 5
      start_period: 10s

volumes:
  mongo_data:

networks:
  neurowatch_net:
Let’s briefly see what is going on here:
- We specify the image we want to use (mongo:latest).
- restart: unless-stopped restarts the container automatically unless it was explicitly stopped by the user.
- We specify the database name that Mongo should create upon the first start (MONGO_INITDB_DATABASE=neurowatch).
- We provide the connection details such as the network and port.
- The healthcheck part helps to verify that the container is functioning properly.
- In the volumes part, we map a named volume to the container’s data directory so that your data is preserved when you stop and restart the container.
Next, specify the connection details in the application.properties file:
spring.data.mongodb.host=localhost
spring.data.mongodb.port=27017
spring.data.mongodb.database=neurowatch
spring.data.mongodb.auto-index-creation=true
Here, we specify the host, port, and database name. By default, user and password are not required. We also enable automatic index creation (more on indexes below).
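If you prefer a single connection string instead of separate properties, the same configuration can also be expressed with the spring.data.mongodb.uri property (an equivalent alternative, not an extra requirement):

spring.data.mongodb.uri=mongodb://localhost:27017/neurowatch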
Before running the application, you need to start the mongodb container:
docker-compose up -d
That’s it, we are all set up for writing the actual application!
Defining Your Data Models
SQL vs NoSQL approach
MongoDB is a NoSQL, document-based database, which means that its approach to storing and retrieving data is different from that of relational databases.
With traditional relational databases, we need to create a database schema beforehand and map the relationships between entities using foreign keys and joins. The data is stored in tables as rows and columns.
MongoDB stores entities in JSON-like documents. It lets developers nest arrays and sub-documents without first declaring a rigid schema. Related information is stored together in one document and can be indexed for convenient and rapid access. There’s no need to predefine a database schema; MongoDB takes care of it. At the same time, data can be efficiently queried, sorted, aggregated, and filtered thanks to the MongoDB Query API.
Keeping that in mind, let’s see how to create entities MongoDB-style!
Basic Modelling
Let’s start with basic modelling. We will have two collections: civilians, mapped to the Civilian model, and implant_logs, mapped to the ImplantMonitoringLog model. A collection in MongoDB is used to store data and is similar to a table in a relational database.
The Implant class doesn’t need a separate collection: implants will be stored inside the Civilian documents.
Let’s define a Civilian class annotated with @Document. This annotation tells Spring Data that the class represents a MongoDB document.
A MongoDB document is a self-contained “record” written in a JSON-like syntax that supports more data types than plain JSON. A document can hold any set of key-value pairs: numbers, text, dates, arrays, or even nested documents.
You can define the name of the collection in the annotation if it differs from the name of the class:
@Document(collection = "civilians")
public class Civilian {
@Id
private String id;
private String legalName;
private String nationalId;
private LocalDate birthDate;
private boolean criminalRecord;
private boolean underSurveillance;
private final LocalDateTime registeredInSystemAt;
private List<Implant> implants = new ArrayList<>();
// getters, setters, etc. are omitted for brevity
}
For now, we have only one annotated field. The @Id annotation specifies the primary key. In MongoDB, the id should be of type String, ObjectId, or UUID.
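To make this more tangible, here is roughly how a stored Civilian could look in the civilians collection (the values are illustrative, dates are shown simplified, and MongoDB generates the _id automatically if you don’t set one):

{
  "_id": "665f1c2ab7e4a12d9c0b1a77",
  "legalName": "Aarav Das",
  "nationalId": "Ni-96751543-BP",
  "birthDate": "1965-05-02",
  "criminalRecord": true,
  "underSurveillance": false,
  "registeredInSystemAt": "2025-03-21T10:15:30",
  "implants": []
}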
If you want to define a custom name for a field, you can use the @Field annotation:
@Field(name = "registered_at")
private final LocalDateTime registeredInSystemAt;
Now, let’s create the Implant class:
public class Implant {
private String type;
private String model;
private String version;
private String manufacturer;
private String serialNumber;
private int lotNumber;
private LocalDate installedAt;
private final LocalDateTime registeredInSystemAt;
// getters, setters, etc. are omitted for brevity
}
As you can see, it doesn’t have a separate collection in the database.
You can also embed documents into other documents if required. Imagine we made Implant a separate Document for some reason and gave it an id and a collection. Then, you would just add the list of implants to Civilian:
@Document
public class Implant {
@Id
private String id;
}
@Document
public class Civilian {
@Id
private String id;
private List<Implant> implants;
}
Instead of embedding the document, you can reference it using the @DBRef annotation. Such a reference will be eagerly resolved:
@Document
public class Implant {
@Id
private String id;
}
@Document
public class Civilian {
@Id
private String id;
@DBRef
private List<Implant> implants;
}
You can also use @DocumentReference instead of @DBRef to define more flexibly which fields of the referenced document should be loaded (the id field is referenced by default):
@Document
public class Implant {
@Id
private String id;
}
@Document
public class Civilian {
@Id
private String id;
@DocumentReference
private List<Implant> implants;
}
Either way, you should be cautious when embedding documents or referencing them. Updating deeply nested documents requires rewriting the entire document. In addition, eagerly fetched documents with @DBRef can impact application performance, whereas lazy loading with @DBRef may complicate debugging.
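For reference, lazy resolution is opted into through the annotation’s lazy attribute; a minimal sketch:

@Document
public class Civilian {
    @Id
    private String id;

    // Resolved on first access instead of eagerly when the Civilian is loaded
    @DBRef(lazy = true)
    private List<Implant> implants;
}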
Advanced Annotations
Moving on to more advanced annotations!
Let’s start with indexing. MongoDB indexes are special data structures that facilitate querying data. They are similar to a book index that helps to find required content without reading each page. MongoDB indexes help to avoid a full collection scan to find necessary data.
Indexes are created for frequently queried fields such as nationalId in Civilian. With an index, you can also enforce the uniqueness of the field if necessary (although the uniqueness of a national id is debatable in the real world, let’s suppose it is a unique number for the sake of this small demo):
@Document(collection = "civilians")
public class Civilian {
@Indexed(unique = true)
private String nationalId;
}
Indexes can be used not only with documents but with any entity whose data is preserved in the database. Let’s also create indexes for Implant:
public class Implant {
@Indexed(unique = true)
private String serialNumber;
@Indexed
private int lotNumber;
}
Apart from single-field indexes, MongoDB supports compound (@CompoundIndex), text (@TextIndexed), geospatial (@GeoSpatialIndexed), and hashed (@HashIndexed) indexes.
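As a quick illustration (these two are not used further in our demo), the field-level annotations for text and hashed indexes look like this:

public class Implant {
    // Adds the field to the collection's full-text index
    @TextIndexed
    private String model;

    // Creates a hashed index, useful, for example, for hash-based sharding
    @HashIndexed
    private String manufacturer;
}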
Compound indexes can help improve the performance of queries that use criteria on multiple fields. They are defined at the class level. Let’s create a compound index on implantSerialNumber in ascending order and timestamp in descending order for ImplantMonitoringLog:
@Document(collection = "implant_logs")
@CompoundIndex(name = "implant_ts_idx",
def = "{'implantSerialNumber': 1, 'timestamp': -1}")
public class ImplantMonitoringLog {
}
Geospatial queries are an exciting MongoDB feature that enables you to find documents within a given distance. We can’t pass this one by, so let’s see it in action!
Let’s add one more field to our ImplantMonitoringLog that defines the location where the log was created:
@GeoSpatialIndexed(type = GeoSpatialIndexType.GEO_2DSPHERE)
private Point location;
Here,
- Point is a class from org.springframework.data.geo;
- the @GeoSpatialIndexed(type = GeoSpatialIndexType.GEO_2DSPHERE) annotation creates an index of type GEO_2DSPHERE that enforces usage of the $nearSphere operator when fetching the data. This operator takes into account the curvature of the Earth and performs a spherical, geodesic distance calculation. It enables you to search for documents within a given radius and work with realistic distances on a globe.
After adding this annotation, we can create repository methods for fetching ImplantMonitoringLog documents within a certain distance. I’ll show you how to do that in the next section.
Another interesting set of annotations is the audit annotations @CreatedDate and @LastModifiedDate. They help keep track of the document lifecycle, which is useful for maintaining data traceability and enabling analytics or data versioning.
Let’s add these annotations to our Civilian and ImplantMonitoringLog classes:
@Document(collection = "civilians")
public class Civilian {
@CreatedDate
private final LocalDateTime registeredInSystemAt;
}
@Document(collection = "implant_logs")
public class ImplantMonitoringLog {
@CreatedDate
private final LocalDateTime timestamp;
}
Our models are ready, moving on to setting up repositories!
Creating Mongo Repositories
Spring Data MongoDB provides the MongoRepository interface, which, in turn, extends CrudRepository, QueryByExampleExecutor, and PagingAndSortingRepository.
Therefore, it provides basic methods for CRUD operations on the entities, paging and sorting capabilities, and MongoDB-specific methods.
Let’s create two repository interfaces annotated with @Repository: CivilianRepository and ImplantMonitoringLogRepository. Make them extend MongoRepository and specify the entity class and id type:
@Repository
public interface CivilianRepository extends MongoRepository<Civilian, String> { }
@Repository
public interface ImplantMonitoringLogRepository extends MongoRepository<ImplantMonitoringLog, String> { }
At this point, we can already perform CRUD operations with methods provided internally by Spring repositories.
In addition, we can use queries by convention. This feature enables developers to define queries through method names that follow an established pattern.
For instance, we can define a method for searching for a civilian by nationalId as follows:
@Repository
public interface CivilianRepository extends MongoRepository<Civilian, String> {
Optional<Civilian> findByNationalId(String nationalId);
}
We can also define methods that return a List of ImplantMonitoringLog objects by implant serial number and a timestamp after or between specified dates:
@Repository
public interface ImplantMonitoringLogRepository extends MongoRepository<ImplantMonitoringLog, String> {
List<ImplantMonitoringLog> findByImplantSerialNumber(String implantSerialNumber);
List<ImplantMonitoringLog> findByImplantSerialNumberAndTimestampAfter(String implantSerialNumber,
LocalDateTime timestamp);
List<ImplantMonitoringLog> findByImplantSerialNumberAndTimestampBetween(String implantSerialNumber,
LocalDateTime timestampFrom,
LocalDateTime timestampTo);
}
Spring automatically parses these methods and generates the corresponding MongoDB queries.
Remember we enabled geospatial queries with MongoDB? Let’s add a corresponding method to the ImplantMonitoringLog repository:
List<ImplantMonitoringLog> findByLocationNear(Point point, Distance distance);
The Near keyword in the method name enables you to fetch all ImplantMonitoringLog documents within a given distance from the specified point.
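Calling it could look like this (a quick sketch, assuming an injected ImplantMonitoringLogRepository named logRepository; Point, Distance, and Metrics come from org.springframework.data.geo, and Point takes longitude first, then latitude):

// All logs created within 2 km of central Amsterdam
Point center = new Point(4.899, 52.372);
List<ImplantMonitoringLog> nearby = logRepository.findByLocationNear(center, new Distance(2, Metrics.KILOMETERS));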
Of course, this method is for demonstration purposes only: in our application, there could be thousands of logs, which may seriously affect query performance. What you can do here is:
- Specify additional parameters in the method to limit the number of search results. For instance, you can include a time window:
List<ImplantMonitoringLog> findByLocationNearAndTimestampBetween(
Point point, Distance distance, LocalDateTime from, LocalDateTime to);
- Create an aggregation pipeline to filter and sort fetched documents.
For such complicated queries, or for more fine-grained control, we can use the @Query annotation or MongoTemplate. We will discuss these approaches further on in the article.
For now, let’s see how to perform basic CRUD operations with MongoDB.
Basic CRUD Operations with MongoDB
Create the CivilianService class and annotate it with @Service. This class will be responsible for communicating with the data access layer:
@Service
public class CivilianService {
private final CivilianRepository civilianRepository;
public CivilianService(CivilianRepository civilianRepository) {
this.civilianRepository = civilianRepository;
}
}
We can save a new Civilian document using the save() or insert() method.
The save() method will insert a new document if it doesn't exist, or update the existing one if the id matches. Therefore, it can also be used for updating an entity.
public Civilian saveCivilian(Civilian civilian) {
return civilianRepository.save(civilian);
}
public Civilian updateCivilian(String id, Implant implant) {
Civilian civilian = civilianRepository.findById(id).orElseThrow();
civilian.getImplants().add(implant);
return civilianRepository.save(civilian);
}
On the other hand, insert() only adds a new document and will fail if the document already exists. The method can only be used for creating documents, but at the same time, it protects against accidental overwrites:
public Civilian saveCivilian(Civilian civilian) {
return civilianRepository.insert(civilian);
}
To find civilians, we can use the built-in methods and the ones we defined in the interface:
public Civilian getCivilianById(String id) {
return civilianRepository.findById(id).orElseThrow();
}
public Civilian getCivilianByNationalId(String nationalId) {
return civilianRepository.findByNationalId(nationalId).orElseThrow();
}
public List<Civilian> getAllCivilians() {
return civilianRepository.findAll();
}
Finally, we can delete civilians using the built-in repository methods:
public void deleteCivilian(Civilian civilian) {
civilianRepository.delete(civilian);
}
public void deleteAllCivilians() {
civilianRepository.deleteAll();
}
For ImplantMonitoringLog, the process is similar.
This is all very well, but let’s see how MongoDB shines in more complex querying cases.
Using MongoTemplate and @Query for Custom Logic
Sometimes, the method naming approach is not enough. You may want to query array elements, create complex filters, or avoid unreadable method names.
If you want to go beyond the MongoRepository capabilities and define custom data querying and processing, you can use MongoTemplate or the @Query annotation.
The @Query annotation allows you to define custom MongoDB queries directly in repository interfaces. Using it is straightforward: you annotate a relevant repository method with @Query and write a query using the BSON-based MongoDB query language.
For instance, suppose we want to find all civilians with implants whose lot number is greater than or equal to N:
@Query("{ 'implants': { $elemMatch: { 'lotNumber': { $gte: ?0 } } } }")
List<Civilian> findAllByImplantLotNumberGreaterThanEqual(int lotNumber);
Here,
- implants refers to the embedded implants array inside Civilian;
- $elemMatch filters for objects in the array that match the condition;
- $gte: ?0 checks for a lotNumber greater than or equal to the first method parameter.
Refer to the official documentation for more details on the syntax.
What about MongoTemplate?
MongoTemplate is an API that provides an abstraction layer over MongoDB operations. It enables you to create complex queries and aggregations without leaving the Spring programming model.
When using MongoTemplate, we work with Query, Criteria, and Aggregation objects to define queries programmatically. Then, MongoTemplate translates these objects into BSON query documents.
How do we use MongoTemplate? Well, we need to perform a few additional steps to enjoy programmatic querying.
First, we need to define a custom repository interface, say, CivilianRepositoryCustom, where we declare custom query methods:
public interface CivilianRepositoryCustom {
List<Civilian> findAllByImplantLotNumberGreaterThanEqual(int lotNumber);
}
After that, create a CivilianRepositoryCustomImpl class; the ‘Impl’ suffix is super important, as Spring Data uses it to locate the custom implementation. This class will implement CivilianRepositoryCustom (also make CivilianRepository extend the custom interface so that these methods are available through it). Annotate the class with @Repository, inject the MongoTemplate bean, and override the interface methods:
@Repository
public class CivilianRepositoryCustomImpl implements CivilianRepositoryCustom {
private final MongoTemplate mongoTemplate;
public CivilianRepositoryCustomImpl(MongoTemplate mongoTemplate) {
this.mongoTemplate = mongoTemplate;
}
@Override
public List<Civilian> findAllByImplantLotNumberGreaterThanEqual(int lotNumber) {
return null;
}
}
Alright, the most interesting part begins here!
We need to create a new Query object inside this method. The Query accepts a Criteria, which builds the required condition: in our case, we search the implants array for an implant whose lot number is greater than or equal to the passed parameter. Finally, we pass this Query to MongoTemplate, which runs it with its find(...) method and returns all matching civilians from the database.
The code looks like this:
@Override
public List<Civilian> findAllByImplantLotNumberGreaterThanEqual(int lotNumber) {
Query query = new Query(Criteria.where("implants.lotNumber").gte(lotNumber));
return mongoTemplate.find(query, Civilian.class);
}
The explanation is longer than the actual implementation 🙂
You can also write type-safe queries. One approach is to use the FluentMongo wrapper.
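Independently of any wrapper, Spring Data’s own fluent MongoTemplate API also gives you a more readable, type-aware way to run the same query; a minimal sketch:

// The same lot-number query expressed with the fluent template API
List<Civilian> result = mongoTemplate.query(Civilian.class)
        .matching(new Query(Criteria.where("implants.lotNumber").gte(lotNumber)))
        .all();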
As we used the same method for both examples, you can now compare the @Query-based approach with MongoTemplate.
The question is, when to use which? Both methods are suitable for writing complex custom queries, so it is a matter of taste, really.
Later on in the article we will also compare both approaches when writing aggregation pipelines.
Up next: Projections and DTOs!
Projections and DTOs
MongoDB projection is the process of retrieving only those fields of the document that were specified instead of fetching the entire document. Projections help to reduce network traffic related to over-fetching and can protect against accidental data exposure.
For instance, say we want to retrieve only the legalName and nationalId of civilians. We can specify these fields using the @Query annotation:
@Query(value = "{}", fields = "{ legalName : 1, nationalId : 1 }")
List<Civilian> findAllLegalNamesAndIds();
This method returns a list of Civilian objects with data in the legalName and nationalId fields only; all the other fields will be null.
A more sophisticated approach is to use interface-based projections. In this case, you define an interface specifying the fields you need, and Spring will automatically map query results to this projection.
The example above can be adjusted as follows. Create an interface CivilianSummary with two getter methods, for legalName and nationalId:
public interface CivilianSummary {
String getLegalName();
String getNationalId();
}
Now, define a method in the CivilianRepository:
List<CivilianSummary> findAllByUnderSurveillance(boolean underSurveillance);
As a result, Spring Data will return CivilianSummary projections directly from the database.
Interface-based projections bear a striking resemblance to DTOs, wouldn't you agree? Actually, they are not the same.
DTOs or class-based projections are classes that you write manually, with fields, constructors, or even computed fields and custom logic. DTOs are more flexible as they let you define custom logic and easily handle deeply nested objects.
On the other hand, interface-based projections are interfaces with getters for required fields. They are most suitable when you don’t need any custom logic in the resulting object. They are also associated with smaller overhead than DTOs because only the requested fields are fetched.
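For comparison, a class-based (DTO) projection can also be returned straight from a derived query; a small sketch, assuming we only need the names of civilians with a criminal record (the record and method below are illustrative and not part of the demo repository):

public record CivilianName(String legalName, String nationalId) {
}

// Added to CivilianRepository: Spring Data instantiates the record from the matching document fields
List<CivilianName> findAllByCriminalRecord(boolean criminalRecord);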
Good news is that we can combine DTOs with projections and get the best of two worlds!
Suppose we want to gather statistics on implant performance. We must calculate the average indicators for power usage, CPU usage, and neural latency over a given period of time, so it would be better to create a DTO with the corresponding fields.
So, let’s create a record MonitoringStats:
public record MonitoringStats(String implantSerialNumber,
double avgPowerUsageUw,
double avgCpuUsagePct,
double avgNeuralLatencyMs) {
}
We can now use it to hold the data we fetched and calculated.
There’s just a small nuisance: to perform such a task, we need to master another powerful MongoDB tool, which is aggregations.
So, follow me to the next section!
Aggregations with Spring Data MongoDB
Aggregation in MongoDB is the process of running a series of operations, like filtering, grouping, or transforming data, against a collection of documents directly on the database side.
Aggregations are extremely useful when you need to perform analytics, gather statistics, or craft reports because they help to reduce data transfer and client-side computations.
You can use out-of-the-box single-purpose aggregations like distinct() or countDocuments(). In case you need to perform more complex computations, you can build your own aggregation pipeline.
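For instance, with MongoTemplate the single-purpose operations look roughly like this (a sketch against the collections from our demo):

// How many civilians are currently under surveillance?
long underSurveillance = mongoTemplate.count(new Query(Criteria.where("underSurveillance").is(true)), Civilian.class);

// Distinct manufacturers across all embedded implants
List<String> manufacturers = mongoTemplate.findDistinct(new Query(), "implants.manufacturer", Civilian.class, String.class);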
An aggregation pipeline is a series of operations performed on a set of data. You can create aggregation pipelines:
- declaratively, in a repository interface, using the @Aggregation annotation (a short sketch follows below), or
- programmatically, with the help of MongoTemplate and Criteria.
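Here is what the declarative variant can look like; a minimal sketch, assuming we only want the average power usage of one implant (the annotation is the repository-level @Aggregation from org.springframework.data.mongodb.repository, and the pipeline stages are plain MongoDB JSON):

// In ImplantMonitoringLogRepository: average power usage of one implant, computed by the database
@Aggregation(pipeline = {
        "{ $match: { implantSerialNumber: ?0 } }",
        "{ $group: { _id: null, total: { $avg: '$powerUsageUw' } } }"
})
Double averagePowerUsage(String implantSerialNumber);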
In our small demo, we already have two perfect use cases for an aggregation pipeline:
- We want to filter ImplantMonitoringLogs by time and distance, group them by implantSerialNumber, and transform them into a Map where implantSerialNumber is the key and a List of ImplantMonitoringLogs is the value.
- We want to calculate average values for implant performance metrics for a given implantSerialNumber and a time window, and return a MonitoringStats DTO as the query result.
Well, what are we waiting for? Let’s get on to the tasks at hand!
First, let’s prepare the ground. We need to create the ImplantMonitoringLogRepositoryCustom interface with custom methods aggregateStats() and findLogsByAreaAndTimeGrouped():
public interface ImplantMonitoringLogRepositoryCustom {
MonitoringStats aggregateStats(String serialNumber, LocalDateTime from, LocalDateTime to);
Map<String, List<ImplantMonitoringLog>> findLogsByAreaAndTimeGrouped(
Point center, double maxDistanceMeters, LocalDateTime from, LocalDateTime to);
}
Don’t forget to make the ImplantMonitoringLogRepository extend the new interface:
public interface ImplantMonitoringLogRepository extends MongoRepository<ImplantMonitoringLog, String>, ImplantMonitoringLogRepositoryCustom {
}
Next, create an ImplantMonitoringLogRepositoryCustomImpl class that implements the ImplantMonitoringLogRepositoryCustom interface and overrides its methods. In addition, inject the MongoTemplate bean:
@Repository
public class ImplantMonitoringLogRepositoryCustomImpl implements ImplantMonitoringLogRepositoryCustom {
private final MongoTemplate mongoTemplate;
public ImplantMonitoringLogRepositoryCustomImpl(MongoTemplate mongoTemplate) {
this.mongoTemplate = mongoTemplate;
}
}
Shall we start with the aggregateStats() method?
Once again, what we need to do is:
- Filter the logs and fetch only the documents that match the given criteria: implant serial number and a time window;
- Calculate the average of certain fields: power usage, CPU usage, and neural latency;
- Create a projection that will contain only four fields: the implant serial number and the calculated metrics;
- Return a MonitoringStats object from the database.
For the first operation, we need the MatchOperation class that will hold the given Criteria:
MatchOperation match = Aggregation.match(Criteria.where("implantSerialNumber").is(serialNumber)
.and("timestamp").gte(from).lte(to));
To calculate the averages, we need the GroupOperation class. Here, we group logs by implantSerialNumber, calculate the average for each group, and give a new name to each resulting metric:
GroupOperation group = Aggregation.group("implantSerialNumber")
.avg("powerUsageUw").as("avgPowerUsageUw")
.avg("cpuUsagePct").as("avgCpuUsagePct")
.avg("neuralLatencyMs").as("avgNeuralLatencyMs");
Next, we need the ProjectionOperation class to rename _id to implantSerialNumber and round each average metric to 2 decimal places:
ProjectionOperation project = Aggregation.project().and("_id").as("implantSerialNumber")
.and(ArithmeticOperators.Round.roundValueOf("avgPowerUsageUw").place(2)).as("avgPowerUsageUw")
.and(ArithmeticOperators.Round.roundValueOf("avgCpuUsagePct").place(2)).as("avgCpuUsagePct")
.and(ArithmeticOperators.Round.roundValueOf("avgNeuralLatencyMs").place(2)).as("avgNeuralLatencyMs");
Finally, we combine all three stages into a single aggregation pipeline and ‘feed’ it to MongoTemplate, which executes the pipeline against the implant_logs collection and maps the result to a MonitoringStats object.
Aggregation aggregation = Aggregation.newAggregation(match, group, project);
AggregationResults<MonitoringStats> results = mongoTemplate.aggregate(
aggregation, "implant_logs", MonitoringStats.class);
Full method implementation:
@Override
public MonitoringStats aggregateStats(String serialNumber, LocalDateTime from, LocalDateTime to) {
MatchOperation match = Aggregation.match(Criteria.where("implantSerialNumber").is(serialNumber)
.and("timestamp").gte(from).lte(to));
GroupOperation group = Aggregation.group("implantSerialNumber")
.avg("powerUsageUw").as("avgPowerUsageUw")
.avg("cpuUsagePct").as("avgCpuUsagePct")
.avg("neuralLatencyMs").as("avgNeuralLatencyMs");
ProjectionOperation project = Aggregation.project()
.and("_id").as("implantSerialNumber")
.and(ArithmeticOperators.Round.roundValueOf("avgPowerUsageUw").place(2)).as("avgPowerUsageUw")
.and(ArithmeticOperators.Round.roundValueOf("avgCpuUsagePct").place(2)).as("avgCpuUsagePct")
.and(ArithmeticOperators.Round.roundValueOf("avgNeuralLatencyMs").place(2)).as("avgNeuralLatencyMs");
Aggregation aggregation = Aggregation.newAggregation(match, group, project);
AggregationResults<MonitoringStats> results = mongoTemplate.aggregate(
aggregation, "implant_logs", MonitoringStats.class);
return results.getUniqueMappedResult();
}
This aggregation pipeline is ready, one more to go!
A quick reminder: we want to find all ImplantMonitoringLogs within a given radius and time window, group them by implant serial number, and return a map with implantSerialNumber as the key and List<ImplantMonitoringLog> as the value.
The first operation is already familiar to us: we use MatchOperation to filter logs by distance and time:
MatchOperation match = Aggregation.match(
Criteria.where("location").nearSphere(center)
.maxDistance(maxDistanceMeters)
.and("timestamp").gte(from).lte(to));
Then, we use GroupOperation to group logs by implantSerialNumber and push each matching document into the logs array for that group:
GroupOperation group = Aggregation.group("implantSerialNumber")
.push(Aggregation.ROOT).as("logs");
After that, we create the aggregation pipeline and execute it with MongoTemplate, which returns the raw results as Document objects:
Aggregation aggregation = Aggregation.newAggregation(match, group);
AggregationResults<Document> results = mongoTemplate.aggregate(
aggregation, "implant_logs", Document.class);
The next step is to iterate over each Document and get the implantSerialNumber and the list of log Documents. These logs can then be converted to the ImplantMonitoringLog class using the mongoTemplate.getConverter() method. Finally, we put a new entry into the Map, with implantSerialNumber as the key and List<ImplantMonitoringLog> as the value:
Map<String, List<ImplantMonitoringLog>> grouped = new HashMap<>();
for (Document doc : results.getMappedResults()) {
String serialNumber = doc.getString("_id");
List<Document> logsDocs = (List<Document>) doc.get("logs");
List<ImplantMonitoringLog> logs = logsDocs.stream()
.map(d -> mongoTemplate.getConverter()
.read(ImplantMonitoringLog.class, d))
.toList();
grouped.put(serialNumber, logs);
}
Full method implementation:
@Override
public Map<String, List<ImplantMonitoringLog>> findLogsByAreaAndTimeGrouped(Point center,
double maxDistanceMeters,
LocalDateTime from,
LocalDateTime to) {
MatchOperation match = Aggregation.match(
Criteria.where("location").nearSphere(center)
.maxDistance(maxDistanceMeters)
.and("timestamp").gte(from).lte(to));
GroupOperation group = Aggregation.group("implantSerialNumber")
.push(Aggregation.ROOT).as("logs");
Aggregation aggregation = Aggregation.newAggregation(match, group);
AggregationResults<Document> results = mongoTemplate.aggregate(
aggregation, "implant_logs", Document.class);
Map<String, List<ImplantMonitoringLog>> grouped = new HashMap<>();
for (Document doc : results.getMappedResults()) {
String serialNumber = doc.getString("_id");
List<Document> logsDocs = (List<Document>) doc.get("logs");
List<ImplantMonitoringLog> logs = logsDocs.stream()
.map(d -> mongoTemplate.getConverter()
.read(ImplantMonitoringLog.class, d))
.toList();
grouped.put(serialNumber, logs);
}
return grouped;
}
That was awesome! You can now proceed to writing the Controllers if you want to build a web application and take advantage of the logic we have written.
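If you do go that route, a controller is a thin layer on top of the service; a rough sketch, assuming you add spring-boot-starter-web and reuse the CivilianService from earlier (the paths and method names are illustrative):

@RestController
@RequestMapping("/api/civilians")
public class CivilianController {
    private final CivilianService civilianService;

    public CivilianController(CivilianService civilianService) {
        this.civilianService = civilianService;
    }

    @GetMapping("/{id}")
    public Civilian getById(@PathVariable String id) {
        return civilianService.getCivilianById(id);
    }

    @PostMapping
    public Civilian create(@RequestBody Civilian civilian) {
        return civilianService.saveCivilian(civilian);
    }
}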
As for this tutorial, there are two more topics left to discuss: testing and database migrations.
MongoDB Migrations with Mongock
Why use a migration tool with NoSQL
Using a database migration tool such as Liquibase or Flyway with SQL databases is justified, as we need to write the schema and update it explicitly as needed. With MongoDB, the schema is generated automatically, and changes to the @Document class fields can be applied with a single write command.
So, why bother with a migration tool?
Even if MongoDB doesn’t enforce a rigid schema, your application needs one. The indexes, validation rules, and conventions must be correct and consistent across dev, CI, and prod. However, relying on an auto-created schema may lead to:
- Missing indexes resulting in slow queries,
- Incomplete validation rules leading to corrupted documents.
You might also want to modify the data, which, in the case of automatic updates, may result in mismatching schema versions in production.
Therefore, a migration tool enables you to create version-controlled and idempotent change sets so that every environment starts with the same collections, indexes, and validations.
What is Mongock
Mongock is an open-source Java-based migration tool for NoSQL databases. It offers a code-first approach to schema generation, meaning that you can write migration scripts in Java/Kotlin and ship them with your app. With Mongock, you can:
- Version changelogs,
- Create indexes and validation rules,
- Be sure of idempotent execution of change sets,
- Split documents,
- Seed sample data.
Mongock is natively compatible with Spring/Spring Boot, so adding it to your app is just a matter of two dependencies and one annotation.
Why not use the familiar tools such as Liquibase or Flyway?
Liquibase/Flyway are tailored to relational databases. They have very limited and/or experimental support for MongoDB and BSON and don’t work with many Mongo-specific features such as geo-indexes.
Therefore, Liquibase and Flyway remain gold standards for relational databases, whereas Mongock is a perfect fit for MongoDB and other non-relational DBs.
Setting Up Mongock with Spring Boot
Let’s add the dependencies for the Mongock Spring Boot runner and the Spring Data MongoDB driver to the pom.xml (their versions can be managed by importing the Mongock BOM in dependencyManagement or by specifying them explicitly):
<dependency>
<groupId>io.mongock</groupId>
<artifactId>mongock-springboot</artifactId>
</dependency>
<dependency>
<groupId>io.mongock</groupId>
<artifactId>mongodb-springdata-v4-driver</artifactId>
</dependency>
Now, add the @EnableMongock annotation to the main application class. This annotation triggers the Mongock runner upon application start to run the migrations:
@EnableMongock
@SpringBootApplication
@EnableMongoRepositories
public class MongodbDemoApp {
public static void main(String[] args) {
SpringApplication.run(MongodbDemoApp.class, args);
}
}
Finally, add a new property to application.properties pointing to the location of changelog files:
mongock.migration-scan-package=dev.cyberjar.migration
It is important to note that if you want to create the schema manually with Mongock, you have to remove all @Indexed annotations from the classes; otherwise Spring Data will attempt to create the indexes automatically, and the migration will fail with an error saying that such an index already exists.
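If you go down that road in our demo, it also makes sense to turn off the automatic index creation we enabled earlier, so that index management is left entirely to the migration (an assumption of this setup, adjust it to your needs):

spring.data.mongodb.auto-index-creation=false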
Create Changelogs and Apply Updates
Note that this guide is applicable to Mongock version 5.x — some methods and flows were changed as compared to the previous major Mongock version.
Create the SchemaDataInitializerChangeUnit class. Annotate it with @ChangeUnit and specify:
- the id, which will be stored in the ChangeUnit history collection,
- the execution order,
- the author (optional).
@ChangeUnit(id = "schema-and-test-data", order = "001", author = "cyberjar")
public class SchemaDataInitializerChangeUnit {
private final MongoTemplate mongoTemplate;
public SchemaDataInitializerChangeUnit(MongoTemplate mongoTemplate) {
this.mongoTemplate = mongoTemplate;
}
}
First, let’s take care of creating the collections and indices.
ChangeUnit classes can contain methods annotated with:
- @BeforeExecution (optional) for executing operations such as DDL before the actual migration,
- @RollbackBeforeExecution (obligatory if @BeforeExecution is present) for reverting the changes made in the @BeforeExecution method,
- @Execution for the main migration method,
- @RollbackExecution for reverting changes made in the execution method.
The creation of collections and indexes is a DDL operation and should be performed in the @BeforeExecution method.
First, let’s create the collections using the createCollection() method of MongoTemplate:
@BeforeExecution
public void beforeExecution() {
mongoTemplate.createCollection("civilians");
mongoTemplate.createCollection("implant_logs");
}
In the same method, create the indices. For that, we need an IndexOperations object that MongoTemplate binds to a specific collection. Using this object, we can create indices for that collection:
IndexOperations civilianOps = mongoTemplate.indexOps("civilians");
civilianOps.createIndex(
new Index().on("nationalId", Sort.Direction.ASC).unique());
Note that the new Index accepts the field name and sorting direction. You can also specify additional properties of the index, for instance, whether it is unique or not.
In a similar way, let’s create the indices for the implants. As we don’t have a separate collection for them, we bind the index operations to the class:
IndexOperations implantOps = mongoTemplate.indexOps(Implant.class);
implantOps.createIndex(
new Index().on("serialNumber", Sort.Direction.ASC).unique());
implantOps.createIndex(
new Index().on("lotNumber", Sort.Direction.ASC));
As for the ImplantMonitoringLogs, we need to create a GeospatialIndex instead of the standard Index for the location field and specify its type.
We can also make the timestamp index a TTL index. MongoDB uses such indices to automatically remove documents from the collection after a specified amount of time. In the case of logs, a TTL index is especially beneficial because otherwise the number of logs in the collection may grow beyond sensible limits.
IndexOperations logOps = mongoTemplate.indexOps("implant_logs");
logOps.createIndex(
new Index().on("implantSerialNumber", Sort.Direction.ASC));
logOps.createIndex(
new Index().on("timestamp", Sort.Direction.DESC));
logOps.createIndex(
new GeospatialIndex("location")
.typed(GeoSpatialIndexType.GEO_2DSPHERE));
logOps.createIndex(
new Index()
.on("timestamp", Sort.Direction.ASC)
.expire(Duration.ofDays(90)));
We also need a @RollbackBeforeExecution method in case something goes wrong during schema creation:
@RollbackBeforeExecution
public void rollbackBeforeExecution() {
mongoTemplate.dropCollection("civilians");
mongoTemplate.dropCollection("implant_logs");
}
We can now move on to the @Execution method to seed some test data into our database:
@Execution
public void seedDatabase(MongoTemplate mongoTemplate) {
List<Implant> implants = new ArrayList<>();
implants.add(new Implant("limb", "Model-Dvb688", "2.2", "MechaMed", 536, "742669", "2025-03-21"));
implants.add(new Implant("ocular", "Model-SiT679", "1.5", "MechaMed", 434, "306310", "2025-06-08"));
implants.add(new Implant("limb", "Model-Jtv413", "1.3", "MechaMed", 536, "470917", "2025-04-03"));
List<Civilian> civilians = new ArrayList<>();
civilians.add(new Civilian(null, "Aarav Das", "Ni-96751543-BP", "1965-05-02", true, false, List.of(implants.get(0))));
civilians.add(new Civilian(null, "Paula Lin", "NP-59909166-Wg", "1998-11-01", false, false, List.of(implants.get(1))));
civilians.add(new Civilian(null, "Aelita Fang", "gQ-01247486-nk", "1989-12-01", true, false, List.of(implants.get(2))));
mongoTemplate.insert(civilians, Civilian.class);
String implantSerialNum = implants.getFirst().getSerialNumber();
String civilianNationalId = civilians.getFirst().getNationalId();
double powerUsage = 1.5;
double cpuUsage = 1.0;
double neuralLatency = 0.5;
List<ImplantMonitoringLog> logs = new ArrayList<>();
for (int i = 0; i < 30; i++) {
ImplantMonitoringLog implantMonitoringLog = new ImplantMonitoringLog(null, implantSerialNum, civilianNationalId, LocalDateTime.now().minusHours(i), powerUsage + i, cpuUsage + i, neuralLatency + i, new Point(4.899, 52.372)); //Coordinates for Amsterdam longitude/latitude
logs.add(implantMonitoringLog);
}
mongoTemplate.insert(logs, ImplantMonitoringLog.class);
}
Finally, the @RollbackExecution method will revert the changes made by the execution method if necessary. You can delete all documents from the collections or simply drop the collections:
@RollbackExecution
public void rollbackExecution() {
mongoTemplate.dropCollection("civilians");
mongoTemplate.dropCollection("implant_logs");
}
Testing MongoDB Applications with @DataMongoTest and Testcontainers
Our small demo app is ready, it’s time to test it!
Testing Data Layer with @DataMongoTest and Testcontainers
Spring Boot provides the @DataMongoTest annotation for testing the data layer of the application without the full auto-configuration. When applied at the test class level, it searches for MongoDB-specific beans like repositories and documents and configures only them, leaving services, controllers, etc. out of the equation.
By default, tests annotated with @DataMongoTest use an embedded MongoDB database. But we will override this behavior and use dockerized MongoDB, which we will spin up with the help of Testcontainers.
Why would we want to do that? Unlike an embedded database, Testcontainers provides an actual database instance in a Docker container, meaning that your tests run against a realistic, production-like environment.
We already added Testcontainers support when we created the project. We need to add one more dependency, for JUnit integration. All three dependencies:
<dependency>
<groupId>org.testcontainers</groupId>
<artifactId>junit-jupiter</artifactId>
<version>1.21.0</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.testcontainers</groupId>
<artifactId>mongodb</artifactId>
<version>1.21.0</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-testcontainers</artifactId>
<scope>test</scope>
</dependency>
Now, let’s create a test class for CivilianRepository and annotate it with @Testcontainers, which delegates the lifecycle of containers to Testcontainers, and with @DataMongoTest:
@Testcontainers
@DataMongoTest
class CivilianRepositoryTest {
}
Now, we need to create an instance of MongoDBContainer using the specified Docker image. The @ServiceConnection annotation allows the MongoDB-related beans to communicate with MongoDB inside the Docker container.
Also, autowire the CivilianRepository bean that we will test and the MongoTemplate bean that will be responsible for adding test data to the database.
@Testcontainers
@DataMongoTest
class CivilianRepositoryTest {
@Container
@ServiceConnection
static MongoDBContainer mongoDBContainer = new MongoDBContainer("mongo");
@Autowired
private CivilianRepository repository;
@Autowired
private MongoTemplate mongoTemplate;
}
The final step is to disable Mongock for this set of tests. You can and should test migrations separately, but in other tests, Mongock is not necessary and will only complicate the setup.
Create the test.properties file and add one property to disable Mongock:
mongock.enabled=false
Now, specify the path to the test.properties file with the class-level @TestPropertySource annotation:
@Testcontainers
@DataMongoTest
@TestPropertySource(locations = "classpath:test.properties")
class CivilianRepositoryTest {
}
Excellent. Now, let’s add some sample data to the containerized database.
The best practice is to isolate tests from one another so that they don’t interfere with each other’s results. To achieve that, we can insert fresh data before each test and clean up the database after each test, using the @BeforeEach and @AfterEach annotations:
@BeforeEach
void populateWithData() {
mongoTemplate.createCollection("civilians");
List<Implant> implants = new ArrayList<>();
implants.add(new Implant("limb", "Model-Dvb688", "2.2", "MechaMed", 536, "742669", "2025-03-21"));
implants.add(new Implant("ocular", "Model-SiT679", "1.5", "MechaMed", 434, "306310", "2025-06-08"));
implants.add(new Implant("limb", "Model-Jtv413", "1.3", "MechaMed", 536, "470917", "2025-04-03"));
List<Civilian> civilians = new ArrayList<>();
civilians.add(new Civilian(null, "Rin Morse", "fI-88901036-kD", "1985-08-01", true, false, List.of(implants.get(0))));
civilians.add(new Civilian(null, "Heather Huang", "YD-99086969-CP", "1994-04-16", false, true, List.of(implants.get(1))));
civilians.add(new Civilian(null, "Amir Morgan", "MP-66879496-vg", "1975-06-26", false, true, List.of(implants.get(2))));
mongoTemplate.insert(civilians, Civilian.class);
}
@AfterEach
void cleanUp() {
mongoTemplate.dropCollection("civilians");
}
In the example above, we added the data manually. Another approach is to create a JSON file with all the required data and add a Jackson2RepositoryPopulatorFactoryBean to a configuration class. This bean will populate the database with data when the application context starts, provided we import the configuration into the test class:
@Configuration
public class PopulatorConfig {
@Bean
public Jackson2RepositoryPopulatorFactoryBean populator() {
var bean = new Jackson2RepositoryPopulatorFactoryBean();
bean.setResources(new Resource[] {
new ClassPathResource("mydata.json")
});
return bean;
}
}
@DataMongoTest
@Import(PopulatorConfig.class)
class DataMongoTestWithJson {
// Repos and tests as needed
}
Finally, we can use the familiar flow to write some tests:
@Test
void shouldFindCivilianByNationalId() {
Optional<Civilian> civilian = repository.findByNationalId("fI-88901036-kD");
String name = "Rin Morse";
assertEquals(name, civilian.get().getLegalName());
}
@Test
void shouldFindCiviliansByLotNumber() {
List<Civilian> civilians = repository.findAllByImplantLotNumber(536);
int expected = 2;
assertEquals(expected, civilians.size());
}
Integration Testing with Testcontainers
After we have tested all data classes in isolation, we can move on to integration testing. Integration testing is aimed at verifying that different parts of the application work correctly together.
Let’s create a test class for our ImplantMonitoringLogService. We need the @Testcontainers annotation, and the @SpringBootTest annotation instead of @DataMongoTest, to load the whole application context.
The container setup can be copied from the previous test class:
@Testcontainers
@SpringBootTest(classes = MongodbDemoApp.class)
@TestPropertySource(locations = "classpath:test.properties")
class ImplantMonitoringLogServiceTest {
@Container
@ServiceConnection
static MongoDBContainer mongoDBContainer = new MongoDBContainer("mongo");
@Autowired
private ImplantMonitoringLogService monitoringLogService;
@Autowired
private MongoTemplate mongoTemplate;
}
Let’s add some test data to the database. Again, we are populating and dropping the database for each test:
@BeforeEach
void populateWithData() {
mongoTemplate.createCollection("implant_logs");
String implantSerialNum = "123456qw";
String civilianNationalId = "rtfg5674-98";
double powerUsage = 1.5;
double cpuUsage = 1.0;
double neuralLatency = 0.5;
List<ImplantMonitoringLog> logs = new ArrayList<>();
for (int i = 0; i < 30; i++) {
ImplantMonitoringLog implantMonitoringLog = new ImplantMonitoringLog(null,
implantSerialNum, civilianNationalId,
LocalDateTime.now().minusHours(i),
powerUsage + i,
cpuUsage + i,
neuralLatency + i,
new Point(4.899, 52.372)); //Coordinates for Amsterdam
logs.add(implantMonitoringLog);
}
mongoTemplate.insert(logs, ImplantMonitoringLog.class);
}
@AfterEach
void cleanUp() {
mongoTemplate.dropCollection("implant_logs");
}
Finally, you can write corresponding methods to test the application logic:
@Test
void shouldGatherStatsForImplantLogs() {
MonitoringStats stats = monitoringLogService.aggregateStatsForImplantForPeriod(
"123456qw",
LocalDateTime.now().minusDays(7),
LocalDateTime.now()
);
double expectedAvgPowerUsage = 16.0;
assertEquals(expectedAvgPowerUsage, stats.avgPowerUsageUw());
}
MongoDB Alternatives
MongoDB is a powerful and flexible NoSQL solution, but it is not the only one on the market. Here are some MongoDB alternatives that may fit your needs better:
- Couchbase is a distributed NoSQL database platform that offers built-in caching for lower latency and an SQL-like query language (N1QL) for JSON documents.
- Apache Cassandra is a highly scalable NoSQL solution designed for high availability and big data management. It boasts multi-datacenter support and fault tolerance by default.
- Amazon DynamoDB is a fully-managed NoSQL solution that is tightly integrated with the AWS ecosystem.
I would recommend studying carefully what each solution offers and trying them out to find which one fits your needs best.
Conclusion
A quick summary? If you’ve read this far, you can now:
- Set up MongoDB with Spring Boot and perform CRUD operations
- Use the @Query annotation and MongoTemplate for more complex business logic
- Use MongoDB-specific annotations for a robust and efficient schema
- Use projections and build aggregation pipelines
- Write migration scripts with Mongock
- Perform data layer and integration testing of MongoDB apps with @DataMongoTest and Testcontainers
If you’d like to deepen your knowledge of MongoDB, you can further explore the official docs.
And of course, subscribe to our newsletter for a deep dive into other cool technologies around Java