JDK Flight Recorder – a gem hidden in OpenJDK
June 24, 2020
JDK Flight Recorder (JFR) is a mighty powerful feature of HotSpot JVM, although it is not as widely known. JFR is not a novel technology. It takes its origin from JRockit JVM; later, it was ported to HotSpot JVM and introduced in Oracle Java 7. At the time of Java 9, JFR was fully open-source, and now it is an integral part of OpenJDK. In this article, I would like to highlight the key features of JDK Flight Recorder and how it can be useful for Java developers.
What is JDK Flight Recorder?
JDK Flight Recorder (JFR) is a performance/diagnostic tool that could be a huge time saver for Java engineers while troubleshooting applications at runtime. As the name suggests, JDK Flight Recorder collects various kinds of events from JVM runtime and records them in the form of binary log files (further referred to as recordings). In OpenJDK 11, JFR is capable of tracing a few hundred types of events creating a comprehensive JVM runtime picture. Recordings produced by JFR are self-contained files that could be further analyzed in Mission Control.
What is Mission Control?
Mission Control is essentially a graphical front end working with JFR recordings (binary log files) produced by JDK Flight Recorder. Mission Control is also open-source, and its source code is hosted on GitHub.
JFR recordings are logically straightforward, each one a plain collection of events. For them to make sense, these data need to be post-processed. Mission Control offers several built-in reports outlining various aspects of JVM runtime (e.g., code execution, object allocation, garbage collection, and more). Besides general reports, Mission Control has a number of rule-based heuristics for detecting typical problems of Java applications. Finally, you can use a generic event browser and customizable filters to configure reports tailored for your task.
Is JDK Flight Recorder just another Java profiler?
To a certain extent, features of JFR + Mission Control compete with ones of profilers, this is true. Though comparing it to traditional profilers is not very fair.
JFR is a tool of its own kind. It is also a feature of JDK, so other Java profilers besides Mission Control could benefit from JFR capabilities too.
For certain profiling tasks (e.g., identifying code host spots or object allocation profiling), JFR + Mission Control is similar to a typical Java profiler (Visual VM, JProfiler, YourKit profiler).
Some features essential for profilers are not covered by JFR. Say, if you want to trace SQL statements being sent to the database, JFR is not your tool.
You can send application-specific events to JFR. This is a compelling way to collect telemetry peculiar to your application, though it requires upfront work to place event generation in application logic.
Finally, some kinds of JVM telemetry are available only via JFR.
How can I start using JDK Flight Recorder?
JFR is an integral part of OpenJDK now. You do not need to download anything besides JDK or configure JVM in any special way to start using it.
Open JDK 11 is the earliest LTS (Long Term Support) release with integrated JFR.
If you do not already have OpenJDK with JFR support, you can get a version matching your OS and CPU architecture from Liberica.
JFR workflow is as follows:
- start flight recording session with a particular configuration,
- once the session is active for some time, dump recording to a file,
- stop the session after the dump (this step is optional, you can leave it active if you plan to take more dumps later).
There are three ways to start a flight recording session:
- Using the
jcmdCLI tool included in JDK, start/stop/dump JFR sessions. You would need shell access to the server running your Java application to use
- Using Mission Control, you can do the same for both locally running and remote processes (you would need JMX configured to connect to JVM remotely).
- It is also possible to activate a JFR session on startup via JVM command-line options (this is very useful if you want to profile applications during startup).
Starting a flight recording session also requires a configuration (called a profile). The profile defines which kind of events would be collected during the session. It also determines per probe configuration (frequency for method sampling, duration threshold for I/O events, etc.).
There are two built-in profiles, “default” and “profiling.” You can also configure and use custom profiles via Mission Control UI.
Once you have a JFR recording file, you would need Mission Control to open it. Mission Control is distributed separately from OpenJDK. I would recommend getting Liberica Mission Control at this stage.
A brief overview of Mission Control reports
Mission Control has tons of features and a fairly steep learning curve. Oracle’s Java 8 had pre-open source Mission Control 5.5 bundled with JDK.
Mission Control 6 was bundled with JDK 9, and it was a dramatic change partially related to open-sourcing of the codebase. This same version, Mission Control 6, introduced automatic anomaly detection. Hints produced by built-in heuristics could be very practical for beginners, but sooner or later you would need to dive into other reporting screens.
At the time of writing, Mission Control 7 was the latest stable version.
Here I would like to give a brief description of the key reports available in Mission Control 7.
JFR can periodically capture stack traces for running threads in JVM. This technique is known as sampling profiling and available in all Java profilers. Statistical analysis of a large corpus of stack traces gives an excellent picture of where the application spends most of the time down to method name or even line number. This report provides useful visualization for this subset of data collected by JFR.
This method is typically used in cases where the application is CPU bound. Sampling helps to identify code that is inefficient or could be optimized for better performance gain.
This report outlines a memory usage story of your Java application. It combines multiple data points, with the most important of them being object allocation samples.
If enabled, JFR records samples of new object allocation events, including class being allocated, size of object, and stack trace to the point of allocation. Statistical analysis of these samples could reveal a lot about your code.
“Memory” reports show top classes of objects being allocated. Information on actual allocated size, including stack trace to the point of allocation, is also available.
This report is most helpful if you want to reduce garbage collector pressure or optimize memory usage. Memory profiling is extremely valuable when it quickly identifies the “littering” code in your application and provides ideas for optimization.
Contentions and deadlocks is another class of Java problems hard to approach without a good tool. JFR collects events of thread blocking on synchronized sections and other concurrency primitives that can be useful during the investigation of Java concurrency problems.
“Lock Instances” report would show you aggregated statistics for these events. “Threads” is another report helpful in tracking down concurrency issues.
A built-in “Profiling” configuration has a threshold on the duration for such events. Keep in mind that you may need to lower it to get a more detailed picture.
The cost of inter-thread contention becomes a harsh reality for high parallel code running on dozens of cores. Java ships with a few concurrency primitives, including lock-free ones. The contention report helps understand the cost of multithreading in your application and possibly highlights bottlenecks where switching to more elaborate locking is most beneficial.
File I/O and Socket I/O
Blocking file and socket I/O operations are also traced by JFR. These reports show aggregated statistics built from file and socket I/O events, respectively.
It is exhilarating to see which files and how long your application has been reading or at which remote endpoints it has exchanged data.
There is a little caveat, though. If using JFR for the first time, you are likely to find both of these reports to be empty or almost empty. The reason is the 10 ms threshold used by default for these kinds of events. If you want to see a full I/O picture, set this threshold to 0 when starting a flight recording session.
This report is very good at catching “unexpected” I/O events buried deep inside of layers of libraries and frameworks.
In this report, you can see various events (contention related, I/O associated, Java thread state changes) put into a unified timeline. Here you can see the interaction of threads in an intuitive visual way. The ability to zoom into the millisecond level of precision is helpful too.
Every Java application is producing a handful of exceptions at runtime. Some of them end up in log files, but many are suppressed for one reason or another. Silently swallowed exceptions sometimes lead to many hours wasted during the incident investigation afterward.
JFR sees all exceptions (whether they were suppressed or not). There are two kinds of events related. JFR records the total count of exceptions produced by runtime. This metric helps to identify exception misuse. JFR could also record every single exception created, including the stack trace, though you would need to tweak the flight recording session’s configuration to get this level of detail.
GC troubleshooting typically involves digging through GC logs. And of course, you have your GC logging configured properly in your production, don’t you?
JFR collects detailed information concerning GC events, and this report visualizes this data. Almost any metric of GC present in JVM is available here.
If you need to tune GC or troubleshoot abnormal GC pauses, this is a report to start with.
HotSpot has a concept of VM operations ranging in types. But the critical point is that all of them require the infamous Stop-the-World pause. With some JVM options, you can enable logging of VM operation or use JFR and this handy report.
VM operations are often mistaken for GC pauses (which is just one type of VM operations). People tend to blame Stop-the-World on garbage collectors. This report is your first stop if the application is experiencing Stop-the-World pauses.
Configuration and environment
A neat thing about JFR files is that you can send them for analysis to an expert in the JVM domain. Files include JVM configuration, hardware configuration, and even a list of other processes running on a server and competing for resources without application. As a JVM guru, I can get almost any context information I need from the recording file itself.
Continuous flight recording
JFR + Mission Control could be used as a free and capable Java profiler, no doubt. Though using it in continuous mode could bring even more value.
To get the full benefit from continuous flight recording, you would need to
- configure your JVM to be accessible via JMX
- add JVM command-line arguments to activate flight recorder on startup
A minimal command to start flight recording session on startup is below
An important option is “maxsize,” which limits how many events would be retained in memory.
Now you have JFR silently collecting events in the background. As long as the application runs fine, you can forget about JFR being active. The default JFR profile adds very little overhead.
Still, in an unfortunate turn of events, you can quickly connect to JVM via Mission Control and dump accumulated events from a problematic server, which is especially useful for tracking down problems reproducing only on live production.
One can argue that instead of pulling data dumps from individual VMs, it is better to have a centralized monitoring system where you can access diagnostics from any server. In theory, that would be incredible, but actually arranging quality centralized monitoring is extremely difficult. Central monitoring solutions have to compromise on the level of details.
Continuous flight recording and the ability to pull event dumps on demand are complementary to central logging and general monitoring solutions, as you can get very detailed telemetry from JVM.
Plus, it is already built into your OpenJDK, so why not start using it?
Java 8 roadblock
While Java 14 is already out, many systems are stuck with Java 8, whatever the reason. Unfortunately, JDK Flight Recorder is not included in OpenJDK 8 (it is available in Oracle Java but not for free).
Java version dependency stands behind the low JDK Flight Recorder adoption among developers. You need to move to Java 11 to be able to use JFR in OpenJDK.
The good news is that JDK Flight Recorder has been backported to OpenJDK 8 codebase and will soon be ready to use.
JDK Flight Recorder is a mature technology integrated with modern OpenJDK.
JFR + Mission Control offers a reliable alternative to commercial Java profilers for most common profiling tasks.
JFR also goes beyond simple profiler workflow with its ability to continuously collect data and access them remotely.
As part of the JVM binary, JDK Flight Recorder could be easily enabled and does not require any upfront configuration, making it a powerful tool in the Java engineer’s toolbox.
So, if you have problems with your Java performance or stability, JDK Flight Recorder and Mission Control is a combo you definitely need to give a try.