JVM Advent

The JVM Programming Advent Calendar

Keep Your SQL in Check With Flight Recorder, JMC Agent and JfrUnit

It’s a common misconception that once you’re using an object-relational mapping (ORMs) tool, you wouldn’t have to know about SQL any longer, as the ORM would fully abstract you from that. Nothing could be further from the truth though; while an ORM indeed will free you from the burden of writing many SQL statements from hand and keeping them in sync with your data model as it evolves, it is absolutely vital for you to understand which queries actually are emitted by your ORM-based application and what costs they may cause.

Otherwise, you may be caught by surprise by poor querying behavior, e.g. due to the ORM issuing an inappropriately large number of statements (for instance, the notorious N+1 SELECTs, instead of a more suitable fetch join), due to queries returning to much data (e.g. by selecting all columns of a table, instead of more efficient projections), or due to the same queries being run again and again (instead of caching some read-only data for instance in-memory). That’s why ORM solutions like Hibernate ORM typically provide you with the option to log all the SQL statements they emit. This lets you identify potentially troublesome statements, but analyzing purely text-based log files can be cumbersome. Another option would be typically proprietarily licensed application performance monitoring (APM) tools, which oftentimes hook into your application using a Java agent, extract performance-related data such as SQL statements executed by the persistence layer, and make that information available in easy-to-use dashboards.

In this article, I’d like to introduce an alternative solution for solving the problem of keeping track of executed SQL statements, solely based on open-source tools, and providing you deep insight into the SQL statements of your application early on in the development lifecycle. The proposed solution is based on three powerful tools:

  • JDK Flight Recorder (JFR), a low-overhead event collection and profiling engine built into OpenJDK
  • JMC Agent, a Java agent for instrumenting existing 3rd party libraries, so as to have them emit custom JFR events upon specific method invocations
  • JfrUnit, an extension to JUnit 5, which lets you run assertions against JFR events, allowing you to identify potential performance regressions by means of plain unit tests

As an example we are going to discuss a small web application for managing todos, which is based on Quarkus, a modern stack for building cloud-native microservices running on the JVM and as native binaries via GraalVM.

Gaining Insight Into Your Application With JDK Flight Recorder

JDK Flight Recorder is a “troubleshooting, monitoring, and profiling framework embedded into the JVM”. Historically a proprietary commercial feature of Oracle JDK (and before that, JRockit), JFR got fully open-sourced as part of OpenJDK 11 via JEP 328. Like the flight recorder in an airplane, it is centered around the notion of recording events on a persistent medium, which later on can be examined and analysed.

It comes with more than 150+ built-in event types, which provide you with deep insights into performance-related aspects such as garbage collection, class loading and unloading, JIT compilation, locking, object allocations, file and network IO, and much more. In addition, you can define your own application-specific event types, representing relevant metrics of your particular domain. Performance is paramount to the design of JFR itself, so using it should cause an overhead of less than 1%, which means you can use it continuously in production. Then, in case of any problems with your application’s performance, you can access the JFR recording of the last minutes or hours (depending on your chosen settings) for investigation. In order to analyse JFR recordings, another open-source application, JDK Mission Control, is used most of the time. Since Java 14, there’s also the possibility to consume JFR events in a real-time streaming fashion (JEP 349), which you can use for instance for exporting (potentially pre-aggregated) event data to external monitoring and dashboarding solutions such as Prometheus and Grafana.

Injecting JFR Events With JMC Agent

Now how does all that relate to the goal of keeping track of the SQL statements emitted by our todo application? Unfortunately, there’s no event type built into JFR itself, which would directly help you with this task. Ideally, components higher up in the stack, such as Hibernate ORM or the JDBC driver would have the ability of logging JFR events for each issued query. That’s not the case (yet?), though.

This is where JMC Agent comes in: it lets you define rules for instrumenting existing code in order to let it emit JFR events upon the invocation of specific methods. All that’s needed is some understanding of the inner workings of the library to be instrumented, so that you know for which method invocations which kind of event should be emitted. Optionally, parameter and/or return values of a method invocation can be added as attributes to the created JFR event.

After digging a bit into the code base of Hibernate ORM, I decided to instrument the following methods for the task at hand:

  • io.agroal.pool.wrapper.PreparedStatementWrapper::execute(), executeQuery(), and executeUpdate() for producing a JFR event whenever a prepared statement is executed; i.e. instead of emitting these events from Hibernate ORM itself, I am going one level deeper, to the Agroal connection pool used by this application; this allows you to capture all the query executions, no matter from where exactly within Hibernate ORM they are initiated
  • org.hibernate.type.descriptor.sql.BasicBinder::bind(); for creating a JFR event whenever a query parameter is bound, so that you also can capture the actual parameter values of executed prepared statements

The configuration of JMC Agent happens via an XML file; you can find the complete configuration for all the four methods listed above in the repo with the todo example application on GitHub; as an example, here is the configuration of the PreparedStatementWrapper::executeQuery() and BasicBinder::bind() methods (note that as of version 8.1, JMC comes with a plug-in for editing the JMC configuration):

<?xml version="1.0" encoding="UTF-8"?>
<jfragent>
  <config>
    <classprefix>__JFREvent</classprefix>
    <allowtostring>true</allowtostring>
    <allowconverter>true</allowconverter>
  </config>

  <events>
    <event id="jdbc.PreparedQuery">
      <label>Prepared Query</label>
      <description>Execution of a prepared query</description>
      <path>Hibernate</path>
      <stacktrace>true</stacktrace>
     <class>io.agroal.pool.wrapper.PreparedStatementWrapper</class>
      <method>
        <name>executeQuery</name>
        <descriptor>()Ljava/sql/ResultSet;</descriptor>
      </method>
      <location>WRAP</location>
      <fields>
        <field>
          <name>SQL Query</name>
          <description>The executed SQL query</description>
          <expression>wrappedStatement</expression>
        </field>
      </fields>
    </event>
    <event id="jdbc.BindParameter">
      <label>Prepared Statement Bind Parameter</label>
      <description>Bind parameter of a prepared query</description>
      <path>Hibernate</path>
      <stacktrace>true</stacktrace>
      <class>org.hibernate.type.descriptor.sql.BasicBinder</class>
      <method>
        <name>bind</name>
        <descriptor>(Ljava/sql/PreparedStatement;Ljava/lang/Object;ILorg/hibernate/type/descriptor/WrapperOptions;)V</descriptor>
        <parameters>
          <parameter index="1">
            <name>Bind Parameter</name>
            <description>The bind parameter</description>
            <contenttype>None</contenttype>
          </parameter>
          <parameter index="2">
            <name>Index</name>
            <description>The parameter index</description>
            <contenttype>None</contenttype>
          </parameter>
        </parameters>
      </method>
      <location>ENTRY</location>
    </event>
  </events>
</jfragent>

The label and description values will be used for the generated event type and be displayed in JMC later on. The class and method elements are used for describing the method for which a JFR event should be emitted whenever it is invoked. Specifically, the descriptor element defines the targeted method, following the method descriptor format as described in the Java Virtual Machine specification.

For the executeQuery() method, the value of the wrappedStatement field of the PreparedStatementWrapper instance will be emitted as an event attribute. As its type (PreparedStatement) isn’t a supported JFR attribute type, the code generated by JMC agent will invoke its toString() method for converting the value. That’s good enough for our purposes here, as this will yield the executed SQL statement. But if needed, JMC Agent also supports the usage of custom value converters.

The WRAP location indicates that the event should be started before the method invocation starts and be committed after the invocation has finished, i.e. the event duration will represent the duration of the method execution. This will come in handy for identifying long running queries in JMC.

The configuration of the bind() method is mostly similar, only that this time the parameters element is used for specifying that the method parameters representing the value and index of the given bind parameter should be emitted as part of the event. ENTRY is used as the location in this case, as the event duration doesn’t matter and it’s enough to simply emit that event before the method gets invoked. The third option besides WRAP and ENTRY would be EXIT for emitting an event after a method was invoked.

Now let’s see how you can use JMC Agent together with this configuration. A pre-built binary for JMC Agent can be downloaded from the Adoptium JMC Overrides project. Assuming your configuration file is named my-config.xml, you’d launch your application with the agent like this:

java -javaagent:path/to/agent-1.0.1.jar=my-config.xml \   
  --add-opens=java.base/jdk.internal.misc=ALL-UNNAMED \
  –jar path/to/my-application.jar

Note that the --add-opens parameter is currently required with JMC Agent, as it makes use of the jdk.internal.misc.Unsafe class. Hopefully, this will not be needed in a future version of JMC Agent.

In a fully automated test set-up, we could use the download-maven-plugin and the maven-surefire-plugin for first retrieving the JMC Agent JAR from GitHub, and then applying it to the JVM for running the unit tests of our project:

...
<build>
  <plugins>
    <plugin>
      <groupId>com.googlecode.maven-download-plugin</groupId>
      <artifactId>download-maven-plugin</artifactId>
      <version>1.6.7</version>
      <executions>
        <execution>
          <id>install-jmc-agent</id>
          <phase>generate-test-sources</phase>
          <goals>
            <goal>wget</goal>
          </goals>
          <configuration>
            <url>https://github.com/adoptium/jmc-overrides/releases/download/8.1.1/agent-1.0.1.jar</url>
            <outputDirectory>${project.build.directory}/jmc-agent/</outputDirectory>
            <md5>d42b1b8d55cfbee4b320a4996157bb49</md5>
          </configuration>
        </execution>
      </executions>
    </plugin>
    <plugin>
      <artifactId>maven-surefire-plugin</artifactId>
      <version>3.0.0-M5</version>
      <configuration>
        <argLine>-javaagent:${project.build.directory}/jmc-agent/agent-1.0.1.jar=hibernate-probes.xml --add-opens=java.base/jdk.internal.misc=ALL-UNNAMED</argLine>
      </configuration>
    </plugin>
  </plugins>
</build>
...

Now let’s take a look at some of these events by loading a JFR recording into JMC. This particular dataset is taken from a run of one of the unit tests of the example project. In this test, the application’s GET /todo/{id} endpoint is invoked ten times. The data in the JFR recording, as visible in JMC’s event browser view matches that: there’s ten pairs of “Bind Parameter” and “Prepared Query” events:

If you select one specific event, you can examine its attributes, i.e. the query in case of the “Prepared Query” event, and the actual parameter value of the id parameter for the “Bind Parameter” event:

By sorting the execution events by event duration, you can easily identify long-running statements. For the fun of it, let’s use the pg_sleep() function for producing an artificially “slow” query. As you see, the event duration of 5.017 sec pretty exactly matches the sleep duration of five seconds. The stack trace associated with an event allows you to identify the code location in the application which caused that particular query (more on that in the next section).

Going a step further, you could also use the rules framework built into JMC for automatically analysing JFR recordings and identifying SQL execution events matching certain criteria (running longer than x seconds, joining more than y tables, etc.). Another interesting option would be to leverage the JFR event streaming API for feeding the events in realtime to external event consumers, such as metrics collectors or dashboards. A typical workflow could for instance be to set up an alert which fires when the number and/or total execution time of our application’s statements surpasses some defined threshold in a given time window. If that happens, you could then obtain the complete JFR recording from the affected machine and load it for further detailed investigation into JMC. If needed, you also could trigger a more detailed JFR recording (either using JMC or the jcmd tool coming with the JDK) in this case, running for a shorter period of time.

Another tool which could come in handy when deploying your application on Kubernetes or OpenShift is the Cryostat project. It allows you to manage JFR recordings (e.g. starting and stopping them, analysing them in a web UI, etc.) for containerized workloads.

One important thing to keep in mind is that SQL statements, and in particular bind parameter values, can be sensitive data, which means you should limit access to any JFR recordings with this data, have a strategy in place for deleting them, and generally follow related best practices, in accordance with your local data protection laws, such as GDPR.

Identifying SQL Performance Regressions with JfrUnit

At this point you’ve gained some good understanding of how JFR and JMC Agent can be used for tracking the SQL statements executed by a Java application in a production environment. As the last step, let’s explore how you can identify potential performance bottlenecks in an application already during development and testing. The earlier you catch any performance problems or regressions, the easier and cheaper it will be to address them.

There’s a problem though: relying on absolute performance metrics such as query execution times usually isn’t a good idea when setting up automated tests; the results will typically depend on the underlying hardware and thus vary between environments. Another issue is concurrent load in a testing environment: when for instance trying to run performance regression tests on a CI environment with other concurrent jobs, any results will be influenced by these other processes.

This is why the JfrUnit project tries to approach the challenge of implementing reliable performance regression tests from another angle: instead of measuring and asserting the actual performance characteristics like latency or through-put, JfrUnit relies on more stable proxy metrics, such as for instance the I/O created by a specific use case, or the numbers and kinds of database queries which it triggers. These proxy metrics are very stable (the amount of database I/O for a GET todo/{id} request should be the same on a developer laptop and on a production-grade machine; also the number of SQL statements issued for this shouldn’t vary), while any regressions (more I/O, more statements executed) are a good indicator that a performance regression may have occurred.

While an in-depth introduction to JfrUnit is beyond the scope of this article (please refer to this post to learn more about JfrUnit and its approach to performance regression testing), its general principle should become clear from this example:

@Test
public void retrieveTodoShouldYieldCorrectNumberOfSqlStatements()
    throws Exception {

  Random r = new Random();

  for (int i = 1; i<= ITERATIONS; i++) {
    int id = r.nextInt(20) + 1;

    given()
      .when()
        .contentType(ContentType.JSON)
        .get("/todo/" + id)
      .then()
        .statusCode(200);
  }

  jfrEvents.awaitEvents();

  long numberOfStatements = jfrEvents
      .filter(this::isQueryEvent)
      .count();

  assertThat(numberOfStatements).isEqualTo(ITERATIONS);
}

The idea is to enable one or more JFR event types (any event types triggered via the JMC Agent are implicitly enabled) and then execute the code under test. JfrUnit will create a JFR recording with these event types, which subsequently can be asserted. This could be as simple as counting events of a particular type, but you also could assert the presence of events with specific attribute values, perform calculations like adding up all the I/O in bytes done by certain application threads, and more. In the example above, the test invokes an endpoint of the todo application in a loop and asserts that there’s exactly one SQL statement per iteration. A failure of that assertion would indicate that the application under test now executes more statements for that REST call than it is expected to do. Here’s how such test failure would look like:

60 statements instead of 10, that’s quite something! All the statements are also printed out, which may help you to identify the cause of this regression. But JfrUnit also persists the JFR recording for each test method in a file which you can load into JMC for further analysis. By doing so, the problem becomes clear very quickly: for each SELECT on the Todo table, there’s further SELECTs on the TodoDetail table. By looking at the stack traces of these unexpected events, you can easily see where exactly in the application these statements were triggered.

In the case at hand, it’s a classic N+1 problem, perhaps added by an unsuspecting fellow developer on our team in reaction to some new business requirements. Using JFR, JMC Agent, and JfrUnit you can identify this problem solely by executing tests in your IDEs, way before this code hits your staging or even production environment, and react to it, for instance by introducing a fetch join for loading a todo and all its details with one single query.

Wrapping Up

Keeping track of the SQL statements executed by an application is key for ensuring good performance characteristics, avoiding poor response times or overloading the database due to suboptimal queries.

This task can be solved by using three open-source tools working hand-in-hand: JDK Flight Recorder as a low-overhead event recording engine built into OpenJDK, JMC Agent for injecting JFR events into existing code of third-party libraries, and JfrUnit for asserting the JFR events emitted by an application. Combined with JDK Mission Control, a powerful client application for visualizing and examining JFR recordings, these tools provide a complete open-source tool chain for identifying and addressing performance problems of Java applications.

The possibilities of this combination of tools go way beyond the the use case of tracking SQL statements; with JFR, JMC Agent, JMC, and JfrUnit, you can monitor, analyse and assert all kinds of performance-related metrics in your Java applications. To get started experimenting with these tools yourself, take a look at the jfrunit-examples repository where besides the SQL example you can find other examples for identifying regression in regards to object allocations and I/O.

Lastly, here’s a few further resources if you’d like to learn more about the tools discussed in this post:

Author: Gunnar Morling

Gunnar Morling is a software engineer and open-source enthusiast by heart. He is leading the Debezium project, a distributed platform for change data capture. He is a Java Champion, the spec lead for Bean Validation 2.0 (JSR 380) and has founded multiple open source projects such as JfrUnit, kcctl, and MapStruct. Gunnar is an avid blogger (morling.dev) and has spoken at a wide range of conferences like QCon, Java One, Devoxx, JavaZone, and many others. He’s based in Hamburg, Germany.

Next Post

Previous Post

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.

© 2024 JVM Advent | Powered by steinhauer.software Logosteinhauer.software

Theme by Anders Norén