Introduction
Many years ago, I used to work as a team leader, and, one day, the General Manager asked me to take a look at a project that was in big trouble.
The application in question had been developed by a team of software developers for over 9 months, and the client had just tested in a production-like environment.
The client got very upset when realizing that the application was barely crawling. For instance, I was told that a query had been running for 10 hours without showing any signs of stopping.
After analyzing the project, I identified many areas that could have been improved, and that’s how my passion for high-performance data access was born.
JPA and Hibernate
It was 2004 when I first heard of Hibernate. Back then, I was working on a .NET project for my college thesis and wasn’t very happy with ADO.NET at the time. Therefore, I started reading about NHibernatem, which was still in Beta at the time. NHibernate was trying to adapt the Hibernate 2 project from Java to .NET, and even the Beta version at the time was a much better alternative to ADO.NET.
From that moment, Hibernate got really popular. In fact, the Java Persistence API, which emerged in 2006, is very much based on Hibernate.
Thanks to JPA, Hibernate’s popularity grew even larger as most Java EE or Spring projects used it either directly or indirectly. Even today, most Spring Boot projects use Hibernate too, via the Spring Data JPA module.
Logging SQL statements
When using a data access framework where all queries must be stated explicitly, it’s obvious what SQL queries will be executed by the application.
On the other hand, JPA and Hibernate execute SQL statements based on the entity state transitions operated by the data access layer code.
For this reason, it’s very important to always log the SQL statement generated by JPA and Hibernate.
The best way to log SQL statements is to use a JDBCDataSourceorDriverproxy, as explained in this article.
Domain Model
Let’s consider you are mapping a post parent table and the post_comment child table. There is a one-to-many table relationship between the post and post_comment tables via the post_id Foreign Key column in the post_comment table.
You could map the post and post_comment tables as JPA entities in the following way:
@Entity(name = "Post")
@Table(name = "post")
public class Post {
@Id
private Long id;
private String title;
public Long getId() {
return id;
}
public Post setId(Long id) {
this.id = id;
return this;
}
public String getTitle() {
return title;
}
public Post setTitle(String title) {
this.title = title;
return this;
}
}
@Entity(name = "PostComment")
@Table(name = "post_comment")
public class PostComment {
@Id
private Long id;
@ManyToOne
private Post post;
private String review;
public PostComment setId(Long id) {
this.id = id;
return this;
}
public Post getPost() {
return post;
}
public PostComment setPost(Post post) {
this.post = post;
return this;
}
public String getReview() {
return review;
}
public PostComment setReview(String review) {
this.review = review;
return this;
}
}
Notice that theNow, let’s assume we are adding threePostandPostCommentuse a fluent-style API. For more details about the advantages of using this strategy, check out this article.
Post entities into our database, each Post containing three PostComment child entities:
doInJPA(entityManager -> {
long pastId = 1;
long commentId = 1;
for (long i = 1; i <= 3; i++) {
Post post = new Post()
.setId(pastId++)
.setTitle(
String.format(
"High-Performance Java Persistence, part %d",
i
)
);
entityManager.persist(post);
for (int j = 0; j < 3; j++) {
entityManager.persist(
new PostComment()
.setId(commentId++)
.setPost(post)
.setReview(
String.format(
"The part %d was %s",
i,
reviews[j]
)
)
);
}
}
});
Fetching data
Let’s assume you want to load aPostComment from the database. For that, you can call the find JPA method as follows:
PostComment comment = entityManager.find(
PostComment.class,
1L
);
When executing the find method, Hibernate generates the following SQL query:
SELECT
pc.id AS id1_1_0_,
pc.post_id AS post_id3_1_0_,
pc.review AS review2_1_0_,
p.id AS id1_0_1_,
p.title AS title2_0_1_
FROM
post_comment pc
LEFT OUTER JOIN
post p ON pc.post_id=p.id
WHERE
pc.id=1
Where did that LEFT OUTER JOIN come from?
Well, this is because the @ManyToOne association in PostComment uses the default fetching strategy, which is FetchType.EAGER.
So, Hibernate has to do the LEFT OUTER JOIN as the mapping says it should always initialize the post association when fetching the PostComment entity.
Now, look what happens when you execute a JPQL query to fetch the same PostComment entity:
PostComment comment = entityManager
.createQuery(
"select pc " +
"from PostComment pc " +
"where pc.id = :id", PostComment.class)
.setParameter("id",1L)
.getSingleResult();
Instead of a LEFT OUTER JOIN, we have a secondary query now:
SELECT
pc.id AS id1_1_,
pc.post_id AS post_id3_1_,
pc.review AS review2_1_
FROM
post_comment pc
WHERE
pc.id = 1
SELECT
p.id AS id1_0_0_,
p.title AS title2_0_0_
FROM
post p
WHERE
p.id = 1
Now, there was a single extra query this time, but if we fetch all PostComment entities associated with a given Post title:
List comments = entityManager
.createQuery(
"select pc " +
"from PostComment pc " +
"join pc.post p " +
"where p.title like :titlePatttern", PostComment.class)
.setParameter(
"titlePatttern",
"High-Performance Java Persistence%"
)
.getResultList();
assertEquals(9, comments.size());
Hibernate will issue 4 queries now:
SELECT
pc.id AS id1_1_,
pc.post_id AS post_id3_1_,
pc.review AS review2_1_
FROM
post_comment pc
JOIN
post p ON pc.post_id=p.id
WHERE
p.title LIKE 'High-Performance Java Persistence%'
SELECT
p.id AS id1_0_0_,
p.title AS title2_0_0_
FROM
post p
WHERE
p.id = 1
SELECT
p.id AS id1_0_0_,
p.title AS title2_0_0_
FROM
post p
WHERE
p.id = 2
SELECT
p.id AS id1_0_0_,
p.title AS title2_0_0_
FROM
post p
WHERE
p.id = 3
There are four SQL queries this time. The first one is for the actual JPQL query that filters the post_comment table records while the remaining three are for fetching the Post entity eagerly.
Reviewing and validating all these @ManyToOne associations and making sure they are always using FetchTYpe.LAZY will take time. More, you cannot guarantee that one day, someone else will come and change a given association from FetchTYpe.LAZY to FetchTYpe.EAGER.
Detecting performance problems automatically
A much better approach to solving this issue is to use Hypersistence Optimizer. After setting the Maven dependency:<dependency>
<groupId>io.hypersistence</groupId>
<artifactId>hypersistence-optimizer</artifactId>
<version>${hypersistence-optimizer.version}</version>
</dependency>
All you need to do is add the following code to any of your integration tests:
public void testNoPerformanceIssues() {
ListEventHandler listEventHandler = new ListEventHandler();
new HypersistenceOptimizer(
new JpaConfig(entityManagerFactory())
.addEventHandler(listEventHandler)
).init();
assertTrue(listEventHandler.getEvents().isEmpty());
}
That’s it!
Now, if you try to run the tests, the suite will fail with the following error:
ERROR [main]: Hypersistence Optimizer - CRITICAL
- EagerFetchingEvent
- The [post] attribute in the [io.hypersistence.optimizer.config.PostComment] entity
uses eager fetching. Consider using a lazy fetching which,
not only that is more efficient, but it is way more flexible
when it comes to fetching data.
For more info about this event, check out this User Guide link
- https://vladmihalcea.com/hypersistence-optimizer/docs/user-guide/#EagerFetchingEvent
java.lang.AssertionError
at org.junit.Assert.fail(Assert.java:86)
at org.junit.Assert.assertTrue(Assert.java:41)
at org.junit.Assert.assertTrue(Assert.java:52)
at io.hypersistence.optimizer.config.FailFastOnPerformanceIssuesTest.testNoPerformanceIssues(FailFastOnPerformanceIssuesTest.java:41)
Awesome, right?
You cannot even build the project with performance issues like this one sneaking in your data access code.
Conclusion
Using JPA and Hibernate is very convenient, but you need to pay extra attention to the underlying SQL statements that get generated on your behalf.
With a tool like Hypersistence Optimizer, you can finally spend your time focusing on your application requirements instead of chasing JPA and Hibernate performance issues.
