WTF Connection pools

Let’s talk about connection pools.

I claim that:

Default settings of the most popular connection pools are poor!

For you, this means:

Go review your connection pool settings.

You might have a problem if you rely on the default settings: memory leaks and an unresponsive application (even if the load is not high at all).

Below I will show some of the most important settings and my recommendations for how they really should be configured.

 

What is a connection pool?

A plain web application that needs to read or write data in a database does it like this:

  1. Open a connection to DB        // takes N ms
  2. read/write data
  3. close the connection

(by the way, in good old CGI applications this was the only possible approach)
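
A minimal sketch of this plain open-use-close approach with raw JDBC (the URL, credentials, table and driver are placeholders, assuming PostgreSQL; the class and method names are just illustrative):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

class PlainJdbcExample {
    static void printUserCount() throws Exception {
        // 1. Open a connection to the DB (the slow part, repeated on every request)
        try (Connection connection = DriverManager.getConnection(
                "jdbc:postgresql://localhost:5432/mydb", "user", "password");
             // 2. read/write data
             Statement statement = connection.createStatement();
             ResultSet rs = statement.executeQuery("SELECT count(*) FROM users")) {
            if (rs.next()) {
                System.out.println("Users: " + rs.getLong(1));
            }
        } // 3. try-with-resources closes the connection, even if an exception is thrown
    }
}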

This approach is perfectly fine in many cases, and you probably don’t need anything more. But it has some disadvantages for high-performance systems:

  • Step 1 can take some time: probably tens or hundreds of milliseconds (it depends, of course).
  • It’s easy to forget Step 3 (closing the connection), which causes a connection leak (leading to memory leaks and other problems).

A new hero

That’s why another approach was born: the application opens a bunch of connections in advance and holds them open all the time. This bunch of open connections is called a connection pool. Then any operation looks like this (see the sketch after the list):

  1. Take a DB connection from pool        // blazingly fast in most cases
  2. read/write data
  3. return the connection to the pool
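
Here is a minimal sketch of the same kind of operation on top of a pool (HikariCP in this example; the URL, credentials and query are placeholders). Note that close() on a pooled connection does not physically close it; it returns the connection to the pool:

import com.zaxxer.hikari.HikariConfig;
import com.zaxxer.hikari.HikariDataSource;

import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.Statement;

class PooledJdbcExample {
    // The pool is created once and keeps its connections open between requests
    private static final HikariDataSource POOL = new HikariDataSource(buildConfig());

    private static HikariConfig buildConfig() {
        HikariConfig config = new HikariConfig();
        config.setJdbcUrl("jdbc:postgresql://localhost:5432/mydb");  // placeholder
        config.setUsername("user");
        config.setPassword("password");
        return config;
    }

    static void printUserCount() throws Exception {
        // 1. Take a connection from the pool (usually it is already open)
        try (Connection connection = POOL.getConnection();
             // 2. read/write data
             Statement statement = connection.createStatement();
             ResultSet rs = statement.executeQuery("SELECT count(*) FROM users")) {
            if (rs.next()) {
                System.out.println("Users: " + rs.getLong(1));
            }
        } // 3. close() returns the connection to the pool
    }
}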

Seems cool. But new power always means new problems.

… and new problems

When using a connection pool, we need to answer (at least) the following questions:

  • How many connections should we keep open?
  • How long should they be kept?
  • What if they turn out to be broken?
  • What if the application needs more connections than the pool currently has?
  • What if somebody forgets to return a connection to the pool?

To answer these questions, connection pools have a lot of settings, and their default values are mostly bad. Intrigued? Let me show you.

Basic settings

I will consider the two most popular connection pools in the Java world:

  • c3p0
  • HikariCP

The basic parameters, of course, are:

  • min size (the minimum number of connections that should be open at any moment)
  • initial size (how many connections the application opens at startup)
  • max size (the maximum number of connections in the pool)

By the way, these are the only settings that have reasonable defaults. Here they are:

                c3p0    HikariCP
min size        3       10
initial size    3       10
max size        15      10
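
Here is a minimal sketch of setting these values explicitly for both pools (the numbers simply mirror the defaults above; they are not recommendations, and the helper class is just illustrative):

import com.mchange.v2.c3p0.ComboPooledDataSource;
import com.zaxxer.hikari.HikariConfig;

class PoolSizeConfig {
    static void configure(ComboPooledDataSource c3p0, HikariConfig hikari) {
        // c3p0
        c3p0.setMinPoolSize(3);
        c3p0.setInitialPoolSize(3);
        c3p0.setMaxPoolSize(15);

        // HikariCP: as far as I know, there is no separate "initial size" setting;
        // the pool simply fills up to minimumIdle after start
        hikari.setMinimumIdle(10);
        hikari.setMaximumPoolSize(10);
    }
}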

 

Let’s continue with more problematic settings.

Critical settings

checkoutTimeout

How long the application can wait to get a connection from the pool.

  • c3p0 setting: checkoutTimeout
  • HikariCP setting: connectionTimeout

Default values:

                  c3p0               HikariCP    I recommend
checkoutTimeout   0 (wait forever)   30 s        1 ms

Both default values are just a disaster.

As I mentioned, in most cases getting a connection from the pool is blazingly fast. Except when the pool has no more open connections: then the pool needs to acquire a new one (which, as a rule, takes less than a second). But if maxSize has been reached, the pool cannot open a new connection and just waits until somebody returns their connection to the pool. And if the application has a connection leak (a bug that prevents connections from being returned), the pool will never get the connection back!

What happens then?

In the case of c3p0, we end up with all threads frozen in the following state:

"qtp1905485420-495 13e09-3211" #495 prio=5 os_prio=0 tid=0x00007f20e078d800 nid=0x10d7 in Object.wait() [0x00007f204bc79000]

   java.lang.Thread.State: WAITING (on object monitor)

at java.lang.Object.wait(Native Method)

at com.mchange.v2.resourcepool.BasicResourcePool.awaitAvailable()

- locked <0x00000000c3295ef8> (a com.mchange.v2.resourcepool.BasicResourcePool)

at com.mchange.v2.resourcepool.BasicResourcePool.checkoutResource()

    …

    at org.hibernate.jpa.internal.QueryImpl.getResultList()

    at domain.funds.FundsRepository.get()

    …

It may seem that the HikariCP default of 30 seconds is a bit better. It isn’t: it doesn’t really help in a high-load application. During those 30 seconds, a lot of new requests may come in, and all of them just freeze. The application will soon get an OutOfMemoryError anyway. Any waiting just postpones the death of the application by a few seconds.

That’s why I recommend setting checkoutTimeout to the smallest possible value: 1 ms. Unfortunately, we cannot set it to 0, because 0 means endless waiting 🙂 The sooner we fail, the more chance we give the working threads to complete their job. And we can clearly inform the user that the application is currently overloaded and they should try again later.
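
A minimal sketch of configuring this timeout (note: as far as I know, recent HikariCP versions reject values below roughly 250 ms, so the lowest practical value there is higher than 1 ms):

import com.mchange.v2.c3p0.ComboPooledDataSource;
import com.zaxxer.hikari.HikariConfig;

class CheckoutTimeoutConfig {
    static void configure(ComboPooledDataSource c3p0, HikariConfig hikari) {
        // c3p0: milliseconds; fail almost immediately instead of queueing requests forever
        c3p0.setCheckoutTimeout(1);

        // HikariCP: milliseconds; it enforces a lower bound (around 250 ms)
        hikari.setConnectionTimeout(250);
    }
}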

testConnectionOnCheckout

Sometimes connections in the pool may die. The database can close them on its own initiative, or a system administrator can simply unplug a network cable. That’s why the pool should monitor connection liveness.

The easiest way to do that is the “testConnectionOnCheckout” setting in c3p0 (I haven’t found a similar setting in HikariCP; it seems to be always enabled).

Default values:

                           c3p0     HikariCP    I recommend
testConnectionOnCheckout   false    true?       true

Definitely, it should be enabled by default!

Otherwise you will end up with lots of exceptions like this in the log:

org.hibernate.TransactionException: Unable to rollback against JDBC Connection
at o.h.r.j.i.AbstractLogicalConnectionImplementor.rollback()
at o.h.r.t.b.j.i.JdbcResourceLocalTransactionCoordinatorImpl$TransactionDriverControlImpl.rollback(JdbcResourceLocalTransactionCoordinatorImpl.java:294)

P.S. If you want to achieve even better performance, you may consider testing connections in the background rather than on checkout (see the sketch after this list):

  • testConnectionOnCheckout=false
  • testConnectionOnCheckin=true
  • idleConnectionTestPeriod=10
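
A minimal sketch of this background-testing setup in c3p0 (idleConnectionTestPeriod is in seconds; the helper class is just illustrative):

import com.mchange.v2.c3p0.ComboPooledDataSource;

class BackgroundConnectionTestConfig {
    static void configure(ComboPooledDataSource c3p0) {
        c3p0.setTestConnectionOnCheckout(false);  // do not pay the test cost on every checkout
        c3p0.setTestConnectionOnCheckin(true);    // test connections when they are returned
        c3p0.setIdleConnectionTestPeriod(10);     // and test idle connections every 10 seconds
    }
}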

preferredTestQuery

But how exactly should the pool test connections?

The problem is that it depends on the database.

By default, both pools test connections by executing

  • “connection.isValid()” (in case of JDBC4), or
  • “connection.getMetaData().getTables()” (in case of JDBC3)

The latter may be slow because “getTables()” retrieves meta-information about all tables every time. A recommended value is something like

  • “SELECT 1” (in case of MySql), or
  • “SELECT 1 FROM DUAL” (in case of Oracle) etc.

By executing this simple and fast query, the pool can check if a connection is still alive.
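
A minimal sketch of setting the test query explicitly (using the Oracle variant here; pick the query that matches your database):

import com.mchange.v2.c3p0.ComboPooledDataSource;
import com.zaxxer.hikari.HikariConfig;

class TestQueryConfig {
    static void configure(ComboPooledDataSource c3p0, HikariConfig hikari) {
        // c3p0: used instead of the slow getTables() call
        c3p0.setPreferredTestQuery("SELECT 1 FROM DUAL");

        // HikariCP: only needed for legacy JDBC3 drivers; with JDBC4 drivers its
        // documentation recommends leaving this unset and relying on Connection.isValid()
        hikari.setConnectionTestQuery("SELECT 1 FROM DUAL");
    }
}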

maxIdleTime

How long an unused connection can stay in the pool.

  • c3p0 setting: maxIdleTime
  • HikariCP setting: idleTimeout

Default values:

              c3p0         HikariCP      I recommend
maxIdleTime   0 (never)    10 minutes    1..10 minutes

It’s probably not a big deal, but every open connection

  • holds some resources inside the database
  • prevents other systems from getting connections to the same database (every database has a limit on the maximum number of connections)

That’s why it’s a good idea to close unused (idle) connections. I recommend setting this value to a finite period; several minutes is probably reasonable.
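
A minimal sketch using roughly 5 minutes as the idle limit (note the different units: c3p0 takes seconds, HikariCP milliseconds):

import com.mchange.v2.c3p0.ComboPooledDataSource;
import com.zaxxer.hikari.HikariConfig;

class IdleTimeConfig {
    static void configure(ComboPooledDataSource c3p0, HikariConfig hikari) {
        c3p0.setMaxIdleTime(300);        // seconds: close connections idle for more than 5 minutes
        hikari.setIdleTimeout(300_000);  // milliseconds: the same 5 minutes in HikariCP
    }
}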

minPoolSize

How many connections the pool should always keep open (even if they are unused).

  • c3p0 setting: minPoolSize
  • HikariCP setting: minimumIdle

Default values:

              c3p0    HikariCP          I recommend
minPoolSize   3       (max pool size)   0…N

For the same reasons, it’s probably a good idea to close unused connections. I would set this value to 0 or 1 in most cases. If some user unexpectedly decides to log in to your application at midnight, they will just wait a few more milliseconds. Not a big deal.
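
A minimal sketch of keeping the pool small while it is idle (the value 1 is just an example):

import com.mchange.v2.c3p0.ComboPooledDataSource;
import com.zaxxer.hikari.HikariConfig;

class MinPoolSizeConfig {
    static void configure(ComboPooledDataSource c3p0, HikariConfig hikari) {
        c3p0.setMinPoolSize(1);    // keep at most one spare connection around
        hikari.setMinimumIdle(1);  // HikariCP shrinks the pool down to this value after idleTimeout
    }
}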

maxConnectionAge

How long a connection may live in the pool (no matter whether it is idle or in use).

  • c3p0 setting: maxConnectionAge
  • HikariCP setting: maxLifetime

Default values:

                    c3p0          HikariCP      I recommend
maxConnectionAge    0 (forever)   30 minutes    say, 30 minutes

Just in case, it’s probably a good idea to close connections from time to time. It probably helps to avoid some memory leaks.

A quote from HikariCP documentation:

“We strongly recommend setting this value, and it should be several seconds shorter than any database or infrastructure imposed connection time limit.”
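
A minimal sketch with a 30-minute lifetime (again, c3p0 takes seconds and HikariCP milliseconds):

import com.mchange.v2.c3p0.ComboPooledDataSource;
import com.zaxxer.hikari.HikariConfig;

class ConnectionAgeConfig {
    static void configure(ComboPooledDataSource c3p0, HikariConfig hikari) {
        c3p0.setMaxConnectionAge(30 * 60);       // seconds: retire connections after 30 minutes
        hikari.setMaxLifetime(30 * 60 * 1000L);  // milliseconds: the same limit in HikariCP
    }
}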

unreturnedConnectionTimeout

One of the typical problems is a connection leak: some buggy code took a connection from the pool and didn’t return it. How can we detect this problem?

Fortunately, we have a good setting for this case:

  • c3p0 setting: unreturnedConnectionTimeout
  • HikariCP setting: leakDetectionThreshold

Default values:

                              c3p0       HikariCP    I recommend
unreturnedConnectionTimeout   disabled   disabled    5 minutes?

If any buggy code takes a connection and doesn’t return it within 5 minutes, the pool will forcibly reclaim the connection and write warnings like this:

[C3P0PooledConnectionPoolManager Logging the stack trace by which the overdue resource was checked-out.
java.lang.Exception: DEBUG STACK TRACE: Overdue resource check-out stack trace.
at com.mchange.v2.resourcepool.BasicResourcePool.checkoutResource()
at org.hibernate.loader.Loader.prepareQueryStatement(Loader.java:1885)
at domain.application.ApplicationReportSender.sendWeeklyReport(ApplicationReportSender.java:63)

It will help you find out where the guilty code is.
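
A minimal sketch of enabling leak detection with a 5-minute threshold (in c3p0 you also want the companion setting that records the checkout stack trace, otherwise the warning is far less useful):

import com.mchange.v2.c3p0.ComboPooledDataSource;
import com.zaxxer.hikari.HikariConfig;

class LeakDetectionConfig {
    static void configure(ComboPooledDataSource c3p0, HikariConfig hikari) {
        // c3p0: seconds; forcibly reclaim connections checked out for more than 5 minutes
        c3p0.setUnreturnedConnectionTimeout(5 * 60);
        // ...and log the stack trace of the code that checked the connection out
        c3p0.setDebugUnreturnedConnectionStackTraces(true);

        // HikariCP: milliseconds; it only logs a warning, it does not reclaim the connection
        hikari.setLeakDetectionThreshold(5 * 60 * 1000L);
    }
}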

Conclusion

I gave an overview of some connection pool settings; there are more of them. I gave some advice that seems reasonable from my experience. But your application may have a different load, and your users may have different behaviour. My advice may seem stupid to you.

No problem. Don’t trust me. But please, also

Don’t trust defaults.

Go check your pool settings!

Author: Andrei Solntsev

Software developer at Codeborne (Estonia).

Creator of selenide.org
