JVM Advent

The JVM Programming Advent Calendar

5 things you probably didn’t know about Java concurrency

Threads are at the heart of the Java programming language. When we run a hello world Java program, it runs on the main thread. From there, we can easily create more threads as we need them to make our application functional, responsive, and performant at the same time. Think about a web server; it handles hundreds of requests simultaneously. In Java, we achieve this using multiple threads. While threads are helpful, many developers dread them. That’s why, in this article, I will share 5 interesting threading concepts that beginner and intermediate developers might not know.

The program order and the execution order are not the same. 

When we write code, we assume it will be executed exactly the way we wrote it. In reality, this is not the case. The compiler (in particular the JIT compiler) and the processor may change the execution order to optimize the code, as long as the result of single-threaded code doesn’t change.

Look at the following code snippet: 

package ca.bazlur.playground;

import java.util.concurrent.Phaser;

public class ExecutionOrderDemo {
    private static class A {
        int x = 0;
    }

    private static final A sharedData1 = new A();
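    // Deliberately an alias: both fields refer to the same object,
    // so a write through sharedData1 can be observed through sharedData2.
    private static final A sharedData2 = sharedData1;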

    public static void main(String[] args) {
        var phaser = new Phaser(3);
        var t1 = new Thread(() -> {
            phaser.arriveAndAwaitAdvance();
            var l1 = sharedData1;
            var l2 = l1.x;
            var l3 = sharedData2;
            var l4 = l3.x;
            var l5 = l1.x;
            System.out.println("Thread 1: " + l2 + "," + l4 + "," + l5);
        });
        var t2 = new Thread(() -> {
            phaser.arriveAndAwaitAdvance();
            var l6 = sharedData1;
            l6.x = 3;
            System.out.println("Thread 2: " + l6.x);
        });
        t1.start();
        t2.start();
        phaser.arriveAndDeregister();
    }
}

The above code seems straightforward. Two threads work with the shared fields sharedData1 and sharedData2. When we execute the code, we might assume the output would be:

Thread 2: 3
Thread 1: 0,0,0

But if you run it a few times, you will see different output: 

Thread 2: 3
Thread 1: 3,0,3

Thread 2: 3
Thread 1: 0,0,3

Thread 2: 3
Thread 1: 3,3,3

Thread 2: 3
Thread 1: 0,3,0

Thread 2: 3
Thread 1: 0,3,3

I’m not claiming you can reproduce all of them on your machine, but all of them are possible.
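If we need one thread to reliably see what another thread wrote, we have to tell the JVM about it. One way to do that is a volatile field. The following is a minimal sketch of that idea, not part of the demo above; the class name VolatilePublication and its fields are made up for illustration. The write to payload happens before the volatile write to ready, and the reader only reads payload after it has seen ready become true, so it is guaranteed to see 42.

```java
class VolatilePublication {
    private int payload = 0;                 // plain field, written before the flag
    private volatile boolean ready = false;  // volatile flag used to publish the payload

    /** Starts a writer and a reader thread and returns the value the reader observed. */
    static int publishAndRead() {
        var shared = new VolatilePublication();
        var observed = new int[1];

        var writer = new Thread(() -> {
            shared.payload = 42;  // ordinary write
            shared.ready = true;  // volatile write: the write above cannot move after it
        });
        var reader = new Thread(() -> {
            while (!shared.ready) {        // volatile read, repeated until the flag is set
                Thread.onSpinWait();
            }
            observed[0] = shared.payload;  // sees 42 because of the volatile handshake
        });

        reader.start();
        writer.start();
        try {
            writer.join();
            reader.join();  // after join(), the main thread safely sees observed[0]
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return observed[0];
    }

    public static void main(String[] args) {
        System.out.println("Reader saw: " + publishAndRead());
    }
}
```

If ready were a plain boolean, the reader might spin forever or see a stale payload, which is the same kind of surprise as in the outputs above.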

You can read further about it here: https://foojay.io/today/java-thread-programming-part-4/

Java threads are limited.

Creating a thread is easy in Java. However, that doesn’t mean we can create as many as we want. Threads are limited. We can easily find out how many threads we can create on a particular machine with the following program:

package ca.bazlur.playground;

import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.LockSupport;

public class Playground {

    public static void main(String[] args) {
        var counter = new AtomicInteger();

        while (true) {
            new Thread(() -> {
                int count = counter.incrementAndGet();
                System.out.println("thread count = " + count);
                LockSupport.park();
            }).start();
        }
    }
}

The above program is a simple one. It creates threads in a loop and parks each of them. A parked thread is not scheduled to run, but it stays alive, so the system call to create it has still been made and the memory for its stack is still allocated. The program keeps creating threads until it cannot create any more, and then it fails with an OutOfMemoryError. We are interested in the last count printed before that error.

On my machine, I was able to create only 4065 threads. 
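Because threads are a limited resource, a common way to stay well below the limit is to reuse a fixed number of them through an executor instead of starting a new thread per task. Here is a small sketch of that idea; the class BoundedPoolDemo and the numbers 4 and 100 are only illustrative assumptions. It runs many tasks on a small pool and counts how many distinct threads actually did the work.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class BoundedPoolDemo {

    /** Runs taskCount tiny tasks on a pool of poolSize threads and returns how many distinct threads ran them. */
    static int distinctThreadsUsed(int poolSize, int taskCount) {
        Set<String> threadNames = ConcurrentHashMap.newKeySet();
        var pool = Executors.newFixedThreadPool(poolSize);
        try {
            for (int i = 0; i < taskCount; i++) {
                // Each task only records the name of the thread that executes it.
                pool.execute(() -> threadNames.add(Thread.currentThread().getName()));
            }
        } finally {
            pool.shutdown();  // stop accepting new tasks; queued tasks still run
        }
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return threadNames.size();
    }

    public static void main(String[] args) {
        System.out.println("distinct threads used = " + distinctThreadsUsed(4, 100));
    }
}
```

No matter how many tasks we submit, the number of threads never exceeds the pool size, so the program cannot run into the limit we measured above.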

Too many threads don’t guarantee better performance.

Sometimes we may naively think that, since we can create threads easily in Java, creating more of them will certainly boost application performance. Unfortunately, that assumption is flawed for the traditional threading model that Java provides today. Too many threads may, in fact, hurt application performance.

Let’s ask this question first: what is the optimal number of threads that maximizes the performance of an application?

Well, the answer isn’t straightforward; I wish it was. It very much depends on the type of work we are doing. 

If we have multiple independent tasks that are all computational and never block on external resources, then creating many threads will not improve performance much. For example, on an 8-core CPU, the optimal number of threads is about (8 + 1). In such a case, we may rely on the parallel streams introduced in Java 8. By default, a parallel stream uses the Fork/Join common pool, whose default parallelism is one less than the number of available processors because the thread that calls the stream also takes part in the work. That is sufficient for CPU-intensive work.

Adding more threads to the CPU-intensive work where nothing blocks will not result in better performance. Rather, we will waste resources. 

Note: The reason for having an extra thread is that even a compute-intensive thread occasionally takes a page fault or pauses for some other reason. (Ref: Java Concurrency in Practice by Brian Goetz, page 170)
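To make the CPU-bound case concrete, here is a small sketch that runs a purely computational task on a parallel stream and prints the number of processors and the parallelism of the common pool on the current machine. The class ParallelStreamDemo and the sum-of-squares task are just illustrative choices.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.stream.LongStream;

class ParallelStreamDemo {

    /** CPU-bound work with no blocking: sum of the squares of 1..n on the common pool. */
    static long sumOfSquaresParallel(long n) {
        return LongStream.rangeClosed(1, n).parallel().map(i -> i * i).sum();
    }

    /** The same computation on the calling thread only, for comparison. */
    static long sumOfSquaresSequential(long n) {
        return LongStream.rangeClosed(1, n).map(i -> i * i).sum();
    }

    public static void main(String[] args) {
        System.out.println("available processors    = " + Runtime.getRuntime().availableProcessors());
        System.out.println("common pool parallelism = " + ForkJoinPool.commonPool().getParallelism());
        System.out.println("parallel result         = " + sumOfSquaresParallel(1_000));
        System.out.println("sequential result       = " + sumOfSquaresSequential(1_000));
    }
}
```

Both versions return the same result; the parallel one simply spreads the work over the threads of the common pool, so there is no need to create extra threads by hand for this kind of task.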

However, suppose the tasks are I/O bound, that is, they depend on external communication (e.g. a database or a REST API). In that case, creating more threads makes sense. The reason is that while one thread waits on the REST API, other threads can go on and continue working.

Now again, we can ask, how many threads is too many threads for such a case? 

Well, it depends. There are no ideal numbers that fit all cases. So we have to do adequate testing to find out what is the best for our particular workload and application. 

However, in the most typical scenario, we have a mixed set of tasks, and things get complicated in such cases.

In “Java Concurrency in Practice,” Brian Goetz provided a formula that we can use in most cases. 

Number of threads = Number of Available Cores * (1 + Wait time / Service time)

Wait time is the time spent waiting, e.g. for I/O such as an HTTP response, or to acquire a lock.

Service time is the time spent on computation, e.g. processing the HTTP response, marshalling/unmarshalling, etc.

For example, an application calls an API and then processes the response. If the application server has 8 processors, the average response time of the API is 100ms, and the processing time of the response is 20ms, then the ideal number of threads would be:

N = 8 * ( 1 + 100/20)
  = 48

However, this is an oversimplification; adequate testing is always critical to figure out the number. 
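The calculation above can be written as a tiny helper. This is only a sketch of the simplified formula from this article; the class and method names are made up, and the wait and service times are numbers you would have to measure yourself.

```java
class ThreadPoolSizing {

    /**
     * Estimates a pool size as cores * (1 + waitTime / serviceTime).
     * Both times must use the same unit (milliseconds here).
     */
    static int estimateThreads(int cores, double waitTimeMs, double serviceTimeMs) {
        if (cores <= 0 || waitTimeMs < 0 || serviceTimeMs <= 0) {
            throw new IllegalArgumentException("cores and serviceTimeMs must be positive, waitTimeMs non-negative");
        }
        return (int) Math.ceil(cores * (1 + waitTimeMs / serviceTimeMs));
    }

    public static void main(String[] args) {
        // The example from the text: 8 cores, 100ms waiting, 20ms processing.
        System.out.println(estimateThreads(8, 100, 20)); // prints 48

        // On a real machine, the core count can be read at runtime.
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println(estimateThreads(cores, 100, 20));
    }
}
```

With a wait time of 0, the helper returns the number of cores, which matches the CPU-bound case discussed earlier (without the extra thread).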

Concurrency isn’t parallelism 

Sometimes we use concurrency and parallelism interchangeably, which isn’t correct. Although in Java we achieve both using threads, they are two different things.

“In programming, concurrency is the composition of independently executing processes, while parallelism is the simultaneous execution of (possibly related) computations. Concurrency is about dealing with lots of things at once. Parallelism is about doing lots of things at once.” 

The above definition, given by Rob Pike, is pretty accurate.

Ref: https://www.youtube.com/watch?v=f6kdp27TYZs

Suppose we have completely independent tasks that can be computed separately. In that case, those tasks are said to be parallel and can be run with a Fork/Join pool or a parallel stream.

On the other hand, if we have many tasks, some of them may depend on others. The way we compose and structure such tasks is what we call concurrency. Concurrency is all about structure. We may want to make progress on multiple tasks at the same time to achieve a particular result, not necessarily to finish any one of them faster.
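One way to express this kind of structure in Java is CompletableFuture. The sketch below is only an illustration; fetchUser, fetchOrderCount, and fetchGreeting are hypothetical stand-ins for remote calls. One task depends on another, one is independent, and the final result is composed from all of them.

```java
import java.util.concurrent.CompletableFuture;

class CompositionDemo {

    // Hypothetical stand-ins for remote calls; a real application would do I/O here.
    static String fetchUser(int id) { return "user-" + id; }
    static int fetchOrderCount(String user) { return user.length(); }
    static String fetchGreeting() { return "Hello"; }

    /** Composes three tasks: orders depends on user, greeting is independent of both. */
    static String buildSummary(int userId) {
        CompletableFuture<String> user =
                CompletableFuture.supplyAsync(() -> fetchUser(userId));
        CompletableFuture<Integer> orders =
                user.thenApplyAsync(CompositionDemo::fetchOrderCount);   // runs only after user completes
        CompletableFuture<String> greeting =
                CompletableFuture.supplyAsync(CompositionDemo::fetchGreeting); // may run alongside user

        return greeting
                .thenCombine(user, (g, u) -> g + ", " + u)
                .thenCombine(orders, (text, n) -> text + " (" + n + " orders)")
                .join();  // wait for the composed result
    }

    public static void main(String[] args) {
        System.out.println(buildSummary(7)); // prints "Hello, user-7 (6 orders)"
    }
}
```

Whether these tasks actually run at the same instant depends on the available cores. The structure, that is, which task waits for which, stays the same either way, and that structure is the concurrency part.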

Project Loom enables us to create millions of threads. 

In a previous point, I argued that having many threads doesn’t automatically mean a performance gain for the application. However, in the modern microservices era, we often communicate with many services to do a particular piece of work. In such a scenario, threads stay in a blocked state most of the time. While a modern OS can handle millions of open sockets, we cannot open that many communication channels because we are limited by the number of threads. What if we could create millions of threads, each of them using an open socket to deal with the external communication? That would certainly improve the throughput of our application.

I have discussed this idea in this article in detail: https://bazlur.com/2021/06/project-loom-the-light-at-the-end-of-the-tunnel/

To accommodate this idea, there is an initiative under way in Java called Project Loom. Using Project Loom, we can, in fact, create millions of virtual threads. For example, using the following code snippet, I was able to create 4.5 million threads on my machine, but you may be able to create more depending on your machine.

import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.LockSupport;

public class Main {

    public static void main(String[] args) {
        var counter = new AtomicInteger();

        // 4_576_279
        while (true) {
            Thread.startVirtualThread(() -> {
                int count = counter.incrementAndGet();
                System.out.println("thread count = " + count);
                LockSupport.park();
            });
        }
    }
}

To run this program, you need a Java 18 early-access build of Project Loom, which can be downloaded from here: http://jdk.java.net/loom/

You can run using the following command – 

java --source 18 --enable-preview Main.java

To know more about Project Loom: https://wiki.openjdk.java.net/display/loom/Main

If threads still feel daunting, I have started a Java threading series here: https://foojay.io/today/java-thread-programming-part-1/

That’s all for today, cheers! 

Author: A N M Bazlur Rahman

A N M Bazlur Rahman is a Software Engineer, Java Champion, Jakarta EE Ambassador, Author, Blogger, and Speaker. He has more than a decade of experience in the software industry, primarily with Java and Java-related technologies. He enjoys mentoring, writing, delivering talks at conferences, and contributing to open-source projects. He is the founder and current moderator of the Bangladesh Java User Group. He is an editor for the Java Queue at InfoQ and Foojay.io.
