Under the JVM hood – Classloaders

By Simon Maple, @sjmaple, ZeroTurnaround Technical Evangelist

Classloaders are a low level and often ignored aspect of the Java language among many developers. At ZeroTurnaround, our developers have had to live, breathe, eat, drink and almost get intimate with classloaders to produce the JRebel technology which interacts at a classloader level to provide live runtime class reloading, avoiding lengthy rebuilds/repackaging/redeploying cycles. Here are some of the things we’ve learnt around classloaders including some debugging tips which will hopefully save you time and potential headdesking in the future.

A classloader is just a plain java object

Yes, it’s nothing clever: other than the bootstrap classloader built into the JVM, a classloader is just a Java object! ClassLoader is an abstract class which you can extend with a class of your own. Here is a simplified view of the API:

public abstract class ClassLoader {
    public Class loadClass(String name);
    protected Class defineClass(byte[] b);
    public URL getResource(String name);
    public Enumeration getResources(String name);
    public ClassLoader getParent();
}

Looks pretty straightforward, right? Let’s take a look method by method. The central method is loadClass, which takes a String class name and returns the actual Class object. If you’ve used classloaders before, this is probably the method you know best, as it’s the one most used in day to day coding. defineClass is a final method in the JVM that takes a byte array, read from a file or a location on the network, and produces the same outcome: a Class object.

A classloader can also find resources from a classpath. It works in a similar way to the loadClass method. There are a couple of methods, getResource and getResources, which return a URL or an Enumeration of URLs which point to the resource which represents the name passed as input to the method.

Every classloader except the bootstrap classloader has a parent; getParent returns the classloader’s parent. This relationship has nothing to do with Java inheritance; it is a linked-list style connection. We will look into this in a little more depth later on.
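To make the linked-list idea concrete, here is a minimal sketch that walks the parent chain of a class's loader. The class name ParentChain and the helper chainOf are made up for this illustration; only getClassLoader and getParent come from the JDK. A null entry stands for the bootstrap classloader, which has no Java object of its own.

```java
import java.util.ArrayList;
import java.util.List;

public class ParentChain {

    // Hypothetical helper: returns the loader of the given class followed by
    // each parent in turn. The final null entry represents the bootstrap loader.
    static List<ClassLoader> chainOf(Class<?> type) {
        List<ClassLoader> chain = new ArrayList<>();
        ClassLoader loader = type.getClassLoader();
        while (loader != null) {
            chain.add(loader);
            loader = loader.getParent(); // follow the link, not the class hierarchy
        }
        chain.add(null); // bootstrap loader
        return chain;
    }

    public static void main(String[] args) {
        for (ClassLoader loader : chainOf(ParentChain.class)) {
            System.out.println(loader == null ? "bootstrap (null)" : loader.toString());
        }
    }
}
```

The exact loaders printed depend on the JVM version and on how the program is launched, so no particular output should be expected.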

Classloaders are lazy, so classes are only ever loaded when they are requested at runtime. Classes are loaded by the resource which invokes the class, so a class, at runtime, could be loaded by multiple classloaders depending on where they are referenced from and which classloader loaded the classes which referen… oops, I’ve gone cross-eyed! Let’s look at some code.

public class A {
    public void doSmth() {
        B b = new B();
        b.doSmthElse();
    }
}

Here we have class A calling the constructor of class B within its doSmth method. Under the covers, this is what is happening:

A.class.getClassLoader().loadClass("B");

The classloader which originally loaded class A is invoked to load the class B.
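A small sketch can confirm this behaviour. The wrapper class ImplicitLoading and the helper loadBExplicitly are invented for the example; A and B are nested here only to keep everything in one file. Asking A's classloader for B explicitly should hand back the very same Class object that the implicit reference in doSmth resolved to.

```java
public class ImplicitLoading {

    static class B {
        String doSmthElse() {
            return "hello from B";
        }
    }

    static class A {
        String doSmth() {
            B b = new B(); // B is resolved through the loader that defined A
            return b.doSmthElse();
        }
    }

    // Hypothetical helper: performs explicitly what the JVM does implicitly.
    static Class<?> loadBExplicitly() {
        try {
            // getName() returns the binary name, e.g. ImplicitLoading$B
            return A.class.getClassLoader().loadClass(B.class.getName());
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(new A().doSmth());
        System.out.println("same Class object: " + (loadBExplicitly() == B.class));
        System.out.println("same loader: " + (A.class.getClassLoader() == B.class.getClassLoader()));
    }
}
```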

Classloaders are hierarchical, but like children, they don’t always ask their parents

Every classloader has a parent classloader. When a classloader is asked for a class, it will typically go straight to the parent classloader first, calling loadClass, which may in turn ask its parent and so on. If two classloaders with the same parent are asked to load the same class, it would only be done once, by the parent. It gets very troublesome when two classloaders load the same class separately, as this can cause problems which we’ll look at later.
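The parent-first rule can be observed with two sibling loaders. In this sketch, which assumes nothing beyond the standard URLClassLoader, both siblings are given an empty search path, so any class they return must have come from the shared parent. The class ParentFirst and the method siblingsShareParentClass are names chosen for the illustration.

```java
import java.net.URL;
import java.net.URLClassLoader;

public class ParentFirst {

    // Hypothetical helper: loads the same class through two sibling loaders
    // and reports whether both got the single copy owned by the parent chain.
    static boolean siblingsShareParentClass(String className) {
        ClassLoader parent = ParentFirst.class.getClassLoader();
        try (URLClassLoader first = new URLClassLoader(new URL[0], parent);
             URLClassLoader second = new URLClassLoader(new URL[0], parent)) {
            Class<?> viaFirst = first.loadClass(className);
            Class<?> viaSecond = second.loadClass(className);
            // Neither sibling defined the class itself; the request was delegated up.
            return viaFirst == viaSecond
                    && viaFirst.getClassLoader() != first
                    && viaSecond.getClassLoader() != second;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(siblingsShareParentClass("java.lang.String"));
        System.out.println(siblingsShareParentClass(ParentFirst.class.getName()));
    }
}
```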

When the JEE spec was designed, the web classloader was designed to work the opposite way – great. Let’s take a look at the figure below as our example.  
 


Module WAR1 has its own classloader and prefers to load classes itself rather than delegate to its parent, the classloader scoped by App1.ear. This means different WAR modules, like WAR1 and WAR2, cannot see each other’s classes. The App1.ear module has its own classloader and is parent to the WAR1 and WAR2 classloaders. The App1.ear classloader is used by the WAR1 and WAR2 classloaders when they need to delegate a request up the hierarchy, i.e. when a class is required from outside the WAR classloader scope. Effectively the WAR classes override the EAR classes where both exist. Finally, the EAR classloader’s parent is the container classloader. The EAR classloader will delegate requests to the container classloader, but it does not do it in the same way as the WAR classloader, as the EAR classloader will actually prefer to delegate up rather than prefer local classes. As you can see, this is getting quite hairy and is different to the plain JSE class loading behaviour.
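The "prefer local classes" behaviour can be imitated in a few lines. The following is only a sketch and not how any particular application server implements it: ChildFirstLoader is a made-up class that defines one named class itself (reading the bytes through its parent, as a stand-in for looking inside the WAR) and delegates everything else upwards. It assumes Java 9 or later for InputStream.readAllBytes and assumes the compiled class files are reachable as resources.

```java
import java.io.IOException;
import java.io.InputStream;

public class ChildFirstDemo {

    // Stand-in for a class that is packaged inside each WAR.
    public static class Util { }

    // Hypothetical child-first loader: "local" for exactly one class name.
    static class ChildFirstLoader extends ClassLoader {
        private final String localClassName;

        ChildFirstLoader(ClassLoader parent, String localClassName) {
            super(parent);
            this.localClassName = localClassName;
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
            if (!name.equals(localClassName)) {
                return super.loadClass(name, resolve); // usual parent-first path
            }
            synchronized (getClassLoadingLock(name)) {
                Class<?> c = findLoadedClass(name);
                if (c == null) {
                    c = defineFromParentBytes(name); // local copy wins
                }
                if (resolve) {
                    resolveClass(c);
                }
                return c;
            }
        }

        private Class<?> defineFromParentBytes(String name) throws ClassNotFoundException {
            String path = name.replace('.', '/') + ".class";
            try (InputStream in = getParent().getResourceAsStream(path)) {
                if (in == null) {
                    throw new ClassNotFoundException(name);
                }
                byte[] bytes = in.readAllBytes();
                return defineClass(name, bytes, 0, bytes.length);
            } catch (IOException e) {
                throw new ClassNotFoundException(name, e);
            }
        }
    }

    // Two "WAR" loaders with the same "EAR" parent each get their own Util.
    static boolean warsAreIsolated() {
        ClassLoader ear = ChildFirstDemo.class.getClassLoader();
        String utilName = Util.class.getName();
        try {
            Class<?> inWar1 = new ChildFirstLoader(ear, utilName).loadClass(utilName);
            Class<?> inWar2 = new ChildFirstLoader(ear, utilName).loadClass(utilName);
            return inWar1 != inWar2
                    && inWar1 != Util.class
                    && inWar1.getName().equals(inWar2.getName());
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("WAR loaders isolated: " + warsAreIsolated());
    }
}
```

The three Class objects share a name but are distinct types to the JVM, which is exactly why WAR1 and WAR2 cannot see each other's classes.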

The flat classpath

We talked about how the system classloader looks to the classpath to find classes that have been requested. This classpath could include directories or JAR files, and the order in which they are searched is actually dependent on the JVM you are using. There may be multiple copies or versions of the class you require on the classpath, but you will always get the first instance of the class found on the classpath. It’s essentially just a list of resources, which is why it’s referred to as flat. As a result, the classpath list can often be relatively slow to iterate through when looking for a resource.
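Since only the first copy wins, it can help to list every copy a classloader can see. This sketch uses the getResources method from the API shown earlier; the class DuplicateFinder and the method copiesOf are names invented for the example, and com.example.Missing is a deliberately nonexistent class.

```java
import java.io.IOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

public class DuplicateFinder {

    // Hypothetical helper: returns the location of every copy of the class
    // file that the given loader (and its parents) can see.
    static List<URL> copiesOf(String className, ClassLoader loader) {
        String path = className.replace('.', '/') + ".class";
        try {
            return Collections.list(loader.getResources(path));
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        ClassLoader loader = DuplicateFinder.class.getClassLoader();
        for (String name : new String[] {"java.lang.String", "com.example.Missing"}) {
            List<URL> copies = copiesOf(name, loader);
            System.out.println(name + ": " + copies.size() + " copy/copies");
            copies.forEach(url -> System.out.println("  " + url));
        }
    }
}
```

More than one URL for the same class name is a hint that two JARs on the classpath are competing.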

Problems can occur when applications that are using the same classpath want to use different versions of a class. Let’s use Hibernate as an example. When two versions of the Hibernate JARs exist on the classpath, one version cannot be higher up the classpath for one application than it is for the other, which means both will have to use the same version. One way around this is to bloat the application (WAR) with all the libraries necessary, so that each application uses its local resources, but this then leads to big applications which are hard to maintain. Welcome to JAR hell! OSGi provides a solution here as it allows versioning of JAR files, or bundles, which results in a mechanism to allow wiring to particular versions of JAR files, avoiding the flat classpath problems.

How do I debug my class loading errors?

NoClassDefFoundError/ClassNotFoundException?

 

So, you’ve got an error/exception like the ones above. Well, does the class actually exist? Don’t bother looking in your IDE: that’s where you compiled your class, so it must be there, otherwise you’d get a compile-time error. This is a runtime problem, so it’s in the runtime that we want to look for the class which it says we’re missing… but where do you start? Consider the following piece of code…

Arrays.toString(((URLClassLoader) Test.class.getClassLoader())
.getURLs());

This code returns a String listing all the JARs and directories on the classpath of the classloader the class Test is using. So now we can see if the JAR or location our mystery class should exist in is actually on the classpath. If it is not there, add it! If it is there, check the JAR/directory to make sure your class actually exists in that location and add it if it’s missing. These are the two typical problems which result in this error case.
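One caveat: from Java 9 onwards the application classloader is no longer a URLClassLoader, so the cast above throws a ClassCastException when Test was loaded from the plain classpath. A hedged alternative for that case is to read the java.class.path system property instead. The class ClasspathDump is a made-up name, and this sketch only covers the system classpath, not the search path of a web or custom classloader.

```java
import java.io.File;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class ClasspathDump {

    // Hypothetical helper: splits the JVM's classpath into its entries.
    static List<String> classpathEntries() {
        String classpath = System.getProperty("java.class.path", "");
        // File.pathSeparator is ":" on Unix-like systems and ";" on Windows
        return Arrays.asList(classpath.split(Pattern.quote(File.pathSeparator)));
    }

    public static void main(String[] args) {
        for (String entry : classpathEntries()) {
            // Flag entries that do not exist on disk, a common cause of missing classes
            String status = new File(entry).exists() ? "ok" : "MISSING";
            System.out.println(status + "  " + entry);
        }
    }
}
```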

NoSuchMethodError/NoSuchFieldError/AbstractMethodError/IllegalAccessError?

 

Now it’s getting interesting! These are all subclasses of IncompatibleClassChangeError. We know the classloader has found the class we want (by name), but clearly it hasn’t found the right version. Here we have a class called Test which is making an invocation to another class, Util, but BANG – we get an error! Let’s look at the next snippet of code to debug:

Test.class.getClassLoader().getResource(Util.class.getName()
.replace('.', '/') + ".class");

We’re calling getResource on the classloader of class Test. This returns us the URL of the Util resource. Notice we’ve replaced the ‘.’ with a ‘/’ and added a ‘.class’ at the end of the String. This changes the package and classname of the class we’re looking for (from the perspective of the classloader) into a directory structure and filename on the filesystem – neat. This will show us the exact class we have loaded and we can make sure it’s the correct version. We can use javap -private on the class at a command prompt to list its methods and fields, including the private ones (add -c if you also want to see the byte code), and check which of them actually exist. You can easily see the structure of the class and validate whether it’s you or the Java runtime which is going crazy! Believe me, at one stage or another you’ll question both, and nearly every time it will be you! :o)
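The snippet can be wrapped in a small reusable helper. In this sketch, WhereIsClass and locate are names made up for the illustration, and com.example.Missing is a deliberately nonexistent class. The helper takes the class name as a String, so the calling code does not itself need to be able to resolve the class it is asking about.

```java
import java.net.URL;

public class WhereIsClass {

    // Hypothetical helper: returns the URL the loader would read the class
    // file from, or null if the loader cannot see such a resource.
    static URL locate(ClassLoader loader, String className) {
        String resource = className.replace('.', '/') + ".class";
        return loader.getResource(resource);
    }

    public static void main(String[] args) {
        ClassLoader loader = WhereIsClass.class.getClassLoader();
        System.out.println(locate(loader, "java.lang.String"));
        System.out.println(locate(loader, WhereIsClass.class.getName()));
        System.out.println(locate(loader, "com.example.Missing")); // no such class
    }
}
```

The printed location is the JAR or directory to inspect with javap.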

LinkageError/ClassCastException/IllegalAccessError

 

These can occur if two different classloaders load the same class and they try to interact… ouch! Yes, it’s now getting a bit hairy. This can cause problems as we do not know if they will load the classes from the same place. How can this happen? Let’s look at the following code, still in the Test class:

Factory.instance().sayHello();

The code looks pretty clean and safe, and it’s not clear how an error could emerge from this line. We’re calling a static factory method to get us an instance of the Util class and are invoking a method on it. Let’s look at this supporting image to show the reason why an exception is being thrown.


Here we can see a web classloader (which loaded the Test class) will prefer local classes, so when it makes reference to a class, it will be loaded by the web classloader, if possible. Fairly straightforward so far. The Test class uses the Factory class to get hold of an instance of the Util class, which is fairly typical practice in Java, but the Factory class doesn’t exist in the WAR as it is an external library. This is no problem as the web classloader can delegate to the shared classloader, which can see the Factory class. Note that the shared classloader is now loading its own version of the Util class, as when the Factory instantiates the class, it uses the shared classloader (as shown in the first example earlier). The Factory class returns the Util object (created by the shared classloader) back to the WAR, which then tries to use the class, and effectively cast the class to a potentially different version of the same class (the Util class visible to the web classloader). BOOM!

We can run the same code as before from within both places (The Factory.instance() method and the Test class) to see where each of our Util classes are being loaded from.

Test.class.getClassLoader().getResource(Util.class.getName()
.replace('.', '/') + ".class");
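The failure itself can be reproduced in a single file. This is a simplified sketch of the scenario, not the exact web/shared classloader setup described above: CopyingLoader is a made-up loader that defines a second copy of Util from the same bytes, playing the role of the "other" classloader. It assumes Java 9 or later for InputStream.readAllBytes and assumes the compiled class file is reachable as a resource.

```java
import java.io.IOException;
import java.io.InputStream;

public class TwoLoadersDemo {

    public static class Util {
        public String sayHello() {
            return "hello";
        }
    }

    // Hypothetical loader that defines its own copy of an existing class.
    static class CopyingLoader extends ClassLoader {
        CopyingLoader(ClassLoader parent) {
            super(parent);
        }

        Class<?> defineCopyOf(Class<?> original) throws IOException {
            String path = original.getName().replace('.', '/') + ".class";
            try (InputStream in = original.getClassLoader().getResourceAsStream(path)) {
                if (in == null) {
                    throw new IOException("class file not found: " + path);
                }
                byte[] bytes = in.readAllBytes();
                return defineClass(original.getName(), bytes, 0, bytes.length);
            }
        }
    }

    // Returns true if casting an instance of the copied Util to our Util fails.
    static boolean castFails() {
        try {
            CopyingLoader other = new CopyingLoader(TwoLoadersDemo.class.getClassLoader());
            Class<?> copy = other.defineCopyOf(Util.class);
            Object fromOtherLoader = copy.getDeclaredConstructor().newInstance();
            try {
                Util util = (Util) fromOtherLoader; // same name, different defining loader
                return false;
            } catch (ClassCastException expected) {
                return true;
            }
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("cast fails: " + castFails());
    }
}
```

Even though both classes come from identical bytes, the JVM identifies a class by its name together with its defining classloader, so the cast is rejected.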

Hopefully this has given you an insight into the world of classloading, and instead of not understanding the classloader, you can now appreciate it with a hint of fear and uncertainty! Thanks for reading and making it to the end. We’d all like to wish you a Merry Christmas and a happy new year from ZeroTurnaround!  Happy coding!
 
Meta: this post is part of the Java Advent Calendar and is licensed under the Creative Commons 3.0 Attribution license. If you like it, please spread the word by sharing, tweeting, FB, G+ and so on! Want to write for the blog? We are looking for contributors to fill all 24 slot and would love to have your contribution! Contact Attila Balazs to contribute!

Pleading for Java

This post is based on an article I wrote almost 10 years ago in the GInfo Computer Science Magazine. I am surprised things have not changed much since then. So, here I go again…
In Romanian high schools, the programming languages used for teaching Computer Science are Pascal and C++. In college, the students learn other programming languages they might need in order to become software engineers. Unfortunately, this is true only from a theoretical perspective. Most Romanian software engineers claim they had to learn by themselves…
Imagine a junior high school student whose dream is to become a software engineer. She starts by learning how to implement more and more complicated algorithms (using the above mentioned languages) and then also learns something about databases.
When she graduates from high school, she realizes she knows nothing useful. She knows how to solve many problems, but she has no idea how to create a simple application with a friendly user interface. She decides to go to college so that she can learn more…
Therefore, the student must go to college because she does not learn enough in high school. In my opinion, a high school graduate should be able to work as a software engineer without a college diploma. She does not have to be one of the best, but she should be able to join the crowd. Why? Because it is possible. There are many college students who work as programmers while still studying, and there are also enough experienced programmers who do not have a university degree. Of course, they learn by themselves, but they learn. Therefore, it is possible!
About 20 years ago, high school students started by learning QBasic. Someone decided this cannot go on and, at some point, the first programming language was Pascal. A few years later the C language was introduced as an alternative to Pascal.
Things changed, so change is possible. Was it hard? Yes! Why? Because it would be harder for the kids to learn Pascal than it was to learn Basic. This was the official answer. Unfortunately, the real answer was something like: it would be harder for the teachers to teach Pascal than it was to teach Basic. Still, the change was made. The Basic language was forgotten; Pascal and C were introduced. Then, someone came up with the idea to start with C. The war between Basic and Pascal was replaced by a war between Pascal and C. Why? Because it would be harder for the kids to learn C than it was to learn Pascal.
I remember a computer science teacher saying that a math teacher graduates from college and is able to teach math for the next 50 years without having to learn anything new. The two teachers have the same salary. Therefore, why should the computer science teacher have to learn new things? Because she teaches computer science, not math (I would say)!
Anyway, another small (half-)step was made. The programming language is no longer specified in the curriculum. The teacher may choose Pascal or C or something else (Algol, Fortran, Java, C# or anything else that comes to mind). Again, the theory is nice. In practice, for the Maturity exams at the end of high school, the solutions must be written in Pascal or C.
These were useful steps (or half-steps), but they are not enough. The current curriculum creates a few stars (who win medals in international olympiads) and many unknowledgeable students who cannot tell the difference between sorting an array and finding the last digit of a number. And… so what? Someone will teach them to draw some nice user interfaces using Delphi and they will become software engineers.
I think the effort to teach algorithms to all computer science students is useless. Algorithms might be necessary, but it is more important to know a modern programming language.
Let’s see now whether it really is more difficult for a student to learn Java rather than C or Pascal. The for is still a for, the while is still a while, the if is still an if. Therefore, the control structures that must be explained are the same. The subroutines are also the same. Some claim the Java syntax is more difficult. It might be, but it is practically identical to the C++ syntax. But we have objects in Java, and the teacher must explain difficult concepts like inheritance, polymorphism etc. Does she really have to? Of course, but not from the beginning.
The teacher can start without adding anything complicated to the current style. Then, object oriented paradigms can be added. I think people are missing an important point: at the beginning, the student knows nothing. Why should we believe it is easier for the student to learn structured programming rather than object oriented programming? The only reason is: we think structured programming is easier. I am confident that the engineers who know both techniques have no problem using either of them. They do not think one of them is easier than the other. They both have some basic things that must be understood.
I claim that the object oriented paradigms are not more difficult to understand, and they are more useful. Why aren’t they more difficult? Because the student knows nothing. She is not accustomed to working based on some principles; the teachers are… Any new things can be assimilated; it all depends on the talent of the teacher.
I will now try to show that the Java language is a viable alternative. First of all, Java is a language that has been accepted by the software engineering community. It is not just a trendy language that will disappear in a few years. There are many Java technologies that are developed based on the core language. Even if one cannot expect JSP or Hibernate to be taught in high school, the student will be able to easily learn all these technologies if she knows the Java basics. Java is an object oriented language. Most languages actually used for developing software are object oriented. If one knows such a language, it will be easy to learn the others (when needed).
Please allow me to present some examples that will show how good the “coffee” actually is.
Operations with big integers are a pain for all students who take part in programming competitions. In most contests there is at least one problem that needs the implementation of such operations. If division is necessary, everything turns into a nightmare.
By using Java we do not have this problem. Someone already implemented these operations and we only have to use the already written code. The following example shows how easy it is to use big integers:
import java.math.BigInteger;
public class BigIntegers {
  public static void main(String[] args) {
    BigInteger a = new BigInteger("2458");
    BigInteger b = new BigInteger("13");
    System.out.println(a + " + " + b + " = " + a.add(b));
    System.out.println(a + " - " + b + " = " + a.subtract(b));
    System.out.println(a + " * " + b + " = " + a.multiply(b));
    System.out.println(a + " / " + b + " = " + a.divide(b));
    System.out.println(a + " % " + b + " = " + a.remainder(b));
  }
}
But it does not stop here. We have operations for decimal numbers too:
import java.math.BigDecimal;
import java.math.RoundingMode;
public class BigDecimals {
  public static void main(String[] args) {
    BigDecimal a = new BigDecimal("2458");
    BigDecimal b = new BigDecimal("13");
    System.out.println(a + " + " + b + " = " + a.add(b));
    System.out.println(a + " - " + b + " = " + a.subtract(b));
    System.out.println(a + " * " + b + " = " + a.multiply(b));
    // Keep 5 decimal places and truncate; RoundingMode replaces the deprecated BigDecimal.ROUND_DOWN constant
    System.out.println(a + " / " + b + " = " + a.divide(b, 5, RoundingMode.DOWN));
  }
}
One may say that, given a few hours, an engineer would have implemented all that. That might be true, but it would be a couple of wasted hours. But, Java allows us to save days or even months in some cases. How long do you think it would take you to implement the compressing of a regular file into a zip file? You would probably need a few days just to study the format of the zip file.
I think the next sequence of code needs no further comments:
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;
public class ZIP {
  public static void main(String[] args) throws Exception {
    // try-with-resources closes both streams, even if an exception is thrown
    try (ZipOutputStream zo = new ZipOutputStream(new FileOutputStream("myfile.zip"));
         FileInputStream in = new FileInputStream("myfile.txt")) {
      zo.putNextEntry(new ZipEntry("myfile.txt"));
      byte[] b = new byte[16384];
      int i;
      // Copy the file into the zip entry one buffer at a time
      while ((i = in.read(b)) > 0) {
        zo.write(b, 0, i);
      }
      zo.closeEntry();
    }
  }
}
Like I stated 10 years ago, I will not draw any conclusions. I am waiting for other opinions…
P.S.
I apologize for my English. It is not what it should be… 🙂
