JVM Advent

The JVM Programming Advent Calendar

JVM Hello World

Writing a “Hello World” program is often a rite of passage for a software engineer when learning a new language.

If you’re a Java developer, you might even remember the first time you typed public static void main(String[] args) in your editor of choice. But did you ever wonder what’s inside the “.class” file that the compiler spits out? Let’s look at how we can write a JVM “Hello World” by creating a class file programmatically.

We’ll work through creating a class file for the following simple Java Hello World application.

public class HelloWorld {
    public static void main(String[] args) {
        System.out.println("Hello World");
    }
}

By the end of this post you’ll have taken your first steps into the world of Java bytecode and be able to generate a Java class file without a Java compiler (OK, technically we’ll still need a Java compiler, since we’re going to write Java code to generate the class file!).

What is a class file anyway?

A Java class file is a container for a compiled Java class, interface, enum or record definition, along with its members such as fields and methods. The methods in turn contain the Java bytecode instructions that will be executed by a Java Virtual Machine (JVM).

At a high level, a Java class file, as defined in the Java Virtual Machine Specification, has the following structure:

  • The magic number 0xCAFEBABE used to identify the file as a Java class file
  • The major and minor version of the class file
  • A constant pool containing all the literal constants used within the class file
  • Access flags indicating whether the class is public, abstract etc
  • The name of the class and its superclass
  • The list of interfaces implemented by the class
  • Fields and methods
  • Attributes
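To make the first few items of that list concrete, here is a minimal sketch that reads the magic number, the version and the constant pool count from the start of a class file using only the JDK. The class name ClassFileHeader and the ten sample bytes are made up for illustration; the field order and sizes follow the class file layout in the JVM specification.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class ClassFileHeader {

    /** Returns {magic, minorVersion, majorVersion, constantPoolCount} read from the start of a class file. */
    public static int[] readHeader(byte[] classFile) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(classFile))) {
            int magic = in.readInt();               // u4 magic
            int minor = in.readUnsignedShort();     // u2 minor_version
            int major = in.readUnsignedShort();     // u2 major_version
            int poolCount = in.readUnsignedShort(); // u2 constant_pool_count (number of entries + 1)
            return new int[] {magic, minor, major, poolCount};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Sample bytes (an assumption for this demo): the first ten bytes of a
        // version 50 (Java 6) class file whose constant pool has four entries.
        byte[] header = {
            (byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0, 0, 0, 50, 0, 5
        };
        int[] fields = readHeader(header);
        System.out.printf("magic=0x%08X minor=%d major=%d constant_pool_count=%d%n",
            fields[0], fields[1], fields[2], fields[3]);
    }
}
```

Everything after these ten bytes (the constant pool entries, access flags, members and attributes) is variable-length, which is why reading or writing it by hand quickly becomes tedious.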

In this post, we’re going to write code that generates a class with a main method, and that method will contain the bytecode instructions to print “Hello World”.

Creating a class

So, how can we create a Java class file without starting from Java source code? Technically, a class file is just a bunch of bytes so we could just start writing out a stream of bytes:

DataOutputStream dataOutputStream =
    new DataOutputStream(
    new FileOutputStream("HelloWorld.class"));

dataOutputStream.writeInt(0xCAFEBABE);
//...
dataOutputStream.close();

But once we get past the magic number, things get more complicated and we’d benefit from a higher-level API to help us out.

This is where a library like ProGuardCORE, ASM or ByteBuddy comes in handy.

ProGuardCORE

ProGuardCORE is a Java bytecode manipulation & analysis library that contains the tools required to read, write and manipulate Java class files and their bytecode. It abstracts away some of the details and provides model classes, editors and builders for all things class file related.

To create a representation of a Java class for our Hello World program we can use the ClassBuilder utility. We need to provide, at minimum, the Java class file version, the access flags, the class name and the superclass name:

ClassBuilder classBuilder = new ClassBuilder(
/* version     = */ CLASS_VERSION_1_6,
/* accessFlags = */ PUBLIC,
/* className   = */ "HelloWorld",
/* superClass  = */ "java/lang/Object"
);

ProgramClass helloWorldClass =
    classBuilder.getProgramClass();

Using ProGuardCORE

ProGuardCORE is published to Maven Central, so you can simply create a new Java project and add a dependency to start using it. For example, a Gradle build.gradle file could look like the following:

plugins {
    id 'java'
}

repositories {
    mavenCentral()
}

dependencies {
    implementation 'com.guardsquare:proguard-core:9.0.6'
}

Writing a Java class file

Once we’ve created a Java class representation in memory we can write it to a file with a ProgramClassWriter.

ProGuardCORE heavily uses the visitor pattern to implement functionality that can be applied to the model classes. The ProgramClassWriter visitor implements the functionality to write the class model to an output stream.
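If the visitor pattern is new to you, the following toy sketch shows the idea in plain Java. The names ToyClass, ToyClassVisitor and ToyClassDescriber are hypothetical and are not part of ProGuardCORE; they only mirror the shape of “a model class accepts a visitor, and the visitor carries the behaviour”.

```java
public class VisitorSketch {

    /** Hypothetical visitor interface: one callback per kind of model element. */
    public interface ToyClassVisitor {
        void visitToyClass(ToyClass toyClass);
    }

    /** Hypothetical model class: it holds data and only knows how to hand itself to a visitor. */
    public static final class ToyClass {
        final String name;
        final String superName;

        public ToyClass(String name, String superName) {
            this.name = name;
            this.superName = superName;
        }

        public void accept(ToyClassVisitor visitor) {
            visitor.visitToyClass(this);
        }
    }

    /** A "writer"-style visitor: the output logic lives here, not in the model class. */
    public static final class ToyClassDescriber implements ToyClassVisitor {
        private final StringBuilder out;

        public ToyClassDescriber(StringBuilder out) {
            this.out = out;
        }

        @Override
        public void visitToyClass(ToyClass toyClass) {
            out.append("class ").append(toyClass.name)
               .append(" extends ").append(toyClass.superName);
        }
    }

    /** Builds a toy class, lets the describer visit it, and returns what the visitor produced. */
    public static String describe(String name, String superName) {
        StringBuilder out = new StringBuilder();
        new ToyClass(name, superName).accept(new ToyClassDescriber(out));
        return out.toString();
    }

    public static void main(String[] args) {
        // prints: class HelloWorld extends java/lang/Object
        System.out.println(describe("HelloWorld", "java/lang/Object"));
    }
}
```

The benefit of this arrangement is that new operations (writing, printing, analysing) can be added as new visitors without touching the model classes.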

The class can be written to a file HelloWorld.class using a DataOutputStream, a FileOutputStream and a ProgramClassWriter as follows:

ClassBuilder classBuilder = new ClassBuilder(
/* version     = */ CLASS_VERSION_1_6,
/* accessFlags = */ PUBLIC,
/* className   = */ "HelloWorld",
/* superClass  = */ "java/lang/Object"
);

ProgramClass helloWorldClass = 
    classBuilder.getProgramClass();

DataOutputStream dataOutputStream =
    new DataOutputStream(
    new FileOutputStream("HelloWorld.class"));

helloWorldClass.accept(
    new ProgramClassWriter(dataOutputStream));

dataOutputStream.close();

You can now use the command line tool javap to check that we’ve created a valid class file:

$ javap -c -v -p HelloWorld.class
Classfile HelloWorld.class
Last modified 12 Nov 2022; size 62 bytes
SHA-256 checksum 650610b365dac2ca00fee4b090a6089b90d0086c862141a3ac43030911f07489
public class HelloWorld
minor version: 0
major version: 50
flags: (0x0001) ACC_PUBLIC
this_class: #2                          // HelloWorld
super_class: #4                         // java/lang/Object
interfaces: 0, fields: 0, methods: 0, attributes: 0
Constant pool:
#1 = Utf8               HelloWorld
#2 = Class              #1              // HelloWorld
#3 = Utf8               java/lang/Object
#4 = Class              #3              // java/lang/Object
{
}

Notice that the generated file already contains the class name, version, superclass and a small constant pool containing the strings representing the class and superclass names.

There are, however, no fields or methods in the class!
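To see what those 62 bytes are made of, here is a sketch that writes an equivalent empty class by hand with a DataOutputStream, using only the JDK. The class name EmptyClassByHand is made up for this example. The constant pool order follows the javap output above, but this has not been checked byte for byte against what ProGuardCORE writes, so treat it as an illustration of the format rather than a reproduction of the library’s output.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class EmptyClassByHand {

    /** Assembles a class file for an empty public class HelloWorld that extends java/lang/Object. */
    public static byte[] emptyHelloWorldClass() {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(0xCAFEBABE);           // magic
            out.writeShort(0);                  // minor_version
            out.writeShort(50);                 // major_version (50 = Java 6)
            out.writeShort(5);                  // constant_pool_count = number of entries + 1

            out.writeByte(1);                   // #1 CONSTANT_Utf8 tag
            out.writeUTF("HelloWorld");         //    writeUTF emits the u2 length, then the bytes
            out.writeByte(7);                   // #2 CONSTANT_Class tag
            out.writeShort(1);                  //    name_index -> #1
            out.writeByte(1);                   // #3 CONSTANT_Utf8 tag
            out.writeUTF("java/lang/Object");
            out.writeByte(7);                   // #4 CONSTANT_Class tag
            out.writeShort(3);                  //    name_index -> #3

            out.writeShort(0x0001);             // access_flags: ACC_PUBLIC
            out.writeShort(2);                  // this_class  -> #2
            out.writeShort(4);                  // super_class -> #4
            out.writeShort(0);                  // interfaces_count
            out.writeShort(0);                  // fields_count
            out.writeShort(0);                  // methods_count
            out.writeShort(0);                  // attributes_count
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        // prints: 62 bytes
        System.out.println(emptyHelloWorldClass().length + " bytes");
    }
}
```

Even for an empty class we had to number the constant pool entries ourselves and keep every index in sync, which is exactly the bookkeeping the ClassBuilder takes care of.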

Adding a main method

Adding a method using the ClassBuilder is easy with the addMethod builder methods. You must provide, at minimum, the access flags, the name and the descriptor (see “Type descriptors” below):

ClassBuilder classBuilder = new ClassBuilder(
/* version     = */ CLASS_VERSION_1_6,
/* accessFlags = */ PUBLIC,
/* className   = */ "HelloWorld",
/* superClass  = */ "java/lang/Object"
);

ProgramClass helloWorldClass =
    classBuilder.getProgramClass();

classBuilder.addMethod(
    PUBLIC | STATIC,
    "main",
    "([Ljava/lang/String;)V"
);

DataOutputStream dataOutputStream =
    new DataOutputStream(
    new FileOutputStream("HelloWorld.class"));

helloWorldClass.accept(
    new ProgramClassWriter(dataOutputStream));

dataOutputStream.close();

If you try to run the generated class file now, you’ll receive an error:

$ java HelloWorld
Error: LinkageError occurred while loading main class HelloWorld
java.lang.ClassFormatError: Absent Code attribute in method that is not native or abstract in class file HelloWorld

We added a method, but the method doesn’t contain any code!

Type descriptors

As you may have noticed, the descriptor doesn’t look like the method signature you would write in Java source code.

The types in descriptors in Java class files are encoded using characters that represent the JVM types, and class names are always fully qualified, with / as the separator instead of a dot (.).

For example, the descriptor for the Java main method (public static void main(String[] args)) is ([Ljava/lang/String;)V: the parentheses enclose the parameter types, here a single String array, and the trailing V stands for the void return type.

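| Character   | Java type |
|-------------|-----------|
| B           | byte      |
| C           | char      |
| D           | double    |
| F           | float     |
| I           | int       |
| J           | long      |
| LClassName; | class     |
| S           | short     |
| Z           | boolean   |
| [           | array     |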
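You don’t have to encode descriptors by hand while experimenting: the JDK’s java.lang.invoke.MethodType can produce them from Class objects. The sketch below is only a convenience for checking your understanding of the table (the class name DescriptorDemo is made up); ProGuardCORE itself simply takes the descriptor as a string.

```java
import java.lang.invoke.MethodType;

public class DescriptorDemo {

    /** Descriptor of: void main(String[] args) */
    public static String mainDescriptor() {
        return MethodType.methodType(void.class, String[].class).toMethodDescriptorString();
    }

    /** Descriptor of: void println(String s) */
    public static String printlnDescriptor() {
        return MethodType.methodType(void.class, String.class).toMethodDescriptorString();
    }

    /** Descriptor of a made-up method: boolean check(long id, double[] values) */
    public static String mixedDescriptor() {
        return MethodType.methodType(boolean.class, long.class, double[].class).toMethodDescriptorString();
    }

    public static void main(String[] args) {
        System.out.println(mainDescriptor());    // ([Ljava/lang/String;)V
        System.out.println(printlnDescriptor()); // (Ljava/lang/String;)V
        System.out.println(mixedDescriptor());   // (J[D)Z
    }
}
```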

Java bytecode instructions

We’ll need to add some code to our main method to actually get our Hello World program to print “Hello World”. The code that we need to generate is, of course, Java bytecode.

Since our Hello World program is very simple we’ll just need a few instructions to:

  1. load the string “Hello World”
  2. execute System.out.println

A Java virtual machine is a stack-based machine: many of the instructions deal with pushing and popping from the operand stack. For example, the instruction ldc is used to load a constant onto the stack and the invoke instructions will pop their operands from the stack.

In order to execute an instance method, such as println, we can use the invokevirtual instruction. The first operand for invokevirtual is a reference to the instance on which the method will be called: in our case a reference to System.out. The System.out instance and the string “Hello World” will be popped from the stack and the method will be executed.

In total, for our Hello World program, we’ll need 4 different bytecode instructions:

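| Instruction   | Stack before                     | Stack after        | Example                                                          | Example description                                                                 |
|---------------|----------------------------------|--------------------|------------------------------------------------------------------|-------------------------------------------------------------------------------------|
| getstatic     | …,                               | …, value           | getstatic Ljava/lang/System; out                                 | Pushes a reference to the System.out instance onto the stack                        |
| ldc           | …,                               | …, value           | ldc “Hello World”                                                | Pushes the constant “Hello World” onto the stack                                    |
| invokevirtual | …, objectref, [arg1, arg2, argN] | …, [return value]  | invokevirtual Ljava/io/PrintStream; println(Ljava/lang/String;)V | Pops the reference to System.out and the “Hello World” string, and executes println |
| return        | …,                               | empty              | return                                                           | Returns from a method                                                               |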
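The stack effects in the table can be played through with a toy model. The sketch below uses an ArrayDeque as a stand-in for the operand stack and performs the four steps in order. It is only an illustration of the push and pop behaviour, not how a JVM is implemented, and the class name OperandStackDemo is made up.

```java
import java.io.PrintStream;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class OperandStackDemo {

    /** Plays through the four instructions and returns the stack depth after each one. */
    public static List<Integer> run(PrintStream target) {
        Deque<Object> stack = new ArrayDeque<>();
        List<Integer> depths = new ArrayList<>();

        // getstatic: push the reference to the PrintStream (System.out in the real program)
        stack.push(target);
        depths.add(stack.size());

        // ldc: push the constant string
        stack.push("Hello World");
        depths.add(stack.size());

        // invokevirtual: the argument is on top, the object reference is below it
        String argument = (String) stack.pop();
        PrintStream objectRef = (PrintStream) stack.pop();
        objectRef.println(argument); // println returns void, so nothing is pushed back
        depths.add(stack.size());

        // return: a void return leaves nothing behind
        depths.add(stack.size());
        return depths;
    }

    public static void main(String[] args) {
        // Prints "Hello World", then the depths [1, 2, 0, 0]
        System.out.println(run(System.out));
    }
}
```

The deepest the stack ever gets is two entries, which is the kind of information a class file records for each method as its maximum stack size.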

CompactCodeAttributeComposer

We’ve already added a main method to our program using a ClassBuilder but without any code. As we learnt in the previous section we’ll need to generate four instructions: getstatic, ldc, invokevirtual and return.

The ClassBuilder provides a second addMethod which allows building code with a CodeBuilder. The CodeBuilder interface declares a single method compose that provides a CompactCodeAttributeComposer parameter.

The CompactCodeAttributeComposer is one of the core tools in the ProGuardCORE toolbox for creating code snippets. The API closely resembles the JVM instruction set, so our code snippet to print “Hello World” uses 4 methods with familiar names to generate the getstatic, ldc, invokevirtual, and return instructions:

ClassBuilder classBuilder = new ClassBuilder(
/* version     = */ CLASS_VERSION_1_6,
/* accessFlags = */ PUBLIC,
/* className   = */ "HelloWorld",
/* superClass  = */ "java/lang/Object"
);

classBuilder.addMethod(
    PUBLIC | STATIC,
    "main",
    "([Ljava/lang/String;)V",
    100, // maximum length of the generated code fragment
    composer -> composer
        .getstatic("java/lang/System", "out", "Ljava/io/PrintStream;")
        .ldc("Hello World")
        .invokevirtual("java/io/PrintStream", "println", "(Ljava/lang/String;)V")
        .return_()
);

ProgramClass helloWorldClass =
    classBuilder.getProgramClass();

DataOutputStream dataOutputStream =
    new DataOutputStream(
    new FileOutputStream("HelloWorld.class"));

helloWorldClass.accept(
    new ProgramClassWriter(dataOutputStream));

dataOutputStream.close();
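It can help to know what those four composer calls boil down to inside the class file. The sketch below encodes the same four instructions as raw bytes, using the opcode values from the JVM specification. The constant pool indices 13, 15 and 21 are hypothetical placeholders: the real values depend on how the constant pool of the generated class happens to be laid out, and ProGuardCORE manages them for you. The class name HelloWorldCodeBytes is also made up.

```java
public class HelloWorldCodeBytes {

    // Opcode values as defined in the JVM specification.
    static final int GETSTATIC = 0xB2;
    static final int LDC = 0x12;
    static final int INVOKEVIRTUAL = 0xB6;
    static final int RETURN = 0xB1;

    /**
     * Encodes getstatic, ldc, invokevirtual and return for the given constant pool indices.
     * getstatic and invokevirtual take a two-byte index, ldc takes a one-byte index.
     */
    public static byte[] encode(int fieldRefIndex, int stringIndex, int methodRefIndex) {
        if (stringIndex > 0xFF) {
            throw new IllegalArgumentException("ldc only takes a one-byte index; larger pools need ldc_w");
        }
        return new byte[] {
            (byte) GETSTATIC, (byte) (fieldRefIndex >> 8), (byte) fieldRefIndex,
            (byte) LDC, (byte) stringIndex,
            (byte) INVOKEVIRTUAL, (byte) (methodRefIndex >> 8), (byte) methodRefIndex,
            (byte) RETURN
        };
    }

    public static void main(String[] args) {
        StringBuilder hex = new StringBuilder();
        for (byte b : encode(13, 15, 21)) { // hypothetical constant pool indices
            hex.append(String.format("%02x ", b));
        }
        // prints: b2 00 0d 12 0f b6 00 15 b1
        System.out.println(hex.toString().trim());
    }
}
```

So the whole body of our main method is nine bytes of code; the rest of the work is in the constant pool entries those indices point to.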

Finally, “Hello World”

Using the ProGuardCORE toolbox we’ve written a Java program that produces a Java class file that, when executed, prints “Hello World”.

You should be able to execute the generated HelloWorld.class file and see the result yourself:

$ java HelloWorld
Hello World

Congratulations! You’ve taken your first step into the world of Java bytecode and learnt your first 4 Java bytecode instructions!

Next steps

We’ve only just scratched the surface of Java class files, Java bytecode and the toolbox provided by ProGuardCORE.

ProGuardCORE provides many tools to read, write and analyse Java bytecode and is the underlying library used by software such as the open-source ProGuard shrinker, the Android security solution DexGuard and the application security testing tool AppSweep.

For your next steps, take a look at the ProGuardCORE manual, ProGuardCORE examples or this small Brainf*ck compiler that uses ProGuardCORE to generate Java bytecode.


Author: James Hamilton

I’m a compiler engineer working at Guardsquare on JVM/Android related tools & libraries including ProGuardCORE, ProGuard and DexGuard.
