Java Virtual Machine Architecture

Leslie's Blog

2019-11-19

java

When we run a Java program, the typical steps are as follows:

Edit the HelloWorld.java file.
Run javac HelloWorld.java to compile the .java file into bytecode (a class file).
Run java HelloWorld to create a JVM instance that executes the bytecode.
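
As a concrete starting point for these steps, here is a minimal sketch of what HelloWorld.java could contain. The post does not show the file itself, so the class body and the greeting() helper are assumptions made for illustration.

```java
// HelloWorld.java: a minimal program to feed through javac and java.
class HelloWorld {
    // Hypothetical helper, split out only so the message is easy to check.
    static String greeting() {
        return "Hello, World!";
    }

    public static void main(String[] args) {
        System.out.println(greeting());
    }
}
```

Compiling it with javac HelloWorld.java produces HelloWorld.class, which is the file the JVM actually loads.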

Now I am going to talk about how a JVM instance loads and executes a class file, which brings us to the JVM architecture. The Java Virtual Machine (JVM) is an abstract computing machine: it has an instruction set and manipulates various memory areas at run time (the runtime data areas). The structure of the JVM is shown below:

jvm_architecture

Class Loader

The JVM knows nothing of the Java programming language, only of a particular binary format, the class file format. That is why any language that can be compiled to valid class files (such as Scala and Kotlin) can run on the JVM. Before a Java program can run, the class loaders must first load its class files. A class file contains Java Virtual Machine instructions (bytecodes) and a symbol table, as well as other ancillary information.
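
To see class loaders from inside a program, the following sketch asks two classes which loader defined them. The class name ClassLoaderDemo and the helper loaderOf are made up for this example; Class.getClassLoader() is the standard API and returns null for classes defined by the bootstrap loader.

```java
class ClassLoaderDemo {
    /** Returns a readable name for the loader that defined the given class. */
    static String loaderOf(Class<?> c) {
        ClassLoader loader = c.getClassLoader();
        // null means the class was defined by the bootstrap class loader.
        return loader == null ? "bootstrap" : loader.getClass().getName();
    }

    public static void main(String[] args) {
        // Core library classes such as String come from the bootstrap loader.
        System.out.println("String          -> " + loaderOf(String.class));
        // Application classes are defined by a different loader.
        System.out.println("ClassLoaderDemo -> " + loaderOf(ClassLoaderDemo.class));
    }
}
```

The exact loader name printed for the application class depends on the JDK version and on how the program is launched.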

Runtime Data Areas

The JVM uses various runtime data areas during the execution of a program. Some of these data areas are created on JVM start-up and destroyed only when the JVM exits; these areas are shared by all threads. Other data areas are per thread: they are created when a thread is created and destroyed when the thread exits.

PC (Program Counter) Register: Each JVM thread has its own pc register. At any point, each thread is executing the code of a single method, namely the current method for that thread. If that method is not native, the pc register contains the address of the Java Virtual Machine instruction currently being executed. If the method is native, the value of the pc register is undefined.
JVM Stack: Each JVM thread has a private JVM stack, created at the same time as the thread. A Java Virtual Machine stack stores frames.
Heap: The heap is created on JVM start-up and shared among all threads. It is the run-time data area from which memory for all class instances and arrays is allocated. Heap storage for objects is reclaimed by a garbage collector.
Method Area (Metaspace): The method area is created on JVM start-up and shared among all threads. It is analogous to the storage area for compiled code of a conventional language, or to the “text” segment of an operating system process (which we will discuss later). In HotSpot since Java 8, class metadata lives in a native memory region called Metaspace. The method area stores:
- class structures (fields and methods data, super class name, interfaces names, version, …)
- a runtime constant pool per class loaded
- the bytecode of methods and constructors
Native Method Stack: The native method stack is used to execute native methods. A native method is a method whose implementation lives in a native library, and native libraries are linked to a Java program through JNI (Java Native Interface). A native method looks like public native void method();. This is only a declaration, because the implementation is provided by the native library.
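
The split between shared and per-thread areas can be illustrated with a small sketch. Two threads each keep a private counter in a local variable (held in a frame on that thread's own JVM stack) and also update one object allocated on the shared heap. The names SharedHeapDemo, run, and the thread names are assumptions made for this example.

```java
import java.util.concurrent.atomic.AtomicInteger;

class SharedHeapDemo {
    /** Runs two threads and returns the final value of the counter they share. */
    static int run(int perThread) {
        // One object on the heap, reachable from both threads.
        AtomicInteger shared = new AtomicInteger();

        Runnable task = () -> {
            // 'local' lives in this thread's own frame; the other thread cannot see it.
            int local = 0;
            for (int i = 0; i < perThread; i++) {
                local++;
                shared.incrementAndGet();
            }
            System.out.println(Thread.currentThread().getName() + " local=" + local);
        };

        Thread a = new Thread(task, "worker-a");
        Thread b = new Thread(task, "worker-b");
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return shared.get();
    }

    public static void main(String[] args) {
        System.out.println("shared=" + run(1000));
    }
}
```

Each worker reports local=1000, while the shared counter ends at 2000 because both threads wrote to the same heap object.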

Frame

The JVM is a stack-based machine. Each thread has its own JVM stack, and each stack holds a sequence of frames. A new frame is created each time a method is invoked. A frame is destroyed when its method invocation completes, whether that completion is normal or abrupt (the method throws an uncaught exception).

stack_frame

Each frame has its own array of local variables, its own operand stack, and a reference to the run-time constant pool of the class of the method. The frame for the executing method is called the current frame, and its method is known as the current method. The class in which the current method is defined is the current class. When the current method returns, the current frame passes the result, if any, back to the previous frame. The current frame is then discarded and the previous frame becomes the current one.
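
One rough way to observe frames being created and discarded is to count the entries in a stack trace at different call depths. This is only a sketch: a stack trace is a reporting view of the stack rather than the frames themselves, and the names FrameDemo, frameCount, and framesAtDepth are invented for the example.

```java
class FrameDemo {
    /** Number of stack trace entries for the calling thread at this point. */
    static int frameCount() {
        return new Throwable().getStackTrace().length;
    }

    /** Makes 'depth' nested calls, then reports how many entries are on the stack. */
    static int framesAtDepth(int depth) {
        if (depth == 0) {
            return frameCount();
        }
        // Each nested invocation adds one more frame on top of the caller's frame.
        return framesAtDepth(depth - 1);
    }

    public static void main(String[] args) {
        int base = framesAtDepth(0);
        int deeper = framesAtDepth(5);
        // prints "extra frames after 5 nested calls: 5"
        System.out.println("extra frames after 5 nested calls: " + (deeper - base));
    }
}
```

Once framesAtDepth returns, those extra frames are gone, which is why a second measurement from main starts from the same baseline.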

Operand Stack

The operand stack is empty when the frame that contains it is created. The Java Virtual Machine supplies instructions to load constants, values from the local variable array, or field values onto the operand stack. Other instructions take operands from the operand stack, operate on them, and push the result back onto the operand stack. The operand stack is also used to prepare parameters to be passed to methods and to receive method results.

For example, here is a Java file OperandStackDemo.java:

public class OperandStackDemo {
    public static void main(String[] args) {
        int i = 1;
        int j = 2;
        int k = i + j;
    }
}

We compile this file into a class file and then disassemble the class file using javap -c OperandStackDemo.class:

operand_stack_javap

Now let's walk through the instructions (opcodes) in the main method:

iconst_1: Push the int constant 1 onto the operand stack. The value is implied by the opcode itself, so nothing is read from the constant pool.

operand_stack_step_1

istore_1: Pop the top of the operand stack (the int value 1) and store it in the local variable array at index 1. (Note that index 0 of the local variable array holds the argument of the main method, i.e., args.)

operand_stack_step_2

iconst_2: Push the int constant 2 onto the operand stack.

operand_stack_step_3

istore_2: Pop the top of the operand stack (the int value 2) and store it in the local variable array at index 2.

operand_stack_step_4

iload_1: Push the value of the local variable at index 1 (which is 1) onto the operand stack.

operand_stack_step_5

iload_2: Push the value of the local variable at index 2 (which is 2) onto the operand stack.

operand_stack_step_6

iadd: Pop the top two int values (2 and 1) from the operand stack, add them, and push the result (1 + 2 = 3) onto the operand stack.

operand_stack_step_7

istore_3: Pop the top of the operand stack (the int value 3) and store it in the local variable array at index 3.

operand_stack_step_8

return: Return from the method without a value (the method is void).
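
The whole walkthrough can be replayed with a toy interpreter. This is a minimal sketch, not how a real JVM is implemented: it models one frame with a local variable array and an operand stack, and it understands only the opcodes that appear above. The class name MiniFrame and the string-based instruction format are assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class MiniFrame {
    /** Interprets the opcodes used in the walkthrough and returns the local variable array. */
    static int[] run(String... code) {
        int[] locals = new int[4];                 // slot 0 would hold args; unused here
        Deque<Integer> stack = new ArrayDeque<>(); // operand stack, empty when the frame is created

        for (String op : code) {
            if (op.equals("return")) {
                break;
            }
            if (op.equals("iadd")) {
                int right = stack.pop();
                int left = stack.pop();
                stack.push(left + right);
                continue;
            }
            // Remaining opcodes have the form name_n, e.g. iconst_1 or istore_3.
            int sep = op.indexOf('_');
            if (sep < 0) {
                throw new IllegalArgumentException("unsupported opcode: " + op);
            }
            String name = op.substring(0, sep);
            int n = Integer.parseInt(op.substring(sep + 1));
            switch (name) {
                case "iconst": stack.push(n); break;            // constant comes from the opcode
                case "istore": locals[n] = stack.pop(); break;  // pop into local slot n
                case "iload":  stack.push(locals[n]); break;    // push a copy of local slot n
                default: throw new IllegalArgumentException("unsupported opcode: " + op);
            }
        }
        return locals;
    }

    public static void main(String[] args) {
        int[] locals = run("iconst_1", "istore_1", "iconst_2", "istore_2",
                           "iload_1", "iload_2", "iadd", "istore_3", "return");
        // prints "i=1, j=2, k=3"
        System.out.println("i=" + locals[1] + ", j=" + locals[2] + ", k=" + locals[3]);
    }
}
```

The operand stack is empty again after istore_3, which matches the final state of the walkthrough: all three values end up in the local variable array.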