Investigating Java Class Loading

Overload Journal #68 - Aug 2005 + Programming Topics   Author: Roger Orr

Introduction

The class loader in Java is a powerful concept that helps to provide an extensible environment for running Java code with varying degrees of trust. Each piece of byte code in a running program is loaded into the Java virtual machine by a class loader, and Java can grant different security permissions to runtime objects based upon the class loaders used to load them.

Most of the time this mechanism is used implicitly by both the writer and user of a Java program and 'it just works'. However there is quite a lot happening behind the scenes; for example when you run a Java applet some of the classes are being loaded across the Internet while others are read from the local machine. The class loader does the work of getting the byte code from the target Web site and it also helps to enforce the so-called 'sandbox' security model.

Another place where class loaders are used is for Web services. Typically the main application classes are loaded from a WAR (Web ARchive) file but may make use of standard Java classes as well as other classes or JAR (Java ARchive) files that may be shared between multiple applications running inside a single server. In this case the principal reason for the extra class loaders is to ensure that each Web application remains as independent of the others as possible and in particular that there is no conflict should a class with the same name exist in two different WAR files. Java achieves this because each class loader defines a separate name space - two Java classes are the same only if they were loaded with the same class loader. As we shall see this can have some surprising results.
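The name space rule can be seen in a small experiment. The following is only a sketch: Payload, IsolatingLoader and NamespaceDemo are hypothetical names invented for the illustration, and it assumes the compiled class files can be read as resources from the application class path.

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical class used only for this demonstration.
class Payload { }

// Hypothetical loader with no parent: anything the bootstrap loader cannot
// supply is defined here from the class files on the application class path.
class IsolatingLoader extends ClassLoader {
    IsolatingLoader() {
        super(null);
    }

    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException {
        String path = name.replace('.', '/') + ".class";
        ClassLoader appLoader = IsolatingLoader.class.getClassLoader();
        try (InputStream in = appLoader.getResourceAsStream(path)) {
            if (in == null) {
                throw new ClassNotFoundException(name);
            }
            byte[] bytes = in.readAllBytes();
            return defineClass(name, bytes, 0, bytes.length);
        } catch (IOException ex) {
            throw new ClassNotFoundException(name, ex);
        }
    }
}

class NamespaceDemo {
    // Loads Payload a second time, through a separate class loader.
    static Class<?> loadAgain() throws ClassNotFoundException {
        return new IsolatingLoader().loadClass(Payload.class.getName());
    }

    public static void main(String[] args) throws Exception {
        Class<?> first = Payload.class;
        Class<?> second = loadAgain();
        System.out.println("Same name:  "
            + first.getName().equals(second.getName()));
        System.out.println("Same class: " + (first == second));
        System.out.println("Loaders:    " + first.getClassLoader()
            + " / " + second.getClassLoader());
    }
}
```

The two Class objects share a name but are distinct, so an instance created from one cannot be cast to the other.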

Using an Additional Class Loader

In the case of a browser or a Web server the framework usually provides all the various class loaders. However you can use additional class loaders, and it is surprisingly easy to do so. Java provides an abstract base class, java.lang.ClassLoader, which all class loaders must extend. The normal model is that each class loader has a link to its 'parent' class loader and all requests for loading classes are first passed to the parent to see if they can be loaded; only if this delegated load fails does the class loader try to satisfy the load itself. The class loaders for a Java program form a tree, with the 'bootstrap' class loader as the top node of the tree, and this model ensures that standard Java classes, such as String, are found in the usual place and only the application's own classes are loaded with the user-supplied class loader. (Note that this is only a convention and not all class loaders follow the same pattern. In particular it is up to the implementer of a class loader to decide when and if to delegate load requests to its parent.)
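The delegation convention can be sketched in a few lines. This is an illustration of the pattern rather than the JDK's own source; ParentFirstLoader and DelegationDemo are hypothetical names, and the sketch leaves findClass at its default, which always fails.

```java
// Sketch of parent-first delegation (hypothetical class, not JDK code).
class ParentFirstLoader extends ClassLoader {
    ParentFirstLoader(ClassLoader parent) {
        super(parent);
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve)
            throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            // 1. Reuse a class this loader has already defined.
            Class<?> c = findLoadedClass(name);
            if (c == null) {
                try {
                    // 2. Ask the parent first; a null parent stands for
                    //    the bootstrap class loader.
                    ClassLoader parent = getParent();
                    c = (parent != null)
                        ? parent.loadClass(name)
                        : Class.forName(name, false, null);
                } catch (ClassNotFoundException ex) {
                    // 3. Only now try to satisfy the request ourselves.
                    c = findClass(name);
                }
            }
            if (resolve) {
                resolveClass(c);
            }
            return c;
        }
    }
}

class DelegationDemo {
    public static void main(String[] args) throws Exception {
        ClassLoader loader =
            new ParentFirstLoader(ClassLoader.getSystemClassLoader());
        // String is found by delegation, so it is the usual String class.
        Class<?> c = loader.loadClass("java.lang.String");
        System.out.println(c == String.class);
    }
}
```

A loader that wanted to break the convention would simply reorder steps 2 and 3, which is why the pattern cannot be relied upon in every environment.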

One important issue when creating a class loader is deciding which class loader to use as the parent. There are several possibilities:

  • No parent loader. In this case the loader will be responsible for loading all classes.

  • Use the system class loader. This is the commonest practice.

  • Use the class loader used to load the current class. This is how Java itself loads dependent classes.

  • Use a class loader for the current thread context.

Java provides a simple API for getting and setting the default class loader for the current thread context. This can be useful since Java does not provide any way to navigate from a parent class loader to its child class loader(s). I demonstrate setting the thread's default class loader in the example below.
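As a rough sketch, the four parent choices listed above might be written as follows using the standard URLClassLoader. ParentChoices is a hypothetical name and the URL is a placeholder chosen for the illustration; nothing is fetched from it because no class is requested.

```java
import java.net.URI;
import java.net.URL;
import java.net.URLClassLoader;

class ParentChoices {
    // Builds one loader per parent option, in the order listed above.
    static ClassLoader[] build(URL[] urls) {
        return new ClassLoader[] {
            // No parent: only the bootstrap loader is consulted first.
            new URLClassLoader(urls, null),
            // The system class loader as parent.
            new URLClassLoader(urls, ClassLoader.getSystemClassLoader()),
            // The loader that loaded the current class.
            new URLClassLoader(urls, ParentChoices.class.getClassLoader()),
            // The current thread's context class loader.
            new URLClassLoader(urls,
                Thread.currentThread().getContextClassLoader())
        };
    }

    public static void main(String[] args) throws Exception {
        // Placeholder URL (an assumption made for this sketch).
        URL[] urls = {
            URI.create("http://example.invalid/classes/").toURL() };
        ClassLoader[] loaders = build(urls);
        for (ClassLoader loader : loaders) {
            System.out.println("parent: " + loader.getParent());
        }

        // Getting and setting the thread context class loader.
        Thread thread = Thread.currentThread();
        ClassLoader saved = thread.getContextClassLoader();
        thread.setContextClassLoader(loaders[1]);
        try {
            System.out.println(
                thread.getContextClassLoader() == loaders[1]);
        } finally {
            thread.setContextClassLoader(saved); // restore the original
        }
    }
}
```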

Java provides a standard URLClassLoader that is ready to use, or you can implement your own class loader.

As an example of the first case, you might want to run a Java program on workstations in your organisation, but be able to hold all the Java code centrally on a Web server. Here is some example code that uses the standard java.net.URLClassLoader to instantiate an object from a class held, in this instance, on my own Web site:

/**
 *This is a trivial example of a class loader.
 *It loads an object from a class on my own 
 *Web site.
 */

public class URLExample
{
  private static final String defaultURL =
        "http://www.howzatt.demon.co.uk/";
  private static final String defaultClass =
       "articles.java.Welcome";

  public static void main( String args[] ) 
       throws Exception
  {
    final String targetURL = ( args.length 
          < 1 ) ? defaultURL : args[0];
    final String targetClass = ( args.length 
          < 2 ) ? defaultClass : args[1];

    // Step 1: create the URL class loader.
    System.out.println(
          "Creating class loader for: "
          + targetURL );
    java.net.URL[] urls = { new java.net.URL
          ( targetURL ) };
    ClassLoader newClassLoader = new 
          java.net.URLClassLoader( urls );
    Thread.currentThread()
          .setContextClassLoader
          ( newClassLoader );
    // Step 2: load the class and create an
    // instance of it.
    System.out.println( "Loading: " + 
          targetClass );
    Class urlClass = 
          newClassLoader.loadClass
          ( targetClass );
    Object obj = urlClass.newInstance();
    System.out.println( "Object is: \"" 
          + obj.toString() + "\"" );

    // Step 3: check the URL of the
    // loaded class.
    java.net.URL url 
          = obj.getClass().getResource
          ( "Welcome.class" );
    if ( url != null )
    {
      System.out.println( "URL used: " 
            + url.toExternalForm() );
    }
  }
}

When I compile and run this program it produces the following output:

Creating class loader for: http://www.howzatt.demon.co.uk/
Loading: articles.java.Welcome
Object is: "Welcome from Roger Orr's Web site"
URL used: http://www.howzatt.demon.co.uk/articles/java/Welcome.class

The URLClassLoader class supplied with standard Java is doing all the hard work. Obviously there is more to write for a complete solution; for example, a SecurityManager object may be required in order to provide control over the access rights of the loaded code.

The source code for the 'Welcome.class' looks like this:

package articles.java;

public class Welcome
{
  private WelcomeImpl impl 
        = new WelcomeImpl();

  public String toString()
  {
    return impl.toString();
  }
}

Notice that the class has a dependency upon WelcomeImpl - but we did not have to load it ourselves. The same class loader newClassLoader we use to load Welcome is used by the system to resolve references to dependent classes, and so the system automatically loaded WelcomeImpl from the Web site as it was not found locally. There is little code needed for this example and 'it just works' as expected.

Writing your own Class Loader

Although undoubtedly useful, the URLClassLoader does not provide everything and there will be cases where a new class loader must be written. This might be because you wish to provide a non-standard way of reading the byte code or to give additional control over the security of the loaded classes. All you need to do is to override the findClass method in the new class loader to try to locate the byte code for the named class; the other methods in ClassLoader do not usually need overriding.

Here is a simple example of a class loader which looks for class files with the .clazz extension by providing a findClass method. This automatically produces a class loader that implements the delegation pattern - the new class loader is only used when the parent class loader is not able to find the class. At this point the findClass method shown below is invoked and the myDataLoad method tries to obtain the class data from a .clazz file. Although only an example it does illustrate the principles of writing a simple class loader of your own.

import java.io.*;
public class MyClassLoader extends ClassLoader
{
  public MyClassLoader( ClassLoader parent )
  {
    super( parent );
  }
  protected Class findClass(String name)
                 throws ClassNotFoundException
  {
    try
    {
      byte[] classData = myDataLoad( name );
      return defineClass( name, classData, 
            0, classData.length );
    }
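    catch ( Exception ex )
    {
      throw new ClassNotFoundException(
            name, ex );
    }
  }
  // Example: look for byte code in files
  // with .clazz extension
  private byte[] myDataLoad
        ( String name ) throws Exception
  {
    ByteArrayOutputStream bos 
          = new ByteArrayOutputStream();
    InputStream is =
          getClass().getResourceAsStream
          ( name + ".clazz" );
    // A missing file must surface as a
    // ClassNotFoundException (via findClass)
    // rather than as a ClassFormatError from
    // defining an empty byte array.
    if ( is == null )
    {
      throw new FileNotFoundException(
            name + ".clazz" );
    }
    try
    {
      int nextByte;
      while ( ( nextByte = is.read() ) != -1 )
      {
        bos.write( (byte) nextByte );
      }
    }
    finally
    {
      is.close();
    }
    return bos.toByteArray();
  }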
}

We might want to get the Java runtime to install the class loader when the application starts. This can be done by defining the java.system.class.loader property - for this JVM instance - as the class name of our class loader. An object of this class will be constructed at startup, with the 'parent' class loader being the default system class loader. The supplied class loader is then used as the system class loader for the duration of the application.

For example:

C:>javac Hello.java
C:>rename Hello.class Hello.clazz
C:>java Hello
Exception in thread "main" java.lang.NoClassDefFoundError: Hello

C:>java -Djava.system.class.loader=MyClassLoader Hello
Hello World

Class Loading - Java's Answer to DLL Hell?

In practice, for both applets and Web servers, not everything always works without problems. Unfortunately from time to time there are interactions between the various class loaders and, in my experience, these are typically rather hard to track down. The sort of problems I have had include:

  • strange runtime errors caused by different versions of the same class file(s) in different places in the CLASSPATH.

  • problems with log4j generating NoClassDefFoundError or ClassCastException errors.

  • difficulties registering protocol handlers inside a WAR file.

My experience is that resolving this sort of problem is made more difficult by the lack of easy ways to see which class loader was used to load each class in the system. For any given object it is quite easy to track down the class loader - the getClass() method returns the correct 'Class' object and calling the getClassLoader() method on that object then returns the actual class loader used to load this class. The class loader can be null - for classes loaded by the JVM 'bootstrap' class loader.
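For a single object the lookup takes only a couple of calls. The sketch below uses a hypothetical helper class, LoaderReport, and reports the null case as 'bootstrap':

```java
class LoaderReport {
    // Describes which class loader loaded the class of the given object.
    static String describe(Object obj) {
        Class<?> type = obj.getClass();
        ClassLoader loader = type.getClassLoader();
        // A null loader means the class came from the bootstrap loader.
        return type.getName() + " <- "
            + (loader == null ? "bootstrap" : loader.toString());
    }

    public static void main(String[] args) {
        System.out.println(describe("some text"));
        System.out.println(describe(new LoaderReport()));

        // Walk up the parent chain of this class's own loader.
        for (ClassLoader l = LoaderReport.class.getClassLoader();
                l != null; l = l.getParent()) {
            System.out.println("  in chain: " + l);
        }
    }
}
```

This works object by object, which is exactly the limitation: it does not give a list of every class each loader has loaded.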

Since Java treats any classes loaded by different class loaders as different classes it can be critical to find out the exact class loaders involved. However I do not know of a way to list all classes and their loaders. The Java debugger 'JDB' has a 'classes' command but this simply lists all the classes without, as far as I know, any way to break them down by class loader.

I wanted to find a way to list loaded classes and their corresponding class loader so I could try to identify the root cause of this sort of problem. One way is to extract the source for ClassLoader.java, make changes to it to provide additional logging, and place the modified class file in the bootstrap class path before the real ClassLoader. This technique gives maximum control, but has a couple of downsides. Firstly you need access to the boot class path - this may not always be easy to achieve. Secondly you must ensure the code modified matches the exact version of the JVM being run. After some experimentation, I decided to use reflection on ClassLoader itself, which gave me pretty much what I wanted.

Reflecting on the Class Loader

Reflection allows a program to query, at run time, the fields and methods of objects and classes in the system. This feature, by no means unique to Java, provides some techniques of particular use for testing and debugging. For example, a test harness such as JUnit can query at run time the public methods of a target object, together with their arguments, and then call all methods matching a particular signature. This sort of programming is very flexible, and automatically tracks changes made to the target class as long as they comply with the appropriate conventions for the test harness. However the downside of late binding like this is that errors such as argument type mismatch are no longer caught by the compiler but only at runtime.

There are two main types of reflection supported for a class; the first type provides access to all the public methods and fields for the class and its superclasses, and this is the commonest use of reflection. However there is a second type of reflection giving access to all the declared methods and fields on a class (not including inherited names). This sort of reflection can be used, subject to the security manager granting permission, to provide read (and write) access even to private members of another object.
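The difference between the two kinds of reflection can be seen with a pair of small classes. Base, Derived and ReflectionKinds are hypothetical names made up for this sketch; no security manager is assumed to be installed.

```java
import java.lang.reflect.Field;

class Base {
    public int inherited = 1;
}

class Derived extends Base {
    public int own = 2;
    private String secret = "hidden";
}

class ReflectionKinds {
    // Reads a private field of another object via its declared Field.
    static String readSecret(Derived target) throws Exception {
        Field field = Derived.class.getDeclaredField("secret");
        field.setAccessible(true); // private fields are closed by default
        return (String) field.get(target);
    }

    public static void main(String[] args) throws Exception {
        // First kind: public members, including inherited ones.
        for (Field f : Derived.class.getFields()) {
            System.out.println("public:   " + f.getName());
        }
        // Second kind: everything declared on Derived itself, even
        // private members, but nothing inherited.
        for (Field f : Derived.class.getDeclaredFields()) {
            System.out.println("declared: " + f.getName());
        }

        // Write access to the private member works the same way.
        Derived d = new Derived();
        Field field = Derived.class.getDeclaredField("secret");
        field.setAccessible(true);
        field.set(d, "changed");
        System.out.println(readSecret(d));
    }
}
```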

I noticed that each ClassLoader contains a 'classes' Vector that is updated by the JVM for each class loaded by this class loader.

[Code from ClassLoader.java in 'java.lang']

// Invoked by the VM to record every loaded class with this loader.
void addClass(Class c) {
    classes.addElement(c);
}

I use reflection to obtain the original vector for each traced class loader and replace it with a proxy object that logs each addition using addElement. The steps are simple, although a lot of work is going on under the covers in the JVM to support this functionality. The class for the ClassLoader itself is queried with the getDeclaredField to obtain a 'Field' object for the (private) member 'classes'. This object is then marked as accessible (since by default private fields are not accessible) and finally the field contents are read and written.

The complete code looks something like this:

// Add a hook to a class loader (using reflection)
private void hookClassLoader(
   final ClassLoader currLoader )
{
  try
  {
    java.lang.reflect.Field field =     
       ClassLoader.class.getDeclaredField
       ( "classes" );
    field.setAccessible( true );
    final java.util.Vector currClasses =
          (java.util.Vector)field.get
          ( currLoader );
    field.set( currLoader, 
          new java.util.Vector() {
        public void addElement( Object o ) {
           showClass( (Class)o );
           currClasses.addElement(o);
        }
    });
  }
  catch ( java.lang.Exception ex )
  {
    streamer.println( "Can't hook " +
          currLoader + ": " + ex );
  }
}

The end result of running this code against a class loader is that every time the JVM marks the class loader as having loaded a class the showClass method will be called. In this method we can take any action we choose based on the newly loaded class. This could be to simply log the class and its loader, or something more advanced.
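The same proxy idea can be tried on a class of your own, which avoids depending on the internals of ClassLoader. In this sketch Registry and ProbeDemo are hypothetical names; unlike the hook above, the proxy here starts as a copy of the original vector and stores new elements itself, so that the target still sees all of its data.

```java
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;
import java.util.Vector;

// Hypothetical target class with a private Vector member.
class Registry {
    private Vector<String> items = new Vector<String>();

    void register(String item) {
        items.addElement(item);
    }

    int size() {
        return items.size();
    }
}

class ProbeDemo {
    // Replaces the private 'items' vector with a logging proxy and
    // returns the list that records each later addition.
    static List<String> hook(Registry target) throws Exception {
        final List<String> seen = new ArrayList<String>();
        Field field = Registry.class.getDeclaredField("items");
        field.setAccessible(true);
        @SuppressWarnings("unchecked")
        Vector<String> original = (Vector<String>) field.get(target);
        field.set(target, new Vector<String>(original) {
            @Override
            public synchronized void addElement(String item) {
                seen.add(item);         // the probe
                super.addElement(item); // then behave as usual
            }
        });
        return seen;
    }

    public static void main(String[] args) throws Exception {
        Registry registry = new Registry();
        registry.register("before");
        List<String> seen = hook(registry);
        registry.register("after");
        System.out.println(seen + " size=" + registry.size());
    }
}
```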

When I first used reflection to modify the behaviour of a class in Java like this I was a little surprised - I've done similar tricks in C++ but it involves self-modifying code and assembly instructions.

Limitations of this Approach

There are several problems with this approach.

  • First of all, it requires sufficient security permissions to be able to access the private member of ClassLoader. This is not usually a problem for stand-alone applications but will cause difficulty for applets since the container by default installs a security manager that prevents applet code from having access to the ClassLoader fields.

  • Secondly, the code is not future proof since it relies upon the behaviour of a private member variable. This does not worry me greatly in this code as it is solely designed to assist in debugging a problem and is not intended to be part of a released program, but some care does need to be taken. What I have done by replacing private member data with a proxy breaks encapsulation.

  • Thirdly, the technique is not generally applicable since there must be a suitable member variable in the target class - in this case I was able to override Vector.addElement().

  • Fourthly, the code needs calling for each class loader in the system - but there is no standard way for us to locate them all!

  • Fifthly, the bootstrap class loader is not included in this code since it is part of the JVM and does not have a corresponding ClassLoader object.

It is possible to partly work around the fourth and fifth problems by registering our own class loader at the head of the chain of class loaders. Remember that each class loader in the system (apart from the JVM's own class loader) has a 'parent' class loader. I use reflection to insert my own class loader as the topmost parent for all class loaders.

Once again I achieve my end by modifying a private member variable of the classloader - this time the 'parent' field.

/**
 * This method injects a ClassLoadTracer
 * object into the current class loader chain.
 * @param parent the current active class
 *   loader
 * @return the new (or existing) tracer object
 */
public static synchronized ClassLoadTracer
   inject( ClassLoader parent )
{
  // get the current topmost class loader.
  ClassLoader root = parent;
  while ( root.getParent() != null )
     root = root.getParent();
  if ( root instanceof ClassLoadTracer )
     return (ClassLoadTracer)root;
  ClassLoadTracer newRoot = new
     ClassLoadTracer( parent );
  // reflect on the topmost classloader to
  // install the ClassLoadTracer ...
  try
  {
    // we want root->parent = newRoot;
    java.lang.reflect.Field field = 
       ClassLoader.class.getDeclaredField(
       "parent" );
    field.setAccessible( true );
    field.set( root, newRoot );
  }
  catch ( Exception ex )
  {
    ex.printStackTrace();
    System.out.println(
       "Could not install ClassLoadTracer: "
       + ex );
  }
  return newRoot;
}

The end result of calling the above method against an existing class loader is that the top-most parent becomes an instance of my own ClassLoadTracer class. This class, yet another extension of ClassLoader, overrides the loadClass method to log successful calls to the bootstrap class loader (thus solving the fifth problem listed above). It also keeps track of the current thread context class loader to detect any class loaders added to the system and thus helps to resolve the fourth problem.
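The ClassLoadTracer source is not reproduced here, but the idea of a root loader that logs successful bootstrap loads might look roughly like the following. LoggingRootLoader is a hypothetical stand-in written for this sketch, not the article's class, and it leaves out the thread context tracking.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a tracing root loader.
class LoggingRootLoader extends ClassLoader {
    private final List<String> log = new ArrayList<String>();

    LoggingRootLoader() {
        super(null); // no parent: requests go on to the bootstrap loader
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve)
            throws ClassNotFoundException {
        // Throws ClassNotFoundException if the bootstrap loader cannot
        // find the class, so only successful loads are logged.
        Class<?> c = super.loadClass(name, resolve);
        ClassLoader loader = c.getClassLoader();
        synchronized (log) {
            log.add(name + " loaded by "
                + (loader == null ? "bootstrap" : loader.toString()));
        }
        return c;
    }

    // Returns a copy of the log entries recorded so far.
    List<String> entries() {
        synchronized (log) {
            return new ArrayList<String>(log);
        }
    }

    public static void main(String[] args) throws Exception {
        LoggingRootLoader root = new LoggingRootLoader();
        root.loadClass("java.lang.String");
        System.out.println(root.entries());
    }
}
```

Any loader that delegates to this one first will pass its requests for standard classes through loadClass, where they can be recorded.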

Note however that this is only a partial solution since there is no requirement that class loaders will follow the delegation technique and so it is possible that my ClassLoadTracer will never be invoked. However, in the cases where I have used it the mechanism seems to work well enough for me to get a log of the classes being loaded by the various class loaders.

Conclusion

Class loaders are powerful, but there does not seem to be enough debugging information supplied as standard to resolve problems when the mechanism breaks down. I have shown a couple of uses of reflection to enable additional tracing to be provided where such problems exist. The techniques shown are of wider use too, enabling some quite flexible debugging techniques that add and remove probes from target objects in the application at runtime.

All the source code for this article is available at: http://www.howzatt.demon.co.uk/articles/ClassLoading.zip

Thanks are due to Alan Griffiths, Richard Blundell and Phil Bass who reviewed drafts of this article and suggested a number of improvements.
