Wednesday, December 29, 2010

Troubleshooting class loading issues with Virgo

My colleague Diyan Yordanov outlined the class loading issues a web developer can face in his ESE 2010 talk "Migration of existing web applications to OSGi". The talk classified these issues by the exception they produce:
  • ClassNotFoundException - missing class definition
  • NoClassDefFoundError - class definition found, but loading or initialization failed
  • ClassCastException - additional class copies
There is not much I can add, so I will reuse this classification to present the root causes we faced during our work on SAP NetWeaver and with OSGi.

In simple, standalone Java applications the classpath alone determined class loading, so it was also the place we looked when investigating the exceptions above. The problems in this simple world were caused by:
  • ClassNotFoundException
    • Missing class definition (.class files)
    • Too restrictive file system permissions
  • NoClassDefFoundError
    • Exception thrown in a static initializer block
    • Missing imported class
    • Class version mismatch (class compiled for an incompatible JVM version)
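
The static initializer case is easy to reproduce in plain Java. The sketch below is only an illustration (the class names StaticInitDemo and Broken are made up for this example): the first access to the broken class fails with ExceptionInInitializerError, and every later access fails with NoClassDefFoundError, which is usually the only error that shows up in the place you are looking at.

```java
public class StaticInitDemo {

    // Hypothetical class whose static initialization always fails.
    static class Broken {
        // Not a compile-time constant, so reading it triggers class initialization.
        static final int VALUE = compute();

        private static int compute() {
            throw new IllegalStateException("simulated failure in static initializer");
        }
    }

    // Touches Broken and reports which error type the JVM raised.
    static String tryAccess() {
        try {
            return String.valueOf(Broken.VALUE);
        } catch (Throwable t) {
            return t.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println(tryAccess()); // first access: ExceptionInInitializerError
        System.out.println(tryAccess()); // later accesses: NoClassDefFoundError
    }
}
```

This is why the original exception in the logs matters: the NoClassDefFoundError itself does not say why the initialization failed.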

JEE was the next step in the evolution of the class loading problems :) With the JEE containers we introduced complicated class loading hierarchies. The root causes in these hierarchies were the same as with simple applications, but we faced dependency (or "reference" in NetWeaver) problems as well. Fortunately (depending on your point of view) we didn't have the time to implement versioned dependencies and get ourselves into more trouble.

The last step was made when we found out that OSGi pretty much eases the development and the support of components. It featured its own class loading that had everything we could dream of (plus some things we didn't really know we wanted). One of the extras missing from our JEE world was the import and export of packages, and this is what introduces more root causes for class loading exceptions:
  • ClassNotFoundException
    • Missing package import or export
    • Import version mismatch
    • Wrong boot delegation property (org.osgi.framework.bootdelegation)
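
To make the import version mismatch more concrete, here is a minimal sketch of how an exported package version is matched against an import version range such as "[1.2,2.0)". It is a simplification written for this post, not the framework's resolver: the class and method names are hypothetical, and version qualifiers are ignored.

```java
import java.util.Arrays;

public class ImportVersionCheck {

    // Parses "major[.minor[.micro]]" into three ints; missing parts default to 0.
    // Assumption: no qualifier segment (for example "1.2.0.RELEASE") is present.
    static int[] parse(String version) {
        String[] parts = version.trim().split("\\.");
        int[] out = new int[3];
        for (int i = 0; i < parts.length && i < 3; i++) {
            out[i] = Integer.parseInt(parts[i]);
        }
        return out;
    }

    // Returns true if the exported version falls inside the import range.
    static boolean satisfies(String exported, String importRange) {
        int[] v = parse(exported);
        String r = importRange.trim();
        if (!r.startsWith("[") && !r.startsWith("(")) {
            // A bare version in an import means "this version or higher".
            return Arrays.compare(v, parse(r)) >= 0;
        }
        boolean lowInclusive = r.charAt(0) == '[';
        boolean highInclusive = r.charAt(r.length() - 1) == ']';
        String[] bounds = r.substring(1, r.length() - 1).split(",");
        int low = Arrays.compare(v, parse(bounds[0]));
        int high = Arrays.compare(v, parse(bounds[1]));
        return (lowInclusive ? low >= 0 : low > 0)
            && (highInclusive ? high <= 0 : high < 0);
    }

    public static void main(String[] args) {
        System.out.println(satisfies("1.5.0", "[1.2,2.0)")); // true
        System.out.println(satisfies("2.0.0", "[1.2,2.0)")); // false: upper bound is exclusive
        System.out.println(satisfies("3.1", "1.2"));         // true: bare version is a minimum
    }
}
```

A bundle that imports a package with "[1.2,2.0)" will not be wired to an exporter of version 2.0.0, even though the package and the class are physically there.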

Most of the problems described so far can be solved by simply finding which bundle:
  • contains a class
  • exports a class
  • can load a class

This is not an easy task in a bare OSGi environment, but fortunately Virgo provides shell commands that help you troubleshoot class loading issues by doing exactly the operations described above. The commands are:
  • clhas - lists all bundles that contain a class
  • clexport - lists all bundles that export a class or package
  • clload - lists all bundles that can load a class
 
A ClassNotFoundException can be diagnosed by checking whether:
  • a bundle provides the class - clhas command
  • the class is exported - clexport command

The cause of a NoClassDefFoundError can be determined by:
  1. inspecting the logs for the original exception
  2. trying to load the class
The result of the second step depends on the parameters of the clload command and can be:
  • a list of the bundles that can actually load the class
  • confirmation that a bundle can or cannot load the class

If the bundle can load the class now but was unable to do so when the exception was thrown, a dependency was probably missing at the time of the problem.
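
The idea of "trying to load the class" can also be illustrated outside OSGi. The following plain Java sketch is only an analogy and not how clload is implemented; the LoadProbe class and its probe method are hypothetical names. It asks a given class loader for a class and reports either which loader defined it or which error came back.

```java
public class LoadProbe {

    // Tries to load className through the given loader without initializing it.
    static String probe(String className, ClassLoader loader) {
        try {
            Class<?> c = Class.forName(className, false, loader);
            ClassLoader definer = c.getClassLoader();
            // A null loader means the class was defined by the bootstrap loader.
            return "loaded by " + (definer == null ? "bootstrap" : definer.getClass().getName());
        } catch (ClassNotFoundException | LinkageError e) {
            return "failed: " + e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        ClassLoader loader = LoadProbe.class.getClassLoader();
        System.out.println(probe("java.lang.String", loader)); // loaded by bootstrap
        System.out.println(probe("no.such.Clazz", loader));    // failed: ClassNotFoundException
    }
}
```

In an OSGi framework every bundle has its own class loader, so the same question has to be asked per bundle, which is what makes a dedicated command handy.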

If the bundle cannot load a class although it should be able to, there is some problem with the dependencies. Most probably something is wrong with the:
  • imports of the bundle
  • exported packages
You can check if the needed packages are exported with the help of the clexport command.
 
A ClassCastException is almost always caused by the same class being present in more than one bundle, so that two copies of it are loaded by different class loaders. You can check which bundles contain the class in doubt with the clhas command.
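
The effect of a duplicated class can be reproduced in plain Java with two class loaders. This is a minimal sketch with made-up names (DuplicateClassDemo, Payload), and it assumes the demo is run from a directory or jar on the classpath so that its code source location is available. The same class file is loaded a second time by an isolated loader, and the cast between the two copies fails even though the class names are identical.

```java
import java.net.URL;
import java.net.URLClassLoader;

public class DuplicateClassDemo {

    // Hypothetical class that ends up loaded twice.
    public static class Payload {}

    static String castAcrossLoaders() {
        // Assumption: the class was loaded from a directory or jar, so the location is not null.
        URL location = DuplicateClassDemo.class.getProtectionDomain().getCodeSource().getLocation();
        // Parent is null, so the isolated loader defines its own copy of Payload.
        try (URLClassLoader isolated = new URLClassLoader(new URL[] {location}, null)) {
            Class<?> copy = isolated.loadClass(Payload.class.getName());
            Object instance = copy.getDeclaredConstructor().newInstance();
            boolean sameName = copy.getName().equals(Payload.class.getName());
            boolean sameClass = copy == Payload.class;
            try {
                Payload p = (Payload) instance; // same name, different defining loader
                return "cast succeeded";
            } catch (ClassCastException e) {
                return "ClassCastException (sameName=" + sameName + ", sameClass=" + sameClass + ")";
            }
        } catch (Exception e) {
            throw new IllegalStateException("demo setup failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(castAcrossLoaders());
    }
}
```

In OSGi the two loaders are the class loaders of two bundles that both contain the class, which is why listing the bundles that contain it is the quickest check.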

Finally, you can find some example usages of the commands in the latest build of Virgo's User Guide.

The commands should be available in the next release of Virgo (the current one being 2.1.0).
