Class Ids

Because JaCoCo's class identifiers sometimes cause confusion, this chapter explains the concepts and common issues with class ids in FAQ format.

What are class ids and how are they created?

Class ids are 64-bit integer values, for example 0x638e104737889183 in hex notation. Their calculation is considered an implementation detail of JaCoCo. Currently ids are created with a CRC64 checksum of the raw class file.
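The idea of deriving an id from the raw bytes can be illustrated with a minimal sketch. It is not JaCoCo's implementation: the polynomial, the class name ClassIdSketch and the method crc64 are assumptions chosen for illustration, and the values it produces should not be expected to match the ids JaCoCo reports.

```java
import java.nio.charset.StandardCharsets;

final class ClassIdSketch {

    // Reflected form of the polynomial x^64 + x^4 + x^3 + x + 1.
    // Assumed for illustration only; JaCoCo's parameters are an implementation detail.
    private static final long POLY_REFLECTED = 0xD800000000000000L;

    /** Computes a bitwise CRC-64 (initial value 0, no final XOR) over the given bytes. */
    static long crc64(byte[] bytes) {
        long crc = 0L;
        for (byte b : bytes) {
            crc ^= (b & 0xFFL);
            for (int bit = 0; bit < 8; bit++) {
                if ((crc & 1L) == 1L) {
                    crc = (crc >>> 1) ^ POLY_REFLECTED;
                } else {
                    crc >>>= 1;
                }
            }
        }
        return crc;
    }

    public static void main(String[] args) {
        // Stand-in bytes; a real id would be computed from the content of a .class file.
        byte[] original = "stand-in for class file bytes".getBytes(StandardCharsets.UTF_8);
        byte[] modified = original.clone();
        modified[0] ^= 0x01; // flip a single bit

        System.out.printf("original: 0x%016x%n", crc64(original));
        System.out.printf("modified: 0x%016x%n", crc64(modified));
    }
}
```

Flipping a single bit of the input yields a different checksum, which is why even small differences between two class files lead to different ids.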

What are class ids used for?

Class ids are used to unambiguously identify Java classes. At runtime, execution data is collected for every loaded class and typically stored in *.exec files. At analysis time, for example during report generation, the class ids are used to relate the analyzed classes to the execution data.

What are the advantages of JaCoCo class ids?

The concept of class ids allows distinguishing different versions of classes, for example when multiple versions of an application are deployed to an application server or different versions of libraries are included.

Class ids are also a prerequisite for JaCoCo's minimal runtime overhead and small *.exec files, even for very large applications under test.

What is the disadvantage of JaCoCo class ids?

The fact that class ids identify a specific version of a class causes problems in setups where different classes are used at runtime and at analysis time.

What happens if different classes are used at runtime and at analysis time?

In this case execution data cannot be related to the analyzed classes. As a consequence such classes are reported with 0% coverage.
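The effect can be sketched with a simple lookup keyed by class id. The class ExecDataLookupSketch and its methods are hypothetical and not part of the JaCoCo API, and the ratio of executed probes is used here only as a simplified stand-in for real coverage counters.

```java
import java.util.HashMap;
import java.util.Map;

final class ExecDataLookupSketch {

    /** Execution data recorded at runtime: class id -> probe flags. */
    private final Map<Long, boolean[]> probesById = new HashMap<>();

    void record(long classId, boolean[] probes) {
        probesById.put(classId, probes.clone());
    }

    /**
     * Returns the fraction of executed probes for the analyzed class,
     * or 0.0 when no execution data exists for its id.
     */
    double executedProbeRatio(long analyzedClassId) {
        boolean[] probes = probesById.get(analyzedClassId);
        if (probes == null || probes.length == 0) {
            return 0.0; // id mismatch: nothing can be related to this class
        }
        int hit = 0;
        for (boolean p : probes) {
            if (p) {
                hit++;
            }
        }
        return (double) hit / probes.length;
    }

    public static void main(String[] args) {
        ExecDataLookupSketch data = new ExecDataLookupSketch();
        long runtimeId = 0x638e104737889183L;      // id of the class seen at runtime
        long recompiledId = 0x0123456789abcdefL;   // made-up id of a different class file
        data.record(runtimeId, new boolean[] {true, true, false, true});

        System.out.println(data.executedProbeRatio(runtimeId));    // prints 0.75
        System.out.println(data.executedProbeRatio(recompiledId)); // prints 0.0
    }
}
```

The execution data for the runtime class is still present in the store; it simply cannot be found under the id of the class that is being analyzed.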

How can I detect that I have a problem with class ids?

The typical symptom of a class id mismatch is that classes are not shown as covered although they have been executed during the test. This situation can easily be detected, e.g. in the HTML report: Open the Sessions page using the link in the top-right corner. You will see a list of all classes for which execution data has been collected. Find the class in question and check whether the entry has a link to the corresponding coverage report page. If the entry is not linked, there is a class id mismatch between the class used at runtime and the class provided to create the report.

What can cause different class ids?

Class ids are identical only for the exact same class file (byte-by-byte). There are several reasons why you might get different class files. First, compiling Java source files will result in different class files if you use a different tool chain:

Different compiler vendor (e.g. Eclipse vs. Oracle JDK)
Different compiler versions
Different compiler settings (e.g. debug vs. non-debug)

Post-processing class files (obfuscation, AspectJ, etc.) will also typically change them. JaCoCo works well as long as you use the same class files at runtime and for analysis; in that case the tool chain used to create these class files does not matter.

Even if the class files on the file system are the same, it is possible that the classes seen by the JaCoCo runtime agent are different anyway. This typically happens when another Java agent is configured before the JaCoCo agent or when special class loaders pre-process the class files. Typical candidates are:

Mocking frameworks
Application servers
Persistence frameworks

What workarounds exist to deal with runtime-modified classes?

If classes get modified at runtime in your setup, there are some workarounds to make JaCoCo work anyway:

If you use another Java agent, make sure the JaCoCo agent is specified first on the command line. This way the JaCoCo agent should see the original class files.
Specify the classdumpdir option of the JaCoCo agent and use the dumped classes for report generation. Note that only loaded classes will be dumped, i.e. classes not executed at all will not show up in your report as not covered.
Use offline instrumentation before you run your tests. This way classes get instrumented by JaCoCo before any runtime modification can take place. Note that in this case the report has to be generated with the original classes, not with instrumented ones.

Why can't JaCoCo simply use the class name to identify classes?

To understand why JaCoCo can't rely on class names, we need to look at how JaCoCo measures code coverage.

JaCoCo tracks execution with so-called probes. Probes are additional byte code instructions inserted into the original class file which note when they are executed and report this to the JaCoCo runtime. This process is called instrumentation. To keep the runtime overhead minimal, only a few probes are inserted at "strategic" places. These probe positions are determined by analyzing the control flow of all methods of a class. As a result, every instrumented class produces a list of n boolean flags, each indicating whether the corresponding probe has been executed or not. A JaCoCo *.exec file simply stores a boolean array per class id.
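At source level, the effect of instrumentation can be pictured roughly as follows. This is a conceptual sketch only: JaCoCo inserts probes at byte code level, and the class ProbeSketch, its probes field and the chosen probe positions are assumptions made for illustration.

```java
import java.util.Arrays;

final class ProbeSketch {

    /** One flag per probe; the index is assigned when the class is instrumented. */
    final boolean[] probes = new boolean[3];

    int abs(int x) {
        if (x < 0) {
            probes[0] = true; // probe 0: negative branch executed
            return -x;
        }
        probes[1] = true;     // probe 1: non-negative branch executed
        return x;
    }

    void unused() {
        probes[2] = true;     // probe 2: never set unless unused() is called
    }

    public static void main(String[] args) {
        ProbeSketch instrumented = new ProbeSketch();
        instrumented.abs(5);
        // The array carries no method names or line numbers, only flags.
        System.out.println(Arrays.toString(instrumented.probes)); // prints [false, true, false]
    }
}
```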

At analysis time, for example for report generation, the *.exec file is used to get information about probe execution status. But as probes are stored in a plain boolean array, there is no information about the corresponding methods or lines. To retrieve this information we need the original class files and must perform the exact same control flow analysis as at instrumentation time. Because this is a deterministic process, we get the same probe positions. With this information we can now infer the execution status of every single instruction and branch of a method. Using the debug information embedded in the class files we can also calculate line coverage.

If we used even slightly different classes at analysis time than at runtime, e.g. with different method ordering or additional branches, we would end up with different probes. For example, the probe at index i would be in method a() and not in method b(). Obviously this would produce arbitrary, incorrect coverage results.
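The following sketch illustrates the problem under a strong simplification: every method has exactly one probe, and a probe layout is just an array naming the method that owns each probe index. ProbeLayoutSketch and coveredMethods are hypothetical names and not part of JaCoCo.

```java
import java.util.ArrayList;
import java.util.List;

final class ProbeLayoutSketch {

    /** Returns the methods whose probe is marked as executed, in probe order. */
    static List<String> coveredMethods(String[] probeOwners, boolean[] probes) {
        if (probeOwners.length != probes.length) {
            throw new IllegalArgumentException("probe count mismatch");
        }
        List<String> covered = new ArrayList<>();
        for (int i = 0; i < probes.length; i++) {
            if (probes[i] && !covered.contains(probeOwners[i])) {
                covered.add(probeOwners[i]);
            }
        }
        return covered;
    }

    public static void main(String[] args) {
        // Recorded at runtime: only the probe at index 0 was executed.
        boolean[] recorded = {true, false};

        // Layout derived from the class used at runtime.
        String[] runtimeLayout = {"a()", "b()"};
        // Layout derived from a class with the same name but reordered methods.
        String[] reorderedLayout = {"b()", "a()"};

        System.out.println(coveredMethods(runtimeLayout, recorded));   // prints [a()]
        System.out.println(coveredMethods(reorderedLayout, recorded)); // prints [b()]
    }
}
```

The same recorded array attributes the execution to a different method once the layout changes, so matching by class name alone would silently report wrong results.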

Why do I get an error when I try to analyze multiple versions of the same class within a group?

JaCoCo always analyzes a set of classes as a group. The group is used to aggregate data for source files and packages (both can contain multiple classes). Within the reporting API, classes are identified by their fully qualified name (e.g. to create stable file names in the HTML reports). Therefore it is not possible to include two different classes with the same name within a group. However, it is possible to analyze different versions of class files in separate groups; for example, the Ant report task can be configured with multiple groups.
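The constraint can be sketched as a group that indexes classes by name. AnalysisGroupSketch is a hypothetical class for illustration and not JaCoCo's reporting API; the error message and the choice to accept the identical class twice are likewise assumptions.

```java
import java.util.HashMap;
import java.util.Map;

final class AnalysisGroupSketch {

    /** Classes in this group, keyed by fully qualified name. */
    private final Map<String, Long> idsByName = new HashMap<>();

    /** Adds a class; rejects a different class (different id) with a name already present. */
    void add(String className, long classId) {
        Long existing = idsByName.putIfAbsent(className, classId);
        if (existing != null && existing != classId) {
            throw new IllegalStateException(
                    "Can't add different class with same name: " + className);
        }
    }

    public static void main(String[] args) {
        // Two versions of the same class in separate groups: accepted.
        AnalysisGroupSketch version1 = new AnalysisGroupSketch();
        AnalysisGroupSketch version2 = new AnalysisGroupSketch();
        version1.add("com/example/Foo", 1L);
        version2.add("com/example/Foo", 2L);

        // Two versions in the same group: rejected.
        try {
            version1.add("com/example/Foo", 2L);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```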