Class Ids

Because JaCoCo's class identifiers sometimes cause confusion, this chapter explains the concepts behind class ids and the common issues with them in FAQ style.

What are class ids and how are they created?

Class ids are 64-bit integer values, for example 0x638e104737889183 in hex notation. Their calculation is considered an implementation detail of JaCoCo. Currently ids are created with a CRC64 checksum of the raw class file.
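To illustrate what a checksum over the raw class file means, here is a minimal sketch of a table-driven CRC-64 in Java. The class name `ClassIdSketch`, the polynomial constant and the placeholder input bytes are assumptions chosen for illustration. The sketch is not claimed to reproduce the ids JaCoCo actually computes.

```java
import java.nio.charset.StandardCharsets;

/** Illustrative CRC-64 over raw bytes; a sketch, not JaCoCo's implementation. */
final class ClassIdSketch {

    // Assumption: a reflected CRC-64 polynomial, picked for illustration only.
    private static final long POLY = 0xD800000000000000L;
    private static final long[] TABLE = new long[256];

    static {
        // Precompute the checksum contribution of every possible byte value.
        for (int i = 0; i < 256; i++) {
            long v = i;
            for (int bit = 0; bit < 8; bit++) {
                v = (v & 1) == 1 ? (v >>> 1) ^ POLY : v >>> 1;
            }
            TABLE[i] = v;
        }
    }

    /** Folds every byte of the class file content into a 64-bit value. */
    static long classId(byte[] classFileBytes) {
        long sum = 0;
        for (byte b : classFileBytes) {
            sum = (sum >>> 8) ^ TABLE[((int) sum ^ b) & 0xFF];
        }
        return sum;
    }

    public static void main(String[] args) {
        // Placeholder bytes standing in for two different builds of one class.
        byte[] buildA = "Foo.class built with debug info".getBytes(StandardCharsets.UTF_8);
        byte[] buildB = "Foo.class built without debug info".getBytes(StandardCharsets.UTF_8);
        System.out.printf("0x%016x%n", classId(buildA));
        System.out.printf("0x%016x%n", classId(buildB));
    }
}
```

Because every byte contributes to the result, any change to the class file, however small, is expected to yield a different id.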

What are class ids used for?

Class ids are used to unambiguously identify Java classes. At runtime execution data is sampled for every loaded class and typically stored to *.exec files. At analysis time — for example for report generation — the class ids are used to relate analyzed classes with the execution data.

What are the advantages of JaCoCo class ids?

The concept of class ids allows distinguishing different versions of classes, for example when multiple versions of an application are deployed to an application server or different versions of libraries are included.

Also class ids are the prerequisite for JaCoCo's minimal runtime-overhead and small *.exec files even for very large applications under test.

What is the disadvantage of JaCoCo class ids?

The fact that class ids identify a specific version of a class causes problems in setups where different classes are used at runtime and at analysis time.

What happens if different classes are used at runtime and at analysis time?

In this case execution data cannot be related to the analyzed classes. As a consequence such classes are reported with 0% coverage.
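The lookup failure can be sketched as follows. This is a simplified model, not JaCoCo's API: the class `ExecDataLookupSketch`, its methods and the use of the JDK's `CRC32` as a stand-in checksum are assumptions, and "percentage of probes hit" is only a rough proxy for real coverage.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.zip.CRC32;

/** Simplified model of relating execution data to analyzed classes by id. */
final class ExecDataLookupSketch {

    // Stand-in checksum from the JDK; JaCoCo's real ids are 64-bit values.
    static long id(byte[] classFileBytes) {
        CRC32 crc = new CRC32();
        crc.update(classFileBytes);
        return crc.getValue();
    }

    /** Returns the percentage of probes hit, or 0 if no data matches the id. */
    static int probesHitPercent(Map<Long, boolean[]> execData, byte[] analyzedClass) {
        boolean[] probes = execData.get(id(analyzedClass));
        if (probes == null || probes.length == 0) {
            return 0; // no execution data for this exact class file
        }
        int hit = 0;
        for (boolean probe : probes) {
            if (probe) {
                hit++;
            }
        }
        return 100 * hit / probes.length;
    }

    public static void main(String[] args) {
        byte[] runtimeClass = {1, 2, 3};  // bytes seen while the tests ran
        byte[] rebuiltClass = {1, 2, 4};  // slightly different bytes at analysis time

        Map<Long, boolean[]> execData = new HashMap<>();
        execData.put(id(runtimeClass), new boolean[] {true, true, false, true});

        System.out.println(probesHitPercent(execData, runtimeClass));
        System.out.println(probesHitPercent(execData, rebuiltClass));
    }
}
```

The second lookup finds nothing under the new id, so the rebuilt class appears unexecuted even though the tests ran its code.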

How can I detect that I have a problem with class ids?

The typical symptom of a class id mismatch is that classes are not shown as covered although they have been executed during the test. This situation can easily be detected, e.g. in the HTML report: Open the Sessions page with the link in the top-right corner. You see a list of all classes for which execution data has been collected. Find the class in question and check whether the entry has a link to the corresponding coverage report page. If the entry is not linked, there is a class id mismatch between the class used at runtime and the class provided to create the report.

What can cause different class ids?

Class ids are identical only for the exact same class file (byte-by-byte). There are a couple of reasons why you might get different class files. First, compiling Java source files will result in different class files if you use a different tool chain:

Also, post-processing class files (obfuscation, AspectJ, etc.) will typically change the class files. JaCoCo will work well if you simply use the same class files at runtime as well as for analysis. In that case the tool chain used to create these class files does not matter.

Even if the class files on the file system are the same, it is possible that the classes seen by the JaCoCo runtime agent are different anyway. This typically happens when another Java agent is configured before the JaCoCo agent or special class loaders pre-process the class files. Typical candidates are:

What workarounds exist to deal with runtime-modified classes?

If classes get modified at runtime in your setup, there are some workarounds to make JaCoCo work anyway:

Why can't JaCoCo simply use the class name to identify classes?

To understand why JaCoCo can't rely on class names we need to look at how JaCoCo measures code coverage.

JaCoCo tracks execution with so-called probes. Probes are additional byte code instructions inserted into the original class file which note when they are executed and report this to the JaCoCo runtime. This process is called instrumentation. To keep the runtime overhead minimal, only a few probes are inserted at "strategic" places. These probe positions are determined by analyzing the control flow of all methods of a class. As a result every instrumented class produces a list of n boolean flags, each indicating whether the corresponding probe has been executed or not. A JaCoCo *.exec file simply stores a boolean array per class id.

At analysis time, for example for report generation, the *.exec file is used to get information about the probe execution status. But as probes are stored in a plain boolean array, there is no information about the corresponding methods or lines. To retrieve this information we need the original class files and must perform the exact same control flow analysis as at instrumentation time. Because this is a deterministic process, we get the same probe positions. With this information we can now infer the execution status of every single instruction and branch of a method. Using the debug information embedded in the class files we can also calculate line coverage.

If we used just slightly different classes at analysis time than at runtime (e.g. with different method ordering or additional branches), we would end up with different probes. For example, the probe at index i would be in method a() and not in method b(). Obviously this would create random coverage results.
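The effect of a shifted probe index can be shown with a deliberately simplified model. The sketch below assumes exactly one probe per method, assigned in declaration order. Real probe placement depends on the control flow of each method, so `ProbeMappingSketch` and its method are hypothetical illustrations only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model: one probe per method, indexed by declaration order. */
final class ProbeMappingSketch {

    /** Maps each set probe flag back to the method at the same index. */
    static List<String> executedMethods(List<String> methodsInClassFile, boolean[] probes) {
        List<String> executed = new ArrayList<>();
        for (int i = 0; i < probes.length && i < methodsInClassFile.size(); i++) {
            if (probes[i]) {
                executed.add(methodsInClassFile.get(i));
            }
        }
        return executed;
    }

    public static void main(String[] args) {
        // Recorded at runtime: only the first probe was hit.
        boolean[] probes = {true, false};

        // Same class file as at runtime: probe 0 belongs to a().
        System.out.println(executedMethods(Arrays.asList("a()", "b()"), probes));

        // Class file with reordered methods: probe 0 is now attributed to b().
        System.out.println(executedMethods(Arrays.asList("b()", "a()"), probes));
    }
}
```

The boolean array is identical in both cases. Only the class structure used to interpret it differs, which is why matching by name alone would silently report the wrong method as executed.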

Why do I get an error when I try to analyze multiple versions of the same class within a group?

JaCoCo always analyzes a set of classes as a group. The group is used to aggregate data for source files and packages (both can contain multiple classes). Within the reporting API classes are identified by their fully qualified name (e.g. to create stable file names in the HTML reports). Therefore it is not possible to include two different classes with the same name within a group. However, it is possible to analyze different versions of class files in separate groups, for example the Ant report task can be configured with multiple groups.
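The constraint can be pictured as a map keyed by class name. The following sketch is a hypothetical model, not JaCoCo's reporting API: the class `GroupSketch`, the exception type and the message text are assumptions.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of a report group that identifies classes by name. */
final class GroupSketch {

    private final Map<String, Long> idsByName = new HashMap<>();

    /** Adds a class; rejects a second, different class with the same name. */
    void add(String fullyQualifiedName, long classId) {
        Long existing = idsByName.putIfAbsent(fullyQualifiedName, classId);
        if (existing != null && existing != classId) {
            throw new IllegalStateException(
                    "Different class with same name: " + fullyQualifiedName);
        }
    }

    int size() {
        return idsByName.size();
    }

    public static void main(String[] args) {
        // Two versions of the same class, each in its own group: accepted.
        GroupSketch version1 = new GroupSketch();
        GroupSketch version2 = new GroupSketch();
        version1.add("com.example.Foo", 0x1L);
        version2.add("com.example.Foo", 0x2L);

        // Both versions in one group: rejected.
        try {
            version1.add("com.example.Foo", 0x2L);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Keeping one group per version avoids the name clash while still allowing both versions to appear in the same report run.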