Code Coverage Analysis
A major part of the engineering of a professional software project is creating a test suite for it. Without some sort of test suite, it is impossible to know if the software works at all. The D language has many features to aid in the creation of test suites, such as unit tests and contract programming. But there's the issue of how thoroughly the test suite tests the code. The profiler can give valuable information on which functions were called, and by whom. But to look inside a function, and determine which statements were executed and which were not, requires a code coverage analyzer.
A code coverage analyzer will help in these ways:
- Expose code that is not exercised by the test suite. Add test cases that will exercise it.
- Identify code that is unreachable. Unreachable code is often the leftover result of program design changes. Unreachable code should be removed, as it can be very confusing to the maintenance programmer.
- Track down why a particular section of code exists; the test case that causes it to execute will illuminate why.
- Since execution counts are given for each line, it is possible to use the coverage analysis to reorder the basic blocks in a function to minimize branches in the most used path, reducing stress on the processor's pipeline (a run-of-the-mill x86 processor can usually handle 48 branches in flight). This technique is a special case of profile-guided optimization, which can be performed automatically when either the LLVM or GCC backend is used.
Experience with code coverage analyzers shows that they dramatically reduce the number of bugs in shipping code. But a code coverage analyzer is not a panacea. Unlike Valgrind or the sanitizers available in GCC and LLVM, it won't help with:
- Identifying race conditions.
- Memory consumption problems.
- Pointer bugs.
- Verifying that the program got the correct result.
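The last point is easy to underestimate. As a small sketch (the `largest` function and its test are hypothetical, written only for this illustration), the following test executes every line of a buggy function, so a coverage report would show it as fully covered even though nothing checks the result:

```d
// Hypothetical function with a deliberate bug: it returns the smaller value.
int largest(int a, int b)
{
    if (a > b)
        return b; // bug: should return a
    return a;     // bug: should return b
}

unittest
{
    // Both branches run, so every line of largest() is executed,
    // yet the returned values are never compared against anything.
    auto r1 = largest(1, 2);
    auto r2 = largest(2, 1);
}

void main()
{
}
```

Adding `assert(r1 == 2);` to the test would expose the bug; coverage alone cannot.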
Code coverage analyzers are available for many popular languages, but they are often third-party products that integrate poorly with the compiler. A big problem with third-party products is that, in order to instrument the source code, they must include what is essentially a full-blown compiler front end for the same language. Not only is this an expensive proposition, it often winds up out of step with the various compiler vendors as their implementations change and as they evolve various extensions. (gcov, the GNU coverage analyzer, is an exception, as it is both free and integrated into GCC.)
The D code coverage analyzer is part of the D compiler. Therefore, it is always in perfect synchronization with the language implementation. It's implemented by establishing a counter for each line in each module compiled with the -cov switch. Code is inserted at the beginning of each statement to increment the corresponding counter. When the program finishes, the runtime collects all the counters, merges them with the source files, and writes the reports out to listing (.lst) files.
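To make the mechanism concrete, here is a minimal hand-written sketch of what per-line counters amount to. The `lineCount` array and the `sumTo` function are hypothetical names invented for this illustration; the real instrumentation is generated by the compiler and reported by the runtime, and its internals may differ.

```d
import std.stdio;

// Hypothetical stand-in for the counters the compiler would create,
// one slot per executable line of sumTo().
uint[6] lineCount;

int sumTo(int n)
{
    ++lineCount[0]; int total = 0;
    ++lineCount[1]; int i = 1;
    while (true)
    {
        ++lineCount[2];          // the loop condition is evaluated
        if (!(i <= n))
            break;
        ++lineCount[3]; total += i;
        ++lineCount[4]; i++;
    }
    ++lineCount[5]; return total;
}

void main()
{
    writeln("sumTo(3) = ", sumTo(3));
    // Report each counter, much as a listing shows a count next to each line.
    foreach (idx, count; lineCount)
        writefln("statement %d: executed %d time(s)", idx, count);
}
```

For `sumTo(3)` the condition counter ends at 4 (three passing tests and one failing test), while the two body statements end at 3 each.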
For example, consider the Sieve program:
/* Eratosthenes Sieve prime number calculation. */

import std.stdio;

bool[8191] flags;

int main()
{
    int i, prime, k, count, iter;

    writeln("10 iterations");
    for (iter = 1; iter <= 10; iter++)
    {
        count = 0;
        flags[] = true;
        for (i = 0; i < flags.length; i++)
        {
            if (flags[i])
            {
                prime = i + i + 3;
                k = i + prime;
                while (k < flags.length)
                {
                    flags[k] = false;
                    k += prime;
                }
                count += 1;
            }
        }
    }
    writefln("%d primes", count);
    return 0;
}
Compile and run it with:
dmd sieve -cov
sieve
An output file named sieve.lst will be created, the contents of which are:
       |/* Eratosthenes Sieve prime number calculation. */
       |
       |import std.stdio;
       |
       |bool[8191] flags;
       |
       |int main()
       |{
      5|    int i, prime, k, count, iter;
       |
      1|    writeln("10 iterations");
     22|    for (iter = 1; iter <= 10; iter++)
       |    {
     10|        count = 0;
     10|        flags[] = true;
 163840|        for (i = 0; i < flags.length; i++)
       |        {
  81910|            if (flags[i])
       |            {
  18990|                prime = i + i + 3;
  18990|                k = i + prime;
 168980|                while (k < flags.length)
       |                {
 149990|                    flags[k] = false;
 149990|                    k += prime;
       |                }
  18990|                count += 1;
       |            }
       |        }
       |    }
      1|    writefln("%d primes", count);
      1|    return 0;
       |}
sieve.d is 100% covered
The numbers to the left of the | are the execution counts for that line. Lines that have no executable code are left blank. Lines that have executable code, but were not executed, have a "0000000" as the execution count. At the end of the .lst file, the percent coverage is given.
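Because the listing format is this regular, it is straightforward to scan mechanically. The following is a rough sketch, not part of the D tooling: `uncoveredLines` is a hypothetical helper, and the listing fragment in `main` is made up for illustration. It reports the lines whose count field consists only of zeros.

```d
import std.algorithm : all;
import std.stdio;
import std.string : indexOf, strip;

/// Returns the 1-based numbers of listing lines whose count field is all zeros.
size_t[] uncoveredLines(string[] listing)
{
    size_t[] result;
    foreach (idx, line; listing)
    {
        // The count field is everything before the first '|'.
        auto bar = line.indexOf('|');
        if (bar < 0)
            continue; // for example, the trailing "... covered" summary line
        auto field = line[0 .. bar].strip;
        if (field.length && field.all!(c => c == '0'))
            result ~= idx + 1;
    }
    return result;
}

void main()
{
    // Hypothetical listing fragment in the format described above.
    auto listing = [
        "       |void foo(int a)",
        "       |{",
        "      3|    if (a > 0)",
        "      2|        bar(a);",
        "       |    else",
        "0000000|        baz(a);",
        "       |}",
    ];
    writeln(uncoveredLines(listing)); // prints [6]
}
```

Since a listing mirrors its source file line for line, the reported numbers are also the source line numbers.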
There are 3 lines with an execution count of 1; each of these was executed once. The declaration line for i, prime, etc., has 5 because there are 5 declarations, and the initialization of each declaration counts as one statement.
The first for loop shows 22. This is the sum of the 3 parts of the for-header. If the for-header is broken up into 3 lines, the data is similarly divided:
      1|    for (iter = 1;
     11|         iter <= 10;
     10|         iter++)
which adds up to 22.
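The split into 1, 11, and 10 follows from how a for loop executes: the initializer runs once, the test runs once per iteration plus once more when it fails, and the increment runs once per completed iteration. As a small sketch (the function `forHeaderCounts` is a hypothetical name used only here), the loop can be unrolled by hand and each part counted:

```d
import std.stdio;

/// Counts how often each part of `for (iter = 1; iter <= limit; iter++)` runs.
/// Returns [initializer, test, increment].
int[3] forHeaderCounts(int limit)
{
    int[3] counts;
    int iter;

    counts[0]++;            // iter = 1 runs exactly once
    iter = 1;
    while (true)
    {
        counts[1]++;        // iter <= limit runs on every pass, including the failing one
        if (!(iter <= limit))
            break;
        // the loop body would run here
        counts[2]++;        // iter++ runs after each completed iteration
        iter++;
    }
    return counts;
}

void main()
{
    auto c = forHeaderCounts(10);
    writefln("%d + %d + %d = %d", c[0], c[1], c[2], c[0] + c[1] + c[2]);
    // prints "1 + 11 + 10 = 22"
}
```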
e1&&e2 and e1||e2 expressions conditionally execute the right-hand operand e2. Therefore, the right-hand operand is treated as a separate statement with its own counter:
       |void foo(int a, int b)
       |{
      5|    bar(a);
      8|    if (a && b)
      1|        bar(b);
       |}
Putting the right-hand operand on a separate line makes the individual counts visible:
       |void foo(int a, int b)
       |{
      5|    bar(a);
      5|    if (a &&
      3|        b)
      1|        bar(b);
       |}
Similarly, for the e?e1:e2 expressions, e1 and e2 are treated as separate statements.
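The counts in the a && b listing can be reproduced by hand. The sketch below uses hypothetical counters and an assumed set of five calls (the original does not show which arguments were passed); it only demonstrates that the count for a line containing a && b is the number of times a was evaluated plus the number of times b was evaluated.

```d
import std.stdio;

// Hypothetical hand-written counters standing in for the compiler's.
int lhsCount, rhsCount, bodyCount;

void foo(int a, int b)
{
    lhsCount++;             // the left operand is always evaluated
    bool taken = false;
    if (a != 0)
    {
        rhsCount++;         // the right operand runs only when a is nonzero
        taken = b != 0;
    }
    if (taken)
        bodyCount++;        // corresponds to the bar(b) line
}

void main()
{
    // Assumed inputs: a is nonzero in 3 of 5 calls, b is nonzero in 1 of those 3.
    foo(0, 0);
    foo(0, 1);
    foo(1, 0);
    foo(2, 0);
    foo(1, 1);
    writefln("a: %d, b: %d, a && b line: %d, bar(b): %d",
             lhsCount, rhsCount, lhsCount + rhsCount, bodyCount);
    // prints "a: 5, b: 3, a && b line: 8, bar(b): 1"
}
```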
Controlling the Coverage Analyzer
When the -cov switch is thrown, the version identifier D_Coverage is defined.
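A program can test this identifier with a version condition to adapt itself to coverage builds. A minimal sketch follows; the constant `coverageBuild` is a hypothetical name chosen for this example.

```d
import std.stdio;

// D_Coverage is predefined by the compiler when -cov is given.
version (D_Coverage)
    enum bool coverageBuild = true;
else
    enum bool coverageBuild = false;

void main()
{
    static if (coverageBuild)
        writeln("built with -cov: a .lst report will be written at exit");
    else
        writeln("built without -cov");
}
```

This can be useful, for instance, to skip long-running benchmarks in a build whose only purpose is to collect coverage data.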
The standard runtime option passing mechanism can also be used to control how code coverage reports are generated at runtime. Options can be passed with the command-line option --DRT-covopt or the environment variable DRT_COVOPT. Each option has the form key:value, and multiple options are separated by spaces. The current options are:
- merge: Merge the current run with existing reports if 1, or overwrite the existing reports if 0.
- srcpath: Set path to where source files are located.
- dstpath: Set path to where listing files are to be written (must exist).
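To illustrate the option format only, here is a small sketch that splits such a string into keys and values. `parseCovOpt` is a hypothetical helper written for this example; it is not the runtime's own parser, which may treat edge cases differently.

```d
import std.array : split;
import std.stdio;
import std.string : indexOf;

/// Splits a string such as "merge:1 dstpath:reports" into a key -> value map.
string[string] parseCovOpt(string opts)
{
    string[string] result;
    foreach (item; opts.split)          // split on whitespace
    {
        // Split at the first ':' so that values may themselves contain colons.
        auto colon = item.indexOf(':');
        if (colon < 0)
            continue;                   // this sketch simply skips malformed items
        result[item[0 .. colon]] = item[colon + 1 .. $];
    }
    return result;
}

void main()
{
    auto opts = parseCovOpt("merge:1 dstpath:reports");
    writeln("merge   = ", opts["merge"]);
    writeln("dstpath = ", opts["dstpath"]);
}
```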
Example:
sieve --DRT-covopt="merge:1"
This merges the results of this run with the previous run in the sieve.lst file in the current directory. In contrast:
mkdir reports
sieve --DRT-covopt="dstpath:reports"
This creates a new report in reports/sieve.lst. Another run that adds merge will add to that report:
sieve --DRT-covopt="merge:1 dstpath:reports"