1 | ========================== |
2 | Source-based Code Coverage |
3 | ========================== |
4 | |
5 | .. contents:: |
6 | :local: |
7 | |
8 | Introduction |
9 | ============ |
10 | |
11 | This document explains how to use clang's source-based code coverage feature. |
12 | It's called "source-based" because it operates on AST and preprocessor |
13 | information directly. This allows it to generate very precise coverage data. |
14 | |
15 | Clang ships two other code coverage implementations: |
16 | |
17 | * :doc:`SanitizerCoverage` - A low-overhead tool meant for use alongside the |
18 | various sanitizers. It can provide up to edge-level coverage. |
19 | |
20 | * gcov - A GCC-compatible coverage implementation which operates on DebugInfo. |
21 | This is enabled by ``-ftest-coverage`` or ``--coverage``. |
22 | |
23 | From this point onwards "code coverage" will refer to the source-based kind. |
24 | |
25 | The code coverage workflow |
26 | ========================== |
27 | |
28 | The code coverage workflow consists of three main steps: |
29 | |
30 | * Compiling with coverage enabled. |
31 | |
32 | * Running the instrumented program. |
33 | |
34 | * Creating coverage reports. |
35 | |
36 | The next few sections work through a complete, copy-'n-paste friendly example |
37 | based on this program: |
38 | |
39 | .. code-block:: console |
40 | |
41 | % cat <<EOF > foo.cc |
42 | #define BAR(x) ((x) || (x)) |
43 | template <typename T> void foo(T x) { |
44 | for (unsigned I = 0; I < 10; ++I) { BAR(I); } |
45 | } |
46 | int main() { |
47 | foo<int>(0); |
48 | foo<float>(0); |
49 | return 0; |
50 | } |
51 | EOF |
52 | |
53 | Compiling with coverage enabled |
54 | =============================== |
55 | |
56 | To compile code with coverage enabled, pass ``-fprofile-instr-generate |
57 | -fcoverage-mapping`` to the compiler: |
58 | |
59 | .. code-block:: console |
60 | |
61 | # Step 1: Compile with coverage enabled. |
62 | % clang++ -fprofile-instr-generate -fcoverage-mapping foo.cc -o foo |
63 | |
64 | Note that linking together code with and without coverage instrumentation is |
65 | supported. Uninstrumented code simply won't be accounted for in reports. |
66 | |
67 | Running the instrumented program |
68 | ================================ |
69 | |
70 | The next step is to run the instrumented program. When the program exits it |
71 | will write a **raw profile** to the path specified by the ``LLVM_PROFILE_FILE`` |
72 | environment variable. If that variable is not set, the profile is written |
73 | to ``default.profraw`` in the current directory of the program. If |
74 | ``LLVM_PROFILE_FILE`` contains a path to a non-existent directory, the missing |
75 | directory structure will be created. Additionally, the following special |
76 | **pattern strings** are rewritten: |
77 | |
78 | * "%p" expands out to the process ID. |
79 | |
80 | * "%h" expands out to the hostname of the machine running the program. |
81 | |
82 | * "%Nm" expands out to the instrumented binary's signature. When this pattern |
83 | is specified, the runtime creates a pool of N raw profiles which are used for |
84 | on-line profile merging. The runtime takes care of selecting a raw profile |
85 | from the pool, locking it, and updating it before the program exits. If N is |
86 | not specified (i.e. the pattern is "%m"), it's assumed that ``N = 1``. N must |
87 | be between 1 and 9. The merge pool specifier can only occur once per filename |
88 | pattern. |
89 | |
90 | .. code-block:: console |
91 | |
92 | # Step 2: Run the program. |
93 | % LLVM_PROFILE_FILE="foo.profraw" ./foo |
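To make the ``%p`` and ``%h`` substitutions concrete, the following minimal C++ sketch mimics them for a given process ID and hostname. The ``expandPattern`` helper is hypothetical and exists only for illustration. It is not the profile runtime's implementation, and it deliberately copies ``%m`` through unchanged because the merge-pool file naming is decided by the runtime.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of the LLVM profile runtime): models how the
// "%p" and "%h" pattern strings in LLVM_PROFILE_FILE are rewritten. Any other
// specifier, such as "%m", is copied through unchanged.
std::string expandPattern(const std::string &Pattern, long Pid,
                          const std::string &Hostname) {
  std::string Out;
  for (std::size_t I = 0; I < Pattern.size(); ++I) {
    if (Pattern[I] == '%' && I + 1 < Pattern.size()) {
      if (Pattern[I + 1] == 'p') {
        Out += std::to_string(Pid); // "%p" becomes the process ID.
        ++I;
        continue;
      }
      if (Pattern[I + 1] == 'h') {
        Out += Hostname; // "%h" becomes the hostname.
        ++I;
        continue;
      }
    }
    Out += Pattern[I];
  }
  return Out;
}
```

Under these assumptions, ``expandPattern("foo-%h-%p.profraw", 4242, "buildbot")`` yields ``foo-buildbot-4242.profraw``, which shows why ``%p`` is useful when several processes write profiles into the same directory.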
94 | |
95 | Creating coverage reports |
96 | ========================= |
97 | |
98 | Raw profiles have to be **indexed** before they can be used to generate |
99 | coverage reports. This is done using the ``merge`` command of ``llvm-profdata`` |
100 | (which can combine multiple raw profiles and index them at the same time): |
101 | |
102 | .. code-block:: console |
103 | |
104 | # Step 3(a): Index the raw profile. |
105 | % llvm-profdata merge -sparse foo.profraw -o foo.profdata |
106 | |
107 | There are several ways to render coverage reports. The simplest |
108 | option is to generate a line-oriented report: |
109 | |
110 | .. code-block:: console |
111 | |
112 | # Step 3(b): Create a line-oriented coverage report. |
113 | % llvm-cov show ./foo -instr-profile=foo.profdata |
114 | |
115 | This report includes a summary view as well as dedicated sub-views for |
116 | templated functions and their instantiations. For our example program, we get |
117 | distinct views for ``foo<int>(...)`` and ``foo<float>(...)``. If |
118 | ``-show-line-counts-or-regions`` is enabled, ``llvm-cov`` displays sub-line |
119 | region counts (even in macro expansions): |
120 | |
121 | .. code-block:: none |
122 | |
123 | 1| 20|#define BAR(x) ((x) || (x)) |
124 | ^20 ^2 |
125 | 2| 2|template <typename T> void foo(T x) { |
126 | 3| 22| for (unsigned I = 0; I < 10; ++I) { BAR(I); } |
127 | ^22 ^20 ^20^20 |
128 | 4| 2|} |
129 | ------------------ |
130 | | void foo<int>(int): |
131 | | 2| 1|template <typename T> void foo(T x) { |
132 | | 3| 11| for (unsigned I = 0; I < 10; ++I) { BAR(I); } |
133 | | ^11 ^10 ^10^10 |
134 | | 4| 1|} |
135 | ------------------ |
136 | | void foo<float>(float): |
137 | | 2| 1|template <typename T> void foo(T x) { |
138 | | 3| 11| for (unsigned I = 0; I < 10; ++I) { BAR(I); } |
139 | | ^11 ^10 ^10^10 |
140 | | 4| 1|} |
141 | ------------------ |
142 | |
143 | To generate a file-level summary of coverage statistics instead of a |
144 | line-oriented report, try: |
145 | |
146 | .. code-block:: console |
147 | |
148 | # Step 3(c): Create a coverage summary. |
149 | % llvm-cov report ./foo -instr-profile=foo.profdata |
150 | Filename      Regions  Missed Regions    Cover  Functions  Missed Functions  Executed  Lines  Missed Lines    Cover |
151 | ------------------------------------------------------------------------------------------------------------------- |
152 | /tmp/foo.cc        13               0  100.00%          3                 0   100.00%     13             0  100.00% |
153 | ------------------------------------------------------------------------------------------------------------------- |
154 | TOTAL              13               0  100.00%          3                 0   100.00%     13             0  100.00% |
155 | |
156 | The ``llvm-cov`` tool supports specifying a custom demangler, writing out |
157 | reports in a directory structure, and generating HTML reports. For the full |
158 | list of options, please refer to the `command guide |
159 | <https://llvm.org/docs/CommandGuide/llvm-cov.html>`_. |
160 | |
161 | A few final notes: |
162 | |
163 | * The ``-sparse`` flag is optional but can result in dramatically smaller |
164 | indexed profiles. This option should not be used if the indexed profile will |
165 | be reused for PGO. |
166 | |
167 | * Raw profiles can be discarded after they are indexed. Advanced use of the |
168 | profile runtime library allows an instrumented program to merge profiling |
169 | information directly into an existing raw profile on disk. The details are |
170 | out of scope. |
171 | |
172 | * The ``llvm-profdata`` tool can be used to merge together multiple raw or |
173 | indexed profiles. To combine profiling data from multiple runs of a program, |
174 | for example: |
175 | |
176 | .. code-block:: console |
177 | |
178 | % llvm-profdata merge -sparse foo1.profraw foo2.profdata -o foo3.profdata |
179 | |
180 | Exporting coverage data |
181 | ======================= |
182 | |
183 | Coverage data can be exported into JSON using the ``llvm-cov export`` |
184 | sub-command. The structure of the exported data is documented at a high |
185 | level in the llvm-cov source code. |
186 | |
187 | Interpreting reports |
188 | ==================== |
189 | |
190 | There are four statistics tracked in a coverage summary: |
191 | |
192 | * Function coverage is the percentage of functions which have been executed at |
193 | least once. A function is considered to be executed if any of its |
194 | instantiations are executed. |
195 | |
196 | * Instantiation coverage is the percentage of function instantiations which |
197 | have been executed at least once. Template functions and static inline |
198 | functions from headers are two kinds of functions which may have multiple |
199 | instantiations. |
200 | |
201 | * Line coverage is the percentage of code lines which have been executed at |
202 | least once. Only executable lines within function bodies are considered to be |
203 | code lines. |
204 | |
205 | * Region coverage is the percentage of code regions which have been executed at |
206 | least once. A code region may span multiple lines (e.g. in a large function |
207 | body with no control flow). However, it's also possible for a single line to |
208 | contain multiple code regions (e.g. in "return x || y && z"). |
209 | |
210 | Of these four statistics, function coverage is usually the least granular while |
211 | region coverage is the most granular. The project-wide totals for each |
212 | statistic are listed in the summary. |
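The difference between function coverage and instantiation coverage can be sketched in C++ as follows. The ``InstantiationCount`` and ``CoverageSummary`` types and the ``summarize`` function are hypothetical names introduced for this illustration; they are not part of ``llvm-cov``. The sketch assumes one record per instantiation, and reporting an empty input as 0% is likewise an assumption.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical record: one entry per function instantiation.
struct InstantiationCount {
  std::string Function;      // Source-level function, e.g. "foo".
  std::string Instantiation; // Concrete instantiation, e.g. "foo<int>".
  std::uint64_t Count;       // Number of times it was executed.
};

struct CoverageSummary {
  double FunctionPercent;
  double InstantiationPercent;
};

// A function counts as executed if any of its instantiations ran at least
// once; an instantiation counts as executed only if it ran itself.
CoverageSummary summarize(const std::vector<InstantiationCount> &Records) {
  if (Records.empty())
    return {0.0, 0.0}; // Assumption: nothing to measure is reported as 0%.

  std::map<std::string, bool> FunctionExecuted;
  std::size_t ExecutedInstantiations = 0;
  for (const InstantiationCount &R : Records) {
    const bool Executed = R.Count > 0;
    bool &Seen = FunctionExecuted[R.Function];
    Seen = Seen || Executed;
    if (Executed)
      ++ExecutedInstantiations;
  }

  std::size_t ExecutedFunctions = 0;
  for (const auto &Entry : FunctionExecuted)
    if (Entry.second)
      ++ExecutedFunctions;

  return {100.0 * ExecutedFunctions / FunctionExecuted.size(),
          100.0 * ExecutedInstantiations / Records.size()};
}
```

For instance, if ``foo<int>`` ran and ``foo<float>`` did not, this model reports ``foo`` as fully covered at the function level but only half covered at the instantiation level.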
213 | |
214 | Format compatibility guarantees |
215 | =============================== |
216 | |
217 | * There are no backwards or forwards compatibility guarantees for the raw |
218 | profile format. Raw profiles may be dependent on the specific compiler |
219 | revision used to generate them. It's inadvisable to store raw profiles for |
220 | long periods of time. |
221 | |
222 | * Tools must retain **backwards** compatibility with indexed profile formats. |
223 | These formats are not forwards-compatible: i.e., a tool which uses format |
224 | version X will not be able to understand format version (X+k). |
225 | |
226 | * Tools must also retain **backwards** compatibility with the format of the |
227 | coverage mappings emitted into instrumented binaries. These formats are not |
228 | forwards-compatible. |
229 | |
230 | * The JSON coverage export format has a (major, minor, patch) version triple. |
231 | Only a major version increment indicates a backwards-incompatible change. A |
232 | minor version increment is for added functionality, and patch version |
233 | increments are for bugfixes. |
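A consumer of the JSON export could apply the version rule roughly as in the C++ sketch below. ``ExportVersion``, ``parseVersion``, and ``isCompatible`` are hypothetical names, and the check is only one reasonable, semver-style reading of the rule: the major versions must match, and the data must be at least as new as the minor version the consumer relies on.

```cpp
#include <cstdio>
#include <string>

// Hypothetical representation of the (major, minor, patch) version triple.
struct ExportVersion {
  int Major;
  int Minor;
  int Patch;
};

// Parses a string of the form "MAJOR.MINOR.PATCH". Returns false and leaves
// Out untouched if the text is malformed.
bool parseVersion(const std::string &Text, ExportVersion &Out) {
  ExportVersion V{0, 0, 0};
  int Consumed = 0;
  const int Fields = std::sscanf(Text.c_str(), "%d.%d.%d%n", &V.Major,
                                 &V.Minor, &V.Patch, &Consumed);
  if (Fields != 3 || Consumed != static_cast<int>(Text.size()) ||
      V.Major < 0 || V.Minor < 0 || V.Patch < 0)
    return false;
  Out = V;
  return true;
}

// Required is the version the consumer was written against; Found is the
// version recorded in the exported data. Patch increments are bugfixes and
// are ignored by this check.
bool isCompatible(const ExportVersion &Required, const ExportVersion &Found) {
  return Found.Major == Required.Major && Found.Minor >= Required.Minor;
}
```

With this reading, a consumer written against ``2.0.0`` accepts data marked ``2.1.0`` but rejects data marked ``3.0.0``. The version numbers here are example inputs, not the versions of any particular release.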
234 | |
235 | Using the profiling runtime without static initializers |
236 | ======================================================= |
237 | |
238 | By default, the compiler runtime uses a static initializer to determine the |
239 | profile output path and to register a writer function. To collect profiles |
240 | without using static initializers, perform these steps manually: |
241 | |
242 | * Export an ``int __llvm_profile_runtime`` symbol from each instrumented shared |
243 | library and executable. When the linker finds a definition of this symbol, it |
244 | knows to skip loading the object which contains the profiling runtime's |
245 | static initializer. |
246 | |
247 | * Forward-declare ``void __llvm_profile_initialize_file(void)`` and call it |
248 | once from each instrumented executable. This function parses |
249 | ``LLVM_PROFILE_FILE``, sets the output path, and truncates any existing files |
250 | at that path. To get the same behavior without truncating existing files, |
251 | pass a filename pattern string to ``void __llvm_profile_set_filename(char |
252 | *)``. These calls can be placed anywhere so long as they precede all calls |
253 | to ``__llvm_profile_write_file``. |
254 | |
255 | * Forward-declare ``int __llvm_profile_write_file(void)`` and call it to write |
256 | out a profile. This function returns 0 when it succeeds, and a non-zero value |
257 | otherwise. Calling this function multiple times appends profile data to an |
258 | existing on-disk raw profile. |
259 | |
260 | In C++ files, declare these as ``extern "C"``. |
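The steps above can be sketched in C++ as follows. The ``MANUAL_PROFILE_RUNTIME`` macro and the ``initProfile`` and ``writeProfile`` wrappers are hypothetical names chosen for this sketch, so that the same file also builds without instrumentation. Only the ``__llvm_profile_*`` symbols come from the description above.

```cpp
// Assumed build line for the instrumented configuration:
//   clang++ -fprofile-instr-generate -fcoverage-mapping \
//       -DMANUAL_PROFILE_RUNTIME foo.cc -o foo
// MANUAL_PROFILE_RUNTIME is a hypothetical macro used only by this sketch.

#ifdef MANUAL_PROFILE_RUNTIME
extern "C" {
// Defining this symbol lets the linker skip the runtime object that contains
// the static initializer.
int __llvm_profile_runtime = 0;

// Entry points provided by the profiling runtime.
void __llvm_profile_initialize_file(void);
int __llvm_profile_write_file(void);
}
#endif

// Sets the profile output path from LLVM_PROFILE_FILE. Call this once per
// instrumented executable, before the first call to writeProfile().
void initProfile() {
#ifdef MANUAL_PROFILE_RUNTIME
  __llvm_profile_initialize_file();
#endif
}

// Writes out the raw profile. Returns 0 on success and a non-zero value
// otherwise. In a build without the macro there is nothing to write, so the
// wrapper reports success.
int writeProfile() {
#ifdef MANUAL_PROFILE_RUNTIME
  return __llvm_profile_write_file();
#else
  return 0;
#endif
}
```

A program would typically call ``initProfile()`` at the start of ``main`` and ``writeProfile()`` just before exiting, checking the return value if a missing profile should be treated as an error.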
261 | |
262 | Collecting coverage reports for the LLVM project |
263 | ================================================ |
264 | |
265 | To prepare a coverage report for LLVM (and any of its sub-projects), add |
266 | ``-DLLVM_BUILD_INSTRUMENTED_COVERAGE=On`` to the CMake configuration. Raw |
267 | profiles will be written to ``$BUILD_DIR/profiles/``. To prepare an HTML |
268 | report, run ``llvm/utils/prepare-code-coverage-artifact.py``. |
269 | |
270 | To specify an alternate directory for raw profiles, use |
271 | ``-DLLVM_PROFILE_DATA_DIR``. To change the size of the profile merge pool, use |
272 | ``-DLLVM_PROFILE_MERGE_POOL_SIZE``. |
273 | |
274 | Drawbacks and limitations |
275 | ========================= |
276 | |
277 | * Prior to version 2.26, the GNU binutils BFD linker is not able to link programs |
278 | compiled with ``-fcoverage-mapping`` in its ``--gc-sections`` mode. Possible |
279 | workarounds include disabling ``--gc-sections``, upgrading to a newer version |
280 | of BFD, or using the Gold linker. |
281 | |
282 | * Code coverage does not precisely handle unpredictable changes in control |
283 | flow or stack unwinding in the presence of exceptions. Consider the following |
284 | function: |
285 | |
286 | .. code-block:: cpp |
287 | |
288 | int f() { |
289 | may_throw(); |
290 | return 0; |
291 | } |
292 | |
293 | If the call to ``may_throw()`` propagates an exception into ``f``, the code |
294 | coverage tool may mark the ``return`` statement as executed even though it is |
295 | not. A call to ``longjmp()`` can have similar effects. |
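A self-contained variant of this function makes the situation easy to reproduce. The body of ``may_throw`` and the ``f_returned_normally`` helper are assumptions added for illustration. Here the ``return`` statement in ``f`` is never reached at run time, so any count a report attributes to that line would be an instance of the imprecision described above.

```cpp
#include <stdexcept>

// Assumed body for illustration: always throws, so control never returns to
// the caller normally.
void may_throw() { throw std::runtime_error("boom"); }

int f() {
  may_throw();
  return 0; // Never reached at run time in this variant.
}

// Hypothetical helper: calls f() and reports whether it returned normally.
bool f_returned_normally() {
  try {
    f();
    return true;
  } catch (const std::runtime_error &) {
    return false; // The exception propagated out of f().
  }
}
```

Calling ``f_returned_normally()`` from an instrumented program and then inspecting the line count for ``return 0;`` is one way to check how a given toolchain version behaves.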
296 | |