=================
SanitizerCoverage
=================

.. contents::
   :local:

Introduction
============

LLVM has a simple code coverage instrumentation built in (SanitizerCoverage).
It inserts calls to user-defined functions at the function, basic-block, and edge levels.
Default implementations of those callbacks are provided; they implement
simple coverage reporting and visualization.
However, if you need *just* coverage visualization, you may want to use
:doc:`SourceBasedCodeCoverage <SourceBasedCodeCoverage>` instead.

Tracing PCs with guards
=======================

With ``-fsanitize-coverage=trace-pc-guard`` the compiler will insert the following code
on every edge:

.. code-block:: none

    __sanitizer_cov_trace_pc_guard(&guard_variable)

Every edge will have its own ``guard_variable`` (``uint32_t``).

The compiler will also insert calls to a module constructor:

.. code-block:: c++

    // The guards are [start, stop).
    // This function will be called at least once per DSO and may be called
    // more than once with the same values of start/stop.
    __sanitizer_cov_trace_pc_guard_init(uint32_t *start, uint32_t *stop);

With an additional ``...=trace-pc,indirect-calls`` flag
``__sanitizer_cov_trace_pc_indirect(void *callee)`` will be inserted on every indirect call.

The functions ``__sanitizer_cov_trace_pc_*`` should be defined by the user.

Example:

.. code-block:: c++

    // trace-pc-guard-cb.cc
    #include <stdint.h>
    #include <stdio.h>
    #include <sanitizer/coverage_interface.h>

    // This callback is inserted by the compiler as a module constructor
    // into every DSO. 'start' and 'stop' correspond to the
    // beginning and end of the section with the guards for the entire
    // binary (executable or DSO). The callback will be called at least
    // once per DSO and may be called multiple times with the same parameters.
    extern "C" void __sanitizer_cov_trace_pc_guard_init(uint32_t *start,
                                                        uint32_t *stop) {
      static uint64_t N;  // Counter for the guards.
      if (start == stop || *start) return;  // Initialize only once.
      printf("INIT: %p %p\n", start, stop);
      for (uint32_t *x = start; x < stop; x++)
        *x = ++N;  // Guards should start from 1.
    }

    // This callback is inserted by the compiler on every edge in the
    // control flow (some optimizations apply).
    // Typically, the compiler will emit the code like this:
    //    if(*guard)
    //      __sanitizer_cov_trace_pc_guard(guard);
    // But for large functions it will emit a simple call:
    //    __sanitizer_cov_trace_pc_guard(guard);
    extern "C" void __sanitizer_cov_trace_pc_guard(uint32_t *guard) {
      if (!*guard) return;  // Duplicate the guard check.
      // If you set *guard to 0 this code will not be called again for this edge.
      // Now you can get the PC and do whatever you want:
      //   store it somewhere or symbolize it and print right away.
      // The values of `*guard` are as you set them in
      // __sanitizer_cov_trace_pc_guard_init and so you can make them consecutive
      // and use them to dereference an array or a bit vector.
      void *PC = __builtin_return_address(0);
      char PcDescr[1024];
      // This function is a part of the sanitizer run-time.
      // To use it, link with AddressSanitizer or other sanitizer.
      __sanitizer_symbolize_pc(PC, "%p %F %L", PcDescr, sizeof(PcDescr));
      printf("guard: %p %x PC %s\n", guard, *guard, PcDescr);
    }

.. code-block:: c++

    // trace-pc-guard-example.cc
    void foo() { }
    int main(int argc, char **argv) {
      if (argc > 1) foo();
    }

.. code-block:: console

    clang++ -g  -fsanitize-coverage=trace-pc-guard trace-pc-guard-example.cc -c
    clang++ trace-pc-guard-cb.cc trace-pc-guard-example.o -fsanitize=address
    ASAN_OPTIONS=strip_path_prefix=`pwd`/ ./a.out

.. code-block:: console

    INIT: 0x71bcd0 0x71bce0
    guard: 0x71bcd4 2 PC 0x4ecd5b in main trace-pc-guard-example.cc:2
    guard: 0x71bcd8 3 PC 0x4ecd9e in main trace-pc-guard-example.cc:3:7

.. code-block:: console

    ASAN_OPTIONS=strip_path_prefix=`pwd`/ ./a.out with-foo

.. code-block:: console

    INIT: 0x71bcd0 0x71bce0
    guard: 0x71bcd4 2 PC 0x4ecd5b in main trace-pc-guard-example.cc:3
    guard: 0x71bcdc 4 PC 0x4ecdc7 in main trace-pc-guard-example.cc:4:17
    guard: 0x71bcd0 1 PC 0x4ecd20 in foo() trace-pc-guard-example.cc:2:14
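
The callback comments above note that consecutive guard values can index an
array or a bit vector. The following is a minimal sketch of that idea, not the
run-time's own implementation. The names ``kMaxGuards``, ``g_covered``,
``g_num_guards`` and ``CoveredEdgeCount`` are hypothetical, the table size is
an arbitrary assumption, and the code is not thread-safe.

.. code-block:: c++

    #include <cstddef>
    #include <cstdint>

    // Hypothetical fixed-size table. It is zero-initialized without a
    // constructor, so it is usable from module constructors.
    // Slot 0 is unused because guard values start from 1.
    constexpr uint32_t kMaxGuards = 1 << 16;
    static uint8_t g_covered[kMaxGuards];
    static uint32_t g_num_guards = 0;

    extern "C" void __sanitizer_cov_trace_pc_guard_init(uint32_t *start,
                                                        uint32_t *stop) {
      if (start == stop || *start) return;  // Initialize only once.
      for (uint32_t *g = start; g < stop; ++g) {
        // Guards that do not fit into the table stay 0, i.e. disabled.
        *g = (g_num_guards + 1 < kMaxGuards) ? ++g_num_guards : 0;
      }
    }

    extern "C" void __sanitizer_cov_trace_pc_guard(uint32_t *guard) {
      if (!*guard) return;    // Duplicate the guard check.
      g_covered[*guard] = 1;  // Use the guard value as an array index.
      *guard = 0;             // No further callbacks for this edge.
    }

    // Hypothetical helper: number of distinct edges observed so far.
    size_t CoveredEdgeCount() {
      size_t count = 0;
      for (uint32_t i = 1; i <= g_num_guards; ++i) count += g_covered[i];
      return count;
    }

As in the example above, the file that defines the callbacks is assumed to be
compiled without ``-fsanitize-coverage`` so that the callbacks themselves are
not instrumented.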

Inline 8bit-counters
====================

**Experimental, may change or disappear in the future**

With ``-fsanitize-coverage=inline-8bit-counters`` the compiler will insert
inline counter increments on every edge.
This is similar to ``-fsanitize-coverage=trace-pc-guard``, but instead of a
callback the instrumentation simply increments a counter.

Users need to implement a single function to capture the counters at startup:

.. code-block:: c++

    extern "C"
    void __sanitizer_cov_8bit_counters_init(char *start, char *end) {
      // [start,end) is the array of 8-bit counters created for the current DSO.
      // Capture this array in order to read/modify the counters.
    }
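
One way to "capture" the counters is sketched below, assuming a single
instrumented DSO. The globals and the helpers ``CountCoveredEdges`` and
``ResetCounters`` are hypothetical names introduced for illustration; a real
tool would likely keep one region per call of the init function.

.. code-block:: c++

    #include <cstddef>

    // Hypothetical bookkeeping for one DSO: the last region passed to the
    // init callback.
    static char *g_counters_start = nullptr;
    static char *g_counters_end = nullptr;

    extern "C"
    void __sanitizer_cov_8bit_counters_init(char *start, char *end) {
      g_counters_start = start;
      g_counters_end = end;
    }

    // Number of edges whose counter is non-zero, i.e. edges that were executed.
    size_t CountCoveredEdges() {
      size_t covered = 0;
      for (char *c = g_counters_start; c < g_counters_end; ++c)
        if (*c != 0) ++covered;
      return covered;
    }

    // Clears all counters, e.g. between two inputs of a fuzzing loop.
    void ResetCounters() {
      for (char *c = g_counters_start; c < g_counters_end; ++c) *c = 0;
    }

Because the counters are only 8 bits wide, they can wrap around; the sketch
therefore treats them as hit/not-hit information rather than exact counts.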

PC-Table
========

**Experimental, may change or disappear in the future**

**Note:** this instrumentation might be incompatible with dead code stripping
(``-Wl,-gc-sections``) for linkers other than LLD, thus resulting in a
significant binary size overhead. For more information, see
`Bug 34636 <https://bugs.llvm.org/show_bug.cgi?id=34636>`_.

With ``-fsanitize-coverage=pc-table`` the compiler will create a table of
instrumented PCs. This requires either ``-fsanitize-coverage=inline-8bit-counters`` or
``-fsanitize-coverage=trace-pc-guard``.

Users need to implement a single function to capture the PC table at startup:

.. code-block:: c++

    extern "C"
    void __sanitizer_cov_pcs_init(const uintptr_t *pcs_beg,
                                  const uintptr_t *pcs_end) {
      // [pcs_beg,pcs_end) is the array of ptr-sized integers representing
      // pairs [PC,PCFlags] for every instrumented block in the current DSO.
      // Capture this array in order to read the PCs and their Flags.
      // The number of PCs and PCFlags for a given DSO is the same as the number
      // of 8-bit counters (-fsanitize-coverage=inline-8bit-counters) or
      // trace_pc_guard callbacks (-fsanitize-coverage=trace-pc-guard).
      // A PCFlags describes the basic block:
      //  * bit0: 1 if the block is the function entry block, 0 otherwise.
    }
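
The pair layout described in the comments can be read as in the following
sketch. It assumes a single DSO; the globals and the helpers
``NumTableEntries`` and ``CountFunctionEntries`` are hypothetical names used
only for illustration.

.. code-block:: c++

    #include <cstddef>
    #include <cstdint>

    // Hypothetical bookkeeping: the table captured for one DSO.
    static const uintptr_t *g_pcs_beg = nullptr;
    static const uintptr_t *g_pcs_end = nullptr;

    extern "C"
    void __sanitizer_cov_pcs_init(const uintptr_t *pcs_beg,
                                  const uintptr_t *pcs_end) {
      g_pcs_beg = pcs_beg;
      g_pcs_end = pcs_end;
    }

    // Each entry is a [PC, PCFlags] pair, so two integers per entry.
    size_t NumTableEntries() {
      return static_cast<size_t>(g_pcs_end - g_pcs_beg) / 2;
    }

    // Counts entries with bit0 of PCFlags set (function entry blocks).
    size_t CountFunctionEntries() {
      size_t entries = 0;
      for (size_t i = 0; i < NumTableEntries(); ++i) {
        uintptr_t flags = g_pcs_beg[2 * i + 1];
        if (flags & 1) ++entries;
      }
      return entries;
    }

Entry ``i`` of the table corresponds to counter or guard ``i`` of the same DSO,
which is what allows a tool to map a hit counter back to a PC.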

Tracing PCs
===========

With ``-fsanitize-coverage=trace-pc`` the compiler will insert
``__sanitizer_cov_trace_pc()`` on every edge.
With an additional ``...=trace-pc,indirect-calls`` flag
``__sanitizer_cov_trace_pc_indirect(void *callee)`` will be inserted on every indirect call.
These callbacks are not implemented in the Sanitizer run-time and should be defined
by the user.
This mechanism is used for fuzzing the Linux kernel
(https://github.com/google/syzkaller).
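
Since ``__sanitizer_cov_trace_pc()`` takes no arguments, a user-defined
callback typically recovers the PC from its own return address, as the
``trace-pc-guard`` example does. Below is a minimal sketch under that
assumption. The buffer ``g_pcs``, its size, and the helper ``RecordPC`` are
hypothetical, and the code is neither thread-safe nor what any particular
fuzzer does.

.. code-block:: c++

    #include <cstddef>
    #include <cstdint>

    // Hypothetical fixed-size trace buffer; entries beyond it are dropped.
    constexpr size_t kMaxPCs = 1 << 12;
    static uintptr_t g_pcs[kMaxPCs];
    static size_t g_num_pcs = 0;

    static void RecordPC(uintptr_t pc) {
      if (g_num_pcs < kMaxPCs) g_pcs[g_num_pcs++] = pc;
    }

    extern "C" void __sanitizer_cov_trace_pc() {
      // The return address of the callback lies inside the instrumented edge.
      RecordPC(reinterpret_cast<uintptr_t>(__builtin_return_address(0)));
    }

    extern "C" void __sanitizer_cov_trace_pc_indirect(void *callee) {
      // For indirect calls the compiler passes the call target explicitly.
      RecordPC(reinterpret_cast<uintptr_t>(callee));
    }

The file defining these callbacks is assumed to be built without
``-fsanitize-coverage``; otherwise the callbacks would be instrumented too.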

Instrumentation points
======================

SanitizerCoverage offers different levels of instrumentation.

* ``edge`` (default): edges are instrumented (see below).
* ``bb``: basic blocks are instrumented.
* ``func``: only the entry block of every function will be instrumented.

Use these flags together with ``trace-pc-guard`` or ``trace-pc``,
like this: ``-fsanitize-coverage=func,trace-pc-guard``.

When ``edge`` or ``bb`` is used, some of the edges/blocks may still be left
uninstrumented (pruned) if such instrumentation is considered redundant.
Use ``no-prune`` (e.g. ``-fsanitize-coverage=bb,no-prune,trace-pc-guard``)
to disable pruning. This could be useful for better coverage visualization.

Edge coverage
-------------

Consider this code:

.. code-block:: c++

    void foo(int *a) {
      if (a)
        *a = 0;
    }

It contains 3 basic blocks, let's name them A, B, C:

.. code-block:: none

    A
    |\
    | \
    |  B
    | /
    |/
    C

If blocks A, B, and C are all covered we know for certain that the edges A=>B
and B=>C were executed, but we still don't know if the edge A=>C was executed.
Such edges of the control flow graph are called
`critical <https://en.wikipedia.org/wiki/Control_flow_graph#Special_edges>`_.
The edge-level coverage simply splits all critical edges by introducing new
dummy blocks and then instruments those blocks:

.. code-block:: none

    A
    |\
    | \
    D  B
    | /
    |/
    C
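
To make the notion of a critical edge concrete, the sketch below finds such
edges in a small graph, using the usual definition: an edge is critical when
its source block has more than one successor and its destination block has
more than one predecessor. This is a standalone model for illustration, not
the code LLVM uses; ``CFG`` and ``FindCriticalEdges`` are hypothetical names.

.. code-block:: c++

    #include <cstddef>
    #include <utility>
    #include <vector>

    // Successor lists: succs[i] holds the blocks directly reachable from block i.
    using CFG = std::vector<std::vector<int>>;

    // Returns all (from, to) edges where 'from' has several successors and
    // 'to' has several predecessors.
    std::vector<std::pair<int, int>> FindCriticalEdges(const CFG &succs) {
      std::vector<int> num_preds(succs.size(), 0);
      for (const auto &out : succs)
        for (int to : out) ++num_preds[to];

      std::vector<std::pair<int, int>> critical;
      for (size_t from = 0; from < succs.size(); ++from) {
        if (succs[from].size() < 2) continue;
        for (int to : succs[from])
          if (num_preds[to] >= 2)
            critical.emplace_back(static_cast<int>(from), to);
      }
      return critical;
    }

With A, B, C numbered 0, 1, 2, the graph for ``foo`` is ``{{1, 2}, {2}, {}}``,
and the only edge reported is (0, 2), i.e. A=>C, the edge that receives the
dummy block D.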

Tracing data flow
=================

These options support data-flow-guided fuzzing.
With ``-fsanitize-coverage=trace-cmp`` the compiler will insert extra instrumentation
around comparison instructions and switch statements.
Similarly, with ``-fsanitize-coverage=trace-div`` the compiler will instrument
integer division instructions (to capture the right argument of division),
and with ``-fsanitize-coverage=trace-gep`` it will instrument
the `LLVM GEP instructions <https://llvm.org/docs/GetElementPtr.html>`_
(to capture array indices).

Unless the ``no-prune`` option is provided, some of the comparison instructions
will not be instrumented.

.. code-block:: c++

    // Called before a comparison instruction.
    // Arg1 and Arg2 are arguments of the comparison.
    void __sanitizer_cov_trace_cmp1(uint8_t Arg1, uint8_t Arg2);
    void __sanitizer_cov_trace_cmp2(uint16_t Arg1, uint16_t Arg2);
    void __sanitizer_cov_trace_cmp4(uint32_t Arg1, uint32_t Arg2);
    void __sanitizer_cov_trace_cmp8(uint64_t Arg1, uint64_t Arg2);

    // Called before a comparison instruction if exactly one of the arguments is constant.
    // Arg1 and Arg2 are arguments of the comparison, Arg1 is a compile-time constant.
    // These callbacks are emitted by -fsanitize-coverage=trace-cmp since 2017-08-11
    void __sanitizer_cov_trace_const_cmp1(uint8_t Arg1, uint8_t Arg2);
    void __sanitizer_cov_trace_const_cmp2(uint16_t Arg1, uint16_t Arg2);
    void __sanitizer_cov_trace_const_cmp4(uint32_t Arg1, uint32_t Arg2);
    void __sanitizer_cov_trace_const_cmp8(uint64_t Arg1, uint64_t Arg2);

    // Called before a switch statement.
    // Val is the switch operand.
    // Cases[0] is the number of case constants.
    // Cases[1] is the size of Val in bits.
    // Cases[2:] are the case constants.
    void __sanitizer_cov_trace_switch(uint64_t Val, uint64_t *Cases);

    // Called before a division statement.
    // Val is the second argument of division.
    void __sanitizer_cov_trace_div4(uint32_t Val);
    void __sanitizer_cov_trace_div8(uint64_t Val);

    // Called before a GetElementPtr (GEP) instruction
    // for every non-constant array index.
    void __sanitizer_cov_trace_gep(uintptr_t Idx);
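
As an illustration of how a tool might consume these callbacks, the sketch
below records comparison operands and unpacks the ``Cases`` array of a switch
according to the layout described above. Only three of the callbacks are
shown; a program built with ``trace-cmp`` would presumably need the remaining
ones defined as well (by the user or by a sanitizer run-time). ``CmpRecord``,
``g_cmps`` and ``RecordCmp`` are hypothetical names, and the code is not
thread-safe.

.. code-block:: c++

    #include <cstddef>
    #include <cstdint>

    // Hypothetical log of observed comparison operands.
    struct CmpRecord {
      uint64_t Arg1;
      uint64_t Arg2;
    };
    constexpr size_t kMaxCmps = 1024;
    static CmpRecord g_cmps[kMaxCmps];
    static size_t g_num_cmps = 0;

    static void RecordCmp(uint64_t Arg1, uint64_t Arg2) {
      if (g_num_cmps < kMaxCmps) g_cmps[g_num_cmps++] = {Arg1, Arg2};
    }

    extern "C" void __sanitizer_cov_trace_cmp4(uint32_t Arg1, uint32_t Arg2) {
      RecordCmp(Arg1, Arg2);
    }

    extern "C" void __sanitizer_cov_trace_const_cmp4(uint32_t Arg1,
                                                     uint32_t Arg2) {
      RecordCmp(Arg1, Arg2);  // Arg1 is the compile-time constant.
    }

    extern "C" void __sanitizer_cov_trace_switch(uint64_t Val, uint64_t *Cases) {
      uint64_t num_cases = Cases[0];  // Cases[1] (bit width of Val) is unused here.
      // Treat the switch as a series of comparisons against each case constant.
      for (uint64_t i = 0; i < num_cases; ++i) RecordCmp(Cases[2 + i], Val);
    }

A fuzzer could then use the recorded pairs to suggest input mutations, for
example by substituting one operand for the other in the input.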

Default implementation
======================

The sanitizer run-time (AddressSanitizer, MemorySanitizer, etc.) provides
default implementations of some of the coverage callbacks.
You may use these implementations to dump the coverage to disk at process
exit.

Example:

.. code-block:: console

    % cat -n cov.cc
         1  #include <stdio.h>
         2  __attribute__((noinline))
         3  void foo() { printf("foo\n"); }
         4
         5  int main(int argc, char **argv) {
         6    if (argc == 2)
         7      foo();
         8    printf("main\n");
         9  }
    % clang++ -g cov.cc -fsanitize=address -fsanitize-coverage=trace-pc-guard
    % ASAN_OPTIONS=coverage=1 ./a.out; wc -c *.sancov
    main
    SanitizerCoverage: ./a.out.7312.sancov 2 PCs written
    24 a.out.7312.sancov
    % ASAN_OPTIONS=coverage=1 ./a.out foo ; wc -c *.sancov
    foo
    main
    SanitizerCoverage: ./a.out.7316.sancov 3 PCs written
    24 a.out.7312.sancov
    32 a.out.7316.sancov

Every time you run an executable instrumented with SanitizerCoverage,
one ``*.sancov`` file is created during process shutdown.
If the executable is dynamically linked against instrumented DSOs,
one ``*.sancov`` file will also be created for every DSO.

Sancov data format
------------------

The format of ``*.sancov`` files is very simple: the first 8 bytes are the magic
number, one of ``0xC0BFFFFFFFFFFF64`` and ``0xC0BFFFFFFFFFFF32``. The last byte of the
magic number defines the size of the following offsets (64 or 32 bits). The rest of
the data is the offsets in the corresponding binary/DSO that were executed during the run.
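
The layout can be decoded with a few lines of code. The following sketch
parses the raw bytes of a ``*.sancov`` file held in memory. It assumes the
file was written on a machine with the same byte order as the one reading it;
``ParseSancov`` and the constant names are hypothetical and are not part of
the ``sancov`` tool.

.. code-block:: c++

    #include <cstddef>
    #include <cstdint>
    #include <cstring>
    #include <vector>

    constexpr uint64_t kMagic64 = 0xC0BFFFFFFFFFFF64ULL;
    constexpr uint64_t kMagic32 = 0xC0BFFFFFFFFFFF32ULL;

    // Decodes the contents of a .sancov file into a list of offsets.
    // Returns false if the magic is unknown or the payload is truncated.
    bool ParseSancov(const std::vector<uint8_t> &data,
                     std::vector<uint64_t> *offsets) {
      uint64_t magic = 0;
      if (data.size() < sizeof(magic)) return false;
      std::memcpy(&magic, data.data(), sizeof(magic));

      size_t width = 0;  // Size of one offset in bytes.
      if (magic == kMagic64) width = 8;
      else if (magic == kMagic32) width = 4;
      else return false;

      if ((data.size() - sizeof(magic)) % width != 0) return false;

      offsets->clear();
      for (size_t pos = sizeof(magic); pos < data.size(); pos += width) {
        uint64_t value = 0;
        if (width == 8) {
          std::memcpy(&value, data.data() + pos, 8);
        } else {
          uint32_t value32 = 0;
          std::memcpy(&value32, data.data() + pos, 4);
          value = value32;
        }
        offsets->push_back(value);
      }
      return true;
    }

This matches the file sizes in the example above: with 64-bit offsets, 2 PCs
occupy 8 + 2 * 8 = 24 bytes and 3 PCs occupy 8 + 3 * 8 = 32 bytes.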

Sancov Tool
-----------

A simple ``sancov`` tool is provided to process coverage files.
The tool is part of the LLVM project and is currently supported only on Linux.
It can handle symbolization tasks autonomously without any extra support
from the environment. You need to pass ``.sancov`` files (named
``<module_name>.<pid>.sancov``) and paths to all corresponding binary ELF files.
Sancov matches these files using module names and binary file names.

.. code-block:: console

    USAGE: sancov [options] <action> (<binary file>|<.sancov file>)...

    Action (required)
      -print                    - Print coverage addresses
      -covered-functions        - Print all covered functions.
      -not-covered-functions    - Print all not covered functions.
      -symbolize                - Symbolizes the report.

    Options
      -blacklist=<string>         - Blacklist file (sanitizer blacklist format).
      -demangle                   - Print demangled function name.
      -strip_path_prefix=<string> - Strip this prefix from file paths in reports

Coverage Reports
----------------

**Experimental**

``.sancov`` files do not contain enough information to generate a source-level
coverage report. The missing information is contained
in the debug info of the binary. Thus the ``.sancov`` file has to be symbolized
to produce a ``.symcov`` file first:

.. code-block:: console

    sancov -symbolize my_program.123.sancov my_program > my_program.123.symcov

The ``.symcov`` file can be browsed overlaid on the source code by
running the ``tools/sancov/coverage-report-server.py`` script, which will start
an HTTP server.

Output directory
----------------

By default, ``.sancov`` files are created in the current working directory.
This can be changed with ``ASAN_OPTIONS=coverage_dir=/path``:

.. code-block:: console

    % ASAN_OPTIONS="coverage=1:coverage_dir=/tmp/cov" ./a.out foo
    % ls -l /tmp/cov/*sancov
    -rw-r----- 1 kcc eng 4 Nov 27 12:21 a.out.22673.sancov
    -rw-r----- 1 kcc eng 8 Nov 27 12:21 a.out.22679.sancov