1 | ========================== |
2 | UndefinedBehaviorSanitizer |
3 | ========================== |
4 | |
5 | .. contents:: |
6 | :local: |
7 | |
8 | Introduction |
9 | ============ |
10 | |
11 | UndefinedBehaviorSanitizer (UBSan) is a fast undefined behavior detector. |
12 | UBSan modifies the program at compile-time to catch various kinds of undefined |
13 | behavior during program execution, for example: |
14 | |
15 | * Using a misaligned or null pointer |
16 | * Signed integer overflow |
17 | * Conversion to, from, or between floating-point types which would |
18 | overflow the destination |
19 | |
20 | See the full list of available :ref:`checks <ubsan-checks>` below. |
21 | |
22 | UBSan has an optional run-time library which provides better error reporting. |
23 | The checks have a small runtime cost and no impact on address space layout or ABI. |
24 | |
25 | How to build |
26 | ============ |
27 | |
28 | Build LLVM/Clang with `CMake <https://llvm.org/docs/CMake.html>`_. |
29 | |
30 | Usage |
31 | ===== |
32 | |
33 | Use ``clang++`` to compile and link your program with the ``-fsanitize=undefined`` |
34 | flag. Make sure to use ``clang++`` (not ``ld``) as the linker, so that your |
35 | executable is linked with the proper UBSan runtime libraries. You can use ``clang`` |
36 | instead of ``clang++`` if you're compiling/linking C code. |
37 | |
38 | .. code-block:: console |
39 | |
40 | % cat test.cc |
41 | int main(int argc, char **argv) { |
42 | int k = 0x7fffffff; |
43 | k += argc; |
44 | return 0; |
45 | } |
46 | % clang++ -fsanitize=undefined test.cc |
47 | % ./a.out |
48 | test.cc:3:5: runtime error: signed integer overflow: 2147483647 + 1 cannot be represented in type 'int' |
49 | |
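One way to avoid the overflow reported above is to test the operands before
adding them. The following is a minimal sketch, not part of UBSan;
``checked_add`` is a hypothetical helper name introduced here for illustration.

```cpp
#include <climits>

// Hypothetical helper (not a UBSan API): stores a + b in `out` and returns
// true when the sum fits in an int; returns false and leaves `out` untouched
// when the addition would overflow.
bool checked_add(int a, int b, int &out) {
  if (b > 0 && a > INT_MAX - b) return false;  // sum would exceed INT_MAX
  if (b < 0 && a < INT_MIN - b) return false;  // sum would fall below INT_MIN
  out = a + b;  // safe: the range was checked above
  return true;
}
```

With ``k = 0x7fffffff`` and ``argc == 1`` as in ``test.cc``, the call
``checked_add(k, argc, k)`` returns ``false``, so the program can handle the
condition explicitly instead of overflowing.
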
50 | You can enable only a subset of :ref:`checks <ubsan-checks>` offered by UBSan, |
51 | and define the desired behavior for each kind of check: |
52 | |
53 | * ``-fsanitize=...``: print a verbose error report and continue execution (default); |
54 | * ``-fno-sanitize-recover=...``: print a verbose error report and exit the program; |
55 | * ``-fsanitize-trap=...``: execute a trap instruction (doesn't require UBSan run-time support). |
56 | |
57 | For example, if you compile/link your program as: |
58 | |
59 | .. code-block:: console |
60 | |
61 | % clang++ -fsanitize=signed-integer-overflow,null,alignment -fno-sanitize-recover=null -fsanitize-trap=alignment |
62 | |
63 | the program will continue execution after signed integer overflows, exit after |
64 | the first invalid use of a null pointer, and trap after the first use of a |
65 | misaligned pointer. |
66 | |
67 | .. _ubsan-checks: |
68 | |
69 | Available checks |
70 | ================ |
71 | |
72 | Available checks are: |
73 | |
74 | - ``-fsanitize=alignment``: Use of a misaligned pointer or creation |
75 | of a misaligned reference. Also sanitizes assume_aligned-like attributes. |
76 | - ``-fsanitize=bool``: Load of a ``bool`` value which is neither |
77 | ``true`` nor ``false``. |
78 | - ``-fsanitize=builtin``: Passing invalid values to compiler builtins. |
79 | - ``-fsanitize=bounds``: Out of bounds array indexing, in cases |
80 | where the array bound can be statically determined. |
81 | - ``-fsanitize=enum``: Load of a value of an enumerated type which |
82 | is not in the range of representable values for that enumerated |
83 | type. |
84 | - ``-fsanitize=float-cast-overflow``: Conversion to, from, or |
85 | between floating-point types which would overflow the |
86 | destination. |
87 | - ``-fsanitize=float-divide-by-zero``: Floating point division by |
88 | zero. |
89 | - ``-fsanitize=function``: Indirect call of a function through a |
90 | function pointer of the wrong type (Darwin/Linux, C++ and x86/x86_64 |
91 | only). |
92 | - ``-fsanitize=implicit-unsigned-integer-truncation``, |
93 | ``-fsanitize=implicit-signed-integer-truncation``: Implicit conversion from |
94 | integer of larger bit width to smaller bit width, if that results in data |
95 | loss. That is, if the demoted value, after casting back to the original |
96 | width, is not equal to the original value before the downcast. |
97 | The ``-fsanitize=implicit-unsigned-integer-truncation`` handles conversions |
98 | between two ``unsigned`` types, while |
99 | ``-fsanitize=implicit-signed-integer-truncation`` handles the rest of the |
100 | conversions - when either one, or both of the types are signed. |
101 | Issues caught by these sanitizers are not undefined behavior, |
102 | but are often unintentional. |
103 | - ``-fsanitize=implicit-integer-sign-change``: Implicit conversion between |
104 | integer types, if that changes the sign of the value. That is, if the |
105 | original value was negative and the new value is positive (or zero), |
106 | or the original value was positive, and the new value is negative. |
107 | Issues caught by this sanitizer are not undefined behavior, |
108 | but are often unintentional. |
109 | - ``-fsanitize=integer-divide-by-zero``: Integer division by zero. |
110 | - ``-fsanitize=nonnull-attribute``: Passing a null pointer as a function |
111 | parameter which is declared to never be null. |
112 | - ``-fsanitize=null``: Use of a null pointer or creation of a null |
113 | reference. |
114 | - ``-fsanitize=nullability-arg``: Passing null as a function parameter |
115 | which is annotated with ``_Nonnull``. |
116 | - ``-fsanitize=nullability-assign``: Assigning null to an lvalue which |
117 | is annotated with ``_Nonnull``. |
118 | - ``-fsanitize=nullability-return``: Returning null from a function with |
119 | a return type annotated with ``_Nonnull``. |
120 | - ``-fsanitize=object-size``: An attempt to potentially use bytes which |
121 | the optimizer can determine are not part of the object being accessed. |
122 | This will also detect some types of undefined behavior that may not |
123 | directly access memory, but are provably incorrect given the size of |
124 | the objects involved, such as invalid downcasts and calling methods on |
125 | invalid pointers. These checks are made in terms of |
126 | ``__builtin_object_size``, and consequently may be able to detect more |
127 | problems at higher optimization levels. |
128 | - ``-fsanitize=pointer-overflow``: Performing pointer arithmetic which |
129 | overflows. |
130 | - ``-fsanitize=return``: In C++, reaching the end of a |
131 | value-returning function without returning a value. |
132 | - ``-fsanitize=returns-nonnull-attribute``: Returning a null pointer |
133 | from a function which is declared to never return null. |
134 | - ``-fsanitize=shift``: Shift operators where the amount shifted is |
135 | greater than or equal to the promoted bit-width of the left-hand side |
136 | or less than zero, or where the left-hand side is negative. For a |
137 | signed left shift, also checks for signed overflow in C, and for |
138 | unsigned overflow in C++. You can use ``-fsanitize=shift-base`` or |
139 | ``-fsanitize=shift-exponent`` to check only the left-hand side or the |
140 | right-hand side of the shift operation, respectively. |
141 | - ``-fsanitize=signed-integer-overflow``: Signed integer overflow, where the |
142 | result of a signed integer computation cannot be represented in its type. |
143 | This includes all the checks covered by ``-ftrapv``, as well as checks for |
144 | signed division overflow (``INT_MIN/-1``), but not checks for |
145 | lossy implicit conversions performed before the computation |
146 | (see ``-fsanitize=implicit-conversion``). Such conversions are instead |
147 | handled by the ``-fsanitize=implicit-conversion`` group of checks. |
148 | - ``-fsanitize=unreachable``: If control flow reaches an unreachable |
149 | program point. |
150 | - ``-fsanitize=unsigned-integer-overflow``: Unsigned integer overflow, where |
151 | the result of an unsigned integer computation cannot be represented in its |
152 | type. Unlike signed integer overflow, this is not undefined behavior, but |
153 | it is often unintentional. This sanitizer does not check for lossy implicit |
154 | conversions performed before such a computation |
155 | (see ``-fsanitize=implicit-conversion``). |
156 | - ``-fsanitize=vla-bound``: A variable-length array whose bound |
157 | does not evaluate to a positive value. |
158 | - ``-fsanitize=vptr``: Use of an object whose vptr indicates that it is of |
159 | the wrong dynamic type, or that its lifetime has not begun or has ended. |
160 | Incompatible with ``-fno-rtti``. Link must be performed by ``clang++``, not |
161 | ``clang``, to make sure C++-specific parts of the runtime library and C++ |
162 | standard libraries are present. |
163 | |
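The conditions described for the implicit-conversion checks above can be
written out as ordinary predicates. This is a rough sketch of the stated
definitions for one pair of types, not of how UBSan instruments code; the
function names are hypothetical, and the casts are explicit, so these checks
would not flag them.

```cpp
#include <cstdint>

// Truncation: the narrowed value, cast back to the original width, differs
// from the original value. (Before C++20 the out-of-range narrowing result is
// implementation-defined; mainstream compilers wrap modulo 2^8.)
bool truncation_loses_data(std::int32_t v) {
  const std::int8_t narrowed = static_cast<std::int8_t>(v);
  return static_cast<std::int32_t>(narrowed) != v;
}

// Sign change, signed -> unsigned: a negative value becomes non-negative.
bool sign_changes_to_unsigned(std::int32_t v) {
  return v < 0;  // every uint32_t result is non-negative
}

// Sign change, unsigned -> signed: a positive value becomes negative.
bool sign_changes_to_signed(std::uint32_t v) {
  return static_cast<std::int32_t>(v) < 0;
}
```

For instance, ``truncation_loses_data(300)`` is true because 300 does not fit
in 8 bits, while ``truncation_loses_data(-1)`` is false because -1 survives the
round trip.
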
164 | You can also use the following check groups: |
165 | - ``-fsanitize=undefined``: All of the checks listed above other than |
166 | ``unsigned-integer-overflow``, ``implicit-conversion`` and the |
167 | ``nullability-*`` group of checks. |
168 | - ``-fsanitize=undefined-trap``: Deprecated alias of |
169 | ``-fsanitize=undefined``. |
170 | - ``-fsanitize=implicit-integer-truncation``: Catches lossy integral |
171 | conversions. Enables ``implicit-signed-integer-truncation`` and |
172 | ``implicit-unsigned-integer-truncation``. |
173 | - ``-fsanitize=implicit-integer-arithmetic-value-change``: Catches implicit |
174 | conversions that change the arithmetic value of the integer. Enables |
175 | ``implicit-signed-integer-truncation`` and ``implicit-integer-sign-change``. |
176 | - ``-fsanitize=implicit-conversion``: Checks for suspicious |
177 | behaviour of implicit conversions. Enables |
178 | ``implicit-unsigned-integer-truncation``, |
179 | ``implicit-signed-integer-truncation`` and |
180 | ``implicit-integer-sign-change``. |
181 | - ``-fsanitize=integer``: Checks for undefined or suspicious integer |
182 | behavior (e.g. unsigned integer overflow). |
183 | Enables ``signed-integer-overflow``, ``unsigned-integer-overflow``, |
184 | ``shift``, ``integer-divide-by-zero``, |
185 | ``implicit-unsigned-integer-truncation``, |
186 | ``implicit-signed-integer-truncation`` and |
187 | ``implicit-integer-sign-change``. |
188 | - ``-fsanitize=nullability``: Enables ``nullability-arg``, |
189 | ``nullability-assign``, and ``nullability-return``. While violating |
190 | nullability is not undefined behavior, it is often unintentional, |
191 | so UBSan offers to catch it. |
192 | |
193 | Volatile |
194 | -------- |
195 | |
196 | The ``null``, ``alignment``, ``object-size``, and ``vptr`` checks do not apply |
197 | to pointers to types with the ``volatile`` qualifier. |
198 | |
199 | Minimal Runtime |
200 | =============== |
201 | |
202 | There is a minimal UBSan runtime available suitable for use in production |
203 | environments. This runtime has a small attack surface. It only provides very |
204 | basic issue logging and deduplication, and does not support ``-fsanitize=vptr`` |
205 | checking. |
206 | |
207 | To use the minimal runtime, add ``-fsanitize-minimal-runtime`` to the clang |
208 | command line options. For example, if you're used to compiling with |
209 | ``-fsanitize=undefined``, you could enable the minimal runtime with |
210 | ``-fsanitize=undefined -fsanitize-minimal-runtime``. |
211 | |
212 | Stack traces and report symbolization |
213 | ===================================== |
214 | If you want UBSan to print a symbolized stack trace for each error report, you |
215 | will need to: |
216 | |
217 | #. Compile with ``-g`` and ``-fno-omit-frame-pointer`` to get proper debug |
218 | information in your binary. |
219 | #. Run your program with the environment variable |
220 | ``UBSAN_OPTIONS=print_stacktrace=1``. |
221 | #. Make sure the ``llvm-symbolizer`` binary is in ``PATH``. |
222 | |
223 | Silencing Unsigned Integer Overflow |
224 | =================================== |
225 | To silence reports from unsigned integer overflow, you can set |
226 | ``UBSAN_OPTIONS=silence_unsigned_overflow=1``. This feature, combined with |
227 | ``-fsanitize-recover=unsigned-integer-overflow``, is particularly useful for |
228 | providing fuzzing signal without blowing up logs. |
229 | |
230 | Issue Suppression |
231 | ================= |
232 | |
233 | UndefinedBehaviorSanitizer is not expected to produce false positives. |
234 | If you see one, look again; most likely it is a true positive! |
235 | |
236 | Disabling Instrumentation with ``__attribute__((no_sanitize("undefined")))`` |
237 | ---------------------------------------------------------------------------- |
238 | |
239 | You can disable UBSan checks for particular functions with |
240 | ``__attribute__((no_sanitize("undefined")))``. You can use all values of the |
241 | ``-fsanitize=`` flag in this attribute; for example, if your function deliberately |
242 | contains a possible signed integer overflow, you can use |
243 | ``__attribute__((no_sanitize("signed-integer-overflow")))``. |
244 | |
245 | This attribute may not be |
246 | supported by other compilers, so consider using it together with |
247 | ``#if defined(__clang__)``. |
248 | |
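As an illustration, the attribute can be wrapped in a macro that expands to
nothing on compilers other than Clang. The sketch below uses
``unsigned-integer-overflow`` rather than a signed check, because unsigned
wraparound is well defined and the example therefore behaves the same with or
without instrumentation. The macro name ``NO_SANITIZE_UNSIGNED_OVERFLOW`` and
the function ``fnv1a`` are hypothetical names chosen for this example.

```cpp
#include <cstdint>
#include <string>

// Hypothetical macro: expands to the attribute only when compiling with Clang.
#if defined(__clang__)
#define NO_SANITIZE_UNSIGNED_OVERFLOW \
  __attribute__((no_sanitize("unsigned-integer-overflow")))
#else
#define NO_SANITIZE_UNSIGNED_OVERFLOW
#endif

// 32-bit FNV-1a hash. The multiplication is meant to wrap modulo 2^32, so
// reports from -fsanitize=unsigned-integer-overflow would only be noise here.
NO_SANITIZE_UNSIGNED_OVERFLOW
std::uint32_t fnv1a(const std::string &s) {
  std::uint32_t h = 2166136261u;  // FNV offset basis
  for (unsigned char c : s) {
    h ^= c;
    h *= 16777619u;  // FNV prime; wraps on overflow by design
  }
  return h;
}
```

The attribute only matters when the translation unit is built with the
corresponding ``-fsanitize=`` check enabled; otherwise it has no effect.
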
249 | Suppressing Errors in Recompiled Code (Blacklist) |
250 | ------------------------------------------------- |
251 | |
252 | UndefinedBehaviorSanitizer supports ``src`` and ``fun`` entity types in |
253 | :doc:`SanitizerSpecialCaseList`, that can be used to suppress error reports |
254 | in the specified source files or functions. |
255 | |
256 | Runtime suppressions |
257 | -------------------- |
258 | |
259 | You can sometimes suppress UBSan error reports for specific files, functions, |
260 | or libraries without recompiling the code. To do so, pass the path to a |
261 | suppression file in the ``UBSAN_OPTIONS`` environment variable. |
262 | |
263 | .. code-block:: bash |
264 | |
265 | UBSAN_OPTIONS=suppressions=MyUBSan.supp |
266 | |
267 | You need to specify the :ref:`check <ubsan-checks>` you are suppressing and the |
268 | bug location. For example: |
269 | |
270 | .. code-block:: bash |
271 | |
272 | signed-integer-overflow:file-with-known-overflow.cpp |
273 | alignment:function_doing_unaligned_access |
274 | vptr:shared_object_with_vptr_failures.so |
275 | |
276 | There are several limitations: |
277 | |
278 | * Your binary may need enough debug info and/or a symbol table so |
279 | that the runtime can figure out the source file or function name to match |
280 | against the suppression. |
281 | * It is only possible to suppress recoverable checks. For the example above, |
282 | you can additionally pass |
283 | ``-fsanitize-recover=signed-integer-overflow,alignment,vptr``, although |
284 | most of UBSan checks are recoverable by default. |
285 | * Check groups (like ``undefined``) can't be used in a suppressions file; only |
286 | fine-grained checks are supported. |
287 | |
288 | Supported Platforms |
289 | =================== |
290 | |
291 | UndefinedBehaviorSanitizer is supported on the following operating systems: |
292 | |
293 | * Android |
294 | * Linux |
295 | * NetBSD |
296 | * FreeBSD |
297 | * OpenBSD |
298 | * OS X 10.6 onwards |
299 | * Windows |
300 | |
301 | The runtime library is relatively portable and platform independent. If the OS |
302 | you need is not listed above, UndefinedBehaviorSanitizer may already work for |
303 | it, or could be made to work with a minor porting effort. |
304 | |
305 | Current Status |
306 | ============== |
307 | |
308 | UndefinedBehaviorSanitizer is available on selected platforms starting from LLVM |
309 | 3.3. The test suite is integrated into the CMake build and can be run with |
310 | the ``check-ubsan`` command. |
311 | |
312 | Additional Configuration |
313 | ======================== |
314 | |
315 | UndefinedBehaviorSanitizer adds static check data for each check unless it is |
316 | in trap mode. This check data includes the full file name. The option |
317 | ``-fsanitize-undefined-strip-path-components=N`` can be used to trim this |
318 | information. If ``N`` is positive, file information emitted by |
319 | UndefinedBehaviorSanitizer will drop the first ``N`` components from the file |
320 | path. If ``N`` is negative, the last ``-N`` components will be kept. |
321 | |
322 | Example |
323 | ------- |
324 | |
325 | For a file called ``/code/library/file.cpp``, here is what would be emitted: |
326 | |
327 | * Default (No flag, or ``-fsanitize-undefined-strip-path-components=0``): ``/code/library/file.cpp`` |
328 | * ``-fsanitize-undefined-strip-path-components=1``: ``code/library/file.cpp`` |
329 | * ``-fsanitize-undefined-strip-path-components=2``: ``library/file.cpp`` |
330 | * ``-fsanitize-undefined-strip-path-components=-1``: ``file.cpp`` |
331 | * ``-fsanitize-undefined-strip-path-components=-2``: ``library/file.cpp`` |
332 | |
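The list above can be reproduced with a small model of the stripping rule. This
is only a sketch that matches the documented examples for an absolute
POSIX-style path; it is not Clang's implementation. The function name
``strip_path_components`` is hypothetical, and the handling of an ``N`` larger
than the number of components is an assumption of this sketch.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model of -fsanitize-undefined-strip-path-components=N.
// The leading "/" is treated as the first path component, which is what makes
// N=1 turn "/code/library/file.cpp" into "code/library/file.cpp".
std::string strip_path_components(const std::string &path, int n) {
  if (n == 0) return path;  // default: keep the full file name

  // Split the path into components.
  std::vector<std::string> parts;
  std::size_t pos = 0;
  if (!path.empty() && path[0] == '/') {
    parts.push_back("/");
    pos = 1;
  }
  while (pos < path.size()) {
    std::size_t next = path.find('/', pos);
    if (next == std::string::npos) next = path.size();
    if (next > pos) parts.push_back(path.substr(pos, next - pos));
    pos = next + 1;
  }
  if (parts.empty()) return path;

  // Index of the first component to keep.
  std::size_t first = 0;
  if (n > 0) {
    first = static_cast<std::size_t>(n);
    // Assumption: if N is too large, keep only the last component.
    if (first >= parts.size()) first = parts.size() - 1;
  } else {
    const std::size_t keep =
        static_cast<std::size_t>(-static_cast<long long>(n));
    // Assumption: if -N is too large, keep the whole path.
    first = keep >= parts.size() ? 0 : parts.size() - keep;
  }

  // Join the kept components back together.
  std::string out;
  for (std::size_t i = first; i < parts.size(); ++i) {
    if (parts[i] == "/") {
      out += "/";
      continue;
    }
    if (!out.empty() && out.back() != '/') out += '/';
    out += parts[i];
  }
  return out;
}
```

Calling ``strip_path_components("/code/library/file.cpp", -2)`` yields
``library/file.cpp``, the same result as ``N=2`` for this three-level path.
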
333 | More Information |
334 | ================ |
335 | |
336 | * From LLVM project blog: |
337 | `What Every C Programmer Should Know About Undefined Behavior |
338 | <http://blog.llvm.org/2011/05/what-every-c-programmer-should-know.html>`_ |
339 | * From John Regehr's *Embedded in Academia* blog: |
340 | `A Guide to Undefined Behavior in C and C++ |
341 | <https://blog.regehr.org/archives/213>`_ |
342 | |