=======================================================
Hardware-assisted AddressSanitizer Design Documentation
=======================================================

This page is a design document for
**hardware-assisted AddressSanitizer** (or **HWASAN**),
a tool similar to :doc:`AddressSanitizer`,
but based on partial hardware assistance.


Introduction
============

:doc:`AddressSanitizer`
tags every 8 bytes of the application memory with a 1-byte tag (using *shadow memory*),
uses *redzones* to find buffer overflows, and
*quarantine* to find use-after-free.
The redzones, the quarantine, and, to a lesser extent, the shadow are the
sources of AddressSanitizer's memory overhead.
See the `AddressSanitizer paper`_ for details.

AArch64 has `Address Tagging`_ (or top-byte-ignore, TBI), a hardware feature that allows
software to use the 8 most significant bits of a 64-bit pointer as
a tag. HWASAN uses `Address Tagging`_
to implement a memory safety tool, similar to :doc:`AddressSanitizer`,
but with smaller memory overhead and slightly different (mostly better)
accuracy guarantees.

Algorithm
=========

* Every heap/stack/global memory object is forcibly aligned to `TG` bytes
  (`TG` is e.g. 16 or 64). We call `TG` the **tagging granularity**.
* For every such object a random `TS`-bit tag `T` is chosen (`TS`, or tag size, is e.g. 4 or 8).
* The pointer to the object is tagged with `T`.
* The memory for the object is also tagged with `T` (using a `TG=>1` shadow memory).
* Every load and store is instrumented to read the memory tag and compare it
  with the pointer tag; an exception is raised on a tag mismatch.

For a more detailed discussion of this approach see https://arxiv.org/pdf/1802.09517.pdf
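
The steps above can be modeled in plain C. The following is a minimal sketch
rather than the HWASAN runtime: it assumes `TG` = 16 and `TS` = 8, treats a
"pointer" as an offset into a small simulated memory, and lets the caller
supply the tag instead of choosing it randomly. All names (`make_tagged`,
`tag_memory`, `check_access`, `MEM_SIZE`) are hypothetical.

.. code-block:: c

  #include <stddef.h>
  #include <stdint.h>

  /* Assumed parameters for this sketch: TG = 16 bytes, TS = 8 bits. */
  enum { TG = 16, MEM_SIZE = 1024 };

  /* TG=>1 shadow: one tag byte per TG-byte granule of simulated memory. */
  static uint8_t shadow[MEM_SIZE / TG];

  /* Place the tag in the top 8 bits of a 64-bit "pointer" (here: an offset). */
  static uint64_t make_tagged(uint64_t offset, uint8_t tag) {
    return ((uint64_t)tag << 56) | offset;
  }

  /* Tag every granule covered by an object of `size` bytes that starts at a
     TG-aligned `offset`. The caller keeps offset + size within MEM_SIZE. */
  static void tag_memory(uint64_t offset, size_t size, uint8_t tag) {
    for (uint64_t g = offset / TG; g < (offset + size + TG - 1) / TG; g++)
      shadow[g] = tag;
  }

  /* The check performed before a load or store:
     returns 1 if the pointer tag matches the memory tag, 0 otherwise. */
  static int check_access(uint64_t tagged) {
    uint8_t ptr_tag = (uint8_t)(tagged >> 56);
    uint64_t offset = tagged & ((UINT64_C(1) << 56) - 1);
    if (offset >= MEM_SIZE)
      return 0;
    return shadow[offset / TG] == ptr_tag;
  }

In this model, an access past the last granule of an object, or an access
through a stale pointer after the memory has been retagged, fails the check
unless the two tags happen to collide.
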

Instrumentation
===============

Memory Accesses
---------------
All memory accesses are prefixed with an inline instruction sequence that
verifies the tags. Currently, the following sequence is used:


.. code-block:: none

  // int foo(int *a) { return *a; }
  // clang -O2 --target=aarch64-linux -fsanitize=hwaddress -c load.c
  foo:
     0:  08 00 00 90   adrp  x8, 0 <__hwasan_shadow>
     4:  08 01 40 f9   ldr   x8, [x8]            // shadow base (to be resolved by the loader)
     8:  09 dc 44 d3   ubfx  x9, x0, #4, #52     // shadow offset
     c:  28 69 68 38   ldrb  w8, [x9, x8]        // load shadow tag
    10:  09 fc 78 d3   lsr   x9, x0, #56         // extract address tag
    14:  3f 01 08 6b   cmp   w9, w8              // compare tags
    18:  61 00 00 54   b.ne  24                  // jump on mismatch
    1c:  00 00 40 b9   ldr   w0, [x0]            // original load
    20:  c0 03 5f d6   ret
    24:  40 20 21 d4   brk   #0x902              // trap

Alternatively, memory accesses are prefixed with a function call.
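
The arithmetic of the inline sequence can be written out in C. This is a
sketch of the bit manipulation only, assuming the `TG` = 16 layout implied by
the `ubfx x9, x0, #4, #52` instruction; the helper names and the
`shadow_base` parameter are hypothetical and not part of the HWASAN runtime.

.. code-block:: c

  #include <stdint.h>

  /* ubfx x9, x0, #4, #52: take bits 4..55 of the pointer. This drops the
     top-byte tag and divides the address by 16. */
  static uint64_t shadow_offset(uint64_t ptr) {
    return (ptr >> 4) & ((UINT64_C(1) << 52) - 1);
  }

  /* lsr x9, x0, #56: the address tag is the top byte of the pointer. */
  static uint8_t address_tag(uint64_t ptr) {
    return (uint8_t)(ptr >> 56);
  }

  /* ldrb + cmp: load the shadow tag and compare it with the address tag.
     Returns 1 on a match; the inline sequence traps on a mismatch. */
  static int tags_match(const uint8_t *shadow_base, uint64_t ptr) {
    return shadow_base[shadow_offset(ptr)] == address_tag(ptr);
  }

For example, the pointer `0xAB00000000001234` has address tag `0xAB` and
shadow offset `0x123`.
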

Heap
----

Tagging the heap memory/pointers is done by `malloc`.
This can be based on any malloc that forces all objects to be `TG`-aligned.
`free` tags the memory with a different tag.
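
A toy allocator illustrates the idea. This sketch assumes `TG` = 16 and 8-bit
tags, hands out offsets into a simulated heap with a bump pointer, and never
reuses memory. The names `model_malloc`, `model_free`, and `model_check` are
hypothetical, and unlike a real `free`, `model_free` takes the object size to
keep the example short.

.. code-block:: c

  #include <stddef.h>
  #include <stdint.h>
  #include <stdlib.h>

  enum { TG = 16, HEAP_SIZE = 4096 };

  static uint8_t heap_shadow[HEAP_SIZE / TG]; /* one tag per granule */
  static size_t heap_top;                     /* always a multiple of TG */

  /* Pick a random 8-bit tag that differs from `old`. */
  static uint8_t other_tag(uint8_t old) {
    uint8_t t;
    do {
      t = (uint8_t)(rand() & 0xff);
    } while (t == old);
    return t;
  }

  static size_t round_up(size_t size) {
    return size == 0 ? TG : (size + TG - 1) / TG * TG;
  }

  static void set_tags(size_t off, size_t bytes, uint8_t tag) {
    for (size_t g = off / TG; g < (off + bytes) / TG; g++)
      heap_shadow[g] = tag;
  }

  /* Returns a tagged, TG-aligned offset, or UINT64_MAX if the heap is full. */
  static uint64_t model_malloc(size_t size) {
    if (size > HEAP_SIZE || round_up(size) > HEAP_SIZE - heap_top)
      return UINT64_MAX;
    size_t off = heap_top;
    heap_top += round_up(size);
    uint8_t tag = other_tag(heap_shadow[off / TG]);
    set_tags(off, round_up(size), tag);
    return ((uint64_t)tag << 56) | off;
  }

  /* Retag the memory so that stale pointers no longer match. */
  static void model_free(uint64_t p, size_t size) {
    size_t off = (size_t)(p & ((UINT64_C(1) << 56) - 1));
    set_tags(off, round_up(size), other_tag((uint8_t)(p >> 56)));
  }

  static int model_check(uint64_t p) {
    uint64_t off = p & ((UINT64_C(1) << 56) - 1);
    return off < HEAP_SIZE && heap_shadow[off / TG] == (uint8_t)(p >> 56);
  }

In this model a dangling pointer always fails the check right after
`model_free`, because the new tag is chosen to differ from the old one. Once
memory is reused and retagged, detection becomes probabilistic.
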

Stack
-----

Stack frames are instrumented by aligning all non-promotable allocas
to `TG` bytes and tagging stack memory in the function prologue and epilogue.

Tags for different allocas in one function are **not** generated
independently; doing that in a function with `M` allocas would require
maintaining `M` live stack pointers, significantly increasing register
pressure. Instead we generate a single base tag value in the prologue,
and build the tag for alloca number `i` as `ReTag(BaseTag, i)`, where
ReTag can be as simple as exclusive-or with the constant `i`.
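
A minimal sketch of the exclusive-or variant of ReTag, assuming 8-bit tags.
The function name `retag` is hypothetical; the instrumentation may use a
different mixing function.

.. code-block:: c

  #include <stdint.h>

  /* Derive the tag of the alloca with the given index from the per-frame
     base tag. XOR with distinct indices below 256 yields distinct tags. */
  static uint8_t retag(uint8_t base_tag, unsigned index) {
    return (uint8_t)(base_tag ^ (uint8_t)index);
  }

With a base tag of `0x5c`, allocas 0, 1, and 3 receive tags `0x5c`, `0x5d`,
and `0x5f`. Only the base tag needs to stay live, since each alloca's tag can
be recomputed from it and a compile-time constant.
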

Stack instrumentation is expected to be a major source of overhead,
but could be optional.

Globals
-------

TODO: details.

Error reporting
---------------

Errors are generated by a trap instruction (`brk` in the inline sequence shown
above) and are handled by a signal handler.

Attribute
---------

HWASAN uses its own LLVM IR Attribute `sanitize_hwaddress` and a matching
C function attribute. An alternative would be to re-use ASAN's attribute
`sanitize_address`. The reasons to use a separate attribute are:

* Users may need to disable ASAN but not HWASAN, or vice versa,
  because the tools have different trade-offs and compatibility issues.
* LLVM (ideally) does not use flags to decide which pass is applied;
  whether ASAN or HWASAN is applied is determined by the function attributes.

This does mean that users of HWASAN may need to add the new attribute
to the code that already uses the old attribute.
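
As an illustration on the C side, the sketch below opts a single function out
of HWASAN instrumentation. It assumes Clang's `no_sanitize("hwaddress")`
function attribute and the `__has_feature(hwaddress_sanitizer)` check; the
macro `NO_HWASAN` and the function `read_raw` are hypothetical names chosen
for this example.

.. code-block:: c

  /* Compilers without __has_feature treat every feature as absent. */
  #ifndef __has_feature
  #define __has_feature(x) 0
  #endif

  /* Expands to the attribute only when building with -fsanitize=hwaddress. */
  #if __has_feature(hwaddress_sanitizer)
  #define NO_HWASAN __attribute__((no_sanitize("hwaddress")))
  #else
  #define NO_HWASAN
  #endif

  /* Loads in this function are not prefixed with tag checks. */
  NO_HWASAN static int read_raw(const int *p) {
    return *p;
  }

Code that already carries an ASAN opt-out would need a second, separate
annotation like this one to opt out of HWASAN as well.
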


Comparison with AddressSanitizer
================================

HWASAN:

* Is less portable than :doc:`AddressSanitizer`
  as it relies on hardware `Address Tagging`_ (AArch64).
  Address Tagging can be emulated with compiler instrumentation,
  but it will require the instrumentation to remove the tags before
  any load or store, which is infeasible in any realistic environment
  that contains non-instrumented code.
* May have compatibility problems if the target code uses higher
  pointer bits for other purposes.
* May require changes in OS kernels (e.g. Linux seems to dislike
  tagged pointers passed from user space:
  https://www.kernel.org/doc/Documentation/arm64/tagged-pointers.txt).
* **Does not require redzones to detect buffer overflows**,
  but the buffer overflow detection is probabilistic, with roughly
  `1/(2**TS)` chance of missing a bug (6.25% or 0.39% with 4 and 8-bit TS
  respectively).
* **Does not require quarantine to detect heap-use-after-free,
  or stack-use-after-return**.
  The detection is similarly probabilistic.

The memory overhead of HWASAN is expected to be much smaller
than that of AddressSanitizer:
`1/TG` extra memory for the shadow
and some overhead due to `TG`-aligning all objects.
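
The two ratios quoted above follow directly from `TS` and `TG`. The helpers
below are a hypothetical sketch that reproduces the arithmetic; they ignore
the additional, workload-dependent cost of `TG`-aligning objects.

.. code-block:: c

  /* Chance that two independently chosen TS-bit tags collide, i.e. that a
     bug goes undetected: 1 / 2**TS. Assumes ts_bits < 64. */
  static double miss_probability(unsigned ts_bits) {
    return 1.0 / (double)(1ULL << ts_bits);
  }

  /* Shadow memory as a fraction of application memory: 1 / TG. */
  static double shadow_overhead(unsigned tg_bytes) {
    return 1.0 / (double)tg_bytes;
  }

`miss_probability(4)` is 0.0625 (6.25%) and `miss_probability(8)` is
0.00390625 (about 0.39%). `shadow_overhead(16)` is 0.0625 and
`shadow_overhead(64)` is 0.015625.
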

Supported architectures
=======================
HWASAN relies on `Address Tagging`_, which is only available on AArch64.
For other 64-bit architectures it is possible to remove the address tags
before every load and store by compiler instrumentation, but this variant
will have limited deployability since not all of the code is
typically instrumented.
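
Removing the tag amounts to clearing the top byte before the pointer is
dereferenced. The helper below is a hypothetical sketch that assumes an 8-bit
tag in bits 56..63 and user-space addresses whose top byte is otherwise zero.

.. code-block:: c

  #include <stdint.h>

  /* Clear the top-byte tag, leaving the low 56 address bits untouched. */
  static uint64_t strip_tag(uint64_t tagged) {
    return tagged & ((UINT64_C(1) << 56) - 1);
  }

Any non-instrumented code that receives a tagged pointer would skip this step
and fault on the access, which is the deployability limit described above.
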

HWASAN's approach is not applicable to 32-bit architectures.


Related Work
============
* `SPARC ADI`_ implements a similar tool mostly in hardware.
* `Effective and Efficient Memory Protection Using Dynamic Tainting`_ discusses
  similar approaches ("lock & key").
* `Watchdog`_ discusses a heavier, but still somewhat similar
  "lock & key" approach.
* *TODO: add more "related work" links. Suggestions are welcome.*


.. _Watchdog: https://www.cis.upenn.edu/acg/papers/isca12_watchdog.pdf
.. _Effective and Efficient Memory Protection Using Dynamic Tainting: https://www.cc.gatech.edu/~orso/papers/clause.doudalis.orso.prvulovic.pdf
.. _SPARC ADI: https://lazytyped.blogspot.com/2017/09/getting-started-with-adi.html
.. _AddressSanitizer paper: https://www.usenix.org/system/files/conference/atc12/atc12-final39.pdf
.. _Address Tagging: http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.den0024a/ch12s05s01.html
