1 | <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" |
2 | "http://www.w3.org/TR/html4/strict.dtd"> |
3 | <html> |
4 | <head> |
5 | <title>Open Projects</title> |
6 | <link type="text/css" rel="stylesheet" href="menu.css"> |
7 | <link type="text/css" rel="stylesheet" href="content.css"> |
8 | <script type="text/javascript" src="scripts/menu.js"></script> |
9 | </head> |
10 | <body> |
11 | |
12 | <div id="page"> |
13 | <!--#include virtual="menu.html.incl"--> |
14 | <div id="content"> |
15 | |
16 | <h1>Open Projects</h1> |
17 | |
18 | <p>This page lists several projects that would boost analyzer's usability and |
19 | power. Most of the projects listed here are infrastructure-related so this list |
20 | is an addition to the <a href="potential_checkers.html">potential checkers |
21 | list</a>. If you are interested in tackling one of these, please send an email |
22 | to the <a href=http://lists.llvm.org/mailman/listinfo/cfe-dev>cfe-dev |
23 | mailing list</a> to notify other members of the community.</p> |
24 | |
25 | <ul> |
26 | <li>Release checkers from "alpha" |
27 | <p>New checkers which were contributed to the analyzer, |
28 | but have not passed a rigorous evaluation process, |
29 | are committed as "alpha checkers" (from "alpha version"), |
30 | and are not enabled by default.</p> |
31 | |
32 | <p>Ideally, only the checkers which are actively being worked on should be in |
33 | "alpha", |
34 | but over the years the development of many of those has stalled. |
35 | Such checkers should either be improved |
36 | up to a point where they can be enabled by default, |
37 | or removed from the analyzer entirely. |
38 | |
39 | <ul> |
40 | <li><code>alpha.security.ArrayBound</code> and |
41 | <code>alpha.security.ArrayBoundV2</code> |
42 | <p>Array bounds checking is a desired feature, |
43 | but having an acceptable rate of false positives might not be possible |
44 | without a proper |
45 | <a href="https://en.wikipedia.org/wiki/Widening_(computer_science)">loop widening</a> support. |
46 | Additionally, it might be more promising to perform index checking based on |
47 | <a href="https://en.wikipedia.org/wiki/Taint_checking">tainted</a> index values. |
48 | <p><i>(Difficulty: Medium)</i></p></p> |
49 | </li> |
50 | |
51 | <li><code>alpha.cplusplus.MisusedMovedObject</code> |
52 | <p>The checker emits a warning on objects which were used after |
53 | <a href="https://en.cppreference.com/w/cpp/utility/move">move</a>. |
54 | Currently it has an overly high false positive rate due to classes |
55 | which have a well-defined semantics for use-after-move. |
56 | This property does not hold for STL objects, but is often the case |
57 | for custom containers. |
58 | <p><i>(Difficulty: Medium)</i></p></p> |
59 | </li> |
60 | |
61 | <li><code>alpha.unix.StreamChecker</code> |
62 | <p>A SimpleStreamChecker has been presented in the Building a Checker in 24 |
63 | Hours talk |
64 | (<a href="http://llvm.org/devmtg/2012-11/Zaks-Rose-Checker24Hours.pdf">slides</a> |
65 | <a href="https://youtu.be/kdxlsP5QVPw">video</a>).</p> |
66 | |
67 | <p>This alpha checker is an attempt to write a production grade stream checker. |
68 | However, it was found to have an unacceptably high false positive rate. |
69 | One of the found problems was that eagerly splitting the state |
70 | based on whether the system call may fail leads to too many reports. |
71 | A <em>delayed</em> split where the implication is stored in the state |
72 | (similarly to nullability implications in <code>TrustNonnullChecker</code>) |
73 | may produce much better results.</p> |
74 | <p><i>(Difficulty: Medium)</i></p> |
75 | </li> |
76 | </ul> |
77 | </li> |
78 | |
79 | <li>Improve C++ support |
80 | <ul> |
81 | <li>Handle aggregate construction. |
82 | <p><a href="https://en.cppreference.com/w/cpp/language/aggregate_initialization">Aggregates</a> |
83 | are objects that can be brace-initialized without calling a |
84 | constructor (that is, <code><a href="https://clang.llvm.org/doxygen/classclang_1_1CXXConstructExpr.html"> |
85 | CXXConstructExpr</a></code> does not occur in the AST), |
86 | but potentially calling |
87 | constructors for their fields and base classes |
88 | These |
89 | constructors of sub-objects need to know what object they are constructing. |
90 | Moreover, if the aggregate contains |
91 | references, lifetime extension needs to be properly modeled. |
92 | |
93 | One can start untangling this problem by trying to replace the |
94 | current ad-hoc <code><a href="https://clang.llvm.org/doxygen/classclang_1_1ParentMap.html"> |
95 | ParentMap</a></code> lookup in <a href="https://clang.llvm.org/doxygen/ExprEngineCXX_8cpp_source.html#l00430"> |
96 | <code>CXXConstructExpr::CK_NonVirtualBase</code></a> branch of |
97 | <code>ExprEngine::VisitCXXConstructExpr()</code> |
98 | with proper support for the feature. |
99 | <p><i>(Difficulty: Medium) </i></p></p> |
100 | </li> |
101 | |
102 | <li>Handle constructors within <code>new[]</code> |
103 | <p>When an array of objects is allocated using the <code>operator new[]</code>, |
104 | constructors for all elements of the array are called. |
105 | We should model (potentially some of) such evaluations, |
106 | and the same applies for destructors called from |
107 | <code>operator delete[]</code>. |
108 | </p> |
109 | </li> |
110 | |
111 | <li>Handle constructors that can be elided due to Named Return Value Optimization (NRVO) |
112 | <p>Local variables which are returned by values on all return statements |
113 | may be stored directly at the address for the return value, |
114 | eliding the copy or move constructor call. |
115 | Such variables can be identified using the AST call <code>VarDecl::isNRVOVariable</code>. |
116 | </p> |
117 | </li> |
118 | |
119 | <li>Handle constructors of lambda captures |
120 | <p>Variables which are captured by value into a lambda require a call to |
121 | a copy constructor. |
122 | This call is not currently modeled. |
123 | </p> |
124 | </li> |
125 | |
126 | <li>Handle constructors for default arguments |
127 | <p>Default arguments in C++ are recomputed at every call, |
128 | and are therefore local, and not static, variables. |
129 | </p> |
130 | </li> |
131 | |
132 | <li>Enhance the modeling of the standard library. |
133 | <p>The analyzer needs a better understanding of STL in order to be more |
134 | useful on C++ codebases. |
135 | While full library modeling is not an easy task, |
136 | large gains can be achieved by supporting only a few cases: |
137 | e.g. calling <code>.length()</code> on an empty |
138 | <code>std::string</code> always yields zero. |
139 | <p><i>(Difficulty: Medium)</i></p><p> |
140 | </li> |
141 | |
142 | <li>Enhance CFG to model exception-handling. |
143 | <p>Currently exceptions are treated as "black holes", and exception-handling |
144 | control structures are poorly modeled in order to be conservative. |
145 | This could be improved for both C++ and Objective-C exceptions. |
146 | <p><i>(Difficulty: Hard)</i></p></p> |
147 | </li> |
148 | </ul> |
149 | </li> |
150 | |
151 | <li>Core Analyzer Infrastructure |
152 | <ul> |
153 | <li>Handle unions. |
154 | <p>Currently in the analyzer the value of a union is always regarded as |
155 | an unknown. |
156 | This problem was |
157 | previously <a href="http://lists.llvm.org/pipermail/cfe-dev/2017-March/052864.html">discussed</a> |
158 | on the mailing list, but no solution was implemented. |
159 | <p><i> (Difficulty: Medium) </i></p></p> |
160 | </li> |
161 | |
162 | <li>Floating-point support. |
163 | <p>Currently, the analyzer treats all floating-point values as unknown. |
164 | This project would involve adding a new <code>SVal</code> kind |
165 | for constant floats, generalizing the constraint manager to handle floats, |
166 | and auditing existing code to make sure it doesn't |
167 | make incorrect assumptions (most notably, that <code>X == X</code> |
168 | is always true, since it does not hold for <code>NaN</code>). |
169 | <p><i> (Difficulty: Medium)</i></p></p> |
170 | </li> |
171 | |
172 | <li>Improved loop execution modeling. |
173 | <p>The analyzer simply unrolls each loop <tt>N</tt> times before |
174 | dropping the path, for a fixed constant <tt>N</tt>. |
175 | However, that results in lost coverage in cases where the loop always |
176 | executes more than <tt>N</tt> times. |
177 | A Google Summer Of Code |
178 | <a href="https://summerofcode.withgoogle.com/archive/2017/projects/6071606019358720/">project</a> |
179 | was completed to make the loop bound parameterizable, |
180 | but the <a href="https://en.wikipedia.org/wiki/Widening_(computer_science)">widening</a> |
181 | problem still remains open. |
182 | |
183 | <p><i> (Difficulty: Hard)</i></p></p> |
184 | </li> |
185 | |
186 | <li>Basic function summarization support |
187 | <p>The analyzer performs inter-procedural analysis using |
188 | either inlining or "conservative evaluation" (invalidating all data |
189 | passed to the function). |
190 | Often, a very simple summary |
191 | (e.g. "this function is <a href="https://en.wikipedia.org/wiki/Pure_function">pure</a>") would be |
192 | enough to be a large improvement over conservative evaluation. |
193 | Such summaries could be obtained either syntactically, |
194 | or using a dataflow framework. |
195 | <p><i>(Difficulty: Hard)</i></p><p> |
196 | </li> |
197 | |
198 | <li>Implement a dataflow flamework. |
199 | <p>The analyzer core |
200 | implements a <a href="https://en.wikipedia.org/wiki/Symbolic_execution">symbolic execution</a> |
201 | engine, which performs checks |
202 | (use-after-free, uninitialized value read, etc.) |
203 | over a <em>single</em> program path. |
204 | However, many useful properties |
205 | (dead code, check-after-use, etc.) require |
206 | reasoning over <em>all</em> possible in a program. |
207 | Such reasoning requires a |
208 | <a href="https://en.wikipedia.org/wiki/Data-flow_analysis">dataflow analysis</a> framework. |
209 | Clang already implements |
210 | a few dataflow analyses (most notably, liveness), |
211 | but they implemented in an ad-hoc fashion. |
212 | A proper framework would enable us writing many more useful checkers. |
213 | <p><i> (Difficulty: Hard) </i></p></p> |
214 | </li> |
215 | |
216 | <li>Track type information through casts more precisely. |
217 | <p>The <code>DynamicTypePropagation</code> |
218 | checker is in charge of inferring a region's |
219 | dynamic type based on what operations the code is performing. |
220 | Casts are a rich source of type information that the analyzer currently ignores. |
221 | <p><i>(Difficulty: Medium)</i></p></p> |
222 | </li> |
223 | |
224 | </ul> |
225 | </li> |
226 | |
227 | <li>Fixing miscellaneous bugs |
228 | <p>Apart from the open projects listed above, |
229 | contributors are welcome to fix any of the outstanding |
230 | <a href="https://bugs.llvm.org/buglist.cgi?component=Static%20Analyzer&list_id=147756&product=clang&resolution=---">bugs</a> |
231 | in the Bugzilla. |
232 | <p><i>(Difficulty: Anything)</i></p></p> |
233 | </li> |
234 | |
235 | </ul> |
236 | |
237 | </div> |
238 | </div> |
239 | </body> |
240 | </html> |
241 | |
242 | |