| 1 | <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" |
| 2 | "http://www.w3.org/TR/html4/strict.dtd"> |
| 3 | <html> |
| 4 | <head> |
| 5 | <title>Open Projects</title> |
| 6 | <link type="text/css" rel="stylesheet" href="menu.css"> |
| 7 | <link type="text/css" rel="stylesheet" href="content.css"> |
| 8 | <script type="text/javascript" src="scripts/menu.js"></script> |
| 9 | </head> |
| 10 | <body> |
| 11 | |
| 12 | <div id="page"> |
| 13 | <!--#include virtual="menu.html.incl"--> |
| 14 | <div id="content"> |
| 15 | |
| 16 | <h1>Open Projects</h1> |
| 17 | |
| 18 | <p>This page lists several projects that would boost analyzer's usability and |
| 19 | power. Most of the projects listed here are infrastructure-related so this list |
| 20 | is an addition to the <a href="potential_checkers.html">potential checkers |
| 21 | list</a>. If you are interested in tackling one of these, please send an email |
| 22 | to the <a href=http://lists.llvm.org/mailman/listinfo/cfe-dev>cfe-dev |
| 23 | mailing list</a> to notify other members of the community.</p> |
| 24 | |
| 25 | <ul> |
| 26 | <li>Release checkers from "alpha" |
| 27 | <p>New checkers which were contributed to the analyzer, |
| 28 | but have not passed a rigorous evaluation process, |
| 29 | are committed as "alpha checkers" (from "alpha version"), |
| 30 | and are not enabled by default.</p> |
| 31 | |
| 32 | <p>Ideally, only the checkers which are actively being worked on should be in |
| 33 | "alpha", |
| 34 | but over the years the development of many of those has stalled. |
| 35 | Such checkers should either be improved |
| 36 | up to a point where they can be enabled by default, |
| 37 | or removed from the analyzer entirely. |
| 38 | |
| 39 | <ul> |
| 40 | <li><code>alpha.security.ArrayBound</code> and |
| 41 | <code>alpha.security.ArrayBoundV2</code> |
| 42 | <p>Array bounds checking is a desired feature, |
| 43 | but having an acceptable rate of false positives might not be possible |
| 44 | without a proper |
| 45 | <a href="https://en.wikipedia.org/wiki/Widening_(computer_science)">loop widening</a> support. |
| 46 | Additionally, it might be more promising to perform index checking based on |
| 47 | <a href="https://en.wikipedia.org/wiki/Taint_checking">tainted</a> index values. |
| 48 | <p><i>(Difficulty: Medium)</i></p></p> |
| 49 | </li> |
| 50 | |
| 51 | <li><code>alpha.cplusplus.MisusedMovedObject</code> |
| 52 | <p>The checker emits a warning on objects which were used after |
| 53 | <a href="https://en.cppreference.com/w/cpp/utility/move">move</a>. |
| 54 | Currently it has an overly high false positive rate due to classes |
| 55 | which have a well-defined semantics for use-after-move. |
| 56 | This property does not hold for STL objects, but is often the case |
| 57 | for custom containers. |
| 58 | <p><i>(Difficulty: Medium)</i></p></p> |
| 59 | </li> |
| 60 | |
| 61 | <li><code>alpha.unix.StreamChecker</code> |
| 62 | <p>A SimpleStreamChecker has been presented in the Building a Checker in 24 |
| 63 | Hours talk |
| 64 | (<a href="http://llvm.org/devmtg/2012-11/Zaks-Rose-Checker24Hours.pdf">slides</a> |
| 65 | <a href="https://youtu.be/kdxlsP5QVPw">video</a>).</p> |
| 66 | |
| 67 | <p>This alpha checker is an attempt to write a production grade stream checker. |
| 68 | However, it was found to have an unacceptably high false positive rate. |
| 69 | One of the found problems was that eagerly splitting the state |
| 70 | based on whether the system call may fail leads to too many reports. |
| 71 | A <em>delayed</em> split where the implication is stored in the state |
| 72 | (similarly to nullability implications in <code>TrustNonnullChecker</code>) |
| 73 | may produce much better results.</p> |
| 74 | <p><i>(Difficulty: Medium)</i></p> |
| 75 | </li> |
| 76 | </ul> |
| 77 | </li> |
| 78 | |
| 79 | <li>Improve C++ support |
| 80 | <ul> |
| 81 | <li>Handle aggregate construction. |
| 82 | <p><a href="https://en.cppreference.com/w/cpp/language/aggregate_initialization">Aggregates</a> |
| 83 | are objects that can be brace-initialized without calling a |
| 84 | constructor (that is, <code><a href="https://clang.llvm.org/doxygen/classclang_1_1CXXConstructExpr.html"> |
| 85 | CXXConstructExpr</a></code> does not occur in the AST), |
| 86 | but potentially calling |
| 87 | constructors for their fields and base classes |
| 88 | These |
| 89 | constructors of sub-objects need to know what object they are constructing. |
| 90 | Moreover, if the aggregate contains |
| 91 | references, lifetime extension needs to be properly modeled. |
| 92 | |
| 93 | One can start untangling this problem by trying to replace the |
| 94 | current ad-hoc <code><a href="https://clang.llvm.org/doxygen/classclang_1_1ParentMap.html"> |
| 95 | ParentMap</a></code> lookup in <a href="https://clang.llvm.org/doxygen/ExprEngineCXX_8cpp_source.html#l00430"> |
| 96 | <code>CXXConstructExpr::CK_NonVirtualBase</code></a> branch of |
| 97 | <code>ExprEngine::VisitCXXConstructExpr()</code> |
| 98 | with proper support for the feature. |
| 99 | <p><i>(Difficulty: Medium) </i></p></p> |
| 100 | </li> |
| 101 | |
| 102 | <li>Handle constructors within <code>new[]</code> |
| 103 | <p>When an array of objects is allocated using the <code>operator new[]</code>, |
| 104 | constructors for all elements of the array are called. |
| 105 | We should model (potentially some of) such evaluations, |
| 106 | and the same applies for destructors called from |
| 107 | <code>operator delete[]</code>. |
| 108 | </p> |
| 109 | </li> |
| 110 | |
| 111 | <li>Handle constructors that can be elided due to Named Return Value Optimization (NRVO) |
| 112 | <p>Local variables which are returned by values on all return statements |
| 113 | may be stored directly at the address for the return value, |
| 114 | eliding the copy or move constructor call. |
| 115 | Such variables can be identified using the AST call <code>VarDecl::isNRVOVariable</code>. |
| 116 | </p> |
| 117 | </li> |
| 118 | |
| 119 | <li>Handle constructors of lambda captures |
| 120 | <p>Variables which are captured by value into a lambda require a call to |
| 121 | a copy constructor. |
| 122 | This call is not currently modeled. |
| 123 | </p> |
| 124 | </li> |
| 125 | |
| 126 | <li>Handle constructors for default arguments |
| 127 | <p>Default arguments in C++ are recomputed at every call, |
| 128 | and are therefore local, and not static, variables. |
| 129 | </p> |
| 130 | </li> |
| 131 | |
| 132 | <li>Enhance the modeling of the standard library. |
| 133 | <p>The analyzer needs a better understanding of STL in order to be more |
| 134 | useful on C++ codebases. |
| 135 | While full library modeling is not an easy task, |
| 136 | large gains can be achieved by supporting only a few cases: |
| 137 | e.g. calling <code>.length()</code> on an empty |
| 138 | <code>std::string</code> always yields zero. |
| 139 | <p><i>(Difficulty: Medium)</i></p><p> |
| 140 | </li> |
| 141 | |
| 142 | <li>Enhance CFG to model exception-handling. |
| 143 | <p>Currently exceptions are treated as "black holes", and exception-handling |
| 144 | control structures are poorly modeled in order to be conservative. |
| 145 | This could be improved for both C++ and Objective-C exceptions. |
| 146 | <p><i>(Difficulty: Hard)</i></p></p> |
| 147 | </li> |
| 148 | </ul> |
| 149 | </li> |
| 150 | |
| 151 | <li>Core Analyzer Infrastructure |
| 152 | <ul> |
| 153 | <li>Handle unions. |
| 154 | <p>Currently in the analyzer the value of a union is always regarded as |
| 155 | an unknown. |
| 156 | This problem was |
| 157 | previously <a href="http://lists.llvm.org/pipermail/cfe-dev/2017-March/052864.html">discussed</a> |
| 158 | on the mailing list, but no solution was implemented. |
| 159 | <p><i> (Difficulty: Medium) </i></p></p> |
| 160 | </li> |
| 161 | |
| 162 | <li>Floating-point support. |
| 163 | <p>Currently, the analyzer treats all floating-point values as unknown. |
| 164 | This project would involve adding a new <code>SVal</code> kind |
| 165 | for constant floats, generalizing the constraint manager to handle floats, |
| 166 | and auditing existing code to make sure it doesn't |
| 167 | make incorrect assumptions (most notably, that <code>X == X</code> |
| 168 | is always true, since it does not hold for <code>NaN</code>). |
| 169 | <p><i> (Difficulty: Medium)</i></p></p> |
| 170 | </li> |
| 171 | |
| 172 | <li>Improved loop execution modeling. |
| 173 | <p>The analyzer simply unrolls each loop <tt>N</tt> times before |
| 174 | dropping the path, for a fixed constant <tt>N</tt>. |
| 175 | However, that results in lost coverage in cases where the loop always |
| 176 | executes more than <tt>N</tt> times. |
| 177 | A Google Summer Of Code |
| 178 | <a href="https://summerofcode.withgoogle.com/archive/2017/projects/6071606019358720/">project</a> |
| 179 | was completed to make the loop bound parameterizable, |
| 180 | but the <a href="https://en.wikipedia.org/wiki/Widening_(computer_science)">widening</a> |
| 181 | problem still remains open. |
| 182 | |
| 183 | <p><i> (Difficulty: Hard)</i></p></p> |
| 184 | </li> |
| 185 | |
| 186 | <li>Basic function summarization support |
| 187 | <p>The analyzer performs inter-procedural analysis using |
| 188 | either inlining or "conservative evaluation" (invalidating all data |
| 189 | passed to the function). |
| 190 | Often, a very simple summary |
| 191 | (e.g. "this function is <a href="https://en.wikipedia.org/wiki/Pure_function">pure</a>") would be |
| 192 | enough to be a large improvement over conservative evaluation. |
| 193 | Such summaries could be obtained either syntactically, |
| 194 | or using a dataflow framework. |
| 195 | <p><i>(Difficulty: Hard)</i></p><p> |
| 196 | </li> |
| 197 | |
| 198 | <li>Implement a dataflow flamework. |
| 199 | <p>The analyzer core |
| 200 | implements a <a href="https://en.wikipedia.org/wiki/Symbolic_execution">symbolic execution</a> |
| 201 | engine, which performs checks |
| 202 | (use-after-free, uninitialized value read, etc.) |
| 203 | over a <em>single</em> program path. |
| 204 | However, many useful properties |
| 205 | (dead code, check-after-use, etc.) require |
| 206 | reasoning over <em>all</em> possible in a program. |
| 207 | Such reasoning requires a |
| 208 | <a href="https://en.wikipedia.org/wiki/Data-flow_analysis">dataflow analysis</a> framework. |
| 209 | Clang already implements |
| 210 | a few dataflow analyses (most notably, liveness), |
| 211 | but they implemented in an ad-hoc fashion. |
| 212 | A proper framework would enable us writing many more useful checkers. |
| 213 | <p><i> (Difficulty: Hard) </i></p></p> |
| 214 | </li> |
| 215 | |
| 216 | <li>Track type information through casts more precisely. |
| 217 | <p>The <code>DynamicTypePropagation</code> |
| 218 | checker is in charge of inferring a region's |
| 219 | dynamic type based on what operations the code is performing. |
| 220 | Casts are a rich source of type information that the analyzer currently ignores. |
| 221 | <p><i>(Difficulty: Medium)</i></p></p> |
| 222 | </li> |
| 223 | |
| 224 | </ul> |
| 225 | </li> |
| 226 | |
| 227 | <li>Fixing miscellaneous bugs |
| 228 | <p>Apart from the open projects listed above, |
| 229 | contributors are welcome to fix any of the outstanding |
| 230 | <a href="https://bugs.llvm.org/buglist.cgi?component=Static%20Analyzer&list_id=147756&product=clang&resolution=---">bugs</a> |
| 231 | in the Bugzilla. |
| 232 | <p><i>(Difficulty: Anything)</i></p></p> |
| 233 | </li> |
| 234 | |
| 235 | </ul> |
| 236 | |
| 237 | </div> |
| 238 | </div> |
| 239 | </body> |
| 240 | </html> |
| 241 | |
| 242 | |