# Checking Go Package API Compatibility

The `apidiff` tool in this directory determines whether two versions of the same
package are compatible. The goal is to help the developer make an informed
choice of semantic version after they have changed the code of their module.

`apidiff` reports two kinds of changes: incompatible ones, which require
incrementing the major part of the semantic version, and compatible ones, which
require a minor version increment. If no API changes are reported but there are
code changes that could affect client code, then the patch version should
be incremented.

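The versioning rule above can be written as a small decision function. This is only a sketch: `bump` and its three boolean parameters are hypothetical names chosen for illustration and are not part of `apidiff`. The first two parameters say whether incompatible or compatible API changes were reported, and the third covers code changes with no reported API difference.

```go
package main

import "fmt"

// bump suggests which semantic version component to increment.
// It is a hypothetical helper, not an apidiff function.
func bump(incompatible, compatible, otherCodeChanges bool) string {
	switch {
	case incompatible:
		return "major"
	case compatible:
		return "minor"
	case otherCodeChanges:
		return "patch"
	}
	return "none"
}

func main() {
	fmt.Println(bump(true, true, false))  // major
	fmt.Println(bump(false, true, true))  // minor
	fmt.Println(bump(false, false, true)) // patch
}
```
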
Because `apidiff` ignores package import paths, it may be used to display API
differences between any two packages, not just different versions of the same
package.

The current version of `apidiff` compares only packages, not modules.

## Compatibility Desiderata

Any tool that checks compatibility can offer only an approximation. No tool can
detect behavioral changes; and even if it could, whether a behavioral change is
a breaking change or not depends on many factors, such as whether it closes a
security hole or fixes a bug. Even a change that causes some code to fail to
compile may not be considered a breaking change by the developers or their
users. It may only affect code marked as experimental or unstable, for
example, or the break may only manifest in unlikely cases.

For a tool to be useful, its notion of compatibility must be relaxed enough to
allow reasonable changes, like adding a field to a struct, but strict enough to
catch significant breaking changes. A tool that is too lax will miss important
incompatibilities, and users will stop trusting it; one that is too strict may
generate so much noise that users will ignore it.

To a first approximation, this tool reports a change as incompatible if it could
cause client code to stop compiling. But `apidiff` ignores five ways in which
code may fail to compile after a change. Three of them are mentioned in the
[Go 1 Compatibility Guarantee](https://golang.org/doc/go1compat).

### Unkeyed Struct Literals

Code that uses an unkeyed struct literal would fail to compile if a field was
added to the struct, making any such addition an incompatible change. An example:

```
// old
type Point struct { X, Y int }

// new
type Point struct { X, Y, Z int }

// client
p := pkg.Point{1, 2} // fails in new because there are more fields than expressions
```
Here and below, we provide three snippets: the code in the old version of the
package, the code in the new version, and the code written in a client of the package,
which refers to it by the name `pkg`. The client code compiles against the old
code but not the new.

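One way to reproduce such a break without a full build is to type-check the three snippets with the standard `go/types` package. The sketch below is illustrative only and does not show how `apidiff` works. `mapImporter`, `check` and `clientCompiles` are hypothetical helpers, and each snippet is assumed to fit in a single file.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"go/types"
)

const oldSrc = `package pkg
type Point struct{ X, Y int }`

const newSrc = `package pkg
type Point struct{ X, Y, Z int }`

const clientSrc = `package client
import "pkg"
var p = pkg.Point{1, 2}`

// mapImporter resolves import paths to packages that were already type-checked.
type mapImporter map[string]*types.Package

func (m mapImporter) Import(path string) (*types.Package, error) {
	if p, ok := m[path]; ok {
		return p, nil
	}
	return nil, fmt.Errorf("package %q not found", path)
}

// check parses and type-checks a single-file package.
func check(path, src string, imp types.Importer) (*types.Package, error) {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, path+".go", src, 0)
	if err != nil {
		return nil, err
	}
	conf := types.Config{Importer: imp}
	return conf.Check(path, fset, []*ast.File{f}, nil)
}

// clientCompiles reports whether client type-checks against the given
// version of pkg.
func clientCompiles(pkgSrc, client string) bool {
	pkg, err := check("pkg", pkgSrc, nil)
	if err != nil {
		return false
	}
	_, err = check("client", client, mapImporter{"pkg": pkg})
	return err == nil
}

func main() {
	fmt.Println(clientCompiles(oldSrc, clientSrc)) // true
	fmt.Println(clientCompiles(newSrc, clientSrc)) // false
}
```

The same harness can be pointed at the other old/new/client triples in this document by swapping the three source strings.
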
### Embedding and Shadowing

Adding an exported field to a struct can break code that embeds that struct,
because the newly added field may conflict with an identically named field
at the same struct depth. A selector referring to the latter would become
ambiguous and thus erroneous.

```
// old
type Point struct { X, Y int }

// new
type Point struct { X, Y, Z int }

// client
type z struct { Z int }

var v struct {
	pkg.Point
	z
}

_ = v.Z // fails in new
```
In the new version, the last line fails to compile because there are two embedded `Z`
fields at the same depth, one from `z` and one from `pkg.Point`.

### Using an Identical Type Externally

If it is possible for client code to write a type expression representing the
underlying type of a defined type in a package, then external code can use it in
assignments involving the package type, making any change to that type incompatible.
```
// old
type Point struct { X, Y int }

// new
type Point struct { X, Y, Z int }

// client
var p struct { X, Y int } = pkg.Point{} // fails in new because of Point's extra field
```
Here, the external code could have used the provided name `Point`, but chose not
to. We return to this and related examples in "Compatibility, Types and Names."

### unsafe.Sizeof and Friends

Since `unsafe.Sizeof`, `unsafe.Offsetof` and `unsafe.Alignof` are constant
expressions, they can be used in an array type literal:

```
// old
type S struct{ X int }

// new
type S struct{ X, y int }

// client
var a [unsafe.Sizeof(pkg.S{})]int = [8]int{} // fails in new because S's size is not 8
```
Use of these operations could make many changes to a type potentially incompatible.

### Type Switches

A package change that merges two different types (with the same underlying type)
into a single new type may break type switches in clients that refer to both
original types:

```
// old
type T1 int
type T2 int

// new
type T1 int
type T2 = T1

// client
switch x.(type) {
case pkg.T1:
case pkg.T2:
} // fails with new because two cases have the same type
```
This sort of incompatibility is sufficiently esoteric to ignore; the tool allows
merging types.

## First Attempt at a Definition

Our first attempt at defining compatibility captures the idea that all the
exported names in the old package must have compatible equivalents in the new
package.

A new package is compatible with an old one if and only if:
- For every exported package-level name in the old package, the same name is
  declared in the new at package level, and
- the names denote the same kind of object (e.g. both are variables), and
- the types of the objects are compatible.

We will work out the details (and make some corrections) below, but it is clear
already that we will need to determine what makes two types compatible. And
whatever the definition of type compatibility, it's certainly true that if two
types are the same, they are compatible. So we will need to decide what makes an
old and new type the same. We will call this sameness relation _correspondence_.

## Type Correspondence

Go already has a definition of when two types are the same:
[type identity](https://golang.org/ref/spec#Type_identity).
But identity isn't adequate for our purpose: it says that two defined
types are identical if they arise from the same definition, but it's unclear
what "same" means when talking about two different packages (or two versions of
a single package).

The obvious change to the definition of identity is to require that old and new
[defined types](https://golang.org/ref/spec#Type_definitions)
have the same name instead. But that doesn't work either, for two
reasons. First, type aliases can equate two defined types with different names:

```
// old
type E int

// new
type t int
type E = t
```
Second, an unexported type can be renamed:

```
// old
type u1 int
var V u1

// new
type u2 int
var V u2
```
Here, even though `u1` and `u2` are unexported, their exported fields and
methods are visible to clients, so they are part of the API. But since the name
`u1` is not visible to clients, it can be changed compatibly. We say that `u1`
and `u2` are _exposed_: a type is exposed if a client package can declare variables of that type.

We will say that an old defined type _corresponds_ to a new one if they have the
same name, or one can be renamed to the other without otherwise changing the
API. In the first example above, old `E` and new `t` correspond. In the second,
old `u1` and new `u2` correspond.

Two or more old defined types can correspond to a single new type: we consider
"merging" two types into one to be a compatible change. As mentioned above,
code that uses both names in a type switch will fail, but we deliberately ignore
this case. However, a single old type can correspond to only one new type.

So far, we've explained what correspondence means for defined types. To extend
the definition to all types, we parallel the language's definition of type
identity. So, for instance, an old and a new slice type correspond if their
element types correspond.

## Definition of Compatibility

We can now present the definition of compatibility used by `apidiff`.

### Package Compatibility

> A new package is compatible with an old one if:
>1. Each exported name in the old package's scope also appears in the new
>package's scope, and the object (constant, variable, function or type) denoted
>by that name in the old package is compatible with the object denoted by the
>name in the new package, and
>2. For every exposed type that implements an exposed interface in the old package,
> its corresponding type should implement the corresponding interface in the new package.
>
>Otherwise the packages are incompatible.

As an aside, the tool also finds exported names in the new package that are not
exported in the old, and marks them as compatible changes.

Clause 2 is discussed further in "Whole-Package Compatibility."

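The name-matching half of clause 1, together with the aside about newly exported names, can be approximated by walking package scopes with `go/types`. This is a minimal sketch and not the tool's implementation: `mustCheck` and `missing` are hypothetical helpers, the two source strings are invented test data, and the sketch compares names only, without checking object compatibility.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"go/types"
)

const oldSrc = `package p
var A int
func B() {}
var c int`

const newSrc = `package p
var A int
type D int`

// mustCheck type-checks a single-file package and panics on any error.
func mustCheck(src string) *types.Package {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, "p.go", src, 0)
	if err != nil {
		panic(err)
	}
	pkg, err := (&types.Config{}).Check("p", fset, []*ast.File{f}, nil)
	if err != nil {
		panic(err)
	}
	return pkg
}

// missing returns the exported package-level names that a declares
// and b does not. Scope.Names returns names in sorted order.
func missing(a, b *types.Package) []string {
	var names []string
	for _, name := range a.Scope().Names() {
		if ast.IsExported(name) && b.Scope().Lookup(name) == nil {
			names = append(names, name)
		}
	}
	return names
}

func main() {
	oldPkg, newPkg := mustCheck(oldSrc), mustCheck(newSrc)
	fmt.Println("removed (incompatible):", missing(oldPkg, newPkg)) // [B]
	fmt.Println("added (compatible):", missing(newPkg, oldPkg))     // [D]
}
```

The unexported variable `c` is ignored in both directions, because only exported names take part in clause 1.
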
### Object Compatibility

This section provides compatibility rules for constants, variables, functions
and types.

#### Constants

>A new exported constant is compatible with an old one of the same name if and only if
>1. Their types correspond, and
>2. Their values are identical.

It is tempting to allow changing a typed constant to an untyped one. That may
seem harmless, but it can break code like this:

```
// old
const C int64 = 1

// new
const C = 1

// client
var x = pkg.C   // old type is int64, new is int
var y int64 = x // fails with new: different types in assignment
```

A change to the value of a constant can break compatibility if the value is used
in an array type:

```
// old
const C = 1

// new
const C = 2

// client
var a [pkg.C]int = [1]int{} // fails with new because [2]int and [1]int are different types
```
Changes to constant values are rare, and determining whether they are compatible
or not is better left to the user, so the tool reports them.

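The two constant rules can be sketched with `go/types` and `go/constant`. The code below is an assumption-laden illustration, not the tool's code: `mustCheck` and `constCompatible` are hypothetical helpers, and `types.Identical` stands in for correspondence, which is adequate here only because the example types are predeclared.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/constant"
	"go/parser"
	"go/token"
	"go/types"
)

const (
	typedOne   = "package p\nconst C int64 = 1"
	untypedOne = "package p\nconst C = 1"
	untypedTwo = "package p\nconst C = 2"
)

// mustCheck type-checks a single-file package and panics on any error.
func mustCheck(src string) *types.Package {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, "p.go", src, 0)
	if err != nil {
		panic(err)
	}
	pkg, err := (&types.Config{}).Check("p", fset, []*ast.File{f}, nil)
	if err != nil {
		panic(err)
	}
	return pkg
}

// constCompatible applies both rules to the constant called name:
// the types must match and the values must be equal.
func constCompatible(oldPkg, newPkg *types.Package, name string) bool {
	o, ok1 := oldPkg.Scope().Lookup(name).(*types.Const)
	n, ok2 := newPkg.Scope().Lookup(name).(*types.Const)
	if !ok1 || !ok2 {
		return false
	}
	return types.Identical(o.Type(), n.Type()) &&
		constant.Compare(o.Val(), token.EQL, n.Val())
}

func main() {
	typed, one, two := mustCheck(typedOne), mustCheck(untypedOne), mustCheck(untypedTwo)
	fmt.Println(constCompatible(typed, one, "C")) // false: int64 became untyped
	fmt.Println(constCompatible(one, two, "C"))   // false: value changed
	fmt.Println(constCompatible(one, one, "C"))   // true
}
```
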
#### Variables

>A new exported variable is compatible with an old one of the same name if and
>only if their types correspond.

Correspondence doesn't look past names, so this rule does not prevent adding a
field to `MyStruct` if the package declares `var V MyStruct`. It does, however, mean that

```
var V struct { X int }
```
is incompatible with
```
var V struct { X, Y int }
```
We discuss this at length below in the section "Compatibility, Types and Names."

#### Functions

>A new exported function or variable is compatible with an old function of the
>same name if and only if their types (signatures) correspond.

This rule captures the fact that, although many signature changes are compatible
for all call sites, none are compatible for assignment:

```
var v func(int) = pkg.F
```
Here, `F` must be of type `func(int)` and not, for instance, `func(...int)` or `func(interface{})`.

Note that the rule permits changing a function to a variable. This is a common
practice, usually done for test stubbing, and cannot break any code at compile
time.

#### Exported Types

> A new exported type is compatible with an old one if and only if their
> names are the same and their types correspond.

This rule seems far too strict. But, ignoring aliases for the moment, it demands only
that the old and new _defined_ types correspond. Consider:
```
// old
type T struct { X int }

// new
type T struct { X, Y int }
```
The addition of `Y` is a compatible change, because this rule does not require
that the struct literals have to correspond, only that the defined types
denoted by `T` must correspond. (Remember that correspondence stops at type
names.)

If one type is an alias that refers to the corresponding defined type, the
situation is the same:

```
// old
type T struct { X int }

// new
type u struct { X, Y int }
type T = u
```
Here, the only requirement is that old `T` corresponds to new `u`, not that the
struct types correspond. (We can't tell from this snippet that the old `T` and
the new `u` do correspond; that depends on whether `u` replaces `T` throughout
the API.)

However, the following change is incompatible, because the names do not
denote corresponding types:

```
// old
type T = struct { X int }

// new
type T = struct { X, Y int }
```

### Type Literal Compatibility

Only five kinds of types can differ compatibly: defined types, structs,
interfaces, channels and numeric types. We only consider the compatibility of
the last four when they are the underlying type of a defined type. See
"Compatibility, Types and Names" for a rationale.

We justify the compatibility rules by enumerating all the ways a type
can be used, and by showing that the allowed changes cannot break any code that
uses values of the type in those ways.

Values of all types can be used in assignments (including argument passing and
function return), but we do not require that old and new types are assignment
compatible. That is because we assume that the old and new packages are never
used together: any given binary will link in either the old package or the new.
So in describing how a type can be used in the sections below, we omit
assignment.

Any type can also be used in a type assertion or conversion. The changes we allow
below may affect the run-time behavior of these operations, but they cannot affect
whether they compile. The only such breaking change would be to change
the type `T` in an assertion `x.(T)` so that it no longer implements the interface
type of `x`; but the rules for interfaces below disallow that.

> A new type is compatible with an old one if and only if they correspond, or
> one of the cases below applies.

#### Defined Types

Other than assignment, the only ways to use a defined type are to access its
methods, or to make use of the properties of its underlying type. Rule 2 below
covers the latter, and rules 3 and 4 cover the former.

> A new defined type is compatible with an old one if and only if all of the
> following hold:
>1. They correspond.
>2. Their underlying types are compatible.
>3. The new exported value method set is a superset of the old.
>4. The new exported pointer method set is a superset of the old.

An exported method set is a method set with all unexported methods removed.
When comparing methods of a method set, we require identical names and
corresponding signatures.

Removing an exported method is clearly a breaking change. But removing an
unexported one (or changing its signature) can be breaking as well, if it
results in the type no longer implementing an interface. See "Whole-Package
Compatibility," below.

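Rules 3 and 4 can be illustrated with `types.NewMethodSet`, which computes the value method set of `T` and, given `types.NewPointer(T)`, the pointer method set. The sketch below makes simplifying assumptions: it compares method names only and ignores signatures, and `mustCheck`, `exportedMethods`, `superset` and `methodsPreserved` are hypothetical helpers rather than `apidiff` functions.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"go/types"
)

const (
	oldSrc   = "package p\ntype T int\nfunc (T) A() {}\nfunc (*T) B() {}"
	addedSrc = "package p\ntype T int\nfunc (T) A() {}\nfunc (*T) B() {}\nfunc (T) C() {}"
	movedSrc = "package p\ntype T int\nfunc (*T) A() {}\nfunc (*T) B() {}"
)

// mustCheck type-checks a single-file package and panics on any error.
func mustCheck(src string) *types.Package {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, "p.go", src, 0)
	if err != nil {
		panic(err)
	}
	pkg, err := (&types.Config{}).Check("p", fset, []*ast.File{f}, nil)
	if err != nil {
		panic(err)
	}
	return pkg
}

// exportedMethods returns the names of the exported methods in t's method set.
func exportedMethods(t types.Type) map[string]bool {
	ms := types.NewMethodSet(t)
	names := make(map[string]bool)
	for i := 0; i < ms.Len(); i++ {
		if m := ms.At(i).Obj(); m.Exported() {
			names[m.Name()] = true
		}
	}
	return names
}

// superset reports whether every name in b is also in a.
func superset(a, b map[string]bool) bool {
	for name := range b {
		if !a[name] {
			return false
		}
	}
	return true
}

// methodsPreserved checks rules 3 and 4 (by name only) for the type called name.
func methodsPreserved(oldPkg, newPkg *types.Package, name string) bool {
	o, n := oldPkg.Scope().Lookup(name), newPkg.Scope().Lookup(name)
	if o == nil || n == nil {
		return false
	}
	ot, nt := o.Type(), n.Type()
	return superset(exportedMethods(nt), exportedMethods(ot)) &&
		superset(exportedMethods(types.NewPointer(nt)), exportedMethods(types.NewPointer(ot)))
}

func main() {
	oldPkg := mustCheck(oldSrc)
	fmt.Println(methodsPreserved(oldPkg, mustCheck(addedSrc), "T")) // true
	fmt.Println(methodsPreserved(oldPkg, mustCheck(movedSrc), "T")) // false
}
```

The second case moves `A` from a value receiver to a pointer receiver. The pointer method set is unchanged, but the value method set loses `A`, which is why the two sets are checked separately.
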
#### Channels

> A new channel type is compatible with an old one if
> 1. The element types correspond, and
> 2. Either the directions are the same, or the new type has no direction.

Other than assignment, the only ways to use values of a channel type are to send
and receive on them, to close them, and to use them as map keys. Changes to a
channel type cannot cause code that closes a channel or uses it as a map key to
fail to compile, so we need not consider those operations.

Rule 1 ensures that any operations on the values sent or received will compile.
Rule 2 captures the fact that any program that compiles with a directed channel
must use either only sends, or only receives, so allowing the other operation
by removing the channel direction cannot break any code.

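The channel rules translate almost directly into a predicate over `*types.Chan` values. This is a sketch under assumptions: `chanCompatible` is a hypothetical name, and `types.Identical` is used as a stand-in for element-type correspondence, which only coincides with it for simple cases such as the `int` element used here.

```go
package main

import (
	"fmt"
	"go/types"
)

// Example channel types over int, built directly with go/types.
var (
	elem     = types.Typ[types.Int]
	sendOnly = types.NewChan(types.SendOnly, elem)
	recvOnly = types.NewChan(types.RecvOnly, elem)
	both     = types.NewChan(types.SendRecv, elem)
)

// chanCompatible applies rule 1 (element types) and rule 2 (direction is
// unchanged, or the new channel is bidirectional).
func chanCompatible(oldCh, newCh *types.Chan) bool {
	return types.Identical(oldCh.Elem(), newCh.Elem()) &&
		(oldCh.Dir() == newCh.Dir() || newCh.Dir() == types.SendRecv)
}

func main() {
	fmt.Println(chanCompatible(sendOnly, both))     // true: direction removed
	fmt.Println(chanCompatible(both, sendOnly))     // false: direction added
	fmt.Println(chanCompatible(sendOnly, recvOnly)) // false: direction flipped
}
```
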
#### Interfaces

> A new interface is compatible with an old one if and only if:
> 1. The old interface does not have an unexported method, and it corresponds
> to the new interface (i.e. they have the same method set), or
> 2. The old interface has an unexported method and the new exported method set is a
> superset of the old.

Other than assignment, the only ways to use an interface are to implement it,
embed it, or call one of its methods. (Interface values can also be used as map
keys, but that cannot cause a compile-time error.)

Certainly, removing an exported method from an interface could break a client
call, so neither rule allows it.

Rule 1 also disallows adding a method to an interface without an existing unexported
method. Such an interface can be implemented in client code. If adding a method
were allowed, a type that implements the old interface could fail to implement
the new one:

```
type I interface { M1() }         // old
type I interface { M1(); M2() }   // new

// client
type t struct{}
func (t) M1() {}
var i pkg.I = t{} // fails with new, because t lacks M2
```

Rule 2 is based on the observation that if an interface has an unexported
method, the only way a client can implement it is to embed it.
Adding a method is compatible in this case, because the embedding struct will
continue to implement the interface. Adding a method also cannot break any call
sites, since no program that compiles could have any such call sites.

#### Structs

> A new struct is compatible with an old one if all of the following hold:
> 1. The new set of top-level exported fields is a superset of the old.
> 2. The new set of _selectable_ exported fields is a superset of the old.
> 3. If the old struct is comparable, so is the new one.

The set of selectable exported fields is the set of exported fields `F`
such that `x.F` is a valid selector expression for a value `x` of the struct
type. `F` may be at the top level of the struct, or it may be a field of an
embedded struct.

Two fields are the same if they have the same name and corresponding types.

Other than assignment, there are only four ways to use a struct: write a struct
literal, select a field, use a value of the struct as a map key, or compare two
values for equality. The first clause ensures that struct literals compile; the
second, that selections compile; and the third, that equality expressions and
map index expressions compile.

#### Numeric Types

> A new numeric type is compatible with an old one if and only if they are
> both unsigned integers, both signed integers, both floats or both complex
> types, and the new one is at least as large as the old on both 32-bit and
> 64-bit architectures.

Other than in assignments, numeric types appear in arithmetic and comparison
expressions. Since all arithmetic operations but shifts (see below) require that
operand types be identical, and by assumption the old and new types underlie
defined types (see "Compatibility, Types and Names," below), there is no way for
client code to write an arithmetic expression that compiles with operands of the
old type but not the new.

Numeric types can also appear in type switches and type assertions. Again, since
the old and new types underlie defined types, type switches and type assertions
that compiled using the old defined type will continue to compile with the new
defined type.

Going from an unsigned to a signed integer type is an incompatible change for
the sole reason that only an unsigned type can appear as the right operand of a
shift. If this rule is relaxed, then changes from an unsigned type to a larger
signed type would be compatible. See [this
issue](https://github.com/golang/go/issues/19113).

Only integer types can be used in bitwise and shift operations, and for indexing
slices and arrays. That is why switching from an integer to a floating-point
type--even one that can represent all values of the integer type--is an
incompatible change.

Conversions from floating-point to complex types or vice versa are not permitted
(the predeclared functions real, imag, and complex must be used instead). To
prevent valid floating-point or complex conversions from becoming invalid,
changing a floating-point type to a complex type or vice versa is considered an
incompatible change.

Although conversions between any two integer types are valid, assigning a
constant value to a variable of integer type that is too small to represent the
constant is not permitted. That is why the only compatible changes are to
a new type whose values are a superset of the old. The requirement that the new
set of values must include the old on both 32-bit and 64-bit machines allows
changes from `int32` to `int` and from `int` to `int64`, but not in the other
direction; and similarly for `uint`.

Changing a type to or from `uintptr` is considered an incompatible change. Since
its size is not specified, there is no way to know whether the new type's values
are a superset of the old type's.

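The numeric rule can be made concrete with a small table of type classes and sizes. The following is a sketch, not the tool's code: `class`, `numeric`, `numerics` and `numericCompatible` are hypothetical names, types are identified by their predeclared names, and `uintptr` is deliberately left out of the table so that any change involving it is rejected.

```go
package main

import "fmt"

type class int

const (
	signed class = iota
	unsigned
	float
	complexNum
)

// numeric records a type's class and its size in bits on 32-bit and
// 64-bit architectures.
type numeric struct {
	class          class
	bits32, bits64 int
}

var numerics = map[string]numeric{
	"int8": {signed, 8, 8}, "int16": {signed, 16, 16},
	"int32": {signed, 32, 32}, "int64": {signed, 64, 64},
	"int":   {signed, 32, 64},
	"uint8": {unsigned, 8, 8}, "uint16": {unsigned, 16, 16},
	"uint32": {unsigned, 32, 32}, "uint64": {unsigned, 64, 64},
	"uint":    {unsigned, 32, 64},
	"float32": {float, 32, 32}, "float64": {float, 64, 64},
	"complex64": {complexNum, 64, 64}, "complex128": {complexNum, 128, 128},
}

// numericCompatible reports whether changing the underlying type from
// oldName to newName satisfies the rule. Types missing from the table,
// such as uintptr, are always rejected.
func numericCompatible(oldName, newName string) bool {
	o, ok1 := numerics[oldName]
	n, ok2 := numerics[newName]
	if !ok1 || !ok2 {
		return false
	}
	return o.class == n.class && n.bits32 >= o.bits32 && n.bits64 >= o.bits64
}

func main() {
	fmt.Println(numericCompatible("int32", "int"))   // true
	fmt.Println(numericCompatible("int", "int64"))   // true
	fmt.Println(numericCompatible("int64", "int"))   // false: smaller on 32-bit
	fmt.Println(numericCompatible("uint8", "int16")) // false: unsigned to signed
}
```

Storing both sizes for `int` and `uint` is what lets one comparison cover both architectures.
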
## Whole-Package Compatibility

Some changes that are compatible for a single type are not compatible when the
package is considered as a whole. For example, if you remove an unexported
method on a defined type, it may no longer implement an interface of the
package. This can break client code:

```
// old
type T int
func (T) m() {}
type I interface { m() }

// new
type T int // no method m anymore

// client
var i pkg.I = pkg.T(0) // fails with new because T lacks m
```

Similarly, adding a method to an interface can cause defined types
in the package to stop implementing it.

The second clause in the definition for package compatibility handles these
cases. To repeat:
> 2. For every exposed type that implements an exposed interface in the old package,
> its corresponding type should implement the corresponding interface in the new package.

Recall that a type is exposed if it is part of the package's API, even if it is
unexported.

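A check in the spirit of clause 2 can be sketched with `types.Implements`, run once against the old package and once against the new. The helpers `mustCheck` and `implements` are hypothetical, the lookup is by name rather than by correspondence, and the sketch handles a single type and interface pair rather than every exposed pair.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"go/types"
)

const oldSrc = `package p
type T int
func (T) m() {}
type I interface{ m() }`

const newSrc = `package p
type T int
type I interface{ m() }`

// mustCheck type-checks a single-file package and panics on any error.
func mustCheck(src string) *types.Package {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, "p.go", src, 0)
	if err != nil {
		panic(err)
	}
	pkg, err := (&types.Config{}).Check("p", fset, []*ast.File{f}, nil)
	if err != nil {
		panic(err)
	}
	return pkg
}

// implements reports whether the type called typeName implements the
// interface called ifaceName within pkg.
func implements(pkg *types.Package, typeName, ifaceName string) bool {
	t := pkg.Scope().Lookup(typeName)
	i := pkg.Scope().Lookup(ifaceName)
	if t == nil || i == nil {
		return false
	}
	iface, ok := i.Type().Underlying().(*types.Interface)
	if !ok {
		return false
	}
	return types.Implements(t.Type(), iface)
}

func main() {
	wasImplemented := implements(mustCheck(oldSrc), "T", "I")
	stillImplemented := implements(mustCheck(newSrc), "T", "I")
	// An incompatibility exists when the old pair held and the new one does not.
	fmt.Println(wasImplemented && !stillImplemented) // true
}
```
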
Other incompatibilities that involve more than one type in the package can arise
whenever two types with identical underlying types exist in the old or new
package. Here, a change "splits" an identical underlying type into two, breaking
conversions:

```
// old
type B struct { X int }
type C struct { X int }

// new
type B struct { X int }
type C struct { X, Y int }

// client
var b pkg.B
_ = pkg.C(b) // fails with new: cannot convert B to C
```
Finally, changes that are compatible for the package in which they occur can
break downstream packages. That can happen even if they involve unexported
methods, thanks to embedding.

The definitions given here don't account for these sorts of problems.

## Compatibility, Types and Names

The above definitions state that the only types that can differ compatibly are
defined types and the types that underlie them. Changes to other type literals
are considered incompatible. For instance, it is considered an incompatible
change to add a field to the struct in this variable declaration:

```
var V struct { X int }
```
or this alias definition:
```
type T = struct { X int }
```

We make this choice to keep the definition of compatibility (relatively) simple.
A more precise definition could, for instance, distinguish between

```
func F(struct { X int })
```
where any changes to the struct are incompatible, and

```
func F(struct { X, u int })
```
where adding a field is compatible (since clients cannot write the signature,
and thus cannot assign `F` to a variable of the signature type). The definition
should then also allow other function signature changes that only require
call-site compatibility, like

```
func F(struct { X, u int }, ...int)
```
The result would be a much more complex definition with little benefit, since
the examples in this section rarely arise in practice.