* bug-introducing patches
@ 2018-05-01 16:38 Sasha Levin
From: Sasha Levin @ 2018-05-01 16:38 UTC
  To: ksummit-discuss; +Cc: Greg KH, w, julia.lawall, linux-kernel

Working on AUTOSEL, it became even more obvious to me how difficult it is for a
patch to get a proper review. Maintainers found it difficult to keep up with
the upstream work for their subsystem, and reviewing additional -stable patches
put even more load on them, which some suggested would be more than they can
handle.

While AUTOSEL tries to understand if a patch fixes a bug, by then it is a bit
late: the bug was already introduced, folks already have to deal with it, and
the kernel is broken. I was wondering if I could do a similar process to
AUTOSEL, but teach the AI about bug-introducing patches.

When someone fixes a bug, he would describe the patch differently than he would
if he was writing a new feature. This lets AUTOSEL build on different commit
message constructs, among various inputs, to recognize bug fixes. However,
people are unaware that they are introducing a bug, so the commit message for
bug-introducing patches is essentially the same as for commits that don't
introduce a bug. This meant that I had to draw on data from other sources.

A few of the parameters I ended up using are:
 - -next data (days spent in -next, changes in the patch between -next trees)
 - Mailing list data (was this patch ever sent to a ML? How long before it was
   merged? How many replies did it get? ...)
 - Author/committer/maintainer chain data. Just like in sports, some folks are
   more likely to produce better results than others. This goes beyond just
   "skill", and also looks at things such as whether the author is patching a
   subsystem he's "familiar with" (== the subsystem where most of his patches
   usually go), or modifying a subsystem he has never sent a patch for.
 - Patch complexity metrics - various code metrics to indicate how "complex" a
   patch is. Think 100 lines of whitespace fixes vs 100 lines that
   significantly change a subsystem.
 - Kernel process correctness - I tried using "violations" of the kernel
   process (patch formatting, correctness of the mailing to lkml, etc) as an
   indicator of how familiar the author is with the kernel, with the
   presumption that folks who are newer to kernel development are more likely
   to introduce bugs.
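To make the "familiar with" input above concrete, here is one way such a
score could be computed. This is only a sketch under my own assumptions: the
function names, the idea of representing an author's history as a plain list
of subsystem names, and the choice of a simple fraction are all hypothetical,
not a description of what the actual tooling does.

```python
from collections import Counter


def home_subsystem(past_subsystems):
    """Return the subsystem that received most of an author's past patches.

    `past_subsystems` is a hypothetical list with one subsystem name per
    previously merged patch. Returns None for an author with no history.
    """
    if not past_subsystems:
        return None
    return Counter(past_subsystems).most_common(1)[0][0]


def familiarity(past_subsystems, target):
    """Fraction of the author's past patches that went to `target`.

    0.0 means the author never sent a patch for that subsystem.
    """
    if not past_subsystems:
        return 0.0
    return past_subsystems.count(target) / len(past_subsystems)


# Made-up history: three networking patches and one mm patch.
history = ["net", "net", "mm", "net"]
net_score = familiarity(history, "net")
fs_score = familiarity(history, "fs")
```

A fraction keeps the score comparable between prolific and occasional
contributors; a raw patch count would be another reasonable choice.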

Running an initial iteration on a set of commits made two things very obvious
to me:

1. -rc releases suck. Seriously suck. The quality of commits that went in during
-rc cycles was much worse than that of merge window commits:
 - All commits had the same chance of introducing a bug whether they came in a
   merge window or an -rc cycle. This means that -rc commits mostly end up
   replacing obvious bugs with less obvious ones.
 - While the average merge window commit changes 3x more lines than an -rc
   commit, the chance of a bug being introduced per patch is the same, which
   means that the bugs-per-line metric is much higher for -rc patches.
 - A merge window commit spent 50% more days, on average, in -next than an -rc
   commit.
 - The number of -rc commits that never saw any mailing list or have never
   been replied to on a mailing list was **way** higher than for merge window
   commits.
 - For some reason, the odds of an -rc commit being targeted for -stable are
   over 20%, while for merge window commits it's about 3%. I can't quite
   explain why that happens, but this would suggest that -rc commits end up
   hurting -stable pretty badly.
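The bugs-per-line point follows from simple arithmetic: if the per-patch
chance of a bug is the same and an -rc patch is a third of the size, then the
bug rate per changed line is about three times higher. A small sketch, where
the probability and the line counts are made-up placeholders (only the 3x
ratio comes from the data above):

```python
def bugs_per_line(bug_probability, avg_lines_changed):
    """Expected number of introduced bugs per changed line."""
    return bug_probability / avg_lines_changed


# Hypothetical numbers, chosen only to illustrate the ratio.
p = 0.05                     # same chance of a bug per patch in both cases
rc_lines = 20.0              # average lines changed by an -rc commit
merge_lines = 3 * rc_lines   # merge window commits change 3x more lines

ratio = bugs_per_line(p, rc_lines) / bugs_per_line(p, merge_lines)
print(f"-rc bugs-per-line is about {ratio:.1f}x the merge window rate")
```

The absolute values cancel out, so the result depends only on the ratio of
average patch sizes.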

2. Maintainers need to stop writing patches, committing them, and pushing them
in without reviews. In -rc cycles there is quite a large number of commits
that were written by a maintainer, committed, and merged upstream all on the
same day. These patches are very likely to introduce a new bug.


I don't really have a proposal beyond "tighten up -rc cycles", but I think it's
a discussion worth having. We have enough data to show what parts of kernel
development work, and what parts are just hurting us.

I'd be happy to gather more data if someone has an idea he wants to look
into. The data used for this work is based on:

 - v4.4..v4.16 (just because it's as far as linux-next-history goes).
 - "bugs" are commits that were mentioned in a Fixes: tag of a later
   commit.
 - "stable commits" are commits that made it to a -stable tree.

