[Ksummit-discuss] [MAINTAINER TOPIC] ABI feature gates?

From: Andy Lutomirski @ 2017-08-04  1:16 UTC
  To: ksummit-discuss

[Note: I'm not entirely sure I can make it to the kernel summit this
year, due to having a tiny person and tons of travel]

This may be highly controversial, but: there seems to be a weakness in
the kernel development model in the way that new ABI features become
stable.  The current model is, roughly:

1. Someone writes the code.  Maybe they cc linux-api, maybe they don't.
2. People hopefully review the code.
3. A subsystem maintainer merges the code.  They hope the ABI is right.
4. Linus gets a pull request.  Linus probably doesn't review the ABI
for sanity, style, blatant bugs, etc.  If Linus did, then he'd never
get anything else done.
5. The new ABI lands in -rc1.
6. If someone finds a problem or objects, it had better get fixed
before the next real release.

There are a few problems here.  One is that the people who would really
review the ABI might not even notice until step 5 or 6 or so.  Another
is that it takes some time for userspace to get experience with a new
ABI.

I'm wondering if there are other models that could work.  I think it
would be nice for us to be able to land a feature in Linus's tree and
still wait a while before stabilizing it.  Rust, for example, has a
strict policy for this that seems to work quite well.

Maybe we could pull something off where big new features hide behind a
named feature gate for a while.  That feature gate can only be enabled
under some circumstances that make it very hard to mistake it for true
stability.  (For example, maybe you *can't* enable feature gates on a
final kernel unless you manually patch something.)
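
To make the idea concrete, here is a rough userspace sketch of what such
a gate check might look like.  Everything in it is hypothetical: the
struct, the function names, and the rule that "final" means "no -rc in
the release string" are assumptions for illustration, not an existing
kernel interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical named gate for a not-yet-stable ABI feature. */
struct abi_gate {
	const char *name;	/* e.g. "example_gate" */
	bool requested;		/* someone explicitly asked for it */
};

/*
 * Assumption: a release is "final" unless its version string carries
 * an -rc tag.  A real policy could be stricter than this.
 */
static bool release_is_final(const char *release)
{
	assert(release != NULL);
	return strstr(release, "-rc") == NULL;
}

/*
 * A gated feature is usable only when it was requested AND the kernel
 * is not a final release, so it is hard to mistake for stable ABI.
 */
static bool abi_gate_enabled(const struct abi_gate *gate, const char *release)
{
	assert(gate != NULL);
	if (release_is_final(release))
		return false;
	return gate->requested;
}
```

With this sketch, a gate requested on "4.13-rc4" is enabled, while the
same request on "4.13" is refused no matter what the user asked for.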

Here are a few examples that come to mind for where this would have helped:

 - Whatever that new RDMA socket type was that was deemed totally
broken but only just after it hit a real release.
 - O_TMPFILE.  I discovered that it corrupted filesystems in -rc6 or
-rc7.  That got fixed, but the API is still a steaming pile of crap.
 - Some cgroup+bpf stuff that got cleaned up in an -rc7 or so a few
releases ago.

I'm sure there are tons more.

Is this too crazy, or is it worth discussing?
