* XFS LTS backport cabal
@ 2022-05-25 21:23 Darrick J. Wong
  2022-05-26  3:43 ` Amir Goldstein
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Darrick J. Wong @ 2022-05-25 21:23 UTC (permalink / raw)
  To: xfs
  Cc: Leah Rumancik, Theodore Ts'o, Shirley Ma, Amir Goldstein,
	Eric Sandeen, Dave Chinner, Chandan Babu R, Konrad Wilk

Hi everyone,

As most of the people cc'd on this message are aware, there's been a
recent groundswell of interest in increasing the amount of staff time
assigned to backporting bug fixes to LTS kernels.  As part of preparing
to resume maintainership on June 5th, I thought it would be a good idea
to introduce all the participants (that I know of!) and to capture a
summary of everyone's thoughts w.r.t. how to make stable backports
happen.

First, some introductions: Leah and Ted work at Google, and they've
expressed interest in 5.15 LTS backports.  Dave and Eric are the other
upstream XFS maintainers, so I've included them.  Amir seems to be
working on 5.10 LTS backports for his employer(?).  Chandan and I work
at Oracle (obviously) and he's also been working on a few 5.15 LTS
backports.

I won't speak for other organizations, but we (Oracle) are also
interested in stable backports for the 5.4 and 4.14 LTS kernels, since
we have customers running <cough> derivatives of those kernels.  Given
what I've heard from others, many kernel distributors lean on the LTS
kernels.

The goal of this thread, then, is to shed some light on who's currently
doing what to reduce duplication of LTS work, and to make sure that
we're all more or less on the same page with regards to what we will and
won't try to push to stable.  (A side goal of mine is to help everyone
working on the stable branches to avoid the wrath and unhelpful form
letters of the stable maintainers.)

Briefly, I think the patches that flow into XFS could be put into three
rough categories:

(a) Straightforward fixes.  These are usually pretty simple fixes (e.g.
omitted errno checking, insufficient validation, etc.) that sometimes get
proper Fixes tags, which means that AUTOSEL can be of some benefit.

(b) Probable fixes.  Often these aren't all that obvious -- for example,
the author may be convinced that they correct a mis-interaction between
subsystems, but we would like the changes to soak in upstream for a few
months to build confidence that they solve the problem without causing
more problems.

(c) Everything else.  New features, giant refactorings, etc.  These
generally should not be backported, unless someone has a /really/ good
reason.

Here are a few principles I'd like to see guiding stable backport
efforts:

1. AUTOSEL is a good tool to _start_ the process of identifying low
hanging fruit to backport.  Automation is our friend, but XFS is complex
so we still need people who have kept up with linux-xfs to know what's
appropriate (and what compile tests can't find) to finish the process.

2. Some other tag for patches that could be a fix, but need a few months
to soak.  This is targeted at (b), since I'm terrible at remembering
that there are patches that are reaching ripeness.

3. fstesting -- new patches proposed for stable branches shouldn't
introduce new regressions, and ideally there would also be a regression
test that would now pass.  As Dave and I have stated in the past,
fstests is a big umbrella of a test suite, which implies that A/B
testing is the way to go.  I think at least Zorro and I would like to
improve the tagging in fstests to make it more obvious which tests
contain enough randomness that they cannot be expected to behave 100%
reliably.

Here are a couple of antipatterns from the past:

i. Robots shovelling patches into stable kernels with no testing.

ii. Massively large backports.  New features don't go to stable kernels,
and I doubt the stable kernel maintainers will accept that anyway.  I
grok the temptation to backport more so that it's easier to land future
fixes via AUTOSEL, but I personally wouldn't endorse frontloading a
bunch of work to chase a promise of less future work.

And a question or two:

a> I've been following the recent fstests threads, and it seems to me
that there are really two classes of users -- sustaining people who want
fstests to run reliably so they can tell if their backports have broken
anything; and developers, who want the randomness to try to poke into
dusty corners of the filesystem.  Can we make it easier to associate
random bits of data (reliability rates, etc.) with a given fstests
configuration?  And create a test group^Wtag for the tests that rely on
RNGs to shake things up?
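
For illustration only -- assuming we settled on a tag name like "rand",
and reusing the _begin_fstest convention newer fstests already has, a
test's preamble could grow the tag and harnesses could exclude it.  A
hypothetical sketch, not an actual fstests change:

  # in a hypothetical tests/xfs/NNN: declare the extra group/tag
  . ./common/preamble
  _begin_fstest auto stress rand

  # in the harness: skip RNG-driven tests when doing A/B runs
  ./check -g auto -x rand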

b> Testing relies very heavily on being able to spin up a lot of testing
resources.  Can/should we make it easier for people with a kernel.org
account to get free(ish) cloud accounts with the LF members who are also
cloud vendors?

Thoughts? Flames?

--D


* Re: XFS LTS backport cabal
  2022-05-25 21:23 XFS LTS backport cabal Darrick J. Wong
@ 2022-05-26  3:43 ` Amir Goldstein
  2022-05-26 15:01 ` Leah Rumancik
  2022-05-26 15:01 ` Theodore Ts'o
  2 siblings, 0 replies; 7+ messages in thread
From: Amir Goldstein @ 2022-05-26  3:43 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: xfs, Leah Rumancik, Theodore Ts'o, Shirley Ma, Eric Sandeen,
	Dave Chinner, Chandan Babu R, Konrad Wilk, Tyler Hicks,
	Luis R. Rodriguez

On Thu, May 26, 2022 at 12:23 AM Darrick J. Wong <djwong@kernel.org> wrote:
>
> Hi everyone,
>

Hi Darrick!

Thanks for making this introduction.

I've added Tyler (Microsoft) for his interest in 5.10 LTS backports
and Luis (Samsung) who has worked with the stable maintainers on xfs
backports in the past and is collaborating with me on 5.10 LTS testing
these days.

> As most of the people cc'd on this message are aware, there's been a
> recent groundswell of interest in increasing the amount of staff time
> assigned to backporting bug fixes to LTS kernels.  As part of preparing
> to resume maintainership on June 5th, I thought it would be a good idea
> to introduce all the participants (that I know of!) and to capture a
> summary of everyone's thoughts w.r.t. how to make stable backports
> happen.
>
> First, some introductions: Leah and Ted work at Google, and they've
> expressed interest in 5.15 LTS backports.  Dave and Eric are the other
> upstream XFS maintainers, so I've included them.  Amir seems to be
> working on 5.10 LTS backports for his employer(?).

Correct, my goals are twofold:

1. Consolidate duplicate efforts on stable kernels -
For CTERA products that are shipped as a VM image, I do not *need*
my backports to the CTERA kernel to go to LTS.  I *want* them to go to LTS.
As a community of developers who work for different companies,
we work so well together upstream and so poorly apart on the
stable kernels that we each maintain.
I would like to see that change.

2. For CTERA products that run in containers in the cloud, we rely on the
flow of fixes to LTS kernels that will someday flow into the container
host OS of the cloud provider.  Unless I start sending patches to cloud
providers directly, LTS is my only path to get things fixed in the future.
For this use case, we do not even have the luxury of assuming a recent LTS
kernel.  The major cloud providers released host OS images based on 5.10
only in late 2021.

> Chandan and I work
> at Oracle (obviously) and he's also been working on a few 5.15 LTS
> backports.
>
> I won't speak for other organizations, but we (Oracle) are also
> interested in stable backports for the 5.4 and 4.14 LTS kernels, since
> we have customers running <cough> derivatives of those kernels.  Given
> what I've heard from others, many kernel distributors lean on the LTS
> kernels.
>
> The goal of this thread, then, is to shed some light on who's currently
> doing what to reduce duplication of LTS work, and to make sure that
> we're all more or less on the same page with regards to what we will and
> won't try to push to stable.  (A side goal of mine is to help everyone
> working on the stable branches to avoid the wrath and unhelpful form
> letters of the stable maintainers.)
>
> Briefly, I think the patches that flow into XFS could be put into three
> rough categories:
>
> (a) Straightforward fixes.  These are usually pretty simple fixes (e.g.
> omitted errno checking, insufficient validation, etc.) that sometimes get
> proper Fixes tags, which means that AUTOSEL can be of some benefit.
>
> (b) Probable fixes.  Often these aren't all that obvious -- for example,
> the author may be convinced that they correct a mis-interaction between
> subsystems, but we would like the changes to soak in upstream for a few
> months to build confidence that they solve the problem without causing
> more problems.
>
> (c) Everything else.  New features, giant refactorings, etc.  These
> generally should not be backported, unless someone has a /really/ good
> reason.
>
> Here are a few principles I'd like to see guiding stable backport
> efforts:
>
> 1. AUTOSEL is a good tool to _start_ the process of identifying low
> hanging fruit to backport.  Automation is our friend, but XFS is complex
> so we still need people who have kept up with linux-xfs to know what's
> appropriate (and what compile tests can't find) to finish the process.
>

I am very happy that you stepped up to write these expectations.
Please spell out the process you want to see, because it was not
clear to Ted and to me how we should post the xfs stable candidates.

Should we post to the xfs list and wait for an explicit ACK/RVB on every patch?
Should we post to the xfs list and, if no objections are raised, post to stable?
For those of you not subscribed to the xfs list, I have just sent my first crop
of 5.10 candidates yesterday [1]. Still waiting to see how that plays out.

[1] https://lore.kernel.org/linux-xfs/20220525111715.2769700-1-amir73il@gmail.com/

> 2. Some other tag for patches that could be a fix, but need a few months
> to soak.  This is targeted at (b), since I'm terrible at remembering
> that there are patches that are reaching ripeness.

The question is: soak in 'master', or soak in a release, which means
soaking in the stable kernel?

My opinion is that if we want exposure to real users with diverse
userspace tools, then soaking in stable is the right answer.  Typically,
(b) patches would soak ~2 months in -rc and then ~2 more months in a .y
release, and at that point they can ONLY be considered for LTS, because
that .y release (which is not LTS) is EOL.

>
> 3. fstesting -- new patches proposed for stable branches shouldn't
> introduce new regressions, and ideally there would also be a regression
> test that would now pass.  As Dave and I have stated in the past,
> fstests is a big umbrella of a test suite, which implies that A/B
> testing is the way to go.  I think at least Zorro and I would like to
> improve the tagging in fstests to make it more obvious which tests
> contain enough randomness that they cannot be expected to behave 100%
> reliably.
>
> Here are a couple of antipatterns from the past:
>
> i. Robots shovelling patches into stable kernels with no testing.
>
> ii. Massively large backports.  New features don't go to stable kernels,
> and I doubt the stable kernel maintainers will accept that anyway.  I
> grok the temptation to backport more so that it's easier to land future
> fixes via AUTOSEL, but I personally wouldn't endorse frontloading a
> bunch of work to chase a promise of less future work.
>
> And a question or two:
>
> a> I've been following the recent fstests threads, and it seems to me
> that there are really two classes of users -- sustaining people who want
> fstests to run reliably so they can tell if their backports have broken
> anything; and developers, who want the randomness to try to poke into
> dusty corners of the filesystem.  Can we make it easier to associate
> random bits of data (reliability rates, etc.) with a given fstests
> configuration?  And create a test group^Wtag for the tests that rely on
> RNGs to shake things up?
>

I'm a big fan of that idea :)

> b> Testing relies very heavily on being able to spin up a lot of testing
> resources.  Can/should we make it easier for people with a kernel.org
> account to get free(ish) cloud accounts with the LF members who are also
> cloud vendors?
>
> Thoughts? Flames?
>

Here is one thought - the stable candidate review process is bound to add
more work to the overloaded xfs maintainer.

I think it is a classic role to delegate to xfs stable tree maintainer(s).
The xfs stable maintainer(s)' job would be to ping reviewers for acks,
make sure patches are tested according to the agreed standard, and then
curate a tree and communicate with the stable maintainers.

Ideally, there would be a maintainer per LTS tree, which could naturally
align with employers' interests and make it easier for developers to
justify the time spent, but the load can be shared among maintainers, as
nagging for review is often not LTS specific.

If the xfs maintainers are willing to put their trust in me, I volunteer
for the job of xfs stable patch herding in general and/or for 5.10 LTS
specifically, which I am also contributing to.

Thanks,
Amir.


* Re: XFS LTS backport cabal
  2022-05-25 21:23 XFS LTS backport cabal Darrick J. Wong
  2022-05-26  3:43 ` Amir Goldstein
@ 2022-05-26 15:01 ` Leah Rumancik
  2022-05-26 15:20   ` Holger Hoffstätte
                     ` (2 more replies)
  2022-05-26 15:01 ` Theodore Ts'o
  2 siblings, 3 replies; 7+ messages in thread
From: Leah Rumancik @ 2022-05-26 15:01 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: xfs, Theodore Ts'o, Shirley Ma, Amir Goldstein, Eric Sandeen,
	Dave Chinner, Chandan Babu R, Konrad Wilk

On Wed, May 25, 2022 at 02:23:10PM -0700, Darrick J. Wong wrote:
> Hi everyone,
> 
> 3. fstesting -- new patches proposed for stable branches shouldn't
> introduce new regressions, and ideally there would also be a regression
> test that would now pass.  As Dave and I have stated in the past,
> fstests is a big umbrella of a test suite, which implies that A/B
> testing is the way to go.  I think at least Zorro and I would like to
> improve the tagging in fstests to make it more obvious which tests
> contain enough randomness that they cannot be expected to behave 100%
> reliably.
It would be nice to find an agreement on testing requirements. I have
attached some ideas on configs/number of tests/etc as well as the status
of my work on 5.15 below.


> a> I've been following the recent fstests threads, and it seems to me
> that there are really two classes of users -- sustaining people who want
> fstests to run reliably so they can tell if their backports have broken
> anything; and developers, who want the randomness to try to poke into
> dusty corners of the filesystem.  Can we make it easier to associate
> random bits of data (reliability rates, etc.) with a given fstests
> configuration?  And create a test group^Wtag for the tests that rely on
> RNGs to shake things up?
This would be great!

> 
> 
> Thoughts? Flames?
> 
> --D
This thread had good timing :) I have been working on setting up 
some automated testing. Currently, 5.15.y is our priority so I have 
started working on this branch.

Patches are being selected by simply searching for the “Fixes” 
tag and applying if the commit-to-be-fixed is in the stable branch, 
but AUTOSEL would be nice, so I’ll start playing around with that. 
Amir, it would be nice to sync up the patch selection process. I can 
help share the load, especially for 5.15.
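
A simplified sketch of that selection step (the real process involves
more manual judgement, and the stable target below is just an example):

STABLE=v5.15.33        # example stable target
for c in $(git rev-list --reverse v5.15..v5.17.2 -- fs/xfs/); do
	fixed=$(git log -1 --format=%B "$c" |
		sed -n 's/^Fixes: \([0-9a-f]\{8,\}\).*/\1/p' | head -n1)
	[ -n "$fixed" ] || continue
	# keep the commit only if the commit it fixes is already in stable
	if git merge-base --is-ancestor "$fixed" "$STABLE" 2>/dev/null; then
		git log -1 --oneline "$c"
	fi
done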

Selecting just the tagged “Fixes” for 5.15.y for patches through 
5.17.2, 15 patches were found and applied - if there are no 
complaints about the testing setup, I can go ahead and send out this 
batch:

c30a0cbd07ec xfs: use kmem_cache_free() for kmem_cache objects
5ca5916b6bc9 xfs: punch out data fork delalloc blocks on COW writeback failure
a1de97fe296c xfs: Fix the free logic of state in xfs_attr_node_hasname
1090427bf18f xfs: remove xfs_inew_wait
089558bc7ba7 xfs: remove all COW fork extents when remounting readonly
7993f1a431bc xfs: only run COW extent recovery when there are no live extents
09654ed8a18c xfs: check sb_meta_uuid for dabuf buffer recovery
f8d92a66e810 xfs: prevent UAF in xfs_log_item_in_current_chkpt
b97cca3ba909 xfs: only bother with sync_filesystem during readonly remount
eba0549bc7d1 xfs: don't generate selinux audit messages for capability testing
e014f37db1a2 xfs: use setattr_copy to set vfs inode attributes
70447e0ad978 xfs: async CIL flushes need pending pushes to be made stable
c8c568259772 xfs: don't include bnobt blocks when reserving free block pool
cd6f79d1fb32 xfs: run callbacks before waking waiters in xlog_state_shutdown_callbacks
919edbadebe1 xfs: drop async cache flushes from CIL commits.

Tests are being run through gce-xfstests with the 5.15 kernel config
from xfstests-bld
(https://github.com/tytso/xfstests-bld/blob/master/kernel-configs/x86_64-config-5.15).
The configs being tested are the following:

xfs defaults
quota
quota 1k
v4
pmem and fsdax
realtime
8k directory blocks
external log
realtime and external log devices
realtime with 28k extents, external log devices
overlayfs atop xfs
overlayfs atop ext4
ext4 defaults

The test set will be run for each batch of backports, running each
test 3 times, and if no new failures are seen compared to the same
branch without the backports, the batch of patches will be deemed
good.  No regressions were seen for the first set of patches listed
above when applied to 5.15.33.  If new failures are seen during
testing, a bisect can be run to find the offending commits, which can
then be dropped from the batch before confirming there are no
remaining new failures.  A bug report can be sent to indicate which
commits caused the new test failures.  The test results can be posted
publicly after each run.  The easiest option would be to send the test
results to a mailing list, such as a google groups mailing list,
similar to what syzkaller does, or directly to linux-xfs if it isn't
too spammy.
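
Conceptually, the regression gate is just a set difference between the
baseline and candidate failure lists.  A rough sketch, where run_fstests
is a stand-in for the real gce-xfstests invocation and results parsing
(assumed to print one failing test name per line):

baseline_fails=$(mktemp); candidate_fails=$(mktemp)
for i in 1 2 3; do
	# stand-in wrapper, not a real command: <kernel label> <iteration>
	run_fstests 5.15.33-baseline  "$i" >> "$baseline_fails"
	run_fstests 5.15.33-backports "$i" >> "$candidate_fails"
done
sort -u -o "$baseline_fails"  "$baseline_fails"
sort -u -o "$candidate_fails" "$candidate_fails"
# tests that fail only with the backports applied -> bisect candidates
comm -13 "$baseline_fails" "$candidate_fails"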


- Leah





* Re: XFS LTS backport cabal
  2022-05-25 21:23 XFS LTS backport cabal Darrick J. Wong
  2022-05-26  3:43 ` Amir Goldstein
  2022-05-26 15:01 ` Leah Rumancik
@ 2022-05-26 15:01 ` Theodore Ts'o
  2 siblings, 0 replies; 7+ messages in thread
From: Theodore Ts'o @ 2022-05-26 15:01 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: xfs, Leah Rumancik, Shirley Ma, Amir Goldstein, Eric Sandeen,
	Dave Chinner, Chandan Babu R, Konrad Wilk

On Wed, May 25, 2022 at 02:23:10PM -0700, Darrick J. Wong wrote:
> 
> 2. Some other tag for patches that could be a fix, but need a few months
> to soak.  This is targeted at (b), since I'm terrible at remembering
> that there are patches that are reaching ripeness.

What I'd suggest here is a simple "Stable-Soak: <date>|<release>" tag.
It wouldn't need to be official, and so we don't need to get the
blessing of the Stable Tree maintainers; it would just be something
that would be honored by the "XFS LTS backport cabal".
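
For example, a fix could carry a trailer like "Stable-Soak: 2022-09-01"
(the date is purely illustrative), and whoever curates the LTS queue
could periodically ask git which candidates have ripened -- a rough
sketch, assuming the date form of the tag:

range=${1:-v5.18..origin/master}   # example range to scan for candidates
today=$(date +%F)
for c in $(git rev-list --grep='^Stable-Soak:' "$range"); do
	soak=$(git show -s --format=%B "$c" |
	       sed -n 's/^Stable-Soak:[[:space:]]*//p' | head -n1)
	# soak date has passed -> candidate is ready for the stable queue
	if [[ -n "$soak" && "$soak" < "$today" ]]; then
		git show -s --oneline "$c"
	fi
done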

> a> I've been following the recent fstests threads, and it seems to me
> that there are really two classes of users -- sustaining people who want
> fstests to run reliably so they can tell if their backports have broken
> anything; and developers, who want the randomness to try to poke into
> dusty corners of the filesystem.  Can we make it easier to associate
> random bits of data (reliability rates, etc.) with a given fstests
> configuration?  And create a test group^Wtag for the tests that rely on
> RNGs to shake things up?

In my experience, tests that have flaky results fall into two
categories: ones that are doing traditional fuzzing, and
those that are running stress tests either by themselves, or as
antagonists against some other operation --- e.g., running fsstress
while trying to do an online resize, or while trying to shut down the
file system, etc.

Some of these stress tests do use a PRNG, but even if we fixed the
seed to some value (such as 0), many of the test results would still
be potentially flaky.  These test results also tend to be very timing
dependent; so these are the tests whose failure rate varies
depending on whether the test devices are on a loop device, eMMC flash
device, HDD, SSD, or various cloud virtual block devices, such as
AWS's EBS or GCE's PD devices.

These tests are often very useful, since if we are missing a lock when
accessing some data structure, these stress-based tests are the most
likely way of noticing such problems.  So I don't think we would want
to exclude them; and if we're only excluding those tests which are
doing fuzz testing, I'm not sure it'll really move the needle.

> b> Testing relies very heavily on being able to spin up a lot of testing
> resources.  Can/should we make it easier for people with a kernel.org
> account to get free(ish) cloud accounts with the LF members who are also
> cloud vendors?

If anyone wants to use gce-xfstests, I'm happy to work on sponsoring
some GCE credits for that purpose.  One of the nice things about
gce-xfstests is that test VMs are only left running when actually
running a test.  Once a test is finished, the VM shuts itself down.
And if we want to run a number of file system configs, we can spawn a
dozen VMs, one for each fsconfig, and when they are done, each VM
shuts itself down except for a small test manager which collates
the results into a single report.  This makes gce-xfstests much more
cost efficient than those schemes which keep a VM up and running at
all times, whether it is running tests or not.

Cheers,

						- Ted


* Re: XFS LTS backport cabal
  2022-05-26 15:01 ` Leah Rumancik
@ 2022-05-26 15:20   ` Holger Hoffstätte
  2022-05-26 15:51   ` Amir Goldstein
  2022-05-26 15:55   ` Chandan Babu R
  2 siblings, 0 replies; 7+ messages in thread
From: Holger Hoffstätte @ 2022-05-26 15:20 UTC (permalink / raw)
  To: Leah Rumancik, Darrick J. Wong
  Cc: xfs, Theodore Ts'o, Shirley Ma, Amir Goldstein, Eric Sandeen,
	Dave Chinner, Chandan Babu R, Konrad Wilk

On 2022-05-26 17:01, Leah Rumancik wrote:
> This thread had good timing :) I have been working on setting up
> some automated testing. Currently, 5.15.y is our priority so I have
> started working on this branch.
> 
> Patches are being selected by simply searching for the “Fixes”
> tag and applying if the commit-to-be-fixed is in the stable branch,
> but AUTOSEL would be nice, so I’ll start playing around with that.
> Amir, it would be nice to sync up the patch selection process. I can
> help share the load, especially for 5.15.
> 
> Selecting just the tagged “Fixes” for 5.15.y for patches through
> 5.17.2, 15 patches were found and applied - if there are no
> complaints about the testing setup, I can go ahead and send out this
> batch:
> 
> c30a0cbd07ec xfs: use kmem_cache_free() for kmem_cache objects
> 5ca5916b6bc9 xfs: punch out data fork delalloc blocks on COW writeback failure
> a1de97fe296c xfs: Fix the free logic of state in xfs_attr_node_hasname
> 1090427bf18f xfs: remove xfs_inew_wait
> 089558bc7ba7 xfs: remove all COW fork extents when remounting readonly
> 7993f1a431bc xfs: only run COW extent recovery when there are no live extents
> 09654ed8a18c xfs: check sb_meta_uuid for dabuf buffer recovery
> f8d92a66e810 xfs: prevent UAF in xfs_log_item_in_current_chkpt
> b97cca3ba909 xfs: only bother with sync_filesystem during readonly remount
> eba0549bc7d1 xfs: don't generate selinux audit messages for capability testing
> e014f37db1a2 xfs: use setattr_copy to set vfs inode attributes
> 70447e0ad978 xfs: async CIL flushes need pending pushes to be made stable
> c8c568259772 xfs: don't include bnobt blocks when reserving free block pool
> cd6f79d1fb32 xfs: run callbacks before waking waiters in xlog_state_shutdown_callbacks
> 919edbadebe1 xfs: drop async cache flushes from CIL commits.

Please include:
9a5280b312e2 xfs: reorder iunlink remove operation in xfs_ifree

Thanks!


* Re: XFS LTS backport cabal
  2022-05-26 15:01 ` Leah Rumancik
  2022-05-26 15:20   ` Holger Hoffstätte
@ 2022-05-26 15:51   ` Amir Goldstein
  2022-05-26 15:55   ` Chandan Babu R
  2 siblings, 0 replies; 7+ messages in thread
From: Amir Goldstein @ 2022-05-26 15:51 UTC (permalink / raw)
  To: Leah Rumancik
  Cc: Darrick J. Wong, xfs, Theodore Ts'o, Shirley Ma,
	Eric Sandeen, Dave Chinner, Chandan Babu R, Konrad Wilk

On Thu, May 26, 2022 at 6:01 PM Leah Rumancik <leah.rumancik@gmail.com> wrote:
>
> On Wed, May 25, 2022 at 02:23:10PM -0700, Darrick J. Wong wrote:
> > Hi everyone,
> >
> > 3. fstesting -- new patches proposed for stable branches shouldn't
> > introduce new regressions, and ideally there would also be a regression
> > test that would now pass.  As Dave and I have stated in the past,
> > fstests is a big umbrella of a test suite, which implies that A/B
> > testing is the way to go.  I think at least Zorro and I would like to
> > improve the tagging in fstests to make it more obvious which tests
> > contain enough randomness that they cannot be expected to behave 100%
> > reliably.
> It would be nice to find an agreement on testing requirements. I have
> attached some ideas on configs/number of tests/etc as well as the status
> of my work on 5.15 below.
>
>
> > a> I've been following the recent fstests threads, and it seems to me
> > that there are really two classes of users -- sustaining people who want
> > fstests to run reliably so they can tell if their backports have broken
> > anything; and developers, who want the randomness to try to poke into
> > dusty corners of the filesystem.  Can we make it easier to associate
> > random bits of data (reliability rates, etc.) with a given fstests
> > configuration?  And create a test group^Wtag for the tests that rely on
> > RNGs to shake things up?
> This would be great!
>
> >
> >
> > Thoughts? Flames?
> >
> > --D
> This thread had good timing :) I have been working on setting up
> some automated testing. Currently, 5.15.y is our priority so I have
> started working on this branch.
>
> Patches are being selected by simply searching for the “Fixes”
> tag and applying if the commit-to-be-fixed is in the stable branch,
> but AUTOSEL would be nice, so I’ll start playing around with that.
> Amir, it would be nice to sync up the patch selection process. I can
> help share the load, especially for 5.15.
>

I would like that :)

> Selecting just the tagged “Fixes” for 5.15.y for patches through
> 5.17.2, 15 patches were found and applied - if there are no
> complaints about the testing setup, I can go ahead and send out this
> batch:
>
> c30a0cbd07ec xfs: use kmem_cache_free() for kmem_cache objects
> 5ca5916b6bc9 xfs: punch out data fork delalloc blocks on COW writeback failure
> a1de97fe296c xfs: Fix the free logic of state in xfs_attr_node_hasname
> 1090427bf18f xfs: remove xfs_inew_wait
> 089558bc7ba7 xfs: remove all COW fork extents when remounting readonly
> 7993f1a431bc xfs: only run COW extent recovery when there are no live extents
> 09654ed8a18c xfs: check sb_meta_uuid for dabuf buffer recovery
> f8d92a66e810 xfs: prevent UAF in xfs_log_item_in_current_chkpt
> b97cca3ba909 xfs: only bother with sync_filesystem during readonly remount
> eba0549bc7d1 xfs: don't generate selinux audit messages for capability testing
> e014f37db1a2 xfs: use setattr_copy to set vfs inode attributes
> 70447e0ad978 xfs: async CIL flushes need pending pushes to be made stable
> c8c568259772 xfs: don't include bnobt blocks when reserving free block pool
> cd6f79d1fb32 xfs: run callbacks before waking waiters in xlog_state_shutdown_callbacks
> 919edbadebe1 xfs: drop async cache flushes from CIL commits.
>

Here is my selection for v5.15..v5.17:

* 1cd231d9fdb1 - (tag: xfs-5.10.y-7) xfs: use setattr_copy to set vfs
inode attributes
* af09d052db41 - xfs: fallocate() should call file_modified()
* 0daebb90e096 - xfs: reject crazy array sizes being fed to XFS_IOC_GETBMAP*
* 35d876873c28 - xfs: prevent UAF in xfs_log_item_in_current_chkpt
* 796e9e00071d - xfs: xfs_log_force_lsn isn't passed a LSN
* fa33747dd25b - xfs: refactor xfs_file_fsync
* 374a05b9a2de - xfs: check sb_meta_uuid for dabuf buffer recovery
* 0b66f78d6af1 - (tag: xfs-5.10.y-6) xfs: remove all COW fork extents
when remounting readonly
* 44caa4c7aaf4 - xfs: remove incorrect ASSERT in xfs_rename
* 4133fc82c95d - xfs: punch out data fork delalloc blocks on COW
writeback failure

The branch of the moment is at:
https://github.com/amir73il/linux/commits/xfs-5.10.y
But I keep force pushing the branch and tags when dropping patches
from the queue.

Note that only half of those commits have a Fixes: tag.
As I explained, I got to them by removing all the non-fix and irrelevant
commits and then evaluating the remaining commits individually.
This was only made scalable because I was working at the patch series
level and not at the commit level, although many times a single fix patch
was selected from within a non-fix series.

Note that many of the fixes that you selected are fallout from big
performance improvements that were merged after 5.10, so they were
not relevant for my 5.10.y selection.

Thanks,
Amir.


* Re: XFS LTS backport cabal
  2022-05-26 15:01 ` Leah Rumancik
  2022-05-26 15:20   ` Holger Hoffstätte
  2022-05-26 15:51   ` Amir Goldstein
@ 2022-05-26 15:55   ` Chandan Babu R
  2 siblings, 0 replies; 7+ messages in thread
From: Chandan Babu R @ 2022-05-26 15:55 UTC (permalink / raw)
  To: Leah Rumancik
  Cc: Darrick J. Wong, xfs, Theodore Ts'o, Shirley Ma,
	Amir Goldstein, Eric Sandeen, Dave Chinner, Konrad Wilk

On Thu, May 26, 2022 at 08:01:22 AM -0700, Leah Rumancik wrote:
> On Wed, May 25, 2022 at 02:23:10PM -0700, Darrick J. Wong wrote:
>> Hi everyone,
>> 
>> 3. fstesting -- new patches proposed for stable branches shouldn't
>> introduce new regressions, and ideally there would also be a regression
>> test that would now pass.  As Dave and I have stated in the past,
>> fstests is a big umbrella of a test suite, which implies that A/B
>> testing is the way to go.  I think at least Zorro and I would like to
>> improve the tagging in fstests to make it more obvious which tests
>> contain enough randomness that they cannot be expected to behave 100%
>> reliably.
> It would be nice to find an agreement on testing requirements. I have
> attached some ideas on configs/number of tests/etc as well as the status
> of my work on 5.15 below.
>
>
>> a> I've been following the recent fstests threads, and it seems to me
>> that there are really two classes of users -- sustaining people who want
>> fstests to run reliably so they can tell if their backports have broken
>> anything; and developers, who want the randomness to try to poke into
>> dusty corners of the filesystem.  Can we make it easier to associate
>> random bits of data (reliability rates, etc.) with a given fstests
>> configuration?  And create a test group^Wtag for the tests that rely on
>> RNGs to shake things up?
> This would be great!
>
>> 
>> 
>> Thoughts? Flames?
>> 
>> --D
> This thread had good timing :) I have been working on setting up 
> some automated testing. Currently, 5.15.y is our priority so I have 
> started working on this branch.
>
> Patches are being selected by simply searching for the “Fixes” 
> tag and applying if the commit-to-be-fixed is in the stable branch, 
> but AUTOSEL would be nice, so I’ll start playing around with that. 
> Amir, it would be nice to sync up the patch selection process. I can 
> help share the load, especially for 5.15.
>
> Selecting just the tagged “Fixes” for 5.15.y for patches through 
> 5.17.2, 15 patches were found and applied - if there are no 
> complaints about the testing setup, I can go ahead and send out this 
> batch:
>
> c30a0cbd07ec xfs: use kmem_cache_free() for kmem_cache objects
> 5ca5916b6bc9 xfs: punch out data fork delalloc blocks on COW writeback failure
> a1de97fe296c xfs: Fix the free logic of state in xfs_attr_node_hasname
> 1090427bf18f xfs: remove xfs_inew_wait
> 089558bc7ba7 xfs: remove all COW fork extents when remounting readonly
> 7993f1a431bc xfs: only run COW extent recovery when there are no live extents
> 09654ed8a18c xfs: check sb_meta_uuid for dabuf buffer recovery
> f8d92a66e810 xfs: prevent UAF in xfs_log_item_in_current_chkpt
> b97cca3ba909 xfs: only bother with sync_filesystem during readonly remount
> eba0549bc7d1 xfs: don't generate selinux audit messages for capability testing
> e014f37db1a2 xfs: use setattr_copy to set vfs inode attributes
> 70447e0ad978 xfs: async CIL flushes need pending pushes to be made stable
> c8c568259772 xfs: don't include bnobt blocks when reserving free block pool
> cd6f79d1fb32 xfs: run callbacks before waking waiters in xlog_state_shutdown_callbacks
> 919edbadebe1 xfs: drop async cache flushes from CIL commits.
>

In our experience, we found that some of the patches which fix bugs do
not have an associated "Fixes" tag.  Hence I am currently using the script
https://gist.github.com/chandanr/c1e3affdb06eb2e025f955e7a77b2338 to identify
such commits along with the commits which do have the "Fixes" tag.

The following command line obtains the list of candidate commits between
v5.17 and v5.18:

# list-xfs-fix-commits.sh v5.17 v5.18
--- Actual fixes ---
1:  eba0549bc7d10
2:  e014f37db1a2d
3:  70447e0ad9781
4:  c8c5682597727
5:  cd6f79d1fb324
6:  919edbadebe17
7:  9a5280b312e2e

---- Possible fixes; Along with matching regex  ----
1:  871b9316e7a77: bug
2:  41667260bc84d: bug
3:  83a44a4f47ad2: fix
4:  a9a4bc8c76d74: rac.+
5:  dbd0f5299302f: rac.+
6:  941fbdfd6dd0f: rac.+
7:  01728b44ef1b7: bug
8:  b9b1335e64030: fix
9:  82be38bcf8a2e: fix
10:  d2d7c0473586d: fix
11:  ab9c81ef321f9: assert
12:  b5f17bec1213a: rac.+
13:  41e6362183589: fix
14:  3c4cb76bce438: rac.+
15:  5652ef31705f2: fail

I go through each commit in the "Possible fixes" section and determine if any
of those need to be backported.
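
The gist linked above does the real work; a minimal sketch of the idea
(regex list abbreviated to match the output above, not the actual gist)
would be something like:

#!/bin/bash
# list-xfs-fix-commits.sh <old> <new> -- sketch only
old=$1 new=$2
echo "--- Actual fixes ---"
git log --oneline --grep='^Fixes:' "$old..$new" -- fs/xfs/

echo "---- Possible fixes; Along with matching regex  ----"
for c in $(git rev-list --reverse "$old..$new" -- fs/xfs/); do
	msg=$(git log -1 --format=%B "$c")
	echo "$msg" | grep -q '^Fixes:' && continue   # already listed above
	for re in fix bug 'rac.+' assert fail; do
		if echo "$msg" | grep -qiE "$re"; then
			echo "$(git rev-parse --short=13 "$c"): $re"
			break
		fi
	done
done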

-- 
chandan

