linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [LSF/MM TOPIC] FS, MM, and stable trees
@ 2019-02-12 17:00 Sasha Levin
  2019-02-12 21:32 ` Steve French
  2022-03-08  9:32 ` Amir Goldstein
  0 siblings, 2 replies; 55+ messages in thread
From: Sasha Levin @ 2019-02-12 17:00 UTC (permalink / raw)
  To: lsf-pc; +Cc: linux-fsdevel, linux-mm, linux-kernel

Hi all,

I'd like to propose a discussion about the workflow of the stable trees
when it comes to fs/ and mm/. In the past year we had some friction with
regards to the policies and the procedures around picking patches for
stable tree, and I feel it would be very useful to establish better flow
with the folks who might be attending LSF/MM.

I feel that fs/ and mm/ are in very different places with regards to
which patches go in -stable, what tests are expected, and the timeline
of patches from the point they are proposed on a mailing list to the
point they are released in a stable tree. Therefore, I'd like to propose
two different sessions on this (one for fs/ and one for mm/), as a
common session might be less conductive to agreeing on a path forward as
the starting point for both subsystems are somewhat different.

We can go through the existing processes, automation, and testing
mechanisms we employ when building stable trees, and see how we can
improve these to address the concerns of fs/ and mm/ folks.

--
Thanks,
Sasha

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [LSF/MM TOPIC] FS, MM, and stable trees
  2019-02-12 17:00 [LSF/MM TOPIC] FS, MM, and stable trees Sasha Levin
@ 2019-02-12 21:32 ` Steve French
  2019-02-13  7:20   ` Amir Goldstein
  2022-03-08  9:32 ` Amir Goldstein
  1 sibling, 1 reply; 55+ messages in thread
From: Steve French @ 2019-02-12 21:32 UTC (permalink / raw)
  To: Sasha Levin; +Cc: lsf-pc, linux-fsdevel, linux-mm, LKML

Makes sense - e.g. I would like to have a process to make automation
of the xfstests for proposed patches for stable for cifs.ko easier and
part of the process (as we already do for cifs/smb3 related checkins
to for-next ie linux next before sending to mainline for cifs.ko).
Each filesystem has a different set of xfstests (and perhaps other
mechanisms) to run so might be very specific to each file system, but
would be helpful to discuss

On Tue, Feb 12, 2019 at 9:32 AM Sasha Levin <sashal@kernel.org> wrote:
>
> Hi all,
>
> I'd like to propose a discussion about the workflow of the stable trees
> when it comes to fs/ and mm/. In the past year we had some friction with
> regards to the policies and the procedures around picking patches for
> stable tree, and I feel it would be very useful to establish better flow
> with the folks who might be attending LSF/MM.
>
> I feel that fs/ and mm/ are in very different places with regards to
> which patches go in -stable, what tests are expected, and the timeline
> of patches from the point they are proposed on a mailing list to the
> point they are released in a stable tree. Therefore, I'd like to propose
> two different sessions on this (one for fs/ and one for mm/), as a
> common session might be less conductive to agreeing on a path forward as
> the starting point for both subsystems are somewhat different.
>
> We can go through the existing processes, automation, and testing
> mechanisms we employ when building stable trees, and see how we can
> improve these to address the concerns of fs/ and mm/ folks.
>
> --
> Thanks,
> Sasha



-- 
Thanks,

Steve

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [LSF/MM TOPIC] FS, MM, and stable trees
  2019-02-12 21:32 ` Steve French
@ 2019-02-13  7:20   ` Amir Goldstein
  2019-02-13  7:37     ` Greg KH
  0 siblings, 1 reply; 55+ messages in thread
From: Amir Goldstein @ 2019-02-13  7:20 UTC (permalink / raw)
  To: Steve French, Sasha Levin
  Cc: lsf-pc, linux-fsdevel, linux-mm, LKML, Greg KH, Luis R. Rodriguez

On Tue, Feb 12, 2019 at 11:56 PM Steve French <smfrench@gmail.com> wrote:
>
> Makes sense - e.g. I would like to have a process to make automation
> of the xfstests for proposed patches for stable for cifs.ko easier and
> part of the process (as we already do for cifs/smb3 related checkins
> to for-next ie linux next before sending to mainline for cifs.ko).
> Each filesystem has a different set of xfstests (and perhaps other
> mechanisms) to run so might be very specific to each file system, but
> would be helpful to discuss
>

Agreed.

Perhaps it is just a matter of communicating the stable tree workflow.
I currently only see notice emails from Greg about patches being queued
for stable.

I never saw an email from you or Greg saying, the branch "stable-xxx" is
in review. Please run your tests.

I have seen reports from LTP about stable kernels, so I know it is
being run regularly and I recently saw the set of xfstests configurations
that Sasha and Luis posted.

Is there any publicly available information about which tests are being run
on stable candidate branches?

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [LSF/MM TOPIC] FS, MM, and stable trees
  2019-02-13  7:20   ` Amir Goldstein
@ 2019-02-13  7:37     ` Greg KH
  2019-02-13  9:01       ` Amir Goldstein
  0 siblings, 1 reply; 55+ messages in thread
From: Greg KH @ 2019-02-13  7:37 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Steve French, Sasha Levin, lsf-pc, linux-fsdevel, linux-mm, LKML,
	Luis R. Rodriguez

On Wed, Feb 13, 2019 at 09:20:00AM +0200, Amir Goldstein wrote:
> I never saw an email from you or Greg saying, the branch "stable-xxx" is
> in review. Please run your tests.

That is what my "Subject: [PATCH 4.9 000/137] 4.9.156-stable review"
type emails are supposed to kick off.  They are sent both to the stable
mailing list and lkml.

This message already starts the testing systems going for a number of
different groups out there, do you want to be added to the cc: list so
you get them directly?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [LSF/MM TOPIC] FS, MM, and stable trees
  2019-02-13  7:37     ` Greg KH
@ 2019-02-13  9:01       ` Amir Goldstein
  2019-02-13  9:18         ` Greg KH
  0 siblings, 1 reply; 55+ messages in thread
From: Amir Goldstein @ 2019-02-13  9:01 UTC (permalink / raw)
  To: Greg KH
  Cc: Steve French, Sasha Levin, lsf-pc, linux-fsdevel, linux-mm, LKML,
	Luis R. Rodriguez

On Wed, Feb 13, 2019 at 9:37 AM Greg KH <gregkh@linuxfoundation.org> wrote:
>
> On Wed, Feb 13, 2019 at 09:20:00AM +0200, Amir Goldstein wrote:
> > I never saw an email from you or Greg saying, the branch "stable-xxx" is
> > in review. Please run your tests.
>
> That is what my "Subject: [PATCH 4.9 000/137] 4.9.156-stable review"
> type emails are supposed to kick off.  They are sent both to the stable
> mailing list and lkml.
>
> This message already starts the testing systems going for a number of
> different groups out there, do you want to be added to the cc: list so
> you get them directly?
>

No thanks, I'll fix my email filters ;-)

I think the main difference between these review announcements
and true CI is what kind of guaranty you get for a release candidate
from NOT getting a test failure response, which is one of the main
reasons that where holding back xfs stable fixes for so long.

Best effort testing in timely manner is good, but a good way to
improve confidence in stable kernel releases is a publicly
available list of tests that the release went through.

Do you have any such list of tests that you *know* are being run,
that you (or Sasha) run yourself or that you actively wait on an
ACK from a group before a release?

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [LSF/MM TOPIC] FS, MM, and stable trees
  2019-02-13  9:01       ` Amir Goldstein
@ 2019-02-13  9:18         ` Greg KH
  2019-02-13 19:25           ` Sasha Levin
  0 siblings, 1 reply; 55+ messages in thread
From: Greg KH @ 2019-02-13  9:18 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Steve French, Sasha Levin, lsf-pc, linux-fsdevel, linux-mm, LKML,
	Luis R. Rodriguez

On Wed, Feb 13, 2019 at 11:01:25AM +0200, Amir Goldstein wrote:
> I think the main difference between these review announcements
> and true CI is what kind of guaranty you get for a release candidate
> from NOT getting a test failure response, which is one of the main
> reasons that where holding back xfs stable fixes for so long.

That's not true, I know to wait for some responses before doing a
release of these kernels.

> Best effort testing in timely manner is good, but a good way to
> improve confidence in stable kernel releases is a publicly
> available list of tests that the release went through.

We have that, you aren't noticing them...

> Do you have any such list of tests that you *know* are being run,
> that you (or Sasha) run yourself or that you actively wait on an
> ACK from a group before a release?

Yes, look at the responses to those messages from Guenter, Shuah, Jon,
kernel.ci, Red Hat testing, the Linaro testing teams, and a few other
testers that come and go over time.  Those list out all of the tests
that are being run, and the results of those tests.

I also get a number of private responses from different build systems
from companies that don't want to post in public, which is fine, I
understand the issues involved with that.

I would argue that the stable releases are better tested than Linus's
releases for that reason alone :)

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [LSF/MM TOPIC] FS, MM, and stable trees
  2019-02-13  9:18         ` Greg KH
@ 2019-02-13 19:25           ` Sasha Levin
  2019-02-13 19:52             ` Greg KH
  0 siblings, 1 reply; 55+ messages in thread
From: Sasha Levin @ 2019-02-13 19:25 UTC (permalink / raw)
  To: Greg KH
  Cc: Amir Goldstein, Steve French, lsf-pc, linux-fsdevel, linux-mm,
	LKML, Luis R. Rodriguez

On Wed, Feb 13, 2019 at 10:18:03AM +0100, Greg KH wrote:
>On Wed, Feb 13, 2019 at 11:01:25AM +0200, Amir Goldstein wrote:
>> Best effort testing in timely manner is good, but a good way to
>> improve confidence in stable kernel releases is a publicly
>> available list of tests that the release went through.
>
>We have that, you aren't noticing them...

This is one of the biggest things I want to address: there is a
disconnect between the stable kernel testing story and the tests the fs/
and mm/ folks expect to see here.

On one had, the stable kernel folks see these kernels go through entire
suites of testing by multiple individuals and organizations, receiving
way more coverage than any of Linus's releases.

On the other hand, things like LTP and selftests tend to barely scratch
the surface of our mm/ and fs/ code, and the maintainers of these
subsystems do not see LTP-like suites as something that adds significant
value and ignore them. Instead, they have a (convoluted) set of testing
they do with different tools and configurations that qualifies their
code as being "tested".

So really, it sounds like a low hanging fruit: we don't really need to
write much more testing code code nor do we have to refactor existing
test suites. We just need to make sure the right tests are running on
stable kernels. I really want to clarify what each subsystem sees as
"sufficient" (and have that documented somewhere).

--
Thanks,
Sasha

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [LSF/MM TOPIC] FS, MM, and stable trees
  2019-02-13 19:25           ` Sasha Levin
@ 2019-02-13 19:52             ` Greg KH
  2019-02-13 20:14               ` James Bottomley
  2019-03-20  3:46               ` Jon Masters
  0 siblings, 2 replies; 55+ messages in thread
From: Greg KH @ 2019-02-13 19:52 UTC (permalink / raw)
  To: Sasha Levin
  Cc: Amir Goldstein, Steve French, lsf-pc, linux-fsdevel, linux-mm,
	LKML, Luis R. Rodriguez

On Wed, Feb 13, 2019 at 02:25:12PM -0500, Sasha Levin wrote:
> On Wed, Feb 13, 2019 at 10:18:03AM +0100, Greg KH wrote:
> > On Wed, Feb 13, 2019 at 11:01:25AM +0200, Amir Goldstein wrote:
> > > Best effort testing in timely manner is good, but a good way to
> > > improve confidence in stable kernel releases is a publicly
> > > available list of tests that the release went through.
> > 
> > We have that, you aren't noticing them...
> 
> This is one of the biggest things I want to address: there is a
> disconnect between the stable kernel testing story and the tests the fs/
> and mm/ folks expect to see here.
> 
> On one had, the stable kernel folks see these kernels go through entire
> suites of testing by multiple individuals and organizations, receiving
> way more coverage than any of Linus's releases.
> 
> On the other hand, things like LTP and selftests tend to barely scratch
> the surface of our mm/ and fs/ code, and the maintainers of these
> subsystems do not see LTP-like suites as something that adds significant
> value and ignore them. Instead, they have a (convoluted) set of testing
> they do with different tools and configurations that qualifies their
> code as being "tested".
> 
> So really, it sounds like a low hanging fruit: we don't really need to
> write much more testing code code nor do we have to refactor existing
> test suites. We just need to make sure the right tests are running on
> stable kernels. I really want to clarify what each subsystem sees as
> "sufficient" (and have that documented somewhere).

kernel.ci and 0-day and Linaro are starting to add the fs and mm tests
to their test suites to address these issues (I think 0-day already has
many of them).  So this is happening, but not quite obvious.  I know I
keep asking Linaro about this :(

Anyway, just having a list of what tests each subsystem things is "good
to run" would be great to have somewhere.  Ideally in the kernel tree
itself, as that's what kselftests are for :)

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [LSF/MM TOPIC] FS, MM, and stable trees
  2019-02-13 19:52             ` Greg KH
@ 2019-02-13 20:14               ` James Bottomley
  2019-02-15  1:50                 ` Sasha Levin
  2019-03-20  3:46               ` Jon Masters
  1 sibling, 1 reply; 55+ messages in thread
From: James Bottomley @ 2019-02-13 20:14 UTC (permalink / raw)
  To: Greg KH, Sasha Levin
  Cc: Amir Goldstein, Steve French, lsf-pc, linux-fsdevel, linux-mm,
	LKML, Luis R. Rodriguez

On Wed, 2019-02-13 at 20:52 +0100, Greg KH wrote:
> On Wed, Feb 13, 2019 at 02:25:12PM -0500, Sasha Levin wrote:
> > On Wed, Feb 13, 2019 at 10:18:03AM +0100, Greg KH wrote:
> > > On Wed, Feb 13, 2019 at 11:01:25AM +0200, Amir Goldstein wrote:
> > > > Best effort testing in timely manner is good, but a good way to
> > > > improve confidence in stable kernel releases is a publicly
> > > > available list of tests that the release went through.
> > > 
> > > We have that, you aren't noticing them...
> > 
> > This is one of the biggest things I want to address: there is a
> > disconnect between the stable kernel testing story and the tests
> > the fs/ and mm/ folks expect to see here.
> > 
> > On one had, the stable kernel folks see these kernels go through
> > entire suites of testing by multiple individuals and organizations,
> > receiving way more coverage than any of Linus's releases.
> > 
> > On the other hand, things like LTP and selftests tend to barely
> > scratch the surface of our mm/ and fs/ code, and the maintainers of
> > these subsystems do not see LTP-like suites as something that adds
> > significant value and ignore them. Instead, they have a
> > (convoluted) set of testing they do with different tools and
> > configurations that qualifies their code as being "tested".
> > 
> > So really, it sounds like a low hanging fruit: we don't really need
> > to write much more testing code code nor do we have to refactor
> > existing test suites. We just need to make sure the right tests are
> > running on stable kernels. I really want to clarify what each
> > subsystem sees as "sufficient" (and have that documented
> > somewhere).
> 
> kernel.ci and 0-day and Linaro are starting to add the fs and mm
> tests to their test suites to address these issues (I think 0-day
> already has many of them).  So this is happening, but not quite
> obvious.  I know I keep asking Linaro about this :(

0day has xfstests at least, but it's opt-in only (you have to request
that it be run on your trees).  When I did it for the SCSI tree, I had
to email Fenguangg directly, there wasn't any other way of getting it.

James


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [LSF/MM TOPIC] FS, MM, and stable trees
  2019-02-13 20:14               ` James Bottomley
@ 2019-02-15  1:50                 ` Sasha Levin
  2019-02-15  2:48                   ` James Bottomley
  0 siblings, 1 reply; 55+ messages in thread
From: Sasha Levin @ 2019-02-15  1:50 UTC (permalink / raw)
  To: James Bottomley
  Cc: Greg KH, Amir Goldstein, Steve French, lsf-pc, linux-fsdevel,
	linux-mm, LKML, Luis R. Rodriguez

On Wed, Feb 13, 2019 at 12:14:35PM -0800, James Bottomley wrote:
>On Wed, 2019-02-13 at 20:52 +0100, Greg KH wrote:
>> On Wed, Feb 13, 2019 at 02:25:12PM -0500, Sasha Levin wrote:
>> > On Wed, Feb 13, 2019 at 10:18:03AM +0100, Greg KH wrote:
>> > > On Wed, Feb 13, 2019 at 11:01:25AM +0200, Amir Goldstein wrote:
>> > > > Best effort testing in timely manner is good, but a good way to
>> > > > improve confidence in stable kernel releases is a publicly
>> > > > available list of tests that the release went through.
>> > >
>> > > We have that, you aren't noticing them...
>> >
>> > This is one of the biggest things I want to address: there is a
>> > disconnect between the stable kernel testing story and the tests
>> > the fs/ and mm/ folks expect to see here.
>> >
>> > On one had, the stable kernel folks see these kernels go through
>> > entire suites of testing by multiple individuals and organizations,
>> > receiving way more coverage than any of Linus's releases.
>> >
>> > On the other hand, things like LTP and selftests tend to barely
>> > scratch the surface of our mm/ and fs/ code, and the maintainers of
>> > these subsystems do not see LTP-like suites as something that adds
>> > significant value and ignore them. Instead, they have a
>> > (convoluted) set of testing they do with different tools and
>> > configurations that qualifies their code as being "tested".
>> >
>> > So really, it sounds like a low hanging fruit: we don't really need
>> > to write much more testing code code nor do we have to refactor
>> > existing test suites. We just need to make sure the right tests are
>> > running on stable kernels. I really want to clarify what each
>> > subsystem sees as "sufficient" (and have that documented
>> > somewhere).
>>
>> kernel.ci and 0-day and Linaro are starting to add the fs and mm
>> tests to their test suites to address these issues (I think 0-day
>> already has many of them).  So this is happening, but not quite
>> obvious.  I know I keep asking Linaro about this :(
>
>0day has xfstests at least, but it's opt-in only (you have to request
>that it be run on your trees).  When I did it for the SCSI tree, I had
>to email Fenguangg directly, there wasn't any other way of getting it.

It's very tricky to do even if someone would just run it. I worked with
the xfs folks for quite a while to gather the various configs they want
to use, and to establish the baseline for a few of the stable trees
(some tests are know to fail, etc).

So just running xfstests "blindly" doesn't add much value beyond ltp I
think.

--
Thanks,
Sasha

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [LSF/MM TOPIC] FS, MM, and stable trees
  2019-02-15  1:50                 ` Sasha Levin
@ 2019-02-15  2:48                   ` James Bottomley
  2019-02-16 18:28                     ` Theodore Y. Ts'o
  0 siblings, 1 reply; 55+ messages in thread
From: James Bottomley @ 2019-02-15  2:48 UTC (permalink / raw)
  To: Sasha Levin
  Cc: Greg KH, Amir Goldstein, Steve French, lsf-pc, linux-fsdevel,
	linux-mm, LKML, Luis R. Rodriguez

On Thu, 2019-02-14 at 20:50 -0500, Sasha Levin wrote:
> On Wed, Feb 13, 2019 at 12:14:35PM -0800, James Bottomley wrote:
> > On Wed, 2019-02-13 at 20:52 +0100, Greg KH wrote:
> > > On Wed, Feb 13, 2019 at 02:25:12PM -0500, Sasha Levin wrote:
> > > > On Wed, Feb 13, 2019 at 10:18:03AM +0100, Greg KH wrote:
> > > > > On Wed, Feb 13, 2019 at 11:01:25AM +0200, Amir Goldstein
> > > > > wrote:
> > > > > > Best effort testing in timely manner is good, but a good
> > > > > > way to improve confidence in stable kernel releases is a
> > > > > > publicly available list of tests that the release went
> > > > > > through.
> > > > > 
> > > > > We have that, you aren't noticing them...
> > > > 
> > > > This is one of the biggest things I want to address: there is a
> > > > disconnect between the stable kernel testing story and the
> > > > tests the fs/ and mm/ folks expect to see here.
> > > > 
> > > > On one had, the stable kernel folks see these kernels go
> > > > through entire suites of testing by multiple individuals and
> > > > organizations, receiving way more coverage than any of Linus's
> > > > releases.
> > > > 
> > > > On the other hand, things like LTP and selftests tend to barely
> > > > scratch the surface of our mm/ and fs/ code, and the
> > > > maintainers of these subsystems do not see LTP-like suites as
> > > > something that adds significant value and ignore them. Instead,
> > > > they have a (convoluted) set of testing they do with different
> > > > tools and configurations that qualifies their code as being
> > > > "tested".
> > > > 
> > > > So really, it sounds like a low hanging fruit: we don't really
> > > > need to write much more testing code code nor do we have to
> > > > refactor existing test suites. We just need to make sure the
> > > > right tests are running on stable kernels. I really want to
> > > > clarify what each subsystem sees as "sufficient" (and have that
> > > > documented somewhere).
> > > 
> > > kernel.ci and 0-day and Linaro are starting to add the fs and mm
> > > tests to their test suites to address these issues (I think 0-day
> > > already has many of them).  So this is happening, but not quite
> > > obvious.  I know I keep asking Linaro about this :(
> > 
> > 0day has xfstests at least, but it's opt-in only (you have to
> > request that it be run on your trees).  When I did it for the SCSI
> > tree, I had to email Fenguangg directly, there wasn't any other way
> > of getting it.
> 
> It's very tricky to do even if someone would just run it.

It is?  It's a test suite, so you just run it and it exercises standard
and growing set of regression tests.

>  I worked with the xfs folks for quite a while to gather the various
> configs they want to use, and to establish the baseline for a few of
> the stable trees (some tests are know to fail, etc).

The only real config issue is per-fs non-standard tests (features
specific to a given filesystem).  I just want it to exercise the
storage underneath, so the SCSI tree is configured for the default set
on xfs.

> So just running xfstests "blindly" doesn't add much value beyond ltp
> I think.

Well, we differ on the value of running regression tests, then.  The
whole point of a test infrastructure is that it's simple to run 'make
check' in autoconf parlance.  xfstests does provide a useful baseline
set of regression tests.  However, since my goal is primarily to detect
problems in the storage path rather than the filesystem, the utility is
exercising that path, although I fully appreciate that filesystem
regression tests aren't going to catch every SCSI issue, they do
provide some level of assurance against bugs.

Hopefully we can switch over to blktests when it's ready, but in the
meantime xfstests is way better than nothing.

James


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [LSF/MM TOPIC] FS, MM, and stable trees
  2019-02-15  2:48                   ` James Bottomley
@ 2019-02-16 18:28                     ` Theodore Y. Ts'o
  2019-02-21 15:34                       ` Luis Chamberlain
  0 siblings, 1 reply; 55+ messages in thread
From: Theodore Y. Ts'o @ 2019-02-16 18:28 UTC (permalink / raw)
  To: James Bottomley
  Cc: Sasha Levin, Greg KH, Amir Goldstein, Steve French, lsf-pc,
	linux-fsdevel, linux-mm, LKML, Luis R. Rodriguez

On Thu, Feb 14, 2019 at 06:48:22PM -0800, James Bottomley wrote:
> Well, we differ on the value of running regression tests, then.  The
> whole point of a test infrastructure is that it's simple to run 'make
> check' in autoconf parlance.  xfstests does provide a useful baseline
> set of regression tests.  However, since my goal is primarily to detect
> problems in the storage path rather than the filesystem, the utility is
> exercising that path, although I fully appreciate that filesystem
> regression tests aren't going to catch every SCSI issue, they do
> provide some level of assurance against bugs.
> 
> Hopefully we can switch over to blktests when it's ready, but in the
> meantime xfstests is way better than nothing.

blktests isn't yet comprehensive, but I think there's value in running
blktests as well as xfstests.  I've been integrating blktests into
{kvm,gce}-xfstets because if the problem is caused to some regression
introduced in the block layer, I'm not wasting time trying to figure
out if it's caused by the block layer or not.  It won't catch
everything, but at least it has some value...

The block/*, loop/* and scsi/* tests in blktests do seem to be in
pretty good shape.  The nvme, nvmeof, and srp tests are *definitely*
not as mature.

				- Ted

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [LSF/MM TOPIC] FS, MM, and stable trees
  2019-02-16 18:28                     ` Theodore Y. Ts'o
@ 2019-02-21 15:34                       ` Luis Chamberlain
  2019-02-21 18:52                         ` Theodore Y. Ts'o
  0 siblings, 1 reply; 55+ messages in thread
From: Luis Chamberlain @ 2019-02-21 15:34 UTC (permalink / raw)
  To: Theodore Y. Ts'o, James Bottomley, Sasha Levin, Greg KH,
	Amir Goldstein, Steve French, lsf-pc, linux-fsdevel, linux-mm,
	LKML

On Sat, Feb 16, 2019 at 01:28:35PM -0500, Theodore Y. Ts'o wrote:
> The block/*, loop/* and scsi/* tests in blktests do seem to be in
> pretty good shape.  The nvme, nvmeof, and srp tests are *definitely*
> not as mature.

Can you say more about this later part. What would you like to see more
of for nvme tests for instance?

It sounds like a productive session would include tracking our:

  a) sour spots
  b) who's already working on these
  c) gather volutneers for these sour spots

 Luis

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [LSF/MM TOPIC] FS, MM, and stable trees
  2019-02-21 15:34                       ` Luis Chamberlain
@ 2019-02-21 18:52                         ` Theodore Y. Ts'o
  0 siblings, 0 replies; 55+ messages in thread
From: Theodore Y. Ts'o @ 2019-02-21 18:52 UTC (permalink / raw)
  To: Luis Chamberlain
  Cc: James Bottomley, Sasha Levin, Greg KH, Amir Goldstein,
	Steve French, lsf-pc, linux-fsdevel, linux-mm, LKML

On Thu, Feb 21, 2019 at 07:34:15AM -0800, Luis Chamberlain wrote:
> On Sat, Feb 16, 2019 at 01:28:35PM -0500, Theodore Y. Ts'o wrote:
> > The block/*, loop/* and scsi/* tests in blktests do seem to be in
> > pretty good shape.  The nvme, nvmeof, and srp tests are *definitely*
> > not as mature.
> 
> Can you say more about this later part. What would you like to see more
> of for nvme tests for instance?
> 
> It sounds like a productive session would include tracking our:
> 
>   a) sour spots
>   b) who's already working on these
>   c) gather volutneers for these sour spots

I complained on another LSF/MM topic thread, but there are a lot of
failures where it's not clear whether it's because I guessed
incorrectly about which version of nvme-cli I should be using (debian
stable and head of nvme-cli both are apparently wrong answers), or
kernel bugs or kernel misconfiguration issues on my side.

Current nvme/* failures that I'm still seeing are attached below.

	       		     	       - Ted

nvme/012 (run mkfs and data verification fio job on NVMeOF block device-backed ns) [failed]
    runtime  ...  100.265s
    something found in dmesg:
    [ 1857.188083] run blktests nvme/012 at 2019-02-12 01:11:33
    [ 1857.437322] nvmet: adding nsid 1 to subsystem blktests-subsystem-1
    [ 1857.456187] nvmet: creating controller 1 for subsystem blktests-subsystem-1 for NQN nqn.2014-08.org.nvmexpress:uuid:78dc695a-2c99-4841-968d-c2c16a49a02a.
    [ 1857.458162] nvme nvme0: ANA group 1: optimized.
    [ 1857.458257] nvme nvme0: creating 2 I/O queues.
    [ 1857.460893] nvme nvme0: new ctrl: "blktests-subsystem-1"
    
    [ 1857.720666] ============================================
    [ 1857.726308] WARNING: possible recursive locking detected
    [ 1857.731784] 5.0.0-rc3-xfstests-00014-g1236f7d60242 #843 Not tainted
    ...
    (See '/results/nodev/nvme/012.dmesg' for the entire message)
nvme/013 (run mkfs and data verification fio job on NVMeOF file-backed ns) [failed]
    runtime  ...  32.634s
    --- tests/nvme/013.out	2019-02-11 18:57:39.000000000 -0500
    +++ /results/nodev/nvme/013.out.bad	2019-02-12 01:13:46.708757206 -0500
    @@ -1,5 +1,9 @@
     Running nvme/013
     91fdba0d-f87b-4c25-b80f-db7be1418b9e
     uuid.91fdba0d-f87b-4c25-b80f-db7be1418b9e
    +fio: io_u error on file /mnt/blktests///verify.0.0: Input/output error: write offset=329326592, buflen=4096
    +fio: io_u error on file /mnt/blktests///verify.0.0: Input/output error: write offset=467435520, buflen=4096
    +fio exited with status 0
    +4;fio-3.2;verify;0;5;0;0;0;0;0;0;0.000000;0.000000;0;0;0.000000;0.000000;1.000000%=0;5.000000%=0;10.000000%=0;20.000000%=0;30.000000%=0;40.000000%=0;50.000000%=0;60.000000%=0;70.000000%=0;80.000000%=0;90.000000%=0;95.000000%=0;99.000000%=0;99.500000%=0;99.900000%=0;99.950000%=0;99.990000%=0;0%=0;0%=0;0%=0;0;0;0.000000;0.000000;0;0;0.000000%;0.000000;0.000000;192672;6182;1546;31166;4;9044;63.332763;57.979218;482;29948;10268.332290;1421.459893;1.000000%=4145;5.000000%=9109;10.000000%=9502;20.000000%=9764;30.000000%=10027;40.000000%=10289;50.000000%=10420;60.000000%=10551;70.000000%=10682;80.000000%=10682;90.000000%=10944;95.000000%=11206;99.000000%=13172;99.500000%=16318;99.900000%=24510;99.950000%=27394;99.990000%=29229;0%=0;0%=0;0%=0;507;30005;10331.973087;1421.131712;6040;8232;100.000000%;6189.741935;267.091495;0;0;0;0;0;0;0.000000;0.000000;0;0;0.000000;0.000000;1.000000%=0;5.000000%=0;10.000000%=0;20.000000%=0;30.000000%=0;40.000000%=0;50.000000%=0;60.000000%=0;70.000000%=0;80.000000%=0;90.000000%=0;95.000000%=0;99.000000%=0;99.500000%=0;99.900000%=0;99.950000%=0;99.990000%=0;0%=0;0%=0;0%=0;0;0;0.000000;0.000000;0;0;0.000000%;0.000000;0.000000;3.991657%;6.484839%;91142;0;1024;0.1%;0.1%;0.1%;0.1%;100.0%;0.0%;0.0%;0.00%;0.00%;0.00%;0.00%;0.00%;0.00%;0.00%;0.01%;0.20%;0.34%;0.25%;0.19%;27.86%;70.89%;0.23%;0.00%;0.00%;0.00%;0.00%;0.00%;0.00%;0.00%;nvme0n1;0;0;0;0;0;0;0;0.00%
    ...
    (Run 'diff -u tests/nvme/013.out /results/nodev/nvme/013.out.bad' to see the entire diff)
nvme/015 (unit test for NVMe flush for file backed ns)       [failed]
    runtime  ...  8.914s
    --- tests/nvme/015.out	2019-02-11 18:57:39.000000000 -0500
    +++ /results/nodev/nvme/015.out.bad	2019-02-12 01:14:05.429328259 -0500
    @@ -1,6 +1,6 @@
     Running nvme/015
     91fdba0d-f87b-4c25-b80f-db7be1418b9e
     uuid.91fdba0d-f87b-4c25-b80f-db7be1418b9e
    -NVMe Flush: success
    +NVME IO command error:INTERNAL: The command was not completed successfully due to an internal error(6006)
     NQN:blktests-subsystem-1 disconnected 1 controller(s)
     Test complete
nvme/016 (create/delete many NVMeOF block device-backed ns and test discovery)
    runtime  ...
nvme/016 (create/delete many NVMeOF block device-backed ns and test discovery) [failed]
    runtime  ...  23.576s
    --- tests/nvme/016.out	2019-02-11 18:57:39.000000000 -0500
    +++ /results/nodev/nvme/016.out.bad	2019-02-12 01:14:29.173378854 -0500
    @@ -1,11 +1,11 @@
     Running nvme/016
     
    -Discovery Log Number of Records 1, Generation counter 1
    +Discovery Log Number of Records 1, Generation counter 5
     =====Discovery Log Entry 0======
     trtype:  loop
     adrfam:  pci
    ...
    (Run 'diff -u tests/nvme/016.out /results/nodev/nvme/016.out.bad' to see the entire diff)
nvme/017 (create/delete many file-ns and test discovery)    
    runtime  ...
nvme/017 (create/delete many file-ns and test discovery)     [failed]
    runtime  ...  23.592s
    --- tests/nvme/017.out	2019-02-11 18:57:39.000000000 -0500
    +++ /results/nodev/nvme/017.out.bad	2019-02-12 01:14:52.880762691 -0500
    @@ -1,11 +1,11 @@
     Running nvme/017
     
    -Discovery Log Number of Records 1, Generation counter 1
    +Discovery Log Number of Records 1, Generation counter 2
     =====Discovery Log Entry 0======
     trtype:  loop
     adrfam:  pci
    ...
    (Run 'diff -u tests/nvme/017.out /results/nodev/nvme/017.out.bad' to see the entire diff)

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [LSF/MM TOPIC] FS, MM, and stable trees
  2019-02-13 19:52             ` Greg KH
  2019-02-13 20:14               ` James Bottomley
@ 2019-03-20  3:46               ` Jon Masters
  2019-03-20  5:06                 ` Greg KH
  1 sibling, 1 reply; 55+ messages in thread
From: Jon Masters @ 2019-03-20  3:46 UTC (permalink / raw)
  To: Greg KH, Sasha Levin
  Cc: Amir Goldstein, Steve French, lsf-pc, linux-fsdevel, linux-mm,
	LKML, Luis R. Rodriguez

On 2/13/19 2:52 PM, Greg KH wrote:
> On Wed, Feb 13, 2019 at 02:25:12PM -0500, Sasha Levin wrote:

>> So really, it sounds like a low hanging fruit: we don't really need to
>> write much more testing code code nor do we have to refactor existing
>> test suites. We just need to make sure the right tests are running on
>> stable kernels. I really want to clarify what each subsystem sees as
>> "sufficient" (and have that documented somewhere).
> 
> kernel.ci and 0-day and Linaro are starting to add the fs and mm tests
> to their test suites to address these issues (I think 0-day already has
> many of them).  So this is happening, but not quite obvious.  I know I
> keep asking Linaro about this :(

We're working on investments for LDCG[0] in 2019 that include kernel CI
changes for server use cases. Please keep us informed of what you folks
ultimately want to see, and I'll pass on to the steering committee too.

Ultimately I've been pushing for a kernel 0-day project for Arm. That's
probably going to require a lot of duplicated effort since the original
0-day project isn't open, but creating an open one could help everyone.

Jon.

[0] Linaro DataCenter Group (formerly "LEG")

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [LSF/MM TOPIC] FS, MM, and stable trees
  2019-03-20  3:46               ` Jon Masters
@ 2019-03-20  5:06                 ` Greg KH
  2019-03-20  6:14                   ` Jon Masters
  0 siblings, 1 reply; 55+ messages in thread
From: Greg KH @ 2019-03-20  5:06 UTC (permalink / raw)
  To: Jon Masters
  Cc: Sasha Levin, Amir Goldstein, Steve French, lsf-pc, linux-fsdevel,
	linux-mm, LKML, Luis R. Rodriguez

On Tue, Mar 19, 2019 at 11:46:09PM -0400, Jon Masters wrote:
> On 2/13/19 2:52 PM, Greg KH wrote:
> > On Wed, Feb 13, 2019 at 02:25:12PM -0500, Sasha Levin wrote:
> 
> >> So really, it sounds like a low hanging fruit: we don't really need to
> >> write much more testing code code nor do we have to refactor existing
> >> test suites. We just need to make sure the right tests are running on
> >> stable kernels. I really want to clarify what each subsystem sees as
> >> "sufficient" (and have that documented somewhere).
> > 
> > kernel.ci and 0-day and Linaro are starting to add the fs and mm tests
> > to their test suites to address these issues (I think 0-day already has
> > many of them).  So this is happening, but not quite obvious.  I know I
> > keep asking Linaro about this :(
> 
> We're working on investments for LDCG[0] in 2019 that include kernel CI
> changes for server use cases. Please keep us informed of what you folks
> ultimately want to see, and I'll pass on to the steering committee too.
> 
> Ultimately I've been pushing for a kernel 0-day project for Arm. That's
> probably going to require a lot of duplicated effort since the original
> 0-day project isn't open, but creating an open one could help everyone.

Why are you trying to duplicate it on your own?  That's what kernel.ci
should be doing, please join in and invest in that instead.  It's an
open source project with its own governance and needs sponsors, why
waste time and money doing it all on your own?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [LSF/MM TOPIC] FS, MM, and stable trees
  2019-03-20  5:06                 ` Greg KH
@ 2019-03-20  6:14                   ` Jon Masters
  2019-03-20  6:28                     ` Greg KH
  0 siblings, 1 reply; 55+ messages in thread
From: Jon Masters @ 2019-03-20  6:14 UTC (permalink / raw)
  To: Greg KH
  Cc: Sasha Levin, Amir Goldstein, Steve French, lsf-pc, linux-fsdevel,
	linux-mm, LKML, Luis R. Rodriguez

On 3/20/19 1:06 AM, Greg KH wrote:
> On Tue, Mar 19, 2019 at 11:46:09PM -0400, Jon Masters wrote:
>> On 2/13/19 2:52 PM, Greg KH wrote:
>>> On Wed, Feb 13, 2019 at 02:25:12PM -0500, Sasha Levin wrote:
>>
>>>> So really, it sounds like a low hanging fruit: we don't really need to
>>>> write much more testing code code nor do we have to refactor existing
>>>> test suites. We just need to make sure the right tests are running on
>>>> stable kernels. I really want to clarify what each subsystem sees as
>>>> "sufficient" (and have that documented somewhere).
>>>
>>> kernel.ci and 0-day and Linaro are starting to add the fs and mm tests
>>> to their test suites to address these issues (I think 0-day already has
>>> many of them).  So this is happening, but not quite obvious.  I know I
>>> keep asking Linaro about this :(
>>
>> We're working on investments for LDCG[0] in 2019 that include kernel CI
>> changes for server use cases. Please keep us informed of what you folks
>> ultimately want to see, and I'll pass on to the steering committee too.
>>
>> Ultimately I've been pushing for a kernel 0-day project for Arm. That's
>> probably going to require a lot of duplicated effort since the original
>> 0-day project isn't open, but creating an open one could help everyone.
> 
> Why are you trying to duplicate it on your own?  That's what kernel.ci
> should be doing, please join in and invest in that instead.  It's an
> open source project with its own governance and needs sponsors, why
> waste time and money doing it all on your own?

To clarify, I'm pushing for investment in kernel.ci to achieve that goal
that it could provide the same 0-day capability for Arm and others.
It'll ultimately result in duplicated effort vs if 0-day were open.

Jon.

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [LSF/MM TOPIC] FS, MM, and stable trees
  2019-03-20  6:14                   ` Jon Masters
@ 2019-03-20  6:28                     ` Greg KH
  2019-03-20  6:32                       ` Jon Masters
  0 siblings, 1 reply; 55+ messages in thread
From: Greg KH @ 2019-03-20  6:28 UTC (permalink / raw)
  To: Jon Masters
  Cc: Sasha Levin, Amir Goldstein, Steve French, lsf-pc, linux-fsdevel,
	linux-mm, LKML, Luis R. Rodriguez

On Wed, Mar 20, 2019 at 02:14:09AM -0400, Jon Masters wrote:
> On 3/20/19 1:06 AM, Greg KH wrote:
> > On Tue, Mar 19, 2019 at 11:46:09PM -0400, Jon Masters wrote:
> >> On 2/13/19 2:52 PM, Greg KH wrote:
> >>> On Wed, Feb 13, 2019 at 02:25:12PM -0500, Sasha Levin wrote:
> >>
> >>>> So really, it sounds like a low hanging fruit: we don't really need to
> >>>> write much more testing code code nor do we have to refactor existing
> >>>> test suites. We just need to make sure the right tests are running on
> >>>> stable kernels. I really want to clarify what each subsystem sees as
> >>>> "sufficient" (and have that documented somewhere).
> >>>
> >>> kernel.ci and 0-day and Linaro are starting to add the fs and mm tests
> >>> to their test suites to address these issues (I think 0-day already has
> >>> many of them).  So this is happening, but not quite obvious.  I know I
> >>> keep asking Linaro about this :(
> >>
> >> We're working on investments for LDCG[0] in 2019 that include kernel CI
> >> changes for server use cases. Please keep us informed of what you folks
> >> ultimately want to see, and I'll pass on to the steering committee too.
> >>
> >> Ultimately I've been pushing for a kernel 0-day project for Arm. That's
> >> probably going to require a lot of duplicated effort since the original
> >> 0-day project isn't open, but creating an open one could help everyone.
> > 
> > Why are you trying to duplicate it on your own?  That's what kernel.ci
> > should be doing, please join in and invest in that instead.  It's an
> > open source project with its own governance and needs sponsors, why
> > waste time and money doing it all on your own?
> 
> To clarify, I'm pushing for investment in kernel.ci to achieve that goal
> that it could provide the same 0-day capability for Arm and others.

Great, that's what I was trying to suggest :)

> It'll ultimately result in duplicated effort vs if 0-day were open.

"Half" of 0-day is open, but it's that other half that is still
needed...

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [LSF/MM TOPIC] FS, MM, and stable trees
  2019-03-20  6:28                     ` Greg KH
@ 2019-03-20  6:32                       ` Jon Masters
  0 siblings, 0 replies; 55+ messages in thread
From: Jon Masters @ 2019-03-20  6:32 UTC (permalink / raw)
  To: Greg KH
  Cc: Sasha Levin, Amir Goldstein, Steve French, lsf-pc, linux-fsdevel,
	linux-mm, LKML, Luis R. Rodriguez

On 3/20/19 2:28 AM, Greg KH wrote:
> On Wed, Mar 20, 2019 at 02:14:09AM -0400, Jon Masters wrote:
>> On 3/20/19 1:06 AM, Greg KH wrote:
>>> On Tue, Mar 19, 2019 at 11:46:09PM -0400, Jon Masters wrote:
>>>> On 2/13/19 2:52 PM, Greg KH wrote:
>>>>> On Wed, Feb 13, 2019 at 02:25:12PM -0500, Sasha Levin wrote:
>>>>
>>>>>> So really, it sounds like a low hanging fruit: we don't really need to
>>>>>> write much more testing code code nor do we have to refactor existing
>>>>>> test suites. We just need to make sure the right tests are running on
>>>>>> stable kernels. I really want to clarify what each subsystem sees as
>>>>>> "sufficient" (and have that documented somewhere).
>>>>>
>>>>> kernel.ci and 0-day and Linaro are starting to add the fs and mm tests
>>>>> to their test suites to address these issues (I think 0-day already has
>>>>> many of them).  So this is happening, but not quite obvious.  I know I
>>>>> keep asking Linaro about this :(
>>>>
>>>> We're working on investments for LDCG[0] in 2019 that include kernel CI
>>>> changes for server use cases. Please keep us informed of what you folks
>>>> ultimately want to see, and I'll pass on to the steering committee too.
>>>>
>>>> Ultimately I've been pushing for a kernel 0-day project for Arm. That's
>>>> probably going to require a lot of duplicated effort since the original
>>>> 0-day project isn't open, but creating an open one could help everyone.
>>>
>>> Why are you trying to duplicate it on your own?  That's what kernel.ci
>>> should be doing, please join in and invest in that instead.  It's an
>>> open source project with its own governance and needs sponsors, why
>>> waste time and money doing it all on your own?
>>
>> To clarify, I'm pushing for investment in kernel.ci to achieve that goal
>> that it could provide the same 0-day capability for Arm and others.
> 
> Great, that's what I was trying to suggest :)
> 
>> It'll ultimately result in duplicated effort vs if 0-day were open.
> 
> "Half" of 0-day is open, but it's that other half that is still
> needed...

;) I'm hoping this might also help that to happen...

Best,

Jon.

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [LSF/MM TOPIC] FS, MM, and stable trees
  2019-02-12 17:00 [LSF/MM TOPIC] FS, MM, and stable trees Sasha Levin
  2019-02-12 21:32 ` Steve French
@ 2022-03-08  9:32 ` Amir Goldstein
  2022-03-08 10:08   ` Greg KH
                     ` (2 more replies)
  1 sibling, 3 replies; 55+ messages in thread
From: Amir Goldstein @ 2022-03-08  9:32 UTC (permalink / raw)
  To: Sasha Levin
  Cc: lsf-pc, linux-fsdevel, Jan Kara, Theodore Tso, Darrick J. Wong,
	Josef Bacik, Luis R. Rodriguez, Matthew Wilcox, Greg KH

On Tue, Feb 12, 2019 at 7:31 PM Sasha Levin <sashal@kernel.org> wrote:
>
> Hi all,
>
> I'd like to propose a discussion about the workflow of the stable trees
> when it comes to fs/ and mm/. In the past year we had some friction with
> regards to the policies and the procedures around picking patches for
> stable tree, and I feel it would be very useful to establish better flow
> with the folks who might be attending LSF/MM.
>
> I feel that fs/ and mm/ are in very different places with regards to
> which patches go in -stable, what tests are expected, and the timeline
> of patches from the point they are proposed on a mailing list to the
> point they are released in a stable tree. Therefore, I'd like to propose
> two different sessions on this (one for fs/ and one for mm/), as a
> common session might be less conductive to agreeing on a path forward as
> the starting point for both subsystems are somewhat different.
>
> We can go through the existing processes, automation, and testing
> mechanisms we employ when building stable trees, and see how we can
> improve these to address the concerns of fs/ and mm/ folks.
>

Hi Sasha,

I think it would be interesting to have another discussion on the state of fs/
in -stable and see if things have changed over the past couple of years.
If you do not plan to attend LSF/MM in person, perhaps you will be able to
join this discussion remotely?

From what I can see, the flow of ext4/btrfs patches into -stable still looks
a lot healthier than the flow of xfs patches into -stable.

In 2019, Luis started an effort to improve this situation (with some
assistance from me and you) that ended up with several submissions
of stable patches for v4.19.y, but did not continue beyond 2019.

When one looks at xfstest bug reports on the list for xfs on kernels > v4.19
one has to wonder if using xfs on kernels v5.x.y is a wise choice.

Which makes me wonder: how do the distro kernel maintainers keep up
with xfs fixes?

Many of the developers on CC of this message are involved in development
of a distro kernel (at least being consulted with), so I would be very much
interested to know how and if this issue is being dealt with.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [LSF/MM TOPIC] FS, MM, and stable trees
  2022-03-08  9:32 ` Amir Goldstein
@ 2022-03-08 10:08   ` Greg KH
  2022-03-08 11:04     ` Amir Goldstein
  2022-03-08 16:40     ` Theodore Ts'o
  2022-03-08 10:54   ` Jan Kara
  2022-03-09  0:02   ` Dave Chinner
  2 siblings, 2 replies; 55+ messages in thread
From: Greg KH @ 2022-03-08 10:08 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Sasha Levin, lsf-pc, linux-fsdevel, Jan Kara, Theodore Tso,
	Darrick J. Wong, Josef Bacik, Luis R. Rodriguez, Matthew Wilcox

On Tue, Mar 08, 2022 at 11:32:43AM +0200, Amir Goldstein wrote:
> On Tue, Feb 12, 2019 at 7:31 PM Sasha Levin <sashal@kernel.org> wrote:
> >
> > Hi all,
> >
> > I'd like to propose a discussion about the workflow of the stable trees
> > when it comes to fs/ and mm/. In the past year we had some friction with
> > regards to the policies and the procedures around picking patches for
> > stable tree, and I feel it would be very useful to establish better flow
> > with the folks who might be attending LSF/MM.
> >
> > I feel that fs/ and mm/ are in very different places with regards to
> > which patches go in -stable, what tests are expected, and the timeline
> > of patches from the point they are proposed on a mailing list to the
> > point they are released in a stable tree. Therefore, I'd like to propose
> > two different sessions on this (one for fs/ and one for mm/), as a
> > common session might be less conductive to agreeing on a path forward as
> > the starting point for both subsystems are somewhat different.
> >
> > We can go through the existing processes, automation, and testing
> > mechanisms we employ when building stable trees, and see how we can
> > improve these to address the concerns of fs/ and mm/ folks.
> >
> 
> Hi Sasha,
> 
> I think it would be interesting to have another discussion on the state of fs/
> in -stable and see if things have changed over the past couple of years.
> If you do not plan to attend LSF/MM in person, perhaps you will be able to
> join this discussion remotely?
> 
> >From what I can see, the flow of ext4/btrfs patches into -stable still looks
> a lot healthier than the flow of xfs patches into -stable.

That is explicitly because the ext4/btrfs developers/maintainers are
marking patches for stable backports, while the xfs
developers/maintainers are not.

It has nothing to do with how me and Sasha are working, so go take this
up with the fs developers :)

> In 2019, Luis started an effort to improve this situation (with some
> assistance from me and you) that ended up with several submissions
> of stable patches for v4.19.y, but did not continue beyond 2019.
> 
> When one looks at xfstest bug reports on the list for xfs on kernels > v4.19
> one has to wonder if using xfs on kernels v5.x.y is a wise choice.

That's up to the xfs maintainers to discuss.

> Which makes me wonder: how do the distro kernel maintainers keep up
> with xfs fixes?

Who knows, ask the distro maintainers that use xfs.  What do they do?

The xfs developers/maintainer told us (Sasha and I) to not cherry-pick
any xfs "fixes:" patches to the stable trees unless they explicitly
marked it for stable.  So there's not much we can do here about this
without their involvement as I do not want to ever route around an
active maintainer like that.

> Many of the developers on CC of this message are involved in development
> of a distro kernel (at least being consulted with), so I would be very much
> interested to know how and if this issue is being dealt with.

Maybe no distro cares about xfs?  :)

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [LSF/MM TOPIC] FS, MM, and stable trees
  2022-03-08  9:32 ` Amir Goldstein
  2022-03-08 10:08   ` Greg KH
@ 2022-03-08 10:54   ` Jan Kara
  2022-03-09  0:02   ` Dave Chinner
  2 siblings, 0 replies; 55+ messages in thread
From: Jan Kara @ 2022-03-08 10:54 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Sasha Levin, lsf-pc, linux-fsdevel, Jan Kara, Theodore Tso,
	Darrick J. Wong, Josef Bacik, Luis R. Rodriguez, Matthew Wilcox,
	Greg KH

On Tue 08-03-22 11:32:43, Amir Goldstein wrote:
> On Tue, Feb 12, 2019 at 7:31 PM Sasha Levin <sashal@kernel.org> wrote:
> >
> > Hi all,
> >
> > I'd like to propose a discussion about the workflow of the stable trees
> > when it comes to fs/ and mm/. In the past year we had some friction with
> > regards to the policies and the procedures around picking patches for
> > stable tree, and I feel it would be very useful to establish better flow
> > with the folks who might be attending LSF/MM.
> >
> > I feel that fs/ and mm/ are in very different places with regards to
> > which patches go in -stable, what tests are expected, and the timeline
> > of patches from the point they are proposed on a mailing list to the
> > point they are released in a stable tree. Therefore, I'd like to propose
> > two different sessions on this (one for fs/ and one for mm/), as a
> > common session might be less conductive to agreeing on a path forward as
> > the starting point for both subsystems are somewhat different.
> >
> > We can go through the existing processes, automation, and testing
> > mechanisms we employ when building stable trees, and see how we can
> > improve these to address the concerns of fs/ and mm/ folks.
> >
> 
> I think it would be interesting to have another discussion on the state of fs/
> in -stable and see if things have changed over the past couple of years.
> If you do not plan to attend LSF/MM in person, perhaps you will be able to
> join this discussion remotely?
> 
> From what I can see, the flow of ext4/btrfs patches into -stable still looks
> a lot healthier than the flow of xfs patches into -stable.
> 
> In 2019, Luis started an effort to improve this situation (with some
> assistance from me and you) that ended up with several submissions
> of stable patches for v4.19.y, but did not continue beyond 2019.
> 
> When one looks at xfstest bug reports on the list for xfs on kernels > v4.19
> one has to wonder if using xfs on kernels v5.x.y is a wise choice.
> 
> Which makes me wonder: how do the distro kernel maintainers keep up
> with xfs fixes?
> 
> Many of the developers on CC of this message are involved in development
> of a distro kernel (at least being consulted with), so I would be very much
> interested to know how and if this issue is being dealt with.

So I can explain for SUSE I guess. We generally don't use long-term stable
kernels for our distro releases (and short term -stable kernel is long
finished before it passes through our development cycle and is released as
an enterprise distro kernel - so e.g. base for the next SLES kernel is
5.14.21). We have infrastructure which tracks Fixes tags and also long-term
stable kernels and from that generates a feed of patches that may be
interesting for us to push into enterprise kernels. The feed is actually
split by kernel areas so you get patches in your area of expertise (so for
example I get this feed for ext4, udf, fs/notify, writeback, fs-mm
boundary, ...) Then we judge whether each patch makes sense for us or not,
backport what makes sense, run the result through tests (e.g. I usually do
general fstests runs and then some targetted testing if needed for
peculiar bugs) and push it out. The result then goes through another round
of general QA testing before it gets released.

So for XFS in particular we do carry somewhat more patches than stable
trees.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [LSF/MM TOPIC] FS, MM, and stable trees
  2022-03-08 10:08   ` Greg KH
@ 2022-03-08 11:04     ` Amir Goldstein
  2022-03-08 15:42       ` Luis Chamberlain
  2022-03-08 19:06       ` Sasha Levin
  2022-03-08 16:40     ` Theodore Ts'o
  1 sibling, 2 replies; 55+ messages in thread
From: Amir Goldstein @ 2022-03-08 11:04 UTC (permalink / raw)
  To: Greg KH
  Cc: Sasha Levin, lsf-pc, linux-fsdevel, Jan Kara, Theodore Tso,
	Darrick J. Wong, Josef Bacik, Luis R. Rodriguez, Matthew Wilcox

On Tue, Mar 8, 2022 at 12:08 PM Greg KH <gregkh@linuxfoundation.org> wrote:
>
> On Tue, Mar 08, 2022 at 11:32:43AM +0200, Amir Goldstein wrote:
> > On Tue, Feb 12, 2019 at 7:31 PM Sasha Levin <sashal@kernel.org> wrote:
> > >
> > > Hi all,
> > >
> > > I'd like to propose a discussion about the workflow of the stable trees
> > > when it comes to fs/ and mm/. In the past year we had some friction with
> > > regards to the policies and the procedures around picking patches for
> > > stable tree, and I feel it would be very useful to establish better flow
> > > with the folks who might be attending LSF/MM.
> > >
> > > I feel that fs/ and mm/ are in very different places with regards to
> > > which patches go in -stable, what tests are expected, and the timeline
> > > of patches from the point they are proposed on a mailing list to the
> > > point they are released in a stable tree. Therefore, I'd like to propose
> > > two different sessions on this (one for fs/ and one for mm/), as a
> > > common session might be less conductive to agreeing on a path forward as
> > > the starting point for both subsystems are somewhat different.
> > >
> > > We can go through the existing processes, automation, and testing
> > > mechanisms we employ when building stable trees, and see how we can
> > > improve these to address the concerns of fs/ and mm/ folks.
> > >
> >
> > Hi Sasha,
> >
> > I think it would be interesting to have another discussion on the state of fs/
> > in -stable and see if things have changed over the past couple of years.
> > If you do not plan to attend LSF/MM in person, perhaps you will be able to
> > join this discussion remotely?
> >
> > >From what I can see, the flow of ext4/btrfs patches into -stable still looks
> > a lot healthier than the flow of xfs patches into -stable.
>
> That is explicitly because the ext4/btrfs developers/maintainers are
> marking patches for stable backports, while the xfs
> developers/maintainers are not.
>
> It has nothing to do with how me and Sasha are working,

Absolutely, I have no complaints to the stable maintainers here, just wanted
to get a status report from Sasha, because he did invest is growing the stable
tree xfstests coverage AFAIK, which should have addressed some of the
earlier concerns of xfs developers.

> so go take this up with the fs developers :)

It is easy to blame the "fs developers", but is it also very hard on an
overloaded maintainer to ALSO take care of GOOD stable tree updates.

Here is a model that seems to be working well for some subsystems:
When a tester/developer finds a bug they write an LTP test.
That LTP test gets run by stable kernel test bots and prompts action
from distros who now know of this issue and may invest resources
in backporting patches.

If I am seeing random developers reporting bugs from running xfstests
on stable kernels and I am not seeing the stable kernel test bots reporting
those bugs, then there may be room for improvement in the stable kernel
testing process??

> > In 2019, Luis started an effort to improve this situation (with some
> > assistance from me and you) that ended up with several submissions
> > of stable patches for v4.19.y, but did not continue beyond 2019.
> >
> > When one looks at xfstest bug reports on the list for xfs on kernels > v4.19
> > one has to wonder if using xfs on kernels v5.x.y is a wise choice.
>
> That's up to the xfs maintainers to discuss.
>
> > Which makes me wonder: how do the distro kernel maintainers keep up
> > with xfs fixes?
>
> Who knows, ask the distro maintainers that use xfs.  What do they do?
>

So here I am - asking them via proxy fs developers :)

> The xfs developers/maintainer told us (Sasha and I) to not cherry-pick
> any xfs "fixes:" patches to the stable trees unless they explicitly
> marked it for stable.  So there's not much we can do here about this
> without their involvement as I do not want to ever route around an
> active maintainer like that.
>
> > Many of the developers on CC of this message are involved in development
> > of a distro kernel (at least being consulted with), so I would be very much
> > interested to know how and if this issue is being dealt with.
>
> Maybe no distro cares about xfs?  :)

Some distros (Android) do not care about xfs.
Some distros have a business model to support xfs.
Many distros are still stuck on v4.19 and earlier.
Other distros may be blissfully ignorant about the situation.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [LSF/MM TOPIC] FS, MM, and stable trees
  2022-03-08 11:04     ` Amir Goldstein
@ 2022-03-08 15:42       ` Luis Chamberlain
  2022-03-08 19:06       ` Sasha Levin
  1 sibling, 0 replies; 55+ messages in thread
From: Luis Chamberlain @ 2022-03-08 15:42 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Greg KH, Sasha Levin, lsf-pc, linux-fsdevel, Jan Kara,
	Theodore Tso, Darrick J. Wong, Josef Bacik, Matthew Wilcox

On Tue, Mar 08, 2022 at 01:04:05PM +0200, Amir Goldstein wrote:
> On Tue, Mar 8, 2022 at 12:08 PM Greg KH <gregkh@linuxfoundation.org> wrote:
> > so go take this up with the fs developers :)
> 
> It is easy to blame the "fs developers", but is it also very hard on an
> overloaded maintainer to ALSO take care of GOOD stable tree updates.
> 
> Here is a model that seems to be working well for some subsystems:
> When a tester/developer finds a bug they write an LTP test.
> That LTP test gets run by stable kernel test bots and prompts action
> from distros who now know of this issue and may invest resources
> in backporting patches.
> 
> If I am seeing random developers reporting bugs from running xfstests
> on stable kernels and I am not seeing the stable kernel test bots reporting
> those bugs, then there may be room for improvement in the stable kernel
> testing process??

I have been investing huge amounts of time to improve this process, to
the point you can get fstests going and test against a known baseline
on kdevops [0] today with just the following 6 commands (and works with
different cloud providers, or local virtualized solutions):

make menuconfig
make
make bringup
make linux
make fstest
make fstest-baseline

The baseline is what still takes time to create, and so with that it
should in theory be possible to get the average Joe to at least help
start testing a filesystem easily. Patches welcomed.

In so far as actually getting more patches into stable for XFS, it
is just about doing the actual work of thorough review and then ensuring
it doesn't break the baseline. It does require time and effort,
but hopefully the above will help.

Do you have a series of stable candidates you'd like to review?

[0] https://github.com/mcgrof/kdevops

  Luis

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [LSF/MM TOPIC] FS, MM, and stable trees
  2022-03-08 10:08   ` Greg KH
  2022-03-08 11:04     ` Amir Goldstein
@ 2022-03-08 16:40     ` Theodore Ts'o
  2022-03-08 17:16       ` Amir Goldstein
                         ` (3 more replies)
  1 sibling, 4 replies; 55+ messages in thread
From: Theodore Ts'o @ 2022-03-08 16:40 UTC (permalink / raw)
  To: Greg KH
  Cc: Amir Goldstein, Sasha Levin, lsf-pc, linux-fsdevel, Jan Kara,
	Darrick J. Wong, Josef Bacik, Luis R. Rodriguez, Matthew Wilcox

On Tue, Mar 08, 2022 at 11:08:48AM +0100, Greg KH wrote:
> > When one looks at xfstest bug reports on the list for xfs on kernels > v4.19
> > one has to wonder if using xfs on kernels v5.x.y is a wise choice.
> 
> That's up to the xfs maintainers to discuss.
> 
> > Which makes me wonder: how do the distro kernel maintainers keep up
> > with xfs fixes?
> 
> Who knows, ask the distro maintainers that use xfs.  What do they do?

This is something which is being worked, so I'm not sure we'll need to
discuss the specifics of the xfs stable backports at LSF/MM.  I'm
hopeful that by May, we'll have come to some kind of resolution of
that topic.

One of my team members has been working with Darrick to set up a set
of xfs configs[1] recommended by Darrick, and she's stood up an
automated test spinner using gce-xfstests which can watch a git branch
and automatically kick off a set of tests whenever it is updated.
Sasha has also provided her with a copy of his scripts so we can do
automated cherry picks of commits with Fixes tags.  So the idea is
that we can, hopefully in a mostly completely automated fashion, do
the backports and do a full set of regression tests on those stable
backports of XFS bug fixes.

[1] https://github.com/tytso/xfstests-bld/tree/master/kvm-xfstests/test-appliance/files/root/fs/xfs/cfg

Next steps are to get a first tranche of cherry-picks for 5.10 and
probably 5.15, and use the test spinner to demonstrate that they don't
have any test regressions (if there are, we'll drop those commits).
Once we have a first set of proposed stable backports for XFS, we'll
present them to the XFS development community for their input.  There
are a couple of things that could happen at this point, depending on
what the XFS community is willing to accept.

The first is that we'll send these tested stable patches directly to
Greg and Sasha for inclusion in the LTS releases, with the linux-xfs
list cc'ed so they know what's going into the stable trees.

The second is that we send them only to the linux-xfs list, and they
have to do whatever approval they want before they go into the
upstream stable trees.

And the third option, if they aren't willing to take our work or they
choose to require manual approvals and those approvals are taking too
long, is that we'll feed the commits into Google's Container-Optimized
OS (COS) kernel, so that our customers can get those fixes and so we
can support XFS fully.  This isn't our preferred path; we'd prefer to
take the backports into the COS tree via the stable trees if at all
possible.  (Note: if requested, we could also publish these
backported-and-tested commits on a git tree for other distros to
take.)

There are still some details we'll need to work out; for example, will
the XFS maintainers let us do minor/obvious patch conflict
resolutions, or perhaps those commits which don't cherry-pick cleanly
will need to go through some round of approval by the linux-xfs list,
if the "we've run a full set of tests and there are no test
regressions" isn't good enough for them.

There is also the problem that sometimes commits aren't marked with
Fixes tag, but maybe there are some other signals we could use (for
example, maybe an indication in a comment in an xfstests test that
it's testing regressions for a specified kernel commit id).  Or
perhaps some other would be willing to contribute candidate commit
id's for backport consideration, with the approval of linux-xfs?
TBD...

Note: Darrick has been very helpful in geting this set up; the issue
is not the XFS maintainer, but rather the will of the whole of the XFS
development community, which sometimes can be a bit... fractious.

						- Ted

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [LSF/MM TOPIC] FS, MM, and stable trees
  2022-03-08 16:40     ` Theodore Ts'o
@ 2022-03-08 17:16       ` Amir Goldstein
  2022-03-09  0:43       ` Dave Chinner
                         ` (2 subsequent siblings)
  3 siblings, 0 replies; 55+ messages in thread
From: Amir Goldstein @ 2022-03-08 17:16 UTC (permalink / raw)
  To: Theodore Ts'o
  Cc: Greg KH, Sasha Levin, lsf-pc, linux-fsdevel, Jan Kara,
	Darrick J. Wong, Josef Bacik, Luis R. Rodriguez, Matthew Wilcox

On Tue, Mar 8, 2022 at 6:40 PM Theodore Ts'o <tytso@mit.edu> wrote:
>
> On Tue, Mar 08, 2022 at 11:08:48AM +0100, Greg KH wrote:
> > > When one looks at xfstest bug reports on the list for xfs on kernels > v4.19
> > > one has to wonder if using xfs on kernels v5.x.y is a wise choice.
> >
> > That's up to the xfs maintainers to discuss.
> >
> > > Which makes me wonder: how do the distro kernel maintainers keep up
> > > with xfs fixes?
> >
> > Who knows, ask the distro maintainers that use xfs.  What do they do?
>
> This is something which is being worked, so I'm not sure we'll need to
> discuss the specifics of the xfs stable backports at LSF/MM.  I'm
> hopeful that by May, we'll have come to some kind of resolution of
> that topic.

Wonderful!

>
> One of my team members has been working with Darrick to set up a set
> of xfs configs[1] recommended by Darrick, and she's stood up an
> automated test spinner using gce-xfstests which can watch a git branch
> and automatically kick off a set of tests whenever it is updated.
> Sasha has also provided her with a copy of his scripts so we can do
> automated cherry picks of commits with Fixes tags.  So the idea is
> that we can, hopefully in a mostly completely automated fashion, do
> the backports and do a full set of regression tests on those stable
> backports of XFS bug fixes.
>
> [1] https://github.com/tytso/xfstests-bld/tree/master/kvm-xfstests/test-appliance/files/root/fs/xfs/cfg
>
> Next steps are to get a first tranche of cherry-picks for 5.10 and
> probably 5.15, and use the test spinner to demonstrate that they don't
> have any test regressions (if there are, we'll drop those commits).
> Once we have a first set of proposed stable backports for XFS, we'll
> present them to the XFS development community for their input.  There
> are a couple of things that could happen at this point, depending on
> what the XFS community is willing to accept.
>
> The first is that we'll send these tested stable patches directly to
> Greg and Sasha for inclusion in the LTS releases, with the linux-xfs
> list cc'ed so they know what's going into the stable trees.
>
> The second is that we send them only to the linux-xfs list, and they
> have to do whatever approval they want before they go into the
> upstream stable trees.
>
> And the third option, if they aren't willing to take our work or they
> choose to require manual approvals and those approvals are taking too
> long, is that we'll feed the commits into Google's Container-Optimized
> OS (COS) kernel, so that our customers can get those fixes and so we
> can support XFS fully.  This isn't our preferred path; we'd prefer to
> take the backports into the COS tree via the stable trees if at all
> possible.  (Note: if requested, we could also publish these
> backported-and-tested commits on a git tree for other distros to
> take.)
>
> There are still some details we'll need to work out; for example, will
> the XFS maintainers let us do minor/obvious patch conflict
> resolutions, or perhaps those commits which don't cherry-pick cleanly
> will need to go through some round of approval by the linux-xfs list,
> if the "we've run a full set of tests and there are no test
> regressions" isn't good enough for them.
>

If you publish the list of fix commits that did not apply cleanly,
individual contributors that have personal interest in those fixes
can help with the backporting work, pass them back to your bot
for testing and then try to get the backport patches acked/reviewed.

Perhaps we can discuss those details in LSF/MM, but even getting
to auto selected, auto tested dumb fix patches will be far better than
the current state of v5.10.y.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [LSF/MM TOPIC] FS, MM, and stable trees
  2022-03-08 11:04     ` Amir Goldstein
  2022-03-08 15:42       ` Luis Chamberlain
@ 2022-03-08 19:06       ` Sasha Levin
  2022-03-09 18:57         ` Luis Chamberlain
  2022-03-10 23:59         ` Steve French
  1 sibling, 2 replies; 55+ messages in thread
From: Sasha Levin @ 2022-03-08 19:06 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Greg KH, lsf-pc, linux-fsdevel, Jan Kara, Theodore Tso,
	Darrick J. Wong, Josef Bacik, Luis R. Rodriguez, Matthew Wilcox

On Tue, Mar 08, 2022 at 01:04:05PM +0200, Amir Goldstein wrote:
>On Tue, Mar 8, 2022 at 12:08 PM Greg KH <gregkh@linuxfoundation.org> wrote:
>>
>> On Tue, Mar 08, 2022 at 11:32:43AM +0200, Amir Goldstein wrote:
>> > On Tue, Feb 12, 2019 at 7:31 PM Sasha Levin <sashal@kernel.org> wrote:
>> > >
>> > > Hi all,
>> > >
>> > > I'd like to propose a discussion about the workflow of the stable trees
>> > > when it comes to fs/ and mm/. In the past year we had some friction with
>> > > regards to the policies and the procedures around picking patches for
>> > > stable tree, and I feel it would be very useful to establish better flow
>> > > with the folks who might be attending LSF/MM.
>> > >
>> > > I feel that fs/ and mm/ are in very different places with regards to
>> > > which patches go in -stable, what tests are expected, and the timeline
>> > > of patches from the point they are proposed on a mailing list to the
>> > > point they are released in a stable tree. Therefore, I'd like to propose
>> > > two different sessions on this (one for fs/ and one for mm/), as a
>> > > common session might be less conductive to agreeing on a path forward as
>> > > the starting point for both subsystems are somewhat different.
>> > >
>> > > We can go through the existing processes, automation, and testing
>> > > mechanisms we employ when building stable trees, and see how we can
>> > > improve these to address the concerns of fs/ and mm/ folks.
>> > >
>> >
>> > Hi Sasha,
>> >
>> > I think it would be interesting to have another discussion on the state of fs/
>> > in -stable and see if things have changed over the past couple of years.
>> > If you do not plan to attend LSF/MM in person, perhaps you will be able to
>> > join this discussion remotely?
>> >
>> > >From what I can see, the flow of ext4/btrfs patches into -stable still looks
>> > a lot healthier than the flow of xfs patches into -stable.
>>
>> That is explicitly because the ext4/btrfs developers/maintainers are
>> marking patches for stable backports, while the xfs
>> developers/maintainers are not.
>>
>> It has nothing to do with how me and Sasha are working,
>
>Absolutely, I have no complaints to the stable maintainers here, just wanted
>to get a status report from Sasha, because he did invest is growing the stable
>tree xfstests coverage AFAIK, which should have addressed some of the
>earlier concerns of xfs developers.

I can update: we indeed invested in improving the story behind how we
pull XFS patches into -stable, where I ended up with a collection of
scripts that can establish a baseline and compare stable-rc releases to
that baseline, reporting issues.

The system was somewhat expensive to maintain, in the sense that I often
hit flaky tests, chased down issues that are not obviously test
failures, just keeping stuff updated and running, and so on, but it was
still doable.

At one point we hit a few issues that didn't reproduce with xfstests,
and as a result there were asks such as the timing of when I pull
patches and their proposed flow into releases.

At that point the process on my end basically stopped:

  a. It was clear that xfs would be the only subsystem using this as
  mm/ext4/etc aligned with just tagging things for stable and being okay
  with me occasionally bugging them with AUTOSEL mails to catch stuff
  they might have missed.

  b. Neither me nor my employer had a particular interest in xfs.

  c. The process was too much of a snowflake to be doing along with the
  rest of the -stable work.

And so the scripts bitrotted and died.

Somewhat related: about a year ago I joined Google, who got bit in the
arse with a process like the one that Jan described for SUSE, and asked
for a new kernel program to use the upstream LTS trees, upgrade anually,
and run a subset of workloads on the -stable kernel to stay even closer
to upstream and catch issues earlier. (There's a taped presentation
about it here: https://www.youtube.com/watch?v=tryyzWATpaU).

At this point I can run a battery of tests and real workloads mostly at
subsystems that Google ends up caring about (which end up being most of
the core kernel code - mm, ext4, sched, etc), find real bugs, and
address them before a release. Sadly xfs is not one of those subsystem
that they care about.

>> so go take this up with the fs developers :)
>
>It is easy to blame the "fs developers", but is it also very hard on an
>overloaded maintainer to ALSO take care of GOOD stable tree updates.
>
>Here is a model that seems to be working well for some subsystems:
>When a tester/developer finds a bug they write an LTP test.
>That LTP test gets run by stable kernel test bots and prompts action
>from distros who now know of this issue and may invest resources
>in backporting patches.
>
>If I am seeing random developers reporting bugs from running xfstests
>on stable kernels and I am not seeing the stable kernel test bots reporting
>those bugs, then there may be room for improvement in the stable kernel
>testing process??

There always is, and wearing my stable maintainer hat I would *love*
*love* *love* if folks who care about a particular subsystem or workload
to test the kernels and let us know if anything broke, at which point we
will do our best to address it.

What we can't do is invest significant time into doing the testing work
ourselves for each and every subsystem in the kernel.

The testing rig I had is expensive, not even just time-wise but also
w.r.t the compute resources it required to operate, I suspect that most
of the bots that are running around won't dedicate that much resources
to each filesystem on a voluntary basis.

>> > In 2019, Luis started an effort to improve this situation (with some
>> > assistance from me and you) that ended up with several submissions
>> > of stable patches for v4.19.y, but did not continue beyond 2019.
>> >
>> > When one looks at xfstest bug reports on the list for xfs on kernels > v4.19
>> > one has to wonder if using xfs on kernels v5.x.y is a wise choice.
>>
>> That's up to the xfs maintainers to discuss.
>>
>> > Which makes me wonder: how do the distro kernel maintainers keep up
>> > with xfs fixes?
>>
>> Who knows, ask the distro maintainers that use xfs.  What do they do?
>>
>
>So here I am - asking them via proxy fs developers :)

I can comment on what I'm seeing with Google's COS distro: it's a
chicken-and-egg problem. It's hard to offer commercial support with the
current state of xfs, but on the other hand it's hard to improve the
state of xfs without a commercial party that would invest more
significant resources into it.

Luckily there is an individual in Google who has picked up this work and
hopefully we will see something coming out of it very soon, but honestly
- we just got lucky.

-- 
Thanks,
Sasha

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [LSF/MM TOPIC] FS, MM, and stable trees
  2022-03-08  9:32 ` Amir Goldstein
  2022-03-08 10:08   ` Greg KH
  2022-03-08 10:54   ` Jan Kara
@ 2022-03-09  0:02   ` Dave Chinner
  2 siblings, 0 replies; 55+ messages in thread
From: Dave Chinner @ 2022-03-09  0:02 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Sasha Levin, lsf-pc, linux-fsdevel, Jan Kara, Theodore Tso,
	Darrick J. Wong, Josef Bacik, Luis R. Rodriguez, Matthew Wilcox,
	Greg KH

On Tue, Mar 08, 2022 at 11:32:43AM +0200, Amir Goldstein wrote:
> On Tue, Feb 12, 2019 at 7:31 PM Sasha Levin <sashal@kernel.org> wrote:
> >
> > Hi all,
> >
> > I'd like to propose a discussion about the workflow of the stable trees
> > when it comes to fs/ and mm/. In the past year we had some friction with
> > regards to the policies and the procedures around picking patches for
> > stable tree, and I feel it would be very useful to establish better flow
> > with the folks who might be attending LSF/MM.
> >
> > I feel that fs/ and mm/ are in very different places with regards to
> > which patches go in -stable, what tests are expected, and the timeline
> > of patches from the point they are proposed on a mailing list to the
> > point they are released in a stable tree. Therefore, I'd like to propose
> > two different sessions on this (one for fs/ and one for mm/), as a
> > common session might be less conductive to agreeing on a path forward as
> > the starting point for both subsystems are somewhat different.
> >
> > We can go through the existing processes, automation, and testing
> > mechanisms we employ when building stable trees, and see how we can
> > improve these to address the concerns of fs/ and mm/ folks.
> >
> 
> Hi Sasha,
> 
> I think it would be interesting to have another discussion on the state of fs/
> in -stable and see if things have changed over the past couple of years.
> If you do not plan to attend LSF/MM in person, perhaps you will be able to
> join this discussion remotely?
> 
> From what I can see, the flow of ext4/btrfs patches into -stable still looks
> a lot healthier than the flow of xfs patches into -stable.
> 
> In 2019, Luis started an effort to improve this situation (with some
> assistance from me and you) that ended up with several submissions
> of stable patches for v4.19.y, but did not continue beyond 2019.
> 
> When one looks at xfstest bug reports on the list for xfs on kernels > v4.19
> one has to wonder if using xfs on kernels v5.x.y is a wise choice.
> 
> Which makes me wonder: how do the distro kernel maintainers keep up
> with xfs fixes?

For RHEL, we actively backport whole upstream releases with a few
cycle's delay. That means, for example, A RHEL 8 kernel might have a
5.10 XFS + random critical fixes from 5.11-16 in it. We monitor for
relevant "Fixes" tags, manage all the QE of those backports
ourselves, handle all the regressions and end user bug reports
ourselves, etc. 

There is almost zero impact on upstream from the RHEL stable
kernel process - they only intersect when a bug that also affects
upstream kernels is found. At which point, the "upstream first"
policy kicks in, and then we backport the upstream fix to the RHEL
stable kernels that need it as per eveything else that is done.

IOWs, there's a whole team of ppl at RH across FS, QE and SE who are
pretty much entirely dedicated to enabling, testing and supporting
the RHEL backports. This work is largely invisible to upstream
development and developers, except for the fact we tag bug fixes
with "fixes" tags so that distro kernel maintainers know to consider
that they need to consider backporting them sooner rather than later.

Keep in mind that an LTS kernel is no different to a SLES or RHEL
kernel in terms of the number or significance of changes it
accumulates over it's life time. However, those LTS kernels they
don't have anywhere near the same level of quality control as even
.0 upstream releases, nor do LTS kernels have a dedicated QE or
support organisations that maintaining and supporting a reliable
product that has millions of end users really requires.

It sounds like there are some things the LTS maintainers have
underway that substantially change the QE equation for LTS kernels.
WE've been asking for that for a long time with limited short term
success (e.g. Luis' effort), so I'm hopeful that things coming down
the pipeline will create a sustainable long term solution that will
enable us to have confidence that LTS backports (automated or
manual) are robust and regression free.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [LSF/MM TOPIC] FS, MM, and stable trees
  2022-03-08 16:40     ` Theodore Ts'o
  2022-03-08 17:16       ` Amir Goldstein
@ 2022-03-09  0:43       ` Dave Chinner
  2022-03-09 18:41       ` Luis Chamberlain
  2022-03-29 20:24       ` [LSF/MM TOPIC] FS, MM, and stable trees Amir Goldstein
  3 siblings, 0 replies; 55+ messages in thread
From: Dave Chinner @ 2022-03-09  0:43 UTC (permalink / raw)
  To: Theodore Ts'o
  Cc: Greg KH, Amir Goldstein, Sasha Levin, lsf-pc, linux-fsdevel,
	Jan Kara, Darrick J. Wong, Josef Bacik, Luis R. Rodriguez,
	Matthew Wilcox

On Tue, Mar 08, 2022 at 11:40:18AM -0500, Theodore Ts'o wrote:
> On Tue, Mar 08, 2022 at 11:08:48AM +0100, Greg KH wrote:
> > > When one looks at xfstest bug reports on the list for xfs on kernels > v4.19
> > > one has to wonder if using xfs on kernels v5.x.y is a wise choice.
> > 
> > That's up to the xfs maintainers to discuss.
> > 
> > > Which makes me wonder: how do the distro kernel maintainers keep up
> > > with xfs fixes?
> > 
> > Who knows, ask the distro maintainers that use xfs.  What do they do?
> 
> This is something which is being worked, so I'm not sure we'll need to
> discuss the specifics of the xfs stable backports at LSF/MM.  I'm
> hopeful that by May, we'll have come to some kind of resolution of
> that topic.
> 
> One of my team members has been working with Darrick to set up a set
> of xfs configs[1] recommended by Darrick, and she's stood up an
> automated test spinner using gce-xfstests which can watch a git branch
> and automatically kick off a set of tests whenever it is updated.
> Sasha has also provided her with a copy of his scripts so we can do
> automated cherry picks of commits with Fixes tags.  So the idea is
> that we can, hopefully in a mostly completely automated fashion, do
> the backports and do a full set of regression tests on those stable
> backports of XFS bug fixes.
> 
> [1] https://github.com/tytso/xfstests-bld/tree/master/kvm-xfstests/test-appliance/files/root/fs/xfs/cfg
> 
> Next steps are to get a first tranche of cherry-picks for 5.10 and
> probably 5.15, and use the test spinner to demonstrate that they don't
> have any test regressions (if there are, we'll drop those commits).
> Once we have a first set of proposed stable backports for XFS, we'll
> present them to the XFS development community for their input.  There
> are a couple of things that could happen at this point, depending on
> what the XFS community is willing to accept.
> 
> The first is that we'll send these tested stable patches directly to
> Greg and Sasha for inclusion in the LTS releases, with the linux-xfs
> list cc'ed so they know what's going into the stable trees.
> 
> The second is that we send them only to the linux-xfs list, and they
> have to do whatever approval they want before they go into the
> upstream stable trees.

This effectively what we do with RHEL backports - the set of
proposed changes have to be backported cleanly and tested by the
proposer and it doesn't get merged until it has been reviewed.

This is pretty much what we've been asking for from the LTS kernel
process for a few years now (and what Luis did for a while), so I
see no problems with someone actually taking long term
responsibility for driving and maintaining an LTS backport process
like this.

> And the third option, if they aren't willing to take our work or they
> choose to require manual approvals and those approvals are taking too
> long, is that we'll feed the commits into Google's Container-Optimized
> OS (COS) kernel, so that our customers can get those fixes and so we
> can support XFS fully.  This isn't our preferred path; we'd prefer to
> take the backports into the COS tree via the stable trees if at all
> possible.  (Note: if requested, we could also publish these
> backported-and-tested commits on a git tree for other distros to
> take.)
> 
> There are still some details we'll need to work out; for example, will
> the XFS maintainers let us do minor/obvious patch conflict
> resolutions, or perhaps those commits which don't cherry-pick cleanly
> will need to go through some round of approval by the linux-xfs list,
> if the "we've run a full set of tests and there are no test
> regressions" isn't good enough for them.

I would expect that any proposals for backporting changes to LTS
kernels have already had this conflict/merge fixups work already
done and documented in the commit message by whoever is proposing
the backports. I.e. the commit message tells the reviewer where the
change deviates from the upstream commit.

> There is also the problem that sometimes commits aren't marked with
> Fixes tag, but maybe there are some other signals we could use (for
> example, maybe an indication in a comment in an xfstests test that
> it's testing regressions for a specified kernel commit id).  Or
> perhaps some other would be willing to contribute candidate commit
> id's for backport consideration, with the approval of linux-xfs?
> TBD...

That's not unique to XFS - every backport has this problem
regardless of subsystem. If you need a commit to be backported, then
just backport it and it just becomes another patch in the LTS
update process.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [LSF/MM TOPIC] FS, MM, and stable trees
  2022-03-08 16:40     ` Theodore Ts'o
  2022-03-08 17:16       ` Amir Goldstein
  2022-03-09  0:43       ` Dave Chinner
@ 2022-03-09 18:41       ` Luis Chamberlain
  2022-03-09 18:49         ` Josef Bacik
  2022-03-29 20:24       ` [LSF/MM TOPIC] FS, MM, and stable trees Amir Goldstein
  3 siblings, 1 reply; 55+ messages in thread
From: Luis Chamberlain @ 2022-03-09 18:41 UTC (permalink / raw)
  To: Theodore Ts'o
  Cc: Greg KH, Amir Goldstein, Sasha Levin, lsf-pc, linux-fsdevel,
	Jan Kara, Darrick J. Wong, Josef Bacik, Matthew Wilcox

On Tue, Mar 08, 2022 at 11:40:18AM -0500, Theodore Ts'o wrote:
> One of my team members has been working with Darrick to set up a set
> of xfs configs[1] recommended by Darrick, and she's stood up an
> automated test spinner using gce-xfstests which can watch a git branch
> and automatically kick off a set of tests whenever it is updated.

I think its important to note, as we would all know, that contrary to
most other subsystems, in so far as blktests and fstests is concerned,
simply passing a test once does not mean there is no issue given that
some test can fail with a failure rate of 1/1,000 for instance.

How many times you want to run a full set of fstests against a
filesystem varies depending on your filesystem, requirements and also
what resources you have. It also varies depending on how long you want
to dedicate time towards this.

To help with these concepts I ended up calling this a kernel-ci steady state
goal on kdevops:

  │ CONFIG_KERNEL_CI_STEADY_STATE_GOAL:
  │  
  │ The maximum number of possitive successes to have before bailing out
  │ a kernel-ci loop and report success. This value is currently used for
  │ all workflows. A value of 100 means 100 tests will run before we
  │ bail out and report we have achieved steady state for the workflow
  │ being tested. 

For fstests for XFS and btrfs, when testing for enterprise, I ended up going
with a steady state test goal of 500. That is, 500 consecutive runs of fstests
without any failure. This takes about 1 full week to run and one of my
eventual goals is to reduce this time. Perhaps it makes more sense to
talk generally how to optimize these sorts of tests, or share
information on experiences like these.

Do we want to define a steady state goal for stable for XFS?

  Luis

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [LSF/MM TOPIC] FS, MM, and stable trees
  2022-03-09 18:41       ` Luis Chamberlain
@ 2022-03-09 18:49         ` Josef Bacik
  2022-03-09 19:00           ` Luis Chamberlain
  0 siblings, 1 reply; 55+ messages in thread
From: Josef Bacik @ 2022-03-09 18:49 UTC (permalink / raw)
  To: Luis Chamberlain
  Cc: Theodore Ts'o, Greg KH, Amir Goldstein, Sasha Levin, lsf-pc,
	linux-fsdevel, Jan Kara, Darrick J. Wong, Matthew Wilcox

On Wed, Mar 09, 2022 at 10:41:53AM -0800, Luis Chamberlain wrote:
> On Tue, Mar 08, 2022 at 11:40:18AM -0500, Theodore Ts'o wrote:
> > One of my team members has been working with Darrick to set up a set
> > of xfs configs[1] recommended by Darrick, and she's stood up an
> > automated test spinner using gce-xfstests which can watch a git branch
> > and automatically kick off a set of tests whenever it is updated.
> 
> I think its important to note, as we would all know, that contrary to
> most other subsystems, in so far as blktests and fstests is concerned,
> simply passing a test once does not mean there is no issue given that
> some test can fail with a failure rate of 1/1,000 for instance.
> 

FWIW we (the btrfs team) have been running nightly runs of fstests against our
devel branch for over a year and tracking the results.  This allowed us to get
down to 0 failures because we could identify flakey tests and fix them or simply
disable them.  Then this means when we do have one of those 1/1,000 failures in
one of our configs (I copied Ted's approach and test all the various feature
combos) we know what set of parameters the failure was on and can go run that
test in a loop to reproduce the problem.

We like this approach because it's not a "wait a week to see if something
failed", we know the day after some new thing was merged if it caused a problem.
If it's more subtle then we still find it because a test will start failing at
some point.  It's a nice balance in how long we have to wait for results and
allows us to be a lot more sure in merging new code without hemming and hawing
for months.  Thanks,

Josef

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [LSF/MM TOPIC] FS, MM, and stable trees
  2022-03-08 19:06       ` Sasha Levin
@ 2022-03-09 18:57         ` Luis Chamberlain
  2022-03-11  5:23           ` Theodore Ts'o
  2022-03-10 23:59         ` Steve French
  1 sibling, 1 reply; 55+ messages in thread
From: Luis Chamberlain @ 2022-03-09 18:57 UTC (permalink / raw)
  To: Sasha Levin
  Cc: Amir Goldstein, Greg KH, lsf-pc, linux-fsdevel, Jan Kara,
	Theodore Tso, Darrick J. Wong, Josef Bacik, Matthew Wilcox

On Tue, Mar 08, 2022 at 02:06:57PM -0500, Sasha Levin wrote:
> What we can't do is invest significant time into doing the testing work
> ourselves for each and every subsystem in the kernel.

I think this experience helps though, it gives you I think a better
appreciation for what concerns we have to merge any fix and the effort
and dilligence required to ensure we don't regress. I think the
kernel-ci steady state goal takes this a bit further.

> The testing rig I had is expensive, not even just time-wise but also
> w.r.t the compute resources it required to operate, I suspect that most
> of the bots that are running around won't dedicate that much resources
> to each filesystem on a voluntary basis.

Precicely because of the above is *why* one of *my* requirements for
building a kernel-ci system was to be able to ensure I can run my tests
regardless of what employer I am at, and easily ramp up. So I can use
local virtualized solutions (KVM or virtualbox), or *any* cloud solution
at will (AWS, GCE, Azure, OpenStack). And so kdevops enables all this
using the same commands I posted before, using simple make target
commands.

Perhaps the one area that might interest folks is the test setup,
using loopback drives and truncated files, if you find holes in
this please let me know:

https://github.com/mcgrof/kdevops/blob/master/docs/testing-with-loopback.md

In my experience this setup just finds *more* issues, rather than less,
and in my experience as well none of these issues found were bogus, they
always lead to real bugs:

https://github.com/mcgrof/kdevops/blob/master/docs/seeing-more-issues.md

A test rig for a high kernel-ci steady state goal does require
resources, time and effort. Fortunately I am now confident in the
architecture behind the tests / automation though. So all that is
really needed now is just a dedicated system to run these, agree what
configs we'd test (I have some well defined and documented for XFS on
kdevops through Kconfig, based on conversations we last had about stable
testing), work with a public baseline to reflect this setup (I have
public baselines already published for tons of kernels and for different
filesystems), and then test candidate fixes. This later effort is still
time consuming too. But with a proper ongoing rig running a kernel-ci,
this becomes much easier and it is a much smoother sailing process.

> I can comment on what I'm seeing with Google's COS distro: it's a
> chicken-and-egg problem. It's hard to offer commercial support with the
> current state of xfs, but on the other hand it's hard to improve the
> state of xfs without a commercial party that would invest more
> significant resources into it.

This is the non-Enterprise argument to it.

And yes. I agree, but it doesn't mean we can't resolve it. I think we
just need to agree to a a dedicated test rig, test setup, and a public
baseline might be a good next step.

> Luckily there is an individual in Google who has picked up this work and
> hopefully we will see something coming out of it very soon, but honestly
> - we just got lucky.

Groovy.

  Luis

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [LSF/MM TOPIC] FS, MM, and stable trees
  2022-03-09 18:49         ` Josef Bacik
@ 2022-03-09 19:00           ` Luis Chamberlain
  2022-03-09 21:19             ` Josef Bacik
  0 siblings, 1 reply; 55+ messages in thread
From: Luis Chamberlain @ 2022-03-09 19:00 UTC (permalink / raw)
  To: Josef Bacik
  Cc: Theodore Ts'o, Greg KH, Amir Goldstein, Sasha Levin, lsf-pc,
	linux-fsdevel, Jan Kara, Darrick J. Wong, Matthew Wilcox

On Wed, Mar 09, 2022 at 01:49:18PM -0500, Josef Bacik wrote:
> On Wed, Mar 09, 2022 at 10:41:53AM -0800, Luis Chamberlain wrote:
> > On Tue, Mar 08, 2022 at 11:40:18AM -0500, Theodore Ts'o wrote:
> > > One of my team members has been working with Darrick to set up a set
> > > of xfs configs[1] recommended by Darrick, and she's stood up an
> > > automated test spinner using gce-xfstests which can watch a git branch
> > > and automatically kick off a set of tests whenever it is updated.
> > 
> > I think its important to note, as we would all know, that contrary to
> > most other subsystems, in so far as blktests and fstests is concerned,
> > simply passing a test once does not mean there is no issue given that
> > some test can fail with a failure rate of 1/1,000 for instance.
> > 
> 
> FWIW we (the btrfs team) have been running nightly runs of fstests against our
> devel branch for over a year and tracking the results.

That's wonderful, what is your steady state goal? And do you have your
configurations used public and also your baseline somehwere? I think
this later aspect could be very useful to everyone.

Yes, everyone's test setup can be different, but this is why I went with
a loopback/truncated file setup, it does find more issues and so far
these have all been real.

It kind of begs the question if we should adopt something like kconfig
on fstests to help enable a few test configs we can agree on. Thoughts?

I've been experimenting a lot with this on kdevops. So the Kconfig logic
could easily just move to fstests.

  Luis

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [LSF/MM TOPIC] FS, MM, and stable trees
  2022-03-09 19:00           ` Luis Chamberlain
@ 2022-03-09 21:19             ` Josef Bacik
  2022-03-10  1:28               ` Luis Chamberlain
  0 siblings, 1 reply; 55+ messages in thread
From: Josef Bacik @ 2022-03-09 21:19 UTC (permalink / raw)
  To: Luis Chamberlain
  Cc: Theodore Ts'o, Greg KH, Amir Goldstein, Sasha Levin, lsf-pc,
	linux-fsdevel, Jan Kara, Darrick J. Wong, Matthew Wilcox

On Wed, Mar 09, 2022 at 11:00:49AM -0800, Luis Chamberlain wrote:
> On Wed, Mar 09, 2022 at 01:49:18PM -0500, Josef Bacik wrote:
> > On Wed, Mar 09, 2022 at 10:41:53AM -0800, Luis Chamberlain wrote:
> > > On Tue, Mar 08, 2022 at 11:40:18AM -0500, Theodore Ts'o wrote:
> > > > One of my team members has been working with Darrick to set up a set
> > > > of xfs configs[1] recommended by Darrick, and she's stood up an
> > > > automated test spinner using gce-xfstests which can watch a git branch
> > > > and automatically kick off a set of tests whenever it is updated.
> > > 
> > > I think its important to note, as we would all know, that contrary to
> > > most other subsystems, in so far as blktests and fstests is concerned,
> > > simply passing a test once does not mean there is no issue given that
> > > some test can fail with a failure rate of 1/1,000 for instance.
> > > 
> > 
> > FWIW we (the btrfs team) have been running nightly runs of fstests against our
> > devel branch for over a year and tracking the results.
> 
> That's wonderful, what is your steady state goal? And do you have your
> configurations used public and also your baseline somehwere? I think
> this later aspect could be very useful to everyone.
> 

Yeah I post the results to http://toxicpanda.com, you can see the results from
the runs, and http://toxicpanda.com/performance/ has the nightly performance
numbers and graphs as well.

This was all put together to build into something a little more polished, but
clearly priorities being what they are this is as far as we've taken it.  For
configuration you can see my virt-scripts here
https://github.com/josefbacik/virt-scripts which are what I use to generate the
VM's to run xfstests in.

The kernel config I use is in there, I use a variety of btrfs mount options and
mkfs options, not sure how interesting those are for people outside of btrfs.

Right now I have a box with ZNS drives waiting for me to set this up on so that
we can also be testing btrfs zoned support nightly, as well as my 3rd
RaspberryPi that I'm hoping doesn't blow up this time.

I have another virt setup that uses btrfs snapshots to create a one off chroot
to run smoke tests for my development using virtme-run.  I want to replace the
libvirtd vms with virtme-run, however I've got about a 2x performance difference
between virtme-run and libvirtd that I'm trying to figure out, so right now all
the nightly test VM's are using libvirtd.

Long, long term the plan is to replace my janky home setup with AWS VM's that
can be fired from GitHub actions whenever we push branches, that way individual
developers can get results for their patches before they're merged, and we don't
have to rely on my terrible python+html for test results.

> Yes, everyone's test setup can be different, but this is why I went with
> a loopback/truncated file setup, it does find more issues and so far
> these have all been real.
> 
> It kind of begs the question if we should adopt something like kconfig
> on fstests to help enable a few test configs we can agree on. Thoughts?
> 

For us (and I imagine other fs'es) the kconfigs are not interesting, it's the
combo of different file system features that can be toggled on and off via mkfs
as well as different mount options.  For example I run all the different mkfs
features through normal mount options, and then again with compression turned
on.  Thanks,

Josef

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [LSF/MM TOPIC] FS, MM, and stable trees
  2022-03-09 21:19             ` Josef Bacik
@ 2022-03-10  1:28               ` Luis Chamberlain
  2022-03-10 18:51                 ` Josef Bacik
  0 siblings, 1 reply; 55+ messages in thread
From: Luis Chamberlain @ 2022-03-10  1:28 UTC (permalink / raw)
  To: Josef Bacik
  Cc: Theodore Ts'o, Greg KH, Amir Goldstein, Sasha Levin, lsf-pc,
	linux-fsdevel, Jan Kara, Darrick J. Wong, Matthew Wilcox

On Wed, Mar 09, 2022 at 04:19:21PM -0500, Josef Bacik wrote:
> On Wed, Mar 09, 2022 at 11:00:49AM -0800, Luis Chamberlain wrote:
> > On Wed, Mar 09, 2022 at 01:49:18PM -0500, Josef Bacik wrote:
> > > On Wed, Mar 09, 2022 at 10:41:53AM -0800, Luis Chamberlain wrote:
> > > > On Tue, Mar 08, 2022 at 11:40:18AM -0500, Theodore Ts'o wrote:
> > > > > One of my team members has been working with Darrick to set up a set
> > > > > of xfs configs[1] recommended by Darrick, and she's stood up an
> > > > > automated test spinner using gce-xfstests which can watch a git branch
> > > > > and automatically kick off a set of tests whenever it is updated.
> > > > 
> > > > I think its important to note, as we would all know, that contrary to
> > > > most other subsystems, in so far as blktests and fstests is concerned,
> > > > simply passing a test once does not mean there is no issue given that
> > > > some test can fail with a failure rate of 1/1,000 for instance.
> > > > 
> > > 
> > > FWIW we (the btrfs team) have been running nightly runs of fstests against our
> > > devel branch for over a year and tracking the results.
> > 
> > That's wonderful, what is your steady state goal? And do you have your
> > configurations used public and also your baseline somehwere? I think
> > this later aspect could be very useful to everyone.
> > 
> 
> Yeah I post the results to http://toxicpanda.com, you can see the results from
> the runs, and http://toxicpanda.com/performance/ has the nightly performance
> numbers and graphs as well.

That's great!

But although this runs nightly, it seems this runs fstest *once* to
ensure if there are no regressions. Is that right?

> This was all put together to build into something a little more polished, but
> clearly priorities being what they are this is as far as we've taken it.  For
> configuration you can see my virt-scripts here
> https://github.com/josefbacik/virt-scripts which are what I use to generate the
> VM's to run xfstests in.
> 
> The kernel config I use is in there, I use a variety of btrfs mount options and
> mkfs options, not sure how interesting those are for people outside of btrfs.

Extremely useful.

> Right now I have a box with ZNS drives waiting for me to set this up on so that
> we can also be testing btrfs zoned support nightly, as well as my 3rd
> RaspberryPi that I'm hoping doesn't blow up this time.

Great to hear you will be covering ZNS as well.

> I have another virt setup that uses btrfs snapshots to create a one off chroot
> to run smoke tests for my development using virtme-run.  I want to replace the
> libvirtd vms with virtme-run, however I've got about a 2x performance difference
> between virtme-run and libvirtd that I'm trying to figure out, so right now all
> the nightly test VM's are using libvirtd.
> 
> Long, long term the plan is to replace my janky home setup with AWS VM's that
> can be fired from GitHub actions whenever we push branches, that way individual
> developers can get results for their patches before they're merged, and we don't
> have to rely on my terrible python+html for test results.

If you do move to AWS just keep in mind using loopback drives +
truncated files *finds* more issues than not. So when I used AWS
I got two spare nvme drives and used one to stuff the truncated
files there.

> > Yes, everyone's test setup can be different, but this is why I went with
> > a loopback/truncated file setup, it does find more issues and so far
> > these have all been real.
> > 
> > It kind of begs the question if we should adopt something like kconfig
> > on fstests to help enable a few test configs we can agree on. Thoughts?
> > 
> 
> For us (and I imagine other fs'es) the kconfigs are not interesting, it's the
> combo of different file system features that can be toggled on and off via mkfs
> as well as different mount options.  For example I run all the different mkfs
> features through normal mount options, and then again with compression turned
> on.  Thanks,

So what I mean by kconfig is not the Linux kernel kconfig, but rather
the kdevops kconfig options. kdevops essentially has a kconfig symbol
per mkfs-param-mount config we test. And it runs *ones* guest per each
of these. For example:

config FSTESTS_XFS_SECTION_REFLINK_1024
	bool "Enable testing section: xfs_reflink_1024"
	default y
	help
	  This will create a host to test the baseline of fstests using the
	  following configuration which enables reflink using 1024 byte block
	  size.

	[xfs_reflink]
	MKFS_OPTIONS='-f -m reflink=1,rmapbt=1, -i sparse=1,'
	FSTYP=xfs

The other ones can be found here for XFS:

https://github.com/mcgrof/kdevops/blob/master/workflows/fstests/xfs/Kconfig

So indeed, exactly what you mean. What I'm getting at is that it would
be good to construct these with the community. So it would beg the
question if we should embrace for instance kconfig language to be
able to configure fstests (yes I know it is xfstests but I think loose
new people who tend to assume that xfstest is only for XFS, so I only
always call it fstests).

  Luis

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [LSF/MM TOPIC] FS, MM, and stable trees
  2022-03-10  1:28               ` Luis Chamberlain
@ 2022-03-10 18:51                 ` Josef Bacik
  2022-03-10 22:41                   ` Luis Chamberlain
  2022-03-12  2:07                   ` Luis Chamberlain
  0 siblings, 2 replies; 55+ messages in thread
From: Josef Bacik @ 2022-03-10 18:51 UTC (permalink / raw)
  To: Luis Chamberlain
  Cc: Theodore Ts'o, Greg KH, Amir Goldstein, Sasha Levin, lsf-pc,
	linux-fsdevel, Jan Kara, Darrick J. Wong, Matthew Wilcox

On Wed, Mar 09, 2022 at 05:28:28PM -0800, Luis Chamberlain wrote:
> On Wed, Mar 09, 2022 at 04:19:21PM -0500, Josef Bacik wrote:
> > On Wed, Mar 09, 2022 at 11:00:49AM -0800, Luis Chamberlain wrote:
> > > On Wed, Mar 09, 2022 at 01:49:18PM -0500, Josef Bacik wrote:
> > > > On Wed, Mar 09, 2022 at 10:41:53AM -0800, Luis Chamberlain wrote:
> > > > > On Tue, Mar 08, 2022 at 11:40:18AM -0500, Theodore Ts'o wrote:
> > > > > > One of my team members has been working with Darrick to set up a set
> > > > > > of xfs configs[1] recommended by Darrick, and she's stood up an
> > > > > > automated test spinner using gce-xfstests which can watch a git branch
> > > > > > and automatically kick off a set of tests whenever it is updated.
> > > > > 
> > > > > I think its important to note, as we would all know, that contrary to
> > > > > most other subsystems, in so far as blktests and fstests is concerned,
> > > > > simply passing a test once does not mean there is no issue given that
> > > > > some test can fail with a failure rate of 1/1,000 for instance.
> > > > > 
> > > > 
> > > > FWIW we (the btrfs team) have been running nightly runs of fstests against our
> > > > devel branch for over a year and tracking the results.
> > > 
> > > That's wonderful, what is your steady state goal? And do you have your
> > > configurations used public and also your baseline somehwere? I think
> > > this later aspect could be very useful to everyone.
> > > 
> > 
> > Yeah I post the results to http://toxicpanda.com, you can see the results from
> > the runs, and http://toxicpanda.com/performance/ has the nightly performance
> > numbers and graphs as well.
> 
> That's great!
> 
> But although this runs nightly, it seems this runs fstest *once* to
> ensure if there are no regressions. Is that right?
> 

Yup once per config, so 8 full fstest runs.

> > This was all put together to build into something a little more polished, but
> > clearly priorities being what they are this is as far as we've taken it.  For
> > configuration you can see my virt-scripts here
> > https://github.com/josefbacik/virt-scripts which are what I use to generate the
> > VM's to run xfstests in.
> > 
> > The kernel config I use is in there, I use a variety of btrfs mount options and
> > mkfs options, not sure how interesting those are for people outside of btrfs.
> 
> Extremely useful.
> 

[root@fedora-rawhide ~]# cat /xfstests-dev/local.config
[btrfs_normal_freespacetree]
TEST_DIR=/mnt/test
TEST_DEV=/dev/mapper/vg0-lv0
SCRATCH_DEV_POOL="/dev/mapper/vg0-lv7 /dev/mapper/vg0-lv6 /dev/mapper/vg0-lv5 /dev/mapper/vg0-lv4 /dev/mapper/vg0-lv3 /dev/mapper/vg0-lv2 /dev/mapper/vg0-lv1 "
SCRATCH_MNT=/mnt/scratch
LOGWRITES_DEV=/dev/mapper/vg0-lv8
PERF_CONFIGNAME=jbacik
MKFS_OPTIONS="-K -f -O ^no-holes"
MOUNT_OPTIONS="-o space_cache=v2"
FSTYP=btrfs

[btrfs_compress_freespacetree]
MOUNT_OPTIONS="-o compress=zlib,space_cache=v2"
MKFS_OPTIONS="-K -f -O ^no-holes"

[btrfs_normal]
TEST_DIR=/mnt/test
TEST_DEV=/dev/mapper/vg0-lv0
SCRATCH_DEV_POOL="/dev/mapper/vg0-lv9 /dev/mapper/vg0-lv8 /dev/mapper/vg0-lv7 /dev/mapper/vg0-lv6 /dev/mapper/vg0-lv5 /dev/mapper/vg0-lv4 /dev/mapper/vg0-lv3 /dev/mapper/vg0-lv2 /dev/mapper/vg0-lv1 "
SCRATCH_MNT=/mnt/scratch
LOGWRITES_DEV=/dev/mapper/vg0-lv10
PERF_CONFIGNAME=jbacik
MKFS_OPTIONS="-K -O ^no-holes -R ^free-space-tree"
MOUNT_OPTIONS="-o discard=async"

[btrfs_compression]
MOUNT_OPTIONS="-o compress=zstd,discard=async"
MKFS_OPTIONS="-K -O ^no-holes -R ^free-space-tree"

[kdave]
MKFS_OPTIONS="-K -O no-holes -R ^free-space-tree"
MOUNT_OPTIONS="-o discard,space_cache=v2"

[root@xfstests3 ~]# cat /xfstests-dev/local.config
[btrfs_normal_noholes]
TEST_DIR=/mnt/test
TEST_DEV=/dev/mapper/vg0-lv0
SCRATCH_DEV_POOL="/dev/mapper/vg0-lv9 /dev/mapper/vg0-lv8 /dev/mapper/vg0-lv7 /dev/mapper/vg0-lv6 /dev/mapper/vg0-lv5 /dev/mapper/vg0-lv4 /dev/mapper/vg0-lv3 /dev/mapper/vg0-lv2 /dev/mapper/vg0-lv1 "
SCRATCH_MNT=/mnt/scratch
LOGWRITES_DEV=/dev/mapper/vg0-lv10
PERF_CONFIGNAME=jbacik
MKFS_OPTIONS="-K -O no-holes -f -R ^free-space-tree"

[btrfs_compress_noholes]
MKFS_OPTIONS="-K -O no-holes -f -R ^free-space-tree"
MOUNT_OPTIONS="-o compress=lzo"

[btrfs_noholes_freespacetree]
MKFS_OPTIONS="-K -O no-holes -f"
MOUNT_OPTIONS="-o space_cache=v2"


> > Right now I have a box with ZNS drives waiting for me to set this up on so that
> > we can also be testing btrfs zoned support nightly, as well as my 3rd
> > RaspberryPi that I'm hoping doesn't blow up this time.
> 
> Great to hear you will be covering ZNS as well.
> 
> > I have another virt setup that uses btrfs snapshots to create a one off chroot
> > to run smoke tests for my development using virtme-run.  I want to replace the
> > libvirtd vms with virtme-run, however I've got about a 2x performance difference
> > between virtme-run and libvirtd that I'm trying to figure out, so right now all
> > the nightly test VM's are using libvirtd.
> > 
> > Long, long term the plan is to replace my janky home setup with AWS VM's that
> > can be fired from GitHub actions whenever we push branches, that way individual
> > developers can get results for their patches before they're merged, and we don't
> > have to rely on my terrible python+html for test results.
> 
> If you do move to AWS just keep in mind using loopback drives +
> truncated files *finds* more issues than not. So when I used AWS
> I got two spare nvme drives and used one to stuff the truncated
> files there.
> 

My plan was to get ones with attached storage and do the LVM thing I do for my
vms.

> > > Yes, everyone's test setup can be different, but this is why I went with
> > > a loopback/truncated file setup, it does find more issues and so far
> > > these have all been real.
> > > 
> > > It kind of begs the question if we should adopt something like kconfig
> > > on fstests to help enable a few test configs we can agree on. Thoughts?
> > > 
> > 
> > For us (and I imagine other fs'es) the kconfigs are not interesting, it's the
> > combo of different file system features that can be toggled on and off via mkfs
> > as well as different mount options.  For example I run all the different mkfs
> > features through normal mount options, and then again with compression turned
> > on.  Thanks,
> 
> So what I mean by kconfig is not the Linux kernel kconfig, but rather
> the kdevops kconfig options. kdevops essentially has a kconfig symbol
> per mkfs-param-mount config we test. And it runs *ones* guest per each
> of these. For example:
> 
> config FSTESTS_XFS_SECTION_REFLINK_1024
> 	bool "Enable testing section: xfs_reflink_1024"
> 	default y
> 	help
> 	  This will create a host to test the baseline of fstests using the
> 	  following configuration which enables reflink using 1024 byte block
> 	  size.
> 
> 	[xfs_reflink]
> 	MKFS_OPTIONS='-f -m reflink=1,rmapbt=1, -i sparse=1,'
> 	FSTYP=xfs
> 
> The other ones can be found here for XFS:
> 
> https://github.com/mcgrof/kdevops/blob/master/workflows/fstests/xfs/Kconfig
> 
> So indeed, exactly what you mean. What I'm getting at is that it would
> be good to construct these with the community. So it would beg the
> question if we should embrace for instance kconfig language to be
> able to configure fstests (yes I know it is xfstests but I think loose
> new people who tend to assume that xfstest is only for XFS, so I only
> always call it fstests).
> 

Got it, that's pretty cool, I pasted my configs above.  Once I figure out why
virtme is so much slower than libvirtd I'll give kdevops a try and see if I can
make it work for my setup.  Thanks,

Josef

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [LSF/MM TOPIC] FS, MM, and stable trees
  2022-03-10 18:51                 ` Josef Bacik
@ 2022-03-10 22:41                   ` Luis Chamberlain
  2022-03-11 12:09                     ` Jan Kara
  2022-03-12  2:07                   ` Luis Chamberlain
  1 sibling, 1 reply; 55+ messages in thread
From: Luis Chamberlain @ 2022-03-10 22:41 UTC (permalink / raw)
  To: Josef Bacik
  Cc: Theodore Ts'o, Greg KH, Amir Goldstein, Sasha Levin, lsf-pc,
	linux-fsdevel, Jan Kara, Darrick J. Wong, Matthew Wilcox

On Thu, Mar 10, 2022 at 01:51:22PM -0500, Josef Bacik wrote:
> On Wed, Mar 09, 2022 at 05:28:28PM -0800, Luis Chamberlain wrote:
> > On Wed, Mar 09, 2022 at 04:19:21PM -0500, Josef Bacik wrote:
> > > On Wed, Mar 09, 2022 at 11:00:49AM -0800, Luis Chamberlain wrote:
> > 
> > That's great!
> > 
> > But although this runs nightly, it seems this runs fstest *once* to
> > ensure if there are no regressions. Is that right?
> > 
> 
> Yup once per config, so 8 full fstest runs.

From my experience that is not enough to capture all failures given
lower failure rates on tests other than 1/1, like 1/42 or
1/300. So minimum I'd go for 500 loops of fstests per config.
This does mean this is not possible nightly though, yes. 5 days
on average. And so much more work is needed to bring this down
further.

> > > This was all put together to build into something a little more polished, but
> > > clearly priorities being what they are this is as far as we've taken it.  For
> > > configuration you can see my virt-scripts here
> > > https://github.com/josefbacik/virt-scripts which are what I use to generate the
> > > VM's to run xfstests in.
> > > 
> > > The kernel config I use is in there, I use a variety of btrfs mount options and
> > > mkfs options, not sure how interesting those are for people outside of btrfs.
> > 
> > Extremely useful.
> > 
> 
> [root@fedora-rawhide ~]# cat /xfstests-dev/local.config
> [btrfs_normal_freespacetree]
> TEST_DIR=/mnt/test
> TEST_DEV=/dev/mapper/vg0-lv0
> SCRATCH_DEV_POOL="/dev/mapper/vg0-lv7 /dev/mapper/vg0-lv6 /dev/mapper/vg0-lv5 /dev/mapper/vg0-lv4 /dev/mapper/vg0-lv3 /dev/mapper/vg0-lv2 /dev/mapper/vg0-lv1 "
> SCRATCH_MNT=/mnt/scratch
> LOGWRITES_DEV=/dev/mapper/vg0-lv8
> PERF_CONFIGNAME=jbacik
> MKFS_OPTIONS="-K -f -O ^no-holes"
> MOUNT_OPTIONS="-o space_cache=v2"
> FSTYP=btrfs
> 
> [btrfs_compress_freespacetree]
> MOUNT_OPTIONS="-o compress=zlib,space_cache=v2"
> MKFS_OPTIONS="-K -f -O ^no-holes"
> 
> [btrfs_normal]
> TEST_DIR=/mnt/test
> TEST_DEV=/dev/mapper/vg0-lv0
> SCRATCH_DEV_POOL="/dev/mapper/vg0-lv9 /dev/mapper/vg0-lv8 /dev/mapper/vg0-lv7 /dev/mapper/vg0-lv6 /dev/mapper/vg0-lv5 /dev/mapper/vg0-lv4 /dev/mapper/vg0-lv3 /dev/mapper/vg0-lv2 /dev/mapper/vg0-lv1 "
> SCRATCH_MNT=/mnt/scratch
> LOGWRITES_DEV=/dev/mapper/vg0-lv10
> PERF_CONFIGNAME=jbacik
> MKFS_OPTIONS="-K -O ^no-holes -R ^free-space-tree"
> MOUNT_OPTIONS="-o discard=async"
> 
> [btrfs_compression]
> MOUNT_OPTIONS="-o compress=zstd,discard=async"
> MKFS_OPTIONS="-K -O ^no-holes -R ^free-space-tree"
> 
> [kdave]
> MKFS_OPTIONS="-K -O no-holes -R ^free-space-tree"
> MOUNT_OPTIONS="-o discard,space_cache=v2"
> 
> [root@xfstests3 ~]# cat /xfstests-dev/local.config
> [btrfs_normal_noholes]
> TEST_DIR=/mnt/test
> TEST_DEV=/dev/mapper/vg0-lv0
> SCRATCH_DEV_POOL="/dev/mapper/vg0-lv9 /dev/mapper/vg0-lv8 /dev/mapper/vg0-lv7 /dev/mapper/vg0-lv6 /dev/mapper/vg0-lv5 /dev/mapper/vg0-lv4 /dev/mapper/vg0-lv3 /dev/mapper/vg0-lv2 /dev/mapper/vg0-lv1 "
> SCRATCH_MNT=/mnt/scratch
> LOGWRITES_DEV=/dev/mapper/vg0-lv10
> PERF_CONFIGNAME=jbacik
> MKFS_OPTIONS="-K -O no-holes -f -R ^free-space-tree"
> 
> [btrfs_compress_noholes]
> MKFS_OPTIONS="-K -O no-holes -f -R ^free-space-tree"
> MOUNT_OPTIONS="-o compress=lzo"
> 
> [btrfs_noholes_freespacetree]
> MKFS_OPTIONS="-K -O no-holes -f"
> MOUNT_OPTIONS="-o space_cache=v2"

Thanks I can eventually cake these in to kdevops (or patches welcmeD)
modulo I use loopback/truncated filews. It is possible to add an option
to use dm linear too if that is really desirable.

> > > Right now I have a box with ZNS drives waiting for me to set this up on so that
> > > we can also be testing btrfs zoned support nightly, as well as my 3rd
> > > RaspberryPi that I'm hoping doesn't blow up this time.
> > 
> > Great to hear you will be covering ZNS as well.
> > 
> > > I have another virt setup that uses btrfs snapshots to create a one off chroot
> > > to run smoke tests for my development using virtme-run.  I want to replace the
> > > libvirtd vms with virtme-run, however I've got about a 2x performance difference
> > > between virtme-run and libvirtd that I'm trying to figure out, so right now all
> > > the nightly test VM's are using libvirtd.
> > > 
> > > Long, long term the plan is to replace my janky home setup with AWS VM's that
> > > can be fired from GitHub actions whenever we push branches, that way individual
> > > developers can get results for their patches before they're merged, and we don't
> > > have to rely on my terrible python+html for test results.
> > 
> > If you do move to AWS just keep in mind using loopback drives +
> > truncated files *finds* more issues than not. So when I used AWS
> > I got two spare nvme drives and used one to stuff the truncated
> > files there.
> > 
> 
> My plan was to get ones with attached storage and do the LVM thing I do for my
> vms.

The default for AWS for kdevops is to use m5ad.4xlarge (~$0.824 per
Hour) that comes with 61 GiB RAM, 16 vcpus, 1 8 GiB main drive, and two
additional 300 GiB nvme drives. The nvme drives are used so to also
mimic the KVM setup when kdevops uses local virtualization.

FWIW, the kdevops AWS kconfig is at terraform/aws/Kconfig

  Luis

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [LSF/MM TOPIC] FS, MM, and stable trees
  2022-03-08 19:06       ` Sasha Levin
  2022-03-09 18:57         ` Luis Chamberlain
@ 2022-03-10 23:59         ` Steve French
  2022-03-11  0:36           ` Chuck Lever III
  1 sibling, 1 reply; 55+ messages in thread
From: Steve French @ 2022-03-10 23:59 UTC (permalink / raw)
  To: Sasha Levin
  Cc: Amir Goldstein, Greg KH, lsf-pc, linux-fsdevel, Jan Kara,
	Theodore Tso, Darrick J. Wong, Josef Bacik, Luis R. Rodriguez,
	Matthew Wilcox

On Tue, Mar 8, 2022 at 6:16 PM Sasha Levin <sashal@kernel.org> wrote:
>
> On Tue, Mar 08, 2022 at 01:04:05PM +0200, Amir Goldstein wrote:
> >On Tue, Mar 8, 2022 at 12:08 PM Greg KH <gregkh@linuxfoundation.org> wrote:
> >>
> >> On Tue, Mar 08, 2022 at 11:32:43AM +0200, Amir Goldstein wrote:
> >> > On Tue, Feb 12, 2019 at 7:31 PM Sasha Levin <sashal@kernel.org> wrote:
> >> > >
> >> > > Hi all,
> >> > >
> >> > > I'd like to propose a discussion about the workflow of the stable trees
> >> > > when it comes to fs/ and mm/. In the past year we had some friction with
> >> > > regards to the policies and the procedures around picking patches for
> >> > > stable tree, and I feel it would be very useful to establish better flow
> >> > > with the folks who might be attending LSF/MM.

I would like to participate in this as well - it is very important
that we improve
test automation processes.  We run a series of tests, hosted with VMs in Azure
(mostly xfstests but also the git fs regression tests and various ones
that are fs specific
for testing various scenarios like reconnect and various fs specific
mount options)
regularly (on every pull request sent upstream to mainline) for cifs.ko and
also for the kernel server (ksmbd.ko) as well.

This does leave a big gap for stable although Redhat and SuSE seem to
run a similar set of regression tests so not much risk for the distros.

In theory we could periodically run the cifs/smb3.1.1 automated tests
against stable,
perhaps every few weeks and send results somewhere if there was a process
for this for the various fs - but the tests we run were pretty clearly listed
(and also in the wiki.samba.org) so may be easier ways to do this.  Tests could
be run locally on the same machine to ksmbd from cifs.ko (or to Samba if
preferred) so nothing extra to setup.

Would be worth discussing the best process for automating something like
this - others may have figured out tricks that could help all fs in this
xfstest automation


> >> > > I feel that fs/ and mm/ are in very different places with regards to
> >> > > which patches go in -stable, what tests are expected, and the timeline
> >> > > of patches from the point they are proposed on a mailing list to the
> >> > > point they are released in a stable tree. Therefore, I'd like to propose
> >> > > two different sessions on this (one for fs/ and one for mm/), as a
> >> > > common session might be less conductive to agreeing on a path forward as
> >> > > the starting point for both subsystems are somewhat different.
> >> > >
> >> > > We can go through the existing processes, automation, and testing
> >> > > mechanisms we employ when building stable trees, and see how we can
> >> > > improve these to address the concerns of fs/ and mm/ folks.


> >> > Hi Sasha,
> >> >
> >> > I think it would be interesting to have another discussion on the state of fs/
> >> > in -stable and see if things have changed over the past couple of years.


-- 
Thanks,

Steve

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [LSF/MM TOPIC] FS, MM, and stable trees
  2022-03-10 23:59         ` Steve French
@ 2022-03-11  0:36           ` Chuck Lever III
  2022-03-11 20:54             ` Luis Chamberlain
  0 siblings, 1 reply; 55+ messages in thread
From: Chuck Lever III @ 2022-03-11  0:36 UTC (permalink / raw)
  To: Steve French
  Cc: Sasha Levin, Amir Goldstein, Greg KH, lsf-pc, linux-fsdevel,
	Jan Kara, Theodore Tso, Darrick Wong, Josef Bacik,
	Luis R. Rodriguez, Matthew Wilcox



> On Mar 10, 2022, at 6:59 PM, Steve French <smfrench@gmail.com> wrote:
> 
> On Tue, Mar 8, 2022 at 6:16 PM Sasha Levin <sashal@kernel.org> wrote:
>> 
>> On Tue, Mar 08, 2022 at 01:04:05PM +0200, Amir Goldstein wrote:
>>> On Tue, Mar 8, 2022 at 12:08 PM Greg KH <gregkh@linuxfoundation.org> wrote:
>>>> 
>>>> On Tue, Mar 08, 2022 at 11:32:43AM +0200, Amir Goldstein wrote:
>>>>> On Tue, Feb 12, 2019 at 7:31 PM Sasha Levin <sashal@kernel.org> wrote:
>>>>>> 
>>>>>> Hi all,
>>>>>> 
>>>>>> I'd like to propose a discussion about the workflow of the stable trees
>>>>>> when it comes to fs/ and mm/. In the past year we had some friction with
>>>>>> regards to the policies and the procedures around picking patches for
>>>>>> stable tree, and I feel it would be very useful to establish better flow
>>>>>> with the folks who might be attending LSF/MM.
> 
> I would like to participate in this as well - it is very important
> that we improve
> test automation processes.  We run a series of tests, hosted with VMs in Azure
> (mostly xfstests but also the git fs regression tests and various ones
> that are fs specific
> for testing various scenarios like reconnect and various fs specific
> mount options)
> regularly (on every pull request sent upstream to mainline) for cifs.ko and
> also for the kernel server (ksmbd.ko) as well.
> 
> This does leave a big gap for stable although Redhat and SuSE seem to
> run a similar set of regression tests so not much risk for the distros.
> 
> In theory we could periodically run the cifs/smb3.1.1 automated tests
> against stable,
> perhaps every few weeks and send results somewhere if there was a process
> for this for the various fs - but the tests we run were pretty clearly listed
> (and also in the wiki.samba.org) so may be easier ways to do this.  Tests could
> be run locally on the same machine to ksmbd from cifs.ko (or to Samba if
> preferred) so nothing extra to setup.
> 
> Would be worth discussing the best process for automating something like
> this - others may have figured out tricks that could help all fs in this
> xfstest automation

It deserves mention that network file systems like Steve's and mine
have a slightly heavier lift because two systems at a time are needed
to test with -- client and server. I've found that requires more
infrastructure around Jenkins or whatever framework you like to drive
testing. Having a discussion about that and comparing notes about how
this particular issue can be resolved would be of interest to me.


>>>>>> I feel that fs/ and mm/ are in very different places with regards to
>>>>>> which patches go in -stable, what tests are expected, and the timeline
>>>>>> of patches from the point they are proposed on a mailing list to the
>>>>>> point they are released in a stable tree. Therefore, I'd like to propose
>>>>>> two different sessions on this (one for fs/ and one for mm/), as a
>>>>>> common session might be less conductive to agreeing on a path forward as
>>>>>> the starting point for both subsystems are somewhat different.
>>>>>> 
>>>>>> We can go through the existing processes, automation, and testing
>>>>>> mechanisms we employ when building stable trees, and see how we can
>>>>>> improve these to address the concerns of fs/ and mm/ folks.
> 
> 
>>>>> Hi Sasha,
>>>>> 
>>>>> I think it would be interesting to have another discussion on the state of fs/
>>>>> in -stable and see if things have changed over the past couple of years.
> 
> 
> -- 
> Thanks,
> 
> Steve

--
Chuck Lever




^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [LSF/MM TOPIC] FS, MM, and stable trees
  2022-03-09 18:57         ` Luis Chamberlain
@ 2022-03-11  5:23           ` Theodore Ts'o
  2022-03-11 12:00             ` Jan Kara
  2022-03-11 20:52             ` Luis Chamberlain
  0 siblings, 2 replies; 55+ messages in thread
From: Theodore Ts'o @ 2022-03-11  5:23 UTC (permalink / raw)
  To: Luis Chamberlain
  Cc: Sasha Levin, Amir Goldstein, Greg KH, lsf-pc, linux-fsdevel,
	Jan Kara, Darrick J. Wong, Josef Bacik, Matthew Wilcox

On Wed, Mar 09, 2022 at 10:57:24AM -0800, Luis Chamberlain wrote:
> On Tue, Mar 08, 2022 at 02:06:57PM -0500, Sasha Levin wrote:
> > What we can't do is invest significant time into doing the testing work
> > ourselves for each and every subsystem in the kernel.
> 
> I think this experience helps though, it gives you I think a better
> appreciation for what concerns we have to merge any fix and the effort
> and dilligence required to ensure we don't regress. I think the
> kernel-ci steady state goal takes this a bit further.

Different communities seem to have different goals that they believe
the stable kernels should be aiming for.  Sure, if you never merge any
fix, you can guarantee that there will be no regressions.  However,
the question is whether the result is a better quality kernel.  For
example, there is a recent change to XFS which fixes a security bug
which allows an attacker to gain access to deleted data.  How do you
balance the tradeoff of "no regressions, ever", versus, "we'll leave a
security bug in XFS which is fixed in mainline linux, but we fear
regressions so much that we won't even backport a single-line fix to
the stable kernel?"

In my view, the service which Greg, Sasha and the other stable
maintainers provide is super-valuable, and I am happy that ext4
changes are automatically cherry-picked into the stable kernel.  Have
there been times when this has resulted in regressions in ext4 for the
stable kernel?  Sure!  It's only been a handful of a times, though,
and the number of bug fixes that users using stable kernels have _not_
seen *far* outweighs the downsides of the occasional regressions
(which gets found and then reverted).

> 
> Perhaps the one area that might interest folks is the test setup,
> using loopback drives and truncated files, if you find holes in
> this please let me know:
> 
> https://github.com/mcgrof/kdevops/blob/master/docs/testing-with-loopback.md
> 
> In my experience this setup just finds *more* issues, rather than less,
> and in my experience as well none of these issues found were bogus, they
> always lead to real bugs:
> 
> https://github.com/mcgrof/kdevops/blob/master/docs/seeing-more-issues.md

Different storage devices --- Google Cloud Persistent Disks, versus
single spindle HDD's, SSD's, eMMC flash, iSCSI devices --- will have
different timing characteristics, and this will affect what failures
you are likely to find.

So if most of the developers for a particular file system tend to use
a particular kind of hardware --- say, HDD's and SSD's --- and you use
something different, such as file-based loopback drives, it's not
surprising that you'll find a different set of failures more often.
It's not that loopback drives are inherently better at finding
problems --- it's just that all of the bugs that are easily findable
on HDD and SSD devices have already been fixed, and so the first
person to test using loopback will find a bunch of new bugs.

This is why I consider myself very lucky that one of the ext4
developers had been testing on Rasberry PI, and they found bugs that
was missed on my GCE setup, and vice versa.  And when he moved to a
newer test rig, which had a faster CPU and faster SSD, he found a
different set of flaky test failures that he couldn't reproduce on his
older test system.

So having a wide diversity of test rigs is really important.  Another
take home is that if you are responsible for a vast number of data
center servers, there isn't a real substitute for running tests on the
hardware that you are using in production.  One of the reasons why we
created android-xfstests was that there were some bugs that weren't
found when testing using KVM, but were much more easily found when
running xfstests on an actual Android device.  And it's why we run
continuous test spinners running xfstests using data center HDD's,
data center SSD's, iSCSI, iBlock (basically something like FUSE but
for block devices, that we'd love to get upstreamed someday), etc.
And these tests are run using the exact file system configuration that
we use in production.

Different people will have different test philosophies, but mine is
that I'm not looking for absolute perfection on upstream kernels.  I
get much better return on investment if I do basic testing for
upstream, but reserve the massive, continuous test spinning, on the
hardware platforms that my employer cares the most about from a $$$
perspective.

And it's actually not about the hardware costs.  The hardware costs
are actually pretty cheap, at least from a company's perspective.
What's actually super-duper expensive is the engineering time to
monitor the test results; to anaylze and root cause flaky test
failures, etc.  In general, "People time" >>> "hardware costs", by two
orders of magnitude.

So ultimately, it's going to be about the business case.  If I can
justify to my company why investing a portion of an engineer to setup
a dedicated test spinner on a particular hardware / software
combination, I can generally get the head count.  But if it's to do
massive testing and on an LTS kernel or a file system that doesn't
have commercial value for many company, it's going to be a tough slog.

Fortunately, though, I claim that we don't need to run xfstests a
thousand times before a commit is deemed "safe" for backporting to LTS
kernels.  (I'm pretty sure we aren't doing that during our upstream
development.)

It makes more sense to reserve that kind of intensive testing for
product kernels which are downstream of LTS, and if they find
problems, they can report that back to the stable kernel maintainers,
and if necessary, we can revert a commit.  In fact, I suspect that
when we *do* that kind of intensive testing, we'll probably find that
the problem still exists in upstream, it's just no one had actually
noticed.

That's certainly been my experience.  When we first deployed ext4 to
Google Data Centers, ten years ago, the fact that we had extensive
monitoring meant that we found a data corruption bug that was
ultimately root caused to a spinlock being released one line too
early.  Not only was the bug not fixed in upstream, it had turned out
that the bug had been in upstream for ten years before *that*, and it
had not been detected in multiple Red Hat and SuSE "golden master"
release testing, and all of the enterprise users of RHEL and SLES

(My guess is that people had written off the failure to cosmic rays,
or unreproducible hardware flakiness.  It was only when we ran at
scale, on millions of file systems under high stress, with
sufficiently automated monitoring of our production servers, that we
were able to detect it.)

So that's why I'm a bit philosophical about testing.  More testing is
always good, but perfection is not attainable.  So we test up to where
it makes business sense, and we accept that there may be some bug
escapes.  That's OK, though, since I'd much rather make sure security
bugs and other stability bugs get backported, even if that means that
once in a blue moon, there is a regression that requires a revert in
the LTS kernel.

Cheers,

					- Ted

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [LSF/MM TOPIC] FS, MM, and stable trees
  2022-03-11  5:23           ` Theodore Ts'o
@ 2022-03-11 12:00             ` Jan Kara
  2022-03-11 20:52             ` Luis Chamberlain
  1 sibling, 0 replies; 55+ messages in thread
From: Jan Kara @ 2022-03-11 12:00 UTC (permalink / raw)
  To: Theodore Ts'o
  Cc: Luis Chamberlain, Sasha Levin, Amir Goldstein, Greg KH, lsf-pc,
	linux-fsdevel, Jan Kara, Darrick J. Wong, Josef Bacik,
	Matthew Wilcox

On Fri 11-03-22 00:23:55, Theodore Ts'o wrote:
> On Wed, Mar 09, 2022 at 10:57:24AM -0800, Luis Chamberlain wrote:
> > On Tue, Mar 08, 2022 at 02:06:57PM -0500, Sasha Levin wrote:
> > > What we can't do is invest significant time into doing the testing work
> > > ourselves for each and every subsystem in the kernel.
> > 
> > I think this experience helps though, it gives you I think a better
> > appreciation for what concerns we have to merge any fix and the effort
> > and dilligence required to ensure we don't regress. I think the
> > kernel-ci steady state goal takes this a bit further.
> 
> Different communities seem to have different goals that they believe
> the stable kernels should be aiming for.  Sure, if you never merge any
> fix, you can guarantee that there will be no regressions.  However,
> the question is whether the result is a better quality kernel.  For
> example, there is a recent change to XFS which fixes a security bug
> which allows an attacker to gain access to deleted data.  How do you
> balance the tradeoff of "no regressions, ever", versus, "we'll leave a
> security bug in XFS which is fixed in mainline linux, but we fear
> regressions so much that we won't even backport a single-line fix to
> the stable kernel?"
> 
> In my view, the service which Greg, Sasha and the other stable
> maintainers provide is super-valuable, and I am happy that ext4
> changes are automatically cherry-picked into the stable kernel.  Have
> there been times when this has resulted in regressions in ext4 for the
> stable kernel?  Sure!  It's only been a handful of a times, though,
> and the number of bug fixes that users using stable kernels have _not_
> seen *far* outweighs the downsides of the occasional regressions
> (which gets found and then reverted).

Yes, I completely agree it is tradeoff between how many fixes you backport
and the risk of regressions. As I wrote distro people (like RHEL or SLES)
have infrastructure and do backport sizable chunk of fixes flowing into
stable kernels anyway but we leave out some for which we deem the ratio fix
value / regression risk is bad for us. Also testing for distro people is
somewhat more difficult because we don't have the comfort of testing on the
HW & various combinations of setup and workload the customer is going to
use. So we do some testing on our HW, default configs, and common workloads
and put more effort into patch selection & review to reduce chances of
regressions on customers' systems. Overall, the tradeoff simply works out a
bit differently for distro people than say for Android and I don't think
there's a silver bullet for all...

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [LSF/MM TOPIC] FS, MM, and stable trees
  2022-03-10 22:41                   ` Luis Chamberlain
@ 2022-03-11 12:09                     ` Jan Kara
  2022-03-11 18:32                       ` Luis Chamberlain
  0 siblings, 1 reply; 55+ messages in thread
From: Jan Kara @ 2022-03-11 12:09 UTC (permalink / raw)
  To: Luis Chamberlain
  Cc: Josef Bacik, Theodore Ts'o, Greg KH, Amir Goldstein,
	Sasha Levin, lsf-pc, linux-fsdevel, Jan Kara, Darrick J. Wong,
	Matthew Wilcox

On Thu 10-03-22 14:41:30, Luis Chamberlain wrote:
> On Thu, Mar 10, 2022 at 01:51:22PM -0500, Josef Bacik wrote:
> > On Wed, Mar 09, 2022 at 05:28:28PM -0800, Luis Chamberlain wrote:
> > > On Wed, Mar 09, 2022 at 04:19:21PM -0500, Josef Bacik wrote:
> > > > On Wed, Mar 09, 2022 at 11:00:49AM -0800, Luis Chamberlain wrote:
> > > 
> > > That's great!
> > > 
> > > But although this runs nightly, it seems this runs fstest *once* to
> > > ensure if there are no regressions. Is that right?
> > > 
> > 
> > Yup once per config, so 8 full fstest runs.
> 
> From my experience that is not enough to capture all failures given
> lower failure rates on tests other than 1/1, like 1/42 or
> 1/300. So minimum I'd go for 500 loops of fstests per config.
> This does mean this is not possible nightly though, yes. 5 days
> on average. And so much more work is needed to bring this down
> further.

Well, yes, 500 loops have better chance of detecting rare bugs. But if you
did only say 100 loops, you are likely to detect the bug just 5 days later
on average. Sure that makes finding the bug somewhat harder (you generally
need to investigate larger time span to find the bug) but testing costs are
lower... It is a tradeoff.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [LSF/MM TOPIC] FS, MM, and stable trees
  2022-03-11 12:09                     ` Jan Kara
@ 2022-03-11 18:32                       ` Luis Chamberlain
  0 siblings, 0 replies; 55+ messages in thread
From: Luis Chamberlain @ 2022-03-11 18:32 UTC (permalink / raw)
  To: Jan Kara
  Cc: Josef Bacik, Theodore Ts'o, Greg KH, Amir Goldstein,
	Sasha Levin, lsf-pc, linux-fsdevel, Darrick J. Wong,
	Matthew Wilcox

On Fri, Mar 11, 2022 at 01:09:35PM +0100, Jan Kara wrote:
> On Thu 10-03-22 14:41:30, Luis Chamberlain wrote:
> > On Thu, Mar 10, 2022 at 01:51:22PM -0500, Josef Bacik wrote:
> > > On Wed, Mar 09, 2022 at 05:28:28PM -0800, Luis Chamberlain wrote:
> > > > On Wed, Mar 09, 2022 at 04:19:21PM -0500, Josef Bacik wrote:
> > > > > On Wed, Mar 09, 2022 at 11:00:49AM -0800, Luis Chamberlain wrote:
> > > > 
> > > > That's great!
> > > > 
> > > > But although this runs nightly, it seems this runs fstest *once* to
> > > > ensure if there are no regressions. Is that right?
> > > > 
> > > 
> > > Yup once per config, so 8 full fstest runs.
> > 
> > From my experience that is not enough to capture all failures given
> > lower failure rates on tests other than 1/1, like 1/42 or
> > 1/300. So minimum I'd go for 500 loops of fstests per config.
> > This does mean this is not possible nightly though, yes. 5 days
> > on average. And so much more work is needed to bring this down
> > further.
> 
> Well, yes, 500 loops have better chance of detecting rare bugs. But if you
> did only say 100 loops, you are likely to detect the bug just 5 days later
> on average. Sure that makes finding the bug somewhat harder (you generally
> need to investigate larger time span to find the bug) but testing costs are
> lower... It is a tradeoff.

Crap sorry I had my numbers mixed, yes 100 takes about 5 days (for btrfs
or xfs running all confgurations in parallel), so indeed, 100 was
reasonable goal today. 500 would take almost a month and if that doesn't
give you much time to fix issues either if you have a kernel release per
month!

  Luis

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [LSF/MM TOPIC] FS, MM, and stable trees
  2022-03-11  5:23           ` Theodore Ts'o
  2022-03-11 12:00             ` Jan Kara
@ 2022-03-11 20:52             ` Luis Chamberlain
  2022-03-11 22:04               ` Theodore Ts'o
  1 sibling, 1 reply; 55+ messages in thread
From: Luis Chamberlain @ 2022-03-11 20:52 UTC (permalink / raw)
  To: Theodore Ts'o
  Cc: Sasha Levin, Amir Goldstein, Greg KH, lsf-pc, linux-fsdevel,
	Jan Kara, Darrick J. Wong, Josef Bacik, Matthew Wilcox

On Fri, Mar 11, 2022 at 12:23:55AM -0500, Theodore Ts'o wrote:
> On Wed, Mar 09, 2022 at 10:57:24AM -0800, Luis Chamberlain wrote:
> > On Tue, Mar 08, 2022 at 02:06:57PM -0500, Sasha Levin wrote:
> > > What we can't do is invest significant time into doing the testing work
> > > ourselves for each and every subsystem in the kernel.
> > 
> > I think this experience helps though, it gives you I think a better
> > appreciation for what concerns we have to merge any fix and the effort
> > and dilligence required to ensure we don't regress. I think the
> > kernel-ci steady state goal takes this a bit further.
> 
> Different communities seem to have different goals that they believe
> the stable kernels should be aiming for.  Sure, if you never merge any
> fix, you can guarantee that there will be no regressions.  However,
> the question is whether the result is a better quality kernel.  For
> example, there is a recent change to XFS which fixes a security bug
> which allows an attacker to gain access to deleted data.  How do you
> balance the tradeoff of "no regressions, ever", versus, "we'll leave a
> security bug in XFS which is fixed in mainline linux, but we fear
> regressions so much that we won't even backport a single-line fix to
> the stable kernel?"

That patch should just be applied, thanks for the heads up, I'll go try
to spin some resources to test if this is not merged already.

And perhaps in such cases the KERNEL_CI_STEADY_STATE_GOAL can be reduced.

> In my view, the service which Greg, Sasha and the other stable
> maintainers provide is super-valuable, and I am happy that ext4
> changes are automatically cherry-picked into the stable kernel.  Have
> there been times when this has resulted in regressions in ext4 for the
> stable kernel?  Sure!  It's only been a handful of a times, though,
> and the number of bug fixes that users using stable kernels have _not_
> seen *far* outweighs the downsides of the occasional regressions
> (which gets found and then reverted).

I think by now the community should know I'm probably one of the biggest
advocates of kernel automation. Whether that be kernel testing or kernel
code generation... the reason I've started dabbling into the automation
part of testing is that they go hand in hand. So while I value the
stable process, I think it should be respected if subsystems with a
higher threshold than others for testing / review is kept.

The only way to move forward with enabling more automation for kernel
code integration is through better and improved kernel test automation.
And it is *exactly* why I've been working so hard on that problem.

> > Perhaps the one area that might interest folks is the test setup,
> > using loopback drives and truncated files, if you find holes in
> > this please let me know:
> > 
> > https://github.com/mcgrof/kdevops/blob/master/docs/testing-with-loopback.md
> > 
> > In my experience this setup just finds *more* issues, rather than less,
> > and in my experience as well none of these issues found were bogus, they
> > always lead to real bugs:
> > 
> > https://github.com/mcgrof/kdevops/blob/master/docs/seeing-more-issues.md
> 
> Different storage devices --- Google Cloud Persistent Disks, versus
> single spindle HDD's, SSD's,

<-- Insert tons of variability requirements on test drives -->
<-- Insert tons of variability requirements on confidence in testing -->
<-- Insert tons of variability requirements on price / cost assessment -->
<-- Insert tons of variability requirements on business case -->

What you left out in terms of variability was you use GCE, and yes
others will want to use AWS, OpenStack, etc as well. So that's another
variability aspect too.

What's the common theme here? Variability!

And what is the most respectable modeling variability language? Kconfig!

It is why the way I've designed kdevops was to embrace kconfig. It
enables you to test however you want, using whatever test devices,
with whatever test criteria you might have and on any cloud or local
virt solution.

Yes, some of the variability things in kdevops are applicable only to
kdevops, but since I picked up kconfig it meant I also adopted it for
variability for fstest and blktests. It should be possible to move that
to fstests / blktests if we wanted to, and for kdevops to just use it.

And if you are thinking:

   why shucks.. but I don't want to deal with the complexity of
   integrating kconfig into a new project. That sounds difficult.

Yes I hear you, and to help with that I've created a git tree which can
be used as a git subtree (note: different than the stupid git
submodules) to let you easily integrate kconfig adoption into any
project with only a few lines of code differences:

https://github.com/mcgrof/kconfig

Also let's recall that just because you have your own test framework
it does not mean we could not benefit from others testing our
filesystems on their own silly hardware at home as well. Yes tons
of projects can be used which wrap fstests... but I never found one
as easy to use as compiling the kernel and running a few make commands.
So my goal was not just addressing the variability aspect for fstests
and blktests, but also enabling the average user to also easily help
test as well.

There is the concept of results too and a possible way to share things..
but this is getting a bit off topic and I don't want to bore people more.

  Luis

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [LSF/MM TOPIC] FS, MM, and stable trees
  2022-03-11  0:36           ` Chuck Lever III
@ 2022-03-11 20:54             ` Luis Chamberlain
  0 siblings, 0 replies; 55+ messages in thread
From: Luis Chamberlain @ 2022-03-11 20:54 UTC (permalink / raw)
  To: Chuck Lever III
  Cc: Steve French, Sasha Levin, Amir Goldstein, Greg KH, lsf-pc,
	linux-fsdevel, Jan Kara, Theodore Tso, Darrick Wong, Josef Bacik,
	Matthew Wilcox

On Fri, Mar 11, 2022 at 12:36:23AM +0000, Chuck Lever III wrote:
> It deserves mention that network file systems like Steve's and mine
> have a slightly heavier lift because two systems at a time are needed
> to test with -- client and server.

Should be super easy with kdevops.

> I've found that requires more
> infrastructure around Jenkins or whatever framework you like to drive
> testing. Having a discussion about that and comparing notes about how
> this particular issue can be resolved would be of interest to me.

It sounds like this conversation has been a lot more about testing
than stable. So maybe testing should be its own topic at LSFMM.

  Luis

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [LSF/MM TOPIC] FS, MM, and stable trees
  2022-03-11 20:52             ` Luis Chamberlain
@ 2022-03-11 22:04               ` Theodore Ts'o
  2022-03-11 22:36                 ` Luis Chamberlain
  2022-04-27 18:58                 ` Amir Goldstein
  0 siblings, 2 replies; 55+ messages in thread
From: Theodore Ts'o @ 2022-03-11 22:04 UTC (permalink / raw)
  To: Luis Chamberlain
  Cc: Sasha Levin, Amir Goldstein, Greg KH, lsf-pc, linux-fsdevel,
	Jan Kara, Darrick J. Wong, Josef Bacik, Matthew Wilcox

On Fri, Mar 11, 2022 at 12:52:41PM -0800, Luis Chamberlain wrote:
> 
> The only way to move forward with enabling more automation for kernel
> code integration is through better and improved kernel test automation.
> And it is *exactly* why I've been working so hard on that problem.

I think we're on the same page here.

> Also let's recall that just because you have your own test framework
> it does not mean we could not benefit from others testing our
> filesystems on their own silly hardware at home as well. Yes tons
> of projects can be used which wrap fstests...

No argument from me!  I'm strongly in favor of diversity in test
framework automation as well as test environments.

In particular, I think there are some valuable things we can learn
from each other, in terms of cross polination in terms of features and
as well as feedback about how easy it is to use a particular test
framework.

For example: README.md doesn't say anything about running make as root
when running "make" as kdevops.  At least, I *think* this is why
running make as kdevops failed:

fatal: [localhost]: FAILED! => {"changed": true, "cmd": ["/usr/sbin/apparmor_status", "--enabled"], "delta": "0:00:00.001426", "end": "2022-03-11 16:23:11.769658", "failed_when_result": true, "rc": 0, "start": "2022-03-11 16:23:11.768232", "stderr": "", "stderr_lines": [], "stdout": "", "stdout_lines": []}

(I do have apparmor installed, but it's currently not enabled.  I
haven't done more experimentation since I'm a bit scared of running
"make XXX" as root for any package I randomly download from the net,
so I haven't explored trying to use kdevops, at least not until I set
up a sandboxed VM.  :-)

Including the Debian package names that should be installed would also
be helpful in kdevops/doc/requirements.md.  That's not a problem for
the experienced Debian developer, but one of my personal goals for
kvm-xfstests and gce-xfstests is to allow a random graduate student
who has presented some research file system like Betrfs at the Usenix
FAST conference to be able to easily run fstests.  And it sounds like
you have similar goals of "enabling the average user to also easily
run tests".


> but I never found one
> as easy to use as compiling the kernel and running a few make commands.

I've actually done a lot of work to optimize developer velocity using
my test framework.  So for example:

kvm-xfstests install-kconfig    # set up a kernel Kconfig suitable for kvm-xfstests and gce-xfstests
make
kvm-xfstests smoke     # boot the test appliance VM, using the kernel that was just built

And a user can test a particular stable kernel using a single command
line (this will checkout a particular kernel, and build it on a build
VM, and then launch tests in parallel on a dozen or so VM's):

gce-xfstests ltm -c ext4/all -g auto --repo stable.git --commit v5.15.28

... or if we want to bisect a particular test failure, we might do
something like this:

gce-xfstests ltm -c ext4 generic/361 --bisect-good v5.15 --bisect-bad v5.16

... or I can establish a watcher that will automatically build a git
tree when a branch on a git tree changes:

gce-xfstests ltm -c ext4/4k -g auto --repo next.git --watch master

Granted, this only works on GCE --- but feel free to take these ideas
and integrate them into kdevops if you feel inspired to do so.  :-)

> There is the concept of results too and a possible way to share things..
> but this is getting a bit off topic and I don't want to bore people more.

This would be worth chatting about, perhaps at LSF/MM.  xfstests
already supports junit results files; we could convert it to TAP
format, but junit has more functionality, so perhaps the right
approach is to have tools that can support both TAP and junit?  What
about some way to establish interchange of test artifacts?  i.e.,
saving the kernel logs, and the generic/NNN.full and
generic/NNN.out.bad files?

I have a large library of these test results and test artifacts, and
perhaps others would find it useful if we had a way sharing test
results between developers, especially we have multiple test
infrastructures that might be running ext4, f2fs, and xfs tests?

Cheers,

						- Ted

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [LSF/MM TOPIC] FS, MM, and stable trees
  2022-03-11 22:04               ` Theodore Ts'o
@ 2022-03-11 22:36                 ` Luis Chamberlain
  2022-04-27 18:58                 ` Amir Goldstein
  1 sibling, 0 replies; 55+ messages in thread
From: Luis Chamberlain @ 2022-03-11 22:36 UTC (permalink / raw)
  To: Theodore Ts'o
  Cc: Sasha Levin, Amir Goldstein, Greg KH, lsf-pc, linux-fsdevel,
	Jan Kara, Darrick J. Wong, Josef Bacik, Matthew Wilcox

On Fri, Mar 11, 2022 at 05:04:37PM -0500, Theodore Ts'o wrote:
> On Fri, Mar 11, 2022 at 12:52:41PM -0800, Luis Chamberlain wrote:
> > 
> > The only way to move forward with enabling more automation for kernel
> > code integration is through better and improved kernel test automation.
> > And it is *exactly* why I've been working so hard on that problem.
> 
> I think we're on the same page here.
> 
> > Also let's recall that just because you have your own test framework
> > it does not mean we could not benefit from others testing our
> > filesystems on their own silly hardware at home as well. Yes tons
> > of projects can be used which wrap fstests...
> 
> No argument from me!  I'm strongly in favor of diversity in test
> framework automation as well as test environments.
> 
> In particular, I think there are some valuable things we can learn
> from each other, in terms of cross polination in terms of features and
> as well as feedback about how easy it is to use a particular test
> framework.
> 
> For example: README.md doesn't say anything about running make as root
> when running "make" as kdevops.  At least, I *think* this is why
> running make as kdevops failed:
> 
> fatal: [localhost]: FAILED! => {"changed": true, "cmd": ["/usr/sbin/apparmor_status", "--enabled"], "delta": "0:00:00.001426", "end": "2022-03-11 16:23:11.769658", "failed_when_result": true, "rc": 0, "start": "2022-03-11 16:23:11.768232", "stderr": "", "stderr_lines": [], "stdout": "", "stdout_lines": []}

Ah a check for sudo without privs is in order. I'll add a check.

> (I do have apparmor installed, but it's currently not enabled.  I
> haven't done more experimentation since I'm a bit scared of running
> "make XXX" as root for any package I randomly download from the net,
> so I haven't explored trying to use kdevops, at least not until I set
> up a sandboxed VM.  :-)

Sure. A lot of that setup stuff on the host was added to make it
even easier to use. It however is optional, as otherwise it just
runs sanity checks.

> Including the Debian package names that should be installed would also
> be helpful in kdevops/doc/requirements.md.  That's not a problem for
> the experienced Debian developer, but one of my personal goals for
> kvm-xfstests and gce-xfstests is to allow a random graduate student
> who has presented some research file system like Betrfs at the Usenix
> FAST conference to be able to easily run fstests.  And it sounds like
> you have similar goals of "enabling the average user to also easily
> run tests".

Yup.

Did this requirements doc not suffice?

https://github.com/mcgrof/kdevops/blob/master/docs/requirements.md

> > but I never found one
> > as easy to use as compiling the kernel and running a few make commands.
> 
> I've actually done a lot of work to optimize developer velocity using
> my test framework.  So for example:
> 
> kvm-xfstests install-kconfig    # set up a kernel Kconfig suitable for kvm-xfstests and gce-xfstests
> make
> kvm-xfstests smoke     # boot the test appliance VM, using the kernel that was just built
> 
> And a user can test a particular stable kernel using a single command
> line (this will checkout a particular kernel, and build it on a build
> VM, and then launch tests in parallel on a dozen or so VM's):
> 
> gce-xfstests ltm -c ext4/all -g auto --repo stable.git --commit v5.15.28

Neat we have parity.

> ... or if we want to bisect a particular test failure, we might do
> something like this:
> 
> gce-xfstests ltm -c ext4 generic/361 --bisect-good v5.15 --bisect-bad v5.16

I don't have that.

> ... or I can establish a watcher that will automatically build a git
> tree when a branch on a git tree changes:
> 
> gce-xfstests ltm -c ext4/4k -g auto --repo next.git --watch master

Nor this, neat.

> Granted, this only works on GCE --- but feel free to take these ideas
> and integrate them into kdevops if you feel inspired to do so.  :-)

Thanks, will do. As you probably know by now each of these definitely
takes a lot of time. Right now I a few other objectives on my goal list
but I will gladly welcome patches to enable such a thing!

> > There is the concept of results too and a possible way to share things..
> > but this is getting a bit off topic and I don't want to bore people more.
> 
> This would be worth chatting about, perhaps at LSF/MM.  xfstests

I'd like to just ask that to help folks who are not used to accepting
the fact that xfstests is actually used for *all filesystems* that we
just call it fstests. Calling it just xfstests confuses people. I recall
people realizing even at LSFMM that xfstests is used *widely* by
everyone to test other filesystems, and the issue I think is the name.

If we just refer to it as fstests folks will get it.

> already supports junit results files; we could convert it to TAP
> format,

Kunit went TAP.

I don't care what format we choose, so long as we all strive for one
thing. I'd be happy with TAP too

> but junit has more functionality, so perhaps the right
> approach is to have tools that can support both TAP and junit? 

Sure.. another lesson I learned:

if you just look for the test failure files *.bad... that will not
tell you all tests that fail. Likewise if you just look at junit it
also will not always tell you all the tests that fail. So kdevops looks
at both...

> What
> about some way to establish interchange of test artifacts?  i.e.,
> saving the kernel logs, and the generic/NNN.full and
> generic/NNN.out.bad files?

Yup, all great topics.

Then .. the expunge files, so to help us express a baseline and also
allows us to easily specify failures and bug URLs.  For instance I have:

https://github.com/mcgrof/kdevops/blob/master/workflows/fstests/expunges/opensuse-leap/15.3/xfs/unassigned/all.txt

And they look like:

generic/047 # bsc#1178756
generic/048 # bsc#1178756
generic/068 # bsc#1178756

And kernel.org bugzilla entries are with korg#1234 where 1234 would be
the bug ID.

> I have a large library of these test results and test artifacts, and
> perhaps others would find it useful if we had a way sharing test
> results between developers, especially we have multiple test
> infrastructures that might be running ext4, f2fs, and xfs tests?

Yes yes and yes. I've been dreaming up of perhaps a ledger.

  Luis

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [LSF/MM TOPIC] FS, MM, and stable trees
  2022-03-10 18:51                 ` Josef Bacik
  2022-03-10 22:41                   ` Luis Chamberlain
@ 2022-03-12  2:07                   ` Luis Chamberlain
  2022-03-14 22:45                     ` btrfs profiles to test was: (Re: [LSF/MM TOPIC] FS, MM, and stable trees) Luis Chamberlain
  1 sibling, 1 reply; 55+ messages in thread
From: Luis Chamberlain @ 2022-03-12  2:07 UTC (permalink / raw)
  To: Josef Bacik, David Sterba
  Cc: Theodore Ts'o, Greg KH, Amir Goldstein, Sasha Levin, lsf-pc,
	linux-btrfs, linux-fsdevel, Jan Kara, Darrick J. Wong,
	Matthew Wilcox, Goldwyn Rodrigues, Pankaj Raghav,
	Javier González, Damien Le Moal, Johannes Thumshirn,
	Chaitanya Kulkarni, Adam Manzanares, kanchan Joshi,
	Pankaj Raghav, Kanchan Joshi

On Thu, Mar 10, 2022 at 01:51:22PM -0500, Josef Bacik wrote:
> [root@fedora-rawhide ~]# cat /xfstests-dev/local.config
> [btrfs_normal_freespacetree]
> [btrfs_compress_freespacetree]
> [btrfs_normal]
> [btrfs_compression]
> [kdave]
> [btrfs_normal_noholes]
> [btrfs_compress_noholes]
> [btrfs_noholes_freespacetree]

+ linux-btrfs and zone folks.

I simplified these as follows, please let me know if the names are
alright. I think we may be able to come up with something more
clever than btrfs_dave. The raid56/noraid56 exist just for the
defaults of the distro/btrfs-progs. The expunge list is used
to determine if something is raid56 or not sadly given we
have no groups for putting tests into the raid56 group. The idea
is some distros don't support raid56 so that is the goal with
the noraid56 config.

The name needs to be: $FS_$FANCY_SINGLE_SPACED_NAME

Each guest spawned will have that same hostname. And likewise
the expunges are collected for each guest hostname. The hostname
is used to pick the expunge directory so to ensure it reflects
the baseline.

You may want to look at this expunge file:

https://github.com/mcgrof/kdevops/blob/master/workflows/fstests/expunges/opensuse-leap/15.3/btrfs/unassigned/btrfs_noraid56.txt\x02

[default]
TEST_DEV=@FSTESTSTESTDEV@
TEST_DIR=@FSTESTSDIR@
SCRATCH_DEV_POOL="@FSTESTSSCRATCHDEVPOOL@"

SCRATCH_MNT=@FSTESTSSCRATCHMNT@
RESULT_BASE=$PWD/results/$HOST/$(uname -r)

[btrfs_raid56]
MKFS_OPTIONS='-f'
FSTYP=btrfs

[btrfs_noraid56]
MKFS_OPTIONS='-f'
FSTYP=btrfs

[btrfs_normalfreespacetree]
LOGWRITES_DEV=@FSTESTSLOGWRITESDEV@
MKFS_OPTIONS="-K -f -O ^no-holes"
MOUNT_OPTIONS="-o space_cache=v2"
FSTYP=btrfs

[btrfs_compressfreespacetree]
MOUNT_OPTIONS="-o compress=zlib,space_cache=v2"
MKFS_OPTIONS="-K -f -O ^no-holes"

[btrfs_normal]
LOGWRITES_DEV=@FSTESTSLOGWRITESDEV@
MKFS_OPTIONS="-K -O ^no-holes -R ^free-space-tree"
MOUNT_OPTIONS="-o discard=async"

[btrfs_compression]
MOUNT_OPTIONS="-o compress=zstd,discard=async"
MKFS_OPTIONS="-K -O ^no-holes -R ^free-space-tree"

[btrfs_kdave]
MKFS_OPTIONS="-K -O no-holes -R ^free-space-tree"
MOUNT_OPTIONS="-o discard,space_cache=v2"

[btrfs_normalnoholes]
LOGWRITES_DEV=@FSTESTSLOGWRITESDEV@
MKFS_OPTIONS="-K -O no-holes -f -R ^free-space-tree"

[btrfs_compressnoholes]
MKFS_OPTIONS="-K -O no-holes -f -R ^free-space-tree"
MOUNT_OPTIONS="-o compress=lzo"

[btrfs_noholesfreespacetree]
MKFS_OPTIONS="-K -O no-holes -f"
MOUNT_OPTIONS="-o space_cache=v2"

I see nothing for NVMe ZNS.. so how about 

[btrfs_zns]
MKFS_OPTIONS="-f -d single -m single"
TEST_DEV=@FSTESTSTESTZNSDEV@
SCRATCH_DEV_POOL="@FSTESTSSCRATCHDEVZNSPOOL@"

[btrfs_simple]
TEST_DEV=@FSTESTSTESTSDEV@
MKFS_OPTIONS="-f -d single -m single"
SCRATCH_DEV_POOL="@FSTESTSSCRATCHDEVPOOL@"

The idea being btrfs_simple will not use zns drives behind the scenes
but btrfs_zns will.

  Luis

^ permalink raw reply	[flat|nested] 55+ messages in thread

* btrfs profiles to test was: (Re: [LSF/MM TOPIC] FS, MM, and stable trees)
  2022-03-12  2:07                   ` Luis Chamberlain
@ 2022-03-14 22:45                     ` Luis Chamberlain
  2022-03-15 14:23                       ` Josef Bacik
  0 siblings, 1 reply; 55+ messages in thread
From: Luis Chamberlain @ 2022-03-14 22:45 UTC (permalink / raw)
  To: Josef Bacik, David Sterba
  Cc: Theodore Ts'o, Greg KH, Amir Goldstein, Sasha Levin, lsf-pc,
	linux-btrfs, linux-fsdevel, Jan Kara, Darrick J. Wong,
	Matthew Wilcox, Goldwyn Rodrigues, Pankaj Raghav,
	Javier González, Damien Le Moal, Johannes Thumshirn,
	Chaitanya Kulkarni, Adam Manzanares, kanchan Joshi,
	Pankaj Raghav, Kanchan Joshi

On Fri, Mar 11, 2022 at 06:07:21PM -0800, Luis Chamberlain wrote:
> On Thu, Mar 10, 2022 at 01:51:22PM -0500, Josef Bacik wrote:
> > [root@fedora-rawhide ~]# cat /xfstests-dev/local.config
> > [btrfs_normal_freespacetree]
> > [btrfs_compress_freespacetree]
> > [btrfs_normal]
> > [btrfs_compression]
> > [kdave]
> > [btrfs_normal_noholes]
> > [btrfs_compress_noholes]
> > [btrfs_noholes_freespacetree]
> 
> + linux-btrfs and zone folks.
> 
> The name needs to be: $FS_$FANCY_SINGLE_SPACED_NAME

Actually using_underscores_is_fine for the hostnames so we can keep
your original except kdave :) and that just gets mapped to btrfs_kdave
for now until you guys figure out what to call it.

Likewise it would be useful if someone goees through these and gives me
hints as to the kernel revision that supports such config, so that if
testing on stable for instance or an older kernel, then the kconfig
option for them does not appear.

> I see nothing for NVMe ZNS.. so how about 
> 
> [btrfs_zns]
> MKFS_OPTIONS="-f -d single -m single"
> TEST_DEV=@FSTESTSTESTZNSDEV@
> SCRATCH_DEV_POOL="@FSTESTSSCRATCHDEVZNSPOOL@"
> 
> [btrfs_simple]
> TEST_DEV=@FSTESTSTESTSDEV@
> MKFS_OPTIONS="-f -d single -m single"
> SCRATCH_DEV_POOL="@FSTESTSSCRATCHDEVPOOL@"
> 
> The idea being btrfs_simple will not use zns drives behind the scenes
> but btrfs_zns will.

I went with:

[btrfs_simple]
[btrfs_simple_zns]

If there are other ZNS profiles we can use / should test please let me know.

Thanks,

  Luis

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: btrfs profiles to test was: (Re: [LSF/MM TOPIC] FS, MM, and stable trees)
  2022-03-14 22:45                     ` btrfs profiles to test was: (Re: [LSF/MM TOPIC] FS, MM, and stable trees) Luis Chamberlain
@ 2022-03-15 14:23                       ` Josef Bacik
  2022-03-15 17:42                         ` Luis Chamberlain
  0 siblings, 1 reply; 55+ messages in thread
From: Josef Bacik @ 2022-03-15 14:23 UTC (permalink / raw)
  To: Luis Chamberlain
  Cc: David Sterba, Theodore Ts'o, Greg KH, Amir Goldstein,
	Sasha Levin, lsf-pc, linux-btrfs, linux-fsdevel, Jan Kara,
	Darrick J. Wong, Matthew Wilcox, Goldwyn Rodrigues,
	Pankaj Raghav, Javier González, Damien Le Moal,
	Johannes Thumshirn, Chaitanya Kulkarni, Adam Manzanares,
	kanchan Joshi, Pankaj Raghav, Kanchan Joshi

On Mon, Mar 14, 2022 at 03:45:28PM -0700, Luis Chamberlain wrote:
> On Fri, Mar 11, 2022 at 06:07:21PM -0800, Luis Chamberlain wrote:
> > On Thu, Mar 10, 2022 at 01:51:22PM -0500, Josef Bacik wrote:
> > > [root@fedora-rawhide ~]# cat /xfstests-dev/local.config
> > > [btrfs_normal_freespacetree]
> > > [btrfs_compress_freespacetree]
> > > [btrfs_normal]
> > > [btrfs_compression]
> > > [kdave]
> > > [btrfs_normal_noholes]
> > > [btrfs_compress_noholes]
> > > [btrfs_noholes_freespacetree]
> > 
> > + linux-btrfs and zone folks.
> > 
> > The name needs to be: $FS_$FANCY_SINGLE_SPACED_NAME
> 
> Actually using_underscores_is_fine for the hostnames so we can keep
> your original except kdave :) and that just gets mapped to btrfs_kdave
> for now until you guys figure out what to call it.
> 

Lol you didn't need to save the name, I just threw that in there because Sterba
wanted me to test something specific for some patch and I just never deleted it.

> Likewise it would be useful if someone goees through these and gives me
> hints as to the kernel revision that supports such config, so that if
> testing on stable for instance or an older kernel, then the kconfig
> option for them does not appear.
> 

I'm cloning this stuff and doing it now, I got fed up trying to find the
performance difference between virtme and libvirt.  If your shit gives me the
right performance and makes it so I don't have to think then I'll be happy
enough to use it.  Thanks,

Josef

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: btrfs profiles to test was: (Re: [LSF/MM TOPIC] FS, MM, and stable trees)
  2022-03-15 14:23                       ` Josef Bacik
@ 2022-03-15 17:42                         ` Luis Chamberlain
  0 siblings, 0 replies; 55+ messages in thread
From: Luis Chamberlain @ 2022-03-15 17:42 UTC (permalink / raw)
  To: Josef Bacik
  Cc: David Sterba, Theodore Ts'o, Greg KH, Amir Goldstein,
	Sasha Levin, lsf-pc, linux-btrfs, linux-fsdevel, Jan Kara,
	Darrick J. Wong, Matthew Wilcox, Goldwyn Rodrigues,
	Pankaj Raghav, Javier González, Damien Le Moal,
	Johannes Thumshirn, Chaitanya Kulkarni, Adam Manzanares,
	kanchan Joshi, Pankaj Raghav, Kanchan Joshi

On Tue, Mar 15, 2022 at 10:23:53AM -0400, Josef Bacik wrote:
> On Mon, Mar 14, 2022 at 03:45:28PM -0700, Luis Chamberlain wrote:
> > On Fri, Mar 11, 2022 at 06:07:21PM -0800, Luis Chamberlain wrote:
> > > On Thu, Mar 10, 2022 at 01:51:22PM -0500, Josef Bacik wrote:
> > > > [root@fedora-rawhide ~]# cat /xfstests-dev/local.config
> > > > [btrfs_normal_freespacetree]
> > > > [btrfs_compress_freespacetree]
> > > > [btrfs_normal]
> > > > [btrfs_compression]
> > > > [kdave]
> > > > [btrfs_normal_noholes]
> > > > [btrfs_compress_noholes]
> > > > [btrfs_noholes_freespacetree]
> > > 
> > > + linux-btrfs and zone folks.
> > > 
> > > The name needs to be: $FS_$FANCY_SINGLE_SPACED_NAME
> > 
> > Actually using_underscores_is_fine for the hostnames so we can keep
> > your original except kdave :) and that just gets mapped to btrfs_kdave
> > for now until you guys figure out what to call it.
> > 
> 
> Lol you didn't need to save the name, I just threw that in there because Sterba
> wanted me to test something specific for some patch and I just never deleted it.

Heh, I figured, I just didn't know what the hell to name that.

> > Likewise it would be useful if someone goees through these and gives me
> > hints as to the kernel revision that supports such config, so that if
> > testing on stable for instance or an older kernel, then the kconfig
> > option for them does not appear.
> > 
> 
> I'm cloning this stuff and doing it now, I got fed up trying to find the
> performance difference between virtme and libvirt.  If your shit gives me the
> right performance and makes it so I don't have to think then I'll be happy
> enough to use it.  Thanks,

Groovy.

  Luis

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [LSF/MM TOPIC] FS, MM, and stable trees
  2022-03-08 16:40     ` Theodore Ts'o
                         ` (2 preceding siblings ...)
  2022-03-09 18:41       ` Luis Chamberlain
@ 2022-03-29 20:24       ` Amir Goldstein
  2022-04-10 15:11         ` Amir Goldstein
  3 siblings, 1 reply; 55+ messages in thread
From: Amir Goldstein @ 2022-03-29 20:24 UTC (permalink / raw)
  To: Theodore Ts'o
  Cc: Greg KH, Sasha Levin, lsf-pc, linux-fsdevel, Jan Kara,
	Darrick J. Wong, Josef Bacik, Matthew Wilcox, Luis R. Rodriguez

On Tue, Mar 8, 2022 at 6:40 PM Theodore Ts'o <tytso@mit.edu> wrote:
>
> On Tue, Mar 08, 2022 at 11:08:48AM +0100, Greg KH wrote:
> > > When one looks at xfstest bug reports on the list for xfs on kernels > v4.19
> > > one has to wonder if using xfs on kernels v5.x.y is a wise choice.
> >
> > That's up to the xfs maintainers to discuss.
> >
> > > Which makes me wonder: how do the distro kernel maintainers keep up
> > > with xfs fixes?
> >
> > Who knows, ask the distro maintainers that use xfs.  What do they do?
>
> This is something which is being worked, so I'm not sure we'll need to
> discuss the specifics of the xfs stable backports at LSF/MM.  I'm
> hopeful that by May, we'll have come to some kind of resolution of
> that topic.
>
> One of my team members has been working with Darrick to set up a set
> of xfs configs[1] recommended by Darrick, and she's stood up an
> automated test spinner using gce-xfstests which can watch a git branch
> and automatically kick off a set of tests whenever it is updated.
> Sasha has also provided her with a copy of his scripts so we can do
> automated cherry picks of commits with Fixes tags.  So the idea is
> that we can, hopefully in a mostly completely automated fashion,

Here is a little gadget I have been working on in my spare time,
that might be able to assist us in the process of selecting stable
patch candidates.

Many times, the relevant information for considering a patch for
stable tree is in the cover letter of the patch series. It is also a lot
more efficient to skim over 23 cover letter subjects than it is to skim
over 103 commits of the xfs pull request for 5.15 [2].

I've added the command "b4 rn" [1] to produce a list of lore links
to patch series from a PULL request to Linus.

This gadget could be improved to interactively select the patch
series from within a PR to be saved into an mbox.

In any case, I do intend to start surveying the xfs patches that got
merged since v5.10 and stage a branch with my own selections, so we
will be able to compare my selections to Shasha's AUTOSEL selections.

Thanks,
Amir.

P.S. The tool sometimes produces links to two different revisions of
the same patch series (e.g. "xfs: feature flag rework" and
"[v3] xfs: rework feature flags"). It didn't bother me enough to check why.

[1] https://github.com/amir73il/b4/commits/release-notes
[2] For example:

PR=164817214223.9489.12483808836905609419.pr-tracker-bot@kernel.org
b4 pr -e -o xfs-5.18.mbx $PR
PR=163060423908.29568.14182828511329643634.pr-tracker-bot@kernel.org
b4 pr -e -o xfs-5.15.mbx $PR
b4 rn -m xfs-5.18.mbx 2>/dev/null
---
Changes in [GIT PULL] xfs: new code for 5.18:
  [https://lore.kernel.org/r/20220323164821.GP8224@magnolia]

- [PATCH] xfs: add missing cmap->br_state = XFS_EXT_NORM update
  [https://lore.kernel.org/r/20220217095542.68085-1-hsiangkao@linux.alibaba.com]

- [PATCH RESEND] xfs: don't generate selinux audit messages for
capability testing
  [https://lore.kernel.org/r/20220301025052.GF117732@magnolia]

- [PATCHSET 0/2] xfs: use setattr_copy to set VFS file attributes
  [https://lore.kernel.org/r/164685372611.495833.8601145506549093582.stgit@magnolia]

- [PATCHSET v3 0/2] xfs: make quota reservations for directory changes
  [https://lore.kernel.org/r/164694920783.1119636.13401244964062260779.stgit@magnolia]

- [PATCHSET v2 0/2] xfs: constify dotdot global variable
  [https://lore.kernel.org/r/164694922267.1119724.17942999738634110525.stgit@magnolia]

- [PATCH 0/7 v4] xfs: log recovery fixes
  [https://lore.kernel.org/r/20220317053907.164160-1-david@fromorbit.com]

---
b4 rn -m xfs-5.15.mbx 2>/dev/null
---
Changes in [GIT PULL] xfs: new code for 5.15:
  [https://lore.kernel.org/r/20210831211847.GC9959@magnolia]

- don't allow disabling quota accounting on a mounted file system v2
  [https://lore.kernel.org/r/20210809065938.1199181-1-hch@lst.de]

- [PATCHSET v9 00/14] xfs: deferred inode inactivation
  [https://lore.kernel.org/r/162812918259.2589546.16599271324044986858.stgit@magnolia]

- [PATCHSET v8 00/20] xfs: deferred inode inactivation
  [https://lore.kernel.org/r/162758423315.332903.16799817941903734904.stgit@magnolia]

- [PATCHSET 0/5] xfs: other stuff for 5.15
  [https://lore.kernel.org/r/162814684332.2777088.14593133806068529811.stgit@magnolia]

- [PATCH] xfs: drop experimental warnings for bigtime and inobtcount
  [https://lore.kernel.org/r/20210707002313.GG11588@locust]

- [PATCH 0/3 v3] xfs, mm: memory allocation improvements
  [https://lore.kernel.org/r/20210714023440.2608690-1-david@fromorbit.com]

- [PATCH v25 00/14] Log Attribute Replay
  [https://lore.kernel.org/r/20211117041613.3050252-1-allison.henderson@oracle.com]

- [PATCH] fs:xfs: cleanup __FUNCTION__ usage
  [https://lore.kernel.org/r/20210711085153.95856-1-dwaipayanray1@gmail.com]

- [PATCH 0/9 v3] xfs: shutdown is a racy mess
  [https://lore.kernel.org/r/20210810051825.40715-1-david@fromorbit.com]

- [PATCH 0/5 v3] xfs: strictly order log start records
  [https://lore.kernel.org/r/20210810052120.41019-1-david@fromorbit.com]

- [PATCH 0/3 v7] xfs: make CIL pipelining work
  [https://lore.kernel.org/r/20210810052257.41308-1-david@fromorbit.com]

- [RFC PATCH 00/16] xfs: Block size > PAGE_SIZE support
  [https://lore.kernel.org/r/20181107063127.3902-1-david@fromorbit.com]

- [PATCHSET 0/3] xfs: fix various bugs in fsmap
  [https://lore.kernel.org/r/162872991654.1220643.136984377220187940.stgit@magnolia]

- [PATCHSET 0/2] xfs: more random tweaks
  [https://lore.kernel.org/r/162872993519.1220748.15526308019664551101.stgit@magnolia]

- [PATCHSET 00/10] xfs: constify btree operations
  [https://lore.kernel.org/r/162881108307.1695493.3416792932772498160.stgit@magnolia]

- [PATCH] xfs: remove support for untagged lookups in xfs_icwalk*
  [https://lore.kernel.org/r/20210813081623.83323-1-hch@lst.de]

- [PATCHSET 00/15] xfs: clean up ftrace field tags and formats
  [https://lore.kernel.org/r/162924373176.761813.10896002154570305865.stgit@magnolia]

- [PATCH 00/16 v3] xfs: rework feature flags
  [https://lore.kernel.org/r/20210818235935.149431-1-david@fromorbit.com]

- [PATCH 0/10] xfs: feature flag rework
  [https://lore.kernel.org/r/20180820044851.414-1-david@fromorbit.com]

- [PATCH 0/3] xfs: clean up buffer cache disk addressing
  [https://lore.kernel.org/r/20210810052851.42312-1-david@fromorbit.com]

- [PATCH] xfs: fix perag structure refcounting error when scrub fails
  [https://lore.kernel.org/r/20210820050647.GW12640@magnolia]

- [RFC PATCH] generic: regression test for a FALLOC_FL_UNSHARE bug in XFS
  [https://lore.kernel.org/r/20210824003835.GD12640@magnolia]

- [RFC PATCH] xfs: test DONTCACHE behavior with the inode cache
  [https://lore.kernel.org/r/20210825230703.GH12640@magnolia]

---

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [LSF/MM TOPIC] FS, MM, and stable trees
  2022-03-29 20:24       ` [LSF/MM TOPIC] FS, MM, and stable trees Amir Goldstein
@ 2022-04-10 15:11         ` Amir Goldstein
  0 siblings, 0 replies; 55+ messages in thread
From: Amir Goldstein @ 2022-04-10 15:11 UTC (permalink / raw)
  To: Theodore Ts'o
  Cc: Greg KH, Sasha Levin, lsf-pc, linux-fsdevel, Jan Kara,
	Darrick J. Wong, Josef Bacik, Matthew Wilcox, Luis R. Rodriguez

On Tue, Mar 29, 2022 at 11:24 PM Amir Goldstein <amir73il@gmail.com> wrote:
>
> On Tue, Mar 8, 2022 at 6:40 PM Theodore Ts'o <tytso@mit.edu> wrote:
> >
> > On Tue, Mar 08, 2022 at 11:08:48AM +0100, Greg KH wrote:
> > > > When one looks at xfstest bug reports on the list for xfs on kernels > v4.19
> > > > one has to wonder if using xfs on kernels v5.x.y is a wise choice.
> > >
> > > That's up to the xfs maintainers to discuss.
> > >
> > > > Which makes me wonder: how do the distro kernel maintainers keep up
> > > > with xfs fixes?
> > >
> > > Who knows, ask the distro maintainers that use xfs.  What do they do?
> >
> > This is something which is being worked, so I'm not sure we'll need to
> > discuss the specifics of the xfs stable backports at LSF/MM.  I'm
> > hopeful that by May, we'll have come to some kind of resolution of
> > that topic.
> >
> > One of my team members has been working with Darrick to set up a set
> > of xfs configs[1] recommended by Darrick, and she's stood up an
> > automated test spinner using gce-xfstests which can watch a git branch
> > and automatically kick off a set of tests whenever it is updated.
> > Sasha has also provided her with a copy of his scripts so we can do
> > automated cherry picks of commits with Fixes tags.  So the idea is
> > that we can, hopefully in a mostly completely automated fashion,
>
> Here is a little gadget I have been working on in my spare time,
> that might be able to assist us in the process of selecting stable
> patch candidates.
>
> Many times, the relevant information for considering a patch for
> stable tree is in the cover letter of the patch series. It is also a lot
> more efficient to skim over 23 cover letter subjects than it is to skim
> over 103 commits of the xfs pull request for 5.15 [2].
>
> I've added the command "b4 rn" [1] to produce a list of lore links
> to patch series from a PULL request to Linus.
>
> This gadget could be improved to interactively select the patch
> series from within a PR to be saved into an mbox.
>
> In any case, I do intend to start surveying the xfs patches that got
> merged since v5.10 and stage a branch with my own selections, so we
> will be able to compare my selections to Shasha's AUTOSEL selections.
>

FYI, I've enhanced the "br rn" gadget to report fstests that were mentioned
on ML discussions on the patch sets, for example:
--
- [PATCH 1/3] mm: Add kvrealloc()
  [https://lore.kernel.org/r/20210714023440.2608690-2-david@fromorbit.com]
  Tests: generic/040 generic/041
--

This can help to understand if some fstests failure seen on LTS baseline
run may already have a fix upstream that could be backported.

You can see example release notes for XFS v5.10..v5.17 [1] and the
subset of fixes [2] manually selected for my 5.10.y backport branch [3].

I am working with Luis so start testing these backports with his
kdevops [4] framework.

Thanks,
Amir.

[1] https://github.com/amir73il/b4/blob/xfs-5.10.y/xfs-5.10..5.17-rn.rst
[2] https://github.com/amir73il/b4/blob/xfs-5.10.y/xfs-5.10..5.17-fixes.rst
[3] https://github.com/amir73il/linux/commits/xfs-5.10.y
[4] https://github.com/mcgrof/kdevops

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [LSF/MM TOPIC] FS, MM, and stable trees
  2022-03-11 22:04               ` Theodore Ts'o
  2022-03-11 22:36                 ` Luis Chamberlain
@ 2022-04-27 18:58                 ` Amir Goldstein
  2022-05-01 16:25                   ` Luis Chamberlain
  1 sibling, 1 reply; 55+ messages in thread
From: Amir Goldstein @ 2022-04-27 18:58 UTC (permalink / raw)
  To: Theodore Ts'o
  Cc: Luis Chamberlain, Sasha Levin, Greg KH, lsf-pc, linux-fsdevel,
	Jan Kara, Darrick J. Wong, Josef Bacik, Matthew Wilcox

On Sat, Mar 12, 2022 at 12:04 AM Theodore Ts'o <tytso@mit.edu> wrote:
>
> On Fri, Mar 11, 2022 at 12:52:41PM -0800, Luis Chamberlain wrote:
> >
> > The only way to move forward with enabling more automation for kernel
> > code integration is through better and improved kernel test automation.
> > And it is *exactly* why I've been working so hard on that problem.
>
> I think we're on the same page here.
>
> > Also let's recall that just because you have your own test framework
> > it does not mean we could not benefit from others testing our
> > filesystems on their own silly hardware at home as well. Yes tons
> > of projects can be used which wrap fstests...
>
> No argument from me!  I'm strongly in favor of diversity in test
> framework automation as well as test environments.
>
> In particular, I think there are some valuable things we can learn
> from each other, in terms of cross polination in terms of features and
> as well as feedback about how easy it is to use a particular test
> framework.
>
> For example: README.md doesn't say anything about running make as root
> when running "make" as kdevops.  At least, I *think* this is why
> running make as kdevops failed:
>
> fatal: [localhost]: FAILED! => {"changed": true, "cmd": ["/usr/sbin/apparmor_status", "--enabled"], "delta": "0:00:00.001426", "end": "2022-03-11 16:23:11.769658", "failed_when_result": true, "rc": 0, "start": "2022-03-11 16:23:11.768232", "stderr": "", "stderr_lines": [], "stdout": "", "stdout_lines": []}
>
> (I do have apparmor installed, but it's currently not enabled.  I
> haven't done more experimentation since I'm a bit scared of running
> "make XXX" as root for any package I randomly download from the net,
> so I haven't explored trying to use kdevops, at least not until I set
> up a sandboxed VM.  :-)
>
> Including the Debian package names that should be installed would also
> be helpful in kdevops/doc/requirements.md.  That's not a problem for
> the experienced Debian developer, but one of my personal goals for
> kvm-xfstests and gce-xfstests is to allow a random graduate student
> who has presented some research file system like Betrfs at the Usenix
> FAST conference to be able to easily run fstests.  And it sounds like
> you have similar goals of "enabling the average user to also easily
> run tests".
>
>
> > but I never found one
> > as easy to use as compiling the kernel and running a few make commands.
>
> I've actually done a lot of work to optimize developer velocity using
> my test framework.  So for example:
>
> kvm-xfstests install-kconfig    # set up a kernel Kconfig suitable for kvm-xfstests and gce-xfstests
> make
> kvm-xfstests smoke     # boot the test appliance VM, using the kernel that was just built
>
> And a user can test a particular stable kernel using a single command
> line (this will checkout a particular kernel, and build it on a build
> VM, and then launch tests in parallel on a dozen or so VM's):
>
> gce-xfstests ltm -c ext4/all -g auto --repo stable.git --commit v5.15.28
>
> ... or if we want to bisect a particular test failure, we might do
> something like this:
>
> gce-xfstests ltm -c ext4 generic/361 --bisect-good v5.15 --bisect-bad v5.16
>
> ... or I can establish a watcher that will automatically build a git
> tree when a branch on a git tree changes:
>
> gce-xfstests ltm -c ext4/4k -g auto --repo next.git --watch master
>
> Granted, this only works on GCE --- but feel free to take these ideas
> and integrate them into kdevops if you feel inspired to do so.  :-)
>
> > There is the concept of results too and a possible way to share things..
> > but this is getting a bit off topic and I don't want to bore people more.
>
> This would be worth chatting about, perhaps at LSF/MM.  xfstests
> already supports junit results files; we could convert it to TAP
> format, but junit has more functionality, so perhaps the right
> approach is to have tools that can support both TAP and junit?  What
> about some way to establish interchange of test artifacts?  i.e.,
> saving the kernel logs, and the generic/NNN.full and
> generic/NNN.out.bad files?
>
> I have a large library of these test results and test artifacts, and
> perhaps others would find it useful if we had a way sharing test
> results between developers, especially we have multiple test
> infrastructures that might be running ext4, f2fs, and xfs tests?
>

Hi Ted,

I penciled a session on "Challenges with running fstests" in the
agenda.

I was hoping that you and Luis could co-lead this session and
present the progress both of you made with your test frameworks.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [LSF/MM TOPIC] FS, MM, and stable trees
  2022-04-27 18:58                 ` Amir Goldstein
@ 2022-05-01 16:25                   ` Luis Chamberlain
  0 siblings, 0 replies; 55+ messages in thread
From: Luis Chamberlain @ 2022-05-01 16:25 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Theodore Ts'o, Sasha Levin, Greg KH, lsf-pc, linux-fsdevel,
	Jan Kara, Darrick J. Wong, Josef Bacik, Matthew Wilcox

On Wed, Apr 27, 2022 at 09:58:53PM +0300, Amir Goldstein wrote:
> On Sat, Mar 12, 2022 at 12:04 AM Theodore Ts'o <tytso@mit.edu> wrote:
> >
> Hi Ted,
> 
> I penciled a session on "Challenges with running fstests" in the
> agenda.
> 
> I was hoping that you and Luis could co-lead this session and
> present the progress both of you made with your test frameworks.

I'm starting to think that since IO has no session yet scheduled
in for this session it may make sense to make this generic about
fstests and blktests. In fact even fio has tests these days which
we should all be running too.

My point though is that I think that it may make sense to have
both IO and fstests share this for perhaps a common:

"Challenges with running fstests and blktests"

  Luis

^ permalink raw reply	[flat|nested] 55+ messages in thread

end of thread, other threads:[~2022-05-01 16:25 UTC | newest]

Thread overview: 55+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-02-12 17:00 [LSF/MM TOPIC] FS, MM, and stable trees Sasha Levin
2019-02-12 21:32 ` Steve French
2019-02-13  7:20   ` Amir Goldstein
2019-02-13  7:37     ` Greg KH
2019-02-13  9:01       ` Amir Goldstein
2019-02-13  9:18         ` Greg KH
2019-02-13 19:25           ` Sasha Levin
2019-02-13 19:52             ` Greg KH
2019-02-13 20:14               ` James Bottomley
2019-02-15  1:50                 ` Sasha Levin
2019-02-15  2:48                   ` James Bottomley
2019-02-16 18:28                     ` Theodore Y. Ts'o
2019-02-21 15:34                       ` Luis Chamberlain
2019-02-21 18:52                         ` Theodore Y. Ts'o
2019-03-20  3:46               ` Jon Masters
2019-03-20  5:06                 ` Greg KH
2019-03-20  6:14                   ` Jon Masters
2019-03-20  6:28                     ` Greg KH
2019-03-20  6:32                       ` Jon Masters
2022-03-08  9:32 ` Amir Goldstein
2022-03-08 10:08   ` Greg KH
2022-03-08 11:04     ` Amir Goldstein
2022-03-08 15:42       ` Luis Chamberlain
2022-03-08 19:06       ` Sasha Levin
2022-03-09 18:57         ` Luis Chamberlain
2022-03-11  5:23           ` Theodore Ts'o
2022-03-11 12:00             ` Jan Kara
2022-03-11 20:52             ` Luis Chamberlain
2022-03-11 22:04               ` Theodore Ts'o
2022-03-11 22:36                 ` Luis Chamberlain
2022-04-27 18:58                 ` Amir Goldstein
2022-05-01 16:25                   ` Luis Chamberlain
2022-03-10 23:59         ` Steve French
2022-03-11  0:36           ` Chuck Lever III
2022-03-11 20:54             ` Luis Chamberlain
2022-03-08 16:40     ` Theodore Ts'o
2022-03-08 17:16       ` Amir Goldstein
2022-03-09  0:43       ` Dave Chinner
2022-03-09 18:41       ` Luis Chamberlain
2022-03-09 18:49         ` Josef Bacik
2022-03-09 19:00           ` Luis Chamberlain
2022-03-09 21:19             ` Josef Bacik
2022-03-10  1:28               ` Luis Chamberlain
2022-03-10 18:51                 ` Josef Bacik
2022-03-10 22:41                   ` Luis Chamberlain
2022-03-11 12:09                     ` Jan Kara
2022-03-11 18:32                       ` Luis Chamberlain
2022-03-12  2:07                   ` Luis Chamberlain
2022-03-14 22:45                     ` btrfs profiles to test was: (Re: [LSF/MM TOPIC] FS, MM, and stable trees) Luis Chamberlain
2022-03-15 14:23                       ` Josef Bacik
2022-03-15 17:42                         ` Luis Chamberlain
2022-03-29 20:24       ` [LSF/MM TOPIC] FS, MM, and stable trees Amir Goldstein
2022-04-10 15:11         ` Amir Goldstein
2022-03-08 10:54   ` Jan Kara
2022-03-09  0:02   ` Dave Chinner

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).