* [ANN] oscheck: wrapper for fstests check.sh - tracking and working with baselines
@ 2018-07-13  2:43 Luis R. Chamberlain
  2018-07-13  8:39 ` Amir Goldstein
  2018-07-13 20:51 ` Matthew Wilcox
  0 siblings, 2 replies; 15+ messages in thread
From: Luis R. Chamberlain @ 2018-07-13  2:43 UTC (permalink / raw)
  To: Linux FS Devel
  Cc: xfs, fstests, Amir Goldstein, Sasha Levin, Sasha Levin,
	Valentin Rothberg, Luis R. Chamberlain

I had volunteered at the last LSF/MM to help with the stable work for
XFS. To help with this, as part of this year's SUSE Hackweek, I first
generalized my own set of scripts for tracking a baseline of results
from fstests [0], and extended them to make it easy to ramp up with
fstests on different distributions. I've also created a respective
baseline of results against these distributions as a further example
of how these scripts and wrapper framework can be used [1]. The
distributions currently supported are:

  * Debian testing
  * OpenSUSE Leap 15.0
  * Fedora 28

The stable work starts with creating a baseline for v4.17.3. The
results are captured in expunge files, which categorize the failures
for the different sections tested. Other than careful manual
inspection of each stable candidate patch, one of the goals will also
be to ensure such stable patches do not regress the baseline. Work is
currently underway to review the first set of stable candidate patches
for v4.17.3; if the patches both pass review and do not regress the
established baseline, I'll post them for further evaluation by the
community.
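For reference, an expunge file is just a list of tests to skip, one
test per line, which can optionally carry an annotation. A minimal
sketch (the entries and comments are illustrative, not actual
baseline results):

# expunges/debian/testing/xfs/unassigned/xfs.txt (hypothetical example)
generic/388 # example annotation: fails intermittently on this section
generic/475 # example annotation: bug URL could go here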

Note that while I used this for XFS, it should be easy to add support
for other filesystems, should folks wish to do something similar. The
current XFS sections being tested are as follows; please let me know
if we should consider extending this further:

# Matches what we expect to be default on the latest xfsprogs
[xfs]
MKFS_OPTIONS='-f -m crc=1,reflink=0,rmapbt=0, -i sparse=0'
USE_EXTERNAL=no
FSTYP=xfs

[xfs_reflink]
MKFS_OPTIONS='-f -m reflink=1,rmapbt=1, -i sparse=1,'
USE_EXTERNAL=no
FSTYP=xfs

[xfs_reflink_1024]
MKFS_OPTIONS='-f -m reflink=1,rmapbt=1, -i sparse=1, -b size=1024,'
USE_EXTERNAL=no
FSTYP=xfs

# For older kernels when we didn't have crc
[xfs_nocrc]
MKFS_OPTIONS='-f -m crc=0,reflink=0,rmapbt=0, -i sparse=0,'
USE_EXTERNAL=no
FSTYP=xfs

[xfs_nocrc_512]
MKFS_OPTIONS='-f -m crc=0,reflink=0,rmapbt=0, -i sparse=0, -b size=512,'
USE_EXTERNAL=no
FSTYP=xfs

# Latest defaults with an external log
[xfs_logdev]
MKFS_OPTIONS="-f -m crc=1,reflink=0,rmapbt=0, -i sparse=0 -lsize=1g"
SCRATCH_LOGDEV=/dev/loop15
USE_EXTERNAL=yes
FSTYP=xfs

# Requires CONFIG_XFS_RT which most distros disable now
[xfs_realtimedev]
MKFS_OPTIONS="-f -lsize=1g"
SCRATCH_LOGDEV=/dev/loop15
SCRATCH_RTDEV=/dev/loop14
USE_EXTERNAL=yes
FSTYP=xfs
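As a quick usage sketch, assuming the sections above are installed as
your fstests configuration (e.g. in fstests' local.config; the exact
path depends on your setup), a single section can then be exercised
with fstests' own section support:

# run everything against one section
./check -s xfs_reflink
# or just one test against it
./check -s xfs_reflink generic/001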

There are notes about possible issues with tests and diffs when using
an external log in the respective expunge files for those sections;
the xfs_logdev and xfs_realtimedev sections likely have a slew of
failures due to unexpected test output.

Also worth noting is that of the distributions tested only Debian
currently still enables CONFIG_XFS_RT, so it may be worth it for
Debian to consider disabling it as well. It's really unclear exactly
who still cares about this and who's really testing CONFIG_XFS_RT
anymore.

To be clear, the results are not in yet for the v4.17.3 work; this is
just the framework which I'll use to next address that. Expect more
updates later once the results and final review are in.

I'm now also pushing results to track the baseline for vanilla
Linux and linux-next.

If there are any questions please let me know.

[0] git://git.kernel.org/pub/scm/fs/xfs/xfstests-dev.git
[1] https://gitlab.com/mcgrof/oscheck

  Luis


* Re: [ANN] oscheck: wrapper for fstests check.sh - tracking and working with baselines
  2018-07-13  2:43 [ANN] oscheck: wrapper for fstests check.sh - tracking and working with baselines Luis R. Chamberlain
@ 2018-07-13  8:39 ` Amir Goldstein
  2018-07-13 16:44   ` Luis R. Chamberlain
  2018-07-13 20:51 ` Matthew Wilcox
  1 sibling, 1 reply; 15+ messages in thread
From: Amir Goldstein @ 2018-07-13  8:39 UTC (permalink / raw)
  To: Luis R. Chamberlain
  Cc: Linux FS Devel, xfs, fstests, Sasha Levin, Sasha Levin,
	Valentin Rothberg

On Fri, Jul 13, 2018 at 5:43 AM, Luis R. Chamberlain <mcgrof@kernel.org> wrote:
> I had volunteered at the last LSF/MM to help with the stable work for
> XFS. To help with this, as part of this year's SUSE Hackweek, I first
> generalized my own set of scripts for tracking a baseline of results
> from fstests [0], and extended them to make it easy to ramp up with
> fstests on different distributions. I've also created a respective
> baseline of results against these distributions as a further example
> of how these scripts and wrapper framework can be used

Hi Luis!

Thanks a lot for doing this work!

Will take me some time to try it out, but see some questions below...

> [1]. The distributions currently supported are:
>
>   * Debian testing
>   * OpenSUSE Leap 15.0
>   * Fedora 28
>
> The stable work starts with creating a baseline for v4.17.3. The
> results are captured in expunge files, which categorize the failures
> for the different sections tested.

So the only "bad" indication is a test failure?
How about indication about a test that started to pass since baseline?
Tests that started to notrun since baseline?
Are we interested in those?

> Other than careful manual
> inspection of each stable candidate patch, one of the goals will also
> be to ensure such stable patches do not regress the baseline. Work is
> currently underway to review the first set of stable candidate patches
> for v4.17.3; if the patches both pass review and do not regress the
> established baseline, I'll post them for further evaluation by the
> community.
>
> Note that while I used this for XFS, it should be easy to add support
> for other filesystems, should folks wish to do something similar. The
> current XFS sections being tested are as follows; please let me know
> if we should consider extending this further:
>
> # Matches what we expect to be default on the latest xfsprogs
> [xfs]
> MKFS_OPTIONS='-f -m crc=1,reflink=0,rmapbt=0, -i sparse=0'
> USE_EXTERNAL=no
> FSTYP=xfs

Please add a LOGWRITES_DEV to all "internal log" configs.
This is needed to utilize the (relatively) new crash consistency tests
(a.k.a. generic/replay) which caught a few nasty bugs.
Fun fact: the fix for stable 4.4 almost got missed, because your system
was not around ;-)
https://marc.info/?l=linux-xfs&m=152852844615666&w=2

I've used a 10GB LOGWRITES_DEV, which seems to be enough
for the current tests.
I don't think that the dmlogwrite tests play well with external logdev,
so we could probably reuse the same device for LOGWRITES_DEV
for configs that don't use SCRATCH_LOGDEV.
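In config terms this amounts to one extra line per internal-log
section, roughly like this (the loop device chosen below is only an
example):

[xfs]
MKFS_OPTIONS='-f -m crc=1,reflink=0,rmapbt=0, -i sparse=0'
LOGWRITES_DEV=/dev/loop13
USE_EXTERNAL=no
FSTYP=xfs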

Thanks,
Amir.


* Re: [ANN] oscheck: wrapper for fstests check.sh - tracking and working with baselines
  2018-07-13  8:39 ` Amir Goldstein
@ 2018-07-13 16:44   ` Luis R. Chamberlain
  2018-07-13 17:46     ` Luis R. Chamberlain
                       ` (3 more replies)
  0 siblings, 4 replies; 15+ messages in thread
From: Luis R. Chamberlain @ 2018-07-13 16:44 UTC (permalink / raw)
  To: Amir Goldstein, Jeff Mahoney
  Cc: Luis R. Chamberlain, Linux FS Devel, xfs, fstests, Sasha Levin,
	Sasha Levin, Valentin Rothberg

On Fri, Jul 13, 2018 at 11:39:55AM +0300, Amir Goldstein wrote:
> On Fri, Jul 13, 2018 at 5:43 AM, Luis R. Chamberlain <mcgrof@kernel.org> wrote:
> > [...]
> >
> > The stable work starts with creating a baseline for v4.17.3. The
> > results are captured in expunge files, which categorize the
> > failures for the different sections tested.
> 
> So the only "bad" indication is a test failure?

That is correct to a certain degree; i.e., if xfsprogs / the kernel
config could run it, we can assume it passed.

> How about indication about a test that started to pass since baseline?

Indeed, that is desirable.

We have a few options. One is to share the entire results directory
for a release / section; however, this is rather big. For instance, a
full v4.17.3 run is about 292 MiB alone. I don't think this scales.
IMHO logs should only be supplied with bug reports, not through this
framework.

The other option is to use -R xunit to generate the report in the
specified format. I have not yet run or tried this; however, IIRC it
does record successful runs? Does it also keep logs? Hopefully not. I'm
assuming it does not as of yet. I should note that if one hits CTRL-C
in the middle, one does not get the results. An alternative was being
worked on by Jeff, which would IIRC sprinkle .ok files for tests which
succeed; you could then just scrape the results directory to determine
which tests passed -- but you run into the same size problem as above.

Since we are establishing a full baseline, and using expunge files
to skip failures, we *should* be able to complete a full run now,
and capture the results in this xunit format. I'll try that out and
see how big the file is.
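Roughly, such a full-section run would look like this (the section
name and expunge file path follow the layout used above and are
illustrative):

./check -s xfs -E expunges/debian/testing/xfs/unassigned/xfs.txt -R xunit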

I think having that *and* the expunge list would work well.

We'd then have to process that file to scrape out which tests
passed, if a user wanted that. Do we have scripts for processing
xunit files?

Having the expunge files separately helps as we can optionally
annotate bug URLs in them. I.e., we should be able to process both
the expunge lists and xunit file to construct a nice db schema to
process results in a more easily viewable manner in the future.

So to establish a baseline, one first manually constructs the expunge
files needed to run a full test. In the future hopefully we can have
a set of scripts to do all this for us.

Once the baseline is in place, a full run with all sections is done
to generate the -R xunit file. This again annotates failures but
also successes.

Thoughts?

> Tests that started to notrun since baseline?

It's unclear if xunit captures this. Otherwise we have some work to do.

> Are we interested in those?

Sure, if we can capture this. Does xunit gather this?

I'd much prefer we tune our kernel to be able to run most tests,
and likewise ensure the dependencies for fstests are met, through
oscheck's helpers.sh which handles --install-deps properly.

A side question is -- do we want to keep track of results separately
per filesystem tools version used? Right now fstests does not annotate
this in the results directory, but perhaps it should.

At least for XFS, the configuration file work should enable future
deployment of the latest xfsprogs on older releases. Before this,
that was rather hard to do due to the differing defaults, so another
option may be to just rely on the assumption that one is using the
latest userspace tools.

Right now I'm using the latest tools on each respective latest
distro. The stable tests are using Debian testing, so whatever
xfsprogs is in Debian testing; right now that is 4.15.1-1.

> > [...]
> >
> > # Matches what we expect to be default on the latest xfsprogs
> > [xfs]
> > MKFS_OPTIONS='-f -m crc=1,reflink=0,rmapbt=0, -i sparse=0'
> > USE_EXTERNAL=no
> > FSTYP=xfs
> 
> Please add a LOGWRITES_DEV to all "internal log" configs.
> This is needed to utilize the (relatively) new crash consistency tests
> (a.k.a. generic/replay) which caught a few nasty bugs.

Will do!

> Fun fact: the fix for stable 4.4 almost got missed, because your system
> was not around ;-)
> https://marc.info/?l=linux-xfs&m=152852844615666&w=2
> 
> I've used a 10GB LOGWRITES_DEV, which seems to be enough
> for the current tests.

Will use that, thanks. Better yet, any chance you can send me a patch?

> I don't think that the dmlogwrite tests play well with external logdev,

I don't think it's the only test which requires review for external
logs. There are quite a few failures when using xfs_logdev and
xfs_realtimedev, and I suspect this has to do with the output
differing, with the tests' golden output not accounting for an
external log being used.

The top of expunges/debian/testing/xfs/unassigned/xfs_logdev.txt has:

# Based on a quick glance at the errors, one possibility is that
# perhaps generic tests do not have the semantics necessary to
# determine if an external log is used in a generic form and adjust
# the test for this. But that does not seem to be the case for all
# tests. A common error for at least two tests seems to be size
# related, and that may be a limitation on the log size, and the
# inability to generically detect the filesystem's maximum allowed
# log size to then invalidate the test. But note that we even have
# XFS specific tests which fail, so if it's a matter of semantics
# this is all just crap and we are missing a lot of work for
# improvement.

> so we could probably reuse the same device for LOGWRITES_DEV
> for configs that don't use SCRATCH_LOGDEV.

True, the recommended setup on oscheck actually is to create
12 x 20 GiB disks; gendisks.sh does this for you on loopback
devices. In practice you end up only needing about 60 GiB for XFS
as it stands today, but indeed we can use any of the spare disks
for LOGWRITES_DEV then.
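The disk setup amounts to something like the following rough sketch
(not gendisks.sh itself; the backing path is illustrative):

# create 12 sparse 20 GiB backing files and attach them to loop devices
mkdir -p /media/truncated
for i in $(seq 0 11); do
	truncate -s 20G /media/truncated/disk$i.img
	losetup /dev/loop$i /media/truncated/disk$i.img
done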

I do wonder how much the extra LOGWRITES_DEV will push up the
storage required per guest; we'll see!

  Luis


* Re: [ANN] oscheck: wrapper for fstests check.sh - tracking and working with baselines
  2018-07-13 16:44   ` Luis R. Chamberlain
@ 2018-07-13 17:46     ` Luis R. Chamberlain
  2018-07-13 20:40     ` Jeff Mahoney
                       ` (2 subsequent siblings)
  3 siblings, 0 replies; 15+ messages in thread
From: Luis R. Chamberlain @ 2018-07-13 17:46 UTC (permalink / raw)
  To: Luis R. Chamberlain
  Cc: Amir Goldstein, Jeff Mahoney, Linux FS Devel, xfs, fstests,
	Sasha Levin, Sasha Levin, Valentin Rothberg

The other thing I forgot to mention is annotations for failure rates.

For now I'll just indicate it as a ratio as part of the ending
comment for an expunge entry. For instance, I just ran into a failure
with a set of stable patches for generic/475; however, after
re-running the test it succeeded. So the test does not always fail.
I'd like to annotate this.

I now have to go and re-test with oscheck's naggy-check.sh a few
times as follows:

./naggy-check.sh -s xfs_nocrc_512 -f generic/475

This will run the test in a loop until it fails, and I can use this
as an initial failure-rate hint for the tester. If I wanted to get
more accurate I could use an average of a few runs with naggy-check.sh.

In practice I typically am only sure a test succeeds if it passes at
least 1000 times (one can use -c 1000 with naggy-check.sh).
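Conceptually that amounts to a retry loop like the following (a
sketch of the idea, not naggy-check.sh's actual implementation):

# keep re-running one test against a section until it fails
count=0
while ./check -s xfs_nocrc_512 generic/475; do
	count=$((count + 1))
done
echo "test passed $count times before failing"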

I will then also have to go back to the stable kernel without the
patches and verify that at least one failure was visible before. This
will confirm this is indeed not a regression. Likewise I'll have to
test this with the other sections and see if it is observed there as
well.

  Luis


* Re: [ANN] oscheck: wrapper for fstests check.sh - tracking and working with baselines
  2018-07-13 16:44   ` Luis R. Chamberlain
  2018-07-13 17:46     ` Luis R. Chamberlain
@ 2018-07-13 20:40     ` Jeff Mahoney
  2018-07-13 20:50       ` Luis R. Chamberlain
  2018-07-14  0:52     ` Theodore Y. Ts'o
  2018-07-14  6:56     ` Eryu Guan
  3 siblings, 1 reply; 15+ messages in thread
From: Jeff Mahoney @ 2018-07-13 20:40 UTC (permalink / raw)
  To: Luis R. Chamberlain, Amir Goldstein
  Cc: Linux FS Devel, xfs, fstests, Sasha Levin, Sasha Levin,
	Valentin Rothberg


On 7/13/18 12:44 PM, Luis R. Chamberlain wrote:
> On Fri, Jul 13, 2018 at 11:39:55AM +0300, Amir Goldstein wrote:
>> [...]
>>
>> So the only "bad" indication is a test failure?
> 
> That is correct to a certain degree; i.e., if xfsprogs / the kernel
> config could run it, we can assume it passed.
> 
>> How about indication about a test that started to pass since baseline?
> 
> Indeed, that is desirable.
> 
> We have a few options. One is to share the entire results directory
> for a release / section; however, this is rather big. For instance, a
> full v4.17.3 run is about 292 MiB alone. I don't think this scales.
> IMHO logs should only be supplied with bug reports, not through this
> framework.
> 
> The other option is to use -R xunit to generate the report in the
> specified format. I have not yet run or tried this; however, IIRC it
> does record successful runs? Does it also keep logs? Hopefully not. I'm
> assuming it does not as of yet. I should note that if one hits CTRL-C
> in the middle, one does not get the results. An alternative was being
> worked on by Jeff, which would IIRC sprinkle .ok files for tests which
> succeed; you could then just scrape the results directory to determine
> which tests passed -- but you run into the same size problem as above.

Eryu didn't like that idea, so I abandoned it.  What I have now is a -R
files mode that creates a bunch of files with the goal of just archiving
the results for later comparison or import into a results db.

For each test, there are:
$seq.result.start.txt - start timestamp
$seq.result.stop.txt - stop timestamp
$seq.result.result.txt - simple result: pass/fail/expunged/notrun
$seq.result.detail.txt - contains the contents of $seq.notrun/$seq.expunged
$seq.result.{dmesg,kmemleak,full,check}.txt - contains the contents of
the corresponding files
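Given that layout, comparing two archived runs for changed results
could be as simple as this sketch (the baseline/ and candidate/
directory names are hypothetical):

# list tests whose simple result changed between two archived runs
cd baseline
find . -name '*.result.result.txt' | while read -r f; do
	old=$(cat "$f")
	new=$(cat "../candidate/$f" 2>/dev/null || echo missing)
	[ "$old" != "$new" ] && echo "${f%.result.result.txt}: $old -> $new"
done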

As an aside, IIRC, -R xunit doesn't catch all kinds of failures.  Also,
as you mentioned, if it's interrupted, all results are lost.  This makes
it difficult to identify test failures that crashed or hung the test system.

I have some basic scripts that parse the output and generate an HTML
report/table (and it does do what Amir asks WRT tests that started passing).

-Jeff


-- 
Jeff Mahoney
SUSE Labs



* Re: [ANN] oscheck: wrapper for fstests check.sh - tracking and working with baselines
  2018-07-13 20:40     ` Jeff Mahoney
@ 2018-07-13 20:50       ` Luis R. Chamberlain
  2018-07-13 21:03         ` Jeff Mahoney
  0 siblings, 1 reply; 15+ messages in thread
From: Luis R. Chamberlain @ 2018-07-13 20:50 UTC (permalink / raw)
  To: Jeff Mahoney
  Cc: Luis R. Chamberlain, Amir Goldstein, Linux FS Devel, xfs,
	fstests, Sasha Levin, Sasha Levin, Valentin Rothberg

On Fri, Jul 13, 2018 at 04:40:39PM -0400, Jeff Mahoney wrote:
> On 7/13/18 12:44 PM, Luis R. Chamberlain wrote:
> > On Fri, Jul 13, 2018 at 11:39:55AM +0300, Amir Goldstein wrote:
> > [...]
> > 
> > The other option is to use -R xunit to generate the report in the
> > specified format. I have not yet run or tried this; however, IIRC it
> > does record successful runs? Does it also keep logs? Hopefully not. I'm
> > assuming it does not as of yet. I should note that if one hits CTRL-C
> > in the middle, one does not get the results. An alternative was being
> > worked on by Jeff, which would IIRC sprinkle .ok files for tests which
> > succeed; you could then just scrape the results directory to determine
> > which tests passed -- but you run into the same size problem as above.
> 
> Eryu didn't like that idea, so I abandoned it.  What I have now is a -R
> files mode that creates a bunch of files with the goal of just archiving
> the results for later comparison or import into a results db.
> 
> For each test, there are:
> $seq.result.start.txt - start timestamp
> $seq.result.stop.txt - stop timestamp
> $seq.result.result.txt - simple result: pass/fail/expunged/notrun
> $seq.result.detail.txt - contains the contents of $seq.notrun/$seq.expunged
> $seq.result.{dmesg,kmemleak,full,check}.txt - contains the contents of
> the corresponding files

This is sexy; it also lets the person interpreting the results opt
in or not for the actual full log of the output. You pick and choose
what info you want.

This is indeed nice.

> As an aside, IIRC, -R xunit doesn't catch all kinds of failures.  Also,
> as you mentioned, if it's interrupted, all results are lost.  This makes
> it difficult to identify test failures that crashed or hung the test system.

OK so indeed not my preference.

> I have some basic scripts that parse the output and generate an HTML
> report/table (and it does do what Amir asks WRT tests that started passing).

These scripts, are they for parsing your new -R files output?

I take it the patches are still being worked on?

  Luis


* Re: [ANN] oscheck: wrapper for fstests check.sh - tracking and working with baselines
  2018-07-13  2:43 [ANN] oscheck: wrapper for fstests check.sh - tracking and working with baselines Luis R. Chamberlain
  2018-07-13  8:39 ` Amir Goldstein
@ 2018-07-13 20:51 ` Matthew Wilcox
  2018-07-13 20:59   ` Luis R. Chamberlain
  1 sibling, 1 reply; 15+ messages in thread
From: Matthew Wilcox @ 2018-07-13 20:51 UTC (permalink / raw)
  To: Luis R. Chamberlain
  Cc: Linux FS Devel, xfs, fstests, Amir Goldstein, Sasha Levin,
	Sasha Levin, Valentin Rothberg, Ross Zwisler

On Thu, Jul 12, 2018 at 07:43:08PM -0700, Luis R. Chamberlain wrote:
> Note that while I used this for XFS, it should be easy to add support
> for other filesystems, should folks wish to do something similar. The
> current XFS sections being tested are as follows; please let me know
> if we should consider extending this further:

I think we need an xfs_dax section too.  It's still ridiculously hard
to set up a DAX test environment though.  The best I've been able to
do is now merged into Kent's ktest -- but you're not based on that,
so I'll try and get your oscheck set up to work with DAX.  Or maybe Ross
can do it since he's actually been able to get 2MB pages working and I
still haven't :-(
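Such a section could look roughly like the sketch below, given real
or emulated pmem devices (the device paths and options here are
assumptions, and a DAX-capable kernel is a prerequisite):

[xfs_dax]
FSTYP=xfs
MKFS_OPTIONS='-f'
MOUNT_OPTIONS='-o dax'
TEST_DEV=/dev/pmem0
SCRATCH_DEV=/dev/pmem1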


* Re: [ANN] oscheck: wrapper for fstests check.sh - tracking and working with baselines
  2018-07-13 20:51 ` Matthew Wilcox
@ 2018-07-13 20:59   ` Luis R. Chamberlain
  2018-07-14 22:21     ` Matthew Wilcox
  0 siblings, 1 reply; 15+ messages in thread
From: Luis R. Chamberlain @ 2018-07-13 20:59 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Luis R. Chamberlain, Linux FS Devel, xfs, fstests,
	Amir Goldstein, Sasha Levin, Sasha Levin, Valentin Rothberg,
	Ross Zwisler

On Fri, Jul 13, 2018 at 01:51:54PM -0700, Matthew Wilcox wrote:
> On Thu, Jul 12, 2018 at 07:43:08PM -0700, Luis R. Chamberlain wrote:
> > Note that while I used this for XFS, it should be easy to add support
> > for other filesystems, should folks wish to do something similar. The
> > current XFS sections being tested are as follows; please let me know
> > if we should consider extending this further:
> 
> I think we need an xfs_dax section too.

Indeed! Thanks for reminding me about that.

> It's still ridiculously hard
> to set up a DAX test environment though. 

I was under the impression we actually need real hardware for that;
if you git grep for XXX you will see a section to add DAX is there,
but I skipped those tests as I thought we needed real hardware for
it.

> The best I've been able to
> do is now merged into Kent's ktest -- but you're not based on that,
> so I'll try and get your oscheck set up to work with DAX.  Or maybe Ross
> can do it since he's actually been able to get 2MB pages working and I
> still haven't :-(

Patches and new sections to cover more ground indeed are appreciated!

  Luis


* Re: [ANN] oscheck: wrapper for fstests check.sh - tracking and working with baselines
  2018-07-13 20:50       ` Luis R. Chamberlain
@ 2018-07-13 21:03         ` Jeff Mahoney
  0 siblings, 0 replies; 15+ messages in thread
From: Jeff Mahoney @ 2018-07-13 21:03 UTC (permalink / raw)
  To: Luis R. Chamberlain
  Cc: Amir Goldstein, Linux FS Devel, xfs, fstests, Sasha Levin,
	Sasha Levin, Valentin Rothberg


On 7/13/18 4:50 PM, Luis R. Chamberlain wrote:
> On Fri, Jul 13, 2018 at 04:40:39PM -0400, Jeff Mahoney wrote:
>> [...]
>>
>> Eryu didn't like that idea, so I abandoned it.  What I have now is a -R
>> files mode that creates a bunch of files with the goal of just archiving
>> the results for later comparison or import into a results db.
>>
>> For each test, there are:
>> $seq.result.start.txt - start timestamp
>> $seq.result.stop.txt - stop timestamp
>> $seq.result.result.txt - simple result: pass/fail/expunged/notrun
>> $seq.result.detail.txt - contains the contents of $seq.notrun/$seq.expunged
>> $seq.result.{dmesg,kmemleak,full,check}.txt - contains the contents of
>> the corresponding files
> 
> This is sexy; it also lets the person interpreting the results opt
> in or not for the actual full log of the output. You pick and choose
> what info you want.
> 
> This is indeed nice.
> 
>> As an aside, IIRC, -R xunit doesn't catch all kinds of failures.  Also,
>> as you mentioned, if it's interrupted, all results are lost.  This makes
>> it difficult to identify test failures that crashed or hung the test system.
> 
> OK so indeed not my preference.
> 
>> I have some basic scripts that parse the output and generate an HTML
>> report/table (and it does do what Amir asks WRT tests that started passing).
> 
> These scripts, are they for parsing your new -R files output?

Yep.

> I take it the patches are still being worked on?

Yeah.  They just need a bit of review and cleaning up and I can post them.

-Jeff

-- 
Jeff Mahoney
SUSE Labs



* Re: [ANN] oscheck: wrapper for fstests check.sh - tracking and working with baselines
  2018-07-13 16:44   ` Luis R. Chamberlain
  2018-07-13 17:46     ` Luis R. Chamberlain
  2018-07-13 20:40     ` Jeff Mahoney
@ 2018-07-14  0:52     ` Theodore Y. Ts'o
  2018-07-14  6:56     ` Eryu Guan
  3 siblings, 0 replies; 15+ messages in thread
From: Theodore Y. Ts'o @ 2018-07-14  0:52 UTC (permalink / raw)
  To: Luis R. Chamberlain
  Cc: Amir Goldstein, Jeff Mahoney, Linux FS Devel, xfs, fstests,
	Sasha Levin, Sasha Levin, Valentin Rothberg

On Fri, Jul 13, 2018 at 09:44:20AM -0700, Luis R. Chamberlain wrote:
> 
> We have a few options. One is to share the entire results directory
> for a release / section; however, this is rather big. For instance, a
> full v4.17.3 run is about 292 MiB alone. I don't think this scales.
> IMHO logs should only be supplied with bug reports, not through this
> framework.

The results directory compresses fairly well.  A complete ext4 run (with
the configurations defined for gce-xfstests) is 101MB uncompressed,
and 2.3 MB as a tar.xz file.  That's only 6 cents a month in Google
Cloud Storage, and for me it's worth it to keep them; it's
occasionally been interesting.
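For example (the archive name is illustrative):

# stash one run's results for later comparison
tar -cJf results-v4.17.3-xfs.tar.xz results/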

> The other option is to use -R xunit to generate the report in the
> specified format. I have not yet run or tried this; however, IIRC it
> does record successful runs?

Yes, it does.

> Does it also keep logs? Hopefully not.

It does include some information, but not all of the information that
might be in the results directory.

    % ls /tmp/results-ltm-20180709000722/ab/ext4/results-1k/results.xml 
    108 /tmp/results-ltm-20180709000722/ab/ext4/results-1k/results.xml

    vs

    % du -sh /tmp/results-ltm-20180709000722/ab/ext4/results-1k
    3.0M	/tmp/results-ltm-20180709000722/ab/ext4/results-1k

> Having the expunge files separately helps as we can optionally
> annotate bug URLs in them. I.e., we should be able to process both
> the expunge lists and xunit file to construct a nice db schema to
> process results in a more easily viewable manner in the future.

I do this.  You can see an example here with my annotations:

https://github.com/tytso/xfstests-bld/blob/master/kvm-xfstests/test-appliance/files/root/fs/ext4/exclude

> So to establish a baseline, one first manually constructs the expunge
> files needed to run a full test. In the future hopefully we can have
> a set of scripts to do all this for us.

Yep, I do this by using a command-line option when I run gce-xfstests
or kvm-xfstests to skip using the exclude files.  It's definitely
useful.

> We'd then have to process that file to scrape out which tests
> passed, if a user wanted that. Do we have scripts for processing
> xunit files?

I have some scripts which you may find useful.   They can be found here:

https://github.com/tytso/xfstests-bld/tree/master/kvm-xfstests/test-appliance/files/usr/local/bin
https://github.com/tytso/xfstests-bld/tree/master/kvm-xfstests/test-appliance/files/usr/lib/python2.7

This is what I use to parse through the xunit xml files to generate
summaries like this:

ext4/4k: 430 tests, 1 failures, 43 skipped, 6565 seconds
  Failures: generic/388 
ext4/1k: 441 tests, 7 failures, 55 skipped, 7985 seconds
  Failures: ext4/033 generic/018 generic/383 generic/388 generic/454 
    generic/475 generic/476 
ext4/encrypt: 495 tests, 121 skipped, 4081 seconds
ext4/nojournal: 472 tests, 1 failures, 88 skipped, 4700 seconds
  Failures: ext4/301 
ext4/ext3conv: 429 tests, 1 failures, 43 skipped, 5965 seconds
  Failures: generic/388 
ext4/adv: 434 tests, 2 failures, 49 skipped, 5142 seconds
  Failures: generic/399 generic/477 
ext4/dioread_nolock: 429 tests, 1 failures, 43 skipped, 5870 seconds
  Failures: generic/388 
ext4/data_journal: 476 tests, 2 failures, 91 skipped, 6832 seconds
  Failures: generic/388 generic/475 
ext4/bigalloc: 414 tests, 11 failures, 50 skipped, 6963 seconds
  Failures: ext4/033 generic/204 generic/219 generic/235 generic/273 
    generic/388 generic/456 generic/472 generic/494 generic/495 
    generic/496 
ext4/bigalloc_1k: 428 tests, 11 failures, 64 skipped, 5458 seconds
  Failures: ext4/033 generic/204 generic/235 generic/273 generic/383 
    generic/388 generic/454 generic/472 generic/494 generic/495 
    generic/496 
Totals: 3801 tests, 647 skipped, 37 failures, 0 errors, 59147s
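Even without those scripts, rough totals can be scraped straight from
the XML (a quick-and-dirty sketch; it assumes the <testcase>,
<failure> and <skipped> elements fstests emits appear on separate
lines):

total=$(grep -c '<testcase' results.xml)
failures=$(grep -c '<failure' results.xml)
skipped=$(grep -c '<skipped' results.xml)
echo "$total tests, $failures failures, $skipped skipped"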

> > Tests that started to notrun since baseline?
> 
> It's unclear if xunit captures this. Otherwise we have some work to do.

There are software packages that will process xunit XML files, store
them into a database and then generate reports against a defined
baseline.  They'll also do fancy graphs, and some of them will show
flaky tests, etc.  I haven't had time to investigate them, though;
if you do find some cool tools to process the xunit files, I'd
definitely be interested.

Cheers,

						- Ted


* Re: [ANN] oscheck: wrapper for fstests check.sh - tracking and working with baselines
  2018-07-13 16:44   ` Luis R. Chamberlain
                       ` (2 preceding siblings ...)
  2018-07-14  0:52     ` Theodore Y. Ts'o
@ 2018-07-14  6:56     ` Eryu Guan
  3 siblings, 0 replies; 15+ messages in thread
From: Eryu Guan @ 2018-07-14  6:56 UTC (permalink / raw)
  To: Luis R. Chamberlain
  Cc: Amir Goldstein, Jeff Mahoney, Linux FS Devel, xfs, fstests,
	Sasha Levin, Sasha Levin, Valentin Rothberg

On Fri, Jul 13, 2018 at 09:44:20AM -0700, Luis R. Chamberlain wrote:
> On Fri, Jul 13, 2018 at 11:39:55AM +0300, Amir Goldstein wrote:
> > [...]
> > >
> > > The stable work starts with creating a baseline for v4.17.3. The
> > > results are captured in expunge files, which categorize the
> > > failures for the different sections tested.
> > 
> > So the only "bad" indication is a test failure?
> 
> That is correct to a certain degree; i.e., if xfsprogs / the kernel
> config could run it, we can assume it passed.
> 
> > How about indication about a test that started to pass since baseline?
> 
> Indeed, that is desirable.
> 
> We have a few options. One is to share the entire results directory
> for a release / section; however, this is rather big. For instance, a
> full v4.17.3 run is about 292 MiB alone. I don't think this scales.
> IMHO logs should only be supplied with bug reports, not through this
> framework.
> 
> The other option is to use -R xunit to generate the report in the
> specified format. I have not yet run or tried this; however, IIRC it
> does record successful runs? Does it also keep logs? Hopefully not. I'm

Yes, it records successful runs. For logs, it only keeps the diff
output and .out.bad files of failed tests; no logs are saved for
passed tests.

> assuming it does not as of yet. I should note that if one hits
> CTRL-C in the middle, one does not get the results. An alternative
> was being worked on

Yeah, xunit results are not saved on ctrl-c. I think we could extend the
current signal handlers to also generate xunit results. But I'm not sure
if it's easy to do.

> by Jeff, which would IIRC sprinkle .ok files for tests which succeed;
> you could then just scrape the results directory to determine which
> tests passed -- but you run into the same size problem as above.
> 
> Since we are establishing a full baseline, and using expunge files
> to skip failures, we *should* be able to complete a full run now,
> and capture the results in this xunit format. I'll try that out and
> see how big the file is.
> 
> I think having that *and* the expunge list would work well.
> 
> We'd then have to process that file to scrape out which tests
> passed, if a user wanted that. Do we have scripts for processing
> xunit files?

Not in fstests source code.

> 
> Having the expunge files separately helps as we can optionally
> annotate bug URLs in them. I.e., we should be able to process both
> the expunge lists and xunit file to construct a nice db schema to
> process results in a more easily viewable manner in the future.
> 
> So to establish a baseline, one first manually constructs the expunge
> files needed to run a full test. In the future hopefully we can have
> a set of scripts to do all this for us.
> 
> Once the baseline is in place, a full run with all sections is done
> to generate the -R xunit file. This again annotates failures but
> also successes.
> 
> Thoughts?
> 
> > Tests that started to notrun since baseline?
> 
> It's unclear if xunit captures this. Otherwise we have some work to do.

Yes, notrun tests are recorded as "skipped" (along with a reason
message) in xunit result file.

> 
> > Are we interested in those?
> 
> Sure, if we can capture this. Does xunit gather this?
> 
> I'd much prefer we tune our kernel to be able to run most tests,
> and likewise ensure the dependencies for fstests are met, through
> oscheck's helpers.sh which handles --install-deps properly.
> 
> A side question is -- do we want to keep track of results separately
> per filesystem tools version used? Right now fstests does not annotate
> this in the results directory, but perhaps it should.

The xunit file currently records variables like MKFS_OPTIONS,
MOUNT_OPTIONS etc.; I think it should be easy to dump more version
info there.

Thanks,
Eryu


* Re: [ANN] oscheck: wrapper for fstests check.sh - tracking and working with baselines
  2018-07-13 20:59   ` Luis R. Chamberlain
@ 2018-07-14 22:21     ` Matthew Wilcox
  2018-12-03 23:41       ` Luis Chamberlain
                         ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: Matthew Wilcox @ 2018-07-14 22:21 UTC (permalink / raw)
  To: Luis R. Chamberlain
  Cc: Linux FS Devel, xfs, fstests, Amir Goldstein, Sasha Levin,
	Sasha Levin, Valentin Rothberg, Ross Zwisler, Kent Overstreet

On Fri, Jul 13, 2018 at 01:59:31PM -0700, Luis R. Chamberlain wrote:
> > It's still ridiculously hard
> > to set up a DAX test environment though. 
> 
> > I was under the impression we actually need real hardware for that;
> > if you git grep for XXX you will see a section to add DAX is there,
> > but I skipped those tests as I thought we needed real hardware for
> > it.

qemu has the ability to emulate having real hardware ;-)  Here's
the patch that sets that up in ktest:

https://github.com/koverstreet/ktest/commit/16aa8b2cb68ad152ddebd66e40d633fc675d9796
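The gist of it is an emulated NVDIMM, roughly like this (memory sizes
and the backing file path are illustrative; the guest also needs the
usual NVDIMM/DAX kernel config):

qemu-system-x86_64 -machine pc,nvdimm=on \
	-m 4G,slots=2,maxmem=8G \
	-object memory-backend-file,id=mem1,share=on,mem-path=/tmp/nvdimm.img,size=2G \
	-device nvdimm,id=nvdimm1,memdev=mem1 \
	... # plus the usual disk, kernel and console arguments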

> > The best I've been able to
> > do is now merged into Kent's ktest -- but you're not based on that,
> > so I'll try and get your oscheck set up to work with DAX.  Or maybe Ross
> > can do it since he's actually been able to get 2MB pages working and I
> > still haven't :-(
> 
> Patches and new sections to cover more ground indeed are appreciated!

I feel like we need to merge ktest and oscheck.  oscheck assumes that you
know how to set up qemu, and ktest takes care of setting up qemu for you.
I don't think it's possible to set up DAX testing in the current oscheck
framework ... but I think it might be possible to turn oscheck into a
set of ktest tests.


* Re: [ANN] oscheck: wrapper for fstests check.sh - tracking and working with baselines
  2018-07-14 22:21     ` Matthew Wilcox
@ 2018-12-03 23:41       ` Luis Chamberlain
  2018-12-04 11:01       ` Kent Overstreet
  2019-08-16 17:34       ` Luis Chamberlain
  2 siblings, 0 replies; 15+ messages in thread
From: Luis Chamberlain @ 2018-12-03 23:41 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Linux FS Devel, xfs, fstests, Amir Goldstein, Sasha Levin,
	Sasha Levin, Valentin Rothberg, Ross Zwisler, Kent Overstreet,
	Brendan Higgins

On Sat, Jul 14, 2018 at 03:21:15PM -0700, Matthew Wilcox wrote:
> On Fri, Jul 13, 2018 at 01:59:31PM -0700, Luis R. Chamberlain wrote:
> > > The best I've been able to
> > > do is now merged into Kent's ktest -- but you're not based on that,
> > > so I'll try and get your oscheck set up to work with DAX.  Or maybe Ross
> > > can do it since he's actually been able to get 2MB pages working and I
> > > still haven't :-(
> > 
> > Patches and new sections to cover more ground indeed are appreciated!
> 
> I feel like we need to merge ktest and oscheck.  oscheck assumes that you
> know how to set up qemu, and ktest takes care of setting up qemu for you.
> I think it might be possible to turn oscheck into a set of ktest
> tests.

Everyone uses their own qemu-based solution; it just so happens that
I was also not satisfied with what I saw and wrote my own solution,
kvm-boot [0].

I'm sure others use other things too.

But... a qemu-less solution for a lot of tests would be great as well,
and for this I believe kunit [1] seems like a great possibility worth
exploring in the future.

[0] https://gitlab.com/mcgrof/kvm-boot
[1] https://patchwork.kernel.org/patch/10704147/

  Luis


* Re: [ANN] oscheck: wrapper for fstests check.sh - tracking and working with baselines
  2018-07-14 22:21     ` Matthew Wilcox
  2018-12-03 23:41       ` Luis Chamberlain
@ 2018-12-04 11:01       ` Kent Overstreet
  2019-08-16 17:34       ` Luis Chamberlain
  2 siblings, 0 replies; 15+ messages in thread
From: Kent Overstreet @ 2018-12-04 11:01 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Luis R. Chamberlain, Linux FS Devel, xfs, fstests,
	Amir Goldstein, Sasha Levin, Sasha Levin, Valentin Rothberg,
	Ross Zwisler

On Sat, Jul 14, 2018 at 03:21:15PM -0700, Matthew Wilcox wrote:
> On Fri, Jul 13, 2018 at 01:59:31PM -0700, Luis R. Chamberlain wrote:
> > > It's still ridiculously hard
> > > to set up a DAX test environment though. 
> > 
> > > I was under the impression we actually need real hardware for that;
> > > if you git grep for XXX you will see a section to add DAX is there,
> > > but I skipped those tests as I thought we needed real hardware for
> > > it.
> 
> qemu has the ability to emulate having real hardware ;-)  Here's
> the patch that sets that up in ktest:
> 
> https://github.com/koverstreet/ktest/commit/16aa8b2cb68ad152ddebd66e40d633fc675d9796
> 
> > > The best I've been able to
> > > do is now merged into Kent's ktest -- but you're not based on that,
> > > so I'll try and get your oscheck set up to work with DAX.  Or maybe Ross
> > > can do it since he's actually been able to get 2MB pages working and I
> > > still haven't :-(
> > 
> > Patches and new sections to cover more ground indeed are appreciated!
> 
> I feel like we need to merge ktest and oscheck.  oscheck assumes that you
> know how to set up qemu, and ktest takes care of setting up qemu for you.
> I don't think it's possible to set up DAX testing in the current oscheck
> framework ... but I think it might be possible to turn oscheck into a
> set of ktest tests.

Matthew, do you have any thoughts on what merging ktest with xfstests would look
like?

I'd be willing to spend some time on it; it would make my life easier
since I use both heavily, but there are some impedance mismatches to
sort through.

Mainly, ktest needs some configuration, and right now it gets most of
that from the test itself - e.g. how many scratch devices to create and
how big. Tests also declare what kernel config options they require,
which is a really useful feature for reducing friction when running
tests and making automation easier.
If we can come up with a solution for that, merging ktest and build-test-kernel
into xfstests shouldn't be too hard.
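To make the mismatch concrete, a ktest-style test declares its own
requirements up front, along these lines (the directive names below
are illustrative of the style, not necessarily ktest's exact API):

#!/usr/bin/env bash
# hypothetical ktest-style test: the test itself declares the kernel
# config options and scratch devices it needs
require-kernel-config XFS_FS
config-scratch-devs 4G
config-scratch-devs 4G

run_test()
{
	./check xfs/001
}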


* Re: [ANN] oscheck: wrapper for fstests check.sh - tracking and working with baselines
  2018-07-14 22:21     ` Matthew Wilcox
  2018-12-03 23:41       ` Luis Chamberlain
  2018-12-04 11:01       ` Kent Overstreet
@ 2019-08-16 17:34       ` Luis Chamberlain
  2 siblings, 0 replies; 15+ messages in thread
From: Luis Chamberlain @ 2019-08-16 17:34 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Linux FS Devel, xfs, fstests, Amir Goldstein, Sasha Levin,
	Sasha Levin, Valentin Rothberg, Ross Zwisler, Kent Overstreet

On Sat, Jul 14, 2018 at 03:21:15PM -0700, Matthew Wilcox wrote:
> On Fri, Jul 13, 2018 at 01:59:31PM -0700, Luis R. Chamberlain wrote:
> > > It's still ridiculously hard
> > > to set up a DAX test environment though. 
> 
> > > The best I've been able to
> > > do is now merged into Kent's ktest -- but you're not based on that,
> > > so I'll try and get your oscheck set up to work with DAX.  Or maybe Ross
> > > can do it since he's actually been able to get 2MB pages working and I
> > > still haven't :-(
> > 
> > Patches and new sections to cover more ground indeed are appreciated!
> 
> I feel like we need to merge ktest and oscheck.

In the end I disagreed.

> oscheck assumes that you
> know how to set up qemu, and ktest takes care of setting up qemu for you.

I really disliked all the stupid hacks we had in both my and Kent's
solutions. So I wrote a proper modern devops environment for Linux
kernel development which, from an architectural perspective, is
agnostic to your OS and virtualization environment, whether that be
local or cloud.

Addressing both cloud and local virtual environments proved more
difficult and took a bit of time. But with a bit of patience, I found
something suitable, and better than just hacks put together.

It relies on ansible, vagrant and terraform. The latter two
unfortunately rely on Ruby...  Let me be clear though, I have my own
reservations about relying on solutions which rely on Ruby... but I
find that startups *should* do a better job than a few kernel
developers writing shell hacks for their own preferred virtual
environment. With a bit of proper ... nudging... I think we can steer
things in the right direction. vagrant / terraform are at least
perhaps more usable and popular than a few shell hacks.

oscheck now embraces this solution, and you don't need to know much
about setting up qemu; it even supports running on OS X. I've
announced the effort on lkml, as it turns out the nuts and bolts of
the generic setup are actually a more common goal than just
filesystems. The results:
https://people.kernel.org/mcgrof/kdevops-a-devops-framework-for-linux-kernel-development

  Luis
