linux-kernel.vger.kernel.org archive mirror
* Towards 4.14 LTS
@ 2017-11-17  4:50 Tom Gall
  2017-11-19 11:20 ` Greg Kroah-Hartman
  2017-11-20 16:10 ` Cyril Hrubis
  0 siblings, 2 replies; 7+ messages in thread
From: Tom Gall @ 2017-11-17  4:50 UTC (permalink / raw)
  To: linux-kernel, linux- stable, torvalds, Greg Kroah-Hartman
  Cc: shuahkh, Guenter Roeck, ltp, linux-kselftest

At Linaro we’ve been putting effort into regularly running kernel tests over 
arm, arm64 and x86_64 targets. On those targets we’re running mainline, -next, 
4.4, and 4.9 kernels and yes we are adding to this list as the hardware 
capacity grows.

For test buckets we’re using just LTP, kselftest and libhugetlbfs and
like kernels we will add to this list. 

With the 4.14 cycle being a little ‘different’ inasmuch as the goal is to 
have it be an LTS kernel, I think it’s important to take a look at some 
4.14 test results. 

Grab a beverage, this is a bit of a long post. But the quick summary: 4.14 as 
released looks just as good as 4.13 for the test buckets I named above.

I’ve enclosed our short form report. We break down the boards/arch combos for
each bucket by pass, skip or potentially fail. Pretty straightforward. Skips
generally happen for a few reasons:
1) crappy test cases
2) the test isn’t appropriate (e.g. x86-specific tests that don’t run elsewhere)

With this, we have a decent baseline for 4.14 and other kernels going
forward. 

Summary
------------------------------------------------------------------------

kernel: 4.14.0
git repo: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
git branch: master
git commit: bebc6082da0a9f5d47a1ea2edc099bf671058bd4
git describe: v4.14
Test details: https://qa-reports.linaro.org/lkft/linux-mainline-oe/build/v4.14


No regressions (compared to build v4.14-rc8)

Boards, architectures and test suites:
-------------------------------------

hi6220-hikey - arm64
* boot - pass: 20
* kselftest - skip: 16, pass: 38
* libhugetlbfs - skip: 1, pass: 90
* ltp-cap_bounds-tests - pass: 2
* ltp-containers-tests - pass: 76
* ltp-fcntl-locktests-tests - pass: 2
* ltp-filecaps-tests - pass: 2
* ltp-fs-tests - pass: 60
* ltp-fs_bind-tests - pass: 2
* ltp-fs_perms_simple-tests - pass: 19
* ltp-fsx-tests - pass: 2
* ltp-hugetlb-tests - skip: 1, pass: 21
* ltp-io-tests - pass: 3
* ltp-ipc-tests - pass: 9
* ltp-math-tests - pass: 11
* ltp-nptl-tests - pass: 2
* ltp-pty-tests - pass: 4
* ltp-sched-tests - pass: 14
* ltp-securebits-tests - pass: 4
* ltp-syscalls-tests - skip: 122, pass: 983
* ltp-timers-tests - pass: 12

juno-r2 - arm64
* boot - pass: 20
* kselftest - skip: 15, pass: 38
* libhugetlbfs - skip: 1, pass: 90
* ltp-cap_bounds-tests - pass: 2
* ltp-containers-tests - pass: 76
* ltp-fcntl-locktests-tests - pass: 2
* ltp-filecaps-tests - pass: 2
* ltp-fs-tests - pass: 60
* ltp-fs_bind-tests - pass: 2
* ltp-fs_perms_simple-tests - pass: 19
* ltp-fsx-tests - pass: 2
* ltp-hugetlb-tests - pass: 22
* ltp-io-tests - pass: 3
* ltp-ipc-tests - pass: 9
* ltp-math-tests - pass: 11
* ltp-nptl-tests - pass: 2
* ltp-pty-tests - pass: 4
* ltp-sched-tests - pass: 10
* ltp-securebits-tests - pass: 4
* ltp-syscalls-tests - skip: 156, pass: 943
* ltp-timers-tests - pass: 12

x15 - arm
* boot - pass: 20
* kselftest - skip: 17, pass: 36
* libhugetlbfs - skip: 1, pass: 87
* ltp-cap_bounds-tests - pass: 2
* ltp-containers-tests - pass: 64
* ltp-fcntl-locktests-tests - pass: 2
* ltp-filecaps-tests - pass: 2
* ltp-fs-tests - pass: 60
* ltp-fs_bind-tests - pass: 2
* ltp-fs_perms_simple-tests - pass: 19
* ltp-fsx-tests - pass: 2
* ltp-hugetlb-tests - skip: 2, pass: 20
* ltp-io-tests - pass: 3
* ltp-ipc-tests - pass: 9
* ltp-math-tests - pass: 11
* ltp-nptl-tests - pass: 2
* ltp-pty-tests - pass: 4
* ltp-sched-tests - skip: 1, pass: 13
* ltp-securebits-tests - pass: 4
* ltp-syscalls-tests - skip: 66, pass: 1040
* ltp-timers-tests - pass: 12

dell-poweredge-r200 - x86_64
* boot - pass: 19
* kselftest - skip: 11, pass: 54
* libhugetlbfs - skip: 1, pass: 76
* ltp-cap_bounds-tests - pass: 1
* ltp-containers-tests - pass: 64
* ltp-fcntl-locktests-tests - pass: 2
* ltp-filecaps-tests - pass: 2
* ltp-fs-tests - skip: 1, pass: 61
* ltp-fs_bind-tests - pass: 1
* ltp-fs_perms_simple-tests - pass: 19
* ltp-fsx-tests - pass: 2
* ltp-hugetlb-tests - pass: 22
* ltp-io-tests - pass: 3
* ltp-ipc-tests - pass: 8
* ltp-math-tests - pass: 11
* ltp-nptl-tests - pass: 2
* ltp-pty-tests - pass: 4
* ltp-sched-tests - pass: 9
* ltp-securebits-tests - pass: 3
* ltp-syscalls-tests - skip: 163, pass: 962

Lots of green.
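
For anyone who wants to total up a short form report like the one above, here is a minimal Python sketch. It is illustrative only and is not part of the LKFT tooling; the function names parse_report and totals are hypothetical, and it assumes the exact layout shown above (a "board - arch" heading followed by "* suite - skip: N, pass: M" lines).

```python
import re
from collections import defaultdict

# Matches result lines such as "* kselftest - skip: 16, pass: 38".
RESULT_RE = re.compile(r"^\* (?P<suite>\S+) - (?P<counts>.+)$")


def parse_report(text):
    """Return {board heading: {suite: {"pass": n, "skip": n, ...}}}.

    Assumes board headings look like "hi6220-hikey - arm64" and are
    followed by "* suite - key: value, ..." lines, as in the report above.
    """
    boards = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = RESULT_RE.match(line)
        if match and current is not None:
            counts = {}
            for part in match.group("counts").split(","):
                key, value = part.split(":")
                counts[key.strip()] = int(value)
            boards[current][match.group("suite")] = counts
        elif " - " in line and not line.startswith("*"):
            # A new "board - arch" heading starts a new section.
            current = line
            boards[current] = {}
    return boards


def totals(suites):
    """Sum the counts (pass, skip, ...) across all suites of one board."""
    summed = defaultdict(int)
    for counts in suites.values():
        for key, value in counts.items():
            summed[key] += value
    return dict(summed)
```

Feeding it the section for one board and calling totals() on the result gives a per-board pass/skip total, which makes it easier to compare boards at a glance.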


Let’s now talk about coverage, the Pandora’s box of validation. It’s never
perfect. There’s a bazillion different build combos. Even tools can
make a difference. We’ve seen a case where the dhcp client from OpenEmbedded 
didn’t trigger a network regression in one of the LTS RCs but Debian’s dhclient
did.

It’s no surprise that, between what we and others have, coverage isn’t
perfect. There are only so many build, boot and run cycles available to
execute the test buckets with various combinations, so we need to stay
sensible as far as kernel configs go. 

Does this kind of system actually FIND anything and is it useful for 
watching for 4.14 regressions as fixes are introduced?

I would assert the answer is yes. We do have data for a couple of kernel
cycles but it’s also somewhat dirty as we have been in the process of 
detecting and tossing out dodgy test cases. 

Take 4.14-RC7, there was one failure that is no longer there.
ltp-syscalls-tests : perf_event_open02 (arm64)

As things are getting merged post 4.14 there are some failures
cropping up. Here’s an example:
https://qa-reports.linaro.org/lkft/linux-mainline-oe/tests/ltp-fs-tests/proc01

Note the Build column; the kernels are identified by their git describe. 
Don’t be alarmed if you see n/a in some columns; the queues are catching up,
so data will be filling in.


So why didn’t we report these? As mentioned we’ve been tossing out dodgy
test cases to get to a clean baseline. We don’t need or want noise. 

For LTS, I want the system, when it detects a failure, to enable a quick 
bisect involving the affected test bucket. Given the nature of kernel 
bugs though, there is that class of bug which only happens occasionally.

This brings up a conundrum when you have a system like this. A failure
turns up, it’s not consistently failing, and a path forward isn’t 
necessarily obvious. Remember, for an LTS RC there’s a defined window 
to comment.
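
One way to separate the occasional failures from the consistent ones before spending the comment window on a bisect is simply to rerun the test several times. The Python sketch below is an illustration under that assumption, not a description of how LKFT does it; classify and its run_test callback are hypothetical names, and run_test stands in for whatever actually executes one test case and reports success.

```python
from typing import Callable


def classify(run_test: Callable[[], bool], attempts: int = 5) -> str:
    """Run a test several times and label it "pass", "fail" or "flaky".

    run_test is a hypothetical callback that executes one test case once
    and returns True on success. A consistent "fail" is a good bisect
    candidate; "flaky" needs a human look before anyone trusts a bisect.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    results = [run_test() for _ in range(attempts)]
    if all(results):
        return "pass"
    if not any(results):
        return "fail"
    return "flaky"
```

The number of attempts is a trade-off: more reruns catch rarer failures but eat into board time, which is already the limiting resource.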

I’ve been flamed for reporting an LTS RC test failure which didn't include 
a fix, just a ‘this fails, and we’re looking at it.’ I’ve been flamed 
for not reporting a failure that had been detected but not raised to the 
list since it was still being debugged after the RC comment window had
closed.

My 1990s vintage asbestos underwear thankfully is functional.

There is probably a case to be made either way. It boils down to
either:  

Red Pill) Be fully open reporting early and often
Blue Pill) Be closed and only pass up failures that include a patch to fix a bug.

Red Pill does expose drama yet it also creates an opportunity for others to
get involved.

Blue Pill protects the community from noise and the creation of frustration
that the system has cried wolf for perhaps a stupid test case. 

Likewise from a maintainer or dev perspective, there’s a sea of data. 
Time is precious, and who wants to waste it on some snipe hunt?

I’m personally in the Red Pill camp. I like being open.

Be it 0day, LKFT or whatever, I think the responsibility is on us
running these projects to be open and give full guidance. Yes, there 
will be noise. Noise can suggest dodgy test cases or bugs that are
hard to trigger. Either way they warrant a look. Take Arnd Bergmann’s 
work to get rid of kernel warnings. Same concept in my opinion.

Dodgy test cases can easily be put onto skip lists. As we’ve been
running for a number of months now, data and ol’ fashioned code 
review have been our guide to banish dodgy test cases to skip lists.
Going forward, new test cases will pop up. Some of them will be dodgy. 
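
To make the skip list idea concrete, here is a small Python sketch of a regression check between two builds that ignores skip-listed tests. It is a simplified illustration, not the actual LKFT implementation; the function name regressions, the dict-of-statuses input format and the "pass"/"fail" strings are all assumptions made for the example.

```python
def regressions(baseline, current, skiplist=frozenset()):
    """Return the sorted names of tests that passed in the baseline
    build but fail in the current one, ignoring skip-listed tests.

    baseline and current map a test name to "pass", "fail" or "skip".
    Tests missing from the baseline are not counted as regressions,
    since there is nothing to compare them against.
    """
    return sorted(
        name
        for name, status in current.items()
        if name not in skiplist
        and status == "fail"
        and baseline.get(name) == "pass"
    )
```

An empty result corresponds to a "No regressions (compared to build ...)" line in the short form report. Keeping the skip list as explicit input means a banished test case stays visible and can be reinstated once it is fixed.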

There’s lots of room for collaboration in improving test cases. 

In summary, I think for mainline, LTS kernels etc., we have a good 
warning system to detect regressions as patches flow in. It will evolve 
and improve, as is the nature of our open community. Between kernelci, 
LKFT, 0day, etc., we have a good set of automated systems to ferret out 
problems introduced by patches.

Tom


* Re: Towards 4.14 LTS
  2017-11-17  4:50 Towards 4.14 LTS Tom Gall
@ 2017-11-19 11:20 ` Greg Kroah-Hartman
  2017-11-19 16:09   ` Guenter Roeck
  2017-11-20 16:23   ` Tom Gall
  2017-11-20 16:10 ` Cyril Hrubis
  1 sibling, 2 replies; 7+ messages in thread
From: Greg Kroah-Hartman @ 2017-11-19 11:20 UTC (permalink / raw)
  To: Tom Gall
  Cc: linux-kernel, linux- stable, torvalds, shuahkh, Guenter Roeck,
	ltp, linux-kselftest

On Thu, Nov 16, 2017 at 10:50:23PM -0600, Tom Gall wrote:
> At Linaro we’ve been putting effort into regularly running kernel tests over 
> arm, arm64 and x86_64 targets. On those targets we’re running mainline, -next, 
> 4.4, and 4.9 kernels and yes we are adding to this list as the hardware 
> capacity grows.
> 
> For test buckets we’re using just LTP, kselftest and libhugetlbfs and
> like kernels we will add to this list. 

I'm sorry, I don't understand this sentence.

> With the 4.14 cycle being a little ‘different’ in so much as the goal to 
> have it be an LTS kernel I think it’s important to take a look at some 
> 4.14 test results. 
> 
> Grab a beverage, this is a bit of a long post. But quick summery 4.14 as 
> released looks just as good as 4.13, for the test buckets I named above.

Thanks for doing this testing and letting us know.

greg k-h


* Re: Towards 4.14 LTS
  2017-11-19 11:20 ` Greg Kroah-Hartman
@ 2017-11-19 16:09   ` Guenter Roeck
  2017-11-20 16:23   ` Tom Gall
  1 sibling, 0 replies; 7+ messages in thread
From: Guenter Roeck @ 2017-11-19 16:09 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Tom Gall
  Cc: linux-kernel, linux- stable, torvalds, shuahkh, ltp, linux-kselftest

On 11/19/2017 03:20 AM, Greg Kroah-Hartman wrote:
> On Thu, Nov 16, 2017 at 10:50:23PM -0600, Tom Gall wrote:
>> At Linaro we’ve been putting effort into regularly running kernel tests over
>> arm, arm64 and x86_64 targets. On those targets we’re running mainline, -next,
>> 4.4, and 4.9 kernels and yes we are adding to this list as the hardware
>> capacity grows.
>>
>> For test buckets we’re using just LTP, kselftest and libhugetlbfs and
>> like kernels we will add to this list.
> 
> I'm sorry, I don't understand this sentence.
> 
My parsing of it is that they will add to the list of tests as well as
to the list of supported kernel versions (and/or maybe architectures?).

Guenter

>> With the 4.14 cycle being a little ‘different’ in so much as the goal to
>> have it be an LTS kernel I think it’s important to take a look at some
>> 4.14 test results.
>>
>> Grab a beverage, this is a bit of a long post. But quick summery 4.14 as
>> released looks just as good as 4.13, for the test buckets I named above.
> 
> Thanks for doing this testing and letting us know.
> 
> greg k-h
> 


* Re: [LTP] Towards 4.14 LTS
  2017-11-17  4:50 Towards 4.14 LTS Tom Gall
  2017-11-19 11:20 ` Greg Kroah-Hartman
@ 2017-11-20 16:10 ` Cyril Hrubis
  2017-11-20 16:48   ` Tom Gall
  1 sibling, 1 reply; 7+ messages in thread
From: Cyril Hrubis @ 2017-11-20 16:10 UTC (permalink / raw)
  To: Tom Gall
  Cc: linux-kernel, linux- stable, torvalds, Greg Kroah-Hartman,
	linux-kselftest, ltp, shuahkh, Guenter Roeck

Hi!
> So why didn’t we report these? As mentioned we’ve been tossing out dodgy
> test cases to get to a clean baseline. We don’t need or want noise. 
> 
> For LTS, I want the system when it detects a failure to enable a quick 
> bisect involving the affected test bucket. Given the nature of kernel 
> bugs tho, there is that class of bug which only happens occasionally.

From my experience, debugging kernel bugs requires actual human
interaction and there is only a certain level of automation that can be
achieved. Don't get me wrong, automatic bisection and other bells and
whistles are nice to have, but at the end of the day you usually need
someone to reproduce/look at the problem, possibly check the source
code, report a bug, etc. Hence it does not make much sense to have an
automated system without dedicated engineers assigned to review the test
results.

-- 
Cyril Hrubis
chrubis@suse.cz


* Re: Towards 4.14 LTS
  2017-11-19 11:20 ` Greg Kroah-Hartman
  2017-11-19 16:09   ` Guenter Roeck
@ 2017-11-20 16:23   ` Tom Gall
  2017-11-21 12:41     ` [LTP] " Cyril Hrubis
  1 sibling, 1 reply; 7+ messages in thread
From: Tom Gall @ 2017-11-20 16:23 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, linux- stable, torvalds, shuahkh, Guenter Roeck,
	ltp, linux-kselftest


> On Nov 19, 2017, at 5:20 AM, Greg Kroah-Hartman <gregkh@linuxfoundation.org> wrote:
> 
> On Thu, Nov 16, 2017 at 10:50:23PM -0600, Tom Gall wrote:
>> At Linaro we’ve been putting effort into regularly running kernel tests over 
>> arm, arm64 and x86_64 targets. On those targets we’re running mainline, -next, 
>> 4.4, and 4.9 kernels and yes we are adding to this list as the hardware 
>> capacity grows.
>> 
>> For test buckets we’re using just LTP, kselftest and libhugetlbfs and
>> like kernels we will add to this list. 
> 
> I'm sorry, I don't understand this sentence.

I was just saying that we intend to add more test buckets and more kernels.

For instance, 4.13-rc was just added to the mix.

For test buckets, I’m currently dorking around with some make check targets
for a few interesting packages. 

> 
>> With the 4.14 cycle being a little ‘different’ in so much as the goal to 
>> have it be an LTS kernel I think it’s important to take a look at some 
>> 4.14 test results. 
>> 
>> Grab a beverage, this is a bit of a long post. But quick summery 4.14 as 
>> released looks just as good as 4.13, for the test buckets I named above.
> 
> Thanks for doing this testing and letting us know.
> 
> greg k-h


* Re: [LTP] Towards 4.14 LTS
  2017-11-20 16:10 ` Cyril Hrubis
@ 2017-11-20 16:48   ` Tom Gall
  0 siblings, 0 replies; 7+ messages in thread
From: Tom Gall @ 2017-11-20 16:48 UTC (permalink / raw)
  To: Cyril Hrubis
  Cc: linux-kernel, linux- stable, torvalds, Greg Kroah-Hartman,
	linux-kselftest, ltp, shuahkh, Guenter Roeck



> On Nov 20, 2017, at 10:10 AM, Cyril Hrubis <chrubis@suse.cz> wrote:
> 
> Hi!
>> So why didn’t we report these? As mentioned we’ve been tossing out dodgy
>> test cases to get to a clean baseline. We don’t need or want noise. 
>> 
>> For LTS, I want the system when it detects a failure to enable a quick 
>> bisect involving the affected test bucket. Given the nature of kernel 
>> bugs tho, there is that class of bug which only happens occasionally.
> 
> From my experience debugging kernel bugs requires an actuall human
> interaction and there is only certain level of automation that can be
> achieved. Don't take me wrong, automatic bisection and other bells and
> whistles are a nice to have, but at the end of the day you usually need
> someone to reproduce/look at the problem, possibly check the source
> code, report a bug, etc. Hence it does not make much sense to have an
> automated system without dedicated engineers assigned to review the test
> results.

You are entirely right, automation only gets so far. We have a few lines
of defense that are probably worth a mention.

1) infra - sometimes results/runs need to be re-run for whatever reason.
2) triage - Crappy test case or something that is real?
3) kernel - bisecting etc

We don’t have huge dedicated teams for each category, but each 
does have a team.

> -- 
> Cyril Hrubis
> chrubis@suse.cz


* Re: [LTP] Towards 4.14 LTS
  2017-11-20 16:23   ` Tom Gall
@ 2017-11-21 12:41     ` Cyril Hrubis
  0 siblings, 0 replies; 7+ messages in thread
From: Cyril Hrubis @ 2017-11-21 12:41 UTC (permalink / raw)
  To: Tom Gall
  Cc: Greg Kroah-Hartman, linux-kernel, linux- stable, shuahkh,
	Guenter Roeck, linux-kselftest, torvalds, ltp

Hi!
> For instance 4.13-rc just was added to the mix.
> 
> For test buckets, I’m currently dorking around with some make check targets
> for a few interesting packages. 

You may want to look into xfstests as well; we recently found a few kernel
oopses related to backported FS patches for SLES kernels.

-- 
Cyril Hrubis
chrubis@suse.cz
