[Qemu-devel] Problems with MIPS Malta SSH tests in make check-acceptance

* [Qemu-devel] Problems with MIPS Malta SSH tests in make check-acceptance
@ 2019-09-18  7:16 David Gibson
  2019-09-18 11:13 ` Philippe Mathieu-Daudé
  2019-09-19  1:14 ` Cleber Rosa
  0 siblings, 2 replies; 8+ messages in thread
From: David Gibson @ 2019-09-18  7:16 UTC (permalink / raw)
  To: philmd, ehabkost, crosa; +Cc: qemu-devel

Hi,

I'm finding make check-acceptance is currently useless for me as a
pre-pull test, because a bunch of the tests are not at all reliable.
There are a bunch which I'm still investigating, but for now I'm
looking at the MIPS Malta SSH tests.

There seem to be at least two problems here.  First, the test includes
a download of a pretty big guest disk image.  This can easily exhaust
the 2m30 timeout on its own.

Even without the timeout, it makes the test really slow, even on
repeated runs.  Is there some way we can make the image download part
of "building" the tests rather than actually running the testsuite, so
that a) the tests themselves go faster and b) we don't include the
download in the test timeout - obviously the download speed is hugely
dependent on factors that aren't really related to what we're testing
here.
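
To illustrate the idea of fetching images ahead of the test run, here
is a minimal sketch in Python.  The names prefetch() and sha1_of(), and
the layout of the cache directory, are assumptions made for this
example; they are not part of Avocado or of the QEMU test suite.

```python
import hashlib
import os
import urllib.request


def sha1_of(path, chunk_size=1 << 20):
    """Return the hex SHA-1 digest of the file at path."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def prefetch(url, expected_sha1, cache_dir):
    """Download url into cache_dir unless a verified copy is already there."""
    os.makedirs(cache_dir, exist_ok=True)
    target = os.path.join(cache_dir, expected_sha1)
    if os.path.exists(target) and sha1_of(target) == expected_sha1:
        return target  # cache hit, no network access needed
    partial = target + ".part"
    urllib.request.urlretrieve(url, partial)
    if sha1_of(partial) != expected_sha1:
        os.remove(partial)
        raise ValueError("checksum mismatch for " + url)
    os.replace(partial, target)  # publish only a fully verified file
    return target
```

Run as a separate step before the test suite, something like this
would keep the download time out of the per-test timeout; the tests
would then only read from the cache directory.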

In the meantime, I tried hacking it by just increasing the timeout to
10m.  That got several of the tests working for me, but one still
failed.  Specifically 'LinuxSSH.test_mips_malta32eb_kernel3_2_0' still
timed out for me, but now after booting the guest, rather than during
the image download.  Looking at the avocado log file I'm seeing a
bunch of soft lockup messages from the guest console, AFAICT.  So it
looks like we have a real bug here, which I suspect has been
overlooked precisely because the download problems mean this test
isn't reliable.

Any thoughts on how to improve the situation?

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson


* Re: [Qemu-devel] Problems with MIPS Malta SSH tests in make check-acceptance
  2019-09-18  7:16 [Qemu-devel] Problems with MIPS Malta SSH tests in make check-acceptance David Gibson
@ 2019-09-18 11:13 ` Philippe Mathieu-Daudé
  2019-09-18 11:37   ` David Gibson
  2019-09-19  1:14 ` Cleber Rosa
  1 sibling, 1 reply; 8+ messages in thread
From: Philippe Mathieu-Daudé @ 2019-09-18 11:13 UTC (permalink / raw)
  To: David Gibson, ehabkost, crosa; +Cc: Alex Bennée, qemu-devel, Gerd Hoffmann


On 9/18/19 9:16 AM, David Gibson wrote:
> Hi,
> 
> I'm finding make check-acceptance is currently useless for me as a
> pre-pull test, because a bunch of the tests are not at all reliable.
> There are a bunch which I'm still investigating, but for now I'm
> looking at the MIPS Malta SSH tests.
> 
> There seem to be at least two problems here.  First, the test includes
> a download of a pretty big guest disk image.  This can easily exhaust
> the 2m30 timeout on its own.

Gerd raised this issue a few months ago:

https://www.mail-archive.com/qemu-devel@nongnu.org/msg615619.html

> Even without the timeout, it makes the test really slow, even on
> repeated runs.  Is there some way we can make the image download part
> of "building" the tests rather than actually running the testsuite, so
> that a) the tests themselves go faster and b) we don't include the
> download in the test timeout - obviously the download speed is hugely
> dependent on factors that aren't really related to what we're testing
> here.
> 
> In the meantime, I tried hacking it by just increasing the timeout to
> 10m.  That got several of the tests working for me, but one still
> failed.  Specifically 'LinuxSSH.test_mips_malta32eb_kernel3_2_0' still
> timed out for me, but now after booting the guest, rather than during
> the image download.  Looking at the avocado log file I'm seeing a
> bunch of soft lockup messages from the guest console, AFAICT.  So it
> looks like we have a real bug here, which I suspect has been
> overlooked precisely because the download problems mean this test
> isn't reliable.
> 
> Any thoughts on how to improve the situation?

Maybe we should disable this test and run it manually...



* Re: [Qemu-devel] Problems with MIPS Malta SSH tests in make check-acceptance
  2019-09-18 11:13 ` Philippe Mathieu-Daudé
@ 2019-09-18 11:37   ` David Gibson
  0 siblings, 0 replies; 8+ messages in thread
From: David Gibson @ 2019-09-18 11:37 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé
  Cc: qemu-devel, Alex Bennée, Gerd Hoffmann, ehabkost, crosa

On Wed, Sep 18, 2019 at 01:13:29PM +0200, Philippe Mathieu-Daudé wrote:
> On 9/18/19 9:16 AM, David Gibson wrote:
> > Hi,
> > 
> > I'm finding make check-acceptance is currently useless for me as a
> > pre-pull test, because a bunch of the tests are not at all reliable.
> > There are a bunch which I'm still investigating, but for now I'm
> > looking at the MIPS Malta SSH tests.
> > 
> > There seem to be at least two problems here.  First, the test includes
> > a download of a pretty big guest disk image.  This can easily exhaust
> > the 2m30 timeout on its own.
> 
> Gerd raised this issue a few months ago:
> 
> https://www.mail-archive.com/qemu-devel@nongnu.org/msg615619.html

Ah, yes indeed.

> > Even without the timeout, it makes the test really slow, even on
> > repeated runs.  Is there some way we can make the image download part
> > of "building" the tests rather than actually running the testsuite, so
> > that a) the tests themselves go faster and b) we don't include the
> > download in the test timeout - obviously the download speed is hugely
> > dependent on factors that aren't really related to what we're testing
> > here.
> > 
> > In the meantime, I tried hacking it by just increasing the timeout to
> > 10m.  That got several of the tests working for me, but one still
> > failed.  Specifically 'LinuxSSH.test_mips_malta32eb_kernel3_2_0' still
> > timed out for me, but now after booting the guest, rather than during
> > the image download.  Looking at the avocado log file I'm seeing a
> > bunch of soft lockup messages from the guest console, AFAICT.  So it
> > looks like we have a real bug here, which I suspect has been
> > overlooked precisely because the download problems mean this test
> > isn't reliable.
> > 
> > Any thoughts on how to improve the situation?
> 
> Maybe we should disable this test and run it manually...

Until we can fix it better, I really think we should.  A test this
unreliable verges on worse than useless.
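
One way to "disable and run manually" is an opt-in switch.  The sketch
below uses the standard unittest module as a stand-in for the Avocado
test class; the variable name RUN_SLOW_GUEST_TESTS and the class name
are hypothetical and only serve this example.

```python
import os
import unittest

# Hypothetical opt-in variable, used only for this sketch.
OPT_IN = "RUN_SLOW_GUEST_TESTS"


class LinuxSSHSketch(unittest.TestCase):
    """Stand-in for a slow guest test that is skipped by default."""

    @unittest.skipUnless(os.getenv(OPT_IN), "set %s=1 to run this test" % OPT_IN)
    def test_boot_and_ssh(self):
        # The real test would boot the guest and log in over SSH here.
        self.assertTrue(True)
```

With the variable unset the test is reported as skipped rather than
failed, so a default run stays reliable while the test remains
available for manual runs.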

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson


* Re: [Qemu-devel] Problems with MIPS Malta SSH tests in make check-acceptance
  2019-09-18  7:16 [Qemu-devel] Problems with MIPS Malta SSH tests in make check-acceptance David Gibson
  2019-09-18 11:13 ` Philippe Mathieu-Daudé
@ 2019-09-19  1:14 ` Cleber Rosa
  2019-09-19 16:56   ` Cleber Rosa
  1 sibling, 1 reply; 8+ messages in thread
From: Cleber Rosa @ 2019-09-19  1:14 UTC (permalink / raw)
  To: David Gibson; +Cc: philmd, ehabkost, qemu-devel

On Wed, Sep 18, 2019 at 05:16:54PM +1000, David Gibson wrote:
> Hi,
> 
> I'm finding make check-acceptance is currently useless for me as a
> pre-pull test, because a bunch of the tests are not at all reliable.
> There are a bunch which I'm still investigating, but for now I'm
> looking at the MIPS Malta SSH tests.
> 
> There seem to be at least two problems here.  First, the test includes
> a download of a pretty big guest disk image.  This can easily exhaust
> the 2m30 timeout on its own.
>

You're correct that successes and failures on those tests depend
largely on bandwidth.  On a shared environment I used for tests,
the download of those images takes roughly 400 seconds, resulting
in failures.  On my own machine it takes around 60 seconds, and the
tests pass.

Conceptually, the environment in which the tests run should be
prepared beforehand, but the possible solutions conflict with each
other:

 * extensive bootstrapping of the test execution environment, such
   as the installation of guests from ISOs or installation trees, or
   the download of "default" images whether the tests will use them or
   not (this is what Avocado-VT does/requires)

 * keeping test assets in the tree (Avocado allows this if you have
   a your_test.py.data/ directory), but it's not practical for large
   files or files that can't or shouldn't be redistributed

> Even without the timeout, it makes the test really slow, even on
> repeated runs.  Is there some way we can make the image download part
> of "building" the tests rather than actually running the testsuite, so
> that a) the tests themselves go faster and b) we don't include the
> download in the test timeout - obviously the download speed is hugely
> dependent on factors that aren't really related to what we're testing
> here.
>

In Avocado version 72.0 we attempted to minimize the issue by
implementing a "vmimage" command.  So, if you expect to use Fedora 30
aarch64 images, you could run before your tests:

 $ avocado vmimage get --distro fedora --distro-version 30 --arch aarch64

And to list the images on your cache:

 $ avocado vmimage list

Unfortunately, this test doesn't use the vmimage API.  That is
actually fine, because not all test assets map nicely to the vmimage
goal, and those should keep using the more generic (and lower level)
fetch_asset().

We're now working on various "asset fetcher" improvements that should
allow us to check/cache all assets before a test is executed.  Also,
we're adding a mode in which the "fetch_asset()" API will default to
cancel (aka SKIP) a test if the asset could not be downloaded.
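
The "cancel instead of fail" behaviour can be sketched with the
standard library alone.  This is only an analogy: unittest's
skipTest() stands in for Avocado's cancel, and fetch_or_skip() is a
hypothetical helper, not the fetch_asset() implementation.

```python
import unittest
import urllib.request


def fetch_or_skip(test, url, destination):
    """Fetch url to destination, or skip the calling test if that fails."""
    try:
        urllib.request.urlretrieve(url, destination)
    except OSError as error:  # urllib.error.URLError is a subclass of OSError
        # A missing asset says nothing about the code under test, so skip.
        test.skipTest("asset unavailable: %s" % error)
    return destination
```

A test would call this at the start of its body, so that a failed
download shows up as a skipped test in the summary and not as a
failure of the code being tested.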

If you're interested in the card we're using to track that new feature:

  https://trello.com/c/T3SC1sZs/1521-implement-fetch-assets-command-line-parameter

Another possibility that we've prototyped, and that we'll be working
on further, is to have a specific part of the "test" code (really a
pre-test phase) run without a timeout, and even be retried a number
of times, before bailing out and skipping the test.
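
A rough sketch of such a retrying pre-test step follows.  The helper
name fetch_with_retries() and its parameters are assumptions for
illustration; this is not the prototype mentioned above.

```python
import time


def fetch_with_retries(fetch, attempts=3, delay=0.0):
    """Call fetch() up to `attempts` times; return its result, or None."""
    for attempt in range(1, attempts + 1):
        try:
            return fetch()
        except OSError:
            # Transient network errors are retried; wait before the next try.
            if attempt < attempts:
                time.sleep(delay)
    return None  # the caller should skip the test rather than fail it
```

Because this runs before the timed part of the test, a slow or flaky
download costs wall-clock time but cannot by itself trigger the test
timeout.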

> In the meantime, I tried hacking it by just increasing the timeout to
> 10m.  That got several of the tests working for me, but one still
> failed.  Specifically 'LinuxSSH.test_mips_malta32eb_kernel3_2_0' still
> timed out for me, but now after booting the guest, rather than during
> the image download.  Looking at the avocado log file I'm seeing a
> bunch of soft lockup messages from the guest console, AFAICT.  So it
> looks like we have a real bug here, which I suspect has been
> overlooked precisely because the download problems mean this test
> isn't reliable.
>

I've scheduled 100 executions of `make check-acceptance`, with the
linux_ssh_mips_malta.py tests having a 1500-second timeout.  The
very first execution already brought interesting results:

 ...
 (15/39) /home/cleber/src/qemu/tests/acceptance/linux_ssh_mips_malta.py:LinuxSSH.test_mips_malta32eb_kernel3_2_0: PASS (198.38 s)
 (16/39) /home/cleber/src/qemu/tests/acceptance/linux_ssh_mips_malta.py:LinuxSSH.test_mips_malta64el_kernel3_2_0: FAIL: Failure message found in console: Oops (22.83 s)

I'll let you know about my full results.  This should also serve as a
starting point to a discussion about the reliability of other tests,
as you mentioned before.

In my experience, and backed by the executions on Travis, most tests
have been really stable on x86_64 hosts.  Last week I worked on
ppc64 and aarch64 hosts, and posted a number of patches addressing
the failures I found.  I'll compile a list of the posted patches and
their status.

Thanks for reporting those issues.
- Cleber.


* Re: Problems with MIPS Malta SSH tests in make check-acceptance
  2019-09-19  1:14 ` Cleber Rosa
@ 2019-09-19 16:56   ` Cleber Rosa
  2019-09-19 17:00     ` Philippe Mathieu-Daudé
  0 siblings, 1 reply; 8+ messages in thread
From: Cleber Rosa @ 2019-09-19 16:56 UTC (permalink / raw)
  To: David Gibson; +Cc: philmd, ehabkost, qemu-devel

On Wed, Sep 18, 2019 at 09:14:58PM -0400, Cleber Rosa wrote:
> I've scheduled 100 executions of `make check-acceptance`, with the
> linux_ssh_mips_malta.py tests having a 1500-second timeout.  The
> very first execution already brought interesting results:
> 
>  ...
>  (15/39) /home/cleber/src/qemu/tests/acceptance/linux_ssh_mips_malta.py:LinuxSSH.test_mips_malta32eb_kernel3_2_0: PASS (198.38 s)
>  (16/39) /home/cleber/src/qemu/tests/acceptance/linux_ssh_mips_malta.py:LinuxSSH.test_mips_malta64el_kernel3_2_0: FAIL: Failure message found in console: Oops (22.83 s)
> 
> I'll let you know about my full results.  This should also serve as a
> starting point to a discussion about the reliability of other tests,
> as you mentioned before.

Out of the 100 executions on a ppc64le host, these are the results
for the tests that had failures or errors:

15-/home/cleber/src/qemu/tests/acceptance/linux_ssh_mips_malta.py:LinuxSSH.test_mips_malta32eb_kernel3_2_0
  - PASS: 92
  - INTERRUPTED: 4
  - FAIL: 4
16-/home/cleber/src/qemu/tests/acceptance/linux_ssh_mips_malta.py:LinuxSSH.test_mips_malta64el_kernel3_2_0
  - PASS: 95
  - FAIL: 5
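
For a sense of scale, the counts above can be turned into pass rates.
The sketch below also estimates how often both tests pass in one run;
that estimate assumes the two tests fail independently, which the
data here does not establish.

```python
from fractions import Fraction

# Counts copied from the 100-run summary above.
RESULTS = {
    "test_mips_malta32eb_kernel3_2_0": {"PASS": 92, "INTERRUPTED": 4, "FAIL": 4},
    "test_mips_malta64el_kernel3_2_0": {"PASS": 95, "FAIL": 5},
}


def pass_rate(counts):
    """Fraction of runs that ended in PASS."""
    return Fraction(counts["PASS"], sum(counts.values()))


def all_pass_probability(results):
    """Chance that every listed test passes in a single run.

    Assumption: the tests fail independently of each other.
    """
    probability = Fraction(1)
    for counts in results.values():
        probability *= pass_rate(counts)
    return probability


if __name__ == "__main__":
    for name, counts in RESULTS.items():
        print(name, float(pass_rate(counts)))
    print("both pass:", float(all_pass_probability(RESULTS)))
```

Under that independence assumption, about 87% of runs would see both
of these tests pass.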

FAIL means that self.fail() was called, which means 'Oops' was found
in the console.  INTERRUPTED here means that the test timeout kicked
in, and I can back David's statements about soft lockups.
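
The FAIL condition amounts to scanning the console output for a
failure message.  A minimal sketch of that kind of check is below;
first_failure_line() is a hypothetical helper, and this is not the
code the acceptance tests actually use.

```python
FAILURE_MESSAGE = "Oops"


def first_failure_line(console_lines, failure_message=FAILURE_MESSAGE):
    """Return (line_number, text) of the first matching line, or None."""
    for number, line in enumerate(console_lines, start=1):
        if failure_message in line:
            return number, line.rstrip("\n")
    return None
```

A test would call self.fail() as soon as this returns something other
than None, which is why a FAIL can show up well before the timeout.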

Let me know if anyone wants access to the full logs/results.

- Cleber.


* Re: Problems with MIPS Malta SSH tests in make check-acceptance
  2019-09-19 16:56   ` Cleber Rosa
@ 2019-09-19 17:00     ` Philippe Mathieu-Daudé
  2019-09-19 17:14       ` Cleber Rosa
  0 siblings, 1 reply; 8+ messages in thread
From: Philippe Mathieu-Daudé @ 2019-09-19 17:00 UTC (permalink / raw)
  To: Cleber Rosa, David Gibson; +Cc: ehabkost, qemu-devel

On 9/19/19 6:56 PM, Cleber Rosa wrote:
> Out of the 100 executions on a ppc64le host, the results that contain
> failures and errors:
> 
> 15-/home/cleber/src/qemu/tests/acceptance/linux_ssh_mips_malta.py:LinuxSSH.test_mips_malta32eb_kernel3_2_0
>   - PASS: 92
>   - INTERRUPTED: 4
>   - FAIL: 4
> 16-/home/cleber/src/qemu/tests/acceptance/linux_ssh_mips_malta.py:LinuxSSH.test_mips_malta64el_kernel3_2_0
>   - PASS: 95
>   - FAIL: 5
> 
> FAIL means that self.fail() was called, which means 'Oops' was found
> in the console.  INTERRUPTED here means that the test timeout kicked
> in, and I can back David's statements about soft lockups.
> 
> Let me know if anyone wants access to the full logs/results.

Can you check whether the FAIL cases are this bug, please?

https://bugs.launchpad.net/qemu/+bug/1833661

Thanks,

Phil.



* Re: Problems with MIPS Malta SSH tests in make check-acceptance
  2019-09-19 17:00     ` Philippe Mathieu-Daudé
@ 2019-09-19 17:14       ` Cleber Rosa
  2019-09-19 18:54         ` Cleber Rosa
  0 siblings, 1 reply; 8+ messages in thread
From: Cleber Rosa @ 2019-09-19 17:14 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé; +Cc: qemu-devel, ehabkost, David Gibson

On Thu, Sep 19, 2019 at 07:00:49PM +0200, Philippe Mathieu-Daudé wrote:
> Can you check whether the FAIL cases are this bug, please?
> 
> https://bugs.launchpad.net/qemu/+bug/1833661
>

Yes, the errors do match.  I posted an update there:

  https://bugs.launchpad.net/qemu/+bug/1833661/comments/3

Cheers,
- Cleber.


* Re: Problems with MIPS Malta SSH tests in make check-acceptance
  2019-09-19 17:14       ` Cleber Rosa
@ 2019-09-19 18:54         ` Cleber Rosa
  0 siblings, 0 replies; 8+ messages in thread
From: Cleber Rosa @ 2019-09-19 18:54 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé; +Cc: qemu-devel, ehabkost, David Gibson

On Thu, Sep 19, 2019 at 01:14:50PM -0400, Cleber Rosa wrote:
> On Thu, Sep 19, 2019 at 07:00:49PM +0200, Philippe Mathieu-Daudé wrote:
> > On 9/19/19 6:56 PM, Cleber Rosa wrote:
> > > On Wed, Sep 18, 2019 at 09:14:58PM -0400, Cleber Rosa wrote:
> > >> On Wed, Sep 18, 2019 at 05:16:54PM +1000, David Gibson wrote:
> > >>> Hi,
> > >>>
> > >>> I'm finding make check-acceptance is currently useless for me as a
> > >>> pre-pull test, because a bunch of the tests are not at all reliable.
> > >>> There are a bunch which I'm still investigating, but for now I'm
> > >>> looking at the MIPS Malta SSH tests.
> > >>>
> > >>> There seem to be at least two problems here.  First, the test includes
> > >>> a download of a pretty big guest disk image.  This can easily exhaust
> > >>> the 2m30 timeout on its own.
> > >>>
> > >>
> > >> You're correct that successes and failures on those tests depend
> > >> largely on bandwidth.  On a shared environment I used for tests,
> > >> the download of those images takes roughly 400 seconds, resulting
> > >> in failures.  On my own machine it takes around 60, and the tests pass.
> > >>
> > >> There's a conceptual and conflicting problem in that the environment
> > >> for tests to run should be prepared beforehand.  The conflicting
> > >> solutions can be:
> > >>
> > >>  * extensive bootstrapping of the test execution environment, such
> > >>    as the installation of guests from ISOs or installation trees, or
> > >>    the download of "default" images whether the tests will use them or
> > >>    not (this is what Avocado-VT does/requires)
> > >>
> > >>  * keeping test assets in the tree (Avocado allows this if you have
> > >>    a your_test.py.data/ directory), but it's not practical for large
> > >>    files or files that can't or shouldn't be redistributed
> > >>
> > >>> Even without the timeout, it makes the test really slow, even on
> > >>> repeated runs.  Is there some way we can make the image download part
> > >>> of "building" the tests rather than actually running the testsuite, so
> > >>> that a) the tests themselves go faster and b) we don't include the
> > >>> download in the test timeout - obviously the download speed is hugely
> > >>> dependent on factors that aren't really related to what we're testing
> > >>> here.
> > >>>
> > >>
> > >> In Avocado version 72.0 we attempted to minimize the issue by
> > >> implementing a "vmimage" command.  So, if you expect to use Fedora 30
> > >> aarch64 images, you could run before your tests:
> > >>
> > >>  $ avocado vmimage get --distro fedora --distro-version 30 --arch aarch64
> > >>
> > >> And to list the images on your cache:
> > >>
> > >>  $ avocado vmimage list
> > >>
> > >> Unfortunately, this test doesn't use the vmimage API.  That is
> > >> actually fine, because not all test assets map nicely to the vmimage
> > >> goal, and those should keep using the more generic (and lower level)
> > >> fetch_asset().
> > >>
> > >> We're now working on various "asset fetcher" improvements that should
> > >> allow us to check/cache all assets before a test is executed.  Also,
> > >> we're adding a mode in which the "fetch_asset()" API will default to
> > >> cancel (aka SKIP) a test if the asset could not be downloaded.
> > >>
> > >> If you're interested in the card we're using to track that new feature:
> > >>
> > >>   https://trello.com/c/T3SC1sZs/1521-implement-fetch-assets-command-line-parameter
> > >>
> > >> Another possibility that we've prototyped, and we'll be working on
> > >> further, is to make a specific part of the "test" code execution
> > >> (really a pre-test phase) to be executed without a timeout and even be
> > >> tried a number of times before bailing out and skipping the test.
> > >>
> > >>> In the meantime, I tried hacking it by just increasing the timeout to
> > >>> 10m.  That got several of the tests working for me, but one still
> > >>> failed.  Specifically 'LinuxSSH.test_mips_malta32eb_kernel3_2_0' still
> > >>> timed out for me, but now after booting the guest, rather than during
> > >>> the image download.  Looking at the avocado log file I'm seeing a
> > >>> bunch of soft lockup messages from the guest console, AFAICT.  So it
> > >>> looks like we have a real bug here, which I suspect has been
> > >>> overlooked precisely because the download problems mean this test
> > >>> isn't reliable.
> > >>>
> > >>
> > >> I've scheduled 100 executions of `make check-acceptance`, with
> > >> the linux_ssh_mips_malta.py tests having a 1500 second timeout.  The
> > >> very first execution already brought interesting results:
> > >>
> > >>  ...
> > >>  (15/39) /home/cleber/src/qemu/tests/acceptance/linux_ssh_mips_malta.py:LinuxSSH.test_mips_malta32eb_kernel3_2_0: PASS (198.38 s)
> > >>  (16/39) /home/cleber/src/qemu/tests/acceptance/linux_ssh_mips_malta.py:LinuxSSH.test_mips_malta64el_kernel3_2_0: FAIL: Failure message found in console: Oops (22.83 s)
> > >>
> > >> I'll let you know about my full results.  This should also serve as a
> > >> starting point to a discussion about the reliability of other tests,
> > >> as you mentioned before.
> > > 
> > > Out of the 100 executions on a ppc64le host, the results that contain
> > > failures and errors:
> > > 
> > > 15-/home/cleber/src/qemu/tests/acceptance/linux_ssh_mips_malta.py:LinuxSSH.test_mips_malta32eb_kernel3_2_0
> > >   - PASS: 92
> > >   - INTERRUPTED: 4
> > >   - FAIL: 4
> > > 16-/home/cleber/src/qemu/tests/acceptance/linux_ssh_mips_malta.py:LinuxSSH.test_mips_malta64el_kernel3_2_0
> > >   - PASS: 95
> > >   - FAIL: 5
> > > 
> > > FAIL means that self.fail() was called, which means 'Oops' was found
> > > in the console.  INTERRUPTED here means that the test timeout kicked
> > > in, and I can back David's statements about soft lockups.
> > > 
> > > Let me know if anyone wants access to the full logs/results.
> > 
> > Can you check whether the FAIL cases are this bug, please?
> > 
> > https://bugs.launchpad.net/qemu/+bug/1833661
> >
> 
> Yes, the errors do match.  I posted an update there:
> 
>   https://bugs.launchpad.net/qemu/+bug/1833661/comments/3
>

What if we tag tests like this as "knownbug" (or a better name),
disabling execution by default?
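
To make the idea concrete, here is a rough sketch of how such a tag
could drive test selection.  It is not Avocado's implementation: it
only assumes the ":avocado: tags=" docstring convention the acceptance
tests already use, and the "knownbug" tag name, the helper functions,
the stand-in LinuxSSH class and test_something_stable are all
hypothetical.  In practice this would presumably be handled by
Avocado's own tag filtering (--filter-by-tags) rather than by
hand-written code.

```python
import re

# Matches the docstring directive used for tags, e.g.
#   :avocado: tags=arch:mips,knownbug
TAGS_RE = re.compile(r":avocado:\s+tags=(\S+)")


def docstring_tags(func):
    """Return the set of tags declared in func's docstring."""
    tags = set()
    for match in TAGS_RE.finditer(func.__doc__ or ""):
        tags.update(t for t in match.group(1).split(",") if t)
    return tags


def select_tests(cls, include_known_bugs=False):
    """Return test method names, leaving out 'knownbug' ones by default."""
    selected = []
    for name in sorted(vars(cls)):
        if not name.startswith("test_"):
            continue
        tags = docstring_tags(getattr(cls, name))
        if "knownbug" in tags and not include_known_bugs:
            continue
        selected.append(name)
    return selected


class LinuxSSH:
    """Stand-in for the real test class; the method bodies are omitted."""

    def test_mips_malta32eb_kernel3_2_0(self):
        """
        :avocado: tags=arch:mips
        :avocado: tags=knownbug
        """

    def test_something_stable(self):
        """
        Hypothetical test without a known bug.

        :avocado: tags=arch:mips
        """


if __name__ == "__main__":
    print("default run:", select_tests(LinuxSSH))
    print("opt-in run: ", select_tests(LinuxSSH, include_known_bugs=True))
```

The point of making exclusion the default is that a plain
`make check-acceptance` stays reliable, while anyone chasing the bug
can still opt in to the tagged tests explicitly.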

- Cleber.

> Cheers,
> - Cleber.
> 
> > Thanks,
> > 
> > Phil.
> > 



end of thread, other threads:[~2019-09-19 18:55 UTC | newest]

Thread overview: 8+ messages
2019-09-18  7:16 [Qemu-devel] Problems with MIPS Malta SSH tests in make check-acceptance David Gibson
2019-09-18 11:13 ` Philippe Mathieu-Daudé
2019-09-18 11:37   ` David Gibson
2019-09-19  1:14 ` Cleber Rosa
2019-09-19 16:56   ` Cleber Rosa
2019-09-19 17:00     ` Philippe Mathieu-Daudé
2019-09-19 17:14       ` Cleber Rosa
2019-09-19 18:54         ` Cleber Rosa
