* [Qemu-devel] Problems with MIPS Malta SSH tests in make check-acceptance
From: David Gibson @ 2019-09-18 7:16 UTC
To: philmd, ehabkost, crosa; +Cc: qemu-devel

Hi,

I'm finding make check-acceptance is currently useless for me as a
pre-pull test, because a bunch of the tests are not at all reliable.
There are a bunch which I'm still investigating, but for now I'm
looking at the MIPS Malta SSH tests.

There seem to be at least two problems here. First, the test includes
a download of a pretty big guest disk image. This can easily exhaust
the 2m30 timeout on its own.

Even without the timeout, it makes the test really slow, even on
repeated runs. Is there some way we can make the image download part
of "building" the tests rather than actually running the testsuite, so
that a) the tests themselves go faster and b) we don't include the
download in the test timeout - obviously the download speed is hugely
dependent on factors that aren't really related to what we're testing
here.

In the meantime, I tried hacking it by just increasing the timeout to
10m. That got several of the tests working for me, but one still
failed. Specifically 'LinuxSSH.test_mips_malta32eb_kernel3_2_0' still
timed out for me, but now after booting the guest, rather than during
the image download. Looking at the avocado log file I'm seeing a
bunch of soft lockup messages from the guest console, AFAICT. So it
looks like we have a real bug here, which I suspect has been
overlooked precisely because the download problems mean this test
isn't reliable.

Any thoughts on how to improve the situation?

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you. NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson

* Re: [Qemu-devel] Problems with MIPS Malta SSH tests in make check-acceptance
From: Philippe Mathieu-Daudé @ 2019-09-18 11:13 UTC
To: David Gibson, ehabkost, crosa; +Cc: Alex Bennée, qemu-devel, Gerd Hoffmann

On 9/18/19 9:16 AM, David Gibson wrote:
> Hi,
>
> I'm finding make check-acceptance is currently useless for me as a
> pre-pull test, because a bunch of the tests are not at all reliable.
> There are a bunch which I'm still investigating, but for now I'm
> looking at the MIPS Malta SSH tests.
>
> There seem to be at least two problems here. First, the test includes
> a download of a pretty big guest disk image. This can easily exhaust
> the 2m30 timeout on its own.

Gerd raised this issue a few months ago:

https://www.mail-archive.com/qemu-devel@nongnu.org/msg615619.html

> Even without the timeout, it makes the test really slow, even on
> repeated runs. Is there some way we can make the image download part
> of "building" the tests rather than actually running the testsuite, so
> that a) the tests themselves go faster and b) we don't include the
> download in the test timeout - obviously the download speed is hugely
> dependent on factors that aren't really related to what we're testing
> here.
>
> In the meantime, I tried hacking it by just increasing the timeout to
> 10m. That got several of the tests working for me, but one still
> failed. Specifically 'LinuxSSH.test_mips_malta32eb_kernel3_2_0' still
> timed out for me, but now after booting the guest, rather than during
> the image download. Looking at the avocado log file I'm seeing a
> bunch of soft lockup messages from the guest console, AFAICT. So it
> looks like we have a real bug here, which I suspect has been
> overlooked precisely because the download problems mean this test
> isn't reliable.
>
> Any thoughts on how to improve the situation?

Maybe we should disable this test and run it manually...

* Re: [Qemu-devel] Problems with MIPS Malta SSH tests in make check-acceptance
From: David Gibson @ 2019-09-18 11:37 UTC
To: Philippe Mathieu-Daudé; +Cc: qemu-devel, Alex Bennée, Gerd Hoffmann, ehabkost, crosa

On Wed, Sep 18, 2019 at 01:13:29PM +0200, Philippe Mathieu-Daudé wrote:
> On 9/18/19 9:16 AM, David Gibson wrote:

[...]

> > There seem to be at least two problems here. First, the test includes
> > a download of a pretty big guest disk image. This can easily exhaust
> > the 2m30 timeout on its own.
>
> Gerd raised this issue a few months ago:
>
> https://www.mail-archive.com/qemu-devel@nongnu.org/msg615619.html

Ah, yes indeed.

[...]

> > Any thoughts on how to improve the situation?
>
> Maybe we should disable this test and run it manually...

Until we can fix it better, I really think we should. A test this
unreliable verges on worse than useless.

* Re: [Qemu-devel] Problems with MIPS Malta SSH tests in make check-acceptance
From: Cleber Rosa @ 2019-09-19 1:14 UTC
To: David Gibson; +Cc: philmd, ehabkost, qemu-devel

On Wed, Sep 18, 2019 at 05:16:54PM +1000, David Gibson wrote:
> Hi,
>
> I'm finding make check-acceptance is currently useless for me as a
> pre-pull test, because a bunch of the tests are not at all reliable.
> There are a bunch which I'm still investigating, but for now I'm
> looking at the MIPS Malta SSH tests.
>
> There seem to be at least two problems here. First, the test includes
> a download of a pretty big guest disk image. This can easily exhaust
> the 2m30 timeout on its own.

You're correct that successes and failures on those tests depend
largely on bandwidth. On a shared environment I used for tests,
the download of those images takes roughly 400 seconds, resulting
in failures. On my own machine it takes around 60 seconds, and the
tests pass.

There's a conceptual and conflicting problem in that the environment
for tests to run should be prepared beforehand. The conflicting
solutions can be:

 * extensive bootstrapping of the test execution environment, such
   as the installation of guests from ISOs or installation trees, or
   the download of "default" images whether the tests will use them
   or not (this is what Avocado-VT does/requires)

 * keeping test assets in the tree (Avocado allows this if you have
   a your_test.py.data/ directory), but it's not practical for large
   files or files that can't or shouldn't be redistributed

> Even without the timeout, it makes the test really slow, even on
> repeated runs. Is there some way we can make the image download part
> of "building" the tests rather than actually running the testsuite, so
> that a) the tests themselves go faster and b) we don't include the
> download in the test timeout - obviously the download speed is hugely
> dependent on factors that aren't really related to what we're testing
> here.

In Avocado version 72.0 we attempted to minimize the issue by
implementing a "vmimage" command. So, if you expect to use Fedora 30
aarch64 images, you could run before your tests:

 $ avocado vmimage get --distro fedora --distro-version 30 --arch aarch64

And to list the images in your cache:

 $ avocado vmimage list

Unfortunately, this test doesn't use the vmimage API. Actually that
is fine, because not all test assets map nicely to the vmimage goal,
and those should keep using the more generic (and lower level)
fetch_asset().

We're now working on various "asset fetcher" improvements that should
allow us to check/cache all assets before a test is executed. Also,
we're adding a mode in which the "fetch_asset()" API will default to
cancel (aka SKIP) a test if the asset could not be downloaded.

If you're interested in the card we're using to track that new feature:

 https://trello.com/c/T3SC1sZs/1521-implement-fetch-assets-command-line-parameter

Another possibility that we've prototyped, and that we'll be working on
further, is to have a specific part of the "test" code execution
(really a pre-test phase) executed without a timeout and even
retried a number of times before bailing out and skipping the test.

> In the meantime, I tried hacking it by just increasing the timeout to
> 10m. That got several of the tests working for me, but one still
> failed. Specifically 'LinuxSSH.test_mips_malta32eb_kernel3_2_0' still
> timed out for me, but now after booting the guest, rather than during
> the image download. Looking at the avocado log file I'm seeing a
> bunch of soft lockup messages from the guest console, AFAICT. So it
> looks like we have a real bug here, which I suspect has been
> overlooked precisely because the download problems mean this test
> isn't reliable.

I've scheduled 100 executions of `make check-acceptance` builds, with
the linux_ssh_mips_malta.py tests having a 1500 second timeout. The
very first execution already brought interesting results:

 ...
 (15/39) /home/cleber/src/qemu/tests/acceptance/linux_ssh_mips_malta.py:LinuxSSH.test_mips_malta32eb_kernel3_2_0: PASS (198.38 s)
 (16/39) /home/cleber/src/qemu/tests/acceptance/linux_ssh_mips_malta.py:LinuxSSH.test_mips_malta64el_kernel3_2_0: FAIL: Failure message found in console: Oops (22.83 s)

I'll let you know about my full results. This should also serve as a
starting point to a discussion about the reliability of other tests,
as you mentioned before.

In my experience, and backed by the executions on Travis, most tests
have been really stable on x86_64 hosts. Last week I worked on
ppc64 and aarch64 hosts, and posted a number of patches addressing
the failures I found. I'll compile a list of the posted patches and
their status.

Thanks for reporting those issues.
- Cleber.

> Any thoughts on how to improve the situation?

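The "pre-test phase" idea above (fetch the asset with retries, outside
the timed part of the test, and skip rather than fail when the download
is impossible) can be sketched with the Python standard library alone.
This is only an illustration under stated assumptions: it is not
Avocado's fetch_asset(), and the names fetch_with_retries() and
require_asset(), the SHA-1 keyed cache layout, and the retry defaults
are all hypothetical.

```python
import hashlib
import time
import urllib.request
from pathlib import Path


def _sha1(path):
    """Return the hex SHA-1 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fetch_with_retries(url, sha1, cache_dir, attempts=3, delay=1.0):
    """Return the path of a verified cached copy of url, or None.

    Hypothetical helper: the cache entry is named after the expected
    SHA-1, so a second call with a warm cache does no network I/O.
    """
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    target = cache_dir / sha1
    for _ in range(attempts):
        if target.is_file() and _sha1(target) == sha1:
            return target
        try:
            urllib.request.urlretrieve(url, str(target))
        except (OSError, ValueError):
            # URLError derives from OSError; ValueError covers a
            # malformed URL. Wait a little, then try again.
            time.sleep(delay)
    # The last attempt may have succeeded, so verify once more.
    if target.is_file() and _sha1(target) == sha1:
        return target
    return None


def require_asset(test, url, sha1, cache_dir):
    """Skip the calling unittest.TestCase if the asset is unavailable."""
    path = fetch_with_retries(url, sha1, cache_dir)
    if path is None:
        test.skipTest("asset could not be downloaded: " + url)
    return path
```

Calling fetch_with_retries() from a separate step before the test run
would warm the cache, so that the timed test only pays for a checksum
of a local file. Whether that step belongs in "make" or in the test
runner is exactly the open question in this thread.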
* Re: Problems with MIPS Malta SSH tests in make check-acceptance
From: Cleber Rosa @ 2019-09-19 16:56 UTC
To: David Gibson; +Cc: philmd, ehabkost, qemu-devel

On Wed, Sep 18, 2019 at 09:14:58PM -0400, Cleber Rosa wrote:
> On Wed, Sep 18, 2019 at 05:16:54PM +1000, David Gibson wrote:

[...]

> I've scheduled 100 executions of `make check-acceptance` builds, with
> the linux_ssh_mips_malta.py tests having a 1500 second timeout. The
> very first execution already brought interesting results:
>
>  ...
>  (15/39) /home/cleber/src/qemu/tests/acceptance/linux_ssh_mips_malta.py:LinuxSSH.test_mips_malta32eb_kernel3_2_0: PASS (198.38 s)
>  (16/39) /home/cleber/src/qemu/tests/acceptance/linux_ssh_mips_malta.py:LinuxSSH.test_mips_malta64el_kernel3_2_0: FAIL: Failure message found in console: Oops (22.83 s)
>
> I'll let you know about my full results. This should also serve as a
> starting point to a discussion about the reliability of other tests,
> as you mentioned before.

Out of the 100 executions on a ppc64le host, the results that contain
failures and errors:

 15-/home/cleber/src/qemu/tests/acceptance/linux_ssh_mips_malta.py:LinuxSSH.test_mips_malta32eb_kernel3_2_0
  - PASS: 92
  - INTERRUPTED: 4
  - FAIL: 4
 16-/home/cleber/src/qemu/tests/acceptance/linux_ssh_mips_malta.py:LinuxSSH.test_mips_malta64el_kernel3_2_0
  - PASS: 95
  - FAIL: 5

FAIL means that self.fail() was called, which means 'Oops' was found
in the console. INTERRUPTED here means that the test timeout kicked
in, and I can back David's statements about soft lockups.

Let me know if anyone wants access to the full logs/results.

- Cleber.

[...]

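To put the counts above in terms of a pre-pull run, the small sketch
below tallies the reported statuses and estimates how often a single
run would see at least one of the two tests not pass. The helper names
summarize() and any_failure_rate() are made up for this example, and
the combined figure assumes the two tests fail independently, which the
thread does not establish.

```python
from collections import Counter


def summarize(statuses):
    """Return (Counter of statuses, fraction of runs that did not PASS)."""
    counts = Counter(statuses)
    total = sum(counts.values())
    return counts, (total - counts["PASS"]) / total


def any_failure_rate(rates):
    """Chance that at least one test does not pass in a single run.

    Assumes independent failures, which is an assumption of this
    sketch and not something measured in the thread.
    """
    all_pass = 1.0
    for rate in rates:
        all_pass *= 1.0 - rate
    return 1.0 - all_pass


# Counts reported for the 100 executions on the ppc64le host.
runs = {
    "test_mips_malta32eb_kernel3_2_0":
        ["PASS"] * 92 + ["INTERRUPTED"] * 4 + ["FAIL"] * 4,
    "test_mips_malta64el_kernel3_2_0":
        ["PASS"] * 95 + ["FAIL"] * 5,
}

rates = []
for name, statuses in runs.items():
    counts, rate = summarize(statuses)
    rates.append(rate)
    print(f"{name}: {dict(counts)} non-pass rate {rate:.1%}")

print(f"at least one non-pass per run: {any_failure_rate(rates):.1%}")
```

Under that independence assumption, roughly one run in eight would be
disturbed by these two tests alone, even with the download time taken
out of the picture by the 1500 second timeout.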
* Re: Problems with MIPS Malta SSH tests in make check-acceptance
From: Philippe Mathieu-Daudé @ 2019-09-19 17:00 UTC
To: Cleber Rosa, David Gibson; +Cc: ehabkost, qemu-devel

On 9/19/19 6:56 PM, Cleber Rosa wrote:

[...]

> Out of the 100 executions on a ppc64le host, the results that contain
> failures and errors:
>
>  15-/home/cleber/src/qemu/tests/acceptance/linux_ssh_mips_malta.py:LinuxSSH.test_mips_malta32eb_kernel3_2_0
>   - PASS: 92
>   - INTERRUPTED: 4
>   - FAIL: 4
>  16-/home/cleber/src/qemu/tests/acceptance/linux_ssh_mips_malta.py:LinuxSSH.test_mips_malta64el_kernel3_2_0
>   - PASS: 95
>   - FAIL: 5
>
> FAIL means that self.fail() was called, which means 'Oops' was found
> in the console. INTERRUPTED here means that the test timeout kicked
> in, and I can back David's statements about soft lockups.
>
> Let me know if anyone wants access to the full logs/results.

Can you check whether the FAIL cases are this bug, please?

https://bugs.launchpad.net/qemu/+bug/1833661

Thanks,

Phil.

* Re: Problems with MIPS Malta SSH tests in make check-acceptance
From: Cleber Rosa @ 2019-09-19 17:14 UTC
To: Philippe Mathieu-Daudé; +Cc: qemu-devel, ehabkost, David Gibson

On Thu, Sep 19, 2019 at 07:00:49PM +0200, Philippe Mathieu-Daudé wrote:
> On 9/19/19 6:56 PM, Cleber Rosa wrote:

[...]

> > FAIL means that self.fail() was called, which means 'Oops' was found
> > in the console. INTERRUPTED here means that the test timeout kicked
> > in, and I can back David's statements about soft lockups.
> >
> > Let me know if anyone wants access to the full logs/results.
>
> Can you check whether the FAIL cases are this bug, please?
>
> https://bugs.launchpad.net/qemu/+bug/1833661

Yes, the errors do match. I posted an update there:

https://bugs.launchpad.net/qemu/+bug/1833661/comments/3

Cheers,
- Cleber.

* Re: Problems with MIPS Malta SSH tests in make check-acceptance 2019-09-19 17:14 ` Cleber Rosa @ 2019-09-19 18:54 ` Cleber Rosa 0 siblings, 0 replies; 8+ messages in thread From: Cleber Rosa @ 2019-09-19 18:54 UTC (permalink / raw) To: Philippe Mathieu-Daudé; +Cc: qemu-devel, ehabkost, David Gibson On Thu, Sep 19, 2019 at 01:14:50PM -0400, Cleber Rosa wrote: > On Thu, Sep 19, 2019 at 07:00:49PM +0200, Philippe Mathieu-Daudé wrote: > > On 9/19/19 6:56 PM, Cleber Rosa wrote: > > > On Wed, Sep 18, 2019 at 09:14:58PM -0400, Cleber Rosa wrote: > > >> On Wed, Sep 18, 2019 at 05:16:54PM +1000, David Gibson wrote: > > >>> Hi, > > >>> > > >>> I'm finding make check-acceptance is currently useless for me as a > > >>> pre-pull test, because a bunch of the tests are not at all reliable. > > >>> There are a bunch which I'm still investigating, but for now I'm > > >>> looking at the MIPS Malta SSH tests. > > >>> > > >>> There seem to be at least two problems here. First, the test includes > > >>> a download of a pretty big guest disk image. This can easily exhaust > > >>> the 2m30 timeout on its own. > > >>> > > >> > > >> You're correct that successes and failures on those tests depend > > >> largely on bandwith. On a shared environment I used for tests > > >> the download of those images take roughly 400 seconds, resulting > > >> in failures. On my own machine, around 60, and the tests pass. > > >> > > >> There's a conceptual and conflicting problem in that the environment > > >> for tests to run should be prepared beforehand. 
The conflicting > > >> solutions can be: > > >> > > >> * extensive bootstrapping of the test execution environment, such > > >> as the installation of guests from ISOs or installation trees, or > > >> the download of "default" images wether the tests will use it or > > >> not (this is what Avocado-VT does/requires) > > >> > > >> * keeping test assets in the tree (Avocado allows this if you have > > >> a your_test.py.data/ directory), but it's not practical for large > > >> files or files that can't or shouldn't be redistributed > > >> > > >>> Even without the timeout, it makes the test really slow, even on > > >>> repeated runs. Is there some way we can make the image download part > > >>> of "building" the tests rather than actually running the testsuite, so > > >>> that a) the test themselves go faster and b) we don't include the > > >>> download in the test timeout - obviously the download speed is hugely > > >>> dependent on factors that aren't really related to what we're testing > > >>> here. > > >>> > > >> > > >> On Avocado version 72.0 we attempted to minimize the isse by > > >> implementing a "vmimage" command. So, if you expect to use Fedora 30 > > >> aarch64 images, you could run before your tests: > > >> > > >> $ avocado vmimage get --distro fedora --distro-version 30 --arch aarch64 > > >> > > >> And to list the images on your cache: > > >> > > >> $ avocado vmimage list > > >> > > >> Unfortunately, this test doesn't use the vmimage API. Actually that > > >> is fine because not all test assets map nicely to the vmimage goal, > > >> and should keep using the more generic (and lower level) fetch_asset(). > > >> > > >> We're now working on various "asset fetcher" improvements that should > > >> allow us to check/cache all assets before a test is executed. Also, > > >> we're adding a mode in which the "fetch_asset()" API will default to > > >> cancel (aka SKIP) a test if the asset could not be downloaded. 
> > >>
> > >> If you're interested in the card we're using to track that new feature:
> > >>
> > >>   https://trello.com/c/T3SC1sZs/1521-implement-fetch-assets-command-line-parameter
> > >>
> > >> Another possibility that we've prototyped, and we'll be working on
> > >> further, is to make a specific part of the "test" code execution
> > >> (really a pre-test phase) to be executed without a timeout and even be
> > >> tried a number of times before bailing out and skipping the test.
> > >>
> > >>> In the meantime, I tried hacking it by just increasing the timeout to
> > >>> 10m. That got several of the tests working for me, but one still
> > >>> failed. Specifically 'LinuxSSH.test_mips_malta32eb_kernel3_2_0' still
> > >>> timed out for me, but now after booting the guest, rather than during
> > >>> the image download. Looking at the avocado log file I'm seeing a
> > >>> bunch of soft lockup messages from the guest console, AFAICT. So it
> > >>> looks like we have a real bug here, which I suspect has been
> > >>> overlooked precisely because the download problems mean this test
> > >>> isn't reliable.
> > >>>
> > >>
> > >> I've scheduled 100 executions of `make check-acceptance` builds, with
> > >> the linux_ssh_mips_malta.py tests having a 1500 seconds timeout. The
> > >> very first execution already brought interesting results:
> > >>
> > >> ...
> > >> (15/39) /home/cleber/src/qemu/tests/acceptance/linux_ssh_mips_malta.py:LinuxSSH.test_mips_malta32eb_kernel3_2_0: PASS (198.38 s)
> > >> (16/39) /home/cleber/src/qemu/tests/acceptance/linux_ssh_mips_malta.py:LinuxSSH.test_mips_malta64el_kernel3_2_0: FAIL: Failure message found in console: Oops (22.83 s)
> > >>
> > >> I'll let you know about my full results. This should also serve as a
> > >> starting point to a discussion about the reliability of other tests,
> > >> as you mentioned before.
> > >
> > > Out of the 100 executions on a ppc64le host, the results that contain
> > > failures and errors:
> > >
> > > 15-/home/cleber/src/qemu/tests/acceptance/linux_ssh_mips_malta.py:LinuxSSH.test_mips_malta32eb_kernel3_2_0
> > >  - PASS: 92
> > >  - INTERRUPTED: 4
> > >  - FAIL: 4
> > > 16-/home/cleber/src/qemu/tests/acceptance/linux_ssh_mips_malta.py:LinuxSSH.test_mips_malta64el_kernel3_2_0
> > >  - PASS: 95
> > >  - FAIL: 5
> > >
> > > FAIL means that self.fail() was called, which means 'Oops' was found
> > > in the console. INTERRUPTED here means that the test timeout kicked
> > > in, and I can back David's statements about soft lockups.
> > >
> > > Let me know if anyone wants access to the full logs/results.
> >
> > Can you check if the FAIL cases are this bug please?
> >
> > https://bugs.launchpad.net/qemu/+bug/1833661
> >
>
> Yes, the errors do match. I posted an update there:
>
>   https://bugs.launchpad.net/qemu/+bug/1833661/comments/3
>

What if we tag tests like this as "knownbug" (or a better name),
disabling execution by default?

- Cleber.

> Cheers,
> - Cleber.
>
> > Thanks,
> >
> > Phil.
> >
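The pre-test phase discussed in the quoted text (fetch the asset outside the test timeout, retry a few times, then skip the test rather than fail it) could look roughly like the sketch below. This is not Avocado's fetch_asset() implementation and does not use its API: prefetch_asset(), its parameters, and the cache layout are hypothetical names chosen for illustration, and unittest.SkipTest merely stands in for Avocado's cancel/SKIP behaviour.

```python
import hashlib
import os
import time
import unittest
import urllib.request


def _sha1(path):
    """Return the hex SHA-1 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def prefetch_asset(url, sha1, cache_dir, attempts=3, delay=5.0,
                   fetch=urllib.request.urlretrieve):
    """Hypothetical helper: make sure a verified copy of url is cached.

    Meant to run before the timed part of a test. Returns the cached
    path, or raises unittest.SkipTest when every attempt has failed.
    """
    os.makedirs(cache_dir, exist_ok=True)
    path = os.path.join(cache_dir, os.path.basename(url))
    for attempt in range(attempts):
        # A cache hit (or a successful previous attempt) ends the loop.
        if os.path.exists(path) and _sha1(path) == sha1:
            return path
        if attempt:
            time.sleep(delay)  # back off before retrying
        try:
            fetch(url, path)
        except OSError:  # urllib.error.URLError is a subclass of OSError
            continue
    # Verify the result of the last attempt as well.
    if os.path.exists(path) and _sha1(path) == sha1:
        return path
    raise unittest.SkipTest("could not fetch asset: %s" % url)
```

Calling such a helper from a setup step that is not subject to the test timeout would keep download time out of the 2m30 budget mentioned at the start of the thread, and a bad network would then produce a skipped test instead of an INTERRUPTED one.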