All of lore.kernel.org
 help / color / mirror / Atom feed
From: Thomas Huth <thuth@redhat.com>
To: "Cornelia Huck" <cohuck@redhat.com>,
	"Alex Bennée" <alex.bennee@linaro.org>
Cc: qemu-devel@nongnu.org,
	"Wainer dos Santos Moschetta" <wainersm@redhat.com>,
	"Halil Pasic" <pasic@linux.ibm.com>,
	"Christian Borntraeger" <borntraeger@de.ibm.com>,
	qemu-s390x@nongnu.org, "Cleber Rosa" <crosa@redhat.com>,
	"Philippe Mathieu-Daudé" <philmd@redhat.com>
Subject: Re: [PATCH] tests/acceptance: Fix race conditions in s390x tests & skip fedora on gitlab-CI
Date: Tue, 12 Jan 2021 14:31:11 +0100	[thread overview]
Message-ID: <efd30a52-8b5b-1afa-1505-6d6c296fe425@redhat.com> (raw)
In-Reply-To: <20210112132338.4122061b.cohuck@redhat.com>

On 12/01/2021 13.23, Cornelia Huck wrote:
> On Tue, 12 Jan 2021 11:32:44 +0000
> Alex Bennée <alex.bennee@linaro.org> wrote:
> 
>> Cornelia Huck <cohuck@redhat.com> writes:
>>
>>> On Fri,  8 Jan 2021 19:56:45 +0100
>>> Thomas Huth <thuth@redhat.com> wrote:
>>>   
>>>> There was a race condition in the first test where there was already the
>>>> "crw" output in the dmesg, but the "0.0.4711" entry has not been created
>>>> in the /sys fs yet. Fix it by waiting until it is there.
>>>>
>>>> The second test has even more problems on gitlab-CI. Even after adding some
>>>> more synchronization points (that wait for some messages in the "dmesg"
>>>> output to make sure that the modules got loaded correctly), there are still
>>>> occasionally some hangs in this test when it is running in the gitlab-CI.
>>>> So far I was unable to reproduce these hangs locally on my computer, so
>>>> this issue might take a while to debug. Thus disable the 2nd test in the
>>>> gitlab-CI until the problems are better understood and fixed.
>>>>
>>>> Signed-off-by: Thomas Huth <thuth@redhat.com>
>>>> ---
>>>>   tests/acceptance/machine_s390_ccw_virtio.py | 14 ++++++++++++--
>>>>   1 file changed, 12 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/tests/acceptance/machine_s390_ccw_virtio.py b/tests/acceptance/machine_s390_ccw_virtio.py
>>>> index eccf26b262..4028c99afc 100644
>>>> --- a/tests/acceptance/machine_s390_ccw_virtio.py
>>>> +++ b/tests/acceptance/machine_s390_ccw_virtio.py
>>>> @@ -12,6 +12,7 @@
>>>>   import os
>>>>   import tempfile
>>>>   
>>>> +from avocado import skipIf
>>>>   from avocado_qemu import Test
>>>>   from avocado_qemu import exec_command_and_wait_for_pattern
>>>>   from avocado_qemu import wait_for_console_pattern
>>>> @@ -133,8 +134,10 @@ class S390CCWVirtioMachine(Test):
>>>>           self.vm.command('device_add', driver='virtio-net-ccw',
>>>>                           devno='fe.0.4711', id='net_4711')
>>>>           self.wait_for_crw_reports()
>>>> -        exec_command_and_wait_for_pattern(self, 'ls /sys/bus/ccw/devices/',
>>>> -                                          '0.0.4711')
>>>> +        exec_command_and_wait_for_pattern(self, 'for i in 1 2 3 4 5 6 7 ; do '
>>>> +                    'if [ -e /sys/bus/ccw/devices/*4711 ]; then break; fi ;'
>>>> +                    'sleep 1 ; done ; ls /sys/bus/ccw/devices/',
>>>> +                    '0.0.4711')
>>>
>>> I'm wondering whether we should introduce a generic helper function for
>>> "execute command repeatedly, if the expected result did not yet show
>>> up", or "wait for a file/directory to exist". It's probably not
>>> uncommon for a desired outcome to arrive asynchronously, and having a
>>> function for waiting/retrying could be handy.
>>
>> We don't really want to encourage fragile shell scripts in the guest so
>> something that makes it easy to encode these loops in python. Currently
>> the _console_interaction helper fails the test if failure_message is
>> seen so I guess we need a slightly more liberal interaction which
>> accepts a command can fail so we can write something like:
>>
>>    while True:
>>        if exec_command_and_check(self, "stat -t /sys/bus/ccw/devices/0.0.4711",
>>                                  "/sys/bus/ccw/devices/0.0.4711"):
>>            break
>>
>> ?
> 
> Yes, something like that. The caller can decide whether they want to
> limit retries.

Fine for me, but I think we should use a timeout, not an amount of retries.

I already put my patch into my pull-request yesterday (so that people are 
not running into failures with their gitlab-CI), so if someone wants to have 
a go at creating such a function in python, feel free to do so by 
refactoring that code again.

  Thomas



  reply	other threads:[~2021-01-12 13:33 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-01-08 18:56 [PATCH] tests/acceptance: Fix race conditions in s390x tests & skip fedora on gitlab-CI Thomas Huth
2021-01-08 21:27 ` Willian Rampazzo
2021-01-12 10:53 ` Cornelia Huck
2021-01-12 11:32   ` Alex Bennée
2021-01-12 12:23     ` Cornelia Huck
2021-01-12 13:31       ` Thomas Huth [this message]
2021-01-12 13:53         ` Philippe Mathieu-Daudé
2021-01-14 19:13           ` Willian Rampazzo
2021-01-14 19:25             ` Willian Rampazzo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=efd30a52-8b5b-1afa-1505-6d6c296fe425@redhat.com \
    --to=thuth@redhat.com \
    --cc=alex.bennee@linaro.org \
    --cc=borntraeger@de.ibm.com \
    --cc=cohuck@redhat.com \
    --cc=crosa@redhat.com \
    --cc=pasic@linux.ibm.com \
    --cc=philmd@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=qemu-s390x@nongnu.org \
    --cc=wainersm@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.