All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] tests/acceptance: Fix race conditions in s390x tests & skip fedora on gitlab-CI
@ 2021-01-08 18:56 Thomas Huth
  2021-01-08 21:27 ` Willian Rampazzo
  2021-01-12 10:53 ` Cornelia Huck
  0 siblings, 2 replies; 9+ messages in thread
From: Thomas Huth @ 2021-01-08 18:56 UTC (permalink / raw)
  To: qemu-devel, Cornelia Huck
  Cc: Alex Bennée, Wainer dos Santos Moschetta, Halil Pasic,
	Christian Borntraeger, qemu-s390x, Cleber Rosa,
	Philippe Mathieu-Daudé

There was a race condition in the first test where there was already the
"crw" output in the dmesg, but the "0.0.4711" entry has not been created
in the /sys fs yet. Fix it by waiting until it is there.

The second test has even more problems on gitlab-CI. Even after adding some
more synchronization points (that wait for some messages in the "dmesg"
output to make sure that the modules got loaded correctly), there are still
occasionally some hangs in this test when it is running in the gitlab-CI.
So far I was unable to reproduce these hangs locally on my computer, so
this issue might take a while to debug. Thus disable the 2nd test in the
gitlab-CI until the problems are better understood and fixed.

Signed-off-by: Thomas Huth <thuth@redhat.com>
---
 tests/acceptance/machine_s390_ccw_virtio.py | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/tests/acceptance/machine_s390_ccw_virtio.py b/tests/acceptance/machine_s390_ccw_virtio.py
index eccf26b262..4028c99afc 100644
--- a/tests/acceptance/machine_s390_ccw_virtio.py
+++ b/tests/acceptance/machine_s390_ccw_virtio.py
@@ -12,6 +12,7 @@
 import os
 import tempfile
 
+from avocado import skipIf
 from avocado_qemu import Test
 from avocado_qemu import exec_command_and_wait_for_pattern
 from avocado_qemu import wait_for_console_pattern
@@ -133,8 +134,10 @@ class S390CCWVirtioMachine(Test):
         self.vm.command('device_add', driver='virtio-net-ccw',
                         devno='fe.0.4711', id='net_4711')
         self.wait_for_crw_reports()
-        exec_command_and_wait_for_pattern(self, 'ls /sys/bus/ccw/devices/',
-                                          '0.0.4711')
+        exec_command_and_wait_for_pattern(self, 'for i in 1 2 3 4 5 6 7 ; do '
+                    'if [ -e /sys/bus/ccw/devices/*4711 ]; then break; fi ;'
+                    'sleep 1 ; done ; ls /sys/bus/ccw/devices/',
+                    '0.0.4711')
         # and detach it again
         self.clear_guest_dmesg()
         self.vm.command('device_del', id='net_4711')
@@ -155,6 +158,7 @@ class S390CCWVirtioMachine(Test):
                                           'MemTotal:         115640 kB')
 
 
+    @skipIf(os.getenv('GITLAB_CI'), 'Running on GitLab')
     def test_s390x_fedora(self):
 
         """
@@ -199,6 +203,9 @@ class S390CCWVirtioMachine(Test):
 
         # Some tests to see whether the CLI options have been considered:
         self.log.info("Test whether QEMU CLI options have been considered")
+        exec_command_and_wait_for_pattern(self,
+                        'while ! (dmesg | grep enP7p0s0) ; do sleep 1 ; done',
+                        'virtio_net virtio0 enP7p0s0: renamed')
         exec_command_and_wait_for_pattern(self, 'lspci',
                              '0007:00:00.0 Class 0200: Device 1af4:1000')
         exec_command_and_wait_for_pattern(self,
@@ -222,6 +229,9 @@ class S390CCWVirtioMachine(Test):
         # can simply read the written "magic bytes" back from the PPM file to
         # check whether the framebuffer is working as expected.
         self.log.info("Test screendump of virtio-gpu device")
+        exec_command_and_wait_for_pattern(self,
+                        'while ! (dmesg | grep gpudrmfb) ; do sleep 1 ; done',
+                        'virtio_gpudrmfb frame buffer device')
         exec_command_and_wait_for_pattern(self,
             'echo -e "\e[?25l" > /dev/tty0', ':/#')
         exec_command_and_wait_for_pattern(self, 'for ((i=0;i<250;i++)); do '
-- 
2.27.0



^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH] tests/acceptance: Fix race conditions in s390x tests & skip fedora on gitlab-CI
  2021-01-08 18:56 [PATCH] tests/acceptance: Fix race conditions in s390x tests & skip fedora on gitlab-CI Thomas Huth
@ 2021-01-08 21:27 ` Willian Rampazzo
  2021-01-12 10:53 ` Cornelia Huck
  1 sibling, 0 replies; 9+ messages in thread
From: Willian Rampazzo @ 2021-01-08 21:27 UTC (permalink / raw)
  To: Thomas Huth
  Cc: Philippe Mathieu-Daudé,
	Cornelia Huck, qemu-devel, Wainer dos Santos Moschetta,
	Halil Pasic, Christian Borntraeger, open list:S390 Virtio-ccw,
	Cleber Rosa, Alex Bennée

On Fri, Jan 8, 2021 at 3:59 PM Thomas Huth <thuth@redhat.com> wrote:
>
> There was a race condition in the first test where there was already the
> "crw" output in the dmesg, but the "0.0.4711" entry has not been created
> in the /sys fs yet. Fix it by waiting until it is there.
>
> The second test has even more problems on gitlab-CI. Even after adding some
> more synchronization points (that wait for some messages in the "dmesg"
> output to make sure that the modules got loaded correctly), there are still
> occasionally some hangs in this test when it is running in the gitlab-CI.
> So far I was unable to reproduce these hangs locally on my computer, so
> this issue might take a while to debug. Thus disable the 2nd test in the
> gitlab-CI until the problems are better understood and fixed.
>
> Signed-off-by: Thomas Huth <thuth@redhat.com>
> ---
>  tests/acceptance/machine_s390_ccw_virtio.py | 14 ++++++++++++--
>  1 file changed, 12 insertions(+), 2 deletions(-)
>

Reviewed-by: Willian Rampazzo <willianr@redhat.com>
Tested-by: Willian Rampazzo <willianr@redhat.com>

JOB ID     : 6b2b3c1f6f6b0c4c2e9fd694b475bd12c193adbd
JOB LOG    : /home/linux1/src/qemu.dev/build/tests/results/job-2021-01-08T16.24-6b2b3c1/job.log
 (1/2) tests/acceptance/machine_s390_ccw_virtio.py:S390CCWVirtioMachine.test_s390x_devices:
PASS (8.78 s)
 (2/2) tests/acceptance/machine_s390_ccw_virtio.py:S390CCWVirtioMachine.test_s390x_fedora:
PASS (23.86 s)
RESULTS    : PASS 2 | ERROR 0 | FAIL 0 | SKIP 0 | WARN 0 | INTERRUPT 0
| CANCEL 0
JOB TIME   : 33.02 s



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] tests/acceptance: Fix race conditions in s390x tests & skip fedora on gitlab-CI
  2021-01-08 18:56 [PATCH] tests/acceptance: Fix race conditions in s390x tests & skip fedora on gitlab-CI Thomas Huth
  2021-01-08 21:27 ` Willian Rampazzo
@ 2021-01-12 10:53 ` Cornelia Huck
  2021-01-12 11:32   ` Alex Bennée
  1 sibling, 1 reply; 9+ messages in thread
From: Cornelia Huck @ 2021-01-12 10:53 UTC (permalink / raw)
  To: Thomas Huth
  Cc: Alex Bennée, qemu-devel, Wainer dos Santos Moschetta,
	Halil Pasic, Christian Borntraeger, qemu-s390x, Cleber Rosa,
	Philippe Mathieu-Daudé

On Fri,  8 Jan 2021 19:56:45 +0100
Thomas Huth <thuth@redhat.com> wrote:

> There was a race condition in the first test where there was already the
> "crw" output in the dmesg, but the "0.0.4711" entry has not been created
> in the /sys fs yet. Fix it by waiting until it is there.
> 
> The second test has even more problems on gitlab-CI. Even after adding some
> more synchronization points (that wait for some messages in the "dmesg"
> output to make sure that the modules got loaded correctly), there are still
> occasionally some hangs in this test when it is running in the gitlab-CI.
> So far I was unable to reproduce these hangs locally on my computer, so
> this issue might take a while to debug. Thus disable the 2nd test in the
> gitlab-CI until the problems are better understood and fixed.
> 
> Signed-off-by: Thomas Huth <thuth@redhat.com>
> ---
>  tests/acceptance/machine_s390_ccw_virtio.py | 14 ++++++++++++--
>  1 file changed, 12 insertions(+), 2 deletions(-)
> 
> diff --git a/tests/acceptance/machine_s390_ccw_virtio.py b/tests/acceptance/machine_s390_ccw_virtio.py
> index eccf26b262..4028c99afc 100644
> --- a/tests/acceptance/machine_s390_ccw_virtio.py
> +++ b/tests/acceptance/machine_s390_ccw_virtio.py
> @@ -12,6 +12,7 @@
>  import os
>  import tempfile
>  
> +from avocado import skipIf
>  from avocado_qemu import Test
>  from avocado_qemu import exec_command_and_wait_for_pattern
>  from avocado_qemu import wait_for_console_pattern
> @@ -133,8 +134,10 @@ class S390CCWVirtioMachine(Test):
>          self.vm.command('device_add', driver='virtio-net-ccw',
>                          devno='fe.0.4711', id='net_4711')
>          self.wait_for_crw_reports()
> -        exec_command_and_wait_for_pattern(self, 'ls /sys/bus/ccw/devices/',
> -                                          '0.0.4711')
> +        exec_command_and_wait_for_pattern(self, 'for i in 1 2 3 4 5 6 7 ; do '
> +                    'if [ -e /sys/bus/ccw/devices/*4711 ]; then break; fi ;'
> +                    'sleep 1 ; done ; ls /sys/bus/ccw/devices/',
> +                    '0.0.4711')

I'm wondering whether we should introduce a generic helper function for
"execute command repeatedly, if the expected result did not yet show
up", or "wait for a file/directory to exist". It's probably not
uncommon for a desired outcome to arrive asynchronously, and having a
function for waiting/retrying could be handy.

>          # and detach it again
>          self.clear_guest_dmesg()
>          self.vm.command('device_del', id='net_4711')



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] tests/acceptance: Fix race conditions in s390x tests & skip fedora on gitlab-CI
  2021-01-12 10:53 ` Cornelia Huck
@ 2021-01-12 11:32   ` Alex Bennée
  2021-01-12 12:23     ` Cornelia Huck
  0 siblings, 1 reply; 9+ messages in thread
From: Alex Bennée @ 2021-01-12 11:32 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Thomas Huth, qemu-devel, Wainer dos Santos Moschetta,
	Halil Pasic, Christian Borntraeger, qemu-s390x, Cleber Rosa,
	Philippe Mathieu-Daudé


Cornelia Huck <cohuck@redhat.com> writes:

> On Fri,  8 Jan 2021 19:56:45 +0100
> Thomas Huth <thuth@redhat.com> wrote:
>
>> There was a race condition in the first test where there was already the
>> "crw" output in the dmesg, but the "0.0.4711" entry has not been created
>> in the /sys fs yet. Fix it by waiting until it is there.
>> 
>> The second test has even more problems on gitlab-CI. Even after adding some
>> more synchronization points (that wait for some messages in the "dmesg"
>> output to make sure that the modules got loaded correctly), there are still
>> occasionally some hangs in this test when it is running in the gitlab-CI.
>> So far I was unable to reproduce these hangs locally on my computer, so
>> this issue might take a while to debug. Thus disable the 2nd test in the
>> gitlab-CI until the problems are better understood and fixed.
>> 
>> Signed-off-by: Thomas Huth <thuth@redhat.com>
>> ---
>>  tests/acceptance/machine_s390_ccw_virtio.py | 14 ++++++++++++--
>>  1 file changed, 12 insertions(+), 2 deletions(-)
>> 
>> diff --git a/tests/acceptance/machine_s390_ccw_virtio.py b/tests/acceptance/machine_s390_ccw_virtio.py
>> index eccf26b262..4028c99afc 100644
>> --- a/tests/acceptance/machine_s390_ccw_virtio.py
>> +++ b/tests/acceptance/machine_s390_ccw_virtio.py
>> @@ -12,6 +12,7 @@
>>  import os
>>  import tempfile
>>  
>> +from avocado import skipIf
>>  from avocado_qemu import Test
>>  from avocado_qemu import exec_command_and_wait_for_pattern
>>  from avocado_qemu import wait_for_console_pattern
>> @@ -133,8 +134,10 @@ class S390CCWVirtioMachine(Test):
>>          self.vm.command('device_add', driver='virtio-net-ccw',
>>                          devno='fe.0.4711', id='net_4711')
>>          self.wait_for_crw_reports()
>> -        exec_command_and_wait_for_pattern(self, 'ls /sys/bus/ccw/devices/',
>> -                                          '0.0.4711')
>> +        exec_command_and_wait_for_pattern(self, 'for i in 1 2 3 4 5 6 7 ; do '
>> +                    'if [ -e /sys/bus/ccw/devices/*4711 ]; then break; fi ;'
>> +                    'sleep 1 ; done ; ls /sys/bus/ccw/devices/',
>> +                    '0.0.4711')
>
> I'm wondering whether we should introduce a generic helper function for
> "execute command repeatedly, if the expected result did not yet show
> up", or "wait for a file/directory to exist". It's probably not
> uncommon for a desired outcome to arrive asynchronously, and having a
> function for waiting/retrying could be handy.

We don't really want to encourage fragile shell scripts in the guest so
something that makes it easy to encode these loops in python. Currently
the _console_interaction helper fails the test if failure_message is
seen so I guess we need a slightly more liberal interaction which
accepts a command can fail so we can write something like:

  while True:
      if exec_command_and_check(self, "stat -t /sys/bus/ccw/devices/0.0.4711",
                                "/sys/bus/ccw/devices/0.0.4711"):
          break

?

>
>>          # and detach it again
>>          self.clear_guest_dmesg()
>>          self.vm.command('device_del', id='net_4711')


-- 
Alex Bennée


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] tests/acceptance: Fix race conditions in s390x tests & skip fedora on gitlab-CI
  2021-01-12 11:32   ` Alex Bennée
@ 2021-01-12 12:23     ` Cornelia Huck
  2021-01-12 13:31       ` Thomas Huth
  0 siblings, 1 reply; 9+ messages in thread
From: Cornelia Huck @ 2021-01-12 12:23 UTC (permalink / raw)
  To: Alex Bennée
  Cc: Thomas Huth, qemu-devel, Wainer dos Santos Moschetta,
	Halil Pasic, Christian Borntraeger, qemu-s390x, Cleber Rosa,
	Philippe Mathieu-Daudé

On Tue, 12 Jan 2021 11:32:44 +0000
Alex Bennée <alex.bennee@linaro.org> wrote:

> Cornelia Huck <cohuck@redhat.com> writes:
> 
> > On Fri,  8 Jan 2021 19:56:45 +0100
> > Thomas Huth <thuth@redhat.com> wrote:
> >  
> >> There was a race condition in the first test where there was already the
> >> "crw" output in the dmesg, but the "0.0.4711" entry has not been created
> >> in the /sys fs yet. Fix it by waiting until it is there.
> >> 
> >> The second test has even more problems on gitlab-CI. Even after adding some
> >> more synchronization points (that wait for some messages in the "dmesg"
> >> output to make sure that the modules got loaded correctly), there are still
> >> occasionally some hangs in this test when it is running in the gitlab-CI.
> >> So far I was unable to reproduce these hangs locally on my computer, so
> >> this issue might take a while to debug. Thus disable the 2nd test in the
> >> gitlab-CI until the problems are better understood and fixed.
> >> 
> >> Signed-off-by: Thomas Huth <thuth@redhat.com>
> >> ---
> >>  tests/acceptance/machine_s390_ccw_virtio.py | 14 ++++++++++++--
> >>  1 file changed, 12 insertions(+), 2 deletions(-)
> >> 
> >> diff --git a/tests/acceptance/machine_s390_ccw_virtio.py b/tests/acceptance/machine_s390_ccw_virtio.py
> >> index eccf26b262..4028c99afc 100644
> >> --- a/tests/acceptance/machine_s390_ccw_virtio.py
> >> +++ b/tests/acceptance/machine_s390_ccw_virtio.py
> >> @@ -12,6 +12,7 @@
> >>  import os
> >>  import tempfile
> >>  
> >> +from avocado import skipIf
> >>  from avocado_qemu import Test
> >>  from avocado_qemu import exec_command_and_wait_for_pattern
> >>  from avocado_qemu import wait_for_console_pattern
> >> @@ -133,8 +134,10 @@ class S390CCWVirtioMachine(Test):
> >>          self.vm.command('device_add', driver='virtio-net-ccw',
> >>                          devno='fe.0.4711', id='net_4711')
> >>          self.wait_for_crw_reports()
> >> -        exec_command_and_wait_for_pattern(self, 'ls /sys/bus/ccw/devices/',
> >> -                                          '0.0.4711')
> >> +        exec_command_and_wait_for_pattern(self, 'for i in 1 2 3 4 5 6 7 ; do '
> >> +                    'if [ -e /sys/bus/ccw/devices/*4711 ]; then break; fi ;'
> >> +                    'sleep 1 ; done ; ls /sys/bus/ccw/devices/',
> >> +                    '0.0.4711')  
> >
> > I'm wondering whether we should introduce a generic helper function for
> > "execute command repeatedly, if the expected result did not yet show
> > up", or "wait for a file/directory to exist". It's probably not
> > uncommon for a desired outcome to arrive asynchronously, and having a
> > function for waiting/retrying could be handy.  
> 
> We don't really want to encourage fragile shell scripts in the guest so
> something that makes it easy to encode these loops in python. Currently
> the _console_interaction helper fails the test if failure_message is
> seen so I guess we need a slightly more liberal interaction which
> accepts a command can fail so we can write something like:
> 
>   while True:
>       if exec_command_and_check(self, "stat -t /sys/bus/ccw/devices/0.0.4711",
>                                 "/sys/bus/ccw/devices/0.0.4711"):
>           break
> 
> ?

Yes, something like that. The caller can decide whether they want to
limit retries.

> 
> >  
> >>          # and detach it again
> >>          self.clear_guest_dmesg()
> >>          self.vm.command('device_del', id='net_4711')  
> 
> 



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] tests/acceptance: Fix race conditions in s390x tests & skip fedora on gitlab-CI
  2021-01-12 12:23     ` Cornelia Huck
@ 2021-01-12 13:31       ` Thomas Huth
  2021-01-12 13:53         ` Philippe Mathieu-Daudé
  0 siblings, 1 reply; 9+ messages in thread
From: Thomas Huth @ 2021-01-12 13:31 UTC (permalink / raw)
  To: Cornelia Huck, Alex Bennée
  Cc: qemu-devel, Wainer dos Santos Moschetta, Halil Pasic,
	Christian Borntraeger, qemu-s390x, Cleber Rosa,
	Philippe Mathieu-Daudé

On 12/01/2021 13.23, Cornelia Huck wrote:
> On Tue, 12 Jan 2021 11:32:44 +0000
> Alex Bennée <alex.bennee@linaro.org> wrote:
> 
>> Cornelia Huck <cohuck@redhat.com> writes:
>>
>>> On Fri,  8 Jan 2021 19:56:45 +0100
>>> Thomas Huth <thuth@redhat.com> wrote:
>>>   
>>>> There was a race condition in the first test where there was already the
>>>> "crw" output in the dmesg, but the "0.0.4711" entry has not been created
>>>> in the /sys fs yet. Fix it by waiting until it is there.
>>>>
>>>> The second test has even more problems on gitlab-CI. Even after adding some
>>>> more synchronization points (that wait for some messages in the "dmesg"
>>>> output to make sure that the modules got loaded correctly), there are still
>>>> occasionally some hangs in this test when it is running in the gitlab-CI.
>>>> So far I was unable to reproduce these hangs locally on my computer, so
>>>> this issue might take a while to debug. Thus disable the 2nd test in the
>>>> gitlab-CI until the problems are better understood and fixed.
>>>>
>>>> Signed-off-by: Thomas Huth <thuth@redhat.com>
>>>> ---
>>>>   tests/acceptance/machine_s390_ccw_virtio.py | 14 ++++++++++++--
>>>>   1 file changed, 12 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/tests/acceptance/machine_s390_ccw_virtio.py b/tests/acceptance/machine_s390_ccw_virtio.py
>>>> index eccf26b262..4028c99afc 100644
>>>> --- a/tests/acceptance/machine_s390_ccw_virtio.py
>>>> +++ b/tests/acceptance/machine_s390_ccw_virtio.py
>>>> @@ -12,6 +12,7 @@
>>>>   import os
>>>>   import tempfile
>>>>   
>>>> +from avocado import skipIf
>>>>   from avocado_qemu import Test
>>>>   from avocado_qemu import exec_command_and_wait_for_pattern
>>>>   from avocado_qemu import wait_for_console_pattern
>>>> @@ -133,8 +134,10 @@ class S390CCWVirtioMachine(Test):
>>>>           self.vm.command('device_add', driver='virtio-net-ccw',
>>>>                           devno='fe.0.4711', id='net_4711')
>>>>           self.wait_for_crw_reports()
>>>> -        exec_command_and_wait_for_pattern(self, 'ls /sys/bus/ccw/devices/',
>>>> -                                          '0.0.4711')
>>>> +        exec_command_and_wait_for_pattern(self, 'for i in 1 2 3 4 5 6 7 ; do '
>>>> +                    'if [ -e /sys/bus/ccw/devices/*4711 ]; then break; fi ;'
>>>> +                    'sleep 1 ; done ; ls /sys/bus/ccw/devices/',
>>>> +                    '0.0.4711')
>>>
>>> I'm wondering whether we should introduce a generic helper function for
>>> "execute command repeatedly, if the expected result did not yet show
>>> up", or "wait for a file/directory to exist". It's probably not
>>> uncommon for a desired outcome to arrive asynchronously, and having a
>>> function for waiting/retrying could be handy.
>>
>> We don't really want to encourage fragile shell scripts in the guest so
>> something that makes it easy to encode these loops in python. Currently
>> the _console_interaction helper fails the test if failure_message is
>> seen so I guess we need a slightly more liberal interaction which
>> accepts a command can fail so we can write something like:
>>
>>    while True:
>>        if exec_command_and_check(self, "stat -t /sys/bus/ccw/devices/0.0.4711",
>>                                  "/sys/bus/ccw/devices/0.0.4711"):
>>            break
>>
>> ?
> 
> Yes, something like that. The caller can decide whether they want to
> limit retries.

Fine for me, but I think we should use a timeout, not an amount of retries.

I already put my patch into my pull-request yesterday (so that people are 
not running into failures with their gitlab-CI), so if someone wants to have 
a go at creating such a function in python, feel free to do so by 
refactoring that code again.

  Thomas



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] tests/acceptance: Fix race conditions in s390x tests & skip fedora on gitlab-CI
  2021-01-12 13:31       ` Thomas Huth
@ 2021-01-12 13:53         ` Philippe Mathieu-Daudé
  2021-01-14 19:13           ` Willian Rampazzo
  0 siblings, 1 reply; 9+ messages in thread
From: Philippe Mathieu-Daudé @ 2021-01-12 13:53 UTC (permalink / raw)
  To: Thomas Huth, Cornelia Huck, Alex Bennée, Cleber Rosa
  Cc: qemu-devel, Wainer dos Santos Moschetta, Halil Pasic,
	Christian Borntraeger, qemu-s390x, Willian Rampazzo

+Willian

On 1/12/21 2:31 PM, Thomas Huth wrote:
> On 12/01/2021 13.23, Cornelia Huck wrote:
>> On Tue, 12 Jan 2021 11:32:44 +0000
>> Alex Bennée <alex.bennee@linaro.org> wrote:
>>
>>> Cornelia Huck <cohuck@redhat.com> writes:
>>>
>>>> On Fri,  8 Jan 2021 19:56:45 +0100
>>>> Thomas Huth <thuth@redhat.com> wrote:
>>>>  
>>>>> There was a race condition in the first test where there was
>>>>> already the
>>>>> "crw" output in the dmesg, but the "0.0.4711" entry has not been
>>>>> created
>>>>> in the /sys fs yet. Fix it by waiting until it is there.
>>>>>
>>>>> The second test has even more problems on gitlab-CI. Even after
>>>>> adding some
>>>>> more synchronization points (that wait for some messages in the
>>>>> "dmesg"
>>>>> output to make sure that the modules got loaded correctly), there
>>>>> are still
>>>>> occasionally some hangs in this test when it is running in the
>>>>> gitlab-CI.
>>>>> So far I was unable to reproduce these hangs locally on my
>>>>> computer, so
>>>>> this issue might take a while to debug. Thus disable the 2nd test
>>>>> in the
>>>>> gitlab-CI until the problems are better understood and fixed.
>>>>>
>>>>> Signed-off-by: Thomas Huth <thuth@redhat.com>
>>>>> ---
>>>>>   tests/acceptance/machine_s390_ccw_virtio.py | 14 ++++++++++++--
>>>>>   1 file changed, 12 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/tests/acceptance/machine_s390_ccw_virtio.py
>>>>> b/tests/acceptance/machine_s390_ccw_virtio.py
>>>>> index eccf26b262..4028c99afc 100644
>>>>> --- a/tests/acceptance/machine_s390_ccw_virtio.py
>>>>> +++ b/tests/acceptance/machine_s390_ccw_virtio.py
>>>>> @@ -12,6 +12,7 @@
>>>>>   import os
>>>>>   import tempfile
>>>>>   +from avocado import skipIf
>>>>>   from avocado_qemu import Test
>>>>>   from avocado_qemu import exec_command_and_wait_for_pattern
>>>>>   from avocado_qemu import wait_for_console_pattern
>>>>> @@ -133,8 +134,10 @@ class S390CCWVirtioMachine(Test):
>>>>>           self.vm.command('device_add', driver='virtio-net-ccw',
>>>>>                           devno='fe.0.4711', id='net_4711')
>>>>>           self.wait_for_crw_reports()
>>>>> -        exec_command_and_wait_for_pattern(self, 'ls
>>>>> /sys/bus/ccw/devices/',
>>>>> -                                          '0.0.4711')
>>>>> +        exec_command_and_wait_for_pattern(self, 'for i in 1 2 3 4
>>>>> 5 6 7 ; do '
>>>>> +                    'if [ -e /sys/bus/ccw/devices/*4711 ]; then
>>>>> break; fi ;'
>>>>> +                    'sleep 1 ; done ; ls /sys/bus/ccw/devices/',
>>>>> +                    '0.0.4711')
>>>>
>>>> I'm wondering whether we should introduce a generic helper function for
>>>> "execute command repeatedly, if the expected result did not yet show
>>>> up", or "wait for a file/directory to exist". It's probably not
>>>> uncommon for a desired outcome to arrive asynchronously, and having a
>>>> function for waiting/retrying could be handy.
>>>
>>> We don't really want to encourage fragile shell scripts in the guest so
>>> something that makes it easy to encode these loops in python. Currently
>>> the _console_interaction helper fails the test if failure_message is
>>> seen so I guess we need a slightly more liberal interaction which
>>> accepts a command can fail so we can write something like:
>>>
>>>    while True:
>>>        if exec_command_and_check(self, "stat -t
>>> /sys/bus/ccw/devices/0.0.4711",
>>>                                  "/sys/bus/ccw/devices/0.0.4711"):
>>>            break
>>>
>>> ?
>>
>> Yes, something like that. The caller can decide whether they want to
>> limit retries.
> 
> Fine for me, but I think we should use a timeout, not an amount of retries.

We explained this feature request to the Avocado project at the
KVM forum 2018 in Lyon. There was an entry filled on their Trello
dashboard. Then the project switched to GitLab and I lost track
of it.

Cleber, if you remember, can you point us at the new ticket please?

Thanks,

Phil.

> I already put my patch into my pull-request yesterday (so that people
> are not running into failures with their gitlab-CI), so if someone wants
> to have a go at creating such a function in python, feel free to do so
> by refactoring that code again.
> 
>  Thomas
> 



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] tests/acceptance: Fix race conditions in s390x tests & skip fedora on gitlab-CI
  2021-01-12 13:53         ` Philippe Mathieu-Daudé
@ 2021-01-14 19:13           ` Willian Rampazzo
  2021-01-14 19:25             ` Willian Rampazzo
  0 siblings, 1 reply; 9+ messages in thread
From: Willian Rampazzo @ 2021-01-14 19:13 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé
  Cc: Thomas Huth, Cornelia Huck, qemu-devel,
	Wainer dos Santos Moschetta, Halil Pasic, Christian Borntraeger,
	open list:S390 Virtio-ccw, Cleber Rosa, Alex Bennée

On Tue, Jan 12, 2021 at 10:53 AM Philippe Mathieu-Daudé
<philmd@redhat.com> wrote:
>
...
> We explained this feature request to the Avocado project at the
> KVM forum 2018 in Lyon. There was an entry filled on their Trello
> dashboard. Then the project switched to GitLab and I lost track
> of it.

We moved our Avocado Trello board to the GitHub issues, but the
Avocado-QEMU Trello board stays live [1], with little or no action in
the last months as we are mostly using GitHub issued to track work any
on Avocado.

I don't recall or I was not involved in a discussion about a feature
like the one described in this thread in the KVM Forum of 2019 (Lyon).

>
> Cleber, if you remember, can you point us at the new ticket please?

I went through the Trello board and could not find a card related to
this feature. Anyway, just like the avocado_qemu package implements a
`wait_for_console_pattern`, it is possible to implement the needed
feature there.

>
> Thanks,
>
> Phil.
>
> > I already put my patch into my pull-request yesterday (so that people
> > are not running into failures with their gitlab-CI), so if someone wants
> > to have a go at creating such a function in python, feel free to do so
> > by refactoring that code again.
> >
> >  Thomas
> >
>



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] tests/acceptance: Fix race conditions in s390x tests & skip fedora on gitlab-CI
  2021-01-14 19:13           ` Willian Rampazzo
@ 2021-01-14 19:25             ` Willian Rampazzo
  0 siblings, 0 replies; 9+ messages in thread
From: Willian Rampazzo @ 2021-01-14 19:25 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé
  Cc: Thomas Huth, Cornelia Huck, qemu-devel,
	Wainer dos Santos Moschetta, Halil Pasic, Christian Borntraeger,
	open list:S390 Virtio-ccw, Cleber Rosa, Alex Bennée

Avocado-QEMU Trello board missing in my previous mail.

On Thu, Jan 14, 2021 at 4:13 PM Willian Rampazzo <wrampazz@redhat.com> wrote:
>
> On Tue, Jan 12, 2021 at 10:53 AM Philippe Mathieu-Daudé
> <philmd@redhat.com> wrote:
> >
> ...
> > We explained this feature request to the Avocado project at the
> > KVM forum 2018 in Lyon. There was an entry filled on their Trello
> > dashboard. Then the project switched to GitLab and I lost track
> > of it.
>
> We moved our Avocado Trello board to the GitHub issues, but the
> Avocado-QEMU Trello board stays live [1], with little or no action in
> the last months as we are mostly using GitHub issued to track work any
> on Avocado.

[1] https://trello.com/b/6Qi1pxVn/avocado-qemu

>
> I don't recall or I was not involved in a discussion about a feature
> like the one described in this thread in the KVM Forum of 2019 (Lyon).
>
> >
> > Cleber, if you remember, can you point us at the new ticket please?
>
> I went through the Trello board and could not find a card related to
> this feature. Anyway, just like the avocado_qemu package implements a
> `wait_for_console_pattern`, it is possible to implement the needed
> feature there.
>
> >
> > Thanks,
> >
> > Phil.
> >
> > > I already put my patch into my pull-request yesterday (so that people
> > > are not running into failures with their gitlab-CI), so if someone wants
> > > to have a go at creating such a function in python, feel free to do so
> > > by refactoring that code again.
> > >
> > >  Thomas
> > >
> >



^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2021-01-14 19:36 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-01-08 18:56 [PATCH] tests/acceptance: Fix race conditions in s390x tests & skip fedora on gitlab-CI Thomas Huth
2021-01-08 21:27 ` Willian Rampazzo
2021-01-12 10:53 ` Cornelia Huck
2021-01-12 11:32   ` Alex Bennée
2021-01-12 12:23     ` Cornelia Huck
2021-01-12 13:31       ` Thomas Huth
2021-01-12 13:53         ` Philippe Mathieu-Daudé
2021-01-14 19:13           ` Willian Rampazzo
2021-01-14 19:25             ` Willian Rampazzo

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.