* iotests 041 intermittent failure (netbsd)
@ 2021-04-09  9:43 Peter Maydell
  2021-04-09 10:22 ` Philippe Mathieu-Daudé
  0 siblings, 1 reply; 5+ messages in thread
From: Peter Maydell @ 2021-04-09  9:43 UTC (permalink / raw)
  To: QEMU Developers, Qemu-block; +Cc: Kevin Wolf, Max Reitz

Just hit this (presumably intermittent) 041 failure running
the build-and-test on the tests/vm netbsd setup. Does it look
familiar to anybody?


  TEST   iotest-qcow2: 041 [fail]
QEMU          -- "/home/qemu/qemu-test.bx6kgg/build/tests/qemu-iotests/../../qemu-system-aarch64" -nodefaults -display none -accel qtest -machine virt
QEMU_IMG      -- "/home/qemu/qemu-test.bx6kgg/build/tests/qemu-iotests/../../qemu-img"
QEMU_IO       -- "/home/qemu/qemu-test.bx6kgg/build/tests/qemu-iotests/../../qemu-io" --cache writeback --aio threads -f qcow2
QEMU_NBD      -- "/home/qemu/qemu-test.bx6kgg/build/tests/qemu-iotests/../../qemu-nbd"
IMGFMT        -- qcow2
IMGPROTO      -- file
PLATFORM      -- NetBSD/amd64 localhost 9.1
TEST_DIR      -- /home/qemu/qemu-test.bx6kgg/build/tests/qemu-iotests/scratch
SOCK_DIR      -- /tmp/tmp5wf5bgkm
SOCKET_SCM_HELPER --
--- /home/qemu/qemu-test.bx6kgg/src/tests/qemu-iotests/041.out
+++ 041.out.bad
@@ -1,5 +1,29 @@
-...........................................................................................................
+..............................................................................E............................
+======================================================================
+ERROR: test_pause (__main__.TestSingleDrive)
+----------------------------------------------------------------------
+Traceback (most recent call last):
+  File "/home/qemu/qemu-test.bx6kgg/src/tests/qemu-iotests/041", line 111, in test_pause
+    self.pause_job('drive0')
+  File "/home/qemu/qemu-test.bx6kgg/src/tests/qemu-iotests/iotests.py", line 1064, in pause_job
+    return self.pause_wait(job_id)
+  File "/home/qemu/qemu-test.bx6kgg/src/tests/qemu-iotests/iotests.py", line 1050, in pause_wait
+    result = self.vm.qmp('query-block-jobs')
+  File "/home/qemu/qemu-test.bx6kgg/src/tests/qemu-iotests/../../python/qemu/machine.py", line 560, in qmp
fcntl(): Invalid argument
+    return self._qmp.cmd(cmd, args=qmp_args)
+  File "/home/qemu/qemu-test.bx6kgg/src/tests/qemu-iotests/../../python/qemu/qmp.py", line 278, in cmd
+    return self.cmd_obj(qmp_cmd)
+  File "/home/qemu/qemu-test.bx6kgg/src/tests/qemu-iotests/../../python/qemu/qmp.py", line 257, in cmd_obj
+    resp = self.__json_read()
+  File "/home/qemu/qemu-test.bx6kgg/src/tests/qemu-iotests/../../python/qemu/qmp.py", line 140, in __json_read
+    data = self.__sockfile.readline()
+  File "/usr/pkg/lib/python3.7/socket.py", line 589, in readinto
+    return self._sock.recv_into(b)
+  File "/home/qemu/qemu-test.bx6kgg/src/tests/qemu-iotests/iotests.py", line 482, in timeout
+    raise Exception(self.errmsg)
+Exception: Timeout waiting for job to pause
+
 ----------------------------------------------------------------------
 Ran 107 tests

-OK
+FAILED (errors=1)


thanks
-- PMM



* Re: iotests 041 intermittent failure (netbsd)
  2021-04-09  9:43 iotests 041 intermittent failure (netbsd) Peter Maydell
@ 2021-04-09 10:22 ` Philippe Mathieu-Daudé
  2021-04-09 10:31   ` Daniel P. Berrangé
  0 siblings, 1 reply; 5+ messages in thread
From: Philippe Mathieu-Daudé @ 2021-04-09 10:22 UTC (permalink / raw)
  To: Peter Maydell, QEMU Developers, Qemu-block; +Cc: Kevin Wolf, Max Reitz

On 4/9/21 11:43 AM, Peter Maydell wrote:
> Just hit this (presumably intermittent) 041 failure running
> the build-and-test on the tests/vm netbsd setup. Does it look
> familiar to anybody?

This one is known as the mysterious failure:
https://www.mail-archive.com/qemu-block@nongnu.org/msg73321.html





* Re: iotests 041 intermittent failure (netbsd)
  2021-04-09 10:22 ` Philippe Mathieu-Daudé
@ 2021-04-09 10:31   ` Daniel P. Berrangé
  2021-04-09 11:37     ` Kevin Wolf
  0 siblings, 1 reply; 5+ messages in thread
From: Daniel P. Berrangé @ 2021-04-09 10:31 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé
  Cc: Kevin Wolf, Peter Maydell, QEMU Developers, Qemu-block, Max Reitz

On Fri, Apr 09, 2021 at 12:22:26PM +0200, Philippe Mathieu-Daudé wrote:
> On 4/9/21 11:43 AM, Peter Maydell wrote:
> > Just hit this (presumably intermittent) 041 failure running
> > the build-and-test on the tests/vm netbsd setup. Does it look
> > familiar to anybody?
> 
> This one is known as the mysterious failure:
> https://www.mail-archive.com/qemu-block@nongnu.org/msg73321.html

If the test has been flakey with no confirmed fix since Sept 2020,
then it is well overdue to be switched to disabled by default, at
least on the platforms it is known to be flakey on.

Non-deterministic failures accumulate until you find yourself in
a situation where it's impossible to get CI to pass. We must be
aggressive in either (a) fixing non-deterministic failures promptly,
or (b) disabling the test until someone has time to work on a fix.
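As a rough illustration of option (b), here is a minimal sketch using plain Python unittest. The platform list and the helper is_flaky_platform() are hypothetical names made up for this example, and the test body is a placeholder. The iotests harness may well have its own mechanism for skipping tests, so this is only meant to show the general shape.

```python
import sys
import unittest

# Hypothetical list: which platforms are actually flaky is an assumption.
FLAKY_PLATFORM_PREFIXES = ("netbsd",)


def is_flaky_platform(platform=sys.platform):
    """Return True if the platform string matches a known-flaky prefix."""
    return platform.startswith(FLAKY_PLATFORM_PREFIXES)


class TestSingleDrive(unittest.TestCase):
    # Skipped (not failed) on the listed platforms until someone fixes it.
    @unittest.skipIf(is_flaky_platform(), "flaky on this platform, disabled")
    def test_pause(self):
        self.assertTrue(True)  # placeholder body for the sketch
```

Keeping the skip reason in the decorator means the disabled test still shows up in the test report rather than silently disappearing.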


Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|




* Re: iotests 041 intermittent failure (netbsd)
  2021-04-09 10:31   ` Daniel P. Berrangé
@ 2021-04-09 11:37     ` Kevin Wolf
  2021-04-09 13:41       ` Philippe Mathieu-Daudé
  0 siblings, 1 reply; 5+ messages in thread
From: Kevin Wolf @ 2021-04-09 11:37 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Peter Maydell, Philippe Mathieu-Daudé,
	QEMU Developers, Qemu-block, Max Reitz

Am 09.04.2021 um 12:31 hat Daniel P. Berrangé geschrieben:
> On Fri, Apr 09, 2021 at 12:22:26PM +0200, Philippe Mathieu-Daudé wrote:
> > On 4/9/21 11:43 AM, Peter Maydell wrote:
> > > Just hit this (presumably intermittent) 041 failure running
> > > the build-and-test on the tests/vm netbsd setup. Does it look
> > > familiar to anybody?
> > 
> > This one is known as the mysterious failure:
> > https://www.mail-archive.com/qemu-block@nongnu.org/msg73321.html
> 
> If the test has been flakey with no confirmed fix since Sept 2020,
> then it is well overdue to be switched to disabled by default, at
> least on the platforms it is known to be flakey on.

Why do you think this is the same problem? It is a completely different
error message, happening in a different test function. The problems
reported in September were fixed in the next version of the pull
request.

What Peter is reporting here is probably not specific to NetBSD, but
caused by overloaded test hosts. QMPTestCase.pause_wait() waits up to
3 seconds before it decides that the job has probably just failed to
pause at all, so that the test case doesn't hang indefinitely on
failure.

We can increase the timeout, but of course, that doesn't guarantee that
we'll never hit it again on very slow test hosts.
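To make the trade-off concrete, here is a minimal sketch of a poll-with-deadline loop. It is not the code in iotests.py: the function signature, the query_jobs callable and the fake query helper are all assumptions for illustration. Only the 3-second default, the job fields being polled and the error message come from this thread.

```python
import time


def pause_wait(query_jobs, job_id, timeout=3.0, interval=0.01):
    """Poll query_jobs() until job_id is paused and idle, or time out."""
    deadline = time.monotonic() + timeout
    while True:
        for job in query_jobs():
            if job["device"] == job_id and job["paused"] and not job["busy"]:
                return job
        # On a heavily loaded host the job may simply need longer than
        # 'timeout'; a larger value lowers the odds of a spurious
        # failure but cannot remove them.
        if time.monotonic() >= deadline:
            raise TimeoutError("Timeout waiting for job to pause")
        time.sleep(interval)


def make_fake_query(polls_until_paused):
    """Hypothetical stand-in for the QMP query: pauses after N polls."""
    state = {"polls": 0}

    def query():
        state["polls"] += 1
        paused = state["polls"] >= polls_until_paused
        return [{"device": "drive0", "paused": paused, "busy": not paused}]

    return query
```

With a stand-in like make_fake_query(3), the call returns once the third poll reports the job as paused. With a job that never pauses, it raises TimeoutError once the deadline has passed, which is the behaviour that keeps a test from hanging forever.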

Kevin




* Re: iotests 041 intermittent failure (netbsd)
  2021-04-09 11:37     ` Kevin Wolf
@ 2021-04-09 13:41       ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 5+ messages in thread
From: Philippe Mathieu-Daudé @ 2021-04-09 13:41 UTC (permalink / raw)
  To: Kevin Wolf, Daniel P. Berrangé
  Cc: Peter Maydell, QEMU Developers, Qemu-block, Max Reitz

On 4/9/21 1:37 PM, Kevin Wolf wrote:
> Am 09.04.2021 um 12:31 hat Daniel P. Berrangé geschrieben:
>> On Fri, Apr 09, 2021 at 12:22:26PM +0200, Philippe Mathieu-Daudé wrote:
>>> On 4/9/21 11:43 AM, Peter Maydell wrote:
>>>> Just hit this (presumably intermittent) 041 failure running
>>>> the build-and-test on the tests/vm netbsd setup. Does it look
>>>> familiar to anybody?
>>>
>>> This one is known as the mysterious failure:
>>> https://www.mail-archive.com/qemu-block@nongnu.org/msg73321.html
>>
>> If the test has been flakey with no confirmed fix since Sept 2020,
>> then it is well overdue to be switched to disabled by default, at
>> least on the platforms it is known to be flakey on.
> 
> Why do you think this is the same problem? It is a completely different
> error message, happening in a different test function. The problems
> reported in September were fixed in the next version of the pull
> request.

Oops my bad, I thought this was the same, sorry.

> What Peter is reporting here is probably unrelated to NetBSD, but to
> overloaded test hosts. QMPTestCase.pause_wait() uses a timeout of
> 3 seconds until it decides that the job probably has just failed to
> pause at all, so that the test case wouldn't hang indefinitely on
> failure.
> 
> We can increase the timeout, but of course, that doesn't guarantee that
> we'll never hit it again on very slow test hosts.
> 
> Kevin
> 


