Transient fail of iotests 215 and 197

* Transient fail of iotests 215 and 197
@ 2021-07-21 17:22 Daniel P. Berrangé
  2021-07-27 14:23 ` Thomas Huth
  0 siblings, 1 reply; 2+ messages in thread
From: Daniel P. Berrangé @ 2021-07-21 17:22 UTC (permalink / raw)
  To: qemu-devel, qemu-block; +Cc: Eric Blake, Max Reitz

Peter caught the following transient fail on the staging tree:

  https://gitlab.com/qemu-project/qemu/-/jobs/1438817749

--- /builds/qemu-project/qemu/tests/qemu-iotests/197.out
+++ 197.out.bad
@@ -12,13 +12,12 @@
 128 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 read 0/0 bytes at offset 0
 0 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
-read 2147483136/2147483136 bytes at offset 1024
-2 GiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+./common.rc: Killed                  ( VALGRIND_QEMU="${VALGRIND_QEMU_IO}" _qemu_proc_exec "${VALGRIND_LOGFILE}" "$QEMU_IO_PROG" $QEMU_IO_ARGS "$@" )
 read 1024/1024 bytes at offset 3221226496
 1 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 qemu-io: can't open device TEST_DIR/t.wrap.qcow2: Can't use copy-on-read on read-only device
-2 GiB (0x80010000) bytes     allocated at offset 0 bytes (0x0)
-1023.938 MiB (0x3fff0000) bytes not allocated at offset 2 GiB (0x80010000)
+2 GiB (0x80000000) bytes     allocated at offset 0 bytes (0x0)
+1 GiB (0x40000000) bytes not allocated at offset 2 GiB (0x80000000)
 64 KiB (0x10000) bytes     allocated at offset 3 GiB (0xc0000000)
 1023.938 MiB (0x3fff0000) bytes not allocated at offset 3 GiB (0xc0010000)
 No errors were found on the image.


--- /builds/qemu-project/qemu/tests/qemu-iotests/215.out
+++ 215.out.bad
@@ -12,13 +12,12 @@
 128 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 read 0/0 bytes at offset 0
 0 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
-read 2147483136/2147483136 bytes at offset 1024
-2 GiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+./common.rc: Killed                  ( VALGRIND_QEMU="${VALGRIND_QEMU_IO}" _qemu_proc_exec "${VALGRIND_LOGFILE}" "$QEMU_IO_PROG" $QEMU_IO_ARGS "$@" )
 read 1024/1024 bytes at offset 3221226496
 1 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 qemu-io: can't open device TEST_DIR/t.wrap.qcow2: Block node is read-only
-2 GiB (0x80010000) bytes     allocated at offset 0 bytes (0x0)
-1023.938 MiB (0x3fff0000) bytes not allocated at offset 2 GiB (0x80010000)
+2 GiB (0x80000000) bytes     allocated at offset 0 bytes (0x0)
+1 GiB (0x40000000) bytes not allocated at offset 2 GiB (0x80000000)
 64 KiB (0x10000) bytes     allocated at offset 3 GiB (0xc0000000)
 1023.938 MiB (0x3fff0000) bytes not allocated at offset 3 GiB (0xc0010000)
 No errors were found on the image.


Looks like the process might have been killed off by the OS part way
through.

Interestingly both test cases have a comment:

  #                                        Since a 2G read may exhaust
  # memory on some machines (particularly 32-bit), we skip the test if
  # that fails due to memory pressure.


I'm wondering if the logic for handling this failure is flawed, as being
killed by the OS for exhuasting memory limits for the CI container looks
like a plausible scenario to explain the failure.

The CI shared runners supposedly have 3.75 GB of RAM for the VM as a whole.
If the tests are run in parallel this could still be an issue.

Maybe we need to skip these tests by default if they are known to require
a significant amount of memory to run ?

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



^ permalink raw reply	[flat|nested] 2+ messages in thread