qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* iotest failure -- test possibly not using sufficiently unique temp filename?
@ 2019-09-27 16:39 Peter Maydell
  2019-09-27 16:44 ` Max Reitz
  0 siblings, 1 reply; 6+ messages in thread
From: Peter Maydell @ 2019-09-27 16:39 UTC (permalink / raw)
  To: QEMU Developers; +Cc: Kevin Wolf, Qemu-block, Max Reitz

Hi; I just saw this iotest failure (on an s390x box, as it happens):

  TEST    iotest-qcow2: 130 [fail]
QEMU          --
"/home/linux1/qemu/build/all/tests/qemu-iotests/../../s390x-softmmu/qemu-system-s390x"
-nodefaults -display none -machine accel=qtest
QEMU_IMG      -- "/home/linux1/qemu/build/all/tests/qemu-iotests/../../qemu-img"
QEMU_IO       --
"/home/linux1/qemu/build/all/tests/qemu-iotests/../../qemu-io"
--cache writeback -f qcow2
QEMU_NBD      -- "/home/linux1/qemu/build/all/tests/qemu-iotests/../../qemu-nbd"
IMGFMT        -- qcow2 (compat=1.1)
IMGPROTO      -- file
PLATFORM      -- Linux/s390x lxub05 4.15.0-58-generic
TEST_DIR      -- /home/linux1/qemu/build/all/tests/qemu-iotests/scratch
SOCKET_SCM_HELPER --
/home/linux1/qemu/build/all/tests/qemu-iotests/socket_scm_helper

--- /home/linux1/qemu/tests/qemu-iotests/130.out        2019-05-10
12:27:16.948075733 -0400
+++ /home/linux1/qemu/build/all/tests/qemu-iotests/130.out.bad
2019-09-27 12:01:23.649722655 -0400
@@ -18,20 +18,22 @@
 QEMU X.Y.Z monitor - type 'help' for more information
 (qemu) commit testdisk
 (qemu)
-image: TEST_DIR/t.IMGFMT
-file format: IMGFMT
-virtual size: 64 MiB (67108864 bytes)
-backing file: TEST_DIR/t.IMGFMT.orig
-backing file format: raw
+qemu-img: Could not open 'TEST_DIR/t.IMGFMT': Failed to get shared "write" lock
+Is another process using the image [TEST_DIR/t.IMGFMT]?

 === Marking image dirty (lazy refcounts) ===

+qemu-img: TEST_DIR/t.IMGFMT: Failed to get "write" lock
+Is another process using the image [TEST_DIR/t.IMGFMT]?
 Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864
-wrote 4096/4096 bytes at offset 0
-4 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+qemu-io: can't open device
/home/linux1/qemu/build/all/tests/qemu-iotests/scratch/t.qcow2: Failed
to get "write" lock
+Is another process using the image
[/home/linux1/qemu/build/all/tests/qemu-iotests/scratch/t.qcow2]?
+no file open, try 'help open'
 image: TEST_DIR/t.IMGFMT
 file format: IMGFMT
 virtual size: 64 MiB (67108864 bytes)
+backing file: TEST_DIR/t.IMGFMT.orig
+backing file format: raw
 Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864
backing_file=TEST_DIR/t.IMGFMT.orig backing_fmt=raw
 wrote 4096/4096 bytes at offset 0
 4 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)



This looks suspiciously like the test isn't using a unique
filename for its disk image: "qemu-iotests/scratch/t.qcow2"
in the build directory, and so perhaps it has collided with
another iotest ?

If we run 'make check' with a -j<something> option do the
iotests all get run serially anyway, or do they run in
parallel against each other ?

thanks
-- PMM


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: iotest failure -- test possibly not using sufficiently unique temp filename?
  2019-09-27 16:39 iotest failure -- test possibly not using sufficiently unique temp filename? Peter Maydell
@ 2019-09-27 16:44 ` Max Reitz
  2019-10-17 16:41   ` Peter Maydell
  0 siblings, 1 reply; 6+ messages in thread
From: Max Reitz @ 2019-09-27 16:44 UTC (permalink / raw)
  To: Peter Maydell, QEMU Developers; +Cc: Kevin Wolf, Thomas Huth, Qemu-block


[-- Attachment #1.1: Type: text/plain, Size: 3365 bytes --]

On 27.09.19 18:39, Peter Maydell wrote:
> Hi; I just saw this iotest failure (on an s390x box, as it happens):
> 
>   TEST    iotest-qcow2: 130 [fail]
> QEMU          --
> "/home/linux1/qemu/build/all/tests/qemu-iotests/../../s390x-softmmu/qemu-system-s390x"
> -nodefaults -display none -machine accel=qtest
> QEMU_IMG      -- "/home/linux1/qemu/build/all/tests/qemu-iotests/../../qemu-img"
> QEMU_IO       --
> "/home/linux1/qemu/build/all/tests/qemu-iotests/../../qemu-io"
> --cache writeback -f qcow2
> QEMU_NBD      -- "/home/linux1/qemu/build/all/tests/qemu-iotests/../../qemu-nbd"
> IMGFMT        -- qcow2 (compat=1.1)
> IMGPROTO      -- file
> PLATFORM      -- Linux/s390x lxub05 4.15.0-58-generic
> TEST_DIR      -- /home/linux1/qemu/build/all/tests/qemu-iotests/scratch
> SOCKET_SCM_HELPER --
> /home/linux1/qemu/build/all/tests/qemu-iotests/socket_scm_helper
> 
> --- /home/linux1/qemu/tests/qemu-iotests/130.out        2019-05-10
> 12:27:16.948075733 -0400
> +++ /home/linux1/qemu/build/all/tests/qemu-iotests/130.out.bad
> 2019-09-27 12:01:23.649722655 -0400
> @@ -18,20 +18,22 @@
>  QEMU X.Y.Z monitor - type 'help' for more information
>  (qemu) commit testdisk
>  (qemu)
> -image: TEST_DIR/t.IMGFMT
> -file format: IMGFMT
> -virtual size: 64 MiB (67108864 bytes)
> -backing file: TEST_DIR/t.IMGFMT.orig
> -backing file format: raw
> +qemu-img: Could not open 'TEST_DIR/t.IMGFMT': Failed to get shared "write" lock
> +Is another process using the image [TEST_DIR/t.IMGFMT]?
> 
>  === Marking image dirty (lazy refcounts) ===
> 
> +qemu-img: TEST_DIR/t.IMGFMT: Failed to get "write" lock
> +Is another process using the image [TEST_DIR/t.IMGFMT]?
>  Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864
> -wrote 4096/4096 bytes at offset 0
> -4 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
> +qemu-io: can't open device
> /home/linux1/qemu/build/all/tests/qemu-iotests/scratch/t.qcow2: Failed
> to get "write" lock
> +Is another process using the image
> [/home/linux1/qemu/build/all/tests/qemu-iotests/scratch/t.qcow2]?
> +no file open, try 'help open'
>  image: TEST_DIR/t.IMGFMT
>  file format: IMGFMT
>  virtual size: 64 MiB (67108864 bytes)
> +backing file: TEST_DIR/t.IMGFMT.orig
> +backing file format: raw
>  Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864
> backing_file=TEST_DIR/t.IMGFMT.orig backing_fmt=raw
>  wrote 4096/4096 bytes at offset 0
>  4 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
> 
> 
> 
> This looks suspiciously like the test isn't using a unique
> filename for its disk image: "qemu-iotests/scratch/t.qcow2"
> in the build directory, and so perhaps it has collided with
> another iotest ?
> 
> If we run 'make check' with a -j<something> option do the
> iotests all get run serially anyway, or do they run in
> parallel against each other ?

As far as I know, all iotests are executed serially.  Anything else
would not work with the same scratch directory.

The only thing I suspect is that some tool has been accidentally left
running by some previous test that still accesses its own image.  But I
don’t know.

Max

(I’m personally still very skeptical of running some iotests in make
check, so I don’t have an opinion on the matter other than “If it fails,
we should remove it from make check”.)


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: iotest failure -- test possibly not using sufficiently unique temp filename?
  2019-09-27 16:44 ` Max Reitz
@ 2019-10-17 16:41   ` Peter Maydell
  2019-10-18  6:20     ` Thomas Huth
  0 siblings, 1 reply; 6+ messages in thread
From: Peter Maydell @ 2019-10-17 16:41 UTC (permalink / raw)
  To: Max Reitz; +Cc: Kevin Wolf, Thomas Huth, QEMU Developers, Qemu-block

On Fri, 27 Sep 2019 at 17:44, Max Reitz <mreitz@redhat.com> wrote:
>
> On 27.09.19 18:39, Peter Maydell wrote:
> > Hi; I just saw this iotest failure (on an s390x box, as it happens):
> >
> >   TEST    iotest-qcow2: 130 [fail]
> > QEMU          --
> > "/home/linux1/qemu/build/all/tests/qemu-iotests/../../s390x-softmmu/qemu-system-s390x"
> > -nodefaults -display none -machine accel=qtest
> > QEMU_IMG      -- "/home/linux1/qemu/build/all/tests/qemu-iotests/../../qemu-img"
> > QEMU_IO       --
> > "/home/linux1/qemu/build/all/tests/qemu-iotests/../../qemu-io"
> > --cache writeback -f qcow2
> > QEMU_NBD      -- "/home/linux1/qemu/build/all/tests/qemu-iotests/../../qemu-nbd"
> > IMGFMT        -- qcow2 (compat=1.1)
> > IMGPROTO      -- file
> > PLATFORM      -- Linux/s390x lxub05 4.15.0-58-generic
> > TEST_DIR      -- /home/linux1/qemu/build/all/tests/qemu-iotests/scratch
> > SOCKET_SCM_HELPER --
> > /home/linux1/qemu/build/all/tests/qemu-iotests/socket_scm_helper
> >
> > --- /home/linux1/qemu/tests/qemu-iotests/130.out        2019-05-10
> > 12:27:16.948075733 -0400
> > +++ /home/linux1/qemu/build/all/tests/qemu-iotests/130.out.bad
> > 2019-09-27 12:01:23.649722655 -0400
> > @@ -18,20 +18,22 @@
> >  QEMU X.Y.Z monitor - type 'help' for more information
> >  (qemu) commit testdisk
> >  (qemu)
> > -image: TEST_DIR/t.IMGFMT
> > -file format: IMGFMT
> > -virtual size: 64 MiB (67108864 bytes)
> > -backing file: TEST_DIR/t.IMGFMT.orig
> > -backing file format: raw
> > +qemu-img: Could not open 'TEST_DIR/t.IMGFMT': Failed to get shared "write" lock
> > +Is another process using the image [TEST_DIR/t.IMGFMT]?
> >
> >  === Marking image dirty (lazy refcounts) ===
> >
> > +qemu-img: TEST_DIR/t.IMGFMT: Failed to get "write" lock
> > +Is another process using the image [TEST_DIR/t.IMGFMT]?
> >  Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864
> > -wrote 4096/4096 bytes at offset 0
> > -4 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
> > +qemu-io: can't open device
> > /home/linux1/qemu/build/all/tests/qemu-iotests/scratch/t.qcow2: Failed
> > to get "write" lock
> > +Is another process using the image
> > [/home/linux1/qemu/build/all/tests/qemu-iotests/scratch/t.qcow2]?
> > +no file open, try 'help open'
> >  image: TEST_DIR/t.IMGFMT
> >  file format: IMGFMT
> >  virtual size: 64 MiB (67108864 bytes)
> > +backing file: TEST_DIR/t.IMGFMT.orig
> > +backing file format: raw
> >  Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864
> > backing_file=TEST_DIR/t.IMGFMT.orig backing_fmt=raw
> >  wrote 4096/4096 bytes at offset 0
> >  4 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
> >
> >
> >
> > This looks suspiciously like the test isn't using a unique
> > filename for its disk image: "qemu-iotests/scratch/t.qcow2"
> > in the build directory, and so perhaps it has collided with
> > another iotest ?
> >
> > If we run 'make check' with a -j<something> option do the
> > iotests all get run serially anyway, or do they run in
> > parallel against each other ?
>
> As far as I know, all iotests are executed serially.  Anything else
> would not work with the same scratch directory.
>
> The only thing I suspect is that some tool has been accidentally left
> running by some previous test that still accesses its own image.  But I
> don’t know.

Just saw this one again with the same iotest 130 on the same
s390 box; only difference is that the log this time around
has the first part where qemu-img fails, but not the second part
where qemu-io fails:

--- /home/linux1/qemu/tests/qemu-iotests/130.out        2019-05-10
12:27:16.948075733 -0400
+++ /home/linux1/qemu/build/all/tests/qemu-iotests/130.out.bad
2019-10-17 11:56:43.450750873 -0400
@@ -18,11 +18,8 @@
 QEMU X.Y.Z monitor - type 'help' for more information
 (qemu) commit testdisk
 (qemu)
-image: TEST_DIR/t.IMGFMT
-file format: IMGFMT
-virtual size: 64 MiB (67108864 bytes)
-backing file: TEST_DIR/t.IMGFMT.orig
-backing file format: raw
+qemu-img: Could not open 'TEST_DIR/t.IMGFMT': Failed to get shared "write" lock
+Is another process using the image [TEST_DIR/t.IMGFMT]?

 === Marking image dirty (lazy refcounts) ===

On the host machine there don't seem to be any stray
processes which might have held the file open, and
indeed the file doesn't exist at all, so it got removed
by some cleanup or other.

thanks
-- PMM


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: iotest failure -- test possibly not using sufficiently unique temp filename?
  2019-10-17 16:41   ` Peter Maydell
@ 2019-10-18  6:20     ` Thomas Huth
  2019-10-18  8:42       ` Max Reitz
  0 siblings, 1 reply; 6+ messages in thread
From: Thomas Huth @ 2019-10-18  6:20 UTC (permalink / raw)
  To: Peter Maydell, Max Reitz; +Cc: Kevin Wolf, QEMU Developers, Qemu-block

On 17/10/2019 18.41, Peter Maydell wrote:
> On Fri, 27 Sep 2019 at 17:44, Max Reitz <mreitz@redhat.com> wrote:
>>
>> On 27.09.19 18:39, Peter Maydell wrote:
>>> Hi; I just saw this iotest failure (on an s390x box, as it happens):
>>>
>>>   TEST    iotest-qcow2: 130 [fail]
>>> QEMU          --
>>> "/home/linux1/qemu/build/all/tests/qemu-iotests/../../s390x-softmmu/qemu-system-s390x"
>>> -nodefaults -display none -machine accel=qtest
>>> QEMU_IMG      -- "/home/linux1/qemu/build/all/tests/qemu-iotests/../../qemu-img"
>>> QEMU_IO       --
>>> "/home/linux1/qemu/build/all/tests/qemu-iotests/../../qemu-io"
>>> --cache writeback -f qcow2
>>> QEMU_NBD      -- "/home/linux1/qemu/build/all/tests/qemu-iotests/../../qemu-nbd"
>>> IMGFMT        -- qcow2 (compat=1.1)
>>> IMGPROTO      -- file
>>> PLATFORM      -- Linux/s390x lxub05 4.15.0-58-generic
>>> TEST_DIR      -- /home/linux1/qemu/build/all/tests/qemu-iotests/scratch
>>> SOCKET_SCM_HELPER --
>>> /home/linux1/qemu/build/all/tests/qemu-iotests/socket_scm_helper
>>>
>>> --- /home/linux1/qemu/tests/qemu-iotests/130.out        2019-05-10
>>> 12:27:16.948075733 -0400
>>> +++ /home/linux1/qemu/build/all/tests/qemu-iotests/130.out.bad
>>> 2019-09-27 12:01:23.649722655 -0400
>>> @@ -18,20 +18,22 @@
>>>  QEMU X.Y.Z monitor - type 'help' for more information
>>>  (qemu) commit testdisk
>>>  (qemu)
>>> -image: TEST_DIR/t.IMGFMT
>>> -file format: IMGFMT
>>> -virtual size: 64 MiB (67108864 bytes)
>>> -backing file: TEST_DIR/t.IMGFMT.orig
>>> -backing file format: raw
>>> +qemu-img: Could not open 'TEST_DIR/t.IMGFMT': Failed to get shared "write" lock
>>> +Is another process using the image [TEST_DIR/t.IMGFMT]?
>>>
>>>  === Marking image dirty (lazy refcounts) ===
>>>
>>> +qemu-img: TEST_DIR/t.IMGFMT: Failed to get "write" lock
>>> +Is another process using the image [TEST_DIR/t.IMGFMT]?
>>>  Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864
>>> -wrote 4096/4096 bytes at offset 0
>>> -4 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
>>> +qemu-io: can't open device
>>> /home/linux1/qemu/build/all/tests/qemu-iotests/scratch/t.qcow2: Failed
>>> to get "write" lock
>>> +Is another process using the image
>>> [/home/linux1/qemu/build/all/tests/qemu-iotests/scratch/t.qcow2]?
>>> +no file open, try 'help open'
>>>  image: TEST_DIR/t.IMGFMT
>>>  file format: IMGFMT
>>>  virtual size: 64 MiB (67108864 bytes)
>>> +backing file: TEST_DIR/t.IMGFMT.orig
>>> +backing file format: raw
>>>  Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864
>>> backing_file=TEST_DIR/t.IMGFMT.orig backing_fmt=raw
>>>  wrote 4096/4096 bytes at offset 0
>>>  4 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
>>>
>>>
>>>
>>> This looks suspiciously like the test isn't using a unique
>>> filename for its disk image: "qemu-iotests/scratch/t.qcow2"
>>> in the build directory, and so perhaps it has collided with
>>> another iotest ?
>>>
>>> If we run 'make check' with a -j<something> option do the
>>> iotests all get run serially anyway, or do they run in
>>> parallel against each other ?
>>
>> As far as I know, all iotests are executed serially.  Anything else
>> would not work with the same scratch directory.
>>
>> The only thing I suspect is that some tool has been accidentally left
>> running by some previous test that still accesses its own image.  But I
>> don’t know.
> 
> Just saw this one again with the same iotest 130 on the same
> s390 box; only difference is that the log this time around
> has the first part where qemu-img fails, but not the second part
> where qemu-io fails:
> 
> --- /home/linux1/qemu/tests/qemu-iotests/130.out        2019-05-10
> 12:27:16.948075733 -0400
> +++ /home/linux1/qemu/build/all/tests/qemu-iotests/130.out.bad
> 2019-10-17 11:56:43.450750873 -0400
> @@ -18,11 +18,8 @@
>  QEMU X.Y.Z monitor - type 'help' for more information
>  (qemu) commit testdisk
>  (qemu)
> -image: TEST_DIR/t.IMGFMT
> -file format: IMGFMT
> -virtual size: 64 MiB (67108864 bytes)
> -backing file: TEST_DIR/t.IMGFMT.orig
> -backing file format: raw
> +qemu-img: Could not open 'TEST_DIR/t.IMGFMT': Failed to get shared "write" lock
> +Is another process using the image [TEST_DIR/t.IMGFMT]?
> 
>  === Marking image dirty (lazy refcounts) ===
> 
> On the host machine there don't seem to be any stray
> processes which might have held the file open, and
> indeed the file doesn't exist at all, so it got removed
> by some cleanup or other.

Ok, so unless someone has a clue what might be going on here (is there a
race in the test?), I'd suggest that we simply remove 130 from the auto
group again. Shall I send a patch?

 Thomas


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: iotest failure -- test possibly not using sufficiently unique temp filename?
  2019-10-18  6:20     ` Thomas Huth
@ 2019-10-18  8:42       ` Max Reitz
  2019-10-18 11:43         ` Thomas Huth
  0 siblings, 1 reply; 6+ messages in thread
From: Max Reitz @ 2019-10-18  8:42 UTC (permalink / raw)
  To: Thomas Huth, Peter Maydell; +Cc: Kevin Wolf, QEMU Developers, Qemu-block


[-- Attachment #1.1: Type: text/plain, Size: 5456 bytes --]

On 18.10.19 08:20, Thomas Huth wrote:
> On 17/10/2019 18.41, Peter Maydell wrote:
>> On Fri, 27 Sep 2019 at 17:44, Max Reitz <mreitz@redhat.com> wrote:
>>>
>>> On 27.09.19 18:39, Peter Maydell wrote:
>>>> Hi; I just saw this iotest failure (on an s390x box, as it happens):
>>>>
>>>>   TEST    iotest-qcow2: 130 [fail]
>>>> QEMU          --
>>>> "/home/linux1/qemu/build/all/tests/qemu-iotests/../../s390x-softmmu/qemu-system-s390x"
>>>> -nodefaults -display none -machine accel=qtest
>>>> QEMU_IMG      -- "/home/linux1/qemu/build/all/tests/qemu-iotests/../../qemu-img"
>>>> QEMU_IO       --
>>>> "/home/linux1/qemu/build/all/tests/qemu-iotests/../../qemu-io"
>>>> --cache writeback -f qcow2
>>>> QEMU_NBD      -- "/home/linux1/qemu/build/all/tests/qemu-iotests/../../qemu-nbd"
>>>> IMGFMT        -- qcow2 (compat=1.1)
>>>> IMGPROTO      -- file
>>>> PLATFORM      -- Linux/s390x lxub05 4.15.0-58-generic
>>>> TEST_DIR      -- /home/linux1/qemu/build/all/tests/qemu-iotests/scratch
>>>> SOCKET_SCM_HELPER --
>>>> /home/linux1/qemu/build/all/tests/qemu-iotests/socket_scm_helper
>>>>
>>>> --- /home/linux1/qemu/tests/qemu-iotests/130.out        2019-05-10
>>>> 12:27:16.948075733 -0400
>>>> +++ /home/linux1/qemu/build/all/tests/qemu-iotests/130.out.bad
>>>> 2019-09-27 12:01:23.649722655 -0400
>>>> @@ -18,20 +18,22 @@
>>>>  QEMU X.Y.Z monitor - type 'help' for more information
>>>>  (qemu) commit testdisk
>>>>  (qemu)
>>>> -image: TEST_DIR/t.IMGFMT
>>>> -file format: IMGFMT
>>>> -virtual size: 64 MiB (67108864 bytes)
>>>> -backing file: TEST_DIR/t.IMGFMT.orig
>>>> -backing file format: raw
>>>> +qemu-img: Could not open 'TEST_DIR/t.IMGFMT': Failed to get shared "write" lock
>>>> +Is another process using the image [TEST_DIR/t.IMGFMT]?
>>>>
>>>>  === Marking image dirty (lazy refcounts) ===
>>>>
>>>> +qemu-img: TEST_DIR/t.IMGFMT: Failed to get "write" lock
>>>> +Is another process using the image [TEST_DIR/t.IMGFMT]?
>>>>  Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864
>>>> -wrote 4096/4096 bytes at offset 0
>>>> -4 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
>>>> +qemu-io: can't open device
>>>> /home/linux1/qemu/build/all/tests/qemu-iotests/scratch/t.qcow2: Failed
>>>> to get "write" lock
>>>> +Is another process using the image
>>>> [/home/linux1/qemu/build/all/tests/qemu-iotests/scratch/t.qcow2]?
>>>> +no file open, try 'help open'
>>>>  image: TEST_DIR/t.IMGFMT
>>>>  file format: IMGFMT
>>>>  virtual size: 64 MiB (67108864 bytes)
>>>> +backing file: TEST_DIR/t.IMGFMT.orig
>>>> +backing file format: raw
>>>>  Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864
>>>> backing_file=TEST_DIR/t.IMGFMT.orig backing_fmt=raw
>>>>  wrote 4096/4096 bytes at offset 0
>>>>  4 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
>>>>
>>>>
>>>>
>>>> This looks suspiciously like the test isn't using a unique
>>>> filename for its disk image: "qemu-iotests/scratch/t.qcow2"
>>>> in the build directory, and so perhaps it has collided with
>>>> another iotest ?
>>>>
>>>> If we run 'make check' with a -j<something> option do the
>>>> iotests all get run serially anyway, or do they run in
>>>> parallel against each other ?
>>>
>>> As far as I know, all iotests are executed serially.  Anything else
>>> would not work with the same scratch directory.
>>>
>>> The only thing I suspect is that some tool has been accidentally left
>>> running by some previous test that still accesses its own image.  But I
>>> don’t know.
>>
>> Just saw this one again with the same iotest 130 on the same
>> s390 box; only difference is that the log this time around
>> has the first part where qemu-img fails, but not the second part
>> where qemu-io fails:
>>
>> --- /home/linux1/qemu/tests/qemu-iotests/130.out        2019-05-10
>> 12:27:16.948075733 -0400
>> +++ /home/linux1/qemu/build/all/tests/qemu-iotests/130.out.bad
>> 2019-10-17 11:56:43.450750873 -0400
>> @@ -18,11 +18,8 @@
>>  QEMU X.Y.Z monitor - type 'help' for more information
>>  (qemu) commit testdisk
>>  (qemu)
>> -image: TEST_DIR/t.IMGFMT
>> -file format: IMGFMT
>> -virtual size: 64 MiB (67108864 bytes)
>> -backing file: TEST_DIR/t.IMGFMT.orig
>> -backing file format: raw
>> +qemu-img: Could not open 'TEST_DIR/t.IMGFMT': Failed to get shared "write" lock
>> +Is another process using the image [TEST_DIR/t.IMGFMT]?
>>
>>  === Marking image dirty (lazy refcounts) ===
>>
>> On the host machine there don't seem to be any stray
>> processes which might have held the file open, and
>> indeed the file doesn't exist at all, so it got removed
>> by some cleanup or other.
> 
> Ok, so unless someone has a clue what might be going on here (is there a
> race in the test?), I'd suggest that we simply remove 130 from the auto
> group again. Shall I send a patch?

I don’t have much of an idea.  It looks like maybe the qemu process
(which dos the commit) is lingering, but that shouldn’t be because
_cleanup_qemu always waits for it.  (Also, I can’t reproduce the problem
on my system.)

The only hunch that I have is that 130 seems to be the only test that
uses _cleanup_qemu to kill a qemu process (i.e. without wait=1) while it
has taken locks on an image and then still uses the image afterwards.
Maybe making it quit qemu through HMP would fix the problem.  But
knowing is difficult because I can’t reproduce it.

Max


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: iotest failure -- test possibly not using sufficiently unique temp filename?
  2019-10-18  8:42       ` Max Reitz
@ 2019-10-18 11:43         ` Thomas Huth
  0 siblings, 0 replies; 6+ messages in thread
From: Thomas Huth @ 2019-10-18 11:43 UTC (permalink / raw)
  To: Max Reitz, Peter Maydell; +Cc: Kevin Wolf, QEMU Developers, Qemu-block


[-- Attachment #1.1: Type: text/plain, Size: 5815 bytes --]

On 18/10/2019 10.42, Max Reitz wrote:
> On 18.10.19 08:20, Thomas Huth wrote:
>> On 17/10/2019 18.41, Peter Maydell wrote:
>>> On Fri, 27 Sep 2019 at 17:44, Max Reitz <mreitz@redhat.com> wrote:
>>>>
>>>> On 27.09.19 18:39, Peter Maydell wrote:
>>>>> Hi; I just saw this iotest failure (on an s390x box, as it happens):
>>>>>
>>>>>   TEST    iotest-qcow2: 130 [fail]
>>>>> QEMU          --
>>>>> "/home/linux1/qemu/build/all/tests/qemu-iotests/../../s390x-softmmu/qemu-system-s390x"
>>>>> -nodefaults -display none -machine accel=qtest
>>>>> QEMU_IMG      -- "/home/linux1/qemu/build/all/tests/qemu-iotests/../../qemu-img"
>>>>> QEMU_IO       --
>>>>> "/home/linux1/qemu/build/all/tests/qemu-iotests/../../qemu-io"
>>>>> --cache writeback -f qcow2
>>>>> QEMU_NBD      -- "/home/linux1/qemu/build/all/tests/qemu-iotests/../../qemu-nbd"
>>>>> IMGFMT        -- qcow2 (compat=1.1)
>>>>> IMGPROTO      -- file
>>>>> PLATFORM      -- Linux/s390x lxub05 4.15.0-58-generic
>>>>> TEST_DIR      -- /home/linux1/qemu/build/all/tests/qemu-iotests/scratch
>>>>> SOCKET_SCM_HELPER --
>>>>> /home/linux1/qemu/build/all/tests/qemu-iotests/socket_scm_helper
>>>>>
>>>>> --- /home/linux1/qemu/tests/qemu-iotests/130.out        2019-05-10
>>>>> 12:27:16.948075733 -0400
>>>>> +++ /home/linux1/qemu/build/all/tests/qemu-iotests/130.out.bad
>>>>> 2019-09-27 12:01:23.649722655 -0400
>>>>> @@ -18,20 +18,22 @@
>>>>>  QEMU X.Y.Z monitor - type 'help' for more information
>>>>>  (qemu) commit testdisk
>>>>>  (qemu)
>>>>> -image: TEST_DIR/t.IMGFMT
>>>>> -file format: IMGFMT
>>>>> -virtual size: 64 MiB (67108864 bytes)
>>>>> -backing file: TEST_DIR/t.IMGFMT.orig
>>>>> -backing file format: raw
>>>>> +qemu-img: Could not open 'TEST_DIR/t.IMGFMT': Failed to get shared "write" lock
>>>>> +Is another process using the image [TEST_DIR/t.IMGFMT]?
>>>>>
>>>>>  === Marking image dirty (lazy refcounts) ===
>>>>>
>>>>> +qemu-img: TEST_DIR/t.IMGFMT: Failed to get "write" lock
>>>>> +Is another process using the image [TEST_DIR/t.IMGFMT]?
>>>>>  Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864
>>>>> -wrote 4096/4096 bytes at offset 0
>>>>> -4 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
>>>>> +qemu-io: can't open device
>>>>> /home/linux1/qemu/build/all/tests/qemu-iotests/scratch/t.qcow2: Failed
>>>>> to get "write" lock
>>>>> +Is another process using the image
>>>>> [/home/linux1/qemu/build/all/tests/qemu-iotests/scratch/t.qcow2]?
>>>>> +no file open, try 'help open'
>>>>>  image: TEST_DIR/t.IMGFMT
>>>>>  file format: IMGFMT
>>>>>  virtual size: 64 MiB (67108864 bytes)
>>>>> +backing file: TEST_DIR/t.IMGFMT.orig
>>>>> +backing file format: raw
>>>>>  Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864
>>>>> backing_file=TEST_DIR/t.IMGFMT.orig backing_fmt=raw
>>>>>  wrote 4096/4096 bytes at offset 0
>>>>>  4 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
>>>>>
>>>>>
>>>>>
>>>>> This looks suspiciously like the test isn't using a unique
>>>>> filename for its disk image: "qemu-iotests/scratch/t.qcow2"
>>>>> in the build directory, and so perhaps it has collided with
>>>>> another iotest ?
>>>>>
>>>>> If we run 'make check' with a -j<something> option do the
>>>>> iotests all get run serially anyway, or do they run in
>>>>> parallel against each other ?
>>>>
>>>> As far as I know, all iotests are executed serially.  Anything else
>>>> would not work with the same scratch directory.
>>>>
>>>> The only thing I suspect is that some tool has been accidentally left
>>>> running by some previous test that still accesses its own image.  But I
>>>> don’t know.
>>>
>>> Just saw this one again with the same iotest 130 on the same
>>> s390 box; only difference is that the log this time around
>>> has the first part where qemu-img fails, but not the second part
>>> where qemu-io fails:
>>>
>>> --- /home/linux1/qemu/tests/qemu-iotests/130.out        2019-05-10
>>> 12:27:16.948075733 -0400
>>> +++ /home/linux1/qemu/build/all/tests/qemu-iotests/130.out.bad
>>> 2019-10-17 11:56:43.450750873 -0400
>>> @@ -18,11 +18,8 @@
>>>  QEMU X.Y.Z monitor - type 'help' for more information
>>>  (qemu) commit testdisk
>>>  (qemu)
>>> -image: TEST_DIR/t.IMGFMT
>>> -file format: IMGFMT
>>> -virtual size: 64 MiB (67108864 bytes)
>>> -backing file: TEST_DIR/t.IMGFMT.orig
>>> -backing file format: raw
>>> +qemu-img: Could not open 'TEST_DIR/t.IMGFMT': Failed to get shared "write" lock
>>> +Is another process using the image [TEST_DIR/t.IMGFMT]?
>>>
>>>  === Marking image dirty (lazy refcounts) ===
>>>
>>> On the host machine there don't seem to be any stray
>>> processes which might have held the file open, and
>>> indeed the file doesn't exist at all, so it got removed
>>> by some cleanup or other.
>>
>> Ok, so unless someone has a clue what might be going on here (is there a
>> race in the test?), I'd suggest that we simply remove 130 from the auto
>> group again. Shall I send a patch?
> 
> I don’t have much of an idea.  It looks like maybe the qemu process
> (which dos the commit) is lingering, but that shouldn’t be because
> _cleanup_qemu always waits for it.  (Also, I can’t reproduce the problem
> on my system.)
> 
> The only hunch that I have is that 130 seems to be the only test that
> uses _cleanup_qemu to kill a qemu process (i.e. without wait=1) while it
> has taken locks on an image and then still uses the image afterwards.
> Maybe making it quit qemu through HMP would fix the problem.  But
> knowing is difficult because I can’t reproduce it.

I also can't reproduce the problem. I've now run 130 several hundred
times in a loop on a s390x box and never saw the problem.

I think we should disable it for the time being.

 Thomas


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2019-10-18 11:44 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-09-27 16:39 iotest failure -- test possibly not using sufficiently unique temp filename? Peter Maydell
2019-09-27 16:44 ` Max Reitz
2019-10-17 16:41   ` Peter Maydell
2019-10-18  6:20     ` Thomas Huth
2019-10-18  8:42       ` Max Reitz
2019-10-18 11:43         ` Thomas Huth

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).