* [Qemu-devel] [PATCH] cpus: Fix event order on resume of stopped guest
From: Markus Armbruster @ 2018-04-23  8:45 UTC
  To: qemu-devel; +Cc: qemu-block, mreitz, kwolf, Paolo Bonzini

When resume of a stopped guest immediately runs into block device
errors, the BLOCK_IO_ERROR event is sent before the RESUME event.

Reproducer:

1. Create a scratch image
   $ dd if=/dev/zero of=scratch.img bs=1M count=100

   Size doesn't actually matter.

2. Prepare blkdebug configuration:

   $ cat >blkdebug.conf <<EOF
   [inject-error]
   event = "write_aio"
   errno = "5"
   EOF

   Note that errno 5 is EIO.

3. Run a guest with an additional scratch disk, i.e. with additional
   arguments
   -drive if=none,id=scratch-drive,format=raw,werror=stop,file=blkdebug:blkdebug.conf:scratch.img
   -device virtio-blk-pci,id=scratch,drive=scratch-drive

   The blkdebug part makes all writes to the scratch drive fail with
   EIO.  The werror=stop pauses the guest on write errors.

4. Connect to the QMP socket e.g. like this:
   $ socat UNIX:/your/qmp/socket READLINE,history=$HOME/.qmp_history,prompt='QMP> '

   Issue QMP command 'qmp_capabilities':
   QMP> { "execute": "qmp_capabilities" }

5. Boot the guest.

6. In the guest, write to the scratch disk, e.g. like this:

   # dd if=/dev/zero of=/dev/vdb count=1

   Do double-check the device specified with of= is actually the
   scratch device!

7. Issue QMP command 'cont':
   QMP> { "execute": "cont" }

After step 6, I get a BLOCK_IO_ERROR event followed by a STOP event.  Good.

After step 7, I get BLOCK_IO_ERROR, then RESUME, then STOP.  Not so
good; I'd expect RESUME, then BLOCK_IO_ERROR, then STOP.

The funny event order confuses libvirt: virsh -r domstate DOMAIN
--reason reports "paused (unknown)" rather than "paused (I/O error)".

The culprit is vm_prepare_start().

    /* Ensure that a STOP/RESUME pair of events is emitted if a
     * vmstop request was pending.  The BLOCK_IO_ERROR event, for
     * example, according to documentation is always followed by
     * the STOP event.
     */
    if (runstate_is_running()) {
        qapi_event_send_stop(&error_abort);
        res = -1;
    } else {
        replay_enable_events();
        cpu_enable_ticks();
        runstate_set(RUN_STATE_RUNNING);
        vm_state_notify(1, RUN_STATE_RUNNING);
    }

    /* We are sending this now, but the CPUs will be resumed shortly later */
    qapi_event_send_resume(&error_abort);
    return res;

When resuming a stopped guest, we take the else branch before we get
to sending RESUME.  vm_state_notify() runs virtio_vmstate_change(),
among other things.  This restarts I/O, triggering the BLOCK_IO_ERROR
event.

Reshuffle vm_prepare_start() to send the RESUME event earlier.
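
To make the ordering concrete, here is a toy model in plain C.  It is a
sketch, not QEMU code: emit(), notify_running(), process_stop_request(),
resume_old(), resume_new() and trace() are made-up names.  The model
assumes the stop requested by the failing write is handled only after
the resume path returns, which is what the observed event order
suggests.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model only: none of these names exist in QEMU. */
#define MAX_EVENTS 8

static const char *events[MAX_EVENTS];
static int n_events;
static int stop_pending;

/* Record an event name in the order it is "sent". */
static void emit(const char *name)
{
    assert(n_events < MAX_EVENTS);
    events[n_events++] = name;
}

/* Stand-in for the state change handlers: I/O restarts, fails at once,
 * and the device requests a stop (assumed to be handled later). */
static void notify_running(void)
{
    emit("BLOCK_IO_ERROR");
    stop_pending = 1;
}

/* Stand-in for whatever later acts on the pending stop request. */
static void process_stop_request(void)
{
    if (stop_pending) {
        emit("STOP");
        stop_pending = 0;
    }
}

/* Order before the patch: handlers run, then RESUME is sent. */
void resume_old(void)
{
    notify_running();
    emit("RESUME");
}

/* Order after the patch: RESUME is sent, then handlers run. */
void resume_new(void)
{
    emit("RESUME");
    notify_running();
}

/* Run one resume variant and write the space-separated event names
 * into buf (len must be large enough for all names). */
void trace(void (*resume)(void), char *buf, size_t len)
{
    if (len == 0) {
        return;
    }
    n_events = 0;
    stop_pending = 0;
    resume();
    process_stop_request();
    buf[0] = '\0';
    for (int i = 0; i < n_events; i++) {
        if (i > 0) {
            strncat(buf, " ", len - strlen(buf) - 1);
        }
        strncat(buf, events[i], len - strlen(buf) - 1);
    }
}
```

Calling trace(resume_old, buf, sizeof(buf)) fills buf with the order
seen before this patch; trace(resume_new, buf, sizeof(buf)) gives the
order after it.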

Fixes RHBZ 1566153.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
---
 cpus.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/cpus.c b/cpus.c
index 38eba8bff3..398392bc3a 100644
--- a/cpus.c
+++ b/cpus.c
@@ -2043,7 +2043,6 @@ int vm_stop(RunState state)
 int vm_prepare_start(void)
 {
     RunState requested;
-    int res = 0;
 
     qemu_vmstop_requested(&requested);
     if (runstate_is_running() && requested == RUN_STATE__MAX) {
@@ -2057,17 +2056,18 @@ int vm_prepare_start(void)
      */
     if (runstate_is_running()) {
         qapi_event_send_stop(&error_abort);
-        res = -1;
-    } else {
-        replay_enable_events();
-        cpu_enable_ticks();
-        runstate_set(RUN_STATE_RUNNING);
-        vm_state_notify(1, RUN_STATE_RUNNING);
+        qapi_event_send_resume(&error_abort);
+        return -1;
     }
 
     /* We are sending this now, but the CPUs will be resumed shortly later */
     qapi_event_send_resume(&error_abort);
-    return res;
+
+    replay_enable_events();
+    cpu_enable_ticks();
+    runstate_set(RUN_STATE_RUNNING);
+    vm_state_notify(1, RUN_STATE_RUNNING);
+    return 0;
 }
 
 void vm_start(void)
-- 
2.13.6

* Re: [Qemu-devel] [PATCH] cpus: Fix event order on resume of stopped guest
From: Paolo Bonzini @ 2018-04-23 12:13 UTC
  To: Markus Armbruster, qemu-devel; +Cc: qemu-block, mreitz, kwolf

On 23/04/2018 10:45, Markus Armbruster wrote:
> When resume of a stopped guest immediately runs into block device
> errors, the BLOCK_IO_ERROR event is sent before the RESUME event.
> 
> Reproducer:
> 
> 1. Create a scratch image
>    $ dd if=/dev/zero of=scratch.img bs=1M count=100
> 
>    Size doesn't actually matter.
> 
> 2. Prepare blkdebug configuration:
> 
>    $ cat >blkdebug.conf <<EOF
>    [inject-error]
>    event = "write_aio"
>    errno = "5"
>    EOF
> 
>    Note that errno 5 is EIO.
> 
> 3. Run a guest with an additional scratch disk, i.e. with additional
>    arguments
>    -drive if=none,id=scratch-drive,format=raw,werror=stop,file=blkdebug:blkdebug.conf:scratch.img
>    -device virtio-blk-pci,id=scratch,drive=scratch-drive
> 
>    The blkdebug part makes all writes to the scratch drive fail with
>    EIO.  The werror=stop pauses the guest on write errors.
> 
> 4. Connect to the QMP socket e.g. like this:
>    $ socat UNIX:/your/qmp/socket READLINE,history=$HOME/.qmp_history,prompt='QMP> '
> 
>    Issue QMP command 'qmp_capabilities':
>    QMP> { "execute": "qmp_capabilities" }
> 
> 5. Boot the guest.
> 
> 6. In the guest, write to the scratch disk, e.g. like this:
> 
>    # dd if=/dev/zero of=/dev/vdb count=1
> 
>    Do double-check the device specified with of= is actually the
>    scratch device!
> 
> 7. Issue QMP command 'cont':
>    QMP> { "execute": "cont" }
> 
> After step 6, I get a BLOCK_IO_ERROR event followed by a STOP event.  Good.
> 
> After step 7, I get BLOCK_IO_ERROR, then RESUME, then STOP.  Not so
> good; I'd expect RESUME, then BLOCK_IO_ERROR, then STOP.
> 
> The funny event order confuses libvirt: virsh -r domstate DOMAIN
> --reason reports "paused (unknown)" rather than "paused (I/O error)".
> 
> The culprit is vm_prepare_start().
> 
>     /* Ensure that a STOP/RESUME pair of events is emitted if a
>      * vmstop request was pending.  The BLOCK_IO_ERROR event, for
>      * example, according to documentation is always followed by
>      * the STOP event.
>      */
>     if (runstate_is_running()) {
>         qapi_event_send_stop(&error_abort);
>         res = -1;
>     } else {
>         replay_enable_events();
>         cpu_enable_ticks();
>         runstate_set(RUN_STATE_RUNNING);
>         vm_state_notify(1, RUN_STATE_RUNNING);
>     }
> 
>     /* We are sending this now, but the CPUs will be resumed shortly later */
>     qapi_event_send_resume(&error_abort);
>     return res;
> 
> When resuming a stopped guest, we take the else branch before we get
> to sending RESUME.  vm_state_notify() runs virtio_vmstate_change(),
> among other things.  This restarts I/O, triggering the BLOCK_IO_ERROR
> event.
> 
> Reshuffle vm_prepare_start() to send the RESUME event earlier.
> 
> Fixes RHBZ 1566153.
> 
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Signed-off-by: Markus Armbruster <armbru@redhat.com>
> ---
>  cpus.c | 16 ++++++++--------
>  1 file changed, 8 insertions(+), 8 deletions(-)
> 
> diff --git a/cpus.c b/cpus.c
> index 38eba8bff3..398392bc3a 100644
> --- a/cpus.c
> +++ b/cpus.c
> @@ -2043,7 +2043,6 @@ int vm_stop(RunState state)
>  int vm_prepare_start(void)
>  {
>      RunState requested;
> -    int res = 0;
>  
>      qemu_vmstop_requested(&requested);
>      if (runstate_is_running() && requested == RUN_STATE__MAX) {
> @@ -2057,17 +2056,18 @@ int vm_prepare_start(void)
>       */
>      if (runstate_is_running()) {
>          qapi_event_send_stop(&error_abort);
> -        res = -1;
> -    } else {
> -        replay_enable_events();
> -        cpu_enable_ticks();
> -        runstate_set(RUN_STATE_RUNNING);
> -        vm_state_notify(1, RUN_STATE_RUNNING);
> +        qapi_event_send_resume(&error_abort);
> +        return -1;
>      }
>  
>      /* We are sending this now, but the CPUs will be resumed shortly later */
>      qapi_event_send_resume(&error_abort);
> -    return res;
> +
> +    replay_enable_events();
> +    cpu_enable_ticks();
> +    runstate_set(RUN_STATE_RUNNING);
> +    vm_state_notify(1, RUN_STATE_RUNNING);
> +    return 0;
>  }
>  
>  void vm_start(void)
> 

Queued, thanks.

Paolo

* Re: [Qemu-devel] [PATCH] cpus: Fix event order on resume of stopped guest
From: Kevin Wolf @ 2018-04-23 13:02 UTC
  To: Markus Armbruster; +Cc: qemu-devel, qemu-block, mreitz, Paolo Bonzini

On 23.04.2018 at 10:45, Markus Armbruster wrote:
> When resume of a stopped guest immediately runs into block device
> errors, the BLOCK_IO_ERROR event is sent before the RESUME event.
> 
> Reproducer:
> 
> 1. Create a scratch image
>    $ dd if=/dev/zero of=scratch.img bs=1M count=100
> 
>    Size doesn't actually matter.
> 
> 2. Prepare blkdebug configuration:
> 
>    $ cat >blkdebug.conf <<EOF
>    [inject-error]
>    event = "write_aio"
>    errno = "5"
>    EOF
> 
>    Note that errno 5 is EIO.
> 
> 3. Run a guest with an additional scratch disk, i.e. with additional
>    arguments
>    -drive if=none,id=scratch-drive,format=raw,werror=stop,file=blkdebug:blkdebug.conf:scratch.img
>    -device virtio-blk-pci,id=scratch,drive=scratch-drive
> 
>    The blkdebug part makes all writes to the scratch drive fail with
>    EIO.  The werror=stop pauses the guest on write errors.
> 
> 4. Connect to the QMP socket e.g. like this:
>    $ socat UNIX:/your/qmp/socket READLINE,history=$HOME/.qmp_history,prompt='QMP> '
> 
>    Issue QMP command 'qmp_capabilities':
>    QMP> { "execute": "qmp_capabilities" }
> 
> 5. Boot the guest.
> 
> 6. In the guest, write to the scratch disk, e.g. like this:
> 
>    # dd if=/dev/zero of=/dev/vdb count=1
> 
>    Do double-check the device specified with of= is actually the
>    scratch device!
> 
> 7. Issue QMP command 'cont':
>    QMP> { "execute": "cont" }
> 
> After step 6, I get a BLOCK_IO_ERROR event followed by a STOP event.  Good.
> 
> After step 7, I get BLOCK_IO_ERROR, then RESUME, then STOP.  Not so
> good; I'd expect RESUME, then BLOCK_IO_ERROR, then STOP.

Do you want to rephrase this in the form of a script for qemu-iotests?

I suppose the 'dd' line can be replaced by a 'qemu-io' monitor command.

Kevin

* Re: [Qemu-devel] [PATCH] cpus: Fix event order on resume of stopped guest
From: Markus Armbruster @ 2018-04-23 15:47 UTC
  To: Kevin Wolf; +Cc: Paolo Bonzini, qemu-devel, qemu-block, mreitz

Kevin Wolf <kwolf@redhat.com> writes:

> On 23.04.2018 at 10:45, Markus Armbruster wrote:
>> When resume of a stopped guest immediately runs into block device
>> errors, the BLOCK_IO_ERROR event is sent before the RESUME event.
>> 
>> Reproducer:
>> 
>> 1. Create a scratch image
>>    $ dd if=/dev/zero of=scratch.img bs=1M count=100
>> 
>>    Size doesn't actually matter.
>> 
>> 2. Prepare blkdebug configuration:
>> 
>>    $ cat >blkdebug.conf <<EOF
>>    [inject-error]
>>    event = "write_aio"
>>    errno = "5"
>>    EOF
>> 
>>    Note that errno 5 is EIO.
>> 
>> 3. Run a guest with an additional scratch disk, i.e. with additional
>>    arguments
>>    -drive if=none,id=scratch-drive,format=raw,werror=stop,file=blkdebug:blkdebug.conf:scratch.img
>>    -device virtio-blk-pci,id=scratch,drive=scratch-drive
>> 
>>    The blkdebug part makes all writes to the scratch drive fail with
>>    EIO.  The werror=stop pauses the guest on write errors.
>> 
>> 4. Connect to the QMP socket e.g. like this:
>>    $ socat UNIX:/your/qmp/socket READLINE,history=$HOME/.qmp_history,prompt='QMP> '
>> 
>>    Issue QMP command 'qmp_capabilities':
>>    QMP> { "execute": "qmp_capabilities" }
>> 
>> 5. Boot the guest.
>> 
>> 6. In the guest, write to the scratch disk, e.g. like this:
>> 
>>    # dd if=/dev/zero of=/dev/vdb count=1
>> 
>>    Do double-check the device specified with of= is actually the
>>    scratch device!
>> 
>> 7. Issue QMP command 'cont':
>>    QMP> { "execute": "cont" }
>> 
>> After step 6, I get a BLOCK_IO_ERROR event followed by a STOP event.  Good.
>> 
>> After step 7, I get BLOCK_IO_ERROR, then RESUME, then STOP.  Not so
>> good; I'd expect RESUME, then BLOCK_IO_ERROR, then STOP.
>
> Do you want to rephrase this in the form of a script for qemu-iotests?
>
> I suppose the 'dd' line can be replaced by a 'qemu-io' monitor command.

Makes sense, but I'm pretty much a noob there.  Perhaps I can copy
an existing test and hack it up.  Which one would you recommend?

* Re: [Qemu-devel] [PATCH] cpus: Fix event order on resume of stopped guest
From: Kevin Wolf @ 2018-04-23 16:24 UTC
  To: Markus Armbruster; +Cc: Paolo Bonzini, qemu-devel, qemu-block, mreitz

On 23.04.2018 at 17:47, Markus Armbruster wrote:
> Kevin Wolf <kwolf@redhat.com> writes:
> 
> > On 23.04.2018 at 10:45, Markus Armbruster wrote:
> >> When resume of a stopped guest immediately runs into block device
> >> errors, the BLOCK_IO_ERROR event is sent before the RESUME event.
> >> 
> >> Reproducer:
> >> 
> >> 1. Create a scratch image
> >>    $ dd if=/dev/zero of=scratch.img bs=1M count=100
> >> 
> >>    Size doesn't actually matter.
> >> 
> >> 2. Prepare blkdebug configuration:
> >> 
> >>    $ cat >blkdebug.conf <<EOF
> >>    [inject-error]
> >>    event = "write_aio"
> >>    errno = "5"
> >>    EOF
> >> 
> >>    Note that errno 5 is EIO.
> >> 
> >> 3. Run a guest with an additional scratch disk, i.e. with additional
> >>    arguments
> >>    -drive if=none,id=scratch-drive,format=raw,werror=stop,file=blkdebug:blkdebug.conf:scratch.img
> >>    -device virtio-blk-pci,id=scratch,drive=scratch-drive
> >> 
> >>    The blkdebug part makes all writes to the scratch drive fail with
> >>    EIO.  The werror=stop pauses the guest on write errors.
> >> 
> >> 4. Connect to the QMP socket e.g. like this:
> >>    $ socat UNIX:/your/qmp/socket READLINE,history=$HOME/.qmp_history,prompt='QMP> '
> >> 
> >>    Issue QMP command 'qmp_capabilities':
> >>    QMP> { "execute": "qmp_capabilities" }
> >> 
> >> 5. Boot the guest.
> >> 
> >> 6. In the guest, write to the scratch disk, e.g. like this:
> >> 
> >>    # dd if=/dev/zero of=/dev/vdb count=1
> >> 
> >>    Do double-check the device specified with of= is actually the
> >>    scratch device!
> >> 
> >> 7. Issue QMP command 'cont':
> >>    QMP> { "execute": "cont" }
> >> 
> >> After step 6, I get a BLOCK_IO_ERROR event followed by a STOP event.  Good.
> >> 
> >> After step 7, I get BLOCK_IO_ERROR, then RESUME, then STOP.  Not so
> >> good; I'd expect RESUME, then BLOCK_IO_ERROR, then STOP.
> >
> > Do you want to rephrase this in the form of a script for qemu-iotests?
> >
> > I suppose the 'dd' line can be replaced by a 'qemu-io' monitor command.
> 
> Makes sense, but I'm pretty much a noob there.  Perhaps I can copy
> an existing test and hack it up.  Which one would you recommend?

Depends on how much control you actually need for this test. At first
sight, it might be enough to copy one of the tests implementing a
run_qemu() function. These are tests that do essentially this:

    qemu-system-x86_64 -qmp stdio <<EOF
    ...commands here....
    EOF

This is all you need if you don't have a reason to wait for or even
parse QMP results. (The results end up in stdout, so they are validated
with the usual diffing.)

If you need a bit more, copy one of the tests that use ./common.qemu.
This is a bit more complex but allows you to wait for expected QMP
results before you continue with the next action. Probably you don't
need this here.

(And if even that is not powerful enough, Python test cases with
iotests.py are what you want. Almost certainly overkill for this one.)

Kevin

* Re: [Qemu-devel] [PATCH] cpus: Fix event order on resume of stopped guest
From: Markus Armbruster @ 2018-05-03 12:17 UTC
  To: Kevin Wolf; +Cc: Paolo Bonzini, qemu-devel, qemu-block, mreitz

Kevin Wolf <kwolf@redhat.com> writes:

> On 23.04.2018 at 10:45, Markus Armbruster wrote:
>> When resume of a stopped guest immediately runs into block device
>> errors, the BLOCK_IO_ERROR event is sent before the RESUME event.
>> 
>> Reproducer:
>> 
>> 1. Create a scratch image
>>    $ dd if=/dev/zero of=scratch.img bs=1M count=100
>> 
>>    Size doesn't actually matter.
>> 
>> 2. Prepare blkdebug configuration:
>> 
>>    $ cat >blkdebug.conf <<EOF
>>    [inject-error]
>>    event = "write_aio"
>>    errno = "5"
>>    EOF
>> 
>>    Note that errno 5 is EIO.
>> 
>> 3. Run a guest with an additional scratch disk, i.e. with additional
>>    arguments
>>    -drive if=none,id=scratch-drive,format=raw,werror=stop,file=blkdebug:blkdebug.conf:scratch.img
>>    -device virtio-blk-pci,id=scratch,drive=scratch-drive
>> 
>>    The blkdebug part makes all writes to the scratch drive fail with
>>    EIO.  The werror=stop pauses the guest on write errors.
>> 
>> 4. Connect to the QMP socket e.g. like this:
>>    $ socat UNIX:/your/qmp/socket READLINE,history=$HOME/.qmp_history,prompt='QMP> '
>> 
>>    Issue QMP command 'qmp_capabilities':
>>    QMP> { "execute": "qmp_capabilities" }
>> 
>> 5. Boot the guest.
>> 
>> 6. In the guest, write to the scratch disk, e.g. like this:
>> 
>>    # dd if=/dev/zero of=/dev/vdb count=1
>> 
>>    Do double-check the device specified with of= is actually the
>>    scratch device!
>> 
>> 7. Issue QMP command 'cont':
>>    QMP> { "execute": "cont" }
>> 
>> After step 6, I get a BLOCK_IO_ERROR event followed by a STOP event.  Good.
>> 
>> After step 7, I get BLOCK_IO_ERROR, then RESUME, then STOP.  Not so
>> good; I'd expect RESUME, then BLOCK_IO_ERROR, then STOP.
>
> Do you want to rephrase this in the form of a script for qemu-iotests?
>
> I suppose the 'dd' line can be replaced by a 'qemu-io' monitor command.

Uh, can it?  With qemu-io, the write doesn't stop the guest, because it
bypasses the device model, and thus blk_error_action().  I'm not aware
of ways to make qemu-iotests write via a device model.  I'm afraid we
need a full-fledged qtest.  Better ideas?

* Re: [Qemu-devel] [PATCH] cpus: Fix event order on resume of stopped guest
From: Kevin Wolf @ 2018-05-03 12:26 UTC
  To: Markus Armbruster; +Cc: Paolo Bonzini, qemu-devel, qemu-block, mreitz

On 03.05.2018 at 14:17, Markus Armbruster wrote:
> Kevin Wolf <kwolf@redhat.com> writes:
> 
> > On 23.04.2018 at 10:45, Markus Armbruster wrote:
> >> When resume of a stopped guest immediately runs into block device
> >> errors, the BLOCK_IO_ERROR event is sent before the RESUME event.
> >> 
> >> Reproducer:
> >> 
> >> 1. Create a scratch image
> >>    $ dd if=/dev/zero of=scratch.img bs=1M count=100
> >> 
> >>    Size doesn't actually matter.
> >> 
> >> 2. Prepare blkdebug configuration:
> >> 
> >>    $ cat >blkdebug.conf <<EOF
> >>    [inject-error]
> >>    event = "write_aio"
> >>    errno = "5"
> >>    EOF
> >> 
> >>    Note that errno 5 is EIO.
> >> 
> >> 3. Run a guest with an additional scratch disk, i.e. with additional
> >>    arguments
> >>    -drive if=none,id=scratch-drive,format=raw,werror=stop,file=blkdebug:blkdebug.conf:scratch.img
> >>    -device virtio-blk-pci,id=scratch,drive=scratch-drive
> >> 
> >>    The blkdebug part makes all writes to the scratch drive fail with
> >>    EIO.  The werror=stop pauses the guest on write errors.
> >> 
> >> 4. Connect to the QMP socket e.g. like this:
> >>    $ socat UNIX:/your/qmp/socket READLINE,history=$HOME/.qmp_history,prompt='QMP> '
> >> 
> >>    Issue QMP command 'qmp_capabilities':
> >>    QMP> { "execute": "qmp_capabilities" }
> >> 
> >> 5. Boot the guest.
> >> 
> >> 6. In the guest, write to the scratch disk, e.g. like this:
> >> 
> >>    # dd if=/dev/zero of=/dev/vdb count=1
> >> 
> >>    Do double-check the device specified with of= is actually the
> >>    scratch device!
> >> 
> >> 7. Issue QMP command 'cont':
> >>    QMP> { "execute": "cont" }
> >> 
> >> After step 6, I get a BLOCK_IO_ERROR event followed by a STOP event.  Good.
> >> 
> >> After step 7, I get BLOCK_IO_ERROR, then RESUME, then STOP.  Not so
> >> good; I'd expect RESUME, then BLOCK_IO_ERROR, then STOP.
> >
> > Do you want to rephrase this in the form of a script for qemu-iotests?
> >
> > I suppose the 'dd' line can be replaced by a 'qemu-io' monitor command.
> 
> Uh, can it?  With qemu-io, the write doesn't stop the guest, because it
> bypasses the device model, and thus blk_error_action().  I'm not aware
> of ways to make qemu-iotests write via a device model.  I'm afraid we
> need a full-fledged qtest.  Better ideas?

I'm afraid you're right. :-(

Did I ever mention that I don't really like having the werror logic in
the devices?

Kevin

* Re: [Qemu-devel] [PATCH] cpus: Fix event order on resume of stopped guest
From: Paolo Bonzini @ 2018-05-03 12:47 UTC
  To: Markus Armbruster, Kevin Wolf; +Cc: qemu-devel, qemu-block, mreitz

On 03/05/2018 14:17, Markus Armbruster wrote:
>>> 4. Connect to the QMP socket e.g. like this:
>>>    $ socat UNIX:/your/qmp/socket READLINE,history=$HOME/.qmp_history,prompt='QMP> '
>>>
>>>    Issue QMP command 'qmp_capabilities':
>>>    QMP> { "execute": "qmp_capabilities" }
>>>
>>> 5. Boot the guest.
>>>
>>> 6. In the guest, write to the scratch disk, e.g. like this:
>>>
>>>    # dd if=/dev/zero of=/dev/vdb count=1
>>>
>>>    Do double-check the device specified with of= is actually the
>>>    scratch device!
>>>
>>> 7. Issue QMP command 'cont':
>>>    QMP> { "execute": "cont" }
>>>
>>> After step 6, I get a BLOCK_IO_ERROR event followed by a STOP event.  Good.
>>>
>>> After step 7, I get BLOCK_IO_ERROR, then RESUME, then STOP.  Not so
>>> good; I'd expect RESUME, then BLOCK_IO_ERROR, then STOP.
>> Do you want to rephrase this in the form of a script for qemu-iotests?
>>
>> I suppose the 'dd' line can be replaced by a 'qemu-io' monitor command.
> Uh, can it?  With qemu-io, the write doesn't stop the guest, because it
> bypasses the device model, and thus blk_error_action().  I'm not aware
> of ways to make qemu-iotests write via a device model.  I'm afraid we
> need a full-fledged qtest.  Better ideas?

Yeah, using virtio-blk-test sounds like a good idea.

Paolo

* Re: [Qemu-devel] [PATCH] cpus: Fix event order on resume of stopped guest
From: Markus Armbruster @ 2018-05-03 12:50 UTC
  To: Kevin Wolf; +Cc: Paolo Bonzini, qemu-devel, qemu-block, mreitz

Kevin Wolf <kwolf@redhat.com> writes:

> On 03.05.2018 at 14:17, Markus Armbruster wrote:
>> Kevin Wolf <kwolf@redhat.com> writes:
>> 
>> > On 23.04.2018 at 10:45, Markus Armbruster wrote:
>> >> When resume of a stopped guest immediately runs into block device
>> >> errors, the BLOCK_IO_ERROR event is sent before the RESUME event.
>> >> 
>> >> Reproducer:
>> >> 
>> >> 1. Create a scratch image
>> >>    $ dd if=/dev/zero of=scratch.img bs=1M count=100
>> >> 
>> >>    Size doesn't actually matter.
>> >> 
>> >> 2. Prepare blkdebug configuration:
>> >> 
>> >>    $ cat >blkdebug.conf <<EOF
>> >>    [inject-error]
>> >>    event = "write_aio"
>> >>    errno = "5"
>> >>    EOF
>> >> 
>> >>    Note that errno 5 is EIO.
>> >> 
>> >> 3. Run a guest with an additional scratch disk, i.e. with additional
>> >>    arguments
>> >>    -drive if=none,id=scratch-drive,format=raw,werror=stop,file=blkdebug:blkdebug.conf:scratch.img
>> >>    -device virtio-blk-pci,id=scratch,drive=scratch-drive
>> >> 
>> >>    The blkdebug part makes all writes to the scratch drive fail with
>> >>    EIO.  The werror=stop pauses the guest on write errors.
>> >> 
>> >> 4. Connect to the QMP socket e.g. like this:
>> >>    $ socat UNIX:/your/qmp/socket READLINE,history=$HOME/.qmp_history,prompt='QMP> '
>> >> 
>> >>    Issue QMP command 'qmp_capabilities':
>> >>    QMP> { "execute": "qmp_capabilities" }
>> >> 
>> >> 5. Boot the guest.
>> >> 
>> >> 6. In the guest, write to the scratch disk, e.g. like this:
>> >> 
>> >>    # dd if=/dev/zero of=/dev/vdb count=1
>> >> 
>> >>    Do double-check the device specified with of= is actually the
>> >>    scratch device!
>> >> 
>> >> 7. Issue QMP command 'cont':
>> >>    QMP> { "execute": "cont" }
>> >> 
>> >> After step 6, I get a BLOCK_IO_ERROR event followed by a STOP event.  Good.
>> >> 
>> >> After step 7, I get BLOCK_IO_ERROR, then RESUME, then STOP.  Not so
>> >> good; I'd expect RESUME, then BLOCK_IO_ERROR, then STOP.
>> >
>> > Do you want to rephrase this in the form of a script for qemu-iotests?
>> >
>> > I suppose the 'dd' line can be replaced by a 'qemu-io' monitor command.
>> 
>> Uh, can it?  With qemu-io, the write doesn't stop the guest, because it
>> bypasses the device model, and thus blk_error_action().  I'm not aware
>> of ways to make qemu-iotests write via a device model.  I'm afraid we
>> need a full-fledged qtest.  Better ideas?
>
> I'm afraid you're right. :-(
>
> Did I ever mention that I don't really like having the werror logic in
> the devices?

Only a few times :)

There's an explanation next to blk_error_action():

/* This is done by device models because, while the block layer knows
 * about the error, it does not know whether an operation comes from
 * the device or the block layer (from a job, for example).
 */
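
The same point as a toy model, in case it helps.  This is only a sketch
under my own assumptions: struct backend, struct device, backend_write()
and device_write() are invented for illustration and are not QEMU's
types or functions.  It shows why a write issued straight at the block
layer never pauses the guest, while one going through a device with
werror=stop does.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Toy model; none of these types or functions are QEMU's. */
enum on_error { ON_ERROR_REPORT, ON_ERROR_STOP };

struct backend {
    int fail_writes;        /* nonzero: every write fails, like the
                             * blkdebug setup in the reproducer */
};

struct device {
    struct backend *be;
    enum on_error werror;   /* policy chosen with werror=... */
    int vm_stopped;         /* set when the policy pauses the guest */
};

/* Block-layer write: reports the error, knows nothing about werror. */
int backend_write(struct backend *be)
{
    assert(be != NULL);
    return be->fail_writes ? -EIO : 0;
}

/* Device-model write: the caller that knows the request came from the
 * guest is the one applying the error policy. */
int device_write(struct device *dev)
{
    assert(dev != NULL && dev->be != NULL);
    int ret = backend_write(dev->be);
    if (ret < 0 && dev->werror == ON_ERROR_STOP) {
        dev->vm_stopped = 1;
    }
    return ret;
}
```

With fail_writes set, backend_write() returns -EIO and leaves
vm_stopped alone; only device_write() sets it.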

* Re: [Qemu-devel] [PATCH] cpus: Fix event order on resume of stopped guest
From: Markus Armbruster @ 2018-05-03 14:38 UTC
  To: Paolo Bonzini; +Cc: Kevin Wolf, qemu-devel, qemu-block, mreitz

Paolo Bonzini <pbonzini@redhat.com> writes:

> On 03/05/2018 14:17, Markus Armbruster wrote:
>>>> 4. Connect to the QMP socket e.g. like this:
>>>>    $ socat UNIX:/your/qmp/socket READLINE,history=$HOME/.qmp_history,prompt='QMP> '
>>>>
>>>>    Issue QMP command 'qmp_capabilities':
>>>>    QMP> { "execute": "qmp_capabilities" }
>>>>
>>>> 5. Boot the guest.
>>>>
>>>> 6. In the guest, write to the scratch disk, e.g. like this:
>>>>
>>>>    # dd if=/dev/zero of=/dev/vdb count=1
>>>>
>>>>    Do double-check the device specified with of= is actually the
>>>>    scratch device!
>>>>
>>>> 7. Issue QMP command 'cont':
>>>>    QMP> { "execute": "cont" }
>>>>
>>>> After step 6, I get a BLOCK_IO_ERROR event followed by a STOP event.  Good.
>>>>
>>>> After step 7, I get BLOCK_IO_ERROR, then RESUME, then STOP.  Not so
>>>> good; I'd expect RESUME, then BLOCK_IO_ERROR, then STOP.
>>> Do you want to rephrase this in the form of a script for qemu-iotests?
>>>
>>> I suppose the 'dd' line can be replaced by a 'qemu-io' monitor command.
>> Uh, can it?  With qemu-io, the write doesn't stop the guest, because it
>> bypasses the device model, and thus blk_error_action().  I'm not aware
>> of ways to make qemu-iotests write via a device model.  I'm afraid we
>> need a full-fledged qtest.  Better ideas?
>
> Yeah, using virtio-blk-test sounds like a good idea.

Who's familiar with this test?  I'm not sure I can afford digging into
it myself right now...

The other devices supporting error actions are in hw/scsi/scsi-disk.c,
hw/ide/ahci.c and hw/ide/core.c.  Tests with matching names are
virtio-scsi-test.c, ahci-test.c, ide-test.c.
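
Whichever of these tests ends up hosting it, the check itself is small.
A rough sketch in plain C, deliberately not using libqtest:
resume_order_ok() and find_event() are hypothetical helpers that take
the event names collected after 'cont', in arrival order, and accept
only RESUME, then BLOCK_IO_ERROR, then STOP (other events may sit in
between).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: index of the first event called name at or
 * after start, or -1 if there is none. */
static int find_event(const char *const *ev, int n, int start,
                      const char *name)
{
    for (int i = start; i < n; i++) {
        if (strcmp(ev[i], name) == 0) {
            return i;
        }
    }
    return -1;
}

/* Hypothetical check: returns 1 if the first RESUME comes before the
 * first BLOCK_IO_ERROR and a STOP follows that error, else 0. */
int resume_order_ok(const char *const *ev, int n)
{
    assert(ev != NULL || n == 0);

    int resume = find_event(ev, n, 0, "RESUME");
    int io_error = find_event(ev, n, 0, "BLOCK_IO_ERROR");

    if (resume < 0 || io_error < 0 || io_error < resume) {
        return 0;
    }
    return find_event(ev, n, io_error + 1, "STOP") >= 0;
}
```

Fed the order from the commit message (BLOCK_IO_ERROR, RESUME, STOP) it
returns 0; fed RESUME, BLOCK_IO_ERROR, STOP it returns 1.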
