* [Qemu-devel] How to emulate block I/O timeout on qemu side?
@ 2018-11-02  8:11 Dongli Zhang
  2018-11-02 17:49 ` John Snow
  2018-11-05 17:49 ` Eric Blake
  0 siblings, 2 replies; 15+ messages in thread
From: Dongli Zhang @ 2018-11-02  8:11 UTC (permalink / raw)
  To: qemu-devel

Hi,

Is there any way to emulate an I/O timeout on the qemu side (as opposed to fault
injection in the VM kernel) without modifying the qemu source code?

For instance, I would like to observe/study/debug the I/O timeout handling of
nvme, scsi, and virtio-blk (not supported) in the VM kernel.

Is there a way to trigger this on purpose on the qemu side?

Thank you very much!

Dongli Zhang


* Re: [Qemu-devel] How to emulate block I/O timeout on qemu side?
  2018-11-02  8:11 [Qemu-devel] How to emulate block I/O timeout on qemu side? Dongli Zhang
@ 2018-11-02 17:49 ` John Snow
  2018-11-02 17:55   ` Marc Olson
  2018-11-05 17:49 ` Eric Blake
  1 sibling, 1 reply; 15+ messages in thread
From: John Snow @ 2018-11-02 17:49 UTC (permalink / raw)
  To: Dongli Zhang, qemu-devel; +Cc: Qemu-block, Marc Olson



On 11/02/2018 04:11 AM, Dongli Zhang wrote:
> Hi,
> 
> Is there any way to emulate I/O timeout on qemu side (not fault injection in VM
> kernel) without modifying qemu source code?
> 
> For instance, I would like to observe/study/debug the I/O timeout handling of
> nvme, scsi, virtio-blk (not supported) of VM kernel.
> 
> Is there a way to trigger this on purpose on qemu side?
> 
> Thank you very much!
> 
> Dongli Zhang
> 

I don't think the blkdebug driver supports arbitrary delays right now.
Maybe we could augment it to do so?

(I thought someone already had, but maybe it wasn't merged?)

Aha, here:

https://lists.gnu.org/archive/html/qemu-devel/2018-08/msg05297.html
V2: https://lists.gnu.org/archive/html/qemu-devel/2018-09/msg00394.html

Let's work from there.

--js


* Re: [Qemu-devel] How to emulate block I/O timeout on qemu side?
  2018-11-02 17:49 ` John Snow
@ 2018-11-02 17:55   ` Marc Olson
  2018-11-02 18:17     ` John Snow
  0 siblings, 1 reply; 15+ messages in thread
From: Marc Olson @ 2018-11-02 17:55 UTC (permalink / raw)
  To: John Snow, Dongli Zhang, qemu-devel; +Cc: Qemu-block

On 11/2/18 10:49 AM, John Snow wrote:
> On 11/02/2018 04:11 AM, Dongli Zhang wrote:
>> Hi,
>>
>> Is there any way to emulate I/O timeout on qemu side (not fault injection in VM
>> kernel) without modifying qemu source code?
>>
>> For instance, I would like to observe/study/debug the I/O timeout handling of
>> nvme, scsi, virtio-blk (not supported) of VM kernel.
>>
>> Is there a way to trigger this on purpose on qemu side?
>>
>> Thank you very much!
>>
>> Dongli Zhang
>>
> I don't think the blkdebug driver supports arbitrary delays right now.
> Maybe we could augment it to do so?
>
> (I thought someone already had, but maybe it wasn't merged?)
>
> Aha, here:
>
> https://lists.gnu.org/archive/html/qemu-devel/2018-08/msg05297.html
> V2: https://lists.gnu.org/archive/html/qemu-devel/2018-09/msg00394.html
>
> Let's work from there.

I've got updates to that patch series that fell on the floor due to 
other competing things. I'll get some screen time this weekend to work 
on them and submit v3.

/marc


* Re: [Qemu-devel] How to emulate block I/O timeout on qemu side?
  2018-11-02 17:55   ` Marc Olson
@ 2018-11-02 18:17     ` John Snow
  2018-11-03 17:24       ` Dongli Zhang
  0 siblings, 1 reply; 15+ messages in thread
From: John Snow @ 2018-11-02 18:17 UTC (permalink / raw)
  To: Marc Olson, Dongli Zhang, qemu-devel; +Cc: Qemu-block



On 11/02/2018 01:55 PM, Marc Olson wrote:
> On 11/2/18 10:49 AM, John Snow wrote:
>> On 11/02/2018 04:11 AM, Dongli Zhang wrote:
>>> Hi,
>>>
>>> Is there any way to emulate I/O timeout on qemu side (not fault
>>> injection in VM
>>> kernel) without modifying qemu source code?
>>>
>>> For instance, I would like to observe/study/debug the I/O timeout
>>> handling of
>>> nvme, scsi, virtio-blk (not supported) of VM kernel.
>>>
>>> Is there a way to trigger this on purpose on qemu side?
>>>
>>> Thank you very much!
>>>
>>> Dongli Zhang
>>>
>> I don't think the blkdebug driver supports arbitrary delays right now.
>> Maybe we could augment it to do so?
>>
>> (I thought someone already had, but maybe it wasn't merged?)
>>
>> Aha, here:
>>
>> https://lists.gnu.org/archive/html/qemu-devel/2018-08/msg05297.html
>> V2: https://lists.gnu.org/archive/html/qemu-devel/2018-09/msg00394.html
>>
>> Let's work from there.
> 
> I've got updates to that patch series that fell on the floor due to
> other competing things. I'll get some screen time this weekend to work
> on them and submit v3.
> 
> /marc
> 

Great! Please CC the usual maintainers, but also include me.

In the meantime, Dongli Zhang, why don't you try the v2 patch and see if
that helps you out for your use case? Report back if it works for you or
not.

--js


* Re: [Qemu-devel] How to emulate block I/O timeout on qemu side?
  2018-11-02 18:17     ` John Snow
@ 2018-11-03 17:24       ` Dongli Zhang
  2018-11-05 17:13         ` John Snow
  2018-11-12  7:13         ` Marc Olson
  0 siblings, 2 replies; 15+ messages in thread
From: Dongli Zhang @ 2018-11-03 17:24 UTC (permalink / raw)
  To: John Snow, Marc Olson; +Cc: qemu-devel, Qemu-block

Hi all,

I tried with the patch at:

https://lists.gnu.org/archive/html/qemu-devel/2018-09/msg00394.html

The patch is applied to qemu-3.0.0.


The configuration below is used to test the feature against the guest's nvme device.

# qemu-system-x86_64 \
-smp 4 -m 2000M -enable-kvm -vnc :0 -monitor stdio \
-net nic -net user,hostfwd=tcp::5022-:22 \
-drive file=virtio-disk.img,format=raw,if=none,id=disk0 \
-device virtio-blk-pci,drive=disk0,id=disk0-dev,num-queues=2,iothread=io1 \
-object iothread,id=io1 \
-device nvme,drive=nvme1,serial=deadbeaf1 \
-drive file=blkdebug:blkdebug.config:nvme.img,if=none,id=nvme1

# cat blkdebug.config
[delay]
event = "write_aio"
latency = "9999999999"
sector = "40960"


The write latency for sector=40960 is set to a very large value. When I/O
stalls in the guest because sector=40960 is accessed, I do see the messages
below in the guest log:

[   80.807755] nvme nvme0: I/O 11 QID 2 timeout, aborting
[   80.808095] nvme nvme0: Abort status: 0x4001


However, nothing further happens after that. nvme I/O hangs in the guest, and I
am not able to kill the qemu process with Ctrl+C. Neither vnc nor qemu user
networking works. I need to kill qemu with "kill -9".


The result is the same for virtio-scsi: qemu gets stuck as well.


About blkdebug: I can only trigger the error through the config file. Is there a
way to inject an error or latency via the qemu monitor? For instance, I would
like to inject an error not for a specific sector or state, but for the entire
disk, when I enter some command in the qemu monitor.

Dongli Zhang



* Re: [Qemu-devel] How to emulate block I/O timeout on qemu side?
  2018-11-03 17:24       ` Dongli Zhang
@ 2018-11-05 17:13         ` John Snow
  2018-11-12  7:13         ` Marc Olson
  1 sibling, 0 replies; 15+ messages in thread
From: John Snow @ 2018-11-05 17:13 UTC (permalink / raw)
  To: Dongli Zhang, Marc Olson; +Cc: qemu-devel, Qemu-block



On 11/03/2018 01:24 PM, Dongli Zhang wrote:
> Hi all,
> 

Hi, please reply below the quoted text when writing to qemu-devel in the
future; my reply is below.

> I tried with the patch at:
> 
> https://lists.gnu.org/archive/html/qemu-devel/2018-09/msg00394.html
> 
> The patch is applied to qemu-3.0.0.
> 
> 
> Below configuration is used to test the feature for guest VM nvme.
> 
> # qemu-system-x86_64 \
> -smp 4 -m 2000M -enable-kvm -vnc :0 -monitor stdio \
> -net nic -net user,hostfwd=tcp::5022-:22 \
> -drive file=virtio-disk.img,format=raw,if=none,id=disk0 \
> -device virtio-blk-pci,drive=disk0,id=disk0-dev,num-queues=2,iothread=io1 \
> -object iothread,id=io1 \
> -device nvme,drive=nvme1,serial=deadbeaf1 \
> -drive file=blkdebug:blkdebug.config:nvme.img,if=none,id=nvme1
> 
> # cat blkdebug.config
> [delay]
> event = "write_aio"
> latency = "9999999999"
> sector = "40960"
> 
> 
> The 'write' latency of sector=40960 is set to a very large value. When the I/O
> is stalled in guest due to that sector=40960 is accessed, I do see below
> messages in guest log:
> 
> [   80.807755] nvme nvme0: I/O 11 QID 2 timeout, aborting
> [   80.808095] nvme nvme0: Abort status: 0x4001
> 
> 
> However, then nothing happens further. nvme I/O hangs in guest. I am not able to
> kill the qemu process with Ctrl+C. Both vnc and qemu user net do not work. I
> need to kill qemu with "kill -9"
>
>
> The same result for virtio-scsi and qemu is stuck as well.
> 

OK, sounds like a bug in the delay implementation here, then; or
something I've not considered with the locking/drain specifics. Thanks
for the report.

> 
> About blkdebug, I can only trigger the error by the config file. Is there a way
> to inject error or latency via qemu monitor? For instance, I would like to inject
> error not for a specific sector or state, but for the entire disk when I input
> some command via qemu monitor.
> 

I don't recall.

There are some tricks you can play with set-state and rules that only
apply when in a certain state. I don't remember if there are monitor or
QMP commands to set the state explicitly.

I'm looking at docs/devel/blkdebug.txt and don't see anything immediately.

There's maybe a way you can use blockdev-add to create the blkdebug node
and insert it live into the graph when you want it, and live-remove it
when you don't, but I'm not sure of the syntax right away.

(maybe that's not possible?)

--js


* Re: [Qemu-devel] How to emulate block I/O timeout on qemu side?
  2018-11-02  8:11 [Qemu-devel] How to emulate block I/O timeout on qemu side? Dongli Zhang
  2018-11-02 17:49 ` John Snow
@ 2018-11-05 17:49 ` Eric Blake
  2018-11-06  6:17   ` Dongli Zhang
  1 sibling, 1 reply; 15+ messages in thread
From: Eric Blake @ 2018-11-05 17:49 UTC (permalink / raw)
  To: Dongli Zhang, qemu-devel, libguestfs

On 11/2/18 3:11 AM, Dongli Zhang wrote:
> Hi,
> 
> Is there any way to emulate I/O timeout on qemu side (not fault injection in VM
> kernel) without modifying qemu source code?

You may be interested in Rich's work on nbdkit.  If you don't mind the 
overhead of the host connecting through NBD, then you can use nbdkit's 
delay and fault-injection filters for inserting delays or even 
run-time-controllable failures to investigate how the guest reacts to 
those situations.

> 
> For instance, I would like to observe/study/debug the I/O timeout handling of
> nvme, scsi, virtio-blk (not supported) of VM kernel.
> 
> Is there a way to trigger this on purpose on qemu side?
> 
> Thank you very much!
> 
> Dongli Zhang
> 
> 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org


* Re: [Qemu-devel] How to emulate block I/O timeout on qemu side?
  2018-11-05 17:49 ` Eric Blake
@ 2018-11-06  6:17   ` Dongli Zhang
  2018-11-06  9:14     ` [Qemu-devel] [Libguestfs] " Richard W.M. Jones
  0 siblings, 1 reply; 15+ messages in thread
From: Dongli Zhang @ 2018-11-06  6:17 UTC (permalink / raw)
  To: Eric Blake, jsnow; +Cc: qemu-devel, libguestfs



On 11/06/2018 01:49 AM, Eric Blake wrote:
> On 11/2/18 3:11 AM, Dongli Zhang wrote:
>> Hi,
>>
>> Is there any way to emulate I/O timeout on qemu side (not fault injection in VM
>> kernel) without modifying qemu source code?
> 
> You may be interested in Rich's work on nbdkit.  If you don't mind the overhead
> of the host connecting through NBD, then you can use nbdkit's delay and
> fault-injection filters for inserting delays or even run-time-controllable
> failures to investigate how the guest reacts to those situations

Thank you all very much for the suggestions. I will take a look at nbdkit.

So far I have been trying to reproduce the issue with NFS (by deliberately
shutting down the link to the NFS server where the image is placed), but that
did not work well.

> 
>>
>> For instance, I would like to observe/study/debug the I/O timeout handling of
>> nvme, scsi, virtio-blk (not supported) of VM kernel.
>>
>> Is there a way to trigger this on purpose on qemu side?
>>
>> Thank you very much!
>>
>> Dongli Zhang
>>
>>
> 

Dongli Zhang


* Re: [Qemu-devel] [Libguestfs] How to emulate block I/O timeout on qemu side?
  2018-11-06  6:17   ` Dongli Zhang
@ 2018-11-06  9:14     ` Richard W.M. Jones
  2018-11-06  9:43       ` Richard W.M. Jones
  0 siblings, 1 reply; 15+ messages in thread
From: Richard W.M. Jones @ 2018-11-06  9:14 UTC (permalink / raw)
  To: Dongli Zhang; +Cc: Eric Blake, jsnow, qemu-devel, libguestfs

On Tue, Nov 06, 2018 at 02:17:46PM +0800, Dongli Zhang wrote:
> On 11/06/2018 01:49 AM, Eric Blake wrote:
> > On 11/2/18 3:11 AM, Dongli Zhang wrote:
> >> Hi,
> >>
> >> Is there any way to emulate I/O timeout on qemu side (not fault
> >> injection in VM kernel) without modifying qemu source code?
> >
> > You may be interested in Rich's work on nbdkit.  If you don't mind
> > the overhead of the host connecting through NBD, then you can use
> > nbdkit's delay and fault-injection filters for inserting delays or
> > even run-time-controllable failures to investigate how the guest
> > reacts to those situations
> >
> Thank you all very much for the suggestions. I will take a look on nbdkit.

These links should help:

  https://rwmj.wordpress.com/2018/09/04/nbdkit-for-loopback-pt-2-injecting-errors/
  https://rwmj.wordpress.com/2018/09/06/nbdkit-for-loopback-pt-7-a-slow-disk/

This link shows how to combine delay and error filters together:

  https://rwmj.wordpress.com/2018/11/04/nbd-graphical-viewer/

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
Fedora Windows cross-compiler. Compile Windows programs, test, and
build Windows installers. Over 100 libraries supported.
http://fedoraproject.org/wiki/MinGW


* Re: [Qemu-devel] [Libguestfs] How to emulate block I/O timeout on qemu side?
  2018-11-06  9:14     ` [Qemu-devel] [Libguestfs] " Richard W.M. Jones
@ 2018-11-06  9:43       ` Richard W.M. Jones
  2018-11-06 15:52         ` Richard W.M. Jones
  0 siblings, 1 reply; 15+ messages in thread
From: Richard W.M. Jones @ 2018-11-06  9:43 UTC (permalink / raw)
  To: Dongli Zhang; +Cc: jsnow, qemu-devel, libguestfs

On Tue, Nov 06, 2018 at 09:14:57AM +0000, Richard W.M. Jones wrote:
> This link shows how to combine delay and error filters together:
> 
>   https://rwmj.wordpress.com/2018/11/04/nbd-graphical-viewer/

Oops, that's in a forthcoming blog post not this one.  Not enough
caffeine this morning.

Combining the filters is easy however:

  nbdkit --filter=error --filter=delay \
         memory size=$size \
         rdelay=$delay wdelay=$delay \
         error-rate=100% error-file=/tmp/error

Then touching /tmp/error will inject errors, and removing /tmp/error
will stop injecting errors.

The documentation says you should be able to write error-rate=1
instead of error-rate=100%, but in fact that was broken until
recently, and fixed in:

  https://github.com/libguestfs/nbdkit/commit/ee2d3b4fea6d4b7618262f85f882374c23674b4a

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-p2v converts physical machines to virtual machines.  Boot with a
live CD or over the network (PXE) and turn machines into KVM guests.
http://libguestfs.org/virt-v2v


* Re: [Qemu-devel] [Libguestfs] How to emulate block I/O timeout on qemu side?
  2018-11-06  9:43       ` Richard W.M. Jones
@ 2018-11-06 15:52         ` Richard W.M. Jones
  0 siblings, 0 replies; 15+ messages in thread
From: Richard W.M. Jones @ 2018-11-06 15:52 UTC (permalink / raw)
  To: Dongli Zhang; +Cc: jsnow, qemu-devel, libguestfs

On Tue, Nov 06, 2018 at 09:43:06AM +0000, Richard W.M. Jones wrote:
> On Tue, Nov 06, 2018 at 09:14:57AM +0000, Richard W.M. Jones wrote:
> > This link shows how to combine delay and error filters together:
> > 
> >   https://rwmj.wordpress.com/2018/11/04/nbd-graphical-viewer/
> 
> Oops, that's in a forthcoming blog post not this one.  Not enough
> caffeine this morning.

Up now:

https://rwmj.wordpress.com/2018/11/06/nbd-graphical-viewer-raid-5-edition/

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
libguestfs lets you edit virtual machines.  Supports shell scripting,
bindings from many languages.  http://libguestfs.org


* Re: [Qemu-devel] How to emulate block I/O timeout on qemu side?
  2018-11-03 17:24       ` Dongli Zhang
  2018-11-05 17:13         ` John Snow
@ 2018-11-12  7:13         ` Marc Olson
  2018-11-12  7:36           ` Dongli Zhang
  1 sibling, 1 reply; 15+ messages in thread
From: Marc Olson @ 2018-11-12  7:13 UTC (permalink / raw)
  To: Dongli Zhang, John Snow; +Cc: qemu-devel, Qemu-block

On 11/3/18 10:24 AM, Dongli Zhang wrote:
> Hi all,
>
> I tried with the patch at:
>
> https://lists.gnu.org/archive/html/qemu-devel/2018-09/msg00394.html
>
> The patch is applied to qemu-3.0.0.
>
>
> Below configuration is used to test the feature for guest VM nvme.
>
> # qemu-system-x86_64 \
> -smp 4 -m 2000M -enable-kvm -vnc :0 -monitor stdio \
> -net nic -net user,hostfwd=tcp::5022-:22 \
> -drive file=virtio-disk.img,format=raw,if=none,id=disk0 \
> -device virtio-blk-pci,drive=disk0,id=disk0-dev,num-queues=2,iothread=io1 \
> -object iothread,id=io1 \
> -device nvme,drive=nvme1,serial=deadbeaf1 \
> -drive file=blkdebug:blkdebug.config:nvme.img,if=none,id=nvme1
>
> # cat blkdebug.config
> [delay]
> event = "write_aio"
> latency = "9999999999"
> sector = "40960"
>
>
> The 'write' latency of sector=40960 is set to a very large value. When the I/O
> is stalled in guest due to that sector=40960 is accessed, I do see below
> messages in guest log:
>
> [   80.807755] nvme nvme0: I/O 11 QID 2 timeout, aborting
> [   80.808095] nvme nvme0: Abort status: 0x4001
>
>
> However, then nothing happens further. nvme I/O hangs in guest. I am not able to
> kill the qemu process with Ctrl+C. Both vnc and qemu user net do not work. I
> need to kill qemu with "kill -9"
>
>
> The same result for virtio-scsi and qemu is stuck as well.

While I didn't try virtio-scsi, I wasn't able to reproduce this behavior 
using nvme on Ubuntu 18.04 (4.15). What image and kernel version are you 
trying against?

/marc


* Re: [Qemu-devel] How to emulate block I/O timeout on qemu side?
  2018-11-12  7:13         ` Marc Olson
@ 2018-11-12  7:36           ` Dongli Zhang
  2018-11-12 22:52             ` Marc Olson
  0 siblings, 1 reply; 15+ messages in thread
From: Dongli Zhang @ 2018-11-12  7:36 UTC (permalink / raw)
  To: Marc Olson; +Cc: John Snow, qemu-devel, Qemu-block



On 11/12/2018 03:13 PM, Marc Olson via Qemu-devel wrote:
> On 11/3/18 10:24 AM, Dongli Zhang wrote:
>> Hi all,
>>
>> I tried with the patch at:
>>
>> https://lists.gnu.org/archive/html/qemu-devel/2018-09/msg00394.html
>>
>> The patch is applied to qemu-3.0.0.
>>
>>
>> Below configuration is used to test the feature for guest VM nvme.
>>
>> # qemu-system-x86_64 \
>> -smp 4 -m 2000M -enable-kvm -vnc :0 -monitor stdio \
>> -net nic -net user,hostfwd=tcp::5022-:22 \
>> -drive file=virtio-disk.img,format=raw,if=none,id=disk0 \
>> -device virtio-blk-pci,drive=disk0,id=disk0-dev,num-queues=2,iothread=io1 \
>> -object iothread,id=io1 \
>> -device nvme,drive=nvme1,serial=deadbeaf1 \
>> -drive file=blkdebug:blkdebug.config:nvme.img,if=none,id=nvme1
>>
>> # cat blkdebug.config
>> [delay]
>> event = "write_aio"
>> latency = "9999999999"
>> sector = "40960"
>>
>>
>> The 'write' latency of sector=40960 is set to a very large value. When the I/O
>> is stalled in guest due to that sector=40960 is accessed, I do see below
>> messages in guest log:
>>
>> [   80.807755] nvme nvme0: I/O 11 QID 2 timeout, aborting
>> [   80.808095] nvme nvme0: Abort status: 0x4001
>>
>>
>> However, then nothing happens further. nvme I/O hangs in guest. I am not able to
>> kill the qemu process with Ctrl+C. Both vnc and qemu user net do not work. I
>> need to kill qemu with "kill -9"
>>
>>
>> The same result for virtio-scsi and qemu is stuck as well.
> While I didn't try virtio-scsi, I wasn't able to reproduce this behavior using
> nvme on Ubuntu 18.04 (4.15). What image and kernel version are you trying against?

Would you like to reproduce the "aborting" message or the qemu hang?

guest image: ubuntu 16.04
guest kernel: mainline linux kernel (and default kernel in ubuntu 16.04)
qemu: qemu-3.0.0 (with the blkdebug delay patch)

Are you able to see the nvme abort message (abort is indeed not supported by
qemu) in the guest kernel log?

Once I see that message, I am no longer able to kill the qemu-system-x86_64
process with Ctrl+C.

Dongli Zhang


* Re: [Qemu-devel] How to emulate block I/O timeout on qemu side?
  2018-11-12  7:36           ` Dongli Zhang
@ 2018-11-12 22:52             ` Marc Olson
  2018-11-13  0:31               ` Dongli Zhang
  0 siblings, 1 reply; 15+ messages in thread
From: Marc Olson @ 2018-11-12 22:52 UTC (permalink / raw)
  To: Dongli Zhang; +Cc: John Snow, qemu-devel, Qemu-block

On 11/11/18 11:36 PM, Dongli Zhang wrote:
> On 11/12/2018 03:13 PM, Marc Olson via Qemu-devel wrote:
>> On 11/3/18 10:24 AM, Dongli Zhang wrote:
>>> The 'write' latency of sector=40960 is set to a very large value. When the I/O
>>> is stalled in guest due to that sector=40960 is accessed, I do see below
>>> messages in guest log:
>>>
>>> [   80.807755] nvme nvme0: I/O 11 QID 2 timeout, aborting
>>> [   80.808095] nvme nvme0: Abort status: 0x4001
>>>
>>>
>>> However, then nothing happens further. nvme I/O hangs in guest. I am not able to
>>> kill the qemu process with Ctrl+C. Both vnc and qemu user net do not work. I
>>> need to kill qemu with "kill -9"
>>>
>>>
>>> The same result for virtio-scsi and qemu is stuck as well.
>> While I didn't try virtio-scsi, I wasn't able to reproduce this behavior using
>> nvme on Ubuntu 18.04 (4.15). What image and kernel version are you trying against?
> Would you like to reproduce the "aborting" message or the qemu hang?

I could not reproduce IO hanging in the guest, but I can reproduce qemu 
hanging.

> guest image: ubuntu 16.04
> guest kernel: mainline linux kernel (and default kernel in ubuntu 16.04)
> qemu: qemu-3.0.0 (with the blkdebug delay patch)
>
> Would you be able to see the nvme abort (which is indeed not supported by qemu)
> message in guest kernel?

Yes.

> Once I see that message, I would not be able to kill the qemu-system-x86_64
> command line with Ctrl+C.

I missed this part. I wasn't expecting to handle very long timeouts, but 
what appears to be happening is that the sleep doesn't get interrupted 
on shutdown. I suspect something like this, on top of the series I sent 
last night, should help:

diff --git a/block/blkdebug.c b/block/blkdebug.c
index 6b1f2d6..0bfb91b 100644
--- a/block/blkdebug.c
+++ b/block/blkdebug.c
@@ -557,8 +557,11 @@ static int rule_check(BlockDriverState *bs, uint64_t offset, uint64_t bytes)
              remove_active_rule(s, delay_rule);
          }

-        if (latency != 0) {
-            qemu_co_sleep_ns(QEMU_CLOCK_REALTIME, latency);
+        while (latency > 0 && !aio_external_disabled(bdrv_get_aio_context(bs))) {
+            int64_t cur_latency = MIN(latency, 1000000000ULL);
+
+            qemu_co_sleep_ns(QEMU_CLOCK_REALTIME, cur_latency);
+            latency -= cur_latency;
          }
      }
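
The approach in the patch above (split one long sleep into short slices and
re-check a stop condition between slices) can be sketched outside of QEMU in
plain C. This is only an illustration under stated assumptions, not the
blkdebug implementation: chunked_sleep_ns(), stop_fn, struct stop_after and
stop_after_polls() are hypothetical names, nanosleep() stands in for
qemu_co_sleep_ns(), and the callback stands in for the
aio_external_disabled() check.

```c
#define _POSIX_C_SOURCE 200809L  /* for nanosleep() */

#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

#define NS_PER_SEC 1000000000LL

/* Hypothetical callback type: returns true once the sleeper should give up. */
typedef bool (*stop_fn)(void *opaque);

/*
 * Sleep for latency_ns in slices of at most chunk_ns, polling should_stop()
 * before each slice. Returns the number of nanoseconds left unslept
 * (0 if the full delay elapsed).
 */
static int64_t chunked_sleep_ns(int64_t latency_ns, int64_t chunk_ns,
                                stop_fn should_stop, void *opaque)
{
    assert(chunk_ns > 0);

    while (latency_ns > 0 && !should_stop(opaque)) {
        int64_t cur = latency_ns < chunk_ns ? latency_ns : chunk_ns;
        struct timespec ts = {
            .tv_sec = (time_t)(cur / NS_PER_SEC),
            .tv_nsec = (long)(cur % NS_PER_SEC),
        };

        /* An early wake-up (EINTR) is ignored in this sketch. */
        nanosleep(&ts, NULL);
        latency_ns -= cur;
    }
    return latency_ns;
}

/* Hypothetical stop condition for demonstration: trips after N polls. */
struct stop_after {
    int polls_left;
};

static bool stop_after_polls(void *opaque)
{
    struct stop_after *s = opaque;
    return s->polls_left-- <= 0;
}
```

For example, with struct stop_after s = { .polls_left = 2 }, calling
chunked_sleep_ns(5000000, 1000000, stop_after_polls, &s) sleeps two 1 ms
slices and returns 3000000, the latency that was left when the stop condition
tripped. The slice length bounds how long a shutdown request can go unnoticed,
at the cost of one extra wake-up per slice.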


/marc


* Re: [Qemu-devel] How to emulate block I/O timeout on qemu side?
  2018-11-12 22:52             ` Marc Olson
@ 2018-11-13  0:31               ` Dongli Zhang
  0 siblings, 0 replies; 15+ messages in thread
From: Dongli Zhang @ 2018-11-13  0:31 UTC (permalink / raw)
  To: Marc Olson; +Cc: John Snow, qemu-devel, Qemu-block



On 11/13/2018 06:52 AM, Marc Olson via Qemu-devel wrote:
> On 11/11/18 11:36 PM, Dongli Zhang wrote:
>> On 11/12/2018 03:13 PM, Marc Olson via Qemu-devel wrote:
>>> On 11/3/18 10:24 AM, Dongli Zhang wrote:
>>>> The 'write' latency of sector=40960 is set to a very large value. When the I/O
>>>> is stalled in guest due to that sector=40960 is accessed, I do see below
>>>> messages in guest log:
>>>>
>>>> [   80.807755] nvme nvme0: I/O 11 QID 2 timeout, aborting
>>>> [   80.808095] nvme nvme0: Abort status: 0x4001
>>>>
>>>>
>>>> However, then nothing happens further. nvme I/O hangs in guest. I am not
>>>> able to
>>>> kill the qemu process with Ctrl+C. Both vnc and qemu user net do not work. I
>>>> need to kill qemu with "kill -9"
>>>>
>>>>
>>>> The same result for virtio-scsi and qemu is stuck as well.
>>> While I didn't try virtio-scsi, I wasn't able to reproduce this behavior using
>>> nvme on Ubuntu 18.04 (4.15). What image and kernel version are you trying
>>> against?
>> Would you like to reproduce the "aborting" message or the qemu hang?
> I could not reproduce IO hanging in the guest, but I can reproduce qemu hanging.
>> guest image: ubuntu 16.04
>> guest kernel: mainline linux kernel (and default kernel in ubuntu 16.04)
>> qemu: qemu-3.0.0 (with the blkdebug delay patch)
>>
>> Would you be able to see the nvme abort (which is indeed not supported by qemu)
>> message in guest kernel?
> Yes.
>> Once I see that message, I would not be able to kill the qemu-system-x86_64
>> command line with Ctrl+C.
> 
> I missed this part. I wasn't expecting to handle very long timeouts, but what
> appears to be happening is that the sleep doesn't get interrupted on shutdown. I
> suspect something like this, on top of the series I sent last night, should help:
> 
> diff --git a/block/blkdebug.c b/block/blkdebug.c
> index 6b1f2d6..0bfb91b 100644
> --- a/block/blkdebug.c
> +++ b/block/blkdebug.c
> @@ -557,8 +557,11 @@ static int rule_check(BlockDriverState *bs, uint64_t offset, uint64_t bytes)
>              remove_active_rule(s, delay_rule);
>          }
> 
> -        if (latency != 0) {
> -            qemu_co_sleep_ns(QEMU_CLOCK_REALTIME, latency);
> +        while (latency > 0 && !aio_external_disabled(bdrv_get_aio_context(bs))) {
> +            int64_t cur_latency = MIN(latency, 1000000000ULL);
> +
> +            qemu_co_sleep_ns(QEMU_CLOCK_REALTIME, cur_latency);
> +            latency -= cur_latency;
>          }
>      }
> 
> 
> /marc
> 
> 

With the above patch, which periodically wakes up and goes back to sleep, I am
able to interrupt qemu.

Dongli Zhang
