[Qemu-devel] [PATCH RFC 0/3] virtio-scsi unplug of active device

All of lore.kernel.org
 help / color / mirror / Atom feed

* [Qemu-devel] [PATCH RFC 0/3] virtio-scsi unplug of active device
@ 2014-01-14 14:24 Eric Farman
  2014-01-14 14:24 ` [Qemu-devel] [PATCH 1/3] scsi/virtio-scsi: Properly flush I/Os inflight during an unplug Eric Farman
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Eric Farman @ 2014-01-14 14:24 UTC (permalink / raw)
  To: qemu-devel; +Cc: Paolo Bonzini

In working with hot-plug/unplug of virtio-scsi devices on s390,
we have occasionally noticed some erratic behavior when an unplug
occurs while I/O is in flight.  Ideally a device is not being used
when it is removed from a guest configuration, but not guarantee
can be made that this will be the case.  And while this scenario
is meant for I/O that occurs during normal use of a device, it
includes the pathological case of an unplug that occurs while the
asynchronous Inquiry loop (initiated by a hotplug) is still ongoing.

Symptoms vary depending on when the unplug is recognized.  Sometimes
a hang occurs, because a reference is not properly released and thus
never reaches zero.  Sometimes a reference is released too early,
allowing the count to go negative and trip an assertion (or more
unpredictable results, if storage is released but still used).

Of course there are many times when things work perfectly, though
that seems to be when the I/O was able to complete in time.  These
patches simply straighten out the completion of I/Os during an
unplug, such that it results in predictable behavior whenever the
device is not idle.

Eric Farman (3):
  scsi/virtio-scsi: Properly flush I/Os inflight during an unplug
  scsi/virtio-scsi: Cleanup of I/Os that never started
  scsi/virtio-scsi: Prevent assertion on missed events

 hw/scsi/virtio-scsi.c |    8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

-- 
1.7.9.5

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Qemu-devel] [PATCH 1/3] scsi/virtio-scsi: Properly flush I/Os inflight during an unplug
  2014-01-14 14:24 [Qemu-devel] [PATCH RFC 0/3] virtio-scsi unplug of active device Eric Farman
@ 2014-01-14 14:24 ` Eric Farman
  2014-01-14 14:45   ` Paolo Bonzini
  2014-01-14 14:24 ` [Qemu-devel] [PATCH 2/3] scsi/virtio-scsi: Cleanup of I/Os that never started Eric Farman
  2014-01-14 14:24 ` [Qemu-devel] [PATCH 3/3] scsi/virtio-scsi: Prevent assertion on missed events Eric Farman
  2 siblings, 1 reply; 9+ messages in thread
From: Eric Farman @ 2014-01-14 14:24 UTC (permalink / raw)
  To: qemu-devel; +Cc: Paolo Bonzini

When an unplug is triggered via QMP, the routine scsi_req_cancel
is called to cancel any outstanding requests.  However, the I/Os
themselves were instantiated via an asynchronous call that will
drive scsi_*_complete routines after the unplug call stack finishes.
As all references to the request have been released by the cancel
thread, the scsi_*_complete routines experience a range of failures
when it attempts to manipulate the released storage.

Before we exit back to scsi_req_cancel, we need to inform the
asynchronous tasks that they were canceled.  This will handle all
the cleanup work performed by scsi_*_complete before we release
all the references to the affected request(s).

Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com>
---
 hw/scsi/virtio-scsi.c |    1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c
index 6dcdd1b..a8fde04 100644
--- a/hw/scsi/virtio-scsi.c
+++ b/hw/scsi/virtio-scsi.c
@@ -339,6 +339,7 @@ static void virtio_scsi_request_cancelled(SCSIRequest *r)
         req->resp.cmd->response = VIRTIO_SCSI_S_ABORTED;
     }
     virtio_scsi_complete_req(req);
+    bdrv_drain_all();
 }

 static void virtio_scsi_fail_cmd_req(VirtIOSCSIReq *req)
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [Qemu-devel] [PATCH 2/3] scsi/virtio-scsi: Cleanup of I/Os that never started
  2014-01-14 14:24 [Qemu-devel] [PATCH RFC 0/3] virtio-scsi unplug of active device Eric Farman
  2014-01-14 14:24 ` [Qemu-devel] [PATCH 1/3] scsi/virtio-scsi: Properly flush I/Os inflight during an unplug Eric Farman
@ 2014-01-14 14:24 ` Eric Farman
  2014-01-14 14:49   ` Paolo Bonzini
  2014-01-14 14:24 ` [Qemu-devel] [PATCH 3/3] scsi/virtio-scsi: Prevent assertion on missed events Eric Farman
  2 siblings, 1 reply; 9+ messages in thread
From: Eric Farman @ 2014-01-14 14:24 UTC (permalink / raw)
  To: qemu-devel; +Cc: Paolo Bonzini

There is still a small window that occurs when a cancel I/O affects
an asynchronous I/O operation that hasn't started.  In other words,
when the residual data length equals the expected data length.

Today, the routine virtio_scsi_command_complete fails because the
VirtIOSCSIReq pointer (from the hba_private field in SCSIRequest)
was cleared earlier when virtio_scsi_complete_req was called by
the virtio_scsi_request_cancelled routine.  As a result, the
virtio_scsi_command_complete routine needs to simply return when
it is processing a SCSIRequest block that was marked canceled.

Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com>
---
 hw/scsi/virtio-scsi.c |    4 ++++
 1 file changed, 4 insertions(+)

diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c
index a8fde04..49a9576 100644
--- a/hw/scsi/virtio-scsi.c
+++ b/hw/scsi/virtio-scsi.c
@@ -306,6 +306,10 @@ static void virtio_scsi_command_complete(SCSIRequest *r, uint32_t status,
     VirtIOSCSIReq *req = r->hba_private;
     uint32_t sense_len;

+    if (r->io_canceled) {
+        return;
+    }
+
     req->resp.cmd->response = VIRTIO_SCSI_S_OK;
     req->resp.cmd->status = status;
     if (req->resp.cmd->status == GOOD) {
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [Qemu-devel] [PATCH 3/3] scsi/virtio-scsi: Prevent assertion on missed events
  2014-01-14 14:24 [Qemu-devel] [PATCH RFC 0/3] virtio-scsi unplug of active device Eric Farman
  2014-01-14 14:24 ` [Qemu-devel] [PATCH 1/3] scsi/virtio-scsi: Properly flush I/Os inflight during an unplug Eric Farman
  2014-01-14 14:24 ` [Qemu-devel] [PATCH 2/3] scsi/virtio-scsi: Cleanup of I/Os that never started Eric Farman
@ 2014-01-14 14:24 ` Eric Farman
  2014-01-14 14:31   ` Eric Farman
  2014-01-14 14:47   ` Paolo Bonzini
  2 siblings, 2 replies; 9+ messages in thread
From: Eric Farman @ 2014-01-14 14:24 UTC (permalink / raw)
  To: qemu-devel; +Cc: Paolo Bonzini

In some cases, an unplug can cause events to be dropped, which
leads to an assertion failure when preparing to notify the guest
kernel.  This merely accommodates both variations of the "no event"
value that could occur in this codepath.

Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com>
---
 hw/scsi/virtio-scsi.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c
index 49a9576..f8e3632 100644
--- a/hw/scsi/virtio-scsi.c
+++ b/hw/scsi/virtio-scsi.c
@@ -521,7 +521,8 @@ static void virtio_scsi_push_event(VirtIOSCSI *s, SCSIDevice *dev,
     evt->event = event;
     evt->reason = reason;
     if (!dev) {
-        assert(event == VIRTIO_SCSI_T_NO_EVENT);
+        assert(event == VIRTIO_SCSI_T_NO_EVENT ||
+               event == VIRTIO_SCSI_T_EVENTS_MISSED);
     } else {
         evt->lun[0] = 1;
         evt->lun[1] = dev->id;
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [Qemu-devel] [PATCH 3/3] scsi/virtio-scsi: Prevent assertion on missed events
  2014-01-14 14:24 ` [Qemu-devel] [PATCH 3/3] scsi/virtio-scsi: Prevent assertion on missed events Eric Farman
@ 2014-01-14 14:31   ` Eric Farman
  2014-01-14 14:47   ` Paolo Bonzini
  1 sibling, 0 replies; 9+ messages in thread
From: Eric Farman @ 2014-01-14 14:31 UTC (permalink / raw)
  To: qemu-devel; +Cc: Paolo Bonzini

On 01/14/2014 09:24 AM, Eric Farman wrote:
> In some cases, an unplug can cause events to be dropped, which
> leads to an assertion failure when preparing to notify the guest
> kernel.  This merely accommodates both variations of the "no event"
> value that could occur in this codepath.
>
> Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com>
> ---
>   hw/scsi/virtio-scsi.c |    3 ++-
>   1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c
> index 49a9576..f8e3632 100644
> --- a/hw/scsi/virtio-scsi.c
> +++ b/hw/scsi/virtio-scsi.c
> @@ -521,7 +521,8 @@ static void virtio_scsi_push_event(VirtIOSCSI *s, SCSIDevice *dev,
>       evt->event = event;
>       evt->reason = reason;
>       if (!dev) {
> -        assert(event == VIRTIO_SCSI_T_NO_EVENT);
> +        assert(event == VIRTIO_SCSI_T_NO_EVENT ||
> +               event == VIRTIO_SCSI_T_EVENTS_MISSED);
>       } else {
>           evt->lun[0] = 1;
>           evt->lun[1] = dev->id;
Didn't want to stuff this in the commit itself, but I am especially 
curious about this patch.  While I encountered the assertion failure 
with some regularity during my testing, with the patch I have 
occasionally encountered a guest kernel oops.  I believe this patch ends 
up allowing us to get just a little bit farther, but that there is still 
code needed.  Not sure which failure is better, but also haven't had 
much success reproducing/debugging the kernel error.

I'm appending the kernel backtrace below, for reference:

[   21.341765] Unable to handle kernel pointer dereference at virtual 
kernel address 63772d302e302000
[   21.346892] Oops: 0038 [#1] SMP
[   21.347112] Modules linked in: virtio_blk virtio_scsi dm_multipath
[   21.347524] CPU: 0 PID: 4 Comm: kworker/0:0 Not tainted 3.10.0 #1
[   21.348685] Workqueue: events virtscsi_handle_event [virtio_scsi]
[   21.348939] task: 000000001ff74848 ti: 000000001ffa0000 task.ti: 
000000001ffa0000
Krnl GPRS: 000000001cd47b78 63772d302e302e31 0000000001697000 
0000000000000001
[   21.349910]            0000000000000001 0000000000000000 
0000000000000000 0000000001697000
[   21.350185]            0000000000000000 0000000000000000 
00000000018e8000 00000000018e8738
[   21.350425]            000000001ff26e00 0000000001697000 
000000001ffa3d08 000000001ffa3ce0
[   21.350677] Krnl Code: 00000000004637c8: e3e0f0980024    stg 
%r14,152(%r15)
            00000000004637ce: e31020000004    lg    %r1,0(%r2)
           #00000000004637d4: e31010c00004    lg    %r1,192(%r1)
           >00000000004637da: e3c010000002    ltg    %r12,0(%r1)
            00000000004637e0: a784000a        brc    8,4637f4
            00000000004637e4: b904002c        lgr    %r2,%r12
            00000000004637e8: c0e5ffe9e944    brasl    %r14,1a0a70
            00000000004637ee: ec26000b007c    cgij    %r2,0,6,463804
[   21.352029] Call Trace:
[   21.352122] ([<000000000047316c>] scsi_remove_device+0x40/0x50)
[   21.352330]  [<000003ff8000f6ee>] virtscsi_handle_event+0x1ea/0x230 
[virtio_scsi]
[   21.378453]  [<0000000000152364>] process_one_work+0x1a8/0x400
[   21.382530]  [<0000000000152a94>] worker_thread+0x144/0x35c
[   21.382818]  [<000000000015a112>] kthread+0xd2/0xdc
[   21.383105]  [<000000000063504e>] kernel_thread_starter+0x6/0xc
[   21.383391]  [<0000000000635048>] kernel_thread_starter+0x0/0xc
[   21.383679] Last Breaking-Event-Address:
[   21.383851]  [<000003ff8000e1f4>] 0x3ff8000e1f4
[   21.384080]
[   21.384142] ---[ end trace 07fb696109d46d45 ]---
[   21.386001] Unable to handle kernel pointer dereference at virtual 
kernel address fffffffffffff000
[   21.386433] Oops: 0038 [#2] SMP
[   21.386648] Modules linked in: virtio_blk virtio_scsi dm_multipath
[   21.387035] CPU: 0 PID: 4 Comm: kworker/0:0 Tainted: G D 3.10.0 #1
[   21.387471] task: 000000001ff74848 ti: 000000001ffa0000 task.ti: 
000000001ffa0000
[   21.387793] Krnl PSW : 0404e00180000000 000000000015a6c2 
(kthread_data+0x6/0x10)
[   21.388113]            R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:2 
PM:0 EA:3
Krnl GPRS: 000000000000000e 0000000000000000 000000001ff74848 
0000000000000000
[   21.392411]            0000000000006f00 000000000000038c 
00000000009db800 00000000009db800
[   21.392759]            00000000009db800 000000001ff74c10 
0000000000915c88 0000000002403800
[   21.393075]            000000001ff74848 0000000000000000 
0000000000153270 000000001ffa38d8
[   21.393398] Krnl Code: 000000000015a6b8: 07fe        bcr 15,%r14
            000000000015a6ba: 0707        bcr    0,%r7
           #000000000015a6bc: e31023800004    lg    %r1,896(%r2)
           >000000000015a6c2: e3201fd8ff04    lg    %r2,-40(%r1)
            000000000015a6c8: 07fe        bcr    15,%r14
            000000000015a6ca: 0707        bcr    0,%r7
            000000000015a6cc: ebeff0880024    stmg %r14,%r15,136(%r15)
            000000000015a6d2: a7f13fe0        tmll    %r15,16352
[   21.395048] Call Trace:
[   21.395153] ([<0000000000915c88>] __per_cpu_offset+0x0/0x200)
[   21.395419]  [<00000000006330ac>] __schedule+0x7a8/0xa94
[   21.395684]  [<0000000000136042>] do_exit+0x69e/0xa90
[   21.395948]  [<00000000001126f8>] die+0x154/0x17c
[   21.396212]  [<000000000011c400>] do_no_context+0xa8/0xe0
[   21.396475]  [<00000000006370e2>] do_asce_exception+0x172/0x18c
[   21.396738]  [<00000000006351ce>] pgm_check_handler+0x17a/0x17e
[   21.396999]  [<00000000004637da>] scsi_device_put+0x2e/0x74
[   21.397262] ([<000000000047316c>] scsi_remove_device+0x40/0x50)
[   21.397525]  [<000003ff8000f6ee>] virtscsi_handle_event+0x1ea/0x230 
[virtio_scsi]
[   21.400894]  [<0000000000152364>] process_one_work+0x1a8/0x400
[   21.403878]  [<0000000000152a94>] worker_thread+0x144/0x35c
[   21.404350]  [<000000000015a112>] kthread+0xd2/0xdc
[   21.404617]  [<000000000063504e>] kernel_thread_starter+0x6/0xc
[   21.404883]  [<0000000000635048>] kernel_thread_starter+0x0/0xc
[   21.405147] Last Breaking-Event-Address:
[   21.405306]  [<000000000015326a>] wq_worker_sleeping+0x22/0xb0
[   21.405570]

  - Eric Farman

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Qemu-devel] [PATCH 1/3] scsi/virtio-scsi: Properly flush I/Os inflight during an unplug
  2014-01-14 14:24 ` [Qemu-devel] [PATCH 1/3] scsi/virtio-scsi: Properly flush I/Os inflight during an unplug Eric Farman
@ 2014-01-14 14:45   ` Paolo Bonzini
  0 siblings, 0 replies; 9+ messages in thread
From: Paolo Bonzini @ 2014-01-14 14:45 UTC (permalink / raw)
  To: Eric Farman; +Cc: qemu-devel

Il 14/01/2014 15:24, Eric Farman ha scritto:
> When an unplug is triggered via QMP, the routine scsi_req_cancel
> is called to cancel any outstanding requests.  However, the I/Os
> themselves were instantiated via an asynchronous call that will
> drive scsi_*_complete routines after the unplug call stack finishes.
> As all references to the request have been released by the cancel
> thread, the scsi_*_complete routines experience a range of failures
> when it attempts to manipulate the released storage.

This should never happen.  See scsi_req_cancel:

    void scsi_req_cancel(SCSIRequest *req)
    {
        trace_scsi_req_cancel(req->dev->id, req->lun, req->tag);
        if (!req->enqueued) {
            return;
        }
        scsi_req_ref(req);
        scsi_req_dequeue(req);
        req->io_canceled = true;
        if (req->ops->cancel_io) {
            req->ops->cancel_io(req);
        }
        if (req->bus->info->cancel) {
            req->bus->info->cancel(req);
        }
        scsi_req_unref(req);
    }

After req->ops->cancel_io returns, the following invariant must hold:

    Any AIO callbacks will have been called before req->ops->cancel_io
    returns, or they never will.

The invariant is also present in bdrv_aio_cancel, and should respected
at all levels: dma_aio_cancel in dma-helpers.c, thread_pool_cancel in
thread-pool.c, laio_cancel in block/linux-aio.c, and so on.

scsi_cancel_io (in hw/scsi/scsi-disk.c) is very careful in its handling
of reference counts and aiocb, with the exact purpose of triggering an
assertion failure if the invariant is not respected.

Now that I look more at the code, at least this patch is needed:

diff --git a/hw/scsi/scsi-disk.c b/hw/scsi/scsi-disk.c
index bce617c..ee1f5eb 100644
--- a/hw/scsi/scsi-disk.c
+++ b/hw/scsi/scsi-disk.c
@@ -2306,6 +2306,7 @@ static const SCSIReqOps scsi_disk_emulate_reqops
     .send_command = scsi_disk_emulate_command,
     .read_data    = scsi_disk_emulate_read_data,
     .write_data   = scsi_disk_emulate_write_data,
+    .cancel_io    = scsi_cancel_io,
     .get_buf      = scsi_get_buf,
 };


but it should only have an effect in very special cases, with commands
such as UNMAP, WRITE SAME or MODE SELECT.

Paolo

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [Qemu-devel] [PATCH 3/3] scsi/virtio-scsi: Prevent assertion on missed events
  2014-01-14 14:24 ` [Qemu-devel] [PATCH 3/3] scsi/virtio-scsi: Prevent assertion on missed events Eric Farman
  2014-01-14 14:31   ` Eric Farman
@ 2014-01-14 14:47   ` Paolo Bonzini
  1 sibling, 0 replies; 9+ messages in thread
From: Paolo Bonzini @ 2014-01-14 14:47 UTC (permalink / raw)
  To: Eric Farman; +Cc: qemu-devel

Il 14/01/2014 15:24, Eric Farman ha scritto:
> In some cases, an unplug can cause events to be dropped, which
> leads to an assertion failure when preparing to notify the guest
> kernel.  This merely accommodates both variations of the "no event"
> value that could occur in this codepath.
> 
> Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com>
> ---
>  hw/scsi/virtio-scsi.c |    3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c
> index 49a9576..f8e3632 100644
> --- a/hw/scsi/virtio-scsi.c
> +++ b/hw/scsi/virtio-scsi.c
> @@ -521,7 +521,8 @@ static void virtio_scsi_push_event(VirtIOSCSI *s, SCSIDevice *dev,
>      evt->event = event;
>      evt->reason = reason;
>      if (!dev) {
> -        assert(event == VIRTIO_SCSI_T_NO_EVENT);
> +        assert(event == VIRTIO_SCSI_T_NO_EVENT ||
> +               event == VIRTIO_SCSI_T_EVENTS_MISSED);
>      } else {
>          evt->lun[0] = 1;
>          evt->lun[1] = dev->id;
> 

Ugh, you're right.  In fact, you can only have event ==
VIRTIO_SCSI_T_EVENTS_MISSED at this point.  Can you make v2 that
completely replaces the condition?

Paolo

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Qemu-devel] [PATCH 2/3] scsi/virtio-scsi: Cleanup of I/Os that never started
  2014-01-14 14:24 ` [Qemu-devel] [PATCH 2/3] scsi/virtio-scsi: Cleanup of I/Os that never started Eric Farman
@ 2014-01-14 14:49   ` Paolo Bonzini
  2014-01-14 16:08     ` Eric Farman
  0 siblings, 1 reply; 9+ messages in thread
From: Paolo Bonzini @ 2014-01-14 14:49 UTC (permalink / raw)
  To: Eric Farman; +Cc: qemu-devel

Il 14/01/2014 15:24, Eric Farman ha scritto:
> There is still a small window that occurs when a cancel I/O affects
> an asynchronous I/O operation that hasn't started.  In other words,
> when the residual data length equals the expected data length.
> 
> Today, the routine virtio_scsi_command_complete fails because the
> VirtIOSCSIReq pointer (from the hba_private field in SCSIRequest)
> was cleared earlier when virtio_scsi_complete_req was called by
> the virtio_scsi_request_cancelled routine.  As a result, the
> virtio_scsi_command_complete routine needs to simply return when
> it is processing a SCSIRequest block that was marked canceled.
> 
> Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com>
> ---
>  hw/scsi/virtio-scsi.c |    4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c
> index a8fde04..49a9576 100644
> --- a/hw/scsi/virtio-scsi.c
> +++ b/hw/scsi/virtio-scsi.c
> @@ -306,6 +306,10 @@ static void virtio_scsi_command_complete(SCSIRequest *r, uint32_t status,
>      VirtIOSCSIReq *req = r->hba_private;
>      uint32_t sense_len;
>  
> +    if (r->io_canceled) {
> +        return;
> +    }
> +
>      req->resp.cmd->response = VIRTIO_SCSI_S_OK;
>      req->resp.cmd->status = status;
>      if (req->resp.cmd->status == GOOD) {
> 

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>

Can you please look more closely into whether patch 1 is really
necessary, include the small fix I posted into that review, and post v2
of the series?

Thanks!

Paolo

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Qemu-devel] [PATCH 2/3] scsi/virtio-scsi: Cleanup of I/Os that never started
  2014-01-14 14:49   ` Paolo Bonzini
@ 2014-01-14 16:08     ` Eric Farman
  0 siblings, 0 replies; 9+ messages in thread
From: Eric Farman @ 2014-01-14 16:08 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel

On 01/14/2014 09:49 AM, Paolo Bonzini wrote:
> Il 14/01/2014 15:24, Eric Farman ha scritto:
>> There is still a small window that occurs when a cancel I/O affects
>> an asynchronous I/O operation that hasn't started.  In other words,
>> when the residual data length equals the expected data length.
>>
>> Today, the routine virtio_scsi_command_complete fails because the
>> VirtIOSCSIReq pointer (from the hba_private field in SCSIRequest)
>> was cleared earlier when virtio_scsi_complete_req was called by
>> the virtio_scsi_request_cancelled routine.  As a result, the
>> virtio_scsi_command_complete routine needs to simply return when
>> it is processing a SCSIRequest block that was marked canceled.
>>
>> Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com>
>> ---
>>   hw/scsi/virtio-scsi.c |    4 ++++
>>   1 file changed, 4 insertions(+)
>>
>> diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c
>> index a8fde04..49a9576 100644
>> --- a/hw/scsi/virtio-scsi.c
>> +++ b/hw/scsi/virtio-scsi.c
>> @@ -306,6 +306,10 @@ static void virtio_scsi_command_complete(SCSIRequest *r, uint32_t status,
>>       VirtIOSCSIReq *req = r->hba_private;
>>       uint32_t sense_len;
>>   
>> +    if (r->io_canceled) {
>> +        return;
>> +    }
>> +
>>       req->resp.cmd->response = VIRTIO_SCSI_S_OK;
>>       req->resp.cmd->status = status;
>>       if (req->resp.cmd->status == GOOD) {
>>
> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
>
> Can you please look more closely into whether patch 1 is really
> necessary, include the small fix I posted into that review, and post v2
> of the series?

Heh, that's fantastic.  I pulled patch 1 out, ran the tests that hit 
into these problems, didn't see anything.  Pulled out patch 2, reran the 
tests, hit them almost immediately.  Did this a few times, with the same 
results.  So I'll fix up another series, and post shortly.

Thanks for the feedback and the review!

- Eric
>
> Thanks!
>
> Paolo
>

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2014-01-14 16:17 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-01-14 14:24 [Qemu-devel] [PATCH RFC 0/3] virtio-scsi unplug of active device Eric Farman
2014-01-14 14:24 ` [Qemu-devel] [PATCH 1/3] scsi/virtio-scsi: Properly flush I/Os inflight during an unplug Eric Farman
2014-01-14 14:45   ` Paolo Bonzini
2014-01-14 14:24 ` [Qemu-devel] [PATCH 2/3] scsi/virtio-scsi: Cleanup of I/Os that never started Eric Farman
2014-01-14 14:49   ` Paolo Bonzini
2014-01-14 16:08     ` Eric Farman
2014-01-14 14:24 ` [Qemu-devel] [PATCH 3/3] scsi/virtio-scsi: Prevent assertion on missed events Eric Farman
2014-01-14 14:31   ` Eric Farman
2014-01-14 14:47   ` Paolo Bonzini

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.