linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Dongli Zhang <dongli.zhang@oracle.com>
To: Hannes Reinecke <hare@suse.de>, Stefan Hajnoczi <stefanha@redhat.com>
Cc: virtualization@lists.linux-foundation.org,
	linux-scsi@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-block@vger.kernel.org, mst@redhat.com, jasowang@redhat.com,
	pbonzini@redhat.com, jejb@linux.ibm.com,
	martin.petersen@oracle.com, joe.jin@oracle.com,
	junxiao.bi@oracle.com, srinivas.eeda@oracle.com
Subject: Re: [RFC] virtio_scsi: to poll and kick the virtqueue in timeout handler
Date: Tue, 25 May 2021 10:34:24 -0700	[thread overview]
Message-ID: <9bc96bd0-5a0a-da4c-25f3-c540163318f8@oracle.com> (raw)
In-Reply-To: <1184a5ac-bbb4-a89d-b5e2-ee0bf58cd1b8@suse.de>



On 5/25/21 10:24 AM, Hannes Reinecke wrote:
> On 5/25/21 6:47 PM, Stefan Hajnoczi wrote:
>> On Mon, May 24, 2021 at 11:33:33PM -0700, Dongli Zhang wrote:
>>> On 5/24/21 6:24 AM, Stefan Hajnoczi wrote:
>>>> On Sun, May 23, 2021 at 09:39:51AM +0200, Hannes Reinecke wrote:
>>>>> On 5/23/21 8:38 AM, Dongli Zhang wrote:
>>>>>> This RFC is to trigger the discussion about to poll and kick the
>>>>>> virtqueue on purpose in virtio-scsi timeout handler.
>>>>>>
>>>>>> The virtio-scsi relies on the virtio vring shared between VM and host.
>>>>>> The VM side produces requests to vring and kicks the virtqueue, while the
>>>>>> host side produces responses to vring and interrupts the VM side.
>>>>>>
>>>>>> By default the virtio-scsi handler depends on the host timeout handler
>>>>>> by BLK_EH_RESET_TIMER to give host a chance to perform EH.
>>>>>>
>>>>>> However, this is not helpful for the case that the responses are available
>>>>>> on vring but the notification from host to VM is lost.
>>>>>>
>>>>> How can this happen?
>>>>> If responses are lost the communication between VM and host is broken, and
>>>>> we should rather reset the virtio rings themselves.
>>>>
>>>> I agree. In principle it's fine to poll the virtqueue at any time, but I
>>>> don't understand the failure scenario here. It's not clear to me why the
>>>> device-to-driver vq notification could be lost.
>>>>
>>>
>>> One example is the CPU hotplug issue before the commit bf0beec0607d ("blk-mq:
>>> drain I/O when all CPUs in a hctx are offline") was available. The issue is
>>> equivalent to loss of interrupt. Without the CPU hotplug fix, while NVMe driver
>>> relies on the timeout handler to complete inflight IO requests, the PV
>>> virtio-scsi may hang permanently.
>>>
>>> In addition, as the virtio/vhost/QEMU are complex software, we are not able to
>>> guarantee there is no further lost of interrupt/kick issue in the future. It is
>>> really painful if we encounter such issue in production environment.
>>
>> Any number of hardware or software bugs might exist that we don't know
>> about, yet we don't pre-emptively add workarounds for them because where
>> do you draw the line?
>>
>> I checked other SCSI/block drivers and found it's rare to poll in the
>> timeout function so there does not seem to be a consensus that it's
>> useful to do this.
>>
> Not only this; it's downright dangerous attempting to do that in SCSI.
> In SCSI we don't have fixed lifetime guarantees that NVMe has, so there will be
> a race condition between timeout and command completion.

Thank you very much for the explanation. Yes, we cannot do that due to the race.

Dongli Zhang


> Plus there is no interface in SCSI allowing to 'poll' for completions in a
> meaningful manner.
> 
>> That said, it's technically fine to do it, the virtqueue APIs are there
>> and can be used like this. So if you and others think this is necessary,
>> then it's a pretty small change and I'm not against merging a patch like
>> this.
>>
> I would rather _not_ put more functionality into the virtio_scsi timeout
> handler; this only serves to assume that the timeout handler has some
> functionality in virtio.
> Which it patently hasn't, as the prime reason for a timeout handler is to
> _abort_ a command, which we can't on virtio.
> Well, we can on virtio, but qemu as the main user will re-route the I/O from
> virtio into doing async-I/O, and there is no way how we can abort outstanding
> asynchronous I/O.
> Or any other ioctl, for that matter.
> 
> Cheers,
> 
> Hannes

      reply	other threads:[~2021-05-25 17:35 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-23  6:38 [RFC] virtio_scsi: to poll and kick the virtqueue in timeout handler Dongli Zhang
2021-05-23  7:39 ` Hannes Reinecke
2021-05-24 13:24   ` Stefan Hajnoczi
2021-05-25  6:33     ` Dongli Zhang
2021-05-25 16:47       ` Stefan Hajnoczi
2021-05-25 17:24         ` Hannes Reinecke
2021-05-25 17:34           ` Dongli Zhang [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=9bc96bd0-5a0a-da4c-25f3-c540163318f8@oracle.com \
    --to=dongli.zhang@oracle.com \
    --cc=hare@suse.de \
    --cc=jasowang@redhat.com \
    --cc=jejb@linux.ibm.com \
    --cc=joe.jin@oracle.com \
    --cc=junxiao.bi@oracle.com \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-scsi@vger.kernel.org \
    --cc=martin.petersen@oracle.com \
    --cc=mst@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=srinivas.eeda@oracle.com \
    --cc=stefanha@redhat.com \
    --cc=virtualization@lists.linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).