From: Jason Wang <jasowang@redhat.com>
To: Halil Pasic <pasic@linux.ibm.com>
Cc: stefanha@redhat.com, virtio-comment@lists.oasis-open.org,
	mst@redhat.com, eperezma@redhat.com, sgarzare@redhat.com
Subject: Re: [virtio-comment] [PATCH RFC] virtio: introduce VIRTIO_F_DEVICE_STOP
Date: Tue, 22 Dec 2020 10:36:41 +0800
Message-ID: <bbd75f21-4a17-f3df-3c0e-3633a93d0599@redhat.com>
In-Reply-To: <20201221223338.7b5a21e6.pasic@linux.ibm.com>


On 2020/12/22 5:33 AM, Halil Pasic wrote:
> On Fri, 18 Dec 2020 12:23:02 +0800
> Jason Wang <jasowang@redhat.com> wrote:
>
>> This patch introduces a new status bit DEVICE_STOPPED. This will be
>> used by the driver to stop and resume a device. The main user will be
>> live migration support for virtio device.
>>
> Can you please provide some more background information, or point
> me to the appropriate discussion?
>
> I mean AFAIK migration already works without this driver initiated
> drain. What is the exact motivation? What about the big picture? I
> guess some agent in the guest would have to make the driver issue
> the DEVICE_STOP.


This is not necessary when the datapath is implemented inside qemu and
migration is initiated by qemu itself.

But it is a must when a virtio device is used as the backend for an
emulated virtio device (e.g. vhost-vDPA). In that case, qemu needs to
stop the device before it can safely synchronize the state from it.


>
>> Signed-off-by: Jason Wang <jasowang@redhat.com>
>> ---
>>   content.tex | 26 ++++++++++++++++++++++++--
>>   1 file changed, 24 insertions(+), 2 deletions(-)
>>
>> diff --git a/content.tex b/content.tex
>> index 61eab41..4392b60 100644
>> --- a/content.tex
>> +++ b/content.tex
>> @@ -47,6 +47,9 @@ \section{\field{Device Status} Field}\label{sec:Basic Facilities of a Virtio Dev
>>   \item[DRIVER_OK (4)] Indicates that the driver is set up and ready to
>>     drive the device.
>>   
>> +\item[DEVICE_STOPPED (32)] When VIRTIO_F_DEVICE_STOPPED is negotiated,
>> +  indicates that the device has been stopped by the driver.
>> +
> AFAIU it is not only about indicating stopped, but also requesting to be
> stopped.
>
> More importantly, that must not be set immediately, in a sense that the
> one side initiates some action by requesting the bit to be set, and the
> other side must not set the bit before the action is performed.


Yes.


> We also
> seem to assume that every device implementation is capable of performing
> this trick.


A dedicated feature bit is introduced for this.


> Is it standard for hardware devices (e.g. PCI) to request an
> operation by writing some value into a register, and get feedback about
> a non-completion by reading a different value than written,


This is not unusual in other devices. In fact, FEATURES_OK works
like this:

"""

The device MUST NOT offer a feature which requires another feature which 
was not offered. The device SHOULD accept any valid subset of features 
the driver accepts, otherwise it MUST fail to set the FEATURES_OK device 
status bit when the driver writes it.

"""

We already have several hardware implementations of virtio-pci devices
from different vendors, and I haven't heard any complaints about this
kind of design.


>   and about the
> completion, by reading the same value as written?


After DEVICE_STOPPED is read back from the device, the driver can assume
the device is stopped.
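
The write-then-read-back handshake can be sketched roughly as below. This
is only an illustration under assumptions, not part of the patch: the
status_ops hooks, driver_stop_device(), the max_polls bound and the toy
device are all hypothetical names, and only the bit values (DRIVER_OK = 4,
DEVICE_STOPPED = 32) come from the proposal.

```c
#include <stdbool.h>
#include <stdint.h>

/* Status bit values as proposed in the patch. */
#define STATUS_DRIVER_OK      0x04u
#define STATUS_DEVICE_STOPPED 0x20u

/* Hypothetical transport hooks for accessing the device status field. */
struct status_ops {
    uint8_t (*read)(void *dev);
    void (*write)(void *dev, uint8_t val);
};

/*
 * Driver side: request a stop, then re-read status until the device
 * reports DEVICE_STOPPED. Returns false if DRIVER_OK is not set or if
 * the bit does not read back within max_polls reads.
 */
static bool driver_stop_device(const struct status_ops *ops, void *dev,
                               unsigned max_polls)
{
    uint8_t status = ops->read(dev);

    if (!(status & STATUS_DRIVER_OK))
        return false;

    ops->write(dev, (uint8_t)(status | STATUS_DEVICE_STOPPED));
    for (unsigned i = 0; i < max_polls; i++) {
        if (ops->read(dev) & STATUS_DEVICE_STOPPED)
            return true;
    }
    return false;
}

/* Toy device: needs drain_reads status reads before it reports the stop. */
struct toy_dev {
    uint8_t status;
    bool stop_requested;
    unsigned drain_reads;
};

static uint8_t toy_read(void *p)
{
    struct toy_dev *d = p;

    if (d->stop_requested) {
        if (d->drain_reads == 0) {
            d->status |= STATUS_DEVICE_STOPPED;
            d->stop_requested = false;
        } else {
            d->drain_reads--;
        }
    }
    return d->status;
}

static void toy_write(void *p, uint8_t val)
{
    struct toy_dev *d = p;

    /* The stop request is latched; the bit stays hidden until drained. */
    d->stop_requested = (val & STATUS_DEVICE_STOPPED) != 0;
    d->status = (uint8_t)(val & ~STATUS_DEVICE_STOPPED);
}
```

The point of the sketch is that the value read back differs from the
value written until the device has finished stopping, which is the same
pattern the driver already relies on for FEATURES_OK.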


>
>
>>   \item[DEVICE_NEEDS_RESET (64)] Indicates that the device has experienced
>>     an error from which it can't recover.
>>   \end{description}
>> @@ -58,8 +61,9 @@ \section{\field{Device Status} Field}\label{sec:Basic Facilities of a Virtio Dev
>>   \ref{sec:General Initialization And Device Operation / Device
>>   Initialization}.
>>   The driver MUST NOT clear a
>> -\field{device status} bit.  If the driver sets the FAILED bit,
>> -the driver MUST later reset the device before attempting to re-initialize.
>> +\field{device status} bit other than DEVICE_STOPPED.  If the
>> +driver sets the FAILED bit, the driver MUST later reset the device
>> +before attempting to re-initialize.
>>   
>>   The driver SHOULD NOT rely on completion of operations of a
>>   device if DEVICE_NEEDS_RESET is set.
>> @@ -70,12 +74,28 @@ \section{\field{Device Status} Field}\label{sec:Basic Facilities of a Virtio Dev
>>   recover by issuing a reset.
>>   \end{note}
>>   
>> +The driver MUST NOT set or clear DEVICE_STOPPED when DRIVER_OK is not
>> +set. In order to stop the device, the driver MUST set DEVICE_STOPPED
>> +first and re-read status to check whether DEVICE_STOPPED is set by the
>> +device. In order to resume the device, the driver MUST clear
>> +DEVICE_STOPPED first and read status to ensure whether DEVICE_STOPPED
>> +is cleared by the device.
>> +
>>   \devicenormative{\subsection}{Device Status Field}{Basic Facilities of a Virtio Device / Device Status Field}
>>   The device MUST initialize \field{device status} to 0 upon reset.
>>   
>>   The device MUST NOT consume buffers or send any used buffer
>>   notifications to the driver before DRIVER_OK.
>>   
>> +The device MUST ignore DEVICE_STOPPED when DRIVER_OK is not set.
>> +
>> +When driver is trying to set DEVICE_STOPPED, the device MUST not
> The "when driver is trying to set DEVICE_STOPPED" is a bit soft as a
> duration. For example consider virtio-ccw, at the moment when the driver
> issues the ssch to set status, the device still does not know about it.


I need more context on this. If it works like that, when or how does the
device learn that the status has been changed? (E.g. how are reset or the
other status bits supposed to work?)

It looks like a transport limitation if we can't guarantee this. A similar
issue was met with the PCIe Endpoint device, but it can be worked around
by designing a new transport.


>
>> +process new avail requests and MUST complete all requests that is
>> +currently processing before setting DEVICE_STOPPED.
> I would like to have a more precise definition of 'new avail requests'
> and 'requests that is currently processing'.


Good point. How about something like:

The device MUST stop reading requests from the descriptor area or driver
area and MUST complete all in-flight requests before setting DEVICE_STOPPED.

To be 100% accurate, it looks to me like we would need to mention device
implementation internals or pseudo code. I started with "in flight" but
Stefan wanted a more precise term. Reading the spec, I found that "in
flight" is already used in:

"""

The driver SHOULD NOT rely on completion of operations of a device if 
DEVICE_NEEDS_RESET is set. Note: For example, the driver can’t assume 
requests in flight will be completed if DEVICE_NEEDS_RESET is set, nor 
can it assume that they have not been completed. A good implementation 
will try to recover by issuing a reset.

"""

So we are probably fine. Any ideas or suggestions are more than welcome.
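
As a rough illustration of the proposed wording (stop reading new
requests, complete what is in flight, only then set the bit), here is a
toy device model. It is a sketch under assumptions, not spec text: the
struct, the counters standing in for virtqueue state, and the function
names device_write_status() and device_step() are all hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define STATUS_DRIVER_OK      0x04u
#define STATUS_DEVICE_STOPPED 0x20u

/* Toy device model; the counters stand in for real virtqueue state. */
struct toy_device {
    uint8_t status;
    bool stop_requested;     /* driver asked for a stop, drain pending */
    unsigned avail_pending;  /* made available by the driver, not yet read */
    unsigned in_flight;      /* read by the device, not yet completed */
    unsigned used;           /* completed and returned to the driver */
};

/* Driver write to the status field, as seen by the device. */
static void device_write_status(struct toy_device *d, uint8_t val)
{
    if (!(d->status & STATUS_DRIVER_OK)) {
        /* DEVICE_STOPPED is ignored while DRIVER_OK is not set. */
        d->status = (uint8_t)(val & ~STATUS_DEVICE_STOPPED);
        return;
    }
    if (val & STATUS_DEVICE_STOPPED) {
        /* Latch the request; the bit becomes visible only after the drain. */
        if (!(d->status & STATUS_DEVICE_STOPPED))
            d->stop_requested = true;
        return;
    }
    /* Clearing DEVICE_STOPPED resumes the device. */
    d->stop_requested = false;
    d->status = val;
}

/* One unit of device work. */
static void device_step(struct toy_device *d)
{
    bool running = (d->status & STATUS_DRIVER_OK) &&
                   !(d->status & STATUS_DEVICE_STOPPED) &&
                   !d->stop_requested;

    if (d->in_flight > 0) {
        /* Complete one in-flight request. */
        d->in_flight--;
        d->used++;
    }
    if (running && d->avail_pending > 0) {
        /* Read one new request; skipped once a stop is requested. */
        d->avail_pending--;
        d->in_flight++;
    }
    if (d->stop_requested && d->in_flight == 0) {
        /* Everything is drained: now the stop may be reported. */
        d->status |= STATUS_DEVICE_STOPPED;
        d->stop_requested = false;
    }
}
```

In this model "in flight" simply means "read by the device but not yet
marked used", and requests that are available but were never read stay
untouched across the stop.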


>
>> +
>> +The device MUST keep the config space unchanged when DEVICE_STOPPED is
>> +set.
> Here you have the set by driver, which is actually requesting the stop
> operation, and the set by device, which indicates that the stop operation
> was successfully performed by the device.


Exactly.

Thanks


>
>> +
>>   \label{sec:Basic Facilities of a Virtio Device / Device Status Field / DEVICENEEDSRESET}The device SHOULD set DEVICE_NEEDS_RESET when it enters an error state
>>   that a reset is needed.  If DRIVER_OK is set, after it sets DEVICE_NEEDS_RESET, the device
>>   MUST send a device configuration change notification to the driver.
>> @@ -6553,6 +6573,8 @@ \chapter{Reserved Feature Bits}\label{sec:Reserved Feature Bits}
>>     \item[VIRTIO_F_NOTIFICATION_DATA(38)] This feature indicates
>>     that the driver passes extra data (besides identifying the virtqueue)
>>     in its device notifications.
>> +  \item[VIRTIO_F_DEVICE_STOP(39)] This feature indicates that the
>> +  driver can stop and resume the device.
>>     See \ref{sec:Virtqueues / Driver notifications}~\nameref{sec:Virtqueues / Driver notifications}.
>>   \end{description}
>>   



