From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Subject: Re: [PATCH V2 1/2] virtio: introduce virtqueue state as basic facility References: <20210706043334.36359-1-jasowang@redhat.com> <20210706043334.36359-2-jasowang@redhat.com> <20210706051731-mutt-send-email-mst@kernel.org> <20210706150459-mutt-send-email-mst@kernel.org> From: Jason Wang Message-ID: Date: Wed, 7 Jul 2021 10:42:54 +0800 MIME-Version: 1.0 In-Reply-To: <20210706150459-mutt-send-email-mst@kernel.org> Content-Type: text/plain; charset="utf-8"; format="flowed" Content-Transfer-Encoding: 8bit Content-Language: en-US To: "Michael S. Tsirkin" , Eugenio Perez Martin Cc: virtio-comment@lists.oasis-open.org, Virtio-Dev , Stefan Hajnoczi , Max Gurtovoy , Cornelia Huck , Oren Duer , Shahaf Shuler , Parav Pandit , Bodong Wang , Alexander Mikheev , Halil Pasic List-ID: 在 2021/7/7 上午3:08, Michael S. Tsirkin 写道: > On Tue, Jul 06, 2021 at 07:09:10PM +0200, Eugenio Perez Martin wrote: >> On Tue, Jul 6, 2021 at 11:32 AM Michael S. Tsirkin wrote: >>> On Tue, Jul 06, 2021 at 12:33:33PM +0800, Jason Wang wrote: >>>> This patch adds new device facility to save and restore virtqueue >>>> state. The virtqueue state is split into two parts: >>>> >>>> - The available state: The state that is used for read the next >>>> available buffer. >>>> - The used state: The state that is used for making buffer used. >>>> >>>> Note that, there could be devices that is required to set and get the >>>> requests that are being processed by the device. I leave such API to >>>> be device specific. >>>> >>>> This facility could be used by both migration and device diagnostic. >>>> >>>> Signed-off-by: Jason Wang >>> Hi Jason! >>> I feel that for use-cases such as SRIOV, >>> the facility to save/restore vq should be part of a PF >>> that is there needs to be a way for one virtio device to >>> address the state of another one. >>> >> Hi! >> >> In my opinion we should go the other way around: To make features as >> orthogonal/independent as possible, and just make them work together >> if we have to. In this particular case, I think it should be easier to >> decide how to report status, its needs, etc for a VF, and then open >> the possibility for the PF to query or set them, reusing format, >> behavior, etc. as much as possible. >> >> I think that the most controversial point about doing it non-SR IOV >> way is the exposing of these features/fields to the guest using >> specific transport facilities, like PCI common config. However I think >> it should not be hard for the hypervisor to intercept them and even to >> expose them conditionally. Please correct me if this guessing was not >> right and you had other concerns. > > Possibly. I'd like to see some guidance on how this all will work > in practice then. Maybe make it all part of a non-normative section > for now. > I think that the feature itself is not very useful outside of > migration so we don't really gain much by adding it as is > without all the other missing pieces. For networking device, the only missing part is the transport implementation of the virtqueue state. > I would say let's see more of the whole picture before we commit. I will include an implementation of PCI as an example. Thanks > > > >>> Thoughts? >>> >>>> --- >>>> content.tex | 117 ++++++++++++++++++++++++++++++++++++++++++++++++++++ >>>> 1 file changed, 117 insertions(+) >>>> >>>> diff --git a/content.tex b/content.tex >>>> index 620c0e2..8877b6f 100644 >>>> --- a/content.tex >>>> +++ b/content.tex >>>> @@ -385,6 +385,116 @@ \section{Exporting Objects}\label{sec:Basic Facilities of a Virtio Device / Expo >>>> types. It is RECOMMENDED that devices generate version 4 >>>> UUIDs as specified by \hyperref[intro:rfc4122]{[RFC4122]}. >>>> >>>> +\section{Virtqueue State}\label{sec:Virtqueues / Virtqueue State} >>>> + >>>> +When VIRTIO_F_RING_STATE is negotiated, the driver can set and >>>> +get the device internal virtqueue state through the following >>>> +fields. The way to access those fields is transport specific. >>>> + >>>> +\subsection{\field{Available State} Field} >>>> + >>>> +The \field{Available State} field is two bytes for the driver to get >>>> +or set the state that is used by the virtqueue to read for the next >>>> +available buffer. >>>> + >>>> +When VIRTIO_F_RING_PACKED is not negotiated, it contains: >>>> + >>>> +\begin{lstlisting} >>>> +le16 { >>>> + last_avail_idx : 16; >>>> +} avail_state; >>>> +\end{lstlisting} >>>> + >>>> +The \field{last_avail_idx} field indicates where the device would read >>>> +for the next index from the virtqueue available ring(modulo the queue >>>> + size). This starts at the value set by the driver, and increases. >>>> + >>>> +When VIRTIO_F_RING_PACKED is negotiated, it contains: >>>> + >>>> +\begin{lstlisting} >>>> +le16 { >>>> + last_avail_idx : 15; >>>> + last_avail_wrap_counter : 1; >>>> +} avail_state; >>>> +\end{lstlisting} >>>> + >>>> +The \field{last_avail_idx} field indicates where the device would read for >>>> +the next descriptor head from the descriptor ring. This starts at the >>>> +value set by the driver and wraps around when reaching the end of the >>>> +ring. >>>> + >>>> +The \field{last_avail_wrap_counter} field indicates the last Driver Ring >>>> +Wrap Counter that is observed by the device. This starts at the value >>>> +set by the driver, and is flipped when reaching the end of the ring. >>>> + >>>> +See also \ref{sec:Packed Virtqueues / Driver and Device Ring Wrap Counters}. >>>> + >>>> +\subsection{\field{Used State} Field} >>>> + >>>> +The \field{Used State} field is two bytes for the driver to set and >>>> +get the state used by the virtqueue to make buffer used. >>>> + >>>> +When VIRTIO_F_RING_PACKED is not negotiated, the used state contains: >>>> + >>>> +\begin{lstlisting} >>>> +le16 { >>>> + used_idx : 16; >>>> +} used_state; >>>> +\end{lstlisting} >>>> + >>>> +The \field{used_idx} where the device would write the next used >>>> +descriptor head to the used ring (modulo the queue size). This starts >>>> +at the value set by the driver, and increases. It is easy to see this >>>> +is the initial value of the \field{idx} in the used ring. >>>> + >>>> +See also \ref{sec:Basic Facilities of a Virtio Device / Virtqueues / The Virtqueue Used Ring} >>>> + >>>> +When VIRTIO_F_RING_PACKED is negotiated, the used state contains: >>>> + >>>> +\begin{lstlisting} >>>> +le16 { >>>> + used_idx : 15; >>>> + used_wrap_counter : 1; >>>> +} used_state; >>>> +\end{lstlisting} >>>> + >>>> +The \field{used_idx} indicates where the device would write the next used >>>> +descriptor head to the descriptor ring. This starts at the value set >>>> +by the driver, and warps around when reaching the end of the ring. >>>> + >>>> +\field{used_wrap_counter} is the Device Ring Wrap Counter. This starts >>>> +at the value set by the driver, and is flipped when reaching the end >>>> +of the ring. >>>> + >>>> +See also \ref{sec:Packed Virtqueues / Driver and Device Ring Wrap Counters}. >>> >>> Above only fully describes the vq state if descriptors >>> are used in order or at least all out of order descriptors are consumed >>> at time of save. >>> >> I think that the most straightforward solution would be to add >> something similar to VHOST_USER_GET_INFLIGHT_FD, but without the _FD >> part. >> >> Thanks! >> >>> Adding later option to devices such as net will need extra spec work. >>> >>> >>>> +\drivernormative{\subsection}{Virtqueue State}{Basic Facilities of a Virtio Device / Virtqueue State} >>>> + >>>> +If VIRTIO_F_RING_STATE has been negotiated: >>>> +\begin{itemize} >>>> +\item A driver MUST NOT set the virtqueue state before setting the >>>> + FEATURE_OK status bit. >>>> +\item A driver MUST NOT set the virtqueue state after setting the >>>> + DRIVER_OK status bit. >>>> +\end{itemize} >>>> + >>>> +\devicenormative{\subsection}{Virtqueue State}{Basic Facilities of a Virtio Device / Virtqueue State} >>>> + >>>> +If VIRTIO_F_RING_STATE has not been negotiated, a device MUST ingore >>>> +the read and write to the virtqueue state. >>>> + >>>> +If VIRTIO_F_RING_STATE has been negotiated: >>>> +\begin{itemize} >>>> +\item A device SHOULD ignore the write to the virtqueue state if the >>>> +FEATURE_OK status bit is not set. >>>> +\item A device SHOULD ignore the write to the virtqueue state if the >>>> +DRIVER_OK status bit is set. >>>> +\end{itemize} >>>> + >>>> +If VIRTIO_F_RING_STATE has been negotiated, a device MAY has its >>> >>> may have? >>> should also go into a normative section >>> >>>> +device-specific way for the driver to set and get extra virtqueue >>>> +states such as in flight requests. >>>> + >>>> \chapter{General Initialization And Device Operation}\label{sec:General Initialization And Device Operation} >>>> >>>> We start with an overview of device initialization, then expand on the >>>> @@ -420,6 +530,9 @@ \section{Device Initialization}\label{sec:General Initialization And Device Oper >>>> device, optional per-bus setup, reading and possibly writing the >>>> device's virtio configuration space, and population of virtqueues. >>>> >>>> +\item\label{itm:General Initialization And Device Operation / Device >>>> + Initialization / Virtqueue State Setup} When VIRTIO_F_RING_STATE has been negotiated, perform virtqueue state setup, including the initialization of the per virtqueue available state, used state and the possible device specific virtqueue state. >>>> + >>>> \item\label{itm:General Initialization And Device Operation / Device Initialization / Set DRIVER-OK} Set the DRIVER_OK status bit. At this point the device is >>>> ``live''. >>>> \end{enumerate} >>>> @@ -6596,6 +6709,10 @@ \chapter{Reserved Feature Bits}\label{sec:Reserved Feature Bits} >>>> transport specific. >>>> For more details about driver notifications over PCI see \ref{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI-specific Initialization And Device Operation / Available Buffer Notifications}. >>>> >>>> + \item[VIRTIO_F_RING_STATE(40)] This feature indicates that the driver >>>> + can set and get the device internal virtqueue state. >>>> + See \ref{sec:Virtqueues / Virtqueue State}~\nameref{sec:Virtqueues / Virtqueue State}. >>>> + >>>> \end{description} >>>> >>>> \drivernormative{\section}{Reserved Feature Bits}{Reserved Feature Bits} >>>> -- >>>> 2.25.1