* [PATCH v5 0/7] Introduce device group and device management
@ 2022-04-26 22:58 Max Gurtovoy
  2022-04-26 22:58 ` [virtio-comment] [PATCH v5 1/7] Introduce device group Max Gurtovoy
                   ` (8 more replies)
  0 siblings, 9 replies; 103+ messages in thread
From: Max Gurtovoy @ 2022-04-26 22:58 UTC (permalink / raw)
  To: jasowang, virtio-comment, mst, cohuck, virtio-dev
  Cc: oren, parav, shahafs, aadam, virtio, Max Gurtovoy

Hi,
A device group definition will help extend the virtio specification for
various future features that require a notion of grouping devices together or
managing devices inside a group. A device group includes one or more virtio devices.
For now, only type 1 device groups are supported.

Also introduce the admin facility to allow manipulating features and configurations
in a generic manner. Using the admin command set, one can manipulate the device itself
and/or, if possible, another device within the same device group (for now,
only PCI SR-IOV device grouping is supported).

The admin command set will be extended in the future to support more functionality.
Some of this functionality is already under discussion.

The admin virtqueue is the first management interface to issue admin commands from
the admin command set.

Motivation for choosing the admin queue as the first management interface:
1. It is anticipated that the admin queue will be used for managing and configuring
   many different types of resources. For example,
   a. PCI PF configuring PCI VF attributes.
   b. virtio device creating/destroying/configuring subfunctions discussed in [1]
   c. composing the device config space of a VF or SF, such as MAC address, number of VQs, virtio features

   Mapping all of them as configuration registers to MMIO would require a large MMIO space
   if done for each VF/SF. Such an MMIO implementation in physical devices such as a PCI PF and VF
   requires on-chip resources to complete within MMIO access latencies. Such resources are very
   expensive.

2. Such a limitation can be overcome by having a smaller MMIO register set to build
   a command request/response interface. However, such an MMIO-based command interface
   would be limited to a single outstanding command. This limitation can
   result in high device creation and composition times, which can affect VM startup time.
   Devices can often queue and service multiple commands in parallel, but such a command interface
   cannot use the parallelism offered by the device.

3. A command may need to DMA data from one or more physical addresses. For example, a future
   live migration command may need to fetch device state consisting of config space, the state of
   tens of VQs, VLAN and MAC tables, a per-VQ partial outstanding block IO list database and more.
   Packing one or more DMA addresses over a new command interface would be burdensome and would
   continue to suffer single outstanding command execution latencies. Such a limitation is not good
   for time sensitive live migration use cases.

4. A virtio queue overcomes all the above limitations. It also supports DMA and multiple outstanding
   descriptors. A similar mechanism exists today for device-specific configuration: the control VQ.

Future work can add another management interface to issue admin commands.

[1] https://lists.oasis-open.org/archives/virtio-comment/202108/msg00025.html

This series includes the comments and fixes from V1-V4 of the initial patch sets ("VIRTIO: Provision maximum
MSI-X vectors for a VF" and "Introduce virtio subsystem and Admin virtqueue" [2]).
This series was extended with an additional RFC for setting managed device feature bits as another example of
using the admin command set. Also, device/driver negotiation for admin caps was introduced in response to previous
comments on the mailing list.

[2] https://lists.oasis-open.org/archives/virtio-comment/202203/msg00005.html


Open issues:
1. CCW and MMIO specification for admin_queue_index register

Changelog:
 - Merged the MSI-X configuration series into the current one.
 - Addressed comments from MST, Jason Wang and others.
 - Simplified the interface.
 - Added another resource management example as RFC (feature bits).

Max Gurtovoy (7):
  Introduce device group
  Introduce admin command set
  Introduce new destination type for admin commands
  Introduce virtio admin virtqueue
  Add miscellaneous configuration structure for PCI
  Introduce MGMT admin commands
  RFC: add initial support for configuring feature bits

 admin.tex        | 282 +++++++++++++++++++++++++++++++++++++++++++++++
 conformance.tex  |   3 +
 content.tex      | 118 +++++++++++++++++++-
 introduction.tex |  42 +++++++
 4 files changed, 443 insertions(+), 2 deletions(-)
 create mode 100644 admin.tex

-- 
2.21.0


* [virtio-comment] [PATCH v5 1/7] Introduce device group
  2022-04-26 22:58 [PATCH v5 0/7] Introduce device group and device management Max Gurtovoy
@ 2022-04-26 22:58 ` Max Gurtovoy
  2022-05-15 15:25   ` Michael S. Tsirkin
  2022-04-26 22:58 ` [PATCH v5 2/7] Introduce admin command set Max Gurtovoy
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 103+ messages in thread
From: Max Gurtovoy @ 2022-04-26 22:58 UTC (permalink / raw)
  To: jasowang, virtio-comment, mst, cohuck, virtio-dev
  Cc: oren, parav, shahafs, aadam, virtio, Max Gurtovoy

Each device group has a type. For now, define one initial device group
type: Type 1 - A virtio PCI SR-IOV physical function (PF) and its PCI
SR-IOV virtual functions (VFs). This group may contain one or more
virtio devices.

Each device within a device group has a unique identifier. This
identifier is the virtio device id (vdev_id).

Reviewed-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
---
 introduction.tex | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/introduction.tex b/introduction.tex
index 4dc7085..4358ab1 100644
--- a/introduction.tex
+++ b/introduction.tex
@@ -155,6 +155,18 @@ \subsection{Transition from earlier specification drafts}\label{sec:Transition f
 sections tagged "Legacy Interface" in the section title.
 These highlight the changes made since the earlier drafts.
 
+\subsection{Device group}\label{sec:Introduction / Terminology / Device group}
+
+A device group includes one or more virtio devices.
+Each virtio device has a unique virtio device id (vdev_id) within a device group. A valid vdev_id is a 64-bit field in the range of
+0x0 - 0xFFFFFFFFFFFFFFF0. Vdev_id 0xFFFFFFFFFFFFFFFF is a value that refers to all devices in a device group and isn't a valid vdev_id.
+
+For now, the supported device groups are:
+\begin{enumerate}
+\item Type 1 - A virtio PCI SR-IOV physical function (PF) and its PCI SR-IOV virtual functions (VFs). For this group type, the PF device has vdev_id that is equal to 0
+and the VF devices have vdev_id's that are equal to their vf_number (according to the PCI SR-IOV specification).
+\end{enumerate}
+
 \section{Structure Specifications}\label{sec:Structure Specifications}
 
 Many device and driver in-memory structure layouts are documented using
-- 
2.21.0
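
Reader's note, not part of the patch: the vdev_id rules added to
introduction.tex above can be sketched in C roughly as follows. The macro and
function names are hypothetical. The text leaves the values between
0xFFFFFFFFFFFFFFF1 and 0xFFFFFFFFFFFFFFFE undefined; this sketch assumes they
are rejected. It also assumes VF numbers start at 1, as in the PCI SR-IOV
numbering, so that no VF collides with the PF's vdev_id of 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values taken from the introduction.tex hunk; names are hypothetical. */
#define VDEV_ID_MAX 0xFFFFFFFFFFFFFFF0ULL /* highest valid vdev_id */
#define VDEV_ID_ALL 0xFFFFFFFFFFFFFFFFULL /* "all devices", not a valid vdev_id */
#define TYPE1_PF_VDEV_ID 0ULL             /* the PF of a type 1 group */

/* True if id names a single device inside a device group. */
static bool vdev_id_is_valid(uint64_t id)
{
    return id <= VDEV_ID_MAX;
}

/* Type 1 group: a VF's vdev_id equals its vf_number. */
static uint64_t type1_vf_vdev_id(uint16_t vf_number)
{
    assert(vf_number >= 1); /* assumption: VF numbering starts at 1 */
    return vf_number;
}
```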


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


* [PATCH v5 2/7] Introduce admin command set
  2022-04-26 22:58 [PATCH v5 0/7] Introduce device group and device management Max Gurtovoy
  2022-04-26 22:58 ` [virtio-comment] [PATCH v5 1/7] Introduce device group Max Gurtovoy
@ 2022-04-26 22:58 ` Max Gurtovoy
  2022-05-15 15:23   ` Michael S. Tsirkin
  2022-04-26 22:58 ` [PATCH v5 3/7] Introduce new destination type for admin commands Max Gurtovoy
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 103+ messages in thread
From: Max Gurtovoy @ 2022-04-26 22:58 UTC (permalink / raw)
  To: jasowang, virtio-comment, mst, cohuck, virtio-dev
  Cc: oren, parav, shahafs, aadam, virtio, Max Gurtovoy

This command set is used for essential administrative and management
operations.

Admin commands should be submitted to a well-defined management
interface.

Reviewed-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
---
 admin.tex   | 123 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 content.tex |   2 +
 2 files changed, 125 insertions(+)
 create mode 100644 admin.tex

diff --git a/admin.tex b/admin.tex
new file mode 100644
index 0000000..6725daa
--- /dev/null
+++ b/admin.tex
@@ -0,0 +1,123 @@
+\section{Administration command set}\label{sec:Basic Facilities of a Virtio Device / Administration command set}
+
+The Administration command set (also known as Admin command set) defines the commands that can be issued using a management interface.
+This mechanism, for example, can be used by a system administrator that wants to configure a device before it is initialized by its driver.
+
+All the Admin commands are of the following form:
+
+\begin{lstlisting}
+struct virtio_admin_cmd {
+        /* Device-readable part */
+        le16 command;
+        /*
+         * 0 - self
+         * 1 - 65535 are reserved
+         */
+        le16 dst_type;
+        /* reserved for common cmd fields */
+        u8 reserved[20];
+        u8 command_specific_data[];
+
+        /* Device-writable part */
+        u8 status;
+        u8 command_specific_error;
+        u8 command_specific_result[];
+};
+\end{lstlisting}
+
+The following table describes the generic Admin status codes:
+
+\begin{tabular}{|l|l|l|}
+\hline
+Opcode & Status & Description \\
+\hline \hline
+00h   & VIRTIO_ADMIN_STATUS_OK    & successful completion  \\
+\hline
+01h   & VIRTIO_ADMIN_STATUS_CS_ERR    & command specific error  \\
+\hline
+02h   & VIRTIO_ADMIN_STATUS_COMMAND_UNSUPPORTED    & unsupported or invalid opcode  \\
+\hline
+03h   & VIRTIO_ADMIN_STATUS_INVALID_FIELD    & invalid field was set  \\
+\hline
+\end{tabular}
+
+The \field{command}, \field{dst_type} and \field{command_specific_data} are
+set by the driver, and the device sets the \field{status}, the
+\field{command_specific_error} and the \field{command_specific_result},
+if needed.
+
+Reserved common fields are ignored by the device and to be zeroed by the driver.
+
+The mandatory fields to be set by the driver, for all admin commands, are \field{command} and \field{dst_type}.
+
+The \field{command} defines the opcode for the command. The value for each command can be found in each command section.
+
+The \field{dst_type} defines the designated virtio device for the command. This value should be set to 0 (self).
+
+The \field{command_specific_error} should be inspected by the driver only if \field{status} is set to
+VIRTIO_ADMIN_STATUS_CS_ERR by the device. In this case, the content of \field{command_specific_error}
+holds the command specific error. If \field{status} is not set to VIRTIO_ADMIN_STATUS_CS_ERR, the
+\field{command_specific_error} value is undefined and should be ignored by the driver.
+
+The following table describes the Admin command set:
+
+\begin{tabular}{|l|l|l|}
+\hline
+Opcode & Command & M/O \\
+\hline \hline
+0000h   & VIRTIO_ADMIN_DEVICE_CAPS_IDENTIFY    & M  \\
+\hline
+0001h   & VIRTIO_ADMIN_DEVICE_CAPS_ACCEPT    & M  \\
+\hline
+0002h - 7FFFh   & Generic admin cmds    & -  \\
+\hline
+8000h - FFFFh   & Reserved    & - \\
+\hline
+\end{tabular}
+
+\subsection{VIRTIO ADMIN DEVICE CAPS IDENTIFY command}\label{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE CAPS IDENTIFY command}
+
+The VIRTIO_ADMIN_DEVICE_CAPS_IDENTIFY command has no command specific data set by the driver.
+The \field{command} is set to VIRTIO_ADMIN_DEVICE_CAPS_IDENTIFY by the driver.
+
+The device, upon success, returns a result that describes information about the designated virtio device.
+This result is of form:
+\begin{lstlisting}
+struct virtio_admin_device_caps_identify_result {
+       /* Indicates which of the below fields were returned
+        * (1 means that field was returned):
+        * Bit 0 - device_admin_caps
+        * Bits 1 - 63 - reserved for future fields
+        */
+       le64 attrs_mask;
+       /* This field indicates which of the below admin
+        * capabilities are supported by the device:
+        * Bits 0 - 63 - reserved for future capabilities.
+        */
+       le64 device_admin_caps;
+       u8 reserved[112];
+};
+\end{lstlisting}
+
+\subsection{VIRTIO ADMIN DEVICE CAPS ACCEPT command}\label{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE CAPS ACCEPT command}
+
+The VIRTIO_ADMIN_DEVICE_CAPS_ACCEPT command is used by the driver to acknowledge those admin capabilities it understands and wishes to use.
+The \field{command} is set to VIRTIO_ADMIN_DEVICE_CAPS_ACCEPT by the driver.
+
+The command specific data set by the driver is of form:
+\begin{lstlisting}
+struct virtio_admin_device_caps_accept_data {
+       /* Indicates which of the below fields were set
+        * (1 means that field is set):
+        * Bit 0 - driver_admin_caps
+        * Bits 1 - 63 - reserved for future fields
+        */
+       le64 attrs_mask;
+       /* This field indicates which of the below admin
+        * capabilities are supported by the driver:
+        * Bits 0 - 63 - reserved for future capabilities.
+        */
+       le64 driver_admin_caps;
+       u8 reserved[112];
+};
+\end{lstlisting}
diff --git a/content.tex b/content.tex
index c6f116c..2e1df84 100644
--- a/content.tex
+++ b/content.tex
@@ -449,6 +449,8 @@ \section{Exporting Objects}\label{sec:Basic Facilities of a Virtio Device / Expo
 types. It is RECOMMENDED that devices generate version 4
 UUIDs as specified by \hyperref[intro:rfc4122]{[RFC4122]}.
 
+\input{admin.tex}
+
 \chapter{General Initialization And Device Operation}\label{sec:General Initialization And Device Operation}
 
 We start with an overview of device initialization, then expand on the
-- 
2.21.0
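
Reader's note, not part of the patch: a minimal C sketch of the fixed part of
struct virtio_admin_cmd as defined in this patch, and of the rule for when
command_specific_error may be read. The struct and function names are
hypothetical. Byte arrays stand in for the le16 fields so that the layout does
not depend on host endianness or padding. Treating a device_admin_caps field
that was not returned as "no capabilities" is an assumption of the sketch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Generic status codes from the table in admin.tex. */
enum {
    VIRTIO_ADMIN_STATUS_OK                  = 0x00,
    VIRTIO_ADMIN_STATUS_CS_ERR              = 0x01,
    VIRTIO_ADMIN_STATUS_COMMAND_UNSUPPORTED = 0x02,
    VIRTIO_ADMIN_STATUS_INVALID_FIELD       = 0x03,
};

/* Mandatory opcodes from the command table. */
enum {
    VIRTIO_ADMIN_DEVICE_CAPS_IDENTIFY = 0x0000,
    VIRTIO_ADMIN_DEVICE_CAPS_ACCEPT   = 0x0001,
};

/* Fixed device-readable header (hypothetical name), 2 + 2 + 20 bytes. */
struct admin_cmd_hdr {
    uint8_t command[2];   /* le16 opcode */
    uint8_t dst_type[2];  /* le16, 0 = self */
    uint8_t reserved[20]; /* zeroed by the driver */
};
_Static_assert(sizeof(struct admin_cmd_hdr) == 24, "header must be 24 bytes");

/* Zero the header (reserved fields, dst_type = self) and store the opcode. */
static void admin_cmd_hdr_init(struct admin_cmd_hdr *h, uint16_t command)
{
    assert(h != NULL);
    memset(h, 0, sizeof(*h));
    h->command[0] = (uint8_t)(command & 0xff);
    h->command[1] = (uint8_t)(command >> 8);
}

/* Return the command specific error, or -1 when the field is undefined. */
static int admin_cs_error(uint8_t status, uint8_t command_specific_error)
{
    return status == VIRTIO_ADMIN_STATUS_CS_ERR ? command_specific_error : -1;
}

/* attrs_mask bit 0 tells whether device_admin_caps was returned at all. */
static uint64_t identify_admin_caps(uint64_t attrs_mask, uint64_t device_admin_caps)
{
    return (attrs_mask & 1) ? device_admin_caps : 0;
}
```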


* [PATCH v5 3/7] Introduce new destination type for admin commands
  2022-04-26 22:58 [PATCH v5 0/7] Introduce device group and device management Max Gurtovoy
  2022-04-26 22:58 ` [virtio-comment] [PATCH v5 1/7] Introduce device group Max Gurtovoy
  2022-04-26 22:58 ` [PATCH v5 2/7] Introduce admin command set Max Gurtovoy
@ 2022-04-26 22:58 ` Max Gurtovoy
  2022-05-15 15:01   ` Michael S. Tsirkin
  2022-05-15 15:09   ` Michael S. Tsirkin
  2022-04-26 22:58 ` [PATCH v5 4/7] Introduce virtio admin virtqueue Max Gurtovoy
                   ` (5 subsequent siblings)
  8 siblings, 2 replies; 103+ messages in thread
From: Max Gurtovoy @ 2022-04-26 22:58 UTC (permalink / raw)
  To: jasowang, virtio-comment, mst, cohuck, virtio-dev
  Cc: oren, parav, shahafs, aadam, virtio, Max Gurtovoy

Introduce a new mechanism to issue commands with a dst_type field that is
not "self". With the new mechanism, the driver can set dst_type to 1
and use the vdev_id common field to identify the designated device.

This mechanism is useful for device groups with multiple devices
with different capabilities. For example, a type 1 device group
that contains a PCI PF and its VFs. For this group, a system
administrator can use admin commands to manipulate the PF/VF resources.

Reviewed-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
---
 admin.tex | 18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/admin.tex b/admin.tex
index 6725daa..f816c3b 100644
--- a/admin.tex
+++ b/admin.tex
@@ -11,11 +11,13 @@ \section{Administration command set}\label{sec:Basic Facilities of a Virtio Devi
         le16 command;
         /*
          * 0 - self
-         * 1 - 65535 are reserved
+         * 1 - other virtio device (identified by vdev_id) in the same device group
+         * 2 - 65535 are reserved
          */
         le16 dst_type;
+        le64 vdev_id;
         /* reserved for common cmd fields */
-        u8 reserved[20];
+        u8 reserved[12];
         u8 command_specific_data[];
 
         /* Device-writable part */
@@ -39,9 +41,11 @@ \section{Administration command set}\label{sec:Basic Facilities of a Virtio Devi
 \hline
 03h   & VIRTIO_ADMIN_STATUS_INVALID_FIELD    & invalid field was set  \\
 \hline
+04h   & VIRTIO_ADMIN_STATUS_INVALID_VDEV_ID    & invalid vdev_id was set  \\
+\hline
 \end{tabular}
 
-The \field{command}, \field{dst_type} and \field{command_specific_data} are
+The \field{command}, \field{dst_type}, \field{vdev_id} and \field{command_specific_data} are
 set by the driver, and the device sets the \field{status}, the
 \field{command_specific_error} and the \field{command_specific_result},
 if needed.
@@ -50,9 +54,15 @@ \section{Administration command set}\label{sec:Basic Facilities of a Virtio Devi
 
 The mandatory fields to be set by the driver, for all admin commands, are \field{command} and \field{dst_type}.
 
+The optional unused fields to be zeroed by the driver.
+
 The \field{command} defines the opcode for the command. The value for each command can be found in each command section.
 
-The \field{dst_type} defines the designated virtio device for the command. This value should be set to 0 (self).
+The \field{dst_type} defines the designated virtio device for the command. This value can be set to 0 (self) or 1 (other virtio device in the same device
+group) by the driver. Not all the commands allow setting \field{dst_type} to 1. Refer to each command description explicitly to check whether this operation is allowed.
+If \field{dst_type} is set to 0 by the driver, the \field{vdev_id} isn't valid, should be zeroed by the driver and should be ignored by the device.
+If \field{dst_type} is set to 1 by the driver, the \field{vdev_id} is valid and used to describe the vdev_id of the designated virtio device (see section
+\ref{sec:Introduction / Terminology / Device group} for vdev_id numbering for type 1 device groups).
 
 The \field{command_specific_error} should be inspected by the driver only if \field{status} is set to
 VIRTIO_ADMIN_STATUS_CS_ERR by the device. In this case, the content of \field{command_specific_error}
-- 
2.21.0
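
Reader's note, not part of the patch: a C sketch of how a driver might fill
dst_type and vdev_id under the rules this patch adds. All names are
hypothetical. The sketch assumes the header fields are laid out back to back
with no padding, which places vdev_id at byte offset 4; a plain uint64_t
member would be padded to offset 8 on many ABIs, so byte arrays are used
instead. Rejecting reserved dst_type values and out-of-range vdev_ids on the
driver side is also an assumption.

```c
#include <assert.h>
#include <stdint.h>

#define ADMIN_DST_SELF  0 /* command targets the device itself */
#define ADMIN_DST_OTHER 1 /* command targets vdev_id in the same group */
#define VDEV_ID_MAX 0xFFFFFFFFFFFFFFF0ULL

/* Header after this patch (hypothetical name): 2 + 2 + 8 + 12 bytes. */
struct admin_cmd_hdr_v2 {
    uint8_t command[2];   /* le16 */
    uint8_t dst_type[2];  /* le16 */
    uint8_t vdev_id[8];   /* le64 at byte offset 4 */
    uint8_t reserved[12]; /* zeroed by the driver */
};
_Static_assert(sizeof(struct admin_cmd_hdr_v2) == 24, "header must be 24 bytes");

/* Set the destination fields. Returns 0 on success, -1 on invalid input. */
static int admin_hdr_set_dst(struct admin_cmd_hdr_v2 *h, uint16_t dst_type,
                             uint64_t vdev_id)
{
    assert(h != 0);
    if (dst_type > ADMIN_DST_OTHER)
        return -1; /* 2 - 65535 are reserved */
    if (dst_type == ADMIN_DST_OTHER && vdev_id > VDEV_ID_MAX)
        return -1; /* not a valid single-device vdev_id */
    if (dst_type == ADMIN_DST_SELF)
        vdev_id = 0; /* vdev_id isn't valid for "self" and is zeroed */

    h->dst_type[0] = (uint8_t)(dst_type & 0xff);
    h->dst_type[1] = (uint8_t)(dst_type >> 8);
    for (int i = 0; i < 8; i++)
        h->vdev_id[i] = (uint8_t)(vdev_id >> (8 * i));
    return 0;
}
```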


* [PATCH v5 4/7] Introduce virtio admin virtqueue
  2022-04-26 22:58 [PATCH v5 0/7] Introduce device group and device management Max Gurtovoy
                   ` (2 preceding siblings ...)
  2022-04-26 22:58 ` [PATCH v5 3/7] Introduce new destination type for admin commands Max Gurtovoy
@ 2022-04-26 22:58 ` Max Gurtovoy
  2022-05-15 14:59   ` Michael S. Tsirkin
  2022-04-26 22:58 ` [PATCH v5 5/7] Add miscellaneous configuration structure for PCI Max Gurtovoy
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 103+ messages in thread
From: Max Gurtovoy @ 2022-04-26 22:58 UTC (permalink / raw)
  To: jasowang, virtio-comment, mst, cohuck, virtio-dev
  Cc: oren, parav, shahafs, aadam, virtio, Max Gurtovoy

In one of the many use cases, a user wants to manipulate features and
configuration of virtio devices regardless of the device type
(net/block/console). For that, the admin command set is introduced. The
admin virtqueue will be the first management interface to issue admin
commands.

Currently the virtio specification defines a control virtqueue to manipulate
features and configuration of the device it operates on. However,
control virtqueue commands are device type specific, which makes it very
difficult to extend them to device-agnostic commands.

To support this requirement in an elegant way, this patch introduces a new
admin virtqueue interface.

Manipulating features via the admin virtqueue is asynchronous, scalable, easy
to extend and doesn't require additional and expensive on-die resources
to be allocated for every new feature that will be added in the future.

Reviewed-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
---
 admin.tex       | 17 +++++++++++++++++
 conformance.tex |  1 +
 content.tex     |  6 ++++--
 3 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/admin.tex b/admin.tex
index f816c3b..d09683d 100644
--- a/admin.tex
+++ b/admin.tex
@@ -131,3 +131,20 @@ \subsection{VIRTIO ADMIN DEVICE CAPS ACCEPT command}\label{sec:Basic Facilities
        u8 reserved[112];
 };
 \end{lstlisting}
+
+\section{Admin Virtqueues}\label{sec:Basic Facilities of a Virtio Device / Admin Virtqueues}
+
+An admin virtqueue is a management interface of a device that can be used to send administrative
+commands (see \ref{sec:Basic Facilities of a Virtio Device / Administration command set}) to manipulate
+various features of the device and/or to manipulate various features, if possible, of another device.
+
+An admin virtqueue exists for a certain device if VIRTIO_F_ADMIN_VQ feature is
+negotiated. The index of the admin virtqueue is exposed by the device in a
+transport specific manner.
+
+If VIRTIO_F_ADMIN_VQ has been negotiated, the driver will use the admin virtqueue to send all admin commands.
+
+\devicenormative{\subsection}{Admin Virtqueues}{Basic Facilities of a Virtio Device / Admin Virtqueues}
+A device that advertises VIRTIO_F_ADMIN_VQ capability MUST support all the mandatory admin commands.
+
+A device that advertises VIRTIO_F_ADMIN_VQ capability MAY support one or more optional admin commands.
diff --git a/conformance.tex b/conformance.tex
index 9807c30..3c7b7bc 100644
--- a/conformance.tex
+++ b/conformance.tex
@@ -342,6 +342,7 @@ \section{Conformance Targets}\label{sec:Conformance / Conformance Targets}
 \item \ref{devicenormative:Basic Facilities of a Virtio Device / Virtqueues / Available Buffer Notification Suppression}
 \item \ref{devicenormative:Basic Facilities of a Virtio Device / Shared Memory Regions}
 \item \ref{devicenormative:Reserved Feature Bits}
+\item \ref{devicenormative:Basic Facilities of a Virtio Device / Admin Virtqueues}
 \end{itemize}
 
 \conformance{\subsection}{PCI Device Conformance}\label{sec:Conformance / Device Conformance / PCI Device Conformance}
diff --git a/content.tex b/content.tex
index 2e1df84..163cb34 100644
--- a/content.tex
+++ b/content.tex
@@ -99,10 +99,10 @@ \section{Feature Bits}\label{sec:Basic Facilities of a Virtio Device / Feature B
 \begin{description}
 \item[0 to 23, and 50 to 127] Feature bits for the specific device type
 
-\item[24 to 40] Feature bits reserved for extensions to the queue and
+\item[24 to 41] Feature bits reserved for extensions to the queue and
   feature negotiation mechanisms
 
-\item[41 to 49, and 128 and above] Feature bits reserved for future extensions.
+\item[42 to 49, and 128 and above] Feature bits reserved for future extensions.
 \end{description}
 
 \begin{note}
@@ -6849,6 +6849,8 @@ \chapter{Reserved Feature Bits}\label{sec:Reserved Feature Bits}
   that the driver can reset a queue individually.
   See \ref{sec:Basic Facilities of a Virtio Device / Virtqueues / Virtqueue Reset}.
 
+  \item[VIRTIO_F_ADMIN_VQ (41)] This feature indicates that an administration virtqueue is supported.
+
 \end{description}
 
 \drivernormative{\section}{Reserved Feature Bits}{Reserved Feature Bits}
-- 
2.21.0
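
Reader's note, not part of the patch: a small C sketch of the checks implied
by this patch. The helper names are hypothetical. Only bit 41 comes from the
patch; the 64-bit feature word is a simplification, since the specification
also reserves feature bits 128 and above. The mandatory opcodes are the two
marked "M" in the command table as of this point in the series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define VIRTIO_F_ADMIN_VQ 41 /* feature bit added by this patch */

/* True if the given feature bit is set in a 64-bit feature word. */
static bool virtio_has_feature(uint64_t features, unsigned int bit)
{
    assert(bit < 64); /* this sketch only covers the low 64 feature bits */
    return (features >> bit) & 1;
}

/*
 * A device that advertises VIRTIO_F_ADMIN_VQ MUST support the mandatory
 * commands: 0000h (CAPS_IDENTIFY) and 0001h (CAPS_ACCEPT).
 */
static bool admin_opcode_is_mandatory(uint16_t opcode)
{
    return opcode <= 0x0001;
}
```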


* [PATCH v5 5/7] Add miscellaneous configuration structure for PCI
  2022-04-26 22:58 [PATCH v5 0/7] Introduce device group and device management Max Gurtovoy
                   ` (3 preceding siblings ...)
  2022-04-26 22:58 ` [PATCH v5 4/7] Introduce virtio admin virtqueue Max Gurtovoy
@ 2022-04-26 22:58 ` Max Gurtovoy
  2022-05-15 14:49   ` Michael S. Tsirkin
  2022-05-15 14:57   ` Michael S. Tsirkin
  2022-04-26 22:58 ` [PATCH v5 6/7] Introduce MGMT admin commands Max Gurtovoy
                   ` (3 subsequent siblings)
  8 siblings, 2 replies; 103+ messages in thread
From: Max Gurtovoy @ 2022-04-26 22:58 UTC (permalink / raw)
  To: jasowang, virtio-comment, mst, cohuck, virtio-dev
  Cc: oren, parav, shahafs, aadam, virtio, Max Gurtovoy

This new structure will be used for adding new miscellaneous registers
for a virtio device configuration layout.

For now, only the admin_queue_index register is added. The admin virtqueue
index does not depend on the device type. Hence, add a PCI capability to
read the admin virtqueue index.

Reviewed-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
---
 conformance.tex |  2 ++
 content.tex     | 29 +++++++++++++++++++++++++++++
 2 files changed, 31 insertions(+)

diff --git a/conformance.tex b/conformance.tex
index 3c7b7bc..c183581 100644
--- a/conformance.tex
+++ b/conformance.tex
@@ -103,6 +103,7 @@ \section{Conformance Targets}\label{sec:Conformance / Conformance Targets}
 \item \ref{drivernormative:Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / PCI configuration access capability}
 \item \ref{drivernormative:Virtio Transport Options / Virtio Over PCI Bus / PCI-specific Initialization And Device Operation / Device Initialization / MSI-X Vector Configuration}
 \item \ref{drivernormative:Virtio Transport Options / Virtio Over PCI Bus / PCI-specific Initialization And Device Operation / Notification of Device Configuration Changes}
+\item \ref{drivernormative:Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Miscellaneous configuration structure layout}
 \end{itemize}
 
 \conformance{\subsection}{MMIO Driver Conformance}\label{sec:Conformance / Driver Conformance / MMIO Driver Conformance}
@@ -364,6 +365,7 @@ \section{Conformance Targets}\label{sec:Conformance / Conformance Targets}
 \item \ref{devicenormative:Virtio Transport Options / Virtio Over PCI Bus / PCI-specific Initialization And Device Operation / Device Initialization / MSI-X Vector Configuration}
 \item \ref{devicenormative:Virtio Transport Options / Virtio Over PCI Bus / PCI-specific Initialization And Device Operation / Used Buffer Notifications}
 \item \ref{devicenormative:Virtio Transport Options / Virtio Over PCI Bus / PCI-specific Initialization And Device Operation / Notification of Device Configuration Changes}
+\item \ref{devicenormative:Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Miscellaneous configuration structure layout}
 \end{itemize}
 
 \conformance{\subsection}{MMIO Device Conformance}\label{sec:Conformance / Device Conformance / MMIO Device Conformance}
diff --git a/content.tex b/content.tex
index 163cb34..0c1d44f 100644
--- a/content.tex
+++ b/content.tex
@@ -712,6 +712,7 @@ \subsection{Virtio Structure PCI Capabilities}\label{sec:Virtio Transport Option
 \item ISR Status
 \item Device-specific configuration (optional)
 \item PCI configuration access
+\item Miscellaneous configuration
 \end{itemize}
 
 Each structure can be mapped by a Base Address register (BAR) belonging to
@@ -771,6 +772,8 @@ \subsection{Virtio Structure PCI Capabilities}\label{sec:Virtio Transport Option
 #define VIRTIO_PCI_CAP_SHARED_MEMORY_CFG 8
 /* Vendor-specific data */
 #define VIRTIO_PCI_CAP_VENDOR_CFG        9
+/* Miscellaneous configuration */
+#define VIRTIO_PCI_CAP_MISC_CFG          10
 \end{lstlisting}
 
         Any other value is reserved for future use.
@@ -1352,6 +1355,32 @@ \subsubsection{PCI configuration access capability}\label{sec:Virtio Transport O
 specified by some other Virtio Structure PCI Capability
 of type other than \field{VIRTIO_PCI_CAP_PCI_CFG}.
 
+\subsubsection{Miscellaneous configuration structure layout}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Miscellaneous configuration structure layout}
+
+The miscellaneous configuration structure is found at the bar and offset within the VIRTIO_PCI_CAP_MISC_CFG capability.
+Its layout is below.
+\begin{lstlisting}
+struct virtio_pci_misc_cfg {
+        le16 admin_queue_index;         /* read-only for driver */
+};
+\end{lstlisting}
+
+\begin{description}
+\item[\field{admin_queue_index}]
+        The device uses this to report the index of the admin virtqueue.
+        This field is valid only if VIRTIO_F_ADMIN_VQ is set.
+\end{description}
+
+\devicenormative{\paragraph}{Miscellaneous configuration structure layout}{Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Miscellaneous configuration structure layout}
+The device MUST present VIRTIO_PCI_CAP_MISC_CFG capability when VIRTIO_F_ADMIN_VQ is set.
+
+The device MUST present a valid \field{admin_queue_index} when VIRTIO_F_ADMIN_VQ is set.
+
+\drivernormative{\paragraph}{Miscellaneous configuration structure layout}{Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Miscellaneous configuration structure layout}
+The driver MUST NOT proceed with configuring the admin virtqueue in case VIRTIO_F_ADMIN_VQ is set and VIRTIO_PCI_CAP_MISC_CFG capability is not present.
+
+The driver MUST use the value of \field{admin_queue_index} to configure the admin virtqueue. For more details on virtqueue configuration see section \ref{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI-specific Initialization And Device Operation / Device Initialization / Virtqueue Configuration}.
+
 \subsubsection{Legacy Interfaces: A Note on PCI Device Layout}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Legacy Interfaces: A Note on PCI Device Layout}
 
 Transitional devices MUST present part of configuration
-- 
2.21.0
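
Reader's note, not part of the patch: a C sketch of the driver-side rules in
the normative text above. The function name and return codes are
hypothetical, and misc_cfg is assumed to point at the already mapped bytes of
struct virtio_pci_misc_cfg, or to be NULL when no VIRTIO_PCI_CAP_MISC_CFG
capability was found. Locating and mapping the capability is out of scope.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VIRTIO_F_ADMIN_VQ       41
#define VIRTIO_PCI_CAP_MISC_CFG 10 /* cfg_type of the new capability */

/*
 * Read admin_queue_index (le16 at offset 0 of the misc structure).
 * Returns 0 and stores the index on success,
 *        -1 if VIRTIO_F_ADMIN_VQ was not negotiated (field not valid),
 *        -2 if the capability is absent (driver MUST NOT configure the queue).
 */
static int admin_vq_index(uint64_t negotiated_features, const uint8_t *misc_cfg,
                          uint16_t *index)
{
    assert(index != NULL);
    if (!((negotiated_features >> VIRTIO_F_ADMIN_VQ) & 1))
        return -1;
    if (misc_cfg == NULL)
        return -2;
    *index = (uint16_t)(misc_cfg[0] | (misc_cfg[1] << 8));
    return 0;
}
```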


* [PATCH v5 6/7] Introduce MGMT admin commands
  2022-04-26 22:58 [PATCH v5 0/7] Introduce device group and device management Max Gurtovoy
                   ` (4 preceding siblings ...)
  2022-04-26 22:58 ` [PATCH v5 5/7] Add miscellaneous configuration structure for PCI Max Gurtovoy
@ 2022-04-26 22:58 ` Max Gurtovoy
  2022-05-15 14:37   ` Michael S. Tsirkin
  2022-04-26 22:58 ` [PATCH v5 7/7] RFC: add initial support for configuring feature bits Max Gurtovoy
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 103+ messages in thread
From: Max Gurtovoy @ 2022-04-26 22:58 UTC (permalink / raw)
  To: jasowang, virtio-comment, mst, cohuck, virtio-dev
  Cc: oren, parav, shahafs, aadam, virtio, Max Gurtovoy

Introduce the concept of a management and a managed device and add
an example of using this concept to manage resources.

A management device supports the VIRTIO_ADMIN_DEVICE_MGMT and
VIRTIO_ADMIN_DEVICE_MGMT_ATTRS admin commands to manage some resources
of a managed device.

A typical cloud provider SR-IOV use case is to create many VFs for use
by guest VMs. The VFs may not be assigned to a VM until a user requests
a VM of a certain size, e.g., number of CPUs. A VF may need MSI-X
vectors proportional to the number of CPUs in the VM, but there is no
standard way today in the spec to change the number of MSI-X vectors
supported by a VF, although there are some operating systems that
support this.

The new admin mechanism manages the MSI-X interrupt vector assignments
of a managed PCI device (i.e. a VF) by its management device (i.e. its
parent PF), but can easily be extended to any other generic resource
management.

Reviewed-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
---
 admin.tex        | 132 +++++++++++++++++++++++++++++++++++++++++++++--
 content.tex      |  81 +++++++++++++++++++++++++++++
 introduction.tex |  32 +++++++++++-
 3 files changed, 241 insertions(+), 4 deletions(-)
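
Illustration only, not part of the commit: a sketch of a driver-side
sanity check that could run before issuing an "assign" operation, using
the three fields returned by VIRTIO_ADMIN_DEVICE_MGMT_ATTRS. The struct
below is a hypothetical host-endian copy of the result, and the policy
itself (including rejecting a count of zero and treating the difference
between the total and assigned counts as the free pool) is an assumption.
The device remains the authority and may still fail the command.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* attrs_mask bits as listed in the patch; macro names are hypothetical. */
#define ATTR_VFS_TOTAL_MSIX     (1ULL << 0)
#define ATTR_VFS_ASSIGNED_MSIX  (1ULL << 1)
#define ATTR_PER_VF_MAX_MSIX    (1ULL << 2)

/* Hypothetical host-endian copy of the MGMT_ATTRS result fields. */
struct mgmt_attrs {
        uint64_t attrs_mask;
        uint32_t vfs_total_msix_count;
        uint32_t vfs_assigned_msix_count;
        uint16_t per_vf_max_msix_count;
};

/* Returns true if changing one VF from 'current' to 'wanted' vectors
 * looks feasible according to the reported attributes. */
static bool msix_request_fits(const struct mgmt_attrs *a,
                              uint32_t current, uint32_t wanted)
{
        const uint64_t need = ATTR_VFS_TOTAL_MSIX | ATTR_VFS_ASSIGNED_MSIX |
                              ATTR_PER_VF_MAX_MSIX;

        /* All three fields must have been returned by the device. */
        if ((a->attrs_mask & need) != need)
                return false;
        /* Assumed: the 1-based count makes zero an invalid request. */
        if (wanted == 0 || wanted > a->per_vf_max_msix_count)
                return false;
        /* Shrinking or keeping the assignment never needs free vectors. */
        if (wanted <= current)
                return true;
        if (a->vfs_assigned_msix_count > a->vfs_total_msix_count)
                return false;
        return wanted - current <=
               a->vfs_total_msix_count - a->vfs_assigned_msix_count;
}
```
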

diff --git a/admin.tex b/admin.tex
index d09683d..5b54743 100644
--- a/admin.tex
+++ b/admin.tex
@@ -79,12 +79,20 @@ \section{Administration command set}\label{sec:Basic Facilities of a Virtio Devi
 \hline
 0001h   & VIRTIO_ADMIN_DEVICE_CAPS_ACCEPT    & M  \\
 \hline
-0002h - 7FFFh   & Generic admin cmds    & -  \\
+0002h   & VIRTIO_ADMIN_DEVICE_MGMT    & O  \\
+\hline
+0003h   & VIRTIO_ADMIN_DEVICE_MGMT_ATTRS    & O  \\
+\hline
+0004h - 7FFFh   & Generic admin cmds    & -  \\
 \hline
 8000h - FFFFh   & Reserved    & - \\
 \hline
 \end{tabular}
 
+\begin{note}
+{The following commands are mandatory for management devices: VIRTIO_ADMIN_DEVICE_MGMT and VIRTIO_ADMIN_DEVICE_MGMT_ATTRS.}
+\end{note}
+
 \subsection{VIRTIO ADMIN DEVICE CAPS IDENTIFY command}\label{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE CAPS IDENTIFY command}
 
 The VIRTIO_ADMIN_DEVICE_CAPS_IDENTIFY command has no command specific data set by the driver.
@@ -102,13 +110,20 @@ \subsection{VIRTIO ADMIN DEVICE CAPS IDENTIFY command}\label{sec:Basic Facilitie
        le64 attrs_mask;
        /* This field indicates which of the below admin
         * capabilities are supported by the device:
-        * Bits 0 - 63 - reserved for future capabilities.
+        * Bit 0 - if set, the device is a management device
+        * Bit 1 - if set, the device is a type 1 management device that supports
+        *         MSI-X vector mgmt of its type 1 managed devices
+        * Bits 2 - 63 - reserved for future capabilities.
         */
        le64 device_admin_caps;
        u8 reserved[112];
 };
 \end{lstlisting}
 
+\begin{note}
+{For more details on MSI-X vector management support see section \ref{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI-specific Admin command set / MSI-X vector management}.}
+\end{note}
+
 \subsection{VIRTIO ADMIN DEVICE CAPS ACCEPT command}\label{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE CAPS ACCEPT command}
 
 The VIRTIO_ADMIN_DEVICE_CAPS_ACCEPT command is used by the driver to acknowledge those admin capabilities it understands and wishes to use.
@@ -125,13 +140,124 @@ \subsection{VIRTIO ADMIN DEVICE CAPS ACCEPT command}\label{sec:Basic Facilities
        le64 attrs_mask;
        /* This field indicates which of the below admin
         * capabilities are supported by the driver:
-        * Bits 0 - 63 - reserved for future capabilities.
+        * Bit 0 - if set, the driver accepted the device as a management device
+        * Bit 1 - if set, the driver accepted the device as a type 1 management device
+        *         that supports MSI-X vector mgmt of its type 1 managed devices
+        * Bits 2 - 63 - reserved for future capabilities.
         */
        le64 driver_admin_caps;
        u8 reserved[112];
 };
 \end{lstlisting}
 
+\subsection{VIRTIO ADMIN DEVICE MGMT command}\label{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE MGMT command}
+
+The VIRTIO_ADMIN_DEVICE_MGMT command is used by a management device to manage resources of managed virtio devices.
+The \field{command} is set to VIRTIO_ADMIN_DEVICE_MGMT by the driver.
+
+The command specific data set by the driver is of form:
+\begin{lstlisting}
+struct virtio_admin_device_mgmt_data {
+        /*
+         * 0 - reserved
+         * 1 - assign resource to the designated vdev_id
+         * 2 - query resource of the designated vdev_id
+         * 3 - 255 are reserved
+         */
+        u8 operation;
+        /*
+         * 0 - MSI-X vector
+         * 1 - 65535 are reserved
+         */
+        le16 resource;
+        /*
+         * The value to the given resource:
+         * if resource = 0 (MSI-X vector), it's a 1-based count.
+         */
+        le64 resource_val;
+        u8 reserved[5];
+};
+\end{lstlisting}
+
+The following table describes the command specific error codes:
+
+\begin{tabular}{|l|l|l|}
+\hline
+Opcode & Status & Description \\
+\hline \hline
+00h   & VIRTIO_ADMIN_CS_ERR_VDEV_IN_USE    & designated device is in use, operation failed   \\
+\hline
+01h   & VIRTIO_ADMIN_CS_RSC_VAL_INVALID    & resource value is invalid  \\
+\hline
+02h   & VIRTIO_ADMIN_CS_RSC_UNSUPPORTED    & unsupported or invalid resource  \\
+\hline
+03h   & VIRTIO_ADMIN_CS_OP_UNSUPPORTED    & unsupported or invalid operation  \\
+\hline
+04h - FFh   & Reserved    & -  \\
+\hline
+\end{tabular}
+
+The device, upon success, returns a result that describes the information according to the requested operation.
+This result is of form:
+\begin{lstlisting}
+struct virtio_admin_device_mgmt_result {
+        le64 resource_val;
+        u8 reserved[8];
+};
+\end{lstlisting}
+
+If the requested operation by the driver was "assign resource to the designated vdev_id", the device will return the resource_val of the assigned
+resources to the designated vdev_id. Upon success, this value should be equal to the \field{resource_val} of the virtio_admin_device_mgmt_data
+structure set by the driver. In case of a failure, the value of this field is undefined and will be ignored by the driver.
+
+If the requested operation by the driver was "query resource of the designated vdev_id", the device will return resource_val of the currently assigned
+resources to the designated vdev_id upon success. In case of a failure, the value of this field is undefined and will be ignored by the driver.
+
+\begin{note}
+{MSI-X vector resource type is valid only for PCI devices. VIRTIO_ADMIN_CS_RSC_UNSUPPORTED error is
+returned by the device when the designated vdev_id is not a PCI device.}
+\end{note}
+
+\begin{note}
+{For this command, if driver is setting \field{resource} to MSI-X vector type, the \field{vdev_id} can't be associated with a Virtual Function with
+VF index greater than NumVFs value as defined in the PCI specification or smaller than 1. An error is returned by the device when \field{vdev_id} is out of the range.}
+\end{note}
+
+\subsection{VIRTIO ADMIN DEVICE MGMT ATTRS command}\label{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE MGMT ATTRS command}
+
+The VIRTIO_ADMIN_DEVICE_MGMT_ATTRS command has no command specific data set by the driver.
+The \field{command} is set to VIRTIO_ADMIN_DEVICE_MGMT_ATTRS.
+
+The device, upon success, returns a result that describes the management device attributes.
+This result is of form:
+\begin{lstlisting}
+struct virtio_admin_device_mgmt_attrs_result {
+        /* Indicates which of the below fields were returned
+         * (1 means that field was returned):
+         * Bit 0 - vfs_total_msix_count
+         * Bit 1 - vfs_assigned_msix_count
+         * Bit 2 - per_vf_max_msix_count
+         * Bits 3 - 63 - reserved for future fields
+         */
+        le64 attrs_mask;
+
+        /* Total number of msix vectors for the total number of VFs */
+        le32 vfs_total_msix_count;
+        /* Assigned number of msix vectors for the enabled VFs */
+        le32 vfs_assigned_msix_count;
+        /* Max number of msix vectors that can be assigned for a single VF */
+        le16 per_vf_max_msix_count;
+
+        u8 reserved[110];
+};
+\end{lstlisting}
+
+\begin{note}
+{The \field{vfs_total_msix_count}, \field{vfs_assigned_msix_count} and \field{per_vf_max_msix_count} are returned by the device if the
+designated vdev_id is a management device that can allocate/deallocate MSI-X resources for PCI VFs devices. Otherwise,
+the associated bits in \field{attrs_mask} are zeroed by the device.}
+\end{note}
+
 \section{Admin Virtqueues}\label{sec:Basic Facilities of a Virtio Device / Admin Virtqueues}
 
 An admin virtqueue is a management interface of a device that can be used to send administrative
diff --git a/content.tex b/content.tex
index 0c1d44f..81e5850 100644
--- a/content.tex
+++ b/content.tex
@@ -451,6 +451,18 @@ \section{Exporting Objects}\label{sec:Basic Facilities of a Virtio Device / Expo
 
 \input{admin.tex}
 
+\section{Device management}\label{sec:Basic Facilities of a Virtio Device / Device management}
+
+A device group might consist of one or more virtio devices. For example, virtio PCI SR-IOV PF and its VFs compose a type 1 device group.
+A capable PCI SR-IOV PF virtio device might act as the management device in this group, and its PCI SR-IOV VFs are the managed devices.
+A management device might have various management capabilities and attributes to manage its managed devices. The capabilities are exposed
+in the result of VIRTIO_ADMIN_DEVICE_CAPS_IDENTIFY command (see section \ref{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE CAPS IDENTIFY command}
+for more details) and the attributes are exposed in the result of VIRTIO_ADMIN_DEVICE_MGMT_ATTRS command
+(see section \ref{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE MGMT ATTRS command} for more details).
+
+The management device will use the VIRTIO_ADMIN_DEVICE_MGMT admin command to manage its managed devices (see section
+\ref{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE MGMT command} for more details).
+
 \chapter{General Initialization And Device Operation}\label{sec:General Initialization And Device Operation}
 
 We start with an overview of device initialization, then expand on the
@@ -1763,6 +1775,75 @@ \subsubsection{Driver Handling Interrupts}\label{sec:Virtio Transport Options /
     \end{itemize}
 \end{itemize}
 
+\subsection{PCI-specific Admin capabilities}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI-specific Admin capabilities}
+
+This section documents the group of admin capabilities for PCI virtio devices. Each capability is
+implemented using one or more Admin commands.
+
+\subsubsection{MSI-X vector management}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI-specific Admin command set / MSI-X vector management}
+
+This capability enables a virtio management device to control the assignment of MSI-X interrupt vectors
+for its managed devices. In PCI, a management device can be the PF device and the managed device can be the VF (for example in a type 1 device group).
+Capable management devices will need to implement VIRTIO_ADMIN_DEVICE_MGMT and VIRTIO_ADMIN_DEVICE_MGMT_ATTRS admin commands, report the MSI-X attributes in the result of
+VIRTIO_ADMIN_DEVICE_MGMT_ATTRS and report that MSI-X vector resource management is supported in the result of VIRTIO_ADMIN_DEVICE_CAPS_IDENTIFY admin command.
+See sections \ref{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE CAPS IDENTIFY command} and
+\ref{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE MGMT ATTRS command} for more details.
+
+In the result of VIRTIO_ADMIN_DEVICE_MGMT_ATTRS admin command, a capable management device will return the total number of
+msix vectors for its VFs in \field{vfs_total_msix_count} field, the number of already assigned msix vectors for its VFs in
+\field{vfs_assigned_msix_count} field and also the maximal number of msix vectors that can be assigned for a single VF in
+\field{per_vf_max_msix_count} field. In addition, bit 0, bit 1 and bit 2 of the \field{attrs_mask} field of the result buffer are set to indicate
+the validity of these 3 fields.
+See section \ref{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE MGMT ATTRS command} for more details.
+
+The default assignment of the MSI-X vectors for managed devices is out of the scope of this specification.
+A driver can use VIRTIO_ADMIN_DEVICE_MGMT to update the MSI-X assignment for a specific managed device.
+In the data of the VIRTIO_ADMIN_DEVICE_MGMT admin command, the driver sets the \field{resource} type to MSI-X vector and sets \field{resource_val} to the
+number of MSI-X interrupt vectors to assign to the designated managed device. The managed device id is set in the \field{vdev_id} field.
+
+A successful operation guarantees that the requested amount of MSI-X interrupt vectors was assigned to the designated device.
+This value is also returned in the virtio_admin_device_mgmt_result structure.
+Also, a successful operation guarantees that the MSI-X capability of the designated PCI device, as defined by the PCI specification, reflects
+the new configuration in all relevant fields. For example, if the PCI VF has been assigned 4 MSI-X vectors by default and VIRTIO_ADMIN_DEVICE_MGMT
+increases the MSI-X vectors to 8, then reading the Table Size field of the MSI-X Message Control register will return a value of 7.
+
+It is beyond the scope of the virtio specification to define necessary synchronization in system software to ensure that a virtio PCI VF device
+interrupt configuration modification is reflected in the PCI device. However, it is expected that any modern system software implementing virtio
+drivers and PCI subsystem will ensure that any change to the VF interrupt configuration is either reflected in the PCI VF device or
+fails. For example, one way to implement that is to make sure that there is no driver bound to the virtio PCI SR-IOV VF during
+this operation.
+
+To query the number of MSI-X interrupt vectors currently assigned to a managed device, the driver issues VIRTIO_ADMIN_DEVICE_MGMT with \field{operation} set to
+"query resource of the designated vdev_id" value (== 2). The driver also sets the \field{resource} type to MSI-X vector and sets the managed device id in the \field{vdev_id}
+field. In the result of a successful operation, the number of MSI-X interrupt vectors currently assigned to the designated managed device is
+returned by the device in the \field{resource_val} field of the virtio_admin_device_mgmt_result structure.
+See section \ref{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE MGMT command} for more details.
+
+\paragraph{MSI-X configuration sequence example}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI-specific Admin command set / VF MSI-X control / MSI-X configuration sequence example }
+
+A typical sequence for configuring MSI-X vectors for PCI VFs using the MSI-X vector management mechanism is as follows:
+
+\begin{enumerate}
+\item Ensure that VF driver doesn't run and it is safe to change MSI-X (e.g. disable sriov auto probing)
+
+\item Load the PF driver
+
+\item Enable SR-IOV by following the PCI specification
+
+\item Query the management device capabilities using commands VIRTIO_ADMIN_DEVICE_CAPS_IDENTIFY and VIRTIO_ADMIN_DEVICE_MGMT_ATTRS
+
+\item Find the managed VF vdev_id (for type 1 device group the vdev_id of PCI VF is equal to vf number)
+
+\item Query the VF MSI-X configuration using command VIRTIO_ADMIN_DEVICE_MGMT (query operation)
+
+\item Assign desired MSI-X configuration for the VF using command VIRTIO_ADMIN_DEVICE_MGMT (assign operation)
+
+\item After successful completion of the assignment, load the VF driver
+
+\item Assign the VF to a VM
+
+\end{enumerate}
+
 \section{Virtio Over MMIO}\label{sec:Virtio Transport Options / Virtio Over MMIO}
 
 Virtual environments without PCI support (a common situation in
diff --git a/introduction.tex b/introduction.tex
index 4358ab1..bfc5498 100644
--- a/introduction.tex
+++ b/introduction.tex
@@ -164,9 +164,39 @@ \subsection{Device group}\label{sec:Introduction / Terminology / Device group}
 For now, the supported device groups are:
 \begin{enumerate}
 \item Type 1 - A virtio PCI SR-IOV physical function (PF) and its PCI SR-IOV virtual functions (VFs). For this group type, the PF device has vdev_id that is equal to 0
-and the VF devices have vdev_id's that are equal to their vf_number (according to the PCI SR-IOV specification).
+and the VF devices have vdev_id's that are equal to their vf_number (according to the PCI SR-IOV specification). A PCI SR-IOV PF device can act as a management device for
+type 1 group. A PCI SR-IOV VF device can act as a managed device for type 1 group (see \ref{sec:Introduction / Terminology / Virtio management device} and
+\ref{sec:Introduction / Terminology / Virtio managed device} for more information).
 \end{enumerate}
 
+\subsection{Virtio management device}\label{sec:Introduction / Terminology / Virtio management device}
+
+A virtio device that supports VIRTIO_ADMIN_DEVICE_MGMT and VIRTIO_ADMIN_DEVICE_MGMT_ATTRS admin commands (see
+\ref{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE MGMT command} and
+\ref{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE MGMT ATTRS command} for more information).
+This device can manage a virtio managed device. A device group may contain zero or more management devices.
+
+A PCI SR-IOV Physical Function based virtio device is an example of a possible virtio management device (for type 1 device group).
+
+\subsection{Virtio type 1 management device}\label{sec:Introduction / Terminology / Virtio type 1 management device}
+
+A virtio management device for type 1 device group. This device is a PCI SR-IOV PF that can set \field{dst_type} to 1 (other virtio device in the same device group),
+and set \field{vdev_id} to an id that corresponds with one of its managed virtio devices (PCI SR-IOV VFs) for the VIRTIO_ADMIN_DEVICE_MGMT admin command.
+
+A type 1 device group may contain zero or one management devices.
+
+\subsection{Virtio managed device}\label{sec:Introduction / Terminology / Virtio managed device}
+
+A virtio device that can be managed by a virtio management device.
+A device group may contain zero or more managed devices.
+
+A PCI SR-IOV Virtual Function based virtio device is an example of a possible virtio managed device (for type 1 group).
+
+\subsection{Virtio type 1 managed device}\label{sec:Introduction / Terminology / Virtio type 1 managed device}
+
+A virtio managed device for type 1 device group. This device is a PCI SR-IOV VF and is managed by a virtio type 1 management device (virtio PCI SR-IOV PF).
+It is implied that all the virtio PCI SR-IOV VFs related to a virtio PCI SR-IOV PF that is virtio type 1 management device are type 1 managed devices.
+
 \section{Structure Specifications}\label{sec:Structure Specifications}
 
 Many device and driver in-memory structure layouts are documented using
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 103+ messages in thread

* [PATCH v5 7/7] RFC: add initial support for configuring feature bits
  2022-04-26 22:58 [PATCH v5 0/7] Introduce device group and device management Max Gurtovoy
                   ` (5 preceding siblings ...)
  2022-04-26 22:58 ` [PATCH v5 6/7] Introduce MGMT admin commands Max Gurtovoy
@ 2022-04-26 22:58 ` Max Gurtovoy
  2022-05-15 14:38   ` Michael S. Tsirkin
  2022-05-15 15:27 ` [PATCH v5 0/7] Introduce device group and device management Michael S. Tsirkin
  2022-07-05 13:56 ` Michael S. Tsirkin
  8 siblings, 1 reply; 103+ messages in thread
From: Max Gurtovoy @ 2022-04-26 22:58 UTC (permalink / raw)
  To: jasowang, virtio-comment, mst, cohuck, virtio-dev
  Cc: oren, parav, shahafs, aadam, virtio, Max Gurtovoy

After adding the concept of a management and a managed device, add
another example of using this concept to manage resources.

Today there is no standard definition in the spec that allows a user to
set up specific feature bits of a virtio device.

For that, extend the management mechanism to allow a management device
to change the feature bits of its managed devices.

Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
---
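Illustration only, not part of the commit: a sketch of how a driver
might build the \field{resource_val} bitmap for the new resource type 1
(device feature bits 0 to 63). The resource values come from the patch
text. The identifier names and the helper are hypothetical. Feature bit
numbers above 63 are rejected because this resource type only covers the
first 64 bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Resource values from the patch text; the names are hypothetical. */
enum { MGMT_RSC_MSIX_VECTOR = 0, MGMT_RSC_FEATURE_BITS_0_63 = 1 };

/* Build a bitmap from a list of feature bit numbers.
 * Returns 0 on success, -1 if any bit number is outside 0..63. */
static int feature_bitmap(const unsigned int *bits, size_t n, uint64_t *out)
{
        uint64_t map = 0;

        for (size_t i = 0; i < n; i++) {
                if (bits[i] > 63)
                        return -1;
                map |= 1ULL << bits[i];
        }
        *out = map;
        return 0;
}
```

For example, passing the bit numbers 0 and 32 (bit 32 is
VIRTIO_F_VERSION_1) yields the bitmap 0x100000001, which would then be
placed in \field{resource_val} with \field{resource} set to 1.
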
 admin.tex | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/admin.tex b/admin.tex
index 5b54743..43106ba 100644
--- a/admin.tex
+++ b/admin.tex
@@ -113,7 +113,9 @@ \subsection{VIRTIO ADMIN DEVICE CAPS IDENTIFY command}\label{sec:Basic Facilitie
         * Bit 0 - if set, the device is a management device
         * Bit 1 - if set, the device is a type 1 management device that supports
         *         MSI-X vector mgmt of its type 1 managed devices
-        * Bits 2 - 63 - reserved for future capabilities.
+        * Bit 2 - if set, the device is a type 1 management device that supports
+        *         feature mgmt of bits 0 to 63 for its type 1 managed devices
+        * Bits 3 - 63 - reserved for future capabilities.
         */
        le64 device_admin_caps;
        u8 reserved[112];
@@ -143,7 +145,9 @@ \subsection{VIRTIO ADMIN DEVICE CAPS ACCEPT command}\label{sec:Basic Facilities
         * Bit 0 - if set, the driver accepted the device as a management device
         * Bit 1 - if set, the driver accepted the device as a type 1 management device
         *         that supports MSI-X vector mgmt of its type 1 managed devices
-        * Bits 2 - 63 - reserved for future capabilities.
+        * Bit 2 - if set, the driver accepted the device as a type 1 management device
+        *         that supports feature mgmt of bits 0 to 63 for its type 1 managed devices
+        * Bits 3 - 63 - reserved for future capabilities.
         */
        le64 driver_admin_caps;
        u8 reserved[112];
@@ -167,12 +171,14 @@ \subsection{VIRTIO ADMIN DEVICE MGMT command}\label{sec:Basic Facilities of a Vi
         u8 operation;
         /*
          * 0 - MSI-X vector
-         * 1 - 65535 are reserved
+         * 1 - Device feature bits 0 to 63
+         * 2 - 65535 are reserved
          */
         le16 resource;
         /*
          * The value to the given resource:
          * if resource = 0 (MSI-X vector), it's a 1-based count.
+         * if resource = 1 (Device feature bits 0 to 63), it's a feature bitmap.
          */
         le64 resource_val;
         u8 reserved[5];
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 103+ messages in thread

* Re: [PATCH v5 6/7] Introduce MGMT admin commands
  2022-04-26 22:58 ` [PATCH v5 6/7] Introduce MGMT admin commands Max Gurtovoy
@ 2022-05-15 14:37   ` Michael S. Tsirkin
  2022-05-16 21:47     ` Parav Pandit
                       ` (2 more replies)
  0 siblings, 3 replies; 103+ messages in thread
From: Michael S. Tsirkin @ 2022-05-15 14:37 UTC (permalink / raw)
  To: Max Gurtovoy
  Cc: jasowang, virtio-comment, cohuck, virtio-dev, oren, parav,
	shahafs, aadam, virtio

On Wed, Apr 27, 2022 at 01:58:23AM +0300, Max Gurtovoy wrote:
> Introduce the concepts of a management device and a managed device, and
> add an example of using them to manage resources.
> 
> A management device supports the VIRTIO_ADMIN_DEVICE_MGMT and
> VIRTIO_ADMIN_DEVICE_MGMT_ATTRS admin commands to manage some resources
> of a managed device.
> 
> A typical cloud provider SR-IOV use case is to create many VFs for use
> by guest VMs. The VFs may not be assigned to a VM until a user requests
> a VM of a certain size, e.g., number of CPUs. A VF may need MSI-X
> vectors proportional to the number of CPUs in the VM, but there is no
> standard way today in the spec to change the number of MSI-X vectors
> supported by a VF, although there are some operating systems that
> support this.
> 
> The new admin mechanism lets a management device (i.e. the parent PF)
> manage the MSI-X interrupt vector assignment of a managed PCI device
> (i.e. a VF), and can easily be extended to manage any other generic
> resource.
> 
> Reviewed-by: Parav Pandit <parav@nvidia.com>
> Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>


I'd like to see management and the concept of type 1 group
in a separate patch from MSIX.

I am not sure MSIX things are ready but the grouping part looks mostly
ok to me.

> ---
>  admin.tex        | 132 +++++++++++++++++++++++++++++++++++++++++++++--
>  content.tex      |  81 +++++++++++++++++++++++++++++
>  introduction.tex |  32 +++++++++++-
>  3 files changed, 241 insertions(+), 4 deletions(-)
> 
> diff --git a/admin.tex b/admin.tex
> index d09683d..5b54743 100644
> --- a/admin.tex
> +++ b/admin.tex
> @@ -79,12 +79,20 @@ \section{Administration command set}\label{sec:Basic Facilities of a Virtio Devi
>  \hline
>  0001h   & VIRTIO_ADMIN_DEVICE_CAPS_ACCEPT    & M  \\
>  \hline
> -0002h - 7FFFh   & Generic admin cmds    & -  \\
> +0002h   & VIRTIO_ADMIN_DEVICE_MGMT    & O  \\
> +\hline
> +0003h   & VIRTIO_ADMIN_DEVICE_MGMT_ATTRS    & O  \\
> +\hline
> +0004h - 7FFFh   & Generic admin cmds    & -  \\
>  \hline
>  8000h - FFFFh   & Reserved    & - \\
>  \hline
>  \end{tabular}
>  
> +\begin{note}
> +{The following commands are mandatory for management devices: VIRTIO_ADMIN_DEVICE_MGMT and VIRTIO_ADMIN_DEVICE_MGMT_ATTRS.}
> +\end{note}
> +
>  \subsection{VIRTIO ADMIN DEVICE CAPS IDENTIFY command}\label{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE CAPS IDENTIFY command}
>  
>  The VIRTIO_ADMIN_DEVICE_CAPS_IDENTIFY command has no command specific data set by the driver.
> @@ -102,13 +110,20 @@ \subsection{VIRTIO ADMIN DEVICE CAPS IDENTIFY command}\label{sec:Basic Facilitie
>         le64 attrs_mask;
>         /* This field indicates which of the below admin
>          * capabilities are supported by the device:
> -        * Bits 0 - 63 - reserved for future capabilities.
> +        * Bit 0 - if set, the device is a management device
> +        * Bit 1 - if set, the device is a type 1 management device that supports
> +        *         MSI-X vector mgmt of its type 1 managed devices
> +        * Bits 2 - 63 - reserved for future capabilities.
>          */
>         le64 device_admin_caps;
>         u8 reserved[112];
>  };
>  \end{lstlisting}
>  
> +\begin{note}
> +{For more details on MSI-X vector management support see section \ref{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI-specific Admin command set / MSI-X vector management}.}
> +\end{note}
> +
>  \subsection{VIRTIO ADMIN DEVICE CAPS ACCEPT command}\label{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE CAPS ACCEPT command}
>  
>  The VIRTIO_ADMIN_DEVICE_CAPS_ACCEPT command is used by the driver to acknowledge those admin capabilities it understands and wishes to use.
> @@ -125,13 +140,124 @@ \subsection{VIRTIO ADMIN DEVICE CAPS ACCEPT command}\label{sec:Basic Facilities
>         le64 attrs_mask;
>         /* This field indicates which of the below admin
>          * capabilities are supported by the driver:
> -        * Bits 0 - 63 - reserved for future capabilities.
> +        * Bit 0 - if set, the driver accepted the device as a management device
> +        * Bit 1 - if set, the driver accepted the device as a type 1 management device
> +        *         that supports MSI-X vector mgmt of its type 1 managed devices
> +        * Bits 2 - 63 - reserved for future capabilities.
>          */
>         le64 driver_admin_caps;
>         u8 reserved[112];
>  };
>  \end{lstlisting}
>  
> +\subsection{VIRTIO ADMIN DEVICE MGMT command}\label{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE MGMT command}
> +
> +The VIRTIO_ADMIN_DEVICE_MGMT command is used by a management device to manage resources of managed virtio devices.
> +The \field{command} is set to VIRTIO_ADMIN_DEVICE_MGMT by the driver.
> +
> +The command specific data set by the driver is of form:
> +\begin{lstlisting}
> +struct virtio_admin_device_mgmt_data {
> +        /*
> +         * 0 - reserved
> +         * 1 - assign resource to the designated vdev_id
> +         * 2 - query resource of the designated vdev_id
> +         * 3 - 255 are reserved
> +         */
> +        u8 operation;
> +        /*
> +         * 0 - MSI-X vector
> +         * 1 - 65535 are reserved
> +         */
> +        le16 resource;
> +        /*
> +         * The value to the given resource:
> +         * if resource = 0 (MSI-X vector), it's a 1-based count.
> +         */
> +        le64 resource_val;
> +        u8 reserved[5];
> +};
> +\end{lstlisting}
> +
> +The following table describes the command specific error codes:
> +
> +\begin{tabular}{|l|l|l|}
> +\hline
> +Opcode & Status & Description \\
> +\hline \hline
> +00h   & VIRTIO_ADMIN_CS_ERR_VDEV_IN_USE    & designated device is in use, operation failed   \\
> +\hline
> +01h   & VIRTIO_ADMIN_CS_RSC_VAL_INVALID    & resource value is invalid  \\
> +\hline
> +02h   & VIRTIO_ADMIN_CS_RSC_UNSUPPORTED    & unsupported or invalid resource  \\
> +\hline
> +03h   & VIRTIO_ADMIN_CS_OP_UNSUPPORTED    & unsupported or invalid operation  \\
> +\hline
> +04h - FFh   & Reserved    & -  \\
> +\hline
> +\end{tabular}
> +
> +The device, upon success, returns a result that describes the information according to the requested operation.
> +This result is of form:
> +\begin{lstlisting}
> +struct virtio_admin_device_mgmt_result {
> +        le64 resource_val;
> +        u8 reserved[8];
> +};
> +\end{lstlisting}
> +
> +If the requested operation by the driver was "assign resource to the designated vdev_id", the device will return the resource_val of the assigned
> +resources to the designated vdev_id. Upon success, this value should be equal to the \field{resource_val} of the virtio_admin_device_mgmt_data
> +structure set by the driver. In case of a failure, the value of this field is undefined and will be ignored by the driver.
> +
> +If the requested operation by the driver was "query resource of the designated vdev_id", the device will return resource_val of the currently assigned
> +resources to the designated vdev_id upon success. In case of a failure, the value of this field is undefined and will be ignored by the driver.
> +
> +\begin{note}
> +{MSI-X vector resource type is valid only for PCI devices. VIRTIO_ADMIN_CS_RSC_UNSUPPORTED error is
> +returned by the device when the designated vdev_id is not a PCI device.}
> +\end{note}
> +
> +\begin{note}
> +{For this command, if driver is setting \field{resource} to MSI-X vector type, the \field{vdev_id} can't be associated with a Virtual Function with
> +VF index greater than NumVFs value as defined in the PCI specification or smaller than 1. An error is returned by the device when \field{vdev_id} is out of the range.}
> +\end{note}
> +
> +\subsection{VIRTIO ADMIN DEVICE MGMT ATTRS command}\label{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE MGMT ATTRS command}
> +
> +The VIRTIO_ADMIN_DEVICE_MGMT_ATTRS command has no command specific data set by the driver.
> +The \field{command} is set to VIRTIO_ADMIN_DEVICE_MGMT_ATTRS.
> +
> +The device, upon success, returns a result that describes the management device attributes.
> +This result is of form:
> +\begin{lstlisting}
> +struct virtio_admin_device_mgmt_attrs_result {
> +        /* Indicates which of the below fields were returned
> +         * (1 means that field was returned):
> +         * Bit 0 - vfs_total_msix_count
> +         * Bit 1 - vfs_assigned_msix_count
> +         * Bit 2 - per_vf_max_msix_count
> +         * Bits 3 - 63 - reserved for future fields
> +         */
> +        le64 attrs_mask;
> +
> +        /* Total number of msix vectors for the total number of VFs */
> +        le32 vfs_total_msix_count;
> +        /* Assigned number of msix vectors for the enabled VFs */
> +        le32 vfs_assigned_msix_count;
> +        /* Max number of msix vectors that can be assigned for a single VF */
> +        le16 per_vf_max_msix_count;
> +
> +        u8 reserved[110];
> +};
> +\end{lstlisting}
> +
> +\begin{note}
> +{The \field{vfs_total_msix_count}, \field{vfs_assigned_msix_count} and \field{per_vf_max_msix_count} are returned by the device if the
> +designated vdev_id is a management device that can allocate/deallocate MSI-X resources for PCI VFs devices. Otherwise,
> +the associated bits in \field{attrs_mask} are zeroed by the device.}
> +\end{note}
> +
>  \section{Admin Virtqueues}\label{sec:Basic Facilities of a Virtio Device / Admin Virtqueues}
>  
>  An admin virtqueue is a management interface of a device that can be used to send administrative
> diff --git a/content.tex b/content.tex
> index 0c1d44f..81e5850 100644
> --- a/content.tex
> +++ b/content.tex
> @@ -451,6 +451,18 @@ \section{Exporting Objects}\label{sec:Basic Facilities of a Virtio Device / Expo
>  
>  \input{admin.tex}
>  
> +\section{Device management}\label{sec:Basic Facilities of a Virtio Device / Device management}
> +
> +A device group might consist of one or more virtio devices. For example, virtio PCI SR-IOV PF and its VFs compose a type 1 device group.
> +A capable PCI SR-IOV PF virtio device might act as the management device in this group, and its PCI SR-IOV VFs are the managed devices.
> +A management device might have various management capabilities and attributes to manage its managed devices.

This makes my eyes glaze over.
Please, find all instances which say "manage" more than once and
rephrase.

> The capabilities exposed
> +in the result of VIRTIO_ADMIN_DEVICE_CAPS_IDENTIFY command (see section \ref{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE CAPS IDENTIFY command}
> +for more details) and the attributes exposed in the result of VIRTIO_ADMIN_DEVICE_MGMT_ATTRS command
> +(see section \ref{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE MGMT ATTRS command} for more details).
> +
> +The management device will use the VIRTIO_ADMIN_DEVICE_MGMT admin command to manage its managed devices (see section
> +\ref{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE MGMT command} for more details).
> +
>  \chapter{General Initialization And Device Operation}\label{sec:General Initialization And Device Operation}
>  
>  We start with an overview of device initialization, then expand on the
> @@ -1763,6 +1775,75 @@ \subsubsection{Driver Handling Interrupts}\label{sec:Virtio Transport Options /
>      \end{itemize}
>  \end{itemize}
>  
> +\subsection{PCI-specific Admin capabilities}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI-specific Admin capabilities}
> +
> +This documents the group of admin capabilities for PCI virtio devices. Each capability is
> +implemented using one or more Admin commands.
> +
> +\subsubsection{MSI-X vector management}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI-specific Admin command set / MSI-X vector management}
> +
> +This capability enables a virtio management device to control the assignment of MSI-X interrupt vectors
> +for its managed devices. In PCI, a management device can be the PF device and the managed device can be the VF (for example in a type 1 device group).
> +Capable management devices will need to implement VIRTIO_ADMIN_DEVICE_MGMT and VIRTIO_ADMIN_DEVICE_MGMT_ATTRS admin commands, report the MSI-X attributes in the result of
> +VIRTIO_ADMIN_DEVICE_MGMT_ATTRS and report that MSI-X vector resource management is supported in the result of VIRTIO_ADMIN_DEVICE_CAPS_IDENTIFY admin command.
> +See sections \ref{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE CAPS IDENTIFY command} and
> +\ref{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE MGMT ATTRS command} for more details.
> +
> +In the result of VIRTIO_ADMIN_DEVICE_MGMT_ATTRS admin command, a capable management device will return the total number of
> +msix vectors for its VFs in \field{vfs_total_msix_count} field, the number of already assigned msix vectors for its VFs in
> +\field{vfs_assigned_msix_count} field and also the maximal number of msix vectors that can be assigned for a single VF in
> +\field{per_vf_max_msix_count} field. In addition, bit 0, bit 1 and bit 2 are set to indicate on the validity of the other 3
> +fields in the \field{attrs_mask} field of the result buffer.
> +See section \ref{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE MGMT ATTRS command} for more details.
> +
> +The default assignment of the MSI-X vectors for managed devices is out of the scope of this specification.
> +A driver, using VIRTIO_ADMIN_DEVICE_MGMT can update the MSI-X assignment for a specific managed device.
> +In the data of VIRTIO_ADMIN_DEVICE_MGMT admin command, a driver set the \field{resource} type to be MSI-X vector and the
> +amount of MSI-X interrupt vectors to configure to the designated managed device in \field{resource_val}. The managed device id is set to \field{vdev_id} field.
> +
> +A successful operation guarantees that the requested amount of MSI-X interrupt vectors was assigned to the designated device.
> +This value is also returned in the virtio_admin_device_mgmt_result structure.
> +Also, a successful operation guarantees that the MSI-X capability access by the designated PCI device defined by the PCI specification must reflect
> +the new configuration in all relevant fields. For example, by default if the PCI VF has been assigned 4 MSI-X vectors, and VIRTIO_ADMIN_DEVICE_MGMT
> +increases the MSI-X vectors to 8. On this change, reading Table size field of the MSI-X message control register will reflect a value of 7.
> +
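The "8 vectors reads back as 7" example follows from the N-1 encoding of the MSI-X Table Size field (bits 10:0 of Message Control) in the PCI specification. A small sketch of that arithmetic, with macro and function names made up for illustration:

```c
#include <stdint.h>

/* Table Size occupies bits 10:0 of the MSI-X Message Control register. */
#define MSIX_TABLE_SIZE_MASK 0x07FFu

/*
 * Encode a vector count (1..2048) as the Table Size field value (N - 1).
 * Returns -1 if the count cannot be represented.
 */
static int msix_table_size_field(unsigned int nvec)
{
    if (nvec < 1 || nvec > 2048)
        return -1;
    return (int)(nvec - 1);
}

/* Decode the number of vectors from a raw Message Control value. */
static unsigned int msix_vectors_from_ctrl(uint16_t msg_ctrl)
{
    return (unsigned int)(msg_ctrl & MSIX_TABLE_SIZE_MASK) + 1;
}
```

With these helpers, msix_table_size_field(8) gives 7, matching the value quoted in the patch text.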
> +It is beyond the scope of the virtio specification to define
> necessary synchronization in system software to ensure that a virtio
> PCI VF device interrupt configuration modification is reflected in
> the PCI device.

IMHO it is very much in scope of the specification. The scope of the
specification is to allow device interoperability and this very much
fits the bill.

> However, it is expected that any modern system software implementing
> virtio drivers and PCI subsystem will ensure that any changes
> occurring in the VF interrupt configuration is either updated in the
> PCI VF device or such configuration fails.

OK. Anything more? What exactly does "interrupt configuration" mean here?

> For example, one way to
> implement that is to make sure that there is no driver bounded to the
> virtio PCI SR-IOV VF during this operation.

bounded in what sense?

And why do you say VF? Is this command limited to type 1? You only
limit it to PCI above.

same elsewhere

> +
> +To query amount of MSI-X interrupt vectors that is currently assigned to a managed device, the driver issue VIRTIO_ADMIN_DEVICE_MGMT with \field{operation} set to

issues

Lots of grammar errors like this elsewhere; please find and correct them.

> +"query resource of the designated vdev_id" value (== 2). The driver also set the \field{resource} type to be MSI-X vector and the managed device id is set to \field{vdev_id}
> +field. In the result of a successful operation,

meaning "in case"?

> the amount of MSI-X interrupt vectors that is currently assigned to the designated managed device is
> +returned by the device in \field{resource_val} field of the virtio_admin_device_mgmt_result structure.
> +See section \ref{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE MGMT command} for more details.
> +
> +\paragraph{MSI-X configuration sequence example}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI-specific Admin command set / VF MSI-X control / MSI-X configuration sequence example }
> +
> +A typical sequence for configuring MSI-X vectors for PCI VFs using MSI-X vector management mechanism is following:

rephrase to simplify

The driver uses the following sequence for configuring MSI-X vectors
....



> +
> +\begin{enumerate}
> +\item Ensure that VF driver doesn't run and it is safe to change MSI-X (e.g. disable sriov auto probing)
> +
> +\item Load the PF driver
> +
> +\item Enable SR-IOV by following the PCI specification
> +
> +\item Query the management device capabilities using commands VIRTIO_ADMIN_DEVICE_IDENTIFY and VIRTIO_ADMIN_DEVICE_MGMT_ATTRS
> +
> +\item Find the managed VF vdev_id (for type 1 device group the vdev_id of PCI VF is equal to vf number)
> +
> +\item Query the VF MSI-X configuration using command VIRTIO_ADMIN_DEVICE_MGMT (query operation)
> +
> +\item Assign desired MSI-X configuration for the VF using command VIRTIO_ADMIN_DEVICE_MGMT (assign operation)
> +
> +\item After successful completion of the assignment, load the VF driver
> +
> +\item Assign the VF to a VM
> +
> +\end{enumerate}
> +
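The query and assign steps of the sequence above can be sketched against a mock management device. This is only an illustration under stated assumptions: the patch gives resource = 0 for MSI-X vectors and says that vdev_id equals the VF number in a type 1 group, but the operation enum values, the mock_pf structure, and the way the mock enforces the per-VF cap and the shared pool are invented here and are not defined by the quoted text.

```c
#include <stdint.h>

#define NUM_VFS         4   /* hypothetical size of the mock group */
#define RES_MSIX_VECTOR 0   /* resource = 0 (MSI-X vector), per the patch */

enum mgmt_op {              /* numeric values are placeholders */
    OP_ASSIGN,
    OP_QUERY
};

/* Mock management device: pool limits plus per-VF vector counts. */
struct mock_pf {
    uint32_t total;                 /* plays vfs_total_msix_count */
    uint16_t per_vf_max;            /* plays per_vf_max_msix_count */
    uint32_t vf_msix[NUM_VFS + 1];  /* indexed by vdev_id; slot 0 (PF) unused */
};

/* Sum of the vectors currently assigned to all VFs. */
static uint32_t assigned(const struct mock_pf *pf)
{
    uint32_t sum = 0;

    for (int i = 1; i <= NUM_VFS; i++)
        sum += pf->vf_msix[i];
    return sum;
}

/*
 * Returns 0 on success, -1 on error. For OP_QUERY the current count is
 * stored in *val; for OP_ASSIGN *val is the requested count.
 */
static int mock_device_mgmt(struct mock_pf *pf, enum mgmt_op op,
                            uint16_t vdev_id, uint16_t resource,
                            uint64_t *val)
{
    /* vdev_id must name a VF: 1..NUM_VFS in this mock. */
    if (vdev_id < 1 || vdev_id > NUM_VFS || resource != RES_MSIX_VECTOR)
        return -1;

    if (op == OP_QUERY) {
        *val = pf->vf_msix[vdev_id];
        return 0;
    }

    /* Assign: respect the per-VF cap and the shared pool (mock policy). */
    if (*val > pf->per_vf_max)
        return -1;
    if (assigned(pf) - pf->vf_msix[vdev_id] + *val > pf->total)
        return -1;
    pf->vf_msix[vdev_id] = (uint32_t)*val;
    return 0;
}
```

A caller would first query the VF, then assign the new count, and load the VF driver only after the assign call succeeds.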
>  \section{Virtio Over MMIO}\label{sec:Virtio Transport Options / Virtio Over MMIO}
>  
>  Virtual environments without PCI support (a common situation in
> diff --git a/introduction.tex b/introduction.tex
> index 4358ab1..bfc5498 100644
> --- a/introduction.tex
> +++ b/introduction.tex
> @@ -164,9 +164,39 @@ \subsection{Device group}\label{sec:Introduction / Terminology / Device group}
>  For now, the supported device groups are:
>  \begin{enumerate}
>  \item Type 1 - A virtio PCI SR-IOV physical function (PF) and its PCI SR-IOV virtual functions (VFs). For this group type, the PF device has vdev_id that is equal to 0
> -and the VF devices have vdev_id's that are equal to their vf_number (according to the PCI SR-IOV specification).
> +and the VF devices have vdev_id's that are equal to their vf_number (according to the PCI SR-IOV specification). A PCI SR-IOV PF device can act as a management device for
> +type 1 group. A PCI SR-IOV VF device can act as a managed device for type 1 group (see \ref{sec:Introduction / Terminology / Virtio management device} and
> +\ref{sec:Introduction / Terminology / Virtio managed device} for more information).
>  \end{enumerate}
>  
> +\subsection{Virtio management device}\label{sec:Introduction / Terminology / Virtio management device}
> +
> +A virtio device that supports VIRTIO_ADMIN_DEVICE_MGMT and VIRTIO_ADMIN_DEVICE_MGMT_ATTRS admin commands (see
> +\ref{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE MGMT command} and
> +\ref{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE MGMT ATTRS command} for more information).
> +This device can manage a virtio managed device. A device group may contain zero or more management devices.
> +
> +A PCI SR-IOV Physical Function based virtio device is an example of a possible virtio management device (for type 1 device group).
> +
> +\subsection{Virtio type 1 management device}\label{sec:Introduction / Terminology / Virtio type 1 management device}
> +
> +A virtio management device for type 1 device group. This device is a PCI SR-IOV PF that can set \field{dst_type} to 1 (other virtio device in the same device group),
> +and set \field{vdev_id} to an id that corresponds with one of its managed virtio devices (PCI SR-IOV VFs) for the VIRTIO_ADMIN_DEVICE_MGMT admin command.
> +
> +A type 1 device group may contain zero or one management devices.
> +
> +\subsection{virtio managed device}\label{sec:Introduction / Terminology / Virtio managed device}
> +
> +A virtio device that can be managed by a virtio management device.
> +A device group may contain zero or more managed devices.
> +
> +A PCI SR-IOV Virtual Function based virtio device is an example of a possible virtio managed device (for type 1 group).
> +
> +\subsection{virtio type 1 managed device}\label{sec:Introduction / Terminology / Virtio type 1 managed device}
> +
> +A virtio managed device for type 1 device group. This device is a PCI SR-IOV VF and is managed by a virtio type 1 management device (virtio PCI SR-IOV PF).
> +It is implied that all the virtio PCI SR-IOV VFs related to a virtio PCI SR-IOV PF that is virtio type 1 management device are type 1 managed devices.
> +
>  \section{Structure Specifications}\label{sec:Structure Specifications}
>  
>  Many device and driver in-memory structure layouts are documented using
> -- 
> 2.21.0



* Re: [PATCH v5 7/7] RFC: add initial support for configuring feature bits
  2022-04-26 22:58 ` [PATCH v5 7/7] RFC: add initial support for configuring feature bits Max Gurtovoy
@ 2022-05-15 14:38   ` Michael S. Tsirkin
  2022-05-18 15:31     ` Max Gurtovoy
  0 siblings, 1 reply; 103+ messages in thread
From: Michael S. Tsirkin @ 2022-05-15 14:38 UTC (permalink / raw)
  To: Max Gurtovoy
  Cc: jasowang, virtio-comment, cohuck, virtio-dev, oren, parav,
	shahafs, aadam, virtio

On Wed, Apr 27, 2022 at 01:58:24AM +0300, Max Gurtovoy wrote:
> After adding the concept of a management and a managed device, add
> another example of using this concept to manage resources.
> 
> Today there is no standard definition in the spec that allows user to
> setup specific feature bits of a virtio device.
> 
> For that, extend the management mechanism to allow management devices to
> change feature bits of its managed devices.
> 
> Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>


Please add more explanation here. E.g. I am guessing these are
host feature bits, right? How does the driver know which features are
ok to enable?

I would expect some description sections and conformance sections.


> ---
>  admin.tex | 12 +++++++++---
>  1 file changed, 9 insertions(+), 3 deletions(-)
> 
> diff --git a/admin.tex b/admin.tex
> index 5b54743..43106ba 100644
> --- a/admin.tex
> +++ b/admin.tex
> @@ -113,7 +113,9 @@ \subsection{VIRTIO ADMIN DEVICE CAPS IDENTIFY command}\label{sec:Basic Facilitie
>          * Bit 0 - if set, the device is a management device
>          * Bit 1 - if set, the device is a type 1 management device that supports
>          *         MSI-X vector mgmt of its type 1 managed devices
> -        * Bits 2 - 63 - reserved for future capabilities.
> +        * Bit 2 - if set, the device is a type 1 management device that supports
> +        *         feature mgmt of bits 0 to 63 for its type 1 managed devices
> +        * Bits 3 - 63 - reserved for future capabilities.
>          */
>         le64 device_admin_caps;
>         u8 reserved[112];
> @@ -143,7 +145,9 @@ \subsection{VIRTIO ADMIN DEVICE CAPS ACCEPT command}\label{sec:Basic Facilities
>          * Bit 0 - if set, the driver accepted the device as a management device
>          * Bit 1 - if set, the driver accepted the device as a type 1 management device
>          *         that supports MSI-X vector mgmt of its type 1 managed devices
> -        * Bits 2 - 63 - reserved for future capabilities.
> +        * Bit 2 - if set, the driver accepted the device as a type 1 management device
> +        *         that supports feature mgmt of bits 0 to 63 for its type 1 managed devices
> +        * Bits 3 - 63 - reserved for future capabilities.
>          */
>         le64 driver_admin_caps;
>         u8 reserved[112];
> @@ -167,12 +171,14 @@ \subsection{VIRTIO ADMIN DEVICE MGMT command}\label{sec:Basic Facilities of a Vi
>          u8 operation;
>          /*
>           * 0 - MSI-X vector
> -         * 1 - 65535 are reserved
> +         * 1 - Device feature bits 0 to 63
> +         * 2 - 65535 are reserved
>           */
>          le16 resource;
>          /*
>           * The value to the given resource:
>           * if resource = 0 (MSI-X vector), it's a 1-based count.
> +         * if resource = 1 (Device feature bits 0 to 63), it's a feature bitmap.
>           */
>          le64 resource_val;
>          u8 reserved[5];
> -- 
> 2.21.0



* Re: [PATCH v5 5/7] Add miscellaneous configuration structure for PCI
  2022-04-26 22:58 ` [PATCH v5 5/7] Add miscellaneous configuration structure for PCI Max Gurtovoy
@ 2022-05-15 14:49   ` Michael S. Tsirkin
  2022-06-01 14:46     ` Max Gurtovoy
  2022-05-15 14:57   ` Michael S. Tsirkin
  1 sibling, 1 reply; 103+ messages in thread
From: Michael S. Tsirkin @ 2022-05-15 14:49 UTC (permalink / raw)
  To: Max Gurtovoy
  Cc: jasowang, virtio-comment, cohuck, virtio-dev, oren, parav,
	shahafs, aadam, virtio

On Wed, Apr 27, 2022 at 01:58:22AM +0300, Max Gurtovoy wrote:
> This new structure will be used for adding new miscellaneous registers
> for a virtio device configuration layout.
> 
> For now, only admin_queue_index register is added. Admin virtqueue index
> does not depend on the device type. Hence, add a PCI capability to read
> the admin virtqueue index.
> 
> Reviewed-by: Parav Pandit <parav@nvidia.com>
> Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
> ---
>  conformance.tex |  2 ++
>  content.tex     | 29 +++++++++++++++++++++++++++++
>  2 files changed, 31 insertions(+)
> 
> diff --git a/conformance.tex b/conformance.tex
> index 3c7b7bc..c183581 100644
> --- a/conformance.tex
> +++ b/conformance.tex
> @@ -103,6 +103,7 @@ \section{Conformance Targets}\label{sec:Conformance / Conformance Targets}
>  \item \ref{drivernormative:Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / PCI configuration access capability}
>  \item \ref{drivernormative:Virtio Transport Options / Virtio Over PCI Bus / PCI-specific Initialization And Device Operation / Device Initialization / MSI-X Vector Configuration}
>  \item \ref{drivernormative:Virtio Transport Options / Virtio Over PCI Bus / PCI-specific Initialization And Device Operation / Notification of Device Configuration Changes}
> +\item \ref{drivernormative:Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Miscellaneous configuration structure layout}
>  \end{itemize}
>  
>  \conformance{\subsection}{MMIO Driver Conformance}\label{sec:Conformance / Driver Conformance / MMIO Driver Conformance}
> @@ -364,6 +365,7 @@ \section{Conformance Targets}\label{sec:Conformance / Conformance Targets}
>  \item \ref{devicenormative:Virtio Transport Options / Virtio Over PCI Bus / PCI-specific Initialization And Device Operation / Device Initialization / MSI-X Vector Configuration}
>  \item \ref{devicenormative:Virtio Transport Options / Virtio Over PCI Bus / PCI-specific Initialization And Device Operation / Used Buffer Notifications}
>  \item \ref{devicenormative:Virtio Transport Options / Virtio Over PCI Bus / PCI-specific Initialization And Device Operation / Notification of Device Configuration Changes}
> +\item \ref{devicenormative:Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Miscellaneous configuration structure layout}
>  \end{itemize}
>  
>  \conformance{\subsection}{MMIO Device Conformance}\label{sec:Conformance / Device Conformance / MMIO Device Conformance}
> diff --git a/content.tex b/content.tex
> index 163cb34..0c1d44f 100644
> --- a/content.tex
> +++ b/content.tex
> @@ -712,6 +712,7 @@ \subsection{Virtio Structure PCI Capabilities}\label{sec:Virtio Transport Option
>  \item ISR Status
>  \item Device-specific configuration (optional)
>  \item PCI configuration access
> +\item Miscellaneous configuration
>  \end{itemize}
>  
>  Each structure can be mapped by a Base Address register (BAR) belonging to
> @@ -771,6 +772,8 @@ \subsection{Virtio Structure PCI Capabilities}\label{sec:Virtio Transport Option
>  #define VIRTIO_PCI_CAP_SHARED_MEMORY_CFG 8
>  /* Vendor-specific data */
>  #define VIRTIO_PCI_CAP_VENDOR_CFG        9
> +/* Miscellaneous configuration */
> +#define VIRTIO_PCI_CAP_MISC_CFG          10
>  \end{lstlisting}
>  
>          Any other value is reserved for future use.
> @@ -1352,6 +1355,32 @@ \subsubsection{PCI configuration access capability}\label{sec:Virtio Transport O
>  specified by some other Virtio Structure PCI Capability
>  of type other than \field{VIRTIO_PCI_CAP_PCI_CFG}.
>  
> +\subsubsection{Miscellaneous configuration structure layout}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Miscellaneous configuration structure layout}
> +
> +The miscellaneous configuration structure is found at the bar and offset within the VIRTIO_PCI_CAP_MISC_CFG capability.

It is not very clear what is within what.

Simplify the sentence:

The VIRTIO_PCI_CAP_MISC_CFG capability specifies the location of the
miscellaneous configuration structure.

> +Its layout is below.

Whose layout? Also, try to avoid using "below".
Better:

The miscellaneous configuration structure has the following layout

> +\begin{lstlisting}
> +struct virtio_pci_misc_cfg {
> +        le16 admin_queue_index;         /* read-only for driver */
> +};
> +\end{lstlisting}
> +
> +\begin{description}
> +\item[\field{admin_queue_index}]
> +        The device uses this to report the index of the admin virtqueue.
> +        This field is valid only if VIRTIO_F_ADMIN_VQ is set.
> +\end{description}
> +
> +\devicenormative{\paragraph}{Miscellaneous configuration structure layout}{Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Miscellaneous configuration structure layout}
> +The device MUST present VIRTIO_PCI_CAP_MISC_CFG capability when VIRTIO_F_ADMIN_VQ is set.
> +
> +The device MUST present a valid \field{admin_queue_index} when VIRTIO_F_ADMIN_VQ is set.
> +

set meaning "offered"?

> +\drivernormative{\paragraph}{Miscellaneous configuration structure layout}{Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Miscellaneous configuration structure layout}
> +The driver MUST NOT proceed with configuring the admin virtqueue in case VIRTIO_F_ADMIN_VQ is set and VIRTIO_PCI_CAP_MISC_CFG capability is not present.

"set" is vague. Do you mean "negotiated"?

Also, in that case let's mandate that the driver does not negotiate
VIRTIO_F_ADMIN_VQ.

Also, if you are addressing this, address the reverse case:
what to do without VIRTIO_F_ADMIN_VQ but with the capability.
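
One possible way to resolve the two mismatch cases, sketched as a driver-side decision. This is an illustration only, not text from the patch: bit 41 is the VIRTIO_F_ADMIN_VQ number proposed in patch 4/7 of this series, while the enum, the function name and the choice to ignore a capability that appears without the feature are assumptions.

```c
#include <stdbool.h>
#include <stdint.h>

#define VIRTIO_F_ADMIN_VQ 41   /* feature bit number proposed in patch 4/7 */

enum admin_vq_action {
    ADMIN_VQ_USE,   /* negotiate the feature and read admin_queue_index */
    ADMIN_VQ_SKIP   /* do not negotiate; ignore the capability if present */
};

/*
 * Decide what to do from the offered feature bits and from whether a
 * VIRTIO_PCI_CAP_MISC_CFG capability was found during the PCI scan.
 */
static enum admin_vq_action admin_vq_decide(uint64_t device_features,
                                            bool misc_cap_present)
{
    bool offered = (device_features >> VIRTIO_F_ADMIN_VQ) & 1;

    return (offered && misc_cap_present) ? ADMIN_VQ_USE : ADMIN_VQ_SKIP;
}
```

Under this policy the admin virtqueue is only ever configured when both the feature is offered and the capability exists, which is the behaviour the normative text would need to spell out.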


> +
> +The driver MUST use the value of \field{admin_queue_index} to configure the admin virtqueue. For more details on virtqueue configuration see section \ref{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI-specific Initialization And Device Operation / Device Initialization / Virtqueue Configuration}.
> +
>  \subsubsection{Legacy Interfaces: A Note on PCI Device Layout}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Legacy Interfaces: A Note on PCI Device Layout}
>  
>  Transitional devices MUST present part of configuration
> -- 
> 2.21.0



* Re: [PATCH v5 5/7] Add miscellaneous configuration structure for PCI
  2022-04-26 22:58 ` [PATCH v5 5/7] Add miscellaneous configuration structure for PCI Max Gurtovoy
  2022-05-15 14:49   ` Michael S. Tsirkin
@ 2022-05-15 14:57   ` Michael S. Tsirkin
  2022-05-17 10:12     ` [virtio] " Cornelia Huck
  1 sibling, 1 reply; 103+ messages in thread
From: Michael S. Tsirkin @ 2022-05-15 14:57 UTC (permalink / raw)
  To: Max Gurtovoy
  Cc: jasowang, virtio-comment, cohuck, virtio-dev, oren, parav,
	shahafs, aadam, virtio

On Wed, Apr 27, 2022 at 01:58:22AM +0300, Max Gurtovoy wrote:
> This new structure will be used for adding new miscellaneous registers
> for a virtio device configuration layout.
> 
> For now, only admin_queue_index register is added. Admin virtqueue index
> does not depend on the device type. Hence, add a PCI capability to read
> the admin virtqueue index.
> 
> Reviewed-by: Parav Pandit <parav@nvidia.com>
> Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>


I guess we discussed this but I forgot. Why do we have this new
structure as opposed to just adding the value at the end of config
structure? I was kind of hoping that the structure can be
reused for CCW/MMIO and then we can add more use-cases with
new transport and device independent structures.

If we keep it transport specific I don't really understand why
it is useful ...

> ---
>  conformance.tex |  2 ++
>  content.tex     | 29 +++++++++++++++++++++++++++++
>  2 files changed, 31 insertions(+)
> 
> diff --git a/conformance.tex b/conformance.tex
> index 3c7b7bc..c183581 100644
> --- a/conformance.tex
> +++ b/conformance.tex
> @@ -103,6 +103,7 @@ \section{Conformance Targets}\label{sec:Conformance / Conformance Targets}
>  \item \ref{drivernormative:Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / PCI configuration access capability}
>  \item \ref{drivernormative:Virtio Transport Options / Virtio Over PCI Bus / PCI-specific Initialization And Device Operation / Device Initialization / MSI-X Vector Configuration}
>  \item \ref{drivernormative:Virtio Transport Options / Virtio Over PCI Bus / PCI-specific Initialization And Device Operation / Notification of Device Configuration Changes}
> +\item \ref{drivernormative:Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Miscellaneous configuration structure layout}
>  \end{itemize}
>  
>  \conformance{\subsection}{MMIO Driver Conformance}\label{sec:Conformance / Driver Conformance / MMIO Driver Conformance}
> @@ -364,6 +365,7 @@ \section{Conformance Targets}\label{sec:Conformance / Conformance Targets}
>  \item \ref{devicenormative:Virtio Transport Options / Virtio Over PCI Bus / PCI-specific Initialization And Device Operation / Device Initialization / MSI-X Vector Configuration}
>  \item \ref{devicenormative:Virtio Transport Options / Virtio Over PCI Bus / PCI-specific Initialization And Device Operation / Used Buffer Notifications}
>  \item \ref{devicenormative:Virtio Transport Options / Virtio Over PCI Bus / PCI-specific Initialization And Device Operation / Notification of Device Configuration Changes}
> +\item \ref{devicenormative:Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Miscellaneous configuration structure layout}
>  \end{itemize}
>  
>  \conformance{\subsection}{MMIO Device Conformance}\label{sec:Conformance / Device Conformance / MMIO Device Conformance}
> diff --git a/content.tex b/content.tex
> index 163cb34..0c1d44f 100644
> --- a/content.tex
> +++ b/content.tex
> @@ -712,6 +712,7 @@ \subsection{Virtio Structure PCI Capabilities}\label{sec:Virtio Transport Option
>  \item ISR Status
>  \item Device-specific configuration (optional)
>  \item PCI configuration access
> +\item Miscellaneous configuration
>  \end{itemize}
>  
>  Each structure can be mapped by a Base Address register (BAR) belonging to
> @@ -771,6 +772,8 @@ \subsection{Virtio Structure PCI Capabilities}\label{sec:Virtio Transport Option
>  #define VIRTIO_PCI_CAP_SHARED_MEMORY_CFG 8
>  /* Vendor-specific data */
>  #define VIRTIO_PCI_CAP_VENDOR_CFG        9
> +/* Miscellaneous configuration */
> +#define VIRTIO_PCI_CAP_MISC_CFG          10
>  \end{lstlisting}
>  
>          Any other value is reserved for future use.
> @@ -1352,6 +1355,32 @@ \subsubsection{PCI configuration access capability}\label{sec:Virtio Transport O
>  specified by some other Virtio Structure PCI Capability
>  of type other than \field{VIRTIO_PCI_CAP_PCI_CFG}.
>  
> +\subsubsection{Miscellaneous configuration structure layout}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Miscellaneous configuration structure layout}
> +
> +The miscellaneous configuration structure is found at the bar and offset within the VIRTIO_PCI_CAP_MISC_CFG capability.

If this is all there is going to be there, maybe just stick the value in
the capability directly.

> +Its layout is below.
> +\begin{lstlisting}
> +struct virtio_pci_misc_cfg {
> +        le16 admin_queue_index;         /* read-only for driver */
> +};
> +\end{lstlisting}
> +
> +\begin{description}
> +\item[\field{admin_queue_index}]
> +        The device uses this to report the index of the admin virtqueue.
> +        This field is valid only if VIRTIO_F_ADMIN_VQ is set.
> +\end{description}
> +


We need to explain that more fields will be added and that drivers must
not assume that this structure is 2 bytes in size.
Look for language like this in the device-specific and common
structures.
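
A sketch of what "no size assumptions" could look like on the driver side: before reading any field, check that it lies within the length reported by the capability. The struct mirrors the quoted virtio_pci_misc_cfg with a native 16-bit type. The helper name and the use of the capability's length value as the bound are assumptions for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors the quoted layout; le16 is represented as uint16_t here. */
struct virtio_pci_misc_cfg {
    uint16_t admin_queue_index;   /* read-only for driver */
};

/*
 * True if a field at [field_off, field_off + field_size) fits inside a
 * structure of cap_length bytes. Written to avoid integer overflow.
 */
static bool misc_cfg_has_field(uint32_t cap_length, size_t field_off,
                               size_t field_size)
{
    return cap_length >= field_off && cap_length - field_off >= field_size;
}
```

A driver written this way keeps working if later revisions append fields: it reads what it knows about and never depends on the structure being exactly 2 bytes long.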

> +\devicenormative{\paragraph}{Miscellaneous configuration structure layout}{Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Miscellaneous configuration structure layout}
> +The device MUST present VIRTIO_PCI_CAP_MISC_CFG capability when VIRTIO_F_ADMIN_VQ is set.
> +
> +The device MUST present a valid \field{admin_queue_index} when VIRTIO_F_ADMIN_VQ is set.
> +
> +\drivernormative{\paragraph}{Miscellaneous configuration structure layout}{Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Miscellaneous configuration structure layout}
> +The driver MUST NOT proceed with configuring the admin virtqueue in case VIRTIO_F_ADMIN_VQ is set and VIRTIO_PCI_CAP_MISC_CFG capability is not present.
> +
> +The driver MUST use the value of \field{admin_queue_index} to configure the admin virtqueue. For more details on virtqueue configuration see section \ref{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI-specific Initialization And Device Operation / Device Initialization / Virtqueue Configuration}.
> +
>  \subsubsection{Legacy Interfaces: A Note on PCI Device Layout}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Legacy Interfaces: A Note on PCI Device Layout}
>  
>  Transitional devices MUST present part of configuration
> -- 
> 2.21.0



* Re: [PATCH v5 4/7] Introduce virtio admin virtqueue
  2022-04-26 22:58 ` [PATCH v5 4/7] Introduce virtio admin virtqueue Max Gurtovoy
@ 2022-05-15 14:59   ` Michael S. Tsirkin
  2022-05-18 14:37     ` Max Gurtovoy
  0 siblings, 1 reply; 103+ messages in thread
From: Michael S. Tsirkin @ 2022-05-15 14:59 UTC (permalink / raw)
  To: Max Gurtovoy
  Cc: jasowang, virtio-comment, cohuck, virtio-dev, oren, parav,
	shahafs, aadam, virtio

On Wed, Apr 27, 2022 at 01:58:21AM +0300, Max Gurtovoy wrote:
> In one of the many use cases a user wants to manipulate features and
> configuration of the virtio devices regardless of the device type
> (net/block/console). For that the admin command set introduced. The
> admin virtqueue will be the first management interface to issue admin
> commands.
> 
> Currently virtio specification defines control virtqueue to manipulate
> features and configuration of the device it operates on. However,
> control virtqueue commands are device type specific, which makes it very
> difficult to extend for device agnostic commands.
> 
> To support this requirement in elegant way, this patch introduces a new
> admin virtqueue interface.
> 
> Manipulate features via admin virtqueue is asynchronous, scalable, easy
> to extend and doesn't require additional and expensive on-die resources
> to be allocated for every new feature that will be added in the future.
> 
> Reviewed-by: Parav Pandit <parav@nvidia.com>
> Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
> ---
>  admin.tex       | 17 +++++++++++++++++
>  conformance.tex |  1 +
>  content.tex     |  6 ++++--
>  3 files changed, 22 insertions(+), 2 deletions(-)
> 
> diff --git a/admin.tex b/admin.tex
> index f816c3b..d09683d 100644
> --- a/admin.tex
> +++ b/admin.tex
> @@ -131,3 +131,20 @@ \subsection{VIRTIO ADMIN DEVICE CAPS ACCEPT command}\label{sec:Basic Facilities
>         u8 reserved[112];
>  };
>  \end{lstlisting}
> +
> +\section{Admin Virtqueues}\label{sec:Basic Facilities of a Virtio Device / Admin Virtqueues}
> +
> +An admin virtqueue is a management interface of a device that can be used to send administrative
> +commands (see \ref{sec:Basic Facilities of a Virtio Device / Administration command set}) to manipulate
> +various features of the device and/or to manipulate various features, if possible, of another device.
> +
> +An admin virtqueue exists for a certain device if VIRTIO_F_ADMIN_VQ feature is
> +negotiated. The index of the admin virtqueue is exposed by the device in a
> +transport specific manner.
> +
> +If VIRTIO_F_ADMIN_VQ has been negotiated, the driver will use the admin virtqueue to send all admin commands.
> +
> +\devicenormative{\subsection}{Admin Virtqueues}{Basic Facilities of a Virtio Device / Admin Virtqueues}
> +A device that advertises VIRTIO_F_ADMIN_VQ capability MUST support all the mandatory admin commands.


I don't see where the mandatory admin commands are defined.

> +
> +A device that advertises VIRTIO_F_ADMIN_VQ capability MAY support one or more optional admin commands.
> diff --git a/conformance.tex b/conformance.tex
> index 9807c30..3c7b7bc 100644
> --- a/conformance.tex
> +++ b/conformance.tex
> @@ -342,6 +342,7 @@ \section{Conformance Targets}\label{sec:Conformance / Conformance Targets}
>  \item \ref{devicenormative:Basic Facilities of a Virtio Device / Virtqueues / Available Buffer Notification Suppression}
>  \item \ref{devicenormative:Basic Facilities of a Virtio Device / Shared Memory Regions}
>  \item \ref{devicenormative:Reserved Feature Bits}
> +\item \ref{devicenormative:Basic Facilities of a Virtio Device / Admin Virtqueues}
>  \end{itemize}
>  
>  \conformance{\subsection}{PCI Device Conformance}\label{sec:Conformance / Device Conformance / PCI Device Conformance}
> diff --git a/content.tex b/content.tex
> index 2e1df84..163cb34 100644
> --- a/content.tex
> +++ b/content.tex
> @@ -99,10 +99,10 @@ \section{Feature Bits}\label{sec:Basic Facilities of a Virtio Device / Feature B
>  \begin{description}
>  \item[0 to 23, and 50 to 127] Feature bits for the specific device type
>  
> -\item[24 to 40] Feature bits reserved for extensions to the queue and
> +\item[24 to 41] Feature bits reserved for extensions to the queue and
>    feature negotiation mechanisms
>  
> -\item[41 to 49, and 128 and above] Feature bits reserved for future extensions.
> +\item[42 to 49, and 128 and above] Feature bits reserved for future extensions.
>  \end{description}
>  
>  \begin{note}
> @@ -6849,6 +6849,8 @@ \chapter{Reserved Feature Bits}\label{sec:Reserved Feature Bits}
>    that the driver can reset a queue individually.
>    See \ref{sec:Basic Facilities of a Virtio Device / Virtqueues / Virtqueue Reset}.
>  
> +  \item[VIRTIO_F_ADMIN_VQ (41)] This feature indicates that an administration virtqueue is supported.
> +
>  \end{description}
>  
>  \drivernormative{\section}{Reserved Feature Bits}{Reserved Feature Bits}
> -- 
> 2.21.0


^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [PATCH v5 3/7] Introduce new destination type for admin commands
  2022-04-26 22:58 ` [PATCH v5 3/7] Introduce new destination type for admin commands Max Gurtovoy
@ 2022-05-15 15:01   ` Michael S. Tsirkin
  2022-05-18 14:27     ` [virtio-comment] " Max Gurtovoy
  2022-05-15 15:09   ` Michael S. Tsirkin
  1 sibling, 1 reply; 103+ messages in thread
From: Michael S. Tsirkin @ 2022-05-15 15:01 UTC (permalink / raw)
  To: Max Gurtovoy
  Cc: jasowang, virtio-comment, cohuck, virtio-dev, oren, parav,
	shahafs, aadam, virtio

On Wed, Apr 27, 2022 at 01:58:20AM +0300, Max Gurtovoy wrote:
> Introduce a new mechanism to issue commands with dst_type field that is
> not "self". With the new mechanism, driver can set dst_type to 1
> and use the vdev_id common field to describe the designated vdev_id.
> 
> This mechanism is useful for device groups with multiple devices
> with various different capabilities. For example, a type 1 device group
> that contains a PCI PF and its VF. For this group, a clever system
> administrator can use admin commands to manipulate the PF/VF resources.
> 
> Reviewed-by: Parav Pandit <parav@nvidia.com>
> Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
> ---
>  admin.tex | 18 ++++++++++++++----
>  1 file changed, 14 insertions(+), 4 deletions(-)
> 
> diff --git a/admin.tex b/admin.tex
> index 6725daa..f816c3b 100644
> --- a/admin.tex
> +++ b/admin.tex
> @@ -11,11 +11,13 @@ \section{Administration command set}\label{sec:Basic Facilities of a Virtio Devi
>          le16 command;
>          /*
>           * 0 - self
> -         * 1 - 65535 are reserved
> +         * 1 - other virtio device (identified by vdev_id) in the same device group
> +         * 2 - 65535 are reserved
>           */
>          le16 dst_type;
> +        le64 vdev_id;
>          /* reserved for common cmd fields */
> -        u8 reserved[20];
> +        u8 reserved[12];
>          u8 command_specific_data[];
>  
>          /* Device-writable part */
> @@ -39,9 +41,11 @@ \section{Administration command set}\label{sec:Basic Facilities of a Virtio Devi
>  \hline
>  03h   & VIRTIO_ADMIN_STATUS_INVALID_FIELD    & invalid field was set  \\
>  \hline
> +04h   & VIRTIO_ADMIN_STATUS_INVALID_VDEV_ID    & invalid vdev_id was set  \\
> +\hline
>  \end{tabular}
>  
> -The \field{command}, \field{dst_type} and \field{command_specific_data} are
> +The \field{command}, \field{dst_type}, \field{vdev_id} and \field{command_specific_data} are
>  set by the driver, and the device sets the \field{status}, the
>  \field{command_specific_error} and the \field{command_specific_result},
>  if needed.
> @@ -50,9 +54,15 @@ \section{Administration command set}\label{sec:Basic Facilities of a Virtio Devi
>  
>  The mandatory fields to be set by the driver, for all admin commands, are \field{command} and \field{dst_type}.
>  
> +The optional unused fields to be zeroed by the driver.
> +
>  The \field{command} defines the opcode for the command. The value for each command can be found in each command section.
>  
> -The \field{dst_type} defines the designated virtio device for the command. This value should be set to 0 (self).
> +The \field{dst_type} defines the designated virtio device for the command. This value can be set to 0 (self) or 1 (other virtio device in the same device
> +group) by the driver. Not all the commands allow setting \field{dst_type} to 1. Refer to each command description explicitly

"explicitly" means nothing here. Just avoid using this word.

> to check whether this operation is allowed.

What is "this operation"? I think you mean "which commands are allowed".

> +If \field{dst_type} is set to 0 by the driver, the \field{vdev_id} isn't valid, should be zeroed by the driver and should be ignored by the device.
> +If \field{dst_type} is set to 1 by the driver, the \field{vdev_id} is valid and used to describe the vdev_id of the designated virtio device (see section
> +\ref{sec:Introduction / Terminology / Device group} for vdev_id numbering for type 1 device groups).


"should" is for use by conformance sections only. In fact, please add one
for this.
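
For illustration only, the dst_type/vdev_id rules in the quoted paragraph could be
checked on the device side roughly as below. This is a sketch, not proposed spec
text: the status values come from the status table in the patch, while the
ADMIN_DST_* names, the admin_check_dst() helper, the assumption that VFs are
numbered 1..num_vfs, and the choice of INVALID_FIELD for reserved dst_type values
are all hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Status codes as listed in the status table of the quoted patch. */
enum {
    VIRTIO_ADMIN_STATUS_OK              = 0x00,
    VIRTIO_ADMIN_STATUS_INVALID_FIELD   = 0x03,
    VIRTIO_ADMIN_STATUS_INVALID_VDEV_ID = 0x04,
};

/* dst_type values from the struct comment; the names are hypothetical. */
enum {
    ADMIN_DST_SELF  = 0,
    ADMIN_DST_OTHER = 1,
};

/*
 * Hypothetical device-side check for an SR-IOV device group:
 * the PF is vdev_id 0 and the VFs are assumed to be 1..num_vfs.
 */
static uint8_t admin_check_dst(uint16_t dst_type, uint64_t vdev_id,
                               uint64_t num_vfs)
{
    switch (dst_type) {
    case ADMIN_DST_SELF:
        /* vdev_id is not valid here and is ignored by the device. */
        return VIRTIO_ADMIN_STATUS_OK;
    case ADMIN_DST_OTHER:
        return vdev_id <= num_vfs ? VIRTIO_ADMIN_STATUS_OK
                                  : VIRTIO_ADMIN_STATUS_INVALID_VDEV_ID;
    default:
        /* Reserved dst_type: the status chosen here is an assumption. */
        return VIRTIO_ADMIN_STATUS_INVALID_FIELD;
    }
}
```

A conformance section would still have to state these rules with MUST/SHOULD
wording; the code only shows which cases need to be covered.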

>  
>  The \field{command_specific_error} should be inspected by the driver only if \field{status} is set to
>  VIRTIO_ADMIN_STATUS_CS_ERR by the device. In this case, the content of \field{command_specific_error}
> -- 
> 2.21.0


^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [PATCH v5 3/7] Introduce new destination type for admin commands
  2022-04-26 22:58 ` [PATCH v5 3/7] Introduce new destination type for admin commands Max Gurtovoy
  2022-05-15 15:01   ` Michael S. Tsirkin
@ 2022-05-15 15:09   ` Michael S. Tsirkin
  2022-05-16 21:21     ` Parav Pandit
  2022-05-18 14:34     ` Max Gurtovoy
  1 sibling, 2 replies; 103+ messages in thread
From: Michael S. Tsirkin @ 2022-05-15 15:09 UTC (permalink / raw)
  To: Max Gurtovoy
  Cc: jasowang, virtio-comment, cohuck, virtio-dev, oren, parav,
	shahafs, aadam, virtio

On Wed, Apr 27, 2022 at 01:58:20AM +0300, Max Gurtovoy wrote:
> Introduce a new mechanism to issue commands with dst_type field that is
> not "self". With the new mechanism, driver can set dst_type to 1
> and use the vdev_id common field to describe the designated vdev_id.
> 
> This mechanism is useful for device groups with multiple devices
> with various different capabilities. For example, a type 1 device group
> that contains a PCI PF and its VF. For this group, a clever system
> administrator can use admin commands to manipulate the PF/VF resources.
> 
> Reviewed-by: Parav Pandit <parav@nvidia.com>
> Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
> ---
>  admin.tex | 18 ++++++++++++++----
>  1 file changed, 14 insertions(+), 4 deletions(-)
> 
> diff --git a/admin.tex b/admin.tex
> index 6725daa..f816c3b 100644
> --- a/admin.tex
> +++ b/admin.tex
> @@ -11,11 +11,13 @@ \section{Administration command set}\label{sec:Basic Facilities of a Virtio Devi
>          le16 command;
>          /*
>           * 0 - self
> -         * 1 - 65535 are reserved
> +         * 1 - other virtio device (identified by vdev_id) in the same device group
> +         * 2 - 65535 are reserved
>           */
>          le16 dst_type;
> +        le64 vdev_id;

Alignment problems. Proposal:

vdev_id
dst_type
command
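
To make the offsets concrete, here is a small C sketch comparing the common header
as posted with the reordered one. It assumes packed structs (the spec assumes no
additional padding), uses uintN_t as stand-ins for the le16/le64 types, and covers
only the device-readable common fields; the struct names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Common header as posted in v5: vdev_id lands at offset 4. */
struct admin_hdr_v5 {
    uint16_t command;
    uint16_t dst_type;
    uint64_t vdev_id;
    uint8_t  reserved[12];
} __attribute__((packed));

/* Reordered header: every field sits at a naturally aligned offset. */
struct admin_hdr_reordered {
    uint64_t vdev_id;
    uint16_t dst_type;
    uint16_t command;
    uint8_t  reserved[12];
} __attribute__((packed));

/* Both layouts occupy the same 24 bytes. */
static_assert(sizeof(struct admin_hdr_v5) == 24, "v5 header size");
static_assert(sizeof(struct admin_hdr_reordered) == 24, "reordered header size");

/* Only the v5 layout puts the 64-bit field on a 4-byte boundary. */
static_assert(offsetof(struct admin_hdr_v5, vdev_id) == 4, "unaligned le64");
static_assert(offsetof(struct admin_hdr_reordered, vdev_id) == 0, "aligned le64");
static_assert(offsetof(struct admin_hdr_reordered, command) == 10, "aligned le16");
```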



>          /* reserved for common cmd fields */
> -        u8 reserved[20];
> +        u8 reserved[12];
>          u8 command_specific_data[];
>  
>          /* Device-writable part */


> @@ -39,9 +41,11 @@ \section{Administration command set}\label{sec:Basic Facilities of a Virtio Devi
>  \hline
>  03h   & VIRTIO_ADMIN_STATUS_INVALID_FIELD    & invalid field was set  \\
>  \hline
> +04h   & VIRTIO_ADMIN_STATUS_INVALID_VDEV_ID    & invalid vdev_id was set  \\
> +\hline
>  \end{tabular}
>  
> -The \field{command}, \field{dst_type} and \field{command_specific_data} are
> +The \field{command}, \field{dst_type}, \field{vdev_id} and \field{command_specific_data} are
>  set by the driver, and the device sets the \field{status}, the
>  \field{command_specific_error} and the \field{command_specific_result},
>  if needed.
> @@ -50,9 +54,15 @@ \section{Administration command set}\label{sec:Basic Facilities of a Virtio Devi
>  
>  The mandatory fields to be set by the driver, for all admin commands, are \field{command} and \field{dst_type}.
>  
> +The optional unused fields to be zeroed by the driver.
> +
>  The \field{command} defines the opcode for the command. The value for each command can be found in each command section.
>  
> -The \field{dst_type} defines the designated virtio device for the command. This value should be set to 0 (self).
> +The \field{dst_type} defines the designated virtio device for the command. This value can be set to 0 (self) or 1 (other virtio device in the same device
> +group) by the driver. Not all the commands allow setting \field{dst_type} to 1. Refer to each command description explicitly to check whether this operation is allowed.
> +If \field{dst_type} is set to 0 by the driver, the \field{vdev_id} isn't valid, should be zeroed by the driver and should be ignored by the device.
> +If \field{dst_type} is set to 1 by the driver, the \field{vdev_id} is valid and used to describe the vdev_id of the designated virtio device (see section
> +\ref{sec:Introduction / Terminology / Device group} for vdev_id numbering for type 1 device groups).
>  
>  The \field{command_specific_error} should be inspected by the driver only if \field{status} is set to
>  VIRTIO_ADMIN_STATUS_CS_ERR by the device. In this case, the content of \field{command_specific_error}


terminology is a bit inconsistent. would dst_id be better? it goes
together with dst_type after all.

> -- 
> 2.21.0


^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [PATCH v5 2/7] Introduce admin command set
  2022-04-26 22:58 ` [PATCH v5 2/7] Introduce admin command set Max Gurtovoy
@ 2022-05-15 15:23   ` Michael S. Tsirkin
  2022-05-16 21:08     ` [virtio-comment] " Parav Pandit
  2022-05-18 13:39     ` [virtio-comment] " Max Gurtovoy
  0 siblings, 2 replies; 103+ messages in thread
From: Michael S. Tsirkin @ 2022-05-15 15:23 UTC (permalink / raw)
  To: Max Gurtovoy
  Cc: jasowang, virtio-comment, cohuck, virtio-dev, oren, parav,
	shahafs, aadam, virtio

On Wed, Apr 27, 2022 at 01:58:19AM +0300, Max Gurtovoy wrote:
> This command set is used for essential administrative and management
> operations.
> 
> Admin commands should be submitted to a well defined management
> interface.
> 
> Reviewed-by: Parav Pandit <parav@nvidia.com>
> Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
> ---
>  admin.tex   | 123 ++++++++++++++++++++++++++++++++++++++++++++++++++++
>  content.tex |   2 +
>  2 files changed, 125 insertions(+)
>  create mode 100644 admin.tex
> 
> diff --git a/admin.tex b/admin.tex
> new file mode 100644
> index 0000000..6725daa
> --- /dev/null
> +++ b/admin.tex
> @@ -0,0 +1,123 @@
> +\section{Administration command set}\label{sec:Basic Facilities of a Virtio Device / Administration command set}
> +
> +The Administration command set (also known as Admin command set) defines the commands that can be issued using a management interface.
> +This mechanism, for example, can be used by a system administrator that wants to configure a device before it is initialized by its driver.
> +
> +All the Admin commands are of the following form:
> +
> +\begin{lstlisting}
> +struct virtio_admin_cmd {
> +        /* Device-readable part */
> +        le16 command;
> +        /*
> +         * 0 - self
> +         * 1 - 65535 are reserved
> +         */
> +        le16 dst_type;
> +        /* reserved for common cmd fields */
> +        u8 reserved[20];
> +        u8 command_specific_data[];
> +
> +        /* Device-writable part */
> +        u8 status;
> +        u8 command_specific_error;
> +        u8 command_specific_result[];
> +};
> +\end{lstlisting}
> +
> +The following table describes the generic Admin status codes:
> +
> +\begin{tabular}{|l|l|l|}
> +\hline
> +Opcode & Status & Description \\
> +\hline \hline
> +00h   & VIRTIO_ADMIN_STATUS_OK    & successful completion  \\
> +\hline
> +01h   & VIRTIO_ADMIN_STATUS_CS_ERR    & command specific error  \\
> +\hline
> +02h   & VIRTIO_ADMIN_STATUS_COMMAND_UNSUPPORTED    & unsupported or invalid opcode  \\
> +\hline
> +03h   & VIRTIO_ADMIN_STATUS_INVALID_FIELD    & invalid field was set  \\
> +\hline
> +\end{tabular}
> +
> +The \field{command}, \field{dst_type} and \field{command_specific_data} are
> +set by the driver, and the device sets the \field{status}, the
> +\field{command_specific_error} and the \field{command_specific_result},
> +if needed.
> +
> +Reserved common fields are ignored by the device and to be zeroed by the driver.
> +
> +The mandatory fields to be set by the driver, for all admin commands, are \field{command} and \field{dst_type}.
> +
> +The \field{command} defines the opcode for the command. The value for each command can be found in each command section.
> +
> +The \field{dst_type} defines the designated virtio device for the command. This value should be set to 0 (self).
> +
> +The \field{command_specific_error} should be inspected by the driver only if \field{status} is set to
> +VIRTIO_ADMIN_STATUS_CS_ERR by the device. In this case, the content of \field{command_specific_error}
> +holds the command specific error. If \field{status} is not set to VIRTIO_ADMIN_STATUS_CS_ERR, the
> +\field{command_specific_error} value is undefined and should be ignored by the driver.
> +
> +The following table describes the Admin command set:
> +
> +\begin{tabular}{|l|l|l|}
> +\hline
> +Opcode & Command & M/O \\
> +\hline \hline
> +0000h   & VIRTIO_ADMIN_DEVICE_CAPS_IDENTIFY    & M  \\
> +\hline
> +0001h   & VIRTIO_ADMIN_DEVICE_CAPS_ACCEPT    & M  \\
> +\hline
> +0002h - 7FFFh   & Generic admin cmds    & -  \\
> +\hline
> +8000h - FFFFh   & Reserved    & - \\
> +\hline
> +\end{tabular}
> +
> +\subsection{VIRTIO ADMIN DEVICE CAPS IDENTIFY command}\label{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE CAPS IDENTIFY command}

why without _ here?

> +
> +The VIRTIO_ADMIN_DEVICE_CAPS_IDENTIFY command has no command specific data set by the driver.
> +The \field{command} is set to VIRTIO_ADMIN_DEVICE_CAPS_IDENTIFY by the driver.
> +
> +The device, upon success, returns a result that describes information about the designated virtio device.

result really just means a result structure right? let's say so.

> +This result is of form:
> +\begin{lstlisting}
> +struct virtio_admin_device_caps_identify_result {
> +       /* Indicates which of the below fields were returned
> +        * (1 means that field was returned):

what does this mean "field is returned"? above, the result is returned.

> +        * Bit 0 - device_admin_caps
> +        * Bits 1 - 63 - reserved for future fields
> +        */
> +       le64 attrs_mask;
> +       /* This field indicates which of the below admin
> +        * capabilities are supported by the device:
> +        * Bits 0 - 63 - reserved for future capabilities.
> +        */
> +       le64 device_admin_caps;


so all of the field is reserved?

> +       u8 reserved[112];
> +};
> +\end{lstlisting}
> +
> +\subsection{VIRTIO ADMIN DEVICE CAPS ACCEPT command}\label{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE CAPS ACCEPT command}
> +
> +The VIRTIO_ADMIN_DEVICE_CAPS_ACCEPT command is used by the driver to acknowledge those admin capabilities it understands and wishes to use.


ok so we have a protocol here, kind of like feature negotiation. Please write its description.
e.g. is it ok to change accepted caps? when? can device change its caps
etc etc etc.

Avoiding this kind of spec work is exactly why me and jason keep telling
you to consider just using features instead. Add a 64 bit admin features
field to the PCI transport and be done with it. CCW and MMIO already
have feature selector so it's trivial to add feature bits.
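
As a rough idea of what such a description would need to pin down, here is a
minimal sketch that simply mirrors the existing feature-bit negotiation rule (a
driver must not accept a bit the device did not offer). Whether the admin caps
should follow exactly this rule is an open question in this thread, so the two
helper functions below are assumptions for illustration, not part of the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Driver side: acknowledge only caps that were offered and are understood. */
static uint64_t admin_caps_select(uint64_t device_caps, uint64_t driver_known)
{
    return device_caps & driver_known;
}

/* Device side: an ACCEPT is sane only if it is a subset of the offered caps. */
static bool admin_caps_accept_ok(uint64_t device_caps, uint64_t accepted)
{
    return (accepted & ~device_caps) == 0;
}
```

The remaining questions (may the accepted set change later, may the device change
its caps) are not answered by this sketch and would need explicit text.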


> +The \field{command} is set to VIRTIO_ADMIN_DEVICE_CAPS_ACCEPT by the driver.
> +
> +The command specific data set by the driver is of form:
> +\begin{lstlisting}
> +struct virtio_admin_device_caps_accept_data {
> +       /* Indicates which of the below fields were set
> +        * (1 means that field is set):


yes we all know that 1 means set.

do you really mean field is valid maybe?


> +        * Bit 0 - driver_admin_caps
> +        * Bits 1 - 63 - reserved for future fields
> +        */
> +       le64 attrs_mask;

looks like going overboard. just send 64 caps bits and be done with it.
and rename accept_data to accept_caps.

> +       /* This field indicates which of the below admin
> +        * capabilities are supported by the driver:
> +        * Bits 0 - 63 - reserved for future capabilities.
> +        */
> +       le64 driver_admin_caps;
> +       u8 reserved[112];


I just noticed this. Please do not add this huge amount of padding
everywhere. instead, explain that device must be ready to accept
a smaller or larger buffer depending on feature bits.
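
One way a device could cope with shorter or longer buffers, sketched under
assumptions: copy the prefix it understands, treat missing trailing fields as
zero, and ignore extra bytes. The struct below keeps driver_admin_caps from the
patch; future_field and admin_caps_accept_parse() are hypothetical, and
little-endian conversion is left out for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct admin_caps_accept {
    uint64_t driver_admin_caps;  /* from the quoted patch */
    uint64_t future_field;       /* hypothetical later addition */
};

/*
 * Parse a driver buffer of arbitrary length. Fields beyond the end of a
 * short buffer read as zero; bytes beyond the known struct are ignored.
 * Returns the number of bytes actually consumed.
 */
static size_t admin_caps_accept_parse(const void *buf, size_t len,
                                      struct admin_caps_accept *out)
{
    size_t n = len < sizeof(*out) ? len : sizeof(*out);

    memset(out, 0, sizeof(*out));
    memcpy(out, buf, n);
    return n;
}
```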

> +};
> +\end{lstlisting}
> diff --git a/content.tex b/content.tex
> index c6f116c..2e1df84 100644
> --- a/content.tex
> +++ b/content.tex
> @@ -449,6 +449,8 @@ \section{Exporting Objects}\label{sec:Basic Facilities of a Virtio Device / Expo
>  types. It is RECOMMENDED that devices generate version 4
>  UUIDs as specified by \hyperref[intro:rfc4122]{[RFC4122]}.
>  
> +\input{admin.tex}
> +
>  \chapter{General Initialization And Device Operation}\label{sec:General Initialization And Device Operation}
>  
>  We start with an overview of device initialization, then expand on the
> -- 
> 2.21.0


^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [PATCH v5 1/7] Introduce device group
  2022-04-26 22:58 ` [virtio-comment] [PATCH v5 1/7] Introduce device group Max Gurtovoy
@ 2022-05-15 15:25   ` Michael S. Tsirkin
  2022-05-18 13:14     ` [virtio-comment] " Max Gurtovoy
  0 siblings, 1 reply; 103+ messages in thread
From: Michael S. Tsirkin @ 2022-05-15 15:25 UTC (permalink / raw)
  To: Max Gurtovoy
  Cc: jasowang, virtio-comment, cohuck, virtio-dev, oren, parav,
	shahafs, aadam, virtio

On Wed, Apr 27, 2022 at 01:58:18AM +0300, Max Gurtovoy wrote:
> Each device group has a type. For now, define initial type of device
> groups: Type 1 - A virtio PCI SR-IOV physical function (PF) and its PCI
> SR-IOV virtual functions (VFs). This group may contain one or more
> virtio devices.
> 
> Each device within a device group has a unique identifier. This
> identifier is the virtio device id (vdev_id).
> 
> Reviewed-by: Parav Pandit <parav@nvidia.com>
> Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
> ---
>  introduction.tex | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
> 
> diff --git a/introduction.tex b/introduction.tex
> index 4dc7085..4358ab1 100644
> --- a/introduction.tex
> +++ b/introduction.tex
> @@ -155,6 +155,18 @@ \subsection{Transition from earlier specification drafts}\label{sec:Transition f
>  sections tagged "Legacy Interface" in the section title.
>  These highlight the changes made since the earlier drafts.
>  
> +\subsection{Device group}\label{sec:Introduction / Terminology / Device group}
> +
> +A device group includes one or more virtio devices.
> +Each virtio device has a unique virtio device id (vdev_id) within a device group. A valid vdev_id is a 64-bit field in the range of
> +0x0 - 0xFFFFFFFFFFFFFFF0. Vdev_id 0xFFFFFFFFFFFFFFFF is a value that refers to all devices in a device group and isn't a valid vdev_id.
> +
> +For now, the supported device groups are:
> +\begin{enumerate}
> +\item Type 1 - A virtio PCI SR-IOV physical function (PF) and its PCI SR-IOV virtual functions (VFs). For this group type, the PF device has vdev_id that is equal to 0
> +and the VF devices have vdev_id's that are equal to their vf_number (according to the PCI SR-IOV specification).
> +\end{enumerate}
> +
>  \section{Structure Specifications}\label{sec:Structure Specifications}


In context of virtualization type 1 already refers to a specific type
of hypervisor.

I suggest simply "SR-IOV type" - this way users do not need to remember
special terminology.

>  Many device and driver in-memory structure layouts are documented using
> -- 
> 2.21.0


^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [PATCH v5 0/7] Introduce device group and device management
  2022-04-26 22:58 [PATCH v5 0/7] Introduce device group and device management Max Gurtovoy
                   ` (6 preceding siblings ...)
  2022-04-26 22:58 ` [PATCH v5 7/7] RFC: add initial support for configuring feature bits Max Gurtovoy
@ 2022-05-15 15:27 ` Michael S. Tsirkin
  2022-05-18 15:32   ` Max Gurtovoy
  2022-07-05 13:56 ` Michael S. Tsirkin
  8 siblings, 1 reply; 103+ messages in thread
From: Michael S. Tsirkin @ 2022-05-15 15:27 UTC (permalink / raw)
  To: Max Gurtovoy
  Cc: jasowang, virtio-comment, cohuck, virtio-dev, oren, parav,
	shahafs, aadam, virtio

On Wed, Apr 27, 2022 at 01:58:17AM +0300, Max Gurtovoy wrote:
> Hi,
> A device group definition will help extending the virtio specefication for
> various future features that require a notion of grouping devices together or
> managing devices inside a group. A device group include one or more virtio devices.
> For now, only support for type 1 device group was added.


OK good progress here. Sent a bunch of comments, most of them
cosmetic.


> Also introduce the admin facility to allow manipulating features and configurations
> in a generic manner. Using the admin command set, one can manipulate the device itself
> and/or to manipulate, if possible, another device within the same device group (for now,
> introduce only support of PCI SR-IOV devices grouping).
> 
> The admin command set will be extended in the future  to support more functionalities.
> Some of these functionalities are already under discussions.
> 
> The admin virtqueue is the first management interface to issue admin commands from
> the admin command set.
> 
> Motivation for choosing admin queue as first management interface:
> 1. It is anticipated that admin queue will be used for managing and configuring
>    many different type of resources. For example,
>    a. PCI PF configuring PCI VF attributes.
>    b. virtio device creating/destroying/configuring subfunctions discussed in [1]
>    c. composing device config space of VF or SF such as mac address, number of VQs, virtio features
> 
>    Mapping all of them as configuration registers to MMIO will require large MMIO space,
>    if done for each VF/SF. Such MMIO implementation in physical devices such as PCI PF and VF
>    requires on-chip resources to complete within MMIO access latencies. Such resources are very
>    expensive.
> 
> 2. Such limitation can be overcome by having smaller MMIO register set to build
>    a command request response interface. However, such MMIO based command interface
>    will be limited to serve single outstanding command execution. Such limitation can
>    resulting in high device creation and composing time which can affect VM startup time.
>    Often device can queue and service multiple commands in parallel, such command interface
>    cannot use parallelism offered by the device.
> 
> 3. When a command wants to DMA data from one or more physical addresses, for example in the future a
>    live migration command may need to fetch device state consist of config space, tens of
>    VQs state, VLAN and MAC table, per VQ partial outstanding block IO list database and more.
>    Packing one or more DMA addresses over new command interface will be burden some and continue
>    to suffer single outstanding command execution latencies. Such limitation is not good for time
>    sensitive live migration use cases.
> 
> 4. A virtio queue overcomes all the above limitations. It also supports DMA and multiple outstanding
>    descriptors. Similar mechanism exist today for device specific configuration - the control VQ.
> 
> A future work can add another management interface to issue admin commands.
> 
> [1] https://lists.oasis-open.org/archives/virtio-comment/202108/msg00025.html
> 
> This series include the comments and fixes from V1-V4 of the initial patch sets ("VIRTIO: Provision maximum
> MSI-X vectors for a VF" and "Introduce virtio subsystem and Admin virtqueue" [2]).
> This series was extended with additional RFC for setting managed device feature bits as another example for
> using admin command set. Also device/driver negotiation for admin caps was introduced as a response for previous
> comments on the mailing list.
> 
> [2] https://lists.oasis-open.org/archives/virtio-comment/202203/msg00005.html
> 
> 
> Open issues:
> 1. CCW and MMIO specification for admin_queue_index register
> 
> Changelog:
>  - Merged MSI-X configuration series to current one.
>  - Addressed comments from MST, Jason Wang and others.
>  - simplified the interface.
>  - added another resource management  as RFC (feature bits).
> 
> Max Gurtovoy (7):
>   Introduce device group
>   Introduce admin command set
>   Introduce new destination type for admin commands
>   Introduce virtio admin virtqueue
>   Add miscellaneous configuration structure for PCI
>   Introduce MGMT admin commands
>   RFC: add initial support for configuring feature bits
> 
>  admin.tex        | 282 +++++++++++++++++++++++++++++++++++++++++++++++
>  conformance.tex  |   3 +
>  content.tex      | 118 +++++++++++++++++++-
>  introduction.tex |  42 +++++++
>  4 files changed, 443 insertions(+), 2 deletions(-)
>  create mode 100644 admin.tex
> 
> -- 
> 2.21.0


^ permalink raw reply	[flat|nested] 103+ messages in thread

* [virtio-comment] RE: [PATCH v5 2/7] Introduce admin command set
  2022-05-15 15:23   ` Michael S. Tsirkin
@ 2022-05-16 21:08     ` Parav Pandit
  2022-05-17 10:08       ` [virtio-dev] " Cornelia Huck
  2022-05-17 11:48       ` Michael S. Tsirkin
  2022-05-18 13:39     ` [virtio-comment] " Max Gurtovoy
  1 sibling, 2 replies; 103+ messages in thread
From: Parav Pandit @ 2022-05-16 21:08 UTC (permalink / raw)
  To: Michael S. Tsirkin, Max Gurtovoy
  Cc: jasowang, virtio-comment, cohuck, virtio-dev, Oren Duer,
	Shahaf Shuler, aadam, virtio

Hi Michael,

> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Sunday, May 15, 2022 11:24 AM

[..]

> > +\subsection{VIRTIO ADMIN DEVICE CAPS ACCEPT
> command}\label{sec:Basic
> > +Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN
> > +DEVICE CAPS ACCEPT command}
> > +
> > +The VIRTIO_ADMIN_DEVICE_CAPS_ACCEPT command is used by the
> driver to acknowledge those admin capabilities it understands and wishes to
> use.
> 
> 
> ok so we have a protocol here, kind of like feature negotiation. Please write
> its description.
> e.g. is it ok to change accepted caps? when? can device change its caps etc
> etc etc.
> 
> Avoiding this kind of spec work is exactly why me and jason keep telling you
> to consider just using features instead. Add a 64 bit admin features field to
> the PCI transport and be done with it. CCW and MMIO already have feature
> selector so it's trivial to add feature bits.
> 
As we begin to scale with the device, adding more and more registers like this demands more on-device real estate to comply with the PCI standards.

Therefore, things that are queried or accessed only rarely are better accessed via a queue interface.

One can argue that admin VQ is proposed only for the mgmt. functions so having this cfg register for PF is enough.

However, AQ may find some usage in the VF/SF themselves down the road.
Hence, keeping the cap exchange transport this way is more optimal.

Max has called out this AQ rationale in 4 or 5 points in the cover letter.



^ permalink raw reply	[flat|nested] 103+ messages in thread

* RE: [PATCH v5 3/7] Introduce new destination type for admin commands
  2022-05-15 15:09   ` Michael S. Tsirkin
@ 2022-05-16 21:21     ` Parav Pandit
  2022-05-16 23:33       ` Michael S. Tsirkin
  2022-05-18 14:34     ` Max Gurtovoy
  1 sibling, 1 reply; 103+ messages in thread
From: Parav Pandit @ 2022-05-16 21:21 UTC (permalink / raw)
  To: Michael S. Tsirkin, Max Gurtovoy
  Cc: jasowang, virtio-comment, cohuck, virtio-dev, Oren Duer,
	Shahaf Shuler, aadam, virtio


> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Sunday, May 15, 2022 11:09 AM
> 
> > ---
> >  admin.tex | 18 ++++++++++++++----
> >  1 file changed, 14 insertions(+), 4 deletions(-)
> >
> > diff --git a/admin.tex b/admin.tex
> > index 6725daa..f816c3b 100644
> > --- a/admin.tex
> > +++ b/admin.tex
> > @@ -11,11 +11,13 @@ \section{Administration command
> set}\label{sec:Basic Facilities of a Virtio Devi
> >          le16 command;
> >          /*
> >           * 0 - self
> > -         * 1 - 65535 are reserved
> > +         * 1 - other virtio device (identified by vdev_id) in the same device
> group
> > +         * 2 - 65535 are reserved
> >           */
> >          le16 dst_type;
> > +        le64 vdev_id;
> 
> Alignment problems. Proposal:
> 
> vdev_id
> dst_type
> command

I remember that this came up in internal review as well.
Though natural alignment is certainly good, I don't think we have an alignment problem spec-wise, based on the snippet of the spec below [1].
It was kind of counterintuitive to see vdev_id before seeing what the actual command is.
In that respect the current layout made more sense.

There was some version, internal or external, I do not recall, where vdev_id was optional or part of a union.

But I think if we are always going to have vdev_id, it can be naturally aligned as in your proposal above.

[1] "Structure Specifications
Many device and driver in-memory structure layouts are documented using the C struct syntax. All structures
are assumed to be without additional padding. To stress this, cases where common C compilers are known
to insert extra padding within structures are tagged using the GNU C __attribute__((packed)) syntax."


^ permalink raw reply	[flat|nested] 103+ messages in thread

* RE: [PATCH v5 6/7] Introduce MGMT admin commands
  2022-05-15 14:37   ` Michael S. Tsirkin
@ 2022-05-16 21:47     ` Parav Pandit
  2022-05-17 12:31       ` [virtio-comment] " Michael S. Tsirkin
  2022-05-17  2:28     ` Jason Wang
  2022-05-18 15:03     ` Max Gurtovoy
  2 siblings, 1 reply; 103+ messages in thread
From: Parav Pandit @ 2022-05-16 21:47 UTC (permalink / raw)
  To: Michael S. Tsirkin, Max Gurtovoy
  Cc: jasowang, virtio-comment, cohuck, virtio-dev, Oren Duer,
	Shahaf Shuler, aadam, virtio


> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Sunday, May 15, 2022 10:37 AM

> > +This value is also returned in the virtio_admin_device_mgmt_result structure.
> > +Also, a successful operation guarantees that the MSI-X capability access by the designated PCI device defined by the PCI specification must reflect
> > +the new configuration in all relevant fields. For example, by default if the PCI VF has been assigned 4 MSI-X vectors, and VIRTIO_ADMIN_DEVICE_MGMT
> > +increases the MSI-X vectors to 8. On this change, reading Table size field of the MSI-X message control register will reflect a value of 7.
> > +
> > +It is beyond the scope of the virtio specification to define necessary synchronization in system software to ensure that a virtio PCI VF device
> > +interrupt configuration modification is reflected in the PCI device.
> 
> IMHO it is very much in scope of the specification. 
Many pieces of this system software are not covered by the virtio specification...
I am not sure how this can belong in the virtio spec.

> The scope of the
> specification is to allow device interoperability and this very much fits the bill.
> 
How is interoperability affected and between which two entities?


^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [PATCH v5 3/7] Introduce new destination type for admin commands
  2022-05-16 21:21     ` Parav Pandit
@ 2022-05-16 23:33       ` Michael S. Tsirkin
  2022-05-18 14:36         ` Max Gurtovoy
  0 siblings, 1 reply; 103+ messages in thread
From: Michael S. Tsirkin @ 2022-05-16 23:33 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Max Gurtovoy, jasowang, virtio-comment, cohuck, virtio-dev,
	Oren Duer, Shahaf Shuler, aadam, virtio

On Mon, May 16, 2022 at 09:21:11PM +0000, Parav Pandit wrote:
> 
> > From: Michael S. Tsirkin <mst@redhat.com>
> > Sent: Sunday, May 15, 2022 11:09 AM
> > 
> > > ---
> > >  admin.tex | 18 ++++++++++++++----
> > >  1 file changed, 14 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/admin.tex b/admin.tex
> > > index 6725daa..f816c3b 100644
> > > --- a/admin.tex
> > > +++ b/admin.tex
> > > @@ -11,11 +11,13 @@ \section{Administration command set}\label{sec:Basic Facilities of a Virtio Devi
> > >          le16 command;
> > >          /*
> > >           * 0 - self
> > > -         * 1 - 65535 are reserved
> > > +         * 1 - other virtio device (identified by vdev_id) in the same device group
> > > +         * 2 - 65535 are reserved
> > >           */
> > >          le16 dst_type;
> > > +        le64 vdev_id;
> > 
> > Alignment problems. Proposal:
> > 
> > vdev_id
> > dst_type
> > command
> 
> I remember that this came up in internal review as well.
> Though natural alignment is certainly good, I don't think we have an alignment problem spec-wise, based on the snippet of the spec below [1].
> It was kind of counterintuitive to see vdev_id before seeing what the actual command is.
> That is why the current layout made more sense.

vdev_id and dst_type tell you where to send the command. They do not
depend on the command. In fact vdev_id -> dst_id might make sense.

> 
> There was some version, internal or external, I do not recall, where vdev_id was optional or part of a union.
> 
> But I think if we are always going to have vdev_id, it can be naturally aligned as in your proposal above.
> 
> [1] "Structure Specifications
> Many device and driver in-memory structure layouts are documented using the C struct syntax. All structures
> are assumed to be without additional padding. To stress this, cases where common C compilers are known
> to insert extra padding within structures are tagged using the GNU C __attribute__((packed)) syntax."

These are pathological cases due to legacy. Better avoided.

-- 
MST


^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [PATCH v5 6/7] Introduce MGMT admin commands
  2022-05-15 14:37   ` Michael S. Tsirkin
  2022-05-16 21:47     ` Parav Pandit
@ 2022-05-17  2:28     ` Jason Wang
  2022-05-18 15:27       ` Max Gurtovoy
  2022-05-18 15:03     ` Max Gurtovoy
  2 siblings, 1 reply; 103+ messages in thread
From: Jason Wang @ 2022-05-17  2:28 UTC (permalink / raw)
  To: Michael S. Tsirkin, Max Gurtovoy
  Cc: virtio-comment, cohuck, virtio-dev, oren, parav, shahafs, aadam, virtio


On 2022/5/15 22:37, Michael S. Tsirkin wrote:
> On Wed, Apr 27, 2022 at 01:58:23AM +0300, Max Gurtovoy wrote:
>> Introduce the concept of a management and a managed device and add
>> example of using this concept to manage resources.
>>
>> A management device supports the VIRTIO_ADMIN_DEVICE_MGMT and
>> VIRTIO_ADMIN_DEVICE_MGMT_ATTRS admin commands to manage some resources
>> of a managed device.
>>
>> A typical cloud provider SR-IOV use case is to create many VFs for use
>> by guest VMs. The VFs may not be assigned to a VM until a user requests
>> a VM of a certain size, e.g., number of CPUs. A VF may need MSI-X
>> vectors proportional to the number of CPUs in the VM, but there is no
>> standard way today in the spec to change the number of MSI-X vectors
>> supported by a VF, although there are some operating systems that
>> support this.
>>
>> The new admin mechanism manages the MSI-X interrupt vectors assignments
>> of a managed PCI device (i.e. VF) by its management devices (i.e. its
>> parent PF) but can easily extended to any other generic resource
>> management.
>>
>> Reviewed-by: Parav Pandit <parav@nvidia.com>
>> Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
>
> I'd like to see msix and the concept of type 1 group
> in a separate patch from MSIX.
>
> I am not sure MSIX things are ready but the grouping part looks mostly
> ok to me.
>
>> ---
>>   admin.tex        | 132 +++++++++++++++++++++++++++++++++++++++++++++--
>>   content.tex      |  81 +++++++++++++++++++++++++++++
>>   introduction.tex |  32 +++++++++++-
>>   3 files changed, 241 insertions(+), 4 deletions(-)
>>
>> diff --git a/admin.tex b/admin.tex
>> index d09683d..5b54743 100644
>> --- a/admin.tex
>> +++ b/admin.tex
>> @@ -79,12 +79,20 @@ \section{Administration command set}\label{sec:Basic Facilities of a Virtio Devi
>>   \hline
>>   0001h   & VIRTIO_ADMIN_DEVICE_CAPS_ACCEPT    & M  \\
>>   \hline
>> -0002h - 7FFFh   & Generic admin cmds    & -  \\
>> +0002h   & VIRTIO_ADMIN_DEVICE_MGMT    & O  \\
>> +\hline
>> +0003h   & VIRTIO_ADMIN_DEVICE_MGMT_ATTRS    & O  \\
>> +\hline
>> +0004h - 7FFFh   & Generic admin cmds    & -  \\
>>   \hline
>>   8000h - FFFFh   & Reserved    & - \\
>>   \hline
>>   \end{tabular}
>>   
>> +\begin{note}
>> +{The following commands are mandatory for management devices: VIRTIO_ADMIN_DEVICE_MGMT and VIRTIO_ADMIN_DEVICE_MGMT_ATTRS.}
>> +\end{note}
>> +
>>   \subsection{VIRTIO ADMIN DEVICE CAPS IDENTIFY command}\label{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE CAPS IDENTIFY command}
>>   
>>   The VIRTIO_ADMIN_DEVICE_CAPS_IDENTIFY command has no command specific data set by the driver.
>> @@ -102,13 +110,20 @@ \subsection{VIRTIO ADMIN DEVICE CAPS IDENTIFY command}\label{sec:Basic Facilitie
>>          le64 attrs_mask;
>>          /* This field indicates which of the below admin
>>           * capabilities are supported by the device:
>> -        * Bits 0 - 63 - reserved for future capabilities.
>> +        * Bit 0 - if set, the device is a management device
>> +        * Bit 1 - if set, the device is a type 1 management device that supports
>> +        *         MSI-X vector mgmt of its type 1 managed devices
>> +        * Bits 2 - 63 - reserved for future capabilities.
>>           */
>>          le64 device_admin_caps;
>>          u8 reserved[112];
>>   };
>>   \end{lstlisting}
>>   
>> +\begin{note}
>> +{For more details on MSI-X vector management support see section \ref{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI-specific Admin command set / MSI-X vector management}.}
>> +\end{note}
>> +
>>   \subsection{VIRTIO ADMIN DEVICE CAPS ACCEPT command}\label{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE CAPS ACCEPT command}
>>   
>>   The VIRTIO_ADMIN_DEVICE_CAPS_ACCEPT command is used by the driver to acknowledge those admin capabilities it understands and wishes to use.
>> @@ -125,13 +140,124 @@ \subsection{VIRTIO ADMIN DEVICE CAPS ACCEPT command}\label{sec:Basic Facilities
>>          le64 attrs_mask;
>>          /* This field indicates which of the below admin
>>           * capabilities are supported by the driver:
>> -        * Bits 0 - 63 - reserved for future capabilities.
>> +        * Bit 0 - if set, the driver accepted the device as a management device
>> +        * Bit 1 - if set, the driver accepted the device as a type 1 management device
>> +        *         that supports MSI-X vector mgmt of its type 1 managed devices
>> +        * Bits 2 - 63 - reserved for future capabilities.
>>           */
>>          le64 driver_admin_caps;
>>          u8 reserved[112];
>>   };
>>   \end{lstlisting}
>>   
>> +\subsection{VIRTIO ADMIN DEVICE MGMT command}\label{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE MGMT command}
>> +
>> +The VIRTIO_ADMIN_DEVICE_MGMT command is used by a management device to manage resources of managed virtio devices.
>> +The \field{command} is set to VIRTIO_ADMIN_DEVICE_MGMT by the driver.
>> +
>> +The command specific data set by the driver is of form:
>> +\begin{lstlisting}
>> +struct virtio_admin_device_mgmt_data {
>> +        /*
>> +         * 0 - reserved
>> +         * 1 - assign resource to the designated vdev_id
>> +         * 2 - query resource of the designated vdev_id
>> +         * 3 - 255 are reserved
>> +         */
>> +        u8 operation;
>> +        /*
>> +         * 0 - MSI-X vector
>> +         * 1 - 65535 are reserved
>> +         */
>> +        le16 resource;
>> +        /*
>> +         * The value to the given resource:
>> +         * if resource = 0 (MSI-X vector), it's a 1-based count.
>> +         */
>> +        le64 resource_val;
>> +        u8 reserved[5];
>> +};
>> +\end{lstlisting}
>> +
>> +The following table describes the command specific error codes codes:
>> +
>> +\begin{tabular}{|l|l|l|}
>> +\hline
>> +Opcode & Status & Description \\
>> +\hline \hline
>> +00h   & VIRTIO_ADMIN_CS_ERR_VDEV_IN_USE    & designated device is in use, operation failed   \\
>> +\hline
>> +01h   & VIRTIO_ADMIN_CS_RSC_VAL_INVALID    & resource value is invalid  \\
>> +\hline
>> +02h   & VIRTIO_ADMIN_CS_RSC_UNSUPPORTED    & unsupported or invalid resource  \\
>> +\hline
>> +03h   & VIRTIO_ADMIN_CS_OP_UNSUPPORTED    & unsupported or invalid operation  \\
>> +\hline
>> +04h - FFh   & Reserved    & -  \\
>> +\hline
>> +\end{tabular}
>> +
>> +The device, upon success, returns a result that describes the information according to the requested operation.
>> +This result is of form:
>> +\begin{lstlisting}
>> +struct virtio_admin_device_mgmt_result {
>> +        le64 resource_val;
>> +        u8 reserved[8];
>> +};
>> +\end{lstlisting}
>> +
>> +If the requested operation by the driver was "assign resource to the designated vdev_id", the device will return the resource_val of the assigned
>> +resources to the designated vdev_id. Upon success, this value should be equal to the \field{resource_val} of the virtio_admin_device_mgmt_data
>> +structure set by the driver. In case of a failure, the value of this field is undefined and will be ignored by the driver.
>> +
>> +If the requested operation by the driver was "query resource of the designated vdev_id", the device will return resource_val of the currently assigned
>> +resources to the designated vdev_id upon success. In case of a failure, the value of this field is undefined and will be ignored by the driver.
>> +
>> +\begin{note}
>> +{MSI-X vector resource type is valid only for PCI devices. VIRTIO_ADMIN_CS_RSC_UNSUPPORTED error is
>> +returned by the device when the designated vdev_id is not a PCI device.}


Note that MSI is also used by various platform devices. It would be 
better if we could make this work for non-PCI devices; otherwise we may 
end up introducing duplicate commands later.


>> +\end{note}
>> +
>> +\begin{note}
>> +{For this command, if driver is setting \field{resource} to MSI-X vector type, the \field{vdev_id} can't be associated with a Virtual Function with
>> +VF index greater than NumVFs value as defined in the PCI specification or smaller than 1. An error is returned by the device when \field{vdev_id} is out of the range.}
>> +\end{note}
>> +
>> +\subsection{VIRTIO ADMIN DEVICE MGMT ATTRS command}\label{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE MGMT ATTRS command}
>> +
>> +The VIRTIO_ADMIN_DEVICE_MGMT_ATTRS command has no command specific data set by the driver.
>> +The \field{command} is set to VIRTIO_ADMIN_DEVICE_MGMT_ATTRS.
>> +
>> +The device, upon success, returns a result that describes the management device attributes.
>> +This result is of form:
>> +\begin{lstlisting}
>> +struct virtio_admin_device_mgmt_attrs_result {
>> +        /* Indicates which of the below fields were returned
>> +         * (1 means that field was returned):
>> +         * Bit 0 - vfs_total_msix_count
>> +         * Bit 1 - vfs_assigned_msix_count
>> +         * Bit 2 - per_vf_max_msix_count
>> +         * Bits 3 - 63 - reserved for future fields
>> +         */
>> +        le64 attrs_mask;
>> +
>> +        /* Total number of msix vectors for the total number of VFs */
>> +        le32 vfs_total_msix_count;
>> +        /* Assigned number of msix vectors for the enabled VFs */
>> +        le32 vfs_assigned_msix_count;
>> +        /* Max number of msix vectors that can be assigned for a single VF */
>> +        le16 per_vf_max_msix_count;
>> +
>> +        u8 reserved[110];
>> +};
>> +\end{lstlisting}
>> +
>> +\begin{note}
>> +{The \field{vfs_total_msix_count}, \field{vfs_assigned_msix_count} and \field{per_vf_max_msix_count} returned by the device if the
>> +designated vdev_id is a management device that can allocate/deallocate MSI-X resources for PCI VFs devices. Otherwise,
>> +the associated bits in \field{attrs_mask} are zeroed by the device.}
>> +\end{note}
>> +
>>   \section{Admin Virtqueues}\label{sec:Basic Facilities of a Virtio Device / Admin Virtqueues}
>>   
>>   An admin virtqueue is a management interface of a device that can be used to send administrative
>> diff --git a/content.tex b/content.tex
>> index 0c1d44f..81e5850 100644
>> --- a/content.tex
>> +++ b/content.tex
>> @@ -451,6 +451,18 @@ \section{Exporting Objects}\label{sec:Basic Facilities of a Virtio Device / Expo
>>   
>>   \input{admin.tex}
>>   
>> +\section{Device management}\label{sec:Basic Facilities of a Virtio Device / Device management}
>> +
>> +A device group might consist of one or more virtio devices. For example, virtio PCI SR-IOV PF and its VFs compose a type 1 device group.
>> +A capable PCI SR-IOV PF virtio device might act as the management device in this group, and its PCI SR-IOV VFs are the managed devices.
>> +A management device might have various management capabilities and attributes to manage its managed devices.
> This makes my eyes glaze over.
> Please, find all instances which say "manage" more than once and
> rephrase.
>
>> The capabilities exposed
>> +in the result of VIRTIO_ADMIN_DEVICE_CAPS_IDENTIFY command (see section \ref{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE CAPS IDENTIFY command}
>> +for more details) and the attributes exposed in the result of VIRTIO_ADMIN_DEVICE_MGMT_ATTRS command
>> +(see section \ref{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE MGMT ATTRS command} for more details).
>> +
>> +The management device will use the VIRTIO_ADMIN_DEVICE_MGMT admin command to manage its managed devices (see section
>> +\ref{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE MGMT command} for more details).
>> +
>>   \chapter{General Initialization And Device Operation}\label{sec:General Initialization And Device Operation}
>>   
>>   We start with an overview of device initialization, then expand on the
>> @@ -1763,6 +1775,75 @@ \subsubsection{Driver Handling Interrupts}\label{sec:Virtio Transport Options /
>>       \end{itemize}
>>   \end{itemize}
>>   
>> +\subsection{PCI-specific Admin capabilities}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI-specific Admin capabilities}
>> +
>> +This documents the group of admin capabilities for PCI virtio devices. Each capability is
>> +implemented using one or more Admin commands.
>> +
>> +\subsubsection{MSI-X vector management}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI-specific Admin command set / MSI-X vector management}
>> +
>> +This capability enables a virtio management device to control the assignment of MSI-X interrupt vectors
>> +for its managed devices.


I think we need to clarify whether the Initial VFs belong to the 
"managed device".


>>   In PCI, a management device can be the PF device and the managed device can be the VF (for example in a type 1 device group).
>> +Capable management devices will need to implement VIRTIO_ADMIN_DEVICE_MGMT and VIRTIO_ADMIN_DEVICE_MGMT_ATTRS admin commands, report the MSI-X attributes in the result of
>> +VIRTIO_ADMIN_DEVICE_MGMT_ATTRS and report that MSI-X vector resource management is supported in the result of VIRTIO_ADMIN_DEVICE_CAPS_IDENTIFY admin command.
>> +See sections \ref{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE CAPS IDENTIFY command} and
>> +\ref{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE MGMT ATTRS command} for more details.
>> +
>> +In the result of VIRTIO_ADMIN_DEVICE_MGMT_ATTRS admin command, a capable management device will return the total number of
>> +msix vectors for its VFs in \field{vfs_total_msix_count} field, the number of already assigned msix vectors for its VFs in
>> +\field{vfs_assigned_msix_count} field and also the maximal number of msix vectors that can be assigned for a single VF in
>> +\field{per_vf_max_msix_count} field. In addition, bit 0, bit 1 and bit 2 are set to indicate on the validity of the other 3
>> +fields in the \field{attrs_mask} field of the result buffer.
>> +See section \ref{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE MGMT ATTRS command} for more details.
>> +
>> +The default assignment of the MSI-X vectors for managed devices is out of the scope of this specification.
>> +A driver, using VIRTIO_ADMIN_DEVICE_MGMT can update the MSI-X assignment for a specific managed device.
>> +In the data of VIRTIO_ADMIN_DEVICE_MGMT admin command, a driver set the \field{resource} type to be MSI-X vector and the
>> +amount of MSI-X interrupt vectors to configure to the designated managed device in \field{resource_val}. The managed device id is set to \field{vdev_id} field.
>> +
>> +A successful operation guarantees that the requested amount of MSI-X interrupt vectors was assigned to the designated device.
>> +This value is also returned in the virtio_admin_device_mgmt_result structure.
>> +Also, a successful operation guarantees that the MSI-X capability access by the designated PCI device defined by the PCI specification must reflect
>> +the new configuration in all relevant fields. For example, by default if the PCI VF has been assigned 4 MSI-X vectors, and VIRTIO_ADMIN_DEVICE_MGMT
>> +increases the MSI-X vectors to 8. On this change, reading Table size field of the MSI-X message control register will reflect a value of 7.


This seems odd. What happens if we reduce the number of vectors? And are 
such on-the-fly changes to the semantics of a register allowed by the PCI 
specification?

I think the driver must do this before creating the VFs (writing to 
sriov_numvfs or status), and the device will ignore or fail requests 
for such changes after the VFs have been provisioned.
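
For reference, the arithmetic in the quoted example (8 vectors assigned, Table Size reads 7) follows from the PCI encoding, where the Table Size field in bits 10:0 of the MSI-X Message Control register holds the number of table entries minus one. A minimal sketch, with hypothetical macro and helper names:

```c
#include <stdint.h>

/* Table Size occupies bits 10:0 of the MSI-X Message Control register. */
#define MSIX_MSG_CTRL_TABLE_SIZE_MASK 0x07FFu

/* Encode a 1-based vector count (1..2048) into the Table Size field. */
static uint16_t msix_table_size_field(uint32_t vector_count)
{
        return (uint16_t)((vector_count - 1) & MSIX_MSG_CTRL_TABLE_SIZE_MASK);
}

/* Decode the vector count from a raw Message Control register value. */
static uint32_t msix_vector_count(uint16_t msg_ctrl)
{
        return (uint32_t)(msg_ctrl & MSIX_MSG_CTRL_TABLE_SIZE_MASK) + 1;
}
```

So a VF going from 4 to 8 vectors would see the field change from 3 to 7, and a reduction would move it the other way. Whether a running VF driver tolerates that change is the question raised above.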


>> +
>> +It is beyond the scope of the virtio specification to define necessary synchronization in system software to ensure that a virtio PCI VF device
>> +interrupt configuration modification is reflected in the PCI device.
> IMHO it is very much in scope of the specification. The scope of the
> specification is to allow device interoperability and this very much
> fits the bill.


+1, things will be much easier if we only allow the changes before 
provisioning VFs.


>
>> However, it is expected that any modern system software implementing virtio
>> +drivers and PCI subsystem will ensure that any changes occurring in the VF interrupt configuration is either updated in the PCI VF device or
>> +such configuration fails.
> OK. Anything more? What exactly does "interrupt configuration" mean here?
>
>> For example, one way to implement that is to make sure that there is no driver bounded to the virtio PCI SR-IOV VF during
>> +this operation.
> bounded in what sense?
>
> And why do you say VF? Is this command limited to type 1? You only
> limit it to PCI above.
>
> same elsewhere
>
>> +
>> +To query amount of MSI-X interrupt vectors that is currently assigned to a managed device, the driver issue VIRTIO_ADMIN_DEVICE_MGMT with \field{operation} set to
> issues
>
> lots of grammar error like this elsewhere, pls find and correct.
>
>> +"query resource of the designated vdev_id" value (== 2). The driver also set the \field{resource} type to be MSI-X vector and the managed device id is set to \field{vdev_id}
>> +field. In the result of a successful operation,
> meaning "in case"?
>
>> the amount of MSI-X interrupt vectors that is currently assigned to the designated managed device is
>> +returned by the device in \field{resource_val} field of the virtio_admin_device_mgmt_result structure.
>> +See section \ref{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE MGMT command} for more details.
>> +
>> +\paragraph{MSI-X configuration sequence example}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI-specific Admin command set / VF MSI-X control / MSI-X configuration sequence example }
>> +
>> +A typical sequence for configuring MSI-X vectors for PCI VFs using MSI-X vector management mechanism is following:
> rephrase to simplify
>
> The driver uses the following sequence for configuring MSI-X vectors
> ....
>
>
>
>> +
>> +\begin{enumerate}
>> +\item Ensure that VF driver doesn't run and it is safe to change MSI-X (e.g. disable sriov auto probing)


Is "sriov auto probing" a general OS facility rather than something Linux specific? 
If not, we need to clarify what it does here.

Thanks


>> +
>> +\item Load the PF driver
>> +
>> +\item Enable SR-IOV by following the PCI specification
>> +
>> +\item Query the management device capabilities using commands VIRTIO_ADMIN_DEVICE_IDENTIFY and VIRTIO_ADMIN_DEVICE_MGMT_ATTRS
>> +
>> +\item Find the managed VF vdev_id (for type 1 device group the vdev_id of PCI VF is equal to vf number)
>> +
>> +\item Query the VF MSI-X configuration using command VIRTIO_ADMIN_DEVICE_MGMT (query operation)
>> +
>> +\item Assign desired MSI-X configuration for the VF using command VIRTIO_ADMIN_DEVICE_MGMT (assign operation)
>> +
>> +\item After successful completion of the assignment, load the VF driver
>> +
>> +\item Assign the VF to a VM
>> +
>> +\end{enumerate}
>> +
>>   \section{Virtio Over MMIO}\label{sec:Virtio Transport Options / Virtio Over MMIO}
>>   
>>   Virtual environments without PCI support (a common situation in
>> diff --git a/introduction.tex b/introduction.tex
>> index 4358ab1..bfc5498 100644
>> --- a/introduction.tex
>> +++ b/introduction.tex
>> @@ -164,9 +164,39 @@ \subsection{Device group}\label{sec:Introduction / Terminology / Device group}
>>   For now, the supported device groups are:
>>   \begin{enumerate}
>>   \item Type 1 - A virtio PCI SR-IOV physical function (PF) and its PCI SR-IOV virtual functions (VFs). For this group type, the PF device has vdev_id that is equal to 0
>> -and the VF devices have vdev_id's that are equal to their vf_number (according to the PCI SR-IOV specification).
>> +and the VF devices have vdev_id's that are equal to their vf_number (according to the PCI SR-IOV specification). A PCI SR-IOV PF device can act as a management device for
>> +type 1 group. A PCI SR-IOV VF device can act as a managed device for type 1 group (see \ref{sec:Introduction / Terminology / Virtio management device} and
>> +\ref{sec:Introduction / Terminology / Virtio managed device} for more information).
>>   \end{enumerate}
>>   
>> +\subsection{Virtio management device}\label{sec:Introduction / Terminology / Virtio management device}
>> +
>> +A virtio device that supports VIRTIO_ADMIN_DEVICE_MGMT and VIRTIO_ADMIN_DEVICE_MGMT_ATTRS admin commands (see
>> +\ref{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE MGMT command} and
>> +\ref{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE MGMT ATTRS command} for more information).
>> +This device can manage a virtio managed device. A device group may contain zero or more management devices.
>> +
>> +A PCI SR-IOV Physical Function based virtio device is an example of a possible virtio management device (for type 1 device group).
>> +
>> +\subsection{Virtio type 1 management device}\label{sec:Introduction / Terminology / Virtio type 1 management device}
>> +
>> +A virtio management device for type 1 device group. This device is a PCI SR-IOV PF that can set \field{dst_type} to 1 (other virtio device in the same device group),
>> +and set \field{vdev_id} to an id that corresponds with one of its managed virtio devices (PCI SR-IOV VFs) for the VIRTIO_ADMIN_DEVICE_MGMT admin command.
>> +
>> +A type 1 device group may contain zero or one management devices.
>> +
>> +\subsection{virtio managed device}\label{sec:Introduction / Terminology / Virtio managed device}
>> +
>> +A virtio device that can be managed by a virtio management device.
>> +A device group may contain zero or more managed devices.
>> +
>> +A PCI SR-IOV Virtual Function based virtio device is an example of a possible virtio managed device (for type 1 group).
>> +
>> +\subsection{virtio type 1 managed device}\label{sec:Introduction / Terminology / Virtio type 1 managed device}
>> +
>> +A virtio managed device for type 1 device group. This device is a PCI SR-IOV VF and is managed by a virtio type 1 management device (virtio PCI SR-IOV PF).
>> +It is implied that all the virtio PCI SR-IOV VFs related to a virtio PCI SR-IOV PF that is virtio type 1 management device are type 1 managed devices.
>> +
>>   \section{Structure Specifications}\label{sec:Structure Specifications}
>>   
>>   Many device and driver in-memory structure layouts are documented using
>> -- 
>> 2.21.0
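
The query and assign steps of the configuration sequence quoted above can be sketched against a small in-memory model of the PF. This is only an illustration under stated assumptions: the operation, resource and error values come from the quoted patch, but struct mock_pf, struct mgmt_reply, the reply kinds, the order of the checks, and treating per_vf_max_msix_count and vfs_total_msix_count as the limits behind VIRTIO_ADMIN_CS_RSC_VAL_INVALID are all hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Operation, resource and error values as listed in the quoted patch. */
enum { MGMT_OP_ASSIGN = 1, MGMT_OP_QUERY = 2 };
enum { MGMT_RSC_MSIX = 0 };
enum {
        CS_ERR_VDEV_IN_USE = 0x00,
        CS_RSC_VAL_INVALID = 0x01,
        CS_RSC_UNSUPPORTED = 0x02,
        CS_OP_UNSUPPORTED  = 0x03,
};

/* Hypothetical outcome classes; generic status codes are not modelled. */
enum reply_kind { REPLY_OK, REPLY_CS_ERROR, REPLY_OTHER_ERROR };

struct mgmt_reply {
        enum reply_kind kind;
        uint8_t cs_status;      /* valid only for REPLY_CS_ERROR */
        uint64_t resource_val;  /* valid only for REPLY_OK */
};

/* Hypothetical model of a type 1 management device (the PF) with 4 VFs. */
#define MOCK_NUM_VFS 4
struct mock_pf {
        uint32_t vfs_total_msix_count;
        uint16_t per_vf_max_msix_count;
        uint16_t vf_msix[MOCK_NUM_VFS + 1]; /* indexed by vdev_id; slot 0 unused */
        bool vf_in_use[MOCK_NUM_VFS + 1];
};

static struct mgmt_reply cs_error(uint8_t code)
{
        struct mgmt_reply r = { REPLY_CS_ERROR, code, 0 };
        return r;
}

/* Device-side handling of VIRTIO_ADMIN_DEVICE_MGMT in the mock. */
static struct mgmt_reply mock_device_mgmt(struct mock_pf *pf, uint64_t vdev_id,
                                          uint8_t op, uint16_t resource,
                                          uint64_t val)
{
        struct mgmt_reply r = { REPLY_OTHER_ERROR, 0, 0 };
        uint64_t others = 0;

        /* A VF's vdev_id equals its VF number, so 1..NumVFs is valid. */
        if (vdev_id < 1 || vdev_id > MOCK_NUM_VFS)
                return r;
        if (resource != MGMT_RSC_MSIX)
                return cs_error(CS_RSC_UNSUPPORTED);

        if (op == MGMT_OP_QUERY) {
                r.kind = REPLY_OK;
                r.resource_val = pf->vf_msix[vdev_id];
                return r;
        }
        if (op != MGMT_OP_ASSIGN)
                return cs_error(CS_OP_UNSUPPORTED);
        if (pf->vf_in_use[vdev_id])
                return cs_error(CS_ERR_VDEV_IN_USE);

        /* Vectors held by the other VFs count against the shared pool. */
        for (unsigned i = 1; i <= MOCK_NUM_VFS; i++)
                if (i != vdev_id)
                        others += pf->vf_msix[i];
        if (val > pf->per_vf_max_msix_count ||
            others + val > pf->vfs_total_msix_count)
                return cs_error(CS_RSC_VAL_INVALID);

        pf->vf_msix[vdev_id] = (uint16_t)val;
        r.kind = REPLY_OK;
        r.resource_val = val;
        return r;
}
```

In this model a driver would query a VF, assign the new count, and only then bind the VF driver, which mirrors the ordering in the quoted list. An out-of-range vdev_id is reported as a generic failure because the patch does not name a specific code for it.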


^ permalink raw reply	[flat|nested] 103+ messages in thread

* [virtio-dev] Re: [virtio-comment] RE: [PATCH v5 2/7] Introduce admin command set
  2022-05-16 21:08     ` [virtio-comment] " Parav Pandit
@ 2022-05-17 10:08       ` Cornelia Huck
  2022-05-18 13:42         ` Max Gurtovoy
  2022-05-17 11:48       ` Michael S. Tsirkin
  1 sibling, 1 reply; 103+ messages in thread
From: Cornelia Huck @ 2022-05-17 10:08 UTC (permalink / raw)
  To: Parav Pandit, Michael S. Tsirkin, Max Gurtovoy
  Cc: jasowang, virtio-comment, virtio-dev, Oren Duer, Shahaf Shuler,
	aadam, virtio

On Mon, May 16 2022, Parav Pandit <parav@nvidia.com> wrote:

> Hi Michael,
>
>> From: Michael S. Tsirkin <mst@redhat.com>
>> Sent: Sunday, May 15, 2022 11:24 AM
>
> [..]
>
>> > +\subsection{VIRTIO ADMIN DEVICE CAPS ACCEPT
>> command}\label{sec:Basic
>> > +Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN
>> > +DEVICE CAPS ACCEPT command}
>> > +
>> > +The VIRTIO_ADMIN_DEVICE_CAPS_ACCEPT command is used by the
>> driver to acknowledge those admin capabilities it understands and wishes to
>> use.
>> 
>> 
>> ok so we have a protocol here, kind of like feature negotiation. Please write
>> its description.
>> e.g. is it ok to change accepted caps? when? can device change its caps etc
>> etc etc.
>> 
>> Avoiding this kind of spec work is exactly why me and jason keep telling you
>> to consider just using features instead. Add a 64 bit admin features field to
>> the PCI transport and be done with it. CCW and MMIO already have feature
>> selector so it's trivial to add feature bits.
>> 
> As we begin to scale with the device, adding more and more registers like this demands more on-device real estate to comply with the PCI standards.
>
> Therefore, things that are queried/accessed rarely or only occasionally are better accessed via a queue interface.
>
> One can argue that admin VQ is proposed only for the mgmt. functions so having this cfg register for PF is enough.
>
> However, AQ may find some usage in the VF/SF themselves down the road.
> Hence, keeping the cap exchange transport this way is more optimal.
>
> Max has called out this AQ rationale in 4 or 5 points in the cover letter.

I'm not against using a queue, but why not use feature bits for
capabilities? As Michael said, the infrastructure for that is already in
place.




^ permalink raw reply	[flat|nested] 103+ messages in thread

* [virtio] Re: [PATCH v5 5/7] Add miscellaneous configuration structure for PCI
  2022-05-15 14:57   ` Michael S. Tsirkin
@ 2022-05-17 10:12     ` Cornelia Huck
  2022-05-18 14:42       ` Max Gurtovoy
  0 siblings, 1 reply; 103+ messages in thread
From: Cornelia Huck @ 2022-05-17 10:12 UTC (permalink / raw)
  To: Michael S. Tsirkin, Max Gurtovoy
  Cc: jasowang, virtio-comment, virtio-dev, oren, parav, shahafs,
	aadam, virtio

On Sun, May 15 2022, "Michael S. Tsirkin" <mst@redhat.com> wrote:

> On Wed, Apr 27, 2022 at 01:58:22AM +0300, Max Gurtovoy wrote:
>> This new structure will be used for adding new miscellaneous registers
>> for a virtio device configuration layout.
>> 
>> For now, only admin_queue_index register is added. Admin virtqueue index
>> does not depend on the device type. Hence, add a PCI capability to read
>> the admin virtqueue index.
>> 
>> Reviewed-by: Parav Pandit <parav@nvidia.com>
>> Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
>
>
> I guess we discussed this but I forgot. Why do we have this new
> structure as opposed to just adding the value at the end of config
> structure? I was kind of hoping that the structure can be
> reused for CCW/MMIO and then we can add more use-cases with
> new transport and device independent structures.
>
> If we keep it transport specific I don't really understand why
> is it useful ...

Nod, just define a common misc_configuration struct and have the
individual transports access it in the way it works best for them.




^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [virtio-comment] RE: [PATCH v5 2/7] Introduce admin command set
  2022-05-16 21:08     ` [virtio-comment] " Parav Pandit
  2022-05-17 10:08       ` [virtio-dev] " Cornelia Huck
@ 2022-05-17 11:48       ` Michael S. Tsirkin
  2022-05-18 14:09         ` Max Gurtovoy
  2022-05-31 20:39         ` Parav Pandit
  1 sibling, 2 replies; 103+ messages in thread
From: Michael S. Tsirkin @ 2022-05-17 11:48 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Max Gurtovoy, jasowang, virtio-comment, cohuck, virtio-dev,
	Oren Duer, Shahaf Shuler, aadam, virtio

On Mon, May 16, 2022 at 09:08:34PM +0000, Parav Pandit wrote:
> Hi Michael,
> 
> > From: Michael S. Tsirkin <mst@redhat.com>
> > Sent: Sunday, May 15, 2022 11:24 AM
> 
> [..]
> 
> > > +\subsection{VIRTIO ADMIN DEVICE CAPS ACCEPT
> > command}\label{sec:Basic
> > > +Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN
> > > +DEVICE CAPS ACCEPT command}
> > > +
> > > +The VIRTIO_ADMIN_DEVICE_CAPS_ACCEPT command is used by the
> > driver to acknowledge those admin capabilities it understands and wishes to
> > use.
> > 
> > 
> > ok so we have a protocol here, kind of like feature negotiation. Please write
> > its description.
> > e.g. is it ok to change accepted caps? when? can device change its caps etc
> > etc etc.
> > 
> > Avoiding this kind of spec work is exactly why me and jason keep telling you
> > to consider just using features instead. Add a 64 bit admin features field to
> > the PCI transport and be done with it. CCW and MMIO already have feature
> > selector so it's trivial to add feature bits.
> > 
> As we begin to scale with the device, adding more and more registers like this demands more on-device real estate to comply to the PCI standards.
> 
> And therefore, things are queried/accessed rare or occasionally, are better accessed via a queue interface.
> 
> One can argue that admin VQ is proposed only for the mgmt. functions so having this cfg register for PF is enough.
> 
> However, AQ may find some usage in the VF/SF themselves down the road.
> Hence, keeping the cap exchange transport this way is more optimal.
> 
> Max has called out this AQ rationale in 4 or 5 points in the cover letter.

Hmm. It's kind of a generic claim though. We never put devices on a diet
trying to conserve registers. There is cost associated with this dance
and that is driver boot time.

I also don't really understand how you can claim you need to save
memory like this and at the same time blindly add a more or less
"just in case" misc config in the config space.
So, not pretty.

And as I said, you will need much more spec work to reach the level
to which features are specified - and note we are not yet happy with
how features are specified either! So it's a moving target.

Maybe put this in features for now, and leave the whole
capability thing for another day?
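
To make the features-based alternative concrete, here is a minimal sketch of how a 64-bit admin features field could be negotiated in the same way as existing feature bits. The bit name and both helper functions are hypothetical and are not taken from any posted patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical admin feature bit, for illustration only. */
#define ADMIN_F_EXAMPLE (UINT64_C(1) << 0)

/* Driver side: accept only the offered bits that the driver understands. */
static uint64_t admin_features_accept(uint64_t device_offered,
                                      uint64_t driver_supported)
{
    return device_offered & driver_supported;
}

/* Device side: an acknowledgement is valid only if it contains no bit
 * that the device never offered. */
static bool admin_features_valid(uint64_t device_offered,
                                 uint64_t driver_acked)
{
    return (driver_acked & ~device_offered) == 0;
}
```

With this shape the negotiation is a single AND on the driver side and a single subset check on the device side; when the accepted set may change is still a question the spec text would have to answer.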

-- 
MST


^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [virtio-comment] RE: [PATCH v5 6/7] Introduce MGMT admin commands
  2022-05-16 21:47     ` Parav Pandit
@ 2022-05-17 12:31       ` Michael S. Tsirkin
  2022-05-18 15:14         ` Max Gurtovoy
  0 siblings, 1 reply; 103+ messages in thread
From: Michael S. Tsirkin @ 2022-05-17 12:31 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Max Gurtovoy, jasowang, virtio-comment, cohuck, virtio-dev,
	Oren Duer, Shahaf Shuler, aadam, virtio

On Mon, May 16, 2022 at 09:47:23PM +0000, Parav Pandit wrote:
> 
> > From: Michael S. Tsirkin <mst@redhat.com>
> > Sent: Sunday, May 15, 2022 10:37 AM
> 
> > > +This value is also returned in the virtio_admin_device_mgmt_result
> > structure.
> > > +Also, a successful operation guarantees that the MSI-X capability
> > > +access by the designated PCI device defined by the PCI specification
> > > +must reflect the new configuration in all relevant fields. For example, by
> > default if the PCI VF has been assigned 4 MSI-X vectors, and
> > VIRTIO_ADMIN_DEVICE_MGMT increases the MSI-X vectors to 8. On this
> > change, reading Table size field of the MSI-X message control register will
> > reflect a value of 7.
> > > +
> > > +It is beyond the scope of the virtio specification to define
> > > necessary synchronization in system software to ensure that a virtio
> > > PCI VF device +interrupt configuration modification is reflected in
> > > the PCI device.
> > 
> > IMHO it is very much in scope of the specification. 
> Many pieces of this system software are not implemented by the virtio specification...
> Not sure, how can it belong to virtio spec.
> 
> > The scope of the
> > specification is to allow device interoperability and this very much fits the bill.
> > 
> How is interoperability affected and between which two entities?

For example, if the OS/driver caches the number of vectors from the MSI-X
capability and that number changes, things will not work. OS/drivers won't
know not to cache values unless we tell them what not to cache.


I'd like to float again the idea of instead exposing a larger number of
vectors than supported. Assigning more vectors than supported will then
fail; drivers already check for that, so they will recover.
This avoids modifying fields that the PCI spec expects to be read-only.
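
The caching concern can be illustrated with a short C sketch. The Table Size field occupies bits 10:0 of the MSI-X Message Control register and is encoded as N - 1, which is why 8 vectors read back as 7 in the quoted patch text. The msix_cache structure and both helpers are hypothetical and only model a driver that remembers the vector count.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bits 10:0 of MSI-X Message Control hold Table Size, encoded as N - 1. */
#define MSIX_MSGCTL_TABLE_SIZE_MASK 0x07FFu

/* Decode the number of MSI-X vectors from a Message Control value. */
static unsigned int msix_vectors_from_msgctl(uint16_t msgctl)
{
    return (unsigned int)(msgctl & MSIX_MSGCTL_TABLE_SIZE_MASK) + 1u;
}

/* Hypothetical driver-side cache of the vector count read at probe time. */
struct msix_cache {
    unsigned int nvec;
};

/* True if the cached count no longer matches what the register reports. */
static bool msix_cache_is_stale(const struct msix_cache *cache,
                                uint16_t msgctl_now)
{
    return cache->nvec != msix_vectors_from_msgctl(msgctl_now);
}
```

A driver that cached 4 vectors at probe time would see a stale cache once the field reads 7, but only if it knows it has to re-read the register.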


-- 
MST


^ permalink raw reply	[flat|nested] 103+ messages in thread

* [virtio-comment] Re: [PATCH v5 1/7] Introduce device group
  2022-05-15 15:25   ` Michael S. Tsirkin
@ 2022-05-18 13:14     ` Max Gurtovoy
  2022-05-18 13:32       ` Cornelia Huck
  0 siblings, 1 reply; 103+ messages in thread
From: Max Gurtovoy @ 2022-05-18 13:14 UTC (permalink / raw)
  To: Michael S. Tsirkin, Jason Wang, Cornelia Huck
  Cc: jasowang, virtio-comment, cohuck, virtio-dev, oren, parav,
	shahafs, aadam, virtio

Hi MST,

On 5/15/2022 6:25 PM, Michael S. Tsirkin wrote:
> On Wed, Apr 27, 2022 at 01:58:18AM +0300, Max Gurtovoy wrote:
>> Each device group has a type. For now, define initial type of device
>> groups: Type 1 - A virtio PCI SR-IOV physical function (PF) and its PCI
>> SR-IOV virtual functions (VFs). This group may contain one or more
>> virtio devices.
>>
>> Each device within a device group has a unique identifier. This
>> identifier is the virtio device id (vdev_id).
>>
>> Reviewed-by: Parav Pandit <parav@nvidia.com>
>> Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
>> ---
>>   introduction.tex | 12 ++++++++++++
>>   1 file changed, 12 insertions(+)
>>
>> diff --git a/introduction.tex b/introduction.tex
>> index 4dc7085..4358ab1 100644
>> --- a/introduction.tex
>> +++ b/introduction.tex
>> @@ -155,6 +155,18 @@ \subsection{Transition from earlier specification drafts}\label{sec:Transition f
>>   sections tagged "Legacy Interface" in the section title.
>>   These highlight the changes made since the earlier drafts.
>>   
>> +\subsection{Device group}\label{sec:Introduction / Terminology / Device group}
>> +
>> +A device group includes one or more virtio devices.
>> +Each virtio device has a unique virtio device id (vdev_id) within a device group. A valid vdev_id is a 64-bit field in the range of
>> +0x0 - 0xFFFFFFFFFFFFFFF0. Vdev_id 0xFFFFFFFFFFFFFFFF is a value that refers to all devices in a device group and isn't a valid vdev_id.
>> +
>> +For now, the supported device groups are:
>> +\begin{enumerate}
>> +\item Type 1 - A virtio PCI SR-IOV physical function (PF) and its PCI SR-IOV virtual functions (VFs). For this group type, the PF device has vdev_id that is equal to 0
>> +and the VF devices have vdev_id's that are equal to their vf_number (according to the PCI SR-IOV specification).
>> +\end{enumerate}
>> +
>>   \section{Structure Specifications}\label{sec:Structure Specifications}
>
> In context of virtualization type 1 already refers to a specific type
> of hypervisor.
>
> I suggest simply "SR-IOV type" - this way users do not need to remember
> special terminology.

This is a 12-line commit that adds a simple definition.

I didn't mention hypervisors here.

I will follow your suggestion and use a name instead of numbers
(although I don't understand how a user who knows how to read the spec
would be confused here), but I would like Jason and Cornelia to ack
this during this review cycle.

Once we get 3 acks on this name, I'll update it for v6.

>
>>   Many device and driver in-memory structure layouts are documented using
>> -- 
>> 2.21.0

This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [virtio-comment] Re: [PATCH v5 1/7] Introduce device group
  2022-05-18 13:14     ` [virtio-comment] " Max Gurtovoy
@ 2022-05-18 13:32       ` Cornelia Huck
  2022-06-01 13:43         ` Max Gurtovoy
  0 siblings, 1 reply; 103+ messages in thread
From: Cornelia Huck @ 2022-05-18 13:32 UTC (permalink / raw)
  To: Max Gurtovoy, Michael S. Tsirkin, Jason Wang
  Cc: jasowang, virtio-comment, virtio-dev, oren, parav, shahafs,
	aadam, virtio

On Wed, May 18 2022, Max Gurtovoy <mgurtovoy@nvidia.com> wrote:

> Hi MST,
>
> On 5/15/2022 6:25 PM, Michael S. Tsirkin wrote:
>> On Wed, Apr 27, 2022 at 01:58:18AM +0300, Max Gurtovoy wrote:
>>> +\subsection{Device group}\label{sec:Introduction / Terminology / Device group}
>>> +
>>> +A device group includes one or more virtio devices.
>>> +Each virtio device has a unique virtio device id (vdev_id) within a device group. A valid vdev_id is a 64-bit field in the range of
>>> +0x0 - 0xFFFFFFFFFFFFFFF0. Vdev_id 0xFFFFFFFFFFFFFFFF is a value that refers to all devices in a device group and isn't a valid vdev_id.
>>> +
>>> +For now, the supported device groups are:
>>> +\begin{enumerate}
>>> +\item Type 1 - A virtio PCI SR-IOV physical function (PF) and its PCI SR-IOV virtual functions (VFs). For this group type, the PF device has vdev_id that is equal to 0
>>> +and the VF devices have vdev_id's that are equal to their vf_number (according to the PCI SR-IOV specification).
>>> +\end{enumerate}
>>> +
>>>   \section{Structure Specifications}\label{sec:Structure Specifications}
>>
>> In context of virtualization type 1 already refers to a specific type
>> of hypervisor.
>>
>> I suggest simply "SR-IOV type" - this way users do not need to remember
>> special terminology.
>
> This is 12 lines addition commit with simple definition.
>
> I didn't mentioned hypervisors here.
>
> I will stick to your suggestion and use name instead of numbers 
> (although I don't understand how can a use that knows how to read spec 
> will be confused here), but I would like Jason and Cornelia to ack on 
> this during this review cycle.
>
> When we'll get 3 acks on this name - I'll update it for v6.

So, do you want to imply some kind of numbering? I don't like "Type 1",
either. If the type needs to be referenced in code, it should have a
#define or such; otherwise, "SR-IOV type" would be fine.
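
As a sketch of what such a #define could look like, together with the vdev_id rules from the quoted patch text: all macro and function names below are hypothetical; only the numeric ranges and the PF/VF mapping come from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical name for the group type that the patch calls "Type 1". */
#define VIRTIO_DEV_GROUP_TYPE_SRIOV 1

/* Ranges as stated in the patch: valid ids are 0x0 - 0xFFFFFFFFFFFFFFF0,
 * and 0xFFFFFFFFFFFFFFFF refers to all devices in the group. */
#define VDEV_ID_MAX UINT64_C(0xFFFFFFFFFFFFFFF0)
#define VDEV_ID_ALL UINT64_C(0xFFFFFFFFFFFFFFFF)
#define VDEV_ID_PF  UINT64_C(0)

/* True if id names a single device within a group. */
static bool vdev_id_is_valid(uint64_t id)
{
    return id <= VDEV_ID_MAX;
}

/* True if id is the "all devices in the group" value. */
static bool vdev_id_is_all(uint64_t id)
{
    return id == VDEV_ID_ALL;
}

/* For the SR-IOV group type, a VF's vdev_id equals its vf_number. */
static uint64_t vdev_id_from_vf_number(uint16_t vf_number)
{
    return vf_number;
}
```

Note that the values between VDEV_ID_MAX and VDEV_ID_ALL are neither valid ids nor the broadcast value, so a check based on these ranges treats them as invalid.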




^ permalink raw reply	[flat|nested] 103+ messages in thread

* [virtio-comment] Re: [PATCH v5 2/7] Introduce admin command set
  2022-05-15 15:23   ` Michael S. Tsirkin
  2022-05-16 21:08     ` [virtio-comment] " Parav Pandit
@ 2022-05-18 13:39     ` Max Gurtovoy
  2022-05-18 13:50       ` [virtio] " Cornelia Huck
  2022-06-20 21:08       ` Michael S. Tsirkin
  1 sibling, 2 replies; 103+ messages in thread
From: Max Gurtovoy @ 2022-05-18 13:39 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: jasowang, virtio-comment, cohuck, virtio-dev, oren, parav,
	shahafs, aadam, virtio


On 5/15/2022 6:23 PM, Michael S. Tsirkin wrote:
> On Wed, Apr 27, 2022 at 01:58:19AM +0300, Max Gurtovoy wrote:
>> This command set is used for essential administrative and management
>> operations.
>>
>> Admin commands should be submitted to a well defined management
>> interface.
>>
>> Reviewed-by: Parav Pandit <parav@nvidia.com>
>> Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
>> ---
>>   admin.tex   | 123 ++++++++++++++++++++++++++++++++++++++++++++++++++++
>>   content.tex |   2 +
>>   2 files changed, 125 insertions(+)
>>   create mode 100644 admin.tex
>>
>> diff --git a/admin.tex b/admin.tex
>> new file mode 100644
>> index 0000000..6725daa
>> --- /dev/null
>> +++ b/admin.tex
>> @@ -0,0 +1,123 @@
>> +\section{Administration command set}\label{sec:Basic Facilities of a Virtio Device / Administration command set}
>> +
>> +The Administration command set (also known as Admin command set) defines the commands that can be issued using a management interface.
>> +This mechanism, for example, can be used by a system administrator that wants to configure a device before it is initialized by its driver.
>> +
>> +All the Admin commands are of the following form:
>> +
>> +\begin{lstlisting}
>> +struct virtio_admin_cmd {
>> +        /* Device-readable part */
>> +        le16 command;
>> +        /*
>> +         * 0 - self
>> +         * 1 - 65535 are reserved
>> +         */
>> +        le16 dst_type;
>> +        /* reserved for common cmd fields */
>> +        u8 reserved[20];
>> +        u8 command_specific_data[];
>> +
>> +        /* Device-writable part */
>> +        u8 status;
>> +        u8 command_specific_error;
>> +        u8 command_specific_result[];
>> +};
>> +\end{lstlisting}
>> +
>> +The following table describes the generic Admin status codes:
>> +
>> +\begin{tabular}{|l|l|l|}
>> +\hline
>> +Opcode & Status & Description \\
>> +\hline \hline
>> +00h   & VIRTIO_ADMIN_STATUS_OK    & successful completion  \\
>> +\hline
>> +01h   & VIRTIO_ADMIN_STATUS_CS_ERR    & command specific error  \\
>> +\hline
>> +02h   & VIRTIO_ADMIN_STATUS_COMMAND_UNSUPPORTED    & unsupported or invalid opcode  \\
>> +\hline
>> +03h   & VIRTIO_ADMIN_STATUS_INVALID_FIELD    & invalid field was set  \\
>> +\hline
>> +\end{tabular}
>> +
>> +The \field{command}, \field{dst_type} and \field{command_specific_data} are
>> +set by the driver, and the device sets the \field{status}, the
>> +\field{command_specific_error} and the \field{command_specific_result},
>> +if needed.
>> +
>> +Reserved common fields are ignored by the device and to be zeroed by the driver.
>> +
>> +The mandatory fields to be set by the driver, for all admin commands, are \field{command} and \field{dst_type}.
>> +
>> +The \field{command} defines the opcode for the command. The value for each command can be found in each command section.
>> +
>> +The \field{dst_type} defines the designated virtio device for the command. This value should be set to 0 (self).
>> +
>> +The \field{command_specific_error} should be inspected by the driver only if \field{status} is set to
>> +VIRTIO_ADMIN_STATUS_CS_ERR by the device. In this case, the content of \field{command_specific_error}
>> +holds the command specific error. If \field{status} is not set to VIRTIO_ADMIN_STATUS_CS_ERR, the
>> +\field{command_specific_error} value is undefined and should be ignored by the driver.
>> +
>> +The following table describes the Admin command set:
>> +
>> +\begin{tabular}{|l|l|l|}
>> +\hline
>> +Opcode & Command & M/O \\
>> +\hline \hline
>> +0000h   & VIRTIO_ADMIN_DEVICE_CAPS_IDENTIFY    & M  \\
>> +\hline
>> +0001h   & VIRTIO_ADMIN_DEVICE_CAPS_ACCEPT    & M  \\
>> +\hline
>> +0002h - 7FFFh   & Generic admin cmds    & -  \\
>> +\hline
>> +8000h - FFFFh   & Reserved    & - \\
>> +\hline
>> +\end{tabular}
>> +
>> +\subsection{VIRTIO ADMIN DEVICE CAPS IDENTIFY command}\label{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE CAPS IDENTIFY command}
> why without _ here?

The PDF generator doesn't like _

If someone fixes it, I'll add the _

>
>> +
>> +The VIRTIO_ADMIN_DEVICE_CAPS_IDENTIFY command has no command specific data set by the driver.
>> +The \field{command} is set to VIRTIO_ADMIN_DEVICE_CAPS_IDENTIFY by the driver.
>> +
>> +The device, upon success, returns a result that describes information about the designated virtio device.
> result really just means a result structure right? let's say so.

"The device, upon success, returns a result structure that describes 
information about the designated virtio device."

is the above ok ?

MST, Jason and Cornelia please ack.

>> +This result is of form:
>> +\begin{lstlisting}
>> +struct virtio_admin_device_caps_identify_result {
>> +       /* Indicates which of the below fields were returned
>> +        * (1 means that field was returned):
> what does this mean "field is returned"? above restult is returned.

It means: if bit i is 1, then the value/values described by bit i are valid.

Is this ok ?

>> +        * Bit 0 - device_admin_caps
>> +        * Bits 1 - 63 - reserved for future fields
>> +        */
>> +       le64 attrs_mask;
>> +       /* This field indicates which of the below admin
>> +        * capabilities are supported by the device:
>> +        * Bits 0 - 63 - reserved for future capabilities.
>> +        */
>> +       le64 device_admin_caps;
>
> so all of the field is reserved?

The 112 bytes below are reserved.

The result is 128B, and at this stage 112B are reserved for future
extensions.

I reduced it from 4k to 128B.

Please ack.

>
>> +       u8 reserved[112];
>> +};
>> +\end{lstlisting}
>> +
>> +\subsection{VIRTIO ADMIN DEVICE CAPS ACCEPT command}\label{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE CAPS ACCEPT command}
>> +
>> +The VIRTIO_ADMIN_DEVICE_CAPS_ACCEPT command is used by the driver to acknowledge those admin capabilities it understands and wishes to use.
>
> ok so we have a protocol here, kind of like feature negotiation. Please write its description.
> e.g. is it ok to change accepted caps? when? can device change its caps
> etc etc etc.

I don't understand what it means to change a cap.

The device can offer a cap and the driver can accept it if it wishes to use it.

That is it.

I added this mechanism only at your request.

I have never seen a device that asks the driver for acceptance, but I did
my best to fulfill your request.

>
> Avoiding this kind of spec work is exactly why me and jason keep telling
> you to consider just using features instead. Add a 64 bit admin features
> field to the PCI transport and be done with it. CCW and MMIO already
> have feature selector so it's trivial to add feature bits.

It's not scalable for an admin mechanism, and I don't want to perform 100
writes/reads to configuration space instead of doing it all in 1 admin command.

>
>
>> +The \field{command} is set to VIRTIO_ADMIN_DEVICE_CAPS_ACCEPT by the driver.
>> +
>> +The command specific data set by the driver is of form:
>> +\begin{lstlisting}
>> +struct virtio_admin_device_caps_accept_data {
>> +       /* Indicates which of the below fields were set
>> +        * (1 means that field is set):
>
> yes we all know that 1 means set.
>
> do you really mean field is valid maybe?
yes valid == set.
>
>
>> +        * Bit 0 - driver_admin_caps
>> +        * Bits 1 - 63 - reserved for future fields
>> +        */
>> +       le64 attrs_mask;
> looks like going overboard. just send 64 caps bits and be done with it.
> and rename accept_data to accept_caps.
this is the command specific data.
>
>> +       /* This field indicates which of the below admin
>> +        * capabilities are supported by the driver:
>> +        * Bits 0 - 63 - reserved for future capabilities.
>> +        */
>> +       le64 driver_admin_caps;
>> +       u8 reserved[112];
>
> I just noticed this. Please do not add this huge amount of padding
> everywhere. instead, explain that device must be ready to accept
> a smaller or larger buffer depending on feature bits.

It's not huge. It's 128B of command data.

We will regret it in the future if we don't make the API extensible.

I prefer to keep it at 128B unless there is a concrete reason not to.

>
>> +};
>> +\end{lstlisting}
>> diff --git a/content.tex b/content.tex
>> index c6f116c..2e1df84 100644
>> --- a/content.tex
>> +++ b/content.tex
>> @@ -449,6 +449,8 @@ \section{Exporting Objects}\label{sec:Basic Facilities of a Virtio Device / Expo
>>   types. It is RECOMMENDED that devices generate version 4
>>   UUIDs as specified by \hyperref[intro:rfc4122]{[RFC4122]}.
>>   
>> +\input{admin.tex}
>> +
>>   \chapter{General Initialization And Device Operation}\label{sec:General Initialization And Device Operation}
>>   
>>   We start with an overview of device initialization, then expand on the
>> -- 
>> 2.21.0
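
The layout and validity rules discussed in this message can be sketched in C as follows. The status values and the result structure mirror the quoted patch; the attrs_mask macro and the two helper functions are hypothetical. The le64 fields are written as uint64_t, so the sketch assumes a little-endian host.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Generic admin status codes, values as listed in the patch's table. */
enum {
    VIRTIO_ADMIN_STATUS_OK                  = 0x00,
    VIRTIO_ADMIN_STATUS_CS_ERR              = 0x01,
    VIRTIO_ADMIN_STATUS_COMMAND_UNSUPPORTED = 0x02,
    VIRTIO_ADMIN_STATUS_INVALID_FIELD       = 0x03,
};

/* Bit 0 of attrs_mask: device_admin_caps is valid (hypothetical name). */
#define CAPS_IDENTIFY_ATTR_DEVICE_ADMIN_CAPS (UINT64_C(1) << 0)

/* Result layout from the patch: 8 + 8 + 112 = 128 bytes. */
struct virtio_admin_device_caps_identify_result {
    uint64_t attrs_mask;        /* le64 in the spec text */
    uint64_t device_admin_caps; /* le64 in the spec text */
    uint8_t  reserved[112];
};

static_assert(sizeof(struct virtio_admin_device_caps_identify_result) == 128,
              "result structure must be 128 bytes");
static_assert(offsetof(struct virtio_admin_device_caps_identify_result,
                       reserved) == 16,
              "reserved area starts after the two le64 fields");

/* Store the caps only if the command succeeded and bit 0 marks them valid. */
static bool identify_get_caps(
    uint8_t status,
    const struct virtio_admin_device_caps_identify_result *res,
    uint64_t *caps)
{
    if (status != VIRTIO_ADMIN_STATUS_OK)
        return false;
    if (!(res->attrs_mask & CAPS_IDENTIFY_ATTR_DEVICE_ADMIN_CAPS))
        return false;
    *caps = res->device_admin_caps;
    return true;
}

/* command_specific_error is defined only when status is CS_ERR. */
static bool cs_error_is_meaningful(uint8_t status)
{
    return status == VIRTIO_ADMIN_STATUS_CS_ERR;
}
```

The two static assertions spell out the arithmetic behind the "128B with 112B reserved" figure mentioned above.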



^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [virtio-comment] RE: [PATCH v5 2/7] Introduce admin command set
  2022-05-17 10:08       ` [virtio-dev] " Cornelia Huck
@ 2022-05-18 13:42         ` Max Gurtovoy
  0 siblings, 0 replies; 103+ messages in thread
From: Max Gurtovoy @ 2022-05-18 13:42 UTC (permalink / raw)
  To: Cornelia Huck, Parav Pandit, Michael S. Tsirkin
  Cc: jasowang, virtio-comment, virtio-dev, Oren Duer, Shahaf Shuler,
	aadam, virtio


On 5/17/2022 1:08 PM, Cornelia Huck wrote:
> On Mon, May 16 2022, Parav Pandit <parav@nvidia.com> wrote:
>
>> Hi Michael,
>>
>>> From: Michael S. Tsirkin <mst@redhat.com>
>>> Sent: Sunday, May 15, 2022 11:24 AM
>> [..]
>>
>>>> +\subsection{VIRTIO ADMIN DEVICE CAPS ACCEPT
>>> command}\label{sec:Basic
>>>> +Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN
>>>> +DEVICE CAPS ACCEPT command}
>>>> +
>>>> +The VIRTIO_ADMIN_DEVICE_CAPS_ACCEPT command is used by the
>>> driver to acknowledge those admin capabilities it understands and wishes to
>>> use.
>>>
>>>
>>> ok so we have a protocol here, kind of like feature negotiation. Please write
>>> its description.
>>> e.g. is it ok to change accepted caps? when? can device change its caps etc
>>> etc etc.
>>>
>>> Avoiding this kind of spec work is exactly why me and jason keep telling you
>>> to consider just using features instead. Add a 64 bit admin features field to
>>> the PCI transport and be done with it. CCW and MMIO already have feature
>>> selector so it's trivial to add feature bits.
>>>
>> As we begin to scale with the device, adding more and more registers like this demands more on-device real estate to comply to the PCI standards.
>>
>> And therefore, things are queried/accessed rare or occasionally, are better accessed via a queue interface.
>>
>> One can argue that admin VQ is proposed only for the mgmt. functions so having this cfg register for PF is enough.
>>
>> However, AQ may find some usage in the VF/SF themselves down the road.
>> Hence, keeping the cap exchange transport this way is more optimal.
>>
>> Max has called out this AQ rationale in 4 or 5 points in the cover letter.
> I'm not against using a queue, but why not use feature bits for
> capabilities? As Michael said, the infrastructure for that is already in
> place.

It's not scalable.

After we add a few features, we'll regret ever having done it.




^ permalink raw reply	[flat|nested] 103+ messages in thread

* [virtio] Re: [virtio-comment] Re: [PATCH v5 2/7] Introduce admin command set
  2022-05-18 13:39     ` [virtio-comment] " Max Gurtovoy
@ 2022-05-18 13:50       ` Cornelia Huck
  2022-05-18 14:16         ` Max Gurtovoy
  2022-06-20 21:08       ` Michael S. Tsirkin
  1 sibling, 1 reply; 103+ messages in thread
From: Cornelia Huck @ 2022-05-18 13:50 UTC (permalink / raw)
  To: Max Gurtovoy, Michael S. Tsirkin
  Cc: jasowang, virtio-comment, virtio-dev, oren, parav, shahafs,
	aadam, virtio

On Wed, May 18 2022, Max Gurtovoy <mgurtovoy@nvidia.com> wrote:

> On 5/15/2022 6:23 PM, Michael S. Tsirkin wrote:
>> On Wed, Apr 27, 2022 at 01:58:19AM +0300, Max Gurtovoy wrote:
>>> This command set is used for essential administrative and management
>>> operations.
>>>
>>> Admin commands should be submitted to a well defined management
>>> interface.
>>>
>>> Reviewed-by: Parav Pandit <parav@nvidia.com>
>>> Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
>>> ---
>>>   admin.tex   | 123 ++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>   content.tex |   2 +
>>>   2 files changed, 125 insertions(+)
>>>   create mode 100644 admin.tex
>>>
>>> +The VIRTIO_ADMIN_DEVICE_CAPS_ACCEPT command is used by the driver to acknowledge those admin capabilities it understands and wishes to use.
>>
>> ok so we have a protocol here, kind of like feature negotiation. Please write its description.
>> e.g. is it ok to change accepted caps? when? can device change its caps
>> etc etc etc.
>
> I don't understand what does this mean to change a cap ?
>
> Device can offer a cap and driver can accept it if it wishes to use it.
>
> That is it.
>
> I added this mechanism just for your request.
>
> I never saw a device that asks acceptance from driver but I did my best 
> to fulfill your request.
>
>>
>> Avoiding this kind of spec work is exactly why me and jason keep telling
>> you to consider just using features instead. Add a 64 bit admin features
>> field to the PCI transport and be done with it. CCW and MMIO already
>> have feature selector so it's trivial to add feature bits.
>
> It's not scalable for admin mechanism and I don't want to perform 100 
> write/read from configuration space instead of doing all in 1 admin command.

Why use the config space for that; just use feature bits, there are
enough of those, and we already have a defined protocol.

>
>>
>>
>>> +The \field{command} is set to VIRTIO_ADMIN_DEVICE_CAPS_ACCEPT by the driver.
>>> +
>>> +The command specific data set by the driver is of form:
>>> +\begin{lstlisting}
>>> +struct virtio_admin_device_caps_accept_data {
>>> +       /* Indicates which of the below fields were set
>>> +        * (1 means that field is set):
>>
>> yes we all know that 1 means set.
>>
>> do you really mean field is valid maybe?
> yes valid == set.
>>
>>
>>> +        * Bit 0 - driver_admin_caps
>>> +        * Bits 1 - 63 - reserved for future fields
>>> +        */
>>> +       le64 attrs_mask;
>> looks like going overboard. just send 64 caps bits and be done with it.
>> and rename accept_data to accept_caps.
> this is the command specific data.
>>
>>> +       /* This field indicates which of the below admin
>>> +        * capabilities are supported by the driver:
>>> +        * Bits 0 - 63 - reserved for future capabilities.
>>> +        */
>>> +       le64 driver_admin_caps;
>>> +       u8 reserved[112];
>>
>> I just noticed this. Please do not add this huge amount of padding
>> everywhere. instead, explain that device must be ready to accept
>> a smaller or larger buffer depending on feature bits.
>
> It's not huge. It's 128B command data.
>
> We will be sorry in the future for not doing extendable API.
>
> I prefer keep it 128B unless there is a concrete reason for not doing so.

So just use a variable length structure, that should be extendable for
all future use cases.
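
One possible reading of the variable-length approach, sketched in C: the receiver parses the fields it knows from whatever length it was given, treats absent fields as zero, and ignores trailing words it does not understand. The accept_caps structure, the parser and its length rule (a multiple of 8 bytes) are assumptions made for illustration, not something proposed in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a little-endian 64-bit value from a byte buffer. */
static uint64_t get_le64(const uint8_t *p)
{
    uint64_t v = 0;
    for (int i = 7; i >= 0; i--)
        v = (v << 8) | p[i];
    return v;
}

/* Hypothetical parsed form of variable-length accept data. */
struct accept_caps {
    uint64_t driver_admin_caps;
};

/* Parse a buffer made of le64 words; the first word is driver_admin_caps.
 * Returns 0 on success, -1 if len is not a multiple of 8. */
static int parse_accept_caps(const uint8_t *buf, size_t len,
                             struct accept_caps *out)
{
    out->driver_admin_caps = 0; /* absent fields default to zero */
    if (len % 8 != 0)
        return -1;
    if (len >= 8)
        out->driver_admin_caps = get_le64(buf);
    /* Words after the first are unknown to this version and ignored. */
    return 0;
}
```

Under this scheme an older receiver keeps working when a newer sender appends words, and a newer receiver sees zeros for words an older sender did not send, without a fixed reserved area.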




^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [virtio-comment] RE: [PATCH v5 2/7] Introduce admin command set
  2022-05-17 11:48       ` Michael S. Tsirkin
@ 2022-05-18 14:09         ` Max Gurtovoy
  2022-05-18 14:42           ` [virtio] " Cornelia Huck
  2022-05-31 20:39         ` Parav Pandit
  1 sibling, 1 reply; 103+ messages in thread
From: Max Gurtovoy @ 2022-05-18 14:09 UTC (permalink / raw)
  To: Michael S. Tsirkin, Parav Pandit
  Cc: jasowang, virtio-comment, cohuck, virtio-dev, Oren Duer,
	Shahaf Shuler, aadam, virtio


On 5/17/2022 2:48 PM, Michael S. Tsirkin wrote:
> On Mon, May 16, 2022 at 09:08:34PM +0000, Parav Pandit wrote:
>> Hi Michael,
>>
>>> From: Michael S. Tsirkin <mst@redhat.com>
>>> Sent: Sunday, May 15, 2022 11:24 AM
>> [..]
>>
>>>> +\subsection{VIRTIO ADMIN DEVICE CAPS ACCEPT
>>> command}\label{sec:Basic
>>>> +Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN
>>>> +DEVICE CAPS ACCEPT command}
>>>> +
>>>> +The VIRTIO_ADMIN_DEVICE_CAPS_ACCEPT command is used by the
>>> driver to acknowledge those admin capabilities it understands and wishes to
>>> use.
>>>
>>>
>>> ok so we have a protocol here, kind of like feature negotiation. Please write
>>> its description.
>>> e.g. is it ok to change accepted caps? when? can device change its caps etc
>>> etc etc.
>>>
>>> Avoiding this kind of spec work is exactly why me and jason keep telling you
>>> to consider just using features instead. Add a 64 bit admin features field to
>>> the PCI transport and be done with it. CCW and MMIO already have feature
>>> selector so it's trivial to add feature bits.
>>>
>> As we begin to scale with the device, adding more and more registers like this demands more on-device real estate to comply to the PCI standards.
>>
>> And therefore, things are queried/accessed rare or occasionally, are better accessed via a queue interface.
>>
>> One can argue that admin VQ is proposed only for the mgmt. functions so having this cfg register for PF is enough.
>>
>> However, AQ may find some usage in the VF/SF themselves down the road.
>> Hence, keeping the cap exchange transport this way is more optimal.
>>
>> Max has called out this AQ rationale in 4 or 5 points in the cover letter.
> Hmm. It's kind of a generic claim though. We never put devices on a diet
> trying to conserve registers. There is cost associated with this dance
> and that is driver boot time.
>
> I also don't really understand how you can claim you need to save
> memory like this and at the same time blindly add a more or less
> "just in case" misc config in the config space.
> So, not pretty.
>
> And as I said, you will need much more spec work to reach the level
> to which features are specified - and note we are not yet happy with
> how features are specified either! So it's a moving target.
>
> Maybe put this in features for now, and leave the whole
> capability thing for another day?

If you're not happy with the feature negotiation in the spec, then please
don't insist that we use this mechanism for admin capabilities.

I don't want to postpone such an essential and basic definition to another day.

We need to agree on it today (even yesterday).

>


^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [virtio-comment] Re: [PATCH v5 2/7] Introduce admin command set
  2022-05-18 13:50       ` [virtio] " Cornelia Huck
@ 2022-05-18 14:16         ` Max Gurtovoy
  2022-06-20 22:26           ` Michael S. Tsirkin
  0 siblings, 1 reply; 103+ messages in thread
From: Max Gurtovoy @ 2022-05-18 14:16 UTC (permalink / raw)
  To: Cornelia Huck, Michael S. Tsirkin
  Cc: jasowang, virtio-comment, virtio-dev, oren, parav, shahafs,
	aadam, virtio


On 5/18/2022 4:50 PM, Cornelia Huck wrote:
> On Wed, May 18 2022, Max Gurtovoy <mgurtovoy@nvidia.com> wrote:
>
>> On 5/15/2022 6:23 PM, Michael S. Tsirkin wrote:
>>> On Wed, Apr 27, 2022 at 01:58:19AM +0300, Max Gurtovoy wrote:
>>>> This command set is used for essential administrative and management
>>>> operations.
>>>>
>>>> Admin commands should be submitted to a well defined management
>>>> interface.
>>>>
>>>> Reviewed-by: Parav Pandit <parav@nvidia.com>
>>>> Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
>>>> ---
>>>>    admin.tex   | 123 ++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>>    content.tex |   2 +
>>>>    2 files changed, 125 insertions(+)
>>>>    create mode 100644 admin.tex
>>>>
>>>> +The VIRTIO_ADMIN_DEVICE_CAPS_ACCEPT command is used by the driver to acknowledge those admin capabilities it understands and wishes to use.
>>> ok so we have a protocol here, kind of like feature negotiation. Please write its description.
>>> e.g. is it ok to change accepted caps? when? can device change its caps
>>> etc etc etc.
>> I don't understand what does this mean to change a cap ?
>>
>> Device can offer a cap and driver can accept it if it wishes to use it.
>>
>> That is it.
>>
>> I added this mechanism just for your request.
>>
>> I never saw a device that asks acceptance from driver but I did my best
>> to fulfill your request.
>>
>>> Avoiding this kind of spec work is exactly why me and jason keep telling
>>> you to consider just using features instead. Add a 64 bit admin features
>>> field to the PCI transport and be done with it. CCW and MMIO already
>>> have feature selector so it's trivial to add feature bits.
>> It's not scalable for admin mechanism and I don't want to perform 100
>> write/read from configuration space instead of doing all in 1 admin command.
> Why use the config space for that; just use feature bits, there are
> enough of those, and we already have a defined protocol.

Can you please propose something concrete,

something that will be scalable and will not add complexity to the feature
negotiation mechanism we have today?

>
>>>
>>>> +The \field{command} is set to VIRTIO_ADMIN_DEVICE_CAPS_ACCEPT by the driver.
>>>> +
>>>> +The command specific data set by the driver is of form:
>>>> +\begin{lstlisting}
>>>> +struct virtio_admin_device_caps_accept_data {
>>>> +       /* Indicates which of the below fields were set
>>>> +        * (1 means that field is set):
>>> yes we all know that 1 means set.
>>>
>>> do you really mean field is valid maybe?
>> yes valid == set.
>>>
>>>> +        * Bit 0 - driver_admin_caps
>>>> +        * Bits 1 - 63 - reserved for future fields
>>>> +        */
>>>> +       le64 attrs_mask;
>>> looks like going overboard. just send 64 caps bits and be done with it.
>>> and rename accept_data to accept_caps.
>> this is the command specific data.
>>>> +       /* This field indicates which of the below admin
>>>> +        * capabilities are supported by the driver:
>>>> +        * Bits 0 - 63 - reserved for future capabilities.
>>>> +        */
>>>> +       le64 driver_admin_caps;
>>>> +       u8 reserved[112];
>>> I just noticed this. Please do not add this huge amount of padding
>>> everywhere. instead, explain that device must be ready to accept
>>> a smaller or larger buffer depending on feature bits.
>> It's not huge. It's 128B command data.
>>
>> We will be sorry in the future for not doing extendable API.
>>
>> I prefer keep it 128B unless there is a concrete reason for not doing so.
> So just use a variable length structure, that should be extendable for
> all future use cases.

I don't know how to develop compatible HW that uses a variable-length
structure.

And why should we, without any good reason?
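
For readers following the fixed-size versus variable-length argument, here is a minimal sketch of the 128-byte command data from the quoted patch. The macro name ADMIN_CAPS_ACCEPT_ATTR_DRIVER_CAPS and the helper caps_accept_init() are hypothetical and do not appear in the patch. The le64 fields are shown as uint64_t, with conversion to little-endian assumed to happen elsewhere.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bit 0 of attrs_mask marks driver_admin_caps as valid (per the quoted patch).
 * The macro name is hypothetical. */
#define ADMIN_CAPS_ACCEPT_ATTR_DRIVER_CAPS (UINT64_C(1) << 0)

/* Fixed-size command data as laid out in the quoted patch. */
struct virtio_admin_device_caps_accept_data {
	uint64_t attrs_mask;        /* which of the fields below are valid */
	uint64_t driver_admin_caps; /* capabilities accepted by the driver */
	uint8_t  reserved[112];     /* zeroed; room for future fields */
} __attribute__((packed));

/* 8 + 8 + 112 bytes: the size stays constant as reserved bytes are consumed. */
static_assert(sizeof(struct virtio_admin_device_caps_accept_data) == 128,
	      "command data must stay 128 bytes");

/* Hypothetical helper: zero everything first so reserved bytes are 0. */
static void caps_accept_init(struct virtio_admin_device_caps_accept_data *d,
			     uint64_t accepted_caps)
{
	memset(d, 0, sizeof(*d));
	d->attrs_mask = ADMIN_CAPS_ACCEPT_ATTR_DRIVER_CAPS;
	d->driver_admin_caps = accepted_caps;
}
```

With this layout a new field would take bytes from reserved[] and a new attrs_mask bit, so the total size does not change. The variable-length alternative raised in review would instead tie the buffer size to negotiated feature bits.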

>


^ permalink raw reply	[flat|nested] 103+ messages in thread

* [virtio-comment] Re: [PATCH v5 3/7] Introduce new destination type for admin commands
  2022-05-15 15:01   ` Michael S. Tsirkin
@ 2022-05-18 14:27     ` Max Gurtovoy
  0 siblings, 0 replies; 103+ messages in thread
From: Max Gurtovoy @ 2022-05-18 14:27 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: jasowang, virtio-comment, cohuck, virtio-dev, oren, parav,
	shahafs, aadam, virtio


On 5/15/2022 6:01 PM, Michael S. Tsirkin wrote:
> On Wed, Apr 27, 2022 at 01:58:20AM +0300, Max Gurtovoy wrote:
>> Introduce a new mechanism to issue commands with dst_type field that is
>> not "self". With the new mechanism, driver can set dst_type to 1
>> and use the vdev_id common field to describe the designated vdev_id.
>>
>> This mechanism is useful for device groups with multiple devices
>> with various different capabilities. For example, a type 1 device group
>> that contains a PCI PF and its VF. For this group, a clever system
>> administrator can use admin commands to manipulate the PF/VF resources.
>>
>> Reviewed-by: Parav Pandit <parav@nvidia.com>
>> Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
>> ---
>>   admin.tex | 18 ++++++++++++++----
>>   1 file changed, 14 insertions(+), 4 deletions(-)
>>
>> diff --git a/admin.tex b/admin.tex
>> index 6725daa..f816c3b 100644
>> --- a/admin.tex
>> +++ b/admin.tex
>> @@ -11,11 +11,13 @@ \section{Administration command set}\label{sec:Basic Facilities of a Virtio Devi
>>           le16 command;
>>           /*
>>            * 0 - self
>> -         * 1 - 65535 are reserved
>> +         * 1 - other virtio device (identified by vdev_id) in the same device group
>> +         * 2 - 65535 are reserved
>>            */
>>           le16 dst_type;
>> +        le64 vdev_id;
>>           /* reserved for common cmd fields */
>> -        u8 reserved[20];
>> +        u8 reserved[12];
>>           u8 command_specific_data[];
>>   
>>           /* Device-writable part */
>> @@ -39,9 +41,11 @@ \section{Administration command set}\label{sec:Basic Facilities of a Virtio Devi
>>   \hline
>>   03h   & VIRTIO_ADMIN_STATUS_INVALID_FIELD    & invalid field was set  \\
>>   \hline
>> +04h   & VIRTIO_ADMIN_STATUS_INVALID_VDEV_ID    & invalid vdev_id was set  \\
>> +\hline
>>   \end{tabular}
>>   
>> -The \field{command}, \field{dst_type} and \field{command_specific_data} are
>> +The \field{command}, \field{dst_type}, \field{vdev_id} and \field{command_specific_data} are
>>   set by the driver, and the device sets the \field{status}, the
>>   \field{command_specific_error} and the \field{command_specific_result},
>>   if needed.
>> @@ -50,9 +54,15 @@ \section{Administration command set}\label{sec:Basic Facilities of a Virtio Devi
>>   
>>   The mandatory fields to be set by the driver, for all admin commands, are \field{command} and \field{dst_type}.
>>   
>> +The optional unused fields to be zeroed by the driver.
>> +
>>   The \field{command} defines the opcode for the command. The value for each command can be found in each command section.
>>   
>> -The \field{dst_type} defines the designated virtio device for the command. This value should be set to 0 (self).
>> +The \field{dst_type} defines the designated virtio device for the command. This value can be set to 0 (self) or 1 (other virtio device in the same device
>> +group) by the driver. Not all the commands allow setting \field{dst_type} to 1. Refer to each command description explicitly
> explicitly means nothing. just avoid using this word.

ok.

>
>> to check whether this operation is allowed.
> what this operation? I think you mean "which commands are allowed".

You suggest: "Refer to each command description to check which commands 
are allowed."

It doesn't sound right.

>> +If \field{dst_type} is set to 0 by the driver, the \field{vdev_id} isn't valid, should be zeroed by the driver and should be ignored by the device.
>> +If \field{dst_type} is set to 1 by the driver, the \field{vdev_id} is valid and used to describe the vdev_id of the designated virtio device (see section
>> +\ref{sec:Introduction / Terminology / Device group} for vdev_id numbering for type 1 device groups).
>
> should for use by conformance sections only. in fact pls add one for
> this.

Please define clearly what should be moved where, and why.


>>   
>>   The \field{command_specific_error} should be inspected by the driver only if \field{status} is set to
>>   VIRTIO_ADMIN_STATUS_CS_ERR by the device. In this case, the content of \field{command_specific_error}
>> -- 
>> 2.21.0



^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [PATCH v5 3/7] Introduce new destination type for admin commands
  2022-05-15 15:09   ` Michael S. Tsirkin
  2022-05-16 21:21     ` Parav Pandit
@ 2022-05-18 14:34     ` Max Gurtovoy
  2022-05-18 23:55       ` Michael S. Tsirkin
  1 sibling, 1 reply; 103+ messages in thread
From: Max Gurtovoy @ 2022-05-18 14:34 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: jasowang, virtio-comment, cohuck, virtio-dev, oren, parav,
	shahafs, aadam, virtio


On 5/15/2022 6:09 PM, Michael S. Tsirkin wrote:
> On Wed, Apr 27, 2022 at 01:58:20AM +0300, Max Gurtovoy wrote:
>> Introduce a new mechanism to issue commands with dst_type field that is
>> not "self". With the new mechanism, driver can set dst_type to 1
>> and use the vdev_id common field to describe the designated vdev_id.
>>
>> This mechanism is useful for device groups with multiple devices
>> with various different capabilities. For example, a type 1 device group
>> that contains a PCI PF and its VF. For this group, a clever system
>> administrator can use admin commands to manipulate the PF/VF resources.
>>
>> Reviewed-by: Parav Pandit <parav@nvidia.com>
>> Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
>> ---
>>   admin.tex | 18 ++++++++++++++----
>>   1 file changed, 14 insertions(+), 4 deletions(-)
>>
>> diff --git a/admin.tex b/admin.tex
>> index 6725daa..f816c3b 100644
>> --- a/admin.tex
>> +++ b/admin.tex
>> @@ -11,11 +11,13 @@ \section{Administration command set}\label{sec:Basic Facilities of a Virtio Devi
>>           le16 command;
>>           /*
>>            * 0 - self
>> -         * 1 - 65535 are reserved
>> +         * 1 - other virtio device (identified by vdev_id) in the same device group
>> +         * 2 - 65535 are reserved
>>            */
>>           le16 dst_type;
>> +        le64 vdev_id;
> Alignment problems. Proposal:
>
> vdev_id
> dst_type
> command

I don't understand what the issue is here. All SW-HW packets should be
__packed__.

Why should I change the order to something that is not intuitive?

>
>
>>           /* reserved for common cmd fields */
>> -        u8 reserved[20];
>> +        u8 reserved[12];
>>           u8 command_specific_data[];
>>   
>>           /* Device-writable part */
>
>> @@ -39,9 +41,11 @@ \section{Administration command set}\label{sec:Basic Facilities of a Virtio Devi
>>   \hline
>>   03h   & VIRTIO_ADMIN_STATUS_INVALID_FIELD    & invalid field was set  \\
>>   \hline
>> +04h   & VIRTIO_ADMIN_STATUS_INVALID_VDEV_ID    & invalid vdev_id was set  \\
>> +\hline
>>   \end{tabular}
>>   
>> -The \field{command}, \field{dst_type} and \field{command_specific_data} are
>> +The \field{command}, \field{dst_type}, \field{vdev_id} and \field{command_specific_data} are
>>   set by the driver, and the device sets the \field{status}, the
>>   \field{command_specific_error} and the \field{command_specific_result},
>>   if needed.
>> @@ -50,9 +54,15 @@ \section{Administration command set}\label{sec:Basic Facilities of a Virtio Devi
>>   
>>   The mandatory fields to be set by the driver, for all admin commands, are \field{command} and \field{dst_type}.
>>   
>> +The optional unused fields to be zeroed by the driver.
>> +
>>   The \field{command} defines the opcode for the command. The value for each command can be found in each command section.
>>   
>> -The \field{dst_type} defines the designated virtio device for the command. This value should be set to 0 (self).
>> +The \field{dst_type} defines the designated virtio device for the command. This value can be set to 0 (self) or 1 (other virtio device in the same device
>> +group) by the driver. Not all the commands allow setting \field{dst_type} to 1. Refer to each command description explicitly to check whether this operation is allowed.
>> +If \field{dst_type} is set to 0 by the driver, the \field{vdev_id} isn't valid, should be zeroed by the driver and should be ignored by the device.
>> +If \field{dst_type} is set to 1 by the driver, the \field{vdev_id} is valid and used to describe the vdev_id of the designated virtio device (see section
>> +\ref{sec:Introduction / Terminology / Device group} for vdev_id numbering for type 1 device groups).
>>   
>>   The \field{command_specific_error} should be inspected by the driver only if \field{status} is set to
>>   VIRTIO_ADMIN_STATUS_CS_ERR by the device. In this case, the content of \field{command_specific_error}
>
> terminology is a bit inconsistent. would dst_id be better? it goes
> together with dst_type after all.

I don't think dst_id is better than vdev_id. Sorry.

vdev_id is a unique identifier of a device inside a device group.
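
As an illustration of the dst_type and vdev_id rules in the quoted patch text, here is a minimal sketch of how a driver might fill the driver-written common fields. The names admin_cmd_hdr, admin_hdr_fill, ADMIN_DST_SELF and ADMIN_DST_OTHER are hypothetical. Only the fields visible in the quoted hunk are shown, and endianness conversion is assumed to happen elsewhere.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* dst_type values from the quoted patch; the enumerator names are hypothetical. */
enum {
	ADMIN_DST_SELF  = 0, /* command targets the device itself */
	ADMIN_DST_OTHER = 1, /* other virtio device in the same device group */
};

/* Driver-written common fields visible in the quoted hunk. */
struct admin_cmd_hdr {
	uint16_t command;
	uint16_t dst_type;
	uint64_t vdev_id;
	uint8_t  reserved[12];
} __attribute__((packed));

/* Returns 0 on success, -1 if dst_type is in the reserved range (2 - 65535). */
static int admin_hdr_fill(struct admin_cmd_hdr *h, uint16_t command,
			  uint16_t dst_type, uint64_t vdev_id)
{
	if (dst_type > ADMIN_DST_OTHER)
		return -1;
	memset(h, 0, sizeof(*h)); /* unused and reserved fields are zeroed */
	h->command = command;
	h->dst_type = dst_type;
	/* vdev_id is only valid for dst_type 1; for "self" it stays zero. */
	h->vdev_id = (dst_type == ADMIN_DST_OTHER) ? vdev_id : 0;
	return 0;
}
```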

>
>> -- 
>> 2.21.0


^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [PATCH v5 3/7] Introduce new destination type for admin commands
  2022-05-16 23:33       ` Michael S. Tsirkin
@ 2022-05-18 14:36         ` Max Gurtovoy
  0 siblings, 0 replies; 103+ messages in thread
From: Max Gurtovoy @ 2022-05-18 14:36 UTC (permalink / raw)
  To: Michael S. Tsirkin, Parav Pandit
  Cc: jasowang, virtio-comment, cohuck, virtio-dev, Oren Duer,
	Shahaf Shuler, aadam, virtio


On 5/17/2022 2:33 AM, Michael S. Tsirkin wrote:
> On Mon, May 16, 2022 at 09:21:11PM +0000, Parav Pandit wrote:
>>> From: Michael S. Tsirkin <mst@redhat.com>
>>> Sent: Sunday, May 15, 2022 11:09 AM
>>>
>>>> ---
>>>>   admin.tex | 18 ++++++++++++++----
>>>>   1 file changed, 14 insertions(+), 4 deletions(-)
>>>>
>>>> diff --git a/admin.tex b/admin.tex
>>>> index 6725daa..f816c3b 100644
>>>> --- a/admin.tex
>>>> +++ b/admin.tex
>>>> @@ -11,11 +11,13 @@ \section{Administration command
>>> set}\label{sec:Basic Facilities of a Virtio Devi
>>>>           le16 command;
>>>>           /*
>>>>            * 0 - self
>>>> -         * 1 - 65535 are reserved
>>>> +         * 1 - other virtio device (identified by vdev_id) in the same device
>>> group
>>>> +         * 2 - 65535 are reserved
>>>>            */
>>>>           le16 dst_type;
>>>> +        le64 vdev_id;
>>> Alignment problems. Proposal:
>>>
>>> vdev_id
>>> dst_type
>>> command
>> I remember that this has come up internal review as well.
>> Though it is certainly good to naturally align, I don't think we have alignment problem spec wise based on below snippet of the spec [1].
>> It was kind of counter intuitive to see vdev_id before seeing what the actual command is.
>> That way current layout made more sense.
> vdev_id and dst_type tell you where to send the command. They do not
> depend on the command. In fact vdev_id -> dst_id might make sense.

vdev_id -> dst_id ??

What is this supposed to mean? Do you have an example of -> in the spec?

>
>> There was some internal version or external, do not recall, where vdev_id was optional or was union.
>>
>> But I think if we always going have vdev_id, it can be naturally aligned like your above proposal.
>>
>> [1] "Structure Specifications
>> Many device and driver in-memory structure layouts are documented using the C struct syntax. All structures
>> are assumed to be without additional padding. To stress this, cases where common C compilers are known
>> to insert extra padding within structures are tagged using the GNU C __attribute__((packed)) syntax."
> These are pathological cases due to legacy. Better avoided.
>
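
To make the alignment point concrete, here is a small sketch comparing the field order from the patch with the order proposed in review. The struct names hdr_as_posted and hdr_reordered are hypothetical. The sketch assumes that command is the first field of the driver-written part, which the quoted hunk does not show.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Order from the quoted patch. It relies on packing: vdev_id lands at
 * offset 4, which is not a multiple of its 8-byte size. */
struct hdr_as_posted {
	uint16_t command;
	uint16_t dst_type;
	uint64_t vdev_id;
	uint8_t  reserved[12];
} __attribute__((packed));

/* Order proposed in review: each field sits at a multiple of its own size,
 * so no packed attribute is needed. */
struct hdr_reordered {
	uint64_t vdev_id;
	uint16_t dst_type;
	uint16_t command;
	uint8_t  reserved[12];
};

static_assert(sizeof(struct hdr_as_posted) == 24, "packed size");
static_assert(offsetof(struct hdr_as_posted, vdev_id) == 4, "misaligned le64");
static_assert(sizeof(struct hdr_reordered) == 24, "same size when reordered");
static_assert(offsetof(struct hdr_reordered, vdev_id) % 8 == 0, "aligned le64");
```

Both layouts occupy the same 24 bytes. The difference is only whether the 64-bit field can be accessed as a naturally aligned word.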


^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [PATCH v5 4/7] Introduce virtio admin virtqueue
  2022-05-15 14:59   ` Michael S. Tsirkin
@ 2022-05-18 14:37     ` Max Gurtovoy
  2022-05-18 23:56       ` Michael S. Tsirkin
  0 siblings, 1 reply; 103+ messages in thread
From: Max Gurtovoy @ 2022-05-18 14:37 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: jasowang, virtio-comment, cohuck, virtio-dev, oren, parav,
	shahafs, aadam, virtio


On 5/15/2022 5:59 PM, Michael S. Tsirkin wrote:
> On Wed, Apr 27, 2022 at 01:58:21AM +0300, Max Gurtovoy wrote:
>> In one of the many use cases a user wants to manipulate features and
>> configuration of the virtio devices regardless of the device type
>> (net/block/console). For that the admin command set introduced. The
>> admin virtqueue will be the first management interface to issue admin
>> commands.
>>
>> Currently virtio specification defines control virtqueue to manipulate
>> features and configuration of the device it operates on. However,
>> control virtqueue commands are device type specific, which makes it very
>> difficult to extend for device agnostic commands.
>>
>> To support this requirement in elegant way, this patch introduces a new
>> admin virtqueue interface.
>>
>> Manipulate features via admin virtqueue is asynchronous, scalable, easy
>> to extend and doesn't require additional and expensive on-die resources
>> to be allocated for every new feature that will be added in the future.
>>
>> Reviewed-by: Parav Pandit <parav@nvidia.com>
>> Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
>> ---
>>   admin.tex       | 17 +++++++++++++++++
>>   conformance.tex |  1 +
>>   content.tex     |  6 ++++--
>>   3 files changed, 22 insertions(+), 2 deletions(-)
>>
>> diff --git a/admin.tex b/admin.tex
>> index f816c3b..d09683d 100644
>> --- a/admin.tex
>> +++ b/admin.tex
>> @@ -131,3 +131,20 @@ \subsection{VIRTIO ADMIN DEVICE CAPS ACCEPT command}\label{sec:Basic Facilities
>>          u8 reserved[112];
>>   };
>>   \end{lstlisting}
>> +
>> +\section{Admin Virtqueues}\label{sec:Basic Facilities of a Virtio Device / Admin Virtqueues}
>> +
>> +An admin virtqueue is a management interface of a device that can be used to send administrative
>> +commands (see \ref{sec:Basic Facilities of a Virtio Device / Administration command set}) to manipulate
>> +various features of the device and/or to manipulate various features, if possible, of another device.
>> +
>> +An admin virtqueue exists for a certain device if VIRTIO_F_ADMIN_VQ feature is
>> +negotiated. The index of the admin virtqueue is exposed by the device in a
>> +transport specific manner.
>> +
>> +If VIRTIO_F_ADMIN_VQ has been negotiated, the driver will use the admin virtqueue to send all admin commands.
>> +
>> +\devicenormative{\subsection}{Admin Virtqueues}{Basic Facilities of a Virtio Device / Admin Virtqueues}
>> +A device that advertises VIRTIO_F_ADMIN_VQ capability MUST support all the mandatory admin commands.
>
> don't see where these are defined.

In the command table (M - mandatory, O - optional)
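
A minimal sketch of how the M/O column could be consulted in code, using only the opcode rows quoted in this thread. The helper names are hypothetical, and opcode 0000h is left out because its row is not visible in the quoted hunks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Opcodes and M/O markings as they appear in the quoted table rows. */
#define VIRTIO_ADMIN_DEVICE_CAPS_ACCEPT 0x0001 /* M */
#define VIRTIO_ADMIN_DEVICE_MGMT        0x0002 /* O */
#define VIRTIO_ADMIN_DEVICE_MGMT_ATTRS  0x0003 /* O */

/* Hypothetical helper: true if the opcode is marked M in the quoted rows. */
static bool admin_cmd_is_mandatory(uint16_t opcode)
{
	return opcode == VIRTIO_ADMIN_DEVICE_CAPS_ACCEPT;
}

/* Hypothetical helper: true for 8000h - FFFFh, which the table reserves. */
static bool admin_cmd_is_reserved(uint16_t opcode)
{
	return opcode >= 0x8000;
}
```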


>
>> +
>> +A device that advertises VIRTIO_F_ADMIN_VQ capability MAY support one or more optional admin commands.
>> diff --git a/conformance.tex b/conformance.tex
>> index 9807c30..3c7b7bc 100644
>> --- a/conformance.tex
>> +++ b/conformance.tex
>> @@ -342,6 +342,7 @@ \section{Conformance Targets}\label{sec:Conformance / Conformance Targets}
>>   \item \ref{devicenormative:Basic Facilities of a Virtio Device / Virtqueues / Available Buffer Notification Suppression}
>>   \item \ref{devicenormative:Basic Facilities of a Virtio Device / Shared Memory Regions}
>>   \item \ref{devicenormative:Reserved Feature Bits}
>> +\item \ref{devicenormative:Basic Facilities of a Virtio Device / Admin Virtqueues}
>>   \end{itemize}
>>   
>>   \conformance{\subsection}{PCI Device Conformance}\label{sec:Conformance / Device Conformance / PCI Device Conformance}
>> diff --git a/content.tex b/content.tex
>> index 2e1df84..163cb34 100644
>> --- a/content.tex
>> +++ b/content.tex
>> @@ -99,10 +99,10 @@ \section{Feature Bits}\label{sec:Basic Facilities of a Virtio Device / Feature B
>>   \begin{description}
>>   \item[0 to 23, and 50 to 127] Feature bits for the specific device type
>>   
>> -\item[24 to 40] Feature bits reserved for extensions to the queue and
>> +\item[24 to 41] Feature bits reserved for extensions to the queue and
>>     feature negotiation mechanisms
>>   
>> -\item[41 to 49, and 128 and above] Feature bits reserved for future extensions.
>> +\item[42 to 49, and 128 and above] Feature bits reserved for future extensions.
>>   \end{description}
>>   
>>   \begin{note}
>> @@ -6849,6 +6849,8 @@ \chapter{Reserved Feature Bits}\label{sec:Reserved Feature Bits}
>>     that the driver can reset a queue individually.
>>     See \ref{sec:Basic Facilities of a Virtio Device / Virtqueues / Virtqueue Reset}.
>>   
>> +  \item[VIRTIO_F_ADMIN_VQ (41)] This feature indicates that an administration virtqueue is supported.
>> +
>>   \end{description}
>>   
>>   \drivernormative{\section}{Reserved Feature Bits}{Reserved Feature Bits}
>> -- 
>> 2.21.0
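
The feature-bit changes in the patch above can be sketched as follows. The helper names has_admin_vq and classify_feature_bit and the enum are hypothetical. Holding features 0 to 63 in a single 64-bit word is an assumption of this sketch, not something the patch prescribes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define VIRTIO_F_ADMIN_VQ 41 /* bit number from the quoted patch */

/* Hypothetical helper: test the bit in a 64-bit negotiated feature word. */
static bool has_admin_vq(uint64_t negotiated)
{
	return (negotiated >> VIRTIO_F_ADMIN_VQ) & 1;
}

enum feature_class {
	FEATURE_DEVICE_TYPE, /* 0 to 23, and 50 to 127 */
	FEATURE_QUEUE_EXT,   /* 24 to 41 (was 24 to 40 before the patch) */
	FEATURE_RESERVED,    /* 42 to 49, and 128 and above */
};

/* Classify a feature bit using the ranges as changed by the patch. */
static enum feature_class classify_feature_bit(unsigned int bit)
{
	if (bit <= 23 || (bit >= 50 && bit <= 127))
		return FEATURE_DEVICE_TYPE;
	if (bit >= 24 && bit <= 41)
		return FEATURE_QUEUE_EXT;
	return FEATURE_RESERVED;
}
```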


^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [PATCH v5 5/7] Add miscellaneous configuration structure for PCI
  2022-05-17 10:12     ` [virtio] " Cornelia Huck
@ 2022-05-18 14:42       ` Max Gurtovoy
  2022-05-18 23:58         ` Michael S. Tsirkin
  0 siblings, 1 reply; 103+ messages in thread
From: Max Gurtovoy @ 2022-05-18 14:42 UTC (permalink / raw)
  To: Cornelia Huck, Michael S. Tsirkin
  Cc: jasowang, virtio-comment, virtio-dev, oren, parav, shahafs,
	aadam, virtio


On 5/17/2022 1:12 PM, Cornelia Huck wrote:
> On Sun, May 15 2022, "Michael S. Tsirkin" <mst@redhat.com> wrote:
>
>> On Wed, Apr 27, 2022 at 01:58:22AM +0300, Max Gurtovoy wrote:
>>> This new structure will be used for adding new miscellaneous registers
>>> for a virtio device configuration layout.
>>>
>>> For now, only admin_queue_index register is added. Admin virtqueue index
>>> does not depend on the device type. Hence, add a PCI capability to read
>>> the admin virtqueue index.
>>>
>>> Reviewed-by: Parav Pandit <parav@nvidia.com>
>>> Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
>>
>> I guess we discussed this but I forgot. Why do we have this new
>> structure as opposed to just adding the value at the end of config
>> structure? I was kind of hoping that the structure can be
>> reused for CCW/MMIO and then we can add more use-cases with
>> new transport and device independent structures.
>>
>> If we keep it transport specific I don't really understand why
>> is it useful ...
> Nod, just define a common misc_configuration struct and have the
> individual transports access it in the way it works best for them.

Can we agree on it?

I think we got 3 acks on this in the past. Are we opening this topic again?


^ permalink raw reply	[flat|nested] 103+ messages in thread

* [virtio] Re: [virtio-comment] RE: [PATCH v5 2/7] Introduce admin command set
  2022-05-18 14:09         ` Max Gurtovoy
@ 2022-05-18 14:42           ` Cornelia Huck
  2022-05-18 14:48             ` Max Gurtovoy
  0 siblings, 1 reply; 103+ messages in thread
From: Cornelia Huck @ 2022-05-18 14:42 UTC (permalink / raw)
  To: Max Gurtovoy, Michael S. Tsirkin, Parav Pandit
  Cc: jasowang, virtio-comment, virtio-dev, Oren Duer, Shahaf Shuler,
	aadam, virtio

On Wed, May 18 2022, Max Gurtovoy <mgurtovoy@nvidia.com> wrote:

> On 5/17/2022 2:48 PM, Michael S. Tsirkin wrote:
>> On Mon, May 16, 2022 at 09:08:34PM +0000, Parav Pandit wrote:
>>> Hi Michael,
>>>
>>>> From: Michael S. Tsirkin <mst@redhat.com>
>>>> Sent: Sunday, May 15, 2022 11:24 AM
>>> [..]
>>>
>>>>> +\subsection{VIRTIO ADMIN DEVICE CAPS ACCEPT
>>>> command}\label{sec:Basic
>>>>> +Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN
>>>>> +DEVICE CAPS ACCEPT command}
>>>>> +
>>>>> +The VIRTIO_ADMIN_DEVICE_CAPS_ACCEPT command is used by the
>>>> driver to acknowledge those admin capabilities it understands and wishes to
>>>> use.
>>>>
>>>>
>>>> ok so we have a protocol here, kind of like feature negotiation. Please write
>>>> its description.
>>>> e.g. is it ok to change accepted caps? when? can device change its caps etc
>>>> etc etc.
>>>>
>>>> Avoiding this kind of spec work is exactly why me and jason keep telling you
>>>> to consider just using features instead. Add a 64 bit admin features field to
>>>> the PCI transport and be done with it. CCW and MMIO already have feature
>>>> selector so it's trivial to add feature bits.
>>>>
>>> As we begin to scale with the device, adding more and more registers like this demands more on-device real estate to comply to the PCI standards.
>>>
>>> And therefore, things are queried/accessed rare or occasionally, are better accessed via a queue interface.
>>>
>>> One can argue that admin VQ is proposed only for the mgmt. functions so having this cfg register for PF is enough.
>>>
>>> However, AQ may find some usage in the VF/SF themselves down the road.
>>> Hence, keeping the cap exchange transport this way is more optimal.
>>>
>>> Max has called out this AQ rationale in 4 or 5 points in the cover letter.
>> Hmm. It's kind of a generic claim though. We never put devices on a diet
>> trying to conserve registers. There is cost associated with this dance
>> and that is driver boot time.
>>
>> I also don't really understand how you can claim you need to save
>> memory like this and at the same time blindly add a more or less
>> "just in case" misc config in the config space.
>> So, not pretty.
>>
>> And as I said, you will need much more spec work to reach the level
>> to which features are specified - and note we are not yet happy with
>> how features are specified either! So it's a moving target.
>>
>> Maybe put this in features for now, and leave the whole
>> capability thing for another day?
>
> If you're not happy on the feature negotiation in the spec so please 
> don't insist we use this mechanism for admin capabilities.
>
> I don't want to postpone essential and basic definition to another day.
>
> We need to agree on it today (even yesterday).
>
>>

Sorry, we will not agree on this today. And certainly not yesterday.

I'll pull out of this discussion. If something is agreed, I'll vote on
it.




^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [virtio-comment] RE: [PATCH v5 2/7] Introduce admin command set
  2022-05-18 14:42           ` [virtio] " Cornelia Huck
@ 2022-05-18 14:48             ` Max Gurtovoy
  0 siblings, 0 replies; 103+ messages in thread
From: Max Gurtovoy @ 2022-05-18 14:48 UTC (permalink / raw)
  To: Cornelia Huck, Michael S. Tsirkin, Parav Pandit
  Cc: jasowang, virtio-comment, virtio-dev, Oren Duer, Shahaf Shuler,
	aadam, virtio


On 5/18/2022 5:42 PM, Cornelia Huck wrote:
> On Wed, May 18 2022, Max Gurtovoy <mgurtovoy@nvidia.com> wrote:
>
>> On 5/17/2022 2:48 PM, Michael S. Tsirkin wrote:
>>> On Mon, May 16, 2022 at 09:08:34PM +0000, Parav Pandit wrote:
>>>> Hi Michael,
>>>>
>>>>> From: Michael S. Tsirkin <mst@redhat.com>
>>>>> Sent: Sunday, May 15, 2022 11:24 AM
>>>> [..]
>>>>
>>>>>> +\subsection{VIRTIO ADMIN DEVICE CAPS ACCEPT
>>>>> command}\label{sec:Basic
>>>>>> +Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN
>>>>>> +DEVICE CAPS ACCEPT command}
>>>>>> +
>>>>>> +The VIRTIO_ADMIN_DEVICE_CAPS_ACCEPT command is used by the
>>>>> driver to acknowledge those admin capabilities it understands and wishes to
>>>>> use.
>>>>>
>>>>>
>>>>> ok so we have a protocol here, kind of like feature negotiation. Please write
>>>>> its description.
>>>>> e.g. is it ok to change accepted caps? when? can device change its caps etc
>>>>> etc etc.
>>>>>
>>>>> Avoiding this kind of spec work is exactly why me and jason keep telling you
>>>>> to consider just using features instead. Add a 64 bit admin features field to
>>>>> the PCI transport and be done with it. CCW and MMIO already have feature
>>>>> selector so it's trivial to add feature bits.
>>>>>
>>>> As we begin to scale with the device, adding more and more registers like this demands more on-device real estate to comply to the PCI standards.
>>>>
>>>> And therefore, things are queried/accessed rare or occasionally, are better accessed via a queue interface.
>>>>
>>>> One can argue that admin VQ is proposed only for the mgmt. functions so having this cfg register for PF is enough.
>>>>
>>>> However, AQ may find some usage in the VF/SF themselves down the road.
>>>> Hence, keeping the cap exchange transport this way is more optimal.
>>>>
>>>> Max has called out this AQ rationale in 4 or 5 points in the cover letter.
>>> Hmm. It's kind of a generic claim though. We never put devices on a diet
>>> trying to conserve registers. There is cost associated with this dance
>>> and that is driver boot time.
>>>
>>> I also don't really understand how you can claim you need to save
>>> memory like this and at the same time blindly add a more or less
>>> "just in case" misc config in the config space.
>>> So, not pretty.
>>>
>>> And as I said, you will need much more spec work to reach the level
>>> to which features are specified - and note we are not yet happy with
>>> how features are specified either! So it's a moving target.
>>>
>>> Maybe put this in features for now, and leave the whole
>>> capability thing for another day?
>> If you're not happy on the feature negotiation in the spec so please
>> don't insist we use this mechanism for admin capabilities.
>>
>> I don't want to postpone essential and basic definition to another day.
>>
>> We need to agree on it today (even yesterday).
>>
> Sorry, we will not agree on this today. And certainly not yesterday.
>
> I'll pull out of this discussion. If something is agreed, I'll vote on
> it.

I'm trying to get 3 acks (MST's, Jason's and yours) on each patch to ease
the voting process.

We can continue sending versions until v40, but what will it be worth if the
reviewers nack at vote time?


^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [PATCH v5 6/7] Introduce MGMT admin commands
  2022-05-15 14:37   ` Michael S. Tsirkin
  2022-05-16 21:47     ` Parav Pandit
  2022-05-17  2:28     ` Jason Wang
@ 2022-05-18 15:03     ` Max Gurtovoy
  2022-06-20  9:45       ` Michael S. Tsirkin
  2 siblings, 1 reply; 103+ messages in thread
From: Max Gurtovoy @ 2022-05-18 15:03 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: jasowang, virtio-comment, cohuck, virtio-dev, oren, parav,
	shahafs, aadam, virtio


On 5/15/2022 5:37 PM, Michael S. Tsirkin wrote:
> On Wed, Apr 27, 2022 at 01:58:23AM +0300, Max Gurtovoy wrote:
>> Introduce the concept of a management and a managed device and add
>> example of using this concept to manage resources.
>>
>> A management device supports the VIRTIO_ADMIN_DEVICE_MGMT and
>> VIRTIO_ADMIN_DEVICE_MGMT_ATTRS admin commands to manage some resources
>> of a managed device.
>>
>> A typical cloud provider SR-IOV use case is to create many VFs for use
>> by guest VMs. The VFs may not be assigned to a VM until a user requests
>> a VM of a certain size, e.g., number of CPUs. A VF may need MSI-X
>> vectors proportional to the number of CPUs in the VM, but there is no
>> standard way today in the spec to change the number of MSI-X vectors
>> supported by a VF, although there are some operating systems that
>> support this.
>>
>> The new admin mechanism manages the MSI-X interrupt vectors assignments
>> of a managed PCI device (i.e. VF) by its management devices (i.e. its
>> parent PF) but can easily extended to any other generic resource
>> management.
>>
>> Reviewed-by: Parav Pandit <parav@nvidia.com>
>> Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
>
> I'd like to see msix and the concept of type 1 group
> in a separate patch from MSIX.
ok.
>
> I am not sure MSIX things are ready but the grouping part looks mostly
> ok to me.
>
>> ---
>>   admin.tex        | 132 +++++++++++++++++++++++++++++++++++++++++++++--
>>   content.tex      |  81 +++++++++++++++++++++++++++++
>>   introduction.tex |  32 +++++++++++-
>>   3 files changed, 241 insertions(+), 4 deletions(-)
>>
>> diff --git a/admin.tex b/admin.tex
>> index d09683d..5b54743 100644
>> --- a/admin.tex
>> +++ b/admin.tex
>> @@ -79,12 +79,20 @@ \section{Administration command set}\label{sec:Basic Facilities of a Virtio Devi
>>   \hline
>>   0001h   & VIRTIO_ADMIN_DEVICE_CAPS_ACCEPT    & M  \\
>>   \hline
>> -0002h - 7FFFh   & Generic admin cmds    & -  \\
>> +0002h   & VIRTIO_ADMIN_DEVICE_MGMT    & O  \\
>> +\hline
>> +0003h   & VIRTIO_ADMIN_DEVICE_MGMT_ATTRS    & O  \\
>> +\hline
>> +0004h - 7FFFh   & Generic admin cmds    & -  \\
>>   \hline
>>   8000h - FFFFh   & Reserved    & - \\
>>   \hline
>>   \end{tabular}
>>   
>> +\begin{note}
>> +{The following commands are mandatory for management devices: VIRTIO_ADMIN_DEVICE_MGMT and VIRTIO_ADMIN_DEVICE_MGMT_ATTRS.}
>> +\end{note}
>> +
>>   \subsection{VIRTIO ADMIN DEVICE CAPS IDENTIFY command}\label{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE CAPS IDENTIFY command}
>>   
>>   The VIRTIO_ADMIN_DEVICE_CAPS_IDENTIFY command has no command specific data set by the driver.
>> @@ -102,13 +110,20 @@ \subsection{VIRTIO ADMIN DEVICE CAPS IDENTIFY command}\label{sec:Basic Facilitie
>>          le64 attrs_mask;
>>          /* This field indicates which of the below admin
>>           * capabilities are supported by the device:
>> -        * Bits 0 - 63 - reserved for future capabilities.
>> +        * Bit 0 - if set, the device is a management device
>> +        * Bit 1 - if set, the device is a type 1 management device that supports
>> +        *         MSI-X vector mgmt of its type 1 managed devices
>> +        * Bits 2 - 63 - reserved for future capabilities.
>>           */
>>          le64 device_admin_caps;
>>          u8 reserved[112];
>>   };
>>   \end{lstlisting}
>>   
>> +\begin{note}
>> +{For more details on MSI-X vector management support see section \ref{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI-specific Admin command set / MSI-X vector management}.}
>> +\end{note}
>> +
>>   \subsection{VIRTIO ADMIN DEVICE CAPS ACCEPT command}\label{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE CAPS ACCEPT command}
>>   
>>   The VIRTIO_ADMIN_DEVICE_CAPS_ACCEPT command is used by the driver to acknowledge those admin capabilities it understands and wishes to use.
>> @@ -125,13 +140,124 @@ \subsection{VIRTIO ADMIN DEVICE CAPS ACCEPT command}\label{sec:Basic Facilities
>>          le64 attrs_mask;
>>          /* This field indicates which of the below admin
>>           * capabilities are supported by the driver:
>> -        * Bits 0 - 63 - reserved for future capabilities.
>> +        * Bit 0 - if set, the driver accepted the device as a management device
>> +        * Bit 1 - if set, the driver accepted the device as a type 1 management device
>> +        *         that supports MSI-X vector mgmt of its type 1 managed devices
>> +        * Bits 2 - 63 - reserved for future capabilities.
>>           */
>>          le64 driver_admin_caps;
>>          u8 reserved[112];
>>   };
>>   \end{lstlisting}
>>   
>> +\subsection{VIRTIO ADMIN DEVICE MGMT command}\label{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE MGMT command}
>> +
>> +The VIRTIO_ADMIN_DEVICE_MGMT command is used by a management device to manage resources of managed virtio devices.
>> +The \field{command} is set to VIRTIO_ADMIN_DEVICE_MGMT by the driver.
>> +
>> +The command specific data set by the driver is of form:
>> +\begin{lstlisting}
>> +struct virtio_admin_device_mgmt_data {
>> +        /*
>> +         * 0 - reserved
>> +         * 1 - assign resource to the designated vdev_id
>> +         * 2 - query resource of the designated vdev_id
>> +         * 3 - 255 are reserved
>> +         */
>> +        u8 operation;
>> +        /*
>> +         * 0 - MSI-X vector
>> +         * 1 - 65535 are reserved
>> +         */
>> +        le16 resource;
>> +        /*
>> +         * The value to the given resource:
>> +         * if resource = 0 (MSI-X vector), it's a 1-based count.
>> +         */
>> +        le64 resource_val;
>> +        u8 reserved[5];
>> +};
>> +\end{lstlisting}
>> +
>> +The following table describes the command specific error codes codes:
>> +
>> +\begin{tabular}{|l|l|l|}
>> +\hline
>> +Opcode & Status & Description \\
>> +\hline \hline
>> +00h   & VIRTIO_ADMIN_CS_ERR_VDEV_IN_USE    & designated device is in use, operation failed   \\
>> +\hline
>> +01h   & VIRTIO_ADMIN_CS_RSC_VAL_INVALID    & resource value is invalid  \\
>> +\hline
>> +02h   & VIRTIO_ADMIN_CS_RSC_UNSUPPORTED    & unsupported or invalid resource  \\
>> +\hline
>> +03h   & VIRTIO_ADMIN_CS_OP_UNSUPPORTED    & unsupported or invalid operation  \\
>> +\hline
>> +04h - FFh   & Reserved    & -  \\
>> +\hline
>> +\end{tabular}
>> +
>> +The device, upon success, returns a result that describes the information according to the requested operation.
>> +This result is of form:
>> +\begin{lstlisting}
>> +struct virtio_admin_device_mgmt_result {
>> +        le64 resource_val;
>> +        u8 reserved[8];
>> +};
>> +\end{lstlisting}
>> +
>> +If the requested operation by the driver was "assign resource to the designated vdev_id", the device will return the resource_val of the assigned
>> +resources to the designated vdev_id. Upon success, this value should be equal to the \field{resource_val} of the virtio_admin_device_mgmt_data
>> +structure set by the driver. In case of a failure, the value of this field is undefined and will be ignored by the driver.
>> +
>> +If the requested operation by the driver was "query resource of the designated vdev_id", the device will return resource_val of the currently assigned
>> +resources to the designated vdev_id upon success. In case of a failure, the value of this field is undefined and will be ignored by the driver.
>> +
>> +\begin{note}
>> +{MSI-X vector resource type is valid only for PCI devices. VIRTIO_ADMIN_CS_RSC_UNSUPPORTED error is
>> +returned by the device when the designated vdev_id is not a PCI device.}
>> +\end{note}
>> +
>> +\begin{note}
>> +{For this command, if driver is setting \field{resource} to MSI-X vector type, the \field{vdev_id} can't be associated with a Virtual Function with
>> +VF index greater than NumVFs value as defined in the PCI specification or smaller than 1. An error is returned by the device when \field{vdev_id} is out of the range.}
>> +\end{note}
>> +
>> +\subsection{VIRTIO ADMIN DEVICE MGMT ATTRS command}\label{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE MGMT ATTRS command}
>> +
>> +The VIRTIO_ADMIN_DEVICE_MGMT_ATTRS command has no command specific data set by the driver.
>> +The \field{command} is set to VIRTIO_ADMIN_DEVICE_MGMT_ATTRS.
>> +
>> +The device, upon success, returns a result that describes the management device attributes.
>> +This result is of form:
>> +\begin{lstlisting}
>> +struct virtio_admin_device_mgmt_attrs_result {
>> +        /* Indicates which of the below fields were returned
>> +         * (1 means that field was returned):
>> +         * Bit 0 - vfs_total_msix_count
>> +         * Bit 1 - vfs_assigned_msix_count
>> +         * Bit 2 - per_vf_max_msix_count
>> +         * Bits 3 - 63 - reserved for future fields
>> +         */
>> +        le64 attrs_mask;
>> +
>> +        /* Total number of msix vectors for the total number of VFs */
>> +        le32 vfs_total_msix_count;
>> +        /* Assigned number of msix vectors for the enabled VFs */
>> +        le32 vfs_assigned_msix_count;
>> +        /* Max number of msix vectors that can be assigned for a single VF */
>> +        le16 per_vf_max_msix_count;
>> +
>> +        u8 reserved[110];
>> +};
>> +\end{lstlisting}
>> +
>> +\begin{note}
>> +{The \field{vfs_total_msix_count}, \field{vfs_assigned_msix_count} and \field{per_vf_max_msix_count} returned by the device if the
>> +designated vdev_id is a management device that can allocate/deallocate MSI-X resources for PCI VFs devices. Otherwise,
>> +the associated bits in \field{attrs_mask} are zeroed by the device.}
>> +\end{note}
>> +
>>   \section{Admin Virtqueues}\label{sec:Basic Facilities of a Virtio Device / Admin Virtqueues}
>>   
>>   An admin virtqueue is a management interface of a device that can be used to send administrative
>> diff --git a/content.tex b/content.tex
>> index 0c1d44f..81e5850 100644
>> --- a/content.tex
>> +++ b/content.tex
>> @@ -451,6 +451,18 @@ \section{Exporting Objects}\label{sec:Basic Facilities of a Virtio Device / Expo
>>   
>>   \input{admin.tex}
>>   
>> +\section{Device management}\label{sec:Basic Facilities of a Virtio Device / Device management}
>> +
>> +A device group might consist of one or more virtio devices. For example, virtio PCI SR-IOV PF and its VFs compose a type 1 device group.
>> +A capable PCI SR-IOV PF virtio device might act as the management device in this group, and its PCI SR-IOV VFs are the managed devices.
>> +A management device might have various management capabilities and attributes to manage its managed devices.
> This makes my eyes glaze over.
> Please, find all instances which say "manage" more than once and
> rephrase.

Can you propose something you like?

Each individual has a different wording style.

Just choose whatever fits your style and I'll add it.

>
>> The capabilities exposed
>> +in the result of VIRTIO_ADMIN_DEVICE_CAPS_IDENTIFY command (see section \ref{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE CAPS IDENTIFY command}
>> +for more details) and the attributes exposed in the result of VIRTIO_ADMIN_DEVICE_MGMT_ATTRS command
>> +(see section \ref{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE MGMT ATTRS command} for more details).
>> +
>> +The management device will use the VIRTIO_ADMIN_DEVICE_MGMT admin command to manage its managed devices (see section
>> +\ref{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE MGMT command} for more details).
>> +
>>   \chapter{General Initialization And Device Operation}\label{sec:General Initialization And Device Operation}
>>   
>>   We start with an overview of device initialization, then expand on the
>> @@ -1763,6 +1775,75 @@ \subsubsection{Driver Handling Interrupts}\label{sec:Virtio Transport Options /
>>       \end{itemize}
>>   \end{itemize}
>>   
>> +\subsection{PCI-specific Admin capabilities}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI-specific Admin capabilities}
>> +
>> +This documents the group of admin capabilities for PCI virtio devices. Each capability is
>> +implemented using one or more Admin commands.
>> +
>> +\subsubsection{MSI-X vector management}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI-specific Admin command set / MSI-X vector management}
>> +
>> +This capability enables a virtio management device to control the assignment of MSI-X interrupt vectors
>> +for its managed devices. In PCI, a management device can be the PF device and the managed device can be the VF (for example in a type 1 device group).
>> +Capable management devices will need to implement VIRTIO_ADMIN_DEVICE_MGMT and VIRTIO_ADMIN_DEVICE_MGMT_ATTRS admin commands, report the MSI-X attributes in the result of
>> +VIRTIO_ADMIN_DEVICE_MGMT_ATTRS and report that MSI-X vector resource management is supported in the result of VIRTIO_ADMIN_DEVICE_CAPS_IDENTIFY admin command.
>> +See sections \ref{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE CAPS IDENTIFY command} and
>> +\ref{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE MGMT ATTRS command} for more details.
>> +
>> +In the result of VIRTIO_ADMIN_DEVICE_MGMT_ATTRS admin command, a capable management device will return the total number of
>> +msix vectors for its VFs in \field{vfs_total_msix_count} field, the number of already assigned msix vectors for its VFs in
>> +\field{vfs_assigned_msix_count} field and also the maximal number of msix vectors that can be assigned for a single VF in
>> +\field{per_vf_max_msix_count} field. In addition, bit 0, bit 1 and bit 2 are set to indicate on the validity of the other 3
>> +fields in the \field{attrs_mask} field of the result buffer.
>> +See section \ref{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE MGMT ATTRS command} for more details.
>> +
>> +The default assignment of the MSI-X vectors for managed devices is out of the scope of this specification.
>> +A driver, using VIRTIO_ADMIN_DEVICE_MGMT can update the MSI-X assignment for a specific managed device.
>> +In the data of VIRTIO_ADMIN_DEVICE_MGMT admin command, a driver set the \field{resource} type to be MSI-X vector and the
>> +amount of MSI-X interrupt vectors to configure to the designated managed device in \field{resource_val}. The managed device id is set to \field{vdev_id} field.
>> +
>> +A successful operation guarantees that the requested amount of MSI-X interrupt vectors was assigned to the designated device.
>> +This value is also returned in the virtio_admin_device_mgmt_result structure.
>> +Also, a successful operation guarantees that the MSI-X capability access by the designated PCI device defined by the PCI specification must reflect
>> +the new configuration in all relevant fields. For example, by default if the PCI VF has been assigned 4 MSI-X vectors, and VIRTIO_ADMIN_DEVICE_MGMT
>> +increases the MSI-X vectors to 8. On this change, reading Table size field of the MSI-X message control register will reflect a value of 7.
>> +
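
To make the arithmetic in the example above concrete, here is a minimal sketch. It relies only on what the PCI specification defines: Table Size is the low 11 bits of the MSI-X Message Control register and is encoded as N-1. The helper names are made up for illustration and are not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Table Size occupies bits 10:0 of MSI-X Message Control (PCI spec). */
#define MSIX_TABLE_SIZE_MASK 0x07FFu

/* Hypothetical helper: encode a 1-based vector count as the N-1 field value. */
static uint16_t msix_count_to_table_size(uint16_t count)
{
        assert(count >= 1 && count <= 2048);
        return (uint16_t)(count - 1);
}

/* Hypothetical helper: decode the vector count from a Message Control value. */
static uint16_t msix_table_size_to_count(uint16_t msg_ctrl)
{
        return (uint16_t)((msg_ctrl & MSIX_TABLE_SIZE_MASK) + 1);
}
```

So a VF that was assigned 8 vectors reads back a Table Size of 7, and a VF left at 4 vectors reads back 3.
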
>> +It is beyond the scope of the virtio specification to define necessary synchronization in system software to ensure that a virtio PCI VF device
>> +interrupt configuration modification is reflected in the PCI device.
> IMHO it is very much in scope of the specification. The scope of the
> specification is to allow device interoperability and this very much
> fits the bill.

Each system has its own set of tools and definitions.

It's not covered in the spec today and shouldn't be covered. Otherwise, the
spec will get into areas it shouldn't.


>
>> However, it is expected that any modern system software implementing virtio
>> +drivers and PCI subsystem will ensure that any changes occurring in the VF interrupt configuration is either updated in the PCI VF device or
>> +such configuration fails.
> OK. Anything more? What exactly does "interrupt configuration" mean here?

MSI-X configuration.


>
>> For example, one way to implement that is to make sure that there is no driver bounded to the virtio PCI SR-IOV VF during
>> +this operation.
> bounded in what sense?
In the sense that a PCI device driver is bound to the device and has probed it.
>
> And why do you say VF? Is this command limited to type 1? You only
> limit it to PCI above.

Today we support setting the MSI-X configuration for VFs.

This is why I mentioned VFs.

IIRC, you asked me to mention VFs in the past, but I'm not sure.

Is this a problem? Should I remove some sentence?

> same elsewhere
>
>> +
>> +To query amount of MSI-X interrupt vectors that is currently assigned to a managed device, the driver issue VIRTIO_ADMIN_DEVICE_MGMT with \field{operation} set to
> issues
>
> lots of grammar error like this elsewhere, pls find and correct.
>
>> +"query resource of the designated vdev_id" value (== 2). The driver also set the \field{resource} type to be MSI-X vector and the managed device id is set to \field{vdev_id}
>> +field. In the result of a successful operation,
> meaning "in case"?
yes.
>> the amount of MSI-X interrupt vectors that is currently assigned to the designated managed device is
>> +returned by the device in \field{resource_val} field of the virtio_admin_device_mgmt_result structure.
>> +See section \ref{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE MGMT command} for more details.
>> +
>> +\paragraph{MSI-X configuration sequence example}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI-specific Admin command set / VF MSI-X control / MSI-X configuration sequence example }
>> +
>> +A typical sequence for configuring MSI-X vectors for PCI VFs using MSI-X vector management mechanism is following:
> rephrase to simplify
>
> The driver uses the following sequence for configuring MSI-X vectors
> ....

But it's not the driver that performs this sequence.

Why should I change this if it's not true?

>
>
>> +
>> +\begin{enumerate}
>> +\item Ensure that VF driver doesn't run and it is safe to change MSI-X (e.g. disable sriov auto probing)
>> +
>> +\item Load the PF driver
>> +
>> +\item Enable SR-IOV by following the PCI specification
>> +
>> +\item Query the management device capabilities using commands VIRTIO_ADMIN_DEVICE_IDENTIFY and VIRTIO_ADMIN_DEVICE_MGMT_ATTRS
>> +
>> +\item Find the managed VF vdev_id (for type 1 device group the vdev_id of PCI VF is equal to vf number)
>> +
>> +\item Query the VF MSI-X configuration using command VIRTIO_ADMIN_DEVICE_MGMT (query operation)
>> +
>> +\item Assign desired MSI-X configuration for the VF using command VIRTIO_ADMIN_DEVICE_MGMT (assign operation)
>> +
>> +\item After successful completion of the assignment, load the VF driver
>> +
>> +\item Assign the VF to a VM
>> +
>> +\end{enumerate}
>> +
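
The query and assign steps in the sequence above both send the same command-specific data, so a short sketch of how it could be serialized may help. It assumes the structure is laid out without padding (1 + 2 + 8 + 5 = 16 bytes) and that multi-byte fields are little-endian, as the le16/le64 types indicate. The macro and function names are hypothetical; \field{vdev_id} lives in the common command header and is outside this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MGMT_OP_ASSIGN  1   /* assign resource to the designated vdev_id */
#define MGMT_OP_QUERY   2   /* query resource of the designated vdev_id */
#define MGMT_RSC_MSIX   0   /* MSI-X vector */
#define MGMT_DATA_LEN   16  /* 1 + 2 + 8 + 5 bytes, assuming no padding */

/* Hypothetical helper: write the command-specific data, little-endian. */
static void mgmt_data_encode(uint8_t *buf, uint8_t operation,
                             uint16_t resource, uint64_t resource_val)
{
        size_t i;

        assert(buf != NULL);
        memset(buf, 0, MGMT_DATA_LEN);          /* also zeroes reserved[5] */
        buf[0] = operation;                     /* u8 operation */
        buf[1] = (uint8_t)(resource & 0xff);    /* le16 resource, low byte */
        buf[2] = (uint8_t)(resource >> 8);      /* le16 resource, high byte */
        for (i = 0; i < 8; i++)                 /* le64 resource_val */
                buf[3 + i] = (uint8_t)(resource_val >> (8 * i));
}
```

For the assign step with 8 vectors this would be called as mgmt_data_encode(buf, MGMT_OP_ASSIGN, MGMT_RSC_MSIX, 8). Writing bytes explicitly avoids depending on host endianness or compiler packing attributes.
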
>>   \section{Virtio Over MMIO}\label{sec:Virtio Transport Options / Virtio Over MMIO}
>>   
>>   Virtual environments without PCI support (a common situation in
>> diff --git a/introduction.tex b/introduction.tex
>> index 4358ab1..bfc5498 100644
>> --- a/introduction.tex
>> +++ b/introduction.tex
>> @@ -164,9 +164,39 @@ \subsection{Device group}\label{sec:Introduction / Terminology / Device group}
>>   For now, the supported device groups are:
>>   \begin{enumerate}
>>   \item Type 1 - A virtio PCI SR-IOV physical function (PF) and its PCI SR-IOV virtual functions (VFs). For this group type, the PF device has vdev_id that is equal to 0
>> -and the VF devices have vdev_id's that are equal to their vf_number (according to the PCI SR-IOV specification).
>> +and the VF devices have vdev_id's that are equal to their vf_number (according to the PCI SR-IOV specification). A PCI SR-IOV PF device can act as a management device for
>> +type 1 group. A PCI SR-IOV VF device can act as a managed device for type 1 group (see \ref{sec:Introduction / Terminology / Virtio management device} and
>> +\ref{sec:Introduction / Terminology / Virtio managed device} for more information).
>>   \end{enumerate}
>>   
>> +\subsection{Virtio management device}\label{sec:Introduction / Terminology / Virtio management device}
>> +
>> +A virtio device that supports VIRTIO_ADMIN_DEVICE_MGMT and VIRTIO_ADMIN_DEVICE_MGMT_ATTRS admin commands (see
>> +\ref{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE MGMT command} and
>> +\ref{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE MGMT ATTRS command} for more information).
>> +This device can manage a virtio managed device. A device group may contain zero or more management devices.
>> +
>> +A PCI SR-IOV Physical Function based virtio device is an example of a possible virtio management device (for type 1 device group).
>> +
>> +\subsection{Virtio type 1 management device}\label{sec:Introduction / Terminology / Virtio type 1 management device}
>> +
>> +A virtio management device for type 1 device group. This device is a PCI SR-IOV PF that can set \field{dst_type} to 1 (other virtio device in the same device group),
>> +and set \field{vdev_id} to an id that corresponds with one of its managed virtio devices (PCI SR-IOV VFs) for the VIRTIO_ADMIN_DEVICE_MGMT admin command.
>> +
>> +A type 1 device group may contain zero or one management devices.
>> +
>> +\subsection{virtio managed device}\label{sec:Introduction / Terminology / Virtio managed device}
>> +
>> +A virtio device that can be managed by a virtio management device.
>> +A device group may contain zero or more managed devices.
>> +
>> +A PCI SR-IOV Virtual Function based virtio device is an example of a possible virtio managed device (for type 1 group).
>> +
>> +\subsection{virtio type 1 managed device}\label{sec:Introduction / Terminology / Virtio type 1 managed device}
>> +
>> +A virtio managed device for type 1 device group. This device is a PCI SR-IOV VF and is managed by a virtio type 1 management device (virtio PCI SR-IOV PF).
>> +It is implied that all the virtio PCI SR-IOV VFs related to a virtio PCI SR-IOV PF that is virtio type 1 management device are type 1 managed devices.
>> +
>>   \section{Structure Specifications}\label{sec:Structure Specifications}
>>   
>>   Many device and driver in-memory structure layouts are documented using
>> -- 
>> 2.21.0


^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [virtio-comment] RE: [PATCH v5 6/7] Introduce MGMT admin commands
  2022-05-17 12:31       ` [virtio-comment] " Michael S. Tsirkin
@ 2022-05-18 15:14         ` Max Gurtovoy
  0 siblings, 0 replies; 103+ messages in thread
From: Max Gurtovoy @ 2022-05-18 15:14 UTC (permalink / raw)
  To: Michael S. Tsirkin, Parav Pandit
  Cc: jasowang, virtio-comment, cohuck, virtio-dev, Oren Duer,
	Shahaf Shuler, aadam, virtio


On 5/17/2022 3:31 PM, Michael S. Tsirkin wrote:
> On Mon, May 16, 2022 at 09:47:23PM +0000, Parav Pandit wrote:
>>> From: Michael S. Tsirkin <mst@redhat.com>
>>> Sent: Sunday, May 15, 2022 10:37 AM
>>>> +This value is also returned in the virtio_admin_device_mgmt_result structure.
>>>> +Also, a successful operation guarantees that the MSI-X capability access by the designated PCI device defined by the PCI specification must reflect
>>>> +the new configuration in all relevant fields. For example, by default if the PCI VF has been assigned 4 MSI-X vectors, and VIRTIO_ADMIN_DEVICE_MGMT
>>>> +increases the MSI-X vectors to 8. On this change, reading Table size field of the MSI-X message control register will reflect a value of 7.
>>>> +
>>>> +It is beyond the scope of the virtio specification to define necessary synchronization in system software to ensure that a virtio PCI VF device
>>>> +interrupt configuration modification is reflected in the PCI device.
>>> IMHO it is very much in scope of the specification.
>> Many pieces of this system software are not implemented by the virtio specification...
>> Not sure, how can it belong to virtio spec.
>>
>>> The scope of the
>>> specification is to allow device interoperability and this very much fits the bill.
>>>
>> How is interoperability affected and between which two entities?
> For example, if OS/driver caches the # of vectors in the MSI capability,
> and it changes things will not work.  OS/drivers won't know not to cache
> values unless we tell them what not to cache.

There is support today in Linux for changing the MSI-X configuration of mlx5 VFs.

You're welcome to review it and raise issues if you find any.

I'll forward these issues to the relevant maintainers of the RDMA and PCI subsystems.

We mentioned that drivers shouldn't probe the VF, so the driver certainly
will not cache the number of vectors if it didn't probe the VF.

>
> I'd like to float again the idea of instead exposing a larger number of
> vectors than supported. Assigning more vectors than supported will then
> fail, drivers already check that so they will recover.
> Avoids modifying fields that pci spec expects to be read only.

It's nothing new.

This exists today in Linux.

Starting from some large unsupported number might cause issues for
non-resilient drivers.

Are you sure this is what you would like to write in the spec?

>


^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [PATCH v5 6/7] Introduce MGMT admin commands
  2022-05-17  2:28     ` Jason Wang
@ 2022-05-18 15:27       ` Max Gurtovoy
  2022-05-18 16:41         ` Michael S. Tsirkin
  0 siblings, 1 reply; 103+ messages in thread
From: Max Gurtovoy @ 2022-05-18 15:27 UTC (permalink / raw)
  To: Jason Wang, Michael S. Tsirkin
  Cc: virtio-comment, cohuck, virtio-dev, oren, parav, shahafs, aadam, virtio


On 5/17/2022 5:28 AM, Jason Wang wrote:
>
> 在 2022/5/15 22:37, Michael S. Tsirkin 写道:
>> On Wed, Apr 27, 2022 at 01:58:23AM +0300, Max Gurtovoy wrote:
>>> Introduce the concept of a management and a managed device and add
>>> example of using this concept to manage resources.
>>>
>>> A management device supports the VIRTIO_ADMIN_DEVICE_MGMT and
>>> VIRTIO_ADMIN_DEVICE_MGMT_ATTRS admin commands to manage some resources
>>> of a managed device.
>>>
>>> A typical cloud provider SR-IOV use case is to create many VFs for use
>>> by guest VMs. The VFs may not be assigned to a VM until a user requests
>>> a VM of a certain size, e.g., number of CPUs. A VF may need MSI-X
>>> vectors proportional to the number of CPUs in the VM, but there is no
>>> standard way today in the spec to change the number of MSI-X vectors
>>> supported by a VF, although there are some operating systems that
>>> support this.
>>>
>>> The new admin mechanism manages the MSI-X interrupt vectors assignments
>>> of a managed PCI device (i.e. VF) by its management devices (i.e. its
>>> parent PF) but can easily extended to any other generic resource
>>> management.
>>>
>>> Reviewed-by: Parav Pandit <parav@nvidia.com>
>>> Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
>>
>> I'd like to see msix and the concept of type 1 group
>> in a separate patch from MSIX.
>>
>> I am not sure MSIX things are ready but the grouping part looks mostly
>> ok to me.
>>
>>> ---
>>>   admin.tex        | 132 
>>> +++++++++++++++++++++++++++++++++++++++++++++--
>>>   content.tex      |  81 +++++++++++++++++++++++++++++
>>>   introduction.tex |  32 +++++++++++-
>>>   3 files changed, 241 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/admin.tex b/admin.tex
>>> index d09683d..5b54743 100644
>>> --- a/admin.tex
>>> +++ b/admin.tex
>>> @@ -79,12 +79,20 @@ \section{Administration command 
>>> set}\label{sec:Basic Facilities of a Virtio Devi
>>>   \hline
>>>   0001h   & VIRTIO_ADMIN_DEVICE_CAPS_ACCEPT    & M  \\
>>>   \hline
>>> -0002h - 7FFFh   & Generic admin cmds    & -  \\
>>> +0002h   & VIRTIO_ADMIN_DEVICE_MGMT    & O  \\
>>> +\hline
>>> +0003h   & VIRTIO_ADMIN_DEVICE_MGMT_ATTRS    & O  \\
>>> +\hline
>>> +0004h - 7FFFh   & Generic admin cmds    & -  \\
>>>   \hline
>>>   8000h - FFFFh   & Reserved    & - \\
>>>   \hline
>>>   \end{tabular}
>>>   +\begin{note}
>>> +{The following commands are mandatory for management devices: 
>>> VIRTIO_ADMIN_DEVICE_MGMT and VIRTIO_ADMIN_DEVICE_MGMT_ATTRS.}
>>> +\end{note}
>>> +
>>>   \subsection{VIRTIO ADMIN DEVICE CAPS IDENTIFY 
>>> command}\label{sec:Basic Facilities of a Virtio Device / Admin 
>>> command set / VIRTIO ADMIN DEVICE CAPS IDENTIFY command}
>>>     The VIRTIO_ADMIN_DEVICE_CAPS_IDENTIFY command has no command 
>>> specific data set by the driver.
>>> @@ -102,13 +110,20 @@ \subsection{VIRTIO ADMIN DEVICE CAPS IDENTIFY 
>>> command}\label{sec:Basic Facilitie
>>>          le64 attrs_mask;
>>>          /* This field indicates which of the below admin
>>>           * capabilities are supported by the device:
>>> -        * Bits 0 - 63 - reserved for future capabilities.
>>> +        * Bit 0 - if set, the device is a management device
>>> +        * Bit 1 - if set, the device is a type 1 management device 
>>> that supports
>>> +        *         MSI-X vector mgmt of its type 1 managed devices
>>> +        * Bits 2 - 63 - reserved for future capabilities.
>>>           */
>>>          le64 device_admin_caps;
>>>          u8 reserved[112];
>>>   };
>>>   \end{lstlisting}
>>>   +\begin{note}
>>> +{For more details on MSI-X vector management support see section 
>>> \ref{sec:Virtio Transport Options / Virtio Over PCI Bus / 
>>> PCI-specific Admin command set / MSI-X vector management}.}
>>> +\end{note}
>>> +
>>>   \subsection{VIRTIO ADMIN DEVICE CAPS ACCEPT 
>>> command}\label{sec:Basic Facilities of a Virtio Device / Admin 
>>> command set / VIRTIO ADMIN DEVICE CAPS ACCEPT command}
>>>     The VIRTIO_ADMIN_DEVICE_CAPS_ACCEPT command is used by the 
>>> driver to acknowledge those admin capabilities it understands and 
>>> wishes to use.
>>> @@ -125,13 +140,124 @@ \subsection{VIRTIO ADMIN DEVICE CAPS ACCEPT 
>>> command}\label{sec:Basic Facilities
>>>          le64 attrs_mask;
>>>          /* This field indicates which of the below admin
>>>           * capabilities are supported by the driver:
>>> -        * Bits 0 - 63 - reserved for future capabilities.
>>> +        * Bit 0 - if set, the driver accepted the device as a 
>>> management device
>>> +        * Bit 1 - if set, the driver accepted the device as a type 
>>> 1 management device
>>> +        *         that supports MSI-X vector mgmt of its type 1 
>>> managed devices
>>> +        * Bits 2 - 63 - reserved for future capabilities.
>>>           */
>>>          le64 driver_admin_caps;
>>>          u8 reserved[112];
>>>   };
>>>   \end{lstlisting}
>>>   +\subsection{VIRTIO ADMIN DEVICE MGMT command}\label{sec:Basic 
>>> Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN 
>>> DEVICE MGMT command}
>>> +
>>> +The VIRTIO_ADMIN_DEVICE_MGMT command is used by a management device 
>>> to manage resources of managed virtio devices.
>>> +The \field{command} is set to VIRTIO_ADMIN_DEVICE_MGMT by the driver.
>>> +
>>> +The command specific data set by the driver is of form:
>>> +\begin{lstlisting}
>>> +struct virtio_admin_device_mgmt_data {
>>> +        /*
>>> +         * 0 - reserved
>>> +         * 1 - assign resource to the designated vdev_id
>>> +         * 2 - query resource of the designated vdev_id
>>> +         * 3 - 255 are reserved
>>> +         */
>>> +        u8 operation;
>>> +        /*
>>> +         * 0 - MSI-X vector
>>> +         * 1 - 65535 are reserved
>>> +         */
>>> +        le16 resource;
>>> +        /*
>>> +         * The value to the given resource:
>>> +         * if resource = 0 (MSI-X vector), it's a 1-based count.
>>> +         */
>>> +        le64 resource_val;
>>> +        u8 reserved[5];
>>> +};
>>> +\end{lstlisting}
>>> +
>>> +The following table describes the command specific error codes codes:
>>> +
>>> +\begin{tabular}{|l|l|l|}
>>> +\hline
>>> +Opcode & Status & Description \\
>>> +\hline \hline
>>> +00h   & VIRTIO_ADMIN_CS_ERR_VDEV_IN_USE    & designated device is 
>>> in use, operation failed   \\
>>> +\hline
>>> +01h   & VIRTIO_ADMIN_CS_RSC_VAL_INVALID    & resource value is 
>>> invalid  \\
>>> +\hline
>>> +02h   & VIRTIO_ADMIN_CS_RSC_UNSUPPORTED    & unsupported or invalid 
>>> resource  \\
>>> +\hline
>>> +03h   & VIRTIO_ADMIN_CS_OP_UNSUPPORTED    & unsupported or invalid 
>>> operation  \\
>>> +\hline
>>> +04h - FFh   & Reserved    & -  \\
>>> +\hline
>>> +\end{tabular}
>>> +
>>> +The device, upon success, returns a result that describes the 
>>> information according to the requested operation.
>>> +This result is of form:
>>> +\begin{lstlisting}
>>> +struct virtio_admin_device_mgmt_result {
>>> +        le64 resource_val;
>>> +        u8 reserved[8];
>>> +};
>>> +\end{lstlisting}
>>> +
>>> +If the requested operation by the driver was "assign resource to 
>>> the designated vdev_id", the device will return the resource_val of 
>>> the assigned
>>> +resources to the designated vdev_id. Upon success, this value 
>>> should be equal to the \field{resource_val} of the 
>>> virtio_admin_device_mgmt_data
>>> +structure set by the driver. In case of a failure, the value of 
>>> this field is undefined and will be ignored by the driver.
>>> +
>>> +If the requested operation by the driver was "query resource of the 
>>> designated vdev_id", the device will return resource_val of the 
>>> currently assigned
>>> +resources to the designated vdev_id upon success. In case of a 
>>> failure, the value of this field is undefined and will be ignored by 
>>> the driver.
>>> +
>>> +\begin{note}
>>> +{MSI-X vector resource type is valid only for PCI devices. 
>>> VIRTIO_ADMIN_CS_RSC_UNSUPPORTED error is
>>> +returned by the device when the designated vdev_id is not a PCI 
>>> device.}
>
>
> Note that MSI has been used by various platform devices. It would be 
> better if we can make it work for non-PCI devices otherwise we may 
> re-introduce duplicated commands.
>
We can't even agree on an existing PCI feature in Linux today, so adding 
more complexity will bring us back to the beginning.
>
>>> +\end{note}
>>> +
>>> +\begin{note}
>>> +{For this command, if driver is setting \field{resource} to MSI-X 
>>> vector type, the \field{vdev_id} can't be associated with a Virtual 
>>> Function with
>>> +VF index greater than NumVFs value as defined in the PCI 
>>> specification or smaller than 1. An error is returned by the device 
>>> when \field{vdev_id} is out of the range.}
>>> +\end{note}
>>> +
>>> +\subsection{VIRTIO ADMIN DEVICE MGMT ATTRS command}\label{sec:Basic 
>>> Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN 
>>> DEVICE MGMT ATTRS command}
>>> +
>>> +The VIRTIO_ADMIN_DEVICE_MGMT_ATTRS command has no command specific 
>>> data set by the driver.
>>> +The \field{command} is set to VIRTIO_ADMIN_DEVICE_MGMT_ATTRS.
>>> +
>>> +The device, upon success, returns a result that describes the 
>>> management device attributes.
>>> +This result is of form:
>>> +\begin{lstlisting}
>>> +struct virtio_admin_device_mgmt_attrs_result {
>>> +        /* Indicates which of the below fields were returned
>>> +         * (1 means that field was returned):
>>> +         * Bit 0 - vfs_total_msix_count
>>> +         * Bit 1 - vfs_assigned_msix_count
>>> +         * Bit 2 - per_vf_max_msix_count
>>> +         * Bits 3 - 63 - reserved for future fields
>>> +         */
>>> +        le64 attrs_mask;
>>> +
>>> +        /* Total number of msix vectors for the total number of VFs */
>>> +        le32 vfs_total_msix_count;
>>> +        /* Assigned number of msix vectors for the enabled VFs */
>>> +        le32 vfs_assigned_msix_count;
>>> +        /* Max number of msix vectors that can be assigned for a 
>>> single VF */
>>> +        le16 per_vf_max_msix_count;
>>> +
>>> +        u8 reserved[110];
>>> +};
>>> +\end{lstlisting}
>>> +
>>> +\begin{note}
>>> +{The \field{vfs_total_msix_count}, \field{vfs_assigned_msix_count} 
>>> and \field{per_vf_max_msix_count} returned by the device if the
>>> +designated vdev_id is a management device that can 
>>> allocate/deallocate MSI-X resources for PCI VFs devices. Otherwise,
>>> +the associated bits in \field{attrs_mask} are zeroed by the device.}
>>> +\end{note}
>>> +
>>>   \section{Admin Virtqueues}\label{sec:Basic Facilities of a Virtio 
>>> Device / Admin Virtqueues}
>>>     An admin virtqueue is a management interface of a device that 
>>> can be used to send administrative
>>> diff --git a/content.tex b/content.tex
>>> index 0c1d44f..81e5850 100644
>>> --- a/content.tex
>>> +++ b/content.tex
>>> @@ -451,6 +451,18 @@ \section{Exporting Objects}\label{sec:Basic 
>>> Facilities of a Virtio Device / Expo
>>>     \input{admin.tex}
>>>   +\section{Device management}\label{sec:Basic Facilities of a 
>>> Virtio Device / Device management}
>>> +
>>> +A device group might consist of one or more virtio devices. For 
>>> example, virtio PCI SR-IOV PF and its VFs compose a type 1 device 
>>> group.
>>> +A capable PCI SR-IOV PF virtio device might act as the management 
>>> device in this group, and its PCI SR-IOV VFs are the managed devices.
>>> +A management device might have various management capabilities and 
>>> attributes to manage its managed devices.
>> This makes my eyes glaze over.
>> Please, find all instances which say "manage" more than once and
>> rephrase.
>>
>>> The capabilities exposed
>>> +in the result of VIRTIO_ADMIN_DEVICE_CAPS_IDENTIFY command (see 
>>> section \ref{sec:Basic Facilities of a Virtio Device / Admin command 
>>> set / VIRTIO ADMIN DEVICE CAPS IDENTIFY command}
>>> +for more details) and the attributes exposed in the result of 
>>> VIRTIO_ADMIN_DEVICE_MGMT_ATTRS command
>>> +(see section \ref{sec:Basic Facilities of a Virtio Device / Admin 
>>> command set / VIRTIO ADMIN DEVICE MGMT ATTRS command} for more 
>>> details).
>>> +
>>> +The management device will use the VIRTIO_ADMIN_DEVICE_MGMT admin 
>>> command to manage its managed devices (see section
>>> +\ref{sec:Basic Facilities of a Virtio Device / Admin command set / 
>>> VIRTIO ADMIN DEVICE MGMT command} for more details).
>>> +
>>>   \chapter{General Initialization And Device 
>>> Operation}\label{sec:General Initialization And Device Operation}
>>>     We start with an overview of device initialization, then expand 
>>> on the
>>> @@ -1763,6 +1775,75 @@ \subsubsection{Driver Handling 
>>> Interrupts}\label{sec:Virtio Transport Options /
>>>       \end{itemize}
>>>   \end{itemize}
>>>   +\subsection{PCI-specific Admin capabilities}\label{sec:Virtio 
>>> Transport Options / Virtio Over PCI Bus / PCI-specific Admin 
>>> capabilities}
>>> +
>>> +This documents the group of admin capabilities for PCI virtio 
>>> devices. Each capability is
>>> +implemented using one or more Admin commands.
>>> +
>>> +\subsubsection{MSI-X vector management}\label{sec:Virtio Transport 
>>> Options / Virtio Over PCI Bus / PCI-specific Admin command set / 
>>> MSI-X vector management}
>>> +
>>> +This capability enables a virtio management device to control the 
>>> assignment of MSI-X interrupt vectors
>>> +for its managed devices.
>
>
> I think we need to clarify whether the Initial VFs belong to the 
> "managed device".
>
>
>>>   In PCI, a management device can be the PF device and the managed 
>>> device can be the VF (for example in a type 1 device group).
>>> +Capable management devices will need to implement 
>>> VIRTIO_ADMIN_DEVICE_MGMT and VIRTIO_ADMIN_DEVICE_MGMT_ATTRS admin 
>>> commands, report the MSI-X attributes in the result of
>>> +VIRTIO_ADMIN_DEVICE_MGMT_ATTRS and report that MSI-X vector 
>>> resource management is supported in the result of 
>>> VIRTIO_ADMIN_DEVICE_CAPS_IDENTIFY admin command.
>>> +See sections \ref{sec:Basic Facilities of a Virtio Device / Admin 
>>> command set / VIRTIO ADMIN DEVICE CAPS IDENTIFY command} and
>>> +\ref{sec:Basic Facilities of a Virtio Device / Admin command set / 
>>> VIRTIO ADMIN DEVICE MGMT ATTRS command} for more details.
>>> +
>>> +In the result of VIRTIO_ADMIN_DEVICE_MGMT_ATTRS admin command, a 
>>> capable management device will return the total number of
>>> +msix vectors for its VFs in \field{vfs_total_msix_count} field, the 
>>> number of already assigned msix vectors for its VFs in
>>> +\field{vfs_assigned_msix_count} field and also the maximal number 
>>> of msix vectors that can be assigned for a single VF in
>>> +\field{per_vf_max_msix_count} field. In addition, bit 0, bit 1 and 
>>> bit 2 are set to indicate on the validity of the other 3
>>> +fields in the \field{attrs_mask} field of the result buffer.
>>> +See section \ref{sec:Basic Facilities of a Virtio Device / Admin 
>>> command set / VIRTIO ADMIN DEVICE MGMT ATTRS command} for more details.
>>> +
>>> +The default assignment of the MSI-X vectors for managed devices is 
>>> out of the scope of this specification.
>>> +A driver, using VIRTIO_ADMIN_DEVICE_MGMT can update the MSI-X 
>>> assignment for a specific managed device.
>>> +In the data of VIRTIO_ADMIN_DEVICE_MGMT admin command, a driver set 
>>> the \field{resource} type to be MSI-X vector and the
>>> +amount of MSI-X interrupt vectors to configure to the designated 
>>> managed device in \field{resource_val}. The managed device id is set 
>>> to \field{vdev_id} field.
>>> +
>>> +A successful operation guarantees that the requested amount of 
>>> MSI-X interrupt vectors was assigned to the designated device.
>>> +This value is also returned in the virtio_admin_device_mgmt_result 
>>> structure.
>>> +Also, a successful operation guarantees that the MSI-X capability 
>>> access by the designated PCI device defined by the PCI specification 
>>> must reflect
>>> +the new configuration in all relevant fields. For example, by 
>>> default if the PCI VF has been assigned 4 MSI-X vectors, and 
>>> VIRTIO_ADMIN_DEVICE_MGMT
>>> +increases the MSI-X vectors to 8. On this change, reading Table 
>>> size field of the MSI-X message control register will reflect a 
>>> value of 7.
>
>
> This seems odd, what happens if we reduce the number of vectors. Or is 
> such on-the-fly changes of the semantic of a register allowed by the 
> PCI specification?
This is already done in Linux.
>
> I think the driver must do this before creating the VFs (writing to 
> the sriov_numvfs or status), and the device will ignore or fail the 
> request of such changes after the VFs have been provisioned.
>
>
>>> +
>>> +It is beyond the scope of the virtio specification to define 
>>> necessary synchronization in system software to ensure that a virtio 
>>> PCI VF device
>>> +interrupt configuration modification is reflected in the PCI device.
>> IMHO it is very much in scope of the specification. The scope of the
>> specification is to allow device interoperability and this very much
>> fits the bill.
>
>
> +1, things will be much easier if we only allow the changes before 
> provisioning VFs.

Do you want to limit the spec to this?

It will restrict the feature a lot.

>
>
>>
>>> However, it is expected that any modern system software implementing 
>>> virtio
>>> +drivers and PCI subsystem will ensure that any changes occurring in 
>>> the VF interrupt configuration is either updated in the PCI VF device or
>>> +such configuration fails.
>> OK. Anything more? What exactly does "interrupt configuration" mean 
>> here?
>>
>>> For example, one way to implement that is to make sure that there is 
>>> no driver bounded to the virtio PCI SR-IOV VF during
>>> +this operation.
>> bounded in what sense?
>>
>> And why do you say VF? Is this command limited to type 1? You only
>> limit it to PCI above.
>>
>> same elsewhere
>>
>>> +
>>> +To query amount of MSI-X interrupt vectors that is currently 
>>> assigned to a managed device, the driver issue 
>>> VIRTIO_ADMIN_DEVICE_MGMT with \field{operation} set to
>> issues
>>
>> lots of grammar error like this elsewhere, pls find and correct.
>>
>>> +"query resource of the designated vdev_id" value (== 2). The driver 
>>> also set the \field{resource} type to be MSI-X vector and the 
>>> managed device id is set to \field{vdev_id}
>>> +field. In the result of a successful operation,
>> meaning "in case"?
>>
>>> the amount of MSI-X interrupt vectors that is currently assigned to 
>>> the designated managed device is
>>> +returned by the device in \field{resource_val} field of the 
>>> virtio_admin_device_mgmt_result structure.
>>> +See section \ref{sec:Basic Facilities of a Virtio Device / Admin 
>>> command set / VIRTIO ADMIN DEVICE MGMT command} for more details.
>>> +
>>> +\paragraph{MSI-X configuration sequence example}\label{sec:Virtio 
>>> Transport Options / Virtio Over PCI Bus / PCI-specific Admin command 
>>> set / VF MSI-X control / MSI-X configuration sequence example }
>>> +
>>> +A typical sequence for configuring MSI-X vectors for PCI VFs using 
>>> MSI-X vector management mechanism is following:
>> rephrase to simplify
>>
>> The driver uses the following sequence for configuring MSI-X vectors
>> ....
>>
>>
>>
>>> +
>>> +\begin{enumerate}
>>> +\item Ensure that VF driver doesn't run and it is safe to change 
>>> MSI-X (e.g. disable sriov auto probing)
>
>
> Is "sriov auto probing" a general OS facility instead of Linux 
> specific? If not, we need clarify what it did here.

Is "disable the automatic probing mechanism for virtual functions, or use 
some other tool to verify that the virtual function is not bound to or 
probed by any device driver" better?

>
> Thanks
>
>
>>> +
>>> +\item Load the PF driver
>>> +
>>> +\item Enable SR-IOV by following the PCI specification
>>> +
>>> +\item Query the management device capabilities using commands 
>>> VIRTIO_ADMIN_DEVICE_IDENTIFY and VIRTIO_ADMIN_DEVICE_MGMT_ATTRS
>>> +
>>> +\item Find the managed VF vdev_id (for type 1 device group the 
>>> vdev_id of PCI VF is equal to vf number)
>>> +
>>> +\item Query the VF MSI-X configuration using command 
>>> VIRTIO_ADMIN_DEVICE_MGMT (query operation)
>>> +
>>> +\item Assign desired MSI-X configuration for the VF using command 
>>> VIRTIO_ADMIN_DEVICE_MGMT (assign operation)
>>> +
>>> +\item After successful completion of the assignment, load the VF 
>>> driver
>>> +
>>> +\item Assign the VF to a VM
>>> +
>>> +\end{enumerate}
>>> +
>>>   \section{Virtio Over MMIO}\label{sec:Virtio Transport Options / 
>>> Virtio Over MMIO}
>>>     Virtual environments without PCI support (a common situation in
>>> diff --git a/introduction.tex b/introduction.tex
>>> index 4358ab1..bfc5498 100644
>>> --- a/introduction.tex
>>> +++ b/introduction.tex
>>> @@ -164,9 +164,39 @@ \subsection{Device 
>>> group}\label{sec:Introduction / Terminology / Device group}
>>>   For now, the supported device groups are:
>>>   \begin{enumerate}
>>>   \item Type 1 - A virtio PCI SR-IOV physical function (PF) and its 
>>> PCI SR-IOV virtual functions (VFs). For this group type, the PF 
>>> device has vdev_id that is equal to 0
>>> -and the VF devices have vdev_id's that are equal to their vf_number 
>>> (according to the PCI SR-IOV specification).
>>> +and the VF devices have vdev_id's that are equal to their vf_number 
>>> (according to the PCI SR-IOV specification). A PCI SR-IOV PF device 
>>> can act as a management device for
>>> +type 1 group. A PCI SR-IOV VF device can act as a managed device 
>>> for type 1 group (see \ref{sec:Introduction / Terminology / Virtio 
>>> management device} and
>>> +\ref{sec:Introduction / Terminology / Virtio managed device} for 
>>> more information).
>>>   \end{enumerate}
>>>   +\subsection{Virtio management device}\label{sec:Introduction / 
>>> Terminology / Virtio management device}
>>> +
>>> +A virtio device that supports VIRTIO_ADMIN_DEVICE_MGMT and 
>>> VIRTIO_ADMIN_DEVICE_MGMT_ATTRS admin commands (see
>>> +\ref{sec:Basic Facilities of a Virtio Device / Admin command set / 
>>> VIRTIO ADMIN DEVICE MGMT command} and
>>> +\ref{sec:Basic Facilities of a Virtio Device / Admin command set / 
>>> VIRTIO ADMIN DEVICE MGMT ATTRS command} for more information).
>>> +This device can manage a virtio managed device. A device group may 
>>> contain zero or more management devices.
>>> +
>>> +A PCI SR-IOV Physical Function based virtio device is an example of 
>>> a possible virtio management device (for type 1 device group).
>>> +
>>> +\subsection{Virtio type 1 management device}\label{sec:Introduction 
>>> / Terminology / Virtio type 1 management device}
>>> +
>>> +A virtio management device for type 1 device group. This device is 
>>> a PCI SR-IOV PF that can set \field{dst_type} to 1 (other virtio 
>>> device in the same device group),
>>> +and set \field{vdev_id} to an id that corresponds with one of its 
>>> managed virtio devices (PCI SR-IOV VFs) for the 
>>> VIRTIO_ADMIN_DEVICE_MGMT admin command.
>>> +
>>> +A type 1 device group may contain zero or one management devices.
>>> +
>>> +\subsection{virtio managed device}\label{sec:Introduction / 
>>> Terminology / Virtio managed device}
>>> +
>>> +A virtio device that can be managed by a virtio management device.
>>> +A device group may contain zero or more managed devices.
>>> +
>>> +A PCI SR-IOV Virtual Function based virtio device is an example of 
>>> a possible virtio managed device (for type 1 group).
>>> +
>>> +\subsection{virtio type 1 managed device}\label{sec:Introduction / 
>>> Terminology / Virtio type 1 managed device}
>>> +
>>> +A virtio managed device for type 1 device group. This device is a 
>>> PCI SR-IOV VF and is managed by a virtio type 1 management device 
>>> (virtio PCI SR-IOV PF).
>>> +It is implied that all the virtio PCI SR-IOV VFs related to a 
>>> virtio PCI SR-IOV PF that is virtio type 1 management device are 
>>> type 1 managed devices.
>>> +
>>>   \section{Structure Specifications}\label{sec:Structure 
>>> Specifications}
>>>     Many device and driver in-memory structure layouts are 
>>> documented using
>>> -- 
>>> 2.21.0
>
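
To make the request layout and the Table Size example above concrete, here
is a minimal C sketch. It assumes the structure is packed with no padding
and that the host is little-endian (the le16/le64 fields are not
byte-swapped here). The enum names and the helpers mgmt_assign_msix() and
msix_table_size_field() are hypothetical and not part of the patch. The
only PCI fact relied on is that the MSI-X Table Size field is encoded as
N - 1.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Operation and resource values as listed in the proposed patch;
 * the constant names themselves are hypothetical. */
enum { MGMT_OP_ASSIGN = 1, MGMT_OP_QUERY = 2 };
enum { MGMT_RSC_MSIX_VECTOR = 0 };

/* Command specific data from the patch. Packing is an assumption of this
 * sketch: it makes the in-memory layout match the field list exactly.
 * Little-endian host assumed, so le16/le64 map to plain integers. */
struct virtio_admin_device_mgmt_data {
        uint8_t  operation;
        uint16_t resource;      /* le16 */
        uint64_t resource_val;  /* le64 */
        uint8_t  reserved[5];
} __attribute__((packed));

/* 1 + 2 + 8 + 5 bytes */
static_assert(sizeof(struct virtio_admin_device_mgmt_data) == 16,
              "unexpected padding in mgmt data");

/* Hypothetical helper: build an "assign <count> MSI-X vectors" request.
 * The vdev_id lives in the generic admin command header, not here. */
static struct virtio_admin_device_mgmt_data mgmt_assign_msix(uint64_t count)
{
        struct virtio_admin_device_mgmt_data d;

        memset(&d, 0, sizeof(d));       /* reserved bytes are zeroed */
        d.operation = MGMT_OP_ASSIGN;
        d.resource = MGMT_RSC_MSIX_VECTOR;
        d.resource_val = count;         /* 1-based count */
        return d;
}

/* PCI encodes the MSI-X Table Size field as N - 1, so a function with
 * `vectors` table entries reports `vectors - 1`. Caller passes >= 1. */
static uint16_t msix_table_size_field(uint16_t vectors)
{
        return (uint16_t)(vectors - 1);
}
```

With this, assigning 8 vectors yields resource_val = 8, and a later read of
the Table Size field in the VF's MSI-X Message Control register would be
expected to return 7, matching the example in the patch.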


^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [PATCH v5 7/7] RFC: add initial support for configuring feature bits
  2022-05-15 14:38   ` Michael S. Tsirkin
@ 2022-05-18 15:31     ` Max Gurtovoy
  2022-05-18 16:34       ` Michael S. Tsirkin
  0 siblings, 1 reply; 103+ messages in thread
From: Max Gurtovoy @ 2022-05-18 15:31 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: jasowang, virtio-comment, cohuck, virtio-dev, oren, parav,
	shahafs, aadam, virtio


On 5/15/2022 5:38 PM, Michael S. Tsirkin wrote:
> On Wed, Apr 27, 2022 at 01:58:24AM +0300, Max Gurtovoy wrote:
>> After adding the concept of a management and a managed device, add
>> another example of using this concept to manage resources.
>>
>> Today there is no standard definition in the spec that allows user to
>> setup specific feature bits of a virtio device.
>>
>> For that, extend the management mechanism to allow management devices to
>> change feature bits of its managed devices.
>>
>> Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
>
> Please, add more explanation here. E.g. I am guessing these are
> host feature bits, right? How does driver know which features are
> ok to enable?
>
> I would expect some description sections and conformance sections.
>
This is an RFC to emphasize the need for the admin command set.

I added this because we agreed to think about more features to show the 
motivation for this mechanism.

I can work on MSI-X or feature bits as the initial submission.

Choose whichever you think will be easier to merge and let's keep working 
on it.

Working on both would be a distraction.

>> ---
>>   admin.tex | 12 +++++++++---
>>   1 file changed, 9 insertions(+), 3 deletions(-)
>>
>> diff --git a/admin.tex b/admin.tex
>> index 5b54743..43106ba 100644
>> --- a/admin.tex
>> +++ b/admin.tex
>> @@ -113,7 +113,9 @@ \subsection{VIRTIO ADMIN DEVICE CAPS IDENTIFY command}\label{sec:Basic Facilitie
>>           * Bit 0 - if set, the device is a management device
>>           * Bit 1 - if set, the device is a type 1 management device that supports
>>           *         MSI-X vector mgmt of its type 1 managed devices
>> -        * Bits 2 - 63 - reserved for future capabilities.
>> +        * Bit 2 - if set, the device is a type 1 management device that supports
>> +        *         feature mgmt of bits 0 to 63 for its type 1 managed devices
>> +        * Bits 3 - 63 - reserved for future capabilities.
>>           */
>>          le64 device_admin_caps;
>>          u8 reserved[112];
>> @@ -143,7 +145,9 @@ \subsection{VIRTIO ADMIN DEVICE CAPS ACCEPT command}\label{sec:Basic Facilities
>>           * Bit 0 - if set, the driver accepted the device as a management device
>>           * Bit 1 - if set, the driver accepted the device as a type 1 management device
>>           *         that supports MSI-X vector mgmt of its type 1 managed devices
>> -        * Bits 2 - 63 - reserved for future capabilities.
>> +        * Bit 2 - if set, the driver accepted the device as a type 1 management device
>> +        *         that supports feature mgmt of bits 0 to 63 for its type 1 managed devices
>> +        * Bits 3 - 63 - reserved for future capabilities.
>>           */
>>          le64 driver_admin_caps;
>>          u8 reserved[112];
>> @@ -167,12 +171,14 @@ \subsection{VIRTIO ADMIN DEVICE MGMT command}\label{sec:Basic Facilities of a Vi
>>           u8 operation;
>>           /*
>>            * 0 - MSI-X vector
>> -         * 1 - 65535 are reserved
>> +         * 1 - Device feature bits 0 to 63
>> +         * 2 - 65535 are reserved
>>            */
>>           le16 resource;
>>           /*
>>            * The value to the given resource:
>>            * if resource = 0 (MSI-X vector), it's a 1-based count.
>> +         * if resource = 1 (Device feature bits 0 to 63), it's a feature bitmap.
>>            */
>>           le64 resource_val;
>>           u8 reserved[5];
>> -- 
>> 2.21.0
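
Since resource_val now means different things depending on resource, a
small C sketch of how a device might validate it could help the
discussion. Everything here is an assumption for illustration: the enum
names and mgmt_resource_val_valid() are hypothetical, and the rule that
only features offered by the managed device may be enabled is my reading,
not something the patch states. The 2048 upper bound is the MSI-X table
limit from the PCI specification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Resource types from the patch; the names are hypothetical. */
enum mgmt_resource {
        MGMT_RSC_MSIX_VECTOR   = 0,  /* resource_val is a 1-based count */
        MGMT_RSC_FEATURES_0_63 = 1,  /* resource_val is a feature bitmap */
};

/* Hypothetical check a device could apply before acting on an assign
 * operation. `offered_features` stands for feature bits 0 to 63 that the
 * managed device is able to offer (an assumption of this sketch). */
static bool mgmt_resource_val_valid(uint16_t resource, uint64_t val,
                                    uint64_t offered_features)
{
        switch (resource) {
        case MGMT_RSC_MSIX_VECTOR:
                /* A function has between 1 and 2048 MSI-X table entries. */
                return val >= 1 && val <= 2048;
        case MGMT_RSC_FEATURES_0_63:
                /* Reject any bit the managed device cannot offer. */
                return (val & ~offered_features) == 0;
        default:
                /* Values 2 - 65535 are reserved in the patch. */
                return false;
        }
}
```

A failed MSI-X or feature check would presumably map to
VIRTIO_ADMIN_CS_RSC_VAL_INVALID, and the default case to
VIRTIO_ADMIN_CS_RSC_UNSUPPORTED.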


^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [PATCH v5 0/7] Introduce device group and device management
  2022-05-15 15:27 ` [PATCH v5 0/7] Introduce device group and device management Michael S. Tsirkin
@ 2022-05-18 15:32   ` Max Gurtovoy
  0 siblings, 0 replies; 103+ messages in thread
From: Max Gurtovoy @ 2022-05-18 15:32 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: jasowang, virtio-comment, cohuck, virtio-dev, oren, parav,
	shahafs, aadam, virtio


On 5/15/2022 6:27 PM, Michael S. Tsirkin wrote:
> On Wed, Apr 27, 2022 at 01:58:17AM +0300, Max Gurtovoy wrote:
>> Hi,
>> A device group definition will help extending the virtio specefication for
>> various future features that require a notion of grouping devices together or
>> managing devices inside a group. A device group include one or more virtio devices.
>> For now, only support for type 1 device group was added.
>
> OK good progress here. Sent a bunch of comments, most of them
> cosmetic.
>

Is there a chance to get Reviewed-by tags for a few commits in the series,

so that we can see some progress from v5 to v6?

>> Also introduce the admin facility to allow manipulating features and configurations
>> in a generic manner. Using the admin command set, one can manipulate the device itself
>> and/or to manipulate, if possible, another device within the same device group (for now,
>> introduce only support of PCI SR-IOV devices grouping).
>>
>> The admin command set will be extended in the future  to support more functionalities.
>> Some of these functionalities are already under discussions.
>>
>> The admin virtqueue is the first management interface to issue admin commands from
>> the admin command set.
>>
>> Motivation for choosing admin queue as first management interface:
>> 1. It is anticipated that admin queue will be used for managing and configuring
>>     many different type of resources. For example,
>>     a. PCI PF configuring PCI VF attributes.
>>     b. virtio device creating/destroying/configuring subfunctions discussed in [1]
>>     c. composing device config space of VF or SF such as mac address, number of VQs, virtio features
>>
>>     Mapping all of them as configuration registers to MMIO will require large MMIO space,
>>     if done for each VF/SF. Such MMIO implementation in physical devices such as PCI PF and VF
>>     requires on-chip resources to complete within MMIO access latencies. Such resources are very
>>     expensive.
>>
>> 2. Such limitation can be overcome by having smaller MMIO register set to build
>>     a command request response interface. However, such MMIO based command interface
>>     will be limited to serve single outstanding command execution. Such limitation can
>>     resulting in high device creation and composing time which can affect VM startup time.
>>     Often device can queue and service multiple commands in parallel, such command interface
>>     cannot use parallelism offered by the device.
>>
>> 3. When a command wants to DMA data from one or more physical addresses, for example in the future a
>>     live migration command may need to fetch device state consist of config space, tens of
>>     VQs state, VLAN and MAC table, per VQ partial outstanding block IO list database and more.
>>     Packing one or more DMA addresses over new command interface will be burden some and continue
>>     to suffer single outstanding command execution latencies. Such limitation is not good for time
>>     sensitive live migration use cases.
>>
>> 4. A virtio queue overcomes all the above limitations. It also supports DMA and multiple outstanding
>>     descriptors. Similar mechanism exist today for device specific configuration - the control VQ.
>>
>> A future work can add another management interface to issue admin commands.
>>
>> [1] https://lists.oasis-open.org/archives/virtio-comment/202108/msg00025.html
>>
>> This series include the comments and fixes from V1-V4 of the initial patch sets ("VIRTIO: Provision maximum
>> MSI-X vectors for a VF" and "Introduce virtio subsystem and Admin virtqueue" [2]).
>> This series was extended with additional RFC for setting managed device feature bits as another example for
>> using admin command set. Also device/driver negotiation for admin caps was introduced as a response for previous
>> comments on the mailing list.
>>
>> [2] https://lists.oasis-open.org/archives/virtio-comment/202203/msg00005.html
>>
>>
>> Open issues:
>> 1. CCW and MMIO specification for admin_queue_index register
>>
>> Changelog:
>>   - Merged MSI-X configuration series to current one.
>>   - Addressed comments from MST, Jason Wang and others.
>>   - simplified the interface.
>>   - added another resource management  as RFC (feature bits).
>>
>> Max Gurtovoy (7):
>>    Introduce device group
>>    Introduce admin command set
>>    Introduce new destination type for admin commands
>>    Introduce virtio admin virtqueue
>>    Add miscellaneous configuration structure for PCI
>>    Introduce MGMT admin commands
>>    RFC: add initial support for configuring feature bits
>>
>>   admin.tex        | 282 +++++++++++++++++++++++++++++++++++++++++++++++
>>   conformance.tex  |   3 +
>>   content.tex      | 118 +++++++++++++++++++-
>>   introduction.tex |  42 +++++++
>>   4 files changed, 443 insertions(+), 2 deletions(-)
>>   create mode 100644 admin.tex
>>
>> -- 
>> 2.21.0


^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [PATCH v5 7/7] RFC: add initial support for configuring feature bits
  2022-05-18 15:31     ` Max Gurtovoy
@ 2022-05-18 16:34       ` Michael S. Tsirkin
  2022-05-18 23:18         ` Max Gurtovoy
  0 siblings, 1 reply; 103+ messages in thread
From: Michael S. Tsirkin @ 2022-05-18 16:34 UTC (permalink / raw)
  To: Max Gurtovoy
  Cc: jasowang, virtio-comment, cohuck, virtio-dev, oren, parav,
	shahafs, aadam, virtio

On Wed, May 18, 2022 at 06:31:18PM +0300, Max Gurtovoy wrote:
> 
> On 5/15/2022 5:38 PM, Michael S. Tsirkin wrote:
> > On Wed, Apr 27, 2022 at 01:58:24AM +0300, Max Gurtovoy wrote:
> > > After adding the concept of a management and a managed device, add
> > > another example of using this concept to manage resources.
> > > 
> > > Today there is no standard definition in the spec that allows user to
> > > setup specific feature bits of a virtio device.
> > > 
> > > For that, extend the management mechanism to allow management devices to
> > > change feature bits of its managed devices.
> > > 
> > > Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
> > 
> > Please, add more explanation here. E.g. I am guessing these are
> > host feature bits, right? How does driver know which features are
> > ok to enable?
> > 
> > I would expect some description sections and conformance sections.
> > 
> This is an RFC to emphasize the need for the admin command set.
> 
> I added this because we agreed to think about more features to show the
> motivation for this mechanism.
> 
> I can work on MSI-X or feature bits as the initial submission.
> 
> Choose whichever you think will be easier to merge and let's keep working on
> it.
> 
> Working on both would be a distraction.

OK.

I personally think feature bits are easier: we don't need to also refer
to the PCI spec.

> > > ---
> > >   admin.tex | 12 +++++++++---
> > >   1 file changed, 9 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/admin.tex b/admin.tex
> > > index 5b54743..43106ba 100644
> > > --- a/admin.tex
> > > +++ b/admin.tex
> > > @@ -113,7 +113,9 @@ \subsection{VIRTIO ADMIN DEVICE CAPS IDENTIFY command}\label{sec:Basic Facilitie
> > >           * Bit 0 - if set, the device is a management device
> > >           * Bit 1 - if set, the device is a type 1 management device that supports
> > >           *         MSI-X vector mgmt of its type 1 managed devices
> > > -        * Bits 2 - 63 - reserved for future capabilities.
> > > +        * Bit 2 - if set, the device is a type 1 management device that supports
> > > +        *         feature mgmt of bits 0 to 63 for its type 1 managed devices
> > > +        * Bits 3 - 63 - reserved for future capabilities.
> > >           */
> > >          le64 device_admin_caps;
> > >          u8 reserved[112];
> > > @@ -143,7 +145,9 @@ \subsection{VIRTIO ADMIN DEVICE CAPS ACCEPT command}\label{sec:Basic Facilities
> > >           * Bit 0 - if set, the driver accepted the device as a management device
> > >           * Bit 1 - if set, the driver accepted the device as a type 1 management device
> > >           *         that supports MSI-X vector mgmt of its type 1 managed devices
> > > -        * Bits 2 - 63 - reserved for future capabilities.
> > > +        * Bit 2 - if set, the driver accepted the device as a type 1 management device
> > > +        *         that supports feature mgmt of bits 0 to 63 for its type 1 managed devices
> > > +        * Bits 3 - 63 - reserved for future capabilities.
> > >           */
> > >          le64 driver_admin_caps;
> > >          u8 reserved[112];
> > > @@ -167,12 +171,14 @@ \subsection{VIRTIO ADMIN DEVICE MGMT command}\label{sec:Basic Facilities of a Vi
> > >           u8 operation;
> > >           /*
> > >            * 0 - MSI-X vector
> > > -         * 1 - 65535 are reserved
> > > +         * 1 - Device feature bits 0 to 63
> > > +         * 2 - 65535 are reserved
> > >            */
> > >           le16 resource;
> > >           /*
> > >            * The value to the given resource:
> > >            * if resource = 0 (MSI-X vector), it's a 1-based count.
> > > +         * if resource = 1 (Device feature bits 0 to 63), it's a feature bitmap.
> > >            */
> > >           le64 resource_val;
> > >           u8 reserved[5];
> > > -- 
> > > 2.21.0


^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [PATCH v5 6/7] Introduce MGMT admin commands
  2022-05-18 15:27       ` Max Gurtovoy
@ 2022-05-18 16:41         ` Michael S. Tsirkin
  2022-05-18 23:10           ` Max Gurtovoy
  0 siblings, 1 reply; 103+ messages in thread
From: Michael S. Tsirkin @ 2022-05-18 16:41 UTC (permalink / raw)
  To: Max Gurtovoy
  Cc: Jason Wang, virtio-comment, cohuck, virtio-dev, oren, parav,
	shahafs, aadam, virtio

On Wed, May 18, 2022 at 06:27:50PM +0300, Max Gurtovoy wrote:
> > Note that MSI has been used by various platform devices. It would be
> > better if we can make it work for non-PCI devices otherwise we may
> > re-introduce duplicated commands.
> > 
> we can't even agree on PCI existing feature today in Linux so adding more
> complexity will bring us back to the beginning.

I think I agree with Max here. MMIO and CCW do not support MSI ATM.
Adding this for MMIO has been proposed in the past, but the proposal
was too complex and would have lost the attractive property of MMIO,
which is its simplicity.

> > 
> > > > +\end{note}
> > > > +
> > > > +\begin{note}
> > > > +{For this command, if driver is setting \field{resource} to
> > > > MSI-X vector type, the \field{vdev_id} can't be associated with
> > > > a Virtual Function with
> > > > +VF index greater than NumVFs value as defined in the PCI
> > > > specification or smaller than 1. An error is returned by the
> > > > device when \field{vdev_id} is out of the range.}
> > > > +\end{note}
> > > > +
> > > > +\subsection{VIRTIO ADMIN DEVICE MGMT ATTRS
> > > > command}\label{sec:Basic Facilities of a Virtio Device / Admin
> > > > command set / VIRTIO ADMIN DEVICE MGMT ATTRS command}
> > > > +
> > > > +The VIRTIO_ADMIN_DEVICE_MGMT_ATTRS command has no command
> > > > specific data set by the driver.
> > > > +The \field{command} is set to VIRTIO_ADMIN_DEVICE_MGMT_ATTRS.
> > > > +
> > > > +The device, upon success, returns a result that describes the
> > > > management device attributes.
> > > > +This result is of form:
> > > > +\begin{lstlisting}
> > > > +struct virtio_admin_device_mgmt_attrs_result {
> > > > +        /* Indicates which of the below fields were returned
> > > > +         * (1 means that field was returned):
> > > > +         * Bit 0 - vfs_total_msix_count
> > > > +         * Bit 1 - vfs_assigned_msix_count
> > > > +         * Bit 2 - per_vf_max_msix_count
> > > > +         * Bits 3 - 63 - reserved for future fields
> > > > +         */
> > > > +        le64 attrs_mask;
> > > > +
> > > > +        /* Total number of msix vectors for the total number of VFs */
> > > > +        le32 vfs_total_msix_count;
> > > > +        /* Assigned number of msix vectors for the enabled VFs */
> > > > +        le32 vfs_assigned_msix_count;
> > > > +        /* Max number of msix vectors that can be assigned for
> > > > a single VF */
> > > > +        le16 per_vf_max_msix_count;
> > > > +
> > > > +        u8 reserved[110];
> > > > +};
> > > > +\end{lstlisting}
> > > > +
> > > > +\begin{note}
> > > > +{The \field{vfs_total_msix_count},
> > > > \field{vfs_assigned_msix_count} and
> > > > \field{per_vf_max_msix_count} returned by the device if the
> > > > +designated vdev_id is a management device that can
> > > > allocate/deallocate MSI-X resources for PCI VFs devices.
> > > > Otherwise,
> > > > +the associated bits in \field{attrs_mask} are zeroed by the device.}
> > > > +\end{note}
> > > > +
> > > >   \section{Admin Virtqueues}\label{sec:Basic Facilities of a
> > > > Virtio Device / Admin Virtqueues}
> > > >     An admin virtqueue is a management interface of a device
> > > > that can be used to send administrative
> > > > diff --git a/content.tex b/content.tex
> > > > index 0c1d44f..81e5850 100644
> > > > --- a/content.tex
> > > > +++ b/content.tex
> > > > @@ -451,6 +451,18 @@ \section{Exporting Objects}\label{sec:Basic
> > > > Facilities of a Virtio Device / Expo
> > > >     \input{admin.tex}
> > > >   +\section{Device management}\label{sec:Basic Facilities of a
> > > > Virtio Device / Device management}
> > > > +
> > > > +A device group might consist of one or more virtio devices. For
> > > > example, virtio PCI SR-IOV PF and its VFs compose a type 1
> > > > device group.
> > > > +A capable PCI SR-IOV PF virtio device might act as the
> > > > management device in this group, and its PCI SR-IOV VFs are the
> > > > managed devices.
> > > > +A management device might have various management capabilities
> > > > and attributes to manage its managed devices.
> > > This makes my eyes glaze over.
> > > Please, find all instances which say "manage" more than once and
> > > rephrase.
> > > 
> > > > The capabilities exposed
> > > > +in the result of VIRTIO_ADMIN_DEVICE_CAPS_IDENTIFY command (see
> > > > section \ref{sec:Basic Facilities of a Virtio Device / Admin
> > > > command set / VIRTIO ADMIN DEVICE CAPS IDENTIFY command}
> > > > +for more details) and the attributes exposed in the result of
> > > > VIRTIO_ADMIN_DEVICE_MGMT_ATTRS command
> > > > +(see section \ref{sec:Basic Facilities of a Virtio Device /
> > > > Admin command set / VIRTIO ADMIN DEVICE MGMT ATTRS command} for
> > > > more details).
> > > > +
> > > > +The management device will use the VIRTIO_ADMIN_DEVICE_MGMT
> > > > admin command to manage its managed devices (see section
> > > > +\ref{sec:Basic Facilities of a Virtio Device / Admin command
> > > > set / VIRTIO ADMIN DEVICE MGMT command} for more details).
> > > > +
> > > >   \chapter{General Initialization And Device
> > > > Operation}\label{sec:General Initialization And Device
> > > > Operation}
> > > >     We start with an overview of device initialization, then
> > > > expand on the
> > > > @@ -1763,6 +1775,75 @@ \subsubsection{Driver Handling
> > > > Interrupts}\label{sec:Virtio Transport Options /
> > > >       \end{itemize}
> > > >   \end{itemize}
> > > >   +\subsection{PCI-specific Admin capabilities}\label{sec:Virtio
> > > > Transport Options / Virtio Over PCI Bus / PCI-specific Admin
> > > > capabilities}
> > > > +
> > > > +This documents the group of admin capabilities for PCI virtio
> > > > devices. Each capability is
> > > > +implemented using one or more Admin commands.
> > > > +
> > > > +\subsubsection{MSI-X vector management}\label{sec:Virtio
> > > > Transport Options / Virtio Over PCI Bus / PCI-specific Admin
> > > > command set / MSI-X vector management}
> > > > +
> > > > +This capability enables a virtio management device to control
> > > > the assignment of MSI-X interrupt vectors
> > > > +for its managed devices.
> > 
> > 
> > I think we need to clarify whether the Initial VFs belong to the
> > "managed device".
> > 
> > 
> > > >   In PCI, a management device can be the PF device and the
> > > > managed device can be the VF (for example in a type 1 device
> > > > group).
> > > > +Capable management devices will need to implement
> > > > VIRTIO_ADMIN_DEVICE_MGMT and VIRTIO_ADMIN_DEVICE_MGMT_ATTRS
> > > > admin commands, report the MSI-X attributes in the result of
> > > > +VIRTIO_ADMIN_DEVICE_MGMT_ATTRS and report that MSI-X vector
> > > > resource management is supported in the result of
> > > > VIRTIO_ADMIN_DEVICE_CAPS_IDENTIFY admin command.
> > > > +See sections \ref{sec:Basic Facilities of a Virtio Device /
> > > > Admin command set / VIRTIO ADMIN DEVICE CAPS IDENTIFY command}
> > > > and
> > > > +\ref{sec:Basic Facilities of a Virtio Device / Admin command
> > > > set / VIRTIO ADMIN DEVICE MGMT ATTRS command} for more details.
> > > > +
> > > > +In the result of VIRTIO_ADMIN_DEVICE_MGMT_ATTRS admin command,
> > > > a capable management device will return the total number of
> > > > +msix vectors for its VFs in \field{vfs_total_msix_count} field,
> > > > the number of already assigned msix vectors for its VFs in
> > > > +\field{vfs_assigned_msix_count} field and also the maximal
> > > > number of msix vectors that can be assigned for a single VF in
> > > > +\field{per_vf_max_msix_count} field. In addition, bit 0, bit 1
> > > > and bit 2 are set to indicate on the validity of the other 3
> > > > +fields in the \field{attrs_mask} field of the result buffer.
> > > > +See section \ref{sec:Basic Facilities of a Virtio Device /
> > > > Admin command set / VIRTIO ADMIN DEVICE MGMT ATTRS command} for
> > > > more details.
> > > > +
> > > > +The default assignment of the MSI-X vectors for managed devices
> > > > is out of the scope of this specification.
> > > > +A driver, using VIRTIO_ADMIN_DEVICE_MGMT can update the MSI-X
> > > > assignment for a specific managed device.
> > > > +In the data of VIRTIO_ADMIN_DEVICE_MGMT admin command, a driver
> > > > set the \field{resource} type to be MSI-X vector and the
> > > > +amount of MSI-X interrupt vectors to configure to the
> > > > designated managed device in \field{resource_val}. The managed
> > > > device id is set to \field{vdev_id} field.
> > > > +
> > > > +A successful operation guarantees that the requested amount of
> > > > MSI-X interrupt vectors was assigned to the designated device.
> > > > +This value is also returned in the
> > > > virtio_admin_device_mgmt_result structure.
> > > > +Also, a successful operation guarantees that the MSI-X
> > > > capability access by the designated PCI device defined by the
> > > > PCI specification must reflect
> > > > +the new configuration in all relevant fields. For example, by
> > > > default if the PCI VF has been assigned 4 MSI-X vectors, and
> > > > VIRTIO_ADMIN_DEVICE_MGMT
> > > > +increases the MSI-X vectors to 8. On this change, reading Table
> > > > size field of the MSI-X message control register will reflect a
> > > > value of 7.
> > 
> > 
> > This seems odd, what happens if we reduce the number of vectors. Or is
> > such on-the-fly changes of the semantic of a register allowed by the PCI
> > specification?
> it's done in Linux.
> > 
> > I think the driver must do this before creating the VFs (writing to the
> > sriov_numvfs or status), and the device will ignore or fail the request
> > of such changes after the VFs have been provisioned.
> > 
> > 
> > > > +
> > > > +It is beyond the scope of the virtio specification to define
> > > > necessary synchronization in system software to ensure that a virtio
> > > > PCI VF device +interrupt configuration modification is reflected in
> > > > the PCI device.
> > > IMHO it is very much in scope of the specification. The scope of the
> > > specification is to allow device interoperability and this very much
> > > fits the bill.
> > 
> > 
> > +1, things will be much easier if we only allow the changes before
> > provisioning VFs.

I suspect it won't be enough though. VFIO binds to VFs and caches MSIX
before provisioning them to VMs.
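
For reference on the quoted example (8 vectors read back as 7): the Table Size
field occupies bits 10:0 of the MSI-X Message Control register and is encoded
as N - 1. A minimal sketch of that encoding follows; the macro and helper names
are illustrative and are not taken from the patch or from any driver.

```c
#include <assert.h>
#include <stdint.h>

/* Table Size field of MSI-X Message Control: bits 10:0, encoded as N - 1. */
#define MSIX_CTRL_TABLE_SIZE_MASK 0x07FFu

/* Number of vectors advertised by a Message Control register value. */
static inline unsigned int msix_vector_count(uint16_t msg_ctrl)
{
    return (unsigned int)(msg_ctrl & MSIX_CTRL_TABLE_SIZE_MASK) + 1u;
}

/* Table Size field value for a given vector count (valid range 1..2048). */
static inline uint16_t msix_table_size_field(unsigned int nvec)
{
    assert(nvec >= 1u && nvec <= 2048u);
    return (uint16_t)((nvec - 1u) & MSIX_CTRL_TABLE_SIZE_MASK);
}
```

Software that reads this field once and caches the result will not observe a
later change in the assigned vector count.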


> 
> Do you want to limit the spec to this ?
> 
> it will restrict the feature a lot.



> > 
> > 
> > > 
> > > > However, it is expected that any modern system software implementing
> > > > virtio +drivers and PCI subsystem will ensure that any changes
> > > > occurring in the VF interrupt configuration is either updated in the
> > > > PCI VF device or +such configuration fails.
> > > OK. Anything more? What exactly does "interrupt configuration" mean
> > > here?
> > > 
> > > > For example, one way to
> > > > implement that is to make sure that there is no driver bounded to the
> > > > virtio PCI SR-IOV VF during +this operation.
> > > bounded in what sense?
> > > 
> > > And why do you say VF? Is this command limited to type 1? You only
> > > limit it to PCI above.
> > > 
> > > same elsewhere
> > > 
> > > > +
> > > > +To query amount of MSI-X interrupt vectors that is currently
> > > > assigned to a managed device, the driver issue
> > > > VIRTIO_ADMIN_DEVICE_MGMT with \field{operation} set to
> > > issues
> > > 
> > > lots of grammar error like this elsewhere, pls find and correct.
> > > 
> > > > +"query resource of the designated vdev_id" value (== 2). The
> > > > driver also set the \field{resource} type to be MSI-X vector and
> > > > the managed device id is set to \field{vdev_id}
> > > > +field. In the result of a successful operation,
> > > meaning "in case"?
> > > 
> > > > the amount of MSI-X interrupt vectors that is currently assigned
> > > > to the designated managed device is
> > > > +returned by the device in \field{resource_val} field of the
> > > > virtio_admin_device_mgmt_result structure.
> > > > +See section \ref{sec:Basic Facilities of a Virtio Device /
> > > > Admin command set / VIRTIO ADMIN DEVICE MGMT command} for more
> > > > details.
> > > > +
> > > > +\paragraph{MSI-X configuration sequence
> > > > example}\label{sec:Virtio Transport Options / Virtio Over PCI
> > > > Bus / PCI-specific Admin command set / VF MSI-X control / MSI-X
> > > > configuration sequence example }
> > > > +
> > > > +A typical sequence for configuring MSI-X vectors for PCI VFs
> > > > using MSI-X vector management mechanism is following:
> > > rephrase to simplify
> > > 
> > > The driver uses the following sequence for configuring MSI-X vectors
> > > ....
> > > 
> > > 
> > > 
> > > > +
> > > > +\begin{enumerate}
> > > > +\item Ensure that VF driver doesn't run and it is safe to
> > > > change MSI-X (e.g. disable sriov auto probing)
> > 
> > 
> > Is "sriov auto probing" a general OS facility instead of Linux specific?
> > If not, we need clarify what it did here.
> 
> is "disable automatic probing mechanism for virtual functions or use some
> other tools to verify the virtual function is not bound and probed by any
> device driver"
> 
> better ?

That may be enough, but I suspect it is not practical for management reasons
(e.g. management might expect to be able to bind to VFs and keep the VFIO fd
open, passing it on to non-privileged daemons).
I feel that it's better to be very specific about what should
not happen, though: which fields should not be accessed.



> > 
> > Thanks
> > 
> > 
> > > > +
> > > > +\item Load the PF driver
> > > > +
> > > > +\item Enable SR-IOV by following the PCI specification
> > > > +
> > > > +\item Query the management device capabilities using commands
> > > > VIRTIO_ADMIN_DEVICE_IDENTIFY and VIRTIO_ADMIN_DEVICE_MGMT_ATTRS
> > > > +
> > > > +\item Find the managed VF vdev_id (for type 1 device group the
> > > > vdev_id of PCI VF is equal to vf number)
> > > > +
> > > > +\item Query the VF MSI-X configuration using command
> > > > VIRTIO_ADMIN_DEVICE_MGMT (query operation)
> > > > +
> > > > +\item Assign desired MSI-X configuration for the VF using
> > > > command VIRTIO_ADMIN_DEVICE_MGMT (assign operation)
> > > > +
> > > > +\item After successful completion of the assignment, load the
> > > > VF driver
> > > > +
> > > > +\item Assign the VF to a VM
> > > > +
> > > > +\end{enumerate}
> > > > +
> > > >   \section{Virtio Over MMIO}\label{sec:Virtio Transport Options
> > > > / Virtio Over MMIO}
> > > >     Virtual environments without PCI support (a common situation in
> > > > diff --git a/introduction.tex b/introduction.tex
> > > > index 4358ab1..bfc5498 100644
> > > > --- a/introduction.tex
> > > > +++ b/introduction.tex
> > > > @@ -164,9 +164,39 @@ \subsection{Device
> > > > group}\label{sec:Introduction / Terminology / Device group}
> > > >   For now, the supported device groups are:
> > > >   \begin{enumerate}
> > > >   \item Type 1 - A virtio PCI SR-IOV physical function (PF) and
> > > > its PCI SR-IOV virtual functions (VFs). For this group type, the
> > > > PF device has vdev_id that is equal to 0
> > > > -and the VF devices have vdev_id's that are equal to their
> > > > vf_number (according to the PCI SR-IOV specification).
> > > > +and the VF devices have vdev_id's that are equal to their
> > > > vf_number (according to the PCI SR-IOV specification). A PCI
> > > > SR-IOV PF device can act as a management device for
> > > > +type 1 group. A PCI SR-IOV VF device can act as a managed
> > > > device for type 1 group (see \ref{sec:Introduction / Terminology
> > > > / Virtio management device} and
> > > > +\ref{sec:Introduction / Terminology / Virtio managed device}
> > > > for more information).
> > > >   \end{enumerate}
> > > >   +\subsection{Virtio management device}\label{sec:Introduction
> > > > / Terminology / Virtio management device}
> > > > +
> > > > +A virtio device that supports VIRTIO_ADMIN_DEVICE_MGMT and
> > > > VIRTIO_ADMIN_DEVICE_MGMT_ATTRS admin commands (see
> > > > +\ref{sec:Basic Facilities of a Virtio Device / Admin command
> > > > set / VIRTIO ADMIN DEVICE MGMT command} and
> > > > +\ref{sec:Basic Facilities of a Virtio Device / Admin command
> > > > set / VIRTIO ADMIN DEVICE MGMT ATTRS command} for more
> > > > information).
> > > > +This device can manage a virtio managed device. A device group
> > > > may contain zero or more management devices.
> > > > +
> > > > +A PCI SR-IOV Physical Function based virtio device is an
> > > > example of a possible virtio management device (for type 1
> > > > device group).
> > > > +
> > > > +\subsection{Virtio type 1 management
> > > > device}\label{sec:Introduction / Terminology / Virtio type 1
> > > > management device}
> > > > +
> > > > +A virtio management device for type 1 device group. This device
> > > > is a PCI SR-IOV PF that can set \field{dst_type} to 1 (other
> > > > virtio device in the same device group),
> > > > +and set \field{vdev_id} to an id that corresponds with one of
> > > > its managed virtio devices (PCI SR-IOV VFs) for the
> > > > VIRTIO_ADMIN_DEVICE_MGMT admin command.
> > > > +
> > > > +A type 1 device group may contain zero or one management devices.
> > > > +
> > > > +\subsection{virtio managed device}\label{sec:Introduction /
> > > > Terminology / Virtio managed device}
> > > > +
> > > > +A virtio device that can be managed by a virtio management device.
> > > > +A device group may contain zero or more managed devices.
> > > > +
> > > > +A PCI SR-IOV Virtual Function based virtio device is an example
> > > > of a possible virtio managed device (for type 1 group).
> > > > +
> > > > +\subsection{virtio type 1 managed
> > > > device}\label{sec:Introduction / Terminology / Virtio type 1
> > > > managed device}
> > > > +
> > > > +A virtio managed device for type 1 device group. This device is
> > > > a PCI SR-IOV VF and is managed by a virtio type 1 management
> > > > device (virtio PCI SR-IOV PF).
> > > > +It is implied that all the virtio PCI SR-IOV VFs related to a
> > > > virtio PCI SR-IOV PF that is virtio type 1 management device are
> > > > type 1 managed devices.
> > > > +
> > > >   \section{Structure Specifications}\label{sec:Structure
> > > > Specifications}
> > > >     Many device and driver in-memory structure layouts are
> > > > documented using
> > > > -- 
> > > > 2.21.0
> > 



* Re: [PATCH v5 6/7] Introduce MGMT admin commands
  2022-05-18 16:41         ` Michael S. Tsirkin
@ 2022-05-18 23:10           ` Max Gurtovoy
  0 siblings, 0 replies; 103+ messages in thread
From: Max Gurtovoy @ 2022-05-18 23:10 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Jason Wang, virtio-comment, cohuck, virtio-dev, oren, parav,
	shahafs, aadam, virtio


On 5/18/2022 7:41 PM, Michael S. Tsirkin wrote:
> On Wed, May 18, 2022 at 06:27:50PM +0300, Max Gurtovoy wrote:
>>> Note that MSI has been used by various platform devices. It would be
>>> better if we can make it work for non-PCI devices otherwise we may
>>> re-introduce duplicated commands.
>>>
>> we can't even agree on PCI existing feature today in Linux so adding more
>> complexity will bring us back to the beginning.
> I think I agree with Max here. MMIO and CCW do not support MSI ATM.
> Adding this for MMIO has been proposed in the past but the proposal
> was too complex resulting in losing the attractive property of MMIO
> that is it's simplicity.
>
>>>>> +\end{note}
>>>>> +
>>>>> +\begin{note}
>>>>> +{For this command, if driver is setting \field{resource} to
>>>>> MSI-X vector type, the \field{vdev_id} can't be associated with
>>>>> a Virtual Function with
>>>>> +VF index greater than NumVFs value as defined in the PCI
>>>>> specification or smaller than 1. An error is returned by the
>>>>> device when \field{vdev_id} is out of the range.}
>>>>> +\end{note}
>>>>> +
>>>>> +\subsection{VIRTIO ADMIN DEVICE MGMT ATTRS
>>>>> command}\label{sec:Basic Facilities of a Virtio Device / Admin
>>>>> command set / VIRTIO ADMIN DEVICE MGMT ATTRS command}
>>>>> +
>>>>> +The VIRTIO_ADMIN_DEVICE_MGMT_ATTRS command has no command
>>>>> specific data set by the driver.
>>>>> +The \field{command} is set to VIRTIO_ADMIN_DEVICE_MGMT_ATTRS.
>>>>> +
>>>>> +The device, upon success, returns a result that describes the
>>>>> management device attributes.
>>>>> +This result is of form:
>>>>> +\begin{lstlisting}
>>>>> +struct virtio_admin_device_mgmt_attrs_result {
>>>>> +        /* Indicates which of the below fields were returned
>>>>> +         * (1 means that field was returned):
>>>>> +         * Bit 0 - vfs_total_msix_count
>>>>> +         * Bit 1 - vfs_assigned_msix_count
>>>>> +         * Bit 2 - per_vf_max_msix_count
>>>>> +         * Bits 3 - 63 - reserved for future fields
>>>>> +         */
>>>>> +        le64 attrs_mask;
>>>>> +
>>>>> +        /* Total number of msix vectors for the total number of VFs */
>>>>> +        le32 vfs_total_msix_count;
>>>>> +        /* Assigned number of msix vectors for the enabled VFs */
>>>>> +        le32 vfs_assigned_msix_count;
>>>>> +        /* Max number of msix vectors that can be assigned for
>>>>> a single VF */
>>>>> +        le16 per_vf_max_msix_count;
>>>>> +
>>>>> +        u8 reserved[110];
>>>>> +};
>>>>> +\end{lstlisting}
>>>>> +
>>>>> +\begin{note}
>>>>> +{The \field{vfs_total_msix_count},
>>>>> \field{vfs_assigned_msix_count} and
>>>>> \field{per_vf_max_msix_count} returned by the device if the
>>>>> +designated vdev_id is a management device that can
>>>>> allocate/deallocate MSI-X resources for PCI VFs devices.
>>>>> Otherwise,
>>>>> +the associated bits in \field{attrs_mask} are zeroed by the device.}
>>>>> +\end{note}
>>>>> +
>>>>>    \section{Admin Virtqueues}\label{sec:Basic Facilities of a
>>>>> Virtio Device / Admin Virtqueues}
>>>>>      An admin virtqueue is a management interface of a device
>>>>> that can be used to send administrative
>>>>> diff --git a/content.tex b/content.tex
>>>>> index 0c1d44f..81e5850 100644
>>>>> --- a/content.tex
>>>>> +++ b/content.tex
>>>>> @@ -451,6 +451,18 @@ \section{Exporting Objects}\label{sec:Basic
>>>>> Facilities of a Virtio Device / Expo
>>>>>      \input{admin.tex}
>>>>>    +\section{Device management}\label{sec:Basic Facilities of a
>>>>> Virtio Device / Device management}
>>>>> +
>>>>> +A device group might consist of one or more virtio devices. For
>>>>> example, virtio PCI SR-IOV PF and its VFs compose a type 1
>>>>> device group.
>>>>> +A capable PCI SR-IOV PF virtio device might act as the
>>>>> management device in this group, and its PCI SR-IOV VFs are the
>>>>> managed devices.
>>>>> +A management device might have various management capabilities
>>>>> and attributes to manage its managed devices.
>>>> This makes my eyes glaze over.
>>>> Please, find all instances which say "manage" more than once and
>>>> rephrase.
>>>>
>>>>> The capabilities exposed
>>>>> +in the result of VIRTIO_ADMIN_DEVICE_CAPS_IDENTIFY command (see
>>>>> section \ref{sec:Basic Facilities of a Virtio Device / Admin
>>>>> command set / VIRTIO ADMIN DEVICE CAPS IDENTIFY command}
>>>>> +for more details) and the attributes exposed in the result of
>>>>> VIRTIO_ADMIN_DEVICE_MGMT_ATTRS command
>>>>> +(see section \ref{sec:Basic Facilities of a Virtio Device /
>>>>> Admin command set / VIRTIO ADMIN DEVICE MGMT ATTRS command} for
>>>>> more details).
>>>>> +
>>>>> +The management device will use the VIRTIO_ADMIN_DEVICE_MGMT
>>>>> admin command to manage its managed devices (see section
>>>>> +\ref{sec:Basic Facilities of a Virtio Device / Admin command
>>>>> set / VIRTIO ADMIN DEVICE MGMT command} for more details).
>>>>> +
>>>>>    \chapter{General Initialization And Device
>>>>> Operation}\label{sec:General Initialization And Device
>>>>> Operation}
>>>>>      We start with an overview of device initialization, then
>>>>> expand on the
>>>>> @@ -1763,6 +1775,75 @@ \subsubsection{Driver Handling
>>>>> Interrupts}\label{sec:Virtio Transport Options /
>>>>>        \end{itemize}
>>>>>    \end{itemize}
>>>>>    +\subsection{PCI-specific Admin capabilities}\label{sec:Virtio
>>>>> Transport Options / Virtio Over PCI Bus / PCI-specific Admin
>>>>> capabilities}
>>>>> +
>>>>> +This documents the group of admin capabilities for PCI virtio
>>>>> devices. Each capability is
>>>>> +implemented using one or more Admin commands.
>>>>> +
>>>>> +\subsubsection{MSI-X vector management}\label{sec:Virtio
>>>>> Transport Options / Virtio Over PCI Bus / PCI-specific Admin
>>>>> command set / MSI-X vector management}
>>>>> +
>>>>> +This capability enables a virtio management device to control
>>>>> the assignment of MSI-X interrupt vectors
>>>>> +for its managed devices.
>>>
>>> I think we need to clarify whether the Initial VFs belong to the
>>> "managed device".
>>>
>>>
>>>>>    In PCI, a management device can be the PF device and the
>>>>> managed device can be the VF (for example in a type 1 device
>>>>> group).
>>>>> +Capable management devices will need to implement
>>>>> VIRTIO_ADMIN_DEVICE_MGMT and VIRTIO_ADMIN_DEVICE_MGMT_ATTRS
>>>>> admin commands, report the MSI-X attributes in the result of
>>>>> +VIRTIO_ADMIN_DEVICE_MGMT_ATTRS and report that MSI-X vector
>>>>> resource management is supported in the result of
>>>>> VIRTIO_ADMIN_DEVICE_CAPS_IDENTIFY admin command.
>>>>> +See sections \ref{sec:Basic Facilities of a Virtio Device /
>>>>> Admin command set / VIRTIO ADMIN DEVICE CAPS IDENTIFY command}
>>>>> and
>>>>> +\ref{sec:Basic Facilities of a Virtio Device / Admin command
>>>>> set / VIRTIO ADMIN DEVICE MGMT ATTRS command} for more details.
>>>>> +
>>>>> +In the result of VIRTIO_ADMIN_DEVICE_MGMT_ATTRS admin command,
>>>>> a capable management device will return the total number of
>>>>> +msix vectors for its VFs in \field{vfs_total_msix_count} field,
>>>>> the number of already assigned msix vectors for its VFs in
>>>>> +\field{vfs_assigned_msix_count} field and also the maximal
>>>>> number of msix vectors that can be assigned for a single VF in
>>>>> +\field{per_vf_max_msix_count} field. In addition, bit 0, bit 1
>>>>> and bit 2 are set to indicate on the validity of the other 3
>>>>> +fields in the \field{attrs_mask} field of the result buffer.
>>>>> +See section \ref{sec:Basic Facilities of a Virtio Device /
>>>>> Admin command set / VIRTIO ADMIN DEVICE MGMT ATTRS command} for
>>>>> more details.
>>>>> +
>>>>> +The default assignment of the MSI-X vectors for managed devices
>>>>> is out of the scope of this specification.
>>>>> +A driver, using VIRTIO_ADMIN_DEVICE_MGMT can update the MSI-X
>>>>> assignment for a specific managed device.
>>>>> +In the data of VIRTIO_ADMIN_DEVICE_MGMT admin command, a driver
>>>>> set the \field{resource} type to be MSI-X vector and the
>>>>> +amount of MSI-X interrupt vectors to configure to the
>>>>> designated managed device in \field{resource_val}. The managed
>>>>> device id is set to \field{vdev_id} field.
>>>>> +
>>>>> +A successful operation guarantees that the requested amount of
>>>>> MSI-X interrupt vectors was assigned to the designated device.
>>>>> +This value is also returned in the
>>>>> virtio_admin_device_mgmt_result structure.
>>>>> +Also, a successful operation guarantees that the MSI-X
>>>>> capability access by the designated PCI device defined by the
>>>>> PCI specification must reflect
>>>>> +the new configuration in all relevant fields. For example, by
>>>>> default if the PCI VF has been assigned 4 MSI-X vectors, and
>>>>> VIRTIO_ADMIN_DEVICE_MGMT
>>>>> +increases the MSI-X vectors to 8. On this change, reading Table
>>>>> size field of the MSI-X message control register will reflect a
>>>>> value of 7.
>>>
>>> This seems odd, what happens if we reduce the number of vectors. Or is
>>> such on-the-fly changes of the semantic of a register allowed by the PCI
>>> specification?
>> it's done in Linux.
>>> I think the driver must do this before creating the VFs (writing to the
>>> sriov_numvfs or status), and the device will ignore or fail the request
>>> of such changes after the VFs have been provisioned.
>>>
>>>
>>>>> +
>>>>> +It is beyond the scope of the virtio specification to define
>>>>> necessary synchronization in system software to ensure that a virtio
>>>>> PCI VF device +interrupt configuration modification is reflected in
>>>>> the PCI device.
>>>> IMHO it is very much in scope of the specification. The scope of the
>>>> specification is to allow device interoperability and this very much
>>>> fits the bill.
>>>
>>> +1, things will be much easier if we only allow the changes before
>>> provisioning VFs.
> I suspect it won't be enough though. VFIO binds to VFs and caches MSIX
> before provisioning them to VMs.
>
I mentioned that the VF device shouldn't be bound to any driver.

Not VFIO and not virtio.

This can be done in Linux by disabling auto probing of VFs.

See the steps of the configuration I wrote in this commit.
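
As an illustration of the Linux mechanism mentioned above: the PF exposes a
sriov_drivers_autoprobe sysfs attribute, and writing 0 to it before the VFs are
enabled keeps newly enabled VFs from being bound to a driver. A minimal sketch
follows; the helper names are hypothetical, and the caller is assumed to pass
the PF's sysfs directory (for example /sys/bus/pci/devices/<BDF>).

```c
#include <assert.h>
#include <stdio.h>

/* Write a short string to an attribute file. Returns 0 on success, -1 on error. */
static int write_attr(const char *path, const char *value)
{
    FILE *f = fopen(path, "w");
    if (!f)
        return -1;
    int ok = fputs(value, f) >= 0;
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}

/*
 * Disable VF driver auto probing for the PF whose sysfs directory is
 * pf_sysfs_dir. Hypothetical helper, for illustration only.
 */
static int disable_vf_autoprobe(const char *pf_sysfs_dir)
{
    char path[512];

    assert(pf_sysfs_dir != NULL);
    int n = snprintf(path, sizeof(path), "%s/sriov_drivers_autoprobe",
                     pf_sysfs_dir);
    if (n < 0 || (size_t)n >= sizeof(path))
        return -1;
    return write_attr(path, "0");
}
```

This only covers the "no driver bound at VF creation" part of step 1; it does
not stop an administrator from binding a driver to the VF manually later.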

>> Do you want to limit the spec to this ?
>>
>> it will restrict the feature a lot.
>
>
>>>
>>>>> However, it is expected that any modern system software implementing
>>>>> virtio +drivers and PCI subsystem will ensure that any changes
>>>>> occurring in the VF interrupt configuration is either updated in the
>>>>> PCI VF device or +such configuration fails.
>>>> OK. Anything more? What exactly does "interrupt configuration" mean
>>>> here?
>>>>
>>>>> For example, one way to
>>>>> implement that is to make sure that there is no driver bounded to the
>>>>> virtio PCI SR-IOV VF during +this operation.
>>>> bounded in what sense?
>>>>
>>>> And why do you say VF? Is this command limited to type 1? You only
>>>> limit it to PCI above.
>>>>
>>>> same elsewhere
>>>>
>>>>> +
>>>>> +To query amount of MSI-X interrupt vectors that is currently
>>>>> assigned to a managed device, the driver issue
>>>>> VIRTIO_ADMIN_DEVICE_MGMT with \field{operation} set to
>>>> issues
>>>>
>>>> lots of grammar error like this elsewhere, pls find and correct.
>>>>
>>>>> +"query resource of the designated vdev_id" value (== 2). The
>>>>> driver also set the \field{resource} type to be MSI-X vector and
>>>>> the managed device id is set to \field{vdev_id}
>>>>> +field. In the result of a successful operation,
>>>> meaning "in case"?
>>>>
>>>>> the amount of MSI-X interrupt vectors that is currently assigned
>>>>> to the designated managed device is
>>>>> +returned by the device in \field{resource_val} field of the
>>>>> virtio_admin_device_mgmt_result structure.
>>>>> +See section \ref{sec:Basic Facilities of a Virtio Device /
>>>>> Admin command set / VIRTIO ADMIN DEVICE MGMT command} for more
>>>>> details.
>>>>> +
>>>>> +\paragraph{MSI-X configuration sequence
>>>>> example}\label{sec:Virtio Transport Options / Virtio Over PCI
>>>>> Bus / PCI-specific Admin command set / VF MSI-X control / MSI-X
>>>>> configuration sequence example }
>>>>> +
>>>>> +A typical sequence for configuring MSI-X vectors for PCI VFs
>>>>> using MSI-X vector management mechanism is following:
>>>> rephrase to simplify
>>>>
>>>> The driver uses the following sequence for configuring MSI-X vectors
>>>> ....
>>>>
>>>>
>>>>
>>>>> +
>>>>> +\begin{enumerate}
>>>>> +\item Ensure that VF driver doesn't run and it is safe to
>>>>> change MSI-X (e.g. disable sriov auto probing)
>>>
>>> Is "sriov auto probing" a general OS facility instead of Linux specific?
>>> If not, we need clarify what it did here.
>> is "disable automatic probing mechanism for virtual functions or use some
>> other tools to verify the virtual function is not bound and probed by any
>> device driver"
>>
>> better ?
> That may be enough, but I suspect not practical for management reasons
> (e.g. management might expect to be able to bind to VFs and keep the VFIO fd
> open, passing it on to non privileged daemons).
> I feel that it's better to be very specific about what should
> not happen though. which fields should not be accessed.

I really can't understand how a VFIO fd can be opened if the
pre-condition is that the VF is not bound to or probed by any device
driver (and VFIO is a device driver).

I'm very specific about that.

Can you please tell me what is not clear?

>
>
>>> Thanks
>>>
>>>
>>>>> +
>>>>> +\item Load the PF driver
>>>>> +
>>>>> +\item Enable SR-IOV by following the PCI specification
>>>>> +
>>>>> +\item Query the management device capabilities using commands
>>>>> VIRTIO_ADMIN_DEVICE_IDENTIFY and VIRTIO_ADMIN_DEVICE_MGMT_ATTRS
>>>>> +
>>>>> +\item Find the managed VF vdev_id (for type 1 device group the
>>>>> vdev_id of PCI VF is equal to vf number)
>>>>> +
>>>>> +\item Query the VF MSI-X configuration using command
>>>>> VIRTIO_ADMIN_DEVICE_MGMT (query operation)
>>>>> +
>>>>> +\item Assign desired MSI-X configuration for the VF using
>>>>> command VIRTIO_ADMIN_DEVICE_MGMT (assign operation)
>>>>> +
>>>>> +\item After successful completion of the assignment, load the
>>>>> VF driver
>>>>> +
>>>>> +\item Assign the VF to a VM
>>>>> +
>>>>> +\end{enumerate}
>>>>> +
>>>>>    \section{Virtio Over MMIO}\label{sec:Virtio Transport Options
>>>>> / Virtio Over MMIO}
>>>>>      Virtual environments without PCI support (a common situation in
>>>>> diff --git a/introduction.tex b/introduction.tex
>>>>> index 4358ab1..bfc5498 100644
>>>>> --- a/introduction.tex
>>>>> +++ b/introduction.tex
>>>>> @@ -164,9 +164,39 @@ \subsection{Device
>>>>> group}\label{sec:Introduction / Terminology / Device group}
>>>>>    For now, the supported device groups are:
>>>>>    \begin{enumerate}
>>>>>    \item Type 1 - A virtio PCI SR-IOV physical function (PF) and
>>>>> its PCI SR-IOV virtual functions (VFs). For this group type, the
>>>>> PF device has vdev_id that is equal to 0
>>>>> -and the VF devices have vdev_id's that are equal to their
>>>>> vf_number (according to the PCI SR-IOV specification).
>>>>> +and the VF devices have vdev_id's that are equal to their
>>>>> vf_number (according to the PCI SR-IOV specification). A PCI
>>>>> SR-IOV PF device can act as a management device for
>>>>> +type 1 group. A PCI SR-IOV VF device can act as a managed
>>>>> device for type 1 group (see \ref{sec:Introduction / Terminology
>>>>> / Virtio management device} and
>>>>> +\ref{sec:Introduction / Terminology / Virtio managed device}
>>>>> for more information).
>>>>>    \end{enumerate}
>>>>>    +\subsection{Virtio management device}\label{sec:Introduction
>>>>> / Terminology / Virtio management device}
>>>>> +
>>>>> +A virtio device that supports VIRTIO_ADMIN_DEVICE_MGMT and
>>>>> VIRTIO_ADMIN_DEVICE_MGMT_ATTRS admin commands (see
>>>>> +\ref{sec:Basic Facilities of a Virtio Device / Admin command
>>>>> set / VIRTIO ADMIN DEVICE MGMT command} and
>>>>> +\ref{sec:Basic Facilities of a Virtio Device / Admin command
>>>>> set / VIRTIO ADMIN DEVICE MGMT ATTRS command} for more
>>>>> information).
>>>>> +This device can manage a virtio managed device. A device group
>>>>> may contain zero or more management devices.
>>>>> +
>>>>> +A PCI SR-IOV Physical Function based virtio device is an
>>>>> example of a possible virtio management device (for type 1
>>>>> device group).
>>>>> +
>>>>> +\subsection{Virtio type 1 management
>>>>> device}\label{sec:Introduction / Terminology / Virtio type 1
>>>>> management device}
>>>>> +
>>>>> +A virtio management device for type 1 device group. This device
>>>>> is a PCI SR-IOV PF that can set \field{dst_type} to 1 (other
>>>>> virtio device in the same device group),
>>>>> +and set \field{vdev_id} to an id that corresponds with one of
>>>>> its managed virtio devices (PCI SR-IOV VFs) for the
>>>>> VIRTIO_ADMIN_DEVICE_MGMT admin command.
>>>>> +
>>>>> +A type 1 device group may contain zero or one management devices.
>>>>> +
>>>>> +\subsection{virtio managed device}\label{sec:Introduction /
>>>>> Terminology / Virtio managed device}
>>>>> +
>>>>> +A virtio device that can be managed by a virtio management device.
>>>>> +A device group may contain zero or more managed devices.
>>>>> +
>>>>> +A PCI SR-IOV Virtual Function based virtio device is an example
>>>>> of a possible virtio managed device (for type 1 group).
>>>>> +
>>>>> +\subsection{virtio type 1 managed
>>>>> device}\label{sec:Introduction / Terminology / Virtio type 1
>>>>> managed device}
>>>>> +
>>>>> +A virtio managed device for type 1 device group. This device is
>>>>> a PCI SR-IOV VF and is managed by a virtio type 1 management
>>>>> device (virtio PCI SR-IOV PF).
>>>>> +It is implied that all the virtio PCI SR-IOV VFs related to a
>>>>> virtio PCI SR-IOV PF that is virtio type 1 management device are
>>>>> type 1 managed devices.
>>>>> +
>>>>>    \section{Structure Specifications}\label{sec:Structure
>>>>> Specifications}
>>>>>      Many device and driver in-memory structure layouts are
>>>>> documented using
>>>>> -- 
>>>>> 2.21.0



* Re: [PATCH v5 7/7] RFC: add initial support for configuring feature bits
  2022-05-18 16:34       ` Michael S. Tsirkin
@ 2022-05-18 23:18         ` Max Gurtovoy
  0 siblings, 0 replies; 103+ messages in thread
From: Max Gurtovoy @ 2022-05-18 23:18 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: jasowang, virtio-comment, cohuck, virtio-dev, oren, parav,
	shahafs, aadam, virtio


On 5/18/2022 7:34 PM, Michael S. Tsirkin wrote:
> On Wed, May 18, 2022 at 06:31:18PM +0300, Max Gurtovoy wrote:
>> On 5/15/2022 5:38 PM, Michael S. Tsirkin wrote:
>>> On Wed, Apr 27, 2022 at 01:58:24AM +0300, Max Gurtovoy wrote:
>>>> After adding the concept of a management and a managed device, add
>>>> another example of using this concept to manage resources.
>>>>
>>>> Today there is no standard definition in the spec that allows user to
>>>> setup specific feature bits of a virtio device.
>>>>
>>>> For that, extend the management mechanism to allow management devices to
>>>> change feature bits of its managed devices.
>>>>
>>>> Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
>>> Please, add more explanation here. E.g. I am guessing these are
>>> host feature bits, right? How does driver know which features are
>>> ok to enable?
>>>
>>> I would expect some description sections and conformance sections.
>>>
>> This is an RFC to emphasize the need for the admin command set.
>>
>> I added this because we agreed to think on more features for showing the
>> motivation of this mechanism.
>>
>> I can work on MSI-X or feature bits as initial submission.
>>
>> Choose what you think will be easier to merge and let's keep working on
>> it.
>>
>> Working on both will cause distraction.
> OK.
>
> I personally think feature bits are easier, we don't need to also refer
> to the pci spec.

If you think it will accelerate the acceptance process then I can
give it a try.

Although I really don't understand the gap we have for MSI-X.

We just need to keep some rules, such as not allowing a VF to be bound
to a device driver before performing the configuration.

It's very easy and works today for mlx5.

I also hope someone will add it to the nvme driver since this
functionality is in the NVM spec.

Maybe I'll do it myself if I find some time soon.

Also, if we do the feature bits configuration, the pre-condition for it
to work will still be to make sure that the VF is not bound to a
device driver, same as for MSI-X.

>
>>>> ---
>>>>    admin.tex | 12 +++++++++---
>>>>    1 file changed, 9 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/admin.tex b/admin.tex
>>>> index 5b54743..43106ba 100644
>>>> --- a/admin.tex
>>>> +++ b/admin.tex
>>>> @@ -113,7 +113,9 @@ \subsection{VIRTIO ADMIN DEVICE CAPS IDENTIFY command}\label{sec:Basic Facilitie
>>>>            * Bit 0 - if set, the device is a management device
>>>>            * Bit 1 - if set, the device is a type 1 management device that supports
>>>>            *         MSI-X vector mgmt of its type 1 managed devices
>>>> -        * Bits 2 - 63 - reserved for future capabilities.
>>>> +        * Bit 2 - if set, the device is a type 1 management device that supports
>>>> +        *         feature mgmt of bits 0 to 63 for its type 1 managed devices
>>>> +        * Bits 3 - 63 - reserved for future capabilities.
>>>>            */
>>>>           le64 device_admin_caps;
>>>>           u8 reserved[112];
>>>> @@ -143,7 +145,9 @@ \subsection{VIRTIO ADMIN DEVICE CAPS ACCEPT command}\label{sec:Basic Facilities
>>>>            * Bit 0 - if set, the driver accepted the device as a management device
>>>>            * Bit 1 - if set, the driver accepted the device as a type 1 management device
>>>>            *         that supports MSI-X vector mgmt of its type 1 managed devices
>>>> -        * Bits 2 - 63 - reserved for future capabilities.
>>>> +        * Bit 2 - if set, the driver accepted the device as a type 1 management device
>>>> +        *         that supports feature mgmt of bits 0 to 63 for its type 1 managed devices
>>>> +        * Bits 3 - 63 - reserved for future capabilities.
>>>>            */
>>>>           le64 driver_admin_caps;
>>>>           u8 reserved[112];
>>>> @@ -167,12 +171,14 @@ \subsection{VIRTIO ADMIN DEVICE MGMT command}\label{sec:Basic Facilities of a Vi
>>>>            u8 operation;
>>>>            /*
>>>>             * 0 - MSI-X vector
>>>> -         * 1 - 65535 are reserved
>>>> +         * 1 - Device feature bits 0 to 63
>>>> +         * 2 - 65535 are reserved
>>>>             */
>>>>            le16 resource;
>>>>            /*
>>>>             * The value to the given resource:
>>>>             * if resource = 0 (MSI-X vector), it's a 1-based count.
>>>> +         * if resource = 1 (Device feature bits 0 to 63), it's a feature bitmap.
>>>>             */
>>>>            le64 resource_val;
>>>>            u8 reserved[5];
>>>> -- 
>>>> 2.21.0



* Re: [PATCH v5 3/7] Introduce new destination type for admin commands
  2022-05-18 14:34     ` Max Gurtovoy
@ 2022-05-18 23:55       ` Michael S. Tsirkin
  0 siblings, 0 replies; 103+ messages in thread
From: Michael S. Tsirkin @ 2022-05-18 23:55 UTC (permalink / raw)
  To: Max Gurtovoy
  Cc: jasowang, virtio-comment, cohuck, virtio-dev, oren, parav,
	shahafs, aadam, virtio

On Wed, May 18, 2022 at 05:34:33PM +0300, Max Gurtovoy wrote:
> 
> On 5/15/2022 6:09 PM, Michael S. Tsirkin wrote:
> > On Wed, Apr 27, 2022 at 01:58:20AM +0300, Max Gurtovoy wrote:
> > > Introduce a new mechanism to issue commands with dst_type field that is
> > > not "self". With the new mechanism, driver can set dst_type to 1
> > > and use the vdev_id common field to describe the designated vdev_id.
> > > 
> > > This mechanism is useful for device groups with multiple devices
> > > with various different capabilities. For example, a type 1 device group
> > > that contains a PCI PF and its VF. For this group, a clever system
> > > administrator can use admin commands to manipulate the PF/VF resources.
> > > 
> > > Reviewed-by: Parav Pandit <parav@nvidia.com>
> > > Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
> > > ---
> > >   admin.tex | 18 ++++++++++++++----
> > >   1 file changed, 14 insertions(+), 4 deletions(-)
> > > 
> > > diff --git a/admin.tex b/admin.tex
> > > index 6725daa..f816c3b 100644
> > > --- a/admin.tex
> > > +++ b/admin.tex
> > > @@ -11,11 +11,13 @@ \section{Administration command set}\label{sec:Basic Facilities of a Virtio Devi
> > >           le16 command;
> > >           /*
> > >            * 0 - self
> > > -         * 1 - 65535 are reserved
> > > +         * 1 - other virtio device (identified by vdev_id) in the same device group
> > > +         * 2 - 65535 are reserved
> > >            */
> > >           le16 dst_type;
> > > +        le64 vdev_id;
> > Alignment problems. Proposal:
> > 
> > vdev_id
> > dst_type
> > command
> 
> I don't understand what is the issue here. All sw-hw packets should be
> __packed__

No and most virtio ones are not.

> why should I change the order to be something that is not intuitive ?

Packed structs often cause the compiler to generate horrible code,
so we generally don't do this. In some cases we made a mistake
and are sticking to it for now. Don't copy that.

> > 
> > 
> > >           /* reserved for common cmd fields */
> > > -        u8 reserved[20];
> > > +        u8 reserved[12];
> > >           u8 command_specific_data[];
> > >           /* Device-writable part */
> > 
> > > @@ -39,9 +41,11 @@ \section{Administration command set}\label{sec:Basic Facilities of a Virtio Devi
> > >   \hline
> > >   03h   & VIRTIO_ADMIN_STATUS_INVALID_FIELD    & invalid field was set  \\
> > >   \hline
> > > +04h   & VIRTIO_ADMIN_STATUS_INVALID_VDEV_ID    & invalid vdev_id was set  \\
> > > +\hline
> > >   \end{tabular}
> > > -The \field{command}, \field{dst_type} and \field{command_specific_data} are
> > > +The \field{command}, \field{dst_type}, \field{vdev_id} and \field{command_specific_data} are
> > >   set by the driver, and the device sets the \field{status}, the
> > >   \field{command_specific_error} and the \field{command_specific_result},
> > >   if needed.
> > > @@ -50,9 +54,15 @@ \section{Administration command set}\label{sec:Basic Facilities of a Virtio Devi
> > >   The mandatory fields to be set by the driver, for all admin commands, are \field{command} and \field{dst_type}.
> > > +The optional unused fields to be zeroed by the driver.
> > > +
> > >   The \field{command} defines the opcode for the command. The value for each command can be found in each command section.
> > > -The \field{dst_type} defines the designated virtio device for the command. This value should be set to 0 (self).
> > > +The \field{dst_type} defines the designated virtio device for the command. This value can be set to 0 (self) or 1 (other virtio device in the same device
> > > +group) by the driver. Not all the commands allow setting \field{dst_type} to 1. Refer to each command description explicitly to check whether this operation is allowed.
> > > +If \field{dst_type} is set to 0 by the driver, the \field{vdev_id} isn't valid, should be zeroed by the driver and should be ignored by the device.
> > > +If \field{dst_type} is set to 1 by the driver, the \field{vdev_id} is valid and used to describe the vdev_id of the designated virtio device (see section
> > > +\ref{sec:Introduction / Terminology / Device group} for vdev_id numbering for type 1 device groups).
> > >   The \field{command_specific_error} should be inspected by the driver only if \field{status} is set to
> > >   VIRTIO_ADMIN_STATUS_CS_ERR by the device. In this case, the content of \field{command_specific_error}
> > 
> > > terminology is a bit inconsistent. would dst_id be better? it goes
> > > together with dst_type after all.
> 
> I don't think dst_id is better than vdev_id. Sorry.
> 
> vdev_id is a unique identifier of a device inside a device group.

- no consistency with dst_type
- v in "vdev" is pointless
- does not tell you which device it is. source sending command?
  destination to which it refers?

> > 
> > > -- 
> > > 2.21.0



* Re: [PATCH v5 4/7] Introduce virtio admin virtqueue
  2022-05-18 14:37     ` Max Gurtovoy
@ 2022-05-18 23:56       ` Michael S. Tsirkin
  0 siblings, 0 replies; 103+ messages in thread
From: Michael S. Tsirkin @ 2022-05-18 23:56 UTC (permalink / raw)
  To: Max Gurtovoy
  Cc: jasowang, virtio-comment, cohuck, virtio-dev, oren, parav,
	shahafs, aadam, virtio

On Wed, May 18, 2022 at 05:37:11PM +0300, Max Gurtovoy wrote:
> 
> On 5/15/2022 5:59 PM, Michael S. Tsirkin wrote:
> > On Wed, Apr 27, 2022 at 01:58:21AM +0300, Max Gurtovoy wrote:
> > > In one of the many use cases a user wants to manipulate features and
> > > configuration of the virtio devices regardless of the device type
> > > (net/block/console). For that the admin command set introduced. The
> > > admin virtqueue will be the first management interface to issue admin
> > > commands.
> > > 
> > > Currently virtio specification defines control virtqueue to manipulate
> > > features and configuration of the device it operates on. However,
> > > control virtqueue commands are device type specific, which makes it very
> > > difficult to extend for device agnostic commands.
> > > 
> > > To support this requirement in elegant way, this patch introduces a new
> > > admin virtqueue interface.
> > > 
> > > Manipulate features via admin virtqueue is asynchronous, scalable, easy
> > > to extend and doesn't require additional and expensive on-die resources
> > > to be allocated for every new feature that will be added in the future.
> > > 
> > > Reviewed-by: Parav Pandit <parav@nvidia.com>
> > > Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
> > > ---
> > >   admin.tex       | 17 +++++++++++++++++
> > >   conformance.tex |  1 +
> > >   content.tex     |  6 ++++--
> > >   3 files changed, 22 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/admin.tex b/admin.tex
> > > index f816c3b..d09683d 100644
> > > --- a/admin.tex
> > > +++ b/admin.tex
> > > @@ -131,3 +131,20 @@ \subsection{VIRTIO ADMIN DEVICE CAPS ACCEPT command}\label{sec:Basic Facilities
> > >          u8 reserved[112];
> > >   };
> > >   \end{lstlisting}
> > > +
> > > +\section{Admin Virtqueues}\label{sec:Basic Facilities of a Virtio Device / Admin Virtqueues}
> > > +
> > > +An admin virtqueue is a management interface of a device that can be used to send administrative
> > > +commands (see \ref{sec:Basic Facilities of a Virtio Device / Administration command set}) to manipulate
> > > +various features of the device and/or to manipulate various features, if possible, of another device.
> > > +
> > > +An admin virtqueue exists for a certain device if VIRTIO_F_ADMIN_VQ feature is
> > > +negotiated. The index of the admin virtqueue is exposed by the device in a
> > > +transport specific manner.
> > > +
> > > +If VIRTIO_F_ADMIN_VQ has been negotiated, the driver will use the admin virtqueue to send all admin commands.
> > > +
> > > +\devicenormative{\subsection}{Admin Virtqueues}{Basic Facilities of a Virtio Device / Admin Virtqueues}
> > > +A device that advertises VIRTIO_F_ADMIN_VQ capability MUST support all the mandatory admin commands.
> > 
> > don't see where these are defined.
> 
> In the command table (M - mandatory, O - optional)
> 

Add a link then.

> > 
> > > +
> > > +A device that advertises VIRTIO_F_ADMIN_VQ capability MAY support one or more optional admin commands.
> > > diff --git a/conformance.tex b/conformance.tex
> > > index 9807c30..3c7b7bc 100644
> > > --- a/conformance.tex
> > > +++ b/conformance.tex
> > > @@ -342,6 +342,7 @@ \section{Conformance Targets}\label{sec:Conformance / Conformance Targets}
> > >   \item \ref{devicenormative:Basic Facilities of a Virtio Device / Virtqueues / Available Buffer Notification Suppression}
> > >   \item \ref{devicenormative:Basic Facilities of a Virtio Device / Shared Memory Regions}
> > >   \item \ref{devicenormative:Reserved Feature Bits}
> > > +\item \ref{devicenormative:Basic Facilities of a Virtio Device / Admin Virtqueues}
> > >   \end{itemize}
> > >   \conformance{\subsection}{PCI Device Conformance}\label{sec:Conformance / Device Conformance / PCI Device Conformance}
> > > diff --git a/content.tex b/content.tex
> > > index 2e1df84..163cb34 100644
> > > --- a/content.tex
> > > +++ b/content.tex
> > > @@ -99,10 +99,10 @@ \section{Feature Bits}\label{sec:Basic Facilities of a Virtio Device / Feature B
> > >   \begin{description}
> > >   \item[0 to 23, and 50 to 127] Feature bits for the specific device type
> > > -\item[24 to 40] Feature bits reserved for extensions to the queue and
> > > +\item[24 to 41] Feature bits reserved for extensions to the queue and
> > >     feature negotiation mechanisms
> > > -\item[41 to 49, and 128 and above] Feature bits reserved for future extensions.
> > > +\item[42 to 49, and 128 and above] Feature bits reserved for future extensions.
> > >   \end{description}
> > >   \begin{note}
> > > @@ -6849,6 +6849,8 @@ \chapter{Reserved Feature Bits}\label{sec:Reserved Feature Bits}
> > >     that the driver can reset a queue individually.
> > >     See \ref{sec:Basic Facilities of a Virtio Device / Virtqueues / Virtqueue Reset}.
> > > +  \item[VIRTIO_F_ADMIN_VQ (41)] This feature indicates that an administration virtqueue is supported.
> > > +
> > >   \end{description}
> > >   \drivernormative{\section}{Reserved Feature Bits}{Reserved Feature Bits}
> > > -- 
> > > 2.21.0



* Re: [PATCH v5 5/7] Add miscellaneous configuration structure for PCI
  2022-05-18 14:42       ` Max Gurtovoy
@ 2022-05-18 23:58         ` Michael S. Tsirkin
  0 siblings, 0 replies; 103+ messages in thread
From: Michael S. Tsirkin @ 2022-05-18 23:58 UTC (permalink / raw)
  To: Max Gurtovoy
  Cc: Cornelia Huck, jasowang, virtio-comment, virtio-dev, oren, parav,
	shahafs, aadam, virtio

On Wed, May 18, 2022 at 05:42:05PM +0300, Max Gurtovoy wrote:
> 
> On 5/17/2022 1:12 PM, Cornelia Huck wrote:
> > On Sun, May 15 2022, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > 
> > > On Wed, Apr 27, 2022 at 01:58:22AM +0300, Max Gurtovoy wrote:
> > > > This new structure will be used for adding new miscellaneous registers
> > > > for a virtio device configuration layout.
> > > > 
> > > > For now, only admin_queue_index register is added. Admin virtqueue index
> > > > does not depend on the device type. Hence, add a PCI capability to read
> > > > the admin virtqueue index.
> > > > 
> > > > Reviewed-by: Parav Pandit <parav@nvidia.com>
> > > > Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
> > > 
> > > I guess we discussed this but I forgot. Why do we have this new
> > > structure as opposed to just adding the value at the end of config
> > > structure? I was kind of hoping that the structure can be
> > > reused for CCW/MMIO and then we can add more use-cases with
> > > new transport and device independent structures.
> > > 
> > > If we keep it transport specific I don't really understand why
> > > is it useful ...
> > Nod, just define a common misc_configuration struct and have the
> > individual transports access it in the way it works best for them.
> 
> can we agree on it ?
> 
> I think we got 3 acks on this in the past, are we opening this topic again ?

Are you going to answer the question though?
I think the assumption was it will be reused for CCW/MMIO but now we see
it's useless for that...

-- 
MST



* RE: [virtio-comment] RE: [PATCH v5 2/7] Introduce admin command set
  2022-05-17 11:48       ` Michael S. Tsirkin
  2022-05-18 14:09         ` Max Gurtovoy
@ 2022-05-31 20:39         ` Parav Pandit
  2022-06-20  9:23           ` Michael S. Tsirkin
  2022-06-20  9:59           ` Michael S. Tsirkin
  1 sibling, 2 replies; 103+ messages in thread
From: Parav Pandit @ 2022-05-31 20:39 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Max Gurtovoy, jasowang, virtio-comment, cohuck, virtio-dev,
	Oren Duer, Shahaf Shuler, aadam, virtio


> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Tuesday, May 17, 2022 7:48 AM
> 
> On Mon, May 16, 2022 at 09:08:34PM +0000, Parav Pandit wrote:
> > Hi Michael,
> >
> > > From: Michael S. Tsirkin <mst@redhat.com>
> > > Sent: Sunday, May 15, 2022 11:24 AM
> >
> > [..]
> >
> > > > +\subsection{VIRTIO ADMIN DEVICE CAPS ACCEPT
> > > command}\label{sec:Basic
> > > > +Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN
> > > > +DEVICE CAPS ACCEPT command}
> > > > +
> > > > +The VIRTIO_ADMIN_DEVICE_CAPS_ACCEPT command is used by the
> > > driver to acknowledge those admin capabilities it understands and
> > > wishes to use.
> > >
> > >
> > > ok so we have a protocol here, kind of like feature negotiation.
> > > Please write its description.
> > > e.g. is it ok to change accepted caps? when? can device change its
> > > caps etc etc etc.
> > >
> > > Avoiding this kind of spec work is exactly why me and jason keep
> > > telling you to consider just using features instead. Add a 64 bit
> > > admin features field to the PCI transport and be done with it. CCW
> > > and MMIO already have feature selector so it's trivial to add feature bits.
> > >
> > As we begin to scale with the device, adding more and more registers like
> this demands more on-device real estate to comply to the PCI standards.
> >
> > And therefore, things are queried/accessed rare or occasionally, are better
> accessed via a queue interface.
> >
> > One can argue that admin VQ is proposed only for the mgmt. functions so
> having this cfg register for PF is enough.
> >
> > However, AQ may find some usage in the VF/SF themselves down the
> road.
> > Hence, keeping the cap exchange transport this way is more optimal.
> >
> > Max has called out this AQ rationale in 4 or 5 points in the cover letter.
> 
> Hmm. It's kind of a generic claim though. 
I am sorry for my very late response.

Not sure what you mean by generic claim.
If you mean generic for any PCI device, then yes, at scale such a large number of registers doesn't work well.

> We never put devices on a diet
> trying to conserve registers. 
> There is cost associated with this dance and that
> is driver boot time.
How much additional boot time do you anticipate from adding one more queue?
We don't see downtime increasing beyond the < 2 msec range.

And the AQ is not used anyway until the actual AQ work is done.
With Q_RESET in place, the driver won't even create the AQ during boot time.

> 
> I also don't really understand how you can claim you need to save memory
> like this and at the same time blindly add a more or less "just in case" misc
> config in the config space.
> So, not pretty.
Adding to config space was really your suggestion. :)

There are multiple points described in the cover letter that I prefer not to repeat here. It was not only memory.

Also, capabilities and the fields around them are expected to grow more than q_index.
So it is not the right comparison.

> 
> And as I said, you will need much more spec work to reach the level to which
> features are specified - and note we are not yet happy with how features are
Can you be specific about the work that you are expecting in this v5 version?

> specified either! So it's a moving target.
> 
> Maybe put this in features for now, and leave the whole capability thing for
> another day?
> 
> --
> MST

It appears to be a narrow view to do temporary work on putting this in feature bits, which doesn't scale.



* Re: [virtio-comment] Re: [PATCH v5 1/7] Introduce device group
  2022-05-18 13:32       ` Cornelia Huck
@ 2022-06-01 13:43         ` Max Gurtovoy
  2022-06-02  2:21           ` Jason Wang
  0 siblings, 1 reply; 103+ messages in thread
From: Max Gurtovoy @ 2022-06-01 13:43 UTC (permalink / raw)
  To: Cornelia Huck, Michael S. Tsirkin, Jason Wang
  Cc: virtio-comment, virtio-dev, oren, parav, shahafs, aadam, virtio


On 5/18/2022 4:32 PM, Cornelia Huck wrote:
> On Wed, May 18 2022, Max Gurtovoy <mgurtovoy@nvidia.com> wrote:
>
>> Hi MST,
>>
>> On 5/15/2022 6:25 PM, Michael S. Tsirkin wrote:
>>> On Wed, Apr 27, 2022 at 01:58:18AM +0300, Max Gurtovoy wrote:
>>>> +\subsection{Device group}\label{sec:Introduction / Terminology / Device group}
>>>> +
>>>> +A device group includes one or more virtio devices.
>>>> +Each virtio device has a unique virtio device id (vdev_id) within a device group. A valid vdev_id is a 64-bit field in the range of
>>>> +0x0 - 0xFFFFFFFFFFFFFFF0. Vdev_id 0xFFFFFFFFFFFFFFFF is a value that refers to all devices in a device group and isn't a valid vdev_id.
>>>> +
>>>> +For now, the supported device groups are:
>>>> +\begin{enumerate}
>>>> +\item Type 1 - A virtio PCI SR-IOV physical function (PF) and its PCI SR-IOV virtual functions (VFs). For this group type, the PF device has vdev_id that is equal to 0
>>>> +and the VF devices have vdev_id's that are equal to their vf_number (according to the PCI SR-IOV specification).
>>>> +\end{enumerate}
>>>> +
>>>>    \section{Structure Specifications}\label{sec:Structure Specifications}
>>> In context of virtualization type 1 already refers to a specific type
>>> of hypervisor.
>>>
>>> I suggest simply "SR-IOV type" - this way users do not need to remember
>>> special terminology.
>> This is a 12-line commit with a simple definition.
>>
>> I didn't mention hypervisors here.
>>
>> I will stick to your suggestion and use a name instead of numbers
>> (although I don't understand how a user who knows how to read the spec
>> would be confused here), but I would like Jason and Cornelia to ack
>> this during this review cycle.
>>
>> When we'll get 3 acks on this name - I'll update it for v6.
> So, do you want to imply some kind of numbering? I don't like "Type 1",
> either. If the type needs to be referenced in code, it should have a
> #define or such; otherwise, "SR-IOV type" would be fine.

ok I'll change it to be:

diff --git a/introduction.tex b/introduction.tex
index aa9ec1b..bba70a6 100644
--- a/introduction.tex
+++ b/introduction.tex
@@ -156,6 +156,18 @@ \subsection{Transition from earlier specification 
drafts}\label{sec:Transition f
  sections tagged "Legacy Interface" in the section title.
  These highlight the changes made since the earlier drafts.

+\subsection{Device group}\label{sec:Introduction / Terminology / Device 
group}
+
+A device group includes one or more virtio devices.
+Each virtio device has a unique virtio device id (vdev_id) within a 
device group. A valid vdev_id is a 64-bit field in the range of
+0x0 - 0xFFFFFFFFFFFFFFF0. Vdev_id 0xFFFFFFFFFFFFFFFF is a value that 
refers to all devices in a device group and isn't a valid vdev_id.
+
+For now, the supported device groups are:
+\begin{enumerate}
+\item SR-IOV type - A virtio PCI SR-IOV physical function (PF) and its 
PCI SR-IOV virtual functions (VFs). For this group type, the PF device 
has vdev_id that is equal to 0
+and the VF devices have vdev_id's that are equal to their vf_number 
(according to the PCI SR-IOV specification).
+\end{enumerate}
+
  \section{Structure Specifications}\label{sec:Structure Specifications}

MST/Jason/Cornelia,

Can you add Reviewed-by tags if the above is agreed?



>
>
> This publicly archived list offers a means to provide input to the
> OASIS Virtual I/O Device (VIRTIO) TC.
>
> In order to verify user consent to the Feedback License terms and
> to minimize spam in the list archive, subscription is required
> before posting.
>
> Subscribe: virtio-comment-subscribe@lists.oasis-open.org
> Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
> List help: virtio-comment-help@lists.oasis-open.org
> List archive: https://lists.oasis-open.org/archives/virtio-comment/
> Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
> List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
> Committee: https://www.oasis-open.org/committees/virtio/
> Join OASIS: https://www.oasis-open.org/join/
>



* Re: [PATCH v5 5/7] Add miscellaneous configuration structure for PCI
  2022-05-15 14:49   ` Michael S. Tsirkin
@ 2022-06-01 14:46     ` Max Gurtovoy
  0 siblings, 0 replies; 103+ messages in thread
From: Max Gurtovoy @ 2022-06-01 14:46 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: jasowang, virtio-comment, cohuck, virtio-dev, oren, parav,
	shahafs, aadam, virtio


On 5/15/2022 5:49 PM, Michael S. Tsirkin wrote:
> On Wed, Apr 27, 2022 at 01:58:22AM +0300, Max Gurtovoy wrote:
>> This new structure will be used for adding new miscellaneous registers
>> for a virtio device configuration layout.
>>
>> For now, only admin_queue_index register is added. Admin virtqueue index
>> does not depend on the device type. Hence, add a PCI capability to read
>> the admin virtqueue index.
>>
>> Reviewed-by: Parav Pandit <parav@nvidia.com>
>> Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
>> ---
>>   conformance.tex |  2 ++
>>   content.tex     | 29 +++++++++++++++++++++++++++++
>>   2 files changed, 31 insertions(+)
>>
>> diff --git a/conformance.tex b/conformance.tex
>> index 3c7b7bc..c183581 100644
>> --- a/conformance.tex
>> +++ b/conformance.tex
>> @@ -103,6 +103,7 @@ \section{Conformance Targets}\label{sec:Conformance / Conformance Targets}
>>   \item \ref{drivernormative:Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / PCI configuration access capability}
>>   \item \ref{drivernormative:Virtio Transport Options / Virtio Over PCI Bus / PCI-specific Initialization And Device Operation / Device Initialization / MSI-X Vector Configuration}
>>   \item \ref{drivernormative:Virtio Transport Options / Virtio Over PCI Bus / PCI-specific Initialization And Device Operation / Notification of Device Configuration Changes}
>> +\item \ref{drivernormative:Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Miscellaneous configuration structure layout}
>>   \end{itemize}
>>   
>>   \conformance{\subsection}{MMIO Driver Conformance}\label{sec:Conformance / Driver Conformance / MMIO Driver Conformance}
>> @@ -364,6 +365,7 @@ \section{Conformance Targets}\label{sec:Conformance / Conformance Targets}
>>   \item \ref{devicenormative:Virtio Transport Options / Virtio Over PCI Bus / PCI-specific Initialization And Device Operation / Device Initialization / MSI-X Vector Configuration}
>>   \item \ref{devicenormative:Virtio Transport Options / Virtio Over PCI Bus / PCI-specific Initialization And Device Operation / Used Buffer Notifications}
>>   \item \ref{devicenormative:Virtio Transport Options / Virtio Over PCI Bus / PCI-specific Initialization And Device Operation / Notification of Device Configuration Changes}
>> +\item \ref{devicenormative:Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Miscellaneous configuration structure layout}
>>   \end{itemize}
>>   
>>   \conformance{\subsection}{MMIO Device Conformance}\label{sec:Conformance / Device Conformance / MMIO Device Conformance}
>> diff --git a/content.tex b/content.tex
>> index 163cb34..0c1d44f 100644
>> --- a/content.tex
>> +++ b/content.tex
>> @@ -712,6 +712,7 @@ \subsection{Virtio Structure PCI Capabilities}\label{sec:Virtio Transport Option
>>   \item ISR Status
>>   \item Device-specific configuration (optional)
>>   \item PCI configuration access
>> +\item Miscellaneous configuration
>>   \end{itemize}
>>   
>>   Each structure can be mapped by a Base Address register (BAR) belonging to
>> @@ -771,6 +772,8 @@ \subsection{Virtio Structure PCI Capabilities}\label{sec:Virtio Transport Option
>>   #define VIRTIO_PCI_CAP_SHARED_MEMORY_CFG 8
>>   /* Vendor-specific data */
>>   #define VIRTIO_PCI_CAP_VENDOR_CFG        9
>> +/* Miscellaneous configuration */
>> +#define VIRTIO_PCI_CAP_MISC_CFG          10
>>   \end{lstlisting}
>>   
>>           Any other value is reserved for future use.
>> @@ -1352,6 +1355,32 @@ \subsubsection{PCI configuration access capability}\label{sec:Virtio Transport O
>>   specified by some other Virtio Structure PCI Capability
>>   of type other than \field{VIRTIO_PCI_CAP_PCI_CFG}.
>>   
>> +\subsubsection{Miscellaneous configuration structure layout}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Miscellaneous configuration structure layout}
>> +
>> +The miscellaneous configuration structure is found at the bar and offset within the VIRTIO_PCI_CAP_MISC_CFG capability.
> not very clear what is within what.
>
> Simplify the sentence:
>
> The VIRTIO_PCI_CAP_MISC_CFG specifies the location of the miscellaneous
> configuration structure.
>
>> +Its layout is below.
> whose layout? And try to avoid using "below".
> Better:
>
> The miscellaneous configuration structure has the following layout
>
>> +\begin{lstlisting}
>> +struct virtio_pci_misc_cfg {
>> +        le16 admin_queue_index;         /* read-only for driver */
>> +};
>> +\end{lstlisting}
>> +
>> +\begin{description}
>> +\item[\field{admin_queue_index}]
>> +        The device uses this to report the index of the admin virtqueue.
>> +        This field is valid only if VIRTIO_F_ADMIN_VQ is set.
>> +\end{description}
>> +
>> +\devicenormative{\paragraph}{Miscellaneous configuration structure layout}{Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Miscellaneous configuration structure layout}
>> +The device MUST present VIRTIO_PCI_CAP_MISC_CFG capability when VIRTIO_F_ADMIN_VQ is set.
>> +
>> +The device MUST present a valid \field{admin_queue_index} when VIRTIO_F_ADMIN_VQ is set.
>> +
> set meaning "offered"?
>
>> +\drivernormative{\paragraph}{Miscellaneous configuration structure layout}{Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Miscellaneous configuration structure layout}
>> +The driver MUST NOT proceed with configuring the admin virtqueue in case VIRTIO_F_ADMIN_VQ is set and VIRTIO_PCI_CAP_MISC_CFG capability is not present.
> "set" is vague. do you mean "negotiated"?
>
> also then let's mandate not negotiating VIRTIO_F_ADMIN_VQ.
>
> also if you are addressing this, address the reverse case
> what to do without VIRTIO_F_ADMIN_VQ but with the capability.

Is the text below good enough?

"

\devicenormative{\paragraph}{Miscellaneous configuration structure 
layout}{Virtio Transport Options / Virtio Over PCI Bus / PCI Device 
Layout / Miscellaneous configuration structure layout}
The device MUST present the VIRTIO_PCI_CAP_MISC_CFG capability when
VIRTIO_F_ADMIN_VQ is offered.

The device MUST present a valid \field{admin_queue_index} when 
VIRTIO_F_ADMIN_VQ is offered.

\drivernormative{\paragraph}{Miscellaneous configuration structure 
layout}{Virtio Transport Options / Virtio Over PCI Bus / PCI Device 
Layout / Miscellaneous configuration structure layout}
The driver MUST NOT negotiate VIRTIO_F_ADMIN_VQ if the
VIRTIO_PCI_CAP_MISC_CFG capability is not present.

The driver MAY proceed with other configuration steps if
VIRTIO_F_ADMIN_VQ is not offered and VIRTIO_PCI_CAP_MISC_CFG is present.

The driver MUST use the value of \field{admin_queue_index} to configure
the admin virtqueue if VIRTIO_F_ADMIN_VQ is negotiated and
VIRTIO_PCI_CAP_MISC_CFG is present.
For more details on virtqueue configuration see section \ref{sec:Virtio 
Transport Options / Virtio Over PCI Bus / PCI-specific Initialization 
And Device Operation / Device Initialization / Virtqueue Configuration}.

"
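
The driver-side rules proposed above can be sketched in C roughly as follows. This is only an illustration, not spec text: struct probe_state and both helper names are hypothetical, and the sketch assumes the le16 admin_queue_index value has already been converted to CPU byte order.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical snapshot of what the driver found while probing the device. */
struct probe_state {
    bool admin_vq_offered;      /* device offers VIRTIO_F_ADMIN_VQ */
    bool misc_cfg_cap_present;  /* VIRTIO_PCI_CAP_MISC_CFG capability found */
    uint16_t admin_queue_index; /* read from struct virtio_pci_misc_cfg */
};

/* The driver may accept the feature only if it is offered and the
 * miscellaneous configuration capability is present. */
static bool may_negotiate_admin_vq(const struct probe_state *s)
{
    assert(s != NULL);
    return s->admin_vq_offered && s->misc_cfg_cap_present;
}

/* Stores the admin virtqueue index and returns true only when the feature
 * was negotiated and the capability is present; otherwise the admin
 * virtqueue is not configured and *index is left untouched. */
static bool admin_vq_index(const struct probe_state *s, bool negotiated,
                           uint16_t *index)
{
    assert(s != NULL && index != NULL);
    if (!negotiated || !s->misc_cfg_cap_present)
        return false;
    *index = s->admin_queue_index;
    return true;
}
```

The reverse case (capability present, feature not offered) falls out naturally: may_negotiate_admin_vq() returns false and the driver simply continues with the rest of its initialization.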


>
>
>> +
>> +The driver MUST use the value of \field{admin_queue_index} to configure the admin virtqueue. For more details on virtqueue configuration see section \ref{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI-specific Initialization And Device Operation / Device Initialization / Virtqueue Configuration}.
>> +
>>   \subsubsection{Legacy Interfaces: A Note on PCI Device Layout}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Legacy Interfaces: A Note on PCI Device Layout}
>>   
>>   Transitional devices MUST present part of configuration
>> -- 
>> 2.21.0



* Re: [virtio-comment] Re: [PATCH v5 1/7] Introduce device group
  2022-06-01 13:43         ` Max Gurtovoy
@ 2022-06-02  2:21           ` Jason Wang
  2022-06-02  6:59             ` Michael S. Tsirkin
  0 siblings, 1 reply; 103+ messages in thread
From: Jason Wang @ 2022-06-02  2:21 UTC (permalink / raw)
  To: Max Gurtovoy
  Cc: Cornelia Huck, Michael S. Tsirkin, virtio-comment, Virtio-Dev,
	Oren Duer, Parav Pandit, Shahaf Shuler, Ariel Adam, virtio

On Wed, Jun 1, 2022 at 9:44 PM Max Gurtovoy <mgurtovoy@nvidia.com> wrote:
>
>
> On 5/18/2022 4:32 PM, Cornelia Huck wrote:
> > On Wed, May 18 2022, Max Gurtovoy <mgurtovoy@nvidia.com> wrote:
> >
> >> Hi MST,
> >>
> >> On 5/15/2022 6:25 PM, Michael S. Tsirkin wrote:
> >>> On Wed, Apr 27, 2022 at 01:58:18AM +0300, Max Gurtovoy wrote:
> >>>> +\subsection{Device group}\label{sec:Introduction / Terminology / Device group}
> >>>> +
> >>>> +A device group includes one or more virtio devices.
> >>>> +Each virtio device has a unique virtio device id (vdev_id) within a device group. A valid vdev_id is a 64-bit field in the range of
> >>>> +0x0 - 0xFFFFFFFFFFFFFFF0. Vdev_id 0xFFFFFFFFFFFFFFFF is a value that refers to all devices in a device group and isn't a valid vdev_id.
> >>>> +
> >>>> +For now, the supported device groups are:
> >>>> +\begin{enumerate}
> >>>> +\item Type 1 - A virtio PCI SR-IOV physical function (PF) and its PCI SR-IOV virtual functions (VFs). For this group type, the PF device has vdev_id that is equal to 0
> >>>> +and the VF devices have vdev_id's that are equal to their vf_number (according to the PCI SR-IOV specification).
> >>>> +\end{enumerate}
> >>>> +
> >>>>    \section{Structure Specifications}\label{sec:Structure Specifications}
> >>> In context of virtualization type 1 already refers to a specific type
> >>> of hypervisor.
> >>>
> >>> I suggest simply "SR-IOV type" - this way users do not need to remember
> >>> special terminology.
> >> This is a 12-line addition commit with a simple definition.
> >>
> >> I didn't mention hypervisors here.
> >>
> >> I will stick to your suggestion and use a name instead of numbers
> >> (although I don't understand how a user who knows how to read the spec
> >> would be confused here), but I would like Jason and Cornelia to ack
> >> this during this review cycle.
> >>
> >> When we get 3 acks on this name, I'll update it for v6.
> > So, do you want to imply some kind of numbering? I don't like "Type 1",
> > either. If the type needs to be referenced in code, it should have a
> > #define or such; otherwise, "SR-IOV type" would be fine.
>
> ok I'll change it to be:
>
> diff --git a/introduction.tex b/introduction.tex
> index aa9ec1b..bba70a6 100644
> --- a/introduction.tex
> +++ b/introduction.tex
> @@ -156,6 +156,18 @@ \subsection{Transition from earlier specification
> drafts}\label{sec:Transition f
>   sections tagged "Legacy Interface" in the section title.
>   These highlight the changes made since the earlier drafts.
>
> +\subsection{Device group}\label{sec:Introduction / Terminology / Device
> group}
> +
> +A device group includes one or more virtio devices.
> +Each virtio device has a unique virtio device id (vdev_id) within a
> device group. A valid vdev_id is a 64-bit field in the range of
> +0x0 - 0xFFFFFFFFFFFFFFF0. Vdev_id 0xFFFFFFFFFFFFFFFF is a value that
> refers to all devices in a device group and isn't a valid vdev_id.
> +
> +For now, the supported device groups are:
> +\begin{enumerate}
> +\item SR-IOV type - A virtio PCI SR-IOV physical function (PF) and its
> PCI SR-IOV virtual functions (VFs). For this group type, the PF device
> has vdev_id that is equal to 0
> +and the VF devices have vdev_id's that are equal to their vf_number
> (according to the PCI SR-IOV specification).
> +\end{enumerate}
> +
>   \section{Structure Specifications}\label{sec:Structure Specifications}
>
> MST/Jason/Cornelia,
>
> can you add some Reviewed-By signatures if the above is agreed ?

If I understand this correctly, the idea of "device group" is to allow
different groups to be managed by a single admin virtqueue?

And I feel that mixing transport specific definitions in the general
admin virtqueue might not be optimal. So I wonder whether it's better
to just say this is a transport specific type. And define it in the
PCI transport part.

Thanks




* Re: [virtio-comment] Re: [PATCH v5 1/7] Introduce device group
  2022-06-02  2:21           ` Jason Wang
@ 2022-06-02  6:59             ` Michael S. Tsirkin
  2022-06-27 21:52               ` Max Gurtovoy
  0 siblings, 1 reply; 103+ messages in thread
From: Michael S. Tsirkin @ 2022-06-02  6:59 UTC (permalink / raw)
  To: Jason Wang
  Cc: Max Gurtovoy, Cornelia Huck, virtio-comment, Virtio-Dev,
	Oren Duer, Parav Pandit, Shahaf Shuler, Ariel Adam, virtio

On Thu, Jun 02, 2022 at 10:21:23AM +0800, Jason Wang wrote:
> On Wed, Jun 1, 2022 at 9:44 PM Max Gurtovoy <mgurtovoy@nvidia.com> wrote:
> >
> >
> > On 5/18/2022 4:32 PM, Cornelia Huck wrote:
> > > On Wed, May 18 2022, Max Gurtovoy <mgurtovoy@nvidia.com> wrote:
> > >
> > >> Hi MST,
> > >>
> > >> On 5/15/2022 6:25 PM, Michael S. Tsirkin wrote:
> > >>> On Wed, Apr 27, 2022 at 01:58:18AM +0300, Max Gurtovoy wrote:
> > >>>> +\subsection{Device group}\label{sec:Introduction / Terminology / Device group}
> > >>>> +
> > >>>> +A device group includes one or more virtio devices.
> > >>>> +Each virtio device has a unique virtio device id (vdev_id) within a device group. A valid vdev_id is a 64-bit field in the range of
> > >>>> +0x0 - 0xFFFFFFFFFFFFFFF0. Vdev_id 0xFFFFFFFFFFFFFFFF is a value that refers to all devices in a device group and isn't a valid vdev_id.
> > >>>> +
> > >>>> +For now, the supported device groups are:
> > >>>> +\begin{enumerate}
> > >>>> +\item Type 1 - A virtio PCI SR-IOV physical function (PF) and its PCI SR-IOV virtual functions (VFs). For this group type, the PF device has vdev_id that is equal to 0
> > >>>> +and the VF devices have vdev_id's that are equal to their vf_number (according to the PCI SR-IOV specification).
> > >>>> +\end{enumerate}
> > >>>> +
> > >>>>    \section{Structure Specifications}\label{sec:Structure Specifications}
> > >>> In context of virtualization type 1 already refers to a specific type
> > >>> of hypervisor.
> > >>>
> > >>> I suggest simply "SR-IOV type" - this way users do not need to remember
> > >>> special terminology.
> > >> This is a 12-line addition commit with a simple definition.
> > >>
> > >> I didn't mention hypervisors here.
> > >>
> > >> I will stick to your suggestion and use a name instead of numbers
> > >> (although I don't understand how a user who knows how to read the spec
> > >> would be confused here), but I would like Jason and Cornelia to ack
> > >> this during this review cycle.
> > >>
> > >> When we get 3 acks on this name, I'll update it for v6.
> > > So, do you want to imply some kind of numbering? I don't like "Type 1",
> > > either. If the type needs to be referenced in code, it should have a
> > > #define or such; otherwise, "SR-IOV type" would be fine.
> >
> > ok I'll change it to be:
> >
> > diff --git a/introduction.tex b/introduction.tex
> > index aa9ec1b..bba70a6 100644
> > --- a/introduction.tex
> > +++ b/introduction.tex
> > @@ -156,6 +156,18 @@ \subsection{Transition from earlier specification
> > drafts}\label{sec:Transition f
> >   sections tagged "Legacy Interface" in the section title.
> >   These highlight the changes made since the earlier drafts.
> >
> > +\subsection{Device group}\label{sec:Introduction / Terminology / Device
> > group}
> > +
> > +A device group includes one or more virtio devices.
> > +Each virtio device has a unique virtio device id (vdev_id) within a
> > device group. A valid vdev_id is a 64-bit field in the range of
> > +0x0 - 0xFFFFFFFFFFFFFFF0. Vdev_id 0xFFFFFFFFFFFFFFFF is a value that
> > refers to all devices in a device group and isn't a valid vdev_id.

BTW I don't really like inventing terms with underscores.
Let's stick to English if we can. By the way, "id" is not a word either
;) We have a couple of instances which we really should fix.
And "virtio device id" is confusingly similar to
virtio "device id", where device id is a term from PCI.
And abbreviations really should be capitalized or end with a comma,
but are really best just avoided.

How about group member identifier? So

Each virtio device within a group has a unique member identifier.



> > +
> > +For now, the supported device groups are:
> > +\begin{enumerate}
> > +\item SR-IOV type - A virtio PCI SR-IOV physical function (PF) and its
> > PCI SR-IOV virtual functions (VFs). For this group type, the PF device
> > has vdev_id that is equal to 0
> > +and the VF devices have vdev_id's that are equal to their vf_number
> > (according to the PCI SR-IOV specification).

A bit better just from grammar perspective:

 \item SR-IOV type - the group includes a virtio PCI SR-IOV physical function (PF) and
 all its virtual functions (VFs). For this group type, the PF device
 has vdev_id of 0;
 each VF has a vdev_id matching its VF number [link to SR-IOV spec].

but note ideas on terminology above.
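
To make the numbering rule concrete, here is a small C sketch of the identifier mapping for the SR-IOV group type. The macro and function names are hypothetical (the patch defines none); the range limits are the ones quoted above, and the sketch assumes VF numbers start at 1 as in the SR-IOV specification, so they never collide with the PF's 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; the values come from the quoted patch text. */
#define VDEV_ID_PF   0x0ULL                 /* the PF itself */
#define VDEV_ID_MAX  0xFFFFFFFFFFFFFFF0ULL  /* highest valid vdev_id */
#define VDEV_ID_ALL  0xFFFFFFFFFFFFFFFFULL  /* all devices in the group */

/* A vdev_id is valid when it lies in the range 0x0 - 0xFFFFFFFFFFFFFFF0.
 * VDEV_ID_ALL is a wildcard, not a valid identifier. */
static bool vdev_id_is_valid(uint64_t id)
{
    return id <= VDEV_ID_MAX;
}

/* For the SR-IOV group type, a VF's identifier equals its VF number. */
static uint64_t vf_to_vdev_id(uint16_t vf_number)
{
    assert(vf_number >= 1); /* assumption: VF numbering is 1-based */
    return vf_number;
}
```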

> > +\end{enumerate}
> > +
> >   \section{Structure Specifications}\label{sec:Structure Specifications}
> >
> > MST/Jason/Cornelia,
> >
> > can you add some Reviewed-By signatures if the above is agreed ?
> 
> If I understand this correctly, the idea of "device group" is to allow
> different groups to be managed by a single admin virtqueue?
> 
> And I feel that mixing transport specific definitions in the general
> admin virtqueue might not be optimal. So I wonder whether it's better
> to just say this is a transport specific type. And define it in the
> PCI transport part.
> 
> Thanks

Well, it's a single paragraph here; I think it's OK for now, for completeness'
sake, rather than having the reader chase references just to figure out an
example.

But I do agree this sentence about the SR-IOV type has to be repeated
in the PCI transport section for completeness of that one.






* Re: [virtio-comment] RE: [PATCH v5 2/7] Introduce admin command set
  2022-05-31 20:39         ` Parav Pandit
@ 2022-06-20  9:23           ` Michael S. Tsirkin
  2022-06-20  9:49             ` Michael S. Tsirkin
  2022-06-20  9:59           ` Michael S. Tsirkin
  1 sibling, 1 reply; 103+ messages in thread
From: Michael S. Tsirkin @ 2022-06-20  9:23 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Max Gurtovoy, jasowang, virtio-comment, cohuck, virtio-dev,
	Oren Duer, Shahaf Shuler, aadam, virtio

With the 1.2 release nearing completion I went back to see what's going on
with this work, and it looks like I forgot to reply to this question.
Sorry.


On Tue, May 31, 2022 at 08:39:24PM +0000, Parav Pandit wrote:
> > And as I said, you will need much more spec work to reach the level to which
> > features are specified - and note we are not yet happy with how features are
> Can you be specific of the work that you are expecting in this v5 version?

I think we need a way for the device to be notified about the capabilities
of the driver. The current query only works the other way.

I would like us to add another group of feature bits intended
for management purposes. Maybe just 32 bits, or maybe "extended
features" with a selector and length like MMIO and CCW already do,
up to you.

This will replace the functionality of the query capabilities command and
have the advantage of also notifying the device about the capabilities of the
driver.
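
As a rough illustration of the selector idea, here is a C sketch of reading a wide feature set through a 32-bit window, loosely modelled on the selector/value register pair that virtio-mmio uses for device features. Everything here is an assumption for discussion: struct feature_window, the two-word limit, and the function names are hypothetical, and the "device" is simulated in memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simulated register pair: the device holds the full set of
 * management feature bits and exposes one 32-bit word at a time. */
struct feature_window {
    uint64_t bits; /* full feature set held by the (simulated) device */
    uint32_t sel;  /* selects which 32-bit word is currently visible */
};

/* Read the word chosen by the selector; words the device does not
 * implement read as zero. */
static uint32_t window_read(const struct feature_window *w)
{
    assert(w != NULL);
    return w->sel < 2 ? (uint32_t)(w->bits >> (32 * w->sel)) : 0;
}

/* Assemble the complete 64-bit set by stepping the selector. */
static uint64_t read_mgmt_features(struct feature_window *w)
{
    uint64_t all = 0;

    assert(w != NULL);
    for (uint32_t word = 0; word < 2; word++) {
        w->sel = word;
        all |= (uint64_t)window_read(w) << (32 * word);
    }
    return all;
}
```

A matching driver-to-device window, written the same way, is what would let the device learn which bits the driver accepted.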


-- 
MST



* Re: [PATCH v5 6/7] Introduce MGMT admin commands
  2022-05-18 15:03     ` Max Gurtovoy
@ 2022-06-20  9:45       ` Michael S. Tsirkin
  0 siblings, 0 replies; 103+ messages in thread
From: Michael S. Tsirkin @ 2022-06-20  9:45 UTC (permalink / raw)
  To: Max Gurtovoy
  Cc: jasowang, virtio-comment, cohuck, virtio-dev, oren, parav,
	shahafs, aadam, virtio

On Wed, May 18, 2022 at 06:03:42PM +0300, Max Gurtovoy wrote:
> > > +\section{Device management}\label{sec:Basic Facilities of a Virtio Device / Device management}
> > > +
> > > +A device group might consist of one or more virtio devices. For example, virtio PCI SR-IOV PF and its VFs compose a type 1 device group.
> > > +A capable PCI SR-IOV PF virtio device might act as the management device in this group, and its PCI SR-IOV VFs are the managed devices.
> > > +A management device might have various management capabilities and attributes to manage its managed devices.
> > This makes my eyes glaze over.
> > Please, find all instances which say "manage" more than once and
> > rephrase.
> 
> Can you propose something you like?
> 
> Each individual has a different wording style.
> 
> Just choose whatever fits your style and I'll add it.


Unfortunately I don't know what you are trying to say here at all. Just
drop this sentence?


> > 
> > > The capabilities exposed
> > > +in the result of VIRTIO_ADMIN_DEVICE_CAPS_IDENTIFY command (see section \ref{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE CAPS IDENTIFY command}
> > > +for more details) and the attributes exposed in the result of VIRTIO_ADMIN_DEVICE_MGMT_ATTRS command
> > > +(see section \ref{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE MGMT ATTRS command} for more details).
> > > +
> > > +The management device will use the VIRTIO_ADMIN_DEVICE_MGMT admin command to manage its managed devices (see section
> > > +\ref{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE MGMT command} for more details).
> > > +
> > >   \chapter{General Initialization And Device Operation}\label{sec:General Initialization And Device Operation}
> > >   We start with an overview of device initialization, then expand on the
> > > @@ -1763,6 +1775,75 @@ \subsubsection{Driver Handling Interrupts}\label{sec:Virtio Transport Options /
> > >       \end{itemize}
> > >   \end{itemize}
> > > +\subsection{PCI-specific Admin capabilities}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI-specific Admin capabilities}
> > > +
> > > +This documents the group of admin capabilities for PCI virtio devices. Each capability is
> > > +implemented using one or more Admin commands.
> > > +
> > > +\subsubsection{MSI-X vector management}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI-specific Admin command set / MSI-X vector management}
> > > +
> > > +This capability enables a virtio management device to control the assignment of MSI-X interrupt vectors
> > > +for its managed devices. In PCI, a management device can be the PF device and the managed device can be the VF (for example in a type 1 device group).
> > > +Capable management devices will need to implement VIRTIO_ADMIN_DEVICE_MGMT and VIRTIO_ADMIN_DEVICE_MGMT_ATTRS admin commands, report the MSI-X attributes in the result of
> > > +VIRTIO_ADMIN_DEVICE_MGMT_ATTRS and report that MSI-X vector resource management is supported in the result of VIRTIO_ADMIN_DEVICE_CAPS_IDENTIFY admin command.
> > > +See sections \ref{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE CAPS IDENTIFY command} and
> > > +\ref{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE MGMT ATTRS command} for more details.
> > > +
> > > +In the result of VIRTIO_ADMIN_DEVICE_MGMT_ATTRS admin command, a capable management device will return the total number of
> > > +msix vectors for its VFs in \field{vfs_total_msix_count} field, the number of already assigned msix vectors for its VFs in
> > > +\field{vfs_assigned_msix_count} field and also the maximal number of msix vectors that can be assigned for a single VF in
> > > +\field{per_vf_max_msix_count} field. In addition, bit 0, bit 1 and bit 2 are set to indicate on the validity of the other 3
> > > +fields in the \field{attrs_mask} field of the result buffer.
> > > +See section \ref{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE MGMT ATTRS command} for more details.
> > > +
> > > +The default assignment of the MSI-X vectors for managed devices is out of the scope of this specification.
> > > +A driver, using VIRTIO_ADMIN_DEVICE_MGMT can update the MSI-X assignment for a specific managed device.
> > > +In the data of VIRTIO_ADMIN_DEVICE_MGMT admin command, a driver set the \field{resource} type to be MSI-X vector and the
> > > +amount of MSI-X interrupt vectors to configure to the designated managed device in \field{resource_val}. The managed device id is set to \field{vdev_id} field.
> > > +
> > > +A successful operation guarantees that the requested amount of MSI-X interrupt vectors was assigned to the designated device.
> > > +This value is also returned in the virtio_admin_device_mgmt_result structure.
> > > +Also, a successful operation guarantees that the MSI-X capability access by the designated PCI device defined by the PCI specification must reflect
> > > +the new configuration in all relevant fields. For example, by default if the PCI VF has been assigned 4 MSI-X vectors, and VIRTIO_ADMIN_DEVICE_MGMT
> > > +increases the MSI-X vectors to 8. On this change, reading Table size field of the MSI-X message control register will reflect a value of 7.
> > > +
> > > +It is beyond the scope of the virtio specification to define
> > > necessary synchronization in system software to ensure that a virtio
> > > PCI VF device interrupt configuration modification is reflected in
> > > the PCI device.
> > IMHO it is very much in scope of the specification. The scope of the
> > specification is to allow device interoperability and this very much
> > fits the bill.
> 
> each system has its own set of tools and definitions.
> 
> It's not covered in the spec today and should be covered. Otherwise, the
> spec will get inside areas it shouldn't.

Then where is this described?
I suspect we can just drop this text, you are actually
describing this below.



> 
> > 
> > > > However, it is expected that any modern system software implementing
> > > > virtio drivers and PCI subsystem will ensure that any changes
> > > > occurring in the VF interrupt configuration is either updated in the
> > > > PCI VF device or such configuration fails.
> > OK. Anything more?

What's the answer here? Is this enough or is more needed?

> What exactly does "interrupt configuration" mean here?
> 
> MSI-X configuration.

Meaning the MSI-X capability and tables?
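
For reference, the example in the quoted patch (8 assigned vectors reading back as 7) follows from how the MSI-X capability encodes the table size: the Table Size field in bits 10:0 of Message Control holds N-1. A small C sketch of that arithmetic, with hypothetical macro and function names:

```c
#include <assert.h>
#include <stdint.h>

/* Table Size occupies bits 10:0 of the MSI-X Message Control register
 * and is encoded as (number of vectors - 1). The macro name is made up
 * for this sketch. */
#define MSIX_MSG_CTRL_TABLE_SIZE_MASK 0x07FFu

/* Number of vectors described by a Message Control register value. */
static unsigned msix_vectors_from_msg_ctrl(uint16_t msg_ctrl)
{
    return (unsigned)(msg_ctrl & MSIX_MSG_CTRL_TABLE_SIZE_MASK) + 1;
}

/* Table Size field value a device would report for a given vector count. */
static uint16_t msix_table_size_field(unsigned vectors)
{
    assert(vectors >= 1 && vectors <= 2048); /* 11-bit field, N-1 encoded */
    return (uint16_t)(vectors - 1);
}
```
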

> 
> > 
> > > For example, one way to
> > > implement that is to make sure that there is no driver bounded to the
> > > > virtio PCI SR-IOV VF during this operation.
> > bounded in what sense?
> in a sense that a pci device driver is bounded and probed the device.

Do you mean bound maybe?

> > 
> > And why do you say VF? Is this command limited to type 1? You only
> > limit it to PCI above.
> 
> > Today we support setting MSI-X configuration for VFs.
> > 
> > This is why I mentioned VFs.
> > 
> > IIRC, you asked to mention VFs in the past - but I'm not sure.
> > 
> > Is this a problem? Should I remove some sentence?

I think what you mean is this. "For example, for type 1 groups, ...."

In other words if you mention VFs this is ok as an example,
but let's make sure we can extend to other types of grouping.


> > same elsewhere
> > 
> > > +
> > > +To query amount of MSI-X interrupt vectors that is currently assigned to a managed device, the driver issue VIRTIO_ADMIN_DEVICE_MGMT with \field{operation} set to
> > issues
> > 
> > lots of grammar error like this elsewhere, pls find and correct.
> > 
> > > +"query resource of the designated vdev_id" value (== 2). The driver also set the \field{resource} type to be MSI-X vector and the managed device id is set to \field{vdev_id}
> > > +field. In the result of a successful operation,
> > meaning "in case"?
> yes.
> > > the amount of MSI-X interrupt vectors that is currently assigned to the designated managed device is
> > > +returned by the device in \field{resource_val} field of the virtio_admin_device_mgmt_result structure.
> > > +See section \ref{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE MGMT command} for more details.
> > > +
> > > +\paragraph{MSI-X configuration sequence example}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI-specific Admin command set / VF MSI-X control / MSI-X configuration sequence example }
> > > +
> > > +A typical sequence for configuring MSI-X vectors for PCI VFs using MSI-X vector management mechanism is following:
> > rephrase to simplify
> > 
> > The driver uses the following sequence for configuring MSI-X vectors
> > ....
> 
> But it's not the driver.
> 
> why should I change this if it's not true ?

Then who does this? We have the driver and the device in the spec ...
I'm ok with adding another entity but that's *a lot* more work ...

> > 
> > 
> > > +
> > > +\begin{enumerate}
> > > +\item Ensure that VF driver doesn't run and it is safe to change MSI-X (e.g. disable sriov auto probing)
> > > +
> > > +\item Load the PF driver
> > > +
> > > +\item Enable SR-IOV by following the PCI specification
> > > +
> > > +\item Query the management device capabilities using commands VIRTIO_ADMIN_DEVICE_IDENTIFY and VIRTIO_ADMIN_DEVICE_MGMT_ATTRS
> > > +
> > > +\item Find the managed VF vdev_id (for type 1 device group the vdev_id of PCI VF is equal to vf number)
> > > +
> > > +\item Query the VF MSI-X configuration using command VIRTIO_ADMIN_DEVICE_MGMT (query operation)
> > > +
> > > +\item Assign desired MSI-X configuration for the VF using command VIRTIO_ADMIN_DEVICE_MGMT (assign operation)
> > > +
> > > +\item After successful completion of the assignment, load the VF driver
> > > +
> > > +\item Assign the VF to a VM
> > > +
> > > +\end{enumerate}
> > > +
> > >   \section{Virtio Over MMIO}\label{sec:Virtio Transport Options / Virtio Over MMIO}
> > >   Virtual environments without PCI support (a common situation in
> > > diff --git a/introduction.tex b/introduction.tex
> > > index 4358ab1..bfc5498 100644
> > > --- a/introduction.tex
> > > +++ b/introduction.tex
> > > @@ -164,9 +164,39 @@ \subsection{Device group}\label{sec:Introduction / Terminology / Device group}
> > >   For now, the supported device groups are:
> > >   \begin{enumerate}
> > >   \item Type 1 - A virtio PCI SR-IOV physical function (PF) and its PCI SR-IOV virtual functions (VFs). For this group type, the PF device has vdev_id that is equal to 0
> > > -and the VF devices have vdev_id's that are equal to their vf_number (according to the PCI SR-IOV specification).
> > > +and the VF devices have vdev_id's that are equal to their vf_number (according to the PCI SR-IOV specification). A PCI SR-IOV PF device can act as a management device for
> > > +type 1 group. A PCI SR-IOV VF device can act as a managed device for type 1 group (see \ref{sec:Introduction / Terminology / Virtio management device} and
> > > +\ref{sec:Introduction / Terminology / Virtio managed device} for more information).
> > >   \end{enumerate}
> > > +\subsection{Virtio management device}\label{sec:Introduction / Terminology / Virtio management device}
> > > +
> > > +A virtio device that supports VIRTIO_ADMIN_DEVICE_MGMT and VIRTIO_ADMIN_DEVICE_MGMT_ATTRS admin commands (see
> > > +\ref{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE MGMT command} and
> > > +\ref{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE MGMT ATTRS command} for more information).
> > > +This device can manage a virtio managed device. A device group may contain zero or more management devices.
> > > +
> > > +A PCI SR-IOV Physical Function based virtio device is an example of a possible virtio management device (for type 1 device group).
> > > +
> > > +\subsection{Virtio type 1 management device}\label{sec:Introduction / Terminology / Virtio type 1 management device}
> > > +
> > > +A virtio management device for type 1 device group. This device is a PCI SR-IOV PF that can set \field{dst_type} to 1 (other virtio device in the same device group),
> > > +and set \field{vdev_id} to an id that corresponds with one of its managed virtio devices (PCI SR-IOV VFs) for the VIRTIO_ADMIN_DEVICE_MGMT admin command.
> > > +
> > > +A type 1 device group may contain zero or one management devices.
> > > +
> > > +\subsection{virtio managed device}\label{sec:Introduction / Terminology / Virtio managed device}
> > > +
> > > +A virtio device that can be managed by a virtio management device.
> > > +A device group may contain zero or more managed devices.
> > > +
> > > +A PCI SR-IOV Virtual Function based virtio device is an example of a possible virtio managed device (for type 1 group).
> > > +
> > > +\subsection{virtio type 1 managed device}\label{sec:Introduction / Terminology / Virtio type 1 managed device}
> > > +
> > > +A virtio managed device for type 1 device group. This device is a PCI SR-IOV VF and is managed by a virtio type 1 management device (virtio PCI SR-IOV PF).
> > > +It is implied that all the virtio PCI SR-IOV VFs related to a virtio PCI SR-IOV PF that is virtio type 1 management device are type 1 managed devices.
> > > +
> > >   \section{Structure Specifications}\label{sec:Structure Specifications}
> > >   Many device and driver in-memory structure layouts are documented using
> > > -- 
> > > 2.21.0


^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [virtio-comment] RE: [PATCH v5 2/7] Introduce admin command set
  2022-06-20  9:23           ` Michael S. Tsirkin
@ 2022-06-20  9:49             ` Michael S. Tsirkin
  0 siblings, 0 replies; 103+ messages in thread
From: Michael S. Tsirkin @ 2022-06-20  9:49 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Max Gurtovoy, jasowang, virtio-comment, cohuck, virtio-dev,
	Oren Duer, Shahaf Shuler, aadam, virtio

On Mon, Jun 20, 2022 at 05:24:02AM -0400, Michael S. Tsirkin wrote:
> With 1.2 release nearing completion I went back to see what's going on
> with this work, and it looks like I forgot to reply to this question.
> Sorry.
> 
> 
> On Tue, May 31, 2022 at 08:39:24PM +0000, Parav Pandit wrote:
> > > And as I said, you will need much more spec work to reach the level to which
> > > features are specified - and note we are not yet happy with how features are
> > Can you be specific of the work that you are expecting in this v5 version?
> 
> I think we need a way for device to be notified about capabilities
> of the driver. Current query only works the other way.

Sorry I spoke from memory about previous version.
Pls ignore this one and I will reply properly.

> I would like us to move to add another group of feature bits intended
> for the management purposes. Maybe just 32 bit, or maybe "extended
> features" with a selector and length like mmio and ccw already do,
> up to you.
> 
> This will replace functionality of the query capabilities command and
> have the advantage of also notifying device about capabilities of the
> driver.
> 
> 
> -- 
> MST


^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [virtio-comment] RE: [PATCH v5 2/7] Introduce admin command set
  2022-05-31 20:39         ` Parav Pandit
  2022-06-20  9:23           ` Michael S. Tsirkin
@ 2022-06-20  9:59           ` Michael S. Tsirkin
  2022-06-20 11:06             ` Parav Pandit
  1 sibling, 1 reply; 103+ messages in thread
From: Michael S. Tsirkin @ 2022-06-20  9:59 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Max Gurtovoy, jasowang, virtio-comment, cohuck, virtio-dev,
	Oren Duer, Shahaf Shuler, aadam, virtio

On Tue, May 31, 2022 at 08:39:24PM +0000, Parav Pandit wrote:
> > And as I said, you will need much more spec work to reach the level to which
> > features are specified - and note we are not yet happy with how features are
> Can you be specific of the work that you are expecting in this v5 version?

I proposed getting rid of attrs_mask/device_admin_caps and instead
specify that e.g. feature bits 64 to 95 are reserved for management
purposes. Maybe just add a 32 bit register, or maybe "extended features"
with a selector and length like mmio and ccw already do, up to you.
I think Cornelia likes this suggestion too.

This would replace functionality of the
VIRTIO_ADMIN_DEVICE_CAPS_IDENTIFY/VIRTIO_ADMIN_DEVICE_CAPS_ACCEPT
commands and have the advantage of being generally well specified and
understood.

Is this acceptable? Or is there a reason the new commands are
preferable?
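
A minimal sketch of the "extended features" variant, assuming the same 32-bit selector windows that the existing feature-select registers use: feature bit N lives in window N / 32, so bits 64 to 95 would all be read through selector value 2. The helper names are hypothetical and only the 64..95 range comes from the proposal.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* From the proposal: feature bits 64..95 reserved for management. */
#define MGMT_FEATURE_FIRST 64u
#define MGMT_FEATURE_LAST  95u

/* Selector value of the 32-bit window that holds a given feature bit. */
static uint32_t feature_select_for_bit(unsigned bit)
{
    return bit / 32u;
}

/* True if the bit falls in the range proposed for management use. */
static bool is_mgmt_feature_bit(unsigned bit)
{
    return bit >= MGMT_FEATURE_FIRST && bit <= MGMT_FEATURE_LAST;
}

/*
 * Test one feature bit given the 32-bit window value that was read
 * after writing feature_select_for_bit(bit) to the selector.
 */
static bool feature_window_has_bit(uint32_t window, unsigned bit)
{
    /* The caller must have selected the window that contains the bit. */
    assert(bit / 32u == feature_select_for_bit(bit));
    return (window >> (bit % 32u)) & 1u;
}
```

With this layout the whole management range fits in a single extra window, which is why a plain 32-bit register would also do.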

-- 
MST


^ permalink raw reply	[flat|nested] 103+ messages in thread

* RE: [virtio-comment] RE: [PATCH v5 2/7] Introduce admin command set
  2022-06-20  9:59           ` Michael S. Tsirkin
@ 2022-06-20 11:06             ` Parav Pandit
  2022-06-20 16:46               ` Michael S. Tsirkin
  2022-06-23  1:26               ` Jason Wang
  0 siblings, 2 replies; 103+ messages in thread
From: Parav Pandit @ 2022-06-20 11:06 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Max Gurtovoy, jasowang, virtio-comment, cohuck, virtio-dev,
	Oren Duer, Shahaf Shuler, aadam, virtio



> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Monday, June 20, 2022 6:00 AM
> 
> On Tue, May 31, 2022 at 08:39:24PM +0000, Parav Pandit wrote:
> > > And as I said, you will need much more spec work to reach the level
> > > to which features are specified - and note we are not yet happy with
> > > how features are
> > Can you be specific of the work that you are expecting in this v5 version?
> 
> I proposed getting rid of attrs_mask/device_admin_caps and instead specify
> that e.g. feature bits 64 to 95 are reserved for management purposes.
> Maybe just add a 32 bit register, or maybe "extended features"
> with a selector and length like mmio and ccw already do, up to you.
> I think Cornelia likes this suggestion too.
> 
> This would replace functionality of the
> VIRTIO_ADMIN_DEVICE_CAPS_IDENTIFY/VIRTIO_ADMIN_DEVICE_CAPS_ACCEPT
> commands and have the advantage of being generally well specified and
> understood.
> 
> Is this acceptable? Or is there a reason the new commands are preferable?

We prefer the new commands for the following reasons.

1. The proposed new commands don't demand exposing registers in the PCI memory mapped area.
2. Today the AQ is perceived to be in the PF, but it doesn't have to be. Over the next 10 years the spec may evolve to have an AQ for some other purpose on VFs/SFs.
Such devices scale at a much higher magnitude than PFs.
And exposing memory mapped registers for rare functionality is the last thing to do in my mind.
3. Placing such feature bits in an AQ gives the device a lot more flexibility on _how_ to implement them: some in sw, fw, hw, memory die etc.
Placing them in PCI register space reduces these options.

So exposing them in PCI register space has to have a stronger technical reason than just simplicity of access.


^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [virtio-comment] RE: [PATCH v5 2/7] Introduce admin command set
  2022-06-20 11:06             ` Parav Pandit
@ 2022-06-20 16:46               ` Michael S. Tsirkin
  2022-06-20 16:54                 ` Max Gurtovoy
  2022-06-20 17:16                 ` Parav Pandit
  2022-06-23  1:26               ` Jason Wang
  1 sibling, 2 replies; 103+ messages in thread
From: Michael S. Tsirkin @ 2022-06-20 16:46 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Max Gurtovoy, jasowang, virtio-comment, cohuck, virtio-dev,
	Oren Duer, Shahaf Shuler, aadam, virtio

On Mon, Jun 20, 2022 at 11:06:07AM +0000, Parav Pandit wrote:
> 
> 
> > From: Michael S. Tsirkin <mst@redhat.com>
> > Sent: Monday, June 20, 2022 6:00 AM
> > 
> > On Tue, May 31, 2022 at 08:39:24PM +0000, Parav Pandit wrote:
> > > > And as I said, you will need much more spec work to reach the level
> > > > to which features are specified - and note we are not yet happy with
> > > > how features are
> > > Can you be specific of the work that you are expecting in this v5 version?
> > 
> > I proposed getting rid of attrs_mask/device_admin_caps and instead specify
> > that e.g. feature bits 64 to 95 are reserved for management purposes.
> > Maybe just add a 32 bit register, or maybe "extended features"
> > with a selector and length like mmio and ccw already do, up to you.
> > I think Cornelia likes this suggestion too.
> > 
> > This would replace functionality of the
> > VIRTIO_ADMIN_DEVICE_CAPS_IDENTIFY/VIRTIO_ADMIN_DEVICE_CAPS_ACCEPT
> > commands and have the advantage of being generally well specified and
> > understood.
> > 
> > Is this acceptable? Or is there a reason the new commands are preferable?
> 
> We prefer the new commands for the following reasons.
> 
> 1. The proposed new commands don't demand exposing registers in the PCI memory mapped area.
> 2. Today the AQ is perceived to be in the PF, but it doesn't have to be. Over the next 10 years the spec may evolve to have an AQ for some other purpose on VFs/SFs.
> Such devices scale at a much higher magnitude than PFs.
> And exposing memory mapped registers for rare functionality is the last thing to do in my mind.
> 3. Placing such feature bits in an AQ gives the device a lot more flexibility on _how_ to implement them: some in sw, fw, hw, memory die etc.
> Placing them in PCI register space reduces these options.
> 
> So exposing them in PCI register space has to have a stronger technical reason than just simplicity of access.

I absolutely get this argument. Please do get mine, which is to avoid
duplicating very similar functionality in subtly differing ways.

Your above arguments apply equally to most other registers we already
have.  Wouldn't you then agree we need to address this more drastically by
defining an alternative transport such as virtio over virtqueue then?

-- 
MST


^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [virtio-comment] RE: [PATCH v5 2/7] Introduce admin command set
  2022-06-20 16:46               ` Michael S. Tsirkin
@ 2022-06-20 16:54                 ` Max Gurtovoy
  2022-06-20 17:04                   ` Michael S. Tsirkin
  2022-06-20 17:16                 ` Parav Pandit
  1 sibling, 1 reply; 103+ messages in thread
From: Max Gurtovoy @ 2022-06-20 16:54 UTC (permalink / raw)
  To: Michael S. Tsirkin, Parav Pandit
  Cc: jasowang, virtio-comment, cohuck, virtio-dev, Oren Duer,
	Shahaf Shuler, aadam, virtio


On 6/20/2022 7:46 PM, Michael S. Tsirkin wrote:
> On Mon, Jun 20, 2022 at 11:06:07AM +0000, Parav Pandit wrote:
>>
>>> From: Michael S. Tsirkin <mst@redhat.com>
>>> Sent: Monday, June 20, 2022 6:00 AM
>>>
>>> On Tue, May 31, 2022 at 08:39:24PM +0000, Parav Pandit wrote:
>>>>> And as I said, you will need much more spec work to reach the level
>>>>> to which features are specified - and note we are not yet happy with
>>>>> how features are
>>>> Can you be specific of the work that you are expecting in this v5 version?
>>> I proposed getting rid of attrs_mask/device_admin_caps and instead specify
>>> that e.g. feature bits 64 to 95 are reserved for management purposes.
>>> Maybe just add a 32 bit register, or maybe "extended features"
>>> with a selector and length like mmio and ccw already do, up to you.
>>> I think Cornelia likes this suggestion too.
>>>
>>> This would replace functionality of the
>>> VIRTIO_ADMIN_DEVICE_CAPS_IDENTIFY/VIRTIO_ADMIN_DEVICE_CAPS_ACCEPT
>>> commands and have the advantage of being generally well specified and
>>> understood.
>>>
>>> Is this acceptable? Or is there a reason the new commands are preferable?
>> We prefer the new commands for the following reasons.
>>
>> 1. The proposed new commands don't demand exposing registers in the PCI memory mapped area.
>> 2. Today the AQ is perceived to be in the PF, but it doesn't have to be. Over the next 10 years the spec may evolve to have an AQ for some other purpose on VFs/SFs.
>> Such devices scale at a much higher magnitude than PFs.
>> And exposing memory mapped registers for rare functionality is the last thing to do in my mind.
>> 3. Placing such feature bits in an AQ gives the device a lot more flexibility on _how_ to implement them: some in sw, fw, hw, memory die etc.
>> Placing them in PCI register space reduces these options.
>>
>> So exposing them in PCI register space has to have a stronger technical reason than just simplicity of access.
> I absolutely get this argument. Please do get mine, which is to avoid
> duplicating very similar functionality in subtly differing ways.
>
> Your above arguments apply equally to most other registers we already
> have.  Wouldn't you then agree we need to address this more drastically by
> defining an alternative transport such as virtio over virtqueue then?

No.

Virtqueues are part of the virtio device.

Admin virtq is yet another queue. It has its own command set. That's it.

And it's ok to have an administration queue to do some control
configurations. It is common practice in other specifications and none of
them created a new "queue transport".

Adding a transport is not needed and not relevant to this solution.

>


^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [virtio-comment] RE: [PATCH v5 2/7] Introduce admin command set
  2022-06-20 16:54                 ` Max Gurtovoy
@ 2022-06-20 17:04                   ` Michael S. Tsirkin
  2022-06-20 17:19                     ` Parav Pandit
  0 siblings, 1 reply; 103+ messages in thread
From: Michael S. Tsirkin @ 2022-06-20 17:04 UTC (permalink / raw)
  To: Max Gurtovoy
  Cc: Parav Pandit, jasowang, virtio-comment, cohuck, virtio-dev,
	Oren Duer, Shahaf Shuler, aadam, virtio

On Mon, Jun 20, 2022 at 07:54:00PM +0300, Max Gurtovoy wrote:
> 
> On 6/20/2022 7:46 PM, Michael S. Tsirkin wrote:
> > On Mon, Jun 20, 2022 at 11:06:07AM +0000, Parav Pandit wrote:
> > > 
> > > > From: Michael S. Tsirkin <mst@redhat.com>
> > > > Sent: Monday, June 20, 2022 6:00 AM
> > > > 
> > > > On Tue, May 31, 2022 at 08:39:24PM +0000, Parav Pandit wrote:
> > > > > > And as I said, you will need much more spec work to reach the level
> > > > > > to which features are specified - and note we are not yet happy with
> > > > > > how features are
> > > > > Can you be specific of the work that you are expecting in this v5 version?
> > > > I proposed getting rid of attrs_mask/device_admin_caps and instead specify
> > > > that e.g. feature bits 64 to 95 are reserved for management purposes.
> > > > Maybe just add a 32 bit register, or maybe "extended features"
> > > > with a selector and length like mmio and ccw already do, up to you.
> > > > I think Cornelia likes this suggestion too.
> > > > 
> > > > This would replace functionality of the
> > > > VIRTIO_ADMIN_DEVICE_CAPS_IDENTIFY/VIRTIO_ADMIN_DEVICE_CAPS_ACCEPT
> > > > commands and have the advantage of being generally well specified and
> > > > understood.
> > > > 
> > > > Is this acceptable? Or is there a reason the new commands are preferable?
> > > We prefer the new commands for the following reasons.
> > > 
> > > 1. The proposed new commands don't demand exposing registers in the PCI memory mapped area.
> > > 2. Today the AQ is perceived to be in the PF, but it doesn't have to be. Over the next 10 years the spec may evolve to have an AQ for some other purpose on VFs/SFs.
> > > Such devices scale at a much higher magnitude than PFs.
> > > And exposing memory mapped registers for rare functionality is the last thing to do in my mind.
> > > 3. Placing such feature bits in an AQ gives the device a lot more flexibility on _how_ to implement them: some in sw, fw, hw, memory die etc.
> > > Placing them in PCI register space reduces these options.
> > > 
> > > So exposing them in PCI register space has to have a stronger technical reason than just simplicity of access.
> > I absolutely get this argument. Please do get mine, which is to avoid
> > duplicating very similar functionality in subtly differing ways.
> > 
> > Your above arguments apply equally to most other registers we already
> > have.  Wouldn't you then agree we need to address this more drastically by
> > defining an alternative transport such as virtio over virtqueue then?
> 
> No.
> 
> Virtqueues are part of the virtio device.
> 
> Admin virtq is yet another queue. It has its own command set. That's it.
> 
> And it's ok to have an administration queue to do some control
> configurations. It is common practice in other specifications and none of
> them created a new "queue transport".
> 
> Adding a transport is not needed and not relevant to this solution.
> 

Max with transport I am not talking about admin queue specifically.
The point Parav made is that he wants to avoid using memory
writable registers for slow path configuration, as far as possible.


Presumably the idea is to keep the information in question
outside the device, with pci spec rules preventing
blocking writes until another write completes (doing so would
create potential for deadlocks).

To that end your proposal does its best to keep new registers outside
device configuration space.

My response is that we already have a large number of registers,
so why don't we define a transport which avoids writes for
configuration completely? This seems to go further than
just saving 32 bit of capability registers.
This way we can focus on just getting the functionality right
without worrying too much about conserving memory space, and then work
on reducing memory space use separately by moving things out of there.

Thanks,

-- 
MST


^ permalink raw reply	[flat|nested] 103+ messages in thread

* RE: [virtio-comment] RE: [PATCH v5 2/7] Introduce admin command set
  2022-06-20 16:46               ` Michael S. Tsirkin
  2022-06-20 16:54                 ` Max Gurtovoy
@ 2022-06-20 17:16                 ` Parav Pandit
  1 sibling, 0 replies; 103+ messages in thread
From: Parav Pandit @ 2022-06-20 17:16 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Max Gurtovoy, jasowang, virtio-comment, cohuck, virtio-dev,
	Oren Duer, Shahaf Shuler, aadam, virtio


> From: virtio-comment@lists.oasis-open.org <virtio-comment@lists.oasis-
> open.org> On Behalf Of Michael S. Tsirkin
> 
> On Mon, Jun 20, 2022 at 11:06:07AM +0000, Parav Pandit wrote:
> >
> >
> > > From: Michael S. Tsirkin <mst@redhat.com>
> > > Sent: Monday, June 20, 2022 6:00 AM
> > >
> > > On Tue, May 31, 2022 at 08:39:24PM +0000, Parav Pandit wrote:
> > > > > And as I said, you will need much more spec work to reach the
> > > > > level to which features are specified - and note we are not yet
> > > > > happy with how features are
> > > > Can you be specific of the work that you are expecting in this v5 version?
> > >
> > > I proposed getting rid of attrs_mask/device_admin_caps and instead
> > > specify that e.g. feature bits 64 to 95 are reserved for management purposes.
> > > Maybe just add a 32 bit register, or maybe "extended features"
> > > with a selector and length like mmio and ccw already do, up to you.
> > > I think Cornelia likes this suggestion too.
> > >
> > > This would replace functionality of the
> > > VIRTIO_ADMIN_DEVICE_CAPS_IDENTIFY/VIRTIO_ADMIN_DEVICE_CAPS_ACCEPT
> > > commands and have the advantage of being generally well specified
> > > and understood.
> > >
> > > Is this acceptable? Or is there a reason the new commands are preferable?
> >
> > We prefer the new commands for the following reasons.
> >
> > 1. The proposed new commands don't demand exposing registers in the PCI memory mapped area.
> > 2. Today the AQ is perceived to be in the PF, but it doesn't have to be. Over the next 10 years the spec may evolve to have an AQ for some other purpose on VFs/SFs.
> > Such devices scale at a much higher magnitude than PFs.
> > And exposing memory mapped registers for rare functionality is the last thing to do in my mind.
> > 3. Placing such feature bits in an AQ gives the device a lot more flexibility on _how_ to implement them: some in sw, fw, hw, memory die etc.
> > Placing them in PCI register space reduces these options.
> >
> > So exposing them in PCI register space has to have a stronger technical reason than just simplicity of access.
> 
> I absolutely get this argument. Please do get mine, which is to avoid
> duplicating very similar functionality in subtly differing ways.
Yes, because it is anticipated that the AQ will be more widely used as more features/functionality are added.

> 
> Your above arguments apply equally to most other registers we already
> have.  Wouldn't you then agree we need to address this more drastically by
> defining an alternative transport such as virtio over virtqueue then?

We cannot change what already exists. But we can better define new additions.
If we were adding an AQ only for the purpose of querying the feature bits of a device in operation, that would be a significant overhead.
But that is not the case. An AQ is reused for multiple tasks.

I see feature bits as something essential to the device initialization sequence; without negotiating them it is very hard to operate the device.
Implementing them as essential PCI register feature bits makes total sense, as is done today.

For the proposed AQ features, which are not strictly required for device initialization, it is better to query and negotiate at run time.



^ permalink raw reply	[flat|nested] 103+ messages in thread

* RE: [virtio-comment] RE: [PATCH v5 2/7] Introduce admin command set
  2022-06-20 17:04                   ` Michael S. Tsirkin
@ 2022-06-20 17:19                     ` Parav Pandit
  2022-06-20 20:53                       ` Michael S. Tsirkin
  0 siblings, 1 reply; 103+ messages in thread
From: Parav Pandit @ 2022-06-20 17:19 UTC (permalink / raw)
  To: Michael S. Tsirkin, Max Gurtovoy
  Cc: jasowang, virtio-comment, cohuck, virtio-dev, Oren Duer,
	Shahaf Shuler, aadam, virtio



> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Monday, June 20, 2022 1:04 PM
> 
> Max with transport I am not talking about admin queue specifically.
> The point Parav made is that he wants to avoid using memory writable
> registers for slow path configuration, as far as possible.
Precisely, especially those which don't have a direct impact on the device initialization sequence.

Please see my previous response.


^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [virtio-comment] RE: [PATCH v5 2/7] Introduce admin command set
  2022-06-20 17:19                     ` Parav Pandit
@ 2022-06-20 20:53                       ` Michael S. Tsirkin
  2022-06-20 23:54                         ` Parav Pandit
  0 siblings, 1 reply; 103+ messages in thread
From: Michael S. Tsirkin @ 2022-06-20 20:53 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Max Gurtovoy, jasowang, virtio-comment, cohuck, virtio-dev,
	Oren Duer, Shahaf Shuler, aadam, virtio

On Mon, Jun 20, 2022 at 05:19:26PM +0000, Parav Pandit wrote:
> 
> 
> > From: Michael S. Tsirkin <mst@redhat.com>
> > Sent: Monday, June 20, 2022 1:04 PM
> > 
> > Max with transport I am not talking about admin queue specifically.
> > The point Parav made is that he wants to avoid using memory writable
> > registers for slow path configuration, as far as possible.
> Precisely, especially those which don't have a direct impact on the device initialization sequence.
> 
> Please see my previous response.

Right. However
1. The cost is quite modest here. E.g. we have two feature selectors which
   just wastes bits.
   How bad would it be to just use feature bits for now, and
   later/separately add an interface that reduces the amount of writeable mmio,
   if deemed appropriate?
2. Driver features are writeable but device features are not -
   so why not have device features in memory?

-- 
MST


^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [PATCH v5 2/7] Introduce admin command set
  2022-05-18 13:39     ` [virtio-comment] " Max Gurtovoy
  2022-05-18 13:50       ` [virtio] " Cornelia Huck
@ 2022-06-20 21:08       ` Michael S. Tsirkin
  1 sibling, 0 replies; 103+ messages in thread
From: Michael S. Tsirkin @ 2022-06-20 21:08 UTC (permalink / raw)
  To: Max Gurtovoy
  Cc: jasowang, virtio-comment, cohuck, virtio-dev, oren, parav,
	shahafs, aadam, virtio

On Wed, May 18, 2022 at 04:39:54PM +0300, Max Gurtovoy wrote:
> 
> On 5/15/2022 6:23 PM, Michael S. Tsirkin wrote:
> > On Wed, Apr 27, 2022 at 01:58:19AM +0300, Max Gurtovoy wrote:
> > > This command set is used for essential administrative and management
> > > operations.
> > > 
> > > Admin commands should be submitted to a well defined management
> > > interface.
> > > 
> > > Reviewed-by: Parav Pandit <parav@nvidia.com>
> > > Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
> > > ---
> > >   admin.tex   | 123 ++++++++++++++++++++++++++++++++++++++++++++++++++++
> > >   content.tex |   2 +
> > >   2 files changed, 125 insertions(+)
> > >   create mode 100644 admin.tex
> > > 
> > > diff --git a/admin.tex b/admin.tex
> > > new file mode 100644
> > > index 0000000..6725daa
> > > --- /dev/null
> > > +++ b/admin.tex
> > > @@ -0,0 +1,123 @@
> > > +\section{Administration command set}\label{sec:Basic Facilities of a Virtio Device / Administration command set}
> > > +
> > > +The Administration command set (also known as Admin command set) defines the commands that can be issued using a management interface.
> > > +This mechanism, for example, can be used by a system administrator that wants to configure a device before it is initialized by its driver.
> > > +
> > > +All the Admin commands are of the following form:
> > > +
> > > +\begin{lstlisting}
> > > +struct virtio_admin_cmd {
> > > +        /* Device-readable part */
> > > +        le16 command;
> > > +        /*
> > > +         * 0 - self
> > > +         * 1 - 65535 are reserved
> > > +         */
> > > +        le16 dst_type;
> > > +        /* reserved for common cmd fields */
> > > +        u8 reserved[20];
> > > +        u8 command_specific_data[];
> > > +
> > > +        /* Device-writable part */
> > > +        u8 status;
> > > +        u8 command_specific_error;
> > > +        u8 command_specific_result[];
> > > +};
> > > +\end{lstlisting}
> > > +
> > > +The following table describes the generic Admin status codes:
> > > +
> > > +\begin{tabular}{|l|l|l|}
> > > +\hline
> > > +Opcode & Status & Description \\
> > > +\hline \hline
> > > +00h   & VIRTIO_ADMIN_STATUS_OK    & successful completion  \\
> > > +\hline
> > > +01h   & VIRTIO_ADMIN_STATUS_CS_ERR    & command specific error  \\
> > > +\hline
> > > +02h   & VIRTIO_ADMIN_STATUS_COMMAND_UNSUPPORTED    & unsupported or invalid opcode  \\
> > > +\hline
> > > +03h   & VIRTIO_ADMIN_STATUS_INVALID_FIELD    & invalid field was set  \\
> > > +\hline
> > > +\end{tabular}
> > > +
> > > +The \field{command}, \field{dst_type} and \field{command_specific_data} are
> > > +set by the driver, and the device sets the \field{status}, the
> > > +\field{command_specific_error} and the \field{command_specific_result},
> > > +if needed.
> > > +
> > > +Reserved common fields are ignored by the device and to be zeroed by the driver.
> > > +
> > > +The mandatory fields to be set by the driver, for all admin commands, are \field{command} and \field{dst_type}.
> > > +
> > > +The \field{command} defines the opcode for the command. The value for each command can be found in each command section.
> > > +
> > > +The \field{dst_type} defines the designated virtio device for the command. This value should be set to 0 (self).
> > > +
> > > +The \field{command_specific_error} should be inspected by the driver only if \field{status} is set to
> > > +VIRTIO_ADMIN_STATUS_CS_ERR by the device. In this case, the content of \field{command_specific_error}
> > > +holds the command specific error. If \field{status} is not set to VIRTIO_ADMIN_STATUS_CS_ERR, the
> > > +\field{command_specific_error} value is undefined and should be ignored by the driver.
> > > +
> > > +The following table describes the Admin command set:
> > > +
> > > +\begin{tabular}{|l|l|l|}
> > > +\hline
> > > +Opcode & Command & M/O \\
> > > +\hline \hline
> > > +0000h   & VIRTIO_ADMIN_DEVICE_CAPS_IDENTIFY    & M  \\
> > > +\hline
> > > +0001h   & VIRTIO_ADMIN_DEVICE_CAPS_ACCEPT    & M  \\
> > > +\hline
> > > +0002h - 7FFFh   & Generic admin cmds    & -  \\
> > > +\hline
> > > +8000h - FFFFh   & Reserved    & - \\
> > > +\hline
> > > +\end{tabular}
> > > +
> > > +\subsection{VIRTIO ADMIN DEVICE CAPS IDENTIFY command}\label{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE CAPS IDENTIFY command}
> > why without _ here?
> 
> The pdf generator doesn't like _
> 
> If someone will fix it, I'll add _

All upper case in the title is a bad idea anyway.
Please come up with a name in English for each command and
then put this readable name in the title.



> > 
> > > +
> > > +The VIRTIO_ADMIN_DEVICE_CAPS_IDENTIFY command has no command specific data set by the driver.
> > > +The \field{command} is set to VIRTIO_ADMIN_DEVICE_CAPS_IDENTIFY by the driver.
> > > +
> > > +The device, upon success, returns a result that describes information about the designated virtio device.
> > result really just means a result structure right? let's say so.
> 
> "The device, upon success, returns a result structure that describes
> information about the designated virtio device."
> 
> is the above ok ?
> 
> MST, Jason and Cornelia please ack.

ok

> > > +This result is of form:
> > > +\begin{lstlisting}
> > > +struct virtio_admin_device_caps_identify_result {
> > > +       /* Indicates which of the below fields were returned
> > > +        * (1 means that field was returned):
> > > what does this mean "field is returned"? above result is returned.
> 
> It means: if bit i is 1, then the value/values described by bit i are valid.
> 
> Is this ok ?
> 
> > > +        * Bit 0 - device_admin_caps
> > > +        * Bits 1 - 63 - reserved for future fields
> > > +        */
> > > +       le64 attrs_mask;
> > > +       /* This field indicates which of the below admin
> > > +        * capabilities are supported by the device:
> > > +        * Bits 0 - 63 - reserved for future capabilities.
> > > +        */
> > > +       le64 device_admin_caps;
> > 
> > so all of the field is reserved?
> 
> the below 112 bytes are reserved.
> 
> the result is 128B and for this stage 112B are reserved for future
> extensions.
> 
> I minimized it from 4k to 128B.
> 
> Please ack.

All this aggressive padding is pointless IMHO. We need to describe how
driver and device should validate the structure so that it is future
proof, and how they should handle larger structure sizes and smaller
buffer sizes.
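
A minimal sketch of what such validation could look like on the driver side, in place of fixed padding. Everything here is an assumption made for illustration and not part of the patch: the structure carries only the two fields defined so far, the device reports how many bytes it wrote (written_len), and the names parse_caps_result and CAPS_ATTR_DEVICE_ADMIN_CAPS are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: bit 0 of attrs_mask marks device_admin_caps as valid. */
#define CAPS_ATTR_DEVICE_ADMIN_CAPS (1ULL << 0)

/* Only the fields defined so far; no reserved tail. */
struct admin_caps_result {
    uint64_t attrs_mask;
    uint64_t device_admin_caps;
};

/* Decode a little-endian 64-bit value regardless of host byte order. */
uint64_t le64_get(const uint8_t *p)
{
    uint64_t v = 0;
    for (int i = 7; i >= 0; i--)
        v = (v << 8) | p[i];
    return v;
}

/*
 * Parse a result of written_len bytes.
 * - Shorter than attrs_mask: malformed, return -1.
 * - Shorter than the full structure: missing fields read as 0.
 * - Longer than the full structure: the unknown tail is ignored.
 */
int parse_caps_result(const uint8_t *buf, size_t written_len,
                      struct admin_caps_result *out)
{
    assert(buf != NULL && out != NULL);

    out->attrs_mask = 0;
    out->device_admin_caps = 0;
    if (written_len < 8)
        return -1;

    out->attrs_mask = le64_get(buf);
    /* A field counts only if its bit is set AND its bytes were written. */
    if ((out->attrs_mask & CAPS_ATTR_DEVICE_ADMIN_CAPS) && written_len >= 16)
        out->device_admin_caps = le64_get(buf + 8);
    return 0;
}
```

With rules of this kind an older driver can consume a longer result from a newer device, and a newer driver can consume a shorter result from an older device, without either side reserving space in advance.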

> > 
> > > +       u8 reserved[112];
> > > +};
> > > +\end{lstlisting}
> > > +
> > > +\subsection{VIRTIO ADMIN DEVICE CAPS ACCEPT command}\label{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE CAPS ACCEPT command}
> > > +
> > > +The VIRTIO_ADMIN_DEVICE_CAPS_ACCEPT command is used by the driver to acknowledge those admin capabilities it understands and wishes to use.
> > 
> > ok so we have a protocol here, kind of like feature negotiation. Please write its description.
> > e.g. is it ok to change accepted caps? when? can device change its caps
> > etc etc etc.
> 
> I don't understand what does this mean to change a cap ?
> 
> Device can offer a cap and driver can accept it if it wishes to use it.
> 
> That is it.

let's say driver sends multiple commands to accept. legal?
what if capabilities depend on each other? how should
driver validate? and on failure how should it react?
same questions for device ...

> I added this mechanism just for your request.
> 
> I never saw a device that asks acceptance from driver but I did my best to
> fulfill your request.

And now that you have started down this road you will discover it's
a lot of work just duplicating existing functionality, work
which would be more usefully spent extending existing functionality,
which isn't always complete either :(


> > 
> > Avoiding this kind of spec work is exactly why me and jason keep telling
> > you to consider just using features instead. Add a 64 bit admin features
> > field to the PCI transport and be done with it. CCW and MMIO already
> > have feature selector so it's trivial to add feature bits.
> 
> It's not scalable for admin mechanism and I don't want to perform 100
> write/read from configuration space instead of doing all in 1 admin command.

Where would 100 reads/writes come from?
"Not scalable" normally implies growth when something increases.
What exactly increases here? The way I see it, feature bits
are per group, so any number of elements in a group would not
require any more memory/accesses.



> > 
> > 
> > > +The \field{command} is set to VIRTIO_ADMIN_DEVICE_CAPS_ACCEPT by the driver.
> > > +
> > > +The command specific data set by the driver is of form:
> > > +\begin{lstlisting}
> > > +struct virtio_admin_device_caps_accept_data {
> > > +       /* Indicates which of the below fields were set
> > > +        * (1 means that field is set):
> > 
> > yes we all know that 1 means set.
> > 
> > do you really mean field is valid maybe?
> yes valid == set.
> > 
> > 
> > > +        * Bit 0 - driver_admin_caps
> > > +        * Bits 1 - 63 - reserved for future fields
> > > +        */
> > > +       le64 attrs_mask;
> > looks like going overboard. just send 64 caps bits and be done with it.
> > and rename accept_data to accept_caps.
> this is the command specific data.
> > 
> > > +       /* This field indicates which of the below admin
> > > +        * capabilities are supported by the driver:
> > > +        * Bits 0 - 63 - reserved for future capabilities.
> > > +        */
> > > +       le64 driver_admin_caps;
> > > +       u8 reserved[112];
> > 
> > I just noticed this. Please do not add this huge amount of padding
> > everywhere. instead, explain that device must be ready to accept
> > a smaller or larger buffer depending on feature bits.
> 
> It's not huge. It's 128B command data.

Could be worse, yes :) But could be better.

> We will be sorry in the future for not doing extendable API.

Instead of trying to guess how much will be enough for everyone,
we need to specify how future struct size changes will be handled.

> I prefer keep it 128B unless there is a concrete reason for not doing so.

Because Jason wants an option to put this stuff in MMIO
down the road and this wastes memory space.


> > 
> > > +};
> > > +\end{lstlisting}
> > > diff --git a/content.tex b/content.tex
> > > index c6f116c..2e1df84 100644
> > > --- a/content.tex
> > > +++ b/content.tex
> > > @@ -449,6 +449,8 @@ \section{Exporting Objects}\label{sec:Basic Facilities of a Virtio Device / Expo
> > >   types. It is RECOMMENDED that devices generate version 4
> > >   UUIDs as specified by \hyperref[intro:rfc4122]{[RFC4122]}.
> > > +\input{admin.tex}
> > > +
> > >   \chapter{General Initialization And Device Operation}\label{sec:General Initialization And Device Operation}
> > >   We start with an overview of device initialization, then expand on the
> > > -- 
> > > 2.21.0


^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [virtio-comment] Re: [PATCH v5 2/7] Introduce admin command set
  2022-05-18 14:16         ` Max Gurtovoy
@ 2022-06-20 22:26           ` Michael S. Tsirkin
  0 siblings, 0 replies; 103+ messages in thread
From: Michael S. Tsirkin @ 2022-06-20 22:26 UTC (permalink / raw)
  To: Max Gurtovoy
  Cc: Cornelia Huck, jasowang, virtio-comment, virtio-dev, oren, parav,
	shahafs, aadam, virtio

On Wed, May 18, 2022 at 05:16:13PM +0300, Max Gurtovoy wrote:
> 
> On 5/18/2022 4:50 PM, Cornelia Huck wrote:
> > On Wed, May 18 2022, Max Gurtovoy <mgurtovoy@nvidia.com> wrote:
> > 
> > > On 5/15/2022 6:23 PM, Michael S. Tsirkin wrote:
> > > > On Wed, Apr 27, 2022 at 01:58:19AM +0300, Max Gurtovoy wrote:
> > > > > This command set is used for essential administrative and management
> > > > > operations.
> > > > > 
> > > > > Admin commands should be submitted to a well defined management
> > > > > interface.
> > > > > 
> > > > > Reviewed-by: Parav Pandit <parav@nvidia.com>
> > > > > Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
> > > > > ---
> > > > >    admin.tex   | 123 ++++++++++++++++++++++++++++++++++++++++++++++++++++
> > > > >    content.tex |   2 +
> > > > >    2 files changed, 125 insertions(+)
> > > > >    create mode 100644 admin.tex
> > > > > 
> > > > > +The VIRTIO_ADMIN_DEVICE_CAPS_ACCEPT command is used by the driver to acknowledge those admin capabilities it understands and wishes to use.
> > > > ok so we have a protocol here, kind of like feature negotiation. Please write its description.
> > > > e.g. is it ok to change accepted caps? when? can device change its caps
> > > > etc etc etc.
> > > I don't understand what does this mean to change a cap ?
> > > 
> > > Device can offer a cap and driver can accept it if it wishes to use it.
> > > 
> > > That is it.
> > > 
> > > I added this mechanism just for your request.
> > > 
> > > I never saw a device that asks acceptance from driver but I did my best
> > > to fulfill your request.
> > > 
> > > > Avoiding this kind of spec work is exactly why me and jason keep telling
> > > > you to consider just using features instead. Add a 64 bit admin features
> > > > field to the PCI transport and be done with it. CCW and MMIO already
> > > > have feature selector so it's trivial to add feature bits.
> > > It's not scalable for admin mechanism and I don't want to perform 100
> > > write/read from configuration space instead of doing all in 1 admin command.
> > Why use the config space for that; just use feature bits, there are
> > enough of those, and we already have a defined protocol.
> 
> can you please propose something concrete ?
> 
> that will be scalable and will not add complexity to the feature negotiation
> mechanism we have today ?

I think you just say: "feature bits 64 to 127 are reserved for management
purposes" or something to this end. Then each command (or a group if
they are related) maps to a feature bit.

There's a cost in that the feature bits map to driver feature bits
which are writeable on device memory. But the cost scales well as
it's per group and not per device in a group.
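
A rough sketch of how a driver might test such a management feature bit through a selector and a 32-bit window, loosely modelled on the selector/window pairs the existing transports expose. The register block below is simulated in plain memory, and the range 64 to 127 is just the example range from this mail; all names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FEATURE_WINDOWS    4u   /* 4 x 32 bits = feature bits 0..127 */
#define MGMT_FEATURE_FIRST 64u  /* example range from the discussion */
#define MGMT_FEATURE_LAST  127u

/* Simulated register block: one selector, one 32-bit window per value. */
struct feature_regs {
    uint32_t select;
    uint32_t window[FEATURE_WINDOWS];
};

/* Write the selector, then read the window it exposes. */
uint32_t read_feature_window(struct feature_regs *regs, uint32_t sel)
{
    assert(regs != NULL);
    regs->select = sel;
    /* Windows the device does not implement read as zero. */
    return sel < FEATURE_WINDOWS ? regs->window[sel] : 0;
}

/* True if the device offers feature bit 'bit'. */
bool device_offers_feature(struct feature_regs *regs, unsigned bit)
{
    uint32_t w = read_feature_window(regs, bit / 32);
    return (w >> (bit % 32)) & 1u;
}

/* True if 'bit' falls in the range reserved for management purposes. */
bool is_mgmt_feature(unsigned bit)
{
    return bit >= MGMT_FEATURE_FIRST && bit <= MGMT_FEATURE_LAST;
}
```

The register footprint stays at one selector and one window no matter how many feature bits are defined, and it is paid once per group owner, not once per group member.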


> > 
> > > > 
> > > > > +The \field{command} is set to VIRTIO_ADMIN_DEVICE_CAPS_ACCEPT by the driver.
> > > > > +
> > > > > +The command specific data set by the driver is of form:
> > > > > +\begin{lstlisting}
> > > > > +struct virtio_admin_device_caps_accept_data {
> > > > > +       /* Indicates which of the below fields were set
> > > > > +        * (1 means that field is set):
> > > > yes we all know that 1 means set.
> > > > 
> > > > do you really mean field is valid maybe?
> > > yes valid == set.
> > > > 
> > > > > +        * Bit 0 - driver_admin_caps
> > > > > +        * Bits 1 - 63 - reserved for future fields
> > > > > +        */
> > > > > +       le64 attrs_mask;
> > > > looks like going overboard. just send 64 caps bits and be done with it.
> > > > and rename accept_data to accept_caps.
> > > this is the command specific data.
> > > > > +       /* This field indicates which of the below admin
> > > > > +        * capabilities are supported by the driver:
> > > > > +        * Bits 0 - 63 - reserved for future capabilities.
> > > > > +        */
> > > > > +       le64 driver_admin_caps;
> > > > > +       u8 reserved[112];
> > > > I just noticed this. Please do not add this huge amount of padding
> > > > everywhere. instead, explain that device must be ready to accept
> > > > a smaller or larger buffer depending on feature bits.
> > > It's not huge. It's 128B command data.
> > > 
> > > We will be sorry in the future for not doing extendable API.
> > > 
> > > I prefer keep it 128B unless there is a concrete reason for not doing so.
> > So just use a variable length structure, that should be extendable for
> > all future use cases.
> 
> I don't know how to develop compatible HW that use variable length
> structure.

There's a length field accompanying each descriptor.

> And why ? without any good reason.

To save memory space if we ever map the commands to memory: it's a
limited resource, often limited to 32 bit, but sometimes even to 16 bit.
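
For illustration, a sketch of how a device could use the descriptor length to accept command data of varying size. The policy is an assumption, not something the patch specifies: data shorter than the one defined field is rejected, a longer tail is ignored, and capabilities that were never offered are refused. The names device_handle_accept and enum accept_status are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical status codes for this sketch. */
enum accept_status {
    ACCEPT_OK = 0,
    ACCEPT_TOO_SHORT = 1,   /* buffer does not hold driver_admin_caps */
    ACCEPT_NOT_OFFERED = 2  /* driver asked for a cap the device lacks */
};

/* Decode a little-endian 64-bit value regardless of host byte order. */
uint64_t le64_read(const uint8_t *p)
{
    uint64_t v = 0;
    for (int i = 7; i >= 0; i--)
        v = (v << 8) | p[i];
    return v;
}

/*
 * desc_len is the length taken from the descriptor that carries the
 * command data. Only the first 8 bytes are understood here; any extra
 * bytes from a newer driver are ignored.
 */
enum accept_status device_handle_accept(const uint8_t *data, size_t desc_len,
                                        uint64_t offered, uint64_t *accepted)
{
    assert(data != NULL && accepted != NULL);

    if (desc_len < sizeof(uint64_t))
        return ACCEPT_TOO_SHORT;

    uint64_t wanted = le64_read(data);
    if (wanted & ~offered)
        return ACCEPT_NOT_OFFERED;  /* *accepted is left unchanged */

    *accepted = wanted;
    return ACCEPT_OK;
}
```

Under this sketch the command data is 8 bytes today instead of 128, which matters if the same layout is ever mapped to memory.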

-- 
MST


^ permalink raw reply	[flat|nested] 103+ messages in thread

* RE: [virtio-comment] RE: [PATCH v5 2/7] Introduce admin command set
  2022-06-20 20:53                       ` Michael S. Tsirkin
@ 2022-06-20 23:54                         ` Parav Pandit
  0 siblings, 0 replies; 103+ messages in thread
From: Parav Pandit @ 2022-06-20 23:54 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Max Gurtovoy, jasowang, virtio-comment, cohuck, virtio-dev,
	Oren Duer, Shahaf Shuler, aadam, virtio

> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Monday, June 20, 2022 4:54 PM
> 
> On Mon, Jun 20, 2022 at 05:19:26PM +0000, Parav Pandit wrote:
> >
> >
> > > From: Michael S. Tsirkin <mst@redhat.com>
> > > Sent: Monday, June 20, 2022 1:04 PM
> > >
> > > Max with transport I am not talking about admin queue specifically.
> > > The point Parav made is that he wants to avoid using memory writable
> > > registers for slow path configuration, as far as possible.
> > Precisely, especially those one which doesn't have direct impact on the
> device initialization sequence.
> >
> > Please see my previous response.
> 
> Right. However
> 1. The cost is quite modest here. E.g. we have two feature selectors which
>    just wastes bits.
Today, because the proposal is short, there are two bits. But I expect them to grow.
For example, a device capable of modifying feature bits and num_vqs will expose two more bits.
For a net-specific device, one might want to configure the mac and mtu of the device.
Then there are LM-specific capabilities, such as being write-logging capable.
The list will have a few more entries.

>    How bad would be to just use feature bits for now, and
>    later/separately add an interface that reduces amount of writeable mmio,
>    if deemed appropriate?
If one thinks of using the AQ to implement vdpa over a virtio device, with an AQ on each VF, each VF will then have to carry these rarely used entries as registers.
I think we can do better.

> 2. Driver features are writeable but device features are not -
>    so why not have device features in memory?
> 

If the virtio spec builds a channel to write driver features, it is more logical to use the same channel to read device features too.


^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [virtio-comment] RE: [PATCH v5 2/7] Introduce admin command set
  2022-06-20 11:06             ` Parav Pandit
  2022-06-20 16:46               ` Michael S. Tsirkin
@ 2022-06-23  1:26               ` Jason Wang
  2022-06-23  2:07                 ` Parav Pandit
  1 sibling, 1 reply; 103+ messages in thread
From: Jason Wang @ 2022-06-23  1:26 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Michael S. Tsirkin, Max Gurtovoy, virtio-comment, cohuck,
	virtio-dev, Oren Duer, Shahaf Shuler, aadam, virtio

On Mon, Jun 20, 2022 at 7:06 PM Parav Pandit <parav@nvidia.com> wrote:
>
>
>
> > From: Michael S. Tsirkin <mst@redhat.com>
> > Sent: Monday, June 20, 2022 6:00 AM
> >
> > On Tue, May 31, 2022 at 08:39:24PM +0000, Parav Pandit wrote:
> > > > And as I said, you will need much more spec work to reach the level
> > > > to which features are specified - and note we are not yet happy with
> > > > how features are
> > > Can you be specific of the work that you are expecting in this v5 version?
> >
> > I proposed getting rid of attrs_mask/device_admin_caps and instead specify
> > that e.g. feature bits 64 to 95 are reserved for management purposes.
> > Maybe just add a 32 bit register, or maybe "extended features"
> > with a selector and length like mmio and ccw already do, up to you.
> > I think Cornelia likes this suggestion too.
> >
> > This would replace functionality of the
> > VIRTIO_ADMIN_DEVICE_CAPS_IDENTIFY/VIRTIO_ADMIN_DEVICE_CAPS_ACCEPT
> > commands and have the advantage of being generally well specified and
> > understood.
> >
> > Is this acceptable? Or is there a reason the new commands are preferable?
>
> We prefer the new commands for following reason.
>
> 1. proposed new command doesn't demand exposing registers in the PCI memory mapped area
> 2. Today AQ is perceived in to be PF, but it doesn't have to be. For next 10 years spec may evolve to have AQ for some other purpose on VFs/SFs.
> Such devices scale at much higher magnitude than PFs.
> And exposing memory mapped registers for rare functionality is the last thing to do in my mind.
> 3. Placements of such features bits in an AQ gives device lot more flexibility on _how_ to implement them. Some in sw, fw, hw, memory die etc.
> Placement of them in PCI register space reduces these options.
>
> So exposing them something in PCI register space has to have strong technical reason than just simplicity of access.

I think I know the advantages of admin virtqueue. But this part looks
suspicious and actually the reverse,

1) register based transports have been used for years, so it's natural to
add features based on the existing transports, and what you suggest here
keeps the new features from being carried with those transports
2) admin virtqueue is heavy weight in use cases like nesting, we need
a simple interface
3) admin virtqueue is not the universal transport for all cases, we've
already had DMA/CMA based transports (e.g. ccw and rproc)

We'd better decouple features from the transport and allow them to be
used by both admin virtqueue and other transports.
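
One way to picture this decoupling, as a sketch only: the negotiation logic talks to a small table of callbacks, and each transport (registers, admin virtqueue, or anything else) supplies its own implementation. The struct feature_transport type, the negotiate function and the fake register backend are all hypothetical names invented for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Transport-neutral view of feature negotiation. */
struct feature_transport {
    uint64_t (*get_device_features)(void *ctx);
    void (*set_driver_features)(void *ctx, uint64_t features);
    void *ctx;  /* backend state: register block, admin queue, ... */
};

/* Negotiation logic that does not care which transport is underneath. */
uint64_t negotiate(const struct feature_transport *t, uint64_t driver_supported)
{
    assert(t != NULL && t->get_device_features && t->set_driver_features);

    uint64_t common = t->get_device_features(t->ctx) & driver_supported;
    t->set_driver_features(t->ctx, common);
    return common;
}

/* Fake register-style backend, standing in for a real transport. */
struct fake_regs {
    uint64_t device_features;
    uint64_t driver_features;
};

uint64_t regs_get(void *ctx)
{
    return ((struct fake_regs *)ctx)->device_features;
}

void regs_set(void *ctx, uint64_t features)
{
    ((struct fake_regs *)ctx)->driver_features = features;
}
```

An admin virtqueue backend would fill in the same two callbacks by submitting commands instead of touching registers, and negotiate() would not need to change.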

Thanks

>
> This publicly archived list offers a means to provide input to the
> OASIS Virtual I/O Device (VIRTIO) TC.
>
> In order to verify user consent to the Feedback License terms and
> to minimize spam in the list archive, subscription is required
> before posting.
>
> Subscribe: virtio-comment-subscribe@lists.oasis-open.org
> Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
> List help: virtio-comment-help@lists.oasis-open.org
> List archive: https://lists.oasis-open.org/archives/virtio-comment/
> Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
> List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
> Committee: https://www.oasis-open.org/committees/virtio/
> Join OASIS: https://www.oasis-open.org/join/
>


^ permalink raw reply	[flat|nested] 103+ messages in thread

* RE: [virtio-comment] RE: [PATCH v5 2/7] Introduce admin command set
  2022-06-23  1:26               ` Jason Wang
@ 2022-06-23  2:07                 ` Parav Pandit
  2022-06-23  2:41                   ` Jason Wang
  0 siblings, 1 reply; 103+ messages in thread
From: Parav Pandit @ 2022-06-23  2:07 UTC (permalink / raw)
  To: Jason Wang
  Cc: Michael S. Tsirkin, Max Gurtovoy, virtio-comment, cohuck,
	virtio-dev, Oren Duer, Shahaf Shuler, aadam, virtio



> From: Jason Wang <jasowang@redhat.com>
> Sent: Wednesday, June 22, 2022 9:27 PM

> > 1. proposed new command doesn't demand exposing registers in the PCI
> > memory mapped area 2. Today AQ is perceived in to be PF, but it doesn't
> have to be. For next 10 years spec may evolve to have AQ for some other
> purpose on VFs/SFs.
> > Such devices scale at much higher magnitude than PFs.
> > And exposing memory mapped registers for rare functionality is the last
> thing to do in my mind.
> > 3. Placements of such features bits in an AQ gives device lot more flexibility
> on _how_ to implement them. Some in sw, fw, hw, memory die etc.
> > Placement of them in PCI register space reduces these options.
> >
> > So exposing them something in PCI register space has to have strong
> technical reason than just simplicity of access.
> 
> I think I know the advantages of admin virtqueue. But this part looks
> suspicious and actually the reverse,
> 
> 1) register based transport have been used for years, it's natural to add
> features based on the existing transport and what you suggest here limit the
> new features to be carried with those transports
> 2) admin virtqueue is heavy weight in use cases like nesting, we need a
> simple interface
> 3) admin virtqueue is not the universal transport for all cases, we've already
> had DMA/CMA based transports (e.g ccw and rproc)
>
Are you suggesting only feature bits through register-based transport or all commands via register-based transport?

^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [virtio-comment] RE: [PATCH v5 2/7] Introduce admin command set
  2022-06-23  2:07                 ` Parav Pandit
@ 2022-06-23  2:41                   ` Jason Wang
  2022-06-23  2:57                     ` Parav Pandit
  0 siblings, 1 reply; 103+ messages in thread
From: Jason Wang @ 2022-06-23  2:41 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Michael S. Tsirkin, Max Gurtovoy, virtio-comment, cohuck,
	virtio-dev, Oren Duer, Shahaf Shuler, aadam, virtio

On Thu, Jun 23, 2022 at 10:07 AM Parav Pandit <parav@nvidia.com> wrote:
>
>
>
> > From: Jason Wang <jasowang@redhat.com>
> > Sent: Wednesday, June 22, 2022 9:27 PM
>
> > > 1. proposed new command doesn't demand exposing registers in the PCI
> > > memory mapped area 2. Today AQ is perceived in to be PF, but it doesn't
> > have to be. For next 10 years spec may evolve to have AQ for some other
> > purpose on VFs/SFs.
> > > Such devices scale at much higher magnitude than PFs.
> > > And exposing memory mapped registers for rare functionality is the last
> > thing to do in my mind.
> > > 3. Placements of such features bits in an AQ gives device lot more flexibility
> > on _how_ to implement them. Some in sw, fw, hw, memory die etc.
> > > Placement of them in PCI register space reduces these options.
> > >
> > > So exposing them something in PCI register space has to have strong
> > technical reason than just simplicity of access.
> >
> > I think I know the advantages of admin virtqueue. But this part looks
> > suspicious and actually the reverse,
> >
> > 1) register based transport have been used for years, it's natural to add
> > features based on the existing transport and what you suggest here limit the
> > new features to be carried with those transports
> > 2) admin virtqueue is heavy weight in use cases like nesting, we need a
> > simple interface
> > 3) admin virtqueue is not the universal transport for all cases, we've already
> > had DMA/CMA based transports (e.g ccw and rproc)
> >
> Are you suggesting only feature bits through register-based transport or all commands via register-based transport?

For feature bits, if we don't have admin virtqueue as transport, I
suggest it be negotiated with existing transport: we don't need any
extension of the existing transport facility to make it work. If we
have admin virtqueue as a transport, we must allow the features of the
managed device to be negotiated through the admin virtqueue.

For other commands, we probably need to analyze it case by case. But
if the command is for management only, it should be done via admin
virtqueue. (Note that for the MSIX allocation, I do see some vendors
allocating it via registers). If the command is used for basic
facilities like the virtqueue reset, it should be allowed to be
implemented by any transport (admin virtqueue as well as others like
PCI registers).

Thanks


^ permalink raw reply	[flat|nested] 103+ messages in thread

* RE: [virtio-comment] RE: [PATCH v5 2/7] Introduce admin command set
  2022-06-23  2:41                   ` Jason Wang
@ 2022-06-23  2:57                     ` Parav Pandit
  2022-06-23  3:34                       ` Jason Wang
  0 siblings, 1 reply; 103+ messages in thread
From: Parav Pandit @ 2022-06-23  2:57 UTC (permalink / raw)
  To: Jason Wang
  Cc: Michael S. Tsirkin, Max Gurtovoy, virtio-comment, cohuck,
	virtio-dev, Oren Duer, Shahaf Shuler, aadam, virtio


> From: Jason Wang <jasowang@redhat.com>
> Sent: Wednesday, June 22, 2022 10:42 PM

[..]
> > Are you suggesting only feature bits through register-based transport or all
> commands via register-based transport?
> 
> For feature bits, if we don't have admin virtqueue as transport, 
Right.
Management tasks (features) done via the AQ are negotiated via the AQ (regardless of whether the device is a management device or a managed device).
This does not replace the usual feature bit negotiation, which is closely tied to the device initialization sequence.

> I suggest it
> be negotiated with existing transport: we don't need any extension of the
> existing transport facility to make it work. If we have admin virtqueue as a
> transport, we must allow the features of the managed device to be
> negotiated through the admin virtqueue.
>
Regarding above _managed_device_:

In the current version the AQ is on the management device (not the managed device).
But the AQ as stated is generic and can be present on any device that wants to implement it.

> For other commands, we probably need to analyze it case by case. But if the
> command is for management only, it should be done via admin virtqueue.
> (Note that for the MSIX allocation, I do see some vendors allocating it via
> registers). 
Where do you see it? In the spec draft?

> If the command is used for basic facilities like the virtqueue reset,
> it should be allowed to be implemented by any transport (admin virtqueue
> as well as others like PCI registers).

^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [virtio-comment] RE: [PATCH v5 2/7] Introduce admin command set
  2022-06-23  2:57                     ` Parav Pandit
@ 2022-06-23  3:34                       ` Jason Wang
  2022-06-28 14:24                         ` Michael S. Tsirkin
  0 siblings, 1 reply; 103+ messages in thread
From: Jason Wang @ 2022-06-23  3:34 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Michael S. Tsirkin, Max Gurtovoy, virtio-comment, cohuck,
	virtio-dev, Oren Duer, Shahaf Shuler, aadam, virtio

On Thu, Jun 23, 2022 at 10:57 AM Parav Pandit <parav@nvidia.com> wrote:
>
>
> > From: Jason Wang <jasowang@redhat.com>
> > Sent: Wednesday, June 22, 2022 10:42 PM
>
> [..]
> > > Are you suggesting only feature bits through register-based transport or all
> > commands via register-based transport?
> >
> > For feature bits, if we don't have admin virtqueue as transport,
> Right.
> Management tasks (features) done via AQ are negotiated via the AQ (regardless of device type being management dev or managed dev).
> It is not replacing usual feature bits negotiation that has strong relationships with device initialization sequence.

Just to clarify,

E.g. if the admin virtqueue is implemented in the PF, I suggest using the
existing PCI transport based feature negotiation since:

1) we have a selector, so we don't need any new registers and we don't
need to worry about scalability
2) the feature negotiation is done once, so we don't need to care about
performance

This is the approach we used for other features that require a specific
type of virtqueue.

If we use the admin virtqueue (management device) as a transport for
e.g. scalable functions (managed devices), we must have a command that
is used for feature negotiation for the scalable functions (managed
devices).

>
> > I suggest it
> > be negotiated with existing transport: we don't need any extension of the
> > existing transport facility to make it work. If we have admin virtqueue as a
> > transport, we must allow the features of the managed device to be
> > negotiated through the admin virtqueue.
> >
> Regarding above _managed_device_:
>
> In current version the AQ is on the management device (not managed device).
> But AQ as stated is generic to be present on any device that wants to implement AQ.

Yes.

>
> > For other commands, we probably need to analyze it case by case. But if the
> > command is for management only, it should be done via admin virtqueue.
> > (Note that for the MSIX allocation, I do see some vendors allocating it via
> > registers).
> Where do you see it?

E.g Intel E810

> In the spec draft?

No.

Thanks

>
> > If the command is used for basic facilities like the virtqueue reset,
> > it should be allowed to be implemented by any transport (admin virtqueue
> > as well as others like PCI registers).


^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [virtio-comment] Re: [PATCH v5 1/7] Introduce device group
  2022-06-02  6:59             ` Michael S. Tsirkin
@ 2022-06-27 21:52               ` Max Gurtovoy
  2022-06-28 18:54                 ` Michael S. Tsirkin
  0 siblings, 1 reply; 103+ messages in thread
From: Max Gurtovoy @ 2022-06-27 21:52 UTC (permalink / raw)
  To: Michael S. Tsirkin, Jason Wang
  Cc: Cornelia Huck, virtio-comment, Virtio-Dev, Oren Duer,
	Parav Pandit, Shahaf Shuler, Ariel Adam, virtio


On 6/2/2022 9:59 AM, Michael S. Tsirkin wrote:
> On Thu, Jun 02, 2022 at 10:21:23AM +0800, Jason Wang wrote:
>> On Wed, Jun 1, 2022 at 9:44 PM Max Gurtovoy <mgurtovoy@nvidia.com> wrote:
>>>
>>> On 5/18/2022 4:32 PM, Cornelia Huck wrote:
>>>> On Wed, May 18 2022, Max Gurtovoy <mgurtovoy@nvidia.com> wrote:
>>>>
>>>>> Hi MST,
>>>>>
>>>>> On 5/15/2022 6:25 PM, Michael S. Tsirkin wrote:
>>>>>> On Wed, Apr 27, 2022 at 01:58:18AM +0300, Max Gurtovoy wrote:
>>>>>>> +\subsection{Device group}\label{sec:Introduction / Terminology / Device group}
>>>>>>> +
>>>>>>> +A device group includes one or more virtio devices.
>>>>>>> +Each virtio device has a unique virtio device id (vdev_id) within a device group. A valid vdev_id is a 64-bit field in the range of
>>>>>>> +0x0 - 0xFFFFFFFFFFFFFFF0. Vdev_id 0xFFFFFFFFFFFFFFFF is a value that refers to all devices in a device group and isn't a valid vdev_id.
>>>>>>> +
>>>>>>> +For now, the supported device groups are:
>>>>>>> +\begin{enumerate}
>>>>>>> +\item Type 1 - A virtio PCI SR-IOV physical function (PF) and its PCI SR-IOV virtual functions (VFs). For this group type, the PF device has vdev_id that is equal to 0
>>>>>>> +and the VF devices have vdev_id's that are equal to their vf_number (according to the PCI SR-IOV specification).
>>>>>>> +\end{enumerate}
>>>>>>> +
>>>>>>>     \section{Structure Specifications}\label{sec:Structure Specifications}
>>>>>> In context of virtualization type 1 already refers to a specific type
>>>>>> of hypervisor.
>>>>>>
>>>>>> I suggest simply "SR-IOV type" - this way users do not need to remember
>>>>>> special terminology.
>>>>> This is 12 lines addition commit with simple definition.
>>>>>
>>>>> I didn't mentioned hypervisors here.
>>>>>
>>>>> I will stick to your suggestion and use name instead of numbers
>>>>> (although I don't understand how can a use that knows how to read spec
>>>>> will be confused here), but I would like Jason and Cornelia to ack on
>>>>> this during this review cycle.
>>>>>
>>>>> When we'll get 3 acks on this name - I'll update it for v6.
>>>> So, do you want to imply some kind of numbering? I don't like "Type 1",
>>>> either. If the type needs to be referenced in code, it should have a
>>>> #define or such; otherwise, "SR-IOV type" would be fine.
>>> ok I'll change it to be:
>>>
>>> diff --git a/introduction.tex b/introduction.tex
>>> index aa9ec1b..bba70a6 100644
>>> --- a/introduction.tex
>>> +++ b/introduction.tex
>>> @@ -156,6 +156,18 @@ \subsection{Transition from earlier specification
>>> drafts}\label{sec:Transition f
>>>    sections tagged "Legacy Interface" in the section title.
>>>    These highlight the changes made since the earlier drafts.
>>>
>>> +\subsection{Device group}\label{sec:Introduction / Terminology / Device
>>> group}
>>> +
>>> +A device group includes one or more virtio devices.
>>> +Each virtio device has a unique virtio device id (vdev_id) within a
>>> device group. A valid vdev_id is a 64-bit field in the range of
>>> +0x0 - 0xFFFFFFFFFFFFFFF0. Vdev_id 0xFFFFFFFFFFFFFFFF is a value that
>>> refers to all devices in a device group and isn't a valid vdev_id.
> BTW I don't really like inventing terms with underscores.
> Let's stick to english if we can. By the way "id" is not a word, either
> ;) We have a couple of instances which we really should fix.
> And "virtio device id" is confusingly similar to
> virtio "device id" where device id is a term from pci.
> And abbreviations really should be capitalized or end with a comma
> but really best just avoided.
>
> How about group member identifier? So
>
> Each virtio device within a group has a unique member identifier.
>
>
>
>>> +
>>> +For now, the supported device groups are:
>>> +\begin{enumerate}
>>> +\item SR-IOV type - A virtio PCI SR-IOV physical function (PF) and its
>>> PCI SR-IOV virtual functions (VFs). For this group type, the PF device
>>> has vdev_id that is equal to 0
>>> +and the VF devices have vdev_id's that are equal to their vf_number
>>> (according to the PCI SR-IOV specification).
> A bit better just from grammar perspective:
>
>   \item SR-IOV type - the group includes a virtio PCI SR-IOV physical function (PF) and
>   all its virtual functions (VFs). For this group type, the PF device
>   has vdev_id of 0;
>   each VF has a vdev_id matching it's VF number [link to SRIOV spec].
>
> but note ideas on terminology above.

update to:

"

+\subsection{Device group}\label{sec:Introduction / Terminology / Device group}
+
+A device group includes one or more virtio devices.
+Each virtio device has a unique group member identifier (group_member_id) within a device group. A valid group member identifier
+is a 64-bit field in the range of 0x0 - 0xFFFFFFFFFFFFFFF0. Vdev_id 0xFFFFFFFFFFFFFFFF is a value that refers to all devices in a
+device group and isn't a valid group member identifier.
+
+For now, the supported device groups are:
+\begin{enumerate}
+\item SR-IOV type - this group includes a virtio PCI SR-IOV physical function (PF) and all its virtual functions (VFs).
+For this group type, the PF device has group member identifier of 0. Each VF has a group member identifier matching its VF number
+(according to PCI Express Base Specification, Single Root I/O Virtualization and Sharing chapter).
+\end{enumerate}

"

It's 90% suggested by MST.
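
The identifier rules in the text above can be sketched in C roughly as follows. The macro and function names are hypothetical. Two points are assumptions that the text does not spell out: values between 0xFFFFFFFFFFFFFFF1 and 0xFFFFFFFFFFFFFFFE are treated as invalid, and VF numbers are taken to start at 1 so that they never collide with the PF's identifier 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define GROUP_MEMBER_ID_PF  0x0ULL                 /* PF in an SR-IOV group */
#define GROUP_MEMBER_ID_MAX 0xFFFFFFFFFFFFFFF0ULL  /* last valid identifier */
#define GROUP_MEMBER_ID_ALL 0xFFFFFFFFFFFFFFFFULL  /* all devices in group */

/* True if id names exactly one device in a group. */
bool group_member_id_is_valid(uint64_t id)
{
    return id <= GROUP_MEMBER_ID_MAX;
}

/* True if id is the wildcard that addresses every device in the group. */
bool group_member_id_is_all(uint64_t id)
{
    return id == GROUP_MEMBER_ID_ALL;
}

/* SR-IOV type: a VF's identifier equals its VF number. */
uint64_t sriov_vf_member_id(uint16_t vf_number)
{
    assert(vf_number != 0);  /* assumption: VF numbering starts at 1 */
    return vf_number;
}
```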

>>> +\end{enumerate}
>>> +
>>>    \section{Structure Specifications}\label{sec:Structure Specifications}
>>>
>>> MST/Jason/Cornelia,
>>>
>>> can you add some Reviewed-By signatures if the above is agreed ?
>> If I understand this correctly, the idea of "device group" is to allow
>> different groups to be managed by a single admin virtqueue?
>>
>> And I feel that mixing transport specific definitions in the general
>> admin virtqueue might not be optimal. So I wonder whether it's better
>> to just say this is a transport specific type. And define it in the
>> PCI transport part.
>>
>> Thanks
> Well it's a single paragraph here, I think it's ok for now for completeness
> sake rather than have reader chase references just to figure out an
> example.
>
> But I do agree this sentence about SRIOV type has to be repeated
> in the pci transport section for completeness of that one.
>
Please suggest exactly what to add and where to add it,
and it will be done in v6.

>
>>>
>>>
>>>>
>>>> This publicly archived list offers a means to provide input to the
>>>> OASIS Virtual I/O Device (VIRTIO) TC.
>>>>
>>>> In order to verify user consent to the Feedback License terms and
>>>> to minimize spam in the list archive, subscription is required
>>>> before posting.
>>>>
>>>> Subscribe: virtio-comment-subscribe@lists.oasis-open.org
>>>> Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
>>>> List help: virtio-comment-help@lists.oasis-open.org
>>>> List archive: https://lists.oasis-open.org/archives/virtio-comment/
>>>> Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
>>>> List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
>>>> Committee: https://www.oasis-open.org/committees/virtio/
>>>> Join OASIS: https://www.oasis-open.org/join/
>>>>


^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [virtio-comment] RE: [PATCH v5 2/7] Introduce admin command set
  2022-06-23  3:34                       ` Jason Wang
@ 2022-06-28 14:24                         ` Michael S. Tsirkin
  2022-06-29  8:43                           ` Jason Wang
  0 siblings, 1 reply; 103+ messages in thread
From: Michael S. Tsirkin @ 2022-06-28 14:24 UTC (permalink / raw)
  To: Jason Wang
  Cc: Parav Pandit, Max Gurtovoy, virtio-comment, cohuck, virtio-dev,
	Oren Duer, Shahaf Shuler, aadam, virtio

On Thu, Jun 23, 2022 at 11:34:42AM +0800, Jason Wang wrote:
> On Thu, Jun 23, 2022 at 10:57 AM Parav Pandit <parav@nvidia.com> wrote:
> >
> >
> > > From: Jason Wang <jasowang@redhat.com>
> > > Sent: Wednesday, June 22, 2022 10:42 PM
> >
> > [..]
> > > > Are you suggesting only feature bits through register-based transport or all
> > > commands via register-based transport?
> > >
> > > For feature bits, if we don't have admin virtqueue as transport,
> > Right.
> > Management tasks (features) done via AQ are negotiated via the AQ (regardless of device type being management dev or managed dev).
> > It is not replacing usual feature bits negotiation that has strong relationships with device initialization sequence.
> 
> Just to clarify,
> 
> E.g if the admin virtqueue is implemented in PF. I suggest using the
> existing PCI transport based feature negotiation since:
> 
> 1) we have selector, so we don't need any new registered and we don't
> need worry about the scalability


I think this is a point that needs addressing.

To be more specific: the concern raised was that feature bits are memory
mapped, and accessing them on PCI requires an MMIO write. According to the
PCI spec it is not legal for a device to defer accepting a write until
another write is accepted (since that leads to deadlocks).
Thus any info programmed using MMIO writes has to reside in
on-card memory and cannot be offloaded to system RAM.

After thinking about these issues I have an idea:

At the moment VQs are already never programmed before FEATURES_OK, and
features are never programmed after FEATURES_OK.
This implies that the device can actually store features in
VQ state registers temporarily until FEATURES_OK.
That is more than 24 bytes of memory per VQ; with at least
one data VQ and one admin VQ, this will be sufficient for a long time.

Thus, all that we need to do is prohibit programming VQs
before FEATURES_OK more strongly.
I can work on that if the idea is acceptable to others.
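
The ordering rule described above can be sketched as a simple write gate. This is a minimal illustration in C under stated assumptions: the register classes and the function name are hypothetical, and only the FEATURES_OK status bit (value 8) is taken from the virtio specification.

```c
#include <stdbool.h>
#include <stdint.h>

/* FEATURES_OK device status bit (value 8 in the virtio specification). */
#define STATUS_FEATURES_OK 0x08u

/* Hypothetical register classes, used only for this illustration. */
enum reg_class {
    REG_DRIVER_FEATURE, /* driver feature words */
    REG_QUEUE_STATE     /* per-VQ state such as ring addresses */
};

/*
 * Decide whether a driver write is acceptable under the stricter rule:
 * features only before FEATURES_OK, VQ state only after it.
 */
static bool reg_write_allowed(uint8_t device_status, enum reg_class reg)
{
    bool features_ok = (device_status & STATUS_FEATURES_OK) != 0;

    switch (reg) {
    case REG_DRIVER_FEATURE:
        return !features_ok;
    case REG_QUEUE_STATE:
        return features_ok;
    }
    return false; /* unknown register class */
}
```

Because the two kinds of write are never both allowed at the same time, the same storage can serve both purposes without the device having to track which one it currently holds beyond the status bit.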




-- 
MST


^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [virtio-comment] Re: [PATCH v5 1/7] Introduce device group
  2022-06-27 21:52               ` Max Gurtovoy
@ 2022-06-28 18:54                 ` Michael S. Tsirkin
  2022-07-06 11:25                   ` Max Gurtovoy
  0 siblings, 1 reply; 103+ messages in thread
From: Michael S. Tsirkin @ 2022-06-28 18:54 UTC (permalink / raw)
  To: Max Gurtovoy
  Cc: Jason Wang, Cornelia Huck, virtio-comment, Virtio-Dev, Oren Duer,
	Parav Pandit, Shahaf Shuler, Ariel Adam, virtio

On Tue, Jun 28, 2022 at 12:52:51AM +0300, Max Gurtovoy wrote:
> 
> On 6/2/2022 9:59 AM, Michael S. Tsirkin wrote:
> > On Thu, Jun 02, 2022 at 10:21:23AM +0800, Jason Wang wrote:
> > > On Wed, Jun 1, 2022 at 9:44 PM Max Gurtovoy <mgurtovoy@nvidia.com> wrote:
> > > > 
> > > > On 5/18/2022 4:32 PM, Cornelia Huck wrote:
> > > > > On Wed, May 18 2022, Max Gurtovoy <mgurtovoy@nvidia.com> wrote:
> > > > > 
> > > > > > Hi MST,
> > > > > > 
> > > > > > On 5/15/2022 6:25 PM, Michael S. Tsirkin wrote:
> > > > > > > On Wed, Apr 27, 2022 at 01:58:18AM +0300, Max Gurtovoy wrote:
> > > > > > > > +\subsection{Device group}\label{sec:Introduction / Terminology / Device group}
> > > > > > > > +
> > > > > > > > +A device group includes one or more virtio devices.
> > > > > > > > +Each virtio device has a unique virtio device id (vdev_id) within a device group. A valid vdev_id is a 64-bit field in the range of
> > > > > > > > +0x0 - 0xFFFFFFFFFFFFFFF0. Vdev_id 0xFFFFFFFFFFFFFFFF is a value that refers to all devices in a device group and isn't a valid vdev_id.
> > > > > > > > +
> > > > > > > > +For now, the supported device groups are:
> > > > > > > > +\begin{enumerate}
> > > > > > > > +\item Type 1 - A virtio PCI SR-IOV physical function (PF) and its PCI SR-IOV virtual functions (VFs). For this group type, the PF device has vdev_id that is equal to 0
> > > > > > > > +and the VF devices have vdev_id's that are equal to their vf_number (according to the PCI SR-IOV specification).
> > > > > > > > +\end{enumerate}
> > > > > > > > +
> > > > > > > >     \section{Structure Specifications}\label{sec:Structure Specifications}
> > > > > > > In context of virtualization type 1 already refers to a specific type
> > > > > > > of hypervisor.
> > > > > > > 
> > > > > > > I suggest simply "SR-IOV type" - this way users do not need to remember
> > > > > > > special terminology.
> > > > > > This is 12 lines addition commit with simple definition.
> > > > > > 
> > > > > > I didn't mentioned hypervisors here.
> > > > > > 
> > > > > > I will stick to your suggestion and use name instead of numbers
> > > > > > (although I don't understand how can a use that knows how to read spec
> > > > > > will be confused here), but I would like Jason and Cornelia to ack on
> > > > > > this during this review cycle.
> > > > > > 
> > > > > > When we'll get 3 acks on this name - I'll update it for v6.
> > > > > So, do you want to imply some kind of numbering? I don't like "Type 1",
> > > > > either. If the type needs to be referenced in code, it should have a
> > > > > #define or such; otherwise, "SR-IOV type" would be fine.
> > > > ok I'll change it to be:
> > > > 
> > > > diff --git a/introduction.tex b/introduction.tex
> > > > index aa9ec1b..bba70a6 100644
> > > > --- a/introduction.tex
> > > > +++ b/introduction.tex
> > > > @@ -156,6 +156,18 @@ \subsection{Transition from earlier specification
> > > > drafts}\label{sec:Transition f
> > > >    sections tagged "Legacy Interface" in the section title.
> > > >    These highlight the changes made since the earlier drafts.
> > > > 
> > > > +\subsection{Device group}\label{sec:Introduction / Terminology / Device
> > > > group}
> > > > +
> > > > +A device group includes one or more virtio devices.
> > > > +Each virtio device has a unique virtio device id (vdev_id) within a
> > > > device group. A valid vdev_id is a 64-bit field in the range of
> > > > +0x0 - 0xFFFFFFFFFFFFFFF0. Vdev_id 0xFFFFFFFFFFFFFFFF is a value that
> > > > refers to all devices in a device group and isn't a valid vdev_id.
> > BTW I don't really like inventing terms with underscores.
> > Let's stick to english if we can. By the way "id" is not a word, either
> > ;) We have a couple of instances which we really should fix.
> > And "virtio device id" is confusingly similar to
> > virtio "device id" where device id is a term from pci.
> > And abbreviations really should be capitalized or end with a comma
> > but really best just avoided.
> > 
> > How about group member identifier? So
> > 
> > Each virtio device within a group has a unique member identifier.
> > 
> > 
> > 
> > > > +
> > > > +For now, the supported device groups are:
> > > > +\begin{enumerate}
> > > > +\item SR-IOV type - A virtio PCI SR-IOV physical function (PF) and its
> > > > PCI SR-IOV virtual functions (VFs). For this group type, the PF device
> > > > has vdev_id that is equal to 0
> > > > +and the VF devices have vdev_id's that are equal to their vf_number
> > > > (according to the PCI SR-IOV specification).
> > A bit better just from grammar perspective:
> > 
> >   \item SR-IOV type - the group includes a virtio PCI SR-IOV physical function (PF) and
> >   all its virtual functions (VFs). For this group type, the PF device
> >   has vdev_id of 0;
> >   each VF has a vdev_id matching it's VF number [link to SRIOV spec].
> > 
> > but note ideas on terminology above.
> 
> update to:
> 
> "
> 
> +\subsection{Device group}\label{sec:Introduction / Terminology / Device
> group}
> +
> +A device group includes one or more virtio devices.
> +Each virtio device has a unique group member identifier (group_member_id)
> within a device group. A valid group member identifier
> +is a 64-bit field in the range of 0x0 - 0xFFFFFFFFFFFFFFF0. Vdev_id

Vdev_id -> member identifier ?

> 0xFFFFFFFFFFFFFFFF is a value that refers to all devices in a
> +device group and isn't a valid group member identifier.
> +
> +For now, the supported device groups are:
> +\begin{enumerate}
> +\item SR-IOV type - this group includes a virtio PCI SR-IOV physical
> function (PF) and all its virtual functions (VFs).
> +For this group type, the PF device has group member identifier of 0. Each
> VF has a group member identifier matching it's VF number
> +(according to PCI Express Base Specification, Single Root I/O
> Virtualization and Sharing chapter).
> +\end{enumerate}
> 
> "
> 
> It's 90% suggested by MST.
> 
> > > > +\end{enumerate}
> > > > +
> > > >    \section{Structure Specifications}\label{sec:Structure Specifications}
> > > > 
> > > > MST/Jason/Cornelia,
> > > > 
> > > > can you add some Reviewed-By signatures if the above is agreed ?
> > > If I understand this correctly, the idea of "device group" is to allow
> > > different groups to be managed by a single admin virtqueue?
> > > 
> > > And I feel that mixing transport specific definitions in the general
> > > admin virtqueue might not be optimal. So I wonder whether it's better
> > > to just say this is a transport specific type. And define it in the
> > > PCI transport part.
> > > 
> > > Thanks
> > Well it's a single paragraph here, I think it's ok for now for completeness
> > sake rather than have reader chase references just to figure out an
> > example.
> > 
> > But I do agree this sentence about SRIOV type has to be repeated
> > in the pci transport section for completeness of that one.
> > 
> Please suggest exactly what to add and where to add it.
> 
> And it will be done in V6.


Maybe add here: "devices of this type use the Virtio PCI transport
(link)"

I am not sure. Maybe we want a section about SR-IOV generally,
with mentions of the feature bit and the group type.
For now I think what you have is enough.

> > 
> > > > 
> > > > 
> > > > > 
> > > > > 
> 


^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [virtio-comment] RE: [PATCH v5 2/7] Introduce admin command set
  2022-06-28 14:24                         ` Michael S. Tsirkin
@ 2022-06-29  8:43                           ` Jason Wang
  2022-06-29  9:02                             ` Michael S. Tsirkin
  0 siblings, 1 reply; 103+ messages in thread
From: Jason Wang @ 2022-06-29  8:43 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Parav Pandit, Max Gurtovoy, virtio-comment, cohuck, virtio-dev,
	Oren Duer, Shahaf Shuler, aadam, virtio

On Tue, Jun 28, 2022 at 10:24 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Thu, Jun 23, 2022 at 11:34:42AM +0800, Jason Wang wrote:
> > On Thu, Jun 23, 2022 at 10:57 AM Parav Pandit <parav@nvidia.com> wrote:
> > >
> > >
> > > > From: Jason Wang <jasowang@redhat.com>
> > > > Sent: Wednesday, June 22, 2022 10:42 PM
> > >
> > > [..]
> > > > > Are you suggesting only feature bits through register-based transport or all
> > > > commands via register-based transport?
> > > >
> > > > For feature bits, if we don't have admin virtqueue as transport,
> > > Right.
> > > Management tasks (features) done via AQ are negotiated via the AQ (regardless of device type being management dev or managed dev).
> > > It is not replacing usual feature bits negotiation that has strong relationships with device initialization sequence.
> >
> > Just to clarify,
> >
> > E.g if the admin virtqueue is implemented in PF. I suggest using the
> > existing PCI transport based feature negotiation since:
> >
> > 1) we have selector, so we don't need any new registered and we don't
> > need worry about the scalability
>
>
> I think this is a point that needs addressing.
>
> To be more specific. The concern raised was that feature bits are memory
> mapped. Accessing them on PCI requires an MMIO write. According to the
> PCI spec it is not legal for a device to defer accepting a write until
> another write is accepted (since that leads to deadlocks).
> Thus any info programmed using MMIO writes has to reside in
> on-card memory and can not be offloaded to system RAM.

Probably, but it can be offloaded in other ways. We have already used
2 for, e.g., feature_sel, and that has worked for the past 5+ years; I guess
having 4 more should work for the next 10 years. We only have this "issue"
for some specific hardware (e.g. we had a similar discussion
when somebody posted PCI endpoint device support for virtio-net).

Note that this is only an "issue" for the management device (PF);
having several more bytes of registers just for the PF doesn't seem
expensive. We know we can use the admin virtqueue as a transport for
managed devices.
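
For readers unfamiliar with the selector mechanism referred to in point 1) above, here is a minimal sketch in C. It models how one 32-bit selector plus one 32-bit data register can expose an arbitrary number of feature words, in the style of device_feature_select / device_feature in the PCI common configuration. The struct and function names are hypothetical; the only fact taken from the virtio specification is that selector value 0 picks bits 0..31 and value 1 picks bits 32..63.

```c
#include <stdint.h>

/* Hypothetical model of the device side of a selector window. */
struct feature_window {
    uint32_t select;       /* models device_feature_select */
    const uint32_t *words; /* 32-bit feature words held by the device */
    uint32_t nwords;       /* number of valid words */
};

/* Models a read of device_feature: returns the word chosen by select. */
static uint32_t feature_window_read(const struct feature_window *w)
{
    return w->select < w->nwords ? w->words[w->select] : 0;
}

/* Driver-side helper: read feature bits 0..63 through the window. */
static uint64_t read_features64(struct feature_window *w)
{
    uint32_t lo, hi;

    w->select = 0; /* selects bits 0..31 */
    lo = feature_window_read(w);
    w->select = 1; /* selects bits 32..63 */
    hi = feature_window_read(w);
    return ((uint64_t)hi << 32) | lo;
}
```

Adding more feature words only grows the range of selector values, not the number of registers, which is the scalability argument being made.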

>
> After thinking about these issues I have an idea:
>
> At the moment VQs are already never programmed before FEATURES_OK,
> features are never programmed after FEATURES_OK.
> This implies that device can actually store features in
> VQ state registers temporarily until FEATURES_OK.
> This is more than 24 bytes of memory per VQ, with at least
> one data VQ and one admin VQ, this will be sufficient for a long time.

Something like

struct virtio_pci_common_cfg {
      union {
            virtqueue_state;
            features;
      }
}

?
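
To make the overlay concrete, a minimal compilable C sketch follows. It is an illustration under stated assumptions, not a proposed layout: the type and function names are hypothetical, the three 64-bit fields are named after the ring address registers of the PCI common configuration, and the 24-byte size matches the figure mentioned above. C is used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Per-VQ ring addresses; names follow the PCI common configuration. */
struct vq_state {
    uint64_t queue_desc;
    uint64_t queue_driver;
    uint64_t queue_device;
};

/* Number of 32-bit feature words that fit in one VQ's state. */
#define STASH_WORDS (sizeof(struct vq_state) / sizeof(uint32_t))

/* The same 24 bytes hold feature words first and VQ state later. */
union vq_or_features {
    struct vq_state vq;             /* meaningful after FEATURES_OK */
    uint32_t features[STASH_WORDS]; /* meaningful before FEATURES_OK */
};

static_assert(sizeof(union vq_or_features) == sizeof(struct vq_state),
              "feature stash must not grow the register storage");

/* Record one driver feature bit; returns false if it does not fit. */
static bool stash_feature_bit(union vq_or_features *s, unsigned int bit)
{
    if (bit >= STASH_WORDS * 32)
        return false;
    s->features[bit / 32] |= UINT32_C(1) << (bit % 32);
    return true;
}
```

With this layout one VQ's state can hold 192 feature bits until FEATURES_OK, after which the storage is reused for the ring addresses.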

>
> Thus, all that we need to do is prohibit programming VQs
> before FEATURES_OK more strongly.
> I can work on that if the idea is acceptable to others.

Not sure. It looks to me that if this kind of device is popular, it might
be better to have a new PCI transport for non-register behaviour
devices.

Thanks

>
>
>

>
> --
> MST
>


^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [virtio-comment] RE: [PATCH v5 2/7] Introduce admin command set
  2022-06-29  8:43                           ` Jason Wang
@ 2022-06-29  9:02                             ` Michael S. Tsirkin
  2022-06-30  1:53                               ` Jason Wang
  0 siblings, 1 reply; 103+ messages in thread
From: Michael S. Tsirkin @ 2022-06-29  9:02 UTC (permalink / raw)
  To: Jason Wang
  Cc: Parav Pandit, Max Gurtovoy, virtio-comment, cohuck, virtio-dev,
	Oren Duer, Shahaf Shuler, aadam, virtio

On Wed, Jun 29, 2022 at 04:43:18PM +0800, Jason Wang wrote:
> On Tue, Jun 28, 2022 at 10:24 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Thu, Jun 23, 2022 at 11:34:42AM +0800, Jason Wang wrote:
> > > On Thu, Jun 23, 2022 at 10:57 AM Parav Pandit <parav@nvidia.com> wrote:
> > > >
> > > >
> > > > > From: Jason Wang <jasowang@redhat.com>
> > > > > Sent: Wednesday, June 22, 2022 10:42 PM
> > > >
> > > > [..]
> > > > > > Are you suggesting only feature bits through register-based transport or all
> > > > > commands via register-based transport?
> > > > >
> > > > > For feature bits, if we don't have admin virtqueue as transport,
> > > > Right.
> > > > Management tasks (features) done via AQ are negotiated via the AQ (regardless of device type being management dev or managed dev).
> > > > It is not replacing usual feature bits negotiation that has strong relationships with device initialization sequence.
> > >
> > > Just to clarify,
> > >
> > > E.g if the admin virtqueue is implemented in PF. I suggest using the
> > > existing PCI transport based feature negotiation since:
> > >
> > > 1) we have selector, so we don't need any new registered and we don't
> > > need worry about the scalability
> >
> >
> > I think this is a point that needs addressing.
> >
> > To be more specific. The concern raised was that feature bits are memory
> > mapped. Accessing them on PCI requires an MMIO write. According to the
> > PCI spec it is not legal for a device to defer accepting a write until
> > another write is accepted (since that leads to deadlocks).
> > Thus any info programmed using MMIO writes has to reside in
> > on-card memory and can not be offloaded to system RAM.
> 
> Probably, but it can be offloaded via other ways. We had already used
> 2 for e.g feature_sel, it works for the past 5+ years, I guess having
> 4 more should work for the future 10 years. We only have this "issue"
> for some specific hardware (e.g we used to have similar discussion
> when somebody posted pci endpoint device support for virtio-net).
> 
> Note that this is only an "issue" for the management device (PF),

The claim was that for nesting purposes we might want to add
an admin queue to the VFs as well.

> having several more bytes of register just for the PF doesn't seem
> expensive. We know we can use admin virtqueue as a transport for
> managed devices.

Did you post a prototype spec patch of this at some point? I do not recall.

> >
> > After thinking about these issues I have an idea:
> >
> > At the moment VQs are already never programmed before FEATURES_OK,
> > features are never programmed after FEATURES_OK.
> > This implies that device can actually store features in
> > VQ state registers temporarily until FEATURES_OK.
> > This is more than 24 bytes of memory per VQ, with at least
> > one data VQ and one admin VQ, this will be sufficient for a long time.
> 
> Something like
> 
> struct virtio_pci_common_cfg {
>       union {
>             virtqueue_state;
>              features;
>        }
> }
> 
> ?


Yea! Except in Verilog.

> >
> > Thus, all that we need to do is prohibit programming VQs
> > before FEATURES_OK more strongly.
> > I can work on that if the idea is acceptable to others.
> 
> Note sure, it looks to me if such kind of device is popular, it might
> be better to have new PCI transport for non-register behaviour
> devices.
> 
> Thanks
> 
> >
> >
> >
> 
> >
> > --
> > MST
> >


^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [virtio-comment] RE: [PATCH v5 2/7] Introduce admin command set
  2022-06-29  9:02                             ` Michael S. Tsirkin
@ 2022-06-30  1:53                               ` Jason Wang
  0 siblings, 0 replies; 103+ messages in thread
From: Jason Wang @ 2022-06-30  1:53 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Parav Pandit, Max Gurtovoy, virtio-comment, cohuck, virtio-dev,
	Oren Duer, Shahaf Shuler, aadam, virtio

On Wed, Jun 29, 2022 at 5:02 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Wed, Jun 29, 2022 at 04:43:18PM +0800, Jason Wang wrote:
> > On Tue, Jun 28, 2022 at 10:24 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > >
> > > On Thu, Jun 23, 2022 at 11:34:42AM +0800, Jason Wang wrote:
> > > > On Thu, Jun 23, 2022 at 10:57 AM Parav Pandit <parav@nvidia.com> wrote:
> > > > >
> > > > >
> > > > > > From: Jason Wang <jasowang@redhat.com>
> > > > > > Sent: Wednesday, June 22, 2022 10:42 PM
> > > > >
> > > > > [..]
> > > > > > > Are you suggesting only feature bits through register-based transport or all
> > > > > > commands via register-based transport?
> > > > > >
> > > > > > For feature bits, if we don't have admin virtqueue as transport,
> > > > > Right.
> > > > > Management tasks (features) done via AQ are negotiated via the AQ (regardless of device type being management dev or managed dev).
> > > > > It is not replacing usual feature bits negotiation that has strong relationships with device initialization sequence.
> > > >
> > > > Just to clarify,
> > > >
> > > > E.g if the admin virtqueue is implemented in PF. I suggest using the
> > > > existing PCI transport based feature negotiation since:
> > > >
> > > > 1) we have selector, so we don't need any new registered and we don't
> > > > need worry about the scalability
> > >
> > >
> > > I think this is a point that needs addressing.
> > >
> > > To be more specific. The concern raised was that feature bits are memory
> > > mapped. Accessing them on PCI requires an MMIO write. According to the
> > > PCI spec it is not legal for a device to defer accepting a write until
> > > another write is accepted (since that leads to deadlocks).
> > > Thus any info programmed using MMIO writes has to reside in
> > > on-card memory and can not be offloaded to system RAM.
> >
> > Probably, but it can be offloaded via other ways. We had already used
> > 2 for e.g feature_sel, it works for the past 5+ years, I guess having
> > 4 more should work for the future 10 years. We only have this "issue"
> > for some specific hardware (e.g we used to have similar discussion
> > when somebody posted pci endpoint device support for virtio-net).
> >
> > Note that this is only an "issue" for the management device (PF),
>
> The claim was that for nesting purposes we might want to add
> admin queue to the VFs as well.

So at least from the current usage (MSI-X allocation), I don't see why
an admin queue is needed for the VF.

And in the case of nesting, it would be much easier to use MMIO
registers. QEMU is free to map them to any function of the device.

>
> > having several more bytes of register just for the PF doesn't seem
> > expensive. We know we can use admin virtqueue as a transport for
> > managed devices.
>
> Did you post a prototype spec patch of this at some point? I do not recall.

I did it here.

https://lists.oasis-open.org/archives/virtio-comment/202108/msg00025.html

Thanks

>
> > >
> > > After thinking about these issues I have an idea:
> > >
> > > At the moment VQs are already never programmed before FEATURES_OK,
> > > features are never programmed after FEATURES_OK.
> > > This implies that device can actually store features in
> > > VQ state registers temporarily until FEATURES_OK.
> > > This is more than 24 bytes of memory per VQ, with at least
> > > one data VQ and one admin VQ, this will be sufficient for a long time.
> >
> > Something like
> >
> > struct virtio_pci_common_cfg {
> >       union {
> >             virtqueue_state;
> >              features;
> >        }
> > }
> >
> > ?
>
>
> Yea! Except in verilog.
>
> > >
> > > Thus, all that we need to do is prohibit programming VQs
> > > before FEATURES_OK more strongly.
> > > I can work on that if the idea is acceptable to others.
> >
> > Note sure, it looks to me if such kind of device is popular, it might
> > be better to have new PCI transport for non-register behaviour
> > devices.
> >
> > Thanks
> >
> > >
> > >
> > >
> >
> > >
> > > --
> > > MST
> > >
>


^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [PATCH v5 0/7] Introduce device group and device management
  2022-04-26 22:58 [PATCH v5 0/7] Introduce device group and device management Max Gurtovoy
                   ` (7 preceding siblings ...)
  2022-05-15 15:27 ` [PATCH v5 0/7] Introduce device group and device management Michael S. Tsirkin
@ 2022-07-05 13:56 ` Michael S. Tsirkin
  2022-07-05 15:11   ` [virtio-comment] " Parav Pandit
  8 siblings, 1 reply; 103+ messages in thread
From: Michael S. Tsirkin @ 2022-07-05 13:56 UTC (permalink / raw)
  To: Max Gurtovoy
  Cc: jasowang, virtio-comment, cohuck, virtio-dev, oren, parav,
	shahafs, aadam, virtio

On Wed, Apr 27, 2022 at 01:58:17AM +0300, Max Gurtovoy wrote:
> Hi,
> A device group definition will help extending the virtio specefication for
> various future features that require a notion of grouping devices together or
> managing devices inside a group. A device group include one or more virtio devices.
> For now, only support for type 1 device group was added.

How about posting a new version? I frankly think that, between the
hardware hack I proposed and the new admin queue transport
proposal, we do not need to work hard to save
on writeable registers, so for starters at least
we can just use feature bits instead of the query capability
command.

I feel we have agreed on the other issues.


-- 
MST


^ permalink raw reply	[flat|nested] 103+ messages in thread

* [virtio-comment] RE: [PATCH v5 0/7] Introduce device group and device management
  2022-07-05 13:56 ` Michael S. Tsirkin
@ 2022-07-05 15:11   ` Parav Pandit
  2022-07-06  2:54     ` [virtio-dev] " Jason Wang
  2022-07-06 11:00     ` Michael S. Tsirkin
  0 siblings, 2 replies; 103+ messages in thread
From: Parav Pandit @ 2022-07-05 15:11 UTC (permalink / raw)
  To: Michael S. Tsirkin, Max Gurtovoy
  Cc: jasowang, virtio-comment, cohuck, virtio-dev, Oren Duer,
	Shahaf Shuler, aadam, virtio


> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Tuesday, July 5, 2022 9:57 AM
> 
> On Wed, Apr 27, 2022 at 01:58:17AM +0300, Max Gurtovoy wrote:
> > Hi,
> > A device group definition will help extending the virtio specefication
> > for various future features that require a notion of grouping devices
> > together or managing devices inside a group. A device group include one or
> more virtio devices.
> > For now, only support for type 1 device group was added.
> 
> How about posting a new version? I frankly think between the hardware
> hack I proposed and the new admin queue transport proposal, we do not
> need to work hard to save on writeable registers and so for starters at least
> we can just use feature bits instead of the query capability command.

Having the reader and the writer use different channels is really a hack.
It still doesn't help to build efficient hw.

What is the technical reason to use feature bits, which have a strong relationship with the device initialization sequence?
The proposed bits are AQ functionality bits, not device feature bits per se.

Keeping them on the AQ allows a device to be built in a much more flexible (and frankly simpler) way.
Keeping the sw guest driver simple by reusing the existing limited 64 bits is just one part of it.

And when the AQ exists, most of the plumbing comes for free.
So I fail to see the simplicity benefit of overloading feature bits for things that have no relation to the device init sequence.


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [virtio-dev] RE: [PATCH v5 0/7] Introduce device group and device management
  2022-07-05 15:11   ` [virtio-comment] " Parav Pandit
@ 2022-07-06  2:54     ` Jason Wang
  2022-07-06 10:10       ` Michael S. Tsirkin
  2022-07-06 10:46       ` [virtio-comment] " Parav Pandit
  2022-07-06 11:00     ` Michael S. Tsirkin
  1 sibling, 2 replies; 103+ messages in thread
From: Jason Wang @ 2022-07-06  2:54 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Michael S. Tsirkin, Max Gurtovoy, virtio-comment, cohuck,
	virtio-dev, Oren Duer, Shahaf Shuler, aadam, virtio

On Tue, Jul 5, 2022 at 11:11 PM Parav Pandit <parav@nvidia.com> wrote:
>
>
> > From: Michael S. Tsirkin <mst@redhat.com>
> > Sent: Tuesday, July 5, 2022 9:57 AM
> >
> > On Wed, Apr 27, 2022 at 01:58:17AM +0300, Max Gurtovoy wrote:
> > > Hi,
> > > A device group definition will help extending the virtio specefication
> > > for various future features that require a notion of grouping devices
> > > together or managing devices inside a group. A device group include one or
> > more virtio devices.
> > > For now, only support for type 1 device group was added.
> >
> > How about posting a new version? I frankly think between the hardware
> > hack I proposed and the new admin queue transport proposal, we do not
> > need to work hard to save on writeable registers and so for starters at least
> > we can just use feature bits instead of the query capability command.
>
> Reader and writer via different channel is really a hack.
> It still doesn't help to have efficient hw.

So having 2 or 4 32-bit registers just for the PF doesn't seem expensive to
me. Another 64/128 bits should be sufficient for the next 5 or 10
years.

>
> What is the technical reason to use feature bits which has strong relationships with device initialization sequence?

Using the admin virtqueue may suffer from bootstrapping issues, especially
for features that may affect the vring itself.

E.g., we may want to have:

1) new vring layouts (similar to packed virtqueue)
2) new way to coalesce/suppress notification (similar to event index)
3) new way to notify the device (similar to notification data)

Those features can't be negotiated via the admin virtqueue.

Thanks

> The proposed bits are AQ functionality bits and not per_say device feature bits?
>
> This enables to build device in lot more flexible (and frankly simple way).
> Looking at sw guest driver being simple to use existing limited 64-bits is just one part of it.
>
> And when AQ exists, most of the plumbing comes for free.
> So I fail to see the simplicity benefit of overloading feature bits for things that has no relation to device init sequence.
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
> For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
>


^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [virtio-dev] RE: [PATCH v5 0/7] Introduce device group and device management
  2022-07-06  2:54     ` [virtio-dev] " Jason Wang
@ 2022-07-06 10:10       ` Michael S. Tsirkin
  2022-07-06 10:46       ` [virtio-comment] " Parav Pandit
  1 sibling, 0 replies; 103+ messages in thread
From: Michael S. Tsirkin @ 2022-07-06 10:10 UTC (permalink / raw)
  To: Jason Wang
  Cc: Parav Pandit, Max Gurtovoy, virtio-comment, cohuck, virtio-dev,
	Oren Duer, Shahaf Shuler, aadam, virtio

Weird, I see Jason's reply in the archives:

https://www.mail-archive.com/virtio-dev@lists.oasis-open.org/msg08603.html

but not the original message by Parav.


Parav, I will send an email to OASIS tech support; it's important
that communication is archived.




On Wed, Jul 06, 2022 at 10:54:08AM +0800, Jason Wang wrote:
> On Tue, Jul 5, 2022 at 11:11 PM Parav Pandit <parav@nvidia.com> wrote:
> >
> >
> > > From: Michael S. Tsirkin <mst@redhat.com>
> > > Sent: Tuesday, July 5, 2022 9:57 AM
> > >
> > > On Wed, Apr 27, 2022 at 01:58:17AM +0300, Max Gurtovoy wrote:
> > > > Hi,
> > > > A device group definition will help extending the virtio specefication
> > > > for various future features that require a notion of grouping devices
> > > > together or managing devices inside a group. A device group include one or
> > > more virtio devices.
> > > > For now, only support for type 1 device group was added.
> > >
> > > How about posting a new version? I frankly think between the hardware
> > > hack I proposed and the new admin queue transport proposal, we do not
> > > need to work hard to save on writeable registers and so for starters at least
> > > we can just use feature bits instead of the query capability command.
> >
> > Reader and writer via different channel is really a hack.
> > It still doesn't help to have efficient hw.
> 
> So having 2 or 4 32bit registers just for PF doesn't seem expensive to
> me. Another 64/128 bit should be sufficient for the future 5 or 10
> years.
> 
> >
> > What is the technical reason to use feature bits which has strong relationships with device initialization sequence?
> 
> Using admin virtqueue may suffer from bootstrapping issues. Especially
> the features that may affect the vring itself.
> 
> E.g we may want to have
> 
> 1) new vring layouts (similar to packed virtqueue)
> 2) new way to coalesce/suppress notification (similar to event index)
> 3) new way to notify the device (similar to notification data)
> 
> Those features can't be negotiated by admin virtqueue.
> 
> Thanks
> 
> > The proposed bits are AQ functionality bits and not per_say device feature bits?
> >
> > This enables to build device in lot more flexible (and frankly simple way).
> > Looking at sw guest driver being simple to use existing limited 64-bits is just one part of it.
> >
> > And when AQ exists, most of the plumbing comes for free.
> > So I fail to see the simplicity benefit of overloading feature bits for things that has no relation to device init sequence.
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
> > For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
> >


^ permalink raw reply	[flat|nested] 103+ messages in thread

* [virtio-comment] RE: [virtio-dev] RE: [PATCH v5 0/7] Introduce device group and device management
  2022-07-06  2:54     ` [virtio-dev] " Jason Wang
  2022-07-06 10:10       ` Michael S. Tsirkin
@ 2022-07-06 10:46       ` Parav Pandit
  1 sibling, 0 replies; 103+ messages in thread
From: Parav Pandit @ 2022-07-06 10:46 UTC (permalink / raw)
  To: Jason Wang
  Cc: Michael S. Tsirkin, Max Gurtovoy, virtio-comment, cohuck,
	virtio-dev, Oren Duer, Shahaf Shuler, aadam, virtio


> From: Jason Wang <jasowang@redhat.com>
> Sent: Tuesday, July 5, 2022 10:54 PM
> 
> On Tue, Jul 5, 2022 at 11:11 PM Parav Pandit <parav@nvidia.com> wrote:
> >
> >
> > > From: Michael S. Tsirkin <mst@redhat.com>
> > > Sent: Tuesday, July 5, 2022 9:57 AM
> > >
> > > On Wed, Apr 27, 2022 at 01:58:17AM +0300, Max Gurtovoy wrote:
> > > > Hi,
> > > > A device group definition will help extending the virtio
> > > > specefication for various future features that require a notion of
> > > > grouping devices together or managing devices inside a group. A
> > > > device group include one or
> > > more virtio devices.
> > > > For now, only support for type 1 device group was added.
> > >
> > > How about posting a new version? I frankly think between the
> > > hardware hack I proposed and the new admin queue transport proposal,
> > > we do not need to work hard to save on writeable registers and so
> > > for starters at least we can just use feature bits instead of the query
> capability command.
> >
> > Reader and writer via different channel is really a hack.
> > It still doesn't help to have efficient hw.
> 
> So having 2 or 4 32bit registers just for PF doesn't seem expensive to me.
> Another 64/128 bit should be sufficient for the future 5 or 10 years.
> 
> >
> > What is the technical reason to use feature bits which has strong
> relationships with device initialization sequence?
> 
> Using admin virtqueue may suffer from bootstrapping issues. Especially the
> features that may affect the vring itself.
> 
> E.g we may want to have
> 
> 1) new vring layouts (similar to packed virtqueue)
> 2) new way to coalesce/suppress notification (similar to event index)
> 3) new way to notify the device (similar to notification data)
> 
> Those features can't be negotiated by admin virtqueue.
Sure. There is no motivation to negotiate them via the AQ either.
My ask is below.
a. Functionality done over the AQ, i.e. things that have no relation to the device init sequence, is negotiated via the AQ.
b. Functionality of the device that has a strong relation to the device init sequence stays as it is today, via feature bits.
For example, the cases you described above.

> 
> Thanks
> 
> > The proposed bits are AQ functionality bits and not per_say device feature
> bits?
> >
> > This enables to build device in lot more flexible (and frankly simple way).
> > Looking at sw guest driver being simple to use existing limited 64-bits is just
> one part of it.
> >
> > And when AQ exists, most of the plumbing comes for free.
> > So I fail to see the simplicity benefit of overloading feature bits for things
> that has no relation to device init sequence.
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
> > For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
> >


^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [virtio-dev] RE: [PATCH v5 0/7] Introduce device group and device management
  2022-07-05 15:11   ` [virtio-comment] " Parav Pandit
  2022-07-06  2:54     ` [virtio-dev] " Jason Wang
@ 2022-07-06 11:00     ` Michael S. Tsirkin
  2022-07-06 20:45       ` Parav Pandit
  1 sibling, 1 reply; 103+ messages in thread
From: Michael S. Tsirkin @ 2022-07-06 11:00 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Max Gurtovoy, jasowang, virtio-comment, cohuck, virtio-dev,
	Oren Duer, Shahaf Shuler, aadam, virtio

On Tue, Jul 05, 2022 at 03:11:43PM +0000, Parav Pandit wrote:
> 
> > From: Michael S. Tsirkin <mst@redhat.com>
> > Sent: Tuesday, July 5, 2022 9:57 AM
> > 
> > On Wed, Apr 27, 2022 at 01:58:17AM +0300, Max Gurtovoy wrote:
> > > Hi,
> > > A device group definition will help extending the virtio specefication
> > > for various future features that require a notion of grouping devices
> > > together or managing devices inside a group. A device group include one or
> > more virtio devices.
> > > For now, only support for type 1 device group was added.
> > 
> > How about posting a new version? I frankly think between the hardware
> > hack I proposed and the new admin queue transport proposal, we do not
> > need to work hard to save on writeable registers and so for starters at least
> > we can just use feature bits instead of the query capability command.
> 
> Reader and writer via different channel is really a hack.
> It still doesn't help to have efficient hw.
> 
> What is the technical reason to use feature bits which has strong relationships with device initialization sequence?

The reason is that it's a reasonably well understood mechanism, already
addressing questions like lifecycle, dependencies, extensibility.  For
example, one might reasonably ask what happens if the driver uses commands
to change the number of MSI-X vectors, then disables the capability bit
related to the relevant commands? Should the device forget the allocated
vectors? There's no longer a way to change the allocation...
If the set of allowed commands is locked in place until reset then
it's clearer.
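
To illustrate the lifecycle point, here is a minimal sketch of a device whose
set of allowed admin commands is negotiated once and then locked until reset.
Everything in it is hypothetical and for illustration only: the class
AdminDevice, the capability name "MSIX_CONFIG" and the method names are not
taken from the patch set. The sketch also assumes that a reset drops the
vector allocation, which is exactly the kind of question this thread has not
settled.

```python
class AdminDevice:
    """Toy model: allowed admin commands are fixed at negotiation time."""

    def __init__(self, offered):
        self.offered = frozenset(offered)  # capabilities the device offers
        self.negotiated = frozenset()      # capabilities the driver accepted
        self.locked = False                # True once negotiation has finished
        self.msix_vectors = {}             # group member -> vector count

    def negotiate(self, wanted):
        # Like feature negotiation: one shot, then frozen until reset.
        if self.locked:
            raise RuntimeError("negotiation is locked until reset")
        self.negotiated = self.offered & frozenset(wanted)
        self.locked = True

    def set_msix_vectors(self, member, count):
        # "MSIX_CONFIG" is a made-up capability name for this sketch.
        if "MSIX_CONFIG" not in self.negotiated:
            raise PermissionError("command was not negotiated")
        self.msix_vectors[member] = count

    def reset(self):
        # Assumption: reset forgets both the negotiation and the allocation.
        self.negotiated = frozenset()
        self.locked = False
        self.msix_vectors.clear()
```

Because the capability cannot be withdrawn while the device is running, the
"what happens to the allocated vectors" question only arises at reset.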

Again I am sure the specific question can be answered. But I would
really just like to see the admin queue proposal get finalized as
quickly as possible; this means making compromises to reuse as much of
existing functionality as possible. And we can add extensions down the
road.

> The proposed bits are AQ functionality bits and not per_say device feature bits?

Could not parse this sentence.

> This enables to build device in lot more flexible (and frankly simple way).
> Looking at sw guest driver being simple to use existing limited 64-bits is just one part of it.

I'm not sure what this refers to. What was suggested is using,
e.g., feature bits 64 to 127, not stealing some of the 64 already in use.
All existing transports support doing that without adding
any new transport specific interfaces.
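
As a rough sketch of why no new interface is needed: transports expose
feature bits through 32-bit windows chosen by a selector (on PCI, the
device_feature_select field of the common configuration), so bits 64 to 127
are simply windows 2 and 3. The helper names below are hypothetical, and the
list of windows stands in for whatever the transport would actually read.

```python
FEATURE_WINDOW_BITS = 32  # each selector value exposes 32 feature bits


def feature_window(bit):
    """Return (selector, offset) of the 32-bit window holding a feature bit."""
    if bit < 0:
        raise ValueError("feature bit index must be non-negative")
    return divmod(bit, FEATURE_WINDOW_BITS)


def read_feature_bit(windows, bit):
    """Test a feature bit given the 32-bit words read for each selector.

    Windows the device does not provide are treated as all zeroes.
    """
    select, offset = feature_window(bit)
    word = windows[select] if select < len(windows) else 0
    return bool((word >> offset) & 1)
```

With this mapping, bit 64 is offset 0 of window 2 and bit 127 is offset 31 of
window 3.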

Yes, adding commands to tweak features might be beneficial.  I just
suggest splitting that out simply because this might or might not make
sense for all features.  For example, a related common request is
modifying feature bits and generally some config space on the device
side to facilitate migration across devices with varying features.
Should that be included in the proposal then?

> And when AQ exists, most of the plumbing comes for free.
> So I fail to see the simplicity benefit of overloading feature bits for things that has no relation to device init sequence.

Well, most feature bits are not *directly* tied to the device init sequence.
In this case, one can argue that it's at least related. For example,
one can imagine that whether the number of MSI vectors is
controllable is relevant to the initialization of the PF driver.

The simplicity just comes from reusing an existing mechanism so people
do not need to learn a new one.

Again I don't know what to do with this. I feel if it's put up for vote
in the current form it's likely to fail. I propose cutting out as much
as possible as a first step, so we can make progress. Specifically the
MSI-X commands are clearly PF specific so there's no concern about VF
memory use at all.  We can worry about other types of command down the
road when it becomes relevant.

-- 
MST


^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [virtio-comment] Re: [PATCH v5 1/7] Introduce device group
  2022-06-28 18:54                 ` Michael S. Tsirkin
@ 2022-07-06 11:25                   ` Max Gurtovoy
  2022-07-06 11:42                     ` Michael S. Tsirkin
  0 siblings, 1 reply; 103+ messages in thread
From: Max Gurtovoy @ 2022-07-06 11:25 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Jason Wang, Cornelia Huck, virtio-comment, Virtio-Dev, Oren Duer,
	Parav Pandit, Shahaf Shuler, Ariel Adam, virtio


On 6/28/2022 9:54 PM, Michael S. Tsirkin wrote:
> On Tue, Jun 28, 2022 at 12:52:51AM +0300, Max Gurtovoy wrote:
>> On 6/2/2022 9:59 AM, Michael S. Tsirkin wrote:
>>> On Thu, Jun 02, 2022 at 10:21:23AM +0800, Jason Wang wrote:
>>>> On Wed, Jun 1, 2022 at 9:44 PM Max Gurtovoy <mgurtovoy@nvidia.com> wrote:
>>>>> On 5/18/2022 4:32 PM, Cornelia Huck wrote:
>>>>>> On Wed, May 18 2022, Max Gurtovoy <mgurtovoy@nvidia.com> wrote:
>>>>>>
>>>>>>> Hi MST,
>>>>>>>
>>>>>>> On 5/15/2022 6:25 PM, Michael S. Tsirkin wrote:
>>>>>>>> On Wed, Apr 27, 2022 at 01:58:18AM +0300, Max Gurtovoy wrote:
>>>>>>>>> +\subsection{Device group}\label{sec:Introduction / Terminology / Device group}
>>>>>>>>> +
>>>>>>>>> +A device group includes one or more virtio devices.
>>>>>>>>> +Each virtio device has a unique virtio device id (vdev_id) within a device group. A valid vdev_id is a 64-bit field in the range of
>>>>>>>>> +0x0 - 0xFFFFFFFFFFFFFFF0. Vdev_id 0xFFFFFFFFFFFFFFFF is a value that refers to all devices in a device group and isn't a valid vdev_id.
>>>>>>>>> +
>>>>>>>>> +For now, the supported device groups are:
>>>>>>>>> +\begin{enumerate}
>>>>>>>>> +\item Type 1 - A virtio PCI SR-IOV physical function (PF) and its PCI SR-IOV virtual functions (VFs). For this group type, the PF device has vdev_id that is equal to 0
>>>>>>>>> +and the VF devices have vdev_id's that are equal to their vf_number (according to the PCI SR-IOV specification).
>>>>>>>>> +\end{enumerate}
>>>>>>>>> +
>>>>>>>>>      \section{Structure Specifications}\label{sec:Structure Specifications}
>>>>>>>> In context of virtualization type 1 already refers to a specific type
>>>>>>>> of hypervisor.
>>>>>>>>
>>>>>>>> I suggest simply "SR-IOV type" - this way users do not need to remember
>>>>>>>> special terminology.
>>>>>>> This is 12 lines addition commit with simple definition.
>>>>>>>
>>>>>>> I didn't mentioned hypervisors here.
>>>>>>>
>>>>>>> I will stick to your suggestion and use name instead of numbers
>>>>>>> (although I don't understand how can a use that knows how to read spec
>>>>>>> will be confused here), but I would like Jason and Cornelia to ack on
>>>>>>> this during this review cycle.
>>>>>>>
>>>>>>> When we'll get 3 acks on this name - I'll update it for v6.
>>>>>> So, do you want to imply some kind of numbering? I don't like "Type 1",
>>>>>> either. If the type needs to be referenced in code, it should have a
>>>>>> #define or such; otherwise, "SR-IOV type" would be fine.
>>>>> ok I'll change it to be:
>>>>>
>>>>> diff --git a/introduction.tex b/introduction.tex
>>>>> index aa9ec1b..bba70a6 100644
>>>>> --- a/introduction.tex
>>>>> +++ b/introduction.tex
>>>>> @@ -156,6 +156,18 @@ \subsection{Transition from earlier specification
>>>>> drafts}\label{sec:Transition f
>>>>>     sections tagged "Legacy Interface" in the section title.
>>>>>     These highlight the changes made since the earlier drafts.
>>>>>
>>>>> +\subsection{Device group}\label{sec:Introduction / Terminology / Device
>>>>> group}
>>>>> +
>>>>> +A device group includes one or more virtio devices.
>>>>> +Each virtio device has a unique virtio device id (vdev_id) within a
>>>>> device group. A valid vdev_id is a 64-bit field in the range of
>>>>> +0x0 - 0xFFFFFFFFFFFFFFF0. Vdev_id 0xFFFFFFFFFFFFFFFF is a value that
>>>>> refers to all devices in a device group and isn't a valid vdev_id.
>>> BTW I don't really like inventing terms with underscores.
>>> Let's stick to english if we can. By the way "id" is not a word, either
>>> ;) We have a couple of instances which we really should fix.
>>> And "virtio device id" is confusingly similar to
>>> virtio "device id" where device id is a term from pci.
>>> And abbreviations really should be capitalized or end with a comma
>>> but really best just avoided.
>>>
>>> How about group member identifier? So
>>>
>>> Each virtio device within a group has a unique member identifier.
>>>
>>>
>>>
>>>>> +
>>>>> +For now, the supported device groups are:
>>>>> +\begin{enumerate}
>>>>> +\item SR-IOV type - A virtio PCI SR-IOV physical function (PF) and its
>>>>> PCI SR-IOV virtual functions (VFs). For this group type, the PF device
>>>>> has vdev_id that is equal to 0
>>>>> +and the VF devices have vdev_id's that are equal to their vf_number
>>>>> (according to the PCI SR-IOV specification).
>>> A bit better just from grammar perspective:
>>>
>>>    \item SR-IOV type - the group includes a virtio PCI SR-IOV physical function (PF) and
>>>    all its virtual functions (VFs). For this group type, the PF device
>>>    has vdev_id of 0;
>>>    each VF has a vdev_id matching it's VF number [link to SRIOV spec].
>>>
>>> but note ideas on terminology above.
>> update to:
>>
>> "
>>
>> +\subsection{Device group}\label{sec:Introduction / Terminology / Device
>> group}
>> +
>> +A device group includes one or more virtio devices.
>> +Each virtio device has a unique group member identifier (group_member_id)
>> within a device group. A valid group member identifier
>> +is a 64-bit field in the range of 0x0 - 0xFFFFFFFFFFFFFFF0. Vdev_id
> Vdev_id -> member identifier ?

Yea thanks.

>
>> 0xFFFFFFFFFFFFFFFF is a value that refers to all devices in a
>> +device group and isn't a valid group member identifier.
>> +
>> +For now, the supported device groups are:
>> +\begin{enumerate}
>> +\item SR-IOV type - this group includes a virtio PCI SR-IOV physical
>> function (PF) and all its virtual functions (VFs).
>> +For this group type, the PF device has group member identifier of 0. Each
>> VF has a group member identifier matching it's VF number
>> +(according to PCI Express Base Specification, Single Root I/O
>> Virtualization and Sharing chapter).
>> +\end{enumerate}
>>
>> "
>>
>> It's 90% suggested by MST.
>>
>>>>> +\end{enumerate}
>>>>> +
>>>>>     \section{Structure Specifications}\label{sec:Structure Specifications}
>>>>>
>>>>> MST/Jason/Cornelia,
>>>>>
>>>>> can you add some Reviewed-By signatures if the above is agreed ?
>>>> If I understand this correctly, the idea of "device group" is to allow
>>>> different groups to be managed by a single admin virtqueue?
>>>>
>>>> And I feel that mixing transport specific definitions in the general
>>>> admin virtqueue might not be optimal. So I wonder whether it's better
>>>> to just say this is a transport specific type. And define it in the
>>>> PCI transport part.
>>>>
>>>> Thanks
>>> Well it's a single paragraph here, I think it's ok for now for completeness
>>> sake rather than have reader chase references just to figure out an
>>> example.
>>>
>>> But I do agree this sentence about SRIOV type has to be repeated
>>> in the pci transport section for completeness of that one.
>>>
>> Please suggest exactly what to add and where to add it.
>>
>> And it will be done in V6.
>
> maybe add here "devices of this type use the Virtio PCI transport
> (link)"
>
> I am not sure. Maybe we want a section about SR-IOV generally,
> with mentions of the feature bit and the group type.
> For now I think what you have is enough.

Ok.

So for now we won't add this.
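
For reference, the group member identifier rules in the updated text quoted
above can be sketched roughly as follows. The constant and function names are
hypothetical. The sketch assumes VF numbers start at 1, as in the PCI SR-IOV
numbering, and it deliberately says nothing about the values between
0xFFFFFFFFFFFFFFF0 and 0xFFFFFFFFFFFFFFFF, since the text does not define them.

```python
MAX_MEMBER_ID = 0xFFFFFFFFFFFFFFF0  # upper bound of the valid range
ALL_MEMBERS = 0xFFFFFFFFFFFFFFFF    # refers to all devices in the group


def is_valid_member_id(value):
    """A valid group member identifier lies in 0x0 - 0xFFFFFFFFFFFFFFF0."""
    return 0 <= value <= MAX_MEMBER_ID


def refers_to_all_members(value):
    """The all-ones value addresses the whole group and is not a valid id."""
    return value == ALL_MEMBERS


def sriov_member_id(vf_number=None):
    """SR-IOV type group: the PF is member 0, each VF uses its VF number."""
    if vf_number is None:
        return 0  # the PF itself
    if vf_number < 1:
        # Assumption: VF numbering starts at 1.
        raise ValueError("VF numbers start at 1")
    if not is_valid_member_id(vf_number):
        raise ValueError("VF number outside the valid member id range")
    return vf_number
```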



^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [virtio-comment] Re: [PATCH v5 1/7] Introduce device group
  2022-07-06 11:25                   ` Max Gurtovoy
@ 2022-07-06 11:42                     ` Michael S. Tsirkin
  2022-07-06 12:01                       ` Max Gurtovoy
  0 siblings, 1 reply; 103+ messages in thread
From: Michael S. Tsirkin @ 2022-07-06 11:42 UTC (permalink / raw)
  To: Max Gurtovoy
  Cc: Jason Wang, Cornelia Huck, virtio-comment, Virtio-Dev, Oren Duer,
	Parav Pandit, Shahaf Shuler, Ariel Adam, virtio

On Wed, Jul 06, 2022 at 02:25:52PM +0300, Max Gurtovoy wrote:
> > maybe add here "devices of this type use the Virtio PCI transport
> > (link)"
> > 
> > I am not sure. Maybe we want a section about SR-IOV generally,
> > with mentions of the feature bit and the group type.
> > For now I think what you have is enough.
> 
> Ok.
> 
> So for now we won't  add this.

Well adding a link can't hurt.

-- 
MST


^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [virtio-comment] Re: [PATCH v5 1/7] Introduce device group
  2022-07-06 11:42                     ` Michael S. Tsirkin
@ 2022-07-06 12:01                       ` Max Gurtovoy
  2022-07-06 12:23                         ` Michael S. Tsirkin
  0 siblings, 1 reply; 103+ messages in thread
From: Max Gurtovoy @ 2022-07-06 12:01 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Jason Wang, Cornelia Huck, virtio-comment, Virtio-Dev, Oren Duer,
	Parav Pandit, Shahaf Shuler, Ariel Adam, virtio


On 7/6/2022 2:42 PM, Michael S. Tsirkin wrote:
> On Wed, Jul 06, 2022 at 02:25:52PM +0300, Max Gurtovoy wrote:
>>> maybe add here "devices of this type use the Virtio PCI transport
>>> (link)"
>>>
>>> I am not sure. Maybe we want a section about SR-IOV generally,
>>> with mentions of the feature bit and the group type.
>>> For now I think what you have is enough.
>> Ok.
>>
>> So for now we won't  add this.
> Well adding a link can't hurt.

Ok.

Can you review the patch below, please? A Reviewed-by would be great.

diff --git a/introduction.tex b/introduction.tex
index aa9ec1b..c9ca978 100644
--- a/introduction.tex
+++ b/introduction.tex
@@ -156,6 +156,21 @@ \subsection{Transition from earlier specification drafts}\label{sec:Transition f
  sections tagged "Legacy Interface" in the section title.
  These highlight the changes made since the earlier drafts.

+\subsection{Device group}\label{sec:Introduction / Terminology / Device group}
+
+A device group includes one or more virtio devices.
+Each virtio device has a unique group member identifier (group_member_id) within a device group. A valid group member identifier
+is a 64-bit field in the range of 0x0 - 0xFFFFFFFFFFFFFFF0. Member identifier 0xFFFFFFFFFFFFFFFF is a value that refers to all devices in a
+device group and isn't a valid group member identifier.
+
+For now, the supported device groups are:
+\begin{enumerate}
+\item SR-IOV type - this group includes a virtio PCI SR-IOV physical function (PF) and all its virtual functions (VFs).
+For this group type, the PF device has group member identifier of 0. Each VF has a group member identifier matching its VF number
+(according to PCI Express Base Specification, Single Root I/O Virtualization and Sharing chapter). Devices that are members in this group use
+the Virtio PCI transport (for more details see \ref{sec:Virtio Transport Options / Virtio Over PCI Bus}).
+\end{enumerate}
+
  \section{Structure Specifications}\label{sec:Structure Specifications}

  Many device and driver in-memory structure layouts are documented using



This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply related	[flat|nested] 103+ messages in thread

* Re: [virtio-comment] Re: [PATCH v5 1/7] Introduce device group
  2022-07-06 12:01                       ` Max Gurtovoy
@ 2022-07-06 12:23                         ` Michael S. Tsirkin
  2022-07-06 15:18                           ` Max Gurtovoy
  0 siblings, 1 reply; 103+ messages in thread
From: Michael S. Tsirkin @ 2022-07-06 12:23 UTC (permalink / raw)
  To: Max Gurtovoy
  Cc: Jason Wang, Cornelia Huck, virtio-comment, Virtio-Dev, Oren Duer,
	Parav Pandit, Shahaf Shuler, Ariel Adam, virtio

On Wed, Jul 06, 2022 at 03:01:14PM +0300, Max Gurtovoy wrote:
> 
> On 7/6/2022 2:42 PM, Michael S. Tsirkin wrote:
> > On Wed, Jul 06, 2022 at 02:25:52PM +0300, Max Gurtovoy wrote:
> > > > maybe add here "devices of this type use the Virtio PCI transport
> > > > (link)"
> > > > 
> > > > I am not sure. Maybe we want a section about SR-IOV generally,
> > > > with mentions of the feature bit and the group type.
> > > > For now I think what you have is enough.
> > > Ok.
> > > 
> > > So for now we won't  add this.
> > Well adding a link can't hurt.
> 
> Ok.
> 
> Can you review the bellow please ? reviewed-by will be great..
> 
> diff --git a/introduction.tex b/introduction.tex
> index aa9ec1b..c9ca978 100644
> --- a/introduction.tex
> +++ b/introduction.tex
> @@ -156,6 +156,21 @@ \subsection{Transition from earlier specification
> drafts}\label{sec:Transition f
>  sections tagged "Legacy Interface" in the section title.
>  These highlight the changes made since the earlier drafts.
> 
> +\subsection{Device group}\label{sec:Introduction / Terminology / Device
> group}
> +
> +A device group includes one or more virtio devices.
> +Each virtio device has a unique group member identifier (group_member_id)
> within a device group. A valid group member identifier
> +is a 64-bit field

field->value

> in the range of 0x0 - 0xFFFFFFFFFFFFFFF0. Member
> identifier 0xFFFFFFFFFFFFFFFF is a value that

The value 0xFFFFFFFFFFFFFFFF

> refers to all devices in a
> +device group and isn't a valid group member identifier.
> +
> +For now, the supported device groups are:
> +\begin{enumerate}
> +\item SR-IOV type - this group includes a virtio PCI SR-IOV physical
> function (PF) and all its virtual functions (VFs).
> +For this group type, the PF device has group member identifier of 0. Each
> VF has a group member identifier matching it's VF number
> +(according to PCI Express Base Specification, Single Root I/O
> Virtualization and Sharing chapter). Devices that are members in this group
> use
> +the Virtio PCI transport (for more details see \ref{sec:Virtio Transport
> Options / Virtio Over PCI Bus}).
> +\end{enumerate}
> +
>  \section{Structure Specifications}\label{sec:Structure Specifications}
> 
>  Many device and driver in-memory structure layouts are documented using



^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [virtio-comment] Re: [PATCH v5 1/7] Introduce device group
  2022-07-06 12:23                         ` Michael S. Tsirkin
@ 2022-07-06 15:18                           ` Max Gurtovoy
  0 siblings, 0 replies; 103+ messages in thread
From: Max Gurtovoy @ 2022-07-06 15:18 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Jason Wang, Cornelia Huck, virtio-comment, Virtio-Dev, Oren Duer,
	Parav Pandit, Shahaf Shuler, Ariel Adam, virtio

OK, updated with all the comments.

See below:

diff --git a/introduction.tex b/introduction.tex
index aa9ec1b..6075f83 100644
--- a/introduction.tex
+++ b/introduction.tex
@@ -156,6 +156,20 @@ \subsection{Transition from earlier specification drafts}\label{sec:Transition f
  sections tagged "Legacy Interface" in the section title.
  These highlight the changes made since the earlier drafts.

+\subsection{Device group}\label{sec:Introduction / Terminology / Device group}
+
+A device group includes one or more virtio devices.
+Each virtio device has a unique group member identifier (group_member_id) within a device group. A valid group member identifier
+is a 64-bit value in the range of 0x0 - 0xFFFFFFFFFFFFFFF0. The value 0xFFFFFFFFFFFFFFFF refers to all devices in a device group and isn't a valid group member identifier.
+
+For now, the supported device groups are:
+\begin{enumerate}
+\item SR-IOV type - this group includes a virtio PCI SR-IOV physical function (PF) and all its virtual functions (VFs).
+For this group type, the PF device has group member identifier of 0. Each VF has a group member identifier matching its VF number
+(according to PCI Express Base Specification, Single Root I/O Virtualization and Sharing chapter). Devices that are members in this group use
+the Virtio PCI transport (for more details see \ref{sec:Virtio Transport Options / Virtio Over PCI Bus}).
+\end{enumerate}
+
  \section{Structure Specifications}\label{sec:Structure Specifications}

  Many device and driver in-memory structure layouts are documented using
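
The identifier rules in the patch above can be sketched in a few lines of Python. This is only an illustration of the proposed text, not code from any driver or from the specification: the function and constant names are hypothetical, and the assumption that SR-IOV VF numbers start at 1 comes from my reading of the PCI Express SR-IOV chapter rather than from the patch itself.

```python
from typing import Optional

# Values taken from the proposed spec text; the names are hypothetical.
MAX_VALID_GROUP_MEMBER_ID = 0xFFFFFFFFFFFFFFF0  # top of the valid range
ALL_GROUP_MEMBERS = 0xFFFFFFFFFFFFFFFF          # addresses every device in the group
SRIOV_PF_MEMBER_ID = 0                          # the PF is always member 0


def is_valid_group_member_id(member_id: int) -> bool:
    """True if member_id names a single device inside a device group."""
    return 0 <= member_id <= MAX_VALID_GROUP_MEMBER_ID


def sriov_member_id(vf_number: Optional[int] = None) -> int:
    """Map a function of an SR-IOV group to its group member identifier.

    None selects the PF; otherwise the identifier equals the VF number
    (assumed here to be 1-based).
    """
    if vf_number is None:
        return SRIOV_PF_MEMBER_ID
    if vf_number < 1 or not is_valid_group_member_id(vf_number):
        raise ValueError(f"invalid VF number: {vf_number}")
    return vf_number


def describe_target(member_id: int) -> str:
    """Classify an identifier as used by a command addressed to the group."""
    if member_id == ALL_GROUP_MEMBERS:
        return "all devices in the group"
    if not is_valid_group_member_id(member_id):
        raise ValueError(f"not a valid group member identifier: {member_id:#x}")
    return "PF" if member_id == SRIOV_PF_MEMBER_ID else f"VF {member_id}"


print(describe_target(sriov_member_id()))    # the PF
print(describe_target(sriov_member_id(3)))   # the third VF
print(describe_target(ALL_GROUP_MEMBERS))    # broadcast value
```

Note that the values between 0xFFFFFFFFFFFFFFF1 and 0xFFFFFFFFFFFFFFFE are neither valid member identifiers nor the "all devices" value under the proposed wording, so the sketch rejects them.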




^ permalink raw reply related	[flat|nested] 103+ messages in thread

* RE: [virtio-dev] RE: [PATCH v5 0/7] Introduce device group and device management
  2022-07-06 11:00     ` Michael S. Tsirkin
@ 2022-07-06 20:45       ` Parav Pandit
  2022-07-24 21:09         ` Michael S. Tsirkin
  0 siblings, 1 reply; 103+ messages in thread
From: Parav Pandit @ 2022-07-06 20:45 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Max Gurtovoy, jasowang, virtio-comment, cohuck, virtio-dev,
	Oren Duer, Shahaf Shuler, aadam, virtio


> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Wednesday, July 6, 2022 7:00 AM

> > Reader and writer via different channel is really a hack.
> > It still doesn't help to have efficient hw.
> >
> > What is the technical reason to use feature bits which has strong relationships
> with device initialization sequence?
> 
> The reason is it's a reasonably well understood mechanism, already addressing
> questions like lifecycle, dependencies, extensibility.  For example, one might
> reasonably ask what happens if driver uses commands to change number of
> MSI-X vectors, then disables the capability bit related to relevant commands?
> Should device forget the allocated vectors? There's no longer a way to change
> the allocation...
> If the set of allowed commands are locked in place until reset then it's clearer.
> 
I can't think of a real use case in which a driver would do something like this.
But say it is done: it means that the driver is no longer interested in assigning MSI-X resources via the AQ.
The PF driver did something nasty where it should have behaved well.
All configuration already done stays as-is.

The first line of the virtio spec introducing the virtqueue is: "The mechanism for bulk data transport on virtio devices is pretentiously called a virtqueue".

This is no different from doing a queue_reset on the AQ: what happens to the resource configuration done via the AQ before the reset?
What happens to packets and IO sent via virtio before the queue reset?
If a virtqueue is destroyed and recreated, things that happened before cannot be reverted.
The same behavior applies when capabilities are switched off.
The device won't allow any new configuration.
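
The behavior described above (configuration already applied is kept, new configuration is refused once the capability is switched off) can be modelled with a small toy class. Everything here is hypothetical: the class, its method names and the use of PermissionError are illustration only and do not correspond to any command or structure in the proposal.

```python
class AdminQueueModel:
    """Toy model of a PF that assigns MSI-X vectors to its VFs via an AQ."""

    def __init__(self) -> None:
        self.msix_capability_enabled = True
        self.vf_msix_vectors = {}  # VF number -> number of assigned vectors

    def set_vf_msix_vectors(self, vf_number: int, count: int) -> None:
        """Accept new configuration only while the capability is enabled."""
        if not self.msix_capability_enabled:
            raise PermissionError("MSI-X admin commands are disabled")
        self.vf_msix_vectors[vf_number] = count

    def disable_msix_capability(self) -> None:
        """Switch the capability off; earlier configuration is left untouched."""
        self.msix_capability_enabled = False


pf = AdminQueueModel()
pf.set_vf_msix_vectors(vf_number=1, count=8)
pf.disable_msix_capability()

# The earlier allocation survives the capability being switched off.
print(pf.vf_msix_vectors)

# A further change is rejected.
try:
    pf.set_vf_msix_vectors(vf_number=1, count=16)
except PermissionError as err:
    print("rejected:", err)
```

Whether a device should instead forget the allocation, or lock the command set until reset, is exactly the open question in this part of the thread; the sketch only shows one of the candidate answers.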

This is a matter of wording.

This is important; I will add this text in v6.

> Again I am sure the specific question can be answered. But I would really just like
> to see the admin queue proposal get finalized as quickly as possible, this means
> making compromizes to reuse as much of existing functionality as possible. And
> we can add extensions down the road.
> 
We can add a new capability command later, but there is no point in duplicating the feature bits at that point.

> > The proposed bits are AQ functionality bits and not per_say device feature
> bits?
> 
> Could not parse this sentence.
> 
Meaning: the negotiated capabilities describe which kinds of configuration will be done via the AQ.

> > This enables to build device in lot more flexible (and frankly simple way).
> > Looking at sw guest driver being simple to use existing limited 64-bits is just
> one part of it.
> 
> I'm not sure what does this refers to. What was suggested using e.g. feature bits
> 64 to 127, not stealing some of the 64 ones in use.
> All existing transports support doing that without adding any new transport
> specific interfaces.
> 
> Yes adding commands to tweak features might be benefitial.  I just suggest
> splitting that out simply because this might or might not make sense for all
> features.  For example, a related common request is modifying feature bits and
> generally some config space on the device side to facilitate migration across
> devices with varying features.
> Should that be included in the proposal then?
> 
> > And when AQ exists, most of the plumbing comes for free.
> > So I fail to see the simplicity benefit of overloading feature bits for things that
> has no relation to device init sequence.
> 
> Well most of feature bits are not *directly* tied to device init sequence.
Sure, but that was in the past. When feature bits unrelated to the device init sequence were introduced, there was no AQ in the same spec update.
So it might have been fine then.

> 
> The similicity just comes from reusing an existing mechanism so people do not
> need to learn a new one.
> 
I think device efficiency weighs more than having the driver code learn about AQ-specific bits in the generic device init sequence.

> Again I don't know what to do with this. I feel if it's put up for vote in the current
> form it's likely to fail. I propose cutting out as much as possible as a first step, so
> we can make progress. Specifically the MSI-X commands are clearly PF specific
> so there's no concern about VF memory use at all.  We can worry about other
> types of command down the road when it becomes relevant.

Since the proposed feature bits are limited to the PF, I agree that the current shortcut of placing them in feature bits 64-127 is fine.

When/if similar functionality is needed at scale for VF or SIOV devices, the argument for placing them in the 64-127 feature bit range for the sake of "people's familiarity with feature bits" carries much less weight.
Do you agree?
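
For readers unfamiliar with how feature bits above 63 are reached: in the virtio PCI transport, as I understand it, the driver reads device features through a 32-bit window chosen by device_feature_select, so bits 64-127 need no new transport interface. The sketch below only shows the index arithmetic; the function names are hypothetical and the read_window callback stands in for real register access.

```python
from typing import Callable, Tuple

WINDOW_BITS = 32  # width of the device_feature register window


def feature_bit_location(bit: int) -> Tuple[int, int]:
    """Return (select value, mask inside the 32-bit window) for a feature bit."""
    if bit < 0:
        raise ValueError("feature bit must be non-negative")
    return bit // WINDOW_BITS, 1 << (bit % WINDOW_BITS)


def device_offers(bit: int, read_window: Callable[[int], int]) -> bool:
    """Check one feature bit; read_window(select) returns that 32-bit window."""
    select, mask = feature_bit_location(bit)
    return bool(read_window(select) & mask)


# A fake device offering only feature bit 64 (hypothetical placement).
fake_windows = {2: 0x00000001}
read = lambda select: fake_windows.get(select, 0)

print(feature_bit_location(64))   # first bit of the 64-127 range
print(feature_bit_location(127))  # last bit of the 64-127 range
print(device_offers(64, read), device_offers(65, read))
```

Bits 64-127 land in select values 2 and 3, which is why the suggestion does not consume any of the first 64 bits already in use.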


^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [virtio-dev] RE: [PATCH v5 0/7] Introduce device group and device management
  2022-07-06 20:45       ` Parav Pandit
@ 2022-07-24 21:09         ` Michael S. Tsirkin
  2022-07-24 21:25           ` [virtio-comment] " Parav Pandit
  0 siblings, 1 reply; 103+ messages in thread
From: Michael S. Tsirkin @ 2022-07-24 21:09 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Max Gurtovoy, jasowang, virtio-comment, cohuck, virtio-dev,
	Oren Duer, Shahaf Shuler, aadam, virtio

I snipped the rest of the mail - wasn't sure whether you were waiting for an
answer.

On Wed, Jul 06, 2022 at 08:45:26PM +0000, Parav Pandit wrote:
> > Again I don't know what to do with this. I feel if it's put up for vote in the current
> > form it's likely to fail. I propose cutting out as much as possible as a first step, so
> > we can make progress. Specifically the MSI-X commands are clearly PF specific
> > so there's no concern about VF memory use at all.  We can worry about other
> > types of command down the road when it becomes relevant.
> 
> Since feature bits proposes are limited to PF, I agree that current short cut is fine to place in 64-127 feature bits.
> 
> When/if similar functionality is needed at scale for the VF or SIOV devices, placing them in 64-127 bits area weight way less for sake of "people familiarity to feature bits".
> Do you agree?

I think we can all agree that extensions for scalable IOV will need more
work if that is what you are saying.

-- 
MST


^ permalink raw reply	[flat|nested] 103+ messages in thread

* [virtio-comment] RE: [virtio-dev] RE: [PATCH v5 0/7] Introduce device group and device management
  2022-07-24 21:09         ` Michael S. Tsirkin
@ 2022-07-24 21:25           ` Parav Pandit
  2022-07-24 23:41             ` Michael S. Tsirkin
  0 siblings, 1 reply; 103+ messages in thread
From: Parav Pandit @ 2022-07-24 21:25 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Max Gurtovoy, jasowang, virtio-comment, cohuck, virtio-dev,
	Oren Duer, Shahaf Shuler, aadam, virtio

> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Sunday, July 24, 2022 5:10 PM
> 
> I snippet rest of mail - wasn't sure whether you were waiting for an answer.
> 
> On Wed, Jul 06, 2022 at 08:45:26PM +0000, Parav Pandit wrote:
> > > Again I don't know what to do with this. I feel if it's put up for
> > > vote in the current form it's likely to fail. I propose cutting out
> > > as much as possible as a first step, so we can make progress.
> > > Specifically the MSI-X commands are clearly PF specific so there's
> > > no concern about VF memory use at all.  We can worry about other types
> of command down the road when it becomes relevant.
> >
> > Since feature bits proposes are limited to PF, I agree that current short cut
> is fine to place in 64-127 feature bits.
> >
> > When/if similar functionality is needed at scale for the VF or SIOV devices,
> placing them in 64-127 bits area weight way less for sake of "people
> familiarity to feature bits".
> > Do you agree?
> 
> I think we can all agree that extensions for scalable IOV will need more work
> if that is what you are saying.
Yes.
Even for VFs, spending a few tens of Kbytes to negotiate bits that are read/written only once seems a waste of resources.
So it is better to start using the AQ for new functionality.
What is stopping us from adopting this modern and optimized way?
Why can't we keep feature bits limited to the device startup negotiation scheme?
Other unrelated bits were added to the feature bits in the past. But let's follow the good examples instead, now that the new choice is already proposed and defined.




^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [virtio-dev] RE: [PATCH v5 0/7] Introduce device group and device management
  2022-07-24 21:25           ` [virtio-comment] " Parav Pandit
@ 2022-07-24 23:41             ` Michael S. Tsirkin
  2022-07-25  2:53               ` [virtio-comment] " Parav Pandit
  0 siblings, 1 reply; 103+ messages in thread
From: Michael S. Tsirkin @ 2022-07-24 23:41 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Max Gurtovoy, jasowang, virtio-comment, cohuck, virtio-dev,
	Oren Duer, Shahaf Shuler, aadam, virtio

On Sun, Jul 24, 2022 at 09:25:17PM +0000, Parav Pandit wrote:
> > From: Michael S. Tsirkin <mst@redhat.com>
> > Sent: Sunday, July 24, 2022 5:10 PM
> > 
> > I snippet rest of mail - wasn't sure whether you were waiting for an answer.
> > 
> > On Wed, Jul 06, 2022 at 08:45:26PM +0000, Parav Pandit wrote:
> > > > Again I don't know what to do with this. I feel if it's put up for
> > > > vote in the current form it's likely to fail. I propose cutting out
> > > > as much as possible as a first step, so we can make progress.
> > > > Specifically the MSI-X commands are clearly PF specific so there's
> > > > no concern about VF memory use at all.  We can worry about other types
> > of command down the road when it becomes relevant.
> > >
> > > Since feature bits proposes are limited to PF, I agree that current short cut
> > is fine to place in 64-127 feature bits.
> > >
> > > When/if similar functionality is needed at scale for the VF or SIOV devices,
> > placing them in 64-127 bits area weight way less for sake of "people
> > familiarity to feature bits".
> > > Do you agree?
> > 
> > I think we can all agree that extensions for scalable IOV will need more work
> > if that is what you are saying.
> Yes.
> Even VFs to negotiate few tens of Kbytes seems waste of resources for one time read/write bits.

It has a small cost for sure, but it's negligible compared to the
current cost of implementing the spec.

> So better to start using AQ for new functionalities.
> What is stopping us to adapt to this modern and optimized way?

The cost/benefit tradeoff. The benefit is a theoretical gain of a few
bits per VF. The cost is very real engineering time spent.

I can keep poking holes and finding underspecified behaviour in the
proposal. But I don't really see why anyone would spend the time
duplicating what the virtio spec is already doing decently.

> Why cant we keep features bits limited to device startup negotiation scheme?

Well they are already used for much more. So sure, maybe work on
changing that, but IMO trying it all in a single huge project is just
not a great idea.

> Other unrelated bits are added in past to feature bits  But lets follow the good examples instead when the new choice is already proposed and defined.

So IMHO saving a few bits here and there when we are spending tens of
bytes on state makes no sense.

Something like transport over AQ (or the transport vq proposal) or some
other concerted effort to save per device memory would be needed for
SIOV, and maybe it's useful for SRIOV too.



-- 
MST


^ permalink raw reply	[flat|nested] 103+ messages in thread

* [virtio-comment] RE: [virtio-dev] RE: [PATCH v5 0/7] Introduce device group and device management
  2022-07-24 23:41             ` Michael S. Tsirkin
@ 2022-07-25  2:53               ` Parav Pandit
  2022-07-25  7:44                 ` Michael S. Tsirkin
  0 siblings, 1 reply; 103+ messages in thread
From: Parav Pandit @ 2022-07-25  2:53 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Max Gurtovoy, jasowang, virtio-comment, cohuck, virtio-dev,
	Oren Duer, Shahaf Shuler, aadam, virtio


> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Sunday, July 24, 2022 7:42 PM
> To: Parav Pandit <parav@nvidia.com>
> Cc: Max Gurtovoy <mgurtovoy@nvidia.com>; jasowang@redhat.com; virtio-
> comment@lists.oasis-open.org; cohuck@redhat.com; virtio-dev@lists.oasis-
> open.org; Oren Duer <oren@nvidia.com>; Shahaf Shuler
> <shahafs@nvidia.com>; aadam@redhat.com; virtio@lists.oasis-open.org
> Subject: Re: [virtio-dev] RE: [PATCH v5 0/7] Introduce device group and
> device management
> 
> On Sun, Jul 24, 2022 at 09:25:17PM +0000, Parav Pandit wrote:
> > > From: Michael S. Tsirkin <mst@redhat.com>
> > > Sent: Sunday, July 24, 2022 5:10 PM
> > >
> > > I snippet rest of mail - wasn't sure whether you were waiting for an
> answer.
> > >
> > > On Wed, Jul 06, 2022 at 08:45:26PM +0000, Parav Pandit wrote:
> > > > > Again I don't know what to do with this. I feel if it's put up
> > > > > for vote in the current form it's likely to fail. I propose
> > > > > cutting out as much as possible as a first step, so we can make
> progress.
> > > > > Specifically the MSI-X commands are clearly PF specific so
> > > > > there's no concern about VF memory use at all.  We can worry
> > > > > about other types
> > > of command down the road when it becomes relevant.
> > > >
> > > > Since feature bits proposes are limited to PF, I agree that
> > > > current short cut
> > > is fine to place in 64-127 feature bits.
> > > >
> > > > When/if similar functionality is needed at scale for the VF or
> > > > SIOV devices,
> > > placing them in 64-127 bits area weight way less for sake of "people
> > > familiarity to feature bits".
> > > > Do you agree?
> > >
> > > I think we can all agree that extensions for scalable IOV will need
> > > more work if that is what you are saying.
> > Yes.
> > Even VFs to negotiate few tens of Kbytes seems waste of resources for one
> time read/write bits.
> 
> It has small a cost for sure, but it's negligeable compared to the current cost
> of implementing the spec.
>
How did you calculate that the cost is negligible?
 
> > So better to start using AQ for new functionalities.
> > What is stopping us to adapt to this modern and optimized way?
> 
> The cost/benefit tradeoff. The benefit is a theorectical gain of a few bits per
> VF. The cost is very real engineering time spent.
>
How is it theoretical? I provided the calculation a few times in previous emails.
 
> I can keep poking holes and finding underspecified behaviour in the
> proposal. 
We should fix the wording.

> But I don't really see why would anyone spend the time
> duplicating what virtio spec is already doing decently.
> 
> > Why cant we keep features bits limited to device startup negotiation
> scheme?
> 
> Well they are already used for much more. So sure, maybe work on changing
> that, but IMO trying it all in a single huge project is just not a great idea.
>
Why would you recommend changing something historic like this?
The ask here is to improve the new features.
 
> > Other unrelated bits are added in past to feature bits  But lets follow the
> good examples instead when the new choice is already proposed and
> defined.
> 
> So IMHO saving a few bits here and there when we are spending tens of
> bytes on state makes no sense.
>
Not a good reason to introduce something inferior.
 
> Something like transport over AQ (or the transport vq proposal) or some
> other concerted effort to save per device memory would be needed for
> SIOV, and maybe it's useful for SRIOV too.
>
The AQ proposal is already doing it.
Some of it seems to have been duplicated lately in the transport vq proposal.
We should possibly rename both queues to mgmt_queue, serving both purposes.

The transport VQ proposal is tunneling some SIOV device feature bits.
Here the AQ is negotiating its own feature bits. The two are orthogonal.
And pointing to some newer RFC in order to discuss the current one doesn't make much sense; it only takes the discussion off on a tangent.

Since there are zero technical shortcomings in negotiating AQ features via an AQ command, let's please conclude and proceed with it.




^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [virtio-dev] RE: [PATCH v5 0/7] Introduce device group and device management
  2022-07-25  2:53               ` [virtio-comment] " Parav Pandit
@ 2022-07-25  7:44                 ` Michael S. Tsirkin
  2022-07-30 13:21                   ` [virtio-comment] " Parav Pandit
  0 siblings, 1 reply; 103+ messages in thread
From: Michael S. Tsirkin @ 2022-07-25  7:44 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Max Gurtovoy, jasowang, virtio-comment, cohuck, virtio-dev,
	Oren Duer, Shahaf Shuler, aadam, virtio

On Mon, Jul 25, 2022 at 02:53:39AM +0000, Parav Pandit wrote:
> 
> > From: Michael S. Tsirkin <mst@redhat.com>
> > Sent: Sunday, July 24, 2022 7:42 PM
> > To: Parav Pandit <parav@nvidia.com>
> > Cc: Max Gurtovoy <mgurtovoy@nvidia.com>; jasowang@redhat.com; virtio-
> > comment@lists.oasis-open.org; cohuck@redhat.com; virtio-dev@lists.oasis-
> > open.org; Oren Duer <oren@nvidia.com>; Shahaf Shuler
> > <shahafs@nvidia.com>; aadam@redhat.com; virtio@lists.oasis-open.org
> > Subject: Re: [virtio-dev] RE: [PATCH v5 0/7] Introduce device group and
> > device management
> > 
> > On Sun, Jul 24, 2022 at 09:25:17PM +0000, Parav Pandit wrote:
> > > > From: Michael S. Tsirkin <mst@redhat.com>
> > > > Sent: Sunday, July 24, 2022 5:10 PM
> > > >
> > > > I snippet rest of mail - wasn't sure whether you were waiting for an
> > answer.
> > > >
> > > > On Wed, Jul 06, 2022 at 08:45:26PM +0000, Parav Pandit wrote:
> > > > > > Again I don't know what to do with this. I feel if it's put up
> > > > > > for vote in the current form it's likely to fail. I propose
> > > > > > cutting out as much as possible as a first step, so we can make
> > progress.
> > > > > > Specifically the MSI-X commands are clearly PF specific so
> > > > > > there's no concern about VF memory use at all.  We can worry
> > > > > > about other types
> > > > of command down the road when it becomes relevant.
> > > > >
> > > > > Since feature bits proposes are limited to PF, I agree that
> > > > > current short cut
> > > > is fine to place in 64-127 feature bits.
> > > > >
> > > > > When/if similar functionality is needed at scale for the VF or
> > > > > SIOV devices,
> > > > placing them in 64-127 bits area weight way less for sake of "people
> > > > familiarity to feature bits".
> > > > > Do you agree?
> > > >
> > > > I think we can all agree that extensions for scalable IOV will need
> > > > more work if that is what you are saying.
> > > Yes.
> > > Even VFs to negotiate few tens of Kbytes seems waste of resources for one
> > time read/write bits.
> > 
> > It has small a cost for sure, but it's negligeable compared to the current cost
> > of implementing the spec.
> >
> How did you calculate the cost being negligible?

For sure each VF needs at least 2 VQs plus the AQ. Each one
is three 64-bit pointers, plus some flags and counters, plus at
least one MSI vector: about 200-300 bits.
We are talking about each feature taking up a single bit per VF right now.
That's below 1%.
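
As a rough check of the estimate above, the arithmetic can be written out. The queue count and pointer sizes are the ones quoted in the message; reading "200-300 bits" as the approximate total per queue is my assumption, and the names in the sketch are hypothetical.

```python
# Figures quoted in the message above; how they are combined is an assumption.
QUEUES_PER_VF = 3        # "at least 2 VQs plus the AQ"
POINTERS_PER_QUEUE = 3   # three pointers per queue
POINTER_BITS = 64


def feature_bit_share(state_bits: int, feature_bits: int = 1) -> float:
    """Fraction of per-VF state taken up by the given number of feature bits."""
    return feature_bits / state_bits


# Lower bound: count only the ring pointers, ignoring flags, counters and MSI state.
pointer_only_bits = QUEUES_PER_VF * POINTERS_PER_QUEUE * POINTER_BITS

# Alternative reading: roughly 200 to 300 bits of state per queue in total.
low_estimate_bits = QUEUES_PER_VF * 200
high_estimate_bits = QUEUES_PER_VF * 300

for label, bits in [
    ("pointers only", pointer_only_bits),
    ("200 bits per queue", low_estimate_bits),
    ("300 bits per queue", high_estimate_bits),
]:
    share = feature_bit_share(bits)
    print(f"{label}: {bits} bits of state, one feature bit is {share:.2%}")
```

Even with the pointer-only lower bound of 576 bits, a single feature bit is under 0.2% of the per-VF state, which is consistent with the "below 1%" figure. The sketch says nothing about the Kbyte-scale figures raised earlier in the thread, which rest on a different accounting.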


> > > So better to start using AQ for new functionalities.
> > > What is stopping us to adapt to this modern and optimized way?
> > 
> > The cost/benefit tradeoff. The benefit is a theorectical gain of a few bits per
> > VF. The cost is very real engineering time spent.
> >
> How is it theoretical? I provided the calculation few times in previous emails.

It's theoretical because we don't know whether anyone will ever add
AQ to VFs as opposed to a PF.

> > I can keep poking holes and finding underspecified behaviour in the
> > proposal. 
> We should fix the wording.
> > But I don't really see why would anyone spend the time
> > duplicating what virtio spec is already doing decently.
> > 
> > > Why cant we keep features bits limited to device startup negotiation
> > scheme?
> > 
> > Well they are already used for much more. So sure, maybe work on changing
> > that, but IMO trying it all in a single huge project is just not a great idea.
> >
> Why would you recommend on changing something historic like this?

Because that will bring a much bigger gain.

> The ask here to improve the new features.

Let's just improve everything, a much bigger gain.  Simply put, pci
transport was never designed with saving per device memory in mind.
Adding complexity to save a couple of bytes per device will just create
bugs without solving that. AQ is already trying to solve grouping
devices and that's a big project, let's finish that and then decide
whether we want to work on migration or work on saving memory next.

> > > Other unrelated bits are added in past to feature bits  But lets follow the
> > good examples instead when the new choice is already proposed and
> > defined.
> > 
> > So IMHO saving a few bits here and there when we are spending tens of
> > bytes on state makes no sense.
> >
> Not a good reason to introduce something inferior.

Avoiding duplication and focusing on low hanging fruit are all
sound engineering principles.


> > Something like transport over AQ (or the transport vq proposal) or some
> > other concerted effort to save per device memory would be needed for
> > SIOV, and maybe it's useful for SRIOV too.
> >
> AQ proposal is already doing it.
> Some of the things seems to be lately duplicated in transport vq proposal.
> We should possibly rename both the queues to mgmt_queue that serves both the purposes.

Quite possibly - transport VQ seems to handle a group of devices from one
device which seems similar to what AQ does, and you guys should
probably work together.

But whether or not we unify the transport and admin vq,
with the transport vq the existing feature mechanism stops taking up
space in the VQ, and this opens up the possibility of just using that mechanism
to discover supported commands without paying a memory penalty.


> Transport VQ proposal is tunneling some SIOV device feature bits.
> Here AQ is negotiating its own feature bits. Both are orthogonal.

I thought the discussion was about dropping the get features command and
using features instead. The Transport VQ proposal seems to show how we
can do that and still make things scale.  Yes it's orthogonal to using
AQ for grouping, and using features is exactly what will keep it
orthogonal.

> And suggesting to some newer RFC to discuss current one doesn't make much sense to tangent the discussion at all.


The reason I bring it up is that it seems like a better solution for saving device
memory than adding a get capabilities command. And it lets you just use
features for now, and assume whatever we spend will be saved back
by using the vq as transport.


> Since there is zero technical short comings of negotiating AQ features via AQ command, lets please conclude to proceed with it.

Not 100% sure what you are saying here, so I think you should post a new
version, yes. Generally iterating faster is a good idea. If you feel
you didn't get enough feedback on a given version and some time has passed,
try pinging people.

-- 
MST



* [virtio-comment] RE: [virtio-dev] RE: [PATCH v5 0/7] Introduce device group and device management
  2022-07-25  7:44                 ` Michael S. Tsirkin
@ 2022-07-30 13:21                   ` Parav Pandit
  2022-07-31 15:38                     ` Michael S. Tsirkin
  0 siblings, 1 reply; 103+ messages in thread
From: Parav Pandit @ 2022-07-30 13:21 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Max Gurtovoy, jasowang, virtio-comment, cohuck, virtio-dev,
	Oren Duer, Shahaf Shuler, aadam, virtio

> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Monday, July 25, 2022 3:44 AM
> > >
> > How did you calculate the cost being negligible?
> 
> For sure each VF needs at least 2 VQs plus the AQ. Each one is 3 64 bit
> pointers, plus some flags and counters, plus at least one MSI vector about
> 200-300 bit.
> We are talking each feature taking up a single bit per VF right now.
> That's below 1%.
>
Because you assumed that new feature bits in this area will never be added.
The AQ and its capabilities are not limited to the PF, even though the first version of the spec has only a few bits.

And this 1% at scale results in several Kbytes.
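
To make the two sides of this estimate concrete, here is a rough sketch of the arithmetic. The 200-300 bit figure for per-VF state is the one quoted above; the VF count and the number of feature bits per VF are purely illustrative assumptions, not numbers taken from this thread or from any real device, and the function names are hypothetical.

```python
def feature_bit_fraction(state_bits_per_vf, feature_bits_per_vf=1):
    """Fraction of the per-VF state that is spent on feature bits."""
    return feature_bits_per_vf / state_bits_per_vf


def total_feature_bytes(num_vfs, feature_bits_per_vf):
    """Bytes needed to hold the feature bits of every VF on the device."""
    return num_vfs * feature_bits_per_vf // 8


if __name__ == "__main__":
    # Per-VF view: one extra bit against 200-300 bits of existing state.
    for state_bits in (200, 300):
        print(f"{state_bits} bits of state: {feature_bit_fraction(state_bits):.2%}")

    # Device-wide view (assumed numbers): 1024 VFs, 64 feature bits each.
    print(total_feature_bytes(num_vfs=1024, feature_bits_per_vf=64), "bytes")
```

With these assumed numbers a single bit is 0.5% of 200 bits of per-VF state, while 64 bits across 1024 VFs add up to 8192 bytes. That is how "below 1%" per VF and "several Kbytes" per device can both hold at once.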
 
> 
> > > > So better to start using AQ for new functionalities.
> > > > What is stopping us to adapt to this modern and optimized way?
> > >
> > > The cost/benefit tradeoff. The benefit is a theoretical gain of a
> > > few bits per VF. The cost is very real engineering time spent.
> > >
> > How is it theoretical? I provided the calculation few times in previous
> > emails.
> 
> It's theoretical because we don't know whether anyone will ever add AQ to
> VFs as opposed to a PF.
>
Isn't the scope of the AQ now more than just MSI-X + features + live migration,
such as managing SIOV and more?

Things like queue_reset, querying counters and other things can be done more efficiently through a vq than through some burned bits, regardless of SIOV.
So in my view, looking at the AQ only for MSI-X and LM purposes, and so keeping it on the PF, is a narrow view.
 
> > > I can keep poking holes and finding underspecified behaviour in the
> > > proposal.
> > We should fix the wording.
> > > But I don't really see why would anyone spend the time duplicating
> > > what virtio spec is already doing decently.
> > >
> > > > Why cant we keep features bits limited to device startup
> > > > negotiation
> > > scheme?
> > >
> > > Well they are already used for much more. So sure, maybe work on
> > > changing that, but IMO trying it all in a single huge project is just not a
> > > great idea.
> > >
> > Why would you recommend on changing something historic like this?
> 
> Because that will bring a much bigger gain.
>
That also breaks backward compatibility for existing devices.
So it has to be something additional and optional.

 
> > The ask here to improve the new features.
> 
> Let's just improve everything, a much bigger gain.  Simply put, pci transport
> was never designed with saving per device memory in mind.
Hence, new additions shouldn't make it worse when new, optimized methods are available.

> Adding complexity to save a couple of bytes per device will just create bugs
> without solving that. 
I disagree that it creates bugs.
When done right, we have seen in multiple projects (nvme, mlx5 and more) that it doesn't create bugs.

> AQ is already trying to solve grouping devices and that's
> a big project, let's finish that and then decide whether we want to work on
> migration or work on saving memory next.
> 
I assume migration means live migration and not migration from PCI to something else.
If it's the former, it is not an either-or project.
Saving memory is being framed as a project, but it doesn't have to be one for the small piece being added here.

> > > > Other unrelated bits are added in past to feature bits  But lets
> > > > follow the
> > > good examples instead when the new choice is already proposed and
> > > defined.
> > >
> > > So IMHO saving a few bits here and there when we are spending tens
> > > of bytes on state makes no sense.
> > >
You missed the multiplication factor of the VFs. :)

It is not only about saving bits; it also constrains how a device is composed, since the device must expose these bits pretty early.

> > Not a good reason to introduce something inferior.
> 
> Avoiding duplication and focusing on low hanging fruit are all sound
> engineering principles.
> 
It is not a duplication.
I reiterate: the bits we asked to put in the AQ commands are the ones that do not need to be negotiated early.
Feature bits were misused to hold everything because in the past a generic AQ didn't exist.
And there is no reason for that past decision to influence the current proposal.

Feature bits that are basic to the device bring-up stage stay in the feature bits area defined today.
Hence, it's not a duplication.

> 
> > > Something like transport over AQ (or the transport vq proposal) or
> > > some other concerted effort to save per device memory would be
> > > needed for SIOV, and maybe it's useful for SRIOV too.
> > >
> > AQ proposal is already doing it.
> > Some of the things seems to be lately duplicated in transport vq proposal.
> > We should possibly rename both the queues to mgmt_queue that serves
> > both the purposes.
> 
> Quite possibly - transport VQ seems to handle a group of devices from one
> device which seems similar to what AQ does, and you guys should probably
> work together.
>
Yes.
 
> But whether we do or do not unify transport and admin vq, with transport
Without unification, I don't see how the SIOV proposal can ever make it into the spec.
Once the AQ is merged, SIOV can use the AQ.
I would like to see SIOV refer to the AQ proposal's commands and extend them, not come as a separate proposal.

> vq existing feature mechanism stops taking up space in the VQ and this
> opens up possibility to just use that mechanism to discover supported
> commands without paying memory penalty.
> 
This only helps a new SIOV beast, which is not even fully defined at the various layers of the platform.

> 
> > Transport VQ proposal is tunneling some SIOV device feature bits.
> > Here AQ is negotiating its own feature bits. Both are orthogonal.
> 
> I thought the discussion was about dropping the get features command and
> using features instead. 

> The Transport VQ proposal seems to show how we
> can do that and still make things scale.  Yes it's orthogonal to using AQ for
> grouping, and using features is exactly what will keep it orthogonal.
> 
> > And suggesting to some newer RFC to discuss current one doesn't make
> > much sense to tangent the discussion at all.
> 
> 
> The reason I bring it up is that it seems like a better solution to saving device
> memory than adding a get capabilities command. And it let you just use
> features for now, and assume whatever we spend will be saved back by
> using vq as transport.
> 
Even for non SIOV devices?

> 
> > Since there is zero technical short comings of negotiating AQ features via
> > AQ command, lets please conclude to proceed with it.
> 
> Not 100% sure what you are saing here, so I think you should post a new
> version, yes. Generally iterating faster is a good idea. If you feel you didn't
> get enough feedback on a given version and some time passed try pinging
> people.
Without closing the discussion, I have seen the same topic come up again and again in future versions.
So even though a new series is up for posting, we had better close the discussion.




* Re: [virtio-dev] RE: [PATCH v5 0/7] Introduce device group and device management
  2022-07-30 13:21                   ` [virtio-comment] " Parav Pandit
@ 2022-07-31 15:38                     ` Michael S. Tsirkin
  2022-08-02 17:40                       ` Parav Pandit
  0 siblings, 1 reply; 103+ messages in thread
From: Michael S. Tsirkin @ 2022-07-31 15:38 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Max Gurtovoy, jasowang, virtio-comment, cohuck, virtio-dev,
	Oren Duer, Shahaf Shuler, aadam, virtio

On Sat, Jul 30, 2022 at 01:21:20PM +0000, Parav Pandit wrote:
> > > Not a good reason to introduce something inferior.
> > 
> > Avoiding duplication and focusing on low hanging fruit are all sound
> > engineering principles.
> > 
> It is not a duplication.
> I iterate again. The bits we asked to put in the Aq commands are the one that does not need to be negotiated early.
> Feature bits were mis-used to put all things in there because in past a generic AQ didn't exists.
> And there is no reason for that past decision to influence current proposal.
> 
> Feature bits that are so basic for device bring up stage, stays in feature bits area defined today.
> Hence, its not a duplication.

Let's take a look at a network device for example.


	\item[VIRTIO_NET_F_CSUM (0)] Device handles packets with partial checksum.   This 
	  ``checksum offload'' is a common feature on modern network cards.

	\item[VIRTIO_NET_F_GUEST_CSUM (1)] Driver handles packets with partial checksum.

	\item[VIRTIO_NET_F_CTRL_GUEST_OFFLOADS (2)] Control channel offloads
		reconfiguration support.

	\item[VIRTIO_NET_F_MTU(3)] Device maximum MTU reporting is supported. If
	    offered by the device, device advises driver about the value of
	    its maximum MTU. If negotiated, the driver uses \field{mtu} as
	    the maximum MTU value.

	\item[VIRTIO_NET_F_MAC (5)] Device has given MAC address.

	\item[VIRTIO_NET_F_GUEST_TSO4 (7)] Driver can receive TSOv4.

	\item[VIRTIO_NET_F_GUEST_TSO6 (8)] Driver can receive TSOv6.

	\item[VIRTIO_NET_F_GUEST_ECN (9)] Driver can receive TSO with ECN.

	\item[VIRTIO_NET_F_GUEST_UFO (10)] Driver can receive UFO.

	\item[VIRTIO_NET_F_HOST_TSO4 (11)] Device can receive TSOv4.

	\item[VIRTIO_NET_F_HOST_TSO6 (12)] Device can receive TSOv6.

	\item[VIRTIO_NET_F_HOST_ECN (13)] Device can receive TSO with ECN.

	\item[VIRTIO_NET_F_HOST_UFO (14)] Device can receive UFO.

	\item[VIRTIO_NET_F_MRG_RXBUF (15)] Driver can merge receive buffers.

	\item[VIRTIO_NET_F_STATUS (16)] Configuration status field is
	    available.

	\item[VIRTIO_NET_F_CTRL_VQ (17)] Control channel is available.

	\item[VIRTIO_NET_F_CTRL_RX (18)] Control channel RX mode support.

	\item[VIRTIO_NET_F_CTRL_VLAN (19)] Control channel VLAN filtering.

	\item[VIRTIO_NET_F_GUEST_ANNOUNCE(21)] Driver can send gratuitous
	    packets.

	\item[VIRTIO_NET_F_MQ(22)] Device supports multiqueue with automatic
	    receive steering.

	\item[VIRTIO_NET_F_CTRL_MAC_ADDR(23)] Set MAC address through control
	    channel.

	\item[VIRTIO_NET_F_NOTF_COAL(53)] Device supports notifications coalescing.

	\item[VIRTIO_NET_F_GUEST_USO4 (54)] Driver can receive USOv4 packets.

	\item[VIRTIO_NET_F_GUEST_USO6 (55)] Driver can receive USOv6 packets.

	\item[VIRTIO_NET_F_HOST_USO (56)] Device can receive USO packets. Unlike UFO
	 (fragmenting the packet) the USO splits large UDP packet
	 to several segments when each of these smaller packets has UDP header.

	\item[VIRTIO_NET_F_HASH_REPORT(57)] Device can report per-packet hash
	    value and a type of calculated hash.

	\item[VIRTIO_NET_F_GUEST_HDRLEN(59)] Driver can provide the exact \field{hdr_len}
	    value. Device benefits from knowing the exact header length.

	\item[VIRTIO_NET_F_RSS(60)] Device supports RSS (receive-side scaling)
	    with Toeplitz hash calculation and configurable hash
	    parameters for receive steering.

	\item[VIRTIO_NET_F_RSC_EXT(61)] Device can process duplicated ACKs
	    and report number of coalesced segments and duplicated ACKs.

	\item[VIRTIO_NET_F_STANDBY(62)] Device may act as a standby for a primary
	    device with the same MAC address.

	\item[VIRTIO_NET_F_SPEED_DUPLEX(63)] Device reports speed and duplex.


Which of these are basic for the bring-up stage?
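
For readers less familiar with how these bits are used: every entry in the list above is a bit position in one feature bitmap, and the bits listed here all fit in 64 bits. The sketch below is a minimal model of that, assuming the usual virtio scheme in which the device offers a set of bits and the driver accepts the subset it understands. The bit numbers are copied from the list above; the class and helper names are hypothetical and are not taken from the spec or from any driver.

```python
from enum import IntEnum


class VirtioNetFeature(IntEnum):
    """A subset of the bit positions listed above."""
    CSUM = 0
    MTU = 3
    MAC = 5
    CTRL_VQ = 17
    MQ = 22
    RSS = 60
    SPEED_DUPLEX = 63


def mask(*features):
    """Build a feature bitmap from individual bit positions."""
    bits = 0
    for feature in features:
        bits |= 1 << feature
    return bits


def negotiate(device_offered, driver_supported):
    """The negotiated set is what the device offers AND the driver accepts."""
    return device_offered & driver_supported


def has_feature(negotiated, feature):
    """Test a single bit in a negotiated bitmap."""
    return bool((negotiated >> feature) & 1)


if __name__ == "__main__":
    F = VirtioNetFeature
    offered = mask(F.CSUM, F.MTU, F.MQ, F.SPEED_DUPLEX)
    supported = mask(F.CSUM, F.MQ, F.RSS)
    negotiated = negotiate(offered, supported)
    for feature in F:
        print(f"{feature.name:13} {has_feature(negotiated, feature)}")
```

In this model each offered feature costs the device one bit of state, whether or not the driver needs it during bring-up, which is the per-device cost being debated in this thread.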



> > 
> > > > Something like transport over AQ (or the transport vq proposal) or
> > > > some other concerted effort to save per device memory would be
> > > > needed for SIOV, and maybe it's useful for SRIOV too.
> > > >
> > > AQ proposal is already doing it.
> > > Some of the things seems to be lately duplicated in transport vq proposal.
> > > We should possibly rename both the queues to mgmt_queue that serves
> > > both the purposes.
> > 
> > Quite possibly - transport VQ seems to handle a group of devices from one
> > device which seems similar to what AQ does, and you guys should probably
> > work together.
> >
> Yes.
>  
> > But whether we do or do not unify transport and admin vq, with transport
> Without unification, I don't see how SIOV proposal can ever make to the spec.
> Once AQ is merged SIOV can use AQ.
> I would like to see SIOV to refer to the AQ proposal commands and extend it. Not as separate proposal.


Sounds good. Or maybe the reverse will happen and transport vq will land
first, I don't much care personally. It would be nice if the parts of AQ
proposal necessary for transport vq work were a separate patch.


> > vq existing feature mechanism stops taking up space in the VQ and this
> > opens up possibility to just use that mechanism to discover supported
> > commands without paying memory penalty.
> > 
> This only helps any new SIOV beast which is not even fully defined at various layers of platform.
> 
> > 
> > > Transport VQ proposal is tunneling some SIOV device feature bits.
> > > Here AQ is negotiating its own feature bits. Both are orthogonal.
> > 
> > I thought the discussion was about dropping the get features command and
> > using features instead. 
> 
> > The Transport VQ proposal seems to show how we
> > can do that and still make things scale.  Yes it's orthogonal to using AQ for
> > grouping, and using features is exactly what will keep it orthogonal.
> > 
> > > And suggesting to some newer RFC to discuss current one doesn't make
> > > much sense to tangent the discussion at all.
> > 
> > 
> > The reason I bring it up is that it seems like a better solution to saving device
> > memory than adding a get capabilities command. And it let you just use
> > features for now, and assume whatever we spend will be saved back by
> > using vq as transport.
> > 
> Even for non SIOV devices?

Yes.

> > 
> > > Since there is zero technical short comings of negotiating AQ features via
> > > AQ command, lets please conclude to proceed with it.
> > 
> > Not 100% sure what you are saing here, so I think you should post a new
> > version, yes. Generally iterating faster is a good idea. If you feel you didn't
> > get enough feedback on a given version and some time passed try pinging
> > people.
> Without closing the discussion, I have seen same topic come up again and again in future version.
> So even though new series is up for posting, we better close the discussion.

That's a mistake, it will be harder for you to make progress like this.
Reviewers have limited attention span and memory, text bitrots, you decide to
make unrelated changes yourself... Sorry I'm repeating myself but
iterate quickly in the open even if there are known issues not addressed
yet, just make very sure to be open and detailed about issues you addressed
and known issues you did not address.

-- 
MST



* RE: [virtio-dev] RE: [PATCH v5 0/7] Introduce device group and device management
  2022-07-31 15:38                     ` Michael S. Tsirkin
@ 2022-08-02 17:40                       ` Parav Pandit
  0 siblings, 0 replies; 103+ messages in thread
From: Parav Pandit @ 2022-08-02 17:40 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Max Gurtovoy, jasowang, virtio-comment, cohuck, virtio-dev,
	Oren Duer, Shahaf Shuler, aadam, virtio


> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Sunday, July 31, 2022 11:39 AM
> 
> On Sat, Jul 30, 2022 at 01:21:20PM +0000, Parav Pandit wrote:
> > > > Not a good reason to introduce something inferior.
> > >
> > > Avoiding duplication and focusing on low hanging fruit are all sound
> > > engineering principles.
> > >
> > It is not a duplication.
> > I iterate again. The bits we asked to put in the Aq commands are the one
> > that does not need to be negotiated early.
> > Feature bits were mis-used to put all things in there because in past a
> > generic AQ didn't exists.
> > And there is no reason for that past decision to influence current proposal.
> >
> > Feature bits that are so basic for device bring up stage, stays in feature bits
> > area defined today.
> > Hence, its not a duplication.
> 
> Let's take a look at a network device for example.
> 
> 
> 	\item[VIRTIO_NET_F_CSUM (0)] Device handles packets with partial checksum. This
> 	  ``checksum offload'' is a common feature on modern network cards.
> 
> 	\item[VIRTIO_NET_F_GUEST_CSUM (1)] Driver handles packets with partial checksum.
> 
> 	\item[VIRTIO_NET_F_CTRL_GUEST_OFFLOADS (2)] Control channel offloads
> 		reconfiguration support.
> 
> 	\item[VIRTIO_NET_F_MTU(3)] Device maximum MTU reporting is supported. If
> 	    offered by the device, device advises driver about the value of
> 	    its maximum MTU. If negotiated, the driver uses \field{mtu} as
> 	    the maximum MTU value.
> 
> 	\item[VIRTIO_NET_F_MAC (5)] Device has given MAC address.
> 
> 	\item[VIRTIO_NET_F_GUEST_TSO4 (7)] Driver can receive TSOv4.
> 
> 	\item[VIRTIO_NET_F_GUEST_TSO6 (8)] Driver can receive TSOv6.
> 
> 	\item[VIRTIO_NET_F_GUEST_ECN (9)] Driver can receive TSO with ECN.
> 
> 	\item[VIRTIO_NET_F_GUEST_UFO (10)] Driver can receive UFO.
> 
> 	\item[VIRTIO_NET_F_HOST_TSO4 (11)] Device can receive TSOv4.
> 
> 	\item[VIRTIO_NET_F_HOST_TSO6 (12)] Device can receive TSOv6.
> 
> 	\item[VIRTIO_NET_F_HOST_ECN (13)] Device can receive TSO with ECN.
> 
> 	\item[VIRTIO_NET_F_HOST_UFO (14)] Device can receive UFO.
> 
> 	\item[VIRTIO_NET_F_MRG_RXBUF (15)] Driver can merge receive buffers.
> 
> 	\item[VIRTIO_NET_F_STATUS (16)] Configuration status field is
> 	    available.
> 
> 	\item[VIRTIO_NET_F_CTRL_VQ (17)] Control channel is available.
> 
> 	\item[VIRTIO_NET_F_CTRL_RX (18)] Control channel RX mode support.
> 
> 	\item[VIRTIO_NET_F_CTRL_VLAN (19)] Control channel VLAN filtering.
> 
> 	\item[VIRTIO_NET_F_GUEST_ANNOUNCE(21)] Driver can send gratuitous
> 	    packets.
> 
> 	\item[VIRTIO_NET_F_MQ(22)] Device supports multiqueue with automatic
> 	    receive steering.
> 
> 	\item[VIRTIO_NET_F_CTRL_MAC_ADDR(23)] Set MAC address through control
> 	    channel.
> 
> 	\item[VIRTIO_NET_F_NOTF_COAL(53)] Device supports notifications coalescing.
> 
> 	\item[VIRTIO_NET_F_GUEST_USO4 (54)] Driver can receive USOv4 packets.
> 
> 	\item[VIRTIO_NET_F_GUEST_USO6 (55)] Driver can receive USOv6 packets.
> 
> 	\item[VIRTIO_NET_F_HOST_USO (56)] Device can receive USO packets. Unlike UFO
> 	 (fragmenting the packet) the USO splits large UDP packet
> 	 to several segments when each of these smaller packets has UDP header.
> 
> 	\item[VIRTIO_NET_F_HASH_REPORT(57)] Device can report per-packet hash
> 	    value and a type of calculated hash.
> 
> 	\item[VIRTIO_NET_F_GUEST_HDRLEN(59)] Driver can provide the exact \field{hdr_len}
> 	    value. Device benefits from knowing the exact header length.
> 
> 	\item[VIRTIO_NET_F_RSS(60)] Device supports RSS (receive-side scaling)
> 	    with Toeplitz hash calculation and configurable hash
> 	    parameters for receive steering.
> 
> 	\item[VIRTIO_NET_F_RSC_EXT(61)] Device can process duplicated ACKs
> 	    and report number of coalesced segments and duplicated ACKs.
> 
> 	\item[VIRTIO_NET_F_STANDBY(62)] Device may act as a standby for a primary
> 	    device with the same MAC address.
> 
> 	\item[VIRTIO_NET_F_SPEED_DUPLEX(63)] Device reports speed and duplex.
> 
> 
> which of these are basic for bring up stage?
> 
> 
As you pointed out, all of the above fields have zero relation to device initialization.
Those can very well be handled after the device init stage, over an AQ.
Given that no such queue existed before, it was logical to extend the feature bits.
Otherwise, they are well suited to being negotiated over the net CVQ or a new AQ.
> > >
> > Even for non SIOV devices?
> 
> Yes.
In that case it is no different from the AQ. :)
Not sure what we are debating.
You are saying that the feature bits of the VF _itself_ are negotiated over a transport queue by the VF itself,
including the ones you listed above, right?
If so, that is useful.
I have yet to review the RFC.

> > > Not 100% sure what you are saing here, so I think you should post a
> > > new version, yes. Generally iterating faster is a good idea. If you
> > > feel you didn't get enough feedback on a given version and some time
> > > passed try pinging people.
> > Without closing the discussion, I have seen same topic come up again and
> > again in future version.
> > So even though new series is up for posting, we better close the
> > discussion.
> 
> That's a mistake, it will be harder for you to make progress like this.
> Reviewers have limited attention span and memory, text bitrots, you decide
> to make unrelated changes yourself... Sorry I'm repeating myself but iterate
> quickly in the open even if there are known issues not addressed yet, just
> make very sure to be open and detailed about issues you addressed and
> known issues you did not address.
Ok. Thanks for the useful direction.



end of thread, other threads:[~2022-08-02 17:40 UTC | newest]

Thread overview: 103+ messages
2022-04-26 22:58 [PATCH v5 0/7] Introduce device group and device management Max Gurtovoy
2022-04-26 22:58 ` [virtio-comment] [PATCH v5 1/7] Introduce device group Max Gurtovoy
2022-05-15 15:25   ` Michael S. Tsirkin
2022-05-18 13:14     ` [virtio-comment] " Max Gurtovoy
2022-05-18 13:32       ` Cornelia Huck
2022-06-01 13:43         ` Max Gurtovoy
2022-06-02  2:21           ` Jason Wang
2022-06-02  6:59             ` Michael S. Tsirkin
2022-06-27 21:52               ` Max Gurtovoy
2022-06-28 18:54                 ` Michael S. Tsirkin
2022-07-06 11:25                   ` Max Gurtovoy
2022-07-06 11:42                     ` Michael S. Tsirkin
2022-07-06 12:01                       ` Max Gurtovoy
2022-07-06 12:23                         ` Michael S. Tsirkin
2022-07-06 15:18                           ` Max Gurtovoy
2022-04-26 22:58 ` [PATCH v5 2/7] Introduce admin command set Max Gurtovoy
2022-05-15 15:23   ` Michael S. Tsirkin
2022-05-16 21:08     ` [virtio-comment] " Parav Pandit
2022-05-17 10:08       ` [virtio-dev] " Cornelia Huck
2022-05-18 13:42         ` Max Gurtovoy
2022-05-17 11:48       ` Michael S. Tsirkin
2022-05-18 14:09         ` Max Gurtovoy
2022-05-18 14:42           ` [virtio] " Cornelia Huck
2022-05-18 14:48             ` Max Gurtovoy
2022-05-31 20:39         ` Parav Pandit
2022-06-20  9:23           ` Michael S. Tsirkin
2022-06-20  9:49             ` Michael S. Tsirkin
2022-06-20  9:59           ` Michael S. Tsirkin
2022-06-20 11:06             ` Parav Pandit
2022-06-20 16:46               ` Michael S. Tsirkin
2022-06-20 16:54                 ` Max Gurtovoy
2022-06-20 17:04                   ` Michael S. Tsirkin
2022-06-20 17:19                     ` Parav Pandit
2022-06-20 20:53                       ` Michael S. Tsirkin
2022-06-20 23:54                         ` Parav Pandit
2022-06-20 17:16                 ` Parav Pandit
2022-06-23  1:26               ` Jason Wang
2022-06-23  2:07                 ` Parav Pandit
2022-06-23  2:41                   ` Jason Wang
2022-06-23  2:57                     ` Parav Pandit
2022-06-23  3:34                       ` Jason Wang
2022-06-28 14:24                         ` Michael S. Tsirkin
2022-06-29  8:43                           ` Jason Wang
2022-06-29  9:02                             ` Michael S. Tsirkin
2022-06-30  1:53                               ` Jason Wang
2022-05-18 13:39     ` [virtio-comment] " Max Gurtovoy
2022-05-18 13:50       ` [virtio] " Cornelia Huck
2022-05-18 14:16         ` Max Gurtovoy
2022-06-20 22:26           ` Michael S. Tsirkin
2022-06-20 21:08       ` Michael S. Tsirkin
2022-04-26 22:58 ` [PATCH v5 3/7] Introduce new destination type for admin commands Max Gurtovoy
2022-05-15 15:01   ` Michael S. Tsirkin
2022-05-18 14:27     ` [virtio-comment] " Max Gurtovoy
2022-05-15 15:09   ` Michael S. Tsirkin
2022-05-16 21:21     ` Parav Pandit
2022-05-16 23:33       ` Michael S. Tsirkin
2022-05-18 14:36         ` Max Gurtovoy
2022-05-18 14:34     ` Max Gurtovoy
2022-05-18 23:55       ` Michael S. Tsirkin
2022-04-26 22:58 ` [PATCH v5 4/7] Introduce virtio admin virtqueue Max Gurtovoy
2022-05-15 14:59   ` Michael S. Tsirkin
2022-05-18 14:37     ` Max Gurtovoy
2022-05-18 23:56       ` Michael S. Tsirkin
2022-04-26 22:58 ` [PATCH v5 5/7] Add miscellaneous configuration structure for PCI Max Gurtovoy
2022-05-15 14:49   ` Michael S. Tsirkin
2022-06-01 14:46     ` Max Gurtovoy
2022-05-15 14:57   ` Michael S. Tsirkin
2022-05-17 10:12     ` [virtio] " Cornelia Huck
2022-05-18 14:42       ` Max Gurtovoy
2022-05-18 23:58         ` Michael S. Tsirkin
2022-04-26 22:58 ` [PATCH v5 6/7] Introduce MGMT admin commands Max Gurtovoy
2022-05-15 14:37   ` Michael S. Tsirkin
2022-05-16 21:47     ` Parav Pandit
2022-05-17 12:31       ` [virtio-comment] " Michael S. Tsirkin
2022-05-18 15:14         ` Max Gurtovoy
2022-05-17  2:28     ` Jason Wang
2022-05-18 15:27       ` Max Gurtovoy
2022-05-18 16:41         ` Michael S. Tsirkin
2022-05-18 23:10           ` Max Gurtovoy
2022-05-18 15:03     ` Max Gurtovoy
2022-06-20  9:45       ` Michael S. Tsirkin
2022-04-26 22:58 ` [PATCH v5 7/7] RFC: add initial support for configuring feature bits Max Gurtovoy
2022-05-15 14:38   ` Michael S. Tsirkin
2022-05-18 15:31     ` Max Gurtovoy
2022-05-18 16:34       ` Michael S. Tsirkin
2022-05-18 23:18         ` Max Gurtovoy
2022-05-15 15:27 ` [PATCH v5 0/7] Introduce device group and device management Michael S. Tsirkin
2022-05-18 15:32   ` Max Gurtovoy
2022-07-05 13:56 ` Michael S. Tsirkin
2022-07-05 15:11   ` [virtio-comment] " Parav Pandit
2022-07-06  2:54     ` [virtio-dev] " Jason Wang
2022-07-06 10:10       ` Michael S. Tsirkin
2022-07-06 10:46       ` [virtio-comment] " Parav Pandit
2022-07-06 11:00     ` Michael S. Tsirkin
2022-07-06 20:45       ` Parav Pandit
2022-07-24 21:09         ` Michael S. Tsirkin
2022-07-24 21:25           ` [virtio-comment] " Parav Pandit
2022-07-24 23:41             ` Michael S. Tsirkin
2022-07-25  2:53               ` [virtio-comment] " Parav Pandit
2022-07-25  7:44                 ` Michael S. Tsirkin
2022-07-30 13:21                   ` [virtio-comment] " Parav Pandit
2022-07-31 15:38                     ` Michael S. Tsirkin
2022-08-02 17:40                       ` Parav Pandit
