All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v6 0/5] Introduce device group and device management
@ 2022-07-31 15:43 Max Gurtovoy
  2022-07-31 15:43 ` [PATCH v6 1/5] Introduce device group Max Gurtovoy
                   ` (5 more replies)
  0 siblings, 6 replies; 32+ messages in thread
From: Max Gurtovoy @ 2022-07-31 15:43 UTC (permalink / raw)
  To: jasowang, virtio-comment, mst, cohuck, virtio-dev
  Cc: oren, parav, shahafs, aadam, virtio, eperezma, Max Gurtovoy

Hi,
A device group definition will help extending the virtio specefication for
various future features that require a notion of grouping devices together or
managing devices inside a group. A device group include one or more virtio devices.
For now, only support 2 device group types was added.

Also introduce the admin facility to allow manipulating features and configurations
in a generic manner. Using the admin command set, one can manipulate the device itself
and/or to manipulate, if possible, another device within the same device group (for now,
introduce only support of PCI SR-IOV devices grouping managemant).

The admin command set will be extended in the future to support more functionalities.
Some of these functionalities are already under discussions.

The admin virtqueue is the first management interface to issue admin commands from
the admin command set.

Motivation for choosing admin queue as first management interface:
1. It is anticipated that admin queue will be used for managing and configuring
   many different type of resources. For example,
   a. PCI PF configuring PCI VF attributes.
   b. virtio device creating/destroying/configuring subfunctions discussed in [1]
   c. composing device config space of VF or SF such as mac address, number of VQs, virtio features

   Mapping all of them as configuration registers to MMIO will require large MMIO space,
   if done for each VF/SF. Such MMIO implementation in physical devices such as PCI PF and VF
   requires on-chip resources to complete within MMIO access latencies. Such resources are very
   expensive.

2. Such limitation can be overcome by having smaller MMIO register set to build
   a command request response interface. However, such MMIO based command interface
   will be limited to serve single outstanding command execution. Such limitation can
   resulting in high device creation and composing time which can affect VM startup time.
   Often device can queue and service multiple commands in parallel, such command interface
   cannot use parallelism offered by the device.

3. When a command wants to DMA data from one or more physical addresses, for example in the future a
   live migration command may need to fetch device state consist of config space, tens of
   VQs state, VLAN and MAC table, per VQ partial outstanding block IO list database and more.
   Packing one or more DMA addresses over new command interface will be burden some and continue
   to suffer single outstanding command execution latencies. Such limitation is not good for time
   sensitive live migration use cases.

4. A virtio queue overcomes all the above limitations. It also supports DMA and multiple outstanding
   descriptors. Similar mechanism exist today for device specific configuration - the control VQ.

A future work can add another management interface to issue admin commands.

[1] https://lists.oasis-open.org/archives/virtio-comment/202108/msg00025.html

This series include the comments and fixes from V1-V5 of the initial patch sets ("VIRTIO: Provision maximum
MSI-X vectors for a VF" and "Introduce virtio subsystem and Admin virtqueue" [2]).
This series was updated with management for device feature bits instead of MSI-X vector provisioning as example for
using admin command set. Also add dedicated feature bits for the new management capabilities as was discussed in the
previous atempts.

[2] https://lists.oasis-open.org/archives/virtio-comment/202203/msg00005.html


Open issues:
1. CCW and MMIO specification for admin_queue_index register

Changelog:
 - remove MSI-X configuration patch from current version.
 - Addressed comments from MST, Jason Wang and others (feature bits, self device group).
 - simplified the interface.


Max Gurtovoy (5):
  Introduce device group
  Introduce admin command set
  Introduce virtio admin virtqueue
  Add admin_queue_index register to PCI common configuration structure
  Introduce MGMT admin commands

 admin.tex        | 188 +++++++++++++++++++++++++++++++++++++++++++++++
 content.tex      |  68 ++++++++++++++++-
 introduction.tex |  54 ++++++++++++++
 3 files changed, 308 insertions(+), 2 deletions(-)
 create mode 100644 admin.tex

-- 
2.21.0


^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH v6 1/5] Introduce device group
  2022-07-31 15:43 [PATCH v6 0/5] Introduce device group and device management Max Gurtovoy
@ 2022-07-31 15:43 ` Max Gurtovoy
  2022-07-31 20:38   ` Michael S. Tsirkin
                     ` (3 more replies)
  2022-07-31 15:43 ` [PATCH v6 2/5] Introduce admin command set Max Gurtovoy
                   ` (4 subsequent siblings)
  5 siblings, 4 replies; 32+ messages in thread
From: Max Gurtovoy @ 2022-07-31 15:43 UTC (permalink / raw)
  To: jasowang, virtio-comment, mst, cohuck, virtio-dev
  Cc: oren, parav, shahafs, aadam, virtio, eperezma, Max Gurtovoy

Each device group has a type. For now, define 2 initial types of device
groups: Self type and SR-IOV type.

Self type - A group that has a single virtio device as a member.

SR-IOV type - A virtio PCI SR-IOV physical function (PF) and its
PCI SR-IOV virtual functions (VFs). This group may contain one or more
virtio devices.

Each device group has a unique identifier. This identifier is the group
identifier (group_id).

Each device within a device group has a unique identifier. This identifier
is the group member identifier (group_member_id).

Reviewed-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
---
 introduction.tex | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/introduction.tex b/introduction.tex
index aa9ec1b..e8bde45 100644
--- a/introduction.tex
+++ b/introduction.tex
@@ -156,6 +156,30 @@ \subsection{Transition from earlier specification drafts}\label{sec:Transition f
 sections tagged "Legacy Interface" in the section title.
 These highlight the changes made since the earlier drafts.
 
+\subsection{Device group}\label{sec:Introduction / Terminology / Device group}
+
+A device group includes one or more virtio devices. Each device group has a unique group identifier (group_id).
+A device can be a member of one or more device groups.
+A device within a group is identified by a unique group member identifier (group_member_id).
+The scope of the group member identifier is within the group. In other words, two device groups can have overlap group member identifiers.
+A group member identifier is a 64-bit value in range of 0x0 - 0xFFFFFFFFFFFFFFF0.
+A special group member identifier value of 0xFFFFFFFFFFFFFFFF refers to all the devices in a device group.
+
+The supported device groups are:
+\begin{enumerate}
+\item Self type (group identifier = 0) - this group has only one device in the group. Each virtio device is a member of at least one device group, the Self type group.
+For this group type, the device is identified by group member identifier of 0.
+
+\item SR-IOV type (group identifier = 1) - this group includes a virtio PCI SR-IOV physical function (PF) and all its virtual functions (VFs).
+For this group type, the PF device has group member identifier of 0. Each VF group member identifier equals the PCI VF number according to the PCI Express Base Specification
+(Single Root I/O Virtualization and Sharing chapter). Devices that are members in this group use the Virtio PCI transport (for more details see \ref{sec:Virtio Transport Options / Virtio Over PCI Bus}).
+\end{enumerate}
+
+\begin{note}
+  The same device can be identified by different identifiers within different groups. For example, A virtual function device has a group
+  member identifier equals to 0 within Self type group and a group member identifier equals to VF number (e.g 4) within SR-IOV type group.
+\end{note}
+
 \section{Structure Specifications}\label{sec:Structure Specifications}
 
 Many device and driver in-memory structure layouts are documented using
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH v6 2/5] Introduce admin command set
  2022-07-31 15:43 [PATCH v6 0/5] Introduce device group and device management Max Gurtovoy
  2022-07-31 15:43 ` [PATCH v6 1/5] Introduce device group Max Gurtovoy
@ 2022-07-31 15:43 ` Max Gurtovoy
  2022-07-31 20:59   ` Michael S. Tsirkin
  2022-07-31 15:43 ` [virtio-comment] [PATCH v6 3/5] Introduce virtio admin virtqueue Max Gurtovoy
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 32+ messages in thread
From: Max Gurtovoy @ 2022-07-31 15:43 UTC (permalink / raw)
  To: jasowang, virtio-comment, mst, cohuck, virtio-dev
  Cc: oren, parav, shahafs, aadam, virtio, eperezma, Max Gurtovoy

This command set is used for essential administrative and management
operations.

Admin commands should be submitted to a well defined management
interface.

Reviewed-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
---
 admin.tex   | 82 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 content.tex |  2 ++
 2 files changed, 84 insertions(+)
 create mode 100644 admin.tex

diff --git a/admin.tex b/admin.tex
new file mode 100644
index 0000000..d14f8f7
--- /dev/null
+++ b/admin.tex
@@ -0,0 +1,82 @@
+\section{Administration command set}\label{sec:Basic Facilities of a Virtio Device / Administration command set}
+
+The Administration command set (also known as Admin command set) defines the commands that can be issued using a management interface.
+This mechanism, for example, can be used by a system administrator that wants to configure a device before it is initialized by its driver.
+
+All the Admin commands are of the following form:
+
+\begin{lstlisting}
+struct virtio_admin_cmd {
+        /* Device-readable part */
+        le16 command;
+        /*
+         * 0 - Self
+         * 1 - SR-IOV
+         * 2 - 65535 are reserved
+         */
+        le16 group_id;
+        le64 group_member_id;
+        /* reserved for common cmd fields */
+        u8 reserved[12];
+        u8 command_specific_data[];
+
+        /* Device-writable part */
+        u8 status;
+        u8 command_specific_error;
+        u8 command_specific_result[];
+};
+\end{lstlisting}
+
+The following table describes the generic Admin status codes:
+
+\begin{tabular}{|l|l|l|}
+\hline
+Opcode & Status & Description \\
+\hline \hline
+00h   & VIRTIO_ADMIN_STATUS_OK    & successful completion  \\
+\hline
+01h   & VIRTIO_ADMIN_STATUS_CS_ERR    & command specific error  \\
+\hline
+02h   & VIRTIO_ADMIN_STATUS_COMMAND_UNSUPPORTED    & unsupported or invalid opcode  \\
+\hline
+03h   & VIRTIO_ADMIN_STATUS_INVALID_FIELD    & invalid field was set  \\
+\hline
+04h   & VIRTIO_ADMIN_STATUS_INVALID_GROUP    & invalid group identifier was set  \\
+\hline
+05h   & VIRTIO_ADMIN_STATUS_INVALID_MEM    & invalid group member identifier was set  \\
+\hline
+\end{tabular}
+
+The \field{command}, \field{group_id}, \field{group_member_id} and \field{command_specific_data} are
+set by the driver, and the device sets the \field{status}, the
+\field{command_specific_error} and the \field{command_specific_result},
+if needed.
+
+Reserved common fields are ignored by the device and to be zeroed by the driver.
+
+The mandatory fields to be set by the driver, for all admin commands, are the \field{command}, \field{group_id} and \field{group_member_id}.
+
+The optional unused fields to be zeroed by the driver.
+
+The \field{command} defines the opcode for the command. The value for each command can be found in each command section.
+
+The \field{group_id} defines the designated device group for the command.
+The \field{group_member_id} defines the designated device within the device group for the command.
+See section \ref{sec:Introduction / Terminology / Device group} for group identifier and group member identifier numbering for various device groups.
+
+The \field{command_specific_error} should be inspected by the driver only if \field{status} is set to
+VIRTIO_ADMIN_STATUS_CS_ERR by the device. In this case, the content of \field{command_specific_error}
+holds the command specific error. If \field{status} is not set to VIRTIO_ADMIN_STATUS_CS_ERR, the
+\field{command_specific_error} value is undefined and should be ignored by the driver.
+
+The following table describes the Admin command set:
+
+\begin{tabular}{|l|l|}
+\hline
+Opcode & Command \\
+\hline \hline
+0000h - 7FFFh   & Generic admin cmds    \\
+\hline
+8000h - FFFFh   & Reserved    \\
+\hline
+\end{tabular}
diff --git a/content.tex b/content.tex
index 7508dd1..caab298 100644
--- a/content.tex
+++ b/content.tex
@@ -449,6 +449,8 @@ \section{Exporting Objects}\label{sec:Basic Facilities of a Virtio Device / Expo
 types. It is RECOMMENDED that devices generate version 4
 UUIDs as specified by \hyperref[intro:rfc4122]{[RFC4122]}.
 
+\input{admin.tex}
+
 \chapter{General Initialization And Device Operation}\label{sec:General Initialization And Device Operation}
 
 We start with an overview of device initialization, then expand on the
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [virtio-comment] [PATCH v6 3/5] Introduce virtio admin virtqueue
  2022-07-31 15:43 [PATCH v6 0/5] Introduce device group and device management Max Gurtovoy
  2022-07-31 15:43 ` [PATCH v6 1/5] Introduce device group Max Gurtovoy
  2022-07-31 15:43 ` [PATCH v6 2/5] Introduce admin command set Max Gurtovoy
@ 2022-07-31 15:43 ` Max Gurtovoy
  2022-07-31 21:00   ` Michael S. Tsirkin
  2022-07-31 15:43 ` [PATCH v6 4/5] Add admin_queue_index register to PCI common configuration structure Max Gurtovoy
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 32+ messages in thread
From: Max Gurtovoy @ 2022-07-31 15:43 UTC (permalink / raw)
  To: jasowang, virtio-comment, mst, cohuck, virtio-dev
  Cc: oren, parav, shahafs, aadam, virtio, eperezma, Max Gurtovoy

In one of the many use cases a user wants to manipulate features and
configuration of the virtio devices regardless of the device type
(net/block/console). For that the admin command set introduced. The
admin virtqueue will be the first management interface to issue admin
commands.

Currently virtio specification defines control virtqueue to manipulate
features and configuration of the device it operates on. However,
control virtqueue commands are device type specific, which makes it very
difficult to extend for device agnostic commands.

To support this requirement in elegant way, this patch introduces a new
admin virtqueue interface.

Manipulate features via admin virtqueue is asynchronous, scalable, easy
to extend and doesn't require additional and expensive on-die resources
to be allocated for every new feature that will be added in the future.

Reviewed-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
---
 admin.tex   | 12 ++++++++++++
 content.tex |  6 ++++--
 2 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/admin.tex b/admin.tex
index d14f8f7..26ac60e 100644
--- a/admin.tex
+++ b/admin.tex
@@ -80,3 +80,15 @@ \section{Administration command set}\label{sec:Basic Facilities of a Virtio Devi
 8000h - FFFFh   & Reserved    \\
 \hline
 \end{tabular}
+
+\section{Admin Virtqueues}\label{sec:Basic Facilities of a Virtio Device / Admin Virtqueues}
+
+An admin virtqueue is a management interface of a device that can be used to send administrative
+commands (see \ref{sec:Basic Facilities of a Virtio Device / Administration command set}) to manipulate
+various features of the device and/or to manipulate various features, if possible, of another device.
+
+An admin virtqueue exists for a certain device if VIRTIO_F_ADMIN_VQ feature is
+negotiated. The index of the admin virtqueue is exposed by the device in a
+transport specific manner.
+
+If VIRTIO_F_ADMIN_VQ has been negotiated, the driver will use the admin virtqueue to send all admin commands.
diff --git a/content.tex b/content.tex
index caab298..c15423e 100644
--- a/content.tex
+++ b/content.tex
@@ -99,10 +99,10 @@ \section{Feature Bits}\label{sec:Basic Facilities of a Virtio Device / Feature B
 \begin{description}
 \item[0 to 23, and 50 to 127] Feature bits for the specific device type
 
-\item[24 to 40] Feature bits reserved for extensions to the queue and
+\item[24 to 41] Feature bits reserved for extensions to the queue and
   feature negotiation mechanisms
 
-\item[41 to 49, and 128 and above] Feature bits reserved for future extensions.
+\item[42 to 49, and 128 and above] Feature bits reserved for future extensions.
 \end{description}
 
 \begin{note}
@@ -6841,6 +6841,8 @@ \chapter{Reserved Feature Bits}\label{sec:Reserved Feature Bits}
   that the driver can reset a queue individually.
   See \ref{sec:Basic Facilities of a Virtio Device / Virtqueues / Virtqueue Reset}.
 
+  \item[VIRTIO_F_ADMIN_VQ (41)] This feature indicates that an administration virtqueue is supported.
+
 \end{description}
 
 \drivernormative{\section}{Reserved Feature Bits}{Reserved Feature Bits}
-- 
2.21.0


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH v6 4/5] Add admin_queue_index register to PCI common configuration structure
  2022-07-31 15:43 [PATCH v6 0/5] Introduce device group and device management Max Gurtovoy
                   ` (2 preceding siblings ...)
  2022-07-31 15:43 ` [virtio-comment] [PATCH v6 3/5] Introduce virtio admin virtqueue Max Gurtovoy
@ 2022-07-31 15:43 ` Max Gurtovoy
  2022-07-31 21:03   ` Michael S. Tsirkin
  2022-07-31 15:43 ` [PATCH v6 5/5] Introduce MGMT admin commands Max Gurtovoy
  2022-07-31 16:27 ` [PATCH v6 0/5] Introduce device group and device management Michael S. Tsirkin
  5 siblings, 1 reply; 32+ messages in thread
From: Max Gurtovoy @ 2022-07-31 15:43 UTC (permalink / raw)
  To: jasowang, virtio-comment, mst, cohuck, virtio-dev
  Cc: oren, parav, shahafs, aadam, virtio, eperezma, Max Gurtovoy

This new register will be used for querying the index of the admin
virtqueue of a virtio device. To configure, reset or enable the admin
virtqueue, the driver should follow existing queue configuration/setup
sequence.

Reviewed-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
---
 content.tex | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/content.tex b/content.tex
index c15423e..5fda1a0 100644
--- a/content.tex
+++ b/content.tex
@@ -904,6 +904,9 @@ \subsubsection{Common configuration structure layout}\label{sec:Virtio Transport
         le64 queue_device;              /* read-write */
         le16 queue_notify_data;         /* read-only for driver */
         le16 queue_reset;               /* read-write */
+
+        /* About admin virtqueue. */
+        le16 admin_queue_index;         /* read-only for driver */
 };
 \end{lstlisting}
 
@@ -989,6 +992,10 @@ \subsubsection{Common configuration structure layout}\label{sec:Virtio Transport
         This field exists only if VIRTIO_F_RING_RESET has been
         negotiated. (see \ref{sec:Basic Facilities of a Virtio Device / Virtqueues / Virtqueue Reset}).
 
+\item[\field{admin_queue_index}]
+        The device uses this to report the index of the admin virtqueue.
+        This field always exists. Its value is valid only if VIRTIO_F_ADMIN_VQ has been negotiated.
+
 \end{description}
 
 \devicenormative{\paragraph}{Common configuration structure layout}{Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Common configuration structure layout}
@@ -1075,6 +1082,9 @@ \subsubsection{Common configuration structure layout}\label{sec:Virtio Transport
 were used before the queue reset.
 (see \ref{sec:Basic Facilities of a Virtio Device / Virtqueues / Virtqueue Reset}).
 
+For configuring the admin virtqueue, the driver MUST use the value of \field{admin_queue_index}.
+For more details on virtqueue configuration see section \ref{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI-specific Initialization And Device Operation / Device Initialization / Virtqueue Configuration}.
+
 \subsubsection{Notification structure layout}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Notification capability}
 
 The notification location is found using the VIRTIO_PCI_CAP_NOTIFY_CFG
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH v6 5/5] Introduce MGMT admin commands
  2022-07-31 15:43 [PATCH v6 0/5] Introduce device group and device management Max Gurtovoy
                   ` (3 preceding siblings ...)
  2022-07-31 15:43 ` [PATCH v6 4/5] Add admin_queue_index register to PCI common configuration structure Max Gurtovoy
@ 2022-07-31 15:43 ` Max Gurtovoy
  2022-07-31 21:16   ` Michael S. Tsirkin
  2022-07-31 16:27 ` [PATCH v6 0/5] Introduce device group and device management Michael S. Tsirkin
  5 siblings, 1 reply; 32+ messages in thread
From: Max Gurtovoy @ 2022-07-31 15:43 UTC (permalink / raw)
  To: jasowang, virtio-comment, mst, cohuck, virtio-dev
  Cc: oren, parav, shahafs, aadam, virtio, eperezma, Max Gurtovoy

Introduce the concept of a management and a managed device and add
example of using this concept to manage resources.

A management device (from any type) supports the
VIRTIO_ADMIN_DEVICE_MGMT and VIRTIO_ADMIN_DEVICE_MGMT_ATTRS admin
commands to manage some resources of a managed device.

Reviewed-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
---
 admin.tex        | 96 +++++++++++++++++++++++++++++++++++++++++++++++-
 content.tex      | 50 +++++++++++++++++++++++++
 introduction.tex | 30 +++++++++++++++
 3 files changed, 175 insertions(+), 1 deletion(-)

diff --git a/admin.tex b/admin.tex
index 26ac60e..73bc411 100644
--- a/admin.tex
+++ b/admin.tex
@@ -75,12 +75,106 @@ \section{Administration command set}\label{sec:Basic Facilities of a Virtio Devi
 \hline
 Opcode & Command \\
 \hline \hline
-0000h - 7FFFh   & Generic admin cmds    \\
+0000h   & VIRTIO_ADMIN_DEVICE_MGMT    \\
+\hline
+0001h   & VIRTIO_ADMIN_DEVICE_MGMT_ATTRS    \\
+\hline
+0002h - 7FFFh   & Generic admin cmds    \\
 \hline
 8000h - FFFFh   & Reserved    \\
 \hline
 \end{tabular}
 
+\begin{note}
+{The following commands are mandatory for management devices: VIRTIO_ADMIN_DEVICE_MGMT and VIRTIO_ADMIN_DEVICE_MGMT_ATTRS.}
+\end{note}
+
+\subsection{VIRTIO ADMIN DEVICE MGMT command}\label{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE MGMT command}
+
+The VIRTIO_ADMIN_DEVICE_MGMT command is used by a management device to manage resources of managed virtio devices.
+The \field{command} is set to VIRTIO_ADMIN_DEVICE_MGMT by the driver.
+
+The command specific data set by the driver is of form:
+\begin{lstlisting}
+struct virtio_admin_device_mgmt_data {
+        /*
+         * 0 - reserved
+         * 1 - assign resource to the designated device
+         * 2 - query resource of the designated device
+         * 3 - 255 are reserved for future operations
+         */
+        u8 operation;
+        /*
+         * 0 - Device feature bits 0 to 63
+         * 1 - 65535 are reserved
+         */
+        le16 resource;
+        /*
+         * The value to the given resource.
+         * Valid only if operation == 1 (assign).
+         * resource = 0 indicates device feature bits 0 to 63.
+         */
+        le64 resource_val;
+        u8 reserved[5];
+};
+\end{lstlisting}
+
+The following table describes the command specific error codes codes:
+
+\begin{tabular}{|l|l|l|}
+\hline
+Opcode & Status & Description \\
+\hline \hline
+00h   & VIRTIO_ADMIN_CS_ERR_DEV_IN_USE    & designated device is in use, operation failed   \\
+\hline
+01h   & VIRTIO_ADMIN_CS_RSC_VAL_INVALID    & resource value is invalid  \\
+\hline
+02h   & VIRTIO_ADMIN_CS_RSC_UNSUPPORTED    & unsupported or invalid resource  \\
+\hline
+03h   & VIRTIO_ADMIN_CS_OP_UNSUPPORTED    & unsupported or invalid operation  \\
+\hline
+04h - FFh   & Reserved    & -  \\
+\hline
+\end{tabular}
+
+The device, upon success, returns a result that describes the information according to the requested operation.
+This result is of form:
+\begin{lstlisting}
+struct virtio_admin_device_mgmt_result {
+        le64 resource_val;
+        u8 reserved[8];
+};
+\end{lstlisting}
+
+If the \field{operation} was set to 1 ("assign resource to the designated device") and the command succeeded, the device successfully assigned
+resources to the designated device. Also, upon success, the assigned value should returned in the \field{resource_val} of the virtio_admin_device_mgmt_result
+structure and should be equal to the \field{resource_val} of the virtio_admin_device_mgmt_data structure set by the driver.
+In case of a failure, the value of this field is undefined and will be ignored by the driver.
+
+If the \field{operation} was set to 2 ("query resource of the designated device") and the command succeeded, the device will return the requested
+value in the \field{resource_val} of the virtio_admin_device_mgmt_result structure.
+In case of a failure, the value of this field is undefined and will be ignored by the driver.
+
+\subsection{VIRTIO ADMIN DEVICE MGMT ATTRS command}\label{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE MGMT ATTRS command}
+
+The VIRTIO_ADMIN_DEVICE_MGMT_ATTRS command has no command specific data set by the driver.
+The \field{command} is set to VIRTIO_ADMIN_DEVICE_MGMT_ATTRS.
+
+The device, upon success, returns a result that describes the management device attributes.
+This result is of form:
+\begin{lstlisting}
+struct virtio_admin_device_mgmt_attrs_result {
+        /* Indicates which of the below fields are valid.
+         * Bits 0 - managed_device_features_63_0
+         * Bits 1 - 63 - reserved for future fields
+         */
+        le64 attrs_mask;
+
+        le64 managed_device_features_63_0;
+        u8 reserved[48];
+};
+\end{lstlisting}
+
 \section{Admin Virtqueues}\label{sec:Basic Facilities of a Virtio Device / Admin Virtqueues}
 
 An admin virtqueue is a management interface of a device that can be used to send administrative
diff --git a/content.tex b/content.tex
index 5fda1a0..b15a887 100644
--- a/content.tex
+++ b/content.tex
@@ -451,6 +451,48 @@ \section{Exporting Objects}\label{sec:Basic Facilities of a Virtio Device / Expo
 
 \input{admin.tex}
 
+\section{Device management}\label{sec:Basic Facilities of a Virtio Device / Device management}
+
+A device group might consist of one or more virtio devices. For example, virtio PCI SR-IOV PF and its VFs compose a SR-IOV type device group.
+A capable PCI SR-IOV PF virtio device might act as the management device in this group, and its PCI SR-IOV VFs are the managed devices.
+A management device might have various management capabilities and attributes to manage its managed devices. These capabilities and attributes exposed
+via feature bits and in the result of VIRTIO_ADMIN_DEVICE_MGMT_ATTRS command (see section \ref{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE MGMT ATTRS command} for more details).
+
+The management device will use the VIRTIO_ADMIN_DEVICE_MGMT admin command to manage its managed devices (see section
+\ref{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE MGMT command} for more details).
+
+\subsection{Device features management}\label{sec:Basic Facilities of a Virtio Device / Device management / Device features management}
+
+A virtio management device that negotiated VIRTIO_F_FEATURE_63_0_MGMT can control the values of the first 64 feature bits for its managed devices.
+In the result of VIRTIO_ADMIN_DEVICE_MGMT_ATTRS admin command, a capable management device will return the feature bits that can be enabled for its managed devices in \field{managed_device_features_63_0}
+(see section \ref{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE MGMT ATTRS command} for more details).
+A management device driver, using VIRTIO_ADMIN_DEVICE_MGMT can update the feature bits assignment for a specific managed device.
+A management device driver can only assign a subset of the bits returned in \field{managed_device_features_63_0}.
+A management device driver can update the managed device feature bits only before the managed device driver set the ACKNOWLEDGE status bit.
+
+\begin{note}
+{It's recommended to manipulate the managed device feature bits before loading the managed device driver.}
+\end{note}
+
+A typical sequence for configuring feature bits is following:
+
+\begin{enumerate}
+\item Ensure that the driver of the managed device doesn't run on the device
+
+\item Load the driver of the management device
+
+\item Query the management device capabilities using commands VIRTIO_ADMIN_DEVICE_MGMT_ATTRS and look for \field{managed_device_features_63_0}
+
+\item Find the managed device group_id and group_member_id
+
+\item Query the current feature bits configuration using command VIRTIO_ADMIN_DEVICE_MGMT (query operation)
+
+\item Assign desired feature bits for the managed device using command VIRTIO_ADMIN_DEVICE_MGMT (assign operation)
+
+\item After successful completion of the assignment, load the managed device driver
+
+\end{enumerate}
+
 \chapter{General Initialization And Device Operation}\label{sec:General Initialization And Device Operation}
 
 We start with an overview of device initialization, then expand on the
@@ -6853,6 +6895,14 @@ \chapter{Reserved Feature Bits}\label{sec:Reserved Feature Bits}
 
   \item[VIRTIO_F_ADMIN_VQ (41)] This feature indicates that an administration virtqueue is supported.
 
+  \item[VIRTIO_F_SR_IOV_MGMT_DEVICE (42)] This feature indicates that the device is a SR-IOV management device.
+  SR-IOV management device is a device that can manage other devices within the SR-IOV device group.
+  For more details see \ref{sec:Introduction / Terminology / Virtio SR-IOV type management device}.
+
+  \item[VIRTIO_F_FEATURE_63_0_MGMT (43)] This feature indicates that the device is a management device that supports
+  feature mgmt of bits 0 to 63 for its managed devices.
+  For more details see \ref{sec:Basic Facilities of a Virtio Device / Device management / Device features management}.
+
 \end{description}
 
 \drivernormative{\section}{Reserved Feature Bits}{Reserved Feature Bits}
diff --git a/introduction.tex b/introduction.tex
index e8bde45..e20df24 100644
--- a/introduction.tex
+++ b/introduction.tex
@@ -173,6 +173,9 @@ \subsection{Device group}\label{sec:Introduction / Terminology / Device group}
 \item SR-IOV type (group identifier = 1) - this group includes a virtio PCI SR-IOV physical function (PF) and all its virtual functions (VFs).
 For this group type, the PF device has group member identifier of 0. Each VF group member identifier equals the PCI VF number according to the PCI Express Base Specification
 (Single Root I/O Virtualization and Sharing chapter). Devices that are members in this group use the Virtio PCI transport (for more details see \ref{sec:Virtio Transport Options / Virtio Over PCI Bus}).
+A PCI SR-IOV PF device can act as a management device for SR-IOV type device group. A PCI SR-IOV VF device can act as a managed device for SR-IOV type device group
+(see \ref{sec:Introduction / Terminology / Virtio management device} and \ref{sec:Introduction / Terminology / Virtio managed device} for more information).
+SR-IOV type device group may contain zero or one management devices.
 \end{enumerate}
 
 \begin{note}
@@ -180,6 +183,33 @@ \subsection{Device group}\label{sec:Introduction / Terminology / Device group}
   member identifier equals to 0 within Self type group and a group member identifier equals to VF number (e.g 4) within SR-IOV type group.
 \end{note}
 
+\subsection{Virtio management device}\label{sec:Introduction / Terminology / Virtio management device}
+
+A virtio device that supports VIRTIO_ADMIN_DEVICE_MGMT and VIRTIO_ADMIN_DEVICE_MGMT_ATTRS admin commands (see
+\ref{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE MGMT command} and
+\ref{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE MGMT ATTRS command} for more information).
+This device can manage a virtio managed device. A device group may contain zero or more management devices.
+
+A PCI SR-IOV Physical Function based virtio device is an example of a possible virtio management device (for SR-IOV type device group).
+
+\subsection{Virtio SR-IOV type management device}\label{sec:Introduction / Terminology / Virtio SR-IOV type management device}
+
+A virtio management device for SR-IOV type device group. A driver can issue the management admin commands to this device with \field{group_id} set to 1
+and \field{group_member_id} set to an identifier that corresponds with one of its managed virtio devices (PCI SR-IOV VFs).
+A Virtio SR-IOV type management device must expose VIRTIO_F_SR_IOV_MGMT_DEVICE feature bit.
+
+\subsection{virtio managed device}\label{sec:Introduction / Terminology / Virtio managed device}
+
+A virtio device that can be managed by a virtio management device.
+A device group may contain zero or more managed devices.
+
+A PCI SR-IOV Virtual Function based virtio device is an example of a possible virtio managed device (for SR-IOV type device group).
+
+\subsection{virtio SR-IOV type managed device}\label{sec:Introduction / Terminology / Virtio SR-IOV type managed device}
+
+A virtio managed device for SR-IOV type device group. This device is a PCI SR-IOV VF and is managed by a virtio SR-IOV type management device (virtio PCI SR-IOV PF).
+It is implied that all the virtio PCI SR-IOV VFs related to a virtio PCI SR-IOV PF that is virtio SR-IOV type management device are SR-IOV type managed devices.
+
 \section{Structure Specifications}\label{sec:Structure Specifications}
 
 Many device and driver in-memory structure layouts are documented using
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: [PATCH v6 0/5] Introduce device group and device management
  2022-07-31 15:43 [PATCH v6 0/5] Introduce device group and device management Max Gurtovoy
                   ` (4 preceding siblings ...)
  2022-07-31 15:43 ` [PATCH v6 5/5] Introduce MGMT admin commands Max Gurtovoy
@ 2022-07-31 16:27 ` Michael S. Tsirkin
  5 siblings, 0 replies; 32+ messages in thread
From: Michael S. Tsirkin @ 2022-07-31 16:27 UTC (permalink / raw)
  To: Max Gurtovoy
  Cc: jasowang, virtio-comment, cohuck, virtio-dev, oren, parav,
	shahafs, aadam, virtio, eperezma

On Sun, Jul 31, 2022 at 06:43:49PM +0300, Max Gurtovoy wrote:
> Hi,
> A device group definition will help extending the virtio specefication for
> various future features that require a notion of grouping devices together or
> managing devices inside a group. A device group include one or more virtio devices.
> For now, only support 2 device group types was added.
> 
> Also introduce the admin facility to allow manipulating features and configurations
> in a generic manner. Using the admin command set, one can manipulate the device itself
> and/or to manipulate, if possible, another device within the same device group (for now,
> introduce only support of PCI SR-IOV devices grouping managemant).
> 
> The admin command set will be extended in the future to support more functionalities.
> Some of these functionalities are already under discussions.
> 
> The admin virtqueue is the first management interface to issue admin commands from
> the admin command set.
> Motivation for choosing admin queue as first management interface:
> 1. It is anticipated that admin queue will be used for managing and configuring
>    many different type of resources. For example,
>    a. PCI PF configuring PCI VF attributes.
>    b. virtio device creating/destroying/configuring subfunctions discussed in [1]
>    c. composing device config space of VF or SF such as mac address, number of VQs, virtio features
> 
>    Mapping all of them as configuration registers to MMIO will require large MMIO space,
>    if done for each VF/SF. Such MMIO implementation in physical devices such as PCI PF and VF
>    requires on-chip resources to complete within MMIO access latencies. Such resources are very
>    expensive.
> 
> 2. Such limitation can be overcome by having smaller MMIO register set to build
>    a command request response interface. However, such MMIO based command interface
>    will be limited to serve single outstanding command execution. Such limitation can
>    resulting in high device creation and composing time which can affect VM startup time.
>    Often device can queue and service multiple commands in parallel, such command interface
>    cannot use parallelism offered by the device.
> 
> 3. When a command wants to DMA data from one or more physical addresses, for example in the future a
>    live migration command may need to fetch device state consist of config space, tens of
>    VQs state, VLAN and MAC table, per VQ partial outstanding block IO list database and more.
>    Packing one or more DMA addresses over new command interface will be burden some and continue
>    to suffer single outstanding command execution latencies. Such limitation is not good for time
>    sensitive live migration use cases.
> 
> 4. A virtio queue overcomes all the above limitations. It also supports DMA and multiple outstanding
>    descriptors. Similar mechanism exist today for device specific configuration - the control VQ.
> 
> A future work can add another management interface to issue admin commands.
> 
> [1] https://lists.oasis-open.org/archives/virtio-comment/202108/msg00025.html
> 
> This series include the comments and fixes from V1-V5 of the initial patch sets ("VIRTIO: Provision maximum
> MSI-X vectors for a VF" and "Introduce virtio subsystem and Admin virtqueue" [2]).
> This series was updated with management for device feature bits instead of MSI-X vector provisioning as example for
> using admin command set. Also add dedicated feature bits for the new management capabilities as was discussed in the
> previous atempts.
> 
> [2] https://lists.oasis-open.org/archives/virtio-comment/202203/msg00005.html
> 
> 
> Open issues:
> 1. CCW and MMIO specification for admin_queue_index register
> 
> Changelog:
>  - remove MSI-X configuration patch from current version.
>  - Addressed comments from MST, Jason Wang and others (feature bits, self device group).
>  - simplified the interface.
> 


Thanks a lot! Will review. I assume these are changes since v5.
So in the next revision (if any) you will want to list
- changes in v7
	... future changes
- changes in v6
	this list

> 
> Max Gurtovoy (5):
>   Introduce device group
>   Introduce admin command set
>   Introduce virtio admin virtqueue
>   Add admin_queue_index register to PCI common configuration structure
>   Introduce MGMT admin commands
> 
>  admin.tex        | 188 +++++++++++++++++++++++++++++++++++++++++++++++
>  content.tex      |  68 ++++++++++++++++-
>  introduction.tex |  54 ++++++++++++++
>  3 files changed, 308 insertions(+), 2 deletions(-)
>  create mode 100644 admin.tex
> 
> -- 
> 2.21.0


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v6 1/5] Introduce device group
  2022-07-31 15:43 ` [PATCH v6 1/5] Introduce device group Max Gurtovoy
@ 2022-07-31 20:38   ` Michael S. Tsirkin
  2022-07-31 20:42   ` Michael S. Tsirkin
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 32+ messages in thread
From: Michael S. Tsirkin @ 2022-07-31 20:38 UTC (permalink / raw)
  To: Max Gurtovoy
  Cc: jasowang, virtio-comment, cohuck, virtio-dev, oren, parav,
	shahafs, aadam, virtio, eperezma

On Sun, Jul 31, 2022 at 06:43:50PM +0300, Max Gurtovoy wrote:
> Each device group has a type. For now, define 2 initial types of device
> groups: Self type and SR-IOV type.
> 
> Self type - A group that has a single virtio device as a member.
> 
> SR-IOV type - A virtio PCI SR-IOV physical function (PF) and its
> PCI SR-IOV virtual functions (VFs). This group may contain one or more
> virtio devices.
> 
> Each device group has a unique identifier. This identifier is the group
> identifier (group_id).
> 
> Each device within a device group has a unique identifier. This identifier
> is the group member identifier (group_member_id).
> 
> Reviewed-by: Parav Pandit <parav@nvidia.com>
> Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>


This looks good to me. Minor corrections below.

> ---
>  introduction.tex | 24 ++++++++++++++++++++++++
>  1 file changed, 24 insertions(+)
> 
> diff --git a/introduction.tex b/introduction.tex
> index aa9ec1b..e8bde45 100644
> --- a/introduction.tex
> +++ b/introduction.tex
> @@ -156,6 +156,30 @@ \subsection{Transition from earlier specification drafts}\label{sec:Transition f
>  sections tagged "Legacy Interface" in the section title.
>  These highlight the changes made since the earlier drafts.
>  
> +\subsection{Device group}\label{sec:Introduction / Terminology / Device group}
> +

Maybe an introductory sentence. "It is occasionally useful to manage
multiple virtio devices as a group."

> +A device group includes one or more virtio devices. Each device group has a unique group identifier (group_id).

Wait a second. Is this true? To me it looks like group identifier should
actually be group type identifier. And it's not unique.

This rename will of course ripple to follow up patches.

> +A device can be a member of one or more device groups.
> +A device within a group is identified by a unique group member identifier (group_member_id).

group_member_id and group_id seem unused here. Drop?

> +The scope of the group member identifier is within the group. In other words, two device groups can have overlap group member identifiers.
> +A group member identifier is a 64-bit value in range of 0x0 - 0xFFFFFFFFFFFFFFF0.

in the range between 0x0 and 0xFFFFFFFFFFFFFFF0 inclusive.

> +A special group member identifier value of 0xFFFFFFFFFFFFFFFF refers to all the devices in a device group.

all devices in the device group.

> +
> +The supported device groups are:
> +\begin{enumerate}
> +\item Self type (group identifier = 0) - this group has only one device in the group. Each virtio device is a member of at least one device group, the Self type group.
> +For this group type, the device is identified by group member identifier of 0.

by a group member identifier

> +
> +\item SR-IOV type (group identifier = 1) - this group includes a virtio PCI SR-IOV physical function (PF) and all its virtual functions (VFs).
> +For this group type, the PF device has group member identifier of 0. Each VF group member identifier equals the PCI VF number according to the PCI Express Base Specification
> +(Single Root I/O Virtualization and Sharing chapter). Devices that are members in this group use the Virtio PCI transport (for more details see \ref{sec:Virtio Transport Options / Virtio Over PCI Bus}).
> +\end{enumerate}
> +
> +\begin{note}
> +  The same device can be identified by different identifiers within different groups. For example, A virtual function device has a group
> +  member identifier equals to 0 within Self type group and a group member identifier equals to VF number (e.g 4) within SR-IOV type group.
> +\end{note}
> +
>  \section{Structure Specifications}\label{sec:Structure Specifications}
>  
>  Many device and driver in-memory structure layouts are documented using
> -- 
> 2.21.0


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v6 1/5] Introduce device group
  2022-07-31 15:43 ` [PATCH v6 1/5] Introduce device group Max Gurtovoy
  2022-07-31 20:38   ` Michael S. Tsirkin
@ 2022-07-31 20:42   ` Michael S. Tsirkin
  2022-07-31 21:19   ` Michael S. Tsirkin
  2022-08-02 13:41   ` Michael S. Tsirkin
  3 siblings, 0 replies; 32+ messages in thread
From: Michael S. Tsirkin @ 2022-07-31 20:42 UTC (permalink / raw)
  To: Max Gurtovoy
  Cc: jasowang, virtio-comment, cohuck, virtio-dev, oren, parav,
	shahafs, aadam, virtio, eperezma

On Sun, Jul 31, 2022 at 06:43:50PM +0300, Max Gurtovoy wrote:
> +\begin{note}
> +  The same device can be identified by different identifiers within different groups. For example, A virtual function device has a group

identified by different identifiers is a tautology

maybe "has different member identifiers"?

> +  member identifier equals to 0 within Self type group and a group member identifier equals to VF number (e.g 4) within SR-IOV type group.

And I think we should add:

... however, each device is a part of exactly one group of a given
group type, and has exactly one identifier within each group it is
a part of.



> +\end{note}
> +
>  \section{Structure Specifications}\label{sec:Structure Specifications}
>  
>  Many device and driver in-memory structure layouts are documented using
> -- 
> 2.21.0


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v6 2/5] Introduce admin command set
  2022-07-31 15:43 ` [PATCH v6 2/5] Introduce admin command set Max Gurtovoy
@ 2022-07-31 20:59   ` Michael S. Tsirkin
  2022-07-31 23:56     ` [virtio-comment] " Max Gurtovoy
  0 siblings, 1 reply; 32+ messages in thread
From: Michael S. Tsirkin @ 2022-07-31 20:59 UTC (permalink / raw)
  To: Max Gurtovoy
  Cc: jasowang, virtio-comment, cohuck, virtio-dev, oren, parav,
	shahafs, aadam, virtio, eperezma

On Sun, Jul 31, 2022 at 06:43:51PM +0300, Max Gurtovoy wrote:
> This command set is used for essential administrative and management
> operations.
> 
> Admin commands should be submitted to a well defined management
> interface.
> 
> Reviewed-by: Parav Pandit <parav@nvidia.com>
> Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>

Looks good. Minor comments.

> ---
>  admin.tex   | 82 +++++++++++++++++++++++++++++++++++++++++++++++++++++
>  content.tex |  2 ++
>  2 files changed, 84 insertions(+)
>  create mode 100644 admin.tex
> 
> diff --git a/admin.tex b/admin.tex
> new file mode 100644
> index 0000000..d14f8f7
> --- /dev/null
> +++ b/admin.tex
> @@ -0,0 +1,82 @@
> +\section{Administration command set}\label{sec:Basic Facilities of a Virtio Device / Administration command set}
> +
> +The Administration command set (also known as Admin command set) defines the commands that can be issued using a management interface.

this is the only place where we talk about a management interface

A better definition maybe:

commands that can be used to control one device in a group through
another device in a group?


> +This mechanism, for example, can be used by a system administrator that wants to configure a device before it is initialized by its driver.

Let's just avoid mentioning system administrator here since that's easy
to confuse with admin commands.

	For example, this mechanism can be used to configure a device before it is initialized by its driver.


> +All the Admin commands are of the following form:
> +
> +\begin{lstlisting}
> +struct virtio_admin_cmd {
> +        /* Device-readable part */
> +        le16 command;
> +        /*
> +         * 0 - Self
> +         * 1 - SR-IOV
> +         * 2 - 65535 are reserved
> +         */
> +        le16 group_id;

we need 32 bits of padding to align the next field naturally.
Stick them where you see fit. 16 after command and 16 here?

> +        le64 group_member_id;
> +        /* reserved for common cmd fields */
> +        u8 reserved[12];
> +        u8 command_specific_data[];
> +
> +        /* Device-writable part */
> +        u8 status;
> +        u8 command_specific_error;
> +        u8 command_specific_result[];
> +};
> +\end{lstlisting}
> +
> +The following table describes the generic Admin status codes:
> +
> +\begin{tabular}{|l|l|l|}
> +\hline
> +Opcode & Status & Description \\
> +\hline \hline
> +00h   & VIRTIO_ADMIN_STATUS_OK    & successful completion  \\
> +\hline
> +01h   & VIRTIO_ADMIN_STATUS_CS_ERR    & command specific error  \\
> +\hline
> +02h   & VIRTIO_ADMIN_STATUS_COMMAND_UNSUPPORTED    & unsupported or invalid opcode  \\
> +\hline
> +03h   & VIRTIO_ADMIN_STATUS_INVALID_FIELD    & invalid field was set  \\
> +\hline
> +04h   & VIRTIO_ADMIN_STATUS_INVALID_GROUP    & invalid group identifier was set  \\
> +\hline
> +05h   & VIRTIO_ADMIN_STATUS_INVALID_MEM    & invalid group member identifier was set  \\
> +\hline
> +\end{tabular}
> +
> +The \field{command}, \field{group_id}, \field{group_member_id} and \field{command_specific_data} are
> +set by the driver, and the device sets the \field{status}, the
> +\field{command_specific_error} and the \field{command_specific_result},
> +if needed.
> +
> +Reserved common fields are ignored by the device and to be zeroed by the driver.

Add this to conformance statement maybe.

> +
> +The mandatory fields to be set by the driver, for all admin commands, are the \field{command}, \field{group_id} and \field{group_member_id}.
> +
> +The optional unused fields to be zeroed by the driver.

A bit confusing - these are the only fields I see. Is this still
relevant?

> +The \field{command} defines the opcode for the command. The value for each command can be found in each command section.
> +
> +The \field{group_id} defines the designated device group for the command.

> +The \field{group_member_id} defines the designated device within the device group for the command.
> +See section \ref{sec:Introduction / Terminology / Device group} for group identifier and group member identifier numbering for various device groups.
> +


Hmm not I think it's not precise.

the command is sent by driver to specific a device.

this device (should we come up with a name for it? maybe
the administrative device?) together with the group type id is what defines the device
group. For example, for the SRIOV type the commands must be
sent to the PF and refer to the group including the PF
through which the commands are sent and its VFs.


For each group that supports specific commands, there is
one or more devices through which the commands can be given?



> +The \field{command_specific_error} should be inspected by the driver only if \field{status} is set to
> +VIRTIO_ADMIN_STATUS_CS_ERR by the device. In this case, the content of \field{command_specific_error}
> +holds the command specific error. If \field{status} is not set to VIRTIO_ADMIN_STATUS_CS_ERR, the
> +\field{command_specific_error} value is undefined and should be ignored by the driver.
> +
> +The following table describes the Admin command set:
> +
> +\begin{tabular}{|l|l|}
> +\hline
> +Opcode & Command \\
> +\hline \hline
> +0000h - 7FFFh   & Generic admin cmds    \\
> +\hline
> +8000h - FFFFh   & Reserved    \\
> +\hline
> +\end{tabular}
> diff --git a/content.tex b/content.tex
> index 7508dd1..caab298 100644
> --- a/content.tex
> +++ b/content.tex
> @@ -449,6 +449,8 @@ \section{Exporting Objects}\label{sec:Basic Facilities of a Virtio Device / Expo
>  types. It is RECOMMENDED that devices generate version 4
>  UUIDs as specified by \hyperref[intro:rfc4122]{[RFC4122]}.
>  
> +\input{admin.tex}
> +
>  \chapter{General Initialization And Device Operation}\label{sec:General Initialization And Device Operation}
>  
>  We start with an overview of device initialization, then expand on the
> -- 
> 2.21.0


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v6 3/5] Introduce virtio admin virtqueue
  2022-07-31 15:43 ` [virtio-comment] [PATCH v6 3/5] Introduce virtio admin virtqueue Max Gurtovoy
@ 2022-07-31 21:00   ` Michael S. Tsirkin
  2022-07-31 23:07     ` Max Gurtovoy
  0 siblings, 1 reply; 32+ messages in thread
From: Michael S. Tsirkin @ 2022-07-31 21:00 UTC (permalink / raw)
  To: Max Gurtovoy
  Cc: jasowang, virtio-comment, cohuck, virtio-dev, oren, parav,
	shahafs, aadam, virtio, eperezma

On Sun, Jul 31, 2022 at 06:43:52PM +0300, Max Gurtovoy wrote:
> In one of the many use cases a user wants to manipulate features and
> configuration of the virtio devices regardless of the device type
> (net/block/console). For that the admin command set introduced. The
> admin virtqueue will be the first management interface to issue admin
> commands.
> 
> Currently virtio specification defines control virtqueue to manipulate
> features and configuration of the device it operates on. However,
> control virtqueue commands are device type specific, which makes it very
> difficult to extend for device agnostic commands.
> 
> To support this requirement in elegant way, this patch introduces a new
> admin virtqueue interface.
> 
> Manipulate features via admin virtqueue is asynchronous, scalable, easy
> to extend and doesn't require additional and expensive on-die resources
> to be allocated for every new feature that will be added in the future.
> 
> Reviewed-by: Parav Pandit <parav@nvidia.com>
> Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
> ---
>  admin.tex   | 12 ++++++++++++
>  content.tex |  6 ++++--
>  2 files changed, 16 insertions(+), 2 deletions(-)
> 
> diff --git a/admin.tex b/admin.tex
> index d14f8f7..26ac60e 100644
> --- a/admin.tex
> +++ b/admin.tex
> @@ -80,3 +80,15 @@ \section{Administration command set}\label{sec:Basic Facilities of a Virtio Devi
>  8000h - FFFFh   & Reserved    \\
>  \hline
>  \end{tabular}
> +
> +\section{Admin Virtqueues}\label{sec:Basic Facilities of a Virtio Device / Admin Virtqueues}
> +
> +An admin virtqueue is a management interface of a device that can be used to send administrative
> +commands (see \ref{sec:Basic Facilities of a Virtio Device / Administration command set}) to manipulate
> +various features of the device and/or to manipulate various features, if possible, of another device.
> +
> +An admin virtqueue exists for a certain device if VIRTIO_F_ADMIN_VQ feature is
> +negotiated. The index of the admin virtqueue is exposed by the device in a
> +transport specific manner.
> +
> +If VIRTIO_F_ADMIN_VQ has been negotiated, the driver will use the admin virtqueue to send all admin commands.
> diff --git a/content.tex b/content.tex
> index caab298..c15423e 100644
> --- a/content.tex
> +++ b/content.tex
> @@ -99,10 +99,10 @@ \section{Feature Bits}\label{sec:Basic Facilities of a Virtio Device / Feature B
>  \begin{description}
>  \item[0 to 23, and 50 to 127] Feature bits for the specific device type
>  
> -\item[24 to 40] Feature bits reserved for extensions to the queue and
> +\item[24 to 41] Feature bits reserved for extensions to the queue and
>    feature negotiation mechanisms
>  
> -\item[41 to 49, and 128 and above] Feature bits reserved for future extensions.
> +\item[42 to 49, and 128 and above] Feature bits reserved for future extensions.
>  \end{description}
>  
>  \begin{note}
> @@ -6841,6 +6841,8 @@ \chapter{Reserved Feature Bits}\label{sec:Reserved Feature Bits}
>    that the driver can reset a queue individually.
>    See \ref{sec:Basic Facilities of a Virtio Device / Virtqueues / Virtqueue Reset}.
>  
> +  \item[VIRTIO_F_ADMIN_VQ (41)] This feature indicates that an administration virtqueue is supported.

mention here that index is in transport specific manner too?

> +
>  \end{description}
>  
>  \drivernormative{\section}{Reserved Feature Bits}{Reserved Feature Bits}
> -- 
> 2.21.0


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v6 4/5] Add admin_queue_index register to PCI common configuration structure
  2022-07-31 15:43 ` [PATCH v6 4/5] Add admin_queue_index register to PCI common configuration structure Max Gurtovoy
@ 2022-07-31 21:03   ` Michael S. Tsirkin
  2022-08-01  0:11     ` Max Gurtovoy
  0 siblings, 1 reply; 32+ messages in thread
From: Michael S. Tsirkin @ 2022-07-31 21:03 UTC (permalink / raw)
  To: Max Gurtovoy
  Cc: jasowang, virtio-comment, cohuck, virtio-dev, oren, parav,
	shahafs, aadam, virtio, eperezma

On Sun, Jul 31, 2022 at 06:43:53PM +0300, Max Gurtovoy wrote:
> This new register will be used for querying the index of the admin
> virtqueue of a virtio device. To configure, reset or enable the admin
> virtqueue, the driver should follow existing queue configuration/setup
> sequence.
> 
> Reviewed-by: Parav Pandit <parav@nvidia.com>
> Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>


Can you please at least add text to MMIO and CCW that drivers and
devices must not negotiate the new feature bit? Will help avoid confusion.

> ---
>  content.tex | 10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/content.tex b/content.tex
> index c15423e..5fda1a0 100644
> --- a/content.tex
> +++ b/content.tex
> @@ -904,6 +904,9 @@ \subsubsection{Common configuration structure layout}\label{sec:Virtio Transport
>          le64 queue_device;              /* read-write */
>          le16 queue_notify_data;         /* read-only for driver */
>          le16 queue_reset;               /* read-write */
> +
> +        /* About admin virtqueue. */
> +        le16 admin_queue_index;         /* read-only for driver */
>  };
>  \end{lstlisting}
>  
> @@ -989,6 +992,10 @@ \subsubsection{Common configuration structure layout}\label{sec:Virtio Transport
>          This field exists only if VIRTIO_F_RING_RESET has been
>          negotiated. (see \ref{sec:Basic Facilities of a Virtio Device / Virtqueues / Virtqueue Reset}).
>  
> +\item[\field{admin_queue_index}]
> +        The device uses this to report the index of the admin virtqueue.
> +        This field always exists. Its value is valid only if VIRTIO_F_ADMIN_VQ has been negotiated.
> +
>  \end{description}
>  
>  \devicenormative{\paragraph}{Common configuration structure layout}{Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Common configuration structure layout}

we have a mess with this exists versus valid. I think exists is the same
is valid personally. Do others object if we say same as for reset here?
Not a big deal either way, we need to clean this up later.

> @@ -1075,6 +1082,9 @@ \subsubsection{Common configuration structure layout}\label{sec:Virtio Transport
>  were used before the queue reset.
>  (see \ref{sec:Basic Facilities of a Virtio Device / Virtqueues / Virtqueue Reset}).
>  
> +For configuring the admin virtqueue, the driver MUST use the value of \field{admin_queue_index}.
> +For more details on virtqueue configuration see section \ref{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI-specific Initialization And Device Operation / Device Initialization / Virtqueue Configuration}.
> +
>  \subsubsection{Notification structure layout}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Notification capability}
>  
>  The notification location is found using the VIRTIO_PCI_CAP_NOTIFY_CFG
> -- 
> 2.21.0


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v6 5/5] Introduce MGMT admin commands
  2022-07-31 15:43 ` [PATCH v6 5/5] Introduce MGMT admin commands Max Gurtovoy
@ 2022-07-31 21:16   ` Michael S. Tsirkin
  0 siblings, 0 replies; 32+ messages in thread
From: Michael S. Tsirkin @ 2022-07-31 21:16 UTC (permalink / raw)
  To: Max Gurtovoy
  Cc: jasowang, virtio-comment, cohuck, virtio-dev, oren, parav,
	shahafs, aadam, virtio, eperezma

On Sun, Jul 31, 2022 at 06:43:54PM +0300, Max Gurtovoy wrote:
> Introduce the concept of a management and a managed device and add
> example of using this concept to manage resources.
> 
> A management device (from any type) supports the
> VIRTIO_ADMIN_DEVICE_MGMT and VIRTIO_ADMIN_DEVICE_MGMT_ATTRS admin
> commands to manage some resources of a managed device.
> 
> Reviewed-by: Parav Pandit <parav@nvidia.com>
> Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>

Actually this does two things:
- more SRIOV group definitions
- new MGMT commands


> ---
>  admin.tex        | 96 +++++++++++++++++++++++++++++++++++++++++++++++-
>  content.tex      | 50 +++++++++++++++++++++++++
>  introduction.tex | 30 +++++++++++++++
>  3 files changed, 175 insertions(+), 1 deletion(-)
> 
> diff --git a/admin.tex b/admin.tex
> index 26ac60e..73bc411 100644
> --- a/admin.tex
> +++ b/admin.tex
> @@ -75,12 +75,106 @@ \section{Administration command set}\label{sec:Basic Facilities of a Virtio Devi
>  \hline
>  Opcode & Command \\
>  \hline \hline
> -0000h - 7FFFh   & Generic admin cmds    \\
> +0000h   & VIRTIO_ADMIN_DEVICE_MGMT    \\
> +\hline
> +0001h   & VIRTIO_ADMIN_DEVICE_MGMT_ATTRS    \\
> +\hline
> +0002h - 7FFFh   & Generic admin cmds    \\
>  \hline
>  8000h - FFFFh   & Reserved    \\
>  \hline
>  \end{tabular}
>  
> +\begin{note}
> +{The following commands are mandatory for management devices: VIRTIO_ADMIN_DEVICE_MGMT and VIRTIO_ADMIN_DEVICE_MGMT_ATTRS.}
> +\end{note}

this is the first time we mention management devices. We should not use
terms before they are defined.

> +
> +\subsection{VIRTIO ADMIN DEVICE MGMT command}\label{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE MGMT command}
> +
> +The VIRTIO_ADMIN_DEVICE_MGMT command is used by a management device to manage resources of managed virtio devices.

Please rephrase and change terminology to avoid saying manage 3 times in
a single sentence :)
In this case we have concepts of a group already.


Can these commands ever work for the self type?
I don't think they can, right?









> +The \field{command} is set to VIRTIO_ADMIN_DEVICE_MGMT by the driver.
> +
> +The command specific data set by the driver is of form:
> +\begin{lstlisting}
> +struct virtio_admin_device_mgmt_data {
> +        /*
> +         * 0 - reserved
> +         * 1 - assign resource to the designated device
> +         * 2 - query resource of the designated device
> +         * 3 - 255 are reserved for future operations
> +         */
> +        u8 operation;
> +        /*
> +         * 0 - Device feature bits 0 to 63
> +         * 1 - 65535 are reserved
> +         */
> +        le16 resource;

I think we need to move padding around here to align the next field.

> +        /*
> +         * The value to the given resource.
> +         * Valid only if operation == 1 (assign).
> +         * resource = 0 indicates device feature bits 0 to 63.
> +         */
> +        le64 resource_val;
> +        u8 reserved[5];
> +};
> +\end{lstlisting}
> +
> +The following table describes the command specific error codes codes:
> +
> +\begin{tabular}{|l|l|l|}
> +\hline
> +Opcode & Status & Description \\
> +\hline \hline
> +00h   & VIRTIO_ADMIN_CS_ERR_DEV_IN_USE    & designated device is in use, operation failed   \\
> +\hline
> +01h   & VIRTIO_ADMIN_CS_RSC_VAL_INVALID    & resource value is invalid  \\
> +\hline
> +02h   & VIRTIO_ADMIN_CS_RSC_UNSUPPORTED    & unsupported or invalid resource  \\
> +\hline
> +03h   & VIRTIO_ADMIN_CS_OP_UNSUPPORTED    & unsupported or invalid operation  \\
> +\hline
> +04h - FFh   & Reserved    & -  \\
> +\hline
> +\end{tabular}
> +
> +The device, upon success, returns a result that describes the information according to the requested operation.
> +This result is of form:
> +\begin{lstlisting}
> +struct virtio_admin_device_mgmt_result {
> +        le64 resource_val;
> +        u8 reserved[8];
> +};
> +\end{lstlisting}
> +
> +If the \field{operation} was set to 1 ("assign resource to the designated device") and the command succeeded, the device successfully assigned
> +resources to the designated device. Also, upon success, the assigned value should returned in the \field{resource_val} of the virtio_admin_device_mgmt_result
> +structure and should be equal to the \field{resource_val} of the virtio_admin_device_mgmt_data structure set by the driver.
> +In case of a failure, the value of this field is undefined and will be ignored by the driver.
> +
> +If the \field{operation} was set to 2 ("query resource of the designated device") and the command succeeded, the device will return the requested
> +value in the \field{resource_val} of the virtio_admin_device_mgmt_result structure.
> +In case of a failure, the value of this field is undefined and will be ignored by the driver.
> +
> +\subsection{VIRTIO ADMIN DEVICE MGMT ATTRS command}\label{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE MGMT ATTRS command}
> +
> +The VIRTIO_ADMIN_DEVICE_MGMT_ATTRS command has no command specific data set by the driver.
> +The \field{command} is set to VIRTIO_ADMIN_DEVICE_MGMT_ATTRS.
> +
> +The device, upon success, returns a result that describes the management device attributes.
> +This result is of form:
> +\begin{lstlisting}
> +struct virtio_admin_device_mgmt_attrs_result {
> +        /* Indicates which of the below fields are valid.
> +         * Bits 0 - managed_device_features_63_0
> +         * Bits 1 - 63 - reserved for future fields
> +         */
> +        le64 attrs_mask;
> +
> +        le64 managed_device_features_63_0;
> +        u8 reserved[48];

why are we reserving these?

> +};
> +\end{lstlisting}
> +
>  \section{Admin Virtqueues}\label{sec:Basic Facilities of a Virtio Device / Admin Virtqueues}
>  
>  An admin virtqueue is a management interface of a device that can be used to send administrative
> diff --git a/content.tex b/content.tex
> index 5fda1a0..b15a887 100644
> --- a/content.tex
> +++ b/content.tex
> @@ -451,6 +451,48 @@ \section{Exporting Objects}\label{sec:Basic Facilities of a Virtio Device / Expo
>  
>  \input{admin.tex}
>  
> +\section{Device management}\label{sec:Basic Facilities of a Virtio Device / Device management}
> +
> +A device group might consist of one or more virtio devices. For example, virtio PCI SR-IOV PF and its VFs compose a SR-IOV type device group.
> +A capable PCI SR-IOV PF virtio device might act as the management device in this group, and its PCI SR-IOV VFs are the managed devices.
> +A management device might have various management capabilities and attributes to manage its managed devices. These capabilities and attributes exposed
> +via feature bits and in the result of VIRTIO_ADMIN_DEVICE_MGMT_ATTRS command (see section \ref{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE MGMT ATTRS command} for more details).
> +
> +The management device will use the VIRTIO_ADMIN_DEVICE_MGMT admin command to manage its managed devices (see section
> +\ref{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE MGMT command} for more details).
> +
> +\subsection{Device features management}\label{sec:Basic Facilities of a Virtio Device / Device management / Device features management}
> +
> +A virtio management device that negotiated VIRTIO_F_FEATURE_63_0_MGMT can control the values of the first 64 feature bits for its managed devices.
> +In the result of VIRTIO_ADMIN_DEVICE_MGMT_ATTRS admin command, a capable management device will return the feature bits that can be enabled for its managed devices in \field{managed_device_features_63_0}
> +(see section \ref{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE MGMT ATTRS command} for more details).
> +A management device driver, using VIRTIO_ADMIN_DEVICE_MGMT can update the feature bits assignment for a specific managed device.
> +A management device driver can only assign a subset of the bits returned in \field{managed_device_features_63_0}.
> +A management device driver can update the managed device feature bits only before the managed device driver set the ACKNOWLEDGE status bit.
> +
> +\begin{note}
> +{It's recommended to manipulate the managed device feature bits before loading the managed device driver.}
> +\end{note}
> +
> +A typical sequence for configuring feature bits is following:
> +
> +\begin{enumerate}
> +\item Ensure that the driver of the managed device doesn't run on the device
> +
> +\item Load the driver of the management device
> +
> +\item Query the management device capabilities using commands VIRTIO_ADMIN_DEVICE_MGMT_ATTRS and look for \field{managed_device_features_63_0}
> +
> +\item Find the managed device group_id and group_member_id
> +
> +\item Query the current feature bits configuration using command VIRTIO_ADMIN_DEVICE_MGMT (query operation)
> +
> +\item Assign desired feature bits for the managed device using command VIRTIO_ADMIN_DEVICE_MGMT (assign operation)
> +
> +\item After successful completion of the assignment, load the managed device driver
> +
> +\end{enumerate}
> +
>  \chapter{General Initialization And Device Operation}\label{sec:General Initialization And Device Operation}
>  
>  We start with an overview of device initialization, then expand on the
> @@ -6853,6 +6895,14 @@ \chapter{Reserved Feature Bits}\label{sec:Reserved Feature Bits}
>  
>    \item[VIRTIO_F_ADMIN_VQ (41)] This feature indicates that an administration virtqueue is supported.
>  
> +  \item[VIRTIO_F_SR_IOV_MGMT_DEVICE (42)] This feature indicates that the device is a SR-IOV management device.
> +  SR-IOV management device is a device that can manage other devices within the SR-IOV device group.
> +  For more details see \ref{sec:Introduction / Terminology / Virtio SR-IOV type management device}.
> +
> +  \item[VIRTIO_F_FEATURE_63_0_MGMT (43)] This feature indicates that the device is a management device that supports
> +  feature mgmt of bits 0 to 63 for its managed devices.

eschew abbreviation in text please.

why limit this to bits 0 to 63? why not make the command cover any number of bits?


> +  For more details see \ref{sec:Basic Facilities of a Virtio Device / Device management / Device features management}.
> +
>  \end{description}
>  
>  \drivernormative{\section}{Reserved Feature Bits}{Reserved Feature Bits}
> diff --git a/introduction.tex b/introduction.tex
> index e8bde45..e20df24 100644
> --- a/introduction.tex
> +++ b/introduction.tex
> @@ -173,6 +173,9 @@ \subsection{Device group}\label{sec:Introduction / Terminology / Device group}
>  \item SR-IOV type (group identifier = 1) - this group includes a virtio PCI SR-IOV physical function (PF) and all its virtual functions (VFs).
>  For this group type, the PF device has group member identifier of 0. Each VF group member identifier equals the PCI VF number according to the PCI Express Base Specification
>  (Single Root I/O Virtualization and Sharing chapter). Devices that are members in this group use the Virtio PCI transport (for more details see \ref{sec:Virtio Transport Options / Virtio Over PCI Bus}).
> +A PCI SR-IOV PF device can act as a management device for SR-IOV type device group. A PCI SR-IOV VF device can act as a managed device for SR-IOV type device group
> +(see \ref{sec:Introduction / Terminology / Virtio management device} and \ref{sec:Introduction / Terminology / Virtio managed device} for more information).
> +SR-IOV type device group may contain zero or one management devices.
>  \end{enumerate}
>  
>  \begin{note}
> @@ -180,6 +183,33 @@ \subsection{Device group}\label{sec:Introduction / Terminology / Device group}
>    member identifier equals to 0 within Self type group and a group member identifier equals to VF number (e.g 4) within SR-IOV type group.
>  \end{note}
>  
> +\subsection{Virtio management device}\label{sec:Introduction / Terminology / Virtio management device}
> +
> +A virtio device that supports VIRTIO_ADMIN_DEVICE_MGMT and VIRTIO_ADMIN_DEVICE_MGMT_ATTRS admin commands (see
> +\ref{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE MGMT command} and
> +\ref{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE MGMT ATTRS command} for more information).
> +This device can manage a virtio managed device. A device group may contain zero or more management devices.
> +
> +A PCI SR-IOV Physical Function based virtio device is an example of a possible virtio management device (for SR-IOV type device group).
> +
> +\subsection{Virtio SR-IOV type management device}\label{sec:Introduction / Terminology / Virtio SR-IOV type management device}
> +
> +A virtio management device for SR-IOV type device group. A driver can issue the management admin commands to this device with \field{group_id} set to 1
> +and \field{group_member_id} set to an identifier that corresponds with one of its managed virtio devices (PCI SR-IOV VFs).
> +A Virtio SR-IOV type management device must expose VIRTIO_F_SR_IOV_MGMT_DEVICE feature bit.
> +
> +\subsection{virtio managed device}\label{sec:Introduction / Terminology / Virtio managed device}
> +
> +A virtio device that can be managed by a virtio management device.
> +A device group may contain zero or more managed devices.
> +
> +A PCI SR-IOV Virtual Function based virtio device is an example of a possible virtio managed device (for SR-IOV type device group).
> +
> +\subsection{virtio SR-IOV type managed device}\label{sec:Introduction / Terminology / Virtio SR-IOV type managed device}
> +
> +A virtio managed device for SR-IOV type device group. This device is a PCI SR-IOV VF and is managed by a virtio SR-IOV type management device (virtio PCI SR-IOV PF).
> +It is implied that all the virtio PCI SR-IOV VFs related to a virtio PCI SR-IOV PF that is virtio SR-IOV type management device are SR-IOV type managed devices.
> +
>  \section{Structure Specifications}\label{sec:Structure Specifications}
>  
>  Many device and driver in-memory structure layouts are documented using
> -- 
> 2.21.0


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v6 1/5] Introduce device group
  2022-07-31 15:43 ` [PATCH v6 1/5] Introduce device group Max Gurtovoy
  2022-07-31 20:38   ` Michael S. Tsirkin
  2022-07-31 20:42   ` Michael S. Tsirkin
@ 2022-07-31 21:19   ` Michael S. Tsirkin
  2022-08-02 13:41   ` Michael S. Tsirkin
  3 siblings, 0 replies; 32+ messages in thread
From: Michael S. Tsirkin @ 2022-07-31 21:19 UTC (permalink / raw)
  To: Max Gurtovoy
  Cc: jasowang, virtio-comment, cohuck, virtio-dev, oren, parav,
	shahafs, aadam, virtio, eperezma

On Sun, Jul 31, 2022 at 06:43:50PM +0300, Max Gurtovoy wrote:
> Each device group has a type. For now, define 2 initial types of device
> groups: Self type and SR-IOV type.
> 
> Self type - A group that has a single virtio device as a member.
> 
> SR-IOV type - A virtio PCI SR-IOV physical function (PF) and its
> PCI SR-IOV virtual functions (VFs). This group may contain one or more
> virtio devices.
> 
> Each device group has a unique identifier. This identifier is the group
> identifier (group_id).
> 
> Each device within a device group has a unique identifier. This identifier
> is the group member identifier (group_member_id).
> 
> Reviewed-by: Parav Pandit <parav@nvidia.com>
> Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
> ---
>  introduction.tex | 24 ++++++++++++++++++++++++
>  1 file changed, 24 insertions(+)
> 
> diff --git a/introduction.tex b/introduction.tex
> index aa9ec1b..e8bde45 100644
> --- a/introduction.tex
> +++ b/introduction.tex
> @@ -156,6 +156,30 @@ \subsection{Transition from earlier specification drafts}\label{sec:Transition f
>  sections tagged "Legacy Interface" in the section title.
>  These highlight the changes made since the earlier drafts.
>  
> +\subsection{Device group}\label{sec:Introduction / Terminology / Device group}
> +
> +A device group includes one or more virtio devices. Each device group has a unique group identifier (group_id).
> +A device can be a member of one or more device groups.
> +A device within a group is identified by a unique group member identifier (group_member_id).
> +The scope of the group member identifier is within the group. In other words, two device groups can have overlap group member identifiers.
> +A group member identifier is a 64-bit value in range of 0x0 - 0xFFFFFFFFFFFFFFF0.
> +A special group member identifier value of 0xFFFFFFFFFFFFFFFF refers to all the devices in a device group.
> +
> +The supported device groups are:
> +\begin{enumerate}
> +\item Self type (group identifier = 0) - this group has only one device in the group. Each virtio device is a member of at least one device group, the Self type group.
> +For this group type, the device is identified by group member identifier of 0.
> +
> +\item SR-IOV type (group identifier = 1) - this group includes a virtio PCI SR-IOV physical function (PF) and all its virtual functions (VFs).
> +For this group type, the PF device has group member identifier of 0. Each VF group member identifier equals the PCI VF number according to the PCI Express Base Specification
> +(Single Root I/O Virtualization and Sharing chapter). Devices that are members in this group use the Virtio PCI transport (for more details see \ref{sec:Virtio Transport Options / Virtio Over PCI Bus}).
> +\end{enumerate}
> +
> +\begin{note}
> +  The same device can be identified by different identifiers within different groups. For example, A virtual function device has a group


Hmm. I just realized.  So if a given device is part of several groups,
then these groups can support different command sets.
Of course this breaks the idea of using feature bits to specify
which commands are supported.
Interesting.

My question is what are some uses of the self type? And what are some
other examples where a single device is part of multiple groups?



> +  member identifier equals to 0 within Self type group and a group member identifier equals to VF number (e.g 4) within SR-IOV type group.
> +\end{note}
> +
>  \section{Structure Specifications}\label{sec:Structure Specifications}
>  
>  Many device and driver in-memory structure layouts are documented using
> -- 
> 2.21.0


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v6 3/5] Introduce virtio admin virtqueue
  2022-07-31 21:00   ` Michael S. Tsirkin
@ 2022-07-31 23:07     ` Max Gurtovoy
  2022-08-01  6:03       ` Michael S. Tsirkin
  0 siblings, 1 reply; 32+ messages in thread
From: Max Gurtovoy @ 2022-07-31 23:07 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: jasowang, virtio-comment, cohuck, virtio-dev, oren, parav,
	shahafs, aadam, virtio, eperezma


On 8/1/2022 12:00 AM, Michael S. Tsirkin wrote:
> On Sun, Jul 31, 2022 at 06:43:52PM +0300, Max Gurtovoy wrote:
>> In one of the many use cases a user wants to manipulate features and
>> configuration of the virtio devices regardless of the device type
>> (net/block/console). For that the admin command set introduced. The
>> admin virtqueue will be the first management interface to issue admin
>> commands.
>>
>> Currently virtio specification defines control virtqueue to manipulate
>> features and configuration of the device it operates on. However,
>> control virtqueue commands are device type specific, which makes it very
>> difficult to extend for device agnostic commands.
>>
>> To support this requirement in elegant way, this patch introduces a new
>> admin virtqueue interface.
>>
>> Manipulate features via admin virtqueue is asynchronous, scalable, easy
>> to extend and doesn't require additional and expensive on-die resources
>> to be allocated for every new feature that will be added in the future.
>>
>> Reviewed-by: Parav Pandit <parav@nvidia.com>
>> Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
>> ---
>>   admin.tex   | 12 ++++++++++++
>>   content.tex |  6 ++++--
>>   2 files changed, 16 insertions(+), 2 deletions(-)
>>
>> diff --git a/admin.tex b/admin.tex
>> index d14f8f7..26ac60e 100644
>> --- a/admin.tex
>> +++ b/admin.tex
>> @@ -80,3 +80,15 @@ \section{Administration command set}\label{sec:Basic Facilities of a Virtio Devi
>>   8000h - FFFFh   & Reserved    \\
>>   \hline
>>   \end{tabular}
>> +
>> +\section{Admin Virtqueues}\label{sec:Basic Facilities of a Virtio Device / Admin Virtqueues}
>> +
>> +An admin virtqueue is a management interface of a device that can be used to send administrative
>> +commands (see \ref{sec:Basic Facilities of a Virtio Device / Administration command set}) to manipulate
>> +various features of the device and/or to manipulate various features, if possible, of another device.
>> +
>> +An admin virtqueue exists for a certain device if VIRTIO_F_ADMIN_VQ feature is
>> +negotiated. The index of the admin virtqueue is exposed by the device in a
>> +transport specific manner.
>> +
>> +If VIRTIO_F_ADMIN_VQ has been negotiated, the driver will use the admin virtqueue to send all admin commands.
>> diff --git a/content.tex b/content.tex
>> index caab298..c15423e 100644
>> --- a/content.tex
>> +++ b/content.tex
>> @@ -99,10 +99,10 @@ \section{Feature Bits}\label{sec:Basic Facilities of a Virtio Device / Feature B
>>   \begin{description}
>>   \item[0 to 23, and 50 to 127] Feature bits for the specific device type
>>   
>> -\item[24 to 40] Feature bits reserved for extensions to the queue and
>> +\item[24 to 41] Feature bits reserved for extensions to the queue and
>>     feature negotiation mechanisms
>>   
>> -\item[41 to 49, and 128 and above] Feature bits reserved for future extensions.
>> +\item[42 to 49, and 128 and above] Feature bits reserved for future extensions.
>>   \end{description}
>>   
>>   \begin{note}
>> @@ -6841,6 +6841,8 @@ \chapter{Reserved Feature Bits}\label{sec:Reserved Feature Bits}
>>     that the driver can reset a queue individually.
>>     See \ref{sec:Basic Facilities of a Virtio Device / Virtqueues / Virtqueue Reset}.
>>   
>> +  \item[VIRTIO_F_ADMIN_VQ (41)] This feature indicates that an administration virtqueue is supported.
> mention here that index is in transport specific manner too?

is this ok ?

+  \item[VIRTIO_F_ADMIN_VQ (41)] This feature indicates that an 
administration virtqueue is supported.
+  The index of the administration virtqueue exposed in a transport 
specific manner.


>
>> +
>>   \end{description}
>>   
>>   \drivernormative{\section}{Reserved Feature Bits}{Reserved Feature Bits}
>> -- 
>> 2.21.0


^ permalink raw reply	[flat|nested] 32+ messages in thread

* [virtio-comment] Re: [PATCH v6 2/5] Introduce admin command set
  2022-07-31 20:59   ` Michael S. Tsirkin
@ 2022-07-31 23:56     ` Max Gurtovoy
  0 siblings, 0 replies; 32+ messages in thread
From: Max Gurtovoy @ 2022-07-31 23:56 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: jasowang, virtio-comment, cohuck, virtio-dev, oren, parav,
	shahafs, aadam, virtio, eperezma


On 7/31/2022 11:59 PM, Michael S. Tsirkin wrote:
> On Sun, Jul 31, 2022 at 06:43:51PM +0300, Max Gurtovoy wrote:
>> This command set is used for essential administrative and management
>> operations.
>>
>> Admin commands should be submitted to a well defined management
>> interface.
>>
>> Reviewed-by: Parav Pandit <parav@nvidia.com>
>> Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
> Looks good. Minor comments.
>
>> ---
>>   admin.tex   | 82 +++++++++++++++++++++++++++++++++++++++++++++++++++++
>>   content.tex |  2 ++
>>   2 files changed, 84 insertions(+)
>>   create mode 100644 admin.tex
>>
>> diff --git a/admin.tex b/admin.tex
>> new file mode 100644
>> index 0000000..d14f8f7
>> --- /dev/null
>> +++ b/admin.tex
>> @@ -0,0 +1,82 @@
>> +\section{Administration command set}\label{sec:Basic Facilities of a Virtio Device / Administration command set}
>> +
>> +The Administration command set (also known as Admin command set) defines the commands that can be issued using a management interface.
> this is the only place where we talk about a management interface
>
> A better definition maybe:
>
> commands that can be used to control one device in a group through
> another device in a group?

I think we already agreed on that wording. We must start adding 
"reviewed-by" stamps.

The above sentence is not true. The goal of this command set is not only 
"control one device in a group through another device in a group".

You can simply see it in this series: definition of the self type device 
group and MGMT_ATTR command.

>
>
>> +This mechanism, for example, can be used by a system administrator that wants to configure a device before it is initialized by its driver.
> Let's just avoid mentioning system administrator here since that's easy
> to confuse with admin commands.
>
> 	For example, this mechanism can be used to configure a device before it is initialized by its driver.

ok. will do.


>
>
>> +All the Admin commands are of the following form:
>> +
>> +\begin{lstlisting}
>> +struct virtio_admin_cmd {
>> +        /* Device-readable part */
>> +        le16 command;
>> +        /*
>> +         * 0 - Self
>> +         * 1 - SR-IOV
>> +         * 2 - 65535 are reserved
>> +         */
>> +        le16 group_id;
> we need 32 bits of padding to align the next field naturally.
> Stick them where you see fit. 16 after command and 16 here?

I think i'll take the entire reserved[12] array above the group_member_id.

ok?

>
>> +        le64 group_member_id;
>> +        /* reserved for common cmd fields */
>> +        u8 reserved[12];
>> +        u8 command_specific_data[];
>> +
>> +        /* Device-writable part */
>> +        u8 status;
>> +        u8 command_specific_error;
>> +        u8 command_specific_result[];
>> +};
>> +\end{lstlisting}
>> +
>> +The following table describes the generic Admin status codes:
>> +
>> +\begin{tabular}{|l|l|l|}
>> +\hline
>> +Opcode & Status & Description \\
>> +\hline \hline
>> +00h   & VIRTIO_ADMIN_STATUS_OK    & successful completion  \\
>> +\hline
>> +01h   & VIRTIO_ADMIN_STATUS_CS_ERR    & command specific error  \\
>> +\hline
>> +02h   & VIRTIO_ADMIN_STATUS_COMMAND_UNSUPPORTED    & unsupported or invalid opcode  \\
>> +\hline
>> +03h   & VIRTIO_ADMIN_STATUS_INVALID_FIELD    & invalid field was set  \\
>> +\hline
>> +04h   & VIRTIO_ADMIN_STATUS_INVALID_GROUP    & invalid group identifier was set  \\
>> +\hline
>> +05h   & VIRTIO_ADMIN_STATUS_INVALID_MEM    & invalid group member identifier was set  \\
>> +\hline
>> +\end{tabular}
>> +
>> +The \field{command}, \field{group_id}, \field{group_member_id} and \field{command_specific_data} are
>> +set by the driver, and the device sets the \field{status}, the
>> +\field{command_specific_error} and the \field{command_specific_result},
>> +if needed.
>> +
>> +Reserved common fields are ignored by the device and to be zeroed by the driver.
> Add this to conformance statement maybe.

like this ?

+\devicenormative{\subsection}{Administration command set}{Basic 
Facilities of a Virtio Device / Administration command set}
+A device MUST ignore reserved fields for all Administration commands 
arriving to it.
+
+\drivernormative{\subsection}{Administration command set}{Basic 
Facilities of a Virtio Device / Administration command set}
+A driver SHOULD zero reserved fields for all Administration commands 
sending to a device.


>
>> +
>> +The mandatory fields to be set by the driver, for all admin commands, are the \field{command}, \field{group_id} and \field{group_member_id}.
>> +
>> +The optional unused fields to be zeroed by the driver.
> A bit confusing - these are the only fields I see. Is this still
> relevant?

I'll remove.


>
>> +The \field{command} defines the opcode for the command. The value for each command can be found in each command section.
>> +
>> +The \field{group_id} defines the designated device group for the command.
>> +The \field{group_member_id} defines the designated device within the device group for the command.
>> +See section \ref{sec:Introduction / Terminology / Device group} for group identifier and group member identifier numbering for various device groups.
>> +
>
> Hmm not I think it's not precise.
>
> the command is sent by driver to specific a device.
>
> this device (should we come up with a name for it? maybe
This device is referred as designated device. I can call it differently 
if needed.
> the administrative device?) together with the group type id is what defines the device
> group. For example, for the SRIOV type the commands must be
> sent to the PF and refer to the group including the PF
> through which the commands are sent and its VFs.
The device that the command is sent through is the owner of the 
interface. There is no way to be confused here.
>
> For each group that supports specific commands, there is
> one or more devices through which the commands can be given?
>
The owner of the management interface is the mgmt device.


>
>> +The \field{command_specific_error} should be inspected by the driver only if \field{status} is set to
>> +VIRTIO_ADMIN_STATUS_CS_ERR by the device. In this case, the content of \field{command_specific_error}
>> +holds the command specific error. If \field{status} is not set to VIRTIO_ADMIN_STATUS_CS_ERR, the
>> +\field{command_specific_error} value is undefined and should be ignored by the driver.
>> +
>> +The following table describes the Admin command set:
>> +
>> +\begin{tabular}{|l|l|}
>> +\hline
>> +Opcode & Command \\
>> +\hline \hline
>> +0000h - 7FFFh   & Generic admin cmds    \\
>> +\hline
>> +8000h - FFFFh   & Reserved    \\
>> +\hline
>> +\end{tabular}
>> diff --git a/content.tex b/content.tex
>> index 7508dd1..caab298 100644
>> --- a/content.tex
>> +++ b/content.tex
>> @@ -449,6 +449,8 @@ \section{Exporting Objects}\label{sec:Basic Facilities of a Virtio Device / Expo
>>   types. It is RECOMMENDED that devices generate version 4
>>   UUIDs as specified by \hyperref[intro:rfc4122]{[RFC4122]}.
>>   
>> +\input{admin.tex}
>> +
>>   \chapter{General Initialization And Device Operation}\label{sec:General Initialization And Device Operation}
>>   
>>   We start with an overview of device initialization, then expand on the
>> -- 
>> 2.21.0

This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v6 4/5] Add admin_queue_index register to PCI common configuration structure
  2022-07-31 21:03   ` Michael S. Tsirkin
@ 2022-08-01  0:11     ` Max Gurtovoy
  2022-08-01  6:13       ` Michael S. Tsirkin
  0 siblings, 1 reply; 32+ messages in thread
From: Max Gurtovoy @ 2022-08-01  0:11 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: jasowang, virtio-comment, cohuck, virtio-dev, oren, parav,
	shahafs, aadam, virtio, eperezma


On 8/1/2022 12:03 AM, Michael S. Tsirkin wrote:
> On Sun, Jul 31, 2022 at 06:43:53PM +0300, Max Gurtovoy wrote:
>> This new register will be used for querying the index of the admin
>> virtqueue of a virtio device. To configure, reset or enable the admin
>> virtqueue, the driver should follow existing queue configuration/setup
>> sequence.
>>
>> Reviewed-by: Parav Pandit <parav@nvidia.com>
>> Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
>
> Can you please at least add text to MMIO and CCW that drivers and
> devices must not negotiate the new feature bit? Will help avoid confusion.

why this is needed ? who will create a device and expose bits that it 
doesn't know how to implement.

And if I'll add it, we might forget to remove it later on.

>
>> ---
>>   content.tex | 10 ++++++++++
>>   1 file changed, 10 insertions(+)
>>
>> diff --git a/content.tex b/content.tex
>> index c15423e..5fda1a0 100644
>> --- a/content.tex
>> +++ b/content.tex
>> @@ -904,6 +904,9 @@ \subsubsection{Common configuration structure layout}\label{sec:Virtio Transport
>>           le64 queue_device;              /* read-write */
>>           le16 queue_notify_data;         /* read-only for driver */
>>           le16 queue_reset;               /* read-write */
>> +
>> +        /* About admin virtqueue. */
>> +        le16 admin_queue_index;         /* read-only for driver */
>>   };
>>   \end{lstlisting}
>>   
>> @@ -989,6 +992,10 @@ \subsubsection{Common configuration structure layout}\label{sec:Virtio Transport
>>           This field exists only if VIRTIO_F_RING_RESET has been
>>           negotiated. (see \ref{sec:Basic Facilities of a Virtio Device / Virtqueues / Virtqueue Reset}).
>>   
>> +\item[\field{admin_queue_index}]
>> +        The device uses this to report the index of the admin virtqueue.
>> +        This field always exists. Its value is valid only if VIRTIO_F_ADMIN_VQ has been negotiated.
>> +
>>   \end{description}
>>   
>>   \devicenormative{\paragraph}{Common configuration structure layout}{Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Common configuration structure layout}
> we have a mess with this exists versus valid. I think exists is the same

the bigest mess that we have is for things like ring_reset that are 
optionally exist according to bit negotiation.

about valid - the driver shouldn't even look on registers that it didn't 
negotiated it's feature bits. There is no reason to do so.

So yes, there is a mess but not in our proposal.

> is valid personally. Do others object if we say same as for reset here?
> Not a big deal either way, we need to clean this up later.
>
>> @@ -1075,6 +1082,9 @@ \subsubsection{Common configuration structure layout}\label{sec:Virtio Transport
>>   were used before the queue reset.
>>   (see \ref{sec:Basic Facilities of a Virtio Device / Virtqueues / Virtqueue Reset}).
>>   
>> +For configuring the admin virtqueue, the driver MUST use the value of \field{admin_queue_index}.
>> +For more details on virtqueue configuration see section \ref{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI-specific Initialization And Device Operation / Device Initialization / Virtqueue Configuration}.
>> +
>>   \subsubsection{Notification structure layout}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Notification capability}
>>   
>>   The notification location is found using the VIRTIO_PCI_CAP_NOTIFY_CFG
>> -- 
>> 2.21.0


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v6 3/5] Introduce virtio admin virtqueue
  2022-07-31 23:07     ` Max Gurtovoy
@ 2022-08-01  6:03       ` Michael S. Tsirkin
  0 siblings, 0 replies; 32+ messages in thread
From: Michael S. Tsirkin @ 2022-08-01  6:03 UTC (permalink / raw)
  To: Max Gurtovoy
  Cc: jasowang, virtio-comment, cohuck, virtio-dev, oren, parav,
	shahafs, aadam, virtio, eperezma

On Mon, Aug 01, 2022 at 02:07:26AM +0300, Max Gurtovoy wrote:
> 
> On 8/1/2022 12:00 AM, Michael S. Tsirkin wrote:
> > On Sun, Jul 31, 2022 at 06:43:52PM +0300, Max Gurtovoy wrote:
> > > In one of the many use cases a user wants to manipulate features and
> > > configuration of the virtio devices regardless of the device type
> > > (net/block/console). For that the admin command set introduced. The
> > > admin virtqueue will be the first management interface to issue admin
> > > commands.
> > > 
> > > Currently virtio specification defines control virtqueue to manipulate
> > > features and configuration of the device it operates on. However,
> > > control virtqueue commands are device type specific, which makes it very
> > > difficult to extend for device agnostic commands.
> > > 
> > > To support this requirement in elegant way, this patch introduces a new
> > > admin virtqueue interface.
> > > 
> > > Manipulate features via admin virtqueue is asynchronous, scalable, easy
> > > to extend and doesn't require additional and expensive on-die resources
> > > to be allocated for every new feature that will be added in the future.
> > > 
> > > Reviewed-by: Parav Pandit <parav@nvidia.com>
> > > Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
> > > ---
> > >   admin.tex   | 12 ++++++++++++
> > >   content.tex |  6 ++++--
> > >   2 files changed, 16 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/admin.tex b/admin.tex
> > > index d14f8f7..26ac60e 100644
> > > --- a/admin.tex
> > > +++ b/admin.tex
> > > @@ -80,3 +80,15 @@ \section{Administration command set}\label{sec:Basic Facilities of a Virtio Devi
> > >   8000h - FFFFh   & Reserved    \\
> > >   \hline
> > >   \end{tabular}
> > > +
> > > +\section{Admin Virtqueues}\label{sec:Basic Facilities of a Virtio Device / Admin Virtqueues}
> > > +
> > > +An admin virtqueue is a management interface of a device that can be used to send administrative
> > > +commands (see \ref{sec:Basic Facilities of a Virtio Device / Administration command set}) to manipulate
> > > +various features of the device and/or to manipulate various features, if possible, of another device.
> > > +
> > > +An admin virtqueue exists for a certain device if VIRTIO_F_ADMIN_VQ feature is
> > > +negotiated. The index of the admin virtqueue is exposed by the device in a
> > > +transport specific manner.
> > > +
> > > +If VIRTIO_F_ADMIN_VQ has been negotiated, the driver will use the admin virtqueue to send all admin commands.
> > > diff --git a/content.tex b/content.tex
> > > index caab298..c15423e 100644
> > > --- a/content.tex
> > > +++ b/content.tex
> > > @@ -99,10 +99,10 @@ \section{Feature Bits}\label{sec:Basic Facilities of a Virtio Device / Feature B
> > >   \begin{description}
> > >   \item[0 to 23, and 50 to 127] Feature bits for the specific device type
> > > -\item[24 to 40] Feature bits reserved for extensions to the queue and
> > > +\item[24 to 41] Feature bits reserved for extensions to the queue and
> > >     feature negotiation mechanisms
> > > -\item[41 to 49, and 128 and above] Feature bits reserved for future extensions.
> > > +\item[42 to 49, and 128 and above] Feature bits reserved for future extensions.
> > >   \end{description}
> > >   \begin{note}
> > > @@ -6841,6 +6841,8 @@ \chapter{Reserved Feature Bits}\label{sec:Reserved Feature Bits}
> > >     that the driver can reset a queue individually.
> > >     See \ref{sec:Basic Facilities of a Virtio Device / Virtqueues / Virtqueue Reset}.
> > > +  \item[VIRTIO_F_ADMIN_VQ (41)] This feature indicates that an administration virtqueue is supported.
> > mention here that index is in transport specific manner too?
> 
> is this ok ?
> 
> +  \item[VIRTIO_F_ADMIN_VQ (41)] This feature indicates that an
> administration virtqueue is supported.
> +  The index of the administration virtqueue exposed in a transport specific
> manner.

s/exposed/is specified/

> 
> > 
> > > +
> > >   \end{description}
> > >   \drivernormative{\section}{Reserved Feature Bits}{Reserved Feature Bits}
> > > -- 
> > > 2.21.0


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v6 4/5] Add admin_queue_index register to PCI common configuration structure
  2022-08-01  0:11     ` Max Gurtovoy
@ 2022-08-01  6:13       ` Michael S. Tsirkin
  2022-08-04  0:01         ` [virtio-comment] " Max Gurtovoy
  0 siblings, 1 reply; 32+ messages in thread
From: Michael S. Tsirkin @ 2022-08-01  6:13 UTC (permalink / raw)
  To: Max Gurtovoy
  Cc: jasowang, virtio-comment, cohuck, virtio-dev, oren, parav,
	shahafs, aadam, virtio, eperezma

On Mon, Aug 01, 2022 at 03:11:58AM +0300, Max Gurtovoy wrote:
> 
> On 8/1/2022 12:03 AM, Michael S. Tsirkin wrote:
> > On Sun, Jul 31, 2022 at 06:43:53PM +0300, Max Gurtovoy wrote:
> > > This new register will be used for querying the index of the admin
> > > virtqueue of a virtio device. To configure, reset or enable the admin
> > > virtqueue, the driver should follow existing queue configuration/setup
> > > sequence.
> > > 
> > > Reviewed-by: Parav Pandit <parav@nvidia.com>
> > > Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
> > 
> > Can you please at least add text to MMIO and CCW that drivers and
> > devices must not negotiate the new feature bit? Will help avoid confusion.
> 
> why this is needed ? who will create a device and expose bits that it
> doesn't know how to implement.

Any multi-transport implementation actually.
For example, on a Linux guest the default is to add a feature bit to all
transports. Extra care is needed to actually only add it
to the PCI transport. With qemu, same applies to features in
include/hw/virtio/virtio.h.

> And if I'll add it, we might forget to remove it later on.

When we add the implementation I'm reasonably sure we won't forget to
remove text saying it's not implemented.

> > 
> > > ---
> > >   content.tex | 10 ++++++++++
> > >   1 file changed, 10 insertions(+)
> > > 
> > > diff --git a/content.tex b/content.tex
> > > index c15423e..5fda1a0 100644
> > > --- a/content.tex
> > > +++ b/content.tex
> > > @@ -904,6 +904,9 @@ \subsubsection{Common configuration structure layout}\label{sec:Virtio Transport
> > >           le64 queue_device;              /* read-write */
> > >           le16 queue_notify_data;         /* read-only for driver */
> > >           le16 queue_reset;               /* read-write */
> > > +
> > > +        /* About admin virtqueue. */
> > > +        le16 admin_queue_index;         /* read-only for driver */
> > >   };
> > >   \end{lstlisting}
> > > @@ -989,6 +992,10 @@ \subsubsection{Common configuration structure layout}\label{sec:Virtio Transport
> > >           This field exists only if VIRTIO_F_RING_RESET has been
> > >           negotiated. (see \ref{sec:Basic Facilities of a Virtio Device / Virtqueues / Virtqueue Reset}).
> > > +\item[\field{admin_queue_index}]
> > > +        The device uses this to report the index of the admin virtqueue.
> > > +        This field always exists. Its value is valid only if VIRTIO_F_ADMIN_VQ has been negotiated.
> > > +
> > >   \end{description}
> > >   \devicenormative{\paragraph}{Common configuration structure layout}{Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Common configuration structure layout}
> > we have a mess with this exists versus valid. I think exists is the same
> 
> the bigest mess that we have is for things like ring_reset that are
> optionally exist according to bit negotiation.

I think this optionally exist does not mean much, just that
it should not be accessed

> about valid - the driver shouldn't even look on registers that it didn't
> negotiated it's feature bits. There is no reason to do so.

> 
> So yes, there is a mess but not in our proposal.

Right. I propose removing "This field always exists." just
say that it's valid with VIRTIO_F_ADMIN_VQ.

> > is valid personally. Do others object if we say same as for reset here?
> > Not a big deal either way, we need to clean this up later.
> > 
> > > @@ -1075,6 +1082,9 @@ \subsubsection{Common configuration structure layout}\label{sec:Virtio Transport
> > >   were used before the queue reset.
> > >   (see \ref{sec:Basic Facilities of a Virtio Device / Virtqueues / Virtqueue Reset}).
> > > +For configuring the admin virtqueue, the driver MUST use the value of \field{admin_queue_index}.
> > > +For more details on virtqueue configuration see section \ref{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI-specific Initialization And Device Operation / Device Initialization / Virtqueue Configuration}.
> > > +
> > >   \subsubsection{Notification structure layout}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Notification capability}
> > >   The notification location is found using the VIRTIO_PCI_CAP_NOTIFY_CFG
> > > -- 
> > > 2.21.0


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v6 1/5] Introduce device group
  2022-07-31 15:43 ` [PATCH v6 1/5] Introduce device group Max Gurtovoy
                     ` (2 preceding siblings ...)
  2022-07-31 21:19   ` Michael S. Tsirkin
@ 2022-08-02 13:41   ` Michael S. Tsirkin
  2022-08-03  4:44     ` Jason Wang
  2022-08-03 23:45     ` [virtio-comment] " Max Gurtovoy
  3 siblings, 2 replies; 32+ messages in thread
From: Michael S. Tsirkin @ 2022-08-02 13:41 UTC (permalink / raw)
  To: Max Gurtovoy
  Cc: jasowang, virtio-comment, cohuck, virtio-dev, oren, parav,
	shahafs, aadam, virtio, eperezma

I feel some of my latest review opened some questions that I don't have
good answers for and might have felt a bit rambling.
So to focus the discussion:

On Sun, Jul 31, 2022 at 06:43:50PM +0300, Max Gurtovoy wrote:
> +A device can be a member of one or more device groups.

Presumably this is so we can e.g. create subfunctions inside a VF.
A VF now is a member of a SRIOV and SIOV type groups and we
can use type to distinguish between these.

We should probably be explicit that each of these groups has to
have a distinct group type then.

And this raises the question: different types have different
capabilities. So let's say admin queue is used to both
control features for SRIOV VFs and to create SIOV SFs.
I guess we'll have a feature bit to say "command to create
SIOV SFs is supported" but how do we say that this command
is only supported for VFs not SFs?

Do we just make features list a superset of what is supported and simply
say in the spec which commands are legal with which group types?


Jason Cornelia what do you think?



> +\item Self type (group identifier = 0) - this group has only one device in the group. Each virtio device is a member of at least one device group, the Self type group.

Presumably, this is here so we can send commands that refer to the
device itself as opposed to a group member (e.g. to
PF as opposed to VF). Is that right?

It's handy but again the problem here is, this refers to
device as part of which group? Let's just drop this type?


-- 
MST


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v6 1/5] Introduce device group
  2022-08-02 13:41   ` Michael S. Tsirkin
@ 2022-08-03  4:44     ` Jason Wang
  2022-08-03  6:10       ` Michael S. Tsirkin
  2022-08-03  6:43       ` Michael S. Tsirkin
  2022-08-03 23:45     ` [virtio-comment] " Max Gurtovoy
  1 sibling, 2 replies; 32+ messages in thread
From: Jason Wang @ 2022-08-03  4:44 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Max Gurtovoy, virtio-comment, Cornelia Huck, Virtio-Dev,
	Oren Duer, Parav Pandit, Shahaf Shuler, Ariel Adam, virtio,
	eperezma

On Tue, Aug 2, 2022 at 9:42 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> I feel some of my latest review opened some questions that I don't have
> good answers for and might have felt a bit rambling.
> So to focus the discussion:
>
> On Sun, Jul 31, 2022 at 06:43:50PM +0300, Max Gurtovoy wrote:
> > +A device can be a member of one or more device groups.
>
> Presumably this is so we can e.g. create subfunctions inside a VF.

Then VF should have its own transport virtqueue. And subfunctions need
to be created there. If we don't all thing in PF, we may end up with
nesting issue when assign VF to the guest.

> A VF now is a member of a SRIOV and SIOV type groups and we
> can use type to distinguish between these.
>
> We should probably be explicit that each of these groups has to
> have a distinct group type then.
>
> And this raises the question: different types have different
> capabilities. So let's say admin queue is used to both
> control features for SRIOV VFs and to create SIOV SFs.

I don't get how the admin queue can be used to control VF features
considering VF has its capabilities. (SR-IOV lacks the ability to
provision a single VF).

> I guess we'll have a feature bit to say "command to create
> SIOV SFs is supported" but how do we say that this command
> is only supported for VFs not SFs?

I think we should first answer if having VF and SF to be dealt with a
single type of virtqueue is a good idea. They have something in common
but they distinguish each other:

- SF requires per virtual device lifecycle management
- SF requires a transport other than PCI
- SF requires more mediation in the software layer for presenting a
virtual device

Using a single type of virtqueue may end up with complex design.
Having a dedicated queue for SF might be a better choice.

>
> Do we just make features list a superset of what is supported and simply
> say in the spec which commands are legal with which group types?
>
>
> Jason Cornelia what do you think?

It looks to me it would be much more simpler if we use separated
virtqueues for SRIOV and SIOV.

Thanks

>
>
>
> > +\item Self type (group identifier = 0) - this group has only one device in the group. Each virtio device is a member of at least one device group, the Self type group.
>
> Presumably, this is here so we can send commands that refer to the
> device itself as opposed to a group member (e.g. to
> PF as opposed to VF). Is that right?
>
> It's handy but again the problem here is, this refers to
> device as part of which group? Let's just drop this type?
>
>
> --
> MST
>


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v6 1/5] Introduce device group
  2022-08-03  4:44     ` Jason Wang
@ 2022-08-03  6:10       ` Michael S. Tsirkin
  2022-08-03  8:04         ` Jason Wang
  2022-08-03  6:43       ` Michael S. Tsirkin
  1 sibling, 1 reply; 32+ messages in thread
From: Michael S. Tsirkin @ 2022-08-03  6:10 UTC (permalink / raw)
  To: Jason Wang
  Cc: Max Gurtovoy, virtio-comment, Cornelia Huck, Virtio-Dev,
	Oren Duer, Parav Pandit, Shahaf Shuler, Ariel Adam, virtio,
	eperezma

On Wed, Aug 03, 2022 at 12:44:38PM +0800, Jason Wang wrote:
> On Tue, Aug 2, 2022 at 9:42 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > I feel some of my latest review opened some questions that I don't have
> > good answers for and might have felt a bit rambling.
> > So to focus the discussion:
> >
> > On Sun, Jul 31, 2022 at 06:43:50PM +0300, Max Gurtovoy wrote:
> > > +A device can be a member of one or more device groups.
> >
> > Presumably this is so we can e.g. create subfunctions inside a VF.
> 
> Then VF should have its own transport virtqueue. And subfunctions need
> to be created there. If we don't all thing in PF, we may end up with
> nesting issue when assign VF to the guest.
> > A VF now is a member of a SRIOV and SIOV type groups and we
> > can use type to distinguish between these.
> >
> > We should probably be explicit that each of these groups has to
> > have a distinct group type then.
> >
> > And this raises the question: different types have different
> > capabilities. So let's say admin queue is used to both
> > control features for SRIOV VFs and to create SIOV SFs.
> 
> I don't get how the admin queue can be used to control VF features
> considering VF has its capabilities. (SR-IOV lacks the ability to
> provision a single VF).

Well look at latest proposal, last patch controls VF features from PF.

> > I guess we'll have a feature bit to say "command to create
> > SIOV SFs is supported" but how do we say that this command
> > is only supported for VFs not SFs?
> 
> I think we should first answer if having VF and SF to be dealt with a
> single type of virtqueue is a good idea. They have something in common
> but they distinguish each other:
> 
> - SF requires per virtual device lifecycle management
> - SF requires a transport other than PCI
> - SF requires more mediation in the software layer for presenting a
> virtual device
> 
> Using a single type of virtqueue may end up with complex design.
> Having a dedicated queue for SF might be a better choice.

And dedicated feature bits for commands thereof?  For example, I imagine
we could have commands to control the MAC of the group member. That is
the same for SF and VF, yes? How do we avoid duplication for that?

> >
> > Do we just make features list a superset of what is supported and simply
> > say in the spec which commands are legal with which group types?
> >
> >
> > Jason Cornelia what do you think?
> 
> It looks to me it would be much more simpler if we use separated
> virtqueues for SRIOV and SIOV.
> 
> Thanks

Then is it still helpful that we have the generic group type concept?
I was hoping it will work so the same command can be used for VFs
and SFs.


> >
> >
> >
> > > +\item Self type (group identifier = 0) - this group has only one device in the group. Each virtio device is a member of at least one device group, the Self type group.
> >
> > Presumably, this is here so we can send commands that refer to the
> > device itself as opposed to a group member (e.g. to
> > PF as opposed to VF). Is that right?
> >
> > It's handy but again the problem here is, this refers to
> > device as part of which group? Let's just drop this type?
> >
> >
> > --
> > MST
> >


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v6 1/5] Introduce device group
  2022-08-03  4:44     ` Jason Wang
  2022-08-03  6:10       ` Michael S. Tsirkin
@ 2022-08-03  6:43       ` Michael S. Tsirkin
  1 sibling, 0 replies; 32+ messages in thread
From: Michael S. Tsirkin @ 2022-08-03  6:43 UTC (permalink / raw)
  To: Jason Wang
  Cc: Max Gurtovoy, virtio-comment, Cornelia Huck, Virtio-Dev,
	Oren Duer, Parav Pandit, Shahaf Shuler, Ariel Adam, virtio,
	eperezma

On Wed, Aug 03, 2022 at 12:44:38PM +0800, Jason Wang wrote:
> > A VF now is a member of a SRIOV and SIOV type groups and we
> > can use type to distinguish between these.
> >
> > We should probably be explicit that each of these groups has to
> > have a distinct group type then.
> >
> > And this raises the question: different types have different
> > capabilities. So let's say admin queue is used to both
> > control features for SRIOV VFs and to create SIOV SFs.
> 
> I don't get how the admin queue can be used to control VF features
> considering VF has its capabilities. (SR-IOV lacks the ability to
> provision a single VF).

I mean, VF exists, but it does not have to consume lots of resources.
We can if we want to require that drivers only attach to VFs which have
been enabled through a special command.

-- 
MST


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v6 1/5] Introduce device group
  2022-08-03  6:10       ` Michael S. Tsirkin
@ 2022-08-03  8:04         ` Jason Wang
  2022-08-03 12:33           ` Michael S. Tsirkin
  0 siblings, 1 reply; 32+ messages in thread
From: Jason Wang @ 2022-08-03  8:04 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Max Gurtovoy, virtio-comment, Cornelia Huck, Virtio-Dev,
	Oren Duer, Parav Pandit, Shahaf Shuler, Ariel Adam, virtio,
	eperezma


在 2022/8/3 14:10, Michael S. Tsirkin 写道:
> On Wed, Aug 03, 2022 at 12:44:38PM +0800, Jason Wang wrote:
>> On Tue, Aug 2, 2022 at 9:42 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>>> I feel some of my latest review opened some questions that I don't have
>>> good answers for and might have felt a bit rambling.
>>> So to focus the discussion:
>>>
>>> On Sun, Jul 31, 2022 at 06:43:50PM +0300, Max Gurtovoy wrote:
>>>> +A device can be a member of one or more device groups.
>>> Presumably this is so we can e.g. create subfunctions inside a VF.
>> Then VF should have its own transport virtqueue. And subfunctions need
>> to be created there. If we don't all thing in PF, we may end up with
>> nesting issue when assign VF to the guest.
>>> A VF now is a member of a SRIOV and SIOV type groups and we
>>> can use type to distinguish between these.
>>>
>>> We should probably be explicit that each of these groups has to
>>> have a distinct group type then.
>>>
>>> And this raises the question: different types have different
>>> capabilities. So let's say admin queue is used to both
>>> control features for SRIOV VFs and to create SIOV SFs.
>> I don't get how the admin queue can be used to control VF features
>> considering VF has its capabilities. (SR-IOV lacks the ability to
>> provision a single VF).
> Well look at latest proposal, last patch controls VF features from PF.


Yes, so it works like previous MSI-X allocation which needs some care to 
prevent managed device from being probed before assigning features.

This is technically possible, but I'm not sure it is a good design. For 
example, what happens if the management change the feature while the a 
driver is using the managed device.


>
>>> I guess we'll have a feature bit to say "command to create
>>> SIOV SFs is supported" but how do we say that this command
>>> is only supported for VFs not SFs?
>> I think we should first answer if having VF and SF to be dealt with a
>> single type of virtqueue is a good idea. They have something in common
>> but they distinguish each other:
>>
>> - SF requires per virtual device lifecycle management
>> - SF requires a transport other than PCI
>> - SF requires more mediation in the software layer for presenting a
>> virtual device
>>
>> Using a single type of virtqueue may end up with complex design.
>> Having a dedicated queue for SF might be a better choice.
> And dedicated feature bits for commands thereof?


Only needed if we're using a single type of the queue.


>    For example, I imagine
> we could have commands to control the MAC of the group member. That is
> the same for SF and VF, yes? How do we avoid duplication for that?


In the transport vq, all configs (include mtu and features) were 
specified during the device creating command. It is not allowed to 
change mac afterwards. (If we need, the SF needs to be destroyed and 
created again with different configs).


>
>>> Do we just make features list a superset of what is supported and simply
>>> say in the spec which commands are legal with which group types?
>>>
>>>
>>> Jason Cornelia what do you think?
>> It looks to me it would be much more simpler if we use separated
>> virtqueues for SRIOV and SIOV.
>>
>> Thanks
> Then is it still helpful that we have the generic group type concept?


Not sure, I wonder if the implicit group can do here. E.g _F_SRIOV with 
_F_ADMIN_VQ menast SR-IOV group.


> I was hoping it will work so the same command can be used for VFs
> and SFs.


Yes, but the transport vq ties the mac and other configuration with the 
device creating. Not sure we can easily do the same for SR-IOV.

Thanks


>
>
>>>
>>>
>>>> +\item Self type (group identifier = 0) - this group has only one device in the group. Each virtio device is a member of at least one device group, the Self type group.
>>> Presumably, this is here so we can send commands that refer to the
>>> device itself as opposed to a group member (e.g. to
>>> PF as opposed to VF). Is that right?
>>>
>>> It's handy but again the problem here is, this refers to
>>> device as part of which group? Let's just drop this type?
>>>
>>>
>>> --
>>> MST
>>>


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v6 1/5] Introduce device group
  2022-08-03  8:04         ` Jason Wang
@ 2022-08-03 12:33           ` Michael S. Tsirkin
  2022-08-04  2:08             ` Jason Wang
  0 siblings, 1 reply; 32+ messages in thread
From: Michael S. Tsirkin @ 2022-08-03 12:33 UTC (permalink / raw)
  To: Jason Wang
  Cc: Max Gurtovoy, virtio-comment, Cornelia Huck, Virtio-Dev,
	Oren Duer, Parav Pandit, Shahaf Shuler, Ariel Adam, virtio,
	eperezma

On Wed, Aug 03, 2022 at 04:04:50PM +0800, Jason Wang wrote:
> 
> 在 2022/8/3 14:10, Michael S. Tsirkin 写道:
> > On Wed, Aug 03, 2022 at 12:44:38PM +0800, Jason Wang wrote:
> > > On Tue, Aug 2, 2022 at 9:42 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > I feel some of my latest review opened some questions that I don't have
> > > > good answers for and might have felt a bit rambling.
> > > > So to focus the discussion:
> > > > 
> > > > On Sun, Jul 31, 2022 at 06:43:50PM +0300, Max Gurtovoy wrote:
> > > > > +A device can be a member of one or more device groups.
> > > > Presumably this is so we can e.g. create subfunctions inside a VF.
> > > Then VF should have its own transport virtqueue. And subfunctions need
> > > to be created there. If we don't all thing in PF, we may end up with
> > > nesting issue when assign VF to the guest.
> > > > A VF now is a member of a SRIOV and SIOV type groups and we
> > > > can use type to distinguish between these.
> > > > 
> > > > We should probably be explicit that each of these groups has to
> > > > have a distinct group type then.
> > > > 
> > > > And this raises the question: different types have different
> > > > capabilities. So let's say admin queue is used to both
> > > > control features for SRIOV VFs and to create SIOV SFs.
> > > I don't get how the admin queue can be used to control VF features
> > > considering VF has its capabilities. (SR-IOV lacks the ability to
> > > provision a single VF).
> > Well look at latest proposal, last patch controls VF features from PF.
> 
> 
> Yes, so it works like previous MSI-X allocation which needs some care to
> prevent managed device from being probed before assigning features.
> 
> This is technically possible, but I'm not sure it is a good design. For
> example, what happens if the management change the feature while the a
> driver is using the managed device.

I think this should be prohibited in the spec.

It might be a good idea to have explicit commands that allow driver to
attach.

For example the following might work for both VFs and SFs:


INIT

configure

ENABLE <- driver can attach now, configure is blocked


--- device can be used ---

Note: some configs might be editable while device is in use.
E.g. enabling/disabling softmac dynamically.

--- device can be used ---

DISABLE -> takes control from driver. we can have a flag telling
	   whether we want to be graceful about it and fail
	   if driver is still attached or not

configure - if we want to attach to another VM

CLEANUP - release resources and forget config



> 
> > 
> > > > I guess we'll have a feature bit to say "command to create
> > > > SIOV SFs is supported" but how do we say that this command
> > > > is only supported for VFs not SFs?
> > > I think we should first answer if having VF and SF to be dealt with a
> > > single type of virtqueue is a good idea. They have something in common
> > > but they distinguish each other:
> > > 
> > > - SF requires per virtual device lifecycle management
> > > - SF requires a transport other than PCI
> > > - SF requires more mediation in the software layer for presenting a
> > > virtual device
> > > 
> > > Using a single type of virtqueue may end up with complex design.
> > > Having a dedicated queue for SF might be a better choice.
> > And dedicated feature bits for commands thereof?
> 
> 
> Only needed if we're using a single type of the queue.


Imagine a command only allowed for SFs not VFs. Does
the PF supporting SFs and VFs have the corresponding
feature bit or not?

> 
> >    For example, I imagine
> > we could have commands to control the MAC of the group member. That is
> > the same for SF and VF, yes? How do we avoid duplication for that?
> 
> 
> In the transport vq, all configs (include mtu and features) were specified
> during the device creating command. It is not allowed to change mac
> afterwards. (If we need, the SF needs to be destroyed and created again with
> different configs).

It was just an example. Are you implying SFs and VFs have completely
different needs with no overlap then? It seems weird since
fundamentally they look the same at a lot of levels.



> 
> > 
> > > > Do we just make features list a superset of what is supported and simply
> > > > say in the spec which commands are legal with which group types?
> > > > 
> > > > 
> > > > Jason Cornelia what do you think?
> > > It looks to me it would be much more simpler if we use separated
> > > virtqueues for SRIOV and SIOV.
> > > 
> > > Thanks
> > Then is it still helpful that we have the generic group type concept?
> 
> 
> Not sure, I wonder if the implicit group can do here. E.g _F_SRIOV with
> _F_ADMIN_VQ menast SR-IOV group.


I don't see how. PF can have SFs right?


> 
> > I was hoping it will work so the same command can be used for VFs
> > and SFs.
> 
> 
> Yes, but the transport vq ties the mac and other configuration with the
> device creating. Not sure we can easily do the same for SR-IOV.
> 
> Thanks

We can if we either split SF out or artificially add creation to VFs.



But I expect more command will be exactly the same. Live migration?



> 
> > 
> > 
> > > > 
> > > > 
> > > > > +\item Self type (group identifier = 0) - this group has only one device in the group. Each virtio device is a member of at least one device group, the Self type group.
> > > > Presumably, this is here so we can send commands that refer to the
> > > > device itself as opposed to a group member (e.g. to
> > > > PF as opposed to VF). Is that right?
> > > > 
> > > > It's handy but again the problem here is, this refers to
> > > > device as part of which group? Let's just drop this type?
> > > > 
> > > > 
> > > > --
> > > > MST
> > > > 


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [virtio-comment] Re: [PATCH v6 1/5] Introduce device group
  2022-08-02 13:41   ` Michael S. Tsirkin
  2022-08-03  4:44     ` Jason Wang
@ 2022-08-03 23:45     ` Max Gurtovoy
  2022-08-04  6:20       ` Michael S. Tsirkin
  1 sibling, 1 reply; 32+ messages in thread
From: Max Gurtovoy @ 2022-08-03 23:45 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: jasowang, virtio-comment, cohuck, virtio-dev, oren, parav,
	shahafs, aadam, virtio, eperezma


On 8/2/2022 4:41 PM, Michael S. Tsirkin wrote:
> I feel some of my latest review opened some questions that I don't have
> good answers for and might have felt a bit rambling.
> So to focus the discussion:
>
> On Sun, Jul 31, 2022 at 06:43:50PM +0300, Max Gurtovoy wrote:
>> +A device can be a member of one or more device groups.
> Presumably this is so we can e.g. create subfunctions inside a VF.
> A VF now is a member of a SRIOV and SIOV type groups and we
> can use type to distinguish between these.
>
> We should probably be explicit that each of these groups has to
> have a distinct group type then.
>
> And this raises the question: different types have different
> capabilities. So let's say admin queue is used to both
> control features for SRIOV VFs and to create SIOV SFs.
> I guess we'll have a feature bit to say "command to create
> SIOV SFs is supported" but how do we say that this command
> is only supported for VFs not SFs?
>
> Do we just make features list a superset of what is supported and simply
> say in the spec which commands are legal with which group types?
>
>
> Jason Cornelia what do you think?
>
>
>
>> +\item Self type (group identifier = 0) - this group has only one device in the group. Each virtio device is a member of at least one device group, the Self type group.
> Presumably, this is here so we can send commands that refer to the
> device itself as opposed to a group member (e.g. to
> PF as opposed to VF). Is that right?
yes.
>
> It's handy but again the problem here is, this refers to
> device as part of which group? Let's just drop this type?

You suggested this type in our meeting. And this was one of the major 
changes from v5 --> v6.

How do you suggest sending admin commands to yourself ? the way I 
suggested in previous versions ?

>
>


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [virtio-comment] Re: [PATCH v6 4/5] Add admin_queue_index register to PCI common configuration structure
  2022-08-01  6:13       ` Michael S. Tsirkin
@ 2022-08-04  0:01         ` Max Gurtovoy
  2022-08-04  6:26           ` Michael S. Tsirkin
  0 siblings, 1 reply; 32+ messages in thread
From: Max Gurtovoy @ 2022-08-04  0:01 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: jasowang, virtio-comment, cohuck, virtio-dev, oren, parav,
	shahafs, aadam, virtio, eperezma


On 8/1/2022 9:13 AM, Michael S. Tsirkin wrote:
> On Mon, Aug 01, 2022 at 03:11:58AM +0300, Max Gurtovoy wrote:
>> On 8/1/2022 12:03 AM, Michael S. Tsirkin wrote:
>>> On Sun, Jul 31, 2022 at 06:43:53PM +0300, Max Gurtovoy wrote:
>>>> This new register will be used for querying the index of the admin
>>>> virtqueue of a virtio device. To configure, reset or enable the admin
>>>> virtqueue, the driver should follow existing queue configuration/setup
>>>> sequence.
>>>>
>>>> Reviewed-by: Parav Pandit <parav@nvidia.com>
>>>> Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
>>> Can you please at least add text to MMIO and CCW that drivers and
>>> devices must not negotiate the new feature bit? Will help avoid confusion.
>> why this is needed ? who will create a device and expose bits that it
>> doesn't know how to implement.
> Any multi-transport implementation actually.
> For example, on a Linux guest the default is to add a feature bit to all
> transports. Extra care is needed to actually only add it
> to the PCI transport. With qemu, same applies to features in
> include/hw/virtio/virtio.h.

I don't know how a reasonable driver will want to negotiate feature bits 
that a device didn't expose.

Please suggest whatever you want me to add and where.

I don't mind adding this but I don't want to waste cycle of review on 
that so I need exact guideline and not hints please.

>
>> And if I'll add it, we might forget to remove it later on.
> When we add the implementation I'm reasonably sure we won't forget to
> remove text saying it's not implemented.

Ok.


>
>>>> ---
>>>>    content.tex | 10 ++++++++++
>>>>    1 file changed, 10 insertions(+)
>>>>
>>>> diff --git a/content.tex b/content.tex
>>>> index c15423e..5fda1a0 100644
>>>> --- a/content.tex
>>>> +++ b/content.tex
>>>> @@ -904,6 +904,9 @@ \subsubsection{Common configuration structure layout}\label{sec:Virtio Transport
>>>>            le64 queue_device;              /* read-write */
>>>>            le16 queue_notify_data;         /* read-only for driver */
>>>>            le16 queue_reset;               /* read-write */
>>>> +
>>>> +        /* About admin virtqueue. */
>>>> +        le16 admin_queue_index;         /* read-only for driver */
>>>>    };
>>>>    \end{lstlisting}
>>>> @@ -989,6 +992,10 @@ \subsubsection{Common configuration structure layout}\label{sec:Virtio Transport
>>>>            This field exists only if VIRTIO_F_RING_RESET has been
>>>>            negotiated. (see \ref{sec:Basic Facilities of a Virtio Device / Virtqueues / Virtqueue Reset}).
>>>> +\item[\field{admin_queue_index}]
>>>> +        The device uses this to report the index of the admin virtqueue.
>>>> +        This field always exists. Its value is valid only if VIRTIO_F_ADMIN_VQ has been negotiated.
>>>> +
>>>>    \end{description}
>>>>    \devicenormative{\paragraph}{Common configuration structure layout}{Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Common configuration structure layout}
>>> we have a mess with this exists versus valid. I think exists is the same
>> the bigest mess that we have is for things like ring_reset that are
>> optionally exist according to bit negotiation.
> I think this optionally exist does not mean much, just that
> it should not be accessed

If I understand the word "exist" translation correctly, it doesn't mean 
what you said about "shouldn't be accessed".

>> about valid - the driver shouldn't even look on registers that it didn't
>> negotiated it's feature bits. There is no reason to do so.
>> So yes, there is a mess but not in our proposal.
> Right. I propose removing "This field always exists." just
> say that it's valid with VIRTIO_F_ADMIN_VQ.

Why removing ? does it exist always or not ?

>
>>> is valid personally. Do others object if we say same as for reset here?
>>> Not a big deal either way, we need to clean this up later.
>>>
>>>> @@ -1075,6 +1082,9 @@ \subsubsection{Common configuration structure layout}\label{sec:Virtio Transport
>>>>    were used before the queue reset.
>>>>    (see \ref{sec:Basic Facilities of a Virtio Device / Virtqueues / Virtqueue Reset}).
>>>> +For configuring the admin virtqueue, the driver MUST use the value of \field{admin_queue_index}.
>>>> +For more details on virtqueue configuration see section \ref{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI-specific Initialization And Device Operation / Device Initialization / Virtqueue Configuration}.
>>>> +
>>>>    \subsubsection{Notification structure layout}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Notification capability}
>>>>    The notification location is found using the VIRTIO_PCI_CAP_NOTIFY_CFG
>>>> -- 
>>>> 2.21.0
>
> This publicly archived list offers a means to provide input to the
> OASIS Virtual I/O Device (VIRTIO) TC.
>
> In order to verify user consent to the Feedback License terms and
> to minimize spam in the list archive, subscription is required
> before posting.
>
> Subscribe: virtio-comment-subscribe@lists.oasis-open.org
> Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
> List help: virtio-comment-help@lists.oasis-open.org
> List archive: https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.oasis-open.org%2Farchives%2Fvirtio-comment%2F&amp;data=05%7C01%7Cmgurtovoy%40nvidia.com%7C46caf90d77cb4e2e7d1b08da7384fbe5%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637949312238253496%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&amp;sdata=72o5SFbI1Ai5twIQXF0Onn2TTorpuIlWoqp46tpUgrQ%3D&amp;reserved=0
> Feedback License: https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.oasis-open.org%2Fwho%2Fipr%2Ffeedback_license.pdf&amp;data=05%7C01%7Cmgurtovoy%40nvidia.com%7C46caf90d77cb4e2e7d1b08da7384fbe5%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637949312238253496%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&amp;sdata=7%2FdpXHFFpA7j97vIqLFXfaVQMj1FmDX2Ldwwzytpuuw%3D&amp;reserved=0
> List Guidelines: https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.oasis-open.org%2Fpolicies-guidelines%2Fmailing-lists&amp;data=05%7C01%7Cmgurtovoy%40nvidia.com%7C46caf90d77cb4e2e7d1b08da7384fbe5%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637949312238253496%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&amp;sdata=lN2X%2BC%2BoHvd90YL%2BSf3FSleiNJYZnfTqRWSdDpK2wAc%3D&amp;reserved=0
> Committee: https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.oasis-open.org%2Fcommittees%2Fvirtio%2F&amp;data=05%7C01%7Cmgurtovoy%40nvidia.com%7C46caf90d77cb4e2e7d1b08da7384fbe5%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637949312238253496%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&amp;sdata=5FFookod06loAfQ8uGp3HmeuEUdP2lurXWtPNOnCtNY%3D&amp;reserved=0
> Join OASIS: https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.oasis-open.org%2Fjoin%2F&amp;data=05%7C01%7Cmgurtovoy%40nvidia.com%7C46caf90d77cb4e2e7d1b08da7384fbe5%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637949312238253496%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&amp;sdata=1OJe%2BI1%2FsWYMIaM0PIZjMKmcC%2FpGAoOkHr%2Bi4E4AJOE%3D&amp;reserved=0
>


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v6 1/5] Introduce device group
  2022-08-03 12:33           ` Michael S. Tsirkin
@ 2022-08-04  2:08             ` Jason Wang
  2022-08-04  6:17               ` Michael S. Tsirkin
  0 siblings, 1 reply; 32+ messages in thread
From: Jason Wang @ 2022-08-04  2:08 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Max Gurtovoy, virtio-comment, Cornelia Huck, Virtio-Dev,
	Oren Duer, Parav Pandit, Shahaf Shuler, Ariel Adam, virtio,
	eperezma

On Wed, Aug 3, 2022 at 8:33 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Wed, Aug 03, 2022 at 04:04:50PM +0800, Jason Wang wrote:
> >
> > 在 2022/8/3 14:10, Michael S. Tsirkin 写道:
> > > On Wed, Aug 03, 2022 at 12:44:38PM +0800, Jason Wang wrote:
> > > > On Tue, Aug 2, 2022 at 9:42 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > I feel some of my latest review opened some questions that I don't have
> > > > > good answers for and might have felt a bit rambling.
> > > > > So to focus the discussion:
> > > > >
> > > > > On Sun, Jul 31, 2022 at 06:43:50PM +0300, Max Gurtovoy wrote:
> > > > > > +A device can be a member of one or more device groups.
> > > > > Presumably this is so we can e.g. create subfunctions inside a VF.
> > > > Then VF should have its own transport virtqueue. And subfunctions need
> > > > to be created there. If we don't all thing in PF, we may end up with
> > > > nesting issue when assign VF to the guest.
> > > > > A VF now is a member of a SRIOV and SIOV type groups and we
> > > > > can use type to distinguish between these.
> > > > >
> > > > > We should probably be explicit that each of these groups has to
> > > > > have a distinct group type then.
> > > > >
> > > > > And this raises the question: different types have different
> > > > > capabilities. So let's say admin queue is used to both
> > > > > control features for SRIOV VFs and to create SIOV SFs.
> > > > I don't get how the admin queue can be used to control VF features
> > > > considering VF has its capabilities. (SR-IOV lacks the ability to
> > > > provision a single VF).
> > > Well look at latest proposal, last patch controls VF features from PF.
> >
> >
> > Yes, so it works like previous MSI-X allocation which needs some care to
> > prevent managed device from being probed before assigning features.
> >
> > This is technically possible, but I'm not sure it is a good design. For
> > example, what happens if the management change the feature while the a
> > driver is using the managed device.
>
> I think this should be prohibited in the spec.

Yes, but implementation wise, this needs to be considered.

>
> It might be a good idea to have explicit commands that allow driver to
> attach.
>
> For example the following might work for both VFs and SFs:
>
>
> INIT
>
> configure
>
> ENABLE <- driver can attach now, configure is blocked
>
>
> --- device can be used ---
>
> Note: some configs might be editable while device is in use.
> E.g. enabling/disabling softmac dynamically.
>
> --- device can be used ---
>
> DISABLE -> takes control from driver. we can have a flag telling
>            whether we want to be graceful about it and fail
>            if driver is still attached or not
>
> configure - if we want to attach to another VM
>
> CLEANUP - release resources and forget config

Yes, but for SF it's not a must.

And should we add these states in the current state machine? If yes,
it might complicate the migration compatibility.

>
>
>
> >
> > >
> > > > > I guess we'll have a feature bit to say "command to create
> > > > > SIOV SFs is supported" but how do we say that this command
> > > > > is only supported for VFs not SFs?
> > > > I think we should first answer if having VF and SF to be dealt with a
> > > > single type of virtqueue is a good idea. They have something in common
> > > > but they distinguish each other:
> > > >
> > > > - SF requires per virtual device lifecycle management
> > > > - SF requires a transport other than PCI
> > > > - SF requires more mediation in the software layer for presenting a
> > > > virtual device
> > > >
> > > > Using a single type of virtqueue may end up with complex design.
> > > > Having a dedicated queue for SF might be a better choice.
> > > And dedicated feature bits for commands thereof?
> >
> >
> > Only needed if we're using a single type of the queue.
>
>
> Imagine a command only allowed for SFs not VFs. Does
> the PF supporting SFs and VFs have the corresponding
> feature bit or not?

I wonder if we can do:

1) having two type of virtqueues
2) VFs goes to VF admin queue
3) SFs goes to SF transport queue

So if PF supports both SFs and VFs, it should have at least two feature bits.

>
> >
> > >    For example, I imagine
> > > we could have commands to control the MAC of the group member. That is
> > > the same for SF and VF, yes? How do we avoid duplication for that?
> >
> >
> > In the transport vq, all configs (include mtu and features) were specified
> > during the device creating command. It is not allowed to change mac
> > afterwards. (If we need, the SF needs to be destroyed and created again with
> > different configs).
>
> It was just an example. Are you implying SFs and VFs have completely
> different needs with no overlap then?

There indeed overlaps, e.g the provisions of the configs. Other than
these, there should be no other. The idea of the transport virtqueue
is mainly for having a new transport. This is different from what I
understand for the admin virtqueue.

> It seems weird since
> fundamentally they look the same at a lot of levels.

Yes but only from the view of the functionality.

>
>
>
> >
> > >
> > > > > Do we just make features list a superset of what is supported and simply
> > > > > say in the spec which commands are legal with which group types?
> > > > >
> > > > >
> > > > > Jason Cornelia what do you think?
> > > > It looks to me it would be much more simpler if we use separated
> > > > virtqueues for SRIOV and SIOV.
> > > >
> > > > Thanks
> > > Then is it still helpful that we have the generic group type concept?
> >
> >
> > Not sure, I wonder if the implicit group can do here. E.g _F_SRIOV with
> > _F_ADMIN_VQ menast SR-IOV group.
>
>
> I don't see how. PF can have SFs right?

Yes, but technically, we have capabilities then we know which
virtqueue is doing PF and which is doing SF.

>
>
> >
> > > I was hoping it will work so the same command can be used for VFs
> > > and SFs.
> >
> >
> > Yes, but the transport vq ties the mac and other configuration with the
> > device creating. Not sure we can easily do the same for SR-IOV.
> >
> > Thanks
>
> We can if we either split SF out or artificially add creation to VFs.

I agree. But the artificial creation for VF requires more work.

>
>
>
> But I expect more command will be exactly the same. Live migration?

My understanding is that the live migration is a basic facility like
device status. It means it needs to be transport independent.

That means the function could be accessed via PCI capability/MMIO and
other transport so it does not look like an issue specific to admin
virtqueue or transport virtqueue. We can define a common data
structure then it can be mapped to the same or different commands in
each type of transport(or virtqueue)?

Thanks

Thanks


>
>
>
> >
> > >
> > >
> > > > >
> > > > >
> > > > > > +\item Self type (group identifier = 0) - this group has only one device in the group. Each virtio device is a member of at least one device group, the Self type group.
> > > > > Presumably, this is here so we can send commands that refer to the
> > > > > device itself as opposed to a group member (e.g. to
> > > > > PF as opposed to VF). Is that right?
> > > > >
> > > > > It's handy but again the problem here is, this refers to
> > > > > device as part of which group? Let's just drop this type?
> > > > >
> > > > >
> > > > > --
> > > > > MST
> > > > >
>


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v6 1/5] Introduce device group
  2022-08-04  2:08             ` Jason Wang
@ 2022-08-04  6:17               ` Michael S. Tsirkin
  2022-08-04  7:17                 ` Jason Wang
  0 siblings, 1 reply; 32+ messages in thread
From: Michael S. Tsirkin @ 2022-08-04  6:17 UTC (permalink / raw)
  To: Jason Wang
  Cc: Max Gurtovoy, virtio-comment, Cornelia Huck, Virtio-Dev,
	Oren Duer, Parav Pandit, Shahaf Shuler, Ariel Adam, virtio,
	eperezma

On Thu, Aug 04, 2022 at 10:08:30AM +0800, Jason Wang wrote:
> On Wed, Aug 3, 2022 at 8:33 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Wed, Aug 03, 2022 at 04:04:50PM +0800, Jason Wang wrote:
> > >
> > > 在 2022/8/3 14:10, Michael S. Tsirkin 写道:
> > > > On Wed, Aug 03, 2022 at 12:44:38PM +0800, Jason Wang wrote:
> > > > > On Tue, Aug 2, 2022 at 9:42 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > > I feel some of my latest review opened some questions that I don't have
> > > > > > good answers for and might have felt a bit rambling.
> > > > > > So to focus the discussion:
> > > > > >
> > > > > > On Sun, Jul 31, 2022 at 06:43:50PM +0300, Max Gurtovoy wrote:
> > > > > > > +A device can be a member of one or more device groups.
> > > > > > Presumably this is so we can e.g. create subfunctions inside a VF.
> > > > > Then VF should have its own transport virtqueue. And subfunctions need
> > > > > to be created there. If we don't all thing in PF, we may end up with
> > > > > nesting issue when assign VF to the guest.
> > > > > > A VF now is a member of a SRIOV and SIOV type groups and we
> > > > > > can use type to distinguish between these.
> > > > > >
> > > > > > We should probably be explicit that each of these groups has to
> > > > > > have a distinct group type then.
> > > > > >
> > > > > > And this raises the question: different types have different
> > > > > > capabilities. So let's say admin queue is used to both
> > > > > > control features for SRIOV VFs and to create SIOV SFs.
> > > > > I don't get how the admin queue can be used to control VF features
> > > > > considering VF has its capabilities. (SR-IOV lacks the ability to
> > > > > provision a single VF).
> > > > Well look at latest proposal, last patch controls VF features from PF.
> > >
> > >
> > > Yes, so it works like previous MSI-X allocation which needs some care to
> > > prevent managed device from being probed before assigning features.
> > >
> > > This is technically possible, but I'm not sure it is a good design. For
> > > example, what happens if the management change the feature while the a
> > > driver is using the managed device.
> >
> > I think this should be prohibited in the spec.
> 
> Yes, but implementation wise, this needs to be considered.

Just check the DRIVER status bit, it's not difficult.

> >
> > It might be a good idea to have explicit commands that allow driver to
> > attach.
> >
> > For example the following might work for both VFs and SFs:
> >
> >
> > INIT
> >
> > configure
> >
> > ENABLE <- driver can attach now, configure is blocked
> >
> >
> > --- device can be used ---
> >
> > Note: some configs might be editable while device is in use.
> > E.g. enabling/disabling softmac dynamically.
> >
> > --- device can be used ---
> >
> > DISABLE -> takes control from driver. we can have a flag telling
> >            whether we want to be graceful about it and fail
> >            if driver is still attached or not
> >
> > configure - if we want to attach to another VM
> >
> > CLEANUP - release resources and forget config
> 
> Yes, but for SF it's not a must.
> 
> And should we add these states in the current state machine? If yes,
> it might complicate the migration compatibility.

Noidea what does it have to do with migration.

> >
> >
> >
> > >
> > > >
> > > > > > I guess we'll have a feature bit to say "command to create
> > > > > > SIOV SFs is supported" but how do we say that this command
> > > > > > is only supported for VFs not SFs?
> > > > > I think we should first answer if having VF and SF to be dealt with a
> > > > > single type of virtqueue is a good idea. They have something in common
> > > > > but they distinguish each other:
> > > > >
> > > > > - SF requires per virtual device lifecycle management
> > > > > - SF requires a transport other than PCI
> > > > > - SF requires more mediation in the software layer for presenting a
> > > > > virtual device
> > > > >
> > > > > Using a single type of virtqueue may end up with complex design.
> > > > > Having a dedicated queue for SF might be a better choice.
> > > > And dedicated feature bits for commands thereof?
> > >
> > >
> > > Only needed if we're using a single type of the queue.
> >
> >
> > Imagine a command only allowed for SFs not VFs. Does
> > the PF supporting SFs and VFs have the corresponding
> > feature bit or not?
> 
> I wonder if we can do:
> 
> 1) having two type of virtqueues
> 2) VFs goes to VF admin queue
> 3) SFs goes to SF transport queue
> 
> So if PF supports both SFs and VFs, it should have at least two feature bits.

This does not answer the question.  Let's say we have command X. We
would normally have feature COMMAND_X.  How do we communicate which of
the VQs support which command?


> >
> > >
> > > >    For example, I imagine
> > > > we could have commands to control the MAC of the group member. That is
> > > > the same for SF and VF, yes? How do we avoid duplication for that?
> > >
> > >
> > > In the transport vq, all configs (include mtu and features) were specified
> > > during the device creating command. It is not allowed to change mac
> > > afterwards. (If we need, the SF needs to be destroyed and created again with
> > > different configs).
> >
> > It was just an example. Are you implying SFs and VFs have completely
> > different needs with no overlap then?
> 
> There indeed overlaps, e.g the provisions of the configs. Other than
> these, there should be no other.

Yes provision but my point is that it is not just the config space.
Here's a better example of a resource which is not in device config: MAC
table size.  And one of the issues is that this is also something that can be
changed transparently as device is running.  So we could have a separate
command to provision it both for admin queue and for transport vq, and a
separate command to change it later, but it seems inelegant.


> The idea of the transport virtqueue
> is mainly for having a new transport. This is different from what I
> understand for the admin virtqueue.

Hmm my point was that a transport, or a bus, is in fact a way to address
a group of devices.  Which is exactly what admin queue does.

> > It seems weird since
> > fundamentally they look the same at a lot of levels.
> 
> Yes but only from the view of the functionality.

The concept of a device addressing other devices is what unifies them.
The patch dealing with device groups was developed in response to this.

> >
> >
> >
> > >
> > > >
> > > > > > Do we just make features list a superset of what is supported and simply
> > > > > > say in the spec which commands are legal with which group types?
> > > > > >
> > > > > >
> > > > > > Jason Cornelia what do you think?
> > > > > It looks to me it would be much more simpler if we use separated
> > > > > virtqueues for SRIOV and SIOV.
> > > > >
> > > > > Thanks
> > > > Then is it still helpful that we have the generic group type concept?
> > >
> > >
> > > Not sure, I wonder if the implicit group can do here. E.g _F_SRIOV with
> > > _F_ADMIN_VQ menast SR-IOV group.
> >
> >
> > I don't see how. PF can have SFs right?
> 
> Yes, but technically, we have capabilities then we know which
> virtqueue is doing PF and which is doing SF.

What are capabilities? An older proposal from nvidia had them but
it was dropped. Do you propose bringing them back?

> >
> >
> > >
> > > > I was hoping it will work so the same command can be used for VFs
> > > > and SFs.
> > >
> > >
> > > Yes, but the transport vq ties the mac and other configuration with the
> > > device creating. Not sure we can easily do the same for SR-IOV.
> > >
> > > Thanks
> >
> > We can if we either split SF out or artificially add creation to VFs.
> 
> I agree. But the artificial creation for VF requires more work.

Well we need to specify what can be changed when otherwise it's
a free for all, drivers will go crazy changing random fields
at random time and then we need to support this mess.
Just look at featuresm config fields and FEATURES_OK mess -
the spec said don't do it but implementations did not bother
checking and now we wasted man months already trying to fix it properly.

> >
> >
> >
> > But I expect more command will be exactly the same. Live migration?
> 
> My understanding is that the live migration is a basic facility like
> device status. It means it needs to be transport independent.
>
> That means the function could be accessed via PCI capability/MMIO and
> other transport so it does not look like an issue specific to admin
> virtqueue or transport virtqueue. We can define a common data
> structure then it can be mapped to the same or different commands in
> each type of transport(or virtqueue)?
> 
> Thanks
> 
> Thanks

I think I now understand that you would add capability to have admin
queue inside the vq transport. It does address some issues though not
all. I will need to ponder this.


> >
> >
> >
> > >
> > > >
> > > >
> > > > > >
> > > > > >
> > > > > > > +\item Self type (group identifier = 0) - this group has only one device in the group. Each virtio device is a member of at least one device group, the Self type group.
> > > > > > Presumably, this is here so we can send commands that refer to the
> > > > > > device itself as opposed to a group member (e.g. to
> > > > > > PF as opposed to VF). Is that right?
> > > > > >
> > > > > > It's handy but again the problem here is, this refers to
> > > > > > device as part of which group? Let's just drop this type?
> > > > > >
> > > > > >
> > > > > > --
> > > > > > MST
> > > > > >
> >


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [virtio-comment] Re: [PATCH v6 1/5] Introduce device group
  2022-08-03 23:45     ` [virtio-comment] " Max Gurtovoy
@ 2022-08-04  6:20       ` Michael S. Tsirkin
  0 siblings, 0 replies; 32+ messages in thread
From: Michael S. Tsirkin @ 2022-08-04  6:20 UTC (permalink / raw)
  To: Max Gurtovoy
  Cc: jasowang, virtio-comment, cohuck, virtio-dev, oren, parav,
	shahafs, aadam, virtio, eperezma

On Thu, Aug 04, 2022 at 02:45:40AM +0300, Max Gurtovoy wrote:
> 
> On 8/2/2022 4:41 PM, Michael S. Tsirkin wrote:
> > I feel some of my latest review opened some questions that I don't have
> > good answers for and might have felt a bit rambling.
> > So to focus the discussion:
> > 
> > On Sun, Jul 31, 2022 at 06:43:50PM +0300, Max Gurtovoy wrote:
> > > +A device can be a member of one or more device groups.
> > Presumably this is so we can e.g. create subfunctions inside a VF.
> > A VF now is a member of a SRIOV and SIOV type groups and we
> > can use type to distinguish between these.
> > 
> > We should probably be explicit that each of these groups has to
> > have a distinct group type then.
> > 
> > And this raises the question: different types have different
> > capabilities. So let's say admin queue is used to both
> > control features for SRIOV VFs and to create SIOV SFs.
> > I guess we'll have a feature bit to say "command to create
> > SIOV SFs is supported" but how do we say that this command
> > is only supported for VFs not SFs?
> > 
> > Do we just make features list a superset of what is supported and simply
> > say in the spec which commands are legal with which group types?
> > 
> > 
> > Jason Cornelia what do you think?
> > 
> > 
> > 
> > > +\item Self type (group identifier = 0) - this group has only one device in the group. Each virtio device is a member of at least one device group, the Self type group.
> > Presumably, this is here so we can send commands that refer to the
> > device itself as opposed to a group member (e.g. to
> > PF as opposed to VF). Is that right?
> yes.
> > 
> > It's handy but again the problem here is, this refers to
> > device as part of which group? Let's just drop this type?
> 
> You suggested this type in our meeting. And this was one of the major
> changes from v5 --> v6.

Not just me but yes. I only just realized the problem we have with
device being part of multiple groups. Let's see what the fix is.

> 
> How do you suggest sending admin commands to yourself ? the way I suggested
> in previous versions ?

I don't know yet. I am sorry I only saw this now.

> > 
> > 


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [virtio-comment] Re: [PATCH v6 4/5] Add admin_queue_index register to PCI common configuration structure
  2022-08-04  0:01         ` [virtio-comment] " Max Gurtovoy
@ 2022-08-04  6:26           ` Michael S. Tsirkin
  0 siblings, 0 replies; 32+ messages in thread
From: Michael S. Tsirkin @ 2022-08-04  6:26 UTC (permalink / raw)
  To: Max Gurtovoy
  Cc: jasowang, virtio-comment, cohuck, virtio-dev, oren, parav,
	shahafs, aadam, virtio, eperezma

On Thu, Aug 04, 2022 at 03:01:44AM +0300, Max Gurtovoy wrote:
> 
> On 8/1/2022 9:13 AM, Michael S. Tsirkin wrote:
> > On Mon, Aug 01, 2022 at 03:11:58AM +0300, Max Gurtovoy wrote:
> > > On 8/1/2022 12:03 AM, Michael S. Tsirkin wrote:
> > > > On Sun, Jul 31, 2022 at 06:43:53PM +0300, Max Gurtovoy wrote:
> > > > > This new register will be used for querying the index of the admin
> > > > > virtqueue of a virtio device. To configure, reset or enable the admin
> > > > > virtqueue, the driver should follow existing queue configuration/setup
> > > > > sequence.
> > > > > 
> > > > > Reviewed-by: Parav Pandit <parav@nvidia.com>
> > > > > Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
> > > > Can you please at least add text to MMIO and CCW that drivers and
> > > > devices must not negotiate the new feature bit? Will help avoid confusion.
> > > why this is needed ? who will create a device and expose bits that it
> > > doesn't know how to implement.
> > Any multi-transport implementation actually.
> > For example, on a Linux guest the default is to add a feature bit to all
> > transports. Extra care is needed to actually only add it
> > to the PCI transport. With qemu, same applies to features in
> > include/hw/virtio/virtio.h.
> 
> I don't know how a reasonable driver will want to negotiate feature bits
> that a device didn't expose.

As a result of a bug.

> Please suggest whatever you want me to add and where.
> 
> I don't mind adding this but I don't want to waste cycle of review on that
> so I need exact guideline and not hints please.

Sure. So

"The driver MUST NOT negotiate VIRTIO_F_ADMIN_QUEUE. This is because
there is as yet no way for the driver to discover the index of the admin
queue."

in Device Initialization chapter.

And similarly add a device normative:

"The device MUST NOT offer VIRTIO_F_ADMIN_QUEUE. This is because
there is as yet no way to device to expose the index of the admin
queue."



> > 
> > > And if I'll add it, we might forget to remove it later on.
> > When we add the implementation I'm reasonably sure we won't forget to
> > remove text saying it's not implemented.
> 
> Ok.
> 
> 
> > 
> > > > > ---
> > > > >    content.tex | 10 ++++++++++
> > > > >    1 file changed, 10 insertions(+)
> > > > > 
> > > > > diff --git a/content.tex b/content.tex
> > > > > index c15423e..5fda1a0 100644
> > > > > --- a/content.tex
> > > > > +++ b/content.tex
> > > > > @@ -904,6 +904,9 @@ \subsubsection{Common configuration structure layout}\label{sec:Virtio Transport
> > > > >            le64 queue_device;              /* read-write */
> > > > >            le16 queue_notify_data;         /* read-only for driver */
> > > > >            le16 queue_reset;               /* read-write */
> > > > > +
> > > > > +        /* About admin virtqueue. */
> > > > > +        le16 admin_queue_index;         /* read-only for driver */
> > > > >    };
> > > > >    \end{lstlisting}
> > > > > @@ -989,6 +992,10 @@ \subsubsection{Common configuration structure layout}\label{sec:Virtio Transport
> > > > >            This field exists only if VIRTIO_F_RING_RESET has been
> > > > >            negotiated. (see \ref{sec:Basic Facilities of a Virtio Device / Virtqueues / Virtqueue Reset}).
> > > > > +\item[\field{admin_queue_index}]
> > > > > +        The device uses this to report the index of the admin virtqueue.
> > > > > +        This field always exists. Its value is valid only if VIRTIO_F_ADMIN_VQ has been negotiated.
> > > > > +
> > > > >    \end{description}
> > > > >    \devicenormative{\paragraph}{Common configuration structure layout}{Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Common configuration structure layout}
> > > > we have a mess with this exists versus valid. I think exists is the same
> > > the bigest mess that we have is for things like ring_reset that are
> > > optionally exist according to bit negotiation.
> > I think this optionally exist does not mean much, just that
> > it should not be accessed
> 
> If I understand the word "exist" translation correctly, it doesn't mean what
> you said about "shouldn't be accessed".
> 
> > > about valid - the driver shouldn't even look on registers that it didn't
> > > negotiated it's feature bits. There is no reason to do so.
> > > So yes, there is a mess but not in our proposal.
> > Right. I propose removing "This field always exists." just
> > say that it's valid with VIRTIO_F_ADMIN_VQ.
> 
> Why removing ? does it exist always or not ?

It doesn't. For example existing devices do not have it.


> > 
> > > > is valid personally. Do others object if we say same as for reset here?
> > > > Not a big deal either way, we need to clean this up later.
> > > > 
> > > > > @@ -1075,6 +1082,9 @@ \subsubsection{Common configuration structure layout}\label{sec:Virtio Transport
> > > > >    were used before the queue reset.
> > > > >    (see \ref{sec:Basic Facilities of a Virtio Device / Virtqueues / Virtqueue Reset}).
> > > > > +For configuring the admin virtqueue, the driver MUST use the value of \field{admin_queue_index}.
> > > > > +For more details on virtqueue configuration see section \ref{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI-specific Initialization And Device Operation / Device Initialization / Virtqueue Configuration}.
> > > > > +
> > > > >    \subsubsection{Notification structure layout}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Notification capability}
> > > > >    The notification location is found using the VIRTIO_PCI_CAP_NOTIFY_CFG
> > > > > -- 
> > > > > 2.21.0
> > 
> > This publicly archived list offers a means to provide input to the
> > OASIS Virtual I/O Device (VIRTIO) TC.
> > 
> > In order to verify user consent to the Feedback License terms and
> > to minimize spam in the list archive, subscription is required
> > before posting.
> > 
> > Subscribe: virtio-comment-subscribe@lists.oasis-open.org
> > Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
> > List help: virtio-comment-help@lists.oasis-open.org
> > List archive: https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.oasis-open.org%2Farchives%2Fvirtio-comment%2F&amp;data=05%7C01%7Cmgurtovoy%40nvidia.com%7C46caf90d77cb4e2e7d1b08da7384fbe5%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637949312238253496%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&amp;sdata=72o5SFbI1Ai5twIQXF0Onn2TTorpuIlWoqp46tpUgrQ%3D&amp;reserved=0
> > Feedback License: https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.oasis-open.org%2Fwho%2Fipr%2Ffeedback_license.pdf&amp;data=05%7C01%7Cmgurtovoy%40nvidia.com%7C46caf90d77cb4e2e7d1b08da7384fbe5%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637949312238253496%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&amp;sdata=7%2FdpXHFFpA7j97vIqLFXfaVQMj1FmDX2Ldwwzytpuuw%3D&amp;reserved=0
> > List Guidelines: https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.oasis-open.org%2Fpolicies-guidelines%2Fmailing-lists&amp;data=05%7C01%7Cmgurtovoy%40nvidia.com%7C46caf90d77cb4e2e7d1b08da7384fbe5%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637949312238253496%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&amp;sdata=lN2X%2BC%2BoHvd90YL%2BSf3FSleiNJYZnfTqRWSdDpK2wAc%3D&amp;reserved=0
> > Committee: https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.oasis-open.org%2Fcommittees%2Fvirtio%2F&amp;data=05%7C01%7Cmgurtovoy%40nvidia.com%7C46caf90d77cb4e2e7d1b08da7384fbe5%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637949312238253496%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&amp;sdata=5FFookod06loAfQ8uGp3HmeuEUdP2lurXWtPNOnCtNY%3D&amp;reserved=0
> > Join OASIS: https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.oasis-open.org%2Fjoin%2F&amp;data=05%7C01%7Cmgurtovoy%40nvidia.com%7C46caf90d77cb4e2e7d1b08da7384fbe5%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637949312238253496%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&amp;sdata=1OJe%2BI1%2FsWYMIaM0PIZjMKmcC%2FpGAoOkHr%2Bi4E4AJOE%3D&amp;reserved=0
> > 


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v6 1/5] Introduce device group
  2022-08-04  6:17               ` Michael S. Tsirkin
@ 2022-08-04  7:17                 ` Jason Wang
  0 siblings, 0 replies; 32+ messages in thread
From: Jason Wang @ 2022-08-04  7:17 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Max Gurtovoy, virtio-comment, Cornelia Huck, Virtio-Dev,
	Oren Duer, Parav Pandit, Shahaf Shuler, Ariel Adam, virtio,
	eperezma

On Thu, Aug 4, 2022 at 2:17 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Thu, Aug 04, 2022 at 10:08:30AM +0800, Jason Wang wrote:
> > On Wed, Aug 3, 2022 at 8:33 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > >
> > > On Wed, Aug 03, 2022 at 04:04:50PM +0800, Jason Wang wrote:
> > > >
> > > > 在 2022/8/3 14:10, Michael S. Tsirkin 写道:
> > > > > On Wed, Aug 03, 2022 at 12:44:38PM +0800, Jason Wang wrote:
> > > > > > On Tue, Aug 2, 2022 at 9:42 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > > > I feel some of my latest review opened some questions that I don't have
> > > > > > > good answers for and might have felt a bit rambling.
> > > > > > > So to focus the discussion:
> > > > > > >
> > > > > > > On Sun, Jul 31, 2022 at 06:43:50PM +0300, Max Gurtovoy wrote:
> > > > > > > > +A device can be a member of one or more device groups.
> > > > > > > Presumably this is so we can e.g. create subfunctions inside a VF.
> > > > > > Then VF should have its own transport virtqueue. And subfunctions need
> > > > > > to be created there. If we don't all thing in PF, we may end up with
> > > > > > nesting issue when assign VF to the guest.
> > > > > > > A VF now is a member of a SRIOV and SIOV type groups and we
> > > > > > > can use type to distinguish between these.
> > > > > > >
> > > > > > > We should probably be explicit that each of these groups has to
> > > > > > > have a distinct group type then.
> > > > > > >
> > > > > > > And this raises the question: different types have different
> > > > > > > capabilities. So let's say admin queue is used to both
> > > > > > > control features for SRIOV VFs and to create SIOV SFs.
> > > > > > I don't get how the admin queue can be used to control VF features
> > > > > > considering VF has its capabilities. (SR-IOV lacks the ability to
> > > > > > provision a single VF).
> > > > > Well look at latest proposal, last patch controls VF features from PF.
> > > >
> > > >
> > > > Yes, so it works like previous MSI-X allocation which needs some care to
> > > > prevent managed device from being probed before assigning features.
> > > >
> > > > This is technically possible, but I'm not sure it is a good design. For
> > > > example, what happens if the management change the feature while the a
> > > > driver is using the managed device.
> > >
> > > I think this should be prohibited in the spec.
> >
> > Yes, but implementation wise, this needs to be considered.
>
> Just check the DRIVER status bit, it's not difficult.

But it would be a race? There could be an ongining probing for this
device, the code just didn't reach the line that set the DRIVER.

>
> > >
> > > It might be a good idea to have explicit commands that allow driver to
> > > attach.
> > >
> > > For example the following might work for both VFs and SFs:
> > >
> > >
> > > INIT
> > >
> > > configure
> > >
> > > ENABLE <- driver can attach now, configure is blocked
> > >
> > >
> > > --- device can be used ---
> > >
> > > Note: some configs might be editable while device is in use.
> > > E.g. enabling/disabling softmac dynamically.
> > >
> > > --- device can be used ---
> > >
> > > DISABLE -> takes control from driver. we can have a flag telling
> > >            whether we want to be graceful about it and fail
> > >            if driver is still attached or not
> > >
> > > configure - if we want to attach to another VM
> > >
> > > CLEANUP - release resources and forget config
> >
> > Yes, but for SF it's not a must.
> >
> > And should we add these states in the current state machine? If yes,
> > it might complicate the migration compatibility.
>
> Noidea what does it have to do with migration.

For example, in src, we support the above new status but not in the destination.

>
> > >
> > >
> > >
> > > >
> > > > >
> > > > > > > I guess we'll have a feature bit to say "command to create
> > > > > > > SIOV SFs is supported" but how do we say that this command
> > > > > > > is only supported for VFs not SFs?
> > > > > > I think we should first answer if having VF and SF to be dealt with a
> > > > > > single type of virtqueue is a good idea. They have something in common
> > > > > > but they distinguish each other:
> > > > > >
> > > > > > - SF requires per virtual device lifecycle management
> > > > > > - SF requires a transport other than PCI
> > > > > > - SF requires more mediation in the software layer for presenting a
> > > > > > virtual device
> > > > > >
> > > > > > Using a single type of virtqueue may end up with complex design.
> > > > > > Having a dedicated queue for SF might be a better choice.
> > > > > And dedicated feature bits for commands thereof?
> > > >
> > > >
> > > > Only needed if we're using a single type of the queue.
> > >
> > >
> > > Imagine a command only allowed for SFs not VFs. Does
> > > the PF supporting SFs and VFs have the corresponding
> > > feature bit or not?
> >
> > I wonder if we can do:
> >
> > 1) having two type of virtqueues
> > 2) VFs goes to VF admin queue
> > 3) SFs goes to SF transport queue
> >
> > So if PF supports both SFs and VFs, it should have at least two feature bits.
>
> This does not answer the question.  Let's say we have command X. We
> would normally have feature COMMAND_X.  How do we communicate which of
> the VQs support which command?

It should work like the existing virtqueue? E.g for virtio-net, ctrl
vq only accepts commands VIRTIO_NET_CTRL_VQ_XXX. So did for the admin
vq and transport vq.

>
>
> > >
> > > >
> > > > >    For example, I imagine
> > > > > we could have commands to control the MAC of the group member. That is
> > > > > the same for SF and VF, yes? How do we avoid duplication for that?
> > > >
> > > >
> > > > In the transport vq, all configs (include mtu and features) were specified
> > > > during the device creating command. It is not allowed to change mac
> > > > afterwards. (If we need, the SF needs to be destroyed and created again with
> > > > different configs).
> > >
> > > It was just an example. Are you implying SFs and VFs have completely
> > > different needs with no overlap then?
> >
> > There indeed overlaps, e.g the provisions of the configs. Other than
> > these, there should be no other.
>
> Yes provision but my point is that it is not just the config space.
> Here's a better example of a resource which is not in device config: MAC
> table size.  And one of the issues is that this is also something that can be
> changed transparently as device is running.

It looks to me spec doesn't say it can be changed in this way.

>  So we could have a separate
> command to provision it both for admin queue and for transport vq, and a
> separate command to change it later, but it seems inelegant.

For MAC table size, as spec suggests, it probably requires a mediation
layer in the software: switching to use alluni when we run out of mac
tables. Having to provision mac table size seems a burden for the mgmt
layer since it is not something that can be seen by the driver.

So for transport vq, the idea is to provision all config in one
command with the provision of the SF itself. So if the config
provision could be done for SR-IOV, we may think of a way to unify
them. It could be something like

1) define the common structure
2) map them into different command

This might be useful for other DMA/CMA based transport in the future.
This is similar to other device facilities: e.g the device status
could be accessed in various transport dependent ways.

>
>
> > The idea of the transport virtqueue
> > is mainly for having a new transport. This is different from what I
> > understand for the admin virtqueue.
>
> Hmm my point was that a transport, or a bus, is in fact a way to address
> a group of devices.  Which is exactly what admin queue does.

Addressing for VF has been done via BDF (transport) for SR-IOV. My
understanding for admin virtqueue so far is that it is not aimed to be
a transport but a way to have an out of band management interface.

>
> > > It seems weird since
> > > fundamentally they look the same at a lot of levels.
> >
> > Yes but only from the view of the functionality.
>
> The concept of a device addressing other devices is what unifies them.
> The patch dealing with device groups was developed in response to this.

My understanding is that there are one major differences In the case
of transport vq proposal, the basic device facility (device
configuration like status, virtqueue, features etc) is carried via
virtual queue but in the admin virtqueue they are still expected to be
accessed via BAR.

>
> > >
> > >
> > >
> > > >
> > > > >
> > > > > > > Do we just make features list a superset of what is supported and simply
> > > > > > > say in the spec which commands are legal with which group types?
> > > > > > >
> > > > > > >
> > > > > > > Jason Cornelia what do you think?
> > > > > > It looks to me it would be much more simpler if we use separated
> > > > > > virtqueues for SRIOV and SIOV.
> > > > > >
> > > > > > Thanks
> > > > > Then is it still helpful that we have the generic group type concept?
> > > >
> > > >
> > > > Not sure, I wonder if the implicit group can do here. E.g _F_SRIOV with
> > > > _F_ADMIN_VQ menast SR-IOV group.
> > >
> > >
> > > I don't see how. PF can have SFs right?
> >
> > Yes, but technically, we have capabilities then we know which
> > virtqueue is doing PF and which is doing SF.
>
> What are capabilities? An older proposal from nvidia had them but
> it was dropped. Do you propose bringing them back?

Just to be clear, I meant we can develop any necessary facilities for
the driver to know:

virtqueue X: admin virtqueue or not
virtqueue Y: transport virtqueue or not
virtqueue Z: live migration virtqueue or not

etc.

>
> > >
> > >
> > > >
> > > > > I was hoping it will work so the same command can be used for VFs
> > > > > and SFs.
> > > >
> > > >
> > > > Yes, but the transport vq ties the mac and other configuration with the
> > > > device creating. Not sure we can easily do the same for SR-IOV.
> > > >
> > > > Thanks
> > >
> > > We can if we either split SF out or artificially add creation to VFs.
> >
> > I agree. But the artificial creation for VF requires more work.
>
> Well we need to specify what can be changed when otherwise it's
> a free for all, drivers will go crazy changing random fields
> at random time and then we need to support this mess.
> Just look at featuresm config fields and FEATURES_OK mess -
> the spec said don't do it but implementations did not bother
> checking and now we wasted man months already trying to fix it properly.

Yes, so what I meant is that. Consider the complexity, If we provision
all configs along with the SF itself, we don't need to bother with
artificial creation.

>
> > >
> > >
> > >
> > > But I expect more command will be exactly the same. Live migration?
> >
> > My understanding is that the live migration is a basic facility like
> > device status. It means it needs to be transport independent.
> >
> > That means the function could be accessed via PCI capability/MMIO and
> > other transport so it does not look like an issue specific to admin
> > virtqueue or transport virtqueue. We can define a common data
> > structure then it can be mapped to the same or different commands in
> > each type of transport(or virtqueue)?
> >
> > Thanks
> >
> > Thanks
>
> I think I now understand that you would add capability to have admin
> queue inside the vq transport. It does address some issues though not
> all. I will need to ponder this.

I prefer to do them in parallel if it's possible. The only overlap is
the provisioning, but we can think of a way to reduce the duplication
of commands.

Thanks

>
>
> > >
> > >
> > >
> > > >
> > > > >
> > > > >
> > > > > > >
> > > > > > >
> > > > > > > > +\item Self type (group identifier = 0) - this group has only one device in the group. Each virtio device is a member of at least one device group, the Self type group.
> > > > > > > Presumably, this is here so we can send commands that refer to the
> > > > > > > device itself as opposed to a group member (e.g. to
> > > > > > > PF as opposed to VF). Is that right?
> > > > > > >
> > > > > > > It's handy but again the problem here is, this refers to
> > > > > > > device as part of which group? Let's just drop this type?
> > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > > MST
> > > > > > >
> > >
>


^ permalink raw reply	[flat|nested] 32+ messages in thread

end of thread, other threads:[~2022-08-04  7:17 UTC | newest]

Thread overview: 32+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-07-31 15:43 [PATCH v6 0/5] Introduce device group and device management Max Gurtovoy
2022-07-31 15:43 ` [PATCH v6 1/5] Introduce device group Max Gurtovoy
2022-07-31 20:38   ` Michael S. Tsirkin
2022-07-31 20:42   ` Michael S. Tsirkin
2022-07-31 21:19   ` Michael S. Tsirkin
2022-08-02 13:41   ` Michael S. Tsirkin
2022-08-03  4:44     ` Jason Wang
2022-08-03  6:10       ` Michael S. Tsirkin
2022-08-03  8:04         ` Jason Wang
2022-08-03 12:33           ` Michael S. Tsirkin
2022-08-04  2:08             ` Jason Wang
2022-08-04  6:17               ` Michael S. Tsirkin
2022-08-04  7:17                 ` Jason Wang
2022-08-03  6:43       ` Michael S. Tsirkin
2022-08-03 23:45     ` [virtio-comment] " Max Gurtovoy
2022-08-04  6:20       ` Michael S. Tsirkin
2022-07-31 15:43 ` [PATCH v6 2/5] Introduce admin command set Max Gurtovoy
2022-07-31 20:59   ` Michael S. Tsirkin
2022-07-31 23:56     ` [virtio-comment] " Max Gurtovoy
2022-07-31 15:43 ` [virtio-comment] [PATCH v6 3/5] Introduce virtio admin virtqueue Max Gurtovoy
2022-07-31 21:00   ` Michael S. Tsirkin
2022-07-31 23:07     ` Max Gurtovoy
2022-08-01  6:03       ` Michael S. Tsirkin
2022-07-31 15:43 ` [PATCH v6 4/5] Add admin_queue_index register to PCI common configuration structure Max Gurtovoy
2022-07-31 21:03   ` Michael S. Tsirkin
2022-08-01  0:11     ` Max Gurtovoy
2022-08-01  6:13       ` Michael S. Tsirkin
2022-08-04  0:01         ` [virtio-comment] " Max Gurtovoy
2022-08-04  6:26           ` Michael S. Tsirkin
2022-07-31 15:43 ` [PATCH v6 5/5] Introduce MGMT admin commands Max Gurtovoy
2022-07-31 21:16   ` Michael S. Tsirkin
2022-07-31 16:27 ` [PATCH v6 0/5] Introduce device group and device management Michael S. Tsirkin

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.