All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3 0/4] VIRTIO: Provision maximum MSI-X vectors for a VF
@ 2022-02-03  7:57 Max Gurtovoy
  2022-02-03  7:57 ` [PATCH v3 1/4] Add virtio Admin virtqueue Max Gurtovoy
                   ` (3 more replies)
  0 siblings, 4 replies; 43+ messages in thread
From: Max Gurtovoy @ 2022-02-03  7:57 UTC (permalink / raw)
  To: virtio-comment, mst, cohuck, virtio-dev, jasowang
  Cc: parav, shahafs, oren, stefanha, Max Gurtovoy

Hi,

In a PCI SR-IOV configuration, MSI-X vectors of the device is precious
device resource. Hence making efficient use of it based on the use case
that aligns to the VM configuration is desired for best system
performance.

For example, today's static assignment of the amount of MSI-X vectors
doesn't allow sophisticated utilization of resources.

A typical cloud provider SR-IOV use case is to create many VFs for
use by guest VMs. Each VM might have a different purpose and different
amount of resources accordingly (e.g. number of CPUs). A common driver
usage of device's MSI-X vectors is proportional to the number of CPUs in
the VM. Since the system administrator might know the amount of CPUs in
the requested VM, he can also configure the VF's MSI-X vectors amount
proportional to the number of CPUs in the VM. In this way, the
utilization of the physical hardware will be improved.

Today we have some operating systems that support provisioning MSI-X
vectors for PCI VFs.

Update the specification to have a method to change the number of MSI-X
vectors supported by a VF using the PF admin virtqueue interface. For that,
create a generic infrastructure for managing PCI resources of the managed
VF by its parent PF.

The admin virtqueue interface will be extended in the future with more and more
features that some of them already in discussions. Some of these features don't
fit to MMIO/config_space characteristics, therefore a queue is selected to address
admin commands.

Motivation for choosing admin queue:
1. It is anticipated that admin queue will be used for managing and configuring
   many different type of resources. For example,
   a. PCI PF configuring PCI VF attributes in [1].
   b. virtio device creating/destroying/configuring subfunctions discussed in [2]
   c. composing device config space of VF or SF such as mac address, number of VQs, virtio features

Mapping all of them as configuration registers to MMIO will require large MMIO space, if done for
each VF/SF. Such MMIO implementation in physical devices such as PCI PF and VF requires on-chip
resources to complete within MMIO access latencies. Such resources are very expensive.

2. Such limitation can be overcome by having smaller MMIO register set to build
   a command request response interface. However, such MMIO based command interface
   will be limited to serve single outstanding command execution. Such limitation can
   resulting in high device creation and composing time which can affect VM startup time.
   Often device can queue and service multiple commands in parallel, such command interface
   cannot use parallelism offered by the device.

3. When a command wants to DMA data from one or more physical addresses, for example in the future a
   live migration command may need to fetch device state consist of config space, tens of
   VQs state, VLAN and MAC table, per VQ partial outstanding block IO list database and more.
   Packing one or more DMA addresses over new command interface will be burden some and continue
   to suffer single outstanding command execution latencies. Such limitation is not good for time
   sensitive live migration use cases.

4. A virtio queue overcomes all the above limitations. It also supports DMA and multiple outstanding
   descriptors. Similar mechanism exist today for device specific configuration - the control VQ.

[1] This proposal
[2] https://lists.oasis-open.org/archives/virtio-comment/202108/msg00025.html

Patch (1/4) introduce the admin virtqueue concept and feature bit.
Patch (2/4) introduce the miscellaneous configuration structure for PCI.
Patch (3/4) introduce the device management concept.
Patch (4/4) introduce MSI-X control support.

Open issues:
1.  Should we have command to reset all VFs vectors to zero before enabling SR-IOV? Yes/No.
    If yes, is this optimization command must in current series? Or it can be added later?
2. CCW and MMIO specification for admin_queue_index register

---

changes from V2:

1. Addressed below comments from Jason:
 - split admin virtqueue section from admin command set section
 - added device management section in "Basic Facilities of a Virtio Device"
 - mention that management interface may be admin virtqueue (but not a must)
 - add MSI-X configuration sequence

2. Addressed below comments from MST:
 - added conformance sections
 - use decimal instead of hex for bit numbering
 - remove the OOO definitions for AQ
 - explain more about default configuration of MSI-X for VFs

3. Addressed below comments from Cornelia:
 - reword AQ index definitions
 - introduce misc config for PCI transport

4. added vfs_assigned_msix_count attr to VIRTIO_ADMIN_PCI_SRIOV_ATTRS cmd

5. removed the net/blk AQ index definitions (not needed in the new model)

6. rebased on top of master branch commit 88895f56e642aca ("Reserve more feature bits for device type usage")

changes from V1:

1. Addressed below comments from MST:
 - updated cover letter for admin queue motivation
 - removed VIRTIO_F_ADMIN_PCI_VIRT_MANAGER
 - simplified admin queue interface by removing 
   VIRTIO_F_ADMIN_VQ_INDIRECT_DESC/VIRTIO_F_ADMIN_VQ_IN_ORDER feature bits
 - added a subsection for VF MSI-X control capability in PCI section
 - re-designed interrupt vector management admin commands
 - added a mandatory admin command to expose admin capabilities
 - improved commit messages
 - described error code for unsupported command
 - described error code for errors on invalid VF
 - described system software requirements for supporting MSI-X configuration

2. Addressed comments from Parav
 - extended command opcode to 16-bit
 - improve commit messages

3. Added more command specific error codes

---


Max Gurtovoy (4):
  Add virtio Admin virtqueue
  Add miscellaneous configuration structure for PCI
  Add device management facility
  Add support for MSI-X vectors configuration for PCI VFs

 admin.tex       | 252 ++++++++++++++++++++++++++++++++++++++++++++++++
 conformance.tex |   4 +
 content.tex     | 119 ++++++++++++++++++++++-
 3 files changed, 373 insertions(+), 2 deletions(-)
 create mode 100644 admin.tex

-- 
2.21.0


^ permalink raw reply	[flat|nested] 43+ messages in thread

* [PATCH v3 1/4] Add virtio Admin virtqueue
  2022-02-03  7:57 [PATCH v3 0/4] VIRTIO: Provision maximum MSI-X vectors for a VF Max Gurtovoy
@ 2022-02-03  7:57 ` Max Gurtovoy
  2022-02-03 13:09   ` [virtio-dev] " Cornelia Huck
  2022-02-07 10:39   ` Michael S. Tsirkin
  2022-02-03  7:57 ` [PATCH v3 2/4] Add miscellaneous configuration structure for PCI Max Gurtovoy
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 43+ messages in thread
From: Max Gurtovoy @ 2022-02-03  7:57 UTC (permalink / raw)
  To: virtio-comment, mst, cohuck, virtio-dev, jasowang
  Cc: parav, shahafs, oren, stefanha, Max Gurtovoy

In one of the many use cases a user wants to manipulate features and
configuration of the virtio devices regardless of the device type
(net/block/console). Some of this configuration is generic enough. i.e
Number of MSI-X vectors of a virtio PCI VF device. There is a need to do
such features query and manipulation by its parent PCI PF.

Currently virtio specification defines control virtqueue to manipulate
features and configuration of the device it operates on. However,
control virtqueue commands are device type specific, which makes it very
difficult to extend for device agnostic commands.

To support this requirement in elegant way, this patch introduces a new
admin virtqueue interface. Admin virtqueue is proposed as one of the
interfaces to issue admin commands that have the same command format for
all type of virtio devices.

Manipulate features via admin virtqueue is asynchronous, scalable, easy
to extend and doesn't require additional and expensive on-die resources
to be allocated for every new feature that will be added in the future.

Subsequent patches make use of this admin virtqueue.

Reviewed-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
---
 admin.tex       | 94 +++++++++++++++++++++++++++++++++++++++++++++++++
 conformance.tex |  1 +
 content.tex     |  8 +++--
 3 files changed, 101 insertions(+), 2 deletions(-)
 create mode 100644 admin.tex

diff --git a/admin.tex b/admin.tex
new file mode 100644
index 0000000..fa9c993
--- /dev/null
+++ b/admin.tex
@@ -0,0 +1,94 @@
+\section{Admin Virtqueues}\label{sec:Basic Facilities of a Virtio Device / Admin Virtqueues}
+
+Admin virtqueue is one of the management interface that used to send administrative
+commands to manipulate various features of the device and/or to manipulate
+various features, if possible, of another device within the same group (e.g. PCI VFs
+of a parent PCI PF device are grouped together. These devices can be optionally
+managed by its parent PCI PF using its admin virtqueue.).
+
+An admin virtqueue exists for a certain device if VIRTIO_F_ADMIN_VQ is
+negotiated. The index of the admin virtqueue exposed by the device in a
+transport specific manner.
+
+When VIRTIO_F_ADMIN_VQ is negotiated with the device, driver will send all admin commands
+through the admin virtqueue.
+
+\section{Admin command set}\label{sec:Basic Facilities of a Virtio Device / Admin command set}
+
+The Admin command set defines the commands that can be issued to a well defined management
+interface, such as the admin virtqueue. All commands are of the following form:
+
+\begin{lstlisting}
+struct virtio_admin_cmd {
+        /* Device-readable part */
+        u16 command;
+        u8 command_specific_data[];
+
+        /* Device-writable part */
+        u8 status;
+        u8 command_specific_error;
+        u8 command_specific_result[];
+};
+\end{lstlisting}
+
+The following table describes the generic Admin status codes:
+
+\begin{tabular}{|l|l|l|}
+\hline
+Opcode & Status & Description \\
+\hline \hline
+00h   & VIRTIO_ADMIN_STATUS_OK    & successful completion  \\
+\hline
+01h   & VIRTIO_ADMIN_STATUS_CS_ERR    & command specific error  \\
+\hline
+02h   & VIRTIO_ADMIN_STATUS_COMMAND_UNSUPPORTED    & unsupported or invalid opcode  \\
+\hline
+\end{tabular}
+
+The \field{command} and \field{command_specific_data} are
+set by the driver, and the device sets the \field{status}, the
+\field{command_specific_error} and the \field{command_specific_result},
+if needed.
+
+The \field{command_specific_error} should be inspected by the driver only if \field{status} is set to
+VIRTIO_ADMIN_STATUS_CS_ERR by the device. In this case, the content of \field{command_specific_error}
+holds the command specific error. If \field{status} is not set to VIRTIO_ADMIN_STATUS_CS_ERR, the
+\field{command_specific_error} value is undefined.
+
+The following table describes the Admin command set:
+
+\begin{tabular}{|l|l|l|}
+\hline
+Opcode & Command & M/O \\
+\hline \hline
+0000h   & VIRTIO_ADMIN_CAPS_IDENTIFY    & M  \\
+\hline
+0001h - 7FFFh   & Generic admin cmds    & -  \\
+\hline
+8000h - FFFFh   & Reserved    & - \\
+\hline
+\end{tabular}
+
+\devicenormative{\subsection}{Admin command set}{Basic Facilities of a Virtio Device / Admin command set}
+A device that advertises VIRTIO_F_ADMIN_VQ capability MUST support all the mandatory admin commands.
+
+A device that advertises VIRTIO_F_ADMIN_VQ capability MAY support one or more optional admin commands.
+
+
+\subsection{VIRTIO ADMIN CAPS IDENTIFY command}\label{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN CAPS IDENTIFY command}
+
+The VIRTIO_ADMIN_CAPS_IDENTIFY command has no command specific data set by the driver.
+This command upon success, returns a data buffer that describes information about the supported
+admin capabilities by the device. This information is of form:
+\begin{lstlisting}
+struct virtio_admin_caps_identify_result {
+       /*
+        * caps[0] - Bit 0 - Bit 7 are reserved
+        * caps[1] - Bit 0 - Bit 7 are reserved
+        * caps[2] - Bit 0 - Bit 7 are reserved
+        * ....
+        * caps[8191] - Bit 0 - Bit 7 are reserved
+        */
+       u8 caps[8192];
+};
+\end{lstlisting}
diff --git a/conformance.tex b/conformance.tex
index 42f8537..2e5d7e0 100644
--- a/conformance.tex
+++ b/conformance.tex
@@ -341,6 +341,7 @@ \section{Conformance Targets}\label{sec:Conformance / Conformance Targets}
 \item \ref{devicenormative:Basic Facilities of a Virtio Device / Virtqueues / Available Buffer Notification Suppression}
 \item \ref{devicenormative:Basic Facilities of a Virtio Device / Shared Memory Regions}
 \item \ref{devicenormative:Reserved Feature Bits}
+\item \ref{devicenormative:Basic Facilities of a Virtio Device / Admin command set}
 \end{itemize}
 
 \conformance{\subsection}{PCI Device Conformance}\label{sec:Conformance / Device Conformance / PCI Device Conformance}
diff --git a/content.tex b/content.tex
index c6f116c..163cb34 100644
--- a/content.tex
+++ b/content.tex
@@ -99,10 +99,10 @@ \section{Feature Bits}\label{sec:Basic Facilities of a Virtio Device / Feature B
 \begin{description}
 \item[0 to 23, and 50 to 127] Feature bits for the specific device type
 
-\item[24 to 40] Feature bits reserved for extensions to the queue and
+\item[24 to 41] Feature bits reserved for extensions to the queue and
   feature negotiation mechanisms
 
-\item[41 to 49, and 128 and above] Feature bits reserved for future extensions.
+\item[42 to 49, and 128 and above] Feature bits reserved for future extensions.
 \end{description}
 
 \begin{note}
@@ -449,6 +449,8 @@ \section{Exporting Objects}\label{sec:Basic Facilities of a Virtio Device / Expo
 types. It is RECOMMENDED that devices generate version 4
 UUIDs as specified by \hyperref[intro:rfc4122]{[RFC4122]}.
 
+\input{admin.tex}
+
 \chapter{General Initialization And Device Operation}\label{sec:General Initialization And Device Operation}
 
 We start with an overview of device initialization, then expand on the
@@ -6847,6 +6849,8 @@ \chapter{Reserved Feature Bits}\label{sec:Reserved Feature Bits}
   that the driver can reset a queue individually.
   See \ref{sec:Basic Facilities of a Virtio Device / Virtqueues / Virtqueue Reset}.
 
+  \item[VIRTIO_F_ADMIN_VQ (41)] This feature indicates that an administration virtqueue is supported.
+
 \end{description}
 
 \drivernormative{\section}{Reserved Feature Bits}{Reserved Feature Bits}
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v3 2/4] Add miscellaneous configuration structure for PCI
  2022-02-03  7:57 [PATCH v3 0/4] VIRTIO: Provision maximum MSI-X vectors for a VF Max Gurtovoy
  2022-02-03  7:57 ` [PATCH v3 1/4] Add virtio Admin virtqueue Max Gurtovoy
@ 2022-02-03  7:57 ` Max Gurtovoy
  2022-02-03  7:57 ` [PATCH v3 3/4] Add device management facility Max Gurtovoy
  2022-02-03  7:57 ` [virtio-comment] [PATCH v3 4/4] Add support for MSI-X vectors configuration for PCI VFs Max Gurtovoy
  3 siblings, 0 replies; 43+ messages in thread
From: Max Gurtovoy @ 2022-02-03  7:57 UTC (permalink / raw)
  To: virtio-comment, mst, cohuck, virtio-dev, jasowang
  Cc: parav, shahafs, oren, stefanha, Max Gurtovoy

This new structure will be used for adding new miscellaneous registers
for a virtio device configuration layout.

For now, only admin_queue_index register is added. Admin virtqueue index
does not depend on the device type. Hence, add a PCI capability to read
the admin virtqueue index.

Reviewed-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
---
 conformance.tex |  2 ++
 content.tex     | 25 +++++++++++++++++++++++++
 2 files changed, 27 insertions(+)

diff --git a/conformance.tex b/conformance.tex
index 2e5d7e0..6ba4d94 100644
--- a/conformance.tex
+++ b/conformance.tex
@@ -102,6 +102,7 @@ \section{Conformance Targets}\label{sec:Conformance / Conformance Targets}
 \item \ref{drivernormative:Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / PCI configuration access capability}
 \item \ref{drivernormative:Virtio Transport Options / Virtio Over PCI Bus / PCI-specific Initialization And Device Operation / Device Initialization / MSI-X Vector Configuration}
 \item \ref{drivernormative:Virtio Transport Options / Virtio Over PCI Bus / PCI-specific Initialization And Device Operation / Notification of Device Configuration Changes}
+\item \ref{drivernormative:Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Miscellaneous configuration structure layout}
 \end{itemize}
 
 \conformance{\subsection}{MMIO Driver Conformance}\label{sec:Conformance / Driver Conformance / MMIO Driver Conformance}
@@ -363,6 +364,7 @@ \section{Conformance Targets}\label{sec:Conformance / Conformance Targets}
 \item \ref{devicenormative:Virtio Transport Options / Virtio Over PCI Bus / PCI-specific Initialization And Device Operation / Device Initialization / MSI-X Vector Configuration}
 \item \ref{devicenormative:Virtio Transport Options / Virtio Over PCI Bus / PCI-specific Initialization And Device Operation / Used Buffer Notifications}
 \item \ref{devicenormative:Virtio Transport Options / Virtio Over PCI Bus / PCI-specific Initialization And Device Operation / Notification of Device Configuration Changes}
+\item \ref{devicenormative:Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Miscellaneous configuration structure layout}
 \end{itemize}
 
 \conformance{\subsection}{MMIO Device Conformance}\label{sec:Conformance / Device Conformance / MMIO Device Conformance}
diff --git a/content.tex b/content.tex
index 163cb34..bf46192 100644
--- a/content.tex
+++ b/content.tex
@@ -712,6 +712,7 @@ \subsection{Virtio Structure PCI Capabilities}\label{sec:Virtio Transport Option
 \item ISR Status
 \item Device-specific configuration (optional)
 \item PCI configuration access
+\item Miscellaneous configuration
 \end{itemize}
 
 Each structure can be mapped by a Base Address register (BAR) belonging to
@@ -771,6 +772,8 @@ \subsection{Virtio Structure PCI Capabilities}\label{sec:Virtio Transport Option
 #define VIRTIO_PCI_CAP_SHARED_MEMORY_CFG 8
 /* Vendor-specific data */
 #define VIRTIO_PCI_CAP_VENDOR_CFG        9
+/* Miscellaneous configuration */
+#define VIRTIO_PCI_CAP_MISC_CFG          10
 \end{lstlisting}
 
         Any other value is reserved for future use.
@@ -1352,6 +1355,28 @@ \subsubsection{PCI configuration access capability}\label{sec:Virtio Transport O
 specified by some other Virtio Structure PCI Capability
 of type other than \field{VIRTIO_PCI_CAP_PCI_CFG}.
 
+\subsubsection{Miscellaneous configuration structure layout}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Miscellaneous configuration structure layout}
+
+The miscellaneous configuration structure is found at the bar and offset within the VIRTIO_PCI_CAP_MISC_CFG capability.
+Its layout is below.
+\begin{lstlisting}
+struct virtio_pci_misc_cfg {
+        le16 admin_queue_index;         /* read-only for driver */
+};
+\end{lstlisting}
+
+\begin{description}
+\item[\field{admin_queue_index}]
+        The device uses this to report the index of the admin virtqueue.
+        This field is valid only if VIRTIO_F_ADMIN_VQ is set.
+\end{description}
+
+\devicenormative{\paragraph}{Miscellaneous configuration structure layout}{Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Miscellaneous configuration structure layout}
+The device MUST present a valid \field{admin_queue_index} when VIRTIO_F_ADMIN_VQ is set.
+
+\drivernormative{\paragraph}{Miscellaneous configuration structure layout}{Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Miscellaneous configuration structure layout}
+The driver MUST use the value of \field{admin_queue_index} to configure the admin virtqueue. For more details on virtqueue configuration see section \ref{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI-specific Initialization And Device Operation / Device Initialization / Virtqueue Configuration}.
+
 \subsubsection{Legacy Interfaces: A Note on PCI Device Layout}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Legacy Interfaces: A Note on PCI Device Layout}
 
 Transitional devices MUST present part of configuration
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v3 3/4] Add device management facility
  2022-02-03  7:57 [PATCH v3 0/4] VIRTIO: Provision maximum MSI-X vectors for a VF Max Gurtovoy
  2022-02-03  7:57 ` [PATCH v3 1/4] Add virtio Admin virtqueue Max Gurtovoy
  2022-02-03  7:57 ` [PATCH v3 2/4] Add miscellaneous configuration structure for PCI Max Gurtovoy
@ 2022-02-03  7:57 ` Max Gurtovoy
  2022-02-03  7:57 ` [virtio-comment] [PATCH v3 4/4] Add support for MSI-X vectors configuration for PCI VFs Max Gurtovoy
  3 siblings, 0 replies; 43+ messages in thread
From: Max Gurtovoy @ 2022-02-03  7:57 UTC (permalink / raw)
  To: virtio-comment, mst, cohuck, virtio-dev, jasowang
  Cc: parav, shahafs, oren, stefanha, Max Gurtovoy

A virtio device may be capable of managing other virtio device
features and configuration. These management operations are composed of
one or more admin commands from the admin command set. One such
management interface for issuing these commands can be the virtio admin
virtqueue.

Reviewed-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
---
 content.tex | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/content.tex b/content.tex
index bf46192..276a29f 100644
--- a/content.tex
+++ b/content.tex
@@ -451,6 +451,20 @@ \section{Exporting Objects}\label{sec:Basic Facilities of a Virtio Device / Expo
 
 \input{admin.tex}
 
+\section{Device management}\label{sec:Basic Facilities of a Virtio Device / Device management}
+
+Virtualized environments might be composed of one or more virtio devices. These devices might be
+associated with each other. For example,  virtio PCI PF and its VFS are composing one virtio group.
+A PCI PF device, by default, is the primary device in the group, and its PCI VFs are the secondary devices in the group.
+A sophisticated primary device might have capabilities to manage its secondary devices. This primary
+device will be the virtio group manager and the secondary devices would be the managed devices in this group.
+
+The primary device will use the admin command set to manage secondary devices (see section
+\ref{sec:Basic Facilities of a Virtio Device / Admin command set} for more details). The primary device
+can use any management interface that is defined in the virtio specification for issuing admin commands.
+For example, the admin virtqueue (see section \ref{sec:Basic Facilities of a Virtio Device / Admin Virtqueues} for more details)
+of the PF can be used to configure some attributes (such as MSI-X vectors) for its VFs in order to improve resource utilization.
+
 \chapter{General Initialization And Device Operation}\label{sec:General Initialization And Device Operation}
 
 We start with an overview of device initialization, then expand on the
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [virtio-comment] [PATCH v3 4/4] Add support for MSI-X vectors configuration for PCI VFs
  2022-02-03  7:57 [PATCH v3 0/4] VIRTIO: Provision maximum MSI-X vectors for a VF Max Gurtovoy
                   ` (2 preceding siblings ...)
  2022-02-03  7:57 ` [PATCH v3 3/4] Add device management facility Max Gurtovoy
@ 2022-02-03  7:57 ` Max Gurtovoy
  3 siblings, 0 replies; 43+ messages in thread
From: Max Gurtovoy @ 2022-02-03  7:57 UTC (permalink / raw)
  To: virtio-comment, mst, cohuck, virtio-dev, jasowang
  Cc: parav, shahafs, oren, stefanha, Max Gurtovoy

A typical cloud provider SR-IOV use case is to create many VFs for
use by guest VMs. The VFs may not be assigned to a VM until a user
requests a VM of a certain size, e.g., number of CPUs. A VF may need
MSI-X vectors proportional to the number of CPUs in the VM, but there is
no standard way today in the spec to change the number of MSI-X vectors
supported by a VF, although there are some operating systems that
support this.

Introduce new admin commands for a generic interrupt vector management
for PCI VFs. For now, this mechanism will manage the MSI-X interrupt
vectors assignments of a VF by its parent PF.

These admin commands will be easily extended, if needed, for other types
of interrupt vectors in the future with backward compatibility to old
drivers and devices.

Reviewed-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
---
 admin.tex       | 162 +++++++++++++++++++++++++++++++++++++++++++++++-
 conformance.tex |   1 +
 content.tex     |  72 +++++++++++++++++++++
 3 files changed, 233 insertions(+), 2 deletions(-)

diff --git a/admin.tex b/admin.tex
index fa9c993..f3263bb 100644
--- a/admin.tex
+++ b/admin.tex
@@ -63,17 +63,33 @@ \section{Admin command set}\label{sec:Basic Facilities of a Virtio Device / Admi
 \hline \hline
 0000h   & VIRTIO_ADMIN_CAPS_IDENTIFY    & M  \\
 \hline
-0001h - 7FFFh   & Generic admin cmds    & -  \\
+0001h   & VIRTIO_ADMIN_PCI_SRIOV_ATTRS    & O  \\
+\hline
+0002h   & VIRTIO_ADMIN_PCI_VF_INTERRUPT_CONFIG_SET    & O  \\
+\hline
+0003h   & VIRTIO_ADMIN_PCI_VF_INTERRUPT_CONFIG_GET    & O  \\
+\hline
+0004h - 7FFFh   & Generic admin cmds    & -  \\
 \hline
 8000h - FFFFh   & Reserved    & - \\
 \hline
 \end{tabular}
 
+\drivernormative{\subsection}{Admin command set}{Basic Facilities of a Virtio Device / Admin command set}
+A driver SHOULD NOT issue VIRTIO_ADMIN_PCI_SRIOV_ATTRS command when VIRTIO_F_SR_IOV is not negotiated.
+
+A driver SHOULD NOT issue VIRTIO_ADMIN_PCI_VF_INTERRUPT_CONFIG_SET and VIRTIO_ADMIN_PCI_VF_INTERRUPT_CONFIG_GET
+commands when VIRTIO_F_SR_IOV is not negotiated or when PCI Single Root I/O Virtualization is disabled.
+
 \devicenormative{\subsection}{Admin command set}{Basic Facilities of a Virtio Device / Admin command set}
 A device that advertises VIRTIO_F_ADMIN_VQ capability MUST support all the mandatory admin commands.
 
 A device that advertises VIRTIO_F_ADMIN_VQ capability MAY support one or more optional admin commands.
 
+A device MUST fail VIRTIO_ADMIN_PCI_SRIOV_ATTRS command when VIRTIO_F_SR_IOV is not negotiated.
+
+A device MUST fail VIRTIO_ADMIN_PCI_VF_INTERRUPT_CONFIG_SET and VIRTIO_ADMIN_PCI_VF_INTERRUPT_CONFIG_GET
+commands when VIRTIO_F_SR_IOV is not negotiated or when PCI Single Root I/O Virtualization is disabled.
 
 \subsection{VIRTIO ADMIN CAPS IDENTIFY command}\label{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN CAPS IDENTIFY command}
 
@@ -83,7 +99,8 @@ \subsection{VIRTIO ADMIN CAPS IDENTIFY command}\label{sec:Basic Facilities of a
 \begin{lstlisting}
 struct virtio_admin_caps_identify_result {
        /*
-        * caps[0] - Bit 0 - Bit 7 are reserved
+        * caps[0] - Bit 0 - if set, VF MSI-X control supported
+        *           Bit 1 - Bit 7 are reserved
         * caps[1] - Bit 0 - Bit 7 are reserved
         * caps[2] - Bit 0 - Bit 7 are reserved
         * ....
@@ -92,3 +109,144 @@ \subsection{VIRTIO ADMIN CAPS IDENTIFY command}\label{sec:Basic Facilities of a
        u8 caps[8192];
 };
 \end{lstlisting}
+
+For more details on VF MSI-X configuration support see section \ref{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI-specific Admin command set / VF MSI-X control}.
+
+\subsection{VIRTIO ADMIN PCI SRIOV ATTRS command}\label{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN PCI SRIOV ATTRS command}
+
+The VIRTIO_ADMIN_PCI_SRIOV_ATTRS command has no command specific data set by the driver.
+This command upon success, returns a data buffer that describes information about PCI SRIOV
+related capabilities and attributes for the device. This command can be supported only by
+PCI devices that supports Single Root I/O Virtualization.
+This information is of form:
+\begin{lstlisting}
+struct virtio_admin_pci_sriov_attrs_result {
+        /* For compatibility - indicates which of the below fields are valid (1 means valid):
+         * Bit 0 - vfs_total_msix_count
+         * Bit 1 - vfs_assigned_msix_count
+         * Bit 2 - per_vf_max_msix_count
+         * Bits 3 - 63 - reserved for future fields
+         */
+        le64 attrs_mask;
+        /* Total number of msix vectors for the total number of VFs */
+        le32 vfs_total_msix_count;
+        /* Assigned number of msix vectors for the enabled VFs */
+        le32 vfs_assigned_msix_count;
+        /* Max number of msix vectors that can be assigned for a single VF */
+        le16 per_vf_max_msix_count;
+};
+\end{lstlisting}
+
+\subsection{VIRTIO ADMIN PCI VF INTERRUPT CONFIG SET command}\label{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN PCI VF INTERRUPT CONFIG SET command}
+
+The VIRTIO_ADMIN_PCI_VF_INTERRUPT_CONFIG_SET command is used to modify the interrupt vectors
+count for a PCI virtual function. The command specific data set by the driver is of form:
+\begin{lstlisting}
+struct virtio_admin_pci_vf_interrupt_config_set_data {
+        /* The virtual function number */
+        le16 vf_number;
+        /* For compatibility - indicates which of the below properties should be
+         * modified (1 means that field should be modified):
+         * Bit 0 - msix_count
+         * Bits 1 - 63 - reserved for future fields
+         */
+        le64 attrs_mask;
+        /* The amount of MSI-X interrupt vectors */
+        le16 msix_count;
+};
+\end{lstlisting}
+
+The following table describes the command specific error codes codes:
+
+\begin{tabular}{|l|l|l|}
+\hline
+Opcode & Status & Description \\
+\hline \hline
+00h   & VIRTIO_ADMIN_CS_ERR_PCI_VF_NUM_INVALID    & invalid VF number  \\
+\hline
+01h   & VIRTIO_ADMIN_CS_ERR_PCI_MSIX_COUNT_EXCEED    & MSI-X count exceed the max value per VF  \\
+\hline
+02h   & VIRTIO_ADMIN_CS_ERR_PCI_MSIX_NO_RSC    & Device don't have requested MSI-X count   \\
+\hline
+03h   & VIRTIO_ADMIN_CS_ERR_PCI_VF_IN_USE    & VF is already in use, operation failed   \\
+\hline
+04h - FFh   & Reserved    & -  \\
+\hline
+\end{tabular}
+
+\begin{note}
+{vf_number can't be greater than NumVFs value as defined in the PCI specification
+or smaller than 1. A command specific error status code VIRTIO_ADMIN_CS_ERR_PCI_VF_NUM_INVALID
+is returned when vf_number is out of the range.}
+\end{note}
+
+\begin{note}
+{A command specific error status code VIRTIO_ADMIN_CS_ERR_PCI_MSIX_COUNT_EXCEED
+is returned when the amount of MSI-X to assign exceed the maximum value that can be
+assigned to a single VF.}
+\end{note}
+
+\begin{note}
+{A command specific error status code VIRTIO_ADMIN_CS_ERR_PCI_MSIX_NO_RSC
+is returned when the device doesn't have requested number of MSI-X vectors free.}
+\end{note}
+
+This command has no command specific result set by the device. Upon success, the device guarantees
+that all the requested properties were modified to the given values. Otherwise, error will be returned.
+
+\begin{note}
+{Before setting msix_count property the virtual/managed device (VF) shall be un-initialized and MUST not be used by the driver.
+Otherwise, a command specific error status code VIRTIO_ADMIN_CS_ERR_PCI_VF_IN_USE will be returned.}
+\end{note}
+
+\subsection{VIRTIO ADMIN PCI VF INTERRUPT CONFIG GET command}\label{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN PCI VF INTERRUPT CONFIG GET command}
+
+The VIRTIO_ADMIN_PCI_VF_INTERRUPT_CONFIG_GET command is used to obtain the values of the VFs
+interrupt vectors configuration.
+The command specific data set by the driver is of form:
+\begin{lstlisting}
+struct virtio_admin_pci_vf_interrupt_config_get_data {
+        /* The virtual function number */
+        le16 vf_number;
+        /* For compatibility - indicates which of the below properties should be
+         * queried (1 means that field should be queried):
+         * Bit 0 - msix_count (The amount of MSI-X interrupt vectors)
+         * Bits 1 - 63 - reserved for future fields
+         */
+        le64 attrs_mask;
+};
+\end{lstlisting}
+
+The following table describes the command specific error codes codes:
+
+\begin{tabular}{|l|l|l|}
+\hline
+Opcode & Status & Description \\
+\hline \hline
+00h   & VIRTIO_ADMIN_CS_ERR_PCI_VF_NUM_INVALID    & invalid VF number  \\
+\hline
+01h - FFh   & Reserved    & -  \\
+\hline
+\end{tabular}
+
+\begin{note}
+{vf_number can't be greater than NumVFs value as defined in the PCI specification
+or smaller than 1. A command specific error status code VIRTIO_ADMIN_CS_ERR_PCI_VF_NUM_INVALID
+is returned when vf_number is out of the range.}
+\end{note}
+
+This command, upon success, returns a data buffer that describes the properties that were requested
+and their values for the subject virtio VF device according to the given vf_number.
+This information is of form:
+\begin{lstlisting}
+struct virtio_admin_pci_vf_interrupt_config_get_result {
+        /* For compatibility - indicates which of the below fields were returned
+         * (1 means that field was returned):
+         * Bit 0 - msix_count
+         * Bits 1 - 63 - reserved for future fields
+         */
+        le64 attrs_mask;
+        /* The amount of MSI-X interrupt vectors currently assigned to the VF */
+        le16 msix_count;
+};
+\end{lstlisting}
diff --git a/conformance.tex b/conformance.tex
index 6ba4d94..e438b27 100644
--- a/conformance.tex
+++ b/conformance.tex
@@ -87,6 +87,7 @@ \section{Conformance Targets}\label{sec:Conformance / Conformance Targets}
 \item \ref{drivernormative:General Initialization And Device Operation / Device Initialization}
 \item \ref{drivernormative:General Initialization And Device Operation / Device Cleanup}
 \item \ref{drivernormative:Reserved Feature Bits}
+\item \ref{drivernormative:Basic Facilities of a Virtio Device / Admin command set}
 \end{itemize}
 
 \conformance{\subsection}{PCI Driver Conformance}\label{sec:Conformance / Driver Conformance / PCI Driver Conformance}
diff --git a/content.tex b/content.tex
index 276a29f..a5e2035 100644
--- a/content.tex
+++ b/content.tex
@@ -1773,6 +1773,78 @@ \subsubsection{Driver Handling Interrupts}\label{sec:Virtio Transport Options /
     \end{itemize}
 \end{itemize}
 
+\subsection{PCI-specific Admin capabilities}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI-specific Admin capabilities}
+
+This documents the group of Admin capabilities for PCI virtio devices. Each capability is
+implemented using one or more Admin queue commands.
+
+\subsubsection{VF MSI-X control}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI-specific Admin command set / VF MSI-X control}
+
+This capability enables a virtio PCI PF device to control the assignment of MSI-X interrupt vectors
+for its managed VFs. Capable devices will need to set bit 0 of caps[0] in the result of VIRTIO_ADMIN_CAPS_IDENTIFY
+admin command. See section \ref{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN CAPS IDENTIFY command}
+for more details.
+
+Capable devices will also need to implement VIRTIO_ADMIN_PCI_SRIOV_ATTRS, VIRTIO_ADMIN_PCI_VF_INTERRUPT_CONFIG_SET and
+VIRTIO_ADMIN_PCI_VF_INTERRUPT_CONFIG_GET admin commands.
+
+In the result of VIRTIO_ADMIN_PCI_SRIOV_ATTRS admin command, a capable device will return the total number of
+msix vectors for its VFs in \field{vfs_total_msix_count} field, the number of already assigned msix vectors for its VFs in
+\field{vfs_assigned_msix_count} field and also the maximal number of msix vectors that can be assigned for a single VF in
+\field{per_vf_max_msix_count} field. In addition, bit 0, bit 1 and bit 2 are set to indicate on the validity of the other 3
+fields in the \field{attrs_mask} field of the result buffer.
+See section \ref{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN PCI SRIOV ATTRS command} for more
+details.
+
+A PCI PF device that supports VF MSI-X control capability will always allocate MSI-X vectors for its VFs from the device resources.
+The default assignment of the MSI-X vectors for the PCI VFs is out of the scope of this specification.
+A driver, using VIRTIO_ADMIN_PCI_VF_INTERRUPT_CONFIG_SET can update the MSI-X assignment for a specific VF.
+In the data of VIRTIO_ADMIN_PCI_VF_INTERRUPT_CONFIG_SET admin command, a driver set the virtual function number in
+\field{vf_number} and the amount of MSI-X interrupt vectors to configure to the subject virtual function in \field{msix_count}.
+In addition, bit 0 in the \field{attrs_mask} field is set. A successful operation guarantees that the requested
+amount of MSI-X interrupt vectors was assigned to the subject virtual function.
+Also, a successful operation guarantees that the MSI-X capability access by the subject PCI VF defined by the PCI specification must reflect
+the new configuration in all relevant fields. For example, by default if the PCI VF has been assigned 4 MSI-X vectors, and VIRTIO_ADMIN_PCI_VF_INTERRUPT_CONFIG_SET
+increases the MSI-X vectors to 8; on this change, reading Table size field of the MSI-X message control register will reflect a value of 7.
+See section \ref{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN PCI VF INTERRUPT CONFIG SET command} for more details.
+
+It is beyond the scope of the virtio specification to define necessary synchronization in system software to ensure that a virtio PCI VF device
+interrupt configuration modification is reflected in the PCI device. However, it is expected that any modern system software implementing virtio
+drivers and PCI subsystem will ensure that any changes occurring in the VF interrupt configuration is either updated in the PCI VF device or
+such configuration fails.
+
+In the data of VIRTIO_ADMIN_PCI_VF_INTERRUPT_CONFIG_GET admin command, a driver will set the virtual function number in
+\field{vf_number}. In addition, bit 0 in the \field{attrs_mask} field is set to indicate requested output fields in
+the result from the device. In the result of a successful operation, the amount of MSI-X interrupt vectors that is currently
+assigned to the subject virtual function is returned by the device in \field{msix_count} field. In addition, bit 0 in the \field{attrs_mask} field is set by the device
+to indicate on the validity of \field{msix_count} field.
+See section \ref{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN PCI VF INTERRUPT CONFIG GET command} for more details.
+
+\paragraph{MSI-X configuration sequence example}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI-specific Admin command set / VF MSI-X control / MSI-X configuration sequence example }
+
+A typical sequence for configuring MSI-X vectors for PCI VFs using VF MSI-X control mechanism is following:
+
+\begin{enumerate}
+\item Ensure that driver doesn't run and it is safe to change MSI-X (e.g. disable sriov auto probing)
+
+\item Load the PF driver
+
+\item Enable SR-IOV by following the PCI specification
+
+\item Query the SR-IOV attributes using command VIRTIO_ADMIN_PCI_SRIOV_ATTRS
+
+\item Query the VF MSI-X configuration using command VIRTIO_ADMIN_PCI_VF_INTERRUPT_CONFIG_GET
+
+\item Assign desired MSI-X configuration for the VF using command VIRTIO_ADMIN_PCI_VF_INTERRUPT_CONFIG_SET
+
+\item After successful completion of the assignment, load the VF driver
+
+\item Assign the VF to a VM
+
+\item After VM/VF use is completed, user assigns different MSI-X value and repeats the steps of 6 onwards
+
+\end{enumerate}
+
 \section{Virtio Over MMIO}\label{sec:Virtio Transport Options / Virtio Over MMIO}
 
 Virtual environments without PCI support (a common situation in
-- 
2.21.0


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [virtio-dev] Re: [PATCH v3 1/4] Add virtio Admin virtqueue
  2022-02-03  7:57 ` [PATCH v3 1/4] Add virtio Admin virtqueue Max Gurtovoy
@ 2022-02-03 13:09   ` Cornelia Huck
  2022-02-07 10:14     ` Max Gurtovoy
  2022-02-07 10:39   ` Michael S. Tsirkin
  1 sibling, 1 reply; 43+ messages in thread
From: Cornelia Huck @ 2022-02-03 13:09 UTC (permalink / raw)
  To: Max Gurtovoy, virtio-comment, mst, virtio-dev, jasowang
  Cc: parav, shahafs, oren, stefanha, Max Gurtovoy

On Thu, Feb 03 2022, Max Gurtovoy <mgurtovoy@nvidia.com> wrote:

> In one of the many use cases a user wants to manipulate features and
> configuration of the virtio devices regardless of the device type
> (net/block/console). Some of this configuration is generic enough. i.e
> Number of MSI-X vectors of a virtio PCI VF device. There is a need to do
> such features query and manipulation by its parent PCI PF.
>
> Currently virtio specification defines control virtqueue to manipulate
> features and configuration of the device it operates on. However,
> control virtqueue commands are device type specific, which makes it very
> difficult to extend for device agnostic commands.
>
> To support this requirement in elegant way, this patch introduces a new
> admin virtqueue interface. Admin virtqueue is proposed as one of the
> interfaces to issue admin commands that have the same command format for
> all type of virtio devices.
>
> Manipulate features via admin virtqueue is asynchronous, scalable, easy
> to extend and doesn't require additional and expensive on-die resources
> to be allocated for every new feature that will be added in the future.
>
> Subsequent patches make use of this admin virtqueue.
>
> Reviewed-by: Parav Pandit <parav@nvidia.com>
> Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
> ---
>  admin.tex       | 94 +++++++++++++++++++++++++++++++++++++++++++++++++
>  conformance.tex |  1 +
>  content.tex     |  8 +++--
>  3 files changed, 101 insertions(+), 2 deletions(-)
>  create mode 100644 admin.tex
>
> diff --git a/admin.tex b/admin.tex
> new file mode 100644
> index 0000000..fa9c993
> --- /dev/null
> +++ b/admin.tex
> @@ -0,0 +1,94 @@

[some wording only, have not yet thought about the rest]

> +\section{Admin Virtqueues}\label{sec:Basic Facilities of a Virtio Device / Admin Virtqueues}
> +
> +Admin virtqueue is one of the management interface that used to send administrative

"An admin virtqueue is a management interface of a device that can be used..."

> +commands to manipulate various features of the device and/or to manipulate
> +various features, if possible, of another device within the same group (e.g. PCI VFs

Maybe add

"Which devices are actually considered a group is transport specific."

?

> +of a parent PCI PF device are grouped together. These devices can be optionally
> +managed by its parent PCI PF using its admin virtqueue.).

I would move the content in the brackets to a separate paragraph. Maybe

"An example of a group is PCI virtual functions (VFs) being grouped
together with their parent PCI physical function (PF). These VFs can be
optionally managed by their parent PF using its admin virtqueue."

> +
> +An admin virtqueue exists for a certain device if VIRTIO_F_ADMIN_VQ is
> +negotiated. The index of the admin virtqueue exposed by the device in a

s/exposed/is exposed/

> +transport specific manner.
> +
> +When VIRTIO_F_ADMIN_VQ is negotiated with the device, driver will send all admin commands
> +through the admin virtqueue.

That sounds a bit like the driver might use an alternative interface for
the admin commands as well? What about

"If VIRTIO_F_ADMIN_VQ has been negotiated, the driver used the admin
virtqueue to send admin commands."


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v3 1/4] Add virtio Admin virtqueue
  2022-02-03 13:09   ` [virtio-dev] " Cornelia Huck
@ 2022-02-07 10:14     ` Max Gurtovoy
  2022-02-07 10:28       ` Michael S. Tsirkin
  0 siblings, 1 reply; 43+ messages in thread
From: Max Gurtovoy @ 2022-02-07 10:14 UTC (permalink / raw)
  To: Cornelia Huck, virtio-comment, mst, virtio-dev, jasowang
  Cc: parav, shahafs, oren, stefanha


On 2/3/2022 3:09 PM, Cornelia Huck wrote:
> On Thu, Feb 03 2022, Max Gurtovoy <mgurtovoy@nvidia.com> wrote:
>
>> In one of the many use cases a user wants to manipulate features and
>> configuration of the virtio devices regardless of the device type
>> (net/block/console). Some of this configuration is generic enough. i.e
>> Number of MSI-X vectors of a virtio PCI VF device. There is a need to do
>> such features query and manipulation by its parent PCI PF.
>>
>> Currently virtio specification defines control virtqueue to manipulate
>> features and configuration of the device it operates on. However,
>> control virtqueue commands are device type specific, which makes it very
>> difficult to extend for device agnostic commands.
>>
>> To support this requirement in elegant way, this patch introduces a new
>> admin virtqueue interface. Admin virtqueue is proposed as one of the
>> interfaces to issue admin commands that have the same command format for
>> all type of virtio devices.
>>
>> Manipulate features via admin virtqueue is asynchronous, scalable, easy
>> to extend and doesn't require additional and expensive on-die resources
>> to be allocated for every new feature that will be added in the future.
>>
>> Subsequent patches make use of this admin virtqueue.
>>
>> Reviewed-by: Parav Pandit <parav@nvidia.com>
>> Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
>> ---
>>   admin.tex       | 94 +++++++++++++++++++++++++++++++++++++++++++++++++
>>   conformance.tex |  1 +
>>   content.tex     |  8 +++--
>>   3 files changed, 101 insertions(+), 2 deletions(-)
>>   create mode 100644 admin.tex
>>
>> diff --git a/admin.tex b/admin.tex
>> new file mode 100644
>> index 0000000..fa9c993
>> --- /dev/null
>> +++ b/admin.tex
>> @@ -0,0 +1,94 @@
> [some wording only, have not yet thought about the rest]
>
>> +\section{Admin Virtqueues}\label{sec:Basic Facilities of a Virtio Device / Admin Virtqueues}
>> +
>> +Admin virtqueue is one of the management interface that used to send administrative
> "An admin virtqueue is a management interface of a device that can be used..."

OK thanks.

>
>> +commands to manipulate various features of the device and/or to manipulate
>> +various features, if possible, of another device within the same group (e.g. PCI VFs
> Maybe add
>
> "Which devices are actually considered a group is transport specific."
>
> ?

Not sure we want to restrict ourselves for that.

>
>> +of a parent PCI PF device are grouped together. These devices can be optionally
>> +managed by its parent PCI PF using its admin virtqueue.).
> I would move the content in the brackets to a separate paragraph. Maybe
>
> "An example of a group is PCI virtual functions (VFs) being grouped
> together with their parent PCI physical function (PF). These VFs can be
> optionally managed by their parent PF using its admin virtqueue."

Sounds good.


>
>> +
>> +An admin virtqueue exists for a certain device if VIRTIO_F_ADMIN_VQ is
>> +negotiated. The index of the admin virtqueue exposed by the device in a
> s/exposed/is exposed/
thanks.
>
>> +transport specific manner.
>> +
>> +When VIRTIO_F_ADMIN_VQ is negotiated with the device, driver will send all admin commands
>> +through the admin virtqueue.
> That sounds a bit like the driver might use an alternative interface for
> the admin commands as well? What about

Yes if there will be an alternative for AQ and this feature bit will not 
be negotiated so the driver will use a different channel.

This was explicitly discussed in previous versions.

What is the issue with this assumption ?

>
> "If VIRTIO_F_ADMIN_VQ has been negotiated, the driver used the admin
> virtqueue to send admin commands."

maybe:

"If VIRTIO_F_ADMIN_VQ has been negotiated, the driver will use the admin 
virtqueue to send
all admin commands."


>


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v3 1/4] Add virtio Admin virtqueue
  2022-02-07 10:14     ` Max Gurtovoy
@ 2022-02-07 10:28       ` Michael S. Tsirkin
  2022-02-07 11:51         ` [virtio-dev] " Cornelia Huck
  0 siblings, 1 reply; 43+ messages in thread
From: Michael S. Tsirkin @ 2022-02-07 10:28 UTC (permalink / raw)
  To: Max Gurtovoy
  Cc: Cornelia Huck, virtio-comment, virtio-dev, jasowang, parav,
	shahafs, oren, stefanha

On Mon, Feb 07, 2022 at 12:14:33PM +0200, Max Gurtovoy wrote:
> 
> On 2/3/2022 3:09 PM, Cornelia Huck wrote:
> > On Thu, Feb 03 2022, Max Gurtovoy <mgurtovoy@nvidia.com> wrote:
> > 
> > > In one of the many use cases a user wants to manipulate features and
> > > configuration of the virtio devices regardless of the device type
> > > (net/block/console). Some of this configuration is generic enough. i.e
> > > Number of MSI-X vectors of a virtio PCI VF device. There is a need to do
> > > such features query and manipulation by its parent PCI PF.
> > > 
> > > Currently virtio specification defines control virtqueue to manipulate
> > > features and configuration of the device it operates on. However,
> > > control virtqueue commands are device type specific, which makes it very
> > > difficult to extend for device agnostic commands.
> > > 
> > > To support this requirement in elegant way, this patch introduces a new
> > > admin virtqueue interface. Admin virtqueue is proposed as one of the
> > > interfaces to issue admin commands that have the same command format for
> > > all type of virtio devices.
> > > 
> > > Manipulate features via admin virtqueue is asynchronous, scalable, easy
> > > to extend and doesn't require additional and expensive on-die resources
> > > to be allocated for every new feature that will be added in the future.
> > > 
> > > Subsequent patches make use of this admin virtqueue.
> > > 
> > > Reviewed-by: Parav Pandit <parav@nvidia.com>
> > > Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
> > > ---
> > >   admin.tex       | 94 +++++++++++++++++++++++++++++++++++++++++++++++++
> > >   conformance.tex |  1 +
> > >   content.tex     |  8 +++--
> > >   3 files changed, 101 insertions(+), 2 deletions(-)
> > >   create mode 100644 admin.tex
> > > 
> > > diff --git a/admin.tex b/admin.tex
> > > new file mode 100644
> > > index 0000000..fa9c993
> > > --- /dev/null
> > > +++ b/admin.tex
> > > @@ -0,0 +1,94 @@
> > [some wording only, have not yet thought about the rest]
> > 
> > > +\section{Admin Virtqueues}\label{sec:Basic Facilities of a Virtio Device / Admin Virtqueues}
> > > +
> > > +Admin virtqueue is one of the management interface that used to send administrative
> > "An admin virtqueue is a management interface of a device that can be used..."
> 
> OK thanks.
> 
> > 
> > > +commands to manipulate various features of the device and/or to manipulate
> > > +various features, if possible, of another device within the same group (e.g. PCI VFs
> > Maybe add
> > 
> > "Which devices are actually considered a group is transport specific."
> > 
> > ?
> 
> Not sure we want to restrict ourselves for that.

do restrict this please, if we want to extend the scope we can
always do that down the road.

> > 
> > > +of a parent PCI PF device are grouped together. These devices can be optionally
> > > +managed by its parent PCI PF using its admin virtqueue.).
> > I would move the content in the brackets to a separate paragraph. Maybe
> > 
> > "An example of a group is PCI virtual functions (VFs) being grouped
> > together with their parent PCI physical function (PF). These VFs can be
> > optionally managed by their parent PF using its admin virtqueue."
> 
> Sounds good.
> 
> 
> > 
> > > +
> > > +An admin virtqueue exists for a certain device if VIRTIO_F_ADMIN_VQ is
> > > +negotiated. The index of the admin virtqueue exposed by the device in a
> > s/exposed/is exposed/
> thanks.
> > 
> > > +transport specific manner.
> > > +
> > > +When VIRTIO_F_ADMIN_VQ is negotiated with the device, driver will send all admin commands
> > > +through the admin virtqueue.
> > That sounds a bit like the driver might use an alternative interface for
> > the admin commands as well? What about
> 
> Yes if there will be an alternative for AQ and this feature bit will not be
> negotiated so the driver will use a different channel.
> 
> This was explicitly discussed in previous versions.
> 
> What is the issue with this assumption ?

it's not an issue down the road but we want to be clear
that right now that is the only way, to make sure
reader does not waste time looking for more ways
in the spec. maybe just say so.

> > 
> > "If VIRTIO_F_ADMIN_VQ has been negotiated, the driver used the admin
> > virtqueue to send admin commands."
> 
> maybe:
> 
> "If VIRTIO_F_ADMIN_VQ has been negotiated, the driver will use the admin
> virtqueue to send
> all admin commands."
> 
> 
> > 


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v3 1/4] Add virtio Admin virtqueue
  2022-02-03  7:57 ` [PATCH v3 1/4] Add virtio Admin virtqueue Max Gurtovoy
  2022-02-03 13:09   ` [virtio-dev] " Cornelia Huck
@ 2022-02-07 10:39   ` Michael S. Tsirkin
  2022-02-07 14:58     ` Max Gurtovoy
  2022-02-08  6:25     ` Parav Pandit
  1 sibling, 2 replies; 43+ messages in thread
From: Michael S. Tsirkin @ 2022-02-07 10:39 UTC (permalink / raw)
  To: Max Gurtovoy
  Cc: virtio-comment, cohuck, virtio-dev, jasowang, parav, shahafs,
	oren, stefanha

On Thu, Feb 03, 2022 at 09:57:13AM +0200, Max Gurtovoy wrote:
> +\begin{lstlisting}
> +struct virtio_admin_cmd {
> +        /* Device-readable part */
> +        u16 command;
> +        u8 command_specific_data[];
> +
> +        /* Device-writable part */
> +        u8 status;
> +        u8 command_specific_error;
> +        u8 command_specific_result[];
> +};
> +\end{lstlisting}

ok this abstraction is an improvement, thanks!

What I'd like to see is moving a bit more format to this generic structure.

From what I could gather, some commands affect a group as a whole, and
some commands just a single member of the group. We could have a
"destination" field for that, and a special "all of the group"
destination for commands affecting the whole group.


Next, trying to think about scalable iov extensions. So we
will have groups of VFs and then SFs as the next level.
How does one differentiate between the two?
Maybe reserve a field for "destination type"?


The point of all this is to allow making sense of commands and
e.g. virtualizing them for nested virt without necessarily
knowing all of the detail about the specific command.

-- 
MST


^ permalink raw reply	[flat|nested] 43+ messages in thread

* [virtio-dev] Re: [PATCH v3 1/4] Add virtio Admin virtqueue
  2022-02-07 10:28       ` Michael S. Tsirkin
@ 2022-02-07 11:51         ` Cornelia Huck
  2022-02-07 14:34           ` Max Gurtovoy
  0 siblings, 1 reply; 43+ messages in thread
From: Cornelia Huck @ 2022-02-07 11:51 UTC (permalink / raw)
  To: Michael S. Tsirkin, Max Gurtovoy
  Cc: virtio-comment, virtio-dev, jasowang, parav, shahafs, oren, stefanha

On Mon, Feb 07 2022, "Michael S. Tsirkin" <mst@redhat.com> wrote:

> On Mon, Feb 07, 2022 at 12:14:33PM +0200, Max Gurtovoy wrote:
>> 
>> On 2/3/2022 3:09 PM, Cornelia Huck wrote:
>> > On Thu, Feb 03 2022, Max Gurtovoy <mgurtovoy@nvidia.com> wrote:

>> > > +commands to manipulate various features of the device and/or to manipulate
>> > > +various features, if possible, of another device within the same group (e.g. PCI VFs
>> > Maybe add
>> > 
>> > "Which devices are actually considered a group is transport specific."
>> > 
>> > ?
>> 
>> Not sure we want to restrict ourselves for that.
>
> do restrict this please, if we want to extend the scope we can
> always do that down the road.

I'm also not sure how grouping can _not_ be transport specific... the
PF/VF example is obviously a pci thing; for ccw, in a non-virtio
context, there's sometimes the concept of some subchannels/devices being
grouped together with no clear hierarchy, and for mmio, I don't really
have an idea how "grouping" might work there.

>> > > +When VIRTIO_F_ADMIN_VQ is negotiated with the device, driver will send all admin commands
>> > > +through the admin virtqueue.
>> > That sounds a bit like the driver might use an alternative interface for
>> > the admin commands as well? What about
>> 
>> Yes if there will be an alternative for AQ and this feature bit will not be
>> negotiated so the driver will use a different channel.
>> 
>> This was explicitly discussed in previous versions.
>> 
>> What is the issue with this assumption ?
>
> it's not an issue down the road but we want to be clear
> that right now that is the only way, to make sure
> reader does not waste time looking for more ways
> in the spec. maybe just say so.

Yes, we should be explicit that admin commands are independent of the
conduit they are using, and that currently the only conduit is the admin
vq. Someone reading the spec does not know about previous discussions on
the mailing list.

Maybe reorder this? First have a section where the admin commands are
defined, and then have a section that lists the different channels admin
commands can use, where the admin vq is the only one currently
supported?


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v3 1/4] Add virtio Admin virtqueue
  2022-02-07 11:51         ` [virtio-dev] " Cornelia Huck
@ 2022-02-07 14:34           ` Max Gurtovoy
  2022-02-07 15:08             ` [virtio-comment] " Cornelia Huck
  0 siblings, 1 reply; 43+ messages in thread
From: Max Gurtovoy @ 2022-02-07 14:34 UTC (permalink / raw)
  To: Cornelia Huck, Michael S. Tsirkin
  Cc: virtio-comment, virtio-dev, jasowang, parav, shahafs, oren, stefanha


On 2/7/2022 1:51 PM, Cornelia Huck wrote:
> On Mon, Feb 07 2022, "Michael S. Tsirkin" <mst@redhat.com> wrote:
>
>> On Mon, Feb 07, 2022 at 12:14:33PM +0200, Max Gurtovoy wrote:
>>> On 2/3/2022 3:09 PM, Cornelia Huck wrote:
>>>> On Thu, Feb 03 2022, Max Gurtovoy <mgurtovoy@nvidia.com> wrote:
>>>>> +commands to manipulate various features of the device and/or to manipulate
>>>>> +various features, if possible, of another device within the same group (e.g. PCI VFs
>>>> Maybe add
>>>>
>>>> "Which devices are actually considered a group is transport specific."
>>>>
>>>> ?
>>> Not sure we want to restrict ourselves for that.
>> do restrict this please, if we want to extend the scope we can
>> always do that down the road.
> I'm also not sure how grouping can _not_ be transport specific... the
> PF/VF example is obviously a pci thing; for ccw, in a non-virtio
> context, there's sometimes the concept of some subchannels/devices being
> grouped together with no clear hierarchy, and for mmio, I don't really
> have an idea how "grouping" might work there.

Yes today it's transport specific.

But if one day there will be a definition for virtio fabric (over 
TCP/RDMA) it might not be true.

>>>>> +When VIRTIO_F_ADMIN_VQ is negotiated with the device, driver will send all admin commands
>>>>> +through the admin virtqueue.
>>>> That sounds a bit like the driver might use an alternative interface for
>>>> the admin commands as well? What about
>>> Yes if there will be an alternative for AQ and this feature bit will not be
>>> negotiated so the driver will use a different channel.
>>>
>>> This was explicitly discussed in previous versions.
>>>
>>> What is the issue with this assumption ?
>> it's not an issue down the road but we want to be clear
>> that right now that is the only way, to make sure
>> reader does not waste time looking for more ways
>> in the spec. maybe just say so.
> Yes, we should be explicit that admin commands are independent of the
> conduit they are using, and that currently the only conduit is the admin
> vq. Someone reading the spec does not know about previous discussions on
> the mailing list.
>
> Maybe reorder this? First have a section where the admin commands are
> defined, and then have a section that lists the different channels admin
> commands can use, where the admin vq is the only one currently
> supported?

I'll reorder it for V4.


>


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v3 1/4] Add virtio Admin virtqueue
  2022-02-07 10:39   ` Michael S. Tsirkin
@ 2022-02-07 14:58     ` Max Gurtovoy
  2022-02-07 16:18       ` Michael S. Tsirkin
  2022-02-08  6:25     ` Parav Pandit
  1 sibling, 1 reply; 43+ messages in thread
From: Max Gurtovoy @ 2022-02-07 14:58 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: virtio-comment, cohuck, virtio-dev, jasowang, parav, shahafs,
	oren, stefanha


On 2/7/2022 12:39 PM, Michael S. Tsirkin wrote:
> On Thu, Feb 03, 2022 at 09:57:13AM +0200, Max Gurtovoy wrote:
>> +\begin{lstlisting}
>> +struct virtio_admin_cmd {
>> +        /* Device-readable part */
>> +        u16 command;
>> +        u8 command_specific_data[];
>> +
>> +        /* Device-writable part */
>> +        u8 status;
>> +        u8 command_specific_error;
>> +        u8 command_specific_result[];
>> +};
>> +\end{lstlisting}
> ok this abstraction is an improvement, thanks!
>
> What I'd like to see is moving a bit more format to this generic structure.
>
>  From what I could gather, some commands affect a group as a whole, and
> some commands just a single member of the group. We could have a
> "destination" field for that, and a special "all of the group"
> destination for commands affecting the whole group.
>
>
> Next, trying to think about scalable iov extensions. So we
> will have groups of VFs and then SFs as the next level.
> How does one differentiate between the two?
> Maybe reserve a field for "destination type"?

For now we have only a PCI group that composed of VFs and the PF.

What you suggest, IMO is a definition of a generic virtio 
group/subsystem that I've mentioned in the discussion of V1.

Once we have virtio group - it should have a group id and them the admin 
command can have a field calld group_id for commands that are targeted 
to the whole group.

Some commands are referring to a specific device in the group so only a 
vdev_id is needed.

Some commands are even targeted to the same device to query some info 
(we have examples in this series for that), so in this case there is no 
need for vdev_id nor group_id.

So I'm sure sure we can improve common virtio_admin_cmd structure to 
have these attributes since they are not mandatory because of the 
reasons I've mentioned.

>
> The point of all this is to allow making sense of commands and
> e.g. virtualizing them for nested virt without necessarily
> knowing all of the detail about the specific command.
I don't understand this, sorry.


^ permalink raw reply	[flat|nested] 43+ messages in thread

* [virtio-comment] Re: [PATCH v3 1/4] Add virtio Admin virtqueue
  2022-02-07 14:34           ` Max Gurtovoy
@ 2022-02-07 15:08             ` Cornelia Huck
  2022-02-07 16:19               ` Michael S. Tsirkin
  0 siblings, 1 reply; 43+ messages in thread
From: Cornelia Huck @ 2022-02-07 15:08 UTC (permalink / raw)
  To: Max Gurtovoy, Michael S. Tsirkin
  Cc: virtio-comment, virtio-dev, jasowang, parav, shahafs, oren, stefanha

On Mon, Feb 07 2022, Max Gurtovoy <mgurtovoy@nvidia.com> wrote:

> On 2/7/2022 1:51 PM, Cornelia Huck wrote:
>> On Mon, Feb 07 2022, "Michael S. Tsirkin" <mst@redhat.com> wrote:
>>
>>> On Mon, Feb 07, 2022 at 12:14:33PM +0200, Max Gurtovoy wrote:
>>>> On 2/3/2022 3:09 PM, Cornelia Huck wrote:
>>>>> On Thu, Feb 03 2022, Max Gurtovoy <mgurtovoy@nvidia.com> wrote:
>>>>>> +commands to manipulate various features of the device and/or to manipulate
>>>>>> +various features, if possible, of another device within the same group (e.g. PCI VFs
>>>>> Maybe add
>>>>>
>>>>> "Which devices are actually considered a group is transport specific."
>>>>>
>>>>> ?
>>>> Not sure we want to restrict ourselves for that.
>>> do restrict this please, if we want to extend the scope we can
>>> always do that down the road.
>> I'm also not sure how grouping can _not_ be transport specific... the
>> PF/VF example is obviously a pci thing; for ccw, in a non-virtio
>> context, there's sometimes the concept of some subchannels/devices being
>> grouped together with no clear hierarchy, and for mmio, I don't really
>> have an idea how "grouping" might work there.
>
> Yes today it's transport specific.
>
> But if one day there will be a definition for virtio fabric (over 
> TCP/RDMA) it might not be true.

I don't think that is contadictory; we can simply extend the meaning of
what "grouping" means when needed.


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v3 1/4] Add virtio Admin virtqueue
  2022-02-07 14:58     ` Max Gurtovoy
@ 2022-02-07 16:18       ` Michael S. Tsirkin
  2022-02-08  0:41         ` Max Gurtovoy
  0 siblings, 1 reply; 43+ messages in thread
From: Michael S. Tsirkin @ 2022-02-07 16:18 UTC (permalink / raw)
  To: Max Gurtovoy
  Cc: virtio-comment, cohuck, virtio-dev, jasowang, parav, shahafs,
	oren, stefanha

On Mon, Feb 07, 2022 at 04:58:19PM +0200, Max Gurtovoy wrote:
> 
> On 2/7/2022 12:39 PM, Michael S. Tsirkin wrote:
> > On Thu, Feb 03, 2022 at 09:57:13AM +0200, Max Gurtovoy wrote:
> > > +\begin{lstlisting}
> > > +struct virtio_admin_cmd {
> > > +        /* Device-readable part */
> > > +        u16 command;
> > > +        u8 command_specific_data[];
> > > +
> > > +        /* Device-writable part */
> > > +        u8 status;
> > > +        u8 command_specific_error;
> > > +        u8 command_specific_result[];
> > > +};
> > > +\end{lstlisting}
> > ok this abstraction is an improvement, thanks!
> > 
> > What I'd like to see is moving a bit more format to this generic structure.
> > 
> >  From what I could gather, some commands affect a group as a whole, and
> > some commands just a single member of the group. We could have a
> > "destination" field for that, and a special "all of the group"
> > destination for commands affecting the whole group.
> > 
> > 
> > Next, trying to think about scalable iov extensions. So we
> > will have groups of VFs and then SFs as the next level.
> > How does one differentiate between the two?
> > Maybe reserve a field for "destination type"?
> 
> For now we have only a PCI group that composed of VFs and the PF.
> 
> What you suggest, IMO is a definition of a generic virtio group/subsystem
> that I've mentioned in the discussion of V1.
> 
> Once we have virtio group - it should have a group id and them the admin
> command can have a field calld group_id for commands that are targeted to
> the whole group.
> 
> Some commands are referring to a specific device in the group so only a
> vdev_id is needed.
> 
> Some commands are even targeted to the same device to query some info (we
> have examples in this series for that), so in this case there is no need for
> vdev_id nor group_id.
> 
> So I'm sure sure we can improve common virtio_admin_cmd structure to have
> these attributes since they are not mandatory because of the reasons I've
> mentioned.

I'm not sure I understand 100%, but try to address in the next
revision and we'll discuss.

> > 
> > The point of all this is to allow making sense of commands and
> > e.g. virtualizing them for nested virt without necessarily
> > knowing all of the detail about the specific command.
> I don't understand this, sorry.

Basically try to move stuff into generic format so it's possible
to understand things without knowing detail of the command.

-- 
MST


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v3 1/4] Add virtio Admin virtqueue
  2022-02-07 15:08             ` [virtio-comment] " Cornelia Huck
@ 2022-02-07 16:19               ` Michael S. Tsirkin
  0 siblings, 0 replies; 43+ messages in thread
From: Michael S. Tsirkin @ 2022-02-07 16:19 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Max Gurtovoy, virtio-comment, virtio-dev, jasowang, parav,
	shahafs, oren, stefanha

On Mon, Feb 07, 2022 at 04:08:41PM +0100, Cornelia Huck wrote:
> On Mon, Feb 07 2022, Max Gurtovoy <mgurtovoy@nvidia.com> wrote:
> 
> > On 2/7/2022 1:51 PM, Cornelia Huck wrote:
> >> On Mon, Feb 07 2022, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> >>
> >>> On Mon, Feb 07, 2022 at 12:14:33PM +0200, Max Gurtovoy wrote:
> >>>> On 2/3/2022 3:09 PM, Cornelia Huck wrote:
> >>>>> On Thu, Feb 03 2022, Max Gurtovoy <mgurtovoy@nvidia.com> wrote:
> >>>>>> +commands to manipulate various features of the device and/or to manipulate
> >>>>>> +various features, if possible, of another device within the same group (e.g. PCI VFs
> >>>>> Maybe add
> >>>>>
> >>>>> "Which devices are actually considered a group is transport specific."
> >>>>>
> >>>>> ?
> >>>> Not sure we want to restrict ourselves for that.
> >>> do restrict this please, if we want to extend the scope we can
> >>> always do that down the road.
> >> I'm also not sure how grouping can _not_ be transport specific... the
> >> PF/VF example is obviously a pci thing; for ccw, in a non-virtio
> >> context, there's sometimes the concept of some subchannels/devices being
> >> grouped together with no clear hierarchy, and for mmio, I don't really
> >> have an idea how "grouping" might work there.
> >
> > Yes today it's transport specific.
> >
> > But if one day there will be a definition for virtio fabric (over 
> > TCP/RDMA) it might not be true.
> 
> I don't think that is contadictory; we can simply extend the meaning of
> what "grouping" means when needed.

Yes but as long it is transport specific I don't see why not
call this out.

-- 
MST


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v3 1/4] Add virtio Admin virtqueue
  2022-02-07 16:18       ` Michael S. Tsirkin
@ 2022-02-08  0:41         ` Max Gurtovoy
  2022-02-08  6:45           ` Michael S. Tsirkin
  0 siblings, 1 reply; 43+ messages in thread
From: Max Gurtovoy @ 2022-02-08  0:41 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: virtio-comment, cohuck, virtio-dev, jasowang, parav, shahafs,
	oren, stefanha


On 2/7/2022 6:18 PM, Michael S. Tsirkin wrote:
> On Mon, Feb 07, 2022 at 04:58:19PM +0200, Max Gurtovoy wrote:
>> On 2/7/2022 12:39 PM, Michael S. Tsirkin wrote:
>>> On Thu, Feb 03, 2022 at 09:57:13AM +0200, Max Gurtovoy wrote:
>>>> +\begin{lstlisting}
>>>> +struct virtio_admin_cmd {
>>>> +        /* Device-readable part */
>>>> +        u16 command;
>>>> +        u8 command_specific_data[];
>>>> +
>>>> +        /* Device-writable part */
>>>> +        u8 status;
>>>> +        u8 command_specific_error;
>>>> +        u8 command_specific_result[];
>>>> +};
>>>> +\end{lstlisting}
>>> ok this abstraction is an improvement, thanks!
>>>
>>> What I'd like to see is moving a bit more format to this generic structure.
>>>
>>>   From what I could gather, some commands affect a group as a whole, and
>>> some commands just a single member of the group. We could have a
>>> "destination" field for that, and a special "all of the group"
>>> destination for commands affecting the whole group.
>>>
>>>
>>> Next, trying to think about scalable iov extensions. So we
>>> will have groups of VFs and then SFs as the next level.
>>> How does one differentiate between the two?
>>> Maybe reserve a field for "destination type"?
>> For now we have only a PCI group that composed of VFs and the PF.
>>
>> What you suggest, IMO is a definition of a generic virtio group/subsystem
>> that I've mentioned in the discussion of V1.
>>
>> Once we have virtio group - it should have a group id and them the admin
>> command can have a field calld group_id for commands that are targeted to
>> the whole group.
>>
>> Some commands are referring to a specific device in the group so only a
>> vdev_id is needed.
>>
>> Some commands are even targeted to the same device to query some info (we
>> have examples in this series for that), so in this case there is no need for
>> vdev_id nor group_id.
>>
>> So I'm sure sure we can improve common virtio_admin_cmd structure to have
>> these attributes since they are not mandatory because of the reasons I've
>> mentioned.
> I'm not sure I understand 100%, but try to address in the next
> revision and we'll discuss.

I meant to say that I'm *not* sure we can improve the common structure...

It was a typo.

And I don't understand why this info can't be in the 
command_specific_data because of all the reasons I mentioned above.

>
>>> The point of all this is to allow making sense of commands and
>>> e.g. virtualizing them for nested virt without necessarily
>>> knowing all of the detail about the specific command.
>> I don't understand this, sorry.
> Basically try to move stuff into generic format so it's possible
> to understand things without knowing detail of the command.

But we don't develop a networking protocol here. The management device 
is not sending packets towards its managed devices, right ?

This is an interface for a specific device that can manage others but 
also manage itself.

We didn't introduce a notion of broadcasting admin commands for other 
devices.


^ permalink raw reply	[flat|nested] 43+ messages in thread

* RE: [PATCH v3 1/4] Add virtio Admin virtqueue
  2022-02-07 10:39   ` Michael S. Tsirkin
  2022-02-07 14:58     ` Max Gurtovoy
@ 2022-02-08  6:25     ` Parav Pandit
  2022-02-08  6:42       ` Michael S. Tsirkin
  1 sibling, 1 reply; 43+ messages in thread
From: Parav Pandit @ 2022-02-08  6:25 UTC (permalink / raw)
  To: Michael S. Tsirkin, Max Gurtovoy
  Cc: virtio-comment, cohuck, virtio-dev, jasowang, Shahaf Shuler,
	Oren Duer, stefanha


> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Monday, February 7, 2022 4:09 PM
> 
> Next, trying to think about scalable iov extensions. So we will have groups of
> VFs and then SFs as the next level.
> How does one differentiate between the two?
> Maybe reserve a field for "destination type"?
>
We already discussed this in v2.
SF will have different identification than 16-bits. And no one knows what that would be.
We just cannot reserve some arbitrary bytes for unknown.
You suggested in v2 to reserve 4 bytes for sf_id, and I explained you that 4 bytes may not be enough.

Whether SFs are on top of VFs or SFs are on top of PFs or both is completely different spec.
Whether PF will manage SFs of the VFs or it will be done nested manner by VF etc is a completely different discussion than what is being proposed here.
Whether PF will manage the SF is yet another big question. We work with users and they dislike this.
To address it, some OS has a different management interface (not visible to PF) for SF life cycle even though SFs are anchored on a PF.

So SF/iov extension discussion has long way to go for community to first understand the use cases before crafting some extension.

So lets not complicate and mix things here for a blurry definition of scalable iov/sf extension.


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v3 1/4] Add virtio Admin virtqueue
  2022-02-08  6:25     ` Parav Pandit
@ 2022-02-08  6:42       ` Michael S. Tsirkin
  2022-02-08  7:04         ` Parav Pandit
  0 siblings, 1 reply; 43+ messages in thread
From: Michael S. Tsirkin @ 2022-02-08  6:42 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Max Gurtovoy, virtio-comment, cohuck, virtio-dev, jasowang,
	Shahaf Shuler, Oren Duer, stefanha

On Tue, Feb 08, 2022 at 06:25:41AM +0000, Parav Pandit wrote:
> 
> > From: Michael S. Tsirkin <mst@redhat.com>
> > Sent: Monday, February 7, 2022 4:09 PM
> > 
> > Next, trying to think about scalable iov extensions. So we will have groups of
> > VFs and then SFs as the next level.
> > How does one differentiate between the two?
> > Maybe reserve a field for "destination type"?
> >
> We already discussed this in v2.
> SF will have different identification than 16-bits. And no one knows what that would be.
> We just cannot reserve some arbitrary bytes for unknown.
> You suggested in v2 to reserve 4 bytes for sf_id, and I explained you that 4 bytes may not be enough.
> 
> Whether SFs are on top of VFs or SFs are on top of PFs or both is completely different spec.
> Whether PF will manage SFs of the VFs or it will be done nested manner by VF etc is a completely different discussion than what is being proposed here.
> Whether PF will manage the SF is yet another big question. We work with users and they dislike this.
> To address it, some OS has a different management interface (not visible to PF) for SF life cycle even though SFs are anchored on a PF.
> 
> So SF/iov extension discussion has long way to go for community to first understand the use cases before crafting some extension.
> 
> So lets not complicate and mix things here for a blurry definition of scalable iov/sf extension.

Some reserved bytes won't hurt. 2 bytes for type gives us 64k types,
sounds like that should be enough.

-- 
MST


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v3 1/4] Add virtio Admin virtqueue
  2022-02-08  0:41         ` Max Gurtovoy
@ 2022-02-08  6:45           ` Michael S. Tsirkin
  2022-02-08  8:34             ` Max Gurtovoy
  0 siblings, 1 reply; 43+ messages in thread
From: Michael S. Tsirkin @ 2022-02-08  6:45 UTC (permalink / raw)
  To: Max Gurtovoy
  Cc: virtio-comment, cohuck, virtio-dev, jasowang, parav, shahafs,
	oren, stefanha

On Tue, Feb 08, 2022 at 02:41:44AM +0200, Max Gurtovoy wrote:
> 
> On 2/7/2022 6:18 PM, Michael S. Tsirkin wrote:
> > On Mon, Feb 07, 2022 at 04:58:19PM +0200, Max Gurtovoy wrote:
> > > On 2/7/2022 12:39 PM, Michael S. Tsirkin wrote:
> > > > On Thu, Feb 03, 2022 at 09:57:13AM +0200, Max Gurtovoy wrote:
> > > > > +\begin{lstlisting}
> > > > > +struct virtio_admin_cmd {
> > > > > +        /* Device-readable part */
> > > > > +        u16 command;
> > > > > +        u8 command_specific_data[];
> > > > > +
> > > > > +        /* Device-writable part */
> > > > > +        u8 status;
> > > > > +        u8 command_specific_error;
> > > > > +        u8 command_specific_result[];
> > > > > +};
> > > > > +\end{lstlisting}
> > > > ok this abstraction is an improvement, thanks!
> > > > 
> > > > What I'd like to see is moving a bit more format to this generic structure.
> > > > 
> > > >   From what I could gather, some commands affect a group as a whole, and
> > > > some commands just a single member of the group. We could have a
> > > > "destination" field for that, and a special "all of the group"
> > > > destination for commands affecting the whole group.
> > > > 
> > > > 
> > > > Next, trying to think about scalable iov extensions. So we
> > > > will have groups of VFs and then SFs as the next level.
> > > > How does one differentiate between the two?
> > > > Maybe reserve a field for "destination type"?
> > > For now we have only a PCI group that composed of VFs and the PF.
> > > 
> > > What you suggest, IMO is a definition of a generic virtio group/subsystem
> > > that I've mentioned in the discussion of V1.
> > > 
> > > Once we have virtio group - it should have a group id and them the admin
> > > command can have a field calld group_id for commands that are targeted to
> > > the whole group.
> > > 
> > > Some commands are referring to a specific device in the group so only a
> > > vdev_id is needed.
> > > 
> > > Some commands are even targeted to the same device to query some info (we
> > > have examples in this series for that), so in this case there is no need for
> > > vdev_id nor group_id.
> > > 
> > > So I'm sure sure we can improve common virtio_admin_cmd structure to have
> > > these attributes since they are not mandatory because of the reasons I've
> > > mentioned.
> > I'm not sure I understand 100%, but try to address in the next
> > revision and we'll discuss.
> 
> I meant to say that I'm *not* sure we can improve the common structure...
> 
> It was a typo.
> 
> And I don't understand why this info can't be in the command_specific_data
> because of all the reasons I mentioned above.

It can, but as declared admin commands are there to handle
groups of VFs, so let's standardize how they refer to groups.

> > 
> > > > The point of all this is to allow making sense of commands and
> > > > e.g. virtualizing them for nested virt without necessarily
> > > > knowing all of the detail about the specific command.
> > > I don't understand this, sorry.
> > Basically try to move stuff into generic format so it's possible
> > to understand things without knowing detail of the command.
> 
> But we don't develop a networking protocol here. The management device is
> not sending packets towards its managed devices, right ?
> 
> This is an interface for a specific device that can manage others but also
> manage itself.
> 
> We didn't introduce a notion of broadcasting admin commands for other
> devices.

It's all entities communicating by message passing whether
you call these "commands", "packets" or whatever.

-- 
MST


^ permalink raw reply	[flat|nested] 43+ messages in thread

* RE: [PATCH v3 1/4] Add virtio Admin virtqueue
  2022-02-08  6:42       ` Michael S. Tsirkin
@ 2022-02-08  7:04         ` Parav Pandit
  2022-02-08 13:19           ` [virtio-comment] " Cornelia Huck
  0 siblings, 1 reply; 43+ messages in thread
From: Parav Pandit @ 2022-02-08  7:04 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Max Gurtovoy, virtio-comment, cohuck, virtio-dev, jasowang,
	Shahaf Shuler, Oren Duer, stefanha


> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Tuesday, February 8, 2022 12:13 PM

> On Tue, Feb 08, 2022 at 06:25:41AM +0000, Parav Pandit wrote:
> >
> > > From: Michael S. Tsirkin <mst@redhat.com>
> > > Sent: Monday, February 7, 2022 4:09 PM
> > >
> > > Next, trying to think about scalable iov extensions. So we will have
> > > groups of VFs and then SFs as the next level.
> > > How does one differentiate between the two?
> > > Maybe reserve a field for "destination type"?
> > >
> > We already discussed this in v2.
> > SF will have different identification than 16-bits. And no one knows what
> that would be.
> > We just cannot reserve some arbitrary bytes for unknown.
> > You suggested in v2 to reserve 4 bytes for sf_id, and I explained you that 4
> bytes may not be enough.
> >
> > Whether SFs are on top of VFs or SFs are on top of PFs or both is completely
> different spec.
> > Whether PF will manage SFs of the VFs or it will be done nested manner by
> VF etc is a completely different discussion than what is being proposed here.
> > Whether PF will manage the SF is yet another big question. We work with
> users and they dislike this.
> > To address it, some OS has a different management interface (not visible to
> PF) for SF life cycle even though SFs are anchored on a PF.
> >
> > So SF/iov extension discussion has long way to go for community to first
> understand the use cases before crafting some extension.
> >
> > So lets not complicate and mix things here for a blurry definition of scalable
> iov/sf extension.
> 
> Some reserved bytes won't hurt. 2 bytes for type gives us 64k types, sounds like
> that should be enough.
It doesn't stop there.
Mentioning some destination type, interrupt type, etc also requires reserving bytes for different device id type, interrupt type and more.
We past this stage long ago after discussing this in v1 at [1].
It is just better and cleaner to define a different structure to describe SF/iov and its configuration.

[1] https://markmail.org/search/?q=Add+virtio+Admin+Virtqueue+specification&q=list%3Aorg.oasis-open.lists.virtio-dev#query:Add%20virtio%20Admin%20Virtqueue%20specification%20list%3Aorg.oasis-open.lists.virtio-dev+page:2+mid:ncyxqhjlvusjfust+state:results


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v3 1/4] Add virtio Admin virtqueue
  2022-02-08  6:45           ` Michael S. Tsirkin
@ 2022-02-08  8:34             ` Max Gurtovoy
  2022-02-08 13:08               ` [virtio-dev] " Cornelia Huck
  2022-02-08 14:04               ` Michael S. Tsirkin
  0 siblings, 2 replies; 43+ messages in thread
From: Max Gurtovoy @ 2022-02-08  8:34 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: virtio-comment, cohuck, virtio-dev, jasowang, parav, shahafs,
	oren, stefanha


On 2/8/2022 8:45 AM, Michael S. Tsirkin wrote:
> On Tue, Feb 08, 2022 at 02:41:44AM +0200, Max Gurtovoy wrote:
>> On 2/7/2022 6:18 PM, Michael S. Tsirkin wrote:
>>> On Mon, Feb 07, 2022 at 04:58:19PM +0200, Max Gurtovoy wrote:
>>>> On 2/7/2022 12:39 PM, Michael S. Tsirkin wrote:
>>>>> On Thu, Feb 03, 2022 at 09:57:13AM +0200, Max Gurtovoy wrote:
>>>>>> +\begin{lstlisting}
>>>>>> +struct virtio_admin_cmd {
>>>>>> +        /* Device-readable part */
>>>>>> +        u16 command;
>>>>>> +        u8 command_specific_data[];
>>>>>> +
>>>>>> +        /* Device-writable part */
>>>>>> +        u8 status;
>>>>>> +        u8 command_specific_error;
>>>>>> +        u8 command_specific_result[];
>>>>>> +};
>>>>>> +\end{lstlisting}
>>>>> ok this abstraction is an improvement, thanks!
>>>>>
>>>>> What I'd like to see is moving a bit more format to this generic structure.
>>>>>
>>>>>    From what I could gather, some commands affect a group as a whole, and
>>>>> some commands just a single member of the group. We could have a
>>>>> "destination" field for that, and a special "all of the group"
>>>>> destination for commands affecting the whole group.
>>>>>
>>>>>
>>>>> Next, trying to think about scalable iov extensions. So we
>>>>> will have groups of VFs and then SFs as the next level.
>>>>> How does one differentiate between the two?
>>>>> Maybe reserve a field for "destination type"?
>>>> For now we have only a PCI group that composed of VFs and the PF.
>>>>
>>>> What you suggest, IMO is a definition of a generic virtio group/subsystem
>>>> that I've mentioned in the discussion of V1.
>>>>
>>>> Once we have virtio group - it should have a group id and them the admin
>>>> command can have a field calld group_id for commands that are targeted to
>>>> the whole group.
>>>>
>>>> Some commands are referring to a specific device in the group so only a
>>>> vdev_id is needed.
>>>>
>>>> Some commands are even targeted to the same device to query some info (we
>>>> have examples in this series for that), so in this case there is no need for
>>>> vdev_id nor group_id.
>>>>
>>>> So I'm sure sure we can improve common virtio_admin_cmd structure to have
>>>> these attributes since they are not mandatory because of the reasons I've
>>>> mentioned.
>>> I'm not sure I understand 100%, but try to address in the next
>>> revision and we'll discuss.
>> I meant to say that I'm *not* sure we can improve the common structure...
>>
>> It was a typo.
>>
>> And I don't understand why this info can't be in the command_specific_data
>> because of all the reasons I mentioned above.
> It can, but as declared admin commands are there to handle
> groups of VFs, so let's standardize how they refer to groups.

"Admin virtqueue is one of the management interface that used to send administrative
commands to manipulate various features of the *device* and/or to manipulate
various features, if possible, of *another device* within the same group"

It's not only for groups.

>
>>>>> The point of all this is to allow making sense of commands and
>>>>> e.g. virtualizing them for nested virt without necessarily
>>>>> knowing all of the detail about the specific command.
>>>> I don't understand this, sorry.
>>> Basically try to move stuff into generic format so it's possible
>>> to understand things without knowing detail of the command.
>> But we don't develop a networking protocol here. The management device is
>> not sending packets towards its managed devices, right ?
>>
>> This is an interface for a specific device that can manage others but also
>> manage itself.
>>
>> We didn't introduce a notion of broadcasting admin commands for other
>> devices.
> It's all entities communicating by message passing whether
> you call these "commands", "packets" or whatever.

We're extending a spec that introduce a driver-device communication via 
queue.

Your suggestion is an internal fabric with packets, headers, 
broadcasting commands (what about error flows ? is this fabric reliable 
? etc etc..).

I feel like this can be a nice future extension and shouldn't block 
current basic (and already big enough) proposal.

>


^ permalink raw reply	[flat|nested] 43+ messages in thread

* [virtio-dev] Re: [PATCH v3 1/4] Add virtio Admin virtqueue
  2022-02-08  8:34             ` Max Gurtovoy
@ 2022-02-08 13:08               ` Cornelia Huck
  2022-02-08 13:20                 ` Parav Pandit
  2022-02-08 14:04               ` Michael S. Tsirkin
  1 sibling, 1 reply; 43+ messages in thread
From: Cornelia Huck @ 2022-02-08 13:08 UTC (permalink / raw)
  To: Max Gurtovoy, Michael S. Tsirkin
  Cc: virtio-comment, virtio-dev, jasowang, parav, shahafs, oren, stefanha

On Tue, Feb 08 2022, Max Gurtovoy <mgurtovoy@nvidia.com> wrote:

> On 2/8/2022 8:45 AM, Michael S. Tsirkin wrote:
>> On Tue, Feb 08, 2022 at 02:41:44AM +0200, Max Gurtovoy wrote:
>>> On 2/7/2022 6:18 PM, Michael S. Tsirkin wrote:
>>>> On Mon, Feb 07, 2022 at 04:58:19PM +0200, Max Gurtovoy wrote:
>>>>> On 2/7/2022 12:39 PM, Michael S. Tsirkin wrote:
>>>>>> On Thu, Feb 03, 2022 at 09:57:13AM +0200, Max Gurtovoy wrote:
>>>>>>> +\begin{lstlisting}
>>>>>>> +struct virtio_admin_cmd {
>>>>>>> +        /* Device-readable part */
>>>>>>> +        u16 command;
>>>>>>> +        u8 command_specific_data[];
>>>>>>> +
>>>>>>> +        /* Device-writable part */
>>>>>>> +        u8 status;
>>>>>>> +        u8 command_specific_error;
>>>>>>> +        u8 command_specific_result[];
>>>>>>> +};
>>>>>>> +\end{lstlisting}
>>>>>> ok this abstraction is an improvement, thanks!
>>>>>>
>>>>>> What I'd like to see is moving a bit more format to this generic structure.
>>>>>>
>>>>>>    From what I could gather, some commands affect a group as a whole, and
>>>>>> some commands just a single member of the group. We could have a
>>>>>> "destination" field for that, and a special "all of the group"
>>>>>> destination for commands affecting the whole group.
>>>>>>
>>>>>>
>>>>>> Next, trying to think about scalable iov extensions. So we
>>>>>> will have groups of VFs and then SFs as the next level.
>>>>>> How does one differentiate between the two?
>>>>>> Maybe reserve a field for "destination type"?
>>>>> For now we have only a PCI group that composed of VFs and the PF.
>>>>>
>>>>> What you suggest, IMO is a definition of a generic virtio group/subsystem
>>>>> that I've mentioned in the discussion of V1.
>>>>>
>>>>> Once we have virtio group - it should have a group id and them the admin
>>>>> command can have a field calld group_id for commands that are targeted to
>>>>> the whole group.
>>>>>
>>>>> Some commands are referring to a specific device in the group so only a
>>>>> vdev_id is needed.
>>>>>
>>>>> Some commands are even targeted to the same device to query some info (we
>>>>> have examples in this series for that), so in this case there is no need for
>>>>> vdev_id nor group_id.
>>>>>
>>>>> So I'm sure sure we can improve common virtio_admin_cmd structure to have
>>>>> these attributes since they are not mandatory because of the reasons I've
>>>>> mentioned.
>>>> I'm not sure I understand 100%, but try to address in the next
>>>> revision and we'll discuss.
>>> I meant to say that I'm *not* sure we can improve the common structure...
>>>
>>> It was a typo.
>>>
>>> And I don't understand why this info can't be in the command_specific_data
>>> because of all the reasons I mentioned above.
>> It can, but as declared admin commands are there to handle
>> groups of VFs, so let's standardize how they refer to groups.
>
> "Admin virtqueue is one of the management interface that used to send administrative
> commands to manipulate various features of the *device* and/or to manipulate
> various features, if possible, of *another device* within the same group"
>
> It's not only for groups.

What I don't understand: Why can't we simply add target information to
the common definition? If a certain command doesn't need that target
information, it is simply ignored.

The benefit of reserving a field for target information and specifying
how groups and devices are supposed to be addressed is that not every
command will have to roll their own.


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 43+ messages in thread

* [virtio-comment] RE: [PATCH v3 1/4] Add virtio Admin virtqueue
  2022-02-08  7:04         ` Parav Pandit
@ 2022-02-08 13:19           ` Cornelia Huck
  2022-02-08 13:32             ` Parav Pandit
  0 siblings, 1 reply; 43+ messages in thread
From: Cornelia Huck @ 2022-02-08 13:19 UTC (permalink / raw)
  To: Parav Pandit, Michael S. Tsirkin
  Cc: Max Gurtovoy, virtio-comment, virtio-dev, jasowang,
	Shahaf Shuler, Oren Duer, stefanha

On Tue, Feb 08 2022, Parav Pandit <parav@nvidia.com> wrote:

>> From: Michael S. Tsirkin <mst@redhat.com>
>> Sent: Tuesday, February 8, 2022 12:13 PM
>
>> On Tue, Feb 08, 2022 at 06:25:41AM +0000, Parav Pandit wrote:
>> >
>> > > From: Michael S. Tsirkin <mst@redhat.com>
>> > > Sent: Monday, February 7, 2022 4:09 PM
>> > >
>> > > Next, trying to think about scalable iov extensions. So we will have
>> > > groups of VFs and then SFs as the next level.
>> > > How does one differentiate between the two?
>> > > Maybe reserve a field for "destination type"?
>> > >
>> > We already discussed this in v2.
>> > SF will have different identification than 16-bits. And no one knows what
>> that would be.
>> > We just cannot reserve some arbitrary bytes for unknown.
>> > You suggested in v2 to reserve 4 bytes for sf_id, and I explained you that 4
>> bytes may not be enough.
>> >
>> > Whether SFs are on top of VFs or SFs are on top of PFs or both is completely
>> different spec.
>> > Whether PF will manage SFs of the VFs or it will be done nested manner by
>> VF etc is a completely different discussion than what is being proposed here.
>> > Whether PF will manage the SF is yet another big question. We work with
>> users and they dislike this.
>> > To address it, some OS has a different management interface (not visible to
>> PF) for SF life cycle even though SFs are anchored on a PF.
>> >
>> > So SF/iov extension discussion has long way to go for community to first
>> understand the use cases before crafting some extension.
>> >
>> > So lets not complicate and mix things here for a blurry definition of scalable
>> iov/sf extension.
>> 
>> Some reserved bytes won't hurt. 2 bytes for type gives us 64k types, sounds like
>> that should be enough.
> It doesn't stop there.
> Mentioning some destination type, interrupt type, etc also requires reserving bytes for different device id type, interrupt type and more.
> We past this stage long ago after discussing this in v1 at [1].
> It is just better and cleaner to define a different structure to describe SF/iov and its configuration.

I have the feeling that we might be overcomplicating this. We have some
groups of targets (a device, a group, that more complicated SF thingy),
and we want to distinguish between them. That's easy enough to cover via
some kind of enum-equivalent (0 == same dev, 1 == target a dev id, 2
== target a group id, 3 == multi-layer target) and some spec how 1 and 2
should look like (as I'd expect them to be common for many different
things). The SF thingy can be covered by 3, and we'll probably want to
make that one command-specific, as the whole setup does not have enough
things we can standardize on.

Does that make sense? This standardizes the simpler setups, and gives
enough flexibility for the more complicated ones. If we have another
simpler setup in the future, it can become type 4, 5, etc.


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 43+ messages in thread

* RE: [PATCH v3 1/4] Add virtio Admin virtqueue
  2022-02-08 13:08               ` [virtio-dev] " Cornelia Huck
@ 2022-02-08 13:20                 ` Parav Pandit
  0 siblings, 0 replies; 43+ messages in thread
From: Parav Pandit @ 2022-02-08 13:20 UTC (permalink / raw)
  To: Cornelia Huck, Max Gurtovoy, Michael S. Tsirkin
  Cc: virtio-comment, virtio-dev, jasowang, Shahaf Shuler, Oren Duer, stefanha


> From: Cornelia Huck <cohuck@redhat.com>
> Sent: Tuesday, February 8, 2022 6:39 PM
> 
> On Tue, Feb 08 2022, Max Gurtovoy <mgurtovoy@nvidia.com> wrote:
> 
> > On 2/8/2022 8:45 AM, Michael S. Tsirkin wrote:
> >> On Tue, Feb 08, 2022 at 02:41:44AM +0200, Max Gurtovoy wrote:
> >>> On 2/7/2022 6:18 PM, Michael S. Tsirkin wrote:
> >>>> On Mon, Feb 07, 2022 at 04:58:19PM +0200, Max Gurtovoy wrote:
> >>>>> On 2/7/2022 12:39 PM, Michael S. Tsirkin wrote:
> >>>>>> On Thu, Feb 03, 2022 at 09:57:13AM +0200, Max Gurtovoy wrote:
> >>>>>>> +\begin{lstlisting}
> >>>>>>> +struct virtio_admin_cmd {
> >>>>>>> +        /* Device-readable part */
> >>>>>>> +        u16 command;
> >>>>>>> +        u8 command_specific_data[];
> >>>>>>> +
> >>>>>>> +        /* Device-writable part */
> >>>>>>> +        u8 status;
> >>>>>>> +        u8 command_specific_error;
> >>>>>>> +        u8 command_specific_result[]; }; \end{lstlisting}
> >>>>>> ok this abstraction is an improvement, thanks!
> >>>>>>
> >>>>>> What I'd like to see is moving a bit more format to this generic
> structure.
> >>>>>>
> >>>>>>    From what I could gather, some commands affect a group as a
> >>>>>> whole, and some commands just a single member of the group. We
> >>>>>> could have a "destination" field for that, and a special "all of the group"
> >>>>>> destination for commands affecting the whole group.
> >>>>>>
> >>>>>>
> >>>>>> Next, trying to think about scalable iov extensions. So we will
> >>>>>> have groups of VFs and then SFs as the next level.
> >>>>>> How does one differentiate between the two?
> >>>>>> Maybe reserve a field for "destination type"?
> >>>>> For now we have only a PCI group that composed of VFs and the PF.
> >>>>>
> >>>>> What you suggest, IMO is a definition of a generic virtio
> >>>>> group/subsystem that I've mentioned in the discussion of V1.
> >>>>>
> >>>>> Once we have virtio group - it should have a group id and them the
> >>>>> admin command can have a field calld group_id for commands that
> >>>>> are targeted to the whole group.
> >>>>>
> >>>>> Some commands are referring to a specific device in the group so
> >>>>> only a vdev_id is needed.
> >>>>>
> >>>>> Some commands are even targeted to the same device to query some
> >>>>> info (we have examples in this series for that), so in this case
> >>>>> there is no need for vdev_id nor group_id.
> >>>>>
> >>>>> So I'm sure sure we can improve common virtio_admin_cmd structure
> >>>>> to have these attributes since they are not mandatory because of
> >>>>> the reasons I've mentioned.
> >>>> I'm not sure I understand 100%, but try to address in the next
> >>>> revision and we'll discuss.
> >>> I meant to say that I'm *not* sure we can improve the common structure...
> >>>
> >>> It was a typo.
> >>>
> >>> And I don't understand why this info can't be in the
> >>> command_specific_data because of all the reasons I mentioned above.
> >> It can, but as declared admin commands are there to handle groups of
> >> VFs, so let's standardize how they refer to groups.
> >
> > "Admin virtqueue is one of the management interface that used to send
> > administrative commands to manipulate various features of the *device*
> > and/or to manipulate various features, if possible, of *another device* within
> the same group"
> >
> > It's not only for groups.
> 
> What I don't understand: Why can't we simply add target information to the
> common definition? 
A command in the admin queue is always targeted to the device on which AQ resides.
There is no need to add any explicit target field. AQ is just like any other queue.

A command in the AQ is modifying the attributes of a VF.
If you call this VF as _target_ than related AQ commands already contain the VF id.

> If a certain command doesn't need that target
> information, it is simply ignored.
There is no point in adding any arbitrary and unknown length of reserved bytes which are supposed to be ignored.

> 
> The benefit of reserving a field for target information and specifying how
> groups and devices are supposed to be addressed is that not every command
> will have to roll their own.
There isn't a good proposal available about how explicit grouping to be done for which some arbitrary group id or target id is requested.
So I think we should not introduce some unknown fields that driver and devices doesn't know how to use it.

Better to define such new fields when an actual explicit grouping scheme is defined.


^ permalink raw reply	[flat|nested] 43+ messages in thread

* RE: [PATCH v3 1/4] Add virtio Admin virtqueue
  2022-02-08 13:19           ` [virtio-comment] " Cornelia Huck
@ 2022-02-08 13:32             ` Parav Pandit
  2022-02-08 13:58               ` Michael S. Tsirkin
  0 siblings, 1 reply; 43+ messages in thread
From: Parav Pandit @ 2022-02-08 13:32 UTC (permalink / raw)
  To: Cornelia Huck, Michael S. Tsirkin
  Cc: Max Gurtovoy, virtio-comment, virtio-dev, jasowang,
	Shahaf Shuler, Oren Duer, stefanha


> From: Cornelia Huck <cohuck@redhat.com>
> Sent: Tuesday, February 8, 2022 6:50 PM
> 
> On Tue, Feb 08 2022, Parav Pandit <parav@nvidia.com> wrote:
> 
> >> From: Michael S. Tsirkin <mst@redhat.com>
> >> Sent: Tuesday, February 8, 2022 12:13 PM
> >
> >> On Tue, Feb 08, 2022 at 06:25:41AM +0000, Parav Pandit wrote:
> >> >
> >> > > From: Michael S. Tsirkin <mst@redhat.com>
> >> > > Sent: Monday, February 7, 2022 4:09 PM
> >> > >
> >> > > Next, trying to think about scalable iov extensions. So we will
> >> > > have groups of VFs and then SFs as the next level.
> >> > > How does one differentiate between the two?
> >> > > Maybe reserve a field for "destination type"?
> >> > >
> >> > We already discussed this in v2.
> >> > SF will have different identification than 16-bits. And no one
> >> > knows what
> >> that would be.
> >> > We just cannot reserve some arbitrary bytes for unknown.
> >> > You suggested in v2 to reserve 4 bytes for sf_id, and I explained
> >> > you that 4
> >> bytes may not be enough.
> >> >
> >> > Whether SFs are on top of VFs or SFs are on top of PFs or both is
> >> > completely
> >> different spec.
> >> > Whether PF will manage SFs of the VFs or it will be done nested
> >> > manner by
> >> VF etc is a completely different discussion than what is being proposed here.
> >> > Whether PF will manage the SF is yet another big question. We work
> >> > with
> >> users and they dislike this.
> >> > To address it, some OS has a different management interface (not
> >> > visible to
> >> PF) for SF life cycle even though SFs are anchored on a PF.
> >> >
> >> > So SF/iov extension discussion has long way to go for community to
> >> > first
> >> understand the use cases before crafting some extension.
> >> >
> >> > So lets not complicate and mix things here for a blurry definition
> >> > of scalable
> >> iov/sf extension.
> >>
> >> Some reserved bytes won't hurt. 2 bytes for type gives us 64k types,
> >> sounds like that should be enough.
> > It doesn't stop there.
> > Mentioning some destination type, interrupt type, etc also requires reserving
> bytes for different device id type, interrupt type and more.
> > We past this stage long ago after discussing this in v1 at [1].
> > It is just better and cleaner to define a different structure to describe SF/iov
> and its configuration.
> 
> I have the feeling that we might be overcomplicating this. We have some
> groups of targets (a device, a group, that more complicated SF thingy), and we
> want to distinguish between them. That's easy enough to cover via some kind of
> enum-equivalent (0 == same dev, 1 == target a dev id, 2 == target a group id, 3
> == multi-layer target) and some spec how 1 and 2 should look like (as I'd expect
> them to be common for many different things). 
Do we have a concrete example of a command that can be targeted for same device and a target device, which requires differentiating their destination? If so, lets discuss and then it make sense to add for the well-defined use case.

> The SF thingy can be covered
> by 3, and we'll probably want to make that one command-specific, as the whole
> setup does not have enough things we can standardize on.
This comment has underlying assumption that there are nested layers. And that assumption may not be true at all.
At least for iov extension of intel and SF of Mellanox (already open source for 1+ year) doesn't have the nested layers as today as far as I understand.

> 
> Does that make sense? This standardizes the simpler setups, and gives enough
> flexibility for the more complicated ones. If we have another simpler setup in
> the future, it can become type 4, 5, etc.
I feel that without a use case we are over complicating the commands by introducing group/target id etc.

So better to first come up with a valid use case and a device that supports it, which needs a group.
Otherwise a target id can be a long string of PCI device = 0000:03:00.0 or a BDF or a VF number or a VF of a different PCI PF or a SF number 32-bit or SF UUID or a VF on a remote DPU system or PCI device on transparent bridge or something else.

Without knowing the grouping and a nonexistence device we shouldn't complicate the commands.

There are enough opcodes (64K) that can define new structure for more complex devices.


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v3 1/4] Add virtio Admin virtqueue
  2022-02-08 13:32             ` Parav Pandit
@ 2022-02-08 13:58               ` Michael S. Tsirkin
  2022-02-08 14:59                 ` [virtio-comment] " Cornelia Huck
  2022-02-08 15:06                 ` Parav Pandit
  0 siblings, 2 replies; 43+ messages in thread
From: Michael S. Tsirkin @ 2022-02-08 13:58 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Cornelia Huck, Max Gurtovoy, virtio-comment, virtio-dev,
	jasowang, Shahaf Shuler, Oren Duer, stefanha

On Tue, Feb 08, 2022 at 01:32:12PM +0000, Parav Pandit wrote:
> 
> > From: Cornelia Huck <cohuck@redhat.com>
> > Sent: Tuesday, February 8, 2022 6:50 PM
> > 
> > On Tue, Feb 08 2022, Parav Pandit <parav@nvidia.com> wrote:
> > 
> > >> From: Michael S. Tsirkin <mst@redhat.com>
> > >> Sent: Tuesday, February 8, 2022 12:13 PM
> > >
> > >> On Tue, Feb 08, 2022 at 06:25:41AM +0000, Parav Pandit wrote:
> > >> >
> > >> > > From: Michael S. Tsirkin <mst@redhat.com>
> > >> > > Sent: Monday, February 7, 2022 4:09 PM
> > >> > >
> > >> > > Next, trying to think about scalable iov extensions. So we will
> > >> > > have groups of VFs and then SFs as the next level.
> > >> > > How does one differentiate between the two?
> > >> > > Maybe reserve a field for "destination type"?
> > >> > >
> > >> > We already discussed this in v2.
> > >> > SF will have different identification than 16-bits. And no one
> > >> > knows what
> > >> that would be.
> > >> > We just cannot reserve some arbitrary bytes for unknown.
> > >> > You suggested in v2 to reserve 4 bytes for sf_id, and I explained
> > >> > you that 4
> > >> bytes may not be enough.
> > >> >
> > >> > Whether SFs are on top of VFs or SFs are on top of PFs or both is
> > >> > completely
> > >> different spec.
> > >> > Whether PF will manage SFs of the VFs or it will be done nested
> > >> > manner by
> > >> VF etc is a completely different discussion than what is being proposed here.
> > >> > Whether PF will manage the SF is yet another big question. We work
> > >> > with
> > >> users and they dislike this.
> > >> > To address it, some OS has a different management interface (not
> > >> > visible to
> > >> PF) for SF life cycle even though SFs are anchored on a PF.
> > >> >
> > >> > So SF/iov extension discussion has long way to go for community to
> > >> > first
> > >> understand the use cases before crafting some extension.
> > >> >
> > >> > So lets not complicate and mix things here for a blurry definition
> > >> > of scalable
> > >> iov/sf extension.
> > >>
> > >> Some reserved bytes won't hurt. 2 bytes for type gives us 64k types,
> > >> sounds like that should be enough.
> > > It doesn't stop there.
> > > Mentioning some destination type, interrupt type, etc also requires reserving
> > bytes for different device id type, interrupt type and more.
> > > We past this stage long ago after discussing this in v1 at [1].
> > > It is just better and cleaner to define a different structure to describe SF/iov
> > and its configuration.
> > 
> > I have the feeling that we might be overcomplicating this. We have some
> > groups of targets (a device, a group, that more complicated SF thingy), and we
> > want to distinguish between them. That's easy enough to cover via some kind of
> > enum-equivalent (0 == same dev, 1 == target a dev id, 2 == target a group id, 3
> > == multi-layer target) and some spec how 1 and 2 should look like (as I'd expect
> > them to be common for many different things). 
> Do we have a concrete example of a command that can be targeted for same device and a target device, which requires differentiating their destination? If so, lets discuss and then it make sense to add for the well-defined use case.

So e.g. things like controlling NIC's MAC can reasonably be part of
the same device.

> > The SF thingy can be covered
> > by 3, and we'll probably want to make that one command-specific, as the whole
> > setup does not have enough things we can standardize on.
> This comment has underlying assumption that there are nested layers. And that assumption may not be true at all.
> At least for iov extension of intel and SF of Mellanox (already open source for 1+ year) doesn't have the nested layers as today as far as I understand.
> > 
> > Does that make sense? This standardizes the simpler setups, and gives enough
> > flexibility for the more complicated ones. If we have another simpler setup in
> > the future, it can become type 4, 5, etc.
> I feel that without a use case we are over complicating the commands by introducing group/target id etc.
> 
> So better to first come up with a valid use case and a device that supports it, which needs a group.
> Otherwise a target id can be a long string of PCI device = 0000:03:00.0 or a BDF or a VF number or a VF of a different PCI PF or a SF number 32-bit or SF UUID or a VF on a remote DPU system or PCI device on transparent bridge or something else.

Well PASID is IIRC just 20 bit on express. I find it unlikely that we'll
need more than 64 bit. Yes, it's hard to predict the future but just
doing 16 bit here seems frankly like a premature optimization. UUID for
a transient thing such as SF just seems unnecessary. 32 or 64 bit seem
both acceptable.

> Without knowing the grouping and a nonexistence device we shouldn't complicate the commands.
> 
> There are enough opcodes (64K) that can define new structure for more complex devices.

I think you are asking for a bit much frankly, it's up to you to build
the interface. Just like with code, if the design does not feel robust
the bar is much higher even if one can not prove there's a locking
problem.  Same here, this interface design does not yet feel very robust
yet - so either we build it in a way that seems robust based
basically on our gut feeling, or actually spend time predicting and
addressing future use-cases to prove they can be addressed.
I dare say we've developed some intuition about what makes
an extensible interface and where things are likely to go here at the
TC, so I wouldn't discard all feedback as unnecessary complication even
if it does not always come with concrete use-case examples.

-- 
MST


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v3 1/4] Add virtio Admin virtqueue
  2022-02-08  8:34             ` Max Gurtovoy
  2022-02-08 13:08               ` [virtio-dev] " Cornelia Huck
@ 2022-02-08 14:04               ` Michael S. Tsirkin
  1 sibling, 0 replies; 43+ messages in thread
From: Michael S. Tsirkin @ 2022-02-08 14:04 UTC (permalink / raw)
  To: Max Gurtovoy
  Cc: virtio-comment, cohuck, virtio-dev, jasowang, parav, shahafs,
	oren, stefanha

On Tue, Feb 08, 2022 at 10:34:24AM +0200, Max Gurtovoy wrote:
> "Admin virtqueue is one of the management interface that used to send administrative
> commands to manipulate various features of the *device* and/or to manipulate
> various features, if possible, of *another device* within the same group"
> 
> It's not only for groups.

First this sentence is unclear what does "to manipulate" refer to?
What is done to manipulate these things?
I think the vq is to send commands. The commands are to manipulate etc.
So pls cut this up and use shorter sentences please.

And I think the vq is useful mostly for groups. But yes, thinkably if
admin commands are submitted via e.g. MMIO then they can refer to the
same device where they are sent, not to a group.  And this is exactly
why Cornelia's suggesting a way to differentiate.  We should not need a
completely separate command just because it's going to the same device
and not to another one.

-- 
MST


^ permalink raw reply	[flat|nested] 43+ messages in thread

* [virtio-comment] Re: [PATCH v3 1/4] Add virtio Admin virtqueue
  2022-02-08 13:58               ` Michael S. Tsirkin
@ 2022-02-08 14:59                 ` Cornelia Huck
  2022-02-08 15:11                   ` [virtio-dev] " Parav Pandit
  2022-02-08 15:26                   ` Michael S. Tsirkin
  2022-02-08 15:06                 ` Parav Pandit
  1 sibling, 2 replies; 43+ messages in thread
From: Cornelia Huck @ 2022-02-08 14:59 UTC (permalink / raw)
  To: Michael S. Tsirkin, Parav Pandit
  Cc: Max Gurtovoy, virtio-comment, virtio-dev, jasowang,
	Shahaf Shuler, Oren Duer, stefanha

On Tue, Feb 08 2022, "Michael S. Tsirkin" <mst@redhat.com> wrote:

> On Tue, Feb 08, 2022 at 01:32:12PM +0000, Parav Pandit wrote:
>> 
>> > From: Cornelia Huck <cohuck@redhat.com>
>> > Sent: Tuesday, February 8, 2022 6:50 PM
>> > 
>> > On Tue, Feb 08 2022, Parav Pandit <parav@nvidia.com> wrote:
>> > 
>> > >> From: Michael S. Tsirkin <mst@redhat.com>
>> > >> Sent: Tuesday, February 8, 2022 12:13 PM
>> > >
>> > >> On Tue, Feb 08, 2022 at 06:25:41AM +0000, Parav Pandit wrote:
>> > >> >
>> > >> > > From: Michael S. Tsirkin <mst@redhat.com>
>> > >> > > Sent: Monday, February 7, 2022 4:09 PM
>> > >> > >
>> > >> > > Next, trying to think about scalable iov extensions. So we will
>> > >> > > have groups of VFs and then SFs as the next level.
>> > >> > > How does one differentiate between the two?
>> > >> > > Maybe reserve a field for "destination type"?
>> > >> > >
>> > >> > We already discussed this in v2.
>> > >> > SF will have different identification than 16-bits. And no one
>> > >> > knows what
>> > >> that would be.
>> > >> > We just cannot reserve some arbitrary bytes for unknown.
>> > >> > You suggested in v2 to reserve 4 bytes for sf_id, and I explained
>> > >> > you that 4
>> > >> bytes may not be enough.
>> > >> >
>> > >> > Whether SFs are on top of VFs or SFs are on top of PFs or both is
>> > >> > completely
>> > >> different spec.
>> > >> > Whether PF will manage SFs of the VFs or it will be done nested
>> > >> > manner by
>> > >> VF etc is a completely different discussion than what is being proposed here.
>> > >> > Whether PF will manage the SF is yet another big question. We work
>> > >> > with
>> > >> users and they dislike this.
>> > >> > To address it, some OS has a different management interface (not
>> > >> > visible to
>> > >> PF) for SF life cycle even though SFs are anchored on a PF.
>> > >> >
>> > >> > So SF/iov extension discussion has long way to go for community to
>> > >> > first
>> > >> understand the use cases before crafting some extension.
>> > >> >
>> > >> > So lets not complicate and mix things here for a blurry definition
>> > >> > of scalable
>> > >> iov/sf extension.
>> > >>
>> > >> Some reserved bytes won't hurt. 2 bytes for type gives us 64k types,
>> > >> sounds like that should be enough.
>> > > It doesn't stop there.
>> > > Mentioning some destination type, interrupt type, etc also requires reserving
>> > bytes for different device id type, interrupt type and more.
>> > > We past this stage long ago after discussing this in v1 at [1].
>> > > It is just better and cleaner to define a different structure to describe SF/iov
>> > and its configuration.
>> > 
>> > I have the feeling that we might be overcomplicating this. We have some
>> > groups of targets (a device, a group, that more complicated SF thingy), and we
>> > want to distinguish between them. That's easy enough to cover via some kind of
>> > enum-equivalent (0 == same dev, 1 == target a dev id, 2 == target a group id, 3
>> > == multi-layer target) and some spec how 1 and 2 should look like (as I'd expect
>> > them to be common for many different things). 
>> Do we have a concrete example of a command that can be targeted for same device and a target device, which requires differentiating their destination? If so, lets discuss and then it make sense to add for the well-defined use case.
>
> So e.g. things like controlling NIC's MAC can reasonably be part of
> the same device.

Yes, that would be an example.

I might have been a bit too vague about what I had been thinking
about. Let's do a sketch (intentionally without concrete sizes):

+-------------------------------------------------------+
| command                                               |
+-------------------------------------------------------+
| target type (0 - self, 1 - dev id, 2 - group id, ...  |
+-------------------------------------------------------+
| dev id                                                |
+-------------------------------------------------------+
| group id                                              |
+-------------------------------------------------------+
| command-specific data                                 |
+-------------------------------------------------------+
| response part                                         |
+-------------------------------------------------------+

'dev id' would be valid for 'target type' == 1, 'group id' would be
valid for 'target type' == 2. Alternatively, 'dev id' and 'group id'
could be a single 'target id' field; if there's nothing better to use,
it can simply contain a uuid.

>
>> > The SF thingy can be covered
>> > by 3, and we'll probably want to make that one command-specific, as the whole
>> > setup does not have enough things we can standardize on.
>> This comment has underlying assumption that there are nested layers. And that assumption may not be true at all.
>> At least for iov extension of intel and SF of Mellanox (already open source for 1+ year) doesn't have the nested layers as today as far as I understand.
>> > 
>> > Does that make sense? This standardizes the simpler setups, and gives enough
>> > flexibility for the more complicated ones. If we have another simpler setup in
>> > the future, it can become type 4, 5, etc.
>> I feel that without a use case we are over complicating the commands by introducing group/target id etc.
>> 
>> So better to first come up with a valid use case and a device that supports it, which needs a group.
>> Otherwise a target id can be a long string of PCI device = 0000:03:00.0 or a BDF or a VF number or a VF of a different PCI PF or a SF number 32-bit or SF UUID or a VF on a remote DPU system or PCI device on transparent bridge or something else.
>
> Well PASID is IIRC just 20 bit on express. I find it unlikely that we'll
> need more than 64 bit. Yes, it's hard to predict the future but just
> doing 16 bit here seems frankly like a premature optimization. UUID for
> a transient thing such as SF just seems unnecessary. 32 or 64 bit seem
> both acceptable.

However, if we want to be able to accommodate targets we have no idea
about yet (and that may have nothing to do with PCI at all), we should
maybe standardize on something that can fit a uuid -- if all else fails,
you can identify anything with a uuid.

64 bits might be enough in practice, 32 bit seems a bit short.

>
>> Without knowing the grouping and a nonexistence device we shouldn't complicate the commands.
>> 
>> There are enough opcodes (64K) that can define new structure for more complex devices.
>
> I think you are asking for a bit much frankly, it's up to you to build
> the interface. Just like with code, if the design does not feel robust
> the bar is much higher even if one can not prove there's a locking
> problem.  Same here, this interface design does not yet feel very robust
> yet - so either we build it in a way that seems robust based
> basically on our gut feeling, or actually spend time predicting and
> addressing future use-cases to prove they can be addressed.
> I dare say we've developed some intuition about what makes
> an extensible interface and where things are likely to go here at the
> TC, so I wouldn't discard all feedback as unnecessary complication even
> if it does not always come with concrete use-case examples.

Indeed. I'm not saying that my ideas are the right way to go, but we
definitely need something that (a) covers the use cases we already know
about, and (b) can accommodate future use cases with some
confidence. Now there's always the chance that we have a requirement
down the road that's so odd that our attempts to make this extensible
are not enough, but we really need to be able to cover at least things
we can imagine. Especially as the use cases in (a) are very specific
(essentially PCI-specific), we need to at least answer the question "how
could this work for things that are not PCI?"

(And shoving everything into command-specific data seems too
underspecified to me.)


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 43+ messages in thread

* RE: [PATCH v3 1/4] Add virtio Admin virtqueue
  2022-02-08 13:58               ` Michael S. Tsirkin
  2022-02-08 14:59                 ` [virtio-comment] " Cornelia Huck
@ 2022-02-08 15:06                 ` Parav Pandit
  2022-02-08 15:39                   ` Michael S. Tsirkin
  1 sibling, 1 reply; 43+ messages in thread
From: Parav Pandit @ 2022-02-08 15:06 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Cornelia Huck, Max Gurtovoy, virtio-comment, virtio-dev,
	jasowang, Shahaf Shuler, Oren Duer, stefanha


> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Tuesday, February 8, 2022 7:29 PM
> 
> > Do we have a concrete example of a command that can be targeted for same
> device and a target device, which requires differentiating their destination? If
> so, lets discuss and then it make sense to add for the well-defined use case.
> 
> So e.g. things like controlling NIC's MAC can reasonably be part of the same
> device.
A mac address of NIC can be programmed via the existing control VQ for the self.
Lets come up with some other example.

> > So better to first come up with a valid use case and a device that supports it,
> which needs a group.
> > Otherwise a target id can be a long string of PCI device = 0000:03:00.0 or a
> BDF or a VF number or a VF of a different PCI PF or a SF number 32-bit or SF
> UUID or a VF on a remote DPU system or PCI device on transparent bridge or
> something else.
> 
> Well PASID is IIRC just 20 bit on express. 
There are off line discussion and some on the mailing list too (I don't have a link to few months/year old email),
That PASID is being overloaded to identify the SF and process both. And I tend to agree to it.

So I won't be surprised if a new SF function id takes different route than PASID.
More below.

> I find it unlikely that we'll need more
> than 64 bit. Yes, it's hard to predict the future but just doing 16 bit here seems
> frankly like a premature optimization. UUID for a transient thing such as SF just
> seems unnecessary. 32 or 64 bit seem both acceptable.
> 
> > Without knowing the grouping and a nonexistence device we shouldn't
> complicate the commands.
> >
> > There are enough opcodes (64K) that can define new structure for more
> complex devices.
> 
> I think you are asking for a bit much frankly, it's up to you to build the interface.
I likely didn't understand above point.

> Just like with code, if the design does not feel robust the bar is much higher
> even if one can not prove there's a locking problem. 

> Same here, this interface
> design does not yet feel very robust yet - so either we build it in a way that
> seems robust based basically on our gut feeling, or actually spend time
> predicting and addressing future use-cases to prove they can be addressed.

> I dare say we've developed some intuition about what makes an extensible
> interface and where things are likely to go here at the TC, so I wouldn't discard
> all feedback as unnecessary complication even if it does not always come with
> concrete use-case examples.
I do not doubt your intuition. 
I am open to feedback, but we haven't really established at least single explicit grouping example and ask is to add some arbitrary reserved bytes for it.

From my DPU experience, over last 3 years, we have built SF and VF device on parent PF-A, and their management interface on parent PF-B.
On top of that it is expected to be uniquely indefinable in multi-host env, that brings the notion of multiple controllers by a single management device.
And also make it work uniformly with same interface when parent and management device are same.
You can see an extendible interface at [1].

With this base line of [1], I do not agree that defining a u32 device_identifier (to contain 20-bit PASID) will be sufficient in future.

So if we really want to cover variety of cases like [1] and some more complex nested cases, we better define,
Device identifier as below,
struct device_identifier {
	u8 id_length;
	u8 id[];	/* variable length field
};
For implicitly grouped VFs of a PF, id can be 2 bytes.
For more advance cases it can be a structure consist one or more combination of (a) host id or controller id (b) PF BDF, (c) sf id (d) PASID and more.

[1] https://www.kernel.org/doc/Documentation/networking/devlink/devlink-port.rst


^ permalink raw reply	[flat|nested] 43+ messages in thread

* [virtio-dev] RE: [PATCH v3 1/4] Add virtio Admin virtqueue
  2022-02-08 14:59                 ` [virtio-comment] " Cornelia Huck
@ 2022-02-08 15:11                   ` Parav Pandit
  2022-02-08 15:18                     ` Cornelia Huck
  2022-02-08 15:28                     ` Michael S. Tsirkin
  2022-02-08 15:26                   ` Michael S. Tsirkin
  1 sibling, 2 replies; 43+ messages in thread
From: Parav Pandit @ 2022-02-08 15:11 UTC (permalink / raw)
  To: Cornelia Huck, Michael S. Tsirkin
  Cc: Max Gurtovoy, virtio-comment, virtio-dev, jasowang,
	Shahaf Shuler, Oren Duer, stefanha


> From: Cornelia Huck <cohuck@redhat.com>
> Sent: Tuesday, February 8, 2022 8:29 PM

> very specific (essentially PCI-specific), we need to at least answer the question
> "how could this work for things that are not PCI?"
Why would one want to execute PCI VF MSI-X configuration command for a non PCI device?

---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [virtio-dev] RE: [PATCH v3 1/4] Add virtio Admin virtqueue
  2022-02-08 15:11                   ` [virtio-dev] " Parav Pandit
@ 2022-02-08 15:18                     ` Cornelia Huck
  2022-02-08 15:28                     ` Michael S. Tsirkin
  1 sibling, 0 replies; 43+ messages in thread
From: Cornelia Huck @ 2022-02-08 15:18 UTC (permalink / raw)
  To: Parav Pandit, Michael S. Tsirkin
  Cc: Max Gurtovoy, virtio-comment, virtio-dev, jasowang,
	Shahaf Shuler, Oren Duer, stefanha

On Tue, Feb 08 2022, Parav Pandit <parav@nvidia.com> wrote:

>> From: Cornelia Huck <cohuck@redhat.com>
>> Sent: Tuesday, February 8, 2022 8:29 PM
>
>> very specific (essentially PCI-specific), we need to at least answer the question
>> "how could this work for things that are not PCI?"
> Why would one want to execute PCI VF MSI-X configuration command for a non PCI device?

Well, that obviously does not make sense! But we also obviously still
want to be able to target admin commands in general to non-PCI devices.


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v3 1/4] Add virtio Admin virtqueue
  2022-02-08 14:59                 ` [virtio-comment] " Cornelia Huck
  2022-02-08 15:11                   ` [virtio-dev] " Parav Pandit
@ 2022-02-08 15:26                   ` Michael S. Tsirkin
  2022-02-08 15:32                     ` [virtio-comment] " Cornelia Huck
  2022-02-08 15:35                     ` [virtio-dev] " Parav Pandit
  1 sibling, 2 replies; 43+ messages in thread
From: Michael S. Tsirkin @ 2022-02-08 15:26 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Parav Pandit, Max Gurtovoy, virtio-comment, virtio-dev, jasowang,
	Shahaf Shuler, Oren Duer, stefanha

On Tue, Feb 08, 2022 at 03:59:13PM +0100, Cornelia Huck wrote:
> On Tue, Feb 08 2022, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> 
> > On Tue, Feb 08, 2022 at 01:32:12PM +0000, Parav Pandit wrote:
> >> 
> >> > From: Cornelia Huck <cohuck@redhat.com>
> >> > Sent: Tuesday, February 8, 2022 6:50 PM
> >> > 
> >> > On Tue, Feb 08 2022, Parav Pandit <parav@nvidia.com> wrote:
> >> > 
> >> > >> From: Michael S. Tsirkin <mst@redhat.com>
> >> > >> Sent: Tuesday, February 8, 2022 12:13 PM
> >> > >
> >> > >> On Tue, Feb 08, 2022 at 06:25:41AM +0000, Parav Pandit wrote:
> >> > >> >
> >> > >> > > From: Michael S. Tsirkin <mst@redhat.com>
> >> > >> > > Sent: Monday, February 7, 2022 4:09 PM
> >> > >> > >
> >> > >> > > Next, trying to think about scalable iov extensions. So we will
> >> > >> > > have groups of VFs and then SFs as the next level.
> >> > >> > > How does one differentiate between the two?
> >> > >> > > Maybe reserve a field for "destination type"?
> >> > >> > >
> >> > >> > We already discussed this in v2.
> >> > >> > SF will have different identification than 16-bits. And no one
> >> > >> > knows what
> >> > >> that would be.
> >> > >> > We just cannot reserve some arbitrary bytes for unknown.
> >> > >> > You suggested in v2 to reserve 4 bytes for sf_id, and I explained
> >> > >> > you that 4
> >> > >> bytes may not be enough.
> >> > >> >
> >> > >> > Whether SFs are on top of VFs or SFs are on top of PFs or both is
> >> > >> > completely
> >> > >> different spec.
> >> > >> > Whether PF will manage SFs of the VFs or it will be done nested
> >> > >> > manner by
> >> > >> VF etc is a completely different discussion than what is being proposed here.
> >> > >> > Whether PF will manage the SF is yet another big question. We work
> >> > >> > with
> >> > >> users and they dislike this.
> >> > >> > To address it, some OS has a different management interface (not
> >> > >> > visible to
> >> > >> PF) for SF life cycle even though SFs are anchored on a PF.
> >> > >> >
> >> > >> > So SF/iov extension discussion has long way to go for community to
> >> > >> > first
> >> > >> understand the use cases before crafting some extension.
> >> > >> >
> >> > >> > So lets not complicate and mix things here for a blurry definition
> >> > >> > of scalable
> >> > >> iov/sf extension.
> >> > >>
> >> > >> Some reserved bytes won't hurt. 2 bytes for type gives us 64k types,
> >> > >> sounds like that should be enough.
> >> > > It doesn't stop there.
> >> > > Mentioning some destination type, interrupt type, etc also requires reserving
> >> > bytes for different device id type, interrupt type and more.
> >> > > We past this stage long ago after discussing this in v1 at [1].
> >> > > It is just better and cleaner to define a different structure to describe SF/iov
> >> > and its configuration.
> >> > 
> >> > I have the feeling that we might be overcomplicating this. We have some
> >> > groups of targets (a device, a group, that more complicated SF thingy), and we
> >> > want to distinguish between them. That's easy enough to cover via some kind of
> >> > enum-equivalent (0 == same dev, 1 == target a dev id, 2 == target a group id, 3
> >> > == multi-layer target) and some spec how 1 and 2 should look like (as I'd expect
> >> > them to be common for many different things). 
> >> Do we have a concrete example of a command that can be targeted for same device and a target device, which requires differentiating their destination? If so, lets discuss and then it make sense to add for the well-defined use case.
> >
> > So e.g. things like controlling NIC's MAC can reasonably be part of
> > the same device.
> 
> Yes, that would be an example.
> 
> I might have been a bit too vague about what I had been thinking
> about. Let's do a sketch (intentionally without concrete sizes):
> 
> +-------------------------------------------------------+
> | command                                               |
> +-------------------------------------------------------+
> | target type (0 - self, 1 - dev id, 2 - group id, ...  |
> +-------------------------------------------------------+
> | dev id                                                |
> +-------------------------------------------------------+
> | group id                                              |
> +-------------------------------------------------------+
> | command-specific data                                 |
> +-------------------------------------------------------+
> | response part                                         |
> +-------------------------------------------------------+
> 
> 'dev id' would be valid for 'target type' == 1, 'group id' would be
> valid for 'target type' == 2. Alternatively, 'dev id' and 'group id'
> could be a single 'target id' field; if there's nothing better to use,
> it can simply contain a uuid.

I am not sure why do we have both dev id and group id.
They are never used together right?
Maybe just have an id length field if we can't agree on
how much space to reserve.

> >
> >> > The SF thingy can be covered
> >> > by 3, and we'll probably want to make that one command-specific, as the whole
> >> > setup does not have enough things we can standardize on.
> >> This comment has underlying assumption that there are nested layers. And that assumption may not be true at all.
> >> At least for iov extension of intel and SF of Mellanox (already open source for 1+ year) doesn't have the nested layers as today as far as I understand.
> >> > 
> >> > Does that make sense? This standardizes the simpler setups, and gives enough
> >> > flexibility for the more complicated ones. If we have another simpler setup in
> >> > the future, it can become type 4, 5, etc.
> >> I feel that without a use case we are over complicating the commands by introducing group/target id etc.
> >> 
> >> So better to first come up with a valid use case and a device that supports it, which needs a group.
> >> Otherwise a target id can be a long string of PCI device = 0000:03:00.0 or a BDF or a VF number or a VF of a different PCI PF or a SF number 32-bit or SF UUID or a VF on a remote DPU system or PCI device on transparent bridge or something else.
> >
> > Well PASID is IIRC just 20 bit on express. I find it unlikely that we'll
> > need more than 64 bit. Yes, it's hard to predict the future but just
> > doing 16 bit here seems frankly like a premature optimization. UUID for
> > a transient thing such as SF just seems unnecessary. 32 or 64 bit seem
> > both acceptable.
> 
> However, if we want to be able to accommodate targets we have no idea
> about yet (and that may have nothing to do with PCI at all), we should
> maybe standardize on something that can fit a uuid -- if all else fails,
> you can identify anything with a uuid.
> 
> 64 bits might be enough in practice, 32 bit seems a bit short.

Ok, 64 sounds reasonable, not too much.


> >
> >> Without knowing the grouping and a nonexistence device we shouldn't complicate the commands.
> >> 
> >> There are enough opcodes (64K) that can define new structure for more complex devices.
> >
> > I think you are asking for a bit much frankly, it's up to you to build
> > the interface. Just like with code, if the design does not feel robust
> > the bar is much higher even if one can not prove there's a locking
> > problem.  Same here, this interface design does not yet feel very robust
> > yet - so either we build it in a way that seems robust based
> > basically on our gut feeling, or actually spend time predicting and
> > addressing future use-cases to prove they can be addressed.
> > I dare say we've developed some intuition about what makes
> > an extensible interface and where things are likely to go here at the
> > TC, so I wouldn't discard all feedback as unnecessary complication even
> > if it does not always come with concrete use-case examples.
> 
> Indeed. I'm not saying that my ideas are the right way to go, but we
> definitely need something that (a) covers the use cases we already know
> about, and (b) can accommodate future use cases with some
> confidence. Now there's always the chance that we have a requirement
> down the road that's so odd that our attempts to make this extensible
> are not enough, but we really need to be able to cover at least things
> we can imagine. Especially as the use cases in (a) are very specific
> (essentially PCI-specific), we need to at least answer the question "how
> could this work for things that are not PCI?"
> 
> (And shoving everything into command-specific data seems too
> underspecified to me.)


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v3 1/4] Add virtio Admin virtqueue
  2022-02-08 15:11                   ` [virtio-dev] " Parav Pandit
  2022-02-08 15:18                     ` Cornelia Huck
@ 2022-02-08 15:28                     ` Michael S. Tsirkin
  2022-02-08 15:33                       ` Parav Pandit
  1 sibling, 1 reply; 43+ messages in thread
From: Michael S. Tsirkin @ 2022-02-08 15:28 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Cornelia Huck, Max Gurtovoy, virtio-comment, virtio-dev,
	jasowang, Shahaf Shuler, Oren Duer, stefanha

On Tue, Feb 08, 2022 at 03:11:28PM +0000, Parav Pandit wrote:
> 
> > From: Cornelia Huck <cohuck@redhat.com>
> > Sent: Tuesday, February 8, 2022 8:29 PM
> 
> > very specific (essentially PCI-specific), we need to at least answer the question
> > "how could this work for things that are not PCI?"
> Why would one want to execute PCI VF MSI-X configuration command for a non PCI device?

E.g. there were proposals to add msi support to virtio-mmio.

-- 
MST


^ permalink raw reply	[flat|nested] 43+ messages in thread

* [virtio-comment] Re: [PATCH v3 1/4] Add virtio Admin virtqueue
  2022-02-08 15:26                   ` Michael S. Tsirkin
@ 2022-02-08 15:32                     ` Cornelia Huck
  2022-02-08 15:35                     ` [virtio-dev] " Parav Pandit
  1 sibling, 0 replies; 43+ messages in thread
From: Cornelia Huck @ 2022-02-08 15:32 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Parav Pandit, Max Gurtovoy, virtio-comment, virtio-dev, jasowang,
	Shahaf Shuler, Oren Duer, stefanha

On Tue, Feb 08 2022, "Michael S. Tsirkin" <mst@redhat.com> wrote:

> On Tue, Feb 08, 2022 at 03:59:13PM +0100, Cornelia Huck wrote:
>> On Tue, Feb 08 2022, "Michael S. Tsirkin" <mst@redhat.com> wrote:
>> 
>> > On Tue, Feb 08, 2022 at 01:32:12PM +0000, Parav Pandit wrote:
>> >> 
>> >> > From: Cornelia Huck <cohuck@redhat.com>
>> >> > Sent: Tuesday, February 8, 2022 6:50 PM
>> >> > 
>> >> > On Tue, Feb 08 2022, Parav Pandit <parav@nvidia.com> wrote:
>> >> > 
>> >> > >> From: Michael S. Tsirkin <mst@redhat.com>
>> >> > >> Sent: Tuesday, February 8, 2022 12:13 PM
>> >> > >
>> >> > >> On Tue, Feb 08, 2022 at 06:25:41AM +0000, Parav Pandit wrote:
>> >> > >> >
>> >> > >> > > From: Michael S. Tsirkin <mst@redhat.com>
>> >> > >> > > Sent: Monday, February 7, 2022 4:09 PM
>> >> > >> > >
>> >> > >> > > Next, trying to think about scalable iov extensions. So we will
>> >> > >> > > have groups of VFs and then SFs as the next level.
>> >> > >> > > How does one differentiate between the two?
>> >> > >> > > Maybe reserve a field for "destination type"?
>> >> > >> > >
>> >> > >> > We already discussed this in v2.
>> >> > >> > SF will have different identification than 16-bits. And no one
>> >> > >> > knows what
>> >> > >> that would be.
>> >> > >> > We just cannot reserve some arbitrary bytes for unknown.
>> >> > >> > You suggested in v2 to reserve 4 bytes for sf_id, and I explained
>> >> > >> > you that 4
>> >> > >> bytes may not be enough.
>> >> > >> >
>> >> > >> > Whether SFs are on top of VFs or SFs are on top of PFs or both is
>> >> > >> > completely
>> >> > >> different spec.
>> >> > >> > Whether PF will manage SFs of the VFs or it will be done nested
>> >> > >> > manner by
>> >> > >> VF etc is a completely different discussion than what is being proposed here.
>> >> > >> > Whether PF will manage the SF is yet another big question. We work
>> >> > >> > with
>> >> > >> users and they dislike this.
>> >> > >> > To address it, some OS has a different management interface (not
>> >> > >> > visible to
>> >> > >> PF) for SF life cycle even though SFs are anchored on a PF.
>> >> > >> >
>> >> > >> > So SF/iov extension discussion has long way to go for community to
>> >> > >> > first
>> >> > >> understand the use cases before crafting some extension.
>> >> > >> >
>> >> > >> > So lets not complicate and mix things here for a blurry definition
>> >> > >> > of scalable
>> >> > >> iov/sf extension.
>> >> > >>
>> >> > >> Some reserved bytes won't hurt. 2 bytes for type gives us 64k types,
>> >> > >> sounds like that should be enough.
>> >> > > It doesn't stop there.
>> >> > > Mentioning some destination type, interrupt type, etc also requires reserving
>> >> > bytes for different device id type, interrupt type and more.
>> >> > > We past this stage long ago after discussing this in v1 at [1].
>> >> > > It is just better and cleaner to define a different structure to describe SF/iov
>> >> > and its configuration.
>> >> > 
>> >> > I have the feeling that we might be overcomplicating this. We have some
>> >> > groups of targets (a device, a group, that more complicated SF thingy), and we
>> >> > want to distinguish between them. That's easy enough to cover via some kind of
>> >> > enum-equivalent (0 == same dev, 1 == target a dev id, 2 == target a group id, 3
>> >> > == multi-layer target) and some spec how 1 and 2 should look like (as I'd expect
>> >> > them to be common for many different things). 
>> >> Do we have a concrete example of a command that can be targeted for same device and a target device, which requires differentiating their destination? If so, lets discuss and then it make sense to add for the well-defined use case.
>> >
>> > So e.g. things like controlling NIC's MAC can reasonably be part of
>> > the same device.
>> 
>> Yes, that would be an example.
>> 
>> I might have been a bit too vague about what I had been thinking
>> about. Let's do a sketch (intentionally without concrete sizes):
>> 
>> +-------------------------------------------------------+
>> | command                                               |
>> +-------------------------------------------------------+
>> | target type (0 - self, 1 - dev id, 2 - group id, ...  |
>> +-------------------------------------------------------+
>> | dev id                                                |
>> +-------------------------------------------------------+
>> | group id                                              |
>> +-------------------------------------------------------+
>> | command-specific data                                 |
>> +-------------------------------------------------------+
>> | response part                                         |
>> +-------------------------------------------------------+
>> 
>> 'dev id' would be valid for 'target type' == 1, 'group id' would be
>> valid for 'target type' == 2. Alternatively, 'dev id' and 'group id'
>> could be a single 'target id' field; if there's nothing better to use,
>> it can simply contain a uuid.
>
> I am not sure why do we have both dev id and group id.
> They are never used together right?

Right, we can certainly use a single field for both.

> Maybe just have an id length field if we can't agree on
> how much space to reserve.

If we think that 64 bit should be able to accommodate everything, I'd
say just go with that, no need to make it overly complicated.


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 43+ messages in thread

* RE: [PATCH v3 1/4] Add virtio Admin virtqueue
  2022-02-08 15:28                     ` Michael S. Tsirkin
@ 2022-02-08 15:33                       ` Parav Pandit
  2022-02-08 15:36                         ` Michael S. Tsirkin
  0 siblings, 1 reply; 43+ messages in thread
From: Parav Pandit @ 2022-02-08 15:33 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Cornelia Huck, Max Gurtovoy, virtio-comment, virtio-dev,
	jasowang, Shahaf Shuler, Oren Duer, stefanha


> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Tuesday, February 8, 2022 8:58 PM
> On Tue, Feb 08, 2022 at 03:11:28PM +0000, Parav Pandit wrote:
> >
> > > From: Cornelia Huck <cohuck@redhat.com>
> > > Sent: Tuesday, February 8, 2022 8:29 PM
> >
> > > very specific (essentially PCI-specific), we need to at least answer
> > > the question "how could this work for things that are not PCI?"
> > Why would one want to execute PCI VF MSI-X configuration command for a
> non PCI device?
> 
> E.g. there were proposals to add msi support to virtio-mmio.

What prevents mmio specific command say mmio_dev_interrupt_config()?
Which fields do overlap with msix config of pci vf?
How is the mmio device identified?


^ permalink raw reply	[flat|nested] 43+ messages in thread

* [virtio-dev] RE: [PATCH v3 1/4] Add virtio Admin virtqueue
  2022-02-08 15:26                   ` Michael S. Tsirkin
  2022-02-08 15:32                     ` [virtio-comment] " Cornelia Huck
@ 2022-02-08 15:35                     ` Parav Pandit
  2022-02-08 15:37                       ` Michael S. Tsirkin
  1 sibling, 1 reply; 43+ messages in thread
From: Parav Pandit @ 2022-02-08 15:35 UTC (permalink / raw)
  To: Michael S. Tsirkin, Cornelia Huck
  Cc: Max Gurtovoy, virtio-comment, virtio-dev, jasowang,
	Shahaf Shuler, Oren Duer, stefanha


> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Tuesday, February 8, 2022 8:56 PM

> > I might have been a bit too vague about what I had been thinking
> > about. Let's do a sketch (intentionally without concrete sizes):
> >
> > +-------------------------------------------------------+
> > | command                                               |
> > +-------------------------------------------------------+
> > | target type (0 - self, 1 - dev id, 2 - group id, ...  |
> > +-------------------------------------------------------+
> > | dev id                                                |
> > +-------------------------------------------------------+
> > | group id                                              |
> > +-------------------------------------------------------+
> > | command-specific data                                 |
> > +-------------------------------------------------------+
> > | response part                                         |
> > +-------------------------------------------------------+
> >
> > 'dev id' would be valid for 'target type' == 1, 'group id' would be
> > valid for 'target type' == 2. Alternatively, 'dev id' and 'group id'
> > could be a single 'target id' field; if there's nothing better to use,
> > it can simply contain a uuid.
> 
> I am not sure why do we have both dev id and group id.
> They are never used together right?
> Maybe just have an id length field if we can't agree on how much space to
> reserve.
This is what I propose in a previous email.
A device id can be duplicate in different groups.
So to build hierarchy group id will be desired.

So id[] array can contain nested one or multiple fields.

---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v3 1/4] Add virtio Admin virtqueue
  2022-02-08 15:33                       ` Parav Pandit
@ 2022-02-08 15:36                         ` Michael S. Tsirkin
  0 siblings, 0 replies; 43+ messages in thread
From: Michael S. Tsirkin @ 2022-02-08 15:36 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Cornelia Huck, Max Gurtovoy, virtio-comment, virtio-dev,
	jasowang, Shahaf Shuler, Oren Duer, stefanha

On Tue, Feb 08, 2022 at 03:33:36PM +0000, Parav Pandit wrote:
> 
> > From: Michael S. Tsirkin <mst@redhat.com>
> > Sent: Tuesday, February 8, 2022 8:58 PM
> > On Tue, Feb 08, 2022 at 03:11:28PM +0000, Parav Pandit wrote:
> > >
> > > > From: Cornelia Huck <cohuck@redhat.com>
> > > > Sent: Tuesday, February 8, 2022 8:29 PM
> > >
> > > > very specific (essentially PCI-specific), we need to at least answer
> > > > the question "how could this work for things that are not PCI?"
> > > Why would one want to execute PCI VF MSI-X configuration command for a
> > non PCI device?
> > 
> > E.g. there were proposals to add msi support to virtio-mmio.
> 
> What prevents mmio specific command say mmio_dev_interrupt_config()?
> Which fields do overlap with msix config of pci vf?
> How is the mmio device identified?

if mmio starts supporting msi-x then # of vectors can reasonably be
specified in exactly the same way.

-- 
MST


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v3 1/4] Add virtio Admin virtqueue
  2022-02-08 15:35                     ` [virtio-dev] " Parav Pandit
@ 2022-02-08 15:37                       ` Michael S. Tsirkin
  2022-02-08 15:48                         ` Parav Pandit
  0 siblings, 1 reply; 43+ messages in thread
From: Michael S. Tsirkin @ 2022-02-08 15:37 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Cornelia Huck, Max Gurtovoy, virtio-comment, virtio-dev,
	jasowang, Shahaf Shuler, Oren Duer, stefanha

On Tue, Feb 08, 2022 at 03:35:58PM +0000, Parav Pandit wrote:
> 
> > From: Michael S. Tsirkin <mst@redhat.com>
> > Sent: Tuesday, February 8, 2022 8:56 PM
> 
> > > I might have been a bit too vague about what I had been thinking
> > > about. Let's do a sketch (intentionally without concrete sizes):
> > >
> > > +-------------------------------------------------------+
> > > | command                                               |
> > > +-------------------------------------------------------+
> > > | target type (0 - self, 1 - dev id, 2 - group id, ...  |
> > > +-------------------------------------------------------+
> > > | dev id                                                |
> > > +-------------------------------------------------------+
> > > | group id                                              |
> > > +-------------------------------------------------------+
> > > | command-specific data                                 |
> > > +-------------------------------------------------------+
> > > | response part                                         |
> > > +-------------------------------------------------------+
> > >
> > > 'dev id' would be valid for 'target type' == 1, 'group id' would be
> > > valid for 'target type' == 2. Alternatively, 'dev id' and 'group id'
> > > could be a single 'target id' field; if there's nothing better to use,
> > > it can simply contain a uuid.
> > 
> > I am not sure why do we have both dev id and group id.
> > They are never used together right?
> > Maybe just have an id length field if we can't agree on how much space to
> > reserve.
> This is what I propose in a previous email.
> A device id can be duplicate in different groups.
> So to build hierarchy group id will be desired.
> 
> So id[] array can contain nested one or multiple fields.

I guess nesting is for when there's like an SF within a VF?

-- 
MST


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v3 1/4] Add virtio Admin virtqueue
  2022-02-08 15:06                 ` Parav Pandit
@ 2022-02-08 15:39                   ` Michael S. Tsirkin
  2022-02-08 18:52                     ` Parav Pandit
  0 siblings, 1 reply; 43+ messages in thread
From: Michael S. Tsirkin @ 2022-02-08 15:39 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Cornelia Huck, Max Gurtovoy, virtio-comment, virtio-dev,
	jasowang, Shahaf Shuler, Oren Duer, stefanha

On Tue, Feb 08, 2022 at 03:06:16PM +0000, Parav Pandit wrote:
> 
> > From: Michael S. Tsirkin <mst@redhat.com>
> > Sent: Tuesday, February 8, 2022 7:29 PM
> > 
> > > Do we have a concrete example of a command that can be targeted for same
> > device and a target device, which requires differentiating their destination? If
> > so, lets discuss and then it make sense to add for the well-defined use case.
> > 
> > So e.g. things like controlling NIC's MAC can reasonably be part of the same
> > device.
> A mac address of NIC can be programmed via the existing control VQ for the self.

Not if it's disabled for the guest.

> Lets come up with some other example.

Go ahead.

> > > So better to first come up with a valid use case and a device that supports it,
> > which needs a group.
> > > Otherwise a target id can be a long string of PCI device = 0000:03:00.0 or a
> > BDF or a VF number or a VF of a different PCI PF or a SF number 32-bit or SF
> > UUID or a VF on a remote DPU system or PCI device on transparent bridge or
> > something else.
> > 
> > Well PASID is IIRC just 20 bit on express. 
> There are off line discussion and some on the mailing list too (I don't have a link to few months/year old email),
> That PASID is being overloaded to identify the SF and process both. And I tend to agree to it.
> 
> So I won't be surprised if a new SF function id takes different route than PASID.
> More below.
> 
> > I find it unlikely that we'll need more
> > than 64 bit. Yes, it's hard to predict the future but just doing 16 bit here seems
> > frankly like a premature optimization. UUID for a transient thing such as SF just
> > seems unnecessary. 32 or 64 bit seem both acceptable.
> > 
> > > Without knowing the grouping and a nonexistence device we shouldn't
> > complicate the commands.
> > >
> > > There are enough opcodes (64K) that can define new structure for more
> > complex devices.
> > 
> > I think you are asking for a bit much frankly, it's up to you to build the interface.
> I likely didn't understand above point.
> 
> > Just like with code, if the design does not feel robust the bar is much higher
> > even if one can not prove there's a locking problem. 
> 
> > Same here, this interface
> > design does not yet feel very robust yet - so either we build it in a way that
> > seems robust based basically on our gut feeling, or actually spend time
> > predicting and addressing future use-cases to prove they can be addressed.
> 
> > I dare say we've developed some intuition about what makes an extensible
> > interface and where things are likely to go here at the TC, so I wouldn't discard
> > all feedback as unnecessary complication even if it does not always come with
> > concrete use-case examples.
> I do not doubt your intuition. 
> I am open to feedback, but we haven't really established at least single explicit grouping example and ask is to add some arbitrary reserved bytes for it.
> 
> >From my DPU experience, over last 3 years, we have built SF and VF device on parent PF-A, and their management interface on parent PF-B.
> On top of that it is expected to be uniquely indefinable in multi-host env, that brings the notion of multiple controllers by a single management device.
> And also make it work uniformly with same interface when parent and management device are same.
> You can see an extendible interface at [1].
> 
> With this base line of [1], I do not agree that defining a u32 device_identifier (to contain 20-bit PASID) will be sufficient in future.
> 

OK, and Cornelia also said she thinks 64 is necessary.

> So if we really want to cover variety of cases like [1] and some more complex nested cases, we better define,
> Device identifier as below,
> struct device_identifier {
> 	u8 id_length;
> 	u8 id[];	/* variable length field
> };
> For implicitly grouped VFs of a PF, id can be 2 bytes.
> For more advance cases it can be a structure consist one or more combination of (a) host id or controller id (b) PF BDF, (c) sf id (d) PASID and more.
> 
> [1] https://www.kernel.org/doc/Documentation/networking/devlink/devlink-port.rst

I'm fine with this too.

-- 
MST


^ permalink raw reply	[flat|nested] 43+ messages in thread

* RE: [PATCH v3 1/4] Add virtio Admin virtqueue
  2022-02-08 15:37                       ` Michael S. Tsirkin
@ 2022-02-08 15:48                         ` Parav Pandit
  2022-02-08 21:02                           ` [virtio-comment] " Michael S. Tsirkin
  0 siblings, 1 reply; 43+ messages in thread
From: Parav Pandit @ 2022-02-08 15:48 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Cornelia Huck, Max Gurtovoy, virtio-comment, virtio-dev,
	jasowang, Shahaf Shuler, Oren Duer, stefanha


> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Tuesday, February 8, 2022 9:08 PM


> On Tue, Feb 08, 2022 at 03:35:58PM +0000, Parav Pandit wrote:
> >
> > > From: Michael S. Tsirkin <mst@redhat.com>
> > > Sent: Tuesday, February 8, 2022 8:56 PM
> >
> > > > I might have been a bit too vague about what I had been thinking
> > > > about. Let's do a sketch (intentionally without concrete sizes):
> > > >
> > > > +-------------------------------------------------------+
> > > > | command                                               |
> > > > +-------------------------------------------------------+
> > > > | target type (0 - self, 1 - dev id, 2 - group id, ...  |
> > > > +-------------------------------------------------------+
> > > > | dev id                                                |
> > > > +-------------------------------------------------------+
> > > > | group id                                              |
> > > > +-------------------------------------------------------+
> > > > | command-specific data                                 |
> > > > +-------------------------------------------------------+
> > > > | response part                                         |
> > > > +-------------------------------------------------------+
> > > >
> > > > 'dev id' would be valid for 'target type' == 1, 'group id' would
> > > > be valid for 'target type' == 2. Alternatively, 'dev id' and 'group id'
> > > > could be a single 'target id' field; if there's nothing better to
> > > > use, it can simply contain a uuid.
> > >
> > > I am not sure why do we have both dev id and group id.
> > > They are never used together right?
> > > Maybe just have an id length field if we can't agree on how much
> > > space to reserve.
> > This is what I propose in a previous email.
> > A device id can be duplicate in different groups.
> > So to build hierarchy group id will be desired.
> >
> > So id[] array can contain nested one or multiple fields.
> 
> I guess nesting is for when there's like an SF within a VF?
That is one case, but I imagine that if SF is on top of VF, than VF will directly manage it. PF should not get involved in nested management.

Nesting is for the case, where there is explicit group. A simple example is,
A NIC has two PCI functions, both support SR-IOV and VFs.
But only one PF is managing the VFs for msix, mac etc configuration.
In this case there are two groups g0, g1.
grp0 belongs to PF0.
grp1 belongs to PF1.

So a id definition looks like below.

struct pf_identifier {
	u8 device; /* pci device of bdf */
	u8 function;	/* pci function of bdf */
} __attribute__(packed);

struct pci_vf_grouped_id {
	struct pf_identifier pf_id;
	u16 vf_num;
};

struct device_identfier {
	u8 id_type;	/* below id type enum */
	u16 id_len;
	union {
		u8 raw[0];
		struct pci_vf_grouped_id gvf_id;
	} id_data;	
};

enum id_type {
	VIRTIO_PCI_VF = 0,	/* implicitly grouped VF */
	VIRTIO_PCI_REMOTE_VF = 1, /* part of hirechical group */
	VIRTIO_SF,
	VIRTIO_PCI_REMOTE_SF, /* part of hirechical group */
	VIRTIO_MMIO,
	/* more */
};


^ permalink raw reply	[flat|nested] 43+ messages in thread

* RE: [PATCH v3 1/4] Add virtio Admin virtqueue
  2022-02-08 15:39                   ` Michael S. Tsirkin
@ 2022-02-08 18:52                     ` Parav Pandit
  2022-02-08 21:00                       ` Michael S. Tsirkin
  0 siblings, 1 reply; 43+ messages in thread
From: Parav Pandit @ 2022-02-08 18:52 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Cornelia Huck, Max Gurtovoy, virtio-comment, virtio-dev,
	jasowang, Shahaf Shuler, Oren Duer, stefanha


> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Tuesday, February 8, 2022 9:10 PM
> 
> On Tue, Feb 08, 2022 at 03:06:16PM +0000, Parav Pandit wrote:
> >
> > > From: Michael S. Tsirkin <mst@redhat.com>
> > > Sent: Tuesday, February 8, 2022 7:29 PM
> > >
> > > > Do we have a concrete example of a command that can be targeted
> > > > for same
> > > device and a target device, which requires differentiating their
> > > destination? If so, lets discuss and then it make sense to add for the well-
> defined use case.
> > >
> > > So e.g. things like controlling NIC's MAC can reasonably be part of
> > > the same device.
> > A mac address of NIC can be programmed via the existing control VQ for the
> self.
> 
> Not if it's disabled for the guest.
> 
Its unrelated.
The idea was to issue same command in same way by two devices = primary and secondary.
And in case of primary, it will refer to secondary device.
And in second case secondary tells to self.

So if its disabled by guest, it doesn't matter how guest transports it, either via CVQ or AQ.
Its disabled.
Why to re-invent command that exists on CVQ to AQ?

> OK, and Cornelia also said she thinks 64 is necessary.
> 
> > So if we really want to cover variety of cases like [1] and some more
> > complex nested cases, we better define, Device identifier as below,
> > struct device_identifier {
> > 	u8 id_length;
> > 	u8 id[];	/* variable length field
> > };
> > For implicitly grouped VFs of a PF, id can be 2 bytes.
> > For more advance cases it can be a structure consist one or more
> combination of (a) host id or controller id (b) PF BDF, (c) sf id (d) PASID and
> more.
> >
> > [1]
> > https://www.kernel.org/doc/Documentation/networking/devlink/devlink-po
> > rt.rst
> 
> I'm fine with this too.
>
Since we aim for future proofing and flexibility, variable length id is better than constant u64. It covers the u64 case anyway.
 
> --
> MST


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v3 1/4] Add virtio Admin virtqueue
  2022-02-08 18:52                     ` Parav Pandit
@ 2022-02-08 21:00                       ` Michael S. Tsirkin
  0 siblings, 0 replies; 43+ messages in thread
From: Michael S. Tsirkin @ 2022-02-08 21:00 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Cornelia Huck, Max Gurtovoy, virtio-comment, virtio-dev,
	jasowang, Shahaf Shuler, Oren Duer, stefanha

On Tue, Feb 08, 2022 at 06:52:09PM +0000, Parav Pandit wrote:
> 
> > From: Michael S. Tsirkin <mst@redhat.com>
> > Sent: Tuesday, February 8, 2022 9:10 PM
> > 
> > On Tue, Feb 08, 2022 at 03:06:16PM +0000, Parav Pandit wrote:
> > >
> > > > From: Michael S. Tsirkin <mst@redhat.com>
> > > > Sent: Tuesday, February 8, 2022 7:29 PM
> > > >
> > > > > Do we have a concrete example of a command that can be targeted
> > > > > for same
> > > > device and a target device, which requires differentiating their
> > > > destination? If so, lets discuss and then it make sense to add for the well-
> > defined use case.
> > > >
> > > > So e.g. things like controlling NIC's MAC can reasonably be part of
> > > > the same device.
> > > A mac address of NIC can be programmed via the existing control VQ for the
> > self.
> > 
> > Not if it's disabled for the guest.
> > 
> Its unrelated.
> The idea was to issue same command in same way by two devices = primary and secondary.
> And in case of primary, it will refer to secondary device.
> And in second case secondary tells to self.
> 
> So if its disabled by guest, it doesn't matter how guest transports it, either via CVQ or AQ.
> Its disabled.
> Why to re-invent command that exists on CVQ to AQ?

I can go into it but it's beside the point.  I was just trying to help
you come up with use-cases.  Don't like it - come up with your own ones.


> > OK, and Cornelia also said she thinks 64 is necessary.
> > 
> > > So if we really want to cover variety of cases like [1] and some more
> > > complex nested cases, we better define, Device identifier as below,
> > > struct device_identifier {
> > > 	u8 id_length;
> > > 	u8 id[];	/* variable length field
> > > };
> > > For implicitly grouped VFs of a PF, id can be 2 bytes.
> > > For more advance cases it can be a structure consist one or more
> > combination of (a) host id or controller id (b) PF BDF, (c) sf id (d) PASID and
> > more.
> > >
> > > [1]
> > > https://www.kernel.org/doc/Documentation/networking/devlink/devlink-po
> > > rt.rst
> > 
> > I'm fine with this too.
> >
> Since we aim for future proofing and flexibility, variable length id is better than constant u64. It covers the u64 case anyway.
>  
> > --
> > MST


^ permalink raw reply	[flat|nested] 43+ messages in thread

* [virtio-comment] Re: [PATCH v3 1/4] Add virtio Admin virtqueue
  2022-02-08 15:48                         ` Parav Pandit
@ 2022-02-08 21:02                           ` Michael S. Tsirkin
  0 siblings, 0 replies; 43+ messages in thread
From: Michael S. Tsirkin @ 2022-02-08 21:02 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Cornelia Huck, Max Gurtovoy, virtio-comment, virtio-dev,
	jasowang, Shahaf Shuler, Oren Duer, stefanha

On Tue, Feb 08, 2022 at 03:48:42PM +0000, Parav Pandit wrote:
> 
> > From: Michael S. Tsirkin <mst@redhat.com>
> > Sent: Tuesday, February 8, 2022 9:08 PM
> 
> 
> > On Tue, Feb 08, 2022 at 03:35:58PM +0000, Parav Pandit wrote:
> > >
> > > > From: Michael S. Tsirkin <mst@redhat.com>
> > > > Sent: Tuesday, February 8, 2022 8:56 PM
> > >
> > > > > I might have been a bit too vague about what I had been thinking
> > > > > about. Let's do a sketch (intentionally without concrete sizes):
> > > > >
> > > > > +-------------------------------------------------------+
> > > > > | command                                               |
> > > > > +-------------------------------------------------------+
> > > > > | target type (0 - self, 1 - dev id, 2 - group id, ...  |
> > > > > +-------------------------------------------------------+
> > > > > | dev id                                                |
> > > > > +-------------------------------------------------------+
> > > > > | group id                                              |
> > > > > +-------------------------------------------------------+
> > > > > | command-specific data                                 |
> > > > > +-------------------------------------------------------+
> > > > > | response part                                         |
> > > > > +-------------------------------------------------------+
> > > > >
> > > > > 'dev id' would be valid for 'target type' == 1, 'group id' would
> > > > > be valid for 'target type' == 2. Alternatively, 'dev id' and 'group id'
> > > > > could be a single 'target id' field; if there's nothing better to
> > > > > use, it can simply contain a uuid.
> > > >
> > > > I am not sure why do we have both dev id and group id.
> > > > They are never used together right?
> > > > Maybe just have an id length field if we can't agree on how much
> > > > space to reserve.
> > > This is what I propose in a previous email.
> > > A device id can be duplicate in different groups.
> > > So to build hierarchy group id will be desired.
> > >
> > > So id[] array can contain nested one or multiple fields.
> > 
> > I guess nesting is for when there's like an SF within a VF?
> That is one case, but I imagine that if SF is on top of VF, than VF will directly manage it. PF should not get involved in nested management.
> 
> Nesting is for the case, where there is explicit group. A simple example is,
> A NIC has two PCI functions, both support SR-IOV and VFs.
> But only one PF is managing the VFs for msix, mac etc configuration.

Or maybe that's the case of "it hurts when I do this - then don't do that".


> In this case there are two groups g0, g1.
> grp0 belongs to PF0.
> grp1 belongs to PF1.
> 
> So a id definition looks like below.
> 
> struct pf_identifier {
> 	u8 device; /* pci device of bdf */
> 	u8 function;	/* pci function of bdf */
> } __attribute__(packed);
> 
> struct pci_vf_grouped_id {
> 	struct pf_identifier pf_id;
> 	u16 vf_num;
> };
> 
> struct device_identfier {
> 	u8 id_type;	/* below id type enum */
> 	u16 id_len;
> 	union {
> 		u8 raw[0];
> 		struct pci_vf_grouped_id gvf_id;
> 	} id_data;	
> };
> 
> enum id_type {
> 	VIRTIO_PCI_VF = 0,	/* implicitly grouped VF */
> 	VIRTIO_PCI_REMOTE_VF = 1, /* part of hirechical group */
> 	VIRTIO_SF,
> 	VIRTIO_PCI_REMOTE_SF, /* part of hirechical group */
> 	VIRTIO_MMIO,
> 	/* more */
> };


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 43+ messages in thread

end of thread, other threads:[~2022-02-08 21:02 UTC | newest]

Thread overview: 43+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-02-03  7:57 [PATCH v3 0/4] VIRTIO: Provision maximum MSI-X vectors for a VF Max Gurtovoy
2022-02-03  7:57 ` [PATCH v3 1/4] Add virtio Admin virtqueue Max Gurtovoy
2022-02-03 13:09   ` [virtio-dev] " Cornelia Huck
2022-02-07 10:14     ` Max Gurtovoy
2022-02-07 10:28       ` Michael S. Tsirkin
2022-02-07 11:51         ` [virtio-dev] " Cornelia Huck
2022-02-07 14:34           ` Max Gurtovoy
2022-02-07 15:08             ` [virtio-comment] " Cornelia Huck
2022-02-07 16:19               ` Michael S. Tsirkin
2022-02-07 10:39   ` Michael S. Tsirkin
2022-02-07 14:58     ` Max Gurtovoy
2022-02-07 16:18       ` Michael S. Tsirkin
2022-02-08  0:41         ` Max Gurtovoy
2022-02-08  6:45           ` Michael S. Tsirkin
2022-02-08  8:34             ` Max Gurtovoy
2022-02-08 13:08               ` [virtio-dev] " Cornelia Huck
2022-02-08 13:20                 ` Parav Pandit
2022-02-08 14:04               ` Michael S. Tsirkin
2022-02-08  6:25     ` Parav Pandit
2022-02-08  6:42       ` Michael S. Tsirkin
2022-02-08  7:04         ` Parav Pandit
2022-02-08 13:19           ` [virtio-comment] " Cornelia Huck
2022-02-08 13:32             ` Parav Pandit
2022-02-08 13:58               ` Michael S. Tsirkin
2022-02-08 14:59                 ` [virtio-comment] " Cornelia Huck
2022-02-08 15:11                   ` [virtio-dev] " Parav Pandit
2022-02-08 15:18                     ` Cornelia Huck
2022-02-08 15:28                     ` Michael S. Tsirkin
2022-02-08 15:33                       ` Parav Pandit
2022-02-08 15:36                         ` Michael S. Tsirkin
2022-02-08 15:26                   ` Michael S. Tsirkin
2022-02-08 15:32                     ` [virtio-comment] " Cornelia Huck
2022-02-08 15:35                     ` [virtio-dev] " Parav Pandit
2022-02-08 15:37                       ` Michael S. Tsirkin
2022-02-08 15:48                         ` Parav Pandit
2022-02-08 21:02                           ` [virtio-comment] " Michael S. Tsirkin
2022-02-08 15:06                 ` Parav Pandit
2022-02-08 15:39                   ` Michael S. Tsirkin
2022-02-08 18:52                     ` Parav Pandit
2022-02-08 21:00                       ` Michael S. Tsirkin
2022-02-03  7:57 ` [PATCH v3 2/4] Add miscellaneous configuration structure for PCI Max Gurtovoy
2022-02-03  7:57 ` [PATCH v3 3/4] Add device management facility Max Gurtovoy
2022-02-03  7:57 ` [virtio-comment] [PATCH v3 4/4] Add support for MSI-X vectors configuration for PCI VFs Max Gurtovoy

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.