virtio-dev.lists.oasis-open.org archive mirror
* [virtio-dev] [PATCH 00/11] Introduce transitional mmr pci device
@ 2023-03-30 22:58 Parav Pandit
  2023-03-30 22:58 ` [virtio-dev] [PATCH 01/11] transport-pci: Use lowercase letters Parav Pandit
                   ` (15 more replies)
  0 siblings, 16 replies; 200+ messages in thread
From: Parav Pandit @ 2023-03-30 22:58 UTC (permalink / raw)
  To: mst, virtio-dev, cohuck; +Cc: virtio-comment, shahafs, Parav Pandit

Overview:
---------
The transitional MMR device is a variant of the transitional PCI
device. It has its own small Device ID range. It does not have an
I/O region BAR; instead it exposes the legacy configuration and
device-specific registers at an offset in the memory region BAR.

Such transitional MMR devices will be used at the scale of
thousands of devices using PCI SR-IOV and/or future scalable
virtualization technology, providing backward compatibility
(for legacy devices) as well as forward compatibility with
new features.

Use case:
---------
1. A hypervisor/system needs to provide transitional
   virtio devices to the guest VM at a scale of thousands,
   typically one to eight devices per VM.

2. A hypervisor/system needs to provide such devices using a
   vendor-agnostic driver in the hypervisor system.

3. A hypervisor system prefers to have a single stack regardless
   of the virtio device type (net/blk) and to be future compatible:
   a single vfio stack using SR-IOV or another scalable device
   virtualization technology to map PCI devices to the guest VM
   (as transitional devices or otherwise).

Motivation/Background:
----------------------
The existing transitional PCI device lacks support for
PCI SR-IOV based devices. In practice it does not work beyond
the PCI PF, other than as a software emulated device. It has
the system level limitations cited below:

[a] PCIe spec citation:
VFs do not support I/O Space and thus VF BARs shall not
indicate I/O Space.

[b] cpu arch citation:
Intel 64 and IA-32 Architectures Software Developer’s Manual:
The processor’s I/O address space is separate and distinct from
the physical-memory address space. The I/O address space consists
of 64K individually addressable 8-bit I/O ports, numbered 0 through FFFFH.

[c] PCIe spec citation:
If a bridge implements an I/O address range,...I/O address range
will be aligned to a 4 KB boundary.

[d] I/O region accesses at the PCI system level are slow, as they
are non-posted operations in the PCIe fabric.

The use case requirements and limitations above can be addressed
by extending the transitional device, mapping the legacy and
device-specific configuration registers in a memory PCI BAR
instead of using a non-composable I/O region.

Please review.

Patch summary:
--------------
patches 1 to 5 prepare the spec
patches 6 to 11 define the transitional mmr device

patch-1 uses lowercase letters to name the device id
patch-2 moves the transitional device id to the legacy section
        along with the revision id
patch-3 splits the legacy feature bits description from the device id
patch-4 renames and moves the legacy virtio config registers next
        to the 1.x registers section
patch-5 adds a missing helping verb in the terminology definitions
patch-6 introduces the transitional mmr device
patch-7 introduces the transitional mmr device pci device ids
patch-8 introduces the virtio extended pci capability
patch-9 describes the new pci capability to locate the legacy mmr
        registers
patch-10 extends the usage of the driver notification capability
         for the transitional mmr device
patch-11 adds the conformance section of the transitional mmr device

This design and its details are further described below.

Design:
-------
The picture below captures the main difference between the current
transitional PCI SR-IOV VF and the transitional MMR SR-IOV VF.

+------------------+ +--------------------+ +--------------------+
|virtio 1.x        | |Transitional        | |Transitional        |
|SRIOV VF          | |SRIOV VF            | |MMR SRIOV VF        |
|                  | |                    | |                    |
++---------------+ | ++---------------+   | ++---------------+   |
||dev_id =       | | ||dev_id =       |   | ||dev_id =       |   |
||{0x1040-0x106C}| | ||{0x1000-0x103f}|   | ||{0x10f9-0x10ff}|   |
|+---------------+ | |+---------------+   | |+---------------+   |
|                  | |                    | |                    |
|+------------+    | |+------------+      | |+-----------------+ |
||Memory BAR  |    | ||Memory BAR  |      | ||Memory BAR       | |
|+------------+    | |+------------+      | ||                 | |
|                  | |                    | || +--------------+| |
|                  | |+-----------------+ | || |legacy virtio || |
|                  | ||IOBAR impossible | | || |+ dev cfg     || |
|                  | |+-----------------+ | || |registers     || |
|                  | |                    | || +--------------+| |
|                  | |                    | |+-----------------+ |
+------------------+ +--------------------+ +--------------------+

Here the transitional MMR SR-IOV VF has the legacy configuration
and legacy device-specific registers located at an offset in the
memory region BAR.

A memory region can be dedicated at BAR0 or it can be in an
existing BAR, allowing flexibility when implementing support
in a hardware device.

Transitional MMR SR-IOV VFs use a device ID range distinct from
that of existing virtio SR-IOV VFs to allow flexibility in driver
binding.

A zoomed-in view of the transitional MMR SR-IOV device shows that
the location of the legacy registers is discovered by the driver
using a new capability.

+------------------------------+
|Transitional                  |
|MMR SRIOV VF                  |
|                              |
++---------------+             |
||dev_id =       |             |
||{0x10f9-0x10ff}|             |
|+---------------+             |
|                              |
++--------------------+        |
|| PCIe ext cap = 0xB |        |
|| cfg_type = 10      |        |
|| offset   = 0x1000  |        |
|| bar      = A {0..5}|        |
|+--|-----------------+        |
|   |                          |
|   |                          |
|   |    +-------------------+ |
|   |    | Memory BAR = A    | |
|   |    |                   | |
|   +------>+--------------+ | |
|        |  |legacy virtio | | |
|        |  |+ dev cfg     | | |
|        |  |registers     | | |
|        |  +--------------+ | |
|        +-----------------+ | |
+------------------------------+
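
For illustration, a host driver might discover this capability
roughly as below. This is a sketch only, not part of the patches:
pci_cfg_read32()/pci_cfg_read8() are hypothetical config space
accessors, and the header layout follows the struct
virtio_pcie_ext_cap proposed in patch 8.

#define PCI_EXT_CAP_START              0x100
#define PCI_EXT_CAP_ID_VNDR            0x0b
#define VIRTIO_PCI_CAP_LEGACY_MMR_CFG  10

static u16 find_legacy_mmr_cap(struct pci_dev *dev)
{
        u16 off = PCI_EXT_CAP_START;

        while (off) {
                u32 hdr = pci_cfg_read32(dev, off);

                /* extended capability id is in bits 15:0 */
                if ((hdr & 0xffff) == PCI_EXT_CAP_ID_VNDR &&
                    pci_cfg_read8(dev, off + 4) ==
                    VIRTIO_PCI_CAP_LEGACY_MMR_CFG)
                        return off;          /* capability found */
                off = (hdr >> 18) & 0x3fff;  /* next_cap_offset */
        }
        return 0;                            /* not present */
}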

Software usage:
---------------
The transitional MMR device can be used in multiple ways.

1. The most common way to use and map it to the guest VM is by
using the vfio driver framework in the Linux kernel (see the
sketch after this list).

                +----------------------+
                |pci_dev_id = 0x100X   |
+---------------|pci_rev_id = 0x0      |-----+
|vfio device    |BAR0 = I/O region     |     |
|               |Other attributes      |     |
|               +----------------------+     |
|                                            |
+   +--------------+     +-----------------+ |
|   |I/O to memory |     | Other vfio      | |
|   |rd/wr mapper  |     | functionalities | |
|   +--------------+     +-----------------+ |
|                                            |
+-------------------+------------------------+
                    |
       +------------+-----------------+
       |         Transitional         |
       |         MMR SRIOV VF         |
       +------------------------------+

2. A virtio PCI driver can bind to the listed device id and
   use it as a native device in the host.

3. Use it in a lightweight hypervisor to run a bare-metal OS.
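
A rough sketch of the "I/O to memory rd/wr mapper" box above
(illustrative only; writeb()/writew()/writel() are the usual Linux
MMIO accessors, the rest of the names are made up): the hypervisor
traps the guest's write to the emulated I/O BAR0 and forwards it
to the device's memory mapped legacy registers.

static void emulate_legacy_io_write(void __iomem *legacy_base,
                                    u32 port_off, u32 val, int size)
{
        /* legacy_base = host mapping of the memory BAR plus the
         * offset advertised by the device's capability
         */
        switch (size) {
        case 1: writeb(val, legacy_base + port_off); break;
        case 2: writew(val, legacy_base + port_off); break;
        case 4: writel(val, legacy_base + port_off); break;
        }
}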

Parav Pandit (11):
  transport-pci: Use lowercase letters
  transport-pci: Move transitional device id to legacy section
  transport-pci: Split notes of PCI Device Layout
  transport-pci: Rename and move legacy PCI Device layout section
  introduction: Add missing helping verb
  introduction: Introduce transitional MMR interface
  transport-pci: Introduce transitional MMR device id
  transport-pci: Introduce virtio extended capability
  transport-pci: Describe PCI MMR dev config registers
  transport-pci: Use driver notification PCI capability
  conformance: Add transitional MMR interface conformance

 conformance.tex      |  11 +-
 introduction.tex     |  34 +++-
 tmmr-conformance.tex |  27 +++
 transport-pci.tex    | 405 ++++++++++++++++++++++++++++++-------------
 4 files changed, 354 insertions(+), 123 deletions(-)
 create mode 100644 tmmr-conformance.tex

-- 
2.26.2



* [virtio-dev] [PATCH 01/11] transport-pci: Use lowercase letters
  2023-03-30 22:58 [virtio-dev] [PATCH 00/11] Introduce transitional mmr pci device Parav Pandit
@ 2023-03-30 22:58 ` Parav Pandit
  2023-03-30 22:58 ` [virtio-dev] [PATCH 02/11] transport-pci: Move transitional device id to legacy section Parav Pandit
                   ` (14 subsequent siblings)
  15 siblings, 0 replies; 200+ messages in thread
From: Parav Pandit @ 2023-03-30 22:58 UTC (permalink / raw)
  To: mst, virtio-dev, cohuck; +Cc: virtio-comment, shahafs, Parav Pandit

Uniformly use lowercase letters to write the PCI vendor id and device id.

Signed-off-by: Parav Pandit <parav@nvidia.com>
---
 transport-pci.tex | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/transport-pci.tex b/transport-pci.tex
index 85b2dae..7f27107 100644
--- a/transport-pci.tex
+++ b/transport-pci.tex
@@ -18,17 +18,17 @@ \section{Virtio Over PCI Bus}\label{sec:Virtio Transport Options / Virtio Over P
 
 \subsection{PCI Device Discovery}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI Device Discovery}
 
-Any PCI device with PCI Vendor ID 0x1AF4, and PCI Device ID 0x1000 through
-0x107F inclusive is a virtio device. The actual value within this range
+Any PCI device with PCI Vendor ID 0x1af4, and PCI Device ID 0x1000 through
+0x107f inclusive is a virtio device. The actual value within this range
 indicates which virtio device is supported by the device.
 The PCI Device ID is calculated by adding 0x1040 to the Virtio Device ID,
 as indicated in section \ref{sec:Device Types}.
 Additionally, devices MAY utilize a Transitional PCI Device ID range,
-0x1000 to 0x103F depending on the device type.
+0x1000 to 0x103f depending on the device type.
 
 \devicenormative{\subsubsection}{PCI Device Discovery}{Virtio Transport Options / Virtio Over PCI Bus / PCI Device Discovery}
 
-Devices MUST have the PCI Vendor ID 0x1AF4.
+Devices MUST have the PCI Vendor ID 0x1af4.
 Devices MUST either have the PCI Device ID calculated by adding 0x1040
 to the Virtio Device ID, as indicated in section \ref{sec:Device
 Types} or have the Transitional PCI Device ID depending on the device type,
@@ -69,13 +69,13 @@ \subsection{PCI Device Discovery}\label{sec:Virtio Transport Options / Virtio Ov
 to drive the device.
 
 \drivernormative{\subsubsection}{PCI Device Discovery}{Virtio Transport Options / Virtio Over PCI Bus / PCI Device Discovery}
-Drivers MUST match devices with the PCI Vendor ID 0x1AF4 and
+Drivers MUST match devices with the PCI Vendor ID 0x1af4 and
 the PCI Device ID in the range 0x1040 to 0x107f,
 calculated by adding 0x1040 to the Virtio Device ID,
 as indicated in section \ref{sec:Device Types}.
 Drivers for device types listed in section \ref{sec:Virtio
 Transport Options / Virtio Over PCI Bus / PCI Device Discovery}
-MUST match devices with the PCI Vendor ID 0x1AF4 and
+MUST match devices with the PCI Vendor ID 0x1af4 and
 the Transitional PCI Device ID indicated in section
  \ref{sec:Virtio
 Transport Options / Virtio Over PCI Bus / PCI Device Discovery}.
@@ -691,7 +691,7 @@ \subsubsection{Vendor data capability}\label{sec:Virtio
 Devices CAN present multiple Vendor data capabilities with
 either different or identical \field{vendor_id} values.
 
-The value \field{vendor_id} MUST NOT equal 0x1AF4.
+The value \field{vendor_id} MUST NOT equal 0x1af4.
 
 The size of the Vendor data capability MUST be a multiple of 4 bytes.
 
-- 
2.26.2



* [virtio-dev] [PATCH 02/11] transport-pci: Move transitional device id to legacy section
  2023-03-30 22:58 [virtio-dev] [PATCH 00/11] Introduce transitional mmr pci device Parav Pandit
  2023-03-30 22:58 ` [virtio-dev] [PATCH 01/11] transport-pci: Use lowercase letters Parav Pandit
@ 2023-03-30 22:58 ` Parav Pandit
  2023-03-31  6:43   ` [virtio-dev] " Michael S. Tsirkin
  2023-03-30 22:58 ` [virtio-dev] [PATCH 03/11] transport-pci: Split notes of PCI Device Layout Parav Pandit
                   ` (13 subsequent siblings)
  15 siblings, 1 reply; 200+ messages in thread
From: Parav Pandit @ 2023-03-30 22:58 UTC (permalink / raw)
  To: mst, virtio-dev, cohuck
  Cc: virtio-comment, shahafs, Parav Pandit, Satananda Burla

Currently PCI device discovery details for the transitional device
are documented in two different sections.

For example, the PCI device and vendor ID registers are documented
in the 'Device Requirements: PCI Device Discovery' section, while
the PCI revision id is documented in the 'Legacy Interfaces: A Note
on PCI Device Discovery' section.

Transitional device requirements should be documented in the
"legacy interfaces" section, as clearly mentioned in
'Legacy Interface: A Note on Feature Bits'.

Hence,
1. Move the transitional device requirements to their designated
   Legacy interface section
2. Describe the regular device requirements without qualifying
   them as "non transitional device"

While at it, write the description using a singular object definition.

Reviewed-by: Satananda Burla <sburla@marvell.com>
Signed-off-by: Parav Pandit <parav@nvidia.com>
---
 transport-pci.tex | 70 ++++++++++++++++++++++++-----------------------
 1 file changed, 36 insertions(+), 34 deletions(-)

diff --git a/transport-pci.tex b/transport-pci.tex
index 7f27107..1f74c6f 100644
--- a/transport-pci.tex
+++ b/transport-pci.tex
@@ -28,46 +28,24 @@ \subsection{PCI Device Discovery}\label{sec:Virtio Transport Options / Virtio Ov
 
 \devicenormative{\subsubsection}{PCI Device Discovery}{Virtio Transport Options / Virtio Over PCI Bus / PCI Device Discovery}
 
-Devices MUST have the PCI Vendor ID 0x1af4.
-Devices MUST either have the PCI Device ID calculated by adding 0x1040
+The device MUST have the PCI Vendor ID 0x1af4.
+The device MUST have the PCI Device ID calculated by adding 0x1040
 to the Virtio Device ID, as indicated in section \ref{sec:Device
-Types} or have the Transitional PCI Device ID depending on the device type,
-as follows:
-
-\begin{tabular}{|l|c|}
-\hline
-Transitional PCI Device ID  &  Virtio Device    \\
-\hline \hline
-0x1000      &   network device     \\
-\hline
-0x1001     &   block device     \\
-\hline
-0x1002     & memory ballooning (traditional)  \\
-\hline
-0x1003     &      console       \\
-\hline
-0x1004     &     SCSI host      \\
-\hline
-0x1005     &  entropy source    \\
-\hline
-0x1009     &   9P transport     \\
-\hline
-\end{tabular}
+Types}.
 
 For example, the network device with the Virtio Device ID 1
-has the PCI Device ID 0x1041 or the Transitional PCI Device ID 0x1000.
-
-The PCI Subsystem Vendor ID and the PCI Subsystem Device ID MAY reflect
-the PCI Vendor and Device ID of the environment (for informational purposes by the driver).
+has the PCI Device ID 0x1041.
 
-Non-transitional devices SHOULD have a PCI Device ID in the range
-0x1040 to 0x107f.
-Non-transitional devices SHOULD have a PCI Revision ID of 1 or higher.
-Non-transitional devices SHOULD have a PCI Subsystem Device ID of 0x40 or higher.
+The device SHOULD have a PCI Device ID in the range 0x1040 to 0x107f.
+The device SHOULD have a PCI Revision ID of 1 or higher.
+The device SHOULD have a PCI Subsystem Device ID of 0x40 or higher.
 
 This is to reduce the chance of a legacy driver attempting
 to drive the device.
 
+The PCI Subsystem Vendor ID and the PCI Subsystem Device ID MAY reflect
+the PCI Vendor and Device ID of the environment (for informational purposes by the driver).
+
 \drivernormative{\subsubsection}{PCI Device Discovery}{Virtio Transport Options / Virtio Over PCI Bus / PCI Device Discovery}
 Drivers MUST match devices with the PCI Vendor ID 0x1af4 and
 the PCI Device ID in the range 0x1040 to 0x107f,
@@ -85,8 +63,32 @@ \subsection{PCI Device Discovery}\label{sec:Virtio Transport Options / Virtio Ov
 PCI Subsystem Device ID value.
 
 \subsubsection{Legacy Interfaces: A Note on PCI Device Discovery}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI Device Discovery / Legacy Interfaces: A Note on PCI Device Discovery}
-Transitional devices MUST have a PCI Revision ID of 0.
-Transitional devices MUST have the PCI Subsystem Device ID
+
+The transitional device has one of the following PCI Device IDs,
+depending on the device type:
+
+\begin{tabular}{|l|c|}
+\hline
+Transitional PCI Device ID  &  Virtio Device    \\
+\hline \hline
+0x1000      &   network device     \\
+\hline
+0x1001     &   block device     \\
+\hline
+0x1002     & memory ballooning (traditional)  \\
+\hline
+0x1003     &      console       \\
+\hline
+0x1004     &     SCSI host      \\
+\hline
+0x1005     &  entropy source    \\
+\hline
+0x1009     &   9P transport     \\
+\hline
+\end{tabular}
+
+The transitional device MUST have a PCI Revision ID of 0.
+The transitional device MUST have the PCI Subsystem Device ID
 matching the Virtio Device ID, as indicated in section \ref{sec:Device Types}.
 Transitional devices MUST have the Transitional PCI Device ID in
 the range 0x1000 to 0x103f.
-- 
2.26.2



* [virtio-dev] [PATCH 03/11] transport-pci: Split notes of PCI Device Layout
  2023-03-30 22:58 [virtio-dev] [PATCH 00/11] Introduce transitional mmr pci device Parav Pandit
  2023-03-30 22:58 ` [virtio-dev] [PATCH 01/11] transport-pci: Use lowercase letters Parav Pandit
  2023-03-30 22:58 ` [virtio-dev] [PATCH 02/11] transport-pci: Move transitional device id to legacy section Parav Pandit
@ 2023-03-30 22:58 ` Parav Pandit
  2023-03-30 22:58 ` [virtio-dev] [PATCH 04/11] transport-pci: Rename and move legacy PCI Device layout section Parav Pandit
                   ` (12 subsequent siblings)
  15 siblings, 0 replies; 200+ messages in thread
From: Parav Pandit @ 2023-03-30 22:58 UTC (permalink / raw)
  To: mst, virtio-dev, cohuck
  Cc: virtio-comment, shahafs, Parav Pandit, Satananda Burla

Currently a single legacy interface section describes both the PCI
common configuration layout and the feature bits operation for the
legacy interface.

Hence, split the PCI Device Layout legacy interface section into
two parts: a first subsection for the common configuration and a
second subsection for the feature bits.

Reviewed-by: Satananda Burla <sburla@marvell.com>
Signed-off-by: Parav Pandit <parav@nvidia.com>
---
 conformance.tex   |  1 +
 transport-pci.tex | 14 +++++++++-----
 2 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/conformance.tex b/conformance.tex
index 01ccd69..4f724c2 100644
--- a/conformance.tex
+++ b/conformance.tex
@@ -263,6 +263,7 @@ \section{Conformance Targets}\label{sec:Conformance / Conformance Targets}
 \item Section \ref{sec:General Initialization And Device Operation / Device Initialization / Legacy Interface: Device Initialization}
 \item Section \ref{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI Device Discovery / Legacy Interfaces: A Note on PCI Device Discovery}
 \item Section \ref{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Legacy Interfaces: A Note on PCI Device Layout}
+\item Section \ref{sec:Virtio Transport Options / Virtio Over PCI Bus / Virtio Structure PCI Capabilities / Legacy Interface: A Note on Feature Bits}
 \item Section \ref{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI-specific Initialization And Device Operation / Device Initialization / Virtio Device Configuration Layout Detection / Legacy Interface: A Note on Device Layout Detection}
 \item Section \ref{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI-specific Initialization And Device Operation / Device Initialization / Virtqueue Configuration / Legacy Interface: A Note on Virtqueue Configuration}
 \item Section \ref{sec:Virtio Transport Options / Virtio Over MMIO / Legacy interface}
diff --git a/transport-pci.tex b/transport-pci.tex
index 1f74c6f..65d9748 100644
--- a/transport-pci.tex
+++ b/transport-pci.tex
@@ -845,16 +845,20 @@ \subsubsection{Legacy Interfaces: A Note on PCI Device Layout}\label{sec:Virtio
 devices MUST present the device-specific configuration space
 if any at an offset immediately following the general headers.
 
-Note that only Feature Bits 0 to 31 are accessible through the
-Legacy Interface. When used through the Legacy Interface,
-Transitional Devices MUST assume that Feature Bits 32 to 63
-are not acknowledged by Driver.
-
 As legacy devices had no \field{config_generation} field,
 see \ref{sec:Basic Facilities of a Virtio Device / Device
 Configuration Space / Legacy Interface: Device Configuration
 Space}~\nameref{sec:Basic Facilities of a Virtio Device / Device Configuration Space / Legacy Interface: Device Configuration Space} for workarounds.
 
+\subsubsection{Legacy Interface: A Note on Feature Bits}
+\label{sec:Virtio Transport Options / Virtio Over PCI Bus /
+Virtio Structure PCI Capabilities / Legacy Interface: A Note on Feature Bits}
+
+Only Feature Bits 0 to 31 are accessible through the
+Legacy Interface. When used through the Legacy Interface,
+Transitional Devices MUST assume that Feature Bits 32 to 63
+are not acknowledged by Driver.
+
 \subsubsection{Non-transitional Device With Legacy Driver: A Note
 on PCI Device Layout}\label{sec:Virtio Transport Options / Virtio
 Over PCI Bus / PCI Device Layout / Non-transitional Device With
-- 
2.26.2



* [virtio-dev] [PATCH 04/11] transport-pci: Rename and move legacy PCI Device layout section
  2023-03-30 22:58 [virtio-dev] [PATCH 00/11] Introduce transitional mmr pci device Parav Pandit
                   ` (2 preceding siblings ...)
  2023-03-30 22:58 ` [virtio-dev] [PATCH 03/11] transport-pci: Split notes of PCI Device Layout Parav Pandit
@ 2023-03-30 22:58 ` Parav Pandit
  2023-03-30 22:58 ` [virtio-dev] [PATCH 05/11] introduction: Add missing helping verb Parav Pandit
                   ` (11 subsequent siblings)
  15 siblings, 0 replies; 200+ messages in thread
From: Parav Pandit @ 2023-03-30 22:58 UTC (permalink / raw)
  To: mst, virtio-dev, cohuck
  Cc: virtio-comment, shahafs, Parav Pandit, Satananda Burla

The current 'Legacy Interfaces: A Note on PCI Device Layout' section
explains the virtio legacy registers, which consist of the common
configuration structure and device-specific registers. It has little
to do with the existing PCI Device Layout section.

For example, the non-transitional device's common configuration
layout is described in the 'Common configuration structure layout'
section, outside of the 'PCI Device Layout' section.

Hence, to keep the legacy common configuration registers section
consistent with 1.x, rename it and move it adjacent to the 1.x
common configuration structure.

Reviewed-by: Satananda Burla <sburla@marvell.com>
Signed-off-by: Parav Pandit <parav@nvidia.com>
---
 conformance.tex   |   2 +-
 transport-pci.tex | 162 +++++++++++++++++++++++-----------------------
 2 files changed, 83 insertions(+), 81 deletions(-)

diff --git a/conformance.tex b/conformance.tex
index 4f724c2..ccbc9bf 100644
--- a/conformance.tex
+++ b/conformance.tex
@@ -262,7 +262,7 @@ \section{Conformance Targets}\label{sec:Conformance / Conformance Targets}
 \item Section \ref{sec:Basic Facilities of a Virtio Device / Virtqueues / Message Framing / Legacy Interface: Message Framing}
 \item Section \ref{sec:General Initialization And Device Operation / Device Initialization / Legacy Interface: Device Initialization}
 \item Section \ref{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI Device Discovery / Legacy Interfaces: A Note on PCI Device Discovery}
-\item Section \ref{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Legacy Interfaces: A Note on PCI Device Layout}
+\item Section \ref{sec:Virtio Transport Options / Virtio Over PCI Bus / Virtio Structure PCI Capabilities / Common configuration structure layout / Legacy Interfaces: A Note on Configuration Registers}
 \item Section \ref{sec:Virtio Transport Options / Virtio Over PCI Bus / Virtio Structure PCI Capabilities / Legacy Interface: A Note on Feature Bits}
 \item Section \ref{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI-specific Initialization And Device Operation / Device Initialization / Virtio Device Configuration Layout Detection / Legacy Interface: A Note on Device Layout Detection}
 \item Section \ref{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI-specific Initialization And Device Operation / Device Initialization / Virtqueue Configuration / Legacy Interface: A Note on Virtqueue Configuration}
diff --git a/transport-pci.tex b/transport-pci.tex
index 65d9748..ee11ba5 100644
--- a/transport-pci.tex
+++ b/transport-pci.tex
@@ -497,6 +497,88 @@ \subsubsection{Common configuration structure layout}\label{sec:Virtio Transport
 were used before the queue reset.
 (see \ref{sec:Basic Facilities of a Virtio Device / Virtqueues / Virtqueue Reset}).
 
+\paragraph{Legacy Interfaces: A Note on Configuration Registers}
+\label{sec:Virtio Transport Options / Virtio Over PCI Bus / Virtio Structure PCI Capabilities / Common configuration structure layout / Legacy Interfaces: A Note on Configuration Registers}
+
+Transitional devices MUST present part of configuration
+registers in a legacy configuration structure in BAR0 in the first I/O
+region of the PCI device, as documented below.
+When using the legacy interface, transitional drivers
+MUST use the legacy configuration structure in BAR0 in the first
+I/O region of the PCI device, as documented below.
+
+When using the legacy interface the driver MAY access
+the device-specific configuration region using any width accesses, and
+a transitional device MUST present driver with the same results as
+when accessed using the ``natural'' access method (i.e.
+32-bit accesses for 32-bit fields, etc).
+
+Note that this is possible because while the virtio common configuration structure is PCI
+(i.e. little) endian, when using the legacy interface the device-specific
+configuration region is encoded in the native endian of the guest (where such distinction is
+applicable).
+
+When used through the legacy interface, the virtio common configuration structure looks as follows:
+
+\begin{tabularx}{\textwidth}{ |X||X|X|X|X|X|X|X|X| }
+\hline
+ Bits & 32 & 32 & 32 & 16 & 16 & 16 & 8 & 8 \\
+\hline
+ Read / Write & R & R+W & R+W & R & R+W & R+W & R+W & R \\
+\hline
+ Purpose & Device Features bits 0:31 & Driver Features bits 0:31 &
+  Queue Address & \field{queue_size} & \field{queue_select} & Queue Notify &
+  Device Status & ISR \newline Status \\
+\hline
+\end{tabularx}
+
+If MSI-X is enabled for the device, two additional fields
+immediately follow this header:
+
+\begin{tabular}{ |l||l|l| }
+\hline
+Bits       & 16             & 16     \\
+\hline
+Read/Write & R+W            & R+W    \\
+\hline
+Purpose (MSI-X) & \field{config_msix_vector}  & \field{queue_msix_vector} \\
+\hline
+\end{tabular}
+
+Note: When MSI-X capability is enabled, device-specific configuration starts at
+byte offset 24 in virtio common configuration structure. When MSI-X capability is not
+enabled, device-specific configuration starts at byte offset 20 in virtio
+header.  ie. once you enable MSI-X on the device, the other fields move.
+If you turn it off again, they move back!
+
+Any device-specific configuration space immediately follows
+these general headers:
+
+\begin{tabular}{|l||l|l|}
+\hline
+Bits & Device Specific & \multirow{3}{*}{\ldots} \\
+\cline{1-2}
+Read / Write & Device Specific & \\
+\cline{1-2}
+Purpose & Device Specific & \\
+\hline
+\end{tabular}
+
+When accessing the device-specific configuration space
+using the legacy interface, transitional
+drivers MUST access the device-specific configuration space
+at an offset immediately following the general headers.
+
+When using the legacy interface, transitional
+devices MUST present the device-specific configuration space
+if any at an offset immediately following the general headers.
+
+As legacy devices had no \field{config_generation} field,
+see \ref{sec:Basic Facilities of a Virtio Device / Device
+Configuration Space / Legacy Interface: Device Configuration
+Space}~\nameref{sec:Basic Facilities of a Virtio Device / Device Configuration Space / Legacy Interface: Device Configuration Space} for workarounds.
+
+
 \subsubsection{Notification structure layout}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Notification capability}
 
 The notification location is found using the VIRTIO_PCI_CAP_NOTIFY_CFG
@@ -770,86 +852,6 @@ \subsubsection{PCI configuration access capability}\label{sec:Virtio Transport O
 specified by some other Virtio Structure PCI Capability
 of type other than \field{VIRTIO_PCI_CAP_PCI_CFG}.
 
-\subsubsection{Legacy Interfaces: A Note on PCI Device Layout}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Legacy Interfaces: A Note on PCI Device Layout}
-
-Transitional devices MUST present part of configuration
-registers in a legacy configuration structure in BAR0 in the first I/O
-region of the PCI device, as documented below.
-When using the legacy interface, transitional drivers
-MUST use the legacy configuration structure in BAR0 in the first
-I/O region of the PCI device, as documented below.
-
-When using the legacy interface the driver MAY access
-the device-specific configuration region using any width accesses, and
-a transitional device MUST present driver with the same results as
-when accessed using the ``natural'' access method (i.e.
-32-bit accesses for 32-bit fields, etc).
-
-Note that this is possible because while the virtio common configuration structure is PCI
-(i.e. little) endian, when using the legacy interface the device-specific
-configuration region is encoded in the native endian of the guest (where such distinction is
-applicable).
-
-When used through the legacy interface, the virtio common configuration structure looks as follows:
-
-\begin{tabularx}{\textwidth}{ |X||X|X|X|X|X|X|X|X| }
-\hline
- Bits & 32 & 32 & 32 & 16 & 16 & 16 & 8 & 8 \\
-\hline
- Read / Write & R & R+W & R+W & R & R+W & R+W & R+W & R \\
-\hline
- Purpose & Device Features bits 0:31 & Driver Features bits 0:31 &
-  Queue Address & \field{queue_size} & \field{queue_select} & Queue Notify &
-  Device Status & ISR \newline Status \\
-\hline
-\end{tabularx}
-
-If MSI-X is enabled for the device, two additional fields
-immediately follow this header:
-
-\begin{tabular}{ |l||l|l| }
-\hline
-Bits       & 16             & 16     \\
-\hline
-Read/Write & R+W            & R+W    \\
-\hline
-Purpose (MSI-X) & \field{config_msix_vector}  & \field{queue_msix_vector} \\
-\hline
-\end{tabular}
-
-Note: When MSI-X capability is enabled, device-specific configuration starts at
-byte offset 24 in virtio common configuration structure. When MSI-X capability is not
-enabled, device-specific configuration starts at byte offset 20 in virtio
-header.  ie. once you enable MSI-X on the device, the other fields move.
-If you turn it off again, they move back!
-
-Any device-specific configuration space immediately follows
-these general headers:
-
-\begin{tabular}{|l||l|l|}
-\hline
-Bits & Device Specific & \multirow{3}{*}{\ldots} \\
-\cline{1-2}
-Read / Write & Device Specific & \\
-\cline{1-2}
-Purpose & Device Specific & \\
-\hline
-\end{tabular}
-
-When accessing the device-specific configuration space
-using the legacy interface, transitional
-drivers MUST access the device-specific configuration space
-at an offset immediately following the general headers.
-
-When using the legacy interface, transitional
-devices MUST present the device-specific configuration space
-if any at an offset immediately following the general headers.
-
-As legacy devices had no \field{config_generation} field,
-see \ref{sec:Basic Facilities of a Virtio Device / Device
-Configuration Space / Legacy Interface: Device Configuration
-Space}~\nameref{sec:Basic Facilities of a Virtio Device / Device Configuration Space / Legacy Interface: Device Configuration Space} for workarounds.
-
 \subsubsection{Legacy Interface: A Note on Feature Bits}
 \label{sec:Virtio Transport Options / Virtio Over PCI Bus /
 Virtio Structure PCI Capabilities / Legacy Interface: A Note on Feature Bits}
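
A side note on the MSI-X offset shuffle preserved in the moved text
above: the device-specific configuration offset could be computed as
below (an illustrative sketch, not part of the patch).

/* size of the legacy general headers, per the note above */
static unsigned int legacy_dev_cfg_offset(bool msix_enabled)
{
        return msix_enabled ? 24 : 20;
}
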
-- 
2.26.2



* [virtio-dev] [PATCH 05/11] introduction: Add missing helping verb
  2023-03-30 22:58 [virtio-dev] [PATCH 00/11] Introduce transitional mmr pci device Parav Pandit
                   ` (3 preceding siblings ...)
  2023-03-30 22:58 ` [virtio-dev] [PATCH 04/11] transport-pci: Rename and move legacy PCI Device layout section Parav Pandit
@ 2023-03-30 22:58 ` Parav Pandit
  2023-03-30 22:58 ` [virtio-dev] [PATCH 06/11] introduction: Introduce transitional MMR interface Parav Pandit
                   ` (10 subsequent siblings)
  15 siblings, 0 replies; 200+ messages in thread
From: Parav Pandit @ 2023-03-30 22:58 UTC (permalink / raw)
  To: mst, virtio-dev, cohuck; +Cc: virtio-comment, shahafs, Parav Pandit

The terminology definitions of Transitional Device and Transitional
Driver are missing the helping verb 'is' that the other terminology
entries have.

Hence, add it to complete the sentences.

Signed-off-by: Parav Pandit <parav@nvidia.com>
---
 introduction.tex | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/introduction.tex b/introduction.tex
index 287c5fc..e8b34e3 100644
--- a/introduction.tex
+++ b/introduction.tex
@@ -145,14 +145,14 @@ \subsection{Legacy Interface: Terminology}\label{intro:Legacy
 
 \begin{description}
 \item[Transitional Device]
-        a device supporting both drivers conforming to this
+        is a device supporting both drivers conforming to this
         specification, and allowing legacy drivers.
 \end{description}
 
 Similarly, a driver MAY implement:
 \begin{description}
 \item[Transitional Driver]
-        a driver supporting both devices conforming to this
+        is a driver supporting both devices conforming to this
         specification, and legacy devices.
 \end{description}
 
-- 
2.26.2



* [virtio-dev] [PATCH 06/11] introduction: Introduce transitional MMR interface
  2023-03-30 22:58 [virtio-dev] [PATCH 00/11] Introduce transitional mmr pci device Parav Pandit
                   ` (4 preceding siblings ...)
  2023-03-30 22:58 ` [virtio-dev] [PATCH 05/11] introduction: Add missing helping verb Parav Pandit
@ 2023-03-30 22:58 ` Parav Pandit
  2023-04-07  9:17   ` [virtio-dev] " Michael S. Tsirkin
  2023-03-30 22:58 ` [virtio-dev] [PATCH 07/11] transport-pci: Introduce transitional MMR device id Parav Pandit
                   ` (9 subsequent siblings)
  15 siblings, 1 reply; 200+ messages in thread
From: Parav Pandit @ 2023-03-30 22:58 UTC (permalink / raw)
  To: mst, virtio-dev, cohuck
  Cc: virtio-comment, shahafs, Parav Pandit, Satananda Burla

Introduce terminology for the transitional MMR device and transitional
MMR driver.

Add a description of the transitional MMR device. It is a PCI
device that implements the legacy virtio common configuration
registers followed by the legacy device-specific registers in a
memory region at an offset.

This enables a hypervisor component such as the vfio driver to
emulate an I/O region towards the guest at BAR0. By doing so the
vfio driver can translate the guest's read/write accesses on the
I/O region to the device memory region.

High level comparison of the 1.x, transitional and transitional MMR
SR-IOV VF devices:

+------------------+ +--------------------+ +--------------------+
|virtio 1.x        | |Transitional        | |Transitional        |
|SRIOV VF          | |SRIOV VF            | |MMR SRIOV VF        |
|                  | |                    | |                    |
++---------------+ | ++---------------+   | ++---------------+   |
||dev_id =       | | ||dev_id =       |   | ||dev_id =       |   |
||{0x1040-0x106C}| | ||{0x1000-0x103f}|   | ||{0x10f9-0x10ff}|   |
|+---------------+ | |+---------------+   | |+---------------+   |
|                  | |                    | |                    |
|+------------+    | |+------------+      | |+-----------------+ |
||Memory BAR  |    | ||Memory BAR  |      | ||Memory BAR       | |
|+------------+    | |+------------+      | ||                 | |
|                  | |                    | || +--------------+| |
|                  | |+-----------------+ | || |legacy virtio || |
|                  | ||IOBAR impossible | | || |+ dev cfg     || |
|                  | |+-----------------+ | || |registers     || |
|                  | |                    | || +--------------+| |
|                  | |                    | |+-----------------+ |
+------------------+ +--------------------+ +--------------------+

Motivation and background:
PCIe and system limitations:
1. PCIe VFs do not support an I/O BAR, as cited at [1].

Perhaps the PCIe spec could be extended; however, it would only be
useful for virtio transitional devices. Even if such an extension
were present, the other system limitations described in (2) and (3)
below would remain.

2. cpu i/o port space limit and fragmentation
x86_64 is limited to only 64K worth of I/O port space [2],
which is shared with many other onboard system peripherals
behind PCIe bridges; such an I/O region also needs to be aligned
to 4KB at the PCIe bridge level [3]. This can lead to I/O space
fragmentation. Due to this fragmentation and alignment need,
the actual usable range is small.

3. I/O space accesses of a PCI device are done through non-posted
messages, which require a higher completion time for the round
trip through the PCIe fabric.

[1] PCIe spec citation:
VFs do not support I/O Space and thus VF BARs shall not indicate I/O Space.

[2] cpu arch citation:
Intel 64 and IA-32 Architectures Software Developer’s Manual
The processor’s I/O address space is separate and distinct from
the physical-memory address space. The I/O address space consists
of 64K individually addressable 8-bit I/O ports, numbered 0 through FFFFH.

[3] PCIe spec citation:
If a bridge implements an I/O address range,...I/O address range
will be aligned to a 4 KB boundary.

Co-developed-by: Satananda Burla <sburla@marvell.com>
Signed-off-by: Parav Pandit <parav@nvidia.com>
---
 introduction.tex | 30 ++++++++++++++++++++++++++++++
 1 file changed, 30 insertions(+)

diff --git a/introduction.tex b/introduction.tex
index e8b34e3..9a0f96a 100644
--- a/introduction.tex
+++ b/introduction.tex
@@ -161,6 +161,20 @@ \subsection{Legacy Interface: Terminology}\label{intro:Legacy
   have a need for backwards compatibility!
 \end{note}
 
+\begin{description}
+\item[Transitional MMR Device]
+       is a PCI device which exposes legacy virtio configuration
+       registers followed by legacy device configuration registers as
+       memory mapped registers (MMR) at an offset in a memory region
+       BAR, has no I/O region BAR, having its own PCI Device ID range,
+       and follows the rest of the functionalities of the transitional device.
+\end{description}
+
+\begin{description}
+\item[Transitional MMR Driver]
+       is a PCI device driver that supports the Transitional MMR device.
+\end{description}
+
 Devices or drivers with no legacy compatibility are referred to as
 non-transitional devices and drivers, respectively.
 
@@ -174,6 +188,22 @@ \subsection{Transition from earlier specification drafts}\label{sec:Transition f
 sections tagged "Legacy Interface" in the section title.
 These highlight the changes made since the earlier drafts.
 
+\subsection{Transitional MMR interface: specification drafts}\label{sec:Transitional MMR interface: specification drafts}
+
+The transitional MMR device and driver differ from the
+transitional device and driver respectively in a few areas. Such
+differences are contained in sections named
+'Transitional MMR interface', like this one. When no differences
+are mentioned explicitly, the transitional MMR device and driver
+follow exactly the same functionality as the
+transitional device and driver respectively.
+
+\begin{note}
+The transitional MMR interface is only required to support backward
+compatibility. It should not be implemented unless there is a need
+for backward compatibility.
+\end{note}
+
 \section{Structure Specifications}\label{sec:Structure Specifications}
 
 Many device and driver in-memory structure layouts are documented using
-- 
2.26.2



* [virtio-dev] [PATCH 07/11] transport-pci: Introduce transitional MMR device id
  2023-03-30 22:58 [virtio-dev] [PATCH 00/11] Introduce transitional mmr pci device Parav Pandit
                   ` (5 preceding siblings ...)
  2023-03-30 22:58 ` [virtio-dev] [PATCH 06/11] introduction: Introduce transitional MMR interface Parav Pandit
@ 2023-03-30 22:58 ` Parav Pandit
  2023-04-04  7:28   ` [virtio-dev] " Michael S. Tsirkin
  2023-04-07  8:37   ` [virtio-dev] " Michael S. Tsirkin
  2023-03-30 22:58 ` [virtio-dev] [PATCH 08/11] transport-pci: Introduce virtio extended capability Parav Pandit
                   ` (8 subsequent siblings)
  15 siblings, 2 replies; 200+ messages in thread
From: Parav Pandit @ 2023-03-30 22:58 UTC (permalink / raw)
  To: mst, virtio-dev, cohuck
  Cc: virtio-comment, shahafs, Parav Pandit, Satananda Burla

Transitional MMR device PCI Device IDs are unique. Hence,
no existing driver binds to them.
This further maintains backward compatibility with
existing drivers.

Co-developed-by: Satananda Burla <sburla@marvell.com>
Signed-off-by: Parav Pandit <parav@nvidia.com>
---
 transport-pci.tex | 45 +++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 41 insertions(+), 4 deletions(-)

diff --git a/transport-pci.tex b/transport-pci.tex
index ee11ba5..665448e 100644
--- a/transport-pci.tex
+++ b/transport-pci.tex
@@ -19,12 +19,14 @@ \section{Virtio Over PCI Bus}\label{sec:Virtio Transport Options / Virtio Over P
 \subsection{PCI Device Discovery}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI Device Discovery}
 
 Any PCI device with PCI Vendor ID 0x1af4, and PCI Device ID 0x1000 through
-0x107f inclusive is a virtio device. The actual value within this range
-indicates which virtio device is supported by the device.
+0x107f inclusive, or 0x10f9 through 0x10ff inclusive, is a virtio device.
+The actual value within these ranges indicates which virtio device
+type it is.
 The PCI Device ID is calculated by adding 0x1040 to the Virtio Device ID,
 as indicated in section \ref{sec:Device Types}.
-Additionally, devices MAY utilize a Transitional PCI Device ID range,
-0x1000 to 0x103f depending on the device type.
+Additionally, devices MAY utilize a Transitional PCI Device ID range
+0x1000 to 0x103f inclusive or a Transitional MMR PCI Device ID range
+0x10f9 to 0x10ff inclusive, depending on the device type.
 
 \devicenormative{\subsubsection}{PCI Device Discovery}{Virtio Transport Options / Virtio Over PCI Bus / PCI Device Discovery}
 
@@ -95,6 +97,41 @@ \subsubsection{Legacy Interfaces: A Note on PCI Device Discovery}\label{sec:Virt
 
 This is to match legacy drivers.
 
+\subsubsection{Transitional MMR Interface: A Note on PCI Device
+Discovery}\label{sec:Virtio Transport Options / Virtio Over PCI
+Bus / PCI Device Discovery / Transitional MMR Interface: A Note on PCI Device Discovery}
+
+The transitional MMR device has one of the following PCI Device IDs,
+depending on the device type:
+
+\begin{tabular}{|l|c|}
+\hline
+Transitional MMR PCI Device ID  &  Virtio Device    \\
+\hline \hline
+0x10f9      &   network device     \\
+\hline
+0x10fa     &   block device     \\
+\hline
+0x10fb     & memory ballooning (traditional)  \\
+\hline
+0x10fc     &      console       \\
+\hline
+0x10fd     &     SCSI host      \\
+\hline
+0x10fe     &  entropy source    \\
+\hline
+0x10ff     &   9P transport     \\
+\hline
+\end{tabular}
+
+The PCI Subsystem Vendor ID and the PCI Subsystem Device ID MAY
+reflect the PCI Vendor and Device ID of the environment.
+
+The transitional MMR driver MUST match any PCI Revision ID value.
+
+The transitional MMR driver MAY match any PCI Subsystem Vendor ID and
+any PCI Subsystem Device ID value.
+
 \subsection{PCI Device Layout}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout}
 
 The device is configured via I/O and/or memory regions (though see
-- 
2.26.2



* [virtio-dev] [PATCH 08/11] transport-pci: Introduce virtio extended capability
  2023-03-30 22:58 [virtio-dev] [PATCH 00/11] Introduce transitional mmr pci device Parav Pandit
                   ` (6 preceding siblings ...)
  2023-03-30 22:58 ` [virtio-dev] [PATCH 07/11] transport-pci: Introduce transitional MMR device id Parav Pandit
@ 2023-03-30 22:58 ` Parav Pandit
  2023-04-04  7:35   ` [virtio-dev] " Michael S. Tsirkin
                     ` (2 more replies)
  2023-03-30 22:58 ` [virtio-dev] [PATCH 09/11] transport-pci: Describe PCI MMR dev config registers Parav Pandit
                   ` (7 subsequent siblings)
  15 siblings, 3 replies; 200+ messages in thread
From: Parav Pandit @ 2023-03-30 22:58 UTC (permalink / raw)
  To: mst, virtio-dev, cohuck
  Cc: virtio-comment, shahafs, Parav Pandit, Satananda Burla

The PCI device configuration space available for capabilities is
limited to only 192 bytes, shared by many PCI capabilities, both
generic PCI ones and virtio specific ones.

Hence, introduce a virtio extended capability that uses a PCI
Express extended capability.
A subsequent patch uses this virtio extended capability.
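
For illustration, once the capability is located at config space
offset 'off', its fields could be read as below. This is a sketch
under the layout proposed here; pci_cfg_read8()/pci_cfg_read64()
are hypothetical config space helpers.

u8  cfg_type = pci_cfg_read8(dev, off + 4);   /* structure type */
u8  bar      = pci_cfg_read8(dev, off + 5);   /* BAR index 0..5 */
u64 offset   = pci_cfg_read64(dev, off + 8);  /* offset within BAR */
u64 length   = pci_cfg_read64(dev, off + 16); /* structure length */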

Co-developed-by: Satananda Burla <sburla@marvell.com>
Signed-off-by: Parav Pandit <parav@nvidia.com>
---
 transport-pci.tex | 69 ++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 68 insertions(+), 1 deletion(-)

diff --git a/transport-pci.tex b/transport-pci.tex
index 665448e..aeda4a1 100644
--- a/transport-pci.tex
+++ b/transport-pci.tex
@@ -174,7 +174,8 @@ \subsection{Virtio Structure PCI Capabilities}\label{sec:Virtio Transport Option
 the function, or accessed via the special VIRTIO_PCI_CAP_PCI_CFG field in the PCI configuration space.
 
 The location of each structure is specified using a vendor-specific PCI capability located
-on the capability list in PCI configuration space of the device.
+on the capability list in PCI configuration space of the device
+unless stated otherwise.
 This virtio structure capability uses little-endian format; all fields are
 read-only for the driver unless stated otherwise:
 
@@ -301,6 +302,72 @@ \subsection{Virtio Structure PCI Capabilities}\label{sec:Virtio Transport Option
 fields provide the most significant 32 bits of a total 64 bit offset and
 length within the BAR specified by \field{cap.bar}.
 
+The virtio extended PCI Express capability structure defines
+the location of certain virtio device configuration related
+structures using a PCI Express extended capability. The
+virtio extended PCI Express capability structure uses the
+PCI Express vendor specific extended capability (VSEC). It
+has the following layout:
+
+\begin{lstlisting}
+struct pcie_ext_cap {
+        le16 cap_vendor_id; /* Generic PCI field: 0xB */
+        le16 cap_version : 2; /* Generic PCI field: 0 */
+        le16 next_cap_offset : 14; /* Generic PCI field: next cap or 0 */
+};
+
+struct virtio_pcie_ext_cap {
+        struct pcie_ext_cap pcie_ecap;
+        u8 cfg_type; /* Identifies the structure. */
+        u8 bar; /* Index of the BAR where it's located */
+        u8 id; /* Multiple capabilities of the same type */
+        u8 zero_padding[1];
+        le64 offset; /* Offset within the bar */
+        le64 length; /* Length of the structure, in bytes. */
+        u8 data[]; /* Optional variable length data */
+};
+\end{lstlisting}
+
+This structure contains optional data, depending on
+\field{cfg_type}. The fields are interpreted as follows:
+
+\begin{description}
+\item[\field{cap_vendor_id}]
+         0x0B; identifies a vendor-specific extended capability.
+
+\item[\field{cap_version}]
+         contains a value of 0.
+
+\item[\field{next_cap_offset}]
+        Offset to the next capability.
+
+\item[\field{cfg_type}]
+        follows the same definition as \field{cfg_type}
+        from the \field{struct virtio_pci_cap}.
+
+\item[\field{bar}]
+        follows the same definition as \field{bar}
+        from the \field{struct virtio_pci_cap}.
+
+\item[\field{id}]
+        follows the same definition as \field{id}
+        from the \field{struct virtio_pci_cap}.
+
+\item[\field{offset}]
+        indicates where the structure begins relative to the
+        base address associated with the BAR. The alignment
+        requirements of offset are indicated in each
+        structure-specific section that uses
+        \field{struct virtio_pcie_ext_cap}.
+
+\item[\field{length}]
+        indicates the length of the structure indicated by this
+        capability.
+
+\item[\field{data}]
+        optional data of this capability.
+\end{description}
+
 \drivernormative{\subsubsection}{Virtio Structure PCI Capabilities}{Virtio Transport Options / Virtio Over PCI Bus / Virtio Structure PCI Capabilities}
 
 The driver MUST ignore any vendor-specific capability structure which has
-- 
2.26.2



* [virtio-dev] [PATCH 09/11] transport-pci: Describe PCI MMR dev config registers
  2023-03-30 22:58 [virtio-dev] [PATCH 00/11] Introduce transitional mmr pci device Parav Pandit
                   ` (7 preceding siblings ...)
  2023-03-30 22:58 ` [virtio-dev] [PATCH 08/11] transport-pci: Introduce virtio extended capability Parav Pandit
@ 2023-03-30 22:58 ` Parav Pandit
  2023-04-07  8:55   ` [virtio-dev] " Michael S. Tsirkin
  2023-04-12  4:33   ` [virtio-dev] " Michael S. Tsirkin
  2023-03-30 22:58 ` [virtio-dev] [PATCH 10/11] transport-pci: Use driver notification PCI capability Parav Pandit
                   ` (6 subsequent siblings)
  15 siblings, 2 replies; 200+ messages in thread
From: Parav Pandit @ 2023-03-30 22:58 UTC (permalink / raw)
  To: mst, virtio-dev, cohuck
  Cc: virtio-comment, shahafs, Parav Pandit, Satananda Burla

Legacy virtio configuration registers and adjacent
device configuration registers are located somewhere
in a memory BAR.

A new capability supplies the location of these registers,
which a driver can use to map I/O accesses to the legacy
memory mapped registers.

This gives the ability to locate the legacy registers either in
an existing memory BAR or in a completely new BAR at BAR 0.

The example diagram below depicts this in an existing
memory BAR.

+------------------------------+
|Transitional                  |
|MMR SRIOV VF                  |
|                              |
++---------------+             |
||dev_id =       |             |
||{0x10f9-0x10ff}|             |
|+---------------+             |
|                              |
++--------------------+        |
|| PCIe ext cap = 0xB |        |
|| cfg_type = 10      |        |
|| offset   = 0x1000  |        |
|| bar      = A {0..5}|        |
|+--|-----------------+        |
|   |                          |
|   |                          |
|   |    +-------------------+ |
|   |    | Memory BAR = A    | |
|   |    |                   | |
|   +------>+--------------+ | |
|        |  |legacy virtio | | |
|        |  |+ dev cfg     | | |
|        |  |registers     | | |
|        |  +--------------+ | |
|        +-----------------+ | |
+------------------------------+
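
For illustration, a host driver could map and access the legacy
registers roughly as below. This is a sketch, not part of the patch:
cap_bar/cap_offset stand for the fields read from the capability,
and the byte offset follows the legacy common configuration layout
(Device Status at byte 18 when the MSI-X fields are absent).

void __iomem *bar    = pci_iomap(pdev, cap_bar, 0);   /* BAR A */
void __iomem *legacy = bar + cap_offset;              /* e.g. 0x1000 */

u8 status = ioread8(legacy + 18);                     /* Device Status */
iowrite8(status | 0x04 /* DRIVER_OK */, legacy + 18);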

Co-developed-by: Satananda Burla <sburla@marvell.com>
Signed-off-by: Parav Pandit <parav@nvidia.com>
---
 transport-pci.tex | 33 +++++++++++++++++++++++++++++++--
 1 file changed, 31 insertions(+), 2 deletions(-)

diff --git a/transport-pci.tex b/transport-pci.tex
index aeda4a1..55a6aa0 100644
--- a/transport-pci.tex
+++ b/transport-pci.tex
@@ -168,6 +168,7 @@ \subsection{Virtio Structure PCI Capabilities}\label{sec:Virtio Transport Option
 \item ISR Status
 \item Device-specific configuration (optional)
 \item PCI configuration access
+\item Legacy memory mapped configuration registers (optional)
 \end{itemize}
 
 Each structure can be mapped by a Base Address register (BAR) belonging to
@@ -228,6 +229,8 @@ \subsection{Virtio Structure PCI Capabilities}\label{sec:Virtio Transport Option
 #define VIRTIO_PCI_CAP_SHARED_MEMORY_CFG 8
 /* Vendor-specific data */
 #define VIRTIO_PCI_CAP_VENDOR_CFG        9
+/* Legacy configuration registers capability */
+#define VIRTIO_PCI_CAP_LEGACY_MMR_CFG    10
 \end{lstlisting}
 
         Any other value is reserved for future use.
@@ -682,6 +685,18 @@ \subsubsection{Common configuration structure layout}\label{sec:Virtio Transport
 Configuration Space / Legacy Interface: Device Configuration
 Space}~\nameref{sec:Basic Facilities of a Virtio Device / Device Configuration Space / Legacy Interface: Device Configuration Space} for workarounds.
 
+\paragraph{Transitional MMR Interface: A Note on Configuration Registers}
+\label{sec:Virtio Transport Options / Virtio Over PCI Bus / Virtio Structure PCI Capabilities / Common configuration structure layout / Transitional MMR Interface: A Note on Configuration Registers}
+
+The transitional MMR device MUST present legacy virtio registers
+consisting of legacy common configuration registers followed by
+legacy device specific configuration registers described in section
+\ref{sec:Virtio Transport Options / Virtio Over PCI Bus / Virtio Structure PCI Capabilities / Common configuration structure layout / Legacy Interfaces: A Note on Configuration Registers}
+in a memory region PCI BAR.
+
+The transitional MMR device MUST provide the location of the
+legacy virtio configuration registers using a legacy memory mapped
+registers capability described in section \ref{sec:Virtio Transport Options / Virtio Over PCI Bus / Virtio Structure PCI Capabilities / Transitional MMR Interface: Legacy Memory Mapped Configuration Registers Capability}.
 
 \subsubsection{Notification structure layout}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Notification capability}
 
@@ -956,9 +971,23 @@ \subsubsection{PCI configuration access capability}\label{sec:Virtio Transport O
 specified by some other Virtio Structure PCI Capability
 of type other than \field{VIRTIO_PCI_CAP_PCI_CFG}.
 
+\subsubsection{Transitional MMR Interface: Legacy Memory Mapped Configuration Registers Capability}
+\label{sec:Virtio Transport Options / Virtio Over PCI Bus / Virtio Structure PCI Capabilities / Transitional MMR Interface: Legacy Memory Mapped Configuration Registers Capability}
+
+The optional VIRTIO_PCI_CAP_LEGACY_MMR_CFG capability defines
+the location of the legacy virtio configuration registers
+followed by legacy device specific configuration registers in
+the memory region BAR for the transitional MMR device.
+
+The \field{cap.offset} MUST be 4-byte aligned.
+The \field{cap.offset} SHOULD be 4 KBytes aligned and
+\field{cap.length} SHOULD be 4 KBytes.
+
+The transitional MMR device MUST present a legacy configuration
+memory mapped registers capability using \field{virtio_pcie_ext_cap}.
+
 \subsubsection{Legacy Interface: A Note on Feature Bits}
-\label{sec:Virtio Transport Options / Virtio Over PCI Bus /
-Virtio Structure PCI Capabilities / Legacy Interface: A Note on Feature Bits}
+\label{sec:Virtio Transport Options / Virtio Over PCI Bus / Virtio Structure PCI Capabilities / Legacy Interface: A Note on Feature Bits}
 
 Only Feature Bits 0 to 31 are accessible through the
 Legacy Interface. When used through the Legacy Interface,
-- 
2.26.2

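For readers following along, here is a rough, non-normative sketch of what
driver-side discovery of this capability could look like. The exact struct
virtio_pcie_ext_cap layout comes from patch 8 and is not reproduced here;
the field placement below (cfg_type, bar, offset, length following the
extended capability header) mirrors the diagram above but is illustrative
only, and cfg_read32() is an assumed config space accessor:

#include <stdbool.h>
#include <stdint.h>

#define PCI_EXT_CAP_ID_VNDR            0x000b
#define VIRTIO_PCI_CAP_LEGACY_MMR_CFG  10

extern uint32_t cfg_read32(uint16_t where);  /* assumed helper */

static bool find_legacy_mmr(uint8_t *bar, uint32_t *offset, uint32_t *length)
{
    uint16_t pos = 0x100;                    /* extended caps start here */

    while (pos) {
        uint32_t hdr = cfg_read32(pos);      /* [15:0] id, [31:20] next */

        if ((hdr & 0xffff) == PCI_EXT_CAP_ID_VNDR) {
            uint32_t body = cfg_read32(pos + 4);   /* illustrative layout */

            if ((body & 0xff) == VIRTIO_PCI_CAP_LEGACY_MMR_CFG) {
                *bar    = (body >> 8) & 0xff;      /* one of BARs 0..5 */
                *offset = cfg_read32(pos + 8);
                *length = cfg_read32(pos + 12);
                /* cap.offset MUST be 4-byte aligned; 4 KBytes alignment
                 * is only a SHOULD. */
                return (*offset & 0x3) == 0;
            }
        }
        pos = hdr >> 20;                     /* next capability offset */
    }
    return false;
}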


* [virtio-dev] [PATCH 10/11] transport-pci: Use driver notification PCI capability
  2023-03-30 22:58 [virtio-dev] [PATCH 00/11] Introduce transitional mmr pci device Parav Pandit
                   ` (8 preceding siblings ...)
  2023-03-30 22:58 ` [virtio-dev] [PATCH 09/11] transport-pci: Describe PCI MMR dev config registers Parav Pandit
@ 2023-03-30 22:58 ` Parav Pandit
  2023-04-12  4:31   ` [virtio-dev] " Michael S. Tsirkin
  2023-03-30 22:58 ` [virtio-dev] [PATCH 11/11] conformance: Add transitional MMR interface conformance Parav Pandit
                   ` (5 subsequent siblings)
  15 siblings, 1 reply; 200+ messages in thread
From: Parav Pandit @ 2023-03-30 22:58 UTC (permalink / raw)
  To: mst, virtio-dev, cohuck
  Cc: virtio-comment, shahafs, Parav Pandit, Satananda Burla

PCI devices support memory BAR regions for performant driver
notifications using the notification capability.
Enable transitional MMR devices to use it in a simpler manner.

Co-developed-by: Satananda Burla <sburla@marvell.com>
Signed-off-by: Parav Pandit <parav@nvidia.com>
---
 transport-pci.tex | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/transport-pci.tex b/transport-pci.tex
index 55a6aa0..4fd9898 100644
--- a/transport-pci.tex
+++ b/transport-pci.tex
@@ -763,6 +763,34 @@ \subsubsection{Notification structure layout}\label{sec:Virtio Transport Options
 cap.length >= queue_notify_off * notify_off_multiplier + 4
 \end{lstlisting}
 
+\paragraph{Transitional MMR Interface: A note on Notification Capability}
+\label{sec:Virtio Transport Options / Virtio Over PCI Bus / Virtio Structure PCI Capabilities / Notification capability / Transitional MMR Interface}
+
+The transitional MMR device benefits from receiving driver
+notifications at the Queue Notification address offered using
+the notification capability, rather than via the memory mapped
+legacy QueueNotify configuration register.
+
+The transitional MMR device uses the same Queue Notification address
+within a BAR for all virtqueues:
+\begin{lstlisting}
+cap.offset
+\end{lstlisting}
+
+The transitional MMR device MUST support the Queue Notification
+address within a BAR for all virtqueues at:
+\begin{lstlisting}
+cap.offset
+\end{lstlisting}
+
+A transitional MMR driver that wants to use driver
+notifications offered using the notification capability MUST use
+the same Queue Notification address within a BAR for all virtqueues at:
+
+\begin{lstlisting}
+cap.offset
+\end{lstlisting}
+
 \subsubsection{ISR status capability}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / ISR status capability}
 
 The VIRTIO_PCI_CAP_ISR_CFG capability
-- 
2.26.2

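To illustrate the difference from the generic per-queue formula quoted
above (cap.offset + queue_notify_off * notify_off_multiplier), a minimal
sketch; the function names are made up:

#include <stdint.h>

/* Regular modern device: each virtqueue gets its own notification
 * address derived from queue_notify_off and notify_off_multiplier. */
static inline uint64_t modern_notify_addr(uint64_t bar_base,
                                          uint32_t cap_offset,
                                          uint16_t queue_notify_off,
                                          uint32_t notify_off_multiplier)
{
    return bar_base + cap_offset +
           (uint64_t)queue_notify_off * notify_off_multiplier;
}

/* Transitional MMR device: one shared notification address for all
 * virtqueues, at cap.offset within the BAR, per this patch. */
static inline uint64_t tmmr_notify_addr(uint64_t bar_base,
                                        uint32_t cap_offset)
{
    return bar_base + cap_offset;
}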


* [virtio-dev] [PATCH 11/11] conformance: Add transitional MMR interface conformance
  2023-03-30 22:58 [virtio-dev] [PATCH 00/11] Introduce transitional mmr pci device Parav Pandit
                   ` (9 preceding siblings ...)
  2023-03-30 22:58 ` [virtio-dev] [PATCH 10/11] transport-pci: Use driver notification PCI capability Parav Pandit
@ 2023-03-30 22:58 ` Parav Pandit
  2023-03-31  7:03 ` [virtio-dev] Re: [PATCH 00/11] Introduce transitional mmr pci device Michael S. Tsirkin
                   ` (4 subsequent siblings)
  15 siblings, 0 replies; 200+ messages in thread
From: Parav Pandit @ 2023-03-30 22:58 UTC (permalink / raw)
  To: mst, virtio-dev, cohuck
  Cc: virtio-comment, shahafs, Parav Pandit, Satananda Burla

Add conformance section for the transitional MMR interface.

The transitional MMR interface follows the same requirements as the
transitional device, with a few exceptions. List these delta
requirements in the conformance section.

Co-developed-by: Satananda Burla <sburla@marvell.com>
Signed-off-by: Parav Pandit <parav@nvidia.com>
---
 conformance.tex      |  8 ++++++--
 tmmr-conformance.tex | 27 +++++++++++++++++++++++++++
 2 files changed, 33 insertions(+), 2 deletions(-)
 create mode 100644 tmmr-conformance.tex

diff --git a/conformance.tex b/conformance.tex
index ccbc9bf..4cccba7 100644
--- a/conformance.tex
+++ b/conformance.tex
@@ -11,7 +11,7 @@ \section{Conformance Targets}\label{sec:Conformance / Conformance Targets}
 
 Conformance targets:
 \begin{description}
-\item[Driver] A driver MUST conform to four conformance clauses:
+\item[Driver] A driver MUST conform to five conformance clauses:
   \begin{itemize}
     \item Clause \ref{sec:Conformance / Driver Conformance}.
     \item One of clauses \ref{sec:Conformance / Driver Conformance / PCI Driver Conformance}, \ref{sec:Conformance / Driver Conformance / MMIO Driver Conformance} or \ref{sec:Conformance / Driver Conformance / Channel I/O Driver Conformance}.
@@ -36,8 +36,9 @@ \section{Conformance Targets}\label{sec:Conformance / Conformance Targets}
 \ref{sec:Conformance / Driver Conformance / PMEM Driver Conformance}.
 
     \item Clause \ref{sec:Conformance / Legacy Interface: Transitional Device and Transitional Driver Conformance}.
+    \item Clause \ref{sec:Conformance / Transitional MMR Interface: Transitional MMR Device and Transitional MMR Driver Conformance}.
   \end{itemize}
-\item[Device] A device MUST conform to four conformance clauses:
+\item[Device] A device MUST conform to five conformance clauses:
   \begin{itemize}
     \item Clause \ref{sec:Conformance / Device Conformance}.
     \item One of clauses \ref{sec:Conformance / Device Conformance / PCI Device Conformance}, \ref{sec:Conformance / Device Conformance / MMIO Device Conformance} or \ref{sec:Conformance / Device Conformance / Channel I/O Device Conformance}.
@@ -63,6 +64,7 @@ \section{Conformance Targets}\label{sec:Conformance / Conformance Targets}
 \ref{sec:Conformance / Device Conformance / PMEM Device Conformance}.
 
     \item Clause \ref{sec:Conformance / Legacy Interface: Transitional Device and Transitional Driver Conformance}.
+    \item Clause \ref{sec:Conformance / Transitional MMR Interface: Transitional MMR Device and Transitional MMR Driver Conformance}.
   \end{itemize}
 \end{description}
 
@@ -294,3 +296,5 @@ \section{Conformance Targets}\label{sec:Conformance / Conformance Targets}
 \item Section \ref{sec:Device Types / SCSI Host Device / Device Operation / Device Operation: eventq / Legacy Interface: Device Operation: eventq}
 \item Section \ref{sec:Reserved Feature Bits / Legacy Interface: Reserved Feature Bits}
 \end{itemize}
+
+\input{tmmr-conformance.tex}
diff --git a/tmmr-conformance.tex b/tmmr-conformance.tex
new file mode 100644
index 0000000..ad3489b
--- /dev/null
+++ b/tmmr-conformance.tex
@@ -0,0 +1,27 @@
+\conformance{\section}{Transitional MMR Interface: Transitional MMR Device and Transitional MMR Driver Conformance}\label{sec:Conformance / Transitional MMR Interface: Transitional MMR Device and Transitional MMR Driver Conformance}
+
+An implementation MAY choose to implement OPTIONAL support for the
+transitional MMR interface by conforming to all of the MUST
+level requirements of the transitional MMR interface for
+transitional devices and drivers.
+
+The requirements for the transitional MMR interface follow all
+the legacy interface requirements listed in section
+\ref{sec:Conformance / Legacy Interface: Transitional Device and
+Transitional Driver Conformance} with the following exceptions.
+
+The following requirements MUST NOT be implemented:
+
+\begin{itemize}
+\item Section \ref{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI Device Discovery / Legacy Interfaces: A Note on PCI Device Discovery}
+\item Section \ref{sec:Virtio Transport Options / Virtio Over PCI Bus / Virtio Structure PCI Capabilities / Common configuration structure layout / Legacy Interfaces: A Note on Configuration Registers}
+\end{itemize}
+
+Instead, the following requirements MUST be implemented:
+
+\begin{itemize}
+\item Section \ref{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI Device Discovery / Transitional MMR Interface: A Note on PCI Device Discovery}
+\item Section \ref{sec:Virtio Transport Options / Virtio Over PCI Bus / Virtio Structure PCI Capabilities / Common configuration structure layout / Transitional MMR Interface: A Note on Configuration Registers}
+\item Section \ref{sec:Virtio Transport Options / Virtio Over PCI Bus / Virtio Structure PCI Capabilities / Notification capability / Transitional MMR Interface}
+\item Section \ref{sec:Virtio Transport Options / Virtio Over PCI Bus / Virtio Structure PCI Capabilities / Transitional MMR Interface: Legacy Memory Mapped Configuration Registers Capability}
+\end{itemize}
-- 
2.26.2



* [virtio-dev] Re: [PATCH 02/11] transport-pci: Move transitional device id to legacy section
  2023-03-30 22:58 ` [virtio-dev] [PATCH 02/11] transport-pci: Move transitional device id to legacy section Parav Pandit
@ 2023-03-31  6:43   ` Michael S. Tsirkin
  2023-03-31 21:24     ` [virtio-dev] " Parav Pandit
  0 siblings, 1 reply; 200+ messages in thread
From: Michael S. Tsirkin @ 2023-03-31  6:43 UTC (permalink / raw)
  To: Parav Pandit; +Cc: virtio-dev, cohuck, virtio-comment, shahafs, Satananda Burla

On Fri, Mar 31, 2023 at 01:58:25AM +0300, Parav Pandit wrote:
> Currently PCI device discovery details for the transitional device
> are documented in two different sections.
> 
> For example, PCI device and vendor ID registers are documented in
> 'Device Requirements: PCI Device Discovery' section, while PCI
> revision id is documented in 'Legacy Interfaces: A Note on PCI
> Device Discovery' section.
> 
> Transitional devices requirements should be documented in "legacy
> interfaces" section as clearly mentioned in
> 'Legacy Interface: A Note on Feature Bits'.

I already commented on this, I disagree.
Modern drivers must be able
to completely ignore legacy interface sections, but they
do bind to transitional device IDs.
This change breaks this assumption.


> Hence,
> 1. Move transitional device requirements to its designated Legacy
>    interface section
> 2. Describe regular device requirements without quoting it as "non
>    transitional device"
> 
> While at it, write the description using a singular object definition.
> 
> Reviewed-by: Satananda Burla <sburla@marvell.com>
> Signed-off-by: Parav Pandit <parav@nvidia.com>
> ---
>  transport-pci.tex | 70 ++++++++++++++++++++++++-----------------------
>  1 file changed, 36 insertions(+), 34 deletions(-)
> 
> diff --git a/transport-pci.tex b/transport-pci.tex
> index 7f27107..1f74c6f 100644
> --- a/transport-pci.tex
> +++ b/transport-pci.tex
> @@ -28,46 +28,24 @@ \subsection{PCI Device Discovery}\label{sec:Virtio Transport Options / Virtio Ov
>  
>  \devicenormative{\subsubsection}{PCI Device Discovery}{Virtio Transport Options / Virtio Over PCI Bus / PCI Device Discovery}
>  
> -Devices MUST have the PCI Vendor ID 0x1af4.
> -Devices MUST either have the PCI Device ID calculated by adding 0x1040
> +The device MUST have the PCI Vendor ID 0x1af4.
> +The device MUST calculate PCI Device ID by adding 0x1040
>  to the Virtio Device ID, as indicated in section \ref{sec:Device
> -Types} or have the Transitional PCI Device ID depending on the device type,
> -as follows:
> -
> -\begin{tabular}{|l|c|}
> -\hline
> -Transitional PCI Device ID  &  Virtio Device    \\
> -\hline \hline
> -0x1000      &   network device     \\
> -\hline
> -0x1001     &   block device     \\
> -\hline
> -0x1002     & memory ballooning (traditional)  \\
> -\hline
> -0x1003     &      console       \\
> -\hline
> -0x1004     &     SCSI host      \\
> -\hline
> -0x1005     &  entropy source    \\
> -\hline
> -0x1009     &   9P transport     \\
> -\hline
> -\end{tabular}
> +Types}.
>  
>  For example, the network device with the Virtio Device ID 1
> -has the PCI Device ID 0x1041 or the Transitional PCI Device ID 0x1000.
> -
> -The PCI Subsystem Vendor ID and the PCI Subsystem Device ID MAY reflect
> -the PCI Vendor and Device ID of the environment (for informational purposes by the driver).
> +has the PCI Device ID 0x1041.
>  
> -Non-transitional devices SHOULD have a PCI Device ID in the range
> -0x1040 to 0x107f.
> -Non-transitional devices SHOULD have a PCI Revision ID of 1 or higher.
> -Non-transitional devices SHOULD have a PCI Subsystem Device ID of 0x40 or higher.
> +The device SHOULD have a PCI Device ID in the range 0x1040 to 0x107f.
> +The device SHOULD have a PCI Revision ID of 1 or higher.
> +The device SHOULD have a PCI Subsystem Device ID of 0x40 or higher.
>  
>  This is to reduce the chance of a legacy driver attempting
>  to drive the device.
>  
> +The PCI Subsystem Vendor ID and the PCI Subsystem Device ID MAY reflect
> +the PCI Vendor and Device ID of the environment (for informational purposes by the driver).
> +
>  \drivernormative{\subsubsection}{PCI Device Discovery}{Virtio Transport Options / Virtio Over PCI Bus / PCI Device Discovery}
>  Drivers MUST match devices with the PCI Vendor ID 0x1af4 and
>  the PCI Device ID in the range 0x1040 to 0x107f,
> @@ -85,8 +63,32 @@ \subsection{PCI Device Discovery}\label{sec:Virtio Transport Options / Virtio Ov
>  PCI Subsystem Device ID value.
>  
>  \subsubsection{Legacy Interfaces: A Note on PCI Device Discovery}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI Device Discovery / Legacy Interfaces: A Note on PCI Device Discovery}
> -Transitional devices MUST have a PCI Revision ID of 0.
> -Transitional devices MUST have the PCI Subsystem Device ID
> +
> +The transitional device has one of the following PCI Device ID
> +depending on the device type:
> +
> +\begin{tabular}{|l|c|}
> +\hline
> +Transitional PCI Device ID  &  Virtio Device    \\
> +\hline \hline
> +0x1000      &   network device     \\
> +\hline
> +0x1001     &   block device     \\
> +\hline
> +0x1002     & memory ballooning (traditional)  \\
> +\hline
> +0x1003     &      console       \\
> +\hline
> +0x1004     &     SCSI host      \\
> +\hline
> +0x1005     &  entropy source    \\
> +\hline
> +0x1009     &   9P transport     \\
> +\hline
> +\end{tabular}
> +
> +The transitional device MUST have a PCI Revision ID of 0.
> +The transitional device MUST have the PCI Subsystem Device ID
>  matching the Virtio Device ID, as indicated in section \ref{sec:Device Types}.
>  Transitional devices MUST have the Transitional PCI Device ID in
>  the range 0x1000 to 0x103f.
> -- 
> 2.26.2

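As a side note, the ID split the quoted text describes can be summarized
in a small sketch (illustrative helpers, not taken from the spec):

#include <stdbool.h>
#include <stdint.h>

static bool is_transitional_device_id(uint16_t id)
{
    return id >= 0x1000 && id <= 0x103f;   /* legacy-compatible IDs */
}

static bool is_modern_device_id(uint16_t id)
{
    return id >= 0x1040 && id <= 0x107f;   /* 0x1040 + Virtio Device ID */
}

A modern driver binds to both ranges; only a legacy driver additionally
relies on the PCI Revision ID being 0, which is the crux of the
disagreement below.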


* [virtio-dev] Re: [PATCH 00/11] Introduce transitional mmr pci device
  2023-03-30 22:58 [virtio-dev] [PATCH 00/11] Introduce transitional mmr pci device Parav Pandit
                   ` (10 preceding siblings ...)
  2023-03-30 22:58 ` [virtio-dev] [PATCH 11/11] conformance: Add transitional MMR interface conformance Parav Pandit
@ 2023-03-31  7:03 ` Michael S. Tsirkin
  2023-03-31 21:43   ` Parav Pandit
  2023-04-03 14:45 ` [virtio-dev] Re: [virtio-comment] " Stefan Hajnoczi
                   ` (3 subsequent siblings)
  15 siblings, 1 reply; 200+ messages in thread
From: Michael S. Tsirkin @ 2023-03-31  7:03 UTC (permalink / raw)
  To: Parav Pandit; +Cc: virtio-dev, cohuck, virtio-comment, shahafs

On Fri, Mar 31, 2023 at 01:58:23AM +0300, Parav Pandit wrote:
> Overview:
> ---------
> The Transitional MMR device is a variant of the transitional PCI device.
> It has its own small Device ID range. It does not have I/O
> region BAR; instead it exposes legacy configuration and device
> specific registers at an offset in the memory region BAR.
> 
> Such transitional MMR devices will be used at the scale of
> thousands of devices using PCI SR-IOV and/or future scalable
> virtualization technology to provide backward
> compatibility (for legacy devices) and also future
> compatibility with new features.
> 
> Usecase:
> --------
> 1. A hypervisor/system needs to provide transitional
>    virtio devices to the guest VM at scale of thousands,
>    typically, one to eight devices per VM.
> 
> 2. A hypervisor/system needs to provide such devices using a
>    vendor agnostic driver in the hypervisor system.
> 
> 3. A hypervisor system prefers to have single stack regardless of
>    virtio device type (net/blk) and be future compatible with a
>    single vfio stack using SR-IOV or other scalable device
>    virtualization technology to map PCI devices to the guest VM.
>    (as transitional or otherwise)
> 
> Motivation/Background:
> ----------------------
> The existing transitional PCI device is missing support for
> PCI SR-IOV based devices. Currently it does not work beyond
> PCI PF, or as software emulated device in reality. It currently
> has below cited system level limitations:
> 
> [a] PCIe spec citation:
> VFs do not support I/O Space and thus VF BARs shall not
> indicate I/O Space.
> 
> [b] cpu arch citation:
> Intel 64 and IA-32 Architectures Software Developer’s Manual:
> The processor’s I/O address space is separate and distinct from
> the physical-memory address space. The I/O address space consists
> of 64K individually addressable 8-bit I/O ports, numbered 0 through FFFFH.
> 
> [c] PCIe spec citation:
> If a bridge implements an I/O address range,...I/O address range
> will be aligned to a 4 KB boundary.
> 
> [d] I/O region accesses at PCI system level is slow as they are non-posted
> operations in PCIe fabric.
> 
> The usecase requirements and limitations above can be solved by
> extending the transitional device, mapping legacy and device
> specific configuration registers in a memory PCI BAR instead
> of using non composable I/O region.
> 
> Please review.

So as you explain in a lot of detail above, IO support is going away,
so the transitional device can no longer be used through the
legacy interface.

OK but this does not answer the following question:
since a legacy driver can not bind to this type of MMR device,
a new driver is needed anyway so
why not implement a modern driver?


I think we discussed this at some call and it made some kind of sense.
Unfortunately it has been a while and I am not sure I remember the
detail, so I can no longer say for sure whether this proposal is fit for
the purpose.  Here is what I vaguely remember:

A valid use-case is an emulation layer (e.g. a hypervisor) translating
a legacy driver I/O accesses to MMIO. Ideally layering this emulation
on top of a modern device would work ok
but there are several things making this approach problematic.
One is a different virtio net header size between legacy and modern
driver. Another is use of control VQ by modern where legacy used
IO writes. In both cases the difference would require the
emulation getting involved on the DMA path, in particular
somehow finding private addresses for communication between
emulation and modern device.
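For reference, the header-size difference mentioned here is the legacy
10-byte virtio-net header versus the modern 12-byte one that always
carries num_buffers; a sketch of the two layouts (the legacy struct name
is illustrative):

#include <stdint.h>

/* Legacy virtio-net header: 10 bytes when VIRTIO_NET_F_MRG_RXBUF is not
 * negotiated (num_buffers is only appended with that feature). */
struct virtio_net_hdr_legacy {
    uint8_t  flags;
    uint8_t  gso_type;
    uint16_t hdr_len;
    uint16_t gso_size;
    uint16_t csum_start;
    uint16_t csum_offset;
};

/* Modern (VIRTIO 1.0+) header: num_buffers is always present, so the
 * header is 12 bytes regardless of feature negotiation. */
struct virtio_net_hdr_v1 {
    uint8_t  flags;
    uint8_t  gso_type;
    uint16_t hdr_len;
    uint16_t gso_size;
    uint16_t csum_start;
    uint16_t csum_offset;
    uint16_t num_buffers;
};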


Does above summarize it reasonably?


And if yes, would an alternative approach of adding legacy config
support to transport vq work well?  I can not say I thought about this
deeply so maybe there's some problem, or maybe it's a worse approach -
could you comment on this? It looks like this could be a smaller change,
but maybe it isn't? Did you consider this option?


More review later.



> Patch summary:
> --------------
> patch 1 to 5 prepares the spec
> patch 6 to 11 defines transitional mmr device
> 
> patch-1 uses lower case alphabets to name device id
> patch-2 move transitional device id in legacy section along with
>         revision id
> patch-3 splits legacy feature bits description from device id
> patch-4 rename and moves virtio config registers next to 1.x
>         registers section
> patch-5 Adds missing helper verb in terminology definitions
> patch-6 introduces transitional mmr device
> patch-7 introduces transitional mmr device pci device ids
> patch-8 introduces virtio extended pci capability
> patch-9 describes new pci capability to locate legacy mmr
>         registers
> patch-10 extended usage of driver notification capability for
>          the transitional mmr device
> patch-11 adds conformance section of the transitional mmr device
> 
> This design and details further described below.
> 
> Design:
> -------
> Below picture captures the main small difference between current
> transitional PCI SR-IOV VF and transitional MMR SR-IOV VF.
> 
> +------------------+ +--------------------+ +--------------------+
> |virtio 1.x        | |Transitional        | |Transitional        |
> |SRIOV VF          | |SRIOV VF            | |MMR SRIOV VF        |
> |                  | |                    | |                    |
> ++---------------+ | ++---------------+   | ++---------------+   |
> ||dev_id =       | | ||dev_id =       |   | ||dev_id =       |   |
> ||{0x1040-0x106C}| | ||{0x1000-0x103f}|   | ||{0x10f9-0x10ff}|   |
> |+---------------+ | |+---------------+   | |+---------------+   |
> |                  | |                    | |                    |
> |+------------+    | |+------------+      | |+-----------------+ |
> ||Memory BAR  |    | ||Memory BAR  |      | ||Memory BAR       | |
> |+------------+    | |+------------+      | ||                 | |
> |                  | |                    | || +--------------+| |
> |                  | |+-----------------+ | || |legacy virtio || |
> |                  | ||IOBAR impossible | | || |+ dev cfg     || |
> |                  | |+-----------------+ | || |registers     || |
> |                  | |                    | || +--------------+| |
> |                  | |                    | |+-----------------+ |
> +------------------+ +--------------------+ +--------------------+
> 
> Here transitional MMR SR-IOV VF has legacy configuration and
> legacy device specific registers located at an offset in the memory
> region BAR.
> 
> A memory region can be dedicated at BAR0 or it can be in an
> existing BAR, allowing flexibility when implementing support
> in a hardware device.
> 
> Transitional MMR SR-IOV VFs use a distinct device ID range to that
> of existing virtio SR-IOV VFs to allow flexibility in driver
> binding.
> 
> A more zoom-in version of transitional MMR SR-IOV device shows
> that the location of the legacy registers are discovered by the
> driver using a new capability.
> 
> +------------------------------+
> |Transitional                  |
> |MMR SRIOV VF                  |
> |                              |
> ++---------------+             |
> ||dev_id =       |             |
> ||{0x10f9-0x10ff}|             |
> |+---------------+             |
> |                              |
> ++--------------------+        |
> || PCIe ext cap = 0xB |        |
> || cfg_type = 10      |        |
> || offset   = 0x1000  |        |
> || bar      = A {0..5}|        |
> |+--|-----------------+        |
> |   |                          |
> |   |                          |
> |   |    +-------------------+ |
> |   |    | Memory BAR = A    | |
> |   |    |                   | |
> |   +------>+--------------+ | |
> |        |  |legacy virtio | | |
> |        |  |+ dev cfg     | | |
> |        |  |registers     | | |
> |        |  +--------------+ | |
> |        +-----------------+ | |
> +------------------------------+
> 
> Software usage:
> ---------------
> Transitional MMR device can be used by multiple ways.
> 
> 1. The most common way to use and map to the guest VM is by
> using vfio driver framework in Linux kernel.
> 
>                 +----------------------+
>                 |pci_dev_id = 0x100X   |
> +---------------|pci_rev_id = 0x0      |-----+
> |vfio device    |BAR0 = I/O region     |     |
> |               |Other attributes      |     |
> |               +----------------------+     |
> |                                            |
> +   +--------------+     +-----------------+ |
> |   |I/O to memory |     | Other vfio      | |
> |   |rd/wr mapper  |     | functionalities | |
> |   +--------------+     +-----------------+ |
> |                                            |
> +-------------------+------------------------+
>                     |
>        +------------+-----------------+
>        |         Transitional         |
>        |         MMR SRIOV VF         |
>        +------------------------------+
> 
> 2. Virtio pci driver to bind to the listed device id and
>    use it as native device in the host.
> 
> 3. Use it in a light weight hypervisor to run bare-metal OS.
> 
> Parav Pandit (11):
>   transport-pci: Use lowecase alphabets
>   transport-pci: Move transitional device id to legacy section
>   transport-pci: Split notes of PCI Device Layout
>   transport-pci: Rename and move legacy PCI Device layout section
>   introduction: Add missing helping verb
>   introduction: Introduce transitional MMR interface
>   transport-pci: Introduce transitional MMR device id
>   transport-pci: Introduce virtio extended capability
>   transport-pci: Describe PCI MMR dev config registers
>   transport-pci: Use driver notification PCI capability
>   conformance: Add transitional MMR interface conformance
> 
>  conformance.tex      |  11 +-
>  introduction.tex     |  34 +++-
>  tmmr-conformance.tex |  27 +++
>  transport-pci.tex    | 405 ++++++++++++++++++++++++++++++-------------
>  4 files changed, 354 insertions(+), 123 deletions(-)
>  create mode 100644 tmmr-conformance.tex
> 
> -- 
> 2.26.2



* [virtio-dev] RE: [PATCH 02/11] transport-pci: Move transitional device id to legacy section
  2023-03-31  6:43   ` [virtio-dev] " Michael S. Tsirkin
@ 2023-03-31 21:24     ` Parav Pandit
  2023-04-02  7:54       ` [virtio-dev] " Michael S. Tsirkin
  0 siblings, 1 reply; 200+ messages in thread
From: Parav Pandit @ 2023-03-31 21:24 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: virtio-dev, cohuck, virtio-comment, Shahaf Shuler, Satananda Burla



> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Friday, March 31, 2023 2:44 AM
> 
> On Fri, Mar 31, 2023 at 01:58:25AM +0300, Parav Pandit wrote:
> > Currently PCI device discovery details for the transitional device are
> > documented in two different sections.
> >
> > For example, PCI device and vendor ID registers are documented in
> > 'Device Requirements: PCI Device Discovery' section, while PCI
> > revision id is documented in 'Legacy Interfaces: A Note on PCI Device
> > Discovery' section.
> >
> > Transitional devices requirements should be documented in "legacy
> > interfaces" section as clearly mentioned in 'Legacy Interface: A Note
> > on Feature Bits'.
> 
> I already commented on this, I disagree.
> Modern drivers must be able
> to completely ignore legacy interface sections, but they do bind to transitional
> device IDs.
> This change breaks this assumption.
> 
The legacy interface section holds the details about transitional devices.
We do not have a "legacy only" section.

It doesn't make sense to put partial information in the legacy section
and partial information elsewhere.
Modern drivers are not mentioned in the spec terminology section.

Can you please explain how a modern driver can ignore the text "Transitional devices MUST have a PCI Revision ID of 0." written in the legacy interface section?


* [virtio-dev] Re: [PATCH 00/11] Introduce transitional mmr pci device
  2023-03-31  7:03 ` [virtio-dev] Re: [PATCH 00/11] Introduce transitional mmr pci device Michael S. Tsirkin
@ 2023-03-31 21:43   ` Parav Pandit
  2023-04-03 14:53     ` Michael S. Tsirkin
  0 siblings, 1 reply; 200+ messages in thread
From: Parav Pandit @ 2023-03-31 21:43 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: virtio-dev, cohuck, virtio-comment, shahafs



On 3/31/2023 3:03 AM, Michael S. Tsirkin wrote:
> 
> OK but this does not answer the following question:
> since a legacy driver can not bind to this type of MMR device,
> a new driver is needed anyway so
> why not implement a modern driver?
> 
Not sure I follow "implement a modern driver".
If you mean a hypervisor driver over a modern driver, then yes, you
captured those two problems below.

More reply below.

> 
> I think we discussed this at some call and it made some kind of sense.
Yep.
> Unfortunately it has been a while and I am not sure I remember the
> detail, so I can no longer say for sure whether this proposal is fit for
> the purpose.  Here is what I vaguely remember:
> 
> A valid use-case is an emulation layer (e.g. a hypervisor) translating
> a legacy driver I/O accesses to MMIO. 
Yes.

> Ideally layering this emulation
> on top of a modern device would work ok
> but there are several things making this approach problematic.
Right.

> One is a different virtio net header size between legacy and modern
> driver. Another is use of control VQ by modern where legacy used
> IO writes. In both cases the different would require the
> emulation getting involved on the DMA path, in particular
> somehow finding private addresses for communication between
> emulation and modern device.
> 
Both of these issues are resolved by this proposal.

> 
> Does above summarize it reasonably?
> 
> 
> And if yes, would an alternative approach of adding legacy config
> support to transport vq work well?

The VF supplies the legacy config region (a subset of 1.x) in a memory
mapped area.

A transport vq on the parent PF is yet another option for legacy
register emulation. I think latency-wise it will be a lot higher,
though that is not of great importance.

The good part of a transport vq is that device reset is better, as it
can act as a slow operation.

Given that the device already implements part of these registers in the
1.x memory mapped area, it is reasonable for the device to provide
similar registers via a memory map (legacy is a subset; no new
addition).

> I can not say I thought about this
> deeply so maybe there's some problem, or maybe it's a worse approach -
> could you comment on this? It looks like this could be a smaller change,
> but maybe it isn't? Did you consider this option?

We can possibly leave both options open for device vendors to implement.

Change-wise, a transport VQ is a fairly big addition for both the
hypervisor driver and the device.

> 
> 
> More review later.
>
ok.


* [virtio-dev] Re: [PATCH 02/11] transport-pci: Move transitional device id to legacy section
  2023-03-31 21:24     ` [virtio-dev] " Parav Pandit
@ 2023-04-02  7:54       ` Michael S. Tsirkin
  2023-04-03 14:42         ` [virtio-dev] " Parav Pandit
  0 siblings, 1 reply; 200+ messages in thread
From: Michael S. Tsirkin @ 2023-04-02  7:54 UTC (permalink / raw)
  To: Parav Pandit
  Cc: virtio-dev, cohuck, virtio-comment, Shahaf Shuler, Satananda Burla

On Fri, Mar 31, 2023 at 09:24:21PM +0000, Parav Pandit wrote:
> 
> 
> > From: Michael S. Tsirkin <mst@redhat.com>
> > Sent: Friday, March 31, 2023 2:44 AM
> > 
> > On Fri, Mar 31, 2023 at 01:58:25AM +0300, Parav Pandit wrote:
> > > Currently PCI device discovery details for the transitional device are
> > > documented in two different sections.
> > >
> > > For example, PCI device and vendor ID registers are documented in
> > > 'Device Requirements: PCI Device Discovery' section, while PCI
> > > revision id is documented in 'Legacy Interfaces: A Note on PCI Device
> > > Discovery' section.
> > >
> > > Transitional devices requirements should be documented in "legacy
> > > interfaces" section as clearly mentioned in 'Legacy Interface: A Note
> > > on Feature Bits'.
> > 
> > I already commented on this, I disagree.
> > Modern drivers must be able
> > to completely ignore legacy interface sections, but they do bind to transitional
> > device IDs.
> > This change breaks this assumption.
> > 
> Legacy interface section holds the detail about transitional devices.
> We do not have,
> "Legacy only" section.
> 
> It doesn't make sense to partial information in legacy and partial in other place.
> Modern drivers are not mentioned in the spec terminology section.
> 
> Can you please explain, how can modern driver ignore the text " Transitional devices MUST have a PCI Revision ID of 0." written in legacy interface section?

Modern drivers ignore revision ID. It is 0 to accomodate legacy drivers.

-- 
MST



* [virtio-dev] RE: [PATCH 02/11] transport-pci: Move transitional device id to legacy section
  2023-04-02  7:54       ` [virtio-dev] " Michael S. Tsirkin
@ 2023-04-03 14:42         ` Parav Pandit
  2023-04-03 14:50           ` [virtio-dev] " Michael S. Tsirkin
  0 siblings, 1 reply; 200+ messages in thread
From: Parav Pandit @ 2023-04-03 14:42 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: virtio-dev, cohuck, virtio-comment, Shahaf Shuler, Satananda Burla



> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Sunday, April 2, 2023 3:55 AM
> 
> On Fri, Mar 31, 2023 at 09:24:21PM +0000, Parav Pandit wrote:
> >
> >
> > > From: Michael S. Tsirkin <mst@redhat.com>
> > > Sent: Friday, March 31, 2023 2:44 AM
> > >
> > > On Fri, Mar 31, 2023 at 01:58:25AM +0300, Parav Pandit wrote:
> > > > Currently PCI device discovery details for the transitional device
> > > > are documented in two different sections.
> > > >
> > > > For example, PCI device and vendor ID registers are documented in
> > > > 'Device Requirements: PCI Device Discovery' section, while PCI
> > > > revision id is documented in 'Legacy Interfaces: A Note on PCI
> > > > Device Discovery' section.
> > > >
> > > > Transitional devices requirements should be documented in "legacy
> > > > interfaces" section as clearly mentioned in 'Legacy Interface: A
> > > > Note on Feature Bits'.
> > >
> > > I already commented on this, I disagree.
> > > Modern drivers must be able
> > > to completely ignore legacy interface sections, but they do bind to
> > > transitional device IDs.
> > > This change breaks this assumption.
> > >
> > Legacy interface section holds the detail about transitional devices.
> > We do not have,
> > "Legacy only" section.
> >
> > It doesn't make sense to partial information in legacy and partial in other
> place.
> > Modern drivers are not mentioned in the spec terminology section.
> >
> > Can you please explain, how can modern driver ignore the text " Transitional
> devices MUST have a PCI Revision ID of 0." written in legacy interface section?
> 
> Modern drivers ignore revision ID. It is 0 to accommodate legacy drivers.
For a transitional device the revision ID must be zero, and "a driver" must check that it is zero.
The section refers to the legacy interface covering transitional devices (not just legacy devices).
So it cannot be written from the point of view of a "modern driver", which the spec does not define.


* [virtio-dev] Re: [virtio-comment] [PATCH 00/11] Introduce transitional mmr pci device
  2023-03-30 22:58 [virtio-dev] [PATCH 00/11] Introduce transitional mmr pci device Parav Pandit
                   ` (11 preceding siblings ...)
  2023-03-31  7:03 ` [virtio-dev] Re: [PATCH 00/11] Introduce transitional mmr pci device Michael S. Tsirkin
@ 2023-04-03 14:45 ` Stefan Hajnoczi
  2023-04-03 14:53   ` Parav Pandit
  2023-04-12  4:48 ` [virtio-dev] " Michael S. Tsirkin
                   ` (2 subsequent siblings)
  15 siblings, 1 reply; 200+ messages in thread
From: Stefan Hajnoczi @ 2023-04-03 14:45 UTC (permalink / raw)
  To: Parav Pandit; +Cc: mst, virtio-dev, cohuck, virtio-comment, shahafs


On Fri, Mar 31, 2023 at 01:58:23AM +0300, Parav Pandit wrote:
> Overview:
> ---------
> The Transitional MMR device is a variant of the transitional PCI device.

What does "MMR" mean?

> It has its own small Device ID range. It does not have I/O
> region BAR; instead it exposes legacy configuration and device
> specific registers at an offset in the memory region BAR.
> 
> Such transitional MMR devices will be used at the scale of
> thousands of devices using PCI SR-IOV and/or future scalable
> virtualization technology to provide backward
> compatibility (for legacy devices) and also future
> compatibility with new features.
> 
> Usecase:
> --------
> 1. A hypervisor/system needs to provide transitional
>    virtio devices to the guest VM at scale of thousands,
>    typically, one to eight devices per VM.
> 
> 2. A hypervisor/system needs to provide such devices using a
>    vendor agnostic driver in the hypervisor system.

Is the idea that the hypervisor configures the new Transitional MMR
devices and makes them appear like virtio-pci Transitional devices?

In other words, the guest doesn't know about Transitional MMR and does
not need any code changes.

> 3. A hypervisor system prefers to have single stack regardless of
>    virtio device type (net/blk) and be future compatible with a
>    single vfio stack using SR-IOV or other scalable device
>    virtualization technology to map PCI devices to the guest VM.
>    (as transitional or otherwise)

What does this paragraph mean?

> 
> Motivation/Background:
> ----------------------
> The existing transitional PCI device is missing support for
> PCI SR-IOV based devices. Currently it does not work beyond
> PCI PF, or as software emulated device in reality. It currently
> has below cited system level limitations:
> 
> [a] PCIe spec citation:
> VFs do not support I/O Space and thus VF BARs shall not
> indicate I/O Space.
> 
> [b] cpu arch citation:
> Intel 64 and IA-32 Architectures Software Developer’s Manual:
> The processor’s I/O address space is separate and distinct from
> the physical-memory address space. The I/O address space consists
> of 64K individually addressable 8-bit I/O ports, numbered 0 through FFFFH.
> 
> [c] PCIe spec citation:
> If a bridge implements an I/O address range,...I/O address range
> will be aligned to a 4 KB boundary.
> 
> [d] I/O region accesses at PCI system level is slow as they are non-posted
> operations in PCIe fabric.
> 
> The usecase requirements and limitations above can be solved by
> extending the transitional device, mapping legacy and device
> specific configuration registers in a memory PCI BAR instead
> of using non composable I/O region.
> 
> Please review.

Modern devices were added to Linux in 2014 and support SR-IOV. Why is it
important to support Transitional (which really means Legacy devices,
otherwise Modern devices would be sufficient)?

> [...]

* [virtio-dev] Re: [PATCH 02/11] transport-pci: Move transitional device id to legacy section
  2023-04-03 14:42         ` [virtio-dev] " Parav Pandit
@ 2023-04-03 14:50           ` Michael S. Tsirkin
  2023-04-03 14:58             ` [virtio-dev] " Parav Pandit
  0 siblings, 1 reply; 200+ messages in thread
From: Michael S. Tsirkin @ 2023-04-03 14:50 UTC (permalink / raw)
  To: Parav Pandit
  Cc: virtio-dev, cohuck, virtio-comment, Shahaf Shuler, Satananda Burla

On Mon, Apr 03, 2023 at 02:42:15PM +0000, Parav Pandit wrote:
> 
> 
> > From: Michael S. Tsirkin <mst@redhat.com>
> > Sent: Sunday, April 2, 2023 3:55 AM
> > 
> > On Fri, Mar 31, 2023 at 09:24:21PM +0000, Parav Pandit wrote:
> > >
> > >
> > > > From: Michael S. Tsirkin <mst@redhat.com>
> > > > Sent: Friday, March 31, 2023 2:44 AM
> > > >
> > > > On Fri, Mar 31, 2023 at 01:58:25AM +0300, Parav Pandit wrote:
> > > > > Currently PCI device discovery details for the transitional device
> > > > > are documented in two different sections.
> > > > >
> > > > > For example, PCI device and vendor ID registers are documented in
> > > > > 'Device Requirements: PCI Device Discovery' section, while PCI
> > > > > revision id is documented in 'Legacy Interfaces: A Note on PCI
> > > > > Device Discovery' section.
> > > > >
> > > > > Transitional devices requirements should be documented in "legacy
> > > > > interfaces" section as clearly mentioned in 'Legacy Interface: A
> > > > > Note on Feature Bits'.
> > > >
> > > > I already commented on this, I disagree.
> > > > Modern drivers must be able
> > > > to completely ignore legacy interface sections, but they do bind to
> > > > transitional device IDs.
> > > > This change breaks this assumption.
> > > >
> > > Legacy interface section holds the detail about transitional devices.
> > > We do not have,
> > > "Legacy only" section.
> > >
> > > It doesn't make sense to partial information in legacy and partial in other
> > place.
> > > Modern drivers are not mentioned in the spec terminology section.
> > >
> > > Can you please explain, how can modern driver ignore the text " Transitional
> > devices MUST have a PCI Revision ID of 0." written in legacy interface section?
> > 
> > Modern drivers ignore revision ID. It is 0 to accommodate legacy drivers.
> For transitional device the revision ID must be zero and "a driver" miust check it be zero.
> The section refers to Legacy interface covering transitional devices (not just legacy device).
> So it cannot be written in spec from undefined modern driver POV in the spec.

No idea what all this means, sorry.  Please do not move text that
affects modern drivers to a legacy section. And we've spilled way too
much ink on this already.

-- 
MST



* [virtio-dev] Re: [PATCH 00/11] Introduce transitional mmr pci device
  2023-03-31 21:43   ` Parav Pandit
@ 2023-04-03 14:53     ` Michael S. Tsirkin
  2023-04-03 14:57       ` [virtio-dev] " Parav Pandit
  0 siblings, 1 reply; 200+ messages in thread
From: Michael S. Tsirkin @ 2023-04-03 14:53 UTC (permalink / raw)
  To: Parav Pandit; +Cc: virtio-dev, cohuck, virtio-comment, shahafs

On Fri, Mar 31, 2023 at 05:43:11PM -0400, Parav Pandit wrote:
> > I can not say I thought about this
> > deeply so maybe there's some problem, or maybe it's a worse approach -
> > could you comment on this? It looks like this could be a smaller change,
> > but maybe it isn't? Did you consider this option?
> 
> We can possibly let both the options open for device vendors to implement.
> 
> Change wise transport VQ is fairly big addition for both hypervisor driver
> and also for the device.

OTOH it is presumably required for scalability anyway, no?
And presumably it can all be done in firmware ...
Is there actual hardware that can't implement transport vq
but is going to implement the mmr spec?

-- 
MST



* [virtio-dev] Re: [virtio-comment] [PATCH 00/11] Introduce transitional mmr pci device
  2023-04-03 14:45 ` [virtio-dev] Re: [virtio-comment] " Stefan Hajnoczi
@ 2023-04-03 14:53   ` Parav Pandit
  2023-04-03 17:48     ` Michael S. Tsirkin
  2023-04-03 19:10     ` Stefan Hajnoczi
  0 siblings, 2 replies; 200+ messages in thread
From: Parav Pandit @ 2023-04-03 14:53 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: mst, virtio-dev, cohuck, virtio-comment, shahafs



On 4/3/2023 10:45 AM, Stefan Hajnoczi wrote:
> On Fri, Mar 31, 2023 at 01:58:23AM +0300, Parav Pandit wrote:
>> Overview:
>> ---------
>> The Transitional MMR device is a variant of the transitional PCI device.
> 
> What does "MMR" mean?
> 
Memory mapped registers.
This is explained below in the design section and also in the relevant
patches 6 to 11.

>> It has its own small Device ID range. It does not have I/O
>> region BAR; instead it exposes legacy configuration and device
>> specific registers at an offset in the memory region BAR.
>>
>> Such transitional MMR devices will be used at the scale of
>> thousands of devices using PCI SR-IOV and/or future scalable
>> virtualization technology to provide backward
>> compatibility (for legacy devices) and also future
>> compatibility with new features.
>>
>> Usecase:
>> --------
>> 1. A hypervisor/system needs to provide transitional
>>     virtio devices to the guest VM at scale of thousands,
>>     typically, one to eight devices per VM.
>>
>> 2. A hypervisor/system needs to provide such devices using a
>>     vendor agnostic driver in the hypervisor system.
> 
> Is the idea that the hypervisor configures the new Transitional MMR
> devices and makes them appear like virtio-pci Transitional devices?
>
Yes, but the hypervisor is not involved in any configuration parsing or
anything of that nature.
It is only a passthrough forwarder from the emulated IOBAR to the memory
mapped legacy registers.
In other words, the hypervisor does not care about the register contents
at all.
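A minimal sketch of such a forwarder, assuming mmr_base is the
hypervisor's mapping of the BAR at the capability offset (all names are
illustrative, not from the spec):

#include <stdint.h>

extern volatile uint8_t *mmr_base;   /* BAR A + cap.offset, pre-mapped */

/* Forward a trapped legacy I/O BAR read from the guest to the memory
 * mapped legacy register window; the contents are never interpreted. */
static uint32_t legacy_io_read(uint16_t port_off, int size)
{
    switch (size) {
    case 1:  return *(volatile uint8_t  *)(mmr_base + port_off);
    case 2:  return *(volatile uint16_t *)(mmr_base + port_off);
    default: return *(volatile uint32_t *)(mmr_base + port_off);
    }
}

static void legacy_io_write(uint16_t port_off, uint32_t val, int size)
{
    switch (size) {
    case 1:  *(volatile uint8_t  *)(mmr_base + port_off) = (uint8_t)val;  break;
    case 2:  *(volatile uint16_t *)(mmr_base + port_off) = (uint16_t)val; break;
    default: *(volatile uint32_t *)(mmr_base + port_off) = val;           break;
    }
}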

> In other words, the guest doesn't know about Transitional MMR and does
> not need any code changes.
> 
>> 3. A hypervisor system prefers to have single stack regardless of
>>     virtio device type (net/blk) and be future compatible with a
>>     single vfio stack using SR-IOV or other scalable device
>>     virtualization technology to map PCI devices to the guest VM.
>>     (as transitional or otherwise)
> 
> What does this paragraph mean?
>
It means that regardless of whether a VF is a transitional MMR VF or a
1.x VF without any MMR extensions, there is a single vfio virtio driver
handling both types of devices to map them to the guest VM.

> 
> Modern devices were added to Linux in 2014 and support SR-IOV. 

> Why is it
> important to support Transitional (which really means Legacy devices,
> otherwise Modern devices would be sufficient)?
>
To support guest VMs which only understand legacy devices;
unfortunately, such guests are still in wide use.



* [virtio-dev] RE: [PATCH 00/11] Introduce transitional mmr pci device
  2023-04-03 14:53     ` Michael S. Tsirkin
@ 2023-04-03 14:57       ` Parav Pandit
  2023-04-03 15:06         ` [virtio-dev] " Michael S. Tsirkin
  0 siblings, 1 reply; 200+ messages in thread
From: Parav Pandit @ 2023-04-03 14:57 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: virtio-dev, cohuck, virtio-comment, Shahaf Shuler



> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Monday, April 3, 2023 10:53 AM
> 
> On Fri, Mar 31, 2023 at 05:43:11PM -0400, Parav Pandit wrote:
> > > I can not say I thought about this
> > > deeply so maybe there's some problem, or maybe it's a worse approach
> > > - could you comment on this? It looks like this could be a smaller
> > > change, but maybe it isn't? Did you consider this option?
> >
> > We can possibly let both the options open for device vendors to implement.
> >
> > Change wise transport VQ is fairly big addition for both hypervisor
> > driver and also for the device.
> 
> OTOH it is presumably required for scalability anyway, no?
No.
Most new generation SIOV and SR-IOV devices operate without any para-virtualization.

> And presumably it can all be done in firmware ...
> Is there actual hardware that can't implement transport vq but is going to
> implement the mmr spec?
> 
Nvidia and Marvell DPUs implement the MMR spec.
A transport VQ has very high latency and DMA overheads for 2-to-4-byte reads and writes.

And before discussing "why not that approach", let's finish reviewing "this approach" first.
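
To make the overhead claim above concrete, here is a rough sketch of
what a 2-byte register read could look like as a queue command. The
layout is invented for illustration and is not defined by this series:

    #include <stdint.h>

    /* Invented example layout, for illustration only. */
    struct tvq_reg_read_cmd {
        uint16_t opcode;     /* hypothetical REG_READ */
        uint16_t vf_number;  /* which member device */
        uint32_t reg_offset; /* legacy register offset */
        uint32_t length;     /* 2 to 4 bytes */
    };

    /* Per access: the driver writes the command to memory, the device
     * DMAs the command in, executes it, and DMAs the completion (plus
     * the read data) back -- several DMA round trips for a 2-byte
     * payload, versus a single transaction for a direct MMR access. */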




^ permalink raw reply	[flat|nested] 200+ messages in thread

* [virtio-dev] RE: [PATCH 02/11] transport-pci: Move transitional device id to legacy section
  2023-04-03 14:50           ` [virtio-dev] " Michael S. Tsirkin
@ 2023-04-03 14:58             ` Parav Pandit
  2023-04-03 15:14               ` [virtio-dev] " Michael S. Tsirkin
  0 siblings, 1 reply; 200+ messages in thread
From: Parav Pandit @ 2023-04-03 14:58 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: virtio-dev, cohuck, virtio-comment, Shahaf Shuler, Satananda Burla


> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Monday, April 3, 2023 10:50 AM

> 
> No idea what all this means, sorry.  Please do not move text that affects modern
> drivers to a legacy section. And we've spilled way too much ink on this already.

I disagree, because the spec does not describe a modern driver, and what you are describing is not aligned with the way the current spec is written.
I prefer to avoid mentioning it again in the same feature bits section that talks about the Transitional interface.



^ permalink raw reply	[flat|nested] 200+ messages in thread

* [virtio-dev] Re: [PATCH 00/11] Introduce transitional mmr pci device
  2023-04-03 14:57       ` [virtio-dev] " Parav Pandit
@ 2023-04-03 15:06         ` Michael S. Tsirkin
  2023-04-03 15:16           ` [virtio-dev] " Parav Pandit
  0 siblings, 1 reply; 200+ messages in thread
From: Michael S. Tsirkin @ 2023-04-03 15:06 UTC (permalink / raw)
  To: Parav Pandit; +Cc: virtio-dev, cohuck, virtio-comment, Shahaf Shuler

On Mon, Apr 03, 2023 at 02:57:26PM +0000, Parav Pandit wrote:
> 
> 
> > From: Michael S. Tsirkin <mst@redhat.com>
> > Sent: Monday, April 3, 2023 10:53 AM
> > 
> > On Fri, Mar 31, 2023 at 05:43:11PM -0400, Parav Pandit wrote:
> > > > I can not say I thought about this
> > > > deeply so maybe there's some problem, or maybe it's a worse approach
> > > > - could you comment on this? It looks like this could be a smaller
> > > > change, but maybe it isn't? Did you consider this option?
> > >
> > > We can possibly let both the options open for device vendors to implement.
> > >
> > > Change wise transport VQ is fairly big addition for both hypervisor
> > > driver and also for the device.
> > 
> > OTOH it is presumably required for scalability anyway, no?
> No.
> Most new generation SIOV and SR-IOV devices operate without any para-virtualization.

Don't see the connection to PV. You need an emulation layer in the host
if you want to run legacy guests. Looks like it could do transport vq
just as well.

> > And presumably it can all be done in firmware ...
> > Is there actual hardware that can't implement transport vq but is going to
> > implement the mmr spec?
> > 
> Nvidia and Marvell DPUs implement MMR spec.

Hmm implement it in what sense exactly?

> Transport VQ has very high latency and DMA overheads for 2 to 4 bytes read/write.

How many of these 2 byte accesses trigger from a typical guest?

> And before discussing "why not that approach", lets finish reviewing "this approach" first.

That's a weird way to put it. We don't want so many ways to do legacy
if we can help it.

-- 
MST




^ permalink raw reply	[flat|nested] 200+ messages in thread

* [virtio-dev] Re: [PATCH 02/11] transport-pci: Move transitional device id to legacy section
  2023-04-03 14:58             ` [virtio-dev] " Parav Pandit
@ 2023-04-03 15:14               ` Michael S. Tsirkin
  0 siblings, 0 replies; 200+ messages in thread
From: Michael S. Tsirkin @ 2023-04-03 15:14 UTC (permalink / raw)
  To: Parav Pandit
  Cc: virtio-dev, cohuck, virtio-comment, Shahaf Shuler, Satananda Burla

On Mon, Apr 03, 2023 at 02:58:52PM +0000, Parav Pandit wrote:
> 
> > From: Michael S. Tsirkin <mst@redhat.com>
> > Sent: Monday, April 3, 2023 10:50 AM
> 
> > 
> > No idea what all this means, sorry.  Please do not move text that affects modern
> > drivers to a legacy section. And we've spilled way too much ink on this already.
> 
> I disagree because spec do not describe modern driver and what you are describing is not aligned the way current spec is written.
> I prefer to avoid mentioning it again the same feature bits section that talks about Transitional interface.

Sorry, I don't understand what you are trying to say here.  This is all
cosmetics, a matter of personal preference.  But I did my best to try to
explain the reason this is not a cleanup but a breaking change. Was I
misunderstood, or do you just not agree? No idea.
The reason for current placement is this:

	A conformant implementation MUST be either transitional or
	non-transitional, see \ref{intro:Legacy
	Interface: Terminology}.

	An implementation MAY choose to implement OPTIONAL support for the
	legacy interface, including support for legacy drivers
	or devices, by conforming to all of the MUST or
	REQUIRED level requirements for the legacy interface
	for the transitional devices and drivers.

	The requirements for the legacy interface for transitional implementations
	are located in sections named ``Legacy Interface'' listed below:


Binding to a transitional ID is mandatory for modern drivers.
*This* is why this ID cannot go to the legacy section - all of the legacy
sections are and must stay optional.

What is true (and unfortunate) is that legacy sections are not as formal
as modern ones - originally we wanted them to be informational only. For
example there is no clear separation between driver and device
conformance sections.  Work on this if you like, that is welcome.
But please stop moving mandatory text to legacy sections.

-- 
MST


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 200+ messages in thread

* [virtio-dev] RE: [PATCH 00/11] Introduce transitional mmr pci device
  2023-04-03 15:06         ` [virtio-dev] " Michael S. Tsirkin
@ 2023-04-03 15:16           ` Parav Pandit
  2023-04-03 15:23             ` [virtio-dev] " Michael S. Tsirkin
  0 siblings, 1 reply; 200+ messages in thread
From: Parav Pandit @ 2023-04-03 15:16 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: virtio-dev, cohuck, virtio-comment, Shahaf Shuler



> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Monday, April 3, 2023 11:07 AM

> > > OTOH it is presumably required for scalability anyway, no?
> > No.
> > Most new generation SIOV and SR-IOV devices operate without any para-
> virtualization.
> 
> Don't see the connection to PV. You need an emulation layer in the host if you
> want to run legacy guests. Looks like it could do transport vq just as well.
>
A transport vq for the legacy MMR purpose seems fine, despite its latency and DMA overheads.
Your question was about "scalability".
After your latest response, I am unclear what "scalability" means.
Do you mean saving the register space in the PCI device?
If yes, then no: for legacy guests it is not required for scalability, because the legacy registers are a subset of the 1.x registers.

 
> > > And presumably it can all be done in firmware ...
> > > Is there actual hardware that can't implement transport vq but is
> > > going to implement the mmr spec?
> > >
> > Nvidia and Marvell DPUs implement MMR spec.
> 
> Hmm implement it in what sense exactly?
>
I do not follow the question.
The proposed series will be implemented as PCI SR-IOV devices using the MMR spec.
 
> > Transport VQ has very high latency and DMA overheads for 2 to 4 bytes
> read/write.
> 
> How many of these 2 byte accesses trigger from a typical guest?
> 
Mostly during VM boot time: 20 to 40 register read/write accesses.
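
For reference, these are roughly the registers involved: the legacy
virtio PCI common configuration as defined by the legacy interface
(the per-queue registers are touched once per virtqueue):

    /* Legacy virtio PCI common configuration layout. */
    enum {
        VIRTIO_PCI_HOST_FEATURES  = 0x00, /* 32-bit, read       */
        VIRTIO_PCI_GUEST_FEATURES = 0x04, /* 32-bit, write      */
        VIRTIO_PCI_QUEUE_PFN      = 0x08, /* 32-bit, read/write */
        VIRTIO_PCI_QUEUE_NUM      = 0x0c, /* 16-bit, read       */
        VIRTIO_PCI_QUEUE_SEL      = 0x0e, /* 16-bit, read/write */
        VIRTIO_PCI_QUEUE_NOTIFY   = 0x10, /* 16-bit, read/write */
        VIRTIO_PCI_STATUS         = 0x12, /*  8-bit, read/write */
        VIRTIO_PCI_ISR            = 0x13, /*  8-bit, read clears */
    };
    /* Device-specific config (e.g. the net MAC) follows at 0x14,
     * or at 0x18 when MSI-X is enabled. */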

> > And before discussing "why not that approach", lets finish reviewing "this
> approach" first.
> 
> That's a weird way to put it. We don't want so many ways to do legacy if we can
> help it.
Sure, so lets finish the review of current proposal details.
At the moment 
a. I don't see any visible gain of transport VQ other than device reset part I explained.
b. it can be a way with high latency, DMA overheads on the virtqueue for read/writes for small access.




^ permalink raw reply	[flat|nested] 200+ messages in thread

* [virtio-dev] Re: [PATCH 00/11] Introduce transitional mmr pci device
  2023-04-03 15:16           ` [virtio-dev] " Parav Pandit
@ 2023-04-03 15:23             ` Michael S. Tsirkin
  2023-04-03 15:34               ` Michael S. Tsirkin
  2023-04-03 15:36               ` [virtio-dev] RE: [virtio-comment] " Parav Pandit
  0 siblings, 2 replies; 200+ messages in thread
From: Michael S. Tsirkin @ 2023-04-03 15:23 UTC (permalink / raw)
  To: Parav Pandit; +Cc: virtio-dev, cohuck, virtio-comment, Shahaf Shuler

On Mon, Apr 03, 2023 at 03:16:53PM +0000, Parav Pandit wrote:
> 
> 
> > From: Michael S. Tsirkin <mst@redhat.com>
> > Sent: Monday, April 3, 2023 11:07 AM
> 
> > > > OTOH it is presumably required for scalability anyway, no?
> > > No.
> > > Most new generation SIOV and SR-IOV devices operate without any para-
> > virtualization.
> > 
> > Don't see the connection to PV. You need an emulation layer in the host if you
> > want to run legacy guests. Looks like it could do transport vq just as well.
> >
> Transport vq for legacy MMR purpose seems fine with its latency and DMA overheads.
> Your question was about "scalability".
> After your latest response, I am unclear what "scalability" means.
> Do you mean saving the register space in the PCI device?

yes that's how you used scalability in the past.

> If yes, than, no for legacy guests for scalability it is not required, because the legacy register is subset of 1.x.

Weird.  What does the guest being legacy have to do with a wish to save
registers on the host hardware? Do you not have as many legacy guests as
modern guests? Why?



>  
> > > > And presumably it can all be done in firmware ...
> > > > Is there actual hardware that can't implement transport vq but is
> > > > going to implement the mmr spec?
> > > >
> > > Nvidia and Marvell DPUs implement MMR spec.
> > 
> > Hmm implement it in what sense exactly?
> >
> Do not follow the question.
> The proposed series will be implemented as PCI SR-IOV devices using MMR spec.
>  
> > > Transport VQ has very high latency and DMA overheads for 2 to 4 bytes
> > read/write.
> > 
> > How many of these 2 byte accesses trigger from a typical guest?
> > 
> Mostly during the VM boot time. 20 to 40 registers read write access.

That is not a lot! How long does a DMA operation take then?

> > > And before discussing "why not that approach", lets finish reviewing "this
> > approach" first.
> > 
> > That's a weird way to put it. We don't want so many ways to do legacy if we can
> > help it.
> Sure, so lets finish the review of current proposal details.
> At the moment 
> a. I don't see any visible gain of transport VQ other than device reset part I explained.

For example, we do not need a new range of device IDs and existing
drivers can bind on the host.

> b. it can be a way with high latency, DMA overheads on the virtqueue for read/writes for small access.

numbers?

-- 
MST




^ permalink raw reply	[flat|nested] 200+ messages in thread

* [virtio-dev] Re: [PATCH 00/11] Introduce transitional mmr pci device
  2023-04-03 15:23             ` [virtio-dev] " Michael S. Tsirkin
@ 2023-04-03 15:34               ` Michael S. Tsirkin
  2023-04-03 15:47                 ` [virtio-dev] " Parav Pandit
  2023-04-03 15:36               ` [virtio-dev] RE: [virtio-comment] " Parav Pandit
  1 sibling, 1 reply; 200+ messages in thread
From: Michael S. Tsirkin @ 2023-04-03 15:34 UTC (permalink / raw)
  To: Parav Pandit; +Cc: virtio-dev, cohuck, virtio-comment, Shahaf Shuler

On Mon, Apr 03, 2023 at 11:23:11AM -0400, Michael S. Tsirkin wrote:
> On Mon, Apr 03, 2023 at 03:16:53PM +0000, Parav Pandit wrote:
> > 
> > 
> > > From: Michael S. Tsirkin <mst@redhat.com>
> > > Sent: Monday, April 3, 2023 11:07 AM
> > 
> > > > > OTOH it is presumably required for scalability anyway, no?
> > > > No.
> > > > Most new generation SIOV and SR-IOV devices operate without any para-
> > > virtualization.
> > > 
> > > Don't see the connection to PV. You need an emulation layer in the host if you
> > > want to run legacy guests. Looks like it could do transport vq just as well.
> > >
> > Transport vq for legacy MMR purpose seems fine with its latency and DMA overheads.
> > Your question was about "scalability".
> > After your latest response, I am unclear what "scalability" means.
> > Do you mean saving the register space in the PCI device?
> 
> yes that's how you used scalability in the past.
> 
> > If yes, than, no for legacy guests for scalability it is not required, because the legacy register is subset of 1.x.
> 
> Weird.  what does guest being legacy have to do with a wish to save
> registers on the host hardware? You don't have so many legacy guests as
> modern guests? Why?
> 
> 
> 
> >  
> > > > > And presumably it can all be done in firmware ...
> > > > > Is there actual hardware that can't implement transport vq but is
> > > > > going to implement the mmr spec?
> > > > >
> > > > Nvidia and Marvell DPUs implement MMR spec.
> > > 
> > > Hmm implement it in what sense exactly?
> > >
> > Do not follow the question.
> > The proposed series will be implemented as PCI SR-IOV devices using MMR spec.
> >  
> > > > Transport VQ has very high latency and DMA overheads for 2 to 4 bytes
> > > read/write.
> > > 
> > > How many of these 2 byte accesses trigger from a typical guest?
> > > 
> > Mostly during the VM boot time. 20 to 40 registers read write access.
> 
> That is not a lot! How long does a DMA operation take then?
> 
> > > > And before discussing "why not that approach", lets finish reviewing "this
> > > approach" first.
> > > 
> > > That's a weird way to put it. We don't want so many ways to do legacy if we can
> > > help it.
> > Sure, so lets finish the review of current proposal details.
> > At the moment 
> > a. I don't see any visible gain of transport VQ other than device reset part I explained.
> 
> For example, we do not need a new range of device IDs and existing
> drivers can bind on the host.

Another is that we can actually work around legacy bugs in the
hypervisor. For example, atomicity and alignment bugs do not exist under
DMA. Consider the MAC field, writeable in legacy.  The problem is that
this write is not atomic, so there is a window where the MAC is
corrupted.  If you do MMIO then you just have to copy this bug. If you
do DMA then the hypervisor can buffer all of the MAC and send it to the
device in one go.
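
A minimal sketch of that buffering idea, with invented names (a real
implementation would also need to handle partial rewrites):

    #include <stdint.h>

    #define MAC_LEN 6

    /* Hypothetical call that pushes a complete MAC to the device. */
    static void send_mac_to_device(const uint8_t *mac);

    struct mac_shadow {
        uint8_t mac[MAC_LEN];
        uint8_t dirty; /* bitmap of MAC bytes written so far */
    };

    /* Guest writes one byte of the legacy MAC config field. */
    static void mac_write(struct mac_shadow *s, unsigned off, uint8_t val)
    {
        s->mac[off] = val;
        s->dirty |= 1u << off;

        /* Forward only once all six bytes have arrived, so the
         * device never observes a torn, half-updated MAC. */
        if (s->dirty == 0x3f) {
            send_mac_to_device(s->mac);
            s->dirty = 0;
        }
    }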

> > b. it can be a way with high latency, DMA overheads on the virtqueue for read/writes for small access.
> 
> numbers?
> 
> -- 
> MST




^ permalink raw reply	[flat|nested] 200+ messages in thread

* [virtio-dev] RE: [virtio-comment] Re: [PATCH 00/11] Introduce transitional mmr pci device
  2023-04-03 15:23             ` [virtio-dev] " Michael S. Tsirkin
  2023-04-03 15:34               ` Michael S. Tsirkin
@ 2023-04-03 15:36               ` Parav Pandit
  2023-04-03 17:16                 ` [virtio-dev] " Michael S. Tsirkin
  1 sibling, 1 reply; 200+ messages in thread
From: Parav Pandit @ 2023-04-03 15:36 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: virtio-dev, cohuck, virtio-comment, Shahaf Shuler



> From: virtio-comment@lists.oasis-open.org <virtio-comment@lists.oasis-
> open.org> On Behalf Of Michael S. Tsirkin

> > Transport vq for legacy MMR purpose seems fine with its latency and DMA
> overheads.
> > Your question was about "scalability".
> > After your latest response, I am unclear what "scalability" means.
> > Do you mean saving the register space in the PCI device?
> 
> yes that's how you used scalability in the past.
>
Ok. I am aligned.
 
> > If yes, than, no for legacy guests for scalability it is not required, because the
> legacy register is subset of 1.x.
> 
> Weird.  what does guest being legacy have to do with a wish to save registers
> on the host hardware? 
Because legacy has a subset of the registers of 1.x, no additional registers are expected on the legacy side.

> You don't have so many legacy guests as modern
> guests? Why?
> 
This isn't true.

There is a trade-off: up to a certain N, MMR based register access is fine.
This is because 1.x exposes a super set of the legacy registers.
Beyond a certain point the device will have difficulty doing MMR for both legacy and 1.x.
At that point, legacy over tvq can scale better, but with far higher latency, an order of magnitude higher compared to MMR.
If tvq were the only transport for these register accesses, it would hurt at lower scale too, because a vq is by nature a non-register access mechanism.
And scale is relative from device to device.

> >
> > > > > And presumably it can all be done in firmware ...
> > > > > Is there actual hardware that can't implement transport vq but
> > > > > is going to implement the mmr spec?
> > > > >
> > > > Nvidia and Marvell DPUs implement MMR spec.
> > >
> > > Hmm implement it in what sense exactly?
> > >
> > Do not follow the question.
> > The proposed series will be implemented as PCI SR-IOV devices using MMR
> spec.
> >
> > > > Transport VQ has very high latency and DMA overheads for 2 to 4
> > > > bytes
> > > read/write.
> > >
> > > How many of these 2 byte accesses trigger from a typical guest?
> > >
> > Mostly during the VM boot time. 20 to 40 registers read write access.
> 
> That is not a lot! How long does a DMA operation take then?
> 
> > > > And before discussing "why not that approach", lets finish
> > > > reviewing "this
> > > approach" first.
> > >
> > > That's a weird way to put it. We don't want so many ways to do
> > > legacy if we can help it.
> > Sure, so lets finish the review of current proposal details.
> > At the moment
> > a. I don't see any visible gain of transport VQ other than device reset part I
> explained.
> 
> For example, we do not need a new range of device IDs and existing drivers can
> bind on the host.
>
So, unlikely, due to the already discussed limitation of feature negotiation.
An existing transitional driver would also look for an IOBAR, which is the second limitation.

> > b. it can be a way with high latency, DMA overheads on the virtqueue for
> read/writes for small access.
> 
> numbers?
It depends on the implementation, but at minimum, writes and reads can pay an order of magnitude more, in the 10 msec range.



^ permalink raw reply	[flat|nested] 200+ messages in thread

* [virtio-dev] RE: [PATCH 00/11] Introduce transitional mmr pci device
  2023-04-03 15:34               ` Michael S. Tsirkin
@ 2023-04-03 15:47                 ` Parav Pandit
  2023-04-03 17:28                   ` [virtio-dev] " Michael S. Tsirkin
  0 siblings, 1 reply; 200+ messages in thread
From: Parav Pandit @ 2023-04-03 15:47 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: virtio-dev, cohuck, virtio-comment, Shahaf Shuler



> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Monday, April 3, 2023 11:34 AM

> Another is that we can actually work around legacy bugs in the hypervisor. For
> example, atomicity and alignment bugs do not exist under DMA. Consider MAC
> field, writeable in legacy.  Problem this write is not atomic, so there is a window
> where MAC is corrupted.  If you do MMIO then you just have to copy this bug.
> If you do DMA then hypervisor can buffer all of MAC and send to device in one
> go.
I am familiar with this bug.
The user feedback we have received so far involves kernels whose drivers use the CVQ for setting the mac address on a legacy device.
So, it may help, but it is not super important.

Also, if I recollect correctly, the mac address is configured a bit early in the if-scripts sequence, before bringing up the interface.
So, we haven't seen a real issue around it.



^ permalink raw reply	[flat|nested] 200+ messages in thread

* [virtio-dev] Re: [virtio-comment] Re: [PATCH 00/11] Introduce transitional mmr pci device
  2023-04-03 15:36               ` [virtio-dev] RE: [virtio-comment] " Parav Pandit
@ 2023-04-03 17:16                 ` Michael S. Tsirkin
  2023-04-03 17:29                   ` Parav Pandit
  0 siblings, 1 reply; 200+ messages in thread
From: Michael S. Tsirkin @ 2023-04-03 17:16 UTC (permalink / raw)
  To: Parav Pandit; +Cc: virtio-dev, cohuck, virtio-comment, Shahaf Shuler

On Mon, Apr 03, 2023 at 03:36:25PM +0000, Parav Pandit wrote:
> 
> 
> > From: virtio-comment@lists.oasis-open.org <virtio-comment@lists.oasis-
> > open.org> On Behalf Of Michael S. Tsirkin
> 
> > > Transport vq for legacy MMR purpose seems fine with its latency and DMA
> > overheads.
> > > Your question was about "scalability".
> > > After your latest response, I am unclear what "scalability" means.
> > > Do you mean saving the register space in the PCI device?
> > 
> > yes that's how you used scalability in the past.
> >
> Ok. I am aligned.
>  
> > > If yes, than, no for legacy guests for scalability it is not required, because the
> > legacy register is subset of 1.x.
> > 
> > Weird.  what does guest being legacy have to do with a wish to save registers
> > on the host hardware? 
> Because legacy has subset of the registers of 1.x. So no new registers additional expected on legacy side.
> 
> > You don't have so many legacy guests as modern
> > guests? Why?
> > 
> This isn't true.
> 
> There is a trade-off, upto certain N, MMR based register access is fine.
> This is because 1.x is exposing super set of registers of legacy.
> Beyond a certain point device will have difficulty in doing MMR for legacy and 1.x.
> At that point, legacy over tvq can be better scale but with lot higher latency order of magnitude higher compare to MMR.
> If tvq being the only transport for these registers access, it would hurt at lower scale too, due the primary nature of non_register access.
> And scale is relative from device to device.

Wow! Why an order of magnitude?

> > >
> > > > > > And presumably it can all be done in firmware ...
> > > > > > Is there actual hardware that can't implement transport vq but
> > > > > > is going to implement the mmr spec?
> > > > > >
> > > > > Nvidia and Marvell DPUs implement MMR spec.
> > > >
> > > > Hmm implement it in what sense exactly?
> > > >
> > > Do not follow the question.
> > > The proposed series will be implemented as PCI SR-IOV devices using MMR
> > spec.
> > >
> > > > > Transport VQ has very high latency and DMA overheads for 2 to 4
> > > > > bytes
> > > > read/write.
> > > >
> > > > How many of these 2 byte accesses trigger from a typical guest?
> > > >
> > > Mostly during the VM boot time. 20 to 40 registers read write access.
> > 
> > That is not a lot! How long does a DMA operation take then?
> > 
> > > > > And before discussing "why not that approach", lets finish
> > > > > reviewing "this
> > > > approach" first.
> > > >
> > > > That's a weird way to put it. We don't want so many ways to do
> > > > legacy if we can help it.
> > > Sure, so lets finish the review of current proposal details.
> > > At the moment
> > > a. I don't see any visible gain of transport VQ other than device reset part I
> > explained.
> > 
> > For example, we do not need a new range of device IDs and existing drivers can
> > bind on the host.
> >
> So, unlikely due to already discussed limitation of feature negotiation.
> Existing transitional driver would also look for an IOBAR being second limitation.

Some confusion here.
If you have a transitional driver you do not need a legacy device.



> > > b. it can be a way with high latency, DMA overheads on the virtqueue for
> > read/writes for small access.
> > 
> > numbers?
> It depends on the implementation, but at minimum, writes and reads can pay order of magnitude higher in 10 msec range.

A single VQ roundtrip takes a minimum of 10 milliseconds? This is indeed
completely unworkable for transport vq. Points:
- even for memory mapped, does an access take 1 millisecond?
  Extremely slow. Why?
- Why is DMA 10x more expensive? I expect it to be 2x more expensive:
  Normal read goes cpu -> device -> cpu, DMA does cpu -> device -> memory -> device -> cpu

Reason I am asking is because it is important for transport vq to have
a workable design.


But let me guess. Is there a chance that you are talking about an
interrupt driven design? *That* is going to be slow though I don't think
10msec, more like 10usec. But I expect transport vq to typically
work by (adaptive?) polling mostly avoiding interrupts.
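
For what it's worth, a minimal sketch of the spin-then-yield shape of
such polling (the thresholds are invented):

    #include <stdbool.h>
    #include <sched.h>

    /* Poll a completion flag: spin briefly to catch fast completions,
     * then yield, and only fall back to an interrupt as a last resort. */
    static bool poll_completion(volatile bool *done)
    {
        for (int i = 0; i < 10000; i++)      /* fast path: pure spin  */
            if (*done)
                return true;
        for (int i = 0; i < 1000; i++) {     /* slow path: yield CPU  */
            if (*done)
                return true;
            sched_yield();
        }
        return false;                        /* caller arms interrupt */
    }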

-- 
MST




^ permalink raw reply	[flat|nested] 200+ messages in thread

* [virtio-dev] Re: [PATCH 00/11] Introduce transitional mmr pci device
  2023-04-03 15:47                 ` [virtio-dev] " Parav Pandit
@ 2023-04-03 17:28                   ` Michael S. Tsirkin
  2023-04-03 17:35                     ` Parav Pandit
  0 siblings, 1 reply; 200+ messages in thread
From: Michael S. Tsirkin @ 2023-04-03 17:28 UTC (permalink / raw)
  To: Parav Pandit; +Cc: virtio-dev, cohuck, virtio-comment, Shahaf Shuler

On Mon, Apr 03, 2023 at 03:47:56PM +0000, Parav Pandit wrote:
> 
> 
> > From: Michael S. Tsirkin <mst@redhat.com>
> > Sent: Monday, April 3, 2023 11:34 AM
> 
> > Another is that we can actually work around legacy bugs in the hypervisor. For
> > example, atomicity and alignment bugs do not exist under DMA. Consider MAC
> > field, writeable in legacy.  Problem this write is not atomic, so there is a window
> > where MAC is corrupted.  If you do MMIO then you just have to copy this bug.
> > If you do DMA then hypervisor can buffer all of MAC and send to device in one
> > go.
> I am familiar with this bug.
> Users feedback that we received so far has kernels with driver support that uses CVQ for setting the mac address on legacy device.
> So, it may help but not super important.
> 
> Also, if I recollect correctly, the mac address is configuring bit early in if-scripts sequence before bringing up the interface.
> So, haven't seen real issue around it.

It's an example; there are other bugs in legacy interfaces.

Take the inability to decline feature negotiation as an example.
With a transport vq we can fail at the transport level and the
hypervisor can decide what to do, such as stopping the guest or
unplugging the device, etc.

So something like a vq would be a step up. I would like to
understand the performance angle though. What you describe
is pretty bad.

-- 
MST




^ permalink raw reply	[flat|nested] 200+ messages in thread

* [virtio-dev] Re: [virtio-comment] Re: [PATCH 00/11] Introduce transitional mmr pci device
  2023-04-03 17:16                 ` [virtio-dev] " Michael S. Tsirkin
@ 2023-04-03 17:29                   ` Parav Pandit
  2023-04-03 18:02                     ` Michael S. Tsirkin
  0 siblings, 1 reply; 200+ messages in thread
From: Parav Pandit @ 2023-04-03 17:29 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: virtio-dev, cohuck, virtio-comment, Shahaf Shuler



On 4/3/2023 1:16 PM, Michael S. Tsirkin wrote:
> On Mon, Apr 03, 2023 at 03:36:25PM +0000, Parav Pandit wrote:
>>
>>
>>> From: virtio-comment@lists.oasis-open.org <virtio-comment@lists.oasis-
>>> open.org> On Behalf Of Michael S. Tsirkin
>>
>>>> Transport vq for legacy MMR purpose seems fine with its latency and DMA
>>> overheads.
>>>> Your question was about "scalability".
>>>> After your latest response, I am unclear what "scalability" means.
>>>> Do you mean saving the register space in the PCI device?
>>>
>>> yes that's how you used scalability in the past.
>>>
>> Ok. I am aligned.
>>   
>>>> If yes, than, no for legacy guests for scalability it is not required, because the
>>> legacy register is subset of 1.x.
>>>
>>> Weird.  what does guest being legacy have to do with a wish to save registers
>>> on the host hardware?
>> Because legacy has subset of the registers of 1.x. So no new registers additional expected on legacy side.
>>
>>> You don't have so many legacy guests as modern
>>> guests? Why?
>>>
>> This isn't true.
>>
>> There is a trade-off, upto certain N, MMR based register access is fine.
>> This is because 1.x is exposing super set of registers of legacy.
>> Beyond a certain point device will have difficulty in doing MMR for legacy and 1.x.
>> At that point, legacy over tvq can be better scale but with lot higher latency order of magnitude higher compare to MMR.
>> If tvq being the only transport for these registers access, it would hurt at lower scale too, due the primary nature of non_register access.
>> And scale is relative from device to device.
> 
> Wow! Why an order of magnitide?
> 
Because vqs involve DMA operations.
It is left to the device implementation, but the generic wisdom is not 
to implement such slow work in the data path engines.
So such register access vqs can/may be handled through firmware.
Hence they can involve a lot higher latency.
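
For rough scale, using the numbers quoted in this thread: 20 to 40 
boot-time accesses at ~10 msec each add 0.2 to 0.4 seconds to every VM 
boot, whereas the same accesses against memory mapped registers, at, 
say, a few microseconds each, would cost well under a millisecond in 
total.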

>>>>
>>>>>>> And presumably it can all be done in firmware ...
>>>>>>> Is there actual hardware that can't implement transport vq but
>>>>>>> is going to implement the mmr spec?
>>>>>>>
>>>>>> Nvidia and Marvell DPUs implement MMR spec.
>>>>>
>>>>> Hmm implement it in what sense exactly?
>>>>>
>>>> Do not follow the question.
>>>> The proposed series will be implemented as PCI SR-IOV devices using MMR
>>> spec.
>>>>
>>>>>> Transport VQ has very high latency and DMA overheads for 2 to 4
>>>>>> bytes
>>>>> read/write.
>>>>>
>>>>> How many of these 2 byte accesses trigger from a typical guest?
>>>>>
>>>> Mostly during the VM boot time. 20 to 40 registers read write access.
>>>
>>> That is not a lot! How long does a DMA operation take then?
>>>
>>>>>> And before discussing "why not that approach", lets finish
>>>>>> reviewing "this
>>>>> approach" first.
>>>>>
>>>>> That's a weird way to put it. We don't want so many ways to do
>>>>> legacy if we can help it.
>>>> Sure, so lets finish the review of current proposal details.
>>>> At the moment
>>>> a. I don't see any visible gain of transport VQ other than device reset part I
>>> explained.
>>>
>>> For example, we do not need a new range of device IDs and existing drivers can
>>> bind on the host.
>>>
>> So, unlikely due to already discussed limitation of feature negotiation.
>> Existing transitional driver would also look for an IOBAR being second limitation.
> 
> Some confusion here.
Yes.
> If you have a transitional driver you do not need a legacy device.
> 
If I understood your thoughts, split across the two emails,

your point was "we don't need a new range of device IDs for a 
transitional TVQ device, because TVQ is new and optional".

But this transitional TVQ device does not expose the IOBAR expected by 
the existing transitional driver, failing the driver load.

Your idea is not very clear.

> 
> 
>>>> b. it can be a way with high latency, DMA overheads on the virtqueue for
>>> read/writes for small access.
>>>
>>> numbers?
>> It depends on the implementation, but at minimum, writes and reads can pay order of magnitude higher in 10 msec range.
> 
> A single VQ roundtrip takes a minimum of 10 milliseconds? This is indeed
> completely unworkable for transport vq. Points:
> - even for memory mapped you have an access take 1 millisecond?
>    Extremely slow. Why?
> - Why is DMA 10x more expensive? I expect it to be 2x more expensive:
>    Normal read goes cpu -> device -> cpu, DMA does cpu -> device -> memory -> device -> cpu
> 
> Reason I am asking is because it is important for transport vq to have
> a workable design.
> 
> 
> But let me guess. Is there a chance that you are talking about an
> interrupt driven design? *That* is going to be slow though I don't think
> 10msec, more like 10usec. But I expect transport vq to typically
> work by (adaptive?) polling mostly avoiding interrupts.
> 
No. Interrupt latency is in the usec range.
The major latency contributors, in the msec range, can arise from the device side.



^ permalink raw reply	[flat|nested] 200+ messages in thread

* [virtio-dev] Re: [PATCH 00/11] Introduce transitional mmr pci device
  2023-04-03 17:28                   ` [virtio-dev] " Michael S. Tsirkin
@ 2023-04-03 17:35                     ` Parav Pandit
  2023-04-03 17:39                       ` Michael S. Tsirkin
  0 siblings, 1 reply; 200+ messages in thread
From: Parav Pandit @ 2023-04-03 17:35 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: virtio-dev, cohuck, virtio-comment, Shahaf Shuler



On 4/3/2023 1:28 PM, Michael S. Tsirkin wrote:
> On Mon, Apr 03, 2023 at 03:47:56PM +0000, Parav Pandit wrote:
>>
>>
>>> From: Michael S. Tsirkin <mst@redhat.com>
>>> Sent: Monday, April 3, 2023 11:34 AM
>>
>>> Another is that we can actually work around legacy bugs in the hypervisor. For
>>> example, atomicity and alignment bugs do not exist under DMA. Consider MAC
>>> field, writeable in legacy.  Problem this write is not atomic, so there is a window
>>> where MAC is corrupted.  If you do MMIO then you just have to copy this bug.
>>> If you do DMA then hypervisor can buffer all of MAC and send to device in one
>>> go.
>> I am familiar with this bug.
>> Users feedback that we received so far has kernels with driver support that uses CVQ for setting the mac address on legacy device.
>> So, it may help but not super important.
>>
>> Also, if I recollect correctly, the mac address is configuring bit early in if-scripts sequence before bringing up the interface.
>> So, haven't seen real issue around it.
> 
> It's an example, there are other bugs in legacy interfaces.
> 
The intent is to provide backward compatibility for the legacy 
interface, not to fix the legacy interface itself, as doing so may break 
legacy behavior.

> Take inability to decline feature negotiation as an example.
A legacy driver would do this anyway. It would expect a certain flow to 
work, one that had worked for it over the previous sw hypervisor.

A hypervisor attempting to fail what was working before will not help.

> With transport vq we can fail at transport level and
> hypervisor can decide what to do, such as stopping guest or
> unplugging device, etc.
> 

> So something like a vq would be a step up. I would like to
> understand the performance angle though. What you describe
> is pretty bad.
> 
Do you mean the latency is bad, or the description?



^ permalink raw reply	[flat|nested] 200+ messages in thread

* [virtio-dev] Re: [PATCH 00/11] Introduce transitional mmr pci device
  2023-04-03 17:35                     ` Parav Pandit
@ 2023-04-03 17:39                       ` Michael S. Tsirkin
  0 siblings, 0 replies; 200+ messages in thread
From: Michael S. Tsirkin @ 2023-04-03 17:39 UTC (permalink / raw)
  To: Parav Pandit; +Cc: virtio-dev, cohuck, virtio-comment, Shahaf Shuler

On Mon, Apr 03, 2023 at 01:35:01PM -0400, Parav Pandit wrote:
> > So something like a vq would be a step up. I would like to
> > understand the performance angle though. What you describe
> > is pretty bad.
> > 
> Do you mean latency is bad or the description?

I don't know. We need admin vq and transport vq to work.
You describe latency numbers that make both unworkable.
I am interested in fixing that somehow, since it's a blocker.

-- 
MST




^ permalink raw reply	[flat|nested] 200+ messages in thread

* [virtio-dev] Re: [virtio-comment] [PATCH 00/11] Introduce transitional mmr pci device
  2023-04-03 14:53   ` Parav Pandit
@ 2023-04-03 17:48     ` Michael S. Tsirkin
  2023-04-03 19:11       ` Stefan Hajnoczi
  2023-04-03 19:48       ` [virtio-dev] " Parav Pandit
  2023-04-03 19:10     ` Stefan Hajnoczi
  1 sibling, 2 replies; 200+ messages in thread
From: Michael S. Tsirkin @ 2023-04-03 17:48 UTC (permalink / raw)
  To: Parav Pandit; +Cc: Stefan Hajnoczi, virtio-dev, cohuck, virtio-comment, shahafs

On Mon, Apr 03, 2023 at 10:53:29AM -0400, Parav Pandit wrote:
> 
> 
> On 4/3/2023 10:45 AM, Stefan Hajnoczi wrote:
> > On Fri, Mar 31, 2023 at 01:58:23AM +0300, Parav Pandit wrote:
> > > Overview:
> > > ---------
> > > The Transitional MMR device is a variant of the transitional PCI device.
> > 
> > What does "MMR" mean?
> > 
> memory mapped registers.
> Explained below in the design section and also in relevant patches 6 to 11.
> 
> > > It has its own small Device ID range. It does not have I/O
> > > region BAR; instead it exposes legacy configuration and device
> > > specific registers at an offset in the memory region BAR.
> > > 
> > > Such transitional MMR devices will be used at the scale of
> > > thousands of devices using PCI SR-IOV and/or future scalable
> > > virtualization technology to provide backward
> > > compatibility (for legacy devices) and also future
> > > compatibility with new features.
> > > 
> > > Usecase:
> > > --------
> > > 1. A hypervisor/system needs to provide transitional
> > >     virtio devices to the guest VM at scale of thousands,
> > >     typically, one to eight devices per VM.
> > > 
> > > 2. A hypervisor/system needs to provide such devices using a
> > >     vendor agnostic driver in the hypervisor system.
> > 
> > Is the idea that the hypervisor configures the new Transitional MMR
> > devices and makes them appear like virtio-pci Transitional devices?
> > 
> Yes, but hypervisor is not involved in any configuration parsing or anything
> of that nature.
> It is only a passthrough fowarder from emulated IOBAR to memory mapped
> legacy registers.
> In other words, hypervisor do not care for the registers content at all.

This part I do not see as important. Legacy is frozen in time. Implement
it once and you are done. Datapath differences are more important.

> > In other words, the guest doesn't know about Transitional MMR and does
> > not need any code changes.
> > 
> > > 3. A hypervisor system prefers to have single stack regardless of
> > >     virtio device type (net/blk) and be future compatible with a
> > >     single vfio stack using SR-IOV or other scalable device
> > >     virtualization technology to map PCI devices to the guest VM.
> > >     (as transitional or otherwise)
> > 
> > What does this paragraph mean?
> > 
> It means regardless of a VF being transitional MMR VF or 1.x VF without any
> MMR extensions, there is single vfio virtio driver handling both type of
> devices to map to the guest VM.

I don't think this can be vfio. You need a host layer translating
things such as device ID etc.

> > 
> > Modern devices were added to Linux in 2014 and support SR-IOV.
> 
> > Why is it
> > important to support Transitional (which really means Legacy devices,
> > otherwise Modern devices would be sufficient)?
> > 
> To support guest VMs which only understand legacy devices and unfortunately
> they are still in much wider use by the users.

OK but supporting them with a passthrough driver such as vfio does
not seem that important.

-- 
MST




^ permalink raw reply	[flat|nested] 200+ messages in thread

* [virtio-dev] Re: [virtio-comment] Re: [PATCH 00/11] Introduce transitional mmr pci device
  2023-04-03 17:29                   ` Parav Pandit
@ 2023-04-03 18:02                     ` Michael S. Tsirkin
  2023-04-03 20:25                       ` [virtio-dev] " Parav Pandit
  0 siblings, 1 reply; 200+ messages in thread
From: Michael S. Tsirkin @ 2023-04-03 18:02 UTC (permalink / raw)
  To: Parav Pandit; +Cc: virtio-dev, cohuck, virtio-comment, Shahaf Shuler

On Mon, Apr 03, 2023 at 01:29:32PM -0400, Parav Pandit wrote:
> 
> 
> On 4/3/2023 1:16 PM, Michael S. Tsirkin wrote:
> > On Mon, Apr 03, 2023 at 03:36:25PM +0000, Parav Pandit wrote:
> > > 
> > > 
> > > > From: virtio-comment@lists.oasis-open.org <virtio-comment@lists.oasis-
> > > > open.org> On Behalf Of Michael S. Tsirkin
> > > 
> > > > > Transport vq for legacy MMR purpose seems fine with its latency and DMA
> > > > overheads.
> > > > > Your question was about "scalability".
> > > > > After your latest response, I am unclear what "scalability" means.
> > > > > Do you mean saving the register space in the PCI device?
> > > > 
> > > > yes that's how you used scalability in the past.
> > > > 
> > > Ok. I am aligned.
> > > > > If yes, than, no for legacy guests for scalability it is not required, because the
> > > > legacy register is subset of 1.x.
> > > > 
> > > > Weird.  what does guest being legacy have to do with a wish to save registers
> > > > on the host hardware?
> > > Because legacy has subset of the registers of 1.x. So no new registers additional expected on legacy side.
> > > 
> > > > You don't have so many legacy guests as modern
> > > > guests? Why?
> > > > 
> > > This isn't true.
> > > 
> > > There is a trade-off, upto certain N, MMR based register access is fine.
> > > This is because 1.x is exposing super set of registers of legacy.
> > > Beyond a certain point device will have difficulty in doing MMR for legacy and 1.x.
> > > At that point, legacy over tvq can be better scale but with lot higher latency order of magnitude higher compare to MMR.
> > > If tvq being the only transport for these registers access, it would hurt at lower scale too, due the primary nature of non_register access.
> > > And scale is relative from device to device.
> > 
> > Wow! Why an order of magnitide?
> > 
> Because vqs involve DMA operations.
> It is left to the device implementation to do it, but a generic wisdom is
> not implement such slow work in the data path engines.
> So such register access vqs can/may be through firmware.
> Hence it can involve a lot higher latency.

Then that wisdom is wrong? tens of microseconds is not workable even for
ethtool operations, you are killing boot time.

I frankly don't know, if device vendors are going to interpret
"DMA" as "can take insane time" then maybe we need to scrap
the whole admin vq idea and make it all memory mapped like
Jason wanted, so as not to lead them into temptation?


> > > > > 
> > > > > > > > And presumably it can all be done in firmware ...
> > > > > > > > Is there actual hardware that can't implement transport vq but
> > > > > > > > is going to implement the mmr spec?
> > > > > > > > 
> > > > > > > Nvidia and Marvell DPUs implement MMR spec.
> > > > > > 
> > > > > > Hmm implement it in what sense exactly?
> > > > > > 
> > > > > Do not follow the question.
> > > > > The proposed series will be implemented as PCI SR-IOV devices using MMR
> > > > spec.
> > > > > 
> > > > > > > Transport VQ has very high latency and DMA overheads for 2 to 4
> > > > > > > bytes
> > > > > > read/write.
> > > > > > 
> > > > > > How many of these 2 byte accesses trigger from a typical guest?
> > > > > > 
> > > > > Mostly during the VM boot time. 20 to 40 registers read write access.
> > > > 
> > > > That is not a lot! How long does a DMA operation take then?
> > > > 
> > > > > > > And before discussing "why not that approach", lets finish
> > > > > > > reviewing "this
> > > > > > approach" first.
> > > > > > 
> > > > > > That's a weird way to put it. We don't want so many ways to do
> > > > > > legacy if we can help it.
> > > > > Sure, so lets finish the review of current proposal details.
> > > > > At the moment
> > > > > a. I don't see any visible gain of transport VQ other than device reset part I
> > > > explained.
> > > > 
> > > > For example, we do not need a new range of device IDs and existing drivers can
> > > > bind on the host.
> > > > 
> > > So, unlikely due to already discussed limitation of feature negotiation.
> > > Existing transitional driver would also look for an IOBAR being second limitation.
> > 
> > Some confusion here.
> Yes.
> > If you have a transitional driver you do not need a legacy device.
> > 
> IF I understood your thoughts split in two emails,
> 
> Your point was "we dont need new range of device IDs for transitional TVQ
> device because TVQ is new and its optional".
> 
> But this transitional TVQ device do not expose IOBAR expected by the
> existing transitional device, failing the driver load.
> 
> Your idea is not very clear.

Let me try again.

Modern host binds to modern interface. It can use the PF normally.
Legacy guest IOBAR accesses to VF are translated to transport vq
accesses.






> > 
> > 
> > > > > b. it can be a way with high latency, DMA overheads on the virtqueue for
> > > > read/writes for small access.
> > > > 
> > > > numbers?
> > > It depends on the implementation, but at minimum, writes and reads can pay order of magnitude higher in 10 msec range.
> > 
> > A single VQ roundtrip takes a minimum of 10 milliseconds? This is indeed
> > completely unworkable for transport vq. Points:
> > - even for memory mapped you have an access take 1 millisecond?
> >    Extremely slow. Why?
> > - Why is DMA 10x more expensive? I expect it to be 2x more expensive:
> >    Normal read goes cpu -> device -> cpu, DMA does cpu -> device -> memory -> device -> cpu
> > 
> > Reason I am asking is because it is important for transport vq to have
> > a workable design.
> > 
> > 
> > But let me guess. Is there a chance that you are talking about an
> > interrupt driven design? *That* is going to be slow though I don't think
> > 10msec, more like 10usec. But I expect transport vq to typically
> > work by (adaptive?) polling mostly avoiding interrupts.
> > 
> No. Interrupt latency is in usec range.
> The major latency contributors in msec range can arise from the device side.

So you are saying there are devices out there already with this MMR hack
baked in, and in hardware not firmware, so it works reasonably?

-- 
MST




^ permalink raw reply	[flat|nested] 200+ messages in thread

* [virtio-dev] Re: [virtio-comment] [PATCH 00/11] Introduce transitional mmr pci device
  2023-04-03 14:53   ` Parav Pandit
  2023-04-03 17:48     ` Michael S. Tsirkin
@ 2023-04-03 19:10     ` Stefan Hajnoczi
  2023-04-03 20:27       ` [virtio-dev] " Parav Pandit
  1 sibling, 1 reply; 200+ messages in thread
From: Stefan Hajnoczi @ 2023-04-03 19:10 UTC (permalink / raw)
  To: Parav Pandit; +Cc: mst, virtio-dev, cohuck, virtio-comment, shahafs


On Mon, Apr 03, 2023 at 10:53:29AM -0400, Parav Pandit wrote:
> 
> 
> On 4/3/2023 10:45 AM, Stefan Hajnoczi wrote:
> > On Fri, Mar 31, 2023 at 01:58:23AM +0300, Parav Pandit wrote:
> > > Overview:
> > > ---------
> > > The Transitional MMR device is a variant of the transitional PCI device.
> > 
> > What does "MMR" mean?
> > 
> memory mapped registers.
> Explained below in the design section and also in relevant patches 6 to 11.

Maybe call it "Memory-mapped Transitional"? That name would be easier to
understand.

> > Modern devices were added to Linux in 2014 and support SR-IOV.
> 
> > Why is it
> > important to support Transitional (which really means Legacy devices,
> > otherwise Modern devices would be sufficient)?
> > 
> To support guest VMs which only understand legacy devices and unfortunately
> they are still in much wider use by the users.

I wonder which guest software without Modern VIRTIO support will still
be supported by the time Transitional MMR software and hardware become
available. Are you aiming for particular guest software versions?

Stefan


^ permalink raw reply	[flat|nested] 200+ messages in thread

* [virtio-dev] Re: [virtio-comment] [PATCH 00/11] Introduce transitional mmr pci device
  2023-04-03 17:48     ` Michael S. Tsirkin
@ 2023-04-03 19:11       ` Stefan Hajnoczi
  2023-04-03 20:03         ` Michael S. Tsirkin
  2023-04-03 19:48       ` [virtio-dev] " Parav Pandit
  1 sibling, 1 reply; 200+ messages in thread
From: Stefan Hajnoczi @ 2023-04-03 19:11 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Parav Pandit, virtio-dev, cohuck, virtio-comment, shahafs


On Mon, Apr 03, 2023 at 01:48:46PM -0400, Michael S. Tsirkin wrote:
> On Mon, Apr 03, 2023 at 10:53:29AM -0400, Parav Pandit wrote:
> > On 4/3/2023 10:45 AM, Stefan Hajnoczi wrote:
> > > On Fri, Mar 31, 2023 at 01:58:23AM +0300, Parav Pandit wrote:
> > > > 3. A hypervisor system prefers to have single stack regardless of
> > > >     virtio device type (net/blk) and be future compatible with a
> > > >     single vfio stack using SR-IOV or other scalable device
> > > >     virtualization technology to map PCI devices to the guest VM.
> > > >     (as transitional or otherwise)
> > > 
> > > What does this paragraph mean?
> > > 
> > It means regardless of a VF being transitional MMR VF or 1.x VF without any
> > MMR extensions, there is single vfio virtio driver handling both type of
> > devices to map to the guest VM.
> 
> I don't think this can be vfio. You need a host layer translating
> things such as device ID etc.

An mdev driver?


^ permalink raw reply	[flat|nested] 200+ messages in thread

* [virtio-dev] RE: [virtio-comment] [PATCH 00/11] Introduce transitional mmr pci device
  2023-04-03 17:48     ` Michael S. Tsirkin
  2023-04-03 19:11       ` Stefan Hajnoczi
@ 2023-04-03 19:48       ` Parav Pandit
  2023-04-03 20:02         ` [virtio-dev] " Michael S. Tsirkin
  1 sibling, 1 reply; 200+ messages in thread
From: Parav Pandit @ 2023-04-03 19:48 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Stefan Hajnoczi, virtio-dev, cohuck, virtio-comment, Shahaf Shuler



> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Monday, April 3, 2023 1:49 PM

> > Yes, but hypervisor is not involved in any configuration parsing or
> > anything of that nature.
> > It is only a passthrough fowarder from emulated IOBAR to memory mapped
> > legacy registers.
> > In other words, hypervisor do not care for the registers content at all.
> 
> This part I do not see as important. legacy is frozen in time. Implement it once
> and you are done. Datapath differences are more important.
>
 
> > > In other words, the guest doesn't know about Transitional MMR and
> > > does not need any code changes.
> > >
> > > > 3. A hypervisor system prefers to have single stack regardless of
> > > >     virtio device type (net/blk) and be future compatible with a
> > > >     single vfio stack using SR-IOV or other scalable device
> > > >     virtualization technology to map PCI devices to the guest VM.
> > > >     (as transitional or otherwise)
> > >
> > > What does this paragraph mean?
> > >
> > It means regardless of a VF being transitional MMR VF or 1.x VF
> > without any MMR extensions, there is single vfio virtio driver
> > handling both type of devices to map to the guest VM.
> 
> I don't think this can be vfio. You need a host layer translating things such as
> device ID etc.
>
The vfio layer does it.

 
> > >
> > > Modern devices were added to Linux in 2014 and support SR-IOV.
> >
> > > Why is it
> > > important to support Transitional (which really means Legacy
> > > devices, otherwise Modern devices would be sufficient)?
> > >
> > To support guest VMs which only understand legacy devices and
> > unfortunately they are still in much wider use by the users.
> 
> OK but supporting them with a passthrough driver such as vfio does not seem
> that important.
I am not sure on what basis you assert that.
I clarified in the cover letter that these are the user level requirements: to support both transitional and non-transitional devices via a single vfio subsystem.




^ permalink raw reply	[flat|nested] 200+ messages in thread

* [virtio-dev] Re: [virtio-comment] [PATCH 00/11] Introduce transitional mmr pci device
  2023-04-03 19:48       ` [virtio-dev] " Parav Pandit
@ 2023-04-03 20:02         ` Michael S. Tsirkin
  2023-04-03 20:42           ` [virtio-dev] " Parav Pandit
  0 siblings, 1 reply; 200+ messages in thread
From: Michael S. Tsirkin @ 2023-04-03 20:02 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Stefan Hajnoczi, virtio-dev, cohuck, virtio-comment, Shahaf Shuler

On Mon, Apr 03, 2023 at 07:48:56PM +0000, Parav Pandit wrote:
> > OK but supporting them with a passthrough driver such as vfio does not seem
> > that important.
> Not sure on what basis you assert it.
> I clarified in the cover letter that these are the user level requirements to support transitional and non-transitional devices both via single vfio subsystem.

And what is so wrong with vdpa?  Really, I don't see how the virtio spec
needs to accommodate a specific partitioning between linux modules, be it
vdpa or vfio. That is way beyond the scope of the spec.

But anyway, my main point is about DMA. On the one hand you are asking
for a VQ based management interface because it saves money. On the other
you are saying DMA operations take extremely long, to the point where
they are unusable in the boot sequence.
So what is it? Was the admin vq a mistake, and should we do memory
mapped instead? I know Jason really wanted that, and it makes a bunch of
things easier. Or is legacy emulation doable over a vq, with latency not
a concern, and the real reason is that it makes you push out a host
driver with a bit less effort?

I just do not see how these claims do not contradict each other.

-- 
MST




^ permalink raw reply	[flat|nested] 200+ messages in thread

* [virtio-dev] Re: [virtio-comment] [PATCH 00/11] Introduce transitional mmr pci device
  2023-04-03 19:11       ` Stefan Hajnoczi
@ 2023-04-03 20:03         ` Michael S. Tsirkin
  0 siblings, 0 replies; 200+ messages in thread
From: Michael S. Tsirkin @ 2023-04-03 20:03 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: Parav Pandit, virtio-dev, cohuck, virtio-comment, shahafs

On Mon, Apr 03, 2023 at 03:11:52PM -0400, Stefan Hajnoczi wrote:
> On Mon, Apr 03, 2023 at 01:48:46PM -0400, Michael S. Tsirkin wrote:
> > On Mon, Apr 03, 2023 at 10:53:29AM -0400, Parav Pandit wrote:
> > > On 4/3/2023 10:45 AM, Stefan Hajnoczi wrote:
> > > > On Fri, Mar 31, 2023 at 01:58:23AM +0300, Parav Pandit wrote:
> > > > > 3. A hypervisor system prefers to have single stack regardless of
> > > > >     virtio device type (net/blk) and be future compatible with a
> > > > >     single vfio stack using SR-IOV or other scalable device
> > > > >     virtualization technology to map PCI devices to the guest VM.
> > > > >     (as transitional or otherwise)
> > > > 
> > > > What does this paragraph mean?
> > > > 
> > > It means regardless of a VF being a transitional MMR VF or a 1.x VF without any
> > > MMR extensions, there is a single vfio virtio driver handling both types of
> > > devices to map to the guest VM.
> > 
> > I don't think this can be vfio. You need a host layer translating
> > things such as device ID etc.
> 
> An mdev driver?

vdpa seems more likely. We can extend it to support legacy if we want
to.

-- 
MST


^ permalink raw reply	[flat|nested] 200+ messages in thread

* [virtio-dev] RE: [virtio-comment] Re: [PATCH 00/11] Introduce transitional mmr pci device
  2023-04-03 18:02                     ` Michael S. Tsirkin
@ 2023-04-03 20:25                       ` Parav Pandit
  2023-04-03 21:04                         ` [virtio-dev] " Michael S. Tsirkin
  0 siblings, 1 reply; 200+ messages in thread
From: Parav Pandit @ 2023-04-03 20:25 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: virtio-dev, cohuck, virtio-comment, Shahaf Shuler


> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Monday, April 3, 2023 2:02 PM

> > Because vqs involve DMA operations.
> > It is left to the device implementation to do it, but a generic wisdom
> > is not to implement such slow work in the data path engines.
> > So such register access vqs can/may be through firmware.
> > Hence it can involve a lot higher latency.
> 
> Then that wisdom is wrong? tens of microseconds is not workable even for
> ethtool operations, you are killing boot time.
> 
Huh.
What ethtool latencies have you experienced? Number?

> I frankly don't know, if device vendors are going to interpret "DMA" as "can
> take insane time" then maybe we need to scrap the whole admin vq idea and
> make it all memory mapped like Jason wanted, so as not to lead them into
> temptation?

DMA happens for all types of devices, for both the control and data paths.
Can you point to any existing industry specification and real implementation that highlights such timing requirements?
This will be useful to understand where these requirements come from.

Multiple device implementors do not see memory-mapped registers as the way forward.
Discussed many times.
There is no point in going down that dead end.

> Let me try again.
> 
> Modern host binds to modern interface. It can use the PF normally.
> Legacy guest IOBAR accesses to VF are translated to transport vq accesses.
> 
I understand this part.
Transport VQ is on the PF, right? (Nothing but AQ, right?)

It can work in the VF case, with trade-offs compared to memory-mapped registers.
A lightweight hypervisor that wants to utilize this for a transitional PF too cannot benefit from it.
So providing both options is useful.

Again, I want to emphasize that register read/write over the tvq has merits and trade-offs.
And so does the MMR approach.

Better to list them and proceed forward.

Method-1: VF register read/write via a PF-based transport VQ
Pros:
a. Lightweight register implementation in the device; no new memory region window needed

Cons:
a. Higher DMA read/write latency
b. The device requires synchronization between non-legacy memory-mapped registers and legacy register access via the tvq
c. Works only for VFs. Cannot work for a thin hypervisor, which can map a transitional PF to a bare-metal OS
(also listed in cover letter)

Method-2: VF register read/write via MMR (current proposal)
Pros:
a. The device utilizes the same legacy and non-legacy registers.
b. An order of magnitude lower latency due to avoidance of DMA on register accesses
(Important but not critical)

> > No. Interrupt latency is in usec range.
> > The major latency contributors in msec range can arise from the device side.
> 
> So you are saying there are devices out there already with this MMR hack
> baked in, and in hardware not firmware, so it works reasonably?
It is better not to call a solution a "hack" when you are still trying to understand the trade-offs of multiple solutions and have yet to fully review all the requirements.
(and when the solution is also based on offline feedback from you!)

No. I didn't say such a device is out there.
However, a large part of the proposed changes is based on real devices (and not limited to virtio).

Regarding the tvq, I have some ideas on how to improve the register reads/writes so that they are optimal for devices to implement.

^ permalink raw reply	[flat|nested] 200+ messages in thread

* [virtio-dev] RE: [virtio-comment] [PATCH 00/11] Introduce transitional mmr pci device
  2023-04-03 19:10     ` Stefan Hajnoczi
@ 2023-04-03 20:27       ` Parav Pandit
  2023-04-04 14:30         ` [virtio-dev] " Stefan Hajnoczi
  0 siblings, 1 reply; 200+ messages in thread
From: Parav Pandit @ 2023-04-03 20:27 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: mst, virtio-dev, cohuck, virtio-comment, Shahaf Shuler


> From: Stefan Hajnoczi <stefanha@redhat.com>
> Sent: Monday, April 3, 2023 3:10 PM


> Maybe call it "Memory-mapped Transitional"? That name would be easier to
> understand.
>
Sounds fine to me.
 
> > > Modern devices were added to Linux in 2014 and support SR-IOV.
> >
> > > Why is it
> > > important to support Transitional (which really means Legacy
> > > devices, otherwise Modern devices would be sufficient)?
> > >
> > To support guest VMs which only understand legacy devices;
> > unfortunately, such guests are still in wide use.
> 
> I wonder which guest software without Modern VIRTIO support will still be
> supported by the time Transitional MMR software and hardware becomes
> available. Are you aiming for particular guest software versions?

Transitional MMR hardware is almost available now.
Transitional MMR software is also a WIP, to be available as soon as we ratify the spec via tvq, via MMR, or both.
This will support guest software versions such as the 2.6.32.754 kernel.

^ permalink raw reply	[flat|nested] 200+ messages in thread

* [virtio-dev] RE: [virtio-comment] [PATCH 00/11] Introduce transitional mmr pci device
  2023-04-03 20:02         ` [virtio-dev] " Michael S. Tsirkin
@ 2023-04-03 20:42           ` Parav Pandit
  2023-04-03 21:14             ` [virtio-dev] " Michael S. Tsirkin
  0 siblings, 1 reply; 200+ messages in thread
From: Parav Pandit @ 2023-04-03 20:42 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Stefan Hajnoczi, virtio-dev, cohuck, virtio-comment, Shahaf Shuler


> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Monday, April 3, 2023 4:03 PM
> 
> On Mon, Apr 03, 2023 at 07:48:56PM +0000, Parav Pandit wrote:
> > > OK but supporting them with a passthrough driver such as vfio does
> > > not seem that important.
> > Not sure on what basis you assert that.
> > I clarified in the cover letter that these are user-level requirements: to
> support both transitional and non-transitional devices via a single vfio subsystem.
> 
> And what is so wrong with vdpa?  Really I don't see how the virtio spec needs to
> accommodate specific partitioning between Linux modules, be it vdpa or vfio.
> Way beyond the scope of the driver.
>
vdpa has its own value.

The requirements here are different, as listed, so let's focus on them.
 
> But anyway, my main
> point is about DMA. On the one hand you are asking for a VQ based
> management interface because it saves money. On the other you are saying
> DMA operations take extremely long to the point where they are unusable in
> the boot sequence.

I think you missed the point I described a few emails back.
The legacy registers are a subset of the 1.x registers, so a device that implements the existing 1.x registers gets the legacy registers for free.
Hence, there is no _real_ saving.

> So what is it? Was admin vq a mistake and we should do memory mapped?
No. Certainly not.
AQ is needed for LM, SR-IOV (SR-PCIM management), SIOV device life cycle.

> Or is
> legacy emulation doable over a vq and latency is not a concern, and the real
> reason is because it makes you push out a host driver with a bit less effort?
> 
Legacy register emulation is doable over a VQ and has its merits (I listed them in the previous email).
I forgot to mention in the previous email that device reset is also better via the tvq.
It is just that a legacy_registers_transport_vq (LRT_VQ) requires a more complex hypervisor driver and only works for VFs.

At the spec level, MMR has value on the PF as well; hence, on your first email last week, I proposed that the spec should allow both.

Hypervisor effort is not really a big concern.
Once we converge that LRT_VQ is good, it is a viable option too.
I will shortly send out a slightly more verbose command definition for the lrt_vq so that it is optimal enough.

> I just do not see how these claims do not contradict each other.

An AQ for queuing, parallelism, memory saving.


^ permalink raw reply	[flat|nested] 200+ messages in thread

* [virtio-dev] Re: [virtio-comment] Re: [PATCH 00/11] Introduce transitional mmr pci device
  2023-04-03 20:25                       ` [virtio-dev] " Parav Pandit
@ 2023-04-03 21:04                         ` Michael S. Tsirkin
  2023-04-03 22:00                           ` Parav Pandit
  0 siblings, 1 reply; 200+ messages in thread
From: Michael S. Tsirkin @ 2023-04-03 21:04 UTC (permalink / raw)
  To: Parav Pandit; +Cc: virtio-dev, cohuck, virtio-comment, Shahaf Shuler

On Mon, Apr 03, 2023 at 08:25:02PM +0000, Parav Pandit wrote:
> 
> > From: Michael S. Tsirkin <mst@redhat.com>
> > Sent: Monday, April 3, 2023 2:02 PM
> 
> > > Because vqs involve DMA operations.
> > > It is left to the device implementation to do it, but a generic wisdom
> > > is not to implement such slow work in the data path engines.
> > > So such register access vqs can/may be through firmware.
> > > Hence it can involve a lot higher latency.
> > 
> > Then that wisdom is wrong? tens of microseconds is not workable even for
> > ethtool operations, you are killing boot time.
> > 
> Huh.
> What ethtool latencies have you experienced? Number?

I know on the order of tens of ethtool calls happen during boot.
If, as you said, each takes tens of ms, then we are talking close to a second.
That is measurable.

> > I frankly don't know, if device vendors are going to interpret "DMA" as "can
> > take insane time" then maybe we need to scrap the whole admin vq idea and
> > make it all memory mapped like Jason wanted, so as not to lead them into
> > temptation?
> 
> DMA happens for all types of devices, for both the control and data paths.
> Can you point to any existing industry specification and real implementation that highlights such timing requirements?
> This will be useful to understand where these requirements come from.
> 
> Multiple device implementors do not see memory-mapped registers as the way forward.
> Discussed many times.
> There is no point in going down that dead end.

OK then. If it is a dead end, then it looks weird to add a whole new
config space as memory mapped.

> > Let me try again.
> > 
> > Modern host binds to modern interface. It can use the PF normally.
> > Legacy guest IOBAR accesses to VF are translated to transport vq accesses.
> > 
> I understand this part.
> Transport VQ is on the PF, right? (Nothing but AQ, right?)
> 
> It can work in the VF case, with trade-offs compared to memory-mapped registers.
> A lightweight hypervisor that wants to utilize this for a transitional PF too cannot benefit from it.
> So providing both options is useful.

If hardware vendors do not want to bear the costs of registers then they
will not implement devices with registers, and then the whole thing will
become yet another legacy thing we need to support. If legacy emulation
without IO is useful, then can we not find a way to do it that will
survive the test of time?

> Again, I want to emphasize that register read/write over the tvq has merits and trade-offs.
> And so does the MMR approach.
> 
> Better to list them and proceed forward.
> 
> Method-1: VF register read/write via a PF-based transport VQ
> Pros:
> a. Lightweight register implementation in the device; no new memory region window needed

Is that all? I mentioned more.

> Cons:
> a. Higher DMA read/write latency
> b. The device requires synchronization between non-legacy memory-mapped registers and legacy register access via the tvq

Same as a separate memory bar really.  Just don't do it. Either access
legacy or non-legacy.

> c. Works only for VFs. Cannot work for a thin hypervisor, which can map a transitional PF to a bare-metal OS
> (also listed in cover letter)

Is that a significant limitation? Why?

> Method-2: VF register read/write via MMR (current proposal)
> Pros:
> a. The device utilizes the same legacy and non-legacy registers.

Not really. For starters, endianness can be different. Maybe for some
devices and systems.

> b. An order of magnitude lower latency due to avoidance of DMA on register accesses
> (Important but not critical)

And no cons? Even if you could not see them yourself, did I fail to express myself to such
an extent?

> > > No. Interrupt latency is in usec range.
> > > The major latency contributors in msec range can arise from the device side.
> > 
> > So you are saying there are devices out there already with this MMR hack
> > baked in, and in hardware not firmware, so it works reasonably?
> It is better not to call a solution a "hack"

Sorry if that sounded offensive.  A hack is not necessarily a bad thing.
It's a quick solution to a very local problem, though.

> when you are still
> trying to understand the trade-offs of multiple solutions and have yet
> to fully review all the requirements.
> (and when the solution is also based on offline feedback from you!)

Sorry if I set you on a wrong path; I frankly didn't realize how
big a change the result would be. I was thinking more along the lines
of a capability describing a portion of a memory BAR, saying access
there is equivalent to access to the I/O BAR. That's it.

> No. I didn't say such a device is out there.
> However, a large part of the proposed changes is based on real devices (and not limited to virtio).

Yes, motivation is one of the things I'm trying to work out here.
It does, however, not help that it's an 11-patch patchset
adding 500 lines of text for what is supposedly a small change.

> Regarding the tvq, I have some ideas on how to improve the register reads/writes so that they are optimal for devices to implement.

Sounds useful, and maybe if the tvq addresses the legacy need, then focus on
that?

-- 
MST


^ permalink raw reply	[flat|nested] 200+ messages in thread

* [virtio-dev] Re: [virtio-comment] [PATCH 00/11] Introduce transitional mmr pci device
  2023-04-03 20:42           ` [virtio-dev] " Parav Pandit
@ 2023-04-03 21:14             ` Michael S. Tsirkin
  2023-04-03 22:08               ` Parav Pandit
  0 siblings, 1 reply; 200+ messages in thread
From: Michael S. Tsirkin @ 2023-04-03 21:14 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Stefan Hajnoczi, virtio-dev, cohuck, virtio-comment, Shahaf Shuler

On Mon, Apr 03, 2023 at 08:42:52PM +0000, Parav Pandit wrote:
> 
> > From: Michael S. Tsirkin <mst@redhat.com>
> > Sent: Monday, April 3, 2023 4:03 PM
> > 
> > On Mon, Apr 03, 2023 at 07:48:56PM +0000, Parav Pandit wrote:
> > > > OK but supporting them with a passthrough driver such as vfio does
> > > > not seem that important.
> > > Not sure on what basis you assert that.
> > > I clarified in the cover letter that these are user-level requirements: to
> > support both transitional and non-transitional devices via a single vfio subsystem.
> > 
> > And what is so wrong with vdpa?  Really I don't see how the virtio spec needs to
> > accommodate specific partitioning between Linux modules, be it vdpa or vfio.
> > Way beyond the scope of the driver.
> >
> vdpa has its own value.
> 
> The requirements here are different, as listed, so let's focus on them.

I'm not sure how convincing that is. Yes, simpler software is good
and nice to have, but it's not such a hard requirement to use vfio
when vdpa is there. And again, the cost is reduced robustness.

> > But anyway, my main
> > point is about DMA. On the one hand you are asking for a VQ based
> > management interface because it saves money. On the other you are saying
> > DMA operations take extremely long to the point where they are unusable in
> > the boot sequence.
> 
> I think you missed the point I described a few emails back.
> The legacy registers are a subset of the 1.x registers, so a device that implements the existing 1.x registers gets the legacy registers for free.
> Hence, there is no _real_ saving.

First, not 100%.  E.g. the MAC is writable, so that's a R/W register as
opposed to RO.  But generally, why implement the 1.0 registers at all? Do it
all in the transport vq.

> > So what is it? Was admin vq a mistake and we should do memory mapped?
> No. Certainly not.
> AQ is needed for LM, SR-IOV (SR-PCIM management), SIOV device life cycle.
> 
> > Or is
> > legacy emulation doable over a vq and latency is not a concern, and the real
> > reason is because it makes you push out a host driver with a bit less effort?
> > 
> Legacy register emulation is doable over a VQ and has its merits (I listed them in the previous email).
> I forgot to mention in the previous email that device reset is also better via the tvq.
> It is just that a legacy_registers_transport_vq (LRT_VQ) requires a more complex hypervisor driver and only works for VFs.
> 
> At the spec level, MMR has value on the PF as well; hence, on your first email last week, I proposed that the spec should allow both.
> 
> Hypervisor effort is not really a big concern.
> Once we converge that LRT_VQ is good, it is a viable option too.
> I will shortly send out a slightly more verbose command definition for the lrt_vq so that it is optimal enough.
> 
> > I just do not see how these claims do not contradict each other.
> 
> An AQ for queuing, parallelism, memory saving.

Ok that all sounds good.

-- 
MST


^ permalink raw reply	[flat|nested] 200+ messages in thread

* [virtio-dev] Re: [virtio-comment] Re: [PATCH 00/11] Introduce transitional mmr pci device
  2023-04-03 21:04                         ` [virtio-dev] " Michael S. Tsirkin
@ 2023-04-03 22:00                           ` Parav Pandit
  2023-04-07  9:35                             ` Michael S. Tsirkin
  0 siblings, 1 reply; 200+ messages in thread
From: Parav Pandit @ 2023-04-03 22:00 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: virtio-dev, cohuck, virtio-comment, Shahaf Shuler



On 4/3/2023 5:04 PM, Michael S. Tsirkin wrote:
> On Mon, Apr 03, 2023 at 08:25:02PM +0000, Parav Pandit wrote:
>>
>>> From: Michael S. Tsirkin <mst@redhat.com>
>>> Sent: Monday, April 3, 2023 2:02 PM
>>
>>>> Because vqs involve DMA operations.
>>>> It is left to the device implementation to do it, but a generic wisdom
>>>> is not to implement such slow work in the data path engines.
>>>> So such register access vqs can/may be through firmware.
>>>> Hence it can involve a lot higher latency.
>>>
>>> Then that wisdom is wrong? tens of microseconds is not workable even for
>>> ethtool operations, you are killing boot time.
>>>
>> Huh.
>> What ethtool latencies have you experienced? Number?
> 
> I know on the order of tens of ethtool calls happen during boot.
> If, as you said, each takes tens of ms, then we are talking close to a second.
> That is measurable.
I said it can take that long; it doesn't always have to be the same for all the commands.
Better to work with real numbers. :)

Let me walk through an example.

If a cvq or aq command takes 0.5 msec, a total of 100 such commands will
take 50 msec.

Once in a while, if two of the commands take, say, 5 msec each, the total
goes from 50 to roughly 60 msec.

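To make the arithmetic explicit, here is a trivial sketch; the command count and the latencies are the illustrative assumptions from the example above, not measurements:

#include <stdio.h>

int main(void)
{
        int total_cmds = 100, slow_cmds = 2;
        double typical_ms = 0.5, slow_ms = 5.0;

        /* 98 typical commands plus 2 occasional slow ones */
        double total = (total_cmds - slow_cmds) * typical_ms +
                       slow_cmds * slow_ms;
        printf("estimated total: %.1f msec\n", total); /* prints 59.0 */
        return 0;
}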

> OK then. If it is a dead end, then it looks weird to add a whole new
> config space as memory mapped.
> 
I am aligned with you on not adding any new registers as memory mapped for 1.x.
Alternatively, access through the device's own tvq is fine if such a queue
can be initialized during the device reset (init) phase.

I explained that the legacy registers are a subset of the existing 1.x ones.
They should not consume extra memory.

Let's walk through the merits and drawbacks of both to conclude.

>>> Let me try again.

> If hardware vendors do not want to bear the costs of registers then they
> will not implement devices with registers, and then the whole thing will
> become yet another legacy thing we need to support. If legacy emulation
> without IO is useful, then can we not find a way to do it that will
> survive the test of time?
A legacy_register_transport_vq for the VF can be an option, but not for PF
emulation.
More below.

> 
>> Again, I want to emphasize that register read/write over the tvq has merits and trade-offs.
>> And so does the MMR approach.
>>
>> Better to list them and proceed forward.
>>
>> Method-1: VF register read/write via a PF-based transport VQ
>> Pros:
>> a. Lightweight register implementation in the device; no new memory region window needed
> 
> Is that all? I mentioned more.
> 
b. Device reset is more optimal with the transport VQ
c. A hypervisor may want to check register content (though it is not necessary)
d. Some unknown guest VM driver which modifies the MAC address and still
expects atomicity can benefit if the hypervisor wants to do extra checks

>> Cons:
>> a. Higher DMA read/write latency
>> b. The device requires synchronization between non-legacy memory-mapped registers and legacy register access via the tvq
> 
> Same as a separate memory bar really.  Just don't do it. Either access
> legacy or non-legacy.
> 
It is really not the same; the tvq encapsulation is different, and hardware
would prefer not to treat these accesses like regular memory writes.

The transitional device exposed by the hypervisor contains both the legacy
I/O BAR and the memory-mapped registers, so a guest VM can access both.
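
As a rough sketch of what this could look like (the window structure and the trap hook are illustrative assumptions, not something defined by this series), a hypervisor could forward a trapped legacy I/O BAR write into the memory-mapped legacy register window roughly like this:

#include <stdint.h>

struct legacy_mmr_window {
        volatile uint8_t *base; /* BAR mapping + capability offset */
        uint64_t len;           /* window length from the capability */
};

/* Invoked when the guest writes to the emulated legacy I/O BAR. */
static int forward_legacy_write(struct legacy_mmr_window *w,
                                uint16_t reg, uint32_t val, int size)
{
        if (reg + size > w->len)
                return -1;
        switch (size) {
        case 1: *(volatile uint8_t *)(w->base + reg) = (uint8_t)val; break;
        case 2: *(volatile uint16_t *)(w->base + reg) = (uint16_t)val; break;
        case 4: *(volatile uint32_t *)(w->base + reg) = val; break;
        default: return -1;
        }
        return 0;
}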

>> c. Works only for VFs. Cannot work for a thin hypervisor, which can map a transitional PF to a bare-metal OS
>> (also listed in cover letter)
> 
> Is that a significant limitation? Why?
It is a functional limitation for the PF, as a PF has no parent,
and the PF can also utilize a memory BAR.

> 
>> Method-2: VF register read/write via MMR (current proposal)
>> Pros:
>> a. The device utilizes the same legacy and non-legacy registers.
> 
>> b. An order of magnitude lower latency due to avoidance of DMA on register accesses
>> (Important but not critical)
> 
> And no cons? Even if you could not see them yourself, did I fail to express myself to such
> an extent?
> 
The Method-1 pros covered its advantages over Method-2, but yes, worth
listing here for completeness.

Cons:
a. Requires creating a new memory region window in the device for
configuration access

>>>> No. Interrupt latency is in usec range.
>>>> The major latency contributors in msec range can arise from the device side.
>>>
>>> So you are saying there are devices out there already with this MMR hack
>>> baked in, and in hardware not firmware, so it works reasonably?
>> It is better not to call a solution a "hack"
> 
> Sorry if that sounded offensive.  A hack is not necessarily a bad thing.
> It's a quick solution to a very local problem, though.
> 
It is a solution because the device can implement it with near-zero extra
memory for the existing registers.
Anyway, we have better technical details to resolve. :)
Let's focus on those.

> Yes, motivation is one of the things I'm trying to work out here.
> It does, however, not help that it's an 11-patch patchset
> adding 500 lines of text for what is supposedly a small change.
> 
Many of the patches are rework, and it is incorrect to attribute them to
this specific feature.

Like others, it could have been one giant patch... but we see value in
smaller patches.

Using a tvq is an even bigger change than this, so we shouldn't be afraid of
making the transitional device actually work using it, even with a larger spec patch.

>> Regarding the tvq, I have some ideas on how to improve the register reads/writes so that they are optimal for devices to implement.
> 
> Sounds useful, and maybe if the tvq addresses the legacy need, then focus on
> that?
> 

A tvq specific to legacy register access makes sense.
A generic tvq is abstract, and I don't see any relation to it here.

So better to name it legacy_reg_transport_vq (lrt_vq).

How about the format below?

/* Format of 16B descriptors for the lrt_vq.
 * lrt_vq = legacy register transport vq.
 */
struct legacy_reg_req_vf {
	union {
		struct {
			le32 reg_wr_data;
			le32 reserved;
		} write;
		struct {
			le64 reg_read_addr;
		} read;
	};
	u8 rd_wr : 1;	/* rd=0, wr=1 */
	u8 reg_byte_offset : 7;
	u8 req_tag;	/* unique request tag on this vq */
	le16 vf_num;

	le16 flags;	/* new flag below */
	le16 next;
};

#define VIRTQ_DESC_F_Q_DEFINED 8
/* Content of the VQ descriptor other than the flags field is VQ
 * specific and defined by the VQ type.
 */
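
As a hypothetical usage sketch (the descriptor posting, the cpu_to_le*() conversions, and the next_tag() allocator are assumptions for illustration, not part of the proposal), a PF driver could fill one such descriptor for a legacy register read of a VF like this:

/* Build an lrt_vq read request for one legacy register of a VF.
 * dma_addr is a DMA-able buffer the device writes the result into;
 * posting the descriptor and notifying the device are elided.
 */
static void lrt_vq_read_reg(struct legacy_reg_req_vf *desc, uint16_t vf,
                            uint8_t offset, uint64_t dma_addr)
{
        desc->read.reg_read_addr = cpu_to_le64(dma_addr);
        desc->rd_wr = 0;                /* rd=0: read */
        desc->reg_byte_offset = offset; /* legacy register offset, < 128 */
        desc->req_tag = next_tag();     /* assumed per-vq tag allocator */
        desc->vf_num = cpu_to_le16(vf);
        desc->flags = cpu_to_le16(VIRTQ_DESC_F_Q_DEFINED);
        /* ...post the descriptor and notify the device... */
}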

^ permalink raw reply	[flat|nested] 200+ messages in thread

* [virtio-dev] Re: [virtio-comment] [PATCH 00/11] Introduce transitional mmr pci device
  2023-04-03 21:14             ` [virtio-dev] " Michael S. Tsirkin
@ 2023-04-03 22:08               ` Parav Pandit
  0 siblings, 0 replies; 200+ messages in thread
From: Parav Pandit @ 2023-04-03 22:08 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Stefan Hajnoczi, virtio-dev, cohuck, virtio-comment, Shahaf Shuler



On 4/3/2023 5:14 PM, Michael S. Tsirkin wrote:
> On Mon, Apr 03, 2023 at 08:42:52PM +0000, Parav Pandit wrote:
>>
>>> From: Michael S. Tsirkin <mst@redhat.com>
>>> Sent: Monday, April 3, 2023 4:03 PM
>>>
>>> On Mon, Apr 03, 2023 at 07:48:56PM +0000, Parav Pandit wrote:
>>>>> OK but supporting them with a passthrough driver such as vfio does
>>>>> not seem that important.
>>>> Not sure on what basis you assert that.
>>>> I clarified in the cover letter that these are user-level requirements: to
>>> support both transitional and non-transitional devices via a single vfio subsystem.
>>>
>>> And what is so wrong with vdpa?  Really I don't see how the virtio spec needs to
>>> accommodate specific partitioning between Linux modules, be it vdpa or vfio.
>>> Way beyond the scope of the driver.
>>>
>> vdpa has its own value.
>>
>> Here requirements are different as listed so let's focus on it.
> 
> I'm not sure how convincing that is. Yes, simpler software is good
> and nice to have, but it's not such a hard requirement to use vfio
> when vdpa is there. And again, the cost is reduced robustness.
> 
There is no hard requirement to use vdpa or vfio one way or the other.

vdpa users can use vdpa.
vfio users can use vfio.

>>> But anyway, my main
>>> point is about DMA. On the one hand you are asking for a VQ based
>>> management interface because it saves money. On the other you are saying
>>> DMA operations take extremely long to the point where they are unusable in
>>> the boot sequence.
>>
>> I think you missed the point I described a few emails back.
>> The legacy registers are a subset of the 1.x registers, so a device that implements the existing 1.x registers gets the legacy registers for free.
>> Hence, there is no _real_ saving.
> 
> First, not 100%.  E.g. the MAC is writable, so that's a R/W register as
> opposed to RO.  But generally, why implement the 1.0 registers at all? Do it
> all in the transport vq.
> 
1.x non-transitional VFs and SIOV devices will use their own
register_transport_vq to access their registers once that infrastructure
is in place. Such a VQ is directly accessible in the guest VM without
hypervisor intervention.

It is orthogonal to this use case (and such a future VQ still works with
this design).

In this approach, a hypervisor needs to use the PF's AQ because the VF device
(its registers, features, etc.) is not owned by the hypervisor VF driver.

^ permalink raw reply	[flat|nested] 200+ messages in thread

* [virtio-dev] Re: [PATCH 07/11] transport-pci: Introduce transitional MMR device id
  2023-03-30 22:58 ` [virtio-dev] [PATCH 07/11] transport-pci: Introduce transitional MMR device id Parav Pandit
@ 2023-04-04  7:28   ` Michael S. Tsirkin
  2023-04-04 16:08     ` Parav Pandit
  2023-04-07  8:37   ` [virtio-dev] " Michael S. Tsirkin
  1 sibling, 1 reply; 200+ messages in thread
From: Michael S. Tsirkin @ 2023-04-04  7:28 UTC (permalink / raw)
  To: Parav Pandit; +Cc: virtio-dev, cohuck, virtio-comment, shahafs, Satananda Burla

On Fri, Mar 31, 2023 at 01:58:30AM +0300, Parav Pandit wrote:
> Transitional MMR device PCI Device IDs are unique. Hence,
> none of the existing drivers bind to them.
> This further maintains backward compatibility with
> existing drivers.
> 
> Co-developed-by: Satananda Burla <sburla@marvell.com>
> Signed-off-by: Parav Pandit <parav@nvidia.com>

I took a fresh look at it, and I don't get it: what exactly is wrong
with just using modern ID? Why do we need new ones?


> ---
>  transport-pci.tex | 45 +++++++++++++++++++++++++++++++++++++++++----
>  1 file changed, 41 insertions(+), 4 deletions(-)
> 
> diff --git a/transport-pci.tex b/transport-pci.tex
> index ee11ba5..665448e 100644
> --- a/transport-pci.tex
> +++ b/transport-pci.tex
> @@ -19,12 +19,14 @@ \section{Virtio Over PCI Bus}\label{sec:Virtio Transport Options / Virtio Over P
>  \subsection{PCI Device Discovery}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI Device Discovery}
>  
>  Any PCI device with PCI Vendor ID 0x1af4, and PCI Device ID 0x1000 through
> -0x107f inclusive is a virtio device. The actual value within this range
> -indicates which virtio device is supported by the device.
> +0x107f inclusive, or 0x10f9 through 0x10ff inclusive, is a virtio device.
> +The actual value within this range indicates which virtio device
> +type it is.
>  The PCI Device ID is calculated by adding 0x1040 to the Virtio Device ID,
>  as indicated in section \ref{sec:Device Types}.
> -Additionally, devices MAY utilize a Transitional PCI Device ID range,
> -0x1000 to 0x103f depending on the device type.
> +Additionally, devices MAY utilize a Transitional PCI Device ID range
> +0x1000 to 0x103f inclusive or a Transitional MMR PCI Device ID range
> +0x10f9 to 0x10ff inclusive, depending on the device type.
>  
>  \devicenormative{\subsubsection}{PCI Device Discovery}{Virtio Transport Options / Virtio Over PCI Bus / PCI Device Discovery}
>  
> @@ -95,6 +97,41 @@ \subsubsection{Legacy Interfaces: A Note on PCI Device Discovery}\label{sec:Virt
>  
>  This is to match legacy drivers.
>  
> +\subsubsection{Transitional MMR Interface: A Note on PCI Device
> +Discovery}\label{sec:Virtio Transport Options / Virtio Over PCI
> +Bus / PCI Device Discovery / Transitional MMR Interface: A Note on PCI Device Discovery}
> +
> +The transitional MMR device has one of the following PCI Device IDs,
> +depending on the device type:
> +
> +\begin{tabular}{|l|c|}
> +\hline
> +Transitional PCI Device ID  &  Virtio Device    \\
> +\hline \hline
> +0x10f9      &   network device     \\
> +\hline
> +0x10fa     &   block device     \\
> +\hline
> +0x10fb     & memory ballooning (traditional)  \\
> +\hline
> +0x10fc     &      console       \\
> +\hline
> +0x10fd     &     SCSI host      \\
> +\hline
> +0x10fe     &  entropy source    \\
> +\hline
> +0x10ff     &   9P transport     \\
> +\hline
> +\end{tabular}
> +
> +The PCI Subsystem Vendor ID and the PCI Subsystem Device ID MAY
> +reflect the PCI Vendor and Device ID of the environment.
> +
> +The transitional MMR driver MUST match any PCI Revision ID value.
> +
> +The transitional MMR driver MAY match any PCI Subsystem Vendor ID and
> +any PCI Subsystem Device ID value.
> +
>  \subsection{PCI Device Layout}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout}
>  
>  The device is configured via I/O and/or memory regions (though see
> -- 
> 2.26.2
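
For illustration only (this is not part of the patch), a Linux-style driver that opts in to the proposed network device ID would match it explicitly; existing virtio-pci drivers match only 0x1000-0x107f and therefore never claim the new range:

#include <linux/module.h>
#include <linux/pci.h>

#define VIRTIO_MMR_NET_DEVICE_ID 0x10f9 /* from the proposed table */

static const struct pci_device_id virtio_mmr_net_ids[] = {
        { PCI_DEVICE(0x1af4, VIRTIO_MMR_NET_DEVICE_ID) },
        { 0 },
};
MODULE_DEVICE_TABLE(pci, virtio_mmr_net_ids);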


^ permalink raw reply	[flat|nested] 200+ messages in thread

* [virtio-dev] Re: [PATCH 08/11] transport-pci: Introduce virtio extended capability
  2023-03-30 22:58 ` [virtio-dev] [PATCH 08/11] transport-pci: Introduce virtio extended capability Parav Pandit
@ 2023-04-04  7:35   ` Michael S. Tsirkin
  2023-04-04  7:54     ` Cornelia Huck
  2023-04-04 21:18     ` [virtio-dev] " Parav Pandit
  2023-04-10  1:36   ` [virtio-dev] " Jason Wang
  2023-05-19  6:10   ` [virtio-dev] " Michael S. Tsirkin
  2 siblings, 2 replies; 200+ messages in thread
From: Michael S. Tsirkin @ 2023-04-04  7:35 UTC (permalink / raw)
  To: Parav Pandit; +Cc: virtio-dev, cohuck, virtio-comment, shahafs, Satananda Burla

On Fri, Mar 31, 2023 at 01:58:31AM +0300, Parav Pandit wrote:
> PCI device configuration space for capabilities is limited to only 192
> bytes, shared by many PCI capabilities of the generic PCI device and
> virtio-specific ones.
> 
> Hence, introduce virtio extended capability that uses PCI Express
> extended capability.
> Subsequent patch uses this virtio extended capability.
> 
> Co-developed-by: Satananda Burla <sburla@marvell.com>

What does it "Co-developed-by" mean exactly that Signed-off-by
does not?


> Signed-off-by: Parav Pandit <parav@nvidia.com>
> ---
>  transport-pci.tex | 69 ++++++++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 68 insertions(+), 1 deletion(-)
> 
> diff --git a/transport-pci.tex b/transport-pci.tex
> index 665448e..aeda4a1 100644
> --- a/transport-pci.tex
> +++ b/transport-pci.tex
> @@ -174,7 +174,8 @@ \subsection{Virtio Structure PCI Capabilities}\label{sec:Virtio Transport Option
>  the function, or accessed via the special VIRTIO_PCI_CAP_PCI_CFG field in the PCI configuration space.
>  
>  The location of each structure is specified using a vendor-specific PCI capability located
> -on the capability list in PCI configuration space of the device.
> +on the capability list in PCI configuration space of the device
> +unless stated otherwise.
>  This virtio structure capability uses little-endian format; all fields are
>  read-only for the driver unless stated otherwise:
>  
> @@ -301,6 +302,72 @@ \subsection{Virtio Structure PCI Capabilities}\label{sec:Virtio Transport Option
>  fields provide the most significant 32 bits of a total 64 bit offset and
>  length within the BAR specified by \field{cap.bar}.
>  
> +Virtio extended PCI Express capability structure defines
> +the location of certain virtio device configuration related
> +structures using PCI Express extended capability. Virtio
> +extended PCI Express capability structure uses PCI Express
> +vendor specific extended capability (VSEC). It has a below

a layout below, or the following layout

> +layout:
> +
> +\begin{lstlisting}
> +struct pcie_ext_cap {
> +        le16 cap_vendor_id; /* Generic PCI field: 0xB */
> +        le16 cap_version : 2; /* Generic PCI field: 0 */
> +        le16 next_cap_offset : 14; /* Generic PCI field: next cap or 0 */
> +};
> +
> +struct virtio_pcie_ext_cap {
> +        struct pcie_ext_cap pcie_ecap;
> +        u8 cfg_type; /* Identifies the structure. */
> +        u8 bar; /* Index of the BAR where its located */
> +        u8 id; /* Multiple capabilities of the same type */
> +        u8 zero_padding[1];
> +        le64 offset; /* Offset with the bar */
> +        le64 length; /* Length of the structure, in bytes. */
> +        u8 data[]; /* Optional variable length data */

Maybe le64 data[], for alignment?

> +};
> +\end{lstlisting}
> +
> +This structure contains optional data, depending on
> +\field{cfg_type}. The fields are interpreted as follows:
> +
> +\begin{description}
> +\item[\field{cap_vendor_id}]
> +         0x0B; identifies a vendor-specific extended capability.
> +
> +\item[\field{cap_version}]
> +         contains a value of 0.
> +
> +\item[\field{next_cap_offset}]
> +        Offset to the next capability.
> +
> +\item[\field{cfg_type}]
> +        follows the same definition as \field{cfg_type}
> +        from the \field{struct virtio_pci_cap}.
> +
> +\item[\field{bar}]
> +        follows the same definition as \field{bar}
> +        from the \field{struct virtio_pci_cap}.
> +
> +\item[\field{id}]
> +        follows the same definition as \field{id}
> +        from the \field{struct virtio_pci_cap}.
> +
> +\item[\field{offset}]
> +        indicates where the structure begins relative to the
> +        base address associated with the BAR. The alignment
> +        requirements of offset are indicated in each
> +        structure-specific section that uses
> +        \field{struct virtio_pcie_ext_cap}.
> +
> +\item[\field{length}]
> +        indicates the length of the structure indicated by this
> +        capability.
> +
> +\item[\field{data}]
> +        optional data of this capability.
> +\end{description}
> +
>  \drivernormative{\subsubsection}{Virtio Structure PCI Capabilities}{Virtio Transport Options / Virtio Over PCI Bus / Virtio Structure PCI Capabilities}
>  
>  The driver MUST ignore any vendor-specific capability structure which has
> -- 
> 2.26.2
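
For illustration (not part of the patch), a driver could locate this capability by walking the extended capability list using the pcie_ext_cap layout defined above; ext_cfg_read32() is an assumed config-space accessor, not a real API:

#include <stdint.h>

#define PCI_EXT_CAP_BASE    0x100  /* PCIe extended capabilities start here */
#define PCI_EXT_CAP_ID_VSEC 0x000B

extern uint32_t ext_cfg_read32(void *dev, uint16_t offset); /* assumed */

/* Per the layout above: bits 0-15 cap_vendor_id, bits 16-17 cap_version,
 * bits 18-31 next_cap_offset.
 */
static uint16_t find_virtio_ext_cap(void *dev)
{
        uint16_t offset = PCI_EXT_CAP_BASE;

        while (offset) {
                uint32_t hdr = ext_cfg_read32(dev, offset);

                if ((hdr & 0xffff) == PCI_EXT_CAP_ID_VSEC)
                        return offset; /* struct virtio_pcie_ext_cap is here */
                offset = (hdr >> 18) & 0x3fff;
        }
        return 0;
}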


^ permalink raw reply	[flat|nested] 200+ messages in thread

* [virtio-dev] Re: [PATCH 08/11] transport-pci: Introduce virtio extended capability
  2023-04-04  7:35   ` [virtio-dev] " Michael S. Tsirkin
@ 2023-04-04  7:54     ` Cornelia Huck
  2023-04-04 12:43       ` Michael S. Tsirkin
  2023-04-04 21:18     ` [virtio-dev] " Parav Pandit
  1 sibling, 1 reply; 200+ messages in thread
From: Cornelia Huck @ 2023-04-04  7:54 UTC (permalink / raw)
  To: Michael S. Tsirkin, Parav Pandit
  Cc: virtio-dev, virtio-comment, shahafs, Satananda Burla

On Tue, Apr 04 2023, "Michael S. Tsirkin" <mst@redhat.com> wrote:

> On Fri, Mar 31, 2023 at 01:58:31AM +0300, Parav Pandit wrote:
>> PCI device configuration space for capabilities is limited to only 192
>> bytes, shared by many PCI capabilities of the generic PCI device and
>> virtio-specific ones.
>> 
>> Hence, introduce virtio extended capability that uses PCI Express
>> extended capability.
>> Subsequent patch uses this virtio extended capability.
>> 
>> Co-developed-by: Satananda Burla <sburla@marvell.com>
>
> What does it "Co-developed-by" mean exactly that Signed-off-by
> does not?

AIUI, "Co-developed-by:" is more akin to a second "Author:", i.e. to
give attribution. (Linux kernel rules actually state that any
"Co-developed-by:" must be followed by a "Signed-off-by:" for that
author, but we don't really do DCO here, the S-o-b is more of a
convention.)

>
>
>> Signed-off-by: Parav Pandit <parav@nvidia.com>


^ permalink raw reply	[flat|nested] 200+ messages in thread

* [virtio-dev] Re: [PATCH 08/11] transport-pci: Introduce virtio extended capability
  2023-04-04  7:54     ` Cornelia Huck
@ 2023-04-04 12:43       ` Michael S. Tsirkin
  2023-04-04 13:19         ` Cornelia Huck
  0 siblings, 1 reply; 200+ messages in thread
From: Michael S. Tsirkin @ 2023-04-04 12:43 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Parav Pandit, virtio-dev, virtio-comment, shahafs, Satananda Burla

On Tue, Apr 04, 2023 at 09:54:55AM +0200, Cornelia Huck wrote:
> On Tue, Apr 04 2023, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> 
> > On Fri, Mar 31, 2023 at 01:58:31AM +0300, Parav Pandit wrote:
> >> PCI device configuration space for capabilities is limited to only 192
> >> bytes, shared by many PCI capabilities of the generic PCI device and
> >> virtio-specific ones.
> >> 
> >> Hence, introduce virtio extended capability that uses PCI Express
> >> extended capability.
> >> Subsequent patch uses this virtio extended capability.
> >> 
> >> Co-developed-by: Satananda Burla <sburla@marvell.com>
> >
> > What does it "Co-developed-by" mean exactly that Signed-off-by
> > does not?
> 
> AIUI, "Co-developed-by:" is more akin to a second "Author:", i.e. to
> give attribution. (Linux kernel rules actually state that any
> "Co-developed-by:" must be followed by a "Signed-off-by:" for that
> author, but we don't really do DCO here, the S-o-b is more of a
> convention.)


Actually, we might want that generally, to signify agreement to the IPR.
How about adding this to our rules?  But in this case Satananda Burla is
a TC member, so yes, no problem.

> >
> >
> >> Signed-off-by: Parav Pandit <parav@nvidia.com>


^ permalink raw reply	[flat|nested] 200+ messages in thread

* [virtio-dev] Re: [PATCH 08/11] transport-pci: Introduce virtio extended capability
  2023-04-04 12:43       ` Michael S. Tsirkin
@ 2023-04-04 13:19         ` Cornelia Huck
  2023-04-04 14:37           ` Michael S. Tsirkin
  0 siblings, 1 reply; 200+ messages in thread
From: Cornelia Huck @ 2023-04-04 13:19 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Parav Pandit, virtio-dev, virtio-comment, shahafs, Satananda Burla

On Tue, Apr 04 2023, "Michael S. Tsirkin" <mst@redhat.com> wrote:

> On Tue, Apr 04, 2023 at 09:54:55AM +0200, Cornelia Huck wrote:
>> On Tue, Apr 04 2023, "Michael S. Tsirkin" <mst@redhat.com> wrote:
>> 
>> > On Fri, Mar 31, 2023 at 01:58:31AM +0300, Parav Pandit wrote:
>> >> PCI device configuration space for capabilities is limited to only 192
>> >> bytes, shared by many PCI capabilities of the generic PCI device and
>> >> virtio-specific ones.
>> >> 
>> >> Hence, introduce virtio extended capability that uses PCI Express
>> >> extended capability.
>> >> Subsequent patch uses this virtio extended capability.
>> >> 
>> >> Co-developed-by: Satananda Burla <sburla@marvell.com>
>> >
>> > What does it "Co-developed-by" mean exactly that Signed-off-by
>> > does not?
>> 
>> AIUI, "Co-developed-by:" is more akin to a second "Author:", i.e. to
>> give attribution. (Linux kernel rules actually state that any
>> "Co-developed-by:" must be followed by a "Signed-off-by:" for that
>> author, but we don't really do DCO here, the S-o-b is more of a
>> convention.)
>
>
> Actually, we might want that generally, to signify agreement to the IPR.
> How about adding this to our rules?  But in this case Satananda Burla is
> a TC member, so yes, no problem.

Adding an s-o-b requirement is not a bad idea... do you want to propose
an update?


^ permalink raw reply	[flat|nested] 200+ messages in thread

* [virtio-dev] Re: [virtio-comment] [PATCH 00/11] Introduce transitional mmr pci device
  2023-04-03 20:27       ` [virtio-dev] " Parav Pandit
@ 2023-04-04 14:30         ` Stefan Hajnoczi
  0 siblings, 0 replies; 200+ messages in thread
From: Stefan Hajnoczi @ 2023-04-04 14:30 UTC (permalink / raw)
  To: Parav Pandit; +Cc: mst, virtio-dev, cohuck, virtio-comment, Shahaf Shuler

On Mon, Apr 03, 2023 at 08:27:28PM +0000, Parav Pandit wrote:
> 
> > From: Stefan Hajnoczi <stefanha@redhat.com>
> > Sent: Monday, April 3, 2023 3:10 PM
> 
> 
> > Maybe call it "Memory-mapped Transitional"? That name would be easier to
> > understand.
> >
> Sounds fine to me.
>  
> > > > Modern devices were added to Linux in 2014 and support SR-IOV.
> > >
> > > > Why is it
> > > > important to support Transitional (which really means Legacy
> > > > devices, otherwise Modern devices would be sufficient)?
> > > >
> > > To support guest VMs which only understand legacy devices;
> > > unfortunately, such guests are still in wide use.
> > 
> > I wonder which guest software without Modern VIRTIO support will still be
> > supported by the time Transitional MMR software and hardware becomes
> > available. Are you aiming for particular guest software versions?
> 
> Transitional MMR hardware is almost available now.
> Transitional MMR software is also a WIP, to be available as soon as we ratify the spec via tvq, via MMR, or both.
> This will support guest software versions such as the 2.6.32.754 kernel.

That is a RHEL 6 kernel. RHEL 6 entered Extended Life Cycle Support on
November 30, 2020 (https://access.redhat.com/articles/4665701).

I would use existing software emulation for really old guests, but if
you see an opportunity for hardware passthrough, then why not.

Stefan


^ permalink raw reply	[flat|nested] 200+ messages in thread

* [virtio-dev] Re: [PATCH 08/11] transport-pci: Introduce virtio extended capability
  2023-04-04 13:19         ` Cornelia Huck
@ 2023-04-04 14:37           ` Michael S. Tsirkin
  2023-04-10 16:21             ` Parav Pandit
  0 siblings, 1 reply; 200+ messages in thread
From: Michael S. Tsirkin @ 2023-04-04 14:37 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Parav Pandit, virtio-dev, virtio-comment, shahafs, Satananda Burla

On Tue, Apr 04, 2023 at 03:19:53PM +0200, Cornelia Huck wrote:
> >> > What does it "Co-developed-by" mean exactly that Signed-off-by
> >> > does not?
> >> 
> >> AIUI, "Co-developed-by:" is more akin to a second "Author:", i.e. to
> >> give attribution. (Linux kernel rules actually state that any
> >> "Co-developed-by:" must be followed by a "Signed-off-by:" for that
> >> author, but we don't really do DCO here, the S-o-b is more of a
> >> convention.)
> >
> >
> > Actually, we might want to generally, to signify agreement to the IPR.
> > How about adding this to our rules?  But in this case Satananda Burla is
> > a TC member so yes, no problem.
> 
> Adding a s-o-b requirement is not a bad idea... do you want to propose
> an update?

OK how about the following:

---------------
The process for forwarding comments from others:

Generally, subscribing to the virtio comments mailing list
requires agreement to the OASIS feedback license, at
https://www.oasis-open.org/who/ipr/feedback_license.pdf

When forwarding (in modified or unmodified form) comments from others, please
make sure the commenter has read and agreed to the
feedback license. If this is the case, please include the following:

License-consent: commenter's name <commenter's email>

where commenter's name <commenter's email> is a valid
name-addr as specified in RFC 5322, see
https://datatracker.ietf.org/doc/html/rfc5322#section-3.4

If the comment is on behalf of an organization, please use the
organization's email address.


If forwarding a comment from a TC member, please instead get consent to
the full Virtio TC IPR, and then, as before, include

License-consent: commenter's name <commenter's email>

If you are reusing a comment that has already been posted to the
TC mailing list, the above tags are not required.

---------------

We could reuse Signed-off-by, though I am a bit concerned whether
people will assume it's a DCO thing which everyone copies.
Thoughts?





-- 
MST


^ permalink raw reply	[flat|nested] 200+ messages in thread

* [virtio-dev] Re: [PATCH 07/11] transport-pci: Introduce transitional MMR device id
  2023-04-04  7:28   ` [virtio-dev] " Michael S. Tsirkin
@ 2023-04-04 16:08     ` Parav Pandit
  2023-04-07 12:03       ` [virtio-dev] Re: [virtio-comment] " Michael S. Tsirkin
  0 siblings, 1 reply; 200+ messages in thread
From: Parav Pandit @ 2023-04-04 16:08 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: virtio-dev, cohuck, virtio-comment, shahafs, Satananda Burla



On 4/4/2023 3:28 AM, Michael S. Tsirkin wrote:
> On Fri, Mar 31, 2023 at 01:58:30AM +0300, Parav Pandit wrote:
>> Transitional MMR device PCI Device IDs are unique. Hence,
>> none of the existing drivers bind to them.
>> This further maintains backward compatibility with
>> existing drivers.
>>
>> Co-developed-by: Satananda Burla <sburla@marvell.com>
>> Signed-off-by: Parav Pandit <parav@nvidia.com>
> 
> I took a fresh look at it, and I don't get it: what exactly is wrong
> with just using modern ID? Why do we need new ones?

A modern (non-transitional) device does not support legacy functionality
such as the legacy virtio net hdr or the legacy device reset sequence.

^ permalink raw reply	[flat|nested] 200+ messages in thread

* [virtio-dev] RE: [PATCH 08/11] transport-pci: Introduce virtio extended capability
  2023-04-04  7:35   ` [virtio-dev] " Michael S. Tsirkin
  2023-04-04  7:54     ` Cornelia Huck
@ 2023-04-04 21:18     ` Parav Pandit
  2023-04-05  5:10       ` [virtio-dev] " Michael S. Tsirkin
  1 sibling, 1 reply; 200+ messages in thread
From: Parav Pandit @ 2023-04-04 21:18 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: virtio-dev, cohuck, virtio-comment, Shahaf Shuler, Satananda Burla



> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Tuesday, April 4, 2023 3:35 AM

> > +Virtio extended PCI Express capability structure defines the location
> > +of certain virtio device configuration related structures using PCI
> > +Express extended capability. Virtio extended PCI Express capability
> > +structure uses PCI Express vendor specific extended capability
> > +(VSEC). It has a below
> 
> a layout below, or the following layout
> 
Yes, somehow it got trimmed.
Will fix it.

> > +layout:
> > +
> > +\begin{lstlisting}
> > +struct pcie_ext_cap {
> > +        le16 cap_vendor_id; /* Generic PCI field: 0xB */
> > +        le16 cap_version : 2; /* Generic PCI field: 0 */
> > +        le16 next_cap_offset : 14; /* Generic PCI field: next cap or
> > +0 */ };
> > +
> > +struct virtio_pcie_ext_cap {
> > +        struct pcie_ext_cap pcie_ecap;
> > +        u8 cfg_type; /* Identifies the structure. */
> > +        u8 bar; /* Index of the BAR where its located */
> > +        u8 id; /* Multiple capabilities of the same type */
> > +        u8 zero_padding[1];
> > +        le64 offset; /* Offset with the bar */
> > +        le64 length; /* Length of the structure, in bytes. */
> > +        u8 data[]; /* Optional variable length data */
> 
> Maybe le64 data[], for alignment?
> 
It gets harder to decode (typecasting ...) if it is a string with an le64 data type.
I will extend the comment, 

+        u8 data[]; /* Optional variable length data, must be aligned to 8 bytes */

^ permalink raw reply	[flat|nested] 200+ messages in thread

* [virtio-dev] Re: [PATCH 08/11] transport-pci: Introduce virtio extended capability
  2023-04-04 21:18     ` [virtio-dev] " Parav Pandit
@ 2023-04-05  5:10       ` Michael S. Tsirkin
  2023-04-05 13:16         ` [virtio-dev] " Parav Pandit
  0 siblings, 1 reply; 200+ messages in thread
From: Michael S. Tsirkin @ 2023-04-05  5:10 UTC (permalink / raw)
  To: Parav Pandit
  Cc: virtio-dev, cohuck, virtio-comment, Shahaf Shuler, Satananda Burla

On Tue, Apr 04, 2023 at 09:18:53PM +0000, Parav Pandit wrote:
> 
> 
> > From: Michael S. Tsirkin <mst@redhat.com>
> > Sent: Tuesday, April 4, 2023 3:35 AM
> 
> > > +Virtio extended PCI Express capability structure defines the location
> > > +of certain virtio device configuration related structures using PCI
> > > +Express extended capability. Virtio extended PCI Express capability
> > > +structure uses PCI Express vendor specific extended capability
> > > +(VSEC). It has a below
> > 
> > a layout below, or the following layout
> > 
> Yes, somehow it got trimmed.
> Will fix it.
> 
> > > +layout:
> > > +
> > > +\begin{lstlisting}
> > > +struct pcie_ext_cap {
> > > +        le16 cap_vendor_id; /* Generic PCI field: 0xB */
> > > +        le16 cap_version : 2; /* Generic PCI field: 0 */
> > > +        le16 next_cap_offset : 14; /* Generic PCI field: next cap or
> > > +0 */ };
> > > +
> > > +struct virtio_pcie_ext_cap {
> > > +        struct pcie_ext_cap pcie_ecap;
> > > +        u8 cfg_type; /* Identifies the structure. */
> > > +        u8 bar; /* Index of the BAR where its located */
> > > +        u8 id; /* Multiple capabilities of the same type */
> > > +        u8 zero_padding[1];
> > > +        le64 offset; /* Offset with the bar */
> > > +        le64 length; /* Length of the structure, in bytes. */
> > > +        u8 data[]; /* Optional variable length data */
> > 
> > Maybe le64 data[], for alignment?
> > 
> It gets harder to decode (typecasting ...) if it is a string with an le64 data type.

In what language? In C you have to cast anyway, string is char *, often
signed, not u8.

> I will extend the comment, 
> 
> +        u8 data[]; /* Optional variable length data, must be aligned to 8 bytes */

I'd keep it le64 or u64, it is highly unlikely we'll pass strings through
this interface anyway.

-- 
MST


^ permalink raw reply	[flat|nested] 200+ messages in thread

* [virtio-dev] RE: [PATCH 08/11] transport-pci: Introduce virtio extended capability
  2023-04-05  5:10       ` [virtio-dev] " Michael S. Tsirkin
@ 2023-04-05 13:16         ` Parav Pandit
  2023-04-07  8:15           ` [virtio-dev] " Michael S. Tsirkin
  0 siblings, 1 reply; 200+ messages in thread
From: Parav Pandit @ 2023-04-05 13:16 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: virtio-dev, cohuck, virtio-comment, Shahaf Shuler, Satananda Burla



> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Wednesday, April 5, 2023 1:11 AM

> > > > +struct virtio_pcie_ext_cap {
> > > > +        struct pcie_ext_cap pcie_ecap;
> > > > +        u8 cfg_type; /* Identifies the structure. */
> > > > +        u8 bar; /* Index of the BAR where it's located */
> > > > +        u8 id; /* Multiple capabilities of the same type */
> > > > +        u8 zero_padding[1];
> > > > +        le64 offset; /* Offset within the bar */
> > > > +        le64 length; /* Length of the structure, in bytes. */
> > > > +        u8 data[]; /* Optional variable length data */
> > >
> > > Maybe le64 data[], for alignment?
> > >
> > It gets harder to decode (typecasting, etc.) if it's a string with an le64 data type.
> 
> In what language? In C you have to cast anyway, string is char *, often signed,
> not u8.
> 
> > I will extend the comment:
> >
> > +        u8 data[]; /* Optional variable length data, must be aligned
> > + to 8 bytes */
> 
> I'd keep it le64 or u64, it is highly unlikely we'll pass strings through this
> interface anyway.

Ok, will change.

What about the rest of the patches? If we proceed using the MMR interface, are the rest of the patches fine?


* [virtio-dev] Re: [PATCH 08/11] transport-pci: Introduce virtio extended capability
  2023-04-05 13:16         ` [virtio-dev] " Parav Pandit
@ 2023-04-07  8:15           ` Michael S. Tsirkin
  0 siblings, 0 replies; 200+ messages in thread
From: Michael S. Tsirkin @ 2023-04-07  8:15 UTC (permalink / raw)
  To: Parav Pandit
  Cc: virtio-dev, cohuck, virtio-comment, Shahaf Shuler, Satananda Burla

On Wed, Apr 05, 2023 at 01:16:31PM +0000, Parav Pandit wrote:
> 
> 
> > From: Michael S. Tsirkin <mst@redhat.com>
> > Sent: Wednesday, April 5, 2023 1:11 AM
> 
> > > > > +struct virtio_pcie_ext_cap {
> > > > > +        struct pcie_ext_cap pcie_ecap;
> > > > > +        u8 cfg_type; /* Identifies the structure. */
> > > > > +        u8 bar; /* Index of the BAR where it's located */
> > > > > +        u8 id; /* Multiple capabilities of the same type */
> > > > > +        u8 zero_padding[1];
> > > > > +        le64 offset; /* Offset within the bar */
> > > > > +        le64 length; /* Length of the structure, in bytes. */
> > > > > +        u8 data[]; /* Optional variable length data */
> > > >
> > > > Maybe le64 data[], for alignment?
> > > >
> > > It gets harder to decode (typecasting, etc.) if it's a string with an le64 data type.
> > 
> > In what language? In C you have to cast anyway, string is char *, often signed,
> > not u8.
> > 
> > > I will extend the comment:
> > >
> > > +        u8 data[]; /* Optional variable length data, must be aligned
> > > + to 8 bytes */
> > 
> > I'd keep it le64 or u64, it is highly unlikely we'll pass strings through this
> > interface anyway.
> 
> Ok, will change.
> 
> What about the rest of the patches? If we proceed using the MMR interface, are the rest of the patches fine?

The biggest problem is 7/11 - the new IDs, breaking all existing drivers.

I thought I replied on that but don't see it on that specific patch.
Let me repost my thoughts now that I've had time to think it over.

-- 
MST



* [virtio-dev] Re: [PATCH 07/11] transport-pci: Introduce transitional MMR device id
  2023-03-30 22:58 ` [virtio-dev] [PATCH 07/11] transport-pci: Introduce transitional MMR device id Parav Pandit
  2023-04-04  7:28   ` [virtio-dev] " Michael S. Tsirkin
@ 2023-04-07  8:37   ` Michael S. Tsirkin
  1 sibling, 0 replies; 200+ messages in thread
From: Michael S. Tsirkin @ 2023-04-07  8:37 UTC (permalink / raw)
  To: Parav Pandit; +Cc: virtio-dev, cohuck, virtio-comment, shahafs, Satananda Burla

On Fri, Mar 31, 2023 at 01:58:30AM +0300, Parav Pandit wrote:
> Transitional MMR device PCI Device IDs are unique. Hence,
> none of the existing drivers bind to it.
> This further maintains backward compatibility with
> existing drivers.
> 
> Co-developed-by: Satananda Burla <sburla@marvell.com>
> Signed-off-by: Parav Pandit <parav@nvidia.com>

This is IMO just too big a change: fundamentally this is a completely
new transport since no existing drivers can use this device at all. Not
a thing we should do lightly. And we have options on the table
such as AQ, which address most of the issues. Yes, they do not
work for passthrough of a PF, but that seems like a minor issue.
If we want to save the MMR idea, let's find a way to cut
down this patchset's size significantly.


On a more positive side, I do not really see a reason to
have a new ID at all. Device can have a normal device ID
and additionally the new capability.

> ---
>  transport-pci.tex | 45 +++++++++++++++++++++++++++++++++++++++++----
>  1 file changed, 41 insertions(+), 4 deletions(-)
> 
> diff --git a/transport-pci.tex b/transport-pci.tex
> index ee11ba5..665448e 100644
> --- a/transport-pci.tex
> +++ b/transport-pci.tex
> @@ -19,12 +19,14 @@ \section{Virtio Over PCI Bus}\label{sec:Virtio Transport Options / Virtio Over P
>  \subsection{PCI Device Discovery}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI Device Discovery}
>  
>  Any PCI device with PCI Vendor ID 0x1af4, and PCI Device ID 0x1000 through
> -0x107f inclusive is a virtio device. The actual value within this range
> -indicates which virtio device is supported by the device.
> +0x107f inclusive and DeviceID 0x10f9 through 0x10ff is a virtio device.
> +The actual value within this range indicates which virtio device
> +type it is.
>  The PCI Device ID is calculated by adding 0x1040 to the Virtio Device ID,
>  as indicated in section \ref{sec:Device Types}.
> -Additionally, devices MAY utilize a Transitional PCI Device ID range,
> -0x1000 to 0x103f depending on the device type.
> +Additionally, devices MAY utilize a Transitional PCI Device ID range
> +0x1000 to 0x103f inclusive or a Transitional MMR PCI Device ID range
> +0x10f9 to 0x10ff inclusive, depending on the device type.
>  
>  \devicenormative{\subsubsection}{PCI Device Discovery}{Virtio Transport Options / Virtio Over PCI Bus / PCI Device Discovery}
>  
> @@ -95,6 +97,41 @@ \subsubsection{Legacy Interfaces: A Note on PCI Device Discovery}\label{sec:Virt
>  
>  This is to match legacy drivers.
>  
> +\subsubsection{Transitional MMR Interface: A Note on PCI Device
> +Discovery}\label{sec:Virtio Transport Options / Virtio Over PCI
> +Bus / PCI Device Discovery / Transitional MMR Interface: A Note on PCI Device Discovery}
> +
> +The transitional MMR device has one of the following PCI Device ID
> +depending on the device type:
> +
> +\begin{tabular}{|l|c|}
> +\hline
> +Transitional PCI Device ID  &  Virtio Device    \\
> +\hline \hline
> +0x10f9      &   network device     \\
> +\hline
> +0x10fa     &   block device     \\
> +\hline
> +0x10fb     & memory ballooning (traditional)  \\
> +\hline
> +0x10fc     &      console       \\
> +\hline
> +0x10fd     &     SCSI host      \\
> +\hline
> +0x10fe     &  entropy source    \\
> +\hline
> +0x10ff     &   9P transport     \\
> +\hline
> +\end{tabular}
> +
> +The PCI Subsystem Vendor ID and the PCI Subsystem Device ID MAY
> +reflect the PCI Vendor and Device ID of the environment.
> +
> +The transitional MMR driver MUST match any PCI Revision ID value.
> +
> +The transitional MMR driver MAY match any PCI Subsystem Vendor ID and
> +any PCI Subsystem Device ID value.
> +
>  \subsection{PCI Device Layout}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout}
>  
>  The device is configured via I/O and/or memory regions (though see
> -- 
> 2.26.2



* [virtio-dev] Re: [PATCH 09/11] transport-pci: Describe PCI MMR dev config registers
  2023-03-30 22:58 ` [virtio-dev] [PATCH 09/11] transport-pci: Describe PCI MMR dev config registers Parav Pandit
@ 2023-04-07  8:55   ` Michael S. Tsirkin
  2023-04-10  1:33     ` [virtio-dev] Re: [virtio-comment] " Jason Wang
  2023-04-12  4:33   ` [virtio-dev] " Michael S. Tsirkin
  1 sibling, 1 reply; 200+ messages in thread
From: Michael S. Tsirkin @ 2023-04-07  8:55 UTC (permalink / raw)
  To: Parav Pandit; +Cc: virtio-dev, cohuck, virtio-comment, shahafs, Satananda Burla

On Fri, Mar 31, 2023 at 01:58:32AM +0300, Parav Pandit wrote:
> Legacy virtio configuration registers and adjacent
> device configuration registers are located somewhere
> in a memory BAR.
> 
> A new capability supplies the location of these registers
> which a driver can use to map I/O access to legacy
> memory mapped registers.
> 
> This gives the ability to locate legacy registers in either
> the existing memory BAR or in a completely new BAR at BAR 0.
>
> The example diagram below attempts to depict it in an existing
> memory BAR.
> 
> +------------------------------+
> |Transitional                  |
> |MMR SRIOV VF                  |
> |                              |
> ++---------------+             |
> ||dev_id =       |             |
> ||{0x10f9-0x10ff}|             |
> |+---------------+             |
> |                              |
> ++--------------------+        |
> || PCIe ext cap = 0xB |        |
> || cfg_type = 10      |        |
> || offset   = 0x1000  |        |
> || bar      = A {0..5}|        |
> |+--|-----------------+        |
> |   |                          |
> |   |                          |
> |   |    +-------------------+ |
> |   |    | Memory BAR = A    | |
> |   |    |                   | |
> |   +------>+--------------+ | |
> |        |  |legacy virtio | | |
> |        |  |+ dev cfg     | | |
> |        |  |registers     | | |
> |        |  +--------------+ | |
> |        +-----------------+ | |
> +------------------------------+
> 
> Co-developed-by: Satananda Burla <sburla@marvell.com>
> Signed-off-by: Parav Pandit <parav@nvidia.com>


I am split about using the extended capability for this, since in
practice this makes for more code in hypervisors.  How about just using
an existing capability as opposed to the extended capability? Less work
for existing hypervisors, no? And let's begin to use the extended
capability for something more important than legacy access.
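
For a sense of scale, here is a minimal sketch of the extra hypervisor
code in question: walking the PCIe extended capability list for a
vendor-specific capability. The helper name and the read_cfg32() accessor
are hypothetical; the header layout (16-bit ID, 4-bit version, 12-bit
next pointer, list starting at config offset 0x100) is the standard PCIe
one.

#include <stdint.h>

#define PCI_EXT_CAP_START   0x100
#define PCI_EXT_CAP_ID_VNDR 0x000B  /* vendor-specific extended capability */

static uint16_t find_vsec(uint32_t (*read_cfg32)(uint16_t off))
{
        uint16_t off = PCI_EXT_CAP_START;

        /* bounded walk in case of a malformed capability list */
        for (int i = 0; off != 0 && i < 256; i++) {
                uint32_t hdr  = read_cfg32(off);
                uint16_t id   = hdr & 0xffff;        /* bits 15:0: cap ID */
                uint16_t next = (hdr >> 20) & 0xffc; /* bits 31:20: next ptr */

                if (id == PCI_EXT_CAP_ID_VNDR)
                        return off;
                off = next;
        }
        return 0; /* not found */
}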




> ---
>  transport-pci.tex | 33 +++++++++++++++++++++++++++++++--
>  1 file changed, 31 insertions(+), 2 deletions(-)
> 
> diff --git a/transport-pci.tex b/transport-pci.tex
> index aeda4a1..55a6aa0 100644
> --- a/transport-pci.tex
> +++ b/transport-pci.tex
> @@ -168,6 +168,7 @@ \subsection{Virtio Structure PCI Capabilities}\label{sec:Virtio Transport Option
>  \item ISR Status
>  \item Device-specific configuration (optional)
>  \item PCI configuration access
> +\item Legacy memory mapped configuration registers (optional)
>  \end{itemize}
>  
>  Each structure can be mapped by a Base Address register (BAR) belonging to
> @@ -228,6 +229,8 @@ \subsection{Virtio Structure PCI Capabilities}\label{sec:Virtio Transport Option
>  #define VIRTIO_PCI_CAP_SHARED_MEMORY_CFG 8
>  /* Vendor-specific data */
>  #define VIRTIO_PCI_CAP_VENDOR_CFG        9
> +/* Legacy configuration registers capability */
> +#define VIRTIO_PCI_CAP_LEGACY_MMR_CFG    10
>  \end{lstlisting}
>  
>          Any other value is reserved for future use.
> @@ -682,6 +685,18 @@ \subsubsection{Common configuration structure layout}\label{sec:Virtio Transport
>  Configuration Space / Legacy Interface: Device Configuration
>  Space}~\nameref{sec:Basic Facilities of a Virtio Device / Device Configuration Space / Legacy Interface: Device Configuration Space} for workarounds.
>  
> +\paragraph{Transitional MMR Interface: A Note on Configuration Registers}
> +\label{sec:Virtio Transport Options / Virtio Over PCI Bus / Virtio Structure PCI Capabilities / Common configuration structure layout / Transitional MMR Interface: A Note on Configuration Registers}
> +
> +The transitional MMR device MUST present legacy virtio registers
> +consisting of legacy common configuration registers followed by
> +legacy device specific configuration registers described in section
> +\ref{sec:Virtio Transport Options / Virtio Over PCI Bus / Virtio Structure PCI Capabilities / Common configuration structure layout / Legacy Interfaces: A Note on Configuration Registers}
> +in a memory region PCI BAR.
> +
> +The transitional MMR device MUST provide the location of the
> +legacy virtio configuration registers using a legacy memory mapped
> +registers capability described in section \ref{sec:Virtio Transport Options / Virtio Over PCI Bus / Virtio Structure PCI Capabilities / Transitional MMR Interface: Legacy Memory Mapped Configuration Registers Capability}.
>  
>  \subsubsection{Notification structure layout}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Notification capability}
>  
> @@ -956,9 +971,23 @@ \subsubsection{PCI configuration access capability}\label{sec:Virtio Transport O
>  specified by some other Virtio Structure PCI Capability
>  of type other than \field{VIRTIO_PCI_CAP_PCI_CFG}.
>  
> +\subsubsection{Transitional MMR Interface: Legacy Memory Mapped Configuration Registers Capability}
> +\label{sec:Virtio Transport Options / Virtio Over PCI Bus / Virtio Structure PCI Capabilities / Transitional MMR Interface: Legacy Memory Mapped Configuration Registers Capability}
> +
> +The optional VIRTIO_PCI_CAP_LEGACY_MMR_CFG capability defines
> +the location of the legacy virtio configuration registers
> +followed by legacy device specific configuration registers in
> +the memory region BAR for the transitional MMR device.
> +
> +The \field{cap.offset} MUST be 4-byte aligned.
> +The \field{cap.offset} SHOULD be 4KBytes aligned and

what's the point of this?

> +\field{cap.length} SHOULD be 4KBytes.

Why is length 4KBytes? Why not the actual length?


> +
> +The transitional MMR device MUST present a legacy configuration
> +memory mapped registers capability using \field{virtio_pcie_ext_cap}.
> +
>  \subsubsection{Legacy Interface: A Note on Feature Bits}
> -\label{sec:Virtio Transport Options / Virtio Over PCI Bus /
> -Virtio Structure PCI Capabilities / Legacy Interface: A Note on Feature Bits}
> +\label{sec:Virtio Transport Options / Virtio Over PCI Bus / Virtio Structure PCI Capabilities / Legacy Interface: A Note on Feature Bits}
>


So it is not by chance that we abandoned the legacy interface;
it had some fundamental issues.
Let me list some off the top of my head:
- no atomicity across accesses, so if a register changes
  while it is read the driver gets trash; solved using a generation id
- no defined endian-ness; solved using le
- no way to reject driver configuration

we just did a new thing instead, but I feel if you are reviving the
legacy interface yet again, it is worth thinking about solving the worst
of the warts. For example, I can see an endian-ness register that
hypervisor can either read to check device is compatible with guest, or
maybe even write to force device to specific endian-ness.
A generation counter that hypervisor can check to
verify value is consistent? Can work if hypervisor
caches configuration.
A bad state flag that device can set to make hypervisor
stop guest? Better than corrupting it ...
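
A hypothetical sketch of the generation-counter idea, in the spirit of the
1.x config_generation register (helper names are assumed): retry the read
until the counter is stable around it.

#include <stdint.h>

static uint32_t read_cfg_consistent(uint8_t (*read_gen)(void),
                                    uint32_t (*read_cfg)(void))
{
        uint8_t before, after;
        uint32_t val;

        do {
                before = read_gen();
                val = read_cfg();   /* possibly a multi-register read */
                after = read_gen();
        } while (before != after);
        return val;
}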

>  Only Feature Bits 0 to 31 are accessible through the
>  Legacy Interface. When used through the Legacy Interface,
> -- 
> 2.26.2



* [virtio-dev] Re: [PATCH 06/11] introduction: Introduce transitional MMR interface
  2023-03-30 22:58 ` [virtio-dev] [PATCH 06/11] introduction: Introduce transitional MMR interface Parav Pandit
@ 2023-04-07  9:17   ` Michael S. Tsirkin
  0 siblings, 0 replies; 200+ messages in thread
From: Michael S. Tsirkin @ 2023-04-07  9:17 UTC (permalink / raw)
  To: Parav Pandit; +Cc: virtio-dev, cohuck, virtio-comment, shahafs, Satananda Burla

On Fri, Mar 31, 2023 at 01:58:29AM +0300, Parav Pandit wrote:
> Introduce terminology for the transitional MMR device and transitional
> MMR driver.
> 
> Add description of the transitional MMR device. It is a PCI
> device that implements legacy virtio common configuration registers
> followed by legacy device specific registers in a memory region at
> an offset.
> 
> This enables a hypervisor, such as the vfio driver, to emulate an
> I/O region towards the guest at BAR0. By doing so, the VFIO driver can
> translate read/write accesses on the I/O region from the guest
> to the device memory region.
> 
> High-level comparison of 1.x, transitional & transitional MMR
> SRIOV VF devices:
> 
> +------------------+ +--------------------+ +--------------------+
> |virtio 1.x        | |Transitional        | |Transitional        |
> |SRIOV VF          | |SRIOV VF            | |MMR SRIOV VF        |
> |                  | |                    | |                    |
> ++---------------+ | ++---------------+   | ++---------------+   |
> ||dev_id =       | | ||dev_id =       |   | ||dev_id =       |   |
> ||{0x1040-0x106C}| | ||{0x1000-0x103f}|   | ||{0x10f9-0x10ff}|   |
> |+---------------+ | |+---------------+   | |+---------------+   |
> |                  | |                    | |                    |
> |+------------+    | |+------------+      | |+-----------------+ |
> ||Memory BAR  |    | ||Memory BAR  |      | ||Memory BAR       | |
> |+------------+    | |+------------+      | ||                 | |
> |                  | |                    | || +--------------+| |
> |                  | |+-----------------+ | || |legacy virtio || |
> |                  | ||IOBAR impossible | | || |+ dev cfg     || |
> |                  | |+-----------------+ | || |registers     || |
> |                  | |                    | || +--------------+| |
> |                  | |                    | |+-----------------+ |
> +------------------+ +--------------------+ +--------------------+
> 
> Motivation and background:
> PCIe and system limitations:
> 1. PCIe VFs do not support IOBAR cited at [1].
> 
> Perhaps the PCIe spec could be extended; however, it would only be
> useful for virtio transitional devices. Even if such an extension
> is present, there are other system limitations described below in (2)
> and (3).
> 
> 2. cpu io port space limit and fragmentation
> x86_64 is limited to only 64K worth of IO port space at [2],
> which is shared with many other onboard system peripherals that
> are behind a PCIe bridge; such an I/O region also needs to be aligned
> to 4KB at the PCIe bridge level, cited at [3]. This can lead to I/O space
> fragmentation. Due to this fragmentation and alignment need, the
> actual usable range is small.
> 
> 3. IO space access of a PCI device is done through non-posted messages,
>  which require a higher completion time in the PCIe fabric for the
> round trip.
> 
> [1] PCIe spec citation:
> VFs do not support I/O Space and thus VF BARs shall not indicate I/O Space.
> 
> [2] cpu arch citation:
> Intel 64 and IA-32 Architectures Software Developer’s Manual
> The processor’s I/O address space is separate and distinct from
> the physical-memory address space. The I/O address space consists
> of 64K individually addressable 8-bit I/O ports, numbered 0 through FFFFH.
> 
> [3] PCIe spec citation:
> If a bridge implements an I/O address range,...I/O address range
> will be aligned to a 4 KB boundary.
> 
> Co-developed-by: Satananda Burla <sburla@marvell.com>
> Signed-off-by: Parav Pandit <parav@nvidia.com>
> ---
>  introduction.tex | 30 ++++++++++++++++++++++++++++++
>  1 file changed, 30 insertions(+)
> 
> diff --git a/introduction.tex b/introduction.tex
> index e8b34e3..9a0f96a 100644
> --- a/introduction.tex
> +++ b/introduction.tex
> @@ -161,6 +161,20 @@ \subsection{Legacy Interface: Terminology}\label{intro:Legacy
>    have a need for backwards compatibility!
>  \end{note}
>  
> +\begin{description}
> +\item[Transitional MMR Device]
> +       is a PCI device which exposes legacy virtio configuration
> +       registers followed by legacy device configuration registers as
> +       memory mapped registers (MMR) at an offset in a memory region
> +       BAR, has no I/O region BAR,

in fact, most of the existing pci interface can be in either an IO or a
memory bar; I don't think we specify that it's in a memory bar in the spec.
For example, IIRC qemu has a flag that uses IO for signalling
kicks for modern VQs; this is slightly faster on some architectures.


> having its own PCI Device ID range,
> +       and follows the rest of the functionalities


this sentence isn't grammatical.
not sure what exactly you are trying to say here, cannot
help you rewrite.

> of the transitional device.

a transitional device probably.

> +\end{description}
> +
> +\begin{description}
> +\item[Transitional MMR Driver]
> +       is a PCI device driver that supports the Transitional MMR device.
> +\end{description}
> +
>  Devices or drivers with no legacy compatibility are referred to as
>  non-transitional devices and drivers, respectively.
>  
> @@ -174,6 +188,22 @@ \subsection{Transition from earlier specification drafts}\label{sec:Transition f
>  sections tagged "Legacy Interface" in the section title.
>  These highlight the changes made since the earlier drafts.
>  
> +\subsection{Transitional MMR interface: specification drafts}\label{sec:Transitional MMR interface: specification drafts}
> +
> +The transitional MMR device and driver differs from the
> +transitional device and driver respectively in few areas. Such

a few

> +differences are contained in sections named
> +'Transitional MMR interface', like this one. When no differences
> +are mentioned explicitly, the transitional MMR device and driver
> +follow exactly the same functionalities as that of the
> +transitional device and driver respectively.

Ugh, we called these transitional because they were there for
the transition period :)

The thing that I feel you miss here is that
a transitional driver using a transitional device MUST NOT
use the legacy interface.

The new thing with this MMR is that it's a memory mapped
access to legacy registers.


Going back to my idea of just adding a legacy MMR capability to existing
modern and transitional devices, as opposed to a completely new type of
device, we would basically have a modern driver access the new
capability, and forward accesses from a legacy driver to there.
So "memory mapped legacy interface" would be a better name I think.




> +
> +\begin{note}
> +Transitional MMR interface is only required to support backward
> +compatibility. It should not be implemented unless there is a need
> +for the backward compatibility.
> +\end{note}
> +
>  \section{Structure Specifications}\label{sec:Structure Specifications}
>  
>  Many device and driver in-memory structure layouts are documented using
> -- 
> 2.26.2



* [virtio-dev] Re: [virtio-comment] Re: [PATCH 00/11] Introduce transitional mmr pci device
  2023-04-03 22:00                           ` Parav Pandit
@ 2023-04-07  9:35                             ` Michael S. Tsirkin
  2023-04-10  1:52                               ` Jason Wang
  0 siblings, 1 reply; 200+ messages in thread
From: Michael S. Tsirkin @ 2023-04-07  9:35 UTC (permalink / raw)
  To: Parav Pandit; +Cc: virtio-dev, cohuck, virtio-comment, Shahaf Shuler

On Mon, Apr 03, 2023 at 06:00:13PM -0400, Parav Pandit wrote:
> 
> 
> On 4/3/2023 5:04 PM, Michael S. Tsirkin wrote:
> > On Mon, Apr 03, 2023 at 08:25:02PM +0000, Parav Pandit wrote:
> > > 
> > > > From: Michael S. Tsirkin <mst@redhat.com>
> > > > Sent: Monday, April 3, 2023 2:02 PM
> > > 
> > > > > Because vqs involve DMA operations.
> > > > > It is left to the device implementation to do it, but the generic wisdom
> > > > > is to not implement such slow work in the data path engines.
> > > > > So such register access vqs can/may be through firmware.
> > > > > Hence it can involve a lot higher latency.
> > > > 
> > > > Then that wisdom is wrong? Tens of microseconds is not workable even for
> > > > ethtool operations; you are killing boot time.
> > > > 
> > > Huh.
> > > What ethtool latencies have you experienced? Number?
> > 
> > I know on the order of tens of ethtool calls happen during boot.
> > If, as you said, each takes tens of ms then we are talking close to a second.
> > That is measurable.
> I said it can take that long; it doesn't always have to be the same for all the commands.
> Better to work with real numbers. :)
> 
> Let me take an example to walk through.
> 
> If a cvq or aq command takes 0.5 msec, a total of 100 such commands will take
> 50 msec.
>
> Once in a while, if two of the commands take, say, 5 msec each, the result goes
> from 50 msec to roughly 60 msec (98 x 0.5 msec + 2 x 5 msec = 59 msec).

Not too bad. then it seems it should not be a problem to tunnel config
over AQ then?


> 
> > OK then. If it is a dead end, then it looks weird to add a whole new
> > config space as memory mapped.
> > 
> I am aligned with you to not add any new register as memory mapped for 1.x.
> Or access through the device's own tvq is fine, if such a q can be initialized
> during the device reset (init) phase.
> 
> I explained that the legacy registers are a sub-set of the existing 1.x ones.
> They should not consume extra memory.
> 
> Let's walk through the merits and negatives of both to conclude.
> 
> > > > Let me try again.
> 
> > If hardware vendors do not want to bear the costs of registers then they
> > will not implement devices with registers, and then the whole thing will
> > become yet another legacy thing we need to support. If legacy emulation
> > without IO is useful, then can we not find a way to do it that will
> > survive the test of time?
> legacy_register_transport_vq for a VF can be an option, but not for PF
> emulation.

OK. Do we really care? Are you guys selling lots of high-end cards
without SRIOV, such that it matters?

> More below.
> 
> > 
> > > Again, I want to emphasize that register read/write over tvq has merits with trade-offs.
> > > And so the mmr has merits with trade-offs too.
> > > 
> > > Better to list them and proceed forward.
> > > 
> > > Method-1: VF's register read/write via PF based transport VQ
> > > Pros:
> > > a. Light-weight register implementation in the device compared to a new memory region window
> > 
> > Is that all? I mentioned more.
> > 
> b. device reset is more optimal with a transport VQ
> c. a hypervisor may want to check (though it is not necessary) register content
> d. Some unknown guest VM driver which modifies the mac address and still expects
> atomicity can benefit if the hypervisor wants to do extra checks

It's not hard to be more specific.
Old Linux kernels are like this; this was fixed with:

commit 7e58d5aea8abb993983a3f3088fd4a3f06180a1c
Author: Amos Kong <akong@redhat.com>
Date:   Mon Jan 21 01:17:23 2013 +0000

Currently we write MAC address to pci config space byte by byte,
this means that we have an intermediate step where mac is wrong.
This patch introduced a new control command to set MAC address,
it's atomic.

about 10 years ago.
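
To illustrate the race that commit describes (hypothetical helper, not
the actual QEMU or kernel code): writing the 6-byte MAC one byte at a
time leaves a window in which the device sees a mix of old and new
addresses.

#include <stdint.h>

/* cfg_write8 is an assumed single-byte config-space write primitive */
static void set_mac_bytewise(void (*cfg_write8)(uint32_t off, uint8_t val),
                             uint32_t mac_off, const uint8_t mac[6])
{
        for (int i = 0; i < 6; i++)
                cfg_write8(mac_off + i, mac[i]); /* intermediate MACs visible */
}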


> > > Cons:
> > > a. Higher DMA read/write latency
> > > b. Device requires synchronization between non-legacy memory mapped registers and legacy regs accessed via tvq
> > 
> > Same as a separate memory bar really.  Just don't do it. Either access
> > legacy or non-legacy.
> > 
> It is really not the same to treat them equally, as the tvq encapsulation is
> different, and hw wouldn't prefer to treat them like regular memory
> writes.


I think you misunderstand what I said. You listed a problem:
the same device can be accessed through both
a modern and a legacy interface.
I said that it is not a problem at all; there is no reason
to use both.

> The transitional device exposed by the hypervisor contains both the legacy I/O bar
> and the memory mapped registers. So a guest VM can access both.

But it must not, and some devices break if you do.


> > > c. Can only work with the VF. Cannot work for a thin hypervisor, which can map a transitional PF to a bare-metal OS
> > > (also listed in cover letter)
> > 
> > Is that a significant limitation? Why?
It is a functional limitation for the PF, as the PF has no parent,
and the PF can also utilize a memory BAR.

Yes, it's a limitation; I just don't see why we care.

> > 
> > > Method-2: VF's register read/write via MMR (current proposal)
> > > Pros:
> > > a. Device utilizes the same legacy and non-legacy registers.
> > 
> > > b. an order of magnitude lower latency due to avoidance of DMA on register accesses
> > > (Important but not critical)
> > 
And no cons? Even if you could not see them yourself, did I fail to express myself to such
an extent?
> > 
Method-1's pros covered its advantages over method-2, but yes, they are worth
listing here for completeness.
> 
> Cons:
> requires creating a new memory region window in the device for configuration
> access

Parav, please take a look at the discussion so far and collect more cons
that were mentioned for the proposal; I definitely listed some and I
don't really want to repeat myself.  I expect a proposal to be balanced,
not a sales pitch.


> > > > > No. Interrupt latency is in usec range.
> > > > > The major latency contributors in msec range can arise from the device side.
> > > > 
> > > > So you are saying there are devices out there already with this MMR hack
> > > > baked in, and in hardware not firmware, so it works reasonably?
> > > It is better to not label a solution a "hack",
> > 
> > Sorry if that sounded offensive.  A hack is not necessarily a bad thing.
> > It's a quick solution to a very local problem, though.
> > 
> It is a solution because the device can do it at near-zero extra memory cost for
> the existing registers.
> Anyway, we have better technical details to resolve. :)
> Let's focus on those.
> 
> > Yes, motivation is one of the things I'm trying to work out here.
> > It does not help, however, that it's an 11-patch-strong patchset
> > adding 500 lines of text for what is supposedly a small change.
> > 
> Many of the patches are rework, and it is incorrect to attribute them to this specific
> feature.
> 
> Like others, it could have been one giant patch... but we see value in
> smaller patches.
> 
> Using tvq is an even bigger change than this.

The main thing is that there's no new ID so the PF device itself will
stay usable with existing drivers.

> So we shouldn't be afraid of
> making the transitional device actually work using it, even with a larger spec patch.
> 
> > Regarding tvq, I have some ideas on how to improve the register read/writes so that it's optimal for devices to implement.
> > 
> > Sounds useful, and maybe if tvq addresses the legacy need then focus on
> > that?
> > 
> 
> A tvq specific to legacy register access makes sense.
> A generic tvq is abstract, and I don't see any relation here.
>
> So better to name it legacy_reg_transport_vq (lrt_vq).

Again this assumes tvq will be rewritten on top of AQ.
I guess legacy can then become a new type of AQ command?

And maybe you want a memory mapped register for AQ commands? I know
Jason really wanted that.



> How about having the below format?
> 
> /* Format of 16B descriptors for lrt_vq
>  * lrt_vq = legacy register transport vq.
>  */
> struct legacy_reg_req_vf {
> 	union {
> 		struct {
> 			le32 reg_wr_data;
> 			le32 reserved;
> 		} write;
> 		struct {
> 			le64 reg_read_addr;
> 		};
> 	};
> 	le8 rd_wr : 1;	/* rd=0, wr=1 */
> 	le8 reg_byte_offset : 7;
> 	le8 req_tag;	/* unique request tag on this vq */
> 	le16 vf_num;
> 
> 	le16 flags; /* new flag below */
>         le16 next;
> };
> 
> #define VIRTQ_DESC_F_Q_DEFINED 8
> /* Content of the VQ descriptor other than flags field is VQ
>  * specific and defined by the VQ type.
>  */
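
For concreteness, a hypothetical sketch of how one such 16-byte descriptor
might be filled for a legacy register read of VF 5. The leN-to-host
typedefs, the helper, and the exact encoding are all assumptions for
illustration, not a ratified layout.

#include <stdint.h>
#include <string.h>

typedef uint8_t  le8;   /* illustration only; real leN are little-endian */
typedef uint16_t le16;
typedef uint32_t le32;
typedef uint64_t le64;

#define VIRTQ_DESC_F_Q_DEFINED 8

struct legacy_reg_req_vf {
        union {
                struct {
                        le32 reg_wr_data;
                        le32 reserved;
                } write;
                struct {
                        le64 reg_read_addr;
                };
        };
        le8  rd_wr : 1;            /* rd=0, wr=1 */
        le8  reg_byte_offset : 7;
        le8  req_tag;              /* unique request tag on this vq */
        le16 vf_num;
        le16 flags;
        le16 next;
};

static void fill_read_req(struct legacy_reg_req_vf *d, uint64_t dma_addr,
                          uint8_t reg_off, uint8_t tag, uint16_t vf)
{
        memset(d, 0, sizeof(*d));
        d->reg_read_addr   = dma_addr; /* where the device writes the value */
        d->rd_wr           = 0;        /* read */
        d->reg_byte_offset = reg_off;
        d->req_tag         = tag;
        d->vf_num          = vf;
        d->flags           = VIRTQ_DESC_F_Q_DEFINED;
}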

Any way to allow accesses of arbitrary length?

-- 
MST



* [virtio-dev] Re: [virtio-comment] Re: [PATCH 07/11] transport-pci: Introduce transitional MMR device id
  2023-04-04 16:08     ` Parav Pandit
@ 2023-04-07 12:03       ` Michael S. Tsirkin
  2023-04-07 15:18         ` [virtio-dev] " Parav Pandit
  0 siblings, 1 reply; 200+ messages in thread
From: Michael S. Tsirkin @ 2023-04-07 12:03 UTC (permalink / raw)
  To: Parav Pandit; +Cc: virtio-dev, cohuck, virtio-comment, shahafs, Satananda Burla

On Tue, Apr 04, 2023 at 12:08:54PM -0400, Parav Pandit wrote:
> 
> 
> On 4/4/2023 3:28 AM, Michael S. Tsirkin wrote:
> > On Fri, Mar 31, 2023 at 01:58:30AM +0300, Parav Pandit wrote:
> > > Transitional MMR device PCI Device IDs are unique. Hence,
> > > none of the existing drivers bind to it.
> > > This further maintains backward compatibility with
> > > existing drivers.
> > > 
> > > Co-developed-by: Satananda Burla <sburla@marvell.com>
> > > Signed-off-by: Parav Pandit <parav@nvidia.com>
> > 
> > I took a fresh look at it, and I don't get it: what exactly is wrong
> > with just using modern ID? Why do we need new ones?
> 
> A modern (non-transitional) device does not support legacy functionality such
> as the virtio net hdr,

It's all a question of terminology, but it is not worth
sacrificing functionality for cleaner terminology.
Basically most of the spec just talks about the legacy interface,
and will work fine with this.

Yes we do say:
Devices or drivers with no legacy compatibility are referred to as
non-transitional devices and drivers, respectively.


So we will want to refine this somewhat. Maybe:

	Devices not compatible with legacy drivers and drivers not compatible
	with legacy devices are referred to as non-transitional devices and
	drivers, respectively.


This allows non-transitional devices to expose the legacy
capability - having this capability does not make them
compatible with legacy drivers.

Similarly in the conformance section:

An implementation MAY choose to implement OPTIONAL support for the
legacy interface, including support for legacy drivers
or devices, by conforming to all of the MUST or
REQUIRED level requirements for the legacy interface
for the transitional devices and drivers.

we would just remove "for the transitional devices and drivers"
here as now non-transitional can have a legacy interface.

Similarly:

The requirements for the legacy interface for transitional implementations

would become:

"The requirements for the legacy interface"




> device reset sequence.

what is this one?

-- 
MST



* [virtio-dev] RE: [virtio-comment] Re: [PATCH 07/11] transport-pci: Introduce transitional MMR device id
  2023-04-07 12:03       ` [virtio-dev] Re: [virtio-comment] " Michael S. Tsirkin
@ 2023-04-07 15:18         ` Parav Pandit
  2023-04-07 15:51           ` [virtio-dev] " Michael S. Tsirkin
  0 siblings, 1 reply; 200+ messages in thread
From: Parav Pandit @ 2023-04-07 15:18 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: virtio-dev, cohuck, virtio-comment, Shahaf Shuler, Satananda Burla


> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Friday, April 7, 2023 8:03 AM
> 
> On Tue, Apr 04, 2023 at 12:08:54PM -0400, Parav Pandit wrote:

[..]
> > > I took a fresh look at it, and I don't get it: what exactly is wrong
> > > with just using modern ID? Why do we need new ones?
> >
> > A modern (non-transitional) device does not support legacy functionality
> > such as the virtio net hdr,
> 
> It's all a question of terminology, but it is not worth sacrificing functionality for
> cleaner terminology.
> Basically most of the spec just talks about the legacy interface, and will work
> fine with this.
> 
> Yes we do say:
> Devices or drivers with no legacy compatibility are referred to as non-
> transitional devices and drivers, respectively.
> 
> 
> So we will want to refine this somewhat. Maybe:
> 
> 	Devices not compatible with legacy drivers and drivers not compatible
> 	with legacy devices are referred to as non-transitional devices and
> 	drivers, respectively.
> 
> 
> This allows non-transitional devices to expose the legacy capability - having this
> capability does not make them compatible with legacy drivers.
> 
> Similarly in the conformance section:
> 
> An implementation MAY choose to implement OPTIONAL support for the
> legacy interface, including support for legacy drivers or devices, by conforming
> to all of the MUST or REQUIRED level requirements for the legacy interface for
> the transitional devices and drivers.
> 
> we would just remove "for the transitional devices and drivers"
> here as now non-transitional can have a legacy interface.
> 
> Similarly:
> 
> The requirements for the legacy interface for transitional implementations
> 
> would become:
> 
> "The requirements for the legacy interface"
> 
>
I will hold off responding to the other emails in this series, because the key part is here.

If I understand you correctly, will the above wording translate to the below behavior?

1. A non-transitional device will expose a capability (not a feature bit, but a capability at the transport level).
This capability indicates that it supports the legacy interface.
Let's name it legacy_if_emulation for the sake of this discussion.
It is a two-way pci capability.
The device reports it.
And the driver enables it. (Why it is two-way and why the driver needs to enable it is described later in point #d below.)

Hence, such a non-transitional device does not need to comply with the below listed requirements #a and #b.

a. A driver MUST accept VIRTIO_F_VERSION_1 if it is offered.
(Because the hypervisor driver is a passthrough driver, and a legacy driver will not accept this feature bit).

b. device MAY fail to operate further if VIRTIO_F_VERSION_1 is not accepted.

c. A non-transitional device with the above legacy_if_supported capability will allow the device reset sequence, described in
[1] Driver Requirements: Device Initialization (3.1.1)
[2] Legacy Interface: Device Initialization (3.1.2)

> > device reset sequence.
> 
> what is this one?

I listed it above in #c.
And

d. When the legacy_if_emulation capability is offered and the hypervisor driver has enabled it, then when the driver performs a device reset, the driver will not wait for the device reset to go to zero.
When the legacy_if_emulation capability is not enabled by the (hypervisor or other, say existing) driver, the driver will wait for the device reset to turn 0. (Following driver requirement 2.4.2.)


* [virtio-dev] Re: [virtio-comment] Re: [PATCH 07/11] transport-pci: Introduce transitional MMR device id
  2023-04-07 15:18         ` [virtio-dev] " Parav Pandit
@ 2023-04-07 15:51           ` Michael S. Tsirkin
  2023-04-09  3:15             ` [virtio-dev] " Parav Pandit
  0 siblings, 1 reply; 200+ messages in thread
From: Michael S. Tsirkin @ 2023-04-07 15:51 UTC (permalink / raw)
  To: Parav Pandit
  Cc: virtio-dev, cohuck, virtio-comment, Shahaf Shuler, Satananda Burla

On Fri, Apr 07, 2023 at 03:18:47PM +0000, Parav Pandit wrote:
> 
> > From: Michael S. Tsirkin <mst@redhat.com>
> > Sent: Friday, April 7, 2023 8:03 AM
> > 
> > On Tue, Apr 04, 2023 at 12:08:54PM -0400, Parav Pandit wrote:
> 
> [..]
> > > > I took a fresh look at it, and I don't get it: what exactly is wrong
> > > > with just using modern ID? Why do we need new ones?
> > >
> > > A modern (non-transitional) device does not support legacy functionality
> > > such as the virtio net hdr,
> > 
> > It's all a question of terminology, but it is not worth sacrificing functionality for
> > cleaner terminology.
> > Basically most of the spec just talks about the legacy interface, and will work
> > fine with this.
> > 
> > Yes we do say:
> > Devices or drivers with no legacy compatibility are referred to as non-
> > transitional devices and drivers, respectively.
> > 
> > 
> > So we will want to refine this somewhat. Maybe:
> > 
> > 	Devices not compatible with legacy drivers and drivers not compatible
> > 	with legacy devices are referred to as non-transitional devices and
> > 	drivers, respectively.
> > 
> > 
> > This allows non-transitional devices to expose the legacy capability - having this
> > capability does not make them compatible with legacy drivers.
> > 
> > Similarly in the conformance section:
> > 
> > An implementation MAY choose to implement OPTIONAL support for the
> > legacy interface, including support for legacy drivers or devices, by conforming
> > to all of the MUST or REQUIRED level requirements for the legacy interface for
> > the transitional devices and drivers.
> > 
> > we would just remove "for the transitional devices and drivers"
> > here as now non-transitional can have a legacy interface.
> > 
> > Similarly:
> > 
> > The requirements for the legacy interface for transitional implementations
> > 
> > would become:
> > 
> > "The requirements for the legacy interface"
> > 
> >
> I will hold off responding to the other emails in this series, because the key part is here.
>
> If I understand you correctly, will the above wording translate to the below behavior?
> 
> 1. A non-transitional device will expose a capability (not a feature bit, but a capability at the transport level).

Note that we can allow this capability in transitional devices too.
This is useful since IO bar might not be enabled even if present.

> This capability indicates that it supports the legacy interface.
> Let's name it legacy_if_emulation for the sake of this discussion.
> It is a two-way pci capability.
> The device reports it.
> And the driver enables it. (Why it is two-way and why the driver needs to enable it is described later in point #d below.)
> 
> Hence, such a non-transitional device does not need to comply with the below listed requirements #a and #b.
> 
> a. A driver MUST accept VIRTIO_F_VERSION_1 if it is offered.
> (Because the hypervisor driver is a passthrough driver, and a legacy driver will not accept this feature bit).

This is not a device requirement at all.

> b. device MAY fail to operate further if VIRTIO_F_VERSION_1 is not accepted.

> This is optional, not a requirement.

> c. A non-transitional device with the above legacy_if_supported capability will allow the device reset sequence, described in
> [1] Driver Requirements: Device Initialization (3.1.1)
> [2] Legacy Interface: Device Initialization (3.1.2)
> 
> > > device reset sequence.
> > 
> > what is this one?
> 
> I listed it above in #c.
> And
>
> d. When the legacy_if_emulation capability is offered and the hypervisor driver has enabled it, then when the driver performs a device reset, the driver will not wait for the device reset to go to zero.
> When the legacy_if_emulation capability is not enabled by the (hypervisor or other, say existing) driver, the driver will wait for the device reset to turn 0. (Following driver requirement 2.4.2.)

It might not be a bad idea to enable it, but I observe that it is
possible for the hypervisor to expose a standard transitional device on top
of this MMR capability. Thus it will not be known whether the guest driver
accesses the legacy or modern BAR until the guest runs.
I propose, instead, that the device expose the same registers
at two addresses and execute reset correctly depending
on which address it was accessed through. WDYT?



-- 
MST



* [virtio-dev] RE: [virtio-comment] Re: [PATCH 07/11] transport-pci: Introduce transitional MMR device id
  2023-04-07 15:51           ` [virtio-dev] " Michael S. Tsirkin
@ 2023-04-09  3:15             ` Parav Pandit
  2023-04-10 10:18               ` [virtio-dev] " Michael S. Tsirkin
  0 siblings, 1 reply; 200+ messages in thread
From: Parav Pandit @ 2023-04-09  3:15 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: virtio-dev, cohuck, virtio-comment, Shahaf Shuler, Satananda Burla


> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Friday, April 7, 2023 11:51 AM

> > 1. A non-transitional device will expose a capability (not a feature bit, but a
> capability at the transport level).
> 
> Note that we can allow this capability in transitional devices too.
> This is useful since IO bar might not be enabled even if present.
>
This capability exposure makes a device transitional in some sense.
 
> > This capability indicates that it supports the legacy interface.
> > Let's name it legacy_if_emulation for the sake of this discussion.
> > It is a two-way pci capability.
> > The device reports it.
> > And the driver enables it. (Why it is two-way and why the driver needs to enable it
> is described later in point #d below.)
> >
> > Hence, such a non-transitional device does not need to comply with the below listed
> requirements #a and #b.
> >
> > a. A driver MUST accept VIRTIO_F_VERSION_1 if it is offered.
> > (Because the hypervisor driver is a passthrough driver, and a legacy driver will not
> accept this feature bit).
> 
> This is not a device requirement at all.
> 
Though this is written as a driver requirement, a device expects this feature bit to be negotiated.
What should a device implementor do? It should allow the driver to not negotiate the bit, right?

Which means the below line needs to change:

from:
device MAY fail to operate further if VIRTIO_F_VERSION_1 is not accepted.

to:
A non-transitional device that does not have the legacy interface capability MAY fail to operate further if V_1 is not accepted.
A non-transitional device that has the legacy interface capability SHOULD operate further even if V_1 is not accepted.

> > b. device MAY fail to operate further if VIRTIO_F_VERSION_1 is not accepted.
> 
This is optional, not a requirement.
> 
Please see the above wording, if it's acceptable.

> > c. A non-transitional device with the above legacy_if_supported
> > capability will allow the device reset sequence, described in [1] Driver
> > Requirements: Device Initialization (3.1.1) [2] Legacy Interface:
> > Device Initialization (3.1.2)
> >
> > > > device reset sequence.
> > >
> > > what is this one?
> >
> > I listed it above in #c.
> > And
> >
> > d. When the legacy_if_emulation capability is offered and the hypervisor driver
> has enabled it, then when the driver performs a device reset, the driver will not wait for the device
> reset to go to zero.
> > When the legacy_if_emulation capability is not enabled by the (hypervisor or other,
> say existing) driver, the driver will wait for the device reset to turn 0. (Following
> driver requirement 2.4.2.)
> 
> It might not be a bad idea to enable it, but I observe that it is possible for
> the hypervisor to expose a standard transitional device on top of this MMR
> capability. Thus it will not be known whether the guest driver accesses the legacy or
> modern BAR until the guest runs.
> I propose, instead, that the device expose the same registers at two addresses and
> execute reset correctly depending on which address it was accessed through.
> WDYT?
Yep, this is the exact proposal here.
Legacy registers exposed via the AQ (aka TVQ) or the MMR location behave like legacy.
And regular registers stay at their location as-is.

With that, feature bit negotiation is the only thing to relax, as worded above.


* [virtio-dev] Re: [virtio-comment] Re: [PATCH 09/11] transport-pci: Describe PCI MMR dev config registers
  2023-04-07  8:55   ` [virtio-dev] " Michael S. Tsirkin
@ 2023-04-10  1:33     ` Jason Wang
  2023-04-10  6:14       ` Michael S. Tsirkin
  0 siblings, 1 reply; 200+ messages in thread
From: Jason Wang @ 2023-04-10  1:33 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Parav Pandit, virtio-dev, cohuck, virtio-comment, shahafs,
	Satananda Burla

On Fri, Apr 7, 2023 at 4:55 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Fri, Mar 31, 2023 at 01:58:32AM +0300, Parav Pandit wrote:
> > Legacy virtio configuration registers and adjacent
> > device configuration registers are located somewhere
> > in a memory BAR.
> >
> > A new capability supplies the location of these registers
> > which a driver can use to map I/O access to legacy
> > memory mapped registers.
> >
> > This gives the ability to locate legacy registers in either
> > the existing memory BAR or in a completely new BAR at BAR 0.
> >
> > The example diagram below attempts to depict it in an existing
> > memory BAR.
> >
> > +------------------------------+
> > |Transitional                  |
> > |MMR SRIOV VF                  |
> > |                              |
> > ++---------------+             |
> > ||dev_id =       |             |
> > ||{0x10f9-0x10ff}|             |
> > |+---------------+             |
> > |                              |
> > ++--------------------+        |
> > || PCIe ext cap = 0xB |        |
> > || cfg_type = 10      |        |
> > || offset   = 0x1000  |        |
> > || bar      = A {0..5}|        |
> > |+--|-----------------+        |
> > |   |                          |
> > |   |                          |
> > |   |    +-------------------+ |
> > |   |    | Memory BAR = A    | |
> > |   |    |                   | |
> > |   +------>+--------------+ | |
> > |        |  |legacy virtio | | |
> > |        |  |+ dev cfg     | | |
> > |        |  |registers     | | |
> > |        |  +--------------+ | |
> > |        +-----------------+ | |
> > +------------------------------+
> >
> > Co-developed-by: Satananda Burla <sburla@marvell.com>
> > Signed-off-by: Parav Pandit <parav@nvidia.com>
>
>
> I am split about using the extended capability for this, since in
> practice this makes for more code in hypervisors.  How about just using
> an existing capability as opposed to the extended capability? Less work
> for existing hypervisors, no? And let's begin to use the extended
> capability for something more important than legacy access.
>
>
>
>
> > ---
> >  transport-pci.tex | 33 +++++++++++++++++++++++++++++++--
> >  1 file changed, 31 insertions(+), 2 deletions(-)
> >
> > diff --git a/transport-pci.tex b/transport-pci.tex
> > index aeda4a1..55a6aa0 100644
> > --- a/transport-pci.tex
> > +++ b/transport-pci.tex
> > @@ -168,6 +168,7 @@ \subsection{Virtio Structure PCI Capabilities}\label{sec:Virtio Transport Option
> >  \item ISR Status
> >  \item Device-specific configuration (optional)
> >  \item PCI configuration access
> > +\item Legacy memory mapped configuration registers (optional)
> >  \end{itemize}
> >
> >  Each structure can be mapped by a Base Address register (BAR) belonging to
> > @@ -228,6 +229,8 @@ \subsection{Virtio Structure PCI Capabilities}\label{sec:Virtio Transport Option
> >  #define VIRTIO_PCI_CAP_SHARED_MEMORY_CFG 8
> >  /* Vendor-specific data */
> >  #define VIRTIO_PCI_CAP_VENDOR_CFG        9
> > +/* Legacy configuration registers capability */
> > +#define VIRTIO_PCI_CAP_LEGACY_MMR_CFG    10
> >  \end{lstlisting}
> >
> >          Any other value is reserved for future use.
> > @@ -682,6 +685,18 @@ \subsubsection{Common configuration structure layout}\label{sec:Virtio Transport
> >  Configuration Space / Legacy Interface: Device Configuration
> >  Space}~\nameref{sec:Basic Facilities of a Virtio Device / Device Configuration Space / Legacy Interface: Device Configuration Space} for workarounds.
> >
> > +\paragraph{Transitional MMR Interface: A Note on Configuration Registers}
> > +\label{sec:Virtio Transport Options / Virtio Over PCI Bus / Virtio Structure PCI Capabilities / Common configuration structure layout / Transitional MMR Interface: A Note on Configuration Registers}
> > +
> > +The transitional MMR device MUST present legacy virtio registers
> > +consisting of legacy common configuration registers followed by
> > +legacy device specific configuration registers described in section
> > +\ref{sec:Virtio Transport Options / Virtio Over PCI Bus / Virtio Structure PCI Capabilities / Common configuration structure layout / Legacy Interfaces: A Note on Configuration Registers}
> > +in a memory region PCI BAR.
> > +
> > +The transitional MMR device MUST provide the location of the
> > +legacy virtio configuration registers using a legacy memory mapped
> > +registers capability described in section \ref{sec:Virtio Transport Options / Virtio Over PCI Bus / Virtio Structure PCI Capabilities / Transitional MMR Interface: Legacy Memory Mapped Configuration Registers Capability}.
> >
> >  \subsubsection{Notification structure layout}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Notification capability}
> >
> > @@ -956,9 +971,23 @@ \subsubsection{PCI configuration access capability}\label{sec:Virtio Transport O
> >  specified by some other Virtio Structure PCI Capability
> >  of type other than \field{VIRTIO_PCI_CAP_PCI_CFG}.
> >
> > +\subsubsection{Transitional MMR Interface: Legacy Memory Mapped Configuration Registers Capability}
> > +\label{sec:Virtio Transport Options / Virtio Over PCI Bus / Virtio Structure PCI Capabilities / Transitional MMR Interface: Legacy Memory Mapped Configuration Registers Capability}
> > +
> > +The optional VIRTIO_PCI_CAP_LEGACY_MMR_CFG capability defines
> > +the location of the legacy virtio configuration registers
> > +followed by legacy device specific configuration registers in
> > +the memory region BAR for the transitional MMR device.
> > +
> > +The \field{cap.offset} MUST be 4-byte aligned.
> > +The \field{cap.offset} SHOULD be 4KBytes aligned and
>
> what's the point of this?
>
> > +\field{cap.length} SHOULD be 4KBytes.
>
> Why is length 4KBytes? Why not the actual length?
>
>
> > +
> > +The transitional MMR device MUST present a legacy configuration
> > +memory mapped registers capability using \field{virtio_pcie_ext_cap}.
> > +
> >  \subsubsection{Legacy Interface: A Note on Feature Bits}
> > -\label{sec:Virtio Transport Options / Virtio Over PCI Bus /
> > -Virtio Structure PCI Capabilities / Legacy Interface: A Note on Feature Bits}
> > +\label{sec:Virtio Transport Options / Virtio Over PCI Bus / Virtio Structure PCI Capabilities / Legacy Interface: A Note on Feature Bits}
> >
>
>
> So it is not by chance that we abandoned the legacy interface;
> it had some fundamental issues.
> Let me list some off the top of my head:
> - no atomicity across accesses, so if a register changes
>   while it is read the driver gets trash; solved using a generation id
> - no defined endian-ness; solved using le
> - no way to reject driver configuration

I can think of more:

- VIRTIO_F_ACCESS_PLATFORM
- VIRTIO_F_ORDER_PLATFORM

The above two are a must for hardware devices, but they are almost
impossible for legacy interfaces.

Hardware transitional devices are really tricky; ENI is one example,
which can only work on x86.

config ALIBABA_ENI_VDPA
        tristate "vDPA driver for Alibaba ENI"
        select VIRTIO_PCI_LIB_LEGACY
        depends on PCI_MSI && X86

It has an MMIO bar for legacy registers and it works by chance for
Linux drivers; DPDK requires patches to let it work.

This is fine for vDPA but not for virtio if the design can only work
for some specific setups (OSes/archs).

Thanks

>
> we just did a new thing instead, but I feel if you are reviving the
> legacy interface yet again, it is worth thinking about solving the worst
> of the warts. For example, I can see an endian-ness register that
> hypervisor can either read to check device is compatible with guest, or
> maybe even write to force device to specific endian-ness.
> A generation counter that hypervisor can check to
> verify value is consistent? Can work if hypervisor
> caches configuration.
> A bad state flag that device can set to make hypervisor
> stop guest? Better than corrupting it ...
>
>
>
>
> >  Only Feature Bits 0 to 31 are accessible through the
> >  Legacy Interface. When used through the Legacy Interface,
> > --
> > 2.26.2
>
>

* Re: [virtio-dev] [PATCH 08/11] transport-pci: Introduce virtio extended capability
  2023-03-30 22:58 ` [virtio-dev] [PATCH 08/11] transport-pci: Introduce virtio extended capability Parav Pandit
  2023-04-04  7:35   ` [virtio-dev] " Michael S. Tsirkin
@ 2023-04-10  1:36   ` Jason Wang
  2023-04-10  6:24     ` Michael S. Tsirkin
  2023-04-10 17:54     ` Parav Pandit
  2023-05-19  6:10   ` [virtio-dev] " Michael S. Tsirkin
  2 siblings, 2 replies; 200+ messages in thread
From: Jason Wang @ 2023-04-10  1:36 UTC (permalink / raw)
  To: Parav Pandit
  Cc: mst, virtio-dev, cohuck, virtio-comment, shahafs, Satananda Burla

On Fri, Mar 31, 2023 at 7:00 AM Parav Pandit <parav@nvidia.com> wrote:
>
> PCI device configuration space for capabilities is limited to only 192
> bytes shared by many PCI capabilities of generic PCI device and virtio
> specific.
>
> Hence, introduce virtio extended capability that uses PCI Express
> extended capability.
> Subsequent patch uses this virtio extended capability.
>
> Co-developed-by: Satananda Burla <sburla@marvell.com>
> Signed-off-by: Parav Pandit <parav@nvidia.com>

Can you explain the differences compared to what I proposed earlier?

https://www.mail-archive.com/virtio-dev@lists.oasis-open.org/msg08078.html

This can save time for everybody.

Thanks

> ---
>  transport-pci.tex | 69 ++++++++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 68 insertions(+), 1 deletion(-)
>
> diff --git a/transport-pci.tex b/transport-pci.tex
> index 665448e..aeda4a1 100644
> --- a/transport-pci.tex
> +++ b/transport-pci.tex
> @@ -174,7 +174,8 @@ \subsection{Virtio Structure PCI Capabilities}\label{sec:Virtio Transport Option
>  the function, or accessed via the special VIRTIO_PCI_CAP_PCI_CFG field in the PCI configuration space.
>
>  The location of each structure is specified using a vendor-specific PCI capability located
> -on the capability list in PCI configuration space of the device.
> +on the capability list in PCI configuration space of the device
> +unless stated otherwise.
>  This virtio structure capability uses little-endian format; all fields are
>  read-only for the driver unless stated otherwise:
>
> @@ -301,6 +302,72 @@ \subsection{Virtio Structure PCI Capabilities}\label{sec:Virtio Transport Option
>  fields provide the most significant 32 bits of a total 64 bit offset and
>  length within the BAR specified by \field{cap.bar}.
>
> +Virtio extended PCI Express capability structure defines
> +the location of certain virtio device configuration related
> +structures using PCI Express extended capability. Virtio
> +extended PCI Express capability structure uses PCI Express
> > +vendor specific extended capability (VSEC). It has the
> > +layout below:
> +
> +\begin{lstlisting}
> +struct pcie_ext_cap {
> +        le16 cap_vendor_id; /* Generic PCI field: 0xB */
> +        le16 cap_version : 2; /* Generic PCI field: 0 */
> +        le16 next_cap_offset : 14; /* Generic PCI field: next cap or 0 */
> +};
> +
> +struct virtio_pcie_ext_cap {
> +        struct pcie_ext_cap pcie_ecap;
> +        u8 cfg_type; /* Identifies the structure. */
> > +        u8 bar; /* Index of the BAR where it is located */
> +        u8 id; /* Multiple capabilities of the same type */
> +        u8 zero_padding[1];
> > +        le64 offset; /* Offset within the BAR */
> +        le64 length; /* Length of the structure, in bytes. */
> +        u8 data[]; /* Optional variable length data */
> +};
> +\end{lstlisting}
> +
> +This structure contains optional data, depending on
> +\field{cfg_type}. The fields are interpreted as follows:
> +
> +\begin{description}
> +\item[\field{cap_vendor_id}]
> +         0x0B; identifies a vendor-specific extended capability.
> +
> +\item[\field{cap_version}]
> +         contains a value of 0.
> +
> +\item[\field{next_cap_offset}]
> +        Offset to the next capability.
> +
> +\item[\field{cfg_type}]
> +        follows the same definition as \field{cfg_type}
> +        from the \field{struct virtio_pci_cap}.
> +
> +\item[\field{bar}]
> > +        follows the same definition as \field{bar}
> +        from the \field{struct virtio_pci_cap}.
> +
> +\item[\field{id}]
> > +        follows the same definition as \field{id}
> +        from the \field{struct virtio_pci_cap}.
> +
> +\item[\field{offset}]
> +        indicates where the structure begins relative to the
> +        base address associated with the BAR. The alignment
> +        requirements of offset are indicated in each
> +        structure-specific section that uses
> +        \field{struct virtio_pcie_ext_cap}.
> +
> +\item[\field{length}]
> +        indicates the length of the structure indicated by this
> +        capability.
> +
> +\item[\field{data}]
> +        optional data of this capability.
> +\end{description}
> +
>  \drivernormative{\subsubsection}{Virtio Structure PCI Capabilities}{Virtio Transport Options / Virtio Over PCI Bus / Virtio Structure PCI Capabilities}
>
>  The driver MUST ignore any vendor-specific capability structure which has
> --
> 2.26.2
>
>

* [virtio-dev] Re: [virtio-comment] Re: [PATCH 00/11] Introduce transitional mmr pci device
  2023-04-07  9:35                             ` Michael S. Tsirkin
@ 2023-04-10  1:52                               ` Jason Wang
  0 siblings, 0 replies; 200+ messages in thread
From: Jason Wang @ 2023-04-10  1:52 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Parav Pandit, virtio-dev, cohuck, virtio-comment, Shahaf Shuler

On Fri, Apr 7, 2023 at 5:35 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Mon, Apr 03, 2023 at 06:00:13PM -0400, Parav Pandit wrote:
> >
> >
> > On 4/3/2023 5:04 PM, Michael S. Tsirkin wrote:
> > > On Mon, Apr 03, 2023 at 08:25:02PM +0000, Parav Pandit wrote:
> > > >
> > > > > From: Michael S. Tsirkin <mst@redhat.com>
> > > > > Sent: Monday, April 3, 2023 2:02 PM
> > > >

[...]

> >
> > tvq specific for legacy register access makes sense.
> > Some generic tvq is abstract and I don't see any relation here.
> >
> > So it is better to name it legacy_reg_transport_vq (lrt_vq).
>
> Again this assumes tvq will be rewritten on top of AQ.
> I guess legacy can then become a new type of AQ command?
>
> And maybe you want a memory mapped register for AQ commands? I know
> Jason really wanted that.
>

That's exactly why we decouple the commands from a specific transport
(queue or register). It allows sufficient flexibility.
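
A speculative C sketch of that decoupling, with purely hypothetical
names: one command layout and two interchangeable carriers (a queue
and a register window), so neither carrier defines the command:

#include <stdint.h>
#include <string.h>

struct admin_cmd {
        uint16_t opcode;      /* e.g. a legacy register read/write */
        uint16_t flags;
        uint32_t len;         /* bytes valid in payload[] */
        uint8_t  payload[64];
};

/* Carrier 1: submit the command over a virtqueue (assumed helper). */
int vq_submit(const struct admin_cmd *cmd);

/* Carrier 2: replay the exact same layout through a memory mapped
 * register window; the device parses one format either way. */
static inline int mmio_submit(volatile void *window,
                              const struct admin_cmd *cmd)
{
        memcpy((void *)window, cmd, sizeof(*cmd));
        return 0;
}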

Thanks



* [virtio-dev] Re: [virtio-comment] Re: [PATCH 09/11] transport-pci: Describe PCI MMR dev config registers
  2023-04-10  1:33     ` [virtio-dev] Re: [virtio-comment] " Jason Wang
@ 2023-04-10  6:14       ` Michael S. Tsirkin
  2023-04-10  6:20         ` Jason Wang
  0 siblings, 1 reply; 200+ messages in thread
From: Michael S. Tsirkin @ 2023-04-10  6:14 UTC (permalink / raw)
  To: Jason Wang
  Cc: Parav Pandit, virtio-dev, cohuck, virtio-comment, shahafs,
	Satananda Burla

On Mon, Apr 10, 2023 at 09:33:32AM +0800, Jason Wang wrote:
> This is fine for vDPA but not for virtio if the design can only work
> for some specific setups (OSes/archs).
> 
> Thanks

Well virtio legacy has a long history of documenting existing hacks :)
But yes, VIRTIO_F_ORDER_PLATFORM has to be documented.
And we have to decide what to do about ACCESS_PLATFORM since
there's a security problem if device allows not acking it.
Two options:
- relax the rules a bit and say device will assume ACCESS_PLATFORM
  is acked anyway
- a new flag that is insecure (so useful for sec but useless for dpdk) but optional

-- 
MST



* [virtio-dev] Re: [virtio-comment] Re: [PATCH 09/11] transport-pci: Describe PCI MMR dev config registers
  2023-04-10  6:14       ` Michael S. Tsirkin
@ 2023-04-10  6:20         ` Jason Wang
  2023-04-10  6:39           ` Michael S. Tsirkin
  0 siblings, 1 reply; 200+ messages in thread
From: Jason Wang @ 2023-04-10  6:20 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Parav Pandit, virtio-dev, cohuck, virtio-comment, shahafs,
	Satananda Burla

On Mon, Apr 10, 2023 at 2:15 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Mon, Apr 10, 2023 at 09:33:32AM +0800, Jason Wang wrote:
> > This is fine for vDPA but not for virtio if the design can only work
> > for some specific setups (OSes/archs).
> >
> > Thanks
>
> Well virtio legacy has a long history of documenting existing hacks :)

Exactly, so the legacy behaviour is not (or can't be) defined by the
spec but by the code.

> But yes, VIRTIO_F_ORDER_PLATFORM has to be documented.
> And we have to decide what to do about ACCESS_PLATFORM since
> there's a security problem if device allows not acking it.
> Two options:
> - relax the rules a bit and say device will assume ACCESS_PLATFORM
>   is acked anyway

This will break legacy drivers which assume physical addresses.

> - a new flag that is insecure (so useful for sec but useless for dpdk) but optional

This looks like a new "hack" for the legacy hacks.

And what about ORDER_PLATFORM, I don't think we can modify legacy drivers...

Thanks

>
> --
> MST
>



* Re: [virtio-dev] [PATCH 08/11] transport-pci: Introduce virtio extended capability
  2023-04-10  1:36   ` [virtio-dev] " Jason Wang
@ 2023-04-10  6:24     ` Michael S. Tsirkin
  2023-04-10  7:16       ` Jason Wang
  2023-04-10 17:54     ` Parav Pandit
  1 sibling, 1 reply; 200+ messages in thread
From: Michael S. Tsirkin @ 2023-04-10  6:24 UTC (permalink / raw)
  To: Jason Wang
  Cc: Parav Pandit, virtio-dev, cohuck, virtio-comment, shahafs,
	Satananda Burla

On Mon, Apr 10, 2023 at 09:36:17AM +0800, Jason Wang wrote:
> On Fri, Mar 31, 2023 at 7:00 AM Parav Pandit <parav@nvidia.com> wrote:
> >
> > PCI device configuration space for capabilities is limited to only 192
> > bytes shared by many PCI capabilities of generic PCI device and virtio
> > specific.
> >
> > Hence, introduce virtio extended capability that uses PCI Express
> > extended capability.
> > Subsequent patch uses this virtio extended capability.
> >
> > Co-developed-by: Satananda Burla <sburla@marvell.com>
> > Signed-off-by: Parav Pandit <parav@nvidia.com>
> 
> Can you explain the differences compared to what I proposed earlier?
> 
> https://www.mail-archive.com/virtio-dev@lists.oasis-open.org/msg08078.html
> 
> This can save time for everybody.
> 
> Thanks

BTW another advantage of extended capabilities is - these are actually
cheaper to access from a VM than classic config space.


Several points
- I don't like it that yours is 32 bit. We do not need 2 variants just
  make it all 64 bit
> - We need to document that if the driver does not scan extended capabilities it will not find them.
  And existing drivers do not scan them. So what is safe
  to put there? vendor specific? extra access types?
  Can we make scanning these mandatory in future drivers? future devices?
  I guess we can add a feature bit to flag that.
  Is accessing these possible from bios?

So I like this one better as a basis - care reviewing it and adding
stuff?
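
To make the scanning point concrete: classic drivers only walk the
capability list rooted at the Capabilities Pointer (offset 0x34), so a
structure present only in extended config space (which starts at
0x100) is invisible to them. A hedged C sketch of the extended walk a
new driver would need, where pci_cfg_read32() is a hypothetical config
space accessor:

#include <stdint.h>

uint32_t pci_cfg_read32(uint16_t off); /* assumed helper */

#define PCI_EXT_CAP_START   0x100
#define PCI_EXT_CAP_ID_VSEC 0x000B /* vendor-specific extended capability */

/* Returns the offset of the first VSEC extended capability, or 0. */
uint16_t find_vsec_ext_cap(void)
{
        uint16_t off = PCI_EXT_CAP_START;

        while (off) {
                uint32_t hdr = pci_cfg_read32(off);

                if (hdr == 0 || hdr == 0xffffffff)
                        break; /* nothing implemented here */
                if ((hdr & 0xffff) == PCI_EXT_CAP_ID_VSEC)
                        return off;
                off = (hdr >> 20) & 0xffc; /* next pointer in bits 31:20 */
        }
        return 0;
}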

-- 
MST



* [virtio-dev] Re: [virtio-comment] Re: [PATCH 09/11] transport-pci: Describe PCI MMR dev config registers
  2023-04-10  6:20         ` Jason Wang
@ 2023-04-10  6:39           ` Michael S. Tsirkin
  2023-04-10  7:20             ` Jason Wang
  0 siblings, 1 reply; 200+ messages in thread
From: Michael S. Tsirkin @ 2023-04-10  6:39 UTC (permalink / raw)
  To: Jason Wang
  Cc: Parav Pandit, virtio-dev, cohuck, virtio-comment, shahafs,
	Satananda Burla

On Mon, Apr 10, 2023 at 02:20:16PM +0800, Jason Wang wrote:
> On Mon, Apr 10, 2023 at 2:15 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Mon, Apr 10, 2023 at 09:33:32AM +0800, Jason Wang wrote:
> > > This is fine for vDPA but not for virtio if the design can only work
> > > for some specific setups (OSes/archs).
> > >
> > > Thanks
> >
> > Well virtio legacy has a long history of documenting existing hacks :)
> 
> Exactly, so the legacy behaviour is not (or can't be) defined by the
> spec but by the code.

I mean driver behaviour derives from the code but we do document it in
the spec to help people build devices.


> > But yes, VIRTIO_F_ORDER_PLATFORM has to be documented.
> > And we have to decide what to do about ACCESS_PLATFORM since
> > there's a security problem if device allows not acking it.
> > Two options:
> > - relax the rules a bit and say device will assume ACCESS_PLATFORM
> >   is acked anyway
> 
> This will break legacy drivers which assume physical addresses.

not that they are not already broken.

> > - a new flag that is insecure (so useful for sec but useless for dpdk) but optional
> 
> This looks like a new "hack" for the legacy hacks.

it's not just for legacy.

> And what about ORDER_PLATFORM, I don't think we can modify legacy drivers...
> 
> Thanks

You play some tricks with shadow VQ I guess.

-- 
MST



* Re: [virtio-dev] [PATCH 08/11] transport-pci: Introduce virtio extended capability
  2023-04-10  6:24     ` Michael S. Tsirkin
@ 2023-04-10  7:16       ` Jason Wang
  2023-04-10 10:04         ` Michael S. Tsirkin
  0 siblings, 1 reply; 200+ messages in thread
From: Jason Wang @ 2023-04-10  7:16 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Parav Pandit, virtio-dev, cohuck, virtio-comment, shahafs,
	Satananda Burla

On Mon, Apr 10, 2023 at 2:24 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Mon, Apr 10, 2023 at 09:36:17AM +0800, Jason Wang wrote:
> > On Fri, Mar 31, 2023 at 7:00 AM Parav Pandit <parav@nvidia.com> wrote:
> > >
> > > PCI device configuration space for capabilities is limited to only 192
> > > bytes shared by many PCI capabilities of generic PCI device and virtio
> > > specific.
> > >
> > > Hence, introduce virtio extended capability that uses PCI Express
> > > extended capability.
> > > Subsequent patch uses this virtio extended capability.
> > >
> > > Co-developed-by: Satananda Burla <sburla@marvell.com>
> > > Signed-off-by: Parav Pandit <parav@nvidia.com>
> >
> > Can you explain the differences compared to what I proposed earlier?
> >
> > https://www.mail-archive.com/virtio-dev@lists.oasis-open.org/msg08078.html
> >
> > This can save time for everybody.
> >
> > Thanks
>
> BTW another advantage of extended capabilities is - these are actually
> cheaper to access from a VM than classic config space.

Config space/BAR is allowed by both of the proposals, or is there anything I missed?

>
>
> Several points
> - I don't like it that yours is 32 bit. We do not need 2 variants just
>   make it all 64 bit

That's fine.

> > - We need to document that if the driver does not scan extended capabilities it will not find them.

This is implicit; as I remember we don't have such documentation for
the pci capability, so is there anything that makes pcie special?

>   And existing drivers do not scan them. So what is safe
>   to put there? vendor specific? extra access types?

For PASID at least, since it's a PCI-E feature, vendor specific should
be fine. Not sure about legacy MMIO then.

>   Can we make scanning these mandatory in future drivers? future devices?
>   I guess we can add a feature bit to flag that.

For PASID, it doesn't need this, otherwise we may duplicate transport
specific features.

>   Is accessing these possible from bios?

Not at least for the two use cases now PASID or legacy MMIO.

>
> So I like this one better as a basis - care reviewing it and adding
> stuff?

There are very few differences and I will have a look.

Thanks

>
> --
> MST
>



* [virtio-dev] Re: [virtio-comment] Re: [PATCH 09/11] transport-pci: Describe PCI MMR dev config registers
  2023-04-10  6:39           ` Michael S. Tsirkin
@ 2023-04-10  7:20             ` Jason Wang
  2023-04-10 10:06               ` Michael S. Tsirkin
  0 siblings, 1 reply; 200+ messages in thread
From: Jason Wang @ 2023-04-10  7:20 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Parav Pandit, virtio-dev, cohuck, virtio-comment, shahafs,
	Satananda Burla

On Mon, Apr 10, 2023 at 2:40 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Mon, Apr 10, 2023 at 02:20:16PM +0800, Jason Wang wrote:
> > On Mon, Apr 10, 2023 at 2:15 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > >
> > > On Mon, Apr 10, 2023 at 09:33:32AM +0800, Jason Wang wrote:
> > > > This is fine for vDPA but not for virtio if the design can only work
> > > > for some specific setups (OSes/archs).
> > > >
> > > > Thanks
> > >
> > > Well virtio legacy has a long history of documenting existing hacks :)
> >
> > Exactly, so the legacy behaviour is not (or can't be) defined by the
> > spec but by the code.
>
> I mean driver behaviour derives from the code but we do document it in
> the spec to help people build devices.
>
>
> > > But yes, VIRTIO_F_ORDER_PLATFORM has to be documented.
> > > And we have to decide what to do about ACCESS_PLATFORM since
> > > there's a security problem if device allows not acking it.
> > > Two options:
> > > - relax the rules a bit and say device will assume ACCESS_PLATFORM
> > >   is acked anyway
> >
> > This will break legacy drivers which assume physical addresses.
>
> not that they are not already broken.

I may be missing something; the whole point is to allow legacy drivers to
run, otherwise a modern device is sufficient?

>
> > > - a new flag that is insecure (so useful for sec but useless for dpdk) but optional
> >
> > This looks like a new "hack" for the legacy hacks.
>
> it's not just for legacy.

We have the ACCESS_PLATFORM feature bit, what is the usage for this new flag?

>
> > And what about ORDER_PLATFORM, I don't think we can modify legacy drivers...
> >
> > Thanks
>
> You play some tricks with shadow VQ I guess.

Do we really want to add a new feature in the virtio spec that can
only work with the datapath mediation?

Thanks

>
> --
> MST
>



* Re: [virtio-dev] [PATCH 08/11] transport-pci: Introduce virtio extended capability
  2023-04-10  7:16       ` Jason Wang
@ 2023-04-10 10:04         ` Michael S. Tsirkin
  2023-04-11  2:19           ` Jason Wang
  0 siblings, 1 reply; 200+ messages in thread
From: Michael S. Tsirkin @ 2023-04-10 10:04 UTC (permalink / raw)
  To: Jason Wang
  Cc: Parav Pandit, virtio-dev, cohuck, virtio-comment, shahafs,
	Satananda Burla

On Mon, Apr 10, 2023 at 03:16:46PM +0800, Jason Wang wrote:
> On Mon, Apr 10, 2023 at 2:24 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Mon, Apr 10, 2023 at 09:36:17AM +0800, Jason Wang wrote:
> > > On Fri, Mar 31, 2023 at 7:00 AM Parav Pandit <parav@nvidia.com> wrote:
> > > >
> > > > PCI device configuration space for capabilities is limited to only 192
> > > > bytes shared by many PCI capabilities of generic PCI device and virtio
> > > > specific.
> > > >
> > > > Hence, introduce virtio extended capability that uses PCI Express
> > > > extended capability.
> > > > Subsequent patch uses this virtio extended capability.
> > > >
> > > > Co-developed-by: Satananda Burla <sburla@marvell.com>
> > > > Signed-off-by: Parav Pandit <parav@nvidia.com>
> > >
> > > Can you explain the differences compared to what I proposed earlier?
> > >
> > > https://www.mail-archive.com/virtio-dev@lists.oasis-open.org/msg08078.html
> > >
> > > This can save time for everybody.
> > >
> > > Thanks
> >
> > BTW another advantage of extended capabilities is - these are actually
> > cheaper to access from a VM than classic config space.
> 
> Config space/BAR is allowed by both of the proposals, or is there anything I missed?
> 
> >
> >
> > Several points
> > - I don't like it that yours is 32 bit. We do not need 2 variants just
> >   make it all 64 bit
> 
> That's fine.
> 
> > - We need to document that if the driver does not scan extended capabilities it will not find them.
> 
> This is implicit; as I remember we don't have such documentation for
> the pci capability, so is there anything that makes pcie special?

yes - the fact that there are tons of existing drivers expecting
everything in standard space.


> >   And existing drivers do not scan them. So what is safe
> >   to put there? vendor specific? extra access types?
> 
> For PASID at least, since it's a PCI-E feature, vendor specific should
> be fine. Not sure about legacy MMIO then.
> 
> >   Can we make scanning these mandatory in future drivers? future devices?
> >   I guess we can add a feature bit to flag that.
> 
> For PASID, it doesn't need this, otherwise we may duplicate transport
> specific features.

i don't get it. what does PASID have to do with it?
A new feature will allow clean split at least:
we make any new features and new devices that expect
express capability depend on this new feature bit.

> >   Is accessing these possible from bios?
> 
> Not at least for the two use cases now PASID or legacy MMIO.

can't parse english here. what does this mean?


> >
> > So I like this one better as a basis - care reviewing it and adding
> > stuff?
> 
> There are very few differences and I will have a look.
> 
> Thanks
> 
> >
> > --
> > MST
> >



* [virtio-dev] Re: [virtio-comment] Re: [PATCH 09/11] transport-pci: Describe PCI MMR dev config registers
  2023-04-10  7:20             ` Jason Wang
@ 2023-04-10 10:06               ` Michael S. Tsirkin
  2023-04-11  2:13                 ` Jason Wang
  0 siblings, 1 reply; 200+ messages in thread
From: Michael S. Tsirkin @ 2023-04-10 10:06 UTC (permalink / raw)
  To: Jason Wang
  Cc: Parav Pandit, virtio-dev, cohuck, virtio-comment, shahafs,
	Satananda Burla

On Mon, Apr 10, 2023 at 03:20:52PM +0800, Jason Wang wrote:
> On Mon, Apr 10, 2023 at 2:40 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Mon, Apr 10, 2023 at 02:20:16PM +0800, Jason Wang wrote:
> > > On Mon, Apr 10, 2023 at 2:15 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > >
> > > > On Mon, Apr 10, 2023 at 09:33:32AM +0800, Jason Wang wrote:
> > > > > This is fine for vDPA but not for virtio if the design can only work
> > > > > for some specific setups (OSes/archs).
> > > > >
> > > > > Thanks
> > > >
> > > > Well virtio legacy has a long history of documenting existing hacks :)
> > >
> > > Exactly, so the legacy behaviour is not (or can't be) defined by the
> > > spec but by the code.
> >
> > I mean driver behaviour derives from the code but we do document it in
> > the spec to help people build devices.
> >
> >
> > > > But yes, VIRTIO_F_ORDER_PLATFORM has to be documented.
> > > > And we have to decide what to do about ACCESS_PLATFORM since
> > > > there's a security problem if device allows not acking it.
> > > > Two options:
> > > > - relax the rules a bit and say device will assume ACCESS_PLATFORM
> > > >   is acked anyway
> > >
> > > This will break legacy drivers which assume physical addresses.
> >
> > not that they are not already broken.
> 
> I may be missing something; the whole point is to allow legacy drivers to
> run, otherwise a modern device is sufficient?

yes and if legacy drivers don't work in a given setup then we
should not worry.

> >
> > > > - a new flag that is insecure (so useful for sec but useless for dpdk) but optional
> > >
> > > This looks like a new "hack" for the legacy hacks.
> >
> > it's not just for legacy.
> 
> We have the ACCESS_PLATFORM feature bit, what is the usage for this new flag?


ACCESS_PLATFORM is also a security boundary. so devices must fail
negotiation if it's not there. this new one won't be.


> >
> > > And what about ORDER_PLATFORM, I don't think we can modify legacy drivers...
> > >
> > > Thanks
> >
> > You play some tricks with shadow VQ I guess.
> 
> Do we really want to add a new feature in the virtio spec that can
> only work with the datapath mediation?
> 
> Thanks

As long as a feature is useful and can't be supported otherwise
we are out of options. Keeping field practice things out of the
spec helps no one.

> >
> > --
> > MST
> >



* [virtio-dev] Re: [virtio-comment] Re: [PATCH 07/11] transport-pci: Introduce transitional MMR device id
  2023-04-09  3:15             ` [virtio-dev] " Parav Pandit
@ 2023-04-10 10:18               ` Michael S. Tsirkin
  2023-04-10 14:34                 ` Parav Pandit
  0 siblings, 1 reply; 200+ messages in thread
From: Michael S. Tsirkin @ 2023-04-10 10:18 UTC (permalink / raw)
  To: Parav Pandit
  Cc: virtio-dev, cohuck, virtio-comment, Shahaf Shuler, Satananda Burla

On Sun, Apr 09, 2023 at 03:15:01AM +0000, Parav Pandit wrote:
> 
> > From: Michael S. Tsirkin <mst@redhat.com>
> > Sent: Friday, April 7, 2023 11:51 AM
> 
> > > 1. A non-transitional device will expose a capability (not a feature bit, but a
> > capability at transport level).
> > 
> > Note that we can allow this capability in transitional devices too.
> > This is useful since IO bar might not be enabled even if present.
> >
> This capability exposure makes a device transitional in some sense.

not in the sense spec uses it at the moment: transitional devices
are those that legacy drivers can bind to. transitional drivers
btw are those that can bind to legacy devices.

perhaps surprisingly, a transitional driver using a transitional
device does not rely on any legacy spec at all, they will
use the standard interfaces.


> > > This capability indicates that it supports the legacy interface.
> > > Let's name it legacy_if_emulation for the sake of this discussion.
> > > It is a two-way pci capability.
> > > Device reports it.
> > > And driver enables it. (Why two way and why driver needs to enable it,
> > described later in point #d below).
> > >
> > > Hence, such non transitional device does not need to comply to below listed
> > requirements #a and #b.
> > >
> > > a. A driver MUST accept VIRTIO_F_VERSION_1 if it is offered.
> > > (Because hypervisor driver is a passthrough driver; and legacy driver will not
> > accept this feature bit).
> > 
> > This is not a device requirement at all.
> > 
> Though this is written as a driver requirement, a device expects this feature bit to be negotiated.
> What should the device implementor do? It should allow the driver to not negotiate the bit, right?
> 
> Which means the below line is to be changed:
> 
> from:
> device MAY fail to operate further if VIRTIO_F_VERSION_1 is not accepted.
> 
> to:
> A non-transitional device that does not have the legacy interface capability MAY fail to operate further if V_1 is not accepted.
> A non-transitional device that has the legacy interface capability SHOULD operate further even if V_1 is not accepted.



Look nothing changes with MMR capability at all.
We currently have:

	A device MUST offer VIRTIO_F_VERSION_1.  A device MAY fail to operate further
	if VIRTIO_F_VERSION_1 is not accepted.

it's implied that this does not refer to legacy interface.


You want to clarify this adding to legacy interface section text
explaining that of course VIRTIO_F_VERSION_1 must not
be offered through that? Sure but it's a separate issue
from MMR capability. don't try to drink the ocean.





> > > b. device MAY fail to operate further if VIRTIO_F_VERSION_1 is not accepted.
> > 
> > This is optional not a requirement.
> > 
> Please see above wording, if its acceptable.


you don't need any of that for this effort, generally
VIRTIO_F_VERSION_1 thing needs a lot of work, if you
want to invest the time just ask I'll try to list the issues.

But nothing to do with memory mapped legacy interface
directly.



> > > c. A non-transitional device with above legacy_if_supported
> > > capability, will allow device reset sequence, described in [1] Driver
> > > Requirements: Device Initialization (3.1.1) [2] Legacy Interface:
> > > Device Initialization (3.1.2)
> > >
> > > > > device reset sequence.
> > > >
> > > > what is this one?
> > >
> > > I listed above in #c.
> > > And
> > >
> > > d. When legacy_if_emulation capability is offered and hypervisor driver
> > enabled it, when the driver performs device reset, the driver will not wait for device
> > reset to go to zero.
> > > When legacy_if_emulation capability is not enabled by (hypervisor or other
> > say existing) driver, the driver will wait for device reset to turn 0. (Following the
> > driver requirement 2.4.2).
> > 
> > It might not be a bad idea to enable it, but I observe that it is possible for
> > hypervisor to expose a standard transitional device on top of this MMR
> > capability. Thus it will not be known whether guest driver accesses legacy or
> > modern BAR until guest runs.
> > I propose, instead, that device exposes same registers at two addresses and
> > executes reset correctly depending on which address it was accessed through.
> > WDYT?
> Yep, this is the exact proposal here.
> Legacy registers exposed via the AQ (aka TVQ) or the MMR location behave like legacy.
> And regular registers at their location as-is.
> 
> With that, feature bit negotiation is the only thing to relax, as worded above.

It's not really different from IO port legacy then.

-- 
MST



* [virtio-dev] Re: [virtio-comment] Re: [PATCH 07/11] transport-pci: Introduce transitional MMR device id
  2023-04-10 10:18               ` [virtio-dev] " Michael S. Tsirkin
@ 2023-04-10 14:34                 ` Parav Pandit
  2023-04-10 19:58                   ` Michael S. Tsirkin
  0 siblings, 1 reply; 200+ messages in thread
From: Parav Pandit @ 2023-04-10 14:34 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: virtio-dev, cohuck, virtio-comment, Shahaf Shuler, Satananda Burla



On 4/10/2023 6:18 AM, Michael S. Tsirkin wrote:
> On Sun, Apr 09, 2023 at 03:15:01AM +0000, Parav Pandit wrote:
>>
>>> From: Michael S. Tsirkin <mst@redhat.com>
>>> Sent: Friday, April 7, 2023 11:51 AM
>>
>>>> 1. A non-transitional device will expose a capability (not a feature bit, but a
>>> capability at transport level).
>>>
>>> Note that we can allow this capability in transitional devices too.
>>> This is useful since IO bar might not be enabled even if present.
>>>
>> This capability exposure makes a device transitional in some sense.
> 
> not in the sense spec uses it at the moment: transitional devices
> are those that legacy drivers can bind to. transitional drivers
> btw are those that can bind to legacy devices.
> 
> perhaps surprisingly, a transitional driver using a transitional
> device does not rely on any legacy spec at all, they will
> use the standard interfaces.
> 
> 
>>>> This capability indicates that it supports the legacy interface.
>>>> Let's name it legacy_if_emulation for the sake of this discussion.
>>>> It is a two-way pci capability.
>>>> Device reports it.
>>>> And driver enables it. (Why two way and why driver needs to enable it,
>>> described later in point #d below).
>>>>
>>>> Hence, such non transitional device does not need to comply to below listed
>>> requirements #a and #b.
>>>>
>>>> a. A driver MUST accept VIRTIO_F_VERSION_1 if it is offered.
>>>> (Because hypervisor driver is a passthrough driver; and legacy driver will not
>>> accept this feature bit).
>>>
>>> This is not a device requirement at all.
>>>
>> Though this is written as a driver requirement, a device expects this feature bit to be negotiated.
>> What should the device implementor do? It should allow the driver to not negotiate the bit, right?
>>
>> Which means the below line is to be changed:
>>
>> from:
>> device MAY fail to operate further if VIRTIO_F_VERSION_1 is not accepted.
>>
>> to:
>> A non-transitional device that does not have the legacy interface capability MAY fail to operate further if V_1 is not accepted.
>> A non-transitional device that has the legacy interface capability SHOULD operate further even if V_1 is not accepted.
> 
> 
> 
> Look nothing changes with MMR capability at all.
> We currently have:
> 
> 	A device MUST offer VIRTIO_F_VERSION_1.  A device MAY fail to operate further
> 	if VIRTIO_F_VERSION_1 is not accepted.
> 
> it's implied that this does not refer to legacy interface.
> 
But the interface being exposed is not a legacy interface at the PCI 
device level.

A PCI device is exposing an interface that can be used
by either
a. existing non transitional driver who will negotiate _1, just fine.
or
b. by legacy driver in the guest VM, which will not negotiate _1.

And here the device must not fail to operate.
Hence the spec should say that it should not fail to operate.

> 
> You want to clarify this adding to legacy interface section text
> explaining that of course VIRTIO_F_VERSION_1 must not
> be offered through that? 
It will be offered because the hypervisor driver is not getting involved in 
the reset flow and in read/write of the feature bits etc.

The hypervisor driver is only providing the transport channel from the guest 
VM to the device.

And since the guest driver may be 1.x, _1 will be offered by the device.

> Sure but it's a separate issue
> from MMR capability. don't try to drink the ocean.
> 
> 
> 
> 
> 
>>>> b. device MAY fail to operate further if VIRTIO_F_VERSION_1 is not accepted.
>>>
>>> This is optional not a requirement.
>>>
>> Please see above wording, if its acceptable.
> 
> 
> you don't need any of that for this effort, generally
> VIRTIO_F_VERSION_1 thing needs a lot of work, if you
> want to invest the time just ask I'll try to list the issues.
> 
> But nothing to do with memory mapped legacy interface
> directly.
> 
I likely don't understand your above point.
The point is, a device with the new capability needs to

a. offer VERSION_1 and allow negotiation of VERSION_1
b. allow VERSION_1 to not be negotiated

With a single virtio device id, how do we frame this in the spec?
One way I proposed above is that a new transport capability indicates this.

I didn't follow your idea of how the above #a and #b can be worded without
the new capability wording.

> 
> 
>>>> c. A non-transitional device with above legacy_if_supported
>>>> capability, will allow device reset sequence, described in [1] Driver
>>>> Requirements: Device Initialization (3.1.1) [2] Legacy Interface:
>>>> Device Initialization (3.1.2)
>>>>
>>>>>> device reset sequence.
>>>>>
>>>>> what is this one?
>>>>
>>>> I listed above in #c.
>>>> And
>>>>
>>>> d. When legacy_if_emulation capability is offered and hypervisor driver
>>> enabled it, when the driver performs device reset, the driver will not wait for device
>>> reset to go to zero.
>>>> When legacy_if_emulation capability is not enabled by (hypervisor or other
>>> say existing) driver, the driver will wait for device reset to turn 0. (Following the
>>> driver requirement 2.4.2).
>>>
>>> It might not be a bad idea to enable it, but I observe that it is possible for
>>> hypervisor to expose a standard transitional device on top of this MMR
>>> capability. Thus it will not be known whether guest driver accesses legacy or
>>> modern BAR until guest runs.
>>> I propose, instead, that device exposes same registers at two addresses and
>>> executes reset correctly depending on which address it was accessed through.
>>> WDYT?
>> Yep, this is the exact proposal here.
>> Legacy registers exposed via the AQ (aka TVQ) or the MMR location behave like legacy.
>> And regular registers at their location as-is.
>>
>> With that, feature bit negotiation is the only thing to relax, as worded above.
> 
> It's not really different from IO port legacy then.
> 
Yes, it is no different, because what is provided is just the transport, 
not a new functional behavior.


* [virtio-dev] Re: [PATCH 08/11] transport-pci: Introduce virtio extended capability
  2023-04-04 14:37           ` Michael S. Tsirkin
@ 2023-04-10 16:21             ` Parav Pandit
  2023-04-10 19:49               ` Michael S. Tsirkin
  0 siblings, 1 reply; 200+ messages in thread
From: Parav Pandit @ 2023-04-10 16:21 UTC (permalink / raw)
  To: Michael S. Tsirkin, Cornelia Huck
  Cc: virtio-dev, virtio-comment, shahafs, Satananda Burla



On 4/4/2023 10:37 AM, Michael S. Tsirkin wrote:
> On Tue, Apr 04, 2023 at 03:19:53PM +0200, Cornelia Huck wrote:
>>>>> What does it "Co-developed-by" mean exactly that Signed-off-by
>>>>> does not?
>>>>
>>>> AIUI, "Co-developed-by:" is more akin to a second "Author:", i.e. to
>>>> give attribution. (Linux kernel rules actually state that any
>>>> "Co-developed-by:" must be followed by a "Signed-off-by:" for that
>>>> author, but we don't really do DCO here, the S-o-b is more of a
>>>> convention.)
>>>
>>>
>>> Actually, we might want that generally, to signify agreement to the IPR.
>>> How about adding this to our rules?  But in this case Satananda Burla is
>>> a TC member so yes, no problem.
>>
>> Adding a s-o-b requirement is not a bad idea... do you want to propose
>> an update?
> 
> OK how about the following:
> 
> ---------------
> The process for forwarding comments from others:
> 
> Generally, subscribing to the virtio comments mailing list
> requires agreement to the OASIS feedback license, at
> https://www.oasis-open.org/who/ipr/feedback_license.pdf
> 
It is not straightforward for a non virtio tc member to realize the above 
obligation.
It is worth documenting it in section [1].

[1] 
https://www.oasis-open.org/committees/tc_home.php?wg_abbrev=virtio#feedback

Something like,

From:
If you're interested in following ongoing development and can't join the 
committee, I recommend you subscribe to the virtio-comment and 
virtio-dev lists.

To:
If you're interested in following ongoing development and can't join the 
committee, I recommend you subscribe to the virtio-comment and 
virtio-dev lists. Your contribution to the above mailing lists is licensed 
under the OASIS feedback license agreement described at [1].

> When forwarding (in modified or unmodified form) comments from others, please
> make sure the commenter has read and agreed to the
> feedback license. If this is the case, please include the following:
> 
> License-consent: commenter's name <commenter's email>
> 
> where commenter's name <commenter's email> is a valid
> name-addr as specified in RFC 5322, see
> https://datatracker.ietf.org/doc/html/rfc5322#section-3.4
> 
> If the comment is on behalf of an organization, please use the
> organization's email address.
> 
> 
> If forwarding a comment from a TC member, please instead get consent to
> the full Virtio TC IPR, and then, as before, include
> 
> License-consent: commenter's name <commenter's email>
> 
The License-consent tag does not capture the contribution by other virtio tc 
member(s) in the generated comment.

Signed-off-by captures it.

So both tags are needed, depending on the comment contribution type.

The rest of the above change looks good to me.

> If you are reusing a comment that has already been posted to the
> TC mailing list, the above tags are not required.
> 
> ---------------
> 
> We could reuse Signed-off-by though I am a bit concerned whether
> people will assume it's a DCO thing which everyone copies.
> Thoughts?


* Re: [virtio-dev] [PATCH 08/11] transport-pci: Introduce virtio extended capability
  2023-04-10  1:36   ` [virtio-dev] " Jason Wang
  2023-04-10  6:24     ` Michael S. Tsirkin
@ 2023-04-10 17:54     ` Parav Pandit
  2023-04-10 17:58       ` [virtio-dev] RE: [virtio-comment] " Parav Pandit
  2023-04-11  3:28       ` Jason Wang
  1 sibling, 2 replies; 200+ messages in thread
From: Parav Pandit @ 2023-04-10 17:54 UTC (permalink / raw)
  To: Jason Wang
  Cc: mst, virtio-dev, cohuck, virtio-comment, shahafs, Satananda Burla

Hi Jason,

On 4/9/2023 9:36 PM, Jason Wang wrote:
> On Fri, Mar 31, 2023 at 7:00 AM Parav Pandit <parav@nvidia.com> wrote:
>>
>> PCI device configuration space for capabilities is limited to only 192
>> bytes shared by many PCI capabilities of generic PCI device and virtio
>> specific.
>>
>> Hence, introduce virtio extended capability that uses PCI Express
>> extended capability.
>> Subsequent patch uses this virtio extended capability.
>>
>> Co-developed-by: Satananda Burla <sburla@marvell.com>
>> Signed-off-by: Parav Pandit <parav@nvidia.com>
> 
> Can you explain the differences compared to what I proposed earlier?
> 
> https://www.mail-archive.com/virtio-dev@lists.oasis-open.org/msg08078.html
> 
> This can save time for everybody.
> 

What is proposed in this patch is similar to [1].

The main difference is that the proposed new capability is always placed in 
the pci extended capability section.
This is because the legacy capability section is nearly full, as 
described in the commit message.

So providing it at either of the two locations is not valuable.

What you proposed in [1] is in general useful regardless.

However, it is not backward compatible; if the device places them in 
the extended capability area, it will not work.

To make it backward compatible, a device needs to expose the existing 
structure in the legacy area, and the extended structure for the same 
capability in the extended pci capability region.

In other words, it will have to be at both places.

Otherwise it is similar.
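
A rough sketch of the backward compatible discovery order this
implies, assuming two hypothetical finders that each return 0 when the
capability is absent:

#include <stdint.h>

uint16_t find_vsec_ext_cap(void);       /* walks extended space at 0x100 */
uint16_t find_classic_vendor_cap(void); /* walks the classic list via 0x34 */

/* Prefer the extended copy; fall back to the classic list so devices
 * exposing the structure in both places keep old drivers working. */
uint16_t locate_virtio_structure(void)
{
        uint16_t off = find_vsec_ext_cap();

        return off ? off : find_classic_vendor_cap();
}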

We should do this regardless, and it will also make this series shorter, 
which is also what Michael prefers.

Would you like to join efforts with me in drafting [1] + the above 
description as an independent patch?
We may need it even sooner than this because the AQ patch is expanding 
the structure located in the legacy area.

[1] 
https://www.mail-archive.com/virtio-dev@lists.oasis-open.org/msg08078.html

PASID part of the patch is not relevant here, so will skip to comment.


* [virtio-dev] RE: [virtio-comment] Re: [virtio-dev] [PATCH 08/11] transport-pci: Introduce virtio extended capability
  2023-04-10 17:54     ` Parav Pandit
@ 2023-04-10 17:58       ` Parav Pandit
  2023-04-11  3:28       ` Jason Wang
  1 sibling, 0 replies; 200+ messages in thread
From: Parav Pandit @ 2023-04-10 17:58 UTC (permalink / raw)
  To: Parav Pandit, Jason Wang
  Cc: mst, virtio-dev, cohuck, virtio-comment, Shahaf Shuler, Satananda Burla


> From: virtio-comment@lists.oasis-open.org <virtio-comment@lists.oasis-
> open.org> On Behalf Of Parav Pandit

> We may need it even sooner than this because the AQ patch is expanding the
> structure located in the legacy area.

I mixed up the two structures. AQ related field expansion doesn’t need capability expansion.
Sorry about it.


* [virtio-dev] Re: [PATCH 08/11] transport-pci: Introduce virtio extended capability
  2023-04-10 16:21             ` Parav Pandit
@ 2023-04-10 19:49               ` Michael S. Tsirkin
  2023-04-10 19:57                 ` [virtio-dev] " Parav Pandit
  0 siblings, 1 reply; 200+ messages in thread
From: Michael S. Tsirkin @ 2023-04-10 19:49 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Cornelia Huck, virtio-dev, virtio-comment, shahafs, Satananda Burla

On Mon, Apr 10, 2023 at 12:21:56PM -0400, Parav Pandit wrote:
> 
> 
> On 4/4/2023 10:37 AM, Michael S. Tsirkin wrote:
> > On Tue, Apr 04, 2023 at 03:19:53PM +0200, Cornelia Huck wrote:
> > > > > > What does it "Co-developed-by" mean exactly that Signed-off-by
> > > > > > does not?
> > > > > 
> > > > > AIUI, "Co-developed-by:" is more akin to a second "Author:", i.e. to
> > > > > give attribution. (Linux kernel rules actually state that any
> > > > > "Co-developed-by:" must be followed by a "Signed-off-by:" for that
> > > > > author, but we don't really do DCO here, the S-o-b is more of a
> > > > > convention.)
> > > > 
> > > > 
> > > > Actually, we might want that generally, to signify agreement to the IPR.
> > > > How about adding this to our rules?  But in this case Satananda Burla is
> > > > a TC member so yes, no problem.
> > > 
> > > Adding a s-o-b requirement is not a bad idea... do you want to propose
> > > an update?
> > 
> > OK how about the following:
> > 
> > ---------------
> > The process for forwarding comments from others:
> > 
> > Generally, subscribing to the virtio comments mailing list
> > requires agreement to the OASIS feedback license, at
> > https://www.oasis-open.org/who/ipr/feedback_license.pdf
> > 
> It is not straightforward for a non virtio tc member to realize the above
> obligation.
> It is worth documenting it in section [1].
> 
> [1]
> https://www.oasis-open.org/committees/tc_home.php?wg_abbrev=virtio#feedback
> 
> Something like,
> 
> From:
> If you're interested in following ongoing development and can't join the
> committee, I recommend you subscribe to the virtio-comment and virtio-dev
> lists.
> 
> To:
> If you're interested in following ongoing development and can't join the
> committee, I recommend you subscribe to the virtio-comment and virtio-dev
> lists. Your contribution to the above mailing lists is licensed under the OASIS
> feedback license agreement described at [1].

yes. unfortunately oasis does not let us tweak this section.


> > When forwarding (in modified or unmodified form) comments from others, please
> > make sure the commenter has read and agreed to the
> > feedback license. If this is the case, please include the following:
> > 
> > License-consent: commenter's name <commenter's email>
> > 
> > where commenter's name <commenter's email> is a valid
> > name-addr as specified in RFC 5322, see
> > https://datatracker.ietf.org/doc/html/rfc5322#section-3.4
> > 
> > If the comment is on behalf of an organization, please use the
> > organization's email address.
> > 
> > 
> > If forwarding a comment from a TC member, please instead get consent to
> > the full Virtio TC IPR, and then, as before, include
> > 
> > License-consent: commenter's name <commenter's email>
> > 
> The License-consent tag does not capture the contribution by other virtio tc
> member(s) in the generated comment.
> 
> Signed-off-by captures it.
> So both tags are needed, depending on the comment contribution type.

Attribution is nice but Signed-off-by is not that.
And fundamentally people go read the PDF where the Signed-off-by
does not appear at all; no one pokes at git history.
Let's just do:

Thanks-to: name <email>

if you want attribution.

will be helpful to fill in the thanks section in the spec
and does not mention signatures.



> The rest of the above change looks good to me.
> 
> > If you are reusing a comment that has already been posted to the
> > TC mailing list, the above tags are not required.
> > 
> > ---------------
> > 
> > We could reuse Signed-off-by though I am a bit concerned whether
> > people will assume it's a DCO thing which everyone copies.
> > Thoughts?



* [virtio-dev] RE: [PATCH 08/11] transport-pci: Introduce virtio extended capability
  2023-04-10 19:49               ` Michael S. Tsirkin
@ 2023-04-10 19:57                 ` Parav Pandit
  2023-04-10 20:02                   ` [virtio-dev] " Michael S. Tsirkin
  0 siblings, 1 reply; 200+ messages in thread
From: Parav Pandit @ 2023-04-10 19:57 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Cornelia Huck, virtio-dev, virtio-comment, Shahaf Shuler,
	Satananda Burla


> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Monday, April 10, 2023 3:49 PM

> Attribution is nice but Signed-off-by is not that.

Then what is Signed-off-by for in the virtio spec?
Can it be the same definition as what the Linux kernel and many other projects use, like [1]?

[1] https://www.kernel.org/doc/html/latest/process/submitting-patches.html?highlight=signed%20off

> And fundamentally people go read the PDF where the Signed-off-by does not
> appear at all no one pokes at git history.
When people read the PDF, they do not care about the sign-off. Signed-off-by is not for that.

> Let's just do:
> 
> Thanks-to: name <email>
> 
Why learn a new term now?
Why is the terminology of [1] not enough, like the AQ status codes? :)

> if you want attribution.
> 
> will be helpful to fill in the thanks section in the spec and does not mention
> signatures.


* [virtio-dev] Re: [virtio-comment] Re: [PATCH 07/11] transport-pci: Introduce transitional MMR device id
  2023-04-10 14:34                 ` Parav Pandit
@ 2023-04-10 19:58                   ` Michael S. Tsirkin
  2023-04-10 20:16                     ` Parav Pandit
  0 siblings, 1 reply; 200+ messages in thread
From: Michael S. Tsirkin @ 2023-04-10 19:58 UTC (permalink / raw)
  To: Parav Pandit
  Cc: virtio-dev, cohuck, virtio-comment, Shahaf Shuler, Satananda Burla

On Mon, Apr 10, 2023 at 10:34:16AM -0400, Parav Pandit wrote:
> 
> 
> On 4/10/2023 6:18 AM, Michael S. Tsirkin wrote:
> > On Sun, Apr 09, 2023 at 03:15:01AM +0000, Parav Pandit wrote:
> > > 
> > > > From: Michael S. Tsirkin <mst@redhat.com>
> > > > Sent: Friday, April 7, 2023 11:51 AM
> > > 
> > > > > 1. A non-transitional device will expose a capability (not a feature bit, but a
> > > > capability at transport level).
> > > > 
> > > > Note that we can allow this capability in transitional devices too.
> > > > This is useful since IO bar might not be enabled even if present.
> > > > 
> > > This capability exposure makes a device transitional in some sense.
> > 
> > not in the sense spec uses it at the moment: transitional devices
> > are those that legacy drivers can bind to. transitional drivers
> > btw are those that can bind to legacy devices.
> > 
> > perhaps suprisingly, a transitional driver using a transitional
> > device does not rely on any legacy spec at all, they will
> > use the standard interfaces.
> > 
> > 
> > > > > This capability indicates that it supports the legacy interface.
> > > > > Let's name it legacy_if_emulation for the sake of this discussion.
> > > > > It is a two-way pci capability.
> > > > > Device reports it.
> > > > > And driver enables it. (Why two way and why driver needs to enable it,
> > > > described later in point #d below).
> > > > > 
> > > > > Hence, such non transitional device does not need to comply to below listed
> > > > requirements #a and #b.
> > > > > 
> > > > > a. A driver MUST accept VIRTIO_F_VERSION_1 if it is offered.
> > > > > (Because hypervisor driver is a passthrough driver; and legacy driver will not
> > > > accept this feature bit).
> > > > 
> > > > This is not a device requirement at all.
> > > > 
> > > Though this is written as a driver requirement, a device expects this feature bit to be negotiated.
> > > What should the device implementor do? It should allow the driver to not negotiate the bit, right?
> > > 
> > > Which means the below line is to be changed:
> > > 
> > > from:
> > > device MAY fail to operate further if VIRTIO_F_VERSION_1 is not accepted.
> > > 
> > > to:
> > > A non-transitional device that does not have the legacy interface capability MAY fail to operate further if V_1 is not accepted.
> > > A non-transitional device that has the legacy interface capability SHOULD operate further even if V_1 is not accepted.
> > 
> > 
> > 
> > Look nothing changes with MMR capability at all.
> > We currently have:
> > 
> > 	A device MUST offer VIRTIO_F_VERSION_1.  A device MAY fail to operate further
> > 	if VIRTIO_F_VERSION_1 is not accepted.
> > 
> > it's implied that this does not refer to legacy interface.
> > 
> But the interface being exposed is not a legacy interface at the PCI device
> level.
> 
> A PCI device is exposing an interface that can be used
> by either
> a. existing non transitional driver who will negotiate _1, just fine.
> or
> b. by legacy driver in the guest VM, which will not negotiate _1.
> 
> And here the device must not fail to operate.
> Hence the spec should say that it should not fail to operate.

sorry, what? it's exactly the legacy interface that's the whole
value of this hack, just exposed through another bar.

> > 
> > You want to clarify this adding to legacy interface section text
> > explaining that of course VIRTIO_F_VERSION_1 must not
> > be offered through that?
> It will be offered because the hypervisor driver is not getting involved in the
> reset flow and in read/write of the feature bits etc.
> 
> The hypervisor driver is only providing the transport channel from the guest VM
> to the device.
> 
> And since the guest driver may be 1.x, _1 will be offered by the device.

_1 is not offered in the legacy interface. And if you negotiate
_1 you better not touch legacy with a 10 foot pole.


> > Sure but it's a separate issue
> > from MMR capability. don't try to drink the ocean.
> > 
> > 
> > 
> > 
> > 
> > > > > b. device MAY fail to operate further if VIRTIO_F_VERSION_1 is not accepted.
> > > > 
> > > > This is optional not a requirement.
> > > > 
> > > Please see above wording, if its acceptable.
> > 
> > 
> > you don't need any of that for this effort; generally the
> > VIRTIO_F_VERSION_1 thing needs a lot of work, and if you
> > want to invest the time just ask and I'll try to list the issues.
> > 
> > But this has nothing to do with the memory mapped legacy interface
> > directly.
> > 
> I likely don't understand your above point.
> The point is, a device with the new capability needs to:
> 
> a. offer VERSION_1 and allow it to be negotiated
> b. allow VERSION_1 to not be negotiated
> 
> With a single virtio device id, how do we frame this in the spec?
> One way, which I proposed above, is that a new transport capability indicates this.
> 
> I didn't follow your idea of how #a and #b above can be worded without the
> new capability wording.

Just add a new capability and explain that it exposes
the legacy interface in a window at an offset inside a memory bar.
that is mostly it. if there's an adapting layer that forwards
IO requests from legacy driver to that window, this allows this
driver to work and use the device through the legacy
interface.
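
As a rough sketch, the capability could reuse the generic virtio PCI
capability layout the spec already defines, with a new cfg_type pointing
at the window (the cfg_type name and value below are made up for
illustration, not part of any patch):

#include <stdint.h>
typedef uint8_t  u8;
typedef uint32_t le32;              /* stored little-endian */

struct virtio_pci_cap {
        u8   cap_vndr;              /* generic PCI field: PCI_CAP_ID_VNDR */
        u8   cap_next;              /* generic PCI field: next capability */
        u8   cap_len;               /* generic PCI field: capability length */
        u8   cfg_type;              /* identifies the structure */
        u8   bar;                   /* which BAR holds the window */
        u8   id;                    /* multiple capabilities of same type */
        u8   padding[2];
        le32 offset;                /* offset of the legacy window within the BAR */
        le32 length;                /* length of the window, in bytes */
};

/* hypothetical new cfg_type value, for illustration only */
#define VIRTIO_PCI_CAP_LEGACY_MMR_CFG 10

An adapting layer then only needs the (bar, offset, length) triple to
know where to forward legacy accesses.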


There could be a small patch or two on top to tweak wording if there are
places where it says "non transitional devices have no legacy
interfaces". And probably an explicit list of devices
which are allowed to have this capability.


That is it really.



> > 
> > 
> > > > > c. A non-transitional device with above legacy_if_supported
> > > > > capability, will allow device reset sequence, described in [1] Driver
> > > > > Requirements: Device Initialization (3.1.1) [2] Legacy Interface:
> > > > > Device Initialization (3.1.2)
> > > > > 
> > > > > > > device reset sequence.
> > > > > > 
> > > > > > what is this one?
> > > > > 
> > > > > I listed above in #c.
> > > > > And
> > > > > 
> > > > > d. When the legacy_if_emulation capability is offered and the hypervisor driver
> > > > enabled it, when the driver performs a device reset, the driver will not wait for the
> > > > device reset to go to zero.
> > > > > When the legacy_if_emulation capability is not enabled by the (hypervisor or other,
> > > > say existing) driver, the driver will wait for the device reset to turn 0. (Following
> > > > driver requirement 2.4.2).
> > > > 
> > > > It might not be a bad idea to enable it, but I observe that it is possible for
> > > > hypervisor to expose a standard transitional device on top of this MMR
> > > > capability. Thus it will not be known whether guest driver accesses legacy or
> > > > modern BAR until guest runs.
> > > > I propose, instead, that device exposes same registers at two addresses and
> > > > executes reset correctly depending on which address it was accessed through.
> > > > WDYT?
> > > Yep, this is the exact proposal here.
> > > Legacy registers exposed via the AQ (aka TVQ) or the MMR location behave like legacy.
> > > And regular registers stay at their location as-is.
> > > 
> > > With that, feature bit negotiation is the only thing to relax, as worded above.
> > 
> > It's not really different from IO port legacy then.
> > 
> Yes, it is no different, because what is provided is just the transport, not
> a new functional behavior.

So let it be, just add stuff where capability is different.
Leave cleanups of legacy and transitional stuff for later.

-- 
MST



* [virtio-dev] Re: [PATCH 08/11] transport-pci: Introduce virtio extended capability
  2023-04-10 19:57                 ` [virtio-dev] " Parav Pandit
@ 2023-04-10 20:02                   ` Michael S. Tsirkin
  2023-04-11  8:39                     ` Cornelia Huck
  0 siblings, 1 reply; 200+ messages in thread
From: Michael S. Tsirkin @ 2023-04-10 20:02 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Cornelia Huck, virtio-dev, virtio-comment, Shahaf Shuler,
	Satananda Burla

On Mon, Apr 10, 2023 at 07:57:08PM +0000, Parav Pandit wrote:
> 
> > From: Michael S. Tsirkin <mst@redhat.com>
> > Sent: Monday, April 10, 2023 3:49 PM
> 
> > Attribution is nice but Signed-off-by is not that.
> 
> Then what is Signed-off-by for virtio spec?

we never defined it. using it kind of by 

> Can it be the same definition as what the Linux kernel and many other projects use, like [1]?
> 
> [1] https://www.kernel.org/doc/html/latest/process/submitting-patches.html?highlight=signed%20off

That is the DCO. Useful for Linux, pointless for us as-is, since
we want people to agree to our IPR.
Unless you want to make the DCO refer to our IPR.
That might be possible.
We will need to float this past OASIS.
I personally prefer explicit agreement to the license.

> > And fundamentally people go read the PDF where the Signed-off-by does not
> > appear at all, and no one pokes at git history.
> When people read the PDF, they do not care about the sign-off. Signed-off-by is not for that.
> 
> > Let's just do:
> > 
> > Thanks-to: name <email>
> > 
> Why learn a new term now?
> Why isn't the terminology of [1] enough, like the AQ status codes? :)

I have no idea what problem you are trying to address.
If it is attribution Signed-off-by is not that.
If it is IPR Signed-off-by is not that either but might be
made to imply that.

> > if you want attribution.
> > 
> > will be helpful to fill in the thanks section in the spec and does not mention
> > signatures.



* [virtio-dev] Re: [virtio-comment] Re: [PATCH 07/11] transport-pci: Introduce transitional MMR device id
  2023-04-10 19:58                   ` Michael S. Tsirkin
@ 2023-04-10 20:16                     ` Parav Pandit
  0 siblings, 0 replies; 200+ messages in thread
From: Parav Pandit @ 2023-04-10 20:16 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: virtio-dev, cohuck, virtio-comment, Shahaf Shuler, Satananda Burla


On 4/10/2023 3:58 PM, Michael S. Tsirkin wrote:
> On Mon, Apr 10, 2023 at 10:34:16AM -0400, Parav Pandit wrote:

>>> 	A device MUST offer VIRTIO_F_VERSION_1.  A device MAY fail to operate further
>>> 	if VIRTIO_F_VERSION_1 is not accepted.
>>>
>>> it's implied that this does not refer to legacy interface.
>>>
>> But the interface being exposed is not a legacy interface at the PCI device
>> level.
>>
>> A PCI device is exposing an interface that can be used
>> by either
>> a. an existing non-transitional driver, which will negotiate _1 just fine,
>> or
>> b. a legacy driver in the guest VM, which will not negotiate _1.
>>
>> And here the device must not fail to operate.
>> Hence the spec should say that it should not fail to operate.
> 
> sorry, what? it's exactly the legacy interface that's the whole
> value of this hack, just exposed through another bar.
>
I am aligned with you on this part of exposing the legacy interface via 
another bar (or a window within an existing memory bar).

What it translates to is spec wording something like the below:

A non-transitional device exposes the legacy interface using memory-mapped
registers in a BAR, or through its parent PCI device using the AQ.

Legacy common configuration registers and device-specific registers
accessed through such an interface work as the legacy interface defined in
the rest of the specification.

>> And since the guest driver may be 1.x, _1 will be offered by the device.
> 
> _1 is not offered in the legacy interface. And if you negotiate
Right.

> _1 you better not touch legacy with a 10 foot pole.
>
Right.

> Just add a new capability and explain that it exposes
> the legacy interface in a window at an offset inside a memory bar.
> that is mostly it. if there's an adapting layer that forwards
> IO requests from legacy driver to that window, this allows this
> driver to work and use the device through the legacy
> interface.
> 
ok. Something like above?

> 
> There could be a small patch or two on top to tweak wording if there are
> places where it says "non transitional devices have no legacy
> interfaces". And probably an explicit list of devices
> which are allowed to have this capability.
> 
Makes sense to me.

Effectively we will have:

1. one or two patches, as you describe in the above point, for tweaking
wording + the device list.
2. a new optional capability (in the extended area, because the legacy
area is nearly full), which
    a. indicates support for the memory mapped window, and/or
    b. indicates access via the AQ (when the AQ is ready).

3. a patch to make use of the notification region as done in patch-10.
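
To make item 2a concrete, a minimal sketch of the adapting layer idea from
Michael's earlier mail (all names hypothetical; it assumes the hypervisor
already traps the guest's I/O port accesses and has mapped the BAR window
located via the new capability):

#include <stdint.h>
#include <string.h>

static uint8_t *window;     /* mapped BAR + offset from the capability */

/* forward a trapped guest I/O port read into the legacy window;
 * real code would use fixed-width MMIO accessors, not memcpy() */
static uint32_t legacy_pio_read(uint16_t port, unsigned int size)
{
        uint32_t val = 0;

        memcpy(&val, window + port, size);
        return val;
}

static void legacy_pio_write(uint16_t port, uint32_t val, unsigned int size)
{
        memcpy(window + port, &val, size);
}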


* [virtio-dev] Re: [virtio-comment] Re: [PATCH 09/11] transport-pci: Describe PCI MMR dev config registers
  2023-04-10 10:06               ` Michael S. Tsirkin
@ 2023-04-11  2:13                 ` Jason Wang
  2023-04-11  7:04                   ` Michael S. Tsirkin
  0 siblings, 1 reply; 200+ messages in thread
From: Jason Wang @ 2023-04-11  2:13 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Parav Pandit, virtio-dev, cohuck, virtio-comment, shahafs,
	Satananda Burla

On Mon, Apr 10, 2023 at 6:06 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Mon, Apr 10, 2023 at 03:20:52PM +0800, Jason Wang wrote:
> > On Mon, Apr 10, 2023 at 2:40 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > >
> > > On Mon, Apr 10, 2023 at 02:20:16PM +0800, Jason Wang wrote:
> > > > On Mon, Apr 10, 2023 at 2:15 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > >
> > > > > On Mon, Apr 10, 2023 at 09:33:32AM +0800, Jason Wang wrote:
> > > > > > This is fine for vDPA but not for virtio if the design can only work
> > > > > > for some specific setups (OSes/archs).
> > > > > >
> > > > > > Thanks
> > > > >
> > > > > Well virtio legacy has a long history of documenting existing hacks :)
> > > >
> > > > Exactly, so the legacy behaviour is not (or can't be) defined by the
> > > > spec but the code.
> > >
> > > I mean driver behaviour derives from the code but we do document it in
> > > the spec to help people build devices.
> > >
> > >
> > > > > But yes, VIRTIO_F_ORDER_PLATFORM has to be documented.
> > > > > And we have to decide what to do about ACCESS_PLATFORM since
> > > > > there's a security problem if device allows not acking it.
> > > > > Two options:
> > > > > - relax the rules a bit and say device will assume ACCESS_PLATFORM
> > > > >   is acked anyway
> > > >
> > > > This will break legacy drivers which assume physical addresses.
> > >
> > > not that they are not already broken.
> >
> > I may miss something, the whole point is to allow legacy drivers to
> > run otherwise a modern device is sufficient?
>
> yes and if legacy drivers don't work in a given setup then we
> should not worry.
>
> > >
> > > > > - a new flag that is insecure (so useful for sec but useless for dpdk) but optional
> > > >
> > > > This looks like a new "hack" for the legacy hacks.
> > >
> > > it's not just for legacy.
> >
> > We have the ACCESS_PLATFORM feature bit, what is the usage for this new flag?
>
>
> ACCESS_PLATFORM is also a security boundary. so devices must fail
> negotiation if it's not there. this new one won't be.
>
>
> > >
> > > > And what about ORDER_PLATFORM, I don't think we can modify legacy drivers...
> > > >
> > > > Thanks
> > >
> > > You play some tricks with shadow VQ I guess.
> >
> > Do we really want to add a new feature in the virtio spec that can
> > only work with the datapath mediation?
> >
> > Thanks
>
> As long as a feature is useful and can't be supported otherwise
> we are out of options.

Probably not? Is it as simple as relaxing this:

"Transitional devices MUST expose the Legacy Interface in I/O space in BAR0."

To allow memory space.

This works for both software and hardware devices (I have handy
hardware that supports legacy virtio drivers in this way).
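
To illustrate the relaxation: the legacy register layout stays exactly as
it is (the offsets below are the pre-1.0 ones, as in Linux's legacy
virtio_pci header); only the access method changes from port I/O to MMIO.
A sketch:

#include <stdint.h>

#define VIRTIO_PCI_HOST_FEATURES   0x00    /* 32-bit, RO */
#define VIRTIO_PCI_GUEST_FEATURES  0x04    /* 32-bit, RW */
#define VIRTIO_PCI_QUEUE_PFN       0x08    /* 32-bit, RW */
#define VIRTIO_PCI_QUEUE_NUM       0x0c    /* 16-bit, RO */
#define VIRTIO_PCI_QUEUE_SEL       0x0e    /* 16-bit, RW */
#define VIRTIO_PCI_QUEUE_NOTIFY    0x10    /* 16-bit, RW */
#define VIRTIO_PCI_STATUS          0x12    /*  8-bit, RW */
#define VIRTIO_PCI_ISR             0x13    /*  8-bit, RO */

static inline uint8_t legacy_get_status(volatile uint8_t *bar)
{
        return bar[VIRTIO_PCI_STATUS];     /* was: inb(iobase + 0x12) */
}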

Thanks

> Keeping field practice things out of the
> spec helps no one.
>
> > >
> > > --
> > > MST
> > >
>



* Re: [virtio-dev] [PATCH 08/11] transport-pci: Introduce virtio extended capability
  2023-04-10 10:04         ` Michael S. Tsirkin
@ 2023-04-11  2:19           ` Jason Wang
  2023-04-11  7:00             ` Michael S. Tsirkin
  0 siblings, 1 reply; 200+ messages in thread
From: Jason Wang @ 2023-04-11  2:19 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Parav Pandit, virtio-dev, cohuck, virtio-comment, shahafs,
	Satananda Burla

On Mon, Apr 10, 2023 at 6:04 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Mon, Apr 10, 2023 at 03:16:46PM +0800, Jason Wang wrote:
> > On Mon, Apr 10, 2023 at 2:24 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > >
> > > On Mon, Apr 10, 2023 at 09:36:17AM +0800, Jason Wang wrote:
> > > > On Fri, Mar 31, 2023 at 7:00 AM Parav Pandit <parav@nvidia.com> wrote:
> > > > >
> > > > > PCI device configuration space for capabilities is limited to only 192
> > > > > bytes shared by many PCI capabilities of generic PCI device and virtio
> > > > > specific.
> > > > >
> > > > > Hence, introduce virtio extended capability that uses PCI Express
> > > > > extended capability.
> > > > > Subsequent patch uses this virtio extended capability.
> > > > >
> > > > > Co-developed-by: Satananda Burla <sburla@marvell.com>
> > > > > Signed-off-by: Parav Pandit <parav@nvidia.com>
> > > >
> > > > Can you explain the differences compared to what I proposed before?
> > > >
> > > > https://www.mail-archive.com/virtio-dev@lists.oasis-open.org/msg08078.html
> > > >
> > > > This can save time for everybody.
> > > >
> > > > Thanks
> > >
> > > BTW another advantage of extended capabilities is - these are actually
> > > cheaper to access from a VM than classic config space.
> >
> > Config space/BAR is allowed by both of the proposals or anything I missed?
> >
> > >
> > >
> > > Several points
> > > - I don't like it that yours is 32 bit. We do not need 2 variants just
> > >   make it all 64 bit
> >
> > That's fine.
> >
> > > - We need to document that if driver does not scan extended capabilities it will not find them.
> >
> > This is implicit since I remember we don't have such documentation for
> > pci capability, anything makes pcie special?
>
> yes - the fact that there are tons of existing drivers expecting
> everything in standard space.
>
>
> > >   And existing drivers do not scan them. So what is safe
> > >   to put there? vendor specific? extra access types?
> >
> > For PASID at least, since it's a PCI-E feature, vendor specific should
> > be fine. Not sure about legacy MMIO then.
> >
> > >   Can we make scanning these mandatory in future drivers? future devices?
> > >   I guess we can add a feature bit to flag that.
> >
> > For PASID, it doesn't need this, otherwise we may duplicate transport
> > specific features.
>
> i don't get it. what does PASID have to do with it?

My proposal is to allow PASID capability to be placed on top. So what
I meant is:

if the driver needs to use PASID, it needs to scan the extended capability list

So it is only used for future drivers. I think this applies to legacy
MMIO as well.
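
As a sketch of the extra scan such a driver needs (pci_cfg_read32() is a
stand-in for whatever config space accessor the platform provides; the
walk itself follows the PCIe extended capability header format):

#include <stdint.h>

uint32_t pci_cfg_read32(void *dev, unsigned int off);   /* assumed */

/* extended capabilities start at config offset 0x100; each 32-bit
 * header is ID[15:0], version[19:16], next offset[31:20] */
static unsigned int find_ext_cap(void *dev, uint16_t cap_id)
{
        unsigned int off = 0x100;

        while (off) {
                uint32_t hdr = pci_cfg_read32(dev, off);

                if (hdr == 0 || hdr == 0xffffffff)
                        break;                  /* nothing here */
                if ((hdr & 0xffff) == cap_id)
                        return off;
                off = (hdr >> 20) & 0xffc;      /* next, dword aligned */
        }
        return 0;                               /* not found */
}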

> A new feature will allow clean split at least:
> we make any new features and new devices that expect
> express capability depend on this new feature bit.
>
> > >   Is accessing these possible from bios?
> >
> > Not at least for the two use cases now PASID or legacy MMIO.
>
> can't parse english here. what does this mean?

I meant, it depends on the capability semantics. Both PASID and legacy
MMIO don't need to be accessed by BIOS. We can't change legacy BIOS to
use legacy MMIO bars.

Thanks

>
>
> > >
> > > So I like this one better as a basis - care reviewing it and adding
> > > stuff?
> >
> > There are very few differences and I will have a look.
> >
> > Thanks
> >
> > >
> > > --
> > > MST
> > >
>



* Re: [virtio-dev] [PATCH 08/11] transport-pci: Introduce virtio extended capability
  2023-04-10 17:54     ` Parav Pandit
  2023-04-10 17:58       ` [virtio-dev] RE: [virtio-comment] " Parav Pandit
@ 2023-04-11  3:28       ` Jason Wang
  2023-04-11 19:01         ` Parav Pandit
  1 sibling, 1 reply; 200+ messages in thread
From: Jason Wang @ 2023-04-11  3:28 UTC (permalink / raw)
  To: Parav Pandit
  Cc: mst, virtio-dev, cohuck, virtio-comment, shahafs, Satananda Burla

On Tue, Apr 11, 2023 at 1:54 AM Parav Pandit <parav@nvidia.com> wrote:
>
> Hi Jason,
>
> On 4/9/2023 9:36 PM, Jason Wang wrote:
> > On Fri, Mar 31, 2023 at 7:00 AM Parav Pandit <parav@nvidia.com> wrote:
> >>
> >> PCI device configuration space for capabilities is limited to only 192
> >> bytes shared by many PCI capabilities of generic PCI device and virtio
> >> specific.
> >>
> >> Hence, introduce virtio extended capability that uses PCI Express
> >> extended capability.
> >> Subsequent patch uses this virtio extended capability.
> >>
> >> Co-developed-by: Satananda Burla <sburla@marvell.com>
> >> Signed-off-by: Parav Pandit <parav@nvidia.com>
> >
> > Can you explain the differences compared to what I proposed before?
> >
> > https://www.mail-archive.com/virtio-dev@lists.oasis-open.org/msg08078.html
> >
> > This can save time for everybody.
> >
>
> What is proposed in this patch is similar to [1].
>
> The main difference is, the proposed new capability is always placed in
> the pci extended capability section.
> This is because the legacy capability section is nearly full, as
> described in the commit message.
>
> So providing it at either of the two locations is not valuable.
>
> What you proposed in [1] is in general useful regardless;
>
> However, it is not backward compatible; if the device places them in
> the extended capability, it will not work.
>

It is kind of intended since it is only used for new PCI-E features:

"
+The location of the virtio structures that depend on the PCI Express
+capability are specified using a vendor-specific extended capabilities
+on the extended capabilities list in PCI Express extended
+configuration space of the device.
"

> To make it backward compatible, a device needs to expose the existing
> structure in the legacy area, and the extended structure for the same
> capability in the extended pci capability region.
> 
> In other words, it will have to be at both places.

Then we will run out of config space again? Otherwise we need to deal
with the case where existing structures are only placed in the extended
capability area. Michael suggests adding a new feature, but the driver may
not negotiate the feature, which requires more thought.

>
> Otherwise its similar.
>
> We should do this regardless, and it will also make this series shorter
> which is also what Michael prefers.
>
> Would you like to join efforts with me in drafting [1] + the above
> description as an independent patch?

I think it should be fine but I'm not sure what it looks like.

> We may need it even sooner than this because the AQ patch is expanding
> the structure located in legacy area.

Just to make sure I understand this: assuming we have the adminq, is there
any reason a dedicated pcie ext cap is required?

Thanks

>
> [1]
> https://www.mail-archive.com/virtio-dev@lists.oasis-open.org/msg08078.html
>
> The PASID part of the patch is not relevant here, so I will skip commenting on it.
>



* Re: [virtio-dev] [PATCH 08/11] transport-pci: Introduce virtio extended capability
  2023-04-11  2:19           ` Jason Wang
@ 2023-04-11  7:00             ` Michael S. Tsirkin
  2023-04-11  9:07               ` Jason Wang
  2023-04-11 13:47               ` Parav Pandit
  0 siblings, 2 replies; 200+ messages in thread
From: Michael S. Tsirkin @ 2023-04-11  7:00 UTC (permalink / raw)
  To: Jason Wang
  Cc: Parav Pandit, virtio-dev, cohuck, virtio-comment, shahafs,
	Satananda Burla

On Tue, Apr 11, 2023 at 10:19:39AM +0800, Jason Wang wrote:
> On Mon, Apr 10, 2023 at 6:04 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Mon, Apr 10, 2023 at 03:16:46PM +0800, Jason Wang wrote:
> > > On Mon, Apr 10, 2023 at 2:24 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > >
> > > > On Mon, Apr 10, 2023 at 09:36:17AM +0800, Jason Wang wrote:
> > > > > On Fri, Mar 31, 2023 at 7:00 AM Parav Pandit <parav@nvidia.com> wrote:
> > > > > >
> > > > > > PCI device configuration space for capabilities is limited to only 192
> > > > > > bytes shared by many PCI capabilities of generic PCI device and virtio
> > > > > > specific.
> > > > > >
> > > > > > Hence, introduce virtio extended capability that uses PCI Express
> > > > > > extended capability.
> > > > > > Subsequent patch uses this virtio extended capability.
> > > > > >
> > > > > > Co-developed-by: Satananda Burla <sburla@marvell.com>
> > > > > > Signed-off-by: Parav Pandit <parav@nvidia.com>
> > > > >
> > > > > > Can you explain the differences compared to what I proposed before?
> > > > >
> > > > > https://www.mail-archive.com/virtio-dev@lists.oasis-open.org/msg08078.html
> > > > >
> > > > > This can save time for everybody.
> > > > >
> > > > > Thanks
> > > >
> > > > BTW another advantage of extended capabilities is - these are actually
> > > > cheaper to access from a VM than classic config space.
> > >
> > > Config space/BAR is allowed by both of the proposals or anything I missed?
> > >
> > > >
> > > >
> > > > Several points
> > > > - I don't like it that yours is 32 bit. We do not need 2 variants just
> > > >   make it all 64 bit
> > >
> > > That's fine.
> > >
> > > > > - We need to document that if driver does not scan extended capabilities it will not find them.
> > >
> > > This is implicit since I remember we don't have such documentation for
> > > pci capability, anything makes pcie special?
> >
> > yes - the fact that there are tons of existing drivers expecting
> > everything in standard space.
> >
> >
> > > >   And existing drivers do not scan them. So what is safe
> > > >   to put there? vendor specific? extra access types?
> > >
> > > For PASID at least, since it's a PCI-E feature, vendor specific should
> > > be fine. Not sure about legacy MMIO then.
> > >
> > > >   Can we make scanning these mandatory in future drivers? future devices?
> > > >   I guess we can add a feature bit to flag that.
> > >
> > > For PASID, it doesn't need this, otherwise we may duplicate transport
> > > specific features.
> >
> > i don't get it. what does PASID have to do with it?
> 
> My proposal is to allow PASID capability to be placed on top.

Assuming you mean a patch applied on top of this one.

> So what
> I meant is:
> 
> if the driver needs to use PASID, it needs to scan the extended capability list
> 
> So it is only used for future drivers. I think this applies to legacy
> MMIO as well.

sure

> > A new feature will allow clean split at least:
> > we make any new features and new devices that expect
> > express capability depend on this new feature bit.
> >
> > > >   Is accessing these possible from bios?
> > >
> > > Not at least for the two use cases now PASID or legacy MMIO.
> >
> > can't parse english here. what does this mean?
> 
> I meant, it depends on the capability semantics. Both PASID and legacy
> MMIO don't need to be accessed by BIOS. We can't change legacy BIOS to
> use legacy MMIO bars.
> 
> Thanks

makes sense.


now, imagine someone building a new device. if existing drivers are not
a concern, it is possible to move capabilities all to extended space. is
that possible while keeping the bios working?

> >
> >
> > > >
> > > > So I like this one better as a basis - care reviewing it and adding
> > > > stuff?
> > >
> > > There are very few differences and I will have a look.
> > >
> > > Thanks
> > >
> > > >
> > > > --
> > > > MST
> > > >
> >



* [virtio-dev] Re: [virtio-comment] Re: [PATCH 09/11] transport-pci: Describe PCI MMR dev config registers
  2023-04-11  2:13                 ` Jason Wang
@ 2023-04-11  7:04                   ` Michael S. Tsirkin
  2023-04-11  9:01                     ` Jason Wang
  0 siblings, 1 reply; 200+ messages in thread
From: Michael S. Tsirkin @ 2023-04-11  7:04 UTC (permalink / raw)
  To: Jason Wang
  Cc: Parav Pandit, virtio-dev, cohuck, virtio-comment, shahafs,
	Satananda Burla

On Tue, Apr 11, 2023 at 10:13:40AM +0800, Jason Wang wrote:
> On Mon, Apr 10, 2023 at 6:06 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Mon, Apr 10, 2023 at 03:20:52PM +0800, Jason Wang wrote:
> > > On Mon, Apr 10, 2023 at 2:40 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > >
> > > > On Mon, Apr 10, 2023 at 02:20:16PM +0800, Jason Wang wrote:
> > > > > On Mon, Apr 10, 2023 at 2:15 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > >
> > > > > > On Mon, Apr 10, 2023 at 09:33:32AM +0800, Jason Wang wrote:
> > > > > > > This is fine for vDPA but not for virtio if the design can only work
> > > > > > > for some specific setups (OSes/archs).
> > > > > > >
> > > > > > > Thanks
> > > > > >
> > > > > > Well virtio legacy has a long history of documenting existing hacks :)
> > > > >
> > > > > Exactly, so the legacy behaviour is not (or can't be) defined by the
> > > > > spec but the code.
> > > >
> > > > I mean driver behaviour derives from the code but we do document it in
> > > > the spec to help people build devices.
> > > >
> > > >
> > > > > > But yes, VIRTIO_F_ORDER_PLATFORM has to be documented.
> > > > > > And we have to decide what to do about ACCESS_PLATFORM since
> > > > > > there's a security problem if device allows not acking it.
> > > > > > Two options:
> > > > > > - relax the rules a bit and say device will assume ACCESS_PLATFORM
> > > > > >   is acked anyway
> > > > >
> > > > > This will break legacy drivers which assume physical addresses.
> > > >
> > > > not that they are not already broken.
> > >
> > > I may miss something, the whole point is to allow legacy drivers to
> > > run otherwise a modern device is sufficient?
> >
> > yes and if legacy drivers don't work in a given setup then we
> > should not worry.
> >
> > > >
> > > > > > - a new flag that is insecure (so useful for sec but useless for dpdk) but optional
> > > > >
> > > > > This looks like a new "hack" for the legacy hacks.
> > > >
> > > > it's not just for legacy.
> > >
> > > We have the ACCESS_PLATFORM feature bit, what is the usage for this new flag?
> >
> >
> > ACCESS_PLATFORM is also a security boundary. so devices must fail
> > negotiation if it's not there. this new one won't be.
> >
> >
> > > >
> > > > > And what about ORDER_PLATFORM, I don't think we can modify legacy drivers...
> > > > >
> > > > > Thanks
> > > >
> > > > You play some tricks with shadow VQ I guess.
> > >
> > > Do we really want to add a new feature in the virtio spec that can
> > > only work with the datapath mediation?
> > >
> > > Thanks
> >
> > As long as a feature is useful and can't be supported otherwise
> > we are out of options.
> 
> Probably not? Is it as simple as relaxing this:
> 
> "Transitional devices MUST expose the Legacy Interface in I/O space in BAR0."
> 
> To allow memory space.
> 
> This works for both software and hardware devices (I had a handy
> hardware that supports legacy virtio drivers in this way).
> 
> Thanks

Yes it is certainly simpler.

Question: what happens if you try to run existing windows guests or dpdk
on these? Do they crash horribly or exit gracefully?

The point of the capability is to allow using a modern device ID so such
guests will not even try to bind.
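
(To spell the ID point out: legacy and transitional drivers match only
the transitional ID range, so a modern ID keeps them from attaching.
The ID values are from the spec; the two-line table is just a sketch:)

/* vendor 0x1af4; transitional IDs are 0x1000..0x103f, modern IDs are
 * 0x1040 + device type */
#define VIRTIO_TRANS_ID_NET   0x1000    /* old net drivers bind to this */
#define VIRTIO_MODERN_ID_NET  0x1041    /* old net drivers ignore this */

A device exposing 0x1041 plus the capability is thus invisible to old
drivers, while a mediation layer can still present 0x1000 to the guest.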


> > Keeping field practice things out of the
> > spec helps no one.
> >
> > > >
> > > > --
> > > > MST
> > > >
> >



* [virtio-dev] Re: [PATCH 08/11] transport-pci: Introduce virtio extended capability
  2023-04-10 20:02                   ` [virtio-dev] " Michael S. Tsirkin
@ 2023-04-11  8:39                     ` Cornelia Huck
  0 siblings, 0 replies; 200+ messages in thread
From: Cornelia Huck @ 2023-04-11  8:39 UTC (permalink / raw)
  To: Michael S. Tsirkin, Parav Pandit
  Cc: virtio-dev, virtio-comment, Shahaf Shuler, Satananda Burla

On Mon, Apr 10 2023, "Michael S. Tsirkin" <mst@redhat.com> wrote:

> On Mon, Apr 10, 2023 at 07:57:08PM +0000, Parav Pandit wrote:
>> 
>> > From: Michael S. Tsirkin <mst@redhat.com>
>> > Sent: Monday, April 10, 2023 3:49 PM
>> 
>> > Attribution is nice but Signed-off-by is not that.
>> 
>> Then what is Signed-off-by for virtio spec?
>
> we never defined it. using it kind of by 
>
>> Can it be the same definition as what the Linux kernel and many other projects use, like [1]?
>> 
>> [1] https://www.kernel.org/doc/html/latest/process/submitting-patches.html?highlight=signed%20off
>
> That is the DCO. Useful for Linux, pointless for us as-is, since
> we want people to agree to our IPR.
> Unless you want to make the DCO refer to our IPR.
> That might be possible.
> We will need to float this past OASIS.
> I personally prefer explicit agreement to the license.

In most projects, s-o-b either means "I abide by the DCO", or "I'm
adding this because it seems to be the usual thing to do". "Agreeing to
the IPR" would be a somewhat surprising meaning, so I'd prefer to keep
IPR agreement separate as well. Not sure if we want to actually
specify what we mean by s-o-b; when I attach it while merging, I use
it mostly in the "let's record the chain" meaning. (Specifying
that we indeed use it in that sense, and not the full DCO sense, might
make sense -- or it might be overkill.)

>
>> > And fundamentally people go read the PDF where the Signed-off-by does not
>> > appear at all, and no one pokes at git history.
>> When people read the PDF, they do not care about the sign-off. Signed-off-by is not for that.
>> 
>> > Let's just do:
>> > 
>> > Thanks-to: name <email>
>> > 
>> Why learn a new term now?
>> Why isn't the terminology of [1] enough, like the AQ status codes? :)
>
> I have no idea what problem you are trying to address.
> If it is attribution Signed-off-by is not that.
> If it is IPR Signed-off-by is not that either but might be
> made to imply that.

Yes, s-o-b is distinct from attribution (and that's why the Linux kernel
requires it on top of a Co-developed-by: to satisfy the DCO, not instead
of it.) If we think that Co-developed-by: without a s-o-b might be
confusing, a Thanks-to: could be a better term (and also a broader one
meaning "I discussed this with this person, and I want to acknowledge
them".)



* [virtio-dev] Re: [virtio-comment] Re: [PATCH 09/11] transport-pci: Describe PCI MMR dev config registers
  2023-04-11  7:04                   ` Michael S. Tsirkin
@ 2023-04-11  9:01                     ` Jason Wang
       [not found]                       ` <CALBs2cXURMEzCGnULicXbsBfwnKE5cZOz=M-_hhFCXZ=Lqb9Nw@mail.gmail.com>
                                         ` (2 more replies)
  0 siblings, 3 replies; 200+ messages in thread
From: Jason Wang @ 2023-04-11  9:01 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Parav Pandit, virtio-dev, cohuck, virtio-comment, shahafs,
	Satananda Burla, Maxime Coquelin, Yan Vugenfirer

On Tue, Apr 11, 2023 at 3:04 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Tue, Apr 11, 2023 at 10:13:40AM +0800, Jason Wang wrote:
> > On Mon, Apr 10, 2023 at 6:06 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > >
> > > On Mon, Apr 10, 2023 at 03:20:52PM +0800, Jason Wang wrote:
> > > > On Mon, Apr 10, 2023 at 2:40 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > >
> > > > > On Mon, Apr 10, 2023 at 02:20:16PM +0800, Jason Wang wrote:
> > > > > > On Mon, Apr 10, 2023 at 2:15 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > > >
> > > > > > > On Mon, Apr 10, 2023 at 09:33:32AM +0800, Jason Wang wrote:
> > > > > > > > This is fine for vDPA but not for virtio if the design can only work
> > > > > > > > for some specific setups (OSes/archs).
> > > > > > > >
> > > > > > > > Thanks
> > > > > > >
> > > > > > > Well virtio legacy has a long history of documenting existing hacks :)
> > > > > >
> > > > > > Exactly, so the legacy behaviour is not (or can't be) defined by the
> > > > > > spec but the code.
> > > > >
> > > > > I mean driver behaviour derives from the code but we do document it in
> > > > > the spec to help people build devices.
> > > > >
> > > > >
> > > > > > > But yes, VIRTIO_F_ORDER_PLATFORM has to be documented.
> > > > > > > And we have to decide what to do about ACCESS_PLATFORM since
> > > > > > > there's a security problem if device allows not acking it.
> > > > > > > Two options:
> > > > > > > - relax the rules a bit and say device will assume ACCESS_PLATFORM
> > > > > > >   is acked anyway
> > > > > >
> > > > > > This will break legacy drivers which assume physical addresses.
> > > > >
> > > > > not that they are not already broken.
> > > >
> > > > I may miss something, the whole point is to allow legacy drivers to
> > > > run otherwise a modern device is sufficient?
> > >
> > > yes and if legacy drivers don't work in a given setup then we
> > > should not worry.
> > >
> > > > >
> > > > > > > - a new flag that is insecure (so useful for sec but useless for dpdk) but optional
> > > > > >
> > > > > > This looks like a new "hack" for the legacy hacks.
> > > > >
> > > > > it's not just for legacy.
> > > >
> > > > We have the ACCESS_PLATFORM feature bit, what is the usage for this new flag?
> > >
> > >
> > > ACCESS_PLATFORM is also a security boundary. so devices must fail
> > > negotiation if it's not there. this new one won't be.
> > >
> > >
> > > > >
> > > > > > And what about ORDER_PLATFORM, I don't think we can modify legacy drivers...
> > > > > >
> > > > > > Thanks
> > > > >
> > > > > You play some tricks with shadow VQ I guess.
> > > >
> > > > Do we really want to add a new feature in the virtio spec that can
> > > > only work with the datapath mediation?
> > > >
> > > > Thanks
> > >
> > > As long as a feature is useful and can't be supported otherwise
> > > we are out of options.
> >
> > Probably not? Is it as simple as relaxing this:
> >
> > "Transitional devices MUST expose the Legacy Interface in I/O space in BAR0."
> >
> > To allow memory space.
> >
> > This works for both software and hardware devices (I had a handy
> > hardware that supports legacy virtio drivers in this way).
> >
> > Thanks
>
> Yes it is certainly simpler.
>
> Question: what happens if you try to run existing windows guests or dpdk
> on these? Do they crash horribly or exit gracefully?

Haven't tried DPDK and windows. But I remember DPDK supported legacy
MMIO bars for a while.

Adding Maxime and Yan for comments here.

>
> The point of the capability is to allow using modern device ID so such
> guests will not even try to bind.

It means a mediation layer is required to use it. Then it's not an issue
for this simple relaxing any more?

An advantage of such relaxing is that a legacy driver, like an
ancient Linux version that can already do MMIO access to the legacy BAR,
can work without any mediation.

At least, if ID can help, it can be used with this as well.

Thanks



>
>
> > > Keeping field practice things out of the
> > > spec helps no one.
> > >
> > > > >
> > > > > --
> > > > > MST
> > > > >
> > >
>



* Re: [virtio-dev] [PATCH 08/11] transport-pci: Introduce virtio extended capability
  2023-04-11  7:00             ` Michael S. Tsirkin
@ 2023-04-11  9:07               ` Jason Wang
  2023-04-11 10:43                 ` Michael S. Tsirkin
                                   ` (2 more replies)
  2023-04-11 13:47               ` Parav Pandit
  1 sibling, 3 replies; 200+ messages in thread
From: Jason Wang @ 2023-04-11  9:07 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Parav Pandit, virtio-dev, cohuck, virtio-comment, shahafs,
	Satananda Burla

On Tue, Apr 11, 2023 at 3:00 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Tue, Apr 11, 2023 at 10:19:39AM +0800, Jason Wang wrote:
> > On Mon, Apr 10, 2023 at 6:04 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > >
> > > On Mon, Apr 10, 2023 at 03:16:46PM +0800, Jason Wang wrote:
> > > > On Mon, Apr 10, 2023 at 2:24 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > >
> > > > > On Mon, Apr 10, 2023 at 09:36:17AM +0800, Jason Wang wrote:
> > > > > > On Fri, Mar 31, 2023 at 7:00 AM Parav Pandit <parav@nvidia.com> wrote:
> > > > > > >
> > > > > > > PCI device configuration space for capabilities is limited to only 192
> > > > > > > bytes shared by many PCI capabilities of generic PCI device and virtio
> > > > > > > specific.
> > > > > > >
> > > > > > > Hence, introduce virtio extended capability that uses PCI Express
> > > > > > > extended capability.
> > > > > > > Subsequent patch uses this virtio extended capability.
> > > > > > >
> > > > > > > Co-developed-by: Satananda Burla <sburla@marvell.com>
> > > > > > > Signed-off-by: Parav Pandit <parav@nvidia.com>
> > > > > >
> > > > > > Can you explain the differences compared to what I proposed before?
> > > > > >
> > > > > > https://www.mail-archive.com/virtio-dev@lists.oasis-open.org/msg08078.html
> > > > > >
> > > > > > This can save time for everybody.
> > > > > >
> > > > > > Thanks
> > > > >
> > > > > BTW another advantage of extended capabilities is - these are actually
> > > > > cheaper to access from a VM than classic config space.
> > > >
> > > > Config space/BAR is allowed by both of the proposals or anything I missed?
> > > >
> > > > >
> > > > >
> > > > > Several points
> > > > > - I don't like it that yours is 32 bit. We do not need 2 variants just
> > > > >   make it all 64 bit
> > > >
> > > > That's fine.
> > > >
> > > > > - We need to document that if driver does not scan extended capabilities it will not find them.
> > > >
> > > > This is implicit since I remember we don't have such documentation for
> > > > pci capability, anything makes pcie special?
> > >
> > > yes - the fact that there are tons of existing drivers expecting
> > > everything in standard space.
> > >
> > >
> > > > >   And existing drivers do not scan them. So what is safe
> > > > >   to put there? vendor specific? extra access types?
> > > >
> > > > For PASID at least, since it's a PCI-E feature, vendor specific should
> > > > be fine. Not sure about legacy MMIO then.
> > > >
> > > > >   Can we make scanning these mandatory in future drivers? future devices?
> > > > >   I guess we can add a feature bit to flag that.
> > > >
> > > > For PASID, it doesn't need this, otherwise we may duplicate transport
> > > > specific features.
> > >
> > > i don't get it. what does PASID have to do with it?
> >
> > My proposal is to allow PASID capability to be placed on top.
>
> Assuming you mean a patch applied on top of this one.
>
> > So what
> > I meant is:
> >
> > if the driver needs to use PASID, it needs to scan the extended capability list
> >
> > So it is only used for future drivers. I think this applies to legacy
> > MMIO as well.
>
> sure
>
> > > A new feature will allow clean split at least:
> > > we make any new features and new devices that expect
> > > express capability depend on this new feature bit.
> > >
> > > > >   Is accessing these possible from bios?
> > > >
> > > > Not at least for the two use cases now PASID or legacy MMIO.
> > >
> > > can't parse english here. what does this mean?
> >
> > I meant, it depends on the capability semantics. Both PASID and legacy
> > MMIO don't need to be accessed by BIOS. We can't change legacy BIOS to
> > use legacy MMIO bars.
> >
> > Thanks
>
> makes sense.
>
>
> now, imagine someone building a new device. if existing drivers are not
> a concern, it is possible to move capabilities all to extended space. is
> that possible while keeping the bios working?

This is possible but I'm not sure it's worthwhile. What happens if the
device puts all capabilities in the extended space but the bios can't
scan there? We can place them at both but it then doesn't address the
out of space issue. Things will be easier if we allow new
features/capabilities to be placed on the extended space.

Thanks

>
> > >
> > >
> > > > >
> > > > > So I like this one better as a basis - care reviewing it and adding
> > > > > stuff?
> > > >
> > > > There are very few differences and I will have a look.
> > > >
> > > > Thanks
> > > >
> > > > >
> > > > > --
> > > > > MST
> > > > >
> > >
>



* [virtio-dev] Re: [virtio-comment] Re: [PATCH 09/11] transport-pci: Describe PCI MMR dev config registers
       [not found]                       ` <CALBs2cXURMEzCGnULicXbsBfwnKE5cZOz=M-_hhFCXZ=Lqb9Nw@mail.gmail.com>
@ 2023-04-11 10:39                         ` Michael S. Tsirkin
  2023-04-11 11:03                           ` Yan Vugenfirer
  0 siblings, 1 reply; 200+ messages in thread
From: Michael S. Tsirkin @ 2023-04-11 10:39 UTC (permalink / raw)
  To: Yan Vugenfirer
  Cc: Jason Wang, Parav Pandit, virtio-dev, cohuck, virtio-comment,
	shahafs, Satananda Burla, Maxime Coquelin

On Tue, Apr 11, 2023 at 12:19:14PM +0300, Yan Vugenfirer wrote:
> On Tue, Apr 11, 2023 at 12:02 PM Jason Wang <jasowang@redhat.com> wrote:
> >
> > On Tue, Apr 11, 2023 at 3:04 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > >
> > > On Tue, Apr 11, 2023 at 10:13:40AM +0800, Jason Wang wrote:
> > > > On Mon, Apr 10, 2023 at 6:06 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > >
> > > > > On Mon, Apr 10, 2023 at 03:20:52PM +0800, Jason Wang wrote:
> > > > > > On Mon, Apr 10, 2023 at 2:40 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > > >
> > > > > > > On Mon, Apr 10, 2023 at 02:20:16PM +0800, Jason Wang wrote:
> > > > > > > > On Mon, Apr 10, 2023 at 2:15 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > > > > >
> > > > > > > > > On Mon, Apr 10, 2023 at 09:33:32AM +0800, Jason Wang wrote:
> > > > > > > > > > This is fine for vDPA but not for virtio if the design can only work
> > > > > > > > > > for some specific setups (OSes/archs).
> > > > > > > > > >
> > > > > > > > > > Thanks
> > > > > > > > >
> > > > > > > > > Well virtio legacy has a long history of documenting existing hacks :)
> > > > > > > >
> > > > > > > > Exactly, so the legacy behaviour is not (or can't be) defined by the
> > > > > > > > spec but the code.
> > > > > > >
> > > > > > > I mean driver behaviour derives from the code but we do document it in
> > > > > > > the spec to help people build devices.
> > > > > > >
> > > > > > >
> > > > > > > > > But yes, VIRTIO_F_ORDER_PLATFORM has to be documented.
> > > > > > > > > And we have to decide what to do about ACCESS_PLATFORM since
> > > > > > > > > there's a security problem if device allows not acking it.
> > > > > > > > > Two options:
> > > > > > > > > - relax the rules a bit and say device will assume ACCESS_PLATFORM
> > > > > > > > >   is acked anyway
> > > > > > > >
> > > > > > > > This will break legacy drivers which assume physical addresses.
> > > > > > >
> > > > > > > not that they are not already broken.
> > > > > >
> > > > > > I may miss something, the whole point is to allow legacy drivers to
> > > > > > run otherwise a modern device is sufficient?
> > > > >
> > > > > yes and if legacy drivers don't work in a given setup then we
> > > > > should not worry.
> > > > >
> > > > > > >
> > > > > > > > > - a new flag that is insecure (so useful for sec but useless for dpdk) but optional
> > > > > > > >
> > > > > > > > This looks like a new "hack" for the legacy hacks.
> > > > > > >
> > > > > > > it's not just for legacy.
> > > > > >
> > > > > > We have the ACCESS_PLATFORM feature bit, what is the usage for this new flag?
> > > > >
> > > > >
> > > > > ACCESS_PLATFORM is also a security boundary. so devices must fail
> > > > > negotiation if it's not there. this new one won't be.
> > > > >
> > > > >
> > > > > > >
> > > > > > > > And what about ORDER_PLATFORM, I don't think we can modify legacy drivers...
> > > > > > > >
> > > > > > > > Thanks
> > > > > > >
> > > > > > > You play some tricks with shadow VQ I guess.
> > > > > >
> > > > > > Do we really want to add a new feature in the virtio spec that can
> > > > > > only work with the datapath mediation?
> > > > > >
> > > > > > Thanks
> > > > >
> > > > > As long as a feature is useful and can't be supported otherwise
> > > > > we are out of options.
> > > >
> > > > Probably not? Is it as simple as relaxing this:
> > > >
> > > > "Transitional devices MUST expose the Legacy Interface in I/O space in BAR0."
> > > >
> > > > To allow memory space.
> > > >
> > > > This works for both software and hardware devices (I had a handy
> > > > hardware that supports legacy virtio drivers in this way).
> > > >
> > > > Thanks
> > >
> > > Yes it is certainly simpler.
> > >
> > > Question: what happens if you try to run existing windows guests or dpdk
> > > on these? Do they crash horribly or exit gracefully?
> >
> > Haven't tried DPDK and windows. But I remember DPDK supported legacy
> > MMIO bars for a while.
> 
> Regarding Windows drivers:
> 1. We are acking VIRTIO_F_ACCESS_PLATFORM in the driver. But, if you
> remember the "ATS" issue (Windows is either not detecting it or, even
> if detected, not using it) - then actually, we are not forcing Windows
> to remap the memory because Windows fails to work with it correctly.
> 2. The current Windows drivers implementation doesn't support MMIO bars.
> We can enable the support if needed.
> 
> Best regards,
> Yan.

The question was about old legacy drivers, not modern ones.
What if they attach to a device and BAR0
is a memory BAR, not an IO bar?
Will they fail gracefully or crash?


> 
> >
> > Adding Maxime and Yan for comments here.
> >
> > >
> > > The point of the capability is to allow using modern device ID so such
> > > guests will not even try to bind.
> >
> > It means a mediation layer is required to use it. Then it's not an issue
> > for this simple relaxing any more?
> >
> > An advantage of such relaxing is that a legacy driver, like an
> > ancient Linux version that can already do MMIO access to the legacy BAR,
> > can work without any mediation.
> >
> > At least, if ID can help, it can be used with this as well.
> >
> > Thanks
> >
> >
> >
> > >
> > >
> > > > > Keeping field practice things out of the
> > > > > spec helps no one.
> > > > >
> > > > > > >
> > > > > > > --
> > > > > > > MST
> > > > > > >
> > > > >
> > >
> >



* [virtio-dev] Re: [virtio-comment] Re: [PATCH 09/11] transport-pci: Describe PCI MMR dev config registers
  2023-04-11  9:01                     ` Jason Wang
       [not found]                       ` <CALBs2cXURMEzCGnULicXbsBfwnKE5cZOz=M-_hhFCXZ=Lqb9Nw@mail.gmail.com>
@ 2023-04-11 10:42                       ` Michael S. Tsirkin
  2023-04-12  3:58                         ` Jason Wang
  2023-04-11 13:57                       ` [virtio-dev] " Parav Pandit
  2 siblings, 1 reply; 200+ messages in thread
From: Michael S. Tsirkin @ 2023-04-11 10:42 UTC (permalink / raw)
  To: Jason Wang
  Cc: Parav Pandit, virtio-dev, cohuck, virtio-comment, shahafs,
	Satananda Burla, Maxime Coquelin, Yan Vugenfirer

On Tue, Apr 11, 2023 at 05:01:59PM +0800, Jason Wang wrote:
> On Tue, Apr 11, 2023 at 3:04 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Tue, Apr 11, 2023 at 10:13:40AM +0800, Jason Wang wrote:
> > > On Mon, Apr 10, 2023 at 6:06 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > >
> > > > On Mon, Apr 10, 2023 at 03:20:52PM +0800, Jason Wang wrote:
> > > > > On Mon, Apr 10, 2023 at 2:40 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > >
> > > > > > On Mon, Apr 10, 2023 at 02:20:16PM +0800, Jason Wang wrote:
> > > > > > > On Mon, Apr 10, 2023 at 2:15 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > > > >
> > > > > > > > On Mon, Apr 10, 2023 at 09:33:32AM +0800, Jason Wang wrote:
> > > > > > > > > This is fine for vDPA but not for virtio if the design can only work
> > > > > > > > > for some specific setups (OSes/archs).
> > > > > > > > >
> > > > > > > > > Thanks
> > > > > > > >
> > > > > > > > Well virtio legacy has a long history of documenting existing hacks :)
> > > > > > >
> > > > > > > Exactly, so the legacy behaviour is not (or can't be) defined by the
> > > > > > > spec but the code.
> > > > > >
> > > > > > I mean driver behaviour derives from the code but we do document it in
> > > > > > the spec to help people build devices.
> > > > > >
> > > > > >
> > > > > > > > But yes, VIRTIO_F_ORDER_PLATFORM has to be documented.
> > > > > > > > And we have to decide what to do about ACCESS_PLATFORM since
> > > > > > > > there's a security problem if device allows not acking it.
> > > > > > > > Two options:
> > > > > > > > - relax the rules a bit and say device will assume ACCESS_PLATFORM
> > > > > > > >   is acked anyway
> > > > > > >
> > > > > > > This will break legacy drivers which assume physical addresses.
> > > > > >
> > > > > > not that they are not already broken.
> > > > >
> > > > > I may miss something, the whole point is to allow legacy drivers to
> > > > > run otherwise a modern device is sufficient?
> > > >
> > > > yes and if legacy drivers don't work in a given setup then we
> > > > should not worry.
> > > >
> > > > > >
> > > > > > > > - a new flag that is insecure (so useful for sec but useless for dpdk) but optional
> > > > > > >
> > > > > > > This looks like a new "hack" for the legacy hacks.
> > > > > >
> > > > > > it's not just for legacy.
> > > > >
> > > > > We have the ACCESS_PLATFORM feature bit, what is the usage for this new flag?
> > > >
> > > >
> > > > ACCESS_PLATFORM is also a security boundary. so devices must fail
> > > > negotiation if it's not there. this new one won't be.
> > > >
> > > >
> > > > > >
> > > > > > > And what about ORDER_PLATFORM, I don't think we can modify legacy drivers...
> > > > > > >
> > > > > > > Thanks
> > > > > >
> > > > > > You play some tricks with shadow VQ I guess.
> > > > >
> > > > > Do we really want to add a new feature in the virtio spec that can
> > > > > only work with the datapath mediation?
> > > > >
> > > > > Thanks
> > > >
> > > > As long as a feature is useful and can't be supported otherwise
> > > > we are out of options.
> > >
> > > Probably not? Is it as simple as relaxing this:
> > >
> > > "Transitional devices MUST expose the Legacy Interface in I/O space in BAR0."
> > >
> > > To allow memory space.
> > >
> > > This works for both software and hardware devices (I had a handy
> > > hardware that supports legacy virtio drivers in this way).
> > >
> > > Thanks
> >
> > Yes it is certainly simpler.
> >
> > Question: what happens if you try to run existing windows guests or dpdk
> > on these? Do they crash horribly or exit gracefully?
> 
> Haven't tried DPDK and windows. But I remember DPDK supported legacy
> MMIO bars for a while.
> 
> Adding Maxime and Yan for comments here.
> 
> >
> > The point of the capability is to allow using modern device ID so such
> > guests will not even try to bind.
> 
> It means a mediation layer is required to use it. Then it's not an issue
> for this simple relaxing any more?
> 
> An advantage of such relaxing is that a legacy driver, like an
> ancient Linux version that can already do MMIO access to the legacy BAR,
> can work without any mediation.

Yes. But the capability approach does not prevent that -
just use a transitional ID.
The disadvantage of a transitional ID is some old drivers might crash, or fail
to work in other ways. Using a modern ID with a capability
prevents old drivers from attaching. Using a capability, the system
designer is in control and can decide which drivers to use.



> At least, if ID can help, it can be used with this as well.
> 
> Thanks


I don't know what that means.

> 
> 
> >
> >
> > > > Keeping field practice things out of the
> > > > spec helps no one.
> > > >
> > > > > >
> > > > > > --
> > > > > > MST
> > > > > >
> > > >
> >



* Re: [virtio-dev] [PATCH 08/11] transport-pci: Introduce virtio extended capability
  2023-04-11  9:07               ` Jason Wang
@ 2023-04-11 10:43                 ` Michael S. Tsirkin
  2023-04-11 13:59                 ` Parav Pandit
  2023-04-11 14:11                 ` Michael S. Tsirkin
  2 siblings, 0 replies; 200+ messages in thread
From: Michael S. Tsirkin @ 2023-04-11 10:43 UTC (permalink / raw)
  To: Jason Wang
  Cc: Parav Pandit, virtio-dev, cohuck, virtio-comment, shahafs,
	Satananda Burla

On Tue, Apr 11, 2023 at 05:07:11PM +0800, Jason Wang wrote:
> On Tue, Apr 11, 2023 at 3:00 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Tue, Apr 11, 2023 at 10:19:39AM +0800, Jason Wang wrote:
> > > On Mon, Apr 10, 2023 at 6:04 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > >
> > > > On Mon, Apr 10, 2023 at 03:16:46PM +0800, Jason Wang wrote:
> > > > > On Mon, Apr 10, 2023 at 2:24 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > >
> > > > > > On Mon, Apr 10, 2023 at 09:36:17AM +0800, Jason Wang wrote:
> > > > > > > On Fri, Mar 31, 2023 at 7:00 AM Parav Pandit <parav@nvidia.com> wrote:
> > > > > > > >
> > > > > > > > PCI device configuration space for capabilities is limited to only 192
> > > > > > > > bytes shared by many PCI capabilities of generic PCI device and virtio
> > > > > > > > specific.
> > > > > > > >
> > > > > > > > Hence, introduce virtio extended capability that uses PCI Express
> > > > > > > > extended capability.
> > > > > > > > Subsequent patch uses this virtio extended capability.
> > > > > > > >
> > > > > > > > Co-developed-by: Satananda Burla <sburla@marvell.com>
> > > > > > > > Signed-off-by: Parav Pandit <parav@nvidia.com>
> > > > > > >
> > > > > > > Can you explain the differences compared to what I proposed before?
> > > > > > >
> > > > > > > https://www.mail-archive.com/virtio-dev@lists.oasis-open.org/msg08078.html
> > > > > > >
> > > > > > > This can save time for everybody.
> > > > > > >
> > > > > > > Thanks
> > > > > >
> > > > > > BTW another advantage of extended capabilities is - these are actually
> > > > > > cheaper to access from a VM than classic config space.
> > > > >
> > > > > Config space/BAR is allowed by both of the proposals or anything I missed?
> > > > >
> > > > > >
> > > > > >
> > > > > > Several points
> > > > > > - I don't like it that yours is 32 bit. We do not need 2 variants just
> > > > > >   make it all 64 bit
> > > > >
> > > > > That's fine.
> > > > >
> > > > > > - We need to document that if driver does not scan extended capbilities it will not find them.
> > > > >
> > > > > This is implicit since I remember we don't have such documentation for
> > > > > pci capability, anything makes pcie special?
> > > >
> > > > yes - the fact that there are tons of existing drivers expecting
> > > > everything in standard space.
> > > >
> > > >
> > > > > >   And existing drivers do not scan them. So what is safe
> > > > > >   to put there? vendor specific? extra access types?
> > > > >
> > > > > For PASID at least, since it's a PCI-E feature, vendor specific should
> > > > > be fine. Not sure about legacy MMIO then.
> > > > >
> > > > > >   Can we make scanning these mandatory in future drivers? future devices?
> > > > > >   I guess we can add a feature bit to flag that.
> > > > >
> > > > > For PASID, it doesn't need this, otherwise we may duplicate transport
> > > > > specific features.
> > > >
> > > > i don't get it. what does PASID have to do with it?
> > >
> > > My proposal is to allow PASID capability to be placed on top.
> >
> > Assuming you mean a patch applied on top of this one.
> >
> > > So what
> > > I meant is:
> > >
> > > if the driver needs to use PASID, it needs to scan extend capability
> > >
> > > So it is only used for future drivers. I think this applies to legacy
> > > MMIO as well.
> >
> > sure
> >
> > > > A new feature will allow clean split at least:
> > > > we make any new features and new devices that expect
> > > > express capability depend on this new feature bit.
> > > >
> > > > > >   Is accessing these possible from bios?
> > > > >
> > > > > Not at least for the two use cases now PASID or legacy MMIO.
> > > >
> > > > can't parse english here. what does this mean?
> > >
> > > I meant, it depends on the capability semantics. Both PASID and legacy
> > > MMIO don't need to be accessed by BIOS. We can't change legacy BIOS to
> > > use legacy MMIO bars.
> > >
> > > Thanks
> >
> > makes sense.
> >
> >
> > now, imagine someone building a new device. if existing drivers are not
> > a concern, it is possible to move capabilities all to extended space. is
> > that possible while keeping the bios working?
> 
> This is possible but I'm not sure it's worthwhile. What happens if the
> device puts all capabilities in the extended space but the bios can't
> scan there?

that's my question. can seabios access extended caps?

> We can place them at both but it then doesn't address the
> out of space issue. Things will be easier if we allow new
> features/capabilities to be placed on the extended space.
> 
> Thanks
> 
> >
> > > >
> > > >
> > > > > >
> > > > > > So I like this one better as a basis - care reviewing it and adding
> > > > > > stuff?
> > > > >
> > > > > There are very few differences and I will have a look.
> > > > >
> > > > > Thanks
> > > > >
> > > > > >
> > > > > > --
> > > > > > MST
> > > > > >
> > > >
> >



* [virtio-dev] Re: [virtio-comment] Re: [PATCH 09/11] transport-pci: Describe PCI MMR dev config registers
  2023-04-11 10:39                         ` Michael S. Tsirkin
@ 2023-04-11 11:03                           ` Yan Vugenfirer
  0 siblings, 0 replies; 200+ messages in thread
From: Yan Vugenfirer @ 2023-04-11 11:03 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Yan Vugenfirer, Jason Wang, Parav Pandit, virtio-dev, cohuck,
	virtio-comment, shahafs, Satananda Burla, Maxime Coquelin




> On 11 Apr 2023, at 13:39, Michael S. Tsirkin <mst@redhat.com> wrote:
> 
> On Tue, Apr 11, 2023 at 12:19:14PM +0300, Yan Vugenfirer wrote:
>>> On Tue, Apr 11, 2023 at 12:02 PM Jason Wang <jasowang@redhat.com> wrote:
>>> 
>>> On Tue, Apr 11, 2023 at 3:04 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>>>> 
>>>> On Tue, Apr 11, 2023 at 10:13:40AM +0800, Jason Wang wrote:
>>>>> On Mon, Apr 10, 2023 at 6:06 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>>>>>> 
>>>>>> On Mon, Apr 10, 2023 at 03:20:52PM +0800, Jason Wang wrote:
>>>>>>> On Mon, Apr 10, 2023 at 2:40 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>>>>>>>> 
>>>>>>>> On Mon, Apr 10, 2023 at 02:20:16PM +0800, Jason Wang wrote:
>>>>>>>>> On Mon, Apr 10, 2023 at 2:15 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>>>>>>>>>> 
>>>>>>>>>> On Mon, Apr 10, 2023 at 09:33:32AM +0800, Jason Wang wrote:
>>>>>>>>>>> This is fine for vDPA but not for virtio if the design can only work
>>>>>>>>>>> for some specific setups (OSes/archs).
>>>>>>>>>>> 
>>>>>>>>>>> Thanks
>>>>>>>>>> 
>>>>>>>>>> Well virtio legacy has a long history of documenting existing hacks :)
>>>>>>>>> 
>>>>>>>>> Exactly, so the legacy behaviour is not (or can't be) defined by the
>>>>>>>>> spec but the codes.
>>>>>>>> 
>>>>>>>> I mean driver behaviour derives from the code but we do document it in
>>>>>>>> the spec to help people build devices.
>>>>>>>> 
>>>>>>>> 
>>>>>>>>>> But yes, VIRTIO_F_ORDER_PLATFORM has to be documented.
>>>>>>>>>> And we have to decide what to do about ACCESS_PLATFORM since
>>>>>>>>>> there's a security problem if device allows not acking it.
>>>>>>>>>> Two options:
>>>>>>>>>> - relax the rules a bit and say device will assume ACCESS_PLATFORM
>>>>>>>>>>  is acked anyway
>>>>>>>>> 
>>>>>>>>> This will break legacy drivers which assume physical addresses.
>>>>>>>> 
>>>>>>>> not that they are not already broken.
>>>>>>> 
>>>>>>> I may miss something, the whole point is to allow legacy drivers to
>>>>>>> run otherwise a modern device is sufficient?
>>>>>> 
>>>>>> yes and if legacy drivers don't work in a given setup then we
>>>>>> should not worry.
>>>>>> 
>>>>>>>> 
>>>>>>>>>> - a new flag that is insecure (so useful for sec but useless for dpdk) but optional
>>>>>>>>> 
>>>>>>>>> This looks like a new "hack" for the legacy hacks.
>>>>>>>> 
>>>>>>>> it's not just for legacy.
>>>>>>> 
>>>>>>> We have the ACCESS_PLATFORM feature bit, what is the useage for this new flag?
>>>>>> 
>>>>>> 
>>>>>> ACCESS_PLATFORM is also a security boundary. so devices must fail
>>>>>> negotiation if it's not there. this new one won't be.
>>>>>> 
>>>>>> 
>>>>>>>> 
>>>>>>>>> And what about ORDER_PLATFORM, I don't think we can modify legacy drivers...
>>>>>>>>> 
>>>>>>>>> Thanks
>>>>>>>> 
>>>>>>>> You play some tricks with shadow VQ I guess.
>>>>>>> 
>>>>>>> Do we really want to add a new feature in the virtio spec that can
>>>>>>> only work with the datapath mediation?
>>>>>>> 
>>>>>>> Thanks
>>>>>> 
>>>>>> As long as a feature is useful and can't be supported otherwise
>>>>>> we are out of options.
>>>>> 
>>>>> Probably not? Is it as simple as relaxing this:
>>>>> 
>>>>> "Transitional devices MUST expose the Legacy Interface in I/O space in BAR0."
>>>>> 
>>>>> To allow memory space.
>>>>> 
>>>>> This works for both software and hardware devices (I had a handy
>>>>> hardware that supports legacy virtio drivers in this way).
>>>>> 
>>>>> Thanks
>>>> 
>>>> Yes it is certainly simpler.
>>>> 
>>>> Question: what happens if you try to run existing windows guests or dpdk
>>>> on these? Do they crash horribly or exit gracefully?
>>> 
>>> Haven't tried DPDK and windows. But I remember DPDK supported legacy
>>> MMIO bars for a while.
>> 
>> Regarding Windows drivers:
>> 1. We are acking VIRTIO_F_ACCESS_PLATFORM in the driver. But, if you
>> remember the "ATS" issue (Windows is either not detecting it or, even
>> if detected, not using it) - then actually, we are not forcing Windows
>> to remap the memory because Window fails to work with it correctly.
>> 2. Current Windows drivers implementation doesn't support MMIO bars.
>> We can enable the support if needed.
>> 
>> Best regards,
>> Yan.
> 
> The question was about old legacy drivers, not modern ones.
> What if they attach to a device and BAR0
> is a memory not an IO bar?
> Will they fail gracefully or crash?
The drivers should fail to load gracefully. There will be no crash, except when virtio-blk or virtio-scsi is used as a boot device.

> 
> 
>> 
>>> 
>>> Adding Maxime and Yan for comments here.
>>> 
>>>> 
>>>> The point of the capability is to allow using modern device ID so such
>>>> guests will not even try to bind.
>>> 
>>> It means a mediation layer is required to use. Then it's not an issue
>>> for this simple relaxing any more?
>>> 
>>> An advantage of such relaxing is that, for the legacy drivers like an
>>> ancient Linux version that can already do MMIO access to legacy BAR it
>>> can work without any mediation.
>>> 
>>> At least, if ID can help, it can be used with this as well.
>>> 
>>> Thanks
>>> 
>>> 
>>> 
>>>> 
>>>> 
>>>>>> Keeping field practice things out of the
>>>>>> spec helps no one.
>>>>>> 
>>>>>>>> 
>>>>>>>> --
>>>>>>>> MST
>>>>>>>> 
>>>>>> 
>>>> 
>>> 
> 


* Re: [virtio-dev] [PATCH 08/11] transport-pci: Introduce virtio extended capability
  2023-04-11  7:00             ` Michael S. Tsirkin
  2023-04-11  9:07               ` Jason Wang
@ 2023-04-11 13:47               ` Parav Pandit
  2023-04-11 14:02                 ` Michael S. Tsirkin
  1 sibling, 1 reply; 200+ messages in thread
From: Parav Pandit @ 2023-04-11 13:47 UTC (permalink / raw)
  To: Michael S. Tsirkin, Jason Wang
  Cc: virtio-dev, cohuck, virtio-comment, shahafs, Satananda Burla



On 4/11/2023 3:00 AM, Michael S. Tsirkin wrote:

>> I meant, it depends on the capability semantics. Both PASID and legacy
>> MMIO don't need to be accessed by BIOS. We can't change legacy BIOS to
>> use legacy MMIO bars.
>>
>> Thanks
> 
> makes sense.
> 
> 
> now, imagine someone building a new device. if existing drivers are not
> a concern, it is possible to move capabilities all to extended space. is
> that possible while keeping the bios working?
> 
Unlikely, as the BIOS looks only in the legacy section.

New capabilities like legacy (and PASID) have no backward-compat
requirement from the BIOS and drivers,
so they can start fresh.


* [virtio-dev] RE: [virtio-comment] Re: [PATCH 09/11] transport-pci: Describe PCI MMR dev config registers
  2023-04-11  9:01                     ` Jason Wang
       [not found]                       ` <CALBs2cXURMEzCGnULicXbsBfwnKE5cZOz=M-_hhFCXZ=Lqb9Nw@mail.gmail.com>
  2023-04-11 10:42                       ` Michael S. Tsirkin
@ 2023-04-11 13:57                       ` Parav Pandit
  2 siblings, 0 replies; 200+ messages in thread
From: Parav Pandit @ 2023-04-11 13:57 UTC (permalink / raw)
  To: Jason Wang, Michael S. Tsirkin
  Cc: virtio-dev, cohuck, virtio-comment, Shahaf Shuler,
	Satananda Burla, Maxime Coquelin, Yan Vugenfirer


> From: Jason Wang <jasowang@redhat.com>
> Sent: Tuesday, April 11, 2023 5:02 AM

> > The point of the capability is to allow using modern device ID so such
> > guests will not even try to bind.
> 
> It means a mediation layer is required to use. Then it's not an issue for this
> simple relaxing any more?
> 
It is desired for the best performance, because the 2-byte notification register is sandwiched by the legacy layout in the middle of the configuration registers.
So, as described in patch-10, a mediation layer will be able to use a dedicated notification region.

Secondly, PCI effectively has only 3 BARs.
It is better to reuse one of the existing BARs when possible, depending on the features the device offers.
So better not to always hard-code BAR0.

> An advantage of such relaxing is that, for the legacy drivers like an ancient Linux
> version that can already do MMIO access to legacy BAR it can work without any
> mediation.
Maybe; such an MMIO BAR mapping to user space will need a page-aligned BAR instead of 32 B or 64 B
(you asked about 4K in the other thread).
It is unlikely to perform well due to the intermixing of notification and config registers.

So letting it be driven by the notification region is better; maybe some devices can map to the same notification register in the middle of the config area.
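For context, the layout being described is the legacy virtio PCI register
block, where the 2-byte notify register sits between the queue registers and
the status/ISR bytes. A sketch of the offsets (per the legacy interface,
omitting the optional MSI-X fields):

    #include <stdint.h>

    /* Legacy virtio PCI registers (offsets from the start of the
     * legacy region).  queue_notify at offset 0x10 is wedged into the
     * middle of the block, so it cannot be given its own page-aligned
     * mapping without remapping everything around it. */
    struct virtio_legacy_regs {
        uint32_t device_features;  /* 0x00 */
        uint32_t guest_features;   /* 0x04 */
        uint32_t queue_pfn;        /* 0x08 */
        uint16_t queue_num;        /* 0x0c */
        uint16_t queue_sel;        /* 0x0e */
        uint16_t queue_notify;     /* 0x10: the 2-byte notify register */
        uint8_t  status;           /* 0x12 */
        uint8_t  isr;              /* 0x13 */
        /* device-specific config follows */
    };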


* RE: [virtio-dev] [PATCH 08/11] transport-pci: Introduce virtio extended capability
  2023-04-11  9:07               ` Jason Wang
  2023-04-11 10:43                 ` Michael S. Tsirkin
@ 2023-04-11 13:59                 ` Parav Pandit
  2023-04-11 14:11                 ` Michael S. Tsirkin
  2 siblings, 0 replies; 200+ messages in thread
From: Parav Pandit @ 2023-04-11 13:59 UTC (permalink / raw)
  To: Jason Wang, Michael S. Tsirkin
  Cc: virtio-dev, cohuck, virtio-comment, Shahaf Shuler, Satananda Burla



> From: Jason Wang <jasowang@redhat.com>
> Sent: Tuesday, April 11, 2023 5:07 AM

> This is possible but I'm not sure it's worthwhile. What happens if the device
> puts all capabilities in the extended space but the bios can't scan there? We can
> place them at both but it then doesn't address the out of space issue. Things
> will be easier if we allow new features/capabilities to be placed on the
> extended space.

+1.
The driver-side code to look in the extended PCI area for a new cap is negligible, given that extended caps are far more common outside of virtio.
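The scan being called negligible here is the standard PCIe extended capability
walk: headers start at config offset 0x100, each carrying a 16-bit ID and a
12-bit next pointer. A minimal sketch, assuming a cfg_read32() config-space
accessor (a hypothetical helper, not a real API):

    #include <stdint.h>

    #define PCI_EXT_CAP_START    0x100   /* extended caps begin here */
    #define PCI_EXT_CAP_ID_VNDR  0x000b  /* vendor-specific extended cap */

    extern uint32_t cfg_read32(uint16_t off);  /* hypothetical accessor */

    /* Header layout: bits 15:0 ID, 19:16 version, 31:20 next offset. */
    static uint16_t find_ext_cap(uint16_t cap_id)
    {
        uint16_t off = PCI_EXT_CAP_START;

        while (off) {
            uint32_t hdr = cfg_read32(off);

            if ((hdr & 0xffff) == cap_id)
                return off;
            off = (hdr >> 20) & 0xffc;  /* next pointer, dword aligned */
        }
        return 0;  /* not found */
    }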


* Re: [virtio-dev] [PATCH 08/11] transport-pci: Introduce virtio extended capability
  2023-04-11 13:47               ` Parav Pandit
@ 2023-04-11 14:02                 ` Michael S. Tsirkin
  2023-04-11 14:07                   ` [virtio-dev] RE: [virtio-comment] " Parav Pandit
  0 siblings, 1 reply; 200+ messages in thread
From: Michael S. Tsirkin @ 2023-04-11 14:02 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Jason Wang, virtio-dev, cohuck, virtio-comment, shahafs, Satananda Burla

On Tue, Apr 11, 2023 at 09:47:50AM -0400, Parav Pandit wrote:
> 
> 
> On 4/11/2023 3:00 AM, Michael S. Tsirkin wrote:
> 
> > > I meant, it depends on the capability semantics. Both PASID and legacy
> > > MMIO don't need to be accessed by BIOS. We can't change legacy BIOS to
> > > use legacy MMIO bars.
> > > 
> > > Thanks
> > 
> > makes sense.
> > 
> > 
> > now, imagine someone building a new device. if existing drivers are not
> > a concern, it is possible to move capabilities all to extended space. is
> > that possible while keeping the bios working?
> > 
> unlikely as bios is looking in the legacy section.

what do you mean by the legacy section?

> New capabilities like legacy (and pasid) has no backward compat requirement
> from bios and drivers.
> so they can start from new.

sure.

-- 
MST



* [virtio-dev] RE: [virtio-comment] Re: [virtio-dev] [PATCH 08/11] transport-pci: Introduce virtio extended capability
  2023-04-11 14:02                 ` Michael S. Tsirkin
@ 2023-04-11 14:07                   ` Parav Pandit
  2023-04-11 14:10                     ` [virtio-dev] " Michael S. Tsirkin
  0 siblings, 1 reply; 200+ messages in thread
From: Parav Pandit @ 2023-04-11 14:07 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Jason Wang, virtio-dev, cohuck, virtio-comment, Shahaf Shuler,
	Satananda Burla



> From: virtio-comment@lists.oasis-open.org <virtio-comment@lists.oasis-
> open.org> On Behalf Of Michael S. Tsirkin

> > >
> > > now, imagine someone building a new device. if existing drivers are
> > > not a concern, it is possible to move capabilities all to extended
> > > space. is that possible while keeping the bios working?
> > >
> > unlikely as bios is looking in the legacy section.
> 
> what do you mean by the legacy section?
>
The existing virtio PCI capabilities, which live in the PCI capabilities region in the first 256 bytes of the PCI config space.
 
> > New capabilities like legacy (and pasid) has no backward compat
> > requirement from bios and drivers.
> > so they can start from new.
> 
> sure.


* [virtio-dev] Re: [virtio-comment] Re: [virtio-dev] [PATCH 08/11] transport-pci: Introduce virtio extended capability
  2023-04-11 14:07                   ` [virtio-dev] RE: [virtio-comment] " Parav Pandit
@ 2023-04-11 14:10                     ` Michael S. Tsirkin
  2023-04-11 14:30                       ` [virtio-dev] " Parav Pandit
  0 siblings, 1 reply; 200+ messages in thread
From: Michael S. Tsirkin @ 2023-04-11 14:10 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Jason Wang, virtio-dev, cohuck, virtio-comment, Shahaf Shuler,
	Satananda Burla

On Tue, Apr 11, 2023 at 02:07:27PM +0000, Parav Pandit wrote:
> 
> 
> > From: virtio-comment@lists.oasis-open.org <virtio-comment@lists.oasis-
> > open.org> On Behalf Of Michael S. Tsirkin
> 
> > > >
> > > > now, imagine someone building a new device. if existing drivers are
> > > > not a concern, it is possible to move capabilities all to extended
> > > > space. is that possible while keeping the bios working?
> > > >
> > > unlikely as bios is looking in the legacy section.
> > 
> > what do you mean by the legacy section?
> >
> Existing virtio PCI capabilities which are in the PCI capabilities region in first 256 bytes of the PCI config space.

well, of course. my question was for a new, future device: can
we put all capabilities in the extended space? will
we be able to teach the BIOS to access these?

> > > New capabilities like legacy (and pasid) has no backward compat
> > > requirement from bios and drivers.
> > > so they can start from new.
> > 
> > sure.



* Re: [virtio-dev] [PATCH 08/11] transport-pci: Introduce virtio extended capability
  2023-04-11  9:07               ` Jason Wang
  2023-04-11 10:43                 ` Michael S. Tsirkin
  2023-04-11 13:59                 ` Parav Pandit
@ 2023-04-11 14:11                 ` Michael S. Tsirkin
  2 siblings, 0 replies; 200+ messages in thread
From: Michael S. Tsirkin @ 2023-04-11 14:11 UTC (permalink / raw)
  To: Jason Wang
  Cc: Parav Pandit, virtio-dev, cohuck, virtio-comment, shahafs,
	Satananda Burla

On Tue, Apr 11, 2023 at 05:07:11PM +0800, Jason Wang wrote:
> > now, imagine someone building a new device. if existing drivers are not
> > a concern, it is possible to move capabilities all to extended space. is
> > that possible while keeping the bios working?
> 
> This is possible but I'm not sure it's worthwhile. What happens if the
> device puts all capabilities in the extended space but the bios can't
> scan there?

that's my question. why can't bios scan there?

-- 
MST



* [virtio-dev] RE: [virtio-comment] Re: [virtio-dev] [PATCH 08/11] transport-pci: Introduce virtio extended capability
  2023-04-11 14:10                     ` [virtio-dev] " Michael S. Tsirkin
@ 2023-04-11 14:30                       ` Parav Pandit
  0 siblings, 0 replies; 200+ messages in thread
From: Parav Pandit @ 2023-04-11 14:30 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Jason Wang, virtio-dev, cohuck, virtio-comment, Shahaf Shuler,
	Satananda Burla



> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Tuesday, April 11, 2023 10:10 AM
> 
> well, of course, my question was for a new, future device, can we put all
> capabilities in the extended space? will we be able to teach bios to access
> these?
I cannot speak for BIOS vendors,
but on the PCI spec side, extended capabilities have existed for more than a decade.
ACS, SR-IOV, etc. are part of the extended space.

We still need to teach the BIOS where to look for the new device.



* RE: [virtio-dev] [PATCH 08/11] transport-pci: Introduce virtio extended capability
  2023-04-11  3:28       ` Jason Wang
@ 2023-04-11 19:01         ` Parav Pandit
  2023-04-11 21:25           ` Michael S. Tsirkin
  2023-04-12  4:04           ` Jason Wang
  0 siblings, 2 replies; 200+ messages in thread
From: Parav Pandit @ 2023-04-11 19:01 UTC (permalink / raw)
  To: Jason Wang
  Cc: mst, virtio-dev, cohuck, virtio-comment, Shahaf Shuler, Satananda Burla


> From: virtio-dev@lists.oasis-open.org <virtio-dev@lists.oasis-open.org> On
> Behalf Of Jason Wang
> Sent: Monday, April 10, 2023 11:29 PM

> > However, it is not backward compatible, if the device place them in
> > extended capability, it will not work.
> >
> 
> It is kind of intended since it is only used for new PCI-E features:
> 
New fields in the new extended PCI cap area are fine.
Migrating old fields to be present in the new extended PCI cap is not your intention, right?

> "
> +The location of the virtio structures that depend on the PCI Express
> +capability are specified using a vendor-specific extended capabilities
> +on the extended capabilities list in PCI Express extended configuration
> +space of the device.
> "
> 
> > To make it backward compatible, a device needs to expose existing
> > structure in legacy area. And extended structure for same capability
> > in extended pci capability region.
> >
> > In other words, it will have to be a both places.
> 
> Then we will run out of config space again? 
No.
Only the currently defined caps need to be placed in two places.
New fields don't need to be placed in the PCI cap, because no driver is looking there.

We have probably already discussed this in a previous email by now.

> Otherwise we need to deal with the
> case when existing structures were only placed at extended capability. Michael
> suggest to add a new feature, but the driver may not negotiate the feature
> which requires more thought.
> 
Not sure I understand the feature bit idea.
The existence of PCI transport fields is usually not dependent on the upper-layer protocol.

> > We may need it even sooner than this because the AQ patch is expanding
> > the structure located in legacy area.
> 
> Just to make sure I understand this, assuming we have adminq, any reason a
> dedicated pcie ext cap is required?
> 
No, that was short-sighted of me. I responded right after the text above that the AQ doesn't need the cap extension.


* Re: [virtio-dev] [PATCH 08/11] transport-pci: Introduce virtio extended capability
  2023-04-11 19:01         ` Parav Pandit
@ 2023-04-11 21:25           ` Michael S. Tsirkin
  2023-04-12  0:40             ` Parav Pandit
  2023-04-12  4:07             ` Jason Wang
  2023-04-12  4:04           ` Jason Wang
  1 sibling, 2 replies; 200+ messages in thread
From: Michael S. Tsirkin @ 2023-04-11 21:25 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Jason Wang, virtio-dev, cohuck, virtio-comment, Shahaf Shuler,
	Satananda Burla

On Tue, Apr 11, 2023 at 07:01:16PM +0000, Parav Pandit wrote:
> 
> > From: virtio-dev@lists.oasis-open.org <virtio-dev@lists.oasis-open.org> On
> > Behalf Of Jason Wang
> > Sent: Monday, April 10, 2023 11:29 PM
> 
> > > However, it is not backward compatible, if the device place them in
> > > extended capability, it will not work.
> > >
> > 
> > It is kind of intended since it is only used for new PCI-E features:
> > 
> New fields in new extended pci cap area is fine.
> Migrating old fields to be present in the new extended pci cap, is not your intention. Right?
> 
> > "
> > +The location of the virtio structures that depend on the PCI Express
> > +capability are specified using a vendor-specific extended capabilities
> > +on the extended capabilities list in PCI Express extended configuration
> > +space of the device.
> > "
> > 
> > > To make it backward compatible, a device needs to expose existing
> > > structure in legacy area. And extended structure for same capability
> > > in extended pci capability region.
> > >
> > > In other words, it will have to be a both places.
> > 
> > Then we will run out of config space again? 
> No. 
> Only currently defined caps to be placed in two places.
> New fields don’t need to be placed in PCI cap, because no driver is looking there.
> 
> We probably already discussed this in previous email by now.
> 
> > Otherwise we need to deal with the
> > case when existing structures were only placed at extended capability. Michael
> > suggest to add a new feature, but the driver may not negotiate the feature
> > which requires more thought.
> > 
> Not sure I understand feature bit.

This is because we have a concept of dependency between
features but not a concept of dependency of a feature on a
capability.

> PCI transport fields existence is usually not dependent on upper layer protocol.
> 
> > > We may need it even sooner than this because the AQ patch is expanding
> > > the structure located in legacy area.
> > 
> > Just to make sure I understand this, assuming we have adminq, any reason a
> > dedicated pcie ext cap is required?
> > 
> No. it was my short sight. I responded right after above text that AQ doesn’t need cap extension.



You know, thinking about this, I begin to feel that we should
require that if at least one extended config exists then
all caps present in the regular config are *also*
mirrored in the extended config. IOW extended >= regular.
The reason is that extended config can be emulated more efficiently
(2x fewer exits).
WDYT?


-- 
MST
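Presumably (an assumption here; the thread does not spell it out) the 2x comes
from how config space is reached: a legacy 0xCF8/0xCFC config read traps
twice, once for the address write and once for the data read, while an
ECAM-mapped read traps once. A sketch of the difference, with outl()/inl()
declared as assumed port-I/O helpers:

    #include <stdint.h>

    extern void outl(uint32_t value, uint16_t port);  /* assumed helpers */
    extern uint32_t inl(uint16_t port);

    /* Legacy mechanism: two trapped accesses per config read. */
    static uint32_t cfg_read_legacy(uint32_t addr)
    {
        outl(0x80000000u | addr, 0xcf8);  /* exit 1: program the address */
        return inl(0xcfc);                /* exit 2: read the data */
    }

    /* ECAM: config space is memory-mapped, so one access suffices. */
    static uint32_t cfg_read_ecam(volatile uint32_t *ecam, uint32_t addr)
    {
        return ecam[addr / 4];            /* single exit: one MMIO read */
    }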



* RE: [virtio-dev] [PATCH 08/11] transport-pci: Introduce virtio extended capability
  2023-04-11 21:25           ` Michael S. Tsirkin
@ 2023-04-12  0:40             ` Parav Pandit
  2023-04-12  2:56               ` Michael S. Tsirkin
  2023-04-12  4:07             ` Jason Wang
  1 sibling, 1 reply; 200+ messages in thread
From: Parav Pandit @ 2023-04-12  0:40 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Jason Wang, virtio-dev, cohuck, virtio-comment, Shahaf Shuler,
	Satananda Burla

> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Tuesday, April 11, 2023 5:25 PM
> 
> On Tue, Apr 11, 2023 at 07:01:16PM +0000, Parav Pandit wrote:
> >
> > > From: virtio-dev@lists.oasis-open.org
> > > <virtio-dev@lists.oasis-open.org> On Behalf Of Jason Wang
> > > Sent: Monday, April 10, 2023 11:29 PM
> >
> > > > However, it is not backward compatible, if the device place them
> > > > in extended capability, it will not work.
> > > >
> > >
> > > It is kind of intended since it is only used for new PCI-E features:
> > >
> > New fields in new extended pci cap area is fine.
> > Migrating old fields to be present in the new extended pci cap, is not your
> intention. Right?
> >
> > > "
> > > +The location of the virtio structures that depend on the PCI
> > > +Express capability are specified using a vendor-specific extended
> > > +capabilities on the extended capabilities list in PCI Express
> > > +extended configuration space of the device.
> > > "
> > >
> > > > To make it backward compatible, a device needs to expose existing
> > > > structure in legacy area. And extended structure for same
> > > > capability in extended pci capability region.
> > > >
> > > > In other words, it will have to be a both places.
> > >
> > > Then we will run out of config space again?
> > No.
> > Only currently defined caps to be placed in two places.
> > New fields don’t need to be placed in PCI cap, because no driver is looking
> there.
> >
> > We probably already discussed this in previous email by now.
> >
> > > Otherwise we need to deal with the
> > > case when existing structures were only placed at extended
> > > capability. Michael suggest to add a new feature, but the driver may
> > > not negotiate the feature which requires more thought.
> > >
> > Not sure I understand feature bit.
> 
> This is because we have a concept of dependency between features but not a
> concept of dependency of feature on capability.
> 
A feature bit is accessible only after virtio-level initialization is complete.
Virtio-level initialization depends on the PCI-layer transport attributes.
So PCI-level attributes should exist regardless of upper-layer initialization.
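To make the ordering concrete: transport capabilities are discovered in the
very first probe step, long before the status dance that ends in FEATURES_OK,
so their presence cannot be gated on a feature bit. A rough sketch of the
order (all names below are illustrative stubs, not a real API):

    struct virtio_dev;  /* opaque device handle */

    /* Illustrative stubs: */
    void scan_pci_capabilities(struct virtio_dev *d); /* transport discovery */
    void reset_and_ack(struct virtio_dev *d);   /* status 0, then ACK|DRIVER */
    int  negotiate_features(struct virtio_dev *d);   /* ends in FEATURES_OK */
    void finish_init(struct virtio_dev *d);     /* vqs, then DRIVER_OK */

    static int probe(struct virtio_dev *d)
    {
        scan_pci_capabilities(d);   /* 1: PCI level, no features yet */
        reset_and_ack(d);           /* 2 */
        if (negotiate_features(d))  /* 3: first point features exist */
            return -1;
        finish_init(d);             /* 4 */
        return 0;
    }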

> > PCI transport fields existence is usually not dependent on upper layer
> protocol.
> >
> > > > We may need it even sooner than this because the AQ patch is
> > > > expanding the structure located in legacy area.
> > >
> > > Just to make sure I understand this, assuming we have adminq, any
> > > reason a dedicated pcie ext cap is required?
> > >
> > No. it was my short sight. I responded right after above text that AQ doesn’t
> need cap extension.
> 
> 
> 
> You know, thinking about this, I begin to feel that we should require that if at
> least one extended config exists then all caps present in the regular config are
> *also* mirrored in the extended config. IOW extended >= regular.
Fair, but why is mirroring a requirement?
Is the point below to improve performance?
If so, it is fine, but I guess it is optional.

> The reason is that extended config can be emulated more efficiently (2x less
> exits).
> WDYT?
> 
> 
> --
> MST



* Re: [virtio-dev] [PATCH 08/11] transport-pci: Introduce virtio extended capability
  2023-04-12  0:40             ` Parav Pandit
@ 2023-04-12  2:56               ` Michael S. Tsirkin
  0 siblings, 0 replies; 200+ messages in thread
From: Michael S. Tsirkin @ 2023-04-12  2:56 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Jason Wang, virtio-dev, cohuck, virtio-comment, Shahaf Shuler,
	Satananda Burla

On Wed, Apr 12, 2023 at 12:40:53AM +0000, Parav Pandit wrote:
> > From: Michael S. Tsirkin <mst@redhat.com>
> > Sent: Tuesday, April 11, 2023 5:25 PM
> > 
> > On Tue, Apr 11, 2023 at 07:01:16PM +0000, Parav Pandit wrote:
> > >
> > > > From: virtio-dev@lists.oasis-open.org
> > > > <virtio-dev@lists.oasis-open.org> On Behalf Of Jason Wang
> > > > Sent: Monday, April 10, 2023 11:29 PM
> > >
> > > > > However, it is not backward compatible, if the device place them
> > > > > in extended capability, it will not work.
> > > > >
> > > >
> > > > It is kind of intended since it is only used for new PCI-E features:
> > > >
> > > New fields in new extended pci cap area is fine.
> > > Migrating old fields to be present in the new extended pci cap, is not your
> > intention. Right?
> > >
> > > > "
> > > > +The location of the virtio structures that depend on the PCI
> > > > +Express capability are specified using a vendor-specific extended
> > > > +capabilities on the extended capabilities list in PCI Express
> > > > +extended configuration space of the device.
> > > > "
> > > >
> > > > > To make it backward compatible, a device needs to expose existing
> > > > > structure in legacy area. And extended structure for same
> > > > > capability in extended pci capability region.
> > > > >
> > > > > In other words, it will have to be a both places.
> > > >
> > > > Then we will run out of config space again?
> > > No.
> > > Only currently defined caps to be placed in two places.
> > > New fields don’t need to be placed in PCI cap, because no driver is looking
> > there.
> > >
> > > We probably already discussed this in previous email by now.
> > >
> > > > Otherwise we need to deal with the
> > > > case when existing structures were only placed at extended
> > > > capability. Michael suggest to add a new feature, but the driver may
> > > > not negotiate the feature which requires more thought.
> > > >
> > > Not sure I understand feature bit.
> > 
> > This is because we have a concept of dependency between features but not a
> > concept of dependency of feature on capability.
> > 
> A feature bit is accessible after virtio level initialization is complete.
> Virtio level initialization depends on the PCI layer transport attributes.
> So pci level attributes should exists regardless of upper layer initialization.

sure

> > > PCI transport fields existence is usually not dependent on upper layer
> > protocol.
> > >
> > > > > We may need it even sooner than this because the AQ patch is
> > > > > expanding the structure located in legacy area.
> > > >
> > > > Just to make sure I understand this, assuming we have adminq, any
> > > > reason a dedicated pcie ext cap is required?
> > > >
> > > No. it was my short sight. I responded right after above text that AQ doesn’t
> > need cap extension.
> > 
> > 
> > 
> > You know, thinking about this, I begin to feel that we should require that if at
> > least one extended config exists then all caps present in the regular config are
> > *also* mirrored in the extended config. IOW extended >= regular.
> Fair but why is this a requirement to mirror?
> Is the below one to improve the perf?

yes

> If so, it is fine, but I guess it is optional.

It cannot be optional; otherwise the guest has to scan the legacy capability list too,
losing the perf advantage :).

> > The reason is that extended config can be emulated more efficiently (2x less
> > exits).
> > WDYT?
> > 
> > 
> > --
> > MST
> 



* [virtio-dev] Re: [virtio-comment] Re: [PATCH 09/11] transport-pci: Describe PCI MMR dev config registers
  2023-04-11 10:42                       ` Michael S. Tsirkin
@ 2023-04-12  3:58                         ` Jason Wang
  2023-04-12  4:15                           ` Michael S. Tsirkin
  0 siblings, 1 reply; 200+ messages in thread
From: Jason Wang @ 2023-04-12  3:58 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Parav Pandit, virtio-dev, cohuck, virtio-comment, shahafs,
	Satananda Burla, Maxime Coquelin, Yan Vugenfirer

On Tue, Apr 11, 2023 at 6:42 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Tue, Apr 11, 2023 at 05:01:59PM +0800, Jason Wang wrote:
> > On Tue, Apr 11, 2023 at 3:04 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > >
> > > On Tue, Apr 11, 2023 at 10:13:40AM +0800, Jason Wang wrote:
> > > > On Mon, Apr 10, 2023 at 6:06 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > >
> > > > > On Mon, Apr 10, 2023 at 03:20:52PM +0800, Jason Wang wrote:
> > > > > > On Mon, Apr 10, 2023 at 2:40 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > > >
> > > > > > > On Mon, Apr 10, 2023 at 02:20:16PM +0800, Jason Wang wrote:
> > > > > > > > On Mon, Apr 10, 2023 at 2:15 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > > > > >
> > > > > > > > > On Mon, Apr 10, 2023 at 09:33:32AM +0800, Jason Wang wrote:
> > > > > > > > > > This is fine for vDPA but not for virtio if the design can only work
> > > > > > > > > > for some specific setups (OSes/archs).
> > > > > > > > > >
> > > > > > > > > > Thanks
> > > > > > > > >
> > > > > > > > > Well virtio legacy has a long history of documenting existing hacks :)
> > > > > > > >
> > > > > > > > Exactly, so the legacy behaviour is not (or can't be) defined by the
> > > > > > > > spec but the codes.
> > > > > > >
> > > > > > > I mean driver behaviour derives from the code but we do document it in
> > > > > > > the spec to help people build devices.
> > > > > > >
> > > > > > >
> > > > > > > > > But yes, VIRTIO_F_ORDER_PLATFORM has to be documented.
> > > > > > > > > And we have to decide what to do about ACCESS_PLATFORM since
> > > > > > > > > there's a security problem if device allows not acking it.
> > > > > > > > > Two options:
> > > > > > > > > - relax the rules a bit and say device will assume ACCESS_PLATFORM
> > > > > > > > >   is acked anyway
> > > > > > > >
> > > > > > > > This will break legacy drivers which assume physical addresses.
> > > > > > >
> > > > > > > not that they are not already broken.
> > > > > >
> > > > > > I may miss something, the whole point is to allow legacy drivers to
> > > > > > run otherwise a modern device is sufficient?
> > > > >
> > > > > yes and if legacy drivers don't work in a given setup then we
> > > > > should not worry.
> > > > >
> > > > > > >
> > > > > > > > > - a new flag that is insecure (so useful for sec but useless for dpdk) but optional
> > > > > > > >
> > > > > > > > This looks like a new "hack" for the legacy hacks.
> > > > > > >
> > > > > > > it's not just for legacy.
> > > > > >
> > > > > > We have the ACCESS_PLATFORM feature bit, what is the useage for this new flag?
> > > > >
> > > > >
> > > > > ACCESS_PLATFORM is also a security boundary. so devices must fail
> > > > > negotiation if it's not there. this new one won't be.
> > > > >
> > > > >
> > > > > > >
> > > > > > > > And what about ORDER_PLATFORM, I don't think we can modify legacy drivers...
> > > > > > > >
> > > > > > > > Thanks
> > > > > > >
> > > > > > > You play some tricks with shadow VQ I guess.
> > > > > >
> > > > > > Do we really want to add a new feature in the virtio spec that can
> > > > > > only work with the datapath mediation?
> > > > > >
> > > > > > Thanks
> > > > >
> > > > > As long as a feature is useful and can't be supported otherwise
> > > > > we are out of options.
> > > >
> > > > Probably not? Is it as simple as relaxing this:
> > > >
> > > > "Transitional devices MUST expose the Legacy Interface in I/O space in BAR0."
> > > >
> > > > To allow memory space.
> > > >
> > > > This works for both software and hardware devices (I had a handy
> > > > hardware that supports legacy virtio drivers in this way).
> > > >
> > > > Thanks
> > >
> > > Yes it is certainly simpler.
> > >
> > > Question: what happens if you try to run existing windows guests or dpdk
> > > on these? Do they crash horribly or exit gracefully?
> >
> > Haven't tried DPDK and windows. But I remember DPDK supported legacy
> > MMIO bars for a while.
> >
> > Adding Maxime and Yan for comments here.
> >
> > >
> > > The point of the capability is to allow using modern device ID so such
> > > guests will not even try to bind.
> >
> > It means a mediation layer is required to use. Then it's not an issue
> > for this simple relaxing any more?
> >
> > An advantage of such relaxing is that, for the legacy drivers like an
> > ancient Linux version that can already do MMIO access to legacy BAR it
> > can work without any mediation.
>
> Yes. But the capability approach does not prevent that -
> just use a transitional ID.
> The disadvantage of a transitional ID is some old drivers might crash, or fail
> to work in other ways.

If the driver is written correctly, it should fail gracefully when it can't
request legacy I/O regions.
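In Linux terms the graceful path is cheap; a minimal sketch of a probe that
declines a device whose BAR0 is not an I/O port region, using the stock PCI
helpers (error handling and pci_enable_device() elided):

    #include <linux/pci.h>

    /* Sketch: a correctly written legacy driver checks that BAR0 really
     * is an I/O port region before claiming it, and declines the device
     * cleanly otherwise instead of crashing. */
    static int legacy_probe(struct pci_dev *pdev,
                            const struct pci_device_id *id)
    {
        if (!(pci_resource_flags(pdev, 0) & IORESOURCE_IO))
            return -ENODEV;  /* memory BAR0: not ours, bail out */

        if (pci_request_region(pdev, 0, "virtio-legacy"))
            return -EBUSY;   /* region already claimed elsewhere */

        /* ... legacy I/O port setup would continue here ... */
        return 0;
    }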

> Using a modern ID with a capability
> prevents old drivers from attaching.

Just to make sure we are on the same page.

It looks like the motivation for this series is to allow the hypervisor to
mediate between the legacy MMIO BAR and the legacy driver. This ends up with a
design that only works for a specific use case (virtualization) and
specific setups (e.g. using Linux as a hypervisor). By definition, it's
not a transitional device. What's more important, if it is only used
for virtualization, a legacy MMIO BAR is not a must. We can present a
legacy device on top of a modern device by using the necessary mediation
technologies, as discussed.

By relaxing BAR0 to allow MMIO, we get a much broader set of use cases. By
definition, it's a transitional device. It's useful for both bare
metal and virtualization. We don't need to worry about the buggy drivers, or at
least fixing those drivers (I mean failing when there is no I/O BAR) is not hard.

> Using a capability the system
> designed is in control and can decide which drivers to use.

Do we allow a virtio device with only this capability? If yes, this
seems like it can mislead vendors. If not, we already have modern
interfaces, so I don't see the value of adding that to the current Linux
drivers. The only possible use case is doing mediation in the
hypervisor, like QEMU, not in the driver.

>
>
>
> > At least, if ID can help, it can be used with this as well.
> >
> > Thanks
>
>
> I don't know what that means.

I mean having a dedicated ID for this relaxation (though I don't see too
much value in it).

Thanks

>
> >
> >
> > >
> > >
> > > > > Keeping field practice things out of the
> > > > > spec helps no one.
> > > > >
> > > > > > >
> > > > > > > --
> > > > > > > MST
> > > > > > >
> > > > >
> > >
>



* Re: [virtio-dev] [PATCH 08/11] transport-pci: Introduce virtio extended capability
  2023-04-11 19:01         ` Parav Pandit
  2023-04-11 21:25           ` Michael S. Tsirkin
@ 2023-04-12  4:04           ` Jason Wang
  2023-04-12  4:13             ` Parav Pandit
  2023-04-12  4:20             ` Michael S. Tsirkin
  1 sibling, 2 replies; 200+ messages in thread
From: Jason Wang @ 2023-04-12  4:04 UTC (permalink / raw)
  To: Parav Pandit
  Cc: mst, virtio-dev, cohuck, virtio-comment, Shahaf Shuler, Satananda Burla

On Wed, Apr 12, 2023 at 3:01 AM Parav Pandit <parav@nvidia.com> wrote:
>
>
> > From: virtio-dev@lists.oasis-open.org <virtio-dev@lists.oasis-open.org> On
> > Behalf Of Jason Wang
> > Sent: Monday, April 10, 2023 11:29 PM
>
> > > However, it is not backward compatible, if the device place them in
> > > extended capability, it will not work.
> > >
> >
> > It is kind of intended since it is only used for new PCI-E features:
> >
> New fields in new extended pci cap area is fine.
> Migrating old fields to be present in the new extended pci cap, is not your intention. Right?

Right, but what I want to say is that such a migration may cause unnecessary
problems. And I don't see why it is a must for your legacy MMIO BAR
proposal.

>
> > "
> > +The location of the virtio structures that depend on the PCI Express
> > +capability are specified using a vendor-specific extended capabilities
> > +on the extended capabilities list in PCI Express extended configuration
> > +space of the device.
> > "
> >
> > > To make it backward compatible, a device needs to expose existing
> > > structure in legacy area. And extended structure for same capability
> > > in extended pci capability region.
> > >
> > > In other words, it will have to be a both places.
> >
> > Then we will run out of config space again?
> No.
> Only currently defined caps to be placed in two places.

What's the advantage of doing this?

New drivers should provide backward compatibility, so they must scan the PCI cap.
Old drivers can only scan the PCI cap.

> New fields don’t need to be placed in PCI cap, because no driver is looking there.

It would be much simpler if we forbade placing new fields in the
PCI cap; it is already out of space.

Thanks

>
> We probably already discussed this in previous email by now.
>
> > Otherwise we need to deal with the
> > case when existing structures were only placed at extended capability. Michael
> > suggest to add a new feature, but the driver may not negotiate the feature
> > which requires more thought.
> >
> Not sure I understand feature bit.
> PCI transport fields existence is usually not dependent on upper layer protocol.
>
> > > We may need it even sooner than this because the AQ patch is expanding
> > > the structure located in legacy area.
> >
> > Just to make sure I understand this, assuming we have adminq, any reason a
> > dedicated pcie ext cap is required?
> >
> No. it was my short sight. I responded right after above text that AQ doesn’t need cap extension.



* Re: [virtio-dev] [PATCH 08/11] transport-pci: Introduce virtio extended capability
  2023-04-11 21:25           ` Michael S. Tsirkin
  2023-04-12  0:40             ` Parav Pandit
@ 2023-04-12  4:07             ` Jason Wang
  2023-04-12  4:20               ` Michael S. Tsirkin
  1 sibling, 1 reply; 200+ messages in thread
From: Jason Wang @ 2023-04-12  4:07 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Parav Pandit, virtio-dev, cohuck, virtio-comment, Shahaf Shuler,
	Satananda Burla

On Wed, Apr 12, 2023 at 5:25 AM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Tue, Apr 11, 2023 at 07:01:16PM +0000, Parav Pandit wrote:
> >
> > > From: virtio-dev@lists.oasis-open.org <virtio-dev@lists.oasis-open.org> On
> > > Behalf Of Jason Wang
> > > Sent: Monday, April 10, 2023 11:29 PM
> >
> > > > However, it is not backward compatible, if the device place them in
> > > > extended capability, it will not work.
> > > >
> > >
> > > It is kind of intended since it is only used for new PCI-E features:
> > >
> > New fields in new extended pci cap area is fine.
> > Migrating old fields to be present in the new extended pci cap, is not your intention. Right?
> >
> > > "
> > > +The location of the virtio structures that depend on the PCI Express
> > > +capability are specified using a vendor-specific extended capabilities
> > > +on the extended capabilities list in PCI Express extended configuration
> > > +space of the device.
> > > "
> > >
> > > > To make it backward compatible, a device needs to expose existing
> > > > structure in legacy area. And extended structure for same capability
> > > > in extended pci capability region.
> > > >
> > > > In other words, it will have to be a both places.
> > >
> > > Then we will run out of config space again?
> > No.
> > Only currently defined caps to be placed in two places.
> > New fields don’t need to be placed in PCI cap, because no driver is looking there.
> >
> > We probably already discussed this in previous email by now.
> >
> > > Otherwise we need to deal with the
> > > case when existing structures were only placed at extended capability. Michael
> > > suggest to add a new feature, but the driver may not negotiate the feature
> > > which requires more thought.
> > >
> > Not sure I understand feature bit.
>
> This is because we have a concept of dependency between
> features but not a concept of dependency of feature on
> capability.
>
> > PCI transport fields existence is usually not dependent on upper layer protocol.
> >
> > > > We may need it even sooner than this because the AQ patch is expanding
> > > > the structure located in legacy area.
> > >
> > > Just to make sure I understand this, assuming we have adminq, any reason a
> > > dedicated pcie ext cap is required?
> > >
> > No. it was my short sight. I responded right after above text that AQ doesn’t need cap extension.
>
>
>
> You know, thinking about this, I begin to feel that we should
> require that if at least one extended config exists then
> all caps present in the regular config are *also*
> mirrored in the extended config. IOW extended >= regular.
> The reason is that extended config can be emulated more efficiently
> (2x less exits).

Any reason for it to take fewer exits? At least that has not been done in
the current QEMU emulation. (And do we really care about the performance
of config space accesses?)

Thanks

> WDYT?
>
>
> --
> MST
>



* RE: [virtio-dev] [PATCH 08/11] transport-pci: Introduce virtio extended capability
  2023-04-12  4:04           ` Jason Wang
@ 2023-04-12  4:13             ` Parav Pandit
  2023-04-12  4:20             ` Michael S. Tsirkin
  1 sibling, 0 replies; 200+ messages in thread
From: Parav Pandit @ 2023-04-12  4:13 UTC (permalink / raw)
  To: Jason Wang
  Cc: mst, virtio-dev, cohuck, virtio-comment, Shahaf Shuler, Satananda Burla



> From: Jason Wang <jasowang@redhat.com>
> Sent: Wednesday, April 12, 2023 12:05 AM

> > > It is kind of intended since it is only used for new PCI-E features:
> > >
> > New fields in new extended pci cap area is fine.
> > Migrating old fields to be present in the new extended pci cap, is not your
> intention. Right?
> 
> Right, but what I want to say is, such migration may cause unnecessary
> problems. And I don't see why it is a must for your legacy MMIO bar proposal.
>
As I explained in the commit log, there is very little space left in the PCI capability area.
Last time I counted, our VF doesn't even have room left in the 192-byte PCI cap area anymore.

So we are starting with the extended PCI cap for the legacy MMIO BAR window.
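The arithmetic behind that 192-byte figure, spelled out (sizes from the PCI
and PCIe specs):

    /* Conventional vs. extended configuration space budget. */
    #define PCI_CFG_SPACE_SIZE      256   /* conventional config space */
    #define PCI_STD_HEADER_SIZE      64   /* predefined type-0 header */
    #define PCI_CAP_AREA_SIZE  (PCI_CFG_SPACE_SIZE - PCI_STD_HEADER_SIZE)
                                          /* = 192 bytes for all caps */
    #define PCI_CFG_SPACE_EXP_SIZE 4096   /* with PCIe extended space */
    #define PCI_EXT_CAP_AREA_SIZE \
            (PCI_CFG_SPACE_EXP_SIZE - PCI_CFG_SPACE_SIZE)
                                          /* = 3840 bytes at 0x100-0xfff */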

> > Only currently defined caps to be placed in two places.
> 
> What's the advantage of doing this?
>
I think Michael found one advantage: being 2x faster?
 
> New drivers should provide backward compatibility so they must scan the pci
> cap.
Yes.
> The Old driver can only scan the pci cap.
> 
Yes.
> > New fields don’t need to be placed in PCI cap, because no driver is looking
> there.
> 
> It would be much more simple if we forbid placing new fields in the PCI cap, it is
> already out of space.
> 
Yes. We are both making the same point. :)



* [virtio-dev] Re: [virtio-comment] Re: [PATCH 09/11] transport-pci: Describe PCI MMR dev config registers
  2023-04-12  3:58                         ` Jason Wang
@ 2023-04-12  4:15                           ` Michael S. Tsirkin
  2023-04-12  4:51                             ` Jason Wang
  0 siblings, 1 reply; 200+ messages in thread
From: Michael S. Tsirkin @ 2023-04-12  4:15 UTC (permalink / raw)
  To: Jason Wang
  Cc: Parav Pandit, virtio-dev, cohuck, virtio-comment, shahafs,
	Satananda Burla, Maxime Coquelin, Yan Vugenfirer

On Wed, Apr 12, 2023 at 11:58:39AM +0800, Jason Wang wrote:
> On Tue, Apr 11, 2023 at 6:42 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Tue, Apr 11, 2023 at 05:01:59PM +0800, Jason Wang wrote:
> > > On Tue, Apr 11, 2023 at 3:04 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > >
> > > > On Tue, Apr 11, 2023 at 10:13:40AM +0800, Jason Wang wrote:
> > > > > On Mon, Apr 10, 2023 at 6:06 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > >
> > > > > > On Mon, Apr 10, 2023 at 03:20:52PM +0800, Jason Wang wrote:
> > > > > > > On Mon, Apr 10, 2023 at 2:40 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > > > >
> > > > > > > > On Mon, Apr 10, 2023 at 02:20:16PM +0800, Jason Wang wrote:
> > > > > > > > > On Mon, Apr 10, 2023 at 2:15 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > > > > > >
> > > > > > > > > > On Mon, Apr 10, 2023 at 09:33:32AM +0800, Jason Wang wrote:
> > > > > > > > > > > This is fine for vDPA but not for virtio if the design can only work
> > > > > > > > > > > for some specific setups (OSes/archs).
> > > > > > > > > > >
> > > > > > > > > > > Thanks
> > > > > > > > > >
> > > > > > > > > > Well virtio legacy has a long history of documenting existing hacks :)
> > > > > > > > >
> > > > > > > > > Exactly, so the legacy behaviour is not (or can't be) defined by the
> > > > > > > > > spec but the codes.
> > > > > > > >
> > > > > > > > I mean driver behaviour derives from the code but we do document it in
> > > > > > > > the spec to help people build devices.
> > > > > > > >
> > > > > > > >
> > > > > > > > > > But yes, VIRTIO_F_ORDER_PLATFORM has to be documented.
> > > > > > > > > > And we have to decide what to do about ACCESS_PLATFORM since
> > > > > > > > > > there's a security problem if device allows not acking it.
> > > > > > > > > > Two options:
> > > > > > > > > > - relax the rules a bit and say device will assume ACCESS_PLATFORM
> > > > > > > > > >   is acked anyway
> > > > > > > > >
> > > > > > > > > This will break legacy drivers which assume physical addresses.
> > > > > > > >
> > > > > > > > not that they are not already broken.
> > > > > > >
> > > > > > > I may miss something, the whole point is to allow legacy drivers to
> > > > > > > run otherwise a modern device is sufficient?
> > > > > >
> > > > > > yes and if legacy drivers don't work in a given setup then we
> > > > > > should not worry.
> > > > > >
> > > > > > > >
> > > > > > > > > > - a new flag that is insecure (so useful for sec but useless for dpdk) but optional
> > > > > > > > >
> > > > > > > > > This looks like a new "hack" for the legacy hacks.
> > > > > > > >
> > > > > > > > it's not just for legacy.
> > > > > > >
> > > > > > > We have the ACCESS_PLATFORM feature bit, what is the useage for this new flag?
> > > > > >
> > > > > >
> > > > > > ACCESS_PLATFORM is also a security boundary. so devices must fail
> > > > > > negotiation if it's not there. this new one won't be.
> > > > > >
> > > > > >
> > > > > > > >
> > > > > > > > > And what about ORDER_PLATFORM, I don't think we can modify legacy drivers...
> > > > > > > > >
> > > > > > > > > Thanks
> > > > > > > >
> > > > > > > > You play some tricks with shadow VQ I guess.
> > > > > > >
> > > > > > > Do we really want to add a new feature in the virtio spec that can
> > > > > > > only work with the datapath mediation?
> > > > > > >
> > > > > > > Thanks
> > > > > >
> > > > > > As long as a feature is useful and can't be supported otherwise
> > > > > > we are out of options.
> > > > >
> > > > > Probably not? Is it as simple as relaxing this:
> > > > >
> > > > > "Transitional devices MUST expose the Legacy Interface in I/O space in BAR0."
> > > > >
> > > > > To allow memory space.
> > > > >
> > > > > This works for both software and hardware devices (I had a handy
> > > > > hardware that supports legacy virtio drivers in this way).
> > > > >
> > > > > Thanks
> > > >
> > > > Yes it is certainly simpler.
> > > >
> > > > Question: what happens if you try to run existing windows guests or dpdk
> > > > on these? Do they crash horribly or exit gracefully?
> > >
> > > Haven't tried DPDK and windows. But I remember DPDK supported legacy
> > > MMIO bars for a while.
> > >
> > > Adding Maxime and Yan for comments here.
> > >
> > > >
> > > > The point of the capability is to allow using modern device ID so such
> > > > guests will not even try to bind.
> > >
> > > It means a mediation layer is required to use. Then it's not an issue
> > > for this simple relaxing any more?
> > >
> > > An advantage of such relaxing is that, for the legacy drivers like an
> > > ancient Linux version that can already do MMIO access to legacy BAR it
> > > can work without any mediation.
> >
> > Yes. But the capability approach does not prevent that -
> > just use a transitional ID.
> > The disadvantage of a transitional ID is some old drivers might crash, or fail
> > to work in other ways.
> 
> If the driver is wrote correctly it should fail gracefully if it can't
> request legacy I/O regions.

it's not just regions. it's things like iommu, ordering, endian-ness...
we all know legacy is a mess.

> > Using a modern ID with a capability
> > prevents old drivers from attaching.
> 
> Just to make sure we are at the same page.
> 
> It looks like the motivation for this series is to allow the hypervisor to
> mediate between legacy MMIO BAR and legacy driver. This ends up with a
> design that only works for a specific use case (virtualization) and
> specific setups (e.g. using Linux as a hypervisor). By definition, it's
> not a transitional device.

It can be a transitional or non-transitional device.

> What's more important, if it is only used
> for virtualization, legacy MMIO bar is not a must. We can present a
> legacy device on top of a modern device by using the necessary mediation
> technologies as discussed.

> You may call it a bug, I call it a feature. I am not really
excited about all the bugs we'll see reported because
people will try to use legacy guests on bare metal virtio.
Can we just not go there please or at least give people
an option not to go there?

> By relaxing to use MMIO at BAR0, we have much broader use cases. By
> definition, it's a transitional device. It's useful for both bare
> metal and virtualization. We don't need to worry about the buggy drivers;
> at least fixing those drivers (I mean making them fail when there is no
> I/O BAR) is not hard.
>
> > Using a capability the system
> > designer is in control and can decide which drivers to use.
> 
> Do we allow a virtio device only with this capability? If yes, this
> seems like it could mislead vendors. If not, we already have modern
> interfaces, so I don't see value in adding that to the current Linux
> drivers. The only possible use case is doing the mediation in the
> hypervisor, like Qemu, not the driver.

exactly.

> >
> >
> >
> > > At least, if ID can help, it can be used with this as well.
> > >
> > > Thanks
> >
> >
> > I don't know what that means.
> 
> I mean having a dedicated ID for this relaxation (though I don't see too
> much value).
> 
> Thanks

me neither.

> >
> > >
> > >
> > > >
> > > >
> > > > > > Keeping field practice things out of the
> > > > > > spec helps no one.
> > > > > >
> > > > > > > >
> > > > > > > > --
> > > > > > > > MST
> > > > > > > >
> > > > > >
> > > >
> >




^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [virtio-dev] [PATCH 08/11] transport-pci: Introduce virtio extended capability
  2023-04-12  4:07             ` Jason Wang
@ 2023-04-12  4:20               ` Michael S. Tsirkin
  2023-04-12  4:53                 ` [virtio-dev] Re: [virtio-comment] " Jason Wang
  0 siblings, 1 reply; 200+ messages in thread
From: Michael S. Tsirkin @ 2023-04-12  4:20 UTC (permalink / raw)
  To: Jason Wang
  Cc: Parav Pandit, virtio-dev, cohuck, virtio-comment, Shahaf Shuler,
	Satananda Burla

On Wed, Apr 12, 2023 at 12:07:26PM +0800, Jason Wang wrote:
> On Wed, Apr 12, 2023 at 5:25 AM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Tue, Apr 11, 2023 at 07:01:16PM +0000, Parav Pandit wrote:
> > >
> > > > From: virtio-dev@lists.oasis-open.org <virtio-dev@lists.oasis-open.org> On
> > > > Behalf Of Jason Wang
> > > > Sent: Monday, April 10, 2023 11:29 PM
> > >
> > > > > However, it is not backward compatible; if the device places them in
> > > > > the extended capability, it will not work.
> > > > >
> > > >
> > > > It is kind of intended since it is only used for new PCI-E features:
> > > >
> > > New fields in the new extended pci cap area are fine.
> > > Migrating old fields to be present in the new extended pci cap is not your intention, right?
> > >
> > > > "
> > > > +The location of the virtio structures that depend on the PCI Express
> > > > +capability are specified using a vendor-specific extended capabilities
> > > > +on the extended capabilities list in PCI Express extended configuration
> > > > +space of the device.
> > > > "
> > > >
> > > > > To make it backward compatible, a device needs to expose existing
> > > > > structure in legacy area. And extended structure for same capability
> > > > > in extended pci capability region.
> > > > >
> > > > > In other words, it will have to be in both places.
> > > >
> > > > Then we will run out of config space again?
> > > No.
> > > Only currently defined caps to be placed in two places.
> > > New fields don’t need to be placed in PCI cap, because no driver is looking there.
> > >
> > > We probably already discussed this in previous email by now.
> > >
> > > > Otherwise we need to deal with the
> > > > case when existing structures were only placed at the extended capability. Michael
> > > > suggested adding a new feature, but the driver may not negotiate the feature,
> > > > which requires more thought.
> > > >
> > > Not sure I understand feature bit.
> >
> > This is because we have a concept of dependency between
> > features but not a concept of dependency of a feature on a
> > capability.
> >
> > > PCI transport fields existence is usually not dependent on upper layer protocol.
> > >
> > > > > We may need it even sooner than this because the AQ patch is expanding
> > > > > the structure located in legacy area.
> > > >
> > > > Just to make sure I understand this, assuming we have adminq, any reason a
> > > > dedicated pcie ext cap is required?
> > > >
> > > No, that was short-sighted of me. I responded right after the above text that AQ doesn’t need the cap extension.
> >
> >
> >
> > You know, thinking about this, I begin to feel that we should
> > require that if at least one extended config exists then
> > all caps present in the regular config are *also*
> > mirrored in the extended config. IOW extended >= regular.
> > The reason is that extended config can be emulated more efficiently
> > (2x less exits).
> 
> Any reason for it to get less exits?

For a variety of reasons having to do with buggy hardware, e.g. Linux
likes to use cf8/cfc for legacy ranges. 2 accesses are required for each
read/write; extended space needs just 1.
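
To put numbers on that, a minimal sketch of the two access paths,
assuming the standard cf8/cfc port numbers and the standard ECAM layout;
outl/inl and ecam_base are hypothetical helpers:

#include <stdint.h>

#define PCI_CFG_ADDR_PORT 0xCF8
#define PCI_CFG_DATA_PORT 0xCFC

/* Assumed port I/O helpers; each access traps to the hypervisor. */
extern void outl(uint16_t port, uint32_t val);
extern uint32_t inl(uint16_t port);

/* Legacy cf8/cfc mechanism: select, then read: 2 exits per dword. */
static uint32_t cf8_read(uint8_t bus, uint8_t dev, uint8_t fn, uint8_t off)
{
    uint32_t addr = (1u << 31) | ((uint32_t)bus << 16) |
                    ((uint32_t)dev << 11) | ((uint32_t)fn << 8) |
                    (off & 0xFC);
    outl(PCI_CFG_ADDR_PORT, addr);   /* exit #1 */
    return inl(PCI_CFG_DATA_PORT);   /* exit #2 */
}

/* ECAM: config space is memory-mapped, so one access = 1 exit. */
static uint32_t ecam_read(volatile uint8_t *ecam_base, uint8_t bus,
                          uint8_t dev, uint8_t fn, uint16_t off)
{
    return *(volatile uint32_t *)(ecam_base + (((uint32_t)bus << 20) |
           ((uint32_t)dev << 15) | ((uint32_t)fn << 12) | (off & 0xFFC)));
}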


> At least it has not been done in
> current Qemu's emulation. (And do we really care about the performance
> of config space access?)
> 
> Thanks

For boot speed, yes. Not minor 5% things but 2x, sure.

> > WDYT?
> >
> >
> > --
> > MST
> >




^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [virtio-dev] [PATCH 08/11] transport-pci: Introduce virtio extended capability
  2023-04-12  4:04           ` Jason Wang
  2023-04-12  4:13             ` Parav Pandit
@ 2023-04-12  4:20             ` Michael S. Tsirkin
  2023-04-12  4:55               ` Jason Wang
  1 sibling, 1 reply; 200+ messages in thread
From: Michael S. Tsirkin @ 2023-04-12  4:20 UTC (permalink / raw)
  To: Jason Wang
  Cc: Parav Pandit, virtio-dev, cohuck, virtio-comment, Shahaf Shuler,
	Satananda Burla

On Wed, Apr 12, 2023 at 12:04:45PM +0800, Jason Wang wrote:
> On Wed, Apr 12, 2023 at 3:01 AM Parav Pandit <parav@nvidia.com> wrote:
> >
> >
> > > From: virtio-dev@lists.oasis-open.org <virtio-dev@lists.oasis-open.org> On
> > > Behalf Of Jason Wang
> > > Sent: Monday, April 10, 2023 11:29 PM
> >
> > > > However, it is not backward compatible; if the device places them in
> > > > the extended capability, it will not work.
> > > >
> > >
> > > It is kind of intended since it is only used for new PCI-E features:
> > >
> > New fields in the new extended pci cap area are fine.
> > Migrating old fields to be present in the new extended pci cap is not your intention, right?
> 
> Right, but what I want to say is, such migration may cause unnecessary
> problems. And I don't see why it is a must for your legacy MMIO bar
> proposal.
> 
> >
> > > "
> > > +The location of the virtio structures that depend on the PCI Express
> > > +capability are specified using a vendor-specific extended capabilities
> > > +on the extended capabilities list in PCI Express extended configuration
> > > +space of the device.
> > > "
> > >
> > > > To make it backward compatible, a device needs to expose existing
> > > > structure in legacy area. And extended structure for same capability
> > > > in extended pci capability region.
> > > >
> > > > In other words, it will have to be in both places.
> > >
> > > Then we will run out of config space again?
> > No.
> > Only currently defined caps to be placed in two places.
> 
> What's the advantage of doing this?
> 
> New drivers should provide backward compatibility so they must scan the pci cap.

No, they can start with the express cap. Finding one, they can skip
the old cap completely.

> The old driver can only scan the pci cap.
> 
> > New fields don’t need to be placed in PCI cap, because no driver is looking there.
> 
> It would be much simpler if we forbade placing new fields in the
> PCI cap; it is already out of space.
> 
> Thanks
> 
> >
> > We probably already discussed this in previous email by now.
> >
> > > Otherwise we need to deal with the
> > > case when existing structures were only placed at the extended capability. Michael
> > > suggested adding a new feature, but the driver may not negotiate the feature,
> > > which requires more thought.
> > >
> > Not sure I understand feature bit.
> > PCI transport fields existence is usually not dependent on upper layer protocol.
> >
> > > > We may need it even sooner than this because the AQ patch is expanding
> > > > the structure located in legacy area.
> > >
> > > Just to make sure I understand this, assuming we have adminq, any reason a
> > > dedicated pcie ext cap is required?
> > >
> > No, that was short-sighted of me. I responded right after the above text that AQ doesn’t need the cap extension.




^ permalink raw reply	[flat|nested] 200+ messages in thread

* [virtio-dev] Re: [PATCH 10/11] transport-pci: Use driver notification PCI capability
  2023-03-30 22:58 ` [virtio-dev] [PATCH 10/11] transport-pci: Use driver notification PCI capability Parav Pandit
@ 2023-04-12  4:31   ` Michael S. Tsirkin
  2023-04-12  4:37     ` [virtio-dev] " Parav Pandit
  0 siblings, 1 reply; 200+ messages in thread
From: Michael S. Tsirkin @ 2023-04-12  4:31 UTC (permalink / raw)
  To: Parav Pandit; +Cc: virtio-dev, cohuck, virtio-comment, shahafs, Satananda Burla

On Fri, Mar 31, 2023 at 01:58:33AM +0300, Parav Pandit wrote:
> PCI devices support memory BAR regions for performant driver
> notifications using the notification capability.
> Enable transitional MMR devices to use it in a simpler manner.
> 
> Co-developed-by: Satananda Burla <sburla@marvell.com>
> Signed-off-by: Parav Pandit <parav@nvidia.com>
> ---
>  transport-pci.tex | 28 ++++++++++++++++++++++++++++
>  1 file changed, 28 insertions(+)
> 
> diff --git a/transport-pci.tex b/transport-pci.tex
> index 55a6aa0..4fd9898 100644
> --- a/transport-pci.tex
> +++ b/transport-pci.tex
> @@ -763,6 +763,34 @@ \subsubsection{Notification structure layout}\label{sec:Virtio Transport Options
>  cap.length >= queue_notify_off * notify_off_multiplier + 4
>  \end{lstlisting}
>  
> +\paragraph{Transitional MMR Interface: A note on Notification Capability}
> +\label{sec:Virtio Transport Options / Virtio Over PCI Bus / Virtio Structure PCI Capabilities / Notification capability / Transitional MMR Interface}
> +
> +The transitional MMR device benefits from receiving driver
> +notifications at the Queue Notification address offered using
> +the notification capability, rather than via the memory mapped
> +legacy QueueNotify configuration register.
> +
> +Transitional MMR device uses same Queue Notification address
> +within a BAR for all virtqueues:
> +\begin{lstlisting}
> +cap.offset
> +\end{lstlisting}
> +
> +The transitional MMR device MUST support Queue Notification
> +address within a BAR for all virtqueues at:
> +\begin{lstlisting}
> +cap.offset
> +\end{lstlisting}
> +
> +The transitional MMR driver that wants to use driver
> +notifications offered using notification capability MUST use
> +same Queue Notification address within a BAR for all virtqueues at:
> +
> +\begin{lstlisting}
> +cap.offset
> +\end{lstlisting}
> +
>  \subsubsection{ISR status capability}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / ISR status capability}
>  
>  The VIRTIO_PCI_CAP_ISR_CFG capability

Why? What exactly is going on here? legacy drivers will
not do this.

> -- 
> 2.26.2




^ permalink raw reply	[flat|nested] 200+ messages in thread

* [virtio-dev] Re: [PATCH 09/11] transport-pci: Describe PCI MMR dev config registers
  2023-03-30 22:58 ` [virtio-dev] [PATCH 09/11] transport-pci: Describe PCI MMR dev config registers Parav Pandit
  2023-04-07  8:55   ` [virtio-dev] " Michael S. Tsirkin
@ 2023-04-12  4:33   ` Michael S. Tsirkin
  1 sibling, 0 replies; 200+ messages in thread
From: Michael S. Tsirkin @ 2023-04-12  4:33 UTC (permalink / raw)
  To: Parav Pandit; +Cc: virtio-dev, cohuck, virtio-comment, shahafs, Satananda Burla

On Fri, Mar 31, 2023 at 01:58:32AM +0300, Parav Pandit wrote:
> Legacy virtio configuration registers and adjacent
> device configuration registers are located somewhere
> in a memory BAR.
> 
> A new capability supplies the location of these registers
> which a driver can use to map I/O access to legacy
> memory mapped registers.
> 
> This gives the ability to locate legacy registers in either
> an existing memory BAR or as a completely new BAR at BAR 0.
> 
> The example diagram below attempts to depict it in an existing
> memory BAR.
> 
> +------------------------------+
> |Transitional                  |
> |MMR SRIOV VF                  |
> |                              |
> ++---------------+             |
> ||dev_id =       |             |
> ||{0x10f9-0x10ff}|             |
> |+---------------+             |
> |                              |
> ++--------------------+        |
> || PCIe ext cap = 0xB |        |
> || cfg_type = 10      |        |
> || offset   = 0x1000  |        |
> || bar      = A {0..5}|        |
> |+--|-----------------+        |
> |   |                          |
> |   |                          |
> |   |    +-------------------+ |
> |   |    | Memory BAR = A    | |
> |   |    |                   | |
> |   +------>+--------------+ | |
> |        |  |legacy virtio | | |
> |        |  |+ dev cfg     | | |
> |        |  |registers     | | |
> |        |  +--------------+ | |
> |        +-----------------+ | |
> +------------------------------+
> 
> Co-developed-by: Satananda Burla <sburla@marvell.com>
> Signed-off-by: Parav Pandit <parav@nvidia.com>
> ---
>  transport-pci.tex | 33 +++++++++++++++++++++++++++++++--
>  1 file changed, 31 insertions(+), 2 deletions(-)
> 
> diff --git a/transport-pci.tex b/transport-pci.tex
> index aeda4a1..55a6aa0 100644
> --- a/transport-pci.tex
> +++ b/transport-pci.tex
> @@ -168,6 +168,7 @@ \subsection{Virtio Structure PCI Capabilities}\label{sec:Virtio Transport Option
>  \item ISR Status
>  \item Device-specific configuration (optional)
>  \item PCI configuration access
> +\item Legacy memory mapped configuration registers (optional)
>  \end{itemize}
>  
>  Each structure can be mapped by a Base Address register (BAR) belonging to
> @@ -228,6 +229,8 @@ \subsection{Virtio Structure PCI Capabilities}\label{sec:Virtio Transport Option
>  #define VIRTIO_PCI_CAP_SHARED_MEMORY_CFG 8
>  /* Vendor-specific data */
>  #define VIRTIO_PCI_CAP_VENDOR_CFG        9
> +/* Legacy configuration registers capability */
> +#define VIRTIO_PCI_CAP_LEGACY_MMR_CFG    10
>  \end{lstlisting}
>  
>          Any other value is reserved for future use.
> @@ -682,6 +685,18 @@ \subsubsection{Common configuration structure layout}\label{sec:Virtio Transport
>  Configuration Space / Legacy Interface: Device Configuration
>  Space}~\nameref{sec:Basic Facilities of a Virtio Device / Device Configuration Space / Legacy Interface: Device Configuration Space} for workarounds.
>  
> +\paragraph{Transitional MMR Interface: A Note on Configuration Registers}
> +\label{sec:Virtio Transport Options / Virtio Over PCI Bus / Virtio Structure PCI Capabilities / Common configuration structure layout / Transitional MMR Interface: A Note on Configuration Registers}
> +
> +The transitional MMR device MUST present legacy virtio registers
> +consisting of legacy common configuration registers followed by
> +legacy device specific configuration registers described in section
> +\ref{sec:Virtio Transport Options / Virtio Over PCI Bus / Virtio Structure PCI Capabilities / Common configuration structure layout / Legacy Interfaces: A Note on Configuration Registers}
> +in a memory region PCI BAR.


So, considering the common legacy registers: how exactly is INT#x
handled? It's required for old guests, since they fail over to it
when MSI-X fails for some reason.




> +
> +The transitional MMR device MUST provide the location of the
> +legacy virtio configuration registers using a legacy memory mapped
> +registers capability described in section \ref{sec:Virtio Transport Options / Virtio Over PCI Bus / Virtio Structure PCI Capabilities / Transitional MMR Interface: Legacy Memory Mapped Configuration Registers Capability}.
>  
>  \subsubsection{Notification structure layout}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Notification capability}
>  
> @@ -956,9 +971,23 @@ \subsubsection{PCI configuration access capability}\label{sec:Virtio Transport O
>  specified by some other Virtio Structure PCI Capability
>  of type other than \field{VIRTIO_PCI_CAP_PCI_CFG}.
>  
> +\subsubsection{Transitional MMR Interface: Legacy Memory Mapped Configuration Registers Capability}
> +\label{sec:Virtio Transport Options / Virtio Over PCI Bus / Virtio Structure PCI Capabilities / Transitional MMR Interface: Legacy Memory Mapped Configuration Registers Capability}
> +
> +The optional VIRTIO_PCI_CAP_LEGACY_MMR_CFG capability defines
> +the location of the legacy virtio configuration registers
> +followed by legacy device specific configuration registers in
> +the memory region BAR for the transitional MMR device.
> +
> +The \field{cap.offset} MUST be 4-byte aligned.
> +The \field{cap.offset} SHOULD be 4KBytes aligned and
> +\field{cap.length} SHOULD be 4KBytes.
> +
> +The transitional MMR device MUST present a legacy configuration
> +memory mapped registers capability using \field{virtio_pcie_ext_cap}.
> +
>  \subsubsection{Legacy Interface: A Note on Feature Bits}
> -\label{sec:Virtio Transport Options / Virtio Over PCI Bus /
> -Virtio Structure PCI Capabilities / Legacy Interface: A Note on Feature Bits}
> +\label{sec:Virtio Transport Options / Virtio Over PCI Bus / Virtio Structure PCI Capabilities / Legacy Interface: A Note on Feature Bits}
>  
>  Only Feature Bits 0 to 31 are accessible through the
>  Legacy Interface. When used through the Legacy Interface,
> -- 
> 2.26.2




^ permalink raw reply	[flat|nested] 200+ messages in thread

* [virtio-dev] RE: [PATCH 10/11] transport-pci: Use driver notification PCI capability
  2023-04-12  4:31   ` [virtio-dev] " Michael S. Tsirkin
@ 2023-04-12  4:37     ` Parav Pandit
  2023-04-12  4:43       ` [virtio-dev] " Michael S. Tsirkin
  0 siblings, 1 reply; 200+ messages in thread
From: Parav Pandit @ 2023-04-12  4:37 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: virtio-dev, cohuck, virtio-comment, Shahaf Shuler, Satananda Burla



> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Wednesday, April 12, 2023 12:31 AM
> 
> On Fri, Mar 31, 2023 at 01:58:33AM +0300, Parav Pandit wrote:
> > PCI devices support memory BAR regions for performant driver
> > notifications using the notification capability.
> > Enable transitional MMR devices to use it in a simpler manner.
> >
> > Co-developed-by: Satananda Burla <sburla@marvell.com>
> > Signed-off-by: Parav Pandit <parav@nvidia.com>
> > ---
> >  transport-pci.tex | 28 ++++++++++++++++++++++++++++
> >  1 file changed, 28 insertions(+)
> >
> > diff --git a/transport-pci.tex b/transport-pci.tex index
> > 55a6aa0..4fd9898 100644
> > --- a/transport-pci.tex
> > +++ b/transport-pci.tex
> > @@ -763,6 +763,34 @@ \subsubsection{Notification structure
> > layout}\label{sec:Virtio Transport Options  cap.length >=
> > queue_notify_off * notify_off_multiplier + 4  \end{lstlisting}
> >
> > +\paragraph{Transitional MMR Interface: A note on Notification
> > +Capability} \label{sec:Virtio Transport Options / Virtio Over PCI Bus
> > +/ Virtio Structure PCI Capabilities / Notification capability /
> > +Transitional MMR Interface}
> > +
> > +The transitional MMR device benefits from receiving driver
> > +notifications at the Queue Notification address offered using the
> > +notification capability, rather than via the memory mapped legacy
> > +QueueNotify configuration register.
> > +
> > +Transitional MMR device uses same Queue Notification address within a
> > +BAR for all virtqueues:
> > +\begin{lstlisting}
> > +cap.offset
> > +\end{lstlisting}
> > +
> > +The transitional MMR device MUST support Queue Notification address
> > +within a BAR for all virtqueues at:
> > +\begin{lstlisting}
> > +cap.offset
> > +\end{lstlisting}
> > +
> > +The transitional MMR driver that wants to use driver notifications
> > +offered using notification capability MUST use same Queue
> > +Notification address within a BAR for all virtqueues at:
> > +
> > +\begin{lstlisting}
> > +cap.offset
> > +\end{lstlisting}
> > +
> Why? What exactly is going on here? legacy drivers will not do this.

The legacy driver notifies through the q notify register that was sandwiched in between the slow configuration registers.
This is the notification offset for the hypervisor driver to perform the notification on behalf of the guest driver, so that the acceleration available for the non-transitional device can be utilized here as well.




^ permalink raw reply	[flat|nested] 200+ messages in thread

* [virtio-dev] Re: [PATCH 10/11] transport-pci: Use driver notification PCI capability
  2023-04-12  4:37     ` [virtio-dev] " Parav Pandit
@ 2023-04-12  4:43       ` Michael S. Tsirkin
  2023-04-12  4:48         ` [virtio-dev] " Parav Pandit
  0 siblings, 1 reply; 200+ messages in thread
From: Michael S. Tsirkin @ 2023-04-12  4:43 UTC (permalink / raw)
  To: Parav Pandit
  Cc: virtio-dev, cohuck, virtio-comment, Shahaf Shuler, Satananda Burla

On Wed, Apr 12, 2023 at 04:37:05AM +0000, Parav Pandit wrote:
> 
> 
> > From: Michael S. Tsirkin <mst@redhat.com>
> > Sent: Wednesday, April 12, 2023 12:31 AM
> > 
> > On Fri, Mar 31, 2023 at 01:58:33AM +0300, Parav Pandit wrote:
> > > PCI devices support memory BAR regions for performant driver
> > > notifications using the notification capability.
> > > Enable transitional MMR devices to use it in a simpler manner.
> > >
> > > Co-developed-by: Satananda Burla <sburla@marvell.com>
> > > Signed-off-by: Parav Pandit <parav@nvidia.com>
> > > ---
> > >  transport-pci.tex | 28 ++++++++++++++++++++++++++++
> > >  1 file changed, 28 insertions(+)
> > >
> > > diff --git a/transport-pci.tex b/transport-pci.tex index
> > > 55a6aa0..4fd9898 100644
> > > --- a/transport-pci.tex
> > > +++ b/transport-pci.tex
> > > @@ -763,6 +763,34 @@ \subsubsection{Notification structure
> > > layout}\label{sec:Virtio Transport Options  cap.length >=
> > > queue_notify_off * notify_off_multiplier + 4  \end{lstlisting}
> > >
> > > +\paragraph{Transitional MMR Interface: A note on Notification
> > > +Capability} \label{sec:Virtio Transport Options / Virtio Over PCI Bus
> > > +/ Virtio Structure PCI Capabilities / Notification capability /
> > > +Transitional MMR Interface}
> > > +
> > > +The transitional MMR device benefits from receiving driver
> > > +notifications at the Queue Notification address offered using the
> > > +notification capability, rather than via the memory mapped legacy
> > > +QueueNotify configuration register.
> > > +
> > > +Transitional MMR device uses same Queue Notification address within a
> > > +BAR for all virtqueues:
> > > +\begin{lstlisting}
> > > +cap.offset
> > > +\end{lstlisting}
> > > +
> > > +The transitional MMR device MUST support Queue Notification address
> > > +within a BAR for all virtqueues at:
> > > +\begin{lstlisting}
> > > +cap.offset
> > > +\end{lstlisting}
> > > +
> > > +The transitional MMR driver that wants to use driver notifications
> > > +offered using notification capability MUST use same Queue
> > > +Notification address within a BAR for all virtqueues at:
> > > +
> > > +\begin{lstlisting}
> > > +cap.offset
> > > +\end{lstlisting}
> > > +
> > Why? What exactly is going on here? legacy drivers will not do this.
> 
> The legacy driver notifies through the q notify register that was sandwiched in between the slow configuration registers.
> This is the notification offset for the hypervisor driver to perform the notification on behalf of the guest driver, so that the acceleration available for the non-transitional device can be utilized here as well.

I don't get it. What acceleration? For guests you need a separate page
so the card can be mapped directly while config causes an exit. But the
hypervisor can access any register without vmexits.

-- 
MST




^ permalink raw reply	[flat|nested] 200+ messages in thread

* [virtio-dev] Re: [PATCH 00/11] Introduce transitional mmr pci device
  2023-03-30 22:58 [virtio-dev] [PATCH 00/11] Introduce transitional mmr pci device Parav Pandit
                   ` (12 preceding siblings ...)
  2023-04-03 14:45 ` [virtio-dev] Re: [virtio-comment] " Stefan Hajnoczi
@ 2023-04-12  4:48 ` Michael S. Tsirkin
  2023-04-12  4:52   ` [virtio-dev] " Parav Pandit
  2023-04-12  5:10 ` [virtio-dev] " Halil Pasic
  2023-04-25  2:42 ` [virtio-dev] " Parav Pandit
  15 siblings, 1 reply; 200+ messages in thread
From: Michael S. Tsirkin @ 2023-04-12  4:48 UTC (permalink / raw)
  To: Parav Pandit; +Cc: virtio-dev, cohuck, virtio-comment, shahafs

On Fri, Mar 31, 2023 at 01:58:23AM +0300, Parav Pandit wrote:
> Overview:
> ---------
> The Transitional MMR device is a variant of the transitional PCI device.
> It has its own small Device ID range. It does not have I/O
> region BAR; instead it exposes legacy configuration and device
> specific registers at an offset in the memory region BAR.
> 
> Such transitional MMR devices will be used at the scale of
> thousands of devices using PCI SR-IOV and/or future scalable
> virtualization technology to provide backward
> compatibility (for legacy devices) and also future
> compatibility with new features.
> 
> Usecase:
> --------
> 1. A hypervisor/system needs to provide transitional
>    virtio devices to the guest VM at scale of thousands,
>    typically, one to eight devices per VM.
> 
> 2. A hypervisor/system needs to provide such devices using a
>    vendor agnostic driver in the hypervisor system.
> 
> 3. A hypervisor system prefers to have single stack regardless of
>    virtio device type (net/blk) and be future compatible with a
>    single vfio stack using SR-IOV or other scalable device
>    virtualization technology to map PCI devices to the guest VM.
>    (as transitional or otherwise)


The more I look at it the more issues I see.

Here is a counter proposal:

#define VIRTIO_NET_F_LEGACY_HEADER  52      /* Use the legacy 10 byte header for all packets */
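
For reference, these are the two existing virtio-net header layouts such
a feature bit would select between (a sketch from the standard
definitions; the feature bit itself is only proposed here, and the
struct names are illustrative):

#include <stdint.h>

/* Legacy 10-byte header (pre-1.0, without VIRTIO_NET_F_MRG_RXBUF). */
struct virtio_net_hdr_legacy {
    uint8_t  flags;
    uint8_t  gso_type;
    uint16_t hdr_len;
    uint16_t gso_size;
    uint16_t csum_start;
    uint16_t csum_offset;
};

/* 12-byte header used with MRG_RXBUF or VIRTIO_F_VERSION_1. */
struct virtio_net_hdr_modern {
    struct virtio_net_hdr_legacy hdr;
    uint16_t num_buffers;
};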


Yes, sorry to say, you need to emulate legacy pci in software.

With notification hacks, and reset hacks, and legacy interrupt hacks,
and writeable mac ...  this thing best belongs in vdpa anyway.


-- 
MST




^ permalink raw reply	[flat|nested] 200+ messages in thread

* [virtio-dev] RE: [PATCH 10/11] transport-pci: Use driver notification PCI capability
  2023-04-12  4:43       ` [virtio-dev] " Michael S. Tsirkin
@ 2023-04-12  4:48         ` Parav Pandit
  2023-04-12  5:02           ` [virtio-dev] " Michael S. Tsirkin
  0 siblings, 1 reply; 200+ messages in thread
From: Parav Pandit @ 2023-04-12  4:48 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: virtio-dev, cohuck, virtio-comment, Shahaf Shuler, Satananda Burla

> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Wednesday, April 12, 2023 12:43 AM
> To: Parav Pandit <parav@nvidia.com>
> Cc: virtio-dev@lists.oasis-open.org; cohuck@redhat.com; virtio-
> comment@lists.oasis-open.org; Shahaf Shuler <shahafs@nvidia.com>;
> Satananda Burla <sburla@marvell.com>
> Subject: Re: [PATCH 10/11] transport-pci: Use driver notification PCI capability
> 
> On Wed, Apr 12, 2023 at 04:37:05AM +0000, Parav Pandit wrote:
> >
> >
> > > From: Michael S. Tsirkin <mst@redhat.com>
> > > Sent: Wednesday, April 12, 2023 12:31 AM
> > >
> > > On Fri, Mar 31, 2023 at 01:58:33AM +0300, Parav Pandit wrote:
> > > > PCI devices support memory BAR regions for performant driver
> > > > notifications using the notification capability.
> > > > Enable transitional MMR devices to use it in a simpler manner.
> > > >
> > > > Co-developed-by: Satananda Burla <sburla@marvell.com>
> > > > Signed-off-by: Parav Pandit <parav@nvidia.com>
> > > > ---
> > > >  transport-pci.tex | 28 ++++++++++++++++++++++++++++
> > > >  1 file changed, 28 insertions(+)
> > > >
> > > > diff --git a/transport-pci.tex b/transport-pci.tex index
> > > > 55a6aa0..4fd9898 100644
> > > > --- a/transport-pci.tex
> > > > +++ b/transport-pci.tex
> > > > @@ -763,6 +763,34 @@ \subsubsection{Notification structure
> > > > layout}\label{sec:Virtio Transport Options  cap.length >=
> > > > queue_notify_off * notify_off_multiplier + 4  \end{lstlisting}
> > > >
> > > > +\paragraph{Transitional MMR Interface: A note on Notification
> > > > +Capability} \label{sec:Virtio Transport Options / Virtio Over PCI
> > > > +Bus / Virtio Structure PCI Capabilities / Notification capability
> > > > +/ Transitional MMR Interface}
> > > > +
> > > > +The transitional MMR device benefits from receiving driver
> > > > +notifications at the Queue Notification address offered using the
> > > > +notification capability, rather than via the memory mapped legacy
> > > > +QueueNotify configuration register.
> > > > +
> > > > +Transitional MMR device uses same Queue Notification address
> > > > +within a BAR for all virtqueues:
> > > > +\begin{lstlisting}
> > > > +cap.offset
> > > > +\end{lstlisting}
> > > > +
> > > > +The transitional MMR device MUST support Queue Notification
> > > > +address within a BAR for all virtqueues at:
> > > > +\begin{lstlisting}
> > > > +cap.offset
> > > > +\end{lstlisting}
> > > > +
> > > > +The transitional MMR driver that wants to use driver
> > > > +notifications offered using notification capability MUST use same
> > > > +Queue Notification address within a BAR for all virtqueues at:
> > > > +
> > > > +\begin{lstlisting}
> > > > +cap.offset
> > > > +\end{lstlisting}
> > > > +
> > > Why? What exactly is going on here? legacy drivers will not do this.
> >
> > The legacy driver notifies through the q notify register that was sandwiched in between
> the slow configuration registers.
> > This is the notification offset for the hypervisor driver to perform the
> notification on behalf of the guest driver, so that the acceleration available for
> the non-transitional device can be utilized here as well.
> 
> I don't get it. What acceleration? For guests you need a separate page so the card
> can be mapped directly while config causes an exit. But the hypervisor can access
> any register without vmexits.

Typically when a guest VM writes to the IOBAR q notification register, a vmexit occurs.
On that occurrence, the hypervisor driver forwards the q notification using the q notification region, which is defined by struct virtio_pci_notify_cap.
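
For illustration, a minimal sketch of that forwarding path; the names
(mmr_dev, notify_base, forward_queue_notify) are assumptions, with
notify_base standing for the premapped BAR + cap.offset of the
notification capability:

#include <stdint.h>

struct mmr_dev {
    volatile uint8_t *notify_base;  /* BAR + cap.offset, mapped at setup */
};

/*
 * Called from the hypervisor's vmexit handler once the trapped guest
 * write to the legacy QueueNotify register has been decoded. With the
 * single Queue Notification address proposed here, the replay is one
 * posted store; the written value carries the vq index.
 */
static void forward_queue_notify(struct mmr_dev *d, uint16_t vq_index)
{
    *(volatile uint16_t *)d->notify_base = vq_index;
}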




^ permalink raw reply	[flat|nested] 200+ messages in thread

* [virtio-dev] Re: [virtio-comment] Re: [PATCH 09/11] transport-pci: Describe PCI MMR dev config registers
  2023-04-12  4:15                           ` Michael S. Tsirkin
@ 2023-04-12  4:51                             ` Jason Wang
  2023-04-12  5:01                               ` [virtio-dev] " Parav Pandit
  0 siblings, 1 reply; 200+ messages in thread
From: Jason Wang @ 2023-04-12  4:51 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Parav Pandit, virtio-dev, cohuck, virtio-comment, shahafs,
	Satananda Burla, Maxime Coquelin, Yan Vugenfirer

On Wed, Apr 12, 2023 at 12:15 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Wed, Apr 12, 2023 at 11:58:39AM +0800, Jason Wang wrote:
> > On Tue, Apr 11, 2023 at 6:42 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > >
> > > On Tue, Apr 11, 2023 at 05:01:59PM +0800, Jason Wang wrote:
> > > > On Tue, Apr 11, 2023 at 3:04 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > >
> > > > > On Tue, Apr 11, 2023 at 10:13:40AM +0800, Jason Wang wrote:
> > > > > > On Mon, Apr 10, 2023 at 6:06 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > > >
> > > > > > > On Mon, Apr 10, 2023 at 03:20:52PM +0800, Jason Wang wrote:
> > > > > > > > On Mon, Apr 10, 2023 at 2:40 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > > > > >
> > > > > > > > > On Mon, Apr 10, 2023 at 02:20:16PM +0800, Jason Wang wrote:
> > > > > > > > > > On Mon, Apr 10, 2023 at 2:15 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > > > > > > >
> > > > > > > > > > > On Mon, Apr 10, 2023 at 09:33:32AM +0800, Jason Wang wrote:
> > > > > > > > > > > > This is fine for vDPA but not for virtio if the design can only work
> > > > > > > > > > > > for some specific setups (OSes/archs).
> > > > > > > > > > > >
> > > > > > > > > > > > Thanks
> > > > > > > > > > >
> > > > > > > > > > > Well virtio legacy has a long history of documenting existing hacks :)
> > > > > > > > > >
> > > > > > > > > > Exactly, so the legacy behaviour is not (or can't be) defined by the
> > > > > > > > > > spec but by the code.
> > > > > > > > >
> > > > > > > > > I mean driver behaviour derives from the code but we do document it in
> > > > > > > > > the spec to help people build devices.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > > > But yes, VIRTIO_F_ORDER_PLATFORM has to be documented.
> > > > > > > > > > > And we have to decide what to do about ACCESS_PLATFORM since
> > > > > > > > > > > there's a security problem if device allows not acking it.
> > > > > > > > > > > Two options:
> > > > > > > > > > > - relax the rules a bit and say device will assume ACCESS_PLATFORM
> > > > > > > > > > >   is acked anyway
> > > > > > > > > >
> > > > > > > > > > This will break legacy drivers which assume physical addresses.
> > > > > > > > >
> > > > > > > > > not that they are not already broken.
> > > > > > > >
> > > > > > > > I may be missing something; the whole point is to allow legacy drivers to
> > > > > > > > run, otherwise a modern device is sufficient?
> > > > > > >
> > > > > > > yes and if legacy drivers don't work in a given setup then we
> > > > > > > should not worry.
> > > > > > >
> > > > > > > > >
> > > > > > > > > > > - a new flag that is insecure (so useful for sec but useless for dpdk) but optional
> > > > > > > > > >
> > > > > > > > > > This looks like a new "hack" for the legacy hacks.
> > > > > > > > >
> > > > > > > > > it's not just for legacy.
> > > > > > > >
> > > > > > > > We have the ACCESS_PLATFORM feature bit, what is the usage for this new flag?
> > > > > > >
> > > > > > >
> > > > > > > ACCESS_PLATFORM is also a security boundary. so devices must fail
> > > > > > > negotiation if it's not there. this new one won't be.
> > > > > > >
> > > > > > >
> > > > > > > > >
> > > > > > > > > > And what about ORDER_PLATFORM, I don't think we can modify legacy drivers...
> > > > > > > > > >
> > > > > > > > > > Thanks
> > > > > > > > >
> > > > > > > > > You play some tricks with shadow VQ I guess.
> > > > > > > >
> > > > > > > > Do we really want to add a new feature in the virtio spec that can
> > > > > > > > only work with the datapath mediation?
> > > > > > > >
> > > > > > > > Thanks
> > > > > > >
> > > > > > > As long as a feature is useful and can't be supported otherwise
> > > > > > > we are out of options.
> > > > > >
> > > > > > Probably not? Is it as simple as relaxing this:
> > > > > >
> > > > > > "Transitional devices MUST expose the Legacy Interface in I/O space in BAR0."
> > > > > >
> > > > > > To allow memory space.
> > > > > >
> > > > > > This works for both software and hardware devices (I had handy
> > > > > > hardware that supports legacy virtio drivers in this way).
> > > > > >
> > > > > > Thanks
> > > > >
> > > > > Yes it is certainly simpler.
> > > > >
> > > > > Question: what happens if you try to run existing windows guests or dpdk
> > > > > on these? Do they crash horribly or exit gracefully?
> > > >
> > > > Haven't tried DPDK and windows. But I remember DPDK supported legacy
> > > > MMIO bars for a while.
> > > >
> > > > Adding Maxime and Yan for comments here.
> > > >
> > > > >
> > > > > The point of the capability is to allow using modern device ID so such
> > > > > guests will not even try to bind.
> > > >
> > > > It means a mediation layer is required to use it. Then it's not an issue
> > > > for this simple relaxation any more?
> > > >
> > > > An advantage of such a relaxation is that, for legacy drivers like an
> > > > ancient Linux version that can already do MMIO access to the legacy BAR, it
> > > > can work without any mediation.
> > >
> > > Yes. But the capability approach does not prevent that -
> > > just use a transitional ID.
> > > The disadvantage of a transitional ID is some old drivers might crash, or fail
> > > to work in other ways.
> >
> > If the driver is written correctly it should fail gracefully if it can't
> > request legacy I/O regions.
>
> it's not just regions. it's things like iommu, ordering, endian-ness...
> we all know legacy is a mess.

Yes, but this proposal will drag us back to legacy, won't it? Or if it
is used only in the virtualization environment, what's the advantage of
doing it over simply mediating on top of a modern device?

1) legacy MMIO bar: spec changes, hypervisor mediation
2) modern device: no spec changes, hypervisor mediation

>
> > > Using a modern ID with a capability
> > > prevents old drivers from attaching.
> >
> > Just to make sure we are at the same page.
> >
> > It looks like the motivation for this series is to allow the hypervisor to
> > mediate between legacy MMIO BAR and legacy driver. This ends up with a
> > design that only works for a specific use case (virtualization) and
> > specific setups (e.g. using Linux as a hypervisor). By definition, it's
> > not a transitional device.
>
> It can be a transitional or non-transitional device.
>
> > What's more important, if it is only used
> > for virtualization, legacy MMIO bar is not a must. We can present a
> > legacy device on top of a modern device by using the necessary mediation
> > technologies as discussed.
>
> You may call it a bug, I call it a feature. I am not really
> excited about all the bugs we'll see reported because
> people will try to use legacy guests on bare metal virtio.

I don't either, but this (MMIO BAR0) has been used by several vendors for years.

> Can we just not go there please or at least give people
> an option not to go there?

Yes, so we already have the modern device. We know it could be used to
mediate a legacy device without worries about ordering, iommu, etc.

So my understanding is:

1) it's better not to invent any new facilities for legacy
2) if legacy is insisted on, allowing MMIO BAR0 is much simpler and better
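
As a rough sketch of 2) against the Linux PCI API (map_legacy_bar0 is an
assumed helper name, error handling trimmed):

#include <linux/pci.h>

static void __iomem *map_legacy_bar0(struct pci_dev *pdev)
{
    unsigned long flags = pci_resource_flags(pdev, 0);

    /*
     * pci_iomap() hides the I/O-port vs MMIO distinction behind one
     * cookie, which is why relaxing BAR0 to memory space is cheap
     * for a correctly written driver.
     */
    if (flags & (IORESOURCE_IO | IORESOURCE_MEM))
        return pci_iomap(pdev, 0, 0);

    return NULL;    /* no usable BAR0: fail the probe gracefully */
}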

Thanks

>
> > By relaxing to use MMIO at BAR0, we have much broader use cases. By
> > definition, it's a transitional device. It's useful for both bare
> > metal and virtualization. We don't need to worry about the buggy drivers;
> > at least fixing those drivers (I mean making them fail when there is no
> > I/O BAR) is not hard.
> >
> > > Using a capability the system
> > > designer is in control and can decide which drivers to use.
> >
> > Do we allow a virtio device only with this capability? If yes, this
> > seems like it could mislead vendors. If not, we already have modern
> > interfaces, so I don't see value in adding that to the current Linux
> > drivers. The only possible use case is doing the mediation in the
> > hypervisor, like Qemu, not the driver.
>
> exactly.
>
> > >
> > >
> > >
> > > > At least, if ID can help, it can be used with this as well.
> > > >
> > > > Thanks
> > >
> > >
> > > I don't know what that means.
> >
> > I mean having a dedicated ID for this relaxation (though I don't see too
> > much value).
> >
> > Thanks
>
> me neither.
>
> > >
> > > >
> > > >
> > > > >
> > > > >
> > > > > > > Keeping field practice things out of the
> > > > > > > spec helps no one.
> > > > > > >
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > MST
> > > > > > > > >
> > > > > > >
> > > > >
> > >
>




^ permalink raw reply	[flat|nested] 200+ messages in thread

* [virtio-dev] RE: [PATCH 00/11] Introduce transitional mmr pci device
  2023-04-12  4:48 ` [virtio-dev] " Michael S. Tsirkin
@ 2023-04-12  4:52   ` Parav Pandit
  2023-04-12  5:12     ` [virtio-dev] " Michael S. Tsirkin
  0 siblings, 1 reply; 200+ messages in thread
From: Parav Pandit @ 2023-04-12  4:52 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: virtio-dev, cohuck, virtio-comment, Shahaf Shuler


> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Wednesday, April 12, 2023 12:48 AM

> Here is a counter proposal:
> 
> #define VIRTIO_NET_F_LEGACY_HEADER  52      /* Use the legacy 10 byte
> header for all packets */
> 
> 
> Yes, sorry to say, you need to emulate legacy pci in software.
> 
> With notification hacks, and reset hacks, and legacy interrupt hacks, and
> writeable mac ...  this thing best belongs in vdpa anyway.

What? I don't follow.
Suddenly you label everything a hack with hardly any explanation.



---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 200+ messages in thread

* [virtio-dev] Re: [virtio-comment] Re: [virtio-dev] [PATCH 08/11] transport-pci: Introduce virtio extended capability
  2023-04-12  4:20               ` Michael S. Tsirkin
@ 2023-04-12  4:53                 ` Jason Wang
  2023-04-12  5:25                   ` Michael S. Tsirkin
  0 siblings, 1 reply; 200+ messages in thread
From: Jason Wang @ 2023-04-12  4:53 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Parav Pandit, virtio-dev, cohuck, virtio-comment, Shahaf Shuler,
	Satananda Burla

On Wed, Apr 12, 2023 at 12:20 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Wed, Apr 12, 2023 at 12:07:26PM +0800, Jason Wang wrote:
> > On Wed, Apr 12, 2023 at 5:25 AM Michael S. Tsirkin <mst@redhat.com> wrote:
> > >
> > > On Tue, Apr 11, 2023 at 07:01:16PM +0000, Parav Pandit wrote:
> > > >
> > > > > From: virtio-dev@lists.oasis-open.org <virtio-dev@lists.oasis-open.org> On
> > > > > Behalf Of Jason Wang
> > > > > Sent: Monday, April 10, 2023 11:29 PM
> > > >
> > > > > > However, it is not backward compatible; if the device places them in
> > > > > > the extended capability, it will not work.
> > > > > >
> > > > >
> > > > > It is kind of intended since it is only used for new PCI-E features:
> > > > >
> > > > New fields in the new extended pci cap area are fine.
> > > > Migrating old fields to be present in the new extended pci cap is not your intention, right?
> > > >
> > > > > "
> > > > > +The location of the virtio structures that depend on the PCI Express
> > > > > +capability are specified using a vendor-specific extended capabilities
> > > > > +on the extended capabilities list in PCI Express extended configuration
> > > > > +space of the device.
> > > > > "
> > > > >
> > > > > > To make it backward compatible, a device needs to expose existing
> > > > > > structure in legacy area. And extended structure for same capability
> > > > > > in extended pci capability region.
> > > > > >
> > > > > > In other words, it will have to be in both places.
> > > > >
> > > > > Then we will run out of config space again?
> > > > No.
> > > > Only currently defined caps to be placed in two places.
> > > > New fields don’t need to be placed in PCI cap, because no driver is looking there.
> > > >
> > > > We probably already discussed this in previous email by now.
> > > >
> > > > > Otherwise we need to deal with the
> > > > > case when existing structures were only placed at the extended capability. Michael
> > > > > suggested adding a new feature, but the driver may not negotiate the feature,
> > > > > which requires more thought.
> > > > >
> > > > Not sure I understand feature bit.
> > >
> > > This is because we have a concept of dependency between
> > > features but not a concept of dependency of a feature on a
> > > capability.
> > >
> > > > PCI transport fields existence is usually not dependent on upper layer protocol.
> > > >
> > > > > > We may need it even sooner than this because the AQ patch is expanding
> > > > > > the structure located in legacy area.
> > > > >
> > > > > Just to make sure I understand this, assuming we have adminq, any reason a
> > > > > dedicated pcie ext cap is required?
> > > > >
> > > > No, that was short-sighted of me. I responded right after the above text that AQ doesn’t need the cap extension.
> > >
> > >
> > >
> > > You know, thinking about this, I begin to feel that we should
> > > require that if at least one extended config exists then
> > > all caps present in the regular config are *also*
> > > mirrored in the extended config. IOW extended >= regular.
> > > The reason is that extended config can be emulated more efficiently
> > > (2x less exits).
> >
> > Any reason for it to get less exits?
>
> For a variety of reasons having to do with buggy hardware, e.g. Linux
> likes to use cf8/cfc for legacy ranges. 2 accesses are required for each
> read/write; extended space needs just 1.
>

Ok.

>
> > At least it has not been done in
> > current Qemu's emulation. (And do we really care about the performance
> > of config space access?)
> >
> > Thanks
>
> For boot speed, yes. Not minor 5% things but 2x, sure.

If we care about boot speed we should avoid using the PCI layer in the
guest completely.

Thanks

>
> > > WDYT?
> > >
> > >
> > > --
> > > MST
> > >
>
>
>




^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [virtio-dev] [PATCH 08/11] transport-pci: Introduce virtio extended capability
  2023-04-12  4:20             ` Michael S. Tsirkin
@ 2023-04-12  4:55               ` Jason Wang
  0 siblings, 0 replies; 200+ messages in thread
From: Jason Wang @ 2023-04-12  4:55 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Parav Pandit, virtio-dev, cohuck, virtio-comment, Shahaf Shuler,
	Satananda Burla

On Wed, Apr 12, 2023 at 12:21 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Wed, Apr 12, 2023 at 12:04:45PM +0800, Jason Wang wrote:
> > On Wed, Apr 12, 2023 at 3:01 AM Parav Pandit <parav@nvidia.com> wrote:
> > >
> > >
> > > > From: virtio-dev@lists.oasis-open.org <virtio-dev@lists.oasis-open.org> On
> > > > Behalf Of Jason Wang
> > > > Sent: Monday, April 10, 2023 11:29 PM
> > >
> > > > > However, it is not backward compatible; if the device places them in
> > > > > the extended capability, it will not work.
> > > > >
> > > >
> > > > It is kind of intended since it is only used for new PCI-E features:
> > > >
> > > New fields in the new extended pci cap area are fine.
> > > Migrating old fields to be present in the new extended pci cap is not your intention, right?
> >
> > Right, but what I want to say is, such migration may cause unnecessary
> > problems. And I don't see why it is a must for your legacy MMIO bar
> > proposal.
> >
> > >
> > > > "
> > > > +The location of the virtio structures that depend on the PCI Express
> > > > +capability are specified using a vendor-specific extended capabilities
> > > > +on the extended capabilities list in PCI Express extended configuration
> > > > +space of the device.
> > > > "
> > > >
> > > > > To make it backward compatible, a device needs to expose existing
> > > > > structure in legacy area. And extended structure for same capability
> > > > > in extended pci capability region.
> > > > >
> > > > > In other words, it will have to be in both places.
> > > >
> > > > Then we will run out of config space again?
> > > No.
> > > Only currently defined caps to be placed in two places.
> >
> > What's the advantage of doing this?
> >
> > New drivers should provide backward compatibility so they must scan the pci cap.
>
> No, they can start with the express cap. Finding one, they can skip
> the old cap completely.

Then the driver can't work on the old device.

Thanks

>
> > The old driver can only scan the pci cap.
> >
> > > New fields don’t need to be placed in PCI cap, because no driver is looking there.
> >
> > It would be much simpler if we forbade placing new fields in the
> > PCI cap; it is already out of space.
> >
> > Thanks
> >
> > >
> > > We probably already discussed this in previous email by now.
> > >
> > > > Otherwise we need to deal with the
> > > > case when existing structures were only placed at the extended capability. Michael
> > > > suggested adding a new feature, but the driver may not negotiate the feature,
> > > > which requires more thought.
> > > >
> > > Not sure I understand feature bit.
> > > PCI transport fields existence is usually not dependent on upper layer protocol.
> > >
> > > > > We may need it even sooner than this because the AQ patch is expanding
> > > > > the structure located in legacy area.
> > > >
> > > > Just to make sure I understand this, assuming we have adminq, any reason a
> > > > dedicated pcie ext cap is required?
> > > >
> > > No, that was short-sighted of me. I responded right after the above text that AQ doesn’t need the cap extension.
>




^ permalink raw reply	[flat|nested] 200+ messages in thread

* [virtio-dev] RE: [virtio-comment] Re: [PATCH 09/11] transport-pci: Describe PCI MMR dev config registers
  2023-04-12  4:51                             ` Jason Wang
@ 2023-04-12  5:01                               ` Parav Pandit
  2023-04-12  5:14                                 ` [virtio-dev] " Jason Wang
  0 siblings, 1 reply; 200+ messages in thread
From: Parav Pandit @ 2023-04-12  5:01 UTC (permalink / raw)
  To: Jason Wang, Michael S. Tsirkin
  Cc: virtio-dev, cohuck, virtio-comment, Shahaf Shuler,
	Satananda Burla, Maxime Coquelin, Yan Vugenfirer



> From: Jason Wang <jasowang@redhat.com>
> Sent: Wednesday, April 12, 2023 12:51 AM
> 
> Yes, but this proposal will drag us back to legacy, won't it?
No. This proposal supports the legacy transitional pci device.

> Or if it is used only in
> the virtualization environment, what's the advantage of doing it over simply
> mediating on top of a modern device?
> 
Because the spec for the modern device does not allow it. Discussed in these threads.

> 1) legacy MMIO bar: spec changes, hypervisor mediation
> 2) modern device: no spec changes, hypervisor mediation
>
This question repeats the same discussion that already occurred in this patch series.
You might want to refer to it again to avoid repeating it all over.

> 1) it's better not to invent any new facilities for legacy
> 2) if legacy is insisted on, allowing MMIO BAR0 is much simpler and better
You have missed a few emails. :)
An MMIO BAR is proposed here, and it is not limited to BAR 0.
It is left to the device to either map something in an existing BAR or use BAR 0,
because PCI has only 3 BARs.
A device may want to support both legacy and non-legacy functionality at the same time,
so it is better not to hard-wire BAR 0.
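
For illustration only, a sketch of how a hypervisor driver could locate
that window if the capability reused the existing struct virtio_pci_cap
field layout in conventional config space; the series actually carries
it in the new PCIe extended capability of patch 8, so the offsets below
are assumptions:

#include <linux/pci.h>
#include <linux/io.h>

#define VIRTIO_PCI_CAP_LEGACY_MMR_CFG 10   /* cfg_type from patch 9 */

struct legacy_mmr_win {
    void __iomem *regs;
    u32 len;
};

static int map_legacy_mmr(struct pci_dev *pdev, struct legacy_mmr_win *w)
{
    int pos;
    u8 cfg_type, bar;
    u32 offset, length;

    for (pos = pci_find_capability(pdev, PCI_CAP_ID_VNDR); pos;
         pos = pci_find_next_capability(pdev, pos, PCI_CAP_ID_VNDR)) {
        pci_read_config_byte(pdev, pos + 3, &cfg_type);
        if (cfg_type != VIRTIO_PCI_CAP_LEGACY_MMR_CFG)
            continue;
        pci_read_config_byte(pdev, pos + 4, &bar);       /* any BAR 0..5 */
        pci_read_config_dword(pdev, pos + 8, &offset);   /* SHOULD be 4K aligned */
        pci_read_config_dword(pdev, pos + 12, &length);  /* SHOULD be 4K */
        w->regs = pci_iomap_range(pdev, bar, offset, length);
        w->len = length;
        return w->regs ? 0 : -ENOMEM;
    }
    return -ENODEV;
}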

^ permalink raw reply	[flat|nested] 200+ messages in thread

* [virtio-dev] Re: [PATCH 10/11] transport-pci: Use driver notification PCI capability
  2023-04-12  4:48         ` [virtio-dev] " Parav Pandit
@ 2023-04-12  5:02           ` Michael S. Tsirkin
  2023-04-12  5:06             ` [virtio-dev] RE: [virtio-comment] " Parav Pandit
  0 siblings, 1 reply; 200+ messages in thread
From: Michael S. Tsirkin @ 2023-04-12  5:02 UTC (permalink / raw)
  To: Parav Pandit
  Cc: virtio-dev, cohuck, virtio-comment, Shahaf Shuler, Satananda Burla

On Wed, Apr 12, 2023 at 04:48:41AM +0000, Parav Pandit wrote:
> > From: Michael S. Tsirkin <mst@redhat.com>
> > Sent: Wednesday, April 12, 2023 12:43 AM
> > To: Parav Pandit <parav@nvidia.com>
> > Cc: virtio-dev@lists.oasis-open.org; cohuck@redhat.com; virtio-
> > comment@lists.oasis-open.org; Shahaf Shuler <shahafs@nvidia.com>;
> > Satananda Burla <sburla@marvell.com>
> > Subject: Re: [PATCH 10/11] transport-pci: Use driver notification PCI capability
> > 
> > On Wed, Apr 12, 2023 at 04:37:05AM +0000, Parav Pandit wrote:
> > >
> > >
> > > > From: Michael S. Tsirkin <mst@redhat.com>
> > > > Sent: Wednesday, April 12, 2023 12:31 AM
> > > >
> > > > On Fri, Mar 31, 2023 at 01:58:33AM +0300, Parav Pandit wrote:
> > > > > PCI devices support memory BAR regions for performant driver
> > > > > notifications using the notification capability.
> > > > > Enable transitional MMR devices to use it in a simpler manner.
> > > > >
> > > > > Co-developed-by: Satananda Burla <sburla@marvell.com>
> > > > > Signed-off-by: Parav Pandit <parav@nvidia.com>
> > > > > ---
> > > > >  transport-pci.tex | 28 ++++++++++++++++++++++++++++
> > > > >  1 file changed, 28 insertions(+)
> > > > >
> > > > > diff --git a/transport-pci.tex b/transport-pci.tex index
> > > > > 55a6aa0..4fd9898 100644
> > > > > --- a/transport-pci.tex
> > > > > +++ b/transport-pci.tex
> > > > > @@ -763,6 +763,34 @@ \subsubsection{Notification structure
> > > > > layout}\label{sec:Virtio Transport Options  cap.length >=
> > > > > queue_notify_off * notify_off_multiplier + 4  \end{lstlisting}
> > > > >
> > > > > +\paragraph{Transitional MMR Interface: A note on Notification
> > > > > +Capability} \label{sec:Virtio Transport Options / Virtio Over PCI
> > > > > +Bus / Virtio Structure PCI Capabilities / Notification capability
> > > > > +/ Transitional MMR Interface}
> > > > > +
> > > > > +The transitional MMR device benefits from receiving driver
> > > > > +notifications at the Queue Notification address offered using the
> > > > > +notification capability, rather than via the memory mapped legacy
> > > > > +QueueNotify configuration register.
> > > > > +
> > > > > +Transitional MMR device uses same Queue Notification address
> > > > > +within a BAR for all virtqueues:
> > > > > +\begin{lstlisting}
> > > > > +cap.offset
> > > > > +\end{lstlisting}
> > > > > +
> > > > > +The transitional MMR device MUST support Queue Notification
> > > > > +address within a BAR for all virtqueues at:
> > > > > +\begin{lstlisting}
> > > > > +cap.offset
> > > > > +\end{lstlisting}
> > > > > +
> > > > > +The transitional MMR driver that wants to use driver
> > > > > +notifications offered using notification capability MUST use same
> > > > > +Queue Notification address within a BAR for all virtqueues at:
> > > > > +
> > > > > +\begin{lstlisting}
> > > > > +cap.offset
> > > > > +\end{lstlisting}
> > > > > +
> > > > Why? What exactly is going on here? legacy drivers will not do this.
> > >
> > > The legacy driver does this in the q notify register that was sandwiched in
> > > between the slow configuration registers.
> > > This is the notification offset for the hypervisor driver to perform the
> > > notification on behalf of the guest driver, so that the acceleration available
> > > for the non-transitional device can be utilized here as well.
> > 
> > I don't get it. What acceleration? for guests you need a separate page so card
> > can be mapped directly while config causes an exit. But hypervisor can access
> > any register without vmexits.
> 
> Typically, when the guest VM writes to the IOBAR q notification register, a vmexit occurs.
> On that occurrence, the hypervisor driver forwards the q notification using the q notification region defined by struct virtio_pci_notify_cap.

What is wrong with forwarding it on the legacy window?

-- 
MST




^ permalink raw reply	[flat|nested] 200+ messages in thread

* [virtio-dev] RE: [virtio-comment] Re: [PATCH 10/11] transport-pci: Use driver notification PCI capability
  2023-04-12  5:02           ` [virtio-dev] " Michael S. Tsirkin
@ 2023-04-12  5:06             ` Parav Pandit
  2023-04-12  5:17               ` [virtio-dev] " Michael S. Tsirkin
  0 siblings, 1 reply; 200+ messages in thread
From: Parav Pandit @ 2023-04-12  5:06 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: virtio-dev, cohuck, virtio-comment, Shahaf Shuler, Satananda Burla



> From: virtio-comment@lists.oasis-open.org <virtio-comment@lists.oasis-
> open.org> On Behalf Of Michael S. Tsirkin

> > On that occurrence, hypervisor driver forwards the q notification using the q
> notification region which is defined by struct virtio_pci_notify_cap.
> 
> What is wrong with forwarding it on the legacy window?
The legacy window consists of configuration registers.
It has the q notification register sandwiched in the middle of all the config registers.
Such registers do not start the VQ data path engines.
The 1.x spec did the right thing by defining a dedicated notification region, which is something an actual PCI hw can implement.
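To illustrate (a sketch only; the function and parameter names are
assumptions, not an existing API): once the hypervisor traps the guest's
IOBAR write, forwarding reduces to a single 16-bit store, because the
transitional MMR device uses one Queue Notification address (cap.offset)
for all virtqueues:

    #include <stdint.h>

    /* notify_base: the BAR mapping at cap.offset of the notification
     * capability. The legacy format carries just the 16-bit vq index. */
    static void forward_queue_notify(volatile uint16_t *notify_base,
                                     uint16_t vq_index)
    {
            *notify_base = vq_index;
    }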



^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [virtio-dev] [PATCH 00/11] Introduce transitional mmr pci device
  2023-03-30 22:58 [virtio-dev] [PATCH 00/11] Introduce transitional mmr pci device Parav Pandit
                   ` (13 preceding siblings ...)
  2023-04-12  4:48 ` [virtio-dev] " Michael S. Tsirkin
@ 2023-04-12  5:10 ` Halil Pasic
  2023-04-25  2:42 ` [virtio-dev] " Parav Pandit
  15 siblings, 0 replies; 200+ messages in thread
From: Halil Pasic @ 2023-04-12  5:10 UTC (permalink / raw)
  To: Parav Pandit
  Cc: mst, virtio-dev, cohuck, virtio-comment, shahafs, Halil Pasic

On Fri, 31 Mar 2023 01:58:23 +0300
Parav Pandit <parav@nvidia.com> wrote:

> Overview:
> ---------
> The Transitional MMR device is a variant of the transitional PCI device.
> It has its own small Device ID range. It does not have I/O
> region BAR; instead it exposes legacy configuration and device
> specific registers at an offset in the memory region BAR.
> 

I have some conceptual problems with the current wording. I have to
admit the ongoing discussion is quite difficult for me to follow.
Nevertheless, I have the feeling most of my concerns were already
voiced, mostly by Michael.

 
[..]
> Please review.
> 

After reading through everything, I have the feeling that I have at least a
basic understanding of what you are trying to accomplish. If you
don't mind, I would prefer to wait for a v2; maybe things will
settle down a little by then. I feel like me commenting right now
would just muddy the waters even further, without any real merit. But if
you insist, you can get some...

Regards,
Halil

[..]




^ permalink raw reply	[flat|nested] 200+ messages in thread

* [virtio-dev] Re: [PATCH 00/11] Introduce transitional mmr pci device
  2023-04-12  4:52   ` [virtio-dev] " Parav Pandit
@ 2023-04-12  5:12     ` Michael S. Tsirkin
  2023-04-12  5:15       ` [virtio-dev] RE: [virtio-comment] " Parav Pandit
  2023-04-12  6:02       ` Parav Pandit
  0 siblings, 2 replies; 200+ messages in thread
From: Michael S. Tsirkin @ 2023-04-12  5:12 UTC (permalink / raw)
  To: Parav Pandit; +Cc: virtio-dev, cohuck, virtio-comment, Shahaf Shuler

On Wed, Apr 12, 2023 at 04:52:09AM +0000, Parav Pandit wrote:
> 
> > From: Michael S. Tsirkin <mst@redhat.com>
> > Sent: Wednesday, April 12, 2023 12:48 AM
> 
> > Here is a counter proposal:
> > 
> > #define VIRTIO_NET_F_LEGACY_HEADER  52      /* Use the legacy 10 byte
> > header for all packets */
> > 
> > 
> > Yes, sorry to say, you need to emulate legacy pci in software.
> > 
> > With notification hacks, and reset hacks, and legacy interrupt hacks, and
> > writeable mac ...  this thing best belongs in vdpa anyway.
> 
> What? I don't follow.
> Suddenly you label everything a hack, with hardly any explanation.
> 

Again, a hack is not a bad thing, but it is an attempt at reusing things in
unexpected ways.

New issue I found today:
- if the guest disables MSI-X, the host cannot disable MSI-X.
  need some other channel to notify the device about this.

Old issues we discussed before today:
- reset needs some special handling because real hardware
  cannot guarantee returning 0 on the 1st read
- if the guest writes into the mac, reusing the host mac (which is RO)
  will not work, need extra registers
- something about notification makes you want to poke
  at the modern notification register? which of course
  is its own can of worms, with VIRTIO_F_NOTIFICATION_DATA
  changing the format completely.


-- 
MST




^ permalink raw reply	[flat|nested] 200+ messages in thread

* [virtio-dev] Re: [virtio-comment] Re: [PATCH 09/11] transport-pci: Describe PCI MMR dev config registers
  2023-04-12  5:01                               ` [virtio-dev] " Parav Pandit
@ 2023-04-12  5:14                                 ` Jason Wang
  2023-04-12  5:30                                   ` [virtio-dev] " Parav Pandit
  0 siblings, 1 reply; 200+ messages in thread
From: Jason Wang @ 2023-04-12  5:14 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Michael S. Tsirkin, virtio-dev, cohuck, virtio-comment,
	Shahaf Shuler, Satananda Burla, Maxime Coquelin, Yan Vugenfirer

On Wed, Apr 12, 2023 at 1:01 PM Parav Pandit <parav@nvidia.com> wrote:
>
>
>
> > From: Jason Wang <jasowang@redhat.com>
> > Sent: Wednesday, April 12, 2023 12:51 AM
> >
> > Yes, but this proposal will drag us back to legacy, isn't it?
> No. This proposal supports the legacy transitional pci device.
>
> > Or if it is used only in
> > the virtualization environment, what's the advantage of doing it over simply
> > mediating on top of a modern device?
> >
> Because the spec for the modern device does not allow it. Discussed in these threads.

Can you please tell me which part of the spec disallows it? There was a
long discussion on the virtualization list a few months ago about
mediating legacy devices on top of modern. We don't see any blocker, do
you?

>
> > 1) legacy MMIO bar: spec changes, hypervisor mediation
> > 2) modern device: no spec changes, hypervisor mediation
> >
> This question repeats the same discussion occurred in this patch series.
> You might want to refer it again to avoid repeating all over again.

No, I think you miss the point that modern devices could be used for
mediating legacy devices.

>
> > 1) it's better not invent any of new facilities for legacy
> > 2) if legacy is insisted, allow MMIO BAR0 is much simpler and better
> You have missed few emails. :)
> MMIO BAR is proposed here and it is not limited to BAR 0.

In the context of mediation, why do you need that flexibility?

> It is left for the device to either map something in existing BAR or use BAR 0.
> Because PCI has only 3 BARs.
> A device may want to support legacy and non-legacy functionality both at the same time.

This is perfectly fine, this is how Qemu did:

    /*
     * virtio pci bar layout used by default.
     * subclasses can re-arrange things if needed.
     *
     *   region 0   --  virtio legacy io bar
     *   region 1   --  msi-x bar
     *   region 2   --  virtio modern io bar (off by default)
     *   region 4+5 --  virtio modern memory (64bit) bar
     *
     */

> So better not to hard wire for BAR 0.

Using BAR0 doesn't prevent you from adding modern capabilities in BAR0, no?

Thanks




^ permalink raw reply	[flat|nested] 200+ messages in thread

* [virtio-dev] RE: [virtio-comment] Re: [PATCH 00/11] Introduce transitional mmr pci device
  2023-04-12  5:12     ` [virtio-dev] " Michael S. Tsirkin
@ 2023-04-12  5:15       ` Parav Pandit
  2023-04-12  5:23         ` [virtio-dev] " Michael S. Tsirkin
  2023-04-12  6:02       ` Parav Pandit
  1 sibling, 1 reply; 200+ messages in thread
From: Parav Pandit @ 2023-04-12  5:15 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: virtio-dev, cohuck, virtio-comment, Shahaf Shuler



> From: virtio-comment@lists.oasis-open.org <virtio-comment@lists.oasis-
> open.org> On Behalf Of Michael S. Tsirkin

> New issue I found today:
> - if guest disables MSI-X host can not disable MSI-X.
>   need some other channel to notify device about this.
>
I will look into it. 
 
> Old issues we discussed before today:
> - reset needs some special handling because real hardware
>   can not guarantee returning 0 on the 1st read
When done through the legacy MMR as we discussed, it will.

> - if guest writes into mac, reusing host mac (which is RO)
>   will not work, need extra registers
No. Again, the legacy interface section of the MMR behaves as legacy.

> - something about notification makes you want to poke
>   at modern notification register? which of course
>   is its own can of worms with VIRTIO_F_NOTIFICATION_DATA
>   changing the format completely.
There is no NOTIFICATION_DATA with legacy, so it is not applicable.
The format is the same, with only the vq index.



^ permalink raw reply	[flat|nested] 200+ messages in thread

* [virtio-dev] Re: [virtio-comment] Re: [PATCH 10/11] transport-pci: Use driver notification PCI capability
  2023-04-12  5:06             ` [virtio-dev] RE: [virtio-comment] " Parav Pandit
@ 2023-04-12  5:17               ` Michael S. Tsirkin
  2023-04-12  5:24                 ` [virtio-dev] " Parav Pandit
  0 siblings, 1 reply; 200+ messages in thread
From: Michael S. Tsirkin @ 2023-04-12  5:17 UTC (permalink / raw)
  To: Parav Pandit
  Cc: virtio-dev, cohuck, virtio-comment, Shahaf Shuler, Satananda Burla

On Wed, Apr 12, 2023 at 05:06:09AM +0000, Parav Pandit wrote:
> 
> 
> > From: virtio-comment@lists.oasis-open.org <virtio-comment@lists.oasis-
> > open.org> On Behalf Of Michael S. Tsirkin
> 
> > > On that occurrence, hypervisor driver forwards the q notification using the q
> > notification region which is defined by struct virtio_pci_notify_cap.
> > 
> > What is wrong with forwarding it on the legacy window?
> Legacy window is configuration registers.
> One has sandwitched q notification register in middle of all the config registers.
> Such registers do not start the VQ data path engines.
> 1.x spec has done the right thing to have dedicated notification region which something an actual pci hw can implement.

Okay so... in fact it turns out existing hardware is not
really happy to emulate legacy. It does not really fit that well.
So maybe we should stop propagating the legacy interface to
modern hardware then.
Add the legacy net header size feature and be done with it.

-- 
MST




^ permalink raw reply	[flat|nested] 200+ messages in thread

* [virtio-dev] Re: [virtio-comment] Re: [PATCH 00/11] Introduce transitional mmr pci device
  2023-04-12  5:15       ` [virtio-dev] RE: [virtio-comment] " Parav Pandit
@ 2023-04-12  5:23         ` Michael S. Tsirkin
  2023-04-12  5:39           ` [virtio-dev] " Parav Pandit
  0 siblings, 1 reply; 200+ messages in thread
From: Michael S. Tsirkin @ 2023-04-12  5:23 UTC (permalink / raw)
  To: Parav Pandit; +Cc: virtio-dev, cohuck, virtio-comment, Shahaf Shuler

On Wed, Apr 12, 2023 at 05:15:40AM +0000, Parav Pandit wrote:
> 
> 
> > From: virtio-comment@lists.oasis-open.org <virtio-comment@lists.oasis-
> > open.org> On Behalf Of Michael S. Tsirkin
> 
> > New issue I found today:
> > - if guest disables MSI-X host can not disable MSI-X.
> >   need some other channel to notify device about this.
> >
> I will look into it. 
>  
> > Old issues we discussed before today:
> > - reset needs some special handling because real hardware
> >   can not guarantee returning 0 on the 1st read
> When done through the legacy MMR as we discussed, it will.
> 
> > - if guest writes into mac, reusing host mac (which is RO)
> >   will not work, need extra registers
> No. Again legacy interface section of MMR behaves as legacy.
> 
> > - something about notification makes you want to poke
> >   at modern notification register? which of course
> >   is its own can of worms with VIRTIO_F_NOTIFICATION_DATA
> >   changing the format completely.
> There is no NOTIFICATION_DATA with legacy. So it is not applicable.
> Format is same with only vq index.

You are writing into a modern register here. *What* are you writing there?
With a modern register the format depends on features, and here you did not
negotiate any features. What if NOTIFICATION_DATA is a required feature?
With legacy there's no FEATURES_OK, so there is no way to report failure. The
driver barrels on, sends wrong data in the kick, and hangs.
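To spell out the format concern (a sketch based on the 1.1 definition of
VIRTIO_F_NOTIFICATION_DATA; the helper itself is hypothetical):

    #include <stdbool.h>
    #include <stdint.h>

    /* The value written to the notification address depends on negotiated
     * features -- which a legacy driver never negotiates. */
    static uint32_t notify_value(bool notification_data, uint16_t vqn,
                                 uint16_t next_off, bool next_wrap)
    {
            if (!notification_data)
                    return vqn;  /* plain vq index, as legacy expects */
            /* VIRTIO_F_NOTIFICATION_DATA: vqn in bits 0..15,
             * next_off in bits 16..30, next_wrap in bit 31. */
            return (uint32_t)vqn |
                   ((uint32_t)(next_off & 0x7fff) << 16) |
                   ((uint32_t)next_wrap << 31);
    }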


Sure, we can add a section on the notification register and come up with
some behaviour. Without even trying to guess how messy or clean that
will be: far from being a small, clean, contained thing, this feature is
sticking its fingers in lots of pies.



Look, I know I proposed this originally. I thought it was a small thing too.
It was an idea. I am not sure it pans out, though. Not all ideas work.

-- 
MST




^ permalink raw reply	[flat|nested] 200+ messages in thread

* [virtio-dev] RE: [virtio-comment] Re: [PATCH 10/11] transport-pci: Use driver notification PCI capability
  2023-04-12  5:17               ` [virtio-dev] " Michael S. Tsirkin
@ 2023-04-12  5:24                 ` Parav Pandit
  2023-04-12  5:27                   ` [virtio-dev] " Michael S. Tsirkin
  0 siblings, 1 reply; 200+ messages in thread
From: Parav Pandit @ 2023-04-12  5:24 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: virtio-dev, cohuck, virtio-comment, Shahaf Shuler, Satananda Burla



> From: virtio-comment@lists.oasis-open.org <virtio-comment@lists.oasis-
> open.org> On Behalf Of Michael S. Tsirkin

> > Such registers do not start the VQ data path engines.
> > 1.x spec has done the right thing to have dedicated notification region which
> something an actual pci hw can implement.
> 
> okay so.. in fact it turns out existing hardware is not really happy to emulate
> legacy. it does not really fit that well.
> so maybe we should stop propagating the legacy interface to modern hardware
> then.
It is not about current hw; the point is to utilize what the hw has to offer and what the sw has to offer.
The key efficiency comes from reusing what 1.x already has to offer.

> Add the legacy net header size feature and be done with it.
The hypervisor needs to mediate and participate in the whole life cycle of the device.

One wants to run the block device too. The current proposal, with the other changes we discussed, covers a wider set of cases.



^ permalink raw reply	[flat|nested] 200+ messages in thread

* [virtio-dev] Re: [virtio-comment] Re: [virtio-dev] [PATCH 08/11] transport-pci: Introduce virtio extended capability
  2023-04-12  4:53                 ` [virtio-dev] Re: [virtio-comment] " Jason Wang
@ 2023-04-12  5:25                   ` Michael S. Tsirkin
  2023-04-12  5:37                     ` Jason Wang
  0 siblings, 1 reply; 200+ messages in thread
From: Michael S. Tsirkin @ 2023-04-12  5:25 UTC (permalink / raw)
  To: Jason Wang
  Cc: Parav Pandit, virtio-dev, cohuck, virtio-comment, Shahaf Shuler,
	Satananda Burla

On Wed, Apr 12, 2023 at 12:53:52PM +0800, Jason Wang wrote:
> On Wed, Apr 12, 2023 at 12:20 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Wed, Apr 12, 2023 at 12:07:26PM +0800, Jason Wang wrote:
> > > On Wed, Apr 12, 2023 at 5:25 AM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > >
> > > > On Tue, Apr 11, 2023 at 07:01:16PM +0000, Parav Pandit wrote:
> > > > >
> > > > > > From: virtio-dev@lists.oasis-open.org <virtio-dev@lists.oasis-open.org> On
> > > > > > Behalf Of Jason Wang
> > > > > > Sent: Monday, April 10, 2023 11:29 PM
> > > > >
> > > > > > > However, it is not backward compatible, if the device place them in
> > > > > > > extended capability, it will not work.
> > > > > > >
> > > > > >
> > > > > > It is kind of intended since it is only used for new PCI-E features:
> > > > > >
> > > > > New fields in new extended pci cap area is fine.
> > > > > Migrating old fields to be present in the new extended pci cap, is not your intention. Right?
> > > > >
> > > > > > "
> > > > > > +The location of the virtio structures that depend on the PCI Express
> > > > > > +capability are specified using a vendor-specific extended capabilities
> > > > > > +on the extended capabilities list in PCI Express extended configuration
> > > > > > +space of the device.
> > > > > > "
> > > > > >
> > > > > > > To make it backward compatible, a device needs to expose existing
> > > > > > > structure in legacy area. And extended structure for same capability
> > > > > > > in extended pci capability region.
> > > > > > >
> > > > > > > In other words, it will have to be a both places.
> > > > > >
> > > > > > Then we will run out of config space again?
> > > > > No.
> > > > > Only currently defined caps to be placed in two places.
> > > > > New fields don’t need to be placed in PCI cap, because no driver is looking there.
> > > > >
> > > > > We probably already discussed this in previous email by now.
> > > > >
> > > > > > Otherwise we need to deal with the
> > > > > > case when existing structures were only placed at extended capability. Michael
> > > > > > suggest to add a new feature, but the driver may not negotiate the feature
> > > > > > which requires more thought.
> > > > > >
> > > > > Not sure I understand feature bit.
> > > >
> > > > This is because we have a concept of dependency between
> > > > features but not a concept of dependency of feature on
> > > > capability.
> > > >
> > > > > PCI transport fields existence is usually not dependent on upper layer protocol.
> > > > >
> > > > > > > We may need it even sooner than this because the AQ patch is expanding
> > > > > > > the structure located in legacy area.
> > > > > >
> > > > > > Just to make sure I understand this, assuming we have adminq, any reason a
> > > > > > dedicated pcie ext cap is required?
> > > > > >
> > > > > No. it was my short sight. I responded right after above text that AQ doesn’t need cap extension.
> > > >
> > > >
> > > >
> > > > You know, thinking about this, I begin to feel that we should
> > > > require that if at least one extended config exists then
> > > > all caps present in the regular config are *also*
> > > > mirrored in the extended config. IOW extended >= regular.
> > > > The reason is that extended config can be emulated more efficiently
> > > > (2x less exits).
> > >
> > > Any reason for it to get less exits?
> >
> > For a variety of reasons having to do with buggy hardware e.g. linux
> > likes to use cf8/cfc for legacy ranges. 2 accesses are required for each
> > read/write.  extended space is just 1.
> >
> 
> Ok.
> 
> >
> > > At least it has not been done in
> > > current Qemu's emulation. (And do we really care about the performance
> > > of config space access?)
> > >
> > > Thanks
> >
> > For boot speed, yes. Not minor 5% things but 2x, sure.
> 
> If we care about boot speed we should avoid using the PCI layer in the
> guest completely.
> 
> Thanks

Whoa. And do what? Add a ton of functionality in a PV way to MMIO?
NUMA, MSI, power management... the list goes on and on.
If you have PCI on the host it is way easier to pass that
through to the guest than to do a completely different thing.

-- 
MST




^ permalink raw reply	[flat|nested] 200+ messages in thread

* [virtio-dev] Re: [virtio-comment] Re: [PATCH 10/11] transport-pci: Use driver notification PCI capability
  2023-04-12  5:24                 ` [virtio-dev] " Parav Pandit
@ 2023-04-12  5:27                   ` Michael S. Tsirkin
  0 siblings, 0 replies; 200+ messages in thread
From: Michael S. Tsirkin @ 2023-04-12  5:27 UTC (permalink / raw)
  To: Parav Pandit
  Cc: virtio-dev, cohuck, virtio-comment, Shahaf Shuler, Satananda Burla

On Wed, Apr 12, 2023 at 05:24:57AM +0000, Parav Pandit wrote:
> 
> 
> > From: virtio-comment@lists.oasis-open.org <virtio-comment@lists.oasis-
> > open.org> On Behalf Of Michael S. Tsirkin
> 
> > > Such registers do not start the VQ data path engines.
> > > 1.x spec has done the right thing to have dedicated notification region which
> > something an actual pci hw can implement.
> > 
> > okay so.. in fact it turns out existing hardware is not really happy to emulate
> > legacy. it does not really fit that well.
> > so maybe we should stop propagating the legacy interface to modern hardware
> > then.
> It is not about current hw, utilize what hw has to offer and utilize what sw has to offer.
> The key efficiency is coming by reusing what 1.x has already to offer.

Just ... no. Either run 0.X or 1.X. This mix opens a ton of corner
cases. NOTIFICATION_DATA is just off the top of my head, there's more
for sure.  It's a can of worms we won't be able to close.

> > Add the legacy net header size feature and be done with it.
> Hypervisor need to mediate and participate in all life cycle of the device.
> 
> One wants to run the block device too. Current proposal with other changes we discussed cover wider case.




^ permalink raw reply	[flat|nested] 200+ messages in thread

* [virtio-dev] RE: [virtio-comment] Re: [PATCH 09/11] transport-pci: Describe PCI MMR dev config registers
  2023-04-12  5:14                                 ` [virtio-dev] " Jason Wang
@ 2023-04-12  5:30                                   ` Parav Pandit
  2023-04-12  5:38                                     ` [virtio-dev] " Jason Wang
  0 siblings, 1 reply; 200+ messages in thread
From: Parav Pandit @ 2023-04-12  5:30 UTC (permalink / raw)
  To: Jason Wang
  Cc: Michael S. Tsirkin, virtio-dev, cohuck, virtio-comment,
	Shahaf Shuler, Satananda Burla, Maxime Coquelin, Yan Vugenfirer



> From: Jason Wang <jasowang@redhat.com>
> Sent: Wednesday, April 12, 2023 1:15 AM

> > Because the spec for modern device do not allow it. Discussed in these
> threads.
> 
> Can you please tell me which part of the spec disallows it? There's a long
> discussion in the virtualization-list a few months ago about mediating legacy
> devices on top of modern. We don't see any blocker do you?
>
The modern device spec says VERSION_1 must be offered and must be negotiated by the driver.
Legacy has the MAC as a RW area (the hypervisor can handle it).
The reset flow differs between legacy and modern.
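For concreteness, a sketch of the reset-flow difference (volatile MMIO
pointers assumed; this is not a real driver API): a 1.x driver polls
device_status until the device reports 0, while a legacy driver assumes the
reset completed as soon as the write did, which real hardware cannot
guarantee:

    #include <stdint.h>

    static void modern_reset(volatile uint8_t *device_status)
    {
            *device_status = 0;
            while (*device_status != 0)
                    ;  /* 1.x: wait until the device confirms the reset */
    }

    static void legacy_reset(volatile uint8_t *device_status)
    {
            *device_status = 0;  /* legacy: no completion handshake */
    }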

> >
> > > 1) legacy MMIO bar: spec changes, hypervisor mediation
> > > 2) modern device: no spec changes, hypervisor mediation
> > >
> > This question repeats the same discussion occurred in this patch series.
> > You might want to refer it again to avoid repeating all over again.
> 
> No, I think you miss the point that modern devices could be used for mediating
> legacy devices.
> 
> >
> > > 1) it's better not invent any of new facilities for legacy
> > > 2) if legacy is insisted, allow MMIO BAR0 is much simpler and better
> > You have missed few emails. :)
> > MMIO BAR is proposed here and it is not limited to BAR 0.
> 
> In the context of mediation, why do you need that flexibility?
> 
The PCI device exposed to the guest VM is transitional, so it can offer legacy as well as newer features.
And if BAR 0 is hard-coded, it may not be able to support features that need an additional BAR.

> > It is left for the device to either map something in existing BAR or use BAR 0.
> > Because PCI has only 3 BARs.
> > A device may want to support legacy and non-legacy functionality both at the
> same time.
> 
> This is perfectly fine, this is how Qemu did:
> 
>     /*
>      * virtio pci bar layout used by default.
>      * subclasses can re-arrange things if needed.
>      *
>      *   region 0   --  virtio legacy io bar
>      *   region 1   --  msi-x bar
>      *   region 2   --  virtio modern io bar (off by default)
>      *   region 4+5 --  virtio modern memory (64bit) bar
>      *
>      */
> 
> > So better not to hard wire for BAR 0.
> 
> Using BAR0 doesn't prevent you from adding modern capabilities in BAR0, no?
>
Right, it doesn't. But the spec shouldn't say that BAR0 is only for legacy MMIO emulation; that would prevent other BAR0 usage.

^ permalink raw reply	[flat|nested] 200+ messages in thread

* [virtio-dev] Re: [virtio-comment] Re: [virtio-dev] [PATCH 08/11] transport-pci: Introduce virtio extended capability
  2023-04-12  5:25                   ` Michael S. Tsirkin
@ 2023-04-12  5:37                     ` Jason Wang
  2023-04-13 17:03                       ` Michael S. Tsirkin
  0 siblings, 1 reply; 200+ messages in thread
From: Jason Wang @ 2023-04-12  5:37 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Parav Pandit, virtio-dev, cohuck, virtio-comment, Shahaf Shuler,
	Satananda Burla

On Wed, Apr 12, 2023 at 1:25 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Wed, Apr 12, 2023 at 12:53:52PM +0800, Jason Wang wrote:
> > On Wed, Apr 12, 2023 at 12:20 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > >
> > > On Wed, Apr 12, 2023 at 12:07:26PM +0800, Jason Wang wrote:
> > > > On Wed, Apr 12, 2023 at 5:25 AM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > >
> > > > > On Tue, Apr 11, 2023 at 07:01:16PM +0000, Parav Pandit wrote:
> > > > > >
> > > > > > > From: virtio-dev@lists.oasis-open.org <virtio-dev@lists.oasis-open.org> On
> > > > > > > Behalf Of Jason Wang
> > > > > > > Sent: Monday, April 10, 2023 11:29 PM
> > > > > >
> > > > > > > > However, it is not backward compatible, if the device place them in
> > > > > > > > extended capability, it will not work.
> > > > > > > >
> > > > > > >
> > > > > > > It is kind of intended since it is only used for new PCI-E features:
> > > > > > >
> > > > > > New fields in new extended pci cap area is fine.
> > > > > > Migrating old fields to be present in the new extended pci cap, is not your intention. Right?
> > > > > >
> > > > > > > "
> > > > > > > +The location of the virtio structures that depend on the PCI Express
> > > > > > > +capability are specified using a vendor-specific extended capabilities
> > > > > > > +on the extended capabilities list in PCI Express extended configuration
> > > > > > > +space of the device.
> > > > > > > "
> > > > > > >
> > > > > > > > To make it backward compatible, a device needs to expose existing
> > > > > > > > structure in legacy area. And extended structure for same capability
> > > > > > > > in extended pci capability region.
> > > > > > > >
> > > > > > > > In other words, it will have to be a both places.
> > > > > > >
> > > > > > > Then we will run out of config space again?
> > > > > > No.
> > > > > > Only currently defined caps to be placed in two places.
> > > > > > New fields don’t need to be placed in PCI cap, because no driver is looking there.
> > > > > >
> > > > > > We probably already discussed this in previous email by now.
> > > > > >
> > > > > > > Otherwise we need to deal with the
> > > > > > > case when existing structures were only placed at extended capability. Michael
> > > > > > > suggest to add a new feature, but the driver may not negotiate the feature
> > > > > > > which requires more thought.
> > > > > > >
> > > > > > Not sure I understand feature bit.
> > > > >
> > > > > This is because we have a concept of dependency between
> > > > > features but not a concept of dependency of feature on
> > > > > capability.
> > > > >
> > > > > > PCI transport fields existence is usually not dependent on upper layer protocol.
> > > > > >
> > > > > > > > We may need it even sooner than this because the AQ patch is expanding
> > > > > > > > the structure located in legacy area.
> > > > > > >
> > > > > > > Just to make sure I understand this, assuming we have adminq, any reason a
> > > > > > > dedicated pcie ext cap is required?
> > > > > > >
> > > > > > No. it was my short sight. I responded right after above text that AQ doesn’t need cap extension.
> > > > >
> > > > >
> > > > >
> > > > > You know, thinking about this, I begin to feel that we should
> > > > > require that if at least one extended config exists then
> > > > > all caps present in the regular config are *also*
> > > > > mirrored in the extended config. IOW extended >= regular.
> > > > > The reason is that extended config can be emulated more efficiently
> > > > > (2x less exits).
> > > >
> > > > Any reason for it to get less exits?
> > >
> > > For a variety of reasons having to do with buggy hardware e.g. linux
> > > likes to use cf8/cfc for legacy ranges. 2 accesses are required for each
> > > read/write.  extended space is just 1.
> > >
> >
> > Ok.
> >
> > >
> > > > At least it has not been done in
> > > > current Qemu's emulation. (And do we really care about the performance
> > > > of config space access?)
> > > >
> > > > Thanks
> > >
> > > For boot speed, yes. Not minor 5% things but 2x, sure.
> >
> > If we care about boot speed we should avoid using the PCI layer in the
> > guest completely.
> >
> > Thanks
>
> Woa. And do what? Add a ton of functionality in a PV way to MMIO?

Probably; we have microVM already. And Hyper-V has dropped PCI since Gen2.

> NUMA, MSI, power management .... the list goes on and on.
> If you have pci on the host it is way easier to pass that
> through to guest than do a completely different thing.

It's a balance. If you want functionality, PCI is probably a must. But
if you care only about boot speed, it is not slowed
down by a single device but by the whole PCI layer.

Thanks

>
> --
> MST
>




^ permalink raw reply	[flat|nested] 200+ messages in thread

* [virtio-dev] Re: [virtio-comment] Re: [PATCH 09/11] transport-pci: Describe PCI MMR dev config registers
  2023-04-12  5:30                                   ` [virtio-dev] " Parav Pandit
@ 2023-04-12  5:38                                     ` Jason Wang
  2023-04-12  5:55                                       ` [virtio-dev] " Parav Pandit
  0 siblings, 1 reply; 200+ messages in thread
From: Jason Wang @ 2023-04-12  5:38 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Michael S. Tsirkin, virtio-dev, cohuck, virtio-comment,
	Shahaf Shuler, Satananda Burla, Maxime Coquelin, Yan Vugenfirer

On Wed, Apr 12, 2023 at 1:30 PM Parav Pandit <parav@nvidia.com> wrote:
>
>
>
> > From: Jason Wang <jasowang@redhat.com>
> > Sent: Wednesday, April 12, 2023 1:15 AM
>
> > > Because the spec for modern device do not allow it. Discussed in these
> > threads.
> >
> > Can you please tell me which part of the spec disallows it? There's a long
> > discussion in the virtualization-list a few months ago about mediating legacy
> > devices on top of modern. We don't see any blocker do you?
> >
> Modern device says FEAETURE_1 must be offered and must be negotiated by driver.
> Legacy has Mac as RW area. (hypervisor can do it).
> Reset flow is difference between the legacy and modern.

Just to make sure we're on the same page: we're talking in the context
of mediation. Without mediation, your proposal can't work.

So in this case, the guest driver is not talking to the device
directly. Qemu needs to trap whatever it wants in order to achieve the
mediation:

1) It's perfectly fine that Qemu negotiated VERSION_1 but presented a
mediated legacy device to guests.
2) For MAC and Reset, Qemu can trap and do anything it wants.

>
> > >
> > > > 1) legacy MMIO bar: spec changes, hypervisor mediation
> > > > 2) modern device: no spec changes, hypervisor mediation
> > > >
> > > This question repeats the same discussion occurred in this patch series.
> > > You might want to refer it again to avoid repeating all over again.
> >
> > No, I think you miss the point that modern devices could be used for mediating
> > legacy devices.
> >
> > >
> > > > 1) it's better not invent any of new facilities for legacy
> > > > 2) if legacy is insisted, allow MMIO BAR0 is much simpler and better
> > > You have missed few emails. :)
> > > MMIO BAR is proposed here and it is not limited to BAR 0.
> >
> > In the context of mediation, why do you need that flexibility?
> >
> The PCI device exposed is transitional to the guest VM, so it can do legacy as well as newer features.
> And if BAR 0 is hard coded, it may not be able to support features that may need additional BAR.

This part I don't understand; you can just put the existing modern
capabilities in BAR0, and then everything is fine.

>
> > > It is left for the device to either map something in existing BAR or use BAR 0.
> > > Because PCI has only 3 BARs.
> > > A device may want to support legacy and non-legacy functionality both at the
> > same time.
> >
> > This is perfectly fine, this is how Qemu did:
> >
> >     /*
> >      * virtio pci bar layout used by default.
> >      * subclasses can re-arrange things if needed.
> >      *
> >      *   region 0   --  virtio legacy io bar
> >      *   region 1   --  msi-x bar
> >      *   region 2   --  virtio modern io bar (off by default)
> >      *   region 4+5 --  virtio modern memory (64bit) bar
> >      *
> >      */
> >
> > > So better not to hard wire for BAR 0.
> >
> > Using BAR0 doesn't prevent you from adding modern capabilities in BAR0, no?
> >
> Right, it doesn’t. But spec shouldn’t write BAR0 is only for legacy MMIO emulation, that would prevent BAR0 usage.

How can it be prevented? Can you give me an example?

Thanks




^ permalink raw reply	[flat|nested] 200+ messages in thread

* [virtio-dev] RE: [virtio-comment] Re: [PATCH 00/11] Introduce transitional mmr pci device
  2023-04-12  5:23         ` [virtio-dev] " Michael S. Tsirkin
@ 2023-04-12  5:39           ` Parav Pandit
  0 siblings, 0 replies; 200+ messages in thread
From: Parav Pandit @ 2023-04-12  5:39 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: virtio-dev, cohuck, virtio-comment, Shahaf Shuler


> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Wednesday, April 12, 2023 1:24 AM

> You are writing into modern register here. *what* are you writing there?
> with modern register the format depends on features. here you did not
> negotiate any features. 
Q notify content.

> what if NOTIFICATION_DATA is a required feature?
How can it be required when it is not defined in the legacy spec?

> with legacy there's no FEATURES_OK so no way to report failure.  driver barrels
> on, sends wrong data in the kick and hangs.
>
There is no need for FEATURES_OK because the 1.x configuration registers are not touched.

The device provides the q notify register region for forwarding.

> 
> Look I know proposed this originally. I thought it's a small thing too.
> It was an idea. I am not sure it pans out though. Not all ideas work.

This one does work, and we have already been testing it.
Indeed, the changes described against modern devices are simpler than the original proposal.





^ permalink raw reply	[flat|nested] 200+ messages in thread

* [virtio-dev] RE: [virtio-comment] Re: [PATCH 09/11] transport-pci: Describe PCI MMR dev config registers
  2023-04-12  5:38                                     ` [virtio-dev] " Jason Wang
@ 2023-04-12  5:55                                       ` Parav Pandit
  2023-04-12  6:15                                         ` [virtio-dev] " Jason Wang
  0 siblings, 1 reply; 200+ messages in thread
From: Parav Pandit @ 2023-04-12  5:55 UTC (permalink / raw)
  To: Jason Wang
  Cc: Michael S. Tsirkin, virtio-dev, cohuck, virtio-comment,
	Shahaf Shuler, Satananda Burla, Maxime Coquelin, Yan Vugenfirer



> From: Jason Wang <jasowang@redhat.com>
> Sent: Wednesday, April 12, 2023 1:38 AM

> > Modern device says FEAETURE_1 must be offered and must be negotiated by
> driver.
> > Legacy has Mac as RW area. (hypervisor can do it).
> > Reset flow is difference between the legacy and modern.
> 
> Just to make sure we're at the same page. We're talking in the context of
> mediation. Without mediation, your proposal can't work.
>
Right.
 
> So in this case, the guest driver is not talking with the device directly. Qemu
> needs to traps whatever it wants to achieve the
> mediation:
> 
I prefer to avoid picking a specific sw component here, but yes, QEMU can trap.

> 1) It's perfectly fine that Qemu negotiated VERSION_1 but presented a
> mediated legacy device to guests.
Right, but if VERSION_1 is negotiated, the device will work as V_1 with a 12B virtio_net_hdr.
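For reference, a sketch of the two header layouts in question (per the
virtio-net spec; in legacy, num_buffers exists only with MRG_RXBUF, while
VERSION_1 makes the 12-byte form mandatory for all packets):

    #include <stdint.h>

    struct virtio_net_hdr_legacy {      /* 10 bytes */
            uint8_t  flags;
            uint8_t  gso_type;
            uint16_t hdr_len;
            uint16_t gso_size;
            uint16_t csum_start;
            uint16_t csum_offset;
    };

    struct virtio_net_hdr_v1 {          /* 12 bytes */
            struct virtio_net_hdr_legacy hdr;
            uint16_t num_buffers;       /* always present with VERSION_1 */
    };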

> 2) For MAC and Reset, Qemu can trap and do anything it wants.
>
The idea is not to poke into the fields even though such sw can.
The MAC is RW in legacy.
The MAC is RO in 1.x.

So QEMU cannot turn an RO register into a RW one.

The proposed solution in this series enables it and avoids per-field sw interpretation and mediation in parsing values, etc.

> > The PCI device exposed is transitional to the guest VM, so it can do legacy as
> well as newer features.
> > And if BAR 0 is hard coded, it may not be able to support features that may
> need additional BAR.
> 
> This part I don't understand, you can just put existing modern capabilities in
> BAR0, then everything is fine.
>
Not sure I follow.
Maybe a step back.

What is proposed here is that:
a. the legacy registers are emulated as MMIO in a BAR.
b. this can be either BAR0 or some other BAR.

Your question was: why this flexibility?

The reason is:
a. if the device prefers to implement only two BARs, it can do so and have a window for these 60+ config registers in an existing BAR.
b. if the device prefers to implement a new BAR dedicated to legacy register emulation, that is fine too.

A mediating sw will be able to forward the accesses regardless.
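A minimal sketch of why the forwarding works regardless of the BAR choice
(the bar/offset values are assumed to come from the location capability
discussed in this series):

    #include <stdint.h>

    /* Resolve the legacy MMIO window from its location, whichever BAR
     * the device chose to place it in. */
    static void *legacy_window(void *bar_base[6], uint8_t bar,
                               uint32_t offset)
    {
            return (uint8_t *)bar_base[bar] + offset;
    }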

> > Right, it doesn’t. But spec shouldn’t write BAR0 is only for legacy MMIO
> emulation, that would prevent BAR0 usage.
> 
> How can it be prevented? Can you give me an example?

I mean to say that if we write the spec like below,

A device exposes BAR 0 of size X bytes for supporting legacy configuration and device specific registers as memory mapped region.

then the above line will prevent using BAR0 for anything beyond legacy register emulation.


^ permalink raw reply	[flat|nested] 200+ messages in thread

* [virtio-dev] RE: [virtio-comment] Re: [PATCH 00/11] Introduce transitional mmr pci device
  2023-04-12  5:12     ` [virtio-dev] " Michael S. Tsirkin
  2023-04-12  5:15       ` [virtio-dev] RE: [virtio-comment] " Parav Pandit
@ 2023-04-12  6:02       ` Parav Pandit
  1 sibling, 0 replies; 200+ messages in thread
From: Parav Pandit @ 2023-04-12  6:02 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: virtio-dev, cohuck, virtio-comment, Shahaf Shuler



> From: virtio-comment@lists.oasis-open.org <virtio-comment@lists.oasis-
> open.org> On Behalf Of Michael S. Tsirkin
> Sent: Wednesday, April 12, 2023 1:12 AM

> New issue I found today:
> - if guest disables MSI-X host can not disable MSI-X.
>   need some other channel to notify device about this.
Not sure why you say so.
MSI-X is not coming from the virtio side,
so the host will be able to disable it.



^ permalink raw reply	[flat|nested] 200+ messages in thread

* [virtio-dev] Re: [virtio-comment] Re: [PATCH 09/11] transport-pci: Describe PCI MMR dev config registers
  2023-04-12  5:55                                       ` [virtio-dev] " Parav Pandit
@ 2023-04-12  6:15                                         ` Jason Wang
  2023-04-12 14:23                                           ` [virtio-dev] " Parav Pandit
  0 siblings, 1 reply; 200+ messages in thread
From: Jason Wang @ 2023-04-12  6:15 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Michael S. Tsirkin, virtio-dev, cohuck, virtio-comment,
	Shahaf Shuler, Satananda Burla, Maxime Coquelin, Yan Vugenfirer

On Wed, Apr 12, 2023 at 1:55 PM Parav Pandit <parav@nvidia.com> wrote:
>
>
>
> > From: Jason Wang <jasowang@redhat.com>
> > Sent: Wednesday, April 12, 2023 1:38 AM
>
> > > Modern device says FEAETURE_1 must be offered and must be negotiated by
> > driver.
> > > Legacy has Mac as RW area. (hypervisor can do it).
> > > Reset flow is difference between the legacy and modern.
> >
> > Just to make sure we're at the same page. We're talking in the context of
> > mediation. Without mediation, your proposal can't work.
> >
> Right.
>
> > So in this case, the guest driver is not talking with the device directly. Qemu
> > needs to traps whatever it wants to achieve the
> > mediation:
> >
> I prefer to avoid picking specific sw component here, but yes. QEMU can trap.
>
> > 1) It's perfectly fine that Qemu negotiated VERSION_1 but presented a
> > mediated legacy device to guests.
> Right but if VERSION_1 is negotiated, device will work as V_1 with 12B virtio_net_hdr.

A shadow virtqueue could be used here. And we have many more issues
without a shadow virtqueue; more below.

>
> > 2) For MAC and Reset, Qemu can trap and do anything it wants.
> >
> The idea is not to poke in the fields even though such sw can.
> MAC is RW in legacy.
> Mac ia RO in 1.x.
>
> So QEMU cannot make RO register into RW.

It can be done using the control vq: trap the MAC write and
forward it via the control virtqueue.
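A sketch of that trap-and-forward path (send_ctrl_cmd() is a hypothetical
helper, not an existing API; the class/command values are the ones defined
by the virtio-net spec):

    #include <stddef.h>
    #include <stdint.h>

    #define VIRTIO_NET_CTRL_MAC           1
    #define VIRTIO_NET_CTRL_MAC_ADDR_SET  1

    /* Hypothetical helper: issue a command on the control virtqueue. */
    extern int send_ctrl_cmd(uint8_t class, uint8_t cmd,
                             const void *data, size_t len);

    static int forward_mac_write(const uint8_t mac[6])
    {
            /* The guest wrote the legacy RW MAC register;
             * replay it via the control vq. */
            return send_ctrl_cmd(VIRTIO_NET_CTRL_MAC,
                                 VIRTIO_NET_CTRL_MAC_ADDR_SET, mac, 6);
    }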

>
> The proposed solution in this series enables it and avoid per field sw interpretation and mediation in parsing values etc.

I don't think it's possible. See the discussion about ORDER_PLATFORM
and ACCESS_PLATFORM in previous threads.

>
> > > The PCI device exposed is transitional to the guest VM, so it can do legacy as
> > well as newer features.
> > > And if BAR 0 is hard coded, it may not be able to support features that may
> > need additional BAR.
> >
> > This part I don't understand, you can just put existing modern capabilities in
> > BAR0, then everything is fine.
> >
> Not sure I follow.
> May be a step back.
>
> What is proposed here, that
> a. legacy registers are emulated as MMIO in a BAR.
> b. This can be either be BAR0 or some other BAR
>
> Your question was why this flexibility?

Yes.

>
> The reason is:
> a. if device prefers to implement only two BARs, it can do so and have window for this 60+ config registers in an existing BAR.
> b. if device prefers to implement a new BAR dedicated for legacy registers emulation, it is fine too.
>
> A mediating sw will be able to forward them regardless.

I'm not sure I fully understand this. The only difference is that for
b, it can only use BAR0. But unless there's a new feature that mandates
BAR0 (which I think is impossible since all the features are
advertised via capabilities now), we're fine.

>
> > > Right, it doesn’t. But spec shouldn’t write BAR0 is only for legacy MMIO
> > emulation, that would prevent BAR0 usage.
> >
> > How can it be prevented? Can you give me an example?
>
> I mean to say, that say if we write a spec like below,
>
> A device exposes BAR 0 of size X bytes for supporting legacy configuration and device specific registers as memory mapped region.
>

Ok, it looks like just a matter of how the spec is written. The problematic
part is that it tries to enforce a size, which is suboptimal.

What has been done is:

"
Transitional devices MUST expose the Legacy Interface in I/O space in BAR0.
"

Without mentioning the size.

Thanks


> Above line will prevent using BAR0 beyond legacy register emulation.
>




^ permalink raw reply	[flat|nested] 200+ messages in thread

* [virtio-dev] RE: [virtio-comment] Re: [PATCH 09/11] transport-pci: Describe PCI MMR dev config registers
  2023-04-12  6:15                                         ` [virtio-dev] " Jason Wang
@ 2023-04-12 14:23                                           ` Parav Pandit
  2023-04-13  1:48                                             ` [virtio-dev] " Jason Wang
  0 siblings, 1 reply; 200+ messages in thread
From: Parav Pandit @ 2023-04-12 14:23 UTC (permalink / raw)
  To: Jason Wang
  Cc: Michael S. Tsirkin, virtio-dev, cohuck, virtio-comment,
	Shahaf Shuler, Satananda Burla, Maxime Coquelin, Yan Vugenfirer



> From: Jason Wang <jasowang@redhat.com>
> Sent: Wednesday, April 12, 2023 2:15 AM
> 
> On Wed, Apr 12, 2023 at 1:55 PM Parav Pandit <parav@nvidia.com> wrote:
> >
> >
> >
> > > From: Jason Wang <jasowang@redhat.com>
> > > Sent: Wednesday, April 12, 2023 1:38 AM
> >
> > > > Modern device says FEAETURE_1 must be offered and must be
> > > > negotiated by
> > > driver.
> > > > Legacy has Mac as RW area. (hypervisor can do it).
> > > > Reset flow is difference between the legacy and modern.
> > >
> > > Just to make sure we're at the same page. We're talking in the
> > > context of mediation. Without mediation, your proposal can't work.
> > >
> > Right.
> >
> > > So in this case, the guest driver is not talking with the device
> > > directly. Qemu needs to traps whatever it wants to achieve the
> > > mediation:
> > >
> > I prefer to avoid picking specific sw component here, but yes. QEMU can trap.
> >
> > > 1) It's perfectly fine that Qemu negotiated VERSION_1 but presented
> > > a mediated legacy device to guests.
> > Right but if VERSION_1 is negotiated, device will work as V_1 with 12B
> virtio_net_hdr.
> 
> Shadow virtqueue could be used here. And we have much more issues without
> shadow virtqueue, more below.
> 
> >
> > > 2) For MAC and Reset, Qemu can trap and do anything it wants.
> > >
> > The idea is not to poke in the fields even though such sw can.
> > MAC is RW in legacy.
> > Mac ia RO in 1.x.
> >
> > So QEMU cannot make RO register into RW.
> 
> It can be done via using the control vq. Trap the MAC write and forward it via
> control virtqueue.
>
This proposal is not implementing a vdpa mediator, which requires a far deeper understanding in the hypervisor.
Such mediation works fine for vdpa, and it is up to the vdpa layer to do it. It is not relevant here.
 
> >
> > The proposed solution in this series enables it and avoid per field sw
> interpretation and mediation in parsing values etc.
> 
> I don't think it's possible. See the discussion about ORDER_PLATFORM and
> ACCESS_PLATFORM in previous threads.
> 
I have read the previous thread.
The hypervisor will be limited to those platforms where ORDER_PLATFORM is not needed.
And this is a PCI transitional device that uses the standard platform DMA anyway, so ACCESS_PLATFORM is not related.

> >
> > What is proposed here, that
> > a. legacy registers are emulated as MMIO in a BAR.
> > b. This can be either be BAR0 or some other BAR
> >
> > Your question was why this flexibility?
> 
> Yes.
> 
> >
> > The reason is:
> > a. if device prefers to implement only two BARs, it can do so and have window
> for this 60+ config registers in an existing BAR.
> > b. if device prefers to implement a new BAR dedicated for legacy registers
> emulation, it is fine too.
> >
> > A mediating sw will be able to forward them regardless.
> 
> I'm not sure I fully understand this. The only difference is that for b, it can only
> use BAR0. 
Why do you say it can use only BAR 0?

For example, a device may implement, say, only BAR2, with a small portion of BAR2 pointing to the legacy MMIO config registers.
A mediating hypervisor sw will be able to read/write it while BAR0 is exposed towards the guest VM as IOBAR 0.

> Unless there's a new feature that mandates
> BAR0 (which I think is impossible since all the features are advertised via
> capabilities now). We're fine.
>
No new feature. The legacy BAR emulation is exposed via the extended capability we discussed, which provides the location.
 
> >
> > > > Right, it doesn’t. But spec shouldn’t write BAR0 is only for
> > > > legacy MMIO
> > > emulation, that would prevent BAR0 usage.
> > >
> > > How can it be prevented? Can you give me an example?
> >
> > I mean to say, that say if we write a spec like below,
> >
> > A device exposes BAR 0 of size X bytes for supporting legacy configuration
> and device specific registers as memory mapped region.
> >
> 
> Ok, it looks just a matter of how the spec is written. The problematic part is that
> it tries to enforce a size which is suboptimal.
> 
> What's has been done is:
> 
> "
> Transitional devices MUST expose the Legacy Interface in I/O space in BAR0.
> "
> 
> Without mentioning the size.

The new legacy MMIO registers can be implemented in BAR0 with the same size. But it is better not to place such a restriction with wording like the above.

^ permalink raw reply	[flat|nested] 200+ messages in thread

* [virtio-dev] Re: [virtio-comment] Re: [PATCH 09/11] transport-pci: Describe PCI MMR dev config registers
  2023-04-12 14:23                                           ` [virtio-dev] " Parav Pandit
@ 2023-04-13  1:48                                             ` Jason Wang
  2023-04-13  3:31                                               ` Parav Pandit
  0 siblings, 1 reply; 200+ messages in thread
From: Jason Wang @ 2023-04-13  1:48 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Michael S. Tsirkin, virtio-dev, cohuck, virtio-comment,
	Shahaf Shuler, Satananda Burla, Maxime Coquelin, Yan Vugenfirer

On Wed, Apr 12, 2023 at 10:23 PM Parav Pandit <parav@nvidia.com> wrote:
>
>
>
> > From: Jason Wang <jasowang@redhat.com>
> > Sent: Wednesday, April 12, 2023 2:15 AM
> >
> > On Wed, Apr 12, 2023 at 1:55 PM Parav Pandit <parav@nvidia.com> wrote:
> > >
> > >
> > >
> > > > From: Jason Wang <jasowang@redhat.com>
> > > > Sent: Wednesday, April 12, 2023 1:38 AM
> > >
> > > > > Modern device says FEAETURE_1 must be offered and must be
> > > > > negotiated by
> > > > driver.
> > > > > Legacy has Mac as RW area. (hypervisor can do it).
> > > > > Reset flow is difference between the legacy and modern.
> > > >
> > > > Just to make sure we're at the same page. We're talking in the
> > > > context of mediation. Without mediation, your proposal can't work.
> > > >
> > > Right.
> > >
> > > > So in this case, the guest driver is not talking with the device
> > > > directly. Qemu needs to traps whatever it wants to achieve the
> > > > mediation:
> > > >
> > > I prefer to avoid picking specific sw component here, but yes. QEMU can trap.
> > >
> > > > 1) It's perfectly fine that Qemu negotiated VERSION_1 but presented
> > > > a mediated legacy device to guests.
> > > Right but if VERSION_1 is negotiated, device will work as V_1 with 12B
> > virtio_net_hdr.
> >
> > Shadow virtqueue could be used here. And we have much more issues without
> > shadow virtqueue, more below.
> >
> > >
> > > > 2) For MAC and Reset, Qemu can trap and do anything it wants.
> > > >
> > > The idea is not to poke in the fields even though such sw can.
> > > MAC is RW in legacy.
> > > Mac ia RO in 1.x.
> > >
> > > So QEMU cannot make RO register into RW.
> >
> > It can be done via using the control vq. Trap the MAC write and forward it via
> > control virtqueue.
> >
> This proposal Is not implementing about vdpa mediator that requires far higher understanding in hypervisor.

It's not related to vDPA; it's about a common technique that is used
in virtualization. You trap and emulate the status; why can't you
do that for the others?

> Such mediation works fine for vdpa and it is upto vdpa layer to do. Not relevant here.
>
> > >
> > > The proposed solution in this series enables it and avoid per field sw
> > interpretation and mediation in parsing values etc.
> >
> > I don't think it's possible. See the discussion about ORDER_PLATFORM and
> > ACCESS_PLATFORM in previous threads.
> >
> I have read the previous thread.
> Hypervisor will be limiting to those platforms where ORDER_PLATFORM is not needed.

So you introduce a bunch of new facilities that only work on some
specific archs. This breaks the architecture independence virtio has had
since 1.0. The root cause is that legacy is not fit for hardware
implementation; any kind of hardware that tries to offer the legacy
function will eventually run into those corner cases, which require extra
interfaces and may finally end up as a (partial) duplication of
the modern interface.

> And this is a pci transitional device that uses the standard platform dma anyway so ACCESS_PLATFORM is not related.

So which type of transactions does this device use when it is accessed via
the legacy MMIO BAR? Translated requests or not?

>
> > >
> > > What is proposed here, that
> > > a. legacy registers are emulated as MMIO in a BAR.
> > > b. This can be either be BAR0 or some other BAR
> > >
> > > Your question was why this flexibility?
> >
> > Yes.
> >
> > >
> > > The reason is:
> > > a. if device prefers to implement only two BARs, it can do so and have window
> > for this 60+ config registers in an existing BAR.
> > > b. if device prefers to implement a new BAR dedicated for legacy registers
> > emulation, it is fine too.
> > >
> > > A mediating sw will be able to forward them regardless.
> >
> > I'm not sure I fully understand this. The only difference is that for b, it can only
> > use BAR0.
> Why do say it can use only BAR 0?

Because:

1) it's the way the current transitional device works
2) it's simple, a small extension to the transitional device instead
of a bunch of facilities that can do much less than this
3) it works for legacy drivers on some OSes such as Linux and DPDK, which
means it works for bare metal, something that can't be achieved by your
proposal here

>
> For example, a device may have implemented say only BAR2, and a small portion of BAR2 points to the legacy MMIO config registers.

We're discussing spec changes, not a specific implementation here. Why
can't the device use BAR0? Do you see any restriction in the spec?

> A mediator hypervisor sw will be able to read/write to it when BAR0 is exposed towards the guest VM as IOBAR 0.

So I don't think it can work:

1) This is very dangerous unless the spec mandates the size (this is
also tricky since page size varies among arches) for any
BAR/capability, which is not what virtio wants; the spec leaves that
flexibility to the implementation:

E.g

"""
The driver MUST accept a cap_len value which is larger than specified here.
"""

2) It's a blocker for live migration (and compatibility): the hypervisor
should not assume the size of any capability, so in any case it
should have a fallback for when the BAR can't be assigned.

>
> > Unless there's a new feature that mandates
> > BAR0 (which I think is impossible since all the features are advertised via
> > capabilities now). We're fine.
> >
> No new feature. Legacy BAR emulation is exposed via the extended capability we discussed, which provides the location.
>
> > >
> > > > > Right, it doesn’t. But the spec shouldn’t write that BAR0 is only for
> > > > > legacy MMIO emulation; that would prevent other BAR0 usage.
> > > >
> > > > How can it be prevented? Can you give me an example?
> > >
> > > I mean to say that if we write the spec like below,
> > >
> > > A device exposes BAR 0 of size X bytes for supporting legacy configuration
> > and device-specific registers as a memory-mapped region.
> > >
> >
> > Ok, it looks like just a matter of how the spec is written. The problematic part is that
> > it tries to enforce a size, which is suboptimal.
> >
> > What has been done is:
> >
> > "
> > Transitional devices MUST expose the Legacy Interface in I/O space in BAR0.
> > "
> >
> > Without mentioning the size.
>
> The new legacy MMIO registers can be implemented as BAR0 with the same size. But it is better not to place such a restriction as in the wording above.

Let me summarize, we had three ways currently:

1) legacy MMIO BAR via capability:

Pros:
- allow some flexibility to place MMIO BAR other than 0
Cons:
- new device ID
- non-trivial spec changes which end up in the tricky cases that
try to work around legacy to fit a hardware implementation
- work only for the case of virtualization with the help of
mediation, can't work for bare metal
- only work for some specific archs without SVQ

2) allow BAR0 to be MMIO for transitional device

Pros:
- very minor change for the spec
- work for virtualization (and it works even without dedicated
mediation for some setups)
- work for bare metal for some setups (without mediation)
Cons:
- only work for some specific archs without SVQ
- BAR0 is required

3) modern device mediation for legacy

Pros:
- no changes in the spec
Cons:
- require mediation layer in order to work in bare metal
- require datapath mediation like SVQ to work for virtualization

Compared to method 2), the only advantage of method 1) is the
flexibility of BAR0, but it has too many disadvantages. If we only care
about virtualization, modern devices are sufficient. Then why bother
with that?

Thanks



* [virtio-dev] Re: [virtio-comment] Re: [PATCH 09/11] transport-pci: Describe PCI MMR dev config registers
  2023-04-13  1:48                                             ` [virtio-dev] " Jason Wang
@ 2023-04-13  3:31                                               ` Parav Pandit
  2023-04-13  5:14                                                 ` Jason Wang
  0 siblings, 1 reply; 200+ messages in thread
From: Parav Pandit @ 2023-04-13  3:31 UTC (permalink / raw)
  To: Jason Wang
  Cc: Michael S. Tsirkin, virtio-dev, cohuck, virtio-comment,
	Shahaf Shuler, Satananda Burla, Maxime Coquelin, Yan Vugenfirer



On 4/12/2023 9:48 PM, Jason Wang wrote:
> On Wed, Apr 12, 2023 at 10:23 PM Parav Pandit <parav@nvidia.com> wrote:
>>
>>
>>
>>> From: Jason Wang <jasowang@redhat.com>
>>> Sent: Wednesday, April 12, 2023 2:15 AM
>>>
>>> On Wed, Apr 12, 2023 at 1:55 PM Parav Pandit <parav@nvidia.com> wrote:
>>>>
>>>>
>>>>
>>>>> From: Jason Wang <jasowang@redhat.com>
>>>>> Sent: Wednesday, April 12, 2023 1:38 AM
>>>>
>>>>>> Modern device says FEATURE_1 must be offered and must be
>>>>>> negotiated by
>>>>> driver.
>>>>>> Legacy has Mac as RW area. (hypervisor can do it).
>>>>>> Reset flow is different between the legacy and modern.
>>>>>
>>>>> Just to make sure we're at the same page. We're talking in the
>>>>> context of mediation. Without mediation, your proposal can't work.
>>>>>
>>>> Right.
>>>>
>>>>> So in this case, the guest driver is not talking with the device
>>>>> directly. Qemu needs to trap whatever it wants to achieve the
>>>>> mediation:
>>>>>
>>>> I prefer to avoid picking specific sw component here, but yes. QEMU can trap.
>>>>
>>>>> 1) It's perfectly fine that Qemu negotiated VERSION_1 but presented
>>>>> a mediated legacy device to guests.
>>>> Right but if VERSION_1 is negotiated, device will work as V_1 with 12B
>>> virtio_net_hdr.
>>>
>>> Shadow virtqueue could be used here. And we have much more issues without
>>> shadow virtqueue, more below.
>>>
>>>>
>>>>> 2) For MAC and Reset, Qemu can trap and do anything it wants.
>>>>>
>>>> The idea is not to poke in the fields even though such sw can.
>>>> MAC is RW in legacy.
>>>> Mac is RO in 1.x.
>>>>
>>>> So QEMU cannot make RO register into RW.
>>>
>>> It can be done via using the control vq. Trap the MAC write and forward it via
>>> control virtqueue.
>>>
>> This proposal is not implementing a vdpa mediator, which requires a far higher understanding in the hypervisor.
> 
> It's not related to vDPA, it's about a common technology that is used
> in virtualization. You do a trap and emulate the status, why can't you
> do that for others?
> 
>> Such mediation works fine for vdpa and it is up to the vdpa layer to do. Not relevant here.
>>
>>>>
>>>> The proposed solution in this series enables it and avoids per-field sw
>>> interpretation and mediation in parsing values etc.
>>>
>>> I don't think it's possible. See the discussion about ORDER_PLATFORM and
>>> ACCESS_PLATFORM in previous threads.
>>>
>> I have read the previous thread.
>> The hypervisor will be limited to those platforms where ORDER_PLATFORM is not needed.
> 
> So you introduce a bunch of new facilities that only work on some
> specific archs. This breaks the architecture independence of virtio
> since 1.0. 
The spec as defined for the PCI device does not work today for a
transitional device in virtualization. It only works in the limited PF case.
Hence this update. More below.

> The root cause is legacy is not fit for hardware
> implementation, any kind of hardware that tries to offer legacy
> function will finally run into those corner cases which require extra
> interfaces which may finally end up with a (partial) duplication of
> the modern interface.
> 
I agree with you. We cannot change the legacy.
What is being added here is to enable legacy transport via MMIO or AQ,
using the notification region.

Will comment where you listed 3 options.

>> And this is a pci transitional device that uses the standard platform dma anyway so ACCESS_PLATFORM is not related.
> 
> So which type of transactions did this device use when it is used via
> legacy MMIO BAR? Translated request or not?
> 
The device uses the PCI transport-level addresses configured, because it's
a PCI device.

>> For example, a device may have implemented say only BAR2, and a small portion of BAR2 points to the legacy MMIO config registers.
> 
> We're discussing spec changes, not a specific implementation here. Why
> can't the device use BAR0? Do you see any restriction in the spec?
> 
No restriction.
Forcing it to use BAR0 is the restrictive method.
>> A mediator hypervisor sw will be able to read/write to it when BAR0 is exposed towards the guest VM as IOBAR 0.
> 
> So I don't think it can work:
> 
> 1) This is very dangerous unless the spec mandates the size (this is
> also tricky since page size varies among arches) for any
> BAR/capability, which is not what virtio wants; the spec leaves that
> flexibility to the implementation:
> 
> E.g
> 
> """
> The driver MUST accept a cap_len value which is larger than specified here.
> """
cap_len gives the length of the PCI capability structure as defined by
the PCI spec; the BAR region length is in the le32 length field.

So the new MMIO region can be of any size and anywhere in the BAR.
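
For reference, the capability layout being discussed (as in the 1.x
spec; virtio 1.2 additionally carves an id byte out of the padding):

    struct virtio_pci_cap {
            u8 cap_vndr;    /* Generic PCI field: PCI_CAP_ID_VNDR */
            u8 cap_next;    /* Generic PCI field: next ptr. */
            u8 cap_len;     /* Generic PCI field: capability length */
            u8 cfg_type;    /* Identifies the structure. */
            u8 bar;         /* Where to find it. */
            u8 padding[3];  /* Pad to full dword. */
            le32 offset;    /* Offset within bar. */
            le32 length;    /* Length of the structure, in bytes. */
    };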

For LM, BAR length and number should be the same between the two PCI
VFs. But that's orthogonal to this point. Such checks will be done anyway.

> 
> 2) A blocker for live migration (and compatibility), the hypervisor
> should not assume the size for any capability so for whatever case it
> should have a fallback for the case where the BAR can't be assigned.
> 
I agree that the hypervisor should not assume.
For LM such compatibility checks will be done anyway.
So it is not a blocker; all that is needed is that they match on the two sides.

> Let me summarize, we had three ways currently:
> 
> 1) legacy MMIO BAR via capability:
> 
> Pros:
> - allow some flexibility to place MMIO BAR other than 0
> Cons:
> - new device ID
Not needed, as Michael suggests. An existing transitional or non-transitional
device can expose this optional capability and its attached MMIO region.

Spec changes are similar to #2.
> - non-trivial spec changes which end up in the tricky cases that
> try to work around legacy to fit a hardware implementation
> - work only for the case of virtualization with the help of
> mediation, can't work for bare metal
For bare-metal PFs usually thin hypervisors are used that do very
minimal setup. But I agree that bare metal is relatively less important.

> - only work for some specific archs without SVQ
>
That is the legacy limitation that we don't worry about.

> 2) allow BAR0 to be MMIO for transitional device
> 
> Pros:
> - very minor change for the spec
Spec changes wise they are similar to #1.
> - work for virtualization (and it works even without dedicated
> mediation for some setups)
I am not aware of where it can work without mediation. Do you know any
specific kernel version where it actually works?

> - work for bare metal for some setups (without mediation)
> Cons:
> - only work for some specific archs without SVQ
> - BAR0 is required
> 
Neither is a limitation, as they mainly come from the legacy side
of things.

> 3) modern device mediation for legacy
> 
> Pros:
> - no changes in the spec
> Cons:
> - require mediation layer in order to work in bare metal
> - require datapath mediation like SVQ to work for virtualization
> 
A spec change is still required for net and blk because a modern device does
not understand legacy, even with a mediation layer.
FEATURE_1, and the RW cap via CVQ, which is not really owned by the hypervisor.
A guest may be legacy or non-legacy, so mediation shouldn't always be done.

> Compared to method 2), the only advantage of method 1) is the
> flexibility of BAR0, but it has too many disadvantages. If we only care
> about virtualization, modern devices are sufficient. Then why bother
> with that?

So that a single stack, which doesn't always have the knowledge of which
driver version is running in the guest, can utilize it. Otherwise 1.x also
ends up doing mediation when guest driver = 1.x and device = transitional
PCI VF.

So (1) and (2) are equivalent; one is more flexible. If you know
more valid cases where BAR0 as MMIO can work as-is, that option is open.

We can draft the spec so that the MMIO BAR SHOULD be exposed in BAR0.


* [virtio-dev] Re: [virtio-comment] Re: [PATCH 09/11] transport-pci: Describe PCI MMR dev config registers
  2023-04-13  3:31                                               ` Parav Pandit
@ 2023-04-13  5:14                                                 ` Jason Wang
  2023-04-13 17:19                                                   ` Michael S. Tsirkin
  2023-04-13 17:24                                                   ` [virtio-dev] " Parav Pandit
  0 siblings, 2 replies; 200+ messages in thread
From: Jason Wang @ 2023-04-13  5:14 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Michael S. Tsirkin, virtio-dev, cohuck, virtio-comment,
	Shahaf Shuler, Satananda Burla, Maxime Coquelin, Yan Vugenfirer

On Thu, Apr 13, 2023 at 11:31 AM Parav Pandit <parav@nvidia.com> wrote:
>
>
>
> On 4/12/2023 9:48 PM, Jason Wang wrote:
> > On Wed, Apr 12, 2023 at 10:23 PM Parav Pandit <parav@nvidia.com> wrote:
> >>
> >>
> >>
> >>> From: Jason Wang <jasowang@redhat.com>
> >>> Sent: Wednesday, April 12, 2023 2:15 AM
> >>>
> >>> On Wed, Apr 12, 2023 at 1:55 PM Parav Pandit <parav@nvidia.com> wrote:
> >>>>
> >>>>
> >>>>
> >>>>> From: Jason Wang <jasowang@redhat.com>
> >>>>> Sent: Wednesday, April 12, 2023 1:38 AM
> >>>>
> >>>>>> Modern device says FEATURE_1 must be offered and must be
> >>>>>> negotiated by
> >>>>> driver.
> >>>>>> Legacy has Mac as RW area. (hypervisor can do it).
> >>>>>> Reset flow is different between the legacy and modern.
> >>>>>
> >>>>> Just to make sure we're at the same page. We're talking in the
> >>>>> context of mediation. Without mediation, your proposal can't work.
> >>>>>
> >>>> Right.
> >>>>
> >>>>> So in this case, the guest driver is not talking with the device
> >>>>> directly. Qemu needs to trap whatever it wants to achieve the
> >>>>> mediation:
> >>>>>
> >>>> I prefer to avoid picking specific sw component here, but yes. QEMU can trap.
> >>>>
> >>>>> 1) It's perfectly fine that Qemu negotiated VERSION_1 but presented
> >>>>> a mediated legacy device to guests.
> >>>> Right but if VERSION_1 is negotiated, device will work as V_1 with 12B
> >>> virtio_net_hdr.
> >>>
> >>> Shadow virtqueue could be used here. And we have much more issues without
> >>> shadow virtqueue, more below.
> >>>
> >>>>
> >>>>> 2) For MAC and Reset, Qemu can trap and do anything it wants.
> >>>>>
> >>>> The idea is not to poke in the fields even though such sw can.
> >>>> MAC is RW in legacy.
> >>>> Mac is RO in 1.x.
> >>>>
> >>>> So QEMU cannot make RO register into RW.
> >>>
> >>> It can be done via using the control vq. Trap the MAC write and forward it via
> >>> control virtqueue.
> >>>
> >> This proposal is not implementing a vdpa mediator, which requires a far higher understanding in the hypervisor.
> >
> > It's not related to vDPA, it's about a common technology that is used
> > in virtualization. You do a trap and emulate the status, why can't you
> > do that for others?
> >
> >> Such mediation works fine for vdpa and it is up to the vdpa layer to do. Not relevant here.
> >>
> >>>>
> >>>> The proposed solution in this series enables it and avoids per-field sw
> >>> interpretation and mediation in parsing values etc.
> >>>
> >>> I don't think it's possible. See the discussion about ORDER_PLATFORM and
> >>> ACCESS_PLATFORM in previous threads.
> >>>
> >> I have read the previous thread.
> >> The hypervisor will be limited to those platforms where ORDER_PLATFORM is not needed.
> >
> > So you introduce a bunch of new facilities that only work on some
> > specific archs. This breaks the architecture independence of virtio
> > since 1.0.
> The spec as defined for the PCI device does not work today for a
> transitional device in virtualization. It only works in the limited PF case.
> Hence this update.

I fully understand the motivation. I just want to say

1) compared to the MMIO at BAR0, this proposal doesn't provide many advantages
2) mediating on top of modern devices allows us to not worry about the
device design, which is hard for legacy

> More below.
>
> > The root cause is legacy is not fit for hardware
> > implementation, any kind of hardware that tries to offer legacy
> > function will finally run into those corner cases which require extra
> > interfaces which may finally end up with a (partial) duplication of
> > the modern interface.
> >
> I agree with you. We cannot change the legacy.
> What is being added here is to enable legacy transport via MMIO or AQ,
> using the notification region.
>
> Will comment where you listed 3 options.
>
> >> And this is a pci transitional device that uses the standard platform dma anyway so ACCESS_PLATFORM is not related.
> >
> > So which type of transactions did this device use when it is used via
> > legacy MMIO BAR? Translated request or not?
> >
> The device uses the PCI transport-level addresses configured, because it's
> a PCI device.
>
> >> For example, a device may have implemented say only BAR2, and a small portion of BAR2 points to the legacy MMIO config registers.
> >
> > We're discussing spec changes, not a specific implementation here. Why
> > can't the device use BAR0? Do you see any restriction in the spec?
> >
> No restriction.
> Forcing it to use BAR0 is the restrictive method.
> >> A mediator hypervisor sw will be able to read/write to it when BAR0 is exposed towards the guest VM as IOBAR 0.
> >
> > So I don't think it can work:
> >
> > 1) This is very dangerous unless the spec mandates the size (this is
> > also tricky since page size varies among arches) for any
> > BAR/capability, which is not what virtio wants; the spec leaves that
> > flexibility to the implementation:
> >
> > E.g
> >
> > """
> > The driver MUST accept a cap_len value which is larger than specified here.
> > """
> cap_len gives the length of the PCI capability structure as defined by
> the PCI spec; the BAR region length is in the le32 length field.
>
> So the new MMIO region can be of any size and anywhere in the BAR.
>
> For LM, BAR length and number should be the same between the two PCI VFs. But that's
> orthogonal to this point. Such checks will be done anyway.

Quoted the wrong sections, I think it should be:

"
length MAY include padding, or fields unused by the driver, or future
extensions. Note: For example, a future device might present a large
structure size of several MBytes. As current devices never utilize
structures larger than 4KBytes in size, driver MAY limit the mapped
structure size to e.g. 4KBytes (thus ignoring parts of structure after
the first 4KBytes) to allow forward compatibility with such devices
without loss of functionality and without wasting resources.
"

>
> >
> > 2) It's a blocker for live migration (and compatibility): the hypervisor
> > should not assume the size of any capability, so in any case it
> > should have a fallback for when the BAR can't be assigned.
> >
> I agree that the hypervisor should not assume.
> For LM such compatibility checks will be done anyway.
> So it is not a blocker; all that is needed is that they match on the two sides.
>
> > Let me summarize, we had three ways currently:
> >
> > 1) legacy MMIO BAR via capability:
> >
> > Pros:
> > - allow some flexibility to place MMIO BAR other than 0
> > Cons:
> > - new device ID
> Not needed, as Michael suggests. An existing transitional or non-transitional

If it's a transitional device but not placed at BAR0, it might have
side effects for Linux drivers which assume BAR0 for legacy.

I don't see how it could easily be a non-transitional device:

"
Devices or drivers with no legacy compatibility are referred to as
non-transitional devices and drivers, respectively.
"

> device can expose this optional capability and its attached MMIO region.
>
> Spec changes are similar to #2.
> > - non-trivial spec changes which end up in the tricky cases that
> > try to work around legacy to fit a hardware implementation
> > - work only for the case of virtualization with the help of
> > mediation, can't work for bare metal
> For bare-metal PFs usually thin hypervisors are used that do very
> minimal setup. But I agree that bare metal is relatively less important.

This is not what I understand. I know several vendors that are using
virtio devices for bare metal.

>
> > - only work for some specific archs without SVQ
> >
> That is the legacy limitation that we don't worry about.
>
> > 2) allow BAR0 to be MMIO for transitional device
> >
> > Pros:
> > - very minor change for the spec
> Spec changes wise they are similar to #1.

This is different since the changes for this are trivial.

> > - work for virtualization (and it works even without dedicated
> > mediation for some setups)
> I am not aware of where it can work without mediation. Do you know any
> specific kernel version where it actually works?

E.g current Linux driver did:

rc = pci_request_region(pci_dev, 0, "virtio-pci-legacy");

It doesn't distinguish I/O from memory. It means if you had a
"transitional" device with legacy MMIO BAR0, it just works.

>
> > - work for bare metal for some setups (without mediation)
> > Cons:
> > - only work for some specific archs without SVQ
> > - BAR0 is required
> >
> Neither is a limitation, as they mainly come from the legacy side
> of things.
>
> > 3) modern device mediation for legacy
> >
> > Pros:
> > - no changes in the spec
> > Cons:
> > - require mediation layer in order to work in bare metal
> > - require datapath mediation like SVQ to work for virtualization
> >
> A spec change is still required for net and blk because a modern device does
> not understand legacy, even with a mediation layer.

That's fine and easy since we work on top of modern devices.

> FEATURE_1, and the RW cap via CVQ, which is not really owned by the hypervisor.

Hypervisors can trap if they wish.

> A guest may be legacy or non-legacy, so mediation shouldn't always be done.

Yes, so mediation can work only if we find it's a legacy driver.

>
> > Compared to method 2), the only advantage of method 1) is the
> > flexibility of BAR0, but it has too many disadvantages. If we only care
> > about virtualization, modern devices are sufficient. Then why bother
> > with that?
>
> So that a single stack, which doesn't always have the knowledge of which
> driver version is running in the guest, can utilize it. Otherwise 1.x also
> ends up doing mediation when guest driver = 1.x and device = transitional
> PCI VF.

I don't see how this can be solved in your proposal.

>
> So (1) and (2) are equivalent; one is more flexible. If you know
> more valid cases where BAR0 as MMIO can work as-is, that option is open.

As said in previous threads, this has been used by several vendors for years.

E.g I have a handy transitional hardware virtio device that has:

        Region 0: Memory at f5ff0000 (64-bit, prefetchable) [size=8K]
        Region 2: Memory at f5fe0000 (64-bit, prefetchable) [size=4K]
        Region 4: Memory at f5800000 (64-bit, prefetchable) [size=4M]

And:

        Capabilities: [64] Vendor Specific Information: VirtIO: CommonCfg
                BAR=0 offset=00000888 size=00000078
        Capabilities: [74] Vendor Specific Information: VirtIO: Notify
                BAR=0 offset=00001800 size=00000020 multiplier=00000000
        Capabilities: [88] Vendor Specific Information: VirtIO: ISR
                BAR=0 offset=00000820 size=00000020
        Capabilities: [98] Vendor Specific Information: VirtIO: DeviceCfg
                BAR=0 offset=00000840 size=00000020

>
> We can draft the spec so that the MMIO BAR SHOULD be exposed in BAR0.
>

Thanks



* [virtio-dev] Re: [virtio-comment] Re: [virtio-dev] [PATCH 08/11] transport-pci: Introduce virtio extended capability
  2023-04-12  5:37                     ` Jason Wang
@ 2023-04-13 17:03                       ` Michael S. Tsirkin
  0 siblings, 0 replies; 200+ messages in thread
From: Michael S. Tsirkin @ 2023-04-13 17:03 UTC (permalink / raw)
  To: Jason Wang
  Cc: Parav Pandit, virtio-dev, cohuck, virtio-comment, Shahaf Shuler,
	Satananda Burla

On Wed, Apr 12, 2023 at 01:37:59PM +0800, Jason Wang wrote:
> On Wed, Apr 12, 2023 at 1:25 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Wed, Apr 12, 2023 at 12:53:52PM +0800, Jason Wang wrote:
> > > On Wed, Apr 12, 2023 at 12:20 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > >
> > > > On Wed, Apr 12, 2023 at 12:07:26PM +0800, Jason Wang wrote:
> > > > > On Wed, Apr 12, 2023 at 5:25 AM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > >
> > > > > > On Tue, Apr 11, 2023 at 07:01:16PM +0000, Parav Pandit wrote:
> > > > > > >
> > > > > > > > From: virtio-dev@lists.oasis-open.org <virtio-dev@lists.oasis-open.org> On
> > > > > > > > Behalf Of Jason Wang
> > > > > > > > Sent: Monday, April 10, 2023 11:29 PM
> > > > > > >
> > > > > > > > > However, it is not backward compatible; if the device places them in the
> > > > > > > > > extended capability, it will not work.
> > > > > > > > >
> > > > > > > >
> > > > > > > > It is kind of intended since it is only used for new PCI-E features:
> > > > > > > >
> > > > > > > New fields in new extended pci cap area is fine.
> > > > > > > Migrating old fields to be present in the new extended pci cap is not your intention, right?
> > > > > > >
> > > > > > > > "
> > > > > > > > +The location of the virtio structures that depend on the PCI Express
> > > > > > > > +capability are specified using a vendor-specific extended capabilities
> > > > > > > > +on the extended capabilities list in PCI Express extended configuration
> > > > > > > > +space of the device.
> > > > > > > > "
> > > > > > > >
> > > > > > > > > To make it backward compatible, a device needs to expose the existing
> > > > > > > > > structure in the legacy area, and the extended structure for the same capability
> > > > > > > > > in the extended pci capability region.
> > > > > > > > >
> > > > > > > > > In other words, it will have to be a both places.
> > > > > > > >
> > > > > > > > Then we will run out of config space again?
> > > > > > > No.
> > > > > > > Only currently defined caps need to be placed in two places.
> > > > > > > New fields don’t need to be placed in PCI cap, because no driver is looking there.
> > > > > > >
> > > > > > > We probably already discussed this in previous email by now.
> > > > > > >
> > > > > > > > Otherwise we need to deal with the
> > > > > > > > case when existing structures were only placed at extended capability. Michael
> > > > > > > > suggests adding a new feature, but the driver may not negotiate the feature
> > > > > > > > which requires more thought.
> > > > > > > >
> > > > > > > Not sure I understand feature bit.
> > > > > >
> > > > > > This is because we have a concept of dependency between
> > > > > > features but not a concept of dependency of feature on
> > > > > > capability.
> > > > > >
> > > > > > > PCI transport fields existence is usually not dependent on upper layer protocol.
> > > > > > >
> > > > > > > > > We may need it even sooner than this because the AQ patch is expanding
> > > > > > > > > the structure located in legacy area.
> > > > > > > >
> > > > > > > > Just to make sure I understand this, assuming we have adminq, any reason a
> > > > > > > > dedicated pcie ext cap is required?
> > > > > > > >
> > > > > > > No, it was short-sighted of me. I responded right after the text above that AQ doesn’t need a cap extension.
> > > > > >
> > > > > >
> > > > > >
> > > > > > You know, thinking about this, I begin to feel that we should
> > > > > > require that if at least one extended config exists then
> > > > > > all caps present in the regular config are *also*
> > > > > > mirrored in the extended config. IOW extended >= regular.
> > > > > > The reason is that extended config can be emulated more efficiently
> > > > > > (2x fewer exits).
> > > > >
> > > > > Any reason for it to get fewer exits?
> > > >
> > > > For a variety of reasons having to do with buggy hardware: e.g. Linux
> > > > likes to use cf8/cfc for legacy ranges, so 2 accesses are required for each
> > > > read/write. Extended space needs just 1.
> > > >
> > >
> > > Ok.
> > >
> > > >
> > > > > At least it has not been done in
> > > > > current Qemu's emulation. (And do we really care about the performance
> > > > > of config space access?)
> > > > >
> > > > > Thanks
> > > >
> > > > For boot speed, yes. Not minor 5% things but 2x, sure.
> > >
> > > If we care about boot speed we should avoid using the PCI layer in the
> > > guest completely.
> > >
> > > Thanks
> >
> > Woa. And do what? Add a ton of functionality in a PV way to MMIO?
> 
> Probably; we have microVM already. And Hyper-V dropped PCI since Gen2.
> 
> > NUMA, MSI, power management .... the list goes on and on.
> > If you have pci on the host it is way easier to pass that
> > through to guest than do a completely different thing.
> 
> It's a balance. If you want functionality, PCI is probably a must. But
> if you care only about boot speed, the boot speed is not slowed
> down by a single device but by the whole PCI layer.
> 
> Thanks

I don't know that, if we add a ton of features to the mmio layer,
it won't slow down, too.
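
(To illustrate the exit-count point above, a minimal sketch; variable
declarations are omitted, and the address encodings follow the usual
CF8/CFC and ECAM conventions:)

    /* Legacy CF8/CFC config read: address write, then data read,
     * i.e. two trapped accesses per 32-bit read. */
    outl(0x80000000 | (bus << 16) | (dev << 11) | (fn << 8) | (reg & 0xfc),
         0xCF8);
    val = inl(0xCFC);

    /* ECAM/MMCONFIG: a single memory-mapped read, one exit. */
    val = readl(ecam_base + ((bus << 20) | (dev << 15) | (fn << 12) | reg));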

> >
> > --
> > MST
> >



* [virtio-dev] Re: [virtio-comment] Re: [PATCH 09/11] transport-pci: Describe PCI MMR dev config registers
  2023-04-13  5:14                                                 ` Jason Wang
@ 2023-04-13 17:19                                                   ` Michael S. Tsirkin
  2023-04-13 19:39                                                     ` [virtio-dev] " Parav Pandit
  2023-04-13 17:24                                                   ` [virtio-dev] " Parav Pandit
  1 sibling, 1 reply; 200+ messages in thread
From: Michael S. Tsirkin @ 2023-04-13 17:19 UTC (permalink / raw)
  To: Jason Wang
  Cc: Parav Pandit, virtio-dev, cohuck, virtio-comment, Shahaf Shuler,
	Satananda Burla, Maxime Coquelin, Yan Vugenfirer

On Thu, Apr 13, 2023 at 01:14:15PM +0800, Jason Wang wrote:
> > >>>> The proposed solution in this series enables it and avoids per-field sw
> > >>> interpretation and mediation in parsing values etc.

... except for reset, notifications, and maybe more down the road.


> > >>> I don't think it's possible. See the discussion about ORDER_PLATFORM and
> > >>> ACCESS_PLATFORM in previous threads.
> > >>>
> > >> I have read the previous thread.
> > >> The hypervisor will be limited to those platforms where ORDER_PLATFORM is not needed.
> > >
> > > So you introduce a bunch of new facilities that only work on some
> > > specific archs. This breaks the architecture independence of virtio
> > > since 1.0.
> > The spec as defined for the PCI device does not work today for a
> > transitional device in virtualization. It only works in the limited PF case.
> > Hence this update.
> 
> I fully understand the motivation. I just want to say
> 
> 1) compared to the MMIO at BAR0, this proposal doesn't provide many advantages
> 2) mediating on top of modern devices allows us to not worry about the
> device design, which is hard for legacy

I begin to think so, too. When I proposed this it looked like just a
single capability would be enough, without a lot of mess. But it seems
that addressing this fully is getting more and more complex.
The one thing we can't do in software is the different header size for
virtio net. For starters, let's add a capability to address that?
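
(For reference, the difference in question: the legacy virtio-net
header omits num_buffers unless VIRTIO_NET_F_MRG_RXBUF was negotiated,
while with VERSION_1 the 12-byte form is always used:)

    struct virtio_net_hdr {         /* legacy: 10 bytes without MRG_RXBUF,
                                       fields in guest-native endian */
            u8 flags;
            u8 gso_type;
            u16 hdr_len;
            u16 gso_size;
            u16 csum_start;
            u16 csum_offset;
            /* u16 num_buffers;        present only with MRG_RXBUF */
    };

    /* With VERSION_1, num_buffers is always present (little-endian),
     * making the header 12 bytes regardless of MRG_RXBUF. */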

-- 
MST



* [virtio-dev] Re: [virtio-comment] Re: [PATCH 09/11] transport-pci: Describe PCI MMR dev config registers
  2023-04-13  5:14                                                 ` Jason Wang
  2023-04-13 17:19                                                   ` Michael S. Tsirkin
@ 2023-04-13 17:24                                                   ` Parav Pandit
  2023-04-13 21:02                                                     ` Michael S. Tsirkin
  2023-04-14  3:08                                                     ` Jason Wang
  1 sibling, 2 replies; 200+ messages in thread
From: Parav Pandit @ 2023-04-13 17:24 UTC (permalink / raw)
  To: Jason Wang
  Cc: Michael S. Tsirkin, virtio-dev, cohuck, virtio-comment,
	Shahaf Shuler, Satananda Burla, Maxime Coquelin, Yan Vugenfirer



On 4/13/2023 1:14 AM, Jason Wang wrote:

>> For LM, BAR length and number should be the same between the two PCI VFs. But that's
>> orthogonal to this point. Such checks will be done anyway.
> 
> Quoted the wrong sections, I think it should be:
> 
> "
> length MAY include padding, or fields unused by the driver, or future
> extensions. Note: For example, a future device might present a large
> structure size of several MBytes. As current devices never utilize
> structures larger than 4KBytes in size, driver MAY limit the mapped
> structure size to e.g. 4KBytes (thus ignoring parts of structure after
> the first 4KBytes) to allow forward compatibility with such devices
> without loss of functionality and without wasting resources.
> "
Yes. This is the one.

> If it's a transitional device but not placed at BAR0, it might have
> side effects for Linux drivers which assume BAR0 for legacy.
> 
True. Transitional can be at BAR0.

> I don't see how it could easily be a non-transitional device:
> 
> "
> Devices or drivers with no legacy compatibility are referred to as
> non-transitional devices and drivers, respectively.
> "
Michael has suggested rewording the text.
It is anyway new text, so let's park it aside for now.
It is mostly tweaking the text.

> 
>> device can expose this optional capability and its attached MMIO region.
>>
>> Spec changes are similar to #2.
>>> - non-trivial spec changes which end up in the tricky cases that
>>> try to work around legacy to fit a hardware implementation
>>> - work only for the case of virtualization with the help of
>>> mediation, can't work for bare metal
>> For bare-metal PFs usually thin hypervisors are used that do very
>> minimal setup. But I agree that bare metal is relatively less important.
> 
> This is not what I understand. I know several vendors that are using
> virtio devices for bare metal.
> 
I was saying the case for legacy bare metal is less of a problem because
PCIe does not limit functionality; perf is still limited due to the IOBAR.

>>
>>> - only work for some specific archs without SVQ
>>>
>> That is the legacy limitation that we don't worry about.
>>
>>> 2) allow BAR0 to be MMIO for transitional device
>>>
>>> Pros:
>>> - very minor change for the spec
>> Spec changes wise they are similar to #1.
> 
> This is different since the changes for this are trivial.
> 
>>> - work for virtualization (and it works even without dedicated
>>> mediation for some setups)
>> I am not aware of where it can work without mediation. Do you know any
>> specific kernel version where it actually works?
> 
> E.g current Linux driver did:
> 
> rc = pci_request_region(pci_dev, 0, "virtio-pci-legacy");
> 
> It doesn't distinguish I/O from memory. It means if you had a
> "transitional" device with legacy MMIO BAR0, it just works.
> 

Thanks to the abstract PCI API in Linux.

>>> - work for bare metal for some setups (without mediation)
>>> Cons:
>>> - only work for some specific archs without SVQ
>>> - BAR0 is required
>>>
>> Both are not limitation as they are mainly coming from the legacy side
>> of things.
>>
>>> 3) modern device mediation for legacy
>>>
>>> Pros:
>>> - no changes in the spec
>>> Cons:
>>> - require mediation layer in order to work in bare metal
>>> - require datapath mediation like SVQ to work for virtualization
>>>
>> A spec change is still required for net and blk because a modern device does
>> not understand legacy, even with a mediation layer.
> 
> That's fine and easy since we work on top of modern devices.
> 
>> FEATURE_1, and the RW cap via CVQ, which is not really owned by the hypervisor.
> 
> Hypervisors can trap if they wish.
> 
Trapping non-legacy accesses for 1.x doesn't make sense.

>> A guest may be legacy or non-legacy, so mediation shouldn't always be done.
> 
> Yes, so mediation can work only if we find it's a legacy driver.
> 
Mediation will be done only for legacy accesses, without cvq; the rest will
go as-is without any cvq or other mediation.

>>
>>> Compared to method 2), the only advantage of method 1) is the
>>> flexibility of BAR0, but it has too many disadvantages. If we only care
>>> about virtualization, modern devices are sufficient. Then why bother
>>> with that?
>>
>> So that a single stack, which doesn't always have the knowledge of which
>> driver version is running in the guest, can utilize it. Otherwise 1.x also
>> ends up doing mediation when guest driver = 1.x and device = transitional
>> PCI VF.
> 
> I don't see how this can be solved in your proposal.
> 
This proposal only traps the legacy accesses and doesn't require another
giant framework.

I think we can make BAR0 work for transitional with a spec change and
with an optional notification region.
I am evaluating further.

>>
>> So (1) and (2) are equivalent; one is more flexible. If you know
>> more valid cases where BAR0 as MMIO can work as-is, that option is open.
> 
> As said in previous threads, this has been used by several vendors for years.
> 
> E.g I have a handy transitional hardware virtio device that has:
> 
>          Region 0: Memory at f5ff0000 (64-bit, prefetchable) [size=8K]
>          Region 2: Memory at f5fe0000 (64-bit, prefetchable) [size=4K]
>          Region 4: Memory at f5800000 (64-bit, prefetchable) [size=4M]
> 
> And:
> 
>          Capabilities: [64] Vendor Specific Information: VirtIO: CommonCfg
>                  BAR=0 offset=00000888 size=00000078
>          Capabilities: [74] Vendor Specific Information: VirtIO: Notify
>                  BAR=0 offset=00001800 size=00000020 multiplier=00000000
>          Capabilities: [88] Vendor Specific Information: VirtIO: ISR
>                  BAR=0 offset=00000820 size=00000020
>          Capabilities: [98] Vendor Specific Information: VirtIO: DeviceCfg
>                  BAR=0 offset=00000840 size=00000020
> 
>>
>> We can draft the spec so that the MMIO BAR SHOULD be exposed in BAR0.
>>
Yes, the above one.


* [virtio-dev] RE: [virtio-comment] Re: [PATCH 09/11] transport-pci: Describe PCI MMR dev config registers
  2023-04-13 17:19                                                   ` Michael S. Tsirkin
@ 2023-04-13 19:39                                                     ` Parav Pandit
  2023-04-14  3:09                                                       ` [virtio-dev] " Jason Wang
  0 siblings, 1 reply; 200+ messages in thread
From: Parav Pandit @ 2023-04-13 19:39 UTC (permalink / raw)
  To: Michael S. Tsirkin, Jason Wang
  Cc: virtio-dev, cohuck, virtio-comment, Shahaf Shuler,
	Satananda Burla, Maxime Coquelin, Yan Vugenfirer


> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Thursday, April 13, 2023 1:20 PM
> 
> On Thu, Apr 13, 2023 at 01:14:15PM +0800, Jason Wang wrote:
> > > >>>> The proposed solution in this series enables it and avoids per
> > > >>>> field sw
> > > >>> interpretation and mediation in parsing values etc.
> 
> ... except for reset, notifications, and maybe more down the road.
> 
Your AQ proposal addresses reset too.
Nothing extra for the notifications, as that comes for free from the device side.
> 
> > > >>> I don't think it's possible. See the discussion about
> > > >>> ORDER_PLATFORM and ACCESS_PLATFORM in previous threads.
> > > >>>
> > > >> I have read the previous thread.
> > > >> The hypervisor will be limited to those platforms where ORDER_PLATFORM
> is not needed.
> > > >
> > > > So you introduce a bunch of new facilities that only work on some
> > > > specific archs. This breaks the architecture independence of
> > > > virtio since 1.0.
> > > The spec as defined for the PCI device does not work today for a transitional
> > > device in virtualization. It only works in the limited PF case.
> > > Hence this update.
> >
> > I fully understand the motivation. I just want to say
> >
> > 1) compared to the MMIO at BAR0, this proposal doesn't provide many
> > advantages
> > 2) mediating on top of modern devices allows us to not worry about the
> > device design, which is hard for legacy
> 
> I begin to think so, too. When I proposed this it looked like just a single
> capability would be enough, without a lot of mess.  But it seems that addressing
> this fully is getting more and more complex.
> The one thing we can't do in software is the different header size for virtio net. For
> starters, let's add a capability to address that?

The hdr bit doesn't solve it because the hypervisor is not involved in any trapping of feature bits, cvq or other vqs.
It is unified code for 1.x and transitional in the hypervisor.

We have two options to satisfy the requirements.
(partly taken/repeated from Jason's email yesterday).

1. AQ (solves reset) + notification, for building a non-transitional device that performs well and is both backward and forward compatible
Pros:
a. efficient device reset.
b. efficient notifications from OS to device
c. device vendor doesn't need to build transitional configuration space.
d. works without any mediation in hv for 1.x and non 1.x for all non-legacy interfaces (vqs, config space, cvq, and future features).
e. can work with non-Linux guest VMs too

Cons:
a. More AQ commands work in sw
b. Does not work for bare metal PFs

2. Allowing MMIO BAR0 on transitional device as SHOULD requirement with larger BAR size.
Pros:
a. Can work with Linux bare metal and Linux guest VMs as one of the wider use cases
b. inefficient device handling for notifications
c. Works without mediation like 1.d.
d. Also works without HV mediation.

Cons:
a. device reset implementation is very hard for the hw.
b. requires a transitional device to be built.
c. Notification performance may suffer.

For Marvell and us, #1 works well.
I am evaluating #2 and will get back.


* [virtio-dev] Re: [virtio-comment] Re: [PATCH 09/11] transport-pci: Describe PCI MMR dev config registers
  2023-04-13 17:24                                                   ` [virtio-dev] " Parav Pandit
@ 2023-04-13 21:02                                                     ` Michael S. Tsirkin
  2023-04-13 21:08                                                       ` [virtio-dev] " Parav Pandit
  2023-04-14  3:08                                                     ` Jason Wang
  1 sibling, 1 reply; 200+ messages in thread
From: Michael S. Tsirkin @ 2023-04-13 21:02 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Jason Wang, virtio-dev, cohuck, virtio-comment, Shahaf Shuler,
	Satananda Burla, Maxime Coquelin, Yan Vugenfirer

On Thu, Apr 13, 2023 at 01:24:24PM -0400, Parav Pandit wrote:
> > > > - work for virtualization (and it works even without dedicated
> > > > mediation for some setups)
> > > I am not aware of where it can work without mediation. Do you know any
> > > specific kernel version where it actually works?
> > 
> > E.g current Linux driver did:
> > 
> > rc = pci_request_region(pci_dev, 0, "virtio-pci-legacy");
> > 
> > It doesn't distinguish I/O from memory. It means if you had a
> > "transitional" device with legacy MMIO BAR0, it just works.
> > 
> 
> Thanks to the abstract PCI API in Linux.

Right. I do however at least see the point of what Jason is proposing,
which is to enable some legacy guests without mediation in software.

This thing ... you move some code to the card and reduce the amount of
virtio knowledge in software but do not eliminate it completely.
Seems kind of pointless. Minimal hardware changes make more sense to
me, I'd say. Talking about that, what is a minimal hardware change
to allow a vdpa-based solution?
I think that's VIRTIO_NET_F_LEGACY_HEADER, right?

-- 
MST



* [virtio-dev] RE: [virtio-comment] Re: [PATCH 09/11] transport-pci: Describe PCI MMR dev config registers
  2023-04-13 21:02                                                     ` Michael S. Tsirkin
@ 2023-04-13 21:08                                                       ` Parav Pandit
  2023-04-14  2:36                                                         ` [virtio-dev] " Jason Wang
  0 siblings, 1 reply; 200+ messages in thread
From: Parav Pandit @ 2023-04-13 21:08 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Jason Wang, virtio-dev, cohuck, virtio-comment, Shahaf Shuler,
	Satananda Burla, Maxime Coquelin, Yan Vugenfirer



> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Thursday, April 13, 2023 5:02 PM

> This thing ... you move some code to the card and reduce the amount of virtio
> knowledge in software but do not eliminate it completely.
Sure. Practically there is no knowledge other than transporting, like vxlan encapsulation; here it's the AQ.

> Seems kind of pointless. Minimal hardware changes make more sense to me, I'd
> say. Talking about that, what is a minimal hardware change to allow a vdpa
> based solution?
> I think that's VIRTIO_NET_F_LEGACY_HEADER, right?

The main requirement/point is that there is a virtio PCI VF that is to be mapped to the guest VM.
Hence, no vdpa type of hypervisor layer exists in this use case.

This same VF needs to be transitional, as the guest kernel may not be known;
hence "sometimes vdpa, sometimes a regular 1.x VF" is not an option.
Hence for the next few years, a transitional VF will be plugged into the guest VM when the user is using the PCI VF devices.
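
(As a purely illustrative sketch of that "encapsulation" idea; the
opcode, field names and layout below are hypothetical, not from the
spec or this series:)

    /* Hypothetical AQ command carrying a legacy register access on
     * behalf of a VF -- vxlan-style encapsulation of the access. */
    struct aq_legacy_access {
            __le16 opcode;  /* hypothetical LEGACY_REG_WRITE / _READ */
            __le16 vf_id;   /* target VF whose legacy registers are accessed */
            __le16 offset;  /* offset within the legacy register block */
            __le16 size;    /* access width in bytes: 1, 2 or 4 */
            u8 data[4];     /* write payload, or readback buffer */
    };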


* [virtio-dev] Re: [virtio-comment] Re: [PATCH 09/11] transport-pci: Describe PCI MMR dev config registers
  2023-04-13 21:08                                                       ` [virtio-dev] " Parav Pandit
@ 2023-04-14  2:36                                                         ` Jason Wang
  2023-04-14  2:43                                                           ` [virtio-dev] " Parav Pandit
  2023-04-14  6:58                                                           ` [virtio-dev] " Michael S. Tsirkin
  0 siblings, 2 replies; 200+ messages in thread
From: Jason Wang @ 2023-04-14  2:36 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Michael S. Tsirkin, virtio-dev, cohuck, virtio-comment,
	Shahaf Shuler, Satananda Burla, Maxime Coquelin, Yan Vugenfirer

On Fri, Apr 14, 2023 at 5:08 AM Parav Pandit <parav@nvidia.com> wrote:
>
>
>
> > From: Michael S. Tsirkin <mst@redhat.com>
> > Sent: Thursday, April 13, 2023 5:02 PM
>
> > This thing ... you move some code to the card and reduce the amount of virtio
> > knowledge in software but do not eliminate it completely.
> Sure. Practically there is no knowledge other than transporting, like vxlan encapsulation; here it's the AQ.
>
> > Seems kind of pointless. Minimal hardware changes make more sense to me, I'd
> > say. Talking about that, what is a minimal hardware change to allow a vdpa
> > based solution?
> > I think that's VIRTIO_NET_F_LEGACY_HEADER, right?

I think it is. It would be much easier if we do this.

>
> The main requirement/point is that there is a virtio PCI VF that is to be mapped to the guest VM.
> Hence, no vdpa type of hypervisor layer exists in this use case.

It's about mediation, which is a must for things like legacy. If
there's a way that helps the vendor get rid of the tricky legacy
completely, then why not?

>
> This same VF needs to be transitional, as the guest kernel may not be known;
> hence "sometimes vdpa, sometimes a regular 1.x VF" is not an option.
> Hence for the next few years, a transitional VF will be plugged into the guest VM when the user is using the PCI VF devices.

I'm not sure I get this. With VIRTIO_NET_F_LEGACY_HEADER, we don't
need mediation for the datapath. For the control path, mediation is a must
for legacy, and it's very easy to keep it working for modern; what's wrong
with that?

Thanks




* [virtio-dev] RE: [virtio-comment] Re: [PATCH 09/11] transport-pci: Describe PCI MMR dev config registers
  2023-04-14  2:36                                                         ` [virtio-dev] " Jason Wang
@ 2023-04-14  2:43                                                           ` Parav Pandit
  2023-04-14  6:57                                                             ` [virtio-dev] " Michael S. Tsirkin
  2023-04-14  6:58                                                           ` [virtio-dev] " Michael S. Tsirkin
  1 sibling, 1 reply; 200+ messages in thread
From: Parav Pandit @ 2023-04-14  2:43 UTC (permalink / raw)
  To: Jason Wang
  Cc: Michael S. Tsirkin, virtio-dev, cohuck, virtio-comment,
	Shahaf Shuler, Satananda Burla, Maxime Coquelin, Yan Vugenfirer


> From: Jason Wang <jasowang@redhat.com>
> Sent: Thursday, April 13, 2023 10:37 PM

> I'm not sure I get this. With VIRTIO_NET_F_LEGACY_HEADER, we don't need
> mediation for the datapath. For the control path, mediation is a must for legacy and
> it's very easy to keep it working for modern, what's wrong with that?

There is a virtio PCI VF.
The user who attaches this VF to the VM doesn't know whether the guest will run a legacy driver or 1.x.
Hence the hypervisor doesn't want to run a special stack when the guest may (likely) be 1.x and the device also supports 1.x.

Therefore, MMIO BAR0 emulation is the better solution in this use case, without any mediation.
The current specification claims that a transitional device "works"; hence it is fine to extend BAR0 to be of MMIO type instead of I/O type.

In some other cases, control-path or other additional mediation can be done.


* [virtio-dev] Re: [virtio-comment] Re: [PATCH 09/11] transport-pci: Describe PCI MMR dev config registers
  2023-04-13 17:24                                                   ` [virtio-dev] " Parav Pandit
  2023-04-13 21:02                                                     ` Michael S. Tsirkin
@ 2023-04-14  3:08                                                     ` Jason Wang
  2023-04-14  3:13                                                       ` [virtio-dev] " Parav Pandit
  1 sibling, 1 reply; 200+ messages in thread
From: Jason Wang @ 2023-04-14  3:08 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Michael S. Tsirkin, virtio-dev, cohuck, virtio-comment,
	Shahaf Shuler, Satananda Burla, Maxime Coquelin, Yan Vugenfirer

On Fri, Apr 14, 2023 at 1:24 AM Parav Pandit <parav@nvidia.com> wrote:
>
>
>
> On 4/13/2023 1:14 AM, Jason Wang wrote:
>
> >> For LM, BAR length and number should be the same between the two PCI VFs. But that's
> >> orthogonal to this point. Such checks will be done anyway.
> >
> > Quoted the wrong sections, I think it should be:
> >
> > "
> > length MAY include padding, or fields unused by the driver, or future
> > extensions. Note: For example, a future device might present a large
> > structure size of several MBytes. As current devices never utilize
> > structures larger than 4KBytes in size, driver MAY limit the mapped
> > structure size to e.g. 4KBytes (thus ignoring parts of structure after
> > the first 4KBytes) to allow forward compatibility with such devices
> > without loss of functionality and without wasting resources.
> > "
> yes. This is the one.
>
> > If it's a transitional device but not placed at BAR0, it might have
> > side effects for Linux drivers which assume BAR0 for legacy.
> >
> True. Transitional can be at BAR0.
>
> > I don't see how it could easily be a non-transitional device:
> >
> > "
> > Devices or drivers with no legacy compatibility are referred to as
> > non-transitional devices and drivers, respectively.
> > "
> Michael has suggested rewording the text.
> It is anyway new text, so let's park it aside for now.
> It is mostly tweaking the text.
>
> >
> >> device can expose this optional capability and its attached MMIO region.
> >>
> >> Spec changes are similar to #2.
> >>> - non-trivial spec changes which end up in the tricky cases that
> >>> try to work around legacy to fit a hardware implementation
> >>> - work only for the case of virtualization with the help of
> >>> mediation, can't work for bare metal
> >> For bare-metal PFs usually thin hypervisors are used that do very
> >> minimal setup. But I agree that bare metal is relatively less important.
> >
> > This is not what I understand. I know several vendors that are using
> > virtio devices for bare metal.
> >
> I was saying the case for legacy bare metal is less of a problem because
> PCIe does not limit functionality; perf is still limited due to the IOBAR.
>
> >>
> >>> - only work for some specific archs without SVQ
> >>>
> >> That is the legacy limitation that we don't worry about.
> >>
> >>> 2) allow BAR0 to be MMIO for transitional device
> >>>
> >>> Pros:
> >>> - very minor change for the spec
> >> Spec changes wise they are similar to #1.
> >
> > This is different since the changes for this are trivial.
> >
> >>> - work for virtualization (and it works even without dedicated
> >>> mediation for some setups)
> >> I am not aware of where it can work without mediation. Do you know any
> >> specific kernel version where it actually works?
> >
> > E.g current Linux driver did:
> >
> > rc = pci_request_region(pci_dev, 0, "virtio-pci-legacy");
> >
> > It doesn't distinguish I/O from memory. It means if you had a
> > "transitional" device with legacy MMIO BAR0, it just works.
> >
>
> Thanks to the abstract PCI API in Linux.

And this (legacy MMIO bar) has been supported by DPDK as well for a while.

>
> >>> - work for bare metal for some setups (without mediation)
> >>> Cons:
> >>> - only work for some specific archs without SVQ
> >>> - BAR0 is required
> >>>
> >> Neither is a limitation, as they mainly come from the legacy side
> >> of things.
> >>
> >>> 3) modern device mediation for legacy
> >>>
> >>> Pros:
> >>> - no changes in the spec
> >>> Cons:
> >>> - require mediation layer in order to work in bare metal
> >>> - require datapath mediation like SVQ to work for virtualization
> >>>
> >> A spec change is still required for net and blk because a modern device does
> >> not understand legacy, even with a mediation layer.
> >
> > That's fine and easy since we work on top of modern devices.
> >
> >> FEATURE_1, RW cap via CVQ which is not really owned by the hypervisor.
> >
> > Hypervisors can trap if they wish.
> >
> Trapping non-legacy accesses for 1.x doesn't make sense.

Actually no, I think I've mentioned the reasons several times:

It's a must for ABI and migration compatibility:

1) The offset is not necessarily at a page boundary
2) The length is not necessarily a multiple of PAGE_SIZE
3) PAGE_SIZE varies among different archs
4) Two vendors may have two different layouts for the structure
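
As a rough sketch of the check this forces on a hypervisor before it
can map such a structure directly (names here are illustrative, not
from the spec or any implementation):

#include <stdbool.h>
#include <stdint.h>

/* A structure can be passed through without trapping only if it
 * occupies whole host pages by itself. */
static bool can_map_directly(uint64_t offset, uint64_t length,
                             uint64_t host_page_size)
{
    return length != 0 &&
           (offset % host_page_size) == 0 &&  /* point 1 */
           (length % host_page_size) == 0;    /* point 2 */
}

/* A 4KB structure passes with 4KB pages but fails on an arch using
 * 8KB or 64KB pages (point 3), and different vendor layouts (point 4)
 * make the outcome device-specific, so mediation stays necessary in
 * the general case. */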

Thanks



* [virtio-dev] Re: [virtio-comment] Re: [PATCH 09/11] transport-pci: Describe PCI MMR dev config registers
  2023-04-13 19:39                                                     ` [virtio-dev] " Parav Pandit
@ 2023-04-14  3:09                                                       ` Jason Wang
  2023-04-14  3:18                                                         ` [virtio-dev] " Parav Pandit
  0 siblings, 1 reply; 200+ messages in thread
From: Jason Wang @ 2023-04-14  3:09 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Michael S. Tsirkin, virtio-dev, cohuck, virtio-comment,
	Shahaf Shuler, Satananda Burla, Maxime Coquelin, Yan Vugenfirer

On Fri, Apr 14, 2023 at 3:39 AM Parav Pandit <parav@nvidia.com> wrote:
>
>
> > From: Michael S. Tsirkin <mst@redhat.com>
> > Sent: Thursday, April 13, 2023 1:20 PM
> >
> > On Thu, Apr 13, 2023 at 01:14:15PM +0800, Jason Wang wrote:
> > > > >>>> The proposed solution in this series enables it and avoid per
> > > > >>>> field sw
> > > > >>> interpretation and mediation in parsing values etc.
> >
> > ... except for reset, notifications, and maybe more down the road.
> >
> Your AQ proposal addresses reset too.
> Nothing extra for the notifications, as that comes for free from the device side.
> >
> > > > >>> I don't think it's possible. See the discussion about
> > > > >>> ORDER_PLATFORM and ACCESS_PLATFORM in previous threads.
> > > > >>>
> > > > >> I have read the previous thread.
> > > > >> Hypervisor will be limited to those platforms where ORDER_PLATFORM
> > is not needed.
> > > > >
> > > > > So you introduce a bunch of new facilities that only work on some
> > > > > specific archs. This breaks the architecture independence of
> > > > > virtio since 1.0.
> > > > The spec as defined for the PCI device does not work today for a transitional
> > > > device in virtualization. It only works in the limited PF case.
> > > > Hence this update.
> > >
> > > I fully understand the motivation. I just want to say
> > >
> > > 1) compared to MMIO at BAR0, this proposal doesn't provide many
> > > advantages
> > > 2) mediate on top of modern devices allows us to not worry about the
> > > device design which is hard for legacy
> >
> > I begin to think so, too. When I proposed this it looked like just a single
> > capability would be enough, without a lot of mess.  But it seems that addressing
> > this fully is getting more and more complex.
> > The one thing we can't do in software is a different header size for virtio net. For
> > starters, let's add a capability to address that?
>
> The hdr bit doesn't solve it because the hypervisor is not involved in any trapping of feature bits, cvq or other vqs.
> It is unified code for 1.x and transitional in the hypervisor.
>
> We have two options to satisfy the requirements.
> (partly taken/repeated from Jason's yday email).
>
> 1. AQ (solves reset) + notification, for building a non-transitional device that performs well and is both backward and forward compatible
> Pros:
> a. efficient device reset.
> b. efficient notifications from OS to device
> c. device vendor doesn't need to build transitional configuration space.
> d. works without any mediation in hv for 1.x and non 1.x for all non-legacy interfaces (vqs, config space, cvq, and future features).

Without mediation, how could you forward guest config access to admin
virtqueue? Or you mean:

1) the hypervisor mediates for legacy
2) otherwise the modern BARs are assigned to the guest

For 2) as we discussed, we can't have such an assumption as

1) spec doesn't enforce the size of a specific structure
2) being vendor-locked and thus a blocker for live migration, as it mandates
the layout for the guest; a mediation layer is a must in this case to
maintain cross-vendor compatibility

Hypervisor needs to start from a mediation method and do BAR
assignment only when possible.

> e. can work with non-Linux guest VMs too
>
> Cons:
> a. More AQ commands work in sw

Note that this needs to be done on top of the transport virtqueue. And
we need to carefully design the command sets since they could be
mutually exclusive.

Thanks


> b. Does not work for bare metal PFs
>
> 2. Allowing MMIO BAR0 on a transitional device as a SHOULD requirement with a larger BAR size.
> Pros:
> a. Can work with Linux bare-metal and Linux guest VMs as one of the wider use cases
> b. Works without mediation like 1.d.
> c. Also works without HV mediation.
>
> Cons:
> a. device reset implementation is very hard for the hw.
> b. requires a transitional device to be built.
> c. inefficient device handling for notifications, so notification performance may suffer.
>
> For Marvell and us #1 works well.
> I am evaluating #2 and will get back.
>



* [virtio-dev] RE: [virtio-comment] Re: [PATCH 09/11] transport-pci: Describe PCI MMR dev config registers
  2023-04-14  3:08                                                     ` Jason Wang
@ 2023-04-14  3:13                                                       ` Parav Pandit
  2023-04-14  3:18                                                         ` [virtio-dev] " Jason Wang
  0 siblings, 1 reply; 200+ messages in thread
From: Parav Pandit @ 2023-04-14  3:13 UTC (permalink / raw)
  To: Jason Wang
  Cc: Michael S. Tsirkin, virtio-dev, cohuck, virtio-comment,
	Shahaf Shuler, Satananda Burla, Maxime Coquelin, Yan Vugenfirer



> From: Jason Wang <jasowang@redhat.com>
> Sent: Thursday, April 13, 2023 11:09 PM

> 
> Actually no, I think I've mentioned the reasons several times:
> 
> It's a must for ABI and migration compatibility:
> 
> 1) The offset is not necessarily at a page boundary
> 2) The length is not necessarily a multiple of PAGE_SIZE
> 3) PAGE_SIZE varies among different archs
> 4) Two vendors may have two different layouts for the structure

Migration compatibility checks and device composition work will start for the regular PCI VF in virtio.
Two vendors will have some level of programmability as well to make this happen.
It is an orthogonal topic.


* [virtio-dev] Re: [virtio-comment] Re: [PATCH 09/11] transport-pci: Describe PCI MMR dev config registers
  2023-04-14  3:13                                                       ` [virtio-dev] " Parav Pandit
@ 2023-04-14  3:18                                                         ` Jason Wang
  2023-04-14  3:22                                                           ` [virtio-dev] " Parav Pandit
  0 siblings, 1 reply; 200+ messages in thread
From: Jason Wang @ 2023-04-14  3:18 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Michael S. Tsirkin, virtio-dev, cohuck, virtio-comment,
	Shahaf Shuler, Satananda Burla, Maxime Coquelin, Yan Vugenfirer

On Fri, Apr 14, 2023 at 11:13 AM Parav Pandit <parav@nvidia.com> wrote:
>
>
>
> > From: Jason Wang <jasowang@redhat.com>
> > Sent: Thursday, April 13, 2023 11:09 PM
>
> >
> > Actually no, I think I've mentioned the reasons several times:
> >
> > It's a must for ABI and migration compatibility:
> >
> > 1) The offset is not necessarily at a page boundary
> > 2) The length is not necessarily a multiple of PAGE_SIZE
> > 3) PAGE_SIZE varies among different archs
> > 4) Two vendors may have two different layouts for the structure
>
> Migration compatibility checks and device composition work will start for the regular PCI VF in virtio.
> Two vendors will have some level of programmability as well to make this happen.
> It is an orthogonal topic.

Actually no, having new blockers for live migration will complicate
the solution in the future.

And 1) 2) 3) can happen even if we don't care about migration compatibility, no?

Thanks



* [virtio-dev] RE: [virtio-comment] Re: [PATCH 09/11] transport-pci: Describe PCI MMR dev config registers
  2023-04-14  3:09                                                       ` [virtio-dev] " Jason Wang
@ 2023-04-14  3:18                                                         ` Parav Pandit
  2023-04-14  3:37                                                           ` [virtio-dev] " Jason Wang
  0 siblings, 1 reply; 200+ messages in thread
From: Parav Pandit @ 2023-04-14  3:18 UTC (permalink / raw)
  To: Jason Wang
  Cc: Michael S. Tsirkin, virtio-dev, cohuck, virtio-comment,
	Shahaf Shuler, Satananda Burla, Maxime Coquelin, Yan Vugenfirer



> From: Jason Wang <jasowang@redhat.com>
> Sent: Thursday, April 13, 2023 11:10 PM

> > We have two options to satisfy the requirements.
> > (partly taken/repeated from Jason's yday email).
> >
> > 1. AQ (solves reset) + notification for building non transitional
> > device that support perform well and it is both backward and forward
> > compat
> > Pros:
> > a. efficient device reset.
> > b. efficient notifications from OS to device c. device vendor doesn't
> > need to build transitional configuration space.
> > d. works without any mediation in hv for 1.x and non 1.x for all non-legacy
> interfaces (vqs, config space, cvq, and future features).
> 
> Without mediation, how could you forward guest config access to admin
> virtqueue? Or you mean:
> 
> 1) the hypervisor mediates for legacy
> 2) otherwise the modern BARs are assigned to the guest
>
Right.
 
> For 2) as we discussed, we can't have such an assumption as
> 
> 1) spec doesn't enforce the size of a specific structure
Spec will be extended in coming time.

> 2) being vendor-locked and thus a blocker for live migration, as it mandates the layout
> for the guest; a mediation layer is a must in this case to maintain cross-vendor
> compatibility
>
The PCI VF will have compatibility checks and also vendor recommendations.
 
> Hypervisor needs to start from a mediation method and do BAR assignment
> only when possible.
>
Not necessarily.

> > Cons:
> > a. More AQ commands work in sw
> 
> Note that this needs to be done on top of the transport virtqueue. And we
> need to carefully design the command sets since they could be mutually
> exclusive.
>
Not sure what more to expect out of a transport virtqueue compared to the AQ.
I didn't follow: which part could be mutually exclusive?


* [virtio-dev] RE: [virtio-comment] Re: [PATCH 09/11] transport-pci: Describe PCI MMR dev config registers
  2023-04-14  3:18                                                         ` [virtio-dev] " Jason Wang
@ 2023-04-14  3:22                                                           ` Parav Pandit
  2023-04-14  3:29                                                             ` [virtio-dev] " Jason Wang
  0 siblings, 1 reply; 200+ messages in thread
From: Parav Pandit @ 2023-04-14  3:22 UTC (permalink / raw)
  To: Jason Wang
  Cc: Michael S. Tsirkin, virtio-dev, cohuck, virtio-comment,
	Shahaf Shuler, Satananda Burla, Maxime Coquelin, Yan Vugenfirer



> From: Jason Wang <jasowang@redhat.com>
> Sent: Thursday, April 13, 2023 11:18 PM

> > >
> > > It's a must for ABI and migration compatibility:
> > >
> > > 1) The offset is not necessarily at a page boundary
> > > 2) The length is not necessarily a multiple of PAGE_SIZE
> > > 3) PAGE_SIZE varies among different archs
> > > 4) Two vendors may have two different layouts for the structure
> >
> > Migration compatibility checks and device composition work will start for
> the regular PCI VF in virtio.
> > Two vendors will have some level of programmability as well to make this
> happen.
> > It is an orthogonal topic.
> 
> Actually not, having new blockers for live migration will complicate the solution
> in the future.
> 
It is not a blocker; it's just another flexible field that hw vendors will align on a common way to configure.

Just to make sure: this is the BAR0 size, right?

> And 1) 2) 3) can happen even if we don't care about migration compatibility,
> no?

I guess once we have the spec definition, it won't be a problem.


* [virtio-dev] Re: [virtio-comment] Re: [PATCH 09/11] transport-pci: Describe PCI MMR dev config registers
  2023-04-14  3:22                                                           ` [virtio-dev] " Parav Pandit
@ 2023-04-14  3:29                                                             ` Jason Wang
  0 siblings, 0 replies; 200+ messages in thread
From: Jason Wang @ 2023-04-14  3:29 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Michael S. Tsirkin, virtio-dev, cohuck, virtio-comment,
	Shahaf Shuler, Satananda Burla, Maxime Coquelin, Yan Vugenfirer

On Fri, Apr 14, 2023 at 11:22 AM Parav Pandit <parav@nvidia.com> wrote:
>
>
>
> > From: Jason Wang <jasowang@redhat.com>
> > Sent: Thursday, April 13, 2023 11:18 PM
>
> > > >
> > > > It's a must for ABI and migration compatibility:
> > > >
> > > > 1) The offset is not necessarily at a page boundary
> > > > 2) The length is not necessarily a multiple of PAGE_SIZE
> > > > 3) PAGE_SIZE varies among different archs
> > > > 4) Two vendors may have two different layouts for the structure
> > >
> > > Migration compatibility checks and device composition work will start for
> > the regular PCI VF in virtio.
> > > Two vendors will have some level of programmability as well to make this
> > happen.
> > > It is an orthogonal topic.
> >
> > Actually not, having new blockers for live migration will complicate the solution
> > in the future.
> >
> It is not a blocker; it's just another flexible field that hw vendors will align on a common way to configure.
>
> Just to make sure this is the BAR0 size, right?

We're discussing "if mediating a 1.x device is a must", as I replied to
what you said:

"
Trapping non-legacy accesses for 1.x doesn't make sense.
"

>
> > And 1) 2) 3) can happen even if we don't care about migration compatibility,
> > no?
>
> I guess once we have the spec definition, it won't be a problem.

How can you assign a cap of length 4K to a guest assuming 8K as the page size?

Thanks



* [virtio-dev] Re: [virtio-comment] Re: [PATCH 09/11] transport-pci: Describe PCI MMR dev config registers
  2023-04-14  3:18                                                         ` [virtio-dev] " Parav Pandit
@ 2023-04-14  3:37                                                           ` Jason Wang
  2023-04-14  3:51                                                             ` [virtio-dev] " Parav Pandit
  0 siblings, 1 reply; 200+ messages in thread
From: Jason Wang @ 2023-04-14  3:37 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Michael S. Tsirkin, virtio-dev, cohuck, virtio-comment,
	Shahaf Shuler, Satananda Burla, Maxime Coquelin, Yan Vugenfirer

On Fri, Apr 14, 2023 at 11:18 AM Parav Pandit <parav@nvidia.com> wrote:
>
>
>
> > From: Jason Wang <jasowang@redhat.com>
> > Sent: Thursday, April 13, 2023 11:10 PM
>
> > > We have two options to satisfy the requirements.
> > > (partly taken/repeated from Jason's yday email).
> > >
> > > 1. AQ (solves reset) + notification for building non transitional
> > > device that support perform well and it is both backward and forward
> > > compat
> > > Pros:
> > > a. efficient device reset.
> > > b. efficient notifications from OS to device c. device vendor doesn't
> > > need to build transitional configuration space.
> > > d. works without any mediation in hv for 1.x and non 1.x for all non-legacy
> > interfaces (vqs, config space, cvq, and future features).
> >
> > Without mediation, how could you forward guest config access to admin
> > virtqueue? Or you mean:
> >
> > 1) the hypervisor mediates for legacy
> > 2) otherwise the modern BARs are assigned to the guest
> >
> Right.
>
> > For 2) as we discussed, we can't have such an assumption as
> >
> > 1) spec doesn't enforce the size of a specific structure
> Spec will be extended in coming time.

It's too late to do any new restriction without introducing a flag.
Mandating a size may easily end up with an architecture-specific solution.

>
> > 2) being vendor-locked and thus a blocker for live migration, as it mandates the layout
> > for the guest; a mediation layer is a must in this case to maintain cross-vendor
> > compatibility
> >
> The PCI VF will have compatibility checks and also vendor recommendations.

I'm not sure what type of recommendations you want.

We've already had:

1) features
2) config

And what you proposed is to allow the management to know the exact
hardware layout in order to check the compatibility? And the
management needs to evolve as new structures are added. This
further complicates the management's work, which I'm not sure
can work.

>
> > Hypervisor needs to start from a mediation method and do BAR assignment
> > only when possible.
> >
> Not necessarily.
>
> > > Cons:
> > > a. More AQ commands work in sw
> >
> > Note that this needs to be done on top of the transport virtqueue. And we
> > need to carefully design the command sets since they could be mutually
> > exclusive.
> >
> Not sure what more to expect out of transport virtqueue compared to AQ.
> I didn’t follow, which part could be mutually exclusive?

Transport VQ allows a modern device to be transported via adminq.

And you want to add commands to transport for legacy devices.

Can a driver use both the modern transport commands as well as the
legacy transport commands?

Thanks



* [virtio-dev] RE: [virtio-comment] Re: [PATCH 09/11] transport-pci: Describe PCI MMR dev config registers
  2023-04-14  3:37                                                           ` [virtio-dev] " Jason Wang
@ 2023-04-14  3:51                                                             ` Parav Pandit
  2023-04-14  7:05                                                               ` [virtio-dev] " Michael S. Tsirkin
  2023-04-17  3:22                                                               ` Jason Wang
  0 siblings, 2 replies; 200+ messages in thread
From: Parav Pandit @ 2023-04-14  3:51 UTC (permalink / raw)
  To: Jason Wang
  Cc: Michael S. Tsirkin, virtio-dev, cohuck, virtio-comment,
	Shahaf Shuler, Satananda Burla, Maxime Coquelin, Yan Vugenfirer



> From: Jason Wang <jasowang@redhat.com>
> Sent: Thursday, April 13, 2023 11:38 PM

> > > 1) spec doesn't enforce the size of a specific structure
> > Spec will be extended in coming time.
> 
> It's too late to do any new restriction without introducing a flag.
We are really diverging from the topic.
I don’t think it is late. The work in this area of PCI VF has not even begun fully.

> Mandating a size may easily end up with an architecture-specific solution.
> 
Unlikely. Other standard device types are also expanding this way.

> >
> > > 2) being vendor-locked and thus a blocker for live migration, as it
> > > mandates the layout for the guest; a mediation layer is a must in this
> > > case to maintain cross-vendor compatibility
> > >
> > > The PCI VF will have compatibility checks and also vendor recommendations.
> 
> I'm not sure what type of recommendations you want.
> 
> We've already had:
> 
> 1) features
> 2) config
>
Those cover a large part.
Apart from that, some of the PCI device layout compat checks will be covered too.
 
> And what you proposed is to allow the management to know the exact
> hardware layout in order to check the compatibility? And the management
> needs to evolve as new structures are added. 
Mostly not; the mgmt stack may not need to evolve a lot,
because most layouts should grow within the device context and not in the PCI capabilities etc. area.

And even if it does, it's fine, as a large part of it is standard PCI spec definitions.

> This further complicates the
> management's work, which I'm not sure can work.
>
Well, once we work towards it, it can work. :)

> >
> > > Hypervisor needs to start from a mediation method and do BAR
> > > assignment only when possible.
> > >
> > Not necessarily.
> >
> > > > Cons:
> > > > a. More AQ commands work in sw
> > >
> > > Note that this needs to be done on top of the transport virtqueue.
> > > And we need to carefully design the command sets since they could be
> > > mutually exclusive.
> > >
> > Not sure what more to expect out of transport virtqueue compared to AQ.
> > I didn’t follow, which part could be mutually exclusive?
> 
> Transport VQ allows a modern device to be transported via adminq.
> 
Maybe for devices it can work. Hypervisor mediation, with CC on the horizon for new capabilities, is being reduced.
So we see that transport vq may not be the path forward.

> And you want to add commands to transport for legacy devices.
>
Yes, only for legacy emulation, which does not care about hypervisor mediation.
 
> Can a driver use both the modern transport commands as well as the legacy
> transport commands?
Hard to answer; I likely do not understand, as the driver namespace is unclear.
Let me try.

A modern driver in a guest VM accessing a transitional device has its own transport vq for, say, config space RW.
This transport queue can directly reach the device without hypervisor mediation - yes.
If the same device is accessed via some legacy guest VM driver, it gets its config space accessed via the transport VQ (AVQ) of the hypervisor PF AQ.


* [virtio-dev] Re: [virtio-comment] Re: [PATCH 09/11] transport-pci: Describe PCI MMR dev config registers
  2023-04-14  2:43                                                           ` [virtio-dev] " Parav Pandit
@ 2023-04-14  6:57                                                             ` Michael S. Tsirkin
  2023-04-16 13:41                                                               ` [virtio-dev] " Parav Pandit
  0 siblings, 1 reply; 200+ messages in thread
From: Michael S. Tsirkin @ 2023-04-14  6:57 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Jason Wang, virtio-dev, cohuck, virtio-comment, Shahaf Shuler,
	Satananda Burla, Maxime Coquelin, Yan Vugenfirer

On Fri, Apr 14, 2023 at 02:43:21AM +0000, Parav Pandit wrote:
> 
> > From: Jason Wang <jasowang@redhat.com>
> > Sent: Thursday, April 13, 2023 10:37 PM
> 
> > I'm not sure I get this. With VIRTIO_NET_F_LEGACY_HEADER, we don't need
> > mediation for datapath. For the control path, mediation is a must for legacy and
> > it's very easy to keep it working for modern, what's wrong with that?
> 
> There is virtio PCI VF.
> The user who attaches this VF to the VM doesn't know whether its guest will run a legacy driver or 1.x.
> Hence the hypervisor doesn't want to run a special stack when the guest may (likely) be 1.x and the device also supports 1.x.
> 
> Therefore, MMIO BAR 0 emulation is the better solution in this use case, without any mediation.
> The current specification claims that a transitional device "works"; hence it is fine to extend BAR0 to be of MMIO type instead of IO type.
> 
> In some other cases, control path or other additional mediation can be done.

Do you refer to the trick Jason proposed where BAR0 is memory but
otherwise matches legacy BAR0 exactly? Is this your preferred
solution at this point then?

-- 
MST



* [virtio-dev] Re: [virtio-comment] Re: [PATCH 09/11] transport-pci: Describe PCI MMR dev config registers
  2023-04-14  2:36                                                         ` [virtio-dev] " Jason Wang
  2023-04-14  2:43                                                           ` [virtio-dev] " Parav Pandit
@ 2023-04-14  6:58                                                           ` Michael S. Tsirkin
  1 sibling, 0 replies; 200+ messages in thread
From: Michael S. Tsirkin @ 2023-04-14  6:58 UTC (permalink / raw)
  To: Jason Wang
  Cc: Parav Pandit, virtio-dev, cohuck, virtio-comment, Shahaf Shuler,
	Satananda Burla, Maxime Coquelin, Yan Vugenfirer

On Fri, Apr 14, 2023 at 10:36:52AM +0800, Jason Wang wrote:
> On Fri, Apr 14, 2023 at 5:08 AM Parav Pandit <parav@nvidia.com> wrote:
> >
> >
> >
> > > From: Michael S. Tsirkin <mst@redhat.com>
> > > Sent: Thursday, April 13, 2023 5:02 PM
> >
> > > This thing ... you move some code to the card and reduce the amount of virtio
> > > knowledge in software but do not eliminate it completely.
> > Sure. Practically there is no knowledge other than transporting, like vxlan encapsulation; here it's the AQ.
> >
> > > Seems kind of pointless. Minimal hardware changes make more sense to me, I'd
> > > say. Talking about that, what is a minimal hardware change to allow a vdpa
> > > based solution?
> > > I think that's VIRTIO_NET_F_LEGACY_HEADER, right?
> 
> I think it is. It would be much easier if we do this.

It does not look like Parav is interested in this approach but
if you like it feel free to propose it.

> >
> > The main requirement/point is that there is a virtio PCI VF that is to be mapped to the guest VM.
> > Hence, no vdpa type of hypervisor layer exists in this use case.
> 
> It's about mediation which is a must for things like legacy. If
> there's a way that helps the vendor to get rid of the tricky legacy
> completely, then why not?
> 
> >
> > This same VF needs to be transitional as the guest kernel may not be known;
> > hence a sometimes-vdpa, sometimes-regular-1.x VF is not an option.
> > Hence for the next few years, a transitional VF will be plugged into the guest VM when the user is using PCI VF devices.
> 
> I'm not sure I get this. With VIRTIO_NET_F_LEGACY_HEADER, we don't
> need mediation for datapath. For the control path, mediation is a must
> for legacy and it's very easy to keep it working for modern, what's wrong
> with that?
> 
> Thanks
> 

* [virtio-dev] Re: [virtio-comment] Re: [PATCH 09/11] transport-pci: Describe PCI MMR dev config registers
  2023-04-14  3:51                                                             ` [virtio-dev] " Parav Pandit
@ 2023-04-14  7:05                                                               ` Michael S. Tsirkin
  2023-04-17  3:22                                                               ` Jason Wang
  1 sibling, 0 replies; 200+ messages in thread
From: Michael S. Tsirkin @ 2023-04-14  7:05 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Jason Wang, virtio-dev, cohuck, virtio-comment, Shahaf Shuler,
	Satananda Burla, Maxime Coquelin, Yan Vugenfirer

On Fri, Apr 14, 2023 at 03:51:01AM +0000, Parav Pandit wrote:
> 
> 
> > From: Jason Wang <jasowang@redhat.com>
> > Sent: Thursday, April 13, 2023 11:38 PM
> 
> > > > 1) spec doesn't enforce the size of a specific structure
> > > Spec will be extended in coming time.
> > 
> > It's too late to do any new restriction without introducing a flag.
> We are really diverging from the topic.
> I don’t think it is late. The work in this area of PCI VF has not even begun fully.

I think Jason means that any new feature needs some kind of approach for
compatibility: given a huge installed base, a flag day where all old
drivers stop working on new devices does not look acceptable to him.

-- 
MST



* [virtio-dev] RE: [virtio-comment] Re: [PATCH 09/11] transport-pci: Describe PCI MMR dev config registers
  2023-04-14  6:57                                                             ` [virtio-dev] " Michael S. Tsirkin
@ 2023-04-16 13:41                                                               ` Parav Pandit
  2023-04-16 20:44                                                                 ` [virtio-dev] " Michael S. Tsirkin
  0 siblings, 1 reply; 200+ messages in thread
From: Parav Pandit @ 2023-04-16 13:41 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Jason Wang, virtio-dev, cohuck, virtio-comment, Shahaf Shuler,
	Satananda Burla, Maxime Coquelin, Yan Vugenfirer

> From: virtio-comment@lists.oasis-open.org <virtio-comment@lists.oasis-
> open.org> On Behalf Of Michael S. Tsirkin
> Sent: Friday, April 14, 2023 2:57 AM
 
> Do you refer to the trick Jason proposed where BAR0 is memory but otherwise
> matches legacy BAR0 exactly? Is this your preferred solution at this point then?

We looked at it again.
The above solution can work reliably only for a very small number of PFs, and even then with very special hardware circuitry due to the reset flow.

Therefore, for virtualization the below interface is preferred.
a. For the transitional device, legacy configuration register transport over AQ;
notification to utilize the transitional device notification area of the BAR (a rough sketch follows below).

b. Non-legacy interfaces of transitional and non-transitional PCI devices to access the PCI device directly, without mediation.
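
As an illustration of (a), a hypothetical AQ command for tunneling
legacy register accesses could look like the following; all opcodes
and field names are invented here, not taken from the spec or this
proposal:

#include <stdint.h>

#define ADMIN_CMD_LEGACY_REG_READ  0x100 /* hypothetical opcodes */
#define ADMIN_CMD_LEGACY_REG_WRITE 0x101

/* Issued on the parent PF's AQ on behalf of one of its VFs. */
struct admin_legacy_reg_cmd {
    uint16_t opcode;    /* read or write, above */
    uint16_t vf_number; /* which VF's legacy registers are accessed */
    uint16_t offset;    /* byte offset into the legacy register block */
    uint16_t length;    /* access size: 1, 2 or 4 bytes */
    uint8_t  data[4];   /* write payload, or read result on completion */
};

Notifications stay on the VF's own notification area of the memory
BAR, so only the slow configuration path goes through the queue.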


* [virtio-dev] Re: [virtio-comment] Re: [PATCH 09/11] transport-pci: Describe PCI MMR dev config registers
  2023-04-16 13:41                                                               ` [virtio-dev] " Parav Pandit
@ 2023-04-16 20:44                                                                 ` Michael S. Tsirkin
  2023-04-17 16:59                                                                   ` [virtio-dev] " Parav Pandit
  0 siblings, 1 reply; 200+ messages in thread
From: Michael S. Tsirkin @ 2023-04-16 20:44 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Jason Wang, virtio-dev, cohuck, virtio-comment, Shahaf Shuler,
	Satananda Burla, Maxime Coquelin, Yan Vugenfirer

On Sun, Apr 16, 2023 at 01:41:55PM +0000, Parav Pandit wrote:
> > From: virtio-comment@lists.oasis-open.org <virtio-comment@lists.oasis-
> > open.org> On Behalf Of Michael S. Tsirkin
> > Sent: Friday, April 14, 2023 2:57 AM
>  
> > Do you refer to the trick Jason proposed where BAR0 is memory but otherwise
> > matches legacy BAR0 exactly? Is this your preferred solution at this point then?
> 
> We looked at it again.
> The above solution can work reliably only for a very small number of PFs, and even then with very special hardware circuitry due to the reset flow.
> 
> Therefore, for virtualization below interface is preferred.
> a. For transitional device legacy configuration register transport over AQ,

I don't get what this has to do with transitional ...

> Notification to utilize transitional device notification area of the BAR.

The vq transport does something like this, no?

> b. Non legacy interface of transitional and non-transitional PCI device to access direct PCI device without mediation.

So VF can either be accessed through AQ of PF, or through direct
mapping?

-- 
MST



* [virtio-dev] Re: [virtio-comment] Re: [PATCH 09/11] transport-pci: Describe PCI MMR dev config registers
  2023-04-14  3:51                                                             ` [virtio-dev] " Parav Pandit
  2023-04-14  7:05                                                               ` [virtio-dev] " Michael S. Tsirkin
@ 2023-04-17  3:22                                                               ` Jason Wang
  2023-04-17 17:23                                                                 ` [virtio-dev] " Parav Pandit
  1 sibling, 1 reply; 200+ messages in thread
From: Jason Wang @ 2023-04-17  3:22 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Michael S. Tsirkin, virtio-dev, cohuck, virtio-comment,
	Shahaf Shuler, Satananda Burla, Maxime Coquelin, Yan Vugenfirer

On Fri, Apr 14, 2023 at 11:51 AM Parav Pandit <parav@nvidia.com> wrote:
>
>
>
> > From: Jason Wang <jasowang@redhat.com>
> > Sent: Thursday, April 13, 2023 11:38 PM
>
> > > > 1) spec doesn't enforce the size of a specific structure
> > > Spec will be extended in coming time.
> >
> > It's too late to do any new restriction without introducing a flag.
> We are really diverging from the topic.
> I don’t think it is late. The work in this area of PCI VF has not even begun fully.

I meant it needs a new feature flag.

>
> > Mandating a size may easily end up with an architecture-specific solution.
> >
> Unlikely. Other standard device types are also expanding this way.

I think we are talking about software technologies instead of device
design here.

For devices it works. But the hypervisor needs to deal with a size
that doesn't match the arch's page size.

>
> > >
> > > > 2) being vendor-locked and thus a blocker for live migration, as it
> > > > mandates the layout for the guest; a mediation layer is a must in this
> > > > case to maintain cross-vendor compatibility
> > > >
> > > The PCI VF will have compatibility checks and also vendor recommendations.
> >
> > I'm not sure what type of recommendations you want.
> >
> > We've already had:
> >
> > 1) features
> > 2) config
> >
> Those cover the large part.

I meant you can't have recommendations in features and config. What's
more, assuming you have two generations of device

gen1: features x,y
gen2: features x,y,z

You won't be able to do migration between gen1 and gen2 without
mediation. Such technologies have been used by cpu features for years.
I am not sure why it became a problem for you.

> Apart from it some of the PCI device layout compat checks will be covered too.
>
> > And what you proposed is to allow the management to know the exact
> > hardware layout in order to check the compatibility? And the management
> > needs to evolve as new structures are added.
> Mostly not mgmt stack may not need to evolve a lot.
> Because most layouts should be growing within the device context and not at the PCI capabilities etc area.
>
> And even if it does, it's fine, as a large part of it is standard PCI spec definitions.

So as mentioned in another thread, this is a PCI specific solution:

1) feature and config are basic virtio facility
2) capability is not but specific to PCI transport

Checking PCI capability layout in the virtio management is a layer
violation which can't work for future transport like SIOV or adminq.
Management should only see the virtio device; otherwise the solution
becomes transport-specific.

>
> > This further complicates the
> > management's work, which I'm not sure can work.
> >
> Well, once we work towards it, it can work. :)
>
> > >
> > > > Hypervisor needs to start from a mediation method and do BAR
> > > > assignment only when possible.
> > > >
> > > Not necessarily.
> > >
> > > > > Cons:
> > > > > a. More AQ commands work in sw
> > > >
> > > > Note that this needs to be done on top of the transport virtqueue.
> > > > And we need to carefully design the command sets since they could be
> > > > mutually exclusive.
> > > >
> > > Not sure what more to expect out of transport virtqueue compared to AQ.
> > > I didn’t follow, which part could be mutually exclusive?
> >
> > Transport VQ allows a modern device to be transported via adminq.
> >
> May be for devices it can work. Hypervisor mediation with CC on horizon for new capabilities is being reduced.
> So we see that transport vq may not be the path forward.
>
> > And you want to add commands to transport for legacy devices.
> >
> Yes only legacy emulation who do not care about hypervisor mediation.
>
> > Can a driver use both the modern transport commands as well as the legacy
> > transport commands?
> Hard to answer, I likely do not understand as driver namespace is unclear.

Things might be simplified if we use separate queues for admin,
transport and legacy.

> Let me try.
>
> A modern driver in guest VM accessing a transitional device has its own transport vq for say config space RW.
> This transport queue can directly reach to the device without hypervisor mediation, - yes.
> Same device if accessed via some legacy guest VM driver, it will gets its config space accessed via transport VQ (AVQ) of the hypervisor PF AQ.

Not only the config space; we can revisit those issues when we
agree to use the adminq (or others like PASID, which is even more flexible).

Thanks



* [virtio-dev] RE: [virtio-comment] Re: [PATCH 09/11] transport-pci: Describe PCI MMR dev config registers
  2023-04-16 20:44                                                                 ` [virtio-dev] " Michael S. Tsirkin
@ 2023-04-17 16:59                                                                   ` Parav Pandit
  2023-04-18  1:09                                                                     ` [virtio-dev] " Jason Wang
  0 siblings, 1 reply; 200+ messages in thread
From: Parav Pandit @ 2023-04-17 16:59 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Jason Wang, virtio-dev, cohuck, virtio-comment, Shahaf Shuler,
	Satananda Burla, Maxime Coquelin, Yan Vugenfirer


> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Sunday, April 16, 2023 4:44 PM
> 
> On Sun, Apr 16, 2023 at 01:41:55PM +0000, Parav Pandit wrote:
> > > From: virtio-comment@lists.oasis-open.org
> > > <virtio-comment@lists.oasis- open.org> On Behalf Of Michael S.
> > > Tsirkin
> > > Sent: Friday, April 14, 2023 2:57 AM
> >
> > > Do you refer to the trick Jason proposed where BAR0 is memory but
> > > otherwise matches legacy BAR0 exactly? Is this your preferred solution at
> this point then?
> >
> > We look at it again.
> > Above solution can work reliably only for a very small number of PF and that
> too with very special hardware circuitry due to the reset flow.
> >
> > Therefore, for virtualization below interface is preferred.
> > a. For transitional device legacy configuration register transport
> > over AQ,
> 
> I don't get what this has to do with transitional ...
> 
Typically, in the current wording, transitional is the device that supports the legacy interface.
So, it doesn't have to be only for the transitional.

I just wanted to highlight that a PCI VF device with its parent PCI PF device can transport the legacy interface commands.

> > Notification to utilize transitional device notification area of the BAR.
> 
> The vq transport does something like this, no?
> 
Notifications over a queuing interface are unlikely to be performant, because one is a configuration task and the other is a data path task.

> > b. Non legacy interface of transitional and non-transitional PCI device to
> access direct PCI device without mediation.
> 
> So VF can either be accessed through AQ of PF, or through direct mapping?
Right. The VF accesses legacy registers using the AQ of the PF, and continues to access non-legacy registers using direct mapping as done today.


* [virtio-dev] RE: [virtio-comment] Re: [PATCH 09/11] transport-pci: Describe PCI MMR dev config registers
  2023-04-17  3:22                                                               ` Jason Wang
@ 2023-04-17 17:23                                                                 ` Parav Pandit
  2023-04-17 20:26                                                                   ` [virtio-dev] " Michael S. Tsirkin
  2023-04-18  1:01                                                                   ` [virtio-dev] " Jason Wang
  0 siblings, 2 replies; 200+ messages in thread
From: Parav Pandit @ 2023-04-17 17:23 UTC (permalink / raw)
  To: Jason Wang
  Cc: Michael S. Tsirkin, virtio-dev, cohuck, virtio-comment,
	Shahaf Shuler, Satananda Burla, Maxime Coquelin, Yan Vugenfirer



> From: Jason Wang <jasowang@redhat.com>
> Sent: Sunday, April 16, 2023 11:23 PM
> 
> On Fri, Apr 14, 2023 at 11:51 AM Parav Pandit <parav@nvidia.com> wrote:
> >
> >
> >
> > > From: Jason Wang <jasowang@redhat.com>
> > > Sent: Thursday, April 13, 2023 11:38 PM
> >
> > > > > 1) spec doesn't enforce the size of a specific structure
> > > > Spec will be extended in coming time.
> > >
> > > It's too late to do any new restriction without introducing a flag.
> > We are really diverging from the topic.
> > I don’t think it is late. The work in this area of PCI VF has not even begun fully.
> 
> I meant it needs a new feature flag.
> 
Ok.
> >
> > > Mandating a size may easily end up with an architecture-specific solution.
> > >
> > Unlikely. Other standard device types are also expanding this way.
> 
> I think we are talking about software technologies instead of device design
> here.
> 
Isn't the size of the BAR and its cap_len exposed by the device?

> For devices it works.But for hypervisor, it needs to deal with the size that
> doesn't match arch's page size.
> 
The PCI BAR size of the VF can reflect the system page size being different for x86 (4K) and arm (64K).
The PCI transport seems to support it.

A PCI PF on bare metal has to understand the highest page size anyway if for some reason the bare-metal host wants to map this PF to a VM.

A hypervisor mediating and emulating needs to learn the system page size anyway.
If the underlying device page size is smaller, the hypervisor may end up mediating it.

> I meant you can't have recommendations in features and config. 
Sure. There is none.

> What's more,
> assuming you have two generations of device
> 
> gen1: features x,y
> gen2: features x,y,z
> 
> You won't be able to do migration between gen1 and gen2 without mediation.
Gen1 can easily migrate to gen2, because gen1 has a smaller feature subset than gen2.
When the gen2 device is composed, feature z is disabled.

Gen2 to gen1 migration can be done through software-based migration anyway, or through mediation.
But arguing that, because gen2 may need to migrate to gen1, gen2 to gen2 migration should also be done through mediation doesn't make sense to me.
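
A minimal sketch of the compatibility rule being described, assuming
feature bits are the only thing checked (a simplification):

#include <stdbool.h>
#include <stdint.h>

/* Migration works when everything the source negotiated is also
 * supported by the destination; composing a gen2 device with feature
 * z masked makes it look exactly like gen1. */
static bool can_migrate(uint64_t src_features, uint64_t dst_supported)
{
    return (src_features & ~dst_supported) == 0;
}

/* gen1 {x,y} -> gen2 {x,y,z}: allowed.
 * gen2 with z negotiated -> gen1 {x,y}: refused, so either mask z at
 * composition time or fall back to software/mediated migration. */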

> Such technologies have been used by cpu features for years.
> I am not sure why it became a problem for you.
> 
> > Apart from it some of the PCI device layout compat checks will be covered
> too.
> >
> > > And what you proposed is to allow the management to know the exact
> > > hardware layout in order to check the compatibility? And the
> > > management needs to evolve as new structures are added.
> > Mostly not mgmt stack may not need to evolve a lot.
> > Because most layouts should be growing within the device context and not at
> the PCI capabilities etc area.
> >
> > > And even if it does, it's fine, as a large part of it is standard PCI spec definitions.
> 
> So as mentioned in another thread, this is a PCI specific solution:
> 
> 1) feature and config are basic virtio facility
> 2) capability is not but specific to PCI transport
> 
So any LM solution will have transport-specific checks and virtio-level checks.

> Checking PCI capability layout in the virtio management is a layer violation
> which can't work for future transport like SIOV or adminq.
Virtio management that has transport-level checks is not a violation.
SIOV will define its own transport anyway, not to be mixed with ccw/mmio or pci.

> Management should only see virtio device otherwise the solution becomes
> transport specific.
> 
The solution needs to cover the transport as well, as the transport is an integral part of the virtio spec.
Each transport layer will implement feature/config/cap in its own way.

> >
> > > This complicates the work of
> > > management's furtherly which I'm not sure it can work.
> > >
> > Well, once we work towards it, it can work. :)
> >
> > > >
> > > > > Hypervisor needs to start from a mediation method and do BAR
> > > > > assignment only when possible.
> > > > >
> > > > Not necessarily.
> > > >
> > > > > > Cons:
> > > > > > a. More AQ commands work in sw
> > > > >
> > > > > Note that this needs to be done on top of the transport virtqueue.
> > > > > And we need to carefully design the command sets since they
> > > > > could be mutually exclusive.
> > > > >
> > > > Not sure what more to expect out of transport virtqueue compared to AQ.
> > > > I didn’t follow, which part could be mutually exclusive?
> > >
> > > Transport VQ allows a modern device to be transported via adminq.
> > >
> > May be for devices it can work. Hypervisor mediation with CC on horizon for
> new capabilities is being reduced.
> > So we see that transport vq may not be the path forward.
> >
> > > And you want to add commands to transport for legacy devices.
> > >
> > Yes only legacy emulation who do not care about hypervisor mediation.
> >
> > > Can a driver use both the modern transport commands as well as the
> > > legacy transport commands?
> > Hard to answer, I likely do not understand as driver namespace is unclear.
> 
> Things might be simplified if we use separate queues for admin, transport and
> legacy.
> 
Do you mean, say, we have three AQs: AQ_1, AQ_2, and AQ_3;
AQ_1 of the PF used for admin work such as SIOV device creation and SR-IOV MSI-X configuration,
AQ_2 of the PF used for transporting legacy config accesses of the PCI VF,
AQ_3 of the PF for some transport work.

If yes, sounds fine to me.


* [virtio-dev] Re: [virtio-comment] Re: [PATCH 09/11] transport-pci: Describe PCI MMR dev config registers
  2023-04-17 17:23                                                                 ` [virtio-dev] " Parav Pandit
@ 2023-04-17 20:26                                                                   ` Michael S. Tsirkin
  2023-04-17 20:28                                                                     ` [virtio-dev] " Parav Pandit
  2023-04-18  1:01                                                                   ` [virtio-dev] " Jason Wang
  1 sibling, 1 reply; 200+ messages in thread
From: Michael S. Tsirkin @ 2023-04-17 20:26 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Jason Wang, virtio-dev, cohuck, virtio-comment, Shahaf Shuler,
	Satananda Burla, Maxime Coquelin, Yan Vugenfirer

On Mon, Apr 17, 2023 at 05:23:30PM +0000, Parav Pandit wrote:
> > Things might be simplified if we use separate queues for admin, transport and
> > legacy.
> > 
> Do you mean say we have three AQs, AQ_1, AQ_2, and AQ_3;
> AQ_1 of the PF used by admin work as SIOV device create, SRIOV MSIX configuration.
> AQ_2 of the PF used for transporting legacy config access of the PCI VF
> AQ_3 of the PF for some transport work.
> 
> If yes, sounds fine to me.

Latest proposal simply leaves the split between AQs up to the driver.
Seems the most flexible.

-- 
MST



* [virtio-dev] RE: [virtio-comment] Re: [PATCH 09/11] transport-pci: Describe PCI MMR dev config registers
  2023-04-17 20:26                                                                   ` [virtio-dev] " Michael S. Tsirkin
@ 2023-04-17 20:28                                                                     ` Parav Pandit
  2023-04-18  0:36                                                                       ` [virtio-dev] " Jason Wang
  0 siblings, 1 reply; 200+ messages in thread
From: Parav Pandit @ 2023-04-17 20:28 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Jason Wang, virtio-dev, cohuck, virtio-comment, Shahaf Shuler,
	Satananda Burla, Maxime Coquelin, Yan Vugenfirer


> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Monday, April 17, 2023 4:27 PM
> On Mon, Apr 17, 2023 at 05:23:30PM +0000, Parav Pandit wrote:
> > > Things might be simplified if we use separate queues for admin,
> > > transport and legacy.
> > >
> > Do you mean say we have three AQs, AQ_1, AQ_2, and AQ_3;
> > AQ_1 of the PF used by admin work as SIOV device create, SRIOV MSIX
> configuration.
> > AQ_2 of the PF used for transporting legacy config access of the PCI
> > VF
> > AQ_3 of the PF for some transport work.
> >
> > If yes, sounds fine to me.
> 
> Latest proposal simply leaves the split between AQs up to the driver.
> Seems the most flexible.
Yes, it is. Different opcode ranges and multiple AQs enable doing so.
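
For illustration, the split could be as simple as routing by opcode
range; the ranges and names below are invented, not from the spec:

#include <stdint.h>

enum {
    ADMIN_OPC_MGMT      = 0x0000, /* SR-IOV/SIOV device management */
    ADMIN_OPC_LEGACY    = 0x0100, /* legacy config access tunneling */
    ADMIN_OPC_TRANSPORT = 0x0200, /* transport commands */
};

/* The driver is free to push everything through one AQ or to dedicate
 * one AQ per opcode range, as in the AQ_1/AQ_2/AQ_3 example above. */
static unsigned int aq_index_for(uint16_t opcode)
{
    if (opcode >= ADMIN_OPC_TRANSPORT)
        return 2;
    if (opcode >= ADMIN_OPC_LEGACY)
        return 1;
    return 0;
}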


* [virtio-dev] Re: [virtio-comment] Re: [PATCH 09/11] transport-pci: Describe PCI MMR dev config registers
  2023-04-17 20:28                                                                     ` [virtio-dev] " Parav Pandit
@ 2023-04-18  0:36                                                                       ` Jason Wang
  2023-04-18  1:30                                                                         ` [virtio-dev] " Parav Pandit
  0 siblings, 1 reply; 200+ messages in thread
From: Jason Wang @ 2023-04-18  0:36 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Michael S. Tsirkin, virtio-dev, cohuck, virtio-comment,
	Shahaf Shuler, Satananda Burla, Maxime Coquelin, Yan Vugenfirer

On Tue, Apr 18, 2023 at 4:29 AM Parav Pandit <parav@nvidia.com> wrote:
>
>
> > From: Michael S. Tsirkin <mst@redhat.com>
> > Sent: Monday, April 17, 2023 4:27 PM
> > On Mon, Apr 17, 2023 at 05:23:30PM +0000, Parav Pandit wrote:
> > > > Things might be simplified if we use separate queues for admin,
> > > > transport and legacy.
> > > >
> > > Do you mean say we have three AQs, AQ_1, AQ_2, and AQ_3;
> > > AQ_1 of the PF used by admin work as SIOV device create, SRIOV MSIX
> > configuration.
> > > AQ_2 of the PF used for transporting legacy config access of the PCI
> > > VF
> > > AQ_3 of the PF for some transport work.
> > >
> > > If yes, sounds fine to me.
> >
> > Latest proposal simply leaves the split between AQs up to the driver.
> > Seems the most flexible.
> Yes, it is. Different opcode ranges and multiple AQs enable doing so.

Right, so it would be some facility that makes the transport commands
of modern and legacy mutually exclusive.

Thanks


* [virtio-dev] Re: [virtio-comment] Re: [PATCH 09/11] transport-pci: Describe PCI MMR dev config registers
  2023-04-17 17:23                                                                 ` [virtio-dev] " Parav Pandit
  2023-04-17 20:26                                                                   ` [virtio-dev] " Michael S. Tsirkin
@ 2023-04-18  1:01                                                                   ` Jason Wang
  2023-04-18  1:48                                                                     ` [virtio-dev] " Parav Pandit
  1 sibling, 1 reply; 200+ messages in thread
From: Jason Wang @ 2023-04-18  1:01 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Michael S. Tsirkin, virtio-dev, cohuck, virtio-comment,
	Shahaf Shuler, Satananda Burla, Maxime Coquelin, Yan Vugenfirer

On Tue, Apr 18, 2023 at 1:23 AM Parav Pandit <parav@nvidia.com> wrote:
>
>
>
> > From: Jason Wang <jasowang@redhat.com>
> > Sent: Sunday, April 16, 2023 11:23 PM
> >
> > On Fri, Apr 14, 2023 at 11:51 AM Parav Pandit <parav@nvidia.com> wrote:
> > >
> > >
> > >
> > > > From: Jason Wang <jasowang@redhat.com>
> > > > Sent: Thursday, April 13, 2023 11:38 PM
> > >
> > > > > > 1) spec doesn't enforce the size of a specific structure
> > > > > Spec will be extended in coming time.
> > > >
> > > > It's too late to do any new restriction without introducing a flag.
> > > We are really diverging from the topic.
> > > I don’t think it is late. The work in this area of PCI VF has not even begun fully.
> >
> > I meant it needs a new feature flag.
> >
> Ok.
> > >
> > > > Mandating a size may easily end up with an architecture-specific solution.
> > > >
> > > Unlikely. Other standard device types are also expanding this way.
> >
> > I think we are talking about software technologies instead of device design
> > here.
> >
> Isn't the size of BAR and its cap_len exposed by the device?

Somehow, it's more about how the hypervisor is going to use this,
memory-mapped or trapped. In either case, the hypervisor needs to
have virtio knowledge in order to accomplish this.

>
> > For devices it works.But for hypervisor, it needs to deal with the size that
> > doesn't match arch's page size.
> >
> PCI BAR size of the VF can learn the system page size being different for x86 (4K) and arm (64K).
> PCI transport seems to support it.

Yes, this works for SR-IOV but not for other cases. We could invent new
facilities for sure, but the hypervisor cannot rely on this assumption.

>
> PCI PF on bare-metal has to understand the highest page size anyway if for some reason bare-metal host wants to map this PF to the VM.
>
> A hypervisor mediating and emulating needs to learn the system page size anyway.
> If the underlying device page size is smaller, the hypervisor may end up mediating it.

Exactly. So what I want to say is, for whatever case, a hypervisor
needs to have virtio knowledge in order to achieve these.

>
> > I meant you can't have recommendations in features and config.
> Sure. There is none.
>
> > What's more,
> > assuming you have two generations of device
> >
> > gen1: features x,y
> > gen2: features x,y,z
> >
> > You won't be able to do migration between gen1 and gen2 without mediation.
> Gen1 can easily migrate to gen2, because gen1 has a smaller feature subset than gen2.
> When the gen2 device is composed, feature z is disabled.

Sure, but this requires a lot of features that do not exist in the
spec. E.g. it assumes the device could be composed on demand, which
seems to fit the idea of the transport virtqueue. So it adds dependencies
for migration where a simple mediation could be used to solve this
without bothering the spec.
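
To make the gen1/gen2 example above concrete, a minimal sketch of the
feature masking being discussed (names are illustrative only):

#include <stdint.h>

/* Offer only the feature intersection, so a device composed on a gen2
 * host (features x, y, z) stays migratable to a gen1 host (x, y):
 * feature z is simply never offered to the guest. */
static uint64_t migratable_features(uint64_t gen1_feats, uint64_t gen2_feats)
{
        return gen1_feats & gen2_feats;
}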

>
> Gen2-to-gen1 migration can be done through software-based migration anyway, or through mediation.
> But arguing that because gen2 may need to migrate to gen1, gen2-to-gen2 migration should also be done through mediation does not make sense to me.

It really depends on the design:

1) if you want to expose any feature that is provided by the admin virtqueue
to a guest, mediation is a must (e.g. if you want to do live migration for
L1)
2) mediation is a must for the idea of the transport virtqueue

>
> > Such technologies have been used by cpu features for years.
> > I am not sure why it became a problem for you.
> >
> > > Apart from that, some of the PCI device layout compat checks will be covered
> > too.
> > >
> > > > And what you proposed is to allow the management to know the exact
> > > > hardware layout in order to check the compatibility? And the
> > > > management needs to evolve as new structures are added.
> > > Mostly not; the mgmt stack may not need to evolve a lot.
> > > Because most layouts should grow within the device context and not in
> > the PCI capabilities area, etc.
> > >
> > > And even if it does, it's fine, as a large part of it is standard PCI spec definitions.
> >
> > So as mentioned in another thread, this is a PCI specific solution:
> >
> > 1) feature and config are basic virtio facility
> > 2) capability is not but specific to PCI transport
> >
> So any LM solution will have transport specific checks and virtio level checks.

So here's the model that is used by Qemu currently:

1) The device is emulated; it's the charge of libvirt to launch Qemu
and present a stable ABI to guests.
2) The datapath doesn't need to care about the hardware details since the
hardware layout is invisible to the guest.

You can see it's more than sufficient for libvirt to check
features/config space; it doesn't need to care about the hardware BAR
layout. Migration is much easier this way. And we can use a transport
other than PCI in the guest in this case for live migration.

>
> > Checking the PCI capability layout in the virtio management is a layer violation
> > which can't work for future transports like SIOV or adminq.
> Virtio management having transport-level checks is not a violation.
> SIOV will define its own transport anyway, not to be mixed with ccw/mmio or pci.
>
> > Management should only see virtio device otherwise the solution becomes
> > transport specific.
> >
> The solution needs to cover the transport as well, as the transport is an integral part of the virtio spec.
> Each transport layer will implement feature/config/cap in its own way.

If we can avoid having those hardware details checked, we should not go
for that. It greatly eases the management layer.

Thanks

>
> > >
> > > > This further complicates the work of
> > > > management, which I'm not sure can work.
> > > >
> > > Well, once we work towards it, it can work. :)
> > >
> > > > >
> > > > > > Hypervisor needs to start from a mediation method and do BAR
> > > > > > assignment only when possible.
> > > > > >
> > > > > Not necessarily.
> > > > >
> > > > > > > Cons:
> > > > > > > a. More AQ commands work in sw
> > > > > >
> > > > > > Note that this needs to be done on top of the transport virtqueue.
> > > > > > And we need to carefully design the command sets since they
> > > > > > could be mutually exclusive.
> > > > > >
> > > > > Not sure what more to expect out of transport virtqueue compared to AQ.
> > > > > I didn’t follow, which part could be mutually exclusive?
> > > >
> > > > Transport VQ allows a modern device to be transported via adminq.
> > > >
> > > Maybe for devices it can work. With CC on the horizon, hypervisor mediation for
> > new capabilities is being reduced.
> > > So we don't see transport vq as the path forward.
> > >
> > > > And you want to add commands to transport for legacy devices.
> > > >
> > > Yes, only for legacy emulation, which does not care about hypervisor mediation.
> > >
> > > > Can a driver use both the modern transport commands as well as the
> > > > legacy transport commands?
> > > Hard to answer; I likely do not understand, as the driver namespace is unclear.
> >
> > Things might be simplified if we use separate queues for admin, transport and
> > legacy.
> >
> Do you mean, say, we have three AQs, AQ_1, AQ_2, and AQ_3:
> AQ_1 of the PF used for admin work such as SIOV device creation and SR-IOV MSI-X configuration,
> AQ_2 of the PF used for transporting legacy config access of the PCI VF,
> AQ_3 of the PF for some transport work.
>
> If yes, sounds fine to me.
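
As a rough illustration of such a split by opcode range (the 0x8000
range values appear later in this thread; everything else here is
invented for the example):

/* Distinct opcode ranges let the driver steer each command class to
 * whichever AQ it chooses; the split between queues is not fixed by
 * the device. */
enum admin_cmd_opcode {
        ADMIN_CMD_SIOV_DEV_CREATE   = 0x0000, /* illustrative only */
        ADMIN_CMD_SRIOV_MSIX_CONFIG = 0x0001, /* illustrative only */
        ADMIN_CMD_LEGACY_REG_ACCESS = 0x8000, /* proposed later in thread */
        ADMIN_CMD_LEGACY_NOTIFY_QRY = 0x8001, /* proposed later in thread */
};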



* [virtio-dev] Re: [virtio-comment] Re: [PATCH 09/11] transport-pci: Describe PCI MMR dev config registers
  2023-04-17 16:59                                                                   ` [virtio-dev] " Parav Pandit
@ 2023-04-18  1:09                                                                     ` Jason Wang
  2023-04-18  1:37                                                                       ` [virtio-dev] " Parav Pandit
  0 siblings, 1 reply; 200+ messages in thread
From: Jason Wang @ 2023-04-18  1:09 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Michael S. Tsirkin, virtio-dev, cohuck, virtio-comment,
	Shahaf Shuler, Satananda Burla, Maxime Coquelin, Yan Vugenfirer

On Tue, Apr 18, 2023 at 12:59 AM Parav Pandit <parav@nvidia.com> wrote:
>
>
> > From: Michael S. Tsirkin <mst@redhat.com>
> > Sent: Sunday, April 16, 2023 4:44 PM
> >
> > On Sun, Apr 16, 2023 at 01:41:55PM +0000, Parav Pandit wrote:
> > > > From: virtio-comment@lists.oasis-open.org
> > > > <virtio-comment@lists.oasis- open.org> On Behalf Of Michael S.
> > > > Tsirkin
> > > > Sent: Friday, April 14, 2023 2:57 AM
> > >
> > > > Do you refer to the trick Jason proposed where BAR0 is memory but
> > > > otherwise matches legacy BAR0 exactly? Is this your preferred solution at
> > this point then?
> > >
> > > We looked at it again.
> > > The above solution can work reliably only for a very small number of PFs, and that
> > too with very special hardware circuitry, due to the reset flow.
> > >
> > > Therefore, for virtualization the interface below is preferred:
> > > a. For a transitional device, legacy configuration register transport
> > > over AQ,
> >
> > I don't get what this has to do with transitional ...
> >
> Typically, in the current wording, a transitional device is one that supports the legacy interface.
> So, it doesn't have to be transitional.
>
> I just wanted to highlight that a PCI VF device with its parent PCI PF device can transport the legacy interface commands.
>
> > > Notifications are to utilize the transitional device notification area of the BAR.
> >
> > The vq transport does something like this, no?
> >
> Notifications over a queuing interface are unlikely to be performant, because one is a configuration task and the other is a data-path task.

Note that the current transport virtqueue only allows notification via
MMIO. It introduces a command to get the address of the notification
area.

Thanks

>
> > > b. The non-legacy interface of transitional and non-transitional PCI devices to
> > access the PCI device directly without mediation.
> >
> > So the VF can either be accessed through the AQ of the PF, or through direct mapping?
> Right. The VF's legacy registers are accessed using the AQ of the PF, while non-legacy registers continue to use direct mapping as done today.
>



* [virtio-dev] RE: [virtio-comment] Re: [PATCH 09/11] transport-pci: Describe PCI MMR dev config registers
  2023-04-18  0:36                                                                       ` [virtio-dev] " Jason Wang
@ 2023-04-18  1:30                                                                         ` Parav Pandit
  2023-04-18 11:58                                                                           ` [virtio-dev] " Michael S. Tsirkin
  0 siblings, 1 reply; 200+ messages in thread
From: Parav Pandit @ 2023-04-18  1:30 UTC (permalink / raw)
  To: Jason Wang
  Cc: Michael S. Tsirkin, virtio-dev, cohuck, virtio-comment,
	Shahaf Shuler, Satananda Burla, Maxime Coquelin, Yan Vugenfirer



> From: Jason Wang <jasowang@redhat.com>
> Sent: Monday, April 17, 2023 8:37 PM

> > > > Do you mean say we have three AQs, AQ_1, AQ_2, and AQ_3;
> > > > AQ_1 of the PF used by admin work as SIOV device create, SRIOV
> > > > MSIX
> > > configuration.
> > > > AQ_2 of the PF used for transporting legacy config access of the
> > > > PCI VF
> > > > AQ_3 of the PF for some transport work.
> > > >
> > > > If yes, sounds fine to me.
> > >
> > > Latest proposal simply leaves the split between AQs up to the driver.
> > > Seems the most flexible.
> > Yes. It is. Different opcode ranges and multiple AQs enable doing so.
> 
> Right, so it would be some facility that makes the modern and legacy transport
> commands mutually exclusive.

Ok. I didn't follow the mutual exclusion part.
If a device has exposed the legacy interface, it will have to transport legacy accesses via its PF.
The same device can be transitional, and its 1.x interface doesn't need to go through this transport channel of the PF, right?


* [virtio-dev] RE: [virtio-comment] Re: [PATCH 09/11] transport-pci: Describe PCI MMR dev config registers
  2023-04-18  1:09                                                                     ` [virtio-dev] " Jason Wang
@ 2023-04-18  1:37                                                                       ` Parav Pandit
  0 siblings, 0 replies; 200+ messages in thread
From: Parav Pandit @ 2023-04-18  1:37 UTC (permalink / raw)
  To: Jason Wang
  Cc: Michael S. Tsirkin, virtio-dev, cohuck, virtio-comment,
	Shahaf Shuler, Satananda Burla, Maxime Coquelin, Yan Vugenfirer



> From: Jason Wang <jasowang@redhat.com>
> Sent: Monday, April 17, 2023 9:09 PM

> Note that the current transport virtqueue only allows notification via MMIO. It
> introduces a command to get the address of the notification area.
>
Notifications via MMIO are the obvious choice.
A command is also fine to convey that.
I haven't seen the transport VQ proposal.
Do you have a pointer to it?



* [virtio-dev] RE: [virtio-comment] Re: [PATCH 09/11] transport-pci: Describe PCI MMR dev config registers
  2023-04-18  1:01                                                                   ` [virtio-dev] " Jason Wang
@ 2023-04-18  1:48                                                                     ` Parav Pandit
  0 siblings, 0 replies; 200+ messages in thread
From: Parav Pandit @ 2023-04-18  1:48 UTC (permalink / raw)
  To: Jason Wang
  Cc: Michael S. Tsirkin, virtio-dev, cohuck, virtio-comment,
	Shahaf Shuler, Satananda Burla, Maxime Coquelin, Yan Vugenfirer



> From: Jason Wang <jasowang@redhat.com>
> Sent: Monday, April 17, 2023 9:02 PM
> 
> > Isn't the size of BAR and its cap_len exposed by the device?
> 
> Somehow, it's more about how the hypervisor is going to use this, memory
> mapped or trapping. For either case, the hypervisor needs to have virtio
> knowledge in order to finish this.
>
Ok.

> > The PCI BAR size of the VF can account for the system page size being different for x86
> (4K) and arm (64K).
> > The PCI transport seems to support it.
> 
> Yes, this works for SR-IOV but not for other cases. We could invent new facilities for
> sure, but the hypervisor cannot make this assumption.
>
Yeah, it's not an assumption.
 

> > > assuming you have two generations of device
> > >
> > > gen1: features x,y
> > > gen2: features x,y,z
> > >
> > > You won't be able to do migration between gen1 and gen2 without
> mediation.
> > Gen1 can easily migrate to gen2, because gen1 has a smaller feature subset than gen2.
> > When the gen2 device is composed, feature z is disabled.
> 
> Sure, but this requires a lot of features that do not exist in the spec. E.g. it
> assumes the device could be composed on demand, which seems to fit the idea
> of the transport virtqueue.
I don't see how the transport vq is related.
A device could be composed as a PCI VF, a PCI SIOV device, or something else.
The underlying transport will tell how it is composed.
Maybe the underlying transport is a transport VQ, but that is not the only transport.

> So it adds dependencies for migration where a simple
> mediation could be used to solve this without bothering the spec.
>
Mediation by the PF and hypervisor is not encouraged anymore as we move towards CC.
So maybe some systems will do it, but as we have PCI VFs, there is a clear need for non-mediated 1.x devices for such guest VMs.
For legacy kernels, mediation is acceptable, as there is no CC infrastructure in place on older systems.
 
> >
> > Gen2-to-gen1 migration can be done through software-based migration anyway, or through
> mediation.
> > But arguing that because gen2 may need to migrate to gen1, gen2-to-gen2 migration
> also should be done through mediation does not make sense to me.
> 
> It really depends on the design:
> 
> 1) if you want to expose any feature that is provided by the admin virtqueue to a
> guest, mediation is a must (e.g. if you want to do live migration for
> L1)
> 2) mediation is a must for the idea of the transport virtqueue
>
Yes. So both transport options are there.
A PCI VF that doesn't carry legacy baggage will be just fine without mediation.
If for some reason one wants to have mediation, maybe there is the option of such a new transport.
But such a transport cannot be the only transport.
 
> > > So as mentioned in another thread, this is a PCI specific solution:
> > >
> > > 1) feature and config are basic virtio facility
> > > 2) capability is not but specific to PCI transport
> > >
> > So any LM solution will have transport specific checks and virtio level checks.
> 
> So here's the model that is used by Qemu currently:
> 
> 1) The device is emulated; it's the charge of libvirt to launch Qemu and present
> a stable ABI to guests.
> 2) The datapath doesn't need to care about the hardware details since the
> hardware layout is invisible to the guest.
> 
> You can see it's more than sufficient for libvirt to check features/config space;
> it doesn't need to care about the hardware BAR layout. Migration is much
> easier this way. And we can use a transport other than PCI in the guest in this
> case for live migration.
>
Sure, that works in some use cases.
But it is not the only way to operate, as I explained above; there is a requirement to not have mediation for the non-legacy interface.

> > The solution needs to cover the transport as well, as the transport is an integral part of the
> virtio spec.
> > Each transport layer will implement feature/config/cap in its own way.
> 
> If we can avoid having those hardware details checked, we should not go for
> that. It greatly eases the management layer.
Those are mainly RO checks, and cheap too. They are largely not involved in the LM or data-path flow either.


* [virtio-dev] Re: [virtio-comment] Re: [PATCH 09/11] transport-pci: Describe PCI MMR dev config registers
  2023-04-18  1:30                                                                         ` [virtio-dev] " Parav Pandit
@ 2023-04-18 11:58                                                                           ` Michael S. Tsirkin
  2023-04-18 12:09                                                                             ` [virtio-dev] " Parav Pandit
  0 siblings, 1 reply; 200+ messages in thread
From: Michael S. Tsirkin @ 2023-04-18 11:58 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Jason Wang, virtio-dev, cohuck, virtio-comment, Shahaf Shuler,
	Satananda Burla, Maxime Coquelin, Yan Vugenfirer

On Tue, Apr 18, 2023 at 01:30:43AM +0000, Parav Pandit wrote:
> 
> 
> > From: Jason Wang <jasowang@redhat.com>
> > Sent: Monday, April 17, 2023 8:37 PM
> 
> > > > > Do you mean say we have three AQs, AQ_1, AQ_2, and AQ_3;
> > > > > AQ_1 of the PF used by admin work as SIOV device create, SRIOV
> > > > > MSIX
> > > > configuration.
> > > > > AQ_2 of the PF used for transporting legacy config access of the
> > > > > PCI VF
> > > > > AQ_3 of the PF for some transport work.
> > > > >
> > > > > If yes, sounds fine to me.
> > > >
> > > > Latest proposal simply leaves the split between AQs up to the driver.
> > > > Seems the most flexible.
> > > Yes. It is. Different opcode ranges and multiple AQs enable doing so.
> > 
> > Right, so it would be some facility that makes the modern and legacy transport
> > commands mutually exclusive.
> 
> Ok. I didn't follow the mutual exclusion part.
> If a device has exposed the legacy interface, it will have to transport legacy accesses via its PF.
> The same device can be transitional, and its 1.x interface doesn't need to go through this transport channel of the PF, right?

I think in any case, using a device through two transports at the same
time shouldn't be legal.


-- 
MST



* [virtio-dev] RE: [virtio-comment] Re: [PATCH 09/11] transport-pci: Describe PCI MMR dev config registers
  2023-04-18 11:58                                                                           ` [virtio-dev] " Michael S. Tsirkin
@ 2023-04-18 12:09                                                                             ` Parav Pandit
  2023-04-18 12:30                                                                               ` [virtio-dev] " Michael S. Tsirkin
  0 siblings, 1 reply; 200+ messages in thread
From: Parav Pandit @ 2023-04-18 12:09 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Jason Wang, virtio-dev, cohuck, virtio-comment, Shahaf Shuler,
	Satananda Burla, Maxime Coquelin, Yan Vugenfirer


> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Tuesday, April 18, 2023 7:58 AM

> > I think in any case, using a device through two transports at the same time
> > shouldn't be legal.

PCI is the transport.


* [virtio-dev] Re: [virtio-comment] Re: [PATCH 09/11] transport-pci: Describe PCI MMR dev config registers
  2023-04-18 12:09                                                                             ` [virtio-dev] " Parav Pandit
@ 2023-04-18 12:30                                                                               ` Michael S. Tsirkin
  2023-04-18 12:36                                                                                 ` [virtio-dev] " Parav Pandit
  0 siblings, 1 reply; 200+ messages in thread
From: Michael S. Tsirkin @ 2023-04-18 12:30 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Jason Wang, virtio-dev, cohuck, virtio-comment, Shahaf Shuler,
	Satananda Burla, Maxime Coquelin, Yan Vugenfirer

On Tue, Apr 18, 2023 at 12:09:17PM +0000, Parav Pandit wrote:
> 
> > From: Michael S. Tsirkin <mst@redhat.com>
> > Sent: Tuesday, April 18, 2023 7:58 AM
> 
> > I think in any case, using a device through two transports at the same time
> > shouldn't be legal.
> 
> PCI is the transport.

... and AQ can be a transport too :)

-- 
MST



* [virtio-dev] RE: [virtio-comment] Re: [PATCH 09/11] transport-pci: Describe PCI MMR dev config registers
  2023-04-18 12:30                                                                               ` [virtio-dev] " Michael S. Tsirkin
@ 2023-04-18 12:36                                                                                 ` Parav Pandit
  0 siblings, 0 replies; 200+ messages in thread
From: Parav Pandit @ 2023-04-18 12:36 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Jason Wang, virtio-dev, cohuck, virtio-comment, Shahaf Shuler,
	Satananda Burla, Maxime Coquelin, Yan Vugenfirer



> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Tuesday, April 18, 2023 8:31 AM
> >
> > > I think in any case, using a device through two transports at the same
> > > time shouldn't be legal.
> >
> > PCI is the transport.
> 
> ... and AQ can be a transport too :)

AQ is on top of the PCI transport. :)



* [virtio-dev] Re: [PATCH 00/11] Introduce transitional mmr pci device
  2023-03-30 22:58 [virtio-dev] [PATCH 00/11] Introduce transitional mmr pci device Parav Pandit
                   ` (14 preceding siblings ...)
  2023-04-12  5:10 ` [virtio-dev] " Halil Pasic
@ 2023-04-25  2:42 ` Parav Pandit
  2023-05-02  7:17   ` [virtio-dev] Re: [virtio-comment] " David Edmondson
  15 siblings, 1 reply; 200+ messages in thread
From: Parav Pandit @ 2023-04-25  2:42 UTC (permalink / raw)
  To: mst, virtio-dev, cohuck; +Cc: virtio-comment, shahafs, Satananda Burla



On 3/30/2023 6:58 PM, Parav Pandit wrote:
> [full cover letter quoted; trimmed]
After our last several discussions and feedback from Michael and Jason,
to support the above use case requirements, I would like to update v1 with
the proposal below.

1. Use the existing non-transitional device to extend legacy register
access.

2. The AQ of the parent PF is the optimal choice to access VF legacy
registers (as opposed to MMRs of the VF).
This is because:
a. it avoids a complex reset flow at scale for the VFs.

b. it enables using the existing driver notification, which is already
present in the notification section of 1.x and transitional devices.

3. New AQ command opcode for legacy register access read/write.
Input fields:
a. opcode 0x8000
b. group and VF member identifiers
c. register offset
d. register size (1 to 64B)
e. register content (on write)

Output fields:
a. cmd status
b. register content (on read)

4. New AQ command to return the queue notify address for legacy access.
Inputs:
a. opcode 0x8001
b. group and VF member identifier, or can this just be constant for all VFs?

Outputs:
a. BAR index
b. byte offset within the BAR
(A rough sketch of both command layouts follows this list.)

5. PCI Extended capabilities for all the existing capabilities located
in the legacy section.
Why?
a. This is for a new driver (such as vfio) to always rely on the new
capabilities.
b. The legacy PCI capability region is close to its full capacity.
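
As referenced above, a rough, non-normative sketch of how commands 3
and 4 could be laid out; all struct and field names are placeholders,
and only the opcodes and the listed fields come from the proposal:

struct aq_legacy_reg_access {     /* opcode 0x8000 */
        le16 opcode;
        le16 group_id;            /* group identifier */
        le64 member_id;           /* VF member identifier */
        le16 offset;              /* register offset */
        le16 size;                /* register size, 1 to 64 bytes */
        u8   data[];              /* register content (on write) */
};
/* reply: cmd status; register content (on read) */

struct aq_legacy_notify_query {   /* opcode 0x8001 */
        le16 opcode;
        le16 group_id;            /* group identifier */
        le64 member_id;           /* VF member, or constant for all VFs? */
};
/* reply: BAR index; byte offset within the BAR */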

A few open questions:
1. Should the queue notification query command be per VF, or should there
be one for all group members (VFs)?

Any further comments to address in v1?


* [virtio-dev] Re: [virtio-comment] Re: [PATCH 00/11] Introduce transitional mmr pci device
  2023-04-25  2:42 ` [virtio-dev] " Parav Pandit
@ 2023-05-02  7:17   ` David Edmondson
  2023-05-02 13:54     ` [virtio-dev] " Parav Pandit
  0 siblings, 1 reply; 200+ messages in thread
From: David Edmondson @ 2023-05-02  7:17 UTC (permalink / raw)
  To: Parav Pandit, mst, virtio-dev, cohuck
  Cc: virtio-comment, shahafs, Satananda Burla

In line with our recent discussion about agreeing requirements for
specification changes, I wanted to say that we have a significant
existing estate of VMs using the legacy interface where the customer is
disinclined to update their software to a newer version (which might
consume the 1.x interface).

Given that, some mechanism for supporting a (mostly) hardware offloaded
legacy interface would definitely be useful to us, and the proposal here
seems like a sensible approach.

We are aware of vDPA, but that comes with its own challenges.

Parav Pandit <parav@nvidia.com> writes:

> [cover letter, proposal, and list footer quoted in full; trimmed]
-- 
And you're standing here beside me, I love the passing of time.


* [virtio-dev] RE: [virtio-comment] Re: [PATCH 00/11] Introduce transitional mmr pci device
  2023-05-02  7:17   ` [virtio-dev] Re: [virtio-comment] " David Edmondson
@ 2023-05-02 13:54     ` Parav Pandit
  0 siblings, 0 replies; 200+ messages in thread
From: Parav Pandit @ 2023-05-02 13:54 UTC (permalink / raw)
  To: David Edmondson, mst, virtio-dev, cohuck
  Cc: virtio-comment, Shahaf Shuler, Satananda Burla



> From: David Edmondson <david.edmondson@oracle.com>
> Sent: Tuesday, May 2, 2023 3:18 AM
> 
> In line with our recent discussion about agreeing requirements for specification
> changes, I wanted to say that we have a significant existing estate of VMs using
> the legacy interface where the customer is disinclined to update their software
> to a newer version (which might consume the 1.x interface).
> 
> Given that, some mechanism for supporting a (mostly) hardware offloaded
> legacy interface would definitely be useful to us, and the proposal here seems
> like a sensible approach.
>

Thanks, David, for the inputs.
I am working on v1 to have the legacy interface supported using the proposed AQ.

I realized that the PCI extended capabilities listed below are not useful for this work, because the existing capabilities in the legacy region need to be there anyway.
And the new AQ interface-based method doesn't use any new capabilities.
So, I will drop #5 below for now.

Regarding the open question below, the AQ commands for the SR-IOV group are for a specific group member.
Hence, I will keep the query at the VF level, unless we want to optimize this further.
Since it is a one-time query per VF, it does not hurt perf.

> > 5. PCI Extended capabilities for all the existing capabilities located
> > in the legacy section.
> > Why?
> > a. This is for the new driver (such as vfio) to always rely on the new
> > capabilities.
> > b. Legacy PCI regions is close to its full capacity.
> >
> > Few option questions:
> > 1. Should the q notification query command be per VF or should be one
> > for all group members (VF)?



* [virtio-dev] Re: [PATCH 08/11] transport-pci: Introduce virtio extended capability
  2023-03-30 22:58 ` [virtio-dev] [PATCH 08/11] transport-pci: Introduce virtio extended capability Parav Pandit
  2023-04-04  7:35   ` [virtio-dev] " Michael S. Tsirkin
  2023-04-10  1:36   ` [virtio-dev] " Jason Wang
@ 2023-05-19  6:10   ` Michael S. Tsirkin
  2023-05-19 21:02     ` [virtio-dev] " Parav Pandit
  2 siblings, 1 reply; 200+ messages in thread
From: Michael S. Tsirkin @ 2023-05-19  6:10 UTC (permalink / raw)
  To: Parav Pandit; +Cc: virtio-dev, cohuck, virtio-comment, shahafs, Satananda Burla

On Fri, Mar 31, 2023 at 01:58:31AM +0300, Parav Pandit wrote:
> The PCI device configuration space for capabilities is limited to only 192
> bytes, shared by many PCI capabilities, both generic PCI and virtio
> specific.
> 
> Hence, introduce a virtio extended capability that uses the PCI Express
> extended capability mechanism.
> A subsequent patch uses this virtio extended capability.
> 
> Co-developed-by: Satananda Burla <sburla@marvell.com>
> Signed-off-by: Parav Pandit <parav@nvidia.com>

So looking at this in isolation, it probably warrants an extra
github issue. So I would like to find out whether we
should introduce this capability and switch to it for new devices.

My main question is whether e.g. seabios will be able to use it.
Could you take a look at the source and let us all know?


> ---
>  transport-pci.tex | 69 ++++++++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 68 insertions(+), 1 deletion(-)
> 
> diff --git a/transport-pci.tex b/transport-pci.tex
> index 665448e..aeda4a1 100644
> --- a/transport-pci.tex
> +++ b/transport-pci.tex
> @@ -174,7 +174,8 @@ \subsection{Virtio Structure PCI Capabilities}\label{sec:Virtio Transport Option
>  the function, or accessed via the special VIRTIO_PCI_CAP_PCI_CFG field in the PCI configuration space.
>  
>  The location of each structure is specified using a vendor-specific PCI capability located
> -on the capability list in PCI configuration space of the device.
> +on the capability list in PCI configuration space of the device
> +unless stated otherwise.
>  This virtio structure capability uses little-endian format; all fields are
>  read-only for the driver unless stated otherwise:
>  
> @@ -301,6 +302,72 @@ \subsection{Virtio Structure PCI Capabilities}\label{sec:Virtio Transport Option
>  fields provide the most significant 32 bits of a total 64 bit offset and
>  length within the BAR specified by \field{cap.bar}.
>  
> +The virtio extended PCI Express capability structure defines
> +the location of certain virtio device configuration related
> +structures using a PCI Express extended capability. The
> +virtio extended PCI Express capability structure uses the
> +PCI Express vendor-specific extended capability (VSEC).
> +It has the layout below:
> +
> +\begin{lstlisting}
> +struct pcie_ext_cap {
> +        le16 cap_vendor_id; /* Generic PCI field: 0xB */
> +        le16 cap_version : 2; /* Generic PCI field: 0 */
> +        le16 next_cap_offset : 14; /* Generic PCI field: next cap or 0 */
> +};
> +
> +struct virtio_pcie_ext_cap {
> +        struct pcie_ext_cap pcie_ecap;
> +        u8 cfg_type; /* Identifies the structure. */
> +        u8 bar; /* Index of the BAR where it is located */
> +        u8 id; /* Multiple capabilities of the same type */
> +        u8 zero_padding[1];
> +        le64 offset; /* Offset within the BAR */
> +        le64 length; /* Length of the structure, in bytes. */
> +        u8 data[]; /* Optional variable length data */
> +};
> +\end{lstlisting}
> +
> +This structure contains optional data, depending on
> +\field{cfg_type}. The fields are interpreted as follows:
> +
> +\begin{description}
> +\item[\field{cap_vendor_id}]
> +         0x0B; identifies a vendor-specific extended capability.
> +
> +\item[\field{cap_version}]
> +         contains a value of 0.
> +
> +\item[\field{next_cap_offset}]
> +        Offset to the next capability.
> +
> +\item[\field{cfg_type}]
> +        follows the same definition as \field{cfg_type}
> +        from the \field{struct virtio_pci_cap}.
> +
> +\item[\field{bar}]
> +        follows the same definition as \field{bar}
> +        from the \field{struct virtio_pci_cap}.
> +
> +\item[\field{id}]
> +        follows the same definition as \field{id}
> +        from the \field{struct virtio_pci_cap}.
> +
> +\item[\field{offset}]
> +        indicates where the structure begins relative to the
> +        base address associated with the BAR. The alignment
> +        requirements of offset are indicated in each
> +        structure-specific section that uses
> +        \field{struct virtio_pcie_ext_cap}.
> +
> +\item[\field{length}]
> +        indicates the length of the structure indicated by this
> +        capability.
> +
> +\item[\field{data}]
> +        optional data of this capability.
> +\end{description}
> +
>  \drivernormative{\subsubsection}{Virtio Structure PCI Capabilities}{Virtio Transport Options / Virtio Over PCI Bus / Virtio Structure PCI Capabilities}
>  
>  The driver MUST ignore any vendor-specific capability structure which has
> -- 
> 2.26.2



* [virtio-dev] RE: [PATCH 08/11] transport-pci: Introduce virtio extended capability
  2023-05-19  6:10   ` [virtio-dev] " Michael S. Tsirkin
@ 2023-05-19 21:02     ` Parav Pandit
  2023-05-21  5:57       ` [virtio-dev] " Michael S. Tsirkin
  0 siblings, 1 reply; 200+ messages in thread
From: Parav Pandit @ 2023-05-19 21:02 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: virtio-dev, cohuck, virtio-comment, Shahaf Shuler, Satananda Burla

> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Friday, May 19, 2023 2:10 AM
> 
> So looking at this in isolation, it probably warrants an extra github issue. So I
> would like to find out whether we should introduce this capability and switch to
> it for new devices.
> 
> My main question is whether e.g. seabios will be able to use it.
> Could you take a look at the source and let us all know?
> 
The existing unmodified seabios cannot use it, because it does not look at the new offset and cap_id.
But that is the case with any existing software, not just seabios.

The new capability is in addition to the existing one.
And code already exists to refer to the old one, so I do not see any gain in creating the new ones now.
Any new capability that one creates in the future can be in the extended area.

You mentioned a perf angle that it takes half the time, but I don't see how, as the only change between legacy and ext is the offset.
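
As an aside, what "looking at the new offset" would involve: a minimal
sketch of walking the PCI Express extended capability list, which starts
at config offset 0x100 and uses capability id 0x0B for vendor-specific
capabilities. The cfg_read32 callback and all names are illustrative,
not from seabios:

#define PCIE_EXT_CAP_START   0x100
#define PCIE_EXT_CAP_ID_VSEC 0x0B

static int find_vsec_offset(uint32_t (*cfg_read32)(int offset))
{
        int offset = PCIE_EXT_CAP_START;

        /* standard PCIe extended capability header: bits 15:0 id,
         * 19:16 version, 31:20 offset of the next capability */
        while (offset) {
                uint32_t hdr = cfg_read32(offset);

                if ((hdr & 0xffff) == PCIE_EXT_CAP_ID_VSEC)
                        return offset;        /* candidate VSEC found */
                offset = (hdr >> 20) & 0xffc; /* 0 terminates the list */
        }
        return 0; /* no vendor-specific extended capability */
}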



* [virtio-dev] Re: [PATCH 08/11] transport-pci: Introduce virtio extended capability
  2023-05-19 21:02     ` [virtio-dev] " Parav Pandit
@ 2023-05-21  5:57       ` Michael S. Tsirkin
  2023-05-21 13:24         ` [virtio-dev] " Parav Pandit
  0 siblings, 1 reply; 200+ messages in thread
From: Michael S. Tsirkin @ 2023-05-21  5:57 UTC (permalink / raw)
  To: Parav Pandit
  Cc: virtio-dev, cohuck, virtio-comment, Shahaf Shuler, Satananda Burla

On Fri, May 19, 2023 at 09:02:58PM +0000, Parav Pandit wrote:
> > From: Michael S. Tsirkin <mst@redhat.com>
> > Sent: Friday, May 19, 2023 2:10 AM
> > 
> > So looking at this in isolation, it probably warrants an extra github issue. So I
> > would like to find out whether we should introduce this capability and switch to
> > it for new devices.
> > 
> > My main question is whether e.g. seabios will be able to use it.
> > Could you take a look at the source and let us all know?
> > 
> Existing unmodified seabios cannot use it because it is not looking at the new offset and cap_id.

Yes, but neither can it support new devices, because it does not look
for the new device id. So it's fine - I was talking about new devices
using the extended capability.

We shouldn't laser-focus only on existing software and devices;
there's probably more to come.


> But that is the case with any existing software, not just seabios.
> 
> The new capability is in addition to the existing one.
> And code already exists to refer to the old one, so I do not see any gain in creating the new ones now.
> Any new capability that one creates in the future can be in the extended area.

The question is about MCFG in general.  Is adding the capability to access MCFG
to seabios for PCI accesses practical? Hard?

> You mentioned a perf angle that it takes half the time, but I don't
> see how as the only change between legacy and ext is: offset.

For a variety of reasons Linux does not use memory mapped
accesses for legacy config space, it uses cf8/cfc for that.

-- 
MST



* [virtio-dev] RE: [PATCH 08/11] transport-pci: Introduce virtio extended capability
  2023-05-21  5:57       ` [virtio-dev] " Michael S. Tsirkin
@ 2023-05-21 13:24         ` Parav Pandit
  2023-05-21 14:34           ` [virtio-dev] " Michael S. Tsirkin
  0 siblings, 1 reply; 200+ messages in thread
From: Parav Pandit @ 2023-05-21 13:24 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: virtio-dev, cohuck, virtio-comment, Shahaf Shuler, Satananda Burla

> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Sunday, May 21, 2023 1:58 AM

> Yes, but neither can it support new devices, because it does not look for the new
> device id. So it's fine - I was talking about new devices using the extended
> capability.
> 
> We shouldn't laser-focus only on existing software and devices; there's probably
> more to come.
>
Sure, new devices can use the new capability.
I will roll this out as a separate patch and github issue.
As we discussed, legacy register access over the AQ does not need this capability anyway.

So it is not related to the v2 discussion.
I am in the middle of splitting this into a separate patch, unrelated to the v2 for legacy access.

> 
> > But that is the case with any existing software, not just seabios.
> >
> > New capability is in addition to the existing one.
> > And code already exists to refer to the old one, so I am not seeing any gain to
> create them now.
> > Any new capability that one creates in the future can be in the extended area.
> 
> > The question is about MCFG in general.  Is adding the capability to access MCFG to
> > seabios for PCI accesses practical? Hard?
>
What is MCFG?
 
> > You mentioned a perf angle that it takes half the time, but I don't
> > see how as the only change between legacy and ext is: offset.
> 
> For a variety of reasons Linux does not use memory mapped accesses for legacy
> config space, it uses cf8/cfc for that.
And does it use memory mapped accesses for non-legacy config space?



* [virtio-dev] Re: [PATCH 08/11] transport-pci: Introduce virtio extended capability
  2023-05-21 13:24         ` [virtio-dev] " Parav Pandit
@ 2023-05-21 14:34           ` Michael S. Tsirkin
  0 siblings, 0 replies; 200+ messages in thread
From: Michael S. Tsirkin @ 2023-05-21 14:34 UTC (permalink / raw)
  To: Parav Pandit
  Cc: virtio-dev, cohuck, virtio-comment, Shahaf Shuler, Satananda Burla

On Sun, May 21, 2023 at 01:24:55PM +0000, Parav Pandit wrote:
> > From: Michael S. Tsirkin <mst@redhat.com>
> > Sent: Sunday, May 21, 2023 1:58 AM
> 
> > Yes, but neither can it support new devices, because it does not look for the new
> > device id. So it's fine - I was talking about new devices using the extended
> > capability.
> > 
> > We shouldn't laser-focus only on existing software and devices; there's probably
> > more to come.
> >
> Sure, new devices can use the new capability.
> I will roll this out as a separate patch and github issue.
> As we discussed, legacy register access over the AQ does not need this capability anyway.
> 
> So it is not related to the v2 discussion.
> I am in the middle of splitting this into a separate patch, unrelated to the v2 for legacy access.
> 
> > 
> > > But that is the case with any existing software, not just seabios.
> > >
> > > The new capability is in addition to the existing one.
> > > And code already exists to refer to the old one, so I do not see any gain in
> > creating the new ones now.
> > > Any new capability that one creates in the future can be in the extended area.
> > 
> > The question is about MCFG in general.  Is adding the capability to access MCFG to
> > seabios for PCI accesses practical? Hard?
> >
> What is MCFG?

It is the name of the ACPI table describing memory-mapped PCI config
accesses.


> > > You mentioned a perf angle that it takes half the time, but I don't
> > > see how as the only change between legacy and ext is: offset.
> > 
> > For a variety of reasons Linux does not use memory mapped accesses for legacy
> > config space, it uses cf8/cfc for that.
> And does it use memory mapped accesses for non-legacy config space?

No other way to access that, right?
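
For reference, a minimal sketch of what such a memory-mapped config
access looks like, assuming the standard ECAM layout that an MCFG entry
describes (names are illustrative): each function gets a 4 KB window,
so extended capabilities at offsets >= 0x100 are reachable, unlike
through the 256-byte cf8/cfc port window.

#include <stdint.h>

static inline volatile uint8_t *
ecam_cfg_addr(uintptr_t ecam_base, uint8_t bus, uint8_t dev,
              uint8_t fn, uint16_t offset)
{
        return (volatile uint8_t *)(ecam_base +
               ((uintptr_t)bus << 20) +  /* 256 buses */
               ((uintptr_t)dev << 15) +  /* 32 devices per bus */
               ((uintptr_t)fn  << 12) +  /* 8 functions per device */
               offset);                  /* 0x000 - 0xFFF */
}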

-- 
MST



2023-04-03 14:53   ` Parav Pandit
2023-04-03 17:48     ` Michael S. Tsirkin
2023-04-03 19:11       ` Stefan Hajnoczi
2023-04-03 20:03         ` Michael S. Tsirkin
2023-04-03 19:48       ` [virtio-dev] " Parav Pandit
2023-04-03 20:02         ` [virtio-dev] " Michael S. Tsirkin
2023-04-03 20:42           ` [virtio-dev] " Parav Pandit
2023-04-03 21:14             ` [virtio-dev] " Michael S. Tsirkin
2023-04-03 22:08               ` Parav Pandit
2023-04-03 19:10     ` Stefan Hajnoczi
2023-04-03 20:27       ` [virtio-dev] " Parav Pandit
2023-04-04 14:30         ` [virtio-dev] " Stefan Hajnoczi
2023-04-12  4:48 ` [virtio-dev] " Michael S. Tsirkin
2023-04-12  4:52   ` [virtio-dev] " Parav Pandit
2023-04-12  5:12     ` [virtio-dev] " Michael S. Tsirkin
2023-04-12  5:15       ` [virtio-dev] RE: [virtio-comment] " Parav Pandit
2023-04-12  5:23         ` [virtio-dev] " Michael S. Tsirkin
2023-04-12  5:39           ` [virtio-dev] " Parav Pandit
2023-04-12  6:02       ` Parav Pandit
2023-04-12  5:10 ` [virtio-dev] " Halil Pasic
2023-04-25  2:42 ` [virtio-dev] " Parav Pandit
2023-05-02  7:17   ` [virtio-dev] Re: [virtio-comment] " David Edmondson
2023-05-02 13:54     ` [virtio-dev] " Parav Pandit
