All of lore.kernel.org
 help / color / mirror / Atom feed
* [virtio-dev] [PATCH V2 0/2] virito-pci: PASID support
@ 2022-01-12  5:57 Jason Wang
  2022-01-12  5:57 ` [virtio-dev] [PATCH V2 1/2] virtio-pci: introduce virtio structure PCI Extended Capability Jason Wang
                   ` (3 more replies)
  0 siblings, 4 replies; 30+ messages in thread
From: Jason Wang @ 2022-01-12  5:57 UTC (permalink / raw)
  To: virtio-dev; +Cc: mst, stefanha, eperezma, lulu, Jason Wang

Hi All:

This series tries to add PASID support for virtio-pci to allow the
virtqueue to use PASID TLP prefix for PCI transactions. This will be
useful for future work like, queue assignment, virtqueue
virtualization and presenting multiple vDPA devices with a single PCI
device.

Since we're short of the space for the PCI capabilities, the PCI
extended capability for virtio structure is introduced that allows the
PASID configuration structure to use.

A prototype is implemented with emulated virtio-pci device in [1]. A
test driver is implemented in [2].

Please review.

Thanks

[1] https://github.com/jasowang/qemu.git virtio-pasid
[2] https://github.com/jasowang/linux.git virtio-pasid

Jason Wang (2):
  virtio-pci: introduce virtio structure PCI Extended Capability
  virtio-pci: add PASID configuration extended capability

 content.tex | 244 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 244 insertions(+)

-- 
2.32.0 (Apple Git-132)


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [virtio-dev] [PATCH V2 1/2] virtio-pci: introduce virtio structure PCI Extended Capability
  2022-01-12  5:57 [virtio-dev] [PATCH V2 0/2] virito-pci: PASID support Jason Wang
@ 2022-01-12  5:57 ` Jason Wang
  2022-01-12 10:10   ` [virtio-dev] " Stefan Hajnoczi
  2022-01-12  5:57 ` [virtio-dev] [PATCH V2 2/2] virtio-pci: add PASID configuration extended capability Jason Wang
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 30+ messages in thread
From: Jason Wang @ 2022-01-12  5:57 UTC (permalink / raw)
  To: virtio-dev; +Cc: mst, stefanha, eperezma, lulu, Jason Wang

We're already out of the configuration space if there's a device that
supports all kinds of the virtio structure via PCI capability. This
prevents us from adding new capabilities in the future.

So the patch adds the support for virtio structure via PCI Extended
Capability via the vendor specific extended capability.

Only MMIO bar is allowed now.

Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 content.tex | 127 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 127 insertions(+)

diff --git a/content.tex b/content.tex
index cf20570..00a75f2 100644
--- a/content.tex
+++ b/content.tex
@@ -1476,6 +1476,129 @@ \subsubsection{Non-transitional Device With Legacy Driver: A Note
    of BAR0 by presenting zeroes on every BAR and ignoring writes.
 \end{enumerate}
 
+\subsection{Virtio Structure PCI Extended Capabilities}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / Virtio Structure PCI Extended Capabilities}
+
+The location of the virtio structures that depend on the PCI Express
+capability are specified using a vendor-specific extended capabilities
+on the extended capabilities list in PCI Express extended
+configuration space of the device. The virtio structure extended
+capability uses little-endian format: all fields are read-only for the
+driver unless stated otherwise:
+
+\begin{lstlisting}
+struct virtio_pci_ecap {
+        le16 cap_vndr;      /* Generic PCI field: PCI_EXT_CAP_ID_VNDR */
+        le16 cap_rev:4;     /* Generic PCI field: capability version: 0x1 */
+        le16 cap_next:12;   /* Generic PCI field: next ptr */
+        le16 cfg_type;      /* Identifies the structure */
+        le16 cfg_rev:4;     /* Identifies the version of the structure: 0x1 */
+        le16 cap_len:12;    /* The bytes of the entire capability */
+        u8 bar;             /* Where to find it. */
+        u8 id;              /* Multiple capabilities of the same type */
+        u8 padding[2];      /* Pad to full dword. */
+        le32 offset;        /* Offset within bar. */
+        le32 length;        /* Length of the structure, in bytes. */
+};
+\end{lstlisting}
+
+This structure can be followed by extra data, depending on the
+\field{cfg_type}.
+
+\begin{description}
+\item[\field{cap_vndr}]
+        0x0B; Identifies a vendor-specific extended capability.
+
+\item[\field{cap_rev}]
+        0x01; Identifies the version of the capability structure.
+
+\item[\field{cap_next}]
+        Link to next extended capability in the capability list in the
+        PCI Express extended configuration space.
+
+\item[\field{cfg_type}]
+        Identifies the structure. All values are reserved for future
+        use.
+
+        The device MAY offer more than one structure of any type - this makes it
+        possible for the device to expose multiple interfaces to drivers.  The order of
+        the capabilities in the capability list specifies the order of preference
+        suggested by the device. A device MAY specify that this ordering mechanism be
+        overridden by the use of the \field{id} field.
+
+\item[\field{cfg_rev}]
+        0x01; Identifies the version of virtio structure PCI extended
+        capability.
+
+\item[\field{cap_len}]
+        The length of the entire vendor specific extended capability,
+        including the virtio_pci_ecap structure and vendor specific
+        registers.
+
+\item[\field{bar}]
+        Values 0x0 to 0x5 specify a Base Address register (BAR) belonging to
+        the function located beginning at 10h in PCI Configuration Space
+        and used to map the structure into Memory. The BAR is
+        permitted to be either 32-bit or 64-bit.
+
+        Any other value is reserved for future use.
+
+\item[\field{offset}]
+        Indicates where the structure begins relative to the base address associated
+        with the BAR. The alignment requirements of \field{offset} are indicated
+        in each structure-specific section.
+
+\item[\field{length}]
+        Indicates the length of the structure.
+        \field{length} MAY include padding, or fields unused by the driver, or
+        future extensions.
+\end{description}
+
+A variant of this type, struct virtio_pci_ecap64, is defined for
+those extended capabilities that require offsets or lengths larger
+than 4GiB:
+
+\begin{lstlisting}
+struct virtio_pci_ecap64 {
+        struct virtio_pci_ecap ecap;
+        u32 offset_hi;
+        u32 length_hi;
+};
+\end{lstlisting}
+
+Given that the \field{ecap.length} and \field{ecap.offset} fields
+are only 32 bit, the additional \field{offset_hi} and \field {length_hi}
+fields provide the most significant 32 bits of a total 64 bit offset and
+length within the BAR specified by \field{ecap.bar}.
+
+\drivernormative{\subsubsection}{Virtio Structure PCI Extended Capabilities}{Virtio Transport Options / Virtio Over PCI Bus / Virtio Structure PCI Extended Capabilities}
+
+The driver MUST ignore any vendor-specific capability structure which has
+a reserved \field{cfg_type} value.
+
+The driver SHOULD use the first instance of each virtio structure type
+they can support.
+
+The driver MUST accept a \field{cap_len} value which is larger than
+specified here.
+
+The driver MUST ignore any vendor-specific capability structure which has
+a reserved \field{bar} value.
+
+        The drivers SHOULD only map part of configuration structure
+        large enough for device operation.  The drivers MUST handle
+        an unexpectedly large \field{length}, but MAY check that \field{length}
+        is large enough for device operation.
+
+\devicenormative{\subsubsection}{Virtio Structure PCI Extended Capabilities}{Virtio Transport Options / Virtio Over PCI Bus / Virtio Structure PCI Extended Capabilities}
+
+The device MUST include any extra data (from the beginning of the
+\field{cap_vndr} field through end of the extra data fields if any) in
+\field{cap_len}. The device MAY append extra data or padding to any
+structure beyond that.
+
+If the device presents multiple structures of the same type, it SHOULD order
+them from optimal (first) to least-optimal (last).
+
 \subsection{PCI-specific Initialization And Device Operation}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI-specific Initialization And Device Operation}
 
 \subsubsection{Device Initialization}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI-specific Initialization And Device Operation / Device Initialization}
@@ -1488,6 +1611,10 @@ \subsubsection{Device Initialization}\label{sec:Virtio Transport Options / Virti
 PCI capability list, detecting virtio configuration layout using Virtio
 Structure PCI capabilities as detailed in \ref{sec:Virtio Transport Options / Virtio Over PCI Bus / Virtio Structure PCI Capabilities}
 
+Optionally, if the device is a PCIe device, the driver scans the PCI
+Extended capability list, detecting virtio configuration layout using
+Virtio Struct PCI Extended capabilities as detailed in \ref{sec:Virtio Transport Options / Virtio Over PCI Bus / Virtio Structure PCI Extended Capabilities}
+
 \subparagraph{Legacy Interface: A Note on Device Layout Detection}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI-specific Initialization And Device Operation / Device Initialization / Virtio Device Configuration Layout Detection / Legacy Interface: A Note on Device Layout Detection}
 
 Legacy drivers skipped the Device Layout Detection step, assuming legacy
-- 
2.32.0 (Apple Git-132)


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [virtio-dev] [PATCH V2 2/2] virtio-pci: add PASID configuration extended capability
  2022-01-12  5:57 [virtio-dev] [PATCH V2 0/2] virito-pci: PASID support Jason Wang
  2022-01-12  5:57 ` [virtio-dev] [PATCH V2 1/2] virtio-pci: introduce virtio structure PCI Extended Capability Jason Wang
@ 2022-01-12  5:57 ` Jason Wang
  2022-01-12 10:41   ` [virtio-dev] " Stefan Hajnoczi
  2022-01-12 10:44 ` [virtio-dev] Re: [PATCH V2 0/2] virito-pci: PASID support Stefan Hajnoczi
  2022-01-13 15:18 ` Michael S. Tsirkin
  3 siblings, 1 reply; 30+ messages in thread
From: Jason Wang @ 2022-01-12  5:57 UTC (permalink / raw)
  To: virtio-dev; +Cc: mst, stefanha, eperezma, lulu, Jason Wang

This patch tries to add PASID configuration structure. It is used for
assigning PASID to virtqueue then the device use PASID TLP prefix for
the PCI transactions like DMA. The goal is to isolate e.g DMA at
subdevice level which could be used for things like:

- direct queue assignment to userspace
- virtqueue virtualization
- presenting multiple vDPA devices within a single PCI device

The virtqueue group is introduced as an intermediate layer for having
better compatibility and flexibility. The virtqueue group is the
minimal set of the virtqueues that can be assigned with a single
PASID. The PASID is then assigned at the level of virtqueue group
instead of the virtqueue itself.

For a full PASID capable hardware, it can simply advertise a model
of 1:1 mapping of the virtqueue and virtqueue group. For the device
with mediated virtqueues, it can choose to place the mediated
virtqueues into dedicated virtqueue group(s).

Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 content.tex | 121 +++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 119 insertions(+), 2 deletions(-)

diff --git a/content.tex b/content.tex
index 00a75f2..552c0a8 100644
--- a/content.tex
+++ b/content.tex
@@ -1516,8 +1516,14 @@ \subsection{Virtio Structure PCI Extended Capabilities}\label{sec:Virtio Transpo
         PCI Express extended configuration space.
 
 \item[\field{cfg_type}]
-        Identifies the structure. All values are reserved for future
-        use.
+        Identifies the structure. according to the following table:
+
+\begin{lstlisting}
+/* PASID configuration */
+#define VIRTIO_PCI_ECAP_PASID_CFG        1
+\end{lstlisting}
+
+        Any other value is reserved for future use.
 
         The device MAY offer more than one structure of any type - this makes it
         possible for the device to expose multiple interfaces to drivers.  The order of
@@ -1599,6 +1605,117 @@ \subsection{Virtio Structure PCI Extended Capabilities}\label{sec:Virtio Transpo
 If the device presents multiple structures of the same type, it SHOULD order
 them from optimal (first) to least-optimal (last).
 
+\subsubsection{PASID configuration structure layout}
+
+The PASID configuration structure is found at the \field{bar} and
+\field{offset} within the VIRTIO_PCI_ECAP_PASID_CFG extended
+capability; its layout is below.
+
+\begin{lstlisting}
+struct virtio_pci_common_cfg {
+        /* About the whole device. */
+        le16 groups;                   /* read-only for driver */
+        /* About a specific virtqueue */
+        le16 queue_select;             /* read-write */
+        le16 group_id;                 /* read-only for driver */
+        /* About a specific virtqueue group */
+        le16 group_select;             /* read-write */
+        le32 group_pasid;              /* read-write */
+}
+\end{lstlisting}
+
+\begin{description}
+\item[\field{groups}]
+        The device specifies the number of virtqueue groups here. A
+        virtqueue group is the minimal set of the virtqueues that can
+        be assigned with a single PASID.
+
+\item[\field{queue_select}]
+        Queue Select. The driver selects which virtqueue the
+        \field{group_id} refers to.
+
+\item[\field{group_id}]
+        Virtqueue Group Identifier. The device specify the virtqueue
+        group identifier that the virtqueue belongs to.
+
+\item[\field{group_select}]
+        Virtqueue Group Select. The driver selects which virtqueue
+        group the field{group_pasid} refers to.
+
+\item[\field{group_pasid}]
+        Virtqueue Group PASID. The driver use this field to assign a
+        PASID to a virtqueue group.
+\end{description}
+
+\paragraph{PASID Configuration}
+
+When PASID extended capability is present and enabled in the device
+(through the standard PCI Extended configuration space). To assign a
+PASID to a specific virtqueue group. The driver need to perform the
+following steps:
+
+\begin{enumerate}
+\item Identify the number of groups by reading \field{groups}
+
+\item Identify the virtqueue group for a specific virtqueue by first
+  writing the index of the virtqueue to \field{queue_select} then
+  reading its virtqueue group identifier from \field{group_id}.
+
+\item Determine number of PASIDs and the PASID that will be associated
+  for each virtqueue group.
+
+\item Assign PASID to virtqueue groups by first writing the
+  virtqueue group ID to \field{group_select} then writing the valid
+  PASID number that assigned to the virtqueue group to
+  \field{group_pasid}
+\end{enumerate}
+
+To disable PASID for a virtqueue group, the driver write a special
+NO_PASID value:
+
+\begin{lstlisting}
+/* PASID value used to disable PASID for virtqueue group */
+#define VIRTIO_PASID_NO_ID            0xffffffff
+\end{lstlisting}
+
+\drivernormative{\subsubsection}{PASID configuration structure layout}{Virtio Transport Options / Virtio Over PCI Bus / PASID configuration structure layout}
+
+Except for VIRTIO_PASID_NO_ID, a driver MUST NOT attempt to assign a
+PASID number outside the range that is defined in the \field{Max PASID Width}
+in the PASID Extended Capability.
+
+After assigning a PASID to a virtqueue group, the driver MUST verify
+the success by reading the Virtqueue Group PASID: on success, the
+previously written value is returned, and on failure
+VIRTIO_PASID_NO_ID is returned. If a mapping failure is detected, the
+driver MAY retry assigning with fewer PASIDs, disable PASID or report
+device failure.
+
+\devicenormative{\subsubsection}{PASID configuration structure layout}{Virtio Transport Options / Virtio Over PCI Bus / PASID configuration structure layout}
+
+The device MUST support at least one virtqueue group if it offers
+PASID configuration structure.
+
+The device MUST supply virtqueue group identifiers continuously from
+number zero up to the number of virtqueue groups minus one. For
+example, if the device has 4 virtqueue groups, the group ids should be
+0, 1, 2, 3.
+
+The device MUST make sure all the previous PCIe transactions for the
+virtqueue group is completed before presenting the new PASID value
+wrote to the drvier in \field{group_pasid}.
+
+The device MUST use PASID TLP prefix for all memory transactions
+belongs to a virtqueue group if a valid PASID is assigned and PASID is
+enabled in the PASID extended capability.
+
+The device MUST NOT use PASID TLP prefix for any memory transaction
+belongs to a virtqueue group if VIRTIO_PASID_NO_ID is assigned or
+PASID is not enabled in the PASID extended capability.
+
+Upon reset, the device MUST present VIRTIO_PASID_NO_ID in
+\field{group_pasid} for each virtqueue group.
+
 \subsection{PCI-specific Initialization And Device Operation}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI-specific Initialization And Device Operation}
 
 \subsubsection{Device Initialization}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI-specific Initialization And Device Operation / Device Initialization}
-- 
2.32.0 (Apple Git-132)


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [virtio-dev] Re: [PATCH V2 1/2] virtio-pci: introduce virtio structure PCI Extended Capability
  2022-01-12  5:57 ` [virtio-dev] [PATCH V2 1/2] virtio-pci: introduce virtio structure PCI Extended Capability Jason Wang
@ 2022-01-12 10:10   ` Stefan Hajnoczi
  2022-01-13  0:55     ` Jason Wang
  0 siblings, 1 reply; 30+ messages in thread
From: Stefan Hajnoczi @ 2022-01-12 10:10 UTC (permalink / raw)
  To: Jason Wang; +Cc: virtio-dev, mst, eperezma, lulu

[-- Attachment #1: Type: text/plain, Size: 5040 bytes --]

On Wed, Jan 12, 2022 at 01:57:54PM +0800, Jason Wang wrote:
> We're already out of the configuration space if there's a device that
> supports all kinds of the virtio structure via PCI capability. This
> prevents us from adding new capabilities in the future.
> 
> So the patch adds the support for virtio structure via PCI Extended
> Capability via the vendor specific extended capability.
> 
> Only MMIO bar is allowed now.
> 
> Signed-off-by: Jason Wang <jasowang@redhat.com>
> ---
>  content.tex | 127 ++++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 127 insertions(+)
> 
> diff --git a/content.tex b/content.tex
> index cf20570..00a75f2 100644
> --- a/content.tex
> +++ b/content.tex
> @@ -1476,6 +1476,129 @@ \subsubsection{Non-transitional Device With Legacy Driver: A Note
>     of BAR0 by presenting zeroes on every BAR and ignoring writes.
>  \end{enumerate}
>  
> +\subsection{Virtio Structure PCI Extended Capabilities}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / Virtio Structure PCI Extended Capabilities}
> +
> +The location of the virtio structures that depend on the PCI Express
> +capability are specified using a vendor-specific extended capabilities
> +on the extended capabilities list in PCI Express extended
> +configuration space of the device.

I can't parse this sentence. Can you rephrase it and/or split it into
multiple sentences?

> The virtio structure extended
> +capability uses little-endian format: all fields are read-only for the
> +driver unless stated otherwise:
> +
> +\begin{lstlisting}
> +struct virtio_pci_ecap {
> +        le16 cap_vndr;      /* Generic PCI field: PCI_EXT_CAP_ID_VNDR */
> +        le16 cap_rev:4;     /* Generic PCI field: capability version: 0x1 */
> +        le16 cap_next:12;   /* Generic PCI field: next ptr */
> +        le16 cfg_type;      /* Identifies the structure */
> +        le16 cfg_rev:4;     /* Identifies the version of the structure: 0x1 */
> +        le16 cap_len:12;    /* The bytes of the entire capability */
> +        u8 bar;             /* Where to find it. */
> +        u8 id;              /* Multiple capabilities of the same type */
> +        u8 padding[2];      /* Pad to full dword. */
> +        le32 offset;        /* Offset within bar. */
> +        le32 length;        /* Length of the structure, in bytes. */
> +};
> +\end{lstlisting}
> +
> +This structure can be followed by extra data, depending on the
> +\field{cfg_type}.
> +
> +\begin{description}
> +\item[\field{cap_vndr}]
> +        0x0B; Identifies a vendor-specific extended capability.

It's worth clarifying that "vendor-specific" here refers to the PCIe
standard, not to the VIRTIO standard. Please use "PCI Vendor-Specific
Extended Capability" everywhere in this patch instead of just
"vendor-specific extended capability". That helps avoid confusion with
VIRTIO_PCI_CAP_VENDOR_CFG.

> +
> +\item[\field{cap_rev}]
> +        0x01; Identifies the version of the capability structure.
> +
> +\item[\field{cap_next}]
> +        Link to next extended capability in the capability list in the
> +        PCI Express extended configuration space.
> +
> +\item[\field{cfg_type}]
> +        Identifies the structure. All values are reserved for future
> +        use.
> +
> +        The device MAY offer more than one structure of any type - this makes it
> +        possible for the device to expose multiple interfaces to drivers.  The order of
> +        the capabilities in the capability list specifies the order of preference
> +        suggested by the device. A device MAY specify that this ordering mechanism be
> +        overridden by the use of the \field{id} field.
> +
> +\item[\field{cfg_rev}]
> +        0x01; Identifies the version of virtio structure PCI extended
> +        capability.
> +
> +\item[\field{cap_len}]
> +        The length of the entire vendor specific extended capability,
> +        including the virtio_pci_ecap structure and vendor specific
> +        registers.

I thought the point of this is to place the register in ->bar, so why
would there be extra registers after struct virtio_pci_ecap(64)? Again,
it's unclear who the "vendor" is.

> @@ -1488,6 +1611,10 @@ \subsubsection{Device Initialization}\label{sec:Virtio Transport Options / Virti
>  PCI capability list, detecting virtio configuration layout using Virtio
>  Structure PCI capabilities as detailed in \ref{sec:Virtio Transport Options / Virtio Over PCI Bus / Virtio Structure PCI Capabilities}
>  
> +Optionally, if the device is a PCIe device, the driver scans the PCI
> +Extended capability list, detecting virtio configuration layout using
> +Virtio Struct PCI Extended capabilities as detailed in \ref{sec:Virtio Transport Options / Virtio Over PCI Bus / Virtio Structure PCI Extended Capabilities}

If the same structure is present in both the PCI Capabilities and PCI
Extended Capabilities lists, which one has a higher priority?

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [virtio-dev] Re: [PATCH V2 2/2] virtio-pci: add PASID configuration extended capability
  2022-01-12  5:57 ` [virtio-dev] [PATCH V2 2/2] virtio-pci: add PASID configuration extended capability Jason Wang
@ 2022-01-12 10:41   ` Stefan Hajnoczi
  2022-01-13  1:24     ` Jason Wang
  0 siblings, 1 reply; 30+ messages in thread
From: Stefan Hajnoczi @ 2022-01-12 10:41 UTC (permalink / raw)
  To: Jason Wang; +Cc: virtio-dev, mst, eperezma, lulu

[-- Attachment #1: Type: text/plain, Size: 10800 bytes --]

On Wed, Jan 12, 2022 at 01:57:55PM +0800, Jason Wang wrote:
> This patch tries to add PASID configuration structure. It is used for
> assigning PASID to virtqueue then the device use PASID TLP prefix for
> the PCI transactions like DMA. The goal is to isolate e.g DMA at
> subdevice level which could be used for things like:
> 
> - direct queue assignment to userspace
> - virtqueue virtualization
> - presenting multiple vDPA devices within a single PCI device
> 
> The virtqueue group is introduced as an intermediate layer for having
> better compatibility and flexibility. The virtqueue group is the
> minimal set of the virtqueues that can be assigned with a single
> PASID. The PASID is then assigned at the level of virtqueue group
> instead of the virtqueue itself.
> 
> For a full PASID capable hardware, it can simply advertise a model
> of 1:1 mapping of the virtqueue and virtqueue group. For the device
> with mediated virtqueues, it can choose to place the mediated
> virtqueues into dedicated virtqueue group(s).
> 
> Signed-off-by: Jason Wang <jasowang@redhat.com>
> ---
>  content.tex | 121 +++++++++++++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 119 insertions(+), 2 deletions(-)
> 
> diff --git a/content.tex b/content.tex
> index 00a75f2..552c0a8 100644
> --- a/content.tex
> +++ b/content.tex
> @@ -1516,8 +1516,14 @@ \subsection{Virtio Structure PCI Extended Capabilities}\label{sec:Virtio Transpo
>          PCI Express extended configuration space.
>  
>  \item[\field{cfg_type}]
> -        Identifies the structure. All values are reserved for future
> -        use.
> +        Identifies the structure. according to the following table:
> +
> +\begin{lstlisting}
> +/* PASID configuration */
> +#define VIRTIO_PCI_ECAP_PASID_CFG        1
> +\end{lstlisting}
> +
> +        Any other value is reserved for future use.
>  
>          The device MAY offer more than one structure of any type - this makes it
>          possible for the device to expose multiple interfaces to drivers.  The order of
> @@ -1599,6 +1605,117 @@ \subsection{Virtio Structure PCI Extended Capabilities}\label{sec:Virtio Transpo
>  If the device presents multiple structures of the same type, it SHOULD order
>  them from optimal (first) to least-optimal (last).
>  
> +\subsubsection{PASID configuration structure layout}
> +
> +The PASID configuration structure is found at the \field{bar} and
> +\field{offset} within the VIRTIO_PCI_ECAP_PASID_CFG extended

It looks like PCI Extended Capabilities have a different namespace from
classic PCI Capabilities (VIRTIO_PCI_**ECAP**_PASID_CFG). If a device
runs out of PCI Configuration Space it will be unable to define
additional capabilities as PCI Extended Capabilities. At least I'm not
aware of VIRTIO_PCI_ECAP_* definitions for existing capabilities. If the
point is to help with PCI Configuration Space exhaustion, should all
capabilities be defined in the same namespace so they can be trivially
defined as either PCI Capabilities or PCI Extended Capabilities?

(Maybe I misunderstood the purpose of adding PCI Extended Capabilities?)

> +capability; its layout is below.
> +
> +\begin{lstlisting}
> +struct virtio_pci_common_cfg {

The struct name looks like it was copy-pasted. Should this be struct
virtio_pci_group_cfg?

> +        /* About the whole device. */
> +        le16 groups;                   /* read-only for driver */
> +        /* About a specific virtqueue */
> +        le16 queue_select;             /* read-write */
> +        le16 group_id;                 /* read-only for driver */
> +        /* About a specific virtqueue group */
> +        le16 group_select;             /* read-write */
> +        le32 group_pasid;              /* read-write */
> +}
> +\end{lstlisting}
> +
> +\begin{description}
> +\item[\field{groups}]
> +        The device specifies the number of virtqueue groups here. A
> +        virtqueue group is the minimal set of the virtqueues that can
> +        be assigned with a single PASID.
> +
> +\item[\field{queue_select}]
> +        Queue Select. The driver selects which virtqueue the
> +        \field{group_id} refers to.
> +
> +\item[\field{group_id}]
> +        Virtqueue Group Identifier. The device specify the virtqueue
> +        group identifier that the virtqueue belongs to.
> +
> +\item[\field{group_select}]
> +        Virtqueue Group Select. The driver selects which virtqueue
> +        group the field{group_pasid} refers to.
> +
> +\item[\field{group_pasid}]
> +        Virtqueue Group PASID. The driver use this field to assign a
> +        PASID to a virtqueue group.
> +\end{description}

The queue_select/group_select interface makes me wonder if an admin
virtqueue interface would be better. The driver is forced to scan
virtqueues and groups, and each access requires a vmexit. An admin
virtqueue interface would allow a more efficient DMA interface where all
virtqueues/groups can be queried or set in a single operation.

Luckily this interface is only used at initialization time, but I
sometimes still wonder if it's scalable enough.

> +
> +\paragraph{PASID Configuration}
> +
> +When PASID extended capability is present and enabled in the device
> +(through the standard PCI Extended configuration space). To assign a
> +PASID to a specific virtqueue group. The driver need to perform the
> +following steps:

This paragraph is a single sentence, the full stops ('.') are used more
like commas (','). I suggest shortening it like this:

  When the PCI PASID Extended Capability is present and enabled, the
  driver performs the following steps to assign a PASID to a virtqueue
  group:

(I also used explicit "PCI PASID Extended Capability" naming so it's
clear we're talking about a formal term defined in the PCI
specification.)

> +
> +\begin{enumerate}
> +\item Identify the number of groups by reading \field{groups}
> +
> +\item Identify the virtqueue group for a specific virtqueue by first
> +  writing the index of the virtqueue to \field{queue_select} then
> +  reading its virtqueue group identifier from \field{group_id}.
> +
> +\item Determine number of PASIDs and the PASID that will be associated
> +  for each virtqueue group.
> +
> +\item Assign PASID to virtqueue groups by first writing the
> +  virtqueue group ID to \field{group_select} then writing the valid
> +  PASID number that assigned to the virtqueue group to
> +  \field{group_pasid}

  \item Assign PASIDs to virtqueue groups by first writing the
    virtqueue group ID to \field{group_select} and then writing a valid
    PASID number to assign to that group to \field{group_pasid}.

> +\end{enumerate}
> +
> +To disable PASID for a virtqueue group, the driver write a special
> +NO_PASID value:

  The driver writes the special VIRTIO_PASID_NO_ID value to disable
  PASID for a virtqueue group:

> +
> +\begin{lstlisting}
> +/* PASID value used to disable PASID for virtqueue group */
> +#define VIRTIO_PASID_NO_ID            0xffffffff
> +\end{lstlisting}
> +
> +\drivernormative{\subsubsection}{PASID configuration structure layout}{Virtio Transport Options / Virtio Over PCI Bus / PASID configuration structure layout}
> +
> +Except for VIRTIO_PASID_NO_ID, a driver MUST NOT attempt to assign a
> +PASID number outside the range that is defined in the \field{Max PASID Width}
> +in the PASID Extended Capability.
> +
> +After assigning a PASID to a virtqueue group, the driver MUST verify
> +the success by reading the Virtqueue Group PASID: on success, the
> +previously written value is returned, and on failure
> +VIRTIO_PASID_NO_ID is returned. If a mapping failure is detected, the
> +driver MAY retry assigning with fewer PASIDs, disable PASID or report
> +device failure.
> +
> +\devicenormative{\subsubsection}{PASID configuration structure layout}{Virtio Transport Options / Virtio Over PCI Bus / PASID configuration structure layout}
> +
> +The device MUST support at least one virtqueue group if it offers
> +PASID configuration structure.

  the PASID configuration structure.

("a" would be okay too but leaves open the possibility of multiple PASID
configuration structures, which I think we don't want.)

> +
> +The device MUST supply virtqueue group identifiers continuously from
> +number zero up to the number of virtqueue groups minus one. For

s/number zero/the number zero/

> +example, if the device has 4 virtqueue groups, the group ids should be

To avoid using "should":
s/should be/are/

> +0, 1, 2, 3.

/, 3/, and 3/

> +
> +The device MUST make sure all the previous PCIe transactions for the
> +virtqueue group is completed before presenting the new PASID value

s/transactions ... is completed/transactions ... are completed/

> +wrote to the drvier in \field{group_pasid}.

This is ambiguous. It leaves open the possibility that group_pasid
updates at a later point in time instead of immediately after a write by
the driver. My interpretation is that the device must respond
immediately by presenting the new PASID or VIRTIO_PASID_NO_ID when the
driver reads back group_pasid after a write. If you agree, please
rephrase this clause to make it clear that reading group_pasid after
write acts like a barrier (previous PCIe transactions must complete
before the device responds to the read) and the device is not allowed to
present the old group_pasid value until all transactions are completed
and then present the new group_pasid value.

> +
> +The device MUST use PASID TLP prefix for all memory transactions
> +belongs to a virtqueue group if a valid PASID is assigned and PASID is

s/belongs/belonging/

> +enabled in the PASID extended capability.

This statement is vague. Which memory transactions "belong to a
virtqueue group"? vring reads/writes and data buffer reads/writes come
to mind. Virtqueue MSI-X messages are probably not included? Anything
else?

> +
> +The device MUST NOT use PASID TLP prefix for any memory transaction
> +belongs to a virtqueue group if VIRTIO_PASID_NO_ID is assigned or
> +PASID is not enabled in the PASID extended capability.
> +
> +Upon reset, the device MUST present VIRTIO_PASID_NO_ID in
> +\field{group_pasid} for each virtqueue group.
> +
>  \subsection{PCI-specific Initialization And Device Operation}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI-specific Initialization And Device Operation}
>  
>  \subsubsection{Device Initialization}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI-specific Initialization And Device Operation / Device Initialization}
> -- 
> 2.32.0 (Apple Git-132)
> 

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [virtio-dev] Re: [PATCH V2 0/2] virito-pci: PASID support
  2022-01-12  5:57 [virtio-dev] [PATCH V2 0/2] virito-pci: PASID support Jason Wang
  2022-01-12  5:57 ` [virtio-dev] [PATCH V2 1/2] virtio-pci: introduce virtio structure PCI Extended Capability Jason Wang
  2022-01-12  5:57 ` [virtio-dev] [PATCH V2 2/2] virtio-pci: add PASID configuration extended capability Jason Wang
@ 2022-01-12 10:44 ` Stefan Hajnoczi
  2022-01-13  1:28   ` Jason Wang
  2022-01-13 15:18 ` Michael S. Tsirkin
  3 siblings, 1 reply; 30+ messages in thread
From: Stefan Hajnoczi @ 2022-01-12 10:44 UTC (permalink / raw)
  To: Jason Wang; +Cc: virtio-dev, mst, eperezma, lulu

[-- Attachment #1: Type: text/plain, Size: 963 bytes --]

On Wed, Jan 12, 2022 at 01:57:53PM +0800, Jason Wang wrote:
> Hi All:
> 
> This series tries to add PASID support for virtio-pci to allow the
> virtqueue to use PASID TLP prefix for PCI transactions. This will be
> useful for future work like, queue assignment, virtqueue
> virtualization and presenting multiple vDPA devices with a single PCI
> device.
> 
> Since we're short of the space for the PCI capabilities, the PCI
> extended capability for virtio structure is introduced that allows the
> PASID configuration structure to use.
> 
> A prototype is implemented with emulated virtio-pci device in [1]. A
> test driver is implemented in [2].
> 
> Please review.

I don't know the security model for PASIDs. My guess is that PASIDs can
be bruteforced so we must trust the driver (it can assign PASIDs to
virtqueue groups) and we must prevent untrusted applications from
setting PASIDs on virtqueues. Is that correct?

Thanks,
Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [virtio-dev] Re: [PATCH V2 1/2] virtio-pci: introduce virtio structure PCI Extended Capability
  2022-01-12 10:10   ` [virtio-dev] " Stefan Hajnoczi
@ 2022-01-13  0:55     ` Jason Wang
  2022-01-13 10:19       ` Stefan Hajnoczi
  0 siblings, 1 reply; 30+ messages in thread
From: Jason Wang @ 2022-01-13  0:55 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: Virtio-Dev, mst, eperezma, Cindy Lu

On Wed, Jan 12, 2022 at 6:10 PM Stefan Hajnoczi <stefanha@redhat.com> wrote:
>
> On Wed, Jan 12, 2022 at 01:57:54PM +0800, Jason Wang wrote:
> > We're already out of the configuration space if there's a device that
> > supports all kinds of the virtio structure via PCI capability. This
> > prevents us from adding new capabilities in the future.
> >
> > So the patch adds the support for virtio structure via PCI Extended
> > Capability via the vendor specific extended capability.
> >
> > Only MMIO bar is allowed now.
> >
> > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > ---
> >  content.tex | 127 ++++++++++++++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 127 insertions(+)
> >
> > diff --git a/content.tex b/content.tex
> > index cf20570..00a75f2 100644
> > --- a/content.tex
> > +++ b/content.tex
> > @@ -1476,6 +1476,129 @@ \subsubsection{Non-transitional Device With Legacy Driver: A Note
> >     of BAR0 by presenting zeroes on every BAR and ignoring writes.
> >  \end{enumerate}
> >
> > +\subsection{Virtio Structure PCI Extended Capabilities}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / Virtio Structure PCI Extended Capabilities}
> > +
> > +The location of the virtio structures that depend on the PCI Express
> > +capability are specified using a vendor-specific extended capabilities
> > +on the extended capabilities list in PCI Express extended
> > +configuration space of the device.
>
> I can't parse this sentence. Can you rephrase it and/or split it into
> multiple sentences?

Sure, how about something like:

There could be virtio structures that depend on the PCI express
capability. The location of those structures are specified using a
vendor-specific extended capabilities on the extended capabilities
list.

>
> > The virtio structure extended
> > +capability uses little-endian format: all fields are read-only for the
> > +driver unless stated otherwise:
> > +
> > +\begin{lstlisting}
> > +struct virtio_pci_ecap {
> > +        le16 cap_vndr;      /* Generic PCI field: PCI_EXT_CAP_ID_VNDR */
> > +        le16 cap_rev:4;     /* Generic PCI field: capability version: 0x1 */
> > +        le16 cap_next:12;   /* Generic PCI field: next ptr */
> > +        le16 cfg_type;      /* Identifies the structure */
> > +        le16 cfg_rev:4;     /* Identifies the version of the structure: 0x1 */
> > +        le16 cap_len:12;    /* The bytes of the entire capability */
> > +        u8 bar;             /* Where to find it. */
> > +        u8 id;              /* Multiple capabilities of the same type */
> > +        u8 padding[2];      /* Pad to full dword. */
> > +        le32 offset;        /* Offset within bar. */
> > +        le32 length;        /* Length of the structure, in bytes. */
> > +};
> > +\end{lstlisting}
> > +
> > +This structure can be followed by extra data, depending on the
> > +\field{cfg_type}.
> > +
> > +\begin{description}
> > +\item[\field{cap_vndr}]
> > +        0x0B; Identifies a vendor-specific extended capability.
>
> It's worth clarifying that "vendor-specific" here refers to the PCIe
> standard, not to the VIRTIO standard. Please use "PCI Vendor-Specific
> Extended Capability" everywhere in this patch instead of just
> "vendor-specific extended capability". That helps avoid confusion with
> VIRTIO_PCI_CAP_VENDOR_CFG.

I'm ok with this. Note that I just copy-paste from the pci part:

\item[\field{cap_vndr}]
0x09; Identifies a vendor-specific capability.

I think the logic is that it's a general PCI field, so we can avoid
mentioning "PCIe" again. So for general PCI fields, the "vendor" is
the one deduced from the PCI vendor ID. For the virtio field like
cfg_type, it's the virtio vendor.

>
> > +
> > +\item[\field{cap_rev}]
> > +        0x01; Identifies the version of the capability structure.
> > +
> > +\item[\field{cap_next}]
> > +        Link to next extended capability in the capability list in the
> > +        PCI Express extended configuration space.
> > +
> > +\item[\field{cfg_type}]
> > +        Identifies the structure. All values are reserved for future
> > +        use.
> > +
> > +        The device MAY offer more than one structure of any type - this makes it
> > +        possible for the device to expose multiple interfaces to drivers.  The order of
> > +        the capabilities in the capability list specifies the order of preference
> > +        suggested by the device. A device MAY specify that this ordering mechanism be
> > +        overridden by the use of the \field{id} field.
> > +
> > +\item[\field{cfg_rev}]
> > +        0x01; Identifies the version of virtio structure PCI extended
> > +        capability.
> > +
> > +\item[\field{cap_len}]
> > +        The length of the entire vendor specific extended capability,
> > +        including the virtio_pci_ecap structure and vendor specific
> > +        registers.
>
> I thought the point of this is to place the register in ->bar, so why
> would there be extra registers after struct virtio_pci_ecap(64)?

PCI spec allows to place registers in the capability itself. We had
already done this, one example is:

struct virtio_pci_notify_cap {
        struct virtio_pci_cap cap;
        le32 notify_off_multiplier; /* Multiplier for queue_notify_off. */
};

Actually, this is the only way to make VIRTIO_PCI_CAP_PCI_CFG to work.

struct virtio_pci_cfg_cap {
        struct virtio_pci_cap cap;
        u8 pci_cfg_data[4]; /* Data for BAR access. */
};

> Again,
> it's unclear who the "vendor" is.

It's a general PCI field, so the vendor here is specified from the
view of PCI that is the vendor deduced from the PCI vendor ID.

>
> > @@ -1488,6 +1611,10 @@ \subsubsection{Device Initialization}\label{sec:Virtio Transport Options / Virti
> >  PCI capability list, detecting virtio configuration layout using Virtio
> >  Structure PCI capabilities as detailed in \ref{sec:Virtio Transport Options / Virtio Over PCI Bus / Virtio Structure PCI Capabilities}
> >
> > +Optionally, if the device is a PCIe device, the driver scans the PCI
> > +Extended capability list, detecting virtio configuration layout using
> > +Virtio Struct PCI Extended capabilities as detailed in \ref{sec:Virtio Transport Options / Virtio Over PCI Bus / Virtio Structure PCI Extended Capabilities}
>
> If the same structure is present in both the PCI Capabilities and PCI
> Extended Capabilities lists, which one has a higher priority?

This is not allowed in this proposal where only the structure that
depends on the PCI Express capability is allowed. E.g the PASID
configuration structure depends on the PASID extended capability.

This is much more simpler than allowing existing virtio structures to
be presented on both PCI and PCIe capability list.

Thanks


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [virtio-dev] Re: [PATCH V2 2/2] virtio-pci: add PASID configuration extended capability
  2022-01-12 10:41   ` [virtio-dev] " Stefan Hajnoczi
@ 2022-01-13  1:24     ` Jason Wang
  2022-01-13 10:32       ` Stefan Hajnoczi
  2022-01-14  9:43       ` Jean-Philippe Brucker
  0 siblings, 2 replies; 30+ messages in thread
From: Jason Wang @ 2022-01-13  1:24 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: Virtio-Dev, mst, eperezma, Cindy Lu

On Wed, Jan 12, 2022 at 6:42 PM Stefan Hajnoczi <stefanha@redhat.com> wrote:
>
> On Wed, Jan 12, 2022 at 01:57:55PM +0800, Jason Wang wrote:
> > This patch tries to add PASID configuration structure. It is used for
> > assigning PASID to virtqueue then the device use PASID TLP prefix for
> > the PCI transactions like DMA. The goal is to isolate e.g DMA at
> > subdevice level which could be used for things like:
> >
> > - direct queue assignment to userspace
> > - virtqueue virtualization
> > - presenting multiple vDPA devices within a single PCI device
> >
> > The virtqueue group is introduced as an intermediate layer for having
> > better compatibility and flexibility. The virtqueue group is the
> > minimal set of the virtqueues that can be assigned with a single
> > PASID. The PASID is then assigned at the level of virtqueue group
> > instead of the virtqueue itself.
> >
> > For a full PASID capable hardware, it can simply advertise a model
> > of 1:1 mapping of the virtqueue and virtqueue group. For the device
> > with mediated virtqueues, it can choose to place the mediated
> > virtqueues into dedicated virtqueue group(s).
> >
> > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > ---
> >  content.tex | 121 +++++++++++++++++++++++++++++++++++++++++++++++++++-
> >  1 file changed, 119 insertions(+), 2 deletions(-)
> >
> > diff --git a/content.tex b/content.tex
> > index 00a75f2..552c0a8 100644
> > --- a/content.tex
> > +++ b/content.tex
> > @@ -1516,8 +1516,14 @@ \subsection{Virtio Structure PCI Extended Capabilities}\label{sec:Virtio Transpo
> >          PCI Express extended configuration space.
> >
> >  \item[\field{cfg_type}]
> > -        Identifies the structure. All values are reserved for future
> > -        use.
> > +        Identifies the structure. according to the following table:
> > +
> > +\begin{lstlisting}
> > +/* PASID configuration */
> > +#define VIRTIO_PCI_ECAP_PASID_CFG        1
> > +\end{lstlisting}
> > +
> > +        Any other value is reserved for future use.
> >
> >          The device MAY offer more than one structure of any type - this makes it
> >          possible for the device to expose multiple interfaces to drivers.  The order of
> > @@ -1599,6 +1605,117 @@ \subsection{Virtio Structure PCI Extended Capabilities}\label{sec:Virtio Transpo
> >  If the device presents multiple structures of the same type, it SHOULD order
> >  them from optimal (first) to least-optimal (last).
> >
> > +\subsubsection{PASID configuration structure layout}
> > +
> > +The PASID configuration structure is found at the \field{bar} and
> > +\field{offset} within the VIRTIO_PCI_ECAP_PASID_CFG extended
>
> It looks like PCI Extended Capabilities have a different namespace from
> classic PCI Capabilities (VIRTIO_PCI_**ECAP**_PASID_CFG). If a device
> runs out of PCI Configuration Space it will be unable to define
> additional capabilities as PCI Extended Capabilities. At least I'm not
> aware of VIRTIO_PCI_ECAP_* definitions for existing capabilities. If the
> point is to help with PCI Configuration Space exhaustion, should all
> capabilities be defined in the same namespace so they can be trivially
> defined as either PCI Capabilities or PCI Extended Capabilities?

I'm not sure, but only allowing new types to be placed in extended
capabilities is much easier.

>
> (Maybe I misunderstood the purpose of adding PCI Extended Capabilities?)

More or less, the proposal is only for the structure that depends on
PCI express.

>
> > +capability; its layout is below.
> > +
> > +\begin{lstlisting}
> > +struct virtio_pci_common_cfg {
>
> The struct name looks like it was copy-pasted. Should this be struct
> virtio_pci_group_cfg?

Right, but I think virtio_pci_pasid_cfg is better.

>
> > +        /* About the whole device. */
> > +        le16 groups;                   /* read-only for driver */
> > +        /* About a specific virtqueue */
> > +        le16 queue_select;             /* read-write */
> > +        le16 group_id;                 /* read-only for driver */
> > +        /* About a specific virtqueue group */
> > +        le16 group_select;             /* read-write */
> > +        le32 group_pasid;              /* read-write */
> > +}
> > +\end{lstlisting}
> > +
> > +\begin{description}
> > +\item[\field{groups}]
> > +        The device specifies the number of virtqueue groups here. A
> > +        virtqueue group is the minimal set of the virtqueues that can
> > +        be assigned with a single PASID.
> > +
> > +\item[\field{queue_select}]
> > +        Queue Select. The driver selects which virtqueue the
> > +        \field{group_id} refers to.
> > +
> > +\item[\field{group_id}]
> > +        Virtqueue Group Identifier. The device specify the virtqueue
> > +        group identifier that the virtqueue belongs to.
> > +
> > +\item[\field{group_select}]
> > +        Virtqueue Group Select. The driver selects which virtqueue
> > +        group the field{group_pasid} refers to.
> > +
> > +\item[\field{group_pasid}]
> > +        Virtqueue Group PASID. The driver use this field to assign a
> > +        PASID to a virtqueue group.
> > +\end{description}
>
> The queue_select/group_select interface makes me wonder if an admin
> virtqueue interface would be better. The driver is forced to scan
> virtqueues and groups, and each access requires a vmexit. An admin
> virtqueue interface would allow a more efficient DMA interface where all
> virtqueues/groups can be queried or set in a single operation.
>
> Luckily this interface is only used at initialization time, but I
> sometimes still wonder if it's scalable enough.

Yes, so it should be no worse than the current layout in the
virtio_pci_common_cfg where we had a bunch of fields that are
multiplexed via the queue_select.

Actually the admin virtqueue and capability are not mutually
exclusive. E.g we can have admin virtqueue in L0 and capability in
L1/L0. The capability make it much more easier to be virtualized for
future vPASID/vSVA stuffs.

>
> > +
> > +\paragraph{PASID Configuration}
> > +
> > +When PASID extended capability is present and enabled in the device
> > +(through the standard PCI Extended configuration space). To assign a
> > +PASID to a specific virtqueue group. The driver need to perform the
> > +following steps:
>
> This paragraph is a single sentence, the full stops ('.') are used more
> like commas (','). I suggest shortening it like this:
>
>   When the PCI PASID Extended Capability is present and enabled, the
>   driver performs the following steps to assign a PASID to a virtqueue
>   group:
>
> (I also used explicit "PCI PASID Extended Capability" naming so it's
> clear we're talking about a formal term defined in the PCI
> specification.)

That's fine.

>
> > +
> > +\begin{enumerate}
> > +\item Identify the number of groups by reading \field{groups}
> > +
> > +\item Identify the virtqueue group for a specific virtqueue by first
> > +  writing the index of the virtqueue to \field{queue_select} then
> > +  reading its virtqueue group identifier from \field{group_id}.
> > +
> > +\item Determine number of PASIDs and the PASID that will be associated
> > +  for each virtqueue group.
> > +
> > +\item Assign PASID to virtqueue groups by first writing the
> > +  virtqueue group ID to \field{group_select} then writing the valid
> > +  PASID number that assigned to the virtqueue group to
> > +  \field{group_pasid}
>
>   \item Assign PASIDs to virtqueue groups by first writing the
>     virtqueue group ID to \field{group_select} and then writing a valid
>     PASID number to assign to that group to \field{group_pasid}.

Ok.

>
> > +\end{enumerate}
> > +
> > +To disable PASID for a virtqueue group, the driver write a special
> > +NO_PASID value:
>
>   The driver writes the special VIRTIO_PASID_NO_ID value to disable
>   PASID for a virtqueue group:

Ok.

>
> > +
> > +\begin{lstlisting}
> > +/* PASID value used to disable PASID for virtqueue group */
> > +#define VIRTIO_PASID_NO_ID            0xffffffff
> > +\end{lstlisting}
> > +
> > +\drivernormative{\subsubsection}{PASID configuration structure layout}{Virtio Transport Options / Virtio Over PCI Bus / PASID configuration structure layout}
> > +
> > +Except for VIRTIO_PASID_NO_ID, a driver MUST NOT attempt to assign a
> > +PASID number outside the range that is defined in the \field{Max PASID Width}
> > +in the PASID Extended Capability.
> > +
> > +After assigning a PASID to a virtqueue group, the driver MUST verify
> > +the success by reading the Virtqueue Group PASID: on success, the
> > +previously written value is returned, and on failure
> > +VIRTIO_PASID_NO_ID is returned. If a mapping failure is detected, the
> > +driver MAY retry assigning with fewer PASIDs, disable PASID or report
> > +device failure.
> > +
> > +\devicenormative{\subsubsection}{PASID configuration structure layout}{Virtio Transport Options / Virtio Over PCI Bus / PASID configuration structure layout}
> > +
> > +The device MUST support at least one virtqueue group if it offers
> > +PASID configuration structure.
>
>   the PASID configuration structure.
>
> ("a" would be okay too but leaves open the possibility of multiple PASID
> configuration structures, which I think we don't want.)

Yes, but multiple structures of the same type are allowed for PCI
capability, so I try to allow it for PCIe as well.

>
> > +
> > +The device MUST supply virtqueue group identifiers continuously from
> > +number zero up to the number of virtqueue groups minus one. For
>
> s/number zero/the number zero/

Will fix.

>
> > +example, if the device has 4 virtqueue groups, the group ids should be
>
> To avoid using "should":
> s/should be/are/
>
> > +0, 1, 2, 3.
>
> /, 3/, and 3/

Ok.

>
> > +
> > +The device MUST make sure all the previous PCIe transactions for the
> > +virtqueue group is completed before presenting the new PASID value
>
> s/transactions ... is completed/transactions ... are completed/
>

Yes.

> > +wrote to the drvier in \field{group_pasid}.
>
> This is ambiguous. It leaves open the possibility that group_pasid
> updates at a later point in time instead of immediately after a write by
> the driver. My interpretation is that the device must respond
> immediately by presenting the new PASID or VIRTIO_PASID_NO_ID when the
> driver reads back group_pasid after a write. If you agree, please
> rephrase this clause to make it clear that reading group_pasid after
> write acts like a barrier (previous PCIe transactions must complete
> before the device responds to the read) and the device is not allowed to
> present the old group_pasid value until all transactions are completed
> and then present the new group_pasid value.

So my understanding is that instead of delaying the response to read
it's better to simply present the previous PASID value if there're any
pending transactions of previous PASID value. The driver can then
choose to poll or using timers etc.

>
> > +
> > +The device MUST use PASID TLP prefix for all memory transactions
> > +belongs to a virtqueue group if a valid PASID is assigned and PASID is
>
> s/belongs/belonging/
>
> > +enabled in the PASID extended capability.
>
> This statement is vague. Which memory transactions "belong to a
> virtqueue group"?

How about:

The device MUST use PASID TLP prefix for all memory transactions
initiated by the virtqueue that belong to a virtqueue group ...

> vring reads/writes and data buffer reads/writes come
> to mind. Virtqueue MSI-X messages are probably not included? Anything
> else?

According to the PCIe spec and the semantic, the PASID TLP prefix
should be used for

Memory Requests (including AtomicOp Requests) with Untranslated Addresses.
Address Translation Requests, ATS Invalidation Messages, Page Request
Messages, and PRG Response Messages

So we probably don't need to differentiate MSI-X messages with DMA
here. That's why I think a general "memory transaction" should be
sufficient here. If you don't agree, I can clarify it as what PCIe
spec did.

Thanks

>
> > +
> > +The device MUST NOT use PASID TLP prefix for any memory transaction
> > +belongs to a virtqueue group if VIRTIO_PASID_NO_ID is assigned or
> > +PASID is not enabled in the PASID extended capability.
> > +
> > +Upon reset, the device MUST present VIRTIO_PASID_NO_ID in
> > +\field{group_pasid} for each virtqueue group.
> > +
> >  \subsection{PCI-specific Initialization And Device Operation}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI-specific Initialization And Device Operation}
> >
> >  \subsubsection{Device Initialization}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI-specific Initialization And Device Operation / Device Initialization}
> > --
> > 2.32.0 (Apple Git-132)
> >


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [virtio-dev] Re: [PATCH V2 0/2] virito-pci: PASID support
  2022-01-12 10:44 ` [virtio-dev] Re: [PATCH V2 0/2] virito-pci: PASID support Stefan Hajnoczi
@ 2022-01-13  1:28   ` Jason Wang
  2022-01-13 10:36     ` Stefan Hajnoczi
  0 siblings, 1 reply; 30+ messages in thread
From: Jason Wang @ 2022-01-13  1:28 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: Virtio-Dev, mst, eperezma, Cindy Lu

On Wed, Jan 12, 2022 at 6:44 PM Stefan Hajnoczi <stefanha@redhat.com> wrote:
>
> On Wed, Jan 12, 2022 at 01:57:53PM +0800, Jason Wang wrote:
> > Hi All:
> >
> > This series tries to add PASID support for virtio-pci to allow the
> > virtqueue to use PASID TLP prefix for PCI transactions. This will be
> > useful for future work like, queue assignment, virtqueue
> > virtualization and presenting multiple vDPA devices with a single PCI
> > device.
> >
> > Since we're short of the space for the PCI capabilities, the PCI
> > extended capability for virtio structure is introduced that allows the
> > PASID configuration structure to use.
> >
> > A prototype is implemented with emulated virtio-pci device in [1]. A
> > test driver is implemented in [2].
> >
> > Please review.
>
> I don't know the security model for PASIDs. My guess is that PASIDs can
> be bruteforced so we must trust the driver (it can assign PASIDs to
> virtqueue groups) and we must prevent untrusted applications from
> setting PASIDs on virtqueues. Is that correct?

Yes, and the kernel can choose to hide PASID even for the trusted
application by using token or other intermediate layers.

Thanks

>
> Thanks,
> Stefan


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [virtio-dev] Re: [PATCH V2 1/2] virtio-pci: introduce virtio structure PCI Extended Capability
  2022-01-13  0:55     ` Jason Wang
@ 2022-01-13 10:19       ` Stefan Hajnoczi
  2022-01-14  3:23         ` Jason Wang
  0 siblings, 1 reply; 30+ messages in thread
From: Stefan Hajnoczi @ 2022-01-13 10:19 UTC (permalink / raw)
  To: Jason Wang; +Cc: Virtio-Dev, mst, eperezma, Cindy Lu

[-- Attachment #1: Type: text/plain, Size: 4168 bytes --]

On Thu, Jan 13, 2022 at 08:55:18AM +0800, Jason Wang wrote:
> On Wed, Jan 12, 2022 at 6:10 PM Stefan Hajnoczi <stefanha@redhat.com> wrote:
> >
> > On Wed, Jan 12, 2022 at 01:57:54PM +0800, Jason Wang wrote:
> > > We're already out of the configuration space if there's a device that
> > > supports all kinds of the virtio structure via PCI capability. This
> > > prevents us from adding new capabilities in the future.
> > >
> > > So the patch adds the support for virtio structure via PCI Extended
> > > Capability via the vendor specific extended capability.
> > >
> > > Only MMIO bar is allowed now.
> > >
> > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > ---
> > >  content.tex | 127 ++++++++++++++++++++++++++++++++++++++++++++++++++++
> > >  1 file changed, 127 insertions(+)
> > >
> > > diff --git a/content.tex b/content.tex
> > > index cf20570..00a75f2 100644
> > > --- a/content.tex
> > > +++ b/content.tex
> > > @@ -1476,6 +1476,129 @@ \subsubsection{Non-transitional Device With Legacy Driver: A Note
> > >     of BAR0 by presenting zeroes on every BAR and ignoring writes.
> > >  \end{enumerate}
> > >
> > > +\subsection{Virtio Structure PCI Extended Capabilities}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / Virtio Structure PCI Extended Capabilities}
> > > +
> > > +The location of the virtio structures that depend on the PCI Express
> > > +capability are specified using a vendor-specific extended capabilities
> > > +on the extended capabilities list in PCI Express extended
> > > +configuration space of the device.
> >
> > I can't parse this sentence. Can you rephrase it and/or split it into
> > multiple sentences?
> 
> Sure, how about something like:
> 
> There could be virtio structures that depend on the PCI express
> capability. The location of those structures are specified using a
> vendor-specific extended capabilities on the extended capabilities
> list.

"virtio structures" -> "Virtio device configuration structures" (same as
in "4.1.4 Virtio Structure PCI Capabilities")

"PCI express capability" -> "PCI Express Extended Capability" or did you
mean that more generally (PCI Express features)?

Here is the edited sentence:

  Virtio device configuration structures that depend on PCI Express
  Extended Capabilities are specified using vendor-specific PCI Express
  Extended Capabilities located on the PCI Express Extended Capabilities
  list.

> > > @@ -1488,6 +1611,10 @@ \subsubsection{Device Initialization}\label{sec:Virtio Transport Options / Virti
> > >  PCI capability list, detecting virtio configuration layout using Virtio
> > >  Structure PCI capabilities as detailed in \ref{sec:Virtio Transport Options / Virtio Over PCI Bus / Virtio Structure PCI Capabilities}
> > >
> > > +Optionally, if the device is a PCIe device, the driver scans the PCI
> > > +Extended capability list, detecting virtio configuration layout using
> > > +Virtio Struct PCI Extended capabilities as detailed in \ref{sec:Virtio Transport Options / Virtio Over PCI Bus / Virtio Structure PCI Extended Capabilities}
> >
> > If the same structure is present in both the PCI Capabilities and PCI
> > Extended Capabilities lists, which one has a higher priority?
> 
> This is not allowed in this proposal where only the structure that
> depends on the PCI Express capability is allowed. E.g the PASID
> configuration structure depends on the PASID extended capability.
> 
> This is much more simpler than allowing existing virtio structures to
> be presented on both PCI and PCIe capability list.

Ah, I think I misunderstood the point of this patch. I thought you were
defining it because devices were running out of PCI Configuration Space
and PCI Extended Capabilities allow the device to use more space (PCI
Express Extended Configuration Space).

But now I think you're adding this to the spec in order to define VIRTIO
features that depend on PCI Extended Capabilities like PASID support? Is
it really necessary or just cleaner to use PCI Extended Capabilities
instead of VIRTIO's existing vendor-specific PCI Capabilities?

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [virtio-dev] Re: [PATCH V2 2/2] virtio-pci: add PASID configuration extended capability
  2022-01-13  1:24     ` Jason Wang
@ 2022-01-13 10:32       ` Stefan Hajnoczi
  2022-01-13 10:45         ` Michael S. Tsirkin
  2022-01-14  9:43       ` Jean-Philippe Brucker
  1 sibling, 1 reply; 30+ messages in thread
From: Stefan Hajnoczi @ 2022-01-13 10:32 UTC (permalink / raw)
  To: Jason Wang; +Cc: Virtio-Dev, mst, eperezma, Cindy Lu

[-- Attachment #1: Type: text/plain, Size: 12663 bytes --]

On Thu, Jan 13, 2022 at 09:24:16AM +0800, Jason Wang wrote:
> On Wed, Jan 12, 2022 at 6:42 PM Stefan Hajnoczi <stefanha@redhat.com> wrote:
> >
> > On Wed, Jan 12, 2022 at 01:57:55PM +0800, Jason Wang wrote:
> > > This patch tries to add PASID configuration structure. It is used for
> > > assigning PASID to virtqueue then the device use PASID TLP prefix for
> > > the PCI transactions like DMA. The goal is to isolate e.g DMA at
> > > subdevice level which could be used for things like:
> > >
> > > - direct queue assignment to userspace
> > > - virtqueue virtualization
> > > - presenting multiple vDPA devices within a single PCI device
> > >
> > > The virtqueue group is introduced as an intermediate layer for having
> > > better compatibility and flexibility. The virtqueue group is the
> > > minimal set of the virtqueues that can be assigned with a single
> > > PASID. The PASID is then assigned at the level of virtqueue group
> > > instead of the virtqueue itself.
> > >
> > > For a full PASID capable hardware, it can simply advertise a model
> > > of 1:1 mapping of the virtqueue and virtqueue group. For the device
> > > with mediated virtqueues, it can choose to place the mediated
> > > virtqueues into dedicated virtqueue group(s).
> > >
> > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > ---
> > >  content.tex | 121 +++++++++++++++++++++++++++++++++++++++++++++++++++-
> > >  1 file changed, 119 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/content.tex b/content.tex
> > > index 00a75f2..552c0a8 100644
> > > --- a/content.tex
> > > +++ b/content.tex
> > > @@ -1516,8 +1516,14 @@ \subsection{Virtio Structure PCI Extended Capabilities}\label{sec:Virtio Transpo
> > >          PCI Express extended configuration space.
> > >
> > >  \item[\field{cfg_type}]
> > > -        Identifies the structure. All values are reserved for future
> > > -        use.
> > > +        Identifies the structure. according to the following table:
> > > +
> > > +\begin{lstlisting}
> > > +/* PASID configuration */
> > > +#define VIRTIO_PCI_ECAP_PASID_CFG        1
> > > +\end{lstlisting}
> > > +
> > > +        Any other value is reserved for future use.
> > >
> > >          The device MAY offer more than one structure of any type - this makes it
> > >          possible for the device to expose multiple interfaces to drivers.  The order of
> > > @@ -1599,6 +1605,117 @@ \subsection{Virtio Structure PCI Extended Capabilities}\label{sec:Virtio Transpo
> > >  If the device presents multiple structures of the same type, it SHOULD order
> > >  them from optimal (first) to least-optimal (last).
> > >
> > > +\subsubsection{PASID configuration structure layout}
> > > +
> > > +The PASID configuration structure is found at the \field{bar} and
> > > +\field{offset} within the VIRTIO_PCI_ECAP_PASID_CFG extended
> >
> > It looks like PCI Extended Capabilities have a different namespace from
> > classic PCI Capabilities (VIRTIO_PCI_**ECAP**_PASID_CFG). If a device
> > runs out of PCI Configuration Space it will be unable to define
> > additional capabilities as PCI Extended Capabilities. At least I'm not
> > aware of VIRTIO_PCI_ECAP_* definitions for existing capabilities. If the
> > point is to help with PCI Configuration Space exhaustion, should all
> > capabilities be defined in the same namespace so they can be trivially
> > defined as either PCI Capabilities or PCI Extended Capabilities?
> 
> I'm not sure, but only allowing new types to be placed in extended
> capabilities is much easier.
> 
> >
> > (Maybe I misunderstood the purpose of adding PCI Extended Capabilities?)
> 
> More or less, the proposal is only for the structure that depends on
> PCI express.
> 
> >
> > > +capability; its layout is below.
> > > +
> > > +\begin{lstlisting}
> > > +struct virtio_pci_common_cfg {
> >
> > The struct name looks like it was copy-pasted. Should this be struct
> > virtio_pci_group_cfg?
> 
> Right, but I think virtio_pci_pasid_cfg is better.

Sounds good.

> 
> >
> > > +        /* About the whole device. */
> > > +        le16 groups;                   /* read-only for driver */
> > > +        /* About a specific virtqueue */
> > > +        le16 queue_select;             /* read-write */
> > > +        le16 group_id;                 /* read-only for driver */
> > > +        /* About a specific virtqueue group */
> > > +        le16 group_select;             /* read-write */
> > > +        le32 group_pasid;              /* read-write */
> > > +}
> > > +\end{lstlisting}
> > > +
> > > +\begin{description}
> > > +\item[\field{groups}]
> > > +        The device specifies the number of virtqueue groups here. A
> > > +        virtqueue group is the minimal set of the virtqueues that can
> > > +        be assigned with a single PASID.
> > > +
> > > +\item[\field{queue_select}]
> > > +        Queue Select. The driver selects which virtqueue the
> > > +        \field{group_id} refers to.
> > > +
> > > +\item[\field{group_id}]
> > > +        Virtqueue Group Identifier. The device specify the virtqueue
> > > +        group identifier that the virtqueue belongs to.
> > > +
> > > +\item[\field{group_select}]
> > > +        Virtqueue Group Select. The driver selects which virtqueue
> > > +        group the field{group_pasid} refers to.
> > > +
> > > +\item[\field{group_pasid}]
> > > +        Virtqueue Group PASID. The driver use this field to assign a
> > > +        PASID to a virtqueue group.
> > > +\end{description}
> >
> > The queue_select/group_select interface makes me wonder if an admin
> > virtqueue interface would be better. The driver is forced to scan
> > virtqueues and groups, and each access requires a vmexit. An admin
> > virtqueue interface would allow a more efficient DMA interface where all
> > virtqueues/groups can be queried or set in a single operation.
> >
> > Luckily this interface is only used at initialization time, but I
> > sometimes still wonder if it's scalable enough.
> 
> Yes, so it should be no worse than the current layout in the
> virtio_pci_common_cfg where we had a bunch of fields that are
> multiplexed via the queue_select.
> 
> Actually the admin virtqueue and capability are not mutually
> exclusive. E.g we can have admin virtqueue in L0 and capability in
> L1/L0. The capability make it much more easier to be virtualized for
> future vPASID/vSVA stuffs.
> 
> >
> > > +
> > > +\paragraph{PASID Configuration}
> > > +
> > > +When PASID extended capability is present and enabled in the device
> > > +(through the standard PCI Extended configuration space). To assign a
> > > +PASID to a specific virtqueue group. The driver need to perform the
> > > +following steps:
> >
> > This paragraph is a single sentence, the full stops ('.') are used more
> > like commas (','). I suggest shortening it like this:
> >
> >   When the PCI PASID Extended Capability is present and enabled, the
> >   driver performs the following steps to assign a PASID to a virtqueue
> >   group:
> >
> > (I also used explicit "PCI PASID Extended Capability" naming so it's
> > clear we're talking about a formal term defined in the PCI
> > specification.)
> 
> That's fine.
> 
> >
> > > +
> > > +\begin{enumerate}
> > > +\item Identify the number of groups by reading \field{groups}
> > > +
> > > +\item Identify the virtqueue group for a specific virtqueue by first
> > > +  writing the index of the virtqueue to \field{queue_select} then
> > > +  reading its virtqueue group identifier from \field{group_id}.
> > > +
> > > +\item Determine number of PASIDs and the PASID that will be associated
> > > +  for each virtqueue group.
> > > +
> > > +\item Assign PASID to virtqueue groups by first writing the
> > > +  virtqueue group ID to \field{group_select} then writing the valid
> > > +  PASID number that assigned to the virtqueue group to
> > > +  \field{group_pasid}
> >
> >   \item Assign PASIDs to virtqueue groups by first writing the
> >     virtqueue group ID to \field{group_select} and then writing a valid
> >     PASID number to assign to that group to \field{group_pasid}.
> 
> Ok.
> 
> >
> > > +\end{enumerate}
> > > +
> > > +To disable PASID for a virtqueue group, the driver write a special
> > > +NO_PASID value:
> >
> >   The driver writes the special VIRTIO_PASID_NO_ID value to disable
> >   PASID for a virtqueue group:
> 
> Ok.
> 
> >
> > > +
> > > +\begin{lstlisting}
> > > +/* PASID value used to disable PASID for virtqueue group */
> > > +#define VIRTIO_PASID_NO_ID            0xffffffff
> > > +\end{lstlisting}
> > > +
> > > +\drivernormative{\subsubsection}{PASID configuration structure layout}{Virtio Transport Options / Virtio Over PCI Bus / PASID configuration structure layout}
> > > +
> > > +Except for VIRTIO_PASID_NO_ID, a driver MUST NOT attempt to assign a
> > > +PASID number outside the range that is defined in the \field{Max PASID Width}
> > > +in the PASID Extended Capability.
> > > +
> > > +After assigning a PASID to a virtqueue group, the driver MUST verify
> > > +the success by reading the Virtqueue Group PASID: on success, the
> > > +previously written value is returned, and on failure
> > > +VIRTIO_PASID_NO_ID is returned. If a mapping failure is detected, the
> > > +driver MAY retry assigning with fewer PASIDs, disable PASID or report
> > > +device failure.
> > > +
> > > +\devicenormative{\subsubsection}{PASID configuration structure layout}{Virtio Transport Options / Virtio Over PCI Bus / PASID configuration structure layout}
> > > +
> > > +The device MUST support at least one virtqueue group if it offers
> > > +PASID configuration structure.
> >
> >   the PASID configuration structure.
> >
> > ("a" would be okay too but leaves open the possibility of multiple PASID
> > configuration structures, which I think we don't want.)
> 
> Yes, but multiple structures of the same type are allowed for PCI
> capability, so I try to allow it for PCIe as well.

I see. The line can be changed to "a PASID configuration structure".

> > > +wrote to the drvier in \field{group_pasid}.
> >
> > This is ambiguous. It leaves open the possibility that group_pasid
> > updates at a later point in time instead of immediately after a write by
> > the driver. My interpretation is that the device must respond
> > immediately by presenting the new PASID or VIRTIO_PASID_NO_ID when the
> > driver reads back group_pasid after a write. If you agree, please
> > rephrase this clause to make it clear that reading group_pasid after
> > write acts like a barrier (previous PCIe transactions must complete
> > before the device responds to the read) and the device is not allowed to
> > present the old group_pasid value until all transactions are completed
> > and then present the new group_pasid value.
> 
> So my understanding is that instead of delaying the response to read
> it's better to simply present the previous PASID value if there're any
> pending transactions of previous PASID value. The driver can then
> choose to poll or using timers etc.

Please state that explicitly. It wasn't obvious to me that the driver
needs to reread group_pasid until it changes either to the new value or
the VIRTIO_PASID_NO_ID. I thought the field updates immediately.

> 
> >
> > > +
> > > +The device MUST use PASID TLP prefix for all memory transactions
> > > +belongs to a virtqueue group if a valid PASID is assigned and PASID is
> >
> > s/belongs/belonging/
> >
> > > +enabled in the PASID extended capability.
> >
> > This statement is vague. Which memory transactions "belong to a
> > virtqueue group"?
> 
> How about:
> 
> The device MUST use PASID TLP prefix for all memory transactions
> initiated by the virtqueue that belong to a virtqueue group ...

Okay.

> > vring reads/writes and data buffer reads/writes come
> > to mind. Virtqueue MSI-X messages are probably not included? Anything
> > else?
> 
> According to the PCIe spec and the semantic, the PASID TLP prefix
> should be used for
> 
> Memory Requests (including AtomicOp Requests) with Untranslated Addresses.
> Address Translation Requests, ATS Invalidation Messages, Page Request
> Messages, and PRG Response Messages
> 
> So we probably don't need to differentiate MSI-X messages with DMA
> here. That's why I think a general "memory transaction" should be
> sufficient here. If you don't agree, I can clarify it as what PCIe
> spec did.

Okay, thanks for explaining!

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [virtio-dev] Re: [PATCH V2 0/2] virito-pci: PASID support
  2022-01-13  1:28   ` Jason Wang
@ 2022-01-13 10:36     ` Stefan Hajnoczi
  2022-01-13 10:40       ` Michael S. Tsirkin
  0 siblings, 1 reply; 30+ messages in thread
From: Stefan Hajnoczi @ 2022-01-13 10:36 UTC (permalink / raw)
  To: Jason Wang; +Cc: Virtio-Dev, mst, eperezma, Cindy Lu

[-- Attachment #1: Type: text/plain, Size: 1747 bytes --]

On Thu, Jan 13, 2022 at 09:28:19AM +0800, Jason Wang wrote:
> On Wed, Jan 12, 2022 at 6:44 PM Stefan Hajnoczi <stefanha@redhat.com> wrote:
> >
> > On Wed, Jan 12, 2022 at 01:57:53PM +0800, Jason Wang wrote:
> > > Hi All:
> > >
> > > This series tries to add PASID support for virtio-pci to allow the
> > > virtqueue to use PASID TLP prefix for PCI transactions. This will be
> > > useful for future work like, queue assignment, virtqueue
> > > virtualization and presenting multiple vDPA devices with a single PCI
> > > device.
> > >
> > > Since we're short of the space for the PCI capabilities, the PCI
> > > extended capability for virtio structure is introduced that allows the
> > > PASID configuration structure to use.
> > >
> > > A prototype is implemented with emulated virtio-pci device in [1]. A
> > > test driver is implemented in [2].
> > >
> > > Please review.
> >
> > I don't know the security model for PASIDs. My guess is that PASIDs can
> > be bruteforced so we must trust the driver (it can assign PASIDs to
> > virtqueue groups) and we must prevent untrusted applications from
> > setting PASIDs on virtqueues. Is that correct?
> 
> Yes, and the kernel can choose to hide PASID even for the trusted
> application by using token or other intermediate layers.

It would be good to describe the security model from a virtio-pci
perspective so driver implementors don't accidentally expose trusted
interfaces to untrusted applications. It's obvious to someone who
already understands and has thought through all of this, but not obvious
to someone who is implementing a driver for the first time or someone
who is modifying the VIRTIO specification and doesn't know/care about
PASIDs.

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [virtio-dev] Re: [PATCH V2 0/2] virito-pci: PASID support
  2022-01-13 10:36     ` Stefan Hajnoczi
@ 2022-01-13 10:40       ` Michael S. Tsirkin
  2022-01-14  2:53         ` Jason Wang
  0 siblings, 1 reply; 30+ messages in thread
From: Michael S. Tsirkin @ 2022-01-13 10:40 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: Jason Wang, Virtio-Dev, eperezma, Cindy Lu

On Thu, Jan 13, 2022 at 10:36:52AM +0000, Stefan Hajnoczi wrote:
> On Thu, Jan 13, 2022 at 09:28:19AM +0800, Jason Wang wrote:
> > On Wed, Jan 12, 2022 at 6:44 PM Stefan Hajnoczi <stefanha@redhat.com> wrote:
> > >
> > > On Wed, Jan 12, 2022 at 01:57:53PM +0800, Jason Wang wrote:
> > > > Hi All:
> > > >
> > > > This series tries to add PASID support for virtio-pci to allow the
> > > > virtqueue to use PASID TLP prefix for PCI transactions. This will be
> > > > useful for future work like, queue assignment, virtqueue
> > > > virtualization and presenting multiple vDPA devices with a single PCI
> > > > device.
> > > >
> > > > Since we're short of the space for the PCI capabilities, the PCI
> > > > extended capability for virtio structure is introduced that allows the
> > > > PASID configuration structure to use.
> > > >
> > > > A prototype is implemented with emulated virtio-pci device in [1]. A
> > > > test driver is implemented in [2].
> > > >
> > > > Please review.
> > >
> > > I don't know the security model for PASIDs. My guess is that PASIDs can
> > > be bruteforced so we must trust the driver (it can assign PASIDs to
> > > virtqueue groups) and we must prevent untrusted applications from
> > > setting PASIDs on virtqueues. Is that correct?
> > 
> > Yes, and the kernel can choose to hide PASID even for the trusted
> > application by using token or other intermediate layers.
> 
> It would be good to describe the security model from a virtio-pci
> perspective so driver implementors don't accidentally expose trusted
> interfaces to untrusted applications. It's obvious to someone who
> already understands and has thought through all of this, but not obvious
> to someone who is implementing a driver for the first time or someone
> who is modifying the VIRTIO specification and doesn't know/care about
> PASIDs.
> 
> Stefan


Can't hurt to have a security considerations chapter.
We should talk there about ACCESS_PLATFORM which has security implications
too.

-- 
MST


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [virtio-dev] Re: [PATCH V2 2/2] virtio-pci: add PASID configuration extended capability
  2022-01-13 10:32       ` Stefan Hajnoczi
@ 2022-01-13 10:45         ` Michael S. Tsirkin
  2022-01-13 14:53           ` Stefan Hajnoczi
  0 siblings, 1 reply; 30+ messages in thread
From: Michael S. Tsirkin @ 2022-01-13 10:45 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: Jason Wang, Virtio-Dev, eperezma, Cindy Lu

On Thu, Jan 13, 2022 at 10:32:11AM +0000, Stefan Hajnoczi wrote:
> > So my understanding is that instead of delaying the response to read
> > it's better to simply present the previous PASID value if there're any
> > pending transactions of previous PASID value. The driver can then
> > choose to poll or using timers etc.

Don't we limit these changes to before DRIVER_OK?
If we do then no previous transactions can exist.


-- 
MST


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [virtio-dev] Re: [PATCH V2 2/2] virtio-pci: add PASID configuration extended capability
  2022-01-13 10:45         ` Michael S. Tsirkin
@ 2022-01-13 14:53           ` Stefan Hajnoczi
  2022-01-13 15:17             ` Michael S. Tsirkin
  0 siblings, 1 reply; 30+ messages in thread
From: Stefan Hajnoczi @ 2022-01-13 14:53 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: Jason Wang, Virtio-Dev, eperezma, Cindy Lu

[-- Attachment #1: Type: text/plain, Size: 684 bytes --]

On Thu, Jan 13, 2022 at 05:45:11AM -0500, Michael S. Tsirkin wrote:
> On Thu, Jan 13, 2022 at 10:32:11AM +0000, Stefan Hajnoczi wrote:
> > > So my understanding is that instead of delaying the response to read
> > > it's better to simply present the previous PASID value if there're any
> > > pending transactions of previous PASID value. The driver can then
> > > choose to poll or using timers etc.
> 
> Don't we limit these changes to before DRIVER_OK?
> If we do then no previous transactions can exist.

Not sure if that's possible for the vDPA virtqueue assignment use case
where one large VIRTIO device provides virtqueues for many smaller vDPA
devices?

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [virtio-dev] Re: [PATCH V2 2/2] virtio-pci: add PASID configuration extended capability
  2022-01-13 14:53           ` Stefan Hajnoczi
@ 2022-01-13 15:17             ` Michael S. Tsirkin
  2022-01-14  3:15               ` Jason Wang
  0 siblings, 1 reply; 30+ messages in thread
From: Michael S. Tsirkin @ 2022-01-13 15:17 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: Jason Wang, Virtio-Dev, eperezma, Cindy Lu

On Thu, Jan 13, 2022 at 02:53:36PM +0000, Stefan Hajnoczi wrote:
> On Thu, Jan 13, 2022 at 05:45:11AM -0500, Michael S. Tsirkin wrote:
> > On Thu, Jan 13, 2022 at 10:32:11AM +0000, Stefan Hajnoczi wrote:
> > > > So my understanding is that instead of delaying the response to read
> > > > it's better to simply present the previous PASID value if there're any
> > > > pending transactions of previous PASID value. The driver can then
> > > > choose to poll or using timers etc.
> > 
> > Don't we limit these changes to before DRIVER_OK?
> > If we do then no previous transactions can exist.
> 
> Not sure if that's possible for the vDPA virtqueue assignment use case
> where one large VIRTIO device provides virtqueues for many smaller vDPA
> devices?
> 
> Stefan

Then I guess queue must be stopped?

-- 
MST


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [virtio-dev] Re: [PATCH V2 0/2] virito-pci: PASID support
  2022-01-12  5:57 [virtio-dev] [PATCH V2 0/2] virito-pci: PASID support Jason Wang
                   ` (2 preceding siblings ...)
  2022-01-12 10:44 ` [virtio-dev] Re: [PATCH V2 0/2] virito-pci: PASID support Stefan Hajnoczi
@ 2022-01-13 15:18 ` Michael S. Tsirkin
  2022-01-14  2:55   ` Jason Wang
  3 siblings, 1 reply; 30+ messages in thread
From: Michael S. Tsirkin @ 2022-01-13 15:18 UTC (permalink / raw)
  To: Jason Wang; +Cc: virtio-dev, stefanha, eperezma, lulu

On Wed, Jan 12, 2022 at 01:57:53PM +0800, Jason Wang wrote:
> Hi All:
> 
> This series tries to add PASID support for virtio-pci to allow the
> virtqueue to use PASID TLP prefix for PCI transactions. This will be
> useful for future work like, queue assignment, virtqueue
> virtualization and presenting multiple vDPA devices with a single PCI
> device.


Try this:

echo "ab virito virtio" >> ~/.vimrc

and your subjects will be perfect from now on.


> Since we're short of the space for the PCI capabilities, the PCI
> extended capability for virtio structure is introduced that allows the
> PASID configuration structure to use.
> 
> A prototype is implemented with emulated virtio-pci device in [1]. A
> test driver is implemented in [2].
> 
> Please review.
> 
> Thanks
> 
> [1] https://github.com/jasowang/qemu.git virtio-pasid
> [2] https://github.com/jasowang/linux.git virtio-pasid
> 
> Jason Wang (2):
>   virtio-pci: introduce virtio structure PCI Extended Capability
>   virtio-pci: add PASID configuration extended capability
> 
>  content.tex | 244 ++++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 244 insertions(+)
> 
> -- 
> 2.32.0 (Apple Git-132)


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [virtio-dev] Re: [PATCH V2 0/2] virito-pci: PASID support
  2022-01-13 10:40       ` Michael S. Tsirkin
@ 2022-01-14  2:53         ` Jason Wang
  0 siblings, 0 replies; 30+ messages in thread
From: Jason Wang @ 2022-01-14  2:53 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: Stefan Hajnoczi, Virtio-Dev, eperezma, Cindy Lu

On Thu, Jan 13, 2022 at 6:40 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Thu, Jan 13, 2022 at 10:36:52AM +0000, Stefan Hajnoczi wrote:
> > On Thu, Jan 13, 2022 at 09:28:19AM +0800, Jason Wang wrote:
> > > On Wed, Jan 12, 2022 at 6:44 PM Stefan Hajnoczi <stefanha@redhat.com> wrote:
> > > >
> > > > On Wed, Jan 12, 2022 at 01:57:53PM +0800, Jason Wang wrote:
> > > > > Hi All:
> > > > >
> > > > > This series tries to add PASID support for virtio-pci to allow the
> > > > > virtqueue to use PASID TLP prefix for PCI transactions. This will be
> > > > > useful for future work like, queue assignment, virtqueue
> > > > > virtualization and presenting multiple vDPA devices with a single PCI
> > > > > device.
> > > > >
> > > > > Since we're short of the space for the PCI capabilities, the PCI
> > > > > extended capability for virtio structure is introduced that allows the
> > > > > PASID configuration structure to use.
> > > > >
> > > > > A prototype is implemented with emulated virtio-pci device in [1]. A
> > > > > test driver is implemented in [2].
> > > > >
> > > > > Please review.
> > > >
> > > > I don't know the security model for PASIDs. My guess is that PASIDs can
> > > > be bruteforced so we must trust the driver (it can assign PASIDs to
> > > > virtqueue groups) and we must prevent untrusted applications from
> > > > setting PASIDs on virtqueues. Is that correct?
> > >
> > > Yes, and the kernel can choose to hide PASID even for the trusted
> > > application by using token or other intermediate layers.
> >
> > It would be good to describe the security model from a virtio-pci
> > perspective so driver implementors don't accidentally expose trusted
> > interfaces to untrusted applications. It's obvious to someone who
> > already understands and has thought through all of this, but not obvious
> > to someone who is implementing a driver for the first time or someone
> > who is modifying the VIRTIO specification and doesn't know/care about
> > PASIDs.
> >
> > Stefan
>
>
> Can't hurt to have a security considerations chapter.
> We should talk there about ACCESS_PLATFORM which has security implications
> too.

Ok, I guess what you meant is the translated the request with PASID
when ACCESS_PLATFORM is not negotiated? Should we simply forbid this?

Thanks

>
> --
> MST
>


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [virtio-dev] Re: [PATCH V2 0/2] virito-pci: PASID support
  2022-01-13 15:18 ` Michael S. Tsirkin
@ 2022-01-14  2:55   ` Jason Wang
  0 siblings, 0 replies; 30+ messages in thread
From: Jason Wang @ 2022-01-14  2:55 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: Virtio-Dev, Stefan Hajnoczi, eperezma, Cindy Lu

On Thu, Jan 13, 2022 at 11:18 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Wed, Jan 12, 2022 at 01:57:53PM +0800, Jason Wang wrote:
> > Hi All:
> >
> > This series tries to add PASID support for virtio-pci to allow the
> > virtqueue to use PASID TLP prefix for PCI transactions. This will be
> > useful for future work like, queue assignment, virtqueue
> > virtualization and presenting multiple vDPA devices with a single PCI
> > device.
>
>
> Try this:
>
> echo "ab virito virtio" >> ~/.vimrc
>
> and your subjects will be perfect from now on.

Sure. (Using emacs so I've added the virtio to the customized dict of
flyspell :))

Thanks

>
>
> > Since we're short of the space for the PCI capabilities, the PCI
> > extended capability for virtio structure is introduced that allows the
> > PASID configuration structure to use.
> >
> > A prototype is implemented with emulated virtio-pci device in [1]. A
> > test driver is implemented in [2].
> >
> > Please review.
> >
> > Thanks
> >
> > [1] https://github.com/jasowang/qemu.git virtio-pasid
> > [2] https://github.com/jasowang/linux.git virtio-pasid
> >
> > Jason Wang (2):
> >   virtio-pci: introduce virtio structure PCI Extended Capability
> >   virtio-pci: add PASID configuration extended capability
> >
> >  content.tex | 244 ++++++++++++++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 244 insertions(+)
> >
> > --
> > 2.32.0 (Apple Git-132)
>


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [virtio-dev] Re: [PATCH V2 2/2] virtio-pci: add PASID configuration extended capability
  2022-01-13 15:17             ` Michael S. Tsirkin
@ 2022-01-14  3:15               ` Jason Wang
  2022-01-14 10:38                 ` Jean-Philippe Brucker
  0 siblings, 1 reply; 30+ messages in thread
From: Jason Wang @ 2022-01-14  3:15 UTC (permalink / raw)
  To: Michael S. Tsirkin, Stefan Hajnoczi; +Cc: Virtio-Dev, eperezma, Cindy Lu


在 2022/1/13 下午11:17, Michael S. Tsirkin 写道:
> On Thu, Jan 13, 2022 at 02:53:36PM +0000, Stefan Hajnoczi wrote:
>> On Thu, Jan 13, 2022 at 05:45:11AM -0500, Michael S. Tsirkin wrote:
>>> On Thu, Jan 13, 2022 at 10:32:11AM +0000, Stefan Hajnoczi wrote:
>>>>> So my understanding is that instead of delaying the response to read
>>>>> it's better to simply present the previous PASID value if there're any
>>>>> pending transactions of previous PASID value. The driver can then
>>>>> choose to poll or using timers etc.
>>> Don't we limit these changes to before DRIVER_OK?
>>> If we do then no previous transactions can exist.
>> Not sure if that's possible for the vDPA virtqueue assignment use case
>> where one large VIRTIO device provides virtqueues for many smaller vDPA
>> devices?


That could be the case but usually a smaller vDPA device requires a 
bunch of hardware resources more than just virtqueues.

Another example is assign a queue directly to userspace with PASID for SVA.


>>
>> Stefan
> Then I guess queue must be stopped?


Yes, since we have virtqueue reset, I can limit the assigning before 
DRIVER_OK or queue(s) are reset. Does this work?

Thanks


>


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [virtio-dev] Re: [PATCH V2 1/2] virtio-pci: introduce virtio structure PCI Extended Capability
  2022-01-13 10:19       ` Stefan Hajnoczi
@ 2022-01-14  3:23         ` Jason Wang
  2022-01-17 10:03           ` Stefan Hajnoczi
  0 siblings, 1 reply; 30+ messages in thread
From: Jason Wang @ 2022-01-14  3:23 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: Virtio-Dev, mst, eperezma, Cindy Lu


在 2022/1/13 下午6:19, Stefan Hajnoczi 写道:
> On Thu, Jan 13, 2022 at 08:55:18AM +0800, Jason Wang wrote:
>> On Wed, Jan 12, 2022 at 6:10 PM Stefan Hajnoczi <stefanha@redhat.com> wrote:
>>> On Wed, Jan 12, 2022 at 01:57:54PM +0800, Jason Wang wrote:
>>>> We're already out of the configuration space if there's a device that
>>>> supports all kinds of the virtio structure via PCI capability. This
>>>> prevents us from adding new capabilities in the future.
>>>>
>>>> So the patch adds the support for virtio structure via PCI Extended
>>>> Capability via the vendor specific extended capability.
>>>>
>>>> Only MMIO bar is allowed now.
>>>>
>>>> Signed-off-by: Jason Wang <jasowang@redhat.com>
>>>> ---
>>>>   content.tex | 127 ++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>>   1 file changed, 127 insertions(+)
>>>>
>>>> diff --git a/content.tex b/content.tex
>>>> index cf20570..00a75f2 100644
>>>> --- a/content.tex
>>>> +++ b/content.tex
>>>> @@ -1476,6 +1476,129 @@ \subsubsection{Non-transitional Device With Legacy Driver: A Note
>>>>      of BAR0 by presenting zeroes on every BAR and ignoring writes.
>>>>   \end{enumerate}
>>>>
>>>> +\subsection{Virtio Structure PCI Extended Capabilities}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / Virtio Structure PCI Extended Capabilities}
>>>> +
>>>> +The location of the virtio structures that depend on the PCI Express
>>>> +capability are specified using a vendor-specific extended capabilities
>>>> +on the extended capabilities list in PCI Express extended
>>>> +configuration space of the device.
>>> I can't parse this sentence. Can you rephrase it and/or split it into
>>> multiple sentences?
>> Sure, how about something like:
>>
>> There could be virtio structures that depend on the PCI express
>> capability. The location of those structures are specified using a
>> vendor-specific extended capabilities on the extended capabilities
>> list.
> "virtio structures" -> "Virtio device configuration structures" (same as
> in "4.1.4 Virtio Structure PCI Capabilities")


Ok.


>
> "PCI express capability" -> "PCI Express Extended Capability" or did you
> mean that more generally (PCI Express features)?


More general PCI Express feature.


>
> Here is the edited sentence:
>
>    Virtio device configuration structures that depend on PCI Express
>    Extended Capabilities are specified using vendor-specific PCI Express
>    Extended Capabilities located on the PCI Express Extended Capabilities
>    list.


That's fine.


>>>> @@ -1488,6 +1611,10 @@ \subsubsection{Device Initialization}\label{sec:Virtio Transport Options / Virti
>>>>   PCI capability list, detecting virtio configuration layout using Virtio
>>>>   Structure PCI capabilities as detailed in \ref{sec:Virtio Transport Options / Virtio Over PCI Bus / Virtio Structure PCI Capabilities}
>>>>
>>>> +Optionally, if the device is a PCIe device, the driver scans the PCI
>>>> +Extended capability list, detecting virtio configuration layout using
>>>> +Virtio Struct PCI Extended capabilities as detailed in \ref{sec:Virtio Transport Options / Virtio Over PCI Bus / Virtio Structure PCI Extended Capabilities}
>>> If the same structure is present in both the PCI Capabilities and PCI
>>> Extended Capabilities lists, which one has a higher priority?
>> This is not allowed in this proposal where only the structure that
>> depends on the PCI Express capability is allowed. E.g the PASID
>> configuration structure depends on the PASID extended capability.
>>
>> This is much more simpler than allowing existing virtio structures to
>> be presented on both PCI and PCIe capability list.
> Ah, I think I misunderstood the point of this patch. I thought you were
> defining it because devices were running out of PCI Configuration Space
> and PCI Extended Capabilities allow the device to use more space (PCI
> Express Extended Configuration Space).
>
> But now I think you're adding this to the spec in order to define VIRTIO
> features that depend on PCI Extended Capabilities like PASID support?


Right.


> Is
> it really necessary or just cleaner to use PCI Extended Capabilities
> instead of VIRTIO's existing vendor-specific PCI Capabilities?


Not a must, technically we can add them into the common configuration 
structure. But having a dedicated capability seems cleaner: it looks odd 
to configure PCIe features via the legacy capability anyhow. And this is 
more flexible to allow the hypervisor to map or hide the BAR independently.


Thanks


>
> Stefan


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [virtio-dev] Re: [PATCH V2 2/2] virtio-pci: add PASID configuration extended capability
  2022-01-13  1:24     ` Jason Wang
  2022-01-13 10:32       ` Stefan Hajnoczi
@ 2022-01-14  9:43       ` Jean-Philippe Brucker
  2022-01-17  5:57         ` Jason Wang
  2022-01-19 23:53         ` Michael S. Tsirkin
  1 sibling, 2 replies; 30+ messages in thread
From: Jean-Philippe Brucker @ 2022-01-14  9:43 UTC (permalink / raw)
  To: Jason Wang; +Cc: Stefan Hajnoczi, Virtio-Dev, mst, eperezma, Cindy Lu

Hi Jason,

On Thu, Jan 13, 2022 at 09:24:16AM +0800, Jason Wang wrote:
> The device MUST use PASID TLP prefix for all memory transactions
> initiated by the virtqueue that belong to a virtqueue group ...
> 
> > vring reads/writes and data buffer reads/writes come
> > to mind. Virtqueue MSI-X messages are probably not included? Anything
> > else?
> 
> According to the PCIe spec and the semantic, the PASID TLP prefix
> should be used for
> 
> Memory Requests (including AtomicOp Requests) with Untranslated Addresses.
> Address Translation Requests, ATS Invalidation Messages, Page Request
> Messages, and PRG Response Messages
> 
> So we probably don't need to differentiate MSI-X messages with DMA
> here. That's why I think a general "memory transaction" should be
> sufficient here. If you don't agree, I can clarify it as what PCIe
> spec did.

I think we should be explicit about this particular case because someone
implementing this extension might get it wrong. MSIs should not have a
PASID because IOMMUs don't support it:

* VT-d treats Requests-with-PASID to the special range 0xfeexxxxx as
  normal DMA (3.14 Handling Requests to Interrupt Address Range)
* AMD IOMMU reports an error for MSIs with PASID (2.2.7.7 PCIe® TLP PASID
  Prefix).
* Arm requires creating mappings to the MSI controller in the SMMU. This
  has implications for SVA where the PASID accesses the whole process
  address space: if MSI transactions have a PASID prefix, that requires
  mapping the MSI controller into the process address space on Arm, which
  isn't convenient.

Thanks,
Jean

---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [virtio-dev] Re: [PATCH V2 2/2] virtio-pci: add PASID configuration extended capability
  2022-01-14  3:15               ` Jason Wang
@ 2022-01-14 10:38                 ` Jean-Philippe Brucker
  2022-01-17  5:58                   ` Jason Wang
  0 siblings, 1 reply; 30+ messages in thread
From: Jean-Philippe Brucker @ 2022-01-14 10:38 UTC (permalink / raw)
  To: Jason Wang
  Cc: Michael S. Tsirkin, Stefan Hajnoczi, Virtio-Dev, eperezma, Cindy Lu

On Fri, Jan 14, 2022 at 11:15:35AM +0800, Jason Wang wrote:
> 
> 在 2022/1/13 下午11:17, Michael S. Tsirkin 写道:
> > On Thu, Jan 13, 2022 at 02:53:36PM +0000, Stefan Hajnoczi wrote:
> > > On Thu, Jan 13, 2022 at 05:45:11AM -0500, Michael S. Tsirkin wrote:
> > > > On Thu, Jan 13, 2022 at 10:32:11AM +0000, Stefan Hajnoczi wrote:
> > > > > > So my understanding is that instead of delaying the response to read
> > > > > > it's better to simply present the previous PASID value if there're any
> > > > > > pending transactions of previous PASID value. The driver can then
> > > > > > choose to poll or using timers etc.
> > > > Don't we limit these changes to before DRIVER_OK?
> > > > If we do then no previous transactions can exist.
> > > Not sure if that's possible for the vDPA virtqueue assignment use case
> > > where one large VIRTIO device provides virtqueues for many smaller vDPA
> > > devices?
> 
> 
> That could be the case but usually a smaller vDPA device requires a bunch of
> hardware resources more than just virtqueues.
> 
> Another example is assign a queue directly to userspace with PASID for SVA.
> 
> 
> > > 
> > > Stefan
> > Then I guess queue must be stopped?
> 
> 
> Yes, since we have virtqueue reset, I can limit the assigning before
> DRIVER_OK or queue(s) are reset. Does this work?

Yes, I was wondering how this would work for the SVA use-case, where
processes can request queues from the virtio driver at runtime:

* Process opens a fd to the queue
* virtio driver selects a disabled queue, gets a PASID for that
  task and assigns it to the queue's group 
* process mmaps its queue and works with it. When it is done, it closes
  the fd.
* virtio driver now makes sure that the virtqueue does not issue any
  more DMA with this PASID, it disables all the queues in the group by
  writing 1 to their queue_reset. After reset completes, it clears the
  group_pasid field.
* The driver releases the PASID (iommu_sva_unbind_device()).
  Both PASID and queue can now safely be assigned to other processes.

So making group_pasid field writeable only if all queues in the group are
disabled should work

Thanks,
Jean

---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [virtio-dev] Re: [PATCH V2 2/2] virtio-pci: add PASID configuration extended capability
  2022-01-14  9:43       ` Jean-Philippe Brucker
@ 2022-01-17  5:57         ` Jason Wang
  2022-01-19 18:01           ` Jean-Philippe Brucker
  2022-01-19 23:53         ` Michael S. Tsirkin
  1 sibling, 1 reply; 30+ messages in thread
From: Jason Wang @ 2022-01-17  5:57 UTC (permalink / raw)
  To: Jean-Philippe Brucker
  Cc: Stefan Hajnoczi, Virtio-Dev, mst, eperezma, Cindy Lu

On Fri, Jan 14, 2022 at 5:43 PM Jean-Philippe Brucker
<jean-philippe@linaro.org> wrote:
>
> Hi Jason,
>
> On Thu, Jan 13, 2022 at 09:24:16AM +0800, Jason Wang wrote:
> > The device MUST use PASID TLP prefix for all memory transactions
> > initiated by the virtqueue that belong to a virtqueue group ...
> >
> > > vring reads/writes and data buffer reads/writes come
> > > to mind. Virtqueue MSI-X messages are probably not included? Anything
> > > else?
> >
> > According to the PCIe spec and the semantic, the PASID TLP prefix
> > should be used for
> >
> > Memory Requests (including AtomicOp Requests) with Untranslated Addresses.
> > Address Translation Requests, ATS Invalidation Messages, Page Request
> > Messages, and PRG Response Messages
> >
> > So we probably don't need to differentiate MSI-X messages with DMA
> > here. That's why I think a general "memory transaction" should be
> > sufficient here. If you don't agree, I can clarify it as what PCIe
> > spec did.
>
> I think we should be explicit about this particular case because someone
> implementing this extension might get it wrong. MSIs should not have a
> PASID because IOMMUs don't support it:

Good point, using PASID for MSI may break the assumption of SVA where
the VA is overlapped with the interrupt address range which is similar
to what you mention for SMMU below.

>
> * VT-d treats Requests-with-PASID to the special range 0xfeexxxxx as
>   normal DMA (3.14 Handling Requests to Interrupt Address Range)
> * AMD IOMMU reports an error for MSIs with PASID (2.2.7.7 PCIe® TLP PASID
>   Prefix).
> * Arm requires creating mappings to the MSI controller in the SMMU. This
>   has implications for SVA where the PASID accesses the whole process
>   address space: if MSI transactions have a PASID prefix, that requires
>   mapping the MSI controller into the process address space on Arm, which
>   isn't convenient.

How about something like:

"
Except for the MSI transactions, the device MUST use PASID TLP prefix
for the following memory transactions initiated by the virtqueue that
belong to a virtqueue group if a valid PASID is assigned and PASID is
enabled in the PASID extended capability:

\begin{itemize}
\item Memory Requests (including AtomicOp Requests) with Untranslated
  Addresses
\item Address Translation Requests
\item Page Request Messages
\end{itemize}

The device MUST NOT use PASID TLP prefix for the MSI memory
transactions.

Thanks

>
> Thanks,
> Jean
>


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [virtio-dev] Re: [PATCH V2 2/2] virtio-pci: add PASID configuration extended capability
  2022-01-14 10:38                 ` Jean-Philippe Brucker
@ 2022-01-17  5:58                   ` Jason Wang
  0 siblings, 0 replies; 30+ messages in thread
From: Jason Wang @ 2022-01-17  5:58 UTC (permalink / raw)
  To: Jean-Philippe Brucker
  Cc: Michael S. Tsirkin, Stefan Hajnoczi, Virtio-Dev, eperezma, Cindy Lu

On Fri, Jan 14, 2022 at 6:38 PM Jean-Philippe Brucker
<jean-philippe@linaro.org> wrote:
>
> On Fri, Jan 14, 2022 at 11:15:35AM +0800, Jason Wang wrote:
> >
> > 在 2022/1/13 下午11:17, Michael S. Tsirkin 写道:
> > > On Thu, Jan 13, 2022 at 02:53:36PM +0000, Stefan Hajnoczi wrote:
> > > > On Thu, Jan 13, 2022 at 05:45:11AM -0500, Michael S. Tsirkin wrote:
> > > > > On Thu, Jan 13, 2022 at 10:32:11AM +0000, Stefan Hajnoczi wrote:
> > > > > > > So my understanding is that instead of delaying the response to read
> > > > > > > it's better to simply present the previous PASID value if there're any
> > > > > > > pending transactions of previous PASID value. The driver can then
> > > > > > > choose to poll or using timers etc.
> > > > > Don't we limit these changes to before DRIVER_OK?
> > > > > If we do then no previous transactions can exist.
> > > > Not sure if that's possible for the vDPA virtqueue assignment use case
> > > > where one large VIRTIO device provides virtqueues for many smaller vDPA
> > > > devices?
> >
> >
> > That could be the case but usually a smaller vDPA device requires a bunch of
> > hardware resources more than just virtqueues.
> >
> > Another example is assign a queue directly to userspace with PASID for SVA.
> >
> >
> > > >
> > > > Stefan
> > > Then I guess queue must be stopped?
> >
> >
> > Yes, since we have virtqueue reset, I can limit the assigning before
> > DRIVER_OK or queue(s) are reset. Does this work?
>
> Yes, I was wondering how this would work for the SVA use-case, where
> processes can request queues from the virtio driver at runtime:
>
> * Process opens a fd to the queue
> * virtio driver selects a disabled queue, gets a PASID for that
>   task and assigns it to the queue's group
> * process mmaps its queue and works with it. When it is done, it closes
>   the fd.
> * virtio driver now makes sure that the virtqueue does not issue any
>   more DMA with this PASID, it disables all the queues in the group by
>   writing 1 to their queue_reset. After reset completes, it clears the
>   group_pasid field.
> * The driver releases the PASID (iommu_sva_unbind_device()).
>   Both PASID and queue can now safely be assigned to other processes.
>
> So making group_pasid field writeable only if all queues in the group are
> disabled should work

Yes, I will do that in the next version.

Thanks

>
> Thanks,
> Jean
>


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [virtio-dev] Re: [PATCH V2 1/2] virtio-pci: introduce virtio structure PCI Extended Capability
  2022-01-14  3:23         ` Jason Wang
@ 2022-01-17 10:03           ` Stefan Hajnoczi
  0 siblings, 0 replies; 30+ messages in thread
From: Stefan Hajnoczi @ 2022-01-17 10:03 UTC (permalink / raw)
  To: Jason Wang; +Cc: Virtio-Dev, mst, eperezma, Cindy Lu

[-- Attachment #1: Type: text/plain, Size: 5091 bytes --]

On Fri, Jan 14, 2022 at 11:23:00AM +0800, Jason Wang wrote:
> 
> 在 2022/1/13 下午6:19, Stefan Hajnoczi 写道:
> > On Thu, Jan 13, 2022 at 08:55:18AM +0800, Jason Wang wrote:
> > > On Wed, Jan 12, 2022 at 6:10 PM Stefan Hajnoczi <stefanha@redhat.com> wrote:
> > > > On Wed, Jan 12, 2022 at 01:57:54PM +0800, Jason Wang wrote:
> > > > > We're already out of the configuration space if there's a device that
> > > > > supports all kinds of the virtio structure via PCI capability. This
> > > > > prevents us from adding new capabilities in the future.
> > > > > 
> > > > > So the patch adds the support for virtio structure via PCI Extended
> > > > > Capability via the vendor specific extended capability.
> > > > > 
> > > > > Only MMIO bar is allowed now.
> > > > > 
> > > > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > > > ---
> > > > >   content.tex | 127 ++++++++++++++++++++++++++++++++++++++++++++++++++++
> > > > >   1 file changed, 127 insertions(+)
> > > > > 
> > > > > diff --git a/content.tex b/content.tex
> > > > > index cf20570..00a75f2 100644
> > > > > --- a/content.tex
> > > > > +++ b/content.tex
> > > > > @@ -1476,6 +1476,129 @@ \subsubsection{Non-transitional Device With Legacy Driver: A Note
> > > > >      of BAR0 by presenting zeroes on every BAR and ignoring writes.
> > > > >   \end{enumerate}
> > > > > 
> > > > > +\subsection{Virtio Structure PCI Extended Capabilities}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / Virtio Structure PCI Extended Capabilities}
> > > > > +
> > > > > +The location of the virtio structures that depend on the PCI Express
> > > > > +capability are specified using a vendor-specific extended capabilities
> > > > > +on the extended capabilities list in PCI Express extended
> > > > > +configuration space of the device.
> > > > I can't parse this sentence. Can you rephrase it and/or split it into
> > > > multiple sentences?
> > > Sure, how about something like:
> > > 
> > > There could be virtio structures that depend on the PCI express
> > > capability. The location of those structures are specified using a
> > > vendor-specific extended capabilities on the extended capabilities
> > > list.
> > "virtio structures" -> "Virtio device configuration structures" (same as
> > in "4.1.4 Virtio Structure PCI Capabilities")
> 
> 
> Ok.
> 
> 
> > 
> > "PCI express capability" -> "PCI Express Extended Capability" or did you
> > mean that more generally (PCI Express features)?
> 
> 
> More general PCI Express feature.
> 
> 
> > 
> > Here is the edited sentence:
> > 
> >    Virtio device configuration structures that depend on PCI Express
> >    Extended Capabilities are specified using vendor-specific PCI Express
> >    Extended Capabilities located on the PCI Express Extended Capabilities
> >    list.
> 
> 
> That's fine.
> 
> 
> > > > > @@ -1488,6 +1611,10 @@ \subsubsection{Device Initialization}\label{sec:Virtio Transport Options / Virti
> > > > >   PCI capability list, detecting virtio configuration layout using Virtio
> > > > >   Structure PCI capabilities as detailed in \ref{sec:Virtio Transport Options / Virtio Over PCI Bus / Virtio Structure PCI Capabilities}
> > > > > 
> > > > > +Optionally, if the device is a PCIe device, the driver scans the PCI
> > > > > +Extended capability list, detecting virtio configuration layout using
> > > > > +Virtio Struct PCI Extended capabilities as detailed in \ref{sec:Virtio Transport Options / Virtio Over PCI Bus / Virtio Structure PCI Extended Capabilities}
> > > > If the same structure is present in both the PCI Capabilities and PCI
> > > > Extended Capabilities lists, which one has a higher priority?
> > > This is not allowed in this proposal where only the structure that
> > > depends on the PCI Express capability is allowed. E.g the PASID
> > > configuration structure depends on the PASID extended capability.
> > > 
> > > This is much more simpler than allowing existing virtio structures to
> > > be presented on both PCI and PCIe capability list.
> > Ah, I think I misunderstood the point of this patch. I thought you were
> > defining it because devices were running out of PCI Configuration Space
> > and PCI Extended Capabilities allow the device to use more space (PCI
> > Express Extended Configuration Space).
> > 
> > But now I think you're adding this to the spec in order to define VIRTIO
> > features that depend on PCI Extended Capabilities like PASID support?
> 
> 
> Right.
> 
> 
> > Is
> > it really necessary or just cleaner to use PCI Extended Capabilities
> > instead of VIRTIO's existing vendor-specific PCI Capabilities?
> 
> 
> Not a must, technically we can add them into the common configuration
> structure. But having a dedicated capability seems cleaner: it looks odd to
> configure PCIe features via the legacy capability anyhow. And this is more
> flexible to allow the hypervisor to map or hide the BAR independently.

Thanks for explaining. The purpose of the patch is clearer now.

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [virtio-dev] Re: [PATCH V2 2/2] virtio-pci: add PASID configuration extended capability
  2022-01-17  5:57         ` Jason Wang
@ 2022-01-19 18:01           ` Jean-Philippe Brucker
  0 siblings, 0 replies; 30+ messages in thread
From: Jean-Philippe Brucker @ 2022-01-19 18:01 UTC (permalink / raw)
  To: Jason Wang; +Cc: Stefan Hajnoczi, Virtio-Dev, mst, eperezma, Cindy Lu

On Mon, Jan 17, 2022 at 01:57:47PM +0800, Jason Wang wrote:
> How about something like:
> 
> "
> Except for the MSI transactions, the device MUST use PASID TLP prefix
> for the following memory transactions initiated by the virtqueue that
> belong to a virtqueue group if a valid PASID is assigned and PASID is
> enabled in the PASID extended capability:
> 
> \begin{itemize}
> \item Memory Requests (including AtomicOp Requests) with Untranslated
>   Addresses
> \item Address Translation Requests
> \item Page Request Messages
> \end{itemize}
> 
> The device MUST NOT use PASID TLP prefix for the MSI memory
> transactions.
> 

Sure, that looks good. Maybe splitting it would make it easier to explain.
If we force the driver to enable the PASID cap before using PASID:

"The driver MUST NOT assign a PASID to a virtqueue group if the PASID
extended capability is not enabled."

Then the device description is not concerned about the PASID cap:

"If a valid PASID is assigned to a virtqueue group, the device MUST use a
PASID TLP prefix for memory transactions of the following types initiated
by any virtqueue in the group:
  ...

The device MUST NOT use a PASID TLP prefix for MSI memory transactions.
"

Thanks,
Jean

---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [virtio-dev] Re: [PATCH V2 2/2] virtio-pci: add PASID configuration extended capability
  2022-01-14  9:43       ` Jean-Philippe Brucker
  2022-01-17  5:57         ` Jason Wang
@ 2022-01-19 23:53         ` Michael S. Tsirkin
  2022-01-24 15:26           ` Jean-Philippe Brucker
  1 sibling, 1 reply; 30+ messages in thread
From: Michael S. Tsirkin @ 2022-01-19 23:53 UTC (permalink / raw)
  To: Jean-Philippe Brucker
  Cc: Jason Wang, Stefan Hajnoczi, Virtio-Dev, eperezma, Cindy Lu

On Fri, Jan 14, 2022 at 09:43:11AM +0000, Jean-Philippe Brucker wrote:
> Hi Jason,
> 
> On Thu, Jan 13, 2022 at 09:24:16AM +0800, Jason Wang wrote:
> > The device MUST use PASID TLP prefix for all memory transactions
> > initiated by the virtqueue that belong to a virtqueue group ...
> > 
> > > vring reads/writes and data buffer reads/writes come
> > > to mind. Virtqueue MSI-X messages are probably not included? Anything
> > > else?
> > 
> > According to the PCIe spec and the semantic, the PASID TLP prefix
> > should be used for
> > 
> > Memory Requests (including AtomicOp Requests) with Untranslated Addresses.
> > Address Translation Requests, ATS Invalidation Messages, Page Request
> > Messages, and PRG Response Messages
> > 
> > So we probably don't need to differentiate MSI-X messages with DMA
> > here. That's why I think a general "memory transaction" should be
> > sufficient here. If you don't agree, I can clarify it as what PCIe
> > spec did.
> 
> I think we should be explicit about this particular case because someone
> implementing this extension might get it wrong. MSIs should not have a
> PASID because IOMMUs don't support it:
> 
> * VT-d treats Requests-with-PASID to the special range 0xfeexxxxx as
>   normal DMA (3.14 Handling Requests to Interrupt Address Range)
> * AMD IOMMU reports an error for MSIs with PASID (2.2.7.7 PCIe® TLP PASID
>   Prefix).

Ouch. I didn't realize. Really weird, of the "what were they thinking"
variety. The natural thing would be to have PASID==VM and scope
both DMA and MSI within PASID. I guess that is precluded on these
platforms then.

> * Arm requires creating mappings to the MSI controller in the SMMU. This
>   has implications for SVA where the PASID accesses the whole process
>   address space: if MSI transactions have a PASID prefix, that requires
>   mapping the MSI controller into the process address space on Arm, which
>   isn't convenient.

I'd like to understand this part a bit better. Can you explain?
We are talking about the IOMMU address space right?
What is wrong with mapping the MSI controller there?
Isn't convenient in what sense?

> 
> Thanks,
> Jean


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [virtio-dev] Re: [PATCH V2 2/2] virtio-pci: add PASID configuration extended capability
  2022-01-19 23:53         ` Michael S. Tsirkin
@ 2022-01-24 15:26           ` Jean-Philippe Brucker
  2022-01-24 22:15             ` Michael S. Tsirkin
  0 siblings, 1 reply; 30+ messages in thread
From: Jean-Philippe Brucker @ 2022-01-24 15:26 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Jason Wang, Stefan Hajnoczi, Virtio-Dev, eperezma, Cindy Lu

On Wed, Jan 19, 2022 at 06:53:17PM -0500, Michael S. Tsirkin wrote:
> > I think we should be explicit about this particular case because someone
> > implementing this extension might get it wrong. MSIs should not have a
> > PASID because IOMMUs don't support it:
> > 
> > * VT-d treats Requests-with-PASID to the special range 0xfeexxxxx as
> >   normal DMA (3.14 Handling Requests to Interrupt Address Range)
> > * AMD IOMMU reports an error for MSIs with PASID (2.2.7.7 PCIe® TLP PASID
> >   Prefix).
> 
> Ouch. I didn't realize. Really weird, of the "what were they thinking"
> variety. The natural thing would be to have PASID==VM and scope
> both DMA and MSI within PASID. I guess that is precluded on these
> platforms then.
> 

I think the problem is mainly that MSIs appear as normal memory writes
requests. If PCI had given them a special type, IOMMU/IRQ remapping
hardware could route them more easily, but as is they only have the
transaction's address to work with. So vendors made different choices for
supporting MSIs with their IRQ remapping hardware. Intel and AMD define
reserved address ranges 0xfeexxxxx, where any memory transaction is
treated as MSI.

This is fine when the kernel is in charge of allocating the I/O address
space for DMA, it just needs to make sure IOVAs are not allocated within
the address region reserved for MSIs. But it's more complicated with SVA.

One of the use of PASID is assigning a partition of device (here a group
of virtqueues) to a process. With this approach (SVA) the PASID accesses
the process' page tables, so the virtual address space is shared between
CPU and IOMMU, IOVA==VA. How would we deal with MSIs in this case, if they
had a PASID?  On x86, if the IOMMU performed IRQ remapping instead of
address translation, on a memory write with PASID to address 0xfeexxxxx,
the process couldn't use any address in that range for DMA. The process
would need to filter addresses returned by malloc() and treat some of them
as non-DMA'able, which defeats the purpose of SVA. To avoid having to do
this Intel treats all DMA with PASID as normal memory access, and the
process can't accidentally trigger MSIs by using the wrong address. I
think AMD behaves the same way but am not entirely sure - the wording in
one part of the spec (2.2.7.7) seems to imply that PASID access to the MSI
range would trigger an IOMMU error report, in which case the range would
need carving out anyway.

> > * Arm requires creating mappings to the MSI controller in the SMMU. This
> >   has implications for SVA where the PASID accesses the whole process
> >   address space: if MSI transactions have a PASID prefix, that requires
> >   mapping the MSI controller into the process address space on Arm, which
> >   isn't convenient.
> 
> I'd like to understand this part a bit better. Can you explain?
> We are talking about the IOMMU address space right?
> What is wrong with mapping the MSI controller there?
> Isn't convenient in what sense?

On Arm the IRQ remapping device is separate from the IOMMU, so there is no
special address range as far as the IOMMU is concerned. An MSI goes
through IOMMU address translation like any other memory write, the IOVA
gets translated into a PA that corresponds to the doorbell (MMIO) address
of the IRQ remapping device. The IOVA of the MSI is chosen by the OS and
can have any value. Now if MSIs had a PASID, then the IOMMU would
translate MSI addresses with the page tables corresponding to that PASID,
which with SVA correspond to the process page tables. So the kernel would
need to create a mapping inside the process' address space, to the PA of
the IRQ remapping device. That's the part that is inconvenient (would not
be a security issue because accesses to the doorbell from CPUs are ignored
according to Arm's platform spec, but it would certainly be awkward to set
up).

Thanks,
Jean

---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [virtio-dev] Re: [PATCH V2 2/2] virtio-pci: add PASID configuration extended capability
  2022-01-24 15:26           ` Jean-Philippe Brucker
@ 2022-01-24 22:15             ` Michael S. Tsirkin
  0 siblings, 0 replies; 30+ messages in thread
From: Michael S. Tsirkin @ 2022-01-24 22:15 UTC (permalink / raw)
  To: Jean-Philippe Brucker
  Cc: Jason Wang, Stefan Hajnoczi, Virtio-Dev, eperezma, Cindy Lu

On Mon, Jan 24, 2022 at 03:26:35PM +0000, Jean-Philippe Brucker wrote:
> On Wed, Jan 19, 2022 at 06:53:17PM -0500, Michael S. Tsirkin wrote:
> > > I think we should be explicit about this particular case because someone
> > > implementing this extension might get it wrong. MSIs should not have a
> > > PASID because IOMMUs don't support it:
> > > 
> > > * VT-d treats Requests-with-PASID to the special range 0xfeexxxxx as
> > >   normal DMA (3.14 Handling Requests to Interrupt Address Range)
> > > * AMD IOMMU reports an error for MSIs with PASID (2.2.7.7 PCIe® TLP PASID
> > >   Prefix).
> > 
> > Ouch. I didn't realize. Really weird, of the "what were they thinking"
> > variety. The natural thing would be to have PASID==VM and scope
> > both DMA and MSI within PASID. I guess that is precluded on these
> > platforms then.
> > 
> 
> I think the problem is mainly that MSIs appear as normal memory writes
> requests. If PCI had given them a special type, IOMMU/IRQ remapping
> hardware could route them more easily, but as is they only have the
> transaction's address to work with. So vendors made different choices for
> supporting MSIs with their IRQ remapping hardware. Intel and AMD define
> reserved address ranges 0xfeexxxxx, where any memory transaction is
> treated as MSI.
> 
> This is fine when the kernel is in charge of allocating the I/O address
> space for DMA, it just needs to make sure IOVAs are not allocated within
> the address region reserved for MSIs. But it's more complicated with SVA.
> 
> One of the use of PASID is assigning a partition of device (here a group
> of virtqueues) to a process. With this approach (SVA) the PASID accesses
> the process' page tables, so the virtual address space is shared between
> CPU and IOMMU, IOVA==VA. How would we deal with MSIs in this case, if they
> had a PASID?

Basically what I would do is what ARM has done ;)


>  On x86, if the IOMMU performed IRQ remapping instead of
> address translation, on a memory write with PASID to address 0xfeexxxxx,
> the process couldn't use any address in that range for DMA. The process
> would need to filter addresses returned by malloc() and treat some of them
> as non-DMA'able, which defeats the purpose of SVA. To avoid having to do
> this Intel treats all DMA with PASID as normal memory access, and the
> process can't accidentally trigger MSIs by using the wrong address. I
> think AMD behaves the same way but am not entirely sure - the wording in
> one part of the spec (2.2.7.7) seems to imply that PASID access to the MSI
> range would trigger an IOMMU error report, in which case the range would
> need carving out anyway.
> 
> > > * Arm requires creating mappings to the MSI controller in the SMMU. This
> > >   has implications for SVA where the PASID accesses the whole process
> > >   address space: if MSI transactions have a PASID prefix, that requires
> > >   mapping the MSI controller into the process address space on Arm, which
> > >   isn't convenient.
> > 
> > I'd like to understand this part a bit better. Can you explain?
> > We are talking about the IOMMU address space right?
> > What is wrong with mapping the MSI controller there?
> > Isn't convenient in what sense?
> 
> On Arm the IRQ remapping device is separate from the IOMMU, so there is no
> special address range as far as the IOMMU is concerned. An MSI goes
> through IOMMU address translation like any other memory write, the IOVA
> gets translated into a PA that corresponds to the doorbell (MMIO) address
> of the IRQ remapping device. The IOVA of the MSI is chosen by the OS and
> can have any value. Now if MSIs had a PASID, then the IOMMU would
> translate MSI addresses with the page tables corresponding to that PASID,
> which with SVA correspond to the process page tables. So the kernel would
> need to create a mapping inside the process' address space, to the PA of
> the IRQ remapping device. That's the part that is inconvenient (would not
> be a security issue because accesses to the doorbell from CPUs are ignored
> according to Arm's platform spec, but it would certainly be awkward to set
> up).
> 
> Thanks,
> Jean

The way I'd do it, is probably by an mmap call. Nothing too tricky about
that. Gives you a VA or it can get a VA with MMAP_FIXED if process
wants to allocate a range. And for a VM, we just naturally use
the regular doorbell address.

In fact, as you describe it it seems that we do have a similar problem
on AMD, we need to preclude VAs in the MSI range somehow. so I guess 2
vendors out of 3 need to carve up the process address range.

-- 
MST


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2022-01-24 22:15 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-01-12  5:57 [virtio-dev] [PATCH V2 0/2] virito-pci: PASID support Jason Wang
2022-01-12  5:57 ` [virtio-dev] [PATCH V2 1/2] virtio-pci: introduce virtio structure PCI Extended Capability Jason Wang
2022-01-12 10:10   ` [virtio-dev] " Stefan Hajnoczi
2022-01-13  0:55     ` Jason Wang
2022-01-13 10:19       ` Stefan Hajnoczi
2022-01-14  3:23         ` Jason Wang
2022-01-17 10:03           ` Stefan Hajnoczi
2022-01-12  5:57 ` [virtio-dev] [PATCH V2 2/2] virtio-pci: add PASID configuration extended capability Jason Wang
2022-01-12 10:41   ` [virtio-dev] " Stefan Hajnoczi
2022-01-13  1:24     ` Jason Wang
2022-01-13 10:32       ` Stefan Hajnoczi
2022-01-13 10:45         ` Michael S. Tsirkin
2022-01-13 14:53           ` Stefan Hajnoczi
2022-01-13 15:17             ` Michael S. Tsirkin
2022-01-14  3:15               ` Jason Wang
2022-01-14 10:38                 ` Jean-Philippe Brucker
2022-01-17  5:58                   ` Jason Wang
2022-01-14  9:43       ` Jean-Philippe Brucker
2022-01-17  5:57         ` Jason Wang
2022-01-19 18:01           ` Jean-Philippe Brucker
2022-01-19 23:53         ` Michael S. Tsirkin
2022-01-24 15:26           ` Jean-Philippe Brucker
2022-01-24 22:15             ` Michael S. Tsirkin
2022-01-12 10:44 ` [virtio-dev] Re: [PATCH V2 0/2] virito-pci: PASID support Stefan Hajnoczi
2022-01-13  1:28   ` Jason Wang
2022-01-13 10:36     ` Stefan Hajnoczi
2022-01-13 10:40       ` Michael S. Tsirkin
2022-01-14  2:53         ` Jason Wang
2022-01-13 15:18 ` Michael S. Tsirkin
2022-01-14  2:55   ` Jason Wang

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.