All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3 0/4] Add VIRTIO_RING_F_INDIRECT_SIZE and queue_indirect_size
@ 2022-03-16 13:44 Christian Schoenebeck
  2022-03-16 13:47 ` [PATCH v3 1/4] Add VIRTIO_RING_F_INDIRECT_SIZE Christian Schoenebeck
                   ` (3 more replies)
  0 siblings, 4 replies; 39+ messages in thread
From: Christian Schoenebeck @ 2022-03-16 13:44 UTC (permalink / raw)
  To: virtio-comment
  Cc: Cornelia Huck, Stefan Hajnoczi, Greg Kurz, Dominique Martinet,
	Halil Pasic

This is a follow-up of:
https://lists.oasis-open.org/archives/virtio-comment/202202/msg00059.html

These 4 patches introduce a more fine graded control over the maximum length
of indirect descriptor tables.

Associated Github Task:
https://github.com/oasis-tcs/virtio-spec/issues/122

v2 -> v3:

  * s/bus specific/transport specific/ [patch 1]

  * CCW: Add new field 'indirect_num' both to to struct vq_config_block and
    struct vq_info_block. [patch 4]

  * CCW: Bump virtio-ccw revision to 3 and make existence of new field
    'indirect_num' dependent on revision 3. [patch 4]

  * CCW: Rephrase description of new field 'indirect_num'. [patch 4]

Christian Schoenebeck (4):
  Add VIRTIO_RING_F_INDIRECT_SIZE
  Add PCI configuration field "queue_indirect_size"
  Add MMIO configuration register "QueueIndirectNum"
  Add CCW configuration field "indirect_num"

 content.tex     | 86 ++++++++++++++++++++++++++++++++++++++++++++++---
 packed-ring.tex |  2 +-
 split-ring.tex  |  8 +++--
 3 files changed, 88 insertions(+), 8 deletions(-)

-- 
2.30.2





^ permalink raw reply	[flat|nested] 39+ messages in thread

* [PATCH v3 1/4] Add VIRTIO_RING_F_INDIRECT_SIZE
  2022-03-16 13:44 [PATCH v3 0/4] Add VIRTIO_RING_F_INDIRECT_SIZE and queue_indirect_size Christian Schoenebeck
@ 2022-03-16 13:47 ` Christian Schoenebeck
  2022-03-17 13:40   ` [virtio-comment] " Cornelia Huck
  2022-03-19  9:33   ` [virtio-comment] " Michael S. Tsirkin
  2022-03-16 13:50 ` [PATCH v3 2/4] Add PCI configuration field "queue_indirect_size" Christian Schoenebeck
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 39+ messages in thread
From: Christian Schoenebeck @ 2022-03-16 13:47 UTC (permalink / raw)
  To: virtio-comment
  Cc: Cornelia Huck, Stefan Hajnoczi, Greg Kurz, Dominique Martinet,
	Halil Pasic

This new feature flag allows to decouple the maximum amount of
descriptors in indirect descriptor tables from the Queue Size.

The new term "Queue Indirect Size" is introduced for this purpose,
which is a transport specific configuration whose negotiation is
further specified for each transport with subsequent patches.

Fixes: https://github.com/oasis-tcs/virtio-spec/issues/122
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 content.tex     | 32 ++++++++++++++++++++++++++++++--
 packed-ring.tex |  2 +-
 split-ring.tex  |  8 ++++++--
 3 files changed, 37 insertions(+), 5 deletions(-)

diff --git a/content.tex b/content.tex
index c6f116c..685525d 100644
--- a/content.tex
+++ b/content.tex
@@ -99,10 +99,10 @@ \section{Feature Bits}\label{sec:Basic Facilities of a Virtio Device / Feature B
 \begin{description}
 \item[0 to 23, and 50 to 127] Feature bits for the specific device type
 
-\item[24 to 40] Feature bits reserved for extensions to the queue and
+\item[24 to 41] Feature bits reserved for extensions to the queue and
   feature negotiation mechanisms
 
-\item[41 to 49, and 128 and above] Feature bits reserved for future extensions.
+\item[42 to 49, and 128 and above] Feature bits reserved for future extensions.
 \end{description}
 
 \begin{note}
@@ -1051,6 +1051,10 @@ \subsubsection{Common configuration structure layout}\label{sec:Virtio Transport
 present either a value of 0 or a power of 2 in
 \field{queue_size}.
 
+If VIRTIO_RING_F_INDIRECT_SIZE has been negotiated, the device MUST provide the
+Queue Indirect Size supported by device, which is a transport specific
+configuration. It MUST allow the driver to set a lower value.
+
 \drivernormative{\paragraph}{Common configuration structure layout}{Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Common configuration structure layout}
 
 The driver MUST NOT write to \field{device_feature}, \field{num_queues}, \field{config_generation}, \field{queue_notify_off} or \field{queue_notify_data}.
@@ -6847,6 +6851,30 @@ \chapter{Reserved Feature Bits}\label{sec:Reserved Feature Bits}
   that the driver can reset a queue individually.
   See \ref{sec:Basic Facilities of a Virtio Device / Virtqueues / Virtqueue Reset}.
 
+  \item[VIRTIO_RING_F_INDIRECT_SIZE(41)] This feature indicates that the
+  Queue Indirect Size, i.e. the maximum amount of descriptors in indirect
+  descriptor tables, is independent from the Queue Size.
+
+  Without this feature, the Queue Size limits the length of the descriptor
+  chain, including indirect descriptor tables as in \ref{sec:Basic Facilities of
+  a Virtio Device / Virtqueues / The Virtqueue Descriptor Table / Indirect
+  Descriptors}, i.e. both the maximum amount of slots in the vring and the
+  actual bulk data size transmitted per vring slot.
+
+  With this feature enabled, the Queue Size only limits the maximum amount
+  of slots in the vring, but does not limit the actual bulk data size
+  being transmitted when indirect descriptors are used. Decoupling these
+  two configuration parameters this way not only allows much larger bulk data
+  being transferred per vring slot, but also avoids complicated synchronization
+  mechanisms if the device only supports a very small amount of vring slots. Due
+  to the 16-bit size of a descriptor's "next" field there is still an absolute
+  limit of $2^{16}$ descriptors per indirect descriptor table. However the
+  actual maximum amount supported by either device or driver might be less,
+  and therefore the bus specific Queue Indirect Size value MUST additionally
+  be negotiated if VIRTIO_RING_F_INDIRECT_SIZE was negotiated to subsequently
+  negotiate the actual amount of maximum indirect descriptors supported
+  by both sides.
+
 \end{description}
 
 \drivernormative{\section}{Reserved Feature Bits}{Reserved Feature Bits}
diff --git a/packed-ring.tex b/packed-ring.tex
index a9e6c16..e26d112 100644
--- a/packed-ring.tex
+++ b/packed-ring.tex
@@ -195,7 +195,7 @@ \subsection{Scatter-Gather Support}
 The device limits the number of descriptors in a list through a
 transport-specific and/or device-specific value. If not limited,
 the maximum number of descriptors in a list is the virt queue
-size.
+size unless the VIRTIO_RING_F_INDIRECT_SIZE feature has been negotiated.
 
 \subsection{Next Flag: Descriptor Chaining}
 \label{sec:Packed Virtqueues / Next Flag: Descriptor Chaining}
diff --git a/split-ring.tex b/split-ring.tex
index de94038..eaa90c3 100644
--- a/split-ring.tex
+++ b/split-ring.tex
@@ -268,8 +268,12 @@ \subsubsection{Indirect Descriptors}\label{sec:Basic Facilities of a Virtio Devi
 set the VIRTQ_DESC_F_INDIRECT flag within an indirect descriptor (ie. only
 one table per descriptor).
 
-A driver MUST NOT create a descriptor chain longer than the Queue Size of
-the device.
+If VIRTIO_RING_F_INDIRECT_SIZE has not been negotiated, the driver MUST
+NOT create a descriptor chain longer than the Queue Size of the device.
+
+If VIRTIO_RING_F_INDIRECT_SIZE has been negotiated, the number of
+descriptors per indirect descriptor table MUST NOT exceed the negotiated
+Queue Indirect Size.
 
 A driver MUST NOT set both VIRTQ_DESC_F_INDIRECT and VIRTQ_DESC_F_NEXT
 in \field{flags}.
-- 
2.30.2





^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v3 2/4] Add PCI configuration field "queue_indirect_size"
  2022-03-16 13:44 [PATCH v3 0/4] Add VIRTIO_RING_F_INDIRECT_SIZE and queue_indirect_size Christian Schoenebeck
  2022-03-16 13:47 ` [PATCH v3 1/4] Add VIRTIO_RING_F_INDIRECT_SIZE Christian Schoenebeck
@ 2022-03-16 13:50 ` Christian Schoenebeck
  2022-03-16 14:41   ` [virtio-comment] " Christian Schoenebeck
  2022-03-16 13:52 ` [PATCH v3 3/4] Add MMIO configuration register "QueueIndirectNum" Christian Schoenebeck
  2022-03-16 13:55 ` [PATCH v3 4/4] Add CCW configuration field "indirect_num" Christian Schoenebeck
  3 siblings, 1 reply; 39+ messages in thread
From: Christian Schoenebeck @ 2022-03-16 13:50 UTC (permalink / raw)
  To: virtio-comment
  Cc: Cornelia Huck, Stefan Hajnoczi, Greg Kurz, Dominique Martinet,
	Halil Pasic

This new PCI configuration field allows to negotiate a more fine
graded maximum lenght of indirect descriptor chains.

Fixes: https://github.com/oasis-tcs/virtio-spec/issues/122
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 content.tex | 26 ++++++++++++++++++++++----
 1 file changed, 22 insertions(+), 4 deletions(-)

diff --git a/content.tex b/content.tex
index 685525d..5162e2b 100644
--- a/content.tex
+++ b/content.tex
@@ -902,6 +902,7 @@ \subsubsection{Common configuration structure layout}\label{sec:Virtio Transport
         le64 queue_device;              /* read-write */
         le16 queue_notify_data;         /* read-only for driver */
         le16 queue_reset;               /* read-write */
+        le32 queue_indirect_size;       /* read-write */
 };
 \end{lstlisting}
 
@@ -987,6 +988,17 @@ \subsubsection{Common configuration structure layout}\label{sec:Virtio Transport
         This field exists only if VIRTIO_F_RING_RESET has been
         negotiated. (see \ref{sec:Basic Facilities of a Virtio Device / Virtqueues / Virtqueue Reset}).
 
+\item[\field{queue_indirect_size}]
+        This field is used to negotiate the maximum number of descriptors per
+        indirect descriptor table as in \ref{sec:Basic Facilities of a Virtio
+        Device / Virtqueues / The Virtqueue Descriptor Table / Indirect
+        Descriptors} if and only if the VIRTIO_RING_F_INDIRECT_SIZE feature has
+        been negotiated.
+
+        The device specifies its maximum number of descriptors per indirect
+        descriptor table. If the driver requires fewer descriptors, it writes
+        its lower value to inform the device of the reduced resource
+        requirements.
 \end{description}
 
 \devicenormative{\paragraph}{Common configuration structure layout}{Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Common configuration structure layout}
@@ -1081,6 +1093,12 @@ \subsubsection{Common configuration structure layout}\label{sec:Virtio Transport
 values to different values than those that were used before the queue reset.
 (see \ref{sec:Basic Facilities of a Virtio Device / Virtqueues / Virtqueue Reset}).
 
+The driver SHOULD write to \field{queue_indirect_size} if its maximum number of
+descriptors per vring slot is lower than that reported by the device.
+
+The driver MUST NOT write a higher value to \field{queue_indirect_size} than the
+one it reads from the device.
+
 \subsubsection{Notification structure layout}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Notification capability}
 
 The notification location is found using the VIRTIO_PCI_CAP_NOTIFY_CFG
@@ -6870,10 +6888,10 @@ \chapter{Reserved Feature Bits}\label{sec:Reserved Feature Bits}
   to the 16-bit size of a descriptor's "next" field there is still an absolute
   limit of $2^{16}$ descriptors per indirect descriptor table. However the
   actual maximum amount supported by either device or driver might be less,
-  and therefore the bus specific Queue Indirect Size value MUST additionally
-  be negotiated if VIRTIO_RING_F_INDIRECT_SIZE was negotiated to subsequently
-  negotiate the actual amount of maximum indirect descriptors supported
-  by both sides.
+  and therefore the transport specific Queue Indirect Size value MUST
+  additionally be negotiated if VIRTIO_RING_F_INDIRECT_SIZE was negotiated to
+  subsequently negotiate the actual amount of maximum indirect descriptors
+  supported by both sides.
 
 \end{description}
 
-- 
2.30.2





^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v3 3/4] Add MMIO configuration register "QueueIndirectNum"
  2022-03-16 13:44 [PATCH v3 0/4] Add VIRTIO_RING_F_INDIRECT_SIZE and queue_indirect_size Christian Schoenebeck
  2022-03-16 13:47 ` [PATCH v3 1/4] Add VIRTIO_RING_F_INDIRECT_SIZE Christian Schoenebeck
  2022-03-16 13:50 ` [PATCH v3 2/4] Add PCI configuration field "queue_indirect_size" Christian Schoenebeck
@ 2022-03-16 13:52 ` Christian Schoenebeck
  2022-03-16 13:55 ` [PATCH v3 4/4] Add CCW configuration field "indirect_num" Christian Schoenebeck
  3 siblings, 0 replies; 39+ messages in thread
From: Christian Schoenebeck @ 2022-03-16 13:52 UTC (permalink / raw)
  To: virtio-comment
  Cc: Cornelia Huck, Stefan Hajnoczi, Greg Kurz, Dominique Martinet,
	Halil Pasic

This new MMIO configuration register allows to negotiate a more fine
graded maximum lenght of indirect descriptor chains.

Fixes: https://github.com/oasis-tcs/virtio-spec/issues/122
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 content.tex | 22 ++++++++++++++++++++--
 1 file changed, 20 insertions(+), 2 deletions(-)

diff --git a/content.tex b/content.tex
index 5162e2b..97665a4 100644
--- a/content.tex
+++ b/content.tex
@@ -2225,8 +2225,9 @@ \subsection{Legacy interface}\label{sec:Virtio Transport Options / Virtio Over M
   \hline
   \mmioreg{QueueSel}{Virtual queue index}{0x030}{W}{%
     Writing to this register selects the virtual queue that the
-    following operations on the \field{QueueNumMax}, \field{QueueNum}, \field{QueueAlign}
-    and \field{QueuePFN} registers apply to. The index
+    following operations on the \field{QueueNumMax}, \field{QueueNum},
+    \field{QueueAlign}, \field{QueuePFN} and \field{QueueIndirectNum} registers
+    apply to. The index
     number of the first queue is zero (0x0). 
 .
   }
@@ -2282,6 +2283,19 @@ \subsection{Legacy interface}\label{sec:Virtio Transport Options / Virtio Over M
     Also see \ref{sec:General Initialization And Device Operation / Device Initialization}~\nameref{sec:General Initialization And Device Operation / Device Initialization}.
   }
   \hline
+  \mmioreg{QueueIndirectNum}{Virtual queue indirect size}{0x074}{RW}{%
+    This register is used to negotiate the maximum number of descriptors per
+    indirect descriptor table as in \ref{sec:Basic Facilities of a Virtio
+    Device / Virtqueues / The Virtqueue Descriptor Table / Indirect Descriptors}
+    if and only if the VIRTIO_RING_F_INDIRECT_SIZE feature has been negotiated.
+
+    The device specifies its maximum number of descriptors per indirect
+    descriptor table. If the driver requires fewer descriptors, it writes
+    its lower value to inform the device of the reduced resource requirements.
+
+    This applies to the queue selected by writing to \field{QueueSel}.
+  }
+  \hline
   \mmioreg{Config}{Configuration space}{0x100+}{RW}{}
   \hline
 \end{longtable}
@@ -2314,6 +2328,10 @@ \subsection{Legacy interface}\label{sec:Virtio Transport Options / Virtio Over M
 \item Notify the device about the queue size by writing the size to
    \field{QueueNum}.
 
+\item If VIRTIO_RING_F_INDIRECT_SIZE has been negotiated, read the maximum
+   length of indirect descriptor tables by reading \field{QueueIndirectNum}.
+   Write to lower this value if necessary.
+
 \item Notify the device about the used alignment by writing its value
    in bytes to \field{QueueAlign}.
 
-- 
2.30.2





^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v3 4/4] Add CCW configuration field "indirect_num"
  2022-03-16 13:44 [PATCH v3 0/4] Add VIRTIO_RING_F_INDIRECT_SIZE and queue_indirect_size Christian Schoenebeck
                   ` (2 preceding siblings ...)
  2022-03-16 13:52 ` [PATCH v3 3/4] Add MMIO configuration register "QueueIndirectNum" Christian Schoenebeck
@ 2022-03-16 13:55 ` Christian Schoenebeck
  2022-03-17 14:12   ` [virtio-comment] " Cornelia Huck
  3 siblings, 1 reply; 39+ messages in thread
From: Christian Schoenebeck @ 2022-03-16 13:55 UTC (permalink / raw)
  To: virtio-comment
  Cc: Cornelia Huck, Stefan Hajnoczi, Greg Kurz, Dominique Martinet,
	Halil Pasic

This new CCW configuration field allows to negotiate a more fine
graded maximum lenght of indirect descriptor chains.

Bump CCW virtio revision to 3 and make the existence of this new
field "indirect_num" dependant on revision 3.

Fixes: https://github.com/oasis-tcs/virtio-spec/issues/122
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
---
 content.tex | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/content.tex b/content.tex
index 97665a4..ea205d3 100644
--- a/content.tex
+++ b/content.tex
@@ -2520,7 +2520,9 @@ \subsubsection{Setting the Virtio Revision}\label{sec:Virtio Transport Options /
 \hline
 2        & 0      & <empty>   & CCW_CMD_READ_STATUS support \\
 \hline
-3-n      &        &           & reserved for later revisions \\
+3        & 0      & <empty>   & Queue Indirect Size (indirect_num) support \\
+\hline
+4-n      &        &           & reserved for later revisions \\
 \hline
 \end{tabular}
 
@@ -2581,12 +2583,17 @@ \subsubsection{Configuring a Virtqueue}\label{sec:Virtio Transport Options / Vir
 struct vq_config_block {
         be16 index;
         be16 max_num;
+        be32 indirect_num; /* since virtio-ccw rev. 3 */
 };
 \end{lstlisting}
 
 The requested number of buffers for queue \field{index} is returned in
 \field{max_num}.
 
+Since revision 3, \field{indirect_num} exists, which is supposed to reflect the
+Queue Indirect Size (i.e. the maximum length of indirect descriptor tables)
+supported by device for this queue.
+
 Afterwards, CCW_CMD_SET_VQ is issued by the driver to inform the
 device about the location used for its queue. The transmitted
 structure is
@@ -2599,6 +2606,7 @@ \subsubsection{Configuring a Virtqueue}\label{sec:Virtio Transport Options / Vir
         be16 num;
         be64 driver;
         be64 device;
+        be32 indirect_num; /* since virtio-ccw rev. 3 */
 };
 \end{lstlisting}
 
@@ -2607,6 +2615,10 @@ \subsubsection{Configuring a Virtqueue}\label{sec:Virtio Transport Options / Vir
 available area and used area for queue \field{index}, respectively. The actual
 virtqueue size (number of allocated buffers) is transmitted in \field{num}.
 
+Since revision 3, \field{indirect_num} exists, which is supposed to reflect the
+Queue Indirect Size (i.e. the maximum length of indirect descriptor tables)
+supported by driver for this queue.
+
 \devicenormative{\paragraph}{Configuring a Virtqueue}{Virtio Transport Options / Virtio over channel I/O / Device Initialization / Configuring a Virtqueue}
 
 \field{res0} is reserved and MUST be ignored by the device.
-- 
2.30.2





^ permalink raw reply related	[flat|nested] 39+ messages in thread

* Re: [virtio-comment] [PATCH v3 2/4] Add PCI configuration field "queue_indirect_size"
  2022-03-16 13:50 ` [PATCH v3 2/4] Add PCI configuration field "queue_indirect_size" Christian Schoenebeck
@ 2022-03-16 14:41   ` Christian Schoenebeck
  0 siblings, 0 replies; 39+ messages in thread
From: Christian Schoenebeck @ 2022-03-16 14:41 UTC (permalink / raw)
  To: virtio-comment
  Cc: Cornelia Huck, Stefan Hajnoczi, Greg Kurz, Dominique Martinet,
	Halil Pasic

On Mittwoch, 16. März 2022 14:50:22 CET Christian Schoenebeck wrote:
> This new PCI configuration field allows to negotiate a more fine
> graded maximum lenght of indirect descriptor chains.
> 
> Fixes: https://github.com/oasis-tcs/virtio-spec/issues/122
> Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
>  content.tex | 26 ++++++++++++++++++++++----
>  1 file changed, 22 insertions(+), 4 deletions(-)
> 
> diff --git a/content.tex b/content.tex
> index 685525d..5162e2b 100644
> --- a/content.tex
> +++ b/content.tex
> @@ -902,6 +902,7 @@ \subsubsection{Common configuration structure
> layout}\label{sec:Virtio Transport le64 queue_device;              /*
> read-write */
>          le16 queue_notify_data;         /* read-only for driver */
>          le16 queue_reset;               /* read-write */
> +        le32 queue_indirect_size;       /* read-write */
>  };
>  \end{lstlisting}
[...]
> @@ -6870,10 +6888,10 @@ \chapter{Reserved Feature Bits}\label{sec:Reserved Feature Bits}
>    to the 16-bit size of a descriptor's "next" field there is still an absolute
>    limit of $2^{16}$ descriptors per indirect descriptor table. However the
>    actual maximum amount supported by either device or driver might be less,
> -  and therefore the bus specific Queue Indirect Size value MUST additionally
> -  be negotiated if VIRTIO_RING_F_INDIRECT_SIZE was negotiated to subsequently
> -  negotiate the actual amount of maximum indirect descriptors supported
> -  by both sides.
> +  and therefore the transport specific Queue Indirect Size value MUST
> +  additionally be negotiated if VIRTIO_RING_F_INDIRECT_SIZE was negotiated to
> +  subsequently negotiate the actual amount of maximum indirect descriptors
> +  supported by both sides.

Oops, I accidentally squashed that change (s/bus/transport/) into this PCI
patch instead of patch 1 as it was supposed to.

I'll wait though if there is still something to address on CCW patch 4 before
posting a v4.

Best regards,
Christian Schoenebeck



^ permalink raw reply	[flat|nested] 39+ messages in thread

* [virtio-comment] Re: [PATCH v3 1/4] Add VIRTIO_RING_F_INDIRECT_SIZE
  2022-03-16 13:47 ` [PATCH v3 1/4] Add VIRTIO_RING_F_INDIRECT_SIZE Christian Schoenebeck
@ 2022-03-17 13:40   ` Cornelia Huck
  2022-03-18 10:45     ` Christian Schoenebeck
  2022-03-19  9:33   ` [virtio-comment] " Michael S. Tsirkin
  1 sibling, 1 reply; 39+ messages in thread
From: Cornelia Huck @ 2022-03-17 13:40 UTC (permalink / raw)
  To: Christian Schoenebeck, virtio-comment
  Cc: Stefan Hajnoczi, Greg Kurz, Dominique Martinet, Halil Pasic

On Wed, Mar 16 2022, Christian Schoenebeck <qemu_oss@crudebyte.com> wrote:

> This new feature flag allows to decouple the maximum amount of
> descriptors in indirect descriptor tables from the Queue Size.
>
> The new term "Queue Indirect Size" is introduced for this purpose,
> which is a transport specific configuration whose negotiation is
> further specified for each transport with subsequent patches.
>
> Fixes: https://github.com/oasis-tcs/virtio-spec/issues/122
> Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
>  content.tex     | 32 ++++++++++++++++++++++++++++++--
>  packed-ring.tex |  2 +-
>  split-ring.tex  |  8 ++++++--
>  3 files changed, 37 insertions(+), 5 deletions(-)
>
> diff --git a/content.tex b/content.tex
> index c6f116c..685525d 100644
> --- a/content.tex
> +++ b/content.tex

(...)

> @@ -1051,6 +1051,10 @@ \subsubsection{Common configuration structure layout}\label{sec:Virtio Transport
>  present either a value of 0 or a power of 2 in
>  \field{queue_size}.
>  
> +If VIRTIO_RING_F_INDIRECT_SIZE has been negotiated, the device MUST provide the
> +Queue Indirect Size supported by device, which is a transport specific

"supported by the device", or maybe "it supports"?

> +configuration. It MUST allow the driver to set a lower value.

Maybe "It MUST allow the driver to specify a lower maximum size." ?

> +
>  \drivernormative{\paragraph}{Common configuration structure layout}{Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Common configuration structure layout}
>  
>  The driver MUST NOT write to \field{device_feature}, \field{num_queues}, \field{config_generation}, \field{queue_notify_off} or \field{queue_notify_data}.
> @@ -6847,6 +6851,30 @@ \chapter{Reserved Feature Bits}\label{sec:Reserved Feature Bits}
>    that the driver can reset a queue individually.
>    See \ref{sec:Basic Facilities of a Virtio Device / Virtqueues / Virtqueue Reset}.
>  
> +  \item[VIRTIO_RING_F_INDIRECT_SIZE(41)] This feature indicates that the
> +  Queue Indirect Size, i.e. the maximum amount of descriptors in indirect
> +  descriptor tables, is independent from the Queue Size.
> +
> +  Without this feature, the Queue Size limits the length of the descriptor
> +  chain, including indirect descriptor tables as in \ref{sec:Basic Facilities of
> +  a Virtio Device / Virtqueues / The Virtqueue Descriptor Table / Indirect
> +  Descriptors}, i.e. both the maximum amount of slots in the vring and the
> +  actual bulk data size transmitted per vring slot.
> +
> +  With this feature enabled, the Queue Size only limits the maximum amount

s/enabled/negotiated/ ?

> +  of slots in the vring, but does not limit the actual bulk data size
> +  being transmitted when indirect descriptors are used. Decoupling these
> +  two configuration parameters this way not only allows much larger bulk data
> +  being transferred per vring slot, but also avoids complicated synchronization
> +  mechanisms if the device only supports a very small amount of vring slots. Due
> +  to the 16-bit size of a descriptor's "next" field there is still an absolute
> +  limit of $2^{16}$ descriptors per indirect descriptor table. However the
> +  actual maximum amount supported by either device or driver might be less,
> +  and therefore the bus specific Queue Indirect Size value MUST additionally
> +  be negotiated if VIRTIO_RING_F_INDIRECT_SIZE was negotiated to subsequently
> +  negotiate the actual amount of maximum indirect descriptors supported
> +  by both sides.

No 'MUST' in non-normative sections, please. Maybe make that

"and therefore the device and the driver additionally need to negotiate
the transport specific Queue Indirect Size value if
VIRTIO_RING_F_INDIRECT_SIZE was negotiated in order to determine the
actual amount of maximum indirect descriptors supported by both."

I'm not sure whether we would actually need some normative statements in
the sections below, but probably not.

> +
>  \end{description}
>  
>  \drivernormative{\section}{Reserved Feature Bits}{Reserved Feature Bits}


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 39+ messages in thread

* [virtio-comment] Re: [PATCH v3 4/4] Add CCW configuration field "indirect_num"
  2022-03-16 13:55 ` [PATCH v3 4/4] Add CCW configuration field "indirect_num" Christian Schoenebeck
@ 2022-03-17 14:12   ` Cornelia Huck
  2022-03-18 11:02     ` Christian Schoenebeck
  0 siblings, 1 reply; 39+ messages in thread
From: Cornelia Huck @ 2022-03-17 14:12 UTC (permalink / raw)
  To: Christian Schoenebeck, virtio-comment
  Cc: Stefan Hajnoczi, Greg Kurz, Dominique Martinet, Halil Pasic

On Wed, Mar 16 2022, Christian Schoenebeck <qemu_oss@crudebyte.com> wrote:

> This new CCW configuration field allows to negotiate a more fine
> graded maximum lenght of indirect descriptor chains.
>
> Bump CCW virtio revision to 3 and make the existence of this new
> field "indirect_num" dependant on revision 3.
>
> Fixes: https://github.com/oasis-tcs/virtio-spec/issues/122
> Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
> ---
>  content.tex | 14 +++++++++++++-
>  1 file changed, 13 insertions(+), 1 deletion(-)
>

(...)

> @@ -2581,12 +2583,17 @@ \subsubsection{Configuring a Virtqueue}\label{sec:Virtio Transport Options / Vir
>  struct vq_config_block {
>          be16 index;
>          be16 max_num;
> +        be32 indirect_num; /* since virtio-ccw rev. 3 */

Maybe make it max_indirect_num (to mirror max_num?)

>  };
>  \end{lstlisting}
>  
>  The requested number of buffers for queue \field{index} is returned in
>  \field{max_num}.
>  
> +Since revision 3, \field{indirect_num} exists, which is supposed to reflect the
> +Queue Indirect Size (i.e. the maximum length of indirect descriptor tables)
> +supported by device for this queue.

It still depends on the feature flag, though.

Maybe

"If revision 2 or lower is set, struct vq_config_block extends up to
\field{max_num}.

If revision 3 or higher is set, struct vq_config_block extends to
\field{max_indirect_num}. If VIRTIO_RING_F_INDIRECT_SIZE is negotiated,
\field{indirect_max_num} contains the maximum Queue Indirect Size
(i.e. the maximum length of indirect descriptor tables) supported by the
device for this queue; otherwise, its contents are unpredictable."

> +
>  Afterwards, CCW_CMD_SET_VQ is issued by the driver to inform the
>  device about the location used for its queue. The transmitted
>  structure is
> @@ -2599,6 +2606,7 @@ \subsubsection{Configuring a Virtqueue}\label{sec:Virtio Transport Options / Vir
>          be16 num;
>          be64 driver;
>          be64 device;
> +        be32 indirect_num; /* since virtio-ccw rev. 3 */
>  };
>  \end{lstlisting}
>  
> @@ -2607,6 +2615,10 @@ \subsubsection{Configuring a Virtqueue}\label{sec:Virtio Transport Options / Vir
>  available area and used area for queue \field{index}, respectively. The actual
>  virtqueue size (number of allocated buffers) is transmitted in \field{num}.
>  
> +Since revision 3, \field{indirect_num} exists, which is supposed to reflect the
> +Queue Indirect Size (i.e. the maximum length of indirect descriptor tables)
> +supported by driver for this queue.

Similar to above:

"If revision 2 or lower is set, struct vq_info_block extends up to
\field{device}.

If revision 3 or higher is set, struct vq_info_block extends to
\field{indirect_num}. If VIRTIO_RING_F_INDIRECT_SIZE is negotiated,
the Queue Indirect Size (i.e. the maximum length of indirect descriptor
tables) supported by the driver for this queue is transmitted in
\field{indirect_num}."


> +
>  \devicenormative{\paragraph}{Configuring a Virtqueue}{Virtio Transport Options / Virtio over channel I/O / Device Initialization / Configuring a Virtqueue}
>  
>  \field{res0} is reserved and MUST be ignored by the device.

We should include a device normative statement:

"The device MUST NOT offer VIRTIO_RING_F_INDIRECT_SIZE, unless at least
revision 3 is set, and therefore \field{max_indirect_num} respectively
\field{indirect_num} are available."


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 1/4] Add VIRTIO_RING_F_INDIRECT_SIZE
  2022-03-17 13:40   ` [virtio-comment] " Cornelia Huck
@ 2022-03-18 10:45     ` Christian Schoenebeck
  2022-03-18 16:03       ` [virtio-comment] " Cornelia Huck
  0 siblings, 1 reply; 39+ messages in thread
From: Christian Schoenebeck @ 2022-03-18 10:45 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: virtio-comment, Stefan Hajnoczi, Greg Kurz, Dominique Martinet,
	Halil Pasic

On Donnerstag, 17. März 2022 14:40:27 CET Cornelia Huck wrote:
> On Wed, Mar 16 2022, Christian Schoenebeck <qemu_oss@crudebyte.com> wrote:
> > This new feature flag allows to decouple the maximum amount of
> > descriptors in indirect descriptor tables from the Queue Size.
> > 
> > The new term "Queue Indirect Size" is introduced for this purpose,
> > which is a transport specific configuration whose negotiation is
> > further specified for each transport with subsequent patches.
> > 
> > Fixes: https://github.com/oasis-tcs/virtio-spec/issues/122
> > Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
> > Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
> > ---
> > 
> >  content.tex     | 32 ++++++++++++++++++++++++++++++--
> >  packed-ring.tex |  2 +-
> >  split-ring.tex  |  8 ++++++--
> >  3 files changed, 37 insertions(+), 5 deletions(-)
> > 
> > diff --git a/content.tex b/content.tex
> > index c6f116c..685525d 100644
> > --- a/content.tex
> > +++ b/content.tex
> 
> (...)
> 
> > @@ -1051,6 +1051,10 @@ \subsubsection{Common configuration structure
> > layout}\label{sec:Virtio Transport> 
> >  present either a value of 0 or a power of 2 in
> >  \field{queue_size}.
> > 
> > +If VIRTIO_RING_F_INDIRECT_SIZE has been negotiated, the device MUST
> > provide the +Queue Indirect Size supported by device, which is a
> > transport specific
> "supported by the device", or maybe "it supports"?

Article "the" is missing here, yes. Both "the device" or "it" would be fine.

> > +configuration. It MUST allow the driver to set a lower value.
> 
> Maybe "It MUST allow the driver to specify a lower maximum size." ?

That exact phrase was actually suggested by you (2021-12-14 18:20). I have no 
strong opinion on that. I find the existing "to set a lower value" clear 
enough and a less complicated wording though.

> > +
> > 
> >  \drivernormative{\paragraph}{Common configuration structure
> >  layout}{Virtio Transport Options / Virtio Over PCI Bus / PCI Device
> >  Layout / Common configuration structure layout}
> >  
> >  The driver MUST NOT write to \field{device_feature}, \field{num_queues},
> >  \field{config_generation}, \field{queue_notify_off} or
> >  \field{queue_notify_data}.> 
> > @@ -6847,6 +6851,30 @@ \chapter{Reserved Feature Bits}\label{sec:Reserved
> > Feature Bits}> 
> >    that the driver can reset a queue individually.
> >    See \ref{sec:Basic Facilities of a Virtio Device / Virtqueues /
> >    Virtqueue Reset}.> 
> > +  \item[VIRTIO_RING_F_INDIRECT_SIZE(41)] This feature indicates that the
> > +  Queue Indirect Size, i.e. the maximum amount of descriptors in indirect
> > +  descriptor tables, is independent from the Queue Size.
> > +
> > +  Without this feature, the Queue Size limits the length of the
> > descriptor
> > +  chain, including indirect descriptor tables as in \ref{sec:Basic
> > Facilities of +  a Virtio Device / Virtqueues / The Virtqueue Descriptor
> > Table / Indirect +  Descriptors}, i.e. both the maximum amount of slots
> > in the vring and the +  actual bulk data size transmitted per vring slot.
> > +
> > +  With this feature enabled, the Queue Size only limits the maximum
> > amount
> 
> s/enabled/negotiated/ ?

Also no strong opinion on that. "negotiated" makes it a bit more clear, yes.

> > +  of slots in the vring, but does not limit the actual bulk data size
> > +  being transmitted when indirect descriptors are used. Decoupling these
> > +  two configuration parameters this way not only allows much larger bulk
> > data +  being transferred per vring slot, but also avoids complicated
> > synchronization +  mechanisms if the device only supports a very small
> > amount of vring slots. Due +  to the 16-bit size of a descriptor's "next"
> > field there is still an absolute +  limit of $2^{16}$ descriptors per
> > indirect descriptor table. However the +  actual maximum amount supported
> > by either device or driver might be less, +  and therefore the bus
> > specific Queue Indirect Size value MUST additionally +  be negotiated if
> > VIRTIO_RING_F_INDIRECT_SIZE was negotiated to subsequently +  negotiate
> > the actual amount of maximum indirect descriptors supported +  by both
> > sides.
> 
> No 'MUST' in non-normative sections, please. Maybe make that
> 
> "and therefore the device and the driver additionally need to negotiate
> the transport specific Queue Indirect Size value if
> VIRTIO_RING_F_INDIRECT_SIZE was negotiated in order to determine the
> actual amount of maximum indirect descriptors supported by both."

Yes, that still slipped through, sorry. That change looks fine to me.

> I'm not sure whether we would actually need some normative statements in
> the sections below, but probably not.

Like what would you potentially miss here?

> > +
> > 
> >  \end{description}
> >  
> >  \drivernormative{\section}{Reserved Feature Bits}{Reserved Feature Bits}



^ permalink raw reply	[flat|nested] 39+ messages in thread

* [virtio-comment] Re: [PATCH v3 4/4] Add CCW configuration field "indirect_num"
  2022-03-17 14:12   ` [virtio-comment] " Cornelia Huck
@ 2022-03-18 11:02     ` Christian Schoenebeck
  2022-03-18 16:06       ` Halil Pasic
  2022-03-18 16:10       ` Cornelia Huck
  0 siblings, 2 replies; 39+ messages in thread
From: Christian Schoenebeck @ 2022-03-18 11:02 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: virtio-comment, Stefan Hajnoczi, Greg Kurz, Dominique Martinet,
	Halil Pasic

On Donnerstag, 17. März 2022 15:12:42 CET Cornelia Huck wrote:
> On Wed, Mar 16 2022, Christian Schoenebeck <qemu_oss@crudebyte.com> wrote:
> > This new CCW configuration field allows to negotiate a more fine
> > graded maximum lenght of indirect descriptor chains.
> > 
> > Bump CCW virtio revision to 3 and make the existence of this new
> > field "indirect_num" dependant on revision 3.
> > 
> > Fixes: https://github.com/oasis-tcs/virtio-spec/issues/122
> > Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
> > ---
> > 
> >  content.tex | 14 +++++++++++++-
> >  1 file changed, 13 insertions(+), 1 deletion(-)
> 
> (...)
> 
> > @@ -2581,12 +2583,17 @@ \subsubsection{Configuring a
> > Virtqueue}\label{sec:Virtio Transport Options / Vir> 
> >  struct vq_config_block {
> >  
> >          be16 index;
> >          be16 max_num;
> > 
> > +        be32 indirect_num; /* since virtio-ccw rev. 3 */
> 
> Maybe make it max_indirect_num (to mirror max_num?)

Right, that name appears indeed more appropriate here.

> >  };
> >  \end{lstlisting}
> >  
> >  The requested number of buffers for queue \field{index} is returned in
> >  \field{max_num}.
> > 
> > +Since revision 3, \field{indirect_num} exists, which is supposed to
> > reflect the +Queue Indirect Size (i.e. the maximum length of indirect
> > descriptor tables) +supported by device for this queue.
> 
> It still depends on the feature flag, though.
> 
> Maybe
> 
> "If revision 2 or lower is set, struct vq_config_block extends up to
> \field{max_num}.
> 
> If revision 3 or higher is set, struct vq_config_block extends to
> \field{max_indirect_num}. If VIRTIO_RING_F_INDIRECT_SIZE is negotiated,
> \field{indirect_max_num} contains the maximum Queue Indirect Size
> (i.e. the maximum length of indirect descriptor tables) supported by the
> device for this queue; otherwise, its contents are unpredictable."

Maybe rather something like this instead:

"Actual size of struct vq_config_block depends on the virtio-ccw revision.

If revision 2 or lower is set, struct vq_config_block extends up to
*including* \field{max_num}.

If revision 3 or higher is set, struct vq_config_block extends up to
*including* \field{max_indirect_num}. If VIRTIO_RING_F_INDIRECT_SIZE is 
negotiated, \field{indirect_max_num} contains the maximum Queue Indirect Size
(i.e. the maximum length of indirect descriptor tables) supported by the
device for this queue; otherwise, its contents are *undefined*."
 
> > +
> > 
> >  Afterwards, CCW_CMD_SET_VQ is issued by the driver to inform the
> >  device about the location used for its queue. The transmitted
> >  structure is
> > 
> > @@ -2599,6 +2606,7 @@ \subsubsection{Configuring a
> > Virtqueue}\label{sec:Virtio Transport Options / Vir> 
> >          be16 num;
> >          be64 driver;
> >          be64 device;
> > 
> > +        be32 indirect_num; /* since virtio-ccw rev. 3 */
> > 
> >  };
> >  \end{lstlisting}
> > 
> > @@ -2607,6 +2615,10 @@ \subsubsection{Configuring a
> > Virtqueue}\label{sec:Virtio Transport Options / Vir> 
> >  available area and used area for queue \field{index}, respectively. The
> >  actual virtqueue size (number of allocated buffers) is transmitted in
> >  \field{num}.> 
> > +Since revision 3, \field{indirect_num} exists, which is supposed to
> > reflect the +Queue Indirect Size (i.e. the maximum length of indirect
> > descriptor tables) +supported by driver for this queue.
> 
> Similar to above:
> 
> "If revision 2 or lower is set, struct vq_info_block extends up to
> \field{device}.
> 
> If revision 3 or higher is set, struct vq_info_block extends to
> \field{indirect_num}. If VIRTIO_RING_F_INDIRECT_SIZE is negotiated,
> the Queue Indirect Size (i.e. the maximum length of indirect descriptor
> tables) supported by the driver for this queue is transmitted in
> \field{indirect_num}."

Ditto.

> > +
> > 
> >  \devicenormative{\paragraph}{Configuring a Virtqueue}{Virtio Transport
> >  Options / Virtio over channel I/O / Device Initialization / Configuring
> >  a Virtqueue}
> >  
> >  \field{res0} is reserved and MUST be ignored by the device.
> 
> We should include a device normative statement:
> 
> "The device MUST NOT offer VIRTIO_RING_F_INDIRECT_SIZE, unless at least
> revision 3 is set, and therefore \field{max_indirect_num} respectively
> \field{indirect_num} are available."

Oh, you are suggesting two different names for those two fields? 
"max_indirect_num" vs. "indirect_num". If yes, why?

Otherwise about adding that normative statement in general here: sure, makes 
sense.

Best regards,
Christian Schoenebeck




This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 39+ messages in thread

* [virtio-comment] Re: [PATCH v3 1/4] Add VIRTIO_RING_F_INDIRECT_SIZE
  2022-03-18 10:45     ` Christian Schoenebeck
@ 2022-03-18 16:03       ` Cornelia Huck
  0 siblings, 0 replies; 39+ messages in thread
From: Cornelia Huck @ 2022-03-18 16:03 UTC (permalink / raw)
  To: Christian Schoenebeck
  Cc: virtio-comment, Stefan Hajnoczi, Greg Kurz, Dominique Martinet,
	Halil Pasic

On Fri, Mar 18 2022, Christian Schoenebeck <qemu_oss@crudebyte.com> wrote:

> On Donnerstag, 17. März 2022 14:40:27 CET Cornelia Huck wrote:
>> On Wed, Mar 16 2022, Christian Schoenebeck <qemu_oss@crudebyte.com> wrote:
>> > This new feature flag allows to decouple the maximum amount of
>> > descriptors in indirect descriptor tables from the Queue Size.
>> > 
>> > The new term "Queue Indirect Size" is introduced for this purpose,
>> > which is a transport specific configuration whose negotiation is
>> > further specified for each transport with subsequent patches.
>> > 
>> > Fixes: https://github.com/oasis-tcs/virtio-spec/issues/122
>> > Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
>> > Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
>> > ---
>> > 
>> >  content.tex     | 32 ++++++++++++++++++++++++++++++--
>> >  packed-ring.tex |  2 +-
>> >  split-ring.tex  |  8 ++++++--
>> >  3 files changed, 37 insertions(+), 5 deletions(-)
>> > 
>> > diff --git a/content.tex b/content.tex
>> > index c6f116c..685525d 100644
>> > --- a/content.tex
>> > +++ b/content.tex
>> 
>> (...)
>> 
>> > @@ -1051,6 +1051,10 @@ \subsubsection{Common configuration structure
>> > layout}\label{sec:Virtio Transport> 
>> >  present either a value of 0 or a power of 2 in
>> >  \field{queue_size}.
>> > 
>> > +If VIRTIO_RING_F_INDIRECT_SIZE has been negotiated, the device MUST
>> > provide the +Queue Indirect Size supported by device, which is a
>> > transport specific
>> "supported by the device", or maybe "it supports"?
>
> Article "the" is missing here, yes. Both "the device" or "it" would be fine.

Yes; just use whichever you prefer :)

>
>> > +configuration. It MUST allow the driver to set a lower value.
>> 
>> Maybe "It MUST allow the driver to specify a lower maximum size." ?
>
> That exact phrase was actually suggested by you (2021-12-14 18:20). I have no 
> strong opinion on that. I find the existing "to set a lower value" clear 
> enough and a less complicated wording though.

Heh :) I do not really have a strong opinion here (hence the "maybe".)

>> I'm not sure whether we would actually need some normative statements in
>> the sections below, but probably not.
>
> Like what would you potentially miss here?

That we MUST also negotiate the value if we negotiate the feature. But
as I wrote, we probably don't need to be that explicit.


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [virtio-comment] Re: [PATCH v3 4/4] Add CCW configuration field "indirect_num"
  2022-03-18 11:02     ` Christian Schoenebeck
@ 2022-03-18 16:06       ` Halil Pasic
  2022-03-19 10:12         ` Christian Schoenebeck
  2022-03-18 16:10       ` Cornelia Huck
  1 sibling, 1 reply; 39+ messages in thread
From: Halil Pasic @ 2022-03-18 16:06 UTC (permalink / raw)
  To: Christian Schoenebeck
  Cc: Cornelia Huck, virtio-comment, Stefan Hajnoczi, Greg Kurz,
	Dominique Martinet, Halil Pasic

On Fri, 18 Mar 2022 12:02:31 +0100
Christian Schoenebeck <qemu_oss@crudebyte.com> wrote:

> On Donnerstag, 17. März 2022 15:12:42 CET Cornelia Huck wrote:
> > On Wed, Mar 16 2022, Christian Schoenebeck <qemu_oss@crudebyte.com> wrote:  
> > > This new CCW configuration field allows to negotiate a more fine
> > > graded maximum lenght of indirect descriptor chains.
> > > 
> > > Bump CCW virtio revision to 3 and make the existence of this new
> > > field "indirect_num" dependant on revision 3.
> > > 
> > > Fixes: https://github.com/oasis-tcs/virtio-spec/issues/122
> > > Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
> > > ---
> > > 
> > >  content.tex | 14 +++++++++++++-
> > >  1 file changed, 13 insertions(+), 1 deletion(-)  
> > 
> > (...)
> >   
> > > @@ -2581,12 +2583,17 @@ \subsubsection{Configuring a
> > > Virtqueue}\label{sec:Virtio Transport Options / Vir> 
> > >  struct vq_config_block {
> > >  
> > >          be16 index;
> > >          be16 max_num;
> > > 
> > > +        be32 indirect_num; /* since virtio-ccw rev. 3 */  
> > 
> > Maybe make it max_indirect_num (to mirror max_num?)  
> 
> Right, that name appears indeed more appropriate here.
> 
> > >  };
> > >  \end{lstlisting}
> > >  
> > >  The requested number of buffers for queue \field{index} is returned in
> > >  \field{max_num}.
> > > 
> > > +Since revision 3, \field{indirect_num} exists, which is supposed to
> > > reflect the +Queue Indirect Size (i.e. the maximum length of indirect
> > > descriptor tables) +supported by device for this queue.  
> > 
> > It still depends on the feature flag, though.
> > 
> > Maybe
> > 
> > "If revision 2 or lower is set, struct vq_config_block extends up to
> > \field{max_num}.
> > 
> > If revision 3 or higher is set, struct vq_config_block extends to
> > \field{max_indirect_num}. If VIRTIO_RING_F_INDIRECT_SIZE is negotiated,
> > \field{indirect_max_num} contains the maximum Queue Indirect Size
> > (i.e. the maximum length of indirect descriptor tables) supported by the
> > device for this queue; otherwise, its contents are unpredictable."  
> 
> Maybe rather something like this instead:
> 
> "Actual size of struct vq_config_block depends on the virtio-ccw revision.
> 
> If revision 2 or lower is set, struct vq_config_block extends up to
> *including* \field{max_num}.
> 
> If revision 3 or higher is set, struct vq_config_block extends up to
> *including* \field{max_indirect_num}. If VIRTIO_RING_F_INDIRECT_SIZE is 
> negotiated, \field{indirect_max_num} contains the maximum Queue Indirect Size
> (i.e. the maximum length of indirect descriptor tables) supported by the
> device for this queue; otherwise, its contents are *undefined*."
>  

I agree that the "including" is important, but I'm not sure about the
"its contents are undefined". I don't really understand why should we use
a plural here. What speaks against specifying that in SHOULD be stored
as 0 by the device, and MUST be ignored by the driver?

Currently we say that \field{max_indirect_num} exists like a be32 field
even if VIRTIO_RING_F_INDIRECT_SIZE is not negotiated. Which kind of
implies that at least type invariants should hold. Of course, there is
none here (i.e. every bits value is also a be32 value), but for something
like an enum interesting corner cases can pop up.

[..]

Regards,
Halil


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [virtio-comment] Re: [PATCH v3 4/4] Add CCW configuration field "indirect_num"
  2022-03-18 11:02     ` Christian Schoenebeck
  2022-03-18 16:06       ` Halil Pasic
@ 2022-03-18 16:10       ` Cornelia Huck
  2022-03-19 10:23         ` Christian Schoenebeck
  1 sibling, 1 reply; 39+ messages in thread
From: Cornelia Huck @ 2022-03-18 16:10 UTC (permalink / raw)
  To: Christian Schoenebeck
  Cc: virtio-comment, Stefan Hajnoczi, Greg Kurz, Dominique Martinet,
	Halil Pasic

On Fri, Mar 18 2022, Christian Schoenebeck <qemu_oss@crudebyte.com> wrote:

> On Donnerstag, 17. März 2022 15:12:42 CET Cornelia Huck wrote:
>> On Wed, Mar 16 2022, Christian Schoenebeck <qemu_oss@crudebyte.com> wrote:

>> > +Since revision 3, \field{indirect_num} exists, which is supposed to
>> > reflect the +Queue Indirect Size (i.e. the maximum length of indirect
>> > descriptor tables) +supported by device for this queue.
>> 
>> It still depends on the feature flag, though.
>> 
>> Maybe
>> 
>> "If revision 2 or lower is set, struct vq_config_block extends up to
>> \field{max_num}.
>> 
>> If revision 3 or higher is set, struct vq_config_block extends to
>> \field{max_indirect_num}. If VIRTIO_RING_F_INDIRECT_SIZE is negotiated,
>> \field{indirect_max_num} contains the maximum Queue Indirect Size
>> (i.e. the maximum length of indirect descriptor tables) supported by the
>> device for this queue; otherwise, its contents are unpredictable."
>
> Maybe rather something like this instead:
>
> "Actual size of struct vq_config_block depends on the virtio-ccw revision.
>
> If revision 2 or lower is set, struct vq_config_block extends up to
> *including* \field{max_num}.
>
> If revision 3 or higher is set, struct vq_config_block extends up to
> *including* \field{max_indirect_num}. If VIRTIO_RING_F_INDIRECT_SIZE is 
> negotiated, \field{indirect_max_num} contains the maximum Queue Indirect Size
> (i.e. the maximum length of indirect descriptor tables) supported by the
> device for this queue; otherwise, its contents are *undefined*."

Hm, I was coming from the driver's perspective here. But "undefined"
probably captures "junk in there, do not use" better.

Ack on your other points.

>> We should include a device normative statement:
>> 
>> "The device MUST NOT offer VIRTIO_RING_F_INDIRECT_SIZE, unless at least
>> revision 3 is set, and therefore \field{max_indirect_num} respectively
>> \field{indirect_num} are available."
>
> Oh, you are suggesting two different names for those two fields? 
> "max_indirect_num" vs. "indirect_num". If yes, why?

The "max_indirect_num" goes with the "max_num": this is the maximum
value that the device supports (the "maximum maximum", in a way :).
"indirect_num" is the actual upper limit while the device is in use.

>
> Otherwise about adding that normative statement in general here: sure, makes 
> sense.


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [virtio-comment] [PATCH v3 1/4] Add VIRTIO_RING_F_INDIRECT_SIZE
  2022-03-16 13:47 ` [PATCH v3 1/4] Add VIRTIO_RING_F_INDIRECT_SIZE Christian Schoenebeck
  2022-03-17 13:40   ` [virtio-comment] " Cornelia Huck
@ 2022-03-19  9:33   ` Michael S. Tsirkin
  2022-03-19 12:00     ` Christian Schoenebeck
  1 sibling, 1 reply; 39+ messages in thread
From: Michael S. Tsirkin @ 2022-03-19  9:33 UTC (permalink / raw)
  To: Christian Schoenebeck
  Cc: virtio-comment, Cornelia Huck, Stefan Hajnoczi, Greg Kurz,
	Dominique Martinet, Halil Pasic

[-- Attachment #1: Type: text/plain, Size: 6773 bytes --]

On Wed, Mar 16, 2022, 15:47 Christian Schoenebeck <qemu_oss@crudebyte.com>
wrote:

> This new feature flag allows to decouple the maximum amount of
> descriptors in indirect descriptor tables from the Queue Size.
>



if we are extending these limits, I suggest reusing the feature flag to
also add a limit on total s/g list size. making it separate from queue size
was requested a while ago.





> The new term "Queue Indirect Size" is introduced for this purpose,
> which is a transport specific configuration whose negotiation is
> further specified for each transport with subsequent patches.
>
> Fixes: https://github.com/oasis-tcs/virtio-spec/issues/122
> Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
>  content.tex     | 32 ++++++++++++++++++++++++++++++--
>  packed-ring.tex |  2 +-
>  split-ring.tex  |  8 ++++++--
>  3 files changed, 37 insertions(+), 5 deletions(-)
>
> diff --git a/content.tex b/content.tex
> index c6f116c..685525d 100644
> --- a/content.tex
> +++ b/content.tex
> @@ -99,10 +99,10 @@ \section{Feature Bits}\label{sec:Basic Facilities of a
> Virtio Device / Feature B
>  \begin{description}
>  \item[0 to 23, and 50 to 127] Feature bits for the specific device type
>
> -\item[24 to 40] Feature bits reserved for extensions to the queue and
> +\item[24 to 41] Feature bits reserved for extensions to the queue and
>    feature negotiation mechanisms
>
> -\item[41 to 49, and 128 and above] Feature bits reserved for future
> extensions.
> +\item[42 to 49, and 128 and above] Feature bits reserved for future
> extensions.
>  \end{description}
>
>  \begin{note}
> @@ -1051,6 +1051,10 @@ \subsubsection{Common configuration structure
> layout}\label{sec:Virtio Transport
>  present either a value of 0 or a power of 2 in
>  \field{queue_size}.
>
> +If VIRTIO_RING_F_INDIRECT_SIZE has been negotiated, the device MUST
> provide the
> +Queue Indirect Size supported by device, which is a transport specific
> +configuration. It MUST allow the driver to set a lower value.
> +
>  \drivernormative{\paragraph}{Common configuration structure
> layout}{Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout
> / Common configuration structure layout}
>
>  The driver MUST NOT write to \field{device_feature}, \field{num_queues},
> \field{config_generation}, \field{queue_notify_off} or
> \field{queue_notify_data}.
> @@ -6847,6 +6851,30 @@ \chapter{Reserved Feature Bits}\label{sec:Reserved
> Feature Bits}
>    that the driver can reset a queue individually.
>    See \ref{sec:Basic Facilities of a Virtio Device / Virtqueues /
> Virtqueue Reset}.
>
> +  \item[VIRTIO_RING_F_INDIRECT_SIZE(41)] This feature indicates that the
> +  Queue Indirect Size, i.e. the maximum amount of descriptors in indirect
> +  descriptor tables, is independent from the Queue Size.
> +
> +  Without this feature, the Queue Size limits the length of the descriptor
> +  chain, including indirect descriptor tables as in \ref{sec:Basic
> Facilities of
> +  a Virtio Device / Virtqueues / The Virtqueue Descriptor Table / Indirect
> +  Descriptors}, i.e. both the maximum amount of slots in the vring and the
> +  actual bulk data size transmitted per vring slot.
>


spect does not call these  slots elsewhere.


+
> +  With this feature enabled, the Queue Size only limits the maximum amount
> +  of slots in the vring, but does not limit the actual bulk data size
> +  being transmitted when indirect descriptors are used. Decoupling these
> +  two configuration parameters this way not only allows much larger bulk
> data
> +  being transferred per vring slot, but also avoids complicated
> synchronization
> +  mechanisms if the device only supports a very small amount of vring
> slots. Due
> +  to the 16-bit size of a descriptor's "next" field there is still an
> absolute
> +  limit of $2^{16}$ descriptors per indirect descriptor table. However the
> +  actual maximum amount supported by either device or driver might be
> less,
> +  and therefore the bus specific Queue Indirect Size value MUST
> additionally
> +  be negotiated if VIRTIO_RING_F_INDIRECT_SIZE was negotiated to
> subsequently
> +  negotiate the actual amount of maximum indirect descriptors supported
> +  by both sides.
>


still not sure what exactly is the value. e.g. in a buffer including
indirect and direct descriptors.

+
>  \end{description}
>
>  \drivernormative{\section}{Reserved Feature Bits}{Reserved Feature Bits}
> diff --git a/packed-ring.tex b/packed-ring.tex
> index a9e6c16..e26d112 100644
> --- a/packed-ring.tex
> +++ b/packed-ring.tex
> @@ -195,7 +195,7 @@ \subsection{Scatter-Gather Support}
>  The device limits the number of descriptors in a list through a
>  transport-specific and/or device-specific value. If not limited,
>  the maximum number of descriptors in a list is the virt queue
> -size.
> +size unless the VIRTIO_RING_F_INDIRECT_SIZE feature has been negotiated.
>
>  \subsection{Next Flag: Descriptor Chaining}
>  \label{sec:Packed Virtqueues / Next Flag: Descriptor Chaining}
> diff --git a/split-ring.tex b/split-ring.tex
> index de94038..eaa90c3 100644
> --- a/split-ring.tex
> +++ b/split-ring.tex
> @@ -268,8 +268,12 @@ \subsubsection{Indirect Descriptors}\label{sec:Basic
> Facilities of a Virtio Devi
>  set the VIRTQ_DESC_F_INDIRECT flag within an indirect descriptor (ie. only
>  one table per descriptor).
>
> -A driver MUST NOT create a descriptor chain longer than the Queue Size of
> -the device.
>

+If VIRTIO_RING_F_INDIRECT_SIZE has not been negotiated, the driver MUST
> +NOT create a descriptor chain longer than the Queue Size of the device.
> +
> +If VIRTIO_RING_F_INDIRECT_SIZE has been negotiated, the number of
> +descriptors per indirect descriptor table MUST NOT exceed the negotiated
> +Queue Indirect Size.
>

it is not negotiated is it?


>
>  A driver MUST NOT set both VIRTQ_DESC_F_INDIRECT and VIRTQ_DESC_F_NEXT
>  in \field{flags}.
> --
> 2.30.2
>
>
>
>
>
> This publicly archived list offers a means to provide input to the
> OASIS Virtual I/O Device (VIRTIO) TC.
>
> In order to verify user consent to the Feedback License terms and
> to minimize spam in the list archive, subscription is required
> before posting.
>
> Subscribe: virtio-comment-subscribe@lists.oasis-open.org
> Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
> List help: virtio-comment-help@lists.oasis-open.org
> List archive: https://lists.oasis-open.org/archives/virtio-comment/
> Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
> List Guidelines:
> https://www.oasis-open.org/policies-guidelines/mailing-lists
> Committee: https://www.oasis-open.org/committees/virtio/
> Join OASIS: https://www.oasis-open.org/join/
>
>

[-- Attachment #2: Type: text/html, Size: 10075 bytes --]

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [virtio-comment] Re: [PATCH v3 4/4] Add CCW configuration field "indirect_num"
  2022-03-18 16:06       ` Halil Pasic
@ 2022-03-19 10:12         ` Christian Schoenebeck
  2022-03-21 16:36           ` Cornelia Huck
  0 siblings, 1 reply; 39+ messages in thread
From: Christian Schoenebeck @ 2022-03-19 10:12 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Cornelia Huck, virtio-comment, Stefan Hajnoczi, Greg Kurz,
	Dominique Martinet

On Freitag, 18. März 2022 17:06:25 CET Halil Pasic wrote:
> On Fri, 18 Mar 2022 12:02:31 +0100
> 
> Christian Schoenebeck <qemu_oss@crudebyte.com> wrote:
> > On Donnerstag, 17. März 2022 15:12:42 CET Cornelia Huck wrote:
> > > On Wed, Mar 16 2022, Christian Schoenebeck <qemu_oss@crudebyte.com> 
wrote:
> > > > This new CCW configuration field allows to negotiate a more fine
> > > > graded maximum lenght of indirect descriptor chains.
> > > > 
> > > > Bump CCW virtio revision to 3 and make the existence of this new
> > > > field "indirect_num" dependant on revision 3.
> > > > 
> > > > Fixes: https://github.com/oasis-tcs/virtio-spec/issues/122
> > > > Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
> > > > ---
> > > > 
> > > >  content.tex | 14 +++++++++++++-
> > > >  1 file changed, 13 insertions(+), 1 deletion(-)
> > > 
> > > (...)
> > > 
> > > > @@ -2581,12 +2583,17 @@ \subsubsection{Configuring a
> > > > Virtqueue}\label{sec:Virtio Transport Options / Vir>
> > > > 
> > > >  struct vq_config_block {
> > > >  
> > > >          be16 index;
> > > >          be16 max_num;
> > > > 
> > > > +        be32 indirect_num; /* since virtio-ccw rev. 3 */
> > > 
> > > Maybe make it max_indirect_num (to mirror max_num?)
> > 
> > Right, that name appears indeed more appropriate here.
> > 
> > > >  };
> > > >  \end{lstlisting}
> > > >  
> > > >  The requested number of buffers for queue \field{index} is returned
> > > >  in
> > > >  \field{max_num}.
> > > > 
> > > > +Since revision 3, \field{indirect_num} exists, which is supposed to
> > > > reflect the +Queue Indirect Size (i.e. the maximum length of indirect
> > > > descriptor tables) +supported by device for this queue.
> > > 
> > > It still depends on the feature flag, though.
> > > 
> > > Maybe
> > > 
> > > "If revision 2 or lower is set, struct vq_config_block extends up to
> > > \field{max_num}.
> > > 
> > > If revision 3 or higher is set, struct vq_config_block extends to
> > > \field{max_indirect_num}. If VIRTIO_RING_F_INDIRECT_SIZE is negotiated,
> > > \field{indirect_max_num} contains the maximum Queue Indirect Size
> > > (i.e. the maximum length of indirect descriptor tables) supported by the
> > > device for this queue; otherwise, its contents are unpredictable."
> > 
> > Maybe rather something like this instead:
> > 
> > "Actual size of struct vq_config_block depends on the virtio-ccw revision.
> > 
> > If revision 2 or lower is set, struct vq_config_block extends up to
> > *including* \field{max_num}.
> > 
> > If revision 3 or higher is set, struct vq_config_block extends up to
> > *including* \field{max_indirect_num}. If VIRTIO_RING_F_INDIRECT_SIZE is
> > negotiated, \field{indirect_max_num} contains the maximum Queue Indirect
> > Size (i.e. the maximum length of indirect descriptor tables) supported by
> > the device for this queue; otherwise, its contents are *undefined*."
> 
> I agree that the "including" is important, but I'm not sure about the
> "its contents are undefined". I don't really understand why should we use
> a plural here. What speaks against specifying that in SHOULD be stored
> as 0 by the device, and MUST be ignored by the driver?

Both solutions would be viable. Personally I would just use something like 
"Should be zero" if there is a value in recommending that, but I don't see a 
value in recommending to set something to zero and at the same time requiring 
to not access it in the first place.

> Currently we say that \field{max_indirect_num} exists like a be32 field
> even if VIRTIO_RING_F_INDIRECT_SIZE is not negotiated. Which kind of
> implies that at least type invariants should hold. Of course, there is
> none here (i.e. every bits value is also a be32 value), but for something
> like an enum interesting corner cases can pop up.

I can't follow you on that one. What has that do with enums in this case?

Anyway, I won't persist on my suggestion to use the (IMO more compact form) 
"undefined". If you guys prefer the more specific solution "SHOULD be 0 and 
MUST not be accessed" then I will go that way.

Best regards,
Christian Schoenebeck



This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [virtio-comment] Re: [PATCH v3 4/4] Add CCW configuration field "indirect_num"
  2022-03-18 16:10       ` Cornelia Huck
@ 2022-03-19 10:23         ` Christian Schoenebeck
  2022-03-21 16:25           ` Cornelia Huck
  0 siblings, 1 reply; 39+ messages in thread
From: Christian Schoenebeck @ 2022-03-19 10:23 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: virtio-comment, Stefan Hajnoczi, Greg Kurz, Dominique Martinet,
	Halil Pasic

On Freitag, 18. März 2022 17:10:34 CET Cornelia Huck wrote:
> On Fri, Mar 18 2022, Christian Schoenebeck <qemu_oss@crudebyte.com> wrote:
> > On Donnerstag, 17. März 2022 15:12:42 CET Cornelia Huck wrote:
> >> On Wed, Mar 16 2022, Christian Schoenebeck <qemu_oss@crudebyte.com> 
wrote:
> >> > +Since revision 3, \field{indirect_num} exists, which is supposed to
> >> > reflect the +Queue Indirect Size (i.e. the maximum length of indirect
> >> > descriptor tables) +supported by device for this queue.
> >> 
> >> It still depends on the feature flag, though.
> >> 
> >> Maybe
> >> 
> >> "If revision 2 or lower is set, struct vq_config_block extends up to
> >> \field{max_num}.
> >> 
> >> If revision 3 or higher is set, struct vq_config_block extends to
> >> \field{max_indirect_num}. If VIRTIO_RING_F_INDIRECT_SIZE is negotiated,
> >> \field{indirect_max_num} contains the maximum Queue Indirect Size
> >> (i.e. the maximum length of indirect descriptor tables) supported by the
> >> device for this queue; otherwise, its contents are unpredictable."
> > 
> > Maybe rather something like this instead:
> > 
> > "Actual size of struct vq_config_block depends on the virtio-ccw revision.
> > 
> > If revision 2 or lower is set, struct vq_config_block extends up to
> > *including* \field{max_num}.
> > 
> > If revision 3 or higher is set, struct vq_config_block extends up to
> > *including* \field{max_indirect_num}. If VIRTIO_RING_F_INDIRECT_SIZE is
> > negotiated, \field{indirect_max_num} contains the maximum Queue Indirect
> > Size (i.e. the maximum length of indirect descriptor tables) supported by
> > the device for this queue; otherwise, its contents are *undefined*."
> 
> Hm, I was coming from the driver's perspective here. But "undefined"
> probably captures "junk in there, do not use" better.

Like written to Halil moments ago, I would prefer "undefined" but I have no 
strong opininon on it.

> Ack on your other points.
> 
> >> We should include a device normative statement:
> >> 
> >> "The device MUST NOT offer VIRTIO_RING_F_INDIRECT_SIZE, unless at least
> >> revision 3 is set, and therefore \field{max_indirect_num} respectively
> >> \field{indirect_num} are available."
> > 
> > Oh, you are suggesting two different names for those two fields?
> > "max_indirect_num" vs. "indirect_num". If yes, why?
> 
> The "max_indirect_num" goes with the "max_num": this is the maximum
> value that the device supports (the "maximum maximum", in a way :).
> "indirect_num" is the actual upper limit while the device is in use.

Ok, I understand your motivation now, your interpretation aspect for this was:

	device's value >= driver's value

Another interpretation aspect would be though from driver PoV: 'what is the 
maximum bulk size I could send to device?', and in this case you would call 
both fields 'max_indirect_num'.

Therefore I would prefer using the same name 'max_indirect_num' on both 
structs, to also make it clear they are about negotiating the same thing.

> 
> > Otherwise about adding that normative statement in general here: sure,
> > makes sense.



^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [virtio-comment] [PATCH v3 1/4] Add VIRTIO_RING_F_INDIRECT_SIZE
  2022-03-19  9:33   ` [virtio-comment] " Michael S. Tsirkin
@ 2022-03-19 12:00     ` Christian Schoenebeck
  2022-03-20 12:31       ` Michael S. Tsirkin
  0 siblings, 1 reply; 39+ messages in thread
From: Christian Schoenebeck @ 2022-03-19 12:00 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: virtio-comment, Cornelia Huck, Stefan Hajnoczi, Greg Kurz,
	Dominique Martinet, Halil Pasic

On Samstag, 19. März 2022 10:33:49 CET Michael S. Tsirkin wrote:
> On Wed, Mar 16, 2022, 15:47 Christian Schoenebeck <qemu_oss@crudebyte.com>
> 
> wrote:
> > This new feature flag allows to decouple the maximum amount of
> > descriptors in indirect descriptor tables from the Queue Size.
> 
> if we are extending these limits, I suggest reusing the feature flag to
> also add a limit on total s/g list size. making it separate from queue size
> was requested a while ago.

What do you mean with "total s/g list size"? The maximum bulk data size per 
message? Sum of both in and out s/g lists' bulk data or only for one of them? 
Or maximum size of exactly only one memory segment?

And are you suggesting this should become part of this series already?

> > The new term "Queue Indirect Size" is introduced for this purpose,
> > which is a transport specific configuration whose negotiation is
> > further specified for each transport with subsequent patches.
> > 
> > Fixes: https://github.com/oasis-tcs/virtio-spec/issues/122
> > Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
> > Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
> > ---
> > 
> >  content.tex     | 32 ++++++++++++++++++++++++++++++--
> >  packed-ring.tex |  2 +-
> >  split-ring.tex  |  8 ++++++--
> >  3 files changed, 37 insertions(+), 5 deletions(-)
> > 
> > diff --git a/content.tex b/content.tex
> > index c6f116c..685525d 100644
> > --- a/content.tex
> > +++ b/content.tex
> > @@ -99,10 +99,10 @@ \section{Feature Bits}\label{sec:Basic Facilities of a
> > Virtio Device / Feature B
> > 
> >  \begin{description}
> >  \item[0 to 23, and 50 to 127] Feature bits for the specific device type
> > 
> > -\item[24 to 40] Feature bits reserved for extensions to the queue and
> > +\item[24 to 41] Feature bits reserved for extensions to the queue and
> > 
> >    feature negotiation mechanisms
> > 
> > -\item[41 to 49, and 128 and above] Feature bits reserved for future
> > extensions.
> > +\item[42 to 49, and 128 and above] Feature bits reserved for future
> > extensions.
> > 
> >  \end{description}
> >  
> >  \begin{note}
> > 
> > @@ -1051,6 +1051,10 @@ \subsubsection{Common configuration structure
> > layout}\label{sec:Virtio Transport
> > 
> >  present either a value of 0 or a power of 2 in
> >  \field{queue_size}.
> > 
> > +If VIRTIO_RING_F_INDIRECT_SIZE has been negotiated, the device MUST
> > provide the
> > +Queue Indirect Size supported by device, which is a transport specific
> > +configuration. It MUST allow the driver to set a lower value.
> > +
> > 
> >  \drivernormative{\paragraph}{Common configuration structure
> > 
> > layout}{Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout
> > / Common configuration structure layout}
> > 
> >  The driver MUST NOT write to \field{device_feature}, \field{num_queues},
> > 
> > \field{config_generation}, \field{queue_notify_off} or
> > \field{queue_notify_data}.
> > @@ -6847,6 +6851,30 @@ \chapter{Reserved Feature Bits}\label{sec:Reserved
> > Feature Bits}
> > 
> >    that the driver can reset a queue individually.
> >    See \ref{sec:Basic Facilities of a Virtio Device / Virtqueues /
> > 
> > Virtqueue Reset}.
> > 
> > +  \item[VIRTIO_RING_F_INDIRECT_SIZE(41)] This feature indicates that the
> > +  Queue Indirect Size, i.e. the maximum amount of descriptors in indirect
> > +  descriptor tables, is independent from the Queue Size.
> > +
> > +  Without this feature, the Queue Size limits the length of the
> > descriptor
> > +  chain, including indirect descriptor tables as in \ref{sec:Basic
> > Facilities of
> > +  a Virtio Device / Virtqueues / The Virtqueue Descriptor Table /
> > Indirect
> > +  Descriptors}, i.e. both the maximum amount of slots in the vring and
> > the
> > +  actual bulk data size transmitted per vring slot.
> 
> spect does not call these  slots elsewhere.

Yes, I intentionally used "vring slot" instead of "buffer" as I find the 
latter too vague in this context. A "buffer" can be a memory segment, a set of 
memory segments and what not. "vring slot" OTOH makes it clear that it is 
about exactly one, atomic pointer (hence with fixed size) in a Ring Buffer, as 
depicted in the ASCII illustration here:

https://github.com/oasis-tcs/virtio-spec/issues/122

The maximum amount of vring slots is therefore the maximum amount of messages 
that can be emplaced into a Ring Buffer, independent of any "bulk data buffer 
size".

> 
> 
> +
> 
> > +  With this feature enabled, the Queue Size only limits the maximum
> > amount
> > +  of slots in the vring, but does not limit the actual bulk data size
> > +  being transmitted when indirect descriptors are used. Decoupling these
> > +  two configuration parameters this way not only allows much larger bulk
> > data
> > +  being transferred per vring slot, but also avoids complicated
> > synchronization
> > +  mechanisms if the device only supports a very small amount of vring
> > slots. Due
> > +  to the 16-bit size of a descriptor's "next" field there is still an
> > absolute
> > +  limit of $2^{16}$ descriptors per indirect descriptor table. However
> > the
> > +  actual maximum amount supported by either device or driver might be
> > less,
> > +  and therefore the bus specific Queue Indirect Size value MUST
> > additionally
> > +  be negotiated if VIRTIO_RING_F_INDIRECT_SIZE was negotiated to
> > subsequently
> > +  negotiate the actual amount of maximum indirect descriptors supported
> > +  by both sides.
> 
> still not sure what exactly is the value. e.g. in a buffer including
> indirect and direct descriptors.
> 
> +
> 
> >  \end{description}
> >  
> >  \drivernormative{\section}{Reserved Feature Bits}{Reserved Feature Bits}
> > 
> > diff --git a/packed-ring.tex b/packed-ring.tex
> > index a9e6c16..e26d112 100644
> > --- a/packed-ring.tex
> > +++ b/packed-ring.tex
> > @@ -195,7 +195,7 @@ \subsection{Scatter-Gather Support}
> > 
> >  The device limits the number of descriptors in a list through a
> >  transport-specific and/or device-specific value. If not limited,
> >  the maximum number of descriptors in a list is the virt queue
> > 
> > -size.
> > +size unless the VIRTIO_RING_F_INDIRECT_SIZE feature has been negotiated.
> > 
> >  \subsection{Next Flag: Descriptor Chaining}
> >  \label{sec:Packed Virtqueues / Next Flag: Descriptor Chaining}
> > 
> > diff --git a/split-ring.tex b/split-ring.tex
> > index de94038..eaa90c3 100644
> > --- a/split-ring.tex
> > +++ b/split-ring.tex
> > @@ -268,8 +268,12 @@ \subsubsection{Indirect Descriptors}\label{sec:Basic
> > Facilities of a Virtio Devi
> > 
> >  set the VIRTQ_DESC_F_INDIRECT flag within an indirect descriptor (ie.
> >  only
> >  one table per descriptor).
> > 
> > -A driver MUST NOT create a descriptor chain longer than the Queue Size of
> > -the device.
> 
> +If VIRTIO_RING_F_INDIRECT_SIZE has not been negotiated, the driver MUST
> 
> > +NOT create a descriptor chain longer than the Queue Size of the device.
> > +
> > +If VIRTIO_RING_F_INDIRECT_SIZE has been negotiated, the number of
> > +descriptors per indirect descriptor table MUST NOT exceed the negotiated
> > +Queue Indirect Size.
> 
> it is not negotiated is it?

What makes you think it is not negotiated?

> 
> >  A driver MUST NOT set both VIRTQ_DESC_F_INDIRECT and VIRTQ_DESC_F_NEXT
> >  in \field{flags}.
> > 
> > --
> > 2.30.2
> > 
> > 
> > 
> > 
> > 
> > This publicly archived list offers a means to provide input to the
> > OASIS Virtual I/O Device (VIRTIO) TC.
> > 
> > In order to verify user consent to the Feedback License terms and
> > to minimize spam in the list archive, subscription is required
> > before posting.
> > 
> > Subscribe: virtio-comment-subscribe@lists.oasis-open.org
> > Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
> > List help: virtio-comment-help@lists.oasis-open.org
> > List archive: https://lists.oasis-open.org/archives/virtio-comment/
> > Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
> > List Guidelines:
> > https://www.oasis-open.org/policies-guidelines/mailing-lists
> > Committee: https://www.oasis-open.org/committees/virtio/
> > Join OASIS: https://www.oasis-open.org/join/



^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [virtio-comment] [PATCH v3 1/4] Add VIRTIO_RING_F_INDIRECT_SIZE
  2022-03-19 12:00     ` Christian Schoenebeck
@ 2022-03-20 12:31       ` Michael S. Tsirkin
  2022-03-20 13:32         ` Christian Schoenebeck
  0 siblings, 1 reply; 39+ messages in thread
From: Michael S. Tsirkin @ 2022-03-20 12:31 UTC (permalink / raw)
  To: Christian Schoenebeck
  Cc: virtio-comment, Cornelia Huck, Stefan Hajnoczi, Greg Kurz,
	Dominique Martinet, Halil Pasic

On Sat, Mar 19, 2022 at 01:00:28PM +0100, Christian Schoenebeck wrote:
> On Samstag, 19. März 2022 10:33:49 CET Michael S. Tsirkin wrote:
> > On Wed, Mar 16, 2022, 15:47 Christian Schoenebeck <qemu_oss@crudebyte.com>
> > 
> > wrote:
> > > This new feature flag allows to decouple the maximum amount of
> > > descriptors in indirect descriptor tables from the Queue Size.
> > 
> > if we are extending these limits, I suggest reusing the feature flag to
> > also add a limit on total s/g list size. making it separate from queue size
> > was requested a while ago.
> 
> What do you mean with "total s/g list size"? The maximum bulk data size per 
> message?
> Sum of both in and out s/g lists' bulk data or only for one of them? 
> Or maximum size of exactly only one memory segment?

I don't really know what does "bulk data size" mean. Suggest we use
terminology from the spec. A buffer includes a group of direct and/or
indirect descriptors, in turn indirect descriptors point to direct
descriptors.


What has been requested a while is ability to limit per vq the
# of direct descriptors in a buffer.


Since IIUC what you want to do is allow more descriptors
than VQ size, then one way to achieve that is just to have
a per VQ limit on descriptor size and have that limit > VQ size.

Another thing related is that people wanted to block indirect
descriptors for some VQs. Not yet sure how to combine that
with this proposal, worth thinking about.


> 
> And are you suggesting this should become part of this series already?

yes since it's touching mostly same areas in the spec.

> > > The new term "Queue Indirect Size" is introduced for this purpose,
> > > which is a transport specific configuration whose negotiation is
> > > further specified for each transport with subsequent patches.
> > > 
> > > Fixes: https://github.com/oasis-tcs/virtio-spec/issues/122
> > > Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
> > > Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
> > > ---
> > > 
> > >  content.tex     | 32 ++++++++++++++++++++++++++++++--
> > >  packed-ring.tex |  2 +-
> > >  split-ring.tex  |  8 ++++++--
> > >  3 files changed, 37 insertions(+), 5 deletions(-)
> > > 
> > > diff --git a/content.tex b/content.tex
> > > index c6f116c..685525d 100644
> > > --- a/content.tex
> > > +++ b/content.tex
> > > @@ -99,10 +99,10 @@ \section{Feature Bits}\label{sec:Basic Facilities of a
> > > Virtio Device / Feature B
> > > 
> > >  \begin{description}
> > >  \item[0 to 23, and 50 to 127] Feature bits for the specific device type
> > > 
> > > -\item[24 to 40] Feature bits reserved for extensions to the queue and
> > > +\item[24 to 41] Feature bits reserved for extensions to the queue and
> > > 
> > >    feature negotiation mechanisms
> > > 
> > > -\item[41 to 49, and 128 and above] Feature bits reserved for future
> > > extensions.
> > > +\item[42 to 49, and 128 and above] Feature bits reserved for future
> > > extensions.
> > > 
> > >  \end{description}
> > >  
> > >  \begin{note}
> > > 
> > > @@ -1051,6 +1051,10 @@ \subsubsection{Common configuration structure
> > > layout}\label{sec:Virtio Transport
> > > 
> > >  present either a value of 0 or a power of 2 in
> > >  \field{queue_size}.
> > > 
> > > +If VIRTIO_RING_F_INDIRECT_SIZE has been negotiated, the device MUST
> > > provide the
> > > +Queue Indirect Size supported by device, which is a transport specific
> > > +configuration. It MUST allow the driver to set a lower value.
> > > +
> > > 
> > >  \drivernormative{\paragraph}{Common configuration structure
> > > 
> > > layout}{Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout
> > > / Common configuration structure layout}
> > > 
> > >  The driver MUST NOT write to \field{device_feature}, \field{num_queues},
> > > 
> > > \field{config_generation}, \field{queue_notify_off} or
> > > \field{queue_notify_data}.
> > > @@ -6847,6 +6851,30 @@ \chapter{Reserved Feature Bits}\label{sec:Reserved
> > > Feature Bits}
> > > 
> > >    that the driver can reset a queue individually.
> > >    See \ref{sec:Basic Facilities of a Virtio Device / Virtqueues /
> > > 
> > > Virtqueue Reset}.
> > > 
> > > +  \item[VIRTIO_RING_F_INDIRECT_SIZE(41)] This feature indicates that the
> > > +  Queue Indirect Size, i.e. the maximum amount of descriptors in indirect
> > > +  descriptor tables, is independent from the Queue Size.
> > > +
> > > +  Without this feature, the Queue Size limits the length of the
> > > descriptor
> > > +  chain, including indirect descriptor tables as in \ref{sec:Basic
> > > Facilities of
> > > +  a Virtio Device / Virtqueues / The Virtqueue Descriptor Table /
> > > Indirect
> > > +  Descriptors}, i.e. both the maximum amount of slots in the vring and
> > > the
> > > +  actual bulk data size transmitted per vring slot.
> > 
> > spect does not call these  slots elsewhere.
> 
> Yes, I intentionally used "vring slot" instead of "buffer" as I find the 
> latter too vague in this context. A "buffer" can be a memory segment, a set of 
> memory segments and what not. "vring slot" OTOH makes it clear that it is 
> about exactly one, atomic pointer (hence with fixed size) in a Ring Buffer, as 
> depicted in the ASCII illustration here:
> 
> https://github.com/oasis-tcs/virtio-spec/issues/122
> 
> The maximum amount of vring slots is therefore the maximum amount of messages 
> that can be emplaced into a Ring Buffer, independent of any "bulk data buffer 
> size".
> 
> > 
> > 
> > +
> > 
> > > +  With this feature enabled, the Queue Size only limits the maximum
> > > amount
> > > +  of slots in the vring, but does not limit the actual bulk data size
> > > +  being transmitted when indirect descriptors are used. Decoupling these
> > > +  two configuration parameters this way not only allows much larger bulk
> > > data
> > > +  being transferred per vring slot, but also avoids complicated
> > > synchronization
> > > +  mechanisms if the device only supports a very small amount of vring
> > > slots. Due
> > > +  to the 16-bit size of a descriptor's "next" field there is still an
> > > absolute
> > > +  limit of $2^{16}$ descriptors per indirect descriptor table. However
> > > the
> > > +  actual maximum amount supported by either device or driver might be
> > > less,
> > > +  and therefore the bus specific Queue Indirect Size value MUST
> > > additionally
> > > +  be negotiated if VIRTIO_RING_F_INDIRECT_SIZE was negotiated to
> > > subsequently
> > > +  negotiate the actual amount of maximum indirect descriptors supported
> > > +  by both sides.
> > 
> > still not sure what exactly is the value. e.g. in a buffer including
> > indirect and direct descriptors.
> > 
> > +
> > 
> > >  \end{description}
> > >  
> > >  \drivernormative{\section}{Reserved Feature Bits}{Reserved Feature Bits}
> > > 
> > > diff --git a/packed-ring.tex b/packed-ring.tex
> > > index a9e6c16..e26d112 100644
> > > --- a/packed-ring.tex
> > > +++ b/packed-ring.tex
> > > @@ -195,7 +195,7 @@ \subsection{Scatter-Gather Support}
> > > 
> > >  The device limits the number of descriptors in a list through a
> > >  transport-specific and/or device-specific value. If not limited,
> > >  the maximum number of descriptors in a list is the virt queue
> > > 
> > > -size.
> > > +size unless the VIRTIO_RING_F_INDIRECT_SIZE feature has been negotiated.
> > > 
> > >  \subsection{Next Flag: Descriptor Chaining}
> > >  \label{sec:Packed Virtqueues / Next Flag: Descriptor Chaining}
> > > 
> > > diff --git a/split-ring.tex b/split-ring.tex
> > > index de94038..eaa90c3 100644
> > > --- a/split-ring.tex
> > > +++ b/split-ring.tex
> > > @@ -268,8 +268,12 @@ \subsubsection{Indirect Descriptors}\label{sec:Basic
> > > Facilities of a Virtio Devi
> > > 
> > >  set the VIRTQ_DESC_F_INDIRECT flag within an indirect descriptor (ie.
> > >  only
> > >  one table per descriptor).
> > > 
> > > -A driver MUST NOT create a descriptor chain longer than the Queue Size of
> > > -the device.
> > 
> > +If VIRTIO_RING_F_INDIRECT_SIZE has not been negotiated, the driver MUST
> > 
> > > +NOT create a descriptor chain longer than the Queue Size of the device.
> > > +
> > > +If VIRTIO_RING_F_INDIRECT_SIZE has been negotiated, the number of
> > > +descriptors per indirect descriptor table MUST NOT exceed the negotiated
> > > +Queue Indirect Size.
> > 
> > it is not negotiated is it?
> 
> What makes you think it is not negotiated?
> 
> > 
> > >  A driver MUST NOT set both VIRTQ_DESC_F_INDIRECT and VIRTQ_DESC_F_NEXT
> > >  in \field{flags}.
> > > 
> > > --
> > > 2.30.2
> > > 
> > > 
> > > 
> > > 
> > > 
> > > This publicly archived list offers a means to provide input to the
> > > OASIS Virtual I/O Device (VIRTIO) TC.
> > > 
> > > In order to verify user consent to the Feedback License terms and
> > > to minimize spam in the list archive, subscription is required
> > > before posting.
> > > 
> > > Subscribe: virtio-comment-subscribe@lists.oasis-open.org
> > > Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
> > > List help: virtio-comment-help@lists.oasis-open.org
> > > List archive: https://lists.oasis-open.org/archives/virtio-comment/
> > > Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
> > > List Guidelines:
> > > https://www.oasis-open.org/policies-guidelines/mailing-lists
> > > Committee: https://www.oasis-open.org/committees/virtio/
> > > Join OASIS: https://www.oasis-open.org/join/
> 


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [virtio-comment] [PATCH v3 1/4] Add VIRTIO_RING_F_INDIRECT_SIZE
  2022-03-20 12:31       ` Michael S. Tsirkin
@ 2022-03-20 13:32         ` Christian Schoenebeck
  2022-03-20 13:55           ` Michael S. Tsirkin
  0 siblings, 1 reply; 39+ messages in thread
From: Christian Schoenebeck @ 2022-03-20 13:32 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: virtio-comment, Cornelia Huck, Stefan Hajnoczi, Greg Kurz,
	Dominique Martinet, Halil Pasic

On Sonntag, 20. März 2022 13:31:51 CET Michael S. Tsirkin wrote:
> On Sat, Mar 19, 2022 at 01:00:28PM +0100, Christian Schoenebeck wrote:
> > On Samstag, 19. März 2022 10:33:49 CET Michael S. Tsirkin wrote:
> > > On Wed, Mar 16, 2022, 15:47 Christian Schoenebeck
> > > <qemu_oss@crudebyte.com>
> > > 
> > > wrote:
> > > > This new feature flag allows to decouple the maximum amount of
> > > > descriptors in indirect descriptor tables from the Queue Size.
> > > 
> > > if we are extending these limits, I suggest reusing the feature flag to
> > > also add a limit on total s/g list size. making it separate from queue
> > > size
> > > was requested a while ago.
> > 
> > What do you mean with "total s/g list size"? The maximum bulk data size
> > per
> > message?
> > Sum of both in and out s/g lists' bulk data or only for one of them?
> > Or maximum size of exactly only one memory segment?
> 
> I don't really know what does "bulk data size" mean. Suggest we use
> terminology from the spec. A buffer includes a group of direct and/or
> indirect descriptors, in turn indirect descriptors point to direct
> descriptors.

I already described why I think it makes sense not calling it "buffer" in this 
particular context. So I am against changing this to "buffer". If other people 
support your position then I'll change it though of course.

About "bulk data size": "Bulk data" is the user data actually being used on 
application/driver level, i.e. above virtio level, and "Bulk data size" the 
size of that data. See ASCII illustration here:
https://github.com/oasis-tcs/virtio-spec/issues/122

The terminology "bulk data" is already used in the spec already BTW.

> What has been requested a while is ability to limit per vq the
> # of direct descriptors in a buffer.

So in cases where indirect descriptors are *not* used. This series is about 
indirect descriptors only though.

> Since IIUC what you want to do is allow more descriptors
> than VQ size, then one way to achieve that is just to have
> a per VQ limit on descriptor size and have that limit > VQ size.

Sorry, I can't follow you. What do you mean with "descriptor size"? For me a 
descriptor has a predefined constant struct size. You mean the size of one 
memory segment referenced by one indirect descriptor? And why would it be 
better than what this series suggests?

> Another thing related is that people wanted to block indirect
> descriptors for some VQs. Not yet sure how to combine that
> with this proposal, worth thinking about.

This series allows both, increasing *and* decreasing the number of indirect 
descriptors per VQ already.

> > And are you suggesting this should become part of this series already?
> 
> yes since it's touching mostly same areas in the spec.

:/ Please note that I sent the first draft on this issue already in November 
last year, and have not seen any response from your side so far. I actually 
assumed we were already at a point where it was just about precise wording et 
al., not restarting to redesign everything again from scratch now.

> > > > The new term "Queue Indirect Size" is introduced for this purpose,
> > > > which is a transport specific configuration whose negotiation is
> > > > further specified for each transport with subsequent patches.
> > > > 
> > > > Fixes: https://github.com/oasis-tcs/virtio-spec/issues/122
> > > > Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
> > > > Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
> > > > ---
> > > > 
> > > >  content.tex     | 32 ++++++++++++++++++++++++++++++--
> > > >  packed-ring.tex |  2 +-
> > > >  split-ring.tex  |  8 ++++++--
> > > >  3 files changed, 37 insertions(+), 5 deletions(-)
> > > > 
> > > > diff --git a/content.tex b/content.tex
> > > > index c6f116c..685525d 100644
> > > > --- a/content.tex
> > > > +++ b/content.tex
> > > > @@ -99,10 +99,10 @@ \section{Feature Bits}\label{sec:Basic Facilities
> > > > of a
> > > > Virtio Device / Feature B
> > > > 
> > > >  \begin{description}
> > > >  \item[0 to 23, and 50 to 127] Feature bits for the specific device
> > > >  type
> > > > 
> > > > -\item[24 to 40] Feature bits reserved for extensions to the queue and
> > > > +\item[24 to 41] Feature bits reserved for extensions to the queue and
> > > > 
> > > >    feature negotiation mechanisms
> > > > 
> > > > -\item[41 to 49, and 128 and above] Feature bits reserved for future
> > > > extensions.
> > > > +\item[42 to 49, and 128 and above] Feature bits reserved for future
> > > > extensions.
> > > > 
> > > >  \end{description}
> > > >  
> > > >  \begin{note}
> > > > 
> > > > @@ -1051,6 +1051,10 @@ \subsubsection{Common configuration structure
> > > > layout}\label{sec:Virtio Transport
> > > > 
> > > >  present either a value of 0 or a power of 2 in
> > > >  \field{queue_size}.
> > > > 
> > > > +If VIRTIO_RING_F_INDIRECT_SIZE has been negotiated, the device MUST
> > > > provide the
> > > > +Queue Indirect Size supported by device, which is a transport
> > > > specific
> > > > +configuration. It MUST allow the driver to set a lower value.
> > > > +
> > > > 
> > > >  \drivernormative{\paragraph}{Common configuration structure
> > > > 
> > > > layout}{Virtio Transport Options / Virtio Over PCI Bus / PCI Device
> > > > Layout
> > > > / Common configuration structure layout}
> > > > 
> > > >  The driver MUST NOT write to \field{device_feature},
> > > >  \field{num_queues},
> > > > 
> > > > \field{config_generation}, \field{queue_notify_off} or
> > > > \field{queue_notify_data}.
> > > > @@ -6847,6 +6851,30 @@ \chapter{Reserved Feature
> > > > Bits}\label{sec:Reserved
> > > > Feature Bits}
> > > > 
> > > >    that the driver can reset a queue individually.
> > > >    See \ref{sec:Basic Facilities of a Virtio Device / Virtqueues /
> > > > 
> > > > Virtqueue Reset}.
> > > > 
> > > > +  \item[VIRTIO_RING_F_INDIRECT_SIZE(41)] This feature indicates that
> > > > the
> > > > +  Queue Indirect Size, i.e. the maximum amount of descriptors in
> > > > indirect
> > > > +  descriptor tables, is independent from the Queue Size.
> > > > +
> > > > +  Without this feature, the Queue Size limits the length of the
> > > > descriptor
> > > > +  chain, including indirect descriptor tables as in \ref{sec:Basic
> > > > Facilities of
> > > > +  a Virtio Device / Virtqueues / The Virtqueue Descriptor Table /
> > > > Indirect
> > > > +  Descriptors}, i.e. both the maximum amount of slots in the vring
> > > > and
> > > > the
> > > > +  actual bulk data size transmitted per vring slot.
> > > 
> > > spect does not call these  slots elsewhere.
> > 
> > Yes, I intentionally used "vring slot" instead of "buffer" as I find the
> > latter too vague in this context. A "buffer" can be a memory segment, a
> > set of memory segments and what not. "vring slot" OTOH makes it clear
> > that it is about exactly one, atomic pointer (hence with fixed size) in a
> > Ring Buffer, as depicted in the ASCII illustration here:
> > 
> > https://github.com/oasis-tcs/virtio-spec/issues/122
> > 
> > The maximum amount of vring slots is therefore the maximum amount of
> > messages that can be emplaced into a Ring Buffer, independent of any
> > "bulk data buffer size".
> > 
> > > +
> > > 
> > > > +  With this feature enabled, the Queue Size only limits the maximum
> > > > amount
> > > > +  of slots in the vring, but does not limit the actual bulk data size
> > > > +  being transmitted when indirect descriptors are used. Decoupling
> > > > these
> > > > +  two configuration parameters this way not only allows much larger
> > > > bulk
> > > > data
> > > > +  being transferred per vring slot, but also avoids complicated
> > > > synchronization
> > > > +  mechanisms if the device only supports a very small amount of vring
> > > > slots. Due
> > > > +  to the 16-bit size of a descriptor's "next" field there is still an
> > > > absolute
> > > > +  limit of $2^{16}$ descriptors per indirect descriptor table.
> > > > However
> > > > the
> > > > +  actual maximum amount supported by either device or driver might be
> > > > less,
> > > > +  and therefore the bus specific Queue Indirect Size value MUST
> > > > additionally
> > > > +  be negotiated if VIRTIO_RING_F_INDIRECT_SIZE was negotiated to
> > > > subsequently
> > > > +  negotiate the actual amount of maximum indirect descriptors
> > > > supported
> > > > +  by both sides.
> > > 
> > > still not sure what exactly is the value. e.g. in a buffer including
> > > indirect and direct descriptors.
> > > 
> > > +
> > > 
> > > >  \end{description}
> > > >  
> > > >  \drivernormative{\section}{Reserved Feature Bits}{Reserved Feature
> > > >  Bits}
> > > > 
> > > > diff --git a/packed-ring.tex b/packed-ring.tex
> > > > index a9e6c16..e26d112 100644
> > > > --- a/packed-ring.tex
> > > > +++ b/packed-ring.tex
> > > > @@ -195,7 +195,7 @@ \subsection{Scatter-Gather Support}
> > > > 
> > > >  The device limits the number of descriptors in a list through a
> > > >  transport-specific and/or device-specific value. If not limited,
> > > >  the maximum number of descriptors in a list is the virt queue
> > > > 
> > > > -size.
> > > > +size unless the VIRTIO_RING_F_INDIRECT_SIZE feature has been
> > > > negotiated.
> > > > 
> > > >  \subsection{Next Flag: Descriptor Chaining}
> > > >  \label{sec:Packed Virtqueues / Next Flag: Descriptor Chaining}
> > > > 
> > > > diff --git a/split-ring.tex b/split-ring.tex
> > > > index de94038..eaa90c3 100644
> > > > --- a/split-ring.tex
> > > > +++ b/split-ring.tex
> > > > @@ -268,8 +268,12 @@ \subsubsection{Indirect
> > > > Descriptors}\label{sec:Basic
> > > > Facilities of a Virtio Devi
> > > > 
> > > >  set the VIRTQ_DESC_F_INDIRECT flag within an indirect descriptor (ie.
> > > >  only
> > > >  one table per descriptor).
> > > > 
> > > > -A driver MUST NOT create a descriptor chain longer than the Queue
> > > > Size of
> > > > -the device.
> > > 
> > > +If VIRTIO_RING_F_INDIRECT_SIZE has not been negotiated, the driver MUST
> > > 
> > > > +NOT create a descriptor chain longer than the Queue Size of the
> > > > device.
> > > > +
> > > > +If VIRTIO_RING_F_INDIRECT_SIZE has been negotiated, the number of
> > > > +descriptors per indirect descriptor table MUST NOT exceed the
> > > > negotiated
> > > > +Queue Indirect Size.
> > > 
> > > it is not negotiated is it?
> > 
> > What makes you think it is not negotiated?

Also see my previous question here ^

> > > >  A driver MUST NOT set both VIRTQ_DESC_F_INDIRECT and
> > > >  VIRTQ_DESC_F_NEXT
> > > >  in \field{flags}.
> > > > 
> > > > --
> > > > 2.30.2
> > > > 
> > > > 
> > > > 
> > > > 
> > > > 
> > > > This publicly archived list offers a means to provide input to the
> > > > OASIS Virtual I/O Device (VIRTIO) TC.
> > > > 
> > > > In order to verify user consent to the Feedback License terms and
> > > > to minimize spam in the list archive, subscription is required
> > > > before posting.
> > > > 
> > > > Subscribe: virtio-comment-subscribe@lists.oasis-open.org
> > > > Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
> > > > List help: virtio-comment-help@lists.oasis-open.org
> > > > List archive: https://lists.oasis-open.org/archives/virtio-comment/
> > > > Feedback License:
> > > > https://www.oasis-open.org/who/ipr/feedback_license.pdf
> > > > List Guidelines:
> > > > https://www.oasis-open.org/policies-guidelines/mailing-lists
> > > > Committee: https://www.oasis-open.org/committees/virtio/
> > > > Join OASIS: https://www.oasis-open.org/join/



This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [virtio-comment] [PATCH v3 1/4] Add VIRTIO_RING_F_INDIRECT_SIZE
  2022-03-20 13:32         ` Christian Schoenebeck
@ 2022-03-20 13:55           ` Michael S. Tsirkin
  2022-03-20 15:17             ` Christian Schoenebeck
  0 siblings, 1 reply; 39+ messages in thread
From: Michael S. Tsirkin @ 2022-03-20 13:55 UTC (permalink / raw)
  To: Christian Schoenebeck
  Cc: virtio-comment, Cornelia Huck, Stefan Hajnoczi, Greg Kurz,
	Dominique Martinet, Halil Pasic

On Sun, Mar 20, 2022 at 02:32:23PM +0100, Christian Schoenebeck wrote:
> On Sonntag, 20. März 2022 13:31:51 CET Michael S. Tsirkin wrote:
> > On Sat, Mar 19, 2022 at 01:00:28PM +0100, Christian Schoenebeck wrote:
> > > On Samstag, 19. März 2022 10:33:49 CET Michael S. Tsirkin wrote:
> > > > On Wed, Mar 16, 2022, 15:47 Christian Schoenebeck
> > > > <qemu_oss@crudebyte.com>
> > > > 
> > > > wrote:
> > > > > This new feature flag allows to decouple the maximum amount of
> > > > > descriptors in indirect descriptor tables from the Queue Size.
> > > > 
> > > > if we are extending these limits, I suggest reusing the feature flag to
> > > > also add a limit on total s/g list size. making it separate from queue
> > > > size
> > > > was requested a while ago.
> > > 
> > > What do you mean with "total s/g list size"? The maximum bulk data size
> > > per
> > > message?
> > > Sum of both in and out s/g lists' bulk data or only for one of them?
> > > Or maximum size of exactly only one memory segment?
> > 
> > I don't really know what does "bulk data size" mean. Suggest we use
> > terminology from the spec. A buffer includes a group of direct and/or
> > indirect descriptors, in turn indirect descriptors point to direct
> > descriptors.
> 
> I already described why I think it makes sense not calling it "buffer" in this 
> particular context. So I am against changing this to "buffer".

Well spec just defines what buffers are.
If you are using a different term then you need to define it in the
spec.

> If other people 
> support your position then I'll change it though of course.
> 
> About "bulk data size": "Bulk data" is the user data actually being used on 
> application/driver level, i.e. above virtio level, and "Bulk data size" the 
> size of that data. See ASCII illustration here:
> https://github.com/oasis-tcs/virtio-spec/issues/122
> 
> The terminology "bulk data" is already used in the spec already BTW.

It does not refer to anything specific though, just generally
to vqs for passing lots of data as opposed to config space used
to pass small amount of data.

> > What has been requested a while is ability to limit per vq the
> > # of direct descriptors in a buffer.
> 
> So in cases where indirect descriptors are *not* used. This series is about 
> indirect descriptors only though.

No, in all cases.

> > Since IIUC what you want to do is allow more descriptors
> > than VQ size, then one way to achieve that is just to have
> > a per VQ limit on descriptor size and have that limit > VQ size.
> 
> Sorry, I can't follow you. What do you mean with "descriptor size"?
> For me a 
> descriptor has a predefined constant struct size. You mean the size of one 
> memory segment referenced by one indirect descriptor? And why would it be 
> better than what this series suggests?

My bad. What I meant is a "per VQ limit on # of direct descriptors
per buffer".

> 
> > Another thing related is that people wanted to block indirect
> > descriptors for some VQs. Not yet sure how to combine that
> > with this proposal, worth thinking about.
> 
> This series allows both, increasing *and* decreasing the number of indirect 
> descriptors per VQ already.

I don't see how you block indirect descritors for a queue with this.
Did I miss it?

Also I think you missed the fact that
a direct descriptor can point to an indirect one, the
result is that max # of descriptors in a buffer is then:

queue size - 1 + indirect table size 

I don't see how your proposal limits the # of descriptors
below queue size since guest is never forced to use
indirect.


> > > And are you suggesting this should become part of this series already?
> > 
> > yes since it's touching mostly same areas in the spec.
> 
> :/ Please note that I sent the first draft on this issue already in November 
> last year, and have not seen any response from your side so far. I actually 
> assumed we were already at a point where it was just about precise wording et 
> al., not restarting to redesign everything again from scratch now.

Sorry about that.

> > > > > The new term "Queue Indirect Size" is introduced for this purpose,
> > > > > which is a transport specific configuration whose negotiation is
> > > > > further specified for each transport with subsequent patches.
> > > > > 
> > > > > Fixes: https://github.com/oasis-tcs/virtio-spec/issues/122
> > > > > Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
> > > > > Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
> > > > > ---
> > > > > 
> > > > >  content.tex     | 32 ++++++++++++++++++++++++++++++--
> > > > >  packed-ring.tex |  2 +-
> > > > >  split-ring.tex  |  8 ++++++--
> > > > >  3 files changed, 37 insertions(+), 5 deletions(-)
> > > > > 
> > > > > diff --git a/content.tex b/content.tex
> > > > > index c6f116c..685525d 100644
> > > > > --- a/content.tex
> > > > > +++ b/content.tex
> > > > > @@ -99,10 +99,10 @@ \section{Feature Bits}\label{sec:Basic Facilities
> > > > > of a
> > > > > Virtio Device / Feature B
> > > > > 
> > > > >  \begin{description}
> > > > >  \item[0 to 23, and 50 to 127] Feature bits for the specific device
> > > > >  type
> > > > > 
> > > > > -\item[24 to 40] Feature bits reserved for extensions to the queue and
> > > > > +\item[24 to 41] Feature bits reserved for extensions to the queue and
> > > > > 
> > > > >    feature negotiation mechanisms
> > > > > 
> > > > > -\item[41 to 49, and 128 and above] Feature bits reserved for future
> > > > > extensions.
> > > > > +\item[42 to 49, and 128 and above] Feature bits reserved for future
> > > > > extensions.
> > > > > 
> > > > >  \end{description}
> > > > >  
> > > > >  \begin{note}
> > > > > 
> > > > > @@ -1051,6 +1051,10 @@ \subsubsection{Common configuration structure
> > > > > layout}\label{sec:Virtio Transport
> > > > > 
> > > > >  present either a value of 0 or a power of 2 in
> > > > >  \field{queue_size}.
> > > > > 
> > > > > +If VIRTIO_RING_F_INDIRECT_SIZE has been negotiated, the device MUST
> > > > > provide the
> > > > > +Queue Indirect Size supported by device, which is a transport
> > > > > specific
> > > > > +configuration. It MUST allow the driver to set a lower value.
> > > > > +
> > > > > 
> > > > >  \drivernormative{\paragraph}{Common configuration structure
> > > > > 
> > > > > layout}{Virtio Transport Options / Virtio Over PCI Bus / PCI Device
> > > > > Layout
> > > > > / Common configuration structure layout}
> > > > > 
> > > > >  The driver MUST NOT write to \field{device_feature},
> > > > >  \field{num_queues},
> > > > > 
> > > > > \field{config_generation}, \field{queue_notify_off} or
> > > > > \field{queue_notify_data}.
> > > > > @@ -6847,6 +6851,30 @@ \chapter{Reserved Feature
> > > > > Bits}\label{sec:Reserved
> > > > > Feature Bits}
> > > > > 
> > > > >    that the driver can reset a queue individually.
> > > > >    See \ref{sec:Basic Facilities of a Virtio Device / Virtqueues /
> > > > > 
> > > > > Virtqueue Reset}.
> > > > > 
> > > > > +  \item[VIRTIO_RING_F_INDIRECT_SIZE(41)] This feature indicates that
> > > > > the
> > > > > +  Queue Indirect Size, i.e. the maximum amount of descriptors in
> > > > > indirect
> > > > > +  descriptor tables, is independent from the Queue Size.
> > > > > +
> > > > > +  Without this feature, the Queue Size limits the length of the
> > > > > descriptor
> > > > > +  chain, including indirect descriptor tables as in \ref{sec:Basic
> > > > > Facilities of
> > > > > +  a Virtio Device / Virtqueues / The Virtqueue Descriptor Table /
> > > > > Indirect
> > > > > +  Descriptors}, i.e. both the maximum amount of slots in the vring
> > > > > and
> > > > > the
> > > > > +  actual bulk data size transmitted per vring slot.
> > > > 
> > > > spect does not call these  slots elsewhere.
> > > 
> > > Yes, I intentionally used "vring slot" instead of "buffer" as I find the
> > > latter too vague in this context. A "buffer" can be a memory segment, a
> > > set of memory segments and what not. "vring slot" OTOH makes it clear
> > > that it is about exactly one, atomic pointer (hence with fixed size) in a
> > > Ring Buffer, as depicted in the ASCII illustration here:
> > > 
> > > https://github.com/oasis-tcs/virtio-spec/issues/122
> > > 
> > > The maximum amount of vring slots is therefore the maximum amount of
> > > messages that can be emplaced into a Ring Buffer, independent of any
> > > "bulk data buffer size".
> > > 
> > > > +
> > > > 
> > > > > +  With this feature enabled, the Queue Size only limits the maximum
> > > > > amount
> > > > > +  of slots in the vring, but does not limit the actual bulk data size
> > > > > +  being transmitted when indirect descriptors are used. Decoupling
> > > > > these
> > > > > +  two configuration parameters this way not only allows much larger
> > > > > bulk
> > > > > data
> > > > > +  being transferred per vring slot, but also avoids complicated
> > > > > synchronization
> > > > > +  mechanisms if the device only supports a very small amount of vring
> > > > > slots. Due
> > > > > +  to the 16-bit size of a descriptor's "next" field there is still an
> > > > > absolute
> > > > > +  limit of $2^{16}$ descriptors per indirect descriptor table.
> > > > > However
> > > > > the
> > > > > +  actual maximum amount supported by either device or driver might be
> > > > > less,
> > > > > +  and therefore the bus specific Queue Indirect Size value MUST
> > > > > additionally
> > > > > +  be negotiated if VIRTIO_RING_F_INDIRECT_SIZE was negotiated to
> > > > > subsequently
> > > > > +  negotiate the actual amount of maximum indirect descriptors
> > > > > supported
> > > > > +  by both sides.
> > > > 
> > > > still not sure what exactly is the value. e.g. in a buffer including
> > > > indirect and direct descriptors.
> > > > 
> > > > +
> > > > 
> > > > >  \end{description}
> > > > >  
> > > > >  \drivernormative{\section}{Reserved Feature Bits}{Reserved Feature
> > > > >  Bits}
> > > > > 
> > > > > diff --git a/packed-ring.tex b/packed-ring.tex
> > > > > index a9e6c16..e26d112 100644
> > > > > --- a/packed-ring.tex
> > > > > +++ b/packed-ring.tex
> > > > > @@ -195,7 +195,7 @@ \subsection{Scatter-Gather Support}
> > > > > 
> > > > >  The device limits the number of descriptors in a list through a
> > > > >  transport-specific and/or device-specific value. If not limited,
> > > > >  the maximum number of descriptors in a list is the virt queue
> > > > > 
> > > > > -size.
> > > > > +size unless the VIRTIO_RING_F_INDIRECT_SIZE feature has been
> > > > > negotiated.
> > > > > 
> > > > >  \subsection{Next Flag: Descriptor Chaining}
> > > > >  \label{sec:Packed Virtqueues / Next Flag: Descriptor Chaining}
> > > > > 
> > > > > diff --git a/split-ring.tex b/split-ring.tex
> > > > > index de94038..eaa90c3 100644
> > > > > --- a/split-ring.tex
> > > > > +++ b/split-ring.tex
> > > > > @@ -268,8 +268,12 @@ \subsubsection{Indirect
> > > > > Descriptors}\label{sec:Basic
> > > > > Facilities of a Virtio Devi
> > > > > 
> > > > >  set the VIRTQ_DESC_F_INDIRECT flag within an indirect descriptor (ie.
> > > > >  only
> > > > >  one table per descriptor).
> > > > > 
> > > > > -A driver MUST NOT create a descriptor chain longer than the Queue
> > > > > Size of
> > > > > -the device.
> > > > 
> > > > +If VIRTIO_RING_F_INDIRECT_SIZE has not been negotiated, the driver MUST
> > > > 
> > > > > +NOT create a descriptor chain longer than the Queue Size of the
> > > > > device.
> > > > > +
> > > > > +If VIRTIO_RING_F_INDIRECT_SIZE has been negotiated, the number of
> > > > > +descriptors per indirect descriptor table MUST NOT exceed the
> > > > > negotiated
> > > > > +Queue Indirect Size.
> > > > 
> > > > it is not negotiated is it?
> > > 
> > > What makes you think it is not negotiated?
> 
> Also see my previous question here ^


Sorry, what I mean is that you don't define what does negotiation
involve. I think you mean this:

	The driver SHOULD write to \field{queue_indirect_size} if its maximum
	number of descriptors per vring slot is lower than that reported by the
	device.

but driver can just read the value and that's it - and then the value
that is set by device applies, right?

If you are going to use terms such as negotiated you need to define what
they mean. In this case I would just say something like
"the value of Queue Indirect Size".



> > > > >  A driver MUST NOT set both VIRTQ_DESC_F_INDIRECT and
> > > > >  VIRTQ_DESC_F_NEXT
> > > > >  in \field{flags}.
> > > > > 
> > > > > --
> > > > > 2.30.2
> > > > > 
> > > > > 
> > > > > 
> > > > > 
> > > > > 
> > > > > This publicly archived list offers a means to provide input to the
> > > > > OASIS Virtual I/O Device (VIRTIO) TC.
> > > > > 
> > > > > In order to verify user consent to the Feedback License terms and
> > > > > to minimize spam in the list archive, subscription is required
> > > > > before posting.
> > > > > 
> > > > > Subscribe: virtio-comment-subscribe@lists.oasis-open.org
> > > > > Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
> > > > > List help: virtio-comment-help@lists.oasis-open.org
> > > > > List archive: https://lists.oasis-open.org/archives/virtio-comment/
> > > > > Feedback License:
> > > > > https://www.oasis-open.org/who/ipr/feedback_license.pdf
> > > > > List Guidelines:
> > > > > https://www.oasis-open.org/policies-guidelines/mailing-lists
> > > > > Committee: https://www.oasis-open.org/committees/virtio/
> > > > > Join OASIS: https://www.oasis-open.org/join/
> 


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [virtio-comment] [PATCH v3 1/4] Add VIRTIO_RING_F_INDIRECT_SIZE
  2022-03-20 13:55           ` Michael S. Tsirkin
@ 2022-03-20 15:17             ` Christian Schoenebeck
  2022-03-20 16:06               ` Michael S. Tsirkin
  0 siblings, 1 reply; 39+ messages in thread
From: Christian Schoenebeck @ 2022-03-20 15:17 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: virtio-comment, Cornelia Huck, Stefan Hajnoczi, Greg Kurz,
	Dominique Martinet, Halil Pasic

On Sonntag, 20. März 2022 14:55:59 CET Michael S. Tsirkin wrote:
> On Sun, Mar 20, 2022 at 02:32:23PM +0100, Christian Schoenebeck wrote:
> > On Sonntag, 20. März 2022 13:31:51 CET Michael S. Tsirkin wrote:
> > > On Sat, Mar 19, 2022 at 01:00:28PM +0100, Christian Schoenebeck wrote:
> > > > On Samstag, 19. März 2022 10:33:49 CET Michael S. Tsirkin wrote:
> > > > > On Wed, Mar 16, 2022, 15:47 Christian Schoenebeck
> > > > > <qemu_oss@crudebyte.com>
> > > > > 
> > > > > wrote:
> > > > > > This new feature flag allows to decouple the maximum amount of
> > > > > > descriptors in indirect descriptor tables from the Queue Size.
> > > > > 
> > > > > if we are extending these limits, I suggest reusing the feature flag
> > > > > to
> > > > > also add a limit on total s/g list size. making it separate from
> > > > > queue
> > > > > size
> > > > > was requested a while ago.
> > > > 
> > > > What do you mean with "total s/g list size"? The maximum bulk data
> > > > size
> > > > per
> > > > message?
> > > > Sum of both in and out s/g lists' bulk data or only for one of them?
> > > > Or maximum size of exactly only one memory segment?
> > > 
> > > I don't really know what does "bulk data size" mean. Suggest we use
> > > terminology from the spec. A buffer includes a group of direct and/or
> > > indirect descriptors, in turn indirect descriptors point to direct
> > > descriptors.
> > 
> > I already described why I think it makes sense not calling it "buffer" in
> > this particular context. So I am against changing this to "buffer".
> 
> Well spec just defines what buffers are.
> If you are using a different term then you need to define it in the
> spec.

Sorry, but that sounds like nitpicking to me. From split-ring.tex:

"When the driver wants to send a buffer to the device, it fills in
a *slot* in the descriptor table (or chains several together)"

So a "slot in the descriptor table" does not need further specification, but 
the term "vring slot" does?

> > If other people
> > support your position then I'll change it though of course.
> > 
> > About "bulk data size": "Bulk data" is the user data actually being used
> > on
> > application/driver level, i.e. above virtio level, and "Bulk data size"
> > the
> > size of that data. See ASCII illustration here:
> > https://github.com/oasis-tcs/virtio-spec/issues/122
> > 
> > The terminology "bulk data" is already used in the spec already BTW.
> 
> It does not refer to anything specific though, just generally
> to vqs for passing lots of data as opposed to config space used
> to pass small amount of data.

Which is telling pretty much precise enough what it is, at least IMO.

> > > What has been requested a while is ability to limit per vq the
> > > # of direct descriptors in a buffer.
> > 
> > So in cases where indirect descriptors are *not* used. This series is
> > about
> > indirect descriptors only though.
> 
> No, in all cases.

?
 
> > > Since IIUC what you want to do is allow more descriptors
> > > than VQ size, then one way to achieve that is just to have
> > > a per VQ limit on descriptor size and have that limit > VQ size.
> > 
> > Sorry, I can't follow you. What do you mean with "descriptor size"?
> > For me a
> > descriptor has a predefined constant struct size. You mean the size of one
> > memory segment referenced by one indirect descriptor? And why would it be
> > better than what this series suggests?
> 
> My bad. What I meant is a "per VQ limit on # of direct descriptors
> per buffer".

Which is out of the scope of what this series was about.

> > > Another thing related is that people wanted to block indirect
> > > descriptors for some VQs. Not yet sure how to combine that
> > > with this proposal, worth thinking about.
> > 
> > This series allows both, increasing *and* decreasing the number of
> > indirect
> > descriptors per VQ already.
> 
> I don't see how you block indirect descritors for a queue with this.
> Did I miss it?

By setting the proposed "Queue Indirect Size" value to zero?

> Also I think you missed the fact that
> a direct descriptor can point to an indirect one, the
> result is that max # of descriptors in a buffer is then:
> 
> queue size - 1 + indirect table size
> 
> I don't see how your proposal limits the # of descriptors
> below queue size since guest is never forced to use
> indirect.

No, I didn't miss that. The suggested changes were about the amount of 
*indirect* descriptors, not about the amount of *direct* descriptors. The 
amount of direct descriptors was still limited to the "Queue Size".

I am aware that QEMU currently has a limit per "buffer" which adds the amount 
of direct descriptors *and* the amount of indirect ones together for that 
limit (which I also mentioned from the Github issue summary BTW). Which is a 
device specific implementation feature of QEMU and would not stop QEMU though 
to handle this correctly by reducing the negotiated "Queue Indirect Size" 
value appropriately.

> > > > And are you suggesting this should become part of this series already?
> > > 
> > > yes since it's touching mostly same areas in the spec.
> > :
> > :/ Please note that I sent the first draft on this issue already in
> > :November
> > 
> > last year, and have not seen any response from your side so far. I
> > actually
> > assumed we were already at a point where it was just about precise wording
> > et al., not restarting to redesign everything again from scratch now.
> Sorry about that.

I'm pretty sure you are.

> > > > > > The new term "Queue Indirect Size" is introduced for this purpose,
> > > > > > which is a transport specific configuration whose negotiation is
> > > > > > further specified for each transport with subsequent patches.
> > > > > > 
> > > > > > Fixes: https://github.com/oasis-tcs/virtio-spec/issues/122
> > > > > > Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
> > > > > > Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
> > > > > > ---
> > > > > > 
> > > > > >  content.tex     | 32 ++++++++++++++++++++++++++++++--
> > > > > >  packed-ring.tex |  2 +-
> > > > > >  split-ring.tex  |  8 ++++++--
> > > > > >  3 files changed, 37 insertions(+), 5 deletions(-)
> > > > > > 
> > > > > > diff --git a/content.tex b/content.tex
> > > > > > index c6f116c..685525d 100644
> > > > > > --- a/content.tex
> > > > > > +++ b/content.tex
> > > > > > @@ -99,10 +99,10 @@ \section{Feature Bits}\label{sec:Basic
> > > > > > Facilities
> > > > > > of a
> > > > > > Virtio Device / Feature B
> > > > > > 
> > > > > >  \begin{description}
> > > > > >  \item[0 to 23, and 50 to 127] Feature bits for the specific
> > > > > >  device
> > > > > >  type
> > > > > > 
> > > > > > -\item[24 to 40] Feature bits reserved for extensions to the queue
> > > > > > and
> > > > > > +\item[24 to 41] Feature bits reserved for extensions to the queue
> > > > > > and
> > > > > > 
> > > > > >    feature negotiation mechanisms
> > > > > > 
> > > > > > -\item[41 to 49, and 128 and above] Feature bits reserved for
> > > > > > future
> > > > > > extensions.
> > > > > > +\item[42 to 49, and 128 and above] Feature bits reserved for
> > > > > > future
> > > > > > extensions.
> > > > > > 
> > > > > >  \end{description}
> > > > > >  
> > > > > >  \begin{note}
> > > > > > 
> > > > > > @@ -1051,6 +1051,10 @@ \subsubsection{Common configuration
> > > > > > structure
> > > > > > layout}\label{sec:Virtio Transport
> > > > > > 
> > > > > >  present either a value of 0 or a power of 2 in
> > > > > >  \field{queue_size}.
> > > > > > 
> > > > > > +If VIRTIO_RING_F_INDIRECT_SIZE has been negotiated, the device
> > > > > > MUST
> > > > > > provide the
> > > > > > +Queue Indirect Size supported by device, which is a transport
> > > > > > specific
> > > > > > +configuration. It MUST allow the driver to set a lower value.
> > > > > > +
> > > > > > 
> > > > > >  \drivernormative{\paragraph}{Common configuration structure
> > > > > > 
> > > > > > layout}{Virtio Transport Options / Virtio Over PCI Bus / PCI
> > > > > > Device
> > > > > > Layout
> > > > > > / Common configuration structure layout}
> > > > > > 
> > > > > >  The driver MUST NOT write to \field{device_feature},
> > > > > >  \field{num_queues},
> > > > > > 
> > > > > > \field{config_generation}, \field{queue_notify_off} or
> > > > > > \field{queue_notify_data}.
> > > > > > @@ -6847,6 +6851,30 @@ \chapter{Reserved Feature
> > > > > > Bits}\label{sec:Reserved
> > > > > > Feature Bits}
> > > > > > 
> > > > > >    that the driver can reset a queue individually.
> > > > > >    See \ref{sec:Basic Facilities of a Virtio Device / Virtqueues /
> > > > > > 
> > > > > > Virtqueue Reset}.
> > > > > > 
> > > > > > +  \item[VIRTIO_RING_F_INDIRECT_SIZE(41)] This feature indicates
> > > > > > that
> > > > > > the
> > > > > > +  Queue Indirect Size, i.e. the maximum amount of descriptors in
> > > > > > indirect
> > > > > > +  descriptor tables, is independent from the Queue Size.
> > > > > > +
> > > > > > +  Without this feature, the Queue Size limits the length of the
> > > > > > descriptor
> > > > > > +  chain, including indirect descriptor tables as in
> > > > > > \ref{sec:Basic
> > > > > > Facilities of
> > > > > > +  a Virtio Device / Virtqueues / The Virtqueue Descriptor Table /
> > > > > > Indirect
> > > > > > +  Descriptors}, i.e. both the maximum amount of slots in the
> > > > > > vring
> > > > > > and
> > > > > > the
> > > > > > +  actual bulk data size transmitted per vring slot.
> > > > > 
> > > > > spect does not call these  slots elsewhere.
> > > > 
> > > > Yes, I intentionally used "vring slot" instead of "buffer" as I find
> > > > the
> > > > latter too vague in this context. A "buffer" can be a memory segment,
> > > > a
> > > > set of memory segments and what not. "vring slot" OTOH makes it clear
> > > > that it is about exactly one, atomic pointer (hence with fixed size)
> > > > in a
> > > > Ring Buffer, as depicted in the ASCII illustration here:
> > > > 
> > > > https://github.com/oasis-tcs/virtio-spec/issues/122
> > > > 
> > > > The maximum amount of vring slots is therefore the maximum amount of
> > > > messages that can be emplaced into a Ring Buffer, independent of any
> > > > "bulk data buffer size".
> > > > 
> > > > > +
> > > > > 
> > > > > > +  With this feature enabled, the Queue Size only limits the
> > > > > > maximum
> > > > > > amount
> > > > > > +  of slots in the vring, but does not limit the actual bulk data
> > > > > > size
> > > > > > +  being transmitted when indirect descriptors are used.
> > > > > > Decoupling
> > > > > > these
> > > > > > +  two configuration parameters this way not only allows much
> > > > > > larger
> > > > > > bulk
> > > > > > data
> > > > > > +  being transferred per vring slot, but also avoids complicated
> > > > > > synchronization
> > > > > > +  mechanisms if the device only supports a very small amount of
> > > > > > vring
> > > > > > slots. Due
> > > > > > +  to the 16-bit size of a descriptor's "next" field there is
> > > > > > still an
> > > > > > absolute
> > > > > > +  limit of $2^{16}$ descriptors per indirect descriptor table.
> > > > > > However
> > > > > > the
> > > > > > +  actual maximum amount supported by either device or driver
> > > > > > might be
> > > > > > less,
> > > > > > +  and therefore the bus specific Queue Indirect Size value MUST
> > > > > > additionally
> > > > > > +  be negotiated if VIRTIO_RING_F_INDIRECT_SIZE was negotiated to
> > > > > > subsequently
> > > > > > +  negotiate the actual amount of maximum indirect descriptors
> > > > > > supported
> > > > > > +  by both sides.
> > > > > 
> > > > > still not sure what exactly is the value. e.g. in a buffer including
> > > > > indirect and direct descriptors.
> > > > > 
> > > > > +
> > > > > 
> > > > > >  \end{description}
> > > > > >  
> > > > > >  \drivernormative{\section}{Reserved Feature Bits}{Reserved
> > > > > >  Feature
> > > > > >  Bits}
> > > > > > 
> > > > > > diff --git a/packed-ring.tex b/packed-ring.tex
> > > > > > index a9e6c16..e26d112 100644
> > > > > > --- a/packed-ring.tex
> > > > > > +++ b/packed-ring.tex
> > > > > > @@ -195,7 +195,7 @@ \subsection{Scatter-Gather Support}
> > > > > > 
> > > > > >  The device limits the number of descriptors in a list through a
> > > > > >  transport-specific and/or device-specific value. If not limited,
> > > > > >  the maximum number of descriptors in a list is the virt queue
> > > > > > 
> > > > > > -size.
> > > > > > +size unless the VIRTIO_RING_F_INDIRECT_SIZE feature has been
> > > > > > negotiated.
> > > > > > 
> > > > > >  \subsection{Next Flag: Descriptor Chaining}
> > > > > >  \label{sec:Packed Virtqueues / Next Flag: Descriptor Chaining}
> > > > > > 
> > > > > > diff --git a/split-ring.tex b/split-ring.tex
> > > > > > index de94038..eaa90c3 100644
> > > > > > --- a/split-ring.tex
> > > > > > +++ b/split-ring.tex
> > > > > > @@ -268,8 +268,12 @@ \subsubsection{Indirect
> > > > > > Descriptors}\label{sec:Basic
> > > > > > Facilities of a Virtio Devi
> > > > > > 
> > > > > >  set the VIRTQ_DESC_F_INDIRECT flag within an indirect descriptor
> > > > > >  (ie.
> > > > > >  only
> > > > > >  one table per descriptor).
> > > > > > 
> > > > > > -A driver MUST NOT create a descriptor chain longer than the Queue
> > > > > > Size of
> > > > > > -the device.
> > > > > 
> > > > > +If VIRTIO_RING_F_INDIRECT_SIZE has not been negotiated, the driver
> > > > > MUST
> > > > > 
> > > > > > +NOT create a descriptor chain longer than the Queue Size of the
> > > > > > device.
> > > > > > +
> > > > > > +If VIRTIO_RING_F_INDIRECT_SIZE has been negotiated, the number of
> > > > > > +descriptors per indirect descriptor table MUST NOT exceed the
> > > > > > negotiated
> > > > > > +Queue Indirect Size.
> > > > > 
> > > > > it is not negotiated is it?
> > > > 
> > > > What makes you think it is not negotiated?
> > 
> > Also see my previous question here ^
> 
> Sorry, what I mean is that you don't define what does negotiation
> involve. I think you mean this:
> 
> 	The driver SHOULD write to \field{queue_indirect_size} if its maximum
> 	number of descriptors per vring slot is lower than that reported by the
> 	device.
> 
> but driver can just read the value and that's it - and then the value
> that is set by device applies, right?
> 
> If you are going to use terms such as negotiated you need to define what
> they mean. In this case I would just say something like
> "the value of Queue Indirect Size".

Which makes me wonder why you just didn't say that in the first place? And I 
don't agree that it wasn't defined, because I actually think I did:

+  \item[VIRTIO_RING_F_INDIRECT_SIZE(41)] This feature indicates that the
+  Queue Indirect Size, i.e. the maximum amount of descriptors in indirect
+  descriptor tables, is independent from the Queue Size."

Or is that definition of the new term "Queue Indirect Size" not clear enough 
to you?

> > > > > >  A driver MUST NOT set both VIRTQ_DESC_F_INDIRECT and
> > > > > >  VIRTQ_DESC_F_NEXT
> > > > > >  in \field{flags}.
> > > > > > 
> > > > > > --
> > > > > > 2.30.2
> > > > > > 
> > > > > > 
> > > > > > 
> > > > > > 
> > > > > > 
> > > > > > This publicly archived list offers a means to provide input to the
> > > > > > OASIS Virtual I/O Device (VIRTIO) TC.
> > > > > > 
> > > > > > In order to verify user consent to the Feedback License terms and
> > > > > > to minimize spam in the list archive, subscription is required
> > > > > > before posting.
> > > > > > 
> > > > > > Subscribe: virtio-comment-subscribe@lists.oasis-open.org
> > > > > > Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
> > > > > > List help: virtio-comment-help@lists.oasis-open.org
> > > > > > List archive:
> > > > > > https://lists.oasis-open.org/archives/virtio-comment/
> > > > > > Feedback License:
> > > > > > https://www.oasis-open.org/who/ipr/feedback_license.pdf
> > > > > > List Guidelines:
> > > > > > https://www.oasis-open.org/policies-guidelines/mailing-lists
> > > > > > Committee: https://www.oasis-open.org/committees/virtio/
> > > > > > Join OASIS: https://www.oasis-open.org/join/



This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [virtio-comment] [PATCH v3 1/4] Add VIRTIO_RING_F_INDIRECT_SIZE
  2022-03-20 15:17             ` Christian Schoenebeck
@ 2022-03-20 16:06               ` Michael S. Tsirkin
  2022-03-20 16:07                 ` Michael S. Tsirkin
  2022-03-20 17:43                 ` Christian Schoenebeck
  0 siblings, 2 replies; 39+ messages in thread
From: Michael S. Tsirkin @ 2022-03-20 16:06 UTC (permalink / raw)
  To: Christian Schoenebeck
  Cc: virtio-comment, Cornelia Huck, Stefan Hajnoczi, Greg Kurz,
	Dominique Martinet, Halil Pasic

On Sun, Mar 20, 2022 at 04:17:48PM +0100, Christian Schoenebeck wrote:
> On Sonntag, 20. März 2022 14:55:59 CET Michael S. Tsirkin wrote:
> > On Sun, Mar 20, 2022 at 02:32:23PM +0100, Christian Schoenebeck wrote:
> > > On Sonntag, 20. März 2022 13:31:51 CET Michael S. Tsirkin wrote:
> > > > On Sat, Mar 19, 2022 at 01:00:28PM +0100, Christian Schoenebeck wrote:
> > > > > On Samstag, 19. März 2022 10:33:49 CET Michael S. Tsirkin wrote:
> > > > > > On Wed, Mar 16, 2022, 15:47 Christian Schoenebeck
> > > > > > <qemu_oss@crudebyte.com>
> > > > > > 
> > > > > > wrote:
> > > > > > > This new feature flag allows to decouple the maximum amount of
> > > > > > > descriptors in indirect descriptor tables from the Queue Size.
> > > > > > 
> > > > > > if we are extending these limits, I suggest reusing the feature flag
> > > > > > to
> > > > > > also add a limit on total s/g list size. making it separate from
> > > > > > queue
> > > > > > size
> > > > > > was requested a while ago.
> > > > > 
> > > > > What do you mean with "total s/g list size"? The maximum bulk data
> > > > > size
> > > > > per
> > > > > message?
> > > > > Sum of both in and out s/g lists' bulk data or only for one of them?
> > > > > Or maximum size of exactly only one memory segment?
> > > > 
> > > > I don't really know what does "bulk data size" mean. Suggest we use
> > > > terminology from the spec. A buffer includes a group of direct and/or
> > > > indirect descriptors, in turn indirect descriptors point to direct
> > > > descriptors.
> > > 
> > > I already described why I think it makes sense not calling it "buffer" in
> > > this particular context. So I am against changing this to "buffer".
> > 
> > Well spec just defines what buffers are.
> > If you are using a different term then you need to define it in the
> > spec.
> 
> Sorry, but that sounds like nitpicking to me. From split-ring.tex:
> 
> "When the driver wants to send a buffer to the device, it fills in
> a *slot* in the descriptor table (or chains several together)"
> So a "slot in the descriptor table" does not need further specification, but 
> the term "vring slot" does?

One of the reasons is that it's an old text from pre-standard days.
It's therefore informal. We are trying to do better and unfortunately
this means you might need to clean up some old text to add your
feature.

Packed ring has the concept of "elements" maybe fixing up
split to use that teminology too can work.


Anothe reason I am not excited about "vring slots" is because
it is not concise - it's easy to just say "slots" and introduce
confusion.



> 
> > > If other people
> > > support your position then I'll change it though of course.
> > > 
> > > About "bulk data size": "Bulk data" is the user data actually being used
> > > on
> > > application/driver level, i.e. above virtio level, and "Bulk data size"
> > > the
> > > size of that data. See ASCII illustration here:
> > > https://github.com/oasis-tcs/virtio-spec/issues/122
> > > 
> > > The terminology "bulk data" is already used in the spec already BTW.
> > 
> > It does not refer to anything specific though, just generally
> > to vqs for passing lots of data as opposed to config space used
> > to pass small amount of data.
> 
> Which is telling pretty much precise enough what it is, at least IMO.

Well saying generally there's a "bulk of data" just means there's a lot.
If you are talking about it's size then I guess you somehow
distinguish this data from some other data?

Please look at the Virtqueues chapter and see if any existing terms
fit your usage. It's hard enough for people to learn the spec without
us changing terminology each release.

> > > > What has been requested a while is ability to limit per vq the
> > > > # of direct descriptors in a buffer.
> > > 
> > > So in cases where indirect descriptors are *not* used. This series is
> > > about
> > > indirect descriptors only though.
> > 
> > No, in all cases.
> 
> ?

what users in the field asked for is ability to increase VQ size to hold
more than 1024 buffers. However QEMU can not support requests with more
than 1024 elements. Thus the wish to limit buffer size below VQ size.
Whether there's an indirect element in the chain does not matter
for this use-case.


> > > > Since IIUC what you want to do is allow more descriptors
> > > > than VQ size, then one way to achieve that is just to have
> > > > a per VQ limit on descriptor size and have that limit > VQ size.
> > > 
> > > Sorry, I can't follow you. What do you mean with "descriptor size"?
> > > For me a
> > > descriptor has a predefined constant struct size. You mean the size of one
> > > memory segment referenced by one indirect descriptor? And why would it be
> > > better than what this series suggests?
> > 
> > My bad. What I meant is a "per VQ limit on # of direct descriptors
> > per buffer".
> 
> Which is out of the scope of what this series was about.

Maybe the scope is too limited then.


> > > > Another thing related is that people wanted to block indirect
> > > > descriptors for some VQs. Not yet sure how to combine that
> > > > with this proposal, worth thinking about.
> > > 
> > > This series allows both, increasing *and* decreasing the number of
> > > indirect
> > > descriptors per VQ already.
> > 
> > I don't see how you block indirect descritors for a queue with this.
> > Did I miss it?
> 
> By setting the proposed "Queue Indirect Size" value to zero?

I guess ... it is better to call this out explicitly in the spec.


> > Also I think you missed the fact that
> > a direct descriptor can point to an indirect one, the
> > result is that max # of descriptors in a buffer is then:
> > 
> > queue size - 1 + indirect table size
> > 
> > I don't see how your proposal limits the # of descriptors
> > below queue size since guest is never forced to use
> > indirect.
> 
> No, I didn't miss that. The suggested changes were about the amount of 
> *indirect* descriptors, not about the amount of *direct* descriptors. The 
> amount of direct descriptors was still limited to the "Queue Size".

Confused.
The amount of indirect descriptors?
Not the size in each one? But your commit log says
	descriptors in indirect descriptor tables
descriptors in the indirect descriptor tables are not indirect
descriptors.

At this point I don't understand the motivation for the change.


> I am aware that QEMU currently has a limit per "buffer" which adds the amount 
> of direct descriptors *and* the amount of indirect ones together for that 
> limit (which I also mentioned from the Github issue summary BTW). Which is a 
> device specific implementation feature of QEMU and would not stop QEMU though 
> to handle this correctly by reducing the negotiated "Queue Indirect Size" 
> value appropriately.

Except people want a deeper queue, I can probably find & forward links
to mailing list discussions.

> > > > > And are you suggesting this should become part of this series already?
> > > > 
> > > > yes since it's touching mostly same areas in the spec.
> > > :
> > > :/ Please note that I sent the first draft on this issue already in
> > > :November
> > > 
> > > last year, and have not seen any response from your side so far. I
> > > actually
> > > assumed we were already at a point where it was just about precise wording
> > > et al., not restarting to redesign everything again from scratch now.
> > Sorry about that.
> 
> I'm pretty sure you are.

Yes, I wish we could move faster.  If you are asking for advice about
avoiding such situations in the future then that would be to iterate
faster. It's just on v3 since november, and you did get comments on the
previous revisions.

> > > > > > > The new term "Queue Indirect Size" is introduced for this purpose,
> > > > > > > which is a transport specific configuration whose negotiation is
> > > > > > > further specified for each transport with subsequent patches.
> > > > > > > 
> > > > > > > Fixes: https://github.com/oasis-tcs/virtio-spec/issues/122
> > > > > > > Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
> > > > > > > Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
> > > > > > > ---
> > > > > > > 
> > > > > > >  content.tex     | 32 ++++++++++++++++++++++++++++++--
> > > > > > >  packed-ring.tex |  2 +-
> > > > > > >  split-ring.tex  |  8 ++++++--
> > > > > > >  3 files changed, 37 insertions(+), 5 deletions(-)
> > > > > > > 
> > > > > > > diff --git a/content.tex b/content.tex
> > > > > > > index c6f116c..685525d 100644
> > > > > > > --- a/content.tex
> > > > > > > +++ b/content.tex
> > > > > > > @@ -99,10 +99,10 @@ \section{Feature Bits}\label{sec:Basic
> > > > > > > Facilities
> > > > > > > of a
> > > > > > > Virtio Device / Feature B
> > > > > > > 
> > > > > > >  \begin{description}
> > > > > > >  \item[0 to 23, and 50 to 127] Feature bits for the specific
> > > > > > >  device
> > > > > > >  type
> > > > > > > 
> > > > > > > -\item[24 to 40] Feature bits reserved for extensions to the queue
> > > > > > > and
> > > > > > > +\item[24 to 41] Feature bits reserved for extensions to the queue
> > > > > > > and
> > > > > > > 
> > > > > > >    feature negotiation mechanisms
> > > > > > > 
> > > > > > > -\item[41 to 49, and 128 and above] Feature bits reserved for
> > > > > > > future
> > > > > > > extensions.
> > > > > > > +\item[42 to 49, and 128 and above] Feature bits reserved for
> > > > > > > future
> > > > > > > extensions.
> > > > > > > 
> > > > > > >  \end{description}
> > > > > > >  
> > > > > > >  \begin{note}
> > > > > > > 
> > > > > > > @@ -1051,6 +1051,10 @@ \subsubsection{Common configuration
> > > > > > > structure
> > > > > > > layout}\label{sec:Virtio Transport
> > > > > > > 
> > > > > > >  present either a value of 0 or a power of 2 in
> > > > > > >  \field{queue_size}.
> > > > > > > 
> > > > > > > +If VIRTIO_RING_F_INDIRECT_SIZE has been negotiated, the device
> > > > > > > MUST
> > > > > > > provide the
> > > > > > > +Queue Indirect Size supported by device, which is a transport
> > > > > > > specific
> > > > > > > +configuration. It MUST allow the driver to set a lower value.
> > > > > > > +
> > > > > > > 
> > > > > > >  \drivernormative{\paragraph}{Common configuration structure
> > > > > > > 
> > > > > > > layout}{Virtio Transport Options / Virtio Over PCI Bus / PCI
> > > > > > > Device
> > > > > > > Layout
> > > > > > > / Common configuration structure layout}
> > > > > > > 
> > > > > > >  The driver MUST NOT write to \field{device_feature},
> > > > > > >  \field{num_queues},
> > > > > > > 
> > > > > > > \field{config_generation}, \field{queue_notify_off} or
> > > > > > > \field{queue_notify_data}.
> > > > > > > @@ -6847,6 +6851,30 @@ \chapter{Reserved Feature
> > > > > > > Bits}\label{sec:Reserved
> > > > > > > Feature Bits}
> > > > > > > 
> > > > > > >    that the driver can reset a queue individually.
> > > > > > >    See \ref{sec:Basic Facilities of a Virtio Device / Virtqueues /
> > > > > > > 
> > > > > > > Virtqueue Reset}.
> > > > > > > 
> > > > > > > +  \item[VIRTIO_RING_F_INDIRECT_SIZE(41)] This feature indicates
> > > > > > > that
> > > > > > > the
> > > > > > > +  Queue Indirect Size, i.e. the maximum amount of descriptors in
> > > > > > > indirect
> > > > > > > +  descriptor tables, is independent from the Queue Size.
> > > > > > > +
> > > > > > > +  Without this feature, the Queue Size limits the length of the
> > > > > > > descriptor
> > > > > > > +  chain, including indirect descriptor tables as in
> > > > > > > \ref{sec:Basic
> > > > > > > Facilities of
> > > > > > > +  a Virtio Device / Virtqueues / The Virtqueue Descriptor Table /
> > > > > > > Indirect
> > > > > > > +  Descriptors}, i.e. both the maximum amount of slots in the
> > > > > > > vring
> > > > > > > and
> > > > > > > the
> > > > > > > +  actual bulk data size transmitted per vring slot.
> > > > > > 
> > > > > > spect does not call these  slots elsewhere.
> > > > > 
> > > > > Yes, I intentionally used "vring slot" instead of "buffer" as I find
> > > > > the
> > > > > latter too vague in this context. A "buffer" can be a memory segment,
> > > > > a
> > > > > set of memory segments and what not. "vring slot" OTOH makes it clear
> > > > > that it is about exactly one, atomic pointer (hence with fixed size)
> > > > > in a
> > > > > Ring Buffer, as depicted in the ASCII illustration here:
> > > > > 
> > > > > https://github.com/oasis-tcs/virtio-spec/issues/122
> > > > > 
> > > > > The maximum amount of vring slots is therefore the maximum amount of
> > > > > messages that can be emplaced into a Ring Buffer, independent of any
> > > > > "bulk data buffer size".
> > > > > 
> > > > > > +
> > > > > > 
> > > > > > > +  With this feature enabled, the Queue Size only limits the
> > > > > > > maximum
> > > > > > > amount
> > > > > > > +  of slots in the vring, but does not limit the actual bulk data
> > > > > > > size
> > > > > > > +  being transmitted when indirect descriptors are used.
> > > > > > > Decoupling
> > > > > > > these
> > > > > > > +  two configuration parameters this way not only allows much
> > > > > > > larger
> > > > > > > bulk
> > > > > > > data
> > > > > > > +  being transferred per vring slot, but also avoids complicated
> > > > > > > synchronization
> > > > > > > +  mechanisms if the device only supports a very small amount of
> > > > > > > vring
> > > > > > > slots. Due
> > > > > > > +  to the 16-bit size of a descriptor's "next" field there is
> > > > > > > still an
> > > > > > > absolute
> > > > > > > +  limit of $2^{16}$ descriptors per indirect descriptor table.
> > > > > > > However
> > > > > > > the
> > > > > > > +  actual maximum amount supported by either device or driver
> > > > > > > might be
> > > > > > > less,
> > > > > > > +  and therefore the bus specific Queue Indirect Size value MUST
> > > > > > > additionally
> > > > > > > +  be negotiated if VIRTIO_RING_F_INDIRECT_SIZE was negotiated to
> > > > > > > subsequently
> > > > > > > +  negotiate the actual amount of maximum indirect descriptors
> > > > > > > supported
> > > > > > > +  by both sides.
> > > > > > 
> > > > > > still not sure what exactly is the value. e.g. in a buffer including
> > > > > > indirect and direct descriptors.
> > > > > > 
> > > > > > +
> > > > > > 
> > > > > > >  \end{description}
> > > > > > >  
> > > > > > >  \drivernormative{\section}{Reserved Feature Bits}{Reserved
> > > > > > >  Feature
> > > > > > >  Bits}
> > > > > > > 
> > > > > > > diff --git a/packed-ring.tex b/packed-ring.tex
> > > > > > > index a9e6c16..e26d112 100644
> > > > > > > --- a/packed-ring.tex
> > > > > > > +++ b/packed-ring.tex
> > > > > > > @@ -195,7 +195,7 @@ \subsection{Scatter-Gather Support}
> > > > > > > 
> > > > > > >  The device limits the number of descriptors in a list through a
> > > > > > >  transport-specific and/or device-specific value. If not limited,
> > > > > > >  the maximum number of descriptors in a list is the virt queue
> > > > > > > 
> > > > > > > -size.
> > > > > > > +size unless the VIRTIO_RING_F_INDIRECT_SIZE feature has been
> > > > > > > negotiated.
> > > > > > > 
> > > > > > >  \subsection{Next Flag: Descriptor Chaining}
> > > > > > >  \label{sec:Packed Virtqueues / Next Flag: Descriptor Chaining}
> > > > > > > 
> > > > > > > diff --git a/split-ring.tex b/split-ring.tex
> > > > > > > index de94038..eaa90c3 100644
> > > > > > > --- a/split-ring.tex
> > > > > > > +++ b/split-ring.tex
> > > > > > > @@ -268,8 +268,12 @@ \subsubsection{Indirect
> > > > > > > Descriptors}\label{sec:Basic
> > > > > > > Facilities of a Virtio Devi
> > > > > > > 
> > > > > > >  set the VIRTQ_DESC_F_INDIRECT flag within an indirect descriptor
> > > > > > >  (ie.
> > > > > > >  only
> > > > > > >  one table per descriptor).
> > > > > > > 
> > > > > > > -A driver MUST NOT create a descriptor chain longer than the Queue
> > > > > > > Size of
> > > > > > > -the device.
> > > > > > 
> > > > > > +If VIRTIO_RING_F_INDIRECT_SIZE has not been negotiated, the driver
> > > > > > MUST
> > > > > > 
> > > > > > > +NOT create a descriptor chain longer than the Queue Size of the
> > > > > > > device.
> > > > > > > +
> > > > > > > +If VIRTIO_RING_F_INDIRECT_SIZE has been negotiated, the number of
> > > > > > > +descriptors per indirect descriptor table MUST NOT exceed the
> > > > > > > negotiated
> > > > > > > +Queue Indirect Size.
> > > > > > 
> > > > > > it is not negotiated is it?
> > > > > 
> > > > > What makes you think it is not negotiated?
> > > 
> > > Also see my previous question here ^
> > 
> > Sorry, what I mean is that you don't define what does negotiation
> > involve. I think you mean this:
> > 
> > 	The driver SHOULD write to \field{queue_indirect_size} if its maximum
> > 	number of descriptors per vring slot is lower than that reported by the
> > 	device.
> > 
> > but driver can just read the value and that's it - and then the value
> > that is set by device applies, right?
> > 
> > If you are going to use terms such as negotiated you need to define what
> > they mean. In this case I would just say something like
> > "the value of Queue Indirect Size".
> 
> Which makes me wonder why you just didn't say that in the first place? And I 
> don't agree that it wasn't defined, because I actually think I did:
> 
> +  \item[VIRTIO_RING_F_INDIRECT_SIZE(41)] This feature indicates that the
> +  Queue Indirect Size, i.e. the maximum amount of descriptors in indirect
> +  descriptor tables, is independent from the Queue Size."
> 
> Or is that definition of the new term "Queue Indirect Size" not clear enough 
> to you?

Maybe ... but I don't think this will jump out to the reader. I feel
we abuse of the word negotiate to the point where reader only has a
vague idea what it means. That's why I struggled to give concise
comments.  Consider:

+  and therefore the transport specific Queue Indirect Size value MUST
+  additionally be negotiated if VIRTIO_RING_F_INDIRECT_SIZE was negotiated to
+  subsequently negotiate the actual amount of maximum indirect descriptors
+  supported by both sides.

sounds like a tongue-twister to me. Can't we find a term different from
"negotiate"?


It's already used for features, please find something else for this.
And why is a term even necessary?  How is this different from other
writeable fields?  Consider an example of similar functionality
from the spec:

\item[\field{queue_size}]
        Queue Size.  On reset, specifies the maximum queue size supported by
        the device. This can be modified by the driver to reduce memory requirements.
        A 0 means the queue is unavailable.


Hope this helps.



> > > > > > >  A driver MUST NOT set both VIRTQ_DESC_F_INDIRECT and
> > > > > > >  VIRTQ_DESC_F_NEXT
> > > > > > >  in \field{flags}.
> > > > > > > 
> > > > > > > --
> > > > > > > 2.30.2
> > > > > > > 
> > > > > > > 
> > > > > > > 
> > > > > > > 
> > > > > > > 
> > > > > > > This publicly archived list offers a means to provide input to the
> > > > > > > OASIS Virtual I/O Device (VIRTIO) TC.
> > > > > > > 
> > > > > > > In order to verify user consent to the Feedback License terms and
> > > > > > > to minimize spam in the list archive, subscription is required
> > > > > > > before posting.
> > > > > > > 
> > > > > > > Subscribe: virtio-comment-subscribe@lists.oasis-open.org
> > > > > > > Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
> > > > > > > List help: virtio-comment-help@lists.oasis-open.org
> > > > > > > List archive:
> > > > > > > https://lists.oasis-open.org/archives/virtio-comment/
> > > > > > > Feedback License:
> > > > > > > https://www.oasis-open.org/who/ipr/feedback_license.pdf
> > > > > > > List Guidelines:
> > > > > > > https://www.oasis-open.org/policies-guidelines/mailing-lists
> > > > > > > Committee: https://www.oasis-open.org/committees/virtio/
> > > > > > > Join OASIS: https://www.oasis-open.org/join/
> 


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [virtio-comment] [PATCH v3 1/4] Add VIRTIO_RING_F_INDIRECT_SIZE
  2022-03-20 16:06               ` Michael S. Tsirkin
@ 2022-03-20 16:07                 ` Michael S. Tsirkin
  2022-03-20 17:43                 ` Christian Schoenebeck
  1 sibling, 0 replies; 39+ messages in thread
From: Michael S. Tsirkin @ 2022-03-20 16:07 UTC (permalink / raw)
  To: Christian Schoenebeck
  Cc: virtio-comment, Cornelia Huck, Stefan Hajnoczi, Greg Kurz,
	Dominique Martinet, Halil Pasic

On Sun, Mar 20, 2022 at 12:06:20PM -0400, Michael S. Tsirkin wrote:
> > No, I didn't miss that. The suggested changes were about the amount of 
> > *indirect* descriptors, not about the amount of *direct* descriptors. The 
> > amount of direct descriptors was still limited to the "Queue Size".
> 
> Confused.
> The amount of indirect descriptors?
> Not the size in each one?

length, to be exact

> But your commit log says
> 	descriptors in indirect descriptor tables
> descriptors in the indirect descriptor tables are not indirect
> descriptors.
> 
> At this point I don't understand the motivation for the change.


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [virtio-comment] [PATCH v3 1/4] Add VIRTIO_RING_F_INDIRECT_SIZE
  2022-03-20 16:06               ` Michael S. Tsirkin
  2022-03-20 16:07                 ` Michael S. Tsirkin
@ 2022-03-20 17:43                 ` Christian Schoenebeck
  2022-03-20 21:52                   ` Michael S. Tsirkin
  1 sibling, 1 reply; 39+ messages in thread
From: Christian Schoenebeck @ 2022-03-20 17:43 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: virtio-comment, Cornelia Huck, Stefan Hajnoczi, Greg Kurz,
	Dominique Martinet, Halil Pasic

On Sonntag, 20. März 2022 17:06:16 CET Michael S. Tsirkin wrote:
> On Sun, Mar 20, 2022 at 04:17:48PM +0100, Christian Schoenebeck wrote:
> > On Sonntag, 20. März 2022 14:55:59 CET Michael S. Tsirkin wrote:
> > > On Sun, Mar 20, 2022 at 02:32:23PM +0100, Christian Schoenebeck wrote:
> > > > On Sonntag, 20. März 2022 13:31:51 CET Michael S. Tsirkin wrote:
> > > > > On Sat, Mar 19, 2022 at 01:00:28PM +0100, Christian Schoenebeck 
wrote:
> > > > > > On Samstag, 19. März 2022 10:33:49 CET Michael S. Tsirkin wrote:
> > > > > > > On Wed, Mar 16, 2022, 15:47 Christian Schoenebeck
> > > > > > > <qemu_oss@crudebyte.com>
> > > > > > > 
> > > > > > > wrote:
> > > > > > > > This new feature flag allows to decouple the maximum amount of
> > > > > > > > descriptors in indirect descriptor tables from the Queue Size.
> > > > > > > 
> > > > > > > if we are extending these limits, I suggest reusing the feature
> > > > > > > flag
> > > > > > > to
> > > > > > > also add a limit on total s/g list size. making it separate from
> > > > > > > queue
> > > > > > > size
> > > > > > > was requested a while ago.
> > > > > > 
> > > > > > What do you mean with "total s/g list size"? The maximum bulk data
> > > > > > size
> > > > > > per
> > > > > > message?
> > > > > > Sum of both in and out s/g lists' bulk data or only for one of
> > > > > > them?
> > > > > > Or maximum size of exactly only one memory segment?
> > > > > 
> > > > > I don't really know what does "bulk data size" mean. Suggest we use
> > > > > terminology from the spec. A buffer includes a group of direct
> > > > > and/or
> > > > > indirect descriptors, in turn indirect descriptors point to direct
> > > > > descriptors.
> > > > 
> > > > I already described why I think it makes sense not calling it "buffer"
> > > > in
> > > > this particular context. So I am against changing this to "buffer".
> > > 
> > > Well spec just defines what buffers are.
> > > If you are using a different term then you need to define it in the
> > > spec.
> > 
> > Sorry, but that sounds like nitpicking to me. From split-ring.tex:
> > 
> > "When the driver wants to send a buffer to the device, it fills in
> > a *slot* in the descriptor table (or chains several together)"
> > So a "slot in the descriptor table" does not need further specification,
> > but the term "vring slot" does?
> 
> One of the reasons is that it's an old text from pre-standard days.
> It's therefore informal. We are trying to do better and unfortunately
> this means you might need to clean up some old text to add your
> feature.
> 
> Packed ring has the concept of "elements" maybe fixing up
> split to use that teminology too can work.
> 
> 
> Anothe reason I am not excited about "vring slots" is because
> it is not concise - it's easy to just say "slots" and introduce
> confusion.

And using the term "buffer" instead of "slot" is better in avoiding confusion. 
Sorry, but I deeply disagree with you. A "buffer" can mean simply anything, 
and the only reason why you want to stick to that term is because it is 
already used in the spec. A "buffer" is still a much more confusing term than 
"slot", espcially in this context.

> > > > If other people
> > > > support your position then I'll change it though of course.
> > > > 
> > > > About "bulk data size": "Bulk data" is the user data actually being
> > > > used
> > > > on
> > > > application/driver level, i.e. above virtio level, and "Bulk data
> > > > size"
> > > > the
> > > > size of that data. See ASCII illustration here:
> > > > https://github.com/oasis-tcs/virtio-spec/issues/122
> > > > 
> > > > The terminology "bulk data" is already used in the spec already BTW.
> > > 
> > > It does not refer to anything specific though, just generally
> > > to vqs for passing lots of data as opposed to config space used
> > > to pass small amount of data.
> > 
> > Which is telling pretty much precise enough what it is, at least IMO.
> 
> Well saying generally there's a "bulk of data" just means there's a lot.
> If you are talking about it's size then I guess you somehow
> distinguish this data from some other data?
> 
> Please look at the Virtqueues chapter and see if any existing terms
> fit your usage. It's hard enough for people to learn the spec without
> us changing terminology each release.

Thank you. You are invited to name one that you feel appropriate.

> > > > > What has been requested a while is ability to limit per vq the
> > > > > # of direct descriptors in a buffer.
> > > > 
> > > > So in cases where indirect descriptors are *not* used. This series is
> > > > about
> > > > indirect descriptors only though.
> > > 
> > > No, in all cases.
> > 
> > ?
> 
> what users in the field asked for is ability to increase VQ size to hold
> more than 1024 buffers. However QEMU can not support requests with more
> than 1024 elements. Thus the wish to limit buffer size below VQ size.
> Whether there's an indirect element in the chain does not matter
> for this use-case.

Yes, and it is a different use case.

> > > > > Since IIUC what you want to do is allow more descriptors
> > > > > than VQ size, then one way to achieve that is just to have
> > > > > a per VQ limit on descriptor size and have that limit > VQ size.
> > > > 
> > > > Sorry, I can't follow you. What do you mean with "descriptor size"?
> > > > For me a
> > > > descriptor has a predefined constant struct size. You mean the size of
> > > > one
> > > > memory segment referenced by one indirect descriptor? And why would it
> > > > be
> > > > better than what this series suggests?
> > > 
> > > My bad. What I meant is a "per VQ limit on # of direct descriptors
> > > per buffer".
> > 
> > Which is out of the scope of what this series was about.
> 
> Maybe the scope is too limited then.

Yes maybe. And maybe you are a little late with realizing that.

> > > > > Another thing related is that people wanted to block indirect
> > > > > descriptors for some VQs. Not yet sure how to combine that
> > > > > with this proposal, worth thinking about.
> > > > 
> > > > This series allows both, increasing *and* decreasing the number of
> > > > indirect
> > > > descriptors per VQ already.
> > > 
> > > I don't see how you block indirect descritors for a queue with this.
> > > Did I miss it?
> > 
> > By setting the proposed "Queue Indirect Size" value to zero?
> 
> I guess ... it is better to call this out explicitly in the spec.
> 
> > > Also I think you missed the fact that
> > > a direct descriptor can point to an indirect one, the
> > > result is that max # of descriptors in a buffer is then:
> > > 
> > > queue size - 1 + indirect table size
> > > 
> > > I don't see how your proposal limits the # of descriptors
> > > below queue size since guest is never forced to use
> > > indirect.
> > 
> > No, I didn't miss that. The suggested changes were about the amount of
> > *indirect* descriptors, not about the amount of *direct* descriptors. The
> > amount of direct descriptors was still limited to the "Queue Size".
> 
> Confused.
> The amount of indirect descriptors?
> Not the size in each one? But your commit log says
> 	descriptors in indirect descriptor tables
> descriptors in the indirect descriptor tables are not indirect
> descriptors.
> 
> At this point I don't understand the motivation for the change.

You seriously don't understand the motivation, or you rather want to continue 
with terminology discussions? 

> > I am aware that QEMU currently has a limit per "buffer" which adds the
> > amount of direct descriptors *and* the amount of indirect ones together
> > for that limit (which I also mentioned from the Github issue summary
> > BTW). Which is a device specific implementation feature of QEMU and would
> > not stop QEMU though to handle this correctly by reducing the negotiated
> > "Queue Indirect Size" value appropriately.
> 
> Except people want a deeper queue, I can probably find & forward links
> to mailing list discussions.
> 
> > > > > > And are you suggesting this should become part of this series
> > > > > > already?
> > > > > 
> > > > > yes since it's touching mostly same areas in the spec.
> > > > :
> > > > :/ Please note that I sent the first draft on this issue already in
> > > > :November
> > > > 
> > > > last year, and have not seen any response from your side so far. I
> > > > actually
> > > > assumed we were already at a point where it was just about precise
> > > > wording
> > > > et al., not restarting to redesign everything again from scratch now.
> > > 
> > > Sorry about that.
> > 
> > I'm pretty sure you are.
> 
> Yes, I wish we could move faster.  If you are asking for advice about
> avoiding such situations in the future then that would be to iterate
> faster. It's just on v3 since november, and you did get comments on the
> previous revisions.

And my advice on you to avoid such situations is to react *at* *all* on posted 
draft suggestions faster, not after 4 months. And this is actually not v3 to 
be precise, it is v5. The original draft was already extended and renamed 
consequently. And the reason for delays between iterations was that I 
sometimes had to wait for a month on response, which I was not blaming up 
anybody for. But apparently you are turning that on me now.

> > > > > > > > The new term "Queue Indirect Size" is introduced for this
> > > > > > > > purpose,
> > > > > > > > which is a transport specific configuration whose negotiation
> > > > > > > > is
> > > > > > > > further specified for each transport with subsequent patches.
> > > > > > > > 
> > > > > > > > Fixes: https://github.com/oasis-tcs/virtio-spec/issues/122
> > > > > > > > Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
> > > > > > > > Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
> > > > > > > > ---
> > > > > > > > 
> > > > > > > >  content.tex     | 32 ++++++++++++++++++++++++++++++--
> > > > > > > >  packed-ring.tex |  2 +-
> > > > > > > >  split-ring.tex  |  8 ++++++--
> > > > > > > >  3 files changed, 37 insertions(+), 5 deletions(-)
> > > > > > > > 
> > > > > > > > diff --git a/content.tex b/content.tex
> > > > > > > > index c6f116c..685525d 100644
> > > > > > > > --- a/content.tex
> > > > > > > > +++ b/content.tex
> > > > > > > > @@ -99,10 +99,10 @@ \section{Feature Bits}\label{sec:Basic
> > > > > > > > Facilities
> > > > > > > > of a
> > > > > > > > Virtio Device / Feature B
> > > > > > > > 
> > > > > > > >  \begin{description}
> > > > > > > >  \item[0 to 23, and 50 to 127] Feature bits for the specific
> > > > > > > >  device
> > > > > > > >  type
> > > > > > > > 
> > > > > > > > -\item[24 to 40] Feature bits reserved for extensions to the
> > > > > > > > queue
> > > > > > > > and
> > > > > > > > +\item[24 to 41] Feature bits reserved for extensions to the
> > > > > > > > queue
> > > > > > > > and
> > > > > > > > 
> > > > > > > >    feature negotiation mechanisms
> > > > > > > > 
> > > > > > > > -\item[41 to 49, and 128 and above] Feature bits reserved for
> > > > > > > > future
> > > > > > > > extensions.
> > > > > > > > +\item[42 to 49, and 128 and above] Feature bits reserved for
> > > > > > > > future
> > > > > > > > extensions.
> > > > > > > > 
> > > > > > > >  \end{description}
> > > > > > > >  
> > > > > > > >  \begin{note}
> > > > > > > > 
> > > > > > > > @@ -1051,6 +1051,10 @@ \subsubsection{Common configuration
> > > > > > > > structure
> > > > > > > > layout}\label{sec:Virtio Transport
> > > > > > > > 
> > > > > > > >  present either a value of 0 or a power of 2 in
> > > > > > > >  \field{queue_size}.
> > > > > > > > 
> > > > > > > > +If VIRTIO_RING_F_INDIRECT_SIZE has been negotiated, the
> > > > > > > > device
> > > > > > > > MUST
> > > > > > > > provide the
> > > > > > > > +Queue Indirect Size supported by device, which is a transport
> > > > > > > > specific
> > > > > > > > +configuration. It MUST allow the driver to set a lower value.
> > > > > > > > +
> > > > > > > > 
> > > > > > > >  \drivernormative{\paragraph}{Common configuration structure
> > > > > > > > 
> > > > > > > > layout}{Virtio Transport Options / Virtio Over PCI Bus / PCI
> > > > > > > > Device
> > > > > > > > Layout
> > > > > > > > / Common configuration structure layout}
> > > > > > > > 
> > > > > > > >  The driver MUST NOT write to \field{device_feature},
> > > > > > > >  \field{num_queues},
> > > > > > > > 
> > > > > > > > \field{config_generation}, \field{queue_notify_off} or
> > > > > > > > \field{queue_notify_data}.
> > > > > > > > @@ -6847,6 +6851,30 @@ \chapter{Reserved Feature
> > > > > > > > Bits}\label{sec:Reserved
> > > > > > > > Feature Bits}
> > > > > > > > 
> > > > > > > >    that the driver can reset a queue individually.
> > > > > > > >    See \ref{sec:Basic Facilities of a Virtio Device /
> > > > > > > >    Virtqueues /
> > > > > > > > 
> > > > > > > > Virtqueue Reset}.
> > > > > > > > 
> > > > > > > > +  \item[VIRTIO_RING_F_INDIRECT_SIZE(41)] This feature
> > > > > > > > indicates
> > > > > > > > that
> > > > > > > > the
> > > > > > > > +  Queue Indirect Size, i.e. the maximum amount of descriptors
> > > > > > > > in
> > > > > > > > indirect
> > > > > > > > +  descriptor tables, is independent from the Queue Size.
> > > > > > > > +
> > > > > > > > +  Without this feature, the Queue Size limits the length of
> > > > > > > > the
> > > > > > > > descriptor
> > > > > > > > +  chain, including indirect descriptor tables as in
> > > > > > > > \ref{sec:Basic
> > > > > > > > Facilities of
> > > > > > > > +  a Virtio Device / Virtqueues / The Virtqueue Descriptor
> > > > > > > > Table /
> > > > > > > > Indirect
> > > > > > > > +  Descriptors}, i.e. both the maximum amount of slots in the
> > > > > > > > vring
> > > > > > > > and
> > > > > > > > the
> > > > > > > > +  actual bulk data size transmitted per vring slot.
> > > > > > > 
> > > > > > > spect does not call these  slots elsewhere.
> > > > > > 
> > > > > > Yes, I intentionally used "vring slot" instead of "buffer" as I
> > > > > > find
> > > > > > the
> > > > > > latter too vague in this context. A "buffer" can be a memory
> > > > > > segment,
> > > > > > a
> > > > > > set of memory segments and what not. "vring slot" OTOH makes it
> > > > > > clear
> > > > > > that it is about exactly one, atomic pointer (hence with fixed
> > > > > > size)
> > > > > > in a
> > > > > > Ring Buffer, as depicted in the ASCII illustration here:
> > > > > > 
> > > > > > https://github.com/oasis-tcs/virtio-spec/issues/122
> > > > > > 
> > > > > > The maximum amount of vring slots is therefore the maximum amount
> > > > > > of
> > > > > > messages that can be emplaced into a Ring Buffer, independent of
> > > > > > any
> > > > > > "bulk data buffer size".
> > > > > > 
> > > > > > > +
> > > > > > > 
> > > > > > > > +  With this feature enabled, the Queue Size only limits the
> > > > > > > > maximum
> > > > > > > > amount
> > > > > > > > +  of slots in the vring, but does not limit the actual bulk
> > > > > > > > data
> > > > > > > > size
> > > > > > > > +  being transmitted when indirect descriptors are used.
> > > > > > > > Decoupling
> > > > > > > > these
> > > > > > > > +  two configuration parameters this way not only allows much
> > > > > > > > larger
> > > > > > > > bulk
> > > > > > > > data
> > > > > > > > +  being transferred per vring slot, but also avoids
> > > > > > > > complicated
> > > > > > > > synchronization
> > > > > > > > +  mechanisms if the device only supports a very small amount
> > > > > > > > of
> > > > > > > > vring
> > > > > > > > slots. Due
> > > > > > > > +  to the 16-bit size of a descriptor's "next" field there is
> > > > > > > > still an
> > > > > > > > absolute
> > > > > > > > +  limit of $2^{16}$ descriptors per indirect descriptor
> > > > > > > > table.
> > > > > > > > However
> > > > > > > > the
> > > > > > > > +  actual maximum amount supported by either device or driver
> > > > > > > > might be
> > > > > > > > less,
> > > > > > > > +  and therefore the bus specific Queue Indirect Size value
> > > > > > > > MUST
> > > > > > > > additionally
> > > > > > > > +  be negotiated if VIRTIO_RING_F_INDIRECT_SIZE was negotiated
> > > > > > > > to
> > > > > > > > subsequently
> > > > > > > > +  negotiate the actual amount of maximum indirect descriptors
> > > > > > > > supported
> > > > > > > > +  by both sides.
> > > > > > > 
> > > > > > > still not sure what exactly is the value. e.g. in a buffer
> > > > > > > including
> > > > > > > indirect and direct descriptors.
> > > > > > > 
> > > > > > > +
> > > > > > > 
> > > > > > > >  \end{description}
> > > > > > > >  
> > > > > > > >  \drivernormative{\section}{Reserved Feature Bits}{Reserved
> > > > > > > >  Feature
> > > > > > > >  Bits}
> > > > > > > > 
> > > > > > > > diff --git a/packed-ring.tex b/packed-ring.tex
> > > > > > > > index a9e6c16..e26d112 100644
> > > > > > > > --- a/packed-ring.tex
> > > > > > > > +++ b/packed-ring.tex
> > > > > > > > @@ -195,7 +195,7 @@ \subsection{Scatter-Gather Support}
> > > > > > > > 
> > > > > > > >  The device limits the number of descriptors in a list through
> > > > > > > >  a
> > > > > > > >  transport-specific and/or device-specific value. If not
> > > > > > > >  limited,
> > > > > > > >  the maximum number of descriptors in a list is the virt queue
> > > > > > > > 
> > > > > > > > -size.
> > > > > > > > +size unless the VIRTIO_RING_F_INDIRECT_SIZE feature has been
> > > > > > > > negotiated.
> > > > > > > > 
> > > > > > > >  \subsection{Next Flag: Descriptor Chaining}
> > > > > > > >  \label{sec:Packed Virtqueues / Next Flag: Descriptor
> > > > > > > >  Chaining}
> > > > > > > > 
> > > > > > > > diff --git a/split-ring.tex b/split-ring.tex
> > > > > > > > index de94038..eaa90c3 100644
> > > > > > > > --- a/split-ring.tex
> > > > > > > > +++ b/split-ring.tex
> > > > > > > > @@ -268,8 +268,12 @@ \subsubsection{Indirect
> > > > > > > > Descriptors}\label{sec:Basic
> > > > > > > > Facilities of a Virtio Devi
> > > > > > > > 
> > > > > > > >  set the VIRTQ_DESC_F_INDIRECT flag within an indirect
> > > > > > > >  descriptor
> > > > > > > >  (ie.
> > > > > > > >  only
> > > > > > > >  one table per descriptor).
> > > > > > > > 
> > > > > > > > -A driver MUST NOT create a descriptor chain longer than the
> > > > > > > > Queue
> > > > > > > > Size of
> > > > > > > > -the device.
> > > > > > > 
> > > > > > > +If VIRTIO_RING_F_INDIRECT_SIZE has not been negotiated, the
> > > > > > > driver
> > > > > > > MUST
> > > > > > > 
> > > > > > > > +NOT create a descriptor chain longer than the Queue Size of
> > > > > > > > the
> > > > > > > > device.
> > > > > > > > +
> > > > > > > > +If VIRTIO_RING_F_INDIRECT_SIZE has been negotiated, the
> > > > > > > > number of
> > > > > > > > +descriptors per indirect descriptor table MUST NOT exceed the
> > > > > > > > negotiated
> > > > > > > > +Queue Indirect Size.
> > > > > > > 
> > > > > > > it is not negotiated is it?
> > > > > > 
> > > > > > What makes you think it is not negotiated?
> > > > 
> > > > Also see my previous question here ^
> > > 
> > > Sorry, what I mean is that you don't define what does negotiation
> > > 
> > > involve. I think you mean this:
> > > 	The driver SHOULD write to \field{queue_indirect_size} if its maximum
> > > 	number of descriptors per vring slot is lower than that reported by 
the
> > > 	device.
> > > 
> > > but driver can just read the value and that's it - and then the value
> > > that is set by device applies, right?
> > > 
> > > If you are going to use terms such as negotiated you need to define what
> > > they mean. In this case I would just say something like
> > > "the value of Queue Indirect Size".
> > 
> > Which makes me wonder why you just didn't say that in the first place? And
> > I don't agree that it wasn't defined, because I actually think I did:
> > 
> > +  \item[VIRTIO_RING_F_INDIRECT_SIZE(41)] This feature indicates that the
> > +  Queue Indirect Size, i.e. the maximum amount of descriptors in indirect
> > +  descriptor tables, is independent from the Queue Size."
> > 
> > Or is that definition of the new term "Queue Indirect Size" not clear
> > enough to you?
> 
> Maybe ... but I don't think this will jump out to the reader. I feel
> we abuse of the word negotiate to the point where reader only has a
> vague idea what it means. That's why I struggled to give concise
> comments.  Consider:
> 
> +  and therefore the transport specific Queue Indirect Size value MUST
> +  additionally be negotiated if VIRTIO_RING_F_INDIRECT_SIZE was negotiated
> to +  subsequently negotiate the actual amount of maximum indirect
> descriptors +  supported by both sides.
> 
> sounds like a tongue-twister to me. Can't we find a term different from
> "negotiate"?
> 
> 
> It's already used for features, please find something else for this.
> And why is a term even necessary?  How is this different from other
> writeable fields?  Consider an example of similar functionality
> from the spec:
> 
> \item[\field{queue_size}]
>         Queue Size.  On reset, specifies the maximum queue size supported by
> the device. This can be modified by the driver to reduce memory
> requirements. A 0 means the queue is unavailable.
> 
> 
> Hope this helps.

To be honest, I don't feel like discussing precise wordings at this point when 
you are questioning the proposal on design level already.

Maybe you make some more thorough thoughts on what you actually want this to 
be on design level before continueing to argue about precise terminology, 
which you are not using either BTW when you articulating your criticism.

Or even better: come up with your own proposol with the precise wording you 
feel appropriate.

Best regards,
Christian Schoenebeck



^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [virtio-comment] [PATCH v3 1/4] Add VIRTIO_RING_F_INDIRECT_SIZE
  2022-03-20 17:43                 ` Christian Schoenebeck
@ 2022-03-20 21:52                   ` Michael S. Tsirkin
  2022-03-21  9:23                     ` Christian Schoenebeck
  0 siblings, 1 reply; 39+ messages in thread
From: Michael S. Tsirkin @ 2022-03-20 21:52 UTC (permalink / raw)
  To: Christian Schoenebeck
  Cc: virtio-comment, Cornelia Huck, Stefan Hajnoczi, Greg Kurz,
	Dominique Martinet, Halil Pasic

On Sun, Mar 20, 2022 at 06:43:53PM +0100, Christian Schoenebeck wrote:
> To be honest, I don't feel like discussing precise wordings at this point when 
> you are questioning the proposal on design level already.
> 
> Maybe you make some more thorough thoughts on what you actually want this to 
> be on design level before continueing to argue about precise terminology, 
> which you are not using either BTW when you articulating your criticism.
> 
> Or even better: come up with your own proposol with the precise wording you 
> feel appropriate.

OK let's go back and agree on what we are trying to achieve.  The github
issue and the cover letter imply that while indirect descriptors would
normally allow huge tables, we artificially limit them to queue size,
and you want to be able to relax that.

Fair enough.

However, I feel trying to talk about indirect descriptor is too narrow a
use-case, simply because the issue is not indirect at all.  Why do we
limit number of segments? I think it's really because of backend
limitations. And indirect is only used by the frontend.  So limiting
that is really going about it wrong.


So block for example has seg_max already. What should happen
if that exceeds queue size is not defined.


So maybe we can generalize that making it device independent?
The litmus paper for this is the block and scsi devices,
we should be able to use the new feature as a super-set.


Before we discuss solutions, did I formulate the problem correctly?

-- 
MST


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [virtio-comment] [PATCH v3 1/4] Add VIRTIO_RING_F_INDIRECT_SIZE
  2022-03-20 21:52                   ` Michael S. Tsirkin
@ 2022-03-21  9:23                     ` Christian Schoenebeck
  2022-03-21 22:13                       ` Michael S. Tsirkin
  0 siblings, 1 reply; 39+ messages in thread
From: Christian Schoenebeck @ 2022-03-21  9:23 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: virtio-comment, Cornelia Huck, Stefan Hajnoczi, Greg Kurz,
	Dominique Martinet, Halil Pasic

On Sonntag, 20. März 2022 22:52:16 CET Michael S. Tsirkin wrote:
> On Sun, Mar 20, 2022 at 06:43:53PM +0100, Christian Schoenebeck wrote:
> > To be honest, I don't feel like discussing precise wordings at this point
> > when you are questioning the proposal on design level already.
> > 
> > Maybe you make some more thorough thoughts on what you actually want this
> > to be on design level before continueing to argue about precise
> > terminology, which you are not using either BTW when you articulating
> > your criticism.
> > 
> > Or even better: come up with your own proposol with the precise wording
> > you
> > feel appropriate.
> 
> OK let's go back and agree on what we are trying to achieve.  The github
> issue and the cover letter imply that while indirect descriptors would
> normally allow huge tables, we artificially limit them to queue size,
> and you want to be able to relax that.

Correct, that's my motivation for all of this.

> Fair enough.
> 
> However, I feel trying to talk about indirect descriptor is too narrow a
> use-case, simply because the issue is not indirect at all.  Why do we
> limit number of segments? I think it's really because of backend
> limitations. And indirect is only used by the frontend.  So limiting
> that is really going about it wrong.

I am only aware about current implementation situation in QEMU and Linux 
kernel. As for those two: yes, it is not a limitation on Linux kernel side, 
but on QEMU side.

As for other implementations: no idea.

> So block for example has seg_max already. What should happen
> if that exceeds queue size is not defined.
> 
> So maybe we can generalize that making it device independent?
> The litmus paper for this is the block and scsi devices,
> we should be able to use the new feature as a super-set.
> 
> Before we discuss solutions, did I formulate the problem correctly?

Keep in mind that I never worked on virtio code or virtio spec before. I just 
started to review virtio implementation of QEMU and Linux kernel and the 
virtio spec in November, specifically in context of 9p. I definitely don't 
know all the other virtioo device classes out there.

In other words: I can't help you on fitting this appropriately into a superset 
picture.

Best regards,
Christian Schoenebeck



^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [virtio-comment] Re: [PATCH v3 4/4] Add CCW configuration field "indirect_num"
  2022-03-19 10:23         ` Christian Schoenebeck
@ 2022-03-21 16:25           ` Cornelia Huck
  0 siblings, 0 replies; 39+ messages in thread
From: Cornelia Huck @ 2022-03-21 16:25 UTC (permalink / raw)
  To: Christian Schoenebeck
  Cc: virtio-comment, Stefan Hajnoczi, Greg Kurz, Dominique Martinet,
	Halil Pasic

On Sat, Mar 19 2022, Christian Schoenebeck <qemu_oss@crudebyte.com> wrote:

> On Freitag, 18. März 2022 17:10:34 CET Cornelia Huck wrote:
>> On Fri, Mar 18 2022, Christian Schoenebeck <qemu_oss@crudebyte.com> wrote:
>> > Oh, you are suggesting two different names for those two fields?
>> > "max_indirect_num" vs. "indirect_num". If yes, why?
>> 
>> The "max_indirect_num" goes with the "max_num": this is the maximum
>> value that the device supports (the "maximum maximum", in a way :).
>> "indirect_num" is the actual upper limit while the device is in use.
>
> Ok, I understand your motivation now, your interpretation aspect for this was:
>
> 	device's value >= driver's value
>
> Another interpretation aspect would be though from driver PoV: 'what is the 
> maximum bulk size I could send to device?', and in this case you would call 
> both fields 'max_indirect_num'.
>
> Therefore I would prefer using the same name 'max_indirect_num' on both 
> structs, to also make it clear they are about negotiating the same thing.

So, maybe the 'max' is what is confusing? 'max_indirect_num_limit' vs
'max_indirect_num', perhaps? I'd really like to keep distinct names for
the two fields, though, as they are fields in two distinct structures,
one used for the device advertising what it supports, and one for the
driver advertising what it will actually use. (This is different from
PCI or MMIO, where both device and driver interact on the same value.)


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [virtio-comment] Re: [PATCH v3 4/4] Add CCW configuration field "indirect_num"
  2022-03-19 10:12         ` Christian Schoenebeck
@ 2022-03-21 16:36           ` Cornelia Huck
  2022-03-22  1:56             ` Halil Pasic
  0 siblings, 1 reply; 39+ messages in thread
From: Cornelia Huck @ 2022-03-21 16:36 UTC (permalink / raw)
  To: Christian Schoenebeck, Halil Pasic
  Cc: virtio-comment, Stefan Hajnoczi, Greg Kurz, Dominique Martinet

On Sat, Mar 19 2022, Christian Schoenebeck <qemu_oss@crudebyte.com> wrote:

> On Freitag, 18. März 2022 17:06:25 CET Halil Pasic wrote:

>> I agree that the "including" is important, but I'm not sure about the
>> "its contents are undefined". I don't really understand why should we use
>> a plural here. What speaks against specifying that in SHOULD be stored
>> as 0 by the device, and MUST be ignored by the driver?
>
> Both solutions would be viable. Personally I would just use something like 
> "Should be zero" if there is a value in recommending that, but I don't see a 
> value in recommending to set something to zero and at the same time requiring 
> to not access it in the first place.
>
>> Currently we say that \field{max_indirect_num} exists like a be32 field
>> even if VIRTIO_RING_F_INDIRECT_SIZE is not negotiated. Which kind of
>> implies that at least type invariants should hold. Of course, there is
>> none here (i.e. every bits value is also a be32 value), but for something
>> like an enum interesting corner cases can pop up.
>
> I can't follow you on that one. What has that do with enums in this case?
>
> Anyway, I won't persist on my suggestion to use the (IMO more compact form) 
> "undefined". If you guys prefer the more specific solution "SHOULD be 0 and 
> MUST not be accessed" then I will go that way.

I'm not sure what mandating 0 and non-access would buy us here... the
driver can of course read the field (e.g. when copying the structure
wholesale); it just can't make use of the contents when it did not
negotiate the feature (but why would it do so in that case anyway?)

Also, I think junk remains junk, whether it is a be32 field or
interpreted as an enum. It is simply not valid, even if it might by
accident end up to be a defined enum value.

So I think "undefined" should be fine.


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [virtio-comment] [PATCH v3 1/4] Add VIRTIO_RING_F_INDIRECT_SIZE
  2022-03-21  9:23                     ` Christian Schoenebeck
@ 2022-03-21 22:13                       ` Michael S. Tsirkin
  2022-03-23 10:20                         ` Christian Schoenebeck
  0 siblings, 1 reply; 39+ messages in thread
From: Michael S. Tsirkin @ 2022-03-21 22:13 UTC (permalink / raw)
  To: Christian Schoenebeck
  Cc: virtio-comment, Cornelia Huck, Stefan Hajnoczi, Greg Kurz,
	Dominique Martinet, Halil Pasic

On Mon, Mar 21, 2022 at 10:23:07AM +0100, Christian Schoenebeck wrote:
> On Sonntag, 20. März 2022 22:52:16 CET Michael S. Tsirkin wrote:
> > On Sun, Mar 20, 2022 at 06:43:53PM +0100, Christian Schoenebeck wrote:
> > > To be honest, I don't feel like discussing precise wordings at this point
> > > when you are questioning the proposal on design level already.
> > > 
> > > Maybe you make some more thorough thoughts on what you actually want this
> > > to be on design level before continueing to argue about precise
> > > terminology, which you are not using either BTW when you articulating
> > > your criticism.
> > > 
> > > Or even better: come up with your own proposol with the precise wording
> > > you
> > > feel appropriate.
> > 
> > OK let's go back and agree on what we are trying to achieve.  The github
> > issue and the cover letter imply that while indirect descriptors would
> > normally allow huge tables, we artificially limit them to queue size,
> > and you want to be able to relax that.
> 
> Correct, that's my motivation for all of this.

Okay. So I think that given this, we can limit the total number
of non-indirect descriptors, including non-indirect ones
in a chain + all the ones in indirect pointer table if any,
and excluding the indirect descriptor itself, and this
will address the issue you are describing here, right?

Question:
- I am thinking about bi-directional descriptors such as
  block and scsi have. it looks like they have a separate limit for
  read and write parts of the chain. Should we have two limits
  then? Or should we just make driver use the lower of the two,
  i.e. the per vq limit applies to the total # of elements
  in a buffer, and read/write sgs are controlled separately
  by the per device control?
  

Stefan, any comments?



> > Fair enough.
> > 
> > However, I feel trying to talk about indirect descriptor is too narrow a
> > use-case, simply because the issue is not indirect at all.  Why do we
> > limit number of segments? I think it's really because of backend
> > limitations. And indirect is only used by the frontend.  So limiting
> > that is really going about it wrong.
> 
> I am only aware about current implementation situation in QEMU and Linux 
> kernel. As for those two: yes, it is not a limitation on Linux kernel side, 
> but on QEMU side.
> 
> As for other implementations: no idea.
> 
> > So block for example has seg_max already. What should happen
> > if that exceeds queue size is not defined.
> > 
> > So maybe we can generalize that making it device independent?
> > The litmus paper for this is the block and scsi devices,
> > we should be able to use the new feature as a super-set.
> > 
> > Before we discuss solutions, did I formulate the problem correctly?
> 
> Keep in mind that I never worked on virtio code or virtio spec before. I just 
> started to review virtio implementation of QEMU and Linux kernel and the 
> virtio spec in November, specifically in context of 9p. I definitely don't 
> know all the other virtioo device classes out there.

I think it's great that we have someone taking care of 9p btw!

> 
> In other words: I can't help you on fitting this appropriately into a superset 
> picture.
> 
> Best regards,
> Christian Schoenebeck

I think we need some cleanups in the spec to make what you are trying to
do possible to do cleanly, specifically move this description:

	A buffer consists of zero or more device-readable physically-contiguous
	elements followed by zero or more physically-contiguous
	device-writable elements (each buffer has at least one element).

out to the generic part from packed ring part, drop corresponding
text from split ring and make it refer to the term "element".

After this, the new field will just be "a number of elements per
buffer" (note that indirect descriptors do not themselves
describe elements and so won't be included in the math).

Christian, you mentioned you don't like the term buffer generally,
changing that can be done before or after this feature but IMHO
best not as part of it.

I think it's good in that it will fit better in the superset picture
addressing in addition to your requirement also the requirement to have
huge rings while limiting descriptors.

Does above sound like it addresses your requirements of having
a longer descriptor chain than queue size? If not what is not
addressed?


Thanks,
-- 
MST


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [virtio-comment] Re: [PATCH v3 4/4] Add CCW configuration field "indirect_num"
  2022-03-21 16:36           ` Cornelia Huck
@ 2022-03-22  1:56             ` Halil Pasic
  2022-03-22  9:57               ` Cornelia Huck
  0 siblings, 1 reply; 39+ messages in thread
From: Halil Pasic @ 2022-03-22  1:56 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Christian Schoenebeck, virtio-comment, Stefan Hajnoczi,
	Greg Kurz, Dominique Martinet, Halil Pasic

On Mon, 21 Mar 2022 17:36:26 +0100
Cornelia Huck <cohuck@redhat.com> wrote:

> On Sat, Mar 19 2022, Christian Schoenebeck <qemu_oss@crudebyte.com> wrote:
> 
> > On Freitag, 18. März 2022 17:06:25 CET Halil Pasic wrote:  
> 
> >> I agree that the "including" is important, but I'm not sure about the
> >> "its contents are undefined". I don't really understand why should we use
> >> a plural here. What speaks against specifying that in SHOULD be stored
> >> as 0 by the device, and MUST be ignored by the driver?  
> >
> > Both solutions would be viable. Personally I would just use something like 
> > "Should be zero" if there is a value in recommending that, but I don't see a 
> > value in recommending to set something to zero and at the same time requiring 
> > to not access it in the first place.
> >  
> >> Currently we say that \field{max_indirect_num} exists like a be32 field
> >> even if VIRTIO_RING_F_INDIRECT_SIZE is not negotiated. Which kind of
> >> implies that at least type invariants should hold. Of course, there is
> >> none here (i.e. every bits value is also a be32 value), but for something
> >> like an enum interesting corner cases can pop up.  
> >
> > I can't follow you on that one. What has that do with enums in this case?
> >
> > Anyway, I won't persist on my suggestion to use the (IMO more compact form) 
> > "undefined". If you guys prefer the more specific solution "SHOULD be 0 and 
> > MUST not be accessed" then I will go that way.  
> 
> I'm not sure what mandating 0 and non-access would buy us here... the
> driver can of course read the field (e.g. when copying the structure
> wholesale); it just can't make use of the contents when it did not
> negotiate the feature (but why would it do so in that case anyway?)

My train of thought was that making the device give us a well defined
0 could benefit robustness. The idea was, that even if the driver was
buggy, and used the value we would still end up with some sane behavior.

> 
> Also, I think junk remains junk, whether it is a be32 field or
> interpreted as an enum. It is simply not valid, even if it might by
> accident end up to be a defined enum value.

What I had in mind is the difference between "trap representation" and
"unspecified value" in terms of the C standard. Using a "trap
representation" is undefined behavior, while using an "unspecified value"
is far less serious. As far as I remember, there are no trap
representations for enumerated types in C, so the example ain't perfect.
But if some code was to assume that all it can see it the values defined
in the enum, strange stuff may happen.


> 
> So I think "undefined" should be fine.
> 

BTW the C standard uses the term "indeterminate value" in this situation.


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [virtio-comment] Re: [PATCH v3 4/4] Add CCW configuration field "indirect_num"
  2022-03-22  1:56             ` Halil Pasic
@ 2022-03-22  9:57               ` Cornelia Huck
  2022-03-22 11:21                 ` Halil Pasic
  0 siblings, 1 reply; 39+ messages in thread
From: Cornelia Huck @ 2022-03-22  9:57 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Christian Schoenebeck, virtio-comment, Stefan Hajnoczi,
	Greg Kurz, Dominique Martinet, Halil Pasic

On Tue, Mar 22 2022, Halil Pasic <pasic@linux.ibm.com> wrote:

> On Mon, 21 Mar 2022 17:36:26 +0100
> Cornelia Huck <cohuck@redhat.com> wrote:
>
>> On Sat, Mar 19 2022, Christian Schoenebeck <qemu_oss@crudebyte.com> wrote:
>> 
>> > On Freitag, 18. März 2022 17:06:25 CET Halil Pasic wrote:  
>> 
>> >> I agree that the "including" is important, but I'm not sure about the
>> >> "its contents are undefined". I don't really understand why should we use
>> >> a plural here. What speaks against specifying that in SHOULD be stored
>> >> as 0 by the device, and MUST be ignored by the driver?  
>> >
>> > Both solutions would be viable. Personally I would just use something like 
>> > "Should be zero" if there is a value in recommending that, but I don't see a 
>> > value in recommending to set something to zero and at the same time requiring 
>> > to not access it in the first place.
>> >  
>> >> Currently we say that \field{max_indirect_num} exists like a be32 field
>> >> even if VIRTIO_RING_F_INDIRECT_SIZE is not negotiated. Which kind of
>> >> implies that at least type invariants should hold. Of course, there is
>> >> none here (i.e. every bits value is also a be32 value), but for something
>> >> like an enum interesting corner cases can pop up.  
>> >
>> > I can't follow you on that one. What has that do with enums in this case?
>> >
>> > Anyway, I won't persist on my suggestion to use the (IMO more compact form) 
>> > "undefined". If you guys prefer the more specific solution "SHOULD be 0 and 
>> > MUST not be accessed" then I will go that way.  
>> 
>> I'm not sure what mandating 0 and non-access would buy us here... the
>> driver can of course read the field (e.g. when copying the structure
>> wholesale); it just can't make use of the contents when it did not
>> negotiate the feature (but why would it do so in that case anyway?)
>
> My train of thought was that making the device give us a well defined
> 0 could benefit robustness. The idea was, that even if the driver was
> buggy, and used the value we would still end up with some sane behavior.

I'm not sure a 0 would lead to sane behaviour in an already buggy
driver... operating with a limit of 0 would imply that the driver cannot
really do anything, and I'm not sure a driver buggy enough to access the
field would heed that. There's nothing wrong with a device using 0 if
the feature had not been negotiated, but I don't think it will help much
with already buggy drivers.

>
>> 
>> Also, I think junk remains junk, whether it is a be32 field or
>> interpreted as an enum. It is simply not valid, even if it might by
>> accident end up to be a defined enum value.
>
> What I had in mind is the difference between "trap representation" and
> "unspecified value" in terms of the C standard. Using a "trap
> representation" is undefined behavior, while using an "unspecified value"
> is far less serious. As far as I remember, there are no trap
> representations for enumerated types in C, so the example ain't perfect.
> But if some code was to assume that all it can see it the values defined
> in the enum, strange stuff may happen.

While the struct definitions look suspiciously like C, they are not in
fact C :) I don't think the spec defines anything of the above, and I
don't think it should.

>
>
>> 
>> So I think "undefined" should be fine.
>> 
>
> BTW the C standard uses the term "indeterminate value" in this situation.

"Indeterminate value" is a bit of a mouthful, though; "undefined" or
"unpredictable" from the driver's point of view should already capture
it, as the driver is not supposed to do anything with the value anyway.


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [virtio-comment] Re: [PATCH v3 4/4] Add CCW configuration field "indirect_num"
  2022-03-22  9:57               ` Cornelia Huck
@ 2022-03-22 11:21                 ` Halil Pasic
  0 siblings, 0 replies; 39+ messages in thread
From: Halil Pasic @ 2022-03-22 11:21 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Christian Schoenebeck, virtio-comment, Stefan Hajnoczi,
	Greg Kurz, Dominique Martinet, Halil Pasic

On Tue, 22 Mar 2022 10:57:00 +0100
Cornelia Huck <cohuck@redhat.com> wrote:

> On Tue, Mar 22 2022, Halil Pasic <pasic@linux.ibm.com> wrote:
> 
> > On Mon, 21 Mar 2022 17:36:26 +0100
> > Cornelia Huck <cohuck@redhat.com> wrote:
> >  
> >> On Sat, Mar 19 2022, Christian Schoenebeck <qemu_oss@crudebyte.com> wrote:
> >>   
> >> > On Freitag, 18. März 2022 17:06:25 CET Halil Pasic wrote:    
> >>   
> >> >> I agree that the "including" is important, but I'm not sure about the
> >> >> "its contents are undefined". I don't really understand why should we use
> >> >> a plural here. What speaks against specifying that in SHOULD be stored
> >> >> as 0 by the device, and MUST be ignored by the driver?    
> >> >
> >> > Both solutions would be viable. Personally I would just use something like 
> >> > "Should be zero" if there is a value in recommending that, but I don't see a 
> >> > value in recommending to set something to zero and at the same time requiring 
> >> > to not access it in the first place.
> >> >    
> >> >> Currently we say that \field{max_indirect_num} exists like a be32 field
> >> >> even if VIRTIO_RING_F_INDIRECT_SIZE is not negotiated. Which kind of
> >> >> implies that at least type invariants should hold. Of course, there is
> >> >> none here (i.e. every bits value is also a be32 value), but for something
> >> >> like an enum interesting corner cases can pop up.    
> >> >
> >> > I can't follow you on that one. What has that do with enums in this case?
> >> >
> >> > Anyway, I won't persist on my suggestion to use the (IMO more compact form) 
> >> > "undefined". If you guys prefer the more specific solution "SHOULD be 0 and 
> >> > MUST not be accessed" then I will go that way.    
> >> 
> >> I'm not sure what mandating 0 and non-access would buy us here... the
> >> driver can of course read the field (e.g. when copying the structure
> >> wholesale); it just can't make use of the contents when it did not
> >> negotiate the feature (but why would it do so in that case anyway?)  
> >
> > My train of thought was that making the device give us a well defined
> > 0 could benefit robustness. The idea was, that even if the driver was
> > buggy, and used the value we would still end up with some sane behavior.  
> 
> I'm not sure a 0 would lead to sane behaviour in an already buggy
> driver... operating with a limit of 0 would imply that the driver cannot
> really do anything, and I'm not sure a driver buggy enough to access the
> field would heed that. There's nothing wrong with a device using 0 if
> the feature had not been negotiated, but I don't think it will help much
> with already buggy drivers.
> 

I don't consider this awfully important. While I do see some value in
devices presenting some saneish value in this situation over presenting
junk, I am fine with junk as well. Actually implementations can still do
whatever they want.

> >  
> >> 
> >> Also, I think junk remains junk, whether it is a be32 field or
> >> interpreted as an enum. It is simply not valid, even if it might by
> >> accident end up to be a defined enum value.  
> >
> > What I had in mind is the difference between "trap representation" and
> > "unspecified value" in terms of the C standard. Using a "trap
> > representation" is undefined behavior, while using an "unspecified value"
> > is far less serious. As far as I remember, there are no trap
> > representations for enumerated types in C, so the example ain't perfect.
> > But if some code was to assume that all it can see it the values defined
> > in the enum, strange stuff may happen.  
> 
> While the struct definitions look suspiciously like C, they are not in
> fact C :) 

I'm aware. I actually merely used the C standard lingo, because most of
us are familiar with C, and it is easy to read up on the precise meaning.

I pointed out the difference between using an unspecified value and
using a trap representation to showcase, that the difference between the
two might matter.

>I don't think the spec defines anything of the above, and I
> don't think it should.

Never stated the spec defines anything of the above.

> 
> >
> >  
> >> 
> >> So I think "undefined" should be fine.
> >>   
> >
> > BTW the C standard uses the term "indeterminate value" in this situation.  
> 
> "Indeterminate value" is a bit of a mouthful, though; "undefined" or
> "unpredictable" from the driver's point of view should already capture
> it, as the driver is not supposed to do anything with the value anyway.
> 

Yes the reader is more than likely to figure out what "undefined value"
or "unpredictable value" is supposed to mean in this context from the
context.  I should really stop splitting hairs. Nevertheless given
that revision >= 3 and was VIRTIO_RING_F_INDIRECT_SIZE negotiated, is
the value of indirect_max_num predictable by the driver? In my opinion
it is not. And neither is the value defined by this spec. The semantic
of the value is defined, but the value itself isn't really any more
defined than when VIRTIO_RING_F_INDIRECT_SIZE is not negotiated. 

We could say that when VIRTIO_RING_F_INDIRECT_SIZE is not negotiated
Queue Indirect Size is undefined. But I should really stop splitting
hairs. Sorry.

Regards,
Halil


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [virtio-comment] [PATCH v3 1/4] Add VIRTIO_RING_F_INDIRECT_SIZE
  2022-03-21 22:13                       ` Michael S. Tsirkin
@ 2022-03-23 10:20                         ` Christian Schoenebeck
  2022-03-23 12:35                           ` Michael S. Tsirkin
  0 siblings, 1 reply; 39+ messages in thread
From: Christian Schoenebeck @ 2022-03-23 10:20 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: virtio-comment, Cornelia Huck, Stefan Hajnoczi, Greg Kurz,
	Dominique Martinet, Halil Pasic

On Montag, 21. März 2022 23:13:55 CET Michael S. Tsirkin wrote:
> On Mon, Mar 21, 2022 at 10:23:07AM +0100, Christian Schoenebeck wrote:
> > On Sonntag, 20. März 2022 22:52:16 CET Michael S. Tsirkin wrote:
> > > On Sun, Mar 20, 2022 at 06:43:53PM +0100, Christian Schoenebeck wrote:
> > > > To be honest, I don't feel like discussing precise wordings at this
> > > > point
> > > > when you are questioning the proposal on design level already.
> > > > 
> > > > Maybe you make some more thorough thoughts on what you actually want
> > > > this
> > > > to be on design level before continueing to argue about precise
> > > > terminology, which you are not using either BTW when you articulating
> > > > your criticism.
> > > > 
> > > > Or even better: come up with your own proposol with the precise
> > > > wording
> > > > you
> > > > feel appropriate.
> > > 
> > > OK let's go back and agree on what we are trying to achieve.  The github
> > > issue and the cover letter imply that while indirect descriptors would
> > > normally allow huge tables, we artificially limit them to queue size,
> > > and you want to be able to relax that.
> > 
> > Correct, that's my motivation for all of this.
> 
> Okay. So I think that given this, we can limit the total number
> of non-indirect descriptors, including non-indirect ones
> in a chain + all the ones in indirect pointer table if any,
> and excluding the indirect descriptor itself, and this
> will address the issue you are describing here, right?

As far as I understand your suggestion, yes, it would cover:

A. My use case [1]: allowing indirect table length > queue size.
B. Stefan's use case [2]: forcing indirect table length < queue size.
C. Your use case: forcing chain (within FIFOs) length < queue size.

[1] https://lore.kernel.org/netdev/cover.1640870037.git.linux_oss@crudebyte.com/
[2] https://lore.kernel.org/all/YXpmwP6RtvY0BmSM@stefanha-x1.localdomain/

However it would not cover:

D. Your other use case: blocking indirect tables.
E. Potential other people's need: max. chain length (within FIFOs) != max.
   indirect table length.

It's not clear to me though why you would want to exclude the descriptor
pointing to a table from counting towards that limit.

Instead of yet again mixing different limits again with each other, what about
using 3 distinct fields for those limits instead:

1. max. indirect table length (any descriptor in the table)
2. max. chain length (currently: descriptors within FIFOs)
3. max. total descriptors per buffer (sum of all descriptors, both from FIFOs
   and indirect tables, including descriptor pointing to a table).

As a side note: the spec currently refers to "table of indirect descriptors",
hence my previous assumption that descriptors in such a table should be called
"indirect descriptors", whereas you are apparently assuming that an "indirect
descriptor" is only that single one pointing to a table and you see the
descriptors in the table as "direct" ones I guess.

> Question:
> - I am thinking about bi-directional descriptors such as
>   block and scsi have. it looks like they have a separate limit for
>   read and write parts of the chain. Should we have two limits
>   then? Or should we just make driver use the lower of the two,
>   i.e. the per vq limit applies to the total # of elements
>   in a buffer, and read/write sgs are controlled separately
>   by the per device control?
> 
> Stefan, any comments?

Leaving that question to Stefan. No opinion from my side at this point.

> > > Fair enough.
> > > 
> > > However, I feel trying to talk about indirect descriptor is too narrow a
> > > use-case, simply because the issue is not indirect at all.  Why do we
> > > limit number of segments? I think it's really because of backend
> > > limitations. And indirect is only used by the frontend.  So limiting
> > > that is really going about it wrong.
> > 
> > I am only aware about current implementation situation in QEMU and Linux
> > kernel. As for those two: yes, it is not a limitation on Linux kernel
> > side,
> > but on QEMU side.
> > 
> > As for other implementations: no idea.
> > 
> > > So block for example has seg_max already. What should happen
> > > if that exceeds queue size is not defined.
> > > 
> > > So maybe we can generalize that making it device independent?
> > > The litmus paper for this is the block and scsi devices,
> > > we should be able to use the new feature as a super-set.
> > > 
> > > Before we discuss solutions, did I formulate the problem correctly?
> > 
> > Keep in mind that I never worked on virtio code or virtio spec before. I
> > just started to review virtio implementation of QEMU and Linux kernel and
> > the virtio spec in November, specifically in context of 9p. I definitely
> > don't know all the other virtioo device classes out there.
> 
> I think it's great that we have someone taking care of 9p btw!

Thanks!

> > In other words: I can't help you on fitting this appropriately into a
> > superset picture.
> > 
> > Best regards,
> > Christian Schoenebeck
> 
> I think we need some cleanups in the spec to make what you are trying to
> do possible to do cleanly, specifically move this description:
> 
> 	A buffer consists of zero or more device-readable physically-contiguous
> 	elements followed by zero or more physically-contiguous
> 	device-writable elements (each buffer has at least one element).
> 
> out to the generic part from packed ring part, drop corresponding
> text from split ring and make it refer to the term "element".
> 
> After this, the new field will just be "a number of elements per
> buffer" (note that indirect descriptors do not themselves
> describe elements and so won't be included in the math).
> 
> Christian, you mentioned you don't like the term buffer generally,
> changing that can be done before or after this feature but IMHO
> best not as part of it.

For pragmatic reasons I will refrain from questioning any virtio terms in
foreseeable future and will just use the ones suggested by people.

> I think it's good in that it will fit better in the superset picture
> addressing in addition to your requirement also the requirement to have
> huge rings while limiting descriptors.
> 
> Does above sound like it addresses your requirements of having
> a longer descriptor chain than queue size? If not what is not
> addressed?
> 
> Thanks,



^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [virtio-comment] [PATCH v3 1/4] Add VIRTIO_RING_F_INDIRECT_SIZE
  2022-03-23 10:20                         ` Christian Schoenebeck
@ 2022-03-23 12:35                           ` Michael S. Tsirkin
  2022-03-24  9:16                             ` Christian Schoenebeck
  0 siblings, 1 reply; 39+ messages in thread
From: Michael S. Tsirkin @ 2022-03-23 12:35 UTC (permalink / raw)
  To: Christian Schoenebeck
  Cc: virtio-comment, Cornelia Huck, Stefan Hajnoczi, Greg Kurz,
	Dominique Martinet, Halil Pasic

On Wed, Mar 23, 2022 at 11:20:07AM +0100, Christian Schoenebeck wrote:
> On Montag, 21. März 2022 23:13:55 CET Michael S. Tsirkin wrote:
> > On Mon, Mar 21, 2022 at 10:23:07AM +0100, Christian Schoenebeck wrote:
> > > On Sonntag, 20. März 2022 22:52:16 CET Michael S. Tsirkin wrote:
> > > > On Sun, Mar 20, 2022 at 06:43:53PM +0100, Christian Schoenebeck wrote:
> > > > > To be honest, I don't feel like discussing precise wordings at this
> > > > > point
> > > > > when you are questioning the proposal on design level already.
> > > > > 
> > > > > Maybe you make some more thorough thoughts on what you actually want
> > > > > this
> > > > > to be on design level before continueing to argue about precise
> > > > > terminology, which you are not using either BTW when you articulating
> > > > > your criticism.
> > > > > 
> > > > > Or even better: come up with your own proposol with the precise
> > > > > wording
> > > > > you
> > > > > feel appropriate.
> > > > 
> > > > OK let's go back and agree on what we are trying to achieve.  The github
> > > > issue and the cover letter imply that while indirect descriptors would
> > > > normally allow huge tables, we artificially limit them to queue size,
> > > > and you want to be able to relax that.
> > > 
> > > Correct, that's my motivation for all of this.
> > 
> > Okay. So I think that given this, we can limit the total number
> > of non-indirect descriptors, including non-indirect ones
> > in a chain + all the ones in indirect pointer table if any,
> > and excluding the indirect descriptor itself, and this
> > will address the issue you are describing here, right?
> 
> As far as I understand your suggestion, yes, it would cover:
> 
> A. My use case [1]: allowing indirect table length > queue size.
> B. Stefan's use case [2]: forcing indirect table length < queue size.
> C. Your use case: forcing chain (within FIFOs) length < queue size.
> 
> [1] https://lore.kernel.org/netdev/cover.1640870037.git.linux_oss@crudebyte.com/
> [2] https://lore.kernel.org/all/YXpmwP6RtvY0BmSM@stefanha-x1.localdomain/
> 
> However it would not cover:
> 
> D. Your other use case: blocking indirect tables.

Well, the practical use-case I have in mind is actually for when using
mergeable buffers with the network device, in practice there's never a
s/g but there's no way for device to know that. So maybe we can declare
that indirect must not be used when there's a single descriptor in the
table (esp when the new feature is accepted), and then just limiting s/g
to 1 will do the trick in this specific case.

> E. Potential other people's need: max. chain length (within FIFOs) != max.
>    indirect table length.

I don't understand this one, sorry.

> It's not clear to me though why you would want to exclude the descriptor
> pointing to a table from counting towards that limit.

Because it's a transient thing. It does not need to be processed
with a packet, it's just a way of supplying a longer s/g.
In other words what devices care about is the s/g list length,
bot how the s/g is specified.

I think what QEMU and Linux both do is very typical:
the indirect descriptor is used to fetch the table,
but then what QEMU cares about is whether it will overflow 1024
entries when preparing the iovec for linux.



> Instead of yet again mixing different limits again with each other, what about
> using 3 distinct fields for those limits instead:
> 
> 1. max. indirect table length (any descriptor in the table)
> 2. max. chain length (currently: descriptors within FIFOs)
> 3. max. total descriptors per buffer (sum of all descriptors, both from FIFOs
>    and indirect tables, including descriptor pointing to a table).

So this doesn't actually address at least what you called "your
use-case" above. Well it almost does, but not exactly because
of the extra indirect descriptor.

It's not that I'm against this kind of thing but personally
I'd like a bit more than a theoretical argument for
why all the extra complexity is helpful - remember that
all that needs to be coded in the driver, and any mistakes
often tend to paint us in the corner because of the need to
support existing drivers.


> As a side note: the spec currently refers to "table of indirect descriptors",
> hence my previous assumption that descriptors in such a table should be called
> "indirect descriptors", whereas you are apparently assuming that an "indirect
> descriptor" is only that single one pointing to a table and you see the
> descriptors in the table as "direct" ones I guess.

Oh. You are right. There's no special name for what I mistakenly
called an indirect descriptor. So what I called
"direct descriptors" are really "read/write descriptors".
It might make sense to add some text calling these something.
Scatter/gather descriptors maybe?

And what I called "indirect descriptor" is really just
descriptor with INDIRECT flag set. I am not sure we need
to call these anything unless we really start counting it.




> > Question:
> > - I am thinking about bi-directional descriptors such as
> >   block and scsi have. it looks like they have a separate limit for
> >   read and write parts of the chain. Should we have two limits
> >   then? Or should we just make driver use the lower of the two,
> >   i.e. the per vq limit applies to the total # of elements
> >   in a buffer, and read/write sgs are controlled separately
> >   by the per device control?
> > 
> > Stefan, any comments?
> 
> Leaving that question to Stefan. No opinion from my side at this point.
> 
> > > > Fair enough.
> > > > 
> > > > However, I feel trying to talk about indirect descriptor is too narrow a
> > > > use-case, simply because the issue is not indirect at all.  Why do we
> > > > limit number of segments? I think it's really because of backend
> > > > limitations. And indirect is only used by the frontend.  So limiting
> > > > that is really going about it wrong.
> > > 
> > > I am only aware about current implementation situation in QEMU and Linux
> > > kernel. As for those two: yes, it is not a limitation on Linux kernel
> > > side,
> > > but on QEMU side.
> > > 
> > > As for other implementations: no idea.
> > > 
> > > > So block for example has seg_max already. What should happen
> > > > if that exceeds queue size is not defined.
> > > > 
> > > > So maybe we can generalize that making it device independent?
> > > > The litmus paper for this is the block and scsi devices,
> > > > we should be able to use the new feature as a super-set.
> > > > 
> > > > Before we discuss solutions, did I formulate the problem correctly?
> > > 
> > > Keep in mind that I never worked on virtio code or virtio spec before. I
> > > just started to review virtio implementation of QEMU and Linux kernel and
> > > the virtio spec in November, specifically in context of 9p. I definitely
> > > don't know all the other virtioo device classes out there.
> > 
> > I think it's great that we have someone taking care of 9p btw!
> 
> Thanks!
> 
> > > In other words: I can't help you on fitting this appropriately into a
> > > superset picture.
> > > 
> > > Best regards,
> > > Christian Schoenebeck
> > 
> > I think we need some cleanups in the spec to make what you are trying to
> > do possible to do cleanly, specifically move this description:
> > 
> > 	A buffer consists of zero or more device-readable physically-contiguous
> > 	elements followed by zero or more physically-contiguous
> > 	device-writable elements (each buffer has at least one element).
> > 
> > out to the generic part from packed ring part, drop corresponding
> > text from split ring and make it refer to the term "element".
> > 
> > After this, the new field will just be "a number of elements per
> > buffer" (note that indirect descriptors do not themselves
> > describe elements and so won't be included in the math).
> > 
> > Christian, you mentioned you don't like the term buffer generally,
> > changing that can be done before or after this feature but IMHO
> > best not as part of it.
> 
> For pragmatic reasons I will refrain from questioning any virtio terms in
> foreseeable future and will just use the ones suggested by people.

It's not that the terminology is so great, but we do have used it for a
while and by now it's quite entrenched.

> > I think it's good in that it will fit better in the superset picture
> > addressing in addition to your requirement also the requirement to have
> > huge rings while limiting descriptors.
> > 
> > Does above sound like it addresses your requirements of having
> > a longer descriptor chain than queue size? If not what is not
> > addressed?
> > 
> > Thanks,
> 


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [virtio-comment] [PATCH v3 1/4] Add VIRTIO_RING_F_INDIRECT_SIZE
  2022-03-23 12:35                           ` Michael S. Tsirkin
@ 2022-03-24  9:16                             ` Christian Schoenebeck
  2022-03-24 10:36                               ` Michael S. Tsirkin
  0 siblings, 1 reply; 39+ messages in thread
From: Christian Schoenebeck @ 2022-03-24  9:16 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: virtio-comment, Cornelia Huck, Stefan Hajnoczi, Greg Kurz,
	Dominique Martinet, Halil Pasic

On Mittwoch, 23. März 2022 13:35:00 CET Michael S. Tsirkin wrote:
> On Wed, Mar 23, 2022 at 11:20:07AM +0100, Christian Schoenebeck wrote:
> > On Montag, 21. März 2022 23:13:55 CET Michael S. Tsirkin wrote:
> > > On Mon, Mar 21, 2022 at 10:23:07AM +0100, Christian Schoenebeck wrote:
> > > > On Sonntag, 20. März 2022 22:52:16 CET Michael S. Tsirkin wrote:
> > > > > On Sun, Mar 20, 2022 at 06:43:53PM +0100, Christian Schoenebeck 
wrote:
> > > > > OK let's go back and agree on what we are trying to achieve.  The
> > > > > github
> > > > > issue and the cover letter imply that while indirect descriptors
> > > > > would
> > > > > normally allow huge tables, we artificially limit them to queue
> > > > > size,
> > > > > and you want to be able to relax that.
> > > > 
> > > > Correct, that's my motivation for all of this.
> > > 
> > > Okay. So I think that given this, we can limit the total number
> > > of non-indirect descriptors, including non-indirect ones
> > > in a chain + all the ones in indirect pointer table if any,
> > > and excluding the indirect descriptor itself, and this
> > > will address the issue you are describing here, right?
> > 
> > As far as I understand your suggestion, yes, it would cover:
> > 
> > A. My use case [1]: allowing indirect table length > queue size.
> > B. Stefan's use case [2]: forcing indirect table length < queue size.
> > C. Your use case: forcing chain (within FIFOs) length < queue size.
> > 
> > [1]
> > https://lore.kernel.org/netdev/cover.1640870037.git.linux_oss@crudebyte.c
> > om/ [2]
> > https://lore.kernel.org/all/YXpmwP6RtvY0BmSM@stefanha-x1.localdomain/
> > 
> > However it would not cover:
> > 
> > D. Your other use case: blocking indirect tables.
> 
> Well, the practical use-case I have in mind is actually for when using
> mergeable buffers with the network device, in practice there's never a
> s/g but there's no way for device to know that. So maybe we can declare
> that indirect must not be used when there's a single descriptor in the
> table (esp when the new feature is accepted), and then just limiting s/g
> to 1 will do the trick in this specific case.
> 
> > E. Potential other people's need: max. chain length (within FIFOs) != max.
> > 
> >    indirect table length.
> 
> I don't understand this one, sorry.

Example: say you would want to allow large indirect tables, but OTOH want to 
block any chained (pre-allocated) direct descriptors in the vring FIFOs 
themselfes, e.g.:

	Queue Size (max. # "buffers" in FIFO) = 127
	Indirect Size (max. # segments in indirect descr. table) = 65535
	Direct Size (max. # chained descr. per "buffer" in FIFO) = 1
	Total Size (max. # sum of all descr. per "buffer") = 65536

Your suggestion would not cover this example. Why would somebody want that? 
Because of device/driver simplification (i.e. same motivation you apparently 
had in mind for blocking indirect tables for network devices) and to prevent 
one side from starving because other side filled entire FIFO with large bulk 
data for a single message.

> > Instead of yet again mixing different limits again with each other, what
> > about using 3 distinct fields for those limits instead:
> > 
> > 1. max. indirect table length (any descriptor in the table)
> > 2. max. chain length (currently: descriptors within FIFOs)
> > 3. max. total descriptors per buffer (sum of all descriptors, both from
> > FIFOs> 
> >    and indirect tables, including descriptor pointing to a table).
> 
> So this doesn't actually address at least what you called "your
> use-case" above. Well it almost does, but not exactly because
> of the extra indirect descriptor.

Of course it would cover it. And of course you would have to either subtract 
or add one descriptor somewhere, no matter which solution is chosen in the 
end.

Best regards,
Christian Schoenebeck



^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [virtio-comment] [PATCH v3 1/4] Add VIRTIO_RING_F_INDIRECT_SIZE
  2022-03-24  9:16                             ` Christian Schoenebeck
@ 2022-03-24 10:36                               ` Michael S. Tsirkin
  2022-03-24 11:11                                 ` Christian Schoenebeck
  0 siblings, 1 reply; 39+ messages in thread
From: Michael S. Tsirkin @ 2022-03-24 10:36 UTC (permalink / raw)
  To: Christian Schoenebeck
  Cc: virtio-comment, Cornelia Huck, Stefan Hajnoczi, Greg Kurz,
	Dominique Martinet, Halil Pasic

On Thu, Mar 24, 2022 at 10:16:00AM +0100, Christian Schoenebeck wrote:
> On Mittwoch, 23. März 2022 13:35:00 CET Michael S. Tsirkin wrote:
> > On Wed, Mar 23, 2022 at 11:20:07AM +0100, Christian Schoenebeck wrote:
> > > On Montag, 21. März 2022 23:13:55 CET Michael S. Tsirkin wrote:
> > > > On Mon, Mar 21, 2022 at 10:23:07AM +0100, Christian Schoenebeck wrote:
> > > > > On Sonntag, 20. März 2022 22:52:16 CET Michael S. Tsirkin wrote:
> > > > > > On Sun, Mar 20, 2022 at 06:43:53PM +0100, Christian Schoenebeck 
> wrote:
> > > > > > OK let's go back and agree on what we are trying to achieve.  The
> > > > > > github
> > > > > > issue and the cover letter imply that while indirect descriptors
> > > > > > would
> > > > > > normally allow huge tables, we artificially limit them to queue
> > > > > > size,
> > > > > > and you want to be able to relax that.
> > > > > 
> > > > > Correct, that's my motivation for all of this.
> > > > 
> > > > Okay. So I think that given this, we can limit the total number
> > > > of non-indirect descriptors, including non-indirect ones
> > > > in a chain + all the ones in indirect pointer table if any,
> > > > and excluding the indirect descriptor itself, and this
> > > > will address the issue you are describing here, right?
> > > 
> > > As far as I understand your suggestion, yes, it would cover:
> > > 
> > > A. My use case [1]: allowing indirect table length > queue size.
> > > B. Stefan's use case [2]: forcing indirect table length < queue size.
> > > C. Your use case: forcing chain (within FIFOs) length < queue size.
> > > 
> > > [1]
> > > https://lore.kernel.org/netdev/cover.1640870037.git.linux_oss@crudebyte.c
> > > om/ [2]
> > > https://lore.kernel.org/all/YXpmwP6RtvY0BmSM@stefanha-x1.localdomain/
> > > 
> > > However it would not cover:
> > > 
> > > D. Your other use case: blocking indirect tables.
> > 
> > Well, the practical use-case I have in mind is actually for when using
> > mergeable buffers with the network device, in practice there's never a
> > s/g but there's no way for device to know that. So maybe we can declare
> > that indirect must not be used when there's a single descriptor in the
> > table (esp when the new feature is accepted), and then just limiting s/g
> > to 1 will do the trick in this specific case.
> > 
> > > E. Potential other people's need: max. chain length (within FIFOs) != max.
> > > 
> > >    indirect table length.
> > 
> > I don't understand this one, sorry.
> 
> Example: say you would want to allow large indirect tables, but OTOH want to 
> block any chained (pre-allocated) direct descriptors in the vring FIFOs 
> themselfes, e.g.:
> 
> 	Queue Size (max. # "buffers" in FIFO) = 127
> 	Indirect Size (max. # segments in indirect descr. table) = 65535
> 	Direct Size (max. # chained descr. per "buffer" in FIFO) = 1
> 	Total Size (max. # sum of all descr. per "buffer") = 65536
> 
> Your suggestion would not cover this example. Why would somebody want that? 
> Because of device/driver simplification

It does not simplify the driver, does it? It's always an extra resriction on
it.

> (i.e. same motivation you apparently 
> had in mind for blocking indirect tables for network devices)

OK I got this one. Device writers do complain about the complexity.
However the tradeoff in limiting what device accepts is ability
to migrate to a different device. So there's an advantage to
making this a separate feature bit so that it's clear to people it's
a low-end device.

> and to prevent 
> one side from starving because other side filled entire FIFO with large bulk 
> data for a single message.

This last I don't see why would it be the device's problem. If driver
will starve the device it can always do so, if it wants to use
indirect aggressively to avoid filling up the queue it can do so too.

> > > Instead of yet again mixing different limits again with each other, what
> > > about using 3 distinct fields for those limits instead:
> > > 
> > > 1. max. indirect table length (any descriptor in the table)
> > > 2. max. chain length (currently: descriptors within FIFOs)
> > > 3. max. total descriptors per buffer (sum of all descriptors, both from
> > > FIFOs> 
> > >    and indirect tables, including descriptor pointing to a table).
> > 
> > So this doesn't actually address at least what you called "your
> > use-case" above. Well it almost does, but not exactly because
> > of the extra indirect descriptor.
> 
> Of course it would cover it.
> And of course you would have to either subtract 
> or add one descriptor somewhere,

Well, indirect descriptors can be chained with VIRTQ_DESC_F_NEXT right?
The number of in the chain is not really predictable, it's not always
1.

> no matter which solution is chosen in the 
> end.
> 
> Best regards,
> Christian Schoenebeck


-- 
MST


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [virtio-comment] [PATCH v3 1/4] Add VIRTIO_RING_F_INDIRECT_SIZE
  2022-03-24 10:36                               ` Michael S. Tsirkin
@ 2022-03-24 11:11                                 ` Christian Schoenebeck
  2022-03-24 11:16                                   ` Michael S. Tsirkin
  0 siblings, 1 reply; 39+ messages in thread
From: Christian Schoenebeck @ 2022-03-24 11:11 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: virtio-comment, Cornelia Huck, Stefan Hajnoczi, Greg Kurz,
	Dominique Martinet, Halil Pasic

On Donnerstag, 24. März 2022 11:36:50 CET Michael S. Tsirkin wrote:
> On Thu, Mar 24, 2022 at 10:16:00AM +0100, Christian Schoenebeck wrote:
> > On Mittwoch, 23. März 2022 13:35:00 CET Michael S. Tsirkin wrote:
> > > On Wed, Mar 23, 2022 at 11:20:07AM +0100, Christian Schoenebeck wrote:
> > > > On Montag, 21. März 2022 23:13:55 CET Michael S. Tsirkin wrote:
> > > > > On Mon, Mar 21, 2022 at 10:23:07AM +0100, Christian Schoenebeck 
wrote:
> > > > > > On Sonntag, 20. März 2022 22:52:16 CET Michael S. Tsirkin wrote:
> > > > > > > On Sun, Mar 20, 2022 at 06:43:53PM +0100, Christian Schoenebeck
> > 
> > wrote:
> > > > > > > OK let's go back and agree on what we are trying to achieve. 
> > > > > > > The
> > > > > > > github
> > > > > > > issue and the cover letter imply that while indirect descriptors
> > > > > > > would
> > > > > > > normally allow huge tables, we artificially limit them to queue
> > > > > > > size,
> > > > > > > and you want to be able to relax that.
> > > > > > 
> > > > > > Correct, that's my motivation for all of this.
> > > > > 
> > > > > Okay. So I think that given this, we can limit the total number
> > > > > of non-indirect descriptors, including non-indirect ones
> > > > > in a chain + all the ones in indirect pointer table if any,
> > > > > and excluding the indirect descriptor itself, and this
> > > > > will address the issue you are describing here, right?
> > > > 
> > > > As far as I understand your suggestion, yes, it would cover:
> > > > 
> > > > A. My use case [1]: allowing indirect table length > queue size.
> > > > B. Stefan's use case [2]: forcing indirect table length < queue size.
> > > > C. Your use case: forcing chain (within FIFOs) length < queue size.
> > > > 
> > > > [1]
> > > > https://lore.kernel.org/netdev/cover.1640870037.git.linux_oss@crudebyt
> > > > e.c
> > > > om/ [2]
> > > > https://lore.kernel.org/all/YXpmwP6RtvY0BmSM@stefanha-x1.localdomain/
> > > > 
> > > > However it would not cover:
> > > > 
> > > > D. Your other use case: blocking indirect tables.
> > > 
> > > Well, the practical use-case I have in mind is actually for when using
> > > mergeable buffers with the network device, in practice there's never a
> > > s/g but there's no way for device to know that. So maybe we can declare
> > > that indirect must not be used when there's a single descriptor in the
> > > table (esp when the new feature is accepted), and then just limiting s/g
> > > to 1 will do the trick in this specific case.
> > > 
> > > > E. Potential other people's need: max. chain length (within FIFOs) !=
> > > > max.
> > > > 
> > > >    indirect table length.
> > > 
> > > I don't understand this one, sorry.
> > 
> > Example: say you would want to allow large indirect tables, but OTOH want
> > to block any chained (pre-allocated) direct descriptors in the vring
> > FIFOs> 
> > themselfes, e.g.:
> > 	Queue Size (max. # "buffers" in FIFO) = 127
> > 	Indirect Size (max. # segments in indirect descr. table) = 65535
> > 	Direct Size (max. # chained descr. per "buffer" in FIFO) = 1
> > 	Total Size (max. # sum of all descr. per "buffer") = 65536
> > 
> > Your suggestion would not cover this example. Why would somebody want
> > that?
> > Because of device/driver simplification
> 
> It does not simplify the driver, does it? It's always an extra resriction on
> it.

Why not? If it's the driver that sets Direct Size = 1 then it can ignore 
handling chained direct descriptors. Sounds like a simplification to me.

> > (i.e. same motivation you apparently
> > had in mind for blocking indirect tables for network devices)
> 
> OK I got this one. Device writers do complain about the complexity.
> However the tradeoff in limiting what device accepts is ability
> to migrate to a different device. So there's an advantage to
> making this a separate feature bit so that it's clear to people it's
> a low-end device.
> 
> > and to prevent
> > one side from starving because other side filled entire FIFO with large
> > bulk data for a single message.
> 
> This last I don't see why would it be the device's problem. If driver
> will starve the device it can always do so, if it wants to use
> indirect aggressively to avoid filling up the queue it can do so too.

To actually make it possible to enforce such kind of policy: "do not use the 
FIFO for large bulk data, use indirect space for that instead".

> > > > Instead of yet again mixing different limits again with each other,
> > > > what
> > > > about using 3 distinct fields for those limits instead:
> > > > 
> > > > 1. max. indirect table length (any descriptor in the table)
> > > > 2. max. chain length (currently: descriptors within FIFOs)
> > > > 3. max. total descriptors per buffer (sum of all descriptors, both
> > > > from
> > > > FIFOs>
> > > > 
> > > >    and indirect tables, including descriptor pointing to a table).
> > > 
> > > So this doesn't actually address at least what you called "your
> > > use-case" above. Well it almost does, but not exactly because
> > > of the extra indirect descriptor.
> > 
> > Of course it would cover it.
> > And of course you would have to either subtract
> > or add one descriptor somewhere,
> 
> Well, indirect descriptors can be chained with VIRTQ_DESC_F_NEXT right?
> The number of in the chain is not really predictable, it's not always
> 1.

I don't think so:

  2.6.5.3.1 Driver Requirements: Indirect Descriptors
  ..
  "A driver MUST NOT set both VIRTQ_DESC_F_INDIRECT and VIRTQ_DESC_F_NEXT in
  flags."

So as far as I can see it, the amount of direct descriptors is currently 
always exactly one if an indirect table is used.

Best regards,
Christian Schoenebeck



^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [virtio-comment] [PATCH v3 1/4] Add VIRTIO_RING_F_INDIRECT_SIZE
  2022-03-24 11:11                                 ` Christian Schoenebeck
@ 2022-03-24 11:16                                   ` Michael S. Tsirkin
  2022-03-24 11:52                                     ` Christian Schoenebeck
  0 siblings, 1 reply; 39+ messages in thread
From: Michael S. Tsirkin @ 2022-03-24 11:16 UTC (permalink / raw)
  To: Christian Schoenebeck
  Cc: virtio-comment, Cornelia Huck, Stefan Hajnoczi, Greg Kurz,
	Dominique Martinet, Halil Pasic

On Thu, Mar 24, 2022 at 12:11:33PM +0100, Christian Schoenebeck wrote:
> On Donnerstag, 24. März 2022 11:36:50 CET Michael S. Tsirkin wrote:
> > On Thu, Mar 24, 2022 at 10:16:00AM +0100, Christian Schoenebeck wrote:
> > > On Mittwoch, 23. März 2022 13:35:00 CET Michael S. Tsirkin wrote:
> > > > On Wed, Mar 23, 2022 at 11:20:07AM +0100, Christian Schoenebeck wrote:
> > > > > On Montag, 21. März 2022 23:13:55 CET Michael S. Tsirkin wrote:
> > > > > > On Mon, Mar 21, 2022 at 10:23:07AM +0100, Christian Schoenebeck 
> wrote:
> > > > > > > On Sonntag, 20. März 2022 22:52:16 CET Michael S. Tsirkin wrote:
> > > > > > > > On Sun, Mar 20, 2022 at 06:43:53PM +0100, Christian Schoenebeck
> > > 
> > > wrote:
> > > > > > > > OK let's go back and agree on what we are trying to achieve. 
> > > > > > > > The
> > > > > > > > github
> > > > > > > > issue and the cover letter imply that while indirect descriptors
> > > > > > > > would
> > > > > > > > normally allow huge tables, we artificially limit them to queue
> > > > > > > > size,
> > > > > > > > and you want to be able to relax that.
> > > > > > > 
> > > > > > > Correct, that's my motivation for all of this.
> > > > > > 
> > > > > > Okay. So I think that given this, we can limit the total number
> > > > > > of non-indirect descriptors, including non-indirect ones
> > > > > > in a chain + all the ones in indirect pointer table if any,
> > > > > > and excluding the indirect descriptor itself, and this
> > > > > > will address the issue you are describing here, right?
> > > > > 
> > > > > As far as I understand your suggestion, yes, it would cover:
> > > > > 
> > > > > A. My use case [1]: allowing indirect table length > queue size.
> > > > > B. Stefan's use case [2]: forcing indirect table length < queue size.
> > > > > C. Your use case: forcing chain (within FIFOs) length < queue size.
> > > > > 
> > > > > [1]
> > > > > https://lore.kernel.org/netdev/cover.1640870037.git.linux_oss@crudebyt
> > > > > e.c
> > > > > om/ [2]
> > > > > https://lore.kernel.org/all/YXpmwP6RtvY0BmSM@stefanha-x1.localdomain/
> > > > > 
> > > > > However it would not cover:
> > > > > 
> > > > > D. Your other use case: blocking indirect tables.
> > > > 
> > > > Well, the practical use-case I have in mind is actually for when using
> > > > mergeable buffers with the network device, in practice there's never a
> > > > s/g but there's no way for device to know that. So maybe we can declare
> > > > that indirect must not be used when there's a single descriptor in the
> > > > table (esp when the new feature is accepted), and then just limiting s/g
> > > > to 1 will do the trick in this specific case.
> > > > 
> > > > > E. Potential other people's need: max. chain length (within FIFOs) !=
> > > > > max.
> > > > > 
> > > > >    indirect table length.
> > > > 
> > > > I don't understand this one, sorry.
> > > 
> > > Example: say you would want to allow large indirect tables, but OTOH want
> > > to block any chained (pre-allocated) direct descriptors in the vring
> > > FIFOs> 
> > > themselfes, e.g.:
> > > 	Queue Size (max. # "buffers" in FIFO) = 127
> > > 	Indirect Size (max. # segments in indirect descr. table) = 65535
> > > 	Direct Size (max. # chained descr. per "buffer" in FIFO) = 1
> > > 	Total Size (max. # sum of all descr. per "buffer") = 65536
> > > 
> > > Your suggestion would not cover this example. Why would somebody want
> > > that?
> > > Because of device/driver simplification
> > 
> > It does not simplify the driver, does it? It's always an extra resriction on
> > it.
> 
> Why not? If it's the driver that sets Direct Size = 1 then it can ignore 
> handling chained direct descriptors. Sounds like a simplification to me.

If driver does not want to chain descriptors it does not have to.
There's no need to device to forbid it from doing so.

> > > (i.e. same motivation you apparently
> > > had in mind for blocking indirect tables for network devices)
> > 
> > OK I got this one. Device writers do complain about the complexity.
> > However the tradeoff in limiting what device accepts is ability
> > to migrate to a different device. So there's an advantage to
> > making this a separate feature bit so that it's clear to people it's
> > a low-end device.
> > 
> > > and to prevent
> > > one side from starving because other side filled entire FIFO with large
> > > bulk data for a single message.
> > 
> > This last I don't see why would it be the device's problem. If driver
> > will starve the device it can always do so, if it wants to use
> > indirect aggressively to avoid filling up the queue it can do so too.
> 
> To actually make it possible to enforce such kind of policy: "do not use the 
> FIFO for large bulk data, use indirect space for that instead".

Why does device want to enforce this policy? Policy is generally
something best left to guests they know how device is used.

> > > > > Instead of yet again mixing different limits again with each other,
> > > > > what
> > > > > about using 3 distinct fields for those limits instead:
> > > > > 
> > > > > 1. max. indirect table length (any descriptor in the table)
> > > > > 2. max. chain length (currently: descriptors within FIFOs)
> > > > > 3. max. total descriptors per buffer (sum of all descriptors, both
> > > > > from
> > > > > FIFOs>
> > > > > 
> > > > >    and indirect tables, including descriptor pointing to a table).
> > > > 
> > > > So this doesn't actually address at least what you called "your
> > > > use-case" above. Well it almost does, but not exactly because
> > > > of the extra indirect descriptor.
> > > 
> > > Of course it would cover it.
> > > And of course you would have to either subtract
> > > or add one descriptor somewhere,
> > 
> > Well, indirect descriptors can be chained with VIRTQ_DESC_F_NEXT right?
> > The number of in the chain is not really predictable, it's not always
> > 1.
> 
> I don't think so:
> 
>   2.6.5.3.1 Driver Requirements: Indirect Descriptors
>   ..
>   "A driver MUST NOT set both VIRTQ_DESC_F_INDIRECT and VIRTQ_DESC_F_NEXT in
>   flags."
> So as far as I can see it, the amount of direct descriptors is currently 
> always exactly one if an indirect table is used.
> 
> Best regards,
> Christian Schoenebeck
> 

Oh. You are right.  Weirdly this text is not in packed-ring.tex - I
think this and a bunch of other cases like this are an oversight,
however we need to fix them first before adding features that assume
they are fixed ...

-- 
MST


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [virtio-comment] [PATCH v3 1/4] Add VIRTIO_RING_F_INDIRECT_SIZE
  2022-03-24 11:16                                   ` Michael S. Tsirkin
@ 2022-03-24 11:52                                     ` Christian Schoenebeck
  0 siblings, 0 replies; 39+ messages in thread
From: Christian Schoenebeck @ 2022-03-24 11:52 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: virtio-comment, Cornelia Huck, Stefan Hajnoczi, Greg Kurz,
	Dominique Martinet, Halil Pasic

On Donnerstag, 24. März 2022 12:16:51 CET Michael S. Tsirkin wrote:
> On Thu, Mar 24, 2022 at 12:11:33PM +0100, Christian Schoenebeck wrote:
> > On Donnerstag, 24. März 2022 11:36:50 CET Michael S. Tsirkin wrote:
> > > On Thu, Mar 24, 2022 at 10:16:00AM +0100, Christian Schoenebeck wrote:
> > > > On Mittwoch, 23. März 2022 13:35:00 CET Michael S. Tsirkin wrote:
> > > > > On Wed, Mar 23, 2022 at 11:20:07AM +0100, Christian Schoenebeck 
wrote:
> > > > > > On Montag, 21. März 2022 23:13:55 CET Michael S. Tsirkin wrote:
> > > > > > > On Mon, Mar 21, 2022 at 10:23:07AM +0100, Christian Schoenebeck
> > 
> > wrote:
> > > > > > > > On Sonntag, 20. März 2022 22:52:16 CET Michael S. Tsirkin 
wrote:
> > > > > > > > > On Sun, Mar 20, 2022 at 06:43:53PM +0100, Christian
> > > > > > > > > Schoenebeck
> > > > 
> > > > wrote:
> > > > > > > > > OK let's go back and agree on what we are trying to achieve.
> > > > > > > > > The
> > > > > > > > > github
> > > > > > > > > issue and the cover letter imply that while indirect
> > > > > > > > > descriptors
> > > > > > > > > would
> > > > > > > > > normally allow huge tables, we artificially limit them to
> > > > > > > > > queue
> > > > > > > > > size,
> > > > > > > > > and you want to be able to relax that.
> > > > > > > > 
> > > > > > > > Correct, that's my motivation for all of this.
> > > > > > > 
> > > > > > > Okay. So I think that given this, we can limit the total number
> > > > > > > of non-indirect descriptors, including non-indirect ones
> > > > > > > in a chain + all the ones in indirect pointer table if any,
> > > > > > > and excluding the indirect descriptor itself, and this
> > > > > > > will address the issue you are describing here, right?
> > > > > > 
> > > > > > As far as I understand your suggestion, yes, it would cover:
> > > > > > 
> > > > > > A. My use case [1]: allowing indirect table length > queue size.
> > > > > > B. Stefan's use case [2]: forcing indirect table length < queue
> > > > > > size.
> > > > > > C. Your use case: forcing chain (within FIFOs) length < queue
> > > > > > size.
> > > > > > 
> > > > > > [1]
> > > > > > https://lore.kernel.org/netdev/cover.1640870037.git.linux_oss@crud
> > > > > > ebyt
> > > > > > e.c
> > > > > > om/ [2]
> > > > > > https://lore.kernel.org/all/YXpmwP6RtvY0BmSM@stefanha-x1.localdoma
> > > > > > in/
> > > > > > 
> > > > > > However it would not cover:
> > > > > > 
> > > > > > D. Your other use case: blocking indirect tables.
> > > > > 
> > > > > Well, the practical use-case I have in mind is actually for when
> > > > > using
> > > > > mergeable buffers with the network device, in practice there's never
> > > > > a
> > > > > s/g but there's no way for device to know that. So maybe we can
> > > > > declare
> > > > > that indirect must not be used when there's a single descriptor in
> > > > > the
> > > > > table (esp when the new feature is accepted), and then just limiting
> > > > > s/g
> > > > > to 1 will do the trick in this specific case.
> > > > > 
> > > > > > E. Potential other people's need: max. chain length (within FIFOs)
> > > > > > !=
> > > > > > max.
> > > > > > 
> > > > > >    indirect table length.
> > > > > 
> > > > > I don't understand this one, sorry.
> > > > 
> > > > Example: say you would want to allow large indirect tables, but OTOH
> > > > want
> > > > to block any chained (pre-allocated) direct descriptors in the vring
> > > > FIFOs>
> > > > 
> > > > themselfes, e.g.:
> > > > 	Queue Size (max. # "buffers" in FIFO) = 127
> > > > 	Indirect Size (max. # segments in indirect descr. table) = 65535
> > > > 	Direct Size (max. # chained descr. per "buffer" in FIFO) = 1
> > > > 	Total Size (max. # sum of all descr. per "buffer") = 65536
> > > > 
> > > > Your suggestion would not cover this example. Why would somebody want
> > > > that?
> > > > Because of device/driver simplification
> > > 
> > > It does not simplify the driver, does it? It's always an extra
> > > resriction on it.
> > 
> > Why not? If it's the driver that sets Direct Size = 1 then it can ignore
> > handling chained direct descriptors. Sounds like a simplification to me.
> 
> If driver does not want to chain descriptors it does not have to.
> There's no need to device to forbid it from doing so.
> 
> > > > (i.e. same motivation you apparently
> > > > had in mind for blocking indirect tables for network devices)
> > > 
> > > OK I got this one. Device writers do complain about the complexity.
> > > However the tradeoff in limiting what device accepts is ability
> > > to migrate to a different device. So there's an advantage to
> > > making this a separate feature bit so that it's clear to people it's
> > > a low-end device.
> > > 
> > > > and to prevent
> > > > one side from starving because other side filled entire FIFO with
> > > > large
> > > > bulk data for a single message.
> > > 
> > > This last I don't see why would it be the device's problem. If driver
> > > will starve the device it can always do so, if it wants to use
> > > indirect aggressively to avoid filling up the queue it can do so too.
> > 
> > To actually make it possible to enforce such kind of policy: "do not use
> > the FIFO for large bulk data, use indirect space for that instead".
> 
> Why does device want to enforce this policy? Policy is generally
> something best left to guests they know how device is used.

I'm just saying. Period. It's not that I care. In the end I leave it to other 
people to fit this into an appropriate superset picture, caring about future 
proofness, etc.

As long as "my" use case (indirect size > queue size) will be covered, then I 
will be fine with any design. #pragmatism

> > > > > > Instead of yet again mixing different limits again with each
> > > > > > other,
> > > > > > what
> > > > > > about using 3 distinct fields for those limits instead:
> > > > > > 
> > > > > > 1. max. indirect table length (any descriptor in the table)
> > > > > > 2. max. chain length (currently: descriptors within FIFOs)
> > > > > > 3. max. total descriptors per buffer (sum of all descriptors, both
> > > > > > from
> > > > > > FIFOs>
> > > > > > 
> > > > > >    and indirect tables, including descriptor pointing to a table).
> > > > > 
> > > > > So this doesn't actually address at least what you called "your
> > > > > use-case" above. Well it almost does, but not exactly because
> > > > > of the extra indirect descriptor.
> > > > 
> > > > Of course it would cover it.
> > > > And of course you would have to either subtract
> > > > or add one descriptor somewhere,
> > > 
> > > Well, indirect descriptors can be chained with VIRTQ_DESC_F_NEXT right?
> > > The number of in the chain is not really predictable, it's not always
> > > 1.
> > 
> > I don't think so:
> >   2.6.5.3.1 Driver Requirements: Indirect Descriptors
> >   ..
> >   "A driver MUST NOT set both VIRTQ_DESC_F_INDIRECT and VIRTQ_DESC_F_NEXT
> >   in
> >   flags."
> > 
> > So as far as I can see it, the amount of direct descriptors is currently
> > always exactly one if an indirect table is used.
> > 
> > Best regards,
> > Christian Schoenebeck
> 
> Oh. You are right.  Weirdly this text is not in packed-ring.tex - I

As far as I can see it, you both originally added this normative requirement 
to content.tex with 3b676c2f (Wed Apr 29 11:09:04 2015) and then you moved it 
to split-ring.tex with 6131f8d9 (Fri Mar 9 23:23:30 2018).

> think this and a bunch of other cases like this are an oversight,
> however we need to fix them first before adding features that assume
> they are fixed ...

Makes sense.

Feel free to resume this discussion when you think it is time to.

It would be good to see other people's opinion on this overall issue as well 
though.

Thanks!

Best regards,
Christian Schoenebeck



^ permalink raw reply	[flat|nested] 39+ messages in thread

end of thread, other threads:[~2022-03-24 11:52 UTC | newest]

Thread overview: 39+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-03-16 13:44 [PATCH v3 0/4] Add VIRTIO_RING_F_INDIRECT_SIZE and queue_indirect_size Christian Schoenebeck
2022-03-16 13:47 ` [PATCH v3 1/4] Add VIRTIO_RING_F_INDIRECT_SIZE Christian Schoenebeck
2022-03-17 13:40   ` [virtio-comment] " Cornelia Huck
2022-03-18 10:45     ` Christian Schoenebeck
2022-03-18 16:03       ` [virtio-comment] " Cornelia Huck
2022-03-19  9:33   ` [virtio-comment] " Michael S. Tsirkin
2022-03-19 12:00     ` Christian Schoenebeck
2022-03-20 12:31       ` Michael S. Tsirkin
2022-03-20 13:32         ` Christian Schoenebeck
2022-03-20 13:55           ` Michael S. Tsirkin
2022-03-20 15:17             ` Christian Schoenebeck
2022-03-20 16:06               ` Michael S. Tsirkin
2022-03-20 16:07                 ` Michael S. Tsirkin
2022-03-20 17:43                 ` Christian Schoenebeck
2022-03-20 21:52                   ` Michael S. Tsirkin
2022-03-21  9:23                     ` Christian Schoenebeck
2022-03-21 22:13                       ` Michael S. Tsirkin
2022-03-23 10:20                         ` Christian Schoenebeck
2022-03-23 12:35                           ` Michael S. Tsirkin
2022-03-24  9:16                             ` Christian Schoenebeck
2022-03-24 10:36                               ` Michael S. Tsirkin
2022-03-24 11:11                                 ` Christian Schoenebeck
2022-03-24 11:16                                   ` Michael S. Tsirkin
2022-03-24 11:52                                     ` Christian Schoenebeck
2022-03-16 13:50 ` [PATCH v3 2/4] Add PCI configuration field "queue_indirect_size" Christian Schoenebeck
2022-03-16 14:41   ` [virtio-comment] " Christian Schoenebeck
2022-03-16 13:52 ` [PATCH v3 3/4] Add MMIO configuration register "QueueIndirectNum" Christian Schoenebeck
2022-03-16 13:55 ` [PATCH v3 4/4] Add CCW configuration field "indirect_num" Christian Schoenebeck
2022-03-17 14:12   ` [virtio-comment] " Cornelia Huck
2022-03-18 11:02     ` Christian Schoenebeck
2022-03-18 16:06       ` Halil Pasic
2022-03-19 10:12         ` Christian Schoenebeck
2022-03-21 16:36           ` Cornelia Huck
2022-03-22  1:56             ` Halil Pasic
2022-03-22  9:57               ` Cornelia Huck
2022-03-22 11:21                 ` Halil Pasic
2022-03-18 16:10       ` Cornelia Huck
2022-03-19 10:23         ` Christian Schoenebeck
2022-03-21 16:25           ` Cornelia Huck

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.