All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 1/2] Add VIRTIO_RING_F_LARGE_INDIRECT_DESC
  2021-11-29 13:24 [PATCH v2 0/2] Add VIRTIO_RING_F_LARGE_INDIRECT_DESC Christian Schoenebeck
@ 2021-11-29 13:22 ` Christian Schoenebeck
  2021-12-02 16:04   ` Stefan Hajnoczi
  2021-11-29 13:23 ` [PATCH v2 2/2] Add common configuration field "queue_indirect_size" Christian Schoenebeck
  2021-11-30 12:06 ` [virtio-comment] Re: [PATCH v2 0/2] Add VIRTIO_RING_F_LARGE_INDIRECT_DESC Cornelia Huck
  2 siblings, 1 reply; 21+ messages in thread
From: Christian Schoenebeck @ 2021-11-29 13:22 UTC (permalink / raw)
  To: virtio-comment; +Cc: Stefan Hajnoczi, Cornelia Huck, Greg Kurz

This new feature flag allows indirect descriptor tables to
exceed the queue size.

Fixes: https://github.com/oasis-tcs/virtio-spec/issues/122
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
---
 content.tex    | 21 +++++++++++++++++++++
 split-ring.tex |  2 +-
 2 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/content.tex b/content.tex
index 5d112af..34fa23e 100644
--- a/content.tex
+++ b/content.tex
@@ -6693,6 +6693,27 @@ \chapter{Reserved Feature Bits}\label{sec:Reserved Feature Bits}
   transport specific.
   For more details about driver notifications over PCI see \ref{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI-specific Initialization And Device Operation / Available Buffer Notifications}.
 
+  \item[VIRTIO_RING_F_LARGE_INDIRECT_DESC(40)] This feature indicates that the
+  amount of descriptors in an indirect descriptor table is allowed to exceed
+  the Queue Size.
+
+  If this feature bit is not negotiated, then a driver MUST NOT create a
+  descriptor chain longer than the Queue Size, which then also applies to
+  indirect descriptor tables as in \ref{sec:Basic Facilities of a Virtio
+  Device / Virtqueues / The Virtqueue Descriptor Table / Indirect
+  Descriptors}. Which means without this feature, the Queue Size limits
+  both the maximum amount of slots in the vring, as well as the actual
+  bulk data size being transmitted per vring slot.
+
+  With this feature enabled, the Queue Size only limits the maximum amount
+  of slots in the vring, but does not oppose a limit to the actual bulk data
+  size being transmitted when indirect descriptors are used. Decoupling these
+  two configuration parameters this way not only allows much larger bulk data
+  being transferred per vring slot, but also avoids complicated synchronization
+  mechanisms if device only supports a very small amount of vring slots. Due to
+  the 16-bit size of a descriptor's "next" field there is still an absolute
+  limit of $2^{16}$ descriptors per indirect descriptor table.
+
 \end{description}
 
 \drivernormative{\section}{Reserved Feature Bits}{Reserved Feature Bits}
diff --git a/split-ring.tex b/split-ring.tex
index bfef62d..5284635 100644
--- a/split-ring.tex
+++ b/split-ring.tex
@@ -269,7 +269,7 @@ \subsubsection{Indirect Descriptors}\label{sec:Basic Facilities of a Virtio Devi
 one table per descriptor).
 
 A driver MUST NOT create a descriptor chain longer than the Queue Size of
-the device.
+the device unless VIRTIO_RING_F_LARGE_INDIRECT_DESC has been negotiated.
 
 A driver MUST NOT set both VIRTQ_DESC_F_INDIRECT and VIRTQ_DESC_F_NEXT
 in \field{flags}.
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH v2 2/2] Add common configuration field "queue_indirect_size"
  2021-11-29 13:24 [PATCH v2 0/2] Add VIRTIO_RING_F_LARGE_INDIRECT_DESC Christian Schoenebeck
  2021-11-29 13:22 ` [PATCH v2 1/2] " Christian Schoenebeck
@ 2021-11-29 13:23 ` Christian Schoenebeck
  2021-12-02 16:21   ` Stefan Hajnoczi
  2021-11-30 12:06 ` [virtio-comment] Re: [PATCH v2 0/2] Add VIRTIO_RING_F_LARGE_INDIRECT_DESC Cornelia Huck
  2 siblings, 1 reply; 21+ messages in thread
From: Christian Schoenebeck @ 2021-11-29 13:23 UTC (permalink / raw)
  To: virtio-comment; +Cc: Stefan Hajnoczi, Cornelia Huck, Greg Kurz

This new common configuration field allows to negotiate a more fine
graded numeric maximum length of indirect descriptor chains.

Fixes: https://github.com/oasis-tcs/virtio-spec/issues/122
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
---
 content.tex    | 37 ++++++++++++++++++++++++++++++++++++-
 split-ring.tex |  5 +++++
 2 files changed, 41 insertions(+), 1 deletion(-)

diff --git a/content.tex b/content.tex
index 34fa23e..25861f3 100644
--- a/content.tex
+++ b/content.tex
@@ -859,6 +859,7 @@ \subsubsection{Common configuration structure layout}\label{sec:Virtio Transport
         le64 queue_driver;              /* read-write */
         le64 queue_device;              /* read-write */
         le16 queue_notify_data;         /* read-only for driver */
+        le16 queue_indirect_size;       /* read-write */
 };
 \end{lstlisting}
 
@@ -938,6 +939,34 @@ \subsubsection{Common configuration structure layout}\label{sec:Virtio Transport
         may benefit from providing another value, for example an internal virtqueue
         identifier, or an internal offset related to the virtqueue number.
         \end{note}
+
+\item[\field{queue_indirect_size}]
+        This field was added in revision 3 and MUST exist if revision 3
+        has been negotiated. This field is used to negotiate the maximum
+        amount of indirect descriptors per indirect descriptor table as in
+        \ref{sec:Basic Facilities of a Virtio Device / Virtqueues / The
+        Virtqueue Descriptor Table / Indirect Descriptors}.
+
+        The device specifies its supported maximum amount here first,
+        subsequently driver may read and then SHOULD write this field to
+        lower the value if driver's maximum amount of indirect descriptors
+        ever being emplaced by driver per vring slot is less than what
+        device supports. However driver MUST NOT increase this value.
+
+        Two different semantics evolve for the value of this field, depending on
+        whether VIRTIO_RING_F_LARGE_INDIRECT_DESC (see section
+        \ref{sec:Reserved Feature Bits}) has been negotiated:
+
+        If VIRTIO_RING_F_LARGE_INDIRECT_DESC has not been negotiated, then the
+        effective maximum amount of indirect descriptors per indirect descriptor
+        table is min(Queue Size, \field{queue_indirect_size}). So in this case
+        this field allows driver to indicate if its supported maximum amount of
+        indirect descriptors is less than Queue Size.
+
+        If VIRTIO_RING_F_LARGE_INDIRECT_DESC has been negotiated, then the
+        effective maximum amount of indirect descriptors per indirect descriptor
+        table is directly and solely reflected by \field{queue_indirect_size}
+        field here.
 \end{description}
 
 \devicenormative{\paragraph}{Common configuration structure layout}{Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Common configuration structure layout}
@@ -6712,7 +6741,13 @@ \chapter{Reserved Feature Bits}\label{sec:Reserved Feature Bits}
   being transferred per vring slot, but also avoids complicated synchronization
   mechanisms if device only supports a very small amount of vring slots. Due to
   the 16-bit size of a descriptor's "next" field there is still an absolute
-  limit of $2^{16}$ descriptors per indirect descriptor table.
+  limit of $2^{16}$ descriptors per indirect descriptor table. However the
+  actual maximum amount supported by either device or driver might be less,
+  and therefore the common configuration field "queue_indirect_size" MUST exist
+  if VIRTIO_RING_F_LARGE_INDIRECT_DESC had been negotiated to subsequently
+  negotiate the actual amount of maximum indirect descriptors supported
+  by both sides, as described in \ref{sec:Virtio Transport Options / Virtio
+  Over PCI Bus / PCI Device Layout / Common configuration structure layout}.
 
 \end{description}
 
diff --git a/split-ring.tex b/split-ring.tex
index 5284635..dff4018 100644
--- a/split-ring.tex
+++ b/split-ring.tex
@@ -270,6 +270,11 @@ \subsubsection{Indirect Descriptors}\label{sec:Basic Facilities of a Virtio Devi
 
 A driver MUST NOT create a descriptor chain longer than the Queue Size of
 the device unless VIRTIO_RING_F_LARGE_INDIRECT_DESC has been negotiated.
+Furthermore if common configuration field "queue_indirect_size" exists
+(since revision 3, see section \ref{sec:Virtio Transport Options / Virtio Over
+PCI Bus / PCI Device Layout / Common configuration structure layout}) then a
+descriptor chain additionally MUST NOT be longer than the value negotiated by
+that common configuration field.
 
 A driver MUST NOT set both VIRTQ_DESC_F_INDIRECT and VIRTQ_DESC_F_NEXT
 in \field{flags}.
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH v2 0/2] Add VIRTIO_RING_F_LARGE_INDIRECT_DESC
@ 2021-11-29 13:24 Christian Schoenebeck
  2021-11-29 13:22 ` [PATCH v2 1/2] " Christian Schoenebeck
                   ` (2 more replies)
  0 siblings, 3 replies; 21+ messages in thread
From: Christian Schoenebeck @ 2021-11-29 13:24 UTC (permalink / raw)
  To: virtio-comment; +Cc: Stefan Hajnoczi, Cornelia Huck, Greg Kurz

This is a follow-up of:
https://lists.oasis-open.org/archives/virtio-comment/202111/msg00059.html

These two patches introduce a more fine graded control over the maximum
length of an indirect descriptor table.

Associated Github Task:
https://github.com/oasis-tcs/virtio-spec/issues/122

v1 -> v2:

  * "active/pending message(s)" -> "vring slot(s)"
    NOTE: I decided to use the term "vring slot(s)" over the previously
    suggested term "buffer(s)" as I found tha latter too ambiguous in this
    context. [patch 1]

  * "... but does not oppose a limit to the actual bulk data size
    being transmitted."
    ->
    " ... but does not oppose a limit to the actual bulk data size
    being transmitted when indirect descriptors are used." [patch 1]

  * Add common configuration field "queue_indirect_size" and make it
    mandatory as of revision 3. [NEW patch 2]

Christian Schoenebeck (2):
  Add VIRTIO_RING_F_LARGE_INDIRECT_DESC
  Add common configuration field "queue_indirect_size"

 content.tex    | 56 ++++++++++++++++++++++++++++++++++++++++++++++++++
 split-ring.tex |  7 ++++++-
 2 files changed, 62 insertions(+), 1 deletion(-)

-- 
2.20.1


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [virtio-comment] Re: [PATCH v2 0/2] Add VIRTIO_RING_F_LARGE_INDIRECT_DESC
  2021-11-29 13:24 [PATCH v2 0/2] Add VIRTIO_RING_F_LARGE_INDIRECT_DESC Christian Schoenebeck
  2021-11-29 13:22 ` [PATCH v2 1/2] " Christian Schoenebeck
  2021-11-29 13:23 ` [PATCH v2 2/2] Add common configuration field "queue_indirect_size" Christian Schoenebeck
@ 2021-11-30 12:06 ` Cornelia Huck
  2021-11-30 13:33   ` Christian Schoenebeck
  2 siblings, 1 reply; 21+ messages in thread
From: Cornelia Huck @ 2021-11-30 12:06 UTC (permalink / raw)
  To: Christian Schoenebeck, virtio-comment; +Cc: Stefan Hajnoczi, Greg Kurz

On Mon, Nov 29 2021, Christian Schoenebeck <qemu_oss@crudebyte.com> wrote:

> This is a follow-up of:
> https://lists.oasis-open.org/archives/virtio-comment/202111/msg00059.html
>
> These two patches introduce a more fine graded control over the maximum
> length of an indirect descriptor table.
>
> Associated Github Task:
> https://github.com/oasis-tcs/virtio-spec/issues/122
>
> v1 -> v2:
>
>   * "active/pending message(s)" -> "vring slot(s)"
>     NOTE: I decided to use the term "vring slot(s)" over the previously
>     suggested term "buffer(s)" as I found tha latter too ambiguous in this
>     context. [patch 1]
>
>   * "... but does not oppose a limit to the actual bulk data size
>     being transmitted."
>     ->
>     " ... but does not oppose a limit to the actual bulk data size
>     being transmitted when indirect descriptors are used." [patch 1]
>
>   * Add common configuration field "queue_indirect_size" and make it
>     mandatory as of revision 3. [NEW patch 2]

Now you have me confused: 'revision' is a ccw transport concept, but you
only add support on the pci transport? It's fine to leave out ccw (and
mmio) for now, but you should not talk about revision 3...

>
> Christian Schoenebeck (2):
>   Add VIRTIO_RING_F_LARGE_INDIRECT_DESC
>   Add common configuration field "queue_indirect_size"
>
>  content.tex    | 56 ++++++++++++++++++++++++++++++++++++++++++++++++++
>  split-ring.tex |  7 ++++++-
>  2 files changed, 62 insertions(+), 1 deletion(-)


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [virtio-comment] Re: [PATCH v2 0/2] Add VIRTIO_RING_F_LARGE_INDIRECT_DESC
  2021-11-30 12:06 ` [virtio-comment] Re: [PATCH v2 0/2] Add VIRTIO_RING_F_LARGE_INDIRECT_DESC Cornelia Huck
@ 2021-11-30 13:33   ` Christian Schoenebeck
  2021-11-30 13:48     ` Cornelia Huck
  0 siblings, 1 reply; 21+ messages in thread
From: Christian Schoenebeck @ 2021-11-30 13:33 UTC (permalink / raw)
  To: virtio-comment; +Cc: Cornelia Huck, Stefan Hajnoczi, Greg Kurz

On Dienstag, 30. November 2021 13:06:31 CET Cornelia Huck wrote:
> On Mon, Nov 29 2021, Christian Schoenebeck <qemu_oss@crudebyte.com> wrote:
> > This is a follow-up of:
> > https://lists.oasis-open.org/archives/virtio-comment/202111/msg00059.html
> > 
> > These two patches introduce a more fine graded control over the maximum
> > length of an indirect descriptor table.
> > 
> > Associated Github Task:
> > https://github.com/oasis-tcs/virtio-spec/issues/122
> > 
> > v1 -> v2:
> >   * "active/pending message(s)" -> "vring slot(s)"
> >   
> >     NOTE: I decided to use the term "vring slot(s)" over the previously
> >     suggested term "buffer(s)" as I found tha latter too ambiguous in this
> >     context. [patch 1]
> >   
> >   * "... but does not oppose a limit to the actual bulk data size
> >   
> >     being transmitted."
> >     ->
> >     " ... but does not oppose a limit to the actual bulk data size
> >     being transmitted when indirect descriptors are used." [patch 1]
> >   
> >   * Add common configuration field "queue_indirect_size" and make it
> >   
> >     mandatory as of revision 3. [NEW patch 2]
> 
> Now you have me confused: 'revision' is a ccw transport concept, but you
> only add support on the pci transport? It's fine to leave out ccw (and
> mmio) for now, but you should not talk about revision 3...

Looks like I was too quick on this one. Right, the name virtio_pci_common_cfg 
should have suggested to me that this was PCI specific. :) I simply assumed 
this was a common config structure shared by all bus types.

And I had no idea that you were referring with "revision" to a bus specific 
mechanism either. I assumed virtio v1.0 = rev1, virtio v1.1 = rev2 and that 
the upcoming spec to become v1.2 = rev3 (i.e. for all bus tyes).

So what is the desired approach to proceed on this overall task, should this 
just be addressed on PCI for now as first step?

And what about the intended availability of this new virtio_pci_common_cfg 
field "queue_indirect_size", should it be optional per se, independent of the 
virtio version or rather a mandatory field in upcoming virtio version?

> > Christian Schoenebeck (2):
> >   Add VIRTIO_RING_F_LARGE_INDIRECT_DESC
> >   Add common configuration field "queue_indirect_size"
> >  
> >  content.tex    | 56 ++++++++++++++++++++++++++++++++++++++++++++++++++
> >  split-ring.tex |  7 ++++++-
> >  2 files changed, 62 insertions(+), 1 deletion(-)



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [virtio-comment] Re: [PATCH v2 0/2] Add VIRTIO_RING_F_LARGE_INDIRECT_DESC
  2021-11-30 13:33   ` Christian Schoenebeck
@ 2021-11-30 13:48     ` Cornelia Huck
  2021-11-30 18:47       ` Christian Schoenebeck
  0 siblings, 1 reply; 21+ messages in thread
From: Cornelia Huck @ 2021-11-30 13:48 UTC (permalink / raw)
  To: Christian Schoenebeck, virtio-comment; +Cc: Stefan Hajnoczi, Greg Kurz

On Tue, Nov 30 2021, Christian Schoenebeck <qemu_oss@crudebyte.com> wrote:

> On Dienstag, 30. November 2021 13:06:31 CET Cornelia Huck wrote:
>> On Mon, Nov 29 2021, Christian Schoenebeck <qemu_oss@crudebyte.com> wrote:
>> > This is a follow-up of:
>> > https://lists.oasis-open.org/archives/virtio-comment/202111/msg00059.html
>> > 
>> > These two patches introduce a more fine graded control over the maximum
>> > length of an indirect descriptor table.
>> > 
>> > Associated Github Task:
>> > https://github.com/oasis-tcs/virtio-spec/issues/122
>> > 
>> > v1 -> v2:
>> >   * "active/pending message(s)" -> "vring slot(s)"
>> >   
>> >     NOTE: I decided to use the term "vring slot(s)" over the previously
>> >     suggested term "buffer(s)" as I found tha latter too ambiguous in this
>> >     context. [patch 1]
>> >   
>> >   * "... but does not oppose a limit to the actual bulk data size
>> >   
>> >     being transmitted."
>> >     ->
>> >     " ... but does not oppose a limit to the actual bulk data size
>> >     being transmitted when indirect descriptors are used." [patch 1]
>> >   
>> >   * Add common configuration field "queue_indirect_size" and make it
>> >   
>> >     mandatory as of revision 3. [NEW patch 2]
>> 
>> Now you have me confused: 'revision' is a ccw transport concept, but you
>> only add support on the pci transport? It's fine to leave out ccw (and
>> mmio) for now, but you should not talk about revision 3...
>
> Looks like I was too quick on this one. Right, the name virtio_pci_common_cfg 
> should have suggested to me that this was PCI specific. :) I simply assumed 
> this was a common config structure shared by all bus types.
>
> And I had no idea that you were referring with "revision" to a bus specific 
> mechanism either. I assumed virtio v1.0 = rev1, virtio v1.1 = rev2 and that 
> the upcoming spec to become v1.2 = rev3 (i.e. for all bus tyes).

Apologies if I was too confusing with my comments :(

>
> So what is the desired approach to proceed on this overall task, should this 
> just be addressed on PCI for now as first step?

Yes, I'd concentrate on PCI first. It should not be too hard to add this
for the other transports.

>
> And what about the intended availability of this new virtio_pci_common_cfg 
> field "queue_indirect_size", should it be optional per se, independent of the 
> virtio version or rather a mandatory field in upcoming virtio version?

We should keep that field depending upon the feature bit IMHO.

>
>> > Christian Schoenebeck (2):
>> >   Add VIRTIO_RING_F_LARGE_INDIRECT_DESC
>> >   Add common configuration field "queue_indirect_size"
>> >  
>> >  content.tex    | 56 ++++++++++++++++++++++++++++++++++++++++++++++++++
>> >  split-ring.tex |  7 ++++++-
>> >  2 files changed, 62 insertions(+), 1 deletion(-)


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [virtio-comment] Re: [PATCH v2 0/2] Add VIRTIO_RING_F_LARGE_INDIRECT_DESC
  2021-11-30 13:48     ` Cornelia Huck
@ 2021-11-30 18:47       ` Christian Schoenebeck
  2021-12-02 10:27         ` Cornelia Huck
  0 siblings, 1 reply; 21+ messages in thread
From: Christian Schoenebeck @ 2021-11-30 18:47 UTC (permalink / raw)
  To: Cornelia Huck; +Cc: virtio-comment, Stefan Hajnoczi, Greg Kurz

On Dienstag, 30. November 2021 14:48:50 CET Cornelia Huck wrote:
> On Tue, Nov 30 2021, Christian Schoenebeck <qemu_oss@crudebyte.com> wrote:
> > On Dienstag, 30. November 2021 13:06:31 CET Cornelia Huck wrote:
> >> On Mon, Nov 29 2021, Christian Schoenebeck <qemu_oss@crudebyte.com> wrote:
> >> > This is a follow-up of:
> >> > https://lists.oasis-open.org/archives/virtio-comment/202111/msg00059.ht
> >> > ml
> >> > 
> >> > These two patches introduce a more fine graded control over the maximum
> >> > length of an indirect descriptor table.
> >> > 
> >> > Associated Github Task:
> >> > https://github.com/oasis-tcs/virtio-spec/issues/122
> >> > 
> >> > v1 -> v2:
> >> >   * "active/pending message(s)" -> "vring slot(s)"
> >> >   
> >> >     NOTE: I decided to use the term "vring slot(s)" over the previously
> >> >     suggested term "buffer(s)" as I found tha latter too ambiguous in
> >> >     this
> >> >     context. [patch 1]
> >> >   
> >> >   * "... but does not oppose a limit to the actual bulk data size
> >> >   
> >> >     being transmitted."
> >> >     ->
> >> >     " ... but does not oppose a limit to the actual bulk data size
> >> >     being transmitted when indirect descriptors are used." [patch 1]
> >> >   
> >> >   * Add common configuration field "queue_indirect_size" and make it
> >> >   
> >> >     mandatory as of revision 3. [NEW patch 2]
> >> 
> >> Now you have me confused: 'revision' is a ccw transport concept, but you
> >> only add support on the pci transport? It's fine to leave out ccw (and
> >> mmio) for now, but you should not talk about revision 3...
> > 
> > Looks like I was too quick on this one. Right, the name
> > virtio_pci_common_cfg should have suggested to me that this was PCI
> > specific. :) I simply assumed this was a common config structure shared
> > by all bus types.
> > 
> > And I had no idea that you were referring with "revision" to a bus
> > specific
> > mechanism either. I assumed virtio v1.0 = rev1, virtio v1.1 = rev2 and
> > that
> > the upcoming spec to become v1.2 = rev3 (i.e. for all bus tyes).
> 
> Apologies if I was too confusing with my comments :(

My bad. :)

> > So what is the desired approach to proceed on this overall task, should
> > this just be addressed on PCI for now as first step?
> 
> Yes, I'd concentrate on PCI first. It should not be too hard to add this
> for the other transports.

Ok

> > And what about the intended availability of this new virtio_pci_common_cfg
> > field "queue_indirect_size", should it be optional per se, independent of
> > the virtio version or rather a mandatory field in upcoming virtio
> > version?
> We should keep that field depending upon the feature bit IMHO.

So you are suggesting the existence of "queue_indirect_size" field to be 
dependent on feature flag VIRTIO_RING_F_LARGE_INDIRECT_DESC instead of being 
dependent on the virtio version.

Stefan, would that also suit your intended 2nd use case of lowering the max. 
descriptor count *below* Queue Size? It would be somewhat different from what 
I suggested here in patch 2 [which was min(QueueSize, queue_indirect_size) if 
VIRTIO_RING_F_LARGE_INDIRECT_DESC not set], but it could be fine as well. 
Maybe the name VIRTIO_RING_F_LARGE_INDIRECT_DESC might then a bit misleading 
though, because the flag would also be set for forcing small limits.

Another issue I just realized: there is also an ambiguity in this v2 what the 
maximum descriptor count actually relates to. Should it be

1. max. indirect descriptor count per indirect descriptor table

or

2. max. indirect descriptor count per vring slot (i.e. the sum from multiple 
indirect descriptor tables within the same message)

Case 2 applies to QEMU's implementation right now AFAICS. The max. possible 
bulk transfer size is lower in case 2 accordingly.

> >> > Christian Schoenebeck (2):
> >> >   Add VIRTIO_RING_F_LARGE_INDIRECT_DESC
> >> >   Add common configuration field "queue_indirect_size"
> >> >  
> >> >  content.tex    | 56 ++++++++++++++++++++++++++++++++++++++++++++++++++
> >> >  split-ring.tex |  7 ++++++-
> >> >  2 files changed, 62 insertions(+), 1 deletion(-)



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [virtio-comment] Re: [PATCH v2 0/2] Add VIRTIO_RING_F_LARGE_INDIRECT_DESC
  2021-11-30 18:47       ` Christian Schoenebeck
@ 2021-12-02 10:27         ` Cornelia Huck
  2021-12-03 14:47           ` Christian Schoenebeck
  0 siblings, 1 reply; 21+ messages in thread
From: Cornelia Huck @ 2021-12-02 10:27 UTC (permalink / raw)
  To: Christian Schoenebeck; +Cc: virtio-comment, Stefan Hajnoczi, Greg Kurz

On Tue, Nov 30 2021, Christian Schoenebeck <qemu_oss@crudebyte.com> wrote:

> On Dienstag, 30. November 2021 14:48:50 CET Cornelia Huck wrote:
>> On Tue, Nov 30 2021, Christian Schoenebeck <qemu_oss@crudebyte.com> wrote:

>> > And what about the intended availability of this new virtio_pci_common_cfg
>> > field "queue_indirect_size", should it be optional per se, independent of
>> > the virtio version or rather a mandatory field in upcoming virtio
>> > version?
>> We should keep that field depending upon the feature bit IMHO.
>
> So you are suggesting the existence of "queue_indirect_size" field to be 
> dependent on feature flag VIRTIO_RING_F_LARGE_INDIRECT_DESC instead of being 
> dependent on the virtio version.

Yes.

> Stefan, would that also suit your intended 2nd use case of lowering the max. 
> descriptor count *below* Queue Size? It would be somewhat different from what 
> I suggested here in patch 2 [which was min(QueueSize, queue_indirect_size) if 
> VIRTIO_RING_F_LARGE_INDIRECT_DESC not set], but it could be fine as well. 

I think that a configurable value for the descriptor count needs to
depend on the feature bit as well.

> Maybe the name VIRTIO_RING_F_LARGE_INDIRECT_DESC might then a bit misleading 
> though, because the flag would also be set for forcing small limits.

VIRTIO_RING_F_CONFIGURABLE_INDIRECT_DESC ?

>
> Another issue I just realized: there is also an ambiguity in this v2 what the 
> maximum descriptor count actually relates to. Should it be
>
> 1. max. indirect descriptor count per indirect descriptor table
>
> or
>
> 2. max. indirect descriptor count per vring slot (i.e. the sum from multiple 
> indirect descriptor tables within the same message)
>
> Case 2 applies to QEMU's implementation right now AFAICS. The max. possible 
> bulk transfer size is lower in case 2 accordingly.


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 1/2] Add VIRTIO_RING_F_LARGE_INDIRECT_DESC
  2021-11-29 13:22 ` [PATCH v2 1/2] " Christian Schoenebeck
@ 2021-12-02 16:04   ` Stefan Hajnoczi
  0 siblings, 0 replies; 21+ messages in thread
From: Stefan Hajnoczi @ 2021-12-02 16:04 UTC (permalink / raw)
  To: Christian Schoenebeck; +Cc: virtio-comment, Cornelia Huck, Greg Kurz

[-- Attachment #1: Type: text/plain, Size: 2227 bytes --]

On Mon, Nov 29, 2021 at 02:22:52PM +0100, Christian Schoenebeck wrote:
> This new feature flag allows indirect descriptor tables to
> exceed the queue size.
> 
> Fixes: https://github.com/oasis-tcs/virtio-spec/issues/122
> Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
> ---
>  content.tex    | 21 +++++++++++++++++++++
>  split-ring.tex |  2 +-
>  2 files changed, 22 insertions(+), 1 deletion(-)
> 
> diff --git a/content.tex b/content.tex
> index 5d112af..34fa23e 100644
> --- a/content.tex
> +++ b/content.tex
> @@ -6693,6 +6693,27 @@ \chapter{Reserved Feature Bits}\label{sec:Reserved Feature Bits}
>    transport specific.
>    For more details about driver notifications over PCI see \ref{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI-specific Initialization And Device Operation / Available Buffer Notifications}.
>  
> +  \item[VIRTIO_RING_F_LARGE_INDIRECT_DESC(40)] This feature indicates that the
> +  amount of descriptors in an indirect descriptor table is allowed to exceed
> +  the Queue Size.
> +
> +  If this feature bit is not negotiated, then a driver MUST NOT create a
> +  descriptor chain longer than the Queue Size, which then also applies to
> +  indirect descriptor tables as in \ref{sec:Basic Facilities of a Virtio
> +  Device / Virtqueues / The Virtqueue Descriptor Table / Indirect
> +  Descriptors}. Which means without this feature, the Queue Size limits
> +  both the maximum amount of slots in the vring, as well as the actual
> +  bulk data size being transmitted per vring slot.
> +
> +  With this feature enabled, the Queue Size only limits the maximum amount
> +  of slots in the vring, but does not oppose a limit to the actual bulk data

s/oppose a limit to/impose a limit on/ but a more concise version is:

  ... does not limit the actual bulk data size ...

> +  size being transmitted when indirect descriptors are used. Decoupling these
> +  two configuration parameters this way not only allows much larger bulk data
> +  being transferred per vring slot, but also avoids complicated synchronization
> +  mechanisms if device only supports a very small amount of vring slots. Due to

s/if device/if the device/

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 2/2] Add common configuration field "queue_indirect_size"
  2021-11-29 13:23 ` [PATCH v2 2/2] Add common configuration field "queue_indirect_size" Christian Schoenebeck
@ 2021-12-02 16:21   ` Stefan Hajnoczi
  2021-12-03 15:03     ` Christian Schoenebeck
  0 siblings, 1 reply; 21+ messages in thread
From: Stefan Hajnoczi @ 2021-12-02 16:21 UTC (permalink / raw)
  To: Christian Schoenebeck; +Cc: virtio-comment, Cornelia Huck, Greg Kurz

[-- Attachment #1: Type: text/plain, Size: 5120 bytes --]

On Mon, Nov 29, 2021 at 02:23:01PM +0100, Christian Schoenebeck wrote:
> This new common configuration field allows to negotiate a more fine
> graded numeric maximum length of indirect descriptor chains.
> 
> Fixes: https://github.com/oasis-tcs/virtio-spec/issues/122
> Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
> ---
>  content.tex    | 37 ++++++++++++++++++++++++++++++++++++-
>  split-ring.tex |  5 +++++
>  2 files changed, 41 insertions(+), 1 deletion(-)
> 
> diff --git a/content.tex b/content.tex
> index 34fa23e..25861f3 100644
> --- a/content.tex
> +++ b/content.tex
> @@ -859,6 +859,7 @@ \subsubsection{Common configuration structure layout}\label{sec:Virtio Transport
>          le64 queue_driver;              /* read-write */
>          le64 queue_device;              /* read-write */
>          le16 queue_notify_data;         /* read-only for driver */
> +        le16 queue_indirect_size;       /* read-write */
>  };
>  \end{lstlisting}
>  
> @@ -938,6 +939,34 @@ \subsubsection{Common configuration structure layout}\label{sec:Virtio Transport
>          may benefit from providing another value, for example an internal virtqueue
>          identifier, or an internal offset related to the virtqueue number.
>          \end{note}
> +
> +\item[\field{queue_indirect_size}]
> +        This field was added in revision 3 and MUST exist if revision 3
> +        has been negotiated. This field is used to negotiate the maximum
> +        amount of indirect descriptors per indirect descriptor table as in
> +        \ref{sec:Basic Facilities of a Virtio Device / Virtqueues / The
> +        Virtqueue Descriptor Table / Indirect Descriptors}.
> +
> +        The device specifies its supported maximum amount here first,
> +        subsequently driver may read and then SHOULD write this field to
> +        lower the value if driver's maximum amount of indirect descriptors
> +        ever being emplaced by driver per vring slot is less than what
> +        device supports. However driver MUST NOT increase this value.

may/SHOULD/MUST NOT statements need to go into drivernormative or
devicenormative sections.

I suggest phrasing it so functionality is described here and normative
sections don't repeat what was already covered:

  The device specifies its maximum supported number of descriptors per
  indirect descriptor table. If the driver requires fewer descriptors,
  it writes its lower value to inform the device of the reduced resource
  requirements.

The drivernormative section would say:

- The driver SHOULD write the field if its maximum number of descriptors
  per indirect descriptor table is lower than that reported by the
  device.
- The driver MUST NOT write a higher value than the one it reads from
  the device.

> +
> +        Two different semantics evolve for the value of this field, depending on
> +        whether VIRTIO_RING_F_LARGE_INDIRECT_DESC (see section
> +        \ref{sec:Reserved Feature Bits}) has been negotiated:
> +
> +        If VIRTIO_RING_F_LARGE_INDIRECT_DESC has not been negotiated, then the
> +        effective maximum amount of indirect descriptors per indirect descriptor
> +        table is min(Queue Size, \field{queue_indirect_size}). So in this case
> +        this field allows driver to indicate if its supported maximum amount of
> +        indirect descriptors is less than Queue Size.

I expected the !VIRTIO_RING_F_LARGE_INDIRECT_DESC case to preserve
existing behavior to avoid breaking existing drivers/devices. An
existing driver that honors Queue Size and doesn't know about
queue_indirect_size now violates the spec when a new device reports a
queue_indirect_size value less than Queue Size.

I expected drivers to only check queue_indirect_size when
VIRTIO_RING_F_LARGE_INDIRECT_DESC is negotiated.

> +
> +        If VIRTIO_RING_F_LARGE_INDIRECT_DESC has been negotiated, then the
> +        effective maximum amount of indirect descriptors per indirect descriptor
> +        table is directly and solely reflected by \field{queue_indirect_size}
> +        field here.
>  \end{description}
>  
>  \devicenormative{\paragraph}{Common configuration structure layout}{Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Common configuration structure layout}
> @@ -6712,7 +6741,13 @@ \chapter{Reserved Feature Bits}\label{sec:Reserved Feature Bits}
>    being transferred per vring slot, but also avoids complicated synchronization
>    mechanisms if device only supports a very small amount of vring slots. Due to
>    the 16-bit size of a descriptor's "next" field there is still an absolute
> -  limit of $2^{16}$ descriptors per indirect descriptor table.
> +  limit of $2^{16}$ descriptors per indirect descriptor table. However the
> +  actual maximum amount supported by either device or driver might be less,
> +  and therefore the common configuration field "queue_indirect_size" MUST exist
> +  if VIRTIO_RING_F_LARGE_INDIRECT_DESC had been negotiated to subsequently

s/had been/was/

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [virtio-comment] Re: [PATCH v2 0/2] Add VIRTIO_RING_F_LARGE_INDIRECT_DESC
  2021-12-02 10:27         ` Cornelia Huck
@ 2021-12-03 14:47           ` Christian Schoenebeck
  2021-12-06 11:52             ` Cornelia Huck
  0 siblings, 1 reply; 21+ messages in thread
From: Christian Schoenebeck @ 2021-12-03 14:47 UTC (permalink / raw)
  To: Cornelia Huck; +Cc: virtio-comment, Stefan Hajnoczi, Greg Kurz

On Donnerstag, 2. Dezember 2021 11:27:17 CET Cornelia Huck wrote:
> On Tue, Nov 30 2021, Christian Schoenebeck <qemu_oss@crudebyte.com> wrote:
> > On Dienstag, 30. November 2021 14:48:50 CET Cornelia Huck wrote:
> >> On Tue, Nov 30 2021, Christian Schoenebeck <qemu_oss@crudebyte.com> 
wrote:
> >> > And what about the intended availability of this new
> >> > virtio_pci_common_cfg
> >> > field "queue_indirect_size", should it be optional per se, independent
> >> > of
> >> > the virtio version or rather a mandatory field in upcoming virtio
> >> > version?
> >> 
> >> We should keep that field depending upon the feature bit IMHO.
> > 
> > So you are suggesting the existence of "queue_indirect_size" field to be
> > dependent on feature flag VIRTIO_RING_F_LARGE_INDIRECT_DESC instead of
> > being dependent on the virtio version.
> 
> Yes.
> 
> > Stefan, would that also suit your intended 2nd use case of lowering the
> > max. descriptor count *below* Queue Size? It would be somewhat different
> > from what I suggested here in patch 2 [which was min(QueueSize,
> > queue_indirect_size) if VIRTIO_RING_F_LARGE_INDIRECT_DESC not set], but
> > it could be fine as well.
> I think that a configurable value for the descriptor count needs to
> depend on the feature bit as well.
> 
> > Maybe the name VIRTIO_RING_F_LARGE_INDIRECT_DESC might then a bit
> > misleading though, because the flag would also be set for forcing small
> > limits.
> VIRTIO_RING_F_CONFIGURABLE_INDIRECT_DESC ?

As the suggested common config field is "queue_indirect_size", what about 
VIRTIO_RING_F_INDIRECT_SIZE ?

> > Another issue I just realized: there is also an ambiguity in this v2 what
> > the maximum descriptor count actually relates to. Should it be
> > 
> > 1. max. indirect descriptor count per indirect descriptor table
> > 
> > or
> > 
> > 2. max. indirect descriptor count per vring slot (i.e. the sum from
> > multiple indirect descriptor tables within the same message)
> > 
> > Case 2 applies to QEMU's implementation right now AFAICS. The max.
> > possible
> > bulk transfer size is lower in case 2 accordingly.

After reviewing virtio code on QEMU side again, I suggest to go for (2.). 
Otherwise a large portion of QEMU's virtio code would need quite a bunch of 
changes to support (1.). I assume that resistence for such changes in QEMU 
would be high, and I currently don't care enough to work on and defend those 
changes that (1.) would need.

In practice that would mean for many devices: the theoretical absolute max. 
virtio transfer size might be cut into half with (2.) in comparison to (1.), 
which is (2^16 * PAGE_SIZE) / 2 = 128 MB with a typical page size of 4k, 
because one indirect descriptor table is usually used for sending to device 
and another table for receiving from device. But that's use case dependent and 
(1.) is still a huge step forward IMO.

Best regards,
Christian Schoenebeck



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 2/2] Add common configuration field "queue_indirect_size"
  2021-12-02 16:21   ` Stefan Hajnoczi
@ 2021-12-03 15:03     ` Christian Schoenebeck
  2021-12-06 13:54       ` Stefan Hajnoczi
  0 siblings, 1 reply; 21+ messages in thread
From: Christian Schoenebeck @ 2021-12-03 15:03 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: virtio-comment, Cornelia Huck, Greg Kurz

On Donnerstag, 2. Dezember 2021 17:21:05 CET Stefan Hajnoczi wrote:
> On Mon, Nov 29, 2021 at 02:23:01PM +0100, Christian Schoenebeck wrote:
> > This new common configuration field allows to negotiate a more fine
> > graded numeric maximum length of indirect descriptor chains.
> > 
> > Fixes: https://github.com/oasis-tcs/virtio-spec/issues/122
> > Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
> > ---
> > 
> >  content.tex    | 37 ++++++++++++++++++++++++++++++++++++-
> >  split-ring.tex |  5 +++++
> >  2 files changed, 41 insertions(+), 1 deletion(-)
> > 
> > diff --git a/content.tex b/content.tex
> > index 34fa23e..25861f3 100644
> > --- a/content.tex
> > +++ b/content.tex
> > @@ -859,6 +859,7 @@ \subsubsection{Common configuration structure
> > layout}\label{sec:Virtio Transport> 
> >          le64 queue_driver;              /* read-write */
> >          le64 queue_device;              /* read-write */
> >          le16 queue_notify_data;         /* read-only for driver */
> > 
> > +        le16 queue_indirect_size;       /* read-write */
> > 
> >  };
> >  \end{lstlisting}
> > 
> > @@ -938,6 +939,34 @@ \subsubsection{Common configuration structure
> > layout}\label{sec:Virtio Transport> 
> >          may benefit from providing another value, for example an internal
> >          virtqueue
> >          identifier, or an internal offset related to the virtqueue
> >          number.
> >          \end{note}
> > 
> > +
> > +\item[\field{queue_indirect_size}]
> > +        This field was added in revision 3 and MUST exist if revision 3
> > +        has been negotiated. This field is used to negotiate the maximum
> > +        amount of indirect descriptors per indirect descriptor table as
> > in
> > +        \ref{sec:Basic Facilities of a Virtio Device / Virtqueues / The
> > +        Virtqueue Descriptor Table / Indirect Descriptors}.
> > +
> > +        The device specifies its supported maximum amount here first,
> > +        subsequently driver may read and then SHOULD write this field to
> > +        lower the value if driver's maximum amount of indirect
> > descriptors
> > +        ever being emplaced by driver per vring slot is less than what
> > +        device supports. However driver MUST NOT increase this value.
> 
> may/SHOULD/MUST NOT statements need to go into drivernormative or
> devicenormative sections.
> 
> I suggest phrasing it so functionality is described here and normative
> sections don't repeat what was already covered:
> 
>   The device specifies its maximum supported number of descriptors per
>   indirect descriptor table. If the driver requires fewer descriptors,
>   it writes its lower value to inform the device of the reduced resource
>   requirements.
> 
> The drivernormative section would say:
> 
> - The driver SHOULD write the field if its maximum number of descriptors
>   per indirect descriptor table is lower than that reported by the
>   device.
> - The driver MUST NOT write a higher value than the one it reads from
>   the device.

I don't understand which sections you see as "drivernormative" and which ones 
not. Which precise sections in the spec do you want those comments to go to?

> > +
> > +        Two different semantics evolve for the value of this field,
> > depending on +        whether VIRTIO_RING_F_LARGE_INDIRECT_DESC (see
> > section
> > +        \ref{sec:Reserved Feature Bits}) has been negotiated:
> > +
> > +        If VIRTIO_RING_F_LARGE_INDIRECT_DESC has not been negotiated,
> > then the +        effective maximum amount of indirect descriptors per
> > indirect descriptor +        table is min(Queue Size,
> > \field{queue_indirect_size}). So in this case +        this field allows
> > driver to indicate if its supported maximum amount of +        indirect
> > descriptors is less than Queue Size.
> 
> I expected the !VIRTIO_RING_F_LARGE_INDIRECT_DESC case to preserve
> existing behavior to avoid breaking existing drivers/devices. An
> existing driver that honors Queue Size and doesn't know about
> queue_indirect_size now violates the spec when a new device reports a
> queue_indirect_size value less than Queue Size.
> 
> I expected drivers to only check queue_indirect_size when
> VIRTIO_RING_F_LARGE_INDIRECT_DESC is negotiated.

Ok, fine with me then.

> > +
> > +        If VIRTIO_RING_F_LARGE_INDIRECT_DESC has been negotiated, then
> > the
> > +        effective maximum amount of indirect descriptors per indirect
> > descriptor +        table is directly and solely reflected by
> > \field{queue_indirect_size} +        field here.
> > 
> >  \end{description}
> >  
> >  \devicenormative{\paragraph}{Common configuration structure
> >  layout}{Virtio Transport Options / Virtio Over PCI Bus / PCI Device
> >  Layout / Common configuration structure layout}> 
> > @@ -6712,7 +6741,13 @@ \chapter{Reserved Feature Bits}\label{sec:Reserved
> > Feature Bits}> 
> >    being transferred per vring slot, but also avoids complicated
> >    synchronization mechanisms if device only supports a very small amount
> >    of vring slots. Due to the 16-bit size of a descriptor's "next" field
> >    there is still an absolute> 
> > -  limit of $2^{16}$ descriptors per indirect descriptor table.
> > +  limit of $2^{16}$ descriptors per indirect descriptor table. However
> > the
> > +  actual maximum amount supported by either device or driver might be
> > less, +  and therefore the common configuration field
> > "queue_indirect_size" MUST exist +  if VIRTIO_RING_F_LARGE_INDIRECT_DESC
> > had been negotiated to subsequently
> s/had been/was/

Ack. Same applies to your comments on patch 1, so I won't send a separate 
email on that.

Best regards,
Christian Schoenebeck



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [virtio-comment] Re: [PATCH v2 0/2] Add VIRTIO_RING_F_LARGE_INDIRECT_DESC
  2021-12-03 14:47           ` Christian Schoenebeck
@ 2021-12-06 11:52             ` Cornelia Huck
  2021-12-06 19:12               ` Christian Schoenebeck
  0 siblings, 1 reply; 21+ messages in thread
From: Cornelia Huck @ 2021-12-06 11:52 UTC (permalink / raw)
  To: Christian Schoenebeck; +Cc: virtio-comment, Stefan Hajnoczi, Greg Kurz

On Fri, Dec 03 2021, Christian Schoenebeck <qemu_oss@crudebyte.com> wrote:

> On Donnerstag, 2. Dezember 2021 11:27:17 CET Cornelia Huck wrote:
>> On Tue, Nov 30 2021, Christian Schoenebeck <qemu_oss@crudebyte.com> wrote:
>> > On Dienstag, 30. November 2021 14:48:50 CET Cornelia Huck wrote:
>> >> On Tue, Nov 30 2021, Christian Schoenebeck <qemu_oss@crudebyte.com> 
> wrote:
>> >> > And what about the intended availability of this new
>> >> > virtio_pci_common_cfg
>> >> > field "queue_indirect_size", should it be optional per se, independent
>> >> > of
>> >> > the virtio version or rather a mandatory field in upcoming virtio
>> >> > version?
>> >> 
>> >> We should keep that field depending upon the feature bit IMHO.
>> > 
>> > So you are suggesting the existence of "queue_indirect_size" field to be
>> > dependent on feature flag VIRTIO_RING_F_LARGE_INDIRECT_DESC instead of
>> > being dependent on the virtio version.
>> 
>> Yes.
>> 
>> > Stefan, would that also suit your intended 2nd use case of lowering the
>> > max. descriptor count *below* Queue Size? It would be somewhat different
>> > from what I suggested here in patch 2 [which was min(QueueSize,
>> > queue_indirect_size) if VIRTIO_RING_F_LARGE_INDIRECT_DESC not set], but
>> > it could be fine as well.
>> I think that a configurable value for the descriptor count needs to
>> depend on the feature bit as well.
>> 
>> > Maybe the name VIRTIO_RING_F_LARGE_INDIRECT_DESC might then a bit
>> > misleading though, because the flag would also be set for forcing small
>> > limits.
>> VIRTIO_RING_F_CONFIGURABLE_INDIRECT_DESC ?
>
> As the suggested common config field is "queue_indirect_size", what about 
> VIRTIO_RING_F_INDIRECT_SIZE ?

That would work for me as well.

>
>> > Another issue I just realized: there is also an ambiguity in this v2 what
>> > the maximum descriptor count actually relates to. Should it be
>> > 
>> > 1. max. indirect descriptor count per indirect descriptor table
>> > 
>> > or
>> > 
>> > 2. max. indirect descriptor count per vring slot (i.e. the sum from
>> > multiple indirect descriptor tables within the same message)
>> > 
>> > Case 2 applies to QEMU's implementation right now AFAICS. The max.
>> > possible
>> > bulk transfer size is lower in case 2 accordingly.
>
> After reviewing virtio code on QEMU side again, I suggest to go for (2.). 
> Otherwise a large portion of QEMU's virtio code would need quite a bunch of 
> changes to support (1.). I assume that resistence for such changes in QEMU 
> would be high, and I currently don't care enough to work on and defend those 
> changes that (1.) would need.
>
> In practice that would mean for many devices: the theoretical absolute max. 
> virtio transfer size might be cut into half with (2.) in comparison to (1.), 
> which is (2^16 * PAGE_SIZE) / 2 = 128 MB with a typical page size of 4k, 
> because one indirect descriptor table is usually used for sending to device 
> and another table for receiving from device. But that's use case dependent and 
> (1.) is still a huge step forward IMO.

If the variant that is easier for QEMU to implement still gives you
enough of what you need, I'm fine with going with that. (Is it
future-proof enough?)

Do we know what other implementors of this standard are doing?


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 2/2] Add common configuration field "queue_indirect_size"
  2021-12-03 15:03     ` Christian Schoenebeck
@ 2021-12-06 13:54       ` Stefan Hajnoczi
  0 siblings, 0 replies; 21+ messages in thread
From: Stefan Hajnoczi @ 2021-12-06 13:54 UTC (permalink / raw)
  To: Christian Schoenebeck; +Cc: virtio-comment, Cornelia Huck, Greg Kurz

[-- Attachment #1: Type: text/plain, Size: 5119 bytes --]

On Fri, Dec 03, 2021 at 04:03:26PM +0100, Christian Schoenebeck wrote:
> On Donnerstag, 2. Dezember 2021 17:21:05 CET Stefan Hajnoczi wrote:
> > On Mon, Nov 29, 2021 at 02:23:01PM +0100, Christian Schoenebeck wrote:
> > > This new common configuration field allows to negotiate a more fine
> > > graded numeric maximum length of indirect descriptor chains.
> > > 
> > > Fixes: https://github.com/oasis-tcs/virtio-spec/issues/122
> > > Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
> > > ---
> > > 
> > >  content.tex    | 37 ++++++++++++++++++++++++++++++++++++-
> > >  split-ring.tex |  5 +++++
> > >  2 files changed, 41 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/content.tex b/content.tex
> > > index 34fa23e..25861f3 100644
> > > --- a/content.tex
> > > +++ b/content.tex
> > > @@ -859,6 +859,7 @@ \subsubsection{Common configuration structure
> > > layout}\label{sec:Virtio Transport> 
> > >          le64 queue_driver;              /* read-write */
> > >          le64 queue_device;              /* read-write */
> > >          le16 queue_notify_data;         /* read-only for driver */
> > > 
> > > +        le16 queue_indirect_size;       /* read-write */
> > > 
> > >  };
> > >  \end{lstlisting}
> > > 
> > > @@ -938,6 +939,34 @@ \subsubsection{Common configuration structure
> > > layout}\label{sec:Virtio Transport> 
> > >          may benefit from providing another value, for example an internal
> > >          virtqueue
> > >          identifier, or an internal offset related to the virtqueue
> > >          number.
> > >          \end{note}
> > > 
> > > +
> > > +\item[\field{queue_indirect_size}]
> > > +        This field was added in revision 3 and MUST exist if revision 3
> > > +        has been negotiated. This field is used to negotiate the maximum
> > > +        amount of indirect descriptors per indirect descriptor table as
> > > in
> > > +        \ref{sec:Basic Facilities of a Virtio Device / Virtqueues / The
> > > +        Virtqueue Descriptor Table / Indirect Descriptors}.
> > > +
> > > +        The device specifies its supported maximum amount here first,
> > > +        subsequently driver may read and then SHOULD write this field to
> > > +        lower the value if driver's maximum amount of indirect
> > > descriptors
> > > +        ever being emplaced by driver per vring slot is less than what
> > > +        device supports. However driver MUST NOT increase this value.
> > 
> > may/SHOULD/MUST NOT statements need to go into drivernormative or
> > devicenormative sections.
> > 
> > I suggest phrasing it so functionality is described here and normative
> > sections don't repeat what was already covered:
> > 
> >   The device specifies its maximum supported number of descriptors per
> >   indirect descriptor table. If the driver requires fewer descriptors,
> >   it writes its lower value to inform the device of the reduced resource
> >   requirements.
> > 
> > The drivernormative section would say:
> > 
> > - The driver SHOULD write the field if its maximum number of descriptors
> >   per indirect descriptor table is lower than that reported by the
> >   device.
> > - The driver MUST NOT write a higher value than the one it reads from
> >   the device.
> 
> I don't understand which sections you see as "drivernormative" and which ones 
> not. Which precise sections in the spec do you want those comments to go to?

The spec is structured so that each section contains:

- Description
  \section{Foo}\label{sec:Bar / Foo}

- Driver requirements sub-section (optional)
  \drivernormative{\subsection}{Foo}{Bar / Foo}

- Device requirements sub-section (optional)
  \devicenormative{\subsection}{Foo}{Bar / Foo}

The driver and device requirements are kept separate from the
description, so normative statements (SHOULD, MUST, etc) are not
supposed to be used in the description text. Each driver and device
requirements sub-section is referenced from conformance.tex, a single
chapter at the end of the spec that lists all the normative
sub-sections.

Here is where I would put the spec changes:

- In "Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Common configuration structure layout" (4.1.4.3 Common configuration structure layout in VIRTIO 1.1):

  \item[\field{queue_indirect_size}] The device specifies its maximum
  supported number of descriptors per indirect descriptor table. If the
  driver requires fewer descriptors, it writes its lower value to inform
  the device of the reduced resource requirements.

- In the \drivernormative paragraph for "Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Common configuration structure layout" (4.1.4.3.2 Driver Requirements: Common configuration structure layout in VIRTIO 1.1):

  The driver SHOULD write the field if its maximum number of descriptors
  per indirect descriptor table is lower than that reported by the
  device.

  The driver MUST NOT write a higher value than the one it reads from
  the device.

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [virtio-comment] Re: [PATCH v2 0/2] Add VIRTIO_RING_F_LARGE_INDIRECT_DESC
  2021-12-06 11:52             ` Cornelia Huck
@ 2021-12-06 19:12               ` Christian Schoenebeck
  2021-12-07  9:50                 ` Stefan Hajnoczi
  0 siblings, 1 reply; 21+ messages in thread
From: Christian Schoenebeck @ 2021-12-06 19:12 UTC (permalink / raw)
  To: Cornelia Huck; +Cc: virtio-comment, Stefan Hajnoczi, Greg Kurz

On Montag, 6. Dezember 2021 12:52:07 CET Cornelia Huck wrote:
> On Fri, Dec 03 2021, Christian Schoenebeck <qemu_oss@crudebyte.com> wrote:
> > On Donnerstag, 2. Dezember 2021 11:27:17 CET Cornelia Huck wrote:
> >> On Tue, Nov 30 2021, Christian Schoenebeck <qemu_oss@crudebyte.com> wrote:
> >> > On Dienstag, 30. November 2021 14:48:50 CET Cornelia Huck wrote:
> >> >> On Tue, Nov 30 2021, Christian Schoenebeck <qemu_oss@crudebyte.com>
> > 
> > wrote:
> >> >> > And what about the intended availability of this new
> >> >> > virtio_pci_common_cfg
> >> >> > field "queue_indirect_size", should it be optional per se,
> >> >> > independent
> >> >> > of
> >> >> > the virtio version or rather a mandatory field in upcoming virtio
> >> >> > version?
> >> >> 
> >> >> We should keep that field depending upon the feature bit IMHO.
> >> > 
> >> > So you are suggesting the existence of "queue_indirect_size" field to
> >> > be
> >> > dependent on feature flag VIRTIO_RING_F_LARGE_INDIRECT_DESC instead of
> >> > being dependent on the virtio version.
> >> 
> >> Yes.
> >> 
> >> > Stefan, would that also suit your intended 2nd use case of lowering the
> >> > max. descriptor count *below* Queue Size? It would be somewhat
> >> > different
> >> > from what I suggested here in patch 2 [which was min(QueueSize,
> >> > queue_indirect_size) if VIRTIO_RING_F_LARGE_INDIRECT_DESC not set], but
> >> > it could be fine as well.
> >> 
> >> I think that a configurable value for the descriptor count needs to
> >> depend on the feature bit as well.
> >> 
> >> > Maybe the name VIRTIO_RING_F_LARGE_INDIRECT_DESC might then a bit
> >> > misleading though, because the flag would also be set for forcing small
> >> > limits.
> >> 
> >> VIRTIO_RING_F_CONFIGURABLE_INDIRECT_DESC ?
> > 
> > As the suggested common config field is "queue_indirect_size", what about
> > VIRTIO_RING_F_INDIRECT_SIZE ?
> 
> That would work for me as well.

Ok

> >> > Another issue I just realized: there is also an ambiguity in this v2
> >> > what
> >> > the maximum descriptor count actually relates to. Should it be
> >> > 
> >> > 1. max. indirect descriptor count per indirect descriptor table
> >> > 
> >> > or
> >> > 
> >> > 2. max. indirect descriptor count per vring slot (i.e. the sum from
> >> > multiple indirect descriptor tables within the same message)
> >> > 
> >> > Case 2 applies to QEMU's implementation right now AFAICS. The max.
> >> > possible
> >> > bulk transfer size is lower in case 2 accordingly.
> > 
> > After reviewing virtio code on QEMU side again, I suggest to go for (2.).
> > Otherwise a large portion of QEMU's virtio code would need quite a bunch
> > of
> > changes to support (1.). I assume that resistence for such changes in QEMU
> > would be high, and I currently don't care enough to work on and defend
> > those changes that (1.) would need.
> > 
> > In practice that would mean for many devices: the theoretical absolute
> > max.
> > virtio transfer size might be cut into half with (2.) in comparison to
> > (1.), which is (2^16 * PAGE_SIZE) / 2 = 128 MB with a typical page size
> > of 4k, because one indirect descriptor table is usually used for sending
> > to device and another table for receiving from device. But that's use
> > case dependent and (1.) is still a huge step forward IMO.
> 
> If the variant that is easier for QEMU to implement still gives you
> enough of what you need, I'm fine with going with that. (Is it
> future-proof enough?)

No crystal ball here, sorry. :)

Just to give you a feeling what I am talking about here for QEMU, you might
have a quick glimpse on the hw/virtio/virtio.c changes of following patch. It
is not exactly how the final changes would look like, but it should give a
rough idea of what is involved:
https://lore.kernel.org/all/c9dea2e27ae19b2b9a51e8f08687067bf3e47bd5.1633376313.git.qemu_oss@crudebyte.com/

As you can see, QEMU first reserves the max. expected descriptor count as
array memory on stack, then it gathers *all* descriptors from all indirect
descriptor tables of a vring slot all together into that array and finally
the vring slot's message is processed on device level:
https://github.com/qemu/qemu/blob/99fc08366b06282614daeda989d2fde6ab8a707f/hw/virtio/virtio.c#L1475

So a limit per vring slot would be much easier to implement in QEMU, as it is
more or less just refactoring of QEMU's current compile-time constant
VIRTQUEUE_MAX_SIZE into a runtime variable.
Implementing a limit per table instead would require substantial changes to
its current program flow.

Back to your question ...

Assuming that most devices have one or two tables per vring slot, and
considering that almost nobody cared for virtio's current descriptor count
limit so far, I would not expect that the new, much higher limit to be
questioned in the next few years or so. And if it was, you would probably also
start to question all those 16-bit fields in virtio as well and then this
specific aspect here would probably be the smallest issue to worry about.

OTOH if there are devices with like 10 descriptor tables or more per vring
slot, then they obviously would hit this limit much sooner. No idea if there
is any such device though.

> Do we know what other implementors of this standard are doing?

I don't know how this is handled in other implementations, nor who to ask
unfortunately. I assume these would be bhyve and VirtualBox.

Best regards,
Christian Schoenebeck



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [virtio-comment] Re: [PATCH v2 0/2] Add VIRTIO_RING_F_LARGE_INDIRECT_DESC
  2021-12-06 19:12               ` Christian Schoenebeck
@ 2021-12-07  9:50                 ` Stefan Hajnoczi
  2021-12-07 18:00                   ` Cornelia Huck
  0 siblings, 1 reply; 21+ messages in thread
From: Stefan Hajnoczi @ 2021-12-07  9:50 UTC (permalink / raw)
  To: Christian Schoenebeck; +Cc: Cornelia Huck, virtio-comment, Greg Kurz

[-- Attachment #1: Type: text/plain, Size: 5954 bytes --]

On Mon, Dec 06, 2021 at 08:12:14PM +0100, Christian Schoenebeck wrote:
> On Montag, 6. Dezember 2021 12:52:07 CET Cornelia Huck wrote:
> > On Fri, Dec 03 2021, Christian Schoenebeck <qemu_oss@crudebyte.com> wrote:
> > > On Donnerstag, 2. Dezember 2021 11:27:17 CET Cornelia Huck wrote:
> > >> On Tue, Nov 30 2021, Christian Schoenebeck <qemu_oss@crudebyte.com> wrote:
> > >> > On Dienstag, 30. November 2021 14:48:50 CET Cornelia Huck wrote:
> > >> >> On Tue, Nov 30 2021, Christian Schoenebeck <qemu_oss@crudebyte.com>
> > > 
> > > wrote:
> > >> >> > And what about the intended availability of this new
> > >> >> > virtio_pci_common_cfg
> > >> >> > field "queue_indirect_size", should it be optional per se,
> > >> >> > independent
> > >> >> > of
> > >> >> > the virtio version or rather a mandatory field in upcoming virtio
> > >> >> > version?
> > >> >> 
> > >> >> We should keep that field depending upon the feature bit IMHO.
> > >> > 
> > >> > So you are suggesting the existence of "queue_indirect_size" field to
> > >> > be
> > >> > dependent on feature flag VIRTIO_RING_F_LARGE_INDIRECT_DESC instead of
> > >> > being dependent on the virtio version.
> > >> 
> > >> Yes.
> > >> 
> > >> > Stefan, would that also suit your intended 2nd use case of lowering the
> > >> > max. descriptor count *below* Queue Size? It would be somewhat
> > >> > different
> > >> > from what I suggested here in patch 2 [which was min(QueueSize,
> > >> > queue_indirect_size) if VIRTIO_RING_F_LARGE_INDIRECT_DESC not set], but
> > >> > it could be fine as well.
> > >> 
> > >> I think that a configurable value for the descriptor count needs to
> > >> depend on the feature bit as well.
> > >> 
> > >> > Maybe the name VIRTIO_RING_F_LARGE_INDIRECT_DESC might then a bit
> > >> > misleading though, because the flag would also be set for forcing small
> > >> > limits.
> > >> 
> > >> VIRTIO_RING_F_CONFIGURABLE_INDIRECT_DESC ?
> > > 
> > > As the suggested common config field is "queue_indirect_size", what about
> > > VIRTIO_RING_F_INDIRECT_SIZE ?
> > 
> > That would work for me as well.
> 
> Ok
> 
> > >> > Another issue I just realized: there is also an ambiguity in this v2
> > >> > what
> > >> > the maximum descriptor count actually relates to. Should it be
> > >> > 
> > >> > 1. max. indirect descriptor count per indirect descriptor table
> > >> > 
> > >> > or
> > >> > 
> > >> > 2. max. indirect descriptor count per vring slot (i.e. the sum from
> > >> > multiple indirect descriptor tables within the same message)
> > >> > 
> > >> > Case 2 applies to QEMU's implementation right now AFAICS. The max.
> > >> > possible
> > >> > bulk transfer size is lower in case 2 accordingly.
> > > 
> > > After reviewing virtio code on QEMU side again, I suggest to go for (2.).
> > > Otherwise a large portion of QEMU's virtio code would need quite a bunch
> > > of
> > > changes to support (1.). I assume that resistence for such changes in QEMU
> > > would be high, and I currently don't care enough to work on and defend
> > > those changes that (1.) would need.
> > > 
> > > In practice that would mean for many devices: the theoretical absolute
> > > max.
> > > virtio transfer size might be cut into half with (2.) in comparison to
> > > (1.), which is (2^16 * PAGE_SIZE) / 2 = 128 MB with a typical page size
> > > of 4k, because one indirect descriptor table is usually used for sending
> > > to device and another table for receiving from device. But that's use
> > > case dependent and (1.) is still a huge step forward IMO.
> > 
> > If the variant that is easier for QEMU to implement still gives you
> > enough of what you need, I'm fine with going with that. (Is it
> > future-proof enough?)
> 
> No crystal ball here, sorry. :)
> 
> Just to give you a feeling what I am talking about here for QEMU, you might
> have a quick glimpse on the hw/virtio/virtio.c changes of following patch. It
> is not exactly how the final changes would look like, but it should give a
> rough idea of what is involved:
> https://lore.kernel.org/all/c9dea2e27ae19b2b9a51e8f08687067bf3e47bd5.1633376313.git.qemu_oss@crudebyte.com/
> 
> As you can see, QEMU first reserves the max. expected descriptor count as
> array memory on stack, then it gathers *all* descriptors from all indirect
> descriptor tables of a vring slot all together into that array and finally
> the vring slot's message is processed on device level:
> https://github.com/qemu/qemu/blob/99fc08366b06282614daeda989d2fde6ab8a707f/hw/virtio/virtio.c#L1475
> 
> So a limit per vring slot would be much easier to implement in QEMU, as it is
> more or less just refactoring of QEMU's current compile-time constant
> VIRTQUEUE_MAX_SIZE into a runtime variable.
> Implementing a limit per table instead would require substantial changes to
> its current program flow.
> 
> Back to your question ...
> 
> Assuming that most devices have one or two tables per vring slot, and
> considering that almost nobody cared for virtio's current descriptor count
> limit so far, I would not expect that the new, much higher limit to be
> questioned in the next few years or so. And if it was, you would probably also
> start to question all those 16-bit fields in virtio as well and then this
> specific aspect here would probably be the smallest issue to worry about.
> 
> OTOH if there are devices with like 10 descriptor tables or more per vring
> slot, then they obviously would hit this limit much sooner. No idea if there
> is any such device though.

Other device implementations probably also care about the total number
of descriptors per vring slot instead of the number of descriptors per
indirect table. The limitation on the device side is the resource
requirement and/or maximum supported by the underlying I/O mechanism, so
the total number of descriptors is likely to matter.

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [virtio-comment] Re: [PATCH v2 0/2] Add VIRTIO_RING_F_LARGE_INDIRECT_DESC
  2021-12-07  9:50                 ` Stefan Hajnoczi
@ 2021-12-07 18:00                   ` Cornelia Huck
  2021-12-08 12:26                     ` Christian Schoenebeck
  0 siblings, 1 reply; 21+ messages in thread
From: Cornelia Huck @ 2021-12-07 18:00 UTC (permalink / raw)
  To: Stefan Hajnoczi, Christian Schoenebeck; +Cc: virtio-comment, Greg Kurz

On Tue, Dec 07 2021, Stefan Hajnoczi <stefanha@redhat.com> wrote:

> On Mon, Dec 06, 2021 at 08:12:14PM +0100, Christian Schoenebeck wrote:
>> On Montag, 6. Dezember 2021 12:52:07 CET Cornelia Huck wrote:
>> > On Fri, Dec 03 2021, Christian Schoenebeck <qemu_oss@crudebyte.com> wrote:
>> > > On Donnerstag, 2. Dezember 2021 11:27:17 CET Cornelia Huck wrote:
>> > >> On Tue, Nov 30 2021, Christian Schoenebeck <qemu_oss@crudebyte.com> wrote:
>> > >> > On Dienstag, 30. November 2021 14:48:50 CET Cornelia Huck wrote:
>> > >> >> On Tue, Nov 30 2021, Christian Schoenebeck <qemu_oss@crudebyte.com>
>> > >> > Another issue I just realized: there is also an ambiguity in this v2
>> > >> > what
>> > >> > the maximum descriptor count actually relates to. Should it be
>> > >> > 
>> > >> > 1. max. indirect descriptor count per indirect descriptor table
>> > >> > 
>> > >> > or
>> > >> > 
>> > >> > 2. max. indirect descriptor count per vring slot (i.e. the sum from
>> > >> > multiple indirect descriptor tables within the same message)
>> > >> > 
>> > >> > Case 2 applies to QEMU's implementation right now AFAICS. The max.
>> > >> > possible
>> > >> > bulk transfer size is lower in case 2 accordingly.
>> > > 
>> > > After reviewing virtio code on QEMU side again, I suggest to go for (2.).
>> > > Otherwise a large portion of QEMU's virtio code would need quite a bunch
>> > > of
>> > > changes to support (1.). I assume that resistence for such changes in QEMU
>> > > would be high, and I currently don't care enough to work on and defend
>> > > those changes that (1.) would need.
>> > > 
>> > > In practice that would mean for many devices: the theoretical absolute
>> > > max.
>> > > virtio transfer size might be cut into half with (2.) in comparison to
>> > > (1.), which is (2^16 * PAGE_SIZE) / 2 = 128 MB with a typical page size
>> > > of 4k, because one indirect descriptor table is usually used for sending
>> > > to device and another table for receiving from device. But that's use
>> > > case dependent and (1.) is still a huge step forward IMO.
>> > 
>> > If the variant that is easier for QEMU to implement still gives you
>> > enough of what you need, I'm fine with going with that. (Is it
>> > future-proof enough?)
>> 
>> No crystal ball here, sorry. :)

:)

>> 
>> Just to give you a feeling what I am talking about here for QEMU, you might
>> have a quick glimpse on the hw/virtio/virtio.c changes of following patch. It
>> is not exactly how the final changes would look like, but it should give a
>> rough idea of what is involved:
>> https://lore.kernel.org/all/c9dea2e27ae19b2b9a51e8f08687067bf3e47bd5.1633376313.git.qemu_oss@crudebyte.com/
>> 
>> As you can see, QEMU first reserves the max. expected descriptor count as
>> array memory on stack, then it gathers *all* descriptors from all indirect
>> descriptor tables of a vring slot all together into that array and finally
>> the vring slot's message is processed on device level:
>> https://github.com/qemu/qemu/blob/99fc08366b06282614daeda989d2fde6ab8a707f/hw/virtio/virtio.c#L1475
>> 
>> So a limit per vring slot would be much easier to implement in QEMU, as it is
>> more or less just refactoring of QEMU's current compile-time constant
>> VIRTQUEUE_MAX_SIZE into a runtime variable.
>> Implementing a limit per table instead would require substantial changes to
>> its current program flow.
>> 
>> Back to your question ...
>> 
>> Assuming that most devices have one or two tables per vring slot, and
>> considering that almost nobody cared for virtio's current descriptor count
>> limit so far, I would not expect that the new, much higher limit to be
>> questioned in the next few years or so. And if it was, you would probably also
>> start to question all those 16-bit fields in virtio as well and then this
>> specific aspect here would probably be the smallest issue to worry about.
>> 
>> OTOH if there are devices with like 10 descriptor tables or more per vring
>> slot, then they obviously would hit this limit much sooner. No idea if there
>> is any such device though.
>
> Other device implementations probably also care about the total number
> of descriptors per vring slot instead of the number of descriptors per
> indirect table. The limitation on the device side is the resource
> requirement and/or maximum supported by the underlying I/O mechanism, so
> the total number of descriptors is likely to matter.

Thanks to you both; going with the total number seems to be best.


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [virtio-comment] Re: [PATCH v2 0/2] Add VIRTIO_RING_F_LARGE_INDIRECT_DESC
  2021-12-07 18:00                   ` Cornelia Huck
@ 2021-12-08 12:26                     ` Christian Schoenebeck
  2021-12-08 15:11                       ` Stefan Hajnoczi
  0 siblings, 1 reply; 21+ messages in thread
From: Christian Schoenebeck @ 2021-12-08 12:26 UTC (permalink / raw)
  To: Cornelia Huck; +Cc: Stefan Hajnoczi, virtio-comment, Greg Kurz

On Dienstag, 7. Dezember 2021 19:00:27 CET Cornelia Huck wrote:
> On Tue, Dec 07 2021, Stefan Hajnoczi <stefanha@redhat.com> wrote:
> > On Mon, Dec 06, 2021 at 08:12:14PM +0100, Christian Schoenebeck wrote:
> >> On Montag, 6. Dezember 2021 12:52:07 CET Cornelia Huck wrote:
> >> > On Fri, Dec 03 2021, Christian Schoenebeck <qemu_oss@crudebyte.com> 
wrote:
> >> > > On Donnerstag, 2. Dezember 2021 11:27:17 CET Cornelia Huck wrote:
> >> > >> On Tue, Nov 30 2021, Christian Schoenebeck <qemu_oss@crudebyte.com> 
wrote:
> >> > >> > On Dienstag, 30. November 2021 14:48:50 CET Cornelia Huck wrote:
> >> > >> >> On Tue, Nov 30 2021, Christian Schoenebeck
> >> > >> >> <qemu_oss@crudebyte.com>
> >> > >> > 
> >> > >> > Another issue I just realized: there is also an ambiguity in this
> >> > >> > v2
> >> > >> > what
> >> > >> > the maximum descriptor count actually relates to. Should it be
> >> > >> > 
> >> > >> > 1. max. indirect descriptor count per indirect descriptor table
> >> > >> > 
> >> > >> > or
> >> > >> > 
> >> > >> > 2. max. indirect descriptor count per vring slot (i.e. the sum
> >> > >> > from
> >> > >> > multiple indirect descriptor tables within the same message)
> >> > >> > 
> >> > >> > Case 2 applies to QEMU's implementation right now AFAICS. The max.
> >> > >> > possible
> >> > >> > bulk transfer size is lower in case 2 accordingly.
> >> > > 
> >> > > After reviewing virtio code on QEMU side again, I suggest to go for
> >> > > (2.).
> >> > > Otherwise a large portion of QEMU's virtio code would need quite a
> >> > > bunch
> >> > > of
> >> > > changes to support (1.). I assume that resistence for such changes in
> >> > > QEMU
> >> > > would be high, and I currently don't care enough to work on and
> >> > > defend
> >> > > those changes that (1.) would need.
> >> > > 
> >> > > In practice that would mean for many devices: the theoretical
> >> > > absolute
> >> > > max.
> >> > > virtio transfer size might be cut into half with (2.) in comparison
> >> > > to
> >> > > (1.), which is (2^16 * PAGE_SIZE) / 2 = 128 MB with a typical page
> >> > > size
> >> > > of 4k, because one indirect descriptor table is usually used for
> >> > > sending
> >> > > to device and another table for receiving from device. But that's use
> >> > > case dependent and (1.) is still a huge step forward IMO.
> >> > 
> >> > If the variant that is easier for QEMU to implement still gives you
> >> > enough of what you need, I'm fine with going with that. (Is it
> >> > future-proof enough?)
> >> 
> >> No crystal ball here, sorry. :)
> :
> :)
> :
> >> Just to give you a feeling what I am talking about here for QEMU, you
> >> might
> >> have a quick glimpse on the hw/virtio/virtio.c changes of following
> >> patch. It is not exactly how the final changes would look like, but it
> >> should give a rough idea of what is involved:
> >> https://lore.kernel.org/all/c9dea2e27ae19b2b9a51e8f08687067bf3e47bd5.1633
> >> 376313.git.qemu_oss@crudebyte.com/
> >> 
> >> As you can see, QEMU first reserves the max. expected descriptor count as
> >> array memory on stack, then it gathers *all* descriptors from all
> >> indirect
> >> descriptor tables of a vring slot all together into that array and
> >> finally
> >> the vring slot's message is processed on device level:
> >> https://github.com/qemu/qemu/blob/99fc08366b06282614daeda989d2fde6ab8a707
> >> f/hw/virtio/virtio.c#L1475
> >> 
> >> So a limit per vring slot would be much easier to implement in QEMU, as
> >> it is more or less just refactoring of QEMU's current compile-time
> >> constant VIRTQUEUE_MAX_SIZE into a runtime variable.
> >> Implementing a limit per table instead would require substantial changes
> >> to
> >> its current program flow.
> >> 
> >> Back to your question ...
> >> 
> >> Assuming that most devices have one or two tables per vring slot, and
> >> considering that almost nobody cared for virtio's current descriptor
> >> count
> >> limit so far, I would not expect that the new, much higher limit to be
> >> questioned in the next few years or so. And if it was, you would probably
> >> also start to question all those 16-bit fields in virtio as well and
> >> then this specific aspect here would probably be the smallest issue to
> >> worry about.
> >> 
> >> OTOH if there are devices with like 10 descriptor tables or more per
> >> vring
> >> slot, then they obviously would hit this limit much sooner. No idea if
> >> there is any such device though.
> > 
> > Other device implementations probably also care about the total number
> > of descriptors per vring slot instead of the number of descriptors per
> > indirect table. The limitation on the device side is the resource
> > requirement and/or maximum supported by the underlying I/O mechanism, so
> > the total number of descriptors is likely to matter.
> 
> Thanks to you both; going with the total number seems to be best.

Yes, agreed.

One more thought: what about making the new 'queue_indirect_size' config field 
32 bit wide instead of 16 bit? That would easily mitigate the issue of the 
aggregated limit discussed here, and would in general be more future safe, 
i.e. considering that there might be either nested/multi-level indirect 
descriptor tables or chained tables in future? The cost would be low, right?

Best regards,
Christian Schoenebeck



This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [virtio-comment] Re: [PATCH v2 0/2] Add VIRTIO_RING_F_LARGE_INDIRECT_DESC
  2021-12-08 12:26                     ` Christian Schoenebeck
@ 2021-12-08 15:11                       ` Stefan Hajnoczi
  2021-12-08 15:28                         ` Cornelia Huck
  0 siblings, 1 reply; 21+ messages in thread
From: Stefan Hajnoczi @ 2021-12-08 15:11 UTC (permalink / raw)
  To: Christian Schoenebeck; +Cc: Cornelia Huck, virtio-comment, Greg Kurz

[-- Attachment #1: Type: text/plain, Size: 6012 bytes --]

On Wed, Dec 08, 2021 at 01:26:54PM +0100, Christian Schoenebeck wrote:
> On Dienstag, 7. Dezember 2021 19:00:27 CET Cornelia Huck wrote:
> > On Tue, Dec 07 2021, Stefan Hajnoczi <stefanha@redhat.com> wrote:
> > > On Mon, Dec 06, 2021 at 08:12:14PM +0100, Christian Schoenebeck wrote:
> > >> On Montag, 6. Dezember 2021 12:52:07 CET Cornelia Huck wrote:
> > >> > On Fri, Dec 03 2021, Christian Schoenebeck <qemu_oss@crudebyte.com> 
> wrote:
> > >> > > On Donnerstag, 2. Dezember 2021 11:27:17 CET Cornelia Huck wrote:
> > >> > >> On Tue, Nov 30 2021, Christian Schoenebeck <qemu_oss@crudebyte.com> 
> wrote:
> > >> > >> > On Dienstag, 30. November 2021 14:48:50 CET Cornelia Huck wrote:
> > >> > >> >> On Tue, Nov 30 2021, Christian Schoenebeck
> > >> > >> >> <qemu_oss@crudebyte.com>
> > >> > >> > 
> > >> > >> > Another issue I just realized: there is also an ambiguity in this
> > >> > >> > v2
> > >> > >> > what
> > >> > >> > the maximum descriptor count actually relates to. Should it be
> > >> > >> > 
> > >> > >> > 1. max. indirect descriptor count per indirect descriptor table
> > >> > >> > 
> > >> > >> > or
> > >> > >> > 
> > >> > >> > 2. max. indirect descriptor count per vring slot (i.e. the sum
> > >> > >> > from
> > >> > >> > multiple indirect descriptor tables within the same message)
> > >> > >> > 
> > >> > >> > Case 2 applies to QEMU's implementation right now AFAICS. The max.
> > >> > >> > possible
> > >> > >> > bulk transfer size is lower in case 2 accordingly.
> > >> > > 
> > >> > > After reviewing virtio code on QEMU side again, I suggest to go for
> > >> > > (2.).
> > >> > > Otherwise a large portion of QEMU's virtio code would need quite a
> > >> > > bunch
> > >> > > of
> > >> > > changes to support (1.). I assume that resistence for such changes in
> > >> > > QEMU
> > >> > > would be high, and I currently don't care enough to work on and
> > >> > > defend
> > >> > > those changes that (1.) would need.
> > >> > > 
> > >> > > In practice that would mean for many devices: the theoretical
> > >> > > absolute
> > >> > > max.
> > >> > > virtio transfer size might be cut into half with (2.) in comparison
> > >> > > to
> > >> > > (1.), which is (2^16 * PAGE_SIZE) / 2 = 128 MB with a typical page
> > >> > > size
> > >> > > of 4k, because one indirect descriptor table is usually used for
> > >> > > sending
> > >> > > to device and another table for receiving from device. But that's use
> > >> > > case dependent and (1.) is still a huge step forward IMO.
> > >> > 
> > >> > If the variant that is easier for QEMU to implement still gives you
> > >> > enough of what you need, I'm fine with going with that. (Is it
> > >> > future-proof enough?)
> > >> 
> > >> No crystal ball here, sorry. :)
> > :
> > :)
> > :
> > >> Just to give you a feeling what I am talking about here for QEMU, you
> > >> might
> > >> have a quick glimpse on the hw/virtio/virtio.c changes of following
> > >> patch. It is not exactly how the final changes would look like, but it
> > >> should give a rough idea of what is involved:
> > >> https://lore.kernel.org/all/c9dea2e27ae19b2b9a51e8f08687067bf3e47bd5.1633
> > >> 376313.git.qemu_oss@crudebyte.com/
> > >> 
> > >> As you can see, QEMU first reserves the max. expected descriptor count as
> > >> array memory on stack, then it gathers *all* descriptors from all
> > >> indirect
> > >> descriptor tables of a vring slot all together into that array and
> > >> finally
> > >> the vring slot's message is processed on device level:
> > >> https://github.com/qemu/qemu/blob/99fc08366b06282614daeda989d2fde6ab8a707
> > >> f/hw/virtio/virtio.c#L1475
> > >> 
> > >> So a limit per vring slot would be much easier to implement in QEMU, as
> > >> it is more or less just refactoring of QEMU's current compile-time
> > >> constant VIRTQUEUE_MAX_SIZE into a runtime variable.
> > >> Implementing a limit per table instead would require substantial changes
> > >> to
> > >> its current program flow.
> > >> 
> > >> Back to your question ...
> > >> 
> > >> Assuming that most devices have one or two tables per vring slot, and
> > >> considering that almost nobody cared for virtio's current descriptor
> > >> count
> > >> limit so far, I would not expect that the new, much higher limit to be
> > >> questioned in the next few years or so. And if it was, you would probably
> > >> also start to question all those 16-bit fields in virtio as well and
> > >> then this specific aspect here would probably be the smallest issue to
> > >> worry about.
> > >> 
> > >> OTOH if there are devices with like 10 descriptor tables or more per
> > >> vring
> > >> slot, then they obviously would hit this limit much sooner. No idea if
> > >> there is any such device though.
> > > 
> > > Other device implementations probably also care about the total number
> > > of descriptors per vring slot instead of the number of descriptors per
> > > indirect table. The limitation on the device side is the resource
> > > requirement and/or maximum supported by the underlying I/O mechanism, so
> > > the total number of descriptors is likely to matter.
> > 
> > Thanks to you both; going with the total number seems to be best.
> 
> Yes, agreed.
> 
> One more thought: what about making the new 'queue_indirect_size' config field 
> 32 bit wide instead of 16 bit? That would easily mitigate the issue of the 
> aggregated limit discussed here, and would in general be more future safe, 
> i.e. considering that there might be either nested/multi-level indirect 
> descriptor tables or chained tables in future? The cost would be low, right?

The MMIO transport has 32-bit fields, so there doesn't seem to be a
strict requirement to use 16 bits for descriptor counts.

I think you're right that the cost is low. Usually it's the access
itself that carries a cost (a VM exit or bus transaction) and 2 vs 4
byte transfers don't really matter.

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [virtio-comment] Re: [PATCH v2 0/2] Add VIRTIO_RING_F_LARGE_INDIRECT_DESC
  2021-12-08 15:11                       ` Stefan Hajnoczi
@ 2021-12-08 15:28                         ` Cornelia Huck
  2021-12-09 12:33                           ` Christian Schoenebeck
  0 siblings, 1 reply; 21+ messages in thread
From: Cornelia Huck @ 2021-12-08 15:28 UTC (permalink / raw)
  To: Stefan Hajnoczi, Christian Schoenebeck; +Cc: virtio-comment, Greg Kurz

On Wed, Dec 08 2021, Stefan Hajnoczi <stefanha@redhat.com> wrote:

> On Wed, Dec 08, 2021 at 01:26:54PM +0100, Christian Schoenebeck wrote:

>> One more thought: what about making the new 'queue_indirect_size' config field 
>> 32 bit wide instead of 16 bit? That would easily mitigate the issue of the 
>> aggregated limit discussed here, and would in general be more future safe, 
>> i.e. considering that there might be either nested/multi-level indirect 
>> descriptor tables or chained tables in future? The cost would be low, right?
>
> The MMIO transport has 32-bit fields, so there doesn't seem to be a
> strict requirement to use 16 bits for descriptor counts.
>
> I think you're right that the cost is low. Usually it's the access
> itself that carries a cost (a VM exit or bus transaction) and 2 vs 4
> byte transfers don't really matter.

For ccw, I think we'd simply include the value in a control block, and
it does not really matter whether that control block is 2 bytes
longer. So, no objection from me.


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [virtio-comment] Re: [PATCH v2 0/2] Add VIRTIO_RING_F_LARGE_INDIRECT_DESC
  2021-12-08 15:28                         ` Cornelia Huck
@ 2021-12-09 12:33                           ` Christian Schoenebeck
  0 siblings, 0 replies; 21+ messages in thread
From: Christian Schoenebeck @ 2021-12-09 12:33 UTC (permalink / raw)
  To: Cornelia Huck; +Cc: Stefan Hajnoczi, virtio-comment, Greg Kurz

On Mittwoch, 8. Dezember 2021 16:28:07 CET Cornelia Huck wrote:
> On Wed, Dec 08 2021, Stefan Hajnoczi <stefanha@redhat.com> wrote:
> > On Wed, Dec 08, 2021 at 01:26:54PM +0100, Christian Schoenebeck wrote:
> >> One more thought: what about making the new 'queue_indirect_size' config
> >> field 32 bit wide instead of 16 bit? That would easily mitigate the
> >> issue of the aggregated limit discussed here, and would in general be
> >> more future safe, i.e. considering that there might be either
> >> nested/multi-level indirect descriptor tables or chained tables in
> >> future? The cost would be low, right?> 
> > The MMIO transport has 32-bit fields, so there doesn't seem to be a
> > strict requirement to use 16 bits for descriptor counts.
> > 
> > I think you're right that the cost is low. Usually it's the access
> > itself that carries a cost (a VM exit or bus transaction) and 2 vs 4
> > byte transfers don't really matter.
> 
> For ccw, I think we'd simply include the value in a control block, and
> it does not really matter whether that control block is 2 bytes
> longer. So, no objection from me.

Sounds like a plan!

I will send a new version next week. Thanks guys!

Best regards,
Christian Schoenebeck



^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2021-12-09 12:33 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-11-29 13:24 [PATCH v2 0/2] Add VIRTIO_RING_F_LARGE_INDIRECT_DESC Christian Schoenebeck
2021-11-29 13:22 ` [PATCH v2 1/2] " Christian Schoenebeck
2021-12-02 16:04   ` Stefan Hajnoczi
2021-11-29 13:23 ` [PATCH v2 2/2] Add common configuration field "queue_indirect_size" Christian Schoenebeck
2021-12-02 16:21   ` Stefan Hajnoczi
2021-12-03 15:03     ` Christian Schoenebeck
2021-12-06 13:54       ` Stefan Hajnoczi
2021-11-30 12:06 ` [virtio-comment] Re: [PATCH v2 0/2] Add VIRTIO_RING_F_LARGE_INDIRECT_DESC Cornelia Huck
2021-11-30 13:33   ` Christian Schoenebeck
2021-11-30 13:48     ` Cornelia Huck
2021-11-30 18:47       ` Christian Schoenebeck
2021-12-02 10:27         ` Cornelia Huck
2021-12-03 14:47           ` Christian Schoenebeck
2021-12-06 11:52             ` Cornelia Huck
2021-12-06 19:12               ` Christian Schoenebeck
2021-12-07  9:50                 ` Stefan Hajnoczi
2021-12-07 18:00                   ` Cornelia Huck
2021-12-08 12:26                     ` Christian Schoenebeck
2021-12-08 15:11                       ` Stefan Hajnoczi
2021-12-08 15:28                         ` Cornelia Huck
2021-12-09 12:33                           ` Christian Schoenebeck

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.