linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCHv3 0/2] virtio: put last seen used index into ring itself
@ 2010-06-01 14:47 Michael S. Tsirkin
  2010-06-01 14:47 ` [PATCHv3 1/2] virtio: support layout with avail ring before idx Michael S. Tsirkin
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Michael S. Tsirkin @ 2010-06-01 14:47 UTC (permalink / raw)
  To: Rusty Russell, linux-kernel, virtualization, kvm, qemu-devel

Changes from v2: added padding between avail idx and flags,
and changed virtio to only publish used index when callbacks
are enabled.

Here's a rewrite of the original patch with a new layout.
I haven't tested it yet so no idea how this performs, but
I think this addresses the cache bounce issue raised by Avi.
Posting for early flames/comments.

Generally, the Host end of the virtio ring doesn't need to see where
Guest is up to in consuming the ring.  However, to completely understand
what's going on from the outside, this information must be exposed.
For example, host can reduce the number of interrupts by detecting
that the guest is currently handling previous buffers.

We add a feature bit so the guest can tell the host that it's writing
out the current value there, if it wants to use that.

This differs from original approach in that the used index
is put after avail index (they are typically written out together).
To avoid cache bounces on descriptor access,
and make future extensions easier, we put the ring itself at start of
page, and move the control after it.


Signed-off-by: Michael S. Tsirkin <mst@redhat.com>


Michael S. Tsirkin (2):
  virtio: support layout with avail ring before idx
  virtio: publish used idx

 drivers/net/virtio_net.c     |    2 +
 drivers/virtio/virtio_ring.c |   25 +++++++++++++++------
 include/linux/virtio_ring.h  |   50 ++++++++++++++++++++++++++++++++++++-----
 3 files changed, 64 insertions(+), 13 deletions(-)

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCHv3 1/2] virtio: support layout with avail ring before idx
  2010-06-01 14:47 [PATCHv3 0/2] virtio: put last seen used index into ring itself Michael S. Tsirkin
@ 2010-06-01 14:47 ` Michael S. Tsirkin
  2010-06-04  2:34   ` Rusty Russell
  2010-06-01 14:47 ` [PATCHv3 2/2] virtio: publish used idx Michael S. Tsirkin
  2010-06-01 15:12 ` [PATCHv3 0/2] virtio: put last seen used index into ring itself Michael S. Tsirkin
  2 siblings, 1 reply; 10+ messages in thread
From: Michael S. Tsirkin @ 2010-06-01 14:47 UTC (permalink / raw)
  To: Rusty Russell, linux-kernel, virtualization, kvm, qemu-devel

This adds an (unused) option to put available ring before control (avail
index, flags), and adds padding between index and flags. This avoids
cache line sharing between control and ring, and also makes it possible
to extend avail control without incurring extra cache misses.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 drivers/virtio/virtio_ring.c |   12 ++++++------
 include/linux/virtio_ring.h  |   40 ++++++++++++++++++++++++++++++++++------
 2 files changed, 40 insertions(+), 12 deletions(-)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 1ca8890..0f684b4 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -223,8 +223,8 @@ add_head:
 
 	/* Put entry in available array (but don't update avail->idx until they
 	 * do sync).  FIXME: avoid modulus here? */
-	avail = (vq->vring.avail->idx + vq->num_added++) % vq->vring.num;
-	vq->vring.avail->ring[avail] = head;
+	avail = (*vq->vring.avail_idx + vq->num_added++) % vq->vring.num;
+	vq->vring.avail_ring[avail] = head;
 
 	pr_debug("Added buffer head %i to %p\n", head, vq);
 	END_USE(vq);
@@ -244,7 +244,7 @@ void virtqueue_kick(struct virtqueue *_vq)
 	 * new available array entries. */
 	virtio_wmb();
 
-	vq->vring.avail->idx += vq->num_added;
+	*vq->vring.avail_idx += vq->num_added;
 	vq->num_added = 0;
 
 	/* Need to update avail index before checking if we should notify */
@@ -335,7 +335,7 @@ void virtqueue_disable_cb(struct virtqueue *_vq)
 {
 	struct vring_virtqueue *vq = to_vvq(_vq);
 
-	vq->vring.avail->flags |= VRING_AVAIL_F_NO_INTERRUPT;
+	*vq->vring.avail_flags |= VRING_AVAIL_F_NO_INTERRUPT;
 }
 EXPORT_SYMBOL_GPL(virtqueue_disable_cb);
 
@@ -347,7 +347,7 @@ bool virtqueue_enable_cb(struct virtqueue *_vq)
 
 	/* We optimistically turn back on interrupts, then check if there was
 	 * more to do. */
-	vq->vring.avail->flags &= ~VRING_AVAIL_F_NO_INTERRUPT;
+	vq->vring.avail_flags &= ~VRING_AVAIL_F_NO_INTERRUPT;
 	virtio_mb();
 	if (unlikely(more_used(vq))) {
 		END_USE(vq);
@@ -425,7 +425,7 @@ struct virtqueue *vring_new_virtqueue(unsigned int num,
 	if (!vq)
 		return NULL;
 
-	vring_init(&vq->vring, num, pages, vring_align);
+	vring_init(&vq->vring, num, pages, vring_align, false);
 	vq->vq.callback = callback;
 	vq->vq.vdev = vdev;
 	vq->vq.name = name;
diff --git a/include/linux/virtio_ring.h b/include/linux/virtio_ring.h
index e4d144b..458ce41 100644
--- a/include/linux/virtio_ring.h
+++ b/include/linux/virtio_ring.h
@@ -47,6 +47,12 @@ struct vring_avail {
 	__u16 ring[];
 };
 
+struct vring_avail_ctrl {
+	__u16 idx;
+	__u8 pad[254];
+	__u16 flags;
+};
+
 /* u32 is used here for ids for padding reasons. */
 struct vring_used_elem {
 	/* Index of start of used descriptor chain. */
@@ -66,7 +72,9 @@ struct vring {
 
 	struct vring_desc *desc;
 
-	struct vring_avail *avail;
+	__u16 *avail_idx;
+	__u16 *avail_flags;
+	__u16 *avail_ring;
 
 	struct vring_used *used;
 };
@@ -79,11 +87,19 @@ struct vring {
  *	// The actual descriptors (16 bytes each)
  *	struct vring_desc desc[num];
  *
- *	// A ring of available descriptor heads with free-running index.
+ *	// A ring of available descriptor heads with a control structure
+ *      // including a free-running index.
+ *      // The ring can come either after  (legacy) or before the control.
  *	__u16 avail_flags;
  *	__u16 avail_idx;
  *	__u16 available[num];
  *
+ * or
+ *
+ *	__u16 available[num];
+ *	__u16 avail_idx;
+ *	__u8 pad[254]; // Padding to align flags at cache line boundary.
+ *	__u16 avail_flags;
  *	// Padding to the next align boundary.
  *	char pad[];
  *
@@ -94,13 +110,25 @@ struct vring {
  * };
  */
 static inline void vring_init(struct vring *vr, unsigned int num, void *p,
-			      unsigned long align)
+			      unsigned long align, bool avail_ring_first)
 {
+	struct vring_avail *avail = p + num * sizeof(struct vring_desc);
 	vr->num = num;
 	vr->desc = p;
-	vr->avail = p + num*sizeof(struct vring_desc);
-	vr->used = (void *)(((unsigned long)&vr->avail->ring[num] + align-1)
-			    & ~(align - 1));
+	if (avail_ring_first) {
+		struct vring_avail_ctrl *ctrl = (void *)&vr->avail_ring[num];
+		vr->avail_ring = (void*)avail;
+		vr->avail_idx = &ctrl->idx;
+		vr->avail_flags = &ctrl->flags;
+	} else {
+		vr->avail_idx = &avail->idx;
+		vr->avail_flags = &avail->flags;
+	}
+	vr->used = (void *)ALIGN((unsigned long)&avail->ring[num], align);
+	/* Verify that avail fits before used. */
+	BUG_ON((unsigned long)(vr->avail_flags + 1) > (unsigned long)vr->used);
+	BUG_ON((unsigned long)(vr->avail_idx + 1) > (unsigned long)vr->used);
+	BUG_ON((unsigned long)(&vr->avail_ring[num]) > (unsigned long)vr->used);
 }
 
 static inline unsigned vring_size(unsigned int num, unsigned long align)
-- 
1.7.1.12.g42b7f


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCHv3 2/2] virtio: publish used idx
  2010-06-01 14:47 [PATCHv3 0/2] virtio: put last seen used index into ring itself Michael S. Tsirkin
  2010-06-01 14:47 ` [PATCHv3 1/2] virtio: support layout with avail ring before idx Michael S. Tsirkin
@ 2010-06-01 14:47 ` Michael S. Tsirkin
  2010-06-01 15:12 ` [PATCHv3 0/2] virtio: put last seen used index into ring itself Michael S. Tsirkin
  2 siblings, 0 replies; 10+ messages in thread
From: Michael S. Tsirkin @ 2010-06-01 14:47 UTC (permalink / raw)
  To: Rusty Russell, linux-kernel, virtualization, kvm, qemu-devel

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 drivers/net/virtio_net.c     |    2 ++
 drivers/virtio/virtio_ring.c |   15 +++++++++++++--
 include/linux/virtio_ring.h  |   10 ++++++++++
 3 files changed, 25 insertions(+), 2 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 78eb319..30e7483 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -23,6 +23,7 @@
 #include <linux/module.h>
 #include <linux/virtio.h>
 #include <linux/virtio_net.h>
+#include <linux/virtio_ring.h>
 #include <linux/scatterlist.h>
 #include <linux/if_vlan.h>
 #include <linux/slab.h>
@@ -1056,6 +1057,7 @@ static unsigned int features[] = {
 	VIRTIO_NET_F_GUEST_ECN, VIRTIO_NET_F_GUEST_UFO,
 	VIRTIO_NET_F_MRG_RXBUF, VIRTIO_NET_F_STATUS, VIRTIO_NET_F_CTRL_VQ,
 	VIRTIO_NET_F_CTRL_RX, VIRTIO_NET_F_CTRL_VLAN,
+	VIRTIO_RING_F_PUBLISH_USED,
 };
 
 static struct virtio_driver virtio_net_driver = {
diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 0f684b4..5a8c711 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -92,6 +92,9 @@ struct vring_virtqueue
 	/* Last used index we've seen. */
 	u16 last_used_idx;
 
+	/* Publish last used index we've seen at this location. */
+	u16 *publish_last_used_idx;
+
 	/* How to notify other side. FIXME: commonalize hcalls! */
 	void (*notify)(struct virtqueue *vq);
 
@@ -325,7 +328,7 @@ void *virtqueue_get_buf(struct virtqueue *_vq, unsigned int *len)
 	/* detach_buf clears data, so grab it now. */
 	ret = vq->data[i];
 	detach_buf(vq, i);
-	vq->last_used_idx++;
+	*vq->publish_last_used_idx = ++vq->last_used_idx;
 	END_USE(vq);
 	return ret;
 }
@@ -425,13 +428,19 @@ struct virtqueue *vring_new_virtqueue(unsigned int num,
 	if (!vq)
 		return NULL;
 
-	vring_init(&vq->vring, num, pages, vring_align, false);
+	vring_init(&vq->vring, num, pages, vring_align,
+		   virtio_has_feature(vdev, VIRTIO_RING_F_PUBLISH_USED));
+	if (virtio_has_feature(vdev, VIRTIO_RING_F_PUBLISH_USED))
+	    vq->publish_last_used_idx = &vq->vring.avail->last_used_idx;
+	else
+	    vq->publish_last_used_idx = &vq->last_used_idx;
 	vq->vq.callback = callback;
 	vq->vq.vdev = vdev;
 	vq->vq.name = name;
 	vq->notify = notify;
 	vq->broken = false;
 	vq->last_used_idx = 0;
+	*vq->publish_last_used_idx = 0;
 	vq->num_added = 0;
 	list_add_tail(&vq->vq.list, &vdev->vqs);
 #ifdef DEBUG
@@ -473,6 +482,8 @@ void vring_transport_features(struct virtio_device *vdev)
 		switch (i) {
 		case VIRTIO_RING_F_INDIRECT_DESC:
 			break;
+		case VIRTIO_RING_F_PUBLISH_USED:
+			break;
 		default:
 			/* We don't understand this bit. */
 			clear_bit(i, vdev->features);
diff --git a/include/linux/virtio_ring.h b/include/linux/virtio_ring.h
index 458ce41..87f8fd3 100644
--- a/include/linux/virtio_ring.h
+++ b/include/linux/virtio_ring.h
@@ -29,6 +29,9 @@
 /* We support indirect buffer descriptors */
 #define VIRTIO_RING_F_INDIRECT_DESC	28
 
+/* The Guest publishes last-seen used index at the end of the avail ring. */
+#define VIRTIO_RING_F_PUBLISH_USED	29
+
 /* Virtio ring descriptors: 16 bytes.  These can chain together via "next". */
 struct vring_desc {
 	/* Address (guest-physical). */
@@ -51,6 +54,7 @@ struct vring_avail_ctrl {
 	__u16 idx;
 	__u8 pad[254];
 	__u16 flags;
+	__u16 last_used_idx;
 };
 
 /* u32 is used here for ids for padding reasons. */
@@ -75,6 +79,7 @@ struct vring {
 	__u16 *avail_idx;
 	__u16 *avail_flags;
 	__u16 *avail_ring;
+	__u16 *last_used_idx;
 
 	struct vring_used *used;
 };
@@ -100,6 +105,7 @@ struct vring {
  *	__u16 avail_idx;
  *	__u8 pad[254]; // Padding to align flags at cache line boundary.
  *	__u16 avail_flags;
+ *	__u16 last_used_idx;
  *	// Padding to the next align boundary.
  *	char pad[];
  *
@@ -120,14 +126,18 @@ static inline void vring_init(struct vring *vr, unsigned int num, void *p,
 		vr->avail_ring = (void*)avail;
 		vr->avail_idx = &ctrl->idx;
 		vr->avail_flags = &ctrl->flags;
+		vr->last_used_idx = &ctrl->last_used_idx;
 	} else {
 		vr->avail_idx = &avail->idx;
 		vr->avail_flags = &avail->flags;
+		vr->last_used_idx = NULL;
 	}
 	vr->used = (void *)ALIGN((unsigned long)&avail->ring[num], align);
 	/* Verify that avail fits before used. */
 	BUG_ON((unsigned long)(vr->avail_flags + 1) > (unsigned long)vr->used);
 	BUG_ON((unsigned long)(vr->avail_idx + 1) > (unsigned long)vr->used);
+	BUG_ON(vr->last_used_idx && (unsigned long)(vr->last_used_idx + 1) >
+	       (unsigned long)vr->used);
 	BUG_ON((unsigned long)(&vr->avail_ring[num]) > (unsigned long)vr->used);
 }
 
-- 
1.7.1.12.g42b7f

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCHv3 0/2] virtio: put last seen used index into ring itself
  2010-06-01 14:47 [PATCHv3 0/2] virtio: put last seen used index into ring itself Michael S. Tsirkin
  2010-06-01 14:47 ` [PATCHv3 1/2] virtio: support layout with avail ring before idx Michael S. Tsirkin
  2010-06-01 14:47 ` [PATCHv3 2/2] virtio: publish used idx Michael S. Tsirkin
@ 2010-06-01 15:12 ` Michael S. Tsirkin
  2 siblings, 0 replies; 10+ messages in thread
From: Michael S. Tsirkin @ 2010-06-01 15:12 UTC (permalink / raw)
  To: Rusty Russell, linux-kernel, virtualization, kvm, qemu-devel

On Tue, Jun 01, 2010 at 05:47:08PM +0300, Michael S. Tsirkin wrote:
> Changes from v2: added padding between avail idx and flags,
> and changed virtio to only publish used index when callbacks
> are enabled.

Here's the updated spec patch.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

--

diff --git a/virtio-spec.lyx b/virtio-spec.lyx
index ed35893..a2dcd76 100644
--- a/virtio-spec.lyx
+++ b/virtio-spec.lyx
@@ -1803,6 +1803,36 @@ next
 \emph default
  descriptor entry (modulo the ring size).
  This starts at 0, and increases.
+\change_inserted 0 1274966643
+
+\end_layout
+
+\begin_layout Standard
+
+\change_inserted 0 1274968378
+When PUBLISH_USED feature flag has 
+\emph on
+not
+\emph default
+ been negotiated, the ring follows the 
+\begin_inset Quotes eld
+\end_inset
+
+flags
+\begin_inset Quotes erd
+\end_inset
+
+ and the 
+\begin_inset Quotes eld
+\end_inset
+
+idx
+\begin_inset Quotes erd
+\end_inset
+
+ fields:
+\change_unchanged
+
 \end_layout
 
 \begin_layout Standard
@@ -1845,7 +1875,143 @@ struct vring_avail {
 
 \end_layout
 
+\begin_layout Standard
+
+\change_inserted 0 1274968432
+\begin_inset CommandInset label
+LatexCommand label
+name "PUBLISH_USED-feature"
+
+\end_inset
+
+When PUBLISH_USED feature flag has been negotiated, the control structure
+ including the 
+\begin_inset Quotes eld
+\end_inset
+
+flags and the 
+\begin_inset Quotes eld
+\end_inset
+
+idx
+\begin_inset Quotes erd
+\end_inset
+
+ fields follows the ring.
+ This leaves the room for the 
+\begin_inset Quotes eld
+\end_inset
+
+last_seen_used_idx
+\begin_inset Quotes erd
+\end_inset
+
+ field, which indicates the most recent 
+\begin_inset Quotes eld
+\end_inset
+
+idx
+\begin_inset Quotes erd
+\end_inset
+
+ value observed by guest in the used ring (see 
+\begin_inset CommandInset ref
+LatexCommand ref
+reference "sub:Used-Ring"
+
+\end_inset
+
+ below):
+\end_layout
+
+\begin_layout Standard
+
+\change_inserted 0 1274967396
+\begin_inset listings
+inline false
+status open
+
+\begin_layout Plain Layout
+
+\change_inserted 0 1274967404
+
+struct vring_avail {
+\end_layout
+
+\begin_layout Plain Layout
+
+\change_inserted 0 1275404889
+
+   u16 ring[qsz]; /* qsz is the Queue Size field read from device */
+\end_layout
+
+\begin_layout Plain Layout
+
+\change_inserted 0 1275404891
+
+   u16 idx;
+\end_layout
+
+\begin_layout Plain Layout
+
+\change_inserted 0 1275404921
+
+   u8 pad[254] /* Padding to avoid sharing cache line between flags and
+ idx fields.
+ */
+\end_layout
+
+\begin_layout Plain Layout
+
+\change_inserted 0 1274967533
+
+#define VRING_AVAIL_F_NO_INTERRUPT      1
+\end_layout
+
+\begin_layout Plain Layout
+
+\change_inserted 0 1274967533
+
+   u16 flags;
+\end_layout
+
+\begin_layout Plain Layout
+
+\change_inserted 0 1274968345
+
+   u16 last_seen_used_idx;
+\end_layout
+
+\begin_layout Plain Layout
+
+\change_inserted 0 1274967396
+
+}; 
+\end_layout
+
+\end_inset
+
+
+\end_layout
+
+\begin_layout Standard
+
+\change_inserted 0 1274967715
+If the ring is large enough, the second layout maintains the control and
+ ring structures on separate cache lines.
+\end_layout
+
 \begin_layout Subsection
+
+\change_inserted 0 1274968415
+\begin_inset CommandInset label
+LatexCommand label
+name "sub:Used-Ring"
+
+\end_inset
+
+
+\change_unchanged
 Used Ring
 \end_layout
 
@@ -2391,12 +2557,20 @@ status open
 
 \begin_layout Plain Layout
 
-while (vq->last_seen_used != vring->used.idx) {
+while (vq->last_seen_used
+\change_inserted 0 1274968316
+_idx
+\change_unchanged
+ != vring->used.idx) {
 \end_layout
 
 \begin_layout Plain Layout
 
-    struct vring_used_elem *e = vring.used->ring[vq->last_seen_used%vsz];
+    struct vring_used_elem *e = vring.used->ring[vq->last_seen_used
+\change_inserted 0 1274968326
+_idx
+\change_unchanged
+%vsz];
 \end_layout
 
 \begin_layout Plain Layout
@@ -2406,7 +2580,11 @@ while (vq->last_seen_used != vring->used.idx) {
 
 \begin_layout Plain Layout
 
-    vq->last_seen_used++;
+    vq->last_seen_used
+\change_inserted 0 1274968321
+_idx
+\change_unchanged
+++;
 \end_layout
 
 \begin_layout Plain Layout
@@ -2419,6 +2597,15 @@ while (vq->last_seen_used != vring->used.idx) {
 
 \end_layout
 
+\begin_layout Standard
+
+\change_inserted 0 1275405042
+If PUBLISH_USED feature is negotiated, last_seen_used value should be published
+ to the device in the avail ring.
+ This value is used by the host for interrupt mitigation, so it only need
+ to be updated when interrupts are enabled.
+\end_layout
+
 \begin_layout Subsection
 Dealing With Configuration Changes
 \end_layout
@@ -2986,6 +3173,47 @@ struct vring_avail {
 \begin_layout Plain Layout
 
 };
+\change_inserted 0 1274966477
+
+\end_layout
+
+\begin_layout Plain Layout
+
+\end_layout
+
+\begin_layout Plain Layout
+
+\change_inserted 0 1274966484
+
+struct vring_avail_ctrl {
+\end_layout
+
+\begin_layout Plain Layout
+
+\change_inserted 0 1274966489
+
+        __u16 flags;
+\end_layout
+
+\begin_layout Plain Layout
+
+\change_inserted 0 1274966494
+
+        __u16 idx;
+\end_layout
+
+\begin_layout Plain Layout
+
+\change_inserted 0 1274966499
+
+        __u16 last_used_idx;
+\end_layout
+
+\begin_layout Plain Layout
+
+\change_inserted 0 1274966474
+
+};
 \end_layout
 
 \begin_layout Plain Layout
@@ -3349,6 +3577,28 @@ reference "sub:Indirect-Descriptors"
 \end_inset
 
 .
+\change_inserted 0 1274967762
+
+\end_layout
+
+\begin_layout Description
+
+\change_inserted 0 1274967926
+VIRTIO_F_RING_PUBLISH_USED
+\begin_inset space ~
+\end_inset
+
+(29) Negotiating this feature indicates that the avail ring layout includes
+ the used index observed by driver, see
+\begin_inset CommandInset ref
+LatexCommand ref
+reference "PUBLISH_USED-feature"
+
+\end_inset
+
+.
+\change_unchanged
+
 \end_layout
 
 \begin_layout Description

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCHv3 1/2] virtio: support layout with avail ring before idx
  2010-06-01 14:47 ` [PATCHv3 1/2] virtio: support layout with avail ring before idx Michael S. Tsirkin
@ 2010-06-04  2:34   ` Rusty Russell
  2010-06-04 10:35     ` Michael S. Tsirkin
  0 siblings, 1 reply; 10+ messages in thread
From: Rusty Russell @ 2010-06-04  2:34 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: linux-kernel, virtualization, kvm, qemu-devel, Andrew Morton

On Wed, 2 Jun 2010 12:17:12 am Michael S. Tsirkin wrote:
> This adds an (unused) option to put available ring before control (avail
> index, flags), and adds padding between index and flags. This avoids
> cache line sharing between control and ring, and also makes it possible
> to extend avail control without incurring extra cache misses.
> 
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

No no no no.  254?  You're trying to Morton me![1]

How's this (untested):

diff --git a/include/linux/virtio_ring.h b/include/linux/virtio_ring.h
--- a/include/linux/virtio_ring.h
+++ b/include/linux/virtio_ring.h
@@ -74,8 +74,8 @@ struct vring {
 /* The standard layout for the ring is a continuous chunk of memory which looks
  * like this.  We assume num is a power of 2.
  *
- * struct vring
- * {
+ * struct vring {
+ *	*** The driver writes to this part.
  *	// The actual descriptors (16 bytes each)
  *	struct vring_desc desc[num];
  *
@@ -84,9 +84,11 @@ struct vring {
  *	__u16 avail_idx;
  *	__u16 available[num];
  *
- *	// Padding to the next align boundary.
+ *	// Padding so used_flags is on the next align boundary.
  *	char pad[];
+ *	__u16 last_used; // On a cacheline of its own.
  *
+ *	*** The device writes to this part.
  *	// A ring of used descriptor heads with free-running index.
  *	__u16 used_flags;
  *	__u16 used_idx;
@@ -110,6 +112,12 @@ static inline unsigned vring_size(unsign
 		+ sizeof(__u16) * 2 + sizeof(struct vring_used_elem) * num;
 }
 
+/* Last used index sits at the very end of the driver part of the struct */
+static inline __u16 *vring_last_used_idx(const struct vring *vr)
+{
+	return (__u16 *)vr->used - 1;
+}
+
 #ifdef __KERNEL__
 #include <linux/irqreturn.h>
 struct virtio_device;

Cheers,
Rusty.
[1] Andrew Morton has this technique where he posts a solution so ugly it
    forces others to fix it properly.  Ego-roping, basically.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCHv3 1/2] virtio: support layout with avail ring before idx
  2010-06-04  2:34   ` Rusty Russell
@ 2010-06-04 10:35     ` Michael S. Tsirkin
  2010-06-04 11:16       ` Rusty Russell
  0 siblings, 1 reply; 10+ messages in thread
From: Michael S. Tsirkin @ 2010-06-04 10:35 UTC (permalink / raw)
  To: Rusty Russell
  Cc: linux-kernel, virtualization, kvm, qemu-devel, Andrew Morton

On Fri, Jun 04, 2010 at 12:04:57PM +0930, Rusty Russell wrote:
> On Wed, 2 Jun 2010 12:17:12 am Michael S. Tsirkin wrote:
> > This adds an (unused) option to put available ring before control (avail
> > index, flags), and adds padding between index and flags. This avoids
> > cache line sharing between control and ring, and also makes it possible
> > to extend avail control without incurring extra cache misses.
> > 
> > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> 
> No no no no.  254?  You're trying to Morton me![1]

Hmm, I wonder what will we do if we want a 3rd field on
a separate chacheline. But ok.

> How's this (untested):

I think we also want to put flags there as well,
they are used on interrupt path, together with last used index.

> diff --git a/include/linux/virtio_ring.h b/include/linux/virtio_ring.h
> --- a/include/linux/virtio_ring.h
> +++ b/include/linux/virtio_ring.h
> @@ -74,8 +74,8 @@ struct vring {
>  /* The standard layout for the ring is a continuous chunk of memory which looks
>   * like this.  We assume num is a power of 2.
>   *
> - * struct vring
> - * {
> + * struct vring {
> + *	*** The driver writes to this part.
>   *	// The actual descriptors (16 bytes each)
>   *	struct vring_desc desc[num];
>   *
> @@ -84,9 +84,11 @@ struct vring {
>   *	__u16 avail_idx;
>   *	__u16 available[num];
>   *
> - *	// Padding to the next align boundary.
> + *	// Padding so used_flags is on the next align boundary.
>   *	char pad[];
> + *	__u16 last_used; // On a cacheline of its own.
>   *
> + *	*** The device writes to this part.
>   *	// A ring of used descriptor heads with free-running index.
>   *	__u16 used_flags;
>   *	__u16 used_idx;
> @@ -110,6 +112,12 @@ static inline unsigned vring_size(unsign
>  		+ sizeof(__u16) * 2 + sizeof(struct vring_used_elem) * num;
>  }
>  
> +/* Last used index sits at the very end of the driver part of the struct */
> +static inline __u16 *vring_last_used_idx(const struct vring *vr)
> +{
> +	return (__u16 *)vr->used - 1;
> +}
> +
>  #ifdef __KERNEL__
>  #include <linux/irqreturn.h>
>  struct virtio_device;
> 
> Cheers,
> Rusty.
> [1] Andrew Morton has this technique where he posts a solution so ugly it
>     forces others to fix it properly.  Ego-roping, basically.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCHv3 1/2] virtio: support layout with avail ring before idx
  2010-06-04 10:35     ` Michael S. Tsirkin
@ 2010-06-04 11:16       ` Rusty Russell
  2010-06-04 11:42         ` Michael S. Tsirkin
  0 siblings, 1 reply; 10+ messages in thread
From: Rusty Russell @ 2010-06-04 11:16 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: linux-kernel, virtualization, kvm, qemu-devel, Andrew Morton

On Fri, 4 Jun 2010 08:05:43 pm Michael S. Tsirkin wrote:
> On Fri, Jun 04, 2010 at 12:04:57PM +0930, Rusty Russell wrote:
> > On Wed, 2 Jun 2010 12:17:12 am Michael S. Tsirkin wrote:
> > > This adds an (unused) option to put available ring before control (avail
> > > index, flags), and adds padding between index and flags. This avoids
> > > cache line sharing between control and ring, and also makes it possible
> > > to extend avail control without incurring extra cache misses.
> > > 
> > > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> > 
> > No no no no.  254?  You're trying to Morton me![1]
> 
> Hmm, I wonder what will we do if we want a 3rd field on
> a separate chacheline. But ok.
> 
> > How's this (untested):
> 
> I think we also want to put flags there as well,
> they are used on interrupt path, together with last used index.

I'm uncomfortable with moving a field.

We haven't done that before and I wonder what will break with old code.

Should we instead just abandon the flags field and use last_used only?
Or, more radically, put flags == last_used when the feature is on?

Thoughts?
Rusty.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCHv3 1/2] virtio: support layout with avail ring before idx
  2010-06-04 11:16       ` Rusty Russell
@ 2010-06-04 11:42         ` Michael S. Tsirkin
  2010-06-05  4:10           ` Rusty Russell
  0 siblings, 1 reply; 10+ messages in thread
From: Michael S. Tsirkin @ 2010-06-04 11:42 UTC (permalink / raw)
  To: Rusty Russell
  Cc: linux-kernel, virtualization, kvm, qemu-devel, Andrew Morton

On Fri, Jun 04, 2010 at 08:46:49PM +0930, Rusty Russell wrote:
> On Fri, 4 Jun 2010 08:05:43 pm Michael S. Tsirkin wrote:
> > On Fri, Jun 04, 2010 at 12:04:57PM +0930, Rusty Russell wrote:
> > > On Wed, 2 Jun 2010 12:17:12 am Michael S. Tsirkin wrote:
> > > > This adds an (unused) option to put available ring before control (avail
> > > > index, flags), and adds padding between index and flags. This avoids
> > > > cache line sharing between control and ring, and also makes it possible
> > > > to extend avail control without incurring extra cache misses.
> > > > 
> > > > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> > > 
> > > No no no no.  254?  You're trying to Morton me![1]
> > 
> > Hmm, I wonder what will we do if we want a 3rd field on
> > a separate chacheline. But ok.
> > 
> > > How's this (untested):
> > 
> > I think we also want to put flags there as well,
> > they are used on interrupt path, together with last used index.
> 
> I'm uncomfortable with moving a field.
> 
> We haven't done that before and I wonder what will break with old code.

With e.g. my patch, We only do this conditionally when bit is negotitated.

> Should we instead just abandon the flags field and use last_used only?
> Or, more radically, put flags == last_used when the feature is on?
> 
> Thoughts?
> Rusty.

Hmm, e.g. with TX and virtio net, we almost never want interrupts,
whatever the index value.

-- 
MST

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCHv3 1/2] virtio: support layout with avail ring before idx
  2010-06-04 11:42         ` Michael S. Tsirkin
@ 2010-06-05  4:10           ` Rusty Russell
  2010-06-06  9:11             ` Michael S. Tsirkin
  0 siblings, 1 reply; 10+ messages in thread
From: Rusty Russell @ 2010-06-05  4:10 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: linux-kernel, virtualization, kvm, qemu-devel, Andrew Morton

On Fri, 4 Jun 2010 09:12:05 pm Michael S. Tsirkin wrote:
> On Fri, Jun 04, 2010 at 08:46:49PM +0930, Rusty Russell wrote:
> > I'm uncomfortable with moving a field.
> > 
> > We haven't done that before and I wonder what will break with old code.
> 
> With e.g. my patch, We only do this conditionally when bit is negotitated.

Of course, but see this change:

commit ef688e151c00e5d529703be9a04fd506df8bc54e
Author: Rusty Russell <rusty@rustcorp.com.au>
Date:   Fri Jun 12 22:16:35 2009 -0600

    virtio: meet virtio spec by finalizing features before using device
    
    Virtio devices are supposed to negotiate features before they start using
    the device, but the current code doesn't do this.  This is because the
    driver's probe() function invariably has to add buffers to a virtqueue,
    or probe the disk (virtio_blk).
    
    This currently doesn't matter since no existing backend is strict about
    the feature negotiation.  But it's possible to imagine a future feature
    which completely changes how a device operates: in this case, we'd need
    to acknowledge it before using the device.
    
    Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

Now, this isn't impossible to overcome: we know that if they use the ring
before completing feature negotiation then they don't understand the new
format.

But we have to be aware of that on the qemu side.  Are we?

> > Should we instead just abandon the flags field and use last_used only?
> > Or, more radically, put flags == last_used when the feature is on?
> > 
> > Thoughts?
> > Rusty.
> 
> Hmm, e.g. with TX and virtio net, we almost never want interrupts,
> whatever the index value.

Good point.  OK, I give in, I'll take your patch which moves the fields
to the end.  Is that your preference?

Please be careful with the qemu side though...

It's not inconceivable that I'll write that virtio cacheline simulator this
(coming) week, too...

Thanks.
Rusty.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCHv3 1/2] virtio: support layout with avail ring before idx
  2010-06-05  4:10           ` Rusty Russell
@ 2010-06-06  9:11             ` Michael S. Tsirkin
  0 siblings, 0 replies; 10+ messages in thread
From: Michael S. Tsirkin @ 2010-06-06  9:11 UTC (permalink / raw)
  To: Rusty Russell
  Cc: linux-kernel, virtualization, kvm, qemu-devel, Andrew Morton

On Sat, Jun 05, 2010 at 01:40:26PM +0930, Rusty Russell wrote:
> On Fri, 4 Jun 2010 09:12:05 pm Michael S. Tsirkin wrote:
> > On Fri, Jun 04, 2010 at 08:46:49PM +0930, Rusty Russell wrote:
> > > I'm uncomfortable with moving a field.
> > > 
> > > We haven't done that before and I wonder what will break with old code.
> > 
> > With e.g. my patch, We only do this conditionally when bit is negotitated.
> 
> Of course, but see this change:
> 
> commit ef688e151c00e5d529703be9a04fd506df8bc54e
> Author: Rusty Russell <rusty@rustcorp.com.au>
> Date:   Fri Jun 12 22:16:35 2009 -0600
> 
>     virtio: meet virtio spec by finalizing features before using device
>     
>     Virtio devices are supposed to negotiate features before they start using
>     the device, but the current code doesn't do this.  This is because the
>     driver's probe() function invariably has to add buffers to a virtqueue,
>     or probe the disk (virtio_blk).
>     
>     This currently doesn't matter since no existing backend is strict about
>     the feature negotiation.  But it's possible to imagine a future feature
>     which completely changes how a device operates: in this case, we'd need
>     to acknowledge it before using the device.
>     
>     Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
> 
> Now, this isn't impossible to overcome: we know that if they use the ring
> before completing feature negotiation then they don't understand the new
> format.
> 
> But we have to be aware of that on the qemu side.  Are we?

I think we are ok. virtqueue_init which sets the avail/ysed pointers is
called when we write the base address.  So we only need to be careful
and not change this feature bit after creating the rings.


> > > Should we instead just abandon the flags field and use last_used only?
> > > Or, more radically, put flags == last_used when the feature is on?
> > > 
> > > Thoughts?
> > > Rusty.
> > 
> > Hmm, e.g. with TX and virtio net, we almost never want interrupts,
> > whatever the index value.
> 
> Good point.  OK, I give in, I'll take your patch which moves the fields
> to the end.  Is that your preference?

Yes, I think so.
You mean PATCHv3 unchanged with 254 byte padding?

> Please be careful with the qemu side though...
> 
> It's not inconceivable that I'll write that virtio cacheline simulator this
> (coming) week, too...
> 
> Thanks.
> Rusty.



^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2010-06-06  9:15 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-06-01 14:47 [PATCHv3 0/2] virtio: put last seen used index into ring itself Michael S. Tsirkin
2010-06-01 14:47 ` [PATCHv3 1/2] virtio: support layout with avail ring before idx Michael S. Tsirkin
2010-06-04  2:34   ` Rusty Russell
2010-06-04 10:35     ` Michael S. Tsirkin
2010-06-04 11:16       ` Rusty Russell
2010-06-04 11:42         ` Michael S. Tsirkin
2010-06-05  4:10           ` Rusty Russell
2010-06-06  9:11             ` Michael S. Tsirkin
2010-06-01 14:47 ` [PATCHv3 2/2] virtio: publish used idx Michael S. Tsirkin
2010-06-01 15:12 ` [PATCHv3 0/2] virtio: put last seen used index into ring itself Michael S. Tsirkin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).