* [RFC PATCH 0/8] virtio: new API for addition of buffers, scatterlist changes
@ 2013-02-07 12:22 ` Paolo Bonzini
  0 siblings, 0 replies; 40+ messages in thread
From: Paolo Bonzini @ 2013-02-07 12:22 UTC (permalink / raw)
  To: linux-kernel; +Cc: Wanlong Gao, asias, Rusty Russell, mst, kvm, virtualization

The virtqueue_add_buf function has two limitations:

1) it requires the caller to provide all the buffers in a single call;

2) it does not support chained scatterlists: the buffers must be
provided as an array of struct scatterlist.

Because of these limitations, virtio-scsi has to copy each request into
a scatterlist internal to the driver.  It cannot just use the one that
was prepared by the upper SCSI layers.

This series adds a different set of APIs for adding a buffer to a
virtqueue.  The new API lets you pass the buffers piecewise, wrapping
multiple calls to virtqueue_add_sg between virtqueue_start_buf and
virtqueue_end_buf.  Letting drivers call virtqueue_add_sg multiple times
when they already have a scatterlist provided by someone else simplifies
the code and, for virtio-scsi, saves the copying and the related locking.
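To make the calling convention concrete, here is a hedged user-space sketch
of the pattern; everything prefixed toy_ is a stand-in invented for
illustration, modelling only how a descriptor chain is linked across several
calls, not the real vring code:

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-ins for the vring structures; the flag value matches the spec. */
#define TOY_DESC_F_NEXT 1

struct toy_desc {
	uint64_t addr;
	uint32_t len;
	uint16_t flags;
	uint16_t next;
};

struct toy_buf {
	struct toy_desc *ring;
	int head;       /* first descriptor of the chain */
	int tail;       /* last descriptor written so far, -1 if none */
	int free_head;  /* next free slot */
};

/* virtqueue_start_buf analogue: remember where the chain begins. */
static void toy_start_buf(struct toy_buf *buf, struct toy_desc *ring)
{
	buf->ring = ring;
	buf->free_head = 0;
	buf->head = buf->free_head;
	buf->tail = -1;
}

/* virtqueue_add_sg analogue: append entries, all linked with NEXT. */
static void toy_add_sg(struct toy_buf *buf, const uint64_t *addrs,
		       const uint32_t *lens, int count)
{
	for (int n = 0; n < count; n++) {
		int i = buf->free_head++;
		buf->ring[i].addr = addrs[n];
		buf->ring[i].len = lens[n];
		buf->ring[i].flags = TOY_DESC_F_NEXT;
		buf->ring[i].next = (uint16_t)(i + 1);
		buf->tail = i;
	}
}

/* virtqueue_end_buf analogue: the last descriptor terminates the chain. */
static void toy_end_buf(struct toy_buf *buf)
{
	buf->ring[buf->tail].flags &= (uint16_t)~TOY_DESC_F_NEXT;
}
```

The point of the split is visible in toy_end_buf: intermediate calls always
set the NEXT flag, and only the final call decides where the chain ends, so
the pieces can arrive in any number of calls.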

One major difference between virtqueue_add_buf and virtqueue_add_sg
is that the latter uses scatterlist iterators, which follow chained
scatterlist structs and stop at ending markers.  In order to avoid code
duplication, and use the new API from virtqueue_add_buf (patch 8), we need
to change all existing callers of virtqueue_add_buf to provide well-formed
scatterlists.  This is what patches 2-7 do.  For virtio-blk it is easiest
to switch to the new API outright, as virtio-scsi does.  For virtio-net
the ending marker must be reset after calling virtqueue_add_buf, in
preparation for the next usage of the scatterlist.  Other drivers are
safe already.
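The end-marker behaviour that patches 5 and 6 deal with can be modelled in
isolation.  This is a hedged user-space model (a plain bool replaces the bit
that sg_mark_end sets in page_link; the toy_ names are invented for
illustration):

```c
#include <assert.h>
#include <stdbool.h>

/* Toy scatterlist entry: only the piece that matters here, the end mark. */
struct toy_sg {
	unsigned int length;
	bool end;	/* models the end marker bit in page_link */
};

/* Walk like an iterator-aware consumer: stop after max entries, or at
 * the first entry carrying an end marker, whichever comes first. */
static int toy_count_entries(const struct toy_sg *sg, int max)
{
	int n = 0;

	for (int i = 0; i < max; i++) {
		n++;
		if (sg[i].end)	/* the marker terminates the walk early */
			break;
	}
	return n;
}

/* sg_unmark_end analogue: clear a stale marker so the same array can be
 * reused as a longer list, which is what virtio-net needs after
 * virtqueue_add_buf starts honouring markers. */
static void toy_unmark_end(struct toy_sg *sg)
{
	sg->end = false;
}
```

A consumer that honours the marker sees a two-entry list until the marker is
cleared, which is exactly why callers reusing a scatterlist must reset it
before the next usage.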

This is an RFC for two reasons.  First, because I haven't done enough
testing yet (especially with all the variations on receiving that
virtio-net has).  Second, because I still have two struct vring_desc *
fields in the virtqueue API, which is a layering violation.  I'm not
really sure how important that is, nor how to fix it other than by
making the fields void *.

Paolo

Paolo Bonzini (8):
  virtio: add functions for piecewise addition of buffers
  virtio-blk: reorganize virtblk_add_req
  virtio-blk: use virtqueue_start_buf on bio path
  virtio-blk: use virtqueue_start_buf on req path
  scatterlist: introduce sg_unmark_end
  virtio-net: unmark scatterlist ending after virtqueue_add_buf
  virtio-scsi: use virtqueue_start_buf
  virtio: reimplement virtqueue_add_buf using new functions

 block/blk-integrity.c        |    2 +-
 block/blk-merge.c            |    2 +-
 drivers/block/virtio_blk.c   |  165 +++++++++--------
 drivers/net/virtio_net.c     |   21 ++-
 drivers/scsi/virtio_scsi.c   |  103 +++++------
 drivers/virtio/virtio_ring.c |  417 +++++++++++++++++++++++++++---------------
 include/linux/scatterlist.h  |   16 ++
 include/linux/virtio.h       |   25 +++
 8 files changed, 460 insertions(+), 291 deletions(-)


^ permalink raw reply	[flat|nested] 40+ messages in thread


* [RFC PATCH 1/8] virtio: add functions for piecewise addition of buffers
  2013-02-07 12:22 ` Paolo Bonzini
@ 2013-02-07 12:22   ` Paolo Bonzini
  -1 siblings, 0 replies; 40+ messages in thread
From: Paolo Bonzini @ 2013-02-07 12:22 UTC (permalink / raw)
  To: linux-kernel; +Cc: Wanlong Gao, asias, Rusty Russell, mst, kvm, virtualization

The virtqueue_add_buf function has two limitations:

1) it requires the caller to provide all the buffers in a single call;

2) it does not support chained scatterlists: the buffers must be
provided as an array of struct scatterlist.

Because of these limitations, virtio-scsi has to copy each request into
a scatterlist internal to the driver.  It cannot just use the one that
was prepared by the upper SCSI layers.

This patch adds a different set of APIs for adding a buffer to a virtqueue.
The new API lets you pass the buffers piecewise, wrapping multiple calls
to virtqueue_add_sg between virtqueue_start_buf and virtqueue_end_buf.
virtio-scsi can then call virtqueue_add_sg three or four times: for the
request header, for the write buffer (if present), for the response header,
and finally for the read buffer (again if present).  This saves the copying
and the related locking.

Another function, virtqueue_add_sg_single, unrolls virtqueue_add_sg for
a one-element scatterlist.  One-element scatterlists are common because
they are used for request and response headers.

This API can also be used in virtio-blk, where it enables conversion of
virtqueue_add_buf itself to use the new API.  virtio-blk receives
a scatterlist from the blk_rq_map_sg call, and that scatterlist has an
ending marker set.  Once virtqueue_add_buf starts using sg_next, virtio-blk
cannot simply hand the resulting scatterlist to it; however, it can use
the piecewise API to pass the request and response headers separately.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 drivers/virtio/virtio_ring.c |  260 ++++++++++++++++++++++++++++++++++++++++++
 include/linux/virtio.h       |   25 ++++
 2 files changed, 285 insertions(+), 0 deletions(-)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index ffd7e7d..25f56e6 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -394,6 +394,266 @@ static void detach_buf(struct vring_virtqueue *vq, unsigned int head)
 	vq->vq.num_free++;
 }
 
+/**
+ * virtqueue_start_buf - start building buffer for the other end
+ * @vq: the struct virtqueue we're talking about.
+ * @buf: a struct keeping the state of the buffer
+ * @data: the token identifying the buffer.
+ * @count: the number of buffers that will be added
+ * @count_sg: the number of sg lists that will be added
+ * @gfp: how to do memory allocations (if necessary).
+ *
+ * Caller must ensure we don't call this with other virtqueue operations
+ * at the same time (except where noted), and that a successful call is
+ * followed by one or more calls to virtqueue_add_sg, and finally a call
+ * to virtqueue_end_buf.
+ *
+ * Returns zero or a negative error (i.e. -ENOSPC).
+ */
+int virtqueue_start_buf(struct virtqueue *_vq,
+			struct virtqueue_buf *buf,
+			void *data,
+			unsigned int count,
+			unsigned int count_sg,
+			gfp_t gfp)
+{
+	struct vring_virtqueue *vq = to_vvq(_vq);
+	struct vring_desc *desc = NULL;
+	int head;
+	int ret = -ENOMEM;
+
+	START_USE(vq);
+
+	BUG_ON(data == NULL);
+
+#ifdef DEBUG
+	{
+		ktime_t now = ktime_get();
+
+		/* No kick or get, with .1 second between?  Warn. */
+		if (vq->last_add_time_valid)
+			WARN_ON(ktime_to_ms(ktime_sub(now, vq->last_add_time))
+					    > 100);
+		vq->last_add_time = now;
+		vq->last_add_time_valid = true;
+	}
+#endif
+
+	BUG_ON(count < count_sg);
+	BUG_ON(count_sg == 0);
+
+	/* If the host supports indirect descriptor tables, and there is
+	 * no space for direct buffers or there are multi-item scatterlists,
+	 * go indirect.
+	 */
+	head = vq->free_head;
+	if (vq->indirect && (count > count_sg || vq->vq.num_free < count)) {
+		if (vq->vq.num_free == 0)
+			goto no_space;
+
+		desc = kmalloc(count * sizeof(struct vring_desc), gfp);
+		if (!desc)
+			goto error;
+
+		/* We're about to use a buffer */
+		vq->vq.num_free--;
+
+		/* Use a single buffer which doesn't continue */
+		vq->vring.desc[head].flags = VRING_DESC_F_INDIRECT;
+		vq->vring.desc[head].addr = virt_to_phys(desc);
+		vq->vring.desc[head].len = count * sizeof(struct vring_desc);
+
+		/* Update free pointer */
+		vq->free_head = vq->vring.desc[head].next;
+	}
+
+	/* Set token. */
+	vq->data[head] = data;
+
+	pr_debug("Started buffer head %i for %p\n", head, vq);
+
+	buf->vq = _vq;
+	buf->indirect = desc;
+	buf->tail = NULL;
+	buf->head = head;
+	return 0;
+
+no_space:
+	ret = -ENOSPC;
+error:
+	pr_debug("Can't add buf (%d) - count = %i, avail = %i\n",
+		 ret, count, vq->vq.num_free);
+	END_USE(vq);
+	return ret;
+}
+EXPORT_SYMBOL_GPL(virtqueue_start_buf);
+
+/**
+ * virtqueue_add_sg - add sglist to buffer
+ * @buf: the struct that was passed to virtqueue_start_buf
+ * @sgl: the description of the buffer(s).
+ * @count: the number of items to process in sgl
+ * @dir: whether the sgl is read or written (DMA_TO_DEVICE/DMA_FROM_DEVICE only)
+ *
+ * Note that, unlike virtqueue_add_buf, this function follows chained
+ * scatterlists, and stops before the @count-th item if a scatterlist item
+ * has a marker.
+ *
+ * Caller must ensure we don't call this with other virtqueue operations
+ * at the same time (except where noted).
+ */
+void virtqueue_add_sg(struct virtqueue_buf *buf,
+		      struct scatterlist sgl[],
+		      unsigned int count,
+		      enum dma_data_direction dir)
+{
+	struct vring_virtqueue *vq = to_vvq(buf->vq);
+	unsigned int i, n;
+	struct scatterlist *sg;
+	struct vring_desc *tail;
+	u32 flags;
+
+#ifdef DEBUG
+	BUG_ON(!vq->in_use);
+#endif
+
+	BUG_ON(dir != DMA_FROM_DEVICE && dir != DMA_TO_DEVICE);
+	BUG_ON(count == 0);
+
+	flags = (dir == DMA_FROM_DEVICE ? VRING_DESC_F_WRITE : 0);
+	flags |= VRING_DESC_F_NEXT;
+
+	/* If using indirect descriptor tables, fill in the buffers
+	 * at buf->indirect.  */
+	if (buf->indirect != NULL) {
+		i = 0;
+		if (likely(buf->tail != NULL))
+			i = buf->tail - buf->indirect + 1;
+
+		for_each_sg(sgl, sg, count, n) {
+			tail = &buf->indirect[i];
+			tail->flags = flags;
+			tail->addr = sg_phys(sg);
+			tail->len = sg->length;
+			tail->next = ++i;
+		}
+	} else {
+		BUG_ON(vq->vq.num_free < count);
+
+		i = vq->free_head;
+		for_each_sg(sgl, sg, count, n) {
+			tail = &vq->vring.desc[i];
+			tail->flags = flags;
+			tail->addr = sg_phys(sg);
+			tail->len = sg->length;
+			i = vq->vring.desc[i].next;
+			vq->vq.num_free--;
+		}
+
+		vq->free_head = i;
+	}
+	buf->tail = tail;
+}
+EXPORT_SYMBOL_GPL(virtqueue_add_sg);
+
+/**
+ * virtqueue_add_sg_single - add a buffer, described by a single scatterlist
+ * @buf: the struct that was passed to virtqueue_start_buf
+ * @sg: the description of the buffer.
+ * @dir: whether the sgl is read or written (DMA_TO_DEVICE/DMA_FROM_DEVICE only)
+ *
+ * This is an unrolled version of virtqueue_add_sg.
+ *
+ * Caller must ensure we don't call this with other virtqueue operations
+ * at the same time (except where noted).
+ */
+void virtqueue_add_sg_single(struct virtqueue_buf *buf,
+		             struct scatterlist *sg,
+		             enum dma_data_direction dir)
+{
+	struct vring_virtqueue *vq = to_vvq(buf->vq);
+	unsigned int i;
+	struct vring_desc *tail;
+	u32 flags;
+
+#ifdef DEBUG
+	BUG_ON(!vq->in_use);
+#endif
+
+	BUG_ON(dir != DMA_FROM_DEVICE && dir != DMA_TO_DEVICE);
+
+	/* If using indirect descriptor tables, fill in the buffers
+	 * at buf->indirect.  */
+	if (buf->indirect != NULL) {
+		i = 0;
+		if (likely(buf->tail != NULL))
+			i = buf->tail - buf->indirect + 1;
+
+		tail = &buf->indirect[i];
+		tail->next = i + 1;
+	} else {
+		BUG_ON(vq->vq.num_free == 0);
+
+		i = vq->free_head;
+		vq->free_head = vq->vring.desc[i].next;
+		vq->vq.num_free--;
+
+		tail = &vq->vring.desc[i];
+	}
+
+	flags = (dir == DMA_FROM_DEVICE ? VRING_DESC_F_WRITE : 0);
+	flags |= VRING_DESC_F_NEXT;
+
+	tail->flags = flags;
+	tail->addr = sg_phys(sg);
+	tail->len = sg->length;
+	buf->tail = tail;
+}
+EXPORT_SYMBOL_GPL(virtqueue_add_sg_single);
+
+/**
+ * virtqueue_end_buf - expose buffer to other end
+ * @buf: the struct that was passed to virtqueue_start_buf
+ *
+ * Caller must ensure we don't call this with other virtqueue operations
+ * at the same time (except where noted).
+ */
+void virtqueue_end_buf(struct virtqueue_buf *buf)
+{
+	struct vring_virtqueue *vq = to_vvq(buf->vq);
+	unsigned int avail;
+	int head = buf->head;
+	struct vring_desc *tail = buf->tail;
+
+#ifdef DEBUG
+	BUG_ON(!vq->in_use);
+#endif
+	BUG_ON(tail == NULL);
+
+	/* The last one does not have the next flag set.  */
+	tail->flags &= ~VRING_DESC_F_NEXT;
+
+	/* Put entry in available array (avail->idx is not updated until
+	 * after the write barrier below). */
+	avail = (vq->vring.avail->idx & (vq->vring.num-1));
+	vq->vring.avail->ring[avail] = head;
+
+	/* Descriptors and available array need to be set before we expose the
+	 * new available array entries. */
+	virtio_wmb(vq);
+	vq->vring.avail->idx++;
+	vq->num_added++;
+
+	/* This is very unlikely, but theoretically possible.  Kick
+	 * just in case. */
+	if (unlikely(vq->num_added == (1 << 16) - 1))
+		virtqueue_kick(buf->vq);
+
+	pr_debug("Added buffer head %i to %p\n", head, vq);
+	END_USE(vq);
+}
+EXPORT_SYMBOL_GPL(virtqueue_end_buf);
+
 static inline bool more_used(const struct vring_virtqueue *vq)
 {
 	return vq->last_used_idx != vq->vring.used->idx;
diff --git a/include/linux/virtio.h b/include/linux/virtio.h
index cf8adb1..e9a5256 100644
--- a/include/linux/virtio.h
+++ b/include/linux/virtio.h
@@ -7,6 +7,7 @@
 #include <linux/spinlock.h>
 #include <linux/device.h>
 #include <linux/mod_devicetable.h>
+#include <linux/dma-direction.h>
 #include <linux/gfp.h>
 
 /**
@@ -40,6 +41,30 @@ int virtqueue_add_buf(struct virtqueue *vq,
 		      void *data,
 		      gfp_t gfp);
 
+struct virtqueue_buf {
+	struct virtqueue *vq;
+	struct vring_desc *indirect, *tail;
+	int head;
+};
+
+int virtqueue_start_buf(struct virtqueue *_vq,
+			struct virtqueue_buf *buf,
+			void *data,
+			unsigned int count,
+			unsigned int count_sg,
+			gfp_t gfp);
+
+void virtqueue_add_sg_single(struct virtqueue_buf *buf,
+		             struct scatterlist *sg,
+		             enum dma_data_direction dir);
+
+void virtqueue_add_sg(struct virtqueue_buf *buf,
+		      struct scatterlist sgl[],
+		      unsigned int count,
+		      enum dma_data_direction dir);
+
+void virtqueue_end_buf(struct virtqueue_buf *buf);
+
 void virtqueue_kick(struct virtqueue *vq);
 
 bool virtqueue_kick_prepare(struct virtqueue *vq);
-- 
1.7.1





* [RFC PATCH 2/8] virtio-blk: reorganize virtblk_add_req
  2013-02-07 12:22 ` Paolo Bonzini
@ 2013-02-07 12:22   ` Paolo Bonzini
  -1 siblings, 0 replies; 40+ messages in thread
From: Paolo Bonzini @ 2013-02-07 12:22 UTC (permalink / raw)
  To: linux-kernel; +Cc: Wanlong Gao, asias, Rusty Russell, mst, kvm, virtualization

Right now, both virtblk_add_req and virtblk_add_req_wait call
virtqueue_add_buf.  To prepare for the next patches, abstract the call
to virtqueue_add_buf into a new function __virtblk_add_req, and include
the waiting logic directly in virtblk_add_req.
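The reorganized waiting logic amounts to "try to add; if the ring is full,
sleep until a completion frees space, then retry".  Here is a hedged,
single-threaded model of that shape (a mock queue and a direct free stand in
for the virtqueue and the completion interrupt; all mock_ names are
illustrative, not kernel APIs):

```c
#include <assert.h>

/* Mock ring with a fixed capacity, standing in for vq->num_free. */
struct mock_vq {
	int free;
};

/* Stands in for __virtblk_add_req: fails when the ring is full. */
static int mock_add(struct mock_vq *vq)
{
	if (vq->free == 0)
		return -1;	/* what virtqueue_add_buf reports as -ENOSPC */
	vq->free--;
	return 0;
}

/* Stands in for the io_schedule() sleep: in the driver the thread sleeps
 * here and a completion interrupt frees descriptors; the model frees one
 * slot directly. */
static void mock_wait_for_completion(struct mock_vq *vq)
{
	vq->free++;
}

/* Shape of the reorganized virtblk_add_req: loop until the add succeeds,
 * sleeping between attempts.  Returns how many times it had to wait. */
static int mock_add_req(struct mock_vq *vq)
{
	int retries = 0;

	while (mock_add(vq) < 0) {
		mock_wait_for_completion(vq);
		retries++;
	}
	return retries;
}
```

In the real driver the loop additionally drops and retakes the queue lock
around the sleep and uses prepare_to_wait_exclusive/finish_wait; the model
only captures the retry structure.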

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 drivers/block/virtio_blk.c |   55 ++++++++++++++++----------------------------
 1 files changed, 20 insertions(+), 35 deletions(-)

diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
index 8ad21a2..fd8a689 100644
--- a/drivers/block/virtio_blk.c
+++ b/drivers/block/virtio_blk.c
@@ -100,50 +100,39 @@ static inline struct virtblk_req *virtblk_alloc_req(struct virtio_blk *vblk,
 	return vbr;
 }
 
-static void virtblk_add_buf_wait(struct virtio_blk *vblk,
-				 struct virtblk_req *vbr,
-				 unsigned long out,
-				 unsigned long in)
+static inline int __virtblk_add_req(struct virtqueue *vq,
+			     struct virtblk_req *vbr,
+			     unsigned long out,
+			     unsigned long in)
 {
+	return virtqueue_add_buf(vq, vbr->sg, out, in, vbr, GFP_ATOMIC);
+}
+
+static void virtblk_add_req(struct virtblk_req *vbr,
+			    unsigned int out, unsigned int in)
+{
+	struct virtio_blk *vblk = vbr->vblk;
 	DEFINE_WAIT(wait);
+	int ret;
 
-	for (;;) {
+	spin_lock_irq(vblk->disk->queue->queue_lock);
+	while (unlikely((ret = __virtblk_add_req(vblk->vq, vbr,
+						 out, in)) < 0)) {
 		prepare_to_wait_exclusive(&vblk->queue_wait, &wait,
 					  TASK_UNINTERRUPTIBLE);
 
+		spin_unlock_irq(vblk->disk->queue->queue_lock);
+		io_schedule();
 		spin_lock_irq(vblk->disk->queue->queue_lock);
-		if (virtqueue_add_buf(vblk->vq, vbr->sg, out, in, vbr,
-				      GFP_ATOMIC) < 0) {
-			spin_unlock_irq(vblk->disk->queue->queue_lock);
-			io_schedule();
-		} else {
-			virtqueue_kick(vblk->vq);
-			spin_unlock_irq(vblk->disk->queue->queue_lock);
-			break;
-		}
 
+		finish_wait(&vblk->queue_wait, &wait);
 	}
 
-	finish_wait(&vblk->queue_wait, &wait);
-}
-
-static inline void virtblk_add_req(struct virtblk_req *vbr,
-				   unsigned int out, unsigned int in)
-{
-	struct virtio_blk *vblk = vbr->vblk;
-
-	spin_lock_irq(vblk->disk->queue->queue_lock);
-	if (unlikely(virtqueue_add_buf(vblk->vq, vbr->sg, out, in, vbr,
-					GFP_ATOMIC) < 0)) {
-		spin_unlock_irq(vblk->disk->queue->queue_lock);
-		virtblk_add_buf_wait(vblk, vbr, out, in);
-		return;
-	}
 	virtqueue_kick(vblk->vq);
 	spin_unlock_irq(vblk->disk->queue->queue_lock);
 }
 
-static int virtblk_bio_send_flush(struct virtblk_req *vbr)
+static void virtblk_bio_send_flush(struct virtblk_req *vbr)
 {
 	unsigned int out = 0, in = 0;
 
@@ -155,11 +144,9 @@ static int virtblk_bio_send_flush(struct virtblk_req *vbr)
 	sg_set_buf(&vbr->sg[out + in++], &vbr->status, sizeof(vbr->status));
 
 	virtblk_add_req(vbr, out, in);
-
-	return 0;
 }
 
-static int virtblk_bio_send_data(struct virtblk_req *vbr)
+static void virtblk_bio_send_data(struct virtblk_req *vbr)
 {
 	struct virtio_blk *vblk = vbr->vblk;
 	unsigned int num, out = 0, in = 0;
@@ -188,8 +175,6 @@ static int virtblk_bio_send_data(struct virtblk_req *vbr)
 	}
 
 	virtblk_add_req(vbr, out, in);
-
-	return 0;
 }
 
 static void virtblk_bio_send_data_work(struct work_struct *work)
-- 
1.7.1




 }
 
-static int virtblk_bio_send_flush(struct virtblk_req *vbr)
+static void virtblk_bio_send_flush(struct virtblk_req *vbr)
 {
 	unsigned int out = 0, in = 0;
 
@@ -155,11 +144,9 @@ static int virtblk_bio_send_flush(struct virtblk_req *vbr)
 	sg_set_buf(&vbr->sg[out + in++], &vbr->status, sizeof(vbr->status));
 
 	virtblk_add_req(vbr, out, in);
-
-	return 0;
 }
 
-static int virtblk_bio_send_data(struct virtblk_req *vbr)
+static void virtblk_bio_send_data(struct virtblk_req *vbr)
 {
 	struct virtio_blk *vblk = vbr->vblk;
 	unsigned int num, out = 0, in = 0;
@@ -188,8 +175,6 @@ static int virtblk_bio_send_data(struct virtblk_req *vbr)
 	}
 
 	virtblk_add_req(vbr, out, in);
-
-	return 0;
 }
 
 static void virtblk_bio_send_data_work(struct work_struct *work)
-- 
1.7.1


* [RFC PATCH 3/8] virtio-blk: use virtqueue_start_buf on bio path
  2013-02-07 12:22 ` Paolo Bonzini
@ 2013-02-07 12:22   ` Paolo Bonzini
  -1 siblings, 0 replies; 40+ messages in thread
From: Paolo Bonzini @ 2013-02-07 12:22 UTC (permalink / raw)
  To: linux-kernel; +Cc: Wanlong Gao, asias, Rusty Russell, mst, kvm, virtualization

Move the creation of the request header and response footer to
__virtblk_add_req.  vbr->sg now contains only the data scatterlist;
the header and footer are added separately using the new piecewise
API for building virtqueue buffers.

With this change, virtio-blk no longer relies on the virtio
functions ignoring the end markers in a scatterlist.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 drivers/block/virtio_blk.c |   75 +++++++++++++++++++++++++++-----------------
 1 files changed, 46 insertions(+), 29 deletions(-)

diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
index fd8a689..538cc40 100644
--- a/drivers/block/virtio_blk.c
+++ b/drivers/block/virtio_blk.c
@@ -62,6 +62,7 @@ struct virtblk_req
 	struct virtio_blk *vblk;
 	int flags;
 	u8 status;
+	int sg_count;
 	struct scatterlist sg[];
 };
 
@@ -100,24 +101,53 @@ static inline struct virtblk_req *virtblk_alloc_req(struct virtio_blk *vblk,
 	return vbr;
 }
 
-static inline int __virtblk_add_req(struct virtqueue *vq,
-			     struct virtblk_req *vbr,
-			     unsigned long out,
-			     unsigned long in)
+static int __virtblk_add_req(struct virtqueue *vq,
+			     struct virtblk_req *vbr)
 {
-	return virtqueue_add_buf(vq, vbr->sg, out, in, vbr, GFP_ATOMIC);
+	struct virtqueue_buf buf;
+	struct scatterlist sg;
+	enum dma_data_direction dir;
+	int ret;
+
+	unsigned int count = 2;
+	unsigned int count_sg = 2;
+
+	if (vbr->sg_count) {
+		count_sg++;
+		count += vbr->sg_count;
+	}
+
+	ret = virtqueue_start_buf(vq, &buf, vbr, count, count_sg, GFP_ATOMIC);
+	if (ret < 0)
+		return ret;
+
+	dir = DMA_TO_DEVICE;
+	sg_init_one(&sg, &vbr->out_hdr, sizeof(vbr->out_hdr));
+	virtqueue_add_sg_single(&buf, &sg, dir);
+
+	if (vbr->sg_count) {
+		if ((vbr->out_hdr.type & VIRTIO_BLK_T_OUT) == 0)
+			dir = DMA_FROM_DEVICE;
+
+		virtqueue_add_sg(&buf, vbr->sg, vbr->sg_count, dir);
+	}
+
+	dir = DMA_FROM_DEVICE;
+	sg_init_one(&sg, &vbr->status, sizeof(vbr->status));
+	virtqueue_add_sg_single(&buf, &sg, dir);
+
+	virtqueue_end_buf(&buf);
+	return 0;
 }
 
-static void virtblk_add_req(struct virtblk_req *vbr,
-			    unsigned int out, unsigned int in)
+static void virtblk_add_req(struct virtblk_req *vbr)
 {
 	struct virtio_blk *vblk = vbr->vblk;
 	DEFINE_WAIT(wait);
 	int ret;
 
 	spin_lock_irq(vblk->disk->queue->queue_lock);
-	while (unlikely((ret = __virtblk_add_req(vblk->vq, vbr,
-						 out, in)) < 0)) {
+	while (unlikely((ret = __virtblk_add_req(vblk->vq, vbr)) < 0)) {
 		prepare_to_wait_exclusive(&vblk->queue_wait, &wait,
 					  TASK_UNINTERRUPTIBLE);
 
@@ -134,22 +164,18 @@ static void virtblk_add_req(struct virtblk_req *vbr,
 
 static void virtblk_bio_send_flush(struct virtblk_req *vbr)
 {
-	unsigned int out = 0, in = 0;
-
 	vbr->flags |= VBLK_IS_FLUSH;
 	vbr->out_hdr.type = VIRTIO_BLK_T_FLUSH;
 	vbr->out_hdr.sector = 0;
 	vbr->out_hdr.ioprio = 0;
-	sg_set_buf(&vbr->sg[out++], &vbr->out_hdr, sizeof(vbr->out_hdr));
-	sg_set_buf(&vbr->sg[out + in++], &vbr->status, sizeof(vbr->status));
+	vbr->sg_count = 0;
 
-	virtblk_add_req(vbr, out, in);
+	virtblk_add_req(vbr);
 }
 
 static void virtblk_bio_send_data(struct virtblk_req *vbr)
 {
 	struct virtio_blk *vblk = vbr->vblk;
-	unsigned int num, out = 0, in = 0;
 	struct bio *bio = vbr->bio;
 
 	vbr->flags &= ~VBLK_IS_FLUSH;
@@ -157,24 +183,15 @@ static void virtblk_bio_send_data(struct virtblk_req *vbr)
 	vbr->out_hdr.sector = bio->bi_sector;
 	vbr->out_hdr.ioprio = bio_prio(bio);
 
-	sg_set_buf(&vbr->sg[out++], &vbr->out_hdr, sizeof(vbr->out_hdr));
-
-	num = blk_bio_map_sg(vblk->disk->queue, bio, vbr->sg + out);
-
-	sg_set_buf(&vbr->sg[num + out + in++], &vbr->status,
-		   sizeof(vbr->status));
-
-	if (num) {
-		if (bio->bi_rw & REQ_WRITE) {
+	vbr->sg_count = blk_bio_map_sg(vblk->disk->queue, bio, vbr->sg);
+	if (vbr->sg_count) {
+		if (bio->bi_rw & REQ_WRITE)
 			vbr->out_hdr.type |= VIRTIO_BLK_T_OUT;
-			out += num;
-		} else {
+		else
 			vbr->out_hdr.type |= VIRTIO_BLK_T_IN;
-			in += num;
-		}
 	}
 
-	virtblk_add_req(vbr, out, in);
+	virtblk_add_req(vbr);
 }
 
 static void virtblk_bio_send_data_work(struct work_struct *work)
-- 
1.7.1





* [RFC PATCH 4/8] virtio-blk: use virtqueue_start_buf on req path
  2013-02-07 12:22 ` Paolo Bonzini
@ 2013-02-07 12:22 ` Paolo Bonzini
  -1 siblings, 0 replies; 40+ messages in thread
From: Paolo Bonzini @ 2013-02-07 12:22 UTC (permalink / raw)
  To: linux-kernel; +Cc: Wanlong Gao, asias, Rusty Russell, mst, kvm, virtualization

This is similar to the previous patch, but a bit more radical
because the bio and req paths now share the buffer construction
code.  Because the req path doesn't use vbr->sg, however, we
need to add a couple of arguments to __virtblk_add_req.

We also need to teach __virtblk_add_req how to build SCSI command
requests.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 drivers/block/virtio_blk.c |   73 +++++++++++++++++++++++---------------------
 1 files changed, 38 insertions(+), 35 deletions(-)

diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
index 538cc40..5896b4d 100644
--- a/drivers/block/virtio_blk.c
+++ b/drivers/block/virtio_blk.c
@@ -102,19 +102,27 @@ static inline struct virtblk_req *virtblk_alloc_req(struct virtio_blk *vblk,
 }
 
 static int __virtblk_add_req(struct virtqueue *vq,
-			     struct virtblk_req *vbr)
+			     struct virtblk_req *vbr,
+			     struct scatterlist *io_sg,
+			     unsigned io_sg_count)
 {
 	struct virtqueue_buf buf;
 	struct scatterlist sg;
 	enum dma_data_direction dir;
 	int ret;
 
+	int type = vbr->out_hdr.type & ~VIRTIO_BLK_T_OUT;
 	unsigned int count = 2;
 	unsigned int count_sg = 2;
 
-	if (vbr->sg_count) {
+	if (type == VIRTIO_BLK_T_SCSI_CMD) {
+		BUG_ON(use_bio);
+		count_sg += 3;
+		count += 3;
+	}
+	if (io_sg_count) {
 		count_sg++;
-		count += vbr->sg_count;
+		count += io_sg_count;
 	}
 
 	ret = virtqueue_start_buf(vq, &buf, vbr, count, count_sg, GFP_ATOMIC);
@@ -125,14 +133,32 @@ static int __virtblk_add_req(struct virtqueue *vq,
 	sg_init_one(&sg, &vbr->out_hdr, sizeof(vbr->out_hdr));
 	virtqueue_add_sg_single(&buf, &sg, dir);
 
-	if (vbr->sg_count) {
+	/*
+	 * If this is a packet command we need a couple of additional headers.
+	 * Behind the normal outhdr we put a segment with the scsi command
+	 * block, and before the normal inhdr we put the sense data and the
+	 * inhdr with additional status information.
+	 */
+	if (type == VIRTIO_BLK_T_SCSI_CMD) {
+		sg_init_one(&sg, vbr->req->cmd, vbr->req->cmd_len);
+		virtqueue_add_sg_single(&buf, &sg, dir);
+	}
+
+	if (io_sg_count) {
 		if ((vbr->out_hdr.type & VIRTIO_BLK_T_OUT) == 0)
 			dir = DMA_FROM_DEVICE;
 
-		virtqueue_add_sg(&buf, vbr->sg, vbr->sg_count, dir);
+		virtqueue_add_sg(&buf, io_sg, io_sg_count, dir);
 	}
 
 	dir = DMA_FROM_DEVICE;
+	if (type == VIRTIO_BLK_T_SCSI_CMD) {
+		sg_init_one(&sg, vbr->req->sense, SCSI_SENSE_BUFFERSIZE);
+		virtqueue_add_sg_single(&buf, &sg, dir);
+		sg_init_one(&sg, &vbr->in_hdr, sizeof(vbr->in_hdr));
+		virtqueue_add_sg_single(&buf, &sg, dir);
+	}
+
 	sg_init_one(&sg, &vbr->status, sizeof(vbr->status));
 	virtqueue_add_sg_single(&buf, &sg, dir);
 
@@ -147,7 +173,8 @@ static void virtblk_add_req(struct virtblk_req *vbr)
 	int ret;
 
 	spin_lock_irq(vblk->disk->queue->queue_lock);
-	while (unlikely((ret = __virtblk_add_req(vblk->vq, vbr)) < 0)) {
+	while (unlikely((ret = __virtblk_add_req(vblk->vq, vbr, vbr->sg,
+						 vbr->sg_count)) < 0)) {
 		prepare_to_wait_exclusive(&vblk->queue_wait, &wait,
 					  TASK_UNINTERRUPTIBLE);
 
@@ -300,7 +327,7 @@ static void virtblk_done(struct virtqueue *vq)
 static bool do_req(struct request_queue *q, struct virtio_blk *vblk,
 		   struct request *req)
 {
-	unsigned long num, out = 0, in = 0;
+	unsigned int num;
 	struct virtblk_req *vbr;
 
 	vbr = virtblk_alloc_req(vblk, GFP_ATOMIC);
@@ -337,40 +364,16 @@ static bool do_req(struct request_queue *q, struct virtio_blk *vblk,
 		}
 	}
 
-	sg_set_buf(&vblk->sg[out++], &vbr->out_hdr, sizeof(vbr->out_hdr));
-
-	/*
-	 * If this is a packet command we need a couple of additional headers.
-	 * Behind the normal outhdr we put a segment with the scsi command
-	 * block, and before the normal inhdr we put the sense data and the
-	 * inhdr with additional status information before the normal inhdr.
-	 */
-	if (vbr->req->cmd_type == REQ_TYPE_BLOCK_PC)
-		sg_set_buf(&vblk->sg[out++], vbr->req->cmd, vbr->req->cmd_len);
-
-	num = blk_rq_map_sg(q, vbr->req, vblk->sg + out);
-
-	if (vbr->req->cmd_type == REQ_TYPE_BLOCK_PC) {
-		sg_set_buf(&vblk->sg[num + out + in++], vbr->req->sense, SCSI_SENSE_BUFFERSIZE);
-		sg_set_buf(&vblk->sg[num + out + in++], &vbr->in_hdr,
-			   sizeof(vbr->in_hdr));
-	}
-
-	sg_set_buf(&vblk->sg[num + out + in++], &vbr->status,
-		   sizeof(vbr->status));
+	num = blk_rq_map_sg(q, vbr->req, vblk->sg);
 
 	if (num) {
-		if (rq_data_dir(vbr->req) == WRITE) {
+		if (rq_data_dir(vbr->req) == WRITE)
 			vbr->out_hdr.type |= VIRTIO_BLK_T_OUT;
-			out += num;
-		} else {
+		else
 			vbr->out_hdr.type |= VIRTIO_BLK_T_IN;
-			in += num;
-		}
 	}
 
-	if (virtqueue_add_buf(vblk->vq, vblk->sg, out, in, vbr,
-			      GFP_ATOMIC) < 0) {
+	if (__virtblk_add_req(vblk->vq, vbr, vblk->sg, num) < 0) {
 		mempool_free(vbr, vblk->pool);
 		return false;
 	}
-- 
1.7.1





* [RFC PATCH 5/8] scatterlist: introduce sg_unmark_end
  2013-02-07 12:22 ` Paolo Bonzini
@ 2013-02-07 12:22   ` Paolo Bonzini
  -1 siblings, 0 replies; 40+ messages in thread
From: Paolo Bonzini @ 2013-02-07 12:22 UTC (permalink / raw)
  To: linux-kernel
  Cc: Wanlong Gao, asias, Rusty Russell, mst, kvm, virtualization, Jens Axboe

This is useful in places that recycle the same scatterlist multiple
times, and do not want to incur the cost of sg_init_table every
time in hot paths.

Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
	Jens, could you give your Acked-by for this patch?

 block/blk-integrity.c       |    2 +-
 block/blk-merge.c           |    2 +-
 include/linux/scatterlist.h |   16 ++++++++++++++++
 3 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/block/blk-integrity.c b/block/blk-integrity.c
index da2a818..0f5004f 100644
--- a/block/blk-integrity.c
+++ b/block/blk-integrity.c
@@ -110,7 +110,7 @@ new_segment:
 			if (!sg)
 				sg = sglist;
 			else {
-				sg->page_link &= ~0x02;
+				sg_unmark_end(sg);
 				sg = sg_next(sg);
 			}
 
diff --git a/block/blk-merge.c b/block/blk-merge.c
index 936a110..5f24482 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -143,7 +143,7 @@ new_segment:
 			 * termination bit to avoid doing a full
 			 * sg_init_table() in drivers for each command.
 			 */
-			(*sg)->page_link &= ~0x02;
+			sg_unmark_end(*sg);
 			*sg = sg_next(*sg);
 		}
 
diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h
index 4bd6c06..9b83784 100644
--- a/include/linux/scatterlist.h
+++ b/include/linux/scatterlist.h
@@ -172,6 +172,22 @@ static inline void sg_mark_end(struct scatterlist *sg)
 }
 
 /**
+ * sg_unmark_end - Undo setting the end of the scatterlist
+ * @sg:		 SG entry
+ *
+ * Description:
+ *   Removes the termination marker from the given entry of the scatterlist.
+ *
+ **/
+static inline void sg_unmark_end(struct scatterlist *sg)
+{
+#ifdef CONFIG_DEBUG_SG
+	BUG_ON(sg->sg_magic != SG_MAGIC);
+#endif
+	sg->page_link &= ~0x02;
+}
+
+/**
  * sg_phys - Return physical address of an sg entry
  * @sg:	     SG entry
  *
-- 
1.7.1





* [RFC PATCH 6/8] virtio-net: unmark scatterlist ending after virtqueue_add_buf
  2013-02-07 12:22 ` Paolo Bonzini
@ 2013-02-07 12:22   ` Paolo Bonzini
  -1 siblings, 0 replies; 40+ messages in thread
From: Paolo Bonzini @ 2013-02-07 12:22 UTC (permalink / raw)
  To: linux-kernel; +Cc: Wanlong Gao, asias, Rusty Russell, mst, kvm, virtualization

Prepare for when virtqueue_add_buf will use sg_next instead of
ignoring ending markers.

Note that for_each_sg (and thus virtqueue_add_buf) allows you
to pass a "truncated" scatterlist that does not have a marker
on the last item.  We rely on this in add_recvbuf_mergeable.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
	This is the only part of Rusty's ideas that survived. :)

 drivers/net/virtio_net.c |   21 ++++++++++++++++-----
 1 files changed, 16 insertions(+), 5 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 35c00c5..ce08b54 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -440,13 +440,17 @@ static int add_recvbuf_small(struct receive_queue *rq, gfp_t gfp)
 
 	hdr = skb_vnet_hdr(skb);
 	sg_set_buf(rq->sg, &hdr->hdr, sizeof hdr->hdr);
-
 	skb_to_sgvec(skb, rq->sg + 1, 0, skb->len);
 
 	err = virtqueue_add_buf(rq->vq, rq->sg, 0, 2, skb, gfp);
 	if (err < 0)
 		dev_kfree_skb(skb);
 
+	/*
+	 * An optimization: clear the end bit set by skb_to_sgvec, so
+	 * we can simply re-use rq->sg[] next time.
+	 */
+	sg_unmark_end(rq->sg + 1);
 	return err;
 }
 
@@ -505,8 +509,7 @@ static int add_recvbuf_mergeable(struct receive_queue *rq, gfp_t gfp)
 	if (!page)
 		return -ENOMEM;
 
-	sg_init_one(rq->sg, page_address(page), PAGE_SIZE);
-
+	sg_set_page(rq->sg, page, PAGE_SIZE, 0);
 	err = virtqueue_add_buf(rq->vq, rq->sg, 0, 1, page, gfp);
 	if (err < 0)
 		give_pages(rq, page);
@@ -671,6 +674,7 @@ static int xmit_skb(struct send_queue *sq, struct sk_buff *skb)
 	const unsigned char *dest = ((struct ethhdr *)skb->data)->h_dest;
 	struct virtnet_info *vi = sq->vq->vdev->priv;
 	unsigned num_sg;
+	int ret;
 
 	pr_debug("%s: xmit %p %pM\n", vi->dev->name, skb, dest);
 
@@ -710,8 +714,15 @@ static int xmit_skb(struct send_queue *sq, struct sk_buff *skb)
 		sg_set_buf(sq->sg, &hdr->hdr, sizeof hdr->hdr);
 
 	num_sg = skb_to_sgvec(skb, sq->sg + 1, 0, skb->len) + 1;
-	return virtqueue_add_buf(sq->vq, sq->sg, num_sg,
-				 0, skb, GFP_ATOMIC);
+	ret = virtqueue_add_buf(sq->vq, sq->sg, num_sg,
+				0, skb, GFP_ATOMIC);
+
+	/*
+	 * An optimization: clear the end bit set by skb_to_sgvec, so
+	 * we can simply re-use sq->sg[] next time.
+	 */
+	sg_unmark_end(&sq->sg[num_sg-1]);
+	return ret;
 }
 
 static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
-- 
1.7.1





* [RFC PATCH 7/8] virtio-scsi: use virtqueue_start_buf
  2013-02-07 12:22 ` Paolo Bonzini
@ 2013-02-07 12:22   ` Paolo Bonzini
  -1 siblings, 0 replies; 40+ messages in thread
From: Paolo Bonzini @ 2013-02-07 12:22 UTC (permalink / raw)
  To: linux-kernel; +Cc: Wanlong Gao, asias, Rusty Russell, mst, kvm, virtualization

Using the new virtqueue_start_buf/virtqueue_add_sg API lets us simplify
the queueing path.  In particular, all the data previously protected by
tgt_lock is gone (multiqueue will find a new use for the lock).

The speedup is relatively small (2-4%) but it is worthwhile because of
the code simplification it enables.  And it will be particularly useful
when adding multiqueue support, because it keeps the tgt_lock critical
sections very slim while avoiding a proliferation of locks.

Signed-off-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 drivers/scsi/virtio_scsi.c |  103 +++++++++++++++++++-------------------------
 1 files changed, 45 insertions(+), 58 deletions(-)

diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c
index 3449a1f..3b61173 100644
--- a/drivers/scsi/virtio_scsi.c
+++ b/drivers/scsi/virtio_scsi.c
@@ -59,11 +59,8 @@ struct virtio_scsi_vq {
 
 /* Per-target queue state */
 struct virtio_scsi_target_state {
-	/* Protects sg.  Lock hierarchy is tgt_lock -> vq_lock.  */
+	/* Never held at the same time as vq_lock.  */
 	spinlock_t tgt_lock;
-
-	/* For sglist construction when adding commands to the virtqueue.  */
-	struct scatterlist sg[];
 };
 
 /* Driver instance state */
@@ -351,89 +348,83 @@ static void virtscsi_event_done(struct virtqueue *vq)
 	spin_unlock_irqrestore(&vscsi->event_vq.vq_lock, flags);
 };
 
-static void virtscsi_map_sgl(struct scatterlist *sg, unsigned int *p_idx,
-			     struct scsi_data_buffer *sdb)
-{
-	struct sg_table *table = &sdb->table;
-	struct scatterlist *sg_elem;
-	unsigned int idx = *p_idx;
-	int i;
-
-	for_each_sg(table->sgl, sg_elem, table->nents, i)
-		sg[idx++] = *sg_elem;
-
-	*p_idx = idx;
-}
-
 /**
- * virtscsi_map_cmd - map a scsi_cmd to a virtqueue scatterlist
+ * virtscsi_add_cmd - add a virtio_scsi_cmd to a virtqueue
  * @vscsi	: virtio_scsi state
  * @cmd		: command structure
- * @out_num	: number of read-only elements
- * @in_num	: number of write-only elements
  * @req_size	: size of the request buffer
  * @resp_size	: size of the response buffer
- *
- * Called with tgt_lock held.
+ * @gfp	: flags to use for memory allocations
  */
-static void virtscsi_map_cmd(struct virtio_scsi_target_state *tgt,
-			     struct virtio_scsi_cmd *cmd,
-			     unsigned *out_num, unsigned *in_num,
-			     size_t req_size, size_t resp_size)
+static int virtscsi_add_cmd(struct virtqueue *vq,
+			    struct virtio_scsi_cmd *cmd,
+			    size_t req_size, size_t resp_size, gfp_t gfp)
 {
 	struct scsi_cmnd *sc = cmd->sc;
-	struct scatterlist *sg = tgt->sg;
-	unsigned int idx = 0;
+	struct scatterlist sg;
+	unsigned int count, count_sg;
+	struct sg_table *out, *in;
+	struct virtqueue_buf buf;
+	int ret;
+
+	out = in = NULL;
+
+	if (sc && sc->sc_data_direction != DMA_NONE) {
+		if (sc->sc_data_direction != DMA_FROM_DEVICE)
+			out = &scsi_out(sc)->table;
+		if (sc->sc_data_direction != DMA_TO_DEVICE)
+			in = &scsi_in(sc)->table;
+	}
+
+	count_sg = 2 + (out ? 1 : 0)          + (in ? 1 : 0);
+	count    = 2 + (out ? out->nents : 0) + (in ? in->nents : 0);
+	ret = virtqueue_start_buf(vq, &buf, cmd, count, count_sg, gfp);
+	if (ret < 0)
+		return ret;
 
 	/* Request header.  */
-	sg_set_buf(&sg[idx++], &cmd->req, req_size);
+	sg_init_one(&sg, &cmd->req, req_size);
+	virtqueue_add_sg_single(&buf, &sg, DMA_TO_DEVICE);
 
 	/* Data-out buffer.  */
-	if (sc && sc->sc_data_direction != DMA_FROM_DEVICE)
-		virtscsi_map_sgl(sg, &idx, scsi_out(sc));
-
-	*out_num = idx;
+	if (out)
+		virtqueue_add_sg(&buf, out->sgl, out->nents, DMA_TO_DEVICE);
 
 	/* Response header.  */
-	sg_set_buf(&sg[idx++], &cmd->resp, resp_size);
+	sg_init_one(&sg, &cmd->resp, resp_size);
+	virtqueue_add_sg_single(&buf, &sg, DMA_FROM_DEVICE);
 
 	/* Data-in buffer */
-	if (sc && sc->sc_data_direction != DMA_TO_DEVICE)
-		virtscsi_map_sgl(sg, &idx, scsi_in(sc));
+	if (in)
+		virtqueue_add_sg(&buf, in->sgl, in->nents, DMA_FROM_DEVICE);
 
-	*in_num = idx - *out_num;
+	virtqueue_end_buf(&buf);
+	return 0;
 }
 
-static int virtscsi_kick_cmd(struct virtio_scsi_target_state *tgt,
-			     struct virtio_scsi_vq *vq,
+static int virtscsi_kick_cmd(struct virtio_scsi_vq *vq,
 			     struct virtio_scsi_cmd *cmd,
 			     size_t req_size, size_t resp_size, gfp_t gfp)
 {
-	unsigned int out_num, in_num;
 	unsigned long flags;
-	int err;
+	int ret;
 	bool needs_kick = false;
 
-	spin_lock_irqsave(&tgt->tgt_lock, flags);
-	virtscsi_map_cmd(tgt, cmd, &out_num, &in_num, req_size, resp_size);
-
-	spin_lock(&vq->vq_lock);
-	err = virtqueue_add_buf(vq->vq, tgt->sg, out_num, in_num, cmd, gfp);
-	spin_unlock(&tgt->tgt_lock);
-	if (!err)
+	spin_lock_irqsave(&vq->vq_lock, flags);
+	ret = virtscsi_add_cmd(vq->vq, cmd, req_size, resp_size, gfp);
+	if (!ret)
 		needs_kick = virtqueue_kick_prepare(vq->vq);
 
 	spin_unlock_irqrestore(&vq->vq_lock, flags);
 
 	if (needs_kick)
 		virtqueue_notify(vq->vq);
-	return err;
+	return ret;
 }
 
 static int virtscsi_queuecommand(struct Scsi_Host *sh, struct scsi_cmnd *sc)
 {
 	struct virtio_scsi *vscsi = shost_priv(sh);
-	struct virtio_scsi_target_state *tgt = vscsi->tgt[sc->device->id];
 	struct virtio_scsi_cmd *cmd;
 	int ret;
 
@@ -467,7 +458,7 @@ static int virtscsi_queuecommand(struct Scsi_Host *sh, struct scsi_cmnd *sc)
 	BUG_ON(sc->cmd_len > VIRTIO_SCSI_CDB_SIZE);
 	memcpy(cmd->req.cmd.cdb, sc->cmnd, sc->cmd_len);
 
-	if (virtscsi_kick_cmd(tgt, &vscsi->req_vq, cmd,
+	if (virtscsi_kick_cmd(&vscsi->req_vq, cmd,
 			      sizeof cmd->req.cmd, sizeof cmd->resp.cmd,
 			      GFP_ATOMIC) == 0)
 		ret = 0;
@@ -481,11 +472,10 @@ out:
 static int virtscsi_tmf(struct virtio_scsi *vscsi, struct virtio_scsi_cmd *cmd)
 {
 	DECLARE_COMPLETION_ONSTACK(comp);
-	struct virtio_scsi_target_state *tgt = vscsi->tgt[cmd->sc->device->id];
 	int ret = FAILED;
 
 	cmd->comp = &comp;
-	if (virtscsi_kick_cmd(tgt, &vscsi->ctrl_vq, cmd,
+	if (virtscsi_kick_cmd(&vscsi->ctrl_vq, cmd,
 			      sizeof cmd->req.tmf, sizeof cmd->resp.tmf,
 			      GFP_NOIO) < 0)
 		goto out;
@@ -592,14 +582,11 @@ static struct virtio_scsi_target_state *virtscsi_alloc_tgt(
 	gfp_t gfp_mask = GFP_KERNEL;
 
 	/* We need extra sg elements at head and tail.  */
-	tgt = kmalloc(sizeof(*tgt) + sizeof(tgt->sg[0]) * (sg_elems + 2),
-		      gfp_mask);
-
+	tgt = kmalloc(sizeof(*tgt), gfp_mask);
 	if (!tgt)
 		return NULL;
 
 	spin_lock_init(&tgt->tgt_lock);
-	sg_init_table(tgt->sg, sg_elems + 2);
 	return tgt;
 }
 
-- 
1.7.1



^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [RFC PATCH 8/8] virtio: reimplement virtqueue_add_buf using new functions
  2013-02-07 12:22 ` Paolo Bonzini
                   ` (8 preceding siblings ...)
  (?)
@ 2013-02-07 12:22 ` Paolo Bonzini
  -1 siblings, 0 replies; 40+ messages in thread
From: Paolo Bonzini @ 2013-02-07 12:22 UTC (permalink / raw)
  To: linux-kernel; +Cc: Wanlong Gao, asias, Rusty Russell, mst, kvm, virtualization

Eliminate the code duplication between virtqueue_add_buf and
virtqueue_add_sg.  That's safe to do now that no devices will
pass scatterlists with a termination marker in the middle.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 drivers/virtio/virtio_ring.c |  157 +++---------------------------------------
 1 files changed, 11 insertions(+), 146 deletions(-)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 25f56e6..d19113d 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -119,63 +119,6 @@ struct vring_virtqueue
 
 #define to_vvq(_vq) container_of(_vq, struct vring_virtqueue, vq)
 
-/* Set up an indirect table of descriptors and add it to the queue. */
-static int vring_add_indirect(struct vring_virtqueue *vq,
-			      struct scatterlist sg[],
-			      unsigned int out,
-			      unsigned int in,
-			      gfp_t gfp)
-{
-	struct vring_desc *desc;
-	unsigned head;
-	int i;
-
-	/*
-	 * We require lowmem mappings for the descriptors because
-	 * otherwise virt_to_phys will give us bogus addresses in the
-	 * virtqueue.
-	 */
-	gfp &= ~(__GFP_HIGHMEM | __GFP_HIGH);
-
-	desc = kmalloc((out + in) * sizeof(struct vring_desc), gfp);
-	if (!desc)
-		return -ENOMEM;
-
-	/* Transfer entries from the sg list into the indirect page */
-	for (i = 0; i < out; i++) {
-		desc[i].flags = VRING_DESC_F_NEXT;
-		desc[i].addr = sg_phys(sg);
-		desc[i].len = sg->length;
-		desc[i].next = i+1;
-		sg++;
-	}
-	for (; i < (out + in); i++) {
-		desc[i].flags = VRING_DESC_F_NEXT|VRING_DESC_F_WRITE;
-		desc[i].addr = sg_phys(sg);
-		desc[i].len = sg->length;
-		desc[i].next = i+1;
-		sg++;
-	}
-
-	/* Last one doesn't continue. */
-	desc[i-1].flags &= ~VRING_DESC_F_NEXT;
-	desc[i-1].next = 0;
-
-	/* We're about to use a buffer */
-	vq->vq.num_free--;
-
-	/* Use a single buffer which doesn't continue */
-	head = vq->free_head;
-	vq->vring.desc[head].flags = VRING_DESC_F_INDIRECT;
-	vq->vring.desc[head].addr = virt_to_phys(desc);
-	vq->vring.desc[head].len = i * sizeof(struct vring_desc);
-
-	/* Update free pointer */
-	vq->free_head = vq->vring.desc[head].next;
-
-	return head;
-}
-
 /**
  * virtqueue_add_buf - expose buffer to other end
  * @vq: the struct virtqueue we're talking about.
@@ -190,104 +133,26 @@ static int vring_add_indirect(struct vring_virtqueue *vq,
  *
  * Returns zero or a negative error (ie. ENOSPC, ENOMEM).
  */
-int virtqueue_add_buf(struct virtqueue *_vq,
+int virtqueue_add_buf(struct virtqueue *vq,
 		      struct scatterlist sg[],
 		      unsigned int out,
 		      unsigned int in,
 		      void *data,
 		      gfp_t gfp)
 {
-	struct vring_virtqueue *vq = to_vvq(_vq);
-	unsigned int i, avail, uninitialized_var(prev);
-	int head;
-
-	START_USE(vq);
-
-	BUG_ON(data == NULL);
-
-#ifdef DEBUG
-	{
-		ktime_t now = ktime_get();
+	struct virtqueue_buf buf;
+	int ret;
 
-		/* No kick or get, with .1 second between?  Warn. */
-		if (vq->last_add_time_valid)
-			WARN_ON(ktime_to_ms(ktime_sub(now, vq->last_add_time))
-					    > 100);
-		vq->last_add_time = now;
-		vq->last_add_time_valid = true;
-	}
-#endif
+	ret = virtqueue_start_buf(vq, &buf, data, out + in, !!out + !!in, gfp);
+	if (ret < 0)
+		return ret;
 
-	/* If the host supports indirect descriptor tables, and we have multiple
-	 * buffers, then go indirect. FIXME: tune this threshold */
-	if (vq->indirect && (out + in) > 1 && vq->vq.num_free) {
-		head = vring_add_indirect(vq, sg, out, in, gfp);
-		if (likely(head >= 0))
-			goto add_head;
-	}
-
-	BUG_ON(out + in > vq->vring.num);
-	BUG_ON(out + in == 0);
-
-	if (vq->vq.num_free < out + in) {
-		pr_debug("Can't add buf len %i - avail = %i\n",
-			 out + in, vq->vq.num_free);
-		/* FIXME: for historical reasons, we force a notify here if
-		 * there are outgoing parts to the buffer.  Presumably the
-		 * host should service the ring ASAP. */
-		if (out)
-			vq->notify(&vq->vq);
-		END_USE(vq);
-		return -ENOSPC;
-	}
-
-	/* We're about to use some buffers from the free list. */
-	vq->vq.num_free -= out + in;
-
-	head = vq->free_head;
-	for (i = vq->free_head; out; i = vq->vring.desc[i].next, out--) {
-		vq->vring.desc[i].flags = VRING_DESC_F_NEXT;
-		vq->vring.desc[i].addr = sg_phys(sg);
-		vq->vring.desc[i].len = sg->length;
-		prev = i;
-		sg++;
-	}
-	for (; in; i = vq->vring.desc[i].next, in--) {
-		vq->vring.desc[i].flags = VRING_DESC_F_NEXT|VRING_DESC_F_WRITE;
-		vq->vring.desc[i].addr = sg_phys(sg);
-		vq->vring.desc[i].len = sg->length;
-		prev = i;
-		sg++;
-	}
-	/* Last one doesn't continue. */
-	vq->vring.desc[prev].flags &= ~VRING_DESC_F_NEXT;
-
-	/* Update free pointer */
-	vq->free_head = i;
-
-add_head:
-	/* Set token. */
-	vq->data[head] = data;
-
-	/* Put entry in available array (but don't update avail->idx until they
-	 * do sync). */
-	avail = (vq->vring.avail->idx & (vq->vring.num-1));
-	vq->vring.avail->ring[avail] = head;
-
-	/* Descriptors and available array need to be set before we expose the
-	 * new available array entries. */
-	virtio_wmb(vq);
-	vq->vring.avail->idx++;
-	vq->num_added++;
-
-	/* This is very unlikely, but theoretically possible.  Kick
-	 * just in case. */
-	if (unlikely(vq->num_added == (1 << 16) - 1))
-		virtqueue_kick(_vq);
-
-	pr_debug("Added buffer head %i to %p\n", head, vq);
-	END_USE(vq);
+	if (out)
+		virtqueue_add_sg(&buf, sg, out, DMA_TO_DEVICE);
+	if (in)
+		virtqueue_add_sg(&buf, sg + out, in, DMA_FROM_DEVICE);
 
+	virtqueue_end_buf(&buf);
 	return 0;
 }
 EXPORT_SYMBOL_GPL(virtqueue_add_buf);
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* Re: [RFC PATCH 5/8] scatterlist: introduce sg_unmark_end
  2013-02-07 12:22   ` Paolo Bonzini
  (?)
  (?)
@ 2013-02-07 12:35   ` Jens Axboe
  -1 siblings, 0 replies; 40+ messages in thread
From: Jens Axboe @ 2013-02-07 12:35 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: linux-kernel, Wanlong Gao, asias, Rusty Russell, mst, kvm,
	virtualization

On Thu, Feb 07 2013, Paolo Bonzini wrote:
> This is useful in places that recycle the same scatterlist multiple
> times, and do not want to incur the cost of sg_init_table every
> time in hot paths.

Looks fine to me.

Acked-by: Jens Axboe <axboe@kernel.dk>

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFC PATCH 0/8] virtio: new API for addition of buffers, scatterlist changes
  2013-02-07 12:22 ` Paolo Bonzini
@ 2013-02-07 13:09   ` Michael S. Tsirkin
  -1 siblings, 0 replies; 40+ messages in thread
From: Michael S. Tsirkin @ 2013-02-07 13:09 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: linux-kernel, Wanlong Gao, asias, Rusty Russell, kvm, virtualization

On Thu, Feb 07, 2013 at 01:22:24PM +0100, Paolo Bonzini wrote:
> The virtqueue_add_buf function has two limitations:
> 
> 1) it requires the caller to provide all the buffers in a single call;
> 
> 2) it does not support chained scatterlists: the buffers must be
> provided as an array of struct scatterlist.
> 
> Because of these limitations, virtio-scsi has to copy each request into
> a scatterlist internal to the driver.  It cannot just use the one that
> was prepared by the upper SCSI layers.
> 
> This series adds a different set of APIs for adding a buffer to a
> virtqueue.  The new API lets you pass the buffers piecewise, wrapping
> multiple calls to virtqueue_add_sg between virtqueue_start_buf and
> virtqueue_end_buf.  Letting drivers call virtqueue_add_sg multiple times
> if they already have a scatterlist provided by someone else simplifies the
> code and, for virtio-scsi, it saves the copying and the related locking.
> 
> One major difference between virtqueue_add_buf and virtqueue_add_sg
> is that the latter uses scatterlist iterators, which follow chained
> scatterlist structs and stop at ending markers.  In order to avoid code
> duplication, and use the new API from virtqueue_add_buf (patch 8), we need
> to change all existing callers of virtqueue_add_buf to provide well-formed
> scatterlists.  This is what patches 2-7 do.  For virtio-blk it is easiest
> to just switch to the new API, just like for virtio-scsi.  For virtio-net
> the ending marker must be reset after calling virtqueue_add_buf, in
> preparation for the next usage of the scatterlist.  Other drivers are
> safe already.
> 

What are the changes as compared to the previous version?
How about some comments made on the previous version?
See e.g.
https://patchwork.kernel.org/patch/1891541/

Generally we have code for direct and indirect which is already
painful. We do not want 4 more variants of this code.

> This is an RFC for two reasons.  First, because I haven't done enough
> testing yet (especially with all the variations on receiving that
> virtio-net has).  Second, because I still have two struct vring_desc *
> fields in virtqueue API, which is a layering violation.  I'm not really
> sure how important that is and how to fix that---except by making the
> fields void*.

Hide the whole structure as part of vring struct, the problem will go
away.

> Paolo
> Paolo Bonzini (8):
>   virtio: add functions for piecewise addition of buffers
>   virtio-blk: reorganize virtblk_add_req
>   virtio-blk: use virtqueue_start_buf on bio path
>   virtio-blk: use virtqueue_start_buf on req path
>   scatterlist: introduce sg_unmark_end
>   virtio-net: unmark scatterlist ending after virtqueue_add_buf
>   virtio-scsi: use virtqueue_start_buf
>   virtio: reimplement virtqueue_add_buf using new functions
> 
>  block/blk-integrity.c        |    2 +-
>  block/blk-merge.c            |    2 +-
>  drivers/block/virtio_blk.c   |  165 +++++++++--------
>  drivers/net/virtio_net.c     |   21 ++-
>  drivers/scsi/virtio_scsi.c   |  103 +++++------
>  drivers/virtio/virtio_ring.c |  417 +++++++++++++++++++++++++++---------------
>  include/linux/scatterlist.h  |   16 ++
>  include/linux/virtio.h       |   25 +++
>  8 files changed, 460 insertions(+), 291 deletions(-)

^ permalink raw reply	[flat|nested] 40+ messages in thread

>  drivers/block/virtio_blk.c   |  165 +++++++++--------
>  drivers/net/virtio_net.c     |   21 ++-
>  drivers/scsi/virtio_scsi.c   |  103 +++++------
>  drivers/virtio/virtio_ring.c |  417 +++++++++++++++++++++++++++---------------
>  include/linux/scatterlist.h  |   16 ++
>  include/linux/virtio.h       |   25 +++
>  8 files changed, 460 insertions(+), 291 deletions(-)

^ permalink raw reply	[flat|nested] 40+ messages in thread
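The calling convention proposed in the cover letter above — wrapping several virtqueue_add_sg calls between virtqueue_start_buf and virtqueue_end_buf — can be sketched with a toy stand-in. Only the function names come from the series; the struct layout and semantics below are illustrative stubs, not the kernel implementation:

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins; the real struct scatterlist and struct virtqueue
 * live in the kernel and look nothing like this. */
struct scatterlist { void *addr; size_t len; };
struct virtqueue   { unsigned int free; unsigned int pending; unsigned int added; };

/* Reserve room for one buffer's worth of descriptors up front. */
int virtqueue_start_buf(struct virtqueue *vq, unsigned int count)
{
	if (count > vq->free)
		return -1;		/* ring full: caller must back off */
	vq->pending = count;
	return 0;
}

/* Add one piece of the buffer; callable repeatedly until the
 * reserved count is used up. */
void virtqueue_add_sg(struct virtqueue *vq, struct scatterlist *sg,
		      unsigned int nents)
{
	(void)sg;
	assert(nents <= vq->pending);
	vq->pending -= nents;
	vq->added   += nents;
	vq->free    -= nents;
}

/* Publish the completed buffer to the device side. */
void virtqueue_end_buf(struct virtqueue *vq)
{
	assert(vq->pending == 0);	/* all reserved entries were supplied */
}
```

A virtio-scsi-style caller would then add the request header with one call and hand the SCSI midlayer's own scatterlist to a second call, with no copying into a driver-private table.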

* Re: [RFC PATCH 0/8] virtio: new API for addition of buffers, scatterlist changes
  2013-02-07 13:09   ` Michael S. Tsirkin
@ 2013-02-07 13:14     ` Paolo Bonzini
  -1 siblings, 0 replies; 40+ messages in thread
From: Paolo Bonzini @ 2013-02-07 13:14 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: linux-kernel, Wanlong Gao, asias, Rusty Russell, kvm, virtualization

On 07/02/2013 14:09, Michael S. Tsirkin wrote:
>> One major difference between virtqueue_add_buf and virtqueue_add_sg
>> is that the latter uses scatterlist iterators, which follow chained
>> scatterlist structs and stop at ending markers.  In order to avoid code
>> duplication, and use the new API from virtqueue_add_buf (patch 8), we need
>> to change all existing callers of virtqueue_add_buf to provide well-formed
>> scatterlists.  This is what patches 2-7 do.  For virtio-blk it is easiest
>> to just switch to the new API, just like for virtio-scsi.  For virtio-net
>> the ending marker must be reset after calling virtqueue_add_buf, in
>> preparation for the next usage of the scatterlist.  Other drivers are
>> safe already.
> 
> What are the changes as compared to the previous version?
> How about some comments made on the previous version?
> See e.g.
> https://patchwork.kernel.org/patch/1891541/

Two changes: 1) added virtqueue_add_sg_single; 2) reimplemented
virtqueue_add_buf in terms of the new API, which requires virtio-blk and
virtio-net changes.

The virtio-blk and virtio-net changes are based on some ideas in the
patch Rusty posted, but virtio-net is a bit simpler and virtio-blk was
redone from scratch.

> Generally we have code for direct and indirect which is already
> painful. We do not want 4 more variants of this code.

Yes, indeed, the other main difference is that I'm now reimplementing
virtqueue_add_buf using the new functions.  So:

- we previously had 2 variants (direct/indirect)

- v1 had 4 variants (direct/indirect x add_buf/add_sg)

- v2 has 4 variants (direct/indirect x add_sg/add_sg_single)

>> This is an RFC for two reasons.  First, because I haven't done enough
>> testing yet (especially with all the variations on receiving that
>> virtio-net has).  Second, because I still have two struct vring_desc *
>> fields in virtqueue API, which is a layering violation.  I'm not really
>> sure how important that is and how to fix that---except by making the
>> fields void*.
> 
> Hide the whole structure as part of vring struct, the problem will go
> away.

Yes, that's the other possibility.  Will do for the next submission.

Paolo

>> Paolo
>> Paolo Bonzini (8):
>>   virtio: add functions for piecewise addition of buffers
>>   virtio-blk: reorganize virtblk_add_req
>>   virtio-blk: use virtqueue_start_buf on bio path
>>   virtio-blk: use virtqueue_start_buf on req path
>>   scatterlist: introduce sg_unmark_end
>>   virtio-net: unmark scatterlist ending after virtqueue_add_buf
>>   virtio-scsi: use virtqueue_start_buf
>>   virtio: reimplement virtqueue_add_buf using new functions
>>
>>  block/blk-integrity.c        |    2 +-
>>  block/blk-merge.c            |    2 +-
>>  drivers/block/virtio_blk.c   |  165 +++++++++--------
>>  drivers/net/virtio_net.c     |   21 ++-
>>  drivers/scsi/virtio_scsi.c   |  103 +++++------
>>  drivers/virtio/virtio_ring.c |  417 +++++++++++++++++++++++++++---------------
>>  include/linux/scatterlist.h  |   16 ++
>>  include/linux/virtio.h       |   25 +++
>>  8 files changed, 460 insertions(+), 291 deletions(-)


^ permalink raw reply	[flat|nested] 40+ messages in thread
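The ending-marker behaviour discussed above — scatterlist iterators stop at a marked entry, so virtio-net must clear the marker before reusing its table — can be modelled in a few lines. sg_unmark_end is the helper name introduced by patch 5; the flag layout here is simplified and is not the kernel's actual encoding:

```c
#include <assert.h>
#include <stddef.h>

#define SG_END 0x1UL	/* simplified: the kernel packs this into page_link */

struct scatterlist { unsigned long flags; size_t len; };

void sg_mark_end(struct scatterlist *sg)   { sg->flags |= SG_END; }
void sg_unmark_end(struct scatterlist *sg) { sg->flags &= ~SG_END; }

/* Count entries the way an sg iterator would walk them: stop after
 * the entry carrying the end marker, even if the table is longer. */
unsigned int sg_walk_count(struct scatterlist *sg, unsigned int max)
{
	unsigned int n = 0;

	for (; n < max; sg++) {
		n++;
		if (sg->flags & SG_END)
			break;
	}
	return n;
}
```

This is why a table terminated for one request must have sg_unmark_end called on the terminating entry before a later request reuses more of it — otherwise an iterator-based virtqueue_add_sg would silently truncate the buffer.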

* Re: [RFC PATCH 0/8] virtio: new API for addition of buffers, scatterlist changes
  2013-02-07 13:23       ` Michael S. Tsirkin
@ 2013-02-07 13:20         ` Paolo Bonzini
  -1 siblings, 0 replies; 40+ messages in thread
From: Paolo Bonzini @ 2013-02-07 13:20 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: linux-kernel, Wanlong Gao, asias, Rusty Russell, kvm, virtualization

On 07/02/2013 14:23, Michael S. Tsirkin wrote:
> On Thu, Feb 07, 2013 at 02:14:24PM +0100, Paolo Bonzini wrote:
>> On 07/02/2013 14:09, Michael S. Tsirkin wrote:
>>>> One major difference between virtqueue_add_buf and virtqueue_add_sg
>>>> is that the latter uses scatterlist iterators, which follow chained
>>>> scatterlist structs and stop at ending markers.  In order to avoid code
>>>> duplication, and use the new API from virtqueue_add_buf (patch 8), we need
>>>> to change all existing callers of virtqueue_add_buf to provide well-formed
>>>> scatterlists.  This is what patches 2-7 do.  For virtio-blk it is easiest
>>>> to just switch to the new API, just like for virtio-scsi.  For virtio-net
>>>> the ending marker must be reset after calling virtqueue_add_buf, in
>>>> preparation for the next usage of the scatterlist.  Other drivers are
>>>> safe already.
>>>
>>> What are the changes as compared to the previous version?
>>> How about some comments made on the previous version?
>>> See e.g.
>>> https://patchwork.kernel.org/patch/1891541/
>>
>> Two changes: 1) added virtqueue_add_sg_single; 2) reimplemented
>> virtqueue_add_buf in terms of the new API, which requires virtio-blk and
>> virtio-net changes.
>>
>> The virtio-blk and virtio-net changes are based on some ideas in the
>> patch Rusty posted, but virtio-net is a bit simpler and virtio-blk was
>> redone from scratch.
>>
>>> Generally we have code for direct and indirect which is already
>>> painful. We do not want 4 more variants of this code.
>>
>> Yes, indeed, the other main difference is that I'm now reimplementing
>> virtqueue_add_buf using the new functions.  So:
>>
>> - we previously had 2 variants (direct/indirect)
>>
>> - v1 had 4 variants (direct/indirect x add_buf/add_sg)
>>
>> - v2 has 4 variants (direct/indirect x add_sg/add_sg_single)
> 
> single is never indirect so should have a single variant.

Single means *this piece* (for example a request header) is single.  It
could still end up in an indirect buffer because QEMU does not support
mixed direct/indirect buffers.

Paolo

>>>> This is an RFC for two reasons.  First, because I haven't done enough
>>>> testing yet (especially with all the variations on receiving that
>>>> virtio-net has).  Second, because I still have two struct vring_desc *
>>>> fields in virtqueue API, which is a layering violation.  I'm not really
>>>> sure how important that is and how to fix that---except by making the
>>>> fields void*.
>>>
>>> Hide the whole structure as part of vring struct, the problem will go
>>> away.
>>
>> Yes, that's the other possibility.  Will do for the next submission.
>>
>> Paolo
>>
>>>> Paolo
>>>> Paolo Bonzini (8):
>>>>   virtio: add functions for piecewise addition of buffers
>>>>   virtio-blk: reorganize virtblk_add_req
>>>>   virtio-blk: use virtqueue_start_buf on bio path
>>>>   virtio-blk: use virtqueue_start_buf on req path
>>>>   scatterlist: introduce sg_unmark_end
>>>>   virtio-net: unmark scatterlist ending after virtqueue_add_buf
>>>>   virtio-scsi: use virtqueue_start_buf
>>>>   virtio: reimplement virtqueue_add_buf using new functions
>>>>
>>>>  block/blk-integrity.c        |    2 +-
>>>>  block/blk-merge.c            |    2 +-
>>>>  drivers/block/virtio_blk.c   |  165 +++++++++--------
>>>>  drivers/net/virtio_net.c     |   21 ++-
>>>>  drivers/scsi/virtio_scsi.c   |  103 +++++------
>>>>  drivers/virtio/virtio_ring.c |  417 +++++++++++++++++++++++++++---------------
>>>>  include/linux/scatterlist.h  |   16 ++
>>>>  include/linux/virtio.h       |   25 +++
>>>>  8 files changed, 460 insertions(+), 291 deletions(-)


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFC PATCH 0/8] virtio: new API for addition of buffers, scatterlist changes
  2013-02-07 13:14     ` Paolo Bonzini
@ 2013-02-07 13:23       ` Michael S. Tsirkin
  -1 siblings, 0 replies; 40+ messages in thread
From: Michael S. Tsirkin @ 2013-02-07 13:23 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: linux-kernel, Wanlong Gao, asias, Rusty Russell, kvm, virtualization

On Thu, Feb 07, 2013 at 02:14:24PM +0100, Paolo Bonzini wrote:
> On 07/02/2013 14:09, Michael S. Tsirkin wrote:
> >> One major difference between virtqueue_add_buf and virtqueue_add_sg
> >> is that the latter uses scatterlist iterators, which follow chained
> >> scatterlist structs and stop at ending markers.  In order to avoid code
> >> duplication, and use the new API from virtqueue_add_buf (patch 8), we need
> >> to change all existing callers of virtqueue_add_buf to provide well-formed
> >> scatterlists.  This is what patches 2-7 do.  For virtio-blk it is easiest
> >> to just switch to the new API, just like for virtio-scsi.  For virtio-net
> >> the ending marker must be reset after calling virtqueue_add_buf, in
> >> preparation for the next usage of the scatterlist.  Other drivers are
> >> safe already.
> > 
> > What are the changes as compared to the previous version?
> > How about some comments made on the previous version?
> > See e.g.
> > https://patchwork.kernel.org/patch/1891541/
> 
> Two changes: 1) added virtqueue_add_sg_single; 2) reimplemented
> virtqueue_add_buf in terms of the new API, which requires virtio-blk and
> virtio-net changes.
> 
> The virtio-blk and virtio-net changes are based on some ideas in the
> patch Rusty posted, but virtio-net is a bit simpler and virtio-blk was
> redone from scratch.
> 
> > Generally we have code for direct and indirect which is already
> > painful. We do not want 4 more variants of this code.
> 
> Yes, indeed, the other main difference is that I'm now reimplementing
> virtqueue_add_buf using the new functions.  So:
> 
> - we previously had 2 variants (direct/indirect)
> 
> - v1 had 4 variants (direct/indirect x add_buf/add_sg)
> 
> - v2 has 4 variants (direct/indirect x add_sg/add_sg_single)

single is never indirect so should have a single variant.

> >> This is an RFC for two reasons.  First, because I haven't done enough
> >> testing yet (especially with all the variations on receiving that
> >> virtio-net has).  Second, because I still have two struct vring_desc *
> >> fields in virtqueue API, which is a layering violation.  I'm not really
> >> sure how important that is and how to fix that---except by making the
> >> fields void*.
> > 
> > Hide the whole structure as part of vring struct, the problem will go
> > away.
> 
> Yes, that's the other possibility.  Will do for the next submission.
> 
> Paolo
> 
> >> Paolo
> >> Paolo Bonzini (8):
> >>   virtio: add functions for piecewise addition of buffers
> >>   virtio-blk: reorganize virtblk_add_req
> >>   virtio-blk: use virtqueue_start_buf on bio path
> >>   virtio-blk: use virtqueue_start_buf on req path
> >>   scatterlist: introduce sg_unmark_end
> >>   virtio-net: unmark scatterlist ending after virtqueue_add_buf
> >>   virtio-scsi: use virtqueue_start_buf
> >>   virtio: reimplement virtqueue_add_buf using new functions
> >>
> >>  block/blk-integrity.c        |    2 +-
> >>  block/blk-merge.c            |    2 +-
> >>  drivers/block/virtio_blk.c   |  165 +++++++++--------
> >>  drivers/net/virtio_net.c     |   21 ++-
> >>  drivers/scsi/virtio_scsi.c   |  103 +++++------
> >>  drivers/virtio/virtio_ring.c |  417 +++++++++++++++++++++++++++---------------
> >>  include/linux/scatterlist.h  |   16 ++
> >>  include/linux/virtio.h       |   25 +++
> >>  8 files changed, 460 insertions(+), 291 deletions(-)

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFC PATCH 0/8] virtio: new API for addition of buffers, scatterlist changes
  2013-02-07 13:31           ` Michael S. Tsirkin
@ 2013-02-07 13:30             ` Paolo Bonzini
  -1 siblings, 0 replies; 40+ messages in thread
From: Paolo Bonzini @ 2013-02-07 13:30 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: linux-kernel, Wanlong Gao, asias, Rusty Russell, kvm, virtualization

On 07/02/2013 14:31, Michael S. Tsirkin wrote:
> > Single means *this piece* (for example a request header) is single.  It
> > could still end up in an indirect buffer because QEMU does not support
> > mixed direct/indirect buffers.
> 
> Yes but why is the optimization worth it?
> It makes sense if all we want to do is add a single buffer
> in one go, this would give us virtqueue_add_buf_single.
> 
> But if we are building up an s/g list anyway,
> speeding up one of the entries a tiny bit
> seems very unlikely to be measurable.
> No?

There is some optimization potential even in unrolling the loop, but
yes, it looks like I misunderstood.  I'll add virtqueue_add_buf_single
instead.

Paolo


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFC PATCH 0/8] virtio: new API for addition of buffers, scatterlist changes
  2013-02-07 13:20         ` Paolo Bonzini
@ 2013-02-07 13:31           ` Michael S. Tsirkin
  -1 siblings, 0 replies; 40+ messages in thread
From: Michael S. Tsirkin @ 2013-02-07 13:31 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: linux-kernel, Wanlong Gao, asias, Rusty Russell, kvm, virtualization

On Thu, Feb 07, 2013 at 02:20:53PM +0100, Paolo Bonzini wrote:
> On 07/02/2013 14:23, Michael S. Tsirkin wrote:
> > On Thu, Feb 07, 2013 at 02:14:24PM +0100, Paolo Bonzini wrote:
> >> On 07/02/2013 14:09, Michael S. Tsirkin wrote:
> >>>> One major difference between virtqueue_add_buf and virtqueue_add_sg
> >>>> is that the latter uses scatterlist iterators, which follow chained
> >>>> scatterlist structs and stop at ending markers.  In order to avoid code
> >>>> duplication, and use the new API from virtqueue_add_buf (patch 8), we need
> >>>> to change all existing callers of virtqueue_add_buf to provide well-formed
> >>>> scatterlists.  This is what patches 2-7 do.  For virtio-blk it is easiest
> >>>> to just switch to the new API, just like for virtio-scsi.  For virtio-net
> >>>> the ending marker must be reset after calling virtqueue_add_buf, in
> >>>> preparation for the next usage of the scatterlist.  Other drivers are
> >>>> safe already.
> >>>
> >>> What are the changes as compared to the previous version?
> >>> How about some comments made on the previous version?
> >>> See e.g.
> >>> https://patchwork.kernel.org/patch/1891541/
> >>
> >> Two changes: 1) added virtqueue_add_sg_single; 2) reimplemented
> >> virtqueue_add_buf in terms of the new API, which requires virtio-blk and
> >> virtio-net changes.
> >>
> >> The virtio-blk and virtio-net changes are based on some ideas in the
> >> patch Rusty posted, but virtio-net is a bit simpler and virtio-blk was
> >> redone from scratch.
> >>
> >>> Generally we have code for direct and indirect which is already
> >>> painful. We do not want 4 more variants of this code.
> >>
> >> Yes, indeed, the other main difference is that I'm now reimplementing
> >> virtqueue_add_buf using the new functions.  So:
> >>
> >> - we previously had 2 variants (direct/indirect)
> >>
> >> - v1 had 4 variants (direct/indirect x add_buf/add_sg)
> >>
> >> - v2 has 4 variants (direct/indirect x add_sg/add_sg_single)
> > 
> > single is never indirect so should have a single variant.
> 
> Single means *this piece* (for example a request header) is single.  It
> could still end up in an indirect buffer because QEMU does not support
> mixed direct/indirect buffers.
> 
> Paolo

Yes but why is the optimization worth it?
It makes sense if all we want to do is add a single buffer
in one go, this would give us virtqueue_add_buf_single.

But if we are building up an s/g list anyway,
speeding up one of the entries a tiny bit
seems very unlikely to be measurable.
No?

> >>>> This is an RFC for two reasons.  First, because I haven't done enough
> >>>> testing yet (especially with all the variations on receiving that
> >>>> virtio-net has).  Second, because I still have two struct vring_desc *
> >>>> fields in virtqueue API, which is a layering violation.  I'm not really
> >>>> sure how important that is and how to fix that---except by making the
> >>>> fields void*.
> >>>
> >>> Hide the whole structure as part of vring struct, the problem will go
> >>> away.
> >>
> >> Yes, that's the other possibility.  Will do for the next submission.
> >>
> >> Paolo
> >>
> >>>> Paolo
> >>>> Paolo Bonzini (8):
> >>>>   virtio: add functions for piecewise addition of buffers
> >>>>   virtio-blk: reorganize virtblk_add_req
> >>>>   virtio-blk: use virtqueue_start_buf on bio path
> >>>>   virtio-blk: use virtqueue_start_buf on req path
> >>>>   scatterlist: introduce sg_unmark_end
> >>>>   virtio-net: unmark scatterlist ending after virtqueue_add_buf
> >>>>   virtio-scsi: use virtqueue_start_buf
> >>>>   virtio: reimplement virtqueue_add_buf using new functions
> >>>>
> >>>>  block/blk-integrity.c        |    2 +-
> >>>>  block/blk-merge.c            |    2 +-
> >>>>  drivers/block/virtio_blk.c   |  165 +++++++++--------
> >>>>  drivers/net/virtio_net.c     |   21 ++-
> >>>>  drivers/scsi/virtio_scsi.c   |  103 +++++------
> >>>>  drivers/virtio/virtio_ring.c |  417 +++++++++++++++++++++++++++---------------
> >>>>  include/linux/scatterlist.h  |   16 ++
> >>>>  include/linux/virtio.h       |   25 +++
> >>>>  8 files changed, 460 insertions(+), 291 deletions(-)

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFC PATCH 0/8] virtio: new API for addition of buffers, scatterlist changes
@ 2013-02-07 13:31           ` Michael S. Tsirkin
  0 siblings, 0 replies; 40+ messages in thread
From: Michael S. Tsirkin @ 2013-02-07 13:31 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kvm, linux-kernel, virtualization

On Thu, Feb 07, 2013 at 02:20:53PM +0100, Paolo Bonzini wrote:
> On 07/02/2013 14:23, Michael S. Tsirkin wrote:
> > On Thu, Feb 07, 2013 at 02:14:24PM +0100, Paolo Bonzini wrote:
> >> On 07/02/2013 14:09, Michael S. Tsirkin wrote:
> >>>> One major difference between virtqueue_add_buf and virtqueue_add_sg
> >>>> is that the latter uses scatterlist iterators, which follow chained
> >>>> scatterlist structs and stop at ending markers.  In order to avoid code
> >>>> duplication, and use the new API from virtqueue_add_buf (patch 8), we need
> >>>> to change all existing callers of virtqueue_add_buf to provide well-formed
> >>>> scatterlists.  This is what patches 2-7 do.  For virtio-blk it is easiest
> >>>> to just switch to the new API, just like for virtio-scsi.  For virtio-net
> >>>> the ending marker must be reset after calling virtqueue_add_buf, in
> >>>> preparation for the next usage of the scatterlist.  Other drivers are
> >>>> safe already.
> >>>
> >>> What are the changes as compared to the previous version?
> >>> How about some comments made on the previous version?
> >>> See e.g.
> >>> https://patchwork.kernel.org/patch/1891541/
> >>
> >> Two changes: 1) added virtqueue_add_sg_single; 2) reimplemented
> >> virtqueue_add_buf in terms of the new API, which requires virtio-blk and
> >> virtio-net changes.
> >>
> >> The virtio-blk and virtio-net changes are based on some ideas in the
> >> patch Rusty posted, but virtio-net is a bit simpler and virtio-blk was
> >> redone from scratch.
> >>
> >>> Generally we have code for direct and indirect which is already
> >>> painful. We do not want 4 more variants of this code.
> >>
> >> Yes, indeed, the other main difference is that I'm now reimplementing
> >> virtqueue_add_buf using the new functions.  So:
> >>
> >> - we previously had 2 variants (direct/indirect)
> >>
> >> - v1 had 4 variants (direct/indirect x add_buf/add_sg)
> >>
> >> - v2 has 4 variants (direct/indirect x add_sg/add_sg_single)
> > 
> > single is never indirect so should have a single variant.
> 
> Single means *this piece* (for example a request header) is single.  It
> could still end up in an indirect buffer because QEMU does not support
> mixed direct/indirect buffers.
> 
> Paolo

Yes, but why is the optimization worth it?
It makes sense if all we want to do is add a single buffer
in one go; this would give us virtqueue_add_buf_single.

But if we are building up an s/g list anyway,
speeding up one of the entries a tiny bit
seems very unlikely to be measurable.
No?

> >>>> This is an RFC for two reasons.  First, because I haven't done enough
> >>>> testing yet (especially with all the variations on receiving that
> >>>> virtio-net has).  Second, because I still have two struct vring_desc *
> >>>> fields in virtqueue API, which is a layering violation.  I'm not really
> >>>> sure how important that is and how to fix that---except by making the
> >>>> fields void*.
> >>>
> >>> Hide the whole structure as part of vring struct, the problem will go
> >>> away.
> >>
> >> Yes, that's the other possibility.  Will do for the next submission.
> >>
> >> Paolo
> >>
> >>>> Paolo
> >>>> Paolo Bonzini (8):
> >>>>   virtio: add functions for piecewise addition of buffers
> >>>>   virtio-blk: reorganize virtblk_add_req
> >>>>   virtio-blk: use virtqueue_start_buf on bio path
> >>>>   virtio-blk: use virtqueue_start_buf on req path
> >>>>   scatterlist: introduce sg_unmark_end
> >>>>   virtio-net: unmark scatterlist ending after virtqueue_add_buf
> >>>>   virtio-scsi: use virtqueue_start_buf
> >>>>   virtio: reimplement virtqueue_add_buf using new functions
> >>>>
> >>>>  block/blk-integrity.c        |    2 +-
> >>>>  block/blk-merge.c            |    2 +-
> >>>>  drivers/block/virtio_blk.c   |  165 +++++++++--------
> >>>>  drivers/net/virtio_net.c     |   21 ++-
> >>>>  drivers/scsi/virtio_scsi.c   |  103 +++++------
> >>>>  drivers/virtio/virtio_ring.c |  417 +++++++++++++++++++++++++++---------------
> >>>>  include/linux/scatterlist.h  |   16 ++
> >>>>  include/linux/virtio.h       |   25 +++
> >>>>  8 files changed, 460 insertions(+), 291 deletions(-)

^ permalink raw reply	[flat|nested] 40+ messages in thread
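[Editorial note: for readers following the API discussion above, the calling
convention proposed by the series (virtqueue_start_buf, virtqueue_add_sg,
virtqueue_end_buf) can be sketched with a userspace mock. Only the function
names come from the series; the stub types and bodies below are invented
purely to illustrate the start/add/add/.../end pattern, not the real
drivers/virtio/virtio_ring.c implementation.]

```c
#include <assert.h>
#include <stddef.h>

/* Illustration-only stand-ins for the kernel's struct virtqueue and
 * struct scatterlist; the real structures look nothing like this. */
struct scatterlist { void *addr; unsigned int len; };
struct virtqueue   { unsigned int in_flight; unsigned int total_sgs; };

/* Mocked calling convention: start_buf opens a buffer, add_sg may be
 * called piecewise any number of times, end_buf closes the buffer. */
static int virtqueue_start_buf(struct virtqueue *vq, unsigned int count)
{
	(void)count;
	vq->in_flight = 1;
	vq->total_sgs = 0;
	return 0;
}

static void virtqueue_add_sg(struct virtqueue *vq,
			     struct scatterlist sg[], unsigned int nents)
{
	(void)sg;
	assert(vq->in_flight);	/* must be inside a start/end pair */
	vq->total_sgs += nents;
}

static void virtqueue_end_buf(struct virtqueue *vq)
{
	vq->in_flight = 0;
}

unsigned int demo(void)
{
	struct virtqueue vq = { 0, 0 };
	struct scatterlist hdr[1], data[3], status[1];

	/* virtio-scsi style: request header, payload scatterlist prepared
	 * by the upper SCSI layers, then a status entry - three separate
	 * calls instead of one copied-together array. */
	virtqueue_start_buf(&vq, 5);
	virtqueue_add_sg(&vq, hdr, 1);
	virtqueue_add_sg(&vq, data, 3);
	virtqueue_add_sg(&vq, status, 1);
	virtqueue_end_buf(&vq);
	return vq.total_sgs;
}
```

The point of the shape, per the cover letter, is that the middle
virtqueue_add_sg call can take a scatterlist owned by another layer,
avoiding the copy and the related locking in virtio-scsi.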

* Re: [RFC PATCH 0/8] virtio: new API for addition of buffers, scatterlist changes
  2013-02-07 12:22 ` Paolo Bonzini
@ 2013-02-08  4:05   ` Rusty Russell
  -1 siblings, 0 replies; 40+ messages in thread
From: Rusty Russell @ 2013-02-08  4:05 UTC (permalink / raw)
  To: Paolo Bonzini, linux-kernel
  Cc: Wanlong Gao, asias, mst, kvm, virtualization, Jens Axboe

Paolo Bonzini <pbonzini@redhat.com> writes:
> The virtqueue_add_buf function has two limitations:
>
> 1) it requires the caller to provide all the buffers in a single call;
>
> 2) it does not support chained scatterlists: the buffers must be
> provided as an array of struct scatterlist.
>
> Because of these limitations, virtio-scsi has to copy each request into
> a scatterlist internal to the driver.  It cannot just use the one that
> was prepared by the upper SCSI layers.

Hi Paolo,

        Note that you've defined your problem in terms of your solution
here.  For clarity:

The problem: we want to prepend and append to a scatterlist.  We can't
        append, because the chained scatterlist implementation requires
        an element to be appended to join two scatterlists together.

The solution: fix scatterlists by introducing struct sg_ring:
        struct sg_ring {
                struct list_head ring;
                unsigned int nents;
                unsigned int orig_nents; /* do we want to replace sg_table? */
                struct scatterlist *sg;
        };

The workaround: make virtio accept multiple scatterlists for a single
        buffer.

There's nothing wrong with your workaround, but if other subsystems have
the same problem we do, perhaps we should consider a broader solution?

Cheers,
Rusty.

^ permalink raw reply	[flat|nested] 40+ messages in thread
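[Editorial note: sg_ring exists only as the sketch in Rusty's message above.
A minimal userspace mock of the idea, with list helpers reduced from the
kernel's list.h and the orig_nents field dropped for brevity, shows why a
ring of segments allows prepend and append without touching any scatterlist
entries - exactly what chained scatterlists cannot offer.]

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly linked list, mirroring the kernel's
 * struct list_head just enough for this sketch. */
struct list_head { struct list_head *next, *prev; };

static void list_init(struct list_head *h) { h->next = h->prev = h; }

static void list_add_tail(struct list_head *n, struct list_head *h)
{
	n->prev = h->prev; n->next = h;
	h->prev->next = n; h->prev = n;
}

static void list_add(struct list_head *n, struct list_head *h)
{
	n->next = h->next; n->prev = h;
	h->next->prev = n; h->next = n;
}

struct scatterlist { unsigned int length; };

/* Sketched sg_ring: each node wraps a plain scatterlist array;
 * the ring, not the sg entries, carries the chaining. */
struct sg_ring {
	struct list_head ring;	/* must stay the first member */
	unsigned int nents;
	struct scatterlist *sg;
};

static unsigned int total_nents(struct list_head *head)
{
	unsigned int n = 0;
	for (struct list_head *p = head->next; p != head; p = p->next)
		n += ((struct sg_ring *)p)->nents; /* ring is first member */
	return n;
}

unsigned int demo(void)
{
	struct scatterlist hdr[1], payload[4], status[1];
	struct sg_ring h = { .nents = 1, .sg = hdr };
	struct sg_ring p = { .nents = 4, .sg = payload };
	struct sg_ring s = { .nents = 1, .sg = status };
	struct list_head head;

	list_init(&head);
	list_add_tail(&p.ring, &head);	/* payload from the upper layer */
	list_add(&h.ring, &head);	/* prepend a request header */
	list_add_tail(&s.ring, &head);	/* append a status entry */
	return total_nents(&head);
}
```

Prepending and appending are O(1) pointer operations on the ring; the
payload scatterlist itself is never modified or copied.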

* Re: [RFC PATCH 0/8] virtio: new API for addition of buffers, scatterlist changes
  2013-02-08  4:05   ` Rusty Russell
  (?)
@ 2013-02-08  6:35   ` Paolo Bonzini
  -1 siblings, 0 replies; 40+ messages in thread
From: Paolo Bonzini @ 2013-02-08  6:35 UTC (permalink / raw)
  To: Rusty Russell; +Cc: linux-kernel, Jens Axboe, kvm, mst, virtualization

On 08/02/2013 05:05, Rusty Russell wrote:
> Paolo Bonzini <pbonzini@redhat.com> writes:
>> The virtqueue_add_buf function has two limitations:
>>
>> 1) it requires the caller to provide all the buffers in a single call;
>>
>> 2) it does not support chained scatterlists: the buffers must be
>> provided as an array of struct scatterlist.
>>
>> Because of these limitations, virtio-scsi has to copy each request into
>> a scatterlist internal to the driver.  It cannot just use the one that
>> was prepared by the upper SCSI layers.
> 
> Hi Paolo,
> 
>         Note that you've defined your problem in terms of your solution
> here.  For clarity:

Good catch. :)

> The problem: we want to prepend and append to a scatterlist.  We can't
>         append, because the chained scatterlist implementation requires
>         an element to be appended to join two scatterlists together.
> 
> The solution: fix scatterlists by introducing struct sg_ring:
>         struct sg_ring {
>                 struct list_head ring;
>                 unsigned int nents;
>                 unsigned int orig_nents; /* do we want to replace sg_table? */
>                 struct scatterlist *sg;
>         };
> 
> The workaround: make virtio accept multiple scatterlists for a single
>         buffer.
> 
> There's nothing wrong with your workaround, but if other subsystems have
> the same problem we do, perhaps we should consider a broader solution?

Do they?  Given the resistance you have had on the topic, perhaps they
don't (though I agree that chained scatterlists are horrible).

But I'll add a note on this to the commit message and why the workaround
is IMHO acceptable.

Paolo

^ permalink raw reply	[flat|nested] 40+ messages in thread
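[Editorial note: the chaining limitation Rusty and Paolo refer to - that
joining two scatterlists consumes an element - can be shown with a toy
model. This is not the kernel's real struct scatterlist layout; the field
name, SG_CHAIN value, and walk below are simplified for illustration. The
real implementation encodes the chain pointer in sg_page_link's low bits.]

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/* Toy model: the low bit of page_link marks an entry as a chain
 * pointer to another scatterlist instead of a data entry. */
struct scatterlist {
	uintptr_t page_link;	/* page pointer, or next list | SG_CHAIN */
	unsigned int length;
};

#define SG_CHAIN 0x1UL

/* Joining two lists overwrites the LAST entry of the first list -
 * which is why you cannot append unless a spare trailing entry was
 * reserved up front when the first list was allocated. */
static void sg_chain(struct scatterlist *prv, unsigned int prv_nents,
		     struct scatterlist *nxt)
{
	prv[prv_nents - 1].page_link = (uintptr_t)nxt | SG_CHAIN;
	prv[prv_nents - 1].length = 0;
}

static struct scatterlist *sg_next(struct scatterlist *sg)
{
	if (sg->page_link & SG_CHAIN)
		return (struct scatterlist *)(sg->page_link & ~SG_CHAIN);
	return sg + 1;
}

unsigned int demo(void)
{
	/* 3 entries allocated, but one is sacrificed as the chain link,
	 * so only 2 carry data. */
	struct scatterlist first[3] = { {0, 512}, {0, 512}, {0, 0} };
	struct scatterlist second[2] = { {0, 4096}, {0, 4096} };
	unsigned int total = 0, i;
	struct scatterlist *sg = first;

	sg_chain(first, 3, second);
	for (i = 0; i < 4; i++) {	/* walk the 4 data entries */
		if (sg->page_link & SG_CHAIN)
			sg = sg_next(sg);	/* hop over the link entry */
		total += sg->length;
		sg = sg_next(sg);
	}
	return total;	/* 512 + 512 + 4096 + 4096 */
}
```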

* Re: [RFC PATCH 0/8] virtio: new API for addition of buffers, scatterlist changes
  2013-02-08  4:05   ` Rusty Russell
                     ` (3 preceding siblings ...)
  (?)
@ 2013-02-08 11:52   ` Jens Axboe
  2013-02-13  9:46       ` Rusty Russell
  -1 siblings, 1 reply; 40+ messages in thread
From: Jens Axboe @ 2013-02-08 11:52 UTC (permalink / raw)
  To: Rusty Russell
  Cc: Paolo Bonzini, linux-kernel, Wanlong Gao, asias, mst, kvm,
	virtualization

On Fri, Feb 08 2013, Rusty Russell wrote:
> Paolo Bonzini <pbonzini@redhat.com> writes:
> > The virtqueue_add_buf function has two limitations:
> >
> > 1) it requires the caller to provide all the buffers in a single call;
> >
> > 2) it does not support chained scatterlists: the buffers must be
> > provided as an array of struct scatterlist.
> >
> > Because of these limitations, virtio-scsi has to copy each request into
> > a scatterlist internal to the driver.  It cannot just use the one that
> > was prepared by the upper SCSI layers.
> 
> Hi Paolo,
> 
>         Note that you've defined your problem in terms of your solution
> here.  For clarity:
> 
> The problem: we want to prepend and append to a scatterlist.  We can't
>         append, because the chained scatterlist implementation requires
>         an element to be appended to join two scatterlists together.
> 
> The solution: fix scatterlists by introducing struct sg_ring:
>         struct sg_ring {
>                 struct list_head ring;
>                 unsigned int nents;
>                 unsigned int orig_nents; /* do we want to replace sg_table? */
>                 struct scatterlist *sg;
>         };

This would definitely be more flexible than the current chaining.
However:

> The workaround: make virtio accept multiple scatterlists for a single
>         buffer.
> 
> There's nothing wrong with your workaround, but if other subsystems have
> the same problem we do, perhaps we should consider a broader solution?

Do other use cases actually exist? I don't think I've come across this
requirement before, since it was introduced (6 years ago, from a cursory
look at the git logs!).

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFC PATCH 0/8] virtio: new API for addition of buffers, scatterlist changes
  2013-02-08 11:52   ` Jens Axboe
@ 2013-02-13  9:46       ` Rusty Russell
  0 siblings, 0 replies; 40+ messages in thread
From: Rusty Russell @ 2013-02-13  9:46 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Paolo Bonzini, linux-kernel, Wanlong Gao, asias, mst, kvm,
	virtualization

Jens Axboe <axboe@kernel.dk> writes:
> On Fri, Feb 08 2013, Rusty Russell wrote:
>> Paolo Bonzini <pbonzini@redhat.com> writes:
>> > The virtqueue_add_buf function has two limitations:
>> >
>> > 1) it requires the caller to provide all the buffers in a single call;
>> >
>> > 2) it does not support chained scatterlists: the buffers must be
>> > provided as an array of struct scatterlist.
>> >
>> > Because of these limitations, virtio-scsi has to copy each request into
>> > a scatterlist internal to the driver.  It cannot just use the one that
>> > was prepared by the upper SCSI layers.
>> 
>> Hi Paolo,
>> 
>>         Note that you've defined your problem in terms of your solution
>> here.  For clarity:
>> 
>> The problem: we want to prepend and append to a scatterlist.  We can't
>>         append, because the chained scatterlist implementation requires
>>         an element to be appended to join two scatterlists together.
>> 
>> The solution: fix scatterlists by introducing struct sg_ring:
>>         struct sg_ring {
>>                 struct list_head ring;
>>                 unsigned int nents;
>>                 unsigned int orig_nents; /* do we want to replace sg_table? */
>>                 struct scatterlist *sg;
>>         };
>
> This would definitely be more flexible than the current chaining.
> However:
>
>> The workaround: make virtio accept multiple scatterlists for a single
>>         buffer.
>> 
>> There's nothing wrong with your workaround, but if other subsystems have
>> the same problem we do, perhaps we should consider a broader solution?
>
> Do other use cases actually exist? I don't think I've come across this
> requirement before, since it was introduced (6 years ago, from a cursory
> look at the git logs!).

Thanks Jens.

OK, let's not over-solve the problem then, we'll make a virtio-specific
solution.

Paolo, I'll take your patches once you repost.

Thanks,
Rusty.

^ permalink raw reply	[flat|nested] 40+ messages in thread

end of thread, other threads:[~2013-02-13 10:18 UTC | newest]

Thread overview: 40+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-02-07 12:22 [RFC PATCH 0/8] virtio: new API for addition of buffers, scatterlist changes Paolo Bonzini
2013-02-07 12:22 ` [RFC PATCH 1/8] virtio: add functions for piecewise addition of buffers Paolo Bonzini
2013-02-07 12:22 ` [RFC PATCH 2/8] virtio-blk: reorganize virtblk_add_req Paolo Bonzini
2013-02-07 12:22 ` [RFC PATCH 3/8] virtio-blk: use virtqueue_start_buf on bio path Paolo Bonzini
2013-02-07 12:22 ` [RFC PATCH 4/8] virtio-blk: use virtqueue_start_buf on req path Paolo Bonzini
2013-02-07 12:22 ` [RFC PATCH 5/8] scatterlist: introduce sg_unmark_end Paolo Bonzini
2013-02-07 12:35   ` Jens Axboe
2013-02-07 12:22 ` [RFC PATCH 6/8] virtio-net: unmark scatterlist ending after virtqueue_add_buf Paolo Bonzini
2013-02-07 12:22 ` [RFC PATCH 7/8] virtio-scsi: use virtqueue_start_buf Paolo Bonzini
2013-02-07 12:22 ` [RFC PATCH 8/8] virtio: reimplement virtqueue_add_buf using new functions Paolo Bonzini
2013-02-07 13:09 ` [RFC PATCH 0/8] virtio: new API for addition of buffers, scatterlist changes Michael S. Tsirkin
2013-02-07 13:14   ` Paolo Bonzini
2013-02-07 13:23     ` Michael S. Tsirkin
2013-02-07 13:20       ` Paolo Bonzini
2013-02-07 13:31         ` Michael S. Tsirkin
2013-02-07 13:30           ` Paolo Bonzini
2013-02-08  4:05 ` Rusty Russell
2013-02-08  6:35   ` Paolo Bonzini
2013-02-08 11:52   ` Jens Axboe
2013-02-13  9:46     ` Rusty Russell
