linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Paolo Bonzini <pbonzini@redhat.com>
To: linux-kernel@vger.kernel.org
Cc: mst@redhat.com, gaowanlog@cn.fujitsu.com, asias@redhat.com,
	rusty@rustcorp.com.au, kvm@vger.kernel.org,
	virtualization@lists.linux-foundation.org
Subject: [PATCH v2 1/9] virtio: add functions for piecewise addition of buffers
Date: Tue, 12 Feb 2013 19:03:55 +0100	[thread overview]
Message-ID: <1360692235-8978-1-git-send-email-pbonzini@redhat.com> (raw)
In-Reply-To: <1360671815-2135-2-git-send-email-pbonzini@redhat.com>

virtio device drivers translate requests from higher layer in two steps:
a device-specific step in the device driver, and generic preparation
of virtio direct or indirect buffers in virtqueue_add_buf.  Because
virtqueue_add_buf also accepts the outcome of the first step as a single
struct scatterlist, drivers may need to put additional items at the
front or back of the data scatterlists before handing it to virtqueue_add_buf.
Because of this, virtio-scsi has to copy each request into a scatterlist
internal to the driver.  It cannot just use the one that was prepared
by the upper SCSI layers.

On top of this, virtqueue_add_buf also has the limitation of not
supporting chained scatterlists: the buffers must be provided as an
array of struct scatterlist.  Chained scatterlist, though not supported
on all architectures, would help for virtio-scsi where all additional
items are placed at the front.

This patch adds a different set of APIs for adding a buffer to a virtqueue.
The new API lets you pass the buffers piecewise, wrapping multiple calls
to virtqueue_add_sg between virtqueue_start_buf and virtqueue_end_buf.
virtio-scsi can then call virtqueue_add_sg 3/4 times: for the request
header, for the write buffer (if present), for the response header, and
finally for the read buffer (again if present).  It saves the copying
and the related locking.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 drivers/virtio/virtio_ring.c |  211 ++++++++++++++++++++++++++++++++++++++++++
 include/linux/virtio.h       |   14 +++
 2 files changed, 225 insertions(+), 0 deletions(-)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index ffd7e7d..eede8a7 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -101,6 +101,10 @@ struct vring_virtqueue
 	/* Last used index we've seen. */
 	u16 last_used_idx;
 
+	/* State between virtqueue_start_buf and virtqueue_end_buf.  */
+	int head;
+	struct vring_desc *indirect_base, *tail;
+
 	/* How to notify other side. FIXME: commonalize hcalls! */
 	void (*notify)(struct virtqueue *vq);
 
@@ -394,6 +398,213 @@ static void detach_buf(struct vring_virtqueue *vq, unsigned int head)
 	vq->vq.num_free++;
 }
 
+/**
+ * virtqueue_start_buf - start building buffer for the other end
+ * @vq: the struct virtqueue we're talking about.
+ * @data: the token identifying the buffer.
+ * @nents: the number of buffers that will be added
+ * @nsg: the number of sg lists that will be added
+ * @gfp: how to do memory allocations (if necessary).
+ *
+ * Caller must ensure we don't call this with other virtqueue operations
+ * at the same time (except where noted), and that a successful call is
+ * followed by one or more calls to virtqueue_add_sg, and finally a call
+ * to virtqueue_end_buf.
+ *
+ * Returns zero or a negative error (ie. ENOSPC).
+ */
+int virtqueue_start_buf(struct virtqueue *_vq,
+			void *data,
+			unsigned int nents,
+			unsigned int nsg,
+			gfp_t gfp)
+{
+	struct vring_virtqueue *vq = to_vvq(_vq);
+	struct vring_desc *desc = NULL;
+	int head;
+	int ret = -ENOMEM;
+
+	START_USE(vq);
+
+	BUG_ON(data == NULL);
+
+#ifdef DEBUG
+	{
+		ktime_t now = ktime_get();
+
+		/* No kick or get, with .1 second between?  Warn. */
+		if (vq->last_add_time_valid)
+			WARN_ON(ktime_to_ms(ktime_sub(now, vq->last_add_time))
+					    > 100);
+		vq->last_add_time = now;
+		vq->last_add_time_valid = true;
+	}
+#endif
+
+	BUG_ON(nents < nsg);
+	BUG_ON(nsg == 0);
+
+	/*
+	 * If the host supports indirect descriptor tables, and there are
+	 * multiple buffers, go indirect.  More sophisticated heuristics
+	 * could be used instead.
+	 */
+	head = vq->free_head;
+	if (vq->indirect && nents > 1) {
+		if (vq->vq.num_free == 0)
+			goto no_space;
+
+		desc = kmalloc(nents * sizeof(struct vring_desc), gfp);
+		if (!desc)
+			goto error;
+
+		/* We're about to use a buffer */
+		vq->vq.num_free--;
+
+		/* Use a single buffer which doesn't continue */
+		vq->vring.desc[head].flags = VRING_DESC_F_INDIRECT;
+		vq->vring.desc[head].addr = virt_to_phys(desc);
+		vq->vring.desc[head].len = nents * sizeof(struct vring_desc);
+
+		/* Update free pointer */
+		vq->free_head = vq->vring.desc[head].next;
+	}
+
+	/* Set token. */
+	vq->data[head] = data;
+
+	pr_debug("Started buffer head %i for %p\n", head, vq);
+
+	vq->indirect_base = desc;
+	vq->tail = NULL;
+	vq->head = head;
+	return 0;
+
+no_space:
+	ret = -ENOSPC;
+error:
+	pr_debug("Can't add buf (%d) - nents = %i, avail = %i\n",
+		 ret, nents, vq->vq.num_free);
+	END_USE(vq);
+	return ret;
+}
+EXPORT_SYMBOL_GPL(virtqueue_start_buf);
+
+/**
+ * virtqueue_add_sg - add sglist to buffer being built
+ * @_vq: the virtqueue for which the buffer is being built
+ * @sgl: the description of the buffer(s).
+ * @nents: the number of items to process in sgl
+ * @dir: whether the sgl is read or written (DMA_TO_DEVICE/DMA_FROM_DEVICE only)
+ *
+ * Note that, unlike virtqueue_add_buf, this function follows chained
+ * scatterlists, and stops before the @nents-th item if a scatterlist item
+ * has a marker.
+ *
+ * Caller must ensure we don't call this with other virtqueue operations
+ * at the same time (except where noted).
+ */
+void virtqueue_add_sg(struct virtqueue *_vq,
+		      struct scatterlist sgl[],
+		      unsigned int nents,
+		      enum dma_data_direction dir)
+{
+	struct vring_virtqueue *vq = to_vvq(_vq);
+	unsigned int i, n;
+	struct scatterlist *sg;
+	struct vring_desc *tail;
+	u32 flags;
+
+#ifdef DEBUG
+	BUG_ON(!vq->in_use);
+#endif
+
+	BUG_ON(dir != DMA_FROM_DEVICE && dir != DMA_TO_DEVICE);
+	BUG_ON(nents == 0);
+
+	flags = dir == DMA_FROM_DEVICE ? VRING_DESC_F_WRITE : 0;
+	flags |= VRING_DESC_F_NEXT;
+
+	/*
+	 * If using indirect descriptor tables, fill in the buffers
+	 * at vq->indirect_base.
+	 */
+	if (vq->indirect_base) {
+		i = 0;
+		if (likely(vq->tail))
+			i = vq->tail - vq->indirect_base + 1;
+
+		for_each_sg(sgl, sg, nents, n) {
+			tail = &vq->indirect_base[i];
+			tail->flags = flags;
+			tail->addr = sg_phys(sg);
+			tail->len = sg->length;
+			tail->next = ++i;
+		}
+	} else {
+		BUG_ON(vq->vq.num_free < nents);
+
+		i = vq->free_head;
+		for_each_sg(sgl, sg, nents, n) {
+			tail = &vq->vring.desc[i];
+			tail->flags = flags;
+			tail->addr = sg_phys(sg);
+			tail->len = sg->length;
+			i = tail->next;
+			vq->vq.num_free--;
+		}
+
+		vq->free_head = i;
+	}
+	vq->tail = tail;
+}
+EXPORT_SYMBOL_GPL(virtqueue_add_sg);
+
+/**
+ * virtqueue_end_buf - expose buffer to other end
+ * @_vq: the virtqueue for which the buffer was built
+ *
+ * Caller must ensure we don't call this with other virtqueue operations
+ * at the same time (except where noted).
+ */
+void virtqueue_end_buf(struct virtqueue *_vq)
+{
+	struct vring_virtqueue *vq = to_vvq(_vq);
+	unsigned int avail;
+	int head = vq->head;
+	struct vring_desc *tail = vq->tail;
+
+#ifdef DEBUG
+	BUG_ON(!vq->in_use);
+#endif
+	BUG_ON(tail == NULL);
+
+	/* The last one does not have the next flag set.  */
+	tail->flags &= ~VRING_DESC_F_NEXT;
+
+	/*
+	 * Put entry in available array.  Descriptors and available array
+	 * need to be set before we expose the new available array entries.
+	 */
+	avail = vq->vring.avail->idx & (vq->vring.num-1);
+	vq->vring.avail->ring[avail] = head;
+	virtio_wmb(vq);
+
+	vq->vring.avail->idx++;
+	vq->num_added++;
+
+	/*
+	 * This is very unlikely, but theoretically possible.  Kick
+	 * just in case.
+	 */
+	if (unlikely(vq->num_added == (1 << 16) - 1))
+		virtqueue_kick(&vq->vq);
+
+	pr_debug("Added buffer head %i to %p\n", head, vq);
+	END_USE(vq);
+}
+EXPORT_SYMBOL_GPL(virtqueue_end_buf);
+
 static inline bool more_used(const struct vring_virtqueue *vq)
 {
 	return vq->last_used_idx != vq->vring.used->idx;
diff --git a/include/linux/virtio.h b/include/linux/virtio.h
index cf8adb1..b5c2c12 100644
--- a/include/linux/virtio.h
+++ b/include/linux/virtio.h
@@ -7,6 +7,7 @@
 #include <linux/spinlock.h>
 #include <linux/device.h>
 #include <linux/mod_devicetable.h>
+#include <linux/dma-direction.h>
 #include <linux/gfp.h>
 
 /**
@@ -40,6 +41,19 @@ int virtqueue_add_buf(struct virtqueue *vq,
 		      void *data,
 		      gfp_t gfp);
 
+int virtqueue_start_buf(struct virtqueue *_vq,
+			void *data,
+			unsigned int nents,
+			unsigned int nsg,
+			gfp_t gfp);
+
+void virtqueue_add_sg(struct virtqueue *_vq,
+		      struct scatterlist sgl[],
+		      unsigned int nents,
+		      enum dma_data_direction dir);
+
+void virtqueue_end_buf(struct virtqueue *_vq);
+
 void virtqueue_kick(struct virtqueue *vq);
 
 bool virtqueue_kick_prepare(struct virtqueue *vq);
-- 
1.7.1


  parent reply	other threads:[~2013-02-12 18:04 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-02-12 12:23 [PATCH 0/9] virtio: new API for addition of buffers, scatterlist changes Paolo Bonzini
2013-02-12 12:23 ` [PATCH 1/9] virtio: add functions for piecewise addition of buffers Paolo Bonzini
2013-02-12 14:56   ` Michael S. Tsirkin
2013-02-12 15:32     ` Paolo Bonzini
2013-02-12 15:43       ` Michael S. Tsirkin
2013-02-12 15:48         ` Paolo Bonzini
2013-02-12 16:13           ` Michael S. Tsirkin
2013-02-12 16:17             ` Paolo Bonzini
2013-02-12 16:35               ` Michael S. Tsirkin
2013-02-12 16:57                 ` Paolo Bonzini
2013-02-12 17:34                   ` Michael S. Tsirkin
2013-02-12 18:04                     ` Paolo Bonzini
2013-02-12 18:23                       ` Michael S. Tsirkin
2013-02-12 20:08                         ` Paolo Bonzini
2013-02-12 20:49                           ` Michael S. Tsirkin
2013-02-13  8:06                             ` Paolo Bonzini
2013-02-13 10:33                               ` Michael S. Tsirkin
2013-02-12 18:03   ` Paolo Bonzini [this message]
2013-02-12 12:23 ` [PATCH 2/9] virtio-blk: reorganize virtblk_add_req Paolo Bonzini
2013-02-17  6:38   ` Asias He
2013-02-12 12:23 ` [PATCH 3/9] virtio-blk: use virtqueue_start_buf on bio path Paolo Bonzini
2013-02-17  6:39   ` Asias He
2013-02-12 12:23 ` [PATCH 4/9] virtio-blk: use virtqueue_start_buf on req path Paolo Bonzini
2013-02-17  6:37   ` Asias He
2013-02-18  9:05     ` Paolo Bonzini
2013-02-12 12:23 ` [PATCH 5/9] scatterlist: introduce sg_unmark_end Paolo Bonzini
2013-02-12 12:23 ` [PATCH 6/9] virtio-net: unmark scatterlist ending after virtqueue_add_buf Paolo Bonzini
2013-02-12 12:23 ` [PATCH 7/9] virtio-scsi: use virtqueue_start_buf Paolo Bonzini
2013-02-12 12:23 ` [PATCH 8/9] virtio: introduce and use virtqueue_add_buf_single Paolo Bonzini
2013-02-12 12:23 ` [PATCH 9/9] virtio: reimplement virtqueue_add_buf using new functions Paolo Bonzini
2013-02-14  6:00 ` [PATCH 0/9] virtio: new API for addition of buffers, scatterlist changes Rusty Russell
2013-02-14  9:23   ` Paolo Bonzini
2013-02-15 18:04     ` Paolo Bonzini
2013-02-19  7:49     ` Rusty Russell
2013-02-19  9:11       ` Paolo Bonzini

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1360692235-8978-1-git-send-email-pbonzini@redhat.com \
    --to=pbonzini@redhat.com \
    --cc=asias@redhat.com \
    --cc=gaowanlog@cn.fujitsu.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mst@redhat.com \
    --cc=rusty@rustcorp.com.au \
    --cc=virtualization@lists.linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).