linux-kernel.vger.kernel.org archive mirror
* [PATCH RESEND 0/5] Add vhost-blk support
@ 2012-07-13  8:55 Asias He
  2012-07-13  8:55 ` [PATCH RESEND 1/5] aio: Export symbols and struct kiocb_batch for in kernel aio usage Asias He
                   ` (7 more replies)
  0 siblings, 8 replies; 22+ messages in thread
From: Asias He @ 2012-07-13  8:55 UTC (permalink / raw)
  To: linux-kernel
  Cc: Alexander Viro, Benjamin LaHaise, James Bottomley, Jeff Moyer,
	kvm, linux-aio, linux-fsdevel, Michael S. Tsirkin,
	virtualization


Hi folks,

[I am resending to fix the broken thread in the previous one.]

This patchset adds vhost-blk support. vhost-blk is an in-kernel virtio-blk
device accelerator. Compared to the userspace virtio-blk implementation,
vhost-blk gives about a 5% to 15% performance improvement.

Asias He (5):
  aio: Export symbols and struct kiocb_batch for in kernel aio usage
  eventfd: Export symbol eventfd_file_create()
  vhost: Make vhost a separate module
  vhost-net: Use VHOST_NET_FEATURES for vhost-net
  vhost-blk: Add vhost-blk support

 drivers/vhost/Kconfig  |   20 +-
 drivers/vhost/Makefile |    6 +-
 drivers/vhost/blk.c    |  600 ++++++++++++++++++++++++++++++++++++++++++++++++
 drivers/vhost/net.c    |    4 +-
 drivers/vhost/test.c   |    4 +-
 drivers/vhost/vhost.c  |   48 ++++
 drivers/vhost/vhost.h  |   18 +-
 fs/aio.c               |   37 ++-
 fs/eventfd.c           |    1 +
 include/linux/aio.h    |   21 ++
 include/linux/vhost.h  |    3 +
 11 files changed, 731 insertions(+), 31 deletions(-)
 create mode 100644 drivers/vhost/blk.c

-- 
1.7.10.4


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH RESEND 1/5] aio: Export symbols and struct kiocb_batch for in kernel aio usage
  2012-07-13  8:55 [PATCH RESEND 0/5] Add vhost-blk support Asias He
@ 2012-07-13  8:55 ` Asias He
  2012-07-13  8:55 ` [PATCH RESEND 2/5] eventfd: Export symbol eventfd_file_create() Asias He
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 22+ messages in thread
From: Asias He @ 2012-07-13  8:55 UTC (permalink / raw)
  To: linux-kernel
  Cc: Benjamin LaHaise, Alexander Viro, Jeff Moyer, James Bottomley,
	Michael S. Tsirkin, linux-aio, linux-fsdevel, kvm,
	virtualization

This is useful for people who want to use aio from within the kernel, e.g. vhost-blk.
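
To illustrate, a rough sketch of what an in-kernel consumer of these newly
exported symbols might look like (kernel code, not compilable standalone;
the surrounding function and the `file`/`offset` variables are hypothetical):

```c
/* Sketch: submit one in-kernel aio read using the exported API.
 * Error handling abbreviated; see vhost_blk_io_submit() in patch 5/5
 * for the real usage this export enables. */
struct kioctx *ctx = ioctx_alloc(128);	/* room for 128 in-flight events */
struct kiocb_batch batch;
struct kiocb *iocb;

if (IS_ERR(ctx))
	return PTR_ERR(ctx);

kiocb_batch_init(&batch, 1);
iocb = aio_get_req(ctx, &batch);	/* may return NULL under pressure */
if (iocb) {
	iocb->ki_filp	= file;		/* caller-supplied struct file */
	iocb->ki_pos	= offset;
	iocb->ki_opcode	= IOCB_CMD_PREAD;
	if (!aio_setup_iocb(iocb, false))
		aio_run_iocb(iocb);	/* caller takes ctx->ctx_lock */
	aio_put_req(iocb);
}
kiocb_batch_free(ctx, &batch);
put_ioctx(ctx);
```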

Cc: Benjamin LaHaise <bcrl@kvack.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: linux-aio@kvack.org
Cc: linux-fsdevel@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: kvm@vger.kernel.org
Cc: virtualization@lists.linux-foundation.org
Signed-off-by: Asias He <asias@redhat.com>
---
 fs/aio.c            |   37 ++++++++++++++++++-------------------
 include/linux/aio.h |   21 +++++++++++++++++++++
 2 files changed, 39 insertions(+), 19 deletions(-)

diff --git a/fs/aio.c b/fs/aio.c
index 55c4c76..93dfbdd 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -224,22 +224,24 @@ static void __put_ioctx(struct kioctx *ctx)
 	call_rcu(&ctx->rcu_head, ctx_rcu_free);
 }
 
-static inline int try_get_ioctx(struct kioctx *kioctx)
+inline int try_get_ioctx(struct kioctx *kioctx)
 {
 	return atomic_inc_not_zero(&kioctx->users);
 }
+EXPORT_SYMBOL(try_get_ioctx);
 
-static inline void put_ioctx(struct kioctx *kioctx)
+inline void put_ioctx(struct kioctx *kioctx)
 {
 	BUG_ON(atomic_read(&kioctx->users) <= 0);
 	if (unlikely(atomic_dec_and_test(&kioctx->users)))
 		__put_ioctx(kioctx);
 }
+EXPORT_SYMBOL(put_ioctx);
 
 /* ioctx_alloc
  *	Allocates and initializes an ioctx.  Returns an ERR_PTR if it failed.
  */
-static struct kioctx *ioctx_alloc(unsigned nr_events)
+struct kioctx *ioctx_alloc(unsigned nr_events)
 {
 	struct mm_struct *mm;
 	struct kioctx *ctx;
@@ -303,6 +305,7 @@ out_freectx:
 	dprintk("aio: error allocating ioctx %d\n", err);
 	return ERR_PTR(err);
 }
+EXPORT_SYMBOL(ioctx_alloc);
 
 /* kill_ctx
  *	Cancels all outstanding aio requests on an aio context.  Used 
@@ -436,23 +439,14 @@ static struct kiocb *__aio_get_req(struct kioctx *ctx)
 	return req;
 }
 
-/*
- * struct kiocb's are allocated in batches to reduce the number of
- * times the ctx lock is acquired and released.
- */
-#define KIOCB_BATCH_SIZE	32L
-struct kiocb_batch {
-	struct list_head head;
-	long count; /* number of requests left to allocate */
-};
-
-static void kiocb_batch_init(struct kiocb_batch *batch, long total)
+void kiocb_batch_init(struct kiocb_batch *batch, long total)
 {
 	INIT_LIST_HEAD(&batch->head);
 	batch->count = total;
 }
+EXPORT_SYMBOL(kiocb_batch_init);
 
-static void kiocb_batch_free(struct kioctx *ctx, struct kiocb_batch *batch)
+void kiocb_batch_free(struct kioctx *ctx, struct kiocb_batch *batch)
 {
 	struct kiocb *req, *n;
 
@@ -470,6 +464,7 @@ static void kiocb_batch_free(struct kioctx *ctx, struct kiocb_batch *batch)
 		wake_up_all(&ctx->wait);
 	spin_unlock_irq(&ctx->ctx_lock);
 }
+EXPORT_SYMBOL(kiocb_batch_free);
 
 /*
  * Allocate a batch of kiocbs.  This avoids taking and dropping the
@@ -540,7 +535,7 @@ out:
 	return allocated;
 }
 
-static inline struct kiocb *aio_get_req(struct kioctx *ctx,
+inline struct kiocb *aio_get_req(struct kioctx *ctx,
 					struct kiocb_batch *batch)
 {
 	struct kiocb *req;
@@ -552,6 +547,7 @@ static inline struct kiocb *aio_get_req(struct kioctx *ctx,
 	list_del(&req->ki_batch);
 	return req;
 }
+EXPORT_SYMBOL(aio_get_req);
 
 static inline void really_put_req(struct kioctx *ctx, struct kiocb *req)
 {
@@ -721,7 +717,7 @@ static inline int __queue_kicked_iocb(struct kiocb *iocb)
  * simplifies the coding of individual aio operations as
  * it avoids various potential races.
  */
-static ssize_t aio_run_iocb(struct kiocb *iocb)
+ssize_t aio_run_iocb(struct kiocb *iocb)
 {
 	struct kioctx	*ctx = iocb->ki_ctx;
 	ssize_t (*retry)(struct kiocb *);
@@ -815,6 +811,7 @@ out:
 	}
 	return ret;
 }
+EXPORT_SYMBOL(aio_run_iocb);
 
 /*
  * __aio_run_iocbs:
@@ -1136,7 +1133,7 @@ static inline void clear_timeout(struct aio_timeout *to)
 	del_singleshot_timer_sync(&to->timer);
 }
 
-static int read_events(struct kioctx *ctx,
+int read_events(struct kioctx *ctx,
 			long min_nr, long nr,
 			struct io_event __user *event,
 			struct timespec __user *timeout)
@@ -1252,6 +1249,7 @@ out:
 	destroy_timer_on_stack(&to.timer);
 	return i ? i : ret;
 }
+EXPORT_SYMBOL(read_events);
 
 /* Take an ioctx and remove it from the list of ioctx's.  Protects 
  * against races with itself via ->dead.
@@ -1492,7 +1490,7 @@ static ssize_t aio_setup_single_vector(int type, struct file * file, struct kioc
  *	Performs the initial checks and aio retry method
  *	setup for the kiocb at the time of io submission.
  */
-static ssize_t aio_setup_iocb(struct kiocb *kiocb, bool compat)
+ssize_t aio_setup_iocb(struct kiocb *kiocb, bool compat)
 {
 	struct file *file = kiocb->ki_filp;
 	ssize_t ret = 0;
@@ -1570,6 +1568,7 @@ static ssize_t aio_setup_iocb(struct kiocb *kiocb, bool compat)
 
 	return 0;
 }
+EXPORT_SYMBOL(aio_setup_iocb);
 
 static int io_submit_one(struct kioctx *ctx, struct iocb __user *user_iocb,
 			 struct iocb *iocb, struct kiocb_batch *batch,
diff --git a/include/linux/aio.h b/include/linux/aio.h
index b1a520e..4731da5 100644
--- a/include/linux/aio.h
+++ b/include/linux/aio.h
@@ -126,6 +126,16 @@ struct kiocb {
 	struct eventfd_ctx	*ki_eventfd;
 };
 
+/*
+ * struct kiocb's are allocated in batches to reduce the number of
+ * times the ctx lock is acquired and released.
+ */
+#define KIOCB_BATCH_SIZE	32L
+struct kiocb_batch {
+	struct list_head head;
+	long count; /* number of requests left to allocate */
+};
+
 #define is_sync_kiocb(iocb)	((iocb)->ki_key == KIOCB_SYNC_KEY)
 #define init_sync_kiocb(x, filp)			\
 	do {						\
@@ -216,6 +226,17 @@ struct mm_struct;
 extern void exit_aio(struct mm_struct *mm);
 extern long do_io_submit(aio_context_t ctx_id, long nr,
 			 struct iocb __user *__user *iocbpp, bool compat);
+extern struct kioctx *ioctx_alloc(unsigned nr_events);
+extern ssize_t aio_run_iocb(struct kiocb *iocb);
+extern int read_events(struct kioctx *ctx, long min_nr, long nr,
+		       struct io_event __user *event,
+		       struct timespec __user *timeout);
+extern ssize_t aio_setup_iocb(struct kiocb *kiocb, bool compat);
+extern void kiocb_batch_init(struct kiocb_batch *batch, long total);
+extern void kiocb_batch_free(struct kioctx *ctx, struct kiocb_batch *batch);
+extern struct kiocb *aio_get_req(struct kioctx *ctx, struct kiocb_batch *batch);
+extern int try_get_ioctx(struct kioctx *kioctx);
+extern void put_ioctx(struct kioctx *kioctx);
 #else
 static inline ssize_t wait_on_sync_kiocb(struct kiocb *iocb) { return 0; }
 static inline int aio_put_req(struct kiocb *iocb) { return 0; }
-- 
1.7.10.4



* [PATCH RESEND 2/5] eventfd: Export symbol eventfd_file_create()
  2012-07-13  8:55 [PATCH RESEND 0/5] Add vhost-blk support Asias He
  2012-07-13  8:55 ` [PATCH RESEND 1/5] aio: Export symbols and struct kiocb_batch for in kernel aio usage Asias He
@ 2012-07-13  8:55 ` Asias He
  2012-07-13  8:55 ` [PATCH RESEND 3/5] vhost: Make vhost a separate module Asias He
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 22+ messages in thread
From: Asias He @ 2012-07-13  8:55 UTC (permalink / raw)
  To: linux-kernel
  Cc: Alexander Viro, Jeff Moyer, Michael S. Tsirkin, linux-fsdevel,
	kvm, virtualization

This is useful for people who want to create an eventfd from within the
kernel, e.g. vhost-blk.
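
A minimal kernel-side sketch of what this export enables (variable names
hypothetical; not compilable standalone):

```c
/* Sketch: create an eventfd from kernel code and signal it. */
struct file *efile = eventfd_file_create(0, 0);
struct eventfd_ctx *ectx;

if (IS_ERR(efile))
	return PTR_ERR(efile);
ectx = eventfd_ctx_fileget(efile);	/* ctx later passed to eventfd_signal() */
eventfd_signal(ectx, 1);		/* kick whoever waits on the eventfd */
```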

Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: linux-fsdevel@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: kvm@vger.kernel.org
Cc: virtualization@lists.linux-foundation.org
Signed-off-by: Asias He <asias@redhat.com>
---
 fs/eventfd.c |    1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/eventfd.c b/fs/eventfd.c
index d81b9f6..b288963 100644
--- a/fs/eventfd.c
+++ b/fs/eventfd.c
@@ -402,6 +402,7 @@ struct file *eventfd_file_create(unsigned int count, int flags)
 
 	return file;
 }
+EXPORT_SYMBOL_GPL(eventfd_file_create);
 
 SYSCALL_DEFINE2(eventfd2, unsigned int, count, int, flags)
 {
-- 
1.7.10.4



* [PATCH RESEND 3/5] vhost: Make vhost a separate module
  2012-07-13  8:55 [PATCH RESEND 0/5] Add vhost-blk support Asias He
  2012-07-13  8:55 ` [PATCH RESEND 1/5] aio: Export symbols and struct kiocb_batch for in kernel aio usage Asias He
  2012-07-13  8:55 ` [PATCH RESEND 2/5] eventfd: Export symbol eventfd_file_create() Asias He
@ 2012-07-13  8:55 ` Asias He
  2012-07-13  8:55 ` [PATCH RESEND 4/5] vhost-net: Use VHOST_NET_FEATURES for vhost-net Asias He
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 22+ messages in thread
From: Asias He @ 2012-07-13  8:55 UTC (permalink / raw)
  To: linux-kernel; +Cc: Michael S. Tsirkin, kvm, virtualization

Currently, vhost-net is the only consumer of the vhost infrastructure, so
the vhost infrastructure and the vhost-net driver live in a single module.

Separating them into a vhost.ko module and a vhost-net.ko module makes it
easier to share code with other vhost drivers, e.g. vhost-blk.ko and
tcm-vhost.ko.

Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: linux-kernel@vger.kernel.org
Cc: kvm@vger.kernel.org
Cc: virtualization@lists.linux-foundation.org
Signed-off-by: Asias He <asias@redhat.com>
---
 drivers/vhost/Kconfig  |   10 +++++++++-
 drivers/vhost/Makefile |    4 +++-
 drivers/vhost/vhost.c  |   48 ++++++++++++++++++++++++++++++++++++++++++++++++
 drivers/vhost/vhost.h  |    1 +
 4 files changed, 61 insertions(+), 2 deletions(-)

diff --git a/drivers/vhost/Kconfig b/drivers/vhost/Kconfig
index e4e2fd1..c387067 100644
--- a/drivers/vhost/Kconfig
+++ b/drivers/vhost/Kconfig
@@ -1,6 +1,14 @@
+config VHOST
+	tristate "Host kernel accelerator for virtio (EXPERIMENTAL)"
+	---help---
+	  This kernel module can be loaded in host kernel to accelerate
+	  guest networking and block.
+
+	  To compile this driver as a module, choose M here: the module will
+	  be called vhost.
 config VHOST_NET
 	tristate "Host kernel accelerator for virtio net (EXPERIMENTAL)"
-	depends on NET && EVENTFD && (TUN || !TUN) && (MACVTAP || !MACVTAP) && EXPERIMENTAL
+	depends on VHOST && NET && EVENTFD && (TUN || !TUN) && (MACVTAP || !MACVTAP) && EXPERIMENTAL
 	---help---
 	  This kernel module can be loaded in host kernel to accelerate
 	  guest networking with virtio_net. Not to be confused with virtio_net
diff --git a/drivers/vhost/Makefile b/drivers/vhost/Makefile
index 72dd020..cd36885 100644
--- a/drivers/vhost/Makefile
+++ b/drivers/vhost/Makefile
@@ -1,2 +1,4 @@
+obj-$(CONFIG_VHOST)	+= vhost.o
 obj-$(CONFIG_VHOST_NET) += vhost_net.o
-vhost_net-y := vhost.o net.o
+
+vhost_net-y		:= net.o
diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 112156f..6e9f586 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -25,6 +25,7 @@
 #include <linux/slab.h>
 #include <linux/kthread.h>
 #include <linux/cgroup.h>
+#include <linux/module.h>
 
 #include <linux/net.h>
 #include <linux/if_packet.h>
@@ -84,6 +85,7 @@ void vhost_poll_init(struct vhost_poll *poll, vhost_work_fn_t fn,
 
 	vhost_work_init(&poll->work, fn);
 }
+EXPORT_SYMBOL_GPL(vhost_poll_init);
 
 /* Start polling a file. We add ourselves to file's wait queue. The caller must
  * keep a reference to a file until after vhost_poll_stop is called. */
@@ -95,6 +97,7 @@ void vhost_poll_start(struct vhost_poll *poll, struct file *file)
 	if (mask)
 		vhost_poll_wakeup(&poll->wait, 0, 0, (void *)mask);
 }
+EXPORT_SYMBOL_GPL(vhost_poll_start);
 
 /* Stop polling a file. After this function returns, it becomes safe to drop the
  * file reference. You must also flush afterwards. */
@@ -102,6 +105,7 @@ void vhost_poll_stop(struct vhost_poll *poll)
 {
 	remove_wait_queue(poll->wqh, &poll->wait);
 }
+EXPORT_SYMBOL_GPL(vhost_poll_stop);
 
 static bool vhost_work_seq_done(struct vhost_dev *dev, struct vhost_work *work,
 				unsigned seq)
@@ -136,6 +140,7 @@ void vhost_poll_flush(struct vhost_poll *poll)
 {
 	vhost_work_flush(poll->dev, &poll->work);
 }
+EXPORT_SYMBOL_GPL(vhost_poll_flush);
 
 static inline void vhost_work_queue(struct vhost_dev *dev,
 				    struct vhost_work *work)
@@ -155,6 +160,7 @@ void vhost_poll_queue(struct vhost_poll *poll)
 {
 	vhost_work_queue(poll->dev, &poll->work);
 }
+EXPORT_SYMBOL_GPL(vhost_poll_queue);
 
 static void vhost_vq_reset(struct vhost_dev *dev,
 			   struct vhost_virtqueue *vq)
@@ -251,6 +257,7 @@ void vhost_enable_zcopy(int vq)
 {
 	vhost_zcopy_mask |= 0x1 << vq;
 }
+EXPORT_SYMBOL_GPL(vhost_enable_zcopy);
 
 /* Helper to allocate iovec buffers for all vqs. */
 static long vhost_dev_alloc_iovecs(struct vhost_dev *dev)
@@ -322,6 +329,7 @@ long vhost_dev_init(struct vhost_dev *dev,
 
 	return 0;
 }
+EXPORT_SYMBOL_GPL(vhost_dev_init);
 
 /* Caller should have device mutex */
 long vhost_dev_check_owner(struct vhost_dev *dev)
@@ -329,6 +337,7 @@ long vhost_dev_check_owner(struct vhost_dev *dev)
 	/* Are you the owner? If not, I don't think you mean to do that */
 	return dev->mm == current->mm ? 0 : -EPERM;
 }
+EXPORT_SYMBOL_GPL(vhost_dev_check_owner);
 
 struct vhost_attach_cgroups_struct {
 	struct vhost_work work;
@@ -414,6 +423,7 @@ long vhost_dev_reset_owner(struct vhost_dev *dev)
 	RCU_INIT_POINTER(dev->memory, memory);
 	return 0;
 }
+EXPORT_SYMBOL_GPL(vhost_dev_reset_owner);
 
 /* In case of DMA done not in order in lower device driver for some reason.
  * upend_idx is used to track end of used idx, done_idx is used to track head
@@ -438,6 +448,7 @@ int vhost_zerocopy_signal_used(struct vhost_virtqueue *vq)
 		vq->done_idx = i;
 	return j;
 }
+EXPORT_SYMBOL_GPL(vhost_zerocopy_signal_used);
 
 /* Caller should have device mutex if and only if locked is set */
 void vhost_dev_cleanup(struct vhost_dev *dev, bool locked)
@@ -489,6 +500,7 @@ void vhost_dev_cleanup(struct vhost_dev *dev, bool locked)
 		mmput(dev->mm);
 	dev->mm = NULL;
 }
+EXPORT_SYMBOL_GPL(vhost_dev_cleanup);
 
 static int log_access_ok(void __user *log_base, u64 addr, unsigned long sz)
 {
@@ -574,6 +586,7 @@ int vhost_log_access_ok(struct vhost_dev *dev)
 				       lockdep_is_held(&dev->mutex));
 	return memory_access_ok(dev, mp, 1);
 }
+EXPORT_SYMBOL_GPL(vhost_log_access_ok);
 
 /* Verify access for write logging. */
 /* Caller should have vq mutex and device mutex */
@@ -599,6 +612,7 @@ int vhost_vq_access_ok(struct vhost_virtqueue *vq)
 	return vq_access_ok(vq->dev, vq->num, vq->desc, vq->avail, vq->used) &&
 		vq_log_access_ok(vq->dev, vq, vq->log_base);
 }
+EXPORT_SYMBOL_GPL(vhost_vq_access_ok);
 
 static long vhost_set_memory(struct vhost_dev *d, struct vhost_memory __user *m)
 {
@@ -909,6 +923,7 @@ long vhost_dev_ioctl(struct vhost_dev *d, unsigned int ioctl, unsigned long arg)
 done:
 	return r;
 }
+EXPORT_SYMBOL_GPL(vhost_dev_ioctl);
 
 static const struct vhost_memory_region *find_region(struct vhost_memory *mem,
 						     __u64 addr, __u32 len)
@@ -1000,6 +1015,7 @@ int vhost_log_write(struct vhost_virtqueue *vq, struct vhost_log *log,
 	BUG();
 	return 0;
 }
+EXPORT_SYMBOL_GPL(vhost_log_write);
 
 static int vhost_update_used_flags(struct vhost_virtqueue *vq)
 {
@@ -1051,6 +1067,7 @@ int vhost_init_used(struct vhost_virtqueue *vq)
 	vq->signalled_used_valid = false;
 	return get_user(vq->last_used_idx, &vq->used->idx);
 }
+EXPORT_SYMBOL_GPL(vhost_init_used);
 
 static int translate_desc(struct vhost_dev *dev, u64 addr, u32 len,
 			  struct iovec iov[], int iov_size)
@@ -1327,12 +1344,14 @@ int vhost_get_vq_desc(struct vhost_dev *dev, struct vhost_virtqueue *vq,
 	BUG_ON(!(vq->used_flags & VRING_USED_F_NO_NOTIFY));
 	return head;
 }
+EXPORT_SYMBOL_GPL(vhost_get_vq_desc);
 
 /* Reverse the effect of vhost_get_vq_desc. Useful for error handling. */
 void vhost_discard_vq_desc(struct vhost_virtqueue *vq, int n)
 {
 	vq->last_avail_idx -= n;
 }
+EXPORT_SYMBOL_GPL(vhost_discard_vq_desc);
 
 /* After we've used one of their buffers, we tell them about it.  We'll then
  * want to notify the guest, using eventfd. */
@@ -1381,6 +1400,7 @@ int vhost_add_used(struct vhost_virtqueue *vq, unsigned int head, int len)
 		vq->signalled_used_valid = false;
 	return 0;
 }
+EXPORT_SYMBOL_GPL(vhost_add_used);
 
 static int __vhost_add_used_n(struct vhost_virtqueue *vq,
 			    struct vring_used_elem *heads,
@@ -1450,6 +1470,7 @@ int vhost_add_used_n(struct vhost_virtqueue *vq, struct vring_used_elem *heads,
 	}
 	return r;
 }
+EXPORT_SYMBOL_GPL(vhost_add_used_n);
 
 static bool vhost_notify(struct vhost_dev *dev, struct vhost_virtqueue *vq)
 {
@@ -1494,6 +1515,7 @@ void vhost_signal(struct vhost_dev *dev, struct vhost_virtqueue *vq)
 	if (vq->call_ctx && vhost_notify(dev, vq))
 		eventfd_signal(vq->call_ctx, 1);
 }
+EXPORT_SYMBOL_GPL(vhost_signal);
 
 /* And here's the combo meal deal.  Supersize me! */
 void vhost_add_used_and_signal(struct vhost_dev *dev,
@@ -1503,6 +1525,7 @@ void vhost_add_used_and_signal(struct vhost_dev *dev,
 	vhost_add_used(vq, head, len);
 	vhost_signal(dev, vq);
 }
+EXPORT_SYMBOL_GPL(vhost_add_used_and_signal);
 
 /* multi-buffer version of vhost_add_used_and_signal */
 void vhost_add_used_and_signal_n(struct vhost_dev *dev,
@@ -1512,6 +1535,7 @@ void vhost_add_used_and_signal_n(struct vhost_dev *dev,
 	vhost_add_used_n(vq, heads, count);
 	vhost_signal(dev, vq);
 }
+EXPORT_SYMBOL_GPL(vhost_add_used_and_signal_n);
 
 /* OK, now we need to know about added descriptors. */
 bool vhost_enable_notify(struct vhost_dev *dev, struct vhost_virtqueue *vq)
@@ -1549,6 +1573,7 @@ bool vhost_enable_notify(struct vhost_dev *dev, struct vhost_virtqueue *vq)
 
 	return avail_idx != vq->avail_idx;
 }
+EXPORT_SYMBOL_GPL(vhost_enable_notify);
 
 /* We don't need to be notified again. */
 void vhost_disable_notify(struct vhost_dev *dev, struct vhost_virtqueue *vq)
@@ -1565,6 +1590,7 @@ void vhost_disable_notify(struct vhost_dev *dev, struct vhost_virtqueue *vq)
 			       &vq->used->flags, r);
 	}
 }
+EXPORT_SYMBOL_GPL(vhost_disable_notify);
 
 static void vhost_zerocopy_done_signal(struct kref *kref)
 {
@@ -1588,11 +1614,13 @@ struct vhost_ubuf_ref *vhost_ubuf_alloc(struct vhost_virtqueue *vq,
 	ubufs->vq = vq;
 	return ubufs;
 }
+EXPORT_SYMBOL_GPL(vhost_ubuf_alloc);
 
 void vhost_ubuf_put(struct vhost_ubuf_ref *ubufs)
 {
 	kref_put(&ubufs->kref, vhost_zerocopy_done_signal);
 }
+EXPORT_SYMBOL_GPL(vhost_ubuf_put);
 
 void vhost_ubuf_put_and_wait(struct vhost_ubuf_ref *ubufs)
 {
@@ -1600,6 +1628,7 @@ void vhost_ubuf_put_and_wait(struct vhost_ubuf_ref *ubufs)
 	wait_event(ubufs->wait, !atomic_read(&ubufs->kref.refcount));
 	kfree(ubufs);
 }
+EXPORT_SYMBOL_GPL(vhost_ubuf_put_and_wait);
 
 void vhost_zerocopy_callback(struct ubuf_info *ubuf)
 {
@@ -1611,3 +1640,22 @@ void vhost_zerocopy_callback(struct ubuf_info *ubuf)
 	vq->heads[ubuf->desc].len = VHOST_DMA_DONE_LEN;
 	kref_put(&ubufs->kref, vhost_zerocopy_done_signal);
 }
+EXPORT_SYMBOL_GPL(vhost_zerocopy_callback);
+
+static int __init vhost_init(void)
+{
+	return 0;
+}
+
+static void __exit vhost_exit(void)
+{
+	return;
+}
+
+module_init(vhost_init);
+module_exit(vhost_exit);
+
+MODULE_VERSION("0.0.1");
+MODULE_LICENSE("GPL v2");
+MODULE_AUTHOR("Michael S. Tsirkin");
+MODULE_DESCRIPTION("Host kernel accelerator for virtio");
diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
index 8de1fd5..c5c7fb0 100644
--- a/drivers/vhost/vhost.h
+++ b/drivers/vhost/vhost.h
@@ -12,6 +12,7 @@
 #include <linux/virtio_config.h>
 #include <linux/virtio_ring.h>
 #include <linux/atomic.h>
+#include <linux/virtio_net.h>
 
 /* This is for zerocopy, used buffer len is set to 1 when lower device DMA
  * done */
-- 
1.7.10.4



* [PATCH RESEND 4/5] vhost-net: Use VHOST_NET_FEATURES for vhost-net
  2012-07-13  8:55 [PATCH RESEND 0/5] Add vhost-blk support Asias He
                   ` (2 preceding siblings ...)
  2012-07-13  8:55 ` [PATCH RESEND 3/5] vhost: Make vhost a separate module Asias He
@ 2012-07-13  8:55 ` Asias He
  2012-07-13  8:55 ` [PATCH RESEND 5/5] vhost-blk: Add vhost-blk support Asias He
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 22+ messages in thread
From: Asias He @ 2012-07-13  8:55 UTC (permalink / raw)
  To: linux-kernel; +Cc: Michael S. Tsirkin, kvm, virtualization

vhost-net's features do not deserve the name VHOST_FEATURES. Use
VHOST_NET_FEATURES instead.

Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: linux-kernel@vger.kernel.org
Cc: kvm@vger.kernel.org
Cc: virtualization@lists.linux-foundation.org
Signed-off-by: Asias He <asias@redhat.com>
---
 drivers/vhost/net.c   |    4 ++--
 drivers/vhost/test.c  |    4 ++--
 drivers/vhost/vhost.h |   12 ++++++------
 3 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index f82a739..072cbba 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -823,14 +823,14 @@ static long vhost_net_ioctl(struct file *f, unsigned int ioctl,
 			return -EFAULT;
 		return vhost_net_set_backend(n, backend.index, backend.fd);
 	case VHOST_GET_FEATURES:
-		features = VHOST_FEATURES;
+		features = VHOST_NET_FEATURES;
 		if (copy_to_user(featurep, &features, sizeof features))
 			return -EFAULT;
 		return 0;
 	case VHOST_SET_FEATURES:
 		if (copy_from_user(&features, featurep, sizeof features))
 			return -EFAULT;
-		if (features & ~VHOST_FEATURES)
+		if (features & ~VHOST_NET_FEATURES)
 			return -EOPNOTSUPP;
 		return vhost_net_set_features(n, features);
 	case VHOST_RESET_OWNER:
diff --git a/drivers/vhost/test.c b/drivers/vhost/test.c
index 3de00d9..91d6f06 100644
--- a/drivers/vhost/test.c
+++ b/drivers/vhost/test.c
@@ -261,14 +261,14 @@ static long vhost_test_ioctl(struct file *f, unsigned int ioctl,
 			return -EFAULT;
 		return vhost_test_run(n, test);
 	case VHOST_GET_FEATURES:
-		features = VHOST_FEATURES;
+		features = VHOST_NET_FEATURES;
 		if (copy_to_user(featurep, &features, sizeof features))
 			return -EFAULT;
 		return 0;
 	case VHOST_SET_FEATURES:
 		if (copy_from_user(&features, featurep, sizeof features))
 			return -EFAULT;
-		if (features & ~VHOST_FEATURES)
+		if (features & ~VHOST_NET_FEATURES)
 			return -EOPNOTSUPP;
 		return vhost_test_set_features(n, features);
 	case VHOST_RESET_OWNER:
diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
index c5c7fb0..cc046a9 100644
--- a/drivers/vhost/vhost.h
+++ b/drivers/vhost/vhost.h
@@ -199,12 +199,12 @@ int vhost_zerocopy_signal_used(struct vhost_virtqueue *vq);
 	} while (0)
 
 enum {
-	VHOST_FEATURES = (1ULL << VIRTIO_F_NOTIFY_ON_EMPTY) |
-			 (1ULL << VIRTIO_RING_F_INDIRECT_DESC) |
-			 (1ULL << VIRTIO_RING_F_EVENT_IDX) |
-			 (1ULL << VHOST_F_LOG_ALL) |
-			 (1ULL << VHOST_NET_F_VIRTIO_NET_HDR) |
-			 (1ULL << VIRTIO_NET_F_MRG_RXBUF),
+	VHOST_NET_FEATURES =	(1ULL << VIRTIO_F_NOTIFY_ON_EMPTY) |
+				(1ULL << VIRTIO_RING_F_INDIRECT_DESC) |
+				(1ULL << VIRTIO_RING_F_EVENT_IDX) |
+				(1ULL << VHOST_F_LOG_ALL) |
+				(1ULL << VHOST_NET_F_VIRTIO_NET_HDR) |
+				(1ULL << VIRTIO_NET_F_MRG_RXBUF),
 };
 
 static inline int vhost_has_feature(struct vhost_dev *dev, int bit)
-- 
1.7.10.4



* [PATCH RESEND 5/5] vhost-blk: Add vhost-blk support
  2012-07-13  8:55 [PATCH RESEND 0/5] Add vhost-blk support Asias He
                   ` (3 preceding siblings ...)
  2012-07-13  8:55 ` [PATCH RESEND 4/5] vhost-net: Use VHOST_NET_FEATURES for vhost-net Asias He
@ 2012-07-13  8:55 ` Asias He
  2012-07-17 19:10   ` Jeff Moyer
  2012-07-19 13:05   ` Anthony Liguori
  2012-07-14  7:49 ` [PATCH RESEND 0/5] " Christoph Hellwig
                   ` (2 subsequent siblings)
  7 siblings, 2 replies; 22+ messages in thread
From: Asias He @ 2012-07-13  8:55 UTC (permalink / raw)
  To: linux-kernel; +Cc: Michael S. Tsirkin, kvm, virtualization

vhost-blk is an in-kernel virtio-blk device accelerator.

This patch is based on Liu Yuan's implementation with various
improvements and bug fixes. Notably, this patch makes guest notify and
host completion processing in parallel which gives about 60% performance
improvement compared to Liu Yuan's implementation.

Performance evaluation:
-----------------------------
The comparison is between the kvm tool with the userspace implementation
and the kvm tool with vhost-blk.

1) Fio with libaio ioengine on Fusion IO device
With bio-based IO path, sequential read/write, random read/write
IOPS boost         : 8.4%, 15.3%, 10.4%, 14.6%
Latency improvement: 8.5%, 15.4%, 10.4%, 15.1%

2) Fio with vsync ioengine on Fusion IO device
With bio-based IO path, sequential read/write, random read/write
IOPS boost         : 10.5%, 4.8%, 5.2%, 5.6%
Latency improvement: 11.4%, 5.0%, 5.2%, 5.8%
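
The fio job files themselves are not included in the posting; a plausible
job along these lines (every parameter below is an assumption, not the
author's configuration) would be:

```ini
; hypothetical fio job for the libaio run, guest side
[global]
ioengine=libaio
direct=1
runtime=60
ramp_time=5
filename=/dev/vdb        ; the virtio-blk disk inside the guest (assumed)

[seq-read]
rw=read
bs=4k
iodepth=32
```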

Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: linux-kernel@vger.kernel.org
Cc: kvm@vger.kernel.org
Cc: virtualization@lists.linux-foundation.org
Signed-off-by: Asias He <asias@redhat.com>
---
 drivers/vhost/Kconfig  |   10 +
 drivers/vhost/Makefile |    2 +
 drivers/vhost/blk.c    |  600 ++++++++++++++++++++++++++++++++++++++++++++++++
 drivers/vhost/vhost.h  |    5 +
 include/linux/vhost.h  |    3 +
 5 files changed, 620 insertions(+)
 create mode 100644 drivers/vhost/blk.c

diff --git a/drivers/vhost/Kconfig b/drivers/vhost/Kconfig
index c387067..fa071a8 100644
--- a/drivers/vhost/Kconfig
+++ b/drivers/vhost/Kconfig
@@ -16,4 +16,14 @@ config VHOST_NET
 
 	  To compile this driver as a module, choose M here: the module will
 	  be called vhost_net.
+config VHOST_BLK
+	tristate "Host kernel accelerator for virtio blk (EXPERIMENTAL)"
+	depends on VHOST && BLOCK && AIO && EVENTFD && EXPERIMENTAL
+	---help---
+	  This kernel module can be loaded in host kernel to accelerate
+	  guest block with virtio_blk. Not to be confused with virtio_blk
+	  module itself which needs to be loaded in guest kernel.
+
+	  To compile this driver as a module, choose M here: the module will
+	  be called vhost_blk.
 
diff --git a/drivers/vhost/Makefile b/drivers/vhost/Makefile
index cd36885..aa461d5 100644
--- a/drivers/vhost/Makefile
+++ b/drivers/vhost/Makefile
@@ -1,4 +1,6 @@
 obj-$(CONFIG_VHOST)	+= vhost.o
 obj-$(CONFIG_VHOST_NET) += vhost_net.o
+obj-$(CONFIG_VHOST_BLK) += vhost_blk.o
 
 vhost_net-y		:= net.o
+vhost_blk-y		:= blk.o
diff --git a/drivers/vhost/blk.c b/drivers/vhost/blk.c
new file mode 100644
index 0000000..6a94894
--- /dev/null
+++ b/drivers/vhost/blk.c
@@ -0,0 +1,600 @@
+/*
+ * Copyright (C) 2011 Taobao, Inc.
+ * Author: Liu Yuan <tailai.ly@taobao.com>
+ *
+ * Copyright (C) 2012 Red Hat, Inc.
+ * Author: Asias He <asias@redhat.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.
+ *
+ * virtio-blk server in host kernel.
+ */
+
+#include <linux/miscdevice.h>
+#include <linux/module.h>
+#include <linux/vhost.h>
+#include <linux/virtio_blk.h>
+#include <linux/eventfd.h>
+#include <linux/mutex.h>
+#include <linux/file.h>
+#include <linux/mmu_context.h>
+#include <linux/anon_inodes.h>
+#include <linux/kthread.h>
+#include <linux/blkdev.h>
+
+#include "vhost.h"
+
+#define BLK_HDR	0
+
+enum {
+	VHOST_BLK_VQ_REQ = 0,
+	VHOST_BLK_VQ_MAX = 1,
+};
+
+struct vhost_blk_req {
+	u16 head;
+	u8 *status;
+};
+
+struct vhost_blk {
+	struct task_struct *worker_host_kick;
+	struct task_struct *worker;
+	struct vhost_blk_req *reqs;
+	struct vhost_virtqueue vq;
+	struct eventfd_ctx *ectx;
+	struct io_event *ioevent;
+	struct kioctx *ioctx;
+	struct vhost_dev dev;
+	struct file *efile;
+	u64 ioevent_nr;
+	bool stop;
+};
+
+static inline int vhost_blk_read_events(struct vhost_blk *blk, long nr)
+{
+	mm_segment_t old_fs = get_fs();
+	int ret;
+
+	set_fs(KERNEL_DS);
+	ret = read_events(blk->ioctx, nr, nr, blk->ioevent, NULL);
+	set_fs(old_fs);
+
+	return ret;
+}
+
+static int vhost_blk_setup(struct vhost_blk *blk)
+{
+	struct kioctx *ctx;
+
+	if (blk->ioctx)
+		return 0;
+
+	blk->ioevent_nr = blk->vq.num;
+	ctx = ioctx_alloc(blk->ioevent_nr);
+	if (IS_ERR(ctx)) {
+		pr_err("Failed to ioctx_alloc");
+		return PTR_ERR(ctx);
+	}
+	put_ioctx(ctx);
+	blk->ioctx = ctx;
+
+	blk->ioevent = kmalloc(sizeof(struct io_event) * blk->ioevent_nr,
+			       GFP_KERNEL);
+	if (!blk->ioevent) {
+		pr_err("Failed to allocate memory for io_events");
+		return -ENOMEM;
+	}
+
+	blk->reqs = kmalloc(sizeof(struct vhost_blk_req) * blk->ioevent_nr,
+			    GFP_KERNEL);
+	if (!blk->reqs) {
+		pr_err("Failed to allocate memory for vhost_blk_req");
+		return -ENOMEM;
+	}
+
+	return 0;
+}
+
+static inline int vhost_blk_set_status(struct vhost_blk *blk, u8 *statusp,
+				       u8 status)
+{
+	if (copy_to_user(statusp, &status, sizeof(status))) {
+		vq_err(&blk->vq, "Failed to write status\n");
+		vhost_discard_vq_desc(&blk->vq, 1);
+		return -EFAULT;
+	}
+
+	return 0;
+}
+
+static void vhost_blk_enable_vq(struct vhost_blk *blk,
+				struct vhost_virtqueue *vq)
+{
+	wake_up_process(blk->worker_host_kick);
+}
+
+static int vhost_blk_io_submit(struct vhost_blk *blk, struct file *file,
+			       struct vhost_blk_req *req,
+			       struct iovec *iov, u64 nr_vecs, loff_t offset,
+			       int opcode)
+{
+	struct kioctx *ioctx = blk->ioctx;
+	mm_segment_t oldfs = get_fs();
+	struct kiocb_batch batch;
+	struct blk_plug plug;
+	struct kiocb *iocb;
+	int ret;
+
+	if (!try_get_ioctx(ioctx)) {
+		pr_info("Failed to get ioctx");
+		return -EAGAIN;
+	}
+
+	atomic_long_inc_not_zero(&file->f_count);
+	eventfd_ctx_get(blk->ectx);
+
+	/* TODO: batch to 1 is not good! */
+	kiocb_batch_init(&batch, 1);
+	blk_start_plug(&plug);
+
+	iocb = aio_get_req(ioctx, &batch);
+	if (unlikely(!iocb)) {
+		ret = -EAGAIN;
+		goto out;
+	}
+
+	iocb->ki_filp	= file;
+	iocb->ki_pos	= offset;
+	iocb->ki_buf	= (void *)iov;
+	iocb->ki_left	= nr_vecs;
+	iocb->ki_nbytes	= nr_vecs;
+	iocb->ki_opcode	= opcode;
+	iocb->ki_obj.user = req;
+	iocb->ki_eventfd  = blk->ectx;
+
+	set_fs(KERNEL_DS);
+	ret = aio_setup_iocb(iocb, false);
+	set_fs(oldfs);
+	if (unlikely(ret))
+		goto out_put_iocb;
+
+	spin_lock_irq(&ioctx->ctx_lock);
+	if (unlikely(ioctx->dead)) {
+		spin_unlock_irq(&ioctx->ctx_lock);
+		ret = -EINVAL;
+		goto out_put_iocb;
+	}
+	aio_run_iocb(iocb);
+	spin_unlock_irq(&ioctx->ctx_lock);
+
+	aio_put_req(iocb);
+
+	blk_finish_plug(&plug);
+	kiocb_batch_free(ioctx, &batch);
+	put_ioctx(ioctx);
+
+	return ret;
+out_put_iocb:
+	aio_put_req(iocb); /* Drop extra ref to req */
+	aio_put_req(iocb); /* Drop I/O ref to req */
+out:
+	put_ioctx(ioctx);
+	return ret;
+}
+
+static void vhost_blk_flush(struct vhost_blk *blk)
+{
+	vhost_poll_flush(&blk->vq.poll);
+}
+
+static struct file *vhost_blk_stop_vq(struct vhost_blk *blk,
+				      struct vhost_virtqueue *vq)
+{
+	struct file *file;
+
+	mutex_lock(&vq->mutex);
+	file = rcu_dereference_protected(vq->private_data,
+			lockdep_is_held(&vq->mutex));
+	rcu_assign_pointer(vq->private_data, NULL);
+	mutex_unlock(&vq->mutex);
+
+	return file;
+
+}
+
+static inline void vhost_blk_stop(struct vhost_blk *blk, struct file **file)
+{
+
+	*file = vhost_blk_stop_vq(blk, &blk->vq);
+}
+
+/* Handle guest request */
+static int vhost_blk_do_req(struct vhost_virtqueue *vq,
+			    struct virtio_blk_outhdr *hdr,
+			    u16 head, u16 out, u16 in,
+			    struct file *file)
+{
+	struct vhost_blk *blk = container_of(vq->dev, struct vhost_blk, dev);
+	struct iovec *iov = &vq->iov[BLK_HDR + 1];
+	loff_t offset = hdr->sector << 9;
+	struct vhost_blk_req *req;
+	u64 nr_vecs;
+	int ret = 0;
+	u8 status;
+
+	if (hdr->type == VIRTIO_BLK_T_IN || hdr->type == VIRTIO_BLK_T_GET_ID)
+		nr_vecs = in - 1;
+	else
+		nr_vecs = out - 1;
+
+	req		= &blk->reqs[head];
+	req->head	= head;
+	req->status	= blk->vq.iov[nr_vecs + 1].iov_base;
+
+	switch (hdr->type) {
+	case VIRTIO_BLK_T_OUT:
+		ret = vhost_blk_io_submit(blk, file, req, iov, nr_vecs, offset,
+					  IOCB_CMD_PWRITEV);
+		break;
+	case VIRTIO_BLK_T_IN:
+		ret = vhost_blk_io_submit(blk, file, req, iov, nr_vecs, offset,
+					  IOCB_CMD_PREADV);
+		break;
+	case VIRTIO_BLK_T_FLUSH:
+		ret = vfs_fsync(file, 1);
+		status = ret < 0 ? VIRTIO_BLK_S_IOERR : VIRTIO_BLK_S_OK;
+		ret = vhost_blk_set_status(blk, req->status, status);
+		if (!ret)
+			vhost_add_used_and_signal(&blk->dev, vq, head, ret);
+		break;
+	case VIRTIO_BLK_T_GET_ID:
+		/* TODO: need a real ID string */
+		ret = snprintf(vq->iov[BLK_HDR + 1].iov_base,
+			       VIRTIO_BLK_ID_BYTES, "VHOST-BLK-DISK");
+		status = ret < 0 ? VIRTIO_BLK_S_IOERR : VIRTIO_BLK_S_OK;
+		ret = vhost_blk_set_status(blk, req->status, status);
+		if (!ret)
+			vhost_add_used_and_signal(&blk->dev, vq, head,
+						  VIRTIO_BLK_ID_BYTES);
+		break;
+	default:
+		pr_warn("Unsupported request type %d\n", hdr->type);
+		vhost_discard_vq_desc(vq, 1);
+		ret = -EFAULT;
+		break;
+	}
+
+	return ret;
+}
+
+/* Guest kicks us for an IO request */
+static void vhost_blk_handle_guest_kick(struct vhost_work *work)
+{
+	struct virtio_blk_outhdr hdr;
+	struct vhost_virtqueue *vq;
+	struct vhost_blk *blk;
+	struct file *f;
+	int in, out;
+	u16 head;
+
+	vq = container_of(work, struct vhost_virtqueue, poll.work);
+	blk = container_of(vq->dev, struct vhost_blk, dev);
+
+	/* TODO: check that we are running from vhost_worker? */
+	f = rcu_dereference_check(vq->private_data, 1);
+	if (!f)
+		return;
+
+	vhost_disable_notify(&blk->dev, vq);
+	for (;;) {
+		head = vhost_get_vq_desc(&blk->dev, vq, vq->iov,
+					 ARRAY_SIZE(vq->iov),
+					 &out, &in, NULL, NULL);
+		if (unlikely(head < 0))
+			break;
+
+		if (unlikely(head == vq->num)) {
+			if (unlikely(vhost_enable_notify(&blk->dev, vq))) {
+				vhost_disable_notify(&blk->dev, vq);
+				continue;
+			}
+			break;
+		}
+
+		if (unlikely(vq->iov[BLK_HDR].iov_len != sizeof(hdr))) {
+			vq_err(vq, "Bad block header length!\n");
+			vhost_discard_vq_desc(vq, 1);
+			break;
+		}
+
+		if (unlikely(copy_from_user(&hdr, vq->iov[BLK_HDR].iov_base,
+					    sizeof(hdr)))) {
+			vq_err(vq, "Failed to get block header!\n");
+			vhost_discard_vq_desc(vq, 1);
+			break;
+		}
+
+
+		if (unlikely(vhost_blk_do_req(vq, &hdr, head, out, in, f) < 0))
+			break;
+	}
+}
+
+/* Complete the IO request */
+static int vhost_blk_host_kick_thread(void *data)
+{
+	mm_segment_t oldfs = get_fs();
+	struct vhost_blk *blk = data;
+	struct vhost_virtqueue *vq;
+	struct vhost_blk_req *req;
+	struct io_event *e;
+	int ret, i, len;
+	u64 count, nr;
+	u8 status;
+
+	vq = &blk->vq;
+	set_fs(USER_DS);
+	use_mm(blk->dev.mm);
+	for (;;) {
+		do {
+			ret = eventfd_ctx_read(blk->ectx, 0, &count);
+			if (unlikely(kthread_should_stop() || blk->stop))
+				goto out;
+		} while (ret != 0);
+
+		do {
+			nr = vhost_blk_read_events(blk,
+						   min(count, blk->ioevent_nr));
+			if (unlikely(nr <= 0))
+				continue;
+			count -= nr;
+
+			for (i = 0; i < nr; i++) {
+				e = &blk->ioevent[i];
+				req = (void *)e->obj;
+				len = e->res;
+				status = len > 0 ? VIRTIO_BLK_S_OK :
+						   VIRTIO_BLK_S_IOERR;
+				ret = copy_to_user(req->status, &status,
+						   sizeof(status));
+				if (unlikely(ret)) {
+					vq_err(&blk->vq,
+					       "Failed to write status\n");
+					continue;
+				}
+				vhost_add_used(&blk->vq, req->head, len);
+			}
+			vhost_signal(&blk->dev, &blk->vq);
+		} while (count > 0);
+	}
+
+out:
+	unuse_mm(blk->dev.mm);
+	set_fs(oldfs);
+	return 0;
+}
+
+static int vhost_blk_open(struct inode *inode, struct file *file)
+{
+	struct vhost_blk *blk;
+	int ret;
+
+	blk = kzalloc(sizeof(*blk), GFP_KERNEL);
+	if (!blk) {
+		ret = -ENOMEM;
+		goto out;
+	}
+
+	blk->vq.handle_kick = vhost_blk_handle_guest_kick;
+
+	ret = vhost_dev_init(&blk->dev, &blk->vq, VHOST_BLK_VQ_MAX);
+	if (ret < 0)
+		goto out_dev;
+	/*
+	 * Create an eventfd which is used by aio code to
+	 * notify guest when request is completed.
+	 */
+	blk->efile = eventfd_file_create(0, 0);
+	if (IS_ERR(blk->efile))
+		goto out_dev;
+	blk->ectx = eventfd_ctx_fileget(blk->efile);
+	if (IS_ERR(blk->ectx))
+		goto out_dev;
+
+	file->private_data = blk;
+
+	blk->worker_host_kick = kthread_create(vhost_blk_host_kick_thread,
+			blk, "vhost-blk-%d", current->pid);
+	if (IS_ERR(blk->worker_host_kick)) {
+		ret = PTR_ERR(blk->worker_host_kick);
+		goto out_dev;
+	}
+
+	return ret;
+out_dev:
+	kfree(blk);
+out:
+	return ret;
+}
+
+static int vhost_blk_release(struct inode *inode, struct file *f)
+{
+	struct vhost_blk *blk = f->private_data;
+	struct file *file;
+
+	vhost_blk_stop(blk, &file);
+	vhost_blk_flush(blk);
+	vhost_dev_cleanup(&blk->dev, false);
+	if (file)
+		fput(file);
+
+	blk->stop = true;
+	eventfd_signal(blk->ectx, 1);
+	kthread_stop(blk->worker_host_kick);
+
+	eventfd_ctx_put(blk->ectx);
+	if (blk->efile)
+		fput(blk->efile);
+
+	kfree(blk->ioevent);
+	kfree(blk->reqs);
+	kfree(blk);
+
+	return 0;
+}
+
+static int vhost_blk_set_features(struct vhost_blk *blk, u64 features)
+{
+	mutex_lock(&blk->dev.mutex);
+	blk->dev.acked_features = features;
+	mutex_unlock(&blk->dev.mutex);
+
+	return 0;
+}
+
+static long vhost_blk_set_backend(struct vhost_blk *blk, unsigned index, int fd)
+{
+	struct vhost_virtqueue *vq = &blk->vq;
+	struct file *file, *oldfile;
+	int ret;
+
+	mutex_lock(&blk->dev.mutex);
+	ret = vhost_dev_check_owner(&blk->dev);
+	if (ret)
+		goto out_dev;
+
+	if (index >= VHOST_BLK_VQ_MAX) {
+		ret = -ENOBUFS;
+		goto out_dev;
+	}
+
+	mutex_lock(&vq->mutex);
+
+	if (!vhost_vq_access_ok(vq)) {
+		ret = -EFAULT;
+		goto out_vq;
+	}
+
+	file = fget(fd);
+	if (IS_ERR(file)) {
+		ret = PTR_ERR(file);
+		goto out_vq;
+	}
+
+	oldfile = rcu_dereference_protected(vq->private_data,
+			lockdep_is_held(&vq->mutex));
+	if (file != oldfile) {
+		rcu_assign_pointer(vq->private_data, file);
+		vhost_blk_enable_vq(blk, vq);
+
+		ret = vhost_init_used(vq);
+		if (ret)
+			goto out_vq;
+	}
+
+	mutex_unlock(&vq->mutex);
+
+	if (oldfile) {
+		vhost_blk_flush(blk);
+		fput(oldfile);
+	}
+
+	mutex_unlock(&blk->dev.mutex);
+	return 0;
+
+out_vq:
+	mutex_unlock(&vq->mutex);
+out_dev:
+	mutex_unlock(&blk->dev.mutex);
+	return ret;
+}
+
+static long vhost_blk_reset_owner(struct vhost_blk *blk)
+{
+	struct file *file = NULL;
+	int err;
+
+	mutex_lock(&blk->dev.mutex);
+	err = vhost_dev_check_owner(&blk->dev);
+	if (err)
+		goto done;
+	vhost_blk_stop(blk, &file);
+	vhost_blk_flush(blk);
+	err = vhost_dev_reset_owner(&blk->dev);
+done:
+	mutex_unlock(&blk->dev.mutex);
+	if (file)
+		fput(file);
+	return err;
+}
+
+static long vhost_blk_ioctl(struct file *f, unsigned int ioctl,
+			    unsigned long arg)
+{
+	struct vhost_blk *blk = f->private_data;
+	void __user *argp = (void __user *)arg;
+	struct vhost_vring_file backend;
+	u64 __user *featurep = argp;
+	u64 features;
+	int ret;
+
+	switch (ioctl) {
+	case VHOST_BLK_SET_BACKEND:
+		if (copy_from_user(&backend, argp, sizeof backend))
+			return -EFAULT;
+		return vhost_blk_set_backend(blk, backend.index, backend.fd);
+	case VHOST_GET_FEATURES:
+		features = VHOST_BLK_FEATURES;
+		if (copy_to_user(featurep, &features, sizeof features))
+			return -EFAULT;
+		return 0;
+	case VHOST_SET_FEATURES:
+		if (copy_from_user(&features, featurep, sizeof features))
+			return -EFAULT;
+		if (features & ~VHOST_BLK_FEATURES)
+			return -EOPNOTSUPP;
+		return vhost_blk_set_features(blk, features);
+	case VHOST_RESET_OWNER:
+		return vhost_blk_reset_owner(blk);
+	default:
+		mutex_lock(&blk->dev.mutex);
+		ret = vhost_dev_ioctl(&blk->dev, ioctl, arg);
+		if (!ret && ioctl == VHOST_SET_VRING_NUM)
+			ret = vhost_blk_setup(blk);
+		vhost_blk_flush(blk);
+		mutex_unlock(&blk->dev.mutex);
+		return ret;
+	}
+}
+
+static const struct file_operations vhost_blk_fops = {
+	.owner          = THIS_MODULE,
+	.open           = vhost_blk_open,
+	.release        = vhost_blk_release,
+	.llseek		= noop_llseek,
+	.unlocked_ioctl = vhost_blk_ioctl,
+};
+
+static struct miscdevice vhost_blk_misc = {
+	MISC_DYNAMIC_MINOR,
+	"vhost-blk",
+	&vhost_blk_fops,
+};
+
+int vhost_blk_init(void)
+{
+	return misc_register(&vhost_blk_misc);
+}
+
+void vhost_blk_exit(void)
+{
+	misc_deregister(&vhost_blk_misc);
+}
+
+module_init(vhost_blk_init);
+module_exit(vhost_blk_exit);
+
+MODULE_VERSION("0.0.2");
+MODULE_LICENSE("GPL v2");
+MODULE_AUTHOR("Asias He");
+MODULE_DESCRIPTION("Host kernel accelerator for virtio_blk");
diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
index cc046a9..1d4db7b 100644
--- a/drivers/vhost/vhost.h
+++ b/drivers/vhost/vhost.h
@@ -205,6 +205,11 @@ enum {
 				(1ULL << VHOST_F_LOG_ALL) |
 				(1ULL << VHOST_NET_F_VIRTIO_NET_HDR) |
 				(1ULL << VIRTIO_NET_F_MRG_RXBUF),
+
+	VHOST_BLK_FEATURES =	(1ULL << VIRTIO_F_NOTIFY_ON_EMPTY) |
+				(1ULL << VIRTIO_RING_F_INDIRECT_DESC) |
+				(1ULL << VIRTIO_RING_F_EVENT_IDX) |
+				(1ULL << VHOST_F_LOG_ALL),
 };
 
 static inline int vhost_has_feature(struct vhost_dev *dev, int bit)
diff --git a/include/linux/vhost.h b/include/linux/vhost.h
index e847f1e..5869728 100644
--- a/include/linux/vhost.h
+++ b/include/linux/vhost.h
@@ -121,6 +121,9 @@ struct vhost_memory {
  * device.  This can be used to stop the ring (e.g. for migration). */
 #define VHOST_NET_SET_BACKEND _IOW(VHOST_VIRTIO, 0x30, struct vhost_vring_file)
 
+/* VHOST_BLK specific defines */
+#define VHOST_BLK_SET_BACKEND _IOW(VHOST_VIRTIO, 0x40, struct vhost_vring_file)
+
 /* Feature bits */
 /* Log all write descriptors. Can be changed while device is active. */
 #define VHOST_F_LOG_ALL 26
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH RESEND 0/5] Add vhost-blk support
  2012-07-13  8:55 [PATCH RESEND 0/5] Add vhost-blk support Asias He
                   ` (4 preceding siblings ...)
  2012-07-13  8:55 ` [PATCH RESEND 5/5] vhost-blk: Add vhost-blk support Asias He
@ 2012-07-14  7:49 ` Christoph Hellwig
  2012-07-16  9:05   ` Asias He
  2012-07-17 15:09 ` Michael S. Tsirkin
  2012-07-20 19:30 ` Michael S. Tsirkin
  7 siblings, 1 reply; 22+ messages in thread
From: Christoph Hellwig @ 2012-07-14  7:49 UTC (permalink / raw)
  To: Asias He
  Cc: linux-kernel, Alexander Viro, Benjamin LaHaise, James Bottomley,
	Jeff Moyer, kvm, linux-aio, linux-fsdevel, Michael S. Tsirkin,
	virtualization

Please send a version that does direct block I/O similar to xen-blkback
for now.  If we get proper in-kernel aio support one day you can add
back file backend support.



* Re: [PATCH RESEND 0/5] Add vhost-blk support
  2012-07-14  7:49 ` [PATCH RESEND 0/5] " Christoph Hellwig
@ 2012-07-16  9:05   ` Asias He
  0 siblings, 0 replies; 22+ messages in thread
From: Asias He @ 2012-07-16  9:05 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-kernel, Alexander Viro, Benjamin LaHaise, James Bottomley,
	Jeff Moyer, kvm, linux-aio, linux-fsdevel, Michael S. Tsirkin,
	virtualization, Dave Kleikamp, Zach Brown

Hi Christoph,

On 07/14/2012 03:49 PM, Christoph Hellwig wrote:
> Please send a version that does direct block I/O similar to xen-blkback
> for now.

It seems xen-blkback converts the guest IO requests to host bios and 
submits them directly. I was wondering whether this has a performance 
gain compared to the AIO implementation.

> If we get proper in-kernel aio support one day you can add
> back file backend support.

I talked with Dave and Zach about the in-kernel aio patch which James 
pointed out:

http://marc.info/?l=linux-fsdevel&m=133312234313122

Dave will post a new version soon. I will wait for it.

-- 
Asias




* Re: [PATCH RESEND 0/5] Add vhost-blk support
  2012-07-13  8:55 [PATCH RESEND 0/5] Add vhost-blk support Asias He
                   ` (5 preceding siblings ...)
  2012-07-14  7:49 ` [PATCH RESEND 0/5] " Christoph Hellwig
@ 2012-07-17 15:09 ` Michael S. Tsirkin
  2012-07-18  2:09   ` Asias He
  2012-07-18 11:42   ` Stefan Hajnoczi
  2012-07-20 19:30 ` Michael S. Tsirkin
  7 siblings, 2 replies; 22+ messages in thread
From: Michael S. Tsirkin @ 2012-07-17 15:09 UTC (permalink / raw)
  To: Asias He
  Cc: linux-kernel, Alexander Viro, Benjamin LaHaise, James Bottomley,
	Jeff Moyer, kvm, linux-aio, linux-fsdevel, target-devel,
	linux-scsi, lf-virt, Stefan Hajnoczi, Zhi Yong Wu,
	Anthony Liguori, Paolo Bonzini, Christoph Hellwig, Jens Axboe,
	Hannes Reinecke

On Fri, Jul 13, 2012 at 04:55:06PM +0800, Asias He wrote:
> 
> Hi folks,
> 
> [I am resending to fix the broken thread in the previous one.]
> 
> This patchset adds vhost-blk support. vhost-blk is an in-kernel virtio-blk
> device accelerator. Compared to the userspace virtio-blk implementation,
> vhost-blk gives about a 5% to 15% performance improvement.

Same thing as tcm_host comment:

	It seems not 100% clear whether this driver will have major
	userspace using it. And if not, it would be very hard to support a
	driver when recent userspace does not use it in the end.

	I think a good idea for 3.6 would be to make it depend on
	CONFIG_STAGING.  Then we don't commit to an ABI.  For this, you can add
	a separate Kconfig and source it from drivers/staging/Kconfig.  Maybe it
	needs to be in a separate directory drivers/vhost/staging/Kconfig.

I Cc'd the list of tcm_host in the hope that you can cooperate on this.

-- 
MST


* Re: [PATCH RESEND 5/5] vhost-blk: Add vhost-blk support
  2012-07-13  8:55 ` [PATCH RESEND 5/5] vhost-blk: Add vhost-blk support Asias He
@ 2012-07-17 19:10   ` Jeff Moyer
  2012-07-18  1:22     ` Asias He
  2012-07-19 13:05   ` Anthony Liguori
  1 sibling, 1 reply; 22+ messages in thread
From: Jeff Moyer @ 2012-07-17 19:10 UTC (permalink / raw)
  To: Asias He; +Cc: linux-kernel, Michael S. Tsirkin, kvm, virtualization

Asias He <asias@redhat.com> writes:

> vhost-blk is an in-kernel virtio-blk device accelerator.
>
> This patch is based on Liu Yuan's implementation with various
> improvements and bug fixes. Notably, this patch makes guest notification
> and host completion processing run in parallel, which gives about a 60%
> performance improvement over Liu Yuan's implementation.

So, first off, some basic questions.  Is it correct to assume that you
tested this with buffered I/O (files opened *without* O_DIRECT)?  I'm
pretty sure that if you used O_DIRECT, you'd run into problems (which
are solved by the patch set posted by Shaggy, based on Zach Brown's work
of many moons ago).  Note that, with buffered I/O, the submission path
is NOT asynchronous.  So, any speedups you've reported are extremely
suspect.  ;-)

Next, did you look at Shaggy's patch set?  I think it would be best to
focus your efforts on testing *that*, and implementing your work on top
of it.

Having said that, I did do some review of this patch, inlined below.

> +static int vhost_blk_setup(struct vhost_blk *blk)
> +{
> +	struct kioctx *ctx;
> +
> +	if (blk->ioctx)
> +		return 0;
> +
> +	blk->ioevent_nr = blk->vq.num;
> +	ctx = ioctx_alloc(blk->ioevent_nr);
> +	if (IS_ERR(ctx)) {
> +		pr_err("Failed to ioctx_alloc");
> +		return PTR_ERR(ctx);
> +	}
> +	put_ioctx(ctx);
> +	blk->ioctx = ctx;
> +
> +	blk->ioevent = kmalloc(sizeof(struct io_event) * blk->ioevent_nr,
> +			       GFP_KERNEL);
> +	if (!blk->ioevent) {
> +		pr_err("Failed to allocate memory for io_events");
> +		return -ENOMEM;

You've just leaked blk->ioctx.

> +	}
> +
> +	blk->reqs = kmalloc(sizeof(struct vhost_blk_req) * blk->ioevent_nr,
> +			    GFP_KERNEL);
> +	if (!blk->reqs) {
> +		pr_err("Failed to allocate memory for vhost_blk_req");
> +		return -ENOMEM;

And here.

> +	}
> +
> +	return 0;
> +}
> +
[snip]
> +static int vhost_blk_io_submit(struct vhost_blk *blk, struct file *file,
> +			       struct vhost_blk_req *req,
> +			       struct iovec *iov, u64 nr_vecs, loff_t offset,
> +			       int opcode)
> +{
> +	struct kioctx *ioctx = blk->ioctx;
> +	mm_segment_t oldfs = get_fs();
> +	struct kiocb_batch batch;
> +	struct blk_plug plug;
> +	struct kiocb *iocb;
> +	int ret;
> +
> +	if (!try_get_ioctx(ioctx)) {
> +		pr_info("Failed to get ioctx");
> +		return -EAGAIN;
> +	}

Using try_get_ioctx directly gives me a slightly uneasy feeling.  I
understand that you don't need to do the lookup, but at least wrap it
and check for ->dead.

> +
> +	atomic_long_inc_not_zero(&file->f_count);
> +	eventfd_ctx_get(blk->ectx);
> +
> +	/* TODO: batch to 1 is not good! */

Agreed.  You should setup the batching in vhost_blk_handle_guest_kick.
The way you've written the code, the batching is not at all helpful.

> +	kiocb_batch_init(&batch, 1);
> +	blk_start_plug(&plug);
> +
> +	iocb = aio_get_req(ioctx, &batch);
> +	if (unlikely(!iocb)) {
> +		ret = -EAGAIN;
> +		goto out;
> +	}
> +
> +	iocb->ki_filp	= file;
> +	iocb->ki_pos	= offset;
> +	iocb->ki_buf	= (void *)iov;
> +	iocb->ki_left	= nr_vecs;
> +	iocb->ki_nbytes	= nr_vecs;
> +	iocb->ki_opcode	= opcode;
> +	iocb->ki_obj.user = req;
> +	iocb->ki_eventfd  = blk->ectx;
> +
> +	set_fs(KERNEL_DS);
> +	ret = aio_setup_iocb(iocb, false);
> +	set_fs(oldfs);
> +	if (unlikely(ret))
> +		goto out_put_iocb;
> +
> +	spin_lock_irq(&ioctx->ctx_lock);
> +	if (unlikely(ioctx->dead)) {
> +		spin_unlock_irq(&ioctx->ctx_lock);
> +		ret = -EINVAL;
> +		goto out_put_iocb;
> +	}
> +	aio_run_iocb(iocb);
> +	spin_unlock_irq(&ioctx->ctx_lock);
> +
> +	aio_put_req(iocb);
> +
> +	blk_finish_plug(&plug);
> +	kiocb_batch_free(ioctx, &batch);
> +	put_ioctx(ioctx);
> +
> +	return ret;
> +out_put_iocb:
> +	aio_put_req(iocb); /* Drop extra ref to req */
> +	aio_put_req(iocb); /* Drop I/O ref to req */
> +out:
> +	put_ioctx(ioctx);
> +	return ret;
> +}
> +

You've duplicated a lot of io_submit_one.  I'd rather see that factored
out than to have to maintain two copies.

Again, what I'd *really* like to see is you rebase on top of Shaggy's
work.

Cheers,
Jeff


* Re: [PATCH RESEND 5/5] vhost-blk: Add vhost-blk support
  2012-07-17 19:10   ` Jeff Moyer
@ 2012-07-18  1:22     ` Asias He
  2012-07-18 14:31       ` Jeff Moyer
  0 siblings, 1 reply; 22+ messages in thread
From: Asias He @ 2012-07-18  1:22 UTC (permalink / raw)
  To: Jeff Moyer; +Cc: linux-kernel, Michael S. Tsirkin, kvm, virtualization

On 07/18/2012 03:10 AM, Jeff Moyer wrote:
> Asias He <asias@redhat.com> writes:
>
>> vhost-blk is an in-kernel virtio-blk device accelerator.
>>
>> This patch is based on Liu Yuan's implementation with various
>> improvements and bug fixes. Notably, this patch makes guest notification
>> and host completion processing run in parallel, which gives about a 60%
>> performance improvement over Liu Yuan's implementation.
>
> So, first off, some basic questions.  Is it correct to assume that you
> tested this with buffered I/O (files opened *without* O_DIRECT)?
>  I'm pretty sure that if you used O_DIRECT, you'd run into problems (which
> are solved by the patch set posted by Shaggy, based on Zach Brown's work
> of many moons ago).  Note that, with buffered I/O, the submission path
> is NOT asynchronous.  So, any speedups you've reported are extremely
> suspect.  ;-)

I always used O_DIRECT to test this patchset, and I mostly used a raw 
block device as the guest image. Is this the reason why I did not hit the 
problem you mentioned? Btw, I have also run this patchset on an image-based 
file and still do not see problems like IO hangs.

> Next, did you look at Shaggy's patch set?  I think it would be best to
> focus your efforts on testing *that*, and implementing your work on top
> of it.

I guess you mean this one:

http://marc.info/?l=linux-fsdevel&m=133312234313122

I did not notice that until James pointed that out.

I talked with Zach and Shaggy. Shaggy said he is still working on that 
patch set and will send that patch out soon.


> Having said that, I did do some review of this patch, inlined below.

Thanks, Jeff!

>> +static int vhost_blk_setup(struct vhost_blk *blk)
>> +{
>> +	struct kioctx *ctx;
>> +
>> +	if (blk->ioctx)
>> +		return 0;
>> +
>> +	blk->ioevent_nr = blk->vq.num;
>> +	ctx = ioctx_alloc(blk->ioevent_nr);
>> +	if (IS_ERR(ctx)) {
>> +		pr_err("Failed to ioctx_alloc");
>> +		return PTR_ERR(ctx);
>> +	}
>> +	put_ioctx(ctx);
>> +	blk->ioctx = ctx;
>> +
>> +	blk->ioevent = kmalloc(sizeof(struct io_event) * blk->ioevent_nr,
>> +			       GFP_KERNEL);
>> +	if (!blk->ioevent) {
>> +		pr_err("Failed to allocate memory for io_events");
>> +		return -ENOMEM;
>
> You've just leaked blk->ioctx.

Yes. Will fix.

>> +	}
>> +
>> +	blk->reqs = kmalloc(sizeof(struct vhost_blk_req) * blk->ioevent_nr,
>> +			    GFP_KERNEL);
>> +	if (!blk->reqs) {
>> +		pr_err("Failed to allocate memory for vhost_blk_req");
>> +		return -ENOMEM;
>
> And here.

Yes. Will fix.

>
>> +	}
>> +
>> +	return 0;
>> +}
>> +
> [snip]
>> +static int vhost_blk_io_submit(struct vhost_blk *blk, struct file *file,
>> +			       struct vhost_blk_req *req,
>> +			       struct iovec *iov, u64 nr_vecs, loff_t offset,
>> +			       int opcode)
>> +{
>> +	struct kioctx *ioctx = blk->ioctx;
>> +	mm_segment_t oldfs = get_fs();
>> +	struct kiocb_batch batch;
>> +	struct blk_plug plug;
>> +	struct kiocb *iocb;
>> +	int ret;
>> +
>> +	if (!try_get_ioctx(ioctx)) {
>> +		pr_info("Failed to get ioctx");
>> +		return -EAGAIN;
>> +	}
>
> Using try_get_ioctx directly gives me a slightly uneasy feeling.  I
> understand that you don't need to do the lookup, but at least wrap it
> and check for ->dead.

OK.

>
>> +
>> +	atomic_long_inc_not_zero(&file->f_count);
>> +	eventfd_ctx_get(blk->ectx);
>> +
>> +	/* TODO: batch to 1 is not good! */
>
> Agreed.  You should setup the batching in vhost_blk_handle_guest_kick.
> The way you've written the code, the batching is not at all helpful.

Yes. that's why there is a TODO.

>> +	kiocb_batch_init(&batch, 1);
>> +	blk_start_plug(&plug);
>> +
>> +	iocb = aio_get_req(ioctx, &batch);
>> +	if (unlikely(!iocb)) {
>> +		ret = -EAGAIN;
>> +		goto out;
>> +	}
>> +
>> +	iocb->ki_filp	= file;
>> +	iocb->ki_pos	= offset;
>> +	iocb->ki_buf	= (void *)iov;
>> +	iocb->ki_left	= nr_vecs;
>> +	iocb->ki_nbytes	= nr_vecs;
>> +	iocb->ki_opcode	= opcode;
>> +	iocb->ki_obj.user = req;
>> +	iocb->ki_eventfd  = blk->ectx;
>> +
>> +	set_fs(KERNEL_DS);
>> +	ret = aio_setup_iocb(iocb, false);
>> +	set_fs(oldfs);
>> +	if (unlikely(ret))
>> +		goto out_put_iocb;
>> +
>> +	spin_lock_irq(&ioctx->ctx_lock);
>> +	if (unlikely(ioctx->dead)) {
>> +		spin_unlock_irq(&ioctx->ctx_lock);
>> +		ret = -EINVAL;
>> +		goto out_put_iocb;
>> +	}
>> +	aio_run_iocb(iocb);
>> +	spin_unlock_irq(&ioctx->ctx_lock);
>> +
>> +	aio_put_req(iocb);
>> +
>> +	blk_finish_plug(&plug);
>> +	kiocb_batch_free(ioctx, &batch);
>> +	put_ioctx(ioctx);
>> +
>> +	return ret;
>> +out_put_iocb:
>> +	aio_put_req(iocb); /* Drop extra ref to req */
>> +	aio_put_req(iocb); /* Drop I/O ref to req */
>> +out:
>> +	put_ioctx(ioctx);
>> +	return ret;
>> +}
>> +
>
> You've duplicated a lot of io_submit_one.  I'd rather see that factored
> out than to have to maintain two copies.

Agree.

> Again, what I'd *really* like to see is you rebase on top of Shaggy's
> work.

Sure. Let's wait for Shaggy's new version.


-- 
Asias




* Re: [PATCH RESEND 0/5] Add vhost-blk support
  2012-07-17 15:09 ` Michael S. Tsirkin
@ 2012-07-18  2:09   ` Asias He
  2012-07-18 11:42   ` Stefan Hajnoczi
  1 sibling, 0 replies; 22+ messages in thread
From: Asias He @ 2012-07-18  2:09 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: linux-kernel, Alexander Viro, Benjamin LaHaise, James Bottomley,
	Jeff Moyer, kvm, linux-aio, linux-fsdevel, target-devel,
	linux-scsi, lf-virt, Stefan Hajnoczi, Zhi Yong Wu,
	Anthony Liguori, Paolo Bonzini, Christoph Hellwig, Jens Axboe,
	Hannes Reinecke

On 07/17/2012 11:09 PM, Michael S. Tsirkin wrote:
> On Fri, Jul 13, 2012 at 04:55:06PM +0800, Asias He wrote:
>>
>> Hi folks,
>>
>> [I am resending to fix the broken thread in the previous one.]
>>
>> This patchset adds vhost-blk support. vhost-blk is an in-kernel virtio-blk
>> device accelerator. Compared to the userspace virtio-blk implementation,
>> vhost-blk gives about a 5% to 15% performance improvement.
>
> Same thing as tcm_host comment:
>
> 	It seems not 100% clear whether this driver will have major
> 	userspace using it. And if not, it would be very hard to support a
> 	driver when recent userspace does not use it in the end.
>
> 	I think a good idea for 3.6 would be to make it depend on
> 	CONFIG_STAGING.  Then we don't commit to an ABI.  For this, you can add
> 	a separate Kconfig and source it from drivers/staging/Kconfig.  Maybe it
> 	needs to be in a separate directory drivers/vhost/staging/Kconfig.

OK.

> I Cc'd the list of tcm_host in the hope that you can cooperate on this.

Sure.

-- 
Asias




* Re: [PATCH RESEND 0/5] Add vhost-blk support
  2012-07-17 15:09 ` Michael S. Tsirkin
  2012-07-18  2:09   ` Asias He
@ 2012-07-18 11:42   ` Stefan Hajnoczi
  1 sibling, 0 replies; 22+ messages in thread
From: Stefan Hajnoczi @ 2012-07-18 11:42 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Asias He, Jens Axboe, linux-aio, target-devel, linux-scsi, kvm,
	linux-kernel, lf-virt, James Bottomley, Anthony Liguori,
	Jeff Moyer, Benjamin LaHaise, Alexander Viro, linux-fsdevel,
	Paolo Bonzini, Zhi Yong Wu, Christoph Hellwig, Stefan Hajnoczi

On Tue, Jul 17, 2012 at 4:09 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
> On Fri, Jul 13, 2012 at 04:55:06PM +0800, Asias He wrote:
>>
>> Hi folks,
>>
>> [I am resending to fix the broken thread in the previous one.]
>>
>> This patchset adds vhost-blk support. vhost-blk is an in-kernel virtio-blk
>> device accelerator. Compared to the userspace virtio-blk implementation,
>> vhost-blk gives about a 5% to 15% performance improvement.
>
> Same thing as tcm_host comment:
>
>         It seems not 100% clear whether this driver will have major
>         userspace using it. And if not, it would be very hard to support a
>         driver when recent userspace does not use it in the end.
>
>         I think a good idea for 3.6 would be to make it depend on
>         CONFIG_STAGING.  Then we don't commit to an ABI.  For this, you can add
>         a separate Kconfig and source it from drivers/staging/Kconfig.  Maybe it
>         needs to be in a separate directory drivers/vhost/staging/Kconfig.
>
> I Cc'd the list of tcm_host in the hope that you can cooperate on this.
>

Adding it to staging allows more people to try it out, so that's a
good thing.  If I get a moment to play with it I'll let you know the
results.

Stefan


* Re: [PATCH RESEND 5/5] vhost-blk: Add vhost-blk support
  2012-07-18  1:22     ` Asias He
@ 2012-07-18 14:31       ` Jeff Moyer
  2012-07-18 14:45         ` Asias He
  0 siblings, 1 reply; 22+ messages in thread
From: Jeff Moyer @ 2012-07-18 14:31 UTC (permalink / raw)
  To: Asias He; +Cc: linux-kernel, Michael S. Tsirkin, kvm, virtualization

Asias He <asias@redhat.com> writes:

> On 07/18/2012 03:10 AM, Jeff Moyer wrote:
>> Asias He <asias@redhat.com> writes:
>>
>>> vhost-blk is an in-kernel virtio-blk device accelerator.
>>>
>>> This patch is based on Liu Yuan's implementation with various
>>> improvements and bug fixes. Notably, this patch makes guest notification
>>> and host completion processing run in parallel, which gives about a 60%
>>> performance improvement over Liu Yuan's implementation.
>>
>> So, first off, some basic questions.  Is it correct to assume that you
>> tested this with buffered I/O (files opened *without* O_DIRECT)?
>>  I'm pretty sure that if you used O_DIRECT, you'd run into problems (which
>> are solved by the patch set posted by Shaggy, based on Zach Brown's work
>> of many moons ago).  Note that, with buffered I/O, the submission path
>> is NOT asynchronous.  So, any speedups you've reported are extremely
>> suspect.  ;-)
>
> I always used O_DIRECT to test this patchset, and I mostly used a raw
> block device as the guest image. Is this the reason why I did not hit the
> problem you mentioned? Btw, I have also run this patchset on an image-based
> file and still do not see problems like IO hangs.

Hmm, so do the iovecs passed in point to buffers in userspace?  I
thought they were kernel buffers, which would have blown up in
get_user_pages_fast.

Cheers,
Jeff


* Re: [PATCH RESEND 5/5] vhost-blk: Add vhost-blk support
  2012-07-18 14:31       ` Jeff Moyer
@ 2012-07-18 14:45         ` Asias He
  0 siblings, 0 replies; 22+ messages in thread
From: Asias He @ 2012-07-18 14:45 UTC (permalink / raw)
  To: Jeff Moyer; +Cc: linux-kernel, Michael S. Tsirkin, kvm, virtualization

On 07/18/2012 10:31 PM, Jeff Moyer wrote:
> Asias He <asias@redhat.com> writes:
>
>> On 07/18/2012 03:10 AM, Jeff Moyer wrote:
>>> Asias He <asias@redhat.com> writes:
>>>
>>>> vhost-blk is an in-kernel virtio-blk device accelerator.
>>>>
>>>> This patch is based on Liu Yuan's implementation with various
>>>> improvements and bug fixes. Notably, this patch makes guest notification
>>>> and host completion processing run in parallel, which gives about a 60%
>>>> performance improvement over Liu Yuan's implementation.
>>>
>>> So, first off, some basic questions.  Is it correct to assume that you
>>> tested this with buffered I/O (files opened *without* O_DIRECT)?
>>>   I'm pretty sure that if you used O_DIRECT, you'd run into problems (which
>>> are solved by the patch set posted by Shaggy, based on Zach Brown's work
>>> of many moons ago).  Note that, with buffered I/O, the submission path
>>> is NOT asynchronous.  So, any speedups you've reported are extremely
>>> suspect.  ;-)
>>
>> I always used O_DIRECT to test this patchset, and I mostly used a raw
>> block device as the guest image. Is this the reason why I did not hit the
>> problem you mentioned? Btw, I have also run this patchset on an image-based
>> file and still do not see problems like IO hangs.
>
> Hmm, so do the iovecs passed in point to buffers in userspace?  I
> thought they were kernel buffers, which would have blown up in
> get_user_pages_fast.

Yes. The iovec's passed in point to userspace buffers. ;-)

-- 
Asias




* Re: [PATCH RESEND 5/5] vhost-blk: Add vhost-blk support
  2012-07-13  8:55 ` [PATCH RESEND 5/5] vhost-blk: Add vhost-blk support Asias He
  2012-07-17 19:10   ` Jeff Moyer
@ 2012-07-19 13:05   ` Anthony Liguori
  2012-07-19 13:09     ` Michael S. Tsirkin
  2012-07-19 13:09     ` Michael S. Tsirkin
  1 sibling, 2 replies; 22+ messages in thread
From: Anthony Liguori @ 2012-07-19 13:05 UTC (permalink / raw)
  To: Asias He, linux-kernel; +Cc: Michael S. Tsirkin, kvm, virtualization

Asias He <asias@redhat.com> writes:

> vhost-blk is an in-kernel virtio-blk device accelerator.
>
> This patch is based on Liu Yuan's implementation with various
> improvements and bug fixes. Notably, this patch makes guest notify and
> host completion processing in parallel which gives about 60% performance
> improvement compared to Liu Yuan's implementation.
>
> Performance evaluation:
> -----------------------------
> The comparison is between kvm tool with userspace implementation and kvm
> tool with vhost-blk.
>
> 1) Fio with libaio ioengine on Fusion IO device
> With bio-based IO path, sequential read/write, random read/write
> IOPS boost         : 8.4%, 15.3%, 10.4%, 14.6%
> Latency improvement: 8.5%, 15.4%, 10.4%, 15.1%
>
> 2) Fio with vsync ioengine on Fusion IO device
> With bio-based IO path, sequential read/write, random read/write
> IOPS boost         : 10.5%, 4.8%, 5.2%, 5.6%
> Latency improvement: 11.4%, 5.0%, 5.2%, 5.8%
>
> Cc: Michael S. Tsirkin <mst@redhat.com>
> Cc: linux-kernel@vger.kernel.org
> Cc: kvm@vger.kernel.org
> Cc: virtualization@lists.linux-foundation.org
> Signed-off-by: Asias He <asias@redhat.com>
> ---
>  drivers/vhost/Kconfig  |   10 +
>  drivers/vhost/Makefile |    2 +
>  drivers/vhost/blk.c    |  600 ++++++++++++++++++++++++++++++++++++++++++++++++
>  drivers/vhost/vhost.h  |    5 +
>  include/linux/vhost.h  |    3 +
>  5 files changed, 620 insertions(+)
>  create mode 100644 drivers/vhost/blk.c
>
> diff --git a/drivers/vhost/Kconfig b/drivers/vhost/Kconfig
> index c387067..fa071a8 100644
> --- a/drivers/vhost/Kconfig
> +++ b/drivers/vhost/Kconfig
> @@ -16,4 +16,14 @@ config VHOST_NET
>  
>  	  To compile this driver as a module, choose M here: the module will
>  	  be called vhost_net.
> +config VHOST_BLK
> +	tristate "Host kernel accelerator for virtio blk (EXPERIMENTAL)"
> +	depends on VHOST && BLOCK && AIO && EVENTFD && EXPERIMENTAL
> +	---help---
> +	  This kernel module can be loaded in host kernel to accelerate
> +	  guest block with virtio_blk. Not to be confused with virtio_blk
> +	  module itself which needs to be loaded in guest kernel.
> +
> +	  To compile this driver as a module, choose M here: the module will
> +	  be called vhost_blk.
>  
> diff --git a/drivers/vhost/Makefile b/drivers/vhost/Makefile
> index cd36885..aa461d5 100644
> --- a/drivers/vhost/Makefile
> +++ b/drivers/vhost/Makefile
> @@ -1,4 +1,6 @@
>  obj-$(CONFIG_VHOST)	+= vhost.o
>  obj-$(CONFIG_VHOST_NET) += vhost_net.o
> +obj-$(CONFIG_VHOST_BLK) += vhost_blk.o
>  
>  vhost_net-y		:= net.o
> +vhost_blk-y		:= blk.o
> diff --git a/drivers/vhost/blk.c b/drivers/vhost/blk.c
> new file mode 100644
> index 0000000..6a94894
> --- /dev/null
> +++ b/drivers/vhost/blk.c
> @@ -0,0 +1,600 @@
> +/*
> + * Copyright (C) 2011 Taobao, Inc.
> + * Author: Liu Yuan <tailai.ly@taobao.com>
> + *
> + * Copyright (C) 2012 Red Hat, Inc.
> + * Author: Asias He <asias@redhat.com>
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2.
> + *
> + * virtio-blk server in host kernel.
> + */
> +
> +#include <linux/miscdevice.h>
> +#include <linux/module.h>
> +#include <linux/vhost.h>
> +#include <linux/virtio_blk.h>
> +#include <linux/eventfd.h>
> +#include <linux/mutex.h>
> +#include <linux/file.h>
> +#include <linux/mmu_context.h>
> +#include <linux/anon_inodes.h>
> +#include <linux/kthread.h>
> +#include <linux/blkdev.h>
> +
> +#include "vhost.h"
> +
> +#define BLK_HDR	0
> +
> +enum {
> +	VHOST_BLK_VQ_REQ = 0,
> +	VHOST_BLK_VQ_MAX = 1,
> +};
> +
> +struct vhost_blk_req {
> +	u16 head;
> +	u8 *status;
> +};
> +
> +struct vhost_blk {
> +	struct task_struct *worker_host_kick;
> +	struct task_struct *worker;
> +	struct vhost_blk_req *reqs;
> +	struct vhost_virtqueue vq;
> +	struct eventfd_ctx *ectx;
> +	struct io_event *ioevent;
> +	struct kioctx *ioctx;
> +	struct vhost_dev dev;
> +	struct file *efile;
> +	u64 ioevent_nr;
> +	bool stop;
> +};
> +
> +static inline int vhost_blk_read_events(struct vhost_blk *blk, long nr)
> +{
> +	mm_segment_t old_fs = get_fs();
> +	int ret;
> +
> +	set_fs(KERNEL_DS);
> +	ret = read_events(blk->ioctx, nr, nr, blk->ioevent, NULL);
> +	set_fs(old_fs);
> +
> +	return ret;
> +}
> +
> +static int vhost_blk_setup(struct vhost_blk *blk)
> +{
> +	struct kioctx *ctx;
> +
> +	if (blk->ioctx)
> +		return 0;
> +
> +	blk->ioevent_nr = blk->vq.num;
> +	ctx = ioctx_alloc(blk->ioevent_nr);
> +	if (IS_ERR(ctx)) {
> +		pr_err("Failed to ioctx_alloc");
> +		return PTR_ERR(ctx);
> +	}

Not that it's very likely that ioctx_alloc will fail in practice.
There's a fixed number of events that can be allocated that's currently
0x10000.  If you have a ring queue size of 1024 (which is normal) then
that limits you to 64 vhost-blk devices.

Realistically, I don't think you can do only AIO with vhost-blk because
of this limitation (and many others).  It's necessary to be able to fall
back to a thread pool because AIO cannot be relied upon.

> +	put_ioctx(ctx);
> +	blk->ioctx = ctx;
> +
> +	blk->ioevent = kmalloc(sizeof(struct io_event) * blk->ioevent_nr,
> +			       GFP_KERNEL);
> +	if (!blk->ioevent) {
> +		pr_err("Failed to allocate memory for io_events");
> +		return -ENOMEM;
> +	}
> +
> +	blk->reqs = kmalloc(sizeof(struct vhost_blk_req) * blk->ioevent_nr,
> +			    GFP_KERNEL);
> +	if (!blk->reqs) {
> +		pr_err("Failed to allocate memory for vhost_blk_req");
> +		return -ENOMEM;
> +	}
> +
> +	return 0;
> +}
> +
> +static inline int vhost_blk_set_status(struct vhost_blk *blk, u8 *statusp,
> +				       u8 status)
> +{
> +	if (copy_to_user(statusp, &status, sizeof(status))) {
> +		vq_err(&blk->vq, "Failed to write status\n");
> +		vhost_discard_vq_desc(&blk->vq, 1);
> +		return -EFAULT;
> +	}
> +
> +	return 0;
> +}
> +
> +static void vhost_blk_enable_vq(struct vhost_blk *blk,
> +				struct vhost_virtqueue *vq)
> +{
> +	wake_up_process(blk->worker_host_kick);
> +}
> +
> +static int vhost_blk_io_submit(struct vhost_blk *blk, struct file *file,
> +			       struct vhost_blk_req *req,
> +			       struct iovec *iov, u64 nr_vecs, loff_t offset,
> +			       int opcode)
> +{
> +	struct kioctx *ioctx = blk->ioctx;
> +	mm_segment_t oldfs = get_fs();
> +	struct kiocb_batch batch;
> +	struct blk_plug plug;
> +	struct kiocb *iocb;
> +	int ret;
> +
> +	if (!try_get_ioctx(ioctx)) {
> +		pr_info("Failed to get ioctx");
> +		return -EAGAIN;
> +	}
> +
> +	atomic_long_inc_not_zero(&file->f_count);
> +	eventfd_ctx_get(blk->ectx);
> +
> +	/* TODO: batch to 1 is not good! */
> +	kiocb_batch_init(&batch, 1);
> +	blk_start_plug(&plug);
> +
> +	iocb = aio_get_req(ioctx, &batch);
> +	if (unlikely(!iocb)) {
> +		ret = -EAGAIN;
> +		goto out;
> +	}
> +
> +	iocb->ki_filp	= file;
> +	iocb->ki_pos	= offset;
> +	iocb->ki_buf	= (void *)iov;
> +	iocb->ki_left	= nr_vecs;
> +	iocb->ki_nbytes	= nr_vecs;
> +	iocb->ki_opcode	= opcode;
> +	iocb->ki_obj.user = req;
> +	iocb->ki_eventfd  = blk->ectx;
> +
> +	set_fs(KERNEL_DS);
> +	ret = aio_setup_iocb(iocb, false);
> +	set_fs(oldfs);
> +	if (unlikely(ret))
> +		goto out_put_iocb;
> +
> +	spin_lock_irq(&ioctx->ctx_lock);
> +	if (unlikely(ioctx->dead)) {
> +		spin_unlock_irq(&ioctx->ctx_lock);
> +		ret = -EINVAL;
> +		goto out_put_iocb;
> +	}
> +	aio_run_iocb(iocb);
> +	spin_unlock_irq(&ioctx->ctx_lock);
> +
> +	aio_put_req(iocb);
> +
> +	blk_finish_plug(&plug);
> +	kiocb_batch_free(ioctx, &batch);
> +	put_ioctx(ioctx);
> +
> +	return ret;
> +out_put_iocb:
> +	aio_put_req(iocb); /* Drop extra ref to req */
> +	aio_put_req(iocb); /* Drop I/O ref to req */
> +out:
> +	put_ioctx(ioctx);
> +	return ret;
> +}
> +
> +static void vhost_blk_flush(struct vhost_blk *blk)
> +{
> +	vhost_poll_flush(&blk->vq.poll);
> +}
> +
> +static struct file *vhost_blk_stop_vq(struct vhost_blk *blk,
> +				      struct vhost_virtqueue *vq)
> +{
> +	struct file *file;
> +
> +	mutex_lock(&vq->mutex);
> +	file = rcu_dereference_protected(vq->private_data,
> +			lockdep_is_held(&vq->mutex));
> +	rcu_assign_pointer(vq->private_data, NULL);
> +	mutex_unlock(&vq->mutex);
> +
> +	return file;
> +
> +}
> +
> +static inline void vhost_blk_stop(struct vhost_blk *blk, struct file **file)
> +{
> +
> +	*file = vhost_blk_stop_vq(blk, &blk->vq);
> +}
> +
> +/* Handle guest request */
> +static int vhost_blk_do_req(struct vhost_virtqueue *vq,
> +			    struct virtio_blk_outhdr *hdr,
> +			    u16 head, u16 out, u16 in,
> +			    struct file *file)
> +{
> +	struct vhost_blk *blk = container_of(vq->dev, struct vhost_blk, dev);
> +	struct iovec *iov = &vq->iov[BLK_HDR + 1];
> +	loff_t offset = hdr->sector << 9;
> +	struct vhost_blk_req *req;
> +	u64 nr_vecs;
> +	int ret = 0;
> +	u8 status;
> +
> +	if (hdr->type == VIRTIO_BLK_T_IN || hdr->type == VIRTIO_BLK_T_GET_ID)
> +		nr_vecs = in - 1;
> +	else
> +		nr_vecs = out - 1;
> +
> +	req		= &blk->reqs[head];
> +	req->head	= head;
> +	req->status	= blk->vq.iov[nr_vecs + 1].iov_base;
> +
> +	switch (hdr->type) {
> +	case VIRTIO_BLK_T_OUT:
> +		ret = vhost_blk_io_submit(blk, file, req, iov, nr_vecs, offset,
> +					  IOCB_CMD_PWRITEV);
> +		break;
> +	case VIRTIO_BLK_T_IN:
> +		ret = vhost_blk_io_submit(blk, file, req, iov, nr_vecs, offset,
> +					  IOCB_CMD_PREADV);
> +		break;
> +	case VIRTIO_BLK_T_FLUSH:
> +		ret = vfs_fsync(file, 1);
> +		status = ret < 0 ? VIRTIO_BLK_S_IOERR : VIRTIO_BLK_S_OK;
> +		ret = vhost_blk_set_status(blk, req->status, status);
> +		if (!ret)
> +			vhost_add_used_and_signal(&blk->dev, vq, head, ret);
> +		break;
> +	case VIRTIO_BLK_T_GET_ID:
> +		/* TODO: need a real ID string */
> +		ret = snprintf(vq->iov[BLK_HDR + 1].iov_base,
> +			       VIRTIO_BLK_ID_BYTES, "VHOST-BLK-DISK");
> +		status = ret < 0 ? VIRTIO_BLK_S_IOERR : VIRTIO_BLK_S_OK;
> +		ret = vhost_blk_set_status(blk, req->status, status);
> +		if (!ret)
> +			vhost_add_used_and_signal(&blk->dev, vq, head,
> +						  VIRTIO_BLK_ID_BYTES);
> +		break;
> +	default:
> +		pr_warn("Unsupported request type %d\n", hdr->type);
> +		vhost_discard_vq_desc(vq, 1);
> +		ret = -EFAULT;
> +		break;
> +	}

There doesn't appear to be any error handling in the event that
vhost_blk_io_submit fails.  It would appear that you leak the ring queue
entry since you never push anything onto the used queue.

I think you need to handle io_submit() failing too with EAGAIN.  Relying
on min nr_events == queue_size seems like a dangerous assumption to me
particularly since queue_size tends to be very large and max_nr_events
is a fixed size global pool.

To properly handle EAGAIN, you effectively need to implement flow
control and back off reading the virtqueue until you can submit requests again.

Of course, the million dollar question is why would using AIO in the
kernel be faster than using AIO in userspace?

When using eventfd, there is no heavy weight exit on the notify path.
It should just be the difference between scheduling a kernel thread vs
scheduling a userspace thread.  There's simply no way that that's a 60%
difference in performance.

Regards,

Anthony Liguori


* Re: [PATCH RESEND 5/5] vhost-blk: Add vhost-blk support
  2012-07-19 13:05   ` Anthony Liguori
@ 2012-07-19 13:09     ` Michael S. Tsirkin
  2012-07-19 13:09     ` Michael S. Tsirkin
  1 sibling, 0 replies; 22+ messages in thread
From: Michael S. Tsirkin @ 2012-07-19 13:09 UTC (permalink / raw)
  To: Anthony Liguori; +Cc: Asias He, linux-kernel, kvm, virtualization

On Thu, Jul 19, 2012 at 08:05:42AM -0500, Anthony Liguori wrote:
> Asias He <asias@redhat.com> writes:
> 
> > vhost-blk is an in-kernel virtio-blk device accelerator.
> >
> > This patch is based on Liu Yuan's implementation with various
> > improvements and bug fixes. Notably, this patch makes guest notify and
> > host completion processing in parallel which gives about 60% performance
> > improvement compared to Liu Yuan's implementation.
> >
> > Performance evaluation:
> > -----------------------------
> > The comparison is between kvm tool with userspace implementation and kvm
> > tool with vhost-blk.
> >
> > 1) Fio with libaio ioengine on Fusion IO device
> > With bio-based IO path, sequential read/write, random read/write
> > IOPS boost         : 8.4%, 15.3%, 10.4%, 14.6%
> > Latency improvement: 8.5%, 15.4%, 10.4%, 15.1%
> >
> > 2) Fio with vsync ioengine on Fusion IO device
> > With bio-based IO path, sequential read/write, random read/write
> > IOPS boost         : 10.5%, 4.8%, 5.2%, 5.6%
> > Latency improvement: 11.4%, 5.0%, 5.2%, 5.8%
> >
> > Cc: Michael S. Tsirkin <mst@redhat.com>
> > Cc: linux-kernel@vger.kernel.org
> > Cc: kvm@vger.kernel.org
> > Cc: virtualization@lists.linux-foundation.org
> > Signed-off-by: Asias He <asias@redhat.com>
> > ---
> >  drivers/vhost/Kconfig  |   10 +
> >  drivers/vhost/Makefile |    2 +
> >  drivers/vhost/blk.c    |  600 ++++++++++++++++++++++++++++++++++++++++++++++++
> >  drivers/vhost/vhost.h  |    5 +
> >  include/linux/vhost.h  |    3 +
> >  5 files changed, 620 insertions(+)
> >  create mode 100644 drivers/vhost/blk.c
> >
> > diff --git a/drivers/vhost/Kconfig b/drivers/vhost/Kconfig
> > index c387067..fa071a8 100644
> > --- a/drivers/vhost/Kconfig
> > +++ b/drivers/vhost/Kconfig
> > @@ -16,4 +16,14 @@ config VHOST_NET
> >  
> >  	  To compile this driver as a module, choose M here: the module will
> >  	  be called vhost_net.
> > +config VHOST_BLK
> > +	tristate "Host kernel accelerator for virtio blk (EXPERIMENTAL)"
> > +	depends on VHOST && BLOCK && AIO && EVENTFD && EXPERIMENTAL
> > +	---help---
> > +	  This kernel module can be loaded in host kernel to accelerate
> > +	  guest block with virtio_blk. Not to be confused with virtio_blk
> > +	  module itself which needs to be loaded in guest kernel.
> > +
> > +	  To compile this driver as a module, choose M here: the module will
> > +	  be called vhost_blk.
> >  
> > diff --git a/drivers/vhost/Makefile b/drivers/vhost/Makefile
> > index cd36885..aa461d5 100644
> > --- a/drivers/vhost/Makefile
> > +++ b/drivers/vhost/Makefile
> > @@ -1,4 +1,6 @@
> >  obj-$(CONFIG_VHOST)	+= vhost.o
> >  obj-$(CONFIG_VHOST_NET) += vhost_net.o
> > +obj-$(CONFIG_VHOST_BLK) += vhost_blk.o
> >  
> >  vhost_net-y		:= net.o
> > +vhost_blk-y		:= blk.o
> > diff --git a/drivers/vhost/blk.c b/drivers/vhost/blk.c
> > new file mode 100644
> > index 0000000..6a94894
> > --- /dev/null
> > +++ b/drivers/vhost/blk.c
> > @@ -0,0 +1,600 @@
> > +/*
> > + * Copyright (C) 2011 Taobao, Inc.
> > + * Author: Liu Yuan <tailai.ly@taobao.com>
> > + *
> > + * Copyright (C) 2012 Red Hat, Inc.
> > + * Author: Asias He <asias@redhat.com>
> > + *
> > + * This work is licensed under the terms of the GNU GPL, version 2.
> > + *
> > + * virtio-blk server in host kernel.
> > + */
> > +
> > +#include <linux/miscdevice.h>
> > +#include <linux/module.h>
> > +#include <linux/vhost.h>
> > +#include <linux/virtio_blk.h>
> > +#include <linux/eventfd.h>
> > +#include <linux/mutex.h>
> > +#include <linux/file.h>
> > +#include <linux/mmu_context.h>
> > +#include <linux/anon_inodes.h>
> > +#include <linux/kthread.h>
> > +#include <linux/blkdev.h>
> > +
> > +#include "vhost.h"
> > +
> > +#define BLK_HDR	0
> > +
> > +enum {
> > +	VHOST_BLK_VQ_REQ = 0,
> > +	VHOST_BLK_VQ_MAX = 1,
> > +};
> > +
> > +struct vhost_blk_req {
> > +	u16 head;
> > +	u8 *status;
> > +};
> > +
> > +struct vhost_blk {
> > +	struct task_struct *worker_host_kick;
> > +	struct task_struct *worker;
> > +	struct vhost_blk_req *reqs;
> > +	struct vhost_virtqueue vq;
> > +	struct eventfd_ctx *ectx;
> > +	struct io_event *ioevent;
> > +	struct kioctx *ioctx;
> > +	struct vhost_dev dev;
> > +	struct file *efile;
> > +	u64 ioevent_nr;
> > +	bool stop;
> > +};
> > +
> > +static inline int vhost_blk_read_events(struct vhost_blk *blk, long nr)
> > +{
> > +	mm_segment_t old_fs = get_fs();
> > +	int ret;
> > +
> > +	set_fs(KERNEL_DS);
> > +	ret = read_events(blk->ioctx, nr, nr, blk->ioevent, NULL);
> > +	set_fs(old_fs);
> > +
> > +	return ret;
> > +}
> > +
> > +static int vhost_blk_setup(struct vhost_blk *blk)
> > +{
> > +	struct kioctx *ctx;
> > +
> > +	if (blk->ioctx)
> > +		return 0;
> > +
> > +	blk->ioevent_nr = blk->vq.num;
> > +	ctx = ioctx_alloc(blk->ioevent_nr);
> > +	if (IS_ERR(ctx)) {
> > +		pr_err("Failed to ioctx_alloc");
> > +		return PTR_ERR(ctx);
> > +	}
> 
> Not that it's very likely that ioctx_alloc will fail in practice.
> There's a fixed number of events that can be allocated that's currently
> 0x10000.  If you have a ring queue size of 1024 (which is normal) then
> that limits you to 64 vhost-blk devices.
> 
> Realistically, I don't think you can do only AIO with vhost-blk because
> of this limitation (and many others).  It's necessary to be able to fall
> back to a thread pool because AIO cannot be relied upon.
> 
> > +	put_ioctx(ctx);
> > +	blk->ioctx = ctx;
> > +
> > +	blk->ioevent = kmalloc(sizeof(struct io_event) * blk->ioevent_nr,
> > +			       GFP_KERNEL);
> > +	if (!blk->ioevent) {
> > +		pr_err("Failed to allocate memory for io_events");
> > +		return -ENOMEM;
> > +	}
> > +
> > +	blk->reqs = kmalloc(sizeof(struct vhost_blk_req) * blk->ioevent_nr,
> > +			    GFP_KERNEL);
> > +	if (!blk->reqs) {
> > +		pr_err("Failed to allocate memory for vhost_blk_req");
> > +		return -ENOMEM;
> > +	}
> > +
> > +	return 0;
> > +}
> > +
> > +static inline int vhost_blk_set_status(struct vhost_blk *blk, u8 *statusp,
> > +				       u8 status)
> > +{
> > +	if (copy_to_user(statusp, &status, sizeof(status))) {
> > +		vq_err(&blk->vq, "Failed to write status\n");
> > +		vhost_discard_vq_desc(&blk->vq, 1);
> > +		return -EFAULT;
> > +	}
> > +
> > +	return 0;
> > +}
> > +
> > +static void vhost_blk_enable_vq(struct vhost_blk *blk,
> > +				struct vhost_virtqueue *vq)
> > +{
> > +	wake_up_process(blk->worker_host_kick);
> > +}
> > +
> > +static int vhost_blk_io_submit(struct vhost_blk *blk, struct file *file,
> > +			       struct vhost_blk_req *req,
> > +			       struct iovec *iov, u64 nr_vecs, loff_t offset,
> > +			       int opcode)
> > +{
> > +	struct kioctx *ioctx = blk->ioctx;
> > +	mm_segment_t oldfs = get_fs();
> > +	struct kiocb_batch batch;
> > +	struct blk_plug plug;
> > +	struct kiocb *iocb;
> > +	int ret;
> > +
> > +	if (!try_get_ioctx(ioctx)) {
> > +		pr_info("Failed to get ioctx");
> > +		return -EAGAIN;
> > +	}
> > +
> > +	atomic_long_inc_not_zero(&file->f_count);
> > +	eventfd_ctx_get(blk->ectx);
> > +
> > +	/* TODO: batch to 1 is not good! */
> > +	kiocb_batch_init(&batch, 1);
> > +	blk_start_plug(&plug);
> > +
> > +	iocb = aio_get_req(ioctx, &batch);
> > +	if (unlikely(!iocb)) {
> > +		ret = -EAGAIN;
> > +		goto out;
> > +	}
> > +
> > +	iocb->ki_filp	= file;
> > +	iocb->ki_pos	= offset;
> > +	iocb->ki_buf	= (void *)iov;
> > +	iocb->ki_left	= nr_vecs;
> > +	iocb->ki_nbytes	= nr_vecs;
> > +	iocb->ki_opcode	= opcode;
> > +	iocb->ki_obj.user = req;
> > +	iocb->ki_eventfd  = blk->ectx;
> > +
> > +	set_fs(KERNEL_DS);
> > +	ret = aio_setup_iocb(iocb, false);
> > +	set_fs(oldfs);
> > +	if (unlikely(ret))
> > +		goto out_put_iocb;
> > +
> > +	spin_lock_irq(&ioctx->ctx_lock);
> > +	if (unlikely(ioctx->dead)) {
> > +		spin_unlock_irq(&ioctx->ctx_lock);
> > +		ret = -EINVAL;
> > +		goto out_put_iocb;
> > +	}
> > +	aio_run_iocb(iocb);
> > +	spin_unlock_irq(&ioctx->ctx_lock);
> > +
> > +	aio_put_req(iocb);
> > +
> > +	blk_finish_plug(&plug);
> > +	kiocb_batch_free(ioctx, &batch);
> > +	put_ioctx(ioctx);
> > +
> > +	return ret;
> > +out_put_iocb:
> > +	aio_put_req(iocb); /* Drop extra ref to req */
> > +	aio_put_req(iocb); /* Drop I/O ref to req */
> > +out:
> > +	put_ioctx(ioctx);
> > +	return ret;
> > +}
> > +
> > +static void vhost_blk_flush(struct vhost_blk *blk)
> > +{
> > +	vhost_poll_flush(&blk->vq.poll);
> > +}
> > +
> > +static struct file *vhost_blk_stop_vq(struct vhost_blk *blk,
> > +				      struct vhost_virtqueue *vq)
> > +{
> > +	struct file *file;
> > +
> > +	mutex_lock(&vq->mutex);
> > +	file = rcu_dereference_protected(vq->private_data,
> > +			lockdep_is_held(&vq->mutex));
> > +	rcu_assign_pointer(vq->private_data, NULL);
> > +	mutex_unlock(&vq->mutex);
> > +
> > +	return file;
> > +
> > +}
> > +
> > +static inline void vhost_blk_stop(struct vhost_blk *blk, struct file **file)
> > +{
> > +
> > +	*file = vhost_blk_stop_vq(blk, &blk->vq);
> > +}
> > +
> > +/* Handle guest request */
> > +static int vhost_blk_do_req(struct vhost_virtqueue *vq,
> > +			    struct virtio_blk_outhdr *hdr,
> > +			    u16 head, u16 out, u16 in,
> > +			    struct file *file)
> > +{
> > +	struct vhost_blk *blk = container_of(vq->dev, struct vhost_blk, dev);
> > +	struct iovec *iov = &vq->iov[BLK_HDR + 1];
> > +	loff_t offset = hdr->sector << 9;
> > +	struct vhost_blk_req *req;
> > +	u64 nr_vecs;
> > +	int ret = 0;
> > +	u8 status;
> > +
> > +	if (hdr->type == VIRTIO_BLK_T_IN || hdr->type == VIRTIO_BLK_T_GET_ID)
> > +		nr_vecs = in - 1;
> > +	else
> > +		nr_vecs = out - 1;
> > +
> > +	req		= &blk->reqs[head];
> > +	req->head	= head;
> > +	req->status	= blk->vq.iov[nr_vecs + 1].iov_base;
> > +
> > +	switch (hdr->type) {
> > +	case VIRTIO_BLK_T_OUT:
> > +		ret = vhost_blk_io_submit(blk, file, req, iov, nr_vecs, offset,
> > +					  IOCB_CMD_PWRITEV);
> > +		break;
> > +	case VIRTIO_BLK_T_IN:
> > +		ret = vhost_blk_io_submit(blk, file, req, iov, nr_vecs, offset,
> > +					  IOCB_CMD_PREADV);
> > +		break;
> > +	case VIRTIO_BLK_T_FLUSH:
> > +		ret = vfs_fsync(file, 1);
> > +		status = ret < 0 ? VIRTIO_BLK_S_IOERR : VIRTIO_BLK_S_OK;
> > +		ret = vhost_blk_set_status(blk, req->status, status);
> > +		if (!ret)
> > +			vhost_add_used_and_signal(&blk->dev, vq, head, ret);
> > +		break;
> > +	case VIRTIO_BLK_T_GET_ID:
> > +		/* TODO: need a real ID string */
> > +		ret = snprintf(vq->iov[BLK_HDR + 1].iov_base,
> > +			       VIRTIO_BLK_ID_BYTES, "VHOST-BLK-DISK");
> > +		status = ret < 0 ? VIRTIO_BLK_S_IOERR : VIRTIO_BLK_S_OK;
> > +		ret = vhost_blk_set_status(blk, req->status, status);
> > +		if (!ret)
> > +			vhost_add_used_and_signal(&blk->dev, vq, head,
> > +						  VIRTIO_BLK_ID_BYTES);
> > +		break;
> > +	default:
> > +		pr_warn("Unsupported request type %d\n", hdr->type);
> > +		vhost_discard_vq_desc(vq, 1);
> > +		ret = -EFAULT;
> > +		break;
> > +	}
> 
> There doesn't appear to be any error handling in the event that
> vhost_blk_io_submit fails.  It would appear that you leak the ring queue
> entry since you never push anything onto the used queue.
> 
> I think you need to handle io_submit() failing too with EAGAIN.  Relying
> on min nr_events == queue_size seems like a dangerous assumption to me
> particularly since queue_size tends to be very large and max_nr_events
> is a fixed size global pool.
> 
> To properly handle EAGAIN, you effectively need to implement flow
> control and back off reading the virtqueue until you can submit requests again.
> 
> Of course, the million dollar question is why would using AIO in the
> kernel be faster than using AIO in userspace?
> 
> When using eventfd, there is no heavy weight exit on the notify path.
> It should just be the difference between scheduling a kernel thread vs
> scheduling a userspace thread.

Actually AIO from userspace involves a kernel thread too, doesn't it?

>  There's simply no way that that's a 60%
> difference in performance.
> 
> Regards,
> 
> Anthony Liguori


* Re: [PATCH RESEND 5/5] vhost-blk: Add vhost-blk support
  2012-07-19 13:05   ` Anthony Liguori
  2012-07-19 13:09     ` Michael S. Tsirkin
@ 2012-07-19 13:09     ` Michael S. Tsirkin
  2012-07-20 10:31       ` Stefan Hajnoczi
  2012-07-20 20:56       ` Anthony Liguori
  1 sibling, 2 replies; 22+ messages in thread
From: Michael S. Tsirkin @ 2012-07-19 13:09 UTC (permalink / raw)
  To: Anthony Liguori; +Cc: Asias He, linux-kernel, kvm, virtualization

On Thu, Jul 19, 2012 at 08:05:42AM -0500, Anthony Liguori wrote:
> Of course, the million dollar question is why would using AIO in the
> kernel be faster than using AIO in userspace?

Actually for me a more important question is how does it compare
with virtio-blk dataplane?

-- 
MST


* Re: [PATCH RESEND 5/5] vhost-blk: Add vhost-blk support
  2012-07-19 13:09     ` Michael S. Tsirkin
@ 2012-07-20 10:31       ` Stefan Hajnoczi
  2012-07-20 20:56       ` Anthony Liguori
  1 sibling, 0 replies; 22+ messages in thread
From: Stefan Hajnoczi @ 2012-07-20 10:31 UTC (permalink / raw)
  To: Khoa Huynh
  Cc: Anthony Liguori, linux-kernel, kvm, virtualization, Michael S. Tsirkin

On Thu, Jul 19, 2012 at 2:09 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
> On Thu, Jul 19, 2012 at 08:05:42AM -0500, Anthony Liguori wrote:
>> Of course, the million dollar question is why would using AIO in the
>> kernel be faster than using AIO in userspace?
>
> Actually for me a more important question is how does it compare
> with virtio-blk dataplane?

Hi Khoa,
I think you have results of data-plane and vhost-blk?  Is the
vhost-blk version identical to Asias' recent patches?

Stefan


* Re: [PATCH RESEND 0/5] Add vhost-blk support
  2012-07-13  8:55 [PATCH RESEND 0/5] Add vhost-blk support Asias He
                   ` (6 preceding siblings ...)
  2012-07-17 15:09 ` Michael S. Tsirkin
@ 2012-07-20 19:30 ` Michael S. Tsirkin
  7 siblings, 0 replies; 22+ messages in thread
From: Michael S. Tsirkin @ 2012-07-20 19:30 UTC (permalink / raw)
  To: Asias He
  Cc: linux-kernel, Alexander Viro, Benjamin LaHaise, James Bottomley,
	Jeff Moyer, kvm, linux-aio, linux-fsdevel, virtualization,
	gregkh

On Fri, Jul 13, 2012 at 04:55:06PM +0800, Asias He wrote:
> 
> Hi folks,
> 
> [I am resending to fix the broken thread in the previous one.]
> 
> This patchset adds vhost-blk support. vhost-blk is an in-kernel virtio-blk
> device accelerator. Compared to userspace virtio-blk implementation, vhost-blk
> gives about 5% to 15% performance improvement.
> 
> Asias He (5):
>   aio: Export symbols and struct kiocb_batch for in kernel aio usage
>   eventfd: Export symbol eventfd_file_create()
>   vhost: Make vhost a separate module
>   vhost-net: Use VHOST_NET_FEATURES for vhost-net
>   vhost-blk: Add vhost-blk support


OK so given the state it's in, and assuming you think it is helpful
to let it mature in tree and not out of tree, I think it's
reasonable to try to do it like tcm_vhost is going to do it:
- send me changes to vhost core ASAP (and keep it minimal, e.g.
  use your own header file to export to userspace)
- for other stuff - put in drivers/staging, and ask Greg to merge

-- 
MST


* Re: [PATCH RESEND 5/5] vhost-blk: Add vhost-blk support
  2012-07-19 13:09     ` Michael S. Tsirkin
  2012-07-20 10:31       ` Stefan Hajnoczi
@ 2012-07-20 20:56       ` Anthony Liguori
  2012-07-21  1:07         ` Asias He
  1 sibling, 1 reply; 22+ messages in thread
From: Anthony Liguori @ 2012-07-20 20:56 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: Asias He, linux-kernel, kvm, virtualization

"Michael S. Tsirkin" <mst@redhat.com> writes:

> On Thu, Jul 19, 2012 at 08:05:42AM -0500, Anthony Liguori wrote:
>> Of course, the million dollar question is why would using AIO in the
>> kernel be faster than using AIO in userspace?
>
> Actually for me a more important question is how does it compare
> with virtio-blk dataplane?

I'm not even asking for a benchmark comparison.  It's the same API
being called from a kernel thread vs. a userspace thread.  Why would
there be a 60% performance difference between the two?  That doesn't
make any sense.

There's got to be a better justification for putting this in the kernel
than just that we can.

I completely understand why Christoph's suggestion of submitting BIOs
directly would be faster.  There's no way to do that in userspace.

Regards,

Anthony Liguori

>
> -- 
> MST


* Re: [PATCH RESEND 5/5] vhost-blk: Add vhost-blk support
  2012-07-20 20:56       ` Anthony Liguori
@ 2012-07-21  1:07         ` Asias He
  0 siblings, 0 replies; 22+ messages in thread
From: Asias He @ 2012-07-21  1:07 UTC (permalink / raw)
  To: Anthony Liguori; +Cc: Michael S. Tsirkin, linux-kernel, kvm, virtualization

On 07/21/2012 04:56 AM, Anthony Liguori wrote:
> "Michael S. Tsirkin" <mst@redhat.com> writes:
>
>> On Thu, Jul 19, 2012 at 08:05:42AM -0500, Anthony Liguori wrote:
>>> Of course, the million dollar question is why would using AIO in the
>>> kernel be faster than using AIO in userspace?
>>
>> Actually for me a more important question is how does it compare
>> with virtio-blk dataplane?
>
> I'm not even asking for a benchmark comparision.  It's the same API
> being called from a kernel thread vs. a userspace thread.  Why would
> there be a 60% performance difference between the two?  That doesn't
> make any sense.

Please read the commit log again. I am not saying vhost-blk vs. the
userspace implementation gives a 60% improvement. I am saying this
vhost-blk vs. the original vhost-blk gives a 60% improvement.


"""
This patch is based on Liu Yuan's implementation with various
improvements and bug fixes. Notably, this patch makes guest notify and
host completion processing in parallel which gives about 60% performance
improvement compared to Liu Yuan's implementation.
"""

>
> There's got to be a better justification for putting this in the kernel
> than just that we can.
>
> I completely understand why Christoph's suggestion of submitting BIOs
> directly would be faster.  There's no way to do that in userspace.

Well, with Zach and Dave's new in-kernel AIO API, AIO usage in the
kernel is much simpler than in userspace. That is one potential reason
the in-kernel implementation beats the userspace one; I am working on
it right now. And for block-device-based images, as Christoph suggested,
we can submit bios directly. That is another potential reason.

Why can't we go further and see whether we can improve the IO stack
from the guest kernel side all the way down to the host kernel side?
We cannot do that if we stick to doing everything in userspace (qemu).

-- 
Asias




end of thread, other threads:[~2012-07-21  1:06 UTC | newest]

Thread overview: 22+ messages
2012-07-13  8:55 [PATCH RESEND 0/5] Add vhost-blk support Asias He
2012-07-13  8:55 ` [PATCH RESEND 1/5] aio: Export symbols and struct kiocb_batch for in kernel aio usage Asias He
2012-07-13  8:55 ` [PATCH RESEND 2/5] eventfd: Export symbol eventfd_file_create() Asias He
2012-07-13  8:55 ` [PATCH RESEND 3/5] vhost: Make vhost a separate module Asias He
2012-07-13  8:55 ` [PATCH RESEND 4/5] vhost-net: Use VHOST_NET_FEATURES for vhost-net Asias He
2012-07-13  8:55 ` [PATCH RESEND 5/5] vhost-blk: Add vhost-blk support Asias He
2012-07-17 19:10   ` Jeff Moyer
2012-07-18  1:22     ` Asias He
2012-07-18 14:31       ` Jeff Moyer
2012-07-18 14:45         ` Asias He
2012-07-19 13:05   ` Anthony Liguori
2012-07-19 13:09     ` Michael S. Tsirkin
2012-07-19 13:09     ` Michael S. Tsirkin
2012-07-20 10:31       ` Stefan Hajnoczi
2012-07-20 20:56       ` Anthony Liguori
2012-07-21  1:07         ` Asias He
2012-07-14  7:49 ` [PATCH RESEND 0/5] " Christoph Hellwig
2012-07-16  9:05   ` Asias He
2012-07-17 15:09 ` Michael S. Tsirkin
2012-07-18  2:09   ` Asias He
2012-07-18 11:42   ` Stefan Hajnoczi
2012-07-20 19:30 ` Michael S. Tsirkin
