linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH V4 0/3] Improve virtio-blk performance
@ 2012-07-28  2:21 Asias He
  2012-07-28  2:21 ` [PATCH V4 1/3] block: Introduce __blk_segment_map_sg() helper Asias He
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Asias He @ 2012-07-28  2:21 UTC (permalink / raw)
  To: linux-kernel
  Cc: Jens Axboe, kvm, Michael S. Tsirkin, Rusty Russell, Shaohua Li,
	Tejun Heo, virtualization

Hi, Jens & Rusty

This version is rebased against linux-next which resolves the conflict with
Paolo Bonzini's 'virtio-blk: allow toggling host cache between writeback and
writethrough' patch.

Patch 1/3 and 2/3 applies on linus's master as well. Since Rusty will pick up
patch 3/3 so the changes to block core (adding blk_bio_map_sg()) will have a
user.

Jens, could you please consider picking up the dependencies 1/3 and 2/3 in your
tree. Thanks!

This patchset implements bio-based IO path for virito-blk to improve
performance.

Fio test shows bio-based IO path gives the following performance improvement:

1) Ramdisk device
     With bio-based IO path, sequential read/write, random read/write
     IOPS boost         : 28%, 24%, 21%, 16%
     Latency improvement: 32%, 17%, 21%, 16%
2) Fusion IO device
     With bio-based IO path, sequential read/write, random read/write
     IOPS boost         : 11%, 11%, 13%, 10%
     Latency improvement: 10%, 10%, 12%, 10%

Asias He (3):
  block: Introduce __blk_segment_map_sg() helper
  block: Add blk_bio_map_sg() helper
  virtio-blk: Add bio-based IO path for virtio-blk

 block/blk-merge.c          |  117 +++++++++++++++++--------
 drivers/block/virtio_blk.c |  203 +++++++++++++++++++++++++++++++++++---------
 include/linux/blkdev.h     |    2 +
 3 files changed, 247 insertions(+), 75 deletions(-)

-- 
1.7.10.4


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH V4 1/3] block: Introduce __blk_segment_map_sg() helper
  2012-07-28  2:21 [PATCH V4 0/3] Improve virtio-blk performance Asias He
@ 2012-07-28  2:21 ` Asias He
  2012-07-28  2:21 ` [PATCH V4 2/3] block: Add blk_bio_map_sg() helper Asias He
  2012-07-28  2:21 ` [PATCH V4 3/3] virtio-blk: Add bio-based IO path for virtio-blk Asias He
  2 siblings, 0 replies; 10+ messages in thread
From: Asias He @ 2012-07-28  2:21 UTC (permalink / raw)
  To: linux-kernel; +Cc: Jens Axboe, Tejun Heo, Shaohua Li, Rusty Russell

Split the mapping code in blk_rq_map_sg() to a helper
__blk_segment_map_sg(), so that other mapping function, e.g.
blk_bio_map_sg(), can share the code.

Cc: Jens Axboe <axboe@kernel.dk>
Cc: Tejun Heo <tj@kernel.org>
Cc: Shaohua Li <shli@kernel.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: linux-kernel@vger.kernel.org
Suggested-by: Tejun Heo <tj@kernel.org>
Suggested-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Asias He <asias@redhat.com>
---
 block/blk-merge.c |   80 ++++++++++++++++++++++++++++++-----------------------
 1 file changed, 45 insertions(+), 35 deletions(-)

diff --git a/block/blk-merge.c b/block/blk-merge.c
index 160035f..576b68e 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -110,6 +110,49 @@ static int blk_phys_contig_segment(struct request_queue *q, struct bio *bio,
 	return 0;
 }
 
+static void
+__blk_segment_map_sg(struct request_queue *q, struct bio_vec *bvec,
+		     struct scatterlist *sglist, struct bio_vec **bvprv,
+		     struct scatterlist **sg, int *nsegs, int *cluster)
+{
+
+	int nbytes = bvec->bv_len;
+
+	if (*bvprv && *cluster) {
+		if ((*sg)->length + nbytes > queue_max_segment_size(q))
+			goto new_segment;
+
+		if (!BIOVEC_PHYS_MERGEABLE(*bvprv, bvec))
+			goto new_segment;
+		if (!BIOVEC_SEG_BOUNDARY(q, *bvprv, bvec))
+			goto new_segment;
+
+		(*sg)->length += nbytes;
+	} else {
+new_segment:
+		if (!*sg)
+			*sg = sglist;
+		else {
+			/*
+			 * If the driver previously mapped a shorter
+			 * list, we could see a termination bit
+			 * prematurely unless it fully inits the sg
+			 * table on each mapping. We KNOW that there
+			 * must be more entries here or the driver
+			 * would be buggy, so force clear the
+			 * termination bit to avoid doing a full
+			 * sg_init_table() in drivers for each command.
+			 */
+			(*sg)->page_link &= ~0x02;
+			*sg = sg_next(*sg);
+		}
+
+		sg_set_page(*sg, bvec->bv_page, nbytes, bvec->bv_offset);
+		(*nsegs)++;
+	}
+	*bvprv = bvec;
+}
+
 /*
  * map a request to scatterlist, return number of sg entries setup. Caller
  * must make sure sg can hold rq->nr_phys_segments entries
@@ -131,41 +174,8 @@ int blk_rq_map_sg(struct request_queue *q, struct request *rq,
 	bvprv = NULL;
 	sg = NULL;
 	rq_for_each_segment(bvec, rq, iter) {
-		int nbytes = bvec->bv_len;
-
-		if (bvprv && cluster) {
-			if (sg->length + nbytes > queue_max_segment_size(q))
-				goto new_segment;
-
-			if (!BIOVEC_PHYS_MERGEABLE(bvprv, bvec))
-				goto new_segment;
-			if (!BIOVEC_SEG_BOUNDARY(q, bvprv, bvec))
-				goto new_segment;
-
-			sg->length += nbytes;
-		} else {
-new_segment:
-			if (!sg)
-				sg = sglist;
-			else {
-				/*
-				 * If the driver previously mapped a shorter
-				 * list, we could see a termination bit
-				 * prematurely unless it fully inits the sg
-				 * table on each mapping. We KNOW that there
-				 * must be more entries here or the driver
-				 * would be buggy, so force clear the
-				 * termination bit to avoid doing a full
-				 * sg_init_table() in drivers for each command.
-				 */
-				sg->page_link &= ~0x02;
-				sg = sg_next(sg);
-			}
-
-			sg_set_page(sg, bvec->bv_page, nbytes, bvec->bv_offset);
-			nsegs++;
-		}
-		bvprv = bvec;
+		__blk_segment_map_sg(q, bvec, sglist, &bvprv, &sg,
+				     &nsegs, &cluster);
 	} /* segments in rq */
 
 
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH V4 2/3] block: Add blk_bio_map_sg() helper
  2012-07-28  2:21 [PATCH V4 0/3] Improve virtio-blk performance Asias He
  2012-07-28  2:21 ` [PATCH V4 1/3] block: Introduce __blk_segment_map_sg() helper Asias He
@ 2012-07-28  2:21 ` Asias He
  2012-07-28  2:21 ` [PATCH V4 3/3] virtio-blk: Add bio-based IO path for virtio-blk Asias He
  2 siblings, 0 replies; 10+ messages in thread
From: Asias He @ 2012-07-28  2:21 UTC (permalink / raw)
  To: linux-kernel
  Cc: Jens Axboe, Tejun Heo, Shaohua Li, Rusty Russell,
	Christoph Hellwig, Minchan Kim

Add a helper to map a bio to a scatterlist, modelled after
blk_rq_map_sg.

This helper is useful for any driver that wants to create
a scatterlist from its ->make_request_fn method.

Changes in v2:
 - Use __blk_segment_map_sg to avoid duplicated code
 - Add cocbook style function comment

Cc: Jens Axboe <axboe@kernel.dk>
Cc: Tejun Heo <tj@kernel.org>
Cc: Shaohua Li <shli@kernel.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Minchan Kim <minchan.kim@gmail.com>
Signed-off-by: Asias He <asias@redhat.com>
---
 block/blk-merge.c      |   37 +++++++++++++++++++++++++++++++++++++
 include/linux/blkdev.h |    2 ++
 2 files changed, 39 insertions(+)

diff --git a/block/blk-merge.c b/block/blk-merge.c
index 576b68e..e76279e 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -209,6 +209,43 @@ int blk_rq_map_sg(struct request_queue *q, struct request *rq,
 }
 EXPORT_SYMBOL(blk_rq_map_sg);
 
+/**
+ * blk_bio_map_sg - map a bio to a scatterlist
+ * @q: request_queue in question
+ * @bio: bio being mapped
+ * @sglist: scatterlist being mapped
+ *
+ * Note:
+ *    Caller must make sure sg can hold bio->bi_phys_segments entries
+ *
+ * Will return the number of sg entries setup
+ */
+int blk_bio_map_sg(struct request_queue *q, struct bio *bio,
+		   struct scatterlist *sglist)
+{
+	struct bio_vec *bvec, *bvprv;
+	struct scatterlist *sg;
+	int nsegs, cluster;
+	unsigned long i;
+
+	nsegs = 0;
+	cluster = blk_queue_cluster(q);
+
+	bvprv = NULL;
+	sg = NULL;
+	bio_for_each_segment(bvec, bio, i) {
+		__blk_segment_map_sg(q, bvec, sglist, &bvprv, &sg,
+				     &nsegs, &cluster);
+	} /* segments in bio */
+
+	if (sg)
+		sg_mark_end(sg);
+
+	BUG_ON(bio->bi_phys_segments && nsegs > bio->bi_phys_segments);
+	return nsegs;
+}
+EXPORT_SYMBOL(blk_bio_map_sg);
+
 static inline int ll_new_hw_segment(struct request_queue *q,
 				    struct request *req,
 				    struct bio *bio)
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 2698866..e4a0cf4 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -883,6 +883,8 @@ extern void blk_queue_flush_queueable(struct request_queue *q, bool queueable);
 extern struct backing_dev_info *blk_get_backing_dev_info(struct block_device *bdev);
 
 extern int blk_rq_map_sg(struct request_queue *, struct request *, struct scatterlist *);
+extern int blk_bio_map_sg(struct request_queue *q, struct bio *bio,
+			  struct scatterlist *sglist);
 extern void blk_dump_rq_flags(struct request *, char *);
 extern long nr_blockdev_pages(void);
 
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH V4 3/3] virtio-blk: Add bio-based IO path for virtio-blk
  2012-07-28  2:21 [PATCH V4 0/3] Improve virtio-blk performance Asias He
  2012-07-28  2:21 ` [PATCH V4 1/3] block: Introduce __blk_segment_map_sg() helper Asias He
  2012-07-28  2:21 ` [PATCH V4 2/3] block: Add blk_bio_map_sg() helper Asias He
@ 2012-07-28  2:21 ` Asias He
  2012-07-28  6:35   ` Sasha Levin
  2012-07-29 11:11   ` Michael S. Tsirkin
  2 siblings, 2 replies; 10+ messages in thread
From: Asias He @ 2012-07-28  2:21 UTC (permalink / raw)
  To: linux-kernel
  Cc: Rusty Russell, Michael S. Tsirkin, virtualization, kvm,
	Christoph Hellwig, Minchan Kim

This patch introduces bio-based IO path for virtio-blk.

Compared to request-based IO path, bio-based IO path uses driver
provided ->make_request_fn() method to bypasses the IO scheduler. It
handles the bio to device directly without allocating a request in block
layer. This reduces the IO path in guest kernel to achieve high IOPS
and lower latency. The downside is that guest can not use the IO
scheduler to merge and sort requests. However, this is not a big problem
if the backend disk in host side uses faster disk device.

When the bio-based IO path is not enabled, virtio-blk still uses the
original request-based IO path, no performance difference is observed.

Performance evaluation:
-----------------------------
1) Fio test is performed in a 8 vcpu guest with ramdisk based guest using
kvm tool.

Short version:
 With bio-based IO path, sequential read/write, random read/write
 IOPS boost         : 28%, 24%, 21%, 16%
 Latency improvement: 32%, 17%, 21%, 16%

Long version:
 With bio-based IO path:
  seq-read  : io=2048.0MB, bw=116996KB/s, iops=233991 , runt= 17925msec
  seq-write : io=2048.0MB, bw=100829KB/s, iops=201658 , runt= 20799msec
  rand-read : io=3095.7MB, bw=112134KB/s, iops=224268 , runt= 28269msec
  rand-write: io=3095.7MB, bw=96198KB/s,  iops=192396 , runt= 32952msec
    clat (usec): min=0 , max=2631.6K, avg=58716.99, stdev=191377.30
    clat (usec): min=0 , max=1753.2K, avg=66423.25, stdev=81774.35
    clat (usec): min=0 , max=2915.5K, avg=61685.70, stdev=120598.39
    clat (usec): min=0 , max=1933.4K, avg=76935.12, stdev=96603.45
  cpu : usr=74.08%, sys=703.84%, ctx=29661403, majf=21354, minf=22460954
  cpu : usr=70.92%, sys=702.81%, ctx=77219828, majf=13980, minf=27713137
  cpu : usr=72.23%, sys=695.37%, ctx=88081059, majf=18475, minf=28177648
  cpu : usr=69.69%, sys=654.13%, ctx=145476035, majf=15867, minf=26176375
 With request-based IO path:
  seq-read  : io=2048.0MB, bw=91074KB/s, iops=182147 , runt= 23027msec
  seq-write : io=2048.0MB, bw=80725KB/s, iops=161449 , runt= 25979msec
  rand-read : io=3095.7MB, bw=92106KB/s, iops=184211 , runt= 34416msec
  rand-write: io=3095.7MB, bw=82815KB/s, iops=165630 , runt= 38277msec
    clat (usec): min=0 , max=1932.4K, avg=77824.17, stdev=170339.49
    clat (usec): min=0 , max=2510.2K, avg=78023.96, stdev=146949.15
    clat (usec): min=0 , max=3037.2K, avg=74746.53, stdev=128498.27
    clat (usec): min=0 , max=1363.4K, avg=89830.75, stdev=114279.68
  cpu : usr=53.28%, sys=724.19%, ctx=37988895, majf=17531, minf=23577622
  cpu : usr=49.03%, sys=633.20%, ctx=205935380, majf=18197, minf=27288959
  cpu : usr=55.78%, sys=722.40%, ctx=101525058, majf=19273, minf=28067082
  cpu : usr=56.55%, sys=690.83%, ctx=228205022, majf=18039, minf=26551985

2) Fio test is performed in a 8 vcpu guest with Fusion-IO based guest using
kvm tool.

Short version:
 With bio-based IO path, sequential read/write, random read/write
 IOPS boost         : 11%, 11%, 13%, 10%
 Latency improvement: 10%, 10%, 12%, 10%
Long Version:
 With bio-based IO path:
  read : io=2048.0MB, bw=58920KB/s, iops=117840 , runt= 35593msec
  write: io=2048.0MB, bw=64308KB/s, iops=128616 , runt= 32611msec
  read : io=3095.7MB, bw=59633KB/s, iops=119266 , runt= 53157msec
  write: io=3095.7MB, bw=62993KB/s, iops=125985 , runt= 50322msec
    clat (usec): min=0 , max=1284.3K, avg=128109.01, stdev=71513.29
    clat (usec): min=94 , max=962339 , avg=116832.95, stdev=65836.80
    clat (usec): min=0 , max=1846.6K, avg=128509.99, stdev=89575.07
    clat (usec): min=0 , max=2256.4K, avg=121361.84, stdev=82747.25
  cpu : usr=56.79%, sys=421.70%, ctx=147335118, majf=21080, minf=19852517
  cpu : usr=61.81%, sys=455.53%, ctx=143269950, majf=16027, minf=24800604
  cpu : usr=63.10%, sys=455.38%, ctx=178373538, majf=16958, minf=24822612
  cpu : usr=62.04%, sys=453.58%, ctx=226902362, majf=16089, minf=23278105
 With request-based IO path:
  read : io=2048.0MB, bw=52896KB/s, iops=105791 , runt= 39647msec
  write: io=2048.0MB, bw=57856KB/s, iops=115711 , runt= 36248msec
  read : io=3095.7MB, bw=52387KB/s, iops=104773 , runt= 60510msec
  write: io=3095.7MB, bw=57310KB/s, iops=114619 , runt= 55312msec
    clat (usec): min=0 , max=1532.6K, avg=142085.62, stdev=109196.84
    clat (usec): min=0 , max=1487.4K, avg=129110.71, stdev=114973.64
    clat (usec): min=0 , max=1388.6K, avg=145049.22, stdev=107232.55
    clat (usec): min=0 , max=1465.9K, avg=133585.67, stdev=110322.95
  cpu : usr=44.08%, sys=590.71%, ctx=451812322, majf=14841, minf=17648641
  cpu : usr=48.73%, sys=610.78%, ctx=418953997, majf=22164, minf=26850689
  cpu : usr=45.58%, sys=581.16%, ctx=714079216, majf=21497, minf=22558223
  cpu : usr=48.40%, sys=599.65%, ctx=656089423, majf=16393, minf=23824409

How to use:
-----------------------------
Add 'virtio_blk.use_bio=1' to kernel cmdline or 'modprobe virtio_blk
use_bio=1' to enable ->make_request_fn() based I/O path.

Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: virtualization@lists.linux-foundation.org
Cc: kvm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Minchan Kim <minchan.kim@gmail.com>
Signed-off-by: Asias He <asias@redhat.com>
---
 drivers/block/virtio_blk.c |  203 +++++++++++++++++++++++++++++++++++---------
 1 file changed, 163 insertions(+), 40 deletions(-)

diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
index c0bbeb4..95cfeed 100644
--- a/drivers/block/virtio_blk.c
+++ b/drivers/block/virtio_blk.c
@@ -14,6 +14,9 @@
 
 #define PART_BITS 4
 
+static bool use_bio;
+module_param(use_bio, bool, S_IRUGO);
+
 static int major;
 static DEFINE_IDA(vd_index_ida);
 
@@ -23,6 +26,7 @@ struct virtio_blk
 {
 	struct virtio_device *vdev;
 	struct virtqueue *vq;
+	wait_queue_head_t queue_wait;
 
 	/* The disk structure for the kernel. */
 	struct gendisk *disk;
@@ -51,53 +55,87 @@ struct virtio_blk
 struct virtblk_req
 {
 	struct request *req;
+	struct bio *bio;
 	struct virtio_blk_outhdr out_hdr;
 	struct virtio_scsi_inhdr in_hdr;
 	u8 status;
+	struct scatterlist sg[];
 };
 
-static void blk_done(struct virtqueue *vq)
+static inline int virtblk_result(struct virtblk_req *vbr)
+{
+	switch (vbr->status) {
+	case VIRTIO_BLK_S_OK:
+		return 0;
+	case VIRTIO_BLK_S_UNSUPP:
+		return -ENOTTY;
+	default:
+		return -EIO;
+	}
+}
+
+static inline void virtblk_request_done(struct virtio_blk *vblk,
+					struct virtblk_req *vbr)
+{
+	struct request *req = vbr->req;
+	int error = virtblk_result(vbr);
+
+	if (req->cmd_type == REQ_TYPE_BLOCK_PC) {
+		req->resid_len = vbr->in_hdr.residual;
+		req->sense_len = vbr->in_hdr.sense_len;
+		req->errors = vbr->in_hdr.errors;
+	} else if (req->cmd_type == REQ_TYPE_SPECIAL) {
+		req->errors = (error != 0);
+	}
+
+	__blk_end_request_all(req, error);
+	mempool_free(vbr, vblk->pool);
+}
+
+static inline void virtblk_bio_done(struct virtio_blk *vblk,
+				    struct virtblk_req *vbr)
+{
+	bio_endio(vbr->bio, virtblk_result(vbr));
+	mempool_free(vbr, vblk->pool);
+}
+
+static void virtblk_done(struct virtqueue *vq)
 {
 	struct virtio_blk *vblk = vq->vdev->priv;
+	unsigned long bio_done = 0, req_done = 0;
 	struct virtblk_req *vbr;
-	unsigned int len;
 	unsigned long flags;
+	unsigned int len;
 
 	spin_lock_irqsave(vblk->disk->queue->queue_lock, flags);
 	while ((vbr = virtqueue_get_buf(vblk->vq, &len)) != NULL) {
-		int error;
-
-		switch (vbr->status) {
-		case VIRTIO_BLK_S_OK:
-			error = 0;
-			break;
-		case VIRTIO_BLK_S_UNSUPP:
-			error = -ENOTTY;
-			break;
-		default:
-			error = -EIO;
-			break;
-		}
-
-		switch (vbr->req->cmd_type) {
-		case REQ_TYPE_BLOCK_PC:
-			vbr->req->resid_len = vbr->in_hdr.residual;
-			vbr->req->sense_len = vbr->in_hdr.sense_len;
-			vbr->req->errors = vbr->in_hdr.errors;
-			break;
-		case REQ_TYPE_SPECIAL:
-			vbr->req->errors = (error != 0);
-			break;
-		default:
-			break;
+		if (vbr->bio) {
+			virtblk_bio_done(vblk, vbr);
+			bio_done++;
+		} else {
+			virtblk_request_done(vblk, vbr);
+			req_done++;
 		}
-
-		__blk_end_request_all(vbr->req, error);
-		mempool_free(vbr, vblk->pool);
 	}
 	/* In case queue is stopped waiting for more buffers. */
-	blk_start_queue(vblk->disk->queue);
+	if (req_done)
+		blk_start_queue(vblk->disk->queue);
 	spin_unlock_irqrestore(vblk->disk->queue->queue_lock, flags);
+
+	if (bio_done)
+		wake_up(&vblk->queue_wait);
+}
+
+static inline struct virtblk_req *virtblk_alloc_req(struct virtio_blk *vblk,
+						    gfp_t gfp_mask)
+{
+	struct virtblk_req *vbr;
+
+	vbr = mempool_alloc(vblk->pool, gfp_mask);
+	if (vbr && use_bio)
+		sg_init_table(vbr->sg, vblk->sg_elems);
+
+	return vbr;
 }
 
 static bool do_req(struct request_queue *q, struct virtio_blk *vblk,
@@ -106,13 +144,13 @@ static bool do_req(struct request_queue *q, struct virtio_blk *vblk,
 	unsigned long num, out = 0, in = 0;
 	struct virtblk_req *vbr;
 
-	vbr = mempool_alloc(vblk->pool, GFP_ATOMIC);
+	vbr = virtblk_alloc_req(vblk, GFP_ATOMIC);
 	if (!vbr)
 		/* When another request finishes we'll try again. */
 		return false;
 
 	vbr->req = req;
-
+	vbr->bio = NULL;
 	if (req->cmd_flags & REQ_FLUSH) {
 		vbr->out_hdr.type = VIRTIO_BLK_T_FLUSH;
 		vbr->out_hdr.sector = 0;
@@ -172,7 +210,8 @@ static bool do_req(struct request_queue *q, struct virtio_blk *vblk,
 		}
 	}
 
-	if (virtqueue_add_buf(vblk->vq, vblk->sg, out, in, vbr, GFP_ATOMIC)<0) {
+	if (virtqueue_add_buf(vblk->vq, vblk->sg, out, in, vbr,
+			      GFP_ATOMIC) < 0) {
 		mempool_free(vbr, vblk->pool);
 		return false;
 	}
@@ -180,7 +219,7 @@ static bool do_req(struct request_queue *q, struct virtio_blk *vblk,
 	return true;
 }
 
-static void do_virtblk_request(struct request_queue *q)
+static void virtblk_request(struct request_queue *q)
 {
 	struct virtio_blk *vblk = q->queuedata;
 	struct request *req;
@@ -203,6 +242,82 @@ static void do_virtblk_request(struct request_queue *q)
 		virtqueue_kick(vblk->vq);
 }
 
+static void virtblk_add_buf_wait(struct virtio_blk *vblk,
+				 struct virtblk_req *vbr,
+				 unsigned long out,
+				 unsigned long in)
+{
+	DEFINE_WAIT(wait);
+
+	for (;;) {
+		prepare_to_wait_exclusive(&vblk->queue_wait, &wait,
+					  TASK_UNINTERRUPTIBLE);
+
+		spin_lock_irq(vblk->disk->queue->queue_lock);
+		if (virtqueue_add_buf(vblk->vq, vbr->sg, out, in, vbr,
+				      GFP_ATOMIC) < 0) {
+			spin_unlock_irq(vblk->disk->queue->queue_lock);
+			io_schedule();
+		} else {
+			virtqueue_kick(vblk->vq);
+			spin_unlock_irq(vblk->disk->queue->queue_lock);
+			break;
+		}
+
+	}
+
+	finish_wait(&vblk->queue_wait, &wait);
+}
+
+static void virtblk_make_request(struct request_queue *q, struct bio *bio)
+{
+	struct virtio_blk *vblk = q->queuedata;
+	unsigned int num, out = 0, in = 0;
+	struct virtblk_req *vbr;
+
+	BUG_ON(bio->bi_phys_segments + 2 > vblk->sg_elems);
+	BUG_ON(bio->bi_rw & (REQ_FLUSH | REQ_FUA));
+
+	vbr = virtblk_alloc_req(vblk, GFP_NOIO);
+	if (!vbr) {
+		bio_endio(bio, -ENOMEM);
+		return;
+	}
+
+	vbr->bio = bio;
+	vbr->req = NULL;
+	vbr->out_hdr.type = 0;
+	vbr->out_hdr.sector = bio->bi_sector;
+	vbr->out_hdr.ioprio = bio_prio(bio);
+
+	sg_set_buf(&vbr->sg[out++], &vbr->out_hdr, sizeof(vbr->out_hdr));
+
+	num = blk_bio_map_sg(q, bio, vbr->sg + out);
+
+	sg_set_buf(&vbr->sg[num + out + in++], &vbr->status,
+		   sizeof(vbr->status));
+
+	if (num) {
+		if (bio->bi_rw & REQ_WRITE) {
+			vbr->out_hdr.type |= VIRTIO_BLK_T_OUT;
+			out += num;
+		} else {
+			vbr->out_hdr.type |= VIRTIO_BLK_T_IN;
+			in += num;
+		}
+	}
+
+	spin_lock_irq(vblk->disk->queue->queue_lock);
+	if (unlikely(virtqueue_add_buf(vblk->vq, vbr->sg, out, in, vbr,
+				       GFP_ATOMIC) < 0)) {
+		spin_unlock_irq(vblk->disk->queue->queue_lock);
+		virtblk_add_buf_wait(vblk, vbr, out, in);
+		return;
+	}
+	virtqueue_kick(vblk->vq);
+	spin_unlock_irq(vblk->disk->queue->queue_lock);
+}
+
 /* return id (s/n) string for *disk to *id_str
  */
 static int virtblk_get_id(struct gendisk *disk, char *id_str)
@@ -360,7 +475,7 @@ static int init_vq(struct virtio_blk *vblk)
 	int err = 0;
 
 	/* We expect one virtqueue, for output. */
-	vblk->vq = virtio_find_single_vq(vblk->vdev, blk_done, "requests");
+	vblk->vq = virtio_find_single_vq(vblk->vdev, virtblk_done, "requests");
 	if (IS_ERR(vblk->vq))
 		err = PTR_ERR(vblk->vq);
 
@@ -414,7 +529,7 @@ static void virtblk_update_cache_mode(struct virtio_device *vdev)
 	u8 writeback = virtblk_get_cache_mode(vdev);
 	struct virtio_blk *vblk = vdev->priv;
 
-	if (writeback)
+	if (writeback && !use_bio)
 		blk_queue_flush(vblk->disk->queue, REQ_FLUSH);
 	else
 		blk_queue_flush(vblk->disk->queue, 0);
@@ -477,6 +592,8 @@ static int __devinit virtblk_probe(struct virtio_device *vdev)
 	struct virtio_blk *vblk;
 	struct request_queue *q;
 	int err, index;
+	int pool_size;
+
 	u64 cap;
 	u32 v, blk_size, sg_elems, opt_io_size;
 	u16 min_io_size;
@@ -506,10 +623,12 @@ static int __devinit virtblk_probe(struct virtio_device *vdev)
 		goto out_free_index;
 	}
 
+	init_waitqueue_head(&vblk->queue_wait);
 	vblk->vdev = vdev;
 	vblk->sg_elems = sg_elems;
 	sg_init_table(vblk->sg, vblk->sg_elems);
 	mutex_init(&vblk->config_lock);
+
 	INIT_WORK(&vblk->config_work, virtblk_config_changed_work);
 	vblk->config_enable = true;
 
@@ -517,7 +636,10 @@ static int __devinit virtblk_probe(struct virtio_device *vdev)
 	if (err)
 		goto out_free_vblk;
 
-	vblk->pool = mempool_create_kmalloc_pool(1,sizeof(struct virtblk_req));
+	pool_size = sizeof(struct virtblk_req);
+	if (use_bio)
+		pool_size += sizeof(struct scatterlist) * sg_elems;
+	vblk->pool = mempool_create_kmalloc_pool(1, pool_size);
 	if (!vblk->pool) {
 		err = -ENOMEM;
 		goto out_free_vq;
@@ -530,12 +652,14 @@ static int __devinit virtblk_probe(struct virtio_device *vdev)
 		goto out_mempool;
 	}
 
-	q = vblk->disk->queue = blk_init_queue(do_virtblk_request, NULL);
+	q = vblk->disk->queue = blk_init_queue(virtblk_request, NULL);
 	if (!q) {
 		err = -ENOMEM;
 		goto out_put_disk;
 	}
 
+	if (use_bio)
+		blk_queue_make_request(q, virtblk_make_request);
 	q->queuedata = vblk;
 
 	virtblk_name_format("vd", index, vblk->disk->disk_name, DISK_NAME_LEN);
@@ -620,7 +744,6 @@ static int __devinit virtblk_probe(struct virtio_device *vdev)
 	if (!err && opt_io_size)
 		blk_queue_io_opt(q, blk_size * opt_io_size);
 
-
 	add_disk(vblk->disk);
 	err = device_create_file(disk_to_dev(vblk->disk), &dev_attr_serial);
 	if (err)
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH V4 3/3] virtio-blk: Add bio-based IO path for virtio-blk
  2012-07-28  2:21 ` [PATCH V4 3/3] virtio-blk: Add bio-based IO path for virtio-blk Asias He
@ 2012-07-28  6:35   ` Sasha Levin
  2012-07-30  6:27     ` Asias He
  2012-07-29 11:11   ` Michael S. Tsirkin
  1 sibling, 1 reply; 10+ messages in thread
From: Sasha Levin @ 2012-07-28  6:35 UTC (permalink / raw)
  To: Asias He
  Cc: linux-kernel, Rusty Russell, Michael S. Tsirkin, virtualization,
	kvm, Christoph Hellwig, Minchan Kim

On 07/28/2012 04:21 AM, Asias He wrote:
> This patch introduces bio-based IO path for virtio-blk.
> 
> Compared to request-based IO path, bio-based IO path uses driver
> provided ->make_request_fn() method to bypasses the IO scheduler. It
> handles the bio to device directly without allocating a request in block
> layer. This reduces the IO path in guest kernel to achieve high IOPS
> and lower latency. The downside is that guest can not use the IO
> scheduler to merge and sort requests. However, this is not a big problem
> if the backend disk in host side uses faster disk device.
> 
> When the bio-based IO path is not enabled, virtio-blk still uses the
> original request-based IO path, no performance difference is observed.
> 
> Performance evaluation:
> -----------------------------
> 1) Fio test is performed in a 8 vcpu guest with ramdisk based guest using
> kvm tool.
> 
> Short version:
>  With bio-based IO path, sequential read/write, random read/write
>  IOPS boost         : 28%, 24%, 21%, 16%
>  Latency improvement: 32%, 17%, 21%, 16%
> 
> Long version:
>  With bio-based IO path:
>   seq-read  : io=2048.0MB, bw=116996KB/s, iops=233991 , runt= 17925msec
>   seq-write : io=2048.0MB, bw=100829KB/s, iops=201658 , runt= 20799msec
>   rand-read : io=3095.7MB, bw=112134KB/s, iops=224268 , runt= 28269msec
>   rand-write: io=3095.7MB, bw=96198KB/s,  iops=192396 , runt= 32952msec
>     clat (usec): min=0 , max=2631.6K, avg=58716.99, stdev=191377.30
>     clat (usec): min=0 , max=1753.2K, avg=66423.25, stdev=81774.35
>     clat (usec): min=0 , max=2915.5K, avg=61685.70, stdev=120598.39
>     clat (usec): min=0 , max=1933.4K, avg=76935.12, stdev=96603.45
>   cpu : usr=74.08%, sys=703.84%, ctx=29661403, majf=21354, minf=22460954
>   cpu : usr=70.92%, sys=702.81%, ctx=77219828, majf=13980, minf=27713137
>   cpu : usr=72.23%, sys=695.37%, ctx=88081059, majf=18475, minf=28177648
>   cpu : usr=69.69%, sys=654.13%, ctx=145476035, majf=15867, minf=26176375
>  With request-based IO path:
>   seq-read  : io=2048.0MB, bw=91074KB/s, iops=182147 , runt= 23027msec
>   seq-write : io=2048.0MB, bw=80725KB/s, iops=161449 , runt= 25979msec
>   rand-read : io=3095.7MB, bw=92106KB/s, iops=184211 , runt= 34416msec
>   rand-write: io=3095.7MB, bw=82815KB/s, iops=165630 , runt= 38277msec
>     clat (usec): min=0 , max=1932.4K, avg=77824.17, stdev=170339.49
>     clat (usec): min=0 , max=2510.2K, avg=78023.96, stdev=146949.15
>     clat (usec): min=0 , max=3037.2K, avg=74746.53, stdev=128498.27
>     clat (usec): min=0 , max=1363.4K, avg=89830.75, stdev=114279.68
>   cpu : usr=53.28%, sys=724.19%, ctx=37988895, majf=17531, minf=23577622
>   cpu : usr=49.03%, sys=633.20%, ctx=205935380, majf=18197, minf=27288959
>   cpu : usr=55.78%, sys=722.40%, ctx=101525058, majf=19273, minf=28067082
>   cpu : usr=56.55%, sys=690.83%, ctx=228205022, majf=18039, minf=26551985
> 
> 2) Fio test is performed in a 8 vcpu guest with Fusion-IO based guest using
> kvm tool.
> 
> Short version:
>  With bio-based IO path, sequential read/write, random read/write
>  IOPS boost         : 11%, 11%, 13%, 10%
>  Latency improvement: 10%, 10%, 12%, 10%
> Long Version:
>  With bio-based IO path:
>   read : io=2048.0MB, bw=58920KB/s, iops=117840 , runt= 35593msec
>   write: io=2048.0MB, bw=64308KB/s, iops=128616 , runt= 32611msec
>   read : io=3095.7MB, bw=59633KB/s, iops=119266 , runt= 53157msec
>   write: io=3095.7MB, bw=62993KB/s, iops=125985 , runt= 50322msec
>     clat (usec): min=0 , max=1284.3K, avg=128109.01, stdev=71513.29
>     clat (usec): min=94 , max=962339 , avg=116832.95, stdev=65836.80
>     clat (usec): min=0 , max=1846.6K, avg=128509.99, stdev=89575.07
>     clat (usec): min=0 , max=2256.4K, avg=121361.84, stdev=82747.25
>   cpu : usr=56.79%, sys=421.70%, ctx=147335118, majf=21080, minf=19852517
>   cpu : usr=61.81%, sys=455.53%, ctx=143269950, majf=16027, minf=24800604
>   cpu : usr=63.10%, sys=455.38%, ctx=178373538, majf=16958, minf=24822612
>   cpu : usr=62.04%, sys=453.58%, ctx=226902362, majf=16089, minf=23278105
>  With request-based IO path:
>   read : io=2048.0MB, bw=52896KB/s, iops=105791 , runt= 39647msec
>   write: io=2048.0MB, bw=57856KB/s, iops=115711 , runt= 36248msec
>   read : io=3095.7MB, bw=52387KB/s, iops=104773 , runt= 60510msec
>   write: io=3095.7MB, bw=57310KB/s, iops=114619 , runt= 55312msec
>     clat (usec): min=0 , max=1532.6K, avg=142085.62, stdev=109196.84
>     clat (usec): min=0 , max=1487.4K, avg=129110.71, stdev=114973.64
>     clat (usec): min=0 , max=1388.6K, avg=145049.22, stdev=107232.55
>     clat (usec): min=0 , max=1465.9K, avg=133585.67, stdev=110322.95
>   cpu : usr=44.08%, sys=590.71%, ctx=451812322, majf=14841, minf=17648641
>   cpu : usr=48.73%, sys=610.78%, ctx=418953997, majf=22164, minf=26850689
>   cpu : usr=45.58%, sys=581.16%, ctx=714079216, majf=21497, minf=22558223
>   cpu : usr=48.40%, sys=599.65%, ctx=656089423, majf=16393, minf=23824409

What are the cases where we'll see a performance degradation with using the bio path? Could we measure performance for those as well?

> How to use:
> -----------------------------
> Add 'virtio_blk.use_bio=1' to kernel cmdline or 'modprobe virtio_blk
> use_bio=1' to enable ->make_request_fn() based I/O path.

If there are, in fact, no cases where performance is degraded, can use_bio=1 be the default?

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH V4 3/3] virtio-blk: Add bio-based IO path for virtio-blk
  2012-07-28  2:21 ` [PATCH V4 3/3] virtio-blk: Add bio-based IO path for virtio-blk Asias He
  2012-07-28  6:35   ` Sasha Levin
@ 2012-07-29 11:11   ` Michael S. Tsirkin
  2012-07-30  1:55     ` Rusty Russell
  2012-07-30  4:33     ` Asias He
  1 sibling, 2 replies; 10+ messages in thread
From: Michael S. Tsirkin @ 2012-07-29 11:11 UTC (permalink / raw)
  To: Asias He
  Cc: linux-kernel, Rusty Russell, virtualization, kvm,
	Christoph Hellwig, Minchan Kim

On Sat, Jul 28, 2012 at 10:21:05AM +0800, Asias He wrote:
> This patch introduces bio-based IO path for virtio-blk.
> 
> Compared to request-based IO path, bio-based IO path uses driver
> provided ->make_request_fn() method to bypasses the IO scheduler. It
> handles the bio to device directly without allocating a request in block
> layer. This reduces the IO path in guest kernel to achieve high IOPS
> and lower latency. The downside is that guest can not use the IO
> scheduler to merge and sort requests. However, this is not a big problem
> if the backend disk in host side uses faster disk device.

If this optimization depends on the host, then it
should be reported to the guest using a feature bit,
as opposed to being guest driven.

> When the bio-based IO path is not enabled, virtio-blk still uses the
> original request-based IO path, no performance difference is observed.
> 
> Performance evaluation:
> -----------------------------
> 1) Fio test is performed in a 8 vcpu guest with ramdisk based guest using
> kvm tool.
> 
> Short version:
>  With bio-based IO path, sequential read/write, random read/write
>  IOPS boost         : 28%, 24%, 21%, 16%
>  Latency improvement: 32%, 17%, 21%, 16%
> 
> Long version:
>  With bio-based IO path:
>   seq-read  : io=2048.0MB, bw=116996KB/s, iops=233991 , runt= 17925msec
>   seq-write : io=2048.0MB, bw=100829KB/s, iops=201658 , runt= 20799msec
>   rand-read : io=3095.7MB, bw=112134KB/s, iops=224268 , runt= 28269msec
>   rand-write: io=3095.7MB, bw=96198KB/s,  iops=192396 , runt= 32952msec
>     clat (usec): min=0 , max=2631.6K, avg=58716.99, stdev=191377.30
>     clat (usec): min=0 , max=1753.2K, avg=66423.25, stdev=81774.35
>     clat (usec): min=0 , max=2915.5K, avg=61685.70, stdev=120598.39
>     clat (usec): min=0 , max=1933.4K, avg=76935.12, stdev=96603.45
>   cpu : usr=74.08%, sys=703.84%, ctx=29661403, majf=21354, minf=22460954
>   cpu : usr=70.92%, sys=702.81%, ctx=77219828, majf=13980, minf=27713137
>   cpu : usr=72.23%, sys=695.37%, ctx=88081059, majf=18475, minf=28177648
>   cpu : usr=69.69%, sys=654.13%, ctx=145476035, majf=15867, minf=26176375
>  With request-based IO path:
>   seq-read  : io=2048.0MB, bw=91074KB/s, iops=182147 , runt= 23027msec
>   seq-write : io=2048.0MB, bw=80725KB/s, iops=161449 , runt= 25979msec
>   rand-read : io=3095.7MB, bw=92106KB/s, iops=184211 , runt= 34416msec
>   rand-write: io=3095.7MB, bw=82815KB/s, iops=165630 , runt= 38277msec
>     clat (usec): min=0 , max=1932.4K, avg=77824.17, stdev=170339.49
>     clat (usec): min=0 , max=2510.2K, avg=78023.96, stdev=146949.15
>     clat (usec): min=0 , max=3037.2K, avg=74746.53, stdev=128498.27
>     clat (usec): min=0 , max=1363.4K, avg=89830.75, stdev=114279.68
>   cpu : usr=53.28%, sys=724.19%, ctx=37988895, majf=17531, minf=23577622
>   cpu : usr=49.03%, sys=633.20%, ctx=205935380, majf=18197, minf=27288959
>   cpu : usr=55.78%, sys=722.40%, ctx=101525058, majf=19273, minf=28067082
>   cpu : usr=56.55%, sys=690.83%, ctx=228205022, majf=18039, minf=26551985


So bio based causes cpu to jump up by some 30%?
What happens if you divide IOPS/CPU?
Any improvement that comes from increasing the cpu share
of the given guest on the host will not scale well on
a typical overcommitted host.

> 2) Fio test is performed in a 8 vcpu guest with Fusion-IO based guest using
> kvm tool.
> 
> Short version:
>  With bio-based IO path, sequential read/write, random read/write
>  IOPS boost         : 11%, 11%, 13%, 10%
>  Latency improvement: 10%, 10%, 12%, 10%
> Long Version:
>  With bio-based IO path:
>   read : io=2048.0MB, bw=58920KB/s, iops=117840 , runt= 35593msec
>   write: io=2048.0MB, bw=64308KB/s, iops=128616 , runt= 32611msec
>   read : io=3095.7MB, bw=59633KB/s, iops=119266 , runt= 53157msec
>   write: io=3095.7MB, bw=62993KB/s, iops=125985 , runt= 50322msec
>     clat (usec): min=0 , max=1284.3K, avg=128109.01, stdev=71513.29
>     clat (usec): min=94 , max=962339 , avg=116832.95, stdev=65836.80
>     clat (usec): min=0 , max=1846.6K, avg=128509.99, stdev=89575.07
>     clat (usec): min=0 , max=2256.4K, avg=121361.84, stdev=82747.25
>   cpu : usr=56.79%, sys=421.70%, ctx=147335118, majf=21080, minf=19852517
>   cpu : usr=61.81%, sys=455.53%, ctx=143269950, majf=16027, minf=24800604
>   cpu : usr=63.10%, sys=455.38%, ctx=178373538, majf=16958, minf=24822612
>   cpu : usr=62.04%, sys=453.58%, ctx=226902362, majf=16089, minf=23278105
>  With request-based IO path:
>   read : io=2048.0MB, bw=52896KB/s, iops=105791 , runt= 39647msec
>   write: io=2048.0MB, bw=57856KB/s, iops=115711 , runt= 36248msec
>   read : io=3095.7MB, bw=52387KB/s, iops=104773 , runt= 60510msec
>   write: io=3095.7MB, bw=57310KB/s, iops=114619 , runt= 55312msec
>     clat (usec): min=0 , max=1532.6K, avg=142085.62, stdev=109196.84
>     clat (usec): min=0 , max=1487.4K, avg=129110.71, stdev=114973.64
>     clat (usec): min=0 , max=1388.6K, avg=145049.22, stdev=107232.55
>     clat (usec): min=0 , max=1465.9K, avg=133585.67, stdev=110322.95
>   cpu : usr=44.08%, sys=590.71%, ctx=451812322, majf=14841, minf=17648641
>   cpu : usr=48.73%, sys=610.78%, ctx=418953997, majf=22164, minf=26850689
>   cpu : usr=45.58%, sys=581.16%, ctx=714079216, majf=21497, minf=22558223
>   cpu : usr=48.40%, sys=599.65%, ctx=656089423, majf=16393, minf=23824409


Is this host or guest cpu? We should probably measure host cpu
as this includes device overhead which could vary by load.

> How to use:
> -----------------------------
> Add 'virtio_blk.use_bio=1' to kernel cmdline or 'modprobe virtio_blk
> use_bio=1' to enable ->make_request_fn() based I/O path.
> 
> Cc: Rusty Russell <rusty@rustcorp.com.au>
> Cc: "Michael S. Tsirkin" <mst@redhat.com>
> Cc: virtualization@lists.linux-foundation.org
> Cc: kvm@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Acked-by: Rusty Russell <rusty@rustcorp.com.au>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> Signed-off-by: Minchan Kim <minchan.kim@gmail.com>
> Signed-off-by: Asias He <asias@redhat.com>
> ---
>  drivers/block/virtio_blk.c |  203 +++++++++++++++++++++++++++++++++++---------
>  1 file changed, 163 insertions(+), 40 deletions(-)
> 
> diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
> index c0bbeb4..95cfeed 100644
> --- a/drivers/block/virtio_blk.c
> +++ b/drivers/block/virtio_blk.c
> @@ -14,6 +14,9 @@
>  
>  #define PART_BITS 4
>  
> +static bool use_bio;
> +module_param(use_bio, bool, S_IRUGO);
> +
>  static int major;
>  static DEFINE_IDA(vd_index_ida);
>  
> @@ -23,6 +26,7 @@ struct virtio_blk
>  {
>  	struct virtio_device *vdev;
>  	struct virtqueue *vq;
> +	wait_queue_head_t queue_wait;
>  
>  	/* The disk structure for the kernel. */
>  	struct gendisk *disk;
> @@ -51,53 +55,87 @@ struct virtio_blk
>  struct virtblk_req
>  {
>  	struct request *req;
> +	struct bio *bio;
>  	struct virtio_blk_outhdr out_hdr;
>  	struct virtio_scsi_inhdr in_hdr;
>  	u8 status;
> +	struct scatterlist sg[];
>  };
>  
> -static void blk_done(struct virtqueue *vq)
> +static inline int virtblk_result(struct virtblk_req *vbr)
> +{
> +	switch (vbr->status) {
> +	case VIRTIO_BLK_S_OK:
> +		return 0;
> +	case VIRTIO_BLK_S_UNSUPP:
> +		return -ENOTTY;
> +	default:
> +		return -EIO;
> +	}
> +}
> +
> +static inline void virtblk_request_done(struct virtio_blk *vblk,
> +					struct virtblk_req *vbr)
> +{
> +	struct request *req = vbr->req;
> +	int error = virtblk_result(vbr);
> +
> +	if (req->cmd_type == REQ_TYPE_BLOCK_PC) {
> +		req->resid_len = vbr->in_hdr.residual;
> +		req->sense_len = vbr->in_hdr.sense_len;
> +		req->errors = vbr->in_hdr.errors;
> +	} else if (req->cmd_type == REQ_TYPE_SPECIAL) {
> +		req->errors = (error != 0);
> +	}
> +
> +	__blk_end_request_all(req, error);
> +	mempool_free(vbr, vblk->pool);
> +}
> +
> +static inline void virtblk_bio_done(struct virtio_blk *vblk,
> +				    struct virtblk_req *vbr)
> +{
> +	bio_endio(vbr->bio, virtblk_result(vbr));
> +	mempool_free(vbr, vblk->pool);
> +}
> +
> +static void virtblk_done(struct virtqueue *vq)
>  {
>  	struct virtio_blk *vblk = vq->vdev->priv;
> +	unsigned long bio_done = 0, req_done = 0;
>  	struct virtblk_req *vbr;
> -	unsigned int len;
>  	unsigned long flags;
> +	unsigned int len;
>  
>  	spin_lock_irqsave(vblk->disk->queue->queue_lock, flags);
>  	while ((vbr = virtqueue_get_buf(vblk->vq, &len)) != NULL) {
> -		int error;
> -
> -		switch (vbr->status) {
> -		case VIRTIO_BLK_S_OK:
> -			error = 0;
> -			break;
> -		case VIRTIO_BLK_S_UNSUPP:
> -			error = -ENOTTY;
> -			break;
> -		default:
> -			error = -EIO;
> -			break;
> -		}
> -
> -		switch (vbr->req->cmd_type) {
> -		case REQ_TYPE_BLOCK_PC:
> -			vbr->req->resid_len = vbr->in_hdr.residual;
> -			vbr->req->sense_len = vbr->in_hdr.sense_len;
> -			vbr->req->errors = vbr->in_hdr.errors;
> -			break;
> -		case REQ_TYPE_SPECIAL:
> -			vbr->req->errors = (error != 0);
> -			break;
> -		default:
> -			break;
> +		if (vbr->bio) {
> +			virtblk_bio_done(vblk, vbr);
> +			bio_done++;
> +		} else {
> +			virtblk_request_done(vblk, vbr);
> +			req_done++;
>  		}
> -
> -		__blk_end_request_all(vbr->req, error);
> -		mempool_free(vbr, vblk->pool);
>  	}
>  	/* In case queue is stopped waiting for more buffers. */
> -	blk_start_queue(vblk->disk->queue);
> +	if (req_done)
> +		blk_start_queue(vblk->disk->queue);
>  	spin_unlock_irqrestore(vblk->disk->queue->queue_lock, flags);
> +
> +	if (bio_done)
> +		wake_up(&vblk->queue_wait);
> +}
> +
> +static inline struct virtblk_req *virtblk_alloc_req(struct virtio_blk *vblk,
> +						    gfp_t gfp_mask)
> +{
> +	struct virtblk_req *vbr;
> +
> +	vbr = mempool_alloc(vblk->pool, gfp_mask);
> +	if (vbr && use_bio)
> +		sg_init_table(vbr->sg, vblk->sg_elems);
> +
> +	return vbr;
>  }
>  
>  static bool do_req(struct request_queue *q, struct virtio_blk *vblk,
> @@ -106,13 +144,13 @@ static bool do_req(struct request_queue *q, struct virtio_blk *vblk,
>  	unsigned long num, out = 0, in = 0;
>  	struct virtblk_req *vbr;
>  
> -	vbr = mempool_alloc(vblk->pool, GFP_ATOMIC);
> +	vbr = virtblk_alloc_req(vblk, GFP_ATOMIC);
>  	if (!vbr)
>  		/* When another request finishes we'll try again. */
>  		return false;
>  
>  	vbr->req = req;
> -
> +	vbr->bio = NULL;
>  	if (req->cmd_flags & REQ_FLUSH) {
>  		vbr->out_hdr.type = VIRTIO_BLK_T_FLUSH;
>  		vbr->out_hdr.sector = 0;
> @@ -172,7 +210,8 @@ static bool do_req(struct request_queue *q, struct virtio_blk *vblk,
>  		}
>  	}
>  
> -	if (virtqueue_add_buf(vblk->vq, vblk->sg, out, in, vbr, GFP_ATOMIC)<0) {
> +	if (virtqueue_add_buf(vblk->vq, vblk->sg, out, in, vbr,
> +			      GFP_ATOMIC) < 0) {
>  		mempool_free(vbr, vblk->pool);
>  		return false;
>  	}
> @@ -180,7 +219,7 @@ static bool do_req(struct request_queue *q, struct virtio_blk *vblk,
>  	return true;
>  }
>  
> -static void do_virtblk_request(struct request_queue *q)
> +static void virtblk_request(struct request_queue *q)
>  {
>  	struct virtio_blk *vblk = q->queuedata;
>  	struct request *req;
> @@ -203,6 +242,82 @@ static void do_virtblk_request(struct request_queue *q)
>  		virtqueue_kick(vblk->vq);
>  }
>  
> +static void virtblk_add_buf_wait(struct virtio_blk *vblk,
> +				 struct virtblk_req *vbr,
> +				 unsigned long out,
> +				 unsigned long in)
> +{
> +	DEFINE_WAIT(wait);
> +
> +	for (;;) {
> +		prepare_to_wait_exclusive(&vblk->queue_wait, &wait,
> +					  TASK_UNINTERRUPTIBLE);
> +
> +		spin_lock_irq(vblk->disk->queue->queue_lock);
> +		if (virtqueue_add_buf(vblk->vq, vbr->sg, out, in, vbr,
> +				      GFP_ATOMIC) < 0) {
> +			spin_unlock_irq(vblk->disk->queue->queue_lock);
> +			io_schedule();
> +		} else {
> +			virtqueue_kick(vblk->vq);
> +			spin_unlock_irq(vblk->disk->queue->queue_lock);
> +			break;
> +		}
> +
> +	}
> +
> +	finish_wait(&vblk->queue_wait, &wait);
> +}
> +
> +static void virtblk_make_request(struct request_queue *q, struct bio *bio)
> +{
> +	struct virtio_blk *vblk = q->queuedata;
> +	unsigned int num, out = 0, in = 0;
> +	struct virtblk_req *vbr;
> +
> +	BUG_ON(bio->bi_phys_segments + 2 > vblk->sg_elems);
> +	BUG_ON(bio->bi_rw & (REQ_FLUSH | REQ_FUA));
> +
> +	vbr = virtblk_alloc_req(vblk, GFP_NOIO);
> +	if (!vbr) {
> +		bio_endio(bio, -ENOMEM);
> +		return;
> +	}
> +
> +	vbr->bio = bio;
> +	vbr->req = NULL;
> +	vbr->out_hdr.type = 0;
> +	vbr->out_hdr.sector = bio->bi_sector;
> +	vbr->out_hdr.ioprio = bio_prio(bio);
> +
> +	sg_set_buf(&vbr->sg[out++], &vbr->out_hdr, sizeof(vbr->out_hdr));
> +
> +	num = blk_bio_map_sg(q, bio, vbr->sg + out);
> +
> +	sg_set_buf(&vbr->sg[num + out + in++], &vbr->status,
> +		   sizeof(vbr->status));
> +
> +	if (num) {
> +		if (bio->bi_rw & REQ_WRITE) {
> +			vbr->out_hdr.type |= VIRTIO_BLK_T_OUT;
> +			out += num;
> +		} else {
> +			vbr->out_hdr.type |= VIRTIO_BLK_T_IN;
> +			in += num;
> +		}
> +	}
> +
> +	spin_lock_irq(vblk->disk->queue->queue_lock);
> +	if (unlikely(virtqueue_add_buf(vblk->vq, vbr->sg, out, in, vbr,
> +				       GFP_ATOMIC) < 0)) {
> +		spin_unlock_irq(vblk->disk->queue->queue_lock);
> +		virtblk_add_buf_wait(vblk, vbr, out, in);
> +		return;
> +	}
> +	virtqueue_kick(vblk->vq);
> +	spin_unlock_irq(vblk->disk->queue->queue_lock);
> +}
> +
>  /* return id (s/n) string for *disk to *id_str
>   */
>  static int virtblk_get_id(struct gendisk *disk, char *id_str)
> @@ -360,7 +475,7 @@ static int init_vq(struct virtio_blk *vblk)
>  	int err = 0;
>  
>  	/* We expect one virtqueue, for output. */
> -	vblk->vq = virtio_find_single_vq(vblk->vdev, blk_done, "requests");
> +	vblk->vq = virtio_find_single_vq(vblk->vdev, virtblk_done, "requests");
>  	if (IS_ERR(vblk->vq))
>  		err = PTR_ERR(vblk->vq);
>  
> @@ -414,7 +529,7 @@ static void virtblk_update_cache_mode(struct virtio_device *vdev)
>  	u8 writeback = virtblk_get_cache_mode(vdev);
>  	struct virtio_blk *vblk = vdev->priv;
>  
> -	if (writeback)
> +	if (writeback && !use_bio)
>  		blk_queue_flush(vblk->disk->queue, REQ_FLUSH);
>  	else
>  		blk_queue_flush(vblk->disk->queue, 0);
> @@ -477,6 +592,8 @@ static int __devinit virtblk_probe(struct virtio_device *vdev)
>  	struct virtio_blk *vblk;
>  	struct request_queue *q;
>  	int err, index;
> +	int pool_size;
> +
>  	u64 cap;
>  	u32 v, blk_size, sg_elems, opt_io_size;
>  	u16 min_io_size;
> @@ -506,10 +623,12 @@ static int __devinit virtblk_probe(struct virtio_device *vdev)
>  		goto out_free_index;
>  	}
>  
> +	init_waitqueue_head(&vblk->queue_wait);
>  	vblk->vdev = vdev;
>  	vblk->sg_elems = sg_elems;
>  	sg_init_table(vblk->sg, vblk->sg_elems);
>  	mutex_init(&vblk->config_lock);
> +
>  	INIT_WORK(&vblk->config_work, virtblk_config_changed_work);
>  	vblk->config_enable = true;
>  
> @@ -517,7 +636,10 @@ static int __devinit virtblk_probe(struct virtio_device *vdev)
>  	if (err)
>  		goto out_free_vblk;
>  
> -	vblk->pool = mempool_create_kmalloc_pool(1,sizeof(struct virtblk_req));
> +	pool_size = sizeof(struct virtblk_req);
> +	if (use_bio)
> +		pool_size += sizeof(struct scatterlist) * sg_elems;
> +	vblk->pool = mempool_create_kmalloc_pool(1, pool_size);
>  	if (!vblk->pool) {
>  		err = -ENOMEM;
>  		goto out_free_vq;
> @@ -530,12 +652,14 @@ static int __devinit virtblk_probe(struct virtio_device *vdev)
>  		goto out_mempool;
>  	}
>  
> -	q = vblk->disk->queue = blk_init_queue(do_virtblk_request, NULL);
> +	q = vblk->disk->queue = blk_init_queue(virtblk_request, NULL);
>  	if (!q) {
>  		err = -ENOMEM;
>  		goto out_put_disk;
>  	}
>  
> +	if (use_bio)
> +		blk_queue_make_request(q, virtblk_make_request);
>  	q->queuedata = vblk;
>  
>  	virtblk_name_format("vd", index, vblk->disk->disk_name, DISK_NAME_LEN);
> @@ -620,7 +744,6 @@ static int __devinit virtblk_probe(struct virtio_device *vdev)
>  	if (!err && opt_io_size)
>  		blk_queue_io_opt(q, blk_size * opt_io_size);
>  
> -
>  	add_disk(vblk->disk);
>  	err = device_create_file(disk_to_dev(vblk->disk), &dev_attr_serial);
>  	if (err)
> -- 
> 1.7.10.4

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH V4 3/3] virtio-blk: Add bio-based IO path for virtio-blk
  2012-07-29 11:11   ` Michael S. Tsirkin
@ 2012-07-30  1:55     ` Rusty Russell
  2012-07-30 13:42       ` Christoph Hellwig
  2012-07-30  4:33     ` Asias He
  1 sibling, 1 reply; 10+ messages in thread
From: Rusty Russell @ 2012-07-30  1:55 UTC (permalink / raw)
  To: Michael S. Tsirkin, Asias He
  Cc: linux-kernel, virtualization, kvm, Christoph Hellwig, Minchan Kim

On Sun, 29 Jul 2012 14:11:15 +0300, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> On Sat, Jul 28, 2012 at 10:21:05AM +0800, Asias He wrote:
> > This patch introduces bio-based IO path for virtio-blk.
> > 
> > Compared to request-based IO path, bio-based IO path uses driver
> > provided ->make_request_fn() method to bypasses the IO scheduler. It
> > handles the bio to device directly without allocating a request in block
> > layer. This reduces the IO path in guest kernel to achieve high IOPS
> > and lower latency. The downside is that guest can not use the IO
> > scheduler to merge and sort requests. However, this is not a big problem
> > if the backend disk in host side uses faster disk device.
> 
> If this optimization depends on the host, then it
> should be reported to the guest using a feature bit,
> as opposed to being guest driven.

I consider this approach a half-way step.  Quick attempts on my laptop
and I couldn't find a case where the bio path was a loss, but in theory
if the host wasn't doing any reordering and it was a slow device, you'd
want the guest to do so.

I'm not sure if current qemu can be configured to do such a thing?

Cheers,
Rusty.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH V4 3/3] virtio-blk: Add bio-based IO path for virtio-blk
  2012-07-29 11:11   ` Michael S. Tsirkin
  2012-07-30  1:55     ` Rusty Russell
@ 2012-07-30  4:33     ` Asias He
  1 sibling, 0 replies; 10+ messages in thread
From: Asias He @ 2012-07-30  4:33 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: linux-kernel, Rusty Russell, virtualization, kvm,
	Christoph Hellwig, Minchan Kim

On 07/29/2012 07:11 PM, Michael S. Tsirkin wrote:
> On Sat, Jul 28, 2012 at 10:21:05AM +0800, Asias He wrote:
>> This patch introduces bio-based IO path for virtio-blk.
>>
>> Compared to request-based IO path, bio-based IO path uses driver
>> provided ->make_request_fn() method to bypasses the IO scheduler. It
>> handles the bio to device directly without allocating a request in block
>> layer. This reduces the IO path in guest kernel to achieve high IOPS
>> and lower latency. The downside is that guest can not use the IO
>> scheduler to merge and sort requests. However, this is not a big problem
>> if the backend disk in host side uses faster disk device.
>
> If this optimization depends on the host, then it
> should be reported to the guest using a feature bit,
> as opposed to being guest driven.
>
>> When the bio-based IO path is not enabled, virtio-blk still uses the
>> original request-based IO path, no performance difference is observed.
>>
>> Performance evaluation:
>> -----------------------------
>> 1) Fio test is performed in a 8 vcpu guest with ramdisk based guest using
>> kvm tool.
>>
>> Short version:
>>   With bio-based IO path, sequential read/write, random read/write
>>   IOPS boost         : 28%, 24%, 21%, 16%
>>   Latency improvement: 32%, 17%, 21%, 16%
>>
>> Long version:
>>   With bio-based IO path:
>>    seq-read  : io=2048.0MB, bw=116996KB/s, iops=233991 , runt= 17925msec
>>    seq-write : io=2048.0MB, bw=100829KB/s, iops=201658 , runt= 20799msec
>>    rand-read : io=3095.7MB, bw=112134KB/s, iops=224268 , runt= 28269msec
>>    rand-write: io=3095.7MB, bw=96198KB/s,  iops=192396 , runt= 32952msec
>>      clat (usec): min=0 , max=2631.6K, avg=58716.99, stdev=191377.30
>>      clat (usec): min=0 , max=1753.2K, avg=66423.25, stdev=81774.35
>>      clat (usec): min=0 , max=2915.5K, avg=61685.70, stdev=120598.39
>>      clat (usec): min=0 , max=1933.4K, avg=76935.12, stdev=96603.45
>>    cpu : usr=74.08%, sys=703.84%, ctx=29661403, majf=21354, minf=22460954
>>    cpu : usr=70.92%, sys=702.81%, ctx=77219828, majf=13980, minf=27713137
>>    cpu : usr=72.23%, sys=695.37%, ctx=88081059, majf=18475, minf=28177648
>>    cpu : usr=69.69%, sys=654.13%, ctx=145476035, majf=15867, minf=26176375
>>   With request-based IO path:
>>    seq-read  : io=2048.0MB, bw=91074KB/s, iops=182147 , runt= 23027msec
>>    seq-write : io=2048.0MB, bw=80725KB/s, iops=161449 , runt= 25979msec
>>    rand-read : io=3095.7MB, bw=92106KB/s, iops=184211 , runt= 34416msec
>>    rand-write: io=3095.7MB, bw=82815KB/s, iops=165630 , runt= 38277msec
>>      clat (usec): min=0 , max=1932.4K, avg=77824.17, stdev=170339.49
>>      clat (usec): min=0 , max=2510.2K, avg=78023.96, stdev=146949.15
>>      clat (usec): min=0 , max=3037.2K, avg=74746.53, stdev=128498.27
>>      clat (usec): min=0 , max=1363.4K, avg=89830.75, stdev=114279.68
>>    cpu : usr=53.28%, sys=724.19%, ctx=37988895, majf=17531, minf=23577622
>>    cpu : usr=49.03%, sys=633.20%, ctx=205935380, majf=18197, minf=27288959
>>    cpu : usr=55.78%, sys=722.40%, ctx=101525058, majf=19273, minf=28067082
>>    cpu : usr=56.55%, sys=690.83%, ctx=228205022, majf=18039, minf=26551985
>
>
> So bio based causes cpu to jump up by some 30%?
> What happens if you divide IOPS/CPU?
> Any improvement that comes from increasing the cpu share
> of the given guest on the host will not scale well on
> a typical overcommitted host.

if you add sys and usr time up, the jump is that much.

For ramdisk based device,

bio-based
------------------
 >>> 74.08 + 703.84
777.9200000000001
 >>> 70.92 + 702.81
773.7299999999999
 >>> 72.23 + 695.37
767.6
 >>> 69.69 + 654.13
723.8199999999999
 >>> 53.28 + 724.19
777.47

req-based
------------------
 >>> 53.28 + 724.19
777.47
 >>> 49.03 + 633.20
682.23
 >>> 55.78 + 722.40
778.18
 >>> 56.55 + 690.83
747.38


And for real device(fusion io), the cpu time is smaller with bio path.
bio-based
------------------
 >>> 56.79 + 421.70
478.49
 >>> 61.81 + 455.53
517.3399999999999
 >>> 63.10+455.38
518.48
 >>> 62.04 + 453.58
515.62

req-based
------------------
 >>> 44.08 + 590.71
634.7900000000001
 >>> 48.73 + 610.78
659.51
 >>> 45.58 + 581.16
626.74
 >>> 48.40 + 599.65
648.05


>
>> 2) Fio test is performed in a 8 vcpu guest with Fusion-IO based guest using
>> kvm tool.
>>
>> Short version:
>>   With bio-based IO path, sequential read/write, random read/write
>>   IOPS boost         : 11%, 11%, 13%, 10%
>>   Latency improvement: 10%, 10%, 12%, 10%
>> Long Version:
>>   With bio-based IO path:
>>    read : io=2048.0MB, bw=58920KB/s, iops=117840 , runt= 35593msec
>>    write: io=2048.0MB, bw=64308KB/s, iops=128616 , runt= 32611msec
>>    read : io=3095.7MB, bw=59633KB/s, iops=119266 , runt= 53157msec
>>    write: io=3095.7MB, bw=62993KB/s, iops=125985 , runt= 50322msec
>>      clat (usec): min=0 , max=1284.3K, avg=128109.01, stdev=71513.29
>>      clat (usec): min=94 , max=962339 , avg=116832.95, stdev=65836.80
>>      clat (usec): min=0 , max=1846.6K, avg=128509.99, stdev=89575.07
>>      clat (usec): min=0 , max=2256.4K, avg=121361.84, stdev=82747.25
>>    cpu : usr=56.79%, sys=421.70%, ctx=147335118, majf=21080, minf=19852517
>>    cpu : usr=61.81%, sys=455.53%, ctx=143269950, majf=16027, minf=24800604
>>    cpu : usr=63.10%, sys=455.38%, ctx=178373538, majf=16958, minf=24822612
>>    cpu : usr=62.04%, sys=453.58%, ctx=226902362, majf=16089, minf=23278105
>>   With request-based IO path:
>>    read : io=2048.0MB, bw=52896KB/s, iops=105791 , runt= 39647msec
>>    write: io=2048.0MB, bw=57856KB/s, iops=115711 , runt= 36248msec
>>    read : io=3095.7MB, bw=52387KB/s, iops=104773 , runt= 60510msec
>>    write: io=3095.7MB, bw=57310KB/s, iops=114619 , runt= 55312msec
>>      clat (usec): min=0 , max=1532.6K, avg=142085.62, stdev=109196.84
>>      clat (usec): min=0 , max=1487.4K, avg=129110.71, stdev=114973.64
>>      clat (usec): min=0 , max=1388.6K, avg=145049.22, stdev=107232.55
>>      clat (usec): min=0 , max=1465.9K, avg=133585.67, stdev=110322.95
>>    cpu : usr=44.08%, sys=590.71%, ctx=451812322, majf=14841, minf=17648641
>>    cpu : usr=48.73%, sys=610.78%, ctx=418953997, majf=22164, minf=26850689
>>    cpu : usr=45.58%, sys=581.16%, ctx=714079216, majf=21497, minf=22558223
>>    cpu : usr=48.40%, sys=599.65%, ctx=656089423, majf=16393, minf=23824409
>
>
> Is this host or guest cpu? We should probably measure host cpu
> as this includes device overhead which could vary by load.

It's guest cpu. Yes, host cpu is also interesting.

>
>> How to use:
>> -----------------------------
>> Add 'virtio_blk.use_bio=1' to kernel cmdline or 'modprobe virtio_blk
>> use_bio=1' to enable ->make_request_fn() based I/O path.
>>
>> Cc: Rusty Russell <rusty@rustcorp.com.au>
>> Cc: "Michael S. Tsirkin" <mst@redhat.com>
>> Cc: virtualization@lists.linux-foundation.org
>> Cc: kvm@vger.kernel.org
>> Cc: linux-kernel@vger.kernel.org
>> Acked-by: Rusty Russell <rusty@rustcorp.com.au>
>> Signed-off-by: Christoph Hellwig <hch@lst.de>
>> Signed-off-by: Minchan Kim <minchan.kim@gmail.com>
>> Signed-off-by: Asias He <asias@redhat.com>
>> ---
>>   drivers/block/virtio_blk.c |  203 +++++++++++++++++++++++++++++++++++---------
>>   1 file changed, 163 insertions(+), 40 deletions(-)
>>
>> diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
>> index c0bbeb4..95cfeed 100644
>> --- a/drivers/block/virtio_blk.c
>> +++ b/drivers/block/virtio_blk.c
>> @@ -14,6 +14,9 @@
>>
>>   #define PART_BITS 4
>>
>> +static bool use_bio;
>> +module_param(use_bio, bool, S_IRUGO);
>> +
>>   static int major;
>>   static DEFINE_IDA(vd_index_ida);
>>
>> @@ -23,6 +26,7 @@ struct virtio_blk
>>   {
>>   	struct virtio_device *vdev;
>>   	struct virtqueue *vq;
>> +	wait_queue_head_t queue_wait;
>>
>>   	/* The disk structure for the kernel. */
>>   	struct gendisk *disk;
>> @@ -51,53 +55,87 @@ struct virtio_blk
>>   struct virtblk_req
>>   {
>>   	struct request *req;
>> +	struct bio *bio;
>>   	struct virtio_blk_outhdr out_hdr;
>>   	struct virtio_scsi_inhdr in_hdr;
>>   	u8 status;
>> +	struct scatterlist sg[];
>>   };
>>
>> -static void blk_done(struct virtqueue *vq)
>> +static inline int virtblk_result(struct virtblk_req *vbr)
>> +{
>> +	switch (vbr->status) {
>> +	case VIRTIO_BLK_S_OK:
>> +		return 0;
>> +	case VIRTIO_BLK_S_UNSUPP:
>> +		return -ENOTTY;
>> +	default:
>> +		return -EIO;
>> +	}
>> +}
>> +
>> +static inline void virtblk_request_done(struct virtio_blk *vblk,
>> +					struct virtblk_req *vbr)
>> +{
>> +	struct request *req = vbr->req;
>> +	int error = virtblk_result(vbr);
>> +
>> +	if (req->cmd_type == REQ_TYPE_BLOCK_PC) {
>> +		req->resid_len = vbr->in_hdr.residual;
>> +		req->sense_len = vbr->in_hdr.sense_len;
>> +		req->errors = vbr->in_hdr.errors;
>> +	} else if (req->cmd_type == REQ_TYPE_SPECIAL) {
>> +		req->errors = (error != 0);
>> +	}
>> +
>> +	__blk_end_request_all(req, error);
>> +	mempool_free(vbr, vblk->pool);
>> +}
>> +
>> +static inline void virtblk_bio_done(struct virtio_blk *vblk,
>> +				    struct virtblk_req *vbr)
>> +{
>> +	bio_endio(vbr->bio, virtblk_result(vbr));
>> +	mempool_free(vbr, vblk->pool);
>> +}
>> +
>> +static void virtblk_done(struct virtqueue *vq)
>>   {
>>   	struct virtio_blk *vblk = vq->vdev->priv;
>> +	unsigned long bio_done = 0, req_done = 0;
>>   	struct virtblk_req *vbr;
>> -	unsigned int len;
>>   	unsigned long flags;
>> +	unsigned int len;
>>
>>   	spin_lock_irqsave(vblk->disk->queue->queue_lock, flags);
>>   	while ((vbr = virtqueue_get_buf(vblk->vq, &len)) != NULL) {
>> -		int error;
>> -
>> -		switch (vbr->status) {
>> -		case VIRTIO_BLK_S_OK:
>> -			error = 0;
>> -			break;
>> -		case VIRTIO_BLK_S_UNSUPP:
>> -			error = -ENOTTY;
>> -			break;
>> -		default:
>> -			error = -EIO;
>> -			break;
>> -		}
>> -
>> -		switch (vbr->req->cmd_type) {
>> -		case REQ_TYPE_BLOCK_PC:
>> -			vbr->req->resid_len = vbr->in_hdr.residual;
>> -			vbr->req->sense_len = vbr->in_hdr.sense_len;
>> -			vbr->req->errors = vbr->in_hdr.errors;
>> -			break;
>> -		case REQ_TYPE_SPECIAL:
>> -			vbr->req->errors = (error != 0);
>> -			break;
>> -		default:
>> -			break;
>> +		if (vbr->bio) {
>> +			virtblk_bio_done(vblk, vbr);
>> +			bio_done++;
>> +		} else {
>> +			virtblk_request_done(vblk, vbr);
>> +			req_done++;
>>   		}
>> -
>> -		__blk_end_request_all(vbr->req, error);
>> -		mempool_free(vbr, vblk->pool);
>>   	}
>>   	/* In case queue is stopped waiting for more buffers. */
>> -	blk_start_queue(vblk->disk->queue);
>> +	if (req_done)
>> +		blk_start_queue(vblk->disk->queue);
>>   	spin_unlock_irqrestore(vblk->disk->queue->queue_lock, flags);
>> +
>> +	if (bio_done)
>> +		wake_up(&vblk->queue_wait);
>> +}
>> +
>> +static inline struct virtblk_req *virtblk_alloc_req(struct virtio_blk *vblk,
>> +						    gfp_t gfp_mask)
>> +{
>> +	struct virtblk_req *vbr;
>> +
>> +	vbr = mempool_alloc(vblk->pool, gfp_mask);
>> +	if (vbr && use_bio)
>> +		sg_init_table(vbr->sg, vblk->sg_elems);
>> +
>> +	return vbr;
>>   }
>>
>>   static bool do_req(struct request_queue *q, struct virtio_blk *vblk,
>> @@ -106,13 +144,13 @@ static bool do_req(struct request_queue *q, struct virtio_blk *vblk,
>>   	unsigned long num, out = 0, in = 0;
>>   	struct virtblk_req *vbr;
>>
>> -	vbr = mempool_alloc(vblk->pool, GFP_ATOMIC);
>> +	vbr = virtblk_alloc_req(vblk, GFP_ATOMIC);
>>   	if (!vbr)
>>   		/* When another request finishes we'll try again. */
>>   		return false;
>>
>>   	vbr->req = req;
>> -
>> +	vbr->bio = NULL;
>>   	if (req->cmd_flags & REQ_FLUSH) {
>>   		vbr->out_hdr.type = VIRTIO_BLK_T_FLUSH;
>>   		vbr->out_hdr.sector = 0;
>> @@ -172,7 +210,8 @@ static bool do_req(struct request_queue *q, struct virtio_blk *vblk,
>>   		}
>>   	}
>>
>> -	if (virtqueue_add_buf(vblk->vq, vblk->sg, out, in, vbr, GFP_ATOMIC)<0) {
>> +	if (virtqueue_add_buf(vblk->vq, vblk->sg, out, in, vbr,
>> +			      GFP_ATOMIC) < 0) {
>>   		mempool_free(vbr, vblk->pool);
>>   		return false;
>>   	}
>> @@ -180,7 +219,7 @@ static bool do_req(struct request_queue *q, struct virtio_blk *vblk,
>>   	return true;
>>   }
>>
>> -static void do_virtblk_request(struct request_queue *q)
>> +static void virtblk_request(struct request_queue *q)
>>   {
>>   	struct virtio_blk *vblk = q->queuedata;
>>   	struct request *req;
>> @@ -203,6 +242,82 @@ static void do_virtblk_request(struct request_queue *q)
>>   		virtqueue_kick(vblk->vq);
>>   }
>>
>> +static void virtblk_add_buf_wait(struct virtio_blk *vblk,
>> +				 struct virtblk_req *vbr,
>> +				 unsigned long out,
>> +				 unsigned long in)
>> +{
>> +	DEFINE_WAIT(wait);
>> +
>> +	for (;;) {
>> +		prepare_to_wait_exclusive(&vblk->queue_wait, &wait,
>> +					  TASK_UNINTERRUPTIBLE);
>> +
>> +		spin_lock_irq(vblk->disk->queue->queue_lock);
>> +		if (virtqueue_add_buf(vblk->vq, vbr->sg, out, in, vbr,
>> +				      GFP_ATOMIC) < 0) {
>> +			spin_unlock_irq(vblk->disk->queue->queue_lock);
>> +			io_schedule();
>> +		} else {
>> +			virtqueue_kick(vblk->vq);
>> +			spin_unlock_irq(vblk->disk->queue->queue_lock);
>> +			break;
>> +		}
>> +
>> +	}
>> +
>> +	finish_wait(&vblk->queue_wait, &wait);
>> +}
>> +
>> +static void virtblk_make_request(struct request_queue *q, struct bio *bio)
>> +{
>> +	struct virtio_blk *vblk = q->queuedata;
>> +	unsigned int num, out = 0, in = 0;
>> +	struct virtblk_req *vbr;
>> +
>> +	BUG_ON(bio->bi_phys_segments + 2 > vblk->sg_elems);
>> +	BUG_ON(bio->bi_rw & (REQ_FLUSH | REQ_FUA));
>> +
>> +	vbr = virtblk_alloc_req(vblk, GFP_NOIO);
>> +	if (!vbr) {
>> +		bio_endio(bio, -ENOMEM);
>> +		return;
>> +	}
>> +
>> +	vbr->bio = bio;
>> +	vbr->req = NULL;
>> +	vbr->out_hdr.type = 0;
>> +	vbr->out_hdr.sector = bio->bi_sector;
>> +	vbr->out_hdr.ioprio = bio_prio(bio);
>> +
>> +	sg_set_buf(&vbr->sg[out++], &vbr->out_hdr, sizeof(vbr->out_hdr));
>> +
>> +	num = blk_bio_map_sg(q, bio, vbr->sg + out);
>> +
>> +	sg_set_buf(&vbr->sg[num + out + in++], &vbr->status,
>> +		   sizeof(vbr->status));
>> +
>> +	if (num) {
>> +		if (bio->bi_rw & REQ_WRITE) {
>> +			vbr->out_hdr.type |= VIRTIO_BLK_T_OUT;
>> +			out += num;
>> +		} else {
>> +			vbr->out_hdr.type |= VIRTIO_BLK_T_IN;
>> +			in += num;
>> +		}
>> +	}
>> +
>> +	spin_lock_irq(vblk->disk->queue->queue_lock);
>> +	if (unlikely(virtqueue_add_buf(vblk->vq, vbr->sg, out, in, vbr,
>> +				       GFP_ATOMIC) < 0)) {
>> +		spin_unlock_irq(vblk->disk->queue->queue_lock);
>> +		virtblk_add_buf_wait(vblk, vbr, out, in);
>> +		return;
>> +	}
>> +	virtqueue_kick(vblk->vq);
>> +	spin_unlock_irq(vblk->disk->queue->queue_lock);
>> +}
>> +
>>   /* return id (s/n) string for *disk to *id_str
>>    */
>>   static int virtblk_get_id(struct gendisk *disk, char *id_str)
>> @@ -360,7 +475,7 @@ static int init_vq(struct virtio_blk *vblk)
>>   	int err = 0;
>>
>>   	/* We expect one virtqueue, for output. */
>> -	vblk->vq = virtio_find_single_vq(vblk->vdev, blk_done, "requests");
>> +	vblk->vq = virtio_find_single_vq(vblk->vdev, virtblk_done, "requests");
>>   	if (IS_ERR(vblk->vq))
>>   		err = PTR_ERR(vblk->vq);
>>
>> @@ -414,7 +529,7 @@ static void virtblk_update_cache_mode(struct virtio_device *vdev)
>>   	u8 writeback = virtblk_get_cache_mode(vdev);
>>   	struct virtio_blk *vblk = vdev->priv;
>>
>> -	if (writeback)
>> +	if (writeback && !use_bio)
>>   		blk_queue_flush(vblk->disk->queue, REQ_FLUSH);
>>   	else
>>   		blk_queue_flush(vblk->disk->queue, 0);
>> @@ -477,6 +592,8 @@ static int __devinit virtblk_probe(struct virtio_device *vdev)
>>   	struct virtio_blk *vblk;
>>   	struct request_queue *q;
>>   	int err, index;
>> +	int pool_size;
>> +
>>   	u64 cap;
>>   	u32 v, blk_size, sg_elems, opt_io_size;
>>   	u16 min_io_size;
>> @@ -506,10 +623,12 @@ static int __devinit virtblk_probe(struct virtio_device *vdev)
>>   		goto out_free_index;
>>   	}
>>
>> +	init_waitqueue_head(&vblk->queue_wait);
>>   	vblk->vdev = vdev;
>>   	vblk->sg_elems = sg_elems;
>>   	sg_init_table(vblk->sg, vblk->sg_elems);
>>   	mutex_init(&vblk->config_lock);
>> +
>>   	INIT_WORK(&vblk->config_work, virtblk_config_changed_work);
>>   	vblk->config_enable = true;
>>
>> @@ -517,7 +636,10 @@ static int __devinit virtblk_probe(struct virtio_device *vdev)
>>   	if (err)
>>   		goto out_free_vblk;
>>
>> -	vblk->pool = mempool_create_kmalloc_pool(1,sizeof(struct virtblk_req));
>> +	pool_size = sizeof(struct virtblk_req);
>> +	if (use_bio)
>> +		pool_size += sizeof(struct scatterlist) * sg_elems;
>> +	vblk->pool = mempool_create_kmalloc_pool(1, pool_size);
>>   	if (!vblk->pool) {
>>   		err = -ENOMEM;
>>   		goto out_free_vq;
>> @@ -530,12 +652,14 @@ static int __devinit virtblk_probe(struct virtio_device *vdev)
>>   		goto out_mempool;
>>   	}
>>
>> -	q = vblk->disk->queue = blk_init_queue(do_virtblk_request, NULL);
>> +	q = vblk->disk->queue = blk_init_queue(virtblk_request, NULL);
>>   	if (!q) {
>>   		err = -ENOMEM;
>>   		goto out_put_disk;
>>   	}
>>
>> +	if (use_bio)
>> +		blk_queue_make_request(q, virtblk_make_request);
>>   	q->queuedata = vblk;
>>
>>   	virtblk_name_format("vd", index, vblk->disk->disk_name, DISK_NAME_LEN);
>> @@ -620,7 +744,6 @@ static int __devinit virtblk_probe(struct virtio_device *vdev)
>>   	if (!err && opt_io_size)
>>   		blk_queue_io_opt(q, blk_size * opt_io_size);
>>
>> -
>>   	add_disk(vblk->disk);
>>   	err = device_create_file(disk_to_dev(vblk->disk), &dev_attr_serial);
>>   	if (err)
>> --
>> 1.7.10.4


-- 
Asias

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH V4 3/3] virtio-blk: Add bio-based IO path for virtio-blk
  2012-07-28  6:35   ` Sasha Levin
@ 2012-07-30  6:27     ` Asias He
  0 siblings, 0 replies; 10+ messages in thread
From: Asias He @ 2012-07-30  6:27 UTC (permalink / raw)
  To: Sasha Levin
  Cc: linux-kernel, Rusty Russell, Michael S. Tsirkin, virtualization,
	kvm, Christoph Hellwig, Minchan Kim

Hello Sasha,

On 07/28/2012 02:35 PM, Sasha Levin wrote:
> On 07/28/2012 04:21 AM, Asias He wrote:
>> This patch introduces bio-based IO path for virtio-blk.
>>
>> Compared to request-based IO path, bio-based IO path uses driver
>> provided ->make_request_fn() method to bypasses the IO scheduler. It
>> handles the bio to device directly without allocating a request in block
>> layer. This reduces the IO path in guest kernel to achieve high IOPS
>> and lower latency. The downside is that guest can not use the IO
>> scheduler to merge and sort requests. However, this is not a big problem
>> if the backend disk in host side uses faster disk device.
>>
>> When the bio-based IO path is not enabled, virtio-blk still uses the
>> original request-based IO path, no performance difference is observed.
>>
>> Performance evaluation:
>> -----------------------------
>> 1) Fio test is performed in a 8 vcpu guest with ramdisk based guest using
>> kvm tool.
>>
>> Short version:
>>   With bio-based IO path, sequential read/write, random read/write
>>   IOPS boost         : 28%, 24%, 21%, 16%
>>   Latency improvement: 32%, 17%, 21%, 16%
>>
>> Long version:
>>   With bio-based IO path:
>>    seq-read  : io=2048.0MB, bw=116996KB/s, iops=233991 , runt= 17925msec
>>    seq-write : io=2048.0MB, bw=100829KB/s, iops=201658 , runt= 20799msec
>>    rand-read : io=3095.7MB, bw=112134KB/s, iops=224268 , runt= 28269msec
>>    rand-write: io=3095.7MB, bw=96198KB/s,  iops=192396 , runt= 32952msec
>>      clat (usec): min=0 , max=2631.6K, avg=58716.99, stdev=191377.30
>>      clat (usec): min=0 , max=1753.2K, avg=66423.25, stdev=81774.35
>>      clat (usec): min=0 , max=2915.5K, avg=61685.70, stdev=120598.39
>>      clat (usec): min=0 , max=1933.4K, avg=76935.12, stdev=96603.45
>>    cpu : usr=74.08%, sys=703.84%, ctx=29661403, majf=21354, minf=22460954
>>    cpu : usr=70.92%, sys=702.81%, ctx=77219828, majf=13980, minf=27713137
>>    cpu : usr=72.23%, sys=695.37%, ctx=88081059, majf=18475, minf=28177648
>>    cpu : usr=69.69%, sys=654.13%, ctx=145476035, majf=15867, minf=26176375
>>   With request-based IO path:
>>    seq-read  : io=2048.0MB, bw=91074KB/s, iops=182147 , runt= 23027msec
>>    seq-write : io=2048.0MB, bw=80725KB/s, iops=161449 , runt= 25979msec
>>    rand-read : io=3095.7MB, bw=92106KB/s, iops=184211 , runt= 34416msec
>>    rand-write: io=3095.7MB, bw=82815KB/s, iops=165630 , runt= 38277msec
>>      clat (usec): min=0 , max=1932.4K, avg=77824.17, stdev=170339.49
>>      clat (usec): min=0 , max=2510.2K, avg=78023.96, stdev=146949.15
>>      clat (usec): min=0 , max=3037.2K, avg=74746.53, stdev=128498.27
>>      clat (usec): min=0 , max=1363.4K, avg=89830.75, stdev=114279.68
>>    cpu : usr=53.28%, sys=724.19%, ctx=37988895, majf=17531, minf=23577622
>>    cpu : usr=49.03%, sys=633.20%, ctx=205935380, majf=18197, minf=27288959
>>    cpu : usr=55.78%, sys=722.40%, ctx=101525058, majf=19273, minf=28067082
>>    cpu : usr=56.55%, sys=690.83%, ctx=228205022, majf=18039, minf=26551985
>>
>> 2) Fio test is performed in a 8 vcpu guest with Fusion-IO based guest using
>> kvm tool.
>>
>> Short version:
>>   With bio-based IO path, sequential read/write, random read/write
>>   IOPS boost         : 11%, 11%, 13%, 10%
>>   Latency improvement: 10%, 10%, 12%, 10%
>> Long Version:
>>   With bio-based IO path:
>>    read : io=2048.0MB, bw=58920KB/s, iops=117840 , runt= 35593msec
>>    write: io=2048.0MB, bw=64308KB/s, iops=128616 , runt= 32611msec
>>    read : io=3095.7MB, bw=59633KB/s, iops=119266 , runt= 53157msec
>>    write: io=3095.7MB, bw=62993KB/s, iops=125985 , runt= 50322msec
>>      clat (usec): min=0 , max=1284.3K, avg=128109.01, stdev=71513.29
>>      clat (usec): min=94 , max=962339 , avg=116832.95, stdev=65836.80
>>      clat (usec): min=0 , max=1846.6K, avg=128509.99, stdev=89575.07
>>      clat (usec): min=0 , max=2256.4K, avg=121361.84, stdev=82747.25
>>    cpu : usr=56.79%, sys=421.70%, ctx=147335118, majf=21080, minf=19852517
>>    cpu : usr=61.81%, sys=455.53%, ctx=143269950, majf=16027, minf=24800604
>>    cpu : usr=63.10%, sys=455.38%, ctx=178373538, majf=16958, minf=24822612
>>    cpu : usr=62.04%, sys=453.58%, ctx=226902362, majf=16089, minf=23278105
>>   With request-based IO path:
>>    read : io=2048.0MB, bw=52896KB/s, iops=105791 , runt= 39647msec
>>    write: io=2048.0MB, bw=57856KB/s, iops=115711 , runt= 36248msec
>>    read : io=3095.7MB, bw=52387KB/s, iops=104773 , runt= 60510msec
>>    write: io=3095.7MB, bw=57310KB/s, iops=114619 , runt= 55312msec
>>      clat (usec): min=0 , max=1532.6K, avg=142085.62, stdev=109196.84
>>      clat (usec): min=0 , max=1487.4K, avg=129110.71, stdev=114973.64
>>      clat (usec): min=0 , max=1388.6K, avg=145049.22, stdev=107232.55
>>      clat (usec): min=0 , max=1465.9K, avg=133585.67, stdev=110322.95
>>    cpu : usr=44.08%, sys=590.71%, ctx=451812322, majf=14841, minf=17648641
>>    cpu : usr=48.73%, sys=610.78%, ctx=418953997, majf=22164, minf=26850689
>>    cpu : usr=45.58%, sys=581.16%, ctx=714079216, majf=21497, minf=22558223
>>    cpu : usr=48.40%, sys=599.65%, ctx=656089423, majf=16393, minf=23824409
>
> What are the cases where we'll see a performance degradation with using the bio path? Could we measure performance for those as well?
>
>> How to use:
>> -----------------------------
>> Add 'virtio_blk.use_bio=1' to kernel cmdline or 'modprobe virtio_blk
>> use_bio=1' to enable ->make_request_fn() based I/O path.
>
> If there are, in fact, no cases where performance is degraded, can use_bio=1 be the default?

Here are some results on a SATA based image file. In this case, the bio 
path is slower than req path doing sequential read/write.

qemu use_bio=0 (req-based)
-------------------------------------
   read : io=120964KB, bw=26098KB/s, iops=6470 , runt=  4635msec
   write: io=190236KB, bw=35505KB/s, iops=8829 , runt=  5358msec
   read : io=257400KB, bw=1963.7KB/s, iops=488 , runt=131081msec
   write: io=258944KB, bw=1302.7KB/s, iops=324 , runt=198872msec
     clat (msec): min=1 , max=1527 , avg=30.73, stdev=144.73
     clat (usec): min=811 , max=247072 , avg=28451.71, stdev=16107.22
     clat (msec): min=6 , max=2519 , avg=513.91, stdev=231.07
     clat (msec): min=33 , max=2621 , avg=772.33, stdev=348.39
   cpu          : usr=4.05%, sys=14.56%, ctx=38199, majf=0, minf=4
   cpu          : usr=4.02%, sys=15.48%, ctx=53724, majf=0, minf=0
   cpu          : usr=0.15%, sys=0.30%, ctx=20535, majf=0, minf=16
   cpu          : usr=0.32%, sys=0.96%, ctx=101465, majf=0, minf=0


qemu use_bio=1 (bio-based)
-------------------------------------
   read : io=202736KB, bw=25569KB/s, iops=6360 , runt=  7929msec
   write: io=217844KB, bw=20335KB/s, iops=5060 , runt= 10713msec
   read : io=256980KB, bw=1958.2KB/s, iops=487 , runt=131235msec
   write: io=258288KB, bw=1423.9KB/s, iops=354 , runt=181405msec
     clat (usec): min=922 , max=1578.2K, avg=38702.18, stdev=99248.33
     clat (usec): min=460 , max=241314 , avg=49326.52, stdev=18705.68
     clat (msec): min=19 , max=2370 , avg=515.30, stdev=200.84
     clat (msec): min=11 , max=3751 , avg=702.60, stdev=286.93
   cpu          : usr=2.54%, sys=8.75%, ctx=68522, majf=0, minf=6
   cpu          : usr=1.96%, sys=7.70%, ctx=70003, majf=0, minf=0
   cpu          : usr=0.39%, sys=1.46%, ctx=259459, majf=0, minf=16
   cpu          : usr=0.28%, sys=1.21%, ctx=265148, majf=0, minf=0


lkvm use_bio=0 (req-based)
-------------------------------------
   read : io=150120KB, bw=40420KB/s, iops=10037 , runt=  3714msec
   write: io=194932KB, bw=27029KB/s, iops=6722 , runt=  7212msec
   read : io=257136KB, bw=2001.1KB/s, iops=498 , runt=128443msec
   write: io=258276KB, bw=1537.2KB/s, iops=382 , runt=168028msec
     clat (msec): min=1 , max=1542 , avg=24.84, stdev=32.45
     clat (msec): min=3 , max=628 , avg=35.62, stdev=39.71
     clat (msec): min=8 , max=2540 , avg=503.28, stdev=236.97
     clat (msec): min=41 , max=4398 , avg=653.88, stdev=302.61
   cpu          : usr=3.91%, sys=15.75%, ctx=26968, majf=0, minf=23
   cpu          : usr=2.50%, sys=10.56%, ctx=19090, majf=0, minf=0
   cpu          : usr=0.16%, sys=0.43%, ctx=20159, majf=0, minf=16
   cpu          : usr=0.18%, sys=0.53%, ctx=81364, majf=0, minf=0


lkvm use_bio=1 (bio-based)
-------------------------------------
   read : io=124812KB, bw=36537KB/s, iops=9060 , runt=  3416msec
   write: io=169180KB, bw=24406KB/s, iops=6065 , runt=  6932msec
   read : io=256200KB, bw=2089.3KB/s, iops=520 , runt=122630msec
   write: io=257988KB, bw=1545.7KB/s, iops=384 , runt=166910msec
     clat (msec): min=1 , max=1527 , avg=28.06, stdev=89.54
     clat (msec): min=2 , max=344 , avg=41.12, stdev=38.70
     clat (msec): min=8 , max=1984 , avg=490.63, stdev=207.28
     clat (msec): min=33 , max=4131 , avg=659.19, stdev=304.71
   cpu          : usr=4.85%, sys=17.15%, ctx=31593, majf=0, minf=7
   cpu          : usr=3.04%, sys=11.45%, ctx=39377, majf=0, minf=0
   cpu          : usr=0.47%, sys=1.59%, ctx=262986, majf=0, minf=16
   cpu          : usr=0.47%, sys=1.46%, ctx=337410, majf=0, minf=0

-- 
Asias

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH V4 3/3] virtio-blk: Add bio-based IO path for virtio-blk
  2012-07-30  1:55     ` Rusty Russell
@ 2012-07-30 13:42       ` Christoph Hellwig
  0 siblings, 0 replies; 10+ messages in thread
From: Christoph Hellwig @ 2012-07-30 13:42 UTC (permalink / raw)
  To: Rusty Russell
  Cc: Michael S. Tsirkin, Asias He, linux-kernel, virtualization, kvm,
	Christoph Hellwig, Minchan Kim

On Mon, Jul 30, 2012 at 11:25:51AM +0930, Rusty Russell wrote:
> I consider this approach a half-way step.  Quick attempts on my laptop
> and I couldn't find a case where the bio path was a loss, but in theory
> if the host wasn't doing any reordering and it was a slow device, you'd
> want the guest to do so.
> 
> I'm not sure if current qemu can be configured to do such a thing?


The host kernel will do the I/O scheduling for you unless you explicitly
disable it.  And we should be able to assume an administrator will only
disable it when they have a reason for it - if not they'll get worse
performance for non-virtualized workloads as well.


^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2012-07-30 13:51 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-07-28  2:21 [PATCH V4 0/3] Improve virtio-blk performance Asias He
2012-07-28  2:21 ` [PATCH V4 1/3] block: Introduce __blk_segment_map_sg() helper Asias He
2012-07-28  2:21 ` [PATCH V4 2/3] block: Add blk_bio_map_sg() helper Asias He
2012-07-28  2:21 ` [PATCH V4 3/3] virtio-blk: Add bio-based IO path for virtio-blk Asias He
2012-07-28  6:35   ` Sasha Levin
2012-07-30  6:27     ` Asias He
2012-07-29 11:11   ` Michael S. Tsirkin
2012-07-30  1:55     ` Rusty Russell
2012-07-30 13:42       ` Christoph Hellwig
2012-07-30  4:33     ` Asias He

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).