* [PATCH v2 0/3] Improve virtio-blk performance
@ 2012-06-18 6:53 ` Asias He
0 siblings, 0 replies; 80+ messages in thread
From: Asias He @ 2012-06-18 6:53 UTC (permalink / raw)
To: kvm, linux-kernel, virtualization; +Cc: Asias He
This patchset implements a bio-based IO path for virtio-blk to improve
performance.
Fio tests show a 28%, 24%, 21%, 16% IOPS boost and a 32%, 17%, 21%, 16%
latency improvement for sequential read, sequential write, random read,
and random write respectively.
Asias He (3):
block: Introduce __blk_segment_map_sg() helper
block: Add blk_bio_map_sg() helper
virtio-blk: Add bio-based IO path for virtio-blk
block/blk-merge.c | 117 +++++++++++++++++--------
drivers/block/virtio_blk.c | 203 +++++++++++++++++++++++++++++++++++---------
include/linux/blkdev.h | 2 +
3 files changed, 247 insertions(+), 75 deletions(-)
--
1.7.10.2
^ permalink raw reply [flat|nested] 80+ messages in thread
* [PATCH 1/3] block: Introduce __blk_segment_map_sg() helper
2012-06-18 6:53 ` Asias He
@ 2012-06-18 6:53 ` Asias He
0 siblings, 0 replies; 80+ messages in thread
From: Asias He @ 2012-06-18 6:53 UTC (permalink / raw)
To: kvm, linux-kernel, virtualization
Cc: Asias He, Jens Axboe, Tejun Heo, Shaohua Li
Split the mapping code in blk_rq_map_sg() into a helper,
__blk_segment_map_sg(), so that other mapping functions, e.g.
blk_bio_map_sg(), can share the code.
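The refactoring pattern here — hoisting a loop body into a helper that threads its running state through pointers, so several different loops can share it — can be sketched in plain C. This is a toy model with made-up state, not the kernel structures; it only mirrors how __blk_segment_map_sg() threads bvprv, sg, nsegs and cluster through pointer arguments:

```c
#include <assert.h>

/* Shared per-element body: running state (previous value, total,
 * segment count) is threaded through pointers, the way
 * __blk_segment_map_sg() threads bvprv, sg, nsegs and cluster. */
static void accumulate(int val, int *prev, int *total, int *nsegs)
{
    if (*prev >= 0 && val == *prev) {
        *total += val;          /* "mergeable": extend current segment */
    } else {
        *total += val;          /* start a new segment */
        (*nsegs)++;
    }
    *prev = val;
}

/* Two different loops share the same body, like blk_rq_map_sg()
 * and blk_bio_map_sg() both calling __blk_segment_map_sg(). */
static int demo(int *total)
{
    int a[] = {1, 1, 2, 3, 3, 3};
    int b[] = {5, 5};
    int prev = -1, nsegs = 0;

    *total = 0;
    for (int i = 0; i < 6; i++)
        accumulate(a[i], &prev, total, &nsegs);

    prev = -1;                  /* second caller resets its own state */
    for (int i = 0; i < 2; i++)
        accumulate(b[i], &prev, total, &nsegs);

    return nsegs;
}
```

The point is that the helper owns no state of its own; each caller keeps its loop variables and only lends them out, which is exactly what makes the body reusable from a second mapping function.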
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Tejun Heo <tj@kernel.org>
Cc: Shaohua Li <shli@kernel.org>
Cc: linux-kernel@vger.kernel.org
Suggested-by: Tejun Heo <tj@kernel.org>
Suggested-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Asias He <asias@redhat.com>
---
block/blk-merge.c | 80 ++++++++++++++++++++++++++++++-----------------------
1 file changed, 45 insertions(+), 35 deletions(-)
diff --git a/block/blk-merge.c b/block/blk-merge.c
index 160035f..576b68e 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -110,6 +110,49 @@ static int blk_phys_contig_segment(struct request_queue *q, struct bio *bio,
return 0;
}
+static void
+__blk_segment_map_sg(struct request_queue *q, struct bio_vec *bvec,
+ struct scatterlist *sglist, struct bio_vec **bvprv,
+ struct scatterlist **sg, int *nsegs, int *cluster)
+{
+
+ int nbytes = bvec->bv_len;
+
+ if (*bvprv && *cluster) {
+ if ((*sg)->length + nbytes > queue_max_segment_size(q))
+ goto new_segment;
+
+ if (!BIOVEC_PHYS_MERGEABLE(*bvprv, bvec))
+ goto new_segment;
+ if (!BIOVEC_SEG_BOUNDARY(q, *bvprv, bvec))
+ goto new_segment;
+
+ (*sg)->length += nbytes;
+ } else {
+new_segment:
+ if (!*sg)
+ *sg = sglist;
+ else {
+ /*
+ * If the driver previously mapped a shorter
+ * list, we could see a termination bit
+ * prematurely unless it fully inits the sg
+ * table on each mapping. We KNOW that there
+ * must be more entries here or the driver
+ * would be buggy, so force clear the
+ * termination bit to avoid doing a full
+ * sg_init_table() in drivers for each command.
+ */
+ (*sg)->page_link &= ~0x02;
+ *sg = sg_next(*sg);
+ }
+
+ sg_set_page(*sg, bvec->bv_page, nbytes, bvec->bv_offset);
+ (*nsegs)++;
+ }
+ *bvprv = bvec;
+}
+
/*
* map a request to scatterlist, return number of sg entries setup. Caller
* must make sure sg can hold rq->nr_phys_segments entries
@@ -131,41 +174,8 @@ int blk_rq_map_sg(struct request_queue *q, struct request *rq,
bvprv = NULL;
sg = NULL;
rq_for_each_segment(bvec, rq, iter) {
- int nbytes = bvec->bv_len;
-
- if (bvprv && cluster) {
- if (sg->length + nbytes > queue_max_segment_size(q))
- goto new_segment;
-
- if (!BIOVEC_PHYS_MERGEABLE(bvprv, bvec))
- goto new_segment;
- if (!BIOVEC_SEG_BOUNDARY(q, bvprv, bvec))
- goto new_segment;
-
- sg->length += nbytes;
- } else {
-new_segment:
- if (!sg)
- sg = sglist;
- else {
- /*
- * If the driver previously mapped a shorter
- * list, we could see a termination bit
- * prematurely unless it fully inits the sg
- * table on each mapping. We KNOW that there
- * must be more entries here or the driver
- * would be buggy, so force clear the
- * termination bit to avoid doing a full
- * sg_init_table() in drivers for each command.
- */
- sg->page_link &= ~0x02;
- sg = sg_next(sg);
- }
-
- sg_set_page(sg, bvec->bv_page, nbytes, bvec->bv_offset);
- nsegs++;
- }
- bvprv = bvec;
+ __blk_segment_map_sg(q, bvec, sglist, &bvprv, &sg,
+ &nsegs, &cluster);
} /* segments in rq */
--
1.7.10.2
* [PATCH v2 2/3] block: Add blk_bio_map_sg() helper
2012-06-18 6:53 ` Asias He
@ 2012-06-18 6:53 ` Asias He
0 siblings, 0 replies; 80+ messages in thread
From: Asias He @ 2012-06-18 6:53 UTC (permalink / raw)
To: kvm, linux-kernel, virtualization
Cc: Asias He, Jens Axboe, Tejun Heo, Shaohua Li, Christoph Hellwig,
Minchan Kim
Add a helper to map a bio to a scatterlist, modelled after
blk_rq_map_sg().
This helper is useful for any driver that wants to create
a scatterlist from its ->make_request_fn() method.
Changes in v2:
- Use __blk_segment_map_sg() to avoid duplicated code
- Add kernel-doc style function comment
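What the helper does can be modelled in userspace C (hypothetical struct and macro names; the real code operates on struct bio_vec and struct scatterlist, and the real limits come from the request_queue): each vector is folded into the previous segment when it is physically contiguous and the merged length stays within the segment-size limit, otherwise a new entry is started, and the entry count is returned:

```c
#include <assert.h>

#define MAX_SEG_SIZE 4096u          /* stand-in for queue_max_segment_size() */

struct vec { unsigned long addr; unsigned int len; };  /* ~struct bio_vec */
struct seg { unsigned long addr; unsigned int len; };  /* ~scatterlist entry */

/* Map n vectors to segments, merging where possible. Returns the
 * number of segments set up, mirroring blk_bio_map_sg()'s contract
 * of returning the number of sg entries. */
static int map_vecs(const struct vec *v, int n, struct seg *out)
{
    int nsegs = 0;

    for (int i = 0; i < n; i++) {
        struct seg *prev = nsegs ? &out[nsegs - 1] : 0;

        if (prev && prev->addr + prev->len == v[i].addr &&
            prev->len + v[i].len <= MAX_SEG_SIZE) {
            prev->len += v[i].len;              /* merge into previous */
        } else {
            out[nsegs].addr = v[i].addr;        /* start a new segment */
            out[nsegs].len  = v[i].len;
            nsegs++;
        }
    }
    return nsegs;
}
```

For example, two 512-byte vectors at 0x1000 and 0x1200 merge into one 1024-byte segment, while a third vector at 0x9000 starts a second segment, so the caller sees nsegs = 2 rather than 3.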
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Tejun Heo <tj@kernel.org>
Cc: Shaohua Li <shli@kernel.org>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Minchan Kim <minchan.kim@gmail.com>
Signed-off-by: Asias He <asias@redhat.com>
---
block/blk-merge.c | 37 +++++++++++++++++++++++++++++++++++++
include/linux/blkdev.h | 2 ++
2 files changed, 39 insertions(+)
diff --git a/block/blk-merge.c b/block/blk-merge.c
index 576b68e..e76279e 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -209,6 +209,43 @@ int blk_rq_map_sg(struct request_queue *q, struct request *rq,
}
EXPORT_SYMBOL(blk_rq_map_sg);
+/**
+ * blk_bio_map_sg - map a bio to a scatterlist
+ * @q: request_queue in question
+ * @bio: bio being mapped
+ * @sglist: scatterlist being mapped
+ *
+ * Note:
+ * Caller must make sure sg can hold bio->bi_phys_segments entries
+ *
+ * Will return the number of sg entries setup
+ */
+int blk_bio_map_sg(struct request_queue *q, struct bio *bio,
+ struct scatterlist *sglist)
+{
+ struct bio_vec *bvec, *bvprv;
+ struct scatterlist *sg;
+ int nsegs, cluster;
+ unsigned long i;
+
+ nsegs = 0;
+ cluster = blk_queue_cluster(q);
+
+ bvprv = NULL;
+ sg = NULL;
+ bio_for_each_segment(bvec, bio, i) {
+ __blk_segment_map_sg(q, bvec, sglist, &bvprv, &sg,
+ &nsegs, &cluster);
+ } /* segments in bio */
+
+ if (sg)
+ sg_mark_end(sg);
+
+ BUG_ON(bio->bi_phys_segments && nsegs > bio->bi_phys_segments);
+ return nsegs;
+}
+EXPORT_SYMBOL(blk_bio_map_sg);
+
static inline int ll_new_hw_segment(struct request_queue *q,
struct request *req,
struct bio *bio)
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index ba43f40..46916af 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -884,6 +884,8 @@ extern void blk_queue_flush_queueable(struct request_queue *q, bool queueable);
extern struct backing_dev_info *blk_get_backing_dev_info(struct block_device *bdev);
extern int blk_rq_map_sg(struct request_queue *, struct request *, struct scatterlist *);
+extern int blk_bio_map_sg(struct request_queue *q, struct bio *bio,
+ struct scatterlist *sglist);
extern void blk_dump_rq_flags(struct request *, char *);
extern long nr_blockdev_pages(void);
--
1.7.10.2
* [PATCH 3/3] virtio-blk: Add bio-based IO path for virtio-blk
2012-06-18 6:53 ` Asias He
@ 2012-06-18 6:53 ` Asias He
0 siblings, 0 replies; 80+ messages in thread
From: Asias He @ 2012-06-18 6:53 UTC (permalink / raw)
To: kvm, linux-kernel, virtualization
Cc: Asias He, Rusty Russell, Michael S. Tsirkin, Christoph Hellwig,
Minchan Kim
This patch introduces a bio-based IO path for virtio-blk.
Compared to the request-based IO path, the bio-based IO path uses the
driver-provided ->make_request_fn() method to bypass the IO scheduler.
It hands each bio to the device directly, without allocating a request
in the block layer. This shortens the IO path in the guest kernel,
yielding higher IOPS and lower latency. The downside is that the guest
cannot use the IO scheduler to merge and sort requests. However, this
is not a big problem if the backing disk on the host side is a fast
device.
When the bio-based IO path is not enabled, virtio-blk still uses the
original request-based IO path, and no performance difference is
observed.
Performance evaluation:
-----------------------------
The fio test is performed in an 8-vcpu guest with a ramdisk-based
backing disk, using the kvm tool.
Short version:
With the bio-based IO path, for sequential read, sequential write,
random read, and random write:
IOPS boost : 28%, 24%, 21%, 16%
Latency improvement: 32%, 17%, 21%, 16%
Long version:
With bio-based IO path:
seq-read : io=2048.0MB, bw=116996KB/s, iops=233991 , runt= 17925msec
seq-write : io=2048.0MB, bw=100829KB/s, iops=201658 , runt= 20799msec
rand-read : io=3095.7MB, bw=112134KB/s, iops=224268 , runt= 28269msec
rand-write: io=3095.7MB, bw=96198KB/s, iops=192396 , runt= 32952msec
clat (usec): min=0 , max=2631.6K, avg=58716.99, stdev=191377.30
clat (usec): min=0 , max=1753.2K, avg=66423.25, stdev=81774.35
clat (usec): min=0 , max=2915.5K, avg=61685.70, stdev=120598.39
clat (usec): min=0 , max=1933.4K, avg=76935.12, stdev=96603.45
cpu : usr=74.08%, sys=703.84%, ctx=29661403, majf=21354, minf=22460954
cpu : usr=70.92%, sys=702.81%, ctx=77219828, majf=13980, minf=27713137
cpu : usr=72.23%, sys=695.37%, ctx=88081059, majf=18475, minf=28177648
cpu : usr=69.69%, sys=654.13%, ctx=145476035, majf=15867, minf=26176375
With request-based IO path:
seq-read : io=2048.0MB, bw=91074KB/s, iops=182147 , runt= 23027msec
seq-write : io=2048.0MB, bw=80725KB/s, iops=161449 , runt= 25979msec
rand-read : io=3095.7MB, bw=92106KB/s, iops=184211 , runt= 34416msec
rand-write: io=3095.7MB, bw=82815KB/s, iops=165630 , runt= 38277msec
clat (usec): min=0 , max=1932.4K, avg=77824.17, stdev=170339.49
clat (usec): min=0 , max=2510.2K, avg=78023.96, stdev=146949.15
clat (usec): min=0 , max=3037.2K, avg=74746.53, stdev=128498.27
clat (usec): min=0 , max=1363.4K, avg=89830.75, stdev=114279.68
cpu : usr=53.28%, sys=724.19%, ctx=37988895, majf=17531, minf=23577622
cpu : usr=49.03%, sys=633.20%, ctx=205935380, majf=18197, minf=27288959
cpu : usr=55.78%, sys=722.40%, ctx=101525058, majf=19273, minf=28067082
cpu : usr=56.55%, sys=690.83%, ctx=228205022, majf=18039, minf=26551985
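The short-version percentages can be reproduced from the raw iops and avg clat figures above: IOPS boost as bio/req - 1, latency improvement as req/bio - 1, both truncated to a whole percent. A quick sanity check (not part of the patch; the truncation convention is an inference from the numbers):

```c
#include <assert.h>

/* seq-read, seq-write, rand-read, rand-write, from the fio output */
static const double iops_bio[] = {233991, 201658, 224268, 192396};
static const double iops_req[] = {182147, 161449, 184211, 165630};
static const double clat_bio[] = {58716.99, 66423.25, 61685.70, 76935.12};
static const double clat_req[] = {77824.17, 78023.96, 74746.53, 89830.75};

/* Percent gain of new over old, truncated to a whole percent. */
static int pct(double new_v, double old_v)
{
    return (int)((new_v / old_v - 1.0) * 100.0);
}

/* IOPS boost compares throughputs (bio over req); latency improvement
 * compares average completion latencies the other way (req over bio),
 * since a lower clat is better. */
static int iops_boost(int i) { return pct(iops_bio[i], iops_req[i]); }
static int lat_gain(int i)   { return pct(clat_req[i], clat_bio[i]); }
```

With these definitions the four iops_boost values come out to 28, 24, 21, 16 and the four lat_gain values to 32, 17, 21, 16, matching the summary.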
How to use:
-----------------------------
Add 'virtio_blk.use_bio=1' to the kernel cmdline, or load the module
with 'modprobe virtio_blk use_bio=1', to enable the
->make_request_fn() based IO path.
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: virtualization@lists.linux-foundation.org
Cc: kvm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Minchan Kim <minchan.kim@gmail.com>
Signed-off-by: Asias He <asias@redhat.com>
---
drivers/block/virtio_blk.c | 203 +++++++++++++++++++++++++++++++++++---------
1 file changed, 163 insertions(+), 40 deletions(-)
diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
index 774c31d..b2d5002 100644
--- a/drivers/block/virtio_blk.c
+++ b/drivers/block/virtio_blk.c
@@ -14,6 +14,9 @@
#define PART_BITS 4
+static bool use_bio;
+module_param(use_bio, bool, S_IRUGO);
+
static int major;
static DEFINE_IDA(vd_index_ida);
@@ -23,6 +26,7 @@ struct virtio_blk
{
struct virtio_device *vdev;
struct virtqueue *vq;
+ wait_queue_head_t queue_wait;
/* The disk structure for the kernel. */
struct gendisk *disk;
@@ -51,53 +55,87 @@ struct virtio_blk
struct virtblk_req
{
struct request *req;
+ struct bio *bio;
struct virtio_blk_outhdr out_hdr;
struct virtio_scsi_inhdr in_hdr;
u8 status;
+ struct scatterlist sg[];
};
-static void blk_done(struct virtqueue *vq)
+static inline int virtblk_result(struct virtblk_req *vbr)
+{
+ switch (vbr->status) {
+ case VIRTIO_BLK_S_OK:
+ return 0;
+ case VIRTIO_BLK_S_UNSUPP:
+ return -ENOTTY;
+ default:
+ return -EIO;
+ }
+}
+
+static inline void virtblk_request_done(struct virtio_blk *vblk,
+ struct virtblk_req *vbr)
+{
+ struct request *req = vbr->req;
+ int error = virtblk_result(vbr);
+
+ if (req->cmd_type == REQ_TYPE_BLOCK_PC) {
+ req->resid_len = vbr->in_hdr.residual;
+ req->sense_len = vbr->in_hdr.sense_len;
+ req->errors = vbr->in_hdr.errors;
+ } else if (req->cmd_type == REQ_TYPE_SPECIAL) {
+ req->errors = (error != 0);
+ }
+
+ __blk_end_request_all(req, error);
+ mempool_free(vbr, vblk->pool);
+}
+
+static inline void virtblk_bio_done(struct virtio_blk *vblk,
+ struct virtblk_req *vbr)
+{
+ bio_endio(vbr->bio, virtblk_result(vbr));
+ mempool_free(vbr, vblk->pool);
+}
+
+static void virtblk_done(struct virtqueue *vq)
{
struct virtio_blk *vblk = vq->vdev->priv;
+ unsigned long bio_done = 0, req_done = 0;
struct virtblk_req *vbr;
- unsigned int len;
unsigned long flags;
+ unsigned int len;
spin_lock_irqsave(vblk->disk->queue->queue_lock, flags);
while ((vbr = virtqueue_get_buf(vblk->vq, &len)) != NULL) {
- int error;
-
- switch (vbr->status) {
- case VIRTIO_BLK_S_OK:
- error = 0;
- break;
- case VIRTIO_BLK_S_UNSUPP:
- error = -ENOTTY;
- break;
- default:
- error = -EIO;
- break;
- }
-
- switch (vbr->req->cmd_type) {
- case REQ_TYPE_BLOCK_PC:
- vbr->req->resid_len = vbr->in_hdr.residual;
- vbr->req->sense_len = vbr->in_hdr.sense_len;
- vbr->req->errors = vbr->in_hdr.errors;
- break;
- case REQ_TYPE_SPECIAL:
- vbr->req->errors = (error != 0);
- break;
- default:
- break;
+ if (vbr->bio) {
+ virtblk_bio_done(vblk, vbr);
+ bio_done++;
+ } else {
+ virtblk_request_done(vblk, vbr);
+ req_done++;
}
-
- __blk_end_request_all(vbr->req, error);
- mempool_free(vbr, vblk->pool);
}
/* In case queue is stopped waiting for more buffers. */
- blk_start_queue(vblk->disk->queue);
+ if (req_done)
+ blk_start_queue(vblk->disk->queue);
spin_unlock_irqrestore(vblk->disk->queue->queue_lock, flags);
+
+ if (bio_done)
+ wake_up(&vblk->queue_wait);
+}
+
+static inline struct virtblk_req *virtblk_alloc_req(struct virtio_blk *vblk,
+ gfp_t gfp_mask)
+{
+ struct virtblk_req *vbr;
+
+ vbr = mempool_alloc(vblk->pool, gfp_mask);
+ if (vbr && use_bio)
+ sg_init_table(vbr->sg, vblk->sg_elems);
+
+ return vbr;
}
static bool do_req(struct request_queue *q, struct virtio_blk *vblk,
@@ -106,13 +144,13 @@ static bool do_req(struct request_queue *q, struct virtio_blk *vblk,
unsigned long num, out = 0, in = 0;
struct virtblk_req *vbr;
- vbr = mempool_alloc(vblk->pool, GFP_ATOMIC);
+ vbr = virtblk_alloc_req(vblk, GFP_ATOMIC);
if (!vbr)
/* When another request finishes we'll try again. */
return false;
vbr->req = req;
-
+ vbr->bio = NULL;
if (req->cmd_flags & REQ_FLUSH) {
vbr->out_hdr.type = VIRTIO_BLK_T_FLUSH;
vbr->out_hdr.sector = 0;
@@ -172,7 +210,8 @@ static bool do_req(struct request_queue *q, struct virtio_blk *vblk,
}
}
- if (virtqueue_add_buf(vblk->vq, vblk->sg, out, in, vbr, GFP_ATOMIC)<0) {
+ if (virtqueue_add_buf(vblk->vq, vblk->sg, out, in, vbr,
+ GFP_ATOMIC) < 0) {
mempool_free(vbr, vblk->pool);
return false;
}
@@ -180,7 +219,7 @@ static bool do_req(struct request_queue *q, struct virtio_blk *vblk,
return true;
}
-static void do_virtblk_request(struct request_queue *q)
+static void virtblk_request(struct request_queue *q)
{
struct virtio_blk *vblk = q->queuedata;
struct request *req;
@@ -203,6 +242,82 @@ static void do_virtblk_request(struct request_queue *q)
virtqueue_kick(vblk->vq);
}
+static void virtblk_add_buf_wait(struct virtio_blk *vblk,
+ struct virtblk_req *vbr,
+ unsigned long out,
+ unsigned long in)
+{
+ DEFINE_WAIT(wait);
+
+ for (;;) {
+ prepare_to_wait_exclusive(&vblk->queue_wait, &wait,
+ TASK_UNINTERRUPTIBLE);
+
+ spin_lock_irq(vblk->disk->queue->queue_lock);
+ if (virtqueue_add_buf(vblk->vq, vbr->sg, out, in, vbr,
+ GFP_ATOMIC) < 0) {
+ spin_unlock_irq(vblk->disk->queue->queue_lock);
+ io_schedule();
+ } else {
+ virtqueue_kick(vblk->vq);
+ spin_unlock_irq(vblk->disk->queue->queue_lock);
+ break;
+ }
+
+ }
+
+ finish_wait(&vblk->queue_wait, &wait);
+}
+
+static void virtblk_make_request(struct request_queue *q, struct bio *bio)
+{
+ struct virtio_blk *vblk = q->queuedata;
+ unsigned int num, out = 0, in = 0;
+ struct virtblk_req *vbr;
+
+ BUG_ON(bio->bi_phys_segments + 2 > vblk->sg_elems);
+ BUG_ON(bio->bi_rw & (REQ_FLUSH | REQ_FUA));
+
+ vbr = virtblk_alloc_req(vblk, GFP_NOIO);
+ if (!vbr) {
+ bio_endio(bio, -ENOMEM);
+ return;
+ }
+
+ vbr->bio = bio;
+ vbr->req = NULL;
+ vbr->out_hdr.type = 0;
+ vbr->out_hdr.sector = bio->bi_sector;
+ vbr->out_hdr.ioprio = bio_prio(bio);
+
+ sg_set_buf(&vbr->sg[out++], &vbr->out_hdr, sizeof(vbr->out_hdr));
+
+ num = blk_bio_map_sg(q, bio, vbr->sg + out);
+
+ sg_set_buf(&vbr->sg[num + out + in++], &vbr->status,
+ sizeof(vbr->status));
+
+ if (num) {
+ if (bio->bi_rw & REQ_WRITE) {
+ vbr->out_hdr.type |= VIRTIO_BLK_T_OUT;
+ out += num;
+ } else {
+ vbr->out_hdr.type |= VIRTIO_BLK_T_IN;
+ in += num;
+ }
+ }
+
+ spin_lock_irq(vblk->disk->queue->queue_lock);
+ if (virtqueue_add_buf(vblk->vq, vbr->sg, out, in, vbr,
+ GFP_ATOMIC) < 0) {
+ spin_unlock_irq(vblk->disk->queue->queue_lock);
+ virtblk_add_buf_wait(vblk, vbr, out, in);
+ } else {
+ virtqueue_kick(vblk->vq);
+ spin_unlock_irq(vblk->disk->queue->queue_lock);
+ }
+}
+
/* return id (s/n) string for *disk to *id_str
*/
static int virtblk_get_id(struct gendisk *disk, char *id_str)
@@ -360,7 +475,7 @@ static int init_vq(struct virtio_blk *vblk)
int err = 0;
/* We expect one virtqueue, for output. */
- vblk->vq = virtio_find_single_vq(vblk->vdev, blk_done, "requests");
+ vblk->vq = virtio_find_single_vq(vblk->vdev, virtblk_done, "requests");
if (IS_ERR(vblk->vq))
err = PTR_ERR(vblk->vq);
@@ -400,6 +515,8 @@ static int __devinit virtblk_probe(struct virtio_device *vdev)
struct virtio_blk *vblk;
struct request_queue *q;
int err, index;
+ int pool_size;
+
u64 cap;
u32 v, blk_size, sg_elems, opt_io_size;
u16 min_io_size;
@@ -429,10 +546,12 @@ static int __devinit virtblk_probe(struct virtio_device *vdev)
goto out_free_index;
}
+ init_waitqueue_head(&vblk->queue_wait);
vblk->vdev = vdev;
vblk->sg_elems = sg_elems;
sg_init_table(vblk->sg, vblk->sg_elems);
mutex_init(&vblk->config_lock);
+
INIT_WORK(&vblk->config_work, virtblk_config_changed_work);
vblk->config_enable = true;
@@ -440,7 +559,10 @@ static int __devinit virtblk_probe(struct virtio_device *vdev)
if (err)
goto out_free_vblk;
- vblk->pool = mempool_create_kmalloc_pool(1,sizeof(struct virtblk_req));
+ pool_size = sizeof(struct virtblk_req);
+ if (use_bio)
+ pool_size += sizeof(struct scatterlist) * sg_elems;
+ vblk->pool = mempool_create_kmalloc_pool(1, pool_size);
if (!vblk->pool) {
err = -ENOMEM;
goto out_free_vq;
@@ -453,12 +575,14 @@ static int __devinit virtblk_probe(struct virtio_device *vdev)
goto out_mempool;
}
- q = vblk->disk->queue = blk_init_queue(do_virtblk_request, NULL);
+ q = vblk->disk->queue = blk_init_queue(virtblk_request, NULL);
if (!q) {
err = -ENOMEM;
goto out_put_disk;
}
+ if (use_bio)
+ blk_queue_make_request(q, virtblk_make_request);
q->queuedata = vblk;
virtblk_name_format("vd", index, vblk->disk->disk_name, DISK_NAME_LEN);
@@ -471,7 +595,7 @@ static int __devinit virtblk_probe(struct virtio_device *vdev)
vblk->index = index;
/* configure queue flush support */
- if (virtio_has_feature(vdev, VIRTIO_BLK_F_FLUSH))
+ if (virtio_has_feature(vdev, VIRTIO_BLK_F_FLUSH) && !use_bio)
blk_queue_flush(q, REQ_FLUSH);
/* If disk is read-only in the host, the guest should obey */
@@ -544,7 +668,6 @@ static int __devinit virtblk_probe(struct virtio_device *vdev)
if (!err && opt_io_size)
blk_queue_io_opt(q, blk_size * opt_io_size);
-
add_disk(vblk->disk);
err = device_create_file(disk_to_dev(vblk->disk), &dev_attr_serial);
if (err)
--
1.7.10.2
@@ -172,7 +210,8 @@ static bool do_req(struct request_queue *q, struct virtio_blk *vblk,
}
}
- if (virtqueue_add_buf(vblk->vq, vblk->sg, out, in, vbr, GFP_ATOMIC)<0) {
+ if (virtqueue_add_buf(vblk->vq, vblk->sg, out, in, vbr,
+ GFP_ATOMIC) < 0) {
mempool_free(vbr, vblk->pool);
return false;
}
@@ -180,7 +219,7 @@ static bool do_req(struct request_queue *q, struct virtio_blk *vblk,
return true;
}
-static void do_virtblk_request(struct request_queue *q)
+static void virtblk_request(struct request_queue *q)
{
struct virtio_blk *vblk = q->queuedata;
struct request *req;
@@ -203,6 +242,82 @@ static void do_virtblk_request(struct request_queue *q)
virtqueue_kick(vblk->vq);
}
+static void virtblk_add_buf_wait(struct virtio_blk *vblk,
+ struct virtblk_req *vbr,
+ unsigned long out,
+ unsigned long in)
+{
+ DEFINE_WAIT(wait);
+
+ for (;;) {
+ prepare_to_wait_exclusive(&vblk->queue_wait, &wait,
+ TASK_UNINTERRUPTIBLE);
+
+ spin_lock_irq(vblk->disk->queue->queue_lock);
+ if (virtqueue_add_buf(vblk->vq, vbr->sg, out, in, vbr,
+ GFP_ATOMIC) < 0) {
+ spin_unlock_irq(vblk->disk->queue->queue_lock);
+ io_schedule();
+ } else {
+ virtqueue_kick(vblk->vq);
+ spin_unlock_irq(vblk->disk->queue->queue_lock);
+ break;
+ }
+
+ }
+
+ finish_wait(&vblk->queue_wait, &wait);
+}
+
+static void virtblk_make_request(struct request_queue *q, struct bio *bio)
+{
+ struct virtio_blk *vblk = q->queuedata;
+ unsigned int num, out = 0, in = 0;
+ struct virtblk_req *vbr;
+
+ BUG_ON(bio->bi_phys_segments + 2 > vblk->sg_elems);
+ BUG_ON(bio->bi_rw & (REQ_FLUSH | REQ_FUA));
+
+ vbr = virtblk_alloc_req(vblk, GFP_NOIO);
+ if (!vbr) {
+ bio_endio(bio, -ENOMEM);
+ return;
+ }
+
+ vbr->bio = bio;
+ vbr->req = NULL;
+ vbr->out_hdr.type = 0;
+ vbr->out_hdr.sector = bio->bi_sector;
+ vbr->out_hdr.ioprio = bio_prio(bio);
+
+ sg_set_buf(&vbr->sg[out++], &vbr->out_hdr, sizeof(vbr->out_hdr));
+
+ num = blk_bio_map_sg(q, bio, vbr->sg + out);
+
+ sg_set_buf(&vbr->sg[num + out + in++], &vbr->status,
+ sizeof(vbr->status));
+
+ if (num) {
+ if (bio->bi_rw & REQ_WRITE) {
+ vbr->out_hdr.type |= VIRTIO_BLK_T_OUT;
+ out += num;
+ } else {
+ vbr->out_hdr.type |= VIRTIO_BLK_T_IN;
+ in += num;
+ }
+ }
+
+ spin_lock_irq(vblk->disk->queue->queue_lock);
+ if (virtqueue_add_buf(vblk->vq, vbr->sg, out, in, vbr,
+ GFP_ATOMIC) < 0) {
+ spin_unlock_irq(vblk->disk->queue->queue_lock);
+ virtblk_add_buf_wait(vblk, vbr, out, in);
+ } else {
+ virtqueue_kick(vblk->vq);
+ spin_unlock_irq(vblk->disk->queue->queue_lock);
+ }
+}
+
/* return id (s/n) string for *disk to *id_str
*/
static int virtblk_get_id(struct gendisk *disk, char *id_str)
@@ -360,7 +475,7 @@ static int init_vq(struct virtio_blk *vblk)
int err = 0;
/* We expect one virtqueue, for output. */
- vblk->vq = virtio_find_single_vq(vblk->vdev, blk_done, "requests");
+ vblk->vq = virtio_find_single_vq(vblk->vdev, virtblk_done, "requests");
if (IS_ERR(vblk->vq))
err = PTR_ERR(vblk->vq);
@@ -400,6 +515,8 @@ static int __devinit virtblk_probe(struct virtio_device *vdev)
struct virtio_blk *vblk;
struct request_queue *q;
int err, index;
+ int pool_size;
+
u64 cap;
u32 v, blk_size, sg_elems, opt_io_size;
u16 min_io_size;
@@ -429,10 +546,12 @@ static int __devinit virtblk_probe(struct virtio_device *vdev)
goto out_free_index;
}
+ init_waitqueue_head(&vblk->queue_wait);
vblk->vdev = vdev;
vblk->sg_elems = sg_elems;
sg_init_table(vblk->sg, vblk->sg_elems);
mutex_init(&vblk->config_lock);
+
INIT_WORK(&vblk->config_work, virtblk_config_changed_work);
vblk->config_enable = true;
@@ -440,7 +559,10 @@ static int __devinit virtblk_probe(struct virtio_device *vdev)
if (err)
goto out_free_vblk;
- vblk->pool = mempool_create_kmalloc_pool(1,sizeof(struct virtblk_req));
+ pool_size = sizeof(struct virtblk_req);
+ if (use_bio)
+ pool_size += sizeof(struct scatterlist) * sg_elems;
+ vblk->pool = mempool_create_kmalloc_pool(1, pool_size);
if (!vblk->pool) {
err = -ENOMEM;
goto out_free_vq;
@@ -453,12 +575,14 @@ static int __devinit virtblk_probe(struct virtio_device *vdev)
goto out_mempool;
}
- q = vblk->disk->queue = blk_init_queue(do_virtblk_request, NULL);
+ q = vblk->disk->queue = blk_init_queue(virtblk_request, NULL);
if (!q) {
err = -ENOMEM;
goto out_put_disk;
}
+ if (use_bio)
+ blk_queue_make_request(q, virtblk_make_request);
q->queuedata = vblk;
virtblk_name_format("vd", index, vblk->disk->disk_name, DISK_NAME_LEN);
@@ -471,7 +595,7 @@ static int __devinit virtblk_probe(struct virtio_device *vdev)
vblk->index = index;
/* configure queue flush support */
- if (virtio_has_feature(vdev, VIRTIO_BLK_F_FLUSH))
+ if (virtio_has_feature(vdev, VIRTIO_BLK_F_FLUSH) && !use_bio)
blk_queue_flush(q, REQ_FLUSH);
/* If disk is read-only in the host, the guest should obey */
@@ -544,7 +668,6 @@ static int __devinit virtblk_probe(struct virtio_device *vdev)
if (!err && opt_io_size)
blk_queue_io_opt(q, blk_size * opt_io_size);
-
add_disk(vblk->disk);
err = device_create_file(disk_to_dev(vblk->disk), &dev_attr_serial);
if (err)
--
1.7.10.2
^ permalink raw reply related [flat|nested] 80+ messages in thread
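As an aside, the out/in descriptor accounting in virtblk_make_request() above can be modeled with a short Python sketch (a simplified illustration, not kernel code): the out_hdr always occupies one device-readable (out) slot, the status byte one device-writable (in) slot, and the data segments count as out for writes and in for reads.

```python
REQ_WRITE = 1  # stand-in for the kernel's REQ_WRITE flag


def layout(num_data_segs, rw_flags):
    """Model the out/in slot accounting used by virtblk_make_request()."""
    out = 1   # the virtio_blk_outhdr is always device-readable (out)
    in_ = 1   # the status byte is always device-writable (in)
    if num_data_segs:
        if rw_flags & REQ_WRITE:
            out += num_data_segs  # write: data flows guest -> device
        else:
            in_ += num_data_segs  # read: data flows device -> guest
    return out, in_


# A 3-segment read: 1 out (header), 4 in (3 data + status)
print(layout(3, 0))          # -> (1, 4)
# A 3-segment write: 4 out (header + 3 data), 1 in (status)
print(layout(3, REQ_WRITE))  # -> (4, 1)
```

This also makes the `bio->bi_phys_segments + 2 > vblk->sg_elems` check plain: the two extra scatterlist entries are the header and status slots.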
* Re: [PATCH 3/3] virtio-blk: Add bio-based IO path for virtio-blk
2012-06-18 6:53 ` Asias He
@ 2012-06-18 7:46 ` Rusty Russell
-1 siblings, 0 replies; 80+ messages in thread
From: Rusty Russell @ 2012-06-18 7:46 UTC (permalink / raw)
To: Asias He, kvm, linux-kernel, virtualization
Cc: Asias He, Michael S. Tsirkin, Christoph Hellwig, Minchan Kim
On Mon, 18 Jun 2012 14:53:10 +0800, Asias He <asias@redhat.com> wrote:
> This patch introduces bio-based IO path for virtio-blk.
Why make it optional?
Thanks,
Rusty.
^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [PATCH 3/3] virtio-blk: Add bio-based IO path for virtio-blk
2012-06-18 7:46 ` Rusty Russell
@ 2012-06-18 8:03 ` Asias He
-1 siblings, 0 replies; 80+ messages in thread
From: Asias He @ 2012-06-18 8:03 UTC (permalink / raw)
To: Rusty Russell
Cc: kvm, linux-kernel, virtualization, Michael S. Tsirkin,
Christoph Hellwig, Minchan Kim
On 06/18/2012 03:46 PM, Rusty Russell wrote:
> On Mon, 18 Jun 2012 14:53:10 +0800, Asias He <asias@redhat.com> wrote:
>> This patch introduces bio-based IO path for virtio-blk.
>
> Why make it optional?
The request-based IO path is useful for users who do not want to bypass the
IO scheduler in the guest kernel, e.g. users with spinning disks. Users with
fast disk devices, e.g. SSDs, can use the bio-based IO path.
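(For reference, since the patch declares `use_bio` with `module_param(use_bio, bool, S_IRUGO)`, the bio-based path would be selected at module load time; a modprobe.d fragment such as the following — file path illustrative — would enable it persistently:)

```
# /etc/modprobe.d/virtio_blk.conf -- enable the bio-based IO path
options virtio_blk use_bio=1
```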
--
Asias
^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [PATCH v2 0/3] Improve virtio-blk performance
2012-06-18 6:53 ` Asias He
@ 2012-06-18 9:14 ` Stefan Hajnoczi
-1 siblings, 0 replies; 80+ messages in thread
From: Stefan Hajnoczi @ 2012-06-18 9:14 UTC (permalink / raw)
To: Asias He; +Cc: kvm, linux-kernel, virtualization
On Mon, Jun 18, 2012 at 7:53 AM, Asias He <asias@redhat.com> wrote:
> Fio test shows it gives, 28%, 24%, 21%, 16% IOPS boost and 32%, 17%, 21%, 16%
> latency improvement for sequential read/write, random read/write respectively.
Sounds great. What storage configuration did you use (single spinning
disk, SSD, storage array) and are these numbers for parallel I/O or
sequential I/O?
What changed since Minchan worked on this? I remember he wasn't
satisfied that this was a clear win. Your numbers are strong so
either you fixed something important or you are looking at different
benchmark configurations.
Stefan
^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [PATCH 3/3] virtio-blk: Add bio-based IO path for virtio-blk
2012-06-18 6:53 ` Asias He
@ 2012-06-18 9:37 ` Stefan Hajnoczi
-1 siblings, 0 replies; 80+ messages in thread
From: Stefan Hajnoczi @ 2012-06-18 9:37 UTC (permalink / raw)
To: Asias He
Cc: kvm, linux-kernel, virtualization, Christoph Hellwig, Michael S. Tsirkin
On Mon, Jun 18, 2012 at 7:53 AM, Asias He <asias@redhat.com> wrote:
> +static void virtblk_add_buf_wait(struct virtio_blk *vblk,
> + struct virtblk_req *vbr,
> + unsigned long out,
> + unsigned long in)
> +{
> + DEFINE_WAIT(wait);
> +
> + for (;;) {
> + prepare_to_wait_exclusive(&vblk->queue_wait, &wait,
> + TASK_UNINTERRUPTIBLE);
> +
> + spin_lock_irq(vblk->disk->queue->queue_lock);
> + if (virtqueue_add_buf(vblk->vq, vbr->sg, out, in, vbr,
> + GFP_ATOMIC) < 0) {
> + spin_unlock_irq(vblk->disk->queue->queue_lock);
> + io_schedule();
> + } else {
> + virtqueue_kick(vblk->vq);
> + spin_unlock_irq(vblk->disk->queue->queue_lock);
> + break;
> + }
> +
> + }
Is there a meaningful condition where we would cancel this request
(e.g. signal)? If the vring fills up for some reason and we get stuck
here the user might wish to kill hung processes.
Stefan
^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [PATCH v2 0/3] Improve virtio-blk performance
2012-06-18 9:14 ` Stefan Hajnoczi
@ 2012-06-18 9:39 ` Asias He
-1 siblings, 0 replies; 80+ messages in thread
From: Asias He @ 2012-06-18 9:39 UTC (permalink / raw)
To: Stefan Hajnoczi; +Cc: kvm, linux-kernel, virtualization
On 06/18/2012 05:14 PM, Stefan Hajnoczi wrote:
> On Mon, Jun 18, 2012 at 7:53 AM, Asias He <asias@redhat.com> wrote:
>> Fio test shows it gives, 28%, 24%, 21%, 16% IOPS boost and 32%, 17%, 21%, 16%
>> latency improvement for sequential read/write, random read/write respectively.
>
> Sounds great. What storage configuration did you use (single spinning
> disk, SSD, storage array) and are these numbers for parallel I/O or
> sequential I/O?
I used ramdisk as the backend storage.
> What changed since Minchan worked on this? I remember he wasn't
> satisfied that this was a clear win. Your numbers are strong so
> either you fixed something important or you are looking at different
> benchmark configurations.
I am using kvm tool instead of qemu. He wasn't satisfied with the poor
sequential performance. I removed the plug and unplug operation and bio
completion batching. You can grab Minchan's patch and make a diff to see
the details.
Here is the fio's config file.
[global]
exec_prerun="echo 3 > /proc/sys/vm/drop_caches"
group_reporting
norandommap
ioscheduler=noop
thread
bs=512
size=4MB
direct=1
filename=/dev/vdb
numjobs=256
ioengine=aio
iodepth=64
loops=3
[seq-read]
stonewall
rw=read
[seq-write]
stonewall
rw=write
[rnd-read]
stonewall
rw=randread
[rnd-write]
stonewall
rw=randwrite
--
Asias
^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [PATCH 3/3] virtio-blk: Add bio-based IO path for virtio-blk
2012-06-18 8:03 ` Asias He
@ 2012-06-18 10:05 ` Rusty Russell
-1 siblings, 0 replies; 80+ messages in thread
From: Rusty Russell @ 2012-06-18 10:05 UTC (permalink / raw)
To: Asias He
Cc: kvm, linux-kernel, virtualization, Michael S. Tsirkin,
Christoph Hellwig, Minchan Kim
On Mon, 18 Jun 2012 16:03:23 +0800, Asias He <asias@redhat.com> wrote:
> On 06/18/2012 03:46 PM, Rusty Russell wrote:
> > On Mon, 18 Jun 2012 14:53:10 +0800, Asias He <asias@redhat.com> wrote:
> >> This patch introduces bio-based IO path for virtio-blk.
> >
> > Why make it optional?
>
> request-based IO path is useful for users who do not want to bypass the
> IO scheduler in guest kernel, e.g. users using spinning disk. For users
> using fast disk device, e.g. SSD device, they can use bio-based IO path.
Users using a spinning disk still get IO scheduling in the host though.
What benefit is there in doing it in the guest as well?
Cheers,
Rusty.
^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [PATCH 3/3] virtio-blk: Add bio-based IO path for virtio-blk
2012-06-18 8:03 ` Asias He
@ 2012-06-18 10:13 ` Michael S. Tsirkin
-1 siblings, 0 replies; 80+ messages in thread
From: Michael S. Tsirkin @ 2012-06-18 10:13 UTC (permalink / raw)
To: Asias He
Cc: Rusty Russell, kvm, linux-kernel, virtualization,
Christoph Hellwig, Minchan Kim
On Mon, Jun 18, 2012 at 04:03:23PM +0800, Asias He wrote:
> On 06/18/2012 03:46 PM, Rusty Russell wrote:
> >On Mon, 18 Jun 2012 14:53:10 +0800, Asias He <asias@redhat.com> wrote:
> >>This patch introduces bio-based IO path for virtio-blk.
> >
> >Why make it optional?
>
> request-based IO path is useful for users who do not want to bypass
> the IO scheduler in guest kernel, e.g. users using spinning disk.
> For users using fast disk device, e.g. SSD device, they can use
> bio-based IO path.
OK, I guess, but then it should be per-device. There could be
a mix of slow and fast disks :)
> --
> Asias
>
^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [PATCH 3/3] virtio-blk: Add bio-based IO path for virtio-blk
2012-06-18 6:53 ` Asias He
@ 2012-06-18 10:21 ` Michael S. Tsirkin
-1 siblings, 0 replies; 80+ messages in thread
From: Michael S. Tsirkin @ 2012-06-18 10:21 UTC (permalink / raw)
To: Asias He
Cc: kvm, linux-kernel, virtualization, Rusty Russell,
Christoph Hellwig, Minchan Kim
On Mon, Jun 18, 2012 at 02:53:10PM +0800, Asias He wrote:
> +static void virtblk_make_request(struct request_queue *q, struct bio *bio)
> +{
> + struct virtio_blk *vblk = q->queuedata;
> + unsigned int num, out = 0, in = 0;
> + struct virtblk_req *vbr;
> +
> + BUG_ON(bio->bi_phys_segments + 2 > vblk->sg_elems);
> + BUG_ON(bio->bi_rw & (REQ_FLUSH | REQ_FUA));
> +
> + vbr = virtblk_alloc_req(vblk, GFP_NOIO);
> + if (!vbr) {
> + bio_endio(bio, -ENOMEM);
> + return;
> + }
> +
> + vbr->bio = bio;
> + vbr->req = NULL;
> + vbr->out_hdr.type = 0;
> + vbr->out_hdr.sector = bio->bi_sector;
> + vbr->out_hdr.ioprio = bio_prio(bio);
> +
> + sg_set_buf(&vbr->sg[out++], &vbr->out_hdr, sizeof(vbr->out_hdr));
> +
> + num = blk_bio_map_sg(q, bio, vbr->sg + out);
> +
> + sg_set_buf(&vbr->sg[num + out + in++], &vbr->status,
> + sizeof(vbr->status));
> +
> + if (num) {
> + if (bio->bi_rw & REQ_WRITE) {
> + vbr->out_hdr.type |= VIRTIO_BLK_T_OUT;
> + out += num;
> + } else {
> + vbr->out_hdr.type |= VIRTIO_BLK_T_IN;
> + in += num;
> + }
> + }
> +
> + spin_lock_irq(vblk->disk->queue->queue_lock);
> + if (virtqueue_add_buf(vblk->vq, vbr->sg, out, in, vbr,
> + GFP_ATOMIC) < 0) {
> + spin_unlock_irq(vblk->disk->queue->queue_lock);
Any implications of dropping lock like that?
E.g. for suspend. like we are still discussing with
unlocked kick?
> + virtblk_add_buf_wait(vblk, vbr, out, in);
> + } else {
> + virtqueue_kick(vblk->vq);
Why special case the first call? task state manipulation so expensive?
> + spin_unlock_irq(vblk->disk->queue->queue_lock);
> + }
> +}
> +
^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [PATCH 3/3] virtio-blk: Add bio-based IO path for virtio-blk
2012-06-18 10:21 ` Michael S. Tsirkin
@ 2012-06-18 10:45 ` Stefan Hajnoczi
-1 siblings, 0 replies; 80+ messages in thread
From: Stefan Hajnoczi @ 2012-06-18 10:45 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: Asias He, kvm, linux-kernel, virtualization, Rusty Russell,
Christoph Hellwig, Minchan Kim
On Mon, Jun 18, 2012 at 11:21 AM, Michael S. Tsirkin <mst@redhat.com> wrote:
> On Mon, Jun 18, 2012 at 02:53:10PM +0800, Asias He wrote:
>> +static void virtblk_make_request(struct request_queue *q, struct bio *bio)
>> +{
>> + struct virtio_blk *vblk = q->queuedata;
>> + unsigned int num, out = 0, in = 0;
>> + struct virtblk_req *vbr;
>> +
>> + BUG_ON(bio->bi_phys_segments + 2 > vblk->sg_elems);
>> + BUG_ON(bio->bi_rw & (REQ_FLUSH | REQ_FUA));
>> +
>> + vbr = virtblk_alloc_req(vblk, GFP_NOIO);
>> + if (!vbr) {
>> + bio_endio(bio, -ENOMEM);
>> + return;
>> + }
>> +
>> + vbr->bio = bio;
>> + vbr->req = NULL;
>> + vbr->out_hdr.type = 0;
>> + vbr->out_hdr.sector = bio->bi_sector;
>> + vbr->out_hdr.ioprio = bio_prio(bio);
>> +
>> + sg_set_buf(&vbr->sg[out++], &vbr->out_hdr, sizeof(vbr->out_hdr));
>> +
>> + num = blk_bio_map_sg(q, bio, vbr->sg + out);
>> +
>> + sg_set_buf(&vbr->sg[num + out + in++], &vbr->status,
>> + sizeof(vbr->status));
>> +
>> + if (num) {
>> + if (bio->bi_rw & REQ_WRITE) {
>> + vbr->out_hdr.type |= VIRTIO_BLK_T_OUT;
>> + out += num;
>> + } else {
>> + vbr->out_hdr.type |= VIRTIO_BLK_T_IN;
>> + in += num;
>> + }
>> + }
>> +
>> + spin_lock_irq(vblk->disk->queue->queue_lock);
>> + if (virtqueue_add_buf(vblk->vq, vbr->sg, out, in, vbr,
>> + GFP_ATOMIC) < 0) {
>> + spin_unlock_irq(vblk->disk->queue->queue_lock);
>
> Any implications of dropping lock like that?
> E.g. for suspend. like we are still discussing with
> unlocked kick?
Since we acquired the lock in this function there should be no problem.
Whatever protects against vblk or vblk->disk disappearing upon
entering this function also protects after unlocking queue_lock.
Otherwise all .make_request_fn() functions would be broken.
I'd still like to understand the details though.
Stefan
^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [PATCH v2 0/3] Improve virtio-blk performance
2012-06-18 9:39 ` Asias He
@ 2012-06-18 10:58 ` Stefan Hajnoczi
-1 siblings, 0 replies; 80+ messages in thread
From: Stefan Hajnoczi @ 2012-06-18 10:58 UTC (permalink / raw)
To: Asias He; +Cc: kvm, linux-kernel, virtualization
On Mon, Jun 18, 2012 at 10:39 AM, Asias He <asias@redhat.com> wrote:
> On 06/18/2012 05:14 PM, Stefan Hajnoczi wrote:
>>
>> On Mon, Jun 18, 2012 at 7:53 AM, Asias He <asias@redhat.com> wrote:
>>>
>>> Fio test shows it gives, 28%, 24%, 21%, 16% IOPS boost and 32%, 17%, 21%,
>>> 16%
>>> latency improvement for sequential read/write, random read/write
>>> respectively.
>>
>>
>> Sounds great. What storage configuration did you use (single spinning
>> disk, SSD, storage array) and are these numbers for parallel I/O or
>> sequential I/O?
>
>
> I used ramdisk as the backend storage.
As long as the latency is decreasing that's good. But it's worth
keeping in mind that these percentages are probably wildly different
on real storage devices and/or qemu-kvm. What we don't know here is
whether this bottleneck matters in real environments - results with
real storage and with qemu-kvm would be interesting.
Stefan
^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [PATCH 3/3] virtio-blk: Add bio-based IO path for virtio-blk
2012-06-18 10:05 ` Rusty Russell
@ 2012-06-18 11:14 ` Dor Laor
-1 siblings, 0 replies; 80+ messages in thread
From: Dor Laor @ 2012-06-18 11:14 UTC (permalink / raw)
To: Rusty Russell
Cc: Asias He, kvm, Michael S. Tsirkin, linux-kernel, virtualization,
Christoph Hellwig
On 06/18/2012 01:05 PM, Rusty Russell wrote:
> On Mon, 18 Jun 2012 16:03:23 +0800, Asias He<asias@redhat.com> wrote:
>> On 06/18/2012 03:46 PM, Rusty Russell wrote:
>>> On Mon, 18 Jun 2012 14:53:10 +0800, Asias He<asias@redhat.com> wrote:
>>>> This patch introduces bio-based IO path for virtio-blk.
>>>
>>> Why make it optional?
>>
>> request-based IO path is useful for users who do not want to bypass the
>> IO scheduler in guest kernel, e.g. users using spinning disk. For users
>> using fast disk device, e.g. SSD device, they can use bio-based IO path.
>
> Users using a spinning disk still get IO scheduling in the host though.
> What benefit is there in doing it in the guest as well?
The IO scheduler waits for requests to merge and thus batches IOs
together. It's not important w.r.t. spinning disks, since the host can
do it, but the merging causes far fewer vmexits, which is the key issue
for VMs.
>
> Cheers,
> Rusty.
> _______________________________________________
> Virtualization mailing list
> Virtualization@lists.linux-foundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/virtualization
^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [PATCH 3/3] virtio-blk: Add bio-based IO path for virtio-blk
2012-06-18 11:14 ` Dor Laor
@ 2012-06-18 11:39 ` Sasha Levin
-1 siblings, 0 replies; 80+ messages in thread
From: Sasha Levin @ 2012-06-18 11:39 UTC (permalink / raw)
To: dlaor
Cc: Rusty Russell, Asias He, kvm, Michael S. Tsirkin, linux-kernel,
virtualization, Christoph Hellwig
On Mon, 2012-06-18 at 14:14 +0300, Dor Laor wrote:
> On 06/18/2012 01:05 PM, Rusty Russell wrote:
> > On Mon, 18 Jun 2012 16:03:23 +0800, Asias He<asias@redhat.com> wrote:
> >> On 06/18/2012 03:46 PM, Rusty Russell wrote:
> >>> On Mon, 18 Jun 2012 14:53:10 +0800, Asias He<asias@redhat.com> wrote:
> >>>> This patch introduces bio-based IO path for virtio-blk.
> >>>
> >>> Why make it optional?
> >>
> >> request-based IO path is useful for users who do not want to bypass the
> >> IO scheduler in guest kernel, e.g. users using spinning disk. For users
> >> using fast disk device, e.g. SSD device, they can use bio-based IO path.
> >
> > Users using a spinning disk still get IO scheduling in the host though.
> > What benefit is there in doing it in the guest as well?
>
> The io scheduler waits for requests to merge and thus batch IOs
> together. It's not important w.r.t spinning disks since the host can do
> it but it causes much less vmexits which is the key issue for VMs.
Is the number of exits caused by virtio-blk significant at all with
EVENT_IDX?
^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [PATCH 3/3] virtio-blk: Add bio-based IO path for virtio-blk
2012-06-18 10:05 ` Rusty Russell
@ 2012-06-18 21:28 ` Tejun Heo
-1 siblings, 0 replies; 80+ messages in thread
From: Tejun Heo @ 2012-06-18 21:28 UTC (permalink / raw)
To: Rusty Russell
Cc: Asias He, kvm, linux-kernel, virtualization, Michael S. Tsirkin,
Christoph Hellwig, Minchan Kim
Hello, guys.
On Mon, Jun 18, 2012 at 07:35:22PM +0930, Rusty Russell wrote:
> On Mon, 18 Jun 2012 16:03:23 +0800, Asias He <asias@redhat.com> wrote:
> > On 06/18/2012 03:46 PM, Rusty Russell wrote:
> > > On Mon, 18 Jun 2012 14:53:10 +0800, Asias He <asias@redhat.com> wrote:
> > >> This patch introduces bio-based IO path for virtio-blk.
> > >
> > > Why make it optional?
> >
> > request-based IO path is useful for users who do not want to bypass the
> > IO scheduler in guest kernel, e.g. users using spinning disk. For users
> > using fast disk device, e.g. SSD device, they can use bio-based IO path.
>
> Users using a spinning disk still get IO scheduling in the host though.
> What benefit is there in doing it in the guest as well?
Another thing is that some cgroup features are implemented in the IO
scheduler (cfq), e.g. a bio-based driver will be able to use cgroup-based
throttling, but IO weights won't work. Not sure how meaningful
this is, though.
With that said, I think this sort of feature switching is quite ugly.
The pros and cons of each choice aren't obvious unless one is familiar
with implementation details. IMHO, if the benefits of ioscheds aren't
critical, it would be better to just go with the bio-based implementation.
Thanks.
--
tejun
^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [PATCH 1/3] block: Introduce __blk_segment_map_sg() helper
2012-06-18 6:53 ` Asias He
@ 2012-06-18 21:31 ` Tejun Heo
-1 siblings, 0 replies; 80+ messages in thread
From: Tejun Heo @ 2012-06-18 21:31 UTC (permalink / raw)
To: Asias He; +Cc: kvm, linux-kernel, virtualization, Jens Axboe, Shaohua Li
Hello, Asias.
On Mon, Jun 18, 2012 at 02:53:08PM +0800, Asias He wrote:
> Split the mapping code in blk_rq_map_sg() to a helper
> __blk_segment_map_sg(), so that other mapping function, e.g.
> blk_bio_map_sg(), can share the code.
>
> Cc: Jens Axboe <axboe@kernel.dk>
> Cc: Tejun Heo <tj@kernel.org>
> Cc: Shaohua Li <shli@kernel.org>
> Cc: linux-kernel@vger.kernel.org
> Suggested-by: Tejun Heo <tj@kernel.org>
> Suggested-by: Jens Axboe <axboe@kernel.dk>
> Signed-off-by: Asias He <asias@redhat.com>
> ---
> block/blk-merge.c | 80 ++++++++++++++++++++++++++++++-----------------------
> 1 file changed, 45 insertions(+), 35 deletions(-)
>
> diff --git a/block/blk-merge.c b/block/blk-merge.c
> index 160035f..576b68e 100644
> --- a/block/blk-merge.c
> +++ b/block/blk-merge.c
> @@ -110,6 +110,49 @@ static int blk_phys_contig_segment(struct request_queue *q, struct bio *bio,
> return 0;
> }
>
> +static void
> +__blk_segment_map_sg(struct request_queue *q, struct bio_vec *bvec,
> + struct scatterlist *sglist, struct bio_vec **bvprv,
> + struct scatterlist **sg, int *nsegs, int *cluster)
> +{
> +
> + int nbytes = bvec->bv_len;
> +
> + if (*bvprv && *cluster) {
> + if ((*sg)->length + nbytes > queue_max_segment_size(q))
> + goto new_segment;
> +
> + if (!BIOVEC_PHYS_MERGEABLE(*bvprv, bvec))
> + goto new_segment;
> + if (!BIOVEC_SEG_BOUNDARY(q, *bvprv, bvec))
> + goto new_segment;
> +
> + (*sg)->length += nbytes;
> + } else {
> +new_segment:
> + if (!*sg)
> + *sg = sglist;
> + else {
> + /*
> + * If the driver previously mapped a shorter
> + * list, we could see a termination bit
> + * prematurely unless it fully inits the sg
> + * table on each mapping. We KNOW that there
> + * must be more entries here or the driver
> + * would be buggy, so force clear the
> + * termination bit to avoid doing a full
> + * sg_init_table() in drivers for each command.
> + */
> + (*sg)->page_link &= ~0x02;
> + *sg = sg_next(*sg);
> + }
> +
> + sg_set_page(*sg, bvec->bv_page, nbytes, bvec->bv_offset);
> + (*nsegs)++;
> + }
> + *bvprv = bvec;
> +}
I *hope* this can be made a bit prettier. e.g. Do we really need to
pass in @sglist and keep using "goto new_segment"?
Thanks.
--
tejun
^ permalink raw reply [flat|nested] 80+ messages in thread
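Tejun's refactoring question can be illustrated with a small userspace model of the same control flow: hoisting the merge checks into one predicate removes the `goto new_segment` label, and tracking the current segment by index avoids threading state through pointer-to-pointer arguments. This is only a sketch of the structure (the queue-limit, cluster, and boundary checks are reduced to a single size cap plus a contiguity test), not the kernel code itself:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for struct bio_vec and struct scatterlist. */
struct vec { size_t len; size_t phys; };  /* phys: fake physical address */
struct seg { size_t len; };

#define MAX_SEG_SIZE 4096  /* stands in for queue_max_segment_size(q) */

/* All merge criteria in one predicate, replacing the goto fan-out. */
static bool can_merge(const struct seg *sg, const struct vec *prev,
		      const struct vec *cur)
{
	if (sg->len + cur->len > MAX_SEG_SIZE)
		return false;
	/* models BIOVEC_PHYS_MERGEABLE: vectors are physically contiguous */
	return prev->phys + prev->len == cur->phys;
}

/* Map vectors into segments; returns the number of segments used. */
static int map_segments(const struct vec *v, int n, struct seg *out)
{
	const struct vec *prev = NULL;
	int nsegs = 0;

	for (int i = 0; i < n; i++) {
		if (prev && can_merge(&out[nsegs - 1], prev, &v[i]))
			out[nsegs - 1].len += v[i].len;   /* extend segment */
		else
			out[nsegs++].len = v[i].len;      /* start segment */
		prev = &v[i];
	}
	return nsegs;
}
```

With that shape the caller only hands in the output array; whether the same simplification is practical in blk-merge.c depends on the cluster flag and the sg termination-bit handling elided here.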
* Re: [PATCH 1/3] block: Introduce __blk_segment_map_sg() helper
2012-06-18 21:31 ` Tejun Heo
@ 2012-06-19 2:02 ` Asias He
-1 siblings, 0 replies; 80+ messages in thread
From: Asias He @ 2012-06-19 2:02 UTC (permalink / raw)
To: Tejun Heo; +Cc: kvm, linux-kernel, virtualization, Jens Axboe, Shaohua Li
On 06/19/2012 05:31 AM, Tejun Heo wrote:
> Hello, Asias.
>
> On Mon, Jun 18, 2012 at 02:53:08PM +0800, Asias He wrote:
>> Split the mapping code in blk_rq_map_sg() to a helper
>> __blk_segment_map_sg(), so that other mapping function, e.g.
>> blk_bio_map_sg(), can share the code.
>>
>> Cc: Jens Axboe <axboe@kernel.dk>
>> Cc: Tejun Heo <tj@kernel.org>
>> Cc: Shaohua Li <shli@kernel.org>
>> Cc: linux-kernel@vger.kernel.org
>> Suggested-by: Tejun Heo <tj@kernel.org>
>> Suggested-by: Jens Axboe <axboe@kernel.dk>
>> Signed-off-by: Asias He <asias@redhat.com>
>> ---
>> block/blk-merge.c | 80 ++++++++++++++++++++++++++++++-----------------------
>> 1 file changed, 45 insertions(+), 35 deletions(-)
>>
>> diff --git a/block/blk-merge.c b/block/blk-merge.c
>> index 160035f..576b68e 100644
>> --- a/block/blk-merge.c
>> +++ b/block/blk-merge.c
>> @@ -110,6 +110,49 @@ static int blk_phys_contig_segment(struct request_queue *q, struct bio *bio,
>> return 0;
>> }
>>
>> +static void
>> +__blk_segment_map_sg(struct request_queue *q, struct bio_vec *bvec,
>> + struct scatterlist *sglist, struct bio_vec **bvprv,
>> + struct scatterlist **sg, int *nsegs, int *cluster)
>> +{
>> +
>> + int nbytes = bvec->bv_len;
>> +
>> + if (*bvprv && *cluster) {
>> + if ((*sg)->length + nbytes > queue_max_segment_size(q))
>> + goto new_segment;
>> +
>> + if (!BIOVEC_PHYS_MERGEABLE(*bvprv, bvec))
>> + goto new_segment;
>> + if (!BIOVEC_SEG_BOUNDARY(q, *bvprv, bvec))
>> + goto new_segment;
>> +
>> + (*sg)->length += nbytes;
>> + } else {
>> +new_segment:
>> + if (!*sg)
>> + *sg = sglist;
>> + else {
>> + /*
>> + * If the driver previously mapped a shorter
>> + * list, we could see a termination bit
>> + * prematurely unless it fully inits the sg
>> + * table on each mapping. We KNOW that there
>> + * must be more entries here or the driver
>> + * would be buggy, so force clear the
>> + * termination bit to avoid doing a full
>> + * sg_init_table() in drivers for each command.
>> + */
>> + (*sg)->page_link &= ~0x02;
>> + *sg = sg_next(*sg);
>> + }
>> +
>> + sg_set_page(*sg, bvec->bv_page, nbytes, bvec->bv_offset);
>> + (*nsegs)++;
>> + }
>> + *bvprv = bvec;
>> +}
>
> I *hope* this is a bit prettier. e.g. Do we really need to pass in
> @sglist and keep using "goto new_segment"?
I think this deserves another patch on top of this splitting one. I'd
like to clean this up later.
--
Asias
^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [PATCH 3/3] virtio-blk: Add bio-based IO path for virtio-blk
2012-06-18 10:13 ` Michael S. Tsirkin
@ 2012-06-19 2:21 ` Asias He
-1 siblings, 0 replies; 80+ messages in thread
From: Asias He @ 2012-06-19 2:21 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: Rusty Russell, kvm, linux-kernel, virtualization,
Christoph Hellwig, Minchan Kim
On 06/18/2012 06:13 PM, Michael S. Tsirkin wrote:
> On Mon, Jun 18, 2012 at 04:03:23PM +0800, Asias He wrote:
>> On 06/18/2012 03:46 PM, Rusty Russell wrote:
>>> On Mon, 18 Jun 2012 14:53:10 +0800, Asias He <asias@redhat.com> wrote:
>>>> This patch introduces bio-based IO path for virtio-blk.
>>>
>>> Why make it optional?
>>
>> request-based IO path is useful for users who do not want to bypass
>> the IO scheduler in guest kernel, e.g. users using spinning disk.
>> For users using fast disk device, e.g. SSD device, they can use
>> bio-based IO path.
>
> OK I guess but then it should be per-device. There could be
> a mix of slow and fast disks :)
Yes, per-device might be useful. There are issues which need solving:
- How do we tell the driver which IO path to use?
  - The device could add some flag
    - Old qemu/lkvm can not turn this feature on
  - Through a /sys filesystem attribute
- How do we handle the switch from one path to another?
So, let's add the per-device feature later.
--
Asias
^ permalink raw reply [flat|nested] 80+ messages in thread
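One way to read the "old qemu/lkvm can not turn this feature on" concern is as ordinary virtio feature negotiation: the guest only enables the bio path if the host advertises a feature bit and the user asked for it. The bit name, value, and helper below are invented purely for illustration; no such VIRTIO_BLK_F_BIO_PATH bit exists in the virtio spec:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical feature bit; NOT a real virtio-blk feature. */
#define VIRTIO_BLK_F_BIO_PATH  (1u << 13)

enum io_path { IO_PATH_REQUEST, IO_PATH_BIO };

/*
 * Old hosts never advertise the bit, so they silently keep the
 * request-based path even if the user requested bio mode.
 */
static enum io_path select_io_path(unsigned int host_features,
				   bool user_wants_bio)
{
	if (user_wants_bio && (host_features & VIRTIO_BLK_F_BIO_PATH))
		return IO_PATH_BIO;
	return IO_PATH_REQUEST;
}
```

A /sys attribute would then only toggle user_wants_bio; switching an already-probed device between paths at runtime is the harder problem the thread defers.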
* Re: [PATCH 3/3] virtio-blk: Add bio-based IO path for virtio-blk
2012-06-18 10:21 ` Michael S. Tsirkin
@ 2012-06-19 2:30 ` Asias He
-1 siblings, 0 replies; 80+ messages in thread
From: Asias He @ 2012-06-19 2:30 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: kvm, linux-kernel, virtualization, Rusty Russell,
Christoph Hellwig, Minchan Kim
On 06/18/2012 06:21 PM, Michael S. Tsirkin wrote:
> On Mon, Jun 18, 2012 at 02:53:10PM +0800, Asias He wrote:
>> +static void virtblk_make_request(struct request_queue *q, struct bio *bio)
>> +{
>> + struct virtio_blk *vblk = q->queuedata;
>> + unsigned int num, out = 0, in = 0;
>> + struct virtblk_req *vbr;
>> +
>> + BUG_ON(bio->bi_phys_segments + 2 > vblk->sg_elems);
>> + BUG_ON(bio->bi_rw & (REQ_FLUSH | REQ_FUA));
>> +
>> + vbr = virtblk_alloc_req(vblk, GFP_NOIO);
>> + if (!vbr) {
>> + bio_endio(bio, -ENOMEM);
>> + return;
>> + }
>> +
>> + vbr->bio = bio;
>> + vbr->req = NULL;
>> + vbr->out_hdr.type = 0;
>> + vbr->out_hdr.sector = bio->bi_sector;
>> + vbr->out_hdr.ioprio = bio_prio(bio);
>> +
>> + sg_set_buf(&vbr->sg[out++], &vbr->out_hdr, sizeof(vbr->out_hdr));
>> +
>> + num = blk_bio_map_sg(q, bio, vbr->sg + out);
>> +
>> + sg_set_buf(&vbr->sg[num + out + in++], &vbr->status,
>> + sizeof(vbr->status));
>> +
>> + if (num) {
>> + if (bio->bi_rw & REQ_WRITE) {
>> + vbr->out_hdr.type |= VIRTIO_BLK_T_OUT;
>> + out += num;
>> + } else {
>> + vbr->out_hdr.type |= VIRTIO_BLK_T_IN;
>> + in += num;
>> + }
>> + }
>> +
>> + spin_lock_irq(vblk->disk->queue->queue_lock);
>> + if (virtqueue_add_buf(vblk->vq, vbr->sg, out, in, vbr,
>> + GFP_ATOMIC) < 0) {
>> + spin_unlock_irq(vblk->disk->queue->queue_lock);
>
> Any implications of dropping lock like that?
> E.g. for suspend. like we are still discussing with
> unlocked kick?
>
>> + virtblk_add_buf_wait(vblk, vbr, out, in);
>> + } else {
>> + virtqueue_kick(vblk->vq);
>
> Why special case the first call? task state manipulation so expensive?
Hmm. Will switch them.
>
>> + spin_unlock_irq(vblk->disk->queue->queue_lock);
>> + }
>> +}
>> +
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
--
Asias
^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [PATCH 3/3] virtio-blk: Add bio-based IO path for virtio-blk
2012-06-18 10:05 ` Rusty Russell
@ 2012-06-19 2:39 ` Asias He
-1 siblings, 0 replies; 80+ messages in thread
From: Asias He @ 2012-06-19 2:39 UTC (permalink / raw)
To: Rusty Russell
Cc: kvm, linux-kernel, virtualization, Michael S. Tsirkin,
Christoph Hellwig, Minchan Kim
On 06/18/2012 06:05 PM, Rusty Russell wrote:
> On Mon, 18 Jun 2012 16:03:23 +0800, Asias He <asias@redhat.com> wrote:
>> On 06/18/2012 03:46 PM, Rusty Russell wrote:
>>> On Mon, 18 Jun 2012 14:53:10 +0800, Asias He <asias@redhat.com> wrote:
>>>> This patch introduces bio-based IO path for virtio-blk.
>>>
>>> Why make it optional?
>>
>> request-based IO path is useful for users who do not want to bypass the
>> IO scheduler in guest kernel, e.g. users using spinning disk. For users
>> using fast disk device, e.g. SSD device, they can use bio-based IO path.
>
> Users using a spinning disk still get IO scheduling in the host though.
> What benefit is there in doing it in the guest as well?
Merging in the guest kernel's IO scheduler can reduce the number of
requests the guest fires to the host side. For instance, with the same
workload on the guest side, the number of requests drops to ~200K from
~4000K with the guest kernel's merging, in qemu.
>
> Cheers,
> Rusty.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>
--
Asias
^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [PATCH 3/3] virtio-blk: Add bio-based IO path for virtio-blk
2012-06-18 11:39 ` Sasha Levin
@ 2012-06-19 2:51 ` Asias He
-1 siblings, 0 replies; 80+ messages in thread
From: Asias He @ 2012-06-19 2:51 UTC (permalink / raw)
To: Sasha Levin
Cc: dlaor, Rusty Russell, kvm, Michael S. Tsirkin, linux-kernel,
virtualization, Christoph Hellwig
On 06/18/2012 07:39 PM, Sasha Levin wrote:
> On Mon, 2012-06-18 at 14:14 +0300, Dor Laor wrote:
>> On 06/18/2012 01:05 PM, Rusty Russell wrote:
>>> On Mon, 18 Jun 2012 16:03:23 +0800, Asias He<asias@redhat.com> wrote:
>>>> On 06/18/2012 03:46 PM, Rusty Russell wrote:
>>>>> On Mon, 18 Jun 2012 14:53:10 +0800, Asias He<asias@redhat.com> wrote:
>>>>>> This patch introduces bio-based IO path for virtio-blk.
>>>>>
>>>>> Why make it optional?
>>>>
>>>> request-based IO path is useful for users who do not want to bypass the
>>>> IO scheduler in guest kernel, e.g. users using spinning disk. For users
>>>> using fast disk device, e.g. SSD device, they can use bio-based IO path.
>>>
>>> Users using a spinning disk still get IO scheduling in the host though.
>>> What benefit is there in doing it in the guest as well?
>>
>> The io scheduler waits for requests to merge and thus batch IOs
>> together. It's not important w.r.t spinning disks since the host can do
>> it but it causes much less vmexits which is the key issue for VMs.
>
> Is the amount of exits caused by virtio-blk significant at all with
> EVENT_IDX?
Yes. EVENT_IDX reduces the number of notifies and interrupts. Take the
interrupts as an example: the guest fires 200K requests to the host, and
the number of interrupts is about 6K thanks to EVENT_IDX. The ratio is
200K / 6K = 33. The merge ratio is 4000K / 200K = 20.
--
Asias
^ permalink raw reply [flat|nested] 80+ messages in thread
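The two ratios can be sanity-checked with the figures quoted in the thread (~4000K bios merged down to ~200K requests, serviced with ~6K interrupts under EVENT_IDX); the helpers are ours, the numbers are from the discussion above:

```c
#include <assert.h>

/* Requests serviced per interrupt with EVENT_IDX suppression. */
static long irq_ratio(long requests, long interrupts)
{
	return requests / interrupts;
}

/* Bios issued per request after IO-scheduler merging. */
static long merge_ratio(long bios, long requests)
{
	return bios / requests;
}
```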
* Re: [PATCH v2 0/3] Improve virtio-blk performance
2012-06-18 10:58 ` Stefan Hajnoczi
@ 2012-06-19 4:24 ` Asias He
-1 siblings, 0 replies; 80+ messages in thread
From: Asias He @ 2012-06-19 4:24 UTC (permalink / raw)
To: Stefan Hajnoczi; +Cc: kvm, linux-kernel, virtualization
On 06/18/2012 06:58 PM, Stefan Hajnoczi wrote:
> As long as the latency is decreasing that's good. But It's worth
> keeping in mind that these percentages are probably wildly different
> on real storage devices and/or qemu-kvm. What we don't know here is
> whether this bottleneck matters in real environments - results with
> real storage and with qemu-kvm would be interesting.
Yes. Here is the performance data on a Fusion-IO SSD device.
The fio test was performed in an 8-vcpu guest backed by a Fusion-IO
device, using kvm tool.
Short version:
With bio-based IO path, sequential read/write, random read/write
IOPS boost : 11%, 11%, 13%, 10%
Latency improvement: 10%, 10%, 12%, 10%
Long Version:
With bio-based IO path:
read : io=2048.0MB, bw=58920KB/s, iops=117840 , runt= 35593msec
write: io=2048.0MB, bw=64308KB/s, iops=128616 , runt= 32611msec
read : io=3095.7MB, bw=59633KB/s, iops=119266 , runt= 53157msec
write: io=3095.7MB, bw=62993KB/s, iops=125985 , runt= 50322msec
clat (usec): min=0 , max=1284.3K, avg=128109.01, stdev=71513.29
clat (usec): min=94 , max=962339 , avg=116832.95, stdev=65836.80
clat (usec): min=0 , max=1846.6K, avg=128509.99, stdev=89575.07
clat (usec): min=0 , max=2256.4K, avg=121361.84, stdev=82747.25
cpu : usr=56.79%, sys=421.70%, ctx=147335118, majf=21080, minf=19852517
cpu : usr=61.81%, sys=455.53%, ctx=143269950, majf=16027, minf=24800604
cpu : usr=63.10%, sys=455.38%, ctx=178373538, majf=16958, minf=24822612
cpu : usr=62.04%, sys=453.58%, ctx=226902362, majf=16089, minf=23278105
With request-based IO path:
read : io=2048.0MB, bw=52896KB/s, iops=105791 , runt= 39647msec
write: io=2048.0MB, bw=57856KB/s, iops=115711 , runt= 36248msec
read : io=3095.7MB, bw=52387KB/s, iops=104773 , runt= 60510msec
write: io=3095.7MB, bw=57310KB/s, iops=114619 , runt= 55312msec
clat (usec): min=0 , max=1532.6K, avg=142085.62, stdev=109196.84
clat (usec): min=0 , max=1487.4K, avg=129110.71, stdev=114973.64
clat (usec): min=0 , max=1388.6K, avg=145049.22, stdev=107232.55
clat (usec): min=0 , max=1465.9K, avg=133585.67, stdev=110322.95
cpu : usr=44.08%, sys=590.71%, ctx=451812322, majf=14841, minf=17648641
cpu : usr=48.73%, sys=610.78%, ctx=418953997, majf=22164, minf=26850689
cpu : usr=45.58%, sys=581.16%, ctx=714079216, majf=21497, minf=22558223
cpu : usr=48.40%, sys=599.65%, ctx=656089423, majf=16393, minf=23824409
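For reference, a job file along these lines would drive this kind of fio run. This is only a sketch: the exact job file used for the numbers above was not posted, so the device name, block size, and queue depth here are assumptions.

```ini
; Hypothetical fio job approximating the test described above:
; an 8-vcpu guest doing direct I/O against its virtio-blk disk.
[global]
filename=/dev/vdb      ; virtio-blk device inside the guest (assumed name)
direct=1
ioengine=libaio
iodepth=32             ; assumed queue depth
bs=4k                  ; assumed block size
runtime=60
group_reporting

[seq-read]
rw=read

[rand-read]
rw=randread
```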
--
Asias
^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [PATCH 3/3] virtio-blk: Add bio-based IO path for virtio-blk
2012-06-19 2:51 ` Asias He
@ 2012-06-19 6:21 ` Dor Laor
-1 siblings, 0 replies; 80+ messages in thread
From: Dor Laor @ 2012-06-19 6:21 UTC (permalink / raw)
To: Asias He
Cc: Sasha Levin, Rusty Russell, kvm, Michael S. Tsirkin,
linux-kernel, virtualization, Christoph Hellwig
On 06/19/2012 05:51 AM, Asias He wrote:
> On 06/18/2012 07:39 PM, Sasha Levin wrote:
>> On Mon, 2012-06-18 at 14:14 +0300, Dor Laor wrote:
>>> On 06/18/2012 01:05 PM, Rusty Russell wrote:
>>>> On Mon, 18 Jun 2012 16:03:23 +0800, Asias He<asias@redhat.com> wrote:
>>>>> On 06/18/2012 03:46 PM, Rusty Russell wrote:
>>>>>> On Mon, 18 Jun 2012 14:53:10 +0800, Asias He<asias@redhat.com> wrote:
>>>>>>> This patch introduces bio-based IO path for virtio-blk.
>>>>>>
>>>>>> Why make it optional?
>>>>>
>>>>> request-based IO path is useful for users who do not want to bypass
>>>>> the
>>>>> IO scheduler in guest kernel, e.g. users using spinning disk. For
>>>>> users
>>>>> using fast disk device, e.g. SSD device, they can use bio-based IO
>>>>> path.
>>>>
>>>> Users using a spinning disk still get IO scheduling in the host though.
>>>> What benefit is there in doing it in the guest as well?
>>>
>>> The io scheduler waits for requests to merge and thus batch IOs
>>> together. It's not important w.r.t spinning disks since the host can do
>>> it but it causes much less vmexits which is the key issue for VMs.
>>
>> Is the amount of exits caused by virtio-blk significant at all with
>> EVENT_IDX?
>
> Yes. EVENT_IDX saves the number of notify and interrupt. Let's take the
> interrupt as an example, The guest fires 200K request to host, the
> number of interrupt is about 6K thanks to EVENT_IDX. The ratio is 200K /
> 6K = 33. The ratio of merging is 40000K / 200K = 20.
>
In this case, why don't you always recommend bio over request based?
^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [PATCH v2 0/3] Improve virtio-blk performance
2012-06-19 4:24 ` Asias He
@ 2012-06-19 10:14 ` Stefan Hajnoczi
-1 siblings, 0 replies; 80+ messages in thread
From: Stefan Hajnoczi @ 2012-06-19 10:14 UTC (permalink / raw)
To: Asias He; +Cc: kvm, linux-kernel, virtualization
On Tue, Jun 19, 2012 at 5:24 AM, Asias He <asias@redhat.com> wrote:
> On 06/18/2012 06:58 PM, Stefan Hajnoczi wrote:
>>
>> As long as the latency is decreasing that's good. But It's worth
>> keeping in mind that these percentages are probably wildly different
>> on real storage devices and/or qemu-kvm. What we don't know here is
>> whether this bottleneck matters in real environments - results with
>> real storage and with qemu-kvm would be interesting.
>
>
> Yes. Here is the performance data on a Fusion-IO SSD device.
>
> Fio test is performed in a 8 vcpu guest with Fusion-IO based guest using kvm
> tool.
>
> Short version:
> With bio-based IO path, sequential read/write, random read/write
> IOPS boost : 11%, 11%, 13%, 10%
> Latency improvement: 10%, 10%, 12%, 10%
Nice, I'm glad the improvement shows on real hardware.
Stefan
^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [PATCH 1/3] block: Introduce __blk_segment_map_sg() helper
2012-06-19 2:02 ` Asias He
@ 2012-06-19 18:00 ` Tejun Heo
-1 siblings, 0 replies; 80+ messages in thread
From: Tejun Heo @ 2012-06-19 18:00 UTC (permalink / raw)
To: Asias He; +Cc: kvm, linux-kernel, virtualization, Jens Axboe, Shaohua Li
Hello,
On Mon, Jun 18, 2012 at 7:02 PM, Asias He <asias@redhat.com> wrote:
>> I *hope* this is a bit prettier. e.g. Do we really need to pass in
>> @sglist and keep using "goto new_segment"?
>
> I think this deserves another patch on top of this splitting one. I'd like
> to clean this up later.
Yeap, doing it in a separate patch would be better. It would be great
if you can include such patch in this series. :)
Thanks.
--
tejun
^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [PATCH 3/3] virtio-blk: Add bio-based IO path for virtio-blk
2012-06-19 6:21 ` Dor Laor
@ 2012-06-20 4:46 ` Asias He
-1 siblings, 0 replies; 80+ messages in thread
From: Asias He @ 2012-06-20 4:46 UTC (permalink / raw)
To: dlaor
Cc: Sasha Levin, Rusty Russell, kvm, Michael S. Tsirkin,
linux-kernel, virtualization, Christoph Hellwig
On 06/19/2012 02:21 PM, Dor Laor wrote:
> On 06/19/2012 05:51 AM, Asias He wrote:
>> On 06/18/2012 07:39 PM, Sasha Levin wrote:
>>> On Mon, 2012-06-18 at 14:14 +0300, Dor Laor wrote:
>>>> On 06/18/2012 01:05 PM, Rusty Russell wrote:
>>>>> On Mon, 18 Jun 2012 16:03:23 +0800, Asias He<asias@redhat.com> wrote:
>>>>>> On 06/18/2012 03:46 PM, Rusty Russell wrote:
>>>>>>> On Mon, 18 Jun 2012 14:53:10 +0800, Asias He<asias@redhat.com>
>>>>>>> wrote:
>>>>>>>> This patch introduces bio-based IO path for virtio-blk.
>>>>>>>
>>>>>>> Why make it optional?
>>>>>>
>>>>>> request-based IO path is useful for users who do not want to bypass
>>>>>> the
>>>>>> IO scheduler in guest kernel, e.g. users using spinning disk. For
>>>>>> users
>>>>>> using fast disk device, e.g. SSD device, they can use bio-based IO
>>>>>> path.
>>>>>
>>>>> Users using a spinning disk still get IO scheduling in the host
>>>>> though.
>>>>> What benefit is there in doing it in the guest as well?
>>>>
>>>> The io scheduler waits for requests to merge and thus batch IOs
>>>> together. It's not important w.r.t spinning disks since the host can do
>>>> it but it causes much less vmexits which is the key issue for VMs.
>>>
>>> Is the amount of exits caused by virtio-blk significant at all with
>>> EVENT_IDX?
>>
>> Yes. EVENT_IDX saves the number of notify and interrupt. Let's take the
>> interrupt as an example, The guest fires 200K request to host, the
>> number of interrupt is about 6K thanks to EVENT_IDX. The ratio is 200K /
>> 6K = 33. The ratio of merging is 40000K / 200K = 20.
>>
>
> In this case, why don't you always recommend bio over request based?
This case shows that the IO scheduler's merging in the guest saves a lot
of requests on the host side. Why should I recommend bio-based over
request-based here?
--
Asias
^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [PATCH 3/3] virtio-blk: Add bio-based IO path for virtio-blk
2012-06-20 4:46 ` Asias He
@ 2012-06-21 9:49 ` Dor Laor
-1 siblings, 0 replies; 80+ messages in thread
From: Dor Laor @ 2012-06-21 9:49 UTC (permalink / raw)
To: Asias He
Cc: Sasha Levin, Rusty Russell, kvm, Michael S. Tsirkin,
linux-kernel, virtualization, Christoph Hellwig
On 06/20/2012 07:46 AM, Asias He wrote:
> On 06/19/2012 02:21 PM, Dor Laor wrote:
>> On 06/19/2012 05:51 AM, Asias He wrote:
>>> On 06/18/2012 07:39 PM, Sasha Levin wrote:
>>>> On Mon, 2012-06-18 at 14:14 +0300, Dor Laor wrote:
>>>>> On 06/18/2012 01:05 PM, Rusty Russell wrote:
>>>>>> On Mon, 18 Jun 2012 16:03:23 +0800, Asias He<asias@redhat.com> wrote:
>>>>>>> On 06/18/2012 03:46 PM, Rusty Russell wrote:
>>>>>>>> On Mon, 18 Jun 2012 14:53:10 +0800, Asias He<asias@redhat.com>
>>>>>>>> wrote:
>>>>>>>>> This patch introduces bio-based IO path for virtio-blk.
>>>>>>>>
>>>>>>>> Why make it optional?
>>>>>>>
>>>>>>> request-based IO path is useful for users who do not want to bypass
>>>>>>> the
>>>>>>> IO scheduler in guest kernel, e.g. users using spinning disk. For
>>>>>>> users
>>>>>>> using fast disk device, e.g. SSD device, they can use bio-based IO
>>>>>>> path.
>>>>>>
>>>>>> Users using a spinning disk still get IO scheduling in the host
>>>>>> though.
>>>>>> What benefit is there in doing it in the guest as well?
>>>>>
>>>>> The io scheduler waits for requests to merge and thus batch IOs
>>>>> together. It's not important w.r.t spinning disks since the host
>>>>> can do
>>>>> it but it causes much less vmexits which is the key issue for VMs.
>>>>
>>>> Is the amount of exits caused by virtio-blk significant at all with
>>>> EVENT_IDX?
>>>
>>> Yes. EVENT_IDX saves the number of notify and interrupt. Let's take the
>>> interrupt as an example, The guest fires 200K request to host, the
>>> number of interrupt is about 6K thanks to EVENT_IDX. The ratio is 200K /
>>> 6K = 33. The ratio of merging is 40000K / 200K = 20.
>>>
>>
>> In this case, why don't you always recommend bio over request based?
>
> This case shows that IO scheduler's merging in guest saves a lot of
> requests to host side. Why should I recommend bio over request based here?
>
Does it merge 20 requests _on top_ of what EVENT_IDX does? Of course, if
that's the case, we should keep that.
Dor
^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [PATCH 3/3] virtio-blk: Add bio-based IO path for virtio-blk
2012-06-19 2:51 ` Asias He
(?)
(?)
@ 2012-07-01 23:54 ` Rusty Russell
2012-07-02 2:45 ` Asias He
-1 siblings, 1 reply; 80+ messages in thread
From: Rusty Russell @ 2012-07-01 23:54 UTC (permalink / raw)
To: Asias He, Sasha Levin
Cc: dlaor, kvm, Michael S. Tsirkin, linux-kernel, virtualization,
Christoph Hellwig
On Tue, 19 Jun 2012 10:51:18 +0800, Asias He <asias@redhat.com> wrote:
> On 06/18/2012 07:39 PM, Sasha Levin wrote:
> > On Mon, 2012-06-18 at 14:14 +0300, Dor Laor wrote:
> >> On 06/18/2012 01:05 PM, Rusty Russell wrote:
> >>> On Mon, 18 Jun 2012 16:03:23 +0800, Asias He<asias@redhat.com> wrote:
> >>>> On 06/18/2012 03:46 PM, Rusty Russell wrote:
> >>>>> On Mon, 18 Jun 2012 14:53:10 +0800, Asias He<asias@redhat.com> wrote:
> >>>>>> This patch introduces bio-based IO path for virtio-blk.
> >>>>>
> >>>>> Why make it optional?
> >>>>
> >>>> request-based IO path is useful for users who do not want to bypass the
> >>>> IO scheduler in guest kernel, e.g. users using spinning disk. For users
> >>>> using fast disk device, e.g. SSD device, they can use bio-based IO path.
> >>>
> >>> Users using a spinning disk still get IO scheduling in the host though.
> >>> What benefit is there in doing it in the guest as well?
> >>
> >> The io scheduler waits for requests to merge and thus batch IOs
> >> together. It's not important w.r.t spinning disks since the host can do
> >> it but it causes much less vmexits which is the key issue for VMs.
> >
> > Is the amount of exits caused by virtio-blk significant at all with
> > EVENT_IDX?
>
> Yes. EVENT_IDX saves the number of notify and interrupt. Let's take the
> interrupt as an example, The guest fires 200K request to host, the
> number of interrupt is about 6K thanks to EVENT_IDX. The ratio is 200K /
> 6K = 33. The ratio of merging is 40000K / 200K = 20.
Confused. So, without merging we get 6k exits (per second?) How many
do we get when we use the request-based IO path?
If your device is slow, then you won't be able to make many requests per
second: why worry about exit costs? If your device is fast (eg. ram),
you've already shown that your patch is a win, right?
Cheers,
Rusty.
^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [PATCH 3/3] virtio-blk: Add bio-based IO path for virtio-blk
2012-07-01 23:54 ` Rusty Russell
@ 2012-07-02 2:45 ` Asias He
0 siblings, 0 replies; 80+ messages in thread
From: Asias He @ 2012-07-02 2:45 UTC (permalink / raw)
To: Rusty Russell
Cc: Sasha Levin, dlaor, kvm, Michael S. Tsirkin, linux-kernel,
virtualization, Christoph Hellwig
On 07/02/2012 07:54 AM, Rusty Russell wrote:
> On Tue, 19 Jun 2012 10:51:18 +0800, Asias He <asias@redhat.com> wrote:
>> On 06/18/2012 07:39 PM, Sasha Levin wrote:
>>> On Mon, 2012-06-18 at 14:14 +0300, Dor Laor wrote:
>>>> On 06/18/2012 01:05 PM, Rusty Russell wrote:
>>>>> On Mon, 18 Jun 2012 16:03:23 +0800, Asias He<asias@redhat.com> wrote:
>>>>>> On 06/18/2012 03:46 PM, Rusty Russell wrote:
>>>>>>> On Mon, 18 Jun 2012 14:53:10 +0800, Asias He<asias@redhat.com> wrote:
>>>>>>>> This patch introduces bio-based IO path for virtio-blk.
>>>>>>>
>>>>>>> Why make it optional?
>>>>>>
>>>>>> request-based IO path is useful for users who do not want to bypass the
>>>>>> IO scheduler in guest kernel, e.g. users using spinning disk. For users
>>>>>> using fast disk device, e.g. SSD device, they can use bio-based IO path.
>>>>>
>>>>> Users using a spinning disk still get IO scheduling in the host though.
>>>>> What benefit is there in doing it in the guest as well?
>>>>
>>>> The io scheduler waits for requests to merge and thus batch IOs
>>>> together. It's not important w.r.t spinning disks since the host can do
>>>> it but it causes much less vmexits which is the key issue for VMs.
>>>
>>> Is the amount of exits caused by virtio-blk significant at all with
>>> EVENT_IDX?
>>
>> Yes. EVENT_IDX saves the number of notify and interrupt. Let's take the
>> interrupt as an example, The guest fires 200K request to host, the
>> number of interrupt is about 6K thanks to EVENT_IDX. The ratio is 200K /
>> 6K = 33. The ratio of merging is 40000K / 200K = 20.
>
> Confused. So, without merging we get 6k exits (per second?) How many
> do we get when we use the request-based IO path?
Sorry for the confusion. The numbers were collected on the request-based
IO path, where the guest block layer can merge requests.
With the same workload in the guest, the guest fires 200K requests at
the host with merging enabled (echo 0 > /sys/block/vdb/queue/nomerges),
while it fires 40000K requests at the host with merging disabled (echo 2
> /sys/block/vdb/queue/nomerges). This shows that merging in the block
layer greatly reduces the total number of requests fired at the host
(40000K / 200K = 20).
With merging enabled (echo 0 > /sys/block/vdb/queue/nomerges), the host
fires 6K interrupts in total for the 200K requests, which shows the
interrupt coalescing ratio (200K / 6K = 33).
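The merging effect can be observed from inside a guest without any host-side tracing, since /proc/diskstats counts both completed and merged requests per device. A minimal sketch (the sample line below is made up for illustration; it mirrors a 20x ratio but is not data from the test above):

```python
# Estimate how much the guest block layer's merging shrinks the request
# stream, from /proc/diskstats counters. Fields after the device name:
# 1 = reads completed, 2 = reads merged, 5 = writes completed, 6 = writes merged.
def merge_ratio(diskstats_line: str) -> float:
    fields = diskstats_line.split()
    rd_ios, rd_merges = int(fields[3]), int(fields[4])
    wr_ios, wr_merges = int(fields[7]), int(fields[8])
    issued = rd_ios + wr_ios        # requests actually sent down the stack
    merged = rd_merges + wr_merges  # bios folded into existing requests
    # Without merging, every merged bio would have been its own request.
    return (issued + merged) / issued

# Made-up sample line for a virtio-blk disk: 200K requests issued,
# 3800K bios merged into them, i.e. a 20x reduction.
sample = "252 16 vdb 100000 1900000 0 0 100000 1900000 0 0 0 0 0"
print(merge_ratio(sample))  # -> 20.0
```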
>
> If your device is slow, then you won't be able to make many requests per
> second: why worry about exit costs?
If a device is slow, merging combines more requests and reduces the
total number of requests sent to the host. This saves exit costs, no?
> If your device is fast (eg. ram),
> you've already shown that your patch is a win, right?
Yes. Both on ramdisk and fast SSD device (e.g. FusionIO).
--
Asias
^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [PATCH 3/3] virtio-blk: Add bio-based IO path for virtio-blk
2012-07-02 2:45 ` Asias He
@ 2012-07-02 6:41 ` Rusty Russell
-1 siblings, 0 replies; 80+ messages in thread
From: Rusty Russell @ 2012-07-02 6:41 UTC (permalink / raw)
To: Asias He
Cc: Sasha Levin, dlaor, kvm, Michael S. Tsirkin, linux-kernel,
virtualization, Christoph Hellwig
On Mon, 02 Jul 2012 10:45:05 +0800, Asias He <asias@redhat.com> wrote:
> On 07/02/2012 07:54 AM, Rusty Russell wrote:
> > Confused. So, without merging we get 6k exits (per second?) How many
> > do we get when we use the request-based IO path?
>
> Sorry for the confusion. The numbers were collected from request-based
> IO path where we can have the guest block layer merge the requests.
>
> With the same workload in guest, the guest fires 200K requests to host
> with merges enabled in guest (echo 0 > /sys/block/vdb/queue/nomerges),
> while the guest fires 40000K requests to host with merges disabled in
> guest (echo 2 > /sys/block/vdb/queue/nomerges). This show that the merge
> in block layer reduces the total number of requests fire to host a lot
> (40000K / 200K = 20).
>
> The guest fires 200K requests to host with merges enabled in guest (echo
> 0 > /sys/block/vdb/queue/nomerges), the host fires 6K interrupts in
> total for the 200K requests. This show that the ratio of interrupts
> coalesced (200K / 6K = 33).
OK, got it! Guest merging cuts requests by a factor of 20. EVENT_IDX
cuts interrupts by a factor of 33.
> > If your device is slow, then you won't be able to make many requests per
> > second: why worry about exit costs?
>
> If a device is slow, the merge would merge more requests and reduce the
> total number of requests to host. This saves exit costs, no?
Sure, our guest merging might save us 100x as many exits as no merging.
But since we're not doing many requests, does it matter?
Ideally we'd merge requests only if the device queue is full. But that
sounds hard.
> > If your device is fast (eg. ram),
> > you've already shown that your patch is a win, right?
>
> Yes. Both on ramdisk and fast SSD device (e.g. FusionIO).
Thanks,
Rusty.
^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [PATCH 3/3] virtio-blk: Add bio-based IO path for virtio-blk
2012-07-02 6:41 ` Rusty Russell
@ 2012-07-03 0:39 ` Asias He
-1 siblings, 0 replies; 80+ messages in thread
From: Asias He @ 2012-07-03 0:39 UTC (permalink / raw)
To: Rusty Russell
Cc: Sasha Levin, dlaor, kvm, Michael S. Tsirkin, linux-kernel,
virtualization, Christoph Hellwig
On 07/02/2012 02:41 PM, Rusty Russell wrote:
> On Mon, 02 Jul 2012 10:45:05 +0800, Asias He <asias@redhat.com> wrote:
>> On 07/02/2012 07:54 AM, Rusty Russell wrote:
>>> Confused. So, without merging we get 6k exits (per second?) How many
>>> do we get when we use the request-based IO path?
>>
>> Sorry for the confusion. The numbers were collected from request-based
>> IO path where we can have the guest block layer merge the requests.
>>
>> With the same workload in guest, the guest fires 200K requests to host
>> with merges enabled in guest (echo 0 > /sys/block/vdb/queue/nomerges),
>> while the guest fires 40000K requests to host with merges disabled in
>> guest (echo 2 > /sys/block/vdb/queue/nomerges). This show that the merge
>> in block layer reduces the total number of requests fire to host a lot
>> (40000K / 200K = 20).
>>
>> The guest fires 200K requests to host with merges enabled in guest (echo
>> 0 > /sys/block/vdb/queue/nomerges), the host fires 6K interrupts in
>> total for the 200K requests. This show that the ratio of interrupts
>> coalesced (200K / 6K = 33).
>
> OK, got it! Guest merging cuts requests by a factor of 20. EVENT_IDX
> cuts interrupts by a factor of 33.
Yes.
>
>>> If your device is slow, then you won't be able to make many requests per
>>> second: why worry about exit costs?
>>
>> If a device is slow, the merge would merge more requests and reduce the
>> total number of requests to host. This saves exit costs, no?
>
> Sure, our guest merging might save us 100x as many exits as no merging.
> But since we're not doing many requests, does it matter?
We can still have many requests with slow devices. The number of
requests depends on the workload in the guest, e.g. 512 IO threads in the
guest that keep doing IO.
> Ideally we'd merge requests only if the device queue is full. But that
> sounds hard.
>
>>> If your device is fast (eg. ram),
>>> you've already shown that your patch is a win, right?
>>
>> Yes. Both on ramdisk and fast SSD device (e.g. FusionIO).
>
> Thanks,
> Rusty.
>
--
Asias
^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [PATCH 3/3] virtio-blk: Add bio-based IO path for virtio-blk
2012-07-02 6:41 ` Rusty Russell
` (2 preceding siblings ...)
(?)
@ 2012-07-03 13:31 ` Paolo Bonzini
2012-07-03 14:02 ` Asias He
-1 siblings, 1 reply; 80+ messages in thread
From: Paolo Bonzini @ 2012-07-03 13:31 UTC (permalink / raw)
To: Rusty Russell
Cc: Asias He, kvm, Michael S. Tsirkin, linux-kernel, virtualization,
Sasha Levin, Christoph Hellwig
Il 02/07/2012 08:41, Rusty Russell ha scritto:
>> With the same workload in guest, the guest fires 200K requests to host
>> with merges enabled in guest (echo 0 > /sys/block/vdb/queue/nomerges),
>> while the guest fires 40000K requests to host with merges disabled in
>> guest (echo 2 > /sys/block/vdb/queue/nomerges). This show that the merge
>> in block layer reduces the total number of requests fire to host a lot
>> (40000K / 200K = 20).
>>
40000 / 200 is 200, not 20. :)
Paolo
^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [PATCH 3/3] virtio-blk: Add bio-based IO path for virtio-blk
2012-07-03 13:31 ` Paolo Bonzini
@ 2012-07-03 14:02 ` Asias He
0 siblings, 0 replies; 80+ messages in thread
From: Asias He @ 2012-07-03 14:02 UTC (permalink / raw)
To: Paolo Bonzini
Cc: Rusty Russell, kvm, Michael S. Tsirkin, linux-kernel,
virtualization, Sasha Levin, Christoph Hellwig
On 07/03/2012 09:31 PM, Paolo Bonzini wrote:
> Il 02/07/2012 08:41, Rusty Russell ha scritto:
>>> With the same workload in guest, the guest fires 200K requests to host
>>> with merges enabled in guest (echo 0 > /sys/block/vdb/queue/nomerges),
>>> while the guest fires 40000K requests to host with merges disabled in
>>> guest (echo 2 > /sys/block/vdb/queue/nomerges). This show that the merge
>>> in block layer reduces the total number of requests fire to host a lot
>>> (40000K / 200K = 20).
>>>
>
> 40000 / 200 is 200, not 20. :)
Crap, I wrote one zero too many here. Actually, it is 4000K. So the factor
is still 4000K / 200K = 20 ;-)
--
Asias
^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [PATCH 3/3] virtio-blk: Add bio-based IO path for virtio-blk
2012-06-18 11:14 ` Dor Laor
@ 2012-07-03 14:22 ` Ronen Hod
-1 siblings, 0 replies; 80+ messages in thread
From: Ronen Hod @ 2012-07-03 14:22 UTC (permalink / raw)
To: dlaor
Cc: Rusty Russell, Asias He, kvm, Michael S. Tsirkin, linux-kernel,
virtualization, Christoph Hellwig
On 06/18/2012 02:14 PM, Dor Laor wrote:
> On 06/18/2012 01:05 PM, Rusty Russell wrote:
>> On Mon, 18 Jun 2012 16:03:23 +0800, Asias He<asias@redhat.com> wrote:
>>> On 06/18/2012 03:46 PM, Rusty Russell wrote:
>>>> On Mon, 18 Jun 2012 14:53:10 +0800, Asias He<asias@redhat.com> wrote:
>>>>> This patch introduces bio-based IO path for virtio-blk.
>>>>
>>>> Why make it optional?
>>>
>>> request-based IO path is useful for users who do not want to bypass the
>>> IO scheduler in guest kernel, e.g. users using spinning disk. For users
>>> using fast disk device, e.g. SSD device, they can use bio-based IO path.
>>
>> Users using a spinning disk still get IO scheduling in the host though.
>> What benefit is there in doing it in the guest as well?
>
> The io scheduler waits for requests to merge and thus batch IOs together. It's not important w.r.t spinning disks since the host can do it but it causes much less vmexits which is the key issue for VMs.
Does it make sense to use the guest's I/O scheduler at all?
- It is not aware of the physical (spinning) disk layout.
- It is not aware of all the host's pending disk requests.
It does have one good side effect: batching of requests.
Ronen.
>
>>
>> Cheers,
>> Rusty.
>> _______________________________________________
>> Virtualization mailing list
>> Virtualization@lists.linux-foundation.org
>> https://lists.linuxfoundation.org/mailman/listinfo/virtualization
>
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [PATCH 3/3] virtio-blk: Add bio-based IO path for virtio-blk
2012-07-03 14:22 ` Ronen Hod
@ 2012-07-03 14:28 ` Dor Laor
-1 siblings, 0 replies; 80+ messages in thread
From: Dor Laor @ 2012-07-03 14:28 UTC (permalink / raw)
To: Ronen Hod
Cc: Rusty Russell, Asias He, kvm, Michael S. Tsirkin, linux-kernel,
virtualization, Christoph Hellwig
On 07/03/2012 05:22 PM, Ronen Hod wrote:
> On 06/18/2012 02:14 PM, Dor Laor wrote:
>> On 06/18/2012 01:05 PM, Rusty Russell wrote:
>>> On Mon, 18 Jun 2012 16:03:23 +0800, Asias He<asias@redhat.com> wrote:
>>>> On 06/18/2012 03:46 PM, Rusty Russell wrote:
>>>>> On Mon, 18 Jun 2012 14:53:10 +0800, Asias He<asias@redhat.com> wrote:
>>>>>> This patch introduces bio-based IO path for virtio-blk.
>>>>>
>>>>> Why make it optional?
>>>>
>>>> request-based IO path is useful for users who do not want to bypass the
>>>> IO scheduler in guest kernel, e.g. users using spinning disk. For users
>>>> using fast disk device, e.g. SSD device, they can use bio-based IO
>>>> path.
>>>
>>> Users using a spinning disk still get IO scheduling in the host though.
>>> What benefit is there in doing it in the guest as well?
>>
>> The io scheduler waits for requests to merge and thus batch IOs
>> together. It's not important w.r.t spinning disks since the host can
>> do it but it causes much less vmexits which is the key issue for VMs.
>
> Does it make sense to use the guest's I/O scheduler at all?
That's the reason we have a noop io scheduler.
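For reference, selecting the noop elevator is a one-line sysfs write inside the guest. A minimal sketch, where "vdb" is a placeholder virtio disk name and the write needs root:

```shell
# Switch a guest disk to the noop elevator so the guest queues requests
# FIFO (it still merges adjacent requests) and leaves real scheduling to
# the host. "vdb" is a placeholder device name.
dev=vdb
sched=/sys/block/$dev/queue/scheduler
if [ -w "$sched" ]; then
    echo noop > "$sched"
    cat "$sched"   # the active elevator is shown in [brackets]
else
    echo "device $dev not present; sketch only"
fi
```

Note that noop still performs front/back merging; it only skips reordering, which matches the point above that merging, not ordering, is what saves vmexits.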
> - It is not aware of the physical (spinning) disk layout.
> - It is not aware of all the host's disk pending requests.
> It does have a good side-effect - batching of requests.
>
> Ronen.
>
>>
>>>
>>> Cheers,
>>> Rusty.
>>> _______________________________________________
>>> Virtualization mailing list
>>> Virtualization@lists.linux-foundation.org
>>> https://lists.linuxfoundation.org/mailman/listinfo/virtualization
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe kvm" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [PATCH 3/3] virtio-blk: Add bio-based IO path for virtio-blk
2012-07-03 0:39 ` Asias He
@ 2012-07-04 2:40 ` Rusty Russell
-1 siblings, 0 replies; 80+ messages in thread
From: Rusty Russell @ 2012-07-04 2:40 UTC (permalink / raw)
To: Asias He
Cc: Sasha Levin, dlaor, kvm, Michael S. Tsirkin, linux-kernel,
virtualization, Christoph Hellwig
On Tue, 03 Jul 2012 08:39:39 +0800, Asias He <asias@redhat.com> wrote:
> On 07/02/2012 02:41 PM, Rusty Russell wrote:
> > Sure, our guest merging might save us 100x as many exits as no merging.
> > But since we're not doing many requests, does it matter?
>
> We can still have many requests with slow devices. The number of
> requests depends on the workload in guest. E.g. 512 IO threads in guest
> keeping doing IO.
You can have many requests outstanding. But if the device is slow, the
rate of requests being serviced must be low.
Am I misunderstanding something? I thought if you could have a high
rate of requests, it's not a slow device.
Cheers,
Rusty.
^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [PATCH 3/3] virtio-blk: Add bio-based IO path for virtio-blk
2012-07-03 14:28 ` Dor Laor
@ 2012-07-04 14:10 ` Paolo Bonzini
-1 siblings, 0 replies; 80+ messages in thread
From: Paolo Bonzini @ 2012-07-04 14:10 UTC (permalink / raw)
To: dlaor
Cc: Ronen Hod, kvm, Michael S. Tsirkin, linux-kernel, virtualization,
Christoph Hellwig
Il 03/07/2012 16:28, Dor Laor ha scritto:
>>>> Users using a spinning disk still get IO scheduling in the host though.
>>>> What benefit is there in doing it in the guest as well?
>>>
>>> The io scheduler waits for requests to merge and thus batch IOs
>>> together. It's not important w.r.t spinning disks since the host can
>>> do it but it causes much less vmexits which is the key issue for VMs.
>>
>> Does it make sense to use the guest's I/O scheduler at all?
>
> That's the reason we have a noop io scheduler.
But is performance really better with noop? We had to revert usage of
QUEUE_FLAG_NONROT in the guests because it caused performance
degradation (commit f8b12e513b953aebf30f8ff7d2de9be7e024dbbe).
The bio-based path is really QUEUE_FLAG_NONROT++, so it should really be
a special case for people who know what they're doing. (Better would be
to improve QEMU, there's definitely room for 20% improvement...).
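For context, QUEUE_FLAG_NONROT is what the guest exposes as the rotational queue attribute. A sketch of checking it from inside the guest, with "vdb" again a placeholder device name:

```shell
# rotational=0 means QUEUE_FLAG_NONROT is set and the guest block layer
# skips its spinning-disk heuristics; rotational=1 keeps them.
# "vdb" is a placeholder device name.
dev=vdb
flag=/sys/block/$dev/queue/rotational
if [ -r "$flag" ]; then
    cat "$flag"
else
    echo "device $dev not present; sketch only"
fi
```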
Paolo
^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [PATCH 3/3] virtio-blk: Add bio-based IO path for virtio-blk
2012-07-04 2:40 ` Rusty Russell
@ 2012-07-06 1:03 ` Asias He
-1 siblings, 0 replies; 80+ messages in thread
From: Asias He @ 2012-07-06 1:03 UTC (permalink / raw)
To: Rusty Russell
Cc: Sasha Levin, dlaor, kvm, Michael S. Tsirkin, linux-kernel,
virtualization, Christoph Hellwig
On 07/04/2012 10:40 AM, Rusty Russell wrote:
> On Tue, 03 Jul 2012 08:39:39 +0800, Asias He <asias@redhat.com> wrote:
>> On 07/02/2012 02:41 PM, Rusty Russell wrote:
>>> Sure, our guest merging might save us 100x as many exits as no merging.
>>> But since we're not doing many requests, does it matter?
>>
>> We can still have many requests with slow devices. The number of
>> requests depends on the workload in guest. E.g. 512 IO threads in guest
>> keeping doing IO.
>
> You can have many requests outstanding. But if the device is slow, the
> rate of requests being serviced must be low.
Yes.
> Am I misunderstanding something? I thought if you could have a high
> rate of requests, it's not a slow device.
Sure.
--
Asias
^ permalink raw reply [flat|nested] 80+ messages in thread
end of thread, other threads:[~2012-07-06 1:03 UTC | newest]
Thread overview: 80+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-06-18 6:53 [PATCH v2 0/3] Improve virtio-blk performance Asias He
2012-06-18 6:53 ` [PATCH 1/3] block: Introduce __blk_segment_map_sg() helper Asias He
2012-06-18 21:31 ` Tejun Heo
2012-06-19 2:02 ` Asias He
2012-06-19 18:00 ` Tejun Heo
2012-06-18 6:53 ` [PATCH v2 2/3] block: Add blk_bio_map_sg() helper Asias He
2012-06-18 6:53 ` [PATCH 3/3] virtio-blk: Add bio-based IO path for virtio-blk Asias He
2012-06-18 7:46 ` Rusty Russell
2012-06-18 8:03 ` Asias He
2012-06-18 10:05 ` Rusty Russell
2012-06-18 11:14 ` Dor Laor
2012-06-18 11:39 ` Sasha Levin
2012-06-19 2:51 ` Asias He
2012-06-19 6:21 ` Dor Laor
2012-06-20 4:46 ` Asias He
2012-06-21 9:49 ` Dor Laor
2012-07-01 23:54 ` Rusty Russell
2012-07-02 2:45 ` Asias He
2012-07-02 6:41 ` Rusty Russell
2012-07-03 0:39 ` Asias He
2012-07-04 2:40 ` Rusty Russell
2012-07-06 1:03 ` Asias He
2012-07-03 13:31 ` Paolo Bonzini
2012-07-03 14:02 ` Asias He
2012-07-03 14:22 ` Ronen Hod
2012-07-03 14:28 ` Dor Laor
2012-07-04 14:10 ` Paolo Bonzini
2012-06-18 21:28 ` Tejun Heo
2012-06-19 2:39 ` Asias He
2012-06-18 10:13 ` Michael S. Tsirkin
2012-06-19 2:21 ` Asias He
2012-06-18 9:37 ` Stefan Hajnoczi
2012-06-18 10:21 ` Michael S. Tsirkin
2012-06-18 10:45 ` Stefan Hajnoczi
2012-06-19 2:30 ` Asias He
2012-06-18 9:14 ` [PATCH v2 0/3] Improve virtio-blk performance Stefan Hajnoczi
2012-06-18 9:39 ` Asias He
2012-06-18 10:58 ` Stefan Hajnoczi
2012-06-19 4:24 ` Asias He
2012-06-19 10:14 ` Stefan Hajnoczi