* [PATCH V2 0/2] block/loop: improve performance
@ 2017-08-24 19:24 Shaohua Li
  2017-08-24 19:24 ` [PATCH V2 1/2] block/loop: set hw_sectors Shaohua Li
  2017-08-24 19:24 ` [PATCH V2 2/2] block/loop: allow request merge for directio mode Shaohua Li
  0 siblings, 2 replies; 11+ messages in thread
From: Shaohua Li @ 2017-08-24 19:24 UTC (permalink / raw)
  To: linux-block; +Cc: axboe, Kernel-team, Shaohua Li

From: Shaohua Li <shli@fb.com>

Two small patches to improve loop performance in directio mode. The goal is
to increase the I/O size sent to the underlying disks. As Omar pointed out, the
patches conflict slightly with his, but that should be easy to fix.
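
To exercise the directio path, a minimal userspace sketch looks like the
following (the device path is just an example, and util-linux
"losetup --direct-io=on" should do the same thing):

	#include <fcntl.h>
	#include <stdio.h>
	#include <unistd.h>
	#include <sys/ioctl.h>
	#include <linux/loop.h>

	int main(void)
	{
		/* assumes /dev/loop0 is already bound to a backing file */
		int fd = open("/dev/loop0", O_RDWR);

		if (fd < 0) {
			perror("open");
			return 1;
		}
		/* ask the loop driver to use direct I/O against the backing file */
		if (ioctl(fd, LOOP_SET_DIRECT_IO, 1) < 0)
			perror("LOOP_SET_DIRECT_IO");
		close(fd);
		return 0;
	}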

Thanks,
Shaohua

Shaohua Li (2):
  block/loop: set hw_sectors
  block/loop: allow request merge for directio mode

 drivers/block/loop.c | 67 ++++++++++++++++++++++++++++++++++++++++------------
 drivers/block/loop.h |  1 +
 2 files changed, 53 insertions(+), 15 deletions(-)

-- 
2.9.5


* [PATCH V2 1/2] block/loop: set hw_sectors
  2017-08-24 19:24 [PATCH V2 0/2] block/loop: improve performance Shaohua Li
@ 2017-08-24 19:24 ` Shaohua Li
  2017-08-29  9:35   ` Ming Lei
  2017-08-24 19:24 ` [PATCH V2 2/2] block/loop: allow request merge for directio mode Shaohua Li
  1 sibling, 1 reply; 11+ messages in thread
From: Shaohua Li @ 2017-08-24 19:24 UTC (permalink / raw)
  To: linux-block; +Cc: axboe, Kernel-team, Shaohua Li

From: Shaohua Li <shli@fb.com>

Loop can handle requests of any size. Limiting it to 255 sectors just burns
CPU on bio splitting and request merging for the underlying disk, and also
causes bad fs block allocation in directio mode.

Reviewed-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Shaohua Li <shli@fb.com>
---
 drivers/block/loop.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index b55a1f8..428da07 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -1799,6 +1799,7 @@ static int loop_add(struct loop_device **l, int i)
 	}
 	lo->lo_queue->queuedata = lo;
 
+	blk_queue_max_hw_sectors(lo->lo_queue, BLK_DEF_MAX_SECTORS);
 	/*
 	 * It doesn't make sense to enable merge because the I/O
 	 * submitted to backing file is handled page by page.
-- 
2.9.5


* [PATCH V2 2/2] block/loop: allow request merge for directio mode
  2017-08-24 19:24 [PATCH V2 0/2] block/loop: improve performance Shaohua Li
  2017-08-24 19:24 ` [PATCH V2 1/2] block/loop: set hw_sectors Shaohua Li
@ 2017-08-24 19:24 ` Shaohua Li
  2017-08-29  9:56   ` Ming Lei
  1 sibling, 1 reply; 11+ messages in thread
From: Shaohua Li @ 2017-08-24 19:24 UTC (permalink / raw)
  To: linux-block; +Cc: axboe, Kernel-team, Shaohua Li

From: Shaohua Li <shli@fb.com>

Currently loop disables merge. While that makes sense for buffered IO mode,
directio mode can benefit from request merging. Without merging, loop could
send small IOs to the underlying disk and harm performance.

Reviewed-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Shaohua Li <shli@fb.com>
---
 drivers/block/loop.c | 66 ++++++++++++++++++++++++++++++++++++++++------------
 drivers/block/loop.h |  1 +
 2 files changed, 52 insertions(+), 15 deletions(-)

diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index 428da07..75a8f6e 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -213,10 +213,13 @@ static void __loop_update_dio(struct loop_device *lo, bool dio)
 	 */
 	blk_mq_freeze_queue(lo->lo_queue);
 	lo->use_dio = use_dio;
-	if (use_dio)
+	if (use_dio) {
+		queue_flag_clear_unlocked(QUEUE_FLAG_NOMERGES, lo->lo_queue);
 		lo->lo_flags |= LO_FLAGS_DIRECT_IO;
-	else
+	} else {
+		queue_flag_set_unlocked(QUEUE_FLAG_NOMERGES, lo->lo_queue);
 		lo->lo_flags &= ~LO_FLAGS_DIRECT_IO;
+	}
 	blk_mq_unfreeze_queue(lo->lo_queue);
 }
 
@@ -464,6 +467,8 @@ static void lo_rw_aio_complete(struct kiocb *iocb, long ret, long ret2)
 {
 	struct loop_cmd *cmd = container_of(iocb, struct loop_cmd, iocb);
 
+	kfree(cmd->bvec);
+	cmd->bvec = NULL;
 	cmd->ret = ret;
 	blk_mq_complete_request(cmd->rq);
 }
@@ -473,22 +478,50 @@ static int lo_rw_aio(struct loop_device *lo, struct loop_cmd *cmd,
 {
 	struct iov_iter iter;
 	struct bio_vec *bvec;
-	struct bio *bio = cmd->rq->bio;
+	struct request *rq = cmd->rq;
+	struct bio *bio = rq->bio;
 	struct file *file = lo->lo_backing_file;
+	unsigned int offset;
+	int segments = 0;
 	int ret;
 
-	/* nomerge for loop request queue */
-	WARN_ON(cmd->rq->bio != cmd->rq->biotail);
+	if (rq->bio != rq->biotail) {
+		struct req_iterator iter;
+		struct bio_vec tmp;
+
+		__rq_for_each_bio(bio, rq)
+			segments += bio_segments(bio);
+		bvec = kmalloc(sizeof(struct bio_vec) * segments, GFP_KERNEL);
+		if (!bvec)
+			return -EIO;
+		cmd->bvec = bvec;
+
+		/*
+		 * The bios of the request may be started from the middle of
+		 * the 'bvec' because of bio splitting, so we can't directly
+		 * copy bio->bi_iov_vec to new bvec. The rq_for_each_segment
+		 * API will take care of all details for us.
+		 */
+		rq_for_each_segment(tmp, rq, iter) {
+			*bvec = tmp;
+			bvec++;
+		}
+		bvec = cmd->bvec;
+		offset = 0;
+	} else {
+		/*
+		 * Same here, this bio may be started from the middle of the
+		 * 'bvec' because of bio splitting, so offset from the bvec
+		 * must be passed to iov iterator
+		 */
+		offset = bio->bi_iter.bi_bvec_done;
+		bvec = __bvec_iter_bvec(bio->bi_io_vec, bio->bi_iter);
+		segments = bio_segments(bio);
+	}
 
-	bvec = __bvec_iter_bvec(bio->bi_io_vec, bio->bi_iter);
 	iov_iter_bvec(&iter, ITER_BVEC | rw, bvec,
-		      bio_segments(bio), blk_rq_bytes(cmd->rq));
-	/*
-	 * This bio may be started from the middle of the 'bvec'
-	 * because of bio splitting, so offset from the bvec must
-	 * be passed to iov iterator
-	 */
-	iter.iov_offset = bio->bi_iter.bi_bvec_done;
+		      segments, blk_rq_bytes(rq));
+	iter.iov_offset = offset;
 
 	cmd->iocb.ki_pos = pos;
 	cmd->iocb.ki_filp = file;
@@ -1800,9 +1833,12 @@ static int loop_add(struct loop_device **l, int i)
 	lo->lo_queue->queuedata = lo;
 
 	blk_queue_max_hw_sectors(lo->lo_queue, BLK_DEF_MAX_SECTORS);
+
 	/*
-	 * It doesn't make sense to enable merge because the I/O
-	 * submitted to backing file is handled page by page.
+	 * By default, we do buffer IO, so it doesn't make sense to enable
+	 * merge because the I/O submitted to backing file is handled page by
+	 * page. For directio mode, merge does help to dispatch bigger request
+	 * to underlayer disk. We will enable merge once directio is enabled.
 	 */
 	queue_flag_set_unlocked(QUEUE_FLAG_NOMERGES, lo->lo_queue);
 
diff --git a/drivers/block/loop.h b/drivers/block/loop.h
index fecd3f9..bc9cf91 100644
--- a/drivers/block/loop.h
+++ b/drivers/block/loop.h
@@ -72,6 +72,7 @@ struct loop_cmd {
 	bool use_aio;           /* use AIO interface to handle I/O */
 	long ret;
 	struct kiocb iocb;
+	struct bio_vec *bvec;
 };
 
 /* Support for loadable transfer modules */
-- 
2.9.5


* Re: [PATCH V2 1/2] block/loop: set hw_sectors
  2017-08-24 19:24 ` [PATCH V2 1/2] block/loop: set hw_sectors Shaohua Li
@ 2017-08-29  9:35   ` Ming Lei
  0 siblings, 0 replies; 11+ messages in thread
From: Ming Lei @ 2017-08-29  9:35 UTC (permalink / raw)
  To: Shaohua Li; +Cc: linux-block, axboe, Kernel-team, Shaohua Li

On Thu, Aug 24, 2017 at 12:24:52PM -0700, Shaohua Li wrote:
> From: Shaohua Li <shli@fb.com>
> 
> Loop can handle any size of request. Limiting it to 255 sectors just
> burns the CPU for bio split and request merge for underlayer disk and
> also cause bad fs block allocation in directio mode.
> 
> Reviewed-by: Omar Sandoval <osandov@fb.com>
> Signed-off-by: Shaohua Li <shli@fb.com>
> ---
>  drivers/block/loop.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/block/loop.c b/drivers/block/loop.c
> index b55a1f8..428da07 100644
> --- a/drivers/block/loop.c
> +++ b/drivers/block/loop.c
> @@ -1799,6 +1799,7 @@ static int loop_add(struct loop_device **l, int i)
>  	}
>  	lo->lo_queue->queuedata = lo;
>  
> +	blk_queue_max_hw_sectors(lo->lo_queue, BLK_DEF_MAX_SECTORS);
>  	/*
>  	 * It doesn't make sense to enable merge because the I/O
>  	 * submitted to backing file is handled page by page.
> -- 
> 2.9.5
> 

Reviewed-by: Ming Lei <ming.lei@redhat.com>

-- 
Ming


* Re: [PATCH V2 2/2] block/loop: allow request merge for directio mode
  2017-08-24 19:24 ` [PATCH V2 2/2] block/loop: allow request merge for directio mode Shaohua Li
@ 2017-08-29  9:56   ` Ming Lei
  2017-08-29 15:13     ` Shaohua Li
  0 siblings, 1 reply; 11+ messages in thread
From: Ming Lei @ 2017-08-29  9:56 UTC (permalink / raw)
  To: Shaohua Li; +Cc: linux-block, axboe, Kernel-team, Shaohua Li

On Thu, Aug 24, 2017 at 12:24:53PM -0700, Shaohua Li wrote:
> From: Shaohua Li <shli@fb.com>
> 
> Currently loop disables merge. While it makes sense for buffer IO mode,
> directio mode can benefit from request merge. Without merge, loop could
> send small size IO to underlayer disk and harm performance.

Hi Shaohua,

IMO, no matter whether merge is used, loop always sends data page by page
to the VFS in both dio and buffered I/O modes.

Also, if merge is enabled on loop, merging runs on both loop and the
low-level block driver, and I am not sure we can benefit from that.

So could you provide some performance data for this patch?

> 
> Reviewed-by: Omar Sandoval <osandov@fb.com>
> Signed-off-by: Shaohua Li <shli@fb.com>
> ---
>  drivers/block/loop.c | 66 ++++++++++++++++++++++++++++++++++++++++------------
>  drivers/block/loop.h |  1 +
>  2 files changed, 52 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/block/loop.c b/drivers/block/loop.c
> index 428da07..75a8f6e 100644
> --- a/drivers/block/loop.c
> +++ b/drivers/block/loop.c
> @@ -213,10 +213,13 @@ static void __loop_update_dio(struct loop_device *lo, bool dio)
>  	 */
>  	blk_mq_freeze_queue(lo->lo_queue);
>  	lo->use_dio = use_dio;
> -	if (use_dio)
> +	if (use_dio) {
> +		queue_flag_clear_unlocked(QUEUE_FLAG_NOMERGES, lo->lo_queue);
>  		lo->lo_flags |= LO_FLAGS_DIRECT_IO;
> -	else
> +	} else {
> +		queue_flag_set_unlocked(QUEUE_FLAG_NOMERGES, lo->lo_queue);
>  		lo->lo_flags &= ~LO_FLAGS_DIRECT_IO;
> +	}
>  	blk_mq_unfreeze_queue(lo->lo_queue);
>  }
>  
> @@ -464,6 +467,8 @@ static void lo_rw_aio_complete(struct kiocb *iocb, long ret, long ret2)
>  {
>  	struct loop_cmd *cmd = container_of(iocb, struct loop_cmd, iocb);
>  
> +	kfree(cmd->bvec);
> +	cmd->bvec = NULL;
>  	cmd->ret = ret;
>  	blk_mq_complete_request(cmd->rq);
>  }
> @@ -473,22 +478,50 @@ static int lo_rw_aio(struct loop_device *lo, struct loop_cmd *cmd,
>  {
>  	struct iov_iter iter;
>  	struct bio_vec *bvec;
> -	struct bio *bio = cmd->rq->bio;
> +	struct request *rq = cmd->rq;
> +	struct bio *bio = rq->bio;
>  	struct file *file = lo->lo_backing_file;
> +	unsigned int offset;
> +	int segments = 0;
>  	int ret;
>  
> -	/* nomerge for loop request queue */
> -	WARN_ON(cmd->rq->bio != cmd->rq->biotail);
> +	if (rq->bio != rq->biotail) {
> +		struct req_iterator iter;
> +		struct bio_vec tmp;
> +
> +		__rq_for_each_bio(bio, rq)
> +			segments += bio_segments(bio);
> +		bvec = kmalloc(sizeof(struct bio_vec) * segments, GFP_KERNEL);

The allocation should have been GFP_NOIO.
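
Something like the below is what I mean -- just a sketch on top of your patch,
not tested; kmalloc_array() also avoids the open-coded multiplication:

	bvec = kmalloc_array(segments, sizeof(struct bio_vec), GFP_NOIO);
	if (!bvec)
		return -EIO;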

-- 
Ming


* Re: [PATCH V2 2/2] block/loop: allow request merge for directio mode
  2017-08-29  9:56   ` Ming Lei
@ 2017-08-29 15:13     ` Shaohua Li
  2017-08-30  2:51       ` Ming Lei
  0 siblings, 1 reply; 11+ messages in thread
From: Shaohua Li @ 2017-08-29 15:13 UTC (permalink / raw)
  To: Ming Lei; +Cc: linux-block, axboe, Kernel-team, Shaohua Li

On Tue, Aug 29, 2017 at 05:56:05PM +0800, Ming Lei wrote:
> On Thu, Aug 24, 2017 at 12:24:53PM -0700, Shaohua Li wrote:
> > From: Shaohua Li <shli@fb.com>
> > 
> > Currently loop disables merge. While it makes sense for buffer IO mode,
> > directio mode can benefit from request merge. Without merge, loop could
> > send small size IO to underlayer disk and harm performance.
> 
> Hi Shaohua,
> 
> IMO no matter if merge is used, loop always sends page by page
> to VFS in both dio or buffer I/O.

Why do you think so?
 
> Also if merge is enabled on loop, that means merge is run
> on both loop and low level block driver, and not sure if we
> can benefit from that.

Why does merge still happen in the low-level block driver?

> 
> So Could you provide some performance data about this patch?

In my virtual machine, a workload improves from ~20M/s to ~50M/s, and I can
clearly see that the request size becomes bigger.

> > 
> > Reviewed-by: Omar Sandoval <osandov@fb.com>
> > Signed-off-by: Shaohua Li <shli@fb.com>
> > ---
> >  drivers/block/loop.c | 66 ++++++++++++++++++++++++++++++++++++++++------------
> >  drivers/block/loop.h |  1 +
> >  2 files changed, 52 insertions(+), 15 deletions(-)
> > 
> > diff --git a/drivers/block/loop.c b/drivers/block/loop.c
> > index 428da07..75a8f6e 100644
> > --- a/drivers/block/loop.c
> > +++ b/drivers/block/loop.c
> > @@ -213,10 +213,13 @@ static void __loop_update_dio(struct loop_device *lo, bool dio)
> >  	 */
> >  	blk_mq_freeze_queue(lo->lo_queue);
> >  	lo->use_dio = use_dio;
> > -	if (use_dio)
> > +	if (use_dio) {
> > +		queue_flag_clear_unlocked(QUEUE_FLAG_NOMERGES, lo->lo_queue);
> >  		lo->lo_flags |= LO_FLAGS_DIRECT_IO;
> > -	else
> > +	} else {
> > +		queue_flag_set_unlocked(QUEUE_FLAG_NOMERGES, lo->lo_queue);
> >  		lo->lo_flags &= ~LO_FLAGS_DIRECT_IO;
> > +	}
> >  	blk_mq_unfreeze_queue(lo->lo_queue);
> >  }
> >  
> > @@ -464,6 +467,8 @@ static void lo_rw_aio_complete(struct kiocb *iocb, long ret, long ret2)
> >  {
> >  	struct loop_cmd *cmd = container_of(iocb, struct loop_cmd, iocb);
> >  
> > +	kfree(cmd->bvec);
> > +	cmd->bvec = NULL;
> >  	cmd->ret = ret;
> >  	blk_mq_complete_request(cmd->rq);
> >  }
> > @@ -473,22 +478,50 @@ static int lo_rw_aio(struct loop_device *lo, struct loop_cmd *cmd,
> >  {
> >  	struct iov_iter iter;
> >  	struct bio_vec *bvec;
> > -	struct bio *bio = cmd->rq->bio;
> > +	struct request *rq = cmd->rq;
> > +	struct bio *bio = rq->bio;
> >  	struct file *file = lo->lo_backing_file;
> > +	unsigned int offset;
> > +	int segments = 0;
> >  	int ret;
> >  
> > -	/* nomerge for loop request queue */
> > -	WARN_ON(cmd->rq->bio != cmd->rq->biotail);
> > +	if (rq->bio != rq->biotail) {
> > +		struct req_iterator iter;
> > +		struct bio_vec tmp;
> > +
> > +		__rq_for_each_bio(bio, rq)
> > +			segments += bio_segments(bio);
> > +		bvec = kmalloc(sizeof(struct bio_vec) * segments, GFP_KERNEL);
> 
> The allocation should have been GFP_NOIO.

Sounds good. Making this completely correct isn't easy though, since we are
calling into the underlying fs operations, which can do allocations of their own.
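
If we wanted to cover the fs side as well, one option (only a sketch, and it
changes allocation behaviour for the whole call chain, so it would need its
own justification) is to mark the submission context as NOIO:

	unsigned int noio_flag;

	noio_flag = memalloc_noio_save();
	/* the existing ->read_iter()/->write_iter() submission goes here */
	memalloc_noio_restore(noio_flag);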

Thanks,
Shaohua


* Re: [PATCH V2 2/2] block/loop: allow request merge for directio mode
  2017-08-29 15:13     ` Shaohua Li
@ 2017-08-30  2:51       ` Ming Lei
  2017-08-30  4:43         ` Shaohua Li
  0 siblings, 1 reply; 11+ messages in thread
From: Ming Lei @ 2017-08-30  2:51 UTC (permalink / raw)
  To: Shaohua Li; +Cc: linux-block, axboe, Kernel-team, Shaohua Li

On Tue, Aug 29, 2017 at 08:13:39AM -0700, Shaohua Li wrote:
> On Tue, Aug 29, 2017 at 05:56:05PM +0800, Ming Lei wrote:
> > On Thu, Aug 24, 2017 at 12:24:53PM -0700, Shaohua Li wrote:
> > > From: Shaohua Li <shli@fb.com>
> > > 
> > > Currently loop disables merge. While it makes sense for buffer IO mode,
> > > directio mode can benefit from request merge. Without merge, loop could
> > > send small size IO to underlayer disk and harm performance.
> > 
> > Hi Shaohua,
> > 
> > IMO no matter if merge is used, loop always sends page by page
> > to VFS in both dio or buffer I/O.
> 
> Why do you think so?

do_blockdev_direct_IO() still handles pages one by one from the iov_iter, and
with bigger requests, I guess it might be plug merging that is at work.

>  
> > Also if merge is enabled on loop, that means merge is run
> > on both loop and low level block driver, and not sure if we
> > can benefit from that.
> 
> why does merge still happen in low level block driver?

Because the scheduler is still working on the low-level disk. My question
is: why doesn't the scheduler on the low-level disk do that merging now,
if the scheduler on loop can merge?

> 
> > 
> > So Could you provide some performance data about this patch?
> 
> In my virtual machine, a workload improves from ~20M/s to ~50M/s. And I clearly
> see the request size becomes bigger.

Could you share what the low-level disk is?

-- 
Ming


* Re: [PATCH V2 2/2] block/loop: allow request merge for directio mode
  2017-08-30  2:51       ` Ming Lei
@ 2017-08-30  4:43         ` Shaohua Li
  2017-08-30  6:43           ` Ming Lei
  0 siblings, 1 reply; 11+ messages in thread
From: Shaohua Li @ 2017-08-30  4:43 UTC (permalink / raw)
  To: Ming Lei; +Cc: linux-block, axboe, Kernel-team, Shaohua Li

On Wed, Aug 30, 2017 at 10:51:21AM +0800, Ming Lei wrote:
> On Tue, Aug 29, 2017 at 08:13:39AM -0700, Shaohua Li wrote:
> > On Tue, Aug 29, 2017 at 05:56:05PM +0800, Ming Lei wrote:
> > > On Thu, Aug 24, 2017 at 12:24:53PM -0700, Shaohua Li wrote:
> > > > From: Shaohua Li <shli@fb.com>
> > > > 
> > > > Currently loop disables merge. While it makes sense for buffer IO mode,
> > > > directio mode can benefit from request merge. Without merge, loop could
> > > > send small size IO to underlayer disk and harm performance.
> > > 
> > > Hi Shaohua,
> > > 
> > > IMO no matter if merge is used, loop always sends page by page
> > > to VFS in both dio or buffer I/O.
> > 
> > Why do you think so?
> 
> do_blockdev_direct_IO() still handles page by page from iov_iter, and
> with bigger request, I guess it might be the plug merge working.

This is not true. directio sends big bios directly; it is not because of plug
merging. Please at least check the code before you complain.

> >  
> > > Also if merge is enabled on loop, that means merge is run
> > > on both loop and low level block driver, and not sure if we
> > > can benefit from that.
> > 
> > why does merge still happen in low level block driver?
> 
> Because scheduler is still working on low level disk. My question
> is that why the scheduler in low level disk doesn't work now
> if scheduler on loop can merge?

The low-level disk can still do merging, but since this is directio, the upper
layer already dispatches requests that are as big as possible. There is very
little chance the requests can be merged again.

> > 
> > > 
> > > So Could you provide some performance data about this patch?
> > 
> > In my virtual machine, a workload improves from ~20M/s to ~50M/s. And I clearly
> > see the request size becomes bigger.
> 
> Could you share us what the low level disk is?

It's a SATA SSD.

Thanks,
Shaohua


* Re: [PATCH V2 2/2] block/loop: allow request merge for directio mode
  2017-08-30  4:43         ` Shaohua Li
@ 2017-08-30  6:43           ` Ming Lei
  2017-08-30 22:06             ` Shaohua Li
  0 siblings, 1 reply; 11+ messages in thread
From: Ming Lei @ 2017-08-30  6:43 UTC (permalink / raw)
  To: Shaohua Li; +Cc: linux-block, axboe, Kernel-team, Shaohua Li

On Tue, Aug 29, 2017 at 09:43:20PM -0700, Shaohua Li wrote:
> On Wed, Aug 30, 2017 at 10:51:21AM +0800, Ming Lei wrote:
> > On Tue, Aug 29, 2017 at 08:13:39AM -0700, Shaohua Li wrote:
> > > On Tue, Aug 29, 2017 at 05:56:05PM +0800, Ming Lei wrote:
> > > > On Thu, Aug 24, 2017 at 12:24:53PM -0700, Shaohua Li wrote:
> > > > > From: Shaohua Li <shli@fb.com>
> > > > > 
> > > > > Currently loop disables merge. While it makes sense for buffer IO mode,
> > > > > directio mode can benefit from request merge. Without merge, loop could
> > > > > send small size IO to underlayer disk and harm performance.
> > > > 
> > > > Hi Shaohua,
> > > > 
> > > > IMO no matter if merge is used, loop always sends page by page
> > > > to VFS in both dio or buffer I/O.
> > > 
> > > Why do you think so?
> > 
> > do_blockdev_direct_IO() still handles page by page from iov_iter, and
> > with bigger request, I guess it might be the plug merge working.
> 
> This is not true. directio sends big size bio directly, not because of plug
> merge. Please at least check the code before you complain.

I am not complaining, just trying to understand the idea behind it,
never mind, :-)

> 
> > >  
> > > > Also if merge is enabled on loop, that means merge is run
> > > > on both loop and low level block driver, and not sure if we
> > > > can benefit from that.
> > > 
> > > why does merge still happen in low level block driver?
> > 
> > Because scheduler is still working on low level disk. My question
> > is that why the scheduler in low level disk doesn't work now
> > if scheduler on loop can merge?
> 
> The low level disk can still do merge, but since this is directio, the upper
> layer already dispatches request as big as possible. There is very little
> chance the requests can be merged again.

That is true, but these requests still need to enter the scheduler queue and
be tried for merging again, even though that is unlikely to succeed. The
double merging may cost extra CPU.

That still doesn't answer my question, though.

Without this patch, the requests dispatched to loop won't be merged, so they
may be small and their sectors may be contiguous. My question is: why can't
the dio bios converted from these small loop requests be merged in the block
layer when they are queued to the low-level device?


> 
> > > 
> > > > 
> > > > So Could you provide some performance data about this patch?
> > > 
> > > In my virtual machine, a workload improves from ~20M/s to ~50M/s. And I clearly
> > > see the request size becomes bigger.
> > 
> > Could you share us what the low level disk is?
> 
> It's a SATA ssd.

For SATA, it is pretty easy to trigger I/O merging.

-- 
Ming


* Re: [PATCH V2 2/2] block/loop: allow request merge for directio mode
  2017-08-30  6:43           ` Ming Lei
@ 2017-08-30 22:06             ` Shaohua Li
  2017-08-31  3:25               ` Ming Lei
  0 siblings, 1 reply; 11+ messages in thread
From: Shaohua Li @ 2017-08-30 22:06 UTC (permalink / raw)
  To: Ming Lei; +Cc: linux-block, axboe, Kernel-team, Shaohua Li

On Wed, Aug 30, 2017 at 02:43:40PM +0800, Ming Lei wrote:
> On Tue, Aug 29, 2017 at 09:43:20PM -0700, Shaohua Li wrote:
> > On Wed, Aug 30, 2017 at 10:51:21AM +0800, Ming Lei wrote:
> > > On Tue, Aug 29, 2017 at 08:13:39AM -0700, Shaohua Li wrote:
> > > > On Tue, Aug 29, 2017 at 05:56:05PM +0800, Ming Lei wrote:
> > > > > On Thu, Aug 24, 2017 at 12:24:53PM -0700, Shaohua Li wrote:
> > > > > > From: Shaohua Li <shli@fb.com>
> > > > > > 
> > > > > > Currently loop disables merge. While it makes sense for buffer IO mode,
> > > > > > directio mode can benefit from request merge. Without merge, loop could
> > > > > > send small size IO to underlayer disk and harm performance.
> > > > > 
> > > > > Hi Shaohua,
> > > > > 
> > > > > IMO no matter if merge is used, loop always sends page by page
> > > > > to VFS in both dio or buffer I/O.
> > > > 
> > > > Why do you think so?
> > > 
> > > do_blockdev_direct_IO() still handles page by page from iov_iter, and
> > > with bigger request, I guess it might be the plug merge working.
> > 
> > This is not true. directio sends big size bio directly, not because of plug
> > merge. Please at least check the code before you complain.
> 
> I complain nothing, just try to understand the idea behind,
> never mind, :-)
> 
> > 
> > > >  
> > > > > Also if merge is enabled on loop, that means merge is run
> > > > > on both loop and low level block driver, and not sure if we
> > > > > can benefit from that.
> > > > 
> > > > why does merge still happen in low level block driver?
> > > 
> > > Because scheduler is still working on low level disk. My question
> > > is that why the scheduler in low level disk doesn't work now
> > > if scheduler on loop can merge?
> > 
> > The low level disk can still do merge, but since this is directio, the upper
> > layer already dispatches request as big as possible. There is very little
> > chance the requests can be merged again.
> 
> That is true, but these requests need to enter scheduler queue and
> be tried to merge again, even though it is less possible to succeed.
> Double merge may take extra CPU utilization.
> 
> Looks it doesn't answer my question.
> 
> Without this patch, the requests dispatched to loop won't be merged,
> so they may be small and their sectors may be continuous, my question
> is why dio bios converted from these small loop requests can't be
> merged in block layer when queuing these dio bios to low level device?

The loop thread doesn't have a plug there. Even if we had a plug there, it
would still be a bad idea to do the merging in the low-level layer. If we run
direct_IO for every 4k, the overhead is much, much higher than that of bio
merging: direct_IO calls into fs code, takes various mutexes, updates metadata
for writes, and so on.
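
For reference, the alternative being discussed (a plug around the loop
worker's dispatch so that the dio bios could be merged below; purely
illustrative, not something this patch does) would look roughly like:

	struct blk_plug plug;

	blk_start_plug(&plug);
	/* dispatch the queued loop commands, each one doing its own direct_IO */
	blk_finish_plug(&plug);

Even then every command still pays the full per-call direct_IO cost, which is
the expensive part.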


* Re: [PATCH V2 2/2] block/loop: allow request merge for directio mode
  2017-08-30 22:06             ` Shaohua Li
@ 2017-08-31  3:25               ` Ming Lei
  0 siblings, 0 replies; 11+ messages in thread
From: Ming Lei @ 2017-08-31  3:25 UTC (permalink / raw)
  To: Shaohua Li; +Cc: linux-block, axboe, Kernel-team, Shaohua Li

On Wed, Aug 30, 2017 at 03:06:16PM -0700, Shaohua Li wrote:
> On Wed, Aug 30, 2017 at 02:43:40PM +0800, Ming Lei wrote:
> > On Tue, Aug 29, 2017 at 09:43:20PM -0700, Shaohua Li wrote:
> > > On Wed, Aug 30, 2017 at 10:51:21AM +0800, Ming Lei wrote:
> > > > On Tue, Aug 29, 2017 at 08:13:39AM -0700, Shaohua Li wrote:
> > > > > On Tue, Aug 29, 2017 at 05:56:05PM +0800, Ming Lei wrote:
> > > > > > On Thu, Aug 24, 2017 at 12:24:53PM -0700, Shaohua Li wrote:
> > > > > > > From: Shaohua Li <shli@fb.com>
> > > > > > > 
> > > > > > > Currently loop disables merge. While it makes sense for buffer IO mode,
> > > > > > > directio mode can benefit from request merge. Without merge, loop could
> > > > > > > send small size IO to underlayer disk and harm performance.
> > > > > > 
> > > > > > Hi Shaohua,
> > > > > > 
> > > > > > IMO no matter if merge is used, loop always sends page by page
> > > > > > to VFS in both dio or buffer I/O.
> > > > > 
> > > > > Why do you think so?
> > > > 
> > > > do_blockdev_direct_IO() still handles page by page from iov_iter, and
> > > > with bigger request, I guess it might be the plug merge working.
> > > 
> > > This is not true. directio sends big size bio directly, not because of plug
> > > merge. Please at least check the code before you complain.
> > 
> > I complain nothing, just try to understand the idea behind,
> > never mind, :-)
> > 
> > > 
> > > > >  
> > > > > > Also if merge is enabled on loop, that means merge is run
> > > > > > on both loop and low level block driver, and not sure if we
> > > > > > can benefit from that.
> > > > > 
> > > > > why does merge still happen in low level block driver?
> > > > 
> > > > Because scheduler is still working on low level disk. My question
> > > > is that why the scheduler in low level disk doesn't work now
> > > > if scheduler on loop can merge?
> > > 
> > > The low level disk can still do merge, but since this is directio, the upper
> > > layer already dispatches request as big as possible. There is very little
> > > chance the requests can be merged again.
> > 
> > That is true, but these requests need to enter scheduler queue and
> > be tried to merge again, even though it is less possible to succeed.
> > Double merge may take extra CPU utilization.
> > 
> > Looks it doesn't answer my question.
> > 
> > Without this patch, the requests dispatched to loop won't be merged,
> > so they may be small and their sectors may be continuous, my question
> > is why dio bios converted from these small loop requests can't be
> > merged in block layer when queuing these dio bios to low level device?
> 
> loop thread doesn't have plug there. Even we have plug there, it's still a bad
> idea to do the merge in low level layer. If we run direct_IO for every 4k, the
> overhead is much much higher than bio merge. The direct_IO will call into fs
> code, take different mutexes, metadata update for write and so on.

OK, that makes sense now.

-- 
Ming


