* [PATCH v6] block: loop: avoiding too many pending per work I/O
@ 2015-05-01  3:28 Ming Lei
  2015-05-01 10:17 ` Christoph Hellwig
  0 siblings, 1 reply; 10+ messages in thread
From: Ming Lei @ 2015-05-01  3:28 UTC (permalink / raw)
  To: Jens Axboe, linux-kernel; +Cc: Justin M. Forbes, Jeff Moyer, Ming Lei, v4.0

If there are too many pending per-work I/Os, too many
high-priority worker threads can be generated, and
system performance can be affected.

This patch limits the max pending per-work I/O count to 16,
and falls back to single queue mode when the max
number is reached.

This patch fixes a Fedora 22 live booting performance
regression when it is booted from squashfs over dm
based on loop, and it looks like the following factors
are related to the problem:

- unlike other filesystems (such as ext4), squashfs
is a bit special, and I observed that increasing the number
of I/O jobs accessing a file in squashfs only improves I/O
performance a little, but it makes a big difference for ext4

- nested loop: both squashfs.img and ext3fs.img are mounted
as loop block devices, and ext3fs.img is inside the squashfs

- during booting, lots of tasks may run concurrently

Fixes: b5dd2f6047ca108001328aac0e8588edd15f1778
Cc: stable@vger.kernel.org (v4.0)
Reported-by: Justin M. Forbes <jforbes@fedoraproject.org>
Tested-by: Justin M. Forbes <jforbes@fedoraproject.org>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
---
 drivers/block/loop.c | 19 +++++++++++++++++--
 drivers/block/loop.h |  2 ++
 2 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index ae3fcb4..5a728c6 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -1425,13 +1425,24 @@ static int loop_queue_rq(struct blk_mq_hw_ctx *hctx,
 		const struct blk_mq_queue_data *bd)
 {
 	struct loop_cmd *cmd = blk_mq_rq_to_pdu(bd->rq);
+	struct loop_device *lo = cmd->rq->q->queuedata;
+	bool single_queue = !!(cmd->rq->cmd_flags & REQ_WRITE);
+
+	/*
+	 * Fall back to single queue mode if the number of pending
+	 * per-work I/Os reaches 16, otherwise too many high priority
+	 * worker threads may affect system performance, as reported
+	 * in Fedora live booting from squashfs over loop.
+	 */
+	if (atomic_read(&lo->pending_per_work_io) >= 16)
+		single_queue = true;
 
 	blk_mq_start_request(bd->rq);
 
-	if (cmd->rq->cmd_flags & REQ_WRITE) {
-		struct loop_device *lo = cmd->rq->q->queuedata;
+	if (single_queue) {
 		bool need_sched = true;
 
+		cmd->per_work_io = false;
 		spin_lock_irq(&lo->lo_lock);
 		if (lo->write_started)
 			need_sched = false;
@@ -1443,6 +1454,8 @@ static int loop_queue_rq(struct blk_mq_hw_ctx *hctx,
 		if (need_sched)
 			queue_work(loop_wq, &lo->write_work);
 	} else {
+		atomic_inc(&lo->pending_per_work_io);
+		cmd->per_work_io = true;
 		queue_work(loop_wq, &cmd->read_work);
 	}
 
@@ -1467,6 +1480,8 @@ static void loop_handle_cmd(struct loop_cmd *cmd)
 	if (ret)
 		cmd->rq->errors = -EIO;
 	blk_mq_complete_request(cmd->rq);
+	if (cmd->per_work_io)
+		atomic_dec(&lo->pending_per_work_io);
 }
 
 static void loop_queue_write_work(struct work_struct *work)
diff --git a/drivers/block/loop.h b/drivers/block/loop.h
index 301c27f..eb855f5 100644
--- a/drivers/block/loop.h
+++ b/drivers/block/loop.h
@@ -57,6 +57,7 @@ struct loop_device {
 	struct list_head	write_cmd_head;
 	struct work_struct	write_work;
 	bool			write_started;
+	atomic_t		pending_per_work_io;
 	int			lo_state;
 	struct mutex		lo_ctl_mutex;
 
@@ -68,6 +69,7 @@ struct loop_device {
 struct loop_cmd {
 	struct work_struct read_work;
 	struct request *rq;
+	bool per_work_io;
 	struct list_head list;
 };
 
-- 
1.9.1



* Re: [PATCH v6] block: loop: avoiding too many pending per work I/O
  2015-05-01  3:28 [PATCH v6] block: loop: avoiding too many pending per work I/O Ming Lei
@ 2015-05-01 10:17 ` Christoph Hellwig
  2015-05-01 13:36   ` Ming Lei
  0 siblings, 1 reply; 10+ messages in thread
From: Christoph Hellwig @ 2015-05-01 10:17 UTC (permalink / raw)
  To: Ming Lei
  Cc: Jens Axboe, linux-kernel, Justin M. Forbes, Jeff Moyer, v4.0, Tejun Heo

On Fri, May 01, 2015 at 11:28:01AM +0800, Ming Lei wrote:
> If there are too many pending per-work I/Os, too many
> high-priority worker threads can be generated, and
> system performance can be affected.
> 
> This patch limits the max pending per-work I/O count to 16,
> and falls back to single queue mode when the max
> number is reached.

Why would you do this fall back?  Shouldn't we just communicate
a concurrency limit to the workqueue code?



* Re: [PATCH v6] block: loop: avoiding too many pending per work I/O
  2015-05-01 10:17 ` Christoph Hellwig
@ 2015-05-01 13:36   ` Ming Lei
  2015-05-01 14:22     ` Tejun Heo
  0 siblings, 1 reply; 10+ messages in thread
From: Ming Lei @ 2015-05-01 13:36 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, Linux Kernel Mailing List, Justin M. Forbes,
	Jeff Moyer, v4.0, Tejun Heo

On Fri, May 1, 2015 at 6:17 PM, Christoph Hellwig <hch@infradead.org> wrote:
> On Fri, May 01, 2015 at 11:28:01AM +0800, Ming Lei wrote:
>> If there are too many pending per-work I/Os, too many
>> high-priority worker threads can be generated, and
>> system performance can be affected.
>>
>> This patch limits the max pending per-work I/O count to 16,
>> and falls back to single queue mode when the max
>> number is reached.
>
> Why would you do this fall back?  Shouldn't we just communicate
> a concurrency limit to the workqueue code?

It can't work with workqueue's concurrency limit because the
queue is shared by all loop block devices, and the limit is on the
whole queue.

That was also the test I asked Justin to run, and it looks like it doesn't work.
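
Just to illustrate the constraint, here is a minimal sketch (not the
driver's actual allocation; the flags and the value 16 are assumptions).
The max_active argument of alloc_workqueue() bounds in-flight work items
for the workqueue as a whole, and every loop device queues onto the same
"kloopd" workqueue, so it cannot express a per-device limit:

#include <linux/init.h>
#include <linux/workqueue.h>

static struct workqueue_struct *loop_wq;

static int __init loop_wq_init(void)
{
	/*
	 * max_active = 16 caps work items for the whole workqueue,
	 * shared by every loop device (including nested ones);
	 * it is not a per-device limit.
	 */
	loop_wq = alloc_workqueue("kloopd", WQ_MEM_RECLAIM | WQ_UNBOUND, 16);
	return loop_wq ? 0 : -ENOMEM;
}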

Thanks,
Ming Lei


* Re: [PATCH v6] block: loop: avoiding too many pending per work I/O
  2015-05-01 13:36   ` Ming Lei
@ 2015-05-01 14:22     ` Tejun Heo
  2015-05-01 15:05       ` Christoph Hellwig
  2015-05-02 14:56       ` Ming Lei
  0 siblings, 2 replies; 10+ messages in thread
From: Tejun Heo @ 2015-05-01 14:22 UTC (permalink / raw)
  To: Ming Lei
  Cc: Christoph Hellwig, Jens Axboe, Linux Kernel Mailing List,
	Justin M. Forbes, Jeff Moyer, v4.0

On Fri, May 01, 2015 at 09:36:47PM +0800, Ming Lei wrote:
> On Fri, May 1, 2015 at 6:17 PM, Christoph Hellwig <hch@infradead.org> wrote:
> > On Fri, May 01, 2015 at 11:28:01AM +0800, Ming Lei wrote:
> >> If there are too many pending per-work I/Os, too many
> >> high-priority worker threads can be generated, and
> >> system performance can be affected.

Hmmm... why is it even marked HIGHPRI?  The commit doesn't seem to
explain why.  Also, I wonder whether this would be better served by
unbound workqueues.  These tasks are most likely to walk all the way
through the filesystem and block layer.  That can be quite a bit of
processing for concurrency managed per-cpu workqueues and may
effectively block out other work items which actually need to be
HIGHPRI.
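
For reference, the difference in question is a flag choice at workqueue
allocation time (an illustrative sketch, not the driver's actual call;
wq is a struct workqueue_struct pointer):

	/* per-cpu workers boosted to high priority: */
	wq = alloc_workqueue("kloopd", WQ_MEM_RECLAIM | WQ_HIGHPRI, 0);

	/* vs. unbound, normal-priority workers: */
	wq = alloc_workqueue("kloopd", WQ_MEM_RECLAIM | WQ_UNBOUND, 0);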

> >> This patch limits the max pending per-work I/O count to 16,
> >> and falls back to single queue mode when the max
> >> number is reached.
> >
> > Why would you do this fall back?  Shouldn't we just communicate
> > a concurrency limit to the workqueue code?
> 
> It can't work with workqueue's concurrency limit because the
> queue is shared by all loop block devices, and the limit is on the
> whole queue.

Maybe just cap max_active to NR_OF_LOOP_DEVS * 16 or sth?  But idk,
how many concurrent workers are we talking about and why are we
capping per-queue concurrency from worker pool side instead of command
tag side?
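
As a rough sketch of what capping from the tag side could look like
(this roughly mirrors the tag_set setup in loop_add() from the blk-mq
conversion; the value 16 is just the figure from the patch, and this is
not a posted change):

static int loop_init_tags(struct loop_device *lo)
{
	lo->tag_set.ops = &loop_mq_ops;
	lo->tag_set.nr_hw_queues = 1;
	lo->tag_set.queue_depth = 16;	/* per-device concurrency capped here */
	lo->tag_set.numa_node = NUMA_NO_NODE;
	lo->tag_set.cmd_size = sizeof(struct loop_cmd);
	lo->tag_set.flags = BLK_MQ_F_SHOULD_MERGE;
	lo->tag_set.driver_data = lo;

	return blk_mq_alloc_tag_set(&lo->tag_set);
}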

Thanks.

-- 
tejun


* Re: [PATCH v6] block: loop: avoiding too many pending per work I/O
  2015-05-01 14:22     ` Tejun Heo
@ 2015-05-01 15:05       ` Christoph Hellwig
  2015-05-01 15:47         ` Tejun Heo
  2015-05-02 14:56       ` Ming Lei
  1 sibling, 1 reply; 10+ messages in thread
From: Christoph Hellwig @ 2015-05-01 15:05 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Ming Lei, Christoph Hellwig, Jens Axboe,
	Linux Kernel Mailing List, Justin M. Forbes, Jeff Moyer, v4.0

On Fri, May 01, 2015 at 10:22:21AM -0400, Tejun Heo wrote:
> > > Why would you do this fall back?  Shouldn't we just communicate
> > > a concurrency limit to the workqueue code?
> > 
> > It can't work with workqueue's concurrency limit because the
> > queue is shared by all loop block devices, and the limit is on the
> > whole queue.
> 
> Maybe just cap max_active to NR_OF_LOOP_DEVS * 16 or sth?  But idk,
> how many concurrent workers are we talking about and why are we
> capping per-queue concurrency from worker pool side instead of command
> tag side?

Also we probably should have per device workqueues to start with..


* Re: [PATCH v6] block: loop: avoiding too many pending per work I/O
  2015-05-01 15:05       ` Christoph Hellwig
@ 2015-05-01 15:47         ` Tejun Heo
  2015-05-02 15:09           ` Ming Lei
  0 siblings, 1 reply; 10+ messages in thread
From: Tejun Heo @ 2015-05-01 15:47 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Ming Lei, Jens Axboe, Linux Kernel Mailing List,
	Justin M. Forbes, Jeff Moyer, v4.0

On Fri, May 01, 2015 at 08:05:45AM -0700, Christoph Hellwig wrote:
> > Maybe just cap max_active to NR_OF_LOOP_DEVS * 16 or sth?  But idk,
> > how many concurrent workers are we talking about and why are we
> > capping per-queue concurrency from worker pool side instead of command
> > tag side?
> 
> Also we probably should have per device workqueues to start with..

Yeah, that's an option.  The only thing is that each workqueue would
have to be tagged WQ_RESCUER and end up with a separate rescuer task,
which usually isn't a big deal but there are setups where a lot of
loop devices are used and it may sting a bit.

Thanks.

-- 
tejun


* Re: [PATCH v6] block: loop: avoiding too many pending per work I/O
  2015-05-01 14:22     ` Tejun Heo
  2015-05-01 15:05       ` Christoph Hellwig
@ 2015-05-02 14:56       ` Ming Lei
  2015-05-03  1:52         ` Tejun Heo
  1 sibling, 1 reply; 10+ messages in thread
From: Ming Lei @ 2015-05-02 14:56 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Christoph Hellwig, Jens Axboe, Linux Kernel Mailing List,
	Justin M. Forbes, Jeff Moyer, v4.0

On Fri, May 1, 2015 at 10:22 PM, Tejun Heo <tj@kernel.org> wrote:
> On Fri, May 01, 2015 at 09:36:47PM +0800, Ming Lei wrote:
>> On Fri, May 1, 2015 at 6:17 PM, Christoph Hellwig <hch@infradead.org> wrote:
>> > On Fri, May 01, 2015 at 11:28:01AM +0800, Ming Lei wrote:
>> >> If there are too many pending per-work I/Os, too many
>> >> high-priority worker threads can be generated, and
>> >> system performance can be affected.
>
> Hmmm... why is it even marked HIGHPRI?  The commit doesn't seem to

I set it as HIGHPRI because the priority of the previous loop thread
was MIN_NICE, but it may be doable to set it to normal priority, and I
will test whether that improves the live booting situation.

> explain why.  Also, I wonder whether this would be better served by
> unbound workqueues.  These tasks are most likely to walk all the way

From my tests, it looks like bound is a bit better, but the difference is small.

> through the filesystem and block layer.  That can be quite a bit of
> processing for concurrency managed per-cpu workqueues and may
> effectively block out other work items which actually need to be
> HIGHPRI.
>
>> >> This patch limits the max pending per-work I/O count to 16,
>> >> and falls back to single queue mode when the max
>> >> number is reached.
>> >
>> > Why would you do this fall back?  Shouldn't we just communicate
>> > a concurrency limit to the workqueue code?
>>
>> It can't work with workqueue's concurrency limit because the
>> queue is shared by all loop block devices, and the limit is on the
>> whole queue.
>
> Maybe just cap max_active to NR_OF_LOOP_DEVS * 16 or sth?  But idk,

It might not work because there are nested loop devices, like the Fedora
live CD, and in theory max_active should have been set to loop's queue
depth * nr_loop; otherwise there may be a possibility of hanging.

So this patch is introduced.

> how many concurrent workers are we talking about and why are we
> capping per-queue concurrency from worker pool side instead of command
> tag side?

I think there should be a performance advantage to making the queue depth
a bit larger, because it helps keep the queue pipeline full. Also, queue
depth often means how many requests the hardware can queue, which is a
bit different from per-queue concurrency.

Thanks,
Ming Lei

>
> Thanks.
>
> --
> tejun


* Re: [PATCH v6] block: loop: avoiding too many pending per work I/O
  2015-05-01 15:47         ` Tejun Heo
@ 2015-05-02 15:09           ` Ming Lei
  0 siblings, 0 replies; 10+ messages in thread
From: Ming Lei @ 2015-05-02 15:09 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Christoph Hellwig, Jens Axboe, Linux Kernel Mailing List,
	Justin M. Forbes, Jeff Moyer, v4.0

On Fri, May 1, 2015 at 11:47 PM, Tejun Heo <tj@kernel.org> wrote:
> On Fri, May 01, 2015 at 08:05:45AM -0700, Christoph Hellwig wrote:
>> > Maybe just cap max_active to NR_OF_LOOP_DEVS * 16 or sth?  But idk,
>> > how many concurrent workers are we talking about and why are we
>> > capping per-queue concurrency from worker pool side instead of command
>> > tag side?
>>
>> Also we probably should have per device workqueues to start with..
>
> Yeah, that's an option.  The only thing is that each workqueue would

I guess we have to do that because of nested loop devices.

> have to be tagged WQ_RESCUER and end up with a separate rescuer task,
> which usually isn't a big deal but there are setups where a lot of
> loop devices are used and it may sting a bit.

The workqueue can be allocated just before the loop device is used and
destroyed when it is no longer needed.

I will put together a patch to do that.
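
A rough sketch of that direction (the lo->wq field and these helpers
are hypothetical here, and the flags and the limit of 16 are
assumptions, not a posted patch):

static int loop_prepare_queue(struct loop_device *lo)
{
	lo->wq = alloc_workqueue("kloopd%d",
			WQ_MEM_RECLAIM | WQ_HIGHPRI | WQ_UNBOUND, 16,
			lo->lo_number);
	if (!lo->wq)
		return -ENOMEM;
	return 0;
}

static void loop_unprepare_queue(struct loop_device *lo)
{
	flush_workqueue(lo->wq);
	destroy_workqueue(lo->wq);
	lo->wq = NULL;
}

These would be called from loop_set_fd() and loop_clr_fd(), so only
devices with a backing file carry a rescuer thread.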

>
> Thanks.
>
> --
> tejun


* Re: [PATCH v6] block: loop: avoiding too many pending per work I/O
  2015-05-02 14:56       ` Ming Lei
@ 2015-05-03  1:52         ` Tejun Heo
  2015-05-04 12:54           ` Ming Lei
  0 siblings, 1 reply; 10+ messages in thread
From: Tejun Heo @ 2015-05-03  1:52 UTC (permalink / raw)
  To: Ming Lei
  Cc: Christoph Hellwig, Jens Axboe, Linux Kernel Mailing List,
	Justin M. Forbes, Jeff Moyer, v4.0

Hello,

On Sat, May 02, 2015 at 10:56:20PM +0800, Ming Lei wrote:
> > Maybe just cap max_active to NR_OF_LOOP_DEVS * 16 or sth?  But idk,
> 
> It might not work because there are nested loop devices, like the Fedora
> live CD, and in theory max_active should have been set to loop's queue
> depth * nr_loop; otherwise there may be a possibility of hanging.
>
> So this patch is introduced.

If loop devices can be stacked, regardless of what you do with
nr_active, it may deadlock.  There needs to be a rescuer per each
nesting level (or just one per device).  This means that the current
code is broken.

> > how many concurrent workers are we talking about and why are we
> > capping per-queue concurrency from worker pool side instead of command
> > tag side?
> 
> I think there should be performance advantage to make queue depth a bit more
> because it can help to make queue pipeline as full. Also queue depth often
> means how many requests the hardware can queue, and it is a bit different
> with per-queue concurrency.

I'm not really following.  Can you please elaborate?

Thanks.

-- 
tejun


* Re: [PATCH v6] block: loop: avoiding too many pending per work I/O
  2015-05-03  1:52         ` Tejun Heo
@ 2015-05-04 12:54           ` Ming Lei
  0 siblings, 0 replies; 10+ messages in thread
From: Ming Lei @ 2015-05-04 12:54 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Christoph Hellwig, Jens Axboe, Linux Kernel Mailing List,
	Justin M. Forbes, Jeff Moyer, v4.0

On Sun, May 3, 2015 at 9:52 AM, Tejun Heo <tj@kernel.org> wrote:
> Hello,
>
> On Sat, May 02, 2015 at 10:56:20PM +0800, Ming Lei wrote:
>> > Maybe just cap max_active to NR_OF_LOOP_DEVS * 16 or sth?  But idk,
>>
>> It might not work because there are nested loop devices, like the Fedora
>> live CD, and in theory max_active should have been set to loop's queue
>> depth * nr_loop; otherwise there may be a possibility of hanging.
>>
>> So this patch is introduced.
>
> If loop devices can be stacked, regardless of what you do with
> nr_active, it may deadlock.  There needs to be a rescuer per each
> nesting level (or just one per device).  This means that the current
> code is broken.

Yes.

>> > how many concurrent workers are we talking about and why are we
>> > capping per-queue concurrency from worker pool side instead of command
>> > tag side?
>>
>> I think there should be a performance advantage to making the queue depth
>> a bit larger, because it helps keep the queue pipeline full. Also, queue
>> depth often means how many requests the hardware can queue, which is a
>> bit different from per-queue concurrency.
>
> I'm not really following.  Can you please elaborate?

In the case of loop-mq, a bigger queue_depth often gives better performance
when reading/writing the page cache in sequential read/write, because those
operations are very quick and are better run as a batch in a single work
function, but simply decreasing the queue depth may hurt performance in
this case.
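
To show what "run them as a batch in a single work function" refers to,
here is a simplified paraphrase of the single-queue write path (the real
loop_queue_write_work() also re-checks write_cmd_head and clears
write_started under the lock before returning):

static void loop_queue_write_work(struct work_struct *work)
{
	struct loop_device *lo =
		container_of(work, struct loop_device, write_work);
	LIST_HEAD(cmd_list);

	spin_lock_irq(&lo->lo_lock);
	list_replace_init(&lo->write_cmd_head, &cmd_list);
	spin_unlock_irq(&lo->lo_lock);

	/* one work invocation drains every command queued so far */
	while (!list_empty(&cmd_list)) {
		struct loop_cmd *cmd = list_first_entry(&cmd_list,
				struct loop_cmd, list);

		list_del_init(&cmd->list);
		loop_handle_cmd(cmd);
	}
}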

Thanks,
Ming Lei

