All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH for-4.9 0/3] allow delayed requeue of blk-mq requests
@ 2016-09-13 16:01 Mike Snitzer
  2016-09-13 16:01 ` [PATCH 1/3] blk-mq: introduce blk_mq_delay_kick_requeue_list() Mike Snitzer
                   ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: Mike Snitzer @ 2016-09-13 16:01 UTC (permalink / raw)
  To: axboe, dm-devel; +Cc: linux-block

I'd like to see these changes included during the 4.9 merge.  The
blk-mq change is a straight-forward switch from work_struct to
delayed_work and the addition of the blk_mq_delay_kick_requeue_list()
interface.  Jens, can you pick up this blk-mq changes (patch 1)?

These changes offer reduced CPU utilization during DM multipath's all
path down case (when 'queue_if_no_path' is enabled).  Please see each
patch's header for more details.

Mike Snitzer (3):
  blk-mq: introduce blk_mq_delay_kick_requeue_list()
  dm rq: add DM_MAPIO_DELAY_REQUEUE to delay requeue of blk-mq requests
  dm mpath: delay the requeue of blk-mq requests while all paths down

 block/blk-mq.c                | 15 +++++++++++----
 drivers/md/dm-mpath.c         | 15 +++++++++------
 drivers/md/dm-rq.c            | 35 ++++++++++++++++++++---------------
 include/linux/blk-mq.h        |  1 +
 include/linux/blkdev.h        |  2 +-
 include/linux/device-mapper.h |  1 +
 6 files changed, 43 insertions(+), 26 deletions(-)

-- 
2.7.4 (Apple Git-66)

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 1/3] blk-mq: introduce blk_mq_delay_kick_requeue_list()
  2016-09-13 16:01 [PATCH for-4.9 0/3] allow delayed requeue of blk-mq requests Mike Snitzer
@ 2016-09-13 16:01 ` Mike Snitzer
  2016-09-14  6:22     ` Hannes Reinecke
                     ` (2 more replies)
  2016-09-13 16:01 ` [PATCH 2/3] dm rq: add DM_MAPIO_DELAY_REQUEUE to delay requeue of blk-mq requests Mike Snitzer
  2016-09-13 16:01 ` [PATCH 3/3] dm mpath: delay the requeue of blk-mq requests while all paths down Mike Snitzer
  2 siblings, 3 replies; 15+ messages in thread
From: Mike Snitzer @ 2016-09-13 16:01 UTC (permalink / raw)
  To: axboe, dm-devel; +Cc: linux-block

blk_mq_delay_kick_requeue_list() provides the ability to kick the
q->requeue_list after a specified time.  To do this the request_queue's
'requeue_work' member was changed to a delayed_work.

blk_mq_delay_kick_requeue_list() allows DM to defer processing requeued
requests while it doesn't make sense to immediately requeue them
(e.g. when all paths in a DM multipath have failed).

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
---
 block/blk-mq.c         | 15 +++++++++++----
 include/linux/blk-mq.h |  1 +
 include/linux/blkdev.h |  2 +-
 3 files changed, 13 insertions(+), 5 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 13f5a6c..096e7a0 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -501,7 +501,7 @@ EXPORT_SYMBOL(blk_mq_requeue_request);
 static void blk_mq_requeue_work(struct work_struct *work)
 {
 	struct request_queue *q =
-		container_of(work, struct request_queue, requeue_work);
+		container_of(work, struct request_queue, requeue_work.work);
 	LIST_HEAD(rq_list);
 	struct request *rq, *next;
 	unsigned long flags;
@@ -556,16 +556,23 @@ EXPORT_SYMBOL(blk_mq_add_to_requeue_list);
 
 void blk_mq_cancel_requeue_work(struct request_queue *q)
 {
-	cancel_work_sync(&q->requeue_work);
+	cancel_delayed_work_sync(&q->requeue_work);
 }
 EXPORT_SYMBOL_GPL(blk_mq_cancel_requeue_work);
 
 void blk_mq_kick_requeue_list(struct request_queue *q)
 {
-	kblockd_schedule_work(&q->requeue_work);
+	kblockd_schedule_delayed_work(&q->requeue_work, 0);
 }
 EXPORT_SYMBOL(blk_mq_kick_requeue_list);
 
+void blk_mq_delay_kick_requeue_list(struct request_queue *q,
+				    unsigned long msecs)
+{
+	kblockd_schedule_delayed_work(&q->requeue_work, msecs);
+}
+EXPORT_SYMBOL(blk_mq_delay_kick_requeue_list);
+
 void blk_mq_abort_requeue_list(struct request_queue *q)
 {
 	unsigned long flags;
@@ -2082,7 +2089,7 @@ struct request_queue *blk_mq_init_allocated_queue(struct blk_mq_tag_set *set,
 
 	q->sg_reserved_size = INT_MAX;
 
-	INIT_WORK(&q->requeue_work, blk_mq_requeue_work);
+	INIT_DELAYED_WORK(&q->requeue_work, blk_mq_requeue_work);
 	INIT_LIST_HEAD(&q->requeue_list);
 	spin_lock_init(&q->requeue_lock);
 
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index e43bbff..ecec4b8 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -232,6 +232,7 @@ void blk_mq_requeue_request(struct request *rq);
 void blk_mq_add_to_requeue_list(struct request *rq, bool at_head);
 void blk_mq_cancel_requeue_work(struct request_queue *q);
 void blk_mq_kick_requeue_list(struct request_queue *q);
+void blk_mq_delay_kick_requeue_list(struct request_queue *q, unsigned long msecs);
 void blk_mq_abort_requeue_list(struct request_queue *q);
 void blk_mq_complete_request(struct request *rq, int error);
 
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index e79055c..b0a6189 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -449,7 +449,7 @@ struct request_queue {
 
 	struct list_head	requeue_list;
 	spinlock_t		requeue_lock;
-	struct work_struct	requeue_work;
+	struct delayed_work	requeue_work;
 
 	struct mutex		sysfs_lock;
 
-- 
2.7.4 (Apple Git-66)

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 2/3] dm rq: add DM_MAPIO_DELAY_REQUEUE to delay requeue of blk-mq requests
  2016-09-13 16:01 [PATCH for-4.9 0/3] allow delayed requeue of blk-mq requests Mike Snitzer
  2016-09-13 16:01 ` [PATCH 1/3] blk-mq: introduce blk_mq_delay_kick_requeue_list() Mike Snitzer
@ 2016-09-13 16:01 ` Mike Snitzer
  2016-09-14  6:24     ` Hannes Reinecke
  2016-09-13 16:01 ` [PATCH 3/3] dm mpath: delay the requeue of blk-mq requests while all paths down Mike Snitzer
  2 siblings, 1 reply; 15+ messages in thread
From: Mike Snitzer @ 2016-09-13 16:01 UTC (permalink / raw)
  To: axboe, dm-devel; +Cc: linux-block

Otherwise blk-mq will immediately dispatch requests that are requeued
via a BLK_MQ_RQ_QUEUE_BUSY return from blk_mq_ops .queue_rq.

Delayed requeue is implemented using blk_mq_delay_kick_requeue_list()
with a delay of 5 secs.  In the context of DM multipath (all paths down)
it doesn't make any sense to requeue more quickly.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
---
 drivers/md/dm-rq.c            | 35 ++++++++++++++++++++---------------
 include/linux/device-mapper.h |  1 +
 2 files changed, 21 insertions(+), 15 deletions(-)

diff --git a/drivers/md/dm-rq.c b/drivers/md/dm-rq.c
index 0d301d5..1722220 100644
--- a/drivers/md/dm-rq.c
+++ b/drivers/md/dm-rq.c
@@ -336,21 +336,23 @@ static void dm_old_requeue_request(struct request *rq)
 	spin_unlock_irqrestore(q->queue_lock, flags);
 }
 
-static void dm_mq_requeue_request(struct request *rq)
+static void dm_mq_delay_requeue_request(struct request *rq, unsigned long msecs)
 {
 	struct request_queue *q = rq->q;
 	unsigned long flags;
 
 	blk_mq_requeue_request(rq);
+
 	spin_lock_irqsave(q->queue_lock, flags);
 	if (!blk_queue_stopped(q))
-		blk_mq_kick_requeue_list(q);
+		blk_mq_delay_kick_requeue_list(q, msecs);
 	spin_unlock_irqrestore(q->queue_lock, flags);
 }
 
-static void dm_requeue_original_request(struct mapped_device *md,
-					struct request *rq)
+static void dm_requeue_original_request(struct dm_rq_target_io *tio, bool delay_requeue)
 {
+	struct mapped_device *md = tio->md;
+	struct request *rq = tio->orig;
 	int rw = rq_data_dir(rq);
 
 	rq_end_stats(md, rq);
@@ -359,7 +361,7 @@ static void dm_requeue_original_request(struct mapped_device *md,
 	if (!rq->q->mq_ops)
 		dm_old_requeue_request(rq);
 	else
-		dm_mq_requeue_request(rq);
+		dm_mq_delay_requeue_request(rq, delay_requeue ? 5000 : 0);
 
 	rq_completed(md, rw, false);
 }
@@ -389,7 +391,7 @@ static void dm_done(struct request *clone, int error, bool mapped)
 		return;
 	else if (r == DM_ENDIO_REQUEUE)
 		/* The target wants to requeue the I/O */
-		dm_requeue_original_request(tio->md, tio->orig);
+		dm_requeue_original_request(tio, false);
 	else {
 		DMWARN("unimplemented target endio return value: %d", r);
 		BUG();
@@ -629,8 +631,8 @@ static int dm_old_prep_fn(struct request_queue *q, struct request *rq)
 
 /*
  * Returns:
- * 0                : the request has been processed
- * DM_MAPIO_REQUEUE : the original request needs to be requeued
+ * DM_MAPIO_*       : the request has been processed as indicated
+ * DM_MAPIO_REQUEUE : the original request needs to be immediately requeued
  * < 0              : the request was completed due to failure
  */
 static int map_request(struct dm_rq_target_io *tio, struct request *rq,
@@ -643,6 +645,8 @@ static int map_request(struct dm_rq_target_io *tio, struct request *rq,
 	if (tio->clone) {
 		clone = tio->clone;
 		r = ti->type->map_rq(ti, clone, &tio->info);
+		if (r == DM_MAPIO_DELAY_REQUEUE)
+			return DM_MAPIO_REQUEUE; /* .request_fn requeue is always immediate */
 	} else {
 		r = ti->type->clone_and_map_rq(ti, rq, &tio->info, &clone);
 		if (r < 0) {
@@ -650,9 +654,8 @@ static int map_request(struct dm_rq_target_io *tio, struct request *rq,
 			dm_kill_unmapped_request(rq, r);
 			return r;
 		}
-		if (r != DM_MAPIO_REMAPPED)
-			return r;
-		if (setup_clone(clone, rq, tio, GFP_ATOMIC)) {
+		if (r == DM_MAPIO_REMAPPED &&
+		    setup_clone(clone, rq, tio, GFP_ATOMIC)) {
 			/* -ENOMEM */
 			ti->type->release_clone_rq(clone);
 			return DM_MAPIO_REQUEUE;
@@ -671,7 +674,10 @@ static int map_request(struct dm_rq_target_io *tio, struct request *rq,
 		break;
 	case DM_MAPIO_REQUEUE:
 		/* The target wants to requeue the I/O */
-		dm_requeue_original_request(md, tio->orig);
+		break;
+	case DM_MAPIO_DELAY_REQUEUE:
+		/* The target wants to requeue the I/O after a delay */
+		dm_requeue_original_request(tio, true);
 		break;
 	default:
 		if (r > 0) {
@@ -681,10 +687,9 @@ static int map_request(struct dm_rq_target_io *tio, struct request *rq,
 
 		/* The target wants to complete the I/O */
 		dm_kill_unmapped_request(rq, r);
-		return r;
 	}
 
-	return 0;
+	return r;
 }
 
 static void dm_start_request(struct mapped_device *md, struct request *orig)
@@ -727,7 +732,7 @@ static void map_tio_request(struct kthread_work *work)
 	struct mapped_device *md = tio->md;
 
 	if (map_request(tio, rq, md) == DM_MAPIO_REQUEUE)
-		dm_requeue_original_request(md, rq);
+		dm_requeue_original_request(tio, false);
 }
 
 ssize_t dm_attr_rq_based_seq_io_merge_deadline_show(struct mapped_device *md, char *buf)
diff --git a/include/linux/device-mapper.h b/include/linux/device-mapper.h
index 91acfce..ef7962e 100644
--- a/include/linux/device-mapper.h
+++ b/include/linux/device-mapper.h
@@ -590,6 +590,7 @@ extern struct ratelimit_state dm_ratelimit_state;
 #define DM_MAPIO_SUBMITTED	0
 #define DM_MAPIO_REMAPPED	1
 #define DM_MAPIO_REQUEUE	DM_ENDIO_REQUEUE
+#define DM_MAPIO_DELAY_REQUEUE	3
 
 #define dm_sector_div64(x, y)( \
 { \
-- 
2.7.4 (Apple Git-66)

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 3/3] dm mpath: delay the requeue of blk-mq requests while all paths down
  2016-09-13 16:01 [PATCH for-4.9 0/3] allow delayed requeue of blk-mq requests Mike Snitzer
  2016-09-13 16:01 ` [PATCH 1/3] blk-mq: introduce blk_mq_delay_kick_requeue_list() Mike Snitzer
  2016-09-13 16:01 ` [PATCH 2/3] dm rq: add DM_MAPIO_DELAY_REQUEUE to delay requeue of blk-mq requests Mike Snitzer
@ 2016-09-13 16:01 ` Mike Snitzer
  2016-09-14  6:25     ` Hannes Reinecke
  2 siblings, 1 reply; 15+ messages in thread
From: Mike Snitzer @ 2016-09-13 16:01 UTC (permalink / raw)
  To: axboe, dm-devel; +Cc: linux-block

Return DM_MAPIO_DELAY_REQUEUE from .clone_and_map_rq.  Also, return
false from .busy, if all paths are down, so that blk-mq requests get
mapped via .clone_and_map_rq -- which results in DM_MAPIO_DELAY_REQUEUE
being returned to dm-rq.

This change allows for a noticeable reduction in cpu utilization
(reduced kworker load) while all paths are down, e.g.:

system CPU idleness (as measured by fio's --idle-prof=system):
before: system: 87.53%
after:  system: 97.07%

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
---
 drivers/md/dm-mpath.c | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c
index c777d38..7613d64 100644
--- a/drivers/md/dm-mpath.c
+++ b/drivers/md/dm-mpath.c
@@ -550,9 +550,9 @@ static int __multipath_map(struct dm_target *ti, struct request *clone,
 		pgpath = choose_pgpath(m, nr_bytes);
 
 	if (!pgpath) {
-		if (!must_push_back_rq(m))
-			r = -EIO;	/* Failed */
-		return r;
+		if (must_push_back_rq(m))
+			return DM_MAPIO_DELAY_REQUEUE;
+		return -EIO;	/* Failed */
 	} else if (test_bit(MPATHF_QUEUE_IO, &m->flags) ||
 		   test_bit(MPATHF_PG_INIT_REQUIRED, &m->flags)) {
 		pg_init_all_paths(m);
@@ -1990,11 +1990,14 @@ static int multipath_busy(struct dm_target *ti)
 	struct priority_group *pg, *next_pg;
 	struct pgpath *pgpath;
 
-	/* pg_init in progress or no paths available */
-	if (atomic_read(&m->pg_init_in_progress) ||
-	    (!atomic_read(&m->nr_valid_paths) && test_bit(MPATHF_QUEUE_IF_NO_PATH, &m->flags)))
+	/* pg_init in progress */
+	if (atomic_read(&m->pg_init_in_progress))
 		return true;
 
+	/* no paths available, for blk-mq: rely on IO mapping to delay requeue */
+	if (!atomic_read(&m->nr_valid_paths) && test_bit(MPATHF_QUEUE_IF_NO_PATH, &m->flags))
+		return (m->queue_mode != DM_TYPE_MQ_REQUEST_BASED);
+
 	/* Guess which priority_group will be used at next mapping time */
 	pg = lockless_dereference(m->current_pg);
 	next_pg = lockless_dereference(m->next_pg);
-- 
2.7.4 (Apple Git-66)

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/3] blk-mq: introduce blk_mq_delay_kick_requeue_list()
  2016-09-13 16:01 ` [PATCH 1/3] blk-mq: introduce blk_mq_delay_kick_requeue_list() Mike Snitzer
@ 2016-09-14  6:22     ` Hannes Reinecke
  2016-09-14 11:28   ` Bart Van Assche
  2016-09-14 15:04   ` Jens Axboe
  2 siblings, 0 replies; 15+ messages in thread
From: Hannes Reinecke @ 2016-09-14  6:22 UTC (permalink / raw)
  To: Mike Snitzer, axboe, dm-devel; +Cc: linux-block

On 09/13/2016 06:01 PM, Mike Snitzer wrote:
> blk_mq_delay_kick_requeue_list() provides the ability to kick the
> q->requeue_list after a specified time.  To do this the request_queue's
> 'requeue_work' member was changed to a delayed_work.
> 
> blk_mq_delay_kick_requeue_list() allows DM to defer processing requeued
> requests while it doesn't make sense to immediately requeue them
> (e.g. when all paths in a DM multipath have failed).
> 
> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> ---
>  block/blk-mq.c         | 15 +++++++++++----
>  include/linux/blk-mq.h |  1 +
>  include/linux/blkdev.h |  2 +-
>  3 files changed, 13 insertions(+), 5 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.com>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		      zSeries & Storage
hare@suse.de			      +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 N�rnberg
GF: J. Hawn, J. Guild, F. Imend�rffer, HRB 16746 (AG N�rnberg)

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/3] blk-mq: introduce blk_mq_delay_kick_requeue_list()
@ 2016-09-14  6:22     ` Hannes Reinecke
  0 siblings, 0 replies; 15+ messages in thread
From: Hannes Reinecke @ 2016-09-14  6:22 UTC (permalink / raw)
  To: Mike Snitzer, axboe, dm-devel; +Cc: linux-block

On 09/13/2016 06:01 PM, Mike Snitzer wrote:
> blk_mq_delay_kick_requeue_list() provides the ability to kick the
> q->requeue_list after a specified time.  To do this the request_queue's
> 'requeue_work' member was changed to a delayed_work.
> 
> blk_mq_delay_kick_requeue_list() allows DM to defer processing requeued
> requests while it doesn't make sense to immediately requeue them
> (e.g. when all paths in a DM multipath have failed).
> 
> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> ---
>  block/blk-mq.c         | 15 +++++++++++----
>  include/linux/blk-mq.h |  1 +
>  include/linux/blkdev.h |  2 +-
>  3 files changed, 13 insertions(+), 5 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.com>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		      zSeries & Storage
hare@suse.de			      +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/3] dm rq: add DM_MAPIO_DELAY_REQUEUE to delay requeue of blk-mq requests
  2016-09-13 16:01 ` [PATCH 2/3] dm rq: add DM_MAPIO_DELAY_REQUEUE to delay requeue of blk-mq requests Mike Snitzer
@ 2016-09-14  6:24     ` Hannes Reinecke
  0 siblings, 0 replies; 15+ messages in thread
From: Hannes Reinecke @ 2016-09-14  6:24 UTC (permalink / raw)
  To: Mike Snitzer, axboe, dm-devel; +Cc: linux-block

On 09/13/2016 06:01 PM, Mike Snitzer wrote:
> Otherwise blk-mq will immediately dispatch requests that are requeued
> via a BLK_MQ_RQ_QUEUE_BUSY return from blk_mq_ops .queue_rq.
> 
> Delayed requeue is implemented using blk_mq_delay_kick_requeue_list()
> with a delay of 5 secs.  In the context of DM multipath (all paths down)
> it doesn't make any sense to requeue more quickly.
> 
> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> ---
>  drivers/md/dm-rq.c            | 35 ++++++++++++++++++++---------------
>  include/linux/device-mapper.h |  1 +
>  2 files changed, 21 insertions(+), 15 deletions(-)
> 
Hmm. Not sure if I agree with the reasoning for a 5 seconds delay; this
seems to be a rather broad estimate.
Won't it delay I/O resumption when we have intermittent path failure (eg
on iSCSI)?

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		      zSeries & Storage
hare@suse.de			      +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 N�rnberg
GF: J. Hawn, J. Guild, F. Imend�rffer, HRB 16746 (AG N�rnberg)

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/3] dm rq: add DM_MAPIO_DELAY_REQUEUE to delay requeue of blk-mq requests
@ 2016-09-14  6:24     ` Hannes Reinecke
  0 siblings, 0 replies; 15+ messages in thread
From: Hannes Reinecke @ 2016-09-14  6:24 UTC (permalink / raw)
  To: Mike Snitzer, axboe, dm-devel; +Cc: linux-block

On 09/13/2016 06:01 PM, Mike Snitzer wrote:
> Otherwise blk-mq will immediately dispatch requests that are requeued
> via a BLK_MQ_RQ_QUEUE_BUSY return from blk_mq_ops .queue_rq.
> 
> Delayed requeue is implemented using blk_mq_delay_kick_requeue_list()
> with a delay of 5 secs.  In the context of DM multipath (all paths down)
> it doesn't make any sense to requeue more quickly.
> 
> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> ---
>  drivers/md/dm-rq.c            | 35 ++++++++++++++++++++---------------
>  include/linux/device-mapper.h |  1 +
>  2 files changed, 21 insertions(+), 15 deletions(-)
> 
Hmm. Not sure if I agree with the reasoning for a 5 seconds delay; this
seems to be a rather broad estimate.
Won't it delay I/O resumption when we have intermittent path failure (eg
on iSCSI)?

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		      zSeries & Storage
hare@suse.de			      +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 3/3] dm mpath: delay the requeue of blk-mq requests while all paths down
  2016-09-13 16:01 ` [PATCH 3/3] dm mpath: delay the requeue of blk-mq requests while all paths down Mike Snitzer
@ 2016-09-14  6:25     ` Hannes Reinecke
  0 siblings, 0 replies; 15+ messages in thread
From: Hannes Reinecke @ 2016-09-14  6:25 UTC (permalink / raw)
  To: Mike Snitzer, axboe, dm-devel; +Cc: linux-block

On 09/13/2016 06:01 PM, Mike Snitzer wrote:
> Return DM_MAPIO_DELAY_REQUEUE from .clone_and_map_rq.  Also, return
> false from .busy, if all paths are down, so that blk-mq requests get
> mapped via .clone_and_map_rq -- which results in DM_MAPIO_DELAY_REQUEUE
> being returned to dm-rq.
> 
> This change allows for a noticeable reduction in cpu utilization
> (reduced kworker load) while all paths are down, e.g.:
> 
> system CPU idleness (as measured by fio's --idle-prof=system):
> before: system: 87.53%
> after:  system: 97.07%
> 
> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> ---
>  drivers/md/dm-mpath.c | 15 +++++++++------
>  1 file changed, 9 insertions(+), 6 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.com>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		      zSeries & Storage
hare@suse.de			      +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 N�rnberg
GF: J. Hawn, J. Guild, F. Imend�rffer, HRB 16746 (AG N�rnberg)

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 3/3] dm mpath: delay the requeue of blk-mq requests while all paths down
@ 2016-09-14  6:25     ` Hannes Reinecke
  0 siblings, 0 replies; 15+ messages in thread
From: Hannes Reinecke @ 2016-09-14  6:25 UTC (permalink / raw)
  To: Mike Snitzer, axboe, dm-devel; +Cc: linux-block

On 09/13/2016 06:01 PM, Mike Snitzer wrote:
> Return DM_MAPIO_DELAY_REQUEUE from .clone_and_map_rq.  Also, return
> false from .busy, if all paths are down, so that blk-mq requests get
> mapped via .clone_and_map_rq -- which results in DM_MAPIO_DELAY_REQUEUE
> being returned to dm-rq.
> 
> This change allows for a noticeable reduction in cpu utilization
> (reduced kworker load) while all paths are down, e.g.:
> 
> system CPU idleness (as measured by fio's --idle-prof=system):
> before: system: 87.53%
> after:  system: 97.07%
> 
> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> ---
>  drivers/md/dm-mpath.c | 15 +++++++++------
>  1 file changed, 9 insertions(+), 6 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.com>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		      zSeries & Storage
hare@suse.de			      +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/3] blk-mq: introduce blk_mq_delay_kick_requeue_list()
  2016-09-13 16:01 ` [PATCH 1/3] blk-mq: introduce blk_mq_delay_kick_requeue_list() Mike Snitzer
  2016-09-14  6:22     ` Hannes Reinecke
@ 2016-09-14 11:28   ` Bart Van Assche
  2016-09-14 13:45     ` Mike Snitzer
  2016-09-14 15:04   ` Jens Axboe
  2 siblings, 1 reply; 15+ messages in thread
From: Bart Van Assche @ 2016-09-14 11:28 UTC (permalink / raw)
  To: Mike Snitzer, axboe, dm-devel; +Cc: linux-block

On 09/13/2016 06:01 PM, Mike Snitzer wrote:
> +void blk_mq_delay_kick_requeue_list(struct request_queue *q,
> +				    unsigned long msecs)
> +{
> +	kblockd_schedule_delayed_work(&q->requeue_work, msecs);
> +}
> +EXPORT_SYMBOL(blk_mq_delay_kick_requeue_list);

Hello Mike,

I think kblockd_schedule_delayed_work() expects that the second argument 
is a number of jiffies instead of a number of milliseconds.

Bart.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/3] blk-mq: introduce blk_mq_delay_kick_requeue_list()
  2016-09-14 11:28   ` Bart Van Assche
@ 2016-09-14 13:45     ` Mike Snitzer
  0 siblings, 0 replies; 15+ messages in thread
From: Mike Snitzer @ 2016-09-14 13:45 UTC (permalink / raw)
  To: Bart Van Assche; +Cc: axboe, dm-devel, linux-block

On Wed, Sep 14 2016 at  7:28am -0400,
Bart Van Assche <bart.vanassche@gmail.com> wrote:

> On 09/13/2016 06:01 PM, Mike Snitzer wrote:
> >+void blk_mq_delay_kick_requeue_list(struct request_queue *q,
> >+				    unsigned long msecs)
> >+{
> >+	kblockd_schedule_delayed_work(&q->requeue_work, msecs);
> >+}
> >+EXPORT_SYMBOL(blk_mq_delay_kick_requeue_list);
> 
> Hello Mike,
> 
> I think kblockd_schedule_delayed_work() expects that the second
> argument is a number of jiffies instead of a number of milliseconds.

Yeap, you're right.  An earlier version was using msecs_to_jiffies() in
the DM caller.  I'll submit v2.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/3] dm rq: add DM_MAPIO_DELAY_REQUEUE to delay requeue of blk-mq requests
  2016-09-14  6:24     ` Hannes Reinecke
  (?)
@ 2016-09-14 13:51     ` Mike Snitzer
  -1 siblings, 0 replies; 15+ messages in thread
From: Mike Snitzer @ 2016-09-14 13:51 UTC (permalink / raw)
  To: Hannes Reinecke; +Cc: axboe, dm-devel, linux-block

On Wed, Sep 14 2016 at  2:24am -0400,
Hannes Reinecke <hare@suse.de> wrote:

> On 09/13/2016 06:01 PM, Mike Snitzer wrote:
> > Otherwise blk-mq will immediately dispatch requests that are requeued
> > via a BLK_MQ_RQ_QUEUE_BUSY return from blk_mq_ops .queue_rq.
> > 
> > Delayed requeue is implemented using blk_mq_delay_kick_requeue_list()
> > with a delay of 5 secs.  In the context of DM multipath (all paths down)
> > it doesn't make any sense to requeue more quickly.
> > 
> > Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> > ---
> >  drivers/md/dm-rq.c            | 35 ++++++++++++++++++++---------------
> >  include/linux/device-mapper.h |  1 +
> >  2 files changed, 21 insertions(+), 15 deletions(-)
> > 
> Hmm. Not sure if I agree with the reasoning for a 5 seconds delay; this
> seems to be a rather broad estimate.
> Won't it delay I/O resumption when we have intermittent path failure (eg
> on iSCSI)?

Are you saying the path would be reinstated via the 'reinstate_path'
message or by table reload?  Table reload will kick the requeue list.
'reinstate_path' doesn't (and there isn't an easy way to have dm-mpath
inform dm-rq core to do so -- could add a new function...)

If I can wire up requeue list kick on reinstate_path then the 5 secs is
perfectly fine and won't hurt how quickly IO is resumed once a path
comes back.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/3] blk-mq: introduce blk_mq_delay_kick_requeue_list()
  2016-09-13 16:01 ` [PATCH 1/3] blk-mq: introduce blk_mq_delay_kick_requeue_list() Mike Snitzer
  2016-09-14  6:22     ` Hannes Reinecke
  2016-09-14 11:28   ` Bart Van Assche
@ 2016-09-14 15:04   ` Jens Axboe
  2016-09-14 15:13     ` Mike Snitzer
  2 siblings, 1 reply; 15+ messages in thread
From: Jens Axboe @ 2016-09-14 15:04 UTC (permalink / raw)
  To: Mike Snitzer, dm-devel; +Cc: linux-block

On 09/13/2016 10:01 AM, Mike Snitzer wrote:
> blk_mq_delay_kick_requeue_list() provides the ability to kick the
> q->requeue_list after a specified time.  To do this the request_queue's
> 'requeue_work' member was changed to a delayed_work.
>
> blk_mq_delay_kick_requeue_list() allows DM to defer processing requeued
> requests while it doesn't make sense to immediately requeue them
> (e.g. when all paths in a DM multipath have failed).

Looks fine to me, sans the missing jiffies conversion that Bart pointed
out.

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/3] blk-mq: introduce blk_mq_delay_kick_requeue_list()
  2016-09-14 15:04   ` Jens Axboe
@ 2016-09-14 15:13     ` Mike Snitzer
  0 siblings, 0 replies; 15+ messages in thread
From: Mike Snitzer @ 2016-09-14 15:13 UTC (permalink / raw)
  To: Jens Axboe; +Cc: dm-devel, linux-block

On Wed, Sep 14 2016 at 11:04am -0400,
Jens Axboe <axboe@kernel.dk> wrote:

> On 09/13/2016 10:01 AM, Mike Snitzer wrote:
> >blk_mq_delay_kick_requeue_list() provides the ability to kick the
> >q->requeue_list after a specified time.  To do this the request_queue's
> >'requeue_work' member was changed to a delayed_work.
> >
> >blk_mq_delay_kick_requeue_list() allows DM to defer processing requeued
> >requests while it doesn't make sense to immediately requeue them
> >(e.g. when all paths in a DM multipath have failed).
> 
> Looks fine to me, sans the missing jiffies conversion that Bart pointed
> out.

Thanks, I'll send v2 out shortly.

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2016-09-14 15:13 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-09-13 16:01 [PATCH for-4.9 0/3] allow delayed requeue of blk-mq requests Mike Snitzer
2016-09-13 16:01 ` [PATCH 1/3] blk-mq: introduce blk_mq_delay_kick_requeue_list() Mike Snitzer
2016-09-14  6:22   ` Hannes Reinecke
2016-09-14  6:22     ` Hannes Reinecke
2016-09-14 11:28   ` Bart Van Assche
2016-09-14 13:45     ` Mike Snitzer
2016-09-14 15:04   ` Jens Axboe
2016-09-14 15:13     ` Mike Snitzer
2016-09-13 16:01 ` [PATCH 2/3] dm rq: add DM_MAPIO_DELAY_REQUEUE to delay requeue of blk-mq requests Mike Snitzer
2016-09-14  6:24   ` Hannes Reinecke
2016-09-14  6:24     ` Hannes Reinecke
2016-09-14 13:51     ` Mike Snitzer
2016-09-13 16:01 ` [PATCH 3/3] dm mpath: delay the requeue of blk-mq requests while all paths down Mike Snitzer
2016-09-14  6:25   ` Hannes Reinecke
2016-09-14  6:25     ` Hannes Reinecke

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.