* [PATCH for-4.9 0/3] allow delayed requeue of blk-mq requests @ 2016-09-13 16:01 Mike Snitzer 2016-09-13 16:01 ` [PATCH 1/3] blk-mq: introduce blk_mq_delay_kick_requeue_list() Mike Snitzer ` (2 more replies) 0 siblings, 3 replies; 15+ messages in thread From: Mike Snitzer @ 2016-09-13 16:01 UTC (permalink / raw) To: axboe, dm-devel; +Cc: linux-block I'd like to see these changes included during the 4.9 merge. The blk-mq change is a straight-forward switch from work_struct to delayed_work and the addition of the blk_mq_delay_kick_requeue_list() interface. Jens, can you pick up this blk-mq changes (patch 1)? These changes offer reduced CPU utilization during DM multipath's all path down case (when 'queue_if_no_path' is enabled). Please see each patch's header for more details. Mike Snitzer (3): blk-mq: introduce blk_mq_delay_kick_requeue_list() dm rq: add DM_MAPIO_DELAY_REQUEUE to delay requeue of blk-mq requests dm mpath: delay the requeue of blk-mq requests while all paths down block/blk-mq.c | 15 +++++++++++---- drivers/md/dm-mpath.c | 15 +++++++++------ drivers/md/dm-rq.c | 35 ++++++++++++++++++++--------------- include/linux/blk-mq.h | 1 + include/linux/blkdev.h | 2 +- include/linux/device-mapper.h | 1 + 6 files changed, 43 insertions(+), 26 deletions(-) -- 2.7.4 (Apple Git-66) ^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH 1/3] blk-mq: introduce blk_mq_delay_kick_requeue_list() 2016-09-13 16:01 [PATCH for-4.9 0/3] allow delayed requeue of blk-mq requests Mike Snitzer @ 2016-09-13 16:01 ` Mike Snitzer 2016-09-14 6:22 ` Hannes Reinecke ` (2 more replies) 2016-09-13 16:01 ` [PATCH 2/3] dm rq: add DM_MAPIO_DELAY_REQUEUE to delay requeue of blk-mq requests Mike Snitzer 2016-09-13 16:01 ` [PATCH 3/3] dm mpath: delay the requeue of blk-mq requests while all paths down Mike Snitzer 2 siblings, 3 replies; 15+ messages in thread From: Mike Snitzer @ 2016-09-13 16:01 UTC (permalink / raw) To: axboe, dm-devel; +Cc: linux-block blk_mq_delay_kick_requeue_list() provides the ability to kick the q->requeue_list after a specified time. To do this the request_queue's 'requeue_work' member was changed to a delayed_work. blk_mq_delay_kick_requeue_list() allows DM to defer processing requeued requests while it doesn't make sense to immediately requeue them (e.g. when all paths in a DM multipath have failed). Signed-off-by: Mike Snitzer <snitzer@redhat.com> --- block/blk-mq.c | 15 +++++++++++---- include/linux/blk-mq.h | 1 + include/linux/blkdev.h | 2 +- 3 files changed, 13 insertions(+), 5 deletions(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index 13f5a6c..096e7a0 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -501,7 +501,7 @@ EXPORT_SYMBOL(blk_mq_requeue_request); static void blk_mq_requeue_work(struct work_struct *work) { struct request_queue *q = - container_of(work, struct request_queue, requeue_work); + container_of(work, struct request_queue, requeue_work.work); LIST_HEAD(rq_list); struct request *rq, *next; unsigned long flags; @@ -556,16 +556,23 @@ EXPORT_SYMBOL(blk_mq_add_to_requeue_list); void blk_mq_cancel_requeue_work(struct request_queue *q) { - cancel_work_sync(&q->requeue_work); + cancel_delayed_work_sync(&q->requeue_work); } EXPORT_SYMBOL_GPL(blk_mq_cancel_requeue_work); void blk_mq_kick_requeue_list(struct request_queue *q) { - kblockd_schedule_work(&q->requeue_work); + kblockd_schedule_delayed_work(&q->requeue_work, 0); } EXPORT_SYMBOL(blk_mq_kick_requeue_list); +void blk_mq_delay_kick_requeue_list(struct request_queue *q, + unsigned long msecs) +{ + kblockd_schedule_delayed_work(&q->requeue_work, msecs); +} +EXPORT_SYMBOL(blk_mq_delay_kick_requeue_list); + void blk_mq_abort_requeue_list(struct request_queue *q) { unsigned long flags; @@ -2082,7 +2089,7 @@ struct request_queue *blk_mq_init_allocated_queue(struct blk_mq_tag_set *set, q->sg_reserved_size = INT_MAX; - INIT_WORK(&q->requeue_work, blk_mq_requeue_work); + INIT_DELAYED_WORK(&q->requeue_work, blk_mq_requeue_work); INIT_LIST_HEAD(&q->requeue_list); spin_lock_init(&q->requeue_lock); diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h index e43bbff..ecec4b8 100644 --- a/include/linux/blk-mq.h +++ b/include/linux/blk-mq.h @@ -232,6 +232,7 @@ void blk_mq_requeue_request(struct request *rq); void blk_mq_add_to_requeue_list(struct request *rq, bool at_head); void blk_mq_cancel_requeue_work(struct request_queue *q); void blk_mq_kick_requeue_list(struct request_queue *q); +void blk_mq_delay_kick_requeue_list(struct request_queue *q, unsigned long msecs); void blk_mq_abort_requeue_list(struct request_queue *q); void blk_mq_complete_request(struct request *rq, int error); diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h index e79055c..b0a6189 100644 --- a/include/linux/blkdev.h +++ b/include/linux/blkdev.h @@ -449,7 +449,7 @@ struct request_queue { struct list_head requeue_list; spinlock_t requeue_lock; - struct work_struct requeue_work; + struct delayed_work requeue_work; struct mutex sysfs_lock; -- 2.7.4 (Apple Git-66) ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH 1/3] blk-mq: introduce blk_mq_delay_kick_requeue_list() 2016-09-13 16:01 ` [PATCH 1/3] blk-mq: introduce blk_mq_delay_kick_requeue_list() Mike Snitzer @ 2016-09-14 6:22 ` Hannes Reinecke 2016-09-14 11:28 ` Bart Van Assche 2016-09-14 15:04 ` Jens Axboe 2 siblings, 0 replies; 15+ messages in thread From: Hannes Reinecke @ 2016-09-14 6:22 UTC (permalink / raw) To: Mike Snitzer, axboe, dm-devel; +Cc: linux-block On 09/13/2016 06:01 PM, Mike Snitzer wrote: > blk_mq_delay_kick_requeue_list() provides the ability to kick the > q->requeue_list after a specified time. To do this the request_queue's > 'requeue_work' member was changed to a delayed_work. > > blk_mq_delay_kick_requeue_list() allows DM to defer processing requeued > requests while it doesn't make sense to immediately requeue them > (e.g. when all paths in a DM multipath have failed). > > Signed-off-by: Mike Snitzer <snitzer@redhat.com> > --- > block/blk-mq.c | 15 +++++++++++---- > include/linux/blk-mq.h | 1 + > include/linux/blkdev.h | 2 +- > 3 files changed, 13 insertions(+), 5 deletions(-) > Reviewed-by: Hannes Reinecke <hare@suse.com> Cheers, Hannes -- Dr. Hannes Reinecke zSeries & Storage hare@suse.de +49 911 74053 688 SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 N�rnberg GF: J. Hawn, J. Guild, F. Imend�rffer, HRB 16746 (AG N�rnberg) ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/3] blk-mq: introduce blk_mq_delay_kick_requeue_list() @ 2016-09-14 6:22 ` Hannes Reinecke 0 siblings, 0 replies; 15+ messages in thread From: Hannes Reinecke @ 2016-09-14 6:22 UTC (permalink / raw) To: Mike Snitzer, axboe, dm-devel; +Cc: linux-block On 09/13/2016 06:01 PM, Mike Snitzer wrote: > blk_mq_delay_kick_requeue_list() provides the ability to kick the > q->requeue_list after a specified time. To do this the request_queue's > 'requeue_work' member was changed to a delayed_work. > > blk_mq_delay_kick_requeue_list() allows DM to defer processing requeued > requests while it doesn't make sense to immediately requeue them > (e.g. when all paths in a DM multipath have failed). > > Signed-off-by: Mike Snitzer <snitzer@redhat.com> > --- > block/blk-mq.c | 15 +++++++++++---- > include/linux/blk-mq.h | 1 + > include/linux/blkdev.h | 2 +- > 3 files changed, 13 insertions(+), 5 deletions(-) > Reviewed-by: Hannes Reinecke <hare@suse.com> Cheers, Hannes -- Dr. Hannes Reinecke zSeries & Storage hare@suse.de +49 911 74053 688 SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg) ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/3] blk-mq: introduce blk_mq_delay_kick_requeue_list() 2016-09-13 16:01 ` [PATCH 1/3] blk-mq: introduce blk_mq_delay_kick_requeue_list() Mike Snitzer 2016-09-14 6:22 ` Hannes Reinecke @ 2016-09-14 11:28 ` Bart Van Assche 2016-09-14 13:45 ` Mike Snitzer 2016-09-14 15:04 ` Jens Axboe 2 siblings, 1 reply; 15+ messages in thread From: Bart Van Assche @ 2016-09-14 11:28 UTC (permalink / raw) To: Mike Snitzer, axboe, dm-devel; +Cc: linux-block On 09/13/2016 06:01 PM, Mike Snitzer wrote: > +void blk_mq_delay_kick_requeue_list(struct request_queue *q, > + unsigned long msecs) > +{ > + kblockd_schedule_delayed_work(&q->requeue_work, msecs); > +} > +EXPORT_SYMBOL(blk_mq_delay_kick_requeue_list); Hello Mike, I think kblockd_schedule_delayed_work() expects that the second argument is a number of jiffies instead of a number of milliseconds. Bart. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/3] blk-mq: introduce blk_mq_delay_kick_requeue_list() 2016-09-14 11:28 ` Bart Van Assche @ 2016-09-14 13:45 ` Mike Snitzer 0 siblings, 0 replies; 15+ messages in thread From: Mike Snitzer @ 2016-09-14 13:45 UTC (permalink / raw) To: Bart Van Assche; +Cc: axboe, dm-devel, linux-block On Wed, Sep 14 2016 at 7:28am -0400, Bart Van Assche <bart.vanassche@gmail.com> wrote: > On 09/13/2016 06:01 PM, Mike Snitzer wrote: > >+void blk_mq_delay_kick_requeue_list(struct request_queue *q, > >+ unsigned long msecs) > >+{ > >+ kblockd_schedule_delayed_work(&q->requeue_work, msecs); > >+} > >+EXPORT_SYMBOL(blk_mq_delay_kick_requeue_list); > > Hello Mike, > > I think kblockd_schedule_delayed_work() expects that the second > argument is a number of jiffies instead of a number of milliseconds. Yeap, you're right. An earlier version was using msecs_to_jiffies() in the DM caller. I'll submit v2. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/3] blk-mq: introduce blk_mq_delay_kick_requeue_list() 2016-09-13 16:01 ` [PATCH 1/3] blk-mq: introduce blk_mq_delay_kick_requeue_list() Mike Snitzer 2016-09-14 6:22 ` Hannes Reinecke 2016-09-14 11:28 ` Bart Van Assche @ 2016-09-14 15:04 ` Jens Axboe 2016-09-14 15:13 ` Mike Snitzer 2 siblings, 1 reply; 15+ messages in thread From: Jens Axboe @ 2016-09-14 15:04 UTC (permalink / raw) To: Mike Snitzer, dm-devel; +Cc: linux-block On 09/13/2016 10:01 AM, Mike Snitzer wrote: > blk_mq_delay_kick_requeue_list() provides the ability to kick the > q->requeue_list after a specified time. To do this the request_queue's > 'requeue_work' member was changed to a delayed_work. > > blk_mq_delay_kick_requeue_list() allows DM to defer processing requeued > requests while it doesn't make sense to immediately requeue them > (e.g. when all paths in a DM multipath have failed). Looks fine to me, sans the missing jiffies conversion that Bart pointed out. -- Jens Axboe ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/3] blk-mq: introduce blk_mq_delay_kick_requeue_list() 2016-09-14 15:04 ` Jens Axboe @ 2016-09-14 15:13 ` Mike Snitzer 0 siblings, 0 replies; 15+ messages in thread From: Mike Snitzer @ 2016-09-14 15:13 UTC (permalink / raw) To: Jens Axboe; +Cc: dm-devel, linux-block On Wed, Sep 14 2016 at 11:04am -0400, Jens Axboe <axboe@kernel.dk> wrote: > On 09/13/2016 10:01 AM, Mike Snitzer wrote: > >blk_mq_delay_kick_requeue_list() provides the ability to kick the > >q->requeue_list after a specified time. To do this the request_queue's > >'requeue_work' member was changed to a delayed_work. > > > >blk_mq_delay_kick_requeue_list() allows DM to defer processing requeued > >requests while it doesn't make sense to immediately requeue them > >(e.g. when all paths in a DM multipath have failed). > > Looks fine to me, sans the missing jiffies conversion that Bart pointed > out. Thanks, I'll send v2 out shortly. ^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH 2/3] dm rq: add DM_MAPIO_DELAY_REQUEUE to delay requeue of blk-mq requests 2016-09-13 16:01 [PATCH for-4.9 0/3] allow delayed requeue of blk-mq requests Mike Snitzer 2016-09-13 16:01 ` [PATCH 1/3] blk-mq: introduce blk_mq_delay_kick_requeue_list() Mike Snitzer @ 2016-09-13 16:01 ` Mike Snitzer 2016-09-14 6:24 ` Hannes Reinecke 2016-09-13 16:01 ` [PATCH 3/3] dm mpath: delay the requeue of blk-mq requests while all paths down Mike Snitzer 2 siblings, 1 reply; 15+ messages in thread From: Mike Snitzer @ 2016-09-13 16:01 UTC (permalink / raw) To: axboe, dm-devel; +Cc: linux-block Otherwise blk-mq will immediately dispatch requests that are requeued via a BLK_MQ_RQ_QUEUE_BUSY return from blk_mq_ops .queue_rq. Delayed requeue is implemented using blk_mq_delay_kick_requeue_list() with a delay of 5 secs. In the context of DM multipath (all paths down) it doesn't make any sense to requeue more quickly. Signed-off-by: Mike Snitzer <snitzer@redhat.com> --- drivers/md/dm-rq.c | 35 ++++++++++++++++++++--------------- include/linux/device-mapper.h | 1 + 2 files changed, 21 insertions(+), 15 deletions(-) diff --git a/drivers/md/dm-rq.c b/drivers/md/dm-rq.c index 0d301d5..1722220 100644 --- a/drivers/md/dm-rq.c +++ b/drivers/md/dm-rq.c @@ -336,21 +336,23 @@ static void dm_old_requeue_request(struct request *rq) spin_unlock_irqrestore(q->queue_lock, flags); } -static void dm_mq_requeue_request(struct request *rq) +static void dm_mq_delay_requeue_request(struct request *rq, unsigned long msecs) { struct request_queue *q = rq->q; unsigned long flags; blk_mq_requeue_request(rq); + spin_lock_irqsave(q->queue_lock, flags); if (!blk_queue_stopped(q)) - blk_mq_kick_requeue_list(q); + blk_mq_delay_kick_requeue_list(q, msecs); spin_unlock_irqrestore(q->queue_lock, flags); } -static void dm_requeue_original_request(struct mapped_device *md, - struct request *rq) +static void dm_requeue_original_request(struct dm_rq_target_io *tio, bool delay_requeue) { + struct mapped_device *md = tio->md; + struct request *rq = tio->orig; int rw = rq_data_dir(rq); rq_end_stats(md, rq); @@ -359,7 +361,7 @@ static void dm_requeue_original_request(struct mapped_device *md, if (!rq->q->mq_ops) dm_old_requeue_request(rq); else - dm_mq_requeue_request(rq); + dm_mq_delay_requeue_request(rq, delay_requeue ? 5000 : 0); rq_completed(md, rw, false); } @@ -389,7 +391,7 @@ static void dm_done(struct request *clone, int error, bool mapped) return; else if (r == DM_ENDIO_REQUEUE) /* The target wants to requeue the I/O */ - dm_requeue_original_request(tio->md, tio->orig); + dm_requeue_original_request(tio, false); else { DMWARN("unimplemented target endio return value: %d", r); BUG(); @@ -629,8 +631,8 @@ static int dm_old_prep_fn(struct request_queue *q, struct request *rq) /* * Returns: - * 0 : the request has been processed - * DM_MAPIO_REQUEUE : the original request needs to be requeued + * DM_MAPIO_* : the request has been processed as indicated + * DM_MAPIO_REQUEUE : the original request needs to be immediately requeued * < 0 : the request was completed due to failure */ static int map_request(struct dm_rq_target_io *tio, struct request *rq, @@ -643,6 +645,8 @@ static int map_request(struct dm_rq_target_io *tio, struct request *rq, if (tio->clone) { clone = tio->clone; r = ti->type->map_rq(ti, clone, &tio->info); + if (r == DM_MAPIO_DELAY_REQUEUE) + return DM_MAPIO_REQUEUE; /* .request_fn requeue is always immediate */ } else { r = ti->type->clone_and_map_rq(ti, rq, &tio->info, &clone); if (r < 0) { @@ -650,9 +654,8 @@ static int map_request(struct dm_rq_target_io *tio, struct request *rq, dm_kill_unmapped_request(rq, r); return r; } - if (r != DM_MAPIO_REMAPPED) - return r; - if (setup_clone(clone, rq, tio, GFP_ATOMIC)) { + if (r == DM_MAPIO_REMAPPED && + setup_clone(clone, rq, tio, GFP_ATOMIC)) { /* -ENOMEM */ ti->type->release_clone_rq(clone); return DM_MAPIO_REQUEUE; @@ -671,7 +674,10 @@ static int map_request(struct dm_rq_target_io *tio, struct request *rq, break; case DM_MAPIO_REQUEUE: /* The target wants to requeue the I/O */ - dm_requeue_original_request(md, tio->orig); + break; + case DM_MAPIO_DELAY_REQUEUE: + /* The target wants to requeue the I/O after a delay */ + dm_requeue_original_request(tio, true); break; default: if (r > 0) { @@ -681,10 +687,9 @@ static int map_request(struct dm_rq_target_io *tio, struct request *rq, /* The target wants to complete the I/O */ dm_kill_unmapped_request(rq, r); - return r; } - return 0; + return r; } static void dm_start_request(struct mapped_device *md, struct request *orig) @@ -727,7 +732,7 @@ static void map_tio_request(struct kthread_work *work) struct mapped_device *md = tio->md; if (map_request(tio, rq, md) == DM_MAPIO_REQUEUE) - dm_requeue_original_request(md, rq); + dm_requeue_original_request(tio, false); } ssize_t dm_attr_rq_based_seq_io_merge_deadline_show(struct mapped_device *md, char *buf) diff --git a/include/linux/device-mapper.h b/include/linux/device-mapper.h index 91acfce..ef7962e 100644 --- a/include/linux/device-mapper.h +++ b/include/linux/device-mapper.h @@ -590,6 +590,7 @@ extern struct ratelimit_state dm_ratelimit_state; #define DM_MAPIO_SUBMITTED 0 #define DM_MAPIO_REMAPPED 1 #define DM_MAPIO_REQUEUE DM_ENDIO_REQUEUE +#define DM_MAPIO_DELAY_REQUEUE 3 #define dm_sector_div64(x, y)( \ { \ -- 2.7.4 (Apple Git-66) ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH 2/3] dm rq: add DM_MAPIO_DELAY_REQUEUE to delay requeue of blk-mq requests 2016-09-13 16:01 ` [PATCH 2/3] dm rq: add DM_MAPIO_DELAY_REQUEUE to delay requeue of blk-mq requests Mike Snitzer @ 2016-09-14 6:24 ` Hannes Reinecke 0 siblings, 0 replies; 15+ messages in thread From: Hannes Reinecke @ 2016-09-14 6:24 UTC (permalink / raw) To: Mike Snitzer, axboe, dm-devel; +Cc: linux-block On 09/13/2016 06:01 PM, Mike Snitzer wrote: > Otherwise blk-mq will immediately dispatch requests that are requeued > via a BLK_MQ_RQ_QUEUE_BUSY return from blk_mq_ops .queue_rq. > > Delayed requeue is implemented using blk_mq_delay_kick_requeue_list() > with a delay of 5 secs. In the context of DM multipath (all paths down) > it doesn't make any sense to requeue more quickly. > > Signed-off-by: Mike Snitzer <snitzer@redhat.com> > --- > drivers/md/dm-rq.c | 35 ++++++++++++++++++++--------------- > include/linux/device-mapper.h | 1 + > 2 files changed, 21 insertions(+), 15 deletions(-) > Hmm. Not sure if I agree with the reasoning for a 5 seconds delay; this seems to be a rather broad estimate. Won't it delay I/O resumption when we have intermittent path failure (eg on iSCSI)? Cheers, Hannes -- Dr. Hannes Reinecke zSeries & Storage hare@suse.de +49 911 74053 688 SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 N�rnberg GF: J. Hawn, J. Guild, F. Imend�rffer, HRB 16746 (AG N�rnberg) ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 2/3] dm rq: add DM_MAPIO_DELAY_REQUEUE to delay requeue of blk-mq requests @ 2016-09-14 6:24 ` Hannes Reinecke 0 siblings, 0 replies; 15+ messages in thread From: Hannes Reinecke @ 2016-09-14 6:24 UTC (permalink / raw) To: Mike Snitzer, axboe, dm-devel; +Cc: linux-block On 09/13/2016 06:01 PM, Mike Snitzer wrote: > Otherwise blk-mq will immediately dispatch requests that are requeued > via a BLK_MQ_RQ_QUEUE_BUSY return from blk_mq_ops .queue_rq. > > Delayed requeue is implemented using blk_mq_delay_kick_requeue_list() > with a delay of 5 secs. In the context of DM multipath (all paths down) > it doesn't make any sense to requeue more quickly. > > Signed-off-by: Mike Snitzer <snitzer@redhat.com> > --- > drivers/md/dm-rq.c | 35 ++++++++++++++++++++--------------- > include/linux/device-mapper.h | 1 + > 2 files changed, 21 insertions(+), 15 deletions(-) > Hmm. Not sure if I agree with the reasoning for a 5 seconds delay; this seems to be a rather broad estimate. Won't it delay I/O resumption when we have intermittent path failure (eg on iSCSI)? Cheers, Hannes -- Dr. Hannes Reinecke zSeries & Storage hare@suse.de +49 911 74053 688 SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg) ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 2/3] dm rq: add DM_MAPIO_DELAY_REQUEUE to delay requeue of blk-mq requests 2016-09-14 6:24 ` Hannes Reinecke (?) @ 2016-09-14 13:51 ` Mike Snitzer -1 siblings, 0 replies; 15+ messages in thread From: Mike Snitzer @ 2016-09-14 13:51 UTC (permalink / raw) To: Hannes Reinecke; +Cc: axboe, dm-devel, linux-block On Wed, Sep 14 2016 at 2:24am -0400, Hannes Reinecke <hare@suse.de> wrote: > On 09/13/2016 06:01 PM, Mike Snitzer wrote: > > Otherwise blk-mq will immediately dispatch requests that are requeued > > via a BLK_MQ_RQ_QUEUE_BUSY return from blk_mq_ops .queue_rq. > > > > Delayed requeue is implemented using blk_mq_delay_kick_requeue_list() > > with a delay of 5 secs. In the context of DM multipath (all paths down) > > it doesn't make any sense to requeue more quickly. > > > > Signed-off-by: Mike Snitzer <snitzer@redhat.com> > > --- > > drivers/md/dm-rq.c | 35 ++++++++++++++++++++--------------- > > include/linux/device-mapper.h | 1 + > > 2 files changed, 21 insertions(+), 15 deletions(-) > > > Hmm. Not sure if I agree with the reasoning for a 5 seconds delay; this > seems to be a rather broad estimate. > Won't it delay I/O resumption when we have intermittent path failure (eg > on iSCSI)? Are you saying the path would be reinstated via the 'reinstate_path' message or by table reload? Table reload will kick the requeue list. 'reinstate_path' doesn't (and there isn't an easy way to have dm-mpath inform dm-rq core to do so -- could add a new function...) If I can wire up requeue list kick on reinstate_path then the 5 secs is perfectly fine and won't hurt how quickly IO is resumed once a path comes back. ^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH 3/3] dm mpath: delay the requeue of blk-mq requests while all paths down 2016-09-13 16:01 [PATCH for-4.9 0/3] allow delayed requeue of blk-mq requests Mike Snitzer 2016-09-13 16:01 ` [PATCH 1/3] blk-mq: introduce blk_mq_delay_kick_requeue_list() Mike Snitzer 2016-09-13 16:01 ` [PATCH 2/3] dm rq: add DM_MAPIO_DELAY_REQUEUE to delay requeue of blk-mq requests Mike Snitzer @ 2016-09-13 16:01 ` Mike Snitzer 2016-09-14 6:25 ` Hannes Reinecke 2 siblings, 1 reply; 15+ messages in thread From: Mike Snitzer @ 2016-09-13 16:01 UTC (permalink / raw) To: axboe, dm-devel; +Cc: linux-block Return DM_MAPIO_DELAY_REQUEUE from .clone_and_map_rq. Also, return false from .busy, if all paths are down, so that blk-mq requests get mapped via .clone_and_map_rq -- which results in DM_MAPIO_DELAY_REQUEUE being returned to dm-rq. This change allows for a noticeable reduction in cpu utilization (reduced kworker load) while all paths are down, e.g.: system CPU idleness (as measured by fio's --idle-prof=system): before: system: 87.53% after: system: 97.07% Signed-off-by: Mike Snitzer <snitzer@redhat.com> --- drivers/md/dm-mpath.c | 15 +++++++++------ 1 file changed, 9 insertions(+), 6 deletions(-) diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c index c777d38..7613d64 100644 --- a/drivers/md/dm-mpath.c +++ b/drivers/md/dm-mpath.c @@ -550,9 +550,9 @@ static int __multipath_map(struct dm_target *ti, struct request *clone, pgpath = choose_pgpath(m, nr_bytes); if (!pgpath) { - if (!must_push_back_rq(m)) - r = -EIO; /* Failed */ - return r; + if (must_push_back_rq(m)) + return DM_MAPIO_DELAY_REQUEUE; + return -EIO; /* Failed */ } else if (test_bit(MPATHF_QUEUE_IO, &m->flags) || test_bit(MPATHF_PG_INIT_REQUIRED, &m->flags)) { pg_init_all_paths(m); @@ -1990,11 +1990,14 @@ static int multipath_busy(struct dm_target *ti) struct priority_group *pg, *next_pg; struct pgpath *pgpath; - /* pg_init in progress or no paths available */ - if (atomic_read(&m->pg_init_in_progress) || - (!atomic_read(&m->nr_valid_paths) && test_bit(MPATHF_QUEUE_IF_NO_PATH, &m->flags))) + /* pg_init in progress */ + if (atomic_read(&m->pg_init_in_progress)) return true; + /* no paths available, for blk-mq: rely on IO mapping to delay requeue */ + if (!atomic_read(&m->nr_valid_paths) && test_bit(MPATHF_QUEUE_IF_NO_PATH, &m->flags)) + return (m->queue_mode != DM_TYPE_MQ_REQUEST_BASED); + /* Guess which priority_group will be used at next mapping time */ pg = lockless_dereference(m->current_pg); next_pg = lockless_dereference(m->next_pg); -- 2.7.4 (Apple Git-66) ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH 3/3] dm mpath: delay the requeue of blk-mq requests while all paths down 2016-09-13 16:01 ` [PATCH 3/3] dm mpath: delay the requeue of blk-mq requests while all paths down Mike Snitzer @ 2016-09-14 6:25 ` Hannes Reinecke 0 siblings, 0 replies; 15+ messages in thread From: Hannes Reinecke @ 2016-09-14 6:25 UTC (permalink / raw) To: Mike Snitzer, axboe, dm-devel; +Cc: linux-block On 09/13/2016 06:01 PM, Mike Snitzer wrote: > Return DM_MAPIO_DELAY_REQUEUE from .clone_and_map_rq. Also, return > false from .busy, if all paths are down, so that blk-mq requests get > mapped via .clone_and_map_rq -- which results in DM_MAPIO_DELAY_REQUEUE > being returned to dm-rq. > > This change allows for a noticeable reduction in cpu utilization > (reduced kworker load) while all paths are down, e.g.: > > system CPU idleness (as measured by fio's --idle-prof=system): > before: system: 87.53% > after: system: 97.07% > > Signed-off-by: Mike Snitzer <snitzer@redhat.com> > --- > drivers/md/dm-mpath.c | 15 +++++++++------ > 1 file changed, 9 insertions(+), 6 deletions(-) > Reviewed-by: Hannes Reinecke <hare@suse.com> Cheers, Hannes -- Dr. Hannes Reinecke zSeries & Storage hare@suse.de +49 911 74053 688 SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 N�rnberg GF: J. Hawn, J. Guild, F. Imend�rffer, HRB 16746 (AG N�rnberg) ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 3/3] dm mpath: delay the requeue of blk-mq requests while all paths down @ 2016-09-14 6:25 ` Hannes Reinecke 0 siblings, 0 replies; 15+ messages in thread From: Hannes Reinecke @ 2016-09-14 6:25 UTC (permalink / raw) To: Mike Snitzer, axboe, dm-devel; +Cc: linux-block On 09/13/2016 06:01 PM, Mike Snitzer wrote: > Return DM_MAPIO_DELAY_REQUEUE from .clone_and_map_rq. Also, return > false from .busy, if all paths are down, so that blk-mq requests get > mapped via .clone_and_map_rq -- which results in DM_MAPIO_DELAY_REQUEUE > being returned to dm-rq. > > This change allows for a noticeable reduction in cpu utilization > (reduced kworker load) while all paths are down, e.g.: > > system CPU idleness (as measured by fio's --idle-prof=system): > before: system: 87.53% > after: system: 97.07% > > Signed-off-by: Mike Snitzer <snitzer@redhat.com> > --- > drivers/md/dm-mpath.c | 15 +++++++++------ > 1 file changed, 9 insertions(+), 6 deletions(-) > Reviewed-by: Hannes Reinecke <hare@suse.com> Cheers, Hannes -- Dr. Hannes Reinecke zSeries & Storage hare@suse.de +49 911 74053 688 SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg) ^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2016-09-14 15:13 UTC | newest] Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2016-09-13 16:01 [PATCH for-4.9 0/3] allow delayed requeue of blk-mq requests Mike Snitzer 2016-09-13 16:01 ` [PATCH 1/3] blk-mq: introduce blk_mq_delay_kick_requeue_list() Mike Snitzer 2016-09-14 6:22 ` Hannes Reinecke 2016-09-14 6:22 ` Hannes Reinecke 2016-09-14 11:28 ` Bart Van Assche 2016-09-14 13:45 ` Mike Snitzer 2016-09-14 15:04 ` Jens Axboe 2016-09-14 15:13 ` Mike Snitzer 2016-09-13 16:01 ` [PATCH 2/3] dm rq: add DM_MAPIO_DELAY_REQUEUE to delay requeue of blk-mq requests Mike Snitzer 2016-09-14 6:24 ` Hannes Reinecke 2016-09-14 6:24 ` Hannes Reinecke 2016-09-14 13:51 ` Mike Snitzer 2016-09-13 16:01 ` [PATCH 3/3] dm mpath: delay the requeue of blk-mq requests while all paths down Mike Snitzer 2016-09-14 6:25 ` Hannes Reinecke 2016-09-14 6:25 ` Hannes Reinecke
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.