* [PATCH for-4.9 0/3] allow delayed requeue of blk-mq requests
@ 2016-09-13 16:01 Mike Snitzer
2016-09-13 16:01 ` [PATCH 1/3] blk-mq: introduce blk_mq_delay_kick_requeue_list() Mike Snitzer
` (2 more replies)
0 siblings, 3 replies; 15+ messages in thread
From: Mike Snitzer @ 2016-09-13 16:01 UTC (permalink / raw)
To: axboe, dm-devel; +Cc: linux-block
I'd like to see these changes included during the 4.9 merge. The
blk-mq change is a straight-forward switch from work_struct to
delayed_work and the addition of the blk_mq_delay_kick_requeue_list()
interface. Jens, can you pick up this blk-mq changes (patch 1)?
These changes offer reduced CPU utilization during DM multipath's all
path down case (when 'queue_if_no_path' is enabled). Please see each
patch's header for more details.
Mike Snitzer (3):
blk-mq: introduce blk_mq_delay_kick_requeue_list()
dm rq: add DM_MAPIO_DELAY_REQUEUE to delay requeue of blk-mq requests
dm mpath: delay the requeue of blk-mq requests while all paths down
block/blk-mq.c | 15 +++++++++++----
drivers/md/dm-mpath.c | 15 +++++++++------
drivers/md/dm-rq.c | 35 ++++++++++++++++++++---------------
include/linux/blk-mq.h | 1 +
include/linux/blkdev.h | 2 +-
include/linux/device-mapper.h | 1 +
6 files changed, 43 insertions(+), 26 deletions(-)
--
2.7.4 (Apple Git-66)
^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH 1/3] blk-mq: introduce blk_mq_delay_kick_requeue_list()
2016-09-13 16:01 [PATCH for-4.9 0/3] allow delayed requeue of blk-mq requests Mike Snitzer
@ 2016-09-13 16:01 ` Mike Snitzer
2016-09-14 6:22 ` Hannes Reinecke
` (2 more replies)
2016-09-13 16:01 ` [PATCH 2/3] dm rq: add DM_MAPIO_DELAY_REQUEUE to delay requeue of blk-mq requests Mike Snitzer
2016-09-13 16:01 ` [PATCH 3/3] dm mpath: delay the requeue of blk-mq requests while all paths down Mike Snitzer
2 siblings, 3 replies; 15+ messages in thread
From: Mike Snitzer @ 2016-09-13 16:01 UTC (permalink / raw)
To: axboe, dm-devel; +Cc: linux-block
blk_mq_delay_kick_requeue_list() provides the ability to kick the
q->requeue_list after a specified time. To do this the request_queue's
'requeue_work' member was changed to a delayed_work.
blk_mq_delay_kick_requeue_list() allows DM to defer processing requeued
requests while it doesn't make sense to immediately requeue them
(e.g. when all paths in a DM multipath have failed).
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
---
block/blk-mq.c | 15 +++++++++++----
include/linux/blk-mq.h | 1 +
include/linux/blkdev.h | 2 +-
3 files changed, 13 insertions(+), 5 deletions(-)
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 13f5a6c..096e7a0 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -501,7 +501,7 @@ EXPORT_SYMBOL(blk_mq_requeue_request);
static void blk_mq_requeue_work(struct work_struct *work)
{
struct request_queue *q =
- container_of(work, struct request_queue, requeue_work);
+ container_of(work, struct request_queue, requeue_work.work);
LIST_HEAD(rq_list);
struct request *rq, *next;
unsigned long flags;
@@ -556,16 +556,23 @@ EXPORT_SYMBOL(blk_mq_add_to_requeue_list);
void blk_mq_cancel_requeue_work(struct request_queue *q)
{
- cancel_work_sync(&q->requeue_work);
+ cancel_delayed_work_sync(&q->requeue_work);
}
EXPORT_SYMBOL_GPL(blk_mq_cancel_requeue_work);
void blk_mq_kick_requeue_list(struct request_queue *q)
{
- kblockd_schedule_work(&q->requeue_work);
+ kblockd_schedule_delayed_work(&q->requeue_work, 0);
}
EXPORT_SYMBOL(blk_mq_kick_requeue_list);
+void blk_mq_delay_kick_requeue_list(struct request_queue *q,
+ unsigned long msecs)
+{
+ kblockd_schedule_delayed_work(&q->requeue_work, msecs);
+}
+EXPORT_SYMBOL(blk_mq_delay_kick_requeue_list);
+
void blk_mq_abort_requeue_list(struct request_queue *q)
{
unsigned long flags;
@@ -2082,7 +2089,7 @@ struct request_queue *blk_mq_init_allocated_queue(struct blk_mq_tag_set *set,
q->sg_reserved_size = INT_MAX;
- INIT_WORK(&q->requeue_work, blk_mq_requeue_work);
+ INIT_DELAYED_WORK(&q->requeue_work, blk_mq_requeue_work);
INIT_LIST_HEAD(&q->requeue_list);
spin_lock_init(&q->requeue_lock);
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index e43bbff..ecec4b8 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -232,6 +232,7 @@ void blk_mq_requeue_request(struct request *rq);
void blk_mq_add_to_requeue_list(struct request *rq, bool at_head);
void blk_mq_cancel_requeue_work(struct request_queue *q);
void blk_mq_kick_requeue_list(struct request_queue *q);
+void blk_mq_delay_kick_requeue_list(struct request_queue *q, unsigned long msecs);
void blk_mq_abort_requeue_list(struct request_queue *q);
void blk_mq_complete_request(struct request *rq, int error);
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index e79055c..b0a6189 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -449,7 +449,7 @@ struct request_queue {
struct list_head requeue_list;
spinlock_t requeue_lock;
- struct work_struct requeue_work;
+ struct delayed_work requeue_work;
struct mutex sysfs_lock;
--
2.7.4 (Apple Git-66)
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH 2/3] dm rq: add DM_MAPIO_DELAY_REQUEUE to delay requeue of blk-mq requests
2016-09-13 16:01 [PATCH for-4.9 0/3] allow delayed requeue of blk-mq requests Mike Snitzer
2016-09-13 16:01 ` [PATCH 1/3] blk-mq: introduce blk_mq_delay_kick_requeue_list() Mike Snitzer
@ 2016-09-13 16:01 ` Mike Snitzer
2016-09-14 6:24 ` Hannes Reinecke
2016-09-13 16:01 ` [PATCH 3/3] dm mpath: delay the requeue of blk-mq requests while all paths down Mike Snitzer
2 siblings, 1 reply; 15+ messages in thread
From: Mike Snitzer @ 2016-09-13 16:01 UTC (permalink / raw)
To: axboe, dm-devel; +Cc: linux-block
Otherwise blk-mq will immediately dispatch requests that are requeued
via a BLK_MQ_RQ_QUEUE_BUSY return from blk_mq_ops .queue_rq.
Delayed requeue is implemented using blk_mq_delay_kick_requeue_list()
with a delay of 5 secs. In the context of DM multipath (all paths down)
it doesn't make any sense to requeue more quickly.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
---
drivers/md/dm-rq.c | 35 ++++++++++++++++++++---------------
include/linux/device-mapper.h | 1 +
2 files changed, 21 insertions(+), 15 deletions(-)
diff --git a/drivers/md/dm-rq.c b/drivers/md/dm-rq.c
index 0d301d5..1722220 100644
--- a/drivers/md/dm-rq.c
+++ b/drivers/md/dm-rq.c
@@ -336,21 +336,23 @@ static void dm_old_requeue_request(struct request *rq)
spin_unlock_irqrestore(q->queue_lock, flags);
}
-static void dm_mq_requeue_request(struct request *rq)
+static void dm_mq_delay_requeue_request(struct request *rq, unsigned long msecs)
{
struct request_queue *q = rq->q;
unsigned long flags;
blk_mq_requeue_request(rq);
+
spin_lock_irqsave(q->queue_lock, flags);
if (!blk_queue_stopped(q))
- blk_mq_kick_requeue_list(q);
+ blk_mq_delay_kick_requeue_list(q, msecs);
spin_unlock_irqrestore(q->queue_lock, flags);
}
-static void dm_requeue_original_request(struct mapped_device *md,
- struct request *rq)
+static void dm_requeue_original_request(struct dm_rq_target_io *tio, bool delay_requeue)
{
+ struct mapped_device *md = tio->md;
+ struct request *rq = tio->orig;
int rw = rq_data_dir(rq);
rq_end_stats(md, rq);
@@ -359,7 +361,7 @@ static void dm_requeue_original_request(struct mapped_device *md,
if (!rq->q->mq_ops)
dm_old_requeue_request(rq);
else
- dm_mq_requeue_request(rq);
+ dm_mq_delay_requeue_request(rq, delay_requeue ? 5000 : 0);
rq_completed(md, rw, false);
}
@@ -389,7 +391,7 @@ static void dm_done(struct request *clone, int error, bool mapped)
return;
else if (r == DM_ENDIO_REQUEUE)
/* The target wants to requeue the I/O */
- dm_requeue_original_request(tio->md, tio->orig);
+ dm_requeue_original_request(tio, false);
else {
DMWARN("unimplemented target endio return value: %d", r);
BUG();
@@ -629,8 +631,8 @@ static int dm_old_prep_fn(struct request_queue *q, struct request *rq)
/*
* Returns:
- * 0 : the request has been processed
- * DM_MAPIO_REQUEUE : the original request needs to be requeued
+ * DM_MAPIO_* : the request has been processed as indicated
+ * DM_MAPIO_REQUEUE : the original request needs to be immediately requeued
* < 0 : the request was completed due to failure
*/
static int map_request(struct dm_rq_target_io *tio, struct request *rq,
@@ -643,6 +645,8 @@ static int map_request(struct dm_rq_target_io *tio, struct request *rq,
if (tio->clone) {
clone = tio->clone;
r = ti->type->map_rq(ti, clone, &tio->info);
+ if (r == DM_MAPIO_DELAY_REQUEUE)
+ return DM_MAPIO_REQUEUE; /* .request_fn requeue is always immediate */
} else {
r = ti->type->clone_and_map_rq(ti, rq, &tio->info, &clone);
if (r < 0) {
@@ -650,9 +654,8 @@ static int map_request(struct dm_rq_target_io *tio, struct request *rq,
dm_kill_unmapped_request(rq, r);
return r;
}
- if (r != DM_MAPIO_REMAPPED)
- return r;
- if (setup_clone(clone, rq, tio, GFP_ATOMIC)) {
+ if (r == DM_MAPIO_REMAPPED &&
+ setup_clone(clone, rq, tio, GFP_ATOMIC)) {
/* -ENOMEM */
ti->type->release_clone_rq(clone);
return DM_MAPIO_REQUEUE;
@@ -671,7 +674,10 @@ static int map_request(struct dm_rq_target_io *tio, struct request *rq,
break;
case DM_MAPIO_REQUEUE:
/* The target wants to requeue the I/O */
- dm_requeue_original_request(md, tio->orig);
+ break;
+ case DM_MAPIO_DELAY_REQUEUE:
+ /* The target wants to requeue the I/O after a delay */
+ dm_requeue_original_request(tio, true);
break;
default:
if (r > 0) {
@@ -681,10 +687,9 @@ static int map_request(struct dm_rq_target_io *tio, struct request *rq,
/* The target wants to complete the I/O */
dm_kill_unmapped_request(rq, r);
- return r;
}
- return 0;
+ return r;
}
static void dm_start_request(struct mapped_device *md, struct request *orig)
@@ -727,7 +732,7 @@ static void map_tio_request(struct kthread_work *work)
struct mapped_device *md = tio->md;
if (map_request(tio, rq, md) == DM_MAPIO_REQUEUE)
- dm_requeue_original_request(md, rq);
+ dm_requeue_original_request(tio, false);
}
ssize_t dm_attr_rq_based_seq_io_merge_deadline_show(struct mapped_device *md, char *buf)
diff --git a/include/linux/device-mapper.h b/include/linux/device-mapper.h
index 91acfce..ef7962e 100644
--- a/include/linux/device-mapper.h
+++ b/include/linux/device-mapper.h
@@ -590,6 +590,7 @@ extern struct ratelimit_state dm_ratelimit_state;
#define DM_MAPIO_SUBMITTED 0
#define DM_MAPIO_REMAPPED 1
#define DM_MAPIO_REQUEUE DM_ENDIO_REQUEUE
+#define DM_MAPIO_DELAY_REQUEUE 3
#define dm_sector_div64(x, y)( \
{ \
--
2.7.4 (Apple Git-66)
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH 3/3] dm mpath: delay the requeue of blk-mq requests while all paths down
2016-09-13 16:01 [PATCH for-4.9 0/3] allow delayed requeue of blk-mq requests Mike Snitzer
2016-09-13 16:01 ` [PATCH 1/3] blk-mq: introduce blk_mq_delay_kick_requeue_list() Mike Snitzer
2016-09-13 16:01 ` [PATCH 2/3] dm rq: add DM_MAPIO_DELAY_REQUEUE to delay requeue of blk-mq requests Mike Snitzer
@ 2016-09-13 16:01 ` Mike Snitzer
2016-09-14 6:25 ` Hannes Reinecke
2 siblings, 1 reply; 15+ messages in thread
From: Mike Snitzer @ 2016-09-13 16:01 UTC (permalink / raw)
To: axboe, dm-devel; +Cc: linux-block
Return DM_MAPIO_DELAY_REQUEUE from .clone_and_map_rq. Also, return
false from .busy, if all paths are down, so that blk-mq requests get
mapped via .clone_and_map_rq -- which results in DM_MAPIO_DELAY_REQUEUE
being returned to dm-rq.
This change allows for a noticeable reduction in cpu utilization
(reduced kworker load) while all paths are down, e.g.:
system CPU idleness (as measured by fio's --idle-prof=system):
before: system: 87.53%
after: system: 97.07%
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
---
drivers/md/dm-mpath.c | 15 +++++++++------
1 file changed, 9 insertions(+), 6 deletions(-)
diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c
index c777d38..7613d64 100644
--- a/drivers/md/dm-mpath.c
+++ b/drivers/md/dm-mpath.c
@@ -550,9 +550,9 @@ static int __multipath_map(struct dm_target *ti, struct request *clone,
pgpath = choose_pgpath(m, nr_bytes);
if (!pgpath) {
- if (!must_push_back_rq(m))
- r = -EIO; /* Failed */
- return r;
+ if (must_push_back_rq(m))
+ return DM_MAPIO_DELAY_REQUEUE;
+ return -EIO; /* Failed */
} else if (test_bit(MPATHF_QUEUE_IO, &m->flags) ||
test_bit(MPATHF_PG_INIT_REQUIRED, &m->flags)) {
pg_init_all_paths(m);
@@ -1990,11 +1990,14 @@ static int multipath_busy(struct dm_target *ti)
struct priority_group *pg, *next_pg;
struct pgpath *pgpath;
- /* pg_init in progress or no paths available */
- if (atomic_read(&m->pg_init_in_progress) ||
- (!atomic_read(&m->nr_valid_paths) && test_bit(MPATHF_QUEUE_IF_NO_PATH, &m->flags)))
+ /* pg_init in progress */
+ if (atomic_read(&m->pg_init_in_progress))
return true;
+ /* no paths available, for blk-mq: rely on IO mapping to delay requeue */
+ if (!atomic_read(&m->nr_valid_paths) && test_bit(MPATHF_QUEUE_IF_NO_PATH, &m->flags))
+ return (m->queue_mode != DM_TYPE_MQ_REQUEST_BASED);
+
/* Guess which priority_group will be used at next mapping time */
pg = lockless_dereference(m->current_pg);
next_pg = lockless_dereference(m->next_pg);
--
2.7.4 (Apple Git-66)
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH 1/3] blk-mq: introduce blk_mq_delay_kick_requeue_list()
2016-09-13 16:01 ` [PATCH 1/3] blk-mq: introduce blk_mq_delay_kick_requeue_list() Mike Snitzer
@ 2016-09-14 6:22 ` Hannes Reinecke
2016-09-14 11:28 ` Bart Van Assche
2016-09-14 15:04 ` Jens Axboe
2 siblings, 0 replies; 15+ messages in thread
From: Hannes Reinecke @ 2016-09-14 6:22 UTC (permalink / raw)
To: Mike Snitzer, axboe, dm-devel; +Cc: linux-block
On 09/13/2016 06:01 PM, Mike Snitzer wrote:
> blk_mq_delay_kick_requeue_list() provides the ability to kick the
> q->requeue_list after a specified time. To do this the request_queue's
> 'requeue_work' member was changed to a delayed_work.
>
> blk_mq_delay_kick_requeue_list() allows DM to defer processing requeued
> requests while it doesn't make sense to immediately requeue them
> (e.g. when all paths in a DM multipath have failed).
>
> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> ---
> block/blk-mq.c | 15 +++++++++++----
> include/linux/blk-mq.h | 1 +
> include/linux/blkdev.h | 2 +-
> 3 files changed, 13 insertions(+), 5 deletions(-)
>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Cheers,
Hannes
--
Dr. Hannes Reinecke zSeries & Storage
hare@suse.de +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 N�rnberg
GF: J. Hawn, J. Guild, F. Imend�rffer, HRB 16746 (AG N�rnberg)
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/3] blk-mq: introduce blk_mq_delay_kick_requeue_list()
@ 2016-09-14 6:22 ` Hannes Reinecke
0 siblings, 0 replies; 15+ messages in thread
From: Hannes Reinecke @ 2016-09-14 6:22 UTC (permalink / raw)
To: Mike Snitzer, axboe, dm-devel; +Cc: linux-block
On 09/13/2016 06:01 PM, Mike Snitzer wrote:
> blk_mq_delay_kick_requeue_list() provides the ability to kick the
> q->requeue_list after a specified time. To do this the request_queue's
> 'requeue_work' member was changed to a delayed_work.
>
> blk_mq_delay_kick_requeue_list() allows DM to defer processing requeued
> requests while it doesn't make sense to immediately requeue them
> (e.g. when all paths in a DM multipath have failed).
>
> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> ---
> block/blk-mq.c | 15 +++++++++++----
> include/linux/blk-mq.h | 1 +
> include/linux/blkdev.h | 2 +-
> 3 files changed, 13 insertions(+), 5 deletions(-)
>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Cheers,
Hannes
--
Dr. Hannes Reinecke zSeries & Storage
hare@suse.de +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 2/3] dm rq: add DM_MAPIO_DELAY_REQUEUE to delay requeue of blk-mq requests
2016-09-13 16:01 ` [PATCH 2/3] dm rq: add DM_MAPIO_DELAY_REQUEUE to delay requeue of blk-mq requests Mike Snitzer
@ 2016-09-14 6:24 ` Hannes Reinecke
0 siblings, 0 replies; 15+ messages in thread
From: Hannes Reinecke @ 2016-09-14 6:24 UTC (permalink / raw)
To: Mike Snitzer, axboe, dm-devel; +Cc: linux-block
On 09/13/2016 06:01 PM, Mike Snitzer wrote:
> Otherwise blk-mq will immediately dispatch requests that are requeued
> via a BLK_MQ_RQ_QUEUE_BUSY return from blk_mq_ops .queue_rq.
>
> Delayed requeue is implemented using blk_mq_delay_kick_requeue_list()
> with a delay of 5 secs. In the context of DM multipath (all paths down)
> it doesn't make any sense to requeue more quickly.
>
> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> ---
> drivers/md/dm-rq.c | 35 ++++++++++++++++++++---------------
> include/linux/device-mapper.h | 1 +
> 2 files changed, 21 insertions(+), 15 deletions(-)
>
Hmm. Not sure if I agree with the reasoning for a 5 seconds delay; this
seems to be a rather broad estimate.
Won't it delay I/O resumption when we have intermittent path failure (eg
on iSCSI)?
Cheers,
Hannes
--
Dr. Hannes Reinecke zSeries & Storage
hare@suse.de +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 N�rnberg
GF: J. Hawn, J. Guild, F. Imend�rffer, HRB 16746 (AG N�rnberg)
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 2/3] dm rq: add DM_MAPIO_DELAY_REQUEUE to delay requeue of blk-mq requests
@ 2016-09-14 6:24 ` Hannes Reinecke
0 siblings, 0 replies; 15+ messages in thread
From: Hannes Reinecke @ 2016-09-14 6:24 UTC (permalink / raw)
To: Mike Snitzer, axboe, dm-devel; +Cc: linux-block
On 09/13/2016 06:01 PM, Mike Snitzer wrote:
> Otherwise blk-mq will immediately dispatch requests that are requeued
> via a BLK_MQ_RQ_QUEUE_BUSY return from blk_mq_ops .queue_rq.
>
> Delayed requeue is implemented using blk_mq_delay_kick_requeue_list()
> with a delay of 5 secs. In the context of DM multipath (all paths down)
> it doesn't make any sense to requeue more quickly.
>
> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> ---
> drivers/md/dm-rq.c | 35 ++++++++++++++++++++---------------
> include/linux/device-mapper.h | 1 +
> 2 files changed, 21 insertions(+), 15 deletions(-)
>
Hmm. Not sure if I agree with the reasoning for a 5 seconds delay; this
seems to be a rather broad estimate.
Won't it delay I/O resumption when we have intermittent path failure (eg
on iSCSI)?
Cheers,
Hannes
--
Dr. Hannes Reinecke zSeries & Storage
hare@suse.de +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 3/3] dm mpath: delay the requeue of blk-mq requests while all paths down
2016-09-13 16:01 ` [PATCH 3/3] dm mpath: delay the requeue of blk-mq requests while all paths down Mike Snitzer
@ 2016-09-14 6:25 ` Hannes Reinecke
0 siblings, 0 replies; 15+ messages in thread
From: Hannes Reinecke @ 2016-09-14 6:25 UTC (permalink / raw)
To: Mike Snitzer, axboe, dm-devel; +Cc: linux-block
On 09/13/2016 06:01 PM, Mike Snitzer wrote:
> Return DM_MAPIO_DELAY_REQUEUE from .clone_and_map_rq. Also, return
> false from .busy, if all paths are down, so that blk-mq requests get
> mapped via .clone_and_map_rq -- which results in DM_MAPIO_DELAY_REQUEUE
> being returned to dm-rq.
>
> This change allows for a noticeable reduction in cpu utilization
> (reduced kworker load) while all paths are down, e.g.:
>
> system CPU idleness (as measured by fio's --idle-prof=system):
> before: system: 87.53%
> after: system: 97.07%
>
> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> ---
> drivers/md/dm-mpath.c | 15 +++++++++------
> 1 file changed, 9 insertions(+), 6 deletions(-)
>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Cheers,
Hannes
--
Dr. Hannes Reinecke zSeries & Storage
hare@suse.de +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 N�rnberg
GF: J. Hawn, J. Guild, F. Imend�rffer, HRB 16746 (AG N�rnberg)
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 3/3] dm mpath: delay the requeue of blk-mq requests while all paths down
@ 2016-09-14 6:25 ` Hannes Reinecke
0 siblings, 0 replies; 15+ messages in thread
From: Hannes Reinecke @ 2016-09-14 6:25 UTC (permalink / raw)
To: Mike Snitzer, axboe, dm-devel; +Cc: linux-block
On 09/13/2016 06:01 PM, Mike Snitzer wrote:
> Return DM_MAPIO_DELAY_REQUEUE from .clone_and_map_rq. Also, return
> false from .busy, if all paths are down, so that blk-mq requests get
> mapped via .clone_and_map_rq -- which results in DM_MAPIO_DELAY_REQUEUE
> being returned to dm-rq.
>
> This change allows for a noticeable reduction in cpu utilization
> (reduced kworker load) while all paths are down, e.g.:
>
> system CPU idleness (as measured by fio's --idle-prof=system):
> before: system: 87.53%
> after: system: 97.07%
>
> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> ---
> drivers/md/dm-mpath.c | 15 +++++++++------
> 1 file changed, 9 insertions(+), 6 deletions(-)
>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Cheers,
Hannes
--
Dr. Hannes Reinecke zSeries & Storage
hare@suse.de +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/3] blk-mq: introduce blk_mq_delay_kick_requeue_list()
2016-09-13 16:01 ` [PATCH 1/3] blk-mq: introduce blk_mq_delay_kick_requeue_list() Mike Snitzer
2016-09-14 6:22 ` Hannes Reinecke
@ 2016-09-14 11:28 ` Bart Van Assche
2016-09-14 13:45 ` Mike Snitzer
2016-09-14 15:04 ` Jens Axboe
2 siblings, 1 reply; 15+ messages in thread
From: Bart Van Assche @ 2016-09-14 11:28 UTC (permalink / raw)
To: Mike Snitzer, axboe, dm-devel; +Cc: linux-block
On 09/13/2016 06:01 PM, Mike Snitzer wrote:
> +void blk_mq_delay_kick_requeue_list(struct request_queue *q,
> + unsigned long msecs)
> +{
> + kblockd_schedule_delayed_work(&q->requeue_work, msecs);
> +}
> +EXPORT_SYMBOL(blk_mq_delay_kick_requeue_list);
Hello Mike,
I think kblockd_schedule_delayed_work() expects that the second argument
is a number of jiffies instead of a number of milliseconds.
Bart.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/3] blk-mq: introduce blk_mq_delay_kick_requeue_list()
2016-09-14 11:28 ` Bart Van Assche
@ 2016-09-14 13:45 ` Mike Snitzer
0 siblings, 0 replies; 15+ messages in thread
From: Mike Snitzer @ 2016-09-14 13:45 UTC (permalink / raw)
To: Bart Van Assche; +Cc: axboe, dm-devel, linux-block
On Wed, Sep 14 2016 at 7:28am -0400,
Bart Van Assche <bart.vanassche@gmail.com> wrote:
> On 09/13/2016 06:01 PM, Mike Snitzer wrote:
> >+void blk_mq_delay_kick_requeue_list(struct request_queue *q,
> >+ unsigned long msecs)
> >+{
> >+ kblockd_schedule_delayed_work(&q->requeue_work, msecs);
> >+}
> >+EXPORT_SYMBOL(blk_mq_delay_kick_requeue_list);
>
> Hello Mike,
>
> I think kblockd_schedule_delayed_work() expects that the second
> argument is a number of jiffies instead of a number of milliseconds.
Yeap, you're right. An earlier version was using msecs_to_jiffies() in
the DM caller. I'll submit v2.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 2/3] dm rq: add DM_MAPIO_DELAY_REQUEUE to delay requeue of blk-mq requests
2016-09-14 6:24 ` Hannes Reinecke
(?)
@ 2016-09-14 13:51 ` Mike Snitzer
-1 siblings, 0 replies; 15+ messages in thread
From: Mike Snitzer @ 2016-09-14 13:51 UTC (permalink / raw)
To: Hannes Reinecke; +Cc: axboe, dm-devel, linux-block
On Wed, Sep 14 2016 at 2:24am -0400,
Hannes Reinecke <hare@suse.de> wrote:
> On 09/13/2016 06:01 PM, Mike Snitzer wrote:
> > Otherwise blk-mq will immediately dispatch requests that are requeued
> > via a BLK_MQ_RQ_QUEUE_BUSY return from blk_mq_ops .queue_rq.
> >
> > Delayed requeue is implemented using blk_mq_delay_kick_requeue_list()
> > with a delay of 5 secs. In the context of DM multipath (all paths down)
> > it doesn't make any sense to requeue more quickly.
> >
> > Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> > ---
> > drivers/md/dm-rq.c | 35 ++++++++++++++++++++---------------
> > include/linux/device-mapper.h | 1 +
> > 2 files changed, 21 insertions(+), 15 deletions(-)
> >
> Hmm. Not sure if I agree with the reasoning for a 5 seconds delay; this
> seems to be a rather broad estimate.
> Won't it delay I/O resumption when we have intermittent path failure (eg
> on iSCSI)?
Are you saying the path would be reinstated via the 'reinstate_path'
message or by table reload? Table reload will kick the requeue list.
'reinstate_path' doesn't (and there isn't an easy way to have dm-mpath
inform dm-rq core to do so -- could add a new function...)
If I can wire up requeue list kick on reinstate_path then the 5 secs is
perfectly fine and won't hurt how quickly IO is resumed once a path
comes back.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/3] blk-mq: introduce blk_mq_delay_kick_requeue_list()
2016-09-13 16:01 ` [PATCH 1/3] blk-mq: introduce blk_mq_delay_kick_requeue_list() Mike Snitzer
2016-09-14 6:22 ` Hannes Reinecke
2016-09-14 11:28 ` Bart Van Assche
@ 2016-09-14 15:04 ` Jens Axboe
2016-09-14 15:13 ` Mike Snitzer
2 siblings, 1 reply; 15+ messages in thread
From: Jens Axboe @ 2016-09-14 15:04 UTC (permalink / raw)
To: Mike Snitzer, dm-devel; +Cc: linux-block
On 09/13/2016 10:01 AM, Mike Snitzer wrote:
> blk_mq_delay_kick_requeue_list() provides the ability to kick the
> q->requeue_list after a specified time. To do this the request_queue's
> 'requeue_work' member was changed to a delayed_work.
>
> blk_mq_delay_kick_requeue_list() allows DM to defer processing requeued
> requests while it doesn't make sense to immediately requeue them
> (e.g. when all paths in a DM multipath have failed).
Looks fine to me, sans the missing jiffies conversion that Bart pointed
out.
--
Jens Axboe
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/3] blk-mq: introduce blk_mq_delay_kick_requeue_list()
2016-09-14 15:04 ` Jens Axboe
@ 2016-09-14 15:13 ` Mike Snitzer
0 siblings, 0 replies; 15+ messages in thread
From: Mike Snitzer @ 2016-09-14 15:13 UTC (permalink / raw)
To: Jens Axboe; +Cc: dm-devel, linux-block
On Wed, Sep 14 2016 at 11:04am -0400,
Jens Axboe <axboe@kernel.dk> wrote:
> On 09/13/2016 10:01 AM, Mike Snitzer wrote:
> >blk_mq_delay_kick_requeue_list() provides the ability to kick the
> >q->requeue_list after a specified time. To do this the request_queue's
> >'requeue_work' member was changed to a delayed_work.
> >
> >blk_mq_delay_kick_requeue_list() allows DM to defer processing requeued
> >requests while it doesn't make sense to immediately requeue them
> >(e.g. when all paths in a DM multipath have failed).
>
> Looks fine to me, sans the missing jiffies conversion that Bart pointed
> out.
Thanks, I'll send v2 out shortly.
^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2016-09-14 15:13 UTC | newest]
Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-09-13 16:01 [PATCH for-4.9 0/3] allow delayed requeue of blk-mq requests Mike Snitzer
2016-09-13 16:01 ` [PATCH 1/3] blk-mq: introduce blk_mq_delay_kick_requeue_list() Mike Snitzer
2016-09-14 6:22 ` Hannes Reinecke
2016-09-14 6:22 ` Hannes Reinecke
2016-09-14 11:28 ` Bart Van Assche
2016-09-14 13:45 ` Mike Snitzer
2016-09-14 15:04 ` Jens Axboe
2016-09-14 15:13 ` Mike Snitzer
2016-09-13 16:01 ` [PATCH 2/3] dm rq: add DM_MAPIO_DELAY_REQUEUE to delay requeue of blk-mq requests Mike Snitzer
2016-09-14 6:24 ` Hannes Reinecke
2016-09-14 6:24 ` Hannes Reinecke
2016-09-14 13:51 ` Mike Snitzer
2016-09-13 16:01 ` [PATCH 3/3] dm mpath: delay the requeue of blk-mq requests while all paths down Mike Snitzer
2016-09-14 6:25 ` Hannes Reinecke
2016-09-14 6:25 ` Hannes Reinecke
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.