All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCHSET v5] blk-mq: reimplement timeout handling
@ 2018-01-09 16:29 Tejun Heo
  2018-01-09 16:29 ` [PATCH 1/8] blk-mq: move hctx lock/unlock into a helper Tejun Heo
                   ` (9 more replies)
  0 siblings, 10 replies; 29+ messages in thread
From: Tejun Heo @ 2018-01-09 16:29 UTC (permalink / raw)
  To: jack, axboe, clm, jbacik
  Cc: kernel-team, linux-kernel, linux-btrfs, peterz, jianchao.w.wang,
	Bart.VanAssche, linux-block

Hello,

Changes from [v4]

- Comments added.  Patch description updated.

Changes from [v3]

- Rebased on top of for-4.16/block.

- Integrated Jens's hctx_[un]lock() factoring patch and refreshed the
  patches accordingly.

- Added comment explaining the use of hctx_lock() instead of
  rcu_read_lock() in completion path.

Changes from [v2]

- Possible extended looping around seqcount and u64_stat_sync fixed.

- Misplaced MQ_RQ_IDLE state setting fixed.

- RQF_MQ_TIMEOUT_EXPIRED added to prevent firing the same timeout
  multiple times.

- s/queue_rq_src/srcu/ patch added.

- Other misc changes.

Changes from [v1]

- BLK_EH_RESET_TIMER handling fixed.

- s/request->gstate_seqc/request->gstate_seq/

- READ_ONCE() added to blk_mq_rq_udpate_state().

- Removed left over blk_clear_rq_complete() invocation from
  blk_mq_rq_timed_out().

Currently, blk-mq timeout path synchronizes against the usual
issue/completion path using a complex scheme involving atomic
bitflags, REQ_ATOM_*, memory barriers and subtle memory coherence
rules.  Unfortunatley, it contains quite a few holes.

It's pretty easy to make blk_mq_check_expired() terminate a later
instance of a request.  If we induce 5 sec delay before
time_after_eq() test in blk_mq_check_expired(), shorten the timeout to
2s, and issue back-to-back large IOs, blk-mq starts timing out
requests spuriously pretty quickly.  Nothing actually timed out.  It
just made the call on a recycle instance of a request and then
terminated a later instance long after the original instance finished.
The scenario isn't theoretical either.

This patchset replaces the broken synchronization mechanism with a RCU
and generation number based one.  Please read the patch description of
the second path for more details.

This patchset contains the following eight patches.

0001-blk-mq-move-hctx-lock-unlock-into-a-helper.patch
0002-blk-mq-protect-completion-path-with-RCU.patch
0003-blk-mq-replace-timeout-synchronization-with-a-RCU-an.patch
0004-blk-mq-use-blk_mq_rq_state-instead-of-testing-REQ_AT.patch
0005-blk-mq-make-blk_abort_request-trigger-timeout-path.patch
0006-blk-mq-remove-REQ_ATOM_COMPLETE-usages-from-blk-mq.patch
0007-blk-mq-remove-REQ_ATOM_STARTED.patch
0008-blk-mq-rename-blk_mq_hw_ctx-queue_rq_srcu-to-srcu.patch

and is available in the following git branch.

 git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc.git blk-mq-timeout-v5

diffstat follows.  Thanks.

 block/blk-core.c       |    2 
 block/blk-mq-debugfs.c |    4 
 block/blk-mq.c         |  346 +++++++++++++++++++++++++++++--------------------
 block/blk-mq.h         |   49 ++++++
 block/blk-timeout.c    |   16 +-
 block/blk.h            |    7 
 include/linux/blk-mq.h |    3 
 include/linux/blkdev.h |   25 +++
 8 files changed, 294 insertions(+), 158 deletions(-)

--
tejun

[v4] http://lkml.kernel.org/r/20180108191542.379478-1-tj@kernel.org
[v3] http://lkml.kernel.org/r/20171216120726.517153-1-tj@kernel.org
[v2] http://lkml.kernel.org/r/20171010155441.753966-1-tj@kernel.org
[v1] http://lkml.kernel.org/r/20171209192525.982030-1-tj@kernel.org

^ permalink raw reply	[flat|nested] 29+ messages in thread
* [PATCHSET v4] blk-mq: reimplement timeout handling
@ 2018-01-08 19:15 Tejun Heo
  2018-01-08 19:15 ` [PATCH 5/8] blk-mq: make blk_abort_request() trigger timeout path Tejun Heo
  0 siblings, 1 reply; 29+ messages in thread
From: Tejun Heo @ 2018-01-08 19:15 UTC (permalink / raw)
  To: jack, axboe, clm, jbacik
  Cc: kernel-team, linux-kernel, linux-btrfs, peterz, jianchao.w.wang,
	Bart.VanAssche, linux-block

Hello,

Changes from [v3]

- Rebased on top of for-4.16/block.

- Integrated Jens's hctx_[un]lock() factoring patch and refreshed the
  patches accordingly.

- Added comment explaining the use of hctx_lock() instead of
  rcu_read_lock() in completion path.

Changes from [v2]

- Possible extended looping around seqcount and u64_stat_sync fixed.

- Misplaced MQ_RQ_IDLE state setting fixed.

- RQF_MQ_TIMEOUT_EXPIRED added to prevent firing the same timeout
  multiple times.

- s/queue_rq_src/srcu/ patch added.

- Other misc changes.

Changes from [v1]

- BLK_EH_RESET_TIMER handling fixed.

- s/request->gstate_seqc/request->gstate_seq/

- READ_ONCE() added to blk_mq_rq_udpate_state().

- Removed left over blk_clear_rq_complete() invocation from
  blk_mq_rq_timed_out().

Currently, blk-mq timeout path synchronizes against the usual
issue/completion path using a complex scheme involving atomic
bitflags, REQ_ATOM_*, memory barriers and subtle memory coherence
rules.  Unfortunatley, it contains quite a few holes.

It's pretty easy to make blk_mq_check_expired() terminate a later
instance of a request.  If we induce 5 sec delay before
time_after_eq() test in blk_mq_check_expired(), shorten the timeout to
2s, and issue back-to-back large IOs, blk-mq starts timing out
requests spuriously pretty quickly.  Nothing actually timed out.  It
just made the call on a recycle instance of a request and then
terminated a later instance long after the original instance finished.
The scenario isn't theoretical either.

This patchset replaces the broken synchronization mechanism with a RCU
and generation number based one.  Please read the patch description of
the second path for more details.

This patchset contains the following eight patches.

0001-blk-mq-move-hctx-lock-unlock-into-a-helper.patch
0002-blk-mq-protect-completion-path-with-RCU.patch
0003-blk-mq-replace-timeout-synchronization-with-a-RCU-an.patch
0004-blk-mq-use-blk_mq_rq_state-instead-of-testing-REQ_AT.patch
0005-blk-mq-make-blk_abort_request-trigger-timeout-path.patch
0006-blk-mq-remove-REQ_ATOM_COMPLETE-usages-from-blk-mq.patch
0007-blk-mq-remove-REQ_ATOM_STARTED.patch
0008-blk-mq-rename-blk_mq_hw_ctx-queue_rq_srcu-to-srcu.patch

and is available in the following git branch.

 git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc.git blk-mq-timeout-v4

diffstat follows.  Thanks.

 block/blk-core.c       |    2 
 block/blk-mq-debugfs.c |    4 
 block/blk-mq.c         |  340 ++++++++++++++++++++++++++++---------------------
 block/blk-mq.h         |   49 ++++++-
 block/blk-timeout.c    |   11 -
 block/blk.h            |    7 -
 include/linux/blk-mq.h |    3 
 include/linux/blkdev.h |   25 +++
 8 files changed, 283 insertions(+), 158 deletions(-)

--
tejun

[v3] http://lkml.kernel.org/r/20171216120726.517153-1-tj@kernel.org
[v2] http://lkml.kernel.org/r/20171010155441.753966-1-tj@kernel.org
[v1] http://lkml.kernel.org/r/20171209192525.982030-1-tj@kernel.org

^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2018-01-14 15:45 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-01-09 16:29 [PATCHSET v5] blk-mq: reimplement timeout handling Tejun Heo
2018-01-09 16:29 ` [PATCH 1/8] blk-mq: move hctx lock/unlock into a helper Tejun Heo
2018-01-09 16:29 ` [PATCH 2/8] blk-mq: protect completion path with RCU Tejun Heo
2018-01-09 16:29 ` [PATCH 3/8] blk-mq: replace timeout synchronization with a RCU and generation based scheme Tejun Heo
2018-01-09 16:29 ` [PATCH 4/8] blk-mq: use blk_mq_rq_state() instead of testing REQ_ATOM_COMPLETE Tejun Heo
2018-01-09 16:29 ` [PATCH 5/8] blk-mq: make blk_abort_request() trigger timeout path Tejun Heo
2018-01-09 16:29 ` [PATCH 6/8] blk-mq: remove REQ_ATOM_COMPLETE usages from blk-mq Tejun Heo
2018-01-09 16:29 ` [PATCH 7/8] blk-mq: remove REQ_ATOM_STARTED Tejun Heo
2018-01-09 16:29 ` [PATCH 8/8] blk-mq: rename blk_mq_hw_ctx->queue_rq_srcu to ->srcu Tejun Heo
2018-01-09 16:53 ` [PATCHSET v5] blk-mq: reimplement timeout handling Jens Axboe
2018-01-12 20:57 ` Bart Van Assche
2018-01-12 20:57   ` Bart Van Assche
2018-01-12 21:07   ` Jens Axboe
2018-01-12 21:19     ` Bart Van Assche
2018-01-12 21:55       ` Jens Axboe
2018-01-14 15:12       ` jianchao.wang
2018-01-14 15:45         ` Ming Lei
2018-01-12 21:55   ` Laurence Oberman
2018-01-12 21:55     ` Laurence Oberman
2018-01-13 12:39     ` Ming Lei
2018-01-13 12:39       ` Ming Lei
2018-01-13 14:45     ` Ming Lei
2018-01-13 14:45       ` Ming Lei
2018-01-13 17:59       ` Ming Lei
2018-01-13 17:59         ` Ming Lei
  -- strict thread matches above, loose matches on Subject: below --
2018-01-08 19:15 [PATCHSET v4] " Tejun Heo
2018-01-08 19:15 ` [PATCH 5/8] blk-mq: make blk_abort_request() trigger timeout path Tejun Heo
2018-01-08 22:10   ` Bart Van Assche
2018-01-08 22:10     ` Bart Van Assche
2018-01-09 16:02     ` tj

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.