From: Tejun Heo <tj@kernel.org>
To: jack@suse.cz, axboe@kernel.dk, clm@fb.com, jbacik@fb.com
Cc: kernel-team@fb.com, linux-kernel@vger.kernel.org,
	linux-btrfs@vger.kernel.org, peterz@infradead.org,
	jianchao.w.wang@oracle.com, Bart.VanAssche@wdc.com
Subject: [PATCHSET v3] blk-mq: reimplement timeout handling
Date: Sat, 16 Dec 2017 04:07:19 -0800
Message-ID: <20171216120726.517153-1-tj@kernel.org>

Hello,

Changes from [v2]

- Possible extended looping around seqcount and u64_stats_sync fixed.

- Misplaced MQ_RQ_IDLE state setting fixed.

- RQF_MQ_TIMEOUT_EXPIRED added to prevent firing the same timeout
  multiple times.

- s/queue_rq_src/srcu/ patch added.

- Other misc changes.

Changes from [v1]

- BLK_EH_RESET_TIMER handling fixed.

- s/request->gstate_seqc/request->gstate_seq/

- READ_ONCE() added to blk_mq_rq_update_state().

- Removed left over blk_clear_rq_complete() invocation from
  blk_mq_rq_timed_out().

Currently, the blk-mq timeout path synchronizes against the usual
issue/completion path using a complex scheme involving atomic
bitflags, REQ_ATOM_*, memory barriers and subtle memory coherence
rules.  Unfortunately, it contains quite a few holes.

It's pretty easy to make blk_mq_check_expired() terminate a later
instance of a request.  If we induce a 5 sec delay before the
time_after_eq() test in blk_mq_check_expired(), shorten the timeout to
2s, and issue back-to-back large IOs, blk-mq starts timing out
requests spuriously pretty quickly.  Nothing actually timed out.  It
just made the call on a recycled instance of a request and then
terminated a later instance long after the original instance finished.
The scenario isn't theoretical either.

This patchset replaces the broken synchronization mechanism with an RCU
and generation number based one.  Please read the patch description of
the second patch for more details.
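
To illustrate the generation number idea, here is a minimal
single-threaded userspace sketch.  It is not code from the patchset:
all names (struct request layout, rq_start(), rq_check_expired(),
rq_terminate_if_same_instance(), the 2-bit state field) are
hypothetical, and it leaves out the RCU and seqcount synchronization
that the real issue/completion and timeout paths need.  It only shows
why recording the generation when the timeout decision is made lets
the timeout path refuse to terminate a later instance.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names throughout; not the patchset's definitions. */
enum rq_state { RQ_IDLE = 0, RQ_IN_FLIGHT = 1, RQ_COMPLETE = 2 };

#define STATE_BITS 2
#define STATE_MASK ((1ULL << STATE_BITS) - 1)

struct request {
	uint64_t gstate;	/* generation in the upper bits, state in the low 2 */
	uint64_t deadline;	/* absolute expiry time, arbitrary units */
};

static enum rq_state rq_state(uint64_t gstate)
{
	return (enum rq_state)(gstate & STATE_MASK);
}

static uint64_t rq_generation(uint64_t gstate)
{
	return gstate >> STATE_BITS;
}

/* Issue path: every (re)use of the request bumps the generation. */
static void rq_start(struct request *rq, uint64_t now, uint64_t timeout)
{
	uint64_t gen;

	assert(rq_state(rq->gstate) != RQ_IN_FLIGHT);
	gen = rq_generation(rq->gstate) + 1;
	rq->gstate = (gen << STATE_BITS) | RQ_IN_FLIGHT;
	rq->deadline = now + timeout;
}

/* Completion path: the generation is kept, only the state changes. */
static void rq_complete(struct request *rq)
{
	rq->gstate = (rq->gstate & ~STATE_MASK) | RQ_COMPLETE;
}

/* What the timeout path saw when it made its decision. */
struct timeout_snapshot {
	uint64_t gstate;
	bool expired;
};

/* Timeout path, step 1: decide and remember which instance it was. */
static struct timeout_snapshot rq_check_expired(const struct request *rq,
						uint64_t now)
{
	struct timeout_snapshot s = { rq->gstate, false };

	s.expired = rq_state(s.gstate) == RQ_IN_FLIGHT && now >= rq->deadline;
	return s;
}

/*
 * Timeout path, step 2, possibly much later: terminate only if the
 * request is still the very instance the decision was made on.
 */
static bool rq_terminate_if_same_instance(struct request *rq,
					  struct timeout_snapshot s)
{
	if (!s.expired || rq->gstate != s.gstate)
		return false;	/* completed or recycled in the meantime */
	rq->gstate = (rq->gstate & ~STATE_MASK) | RQ_COMPLETE;
	return true;
}
```

If a request is started, judged expired, then completed and started
again before the second step runs, the stored gstate no longer matches
and the later instance is left alone.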

This patchset contains the following seven patches.

0001-blk-mq-protect-completion-path-with-RCU.patch
0002-blk-mq-replace-timeout-synchronization-with-a-RCU-an.patch
0003-blk-mq-use-blk_mq_rq_state-instead-of-testing-REQ_AT.patch
0004-blk-mq-make-blk_abort_request-trigger-timeout-path.patch
0005-blk-mq-remove-REQ_ATOM_COMPLETE-usages-from-blk-mq.patch
0006-blk-mq-remove-REQ_ATOM_STARTED.patch
0007-blk-mq-rename-blk_mq_hw_ctx-queue_rq_srcu-to-srcu.patch

and is available in the following git branch.

 git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc.git blk-mq-timeout-v3

diffstat follows.  Thanks.

 block/blk-core.c       |    2 
 block/blk-mq-debugfs.c |    4 
 block/blk-mq.c         |  285 +++++++++++++++++++++++++++++--------------------
 block/blk-mq.h         |   49 ++++++++
 block/blk-timeout.c    |   11 +
 block/blk.h            |    7 -
 include/linux/blk-mq.h |    3 
 include/linux/blkdev.h |   25 ++++
 8 files changed, 257 insertions(+), 129 deletions(-)

--
tejun

[v2] http://lkml.kernel.org/r/20171010155441.753966-1-tj@kernel.org
[v1] http://lkml.kernel.org/r/20171209192525.982030-1-tj@kernel.org

Thread overview: 20+ messages
2017-12-16 12:07 Tejun Heo [this message]
2017-12-16 12:07 ` [PATCH 1/7] blk-mq: protect completion path with RCU Tejun Heo
2017-12-29 10:04   ` Christoph Hellwig
2018-01-08 17:12     ` Tejun Heo
2017-12-16 12:07 ` [PATCH 2/7] blk-mq: replace timeout synchronization with a RCU and generation based scheme Tejun Heo
2017-12-16 12:07 ` [PATCH 3/7] blk-mq: use blk_mq_rq_state() instead of testing REQ_ATOM_COMPLETE Tejun Heo
2017-12-16 12:07 ` [PATCH 4/7] blk-mq: make blk_abort_request() trigger timeout path Tejun Heo
2017-12-29 10:07   ` Christoph Hellwig
2017-12-16 12:07 ` [PATCH 5/7] blk-mq: remove REQ_ATOM_COMPLETE usages from blk-mq Tejun Heo
2017-12-21  3:56   ` jianchao.wang
2017-12-21 13:50     ` Tejun Heo
2017-12-22  4:02       ` jianchao.wang
2018-01-08 17:27         ` Tejun Heo
2018-01-09  3:08           ` jianchao.wang
2018-01-09  3:37             ` Tejun Heo
2018-01-09  5:59               ` jianchao.wang
2017-12-16 12:07 ` [PATCH 6/7] blk-mq: remove REQ_ATOM_STARTED Tejun Heo
2017-12-16 12:07 ` [PATCH 7/7] blk-mq: rename blk_mq_hw_ctx->queue_rq_srcu to ->srcu Tejun Heo
2017-12-29 10:02 ` [PATCHSET v3] blk-mq: reimplement timeout handling Christoph Hellwig
2018-01-08 17:03   ` Tejun Heo
