All of lore.kernel.org
 help / color / mirror / Atom feed
From: hch@lst.de (Christoph Hellwig)
Subject: [PATCH 04/47] block: provide a new BLK_EH_QUIESCED timeout return value
Date: Tue, 24 Nov 2015 16:40:19 +0100	[thread overview]
Message-ID: <20151124154019.GB5939@lst.de> (raw)
In-Reply-To: <x49mvu32zrw.fsf@segfault.boston.devel.redhat.com>

On Tue, Nov 24, 2015@10:16:51AM -0500, Jeff Moyer wrote:
> Hi Christoph,
> 
> Christoph Hellwig <hch at lst.de> writes:
> 
> > This marks the request as one that's not actually completed yet, but
> > should be reaped next time blk_mq_complete_request comes in.  This is
> > useful it the abort handler kicked of a reset that will complete all
> > pending requests.
> 
> What's the purpose, though?  Is this an optimization?

It allows us to to correctly implement controller reset (like SCSI target
resets) from the timeout handler.  The current HANDLED/NOT_HANDLED returns
are not very useful if you want to eventually kick of a reset that will
abort all requests, but needs to ensure the the requests don't get reused
before that.  Only SCSI handles that for now, and needs it's own per-LUN
command list and a lot of complex code for that - something we'd like
to avoid for NVMe or other new drivers.

> We've had "fun" problems with races between completion and timeout
> before.  I can't say I'm too keen on adding more complexity to this code
> path.  Have you considered what happens in your new code when this race
> occurs?  I don't expect it to cause any issues in the mq case, since the
> timeout handler should run on the same cpu as the completion code for a
> given request (right?).  However, for the old code path, they could run
> in parallel.
> 
> blk_complete_request:
> A  if (!blk_mark_rq_complete(rq) ||
> B      test_and_cleart_bit(REQ_ATOM_QUIESCED, &req->atomic_flags)) {
> C        __blk_mq_complete_request(rq);
> 
> could run alongside of:
> 
> blk_rq_check_expired:
> 1 if (!blk_mark_rq_complete(rq))
> 2   blk_rq_timed_out(rq);
> 
> So, if 1 comes before A, we have two cases to consider:
> 
> i.  the expiration path does not yet set REQ_ATOM_QUIESCED before the
>     completion code runs, and so the completion code does nothing.

The command has timed out and sets REQ_ATOM_COMPLETED first,
the the actual completion comes in and does indeed nothing.  We now
set REQ_ATOM_QUIESCED and kick off a controller reset, which will
ultimatively complete all commands using blk_mq_complete_request.
Now REQ_ATOM_QUIESCED is set on the command that caused the timeout,
so it will be completed as well.

> ii. the expiration path *does* SET REQ_ATOM_QUIESCED.  In this instance,
>     will we get yet another completion for the request when the command
>     is ultimately retired by the adapter reset?

The command has timed out and sets REQ_ATOM_COMPLETED first, then
REQ_ATOM_QUIESCED as well.  Now the actual completion comes in and does
nothing because REQ_ATOM_COMPLETED was set.  We will then kick off the
controller reset, which will ultimatively complete all commands using
blk_mq_complete_request. Now REQ_ATOM_QUIESCED is set on the command that
caused the timeout, so it will be completed as well.

  reply	other threads:[~2015-11-24 15:40 UTC|newest]

Thread overview: 67+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-11-20 16:34 NVMe mega patchbomb for Linux 4.5-rc Christoph Hellwig
2015-11-20 16:34 ` [PATCH 01/47] nvme: add missing unmaps in nvme_queue_rq Christoph Hellwig
2015-11-20 16:34   ` Christoph Hellwig
2015-11-20 16:34 ` [PATCH 02/47] block: fix blk_abort_request for blk-mq drivers Christoph Hellwig
2015-11-20 21:43   ` Jeff Moyer
2015-11-20 21:47     ` Jens Axboe
2015-11-20 21:54       ` Jeff Moyer
2015-11-20 22:20   ` Jeff Moyer
2015-11-21  7:34     ` Christoph Hellwig
2015-11-20 16:34 ` [PATCH 03/47] block: defer timeouts to a workqueue Christoph Hellwig
2015-11-23 20:31   ` Jeff Moyer
2015-11-23 20:48     ` Christoph Hellwig
2015-11-23 20:59       ` Jeff Moyer
2015-11-20 16:34 ` [PATCH 04/47] block: provide a new BLK_EH_QUIESCED timeout return value Christoph Hellwig
2015-11-24 15:16   ` Jeff Moyer
2015-11-24 15:40     ` Christoph Hellwig [this message]
2015-11-24 15:51       ` Jeff Moyer
2015-11-24 15:56         ` Christoph Hellwig
2015-11-24 16:34           ` Jeff Moyer
2015-11-24 16:47             ` Keith Busch
2015-11-24 17:56             ` Christoph Hellwig
2015-11-24 18:12               ` Jeff Moyer
2015-11-24 19:40                 ` Christoph Hellwig
2015-11-20 16:35 ` [PATCH 05/47] block: remoe REQ_ATOM_COMPLETE wrappers Christoph Hellwig
2015-11-23 21:23   ` Jeff Moyer
2015-11-24 21:22   ` Jens Axboe
2015-11-20 16:35 ` [PATCH 06/47] blk-mq: add a flags parameter to blk_mq_alloc_request Christoph Hellwig
2015-11-24 15:19   ` Jeff Moyer
2015-11-24 15:32     ` Christoph Hellwig
2015-11-24 21:21   ` Jens Axboe
2015-11-24 22:22     ` Christoph Hellwig
2015-11-24 22:25       ` Jens Axboe
2015-11-20 16:35 ` [PATCH 07/47] nvme: move struct nvme_iod to pci.c Christoph Hellwig
2015-11-20 16:35 ` [PATCH 08/47] nvme: split command submission helpers out of pci.c Christoph Hellwig
2015-11-20 16:35 ` [PATCH 09/47] nvme: add a vendor field to struct nvme_dev Christoph Hellwig
2015-11-20 16:35 ` [PATCH 10/47] nvme: use offset instead of a struct for registers Christoph Hellwig
2015-11-20 16:35 ` [PATCH 11/47] nvme: split a new struct nvme_ctrl out of struct nvme_dev Christoph Hellwig
2015-11-20 16:35 ` [PATCH 12/47] nvme: simplify nvme_setup_prps calling convention Christoph Hellwig
2015-11-20 16:35 ` [PATCH 13/47] nvme: refactor nvme_queue_rq Christoph Hellwig
2015-11-20 16:35 ` [PATCH 14/47] nvme: move nvme_error_status to common code Christoph Hellwig
2015-11-20 16:35 ` [PATCH 15/47] nvme: move nvme_setup_flush and nvme_setup_rw " Christoph Hellwig
2015-11-20 16:35 ` [PATCH 16/47] nvme: split __nvme_submit_sync_cmd Christoph Hellwig
2015-11-20 16:35 ` [PATCH 17/47] nvme: use the block layer for userspace passthrough metadata Christoph Hellwig
2015-11-20 16:35 ` [PATCH 18/47] nvme: move block_device_operations and ns/ctrl freeing to common code Christoph Hellwig
2015-11-20 16:35 ` [PATCH 19/47] nvme: add explicit quirk handling Christoph Hellwig
2015-11-20 16:35 ` [PATCH 20/47] nvme: add a common helper to read Identify Controller data Christoph Hellwig
2015-11-20 16:35 ` [PATCH 21/47] nvme: move the call to nvme_init_identify earlier Christoph Hellwig
2015-11-20 16:35 ` [PATCH 22/47] nvme: move namespace scanning to common code Christoph Hellwig
2015-11-20 16:35 ` [PATCH 23/47] nvme: move chardev and sysfs interface " Christoph Hellwig
2015-11-20 16:35 ` [PATCH 24/47] nvme: only add a controller to dev_list after it's been fully initialized Christoph Hellwig
2015-11-20 16:35 ` [PATCH 25/47] nvme: don't take the I/O queue q_lock in nvme_timeout Christoph Hellwig
2015-11-20 16:35   ` Christoph Hellwig
2017-03-10 12:51   ` David Woodhouse
2017-03-10 12:51     ` David Woodhouse
2017-03-10 14:24     ` Christoph Hellwig
2017-03-10 14:24       ` Christoph Hellwig
2015-11-20 16:35 ` [PATCH 26/47] nvme: merge nvme_abort_req and nvme_timeout Christoph Hellwig
2015-11-20 16:35 ` [PATCH 27/47] nvme: do not restart the request timeout if we're resetting the controller Christoph Hellwig
2015-11-20 16:35 ` [PATCH 28/47] nvme: simplify resets Christoph Hellwig
2015-11-20 16:35 ` [PATCH 29/47] nvme: merge probe_work and reset_work Christoph Hellwig
2015-11-20 16:35 ` [PATCH 30/47] nvme: remove dead controllers from a work item Christoph Hellwig
2015-11-20 16:35 ` [PATCH 31/47] nvme: switch abort_limit to an atomic_t Christoph Hellwig
2015-11-20 16:35 ` [PATCH 32/47] NVMe: Implement namespace list scanning Christoph Hellwig
2015-11-20 16:35 ` [PATCH 33/47] NVMe: Use unbounded work queue for all work Christoph Hellwig
2015-11-20 16:35 ` [PATCH 34/47] NVMe: Remove device management handles on remove Christoph Hellwig
2015-11-20 16:50 ` NVMe mega patchbomb for Linux 4.5-rc Christoph Hellwig
2015-11-21  7:19 NVMe mega patchbomb for Linux 4.5-rc (resend) Christoph Hellwig
2015-11-21  7:19 ` [PATCH 04/47] block: provide a new BLK_EH_QUIESCED timeout return value Christoph Hellwig

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20151124154019.GB5939@lst.de \
    --to=hch@lst.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.