All of lore.kernel.org
 help / color / mirror / Atom feed
From: jmoyer@redhat.com (Jeff Moyer)
Subject: [PATCH 04/47] block: provide a new BLK_EH_QUIESCED timeout return value
Date: Tue, 24 Nov 2015 10:51:04 -0500	[thread overview]
Message-ID: <x491tbf2y6v.fsf@segfault.boston.devel.redhat.com> (raw)
In-Reply-To: <20151124154019.GB5939@lst.de> (Christoph Hellwig's message of "Tue, 24 Nov 2015 16:40:19 +0100")

Christoph Hellwig <hch at lst.de> writes:

> On Tue, Nov 24, 2015@10:16:51AM -0500, Jeff Moyer wrote:
>> Hi Christoph,
>> 
>> Christoph Hellwig <hch at lst.de> writes:
>> 
>> > This marks the request as one that's not actually completed yet, but
>> > should be reaped next time blk_mq_complete_request comes in.  This is
>> > useful it the abort handler kicked of a reset that will complete all
>> > pending requests.
>> 
>> What's the purpose, though?  Is this an optimization?
>
> It allows us to to correctly implement controller reset (like SCSI target
> resets) from the timeout handler.  The current HANDLED/NOT_HANDLED returns
> are not very useful if you want to eventually kick of a reset that will
> abort all requests, but needs to ensure the the requests don't get reused
> before that.  Only SCSI handles that for now, and needs it's own per-LUN
> command list and a lot of complex code for that - something we'd like
> to avoid for NVMe or other new drivers.

Thanks for the explanation.  One more question below.

>> We've had "fun" problems with races between completion and timeout
>> before.  I can't say I'm too keen on adding more complexity to this code
>> path.  Have you considered what happens in your new code when this race
>> occurs?  I don't expect it to cause any issues in the mq case, since the
>> timeout handler should run on the same cpu as the completion code for a
>> given request (right?).  However, for the old code path, they could run
>> in parallel.
>> 
>> blk_complete_request:
>> A  if (!blk_mark_rq_complete(rq) ||
>> B      test_and_cleart_bit(REQ_ATOM_QUIESCED, &req->atomic_flags)) {
>> C        __blk_mq_complete_request(rq);
>> 
>> could run alongside of:
>> 
>> blk_rq_check_expired:
>> 1 if (!blk_mark_rq_complete(rq))
>> 2   blk_rq_timed_out(rq);
>> 
>> So, if 1 comes before A, we have two cases to consider:
>> 
>> i.  the expiration path does not yet set REQ_ATOM_QUIESCED before the
>>     completion code runs, and so the completion code does nothing.
>
> The command has timed out and sets REQ_ATOM_COMPLETED first,
> the the actual completion comes in and does indeed nothing.  We now
> set REQ_ATOM_QUIESCED and kick off a controller reset, which will
> ultimatively complete all commands using blk_mq_complete_request.
> Now REQ_ATOM_QUIESCED is set on the command that caused the timeout,
> so it will be completed as well.
>
>> ii. the expiration path *does* SET REQ_ATOM_QUIESCED.  In this instance,
>>     will we get yet another completion for the request when the command
>>     is ultimately retired by the adapter reset?
>
> The command has timed out and sets REQ_ATOM_COMPLETED first, then
> REQ_ATOM_QUIESCED as well.  Now the actual completion comes in and does
> nothing because REQ_ATOM_COMPLETED was set.  We will then kick off the

See B above.  REQ_ATOM_COMPLETE is set, so the first half of that
statement is false, but then test_and_clear_bit(REQ_ATOM_QUIESCED...)
returns true, so we call __blk_complete_request.  So the question is,
will we get a double completion for that request after the reset is
performed?

-Jeff

p.s. That should be __blk_complete_request up there in 'C'.

> controller reset, which will ultimatively complete all commands using
> blk_mq_complete_request. Now REQ_ATOM_QUIESCED is set on the command that
> caused the timeout, so it will be completed as well.


> --
> To unsubscribe from this list: send the line "unsubscribe linux-block" in
> the body of a message to majordomo at vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

  reply	other threads:[~2015-11-24 15:51 UTC|newest]

Thread overview: 67+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-11-20 16:34 NVMe mega patchbomb for Linux 4.5-rc Christoph Hellwig
2015-11-20 16:34 ` [PATCH 01/47] nvme: add missing unmaps in nvme_queue_rq Christoph Hellwig
2015-11-20 16:34   ` Christoph Hellwig
2015-11-20 16:34 ` [PATCH 02/47] block: fix blk_abort_request for blk-mq drivers Christoph Hellwig
2015-11-20 21:43   ` Jeff Moyer
2015-11-20 21:47     ` Jens Axboe
2015-11-20 21:54       ` Jeff Moyer
2015-11-20 22:20   ` Jeff Moyer
2015-11-21  7:34     ` Christoph Hellwig
2015-11-20 16:34 ` [PATCH 03/47] block: defer timeouts to a workqueue Christoph Hellwig
2015-11-23 20:31   ` Jeff Moyer
2015-11-23 20:48     ` Christoph Hellwig
2015-11-23 20:59       ` Jeff Moyer
2015-11-20 16:34 ` [PATCH 04/47] block: provide a new BLK_EH_QUIESCED timeout return value Christoph Hellwig
2015-11-24 15:16   ` Jeff Moyer
2015-11-24 15:40     ` Christoph Hellwig
2015-11-24 15:51       ` Jeff Moyer [this message]
2015-11-24 15:56         ` Christoph Hellwig
2015-11-24 16:34           ` Jeff Moyer
2015-11-24 16:47             ` Keith Busch
2015-11-24 17:56             ` Christoph Hellwig
2015-11-24 18:12               ` Jeff Moyer
2015-11-24 19:40                 ` Christoph Hellwig
2015-11-20 16:35 ` [PATCH 05/47] block: remoe REQ_ATOM_COMPLETE wrappers Christoph Hellwig
2015-11-23 21:23   ` Jeff Moyer
2015-11-24 21:22   ` Jens Axboe
2015-11-20 16:35 ` [PATCH 06/47] blk-mq: add a flags parameter to blk_mq_alloc_request Christoph Hellwig
2015-11-24 15:19   ` Jeff Moyer
2015-11-24 15:32     ` Christoph Hellwig
2015-11-24 21:21   ` Jens Axboe
2015-11-24 22:22     ` Christoph Hellwig
2015-11-24 22:25       ` Jens Axboe
2015-11-20 16:35 ` [PATCH 07/47] nvme: move struct nvme_iod to pci.c Christoph Hellwig
2015-11-20 16:35 ` [PATCH 08/47] nvme: split command submission helpers out of pci.c Christoph Hellwig
2015-11-20 16:35 ` [PATCH 09/47] nvme: add a vendor field to struct nvme_dev Christoph Hellwig
2015-11-20 16:35 ` [PATCH 10/47] nvme: use offset instead of a struct for registers Christoph Hellwig
2015-11-20 16:35 ` [PATCH 11/47] nvme: split a new struct nvme_ctrl out of struct nvme_dev Christoph Hellwig
2015-11-20 16:35 ` [PATCH 12/47] nvme: simplify nvme_setup_prps calling convention Christoph Hellwig
2015-11-20 16:35 ` [PATCH 13/47] nvme: refactor nvme_queue_rq Christoph Hellwig
2015-11-20 16:35 ` [PATCH 14/47] nvme: move nvme_error_status to common code Christoph Hellwig
2015-11-20 16:35 ` [PATCH 15/47] nvme: move nvme_setup_flush and nvme_setup_rw " Christoph Hellwig
2015-11-20 16:35 ` [PATCH 16/47] nvme: split __nvme_submit_sync_cmd Christoph Hellwig
2015-11-20 16:35 ` [PATCH 17/47] nvme: use the block layer for userspace passthrough metadata Christoph Hellwig
2015-11-20 16:35 ` [PATCH 18/47] nvme: move block_device_operations and ns/ctrl freeing to common code Christoph Hellwig
2015-11-20 16:35 ` [PATCH 19/47] nvme: add explicit quirk handling Christoph Hellwig
2015-11-20 16:35 ` [PATCH 20/47] nvme: add a common helper to read Identify Controller data Christoph Hellwig
2015-11-20 16:35 ` [PATCH 21/47] nvme: move the call to nvme_init_identify earlier Christoph Hellwig
2015-11-20 16:35 ` [PATCH 22/47] nvme: move namespace scanning to common code Christoph Hellwig
2015-11-20 16:35 ` [PATCH 23/47] nvme: move chardev and sysfs interface " Christoph Hellwig
2015-11-20 16:35 ` [PATCH 24/47] nvme: only add a controller to dev_list after it's been fully initialized Christoph Hellwig
2015-11-20 16:35 ` [PATCH 25/47] nvme: don't take the I/O queue q_lock in nvme_timeout Christoph Hellwig
2015-11-20 16:35   ` Christoph Hellwig
2017-03-10 12:51   ` David Woodhouse
2017-03-10 12:51     ` David Woodhouse
2017-03-10 14:24     ` Christoph Hellwig
2017-03-10 14:24       ` Christoph Hellwig
2015-11-20 16:35 ` [PATCH 26/47] nvme: merge nvme_abort_req and nvme_timeout Christoph Hellwig
2015-11-20 16:35 ` [PATCH 27/47] nvme: do not restart the request timeout if we're resetting the controller Christoph Hellwig
2015-11-20 16:35 ` [PATCH 28/47] nvme: simplify resets Christoph Hellwig
2015-11-20 16:35 ` [PATCH 29/47] nvme: merge probe_work and reset_work Christoph Hellwig
2015-11-20 16:35 ` [PATCH 30/47] nvme: remove dead controllers from a work item Christoph Hellwig
2015-11-20 16:35 ` [PATCH 31/47] nvme: switch abort_limit to an atomic_t Christoph Hellwig
2015-11-20 16:35 ` [PATCH 32/47] NVMe: Implement namespace list scanning Christoph Hellwig
2015-11-20 16:35 ` [PATCH 33/47] NVMe: Use unbounded work queue for all work Christoph Hellwig
2015-11-20 16:35 ` [PATCH 34/47] NVMe: Remove device management handles on remove Christoph Hellwig
2015-11-20 16:50 ` NVMe mega patchbomb for Linux 4.5-rc Christoph Hellwig
2015-11-21  7:19 NVMe mega patchbomb for Linux 4.5-rc (resend) Christoph Hellwig
2015-11-21  7:19 ` [PATCH 04/47] block: provide a new BLK_EH_QUIESCED timeout return value Christoph Hellwig

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=x491tbf2y6v.fsf@segfault.boston.devel.redhat.com \
    --to=jmoyer@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.