linux-nvme.lists.infradead.org archive mirror
From: jmoyer@redhat.com (Jeff Moyer)
Subject: [PATCH 04/47] block: provide a new BLK_EH_QUIESCED timeout return value
Date: Tue, 24 Nov 2015 10:16:51 -0500
Message-ID: <x49mvu32zrw.fsf@segfault.boston.devel.redhat.com>
In-Reply-To: <1448037342-18384-5-git-send-email-hch@lst.de> (Christoph Hellwig's message of "Fri, 20 Nov 2015 17:34:59 +0100")

Hi Christoph,

Christoph Hellwig <hch at lst.de> writes:

> This marks the request as one that's not actually completed yet, but
> should be reaped next time blk_mq_complete_request comes in.  This is
> useful if the abort handler kicked off a reset that will complete all
> pending requests.

What's the purpose, though?  Is this an optimization?

We've had "fun" problems with races between completion and timeout
before.  I can't say I'm too keen on adding more complexity to this code
path.  Have you considered what happens in your new code when this race
occurs?  I don't expect it to cause any issues in the mq case, since the
timeout handler should run on the same cpu as the completion code for a
given request (right?).  However, for the old code path, they could run
in parallel.

blk_complete_request:
A  if (!blk_mark_rq_complete(req) ||
B      test_and_clear_bit(REQ_ATOM_QUIESCED, &req->atomic_flags))
C        __blk_complete_request(req);

could run alongside of:

blk_rq_check_expired:
1 if (!blk_mark_rq_complete(rq))
2   blk_rq_timed_out(rq);

So, if 1 comes before A, we have two cases to consider:

i.  the expiration path has not yet set REQ_ATOM_QUIESCED when the
    completion code runs, and so the completion code does nothing.
ii. the expiration path *does* set REQ_ATOM_QUIESCED.  In this instance,
    will we get yet another completion for the request when the command
    is ultimately retired by the adapter reset?
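
The two interleavings above can be replayed in a small userspace model.
This is only a sketch under stated assumptions: struct model_rq, the
model_* helpers and the two bit numbers are hypothetical stand-ins for
struct request, blk_mark_rq_complete() and the REQ_ATOM_* flags, written
with C11 atomics rather than the kernel bitops.  Each case is run
single-threaded in a fixed order, and the request is assumed not to be
freed or reused between the two completions.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for REQ_ATOM_COMPLETE and REQ_ATOM_QUIESCED. */
enum { MODEL_COMPLETE = 0, MODEL_QUIESCED = 1, MODEL_NR_BITS = 2 };

struct model_rq {
	atomic_ulong flags;	/* models rq->atomic_flags */
	int completions;	/* times the "real" completion ran */
};

static void model_init(struct model_rq *rq)
{
	atomic_init(&rq->flags, 0);
	rq->completions = 0;
}

/* Atomically set a bit and return its previous value. */
static bool model_test_and_set(struct model_rq *rq, int bit)
{
	unsigned long mask = 1UL << bit;

	assert(bit >= 0 && bit < MODEL_NR_BITS);
	return atomic_fetch_or(&rq->flags, mask) & mask;
}

/* Atomically clear a bit and return its previous value. */
static bool model_test_and_clear(struct model_rq *rq, int bit)
{
	unsigned long mask = 1UL << bit;

	assert(bit >= 0 && bit < MODEL_NR_BITS);
	return atomic_fetch_and(&rq->flags, ~mask) & mask;
}

/* Steps A, B and C of the patched completion path. */
static void model_complete(struct model_rq *rq)
{
	if (!model_test_and_set(rq, MODEL_COMPLETE) ||
	    model_test_and_clear(rq, MODEL_QUIESCED))
		rq->completions++;
}

/* Step 1: the timeout path claims the request. */
static bool model_timeout_claim(struct model_rq *rq)
{
	return !model_test_and_set(rq, MODEL_COMPLETE);
}

/* Step 2, assuming the timeout handler returned BLK_EH_QUIESCED. */
static void model_timeout_quiesce(struct model_rq *rq)
{
	model_test_and_set(rq, MODEL_QUIESCED);
}

/* Case i: the completion runs between step 1 and the QUIESCED store. */
static int model_case_i(void)
{
	struct model_rq rq;
	bool claimed;

	model_init(&rq);
	claimed = model_timeout_claim(&rq);	/* 1 */
	model_complete(&rq);			/* A, B: both tests fail */
	if (claimed)
		model_timeout_quiesce(&rq);	/* 2 */
	model_complete(&rq);			/* completion from the reset */
	return rq.completions;
}

/* Case ii: QUIESCED is already set when the racing completion runs. */
static int model_case_ii(void)
{
	struct model_rq rq;

	model_init(&rq);
	if (model_timeout_claim(&rq))		/* 1 */
		model_timeout_quiesce(&rq);	/* 2 */
	model_complete(&rq);			/* A, B, C: reaps the request */
	model_complete(&rq);			/* completion from the reset */
	return rq.completions;
}
```

In this model both model_case_i() and model_case_ii() return 1.  In case
ii the second call finds COMPLETE still set and QUIESCED already cleared,
so it does nothing.  That holds only as long as the request has not been
recycled in between, which the model does not capture, so it does not by
itself answer the question in ii.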

Cheers,
Jeff

>
> Signed-off-by: Christoph Hellwig <hch at lst.de>
> ---
>  block/blk-mq.c         | 6 +++++-
>  block/blk-softirq.c    | 3 ++-
>  block/blk-timeout.c    | 3 +++
>  block/blk.h            | 1 +
>  include/linux/blkdev.h | 1 +
>  5 files changed, 12 insertions(+), 2 deletions(-)
>
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index 8354601..76773dc 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -383,7 +383,8 @@ void blk_mq_complete_request(struct request *rq, int error)
>  
>  	if (unlikely(blk_should_fake_timeout(q)))
>  		return;
> -	if (!blk_mark_rq_complete(rq)) {
> +	if (!blk_mark_rq_complete(rq) ||
> +	    test_and_clear_bit(REQ_ATOM_QUIESCED, &rq->atomic_flags)) {
>  		rq->errors = error;
>  		__blk_mq_complete_request(rq);
>  	}
> @@ -586,6 +587,9 @@ void blk_mq_rq_timed_out(struct request *req, bool reserved)
>  		break;
>  	case BLK_EH_NOT_HANDLED:
>  		break;
> +	case BLK_EH_QUIESCED:
> +		set_bit(REQ_ATOM_QUIESCED, &req->atomic_flags);
> +		break;
>  	default:
>  		printk(KERN_ERR "block: bad eh return: %d\n", ret);
>  		break;
> diff --git a/block/blk-softirq.c b/block/blk-softirq.c
> index 53b1737..9d47fbc 100644
> --- a/block/blk-softirq.c
> +++ b/block/blk-softirq.c
> @@ -167,7 +167,8 @@ void blk_complete_request(struct request *req)
>  {
>  	if (unlikely(blk_should_fake_timeout(req->q)))
>  		return;
> -	if (!blk_mark_rq_complete(req))
> +	if (!blk_mark_rq_complete(req) ||
> +	    test_and_clear_bit(REQ_ATOM_QUIESCED, &req->atomic_flags))
>  		__blk_complete_request(req);
>  }
>  EXPORT_SYMBOL(blk_complete_request);
> diff --git a/block/blk-timeout.c b/block/blk-timeout.c
> index aedd128..b3a7f20 100644
> --- a/block/blk-timeout.c
> +++ b/block/blk-timeout.c
> @@ -96,6 +96,9 @@ static void blk_rq_timed_out(struct request *req)
>  		blk_add_timer(req);
>  		blk_clear_rq_complete(req);
>  		break;
> +	case BLK_EH_QUIESCED:
> +		set_bit(REQ_ATOM_QUIESCED, &req->atomic_flags);
> +		break;
>  	case BLK_EH_NOT_HANDLED:
>  		/*
>  		 * LLD handles this for now but in the future
> diff --git a/block/blk.h b/block/blk.h
> index 37b9165..f4c98f8 100644
> --- a/block/blk.h
> +++ b/block/blk.h
> @@ -120,6 +120,7 @@ void blk_account_io_done(struct request *req);
>  enum rq_atomic_flags {
>  	REQ_ATOM_COMPLETE = 0,
>  	REQ_ATOM_STARTED,
> +	REQ_ATOM_QUIESCED,
>  };
>  
>  /*
> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
> index 9a8424a..5df5fb13 100644
> --- a/include/linux/blkdev.h
> +++ b/include/linux/blkdev.h
> @@ -223,6 +223,7 @@ enum blk_eh_timer_return {
>  	BLK_EH_NOT_HANDLED,
>  	BLK_EH_HANDLED,
>  	BLK_EH_RESET_TIMER,
> +	BLK_EH_QUIESCED,
>  };
>  
>  typedef enum blk_eh_timer_return (rq_timed_out_fn)(struct request *);
