From: Bart Van Assche <bvanassche@acm.org>
To: Ming Lei <ming.lei@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>,
	linux-block@vger.kernel.org, John Garry <john.garry@huawei.com>,
	Hannes Reinecke <hare@suse.com>, Christoph Hellwig <hch@lst.de>,
	Thomas Gleixner <tglx@linutronix.de>
Subject: Re: [PATCH V10 07/11] blk-mq: stop to handle IO and drain IO before hctx becomes inactive
Date: Sun, 10 May 2020 20:20:24 -0700
Message-ID: <8ef5352b-a1bb-a3c1-3ad2-696df6e86f1f@acm.org>
In-Reply-To: <20200511014538.GB1418834@T590>

On 2020-05-10 18:45, Ming Lei wrote:
> On Sat, May 09, 2020 at 07:18:46AM -0700, Bart Van Assche wrote:
>> On 2020-05-08 21:10, Ming Lei wrote:
>>> Queue freezing can only be applied at the request queue level, not at
>>> the hctx level. When requests can't be completed, the freeze wait just
>>> hangs forever.
>>
>> That's indeed what I meant: freeze the entire queue instead of
>> introducing a new mechanism that freezes only one hardware queue at a time.
> 
> No, the issue is exactly that one single hctx becomes inactive, and
> other hctx are still active and workable.
> 
> If the entire queue is frozen because some CPUs are offline, how
> can userspace submit IO to this disk? Your suggestion just makes the
> disk unusable, and that won't be accepted.

What I meant is to freeze a request queue temporarily (until hot
unplugging of a CPU has finished). I would never suggest freezing a
request queue forever, and I think that you already knew that.

>> Please clarify what "when requests can't be completed" means. Are you
>> referring to requests that take longer than expected due to e.g. a
>> controller lockup or to requests that take a long time intentionally?
> 
> If all CPUs in one hctx->cpumask are offline, the managed irq of this hw
> queue will be shut down by the genirq code, so any in-flight IO won't be
> completed or timed out after the managed irq is shut down because of CPU
> offline.
> 
> Some drivers may implement a timeout handler, so these in-flight requests
> will be timed out, but that is still not friendly behaviour given that the
> default timeout is too long.
> 
> Some drivers don't implement a timeout handler at all, so these IOs won't
> be completed.

I think that the block layer needs to be notified after the decision has
been taken to offline a CPU and before the interrupts associated with
that CPU are disabled. That would allow the block layer to freeze a
request queue without triggering any timeouts (ignoring block driver and
hardware bugs). I'm not familiar with CPU hotplugging so I don't know
whether or not such a mechanism already exists.
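
The ordering argued for above can be illustrated with a small userspace
model. This is a sketch only, not kernel code: every name in it
(struct sim_queue, sim_offline_notify_first(), sim_offline_irq_first())
is hypothetical, and it assumes that completions can only be delivered
while the interrupt is still enabled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
struct sim_queue {
	int in_flight;     /* requests dispatched but not yet completed */
	bool frozen;       /* true while new submissions are refused */
	bool irq_enabled;  /* completions can only arrive while true */
};

/* Submit one request; refused while the queue is frozen. */
static bool sim_submit(struct sim_queue *q)
{
	assert(q != NULL);
	if (q->frozen)
		return false;
	q->in_flight++;
	return true;
}

/* Deliver all pending completions; nothing completes once the irq is off. */
static int sim_complete_all(struct sim_queue *q)
{
	int done;

	if (!q->irq_enabled)
		return 0;
	done = q->in_flight;
	q->in_flight = 0;
	return done;
}

/*
 * Assumed ordering from the mail: notify the queue first (freeze and
 * drain), then disable the interrupt. Returns the stranded request count.
 */
static int sim_offline_notify_first(struct sim_queue *q)
{
	q->frozen = true;
	sim_complete_all(q);
	q->irq_enabled = false;
	return q->in_flight;
}

/* Problematic ordering: the interrupt goes away before the queue drains. */
static int sim_offline_irq_first(struct sim_queue *q)
{
	q->irq_enabled = false;
	q->frozen = true;
	sim_complete_all(q);
	return q->in_flight;
}
```

In this model, notifying the queue before the interrupt is disabled
leaves no request stranded, while the reverse order strands every
in-flight request.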

>> The former case is handled by the block layer timeout handler. I propose
>> to handle the latter case by introducing a new callback function pointer
>> in struct blk_mq_ops that aborts all outstanding requests.
> 
> As I mentioned, a timeout isn't friendly behavior. And not every driver
> implements a timeout handler, or implements it well enough.

What I propose is to fix those block drivers instead of complicating the
block layer core further and instead of introducing potential deadlocks
in the block layer core.
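
As a rough illustration of the callback proposed in the quoted text
above, here is a userspace sketch. It is not a patch: struct sim_ops
only loosely stands in for struct blk_mq_ops, and the abort_all member
and all other names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct sim_driver;

/* Hypothetical ops table; abort_all is not an existing blk_mq_ops member. */
struct sim_ops {
	/* Aborts every outstanding request; returns the number aborted. */
	int (*abort_all)(struct sim_driver *drv);
};

struct sim_driver {
	const struct sim_ops *ops;
	int outstanding;   /* requests owned by the driver */
};

/* Example driver-side implementation: abort everything it owns. */
static int sim_default_abort_all(struct sim_driver *drv)
{
	int n = drv->outstanding;

	drv->outstanding = 0;
	return n;
}

/*
 * Core-side helper: ask the driver to abort its requests and report how
 * many are still outstanding afterwards. A driver without the callback
 * leaves its requests stuck.
 */
static int sim_drain(struct sim_driver *drv)
{
	assert(drv != NULL);
	if (drv->ops && drv->ops->abort_all)
		drv->ops->abort_all(drv);
	return drv->outstanding;
}
```

A driver that provides the callback drains to zero; one that does not
keeps its requests outstanding, which is the case the mail argues
should be fixed in the driver.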

>> Request queue
>> freezing is such an important block layer mechanism that I think we
>> should require that all block drivers support freezing a request queue
>> in a short time.
> 
> Firstly, we just need to drain in-flight requests and re-submit queued
> requests from one single hctx, whereas queue-wide freezing blocks all
> userspace IO unnecessarily.

Freezing a request queue for a short time is acceptable. As you know, we
already do that when the queue depth is modified, when the write-back
throttling latency is modified, and also when the I/O scheduler is changed.
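
The short freeze used for such updates can be modeled in userspace as
follows. This is a simplified sketch under stated assumptions: all names
are hypothetical, and where the kernel would sleep until the queue has
drained, this model simply reports failure so that it stays
single-threaded.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model of a freezable request queue. */
struct sim_rq_queue {
	int freeze_depth;          /* nested freeze count */
	int usage;                 /* submitters currently inside the queue */
	unsigned int nr_requests;  /* example tunable: queue depth */
};

/* Enter the queue to submit IO; refused while a freeze is in progress. */
static bool sim_queue_enter(struct sim_rq_queue *q)
{
	if (q->freeze_depth > 0)
		return false;
	q->usage++;
	return true;
}

/* Leave the queue once the request has completed. */
static void sim_queue_exit(struct sim_rq_queue *q)
{
	assert(q->usage > 0);
	q->usage--;
}

static void sim_freeze_start(struct sim_rq_queue *q)
{
	q->freeze_depth++;
}

/* The freeze has taken effect once nobody is inside the queue. */
static bool sim_freeze_done(const struct sim_rq_queue *q)
{
	return q->freeze_depth > 0 && q->usage == 0;
}

static void sim_unfreeze(struct sim_rq_queue *q)
{
	assert(q->freeze_depth > 0);
	q->freeze_depth--;
}

/* Change the queue depth only while the queue is frozen and drained. */
static bool sim_update_nr_requests(struct sim_rq_queue *q, unsigned int nr)
{
	bool ok;

	sim_freeze_start(q);
	ok = sim_freeze_done(q);  /* the kernel would wait here instead */
	if (ok)
		q->nr_requests = nr;
	sim_unfreeze(q);
	return ok;
}
```

The freeze window covers only the update itself, so submitters are
refused briefly and can enter again as soon as the queue is unfrozen.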

> Secondly, some requests may not be completed at all, so freezing can't
> work because freeze_wait may hang forever.

If a request cannot be aborted and never completes, then that's a severe
bug in the block driver that submitted the request to the block device.

Bart.
