From: Ming Lei <ming.lei@redhat.com>
To: Bart Van Assche <bvanassche@acm.org>
Cc: Jens Axboe <axboe@kernel.dk>,
	linux-block@vger.kernel.org, John Garry <john.garry@huawei.com>,
	Hannes Reinecke <hare@suse.com>, Christoph Hellwig <hch@lst.de>,
	Thomas Gleixner <tglx@linutronix.de>,
	Hannes Reinecke <hare@suse.de>
Subject: Re: [PATCH V10 11/11] block: deactivate hctx when the hctx is actually inactive
Date: Tue, 12 May 2020 09:43:58 +0800	[thread overview]
Message-ID: <20200512014358.GB1531898@T590> (raw)
In-Reply-To: <c4d78e75-91e1-1521-1ba0-d30bf3716f83@acm.org>

On Mon, May 11, 2020 at 01:52:14PM -0700, Bart Van Assche wrote:
> On 2020-05-10 21:08, Ming Lei wrote:
> > OK, I had just forgotten the whole story, but the issue can be fixed
> > quite easily by adding a new request allocation flag in the slow path;
> > see the following patch:
> > 
> > diff --git a/block/blk-core.c b/block/blk-core.c
> > index ec50d7e6be21..d743be1b45a2 100644
> > --- a/block/blk-core.c
> > +++ b/block/blk-core.c
> > @@ -418,6 +418,11 @@ int blk_queue_enter(struct request_queue *q, blk_mq_req_flags_t flags)
> >  		if (success)
> >  			return 0;
> >  
> > +		if (flags & BLK_MQ_REQ_FORCE) {
> > +			percpu_ref_get(ref);
> > +			return 0;
> > +		}
> > +
> >  		if (flags & BLK_MQ_REQ_NOWAIT)
> >  			return -EBUSY;
> >  
> > diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
> > index c2ea0a6e5b56..2816886d0bea 100644
> > --- a/include/linux/blk-mq.h
> > +++ b/include/linux/blk-mq.h
> > @@ -448,6 +448,13 @@ enum {
> >  	BLK_MQ_REQ_INTERNAL	= (__force blk_mq_req_flags_t)(1 << 2),
> >  	/* set RQF_PREEMPT */
> >  	BLK_MQ_REQ_PREEMPT	= (__force blk_mq_req_flags_t)(1 << 3),
> > +
> > +	/*
> > +	 * force to allocate request and caller has to make sure queue
> > +	 * won't be forzen completely during allocation, and this flag
> > +	 * is only applied after queue freeze is started
> > +	 */
> > +	BLK_MQ_REQ_FORCE	= (__force blk_mq_req_flags_t)(1 << 4),
> >  };
> >  
> >  struct request *blk_mq_alloc_request(struct request_queue *q, unsigned int op,
> 
> I'm not sure that introducing such a flag is a good idea. After
> blk_mq_freeze_queue() has made it clear that a request queue must be

BLK_MQ_REQ_FORCE is only allowed to be used when the caller guarantees that
the queue's freeze hasn't completed, which is the case here because there is
a pending request that can't be completed. So blk_mq_freeze_queue() won't
complete while a request allocated with BLK_MQ_REQ_FORCE is outstanding, and
once blk_mq_freeze_queue() has completed, no such allocation can happen any
more.
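
To make the intended calling convention concrete, here is a minimal sketch
(hypothetical helper, not part of the patch) of a caller that satisfies the
precondition because a pending request still pins q_usage_counter:

	/*
	 * Hypothetical resubmit path: 'pending' can't complete until a new
	 * request is allocated, so it still holds a q_usage_counter
	 * reference and any freeze in progress can't have completed yet,
	 * which is exactly the precondition BLK_MQ_REQ_FORCE requires.
	 */
	static struct request *resubmit_pending_rq(struct request_queue *q,
						   struct request *pending)
	{
		return blk_mq_alloc_request(q, req_op(pending),
					    BLK_MQ_REQ_FORCE);
	}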

> frozen and before the request queue is really frozen, an RCU grace
> period must expire. Otherwise it cannot be guaranteed that the intention
> to freeze a request queue (by calling percpu_ref_kill()) has been
> observed by all potential blk_queue_enter() callers (blk_queue_enter()
> calls percpu_ref_tryget_live()). Not introducing any new race conditions
> would require either introducing an smp_mb() call in blk_queue_enter()

percpu_ref_tryget_live() returns false when the percpu refcount is dead, or,
in atomic mode, when the atomic count has already dropped to zero.

BLK_MQ_REQ_FORCE is only allowed to be used when the caller guarantees that
the queue's freeze hasn't completed. So if the refcount is in percpu mode,
true is always returned; otherwise false is always returned. So there isn't
any race, and no smp_mb() is required either, because this patch doesn't
introduce any new global memory reads or writes.
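
For reference, percpu_ref_tryget_live() looks roughly like this (simplified
from include/linux/percpu-refcount.h; details may vary between kernel
versions):

	static inline bool percpu_ref_tryget_live(struct percpu_ref *ref)
	{
		unsigned long __percpu *percpu_count;
		bool ret = false;

		rcu_read_lock();
		if (__ref_is_percpu(ref, &percpu_count)) {
			/* percpu mode: always succeeds */
			this_cpu_inc(*percpu_count);
			ret = true;
		} else if (!(ref->percpu_count_ptr & __PERCPU_REF_DEAD)) {
			/* atomic but not killed: succeeds iff count > 0 */
			ret = atomic_long_inc_not_zero(&ref->count);
		}
		rcu_read_unlock();

		return ret;
	}

Once percpu_ref_kill() has marked q_usage_counter dead, neither branch is
taken and false is returned unconditionally, which is the "otherwise false"
case above.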

> or letting another RCU grace period expire after the last allocation of a
> request with BLK_MQ_REQ_FORCE and before the request queue is really frozen.

Not at all. As I explained, allocation with BLK_MQ_REQ_FORCE only happens
while the actual percpu refcount is > 1, because we know for sure that there
is a pending request which can't be completed, and that pending request
relies on the new allocation with BLK_MQ_REQ_FORCE. So after the new request
is allocated, blk_mq_freeze_queue_wait() will keep waiting until the actual
percpu refcount drops to 0.
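
The ordering being relied on, as a sketch (not code from the patch):

	/*
	 * CPU A (freezer)                     CPU B (resubmit path)
	 * ---------------                     ---------------------
	 * blk_freeze_queue_start()
	 *   percpu_ref_kill(q_usage_counter)
	 *                                     pending rq still holds a ref,
	 *                                     so the counter stays > 0
	 * blk_mq_freeze_queue_wait()
	 *   blocks while counter != 0
	 *                                     blk_queue_enter(BLK_MQ_REQ_FORCE)
	 *                                       percpu_ref_get(): counter++
	 *                                     new rq lets the pending rq
	 *                                     complete; both refs are dropped
	 * counter reaches 0, wait returns
	 */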


Thanks,
Ming


Thread overview: 41+ messages
2020-05-05  2:09 [PATCH V10 00/11] blk-mq: improvement CPU hotplug Ming Lei
2020-05-05  2:09 ` [PATCH V10 01/11] block: clone nr_integrity_segments and write_hint in blk_rq_prep_clone Ming Lei
2020-05-05  2:09 ` [PATCH V10 02/11] block: add helper for copying request Ming Lei
2020-05-05  2:09 ` [PATCH V10 03/11] blk-mq: mark blk_mq_get_driver_tag as static Ming Lei
2020-05-05  2:09 ` [PATCH V10 04/11] blk-mq: assign rq->tag in blk_mq_get_driver_tag Ming Lei
2020-05-05  2:09 ` [PATCH V10 05/11] blk-mq: support rq filter callback when iterating rqs Ming Lei
2020-05-08 23:32   ` Bart Van Assche
2020-05-09  0:18     ` Bart Van Assche
2020-05-09  2:05       ` Ming Lei
2020-05-09  3:08         ` Bart Van Assche
2020-05-09  3:52           ` Ming Lei
2020-05-05  2:09 ` [PATCH V10 06/11] blk-mq: prepare for draining IO when hctx's all CPUs are offline Ming Lei
2020-05-05  6:14   ` Hannes Reinecke
2020-05-08 23:26   ` Bart Van Assche
2020-05-09  2:09     ` Ming Lei
2020-05-09  3:11       ` Bart Van Assche
2020-05-09  3:56         ` Ming Lei
2020-05-05  2:09 ` [PATCH V10 07/11] blk-mq: stop to handle IO and drain IO before hctx becomes inactive Ming Lei
2020-05-08 23:39   ` Bart Van Assche
2020-05-09  2:20     ` Ming Lei
2020-05-09  3:24       ` Bart Van Assche
2020-05-09  4:10         ` Ming Lei
2020-05-09 14:18           ` Bart Van Assche
2020-05-11  1:45             ` Ming Lei
2020-05-11  3:20               ` Bart Van Assche
2020-05-11  3:48                 ` Ming Lei
2020-05-11 20:56                   ` Bart Van Assche
2020-05-12  1:25                     ` Ming Lei
2020-05-05  2:09 ` [PATCH V10 08/11] block: add blk_end_flush_machinery Ming Lei
2020-05-05  2:09 ` [PATCH V10 09/11] blk-mq: add blk_mq_hctx_handle_dead_cpu for handling cpu dead Ming Lei
2020-05-05  2:09 ` [PATCH V10 10/11] blk-mq: re-submit IO in case that hctx is inactive Ming Lei
2020-05-05  2:09 ` [PATCH V10 11/11] block: deactivate hctx when the hctx is actually inactive Ming Lei
2020-05-09 14:07   ` Bart Van Assche
2020-05-11  2:11     ` Ming Lei
2020-05-11  3:30       ` Bart Van Assche
2020-05-11  4:08         ` Ming Lei
2020-05-11 20:52           ` Bart Van Assche
2020-05-12  1:43             ` Ming Lei [this message]
2020-05-12  2:08             ` Ming Lei
2020-05-08 21:49 ` [PATCH V10 00/11] blk-mq: improvement CPU hotplug Ming Lei
2020-05-09  3:17   ` Jens Axboe
