linux-block.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Ming Lei <ming.lei@redhat.com>
To: Bart Van Assche <bvanassche@acm.org>
Cc: Christoph Hellwig <hch@lst.de>,
	linux-block@vger.kernel.org, John Garry <john.garry@huawei.com>,
	Hannes Reinecke <hare@suse.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	"Paul E. McKenney" <paulmck@kernel.org>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH 8/8] blk-mq: drain I/O when all CPUs in a hctx are offline
Date: Thu, 28 May 2020 13:19:32 +0800	[thread overview]
Message-ID: <20200528051932.GA1008129@T590> (raw)
In-Reply-To: <1ec7922c-f2b0-08ec-5849-f4eb7f71e9e7@acm.org>

On Wed, May 27, 2020 at 08:33:48PM -0700, Bart Van Assche wrote:
> On 2020-05-27 18:46, Ming Lei wrote:
> > On Wed, May 27, 2020 at 04:09:19PM -0700, Bart Van Assche wrote:
> >> On 2020-05-27 11:06, Christoph Hellwig wrote:
> >>> --- a/block/blk-mq-tag.c
> >>> +++ b/block/blk-mq-tag.c
> >>> @@ -180,6 +180,14 @@ unsigned int blk_mq_get_tag(struct blk_mq_alloc_data *data)
> >>>  	sbitmap_finish_wait(bt, ws, &wait);
> >>>  
> >>>  found_tag:
> >>> +	/*
> >>> +	 * Give up this allocation if the hctx is inactive.  The caller will
> >>> +	 * retry on an active hctx.
> >>> +	 */
> >>> +	if (unlikely(test_bit(BLK_MQ_S_INACTIVE, &data->hctx->state))) {
> >>> +		blk_mq_put_tag(tags, data->ctx, tag + tag_offset);
> >>> +		return -1;
> >>> +	}
> >>>  	return tag + tag_offset;
> >>>  }
> >>
> >> The code that has been added in blk_mq_hctx_notify_offline() will only
> >> work correctly if blk_mq_get_tag() tests BLK_MQ_S_INACTIVE after the
> >> store instructions involved in the tag allocation happened. Does this
> >> mean that a memory barrier should be added in the above function before
> >> the test_bit() call?
> > 
> > Please see comment in blk_mq_hctx_notify_offline():
> > 
> > +       /*
> > +        * Prevent new request from being allocated on the current hctx.
> > +        *
> > +        * The smp_mb__after_atomic() Pairs with the implied barrier in
> > +        * test_and_set_bit_lock in sbitmap_get().  Ensures the inactive flag is
> > +        * seen once we return from the tag allocator.
> > +        */
> > +       set_bit(BLK_MQ_S_INACTIVE, &hctx->state);
> 
> From Documentation/atomic_bitops.txt: "Except for a successful
> test_and_set_bit_lock() which has ACQUIRE semantics and
> clear_bit_unlock() which has RELEASE semantics."

test_bit(BLK_MQ_S_INACTIVE, &data->hctx->state) is called exactly after
one tag is allocated, that means test_and_set_bit_lock is successful before
the test_bit(). The ACQUIRE semantics guarantees that test_bit(BLK_MQ_S_INACTIVE)
is always done after successful test_and_set_bit_lock(), so tag bit is
always set before testing BLK_MQ_S_INACTIVE.
 
See Documentation/memory-barriers.txt:
 (5) ACQUIRE operations.

     This acts as a one-way permeable barrier.  It guarantees that all memory
     operations after the ACQUIRE operation will appear to happen after the
     ACQUIRE operation with respect to the other components of the system.
     ACQUIRE operations include LOCK operations and both smp_load_acquire()
     and smp_cond_load_acquire() operations.

> 
> My understanding is that operations that have acquire semantics pair
> with operations that have release semantics. I haven't been able to find
> any documentation that shows that smp_mb__after_atomic() has release
> semantics. So I looked up its definition. This is what I found:
> 
> $ git grep -nH 'define __smp_mb__after_atomic'
> arch/ia64/include/asm/barrier.h:49:#define __smp_mb__after_atomic()
> barrier()
> arch/mips/include/asm/barrier.h:133:#define __smp_mb__after_atomic()
> smp_llsc_mb()
> arch/s390/include/asm/barrier.h:50:#define __smp_mb__after_atomic()
> barrier()
> arch/sparc/include/asm/barrier_64.h:57:#define __smp_mb__after_atomic()
> barrier()
> arch/x86/include/asm/barrier.h:83:#define __smp_mb__after_atomic()	do {
> } while (0)
> arch/xtensa/include/asm/barrier.h:20:#define __smp_mb__after_atomic()	
> barrier()
> include/asm-generic/barrier.h:116:#define __smp_mb__after_atomic()
> __smp_mb()
> 
> My interpretation of the above is that not all smp_mb__after_atomic()
> implementations have release semantics. Do you agree with this conclusion?

I understand smp_mb__after_atomic() orders set_bit(BLK_MQ_S_INACTIVE)
and reading the tag bit which is done in blk_mq_all_tag_iter().

So the two pair of OPs are ordered:

1) if one request(tag bit) is allocated before setting BLK_MQ_S_INACTIVE,
the tag bit will be observed in blk_mq_all_tag_iter() from blk_mq_hctx_has_requests(),
so the request will be drained.

OR

2) if one request(tag bit) is allocated after setting BLK_MQ_S_INACTIVE,
the request(tag bit) will be released and retried on another CPU
finally, see __blk_mq_alloc_request().

Cc Paul and linux-kernel list.


Thanks,
Ming


  reply	other threads:[~2020-05-28  5:19 UTC|newest]

Thread overview: 42+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-05-27 18:06 blk-mq: improvement CPU hotplug (simplified version) v4 Christoph Hellwig
2020-05-27 18:06 ` [PATCH 1/8] blk-mq: remove the bio argument to ->prepare_request Christoph Hellwig
2020-05-27 18:16   ` Johannes Thumshirn
2020-05-27 18:06 ` [PATCH 2/8] blk-mq: simplify the blk_mq_get_request calling convention Christoph Hellwig
2020-05-27 18:17   ` Johannes Thumshirn
2020-05-27 18:06 ` [PATCH 3/8] blk-mq: move more request initialization to blk_mq_rq_ctx_init Christoph Hellwig
2020-05-27 18:16   ` Hannes Reinecke
2020-05-28  9:50   ` Johannes Thumshirn
2020-05-27 18:06 ` [PATCH 4/8] blk-mq: rename BLK_MQ_TAG_FAIL to BLK_MQ_NO_TAG Christoph Hellwig
2020-05-27 18:14   ` Johannes Thumshirn
2020-05-27 18:17   ` Hannes Reinecke
2020-05-27 22:38   ` Bart Van Assche
2020-05-27 18:06 ` [PATCH 5/8] blk-mq: use BLK_MQ_NO_TAG in more places Christoph Hellwig
2020-05-27 18:15   ` Johannes Thumshirn
2020-05-27 18:18   ` Hannes Reinecke
2020-05-27 22:38   ` Bart Van Assche
2020-05-27 18:06 ` [PATCH 6/8] blk-mq: open code __blk_mq_alloc_request in blk_mq_alloc_request_hctx Christoph Hellwig
2020-05-27 18:06 ` [PATCH 7/8] blk-mq: add blk_mq_all_tag_iter Christoph Hellwig
2020-05-27 18:21   ` Hannes Reinecke
2020-05-27 22:52   ` Bart Van Assche
2020-05-27 18:06 ` [PATCH 8/8] blk-mq: drain I/O when all CPUs in a hctx are offline Christoph Hellwig
2020-05-27 18:26   ` Hannes Reinecke
2020-05-27 23:09   ` Bart Van Assche
2020-05-28  1:46     ` Ming Lei
2020-05-28  3:33       ` Bart Van Assche
2020-05-28  5:19         ` Ming Lei [this message]
2020-05-28 13:37           ` Bart Van Assche
2020-05-28 17:21             ` Paul E. McKenney
2020-05-29  1:53               ` Ming Lei
2020-05-29  3:07                 ` Paul E. McKenney
2020-05-29  3:53                   ` Ming Lei
2020-05-29 18:13                     ` Paul E. McKenney
2020-05-29 19:55                       ` Bart Van Assche
2020-05-29 21:12                         ` Paul E. McKenney
2020-05-29  1:13             ` Ming Lei
2020-05-27 20:07 ` blk-mq: improvement CPU hotplug (simplified version) v4 Bart Van Assche
2020-05-27 20:31   ` John Garry
2020-05-29 13:26     ` Christoph Hellwig
2020-05-28  8:29 ` John Garry
2020-05-29 13:53 Christoph Hellwig
2020-05-29 13:53 ` [PATCH 8/8] blk-mq: drain I/O when all CPUs in a hctx are offline Christoph Hellwig
2020-05-29 14:34   ` Hannes Reinecke
2020-05-29 14:41   ` Daniel Wagner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200528051932.GA1008129@T590 \
    --to=ming.lei@redhat.com \
    --cc=bvanassche@acm.org \
    --cc=hare@suse.com \
    --cc=hch@lst.de \
    --cc=john.garry@huawei.com \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=paulmck@kernel.org \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).