From: Jan Kara <jack@suse.cz>
To: Yu Kuai <yukuai1@huaweicloud.com>
Cc: axboe@kernel.dk, asml.silence@gmail.com, osandov@fb.com,
	jack@suse.cz, kbusch@kernel.org, linux-block@vger.kernel.org,
	linux-kernel@vger.kernel.org, yukuai3@huawei.com,
	yi.zhang@huawei.com
Subject: Re: [PATCH RFC v3 2/3] sbitmap: fix invalid wakeup on the wrong waitqueue
Date: Mon, 11 Jul 2022 16:26:23 +0200
Message-ID: <20220711142623.haam2wks36xa5nde@quack3.lan>
In-Reply-To: <20220710042200.20936-3-yukuai1@huaweicloud.com>

On Sun 10-07-22 12:21:59, Yu Kuai wrote:
> From: Yu Kuai <yukuai3@huawei.com>
> 
> For example, 2 * wake_batch tags are put, while only wake_batch threads
> are woken:
> 
> __sbq_wake_up
>  atomic_cmpxchg -> reset wait_cnt
> 			__sbq_wake_up -> decrease wait_cnt
> 			...
> 			__sbq_wake_up -> wait_cnt is decreased to 0 again
> 			 atomic_cmpxchg
> 			 sbq_index_atomic_inc -> increase wake_index
> 			 wake_up_nr -> wake up and waitqueue might be empty
>  sbq_index_atomic_inc -> increase again, one waitqueue is skipped
>  wake_up_nr -> invalid wakeup because old waitqueue might be empty
> 
> To fix the problem, increase 'wake_index' before resetting 'wait_cnt'.
> 
> Fixes: 88459642cba4 ("blk-mq: abstract tag allocation out into sbitmap library")
> Signed-off-by: Yu Kuai <yukuai3@huawei.com>
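
The interleaving quoted above can be replayed deterministically in
userspace. What follows is only a sketch, not kernel code: struct model,
NR_WS, WAKE_BATCH and the step_* helpers are hypothetical names that split
the pre-patch __sbq_wake_up() into steps, so that a single thread can play
both wakers in the order given in the commit message.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_WS      4   /* hypothetical number of waitqueues */
#define WAKE_BATCH 2   /* hypothetical batch size */

struct model {
	int wait_cnt[NR_WS];  /* per-waitqueue countdown to the next wakeup */
	int wakeups[NR_WS];   /* times the wake_up_nr() stand-in hit each queue */
	int wake_index;       /* waitqueue the next waker will pick */
};

static void model_init(struct model *m)
{
	for (int i = 0; i < NR_WS; i++) {
		m->wait_cnt[i] = WAKE_BATCH;
		m->wakeups[i] = 0;
	}
	m->wake_index = 0;
}

/* Stand-in for atomic_dec_return(&ws->wait_cnt). */
static int step_dec(struct model *m, int ws)
{
	return --m->wait_cnt[ws];
}

/* Stand-in for atomic_cmpxchg(&ws->wait_cnt, old, wake_batch). */
static bool step_cmpxchg_reset(struct model *m, int ws, int old)
{
	if (m->wait_cnt[ws] != old)
		return false;
	m->wait_cnt[ws] = WAKE_BATCH;
	return true;
}

/* Stand-in for sbq_index_atomic_inc() followed by wake_up_nr(). */
static void step_inc_and_wake(struct model *m, int ws)
{
	m->wake_index = (m->wake_index + 1) % NR_WS;
	m->wakeups[ws]++;
}

/* Replay the pre-patch ordering with waker A stalled after its reset. */
static void replay_old_race(struct model *m)
{
	int ws_a, ws_b, cnt;
	bool won;

	model_init(m);
	m->wait_cnt[0] = 1;  /* queue 0 is one put away from a wakeup */

	/* Waker A: picks queue 0, hits zero, resets wait_cnt, then stalls. */
	ws_a = m->wake_index;
	cnt = step_dec(m, ws_a);
	won = step_cmpxchg_reset(m, ws_a, cnt);
	assert(cnt == 0 && won);

	/* Other wakers: wake_index is unchanged, so they drain queue 0 again. */
	do {
		ws_b = m->wake_index;
		cnt = step_dec(m, ws_b);
	} while (cnt > 0);
	won = step_cmpxchg_reset(m, ws_b, cnt);
	assert(won);
	step_inc_and_wake(m, ws_b);  /* wake_index 0 -> 1, wakes queue 0 */

	/* Waker A resumes: wake_index 1 -> 2, wakes queue 0 a second time. */
	step_inc_and_wake(m, ws_a);
}
```

After replay_old_race() the model has woken queue 0 twice and never woken
queue 1, while wake_index has moved from 0 to 2. That matches the "one
waitqueue is skipped" line in the trace.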

This patch and the following one look sane to me, but please merge them into
a single patch. They fix the same race between two concurrent wakers, just
with slightly different timing, so there is no point in having two patches
for this (in particular, the changes in this patch are difficult to reason
about when we know the result is still buggy).

								Honza

> ---
>  lib/sbitmap.c | 45 +++++++++++++++++++++++----------------------
>  1 file changed, 23 insertions(+), 22 deletions(-)
> 
> diff --git a/lib/sbitmap.c b/lib/sbitmap.c
> index b46fce1beb3a..57095dd88a33 100644
> --- a/lib/sbitmap.c
> +++ b/lib/sbitmap.c
> @@ -616,32 +616,33 @@ static bool __sbq_wake_up(struct sbitmap_queue *sbq)
>  		return false;
>  
>  	wait_cnt = atomic_dec_return(&ws->wait_cnt);
> -	if (wait_cnt <= 0) {
> -		int ret;
> +	if (wait_cnt > 0)
> +		return false;
>  
> -		wake_batch = READ_ONCE(sbq->wake_batch);
> +	/*
> +	 * For concurrent callers of this, callers should call this function
> +	 * again to wakeup a new batch on a different 'ws'.
> +	 */
> +	if (wait_cnt < 0)
> +		return true;
>  
> -		/*
> -		 * Pairs with the memory barrier in sbitmap_queue_resize() to
> -		 * ensure that we see the batch size update before the wait
> -		 * count is reset.
> -		 */
> -		smp_mb__before_atomic();
> +	wake_batch = READ_ONCE(sbq->wake_batch);
>  
> -		/*
> -		 * For concurrent callers of this, the one that failed the
> -		 * atomic_cmpxhcg() race should call this function again
> -		 * to wakeup a new batch on a different 'ws'.
> -		 */
> -		ret = atomic_cmpxchg(&ws->wait_cnt, wait_cnt, wake_batch);
> -		if (ret == wait_cnt) {
> -			sbq_index_atomic_inc(&sbq->wake_index);
> -			wake_up_nr(&ws->wait, wake_batch);
> -			return false;
> -		}
> +	/*
> +	 * Pairs with the memory barrier in sbitmap_queue_resize() to
> +	 * ensure that we see the batch size update before the wait
> +	 * count is reset.
> +	 */
> +	smp_mb__before_atomic();
>  
> -		return true;
> -	}
> +	/*
> +	 * Increase wake_index before updating wait_cnt, otherwise concurrent
> +	 * callers can see valid wait_cnt in old waitqueue, which can cause
> +	 * invalid wakeup on the old waitqueue.
> +	 */
> +	sbq_index_atomic_inc(&sbq->wake_index);
> +	atomic_set(&ws->wait_cnt, wake_batch);
> +	wake_up_nr(&ws->wait, wake_batch);
>  
>  	return false;
>  }
> -- 
> 2.31.1
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR
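
For readers who want to experiment with the ordering the quoted patch
introduces, here is a rough userspace model using C11 atomics. It is a
sketch under assumptions, not the kernel implementation: struct ws_model,
struct sbq_model, NR_WS and model_wake_up() are hypothetical names, the
waitqueue is reduced to a wakeup counter, the search for a waitqueue that
actually has waiters is left out, and sequentially consistent atomics stand
in for the kernel's explicit barrier.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NR_WS 4  /* hypothetical number of waitqueues */

struct ws_model {
	atomic_int wait_cnt;  /* countdown to the next batch wakeup */
	atomic_int wakeups;   /* counts calls to the wake_up_nr() stand-in */
};

struct sbq_model {
	struct ws_model ws[NR_WS];
	atomic_int wake_index;
	int wake_batch;
};

static void sbq_model_init(struct sbq_model *q, int wake_batch)
{
	assert(wake_batch > 0);
	q->wake_batch = wake_batch;
	atomic_init(&q->wake_index, 0);
	for (int i = 0; i < NR_WS; i++) {
		atomic_init(&q->ws[i].wait_cnt, wake_batch);
		atomic_init(&q->ws[i].wakeups, 0);
	}
}

/* Returns true when the caller should retry on a different waitqueue. */
static bool model_wake_up(struct sbq_model *q)
{
	int idx = atomic_load(&q->wake_index);
	struct ws_model *ws = &q->ws[idx];
	int wait_cnt = atomic_fetch_sub(&ws->wait_cnt, 1) - 1;

	if (wait_cnt > 0)
		return false;  /* batch not complete yet */
	if (wait_cnt < 0)
		return true;   /* another waker already hit zero on this queue */

	/* Advance wake_index first, so new callers stop picking this queue. */
	int old = idx;
	while (!atomic_compare_exchange_weak(&q->wake_index, &old,
					     (old + 1) % NR_WS))
		;
	/* Only now re-arm the counter, then wake the batch. */
	atomic_store(&ws->wait_cnt, q->wake_batch);
	atomic_fetch_add(&ws->wakeups, 1);
	return false;
}
```

With wake_batch set to 2, the second call to model_wake_up() is the one
that advances wake_index and re-arms queue 0. A caller that finds wait_cnt
already at zero gets true back and leaves wake_index untouched.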
