All of lore.kernel.org
 help / color / mirror / Atom feed
From: Yu Kuai <yukuai1@huaweicloud.com>
To: Jan Kara <jack@suse.cz>, Hugh Dickins <hughd@google.com>
Cc: Keith Busch <kbusch@kernel.org>, Jens Axboe <axboe@kernel.dk>,
	Yu Kuai <yukuai1@huaweicloud.com>,
	Liu Song <liusong@linux.alibaba.com>,
	linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
	"yukuai (C)" <yukuai3@huawei.com>
Subject: Re: [PATCH next] sbitmap: fix lockup while swapping
Date: Mon, 26 Sep 2022 22:08:52 +0800	[thread overview]
Message-ID: <fbb83171-4069-09d3-a119-68055c86797a@huaweicloud.com> (raw)
In-Reply-To: <20220926114416.t7t65u66ze76aiz7@quack3>

Hi,

在 2022/09/26 19:44, Jan Kara 写道:
> On Fri 23-09-22 16:15:29, Hugh Dickins wrote:
>> On Fri, 23 Sep 2022, Hugh Dickins wrote:
>>> On Fri, 23 Sep 2022, Keith Busch wrote:
>>>
>>>> Does the following fix the observation? Rational being that there's no reason
>>>> to spin on the current wait state that is already under handling; let
>>>> subsequent clearings proceed to the next inevitable wait state immediately.
>>>
>>> It's running fine without lockup so far; but doesn't this change merely
>>> narrow the window?  If this is interrupted in between atomic_try_cmpxchg()
>>> setting wait_cnt to 0 and sbq_index_atomic_inc() advancing wake_index,
>>> don't we run the same risk as before, of sbitmap_queue_wake_up() from
>>> the interrupt handler getting stuck on that wait_cnt 0?
>>
>> Yes, it ran successfully for 50 minutes, then an interrupt came in
>> immediately after the cmpxchg, and it locked up just as before.
>>
>> Easily dealt with by disabling interrupts, no doubt, but I assume it's a
>> badge of honour not to disable interrupts here (except perhaps in waking).
> 
> I don't think any magic with sbq_index_atomic_inc() is going to reliably
> fix this. After all the current waitqueue may be the only one that has active
> waiters so sbq_wake_ptr() will always end up returning this waitqueue
> regardless of the current value of sbq->wake_index.
> 
> Honestly, this whole code needs a serious redesign. I have some
> simplifications in mind but it will take some thinking and benchmarking so
> we need some fix for the interim. I was pondering for quite some time about
> some band aid to the problem you've found but didn't find anything
> satisfactory.
> 
> In the end I see two options:
> 
> 1) Take your patch (as wrong as it is ;). Yes, it can lead to lost wakeups
> but we were living with those for a relatively long time so probably we can
> live with them for some longer.
> 
> 2) Revert Yu Kuai's original fix 040b83fcecfb8 ("sbitmap: fix possible io
> hung due to lost wakeup") and my fixup 48c033314f37 ("sbitmap: Avoid leaving
> waitqueue in invalid state in __sbq_wake_up()"). But then Keith would have
> to redo his batched accounting patches on top.
> 
>> Some clever way to make the wait_cnt and wake_index adjustments atomic?

I'm thinking about a hacky way to make the update of wake_cnt and
wake_index atomic, however, redesign of sbitmap_queue is probably
better. 🤣

There are only 8 wait queues and wake_batch is 8 at most, thus
only need 3 * 9 = 27 bit, and a single atomic value is enough:

- 0-2 represents ws[0].wake_cnt
- 3-5 represents ws[1].wake_cnt
- ...
- 21-24 represents ws[7].wake_cnt
- 25-27 represents sbq->wake_index

for example, assume the atomic value is:

0B 111 111 111 111 111 111 111 111 111 000,

which means wake_index is 7 and ws[0].wake_cnt is 0,
if we try to inc wake_index and reset wake_cnt together:

atomic_add(..., 0B 001 000 000 000 000 000 000 000 000 111)

Thanks,
Kuai

>>
>> Or is this sbitmap_queue_wake_up() interrupting sbitmap_queue_wake_up()
>> just supposed never to happen, the counts preventing it: but some
>> misaccounting letting it happen by mistake?
> 
> No, I think that is in principle a situation that we have to accommodate.
> 
> 								Honza
> 


  reply	other threads:[~2022-09-26 15:24 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-09-18 21:10 [PATCH next] sbitmap: fix lockup while swapping Hugh Dickins
2022-09-19 21:22 ` Keith Busch
2022-09-19 23:01   ` Hugh Dickins
2022-09-21 16:40     ` Jan Kara
2022-09-23 14:43       ` Jan Kara
2022-09-23 15:13         ` Keith Busch
2022-09-23 16:16         ` Hugh Dickins
2022-09-23 19:07           ` Keith Busch
2022-09-23 21:29             ` Hugh Dickins
2022-09-23 23:15               ` Hugh Dickins
2022-09-26 11:44                 ` Jan Kara
2022-09-26 14:08                   ` Yu Kuai [this message]
2022-09-27  3:39                   ` Hugh Dickins
2022-09-27 10:31                     ` Jan Kara
2022-09-28  3:56                       ` Hugh Dickins
2022-09-28  3:59                         ` [PATCH next v2] " Hugh Dickins
2022-09-28  4:07                           ` Hugh Dickins
2022-09-29  8:39                             ` Jan Kara
2022-09-29 19:50                               ` [PATCH next v3] " Hugh Dickins
2022-09-29 19:56                                 ` Keith Busch
2022-09-29 23:58                                 ` Jens Axboe
     [not found]               ` <20220924023047.1410-1-hdanton@sina.com>
2022-09-27  4:02                 ` [PATCH next] " Hugh Dickins

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=fbb83171-4069-09d3-a119-68055c86797a@huaweicloud.com \
    --to=yukuai1@huaweicloud.com \
    --cc=axboe@kernel.dk \
    --cc=hughd@google.com \
    --cc=jack@suse.cz \
    --cc=kbusch@kernel.org \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=liusong@linux.alibaba.com \
    --cc=yukuai3@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.