From: Jan Kara <jack@suse.cz>
To: Yu Kuai <yukuai3@huawei.com>
Cc: Jan Kara <jack@suse.cz>, Jens Axboe <axboe@kernel.dk>,
	Omar Sandoval <osandov@fb.com>,
	linux-block@vger.kernel.org
Subject: Re: Races in sbitmap batched wakeups
Date: Fri, 17 Jun 2022 13:31:12 +0200	[thread overview]
Message-ID: <20220617113112.rlmx7npkavwkhcxx@quack3> (raw)
In-Reply-To: <9a0f1ea5-c62c-4439-b80f-0319b9a15fd5@huawei.com>

Hi!

On Fri 17-06-22 09:40:11, Yu Kuai wrote:
> > On 2022/06/17 1:21, Jan Kara wrote:
> > I've been debugging some customer reports of tasks hanging (forever)
> > waiting for free tags when in fact all tags are free. After looking into it
> > for some time I think I know what is happening. First, keep in mind that
> > it concerns a device which uses shared tags. There are 127 tags available
> > and the number of active queues using these tags is easily 40 or more. So
> > number of tags available for each device is rather small. Now I'm not sure
> > how batched wakeups can ever work in such situations, but maybe I'm missing
> > something.
> > 
> > So take for example a situation where two tags are available for a device,
> > they are both currently used. Now a process comes into blk_mq_get_tag() and
> > wants to allocate tag and goes to sleep. Now how can it ever be woken up if
> > wake_batch is 4? If the two IOs complete, sbitmap will get two wakeups but
> > that's not enough to trigger the batched wakeup to really wakeup the
> > waiter...
> > 
> > Even if we have say 4 tags available so in theory there should be enough
> > wakeups to fill the batch, there can be the following problem. So 4 tags
> > are in use, two processes come to blk_mq_get_tag() and sleep, one on wait
> > queue 0, one on wait queue 1. Now four IOs complete so
> > sbitmap_queue_wake_up() gets called 4 times and the fourth call decrements
> > wait_cnt to 0 so it ends up calling wake_up_nr(wq0, 4). Fine, one of the
> > waiters is woken up but the other one is still sleeping in wq1 and there
> > are not enough wakeups to fill the batch and wake it up. This is
> > essentially because we have lost three wakeups on wq0 because it didn't
> > have enough waiters to wake...
> 
> From what I see, if tags are shared between multiple devices, wake_batch
> should make sure that all waiters will be woken up:
> 
> For example:
> there are 64 tags in total, shared between two devices, so wake_batch is
> 4 (if both devices are active). If there are waiters, at least 32 tags
> are grabbed, thus the 8 wait queues are each guaranteed to be woken up at
> least once after 32 tags are freed.

Well, yes, wake_batch is updated, but as my example above shows that is
not enough to avoid "wasted" wakeups.

> > Finally, sbitmap_queue_wake_up() is racy and if two of them race together,
> > they can end up decrementing wait_cnt of wq which does not have any process
> > queued which again effectively leads to lost wakeups and possibly
> > indefinitely sleeping tasks.
> > 
> 
> BTW, I do think this implementation has some problems in concurrent
> scenarios, as described in the following patch:
> 
> https://lore.kernel.org/lkml/20220415101053.554495-4-yukuai3@huawei.com/

Yes, as far as I can see you have identified similar races as I point out
in this email. But I'm not sure whether your patch fixes all the
possibilities for lost wakeups...

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR


Thread overview: 10+ messages
2022-06-16 17:21 Races in sbitmap batched wakeups Jan Kara
2022-06-17  1:07 ` Ming Lei
2022-06-17 10:50   ` Jan Kara
2022-06-17  1:40 ` Yu Kuai
2022-06-17 11:31   ` Jan Kara [this message]
2022-06-17 12:50     ` Yu Kuai
2022-06-20 11:57       ` Jan Kara
2022-06-20 13:11         ` Yu Kuai
2022-06-20 16:57           ` Jan Kara
2022-06-21  1:10             ` Yu Kuai
