All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jens Axboe <axboe@kernel.dk>
To: Coly Li <colyli@suse.de>
Cc: linux-block@vger.kernel.org, linux-bcache@vger.kernel.org
Subject: Re: [PATCH 1/1] bcache: avoid unnecessary soft lockup in kworker update_writeback_rate()
Date: Sat, 28 May 2022 06:23:08 -0600	[thread overview]
Message-ID: <e0002f36-fb39-5d27-ce10-2278fd4a77bb@kernel.dk> (raw)
In-Reply-To: <1BD307CE-9390-4A4B-B917-676104DA77F1@suse.de>

On 5/28/22 6:22 AM, Coly Li wrote:
> 
> 
>> 2022?5?28? 20:20?Jens Axboe <axboe@kernel.dk> ???
>>
>> On 5/28/22 12:19 AM, Coly Li wrote:
>>> The kworker routine update_writeback_rate() is schedued to update the
>>> writeback rate in every 5 seconds by default. Before calling
>>> __update_writeback_rate() to do real job, semaphore dc->writeback_lock
>>> should be held by the kworker routine.
>>>
>>> At the same time, bcache writeback thread routine bch_writeback_thread()
>>> also needs to hold dc->writeback_lock before flushing dirty data back
>>> into the backing device. If the dirty data set is large, it might be
>>> very long time for bch_writeback_thread() to scan all dirty buckets and
>>> releases dc->writeback_lock. In such case update_writeback_rate() can be
>>> starved for long enough time so that kernel reports a soft lockup warn-
>>> ing started like:
>>>  watchdog: BUG: soft lockup - CPU#246 stuck for 23s! [kworker/246:31:179713]
>>>
>>> Such soft lockup condition is unnecessary, because after the writeback
>>> thread finishes its job and releases dc->writeback_lock, the kworker
>>> update_writeback_rate() may continue to work and everything is fine
>>> indeed.
>>>
>>> This patch avoids the unnecessary soft lockup by the following method,
>>> - Add new member to struct cached_dev
>>>  - dc->rate_update_retry (0 by default)
>>> - In update_writeback_rate() call down_read_trylock(&dc->writeback_lock)
>>>  firstly, if it fails then lock contention happens.
>>> - If dc->rate_update_retry <= BCH_WBRATE_UPDATE_RETRY_MAX (15), doesn't
>>>  acquire the lock and reschedules the kworker for next try.
>>> - If dc->rate_update_retry > BCH_WBRATE_UPDATE_RETRY_MAX, no retry
>>>  anymore and call down_read(&dc->writeback_lock) to wait for the lock.
>>>
>>> By the above method, at worst case update_writeback_rate() may retry for
>>> 1+ minutes before blocking on dc->writeback_lock by calling down_read().
>>> For a 4TB cache device with 1TB dirty data, 90%+ of the unnecessary soft
>>> lockup warning message can be avoided.
>>>
>>> When retrying to acquire dc->writeback_lock in update_writeback_rate(),
>>> of course the writeback rate cannot be updated. It is fair, because when
>>> the kworker is blocked on the lock contention of dc->writeback_lock, the
>>> writeback rate cannot be updated neither.
>>>
>>> This change follows Jens Axboe's suggestion to a more clear and simple
>>> version.
>>
>> This looks fine, but it doesn't apply to my current for-5.19/drivers
>> branch which the previous ones did. Did you spin this one without the
>> other patches, perhaps?
>>
>> One minor thing we might want to change if you're respinning it -
>> BCH_WBRATE_UPDATE_RETRY_MAX isn't really named for what it does, since
>> it doesn't retry anything, it simply allows updates to be skipped. Why
>> not call it BCH_WBRATE_UPDATE_MAX_SKIPS instead? I think that'd be
>> better convey what it does.
> 
> Naming is often challenge for me. Sure, _MAX_SKIPS is better. I will
> post another modified version.

It's hard for everyone :-)

Sounds good, I'll get it applied when it shows up.

-- 
Jens Axboe


  reply	other threads:[~2022-05-28 12:23 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-05-28  6:19 [PATCH 0/1] bcache fix for Linux v5.19 (3rd wave) Coly Li
2022-05-28  6:19 ` [PATCH 1/1] bcache: avoid unnecessary soft lockup in kworker update_writeback_rate() Coly Li
2022-05-28 12:20   ` Jens Axboe
2022-05-28 12:22     ` Coly Li
2022-05-28 12:23       ` Jens Axboe [this message]
2022-05-28 12:45 [PATCH v2 0/1] bcache fix for Linux v5.19 (3rd wave) Coly Li
2022-05-28 12:45 ` [PATCH 1/1] bcache: avoid unnecessary soft lockup in kworker update_writeback_rate() Coly Li

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=e0002f36-fb39-5d27-ce10-2278fd4a77bb@kernel.dk \
    --to=axboe@kernel.dk \
    --cc=colyli@suse.de \
    --cc=linux-bcache@vger.kernel.org \
    --cc=linux-block@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.