linux-bcache.vger.kernel.org archive mirror
* [PATCH 0/1] bcache fix for Linux v5.19 (3rd wave)
@ 2022-05-28  6:19 Coly Li
  2022-05-28  6:19 ` [PATCH 1/1] bcache: avoid unnecessary soft lockup in kworker update_writeback_rate() Coly Li
  0 siblings, 1 reply; 5+ messages in thread
From: Coly Li @ 2022-05-28  6:19 UTC (permalink / raw)
  To: axboe; +Cc: linux-block, linux-bcache, Coly Li

Hi Jens,

This submission has only 1 patch, which avoids a bogus soft lockup
warning from the bcache writeback rate update kworker.

Based on your suggestion, this version is clearer and simpler, and it
works as expected in my testing. BCH_WBRATE_UPDATE_RETRY_MAX (15)
defines the maximum number of retries on lock contention; in the worst
case it takes 1+ minutes before update_writeback_rate() calls
down_read() to acquire dc->writeback_lock.

Please consider taking it, and thank you again for the suggestion.

Coly Li
---

Coly Li (1):
  bcache: avoid unnecessary soft lockup in kworker
    update_writeback_rate()

 drivers/md/bcache/bcache.h    |  7 +++++++
 drivers/md/bcache/writeback.c | 31 +++++++++++++++++++++----------
 2 files changed, 28 insertions(+), 10 deletions(-)

-- 
2.35.3


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH 1/1] bcache: avoid unnecessary soft lockup in kworker update_writeback_rate()
  2022-05-28  6:19 [PATCH 0/1] bcache fix for Linux v5.19 (3rd wave) Coly Li
@ 2022-05-28  6:19 ` Coly Li
  2022-05-28 12:20   ` Jens Axboe
  0 siblings, 1 reply; 5+ messages in thread
From: Coly Li @ 2022-05-28  6:19 UTC (permalink / raw)
  To: axboe; +Cc: linux-block, linux-bcache, Coly Li

The kworker routine update_writeback_rate() is scheduled to update the
writeback rate every 5 seconds by default. Before calling
__update_writeback_rate() to do the real job, the semaphore
dc->writeback_lock should be held by the kworker routine.

At the same time, the bcache writeback thread routine
bch_writeback_thread() also needs to hold dc->writeback_lock before
flushing dirty data back to the backing device. If the dirty data set
is large, it might take a very long time for bch_writeback_thread() to
scan all dirty buckets and release dc->writeback_lock. In such a case,
update_writeback_rate() can be starved long enough that the kernel
reports a soft lockup warning starting like:
  watchdog: BUG: soft lockup - CPU#246 stuck for 23s! [kworker/246:31:179713]

Such a soft lockup condition is unnecessary, because after the writeback
thread finishes its job and releases dc->writeback_lock, the kworker
update_writeback_rate() may continue its work and everything is indeed
fine.

This patch avoids the unnecessary soft lockup with the following method:
- Add a new member to struct cached_dev
  - dc->rate_update_retry (0 by default)
- In update_writeback_rate(), call down_read_trylock(&dc->writeback_lock)
  first; if it fails, lock contention has happened.
- If dc->rate_update_retry <= BCH_WBRATE_UPDATE_RETRY_MAX (15), don't
  acquire the lock and reschedule the kworker for the next try.
- If dc->rate_update_retry > BCH_WBRATE_UPDATE_RETRY_MAX, don't retry
  anymore and call down_read(&dc->writeback_lock) to wait for the lock.

With the above method, in the worst case update_writeback_rate() may
retry for 1+ minutes before blocking on dc->writeback_lock by calling
down_read(). For a 4TB cache device with 1TB of dirty data, 90%+ of the
unnecessary soft lockup warning messages can be avoided.

While update_writeback_rate() is retrying to acquire dc->writeback_lock,
of course the writeback rate cannot be updated. This is fair, because
when the kworker is blocked on the lock contention of
dc->writeback_lock, the writeback rate cannot be updated either.

This change follows Jens Axboe's suggestion for a clearer and simpler
version.

Signed-off-by: Coly Li <colyli@suse.de>
---
 drivers/md/bcache/bcache.h    |  7 +++++++
 drivers/md/bcache/writeback.c | 31 +++++++++++++++++++++----------
 2 files changed, 28 insertions(+), 10 deletions(-)

diff --git a/drivers/md/bcache/bcache.h b/drivers/md/bcache/bcache.h
index 9ed9c955add7..24735a5e839f 100644
--- a/drivers/md/bcache/bcache.h
+++ b/drivers/md/bcache/bcache.h
@@ -395,6 +395,13 @@ struct cached_dev {
 	atomic_t		io_errors;
 	unsigned int		error_limit;
 	unsigned int		offline_seconds;
+
+	/*
+	 * Retry to update writeback_rate if contention happens for
+	 * down_read(dc->writeback_lock) in update_writeback_rate()
+	 */
+#define BCH_WBRATE_UPDATE_RETRY_MAX	15
+	unsigned int		rate_update_retry;
 };
 
 enum alloc_reserve {
diff --git a/drivers/md/bcache/writeback.c b/drivers/md/bcache/writeback.c
index d138a2d73240..e8df6e5fe012 100644
--- a/drivers/md/bcache/writeback.c
+++ b/drivers/md/bcache/writeback.c
@@ -235,19 +235,27 @@ static void update_writeback_rate(struct work_struct *work)
 		return;
 	}
 
-	if (atomic_read(&dc->has_dirty) && dc->writeback_percent) {
-		/*
-		 * If the whole cache set is idle, set_at_max_writeback_rate()
-		 * will set writeback rate to a max number. Then it is
-		 * unnecessary to update writeback rate for an idle cache set
-		 * in maximum writeback rate number(s).
-		 */
-		if (!set_at_max_writeback_rate(c, dc)) {
-			down_read(&dc->writeback_lock);
+	/*
+	 * If the whole cache set is idle, set_at_max_writeback_rate()
+	 * will set writeback rate to a max number. Then it is
+	 * unnecessary to update writeback rate for an idle cache set
+	 * in maximum writeback rate number(s).
+	 */
+	if (atomic_read(&dc->has_dirty) && dc->writeback_percent &&
+	    !set_at_max_writeback_rate(c, dc)) {
+		do {
+			if (!down_read_trylock((&dc->writeback_lock))) {
+				dc->rate_update_retry++;
+				if (dc->rate_update_retry <=
+				    BCH_WBRATE_UPDATE_RETRY_MAX)
+					break;
+				down_read(&dc->writeback_lock);
+				dc->rate_update_retry = 0;
+			}
 			__update_writeback_rate(dc);
 			update_gc_after_writeback(c);
 			up_read(&dc->writeback_lock);
-		}
+		} while (0);
 	}
 
 
@@ -1006,6 +1014,9 @@ void bch_cached_dev_writeback_init(struct cached_dev *dc)
 	dc->writeback_rate_fp_term_high = 1000;
 	dc->writeback_rate_i_term_inverse = 10000;
 
+	/* For dc->writeback_lock contention in update_writeback_rate() */
+	dc->rate_update_retry = 0;
+
 	WARN_ON(test_and_clear_bit(BCACHE_DEV_WB_RUNNING, &dc->disk.flags));
 	INIT_DELAYED_WORK(&dc->writeback_rate_update, update_writeback_rate);
 }
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH 1/1] bcache: avoid unnecessary soft lockup in kworker update_writeback_rate()
  2022-05-28  6:19 ` [PATCH 1/1] bcache: avoid unnecessary soft lockup in kworker update_writeback_rate() Coly Li
@ 2022-05-28 12:20   ` Jens Axboe
  2022-05-28 12:22     ` Coly Li
  0 siblings, 1 reply; 5+ messages in thread
From: Jens Axboe @ 2022-05-28 12:20 UTC (permalink / raw)
  To: Coly Li; +Cc: linux-block, linux-bcache

On 5/28/22 12:19 AM, Coly Li wrote:
> [...]

This looks fine, but it doesn't apply to my current for-5.19/drivers
branch, as the previous ones did. Did you spin this one without the
other patches, perhaps?

One minor thing we might want to change if you're respinning it:
BCH_WBRATE_UPDATE_RETRY_MAX isn't really named for what it does, since
it doesn't retry anything; it simply allows updates to be skipped. Why
not call it BCH_WBRATE_UPDATE_MAX_SKIPS instead? I think that would
better convey what it does.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH 1/1] bcache: avoid unnecessary soft lockup in kworker update_writeback_rate()
  2022-05-28 12:20   ` Jens Axboe
@ 2022-05-28 12:22     ` Coly Li
  2022-05-28 12:23       ` Jens Axboe
  0 siblings, 1 reply; 5+ messages in thread
From: Coly Li @ 2022-05-28 12:22 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block, linux-bcache



> On May 28, 2022, at 20:20, Jens Axboe <axboe@kernel.dk> wrote:
> 
> On 5/28/22 12:19 AM, Coly Li wrote:
>> [...]
> 
> This looks fine, but it doesn't apply to my current for-5.19/drivers
> branch, as the previous ones did. Did you spin this one without the
> other patches, perhaps?
> 
> One minor thing we might want to change if you're respinning it:
> BCH_WBRATE_UPDATE_RETRY_MAX isn't really named for what it does, since
> it doesn't retry anything; it simply allows updates to be skipped. Why
> not call it BCH_WBRATE_UPDATE_MAX_SKIPS instead? I think that would
> better convey what it does.

Naming is often a challenge for me. Sure, _MAX_SKIPS is better. I will post another modified version.

Thanks for the suggestion.

Coly Li


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH 1/1] bcache: avoid unnecessary soft lockup in kworker update_writeback_rate()
  2022-05-28 12:22     ` Coly Li
@ 2022-05-28 12:23       ` Jens Axboe
  0 siblings, 0 replies; 5+ messages in thread
From: Jens Axboe @ 2022-05-28 12:23 UTC (permalink / raw)
  To: Coly Li; +Cc: linux-block, linux-bcache

On 5/28/22 6:22 AM, Coly Li wrote:
> 
>> [...]
> 
> Naming is often a challenge for me. Sure, _MAX_SKIPS is better. I will
> post another modified version.

It's hard for everyone :-)

Sounds good, I'll get it applied when it shows up.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2022-05-28 12:23 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-05-28  6:19 [PATCH 0/1] bcache fix for Linux v5.19 (3rd wave) Coly Li
2022-05-28  6:19 ` [PATCH 1/1] bcache: avoid unnecessary soft lockup in kworker update_writeback_rate() Coly Li
2022-05-28 12:20   ` Jens Axboe
2022-05-28 12:22     ` Coly Li
2022-05-28 12:23       ` Jens Axboe

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).