linux-bcache.vger.kernel.org archive mirror
From: Coly Li <colyli@suse.de>
To: Matthias Ferdinand <bcache@mfedv.net>,
	Thorsten Knabe <linux@thorsten-knabe.de>
Cc: linux-bcache@vger.kernel.org
Subject: Re: PROBLEM: bcache related kernel BUG() since Linux 5.12
Date: Sun, 16 May 2021 22:43:33 +0800
Message-ID: <1f1c49d2-6291-f8b5-3627-08ed88114e88@suse.de>
In-Reply-To: <YKDa9IOPsDJ0Wa8i@xoff>

[-- Attachment #1: Type: text/plain, Size: 1254 bytes --]

On 5/16/21 4:42 PM, Matthias Ferdinand wrote:
> On Sat, May 15, 2021 at 09:06:07PM +0200, Thorsten Knabe wrote:
>> Hello.
>>
>> Starting with Linux 5.12 bcache triggers a BUG() after a few minutes of
>> usage.
>>
>> Linux up to 5.11.x is not affected by this bug.
>>
>> Environment:
>> - Debian 10 AMD 64
>> - Kernel 5.12 - 5.12.4
>> - Filesystem ext4
>> - Backing device: degraded software RAID-6 (MD) with 3 of 4 disks active
>>   (unsure if the degraded RAID-6 has an effect or not)
>> - Cache device: Single SSD
> 
> Sorry I can't immediately help with bcache, but for DRBD, there was a
> similar problem with DRBD on degraded md-raid fixed just recently:
> 
>     https://lists.linbit.com/pipermail/drbd-user/2021-May/025904.html
> 
> Although they had silent data corruption AFAICT, not a loud BUG(), and
> they stated problems started with kernel 4.3.
> 
> For DRBD it had to do with split BIOs and readahead, which degraded
> md-raid may or may not fail, and missing a fail on parts of a split-up
> readahead BIO.
> 
> Matthias
> 


This is caused by a latent issue in bcache that is triggered by the bio
code change in v5.12.

The attached patch should help to avoid the panic. The final fixes are
under testing and will be posted very soon.

Coly Li

[-- Attachment #2: 0001-bcache-avoid-oversized-bio_alloc_bioset-call-in-cach.patch --]
[-- Type: text/plain, Size: 2265 bytes --]

From 6f2edee7100efabf2ccccb84e4a92ccbfbddd8c5 Mon Sep 17 00:00:00 2001
From: Coly Li <colyli@suse.de>
Date: Thu, 6 May 2021 10:38:41 +0800
Subject: [PATCH] bcache: avoid oversized bio_alloc_bioset() call in
 cached_dev_cache_miss()

Since Linux v5.12, calling bio_alloc_bioset() with an oversized bio
vector count causes a BUG() panic in biovec_slab(). There are 2
locations in the bcache code that call bio_alloc_bioset(), and only the
one in cached_dev_cache_miss() is at risk of passing an oversized count.

In cached_dev_cache_miss() the bio vector count is calculated by
DIV_ROUND_UP(s->insert_bio_sectors, PAGE_SECTORS). This patch checks the
calculated result: if it is larger than BIO_MAX_VECS, it gives up the
allocation of cache_bio and sends the request to the backing device
directly.

With this restriction, the potential BUG() panic is avoided in the
cache miss code path.

Signed-off-by: Coly Li <colyli@suse.de>
---
 drivers/md/bcache/request.c | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/drivers/md/bcache/request.c b/drivers/md/bcache/request.c
index 29c231758293..a657d3a2b624 100644
--- a/drivers/md/bcache/request.c
+++ b/drivers/md/bcache/request.c
@@ -879,7 +879,7 @@ static void cached_dev_read_done_bh(struct closure *cl)
 static int cached_dev_cache_miss(struct btree *b, struct search *s,
 				 struct bio *bio, unsigned int sectors)
 {
-	int ret = MAP_CONTINUE;
+	int ret = MAP_CONTINUE, nr_iovecs = 0;
 	unsigned int reada = 0;
 	struct cached_dev *dc = container_of(s->d, struct cached_dev, disk);
 	struct bio *miss, *cache_bio;
@@ -916,9 +916,14 @@ static int cached_dev_cache_miss(struct btree *b, struct search *s,
 	/* btree_search_recurse()'s btree iterator is no good anymore */
 	ret = miss == bio ? MAP_DONE : -EINTR;
 
-	cache_bio = bio_alloc_bioset(GFP_NOWAIT,
-			DIV_ROUND_UP(s->insert_bio_sectors, PAGE_SECTORS),
-			&dc->disk.bio_split);
+	nr_iovecs = DIV_ROUND_UP(s->insert_bio_sectors, PAGE_SECTORS);
+	if (nr_iovecs > BIO_MAX_VECS) {
+		pr_warn("inserting bio is too large: %d iovecs, not inserted.\n",
+			nr_iovecs);
+		goto out_submit;
+	}
+	cache_bio = bio_alloc_bioset(GFP_NOWAIT, nr_iovecs,
+				     &dc->disk.bio_split);
 	if (!cache_bio)
 		goto out_submit;
 
-- 
2.26.2
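
For illustration only, the size check added by the patch above can be
reproduced in userspace. This is a minimal sketch, not kernel code: it
assumes 4 KiB pages and 512-byte sectors (so PAGE_SECTORS is 8) and
BIO_MAX_VECS equal to 256, and the helper names nr_iovecs_for() and
cache_bio_fits() are hypothetical.

```c
/*
 * Assumed values for illustration; the kernel provides its own
 * definitions and they may differ by architecture or version.
 */
#define SECTOR_BYTES        512u
#define ASSUMED_PAGE_BYTES  4096u
#define PAGE_SECTORS        (ASSUMED_PAGE_BYTES / SECTOR_BYTES)
#define BIO_MAX_VECS        256u
#define DIV_ROUND_UP(n, d)  (((n) + (d) - 1) / (d))

/* Hypothetical helper: bio vectors needed to cover the given sectors. */
static unsigned int nr_iovecs_for(unsigned int insert_bio_sectors)
{
	return DIV_ROUND_UP(insert_bio_sectors, PAGE_SECTORS);
}

/*
 * Hypothetical helper: returns 1 when the vector count is within
 * BIO_MAX_VECS (allocation may be attempted), 0 when the request
 * would have to bypass the cache, as in the patch's goto out_submit.
 */
static int cache_bio_fits(unsigned int insert_bio_sectors)
{
	return nr_iovecs_for(insert_bio_sectors) <= BIO_MAX_VECS;
}
```

Under these assumptions a 2048-sector (1 MiB) insert needs exactly 256
vectors and still fits, while a 2049-sector insert needs 257 and would
be sent to the backing device without a cache_bio.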

