All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Guilherme G. Piccoli" <gpiccoli@canonical.com>
To: linux-block@vger.kernel.org, linux-raid@vger.kernel.org
Cc: dm-devel@redhat.com, axboe@kernel.dk, gavin.guo@canonical.com,
	jay.vosburgh@canonical.com, gpiccoli@canonical.com,
	kernel@gpiccoli.net, Ming Lei <ming.lei@redhat.com>,
	Song Liu <liu.song.a23@gmail.com>,
	Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>,
	stable@vger.kernel.org
Subject: [PATCH V2 2/2] md/raid0: Do not bypass blocking queue entered for raid0 bios
Date: Mon, 20 May 2019 19:09:11 -0300	[thread overview]
Message-ID: <20190520220911.25192-2-gpiccoli@canonical.com> (raw)
In-Reply-To: <20190520220911.25192-1-gpiccoli@canonical.com>

Commit cd4a4ae4683d ("block: don't use blocking queue entered for
recursive bio submits") introduced the flag BIO_QUEUE_ENTERED in order
split bios bypass the blocking queue entering routine and use the live
non-blocking version. It was a result of an extensive discussion in
a linux-block thread[0], and the purpose of this change was to prevent
a hung task waiting on a reference to drop.

Happens that md raid0 split bios all the time, and more important,
it changes their underlying device to the raid member. After the change
introduced by this flag's usage, we experience various crashes if a raid0
member is removed during a large write. This happens because the bio
reaches the live queue entering function when the queue of the raid0
member is dying.

A simple reproducer of this behavior is presented below:
a) Build kernel v5.2-rc1 with CONFIG_BLK_DEV_THROTTLING=y.

b) Create a raid0 md array with 2 NVMe devices as members, and mount it
with an ext4 filesystem.

c) Run the following oneliner (supposing the raid0 is mounted in /mnt):
(dd of=/mnt/tmp if=/dev/zero bs=1M count=999 &); sleep 0.3;
echo 1 > /sys/block/nvme0n1/device/device/remove
(whereas nvme0n1 is the 2nd array member)

This will trigger the following warning/oops:

------------[ cut here ]------------
no blkg associated for bio on block-device: nvme0n1
WARNING: CPU: 9 PID: 184 at ./include/linux/blk-cgroup.h:785
generic_make_request_checks+0x4dd/0x690
[...]
BUG: unable to handle kernel NULL pointer dereference at 0000000000000155
PGD 0 P4D 0
Oops: 0000 [#1] SMP PTI
RIP: 0010:blk_throtl_bio+0x45/0x970
[...]
Call Trace:
 generic_make_request_checks+0x1bf/0x690
 generic_make_request+0x64/0x3f0
 raid0_make_request+0x184/0x620 [raid0]
 ? raid0_make_request+0x184/0x620 [raid0]
 ? blk_queue_split+0x384/0x6d0
 md_handle_request+0x126/0x1a0
 md_make_request+0x7b/0x180
 generic_make_request+0x19e/0x3f0
 submit_bio+0x73/0x140
[...]

This patch changes raid0 driver to fallback to the "old" blocking queue
entering procedure, by clearing the BIO_QUEUE_ENTERED from raid0 bios.
This prevents the crashes and restores the regular behavior of raid0
arrays when a member is removed during a large write.

[0] https://marc.info/?l=linux-block&m=152638475806811

Cc: Jens Axboe <axboe@kernel.dk>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Song Liu <liu.song.a23@gmail.com>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: stable@vger.kernel.org # v4.18
Fixes: cd4a4ae4683d ("block: don't use blocking queue entered for recursive bio submits")
Signed-off-by: Guilherme G. Piccoli <gpiccoli@canonical.com>
---

No changes from V1, only rebased to v5.2-rc1.
Also, notice that if [1] gets merged before this patch, the
BIO_QUEUE_ENTERED flag will change to BIO_SPLITTED, so the (easy) conflict
will need to be worked.

[1] https://lore.kernel.org/linux-block/20190515030310.20393-4-ming.lei@redhat.com/

 drivers/md/raid0.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/md/raid0.c b/drivers/md/raid0.c
index f3fb5bb8c82a..d5bdc79e0835 100644
--- a/drivers/md/raid0.c
+++ b/drivers/md/raid0.c
@@ -547,6 +547,7 @@ static void raid0_handle_discard(struct mddev *mddev, struct bio *bio)
 			trace_block_bio_remap(bdev_get_queue(rdev->bdev),
 				discard_bio, disk_devt(mddev->gendisk),
 				bio->bi_iter.bi_sector);
+		bio_clear_flag(bio, BIO_QUEUE_ENTERED);
 		generic_make_request(discard_bio);
 	}
 	bio_endio(bio);
@@ -602,6 +603,7 @@ static bool raid0_make_request(struct mddev *mddev, struct bio *bio)
 				disk_devt(mddev->gendisk), bio_sector);
 	mddev_check_writesame(mddev, bio);
 	mddev_check_write_zeroes(mddev, bio);
+	bio_clear_flag(bio, BIO_QUEUE_ENTERED);
 	generic_make_request(bio);
 	return true;
 }
-- 
2.21.0

  reply	other threads:[~2019-05-20 22:09 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-05-20 22:09 [PATCH V2 1/2] block: Fix a NULL pointer dereference in generic_make_request() Guilherme G. Piccoli
2019-05-20 22:09 ` Guilherme G. Piccoli [this message]
2019-05-21 12:56   ` [PATCH V2 2/2] md/raid0: Do not bypass blocking queue entered for raid0 bios Christoph Hellwig
2019-05-21 14:10     ` Guilherme Piccoli
2019-05-21 17:22       ` Christoph Hellwig
2019-05-21 17:37         ` Guilherme Piccoli
2019-05-21  5:59 ` [PATCH V2 1/2] block: Fix a NULL pointer dereference in generic_make_request() Song Liu
2019-05-21 12:07   ` Guilherme Piccoli
2019-05-23 14:33   ` Guilherme G. Piccoli
2019-05-23 17:06     ` Song Liu
2019-05-23 20:30       ` Guilherme Piccoli
2019-05-21 12:51 ` Christoph Hellwig

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190520220911.25192-2-gpiccoli@canonical.com \
    --to=gpiccoli@canonical.com \
    --cc=axboe@kernel.dk \
    --cc=dm-devel@redhat.com \
    --cc=gavin.guo@canonical.com \
    --cc=jay.vosburgh@canonical.com \
    --cc=kernel@gpiccoli.net \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-raid@vger.kernel.org \
    --cc=liu.song.a23@gmail.com \
    --cc=ming.lei@redhat.com \
    --cc=penguin-kernel@I-love.SAKURA.ne.jp \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.