From: Logan Gunthorpe <logang@deltatee.com>
To: Guoqing Jiang <guoqing.jiang@linux.dev>,
Donald Buczek <buczek@molgen.mpg.de>, Song Liu <song@kernel.org>,
Jan Kara <jack@suse.cz>, Jens Axboe <axboe@kernel.dk>,
Paolo Valente <paolo.valente@linaro.org>
Cc: linux-raid <linux-raid@vger.kernel.org>
Subject: Re: [Update PATCH V3] md: don't unregister sync_thread with reconfig_mutex held
Date: Wed, 25 May 2022 12:22:06 -0600
Message-ID: <ae6d294a-e9ec-a81d-6085-a9341ed8a470@deltatee.com>
In-Reply-To: <775d6734-2b08-21a8-a093-f750d31ce6ce@linux.dev>
On 2022-05-25 03:04, Guoqing Jiang wrote:
> I would prefer to focus on block tree or md tree. With latest block tree
> (commit 44d8538d7e7dbee7246acda3b706c8134d15b9cb), I get below
> similar issue as Donald reported, it happened with the cmd (which did
> work with 5.12 kernel).
>
> vm79:~/mdadm> sudo ./test --dev=loop --tests=05r1-add-internalbitmap
Ok, so this test passes for me, but my VM was not running with bfq. It
also seems we have layers upon layers of different bugs to untangle.
Perhaps you can try the tests with bfq disabled to make progress on the
other regression I reported.
If I enable bfq and set the loop devices to the bfq scheduler, then I
hit the same bug as you and Donald. It's clearly a NULL pointer
dereference in the bfq code, which seems to be triggered by the
partition read after mdadm opens a block device (not sure if it's the md
device or the loop device, but I suspect the latter since the trace
doesn't go through any md code).
Simplifying things down a bit, the NULL pointer dereference can be
triggered by creating an md device on loop devices that have the bfq
scheduler set:
mdadm --create --run /dev/md0 --level=1 -n2 /dev/loop0 /dev/loop1
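
For anyone reproducing this, the steps can be sketched roughly as below.
This is only a sketch under assumptions: /dev/loop0 and /dev/loop1 are
already attached to backing files, bfq is available in the running
kernel, and the scheduler is selected through the usual sysfs attribute.
The `run` helper and `repro_bfq_md` function are illustrative names, not
part of mdadm or its test suite. Commands are only printed unless RUN=1
is set.

```shell
#!/bin/sh
# Sketch of the reproducer described above (assumptions: loop0/loop1 are
# already set up, bfq is built in or loaded). Dry run by default; set
# RUN=1 and run as root to actually execute the commands.

run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

repro_bfq_md() {
    for dev in loop0 loop1; do
        # Select bfq via the block layer's sysfs scheduler attribute.
        run sh -c "echo bfq > /sys/block/$dev/queue/scheduler"
    done
    # Same mdadm invocation as quoted in the text.
    run mdadm --create --run /dev/md0 --level=1 -n2 /dev/loop0 /dev/loop1
}

repro_bfq_md
```

Reading /sys/block/loop0/queue/scheduler before the mdadm step should
show which scheduler is actually active.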
The crash occurs in bfq_bio_bfqg() with blkg_to_bfqg() returning NULL.
It's hard to trace where the NULL comes from in there -- the code is a
bit complex.
I've found that the bfq bug exists in current md-next (42b805af102) but
does not trigger at its base tag, v5.18-rc3. Bisecting revealed the bug
was introduced by:
4e54a2493e58 ("bfq: Get rid of __bio_blkcg() usage")
Reverting that commit and the next commit (075a53b7) on top of md-next
was confirmed to fix the bug.
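
The bisect and the confirming revert can be sketched roughly as follows.
It assumes a kernel git tree in the current directory that contains the
md-next commit and the v5.18-rc3 tag. The `run`, `start_bisect` and
`revert_suspects` names are illustrative only, and the build/boot/test
step between bisect iterations is left out. Commands are only printed
unless RUN=1 is set.

```shell
#!/bin/sh
# Sketch of the bisect range and the two reverts mentioned above.
# Dry run by default; set RUN=1 inside a kernel tree to execute.

run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

start_bisect() {
    # Bad: md-next, where the crash reproduces. Good: the v5.18-rc3 tag.
    run git bisect start 42b805af102 v5.18-rc3
}

revert_suspects() {
    # Start from the md-next commit named in the text.
    run git checkout 42b805af102
    # Revert the later commit first, then the one bisect pointed at.
    run git revert --no-edit 075a53b7 4e54a2493e58
}

start_bisect
revert_suspects
```

After `start_bisect`, each step is marked with `git bisect good` or
`git bisect bad` depending on whether the mdadm command crashes.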
I've copied Jan, Jens and Paolo, who can hopefully help with this. A
cleaned-up stack trace follows below for their benefit.
Logan
--
BUG: KASAN: null-ptr-deref in bfq_bio_bfqg+0x65/0xf0
Read of size 1 at addr 0000000000000094 by task mdadm/850
CPU: 1 PID: 850 Comm: mdadm Not tainted 5.18.0-rc3-eid-vmlocalyes-dbg-00005-g42b805af1024 #2113
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-2 04/01/2014
Call Trace:
<TASK>
dump_stack_lvl+0x5a/0x74
kasan_report.cold+0x5f/0x1a9
__asan_load1+0x4d/0x50
bfq_bio_bfqg+0x65/0xf0
bfq_bic_update_cgroup+0x2f/0x340
bfq_insert_requests+0x568/0x5800
blk_mq_sched_insert_request+0x180/0x230
blk_mq_submit_bio+0x9f0/0xe50
__submit_bio+0xeb/0x100
submit_bio_noacct_nocheck+0x1fd/0x470
submit_bio_noacct+0x350/0xa80
submit_bio+0x84/0xf0
submit_bh_wbc+0x27a/0x2b0
block_read_full_page+0x578/0xb60
blkdev_readpage+0x18/0x20
do_read_cache_folio+0x290/0x430
read_cache_page+0x41/0x130
read_part_sector+0x7a/0x3d0
read_lba+0x161/0x340
efi_partition+0x1ce/0xdd0
bdev_disk_changed+0x2e9/0x6a0
blkdev_get_whole+0xd5/0x140
blkdev_get_by_dev.part.0+0x37f/0x570
blkdev_get_by_dev+0x51/0x60
blkdev_open+0xa4/0x140
do_dentry_open+0x2a7/0x6d0
vfs_open+0x58/0x60
path_openat+0x77e/0x13f0
do_filp_open+0x154/0x280
do_sys_openat2+0x119/0x2c0
__x64_sys_openat+0xe7/0x160
do_syscall_64+0x3b/0x90
entry_SYSCALL_64_after_hwframe+0x44/0xae
--
bfq_bio_bfqg+0x65/0xf0:
bfq_bio_bfqg at block/bfq-cgroup.c:619
614 struct blkcg_gq *blkg = bio->bi_blkg;
615 struct bfq_group *bfqg;
616
617 while (blkg) {
618 bfqg = blkg_to_bfqg(blkg);
>619< if (bfqg->online) {
620 bio_associate_blkg_from_css(bio,
621 return bfqg;
622 }
623 blkg = blkg->parent;
624