From: John Garry <john.garry@huawei.com>
To: Ming Lei <ming.lei@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>, Christoph Hellwig <hch@lst.de>,
<linux-block@vger.kernel.org>,
<syzbot+77ba3d171a25c56756ea@syzkaller.appspotmail.com>
Subject: Re: [PATCH] blk-mq: fix use-after-free in blk_mq_exit_sched
Date: Wed, 9 Jun 2021 12:42:52 +0100
Message-ID: <81c38feb-9d3e-7adc-57e6-54bccf0d3142@huawei.com>
In-Reply-To: <YMCU+iuHs4ULN0lb@T590>
On 09/06/2021 11:16, Ming Lei wrote:
> On Wed, Jun 09, 2021 at 09:59:43AM +0100, John Garry wrote:
>> On 09/06/2021 07:30, Ming Lei wrote:
>>
>> Thanks for the fix
>>
>>> tagset can't be used after blk_cleanup_queue() has returned because
>>> freeing tagset usually follows blk_cleanup_queue(). Commit d97e594c5166
>>> ("blk-mq: Use request queue-wide tags for tagset-wide sbitmap") adds
>>> check on q->tag_set->flags in blk_mq_exit_sched(), and causes
>>> use-after-free.
>>>
>>> Fix it by using hctx->flags.
>>>
>>
>> The tagset is a member of the Scsi_Host structure. So it is true that this
>> memory may be freed before the request_queue is exited?
>
> Yeah, please see commit c3e2219216c9 ("block: free sched's request pool in
> blk_cleanup_queue")
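
To make the lifetime argument above concrete, here is a minimal userspace sketch. It is only a model, not kernel code: every model_* name and the MODEL_F_TAG_SHARED bit are made up for illustration, and it assumes the hctx keeps its own copy of the tag set flags from init time. The point it shows is that a teardown path which reads the copy in the hctx never has to touch the driver-owned tag set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical flag bit standing in for the "shared sbitmap" tag set flag. */
#define MODEL_F_TAG_SHARED (1u << 3)

/* Owned by the driver; may be freed before the queue is released. */
struct model_tag_set {
	unsigned int flags;
};

/* Owned by the queue; carries a copy of the flags taken at init time. */
struct model_hctx {
	unsigned int flags;
};

struct model_queue {
	struct model_tag_set *tag_set;
	struct model_hctx hctx;
};

static struct model_queue *model_queue_alloc(struct model_tag_set *set)
{
	struct model_queue *q = calloc(1, sizeof(*q));

	if (!q)
		return NULL;
	q->tag_set = set;
	q->hctx.flags = set->flags;	/* assumption: flags are copied here */
	return q;
}

/* Driver teardown: the tag set goes away while the queue is still alive. */
static void model_driver_remove(struct model_queue *q)
{
	free(q->tag_set);
	/* Model only: cleared so a stale access is obvious. The real
	 * pointer would simply be left dangling. */
	q->tag_set = NULL;
}

/* Late exit path: consults only memory that the queue itself owns. */
static bool model_exit_sched_is_shared(const struct model_queue *q)
{
	return (q->hctx.flags & MODEL_F_TAG_SHARED) != 0;
}
```

Calling model_driver_remove() and then model_exit_sched_is_shared() mirrors the unbind-then-umount ordering in the reproducer below: the check still works because it never dereferences q->tag_set.
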
JFYI, I could recreate with the following simple steps:
root@(none)$ mount /dev/sda1 mnt
[ 27.252887] FAT-fs (sda1): Volume was not properly unmounted. Some data may be corrupt. Please run fsck.
root@(none)$ echo HISI0162:01 > ./sys/bus/platform/drivers/hisi_sas_v2_hw/unbind
[ 31.262274] sas: ex 500e004aaaaaaa1f phys DID NOT change
[ 31.270314] sas: ex 500e004aaaaaaa1f phys DID NOT change
[ 31.278262] sas: ex 500e004aaaaaaa1f phys DID NOT change
[ 31.286245] sas: ex 500e004aaaaaaa1f phys DID NOT change
[ 31.294164] sas: ex 500e004aaaaaaa1f phys DID NOT change
[ 31.302143] sas: ex 500e004aaaaaaa1f phys DID NOT change
[ 31.310097] sas: ex 500e004aaaaaaa1f phys DID NOT change
[ 31.321599] hisi_sas_v2_hw HISI0162:01: dev[9:1] is gone
[ 31.429245] hisi_sas_v2_hw HISI0162:01: dev[8:1] is gone
[ 31.533461] hisi_sas_v2_hw HISI0162:01: dev[7:1] is gone
[ 31.637338] hisi_sas_v2_hw HISI0162:01: dev[6:1] is gone
[ 31.740840] hisi_sas_v2_hw HISI0162:01: dev[5:1] is gone
[ 31.750659] sd 0:0:3:0: [sdd] Synchronizing SCSI cache
[ 31.833500] hisi_sas_v2_hw HISI0162:01: dev[4:1] is gone
[ 31.937351] hisi_sas_v2_hw HISI0162:01: dev[3:1] is gone
[ 31.947749] sd 0:0:1:0: [sdb] Synchronizing SCSI cache
[ 31.953195] sd 0:0:1:0: [sdb] Stopping disk
[ 32.690815] hisi_sas_v2_hw HISI0162:01: dev[2:5] is gone
[ 32.771526] hisi_sas_v2_hw HISI0162:01: dev[1:1] is gone
[ 32.790406] hisi_sas_v2_hw HISI0162:01: dev[0:2] is gone
root@(none)$
root@(none)$
root@(none)$ umount mnt
[ 37.323039] ==================================================================
[ 37.330262] BUG: KASAN: use-after-free in blk_mq_exit_sched+0x110/0x1c8
[ 37.336880] Read of size 4 at addr ffff001051e80100 by task umount/547
[ 37.343401]
[ 37.344884] CPU: 4 PID: 547 Comm: umount Not tainted 5.13.0-rc5-next-20210608 #80
[ 37.352362] Hardware name: Huawei Taishan 2280 /D05, BIOS Hisilicon D05 IT21 Nemo 2.0 RC0 04/18/2018
[ 37.361486] Call trace:
[ 37.363924] dump_backtrace+0x0/0x2d0
[ 37.367586] show_stack+0x18/0x28
[ 37.370898] dump_stack_lvl+0xfc/0x138
[ 37.374643] print_address_description.constprop.13+0x78/0x314
[ 37.380472] kasan_report+0x1e0/0x248
[ 37.384131] __asan_load4+0x9c/0xd8
[ 37.387615] blk_mq_exit_sched+0x110/0x1c8
[ 37.391706] __elevator_exit+0x34/0x58
[ 37.395451] blk_release_queue+0x108/0x1d8
[ 37.399545] kobject_put+0xa8/0x180
[ 37.403029] blk_put_queue+0x14/0x20
[ 37.406601] disk_release+0xcc/0x100
[ 37.410171] device_release+0x94/0x110
[ 37.413918] kobject_put+0xa8/0x180
[ 37.417401] put_device+0x14/0x28
[ 37.420712] put_disk+0x2c/0x40
[ 37.423848] blkdev_put_no_open+0x54/0x78
[ 37.427853] blkdev_put+0x108/0x258
[ 37.431335] kill_block_super+0x5c/0x78
[ 37.435166] deactivate_locked_super+0x6c/0xd0
[ 37.439605] deactivate_super+0x8c/0xa8
[ 37.443435] cleanup_mnt+0x110/0x1c0
[ 37.447007] __cleanup_mnt+0x14/0x20
[ 37.450578] task_work_run+0xbc/0x1a8
[ 37.454236] do_notify_resume+0x2cc/0x590
[ 37.458242] work_pending+0xc/0x3c8
[ 37.461725]
[ 37.463207] The buggy address belongs to the page:
[ 37.467990] page:(____ptrval____) refcount:0 mapcount:-128 mapping:0000000000000000 index:0x0 pfn:0x1051e80
[ 37.477724] flags: 0xbfffc0000000000(node=0|zone=2|lastcpupid=0xffff)
[ 37.484164] raw: 0bfffc0000000000 fffffc00415a9008 ffff0017fbffebb0 0000000000000000
[ 37.491900] raw: 0000000000000000 0000000000000006 00000000ffffff7f 0000000000000000
[ 37.499635] page dumped because: kasan: bad access detected
[ 37.505198]
[ 37.506680] Memory state around the buggy address:
[ 37.511463]  ffff001051e80000: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[ 37.518677]  ffff001051e80080: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[ 37.525891] >ffff001051e80100: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[ 37.533104]                    ^
[ 37.536324]  ffff001051e80180: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[ 37.543538]  ffff001051e80200: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[ 37.550751] ==================================================================
[ 37.557963] Disabling lock debugging due to kernel taint
root@(none)$
root@(none)$
And this patch fixes it:
Tested-by: John Garry <john.garry@huawei.com>
>
>>
>>> Reported-by: syzbot+77ba3d171a25c56756ea@syzkaller.appspotmail.com
>>> Fixes: d97e594c5166 ("blk-mq: Use request queue-wide tags for tagset-wide sbitmap")
>>> Cc: John Garry <john.garry@huawei.com>
>>> Signed-off-by: Ming Lei <ming.lei@redhat.com>
>>> ---
>>> block/blk-mq-sched.c | 4 +++-
>>> 1 file changed, 3 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
>>> index a9182d2f8ad3..80273245d11a 100644
>>> --- a/block/blk-mq-sched.c
>>> +++ b/block/blk-mq-sched.c
>>> @@ -680,6 +680,7 @@ void blk_mq_exit_sched(struct request_queue *q, struct elevator_queue *e)
>>> {
>>> struct blk_mq_hw_ctx *hctx;
>>> unsigned int i;
>>> + unsigned int flags = 0;
>>> queue_for_each_hw_ctx(q, hctx, i) {
>>> blk_mq_debugfs_unregister_sched_hctx(hctx);
>>> @@ -687,12 +688,13 @@ void blk_mq_exit_sched(struct request_queue *q, struct elevator_queue *e)
>>> e->type->ops.exit_hctx(hctx, i);
>>> hctx->sched_data = NULL;
>>> }
>>> + flags = hctx->flags;
>>
>> I know the choice is limited, but it is unfortunate that we must set flags
>> in a loop
>
> Does it matter?
It's just a nit on the coding style: it's not an especially good
practice to set the same value in a loop.
But, as I said, choice is limited.
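
To spell out the nit, here is a small userspace sketch of the two ways to obtain the flags once the tag set is off limits. The demo_* names are hypothetical, and the "first context" variant is just an illustration of an alternative, not something proposed in this thread. Both give the same answer as long as every hw context carries the same flags.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for a hardware context; only the flags matter here. */
struct demo_hctx {
	unsigned int flags;
};

/* Mirrors the shape of the patch: remember flags while walking the
 * hw contexts. The same value is stored on every iteration, and the
 * result stays 0 if there are no contexts at all. */
static unsigned int demo_flags_from_loop(const struct demo_hctx *hctxs,
					 size_t nr)
{
	unsigned int flags = 0;
	size_t i;

	for (i = 0; i < nr; i++)
		flags = hctxs[i].flags;
	return flags;
}

/* Hypothetical alternative: read the flags once, from the first
 * context, and fall back to 0 when there is none. */
static unsigned int demo_flags_from_first(const struct demo_hctx *hctxs,
					  size_t nr)
{
	return nr ? hctxs[0].flags : 0;
}
```

The loop variant needs no extra bounds check because it reuses an iteration that already exists, which is presumably why the choice is limited.
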
>
>>
>>> }
>>> blk_mq_debugfs_unregister_sched(q);
>>> if (e->type->ops.exit_sched)
>>> e->type->ops.exit_sched(e);
>>> blk_mq_sched_tags_teardown(q);
>>> - if (blk_mq_is_sbitmap_shared(q->tag_set->flags))
>>> + if (blk_mq_is_sbitmap_shared(flags))
>>> blk_mq_exit_sched_shared_sbitmap(q);
>>
>> this is
>>
>> blk_mq_exit_sched_shared_sbitmap(struct request_queue *queue)
>> {
>> sbitmap_queue_free(&queue->sched_bitmap_tags);
>> ..
>> }
>>
>> And isn't it safe to call sbitmap_queue_free() when
>> sbitmap_queue_init_node() has not been called?
>>
>> I'm just wondering if we can always call blk_mq_exit_sched_shared_sbitmap()?
>> I know it's not an ideal choice either.
>
> So far it may work, not sure if it can in future, I suggest to follow
> the traditional alloc & free pattern.
>
>
Fine
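
For reference, the alloc & free pattern suggested above can be sketched in userspace as follows. This is a model under stated assumptions, not the blk-mq code: the sketch_* names and SKETCH_F_SHARED are invented, and a plain calloc()/free() pair stands in for the sbitmap init and free helpers. The free side is guarded by the same condition as the init side, so it does not rely on freeing a never-initialised object being harmless.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical flag: the queue uses a shared scheduler bitmap. */
#define SKETCH_F_SHARED (1u << 0)

struct sketch_bitmap {
	unsigned long *map;
	size_t nr_words;
};

struct sketch_queue {
	unsigned int flags;
	struct sketch_bitmap sched_bitmap;
};

static int sketch_bitmap_init(struct sketch_bitmap *sb, size_t nr_words)
{
	sb->map = calloc(nr_words, sizeof(*sb->map));
	if (!sb->map)
		return -1;
	sb->nr_words = nr_words;
	return 0;
}

static void sketch_bitmap_free(struct sketch_bitmap *sb)
{
	free(sb->map);
	sb->map = NULL;
	sb->nr_words = 0;
}

/* Init side: allocate the bitmap only for "shared" queues. */
static int sketch_init_sched(struct sketch_queue *q, unsigned int flags)
{
	q->flags = flags;
	if (q->flags & SKETCH_F_SHARED)
		return sketch_bitmap_init(&q->sched_bitmap, 4);
	return 0;
}

/* Exit side: free under exactly the condition used at init time. */
static void sketch_exit_sched(struct sketch_queue *q)
{
	if (q->flags & SKETCH_F_SHARED)
		sketch_bitmap_free(&q->sched_bitmap);
}
```

In this model an unconditional free would also happen to work on a zeroed queue, since free(NULL) is a no-op, but that depends on the internals of the free helper staying that way.
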
Thanks,
John
Thread overview: 8+ messages
2021-06-09 6:30 [PATCH] blk-mq: fix use-after-free in blk_mq_exit_sched Ming Lei
2021-06-09 8:59 ` John Garry
2021-06-09 10:16 ` Ming Lei
2021-06-09 11:42 ` John Garry [this message]
2021-06-14 22:07 ` Ming Lei
2021-06-15 10:02 ` John Garry
2021-06-18 6:03 ` Ming Lei
2021-06-18 14:50 ` Jens Axboe