linux-kernel.vger.kernel.org archive mirror
From: Lei Lei2 Yin <yinlei2@lenovo.com>
To: Sagi Grimberg <sagi@grimberg.me>,
	"kbusch@kernel.org" <kbusch@kernel.org>,
	"axboe@fb.com" <axboe@fb.com>, "hch@lst.de" <hch@lst.de>
Cc: "linux-nvme@lists.infradead.org" <linux-nvme@lists.infradead.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"cybeyond@foxmail.com" <cybeyond@foxmail.com>
Subject: Re: [External] Re: [PATCH] nvme: fix heap-use-after-free and oops in bio_endio for nvme multipath
Date: Tue, 21 Mar 2023 11:54:23 +0000
Message-ID: <PS1PR03MB49395AC5BC73DDDA6A79E87488819@PS1PR03MB4939.apcprd03.prod.outlook.com>
In-Reply-To: <042385ef-285e-5179-941b-ab37f490c1d8@grimberg.me>

	Thank you for your reply.

	This problem occurs with nvme over rdma and nvme over tcp when nvme multipath is in use. The ns gendisk is deleted because the nvmf target subsystem becomes faulty: the host then sees keep-alive timeouts and I/O timeouts on all paths. After ctrl-loss-tmo seconds, the host removes the failed ctrl and the ns gendisk.

	We have reproduced this problem in Linux-5.10.136, Linux-5.10.167 and the latest commit in linux-5.10.y, and this patch is only applicable to linux-5.10.y.

	Yes, this is absolutely the wrong place to do this. Can I move this modification to after the call to nvme_trace_bio_complete?

	If modifications are needed, do I need to resubmit the patch?
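
	To make the scenario from the patch description quoted below easier to follow, here is a minimal userspace sketch. It is only a model under stated assumptions, not kernel code: struct model_disk, struct model_bio and every model_* function are hypothetical stand-ins, and the real struct bio and bio_endio in linux-5.10.y do much more. The sketch only shows why retargeting the parent's bi_disk to a longer-lived head disk, before the per-path disk is released, keeps a later chained completion away from freed memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel structures, reduced to what the
 * scenario needs. These are not the real definitions. */
struct model_disk {
	int id;
};

struct model_bio {
	struct model_disk *bi_disk;	/* disk this bio currently targets */
	struct model_bio *parent;	/* next bio up the chain, or NULL */
};

/* Walk up the chain the way a chained completion would, and return the id
 * of the disk that the topmost bio points at. */
int model_chain_top_disk_id(const struct model_bio *bio)
{
	while (bio->parent)
		bio = bio->parent;
	assert(bio->bi_disk != NULL);
	return bio->bi_disk->id;
}

/* Assumed fix point: retarget a completed bio to the head disk so that no
 * pointer to the per-path disk is left behind in the chain. */
void model_retarget_to_head(struct model_bio *bio, struct model_disk *head)
{
	bio->bi_disk = head;
}

/* Parent completes first, the per-path disk is then released, and only
 * afterwards does the child complete and walk up to the parent. */
int model_scenario(void)
{
	struct model_disk head = { .id = 1 };	/* outlives the path disk */
	struct model_disk *path = malloc(sizeof(*path));
	struct model_bio parent = { 0 };
	struct model_bio child = { 0 };

	if (!path)
		return -1;
	path->id = 2;

	parent.bi_disk = path;		/* parent was sent down one path */
	child.bi_disk = &head;
	child.parent = &parent;

	/* Without this call, parent.bi_disk would dangle after free(). */
	model_retarget_to_head(&parent, &head);
	free(path);			/* models the path disk going away */

	return model_chain_top_disk_id(&child);
}
```

	In this model, model_scenario() returns the id of the head disk, because the parent no longer refers to the released path disk when the child completes.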



-----Original Message-----
From: Sagi Grimberg <sagi@grimberg.me>
Sent: March 21, 2023 19:09
To: Lei Lei2 Yin <yinlei2@lenovo.com>; kbusch@kernel.org; axboe@fb.com; hch@lst.de
Cc: linux-nvme@lists.infradead.org; linux-kernel@vger.kernel.org; cybeyond@foxmail.com
Subject: [External] Re: [PATCH] nvme: fix heap-use-after-free and oops in bio_endio for nvme multipath



On 3/21/23 12:50, Lei Lei2 Yin wrote:
> From b134e7930b50679ce48e5522ddd37672b1802340 Mon Sep 17 00:00:00 2001
> From: Lei Yin <yinlei2@lenovo.com>
> Date: Tue, 21 Mar 2023 16:09:08 +0800
> Subject: [PATCH] nvme: fix heap-use-after-free and oops in bio_endio for nvme
>   multipath
> 
> When blk_queue_split works in nvme_ns_head_submit_bio, the input bio will
> be split into two bios. If the parent bio is completed first, and the
> bi_disk in the parent bio is kfreed by nvme_free_ns, the child will access
> this freed bi_disk in bio_endio. This will trigger a heap-use-after-free
> or a null pointer oops.

Can you explain further? It is unclear to me how we can delete the ns gendisk

> 
> The following is kasan report:
> 
> BUG: KASAN: use-after-free in bio_endio+0x477/0x500
> Read of size 8 at addr ffff888106f2e3a8 by task kworker/1:1H/241
> 
> CPU: 1 PID: 241 Comm: kworker/1:1H Kdump: loaded Tainted: G           O      5.10.167 #1
> Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
> Workqueue: kblockd nvme_requeue_work [nvme_core]
> Call Trace:
>   dump_stack+0x92/0xc4
>   ? bio_endio+0x477/0x500
>   print_address_description.constprop.7+0x1e/0x230
>   ? record_print_text.cold.40+0x11/0x11
>   ? _raw_spin_trylock_bh+0x120/0x120
>   ? blk_throtl_bio+0x225/0x3050
>   ? bio_endio+0x477/0x500
>   ? bio_endio+0x477/0x500
>   kasan_report.cold.9+0x37/0x7c
>   ? bio_endio+0x477/0x500
>   bio_endio+0x477/0x500
>   nvme_ns_head_submit_bio+0x950/0x1130 [nvme_core]
>   ? nvme_find_path+0x7f0/0x7f0 [nvme_core]
>   ? __kasan_slab_free+0x11a/0x150
>   ? bio_endio+0x213/0x500
>   submit_bio_noacct+0x2a4/0xd10
>   ? _dev_info+0xcd/0xff
>   ? _dev_notice+0xff/0xff
>   ? blk_queue_enter+0x6c0/0x6c0
>   ? _raw_spin_lock_irq+0x81/0xd5
>   ? _raw_spin_lock+0xd0/0xd0
>   nvme_requeue_work+0x144/0x18c [nvme_core]
>   process_one_work+0x878/0x13e0
>   worker_thread+0x87/0xf70
>   ? __kthread_parkme+0x8f/0x100
>   ? process_one_work+0x13e0/0x13e0
>   kthread+0x30f/0x3d0
>   ? kthread_parkme+0x80/0x80
>   ret_from_fork+0x1f/0x30
> 
> Allocated by task 52:
>   kasan_save_stack+0x19/0x40
>   __kasan_kmalloc.constprop.11+0xc8/0xd0
>   __alloc_disk_node+0x5c/0x320
>   nvme_alloc_ns+0x6e9/0x1520 [nvme_core]
>   nvme_validate_or_alloc_ns+0x17c/0x370 [nvme_core]
>   nvme_scan_work+0x2d4/0x4d0 [nvme_core]
>   process_one_work+0x878/0x13e0
>   worker_thread+0x87/0xf70
>   kthread+0x30f/0x3d0
>   ret_from_fork+0x1f/0x30
> 
> Freed by task 54:
>   kasan_save_stack+0x19/0x40
>   kasan_set_track+0x1c/0x30
>   kasan_set_free_info+0x1b/0x30
>   __kasan_slab_free+0x108/0x150
>   kfree+0xa7/0x300
>   device_release+0x98/0x210
>   kobject_release+0x109/0x3a0
>   nvme_free_ns+0x15e/0x1f7 [nvme_core]
>   nvme_remove_namespaces+0x22f/0x390 [nvme_core]
>   nvme_do_delete_ctrl+0xac/0x106 [nvme_core]
>   process_one_work+0x878/0x13e0
>   worker_thread+0x87/0xf70
>   kthread+0x30f/0x3d0
>   ret_from_fork+0x1f/0x30
> 
> Signed-off-by: Lei Yin <yinlei2@lenovo.com>
> ---
>   drivers/nvme/host/nvme.h | 11 ++++++++++-
>   1 file changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
> index c3e4d9b6f9c0..b441c5ce4157 100644
> --- a/drivers/nvme/host/nvme.h
> +++ b/drivers/nvme/host/nvme.h
> @@ -749,8 +749,17 @@ static inline void nvme_trace_bio_complete(struct request *req,
>   {
>   	struct nvme_ns *ns = req->q->queuedata;
>   
> -	if ((req->cmd_flags & REQ_NVME_MPATH) && req->bio)
> +	if ((req->cmd_flags & REQ_NVME_MPATH) && req->bio) {
>   		trace_block_bio_complete(ns->head->disk->queue, req->bio);
> +
> +		/* Point bio->bi_disk to the head disk.
> +		 * This bio may be the parent of another bio in a bio chain. If this
> +		 * bi_disk is kfreed by nvme_free_ns, the other bio may reach this bio
> +		 * via __bio_chain_endio in bio_endio and access this bi_disk, which
> +		 * triggers a heap-use-after-free or a null pointer oops.
> +		 */
> +		req->bio->bi_disk = ns->head->disk;
> +	}

This is absolutely the wrong place to do this. This is a tracing function, it should not have any other logic.

What tree is this against anyways? There is no bi_disk in struct bio anymore.
