Re: [linux-next][mainline/master] [IPR] [Function could be = "__mutex_lock_slowpath(lock)"]OOPs kernel crash while performing IPR test

From: Mohamed Khalfella <mkhalfella@purestorage.com>
To: Yu Kuai <yukuai1@huaweicloud.com>
Cc: linux-kernel@vger.kernel.org, peterz@infradead.org,
	abdhalee@linux.vnet.ibm.com, mingo@redhat.com, will@kernel.org,
	longman@redhat.com, boqun.feng@gmail.com, sachinp@linux.vnet.com,
	mputtash@linux.vnet.com,
	Tasmiya Nalatwad <tasmiya@linux.vnet.ibm.com>
Subject: Re: [linux-next][mainline/master] [IPR] [Function could be = "__mutex_lock_slowpath(lock)"]OOPs kernel crash while performing IPR test
Date: Mon, 29 Jan 2024 11:23:45 -0800
Message-ID: <20240129192345.GA2300500@medusa>
In-Reply-To: <67f349e2-33f1-30a3-f92c-3c0a68d6d22f@linux.vnet.ibm.com>

On 2023-08-27 13:56:14 +0530, Tasmiya Nalatwad wrote:
> Greetings,
> 
> [linux-next][mainline/master] [IPR] [Function could be =
> "__mutex_lock_slowpath(lock)"]OOPs kernel crash while performing IPR test

Hello, 

We hit this issue while testing the 6.6.9 LTS kernel, and I narrowed it down
to commit fcaa174a9c99 ("scsi/sg: don't grab scsi host module reference").
Not holding a reference to the scsi_device caused the last reference to
be dropped in sg_remove_sfp_usercontext(). That in turn caused request_queue
to be set to NULL in scsi_device_dev_release(). Passing that NULL pointer to
blk_trace_remove() caused this panic. More detail below.

The issue can be reproduced by having a userspace process hold the last
reference to a device that has been removed.

# python3
>>> import os
>>> fd = os.open('/dev/sg22', os.O_RDONLY)
>>> # wait until the device is removed
>>> os.close(fd)
#

# echo 1 >  /sys/bus/pci/devices/0000\:5e\:00.0/remove
# # Now run >>> os.close(fd) above
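
For convenience, the two steps above can be combined into one script. This is
only a rough sketch of the interactive session, not something I ran as-is. The
helper names hold_fd_until_removed() and wait_until_gone() are made up for
illustration, and the polling helper assumes the /dev node disappears once the
device is removed.

```python
import os
import time


def hold_fd_until_removed(dev_path, wait_for_removal, flags=os.O_RDONLY):
    """Open dev_path, block in wait_for_removal(), then close the fd.

    Returns the (now closed) file descriptor number.
    """
    fd = os.open(dev_path, flags)
    try:
        # The device is expected to be removed while the fd is still open,
        # for example through the sysfs "remove" file of its PCI parent.
        wait_for_removal()
    finally:
        # Close only after removal, matching the os.close(fd) step above.
        os.close(fd)
    return fd


def wait_until_gone(path, poll_interval=0.5):
    """Return a callable that polls until path no longer exists.

    Assumption: the device node is deleted when the device is removed.
    """
    def _wait():
        while os.path.exists(path):
            time.sleep(poll_interval)
    return _wait
```

Usage would be hold_fd_until_removed('/dev/sg22', wait_until_gone('/dev/sg22'))
in one terminal, followed by the echo to the sysfs remove file in another.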

    python3-14739    53..... 3782240930us : sg_remove_sfp_kprobe: (sg_remove_sfp+0x0/0xa0 <ffffffff816dd5c0>) kref=0xffff88b047055320
    python3-14739    53..... 3782240934us : <stack trace>
 => sg_remove_sfp+0x1/0xa0 <ffffffff816dd5c1>
 => sg_release+0xa2/0x100 <ffffffff816de5e2>
 => __fput+0xe9/0x280 <ffffffff812fcf79>
 => __x64_sys_close+0x39/0x80 <ffffffff812f58a9>
 => do_syscall_64+0x35/0x80 <ffffffff81b57485>
 => entry_SYSCALL_64_after_hwframe+0x46/0xb0 <ffffffff81c0006a>
    kworker/-2357     53..... 3782240948us : scsi_device_dev_release_kprobe: (scsi_device_dev_release+0x0/0x2c0 <ffffffff816c0680>) device=0xffff88ac553a61c0
    kworker/-2357     53..... 3782240951us : <stack trace>
 => scsi_device_dev_release+0x1/0x2c0 <ffffffff816c0681>
 => device_release+0x31/0x90 <ffffffff81662fc1>
 => kobject_put+0x6d/0x180 <ffffffff81b3527d>
 => scsi_device_put+0x20/0x30 <ffffffff816b1190>
 => sg_remove_sfp_usercontext+0xfb/0x190 <ffffffff816de73b>
 => process_one_work+0x133/0x2f0 <ffffffff810a5983>
 => worker_thread+0x2ec/0x400 <ffffffff810a6dbc>
 => kthread+0xe2/0x110 <ffffffff810aed42>
 => ret_from_fork+0x2d/0x50 <ffffffff8103ddad>
 => ret_from_fork_asm+0x11/0x20 <ffffffff810017d1>

python3-14739 was holding the last reference. Closing the fd called
sg_remove_sfp(), which queued sg_remove_sfp_usercontext() for execution on a
kworker. The scsi_device_put() call there ran scsi_device_dev_release(), which
set sdev->request_queue to NULL, causing the panic.

    kworker/49:1-607     [049] .....   519.002877: scsi_device_dev_release_kprobe: (scsi_device_dev_release+0x0/0x2c0 <ffffffff816c0680>) device=0xffff889d227bf1c0
    kworker/49:1-607     [049] .....   519.002882: <stack trace>
 => scsi_device_dev_release+0x1/0x2c0 <ffffffff816c0681>
 => device_release+0x31/0x90 <ffffffff81662fc1>
 => kobject_put+0x6d/0x180 <ffffffff81b3526d>
 => scsi_device_put+0x20/0x30 <ffffffff816b1190>
 => sg_device_destroy+0x2f/0xb0 <ffffffff816dc16f>
 => sg_remove_sfp_usercontext+0x133/0x190 <ffffffff816de763>
 => process_one_work+0x133/0x2f0 <ffffffff810a5983>
 => worker_thread+0x2ec/0x400 <ffffffff810a6dbc>
 => kthread+0xe2/0x110 <ffffffff810aed42>
 => ret_from_fork+0x2d/0x50 <ffffffff8103ddad>
 => ret_from_fork_asm+0x11/0x20 <ffffffff810017d1>

Reverting 80b6051085c5 ("scsi: sg: Fix checking return value of
blk_get_queue()") and fcaa174a9c99 ("scsi/sg: don't grab scsi host module
reference") fixed the problem. The stack trace above shows the last
reference to the scsi_device being dropped from sg_device_destroy().