* Some concurrent actions cause __device_links_no_driver to report the warning calltrace.
@ 2021-09-01 8:01 luojiaxing
0 siblings, 0 replies; only message in thread
From: luojiaxing @ 2021-09-01 8:01 UTC (permalink / raw)
To: rafael.j.wysocki, linux-pm; +Cc: linux-kernel
Hi, rafael
I found one issue about device link, and want to ask for your help.
During the kernel test recently, we find that some concurrent actions
generate a calltrace, as shown in the following:
<4>[ 606.102307] WARNING: CPU: 0 PID: 7 at drivers/base/core.c:1339
__device_links_no_driver+0x138/0x170
<4>[ 606.284685] Call trace:
<4>[ 606.287122] __device_links_no_driver+0x138/0x170
<4>[ 606.291804] device_links_driver_cleanup+0xb0/0xfc
<4>[ 606.296575] __device_release_driver+0x148/0x1d8
<4>[ 606.301173] device_release_driver+0x38/0x50
<4>[ 606.305423] bus_remove_device+0x130/0x140
<4>[ 606.309502] device_del+0x174/0x430
<4>[ 606.312975] __scsi_remove_device+0x114/0x14c
<4>[ 606.317313] scsi_remove_target+0x1bc/0x240
<4>[ 606.321469] sas_rphy_remove+0x90/0x94
<4>[ 606.325202] sas_rphy_delete+0x44/0x5c
<4>[ 606.328935] sas_destruct_devices+0x64/0xa0 [libsas]
<4>[ 606.333883] sas_revalidate_domain+0xf8/0x1d0 [libsas]
<4>[ 606.339002] process_one_work+0x1dc/0x48c
<4>[ 606.342994] worker_thread+0x15c/0x464
<4>[ 606.346726] kthread+0x168/0x16c
<4>[ 606.349940] ret_from_fork+0x10/0x18
<4>[ 606.353502] ---[ end trace cceb4f5db8bdcd25 ]---
The test method is to rmmod device driver and perform hard reset on the
hard disk at the same time.
We know device_links_unbind_consumers() is called during rmmod device
driver to release all consumers under the device in sequence.
As we are storage controller driver, so it look as follows:
supplier: storage controller
consumer: sda->sdb->sdc...
As the device_links_unbind_consumers () releases the consumer device in
serial mode. If a concurrent action is performed to hard reset a hard
disk, as the following software call stack show :
scsi_remove_target->device_del->bus_remove_device->device_release_driver->__device_links_no_driver().
The hardreset process also calls __device_links_no_driver.
Assume that device_links_unbind_consumers () is releasing sda and sdb is
queuing, but scsi_remove_target() calls __device_links_no_driver() to
release sdb in advance.Then a warning calltrace is generated.
We got some further analysis, it shows that sdb's link->status is now
DL_STATE_ACTIVE(sda's sdb's link->status is modified to
DL_STATE_SUPPLIER_UNBIND by device_links_unbind_consumers).
The if() in the following code will be false and pass through.
if (link->status != DL_STATE_CONSUMER_PROBE &&
link->status != DL_STATE_ACTIVE)
continue;
Since link->supplier->links.status has been set to DL_DEV_UNBINDING,
next code enters the else branch.
if (link->supplier->links.status == DL_DEV_DRIVER_BOUND) {
WRITE_ONCE(link->status, DL_STATE_AVAILABLE);
} else {
WARN_ON(!(link->flags & DL_FLAG_SYNC_STATE_ONLY));
WRITE_ONCE(link->status, DL_STATE_DORMANT);
}
Because link->flags is set to DL_FLAG_MANAGED, calltrace is generated
based on WARN_ON.
In conclusion, we know that the call trace is generated because
link->supplier->links.status and link->status are not modified
synchronously.
After link->supplier->links.status is changed to DL_DEV_UNBINDING, the
value of link->status is changed to DL_STATE_SUPPLIER_UNBIND in sequence.
During this time difference, if a concurrent kernel thread invokes
__device_links_no_driver, warning calltrace will occurs.
I wonder if there is any way to solve this warning call trace?
Thanks
Jiaxing
^ permalink raw reply [flat|nested] only message in thread
only message in thread, other threads:[~2021-09-01 8:02 UTC | newest]
Thread overview: (only message) (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-09-01 8:01 Some concurrent actions cause __device_links_no_driver to report the warning calltrace luojiaxing
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.