* Some concurrent actions cause __device_links_no_driver to report the warning calltrace.
@ 2021-09-01  8:01 luojiaxing
From: luojiaxing @ 2021-09-01  8:01 UTC (permalink / raw)
  To: rafael.j.wysocki, linux-pm; +Cc: linux-kernel

Hi Rafael,


I found an issue with device links and would like to ask for your help.


During recent kernel testing, we found that some concurrent actions
trigger a warning call trace, as shown below:

<4>[  606.102307] WARNING: CPU: 0 PID: 7 at drivers/base/core.c:1339 
__device_links_no_driver+0x138/0x170
<4>[  606.284685] Call trace:
<4>[  606.287122]  __device_links_no_driver+0x138/0x170
<4>[  606.291804]  device_links_driver_cleanup+0xb0/0xfc
<4>[  606.296575]  __device_release_driver+0x148/0x1d8
<4>[  606.301173]  device_release_driver+0x38/0x50
<4>[  606.305423]  bus_remove_device+0x130/0x140
<4>[  606.309502]  device_del+0x174/0x430
<4>[  606.312975]  __scsi_remove_device+0x114/0x14c
<4>[  606.317313]  scsi_remove_target+0x1bc/0x240
<4>[  606.321469]  sas_rphy_remove+0x90/0x94
<4>[  606.325202]  sas_rphy_delete+0x44/0x5c
<4>[  606.328935]  sas_destruct_devices+0x64/0xa0 [libsas]
<4>[  606.333883]  sas_revalidate_domain+0xf8/0x1d0 [libsas]
<4>[  606.339002]  process_one_work+0x1dc/0x48c
<4>[  606.342994]  worker_thread+0x15c/0x464
<4>[  606.346726]  kthread+0x168/0x16c
<4>[  606.349940]  ret_from_fork+0x10/0x18
<4>[  606.353502] ---[ end trace cceb4f5db8bdcd25 ]---


The test is to rmmod the device driver and hard-reset a hard disk at the
same time.


device_links_unbind_consumers() is called when the device driver is
removed with rmmod, and it releases all consumers of the device one after
another.

Since ours is a storage controller driver, the topology looks like this:

supplier: storage controller

consumer: sda->sdb->sdc...

device_links_unbind_consumers() releases the consumer devices serially,
so a concurrent hard reset of a hard disk can run while that walk is still
in progress. The hard reset goes through the following call chain:

scsi_remove_target->device_del->bus_remove_device->device_release_driver->__device_links_no_driver().

So the hard reset path also calls __device_links_no_driver().

Assume that device_links_unbind_consumers() is releasing sda while sdb is
still waiting its turn, and scsi_remove_target() calls
__device_links_no_driver() to release sdb ahead of that. A warning call
trace is then generated.

Further analysis shows that at this point sdb's link->status is still
DL_STATE_ACTIVE (only sda's link->status has been changed to
DL_STATE_SUPPLIER_UNBIND by device_links_unbind_consumers()).

The condition of the following if() is therefore false, so the continue is
not taken and execution falls through.

if (link->status != DL_STATE_CONSUMER_PROBE &&
    link->status != DL_STATE_ACTIVE)
        continue;

Since link->supplier->links.status has already been set to
DL_DEV_UNBINDING, the next block takes the else branch.

if (link->supplier->links.status == DL_DEV_DRIVER_BOUND) {
        WRITE_ONCE(link->status, DL_STATE_AVAILABLE);
} else {
        WARN_ON(!(link->flags & DL_FLAG_SYNC_STATE_ONLY));
        WRITE_ONCE(link->status, DL_STATE_DORMANT);
}


Because link->flags is set to DL_FLAG_MANAGED, and so does not contain
DL_FLAG_SYNC_STATE_ONLY, the WARN_ON() fires and produces the call trace.
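To make the path concrete, here is a minimal user-space sketch of the two
checks quoted above. It is not kernel code: struct model_link and
model_no_driver() are hypothetical names, the enum and flag values are
arbitrary, and the function simply returns true at the point where the
kernel would hit the WARN_ON().

```c
#include <stdbool.h>

/* Names mirror the kernel's; the numeric values are arbitrary in this model. */
enum dl_state {
    DL_STATE_DORMANT,
    DL_STATE_AVAILABLE,
    DL_STATE_CONSUMER_PROBE,
    DL_STATE_ACTIVE,
    DL_STATE_SUPPLIER_UNBIND
};

enum dl_dev_state {
    DL_DEV_NO_DRIVER,
    DL_DEV_PROBING,
    DL_DEV_DRIVER_BOUND,
    DL_DEV_UNBINDING
};

#define DL_FLAG_MANAGED         (1u << 0)
#define DL_FLAG_SYNC_STATE_ONLY (1u << 1)

/* Hypothetical, flattened stand-in for one device link. */
struct model_link {
    enum dl_state status;              /* models link->status */
    enum dl_dev_state supplier_status; /* models link->supplier->links.status */
    unsigned int flags;                /* models link->flags */
};

/*
 * Models the quoted part of __device_links_no_driver() for one link.
 * Returns true where the kernel code would trigger WARN_ON().
 */
bool model_no_driver(struct model_link *link)
{
    /* Links that are neither probing nor active are skipped ("continue"). */
    if (link->status != DL_STATE_CONSUMER_PROBE &&
        link->status != DL_STATE_ACTIVE)
        return false;

    if (link->supplier_status == DL_DEV_DRIVER_BOUND) {
        link->status = DL_STATE_AVAILABLE;
        return false;
    }

    /* Supplier is not bound: warn unless the link is sync-state-only. */
    bool warned = !(link->flags & DL_FLAG_SYNC_STATE_ONLY);
    link->status = DL_STATE_DORMANT;
    return warned;
}
```

With status = DL_STATE_ACTIVE, supplier_status = DL_DEV_UNBINDING and
flags = DL_FLAG_MANAGED (the sdb case), the function returns true. With
status = DL_STATE_SUPPLIER_UNBIND (the sda case) it returns false at the
first check.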


In conclusion, the call trace is generated because
link->supplier->links.status and link->status are not updated together.

After link->supplier->links.status is changed to DL_DEV_UNBINDING, the
link->status of each consumer is changed to DL_STATE_SUPPLIER_UNBIND only
later, one consumer at a time.

If a concurrent kernel thread invokes __device_links_no_driver() within
that window for a consumer whose link has not been updated yet, the
warning call trace occurs.
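The window can be replayed deterministically in user space. The sketch
below is only a model of the ordering described in this mail, not of the
real locking: struct supplier_model and the three functions are
hypothetical names, the three consumers stand for sda, sdb and sdc, and
every link is assumed to be DL_FLAG_MANAGED without
DL_FLAG_SYNC_STATE_ONLY.

```c
#include <stdbool.h>
#include <stddef.h>

#define N_CONSUMERS 3 /* sda, sdb, sdc in this model */

enum link_state {
    LINK_ACTIVE,
    LINK_SUPPLIER_UNBIND,
    LINK_AVAILABLE,
    LINK_DORMANT
};

/* Hypothetical model of the storage controller and its consumer links. */
struct supplier_model {
    bool unbinding;                    /* supplier status is DL_DEV_UNBINDING */
    enum link_state link[N_CONSUMERS]; /* link->status per consumer */
    size_t next;                       /* next consumer the serial walk handles */
};

/* rmmod path, first step: the supplier is marked unbinding, links untouched. */
void supplier_begin_unbind(struct supplier_model *s)
{
    s->unbinding = true;
    s->next = 0;
}

/* rmmod path, one iteration of the serial walk over the consumers. */
bool supplier_unbind_next(struct supplier_model *s)
{
    if (s->next >= N_CONSUMERS)
        return false;
    s->link[s->next++] = LINK_SUPPLIER_UNBIND;
    return true;
}

/*
 * Hard reset path removing consumer i concurrently.
 * Returns true where the kernel would hit the WARN_ON().
 */
bool consumer_removed(struct supplier_model *s, size_t i)
{
    if (s->link[i] != LINK_ACTIVE)
        return false; /* link already handled by the walk: skipped */

    if (!s->unbinding) {
        s->link[i] = LINK_AVAILABLE;
        return false;
    }

    s->link[i] = LINK_DORMANT;
    return true; /* supplier unbinding, link still active: warning */
}
```

Calling supplier_begin_unbind(), then supplier_unbind_next() once (sda),
then consumer_removed() for index 1 (sdb) reproduces the reported case and
returns true. The same call for index 0 (sda) returns false, because its
link has already been moved to LINK_SUPPLIER_UNBIND.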


Is there any way to resolve this warning?


Thanks

Jiaxing



