linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] coresight: Defer probe when the child dev is not probed
@ 2022-02-28 13:31 Mao Jinlong
  2022-02-28 14:51 ` Suzuki K Poulose
  0 siblings, 1 reply; 7+ messages in thread
From: Mao Jinlong @ 2022-02-28 13:31 UTC (permalink / raw)
  To: Mathieu Poirier, Suzuki K Poulose, Mike Leach, Leo Yan,
	Alexander Shishkin
  Cc: Mao Jinlong, coresight, linux-arm-kernel, linux-kernel,
	linux-arm-msm, Tingwei Zhang, Jinlong Mao, Yuanfang Zhang,
	Tao Zhang, Hao Zhang

From: Mao Jinlong <jinlmao@qti.qualcomm.com>

It is possible that when device probe, its child device is not
probed. Then it will fail when add sysfs connection for the device.
Make device defer probe when the child device is not probed.

Signed-off-by: Mao Jinlong <jinlmao@qti.qualcomm.com>
---
 drivers/hwtracing/coresight/coresight-sysfs.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/hwtracing/coresight/coresight-sysfs.c b/drivers/hwtracing/coresight/coresight-sysfs.c
index 34d2a2d31d00..7df9eb59bf2c 100644
--- a/drivers/hwtracing/coresight/coresight-sysfs.c
+++ b/drivers/hwtracing/coresight/coresight-sysfs.c
@@ -73,8 +73,10 @@ int coresight_add_sysfs_link(struct coresight_sysfs_link *info)
 	if (!info->orig || !info->target ||
 	    !info->orig_name || !info->target_name)
 		return -EINVAL;
-	if (!info->orig->has_conns_grp || !info->target->has_conns_grp)
+	if (!info->orig->has_conns_grp)
 		return -EINVAL;
+	if (!info->target->has_conns_grp)
+		return -EPROBE_DEFER;
 
 	/* first link orig->target */
 	ret = sysfs_add_link_to_group(&info->orig->dev.kobj,
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH] coresight: Defer probe when the child dev is not probed
  2022-02-28 13:31 [PATCH] coresight: Defer probe when the child dev is not probed Mao Jinlong
@ 2022-02-28 14:51 ` Suzuki K Poulose
       [not found]   ` <a63abe13-793e-323f-e214-1dd9826c8a9a@quicinc.com>
  0 siblings, 1 reply; 7+ messages in thread
From: Suzuki K Poulose @ 2022-02-28 14:51 UTC (permalink / raw)
  To: Mao Jinlong, Mathieu Poirier, Mike Leach, Leo Yan, Alexander Shishkin
  Cc: Mao Jinlong, coresight, linux-arm-kernel, linux-kernel,
	linux-arm-msm, Tingwei Zhang, Yuanfang Zhang, Tao Zhang,
	Hao Zhang

Hi Jinlong

On 28/02/2022 13:31, Mao Jinlong wrote:
> From: Mao Jinlong <jinlmao@qti.qualcomm.com>
> 
> It is possible that when device probe, its child device is not
> probed. Then it will fail when add sysfs connection for the device.
> Make device defer probe when the child device is not probed.

Please could you a bit a more specific on the exact issue ?
I don't see a problem with probe deferral right now, with
coresight/next.

For e.g.,

root@juno-server:~# lsmod
Module                  Size  Used by
coresight              73728  0
root@juno-server:~# ls /sys/bus/coresight/devices/
root@juno-server:~# modprobe coresight-etm4x
root@juno-server:~# lsmod
Module                  Size  Used by
coresight_etm4x        81920  0
coresight              73728  1 coresight_etm4x
root@juno-server:~# ls /sys/bus/coresight/devices/
etm0  etm1

-- Note etm2-etm5 doesn't appear --

root@juno-server:~# modprobe coresight-funnel
root@juno-server:~# lsmod
Module                  Size  Used by
coresight_funnel       20480  0
coresight_etm4x        81920  0
coresight              73728  2 coresight_etm4x,coresight_funnel
root@juno-server:~# ls /sys/bus/coresight/devices/
etm0  etm1

-- Still don't appear ---

root@juno-server:~# modprobe coresight-replicator
root@juno-server:~# ls /sys/bus/coresight/devices/
etm0  etm1
root@juno-server:~# modprobe coresight-tmc

-- At this stage, the devices automatically get probed and appear --
root@juno-server:~# ls /sys/bus/coresight/devices/
etm0  etm1  etm2  etm3  etm4  etm5  funnel0  funnel1  funnel2  tmc_etf0 
  tmc_etr0


root@juno-server:~# lsmod
Module                  Size  Used by
coresight_tmc          40960  0
coresight_replicator    20480  0
coresight_funnel       20480  0
coresight_etm4x        81920  0
coresight              73728  4 
coresight_tmc,coresight_etm4x,coresight_replicator,coresight_funnel

So, my question is, what is this patch trying to solve ?


Cheers
Suzuki

> 
> Signed-off-by: Mao Jinlong <jinlmao@qti.qualcomm.com>
> ---
>   drivers/hwtracing/coresight/coresight-sysfs.c | 4 +++-
>   1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/hwtracing/coresight/coresight-sysfs.c b/drivers/hwtracing/coresight/coresight-sysfs.c
> index 34d2a2d31d00..7df9eb59bf2c 100644
> --- a/drivers/hwtracing/coresight/coresight-sysfs.c
> +++ b/drivers/hwtracing/coresight/coresight-sysfs.c
> @@ -73,8 +73,10 @@ int coresight_add_sysfs_link(struct coresight_sysfs_link *info)
>   	if (!info->orig || !info->target ||
>   	    !info->orig_name || !info->target_name)
>   		return -EINVAL;
> -	if (!info->orig->has_conns_grp || !info->target->has_conns_grp)
> +	if (!info->orig->has_conns_grp)
>   		return -EINVAL;
> +	if (!info->target->has_conns_grp)
> +		return -EPROBE_DEFER;
>   
>   	/* first link orig->target */
>   	ret = sysfs_add_link_to_group(&info->orig->dev.kobj,


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] coresight: Defer probe when the child dev is not probed
       [not found]   ` <a63abe13-793e-323f-e214-1dd9826c8a9a@quicinc.com>
@ 2022-03-01 13:15     ` Mike Leach
  2022-03-01 13:30       ` Jinlong Mao
  0 siblings, 1 reply; 7+ messages in thread
From: Mike Leach @ 2022-03-01 13:15 UTC (permalink / raw)
  To: Jinlong Mao
  Cc: Suzuki K Poulose, Mathieu Poirier, Leo Yan, Alexander Shishkin,
	Mao Jinlong, coresight, linux-arm-kernel, linux-kernel,
	linux-arm-msm, Tingwei Zhang, Yuanfang Zhang, Tao Zhang,
	Hao Zhang

Hi,

On Tue, 1 Mar 2022 at 11:42, Jinlong Mao <quic_jinlmao@quicinc.com> wrote:
>
> On 2/28/2022 10:51 PM, Suzuki K Poulose wrote:
>
> Hi Jinlong
>
> On 28/02/2022 13:31, Mao Jinlong wrote:
>
> From: Mao Jinlong <jinlmao@qti.qualcomm.com>
>
> It is possible that when device probe, its child device is not
> probed. Then it will fail when add sysfs connection for the device.
> Make device defer probe when the child device is not probed.
>
>
> Please could you a bit a more specific on the exact issue ?
> I don't see a problem with probe deferral right now, with
> coresight/next.
>
> For e.g.,
>
> root@juno-server:~# lsmod
> Module                  Size  Used by
> coresight              73728  0
> root@juno-server:~# ls /sys/bus/coresight/devices/
> root@juno-server:~# modprobe coresight-etm4x
> root@juno-server:~# lsmod
> Module                  Size  Used by
> coresight_etm4x        81920  0
> coresight              73728  1 coresight_etm4x
> root@juno-server:~# ls /sys/bus/coresight/devices/
> etm0  etm1
>
> -- Note etm2-etm5 doesn't appear --
>
> root@juno-server:~# modprobe coresight-funnel
> root@juno-server:~# lsmod
> Module                  Size  Used by
> coresight_funnel       20480  0
> coresight_etm4x        81920  0
> coresight              73728  2 coresight_etm4x,coresight_funnel
> root@juno-server:~# ls /sys/bus/coresight/devices/
> etm0  etm1
>
> -- Still don't appear ---
>
> root@juno-server:~# modprobe coresight-replicator
> root@juno-server:~# ls /sys/bus/coresight/devices/
> etm0  etm1
> root@juno-server:~# modprobe coresight-tmc
>
> -- At this stage, the devices automatically get probed and appear --
> root@juno-server:~# ls /sys/bus/coresight/devices/
> etm0  etm1  etm2  etm3  etm4  etm5  funnel0  funnel1  funnel2  tmc_etf0  tmc_etr0
>
>
> root@juno-server:~# lsmod
> Module                  Size  Used by
> coresight_tmc          40960  0
> coresight_replicator    20480  0
> coresight_funnel       20480  0
> coresight_etm4x        81920  0
> coresight              73728  4 coresight_tmc,coresight_etm4x,coresight_replicator,coresight_funnel
>
> So, my question is, what is this patch trying to solve ?
>
>
> Cheers
> Suzuki
>
> Hi Suzuki,
>
> This issue happens when race condition happens.
> The condition is that the device and its child_device's probe happens at the same time.
>
> For example: device0 and its child device device1.
> Both of them are calling coresight_register function. device0 is calling coresight_fixup_device_conns.
> device1 is waiting for device0 to release the coresight_mutex. Because device1's csdev node is allocated,
> coresight_make_links will be called for device0. Then in coresight_add_sysfs_link, has_conns_grp is true
> for device0, but has_conns_grp is false for device1 as has_conns_grp is set to true in coresight_create_conns_sysfs_group .
> The probe of device0 will fail for at this condition.
>
>
> struct coresight_device *coresight_register(struct coresight_desc *desc)
> {
>    .........
>     mutex_lock(&coresight_mutex);
>
>     ret = coresight_create_conns_sysfs_group(csdev);
>     if (!ret)
>         ret = coresight_fixup_device_conns(csdev);
>     if (!ret)
>         ret = coresight_fixup_orphan_conns(csdev);
>     if (!ret && cti_assoc_ops && cti_assoc_ops->add)
>         cti_assoc_ops->add(csdev);
>
>     mutex_unlock(&coresight_mutex);
>
> .........
>
> }
>
> static int coresight_fixup_device_conns(struct coresight_device *csdev)
> {
>    ..........
>         conn->child_dev =
>             coresight_find_csdev_by_fwnode(conn->child_fwnode);

The issue appears to be a constraint hidden in the lower layers of the code.
Would a better solution not be to alter the code here:

if (conn->child_dev && conn->child_dev->has_conns_grp) {
   ...
} else {
      csdev->orphan = true;
}

which would mean that the connection attempt would drop through to
label the connection as an orphan, to be cleaned up by the child
itself when it runs coresight_fixup_orphan_conns()

Regards

Mike

>         if (conn->child_dev) {
>             ret = coresight_make_links(csdev, conn,
>
>                            conn->child_dev);
>
> ..........
>
> }
>
>
> int coresight_add_sysfs_link(struct coresight_sysfs_link *info)
> {
>    ................
>     if (!info->orig->has_conns_grp || !info->target->has_conns_grp)
>         return -EINVAL;
>
>
>
> The probe fail issue is reproduced with reboot stress test on our internal device.
>
> With the patch, the probe fail issue is not reproduced.
>
> Thanks
>
> Jinlong Mao
>
>
>
> Signed-off-by: Mao Jinlong <jinlmao@qti.qualcomm.com>
> ---
>   drivers/hwtracing/coresight/coresight-sysfs.c | 4 +++-
>   1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/hwtracing/coresight/coresight-sysfs.c b/drivers/hwtracing/coresight/coresight-sysfs.c
> index 34d2a2d31d00..7df9eb59bf2c 100644
> --- a/drivers/hwtracing/coresight/coresight-sysfs.c
> +++ b/drivers/hwtracing/coresight/coresight-sysfs.c
> @@ -73,8 +73,10 @@ int coresight_add_sysfs_link(struct coresight_sysfs_link *info)
>       if (!info->orig || !info->target ||
>           !info->orig_name || !info->target_name)
>           return -EINVAL;
> -    if (!info->orig->has_conns_grp || !info->target->has_conns_grp)
> +    if (!info->orig->has_conns_grp)
>           return -EINVAL;
> +    if (!info->target->has_conns_grp)
> +        return -EPROBE_DEFER;
>         /* first link orig->target */
>       ret = sysfs_add_link_to_group(&info->orig->dev.kobj,
>
>


-- 
Mike Leach
Principal Engineer, ARM Ltd.
Manchester Design Centre. UK

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] coresight: Defer probe when the child dev is not probed
  2022-03-01 13:15     ` Mike Leach
@ 2022-03-01 13:30       ` Jinlong Mao
  2022-03-01 15:03         ` Suzuki K Poulose
  0 siblings, 1 reply; 7+ messages in thread
From: Jinlong Mao @ 2022-03-01 13:30 UTC (permalink / raw)
  To: Mike Leach
  Cc: Suzuki K Poulose, Mathieu Poirier, Leo Yan, Alexander Shishkin,
	Mao Jinlong, coresight, linux-arm-kernel, linux-kernel,
	linux-arm-msm, Tingwei Zhang, Yuanfang Zhang, Tao Zhang,
	Hao Zhang

Hi Mike,

On 3/1/2022 9:15 PM, Mike Leach wrote:
> Hi,
>
> On Tue, 1 Mar 2022 at 11:42, Jinlong Mao <quic_jinlmao@quicinc.com> wrote:
>> On 2/28/2022 10:51 PM, Suzuki K Poulose wrote:
>>
>> Hi Jinlong
>>
>> On 28/02/2022 13:31, Mao Jinlong wrote:
>>
>> From: Mao Jinlong <jinlmao@qti.qualcomm.com>
>>
>> It is possible that when device probe, its child device is not
>> probed. Then it will fail when add sysfs connection for the device.
>> Make device defer probe when the child device is not probed.
>>
>>
>> Please could you a bit a more specific on the exact issue ?
>> I don't see a problem with probe deferral right now, with
>> coresight/next.
>>
>> For e.g.,
>>
>> root@juno-server:~# lsmod
>> Module                  Size  Used by
>> coresight              73728  0
>> root@juno-server:~# ls /sys/bus/coresight/devices/
>> root@juno-server:~# modprobe coresight-etm4x
>> root@juno-server:~# lsmod
>> Module                  Size  Used by
>> coresight_etm4x        81920  0
>> coresight              73728  1 coresight_etm4x
>> root@juno-server:~# ls /sys/bus/coresight/devices/
>> etm0  etm1
>>
>> -- Note etm2-etm5 doesn't appear --
>>
>> root@juno-server:~# modprobe coresight-funnel
>> root@juno-server:~# lsmod
>> Module                  Size  Used by
>> coresight_funnel       20480  0
>> coresight_etm4x        81920  0
>> coresight              73728  2 coresight_etm4x,coresight_funnel
>> root@juno-server:~# ls /sys/bus/coresight/devices/
>> etm0  etm1
>>
>> -- Still don't appear ---
>>
>> root@juno-server:~# modprobe coresight-replicator
>> root@juno-server:~# ls /sys/bus/coresight/devices/
>> etm0  etm1
>> root@juno-server:~# modprobe coresight-tmc
>>
>> -- At this stage, the devices automatically get probed and appear --
>> root@juno-server:~# ls /sys/bus/coresight/devices/
>> etm0  etm1  etm2  etm3  etm4  etm5  funnel0  funnel1  funnel2  tmc_etf0  tmc_etr0
>>
>>
>> root@juno-server:~# lsmod
>> Module                  Size  Used by
>> coresight_tmc          40960  0
>> coresight_replicator    20480  0
>> coresight_funnel       20480  0
>> coresight_etm4x        81920  0
>> coresight              73728  4 coresight_tmc,coresight_etm4x,coresight_replicator,coresight_funnel
>>
>> So, my question is, what is this patch trying to solve ?
>>
>>
>> Cheers
>> Suzuki
>>
>> Hi Suzuki,
>>
>> This issue happens when race condition happens.
>> The condition is that the device and its child_device's probe happens at the same time.
>>
>> For example: device0 and its child device device1.
>> Both of them are calling coresight_register function. device0 is calling coresight_fixup_device_conns.
>> device1 is waiting for device0 to release the coresight_mutex. Because device1's csdev node is allocated,
>> coresight_make_links will be called for device0. Then in coresight_add_sysfs_link, has_conns_grp is true
>> for device0, but has_conns_grp is false for device1 as has_conns_grp is set to true in coresight_create_conns_sysfs_group .
>> The probe of device0 will fail for at this condition.
>>
>>
>> struct coresight_device *coresight_register(struct coresight_desc *desc)
>> {
>>     .........
>>      mutex_lock(&coresight_mutex);
>>
>>      ret = coresight_create_conns_sysfs_group(csdev);
>>      if (!ret)
>>          ret = coresight_fixup_device_conns(csdev);
>>      if (!ret)
>>          ret = coresight_fixup_orphan_conns(csdev);
>>      if (!ret && cti_assoc_ops && cti_assoc_ops->add)
>>          cti_assoc_ops->add(csdev);
>>
>>      mutex_unlock(&coresight_mutex);
>>
>> .........
>>
>> }
>>
>> static int coresight_fixup_device_conns(struct coresight_device *csdev)
>> {
>>     ..........
>>          conn->child_dev =
>>              coresight_find_csdev_by_fwnode(conn->child_fwnode);
> The issue appears to be a constraint hidden in the lower layers of the code.
> Would a better solution not be to alter the code here:
>
> if (conn->child_dev && conn->child_dev->has_conns_grp) {
>     ...
> } else {
>        csdev->orphan = true;
> }
>
> which would mean that the connection attempt would drop through to
> label the connection as an orphan, to be cleaned up by the child
> itself when it runs coresight_fixup_orphan_conns()
>
> Regards
>
> Mike
Thanks Mike.

Your recommended fix looks much better than my fix. Let me try with it 
and get back to you.

Thanks

Jinlong  Mao


>
>>          if (conn->child_dev) {
>>              ret = coresight_make_links(csdev, conn,
>>
>>                             conn->child_dev);
>>
>> ..........
>>
>> }
>>
>>
>> int coresight_add_sysfs_link(struct coresight_sysfs_link *info)
>> {
>>     ................
>>      if (!info->orig->has_conns_grp || !info->target->has_conns_grp)
>>          return -EINVAL;
>>
>>
>>
>> The probe fail issue is reproduced with reboot stress test on our internal device.
>>
>> With the patch, the probe fail issue is not reproduced.
>>
>> Thanks
>>
>> Jinlong Mao
>>
>>
>>
>> Signed-off-by: Mao Jinlong <jinlmao@qti.qualcomm.com>
>> ---
>>    drivers/hwtracing/coresight/coresight-sysfs.c | 4 +++-
>>    1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/hwtracing/coresight/coresight-sysfs.c b/drivers/hwtracing/coresight/coresight-sysfs.c
>> index 34d2a2d31d00..7df9eb59bf2c 100644
>> --- a/drivers/hwtracing/coresight/coresight-sysfs.c
>> +++ b/drivers/hwtracing/coresight/coresight-sysfs.c
>> @@ -73,8 +73,10 @@ int coresight_add_sysfs_link(struct coresight_sysfs_link *info)
>>        if (!info->orig || !info->target ||
>>            !info->orig_name || !info->target_name)
>>            return -EINVAL;
>> -    if (!info->orig->has_conns_grp || !info->target->has_conns_grp)
>> +    if (!info->orig->has_conns_grp)
>>            return -EINVAL;
>> +    if (!info->target->has_conns_grp)
>> +        return -EPROBE_DEFER;
>>          /* first link orig->target */
>>        ret = sysfs_add_link_to_group(&info->orig->dev.kobj,
>>
>>
>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] coresight: Defer probe when the child dev is not probed
  2022-03-01 13:30       ` Jinlong Mao
@ 2022-03-01 15:03         ` Suzuki K Poulose
  2022-03-02  2:29           ` Jinlong Mao
  0 siblings, 1 reply; 7+ messages in thread
From: Suzuki K Poulose @ 2022-03-01 15:03 UTC (permalink / raw)
  To: Jinlong Mao, Mike Leach
  Cc: Mathieu Poirier, Leo Yan, Alexander Shishkin, Mao Jinlong,
	coresight, linux-arm-kernel, linux-kernel, linux-arm-msm,
	Tingwei Zhang, Yuanfang Zhang, Tao Zhang, Hao Zhang

Hi

On 01/03/2022 13:30, Jinlong Mao wrote:
> Hi Mike,
> 
> On 3/1/2022 9:15 PM, Mike Leach wrote:
>> Hi,
>>
>> On Tue, 1 Mar 2022 at 11:42, Jinlong Mao <quic_jinlmao@quicinc.com> 
>> wrote:
>>> On 2/28/2022 10:51 PM, Suzuki K Poulose wrote:
>>>

...

>>>
>>> Hi Suzuki,
>>>
>>> This issue happens when race condition happens.
>>> The condition is that the device and its child_device's probe happens 
>>> at the same time.
>>>
>>> For example: device0 and its child device device1.
>>> Both of them are calling coresight_register function. device0 is 
>>> calling coresight_fixup_device_conns.
>>> device1 is waiting for device0 to release the coresight_mutex. 
>>> Because device1's csdev node is allocated,
>>> coresight_make_links will be called for device0. Then in 
>>> coresight_add_sysfs_link, has_conns_grp is true
>>> for device0, but has_conns_grp is false for device1 as has_conns_grp 
>>> is set to true in coresight_create_conns_sysfs_group .
>>> The probe of device0 will fail for at this condition.
>>>
>>>
>>> struct coresight_device *coresight_register(struct coresight_desc *desc)
>>> {
>>>     .........
>>>      mutex_lock(&coresight_mutex);
>>>
>>>      ret = coresight_create_conns_sysfs_group(csdev);
>>>      if (!ret)
>>>          ret = coresight_fixup_device_conns(csdev);
>>>      if (!ret)
>>>          ret = coresight_fixup_orphan_conns(csdev);
>>>      if (!ret && cti_assoc_ops && cti_assoc_ops->add)
>>>          cti_assoc_ops->add(csdev);
>>>
>>>      mutex_unlock(&coresight_mutex);
>>>
>>> .........
>>>
>>> }
>>>
>>> static int coresight_fixup_device_conns(struct coresight_device *csdev)
>>> {
>>>     ..........
>>>          conn->child_dev =
>>>              coresight_find_csdev_by_fwnode(conn->child_fwnode);
>> The issue appears to be a constraint hidden in the lower layers of the 
>> code.
>> Would a better solution not be to alter the code here:
>>
>> if (conn->child_dev && conn->child_dev->has_conns_grp) {
>>     ...
>> } else {
>>        csdev->orphan = true;
>> }
>>
>> which would mean that the connection attempt would drop through to
>> label the connection as an orphan, to be cleaned up by the child
>> itself when it runs coresight_fixup_orphan_conns()
>>

Tnanks Mike, I think that is a good solution. Alternatively, we
could make sure that device_register() and the fixup following
that are atomic.

i.e.

	mutex_lock()

	device_register()
	fixup_connections()
	create_sysfs()

	mutex_unlock();

The fix may be a bit invasive than Mike's proposal, but it makes
sure we don't end up with half baked device on the coresight-bus.

Suzuki

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] coresight: Defer probe when the child dev is not probed
  2022-03-01 15:03         ` Suzuki K Poulose
@ 2022-03-02  2:29           ` Jinlong Mao
  2022-03-02  8:05             ` Jinlong Mao
  0 siblings, 1 reply; 7+ messages in thread
From: Jinlong Mao @ 2022-03-02  2:29 UTC (permalink / raw)
  To: Suzuki K Poulose, Mike Leach
  Cc: Mathieu Poirier, Leo Yan, Alexander Shishkin, Mao Jinlong,
	coresight, linux-arm-kernel, linux-kernel, linux-arm-msm,
	Tingwei Zhang, Yuanfang Zhang, Tao Zhang, Hao Zhang


On 3/1/2022 11:03 PM, Suzuki K Poulose wrote:
> Hi
>
> On 01/03/2022 13:30, Jinlong Mao wrote:
>> Hi Mike,
>>
>> On 3/1/2022 9:15 PM, Mike Leach wrote:
>>> Hi,
>>>
>>> On Tue, 1 Mar 2022 at 11:42, Jinlong Mao <quic_jinlmao@quicinc.com> 
>>> wrote:
>>>> On 2/28/2022 10:51 PM, Suzuki K Poulose wrote:
>>>>
>
> ...
>
>>>>
>>>> Hi Suzuki,
>>>>
>>>> This issue happens when race condition happens.
>>>> The condition is that the device and its child_device's probe 
>>>> happens at the same time.
>>>>
>>>> For example: device0 and its child device device1.
>>>> Both of them are calling coresight_register function. device0 is 
>>>> calling coresight_fixup_device_conns.
>>>> device1 is waiting for device0 to release the coresight_mutex. 
>>>> Because device1's csdev node is allocated,
>>>> coresight_make_links will be called for device0. Then in 
>>>> coresight_add_sysfs_link, has_conns_grp is true
>>>> for device0, but has_conns_grp is false for device1 as 
>>>> has_conns_grp is set to true in coresight_create_conns_sysfs_group .
>>>> The probe of device0 will fail for at this condition.
>>>>
>>>>
>>>> struct coresight_device *coresight_register(struct coresight_desc 
>>>> *desc)
>>>> {
>>>>     .........
>>>>      mutex_lock(&coresight_mutex);
>>>>
>>>>      ret = coresight_create_conns_sysfs_group(csdev);
>>>>      if (!ret)
>>>>          ret = coresight_fixup_device_conns(csdev);
>>>>      if (!ret)
>>>>          ret = coresight_fixup_orphan_conns(csdev);
>>>>      if (!ret && cti_assoc_ops && cti_assoc_ops->add)
>>>>          cti_assoc_ops->add(csdev);
>>>>
>>>>      mutex_unlock(&coresight_mutex);
>>>>
>>>> .........
>>>>
>>>> }
>>>>
>>>> static int coresight_fixup_device_conns(struct coresight_device 
>>>> *csdev)
>>>> {
>>>>     ..........
>>>>          conn->child_dev =
>>>> coresight_find_csdev_by_fwnode(conn->child_fwnode);
>>> The issue appears to be a constraint hidden in the lower layers of 
>>> the code.
>>> Would a better solution not be to alter the code here:
>>>
>>> if (conn->child_dev && conn->child_dev->has_conns_grp) {
>>>     ...
>>> } else {
>>>        csdev->orphan = true;
>>> }
>>>
>>> which would mean that the connection attempt would drop through to
>>> label the connection as an orphan, to be cleaned up by the child
>>> itself when it runs coresight_fixup_orphan_conns()
>>>
>
> Tnanks Mike, I think that is a good solution. Alternatively, we
> could make sure that device_register() and the fixup following
> that are atomic.
>
> i.e.
>
>     mutex_lock()
>
>     device_register()
>     fixup_connections()
>     create_sysfs()
>
>     mutex_unlock();
>
> The fix may be a bit invasive than Mike's proposal, but it makes
> sure we don't end up with half baked device on the coresight-bus.
>
> Suzuki

Thanks Mike & Suzuki.

I will combine your proposals and make the changes.

I will get back to you after the test.


Thanks
Jinlong Mao


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] coresight: Defer probe when the child dev is not probed
  2022-03-02  2:29           ` Jinlong Mao
@ 2022-03-02  8:05             ` Jinlong Mao
  0 siblings, 0 replies; 7+ messages in thread
From: Jinlong Mao @ 2022-03-02  8:05 UTC (permalink / raw)
  To: Suzuki K Poulose, Mike Leach
  Cc: Mathieu Poirier, Leo Yan, Alexander Shishkin, Mao Jinlong,
	coresight, linux-arm-kernel, linux-kernel, linux-arm-msm,
	Tingwei Zhang, Yuanfang Zhang, Tao Zhang, Hao Zhang


On 3/2/2022 10:29 AM, Jinlong Mao wrote:
>
> On 3/1/2022 11:03 PM, Suzuki K Poulose wrote:
>> Hi
>>
>> On 01/03/2022 13:30, Jinlong Mao wrote:
>>> Hi Mike,
>>>
>>> On 3/1/2022 9:15 PM, Mike Leach wrote:
>>>> Hi,
>>>>
>>>> On Tue, 1 Mar 2022 at 11:42, Jinlong Mao <quic_jinlmao@quicinc.com> 
>>>> wrote:
>>>>> On 2/28/2022 10:51 PM, Suzuki K Poulose wrote:
>>>>>
>>
>> ...
>>
>>>>>
>>>>> Hi Suzuki,
>>>>>
>>>>> This issue happens when race condition happens.
>>>>> The condition is that the device and its child_device's probe 
>>>>> happens at the same time.
>>>>>
>>>>> For example: device0 and its child device device1.
>>>>> Both of them are calling coresight_register function. device0 is 
>>>>> calling coresight_fixup_device_conns.
>>>>> device1 is waiting for device0 to release the coresight_mutex. 
>>>>> Because device1's csdev node is allocated,
>>>>> coresight_make_links will be called for device0. Then in 
>>>>> coresight_add_sysfs_link, has_conns_grp is true
>>>>> for device0, but has_conns_grp is false for device1 as 
>>>>> has_conns_grp is set to true in coresight_create_conns_sysfs_group .
>>>>> The probe of device0 will fail for at this condition.
>>>>>
>>>>>
>>>>> struct coresight_device *coresight_register(struct coresight_desc 
>>>>> *desc)
>>>>> {
>>>>>     .........
>>>>>      mutex_lock(&coresight_mutex);
>>>>>
>>>>>      ret = coresight_create_conns_sysfs_group(csdev);
>>>>>      if (!ret)
>>>>>          ret = coresight_fixup_device_conns(csdev);
>>>>>      if (!ret)
>>>>>          ret = coresight_fixup_orphan_conns(csdev);
>>>>>      if (!ret && cti_assoc_ops && cti_assoc_ops->add)
>>>>>          cti_assoc_ops->add(csdev);
>>>>>
>>>>>      mutex_unlock(&coresight_mutex);
>>>>>
>>>>> .........
>>>>>
>>>>> }
>>>>>
>>>>> static int coresight_fixup_device_conns(struct coresight_device 
>>>>> *csdev)
>>>>> {
>>>>>     ..........
>>>>>          conn->child_dev =
>>>>> coresight_find_csdev_by_fwnode(conn->child_fwnode);
>>>> The issue appears to be a constraint hidden in the lower layers of 
>>>> the code.
>>>> Would a better solution not be to alter the code here:
>>>>
>>>> if (conn->child_dev && conn->child_dev->has_conns_grp) {
>>>>     ...
>>>> } else {
>>>>        csdev->orphan = true;
>>>> }
>>>>
>>>> which would mean that the connection attempt would drop through to
>>>> label the connection as an orphan, to be cleaned up by the child
>>>> itself when it runs coresight_fixup_orphan_conns()
>>>>
>>
>> Tnanks Mike, I think that is a good solution. Alternatively, we
>> could make sure that device_register() and the fixup following
>> that are atomic.
>>
>> i.e.
>>
>>     mutex_lock()
>>
>>     device_register()
>>     fixup_connections()
>>     create_sysfs()
>>
>>     mutex_unlock();
>>
>> The fix may be a bit invasive than Mike's proposal, but it makes
>> sure we don't end up with half baked device on the coresight-bus.
>>
>> Suzuki
>
> Thanks Mike & Suzuki.
>
> I will combine your proposals and make the changes.
>
> I will get back to you after the test.

Hi Mike & Suzuki,

Issue is fixed with your proposal.
I submit a new patch "coresight: core: Fix coresight device probe 
failure issue".
Please help to review.

Thanks
Jinlong Mao
>
>
> Thanks
> Jinlong Mao
>

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2022-03-02  8:05 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-02-28 13:31 [PATCH] coresight: Defer probe when the child dev is not probed Mao Jinlong
2022-02-28 14:51 ` Suzuki K Poulose
     [not found]   ` <a63abe13-793e-323f-e214-1dd9826c8a9a@quicinc.com>
2022-03-01 13:15     ` Mike Leach
2022-03-01 13:30       ` Jinlong Mao
2022-03-01 15:03         ` Suzuki K Poulose
2022-03-02  2:29           ` Jinlong Mao
2022-03-02  8:05             ` Jinlong Mao

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).