linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/2] Fix some issues with TRBE building as a module
@ 2023-08-14  9:38 Junhao He
  2023-08-14  9:38 ` [PATCH 1/2] coresight: trbe: Fix TRBE potential sleep in atomic context Junhao He
                   ` (3 more replies)
  0 siblings, 4 replies; 25+ messages in thread
From: Junhao He @ 2023-08-14  9:38 UTC (permalink / raw)
  To: suzuki.poulose, mike.leach, leo.yan, anshuman.khandual, jonathan.cameron
  Cc: coresight, linux-arm-kernel, linux-kernel, linuxarm, yangyicong,
	prime.zeng, hejunhao3

The TRBE driver support is build as a module, we found some driver issues
based on the patchset [1] and set CONFIG_CORESIGHT_TRBE=m.
1. TRBE driver potential sleep in atomic context when unregister device
2. Multiple free the platform data resource when rmmod coresight TRBE
driver

[1] "coresight: trbe: Enable ACPI based devices"
https://lore.kernel.org/all/20230808082247.383405-1-anshuman.khandual@arm.com/

Junhao He (2):
  coresight: trbe: Fix TRBE potential sleep in atomic context
  coresight: core: Fix multiple free TRBE platform data resource

 drivers/hwtracing/coresight/coresight-core.c |  7 ++--
 drivers/hwtracing/coresight/coresight-trbe.c | 35 +++++++++++---------
 2 files changed, 24 insertions(+), 18 deletions(-)

-- 
2.33.0


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH 1/2] coresight: trbe: Fix TRBE potential sleep in atomic context
  2023-08-14  9:38 [PATCH 0/2] Fix some issues with TRBE building as a module Junhao He
@ 2023-08-14  9:38 ` Junhao He
  2023-08-14 10:34   ` Suzuki K Poulose
  2023-08-14  9:38 ` [PATCH 2/2] coresight: core: Fix multiple free TRBE platform data resource Junhao He
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 25+ messages in thread
From: Junhao He @ 2023-08-14  9:38 UTC (permalink / raw)
  To: suzuki.poulose, mike.leach, leo.yan, anshuman.khandual, jonathan.cameron
  Cc: coresight, linux-arm-kernel, linux-kernel, linuxarm, yangyicong,
	prime.zeng, hejunhao3

smp_call_function_single() will allocate an IPI interrupt vector to
the target processor and send a function call request to the interrupt
vector. After the target processor receives the IPI interrupt, it will
execute arm_trbe_remove_coresight_cpu() call request in the interrupt
handler.

According to the device_unregister() stack information, if other process
is useing the device, the down_write() may sleep, and trigger deadlocks
or unexpected errors.

  arm_trbe_remove_coresight_cpu
    coresight_unregister
      device_unregister
        device_del
          kobject_del
            __kobject_del
              sysfs_remove_dir
                kernfs_remove
                  down_write ---------> it may sleep

Add a helper arm_trbe_disable_cpu() to disable TRBE precpu irq and reset
per TRBE.
Simply call arm_trbe_remove_coresight_cpu() directly without useing the
smp_call_function_single(), which is the same as registering the TRBE
coresight device.

Fixes: 3fbf7f011f24 ("coresight: sink: Add TRBE driver")
Signed-off-by: Junhao He <hejunhao3@huawei.com>
---
 drivers/hwtracing/coresight/coresight-trbe.c | 35 +++++++++++---------
 1 file changed, 20 insertions(+), 15 deletions(-)

diff --git a/drivers/hwtracing/coresight/coresight-trbe.c b/drivers/hwtracing/coresight/coresight-trbe.c
index 7720619909d6..ce1e6f537b8d 100644
--- a/drivers/hwtracing/coresight/coresight-trbe.c
+++ b/drivers/hwtracing/coresight/coresight-trbe.c
@@ -1225,6 +1225,17 @@ static void arm_trbe_enable_cpu(void *info)
 	enable_percpu_irq(drvdata->irq, IRQ_TYPE_NONE);
 }
 
+static void arm_trbe_disable_cpu(void *info)
+{
+	struct trbe_drvdata *drvdata = info;
+	struct trbe_cpudata *cpudata = this_cpu_ptr(drvdata->cpudata);
+
+	disable_percpu_irq(drvdata->irq);
+	trbe_reset_local(cpudata);
+	cpudata->drvdata = NULL;
+}
+
+
 static void arm_trbe_register_coresight_cpu(struct trbe_drvdata *drvdata, int cpu)
 {
 	struct trbe_cpudata *cpudata = per_cpu_ptr(drvdata->cpudata, cpu);
@@ -1326,18 +1337,12 @@ static void arm_trbe_probe_cpu(void *info)
 	cpumask_clear_cpu(cpu, &drvdata->supported_cpus);
 }
 
-static void arm_trbe_remove_coresight_cpu(void *info)
+static void arm_trbe_remove_coresight_cpu(struct trbe_drvdata *drvdata, int cpu)
 {
-	int cpu = smp_processor_id();
-	struct trbe_drvdata *drvdata = info;
-	struct trbe_cpudata *cpudata = per_cpu_ptr(drvdata->cpudata, cpu);
 	struct coresight_device *trbe_csdev = coresight_get_percpu_sink(cpu);
 
-	disable_percpu_irq(drvdata->irq);
-	trbe_reset_local(cpudata);
 	if (trbe_csdev) {
 		coresight_unregister(trbe_csdev);
-		cpudata->drvdata = NULL;
 		coresight_set_percpu_sink(cpu, NULL);
 	}
 }
@@ -1366,8 +1371,12 @@ static int arm_trbe_remove_coresight(struct trbe_drvdata *drvdata)
 {
 	int cpu;
 
-	for_each_cpu(cpu, &drvdata->supported_cpus)
-		smp_call_function_single(cpu, arm_trbe_remove_coresight_cpu, drvdata, 1);
+	for_each_cpu(cpu, &drvdata->supported_cpus) {
+		if (cpumask_test_cpu(cpu, &drvdata->supported_cpus))
+			smp_call_function_single(cpu, arm_trbe_disable_cpu, drvdata, 1);
+		if (cpumask_test_cpu(cpu, &drvdata->supported_cpus))
+			arm_trbe_remove_coresight_cpu(drvdata, cpu);
+	}
 	free_percpu(drvdata->cpudata);
 	return 0;
 }
@@ -1406,12 +1415,8 @@ static int arm_trbe_cpu_teardown(unsigned int cpu, struct hlist_node *node)
 {
 	struct trbe_drvdata *drvdata = hlist_entry_safe(node, struct trbe_drvdata, hotplug_node);
 
-	if (cpumask_test_cpu(cpu, &drvdata->supported_cpus)) {
-		struct trbe_cpudata *cpudata = per_cpu_ptr(drvdata->cpudata, cpu);
-
-		disable_percpu_irq(drvdata->irq);
-		trbe_reset_local(cpudata);
-	}
+	if (cpumask_test_cpu(cpu, &drvdata->supported_cpus))
+		arm_trbe_disable_cpu(drvdata);
 	return 0;
 }
 
-- 
2.33.0


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 2/2] coresight: core: Fix multiple free TRBE platform data resource
  2023-08-14  9:38 [PATCH 0/2] Fix some issues with TRBE building as a module Junhao He
  2023-08-14  9:38 ` [PATCH 1/2] coresight: trbe: Fix TRBE potential sleep in atomic context Junhao He
@ 2023-08-14  9:38 ` Junhao He
  2023-08-14 22:47   ` Suzuki K Poulose
  2023-08-16 14:10 ` [PATCH v2 1/2] coresight: trbe: Fix TRBE potential sleep in atomic context Suzuki K Poulose
  2023-08-17  6:18 ` [PATCH 0/2] Fix some issues with TRBE building as a module Anshuman Khandual
  3 siblings, 1 reply; 25+ messages in thread
From: Junhao He @ 2023-08-14  9:38 UTC (permalink / raw)
  To: suzuki.poulose, mike.leach, leo.yan, anshuman.khandual, jonathan.cameron
  Cc: coresight, linux-arm-kernel, linux-kernel, linuxarm, yangyicong,
	prime.zeng, hejunhao3

Current the TRBE driver supports matching TRBE platform device through
id_table. The ACPI created a dummy TRBE platform device inside
drivers/perf/arm_pmu_acpi.c. So the TRBE platform driver will probe only
once and allocate just one TRBE platform data resource.

If the system supports the TRBE feature, Each CPU in the systems can
have at least one TRBE present, and the coresight_unregister gets called
multiple times, once for each of them.
Therefore, when unregister TRBE coresight devices, the TRBE platform data
resource will multiple free in function coresight_unregister.

root@localhost:# insmod coresight-trbe.ko
root@localhost:# rmmod coresight-trbe.ko
[  423.455932] ------------[ cut here ]------------
[  423.461987] WARNING: CPU: 1 PID: 0 at drivers/base/devres.c:1064 devm_kfree+0x88/0x98
[  423.483821] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G           O       6.5.0-rc4+ #1
[  423.505842] pstate: 614000c9 (nZCv daIF +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
...
[  423.601301] Call trace:
[  423.604202]  devm_kfree+0x88/0x98
[  423.608369]  coresight_release_platform_data+0xb8/0xe0 [coresight]
[  423.616589]  coresight_unregister+0x120/0x170 [coresight]
[  423.623533]  arm_trbe_remove_coresight_cpu+0x70/0xa0 [coresight_trbe]
[  423.631082]  __flush_smp_call_function_queue+0x1e4/0x4e0
[  423.637471]  generic_smp_call_function_single_interrupt+0x1c/0x30
[  423.644796]  ipi_handler+0x90/0x278
[  423.648992]  handle_percpu_devid_irq+0x90/0x250
[  423.654636]  generic_handle_domain_irq+0x34/0x58
[  423.659786]  gic_handle_irq+0x12c/0x270
[  423.664039]  call_on_irq_stack+0x24/0x30
[  423.668452]  do_interrupt_handler+0x88/0x98
[  423.673027]  el1_interrupt+0x48/0xe8
[  423.677413]  el1h_64_irq_handler+0x18/0x28
[  423.681781]  el1h_64_irq+0x78/0x80
[  423.685550]  default_idle_call+0x5c/0x180
[  423.689855]  do_idle+0x25c/0x2c0
[  423.694196]  cpu_startup_entry+0x2c/0x40
[  423.698373]  secondary_start_kernel+0x144/0x188
[  423.703920]  __secondary_switched+0xb8/0xc0
[  423.708972] ---[ end trace 0000000000000000 ]---
[  423.729209] ------------[ cut here ]------------
...
[  423.735217] WARNING: CPU: 2 PID: 40 at drivers/base/devres.c:1064 devm_kfree+0x88/0x98
...
[  424.012385] WARNING: CPU: 3 PID: 0 at drivers/base/devres.c:1064 devm_kfree+0x88/0x98
...

This patch does the following:
1.TRBE coresight devices do not need regular connections information, We
  can free connections resource when the nr_conns is valid.
2.And we can ignore the free platform data resource, it will be
  automatically free in platform_driver_unregister().

Signed-off-by: Junhao He <hejunhao3@huawei.com>
---
 drivers/hwtracing/coresight/coresight-core.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/hwtracing/coresight/coresight-core.c b/drivers/hwtracing/coresight/coresight-core.c
index 118fcf27854d..c6f7889d1b4d 100644
--- a/drivers/hwtracing/coresight/coresight-core.c
+++ b/drivers/hwtracing/coresight/coresight-core.c
@@ -1555,9 +1555,10 @@ void coresight_release_platform_data(struct coresight_device *csdev,
 		conns[i]->dest_fwnode = NULL;
 		devm_kfree(dev, conns[i]);
 	}
-	devm_kfree(dev, pdata->out_conns);
-	devm_kfree(dev, pdata->in_conns);
-	devm_kfree(dev, pdata);
+	if (pdata->nr_outconns)
+		devm_kfree(dev, pdata->out_conns);
+	if (pdata->nr_inconns)
+		devm_kfree(dev, pdata->in_conns);
 	if (csdev)
 		coresight_remove_conns_sysfs_group(csdev);
 }
-- 
2.33.0


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/2] coresight: trbe: Fix TRBE potential sleep in atomic context
  2023-08-14  9:38 ` [PATCH 1/2] coresight: trbe: Fix TRBE potential sleep in atomic context Junhao He
@ 2023-08-14 10:34   ` Suzuki K Poulose
  2023-08-14 13:32     ` hejunhao
  0 siblings, 1 reply; 25+ messages in thread
From: Suzuki K Poulose @ 2023-08-14 10:34 UTC (permalink / raw)
  To: Junhao He, mike.leach, leo.yan, anshuman.khandual, jonathan.cameron
  Cc: coresight, linux-arm-kernel, linux-kernel, linuxarm, yangyicong,
	prime.zeng

Hi Junhao

On 14/08/2023 10:38, Junhao He wrote:
> smp_call_function_single() will allocate an IPI interrupt vector to
> the target processor and send a function call request to the interrupt
> vector. After the target processor receives the IPI interrupt, it will
> execute arm_trbe_remove_coresight_cpu() call request in the interrupt
> handler.
> 
> According to the device_unregister() stack information, if other process
> is useing the device, the down_write() may sleep, and trigger deadlocks
> or unexpected errors.
> 
>    arm_trbe_remove_coresight_cpu
>      coresight_unregister
>        device_unregister
>          device_del
>            kobject_del
>              __kobject_del
>                sysfs_remove_dir
>                  kernfs_remove
>                    down_write ---------> it may sleep
> 
> Add a helper arm_trbe_disable_cpu() to disable TRBE precpu irq and reset
> per TRBE.
> Simply call arm_trbe_remove_coresight_cpu() directly without useing the
> smp_call_function_single(), which is the same as registering the TRBE
> coresight device.
> 
> Fixes: 3fbf7f011f24 ("coresight: sink: Add TRBE driver")
> Signed-off-by: Junhao He <hejunhao3@huawei.com>
> ---
>   drivers/hwtracing/coresight/coresight-trbe.c | 35 +++++++++++---------
>   1 file changed, 20 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/hwtracing/coresight/coresight-trbe.c b/drivers/hwtracing/coresight/coresight-trbe.c
> index 7720619909d6..ce1e6f537b8d 100644
> --- a/drivers/hwtracing/coresight/coresight-trbe.c
> +++ b/drivers/hwtracing/coresight/coresight-trbe.c
> @@ -1225,6 +1225,17 @@ static void arm_trbe_enable_cpu(void *info)
>   	enable_percpu_irq(drvdata->irq, IRQ_TYPE_NONE);
>   }
>   
> +static void arm_trbe_disable_cpu(void *info)
> +{
> +	struct trbe_drvdata *drvdata = info;
> +	struct trbe_cpudata *cpudata = this_cpu_ptr(drvdata->cpudata);
> +
> +	disable_percpu_irq(drvdata->irq);
> +	trbe_reset_local(cpudata);
> +	cpudata->drvdata = NULL;
> +}
> +
> +
>   static void arm_trbe_register_coresight_cpu(struct trbe_drvdata *drvdata, int cpu)
>   {
>   	struct trbe_cpudata *cpudata = per_cpu_ptr(drvdata->cpudata, cpu);
> @@ -1326,18 +1337,12 @@ static void arm_trbe_probe_cpu(void *info)
>   	cpumask_clear_cpu(cpu, &drvdata->supported_cpus);
>   }
>   
> -static void arm_trbe_remove_coresight_cpu(void *info)
> +static void arm_trbe_remove_coresight_cpu(struct trbe_drvdata *drvdata, int cpu)
>   {
> -	int cpu = smp_processor_id();
> -	struct trbe_drvdata *drvdata = info;
> -	struct trbe_cpudata *cpudata = per_cpu_ptr(drvdata->cpudata, cpu);
>   	struct coresight_device *trbe_csdev = coresight_get_percpu_sink(cpu);
>   
> -	disable_percpu_irq(drvdata->irq);
> -	trbe_reset_local(cpudata);
>   	if (trbe_csdev) {
>   		coresight_unregister(trbe_csdev);
> -		cpudata->drvdata = NULL;
>   		coresight_set_percpu_sink(cpu, NULL);

I am a bit concerned about "resetting" the sink from a different CPU.
Could we instead, schedule a delayed work to unregister the trbe_csdev?


Suzuki


>   	}
>   }
> @@ -1366,8 +1371,12 @@ static int arm_trbe_remove_coresight(struct trbe_drvdata *drvdata)
>   {
>   	int cpu;
>   
> -	for_each_cpu(cpu, &drvdata->supported_cpus)
> -		smp_call_function_single(cpu, arm_trbe_remove_coresight_cpu, drvdata, 1);
> +	for_each_cpu(cpu, &drvdata->supported_cpus) {
> +		if (cpumask_test_cpu(cpu, &drvdata->supported_cpus))
> +			smp_call_function_single(cpu, arm_trbe_disable_cpu, drvdata, 1);
> +		if (cpumask_test_cpu(cpu, &drvdata->supported_cpus))
> +			arm_trbe_remove_coresight_cpu(drvdata, cpu);
> +	}
>   	free_percpu(drvdata->cpudata);
>   	return 0;
>   }
> @@ -1406,12 +1415,8 @@ static int arm_trbe_cpu_teardown(unsigned int cpu, struct hlist_node *node)
>   {
>   	struct trbe_drvdata *drvdata = hlist_entry_safe(node, struct trbe_drvdata, hotplug_node);
>   
> -	if (cpumask_test_cpu(cpu, &drvdata->supported_cpus)) {
> -		struct trbe_cpudata *cpudata = per_cpu_ptr(drvdata->cpudata, cpu);
> -
> -		disable_percpu_irq(drvdata->irq);
> -		trbe_reset_local(cpudata);
> -	}
> +	if (cpumask_test_cpu(cpu, &drvdata->supported_cpus))
> +		arm_trbe_disable_cpu(drvdata);
>   	return 0;
>   }
>   


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/2] coresight: trbe: Fix TRBE potential sleep in atomic context
  2023-08-14 10:34   ` Suzuki K Poulose
@ 2023-08-14 13:32     ` hejunhao
  2023-08-14 22:57       ` Suzuki K Poulose
  0 siblings, 1 reply; 25+ messages in thread
From: hejunhao @ 2023-08-14 13:32 UTC (permalink / raw)
  To: Suzuki K Poulose, Junhao He, mike.leach, leo.yan,
	anshuman.khandual, jonathan.cameron
  Cc: coresight, linux-arm-kernel, linux-kernel, yangyicong, prime.zeng

Hi Suzuki


On 2023/8/14 18:34, Suzuki K Poulose wrote:
> Hi Junhao
>
> On 14/08/2023 10:38, Junhao He wrote:
>> smp_call_function_single() will allocate an IPI interrupt vector to
>> the target processor and send a function call request to the interrupt
>> vector. After the target processor receives the IPI interrupt, it will
>> execute arm_trbe_remove_coresight_cpu() call request in the interrupt
>> handler.
>>
>> According to the device_unregister() stack information, if other process
>> is useing the device, the down_write() may sleep, and trigger deadlocks
>> or unexpected errors.
>>
>>    arm_trbe_remove_coresight_cpu
>>      coresight_unregister
>>        device_unregister
>>          device_del
>>            kobject_del
>>              __kobject_del
>>                sysfs_remove_dir
>>                  kernfs_remove
>>                    down_write ---------> it may sleep
>>
>> Add a helper arm_trbe_disable_cpu() to disable TRBE precpu irq and reset
>> per TRBE.
>> Simply call arm_trbe_remove_coresight_cpu() directly without useing the
>> smp_call_function_single(), which is the same as registering the TRBE
>> coresight device.
>>
>> Fixes: 3fbf7f011f24 ("coresight: sink: Add TRBE driver")
>> Signed-off-by: Junhao He <hejunhao3@huawei.com>
>> ---
>>   drivers/hwtracing/coresight/coresight-trbe.c | 35 +++++++++++---------
>>   1 file changed, 20 insertions(+), 15 deletions(-)
>>
>> diff --git a/drivers/hwtracing/coresight/coresight-trbe.c 
>> b/drivers/hwtracing/coresight/coresight-trbe.c
>> index 7720619909d6..ce1e6f537b8d 100644
>> --- a/drivers/hwtracing/coresight/coresight-trbe.c
>> +++ b/drivers/hwtracing/coresight/coresight-trbe.c
>> @@ -1225,6 +1225,17 @@ static void arm_trbe_enable_cpu(void *info)
>>       enable_percpu_irq(drvdata->irq, IRQ_TYPE_NONE);
>>   }
>>   +static void arm_trbe_disable_cpu(void *info)
>> +{
>> +    struct trbe_drvdata *drvdata = info;
>> +    struct trbe_cpudata *cpudata = this_cpu_ptr(drvdata->cpudata);
>> +
>> +    disable_percpu_irq(drvdata->irq);
>> +    trbe_reset_local(cpudata);
>> +    cpudata->drvdata = NULL;
>> +}
>> +
>> +
>>   static void arm_trbe_register_coresight_cpu(struct trbe_drvdata 
>> *drvdata, int cpu)
>>   {
>>       struct trbe_cpudata *cpudata = per_cpu_ptr(drvdata->cpudata, cpu);
>> @@ -1326,18 +1337,12 @@ static void arm_trbe_probe_cpu(void *info)
>>       cpumask_clear_cpu(cpu, &drvdata->supported_cpus);
>>   }
>>   -static void arm_trbe_remove_coresight_cpu(void *info)
>> +static void arm_trbe_remove_coresight_cpu(struct trbe_drvdata 
>> *drvdata, int cpu)
>>   {
>> -    int cpu = smp_processor_id();
>> -    struct trbe_drvdata *drvdata = info;
>> -    struct trbe_cpudata *cpudata = per_cpu_ptr(drvdata->cpudata, cpu);
>>       struct coresight_device *trbe_csdev = 
>> coresight_get_percpu_sink(cpu);
>>   -    disable_percpu_irq(drvdata->irq);
>> -    trbe_reset_local(cpudata);
>>       if (trbe_csdev) {
>>           coresight_unregister(trbe_csdev);
>> -        cpudata->drvdata = NULL;
>>           coresight_set_percpu_sink(cpu, NULL);
>
> I am a bit concerned about "resetting" the sink from a different CPU.
> Could we instead, schedule a delayed work to unregister the trbe_csdev?

Yes, I will try to do that.
Sorry for my following questions.
As you mean, do we need to take the same care when setting the percpu sink
in the register trbe_csdev ?

Best regards,
Junhao.

>
>
>>       }
>>   }
>> @@ -1366,8 +1371,12 @@ static int arm_trbe_remove_coresight(struct 
>> trbe_drvdata *drvdata)
>>   {
>>       int cpu;
>>   -    for_each_cpu(cpu, &drvdata->supported_cpus)
>> -        smp_call_function_single(cpu, arm_trbe_remove_coresight_cpu, 
>> drvdata, 1);
>> +    for_each_cpu(cpu, &drvdata->supported_cpus) {
>> +        if (cpumask_test_cpu(cpu, &drvdata->supported_cpus))
>> +            smp_call_function_single(cpu, arm_trbe_disable_cpu, 
>> drvdata, 1);
>> +        if (cpumask_test_cpu(cpu, &drvdata->supported_cpus))
>> +            arm_trbe_remove_coresight_cpu(drvdata, cpu);
>> +    }
>>       free_percpu(drvdata->cpudata);
>>       return 0;
>>   }
>> @@ -1406,12 +1415,8 @@ static int arm_trbe_cpu_teardown(unsigned int 
>> cpu, struct hlist_node *node)
>>   {
>>       struct trbe_drvdata *drvdata = hlist_entry_safe(node, struct 
>> trbe_drvdata, hotplug_node);
>>   -    if (cpumask_test_cpu(cpu, &drvdata->supported_cpus)) {
>> -        struct trbe_cpudata *cpudata = per_cpu_ptr(drvdata->cpudata, 
>> cpu);
>> -
>> -        disable_percpu_irq(drvdata->irq);
>> -        trbe_reset_local(cpudata);
>> -    }
>> +    if (cpumask_test_cpu(cpu, &drvdata->supported_cpus))
>> +        arm_trbe_disable_cpu(drvdata);
>>       return 0;
>>   }
>
>
> .
>


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/2] coresight: core: Fix multiple free TRBE platform data resource
  2023-08-14  9:38 ` [PATCH 2/2] coresight: core: Fix multiple free TRBE platform data resource Junhao He
@ 2023-08-14 22:47   ` Suzuki K Poulose
  2023-08-15 11:38     ` hejunhao
  0 siblings, 1 reply; 25+ messages in thread
From: Suzuki K Poulose @ 2023-08-14 22:47 UTC (permalink / raw)
  To: Junhao He, mike.leach, leo.yan, anshuman.khandual,
	jonathan.cameron, James Clark
  Cc: coresight, linux-arm-kernel, linux-kernel, linuxarm, yangyicong,
	prime.zeng

+ James Clark

On 14/08/2023 10:38, Junhao He wrote:
> Current the TRBE driver supports matching TRBE platform device through
> id_table. The ACPI created a dummy TRBE platform device inside
> drivers/perf/arm_pmu_acpi.c. So the TRBE platform driver will probe only
> once and allocate just one TRBE platform data resource.
> 
> If the system supports the TRBE feature, Each CPU in the systems can
> have at least one TRBE present, and the coresight_unregister gets called
> multiple times, once for each of them.
> Therefore, when unregister TRBE coresight devices, the TRBE platform data
> resource will multiple free in function coresight_unregister.
> 
> root@localhost:# insmod coresight-trbe.ko
> root@localhost:# rmmod coresight-trbe.ko
> [  423.455932] ------------[ cut here ]------------
> [  423.461987] WARNING: CPU: 1 PID: 0 at drivers/base/devres.c:1064 devm_kfree+0x88/0x98
> [  423.483821] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G           O       6.5.0-rc4+ #1
> [  423.505842] pstate: 614000c9 (nZCv daIF +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
> ...
> [  423.601301] Call trace:
> [  423.604202]  devm_kfree+0x88/0x98
> [  423.608369]  coresight_release_platform_data+0xb8/0xe0 [coresight]
> [  423.616589]  coresight_unregister+0x120/0x170 [coresight]
> [  423.623533]  arm_trbe_remove_coresight_cpu+0x70/0xa0 [coresight_trbe]
> [  423.631082]  __flush_smp_call_function_queue+0x1e4/0x4e0
> [  423.637471]  generic_smp_call_function_single_interrupt+0x1c/0x30
> [  423.644796]  ipi_handler+0x90/0x278
> [  423.648992]  handle_percpu_devid_irq+0x90/0x250
> [  423.654636]  generic_handle_domain_irq+0x34/0x58
> [  423.659786]  gic_handle_irq+0x12c/0x270
> [  423.664039]  call_on_irq_stack+0x24/0x30
> [  423.668452]  do_interrupt_handler+0x88/0x98
> [  423.673027]  el1_interrupt+0x48/0xe8
> [  423.677413]  el1h_64_irq_handler+0x18/0x28
> [  423.681781]  el1h_64_irq+0x78/0x80
> [  423.685550]  default_idle_call+0x5c/0x180
> [  423.689855]  do_idle+0x25c/0x2c0
> [  423.694196]  cpu_startup_entry+0x2c/0x40
> [  423.698373]  secondary_start_kernel+0x144/0x188
> [  423.703920]  __secondary_switched+0xb8/0xc0
> [  423.708972] ---[ end trace 0000000000000000 ]---
> [  423.729209] ------------[ cut here ]------------
> ...
> [  423.735217] WARNING: CPU: 2 PID: 40 at drivers/base/devres.c:1064 devm_kfree+0x88/0x98
> ...
> [  424.012385] WARNING: CPU: 3 PID: 0 at drivers/base/devres.c:1064 devm_kfree+0x88/0x98
> ...
> 
> This patch does the following:
> 1.TRBE coresight devices do not need regular connections information, We
>    can free connections resource when the nr_conns is valid.
> 2.And we can ignore the free platform data resource, it will be
>    automatically free in platform_driver_unregister().

Do we need a Fixes tag here ?

> 
> Signed-off-by: Junhao He <hejunhao3@huawei.com>
> ---
>   drivers/hwtracing/coresight/coresight-core.c | 7 ++++---
>   1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/hwtracing/coresight/coresight-core.c b/drivers/hwtracing/coresight/coresight-core.c
> index 118fcf27854d..c6f7889d1b4d 100644
> --- a/drivers/hwtracing/coresight/coresight-core.c
> +++ b/drivers/hwtracing/coresight/coresight-core.c
> @@ -1555,9 +1555,10 @@ void coresight_release_platform_data(struct coresight_device *csdev,
>   		conns[i]->dest_fwnode = NULL;
>   		devm_kfree(dev, conns[i]);
>   	}
> -	devm_kfree(dev, pdata->out_conns);
> -	devm_kfree(dev, pdata->in_conns);
> -	devm_kfree(dev, pdata);
> +	if (pdata->nr_outconns)
> +		devm_kfree(dev, pdata->out_conns);
> +	if (pdata->nr_inconns)
> +		devm_kfree(dev, pdata->in_conns);

These allocations are made on the parent device and that
may never get unregistered (e.g., AMBA device, platform device,
stay forever, even when the "coresight" modules are unloaded).
Thus the memory will be left unused, literally leaking.
This specific devm_kfree() was added to fix that. May be we should fix
this in the TRBE driver to use separate pdata for the TRBE device
instances.

Suzuki

>   	if (csdev)
>   		coresight_remove_conns_sysfs_group(csdev);
>   }


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/2] coresight: trbe: Fix TRBE potential sleep in atomic context
  2023-08-14 13:32     ` hejunhao
@ 2023-08-14 22:57       ` Suzuki K Poulose
  2023-08-15 11:40         ` hejunhao
  0 siblings, 1 reply; 25+ messages in thread
From: Suzuki K Poulose @ 2023-08-14 22:57 UTC (permalink / raw)
  To: hejunhao, mike.leach, leo.yan, anshuman.khandual, jonathan.cameron
  Cc: coresight, linux-arm-kernel, linux-kernel, yangyicong, prime.zeng

On 14/08/2023 14:32, hejunhao wrote:
> Hi Suzuki
> 
> 
> On 2023/8/14 18:34, Suzuki K Poulose wrote:
>> Hi Junhao
>>
>> On 14/08/2023 10:38, Junhao He wrote:
>>> smp_call_function_single() will allocate an IPI interrupt vector to
>>> the target processor and send a function call request to the interrupt
>>> vector. After the target processor receives the IPI interrupt, it will
>>> execute arm_trbe_remove_coresight_cpu() call request in the interrupt
>>> handler.
>>>
>>> According to the device_unregister() stack information, if other process
>>> is useing the device, the down_write() may sleep, and trigger deadlocks
>>> or unexpected errors.
>>>
>>>    arm_trbe_remove_coresight_cpu
>>>      coresight_unregister
>>>        device_unregister
>>>          device_del
>>>            kobject_del
>>>              __kobject_del
>>>                sysfs_remove_dir
>>>                  kernfs_remove
>>>                    down_write ---------> it may sleep
>>>
>>> Add a helper arm_trbe_disable_cpu() to disable TRBE precpu irq and reset
>>> per TRBE.
>>> Simply call arm_trbe_remove_coresight_cpu() directly without useing the
>>> smp_call_function_single(), which is the same as registering the TRBE
>>> coresight device.
>>>
>>> Fixes: 3fbf7f011f24 ("coresight: sink: Add TRBE driver")
>>> Signed-off-by: Junhao He <hejunhao3@huawei.com>
>>> ---
>>>   drivers/hwtracing/coresight/coresight-trbe.c | 35 +++++++++++---------
>>>   1 file changed, 20 insertions(+), 15 deletions(-)
>>>
>>> diff --git a/drivers/hwtracing/coresight/coresight-trbe.c 
>>> b/drivers/hwtracing/coresight/coresight-trbe.c
>>> index 7720619909d6..ce1e6f537b8d 100644
>>> --- a/drivers/hwtracing/coresight/coresight-trbe.c
>>> +++ b/drivers/hwtracing/coresight/coresight-trbe.c
>>> @@ -1225,6 +1225,17 @@ static void arm_trbe_enable_cpu(void *info)
>>>       enable_percpu_irq(drvdata->irq, IRQ_TYPE_NONE);
>>>   }
>>>   +static void arm_trbe_disable_cpu(void *info)
>>> +{
>>> +    struct trbe_drvdata *drvdata = info;
>>> +    struct trbe_cpudata *cpudata = this_cpu_ptr(drvdata->cpudata);
>>> +
>>> +    disable_percpu_irq(drvdata->irq);
>>> +    trbe_reset_local(cpudata);
>>> +    cpudata->drvdata = NULL;
>>> +}
>>> +
>>> +
>>>   static void arm_trbe_register_coresight_cpu(struct trbe_drvdata 
>>> *drvdata, int cpu)
>>>   {
>>>       struct trbe_cpudata *cpudata = per_cpu_ptr(drvdata->cpudata, cpu);
>>> @@ -1326,18 +1337,12 @@ static void arm_trbe_probe_cpu(void *info)
>>>       cpumask_clear_cpu(cpu, &drvdata->supported_cpus);
>>>   }
>>>   -static void arm_trbe_remove_coresight_cpu(void *info)
>>> +static void arm_trbe_remove_coresight_cpu(struct trbe_drvdata 
>>> *drvdata, int cpu)
>>>   {
>>> -    int cpu = smp_processor_id();
>>> -    struct trbe_drvdata *drvdata = info;
>>> -    struct trbe_cpudata *cpudata = per_cpu_ptr(drvdata->cpudata, cpu);
>>>       struct coresight_device *trbe_csdev = 
>>> coresight_get_percpu_sink(cpu);
>>>   -    disable_percpu_irq(drvdata->irq);
>>> -    trbe_reset_local(cpudata);
>>>       if (trbe_csdev) {
>>>           coresight_unregister(trbe_csdev);
>>> -        cpudata->drvdata = NULL;
>>>           coresight_set_percpu_sink(cpu, NULL);
>>
>> I am a bit concerned about "resetting" the sink from a different CPU.
>> Could we instead, schedule a delayed work to unregister the trbe_csdev?
> 
> Yes, I will try to do that.
> Sorry for my following questions.
> As you mean, do we need to take the same care when setting the percpu sink
> in the register trbe_csdev ?

Apologies, having taken another look, we set the percpu_sink for
a cpu outside smp_call_function(). So, I think your patch is fine.


> 
> Best regards,
> Junhao.
> 
>>
>>
>>>       }
>>>   }
>>> @@ -1366,8 +1371,12 @@ static int arm_trbe_remove_coresight(struct 
>>> trbe_drvdata *drvdata)
>>>   {
>>>       int cpu;
>>>   -    for_each_cpu(cpu, &drvdata->supported_cpus)
>>> -        smp_call_function_single(cpu, arm_trbe_remove_coresight_cpu, 
>>> drvdata, 1);
>>> +    for_each_cpu(cpu, &drvdata->supported_cpus) {
>>> +        if (cpumask_test_cpu(cpu, &drvdata->supported_cpus))
>>> +            smp_call_function_single(cpu, arm_trbe_disable_cpu, 
>>> drvdata, 1);
>>> +        if (cpumask_test_cpu(cpu, &drvdata->supported_cpus))
>>> +            arm_trbe_remove_coresight_cpu(drvdata, cpu);

Do we need to test the cpu here in both places ? We already check that
in the loop entry. The reason why we repeat the check during the probe,
is to skip any CPUs that may have a TRBE not accessible.

Suzuki


>>> +    }
>>>       free_percpu(drvdata->cpudata);
>>>       return 0;
>>>   }
>>> @@ -1406,12 +1415,8 @@ static int arm_trbe_cpu_teardown(unsigned int 
>>> cpu, struct hlist_node *node)
>>>   {
>>>       struct trbe_drvdata *drvdata = hlist_entry_safe(node, struct 
>>> trbe_drvdata, hotplug_node);
>>>   -    if (cpumask_test_cpu(cpu, &drvdata->supported_cpus)) {
>>> -        struct trbe_cpudata *cpudata = per_cpu_ptr(drvdata->cpudata, 
>>> cpu);
>>> -
>>> -        disable_percpu_irq(drvdata->irq);
>>> -        trbe_reset_local(cpudata);
>>> -    }
>>> +    if (cpumask_test_cpu(cpu, &drvdata->supported_cpus))
>>> +        arm_trbe_disable_cpu(drvdata);
>>>       return 0;
>>>   }
>>
>>
>> .
>>
> 


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/2] coresight: core: Fix multiple free TRBE platform data resource
  2023-08-14 22:47   ` Suzuki K Poulose
@ 2023-08-15 11:38     ` hejunhao
  2023-08-16 13:13       ` Suzuki K Poulose
  0 siblings, 1 reply; 25+ messages in thread
From: hejunhao @ 2023-08-15 11:38 UTC (permalink / raw)
  To: Suzuki K Poulose, mike.leach, leo.yan, anshuman.khandual,
	jonathan.cameron, James Clark
  Cc: coresight, linux-arm-kernel, linux-kernel, linuxarm, yangyicong,
	prime.zeng

Hi, Suzuki


On 2023/8/15 6:47, Suzuki K Poulose wrote:
> + James Clark
>
> On 14/08/2023 10:38, Junhao He wrote:
>> Current the TRBE driver supports matching TRBE platform device through
>> id_table. The ACPI created a dummy TRBE platform device inside
>> drivers/perf/arm_pmu_acpi.c. So the TRBE platform driver will probe only
>> once and allocate just one TRBE platform data resource.
>>
>> If the system supports the TRBE feature, Each CPU in the systems can
>> have at least one TRBE present, and the coresight_unregister gets called
>> multiple times, once for each of them.
>> Therefore, when unregister TRBE coresight devices, the TRBE platform 
>> data
>> resource will multiple free in function coresight_unregister.
>>
>> root@localhost:# insmod coresight-trbe.ko
>> root@localhost:# rmmod coresight-trbe.ko
>> [  423.455932] ------------[ cut here ]------------
>> [  423.461987] WARNING: CPU: 1 PID: 0 at drivers/base/devres.c:1064 
>> devm_kfree+0x88/0x98
>> [  423.483821] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G           
>> O       6.5.0-rc4+ #1
>> [  423.505842] pstate: 614000c9 (nZCv daIF +PAN -UAO -TCO +DIT -SSBS 
>> BTYPE=--)
>> ...
>> [  423.601301] Call trace:
>> [  423.604202]  devm_kfree+0x88/0x98
>> [  423.608369]  coresight_release_platform_data+0xb8/0xe0 [coresight]
>> [  423.616589]  coresight_unregister+0x120/0x170 [coresight]
>> [  423.623533]  arm_trbe_remove_coresight_cpu+0x70/0xa0 [coresight_trbe]
>> [  423.631082]  __flush_smp_call_function_queue+0x1e4/0x4e0
>> [  423.637471] generic_smp_call_function_single_interrupt+0x1c/0x30
>> [  423.644796]  ipi_handler+0x90/0x278
>> [  423.648992]  handle_percpu_devid_irq+0x90/0x250
>> [  423.654636]  generic_handle_domain_irq+0x34/0x58
>> [  423.659786]  gic_handle_irq+0x12c/0x270
>> [  423.664039]  call_on_irq_stack+0x24/0x30
>> [  423.668452]  do_interrupt_handler+0x88/0x98
>> [  423.673027]  el1_interrupt+0x48/0xe8
>> [  423.677413]  el1h_64_irq_handler+0x18/0x28
>> [  423.681781]  el1h_64_irq+0x78/0x80
>> [  423.685550]  default_idle_call+0x5c/0x180
>> [  423.689855]  do_idle+0x25c/0x2c0
>> [  423.694196]  cpu_startup_entry+0x2c/0x40
>> [  423.698373]  secondary_start_kernel+0x144/0x188
>> [  423.703920]  __secondary_switched+0xb8/0xc0
>> [  423.708972] ---[ end trace 0000000000000000 ]---
>> [  423.729209] ------------[ cut here ]------------
>> ...
>> [  423.735217] WARNING: CPU: 2 PID: 40 at drivers/base/devres.c:1064 
>> devm_kfree+0x88/0x98
>> ...
>> [  424.012385] WARNING: CPU: 3 PID: 0 at drivers/base/devres.c:1064 
>> devm_kfree+0x88/0x98
>> ...
>>
>> This patch does the following:
>> 1.TRBE coresight devices do not need regular connections information, We
>>    can free connections resource when the nr_conns is valid.
>> 2.And we can ignore the free platform data resource, it will be
>>    automatically free in platform_driver_unregister().
>
> Do we need a Fixes tag here ?

Yes, I will do that.

>>
>> Signed-off-by: Junhao He <hejunhao3@huawei.com>
>> ---
>>   drivers/hwtracing/coresight/coresight-core.c | 7 ++++---
>>   1 file changed, 4 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/hwtracing/coresight/coresight-core.c 
>> b/drivers/hwtracing/coresight/coresight-core.c
>> index 118fcf27854d..c6f7889d1b4d 100644
>> --- a/drivers/hwtracing/coresight/coresight-core.c
>> +++ b/drivers/hwtracing/coresight/coresight-core.c
>> @@ -1555,9 +1555,10 @@ void coresight_release_platform_data(struct 
>> coresight_device *csdev,
>>           conns[i]->dest_fwnode = NULL;
>>           devm_kfree(dev, conns[i]);
>>       }
>> -    devm_kfree(dev, pdata->out_conns);
>> -    devm_kfree(dev, pdata->in_conns);
>> -    devm_kfree(dev, pdata);
>> +    if (pdata->nr_outconns)
>> +        devm_kfree(dev, pdata->out_conns);
>> +    if (pdata->nr_inconns)
>> +        devm_kfree(dev, pdata->in_conns);
>
> These allocations are made on the parent device and that
> may never get unregistered (e.g., AMBA device, platform device,
> stay forever, even when the "coresight" modules are unloaded).
> Thus the memory will be left unused, literally leaking.
> This specific devm_kfree() was added to fix that. May be we should fix
> this in the TRBE driver to use separate pdata for the TRBE device
> instances.
>
> Suzuki

If we fix this with minimal changes, I think it is possible to add a check
and not free pdata if it is TRBE?

     if (csdev->subtype.sink_subtype != 
CORESIGHT_DEV_SUBTYPE_SINK_PERCPU_SYSMEM)
         devm_kfree(dev, pdata);

Then free pdata in the end of arm_trbe_remove_coresight().

>
>>       if (csdev)
>>           coresight_remove_conns_sysfs_group(csdev);
>>   }
>
>
> .
>


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/2] coresight: trbe: Fix TRBE potential sleep in atomic context
  2023-08-14 22:57       ` Suzuki K Poulose
@ 2023-08-15 11:40         ` hejunhao
  0 siblings, 0 replies; 25+ messages in thread
From: hejunhao @ 2023-08-15 11:40 UTC (permalink / raw)
  To: Suzuki K Poulose, mike.leach, leo.yan, anshuman.khandual,
	jonathan.cameron
  Cc: coresight, linux-arm-kernel, linux-kernel, yangyicong, prime.zeng

Hi Suzuki


On 2023/8/15 6:57, Suzuki K Poulose wrote:
> On 14/08/2023 14:32, hejunhao wrote:
>> Hi Suzuki
>>
>>
>> On 2023/8/14 18:34, Suzuki K Poulose wrote:
>>> Hi Junhao
>>>
>>> On 14/08/2023 10:38, Junhao He wrote:
>>>> smp_call_function_single() will allocate an IPI interrupt vector to
>>>> the target processor and send a function call request to the interrupt
>>>> vector. After the target processor receives the IPI interrupt, it will
>>>> execute arm_trbe_remove_coresight_cpu() call request in the interrupt
>>>> handler.
>>>>
>>>> According to the device_unregister() stack information, if other 
>>>> process
>>>> is useing the device, the down_write() may sleep, and trigger 
>>>> deadlocks
>>>> or unexpected errors.
>>>>
>>>>    arm_trbe_remove_coresight_cpu
>>>>      coresight_unregister
>>>>        device_unregister
>>>>          device_del
>>>>            kobject_del
>>>>              __kobject_del
>>>>                sysfs_remove_dir
>>>>                  kernfs_remove
>>>>                    down_write ---------> it may sleep
>>>>
>>>> Add a helper arm_trbe_disable_cpu() to disable TRBE precpu irq and 
>>>> reset
>>>> per TRBE.
>>>> Simply call arm_trbe_remove_coresight_cpu() directly without useing 
>>>> the
>>>> smp_call_function_single(), which is the same as registering the TRBE
>>>> coresight device.
>>>>
>>>> Fixes: 3fbf7f011f24 ("coresight: sink: Add TRBE driver")
>>>> Signed-off-by: Junhao He <hejunhao3@huawei.com>
>>>> ---
>>>>   drivers/hwtracing/coresight/coresight-trbe.c | 35 
>>>> +++++++++++---------
>>>>   1 file changed, 20 insertions(+), 15 deletions(-)
>>>>
>>>> diff --git a/drivers/hwtracing/coresight/coresight-trbe.c 
>>>> b/drivers/hwtracing/coresight/coresight-trbe.c
>>>> index 7720619909d6..ce1e6f537b8d 100644
>>>> --- a/drivers/hwtracing/coresight/coresight-trbe.c
>>>> +++ b/drivers/hwtracing/coresight/coresight-trbe.c
>>>> @@ -1225,6 +1225,17 @@ static void arm_trbe_enable_cpu(void *info)
>>>>       enable_percpu_irq(drvdata->irq, IRQ_TYPE_NONE);
>>>>   }
>>>>   +static void arm_trbe_disable_cpu(void *info)
>>>> +{
>>>> +    struct trbe_drvdata *drvdata = info;
>>>> +    struct trbe_cpudata *cpudata = this_cpu_ptr(drvdata->cpudata);
>>>> +
>>>> +    disable_percpu_irq(drvdata->irq);
>>>> +    trbe_reset_local(cpudata);
>>>> +    cpudata->drvdata = NULL;
>>>> +}
>>>> +
>>>> +
>>>>   static void arm_trbe_register_coresight_cpu(struct trbe_drvdata 
>>>> *drvdata, int cpu)
>>>>   {
>>>>       struct trbe_cpudata *cpudata = per_cpu_ptr(drvdata->cpudata, 
>>>> cpu);
>>>> @@ -1326,18 +1337,12 @@ static void arm_trbe_probe_cpu(void *info)
>>>>       cpumask_clear_cpu(cpu, &drvdata->supported_cpus);
>>>>   }
>>>>   -static void arm_trbe_remove_coresight_cpu(void *info)
>>>> +static void arm_trbe_remove_coresight_cpu(struct trbe_drvdata 
>>>> *drvdata, int cpu)
>>>>   {
>>>> -    int cpu = smp_processor_id();
>>>> -    struct trbe_drvdata *drvdata = info;
>>>> -    struct trbe_cpudata *cpudata = per_cpu_ptr(drvdata->cpudata, 
>>>> cpu);
>>>>       struct coresight_device *trbe_csdev = 
>>>> coresight_get_percpu_sink(cpu);
>>>>   -    disable_percpu_irq(drvdata->irq);
>>>> -    trbe_reset_local(cpudata);
>>>>       if (trbe_csdev) {
>>>>           coresight_unregister(trbe_csdev);
>>>> -        cpudata->drvdata = NULL;
>>>>           coresight_set_percpu_sink(cpu, NULL);
>>>
>>> I am a bit concerned about "resetting" the sink from a different CPU.
>>> Could we instead, schedule a delayed work to unregister the trbe_csdev?
>>
>> Yes, I will try to do that.
>> Sorry for my following questions.
>> As you mean, do we need to take the same care when setting the percpu 
>> sink
>> in the register trbe_csdev ?
>
> Apologies, having taken another look, we set the percpu_sink for
> a cpu outside smp_call_function(). So, I think your patch is fine.
>
>
>>
>> Best regards,
>> Junhao.
>>
>>>
>>>
>>>>       }
>>>>   }
>>>> @@ -1366,8 +1371,12 @@ static int arm_trbe_remove_coresight(struct 
>>>> trbe_drvdata *drvdata)
>>>>   {
>>>>       int cpu;
>>>>   -    for_each_cpu(cpu, &drvdata->supported_cpus)
>>>> -        smp_call_function_single(cpu, 
>>>> arm_trbe_remove_coresight_cpu, drvdata, 1);
>>>> +    for_each_cpu(cpu, &drvdata->supported_cpus) {
>>>> +        if (cpumask_test_cpu(cpu, &drvdata->supported_cpus))
>>>> +            smp_call_function_single(cpu, arm_trbe_disable_cpu, 
>>>> drvdata, 1);
>>>> +        if (cpumask_test_cpu(cpu, &drvdata->supported_cpus))
>>>> +            arm_trbe_remove_coresight_cpu(drvdata, cpu);
>
> Do we need to test the cpu here in both places ? We already check that
> in the loop entry. The reason why we repeat the check during the probe,
> is to skip any CPUs that may have a TRBE not accessible.
>
> Suzuki
>

Ok, Will fix in next version.

Best regards,
Junhao.

>
>>>> +    }
>>>>       free_percpu(drvdata->cpudata);
>>>>       return 0;
>>>>   }
>>>> @@ -1406,12 +1415,8 @@ static int arm_trbe_cpu_teardown(unsigned 
>>>> int cpu, struct hlist_node *node)
>>>>   {
>>>>       struct trbe_drvdata *drvdata = hlist_entry_safe(node, struct 
>>>> trbe_drvdata, hotplug_node);
>>>>   -    if (cpumask_test_cpu(cpu, &drvdata->supported_cpus)) {
>>>> -        struct trbe_cpudata *cpudata = 
>>>> per_cpu_ptr(drvdata->cpudata, cpu);
>>>> -
>>>> -        disable_percpu_irq(drvdata->irq);
>>>> -        trbe_reset_local(cpudata);
>>>> -    }
>>>> +    if (cpumask_test_cpu(cpu, &drvdata->supported_cpus))
>>>> +        arm_trbe_disable_cpu(drvdata);
>>>>       return 0;
>>>>   }
>>>
>>>
>>> .
>>>
>>
>
>
> .
>


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/2] coresight: core: Fix multiple free TRBE platform data resource
  2023-08-15 11:38     ` hejunhao
@ 2023-08-16 13:13       ` Suzuki K Poulose
  2023-08-16 13:58         ` Suzuki K Poulose
  0 siblings, 1 reply; 25+ messages in thread
From: Suzuki K Poulose @ 2023-08-16 13:13 UTC (permalink / raw)
  To: hejunhao, mike.leach, leo.yan, anshuman.khandual,
	jonathan.cameron, James Clark
  Cc: coresight, linux-arm-kernel, linux-kernel, linuxarm, yangyicong,
	prime.zeng

On 15/08/2023 12:38, hejunhao wrote:
> Hi, Suzuki
> 
> 
> On 2023/8/15 6:47, Suzuki K Poulose wrote:
>> + James Clark
>>
>> On 14/08/2023 10:38, Junhao He wrote:
>>> Current the TRBE driver supports matching TRBE platform device through
>>> id_table. The ACPI created a dummy TRBE platform device inside
>>> drivers/perf/arm_pmu_acpi.c. So the TRBE platform driver will probe only
>>> once and allocate just one TRBE platform data resource.
>>>
>>> If the system supports the TRBE feature, Each CPU in the systems can
>>> have at least one TRBE present, and the coresight_unregister gets called
>>> multiple times, once for each of them.
>>> Therefore, when unregister TRBE coresight devices, the TRBE platform 
>>> data
>>> resource will multiple free in function coresight_unregister.
>>>
>>> root@localhost:# insmod coresight-trbe.ko
>>> root@localhost:# rmmod coresight-trbe.ko
>>> [  423.455932] ------------[ cut here ]------------
>>> [  423.461987] WARNING: CPU: 1 PID: 0 at drivers/base/devres.c:1064 
>>> devm_kfree+0x88/0x98
>>> [  423.483821] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G O       
>>> 6.5.0-rc4+ #1
>>> [  423.505842] pstate: 614000c9 (nZCv daIF +PAN -UAO -TCO +DIT -SSBS 
>>> BTYPE=--)
>>> ...
>>> [  423.601301] Call trace:
>>> [  423.604202]  devm_kfree+0x88/0x98
>>> [  423.608369]  coresight_release_platform_data+0xb8/0xe0 [coresight]
>>> [  423.616589]  coresight_unregister+0x120/0x170 [coresight]
>>> [  423.623533]  arm_trbe_remove_coresight_cpu+0x70/0xa0 [coresight_trbe]
>>> [  423.631082]  __flush_smp_call_function_queue+0x1e4/0x4e0
>>> [  423.637471] generic_smp_call_function_single_interrupt+0x1c/0x30
>>> [  423.644796]  ipi_handler+0x90/0x278
>>> [  423.648992]  handle_percpu_devid_irq+0x90/0x250
>>> [  423.654636]  generic_handle_domain_irq+0x34/0x58
>>> [  423.659786]  gic_handle_irq+0x12c/0x270
>>> [  423.664039]  call_on_irq_stack+0x24/0x30
>>> [  423.668452]  do_interrupt_handler+0x88/0x98
>>> [  423.673027]  el1_interrupt+0x48/0xe8
>>> [  423.677413]  el1h_64_irq_handler+0x18/0x28
>>> [  423.681781]  el1h_64_irq+0x78/0x80
>>> [  423.685550]  default_idle_call+0x5c/0x180
>>> [  423.689855]  do_idle+0x25c/0x2c0
>>> [  423.694196]  cpu_startup_entry+0x2c/0x40
>>> [  423.698373]  secondary_start_kernel+0x144/0x188
>>> [  423.703920]  __secondary_switched+0xb8/0xc0
>>> [  423.708972] ---[ end trace 0000000000000000 ]---
>>> [  423.729209] ------------[ cut here ]------------
>>> ...
>>> [  423.735217] WARNING: CPU: 2 PID: 40 at drivers/base/devres.c:1064 
>>> devm_kfree+0x88/0x98
>>> ...
>>> [  424.012385] WARNING: CPU: 3 PID: 0 at drivers/base/devres.c:1064 
>>> devm_kfree+0x88/0x98
>>> ...
>>>
>>> This patch does the following:
>>> 1.TRBE coresight devices do not need regular connections information, We
>>>    can free connections resource when the nr_conns is valid.
>>> 2.And we can ignore the free platform data resource, it will be
>>>    automatically free in platform_driver_unregister().
>>
>> Do we need a Fixes tag here ?
> 
> Yes, I will do that.
> 
>>>
>>> Signed-off-by: Junhao He <hejunhao3@huawei.com>
>>> ---
>>>   drivers/hwtracing/coresight/coresight-core.c | 7 ++++---
>>>   1 file changed, 4 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/hwtracing/coresight/coresight-core.c 
>>> b/drivers/hwtracing/coresight/coresight-core.c
>>> index 118fcf27854d..c6f7889d1b4d 100644
>>> --- a/drivers/hwtracing/coresight/coresight-core.c
>>> +++ b/drivers/hwtracing/coresight/coresight-core.c
>>> @@ -1555,9 +1555,10 @@ void coresight_release_platform_data(struct 
>>> coresight_device *csdev,
>>>           conns[i]->dest_fwnode = NULL;
>>>           devm_kfree(dev, conns[i]);
>>>       }
>>> -    devm_kfree(dev, pdata->out_conns);
>>> -    devm_kfree(dev, pdata->in_conns);
>>> -    devm_kfree(dev, pdata);
>>> +    if (pdata->nr_outconns)
>>> +        devm_kfree(dev, pdata->out_conns);
>>> +    if (pdata->nr_inconns)
>>> +        devm_kfree(dev, pdata->in_conns);
>>
>> These allocations are made on the parent device and that
>> may never get unregistered (e.g., AMBA device, platform device,
>> stay forever, even when the "coresight" modules are unloaded).
>> Thus the memory will be left unused, literally leaking.
>> This specific devm_kfree() was added to fix that. May be we should fix
>> this in the TRBE driver to use separate pdata for the TRBE device
>> instances.
>>
>> Suzuki
> 
> If we fix this with minimal changes, I think it is possible to add a check
> and not free pdata if it is TRBE?
> 
>      if (csdev->subtype.sink_subtype != 
> CORESIGHT_DEV_SUBTYPE_SINK_PERCPU_SYSMEM)
>          devm_kfree(dev, pdata);
> 
> Then free pdata in the end of arm_trbe_remove_coresight().

It is much nicer to do something like :


--8>--


coresight: trbe: Allocate platform data per device

Coresight TRBE driver shares a single platform data (which is empty btw).
However, with the commit 4e8fe7e5c3a5
("coresight: Store pointers to connections rather than an array of them")
the coresight core would free up the pdata, resulting in multiple attempts
to free the same pdata for TRBE instances. Fix this by allocating a 
pdata per
coresight_device.

Fixes: 3fbf7f011f24 ("coresight: sink: Add TRBE driver")
Reported-by: Junhao He <hejunhao3@huawei.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
  drivers/hwtracing/coresight/coresight-trbe.c | 15 ++++++++-------
  1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/drivers/hwtracing/coresight/coresight-trbe.c 
b/drivers/hwtracing/coresight/coresight-trbe.c
index 025f70adee47..fbab2bb4db38 100644
--- a/drivers/hwtracing/coresight/coresight-trbe.c
+++ b/drivers/hwtracing/coresight/coresight-trbe.c
@@ -1255,10 +1255,17 @@ static void 
arm_trbe_register_coresight_cpu(struct trbe_drvdata *drvdata, int cp
  	if (!desc.name)
  		goto cpu_clear;

+	/*
+	 * TRBE doesn't have any connections described via firmware. Instead
+	 * we register the trbe instance as per-cpu sink.
+	 */
+	desc.pdata = coresight_get_platform_data(dev);
+	if (IS_ERR(desc.pdata))
+		goto cpu_clear;
+
  	desc.type = CORESIGHT_DEV_TYPE_SINK;
  	desc.subtype.sink_subtype = CORESIGHT_DEV_SUBTYPE_SINK_PERCPU_SYSMEM;
  	desc.ops = &arm_trbe_cs_ops;
-	desc.pdata = dev_get_platdata(dev);
  	desc.groups = arm_trbe_groups;
  	desc.dev = dev;
  	trbe_csdev = coresight_register(&desc);
@@ -1482,7 +1489,6 @@ static void arm_trbe_remove_irq(struct 
trbe_drvdata *drvdata)

  static int arm_trbe_device_probe(struct platform_device *pdev)
  {
-	struct coresight_platform_data *pdata;
  	struct trbe_drvdata *drvdata;
  	struct device *dev = &pdev->dev;
  	int ret;
@@ -1497,12 +1503,7 @@ static int arm_trbe_device_probe(struct 
platform_device *pdev)
  	if (!drvdata)
  		return -ENOMEM;

-	pdata = coresight_get_platform_data(dev);
-	if (IS_ERR(pdata))
-		return PTR_ERR(pdata);
-
  	dev_set_drvdata(dev, drvdata);
-	dev->platform_data = pdata;
  	drvdata->pdev = pdev;
  	ret = arm_trbe_probe_irq(pdev, drvdata);
  	if (ret)
-- 
2.34.1



> 
>>
>>>       if (csdev)
>>>           coresight_remove_conns_sysfs_group(csdev);
>>>   }
>>
>>
>> .
>>
> 


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/2] coresight: core: Fix multiple free TRBE platform data resource
  2023-08-16 13:13       ` Suzuki K Poulose
@ 2023-08-16 13:58         ` Suzuki K Poulose
  0 siblings, 0 replies; 25+ messages in thread
From: Suzuki K Poulose @ 2023-08-16 13:58 UTC (permalink / raw)
  To: hejunhao, mike.leach, leo.yan, anshuman.khandual,
	jonathan.cameron, James Clark
  Cc: coresight, linux-arm-kernel, linux-kernel, linuxarm, yangyicong,
	prime.zeng

On 16/08/2023 14:13, Suzuki K Poulose wrote:
> On 15/08/2023 12:38, hejunhao wrote:
>> Hi, Suzuki
>>
>>
>> On 2023/8/15 6:47, Suzuki K Poulose wrote:
>>> + James Clark
>>>
>>> On 14/08/2023 10:38, Junhao He wrote:
>>>> Current the TRBE driver supports matching TRBE platform device through
>>>> id_table. The ACPI created a dummy TRBE platform device inside
>>>> drivers/perf/arm_pmu_acpi.c. So the TRBE platform driver will probe 
>>>> only
>>>> once and allocate just one TRBE platform data resource.
>>>>
>>>> If the system supports the TRBE feature, Each CPU in the systems can
>>>> have at least one TRBE present, and the coresight_unregister gets 
>>>> called
>>>> multiple times, once for each of them.
>>>> Therefore, when unregister TRBE coresight devices, the TRBE platform 
>>>> data
>>>> resource will multiple free in function coresight_unregister.
>>>>
>>>> root@localhost:# insmod coresight-trbe.ko
>>>> root@localhost:# rmmod coresight-trbe.ko
>>>> [  423.455932] ------------[ cut here ]------------
>>>> [  423.461987] WARNING: CPU: 1 PID: 0 at drivers/base/devres.c:1064 
>>>> devm_kfree+0x88/0x98
>>>> [  423.483821] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G O 6.5.0-rc4+ #1
>>>> [  423.505842] pstate: 614000c9 (nZCv daIF +PAN -UAO -TCO +DIT -SSBS 
>>>> BTYPE=--)
>>>> ...
>>>> [  423.601301] Call trace:
>>>> [  423.604202]  devm_kfree+0x88/0x98
>>>> [  423.608369]  coresight_release_platform_data+0xb8/0xe0 [coresight]
>>>> [  423.616589]  coresight_unregister+0x120/0x170 [coresight]
>>>> [  423.623533]  arm_trbe_remove_coresight_cpu+0x70/0xa0 
>>>> [coresight_trbe]
>>>> [  423.631082]  __flush_smp_call_function_queue+0x1e4/0x4e0
>>>> [  423.637471] generic_smp_call_function_single_interrupt+0x1c/0x30
>>>> [  423.644796]  ipi_handler+0x90/0x278
>>>> [  423.648992]  handle_percpu_devid_irq+0x90/0x250
>>>> [  423.654636]  generic_handle_domain_irq+0x34/0x58
>>>> [  423.659786]  gic_handle_irq+0x12c/0x270
>>>> [  423.664039]  call_on_irq_stack+0x24/0x30
>>>> [  423.668452]  do_interrupt_handler+0x88/0x98
>>>> [  423.673027]  el1_interrupt+0x48/0xe8
>>>> [  423.677413]  el1h_64_irq_handler+0x18/0x28
>>>> [  423.681781]  el1h_64_irq+0x78/0x80
>>>> [  423.685550]  default_idle_call+0x5c/0x180
>>>> [  423.689855]  do_idle+0x25c/0x2c0
>>>> [  423.694196]  cpu_startup_entry+0x2c/0x40
>>>> [  423.698373]  secondary_start_kernel+0x144/0x188
>>>> [  423.703920]  __secondary_switched+0xb8/0xc0
>>>> [  423.708972] ---[ end trace 0000000000000000 ]---
>>>> [  423.729209] ------------[ cut here ]------------
>>>> ...
>>>> [  423.735217] WARNING: CPU: 2 PID: 40 at drivers/base/devres.c:1064 
>>>> devm_kfree+0x88/0x98
>>>> ...
>>>> [  424.012385] WARNING: CPU: 3 PID: 0 at drivers/base/devres.c:1064 
>>>> devm_kfree+0x88/0x98
>>>> ...
>>>>
>>>> This patch does the following:
>>>> 1.TRBE coresight devices do not need regular connections 
>>>> information, We
>>>>    can free connections resource when the nr_conns is valid.
>>>> 2.And we can ignore the free platform data resource, it will be
>>>>    automatically free in platform_driver_unregister().
>>>
>>> Do we need a Fixes tag here ?
>>
>> Yes, I will do that.
>>
>>>>
>>>> Signed-off-by: Junhao He <hejunhao3@huawei.com>
>>>> ---
>>>>   drivers/hwtracing/coresight/coresight-core.c | 7 ++++---
>>>>   1 file changed, 4 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/drivers/hwtracing/coresight/coresight-core.c 
>>>> b/drivers/hwtracing/coresight/coresight-core.c
>>>> index 118fcf27854d..c6f7889d1b4d 100644
>>>> --- a/drivers/hwtracing/coresight/coresight-core.c
>>>> +++ b/drivers/hwtracing/coresight/coresight-core.c
>>>> @@ -1555,9 +1555,10 @@ void coresight_release_platform_data(struct 
>>>> coresight_device *csdev,
>>>>           conns[i]->dest_fwnode = NULL;
>>>>           devm_kfree(dev, conns[i]);
>>>>       }
>>>> -    devm_kfree(dev, pdata->out_conns);
>>>> -    devm_kfree(dev, pdata->in_conns);
>>>> -    devm_kfree(dev, pdata);
>>>> +    if (pdata->nr_outconns)
>>>> +        devm_kfree(dev, pdata->out_conns);
>>>> +    if (pdata->nr_inconns)
>>>> +        devm_kfree(dev, pdata->in_conns);
>>>
>>> These allocations are made on the parent device and that
>>> may never get unregistered (e.g., AMBA device, platform device,
>>> stay forever, even when the "coresight" modules are unloaded).
>>> Thus the memory will be left unused, literally leaking.
>>> This specific devm_kfree() was added to fix that. May be we should fix
>>> this in the TRBE driver to use separate pdata for the TRBE device
>>> instances.
>>>
>>> Suzuki
>>
>> If we fix this with minimal changes, I think it is possible to add a 
>> check
>> and not free pdata if it is TRBE?
>>
>>      if (csdev->subtype.sink_subtype != 
>> CORESIGHT_DEV_SUBTYPE_SINK_PERCPU_SYSMEM)
>>          devm_kfree(dev, pdata);
>>
>> Then free pdata in the end of arm_trbe_remove_coresight().
> 
> It is much nicer to do something like :
> 
> 
> --8>--
> 
> 
> coresight: trbe: Allocate platform data per device
> 
> Coresight TRBE driver shares a single platform data (which is empty btw).
> However, with the commit 4e8fe7e5c3a5
> ("coresight: Store pointers to connections rather than an array of them")
> the coresight core would free up the pdata, resulting in multiple attempts
> to free the same pdata for TRBE instances. Fix this by allocating a 
> pdata per
> coresight_device.
> 
> Fixes: 3fbf7f011f24 ("coresight: sink: Add TRBE driver")
> Reported-by: Junhao He <hejunhao3@huawei.com>
> Cc: Anshuman Khandual <anshuman.khandual@arm.com>
> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
> ---
>   drivers/hwtracing/coresight/coresight-trbe.c | 15 ++++++++-------
>   1 file changed, 8 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/hwtracing/coresight/coresight-trbe.c 
> b/drivers/hwtracing/coresight/coresight-trbe.c
> index 025f70adee47..fbab2bb4db38 100644
> --- a/drivers/hwtracing/coresight/coresight-trbe.c
> +++ b/drivers/hwtracing/coresight/coresight-trbe.c
> @@ -1255,10 +1255,17 @@ static void 
> arm_trbe_register_coresight_cpu(struct trbe_drvdata *drvdata, int cp
>       if (!desc.name)
>           goto cpu_clear;

---8>--

> 
> +    /*
> +     * TRBE doesn't have any connections described via firmware. Instead
> +     * we register the trbe instance as per-cpu sink.
> +     */

Ignore the comments above ^. I have tested this locally and works for 
me. Please could you confirm if this solves the problem for you ?

Kind regards
Suzuki


> +    desc.pdata = coresight_get_platform_data(dev);
> +    if (IS_ERR(desc.pdata))
> +        goto cpu_clear;
> +
>       desc.type = CORESIGHT_DEV_TYPE_SINK;
>       desc.subtype.sink_subtype = CORESIGHT_DEV_SUBTYPE_SINK_PERCPU_SYSMEM;
>       desc.ops = &arm_trbe_cs_ops;
> -    desc.pdata = dev_get_platdata(dev);
>       desc.groups = arm_trbe_groups;
>       desc.dev = dev;
>       trbe_csdev = coresight_register(&desc);
> @@ -1482,7 +1489,6 @@ static void arm_trbe_remove_irq(struct 
> trbe_drvdata *drvdata)
> 
>   static int arm_trbe_device_probe(struct platform_device *pdev)
>   {
> -    struct coresight_platform_data *pdata;
>       struct trbe_drvdata *drvdata;
>       struct device *dev = &pdev->dev;
>       int ret;
> @@ -1497,12 +1503,7 @@ static int arm_trbe_device_probe(struct 
> platform_device *pdev)
>       if (!drvdata)
>           return -ENOMEM;
> 
> -    pdata = coresight_get_platform_data(dev);
> -    if (IS_ERR(pdata))
> -        return PTR_ERR(pdata);
> -
>       dev_set_drvdata(dev, drvdata);
> -    dev->platform_data = pdata;
>       drvdata->pdev = pdev;
>       ret = arm_trbe_probe_irq(pdev, drvdata);
>       if (ret)


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH v2 1/2] coresight: trbe: Fix TRBE potential sleep in atomic context
  2023-08-14  9:38 [PATCH 0/2] Fix some issues with TRBE building as a module Junhao He
  2023-08-14  9:38 ` [PATCH 1/2] coresight: trbe: Fix TRBE potential sleep in atomic context Junhao He
  2023-08-14  9:38 ` [PATCH 2/2] coresight: core: Fix multiple free TRBE platform data resource Junhao He
@ 2023-08-16 14:10 ` Suzuki K Poulose
  2023-08-16 14:10   ` [PATCH 2/2] coresight: trbe: Allocate platform data per device Suzuki K Poulose
  2023-08-17  7:13   ` [PATCH v2 1/2] coresight: trbe: Fix TRBE potential sleep in atomic context Anshuman Khandual
  2023-08-17  6:18 ` [PATCH 0/2] Fix some issues with TRBE building as a module Anshuman Khandual
  3 siblings, 2 replies; 25+ messages in thread
From: Suzuki K Poulose @ 2023-08-16 14:10 UTC (permalink / raw)
  To: hejunhao3
  Cc: coresight, linux-kernel, linux-arm-kernel, jonathan.cameron,
	leo.yan, mike.leach, james.clark, linuxarm, yangyicong,
	prime.zeng, Suzuki K Poulose

From: Junhao He <hejunhao3@huawei.com>

smp_call_function_single() will allocate an IPI interrupt vector to
the target processor and send a function call request to the interrupt
vector. After the target processor receives the IPI interrupt, it will
execute arm_trbe_remove_coresight_cpu() call request in the interrupt
handler.

According to the device_unregister() stack information, if other process
is useing the device, the down_write() may sleep, and trigger deadlocks
or unexpected errors.

  arm_trbe_remove_coresight_cpu
    coresight_unregister
      device_unregister
        device_del
          kobject_del
            __kobject_del
              sysfs_remove_dir
                kernfs_remove
                  down_write ---------> it may sleep

Add a helper arm_trbe_disable_cpu() to disable TRBE precpu irq and reset
per TRBE.
Simply call arm_trbe_remove_coresight_cpu() directly without useing the
smp_call_function_single(), which is the same as registering the TRBE
coresight device.

Fixes: 3fbf7f011f24 ("coresight: sink: Add TRBE driver")
Signed-off-by: Junhao He <hejunhao3@huawei.com>
Link: https://lore.kernel.org/r/20230814093813.19152-2-hejunhao3@huawei.com
[ Remove duplicate cpumask checks during removal ]
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 drivers/hwtracing/coresight/coresight-trbe.c | 33 +++++++++++---------
 1 file changed, 18 insertions(+), 15 deletions(-)

diff --git a/drivers/hwtracing/coresight/coresight-trbe.c b/drivers/hwtracing/coresight/coresight-trbe.c
index 7720619909d6..025f70adee47 100644
--- a/drivers/hwtracing/coresight/coresight-trbe.c
+++ b/drivers/hwtracing/coresight/coresight-trbe.c
@@ -1225,6 +1225,17 @@ static void arm_trbe_enable_cpu(void *info)
 	enable_percpu_irq(drvdata->irq, IRQ_TYPE_NONE);
 }
 
+static void arm_trbe_disable_cpu(void *info)
+{
+	struct trbe_drvdata *drvdata = info;
+	struct trbe_cpudata *cpudata = this_cpu_ptr(drvdata->cpudata);
+
+	disable_percpu_irq(drvdata->irq);
+	trbe_reset_local(cpudata);
+	cpudata->drvdata = NULL;
+}
+
+
 static void arm_trbe_register_coresight_cpu(struct trbe_drvdata *drvdata, int cpu)
 {
 	struct trbe_cpudata *cpudata = per_cpu_ptr(drvdata->cpudata, cpu);
@@ -1326,18 +1337,12 @@ static void arm_trbe_probe_cpu(void *info)
 	cpumask_clear_cpu(cpu, &drvdata->supported_cpus);
 }
 
-static void arm_trbe_remove_coresight_cpu(void *info)
+static void arm_trbe_remove_coresight_cpu(struct trbe_drvdata *drvdata, int cpu)
 {
-	int cpu = smp_processor_id();
-	struct trbe_drvdata *drvdata = info;
-	struct trbe_cpudata *cpudata = per_cpu_ptr(drvdata->cpudata, cpu);
 	struct coresight_device *trbe_csdev = coresight_get_percpu_sink(cpu);
 
-	disable_percpu_irq(drvdata->irq);
-	trbe_reset_local(cpudata);
 	if (trbe_csdev) {
 		coresight_unregister(trbe_csdev);
-		cpudata->drvdata = NULL;
 		coresight_set_percpu_sink(cpu, NULL);
 	}
 }
@@ -1366,8 +1371,10 @@ static int arm_trbe_remove_coresight(struct trbe_drvdata *drvdata)
 {
 	int cpu;
 
-	for_each_cpu(cpu, &drvdata->supported_cpus)
-		smp_call_function_single(cpu, arm_trbe_remove_coresight_cpu, drvdata, 1);
+	for_each_cpu(cpu, &drvdata->supported_cpus) {
+		smp_call_function_single(cpu, arm_trbe_disable_cpu, drvdata, 1);
+		arm_trbe_remove_coresight_cpu(drvdata, cpu);
+	}
 	free_percpu(drvdata->cpudata);
 	return 0;
 }
@@ -1406,12 +1413,8 @@ static int arm_trbe_cpu_teardown(unsigned int cpu, struct hlist_node *node)
 {
 	struct trbe_drvdata *drvdata = hlist_entry_safe(node, struct trbe_drvdata, hotplug_node);
 
-	if (cpumask_test_cpu(cpu, &drvdata->supported_cpus)) {
-		struct trbe_cpudata *cpudata = per_cpu_ptr(drvdata->cpudata, cpu);
-
-		disable_percpu_irq(drvdata->irq);
-		trbe_reset_local(cpudata);
-	}
+	if (cpumask_test_cpu(cpu, &drvdata->supported_cpus))
+		arm_trbe_disable_cpu(drvdata);
 	return 0;
 }
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 2/2] coresight: trbe: Allocate platform data per device
  2023-08-16 14:10 ` [PATCH v2 1/2] coresight: trbe: Fix TRBE potential sleep in atomic context Suzuki K Poulose
@ 2023-08-16 14:10   ` Suzuki K Poulose
  2023-08-17  6:37     ` Anshuman Khandual
  2023-08-17  8:47     ` hejunhao
  2023-08-17  7:13   ` [PATCH v2 1/2] coresight: trbe: Fix TRBE potential sleep in atomic context Anshuman Khandual
  1 sibling, 2 replies; 25+ messages in thread
From: Suzuki K Poulose @ 2023-08-16 14:10 UTC (permalink / raw)
  To: hejunhao3
  Cc: coresight, linux-kernel, linux-arm-kernel, jonathan.cameron,
	leo.yan, mike.leach, james.clark, linuxarm, yangyicong,
	prime.zeng, Suzuki K Poulose, Anshuman Khandual

Coresight TRBE driver shares a single platform data (which is empty btw).
However, with the commit 4e8fe7e5c3a5
("coresight: Store pointers to connections rather than an array of them")
the coresight core would free up the pdata, resulting in multiple attempts
to free the same pdata for TRBE instances. Fix this by allocating a pdata per
coresight_device.

Fixes: 3fbf7f011f24 ("coresight: sink: Add TRBE driver")
Link: https://lore.kernel.org/r/20230814093813.19152-3-hejunhao3@huawei.com
Reported-by: Junhao He <hejunhao3@huawei.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 drivers/hwtracing/coresight/coresight-trbe.c | 11 ++++-------
 1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/drivers/hwtracing/coresight/coresight-trbe.c b/drivers/hwtracing/coresight/coresight-trbe.c
index 025f70adee47..d3d34a833f01 100644
--- a/drivers/hwtracing/coresight/coresight-trbe.c
+++ b/drivers/hwtracing/coresight/coresight-trbe.c
@@ -1255,10 +1255,13 @@ static void arm_trbe_register_coresight_cpu(struct trbe_drvdata *drvdata, int cp
 	if (!desc.name)
 		goto cpu_clear;
 
+	desc.pdata = coresight_get_platform_data(dev);
+	if (IS_ERR(desc.pdata))
+		goto cpu_clear;
+
 	desc.type = CORESIGHT_DEV_TYPE_SINK;
 	desc.subtype.sink_subtype = CORESIGHT_DEV_SUBTYPE_SINK_PERCPU_SYSMEM;
 	desc.ops = &arm_trbe_cs_ops;
-	desc.pdata = dev_get_platdata(dev);
 	desc.groups = arm_trbe_groups;
 	desc.dev = dev;
 	trbe_csdev = coresight_register(&desc);
@@ -1482,7 +1485,6 @@ static void arm_trbe_remove_irq(struct trbe_drvdata *drvdata)
 
 static int arm_trbe_device_probe(struct platform_device *pdev)
 {
-	struct coresight_platform_data *pdata;
 	struct trbe_drvdata *drvdata;
 	struct device *dev = &pdev->dev;
 	int ret;
@@ -1497,12 +1499,7 @@ static int arm_trbe_device_probe(struct platform_device *pdev)
 	if (!drvdata)
 		return -ENOMEM;
 
-	pdata = coresight_get_platform_data(dev);
-	if (IS_ERR(pdata))
-		return PTR_ERR(pdata);
-
 	dev_set_drvdata(dev, drvdata);
-	dev->platform_data = pdata;
 	drvdata->pdev = pdev;
 	ret = arm_trbe_probe_irq(pdev, drvdata);
 	if (ret)
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH 0/2] Fix some issues with TRBE building as a module
  2023-08-14  9:38 [PATCH 0/2] Fix some issues with TRBE building as a module Junhao He
                   ` (2 preceding siblings ...)
  2023-08-16 14:10 ` [PATCH v2 1/2] coresight: trbe: Fix TRBE potential sleep in atomic context Suzuki K Poulose
@ 2023-08-17  6:18 ` Anshuman Khandual
  3 siblings, 0 replies; 25+ messages in thread
From: Anshuman Khandual @ 2023-08-17  6:18 UTC (permalink / raw)
  To: Junhao He, suzuki.poulose, mike.leach, leo.yan, jonathan.cameron
  Cc: coresight, linux-arm-kernel, linux-kernel, linuxarm, yangyicong,
	prime.zeng



On 8/14/23 15:08, Junhao He wrote:
> The TRBE driver support is build as a module, we found some driver issues
> based on the patchset [1] and set CONFIG_CORESIGHT_TRBE=m.
> 1. TRBE driver potential sleep in atomic context when unregister device
> 2. Multiple free the platform data resource when rmmod coresight TRBE
> driver
> 
> [1] "coresight: trbe: Enable ACPI based devices"
> https://lore.kernel.org/all/20230808082247.383405-1-anshuman.khandual@arm.com/

I am glad that ACPI based device enablement is already helping in getting more
test coverage for the TRBE driver itself. I have just posted a new version for
the above series.

https://lore.kernel.org/all/20230817055405.249630-1-anshuman.khandual@arm.com/

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/2] coresight: trbe: Allocate platform data per device
  2023-08-16 14:10   ` [PATCH 2/2] coresight: trbe: Allocate platform data per device Suzuki K Poulose
@ 2023-08-17  6:37     ` Anshuman Khandual
  2023-08-17  9:24       ` James Clark
  2023-08-17 10:01       ` Suzuki K Poulose
  2023-08-17  8:47     ` hejunhao
  1 sibling, 2 replies; 25+ messages in thread
From: Anshuman Khandual @ 2023-08-17  6:37 UTC (permalink / raw)
  To: Suzuki K Poulose, hejunhao3
  Cc: coresight, linux-kernel, linux-arm-kernel, jonathan.cameron,
	leo.yan, mike.leach, james.clark, linuxarm, yangyicong,
	prime.zeng

Hi Suzuki,

Seems like this patch is going to conflict with the below proposed change

https://lore.kernel.org/all/20230817055405.249630-4-anshuman.khandual@arm.com/

Please let me know how should we resolve this conflict.

On 8/16/23 19:40, Suzuki K Poulose wrote:
> Coresight TRBE driver shares a single platform data (which is empty btw).
> However, with the commit 4e8fe7e5c3a5
> ("coresight: Store pointers to connections rather than an array of them")
> the coresight core would free up the pdata, resulting in multiple attempts
> to free the same pdata for TRBE instances. Fix this by allocating a pdata per
> coresight_device.
> 
> Fixes: 3fbf7f011f24 ("coresight: sink: Add TRBE driver")

The above mentioned commit i.e 4e8fe7e5c3a5 seems to be a more recent one which
has triggered this problem. But would the problem be still there without that ?
Else 'Fixes:' tag would need changing.

> Link: https://lore.kernel.org/r/20230814093813.19152-3-hejunhao3@huawei.com
> Reported-by: Junhao He <hejunhao3@huawei.com>
> Cc: Anshuman Khandual <anshuman.khandual@arm.com>
> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
> ---
>  drivers/hwtracing/coresight/coresight-trbe.c | 11 ++++-------
>  1 file changed, 4 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/hwtracing/coresight/coresight-trbe.c b/drivers/hwtracing/coresight/coresight-trbe.c
> index 025f70adee47..d3d34a833f01 100644
> --- a/drivers/hwtracing/coresight/coresight-trbe.c
> +++ b/drivers/hwtracing/coresight/coresight-trbe.c
> @@ -1255,10 +1255,13 @@ static void arm_trbe_register_coresight_cpu(struct trbe_drvdata *drvdata, int cp
>  	if (!desc.name)
>  		goto cpu_clear;
>  
> +	desc.pdata = coresight_get_platform_data(dev);
> +	if (IS_ERR(desc.pdata))
> +		goto cpu_clear;
> +
>  	desc.type = CORESIGHT_DEV_TYPE_SINK;
>  	desc.subtype.sink_subtype = CORESIGHT_DEV_SUBTYPE_SINK_PERCPU_SYSMEM;
>  	desc.ops = &arm_trbe_cs_ops;
> -	desc.pdata = dev_get_platdata(dev);
>  	desc.groups = arm_trbe_groups;
>  	desc.dev = dev;
>  	trbe_csdev = coresight_register(&desc);
> @@ -1482,7 +1485,6 @@ static void arm_trbe_remove_irq(struct trbe_drvdata *drvdata)
>  
>  static int arm_trbe_device_probe(struct platform_device *pdev)
>  {
> -	struct coresight_platform_data *pdata;
>  	struct trbe_drvdata *drvdata;
>  	struct device *dev = &pdev->dev;
>  	int ret;
> @@ -1497,12 +1499,7 @@ static int arm_trbe_device_probe(struct platform_device *pdev)
>  	if (!drvdata)
>  		return -ENOMEM;
>  
> -	pdata = coresight_get_platform_data(dev);
> -	if (IS_ERR(pdata))
> -		return PTR_ERR(pdata);
> -
>  	dev_set_drvdata(dev, drvdata);
> -	dev->platform_data = pdata;
>  	drvdata->pdev = pdev;
>  	ret = arm_trbe_probe_irq(pdev, drvdata);
>  	if (ret)

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v2 1/2] coresight: trbe: Fix TRBE potential sleep in atomic context
  2023-08-16 14:10 ` [PATCH v2 1/2] coresight: trbe: Fix TRBE potential sleep in atomic context Suzuki K Poulose
  2023-08-16 14:10   ` [PATCH 2/2] coresight: trbe: Allocate platform data per device Suzuki K Poulose
@ 2023-08-17  7:13   ` Anshuman Khandual
  2023-08-17  8:41     ` hejunhao
  2023-08-17  9:59     ` Suzuki K Poulose
  1 sibling, 2 replies; 25+ messages in thread
From: Anshuman Khandual @ 2023-08-17  7:13 UTC (permalink / raw)
  To: Suzuki K Poulose, hejunhao3
  Cc: coresight, linux-kernel, linux-arm-kernel, jonathan.cameron,
	mike.leach, linuxarm, yangyicong, prime.zeng

Hello Junhao,

On 8/16/23 19:40, Suzuki K Poulose wrote:
> From: Junhao He <hejunhao3@huawei.com>
> 
> smp_call_function_single() will allocate an IPI interrupt vector to
> the target processor and send a function call request to the interrupt
> vector. After the target processor receives the IPI interrupt, it will
> execute arm_trbe_remove_coresight_cpu() call request in the interrupt
> handler.
> 
> According to the device_unregister() stack information, if other process
> is useing the device, the down_write() may sleep, and trigger deadlocks
> or unexpected errors.
> 
>   arm_trbe_remove_coresight_cpu
>     coresight_unregister
>       device_unregister
>         device_del
>           kobject_del
>             __kobject_del
>               sysfs_remove_dir
>                 kernfs_remove
>                   down_write ---------> it may sleep

But how did you really detect this problem ? Does this show up as an warning when
you enable lockdep debug ? OR it really happened during a real workload execution
followed by TRBE module unload. Although the problem seems plausible (which needs
fixing), just wondering how did we trigger this.

> 
> Add a helper arm_trbe_disable_cpu() to disable TRBE precpu irq and reset
> per TRBE.
> Simply call arm_trbe_remove_coresight_cpu() directly without useing the
> smp_call_function_single(), which is the same as registering the TRBE
> coresight device.
> 
> Fixes: 3fbf7f011f24 ("coresight: sink: Add TRBE driver")
> Signed-off-by: Junhao He <hejunhao3@huawei.com>
> Link: https://lore.kernel.org/r/20230814093813.19152-2-hejunhao3@huawei.com
> [ Remove duplicate cpumask checks during removal ]
> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
> ---
>  drivers/hwtracing/coresight/coresight-trbe.c | 33 +++++++++++---------
>  1 file changed, 18 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/hwtracing/coresight/coresight-trbe.c b/drivers/hwtracing/coresight/coresight-trbe.c
> index 7720619909d6..025f70adee47 100644
> --- a/drivers/hwtracing/coresight/coresight-trbe.c
> +++ b/drivers/hwtracing/coresight/coresight-trbe.c
> @@ -1225,6 +1225,17 @@ static void arm_trbe_enable_cpu(void *info)
>  	enable_percpu_irq(drvdata->irq, IRQ_TYPE_NONE);
>  }
>  
> +static void arm_trbe_disable_cpu(void *info)
> +{
> +	struct trbe_drvdata *drvdata = info;
> +	struct trbe_cpudata *cpudata = this_cpu_ptr(drvdata->cpudata);
> +
> +	disable_percpu_irq(drvdata->irq);
> +	trbe_reset_local(cpudata);
> +	cpudata->drvdata = NULL;
> +}
> +
> +
>  static void arm_trbe_register_coresight_cpu(struct trbe_drvdata *drvdata, int cpu)
>  {
>  	struct trbe_cpudata *cpudata = per_cpu_ptr(drvdata->cpudata, cpu);
> @@ -1326,18 +1337,12 @@ static void arm_trbe_probe_cpu(void *info)
>  	cpumask_clear_cpu(cpu, &drvdata->supported_cpus);
>  }
>  
> -static void arm_trbe_remove_coresight_cpu(void *info)
> +static void arm_trbe_remove_coresight_cpu(struct trbe_drvdata *drvdata, int cpu)
>  {
> -	int cpu = smp_processor_id();
> -	struct trbe_drvdata *drvdata = info;
> -	struct trbe_cpudata *cpudata = per_cpu_ptr(drvdata->cpudata, cpu);
>  	struct coresight_device *trbe_csdev = coresight_get_percpu_sink(cpu);
>  
> -	disable_percpu_irq(drvdata->irq);
> -	trbe_reset_local(cpudata);
>  	if (trbe_csdev) {
>  		coresight_unregister(trbe_csdev);
> -		cpudata->drvdata = NULL;
>  		coresight_set_percpu_sink(cpu, NULL);
>  	}
>  }
> @@ -1366,8 +1371,10 @@ static int arm_trbe_remove_coresight(struct trbe_drvdata *drvdata)
>  {
>  	int cpu;
>  
> -	for_each_cpu(cpu, &drvdata->supported_cpus)
> -		smp_call_function_single(cpu, arm_trbe_remove_coresight_cpu, drvdata, 1);
> +	for_each_cpu(cpu, &drvdata->supported_cpus) {
> +		smp_call_function_single(cpu, arm_trbe_disable_cpu, drvdata, 1);
> +		arm_trbe_remove_coresight_cpu(drvdata, cpu);
> +	}
>  	free_percpu(drvdata->cpudata);
>  	return 0;
>  }
> @@ -1406,12 +1413,8 @@ static int arm_trbe_cpu_teardown(unsigned int cpu, struct hlist_node *node)
>  {
>  	struct trbe_drvdata *drvdata = hlist_entry_safe(node, struct trbe_drvdata, hotplug_node);
>  
> -	if (cpumask_test_cpu(cpu, &drvdata->supported_cpus)) {
> -		struct trbe_cpudata *cpudata = per_cpu_ptr(drvdata->cpudata, cpu);
> -
> -		disable_percpu_irq(drvdata->irq);
> -		trbe_reset_local(cpudata);
> -	}
> +	if (cpumask_test_cpu(cpu, &drvdata->supported_cpus))
> +		arm_trbe_disable_cpu(drvdata);

This code hunk seems unrelated to the context here other than just finding another use case
for arm_trbe_disable_cpu(). The problem is - arm_trbe_disable_cpu() resets cpudata->drvdata
which might not get re-initialized back in arm_trbe_cpu_startup(), as there will still be a
per cpu sink associated as confirmed with coresight_get_percpu_sink(). I guess it might be
better to drop this change and just keep everything limited to SMP IPI callback reworking in
arm_trbe_remove_coresight().

>  	return 0;
>  }
>  

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v2 1/2] coresight: trbe: Fix TRBE potential sleep in atomic context
  2023-08-17  7:13   ` [PATCH v2 1/2] coresight: trbe: Fix TRBE potential sleep in atomic context Anshuman Khandual
@ 2023-08-17  8:41     ` hejunhao
  2023-08-17  9:57       ` James Clark
  2023-08-17  9:59     ` Suzuki K Poulose
  1 sibling, 1 reply; 25+ messages in thread
From: hejunhao @ 2023-08-17  8:41 UTC (permalink / raw)
  To: Anshuman Khandual, Suzuki K Poulose
  Cc: coresight, linux-kernel, linux-arm-kernel, jonathan.cameron,
	mike.leach, linuxarm, yangyicong, prime.zeng

Hi Anshuman Khandual,


On 2023/8/17 15:13, Anshuman Khandual wrote:
> Hello Junhao,
>
> On 8/16/23 19:40, Suzuki K Poulose wrote:
>> From: Junhao He <hejunhao3@huawei.com>
>>
>> smp_call_function_single() will allocate an IPI interrupt vector to
>> the target processor and send a function call request to the interrupt
>> vector. After the target processor receives the IPI interrupt, it will
>> execute arm_trbe_remove_coresight_cpu() call request in the interrupt
>> handler.
>>
>> According to the device_unregister() stack information, if other process
>> is useing the device, the down_write() may sleep, and trigger deadlocks
>> or unexpected errors.
>>
>>    arm_trbe_remove_coresight_cpu
>>      coresight_unregister
>>        device_unregister
>>          device_del
>>            kobject_del
>>              __kobject_del
>>                sysfs_remove_dir
>>                  kernfs_remove
>>                    down_write ---------> it may sleep
> But how did you really detect this problem ? Does this show up as an warning when
> you enable lockdep debug ? OR it really happened during a real workload execution
> followed by TRBE module unload. Although the problem seems plausible (which needs
> fixing), just wondering how did we trigger this.

Yes, it really happened during a real workload.

If the TRBE driver is loaded and unloaded cyclically. the test script 
following:

   for ((i=0;i<99999;i++))
   do
   insmod coresight-trbe.ko;
   rmmod coresight-trbe.ko;
   echo "loop $i";
   done

The kernel will report a panic.

>> Add a helper arm_trbe_disable_cpu() to disable TRBE precpu irq and reset
>> per TRBE.
>> Simply call arm_trbe_remove_coresight_cpu() directly without useing the
>> smp_call_function_single(), which is the same as registering the TRBE
>> coresight device.
>>
>> Fixes: 3fbf7f011f24 ("coresight: sink: Add TRBE driver")
>> Signed-off-by: Junhao He <hejunhao3@huawei.com>
>> Link: https://lore.kernel.org/r/20230814093813.19152-2-hejunhao3@huawei.com
>> [ Remove duplicate cpumask checks during removal ]
>> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
>> ---
>>   drivers/hwtracing/coresight/coresight-trbe.c | 33 +++++++++++---------
>>   1 file changed, 18 insertions(+), 15 deletions(-)
>>
>> diff --git a/drivers/hwtracing/coresight/coresight-trbe.c b/drivers/hwtracing/coresight/coresight-trbe.c
>> index 7720619909d6..025f70adee47 100644
>> --- a/drivers/hwtracing/coresight/coresight-trbe.c
>> +++ b/drivers/hwtracing/coresight/coresight-trbe.c
>> @@ -1225,6 +1225,17 @@ static void arm_trbe_enable_cpu(void *info)
>>   	enable_percpu_irq(drvdata->irq, IRQ_TYPE_NONE);
>>   }
>>   
>> +static void arm_trbe_disable_cpu(void *info)
>> +{
>> +	struct trbe_drvdata *drvdata = info;
>> +	struct trbe_cpudata *cpudata = this_cpu_ptr(drvdata->cpudata);
>> +
>> +	disable_percpu_irq(drvdata->irq);
>> +	trbe_reset_local(cpudata);
>> +	cpudata->drvdata = NULL;
>> +}
>> +
>> +
>>   static void arm_trbe_register_coresight_cpu(struct trbe_drvdata *drvdata, int cpu)
>>   {
>>   	struct trbe_cpudata *cpudata = per_cpu_ptr(drvdata->cpudata, cpu);
>> @@ -1326,18 +1337,12 @@ static void arm_trbe_probe_cpu(void *info)
>>   	cpumask_clear_cpu(cpu, &drvdata->supported_cpus);
>>   }
>>   
>> -static void arm_trbe_remove_coresight_cpu(void *info)
>> +static void arm_trbe_remove_coresight_cpu(struct trbe_drvdata *drvdata, int cpu)
>>   {
>> -	int cpu = smp_processor_id();
>> -	struct trbe_drvdata *drvdata = info;
>> -	struct trbe_cpudata *cpudata = per_cpu_ptr(drvdata->cpudata, cpu);
>>   	struct coresight_device *trbe_csdev = coresight_get_percpu_sink(cpu);
>>   
>> -	disable_percpu_irq(drvdata->irq);
>> -	trbe_reset_local(cpudata);
>>   	if (trbe_csdev) {
>>   		coresight_unregister(trbe_csdev);
>> -		cpudata->drvdata = NULL;
>>   		coresight_set_percpu_sink(cpu, NULL);
>>   	}
>>   }
>> @@ -1366,8 +1371,10 @@ static int arm_trbe_remove_coresight(struct trbe_drvdata *drvdata)
>>   {
>>   	int cpu;
>>   
>> -	for_each_cpu(cpu, &drvdata->supported_cpus)
>> -		smp_call_function_single(cpu, arm_trbe_remove_coresight_cpu, drvdata, 1);
>> +	for_each_cpu(cpu, &drvdata->supported_cpus) {
>> +		smp_call_function_single(cpu, arm_trbe_disable_cpu, drvdata, 1);
>> +		arm_trbe_remove_coresight_cpu(drvdata, cpu);
>> +	}
>>   	free_percpu(drvdata->cpudata);
>>   	return 0;
>>   }
>> @@ -1406,12 +1413,8 @@ static int arm_trbe_cpu_teardown(unsigned int cpu, struct hlist_node *node)
>>   {
>>   	struct trbe_drvdata *drvdata = hlist_entry_safe(node, struct trbe_drvdata, hotplug_node);
>>   
>> -	if (cpumask_test_cpu(cpu, &drvdata->supported_cpus)) {
>> -		struct trbe_cpudata *cpudata = per_cpu_ptr(drvdata->cpudata, cpu);
>> -
>> -		disable_percpu_irq(drvdata->irq);
>> -		trbe_reset_local(cpudata);
>> -	}
>> +	if (cpumask_test_cpu(cpu, &drvdata->supported_cpus))
>> +		arm_trbe_disable_cpu(drvdata);
> This code hunk seems unrelated to the context here other than just finding another use case
> for arm_trbe_disable_cpu(). The problem is - arm_trbe_disable_cpu() resets cpudata->drvdata
> which might not get re-initialized back in arm_trbe_cpu_startup(), as there will still be a
> per cpu sink associated as confirmed with coresight_get_percpu_sink(). I guess it might be
> better to drop this change and just keep everything limited to SMP IPI callback reworking in
> arm_trbe_remove_coresight().

OK, will fix it. The change is just to simplify the code of cpu_teardown.
Maybe we can consider whether we need to set "cpudata->drvdata = NULL"
in arm_trbe_disable_cpu()?  If it's not necessary, This can be kept.
Then drop the release cpudata->drvdata from arm_trbe_disable_cpu().

Best regards,
Junhao.

>>   	return 0;
>>   }
>>   
> .
>


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/2] coresight: trbe: Allocate platform data per device
  2023-08-16 14:10   ` [PATCH 2/2] coresight: trbe: Allocate platform data per device Suzuki K Poulose
  2023-08-17  6:37     ` Anshuman Khandual
@ 2023-08-17  8:47     ` hejunhao
  1 sibling, 0 replies; 25+ messages in thread
From: hejunhao @ 2023-08-17  8:47 UTC (permalink / raw)
  To: Suzuki K Poulose, hejunhao3
  Cc: coresight, linux-kernel, linux-arm-kernel, jonathan.cameron,
	leo.yan, mike.leach, james.clark, yangyicong, prime.zeng,
	Anshuman Khandual


On 2023/8/16 22:10, Suzuki K Poulose wrote:
> Coresight TRBE driver shares a single platform data (which is empty btw).
> However, with the commit 4e8fe7e5c3a5
> ("coresight: Store pointers to connections rather than an array of them")
> the coresight core would free up the pdata, resulting in multiple attempts
> to free the same pdata for TRBE instances. Fix this by allocating a pdata per
> coresight_device.
>
> Fixes: 3fbf7f011f24 ("coresight: sink: Add TRBE driver")
> Link: https://lore.kernel.org/r/20230814093813.19152-3-hejunhao3@huawei.com
> Reported-by: Junhao He <hejunhao3@huawei.com>
> Cc: Anshuman Khandual <anshuman.khandual@arm.com>
> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>

Test-by: Junhao He <hejunhao3@huawei.com>

> ---
>   drivers/hwtracing/coresight/coresight-trbe.c | 11 ++++-------
>   1 file changed, 4 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/hwtracing/coresight/coresight-trbe.c b/drivers/hwtracing/coresight/coresight-trbe.c
> index 025f70adee47..d3d34a833f01 100644
> --- a/drivers/hwtracing/coresight/coresight-trbe.c
> +++ b/drivers/hwtracing/coresight/coresight-trbe.c
> @@ -1255,10 +1255,13 @@ static void arm_trbe_register_coresight_cpu(struct trbe_drvdata *drvdata, int cp
>   	if (!desc.name)
>   		goto cpu_clear;
>   
> +	desc.pdata = coresight_get_platform_data(dev);
> +	if (IS_ERR(desc.pdata))
> +		goto cpu_clear;
> +
>   	desc.type = CORESIGHT_DEV_TYPE_SINK;
>   	desc.subtype.sink_subtype = CORESIGHT_DEV_SUBTYPE_SINK_PERCPU_SYSMEM;
>   	desc.ops = &arm_trbe_cs_ops;
> -	desc.pdata = dev_get_platdata(dev);
>   	desc.groups = arm_trbe_groups;
>   	desc.dev = dev;
>   	trbe_csdev = coresight_register(&desc);
> @@ -1482,7 +1485,6 @@ static void arm_trbe_remove_irq(struct trbe_drvdata *drvdata)
>   
>   static int arm_trbe_device_probe(struct platform_device *pdev)
>   {
> -	struct coresight_platform_data *pdata;
>   	struct trbe_drvdata *drvdata;
>   	struct device *dev = &pdev->dev;
>   	int ret;
> @@ -1497,12 +1499,7 @@ static int arm_trbe_device_probe(struct platform_device *pdev)
>   	if (!drvdata)
>   		return -ENOMEM;
>   
> -	pdata = coresight_get_platform_data(dev);
> -	if (IS_ERR(pdata))
> -		return PTR_ERR(pdata);
> -
>   	dev_set_drvdata(dev, drvdata);
> -	dev->platform_data = pdata;
>   	drvdata->pdev = pdev;
>   	ret = arm_trbe_probe_irq(pdev, drvdata);
>   	if (ret)


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/2] coresight: trbe: Allocate platform data per device
  2023-08-17  6:37     ` Anshuman Khandual
@ 2023-08-17  9:24       ` James Clark
  2023-08-17 10:01         ` Suzuki K Poulose
  2023-08-17 10:01       ` Suzuki K Poulose
  1 sibling, 1 reply; 25+ messages in thread
From: James Clark @ 2023-08-17  9:24 UTC (permalink / raw)
  To: Anshuman Khandual, Suzuki K Poulose, hejunhao3
  Cc: coresight, linux-kernel, linux-arm-kernel, jonathan.cameron,
	leo.yan, mike.leach, linuxarm, yangyicong, prime.zeng



On 17/08/2023 07:37, Anshuman Khandual wrote:
> Hi Suzuki,
> 
> Seems like this patch is going to conflict with the below proposed change
> 
> https://lore.kernel.org/all/20230817055405.249630-4-anshuman.khandual@arm.com/
> 
> Please let me know how should we resolve this conflict.

We could merge them both, with the fixes: one first, just to acknowledge
that there was a problem. But I suppose your one will have to be rebased
on top.

> 
> On 8/16/23 19:40, Suzuki K Poulose wrote:
>> Coresight TRBE driver shares a single platform data (which is empty btw).
>> However, with the commit 4e8fe7e5c3a5
>> ("coresight: Store pointers to connections rather than an array of them")
>> the coresight core would free up the pdata, resulting in multiple attempts
>> to free the same pdata for TRBE instances. Fix this by allocating a pdata per
>> coresight_device.
>>
>> Fixes: 3fbf7f011f24 ("coresight: sink: Add TRBE driver")
> 
> The above mentioned commit i.e 4e8fe7e5c3a5 seems to be a more recent one which
> has triggered this problem. But would the problem be still there without that ?
> Else 'Fixes:' tag would need changing.
> 

Yes I think the fixes tag should point to 4e8fe7e5c3a5.

>> Link: https://lore.kernel.org/r/20230814093813.19152-3-hejunhao3@huawei.com
>> Reported-by: Junhao He <hejunhao3@huawei.com>
>> Cc: Anshuman Khandual <anshuman.khandual@arm.com>
>> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
>> ---
>>  drivers/hwtracing/coresight/coresight-trbe.c | 11 ++++-------
>>  1 file changed, 4 insertions(+), 7 deletions(-)
>>
>> diff --git a/drivers/hwtracing/coresight/coresight-trbe.c b/drivers/hwtracing/coresight/coresight-trbe.c
>> index 025f70adee47..d3d34a833f01 100644
>> --- a/drivers/hwtracing/coresight/coresight-trbe.c
>> +++ b/drivers/hwtracing/coresight/coresight-trbe.c
>> @@ -1255,10 +1255,13 @@ static void arm_trbe_register_coresight_cpu(struct trbe_drvdata *drvdata, int cp
>>  	if (!desc.name)
>>  		goto cpu_clear;
>>  
>> +	desc.pdata = coresight_get_platform_data(dev);
>> +	if (IS_ERR(desc.pdata))
>> +		goto cpu_clear;
>> +
>>  	desc.type = CORESIGHT_DEV_TYPE_SINK;
>>  	desc.subtype.sink_subtype = CORESIGHT_DEV_SUBTYPE_SINK_PERCPU_SYSMEM;
>>  	desc.ops = &arm_trbe_cs_ops;
>> -	desc.pdata = dev_get_platdata(dev);
>>  	desc.groups = arm_trbe_groups;
>>  	desc.dev = dev;
>>  	trbe_csdev = coresight_register(&desc);
>> @@ -1482,7 +1485,6 @@ static void arm_trbe_remove_irq(struct trbe_drvdata *drvdata)
>>  
>>  static int arm_trbe_device_probe(struct platform_device *pdev)
>>  {
>> -	struct coresight_platform_data *pdata;
>>  	struct trbe_drvdata *drvdata;
>>  	struct device *dev = &pdev->dev;
>>  	int ret;
>> @@ -1497,12 +1499,7 @@ static int arm_trbe_device_probe(struct platform_device *pdev)
>>  	if (!drvdata)
>>  		return -ENOMEM;
>>  
>> -	pdata = coresight_get_platform_data(dev);
>> -	if (IS_ERR(pdata))
>> -		return PTR_ERR(pdata);
>> -
>>  	dev_set_drvdata(dev, drvdata);
>> -	dev->platform_data = pdata;
>>  	drvdata->pdev = pdev;
>>  	ret = arm_trbe_probe_irq(pdev, drvdata);
>>  	if (ret)

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v2 1/2] coresight: trbe: Fix TRBE potential sleep in atomic context
  2023-08-17  8:41     ` hejunhao
@ 2023-08-17  9:57       ` James Clark
  0 siblings, 0 replies; 25+ messages in thread
From: James Clark @ 2023-08-17  9:57 UTC (permalink / raw)
  To: hejunhao, Anshuman Khandual, Suzuki K Poulose
  Cc: coresight, linux-kernel, linux-arm-kernel, jonathan.cameron,
	mike.leach, linuxarm, yangyicong, prime.zeng



On 17/08/2023 09:41, hejunhao wrote:
> Hi Anshuman Khandual,
> 
> 
> On 2023/8/17 15:13, Anshuman Khandual wrote:
>> Hello Junhao,
>>
>> On 8/16/23 19:40, Suzuki K Poulose wrote:
>>> From: Junhao He <hejunhao3@huawei.com>
>>>
>>> smp_call_function_single() will allocate an IPI interrupt vector to
>>> the target processor and send a function call request to the interrupt
>>> vector. After the target processor receives the IPI interrupt, it will
>>> execute arm_trbe_remove_coresight_cpu() call request in the interrupt
>>> handler.
>>>
>>> According to the device_unregister() stack information, if other process
>>> is useing the device, the down_write() may sleep, and trigger deadlocks
>>> or unexpected errors.
>>>
>>>    arm_trbe_remove_coresight_cpu
>>>      coresight_unregister
>>>        device_unregister
>>>          device_del
>>>            kobject_del
>>>              __kobject_del
>>>                sysfs_remove_dir
>>>                  kernfs_remove
>>>                    down_write ---------> it may sleep
>> But how did you really detect this problem ? Does this show up as an
>> warning when
>> you enable lockdep debug ? OR it really happened during a real
>> workload execution
>> followed by TRBE module unload. Although the problem seems plausible
>> (which needs
>> fixing), just wondering how did we trigger this.
> 
> Yes, it really happened during a real workload.
> 
> If the TRBE driver is loaded and unloaded cyclically. the test script
> following:
> 
>   for ((i=0;i<99999;i++))
>   do
>   insmod coresight-trbe.ko;
>   rmmod coresight-trbe.ko;
>   echo "loop $i";
>   done
> 
> The kernel will report a panic.
> 

I wonder how easy it would be to add a kselftest to do this with all of
the Coresight modules. Because we also had a problem with bad reference
counting preventing an unload of the CTI module. Although that did
require starting a perf session, which might complicated the test.

>>> Add a helper arm_trbe_disable_cpu() to disable TRBE precpu irq and reset
>>> per TRBE.
>>> Simply call arm_trbe_remove_coresight_cpu() directly without useing the
>>> smp_call_function_single(), which is the same as registering the TRBE
>>> coresight device.
>>>
>>> Fixes: 3fbf7f011f24 ("coresight: sink: Add TRBE driver")
>>> Signed-off-by: Junhao He <hejunhao3@huawei.com>
>>> Link:
>>> https://lore.kernel.org/r/20230814093813.19152-2-hejunhao3@huawei.com
>>> [ Remove duplicate cpumask checks during removal ]
>>> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
>>> ---
>>>   drivers/hwtracing/coresight/coresight-trbe.c | 33 +++++++++++---------
>>>   1 file changed, 18 insertions(+), 15 deletions(-)
>>>
>>> diff --git a/drivers/hwtracing/coresight/coresight-trbe.c
>>> b/drivers/hwtracing/coresight/coresight-trbe.c
>>> index 7720619909d6..025f70adee47 100644
>>> --- a/drivers/hwtracing/coresight/coresight-trbe.c
>>> +++ b/drivers/hwtracing/coresight/coresight-trbe.c
>>> @@ -1225,6 +1225,17 @@ static void arm_trbe_enable_cpu(void *info)
>>>       enable_percpu_irq(drvdata->irq, IRQ_TYPE_NONE);
>>>   }
>>>   +static void arm_trbe_disable_cpu(void *info)
>>> +{
>>> +    struct trbe_drvdata *drvdata = info;
>>> +    struct trbe_cpudata *cpudata = this_cpu_ptr(drvdata->cpudata);
>>> +
>>> +    disable_percpu_irq(drvdata->irq);
>>> +    trbe_reset_local(cpudata);
>>> +    cpudata->drvdata = NULL;
>>> +}
>>> +
>>> +
>>>   static void arm_trbe_register_coresight_cpu(struct trbe_drvdata
>>> *drvdata, int cpu)
>>>   {
>>>       struct trbe_cpudata *cpudata = per_cpu_ptr(drvdata->cpudata, cpu);
>>> @@ -1326,18 +1337,12 @@ static void arm_trbe_probe_cpu(void *info)
>>>       cpumask_clear_cpu(cpu, &drvdata->supported_cpus);
>>>   }
>>>   -static void arm_trbe_remove_coresight_cpu(void *info)
>>> +static void arm_trbe_remove_coresight_cpu(struct trbe_drvdata
>>> *drvdata, int cpu)
>>>   {
>>> -    int cpu = smp_processor_id();
>>> -    struct trbe_drvdata *drvdata = info;
>>> -    struct trbe_cpudata *cpudata = per_cpu_ptr(drvdata->cpudata, cpu);
>>>       struct coresight_device *trbe_csdev =
>>> coresight_get_percpu_sink(cpu);
>>>   -    disable_percpu_irq(drvdata->irq);
>>> -    trbe_reset_local(cpudata);
>>>       if (trbe_csdev) {
>>>           coresight_unregister(trbe_csdev);
>>> -        cpudata->drvdata = NULL;
>>>           coresight_set_percpu_sink(cpu, NULL);
>>>       }
>>>   }
>>> @@ -1366,8 +1371,10 @@ static int arm_trbe_remove_coresight(struct
>>> trbe_drvdata *drvdata)
>>>   {
>>>       int cpu;
>>>   -    for_each_cpu(cpu, &drvdata->supported_cpus)
>>> -        smp_call_function_single(cpu, arm_trbe_remove_coresight_cpu,
>>> drvdata, 1);
>>> +    for_each_cpu(cpu, &drvdata->supported_cpus) {
>>> +        smp_call_function_single(cpu, arm_trbe_disable_cpu, drvdata,
>>> 1);
>>> +        arm_trbe_remove_coresight_cpu(drvdata, cpu);
>>> +    }
>>>       free_percpu(drvdata->cpudata);
>>>       return 0;
>>>   }
>>> @@ -1406,12 +1413,8 @@ static int arm_trbe_cpu_teardown(unsigned int
>>> cpu, struct hlist_node *node)
>>>   {
>>>       struct trbe_drvdata *drvdata = hlist_entry_safe(node, struct
>>> trbe_drvdata, hotplug_node);
>>>   -    if (cpumask_test_cpu(cpu, &drvdata->supported_cpus)) {
>>> -        struct trbe_cpudata *cpudata = per_cpu_ptr(drvdata->cpudata,
>>> cpu);
>>> -
>>> -        disable_percpu_irq(drvdata->irq);
>>> -        trbe_reset_local(cpudata);
>>> -    }
>>> +    if (cpumask_test_cpu(cpu, &drvdata->supported_cpus))
>>> +        arm_trbe_disable_cpu(drvdata);
>> This code hunk seems unrelated to the context here other than just
>> finding another use case
>> for arm_trbe_disable_cpu(). The problem is - arm_trbe_disable_cpu()
>> resets cpudata->drvdata
>> which might not get re-initialized back in arm_trbe_cpu_startup(), as
>> there will still be a
>> per cpu sink associated as confirmed with coresight_get_percpu_sink().
>> I guess it might be
>> better to drop this change and just keep everything limited to SMP IPI
>> callback reworking in
>> arm_trbe_remove_coresight().
> 
> OK, will fix it. The change is just to simplify the code of cpu_teardown.
> Maybe we can consider whether we need to set "cpudata->drvdata = NULL"
> in arm_trbe_disable_cpu()?  If it's not necessary, This can be kept.
> Then drop the release cpudata->drvdata from arm_trbe_disable_cpu().
> 
> Best regards,
> Junhao.
> 
>>>       return 0;
>>>   }
>>>   
>> .
>>
> 
> _______________________________________________
> CoreSight mailing list -- coresight@lists.linaro.org
> To unsubscribe send an email to coresight-leave@lists.linaro.org

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v2 1/2] coresight: trbe: Fix TRBE potential sleep in atomic context
  2023-08-17  7:13   ` [PATCH v2 1/2] coresight: trbe: Fix TRBE potential sleep in atomic context Anshuman Khandual
  2023-08-17  8:41     ` hejunhao
@ 2023-08-17  9:59     ` Suzuki K Poulose
  1 sibling, 0 replies; 25+ messages in thread
From: Suzuki K Poulose @ 2023-08-17  9:59 UTC (permalink / raw)
  To: Anshuman Khandual, hejunhao3
  Cc: coresight, linux-kernel, linux-arm-kernel, jonathan.cameron,
	mike.leach, linuxarm, yangyicong, prime.zeng

On 17/08/2023 08:13, Anshuman Khandual wrote:
> Hello Junhao,
> 
> On 8/16/23 19:40, Suzuki K Poulose wrote:
>> From: Junhao He <hejunhao3@huawei.com>
>>
>> smp_call_function_single() will allocate an IPI interrupt vector to
>> the target processor and send a function call request to the interrupt
>> vector. After the target processor receives the IPI interrupt, it will
>> execute arm_trbe_remove_coresight_cpu() call request in the interrupt
>> handler.
>>
>> According to the device_unregister() stack information, if other process
>> is useing the device, the down_write() may sleep, and trigger deadlocks
>> or unexpected errors.
>>
>>    arm_trbe_remove_coresight_cpu
>>      coresight_unregister
>>        device_unregister
>>          device_del
>>            kobject_del
>>              __kobject_del
>>                sysfs_remove_dir
>>                  kernfs_remove
>>                    down_write ---------> it may sleep
> 
> But how did you really detect this problem ? Does this show up as an warning when
> you enable lockdep debug ? OR it really happened during a real workload execution
> followed by TRBE module unload. Although the problem seems plausible (which needs
> fixing), just wondering how did we trigger this.

I was able to trigger this with just :

modprobe coresight-trbe; modprobe -r coresight-trbe;

With all the bells and whistles enabled in the kernel.

Suzuki

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/2] coresight: trbe: Allocate platform data per device
  2023-08-17  9:24       ` James Clark
@ 2023-08-17 10:01         ` Suzuki K Poulose
  2023-08-17 10:16           ` Anshuman Khandual
  0 siblings, 1 reply; 25+ messages in thread
From: Suzuki K Poulose @ 2023-08-17 10:01 UTC (permalink / raw)
  To: James Clark, Anshuman Khandual, hejunhao3
  Cc: coresight, linux-kernel, linux-arm-kernel, jonathan.cameron,
	leo.yan, mike.leach, linuxarm, yangyicong, prime.zeng

On 17/08/2023 10:24, James Clark wrote:
> 
> 
> On 17/08/2023 07:37, Anshuman Khandual wrote:
>> Hi Suzuki,
>>
>> Seems like this patch is going to conflict with the below proposed change
>>
>> https://lore.kernel.org/all/20230817055405.249630-4-anshuman.khandual@arm.com/
>>
>> Please let me know how should we resolve this conflict.
> 
> We could merge them both, with the fixes: one first, just to acknowledge
> that there was a problem. But I suppose your one will have to be rebased
> on top.
> 
>>
>> On 8/16/23 19:40, Suzuki K Poulose wrote:
>>> Coresight TRBE driver shares a single platform data (which is empty btw).
>>> However, with the commit 4e8fe7e5c3a5
>>> ("coresight: Store pointers to connections rather than an array of them")
>>> the coresight core would free up the pdata, resulting in multiple attempts
>>> to free the same pdata for TRBE instances. Fix this by allocating a pdata per
>>> coresight_device.
>>>
>>> Fixes: 3fbf7f011f24 ("coresight: sink: Add TRBE driver")
>>
>> The above mentioned commit i.e 4e8fe7e5c3a5 seems to be a more recent one which
>> has triggered this problem. But would the problem be still there without that ?
>> Else 'Fixes:' tag would need changing.
>>
> 
> Yes I think the fixes tag should point to 4e8fe7e5c3a5.

Agreed, I will change the fixes tag and push this.

Suzuki



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/2] coresight: trbe: Allocate platform data per device
  2023-08-17  6:37     ` Anshuman Khandual
  2023-08-17  9:24       ` James Clark
@ 2023-08-17 10:01       ` Suzuki K Poulose
  1 sibling, 0 replies; 25+ messages in thread
From: Suzuki K Poulose @ 2023-08-17 10:01 UTC (permalink / raw)
  To: Anshuman Khandual, hejunhao3
  Cc: coresight, linux-kernel, linux-arm-kernel, jonathan.cameron,
	leo.yan, mike.leach, james.clark, linuxarm, yangyicong,
	prime.zeng

On 17/08/2023 07:37, Anshuman Khandual wrote:
> Hi Suzuki,
> 
> Seems like this patch is going to conflict with the below proposed change
> 
> https://lore.kernel.org/all/20230817055405.249630-4-anshuman.khandual@arm.com/
> 
> Please let me know how should we resolve this conflict.

Please rebase your change on top of this one.

Suzuki



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/2] coresight: trbe: Allocate platform data per device
  2023-08-17 10:01         ` Suzuki K Poulose
@ 2023-08-17 10:16           ` Anshuman Khandual
  2023-08-17 10:33             ` Suzuki K Poulose
  0 siblings, 1 reply; 25+ messages in thread
From: Anshuman Khandual @ 2023-08-17 10:16 UTC (permalink / raw)
  To: Suzuki K Poulose, James Clark, hejunhao3
  Cc: coresight, linux-kernel, linux-arm-kernel, jonathan.cameron,
	leo.yan, mike.leach, linuxarm, yangyicong, prime.zeng



On 8/17/23 15:31, Suzuki K Poulose wrote:
> On 17/08/2023 10:24, James Clark wrote:
>>
>>
>> On 17/08/2023 07:37, Anshuman Khandual wrote:
>>> Hi Suzuki,
>>>
>>> Seems like this patch is going to conflict with the below proposed change
>>>
>>> https://lore.kernel.org/all/20230817055405.249630-4-anshuman.khandual@arm.com/
>>>
>>> Please let me know how should we resolve this conflict.
>>
>> We could merge them both, with the fixes: one first, just to acknowledge
>> that there was a problem. But I suppose your one will have to be rebased
>> on top.
>>
>>>
>>> On 8/16/23 19:40, Suzuki K Poulose wrote:
>>>> Coresight TRBE driver shares a single platform data (which is empty btw).
>>>> However, with the commit 4e8fe7e5c3a5
>>>> ("coresight: Store pointers to connections rather than an array of them")
>>>> the coresight core would free up the pdata, resulting in multiple attempts
>>>> to free the same pdata for TRBE instances. Fix this by allocating a pdata per
>>>> coresight_device.
>>>>
>>>> Fixes: 3fbf7f011f24 ("coresight: sink: Add TRBE driver")
>>>
>>> The above mentioned commit i.e 4e8fe7e5c3a5 seems to be a more recent one which
>>> has triggered this problem. But would the problem be still there without that ?
>>> Else 'Fixes:' tag would need changing.
>>>
>>
>> Yes I think the fixes tag should point to 4e8fe7e5c3a5.
> 
> Agreed, I will change the fixes tag and push this.

In the first patch, the last hunk might not be required to fix the
IPI problem and in fact might be bit problematic as well. Besides,
could you please hold off pushing this change into coresight tree
for some time ?

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/2] coresight: trbe: Allocate platform data per device
  2023-08-17 10:16           ` Anshuman Khandual
@ 2023-08-17 10:33             ` Suzuki K Poulose
  0 siblings, 0 replies; 25+ messages in thread
From: Suzuki K Poulose @ 2023-08-17 10:33 UTC (permalink / raw)
  To: Anshuman Khandual, James Clark, hejunhao3
  Cc: coresight, linux-kernel, linux-arm-kernel, jonathan.cameron,
	leo.yan, mike.leach, linuxarm, yangyicong, prime.zeng

On 17/08/2023 11:16, Anshuman Khandual wrote:
> 
> 
> On 8/17/23 15:31, Suzuki K Poulose wrote:
>> On 17/08/2023 10:24, James Clark wrote:
>>>
>>>
>>> On 17/08/2023 07:37, Anshuman Khandual wrote:
>>>> Hi Suzuki,
>>>>
>>>> Seems like this patch is going to conflict with the below proposed change
>>>>
>>>> https://lore.kernel.org/all/20230817055405.249630-4-anshuman.khandual@arm.com/
>>>>
>>>> Please let me know how should we resolve this conflict.
>>>
>>> We could merge them both, with the fixes: one first, just to acknowledge
>>> that there was a problem. But I suppose your one will have to be rebased
>>> on top.
>>>
>>>>
>>>> On 8/16/23 19:40, Suzuki K Poulose wrote:
>>>>> Coresight TRBE driver shares a single platform data (which is empty btw).
>>>>> However, with the commit 4e8fe7e5c3a5
>>>>> ("coresight: Store pointers to connections rather than an array of them")
>>>>> the coresight core would free up the pdata, resulting in multiple attempts
>>>>> to free the same pdata for TRBE instances. Fix this by allocating a pdata per
>>>>> coresight_device.
>>>>>
>>>>> Fixes: 3fbf7f011f24 ("coresight: sink: Add TRBE driver")
>>>>
>>>> The above mentioned commit i.e 4e8fe7e5c3a5 seems to be a more recent one which
>>>> has triggered this problem. But would the problem be still there without that ?
>>>> Else 'Fixes:' tag would need changing.
>>>>
>>>
>>> Yes I think the fixes tag should point to 4e8fe7e5c3a5.
>>
>> Agreed, I will change the fixes tag and push this.
> 
> In the first patch, the last hunk might not be required to fix the
> IPI problem and in fact might be bit problematic as well. Besides,

Please could you comment your concerns on the patch ?

Suzuki

> could you please hold off pushing this change into coresight tree
> for some time ?


^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2023-08-17 10:34 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-08-14  9:38 [PATCH 0/2] Fix some issues with TRBE building as a module Junhao He
2023-08-14  9:38 ` [PATCH 1/2] coresight: trbe: Fix TRBE potential sleep in atomic context Junhao He
2023-08-14 10:34   ` Suzuki K Poulose
2023-08-14 13:32     ` hejunhao
2023-08-14 22:57       ` Suzuki K Poulose
2023-08-15 11:40         ` hejunhao
2023-08-14  9:38 ` [PATCH 2/2] coresight: core: Fix multiple free TRBE platform data resource Junhao He
2023-08-14 22:47   ` Suzuki K Poulose
2023-08-15 11:38     ` hejunhao
2023-08-16 13:13       ` Suzuki K Poulose
2023-08-16 13:58         ` Suzuki K Poulose
2023-08-16 14:10 ` [PATCH v2 1/2] coresight: trbe: Fix TRBE potential sleep in atomic context Suzuki K Poulose
2023-08-16 14:10   ` [PATCH 2/2] coresight: trbe: Allocate platform data per device Suzuki K Poulose
2023-08-17  6:37     ` Anshuman Khandual
2023-08-17  9:24       ` James Clark
2023-08-17 10:01         ` Suzuki K Poulose
2023-08-17 10:16           ` Anshuman Khandual
2023-08-17 10:33             ` Suzuki K Poulose
2023-08-17 10:01       ` Suzuki K Poulose
2023-08-17  8:47     ` hejunhao
2023-08-17  7:13   ` [PATCH v2 1/2] coresight: trbe: Fix TRBE potential sleep in atomic context Anshuman Khandual
2023-08-17  8:41     ` hejunhao
2023-08-17  9:57       ` James Clark
2023-08-17  9:59     ` Suzuki K Poulose
2023-08-17  6:18 ` [PATCH 0/2] Fix some issues with TRBE building as a module Anshuman Khandual

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).