linux-fpga.vger.kernel.org archive mirror
* [RFC] fpga: dfl: fme: Fix cpu hotplug code
@ 2021-06-28  7:15 Kajol Jain
  2021-06-28  9:01 ` Xu Yilun
  2021-06-28 18:40 ` Moritz Fischer
  0 siblings, 2 replies; 5+ messages in thread
From: Kajol Jain @ 2021-06-28  7:15 UTC (permalink / raw)
  To: will, hao.wu, mark.rutland
  Cc: trix, yilun.xu, luwei.kang, mdf, linux-fpga, maddy, atrajeev,
	kjain, linux-kernel, linuxppc-dev, rnsastry

Commit 724142f8c42a ("fpga: dfl: fme: add performance
reporting support") added performance reporting support
for FPGA management engine via perf.

It also added cpu hotplug support, but it didn't add a
pmu migration call to the cpu offline function.
This can cause an issue in case the current designated
cpu being used to collect fme pmu data goes offline,
as the current code does not migrate the fme pmu to the
new target cpu. Because of that, perf will still try to
fetch data from the offline cpu and hence we will not
get counter data.

This patch fixes the issue by adding a perf_pmu_migrate_context
call in the fme_perf_offline_cpu function.

Fixes: 724142f8c42a ("fpga: dfl: fme: add performance reporting support")
Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
---
 drivers/fpga/dfl-fme-perf.c | 4 ++++
 1 file changed, 4 insertions(+)

---
- This fix patch is not tested (as I don't have the required environment).
  But the issue mentioned in the commit msg can be re-created by starting any
  fme_perf event and, while it's still running, offlining the current designated
  cpu pointed to by the cpumask file. Since the current code doesn't migrate the
  pmu, perf will try to get counts from the offlined cpu and hence we will
  not get event data.
---
diff --git a/drivers/fpga/dfl-fme-perf.c b/drivers/fpga/dfl-fme-perf.c
index 4299145ef347..b9a54583e505 100644
--- a/drivers/fpga/dfl-fme-perf.c
+++ b/drivers/fpga/dfl-fme-perf.c
@@ -953,6 +953,10 @@ static int fme_perf_offline_cpu(unsigned int cpu, struct hlist_node *node)
 		return 0;
 
 	priv->cpu = target;
+
+	/* Migrate fme_perf pmu events to the new target cpu */
+	perf_pmu_migrate_context(&priv->pmu, cpu, target);
+
 	return 0;
 }
 
-- 
2.31.1
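
For readers following the thread, the effect of the added call can be modelled in plain userspace C. This is only a sketch under stated assumptions, not the driver code: `struct model_sys`, `any_online_but()`, `migrate_events()` and `model_offline_cpu()` are hypothetical names invented for illustration, `migrate_events()` merely stands in for perf_pmu_migrate_context(), and the part of the callback not shown in the diff (choosing the target cpu) is assumed to pick any other online cpu.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4
#define NO_CPU  (-1)

/* Hypothetical userspace model; none of these names exist in the kernel. */
struct model_sys {
	bool online[NR_CPUS];     /* which CPUs are currently online */
	int  designated_cpu;      /* CPU that reads the PMU counters (priv->cpu) */
	int  events_on[NR_CPUS];  /* active events bound to each CPU's context */
};

/* Return any online CPU other than 'cpu', or NO_CPU if there is none. */
static int any_online_but(const struct model_sys *s, int cpu)
{
	for (int i = 0; i < NR_CPUS; i++)
		if (i != cpu && s->online[i])
			return i;
	return NO_CPU;
}

/* Stand-in for perf_pmu_migrate_context(): move all events from src to dst. */
static void migrate_events(struct model_sys *s, int src, int dst)
{
	s->events_on[dst] += s->events_on[src];
	s->events_on[src] = 0;
}

/*
 * Model of the offline callback. With migrate == false it mirrors the
 * pre-patch behaviour (only the designated CPU is updated); with
 * migrate == true it also moves the events, as the patch does.
 */
static int model_offline_cpu(struct model_sys *s, int cpu, bool migrate)
{
	int target;

	assert(cpu >= 0 && cpu < NR_CPUS);
	s->online[cpu] = false;

	/* Nothing to do unless the designated CPU is the one going away. */
	if (cpu != s->designated_cpu)
		return 0;

	target = any_online_but(s, cpu);
	if (target == NO_CPU)
		return 0;

	s->designated_cpu = target;
	if (migrate)
		migrate_events(s, cpu, target);

	return 0;
}
```

Calling model_offline_cpu() on the designated cpu with migrate set to false leaves the events counted against the offlined cpu even though designated_cpu has moved on, which is the situation described in the commit message; with migrate set to true the events follow the new designated cpu.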



* Re: [RFC] fpga: dfl: fme: Fix cpu hotplug code
  2021-06-28  7:15 [RFC] fpga: dfl: fme: Fix cpu hotplug code Kajol Jain
@ 2021-06-28  9:01 ` Xu Yilun
  2021-06-28 10:04   ` kajoljain
  2021-06-28 18:40 ` Moritz Fischer
  1 sibling, 1 reply; 5+ messages in thread
From: Xu Yilun @ 2021-06-28  9:01 UTC (permalink / raw)
  To: Kajol Jain
  Cc: will, hao.wu, mark.rutland, trix, luwei.kang, mdf, linux-fpga,
	maddy, atrajeev, linux-kernel, linuxppc-dev, rnsastry

It's a good fix, you can drop the RFC in commit title. :)

The title could be more specific, like:

    fpga: dfl: fme: Fix cpu hotplug issue in performance reporting

So we know at first glance that it is for the performance reporting feature.

On Mon, Jun 28, 2021 at 12:45:46PM +0530, Kajol Jain wrote:

> Commit 724142f8c42a ("fpga: dfl: fme: add performance
> reporting support") added performance reporting support
> for FPGA management engine via perf.

You may drop this section; it is already indicated in the Fixes tag.

> 
> It also added cpu hotplug feature but it didn't add

The performance reporting driver added cpu hotplug ...

> pmu migration call in cpu offline function.
> This can cause an issue in case the current designated
> cpu being used to collect fme pmu data goes offline,
> as the current code does not migrate the fme pmu to the
> new target cpu. Because of that, perf will still try to
> fetch data from the offline cpu and hence we will not
> get counter data.
> 
> This patch fixes the issue by adding a perf_pmu_migrate_context
> call in the fme_perf_offline_cpu function.
> 
> Fixes: 724142f8c42a ("fpga: dfl: fme: add performance reporting support")
> Signed-off-by: Kajol Jain <kjain@linux.ibm.com>

Tested-by: Xu Yilun <yilun.xu@intel.com>

Thanks,
Yilun



* Re: [RFC] fpga: dfl: fme: Fix cpu hotplug code
  2021-06-28  9:01 ` Xu Yilun
@ 2021-06-28 10:04   ` kajoljain
  0 siblings, 0 replies; 5+ messages in thread
From: kajoljain @ 2021-06-28 10:04 UTC (permalink / raw)
  To: Xu Yilun
  Cc: will, hao.wu, mark.rutland, trix, luwei.kang, mdf, linux-fpga,
	maddy, atrajeev, linux-kernel, linuxppc-dev, rnsastry



On 6/28/21 2:31 PM, Xu Yilun wrote:
> It's a good fix, you can drop the RFC in commit title. :)
> 
> The title could be more specific, like:
> 
>     fpga: dfl: fme: Fix cpu hotplug issue in performance reporting
> 
> So we know it is for performance reporting feature at first glance.
> 
> On Mon, Jun 28, 2021 at 12:45:46PM +0530, Kajol Jain wrote:
> 
>> Commit 724142f8c42a ("fpga: dfl: fme: add performance
>> reporting support") added performance reporting support
>> for FPGA management engine via perf.
> 
> May drop this section, it is indicated in the Fixes tag.
> 

Hi Yilun,
    Thanks for testing the patch. I will make the mentioned changes and send
a new patch.

Thanks,
Kajol Jain


* Re: [RFC] fpga: dfl: fme: Fix cpu hotplug code
  2021-06-28  7:15 [RFC] fpga: dfl: fme: Fix cpu hotplug code Kajol Jain
  2021-06-28  9:01 ` Xu Yilun
@ 2021-06-28 18:40 ` Moritz Fischer
  2021-06-29  7:14   ` kajoljain
  1 sibling, 1 reply; 5+ messages in thread
From: Moritz Fischer @ 2021-06-28 18:40 UTC (permalink / raw)
  To: Kajol Jain
  Cc: will, hao.wu, mark.rutland, trix, yilun.xu, luwei.kang, mdf,
	linux-fpga, maddy, atrajeev, linux-kernel, linuxppc-dev,
	rnsastry

On Mon, Jun 28, 2021 at 12:45:46PM +0530, Kajol Jain wrote:
> Commit 724142f8c42a ("fpga: dfl: fme: add performance
> reporting support") added performance reporting support
> for FPGA management engine via perf.
> 
> It also added cpu hotplug support, but it didn't add a
> pmu migration call to the cpu offline function.
> This can cause an issue in case the current designated
> cpu being used to collect fme pmu data goes offline,
> as the current code does not migrate the fme pmu to the
> new target cpu. Because of that, perf will still try to
> fetch data from the offline cpu and hence we will not
> get counter data.
> 
> This patch fixes the issue by adding a perf_pmu_migrate_context
> call in the fme_perf_offline_cpu function.
> 
> Fixes: 724142f8c42a ("fpga: dfl: fme: add performance reporting support")
> Signed-off-by: Kajol Jain <kjain@linux.ibm.com>

You might want to Cc: stable@vger.kernel.org if it fixes an actual bug.
- Moritz


* Re: [RFC] fpga: dfl: fme: Fix cpu hotplug code
  2021-06-28 18:40 ` Moritz Fischer
@ 2021-06-29  7:14   ` kajoljain
  0 siblings, 0 replies; 5+ messages in thread
From: kajoljain @ 2021-06-29  7:14 UTC (permalink / raw)
  To: Moritz Fischer
  Cc: will, hao.wu, mark.rutland, trix, yilun.xu, luwei.kang,
	linux-fpga, maddy, atrajeev, linux-kernel, linuxppc-dev,
	rnsastry



On 6/29/21 12:10 AM, Moritz Fischer wrote:
> On Mon, Jun 28, 2021 at 12:45:46PM +0530, Kajol Jain wrote:
>> Commit 724142f8c42a ("fpga: dfl: fme: add performance
>> reporting support") added performance reporting support
>> for FPGA management engine via perf.
>>
>> It also added cpu hotplug support, but it didn't add a
>> pmu migration call to the cpu offline function.
>> This can cause an issue in case the current designated
>> cpu being used to collect fme pmu data goes offline,
>> as the current code does not migrate the fme pmu to the
>> new target cpu. Because of that, perf will still try to
>> fetch data from the offline cpu and hence we will not
>> get counter data.
>>
>> This patch fixes the issue by adding a perf_pmu_migrate_context
>> call in the fme_perf_offline_cpu function.
>>
>> Fixes: 724142f8c42a ("fpga: dfl: fme: add performance reporting support")
>> Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
> 
> You might want to Cc: stable@vger.kernel.org if it fixes an actual bug.

Hi Moritz,
  I already sent the patch out without the RFC tag yesterday.
Link to the patch: https://lkml.org/lkml/2021/6/28/275

I will Cc stable@vger.kernel.org there as you suggested.

Thanks,
Kajol Jain


