linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2] perf/core: Add support for PMUs that can be read from more than 1 CPU
@ 2018-03-03  1:14 Saravana Kannan
  2018-03-05 12:17 ` Mark Rutland
  0 siblings, 1 reply; 5+ messages in thread
From: Saravana Kannan @ 2018-03-03  1:14 UTC (permalink / raw)
  To: linux-arm-kernel

Some PMUs events can be read from more than the one CPU. So allow the
PMU driver to mark events as such. For these events, we don't need to
reject reads or make smp calls to the event's CPU (and cause
unnecessary overhead and wake ups).

When a PMU driver marks an event as such, care must be taken by the
driver to make sure they can handle the event being read/updated from
more than 1 CPU at the same time (Eg: due to an IRQ indicating event
counter overflow and another thread trying to read the latest values).

Good examples of such events would be events from caches shared across
CPUs.

Signed-off-by: Saravana Kannan <skannan@codeaurora.org>
---
Changes since v1:
- Use cpumasks instead of capability flag as that's more flexible.

 include/linux/perf_event.h |  1 +
 kernel/events/core.c       | 14 +++++++++-----
 2 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 7546822..4cec431 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -629,6 +629,7 @@ struct perf_event {
 
 	int				oncpu;
 	int				cpu;
+	cpumask_t			readable_on_cpus;
 
 	struct list_head		owner_entry;
 	struct task_struct		*owner;
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 5d3df58..1a8fbfa 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -3483,10 +3483,12 @@ struct perf_read_data {
 static int __perf_event_read_cpu(struct perf_event *event, int event_cpu)
 {
 	u16 local_pkg, event_pkg;
+	int local_cpu = smp_processor_id();
 
-	if (event->group_caps & PERF_EV_CAP_READ_ACTIVE_PKG) {
-		int local_cpu = smp_processor_id();
+	if (cpumask_test_cpu(local_cpu, &event->readable_on_cpus))
+		return local_cpu;
 
+	if (event->group_caps & PERF_EV_CAP_READ_ACTIVE_PKG) {
 		event_pkg = topology_physical_package_id(event_cpu);
 		local_pkg = topology_physical_package_id(local_cpu);
 
@@ -3575,7 +3577,8 @@ int perf_event_read_local(struct perf_event *event, u64 *value,
 {
 	unsigned long flags;
 	int ret = 0;
-
+	int local_cpu = smp_processor_id();
+	bool readable = cpumask_test_cpu(local_cpu, &event->readable_on_cpus);
 	/*
 	 * Disabling interrupts avoids all counter scheduling (context
 	 * switches, timer based rotation and IPIs).
@@ -3600,7 +3603,8 @@ int perf_event_read_local(struct perf_event *event, u64 *value,
 
 	/* If this is a per-CPU event, it must be for this CPU */
 	if (!(event->attach_state & PERF_ATTACH_TASK) &&
-	    event->cpu != smp_processor_id()) {
+	    event->cpu != local_cpu &&
+	    !readable) {
 		ret = -EINVAL;
 		goto out;
 	}
@@ -3610,7 +3614,7 @@ int perf_event_read_local(struct perf_event *event, u64 *value,
 	 * or local to this CPU. Furthermore it means its ACTIVE (otherwise
 	 * oncpu == -1).
 	 */
-	if (event->oncpu == smp_processor_id())
+	if (event->oncpu == smp_processor_id() || readable)
 		event->pmu->read(event);
 
 	*value = local64_read(&event->count);
-- 
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH v2] perf/core: Add support for PMUs that can be read from more than 1 CPU
  2018-03-03  1:14 [PATCH v2] perf/core: Add support for PMUs that can be read from more than 1 CPU Saravana Kannan
@ 2018-03-05 12:17 ` Mark Rutland
  2018-03-05 12:21   ` Mark Rutland
  2018-03-05 22:06   ` Saravana Kannan
  0 siblings, 2 replies; 5+ messages in thread
From: Mark Rutland @ 2018-03-05 12:17 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Mar 02, 2018 at 05:14:53PM -0800, Saravana Kannan wrote:
> Some PMUs events can be read from more than the one CPU. So allow the
> PMU driver to mark events as such. For these events, we don't need to
> reject reads or make smp calls to the event's CPU (and cause
> unnecessary overhead and wake ups).
> 
> When a PMU driver marks an event as such, care must be taken by the
> driver to make sure they can handle the event being read/updated from
> more than 1 CPU at the same time (Eg: due to an IRQ indicating event
> counter overflow and another thread trying to read the latest values).
> 
> Good examples of such events would be events from caches shared across
> CPUs.
> 
> Signed-off-by: Saravana Kannan <skannan@codeaurora.org>
> ---
> Changes since v1:
> - Use cpumasks instead of capability flag as that's more flexible.
> 
>  include/linux/perf_event.h |  1 +
>  kernel/events/core.c       | 14 +++++++++-----
>  2 files changed, 10 insertions(+), 5 deletions(-)
> 
> diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
> index 7546822..4cec431 100644
> --- a/include/linux/perf_event.h
> +++ b/include/linux/perf_event.h
> @@ -629,6 +629,7 @@ struct perf_event {
>  
>  	int				oncpu;
>  	int				cpu;
> +	cpumask_t			readable_on_cpus;

For most PMUs, this will be emptry, and it's potentially *very* large
(e.g. on systems where NR_CPUS is 4096). Please use a poitner to a mask,
as I suggested in [1], e.g.

	cpumask_t			*read_mask;

That way, PMUs which already maintain an affinity mask can share that
between all of their events.

PMUs with PERF_EV_CAP_READ_ACTIVE_PKG can be updated to flip that mask
in pmu::add() and pmu::del(). I assume there are existing sibling masks
we can use. That means we can remove PERF_EV_CAP_READ_ACTIVE_PKG
entriely...

>  	struct list_head		owner_entry;
>  	struct task_struct		*owner;
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 5d3df58..1a8fbfa 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -3483,10 +3483,12 @@ struct perf_read_data {
>  static int __perf_event_read_cpu(struct perf_event *event, int event_cpu)
>  {
>  	u16 local_pkg, event_pkg;
> +	int local_cpu = smp_processor_id();
>  
> -	if (event->group_caps & PERF_EV_CAP_READ_ACTIVE_PKG) {
> -		int local_cpu = smp_processor_id();
> +	if (cpumask_test_cpu(local_cpu, &event->readable_on_cpus))
> +		return local_cpu;
>  
> +	if (event->group_caps & PERF_EV_CAP_READ_ACTIVE_PKG) {
>  		event_pkg = topology_physical_package_id(event_cpu);
>  		local_pkg = topology_physical_package_id(local_cpu);

... and this would simplify down to:

static int __perf_event_read_cpu(struct perf_event *event, int event_cpu)
{
	int local_cpu = smp_processor_id();

	if (event->read_mask && cpumask_test_cpu(local_cpu, event->read_mask)
		return local_cpu;

	return event_cpu;
}

> @@ -3575,7 +3577,8 @@ int perf_event_read_local(struct perf_event *event, u64 *value,
>  {
>  	unsigned long flags;
>  	int ret = 0;
> -
> +	int local_cpu = smp_processor_id();
> +	bool readable = cpumask_test_cpu(local_cpu, &event->readable_on_cpus);
>  	/*
>  	 * Disabling interrupts avoids all counter scheduling (context
>  	 * switches, timer based rotation and IPIs).
> @@ -3600,7 +3603,8 @@ int perf_event_read_local(struct perf_event *event, u64 *value,
>  
>  	/* If this is a per-CPU event, it must be for this CPU */
>  	if (!(event->attach_state & PERF_ATTACH_TASK) &&
> -	    event->cpu != smp_processor_id()) {
> +	    event->cpu != local_cpu &&
> +	    !readable) {
>  		ret = -EINVAL;
>  		goto out;
>  	}
> @@ -3610,7 +3614,7 @@ int perf_event_read_local(struct perf_event *event, u64 *value,
>  	 * or local to this CPU. Furthermore it means its ACTIVE (otherwise
>  	 * oncpu == -1).
>  	 */
> -	if (event->oncpu == smp_processor_id())
> +	if (event->oncpu == smp_processor_id() || readable)
>  		event->pmu->read(event);

Please explain why you need to change perf_event_read_local().

Is there a case where you have numbers to show that
perf_event_read_local() is a bottleneck? If so, please elaborate.

As-is, this doesn't seem right.

Thanks,
Mark.

[1] https://lkml.kernel.org/r/20171128124534.3jvuala525wvn64r at wfg-t540p.sh.intel.com

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH v2] perf/core: Add support for PMUs that can be read from more than 1 CPU
  2018-03-05 12:17 ` Mark Rutland
@ 2018-03-05 12:21   ` Mark Rutland
  2018-03-05 22:02     ` Saravana Kannan
  2018-03-05 22:06   ` Saravana Kannan
  1 sibling, 1 reply; 5+ messages in thread
From: Mark Rutland @ 2018-03-05 12:21 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Mar 05, 2018 at 12:17:02PM +0000, Mark Rutland wrote:
> On Fri, Mar 02, 2018 at 05:14:53PM -0800, Saravana Kannan wrote:

> > @@ -629,6 +629,7 @@ struct perf_event {
> >  
> >  	int				oncpu;
> >  	int				cpu;
> > +	cpumask_t			readable_on_cpus;
> 
> For most PMUs, this will be emptry, and it's potentially *very* large
> (e.g. on systems where NR_CPUS is 4096). Please use a poitner to a mask,
> as I suggested in [1], e.g.

> [1] https://lkml.kernel.org/r/20171128124534.3jvuala525wvn64r at wfg-t540p.sh.intel.com

Whoops, that should've been:

[1] https://lkml.kernel.org/r/20180225143802.denbkubqjg2dc7af at salmiak

Mark.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH v2] perf/core: Add support for PMUs that can be read from more than 1 CPU
  2018-03-05 12:21   ` Mark Rutland
@ 2018-03-05 22:02     ` Saravana Kannan
  0 siblings, 0 replies; 5+ messages in thread
From: Saravana Kannan @ 2018-03-05 22:02 UTC (permalink / raw)
  To: linux-arm-kernel

On 03/05/2018 04:21 AM, Mark Rutland wrote:
> On Mon, Mar 05, 2018 at 12:17:02PM +0000, Mark Rutland wrote:
>> On Fri, Mar 02, 2018 at 05:14:53PM -0800, Saravana Kannan wrote:
>
>>> @@ -629,6 +629,7 @@ struct perf_event {
>>>
>>>   	int				oncpu;
>>>   	int				cpu;
>>> +	cpumask_t			readable_on_cpus;
>>
>> For most PMUs, this will be emptry, and it's potentially *very* large
>> (e.g. on systems where NR_CPUS is 4096). Please use a poitner to a mask,
>> as I suggested in [1], e.g.
>
>> [1] https://lkml.kernel.org/r/20171128124534.3jvuala525wvn64r at wfg-t540p.sh.intel.com
>
> Whoops, that should've been:
>
> [1] https://lkml.kernel.org/r/20180225143802.denbkubqjg2dc7af at salmiak
>

I didn't notice you mentioned the use of pointers, but I was planning on 
doing that anyway. But then I realize people will complain about 
cacheline bouncing across 4096 CPUs if I use a cpu mask pointer.

Thanks,
Saravana

-- 
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH v2] perf/core: Add support for PMUs that can be read from more than 1 CPU
  2018-03-05 12:17 ` Mark Rutland
  2018-03-05 12:21   ` Mark Rutland
@ 2018-03-05 22:06   ` Saravana Kannan
  1 sibling, 0 replies; 5+ messages in thread
From: Saravana Kannan @ 2018-03-05 22:06 UTC (permalink / raw)
  To: linux-arm-kernel

On 03/05/2018 04:17 AM, Mark Rutland wrote:
> On Fri, Mar 02, 2018 at 05:14:53PM -0800, Saravana Kannan wrote:
>> Some PMUs events can be read from more than the one CPU. So allow the
>> PMU driver to mark events as such. For these events, we don't need to
>> reject reads or make smp calls to the event's CPU (and cause
>> unnecessary overhead and wake ups).
>>
>> When a PMU driver marks an event as such, care must be taken by the
>> driver to make sure they can handle the event being read/updated from
>> more than 1 CPU at the same time (Eg: due to an IRQ indicating event
>> counter overflow and another thread trying to read the latest values).
>>
>> Good examples of such events would be events from caches shared across
>> CPUs.
>>
>> Signed-off-by: Saravana Kannan <skannan@codeaurora.org>
>> ---
>> Changes since v1:
>> - Use cpumasks instead of capability flag as that's more flexible.
>>
>>   include/linux/perf_event.h |  1 +
>>   kernel/events/core.c       | 14 +++++++++-----
>>   2 files changed, 10 insertions(+), 5 deletions(-)
>>
>> diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
>> index 7546822..4cec431 100644
>> --- a/include/linux/perf_event.h
>> +++ b/include/linux/perf_event.h
>> @@ -629,6 +629,7 @@ struct perf_event {
>>
>>   	int				oncpu;
>>   	int				cpu;
>> +	cpumask_t			readable_on_cpus;
>
> For most PMUs, this will be emptry, and it's potentially *very* large
> (e.g. on systems where NR_CPUS is 4096). Please use a poitner to a mask,
> as I suggested in [1], e.g.
>
> 	cpumask_t			*read_mask;
>
> That way, PMUs which already maintain an affinity mask can share that
> between all of their events.
>
> PMUs with PERF_EV_CAP_READ_ACTIVE_PKG can be updated to flip that mask
> in pmu::add() and pmu::del(). I assume there are existing sibling masks
> we can use. That means we can remove PERF_EV_CAP_READ_ACTIVE_PKG
> entriely...

I think that should be a separate patch if it's necessary.

>
>>   	struct list_head		owner_entry;
>>   	struct task_struct		*owner;
>> diff --git a/kernel/events/core.c b/kernel/events/core.c
>> index 5d3df58..1a8fbfa 100644
>> --- a/kernel/events/core.c
>> +++ b/kernel/events/core.c
>> @@ -3483,10 +3483,12 @@ struct perf_read_data {
>>   static int __perf_event_read_cpu(struct perf_event *event, int event_cpu)
>>   {
>>   	u16 local_pkg, event_pkg;
>> +	int local_cpu = smp_processor_id();
>>
>> -	if (event->group_caps & PERF_EV_CAP_READ_ACTIVE_PKG) {
>> -		int local_cpu = smp_processor_id();
>> +	if (cpumask_test_cpu(local_cpu, &event->readable_on_cpus))
>> +		return local_cpu;
>>
>> +	if (event->group_caps & PERF_EV_CAP_READ_ACTIVE_PKG) {
>>   		event_pkg = topology_physical_package_id(event_cpu);
>>   		local_pkg = topology_physical_package_id(local_cpu);
>
> ... and this would simplify down to:
>
> static int __perf_event_read_cpu(struct perf_event *event, int event_cpu)
> {
> 	int local_cpu = smp_processor_id();
>
> 	if (event->read_mask && cpumask_test_cpu(local_cpu, event->read_mask)
> 		return local_cpu;
>
> 	return event_cpu;
> }
>
>> @@ -3575,7 +3577,8 @@ int perf_event_read_local(struct perf_event *event, u64 *value,
>>   {
>>   	unsigned long flags;
>>   	int ret = 0;
>> -
>> +	int local_cpu = smp_processor_id();
>> +	bool readable = cpumask_test_cpu(local_cpu, &event->readable_on_cpus);
>>   	/*
>>   	 * Disabling interrupts avoids all counter scheduling (context
>>   	 * switches, timer based rotation and IPIs).
>> @@ -3600,7 +3603,8 @@ int perf_event_read_local(struct perf_event *event, u64 *value,
>>
>>   	/* If this is a per-CPU event, it must be for this CPU */
>>   	if (!(event->attach_state & PERF_ATTACH_TASK) &&
>> -	    event->cpu != smp_processor_id()) {
>> +	    event->cpu != local_cpu &&
>> +	    !readable) {
>>   		ret = -EINVAL;
>>   		goto out;
>>   	}
>> @@ -3610,7 +3614,7 @@ int perf_event_read_local(struct perf_event *event, u64 *value,
>>   	 * or local to this CPU. Furthermore it means its ACTIVE (otherwise
>>   	 * oncpu == -1).
>>   	 */
>> -	if (event->oncpu == smp_processor_id())
>> +	if (event->oncpu == smp_processor_id() || readable)
>>   		event->pmu->read(event);
>
> Please explain why you need to change perf_event_read_local().
>
> Is there a case where you have numbers to show that
> perf_event_read_local() is a bottleneck? If so, please elaborate.
>
> As-is, this doesn't seem right.

The only reason perf_event_read_local seems to exist in the first place 
is because we need an API that doesn't sleep or send IPIs to other CPUs. 
If an event is readable on the current CPU, there's no reason to sleep 
or send an IPI to another CPU. So, why disallow a read that'll work? 
Hence the change.

Thanks,
Saravana

-- 
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2018-03-05 22:06 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-03-03  1:14 [PATCH v2] perf/core: Add support for PMUs that can be read from more than 1 CPU Saravana Kannan
2018-03-05 12:17 ` Mark Rutland
2018-03-05 12:21   ` Mark Rutland
2018-03-05 22:02     ` Saravana Kannan
2018-03-05 22:06   ` Saravana Kannan

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).