* [PATCH v1 1/2] perf/core: Add API to look up PMU type by name
@ 2018-02-24  0:19 ` Saravana Kannan
  0 siblings, 0 replies; 26+ messages in thread
From: Saravana Kannan @ 2018-02-24  0:19 UTC (permalink / raw)
  To: mark.rutland, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Alexander Shishkin, Jiri Olsa,
	Namhyung Kim
  Cc: skannan, avilaj, linux-arm-kernel, linux-kernel

When the event numbers registered by multiple PMUs overlap, the
attr->type value passed to perf_event_create_kernel_counter() is used
to determine which PMU to use to create a perf_event.

However, when the PMU in question is not a standard PMU (defined in
perf_type_id), there is no way for a kernel client to look up the PMU
type for the PMU of interest and set the attr->type appropriately.

So, add an API to look up the PMU type by name. That way, the kernel
APIs can function in a fashion similar to the user space interface.

Signed-off-by: Saravana Kannan <skannan@codeaurora.org>
---
 kernel/events/core.c | 23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 96db9ae..5d3df58 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -10310,6 +10310,29 @@ static int perf_event_set_clock(struct perf_event *event, clockid_t clk_id)
 	return err;
 }
 
+int perf_find_pmu_type_by_name(const char *name)
+{
+	struct pmu *pmu;
+	int ret = -1;
+
+	mutex_lock(&pmus_lock);
+
+	list_for_each_entry(pmu, &pmus, entry) {
+		if (!pmu->name || pmu->type < 0)
+			continue;
+
+		if (!strcmp(name, pmu->name)) {
+			ret = pmu->type;
+			goto out;
+		}
+	}
+
+out:
+	mutex_unlock(&pmus_lock);
+
+	return ret;
+}
+
 /**
  * perf_event_create_kernel_counter
  *
-- 
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project

^ permalink raw reply related	[flat|nested] 26+ messages in thread
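
For comparison, the user space interface that the commit message refers to exposes each registered PMU's dynamic type as a small text file in sysfs. The following is a minimal user space sketch of that lookup, not part of the patch. read_pmu_type() is a hypothetical helper name, and the base directory is a parameter only so the function can be exercised against a scratch directory.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Read the PMU type exported in <base>/<name>/type.
 * On a running system <base> would be /sys/bus/event_source/devices.
 * Returns the type on success, or -1 if the PMU is not found or the
 * file cannot be parsed.
 *
 * read_pmu_type() is a hypothetical helper, not a kernel or libc API.
 */
int read_pmu_type(const char *base, const char *name)
{
	char path[512];
	FILE *f;
	int type = -1;
	int n;

	assert(base && name);

	n = snprintf(path, sizeof(path), "%s/%s/type", base, name);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;	/* path did not fit in the buffer */

	f = fopen(path, "r");
	if (!f)
		return -1;	/* no such PMU, or sysfs not available */

	if (fscanf(f, "%d", &type) != 1 || type < 0)
		type = -1;	/* unparsable or invalid contents */

	fclose(f);
	return type;
}
```

A user space caller would typically store the returned value in perf_event_attr.type before calling perf_event_open(). The patch above aims to give in-kernel callers of perf_event_create_kernel_counter() an equivalent by-name lookup.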

* [PATCH v1 2/2] perf/core: Add support for PMUs that can be read from any CPU
  2018-02-24  0:19 ` Saravana Kannan
@ 2018-02-24  0:19   ` Saravana Kannan
  -1 siblings, 0 replies; 26+ messages in thread
From: Saravana Kannan @ 2018-02-24  0:19 UTC (permalink / raw)
  To: mark.rutland, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Alexander Shishkin, Jiri Olsa,
	Namhyung Kim
  Cc: skannan, avilaj, linux-arm-kernel, linux-kernel

Some PMU events can be read from any CPU. So allow the PMU to mark
events as such. For these events, we don't need to reject reads or
make smp calls to the event's CPU and cause unnecessary wake ups.

Good examples of such events would be events from caches shared across
all CPUs.

Signed-off-by: Saravana Kannan <skannan@codeaurora.org>
---
 include/linux/perf_event.h |  3 +++
 kernel/events/core.c       | 10 ++++++++--
 2 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 7546822..ee8978f 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -510,9 +510,12 @@ typedef void (*perf_overflow_handler_t)(struct perf_event *,
  * PERF_EV_CAP_SOFTWARE: Is a software event.
  * PERF_EV_CAP_READ_ACTIVE_PKG: A CPU event (or cgroup event) that can be read
  * from any CPU in the package where it is active.
+ * PERF_EV_CAP_READ_ANY_CPU: A CPU event (or cgroup event) that can be read
+ * from any CPU.
  */
 #define PERF_EV_CAP_SOFTWARE		BIT(0)
 #define PERF_EV_CAP_READ_ACTIVE_PKG	BIT(1)
+#define PERF_EV_CAP_READ_ANY_CPU	BIT(2)
 
 #define SWEVENT_HLIST_BITS		8
 #define SWEVENT_HLIST_SIZE		(1 << SWEVENT_HLIST_BITS)
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 5d3df58..570187b 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -3484,6 +3484,10 @@ static int __perf_event_read_cpu(struct perf_event *event, int event_cpu)
 {
 	u16 local_pkg, event_pkg;
 
+	if (event->group_caps & PERF_EV_CAP_READ_ANY_CPU) {
+		return smp_processor_id();
+	}
+
 	if (event->group_caps & PERF_EV_CAP_READ_ACTIVE_PKG) {
 		int local_cpu = smp_processor_id();
 
@@ -3575,6 +3579,7 @@ int perf_event_read_local(struct perf_event *event, u64 *value,
 {
 	unsigned long flags;
 	int ret = 0;
+	bool is_any_cpu = !!(event->group_caps & PERF_EV_CAP_READ_ANY_CPU);
 
 	/*
 	 * Disabling interrupts avoids all counter scheduling (context
@@ -3600,7 +3605,8 @@ int perf_event_read_local(struct perf_event *event, u64 *value,
 
 	/* If this is a per-CPU event, it must be for this CPU */
 	if (!(event->attach_state & PERF_ATTACH_TASK) &&
-	    event->cpu != smp_processor_id()) {
+	    event->cpu != smp_processor_id() &&
+	    !is_any_cpu) {
 		ret = -EINVAL;
 		goto out;
 	}
@@ -3610,7 +3616,7 @@ int perf_event_read_local(struct perf_event *event, u64 *value,
 	 * or local to this CPU. Furthermore it means its ACTIVE (otherwise
 	 * oncpu == -1).
 	 */
-	if (event->oncpu == smp_processor_id())
+	if (event->oncpu == smp_processor_id() || is_any_cpu)
 		event->pmu->read(event);
 
 	*value = local64_read(&event->count);
-- 
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project

^ permalink raw reply related	[flat|nested] 26+ messages in thread
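
The effect of the new flag on the choice of read CPU can be modelled outside the kernel. The sketch below is a simplified user space model, not the kernel code: struct model_event, the CAP_* macros and model_read_cpu() are hypothetical names, and the package comparison for the ACTIVE_PKG case is an assumption about logic that the diff context only partly shows.

```c
#include <assert.h>

#define CAP_READ_ACTIVE_PKG	(1u << 1)	/* mirrors BIT(1) in the patch */
#define CAP_READ_ANY_CPU	(1u << 2)	/* mirrors BIT(2) in the patch */

/* Hypothetical stand-in for the few perf_event fields involved. */
struct model_event {
	unsigned int group_caps;
	int event_pkg;	/* package id of the CPU the event is active on */
};

/*
 * Return the CPU that should perform the read. A result different from
 * local_cpu means a cross-CPU call would be needed.
 */
int model_read_cpu(const struct model_event *ev, int event_cpu,
		   int local_cpu, int local_pkg)
{
	assert(ev);

	/* New in the patch: readable from anywhere, so read locally. */
	if (ev->group_caps & CAP_READ_ANY_CPU)
		return local_cpu;

	/* Existing behaviour (assumed): read locally within the package. */
	if ((ev->group_caps & CAP_READ_ACTIVE_PKG) &&
	    ev->event_pkg == local_pkg)
		return local_cpu;

	return event_cpu;
}
```

In this model an ANY_CPU event never needs a remote read, while an ACTIVE_PKG event still does whenever the reader sits in a different package.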

* Re: [PATCH v1 2/2] perf/core: Add support for PMUs that can be read from any CPU
  2018-02-24  0:19   ` Saravana Kannan
@ 2018-02-24  0:56     ` Saravana Kannan
  -1 siblings, 0 replies; 26+ messages in thread
From: Saravana Kannan @ 2018-02-24  0:56 UTC (permalink / raw)
  To: Saravana Kannan
  Cc: mark.rutland, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Alexander Shishkin, Jiri Olsa,
	Namhyung Kim, avilaj, linux-arm-kernel, linux-kernel

On 02/23/2018 04:19 PM, Saravana Kannan wrote:
> Some PMU events can be read from any CPU. So allow the PMU to mark
> events as such. For these events, we don't need to reject reads or
> make smp calls to the event's CPU and cause unnecessary wake ups.
>
> Good examples of such events would be events from caches shared across
> all CPUs.
>
> Signed-off-by: Saravana Kannan <skannan@codeaurora.org>
> ---
>   include/linux/perf_event.h |  3 +++
>   kernel/events/core.c       | 10 ++++++++--
>   2 files changed, 11 insertions(+), 2 deletions(-)
>
>

Ugh! Didn't mean to chain these two emails. This one is independent of 
the other email.

-Saravana

-- 
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v1 1/2] perf/core: Add API to look up PMU type by name
  2018-02-24  0:19 ` Saravana Kannan
@ 2018-02-24  8:08   ` Peter Zijlstra
  -1 siblings, 0 replies; 26+ messages in thread
From: Peter Zijlstra @ 2018-02-24  8:08 UTC (permalink / raw)
  To: Saravana Kannan
  Cc: mark.rutland, Ingo Molnar, Arnaldo Carvalho de Melo,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, avilaj,
	linux-arm-kernel, linux-kernel

On Fri, Feb 23, 2018 at 04:19:37PM -0800, Saravana Kannan wrote:
> When the event numbers registered by multiple PMUs overlap, the
> attr->type value passed to perf_event_create_kernel_counter() is used
> to determine which PMU to use to create a perf_event.
> 
> However, when the PMU in question is not a standard PMU (defined in
> perf_type_id), there is no way for a kernel client to look up the PMU
> type for the PMU of interest and set the attr->type appropriately.
> 
> So, add an API to look up the PMU type by name. That way, the kernel
> APIs can function in a fashion similar to the user space interface.
> 
> Signed-off-by: Saravana Kannan <skannan@codeaurora.org>
> ---
>  kernel/events/core.c | 23 +++++++++++++++++++++++
>  1 file changed, 23 insertions(+)
> 
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 96db9ae..5d3df58 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -10310,6 +10310,29 @@ static int perf_event_set_clock(struct perf_event *event, clockid_t clk_id)
>  	return err;
>  }
>  
> +int perf_find_pmu_type_by_name(const char *name)
> +{
> +	struct pmu *pmu;
> +	int ret = -1;
> +
> +	mutex_lock(&pmus_lock);
> +
> +	list_for_each_entry(pmu, &pmus, entry) {
> +		if (!pmu->name || pmu->type < 0)
> +			continue;
> +
> +		if (!strcmp(name, pmu->name)) {
> +			ret = pmu->type;
> +			goto out;
> +		}
> +	}
> +
> +out:
> +	mutex_unlock(&pmus_lock);
> +
> +	return ret;
> +}

Not without an in-tree user.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v1 2/2] perf/core: Add support for PMUs that can be read from any CPU
  2018-02-24  0:19   ` Saravana Kannan
@ 2018-02-24  8:41     ` Peter Zijlstra
  -1 siblings, 0 replies; 26+ messages in thread
From: Peter Zijlstra @ 2018-02-24  8:41 UTC (permalink / raw)
  To: Saravana Kannan
  Cc: mark.rutland, Ingo Molnar, Arnaldo Carvalho de Melo,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, avilaj,
	linux-arm-kernel, linux-kernel

On Fri, Feb 23, 2018 at 04:19:38PM -0800, Saravana Kannan wrote:
> Some PMU events can be read from any CPU. So allow the PMU to mark
> events as such. For these events, we don't need to reject reads or
> make smp calls to the event's CPU and cause unnecessary wake ups.
> 
> Good examples of such events would be events from caches shared across
> all CPUs.

So why would the existing ACTIVE_PKG not work for you? Because clearly
your example does not cross a package.

> Signed-off-by: Saravana Kannan <skannan@codeaurora.org>
> ---
>  include/linux/perf_event.h |  3 +++
>  kernel/events/core.c       | 10 ++++++++--
>  2 files changed, 11 insertions(+), 2 deletions(-)
> 
> diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
> index 7546822..ee8978f 100644
> --- a/include/linux/perf_event.h
> +++ b/include/linux/perf_event.h
> @@ -510,9 +510,12 @@ typedef void (*perf_overflow_handler_t)(struct perf_event *,
>   * PERF_EV_CAP_SOFTWARE: Is a software event.
>   * PERF_EV_CAP_READ_ACTIVE_PKG: A CPU event (or cgroup event) that can be read
>   * from any CPU in the package where it is active.
> + * PERF_EV_CAP_READ_ANY_CPU: A CPU event (or cgroup event) that can be read
> + * from any CPU.
>   */
>  #define PERF_EV_CAP_SOFTWARE		BIT(0)
>  #define PERF_EV_CAP_READ_ACTIVE_PKG	BIT(1)
> +#define PERF_EV_CAP_READ_ANY_CPU	BIT(2)
>  
>  #define SWEVENT_HLIST_BITS		8
>  #define SWEVENT_HLIST_SIZE		(1 << SWEVENT_HLIST_BITS)
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 5d3df58..570187b 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -3484,6 +3484,10 @@ static int __perf_event_read_cpu(struct perf_event *event, int event_cpu)
>  {
>  	u16 local_pkg, event_pkg;
>  
> +	if (event->group_caps & PERF_EV_CAP_READ_ANY_CPU) {
> +		return smp_processor_id();
> +	}
> +
>  	if (event->group_caps & PERF_EV_CAP_READ_ACTIVE_PKG) {
>  		int local_cpu = smp_processor_id();
>  

> @@ -3575,6 +3579,7 @@ int perf_event_read_local(struct perf_event *event, u64 *value,
>  {
>  	unsigned long flags;
>  	int ret = 0;
> +	bool is_any_cpu = !!(event->group_caps & PERF_EV_CAP_READ_ANY_CPU);
>  
>  	/*
>  	 * Disabling interrupts avoids all counter scheduling (context
> @@ -3600,7 +3605,8 @@ int perf_event_read_local(struct perf_event *event, u64 *value,
>  
>  	/* If this is a per-CPU event, it must be for this CPU */
>  	if (!(event->attach_state & PERF_ATTACH_TASK) &&
> -	    event->cpu != smp_processor_id()) {
> +	    event->cpu != smp_processor_id() &&
> +	    !is_any_cpu) {
>  		ret = -EINVAL;
>  		goto out;
>  	}
> @@ -3610,7 +3616,7 @@ int perf_event_read_local(struct perf_event *event, u64 *value,
>  	 * or local to this CPU. Furthermore it means its ACTIVE (otherwise
>  	 * oncpu == -1).
>  	 */
> -	if (event->oncpu == smp_processor_id())
> +	if (event->oncpu == smp_processor_id() || is_any_cpu)
>  		event->pmu->read(event);
>  
>  	*value = local64_read(&event->count);

And why are you modifying read_local for this? That didn't support
ACTIVE_PKG, so why should it support this?

And again, where are the users?

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v1 2/2] perf/core: Add support for PMUs that can be read from any CPU
  2018-02-24  0:19   ` Saravana Kannan
@ 2018-02-25 14:38     ` Mark Rutland
  -1 siblings, 0 replies; 26+ messages in thread
From: Mark Rutland @ 2018-02-25 14:38 UTC (permalink / raw)
  To: Saravana Kannan
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, avilaj,
	linux-arm-kernel, linux-kernel

On Fri, Feb 23, 2018 at 04:19:38PM -0800, Saravana Kannan wrote:
> Some PMU events can be read from any CPU. So allow the PMU to mark
> events as such. For these events, we don't need to reject reads or
> make smp calls to the event's CPU and cause unnecessary wake ups.
> 
> Good examples of such events would be events from caches shared across
> all CPUs.

I think that if we need to generalize PERF_EV_CAP_READ_ACTIVE_PKG, it would be
better to give events a pointer to a cpumask. That could then cover all cases
quite trivially:

static int __perf_event_read_cpu(struct perf_event *event, int event_cpu)
{
	int local_cpu = smp_processor_id();

	if (event->read_mask &&
	    cpumask_test_cpu(local_cpu, event->read_mask))
		event_cpu = local_cpu;
	
	return event_cpu;
}

... in the PERF_EV_CAP_READ_ACTIVE_PKG case, we can use the existing(?) package
masks, and more generally we can re-use the PMU's affinity mask if it has one.

That said, I see that many pmu::read() implementations have side-effects on
hwc->prev_count, and event->count, so I worry that this won't be safe in general
(e.g. if we race with the IRQ handler on another CPU).

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 26+ messages in thread
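
The mask-based selection suggested above can be tried out in user space. The sketch below is a model only, under stated assumptions: model_cpumask_t is a hypothetical 64-bit stand-in for struct cpumask, and model_mask_test_cpu() and model_read_cpu_masked() are hypothetical names that mimic cpumask_test_cpu() and the proposed __perf_event_read_cpu().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for struct cpumask: one bit per CPU, up to 64. */
typedef uint64_t model_cpumask_t;

/* Return 1 if cpu is set in mask, 0 otherwise. */
int model_mask_test_cpu(int cpu, model_cpumask_t mask)
{
	assert(cpu >= 0 && cpu < 64);
	return (int)((mask >> cpu) & 1u);
}

/*
 * Pick the CPU that performs the read. A NULL read_mask means the event
 * has no shortcut and must always be read on event_cpu.
 */
int model_read_cpu_masked(const model_cpumask_t *read_mask,
			  int event_cpu, int local_cpu)
{
	if (read_mask && model_mask_test_cpu(local_cpu, *read_mask))
		return local_cpu;

	return event_cpu;
}
```

With a mask covering only the CPUs that share one cache, a reader outside that set still falls back to event_cpu, while a mask covering every CPU behaves like the proposed ANY_CPU flag. The model does not address the concern about racing with pmu::read() side effects.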

* Re: [PATCH v1 2/2] perf/core: Add support for PMUs that can be read from any CPU
  2018-02-24  8:41     ` Peter Zijlstra
@ 2018-02-27  1:53       ` skannan at codeaurora.org
  -1 siblings, 0 replies; 26+ messages in thread
From: skannan @ 2018-02-27  1:53 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: mark.rutland, Ingo Molnar, Arnaldo Carvalho de Melo,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, avilaj,
	linux-arm-kernel, linux-kernel

On 2018-02-24 00:41, Peter Zijlstra wrote:
> On Fri, Feb 23, 2018 at 04:19:38PM -0800, Saravana Kannan wrote:
>> Some PMU events can be read from any CPU. So allow the PMU to mark
>> events as such. For these events, we don't need to reject reads or
>> make smp calls to the event's CPU and cause unnecessary wake ups.
>> 
>> Good examples of such events would be events from caches shared across
>> all CPUs.
> 
> So why would the existing ACTIVE_PKG not work for you? Because clearly
> your example does not cross a package.

Because based on testing it on hardware, it looks like the two clusters
in an ARM DynamIQ design are not considered part of the same "package".
When I say clusters, I'm using the more common interpretation of
"homogeneous CPUs running on the same clock"/CPUs in a cpufreq policy
and not ARM's new redefinition of cluster. So, on a SoC with 4 little
and 4 big cores, it'll still trigger a lot of unnecessary smp calls/IPIs
that cause unnecessary wakeups.

That said, I like Mark's suggestion of giving every event a cpumask and
using that instead, because the meaning of "active package" is very
ambiguous. For example, if a SoC has 2 DynamIQ blocks (not sure if
that's possible), what's considered a package? CPUs that are sitting on
one L3 can't read the PMU counters of a different L3. In that case,
neither "Any CPU" nor "Active Package" is correct/usable for reducing
IPIs.

> 
>> Signed-off-by: Saravana Kannan <skannan@codeaurora.org>
>> ---
>>  include/linux/perf_event.h |  3 +++
>>  kernel/events/core.c       | 10 ++++++++--
>>  2 files changed, 11 insertions(+), 2 deletions(-)
>> 
>> diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
>> index 7546822..ee8978f 100644
>> --- a/include/linux/perf_event.h
>> +++ b/include/linux/perf_event.h
>> @@ -510,9 +510,12 @@ typedef void (*perf_overflow_handler_t)(struct perf_event *,
>>   * PERF_EV_CAP_SOFTWARE: Is a software event.
>>   * PERF_EV_CAP_READ_ACTIVE_PKG: A CPU event (or cgroup event) that can be read
>>   * from any CPU in the package where it is active.
>> + * PERF_EV_CAP_READ_ANY_CPU: A CPU event (or cgroup event) that can be read
>> + * from any CPU.
>>   */
>>  #define PERF_EV_CAP_SOFTWARE		BIT(0)
>>  #define PERF_EV_CAP_READ_ACTIVE_PKG	BIT(1)
>> +#define PERF_EV_CAP_READ_ANY_CPU	BIT(2)
>> 
>>  #define SWEVENT_HLIST_BITS		8
>>  #define SWEVENT_HLIST_SIZE		(1 << SWEVENT_HLIST_BITS)
>> diff --git a/kernel/events/core.c b/kernel/events/core.c
>> index 5d3df58..570187b 100644
>> --- a/kernel/events/core.c
>> +++ b/kernel/events/core.c
>> @@ -3484,6 +3484,10 @@ static int __perf_event_read_cpu(struct perf_event *event, int event_cpu)
>>  {
>>  	u16 local_pkg, event_pkg;
>> 
>> +	if (event->group_caps & PERF_EV_CAP_READ_ANY_CPU) {
>> +		return smp_processor_id();
>> +	}
>> +
>>  	if (event->group_caps & PERF_EV_CAP_READ_ACTIVE_PKG) {
>>  		int local_cpu = smp_processor_id();
>> 
> 
>> @@ -3575,6 +3579,7 @@ int perf_event_read_local(struct perf_event *event, u64 *value,
>>  {
>>  	unsigned long flags;
>>  	int ret = 0;
>> +	bool is_any_cpu = !!(event->group_caps & PERF_EV_CAP_READ_ANY_CPU);
>> 
>>  	/*
>>  	 * Disabling interrupts avoids all counter scheduling (context
>> @@ -3600,7 +3605,8 @@ int perf_event_read_local(struct perf_event *event, u64 *value,
>> 
>>  	/* If this is a per-CPU event, it must be for this CPU */
>>  	if (!(event->attach_state & PERF_ATTACH_TASK) &&
>> -	    event->cpu != smp_processor_id()) {
>> +	    event->cpu != smp_processor_id() &&
>> +	    !is_any_cpu) {
>>  		ret = -EINVAL;
>>  		goto out;
>>  	}
>> @@ -3610,7 +3616,7 @@ int perf_event_read_local(struct perf_event *event, u64 *value,
>>  	 * or local to this CPU. Furthermore it means its ACTIVE (otherwise
>>  	 * oncpu == -1).
>>  	 */
>> -	if (event->oncpu == smp_processor_id())
>> +	if (event->oncpu == smp_processor_id() || is_any_cpu)
>>  		event->pmu->read(event);
>> 
>>  	*value = local64_read(&event->count);
> 
> And why are you modifying read_local for this? That didn't support
> ACTIVE_PKG, so why should it support this?

Maybe I'll make a separate patch to first have perf_event_read_local()
also handle ACTIVE_PKG? In those cases, the smp call made by
read_local is unnecessary too.

> 
> And again, where are the users?

The DynamIQ PMU driver would be the user.

-Saravana

* Re: [PATCH v1 2/2] perf/core: Add support for PMUs that can be read from any CPU
  2018-02-25 14:38     ` Mark Rutland
@ 2018-02-27  2:11       ` skannan at codeaurora.org
  -1 siblings, 0 replies; 26+ messages in thread
From: skannan @ 2018-02-27  2:11 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, avilaj,
	linux-arm-kernel, linux-kernel

On 2018-02-25 06:38, Mark Rutland wrote:
> On Fri, Feb 23, 2018 at 04:19:38PM -0800, Saravana Kannan wrote:
>> Some PMU events can be read from any CPU. So allow the PMU to mark
>> events as such. For these events, we don't need to reject reads or
>> make smp calls to the event's CPU and cause unnecessary wake ups.
>> 
>> Good examples of such events would be events from caches shared across
>> all CPUs.
> 
> I think that if we need to generalize PERF_EV_CAP_READ_ACTIVE_PKG, it would be
> better to give events a pointer to a cpumask. That could then cover all cases
> quite trivially:
> 
> static int __perf_event_read_cpu(struct perf_event *event, int event_cpu)
> {
> 	int local_cpu = smp_processor_id();
> 
> 	if (event->read_mask &&
> 	    cpumask_test_cpu(local_cpu, event->read_mask))
> 		event_cpu = local_cpu;
> 
> 	return event_cpu;
> }
> 

This is a good improvement on my attempt. If I send a patch for this, is 
that something you'd be willing to incorporate into your patch set and 
make sure the DSU pmu driver handles it correctly?

> ... in the PERF_EV_CAP_READ_ACTIVE_PKG case, we can use the existing(?) package
> masks, and more generally we can re-use the PMU's affinity mask if it has one.
> 
> That said, I see that many pmu::read() implementations have side-effects on
> hwc->prev_count, and event->count, so I worry that this won't be safe in general
> (e.g. if we race with the IRQ handler on another CPU).
> 

Yeah, this doesn't have to be mandatory. It can be an optional mask the 
PMU can set up during perf event init.
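
As a rough illustration of the "optional mask" idea, here is a minimal
userspace sketch. It is not kernel code: toy_cpumask_t, struct toy_event
and toy_read_cpu() are hypothetical stand-ins for struct cpumask, struct
perf_event and __perf_event_read_cpu(), and a single 64-bit word is
assumed to be enough to hold the mask.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: one bit per CPU, so at most 64 CPUs. */
typedef uint64_t toy_cpumask_t;

struct toy_event {
	int cpu;			/* CPU the event is bound to */
	const toy_cpumask_t *read_mask;	/* optional; NULL if the PMU sets none */
};

/* Return the CPU that should perform the read when called on local_cpu. */
int toy_read_cpu(const struct toy_event *ev, int local_cpu)
{
	if (ev->read_mask && local_cpu >= 0 && local_cpu < 64 &&
	    (*ev->read_mask & (UINT64_C(1) << local_cpu)))
		return local_cpu;	/* readable here, no cross-CPU call */

	return ev->cpu;			/* fall back to the event's own CPU */
}
```

A PMU that leaves read_mask as NULL keeps today's behaviour in this
model, which is what makes the mask optional.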

Peter,

Is this something that's acceptable to you?

Thanks,
Saravana

* Re: [PATCH v1 2/2] perf/core: Add support for PMUs that can be read from any CPU
  2018-02-27  2:11       ` skannan at codeaurora.org
@ 2018-02-27 11:43         ` Mark Rutland
  -1 siblings, 0 replies; 26+ messages in thread
From: Mark Rutland @ 2018-02-27 11:43 UTC (permalink / raw)
  To: skannan
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, avilaj,
	linux-arm-kernel, linux-kernel

On Mon, Feb 26, 2018 at 06:11:45PM -0800, skannan@codeaurora.org wrote:
> On 2018-02-25 06:38, Mark Rutland wrote:
> > On Fri, Feb 23, 2018 at 04:19:38PM -0800, Saravana Kannan wrote:
> > > Some PMU events can be read from any CPU. So allow the PMU to mark
> > > events as such. For these events, we don't need to reject reads or
> > > make smp calls to the event's CPU and cause unnecessary wake ups.
> > > 
> > > Good examples of such events would be events from caches shared across
> > > all CPUs.
> > 
> > I think that if we need to generalize PERF_EV_CAP_READ_ACTIVE_PKG, it would be
> > better to give events a pointer to a cpumask. That could then cover all cases
> > quite trivially:
> > 
> > static int __perf_event_read_cpu(struct perf_event *event, int event_cpu)
> > {
> > 	int local_cpu = smp_processor_id();
> > 
> > 	if (event->read_mask &&
> > 	    cpumask_test_cpu(local_cpu, event->read_mask))
> > 		event_cpu = local_cpu;
> > 
> > 	return event_cpu;
> > }
> 
> This is a good improvement on my attempt. If I send a patch for this, is
> that something you'd be willing to incorporate into your patch set and make
> sure the DSU pmu driver handles it correctly?

As I commented, I don't think that will work without more invasive
changes, as the DSU PMU's pmu::read() function has side effects on
hwc->prev_count and event->count, and could race with an IRQ handler on
another CPU.
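
To make the concern concrete, here is a minimal userspace sketch of the
kind of read-side update being described. It is not the DSU PMU driver
code: hw_counter, struct toy_event and toy_pmu_read() are hypothetical
names, and C11 atomics stand in for the kernel's local64 helpers, which
as far as I understand are only intended for CPU-local use.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for a free-running hardware counter. */
uint64_t hw_counter;

/* Hypothetical, simplified model of the state a read modifies. */
struct toy_event {
	_Atomic uint64_t prev_count;	/* last raw value folded into count */
	_Atomic uint64_t count;		/* accumulated event count */
};

/*
 * Fold the delta since the previous read into count and advance
 * prev_count. Every caller writes both fields, so a reader on one CPU
 * and an overflow handler on another would be updating the same state.
 */
void toy_pmu_read(struct toy_event *ev)
{
	uint64_t prev, now;

	do {
		prev = atomic_load(&ev->prev_count);
		now = hw_counter;
	} while (!atomic_compare_exchange_strong(&ev->prev_count, &prev, now));

	atomic_fetch_add(&ev->count, now - prev);
}
```

The point of the sketch is only that a "read" is really a
read-modify-write, so letting any CPU call it changes who may write
this state concurrently.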

Is the IPI really a problem in practice?

Thanks,
Mark.

* Re: [PATCH v1 2/2] perf/core: Add support for PMUs that can be read from any CPU
  2018-02-27  1:53       ` skannan at codeaurora.org
@ 2018-02-27 11:52         ` Mark Rutland
  -1 siblings, 0 replies; 26+ messages in thread
From: Mark Rutland @ 2018-02-27 11:52 UTC (permalink / raw)
  To: skannan
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, avilaj,
	linux-arm-kernel, linux-kernel

On Mon, Feb 26, 2018 at 05:53:57PM -0800, skannan@codeaurora.org wrote:
> On 2018-02-24 00:41, Peter Zijlstra wrote:
> > On Fri, Feb 23, 2018 at 04:19:38PM -0800, Saravana Kannan wrote:
> > > Some PMU events can be read from any CPU. So allow the PMU to mark
> > > events as such. For these events, we don't need to reject reads or
> > > make smp calls to the event's CPU and cause unnecessary wake ups.
> > > 
> > > Good examples of such events would be events from caches shared across
> > > all CPUs.
> > 
> > So why would the existing ACTIVE_PKG not work for you? Because clearly
> > your example does not cross a package.
> 
> Because based on testing it on hardware, it looks like the two clusters in
> an ARM DynamIQ design are not considered part of the same "package". 

I don't think we should consider the topology masks at all for system
PMU affinity, due to the number of ways these can be integrated and the
lack of a standard(ish) topology across arm platforms.

IIUC, there's ongoing work to try to clean that up, but that won't give
us anything meaningful for PMU affinity.

If we need a mask, that should be something the FW description of the
PMU provides, and the PMU driver provides to the core code.

Thanks,
Mark.

* Re: [PATCH v1 2/2] perf/core: Add support for PMUs that can be read from any CPU
  2018-02-27 11:43         ` Mark Rutland
@ 2018-02-27 23:15           ` skannan at codeaurora.org
  -1 siblings, 0 replies; 26+ messages in thread
From: skannan @ 2018-02-27 23:15 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, avilaj,
	linux-arm-kernel, linux-kernel

On 2018-02-27 03:43, Mark Rutland wrote:
> On Mon, Feb 26, 2018 at 06:11:45PM -0800, skannan@codeaurora.org wrote:
>> On 2018-02-25 06:38, Mark Rutland wrote:
>> > On Fri, Feb 23, 2018 at 04:19:38PM -0800, Saravana Kannan wrote:
>> > > Some PMU events can be read from any CPU. So allow the PMU to mark
>> > > events as such. For these events, we don't need to reject reads or
>> > > make smp calls to the event's CPU and cause unnecessary wake ups.
>> > >
>> > > Good examples of such events would be events from caches shared across
>> > > all CPUs.
>> >
>> > I think that if we need to generalize PERF_EV_CAP_READ_ACTIVE_PKG, it would be
>> > better to give events a pointer to a cpumask. That could then cover all cases
>> > quite trivially:
>> >
>> > static int __perf_event_read_cpu(struct perf_event *event, int event_cpu)
>> > {
>> > 	int local_cpu = smp_processor_id();
>> >
>> > 	if (event->read_mask &&
>> > 	    cpumask_test_cpu(local_cpu, event->read_mask))
>> > 		event_cpu = local_cpu;
>> >
>> > 	return event_cpu;
>> > }
>> 
>> This is a good improvement on my attempt. If I send a patch for this, is
>> that something you'd be willing to incorporate into your patch set and make
>> sure the DSU pmu driver handles it correctly?
> 
> As I commented, I don't think that will work without more invasive
> changes, as the DSU PMU's pmu::read() function has side effects on
> hwc->prev_count and event->count, and could race with an IRQ handler on
> another CPU.
> 
> Is the IPI really a problem in practice?
> 

There are a bunch of cases, but the simplest one is that if you try to 
collect DSU stats (for analysis) while measuring power, it completely 
messes up the power measurements.

Thanks,
Saravana

* Re: [PATCH v1 2/2] perf/core: Add support for PMUs that can be read from any CPU
  2018-02-27  1:53       ` skannan at codeaurora.org
@ 2018-03-03 15:41         ` Peter Zijlstra
  -1 siblings, 0 replies; 26+ messages in thread
From: Peter Zijlstra @ 2018-03-03 15:41 UTC (permalink / raw)
  To: skannan
  Cc: mark.rutland, Ingo Molnar, Arnaldo Carvalho de Melo,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, avilaj,
	linux-arm-kernel, linux-kernel

On Mon, Feb 26, 2018 at 05:53:57PM -0800, skannan@codeaurora.org wrote:
> On 2018-02-24 00:41, Peter Zijlstra wrote:
> > On Fri, Feb 23, 2018 at 04:19:38PM -0800, Saravana Kannan wrote:
> > > Some PMU events can be read from any CPU. So allow the PMU to mark
> > > events as such. For these events, we don't need to reject reads or
> > > make smp calls to the event's CPU and cause unnecessary wake ups.
> > > 
> > > Good examples of such events would be events from caches shared across
> > > all CPUs.
> > 
> > So why would the existing ACTIVE_PKG not work for you? Because clearly
> > your example does not cross a package.
> 
> Because based on testing it on hardware, it looks like the two clusters in
> an ARM DynamIQ design are not considered part of the same "package". When I
> say clusters, I'm using the more common interpretation of "homogeneous CPUs
> running on the same clock"/CPUs in a cpufreq policy and not ARM's new
> redefinition of cluster. So, on a SoC with 4 little and 4 big cores, it'll
> still trigger a lot of unnecessary smp calls/IPIs that cause unnecessary
> wakeups.

arch/arm64/include/asm/topology.h:#define topology_physical_package_id(cpu)     (cpu_topology[cpu].cluster_id)

*sigh*... that's just broken...
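
A small userspace model of why ACTIVE_PKG does not help with that
definition. This is only a sketch: the 4 little + 4 big layout and the
toy_* names are hypothetical, and toy_package_id() simply mirrors the
macro quoted above by returning the cluster ID.

```c
/* Hypothetical 4 little + 4 big layout: CPUs 0-3 in cluster 0, 4-7 in cluster 1. */
static const int toy_cluster_id[8] = { 0, 0, 0, 0, 1, 1, 1, 1 };

/* Mirrors the quoted arm64 macro: the "package" is just the cluster. */
int toy_package_id(int cpu)
{
	return toy_cluster_id[cpu];
}

/* ACTIVE_PKG-style check: read locally only if both CPUs share a package. */
int toy_can_read_locally(int local_cpu, int event_cpu)
{
	return toy_package_id(local_cpu) == toy_package_id(event_cpu);
}
```

In this model CPU 1 can read an event bound to CPU 3 directly, but a
read from CPU 1 of an event bound to CPU 5 still needs a cross-CPU
call, even though both sit behind the same shared cache.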

* Re: [PATCH v1 2/2] perf/core: Add support for PMUs that can be read from any CPU
  2018-03-03 15:41         ` Peter Zijlstra
@ 2018-03-07 16:39           ` Jeremy Linton
  -1 siblings, 0 replies; 26+ messages in thread
From: Jeremy Linton @ 2018-03-07 16:39 UTC (permalink / raw)
  To: Peter Zijlstra, skannan
  Cc: mark.rutland, Ingo Molnar, Arnaldo Carvalho de Melo,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, avilaj,
	linux-arm-kernel, linux-kernel

Hi,

On 03/03/2018 09:41 AM, Peter Zijlstra wrote:
> On Mon, Feb 26, 2018 at 05:53:57PM -0800, skannan@codeaurora.org wrote:
>> On 2018-02-24 00:41, Peter Zijlstra wrote:
>>> On Fri, Feb 23, 2018 at 04:19:38PM -0800, Saravana Kannan wrote:
>>>> Some PMU events can be read from any CPU. So allow the PMU to mark
>>>> events as such. For these events, we don't need to reject reads or
>>>> make smp calls to the event's CPU and cause unnecessary wake ups.
>>>>
>>>> Good examples of such events would be events from caches shared across
>>>> all CPUs.
>>>
>>> So why would the existing ACTIVE_PKG not work for you? Because clearly
>>> your example does not cross a package.
>>
>> Because based on testing it on hardware, it looks like the two clusters in
>> an ARM DynamIQ design are not considered part of the same "package". When I
>> say clusters, I'm using the more common interpretation of "homogeneous CPUs
>> running on the same clock"/CPUs in a cpufreq policy and not ARM's new
>> redefinition of cluster. So, on a SoC with 4 little and 4 big cores, it'll
>> still trigger a lot of unnecessary smp calls/IPIs that cause unnecessary
>> wakeups.
> 
> arch/arm64/include/asm/topology.h:#define topology_physical_package_id(cpu)     (cpu_topology[cpu].cluster_id)
> 
> *sigh*... that's just broken...
> 

It's being reworked in the PPTT (currently v7) patch set. For ACPI 
systems (and hopefully DT machines with the package property set), 
topology_physical_package and core siblings represent the socket as one 
would expect.

Thanks,

Thread overview: 26+ messages
2018-02-24  0:19 [PATCH v1 1/2] perf/core: Add API to look up PMU type by name Saravana Kannan
2018-02-24  0:19 ` Saravana Kannan
2018-02-24  0:19 ` [PATCH v1 2/2] perf/core: Add support for PMUs that can be read from any CPU Saravana Kannan
2018-02-24  0:19   ` Saravana Kannan
2018-02-24  0:56   ` Saravana Kannan
2018-02-24  0:56     ` Saravana Kannan
2018-02-24  8:41   ` Peter Zijlstra
2018-02-24  8:41     ` Peter Zijlstra
2018-02-27  1:53     ` skannan
2018-02-27  1:53       ` skannan at codeaurora.org
2018-02-27 11:52       ` Mark Rutland
2018-02-27 11:52         ` Mark Rutland
2018-03-03 15:41       ` Peter Zijlstra
2018-03-03 15:41         ` Peter Zijlstra
2018-03-07 16:39         ` Jeremy Linton
2018-03-07 16:39           ` Jeremy Linton
2018-02-25 14:38   ` Mark Rutland
2018-02-25 14:38     ` Mark Rutland
2018-02-27  2:11     ` skannan
2018-02-27  2:11       ` skannan at codeaurora.org
2018-02-27 11:43       ` Mark Rutland
2018-02-27 11:43         ` Mark Rutland
2018-02-27 23:15         ` skannan
2018-02-27 23:15           ` skannan at codeaurora.org
2018-02-24  8:08 ` [PATCH v1 1/2] perf/core: Add API to look up PMU type by name Peter Zijlstra
2018-02-24  8:08   ` Peter Zijlstra