linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 1/2] perf, x86: Add basic Ivy Bridge support v2
@ 2012-06-07 23:18 Andi Kleen
  2012-06-07 23:18 ` [PATCH 2/2] perf, x86: Enable PDIR precise instruction profiling on IvyBridge Andi Kleen
  2012-06-08  8:45 ` [PATCH 1/2] perf, x86: Add basic Ivy Bridge support v2 Ingo Molnar
  0 siblings, 2 replies; 5+ messages in thread
From: Andi Kleen @ 2012-06-07 23:18 UTC (permalink / raw)
  To: linux-kernel; +Cc: peterz, mingo, eranian, Andi Kleen

From: Andi Kleen <ak@linux.intel.com>

Very similar to Sandy Bridge, but there is no PEBS problem.

As Stephane pointed out .code=0xb1, .umask=0x01 is gone, so don't
do a generic backend stall event on IvyBridge.

v2: Remove stall event
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 arch/x86/kernel/cpu/perf_event_intel.c |   19 ++++++++++++++-----
 1 files changed, 14 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event_intel.c b/arch/x86/kernel/cpu/perf_event_intel.c
index 166546e..0f58590 100644
--- a/arch/x86/kernel/cpu/perf_event_intel.c
+++ b/arch/x86/kernel/cpu/perf_event_intel.c
@@ -1698,6 +1698,7 @@ __init int intel_pmu_init(void)
 	union cpuid10_ebx ebx;
 	unsigned int unused;
 	int version;
+	char *name;
 
 	if (!cpu_has(&boot_cpu_data, X86_FEATURE_ARCH_PERFMON)) {
 		switch (boot_cpu_data.x86) {
@@ -1839,9 +1840,21 @@ __init int intel_pmu_init(void)
 		pr_cont("Westmere events, ");
 		break;
 
+	case 58: /* IvyBridge */
+		name = "Ivy";
+		/* No backend stall event */
+		goto snb_ivb_common;
+
 	case 42: /* SandyBridge */
 		x86_add_quirk(intel_sandybridge_quirk);
 	case 45: /* SandyBridge, "Romely-EP" */
+		name = "Sandy";
+
+		/* UOPS_DISPATCHED.THREAD,c=1,i=1 to count stall cycles*/
+		intel_perfmon_event_map[PERF_COUNT_HW_STALLED_CYCLES_BACKEND] =
+			X86_CONFIG(.event=0xb1, .umask=0x01, .inv=1, .cmask=1);
+
+	snb_ivb_common:
 		memcpy(hw_cache_event_ids, snb_hw_cache_event_ids,
 		       sizeof(hw_cache_event_ids));
 
@@ -1857,11 +1870,7 @@ __init int intel_pmu_init(void)
 		/* UOPS_ISSUED.ANY,c=1,i=1 to count stall cycles */
 		intel_perfmon_event_map[PERF_COUNT_HW_STALLED_CYCLES_FRONTEND] =
 			X86_CONFIG(.event=0x0e, .umask=0x01, .inv=1, .cmask=1);
-		/* UOPS_DISPATCHED.THREAD,c=1,i=1 to count stall cycles*/
-		intel_perfmon_event_map[PERF_COUNT_HW_STALLED_CYCLES_BACKEND] =
-			X86_CONFIG(.event=0xb1, .umask=0x01, .inv=1, .cmask=1);
-
-		pr_cont("SandyBridge events, ");
+		pr_cont("%sBridge events, ", name);
 		break;
 
 	default:
-- 
1.7.7.6


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH 2/2] perf, x86: Enable PDIR precise instruction profiling on IvyBridge
  2012-06-07 23:18 [PATCH 1/2] perf, x86: Add basic Ivy Bridge support v2 Andi Kleen
@ 2012-06-07 23:18 ` Andi Kleen
  2012-06-08  8:45 ` [PATCH 1/2] perf, x86: Add basic Ivy Bridge support v2 Ingo Molnar
  1 sibling, 0 replies; 5+ messages in thread
From: Andi Kleen @ 2012-06-07 23:18 UTC (permalink / raw)
  To: linux-kernel; +Cc: peterz, mingo, eranian, Andi Kleen

From: Andi Kleen <ak@linux.intel.com>

Even with precise profiling Intel CPUs have a "skid". The sample
triggers a few cycles later than the instruction, so in some
cases there can be systematic errors where expensive instructions never
show up in the profile log.

Sandy Bridge added a new PDIR instruction retired event that randomizes
the sampling slightly. This corrects for systematic errors, so that
you should in most cases see the correct instruction getting profile hits.

Unfortunately the SandyBridge version could only work with a otherwise
quiescent CPU and was difficult to use. But now on IvyBridge this
restriction is gone and can be more widely used.

This only works for retired instructions.

I enabled it -- somewhat arbitarily -- for two 'p's or more.

To use it

perf record -e instructions:pp ...

This provides a more precise alternative to the usual cycles:pp,
however it will not account for expensive instructions.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 arch/x86/kernel/cpu/perf_event_intel.c |   23 +++++++++++++++++++++++
 1 files changed, 23 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event_intel.c b/arch/x86/kernel/cpu/perf_event_intel.c
index 0f58590..69336f3 100644
--- a/arch/x86/kernel/cpu/perf_event_intel.c
+++ b/arch/x86/kernel/cpu/perf_event_intel.c
@@ -1357,6 +1357,28 @@ static int intel_pmu_hw_config(struct perf_event *event)
 	return 0;
 }
 
+static int pdir_hw_config(struct perf_event *event)
+{
+	int err = intel_pmu_hw_config(event);
+
+	if (err)
+		return err;
+
+	/* 
+	 * Use the PDIR instruction retired counter for two 'p's.
+	 * This will randomize samples slightly and avoid some systematic 
+	 * measurement errors.
+	 * Only works for retired cycles.
+	 */
+	if (event->attr.precise_ip >= 2 &&
+	    (event->hw.config & X86_RAW_EVENT_MASK) == 0xc0) {
+		u64 pdir_event = X86_CONFIG(.event=0xc0, .umask=1);
+		event->hw.config = pdir_event | (event->hw.config & ~X86_RAW_EVENT_MASK);
+	}
+
+	return 0;
+}
+
 struct perf_guest_switch_msr *perf_guest_get_msrs(int *nr)
 {
 	if (x86_pmu.guest_get_msrs)
@@ -1843,6 +1865,7 @@ __init int intel_pmu_init(void)
 	case 58: /* IvyBridge */
 		name = "Ivy";
 		/* No backend stall event */
+		x86_pmu.hw_config = pdir_hw_config;
 		goto snb_ivb_common;
 
 	case 42: /* SandyBridge */
-- 
1.7.7.6


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH 1/2] perf, x86: Add basic Ivy Bridge support v2
  2012-06-07 23:18 [PATCH 1/2] perf, x86: Add basic Ivy Bridge support v2 Andi Kleen
  2012-06-07 23:18 ` [PATCH 2/2] perf, x86: Enable PDIR precise instruction profiling on IvyBridge Andi Kleen
@ 2012-06-08  8:45 ` Ingo Molnar
  1 sibling, 0 replies; 5+ messages in thread
From: Ingo Molnar @ 2012-06-08  8:45 UTC (permalink / raw)
  To: Andi Kleen; +Cc: linux-kernel, peterz, mingo, eranian, Andi Kleen


* Andi Kleen <andi@firstfloor.org> wrote:

> From: Andi Kleen <ak@linux.intel.com>
> 
> Very similar to Sandy Bridge, but there is no PEBS problem.
> 
> As Stephane pointed out .code=0xb1, .umask=0x01 is gone, so don't
> do a generic backend stall event on IvyBridge.
> 
> v2: Remove stall event
> Signed-off-by: Andi Kleen <ak@linux.intel.com>
> ---
>  arch/x86/kernel/cpu/perf_event_intel.c |   19 ++++++++++++++-----
>  1 files changed, 14 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/x86/kernel/cpu/perf_event_intel.c b/arch/x86/kernel/cpu/perf_event_intel.c
> index 166546e..0f58590 100644
> --- a/arch/x86/kernel/cpu/perf_event_intel.c
> +++ b/arch/x86/kernel/cpu/perf_event_intel.c
> @@ -1698,6 +1698,7 @@ __init int intel_pmu_init(void)
>  	union cpuid10_ebx ebx;
>  	unsigned int unused;
>  	int version;
> +	char *name;
>  
>  	if (!cpu_has(&boot_cpu_data, X86_FEATURE_ARCH_PERFMON)) {
>  		switch (boot_cpu_data.x86) {
> @@ -1839,9 +1840,21 @@ __init int intel_pmu_init(void)
>  		pr_cont("Westmere events, ");
>  		break;
>  
> +	case 58: /* IvyBridge */
> +		name = "Ivy";
> +		/* No backend stall event */
> +		goto snb_ivb_common;
> +
>  	case 42: /* SandyBridge */
>  		x86_add_quirk(intel_sandybridge_quirk);
>  	case 45: /* SandyBridge, "Romely-EP" */
> +		name = "Sandy";
> +
> +		/* UOPS_DISPATCHED.THREAD,c=1,i=1 to count stall cycles*/
> +		intel_perfmon_event_map[PERF_COUNT_HW_STALLED_CYCLES_BACKEND] =
> +			X86_CONFIG(.event=0xb1, .umask=0x01, .inv=1, .cmask=1);
> +
> +	snb_ivb_common:
>  		memcpy(hw_cache_event_ids, snb_hw_cache_event_ids,
>  		       sizeof(hw_cache_event_ids));
>  
> @@ -1857,11 +1870,7 @@ __init int intel_pmu_init(void)
>  		/* UOPS_ISSUED.ANY,c=1,i=1 to count stall cycles */
>  		intel_perfmon_event_map[PERF_COUNT_HW_STALLED_CYCLES_FRONTEND] =
>  			X86_CONFIG(.event=0x0e, .umask=0x01, .inv=1, .cmask=1);
> -		/* UOPS_DISPATCHED.THREAD,c=1,i=1 to count stall cycles*/
> -		intel_perfmon_event_map[PERF_COUNT_HW_STALLED_CYCLES_BACKEND] =
> -			X86_CONFIG(.event=0xb1, .umask=0x01, .inv=1, .cmask=1);
> -
> -		pr_cont("SandyBridge events, ");
> +		pr_cont("%sBridge events, ", name);
>  		break;

First round review feedback: your patch does not apply to the 
perf development/fixes tree (-tip), please send one that will.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH 1/2] perf, x86: Add basic Ivy Bridge support v2
  2012-06-08 21:45 ` [PATCH 1/2] perf, x86: Add basic Ivy Bridge support v2 Andi Kleen
@ 2012-06-11  8:16   ` Ingo Molnar
  0 siblings, 0 replies; 5+ messages in thread
From: Ingo Molnar @ 2012-06-11  8:16 UTC (permalink / raw)
  To: Andi Kleen
  Cc: eranian, linux-kernel, Andi Kleen, Peter Zijlstra, Thomas Gleixner


* Andi Kleen <andi@firstfloor.org> wrote:

> +	char *name;
> +		name = "Ivy";
> +		name = "Sandy";
> -		pr_cont("SandyBridge events, ");
> +		pr_cont("%sBridge events, ", name);

Second round review feedback: you'll have to remove this ugly 
obfuscation, the kernel is already complex as-is. In performance 
insensitive code paths we want straight, obvious code without 
tricks.

Please resubmit the patches with this fixed.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH 1/2] perf, x86: Add basic Ivy Bridge support v2
  2012-06-08 21:45 Updated Ivybridge perf patchkit for tip perf/core Andi Kleen
@ 2012-06-08 21:45 ` Andi Kleen
  2012-06-11  8:16   ` Ingo Molnar
  0 siblings, 1 reply; 5+ messages in thread
From: Andi Kleen @ 2012-06-08 21:45 UTC (permalink / raw)
  To: mingo; +Cc: eranian, linux-kernel, Andi Kleen

From: Andi Kleen <ak@linux.intel.com>

Very similar to Sandy Bridge, but there is no PEBS problem.

As Stephane pointed out .code=0xb1, .umask=0x01 is gone, so don't
do a generic backend stall event on IvyBridge.

v2: Remove stall event
Signed-off-by: Andi Kleen <ak@linux.intel.com>

Conflicts:

	arch/x86/kernel/cpu/perf_event_intel.c
---
 arch/x86/kernel/cpu/perf_event_intel.c |   16 ++++++++++++++--
 1 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event_intel.c b/arch/x86/kernel/cpu/perf_event_intel.c
index e23e71f..4d6188d 100644
--- a/arch/x86/kernel/cpu/perf_event_intel.c
+++ b/arch/x86/kernel/cpu/perf_event_intel.c
@@ -1767,6 +1767,7 @@ __init int intel_pmu_init(void)
 	union cpuid10_ebx ebx;
 	unsigned int unused;
 	int version;
+	char *name;
 
 	if (!cpu_has(&boot_cpu_data, X86_FEATURE_ARCH_PERFMON)) {
 		switch (boot_cpu_data.x86) {
@@ -1910,10 +1911,21 @@ __init int intel_pmu_init(void)
 		pr_cont("Westmere events, ");
 		break;
 
+	case 58: /* IvyBridge */
+		name = "Ivy";
+		/* No backend stall event */
+		goto snb_ivb_common;
+
 	case 42: /* SandyBridge */
 	case 45: /* SandyBridge, "Romely-EP" */
 		x86_add_quirk(intel_sandybridge_quirk);
-	case 58: /* IvyBridge */
+		name = "Sandy";
+
+		/* UOPS_DISPATCHED.THREAD,c=1,i=1 to count stall cycles*/
+		intel_perfmon_event_map[PERF_COUNT_HW_STALLED_CYCLES_BACKEND] =
+			X86_CONFIG(.event=0xb1, .umask=0x01, .inv=1, .cmask=1);
+
+	snb_ivb_common:
 		memcpy(hw_cache_event_ids, snb_hw_cache_event_ids,
 		       sizeof(hw_cache_event_ids));
 
@@ -1934,7 +1946,7 @@ __init int intel_pmu_init(void)
 		intel_perfmon_event_map[PERF_COUNT_HW_STALLED_CYCLES_BACKEND] =
 			X86_CONFIG(.event=0xb1, .umask=0x01, .inv=1, .cmask=1);
 
-		pr_cont("SandyBridge events, ");
+		pr_cont("%sBridge events, ", name);
 		break;
 
 	default:
-- 
1.7.7.6


^ permalink raw reply related	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2012-06-11  8:16 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-06-07 23:18 [PATCH 1/2] perf, x86: Add basic Ivy Bridge support v2 Andi Kleen
2012-06-07 23:18 ` [PATCH 2/2] perf, x86: Enable PDIR precise instruction profiling on IvyBridge Andi Kleen
2012-06-08  8:45 ` [PATCH 1/2] perf, x86: Add basic Ivy Bridge support v2 Ingo Molnar
2012-06-08 21:45 Updated Ivybridge perf patchkit for tip perf/core Andi Kleen
2012-06-08 21:45 ` [PATCH 1/2] perf, x86: Add basic Ivy Bridge support v2 Andi Kleen
2012-06-11  8:16   ` Ingo Molnar

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).