linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH -tip/perf/core] perf, x86: P4 PMU - Introduce event alias feature
@ 2011-07-08 20:17 Cyrill Gorcunov
  2011-07-14 18:46 ` Don Zickus
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Cyrill Gorcunov @ 2011-07-08 20:17 UTC (permalink / raw)
  To: Ingo Molnar, Don Zickus
  Cc: LKML, Peter Zijlstra, Stephane Eranian, Lin Ming,
	Arnaldo Carvalho de Melo, Frederic Weisbecker, Steven Rostedt

Instead of hw_nmi_watchdog_set_attr() weak function
and appropriate x86_pmu::hw_watchdog_set_attr() call
we introduce even alias mechanism which allow us
to drop this routines completely and isolate quirks
of Netburst architecture inside P4 PMU code only.

The main idea remains the same though -- to allow
nmi-watchdog and perf top run simultaneously.

Note the aliasing mechanism applies to generic
PERF_COUNT_HW_CPU_CYCLES event only because arbitrary
event (say passed as RAW initially) might have some
additional bits set inside ESCR register changing
the behaviour of event and we can't guarantee anymore
that alias event will give the same result.

P.S. Thanks a huge to Don and Steven for for testing
     and early review.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
CC: Don Zickus <dzickus@redhat.com>
CC: Steven Rostedt <rostedt@goodmis.org>
CC: Ingo Molnar <mingo@elte.hu>
CC: Peter Zijlstra <a.p.zijlstra@chello.nl>
CC: Stephane Eranian <eranian@google.com>
CC: Lin Ming <ming.m.lin@intel.com>
CC: Arnaldo Carvalho de Melo <acme@redhat.com>
CC: Frederic Weisbecker <fweisbec@gmail.com>
---

Don, seems I managed to finish cooking patch earlier,
so please give it a final pass when you get some time for.
I hope I didn't screw anything ;)

As always additional review and complains are welcome.

 arch/x86/include/asm/perf_event_p4.h |   33 ++++++++
 arch/x86/kernel/cpu/perf_event.c     |    7 -
 arch/x86/kernel/cpu/perf_event_p4.c  |  135 +++++++++++++++++++++++++++--------
 kernel/watchdog.c                    |    2 
 4 files changed, 139 insertions(+), 38 deletions(-)

Index: linux-2.6.git/arch/x86/include/asm/perf_event_p4.h
===================================================================
--- linux-2.6.git.orig/arch/x86/include/asm/perf_event_p4.h
+++ linux-2.6.git/arch/x86/include/asm/perf_event_p4.h
@@ -102,6 +102,14 @@
 #define P4_CONFIG_HT			(1ULL << P4_CONFIG_HT_SHIFT)
 
 /*
+ * If an event has alias it should be marked
+ * with a special bit. (Don't forget to check
+ * P4_PEBS_CONFIG_MASK and related bits on
+ * modification.)
+ */
+#define P4_CONFIG_ALIASABLE		(1 << 9)
+
+/*
  * The bits we allow to pass for RAW events
  */
 #define P4_CONFIG_MASK_ESCR		\
@@ -123,6 +131,31 @@
 	(p4_config_pack_escr(P4_CONFIG_MASK_ESCR))	| \
 	(p4_config_pack_cccr(P4_CONFIG_MASK_CCCR))
 
+/*
+ * In case of event aliasing we need to preserve some
+ * caller bits otherwise the mapping won't be complete.
+ */
+#define P4_CONFIG_EVENT_ALIAS_MASK			  \
+	(p4_config_pack_escr(P4_CONFIG_MASK_ESCR)	| \
+	 p4_config_pack_cccr(P4_CCCR_EDGE		| \
+			     P4_CCCR_THRESHOLD_MASK	| \
+			     P4_CCCR_COMPLEMENT		| \
+			     P4_CCCR_COMPARE))
+
+#define  P4_CONFIG_EVENT_ALIAS_IMMUTABLE_BITS		  \
+	((P4_CONFIG_HT)					| \
+	 p4_config_pack_escr(P4_ESCR_T0_OS		| \
+			     P4_ESCR_T0_USR		| \
+			     P4_ESCR_T1_OS		| \
+			     P4_ESCR_T1_USR)		| \
+	 p4_config_pack_cccr(P4_CCCR_OVF		| \
+			     P4_CCCR_CASCADE		| \
+			     P4_CCCR_FORCE_OVF		| \
+			     P4_CCCR_THREAD_ANY		| \
+			     P4_CCCR_OVF_PMI_T0		| \
+			     P4_CCCR_OVF_PMI_T1		| \
+			     P4_CONFIG_ALIASABLE))
+
 static inline bool p4_is_event_cascaded(u64 config)
 {
 	u32 cccr = p4_config_unpack_cccr(config);
Index: linux-2.6.git/arch/x86/kernel/cpu/perf_event.c
===================================================================
--- linux-2.6.git.orig/arch/x86/kernel/cpu/perf_event.c
+++ linux-2.6.git/arch/x86/kernel/cpu/perf_event.c
@@ -274,7 +274,6 @@ struct x86_pmu {
 	void		(*enable_all)(int added);
 	void		(*enable)(struct perf_event *);
 	void		(*disable)(struct perf_event *);
-	void		(*hw_watchdog_set_attr)(struct perf_event_attr *attr);
 	int		(*hw_config)(struct perf_event *event);
 	int		(*schedule_events)(struct cpu_hw_events *cpuc, int n, int *assign);
 	unsigned	eventsel;
@@ -360,12 +359,6 @@ static u64 __read_mostly hw_cache_extra_
 				[PERF_COUNT_HW_CACHE_OP_MAX]
 				[PERF_COUNT_HW_CACHE_RESULT_MAX];
 
-void hw_nmi_watchdog_set_attr(struct perf_event_attr *wd_attr)
-{
-	if (x86_pmu.hw_watchdog_set_attr)
-		x86_pmu.hw_watchdog_set_attr(wd_attr);
-}
-
 /*
  * Propagate event elapsed time into the generic event.
  * Can only be executed on the CPU where the event is active.
Index: linux-2.6.git/arch/x86/kernel/cpu/perf_event_p4.c
===================================================================
--- linux-2.6.git.orig/arch/x86/kernel/cpu/perf_event_p4.c
+++ linux-2.6.git/arch/x86/kernel/cpu/perf_event_p4.c
@@ -570,11 +570,92 @@ static __initconst const u64 p4_hw_cache
  },
 };
 
+/*
+ * Because of Netburst being quite restricted in now
+ * many same events can run simultaneously, we use
+ * event aliases, ie different events which have the
+ * same functionallity but use non-intersected resources
+ * (ESCR/CCCR/couter registers). This allow us to run
+ * two or more semi-same events together. It is done
+ * transparently to a user space.
+ *
+ * Never set any cusom internal bits such as P4_CONFIG_HT,
+ * P4_CONFIG_ALIASABLE or bits for P4_PEBS_METRIC, they are
+ * either up-to-dated automatically either not appliable
+ * at all.
+ *
+ * And be really carefull choosing aliases!
+ */
+struct p4_event_alias {
+	u64 orig;
+	u64 alter;
+} p4_event_aliases[] = {
+	{
+		/*
+		 * Non-halted cycles can be substituted with
+		 * non-sleeping cycles (see Intel SDM Vol3b for
+		 * details).
+		 */
+	.orig	=
+		p4_config_pack_escr(P4_ESCR_EVENT(P4_EVENT_GLOBAL_POWER_EVENTS)		|
+				    P4_ESCR_EMASK_BIT(P4_EVENT_GLOBAL_POWER_EVENTS, RUNNING)),
+	.alter	=
+		p4_config_pack_escr(P4_ESCR_EVENT(P4_EVENT_EXECUTION_EVENT)		|
+				    P4_ESCR_EMASK_BIT(P4_EVENT_EXECUTION_EVENT, NBOGUS0)|
+				    P4_ESCR_EMASK_BIT(P4_EVENT_EXECUTION_EVENT, NBOGUS1)|
+				    P4_ESCR_EMASK_BIT(P4_EVENT_EXECUTION_EVENT, NBOGUS2)|
+				    P4_ESCR_EMASK_BIT(P4_EVENT_EXECUTION_EVENT, NBOGUS3)|
+				    P4_ESCR_EMASK_BIT(P4_EVENT_EXECUTION_EVENT, BOGUS0)	|
+				    P4_ESCR_EMASK_BIT(P4_EVENT_EXECUTION_EVENT, BOGUS1)	|
+				    P4_ESCR_EMASK_BIT(P4_EVENT_EXECUTION_EVENT, BOGUS2)	|
+				    P4_ESCR_EMASK_BIT(P4_EVENT_EXECUTION_EVENT, BOGUS3))|
+		p4_config_pack_cccr(P4_CCCR_THRESHOLD(15) | P4_CCCR_COMPLEMENT		|
+				    P4_CCCR_COMPARE),
+ 	},
+};
+
+static u64 p4_get_alias_event(u64 config)
+{
+	u64 config_match;
+	int i;
+
+	/*
+	 * Probably we're lucky and don't have to do
+	 * matching over all config bits.
+	 */
+	if (!(config & P4_CONFIG_ALIASABLE))
+		return 0;
+
+	config_match = config & P4_CONFIG_EVENT_ALIAS_MASK;
+
+	/*
+	 * If an event was previously swapped to the alter config
+	 * we should swap it back otherwise contnention on registers
+	 * will return back.
+	 */
+	for (i = 0; i < ARRAY_SIZE(p4_event_aliases); i++) {
+		if (config_match == p4_event_aliases[i].orig) {
+			config_match = p4_event_aliases[i].alter;
+			break;
+		} else if (config_match == p4_event_aliases[i].alter) {
+			config_match = p4_event_aliases[i].orig;
+			break;
+		}
+	}
+
+	if (i >= ARRAY_SIZE(p4_event_aliases))
+		return 0;
+
+	return (config_match |
+		(config & P4_CONFIG_EVENT_ALIAS_IMMUTABLE_BITS));
+}
+
 static u64 p4_general_events[PERF_COUNT_HW_MAX] = {
   /* non-halted CPU clocks */
   [PERF_COUNT_HW_CPU_CYCLES] =
 	p4_config_pack_escr(P4_ESCR_EVENT(P4_EVENT_GLOBAL_POWER_EVENTS)		|
-		P4_ESCR_EMASK_BIT(P4_EVENT_GLOBAL_POWER_EVENTS, RUNNING)),
+		P4_ESCR_EMASK_BIT(P4_EVENT_GLOBAL_POWER_EVENTS, RUNNING))	|
+		P4_CONFIG_ALIASABLE,
 
   /*
    * retired instructions
@@ -719,31 +800,6 @@ static int p4_validate_raw_event(struct 
 	return 0;
 }
 
-static void p4_hw_watchdog_set_attr(struct perf_event_attr *wd_attr)
-{
-	/*
-	 * Watchdog ticks are special on Netburst, we use
-	 * that named "non-sleeping" ticks as recommended
-	 * by Intel SDM Vol3b.
-	 */
-	WARN_ON_ONCE(wd_attr->type	!= PERF_TYPE_HARDWARE ||
-		     wd_attr->config	!= PERF_COUNT_HW_CPU_CYCLES);
-
-	wd_attr->type	= PERF_TYPE_RAW;
-	wd_attr->config	=
-		p4_config_pack_escr(P4_ESCR_EVENT(P4_EVENT_EXECUTION_EVENT)		|
-			P4_ESCR_EMASK_BIT(P4_EVENT_EXECUTION_EVENT, NBOGUS0)		|
-			P4_ESCR_EMASK_BIT(P4_EVENT_EXECUTION_EVENT, NBOGUS1)		|
-			P4_ESCR_EMASK_BIT(P4_EVENT_EXECUTION_EVENT, NBOGUS2)		|
-			P4_ESCR_EMASK_BIT(P4_EVENT_EXECUTION_EVENT, NBOGUS3)		|
-			P4_ESCR_EMASK_BIT(P4_EVENT_EXECUTION_EVENT, BOGUS0)		|
-			P4_ESCR_EMASK_BIT(P4_EVENT_EXECUTION_EVENT, BOGUS1)		|
-			P4_ESCR_EMASK_BIT(P4_EVENT_EXECUTION_EVENT, BOGUS2)		|
-			P4_ESCR_EMASK_BIT(P4_EVENT_EXECUTION_EVENT, BOGUS3))		|
-		p4_config_pack_cccr(P4_CCCR_THRESHOLD(15) | P4_CCCR_COMPLEMENT		|
-			P4_CCCR_COMPARE);
-}
-
 static int p4_hw_config(struct perf_event *event)
 {
 	int cpu = get_cpu();
@@ -1159,6 +1215,8 @@ static int p4_pmu_schedule_events(struct
 	struct p4_event_bind *bind;
 	unsigned int i, thread, num;
 	int cntr_idx, escr_idx;
+	u64 config_alias;
+	int pass;
 
 	bitmap_zero(used_mask, X86_PMC_IDX_MAX);
 	bitmap_zero(escr_mask, P4_ESCR_MSR_TABLE_SIZE);
@@ -1167,6 +1225,17 @@ static int p4_pmu_schedule_events(struct
 
 		hwc = &cpuc->event_list[i]->hw;
 		thread = p4_ht_thread(cpu);
+		pass = 0;
+
+again:
+		/*
+		 * Aliases are swappable so we may hit circular
+		 * lock if both original config and alias need
+		 * resources (MSR registers) which already busy.
+		 */
+		if (pass > 2)
+			goto done;
+
 		bind = p4_config_get_bind(hwc->config);
 		escr_idx = p4_get_escr_idx(bind->escr_msr[thread]);
 		if (unlikely(escr_idx == -1))
@@ -1180,8 +1249,17 @@ static int p4_pmu_schedule_events(struct
 		}
 
 		cntr_idx = p4_next_cntr(thread, used_mask, bind);
-		if (cntr_idx == -1 || test_bit(escr_idx, escr_mask))
-			goto done;
+		if (cntr_idx == -1 || test_bit(escr_idx, escr_mask)) {
+			/*
+			 * Probably an event alias is still available.
+			 */
+			config_alias = p4_get_alias_event(hwc->config);
+			if (!config_alias)
+				goto done;
+			hwc->config = config_alias;
+			pass++;
+			goto again;
+		}
 
 		p4_pmu_swap_config_ts(hwc, cpu);
 		if (assign)
@@ -1218,7 +1296,6 @@ static __initconst const struct x86_pmu 
 	.cntval_bits		= ARCH_P4_CNTRVAL_BITS,
 	.cntval_mask		= ARCH_P4_CNTRVAL_MASK,
 	.max_period		= (1ULL << (ARCH_P4_CNTRVAL_BITS - 1)) - 1,
-	.hw_watchdog_set_attr	= p4_hw_watchdog_set_attr,
 	.hw_config		= p4_hw_config,
 	.schedule_events	= p4_pmu_schedule_events,
 	/*
Index: linux-2.6.git/kernel/watchdog.c
===================================================================
--- linux-2.6.git.orig/kernel/watchdog.c
+++ linux-2.6.git/kernel/watchdog.c
@@ -200,7 +200,6 @@ static int is_softlockup(unsigned long t
 }
 
 #ifdef CONFIG_HARDLOCKUP_DETECTOR
-void __weak hw_nmi_watchdog_set_attr(struct perf_event_attr *wd_attr) { }
 
 static struct perf_event_attr wd_hw_attr = {
 	.type		= PERF_TYPE_HARDWARE,
@@ -372,7 +371,6 @@ static int watchdog_nmi_enable(int cpu)
 
 	wd_attr = &wd_hw_attr;
 	wd_attr->sample_period = hw_nmi_get_sample_period(watchdog_thresh);
-	hw_nmi_watchdog_set_attr(wd_attr);
 
 	/* Try to register using hardware perf events */
 	event = perf_event_create_kernel_counter(wd_attr, cpu, NULL, watchdog_overflow_callback, NULL);

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH -tip/perf/core] perf, x86: P4 PMU - Introduce event alias feature
  2011-07-08 20:17 [PATCH -tip/perf/core] perf, x86: P4 PMU - Introduce event alias feature Cyrill Gorcunov
@ 2011-07-14 18:46 ` Don Zickus
  2011-07-21  6:46 ` Ingo Molnar
  2011-07-21 10:00 ` [tip:perf/core] " tip-bot for Cyrill Gorcunov
  2 siblings, 0 replies; 9+ messages in thread
From: Don Zickus @ 2011-07-14 18:46 UTC (permalink / raw)
  To: Cyrill Gorcunov
  Cc: Ingo Molnar, LKML, Peter Zijlstra, Stephane Eranian, Lin Ming,
	Arnaldo Carvalho de Melo, Frederic Weisbecker, Steven Rostedt

On Sat, Jul 09, 2011 at 12:17:12AM +0400, Cyrill Gorcunov wrote:
> Instead of hw_nmi_watchdog_set_attr() weak function
> and appropriate x86_pmu::hw_watchdog_set_attr() call
> we introduce even alias mechanism which allow us
> to drop this routines completely and isolate quirks
> of Netburst architecture inside P4 PMU code only.

Acked-by: Don Zickus <dzickus@redhat.com>

> 
> The main idea remains the same though -- to allow
> nmi-watchdog and perf top run simultaneously.
> 
> Note the aliasing mechanism applies to generic
> PERF_COUNT_HW_CPU_CYCLES event only because arbitrary
> event (say passed as RAW initially) might have some
> additional bits set inside ESCR register changing
> the behaviour of event and we can't guarantee anymore
> that alias event will give the same result.
> 
> P.S. Thanks a huge to Don and Steven for for testing
>      and early review.
> 
> Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
> CC: Don Zickus <dzickus@redhat.com>
> CC: Steven Rostedt <rostedt@goodmis.org>
> CC: Ingo Molnar <mingo@elte.hu>
> CC: Peter Zijlstra <a.p.zijlstra@chello.nl>
> CC: Stephane Eranian <eranian@google.com>
> CC: Lin Ming <ming.m.lin@intel.com>
> CC: Arnaldo Carvalho de Melo <acme@redhat.com>
> CC: Frederic Weisbecker <fweisbec@gmail.com>
> ---
> 
> Don, seems I managed to finish cooking patch earlier,
> so please give it a final pass when you get some time for.
> I hope I didn't screw anything ;)
> 
> As always additional review and complains are welcome.
> 
>  arch/x86/include/asm/perf_event_p4.h |   33 ++++++++
>  arch/x86/kernel/cpu/perf_event.c     |    7 -
>  arch/x86/kernel/cpu/perf_event_p4.c  |  135 +++++++++++++++++++++++++++--------
>  kernel/watchdog.c                    |    2 
>  4 files changed, 139 insertions(+), 38 deletions(-)
> 
> Index: linux-2.6.git/arch/x86/include/asm/perf_event_p4.h
> ===================================================================
> --- linux-2.6.git.orig/arch/x86/include/asm/perf_event_p4.h
> +++ linux-2.6.git/arch/x86/include/asm/perf_event_p4.h
> @@ -102,6 +102,14 @@
>  #define P4_CONFIG_HT			(1ULL << P4_CONFIG_HT_SHIFT)
>  
>  /*
> + * If an event has alias it should be marked
> + * with a special bit. (Don't forget to check
> + * P4_PEBS_CONFIG_MASK and related bits on
> + * modification.)
> + */
> +#define P4_CONFIG_ALIASABLE		(1 << 9)
> +
> +/*
>   * The bits we allow to pass for RAW events
>   */
>  #define P4_CONFIG_MASK_ESCR		\
> @@ -123,6 +131,31 @@
>  	(p4_config_pack_escr(P4_CONFIG_MASK_ESCR))	| \
>  	(p4_config_pack_cccr(P4_CONFIG_MASK_CCCR))
>  
> +/*
> + * In case of event aliasing we need to preserve some
> + * caller bits otherwise the mapping won't be complete.
> + */
> +#define P4_CONFIG_EVENT_ALIAS_MASK			  \
> +	(p4_config_pack_escr(P4_CONFIG_MASK_ESCR)	| \
> +	 p4_config_pack_cccr(P4_CCCR_EDGE		| \
> +			     P4_CCCR_THRESHOLD_MASK	| \
> +			     P4_CCCR_COMPLEMENT		| \
> +			     P4_CCCR_COMPARE))
> +
> +#define  P4_CONFIG_EVENT_ALIAS_IMMUTABLE_BITS		  \
> +	((P4_CONFIG_HT)					| \
> +	 p4_config_pack_escr(P4_ESCR_T0_OS		| \
> +			     P4_ESCR_T0_USR		| \
> +			     P4_ESCR_T1_OS		| \
> +			     P4_ESCR_T1_USR)		| \
> +	 p4_config_pack_cccr(P4_CCCR_OVF		| \
> +			     P4_CCCR_CASCADE		| \
> +			     P4_CCCR_FORCE_OVF		| \
> +			     P4_CCCR_THREAD_ANY		| \
> +			     P4_CCCR_OVF_PMI_T0		| \
> +			     P4_CCCR_OVF_PMI_T1		| \
> +			     P4_CONFIG_ALIASABLE))
> +
>  static inline bool p4_is_event_cascaded(u64 config)
>  {
>  	u32 cccr = p4_config_unpack_cccr(config);
> Index: linux-2.6.git/arch/x86/kernel/cpu/perf_event.c
> ===================================================================
> --- linux-2.6.git.orig/arch/x86/kernel/cpu/perf_event.c
> +++ linux-2.6.git/arch/x86/kernel/cpu/perf_event.c
> @@ -274,7 +274,6 @@ struct x86_pmu {
>  	void		(*enable_all)(int added);
>  	void		(*enable)(struct perf_event *);
>  	void		(*disable)(struct perf_event *);
> -	void		(*hw_watchdog_set_attr)(struct perf_event_attr *attr);
>  	int		(*hw_config)(struct perf_event *event);
>  	int		(*schedule_events)(struct cpu_hw_events *cpuc, int n, int *assign);
>  	unsigned	eventsel;
> @@ -360,12 +359,6 @@ static u64 __read_mostly hw_cache_extra_
>  				[PERF_COUNT_HW_CACHE_OP_MAX]
>  				[PERF_COUNT_HW_CACHE_RESULT_MAX];
>  
> -void hw_nmi_watchdog_set_attr(struct perf_event_attr *wd_attr)
> -{
> -	if (x86_pmu.hw_watchdog_set_attr)
> -		x86_pmu.hw_watchdog_set_attr(wd_attr);
> -}
> -
>  /*
>   * Propagate event elapsed time into the generic event.
>   * Can only be executed on the CPU where the event is active.
> Index: linux-2.6.git/arch/x86/kernel/cpu/perf_event_p4.c
> ===================================================================
> --- linux-2.6.git.orig/arch/x86/kernel/cpu/perf_event_p4.c
> +++ linux-2.6.git/arch/x86/kernel/cpu/perf_event_p4.c
> @@ -570,11 +570,92 @@ static __initconst const u64 p4_hw_cache
>   },
>  };
>  
> +/*
> + * Because of Netburst being quite restricted in now
> + * many same events can run simultaneously, we use
> + * event aliases, ie different events which have the
> + * same functionallity but use non-intersected resources
> + * (ESCR/CCCR/couter registers). This allow us to run
> + * two or more semi-same events together. It is done
> + * transparently to a user space.
> + *
> + * Never set any cusom internal bits such as P4_CONFIG_HT,
> + * P4_CONFIG_ALIASABLE or bits for P4_PEBS_METRIC, they are
> + * either up-to-dated automatically either not appliable
> + * at all.
> + *
> + * And be really carefull choosing aliases!
> + */
> +struct p4_event_alias {
> +	u64 orig;
> +	u64 alter;
> +} p4_event_aliases[] = {
> +	{
> +		/*
> +		 * Non-halted cycles can be substituted with
> +		 * non-sleeping cycles (see Intel SDM Vol3b for
> +		 * details).
> +		 */
> +	.orig	=
> +		p4_config_pack_escr(P4_ESCR_EVENT(P4_EVENT_GLOBAL_POWER_EVENTS)		|
> +				    P4_ESCR_EMASK_BIT(P4_EVENT_GLOBAL_POWER_EVENTS, RUNNING)),
> +	.alter	=
> +		p4_config_pack_escr(P4_ESCR_EVENT(P4_EVENT_EXECUTION_EVENT)		|
> +				    P4_ESCR_EMASK_BIT(P4_EVENT_EXECUTION_EVENT, NBOGUS0)|
> +				    P4_ESCR_EMASK_BIT(P4_EVENT_EXECUTION_EVENT, NBOGUS1)|
> +				    P4_ESCR_EMASK_BIT(P4_EVENT_EXECUTION_EVENT, NBOGUS2)|
> +				    P4_ESCR_EMASK_BIT(P4_EVENT_EXECUTION_EVENT, NBOGUS3)|
> +				    P4_ESCR_EMASK_BIT(P4_EVENT_EXECUTION_EVENT, BOGUS0)	|
> +				    P4_ESCR_EMASK_BIT(P4_EVENT_EXECUTION_EVENT, BOGUS1)	|
> +				    P4_ESCR_EMASK_BIT(P4_EVENT_EXECUTION_EVENT, BOGUS2)	|
> +				    P4_ESCR_EMASK_BIT(P4_EVENT_EXECUTION_EVENT, BOGUS3))|
> +		p4_config_pack_cccr(P4_CCCR_THRESHOLD(15) | P4_CCCR_COMPLEMENT		|
> +				    P4_CCCR_COMPARE),
> + 	},
> +};
> +
> +static u64 p4_get_alias_event(u64 config)
> +{
> +	u64 config_match;
> +	int i;
> +
> +	/*
> +	 * Probably we're lucky and don't have to do
> +	 * matching over all config bits.
> +	 */
> +	if (!(config & P4_CONFIG_ALIASABLE))
> +		return 0;
> +
> +	config_match = config & P4_CONFIG_EVENT_ALIAS_MASK;
> +
> +	/*
> +	 * If an event was previously swapped to the alter config
> +	 * we should swap it back otherwise contnention on registers
> +	 * will return back.
> +	 */
> +	for (i = 0; i < ARRAY_SIZE(p4_event_aliases); i++) {
> +		if (config_match == p4_event_aliases[i].orig) {
> +			config_match = p4_event_aliases[i].alter;
> +			break;
> +		} else if (config_match == p4_event_aliases[i].alter) {
> +			config_match = p4_event_aliases[i].orig;
> +			break;
> +		}
> +	}
> +
> +	if (i >= ARRAY_SIZE(p4_event_aliases))
> +		return 0;
> +
> +	return (config_match |
> +		(config & P4_CONFIG_EVENT_ALIAS_IMMUTABLE_BITS));
> +}
> +
>  static u64 p4_general_events[PERF_COUNT_HW_MAX] = {
>    /* non-halted CPU clocks */
>    [PERF_COUNT_HW_CPU_CYCLES] =
>  	p4_config_pack_escr(P4_ESCR_EVENT(P4_EVENT_GLOBAL_POWER_EVENTS)		|
> -		P4_ESCR_EMASK_BIT(P4_EVENT_GLOBAL_POWER_EVENTS, RUNNING)),
> +		P4_ESCR_EMASK_BIT(P4_EVENT_GLOBAL_POWER_EVENTS, RUNNING))	|
> +		P4_CONFIG_ALIASABLE,
>  
>    /*
>     * retired instructions
> @@ -719,31 +800,6 @@ static int p4_validate_raw_event(struct 
>  	return 0;
>  }
>  
> -static void p4_hw_watchdog_set_attr(struct perf_event_attr *wd_attr)
> -{
> -	/*
> -	 * Watchdog ticks are special on Netburst, we use
> -	 * that named "non-sleeping" ticks as recommended
> -	 * by Intel SDM Vol3b.
> -	 */
> -	WARN_ON_ONCE(wd_attr->type	!= PERF_TYPE_HARDWARE ||
> -		     wd_attr->config	!= PERF_COUNT_HW_CPU_CYCLES);
> -
> -	wd_attr->type	= PERF_TYPE_RAW;
> -	wd_attr->config	=
> -		p4_config_pack_escr(P4_ESCR_EVENT(P4_EVENT_EXECUTION_EVENT)		|
> -			P4_ESCR_EMASK_BIT(P4_EVENT_EXECUTION_EVENT, NBOGUS0)		|
> -			P4_ESCR_EMASK_BIT(P4_EVENT_EXECUTION_EVENT, NBOGUS1)		|
> -			P4_ESCR_EMASK_BIT(P4_EVENT_EXECUTION_EVENT, NBOGUS2)		|
> -			P4_ESCR_EMASK_BIT(P4_EVENT_EXECUTION_EVENT, NBOGUS3)		|
> -			P4_ESCR_EMASK_BIT(P4_EVENT_EXECUTION_EVENT, BOGUS0)		|
> -			P4_ESCR_EMASK_BIT(P4_EVENT_EXECUTION_EVENT, BOGUS1)		|
> -			P4_ESCR_EMASK_BIT(P4_EVENT_EXECUTION_EVENT, BOGUS2)		|
> -			P4_ESCR_EMASK_BIT(P4_EVENT_EXECUTION_EVENT, BOGUS3))		|
> -		p4_config_pack_cccr(P4_CCCR_THRESHOLD(15) | P4_CCCR_COMPLEMENT		|
> -			P4_CCCR_COMPARE);
> -}
> -
>  static int p4_hw_config(struct perf_event *event)
>  {
>  	int cpu = get_cpu();
> @@ -1159,6 +1215,8 @@ static int p4_pmu_schedule_events(struct
>  	struct p4_event_bind *bind;
>  	unsigned int i, thread, num;
>  	int cntr_idx, escr_idx;
> +	u64 config_alias;
> +	int pass;
>  
>  	bitmap_zero(used_mask, X86_PMC_IDX_MAX);
>  	bitmap_zero(escr_mask, P4_ESCR_MSR_TABLE_SIZE);
> @@ -1167,6 +1225,17 @@ static int p4_pmu_schedule_events(struct
>  
>  		hwc = &cpuc->event_list[i]->hw;
>  		thread = p4_ht_thread(cpu);
> +		pass = 0;
> +
> +again:
> +		/*
> +		 * Aliases are swappable so we may hit circular
> +		 * lock if both original config and alias need
> +		 * resources (MSR registers) which already busy.
> +		 */
> +		if (pass > 2)
> +			goto done;
> +
>  		bind = p4_config_get_bind(hwc->config);
>  		escr_idx = p4_get_escr_idx(bind->escr_msr[thread]);
>  		if (unlikely(escr_idx == -1))
> @@ -1180,8 +1249,17 @@ static int p4_pmu_schedule_events(struct
>  		}
>  
>  		cntr_idx = p4_next_cntr(thread, used_mask, bind);
> -		if (cntr_idx == -1 || test_bit(escr_idx, escr_mask))
> -			goto done;
> +		if (cntr_idx == -1 || test_bit(escr_idx, escr_mask)) {
> +			/*
> +			 * Probably an event alias is still available.
> +			 */
> +			config_alias = p4_get_alias_event(hwc->config);
> +			if (!config_alias)
> +				goto done;
> +			hwc->config = config_alias;
> +			pass++;
> +			goto again;
> +		}
>  
>  		p4_pmu_swap_config_ts(hwc, cpu);
>  		if (assign)
> @@ -1218,7 +1296,6 @@ static __initconst const struct x86_pmu 
>  	.cntval_bits		= ARCH_P4_CNTRVAL_BITS,
>  	.cntval_mask		= ARCH_P4_CNTRVAL_MASK,
>  	.max_period		= (1ULL << (ARCH_P4_CNTRVAL_BITS - 1)) - 1,
> -	.hw_watchdog_set_attr	= p4_hw_watchdog_set_attr,
>  	.hw_config		= p4_hw_config,
>  	.schedule_events	= p4_pmu_schedule_events,
>  	/*
> Index: linux-2.6.git/kernel/watchdog.c
> ===================================================================
> --- linux-2.6.git.orig/kernel/watchdog.c
> +++ linux-2.6.git/kernel/watchdog.c
> @@ -200,7 +200,6 @@ static int is_softlockup(unsigned long t
>  }
>  
>  #ifdef CONFIG_HARDLOCKUP_DETECTOR
> -void __weak hw_nmi_watchdog_set_attr(struct perf_event_attr *wd_attr) { }
>  
>  static struct perf_event_attr wd_hw_attr = {
>  	.type		= PERF_TYPE_HARDWARE,
> @@ -372,7 +371,6 @@ static int watchdog_nmi_enable(int cpu)
>  
>  	wd_attr = &wd_hw_attr;
>  	wd_attr->sample_period = hw_nmi_get_sample_period(watchdog_thresh);
> -	hw_nmi_watchdog_set_attr(wd_attr);
>  
>  	/* Try to register using hardware perf events */
>  	event = perf_event_create_kernel_counter(wd_attr, cpu, NULL, watchdog_overflow_callback, NULL);

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH -tip/perf/core] perf, x86: P4 PMU - Introduce event alias feature
  2011-07-08 20:17 [PATCH -tip/perf/core] perf, x86: P4 PMU - Introduce event alias feature Cyrill Gorcunov
  2011-07-14 18:46 ` Don Zickus
@ 2011-07-21  6:46 ` Ingo Molnar
  2011-07-21  7:20   ` Cyrill Gorcunov
  2011-07-21 12:02   ` Peter Zijlstra
  2011-07-21 10:00 ` [tip:perf/core] " tip-bot for Cyrill Gorcunov
  2 siblings, 2 replies; 9+ messages in thread
From: Ingo Molnar @ 2011-07-21  6:46 UTC (permalink / raw)
  To: Cyrill Gorcunov, Peter Zijlstra
  Cc: Don Zickus, LKML, Peter Zijlstra, Stephane Eranian, Lin Ming,
	Arnaldo Carvalho de Melo, Frederic Weisbecker, Steven Rostedt


* Cyrill Gorcunov <gorcunov@openvz.org> wrote:

> @@ -102,6 +102,14 @@
>  #define P4_CONFIG_HT			(1ULL << P4_CONFIG_HT_SHIFT)
>  
>  /*
> + * If an event has alias it should be marked
> + * with a special bit. (Don't forget to check
> + * P4_PEBS_CONFIG_MASK and related bits on
> + * modification.)
> + */
> +#define P4_CONFIG_ALIASABLE		(1 << 9)
> +
> +/*
>   * The bits we allow to pass for RAW events
>   */
>  #define P4_CONFIG_MASK_ESCR		\
> @@ -123,6 +131,31 @@
>  	(p4_config_pack_escr(P4_CONFIG_MASK_ESCR))	| \
>  	(p4_config_pack_cccr(P4_CONFIG_MASK_CCCR))
>  
> +/*
> + * In case of event aliasing we need to preserve some
> + * caller bits otherwise the mapping won't be complete.

Missing comma.

> + */
> +#define P4_CONFIG_EVENT_ALIAS_MASK			  \
> +	(p4_config_pack_escr(P4_CONFIG_MASK_ESCR)	| \
> +	 p4_config_pack_cccr(P4_CCCR_EDGE		| \
> +			     P4_CCCR_THRESHOLD_MASK	| \
> +			     P4_CCCR_COMPLEMENT		| \
> +			     P4_CCCR_COMPARE))
> +
> +#define  P4_CONFIG_EVENT_ALIAS_IMMUTABLE_BITS		  \
> +	((P4_CONFIG_HT)					| \
> +	 p4_config_pack_escr(P4_ESCR_T0_OS		| \
> +			     P4_ESCR_T0_USR		| \
> +			     P4_ESCR_T1_OS		| \
> +			     P4_ESCR_T1_USR)		| \
> +	 p4_config_pack_cccr(P4_CCCR_OVF		| \
> +			     P4_CCCR_CASCADE		| \
> +			     P4_CCCR_FORCE_OVF		| \
> +			     P4_CCCR_THREAD_ANY		| \
> +			     P4_CCCR_OVF_PMI_T0		| \
> +			     P4_CCCR_OVF_PMI_T1		| \
> +			     P4_CONFIG_ALIASABLE))
> +
>  static inline bool p4_is_event_cascaded(u64 config)
>  {
>  	u32 cccr = p4_config_unpack_cccr(config);
> Index: linux-2.6.git/arch/x86/kernel/cpu/perf_event.c
> ===================================================================
> --- linux-2.6.git.orig/arch/x86/kernel/cpu/perf_event.c
> +++ linux-2.6.git/arch/x86/kernel/cpu/perf_event.c
> @@ -274,7 +274,6 @@ struct x86_pmu {
>  	void		(*enable_all)(int added);
>  	void		(*enable)(struct perf_event *);
>  	void		(*disable)(struct perf_event *);
> -	void		(*hw_watchdog_set_attr)(struct perf_event_attr *attr);
>  	int		(*hw_config)(struct perf_event *event);
>  	int		(*schedule_events)(struct cpu_hw_events *cpuc, int n, int *assign);
>  	unsigned	eventsel;
> @@ -360,12 +359,6 @@ static u64 __read_mostly hw_cache_extra_
>  				[PERF_COUNT_HW_CACHE_OP_MAX]
>  				[PERF_COUNT_HW_CACHE_RESULT_MAX];
>  
> -void hw_nmi_watchdog_set_attr(struct perf_event_attr *wd_attr)
> -{
> -	if (x86_pmu.hw_watchdog_set_attr)
> -		x86_pmu.hw_watchdog_set_attr(wd_attr);
> -}
> -
>  /*
>   * Propagate event elapsed time into the generic event.
>   * Can only be executed on the CPU where the event is active.
> Index: linux-2.6.git/arch/x86/kernel/cpu/perf_event_p4.c
> ===================================================================
> --- linux-2.6.git.orig/arch/x86/kernel/cpu/perf_event_p4.c
> +++ linux-2.6.git/arch/x86/kernel/cpu/perf_event_p4.c
> @@ -570,11 +570,92 @@ static __initconst const u64 p4_hw_cache
>   },
>  };
>  
> +/*
> + * Because of Netburst being quite restricted in now
> + * many same events can run simultaneously, we use

typo(s).

> + * event aliases, ie different events which have the
> + * same functionallity but use non-intersected resources
> + * (ESCR/CCCR/couter registers). This allow us to run
> + * two or more semi-same events together. It is done
> + * transparently to a user space.

typo.

> + *
> + * Never set any cusom internal bits such as P4_CONFIG_HT,
> + * P4_CONFIG_ALIASABLE or bits for P4_PEBS_METRIC, they are
> + * either up-to-dated automatically either not appliable

typo(s).

> + * at all.
> + *
> + * And be really carefull choosing aliases!

typo.

> + */
> +struct p4_event_alias {
> +	u64 orig;
> +	u64 alter;

what is an 'alter'? alias? alternate?

> +} p4_event_aliases[] = {
> +	{
> +		/*
> +		 * Non-halted cycles can be substituted with
> +		 * non-sleeping cycles (see Intel SDM Vol3b for
> +		 * details).
> +		 */
> +	.orig	=
> +		p4_config_pack_escr(P4_ESCR_EVENT(P4_EVENT_GLOBAL_POWER_EVENTS)		|
> +				    P4_ESCR_EMASK_BIT(P4_EVENT_GLOBAL_POWER_EVENTS, RUNNING)),
> +	.alter	=
> +		p4_config_pack_escr(P4_ESCR_EVENT(P4_EVENT_EXECUTION_EVENT)		|
> +				    P4_ESCR_EMASK_BIT(P4_EVENT_EXECUTION_EVENT, NBOGUS0)|
> +				    P4_ESCR_EMASK_BIT(P4_EVENT_EXECUTION_EVENT, NBOGUS1)|
> +				    P4_ESCR_EMASK_BIT(P4_EVENT_EXECUTION_EVENT, NBOGUS2)|
> +				    P4_ESCR_EMASK_BIT(P4_EVENT_EXECUTION_EVENT, NBOGUS3)|
> +				    P4_ESCR_EMASK_BIT(P4_EVENT_EXECUTION_EVENT, BOGUS0)	|
> +				    P4_ESCR_EMASK_BIT(P4_EVENT_EXECUTION_EVENT, BOGUS1)	|
> +				    P4_ESCR_EMASK_BIT(P4_EVENT_EXECUTION_EVENT, BOGUS2)	|
> +				    P4_ESCR_EMASK_BIT(P4_EVENT_EXECUTION_EVENT, BOGUS3))|
> +		p4_config_pack_cccr(P4_CCCR_THRESHOLD(15) | P4_CCCR_COMPLEMENT		|
> +				    P4_CCCR_COMPARE),

That currently we only offer an alias for the generic cycles event is 
an important property - yet it's only visible from the code and is 
not mentioned in any of the voluminous comments. Good comments are to 
the point and focused: what problem is being solved, how is it solved 
and what are the limitations of the solution.

> + 	},
> +};
> +
> +static u64 p4_get_alias_event(u64 config)
> +{
> +	u64 config_match;
> +	int i;
> +
> +	/*
> +	 * Probably we're lucky and don't have to do
> +	 * matching over all config bits.
> +	 */
> +	if (!(config & P4_CONFIG_ALIASABLE))
> +		return 0;

'all' config bits? There's a single alias mapping in 
p4_event_aliases[] right now which makes the comment rather 
misleading ...

> +
> +	config_match = config & P4_CONFIG_EVENT_ALIAS_MASK;
> +
> +	/*
> +	 * If an event was previously swapped to the alter config
> +	 * we should swap it back otherwise contnention on registers
> +	 * will return back.

'alter config' is a new English word apparently. We can invent new 
words but is it really warranted here? 'alternate config'? 'config 
alias'?

> +	 */
> +	for (i = 0; i < ARRAY_SIZE(p4_event_aliases); i++) {
> +		if (config_match == p4_event_aliases[i].orig) {
> +			config_match = p4_event_aliases[i].alter;
> +			break;
> +		} else if (config_match == p4_event_aliases[i].alter) {
> +			config_match = p4_event_aliases[i].orig;
> +			break;
> +		}
> +	}

Since this .c file is P4 specific and p4_event_aliases[] is a 
file-scope array, is the p4_ prefix even needed?

> +
> +	if (i >= ARRAY_SIZE(p4_event_aliases))
> +		return 0;
> +
> +	return (config_match |
> +		(config & P4_CONFIG_EVENT_ALIAS_IMMUTABLE_BITS));

'return' is not a function. Also, please don't break the line 
pointlessly.

> +}
> +
>  static u64 p4_general_events[PERF_COUNT_HW_MAX] = {
>    /* non-halted CPU clocks */
>    [PERF_COUNT_HW_CPU_CYCLES] =
>  	p4_config_pack_escr(P4_ESCR_EVENT(P4_EVENT_GLOBAL_POWER_EVENTS)		|
> -		P4_ESCR_EMASK_BIT(P4_EVENT_GLOBAL_POWER_EVENTS, RUNNING)),
> +		P4_ESCR_EMASK_BIT(P4_EVENT_GLOBAL_POWER_EVENTS, RUNNING))	|
> +		P4_CONFIG_ALIASABLE,
>  
>    /*
>     * retired instructions
> @@ -719,31 +800,6 @@ static int p4_validate_raw_event(struct 
>  	return 0;
>  }
>  
> -static void p4_hw_watchdog_set_attr(struct perf_event_attr *wd_attr)
> -{
> -	/*
> -	 * Watchdog ticks are special on Netburst, we use
> -	 * that named "non-sleeping" ticks as recommended
> -	 * by Intel SDM Vol3b.
> -	 */
> -	WARN_ON_ONCE(wd_attr->type	!= PERF_TYPE_HARDWARE ||
> -		     wd_attr->config	!= PERF_COUNT_HW_CPU_CYCLES);
> -
> -	wd_attr->type	= PERF_TYPE_RAW;
> -	wd_attr->config	=
> -		p4_config_pack_escr(P4_ESCR_EVENT(P4_EVENT_EXECUTION_EVENT)		|
> -			P4_ESCR_EMASK_BIT(P4_EVENT_EXECUTION_EVENT, NBOGUS0)		|
> -			P4_ESCR_EMASK_BIT(P4_EVENT_EXECUTION_EVENT, NBOGUS1)		|
> -			P4_ESCR_EMASK_BIT(P4_EVENT_EXECUTION_EVENT, NBOGUS2)		|
> -			P4_ESCR_EMASK_BIT(P4_EVENT_EXECUTION_EVENT, NBOGUS3)		|
> -			P4_ESCR_EMASK_BIT(P4_EVENT_EXECUTION_EVENT, BOGUS0)		|
> -			P4_ESCR_EMASK_BIT(P4_EVENT_EXECUTION_EVENT, BOGUS1)		|
> -			P4_ESCR_EMASK_BIT(P4_EVENT_EXECUTION_EVENT, BOGUS2)		|
> -			P4_ESCR_EMASK_BIT(P4_EVENT_EXECUTION_EVENT, BOGUS3))		|
> -		p4_config_pack_cccr(P4_CCCR_THRESHOLD(15) | P4_CCCR_COMPLEMENT		|
> -			P4_CCCR_COMPARE);
> -}
> -
>  static int p4_hw_config(struct perf_event *event)
>  {
>  	int cpu = get_cpu();
> @@ -1159,6 +1215,8 @@ static int p4_pmu_schedule_events(struct
>  	struct p4_event_bind *bind;
>  	unsigned int i, thread, num;
>  	int cntr_idx, escr_idx;
> +	u64 config_alias;
> +	int pass;
>  
>  	bitmap_zero(used_mask, X86_PMC_IDX_MAX);
>  	bitmap_zero(escr_mask, P4_ESCR_MSR_TABLE_SIZE);
> @@ -1167,6 +1225,17 @@ static int p4_pmu_schedule_events(struct
>  
>  		hwc = &cpuc->event_list[i]->hw;
>  		thread = p4_ht_thread(cpu);
> +		pass = 0;
> +
> +again:
> +		/*
> +		 * Aliases are swappable so we may hit circular
> +		 * lock if both original config and alias need
> +		 * resources (MSR registers) which already busy.

typo.

> +		 */
> +		if (pass > 2)
> +			goto done;
> +
>  		bind = p4_config_get_bind(hwc->config);
>  		escr_idx = p4_get_escr_idx(bind->escr_msr[thread]);
>  		if (unlikely(escr_idx == -1))
> @@ -1180,8 +1249,17 @@ static int p4_pmu_schedule_events(struct
>  		}
>  
>  		cntr_idx = p4_next_cntr(thread, used_mask, bind);
> -		if (cntr_idx == -1 || test_bit(escr_idx, escr_mask))
> -			goto done;
> +		if (cntr_idx == -1 || test_bit(escr_idx, escr_mask)) {
> +			/*
> +			 * Probably an event alias is still available.

s/Probably/Possibly?

> +			 */
> +			config_alias = p4_get_alias_event(hwc->config);
> +			if (!config_alias)
> +				goto done;
> +			hwc->config = config_alias;
> +			pass++;
> +			goto again;
> +		}
>  
>  		p4_pmu_swap_config_ts(hwc, cpu);
>  		if (assign)
> @@ -1218,7 +1296,6 @@ static __initconst const struct x86_pmu 
>  	.cntval_bits		= ARCH_P4_CNTRVAL_BITS,
>  	.cntval_mask		= ARCH_P4_CNTRVAL_MASK,
>  	.max_period		= (1ULL << (ARCH_P4_CNTRVAL_BITS - 1)) - 1,
> -	.hw_watchdog_set_attr	= p4_hw_watchdog_set_attr,
>  	.hw_config		= p4_hw_config,
>  	.schedule_events	= p4_pmu_schedule_events,
>  	/*
> Index: linux-2.6.git/kernel/watchdog.c
> ===================================================================
> --- linux-2.6.git.orig/kernel/watchdog.c
> +++ linux-2.6.git/kernel/watchdog.c
> @@ -200,7 +200,6 @@ static int is_softlockup(unsigned long t
>  }
>  
>  #ifdef CONFIG_HARDLOCKUP_DETECTOR
> -void __weak hw_nmi_watchdog_set_attr(struct perf_event_attr *wd_attr) { }
>  
>  static struct perf_event_attr wd_hw_attr = {
>  	.type		= PERF_TYPE_HARDWARE,
> @@ -372,7 +371,6 @@ static int watchdog_nmi_enable(int cpu)
>  
>  	wd_attr = &wd_hw_attr;
>  	wd_attr->sample_period = hw_nmi_get_sample_period(watchdog_thresh);
> -	hw_nmi_watchdog_set_attr(wd_attr);
>  
>  	/* Try to register using hardware perf events */
>  	event = perf_event_create_kernel_counter(wd_attr, cpu, NULL, watchdog_overflow_callback, NULL);

This looks like the right approach otherwise. Peter?

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH -tip/perf/core] perf, x86: P4 PMU - Introduce event alias feature
  2011-07-21  6:46 ` Ingo Molnar
@ 2011-07-21  7:20   ` Cyrill Gorcunov
  2011-07-21  7:52     ` Ingo Molnar
  2011-07-21 12:02   ` Peter Zijlstra
  1 sibling, 1 reply; 9+ messages in thread
From: Cyrill Gorcunov @ 2011-07-21  7:20 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Peter Zijlstra, Don Zickus, LKML, Stephane Eranian, Lin Ming,
	Arnaldo Carvalho de Melo, Frederic Weisbecker, Steven Rostedt

On Thu, Jul 21, 2011 at 08:46:10AM +0200, Ingo Molnar wrote:
...
> 
> > + */
> > +struct p4_event_alias {
> > +	u64 orig;
> > +	u64 alter;
> 
> what is an 'alter'? alias? alternate?

Shorthanded 'alternate', I can change it to alias.

> 
> > +} p4_event_aliases[] = {
> > +	{
> > +		/*
> > +		 * Non-halted cycles can be substituted with
> > +		 * non-sleeping cycles (see Intel SDM Vol3b for
> > +		 * details).
> > +		 */
> > +	.orig	=
> > +		p4_config_pack_escr(P4_ESCR_EVENT(P4_EVENT_GLOBAL_POWER_EVENTS)		|
> > +				    P4_ESCR_EMASK_BIT(P4_EVENT_GLOBAL_POWER_EVENTS, RUNNING)),
> > +	.alter	=
> > +		p4_config_pack_escr(P4_ESCR_EVENT(P4_EVENT_EXECUTION_EVENT)		|
> > +				    P4_ESCR_EMASK_BIT(P4_EVENT_EXECUTION_EVENT, NBOGUS0)|
> > +				    P4_ESCR_EMASK_BIT(P4_EVENT_EXECUTION_EVENT, NBOGUS1)|
> > +				    P4_ESCR_EMASK_BIT(P4_EVENT_EXECUTION_EVENT, NBOGUS2)|
> > +				    P4_ESCR_EMASK_BIT(P4_EVENT_EXECUTION_EVENT, NBOGUS3)|
> > +				    P4_ESCR_EMASK_BIT(P4_EVENT_EXECUTION_EVENT, BOGUS0)	|
> > +				    P4_ESCR_EMASK_BIT(P4_EVENT_EXECUTION_EVENT, BOGUS1)	|
> > +				    P4_ESCR_EMASK_BIT(P4_EVENT_EXECUTION_EVENT, BOGUS2)	|
> > +				    P4_ESCR_EMASK_BIT(P4_EVENT_EXECUTION_EVENT, BOGUS3))|
> > +		p4_config_pack_cccr(P4_CCCR_THRESHOLD(15) | P4_CCCR_COMPLEMENT		|
> > +				    P4_CCCR_COMPARE),
> 
> That currently we only offer an alias for the generic cycles event is 
> an important property - yet it's only visible from the code and is 
> not mentioned in any of the voluminous comments. Good comments are to 
> the point and focused: what problem is being solved, how is it solved 
> and what are the limitations of the solution.

ok, will update the comment.

> 
> > + 	},
> > +};
> > +
> > +static u64 p4_get_alias_event(u64 config)
> > +{
> > +	u64 config_match;
> > +	int i;
> > +
> > +	/*
> > +	 * Probably we're lucky and don't have to do
> > +	 * matching over all config bits.
> > +	 */
> > +	if (!(config & P4_CONFIG_ALIASABLE))
> > +		return 0;
> 
> 'all' config bits? There's a single alias mapping in 
> p4_event_aliases[] right now which makes the comment rather 
> misleading ...

no, the cycle below does check for all bits in config, and test
for single bit might help us to return early.

> 
> > +
> > +	config_match = config & P4_CONFIG_EVENT_ALIAS_MASK;
> > +
> > +	/*
> > +	 * If an event was previously swapped to the alter config
> > +	 * we should swap it back otherwise contnention on registers
> > +	 * will return back.
> 
> 'alter config' is a new English word apparently. We can invent new 
> words but is it really warranted here? 'alternate config'? 'config 
> alias'?

i'll update to alternate config

> 
> > +	 */
> > +	for (i = 0; i < ARRAY_SIZE(p4_event_aliases); i++) {
> > +		if (config_match == p4_event_aliases[i].orig) {
> > +			config_match = p4_event_aliases[i].alter;
> > +			break;
> > +		} else if (config_match == p4_event_aliases[i].alter) {
> > +			config_match = p4_event_aliases[i].orig;
> > +			break;
> > +		}
> > +	}
> 
> Since this .c file is P4 specific and p4_event_aliases[] is a 
> file-scope array, is the p4_ prefix even needed?

I prefer them to be distinguished this way, since the .c file in
real is included into another (perf_event.c) file.

> 
> > +
> > +	if (i >= ARRAY_SIZE(p4_event_aliases))
> > +		return 0;
> > +
> > +	return (config_match |
> > +		(config & P4_CONFIG_EVENT_ALIAS_IMMUTABLE_BITS));
> 
> 'return' is not a function. Also, please don't break the line 
> pointlessly.

It is perfectly fine to return args in braces,
but I will change it.

...
> > +			/*
> > +			 * Probably an event alias is still available.
> 
> s/Probably/Possibly?

Hm? I don't get what is wrong with 'probably'.
...
> >  
> >  	/* Try to register using hardware perf events */
> >  	event = perf_event_create_kernel_counter(wd_attr, cpu, NULL, watchdog_overflow_callback, NULL);
> 
> This looks like the right approach otherwise. Peter?
> 
> Thanks,
> 
> 	Ingo

Thanks Ingo, I'll update the patch.

	Cyrill

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH -tip/perf/core] perf, x86: P4 PMU - Introduce event alias feature
  2011-07-21  7:20   ` Cyrill Gorcunov
@ 2011-07-21  7:52     ` Ingo Molnar
  2011-07-21  8:07       ` Cyrill Gorcunov
  0 siblings, 1 reply; 9+ messages in thread
From: Ingo Molnar @ 2011-07-21  7:52 UTC (permalink / raw)
  To: Cyrill Gorcunov
  Cc: Peter Zijlstra, Don Zickus, LKML, Stephane Eranian, Lin Ming,
	Arnaldo Carvalho de Melo, Frederic Weisbecker, Steven Rostedt


* Cyrill Gorcunov <gorcunov@gmail.com> wrote:

> > > +static u64 p4_get_alias_event(u64 config)
> > > +{
> > > +	u64 config_match;
> > > +	int i;
> > > +
> > > +	/*
> > > +	 * Probably we're lucky and don't have to do
> > > +	 * matching over all config bits.
> > > +	 */
> > > +	if (!(config & P4_CONFIG_ALIASABLE))
> > > +		return 0;
> > 
> > 'all' config bits? There's a single alias mapping in 
> > p4_event_aliases[] right now which makes the comment rather 
> > misleading ...
> 
> no, the cycle below does check for all bits in config, and test
> for single bit might help us to return early.

The loop below? AFAICS it does:

> > > +	 */
> > > +	for (i = 0; i < ARRAY_SIZE(p4_event_aliases); i++) {
> > > +		if (config_match == p4_event_aliases[i].orig) {
> > > +			config_match = p4_event_aliases[i].alter;
> > > +			break;
> > > +		} else if (config_match == p4_event_aliases[i].alter) {
> > > +			config_match = p4_event_aliases[i].orig;
> > > +			break;
> > > +		}
> > > +	}

and p4_event_aliases[] is a single-entry array, so ARRAY_SIZE() is 1 
and this loop iterates once.

> > Since this .c file is P4 specific and p4_event_aliases[] is a 
> > file-scope array, is the p4_ prefix even needed?
> 
> I prefer them to be distinguished this way, since the .c file in 
> real is included into another (perf_event.c) file.

That should probably be fixed btw. - but yeah, until the PMU drivers 
are separated out better the prefix is fine.

> > > +
> > > +	if (i >= ARRAY_SIZE(p4_event_aliases))
> > > +		return 0;
> > > +
> > > +	return (config_match |
> > > +		(config & P4_CONFIG_EVENT_ALIAS_IMMUTABLE_BITS));
> > 
> > 'return' is not a function. Also, please don't break the line 
> > pointlessly.
> 
> It is perfectly fine to return args in braces,
> but I will change it.

The parantheses are syntactically valid but not 'perfectly fine'. As 
per kernel coding style we don't use them, to signal that 'return' is 
not a function.

> 
> ...
> > > +			/*
> > > +			 * Probably an event alias is still available.
> > 
> > s/Probably/Possibly?
> 
> Hm? I don't get what is wrong with 'probably'.

Probably means "most likely, maybe" - and that's not the case here.

(It's misleading in this context because it declares the probability 
of this action as always higher than 50% - and that's not always the 
case here.)

'Possibly' leaves the probability more undefined.

You could also write 'Check whether an event alias is still 
available'.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH -tip/perf/core] perf, x86: P4 PMU - Introduce event alias feature
  2011-07-21  7:52     ` Ingo Molnar
@ 2011-07-21  8:07       ` Cyrill Gorcunov
  2011-07-21  8:15         ` Ingo Molnar
  0 siblings, 1 reply; 9+ messages in thread
From: Cyrill Gorcunov @ 2011-07-21  8:07 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Peter Zijlstra, Don Zickus, LKML, Stephane Eranian, Lin Ming,
	Arnaldo Carvalho de Melo, Frederic Weisbecker, Steven Rostedt

On Thu, Jul 21, 2011 at 09:52:01AM +0200, Ingo Molnar wrote:
> 
> * Cyrill Gorcunov <gorcunov@gmail.com> wrote:
> 
> > > > +static u64 p4_get_alias_event(u64 config)
> > > > +{
> > > > +	u64 config_match;
> > > > +	int i;
> > > > +
> > > > +	/*
> > > > +	 * Probably we're lucky and don't have to do
> > > > +	 * matching over all config bits.
> > > > +	 */
> > > > +	if (!(config & P4_CONFIG_ALIASABLE))
> > > > +		return 0;
> > > 
> > > 'all' config bits? There's a single alias mapping in 
> > > p4_event_aliases[] right now which makes the comment rather 
> > > misleading ...
> > 
> > no, the cycle below does check for all bits in config, and test
> > for single bit might help us to return early.
> 
> The loop below? AFAICS it does:
> 
> > > > +	 */
> > > > +	for (i = 0; i < ARRAY_SIZE(p4_event_aliases); i++) {
> > > > +		if (config_match == p4_event_aliases[i].orig) {
> > > > +			config_match = p4_event_aliases[i].alter;
> > > > +			break;
> > > > +		} else if (config_match == p4_event_aliases[i].alter) {
> > > > +			config_match = p4_event_aliases[i].orig;
> > > > +			break;
> > > > +		}
> > > > +	}
> 
> and p4_event_aliases[] is a single-entry array, so ARRAY_SIZE() is 1 
> and this loop iterates once.

I know, but Ingo, I can't guarantee that we'll never bring some new
alias(es) later ;) And test over 64bit variable is not the same as a test
over a single bit (where gcc knows the only one bit is tested and can
optimize it). I think I simply drop this comment, early return case is
understandable without it at all.

> 
> > > Since this .c file is P4 specific and p4_event_aliases[] is a 
> > > file-scope array, is the p4_ prefix even needed?
> > 
> > I prefer them to be distinguished this way, since the .c file in 
> > real is included into another (perf_event.c) file.
> 
> That should probably be fixed btw. - but yeah, until the PMU drivers 
> are separated out better the prefix is fine.

I thought about adding some additional Kconfig where we would choose which
PMU to compile in, not sure though where is the best place for it (under
which menu section I mean).

> 
> > > > +
> > > > +	if (i >= ARRAY_SIZE(p4_event_aliases))
> > > > +		return 0;
> > > > +
> > > > +	return (config_match |
> > > > +		(config & P4_CONFIG_EVENT_ALIAS_IMMUTABLE_BITS));
> > > 
> > > 'return' is not a function. Also, please don't break the line 
> > > pointlessly.
> > 
> > It is perfectly fine to return args in braces,
> > but I will change it.
> 
> The parantheses are syntactically valid but not 'perfectly fine'. As 
> per kernel coding style we don't use them, to signal that 'return' is 
> not a function.

ok, i'll fix

> 
> > 
> > ...
> > > > +			/*
> > > > +			 * Probably an event alias is still available.
> > > 
> > > s/Probably/Possibly?
> > 
> > Hm? I don't get what is wrong with 'probably'.
> 
> Probably means "most likely, maybe" - and that's not the case here.
> 
> (It's misleading in this context because it declares the probability 
> of this action as always higher than 50% - and that's not always the 
> case here.)
> 
> 'Possibly' leaves the probability more undefined.
> 
> You could also write 'Check whether an event alias is still 
> available'.
> 
> Thanks,
> 
> 	Ingo

ok


	Cyrill

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH -tip/perf/core] perf, x86: P4 PMU - Introduce event alias feature
  2011-07-21  8:07       ` Cyrill Gorcunov
@ 2011-07-21  8:15         ` Ingo Molnar
  0 siblings, 0 replies; 9+ messages in thread
From: Ingo Molnar @ 2011-07-21  8:15 UTC (permalink / raw)
  To: Cyrill Gorcunov
  Cc: Peter Zijlstra, Don Zickus, LKML, Stephane Eranian, Lin Ming,
	Arnaldo Carvalho de Melo, Frederic Weisbecker, Steven Rostedt


* Cyrill Gorcunov <gorcunov@gmail.com> wrote:

> On Thu, Jul 21, 2011 at 09:52:01AM +0200, Ingo Molnar wrote:
> > 
> > * Cyrill Gorcunov <gorcunov@gmail.com> wrote:
> > 
> > > > > +static u64 p4_get_alias_event(u64 config)
> > > > > +{
> > > > > +	u64 config_match;
> > > > > +	int i;
> > > > > +
> > > > > +	/*
> > > > > +	 * Probably we're lucky and don't have to do
> > > > > +	 * matching over all config bits.
> > > > > +	 */
> > > > > +	if (!(config & P4_CONFIG_ALIASABLE))
> > > > > +		return 0;
> > > > 
> > > > 'all' config bits? There's a single alias mapping in 
> > > > p4_event_aliases[] right now which makes the comment rather 
> > > > misleading ...
> > > 
> > > no, the cycle below does check for all bits in config, and test
> > > for single bit might help us to return early.
> > 
> > The loop below? AFAICS it does:
> > 
> > > > > +	 */
> > > > > +	for (i = 0; i < ARRAY_SIZE(p4_event_aliases); i++) {
> > > > > +		if (config_match == p4_event_aliases[i].orig) {
> > > > > +			config_match = p4_event_aliases[i].alter;
> > > > > +			break;
> > > > > +		} else if (config_match == p4_event_aliases[i].alter) {
> > > > > +			config_match = p4_event_aliases[i].orig;
> > > > > +			break;
> > > > > +		}
> > > > > +	}
> > 
> > and p4_event_aliases[] is a single-entry array, so ARRAY_SIZE() is 1 
> > and this loop iterates once.
> 
> I know, but Ingo, I can't guarantee that we'll never bring some new
> alias(es) later ;) [...]

in which 'later' case that check and the comment can be added or 
updated ...

Right now it looks both somewhat pointless *and* the comment is 
somewhat misleading. You could fix the comment, pointing out that 
right now it's only a single entry but it's still a 
micro-optimization.

and if there's more entries in the future the comment can be updated.

> > > > Since this .c file is P4 specific and p4_event_aliases[] is a 
> > > > file-scope array, is the p4_ prefix even needed?
> > > 
> > > I prefer them to be distinguished this way, since the .c file in 
> > > real is included into another (perf_event.c) file.
> > 
> > That should probably be fixed btw. - but yeah, until the PMU drivers 
> > are separated out better the prefix is fine.
> 
> I thought about adding some additional Kconfig where we would 
> choose which PMU to compile in, not sure though where is the best 
> place for it (under which menu section I mean).

No need for any additional Kconfig for that, we could just build them 
as separate .o's.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [tip:perf/core] perf, x86: P4 PMU - Introduce event alias feature
  2011-07-08 20:17 [PATCH -tip/perf/core] perf, x86: P4 PMU - Introduce event alias feature Cyrill Gorcunov
  2011-07-14 18:46 ` Don Zickus
  2011-07-21  6:46 ` Ingo Molnar
@ 2011-07-21 10:00 ` tip-bot for Cyrill Gorcunov
  2 siblings, 0 replies; 9+ messages in thread
From: tip-bot for Cyrill Gorcunov @ 2011-07-21 10:00 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, acme, eranian, hpa, mingo, gorcunov, a.p.zijlstra,
	fweisbec, rostedt, ming.m.lin, tglx, dzickus, mingo

Commit-ID:  f91298709790b9a483752ca3c967845537df2af3
Gitweb:     http://git.kernel.org/tip/f91298709790b9a483752ca3c967845537df2af3
Author:     Cyrill Gorcunov <gorcunov@openvz.org>
AuthorDate: Sat, 9 Jul 2011 00:17:12 +0400
Committer:  Steven Rostedt <rostedt@goodmis.org>
CommitDate: Thu, 14 Jul 2011 17:25:04 -0400

perf, x86: P4 PMU - Introduce event alias feature

Instead of hw_nmi_watchdog_set_attr() weak function
and appropriate x86_pmu::hw_watchdog_set_attr() call
we introduce even alias mechanism which allow us
to drop this routines completely and isolate quirks
of Netburst architecture inside P4 PMU code only.

The main idea remains the same though -- to allow
nmi-watchdog and perf top run simultaneously.

Note the aliasing mechanism applies to generic
PERF_COUNT_HW_CPU_CYCLES event only because arbitrary
event (say passed as RAW initially) might have some
additional bits set inside ESCR register changing
the behaviour of event and we can't guarantee anymore
that alias event will give the same result.

P.S. Thanks a huge to Don and Steven for for testing
     and early review.

Acked-by: Don Zickus <dzickus@redhat.com>
Tested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
CC: Ingo Molnar <mingo@elte.hu>
CC: Peter Zijlstra <a.p.zijlstra@chello.nl>
CC: Stephane Eranian <eranian@google.com>
CC: Lin Ming <ming.m.lin@intel.com>
CC: Arnaldo Carvalho de Melo <acme@redhat.com>
CC: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/20110708201712.GS23657@sun
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 arch/x86/include/asm/perf_event_p4.h |   33 ++++++++
 arch/x86/kernel/cpu/perf_event.c     |    7 --
 arch/x86/kernel/cpu/perf_event_p4.c  |  135 ++++++++++++++++++++++++++-------
 kernel/watchdog.c                    |    2 -
 4 files changed, 139 insertions(+), 38 deletions(-)

diff --git a/arch/x86/include/asm/perf_event_p4.h b/arch/x86/include/asm/perf_event_p4.h
index 56fd9e3..4d86c86 100644
--- a/arch/x86/include/asm/perf_event_p4.h
+++ b/arch/x86/include/asm/perf_event_p4.h
@@ -102,6 +102,14 @@
 #define P4_CONFIG_HT			(1ULL << P4_CONFIG_HT_SHIFT)
 
 /*
+ * If an event has alias it should be marked
+ * with a special bit. (Don't forget to check
+ * P4_PEBS_CONFIG_MASK and related bits on
+ * modification.)
+ */
+#define P4_CONFIG_ALIASABLE		(1 << 9)
+
+/*
  * The bits we allow to pass for RAW events
  */
 #define P4_CONFIG_MASK_ESCR		\
@@ -123,6 +131,31 @@
 	(p4_config_pack_escr(P4_CONFIG_MASK_ESCR))	| \
 	(p4_config_pack_cccr(P4_CONFIG_MASK_CCCR))
 
+/*
+ * In case of event aliasing we need to preserve some
+ * caller bits otherwise the mapping won't be complete.
+ */
+#define P4_CONFIG_EVENT_ALIAS_MASK			  \
+	(p4_config_pack_escr(P4_CONFIG_MASK_ESCR)	| \
+	 p4_config_pack_cccr(P4_CCCR_EDGE		| \
+			     P4_CCCR_THRESHOLD_MASK	| \
+			     P4_CCCR_COMPLEMENT		| \
+			     P4_CCCR_COMPARE))
+
+#define  P4_CONFIG_EVENT_ALIAS_IMMUTABLE_BITS		  \
+	((P4_CONFIG_HT)					| \
+	 p4_config_pack_escr(P4_ESCR_T0_OS		| \
+			     P4_ESCR_T0_USR		| \
+			     P4_ESCR_T1_OS		| \
+			     P4_ESCR_T1_USR)		| \
+	 p4_config_pack_cccr(P4_CCCR_OVF		| \
+			     P4_CCCR_CASCADE		| \
+			     P4_CCCR_FORCE_OVF		| \
+			     P4_CCCR_THREAD_ANY		| \
+			     P4_CCCR_OVF_PMI_T0		| \
+			     P4_CCCR_OVF_PMI_T1		| \
+			     P4_CONFIG_ALIASABLE))
+
 static inline bool p4_is_event_cascaded(u64 config)
 {
 	u32 cccr = p4_config_unpack_cccr(config);
diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
index c53d433..b7a010f 100644
--- a/arch/x86/kernel/cpu/perf_event.c
+++ b/arch/x86/kernel/cpu/perf_event.c
@@ -274,7 +274,6 @@ struct x86_pmu {
 	void		(*enable_all)(int added);
 	void		(*enable)(struct perf_event *);
 	void		(*disable)(struct perf_event *);
-	void		(*hw_watchdog_set_attr)(struct perf_event_attr *attr);
 	int		(*hw_config)(struct perf_event *event);
 	int		(*schedule_events)(struct cpu_hw_events *cpuc, int n, int *assign);
 	unsigned	eventsel;
@@ -360,12 +359,6 @@ static u64 __read_mostly hw_cache_extra_regs
 				[PERF_COUNT_HW_CACHE_OP_MAX]
 				[PERF_COUNT_HW_CACHE_RESULT_MAX];
 
-void hw_nmi_watchdog_set_attr(struct perf_event_attr *wd_attr)
-{
-	if (x86_pmu.hw_watchdog_set_attr)
-		x86_pmu.hw_watchdog_set_attr(wd_attr);
-}
-
 /*
  * Propagate event elapsed time into the generic event.
  * Can only be executed on the CPU where the event is active.
diff --git a/arch/x86/kernel/cpu/perf_event_p4.c b/arch/x86/kernel/cpu/perf_event_p4.c
index fb901c5..8b7a0c3 100644
--- a/arch/x86/kernel/cpu/perf_event_p4.c
+++ b/arch/x86/kernel/cpu/perf_event_p4.c
@@ -570,11 +570,92 @@ static __initconst const u64 p4_hw_cache_event_ids
  },
 };
 
+/*
+ * Because of Netburst being quite restricted in now
+ * many same events can run simultaneously, we use
+ * event aliases, ie different events which have the
+ * same functionallity but use non-intersected resources
+ * (ESCR/CCCR/couter registers). This allow us to run
+ * two or more semi-same events together. It is done
+ * transparently to a user space.
+ *
+ * Never set any cusom internal bits such as P4_CONFIG_HT,
+ * P4_CONFIG_ALIASABLE or bits for P4_PEBS_METRIC, they are
+ * either up-to-dated automatically either not appliable
+ * at all.
+ *
+ * And be really carefull choosing aliases!
+ */
+struct p4_event_alias {
+	u64 orig;
+	u64 alter;
+} p4_event_aliases[] = {
+	{
+		/*
+		 * Non-halted cycles can be substituted with
+		 * non-sleeping cycles (see Intel SDM Vol3b for
+		 * details).
+		 */
+	.orig	=
+		p4_config_pack_escr(P4_ESCR_EVENT(P4_EVENT_GLOBAL_POWER_EVENTS)		|
+				    P4_ESCR_EMASK_BIT(P4_EVENT_GLOBAL_POWER_EVENTS, RUNNING)),
+	.alter	=
+		p4_config_pack_escr(P4_ESCR_EVENT(P4_EVENT_EXECUTION_EVENT)		|
+				    P4_ESCR_EMASK_BIT(P4_EVENT_EXECUTION_EVENT, NBOGUS0)|
+				    P4_ESCR_EMASK_BIT(P4_EVENT_EXECUTION_EVENT, NBOGUS1)|
+				    P4_ESCR_EMASK_BIT(P4_EVENT_EXECUTION_EVENT, NBOGUS2)|
+				    P4_ESCR_EMASK_BIT(P4_EVENT_EXECUTION_EVENT, NBOGUS3)|
+				    P4_ESCR_EMASK_BIT(P4_EVENT_EXECUTION_EVENT, BOGUS0)	|
+				    P4_ESCR_EMASK_BIT(P4_EVENT_EXECUTION_EVENT, BOGUS1)	|
+				    P4_ESCR_EMASK_BIT(P4_EVENT_EXECUTION_EVENT, BOGUS2)	|
+				    P4_ESCR_EMASK_BIT(P4_EVENT_EXECUTION_EVENT, BOGUS3))|
+		p4_config_pack_cccr(P4_CCCR_THRESHOLD(15) | P4_CCCR_COMPLEMENT		|
+				    P4_CCCR_COMPARE),
+	},
+};
+
+static u64 p4_get_alias_event(u64 config)
+{
+	u64 config_match;
+	int i;
+
+	/*
+	 * Probably we're lucky and don't have to do
+	 * matching over all config bits.
+	 */
+	if (!(config & P4_CONFIG_ALIASABLE))
+		return 0;
+
+	config_match = config & P4_CONFIG_EVENT_ALIAS_MASK;
+
+	/*
+	 * If an event was previously swapped to the alter config
+	 * we should swap it back otherwise contnention on registers
+	 * will return back.
+	 */
+	for (i = 0; i < ARRAY_SIZE(p4_event_aliases); i++) {
+		if (config_match == p4_event_aliases[i].orig) {
+			config_match = p4_event_aliases[i].alter;
+			break;
+		} else if (config_match == p4_event_aliases[i].alter) {
+			config_match = p4_event_aliases[i].orig;
+			break;
+		}
+	}
+
+	if (i >= ARRAY_SIZE(p4_event_aliases))
+		return 0;
+
+	return config_match |
+		(config & P4_CONFIG_EVENT_ALIAS_IMMUTABLE_BITS);
+}
+
 static u64 p4_general_events[PERF_COUNT_HW_MAX] = {
   /* non-halted CPU clocks */
   [PERF_COUNT_HW_CPU_CYCLES] =
 	p4_config_pack_escr(P4_ESCR_EVENT(P4_EVENT_GLOBAL_POWER_EVENTS)		|
-		P4_ESCR_EMASK_BIT(P4_EVENT_GLOBAL_POWER_EVENTS, RUNNING)),
+		P4_ESCR_EMASK_BIT(P4_EVENT_GLOBAL_POWER_EVENTS, RUNNING))	|
+		P4_CONFIG_ALIASABLE,
 
   /*
    * retired instructions
@@ -719,31 +800,6 @@ static int p4_validate_raw_event(struct perf_event *event)
 	return 0;
 }
 
-static void p4_hw_watchdog_set_attr(struct perf_event_attr *wd_attr)
-{
-	/*
-	 * Watchdog ticks are special on Netburst, we use
-	 * that named "non-sleeping" ticks as recommended
-	 * by Intel SDM Vol3b.
-	 */
-	WARN_ON_ONCE(wd_attr->type	!= PERF_TYPE_HARDWARE ||
-		     wd_attr->config	!= PERF_COUNT_HW_CPU_CYCLES);
-
-	wd_attr->type	= PERF_TYPE_RAW;
-	wd_attr->config	=
-		p4_config_pack_escr(P4_ESCR_EVENT(P4_EVENT_EXECUTION_EVENT)		|
-			P4_ESCR_EMASK_BIT(P4_EVENT_EXECUTION_EVENT, NBOGUS0)		|
-			P4_ESCR_EMASK_BIT(P4_EVENT_EXECUTION_EVENT, NBOGUS1)		|
-			P4_ESCR_EMASK_BIT(P4_EVENT_EXECUTION_EVENT, NBOGUS2)		|
-			P4_ESCR_EMASK_BIT(P4_EVENT_EXECUTION_EVENT, NBOGUS3)		|
-			P4_ESCR_EMASK_BIT(P4_EVENT_EXECUTION_EVENT, BOGUS0)		|
-			P4_ESCR_EMASK_BIT(P4_EVENT_EXECUTION_EVENT, BOGUS1)		|
-			P4_ESCR_EMASK_BIT(P4_EVENT_EXECUTION_EVENT, BOGUS2)		|
-			P4_ESCR_EMASK_BIT(P4_EVENT_EXECUTION_EVENT, BOGUS3))		|
-		p4_config_pack_cccr(P4_CCCR_THRESHOLD(15) | P4_CCCR_COMPLEMENT		|
-			P4_CCCR_COMPARE);
-}
-
 static int p4_hw_config(struct perf_event *event)
 {
 	int cpu = get_cpu();
@@ -1159,6 +1215,8 @@ static int p4_pmu_schedule_events(struct cpu_hw_events *cpuc, int n, int *assign
 	struct p4_event_bind *bind;
 	unsigned int i, thread, num;
 	int cntr_idx, escr_idx;
+	u64 config_alias;
+	int pass;
 
 	bitmap_zero(used_mask, X86_PMC_IDX_MAX);
 	bitmap_zero(escr_mask, P4_ESCR_MSR_TABLE_SIZE);
@@ -1167,6 +1225,17 @@ static int p4_pmu_schedule_events(struct cpu_hw_events *cpuc, int n, int *assign
 
 		hwc = &cpuc->event_list[i]->hw;
 		thread = p4_ht_thread(cpu);
+		pass = 0;
+
+again:
+		/*
+		 * Aliases are swappable so we may hit circular
+		 * lock if both original config and alias need
+		 * resources (MSR registers) which already busy.
+		 */
+		if (pass > 2)
+			goto done;
+
 		bind = p4_config_get_bind(hwc->config);
 		escr_idx = p4_get_escr_idx(bind->escr_msr[thread]);
 		if (unlikely(escr_idx == -1))
@@ -1180,8 +1249,17 @@ static int p4_pmu_schedule_events(struct cpu_hw_events *cpuc, int n, int *assign
 		}
 
 		cntr_idx = p4_next_cntr(thread, used_mask, bind);
-		if (cntr_idx == -1 || test_bit(escr_idx, escr_mask))
-			goto done;
+		if (cntr_idx == -1 || test_bit(escr_idx, escr_mask)) {
+			/*
+			 * Probably an event alias is still available.
+			 */
+			config_alias = p4_get_alias_event(hwc->config);
+			if (!config_alias)
+				goto done;
+			hwc->config = config_alias;
+			pass++;
+			goto again;
+		}
 
 		p4_pmu_swap_config_ts(hwc, cpu);
 		if (assign)
@@ -1218,7 +1296,6 @@ static __initconst const struct x86_pmu p4_pmu = {
 	.cntval_bits		= ARCH_P4_CNTRVAL_BITS,
 	.cntval_mask		= ARCH_P4_CNTRVAL_MASK,
 	.max_period		= (1ULL << (ARCH_P4_CNTRVAL_BITS - 1)) - 1,
-	.hw_watchdog_set_attr	= p4_hw_watchdog_set_attr,
 	.hw_config		= p4_hw_config,
 	.schedule_events	= p4_pmu_schedule_events,
 	/*
diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index a933e3a..36491cd 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -200,7 +200,6 @@ static int is_softlockup(unsigned long touch_ts)
 }
 
 #ifdef CONFIG_HARDLOCKUP_DETECTOR
-void __weak hw_nmi_watchdog_set_attr(struct perf_event_attr *wd_attr) { }
 
 static struct perf_event_attr wd_hw_attr = {
 	.type		= PERF_TYPE_HARDWARE,
@@ -372,7 +371,6 @@ static int watchdog_nmi_enable(int cpu)
 
 	wd_attr = &wd_hw_attr;
 	wd_attr->sample_period = hw_nmi_get_sample_period(watchdog_thresh);
-	hw_nmi_watchdog_set_attr(wd_attr);
 
 	/* Try to register using hardware perf events */
 	event = perf_event_create_kernel_counter(wd_attr, cpu, NULL, watchdog_overflow_callback, NULL);

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH -tip/perf/core] perf, x86: P4 PMU - Introduce event alias feature
  2011-07-21  6:46 ` Ingo Molnar
  2011-07-21  7:20   ` Cyrill Gorcunov
@ 2011-07-21 12:02   ` Peter Zijlstra
  1 sibling, 0 replies; 9+ messages in thread
From: Peter Zijlstra @ 2011-07-21 12:02 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Cyrill Gorcunov, Don Zickus, LKML, Stephane Eranian, Lin Ming,
	Arnaldo Carvalho de Melo, Frederic Weisbecker, Steven Rostedt

On Thu, 2011-07-21 at 08:46 +0200, Ingo Molnar wrote:
> 
> This looks like the right approach otherwise. Peter? 

Yeah, aliases are cool :-)

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2011-07-21 12:02 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-07-08 20:17 [PATCH -tip/perf/core] perf, x86: P4 PMU - Introduce event alias feature Cyrill Gorcunov
2011-07-14 18:46 ` Don Zickus
2011-07-21  6:46 ` Ingo Molnar
2011-07-21  7:20   ` Cyrill Gorcunov
2011-07-21  7:52     ` Ingo Molnar
2011-07-21  8:07       ` Cyrill Gorcunov
2011-07-21  8:15         ` Ingo Molnar
2011-07-21 12:02   ` Peter Zijlstra
2011-07-21 10:00 ` [tip:perf/core] " tip-bot for Cyrill Gorcunov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).