All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/2] perf/x86/amd: Add support for Large Increment per Cycle Events
@ 2019-11-14 18:37 Kim Phillips
  2019-11-14 18:37 ` [PATCH 1/2] perf/x86/amd: Constrain Large Increment per Cycle events Kim Phillips
  2019-11-14 18:37 ` [PATCH 2/2] perf/x86/amd: Add support for Large Increment per Cycle Events Kim Phillips
  0 siblings, 2 replies; 9+ messages in thread
From: Kim Phillips @ 2019-11-14 18:37 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar; +Cc: linux-kernel, Kim Phillips

This patchseries adds support for Large Increment per Cycle Events,
which is needed to count events like Retired SSE/AVX FLOPs.
The first patch constrains Large Increment events to the even PMCs,
and the second patch changes the scheduler to accommodate and
program the new Merge event needed on the odd counters.

The RFC was posted here:

https://lkml.org/lkml/2019/8/26/828

Changes since then include mostly fixing interoperation with the
watchdog, splitting, rewording, and addressing Peter Zijlstra's
comments:

 - Mentioned programming the odd counter before the even counter
   in the commit text, as is now also done in the code.

 - Do the programming of the counters in the enable/disable paths
   instead of the commit_scheduler hook.

 - Instead of the loop re-counting all large increment events,
   have collect_events() and a new amd_put_event_constraints_f17h
   update a new cpuc variable 'n_lg_inc'.  Now the scheduler
   does a simple subtraction to get the target gpmax value.

 - Amend the fastpath's used_mask code to fix a problem where
   counter programming was being overwritten when running with
   the watchdog enabled.

 - Omit the superfluous __set_bit(idx + 1) in __perf_sched_find_counter
   and clear the large increment's sched->state.used bit in the
   path where a failure to schedule is determined due to the
   next counter already being used (thanks Nathan Fontenot).

 - Broaden new PMU initialization code to run on families 17h and
   above.

 - Have new is_large_inc(strcut perf_event) common to all x86 paths
   as is is_pebs_pt().  That way, the raw event code checker
   amd_is_lg_inc_event_code() can stay in its vendor-specific area
   events/amd/core.c.

 - __set_bit, WARN_ON(!gpmax), all addressed.

 - WRT changing the naming to PAIR, etc. I dislike the idea because
   h/w documentation consistently calls this now relatively old
   feature for "Large Increment per Cycle" events, and the secondary
   event needed, specifically the "Merge event (0xFFF)".  When I
   started this project the biggest problem was disambiguating
   between the Large Increment event (FLOPs, or others), and the
   Merge event (0xFFF) itself.  Different phases had "Merge" for
   the Merge event vs. "Merged" for the Large Increment event(s),
   or "Mergee", which made reading the source code too easy to
   mistake one for the other. So I opted for two distinctly
   different base terms/stem-words: Large increment (lg_inc) and
   Merge, to match the documentation, which basically has it right.
   Changing the term to "pair" would have created the same "pair" vs.
   "paired" vs. "pairer" etc. confusion, so I dropped it.

 - WRT the comment "How about you make __perf_sched_find_count() set
   the right value? That already knows it did this.", I didn't see
   how I'd get away from still having to do the constraints flag &
   LARGE_INC check in perf_assign_events(), to re-adjust the assignment
   in the assign array, or sched.state.counter.  This code really
   is only needed after the counter assignment is made, in order to
   program the h/w correctly.

Kim Phillips (2):
  perf/x86/amd: Constrain Large Increment per Cycle events
  perf/x86/amd: Add support for Large Increment per Cycle Events

 arch/x86/events/amd/core.c   | 110 +++++++++++++++++++++++++----------
 arch/x86/events/core.c       |  46 ++++++++++++++-
 arch/x86/events/perf_event.h |  21 +++++++
 3 files changed, 145 insertions(+), 32 deletions(-)

-- 
2.24.0


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH 1/2] perf/x86/amd: Constrain Large Increment per Cycle events
  2019-11-14 18:37 [PATCH 0/2] perf/x86/amd: Add support for Large Increment per Cycle Events Kim Phillips
@ 2019-11-14 18:37 ` Kim Phillips
  2020-01-17 10:08   ` [tip: perf/core] " tip-bot2 for Kim Phillips
  2019-11-14 18:37 ` [PATCH 2/2] perf/x86/amd: Add support for Large Increment per Cycle Events Kim Phillips
  1 sibling, 1 reply; 9+ messages in thread
From: Kim Phillips @ 2019-11-14 18:37 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar
  Cc: linux-kernel, Kim Phillips, Janakarajan Natarajan,
	Suravee Suthikulpanit, Tom Lendacky, Stephane Eranian,
	Martin Liška, Arnaldo Carvalho de Melo, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, Thomas Gleixner, Borislav Petkov,
	H. Peter Anvin, x86

AMD Family 17h processors and above gain support for Large Increment
per Cycle events.  Unfortunately there is no CPUID or equivalent bit
that indicates whether the feature exists or not, so we continue to
determine eligibility based on a CPU family number comparison.

For Large Increment per Cycle events, we add a f17h-and-compatibles
get_event_constraints_f17h() that returns an even counter bitmask:
Large Increment per Cycle events can only be placed on PMCs 0, 2,
and 4 out of the currently available 0-5.  The only currently
public event that requires this feature to report valid counts
is PMCx003 "Retired SSE/AVX Operations".

Note that the CPU family logic in amd_core_pmu_init() is changed
so as to be able to selectively add initialization for features
available in ranges of backward-compatible CPU families.  This
Large Increment per Cycle feature is expected to be retained
in future families.

A side-effect of assigning a new get_constraints function for f17h
disables calling the old (prior to f15h) amd_get_event_constraints
implementation left enabled by commit e40ed1542dd7 ("perf/x86: Add perf
support for AMD family-17h processors"), which is no longer
necessary since those North Bridge event codes are obsoleted.

Also fix a spelling mistake whilst in the area (calulating ->
calculating).

Fixes: e40ed1542dd7 ("perf/x86: Add perf support for AMD family-17h processors")
Signed-off-by: Kim Phillips <kim.phillips@amd.com>
Cc: Janakarajan Natarajan <Janakarajan.Natarajan@amd.com>
Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Martin Liška <mliska@suse.cz>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Cc: linux-kernel@vger.kernel.org
---
 arch/x86/events/amd/core.c   | 91 ++++++++++++++++++++++++------------
 arch/x86/events/perf_event.h |  2 +
 2 files changed, 63 insertions(+), 30 deletions(-)

diff --git a/arch/x86/events/amd/core.c b/arch/x86/events/amd/core.c
index 64c3e70b0556..78f771cadb8e 100644
--- a/arch/x86/events/amd/core.c
+++ b/arch/x86/events/amd/core.c
@@ -301,6 +301,25 @@ static inline int amd_pmu_addr_offset(int index, bool eventsel)
 	return offset;
 }
 
+/*
+ * AMD64 events are detected based on their event codes.
+ */
+static inline unsigned int amd_get_event_code(struct hw_perf_event *hwc)
+{
+	return ((hwc->config >> 24) & 0x0f00) | (hwc->config & 0x00ff);
+}
+
+static inline bool amd_is_lg_inc_event_code(struct hw_perf_event *hwc)
+{
+	if (!(x86_pmu.flags & PMU_FL_MERGE))
+		return false;
+
+	switch (amd_get_event_code(hwc)) {
+	case 0x003:	return true;	/* Retired SSE/AVX FLOPs */
+	default:	return false;
+	}
+}
+
 static int amd_core_hw_config(struct perf_event *event)
 {
 	if (event->attr.exclude_host && event->attr.exclude_guest)
@@ -319,14 +338,6 @@ static int amd_core_hw_config(struct perf_event *event)
 	return 0;
 }
 
-/*
- * AMD64 events are detected based on their event codes.
- */
-static inline unsigned int amd_get_event_code(struct hw_perf_event *hwc)
-{
-	return ((hwc->config >> 24) & 0x0f00) | (hwc->config & 0x00ff);
-}
-
 static inline int amd_is_nb_event(struct hw_perf_event *hwc)
 {
 	return (hwc->config & 0xe0) == 0xe0;
@@ -864,6 +875,20 @@ amd_get_event_constraints_f15h(struct cpu_hw_events *cpuc, int idx,
 	}
 }
 
+static struct event_constraint lg_inc_constraint;
+
+static struct event_constraint *
+amd_get_event_constraints_f17h(struct cpu_hw_events *cpuc, int idx,
+			       struct perf_event *event)
+{
+	struct hw_perf_event *hwc = &event->hw;
+
+	if (amd_is_lg_inc_event_code(hwc))
+		return &lg_inc_constraint;
+
+	return &unconstrained;
+}
+
 static ssize_t amd_event_sysfs_show(char *page, u64 config)
 {
 	u64 event = (config & ARCH_PERFMON_EVENTSEL_EVENT) |
@@ -907,33 +932,15 @@ static __initconst const struct x86_pmu amd_pmu = {
 
 static int __init amd_core_pmu_init(void)
 {
+	u64 even_ctr_mask = 0ULL;
+	int i;
+
 	if (!boot_cpu_has(X86_FEATURE_PERFCTR_CORE))
 		return 0;
 
-	/* Avoid calulating the value each time in the NMI handler */
+	/* Avoid calculating the value each time in the NMI handler */
 	perf_nmi_window = msecs_to_jiffies(100);
 
-	switch (boot_cpu_data.x86) {
-	case 0x15:
-		pr_cont("Fam15h ");
-		x86_pmu.get_event_constraints = amd_get_event_constraints_f15h;
-		break;
-	case 0x17:
-		pr_cont("Fam17h ");
-		/*
-		 * In family 17h, there are no event constraints in the PMC hardware.
-		 * We fallback to using default amd_get_event_constraints.
-		 */
-		break;
-	case 0x18:
-		pr_cont("Fam18h ");
-		/* Using default amd_get_event_constraints. */
-		break;
-	default:
-		pr_err("core perfctr but no constraints; unknown hardware!\n");
-		return -ENODEV;
-	}
-
 	/*
 	 * If core performance counter extensions exists, we must use
 	 * MSR_F15H_PERF_CTL/MSR_F15H_PERF_CTR msrs. See also
@@ -948,6 +955,30 @@ static int __init amd_core_pmu_init(void)
 	 */
 	x86_pmu.amd_nb_constraints = 0;
 
+	if (boot_cpu_data.x86 == 0x15) {
+		pr_cont("Fam15h ");
+		x86_pmu.get_event_constraints = amd_get_event_constraints_f15h;
+	}
+	if (boot_cpu_data.x86 >= 0x17) {
+		pr_cont("Fam17h+ ");
+		/*
+		 * Family 17h and compatibles have constraints for Large
+		 * Increment per Cycle events: they may only be assigned an
+		 * even numbered counter that has a consecutive adjacent odd
+		 * numbered counter following it.
+		 */
+		for (i = 0; i < x86_pmu.num_counters - 1; i += 2)
+			even_ctr_mask |= 1 << i;
+
+		lg_inc_constraint = (struct event_constraint)
+				    __EVENT_CONSTRAINT(0, even_ctr_mask, 0,
+				    x86_pmu.num_counters / 2, 0,
+				    PERF_X86_EVENT_LARGE_INC);
+
+		x86_pmu.get_event_constraints = amd_get_event_constraints_f17h;
+		x86_pmu.flags |= PMU_FL_MERGE;
+	}
+
 	pr_cont("core perfctr, ");
 	return 0;
 }
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index ecacfbf4ebc1..6154f06eb4ba 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -77,6 +77,7 @@ static inline bool constraint_match(struct event_constraint *c, u64 ecode)
 #define PERF_X86_EVENT_AUTO_RELOAD	0x0200 /* use PEBS auto-reload */
 #define PERF_X86_EVENT_LARGE_PEBS	0x0400 /* use large PEBS */
 #define PERF_X86_EVENT_PEBS_VIA_PT	0x0800 /* use PT buffer for PEBS */
+#define PERF_X86_EVENT_LARGE_INC	0x1000 /* Large Increment per Cycle */
 
 struct amd_nb {
 	int nb_id;  /* NorthBridge id */
@@ -735,6 +736,7 @@ do {									\
 #define PMU_FL_EXCL_ENABLED	0x8 /* exclusive counter active */
 #define PMU_FL_PEBS_ALL		0x10 /* all events are valid PEBS events */
 #define PMU_FL_TFA		0x20 /* deal with TSX force abort */
+#define PMU_FL_MERGE		0x40 /* merge counters for large incr. events */
 
 #define EVENT_VAR(_id)  event_attr_##_id
 #define EVENT_PTR(_id) &event_attr_##_id.attr.attr
-- 
2.24.0


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH 2/2] perf/x86/amd: Add support for Large Increment per Cycle Events
  2019-11-14 18:37 [PATCH 0/2] perf/x86/amd: Add support for Large Increment per Cycle Events Kim Phillips
  2019-11-14 18:37 ` [PATCH 1/2] perf/x86/amd: Constrain Large Increment per Cycle events Kim Phillips
@ 2019-11-14 18:37 ` Kim Phillips
  2019-12-20 12:09   ` Peter Zijlstra
  1 sibling, 1 reply; 9+ messages in thread
From: Kim Phillips @ 2019-11-14 18:37 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar
  Cc: linux-kernel, Kim Phillips, Janakarajan Natarajan,
	Suravee Suthikulpanit, Tom Lendacky, Stephane Eranian,
	Martin Liška, Arnaldo Carvalho de Melo, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, Thomas Gleixner, Borislav Petkov,
	H. Peter Anvin, x86

Description of hardware operation
---------------------------------

The core AMD PMU has a 4-bit wide per-cycle increment for each
performance monitor counter.  That works for most events, but
now with AMD Family 17h and above processors, some events can
occur more than 15 times in a cycle.  Those events are called
"Large Increment per Cycle" events. In order to count these
events, two adjacent h/w PMCs get their count signals merged
to form 8 bits per cycle total.  In addition, the PERF_CTR count
registers are merged to be able to count up to 64 bits.

Normally, events like instructions retired, get programmed on a single
counter like so:

PERF_CTL0 (MSR 0xc0010200) 0x000000000053ff0c # event 0x0c, umask 0xff
PERF_CTR0 (MSR 0xc0010201) 0x0000800000000001 # r/w 48-bit count

The next counter at MSRs 0xc0010202-3 remains unused, or can be used
independently to count something else.

When counting Large Increment per Cycle events, such as FLOPs,
however, we now have to reserve the next counter and program the
PERF_CTL (config) register with the Merge event (0xFFF), like so:

PERF_CTL0 (msr 0xc0010200) 0x000000000053ff03 # FLOPs event, umask 0xff
PERF_CTR0 (msr 0xc0010201) 0x0000800000000001 # rd 64-bit cnt, wr lo 48b
PERF_CTL1 (msr 0xc0010202) 0x0000000f004000ff # Merge event, enable bit
PERF_CTR1 (msr 0xc0010203) 0x0000000000000000 # wr hi 16-bits count

The count is widened from the normal 48-bits to 64 bits by having the
second counter carry the higher 16 bits of the count in its lower 16
bits of its counter register.

The odd counter, e.g., PERF_CTL1, is programmed with the enabled Merge
event before the even counter, PERF_CTL0.

The Large Increment feature is available starting with Family 17h.
For more details, search any Family 17h PPR for the "Large Increment
per Cycle Events" section, e.g., section 2.1.15.3 on p. 173 in this
version:

https://www.amd.com/system/files/TechDocs/56176_ppr_Family_17h_Model_71h_B0_pub_Rev_3.06.zip

Description of software operation
---------------------------------

The following steps are taken in order to support reserving and
enabling the extra counter for Large Increment per Cycle events:

1. In the main x86 scheduler, we reduce the number of available
counters by the number of Large Increment per Cycle events being
scheduled, tracked by a new cpuc variable 'n_lg_inc' and a new
amd_put_event_constraints_f17h().  This improves the counter
scheduler success rate.

2. In perf_assign_events(), if a counter is assigned to a Large
Increment event, we increment the current counter variable, so the
counter used for the Merge event is removed from assignment
consideration by upcoming event assignments.

3. In find_counter(), if a counter has been found for the Large
Increment event, we set the next counter as used, to prevent other
events from using it.

4. We perform steps 2 & 3 also in the x86 scheduler fastpath, i.e.,
we add Merge event accounting to the existing used_mask logic.

5. Finally, we add on the programming of Merge event to the
neighbouring PMC counters in the counter enable/disable{_all}
code paths.

Currently, software does not support a single PMU with mixed 48- and
64-bit counting, so Large increment event counts are limited to 48
bits.  In set_period, we zero-out the upper 16 bits of the count, so
the hardware doesn't copy them to the even counter's higher bits.

Simple invocation example showing counting 8 FLOPs per 256-bit/%ymm
vaddps instruction executed in a loop 100 million times:

perf stat -e cpu/fp_ret_sse_avx_ops.all/,cpu/instructions/ <workload>

 Performance counter stats for '<workload>':

       800,000,000      cpu/fp_ret_sse_avx_ops.all/u
       300,042,101      cpu/instructions/u

Prior to this patch, the reported SSE/AVX FLOPs retired count would
be wrong.

Signed-off-by: Kim Phillips <kim.phillips@amd.com>
Cc: Janakarajan Natarajan <Janakarajan.Natarajan@amd.com>
Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Martin Liška <mliska@suse.cz>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Cc: linux-kernel@vger.kernel.org
---
 arch/x86/events/amd/core.c   | 19 +++++++++++++++
 arch/x86/events/core.c       | 46 ++++++++++++++++++++++++++++++++++--
 arch/x86/events/perf_event.h | 19 +++++++++++++++
 3 files changed, 82 insertions(+), 2 deletions(-)

diff --git a/arch/x86/events/amd/core.c b/arch/x86/events/amd/core.c
index 78f771cadb8e..e69723cb5b51 100644
--- a/arch/x86/events/amd/core.c
+++ b/arch/x86/events/amd/core.c
@@ -14,6 +14,10 @@
 static DEFINE_PER_CPU(unsigned long, perf_nmi_tstamp);
 static unsigned long perf_nmi_window;
 
+/* AMD Event 0xFFF: Merge.  Used with Large Increment per Cycle events */
+#define AMD_MERGE_EVENT ((0xFULL << 32) | 0xFFULL)
+#define AMD_MERGE_EVENT_ENABLE (AMD_MERGE_EVENT | ARCH_PERFMON_EVENTSEL_ENABLE)
+
 static __initconst const u64 amd_hw_cache_event_ids
 				[PERF_COUNT_HW_CACHE_MAX]
 				[PERF_COUNT_HW_CACHE_OP_MAX]
@@ -335,6 +339,10 @@ static int amd_core_hw_config(struct perf_event *event)
 	else if (event->attr.exclude_guest)
 		event->hw.config |= AMD64_EVENTSEL_HOSTONLY;
 
+	if ((x86_pmu.flags & PMU_FL_MERGE) &&
+	    amd_is_lg_inc_event_code(&event->hw))
+		event->hw.flags |= PERF_X86_EVENT_LARGE_INC;
+
 	return 0;
 }
 
@@ -889,6 +897,15 @@ amd_get_event_constraints_f17h(struct cpu_hw_events *cpuc, int idx,
 	return &unconstrained;
 }
 
+static void amd_put_event_constraints_f17h(struct cpu_hw_events *cpuc,
+					   struct perf_event *event)
+{
+	struct hw_perf_event *hwc = &event->hw;
+
+	if (is_large_inc(hwc))
+		--cpuc->n_lg_inc;
+}
+
 static ssize_t amd_event_sysfs_show(char *page, u64 config)
 {
 	u64 event = (config & ARCH_PERFMON_EVENTSEL_EVENT) |
@@ -919,6 +936,7 @@ static __initconst const struct x86_pmu amd_pmu = {
 	.max_period		= (1ULL << 47) - 1,
 	.get_event_constraints	= amd_get_event_constraints,
 	.put_event_constraints	= amd_put_event_constraints,
+	.perf_ctr_merge_en	= AMD_MERGE_EVENT_ENABLE,
 
 	.format_attrs		= amd_format_attr,
 	.events_sysfs_show	= amd_event_sysfs_show,
@@ -976,6 +994,7 @@ static int __init amd_core_pmu_init(void)
 				    PERF_X86_EVENT_LARGE_INC);
 
 		x86_pmu.get_event_constraints = amd_get_event_constraints_f17h;
+		x86_pmu.put_event_constraints = amd_put_event_constraints_f17h;
 		x86_pmu.flags |= PMU_FL_MERGE;
 	}
 
diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 7b21455d7504..51c4c69a9d76 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -612,6 +612,7 @@ void x86_pmu_disable_all(void)
 	int idx;
 
 	for (idx = 0; idx < x86_pmu.num_counters; idx++) {
+		struct hw_perf_event *hwc = &cpuc->events[idx]->hw;
 		u64 val;
 
 		if (!test_bit(idx, cpuc->active_mask))
@@ -621,6 +622,8 @@ void x86_pmu_disable_all(void)
 			continue;
 		val &= ~ARCH_PERFMON_EVENTSEL_ENABLE;
 		wrmsrl(x86_pmu_config_addr(idx), val);
+		if (is_large_inc(hwc))
+			wrmsrl(x86_pmu_config_addr(idx + 1), 0);
 	}
 }
 
@@ -787,6 +790,16 @@ static bool __perf_sched_find_counter(struct perf_sched *sched)
 		if (!__test_and_set_bit(idx, sched->state.used)) {
 			if (sched->state.nr_gp++ >= sched->max_gp)
 				return false;
+			/*
+			 * Large inc. events need the Merge event
+			 * on the next counter
+			 */
+			if ((c->flags & PERF_X86_EVENT_LARGE_INC) &&
+			    __test_and_set_bit(idx + 1, sched->state.used)) {
+				/* next counter already used */
+				__clear_bit(idx, sched->state.used);
+				return false;
+			}
 
 			goto done;
 		}
@@ -849,14 +862,19 @@ int perf_assign_events(struct event_constraint **constraints, int n,
 			int wmin, int wmax, int gpmax, int *assign)
 {
 	struct perf_sched sched;
+	struct event_constraint *c;
 
 	perf_sched_init(&sched, constraints, n, wmin, wmax, gpmax);
 
 	do {
 		if (!perf_sched_find_counter(&sched))
 			break;	/* failed */
-		if (assign)
+		if (assign) {
 			assign[sched.state.event] = sched.state.counter;
+			c = constraints[sched.state.event];
+			if (c->flags & PERF_X86_EVENT_LARGE_INC)
+				sched.state.counter++;
+		}
 	} while (perf_sched_next_event(&sched));
 
 	return sched.state.unassigned;
@@ -926,10 +944,14 @@ int x86_schedule_events(struct cpu_hw_events *cpuc, int n, int *assign)
 			break;
 
 		/* not already used */
-		if (test_bit(hwc->idx, used_mask))
+		if (test_bit(hwc->idx, used_mask) || (is_large_inc(hwc) &&
+		    test_bit(hwc->idx + 1, used_mask)))
 			break;
 
 		__set_bit(hwc->idx, used_mask);
+		if (is_large_inc(hwc))
+			__set_bit(hwc->idx + 1, used_mask);
+
 		if (assign)
 			assign[i] = hwc->idx;
 	}
@@ -952,6 +974,15 @@ int x86_schedule_events(struct cpu_hw_events *cpuc, int n, int *assign)
 		    READ_ONCE(cpuc->excl_cntrs->exclusive_present))
 			gpmax /= 2;
 
+		/*
+		 * Reduce the amount of available counters to allow fitting
+		 * the extra Merge events needed by large increment events.
+		 */
+		if (x86_pmu.flags & PMU_FL_MERGE) {
+			gpmax = x86_pmu.num_counters - cpuc->n_lg_inc;
+			WARN_ON(gpmax <= 0);
+		}
+
 		unsched = perf_assign_events(cpuc->event_constraint, n, wmin,
 					     wmax, gpmax, assign);
 	}
@@ -1032,6 +1063,8 @@ static int collect_events(struct cpu_hw_events *cpuc, struct perf_event *leader,
 			return -EINVAL;
 		cpuc->event_list[n] = leader;
 		n++;
+		if (is_large_inc(&leader->hw))
+			cpuc->n_lg_inc++;
 	}
 	if (!dogrp)
 		return n;
@@ -1046,6 +1079,8 @@ static int collect_events(struct cpu_hw_events *cpuc, struct perf_event *leader,
 
 		cpuc->event_list[n] = event;
 		n++;
+		if (is_large_inc(&event->hw))
+			cpuc->n_lg_inc++;
 	}
 	return n;
 }
@@ -1231,6 +1266,13 @@ int x86_perf_event_set_period(struct perf_event *event)
 
 	wrmsrl(hwc->event_base, (u64)(-left) & x86_pmu.cntval_mask);
 
+	/*
+	 * Clear the Merge event counter's upper 16 bits since
+	 * we currently declare a 48-bit counter width
+	 */
+	if (is_large_inc(hwc))
+		wrmsrl(x86_pmu_event_addr(idx + 1), 0);
+
 	/*
 	 * Due to erratum on certan cpu we need
 	 * a second write to be sure the register
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index 6154f06eb4ba..6db6e4c69e3d 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -273,6 +273,7 @@ struct cpu_hw_events {
 	struct amd_nb			*amd_nb;
 	/* Inverted mask of bits to clear in the perf_ctr ctrl registers */
 	u64				perf_ctr_virt_mask;
+	int				n_lg_inc; /* Large increment events */
 
 	void				*kfree_on_online[X86_PERF_KFREE_MAX];
 };
@@ -687,6 +688,8 @@ struct x86_pmu {
 	 * AMD bits
 	 */
 	unsigned int	amd_nb_constraints : 1;
+	/* Merge event enable: value to set in perf_ctl next to large inc. */
+	u64		perf_ctr_merge_en;
 
 	/*
 	 * Extra registers for events
@@ -832,6 +835,11 @@ int x86_pmu_hw_config(struct perf_event *event);
 
 void x86_pmu_disable_all(void);
 
+static inline int is_large_inc(struct hw_perf_event *hwc)
+{
+	return !!(hwc->flags & PERF_X86_EVENT_LARGE_INC);
+}
+
 static inline void __x86_pmu_enable_event(struct hw_perf_event *hwc,
 					  u64 enable_mask)
 {
@@ -839,6 +847,14 @@ static inline void __x86_pmu_enable_event(struct hw_perf_event *hwc,
 
 	if (hwc->extra_reg.reg)
 		wrmsrl(hwc->extra_reg.reg, hwc->extra_reg.config);
+
+	/*
+	 * Add enabled Merge event on next counter
+	 * if large increment event being enabled on this counter
+	 */
+	if (is_large_inc(hwc))
+		wrmsrl(hwc->config_base + 2, x86_pmu.perf_ctr_merge_en);
+
 	wrmsrl(hwc->config_base, (hwc->config | enable_mask) & ~disable_mask);
 }
 
@@ -855,6 +871,9 @@ static inline void x86_pmu_disable_event(struct perf_event *event)
 	struct hw_perf_event *hwc = &event->hw;
 
 	wrmsrl(hwc->config_base, hwc->config);
+
+	if (is_large_inc(hwc))
+		wrmsrl(hwc->config_base + 2, 0);
 }
 
 void x86_pmu_enable_event(struct perf_event *event);
-- 
2.24.0


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 2/2] perf/x86/amd: Add support for Large Increment per Cycle Events
  2019-11-14 18:37 ` [PATCH 2/2] perf/x86/amd: Add support for Large Increment per Cycle Events Kim Phillips
@ 2019-12-20 12:09   ` Peter Zijlstra
  2020-01-08 22:26     ` Kim Phillips
  0 siblings, 1 reply; 9+ messages in thread
From: Peter Zijlstra @ 2019-12-20 12:09 UTC (permalink / raw)
  To: Kim Phillips
  Cc: Ingo Molnar, linux-kernel, Janakarajan Natarajan,
	Suravee Suthikulpanit, Tom Lendacky, Stephane Eranian,
	Martin Liška, Arnaldo Carvalho de Melo, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, Thomas Gleixner, Borislav Petkov,
	H. Peter Anvin, x86

On Thu, Nov 14, 2019 at 12:37:20PM -0600, Kim Phillips wrote:

I still hate the naming on this, "large increment per cycle" is just a
random bunch of words collected by AMD marketing or somesuch. It doesn't
convey the fundamental point that counters get paired. So I've done a
giant bunch of search and replace on it for you.

> @@ -621,6 +622,8 @@ void x86_pmu_disable_all(void)
>  			continue;
>  		val &= ~ARCH_PERFMON_EVENTSEL_ENABLE;
>  		wrmsrl(x86_pmu_config_addr(idx), val);
> +		if (is_large_inc(hwc))
> +			wrmsrl(x86_pmu_config_addr(idx + 1), 0);
>  	}
>  }
>  

See, the above makes sense, it doesn't assume anything about where
config for idx and config for idx+1 are, and then here:

> @@ -855,6 +871,9 @@ static inline void x86_pmu_disable_event(struct perf_event *event)
>  	struct hw_perf_event *hwc = &event->hw;
>  
>  	wrmsrl(hwc->config_base, hwc->config);
> +
> +	if (is_large_inc(hwc))
> +		wrmsrl(hwc->config_base + 2, 0);
>  }

You hard-code the offset as being 2. I fixed that for you.

> @@ -849,14 +862,19 @@ int perf_assign_events(struct event_constraint **constraints, int n,
>  			int wmin, int wmax, int gpmax, int *assign)
>  {
>  	struct perf_sched sched;
> +	struct event_constraint *c;
>  
>  	perf_sched_init(&sched, constraints, n, wmin, wmax, gpmax);
>  
>  	do {
>  		if (!perf_sched_find_counter(&sched))
>  			break;	/* failed */
> -		if (assign)
> +		if (assign) {
>  			assign[sched.state.event] = sched.state.counter;
> +			c = constraints[sched.state.event];
> +			if (c->flags & PERF_X86_EVENT_LARGE_INC)
> +				sched.state.counter++;
> +		}
>  	} while (perf_sched_next_event(&sched));
>  
>  	return sched.state.unassigned;

I'm still confused by this bit. AFAICT it serves no purpose.
perf_sched_next_event() will reset sched->state.counter to 0 on every
pass anyway.

I've deleted it for you.

> @@ -926,10 +944,14 @@ int x86_schedule_events(struct cpu_hw_events *cpuc, int n, int *assign)
>  			break;
>  
>  		/* not already used */
> -		if (test_bit(hwc->idx, used_mask))
> +		if (test_bit(hwc->idx, used_mask) || (is_large_inc(hwc) &&
> +		    test_bit(hwc->idx + 1, used_mask)))
>  			break;
>  
>  		__set_bit(hwc->idx, used_mask);
> +		if (is_large_inc(hwc))
> +			__set_bit(hwc->idx + 1, used_mask);
> +
>  		if (assign)
>  			assign[i] = hwc->idx;
>  	}

This is just really sad.. fixed that too.

Can you verify the patches in:

  git://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git perf/amd

work?

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 2/2] perf/x86/amd: Add support for Large Increment per Cycle Events
  2019-12-20 12:09   ` Peter Zijlstra
@ 2020-01-08 22:26     ` Kim Phillips
  2020-01-10 15:09       ` Peter Zijlstra
  0 siblings, 1 reply; 9+ messages in thread
From: Kim Phillips @ 2020-01-08 22:26 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, linux-kernel, Janakarajan Natarajan,
	Suravee Suthikulpanit, Tom Lendacky, Stephane Eranian,
	Martin Liška, Arnaldo Carvalho de Melo, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, Thomas Gleixner, Borislav Petkov,
	H. Peter Anvin, x86

Hi Peter,

Happy New Year, and thanks for your review...

On 12/20/19 6:09 AM, Peter Zijlstra wrote:
> On Thu, Nov 14, 2019 at 12:37:20PM -0600, Kim Phillips wrote:
>> @@ -621,6 +622,8 @@ void x86_pmu_disable_all(void)
>>  			continue;
>>  		val &= ~ARCH_PERFMON_EVENTSEL_ENABLE;
>>  		wrmsrl(x86_pmu_config_addr(idx), val);
>> +		if (is_large_inc(hwc))
>> +			wrmsrl(x86_pmu_config_addr(idx + 1), 0);
>>  	}
>>  }
>>  
> 
> See, the above makes sense, it doesn't assume anything about where
> config for idx and config for idx+1 are, and then here:
> 
>> @@ -855,6 +871,9 @@ static inline void x86_pmu_disable_event(struct perf_event *event)
>>  	struct hw_perf_event *hwc = &event->hw;
>>  
>>  	wrmsrl(hwc->config_base, hwc->config);
>> +
>> +	if (is_large_inc(hwc))
>> +		wrmsrl(hwc->config_base + 2, 0);
>>  }
> 
> You hard-code the offset as being 2. I fixed that for you.

Sorry I missed that!  Thanks for catching and fixing it.

>> @@ -849,14 +862,19 @@ int perf_assign_events(struct event_constraint **constraints, int n,
>>  			int wmin, int wmax, int gpmax, int *assign)
>>  {
>>  	struct perf_sched sched;
>> +	struct event_constraint *c;
>>  
>>  	perf_sched_init(&sched, constraints, n, wmin, wmax, gpmax);
>>  
>>  	do {
>>  		if (!perf_sched_find_counter(&sched))
>>  			break;	/* failed */
>> -		if (assign)
>> +		if (assign) {
>>  			assign[sched.state.event] = sched.state.counter;
>> +			c = constraints[sched.state.event];
>> +			if (c->flags & PERF_X86_EVENT_LARGE_INC)
>> +				sched.state.counter++;
>> +		}
>>  	} while (perf_sched_next_event(&sched));
>>  
>>  	return sched.state.unassigned;
> 
> I'm still confused by this bit. AFAICT it serves no purpose.
> perf_sched_next_event() will reset sched->state.counter to 0 on every
> pass anyway.
> 
> I've deleted it for you.

OK, I had my doubts because perf_sched_next_event doesn't have a counter reset in the "next weight" exit case, which indeed might be a moot case, but then perf_sched_init, called in the beginning of perf_assign_events, doesn't clear the counter variable either..

>> @@ -926,10 +944,14 @@ int x86_schedule_events(struct cpu_hw_events *cpuc, int n, int *assign)
>>  			break;
>>  
>>  		/* not already used */
>> -		if (test_bit(hwc->idx, used_mask))
>> +		if (test_bit(hwc->idx, used_mask) || (is_large_inc(hwc) &&
>> +		    test_bit(hwc->idx + 1, used_mask)))
>>  			break;
>>  
>>  		__set_bit(hwc->idx, used_mask);
>> +		if (is_large_inc(hwc))
>> +			__set_bit(hwc->idx + 1, used_mask);
>> +
>>  		if (assign)
>>  			assign[i] = hwc->idx;
>>  	}
> 
> This is just really sad.. fixed that too.

[*]

> Can you verify the patches in:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git perf/amd
> 
> work?

I found a test that failed: perf stat on 6 instructions + 1 FLOPs events, multiplexing over 6 h/w counters along with the watchdog often returns exceptionally high and incorrect counts for a couple of the instructions events.  The FLOPs event counts ok.

My original patch, rebased onto the same code your perf/amd branch is, does not fail this test.

If I re-add my perf_assign_events changes you removed, the problem does not go away.

If I undo re-adding my perf_assign_events code, and re-add my "not already used" code that you removed - see [*] above - the problem DOES go away, and all the counts are all accurate.

One problem I see with your change in the "not already used" fastpath area, is that the new mask variable gets updated with position 'i' regardless of any previous Large Increment event assignments.  I.e., a successfully scheduled large increment event assignment may have already consumed that 'i' slot for its Merge event in a previous iteration of the loop.  So if the fastpath scheduler fails to assign that following event, the slow path is wrongly entered due to a wrong 'i' comparison with 'n', for example.

Btw, thanks for adding the comment pointing out perf_sched_restore_state is incompatible with Large Increment events.

I'd just as soon put my original "not already used" code back in, but if it's still unwanted, let me know how you'd like to proceed...

Kim

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 2/2] perf/x86/amd: Add support for Large Increment per Cycle Events
  2020-01-08 22:26     ` Kim Phillips
@ 2020-01-10 15:09       ` Peter Zijlstra
  2020-01-10 16:22         ` Kim Phillips
  0 siblings, 1 reply; 9+ messages in thread
From: Peter Zijlstra @ 2020-01-10 15:09 UTC (permalink / raw)
  To: Kim Phillips
  Cc: Ingo Molnar, linux-kernel, Janakarajan Natarajan,
	Suravee Suthikulpanit, Tom Lendacky, Stephane Eranian,
	Martin Liška, Arnaldo Carvalho de Melo, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, Thomas Gleixner, Borislav Petkov,
	H. Peter Anvin, x86

On Wed, Jan 08, 2020 at 04:26:47PM -0600, Kim Phillips wrote:
> On 12/20/19 6:09 AM, Peter Zijlstra wrote:
> > On Thu, Nov 14, 2019 at 12:37:20PM -0600, Kim Phillips wrote:

> >> @@ -926,10 +944,14 @@ int x86_schedule_events(struct cpu_hw_events *cpuc, int n, int *assign)
> >>  			break;
> >>  
> >>  		/* not already used */
> >> -		if (test_bit(hwc->idx, used_mask))
> >> +		if (test_bit(hwc->idx, used_mask) || (is_large_inc(hwc) &&
> >> +		    test_bit(hwc->idx + 1, used_mask)))
> >>  			break;
> >>  
> >>  		__set_bit(hwc->idx, used_mask);
> >> +		if (is_large_inc(hwc))
> >> +			__set_bit(hwc->idx + 1, used_mask);
> >> +
> >>  		if (assign)
> >>  			assign[i] = hwc->idx;
> >>  	}
> > 
> > This is just really sad.. fixed that too.
> 
> [*]

> If I undo re-adding my perf_assign_events code, and re-add my "not
> already used" code that you removed - see [*] above - the problem DOES
> go away, and all the counts are all accurate.
> 
> One problem I see with your change in the "not already used" fastpath
> area, is that the new mask variable gets updated with position 'i'
> regardless of any previous Large Increment event assignments.

Urgh, I completely messed that up. Find the below delta (I'll push out a
new version to queue.git as well).

> I.e., a
> successfully scheduled large increment event assignment may have
> already consumed that 'i' slot for its Merge event in a previous
> iteration of the loop.  So if the fastpath scheduler fails to assign
> that following event, the slow path is wrongly entered due to a wrong
> 'i' comparison with 'n', for example.

That should only be part of the story though; the fast-path is purely
optional. False-negatives on the fast path should not affect
functionality, only performance. False-positives on the fast path are a
no-no of course.

---
diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 222f172cbaf5..3bb738f5a472 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -937,7 +937,7 @@ int x86_schedule_events(struct cpu_hw_events *cpuc, int n, int *assign)
 	 * fastpath, try to reuse previous register
 	 */
 	for (i = 0; i < n; i++) {
-		u64 mask = BIT_ULL(i);
+		u64 mask;
 
 		hwc = &cpuc->event_list[i]->hw;
 		c = cpuc->event_constraint[i];
@@ -950,6 +950,7 @@ int x86_schedule_events(struct cpu_hw_events *cpuc, int n, int *assign)
 		if (!test_bit(hwc->idx, c->idxmsk))
 			break;
 
+		mask = BIT_ULL(hwc->idx);
 		if (is_counter_pair(hwc))
 			mask |= mask << 1;
 

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 2/2] perf/x86/amd: Add support for Large Increment per Cycle Events
  2020-01-10 15:09       ` Peter Zijlstra
@ 2020-01-10 16:22         ` Kim Phillips
  2020-01-10 17:38           ` Peter Zijlstra
  0 siblings, 1 reply; 9+ messages in thread
From: Kim Phillips @ 2020-01-10 16:22 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, linux-kernel, Janakarajan Natarajan,
	Suravee Suthikulpanit, Tom Lendacky, Stephane Eranian,
	Martin Liška, Arnaldo Carvalho de Melo, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, Thomas Gleixner, Borislav Petkov,
	H. Peter Anvin, x86

On 1/10/20 9:09 AM, Peter Zijlstra wrote:
> On Wed, Jan 08, 2020 at 04:26:47PM -0600, Kim Phillips wrote:
>> On 12/20/19 6:09 AM, Peter Zijlstra wrote:
>>> On Thu, Nov 14, 2019 at 12:37:20PM -0600, Kim Phillips wrote:
>> One problem I see with your change in the "not already used" fastpath
>> area, is that the new mask variable gets updated with position 'i'
>> regardless of any previous Large Increment event assignments.
> 
> Urgh, I completely messed that up. Find the below delta (I'll push out a
> new version to queue.git as well).

OK, I tested what you pushed on your perf/amd branch, and it passes all my tests.

BTW, a large part of the commit message went missing, hopefully it'll be brought back before being pushed further upstream?

Thank you,

Kim

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 2/2] perf/x86/amd: Add support for Large Increment per Cycle Events
  2020-01-10 16:22         ` Kim Phillips
@ 2020-01-10 17:38           ` Peter Zijlstra
  0 siblings, 0 replies; 9+ messages in thread
From: Peter Zijlstra @ 2020-01-10 17:38 UTC (permalink / raw)
  To: Kim Phillips
  Cc: Ingo Molnar, linux-kernel, Janakarajan Natarajan,
	Suravee Suthikulpanit, Tom Lendacky, Stephane Eranian,
	Martin Liška, Arnaldo Carvalho de Melo, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, Thomas Gleixner, Borislav Petkov,
	H. Peter Anvin, x86

On Fri, Jan 10, 2020 at 10:22:10AM -0600, Kim Phillips wrote:
> On 1/10/20 9:09 AM, Peter Zijlstra wrote:
> > On Wed, Jan 08, 2020 at 04:26:47PM -0600, Kim Phillips wrote:
> >> On 12/20/19 6:09 AM, Peter Zijlstra wrote:
> >>> On Thu, Nov 14, 2019 at 12:37:20PM -0600, Kim Phillips wrote:
> >> One problem I see with your change in the "not already used" fastpath
> >> area, is that the new mask variable gets updated with position 'i'
> >> regardless of any previous Large Increment event assignments.
> > 
> > Urgh, I completely messed that up. Find the below delta (I'll push out a
> > new version to queue.git as well).
> 
> OK, I tested what you pushed on your perf/amd branch, and it passes all my tests.

Excellent!

> BTW, a large part of the commit message went missing, hopefully it'll be brought back before being pushed further upstream?

Argh.. it's those ---- lines, my script things they're cuts. I'll go
fix it up.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [tip: perf/core] perf/x86/amd: Constrain Large Increment per Cycle events
  2019-11-14 18:37 ` [PATCH 1/2] perf/x86/amd: Constrain Large Increment per Cycle events Kim Phillips
@ 2020-01-17 10:08   ` tip-bot2 for Kim Phillips
  0 siblings, 0 replies; 9+ messages in thread
From: tip-bot2 for Kim Phillips @ 2020-01-17 10:08 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Kim Phillips, Peter Zijlstra (Intel), x86, LKML

The following commit has been merged into the perf/core branch of tip:

Commit-ID:     471af006a747f1c535c8a8c6c0973c320fe01b22
Gitweb:        https://git.kernel.org/tip/471af006a747f1c535c8a8c6c0973c320fe01b22
Author:        Kim Phillips <kim.phillips@amd.com>
AuthorDate:    Thu, 14 Nov 2019 12:37:19 -06:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Fri, 17 Jan 2020 10:19:26 +01:00

perf/x86/amd: Constrain Large Increment per Cycle events

AMD Family 17h processors and above gain support for Large Increment
per Cycle events.  Unfortunately there is no CPUID or equivalent bit
that indicates whether the feature exists or not, so we continue to
determine eligibility based on a CPU family number comparison.

For Large Increment per Cycle events, we add a f17h-and-compatibles
get_event_constraints_f17h() that returns an even counter bitmask:
Large Increment per Cycle events can only be placed on PMCs 0, 2,
and 4 out of the currently available 0-5.  The only currently
public event that requires this feature to report valid counts
is PMCx003 "Retired SSE/AVX Operations".

Note that the CPU family logic in amd_core_pmu_init() is changed
so as to be able to selectively add initialization for features
available in ranges of backward-compatible CPU families.  This
Large Increment per Cycle feature is expected to be retained
in future families.

A side-effect of assigning a new get_constraints function for f17h
disables calling the old (prior to f15h) amd_get_event_constraints
implementation left enabled by commit e40ed1542dd7 ("perf/x86: Add perf
support for AMD family-17h processors"), which is no longer
necessary since those North Bridge event codes are obsoleted.

Also fix a spelling mistake whilst in the area (calulating ->
calculating).

Fixes: e40ed1542dd7 ("perf/x86: Add perf support for AMD family-17h processors")
Signed-off-by: Kim Phillips <kim.phillips@amd.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20191114183720.19887-2-kim.phillips@amd.com
---
 arch/x86/events/amd/core.c   | 91 +++++++++++++++++++++++------------
 arch/x86/events/perf_event.h |  2 +-
 2 files changed, 63 insertions(+), 30 deletions(-)

diff --git a/arch/x86/events/amd/core.c b/arch/x86/events/amd/core.c
index a7752cd..571168f 100644
--- a/arch/x86/events/amd/core.c
+++ b/arch/x86/events/amd/core.c
@@ -301,6 +301,25 @@ static inline int amd_pmu_addr_offset(int index, bool eventsel)
 	return offset;
 }
 
+/*
+ * AMD64 events are detected based on their event codes.
+ */
+static inline unsigned int amd_get_event_code(struct hw_perf_event *hwc)
+{
+	return ((hwc->config >> 24) & 0x0f00) | (hwc->config & 0x00ff);
+}
+
+static inline bool amd_is_pair_event_code(struct hw_perf_event *hwc)
+{
+	if (!(x86_pmu.flags & PMU_FL_PAIR))
+		return false;
+
+	switch (amd_get_event_code(hwc)) {
+	case 0x003:	return true;	/* Retired SSE/AVX FLOPs */
+	default:	return false;
+	}
+}
+
 static int amd_core_hw_config(struct perf_event *event)
 {
 	if (event->attr.exclude_host && event->attr.exclude_guest)
@@ -319,14 +338,6 @@ static int amd_core_hw_config(struct perf_event *event)
 	return 0;
 }
 
-/*
- * AMD64 events are detected based on their event codes.
- */
-static inline unsigned int amd_get_event_code(struct hw_perf_event *hwc)
-{
-	return ((hwc->config >> 24) & 0x0f00) | (hwc->config & 0x00ff);
-}
-
 static inline int amd_is_nb_event(struct hw_perf_event *hwc)
 {
 	return (hwc->config & 0xe0) == 0xe0;
@@ -855,6 +866,20 @@ amd_get_event_constraints_f15h(struct cpu_hw_events *cpuc, int idx,
 	}
 }
 
+static struct event_constraint pair_constraint;
+
+static struct event_constraint *
+amd_get_event_constraints_f17h(struct cpu_hw_events *cpuc, int idx,
+			       struct perf_event *event)
+{
+	struct hw_perf_event *hwc = &event->hw;
+
+	if (amd_is_pair_event_code(hwc))
+		return &pair_constraint;
+
+	return &unconstrained;
+}
+
 static ssize_t amd_event_sysfs_show(char *page, u64 config)
 {
 	u64 event = (config & ARCH_PERFMON_EVENTSEL_EVENT) |
@@ -898,33 +923,15 @@ static __initconst const struct x86_pmu amd_pmu = {
 
 static int __init amd_core_pmu_init(void)
 {
+	u64 even_ctr_mask = 0ULL;
+	int i;
+
 	if (!boot_cpu_has(X86_FEATURE_PERFCTR_CORE))
 		return 0;
 
-	/* Avoid calulating the value each time in the NMI handler */
+	/* Avoid calculating the value each time in the NMI handler */
 	perf_nmi_window = msecs_to_jiffies(100);
 
-	switch (boot_cpu_data.x86) {
-	case 0x15:
-		pr_cont("Fam15h ");
-		x86_pmu.get_event_constraints = amd_get_event_constraints_f15h;
-		break;
-	case 0x17:
-		pr_cont("Fam17h ");
-		/*
-		 * In family 17h, there are no event constraints in the PMC hardware.
-		 * We fallback to using default amd_get_event_constraints.
-		 */
-		break;
-	case 0x18:
-		pr_cont("Fam18h ");
-		/* Using default amd_get_event_constraints. */
-		break;
-	default:
-		pr_err("core perfctr but no constraints; unknown hardware!\n");
-		return -ENODEV;
-	}
-
 	/*
 	 * If core performance counter extensions exists, we must use
 	 * MSR_F15H_PERF_CTL/MSR_F15H_PERF_CTR msrs. See also
@@ -939,6 +946,30 @@ static int __init amd_core_pmu_init(void)
 	 */
 	x86_pmu.amd_nb_constraints = 0;
 
+	if (boot_cpu_data.x86 == 0x15) {
+		pr_cont("Fam15h ");
+		x86_pmu.get_event_constraints = amd_get_event_constraints_f15h;
+	}
+	if (boot_cpu_data.x86 >= 0x17) {
+		pr_cont("Fam17h+ ");
+		/*
+		 * Family 17h and compatibles have constraints for Large
+		 * Increment per Cycle events: they may only be assigned an
+		 * even numbered counter that has a consecutive adjacent odd
+		 * numbered counter following it.
+		 */
+		for (i = 0; i < x86_pmu.num_counters - 1; i += 2)
+			even_ctr_mask |= 1 << i;
+
+		pair_constraint = (struct event_constraint)
+				    __EVENT_CONSTRAINT(0, even_ctr_mask, 0,
+				    x86_pmu.num_counters / 2, 0,
+				    PERF_X86_EVENT_PAIR);
+
+		x86_pmu.get_event_constraints = amd_get_event_constraints_f17h;
+		x86_pmu.flags |= PMU_FL_PAIR;
+	}
+
 	pr_cont("core perfctr, ");
 	return 0;
 }
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index 930611d..e2fd363 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -77,6 +77,7 @@ static inline bool constraint_match(struct event_constraint *c, u64 ecode)
 #define PERF_X86_EVENT_AUTO_RELOAD	0x0200 /* use PEBS auto-reload */
 #define PERF_X86_EVENT_LARGE_PEBS	0x0400 /* use large PEBS */
 #define PERF_X86_EVENT_PEBS_VIA_PT	0x0800 /* use PT buffer for PEBS */
+#define PERF_X86_EVENT_PAIR		0x1000 /* Large Increment per Cycle */
 
 struct amd_nb {
 	int nb_id;  /* NorthBridge id */
@@ -743,6 +744,7 @@ do {									\
 #define PMU_FL_EXCL_ENABLED	0x8 /* exclusive counter active */
 #define PMU_FL_PEBS_ALL		0x10 /* all events are valid PEBS events */
 #define PMU_FL_TFA		0x20 /* deal with TSX force abort */
+#define PMU_FL_PAIR		0x40 /* merge counters for large incr. events */
 
 #define EVENT_VAR(_id)  event_attr_##_id
 #define EVENT_PTR(_id) &event_attr_##_id.attr.attr

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2020-01-17 10:10 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-11-14 18:37 [PATCH 0/2] perf/x86/amd: Add support for Large Increment per Cycle Events Kim Phillips
2019-11-14 18:37 ` [PATCH 1/2] perf/x86/amd: Constrain Large Increment per Cycle events Kim Phillips
2020-01-17 10:08   ` [tip: perf/core] " tip-bot2 for Kim Phillips
2019-11-14 18:37 ` [PATCH 2/2] perf/x86/amd: Add support for Large Increment per Cycle Events Kim Phillips
2019-12-20 12:09   ` Peter Zijlstra
2020-01-08 22:26     ` Kim Phillips
2020-01-10 15:09       ` Peter Zijlstra
2020-01-10 16:22         ` Kim Phillips
2020-01-10 17:38           ` Peter Zijlstra

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.