linux-kernel.vger.kernel.org archive mirror
* [PATCH 0/3] perf/x86/amd/uncore: Update AMD uncore driver
@ 2017-01-11 16:02 Janakarajan Natarajan
  2017-01-11 16:02 ` [PATCH 1/3] perf/x86/amd/uncore: Update AMD uncore to rename L2 to LLC Janakarajan Natarajan
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Janakarajan Natarajan @ 2017-01-11 16:02 UTC (permalink / raw)
  To: linux-kernel
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Alexander Shishkin, Suravee Suthikulpanit, Janakarajan Natarajan

This patchset updates the AMD uncore driver to support AMD Family17h processors.

First, the current L2 counters are renamed to last level cache (LLC) counters,
since the last level is either L2 or L3 depending on the family. The counters
are then allocated dynamically, because their number differs between Family16h
and Family17h.

Also, in Family17h the NorthBridge counters are re-purposed as Data Fabric (DF)
counters. The patchset therefore exports the PMU names according to the
processor family.

Janakarajan Natarajan (3):
  perf/x86/amd/uncore: Update AMD uncore to rename L2 to LLC
  perf/x86/amd/uncore: Dynamically allocate uncore counters
  perf/x86/amd/uncore: Update sysfs attributes for Family17h processors

 arch/x86/events/amd/uncore.c | 292 +++++++++++++++++++++++++++++--------------
 1 file changed, 197 insertions(+), 95 deletions(-)

-- 
2.7.4


* [PATCH 1/3] perf/x86/amd/uncore: Update AMD uncore to rename L2 to LLC
  2017-01-11 16:02 [PATCH 0/3] perf/x86/amd/uncore: Update AMD uncore driver Janakarajan Natarajan
@ 2017-01-11 16:02 ` Janakarajan Natarajan
  2017-01-11 16:02 ` [PATCH 2/3] perf/x86/amd/uncore: Dynamically allocate uncore counters Janakarajan Natarajan
  2017-01-11 16:02 ` [PATCH 3/3] perf/x86/amd/uncore: Update sysfs attributes for Family17h processors Janakarajan Natarajan
  2 siblings, 0 replies; 9+ messages in thread
From: Janakarajan Natarajan @ 2017-01-11 16:02 UTC (permalink / raw)
  To: linux-kernel
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Alexander Shishkin, Suravee Suthikulpanit, Janakarajan Natarajan

This patch renames the L2 counters to LLC counters, since AMD Family17h
processors support an L3 cache counter.

As older families have at most L2 counters, last level cache (LLC)
refers to either L2 or L3 depending on the family.

Signed-off-by: Janakarajan Natarajan <Janakarajan.Natarajan@amd.com>
---
 arch/x86/events/amd/uncore.c | 98 ++++++++++++++++++++++----------------------
 1 file changed, 49 insertions(+), 49 deletions(-)

diff --git a/arch/x86/events/amd/uncore.c b/arch/x86/events/amd/uncore.c
index a0b1bdb..24c8537 100644
--- a/arch/x86/events/amd/uncore.c
+++ b/arch/x86/events/amd/uncore.c
@@ -25,7 +25,7 @@
 #define MAX_COUNTERS		NUM_COUNTERS_NB
 
 #define RDPMC_BASE_NB		6
-#define RDPMC_BASE_L2		10
+#define RDPMC_BASE_LLC		10
 
 #define COUNTER_SHIFT		16
 
@@ -45,30 +45,30 @@ struct amd_uncore {
 };
 
 static struct amd_uncore * __percpu *amd_uncore_nb;
-static struct amd_uncore * __percpu *amd_uncore_l2;
+static struct amd_uncore * __percpu *amd_uncore_llc;
 
 static struct pmu amd_nb_pmu;
-static struct pmu amd_l2_pmu;
+static struct pmu amd_llc_pmu;
 
 static cpumask_t amd_nb_active_mask;
-static cpumask_t amd_l2_active_mask;
+static cpumask_t amd_llc_active_mask;
 
 static bool is_nb_event(struct perf_event *event)
 {
 	return event->pmu->type == amd_nb_pmu.type;
 }
 
-static bool is_l2_event(struct perf_event *event)
+static bool is_llc_event(struct perf_event *event)
 {
-	return event->pmu->type == amd_l2_pmu.type;
+	return event->pmu->type == amd_llc_pmu.type;
 }
 
 static struct amd_uncore *event_to_amd_uncore(struct perf_event *event)
 {
 	if (is_nb_event(event) && amd_uncore_nb)
 		return *per_cpu_ptr(amd_uncore_nb, event->cpu);
-	else if (is_l2_event(event) && amd_uncore_l2)
-		return *per_cpu_ptr(amd_uncore_l2, event->cpu);
+	else if (is_llc_event(event) && amd_uncore_llc)
+		return *per_cpu_ptr(amd_uncore_llc, event->cpu);
 
 	return NULL;
 }
@@ -183,16 +183,16 @@ static int amd_uncore_event_init(struct perf_event *event)
 		return -ENOENT;
 
 	/*
-	 * NB and L2 counters (MSRs) are shared across all cores that share the
-	 * same NB / L2 cache. Interrupts can be directed to a single target
-	 * core, however, event counts generated by processes running on other
-	 * cores cannot be masked out. So we do not support sampling and
-	 * per-thread events.
+	 * NB and Last level cache counters (MSRs) are shared across all cores
+	 * that share the same NB / Last level cache. Interrupts can be directed
+	 * to a single target core, however, event counts generated by processes
+	 * running on other cores cannot be masked out. So we do not support
+	 * sampling and per-thread events.
 	 */
 	if (is_sampling_event(event) || event->attach_state & PERF_ATTACH_TASK)
 		return -EINVAL;
 
-	/* NB and L2 counters do not have usr/os/guest/host bits */
+	/* NB and Last level cache counters do not have usr/os/guest/host bits */
 	if (event->attr.exclude_user || event->attr.exclude_kernel ||
 	    event->attr.exclude_host || event->attr.exclude_guest)
 		return -EINVAL;
@@ -226,8 +226,8 @@ static ssize_t amd_uncore_attr_show_cpumask(struct device *dev,
 
 	if (pmu->type == amd_nb_pmu.type)
 		active_mask = &amd_nb_active_mask;
-	else if (pmu->type == amd_l2_pmu.type)
-		active_mask = &amd_l2_active_mask;
+	else if (pmu->type == amd_llc_pmu.type)
+		active_mask = &amd_llc_active_mask;
 	else
 		return 0;
 
@@ -276,7 +276,7 @@ static struct pmu amd_nb_pmu = {
 	.read		= amd_uncore_read,
 };
 
-static struct pmu amd_l2_pmu = {
+static struct pmu amd_llc_pmu = {
 	.task_ctx_nr	= perf_invalid_context,
 	.attr_groups	= amd_uncore_attr_groups,
 	.name		= "amd_l2",
@@ -296,7 +296,7 @@ static struct amd_uncore *amd_uncore_alloc(unsigned int cpu)
 
 static int amd_uncore_cpu_up_prepare(unsigned int cpu)
 {
-	struct amd_uncore *uncore_nb = NULL, *uncore_l2;
+	struct amd_uncore *uncore_nb = NULL, *uncore_llc;
 
 	if (amd_uncore_nb) {
 		uncore_nb = amd_uncore_alloc(cpu);
@@ -312,18 +312,18 @@ static int amd_uncore_cpu_up_prepare(unsigned int cpu)
 		*per_cpu_ptr(amd_uncore_nb, cpu) = uncore_nb;
 	}
 
-	if (amd_uncore_l2) {
-		uncore_l2 = amd_uncore_alloc(cpu);
-		if (!uncore_l2)
+	if (amd_uncore_llc) {
+		uncore_llc = amd_uncore_alloc(cpu);
+		if (!uncore_llc)
 			goto fail;
-		uncore_l2->cpu = cpu;
-		uncore_l2->num_counters = NUM_COUNTERS_L2;
-		uncore_l2->rdpmc_base = RDPMC_BASE_L2;
-		uncore_l2->msr_base = MSR_F16H_L2I_PERF_CTL;
-		uncore_l2->active_mask = &amd_l2_active_mask;
-		uncore_l2->pmu = &amd_l2_pmu;
-		uncore_l2->id = -1;
-		*per_cpu_ptr(amd_uncore_l2, cpu) = uncore_l2;
+		uncore_llc->cpu = cpu;
+		uncore_llc->num_counters = NUM_COUNTERS_L2;
+		uncore_llc->rdpmc_base = RDPMC_BASE_LLC;
+		uncore_llc->msr_base = MSR_F16H_L2I_PERF_CTL;
+		uncore_llc->active_mask = &amd_llc_active_mask;
+		uncore_llc->pmu = &amd_llc_pmu;
+		uncore_llc->id = -1;
+		*per_cpu_ptr(amd_uncore_llc, cpu) = uncore_llc;
 	}
 
 	return 0;
@@ -376,17 +376,17 @@ static int amd_uncore_cpu_starting(unsigned int cpu)
 		*per_cpu_ptr(amd_uncore_nb, cpu) = uncore;
 	}
 
-	if (amd_uncore_l2) {
+	if (amd_uncore_llc) {
 		unsigned int apicid = cpu_data(cpu).apicid;
 		unsigned int nshared;
 
-		uncore = *per_cpu_ptr(amd_uncore_l2, cpu);
+		uncore = *per_cpu_ptr(amd_uncore_llc, cpu);
 		cpuid_count(0x8000001d, 2, &eax, &ebx, &ecx, &edx);
 		nshared = ((eax >> 14) & 0xfff) + 1;
 		uncore->id = apicid - (apicid % nshared);
 
-		uncore = amd_uncore_find_online_sibling(uncore, amd_uncore_l2);
-		*per_cpu_ptr(amd_uncore_l2, cpu) = uncore;
+		uncore = amd_uncore_find_online_sibling(uncore, amd_uncore_llc);
+		*per_cpu_ptr(amd_uncore_llc, cpu) = uncore;
 	}
 
 	return 0;
@@ -419,8 +419,8 @@ static int amd_uncore_cpu_online(unsigned int cpu)
 	if (amd_uncore_nb)
 		uncore_online(cpu, amd_uncore_nb);
 
-	if (amd_uncore_l2)
-		uncore_online(cpu, amd_uncore_l2);
+	if (amd_uncore_llc)
+		uncore_online(cpu, amd_uncore_llc);
 
 	return 0;
 }
@@ -456,8 +456,8 @@ static int amd_uncore_cpu_down_prepare(unsigned int cpu)
 	if (amd_uncore_nb)
 		uncore_down_prepare(cpu, amd_uncore_nb);
 
-	if (amd_uncore_l2)
-		uncore_down_prepare(cpu, amd_uncore_l2);
+	if (amd_uncore_llc)
+		uncore_down_prepare(cpu, amd_uncore_llc);
 
 	return 0;
 }
@@ -479,8 +479,8 @@ static int amd_uncore_cpu_dead(unsigned int cpu)
 	if (amd_uncore_nb)
 		uncore_dead(cpu, amd_uncore_nb);
 
-	if (amd_uncore_l2)
-		uncore_dead(cpu, amd_uncore_l2);
+	if (amd_uncore_llc)
+		uncore_dead(cpu, amd_uncore_llc);
 
 	return 0;
 }
@@ -510,16 +510,16 @@ static int __init amd_uncore_init(void)
 	}
 
 	if (boot_cpu_has(X86_FEATURE_PERFCTR_L2)) {
-		amd_uncore_l2 = alloc_percpu(struct amd_uncore *);
-		if (!amd_uncore_l2) {
+		amd_uncore_llc = alloc_percpu(struct amd_uncore *);
+		if (!amd_uncore_llc) {
 			ret = -ENOMEM;
-			goto fail_l2;
+			goto fail_llc;
 		}
-		ret = perf_pmu_register(&amd_l2_pmu, amd_l2_pmu.name, -1);
+		ret = perf_pmu_register(&amd_llc_pmu, amd_llc_pmu.name, -1);
 		if (ret)
-			goto fail_l2;
+			goto fail_llc;
 
-		pr_info("perf: AMD L2I counters detected\n");
+		pr_info("perf: AMD LLC counters detected\n");
 		ret = 0;
 	}
 
@@ -529,7 +529,7 @@ static int __init amd_uncore_init(void)
 	if (cpuhp_setup_state(CPUHP_PERF_X86_AMD_UNCORE_PREP,
 			      "perf/x86/amd/uncore:prepare",
 			      amd_uncore_cpu_up_prepare, amd_uncore_cpu_dead))
-		goto fail_l2;
+		goto fail_llc;
 
 	if (cpuhp_setup_state(CPUHP_AP_PERF_X86_AMD_UNCORE_STARTING,
 			      "perf/x86/amd/uncore:starting",
@@ -546,11 +546,11 @@ static int __init amd_uncore_init(void)
 	cpuhp_remove_state(CPUHP_AP_PERF_X86_AMD_UNCORE_STARTING);
 fail_prep:
 	cpuhp_remove_state(CPUHP_PERF_X86_AMD_UNCORE_PREP);
-fail_l2:
+fail_llc:
 	if (boot_cpu_has(X86_FEATURE_PERFCTR_NB))
 		perf_pmu_unregister(&amd_nb_pmu);
-	if (amd_uncore_l2)
-		free_percpu(amd_uncore_l2);
+	if (amd_uncore_llc)
+		free_percpu(amd_uncore_llc);
 fail_nb:
 	if (amd_uncore_nb)
 		free_percpu(amd_uncore_nb);
-- 
2.7.4
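The LLC id computed in amd_uncore_cpu_starting() above groups CPUs whose APIC
ids fall within the same block of `nshared` ids, so every core behind one
cache ends up with the same id. A standalone userspace sketch of just that
calculation (the helper name is hypothetical):

```c
#include <stdint.h>

/*
 * CPUID leaf 0x8000001d, ECX=2 reports in EAX[25:14] the number of
 * logical CPUs sharing the cache, minus one.  The patch derives a
 * per-cache id by rounding the APIC id down to a multiple of that
 * count, so all CPUs behind one cache get the same id.
 */
static unsigned int llc_id(unsigned int apicid, uint32_t eax)
{
	unsigned int nshared = ((eax >> 14) & 0xfff) + 1;

	return apicid - (apicid % nshared);
}
```

With four CPUs per cache (EAX[25:14] = 3), APIC ids 4-7 all map to id 4,
ids 8-11 to id 8, and so on.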


* [PATCH 2/3] perf/x86/amd/uncore: Dynamically allocate uncore counters
  2017-01-11 16:02 [PATCH 0/3] perf/x86/amd/uncore: Update AMD uncore driver Janakarajan Natarajan
  2017-01-11 16:02 ` [PATCH 1/3] perf/x86/amd/uncore: Update AMD uncore to rename L2 to LLC Janakarajan Natarajan
@ 2017-01-11 16:02 ` Janakarajan Natarajan
  2017-01-12  9:20   ` Peter Zijlstra
                     ` (2 more replies)
  2017-01-11 16:02 ` [PATCH 3/3] perf/x86/amd/uncore: Update sysfs attributes for Family17h processors Janakarajan Natarajan
  2 siblings, 3 replies; 9+ messages in thread
From: Janakarajan Natarajan @ 2017-01-11 16:02 UTC (permalink / raw)
  To: linux-kernel
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Alexander Shishkin, Suravee Suthikulpanit, Janakarajan Natarajan

This patch updates the AMD uncore driver to support AMD Family17h
processors. In Family17h, there are two extra last level cache counters.
The counters are, therefore, allocated dynamically based on the family.

The cpu hotplug up callback function is refactored to better manage
failure conditions.
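The single-allocation pattern the patch introduces in amd_uncore_alloc() --
one kzalloc_node() covering the struct plus a trailing array of event
pointers -- can be sketched in userspace roughly as follows (calloc() stands
in for kzalloc_node(), and the struct is trimmed to the fields that matter
here):

```c
#include <stdlib.h>

struct event;			/* stand-in for struct perf_event */

struct uncore {
	int num_counters;
	struct event **events;	/* points into the same allocation */
};

/*
 * One zeroed allocation holds the uncore struct followed by
 * 'counters' event-pointer slots; a single free() releases both.
 */
static struct uncore *uncore_alloc(unsigned int counters)
{
	void *mem = calloc(1, sizeof(struct uncore) +
			      sizeof(struct event *) * counters);
	struct uncore *u = mem;

	if (!mem)
		return NULL;
	u->num_counters = counters;
	u->events = (struct event **)((char *)mem + sizeof(struct uncore));
	return u;
}
```

The trade-off is one allocation and one free per uncore instance, at the
cost of an extra pointer indirection on every events[] access.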

Signed-off-by: Janakarajan Natarajan <Janakarajan.Natarajan@amd.com>
---
 arch/x86/events/amd/uncore.c | 141 +++++++++++++++++++++++++++++++------------
 1 file changed, 104 insertions(+), 37 deletions(-)

diff --git a/arch/x86/events/amd/uncore.c b/arch/x86/events/amd/uncore.c
index 24c8537..7ab92f7 100644
--- a/arch/x86/events/amd/uncore.c
+++ b/arch/x86/events/amd/uncore.c
@@ -22,13 +22,16 @@
 
 #define NUM_COUNTERS_NB		4
 #define NUM_COUNTERS_L2		4
-#define MAX_COUNTERS		NUM_COUNTERS_NB
+#define NUM_COUNTERS_L3		6
 
 #define RDPMC_BASE_NB		6
 #define RDPMC_BASE_LLC		10
 
 #define COUNTER_SHIFT		16
 
+static int num_counters_llc;
+static int num_counters_nb;
+
 static HLIST_HEAD(uncore_unused_list);
 
 struct amd_uncore {
@@ -40,7 +43,7 @@ struct amd_uncore {
 	u32 msr_base;
 	cpumask_t *active_mask;
 	struct pmu *pmu;
-	struct perf_event *events[MAX_COUNTERS];
+	struct perf_event **events;
 	struct hlist_node node;
 };
 
@@ -123,6 +126,9 @@ static int amd_uncore_add(struct perf_event *event, int flags)
 	struct amd_uncore *uncore = event_to_amd_uncore(event);
 	struct hw_perf_event *hwc = &event->hw;
 
+	if (hwc->idx >= uncore->num_counters)
+		return -EINVAL;
+
 	/* are we already assigned? */
 	if (hwc->idx != -1 && uncore->events[hwc->idx] == event)
 		goto out;
@@ -288,51 +294,91 @@ static struct pmu amd_llc_pmu = {
 	.read		= amd_uncore_read,
 };
 
-static struct amd_uncore *amd_uncore_alloc(unsigned int cpu)
+static struct amd_uncore *amd_uncore_alloc(unsigned int cpu,
+					unsigned int counters)
 {
-	return kzalloc_node(sizeof(struct amd_uncore), GFP_KERNEL,
+	void *mem;
+	struct amd_uncore *uncore;
+
+	mem = kzalloc_node(sizeof(struct amd_uncore) +
+			sizeof(struct perf_event*) * counters, GFP_KERNEL,
 			cpu_to_node(cpu));
+	if (!mem)
+		return NULL;
+
+	uncore = mem;
+	uncore->events = mem + sizeof(struct amd_uncore);
+
+	return uncore;
 }
 
-static int amd_uncore_cpu_up_prepare(unsigned int cpu)
+static int amd_uncore_cpu_up_nb_prepare(unsigned int cpu)
 {
-	struct amd_uncore *uncore_nb = NULL, *uncore_llc;
+	struct amd_uncore *uncore_nb;
 
-	if (amd_uncore_nb) {
-		uncore_nb = amd_uncore_alloc(cpu);
-		if (!uncore_nb)
-			goto fail;
-		uncore_nb->cpu = cpu;
-		uncore_nb->num_counters = NUM_COUNTERS_NB;
-		uncore_nb->rdpmc_base = RDPMC_BASE_NB;
-		uncore_nb->msr_base = MSR_F15H_NB_PERF_CTL;
-		uncore_nb->active_mask = &amd_nb_active_mask;
-		uncore_nb->pmu = &amd_nb_pmu;
-		uncore_nb->id = -1;
-		*per_cpu_ptr(amd_uncore_nb, cpu) = uncore_nb;
-	}
+	if (!amd_uncore_nb)
+		return 0;
 
-	if (amd_uncore_llc) {
-		uncore_llc = amd_uncore_alloc(cpu);
-		if (!uncore_llc)
-			goto fail;
-		uncore_llc->cpu = cpu;
-		uncore_llc->num_counters = NUM_COUNTERS_L2;
-		uncore_llc->rdpmc_base = RDPMC_BASE_LLC;
-		uncore_llc->msr_base = MSR_F16H_L2I_PERF_CTL;
-		uncore_llc->active_mask = &amd_llc_active_mask;
-		uncore_llc->pmu = &amd_llc_pmu;
-		uncore_llc->id = -1;
-		*per_cpu_ptr(amd_uncore_llc, cpu) = uncore_llc;
-	}
+	uncore_nb = amd_uncore_alloc(cpu, num_counters_nb);
+	if (!uncore_nb)
+		return -ENOMEM;
+	uncore_nb->cpu = cpu;
+	uncore_nb->num_counters = num_counters_nb;
+	uncore_nb->rdpmc_base = RDPMC_BASE_NB;
+	uncore_nb->msr_base = MSR_F15H_NB_PERF_CTL;
+	uncore_nb->active_mask = &amd_nb_active_mask;
+	uncore_nb->pmu = &amd_nb_pmu;
+	uncore_nb->id = -1;
+	*per_cpu_ptr(amd_uncore_nb, cpu) = uncore_nb;
 
 	return 0;
+}
 
-fail:
-	if (amd_uncore_nb)
-		*per_cpu_ptr(amd_uncore_nb, cpu) = NULL;
-	kfree(uncore_nb);
-	return -ENOMEM;
+static int amd_uncore_cpu_up_llc_prepare(unsigned int cpu)
+{
+	struct amd_uncore *uncore_llc;
+
+	if (!amd_uncore_llc)
+		return 0;
+
+	uncore_llc = amd_uncore_alloc(cpu, num_counters_llc);
+	if (!uncore_llc)
+		return -ENOMEM;
+	uncore_llc->cpu = cpu;
+	uncore_llc->num_counters = num_counters_llc;
+	uncore_llc->rdpmc_base = RDPMC_BASE_LLC;
+	uncore_llc->msr_base = MSR_F16H_L2I_PERF_CTL;
+	uncore_llc->active_mask = &amd_llc_active_mask;
+	uncore_llc->pmu = &amd_llc_pmu;
+	uncore_llc->id = -1;
+	*per_cpu_ptr(amd_uncore_llc, cpu) = uncore_llc;
+
+	return 0;
+}
+
+static void amd_uncore_cpu_up_nb_free(unsigned int cpu)
+{
+	struct amd_uncore *uncore_nb;
+
+	uncore_nb = *per_cpu_ptr(amd_uncore_nb, cpu);
+	if (uncore_nb)
+		kfree(uncore_nb);
+	*per_cpu_ptr(amd_uncore_nb, cpu) = NULL;
+}
+
+static int amd_uncore_cpu_up_prepare(unsigned int cpu)
+{
+	int ret;
+
+	ret = amd_uncore_cpu_up_nb_prepare(cpu);
+	if (ret)
+		return ret;
+	ret = amd_uncore_cpu_up_llc_prepare(cpu);
+	if (ret) {
+		amd_uncore_cpu_up_nb_free(cpu);
+		return ret;
+	}
+	return 0;
 }
 
 static struct amd_uncore *
@@ -492,6 +538,27 @@ static int __init amd_uncore_init(void)
 	if (boot_cpu_data.x86_vendor != X86_VENDOR_AMD)
 		goto fail_nodev;
 
+	switch (boot_cpu_data.x86) {
+		case 23:
+			/* Family 17h */
+			num_counters_nb = NUM_COUNTERS_NB;
+			num_counters_llc = NUM_COUNTERS_L3;
+			break;
+		case 22:
+			/* Family 16h. May change */
+			num_counters_nb = NUM_COUNTERS_NB;
+			num_counters_llc = NUM_COUNTERS_L2;
+			break;
+		default:
+			/*
+			 * All prior families have the same number of
+			 * Northbridge and Last level cache counters
+			 */
+			num_counters_nb = NUM_COUNTERS_NB;
+			num_counters_llc = NUM_COUNTERS_L2;
+			break;
+	}
+
 	if (!boot_cpu_has(X86_FEATURE_TOPOEXT))
 		goto fail_nodev;
 
-- 
2.7.4


* [PATCH 3/3] perf/x86/amd/uncore: Update sysfs attributes for Family17h processors
  2017-01-11 16:02 [PATCH 0/3] perf/x86/amd/uncore: Update AMD uncore driver Janakarajan Natarajan
  2017-01-11 16:02 ` [PATCH 1/3] perf/x86/amd/uncore: Update AMD uncore to rename L2 to LLC Janakarajan Natarajan
  2017-01-11 16:02 ` [PATCH 2/3] perf/x86/amd/uncore: Dynamically allocate uncore counters Janakarajan Natarajan
@ 2017-01-11 16:02 ` Janakarajan Natarajan
  2 siblings, 0 replies; 9+ messages in thread
From: Janakarajan Natarajan @ 2017-01-11 16:02 UTC (permalink / raw)
  To: linux-kernel
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Alexander Shishkin, Suravee Suthikulpanit, Janakarajan Natarajan

This patch updates the sysfs attributes for AMD Family17h processors. In
Family17h, the event bit positions differ for both the NorthBridge and
last level cache counters.

The sysfs attributes are assigned based on the family and the type of
the counter.
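The three format strings in the patch ("config:0-7,32-35" for the older
NB/L2 layout, "config:0-7,32-35,59-60" for DF, and "config:0-7" for L3)
amount to different masks over the 64-bit perf config word. A rough
illustration of which bits each string selects (the helper names are
hypothetical; the kernel itself only publishes the strings through sysfs):

```c
#include <stdint.h>

/* config:0-7,32-35 -- pre-Family17h NB and L2 event selector bits */
static uint64_t event_bits_nb(uint64_t config)
{
	return (config & 0xffULL) | (((config >> 32) & 0xfULL) << 8);
}

/* config:0-7,32-35,59-60 -- Family17h Data Fabric adds two high bits */
static uint64_t event_bits_df(uint64_t config)
{
	return event_bits_nb(config) | (((config >> 59) & 0x3ULL) << 12);
}

/* config:0-7 -- Family17h L3 uses only the low event byte */
static uint64_t event_bits_l3(uint64_t config)
{
	return config & 0xffULL;
}
```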

Signed-off-by: Janakarajan Natarajan <Janakarajan.Natarajan@amd.com>
---
 arch/x86/events/amd/uncore.c | 77 ++++++++++++++++++++++++++++++++------------
 1 file changed, 56 insertions(+), 21 deletions(-)

diff --git a/arch/x86/events/amd/uncore.c b/arch/x86/events/amd/uncore.c
index 7ab92f7..5644c20 100644
--- a/arch/x86/events/amd/uncore.c
+++ b/arch/x86/events/amd/uncore.c
@@ -250,30 +250,47 @@ static struct attribute_group amd_uncore_attr_group = {
 	.attrs = amd_uncore_attrs,
 };
 
-PMU_FORMAT_ATTR(event, "config:0-7,32-35");
-PMU_FORMAT_ATTR(umask, "config:8-15");
-
-static struct attribute *amd_uncore_format_attr[] = {
-	&format_attr_event.attr,
-	&format_attr_umask.attr,
-	NULL,
-};
-
-static struct attribute_group amd_uncore_format_group = {
-	.name = "format",
-	.attrs = amd_uncore_format_attr,
+/*
+ * Similar to PMU_FORMAT_ATTR but allowing for format_attr to be assigned based
+ * on family
+ */
+#define AMD_FORMAT_ATTR(_dev, _name, _format)				     \
+static ssize_t								     \
+_dev##_show##_name(struct device *dev,					     \
+		struct device_attribute *attr,				     \
+		char *page)						     \
+{									     \
+	BUILD_BUG_ON(sizeof(_format) >= PAGE_SIZE);			     \
+	return sprintf(page, _format "\n");				     \
+}									     \
+static struct device_attribute format_attr_##_dev##_name = __ATTR_RO(_dev);
+
+/* Used for each uncore counter type */
+#define AMD_ATTRIBUTE(_name)						     \
+static struct attribute *amd_uncore_format_attr_##_name[] = {		     \
+	&format_attr_event_##_name.attr,				     \
+	&format_attr_umask.attr,					     \
+	NULL,								     \
+};									     \
+static struct attribute_group amd_uncore_format_group_##_name = {	     \
+	.name = "format",						     \
+	.attrs = amd_uncore_format_attr_##_name,			     \
+};									     \
+static const struct attribute_group *amd_uncore_attr_groups_##_name[] = {    \
+	&amd_uncore_attr_group,						     \
+	&amd_uncore_format_group_##_name,				     \
+	NULL,								     \
 };
 
-static const struct attribute_group *amd_uncore_attr_groups[] = {
-	&amd_uncore_attr_group,
-	&amd_uncore_format_group,
-	NULL,
-};
+AMD_FORMAT_ATTR(event, , "config:0-7,32-35");
+AMD_FORMAT_ATTR(umask, , "config:8-15");
+AMD_FORMAT_ATTR(event, _df, "config:0-7,32-35,59-60");
+AMD_FORMAT_ATTR(event, _l3, "config:0-7");
+AMD_ATTRIBUTE(df);
+AMD_ATTRIBUTE(l3);
 
 static struct pmu amd_nb_pmu = {
 	.task_ctx_nr	= perf_invalid_context,
-	.attr_groups	= amd_uncore_attr_groups,
-	.name		= "amd_nb",
 	.event_init	= amd_uncore_event_init,
 	.add		= amd_uncore_add,
 	.del		= amd_uncore_del,
@@ -284,8 +301,6 @@ static struct pmu amd_nb_pmu = {
 
 static struct pmu amd_llc_pmu = {
 	.task_ctx_nr	= perf_invalid_context,
-	.attr_groups	= amd_uncore_attr_groups,
-	.name		= "amd_l2",
 	.event_init	= amd_uncore_event_init,
 	.add		= amd_uncore_add,
 	.del		= amd_uncore_del,
@@ -543,11 +558,25 @@ static int __init amd_uncore_init(void)
 			/* Family 17h */
 			num_counters_nb = NUM_COUNTERS_NB;
 			num_counters_llc = NUM_COUNTERS_L3;
+			/*
+			 * For Family17h, the NorthBridge counters are
+			 * re-purposed as Data Fabric counters. Also, support is
+			 * added for L3 counters. The pmus are exported based on
+			 * family as either L2 or L3 and NB or DF.
+			 */
+			amd_nb_pmu.name = "amd_df";
+			amd_llc_pmu.name = "amd_l3";
+			format_attr_event_df.show = &event_show_df;
+			format_attr_event_l3.show = &event_show_l3;
 			break;
 		case 22:
 			/* Family 16h. May change */
 			num_counters_nb = NUM_COUNTERS_NB;
 			num_counters_llc = NUM_COUNTERS_L2;
+			amd_nb_pmu.name = "amd_nb";
+			amd_llc_pmu.name = "amd_l2";
+			format_attr_event_df = format_attr_event;
+			format_attr_event_l3 = format_attr_event;
 			break;
 		default:
 			/*
@@ -556,8 +585,14 @@ static int __init amd_uncore_init(void)
 			 */
 			num_counters_nb = NUM_COUNTERS_NB;
 			num_counters_llc = NUM_COUNTERS_L2;
+			amd_nb_pmu.name = "amd_nb";
+			amd_llc_pmu.name = "amd_l2";
+			format_attr_event_df = format_attr_event;
+			format_attr_event_l3 = format_attr_event;
 			break;
 	}
+	amd_llc_pmu.attr_groups = amd_uncore_attr_groups_l3;
+	amd_nb_pmu.attr_groups = amd_uncore_attr_groups_df;
 
 	if (!boot_cpu_has(X86_FEATURE_TOPOEXT))
 		goto fail_nodev;
-- 
2.7.4


* Re: [PATCH 2/3] perf/x86/amd/uncore: Dynamically allocate uncore counters
  2017-01-11 16:02 ` [PATCH 2/3] perf/x86/amd/uncore: Dynamically allocate uncore counters Janakarajan Natarajan
@ 2017-01-12  9:20   ` Peter Zijlstra
  2017-01-13  0:48     ` Natarajan, Janakarajan
  2017-01-12 13:12   ` [PATCH] perf/x86/amd/uncore: fix ifnullfree.cocci warnings kbuild test robot
  2017-01-12 13:12   ` [PATCH 2/3] perf/x86/amd/uncore: Dynamically allocate uncore counters kbuild test robot
  2 siblings, 1 reply; 9+ messages in thread
From: Peter Zijlstra @ 2017-01-12  9:20 UTC (permalink / raw)
  To: Janakarajan Natarajan
  Cc: linux-kernel, Ingo Molnar, Arnaldo Carvalho de Melo,
	Alexander Shishkin, Suravee Suthikulpanit

On Wed, Jan 11, 2017 at 10:02:17AM -0600, Janakarajan Natarajan wrote:
> This patch updates the AMD uncore driver to support AMD Family17h
> processors. In Family17h, there are two extra last level cache counters.
> The counters are, therefore, allocated dynamically based on the family.
> 
> The cpu hotplug up callback function is refactored to better manage
> failure conditions.
> 
> Signed-off-by: Janakarajan Natarajan <Janakarajan.Natarajan@amd.com>
> ---
>  arch/x86/events/amd/uncore.c | 141 +++++++++++++++++++++++++++++++------------
>  1 file changed, 104 insertions(+), 37 deletions(-)
> 
> diff --git a/arch/x86/events/amd/uncore.c b/arch/x86/events/amd/uncore.c
> index 24c8537..7ab92f7 100644
> --- a/arch/x86/events/amd/uncore.c
> +++ b/arch/x86/events/amd/uncore.c
> @@ -22,13 +22,16 @@
>  
>  #define NUM_COUNTERS_NB		4
>  #define NUM_COUNTERS_L2		4
> -#define MAX_COUNTERS		NUM_COUNTERS_NB
> +#define NUM_COUNTERS_L3		6
>  
>  #define RDPMC_BASE_NB		6
>  #define RDPMC_BASE_LLC		10
>  
>  #define COUNTER_SHIFT		16
>  
> +static int num_counters_llc;
> +static int num_counters_nb;
> +
>  static HLIST_HEAD(uncore_unused_list);
>  
>  struct amd_uncore {
> @@ -40,7 +43,7 @@ struct amd_uncore {
>  	u32 msr_base;
>  	cpumask_t *active_mask;
>  	struct pmu *pmu;
> -	struct perf_event *events[MAX_COUNTERS];
> +	struct perf_event **events;
>  	struct hlist_node node;
>  };

Why bother with the dynamic allocation crud? Why not simply set
MAX_COUNTERS to 6 and be happy?


* [PATCH] perf/x86/amd/uncore: fix ifnullfree.cocci warnings
  2017-01-11 16:02 ` [PATCH 2/3] perf/x86/amd/uncore: Dynamically allocate uncore counters Janakarajan Natarajan
  2017-01-12  9:20   ` Peter Zijlstra
@ 2017-01-12 13:12   ` kbuild test robot
  2017-01-12 13:12   ` [PATCH 2/3] perf/x86/amd/uncore: Dynamically allocate uncore counters kbuild test robot
  2 siblings, 0 replies; 9+ messages in thread
From: kbuild test robot @ 2017-01-12 13:12 UTC (permalink / raw)
  To: Janakarajan Natarajan
  Cc: kbuild-all, linux-kernel, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Alexander Shishkin,
	Suravee Suthikulpanit, Janakarajan Natarajan

arch/x86/events/amd/uncore.c:365:2-7: WARNING: NULL check before freeing functions like kfree, debugfs_remove, debugfs_remove_recursive or usb_free_urb is not needed. Maybe consider reorganizing relevant code to avoid passing NULL values.

 NULL check before some freeing functions is not needed.

 Based on checkpatch warning
 "kfree(NULL) is safe this check is probably not required"
 and kfreeaddr.cocci by Julia Lawall.

Generated by: scripts/coccinelle/free/ifnullfree.cocci
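For reference, kfree() is documented to accept NULL and do nothing, the same
guarantee the C standard gives userspace free(), which is why the check can
simply be dropped. A minimal userspace sketch of the unconditional cleanup
(the helper name is hypothetical):

```c
#include <stdlib.h>

/*
 * kfree(NULL) is a no-op, just as free(NULL) is in userspace, so a
 * cleanup path can release a possibly-unset slot unconditionally.
 */
static void slot_free(void **slot)
{
	free(*slot);	/* safe even when *slot is NULL */
	*slot = NULL;
}
```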

CC: Janakarajan Natarajan <Janakarajan.Natarajan@amd.com>
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
---

 uncore.c |    3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

--- a/arch/x86/events/amd/uncore.c
+++ b/arch/x86/events/amd/uncore.c
@@ -361,8 +361,7 @@ static void amd_uncore_cpu_up_nb_free(un
 	struct amd_uncore *uncore_nb;
 
 	uncore_nb = *per_cpu_ptr(amd_uncore_nb, cpu);
-	if (uncore_nb)
-		kfree(uncore_nb);
+	kfree(uncore_nb);
 	*per_cpu_ptr(amd_uncore_nb, cpu) = NULL;
 }
 


* Re: [PATCH 2/3] perf/x86/amd/uncore: Dynamically allocate uncore counters
  2017-01-11 16:02 ` [PATCH 2/3] perf/x86/amd/uncore: Dynamically allocate uncore counters Janakarajan Natarajan
  2017-01-12  9:20   ` Peter Zijlstra
  2017-01-12 13:12   ` [PATCH] perf/x86/amd/uncore: fix ifnullfree.cocci warnings kbuild test robot
@ 2017-01-12 13:12   ` kbuild test robot
  2 siblings, 0 replies; 9+ messages in thread
From: kbuild test robot @ 2017-01-12 13:12 UTC (permalink / raw)
  To: Janakarajan Natarajan
  Cc: kbuild-all, linux-kernel, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Alexander Shishkin,
	Suravee Suthikulpanit, Janakarajan Natarajan

Hi Janakarajan,

[auto build test WARNING on next-20170111]
[also build test WARNING on v4.10-rc3]
[cannot apply to tip/x86/core v4.9-rc8 v4.9-rc7 v4.9-rc6]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Janakarajan-Natarajan/perf-x86-amd-uncore-Update-AMD-uncore-driver/20170112-184113


coccinelle warnings: (new ones prefixed by >>)

>> arch/x86/events/amd/uncore.c:365:2-7: WARNING: NULL check before freeing functions like kfree, debugfs_remove, debugfs_remove_recursive or usb_free_urb is not needed. Maybe consider reorganizing relevant code to avoid passing NULL values.

Please review and possibly fold the followup patch.

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation


* Re: [PATCH 2/3] perf/x86/amd/uncore: Dynamically allocate uncore counters
  2017-01-12  9:20   ` Peter Zijlstra
@ 2017-01-13  0:48     ` Natarajan, Janakarajan
  2017-01-13  8:58       ` Peter Zijlstra
  0 siblings, 1 reply; 9+ messages in thread
From: Natarajan, Janakarajan @ 2017-01-13  0:48 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, Ingo Molnar, Arnaldo Carvalho de Melo,
	Alexander Shishkin, Suravee Suthikulpanit


On 1/12/2017 3:20 AM, Peter Zijlstra wrote:
> On Wed, Jan 11, 2017 at 10:02:17AM -0600, Janakarajan Natarajan wrote:
>> This patch updates the AMD uncore driver to support AMD Family17h
>> processors. In Family17h, there are two extra last level cache counters.
>> The counters are, therefore, allocated dynamically based on the family.
>>
>> The cpu hotplug up callback function is refactored to better manage
>> failure conditions.
>>
>> Signed-off-by: Janakarajan Natarajan <Janakarajan.Natarajan@amd.com>
>> ---
>>   arch/x86/events/amd/uncore.c | 141 +++++++++++++++++++++++++++++++------------
>>   1 file changed, 104 insertions(+), 37 deletions(-)
>>
>> diff --git a/arch/x86/events/amd/uncore.c b/arch/x86/events/amd/uncore.c
>> index 24c8537..7ab92f7 100644
>> --- a/arch/x86/events/amd/uncore.c
>> +++ b/arch/x86/events/amd/uncore.c
>> @@ -22,13 +22,16 @@
>>   
>>   #define NUM_COUNTERS_NB		4
>>   #define NUM_COUNTERS_L2		4
>> -#define MAX_COUNTERS		NUM_COUNTERS_NB
>> +#define NUM_COUNTERS_L3		6
>>   
>>   #define RDPMC_BASE_NB		6
>>   #define RDPMC_BASE_LLC		10
>>   
>>   #define COUNTER_SHIFT		16
>>   
>> +static int num_counters_llc;
>> +static int num_counters_nb;
>> +
>>   static HLIST_HEAD(uncore_unused_list);
>>   
>>   struct amd_uncore {
>> @@ -40,7 +43,7 @@ struct amd_uncore {
>>   	u32 msr_base;
>>   	cpumask_t *active_mask;
>>   	struct pmu *pmu;
>> -	struct perf_event *events[MAX_COUNTERS];
>> +	struct perf_event **events;
>>   	struct hlist_node node;
>>   };
> Why bother with the dynamic allocation crud? Why not simply set
> MAX_COUNTERS to 6 and be happy?
My reasoning behind using dynamic allocation was to prevent memory from
being allocated when not needed on a per cpu basis. If memory isn't a
consideration, I can send a v2 without the dynamic memory allocation.


* Re: [PATCH 2/3] perf/x86/amd/uncore: Dynamically allocate uncore counters
  2017-01-13  0:48     ` Natarajan, Janakarajan
@ 2017-01-13  8:58       ` Peter Zijlstra
  0 siblings, 0 replies; 9+ messages in thread
From: Peter Zijlstra @ 2017-01-13  8:58 UTC (permalink / raw)
  To: Natarajan, Janakarajan
  Cc: linux-kernel, Ingo Molnar, Arnaldo Carvalho de Melo,
	Alexander Shishkin, Suravee Suthikulpanit

On Thu, Jan 12, 2017 at 06:48:40PM -0600, Natarajan, Janakarajan wrote:
> 
> On 1/12/2017 3:20 AM, Peter Zijlstra wrote:
> >On Wed, Jan 11, 2017 at 10:02:17AM -0600, Janakarajan Natarajan wrote:
> >>This patch updates the AMD uncore driver to support AMD Family17h
> >>processors. In Family17h, there are two extra last level cache counters.
> >>The counters are, therefore, allocated dynamically based on the family.
> >>
> >>The cpu hotplug up callback function is refactored to better manage
> >>failure conditions.
> >>
> >>Signed-off-by: Janakarajan Natarajan <Janakarajan.Natarajan@amd.com>
> >>---
> >>  arch/x86/events/amd/uncore.c | 141 +++++++++++++++++++++++++++++++------------
> >>  1 file changed, 104 insertions(+), 37 deletions(-)
> >>
> >>diff --git a/arch/x86/events/amd/uncore.c b/arch/x86/events/amd/uncore.c
> >>index 24c8537..7ab92f7 100644
> >>--- a/arch/x86/events/amd/uncore.c
> >>+++ b/arch/x86/events/amd/uncore.c
> >>@@ -22,13 +22,16 @@
> >>  #define NUM_COUNTERS_NB		4
> >>  #define NUM_COUNTERS_L2		4
> >>-#define MAX_COUNTERS		NUM_COUNTERS_NB
> >>+#define NUM_COUNTERS_L3		6
> >>  #define RDPMC_BASE_NB		6
> >>  #define RDPMC_BASE_LLC		10
> >>  #define COUNTER_SHIFT		16
> >>+static int num_counters_llc;
> >>+static int num_counters_nb;
> >>+
> >>  static HLIST_HEAD(uncore_unused_list);
> >>  struct amd_uncore {
> >>@@ -40,7 +43,7 @@ struct amd_uncore {
> >>  	u32 msr_base;
> >>  	cpumask_t *active_mask;
> >>  	struct pmu *pmu;
> >>-	struct perf_event *events[MAX_COUNTERS];
> >>+	struct perf_event **events;
> >>  	struct hlist_node node;
> >>  };
> >Why bother with the dynamic allocation crud? Why not simply set
> >MAX_COUNTERS to 6 and be happy?
> My reasoning behind using dynamic allocation was to prevent memory from
> being allocated when not needed on a per cpu basis. If memory isn't a
> consideration, I can send a v2 without the dynamic memory allocation.

Generally a sensible thing to consider, but here we're talking about 16
bytes or so and are adding quite a bit of logic just to save that
on older hardware.

Also, doing that allocation comes at the cost of having to do an extra
pointer dereference every time you use these things.

I'd just keep static sized array and bump the max to 6 and not worry too
much.

(fwiw, the generic x86 cpu pmu uses static arrays of 64 entries, even
though there's not a single PMU that actually has that many counters.)
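The size estimate in this reply can be checked directly: on a 64-bit kernel
the difference between a 4-entry and a 6-entry pointer array is exactly two
pointers per uncore instance. A standalone sketch, with the struct trimmed
to just the array in question (names hypothetical):

```c
#include <stddef.h>

#define NUM_COUNTERS_NB	4	/* older families */
#define NUM_COUNTERS_L3	6	/* Family17h */

struct event;			/* stand-in for struct perf_event */

/* Static array sized for the largest family, as suggested. */
struct uncore_max {
	struct event *events[NUM_COUNTERS_L3];
};

/* Array sized exactly for the older families. */
struct uncore_exact {
	struct event *events[NUM_COUNTERS_NB];
};

static size_t per_instance_overhead(void)
{
	return sizeof(struct uncore_max) - sizeof(struct uncore_exact);
}
```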
