All of lore.kernel.org
 help / color / mirror / Atom feed
* [patch V3 00/28] x86/perf/intel/uncore|rapl: Fix error handling and sanitize pmu management
@ 2016-02-22 22:19 Thomas Gleixner
  2016-02-22 22:19 ` [patch V3 01/28] x86/perf/intel/uncore: Remove pointless mask check Thomas Gleixner
                   ` (27 more replies)
  0 siblings, 28 replies; 58+ messages in thread
From: Thomas Gleixner @ 2016-02-22 22:19 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, x86, Borislav Petkov, Stephane Eranian,
	Harish Chegondi, Kan Liang, Andi Kleen, Jacob Pan

This series addresses the following issues:

 - Add proper error handling to uncore and rapl drivers

 - Get rid of the pseudo per cpuness of these drivers and do a proper per
   package storage

 - Allow them to be modular

In order to do proper per package storage I added a facility which sanity
checks the physical package id of the processors which is supplied by bios and
does a logical package id translation. That allows drivers to do allocations
for the maximum number of possible packages independent of possible BIOS
creativity.

The module patches are optional. Andi pointed out that they miss the proper
auto loading/probing machinery, but I kept them and let Peter decide what to
do with them.

Changes vs. V2:

  - Change export to GPL

  - Fix the leftover while loop

  - Fix the snb pmu_private bogosity

  - Explain in the changelogs why we want to keep pmu_private

  - Drop the proc/cpuinfo change

  - Add an explanation to the Kconfig help text why selecting 'm' might break
    existing setups.

Delta patch below.

Thanks,

	tglx
---

--- a/arch/x86/Kconfig.perf
+++ b/arch/x86/Kconfig.perf
@@ -8,6 +8,9 @@ config PERF_EVENTS_INTEL_UNCORE
 	  Include support for Intel uncore performance events. These are
 	  available on NehalemEX and more modern processors.
 
+	  Note: Selecting 'm' might break existing setups as the drivers
+	  lack the autoprobe/load magic. If you need them select: y.
+
 	  If unsure say y.
 
 config PERF_EVENTS_INTEL_RAPL
@@ -18,6 +21,9 @@ config PERF_EVENTS_INTEL_RAPL
 	  Include support for Intel rapl performance events for power
 	  monitoring on modern processors.
 
+	  Note: Selecting 'm' might break existing setups as the drivers
+	  lack the autoprobe/load magic. If you need them select: y.
+
 	  If unsure say y.
 
 endmenu
--- a/arch/x86/kernel/cpu/perf_event_intel_uncore.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
@@ -774,9 +774,9 @@ static void __uncore_exit_boxes(struct i
 
 static void uncore_exit_boxes(void *dummy)
 {
-	struct intel_uncore_type **types = uncore_msr_uncores;
+	struct intel_uncore_type **types;
 
-	while (*types)
+	for (types = uncore_msr_uncores; *types; types++)
 		__uncore_exit_boxes(*types++, smp_processor_id());
 }
 
--- a/arch/x86/kernel/cpu/perf_event_intel_uncore_snb.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_uncore_snb.c
@@ -313,7 +313,7 @@ static int snb_uncore_imc_event_init(str
 		return -EINVAL;
 
 	event->cpu = box->cpu;
-	event->pmu_private = pmu;
+	event->pmu_private = box;
 
 	event->hw.idx = -1;
 	event->hw.last_tag = ~0ULL;
--- a/arch/x86/kernel/cpu/proc.c
+++ b/arch/x86/kernel/cpu/proc.c
@@ -12,7 +12,6 @@ static void show_cpuinfo_core(struct seq
 {
 #ifdef CONFIG_SMP
 	seq_printf(m, "physical id\t: %d\n", c->phys_proc_id);
-	seq_printf(m, "logical id\t: %d\n", c->logical_proc_id);
 	seq_printf(m, "siblings\t: %d\n",
 		   cpumask_weight(topology_core_cpumask(cpu)));
 	seq_printf(m, "core id\t\t: %d\n", c->cpu_core_id);
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -9335,7 +9335,7 @@ ssize_t perf_event_sysfs_show(struct dev
 
 	return 0;
 }
-EXPORT_SYMBOL(perf_event_sysfs_show);
+EXPORT_SYMBOL_GPL(perf_event_sysfs_show);
 
 static int __init perf_event_sysfs_init(void)
 {

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [patch V3 01/28] x86/perf/intel/uncore: Remove pointless mask check
  2016-02-22 22:19 [patch V3 00/28] x86/perf/intel/uncore|rapl: Fix error handling and sanitize pmu management Thomas Gleixner
@ 2016-02-22 22:19 ` Thomas Gleixner
  2016-02-29 11:03   ` [tip:perf/core] perf/x86/intel/uncore: " tip-bot for Thomas Gleixner
  2016-02-22 22:19 ` [patch V3 02/28] x86/perf/intel/uncore: Simplify error rollback Thomas Gleixner
                   ` (26 subsequent siblings)
  27 siblings, 1 reply; 58+ messages in thread
From: Thomas Gleixner @ 2016-02-22 22:19 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, x86, Borislav Petkov, Stephane Eranian,
	Harish Chegondi, Kan Liang, Andi Kleen, Jacob Pan

[-- Attachment #1: x86-perf-intel_uncore--Remove-pointless-mask-check.patch --]
[-- Type: text/plain, Size: 619 bytes --]

uncore_cpumask_init() is only ever called from intel_uncore_init() where the
mask is guaranteed to be empty.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/cpu/perf_event_intel_uncore.c |    6 ------
 1 file changed, 6 deletions(-)

--- a/arch/x86/kernel/cpu/perf_event_intel_uncore.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
@@ -1342,12 +1342,6 @@ static void __init uncore_cpumask_init(v
 {
 	int cpu;
 
-	/*
-	 * ony invoke once from msr or pci init code
-	 */
-	if (!cpumask_empty(&uncore_cpu_mask))
-		return;
-
 	cpu_notifier_register_begin();
 
 	for_each_online_cpu(cpu) {

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [patch V3 03/28] x86/perf/intel/uncore: Fix error handling
  2016-02-22 22:19 [patch V3 00/28] x86/perf/intel/uncore|rapl: Fix error handling and sanitize pmu management Thomas Gleixner
  2016-02-22 22:19 ` [patch V3 01/28] x86/perf/intel/uncore: Remove pointless mask check Thomas Gleixner
  2016-02-22 22:19 ` [patch V3 02/28] x86/perf/intel/uncore: Simplify error rollback Thomas Gleixner
@ 2016-02-22 22:19 ` Thomas Gleixner
  2016-02-29 11:04   ` [tip:perf/core] perf/x86/intel/uncore: " tip-bot for Thomas Gleixner
  2016-02-22 22:19 ` [patch V3 04/28] x86/perf/intel/uncore: Add sanity checks for pci dev package id Thomas Gleixner
                   ` (24 subsequent siblings)
  27 siblings, 1 reply; 58+ messages in thread
From: Thomas Gleixner @ 2016-02-22 22:19 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, x86, Borislav Petkov, Stephane Eranian,
	Harish Chegondi, Kan Liang, Andi Kleen, Jacob Pan

[-- Attachment #1: x86-perf-intel_uncore--Fix-error-handling.patch --]
[-- Type: text/plain, Size: 6930 bytes --]

This driver lacks any form of proper error handling. If initialization fails
or hotplug prepare fails, it lets the facility with half initialized stuff
around.

Fix the state and memory leaks in a first step. As a second step we need to
undo the hardware state which is set via uncore_box_init() on some of the
uncore implementations.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/cpu/perf_event_intel_uncore.c |  122 ++++++++++++++++++--------
 arch/x86/kernel/cpu/perf_event_intel_uncore.h |   15 +--
 2 files changed, 94 insertions(+), 43 deletions(-)

Index: b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
===================================================================
--- a/arch/x86/kernel/cpu/perf_event_intel_uncore.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
@@ -38,6 +38,16 @@ int uncore_pcibus_to_physid(struct pci_b
 	return phys_id;
 }
 
+static void uncore_free_pcibus_map(void)
+{
+	struct pci2phy_map *map, *tmp;
+
+	list_for_each_entry_safe(map, tmp, &pci2phy_map_head, list) {
+		list_del(&map->list);
+		kfree(map);
+	}
+}
+
 struct pci2phy_map *__find_pci2phy_map(int segment)
 {
 	struct pci2phy_map *map, *alloc = NULL;
@@ -760,16 +770,28 @@ static int uncore_pmu_register(struct in
 	}
 
 	ret = perf_pmu_register(&pmu->pmu, pmu->name, -1);
+	if (!ret)
+		pmu->registered = true;
 	return ret;
 }
 
+static void uncore_pmu_unregister(struct intel_uncore_pmu *pmu)
+{
+	if (!pmu->registered)
+		return;
+	perf_pmu_unregister(&pmu->pmu);
+	pmu->registered = false;
+}
+
 static void __init uncore_type_exit(struct intel_uncore_type *type)
 {
 	int i;
 
 	if (type->pmus) {
-		for (i = 0; i < type->num_boxes; i++)
+		for (i = 0; i < type->num_boxes; i++) {
+			uncore_pmu_unregister(&type->pmus[i]);
 			free_percpu(type->pmus[i].box);
+		}
 		kfree(type->pmus);
 		type->pmus = NULL;
 	}
@@ -856,8 +878,8 @@ static int uncore_pci_probe(struct pci_d
 	struct intel_uncore_pmu *pmu;
 	struct intel_uncore_box *box;
 	struct intel_uncore_type *type;
-	int phys_id;
 	bool first_box = false;
+	int phys_id, ret;
 
 	phys_id = uncore_pcibus_to_physid(pdev->bus);
 	if (phys_id < 0)
@@ -906,9 +928,18 @@ static int uncore_pci_probe(struct pci_d
 	list_add_tail(&box->list, &pmu->box_list);
 	raw_spin_unlock(&uncore_box_lock);
 
-	if (first_box)
-		uncore_pmu_register(pmu);
-	return 0;
+	if (!first_box)
+		return 0;
+
+	ret = uncore_pmu_register(pmu);
+	if (ret) {
+		pci_set_drvdata(pdev, NULL);
+		raw_spin_lock(&uncore_box_lock);
+		list_del(&box->list);
+		raw_spin_unlock(&uncore_box_lock);
+		kfree(box);
+	}
+	return ret;
 }
 
 static void uncore_pci_remove(struct pci_dev *pdev)
@@ -954,7 +985,7 @@ static void uncore_pci_remove(struct pci
 	kfree(box);
 
 	if (last_box)
-		perf_pmu_unregister(&pmu->pmu);
+		uncore_pmu_unregister(pmu);
 }
 
 static int __init uncore_pci_init(void)
@@ -1018,6 +1049,7 @@ static int __init uncore_pci_init(void)
 err:
 	uncore_types_exit(uncore_pci_uncores);
 	uncore_pci_uncores = empty_uncore;
+	uncore_free_pcibus_map();
 	return ret;
 }
 
@@ -1027,6 +1059,7 @@ static void __init uncore_pci_exit(void)
 		pcidrv_registered = false;
 		pci_unregister_driver(uncore_pci_driver);
 		uncore_types_exit(uncore_pci_uncores);
+		uncore_free_pcibus_map();
 	}
 }
 
@@ -1223,8 +1256,7 @@ static int uncore_cpu_notifier(struct no
 	/* allocate/free data structure for uncore box */
 	switch (action & ~CPU_TASKS_FROZEN) {
 	case CPU_UP_PREPARE:
-		uncore_cpu_prepare(cpu, -1);
-		break;
+		return notifier_from_errno(uncore_cpu_prepare(cpu, -1));
 	case CPU_STARTING:
 		uncore_cpu_starting(cpu);
 		break;
@@ -1265,9 +1297,29 @@ static struct notifier_block uncore_cpu_
 	.priority	= CPU_PRI_PERF + 1,
 };
 
-static void __init uncore_cpu_setup(void *dummy)
+static int __init type_pmu_register(struct intel_uncore_type *type)
 {
-	uncore_cpu_starting(smp_processor_id());
+	int i, ret;
+
+	for (i = 0; i < type->num_boxes; i++) {
+		ret = uncore_pmu_register(&type->pmus[i]);
+		if (ret)
+			return ret;
+	}
+	return 0;
+}
+
+static int __init uncore_msr_pmus_register(void)
+{
+	struct intel_uncore_type **types = uncore_msr_uncores;
+	int ret;
+
+	while (*types) {
+		ret = type_pmu_register(*types++);
+		if (ret)
+			return ret;
+	}
+	return 0;
 }
 
 static int __init uncore_cpu_init(void)
@@ -1317,6 +1369,10 @@ static int __init uncore_cpu_init(void)
 	ret = uncore_types_init(uncore_msr_uncores);
 	if (ret)
 		goto err;
+
+	ret = uncore_msr_pmus_register();
+	if (ret)
+		goto err;
 	return 0;
 err:
 	uncore_types_exit(uncore_msr_uncores);
@@ -1324,26 +1380,14 @@ static int __init uncore_cpu_init(void)
 	return ret;
 }
 
-static int __init uncore_pmus_register(void)
+static void __init uncore_cpu_setup(void *dummy)
 {
-	struct intel_uncore_pmu *pmu;
-	struct intel_uncore_type *type;
-	int i, j;
-
-	for (i = 0; uncore_msr_uncores[i]; i++) {
-		type = uncore_msr_uncores[i];
-		for (j = 0; j < type->num_boxes; j++) {
-			pmu = &type->pmus[j];
-			uncore_pmu_register(pmu);
-		}
-	}
-
-	return 0;
+	uncore_cpu_starting(smp_processor_id());
 }
 
-static void __init uncore_cpumask_init(void)
+static int __init uncore_cpumask_init(void)
 {
-	int cpu;
+	int cpu, ret = 0;
 
 	cpu_notifier_register_begin();
 
@@ -1359,17 +1403,20 @@ static void __init uncore_cpumask_init(v
 		if (phys_id < 0)
 			continue;
 
-		uncore_cpu_prepare(cpu, phys_id);
+		ret = uncore_cpu_prepare(cpu, phys_id);
+		if (ret)
+			goto out;
 		uncore_event_init_cpu(cpu);
 	}
 	on_each_cpu(uncore_cpu_setup, NULL, 1);
 
 	__register_cpu_notifier(&uncore_cpu_nb);
 
+out:
 	cpu_notifier_register_done();
+	return ret;
 }
 
-
 static int __init intel_uncore_init(void)
 {
 	int ret;
@@ -1382,17 +1429,20 @@ static int __init intel_uncore_init(void
 
 	ret = uncore_pci_init();
 	if (ret)
-		goto fail;
+		return ret;
 	ret = uncore_cpu_init();
-	if (ret) {
-		uncore_pci_exit();
-		goto fail;
-	}
-	uncore_cpumask_init();
+	if (ret)
+		goto errpci;
+	ret = uncore_cpumask_init();
+	if (ret)
+		goto errcpu;
 
-	uncore_pmus_register();
 	return 0;
-fail:
+
+errcpu:
+	uncore_types_exit(uncore_msr_uncores);
+errpci:
+	uncore_pci_exit();
 	return ret;
 }
 device_initcall(intel_uncore_init);
Index: b/arch/x86/kernel/cpu/perf_event_intel_uncore.h
===================================================================
--- a/arch/x86/kernel/cpu/perf_event_intel_uncore.h
+++ b/arch/x86/kernel/cpu/perf_event_intel_uncore.h
@@ -73,13 +73,14 @@ struct intel_uncore_ops {
 };
 
 struct intel_uncore_pmu {
-	struct pmu pmu;
-	char name[UNCORE_PMU_NAME_LEN];
-	int pmu_idx;
-	int func_id;
-	struct intel_uncore_type *type;
-	struct intel_uncore_box ** __percpu box;
-	struct list_head box_list;
+	struct pmu			pmu;
+	char				name[UNCORE_PMU_NAME_LEN];
+	int				pmu_idx;
+	int				func_id;
+	bool				registered;
+	struct intel_uncore_type	*type;
+	struct intel_uncore_box		** __percpu box;
+	struct list_head		box_list;
 };
 
 struct intel_uncore_extra_reg {

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [patch V3 02/28] x86/perf/intel/uncore: Simplify error rollback
  2016-02-22 22:19 [patch V3 00/28] x86/perf/intel/uncore|rapl: Fix error handling and sanitize pmu management Thomas Gleixner
  2016-02-22 22:19 ` [patch V3 01/28] x86/perf/intel/uncore: Remove pointless mask check Thomas Gleixner
@ 2016-02-22 22:19 ` Thomas Gleixner
  2016-02-29 11:03   ` [tip:perf/core] perf/x86/intel/uncore: " tip-bot for Thomas Gleixner
  2016-02-22 22:19 ` [patch V3 03/28] x86/perf/intel/uncore: Fix error handling Thomas Gleixner
                   ` (25 subsequent siblings)
  27 siblings, 1 reply; 58+ messages in thread
From: Thomas Gleixner @ 2016-02-22 22:19 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, x86, Borislav Petkov, Stephane Eranian,
	Harish Chegondi, Kan Liang, Andi Kleen, Jacob Pan

[-- Attachment #1: x86-perf-intel_uncore--Simplify-error-rollback.patch --]
[-- Type: text/plain, Size: 3058 bytes --]

No point in doing partial rollbacks. Robustify uncore_exit_type() so it does
not dereference type->pmus unconditionally and remove all the partial rollback
hackery.

Preparatory patch for proper error handling.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/cpu/perf_event_intel_uncore.c |   45 +++++++++++++-------------
 1 file changed, 24 insertions(+), 21 deletions(-)

--- a/arch/x86/kernel/cpu/perf_event_intel_uncore.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
@@ -767,10 +767,12 @@ static void __init uncore_type_exit(stru
 {
 	int i;
 
-	for (i = 0; i < type->num_boxes; i++)
-		free_percpu(type->pmus[i].box);
-	kfree(type->pmus);
-	type->pmus = NULL;
+	if (type->pmus) {
+		for (i = 0; i < type->num_boxes; i++)
+			free_percpu(type->pmus[i].box);
+		kfree(type->pmus);
+		type->pmus = NULL;
+	}
 	kfree(type->events_group);
 	type->events_group = NULL;
 }
@@ -778,6 +780,7 @@ static void __init uncore_type_exit(stru
 static void __init uncore_types_exit(struct intel_uncore_type **types)
 {
 	int i;
+
 	for (i = 0; types[i]; i++)
 		uncore_type_exit(types[i]);
 }
@@ -806,7 +809,7 @@ static int __init uncore_type_init(struc
 		INIT_LIST_HEAD(&pmus[i].box_list);
 		pmus[i].box = alloc_percpu(struct intel_uncore_box *);
 		if (!pmus[i].box)
-			goto fail;
+			return -ENOMEM;
 	}
 
 	if (type->event_descs) {
@@ -817,7 +820,7 @@ static int __init uncore_type_init(struc
 		attr_group = kzalloc(sizeof(struct attribute *) * (i + 1) +
 					sizeof(*attr_group), GFP_KERNEL);
 		if (!attr_group)
-			goto fail;
+			return -ENOMEM;
 
 		attrs = (struct attribute **)(attr_group + 1);
 		attr_group->name = "events";
@@ -831,9 +834,6 @@ static int __init uncore_type_init(struc
 
 	type->pmu_group = &uncore_pmu_attr_group;
 	return 0;
-fail:
-	uncore_type_exit(type);
-	return -ENOMEM;
 }
 
 static int __init uncore_types_init(struct intel_uncore_type **types)
@@ -843,13 +843,9 @@ static int __init uncore_types_init(stru
 	for (i = 0; types[i]; i++) {
 		ret = uncore_type_init(types[i]);
 		if (ret)
-			goto fail;
+			return ret;
 	}
 	return 0;
-fail:
-	while (--i >= 0)
-		uncore_type_exit(types[i]);
-	return ret;
 }
 
 /*
@@ -1007,17 +1003,21 @@ static int __init uncore_pci_init(void)
 
 	ret = uncore_types_init(uncore_pci_uncores);
 	if (ret)
-		return ret;
+		goto err;
 
 	uncore_pci_driver->probe = uncore_pci_probe;
 	uncore_pci_driver->remove = uncore_pci_remove;
 
 	ret = pci_register_driver(uncore_pci_driver);
-	if (ret == 0)
-		pcidrv_registered = true;
-	else
-		uncore_types_exit(uncore_pci_uncores);
+	if (ret)
+		goto err;
+
+	pcidrv_registered = true;
+	return 0;
 
+err:
+	uncore_types_exit(uncore_pci_uncores);
+	uncore_pci_uncores = empty_uncore;
 	return ret;
 }
 
@@ -1316,9 +1316,12 @@ static int __init uncore_cpu_init(void)
 
 	ret = uncore_types_init(uncore_msr_uncores);
 	if (ret)
-		return ret;
-
+		goto err;
 	return 0;
+err:
+	uncore_types_exit(uncore_msr_uncores);
+	uncore_msr_uncores = empty_uncore;
+	return ret;
 }
 
 static int __init uncore_pmus_register(void)

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [patch V3 04/28] x86/perf/intel/uncore: Add sanity checks for pci dev package id
  2016-02-22 22:19 [patch V3 00/28] x86/perf/intel/uncore|rapl: Fix error handling and sanitize pmu management Thomas Gleixner
                   ` (2 preceding siblings ...)
  2016-02-22 22:19 ` [patch V3 03/28] x86/perf/intel/uncore: Fix error handling Thomas Gleixner
@ 2016-02-22 22:19 ` Thomas Gleixner
  2016-02-29 11:04   ` [tip:perf/core] perf/x86/intel/uncore: Add sanity checks for PCI " tip-bot for Thomas Gleixner
  2016-02-22 22:19 ` [patch V3 05/28] x86/perf/intel_uncore: Cleanup hardware on exit Thomas Gleixner
                   ` (23 subsequent siblings)
  27 siblings, 1 reply; 58+ messages in thread
From: Thomas Gleixner @ 2016-02-22 22:19 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, x86, Borislav Petkov, Stephane Eranian,
	Harish Chegondi, Kan Liang, Andi Kleen, Jacob Pan

[-- Attachment #1: x86-perf-intel_uncore--Add-sanity-checks-for-package-id.patch --]
[-- Type: text/plain, Size: 742 bytes --]

The storage array is size limited, but misses a sanity check

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/cpu/perf_event_intel_uncore.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
===================================================================
--- a/arch/x86/kernel/cpu/perf_event_intel_uncore.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
@@ -882,7 +882,7 @@ static int uncore_pci_probe(struct pci_d
 	int phys_id, ret;
 
 	phys_id = uncore_pcibus_to_physid(pdev->bus);
-	if (phys_id < 0)
+	if (phys_id < 0 || phys_id >= UNCORE_SOCKET_MAX)
 		return -ENODEV;
 
 	if (UNCORE_PCI_DEV_TYPE(id->driver_data) == UNCORE_EXTRA_PCI_DEV) {

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [patch V3 06/28] x86/perf/intel/uncore: Drop pointless hotplug priority
  2016-02-22 22:19 [patch V3 00/28] x86/perf/intel/uncore|rapl: Fix error handling and sanitize pmu management Thomas Gleixner
                   ` (4 preceding siblings ...)
  2016-02-22 22:19 ` [patch V3 05/28] x86/perf/intel_uncore: Cleanup hardware on exit Thomas Gleixner
@ 2016-02-22 22:19 ` Thomas Gleixner
  2016-02-22 22:19 ` [patch V3 07/28] x86/perf/intel_uncore: Make code readable Thomas Gleixner
                   ` (21 subsequent siblings)
  27 siblings, 0 replies; 58+ messages in thread
From: Thomas Gleixner @ 2016-02-22 22:19 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, x86, Borislav Petkov, Stephane Eranian,
	Harish Chegondi, Kan Liang, Andi Kleen, Jacob Pan

[-- Attachment #1: x86-perf-intel-uncore--Drop-pointless-hotplug-priority.patch --]
[-- Type: text/plain, Size: 713 bytes --]

The perf core doesn't do anything related to hardware pmus on hotplug. Drop
the priority.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/cpu/perf_event_intel_uncore.c |    5 -----
 1 file changed, 5 deletions(-)

--- a/arch/x86/kernel/cpu/perf_event_intel_uncore.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
@@ -1294,11 +1294,6 @@ static int uncore_cpu_notifier(struct no
 
 static struct notifier_block uncore_cpu_nb = {
 	.notifier_call	= uncore_cpu_notifier,
-	/*
-	 * to migrate uncore events, our notifier should be executed
-	 * before perf core's notifier.
-	 */
-	.priority	= CPU_PRI_PERF + 1,
 };
 
 static int __init type_pmu_register(struct intel_uncore_type *type)

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [patch V3 05/28] x86/perf/intel_uncore: Cleanup hardware on exit
  2016-02-22 22:19 [patch V3 00/28] x86/perf/intel/uncore|rapl: Fix error handling and sanitize pmu management Thomas Gleixner
                   ` (3 preceding siblings ...)
  2016-02-22 22:19 ` [patch V3 04/28] x86/perf/intel/uncore: Add sanity checks for pci dev package id Thomas Gleixner
@ 2016-02-22 22:19 ` Thomas Gleixner
  2016-02-29 11:05   ` [tip:perf/core] perf/x86/intel/uncore: Clean up " tip-bot for Thomas Gleixner
  2016-02-22 22:19 ` [patch V3 06/28] x86/perf/intel/uncore: Drop pointless hotplug priority Thomas Gleixner
                   ` (22 subsequent siblings)
  27 siblings, 1 reply; 58+ messages in thread
From: Thomas Gleixner @ 2016-02-22 22:19 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, x86, Borislav Petkov, Stephane Eranian,
	Harish Chegondi, Kan Liang, Andi Kleen, Jacob Pan

[-- Attachment #1: x86-perf-intel_uncore--Cleanup-hardware-on-exit.patch --]
[-- Type: text/plain, Size: 4817 bytes --]

When tearing down the boxes nothing undoes the hardware state which was setup
by box->init_box(). Add a box->exit_box() callback and implement it for the
uncores which have an init_box() callback.

This misses the cleanup in the error exit pathes, but I cannot be bothered to
implement it before cleaning up the rest of the driver, which makes that task
way simpler.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
V2: Removed the code which affects IVB and newer as Liang pointed out that
    it's incorrect
---
 arch/x86/kernel/cpu/perf_event_intel_uncore.c       |    6 +++++-
 arch/x86/kernel/cpu/perf_event_intel_uncore.h       |    9 +++++++++
 arch/x86/kernel/cpu/perf_event_intel_uncore_nhmex.c |    6 ++++++
 arch/x86/kernel/cpu/perf_event_intel_uncore_snb.c   |   13 +++++++++++++
 4 files changed, 33 insertions(+), 1 deletion(-)

--- a/arch/x86/kernel/cpu/perf_event_intel_uncore.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
@@ -937,6 +937,7 @@ static int uncore_pci_probe(struct pci_d
 		raw_spin_lock(&uncore_box_lock);
 		list_del(&box->list);
 		raw_spin_unlock(&uncore_box_lock);
+		uncore_box_exit(box);
 		kfree(box);
 	}
 	return ret;
@@ -982,6 +983,7 @@ static void uncore_pci_remove(struct pci
 	}
 
 	WARN_ON_ONCE(atomic_read(&box->refcnt) != 1);
+	uncore_box_exit(box);
 	kfree(box);
 
 	if (last_box)
@@ -1091,8 +1093,10 @@ static void uncore_cpu_dying(int cpu)
 			pmu = &type->pmus[j];
 			box = *per_cpu_ptr(pmu->box, cpu);
 			*per_cpu_ptr(pmu->box, cpu) = NULL;
-			if (box && atomic_dec_and_test(&box->refcnt))
+			if (box && atomic_dec_and_test(&box->refcnt)) {
 				list_add(&box->list, &boxes_to_free);
+				uncore_box_exit(box);
+			}
 		}
 	}
 }
--- a/arch/x86/kernel/cpu/perf_event_intel_uncore.h
+++ b/arch/x86/kernel/cpu/perf_event_intel_uncore.h
@@ -61,6 +61,7 @@ struct intel_uncore_type {
 
 struct intel_uncore_ops {
 	void (*init_box)(struct intel_uncore_box *);
+	void (*exit_box)(struct intel_uncore_box *);
 	void (*disable_box)(struct intel_uncore_box *);
 	void (*enable_box)(struct intel_uncore_box *);
 	void (*disable_event)(struct intel_uncore_box *, struct perf_event *);
@@ -306,6 +307,14 @@ static inline void uncore_box_init(struc
 	}
 }
 
+static inline void uncore_box_exit(struct intel_uncore_box *box)
+{
+	if (test_and_clear_bit(UNCORE_BOX_FLAG_INITIATED, &box->flags)) {
+		if (box->pmu->type->ops->exit_box)
+			box->pmu->type->ops->exit_box(box);
+	}
+}
+
 static inline bool uncore_box_is_fake(struct intel_uncore_box *box)
 {
 	return (box->phys_id < 0);
--- a/arch/x86/kernel/cpu/perf_event_intel_uncore_nhmex.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_uncore_nhmex.c
@@ -201,6 +201,11 @@ static void nhmex_uncore_msr_init_box(st
 	wrmsrl(NHMEX_U_MSR_PMON_GLOBAL_CTL, NHMEX_U_PMON_GLOBAL_EN_ALL);
 }
 
+static void nhmex_uncore_msr_exit_box(struct intel_uncore_box *box)
+{
+	wrmsrl(NHMEX_U_MSR_PMON_GLOBAL_CTL, 0);
+}
+
 static void nhmex_uncore_msr_disable_box(struct intel_uncore_box *box)
 {
 	unsigned msr = uncore_msr_box_ctl(box);
@@ -250,6 +255,7 @@ static void nhmex_uncore_msr_enable_even
 
 #define NHMEX_UNCORE_OPS_COMMON_INIT()				\
 	.init_box	= nhmex_uncore_msr_init_box,		\
+	.exit_box	= nhmex_uncore_msr_exit_box,		\
 	.disable_box	= nhmex_uncore_msr_disable_box,		\
 	.enable_box	= nhmex_uncore_msr_enable_box,		\
 	.disable_event	= nhmex_uncore_msr_disable_event,	\
--- a/arch/x86/kernel/cpu/perf_event_intel_uncore_snb.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_uncore_snb.c
@@ -95,6 +95,12 @@ static void snb_uncore_msr_init_box(stru
 	}
 }
 
+static void snb_uncore_msr_exit_box(struct intel_uncore_box *box)
+{
+	if (box->pmu->pmu_idx == 0)
+		wrmsrl(SNB_UNC_PERF_GLOBAL_CTL, 0);
+}
+
 static struct uncore_event_desc snb_uncore_events[] = {
 	INTEL_UNCORE_EVENT_DESC(clockticks, "event=0xff,umask=0x00"),
 	{ /* end: all zeroes */ },
@@ -116,6 +122,7 @@ static struct attribute_group snb_uncore
 
 static struct intel_uncore_ops snb_uncore_msr_ops = {
 	.init_box	= snb_uncore_msr_init_box,
+	.exit_box	= snb_uncore_msr_exit_box,
 	.disable_event	= snb_uncore_msr_disable_event,
 	.enable_event	= snb_uncore_msr_enable_event,
 	.read_counter	= uncore_msr_read_counter,
@@ -231,6 +238,11 @@ static void snb_uncore_imc_init_box(stru
 	box->hrtimer_duration = UNCORE_SNB_IMC_HRTIMER_INTERVAL;
 }
 
+static void snb_uncore_imc_exit_box(struct intel_uncore_box *box)
+{
+	iounmap(box->io_addr);
+}
+
 static void snb_uncore_imc_enable_box(struct intel_uncore_box *box)
 {}
 
@@ -458,6 +470,7 @@ static struct pmu snb_uncore_imc_pmu = {
 
 static struct intel_uncore_ops snb_uncore_imc_ops = {
 	.init_box	= snb_uncore_imc_init_box,
+	.exit_box	= snb_uncore_imc_exit_box,
 	.enable_box	= snb_uncore_imc_enable_box,
 	.disable_box	= snb_uncore_imc_disable_box,
 	.disable_event	= snb_uncore_imc_disable_event,

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [patch V3 07/28] x86/perf/intel_uncore: Make code readable
  2016-02-22 22:19 [patch V3 00/28] x86/perf/intel/uncore|rapl: Fix error handling and sanitize pmu management Thomas Gleixner
                   ` (5 preceding siblings ...)
  2016-02-22 22:19 ` [patch V3 06/28] x86/perf/intel/uncore: Drop pointless hotplug priority Thomas Gleixner
@ 2016-02-22 22:19 ` Thomas Gleixner
  2016-02-29 11:05   ` [tip:perf/core] perf/x86/intel/uncore: Make code more readable tip-bot for Thomas Gleixner
  2016-02-22 22:19 ` [patch V3 08/28] x86/perf/uncore: Make uncore_pcibus_to_physid static Thomas Gleixner
                   ` (20 subsequent siblings)
  27 siblings, 1 reply; 58+ messages in thread
From: Thomas Gleixner @ 2016-02-22 22:19 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, x86, Borislav Petkov, Stephane Eranian,
	Harish Chegondi, Kan Liang, Andi Kleen, Jacob Pan

[-- Attachment #1: x86-perf-intel_uncore--Make-code-readable.patch --]
[-- Type: text/plain, Size: 5486 bytes --]

Cleanup the code a bit before reworking it completely.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/cpu/perf_event_intel_uncore.c |   95 +++++++++++++-------------
 1 file changed, 50 insertions(+), 45 deletions(-)

Index: b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
===================================================================
--- a/arch/x86/kernel/cpu/perf_event_intel_uncore.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
@@ -217,7 +217,8 @@ u64 uncore_shared_reg_config(struct inte
 	return config;
 }
 
-static void uncore_assign_hw_event(struct intel_uncore_box *box, struct perf_event *event, int idx)
+static void uncore_assign_hw_event(struct intel_uncore_box *box,
+				   struct perf_event *event, int idx)
 {
 	struct hw_perf_event *hwc = &event->hw;
 
@@ -312,18 +313,19 @@ static void uncore_pmu_init_hrtimer(stru
 	box->hrtimer.function = uncore_pmu_hrtimer;
 }
 
-static struct intel_uncore_box *uncore_alloc_box(struct intel_uncore_type *type, int node)
+static struct intel_uncore_box *uncore_alloc_box(struct intel_uncore_type *type,
+						 int node)
 {
+	int i, size, numshared = type->num_shared_regs ;
 	struct intel_uncore_box *box;
-	int i, size;
 
-	size = sizeof(*box) + type->num_shared_regs * sizeof(struct intel_uncore_extra_reg);
+	size = sizeof(*box) + numshared * sizeof(struct intel_uncore_extra_reg);
 
 	box = kzalloc_node(size, GFP_KERNEL, node);
 	if (!box)
 		return NULL;
 
-	for (i = 0; i < type->num_shared_regs; i++)
+	for (i = 0; i < numshared; i++)
 		raw_spin_lock_init(&box->shared_regs[i].lock);
 
 	uncore_pmu_init_hrtimer(box);
@@ -351,7 +353,8 @@ static bool is_uncore_event(struct perf_
 }
 
 static int
-uncore_collect_events(struct intel_uncore_box *box, struct perf_event *leader, bool dogrp)
+uncore_collect_events(struct intel_uncore_box *box, struct perf_event *leader,
+		      bool dogrp)
 {
 	struct perf_event *event;
 	int n, max_count;
@@ -412,7 +415,8 @@ uncore_get_event_constraint(struct intel
 	return &type->unconstrainted;
 }
 
-static void uncore_put_event_constraint(struct intel_uncore_box *box, struct perf_event *event)
+static void uncore_put_event_constraint(struct intel_uncore_box *box,
+					struct perf_event *event)
 {
 	if (box->pmu->type->ops->put_constraint)
 		box->pmu->type->ops->put_constraint(box, event);
@@ -592,7 +596,7 @@ static void uncore_pmu_event_del(struct
 		if (event == box->event_list[i]) {
 			uncore_put_event_constraint(box, event);
 
-			while (++i < box->n_events)
+			for (++i; i < box->n_events; i++)
 				box->event_list[i - 1] = box->event_list[i];
 
 			--box->n_events;
@@ -801,10 +805,8 @@ static void __init uncore_type_exit(stru
 
 static void __init uncore_types_exit(struct intel_uncore_type **types)
 {
-	int i;
-
-	for (i = 0; types[i]; i++)
-		uncore_type_exit(types[i]);
+	for (; *types; types++)
+		uncore_type_exit(*types);
 }
 
 static int __init uncore_type_init(struct intel_uncore_type *type)
@@ -908,9 +910,11 @@ static int uncore_pci_probe(struct pci_d
 	 * some device types. Hence PCI device idx would be 0 for all devices.
 	 * So increment pmu pointer to point to an unused array element.
 	 */
-	if (boot_cpu_data.x86_model == 87)
+	if (boot_cpu_data.x86_model == 87) {
 		while (pmu->func_id >= 0)
 			pmu++;
+	}
+
 	if (pmu->func_id < 0)
 		pmu->func_id = pdev->devfn;
 	else
@@ -1170,44 +1174,45 @@ static int uncore_cpu_prepare(int cpu, i
 	return 0;
 }
 
-static void
-uncore_change_context(struct intel_uncore_type **uncores, int old_cpu, int new_cpu)
+static void uncore_change_type_ctx(struct intel_uncore_type *type, int old_cpu,
+				   int new_cpu)
 {
-	struct intel_uncore_type *type;
-	struct intel_uncore_pmu *pmu;
+	struct intel_uncore_pmu *pmu = type->pmus;
 	struct intel_uncore_box *box;
-	int i, j;
-
-	for (i = 0; uncores[i]; i++) {
-		type = uncores[i];
-		for (j = 0; j < type->num_boxes; j++) {
-			pmu = &type->pmus[j];
-			if (old_cpu < 0)
-				box = uncore_pmu_to_box(pmu, new_cpu);
-			else
-				box = uncore_pmu_to_box(pmu, old_cpu);
-			if (!box)
-				continue;
+	int i;
 
-			if (old_cpu < 0) {
-				WARN_ON_ONCE(box->cpu != -1);
-				box->cpu = new_cpu;
-				continue;
-			}
+	for (i = 0; i < type->num_boxes; i++, pmu++) {
+		if (old_cpu < 0)
+			box = uncore_pmu_to_box(pmu, new_cpu);
+		else
+			box = uncore_pmu_to_box(pmu, old_cpu);
+		if (!box)
+			continue;
 
-			WARN_ON_ONCE(box->cpu != old_cpu);
-			if (new_cpu >= 0) {
-				uncore_pmu_cancel_hrtimer(box);
-				perf_pmu_migrate_context(&pmu->pmu,
-						old_cpu, new_cpu);
-				box->cpu = new_cpu;
-			} else {
-				box->cpu = -1;
-			}
+		if (old_cpu < 0) {
+			WARN_ON_ONCE(box->cpu != -1);
+			box->cpu = new_cpu;
+			continue;
 		}
+
+		WARN_ON_ONCE(box->cpu != old_cpu);
+		box->cpu = -1;
+		if (new_cpu < 0)
+			continue;
+
+		uncore_pmu_cancel_hrtimer(box);
+		perf_pmu_migrate_context(&pmu->pmu, old_cpu, new_cpu);
+		box->cpu = new_cpu;
 	}
 }
 
+static void uncore_change_context(struct intel_uncore_type **uncores,
+				  int old_cpu, int new_cpu)
+{
+	for (; *uncores; uncores++)
+		uncore_change_type_ctx(*uncores, old_cpu, new_cpu);
+}
+
 static void uncore_event_exit_cpu(int cpu)
 {
 	int i, phys_id, target;
@@ -1313,8 +1318,8 @@ static int __init uncore_msr_pmus_regist
 	struct intel_uncore_type **types = uncore_msr_uncores;
 	int ret;
 
-	while (*types) {
-		ret = type_pmu_register(*types++);
+	for (; *types; types++) {
+		ret = type_pmu_register(*types);
 		if (ret)
 			return ret;
 	}

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [patch V3 08/28] x86/perf/uncore: Make uncore_pcibus_to_physid static
  2016-02-22 22:19 [patch V3 00/28] x86/perf/intel/uncore|rapl: Fix error handling and sanitize pmu management Thomas Gleixner
                   ` (6 preceding siblings ...)
  2016-02-22 22:19 ` [patch V3 07/28] x86/perf/intel_uncore: Make code readable Thomas Gleixner
@ 2016-02-22 22:19 ` Thomas Gleixner
  2016-02-29 11:06   ` [tip:perf/core] perf/x86/uncore: Make uncore_pcibus_to_physid() static tip-bot for Thomas Gleixner
  2016-02-22 22:19 ` [patch V3 09/28] perf: Allow storage of pmu private data in event Thomas Gleixner
                   ` (19 subsequent siblings)
  27 siblings, 1 reply; 58+ messages in thread
From: Thomas Gleixner @ 2016-02-22 22:19 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, x86, Borislav Petkov, Stephane Eranian,
	Harish Chegondi, Kan Liang, Andi Kleen, Jacob Pan

[-- Attachment #1: x86-perf-uncore--Make-uncore_pcibus_to_physid-static.patch --]
[-- Type: text/plain, Size: 1233 bytes --]

No users outside of this file.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/cpu/perf_event_intel_uncore.c |    2 +-
 arch/x86/kernel/cpu/perf_event_intel_uncore.h |    1 -
 2 files changed, 1 insertion(+), 2 deletions(-)

Index: b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
===================================================================
--- a/arch/x86/kernel/cpu/perf_event_intel_uncore.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
@@ -21,7 +21,7 @@ static struct event_constraint uncore_co
 struct event_constraint uncore_constraint_empty =
 	EVENT_CONSTRAINT(0, 0, 0);
 
-int uncore_pcibus_to_physid(struct pci_bus *bus)
+static int uncore_pcibus_to_physid(struct pci_bus *bus)
 {
 	struct pci2phy_map *map;
 	int phys_id = -1;
Index: b/arch/x86/kernel/cpu/perf_event_intel_uncore.h
===================================================================
--- a/arch/x86/kernel/cpu/perf_event_intel_uncore.h
+++ b/arch/x86/kernel/cpu/perf_event_intel_uncore.h
@@ -125,7 +125,6 @@ struct pci2phy_map {
 	int pbus_to_physid[256];
 };
 
-int uncore_pcibus_to_physid(struct pci_bus *bus);
 struct pci2phy_map *__find_pci2phy_map(int segment);
 
 ssize_t uncore_event_show(struct kobject *kobj,

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [patch V3 09/28] perf: Allow storage of pmu private data in event
  2016-02-22 22:19 [patch V3 00/28] x86/perf/intel/uncore|rapl: Fix error handling and sanitize pmu management Thomas Gleixner
                   ` (7 preceding siblings ...)
  2016-02-22 22:19 ` [patch V3 08/28] x86/perf/uncore: Make uncore_pcibus_to_physid static Thomas Gleixner
@ 2016-02-22 22:19 ` Thomas Gleixner
  2016-02-29 11:06   ` [tip:perf/core] perf: Allow storage of PMU " tip-bot for Thomas Gleixner
  2016-02-22 22:19 ` [patch V3 10/28] x86/perf/intel_uncore: Store box in event->pmu_private Thomas Gleixner
                   ` (18 subsequent siblings)
  27 siblings, 1 reply; 58+ messages in thread
From: Thomas Gleixner @ 2016-02-22 22:19 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, x86, Borislav Petkov, Stephane Eranian,
	Harish Chegondi, Kan Liang, Andi Kleen, Jacob Pan

[-- Attachment #1: perf--Allow-storage-of-pmu-private-data-in-event.patch --]
[-- Type: text/plain, Size: 790 bytes --]

For pmus which are not per cpu, but e.g. per package/socket, we want to be
able to store a reference to the underlying per package/socket facility in the
event at init time so we can avoid magic storage constructs in the pmu driver.

This allows us to get rid of the per cpu dance in the intel uncore and rapl
drivers and avoids a lookup of the per package data in the perf hotpath.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 include/linux/perf_event.h |    1 +
 1 file changed, 1 insertion(+)

--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -467,6 +467,7 @@ struct perf_event {
 	int				group_flags;
 	struct perf_event		*group_leader;
 	struct pmu			*pmu;
+	void				*pmu_private;
 
 	enum perf_event_active_state	state;
 	unsigned int			attach_state;

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [patch V3 10/28] x86/perf/intel_uncore: Store box in event->pmu_private
  2016-02-22 22:19 [patch V3 00/28] x86/perf/intel/uncore|rapl: Fix error handling and sanitize pmu management Thomas Gleixner
                   ` (8 preceding siblings ...)
  2016-02-22 22:19 ` [patch V3 09/28] perf: Allow storage of pmu private data in event Thomas Gleixner
@ 2016-02-22 22:19 ` Thomas Gleixner
  2016-02-29 11:06   ` [tip:perf/core] perf/x86/intel/uncore: " tip-bot for Thomas Gleixner
  2016-02-22 22:19 ` [patch V3 11/28] x86/topology: Create logical package id Thomas Gleixner
                   ` (17 subsequent siblings)
  27 siblings, 1 reply; 58+ messages in thread
From: Thomas Gleixner @ 2016-02-22 22:19 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, x86, Borislav Petkov, Stephane Eranian,
	Harish Chegondi, Kan Liang, Andi Kleen, Jacob Pan

[-- Attachment #1: x86-perf-intel_uncore--Store-box-in-event->pmu_private.patch --]
[-- Type: text/plain, Size: 2973 bytes --]

Store the pmu in event->pmu_private, so we can get rid of the per cpu data
storage. We keep it after converting to per package data, because a cpu to
package lookup will be 3 loads versus one and these usage sites are the perf
fast pathes.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/cpu/perf_event_intel_uncore.c     |   15 +--------------
 arch/x86/kernel/cpu/perf_event_intel_uncore.h     |   12 ++++++++++--
 arch/x86/kernel/cpu/perf_event_intel_uncore_snb.c |    1 +
 3 files changed, 12 insertions(+), 16 deletions(-)

--- a/arch/x86/kernel/cpu/perf_event_intel_uncore.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
@@ -92,11 +92,6 @@ ssize_t uncore_event_show(struct kobject
 	return sprintf(buf, "%s", event->config);
 }
 
-struct intel_uncore_pmu *uncore_event_to_pmu(struct perf_event *event)
-{
-	return container_of(event->pmu, struct intel_uncore_pmu, pmu);
-}
-
 struct intel_uncore_box *uncore_pmu_to_box(struct intel_uncore_pmu *pmu, int cpu)
 {
 	struct intel_uncore_box *box;
@@ -122,15 +117,6 @@ struct intel_uncore_box *uncore_pmu_to_b
 	return *per_cpu_ptr(pmu->box, cpu);
 }
 
-struct intel_uncore_box *uncore_event_to_box(struct perf_event *event)
-{
-	/*
-	 * perf core schedules event on the basis of cpu, uncore events are
-	 * collected by one of the cpus inside a physical package.
-	 */
-	return uncore_pmu_to_box(uncore_event_to_pmu(event), smp_processor_id());
-}
-
 u64 uncore_msr_read_counter(struct intel_uncore_box *box, struct perf_event *event)
 {
 	u64 count;
@@ -690,6 +676,7 @@ static int uncore_pmu_event_init(struct
 	if (!box || box->cpu < 0)
 		return -EINVAL;
 	event->cpu = box->cpu;
+	event->pmu_private = box;
 
 	event->hw.idx = -1;
 	event->hw.last_tag = ~0ULL;
--- a/arch/x86/kernel/cpu/perf_event_intel_uncore.h
+++ b/arch/x86/kernel/cpu/perf_event_intel_uncore.h
@@ -319,9 +319,17 @@ static inline bool uncore_box_is_fake(st
 	return (box->phys_id < 0);
 }
 
-struct intel_uncore_pmu *uncore_event_to_pmu(struct perf_event *event);
+static inline struct intel_uncore_pmu *uncore_event_to_pmu(struct perf_event *event)
+{
+	return container_of(event->pmu, struct intel_uncore_pmu, pmu);
+}
+
+static inline struct intel_uncore_box *uncore_event_to_box(struct perf_event *event)
+{
+	return event->pmu_private;
+}
+
 struct intel_uncore_box *uncore_pmu_to_box(struct intel_uncore_pmu *pmu, int cpu);
-struct intel_uncore_box *uncore_event_to_box(struct perf_event *event);
 u64 uncore_msr_read_counter(struct intel_uncore_box *box, struct perf_event *event);
 void uncore_pmu_start_hrtimer(struct intel_uncore_box *box);
 void uncore_pmu_cancel_hrtimer(struct intel_uncore_box *box);
--- a/arch/x86/kernel/cpu/perf_event_intel_uncore_snb.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_uncore_snb.c
@@ -313,6 +313,7 @@ static int snb_uncore_imc_event_init(str
 		return -EINVAL;
 
 	event->cpu = box->cpu;
+	event->pmu_private = box;
 
 	event->hw.idx = -1;
 	event->hw.last_tag = ~0ULL;

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [patch V3 11/28] x86/topology: Create logical package id
  2016-02-22 22:19 [patch V3 00/28] x86/perf/intel/uncore|rapl: Fix error handling and sanitize pmu management Thomas Gleixner
                   ` (9 preceding siblings ...)
  2016-02-22 22:19 ` [patch V3 10/28] x86/perf/intel_uncore: Store box in event->pmu_private Thomas Gleixner
@ 2016-02-22 22:19 ` Thomas Gleixner
  2016-02-29 11:07   ` [tip:perf/core] " tip-bot for Thomas Gleixner
  2016-02-22 22:19 ` [patch V3 12/28] x86/perf/uncore: Track packages not per cpu data Thomas Gleixner
                   ` (16 subsequent siblings)
  27 siblings, 1 reply; 58+ messages in thread
From: Thomas Gleixner @ 2016-02-22 22:19 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, x86, Borislav Petkov, Stephane Eranian,
	Harish Chegondi, Kan Liang, Andi Kleen, Jacob Pan

[-- Attachment #1: x86-topology--Create-logical-package-id.patch --]
[-- Type: text/plain, Size: 7660 bytes --]

For per package oriented services we must be able to rely on the number of cpu
packages to be within bounds. Create a tracking facility, which

- calculates the number of possible packages depending on nr_cpu_ids after
  boot

- makes sure that the package id is within the number of possible packages. If
  the apic id is outside we map it to a logical package id if there is enough
  space available.

Provide interfaces for drivers to query the mapping and do translations from
physcial to logical ids.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/processor.h |    2 
 arch/x86/include/asm/topology.h  |   10 +++
 arch/x86/kernel/apic/apic.c      |   14 +++++
 arch/x86/kernel/cpu/common.c     |    2 
 arch/x86/kernel/cpu/intel.c      |   13 +++++
 arch/x86/kernel/smpboot.c        |  100 +++++++++++++++++++++++++++++++++++++++
 6 files changed, 141 insertions(+)

--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -129,6 +129,8 @@ struct cpuinfo_x86 {
 	u16			booted_cores;
 	/* Physical processor id: */
 	u16			phys_proc_id;
+	/* Logical processor id: */
+	u16			logical_proc_id;
 	/* Core id: */
 	u16			cpu_core_id;
 	/* Compute unit id */
--- a/arch/x86/include/asm/topology.h
+++ b/arch/x86/include/asm/topology.h
@@ -119,12 +119,22 @@ static inline void setup_node_to_cpumask
 
 extern const struct cpumask *cpu_coregroup_mask(int cpu);
 
+#define topology_logical_package_id(cpu)	(cpu_data(cpu).logical_proc_id)
 #define topology_physical_package_id(cpu)	(cpu_data(cpu).phys_proc_id)
 #define topology_core_id(cpu)			(cpu_data(cpu).cpu_core_id)
 
 #ifdef ENABLE_TOPO_DEFINES
 #define topology_core_cpumask(cpu)		(per_cpu(cpu_core_map, cpu))
 #define topology_sibling_cpumask(cpu)		(per_cpu(cpu_sibling_map, cpu))
+
+extern unsigned int __max_logical_packages;
+#define topology_max_packages()			(__max_logical_packages)
+int topology_update_package_map(unsigned int apicid, unsigned int cpu);
+extern int topology_phys_to_logical_pkg(unsigned int pkg);
+#else
+#define topology_max_packages()			(1)
+static inline int
+topology_update_package_map(unsigned int apicid, unsigned int cpu) { return 0; }
 #endif
 
 static inline void arch_fix_phys_package_id(int num, u32 slot)
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -2078,6 +2078,20 @@ int generic_processor_info(int apicid, i
 		cpu = cpumask_next_zero(-1, cpu_present_mask);
 
 	/*
+	 * This can happen on physical hotplug. The sanity check at boot time
+	 * is done from native_smp_prepare_cpus() after num_possible_cpus() is
+	 * established.
+	 */
+	if (topology_update_package_map(apicid, cpu) < 0) {
+		int thiscpu = max + disabled_cpus;
+
+		pr_warning("ACPI: Package limit reached. Processor %d/0x%x ignored.\n",
+			   thiscpu, apicid);
+		disabled_cpus++;
+		return -ENOSPC;
+	}
+
+	/*
 	 * Validate version
 	 */
 	if (version == 0x0) {
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -977,6 +977,8 @@ static void identify_cpu(struct cpuinfo_
 #ifdef CONFIG_NUMA
 	numa_add_cpu(smp_processor_id());
 #endif
+	/* The boot/hotplug time assigment got cleared, restore it */
+	c->logical_proc_id = topology_phys_to_logical_pkg(c->phys_proc_id);
 }
 
 /*
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -160,6 +160,19 @@ static void early_init_intel(struct cpui
 		pr_info("Disabling PGE capability bit\n");
 		setup_clear_cpu_cap(X86_FEATURE_PGE);
 	}
+
+	if (c->cpuid_level >= 0x00000001) {
+		u32 eax, ebx, ecx, edx;
+
+		cpuid(0x00000001, &eax, &ebx, &ecx, &edx);
+		/*
+		 * If HTT (EDX[28]) is set EBX[16:23] contain the number of
+		 * apicids which are reserved per package. Store the resulting
+		 * shift value for the package management code.
+		 */
+		if (edx & (1U << 28))
+			c->x86_coreid_bits = get_count_order((ebx >> 16) & 0xff);
+	}
 }
 
 #ifdef CONFIG_X86_32
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -97,6 +97,14 @@ DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t
 DEFINE_PER_CPU_READ_MOSTLY(struct cpuinfo_x86, cpu_info);
 EXPORT_PER_CPU_SYMBOL(cpu_info);
 
+/* Logical package management. We might want to allocate that dynamically */
+static int *physical_to_logical_pkg __read_mostly;
+static unsigned long *physical_package_map __read_mostly;;
+static unsigned long *logical_package_map  __read_mostly;
+static unsigned int max_physical_pkg_id __read_mostly;
+unsigned int __max_logical_packages __read_mostly;
+EXPORT_SYMBOL(__max_logical_packages);
+
 static inline void smpboot_setup_warm_reset_vector(unsigned long start_eip)
 {
 	unsigned long flags;
@@ -251,6 +259,97 @@ static void notrace start_secondary(void
 	cpu_startup_entry(CPUHP_ONLINE);
 }
 
+int topology_update_package_map(unsigned int apicid, unsigned int cpu)
+{
+	unsigned int new, pkg = apicid >> boot_cpu_data.x86_coreid_bits;
+
+	/* Called from early boot ? */
+	if (!physical_package_map)
+		return 0;
+
+	if (pkg >= max_physical_pkg_id)
+		return -EINVAL;
+
+	/* Set the logical package id */
+	if (test_and_set_bit(pkg, physical_package_map))
+		goto found;
+
+	if (pkg < __max_logical_packages) {
+		set_bit(pkg, logical_package_map);
+		physical_to_logical_pkg[pkg] = pkg;
+		goto found;
+	}
+	new = find_first_zero_bit(logical_package_map, __max_logical_packages);
+	if (new >= __max_logical_packages) {
+		physical_to_logical_pkg[pkg] = -1;
+		pr_warn("APIC(%x) Package %u exceeds logical package map\n",
+			apicid, pkg);
+		return -ENOSPC;
+	}
+	set_bit(new, logical_package_map);
+	pr_info("APIC(%x) Converting physical %u to logical package %u\n",
+		apicid, pkg, new);
+	physical_to_logical_pkg[pkg] = new;
+
+found:
+	cpu_data(cpu).logical_proc_id = physical_to_logical_pkg[pkg];
+	return 0;
+}
+
+/**
+ * topology_phys_to_logical_pkg - Map a physical package id to a logical
+ *
+ * Returns logical package id or -1 if not found
+ */
+int topology_phys_to_logical_pkg(unsigned int phys_pkg)
+{
+	if (phys_pkg >= max_physical_pkg_id)
+		return -1;
+	return physical_to_logical_pkg[phys_pkg];
+}
+EXPORT_SYMBOL(topology_phys_to_logical_pkg);
+
+static void __init smp_init_package_map(void)
+{
+	unsigned int ncpus, cpu;
+	size_t size;
+
+	/*
+	 * Today neither Intel nor AMD support heterogenous systems. That
+	 * might change in the future....
+	 */
+	ncpus = boot_cpu_data.x86_max_cores * smp_num_siblings;
+	__max_logical_packages = DIV_ROUND_UP(nr_cpu_ids, ncpus);
+
+	/*
+	 * Possibly larger than what we need as the number of apic ids per
+	 * package can be smaller than the actual used apic ids.
+	 */
+	max_physical_pkg_id = DIV_ROUND_UP(MAX_LOCAL_APIC, ncpus);
+	size = max_physical_pkg_id * sizeof(unsigned int);
+	physical_to_logical_pkg = kmalloc(size, GFP_KERNEL);
+	memset(physical_to_logical_pkg, 0xff, size);
+	size = BITS_TO_LONGS(max_physical_pkg_id);
+	physical_package_map = kzalloc(size, GFP_KERNEL);
+	size = BITS_TO_LONGS(__max_logical_packages);
+	logical_package_map = kzalloc(size, GFP_KERNEL);
+
+	pr_info("Max logical packages: %u\n", __max_logical_packages);
+
+	for_each_present_cpu(cpu) {
+		unsigned int apicid = apic->cpu_present_to_apicid(cpu);
+
+		if (apicid == BAD_APICID || !apic->apic_id_valid(apicid))
+			continue;
+		if (!topology_update_package_map(apicid, cpu))
+			continue;
+		pr_warn("CPU %u APICId %x disabled\n", cpu, apicid);
+		per_cpu(x86_bios_cpu_apicid, cpu) = BAD_APICID;
+		set_cpu_possible(cpu, false);
+		set_cpu_present(cpu, false);
+	}
+}
+
 void __init smp_store_boot_cpu_info(void)
 {
 	int id = 0; /* CPU 0 */
@@ -258,6 +357,7 @@ void __init smp_store_boot_cpu_info(void
 
 	*c = boot_cpu_data;
 	c->cpu_index = id;
+	smp_init_package_map();
 }
 
 /*

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [patch V3 12/28] x86/perf/uncore: Track packages not per cpu data
  2016-02-22 22:19 [patch V3 00/28] x86/perf/intel/uncore|rapl: Fix error handling and sanitize pmu management Thomas Gleixner
                   ` (10 preceding siblings ...)
  2016-02-22 22:19 ` [patch V3 11/28] x86/topology: Create logical package id Thomas Gleixner
@ 2016-02-22 22:19 ` Thomas Gleixner
  2016-02-29 11:07   ` [tip:perf/core] perf/x86/uncore: Track packages, not per CPU data tip-bot for Thomas Gleixner
  2016-02-22 22:19 ` [patch V3 13/28] x86/perf/intel_uncore: Clear all hardware state on exit Thomas Gleixner
                   ` (15 subsequent siblings)
  27 siblings, 1 reply; 58+ messages in thread
From: Thomas Gleixner @ 2016-02-22 22:19 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, x86, Borislav Petkov, Stephane Eranian,
	Harish Chegondi, Kan Liang, Andi Kleen, Jacob Pan

[-- Attachment #1: x86-perf-uncore--Track-packages-not-per-cpu-data.patch --]
[-- Type: text/plain, Size: 22117 bytes --]

uncore is a per package facility, but the code tries to mimick a per cpu
facility with completely convoluted constructs.

Simplify the whole machinery by tracking per package information. While at it,
avoid the kfree/alloc dance when a cpu goes offline and online again. There is
no point in freeing the box after it was allocated. We just keep proper
refcounting and the first cpu which comes online in a package does the
initialization/activation of the box.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/cpu/perf_event_intel_uncore.c       |  420 ++++++++------------
 arch/x86/kernel/cpu/perf_event_intel_uncore.h       |   18 
 arch/x86/kernel/cpu/perf_event_intel_uncore_snbep.c |   10 
 3 files changed, 197 insertions(+), 251 deletions(-)

--- a/arch/x86/kernel/cpu/perf_event_intel_uncore.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
@@ -9,9 +9,9 @@ struct pci_driver *uncore_pci_driver;
 /* pci bus to socket mapping */
 DEFINE_RAW_SPINLOCK(pci2phy_map_lock);
 struct list_head pci2phy_map_head = LIST_HEAD_INIT(pci2phy_map_head);
-struct pci_dev *uncore_extra_pci_dev[UNCORE_SOCKET_MAX][UNCORE_EXTRA_PCI_DEV_MAX];
+struct pci_extra_dev *uncore_extra_pci_dev;
+static int max_packages;
 
-static DEFINE_RAW_SPINLOCK(uncore_box_lock);
 /* mask of cpus that collect uncore events */
 static cpumask_t uncore_cpu_mask;
 
@@ -94,27 +94,7 @@ ssize_t uncore_event_show(struct kobject
 
 struct intel_uncore_box *uncore_pmu_to_box(struct intel_uncore_pmu *pmu, int cpu)
 {
-	struct intel_uncore_box *box;
-
-	box = *per_cpu_ptr(pmu->box, cpu);
-	if (box)
-		return box;
-
-	raw_spin_lock(&uncore_box_lock);
-	/* Recheck in lock to handle races. */
-	if (*per_cpu_ptr(pmu->box, cpu))
-		goto out;
-	list_for_each_entry(box, &pmu->box_list, list) {
-		if (box->phys_id == topology_physical_package_id(cpu)) {
-			atomic_inc(&box->refcnt);
-			*per_cpu_ptr(pmu->box, cpu) = box;
-			break;
-		}
-	}
-out:
-	raw_spin_unlock(&uncore_box_lock);
-
-	return *per_cpu_ptr(pmu->box, cpu);
+	return pmu->boxes[topology_logical_package_id(cpu)];
 }
 
 u64 uncore_msr_read_counter(struct intel_uncore_box *box, struct perf_event *event)
@@ -315,9 +295,9 @@ static struct intel_uncore_box *uncore_a
 		raw_spin_lock_init(&box->shared_regs[i].lock);
 
 	uncore_pmu_init_hrtimer(box);
-	atomic_set(&box->refcnt, 1);
 	box->cpu = -1;
-	box->phys_id = -1;
+	box->pci_phys_id = -1;
+	box->pkgid = -1;
 
 	/* set default hrtimer timeout */
 	box->hrtimer_duration = UNCORE_PMU_HRTIMER_INTERVAL;
@@ -774,14 +754,24 @@ static void uncore_pmu_unregister(struct
 	pmu->registered = false;
 }
 
+static void uncore_free_boxes(struct intel_uncore_pmu *pmu)
+{
+	int pkg;
+
+	for (pkg = 0; pkg < max_packages; pkg++)
+		kfree(pmu->boxes[pkg]);
+	kfree(pmu->boxes);
+}
+
 static void __init uncore_type_exit(struct intel_uncore_type *type)
 {
+	struct intel_uncore_pmu *pmu = type->pmus;
 	int i;
 
-	if (type->pmus) {
-		for (i = 0; i < type->num_boxes; i++) {
-			uncore_pmu_unregister(&type->pmus[i]);
-			free_percpu(type->pmus[i].box);
+	if (pmu) {
+		for (i = 0; i < type->num_boxes; i++, pmu++) {
+			uncore_pmu_unregister(pmu);
+			uncore_free_boxes(pmu);
 		}
 		kfree(type->pmus);
 		type->pmus = NULL;
@@ -796,37 +786,36 @@ static void __init uncore_types_exit(str
 		uncore_type_exit(*types);
 }
 
-static int __init uncore_type_init(struct intel_uncore_type *type)
+static int __init uncore_type_init(struct intel_uncore_type *type, bool setid)
 {
 	struct intel_uncore_pmu *pmus;
 	struct attribute_group *attr_group;
 	struct attribute **attrs;
+	size_t size;
 	int i, j;
 
 	pmus = kzalloc(sizeof(*pmus) * type->num_boxes, GFP_KERNEL);
 	if (!pmus)
 		return -ENOMEM;
 
-	type->pmus = pmus;
-
-	type->unconstrainted = (struct event_constraint)
-		__EVENT_CONSTRAINT(0, (1ULL << type->num_counters) - 1,
-				0, type->num_counters, 0, 0);
+	size = max_packages * sizeof(struct intel_uncore_box *);
 
 	for (i = 0; i < type->num_boxes; i++) {
-		pmus[i].func_id = -1;
-		pmus[i].pmu_idx = i;
-		pmus[i].type = type;
-		INIT_LIST_HEAD(&pmus[i].box_list);
-		pmus[i].box = alloc_percpu(struct intel_uncore_box *);
-		if (!pmus[i].box)
+		pmus[i].func_id	= setid ? i : -1;
+		pmus[i].pmu_idx	= i;
+		pmus[i].type	= type;
+		pmus[i].boxes	= kzalloc(size, GFP_KERNEL);
+		if (!pmus[i].boxes)
 			return -ENOMEM;
 	}
 
+	type->pmus = pmus;
+	type->unconstrainted = (struct event_constraint)
+		__EVENT_CONSTRAINT(0, (1ULL << type->num_counters) - 1,
+				0, type->num_counters, 0, 0);
+
 	if (type->event_descs) {
-		i = 0;
-		while (type->event_descs[i].attr.attr.name)
-			i++;
+		for (i = 0; type->event_descs[i].attr.attr.name; i++);
 
 		attr_group = kzalloc(sizeof(struct attribute *) * (i + 1) +
 					sizeof(*attr_group), GFP_KERNEL);
@@ -847,12 +836,13 @@ static int __init uncore_type_init(struc
 	return 0;
 }
 
-static int __init uncore_types_init(struct intel_uncore_type **types)
+static int __init
+uncore_types_init(struct intel_uncore_type **types, bool setid)
 {
-	int i, ret;
+	int ret;
 
-	for (i = 0; types[i]; i++) {
-		ret = uncore_type_init(types[i]);
+	for (; *types; types++) {
+		ret = uncore_type_init(*types, setid);
 		if (ret)
 			return ret;
 	}
@@ -864,28 +854,28 @@ static int __init uncore_types_init(stru
  */
 static int uncore_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 {
+	struct intel_uncore_type *type;
 	struct intel_uncore_pmu *pmu;
 	struct intel_uncore_box *box;
-	struct intel_uncore_type *type;
-	bool first_box = false;
-	int phys_id, ret;
+	int phys_id, pkg, ret;
 
 	phys_id = uncore_pcibus_to_physid(pdev->bus);
-	if (phys_id < 0 || phys_id >= UNCORE_SOCKET_MAX)
+	if (phys_id < 0)
 		return -ENODEV;
 
+	pkg = topology_phys_to_logical_pkg(phys_id);
+	if (WARN_ON_ONCE(pkg < 0))
+		return -EINVAL;
+
 	if (UNCORE_PCI_DEV_TYPE(id->driver_data) == UNCORE_EXTRA_PCI_DEV) {
 		int idx = UNCORE_PCI_DEV_IDX(id->driver_data);
-		uncore_extra_pci_dev[phys_id][idx] = pdev;
+
+		uncore_extra_pci_dev[pkg].dev[idx] = pdev;
 		pci_set_drvdata(pdev, NULL);
 		return 0;
 	}
 
 	type = uncore_pci_uncores[UNCORE_PCI_DEV_TYPE(id->driver_data)];
-	box = uncore_alloc_box(type, NUMA_NO_NODE);
-	if (!box)
-		return -ENOMEM;
-
 	/*
 	 * for performance monitoring unit with multiple boxes,
 	 * each box has a different function id.
@@ -902,32 +892,35 @@ static int uncore_pci_probe(struct pci_d
 			pmu++;
 	}
 
+	if (WARN_ON_ONCE(pmu->boxes[pkg] != NULL))
+		return -EINVAL;
+
+	box = uncore_alloc_box(type, NUMA_NO_NODE);
+	if (!box)
+		return -ENOMEM;
+
 	if (pmu->func_id < 0)
 		pmu->func_id = pdev->devfn;
 	else
 		WARN_ON_ONCE(pmu->func_id != pdev->devfn);
 
-	box->phys_id = phys_id;
+	atomic_inc(&box->refcnt);
+	box->pci_phys_id = phys_id;
+	box->pkgid = pkg;
 	box->pci_dev = pdev;
 	box->pmu = pmu;
 	uncore_box_init(box);
 	pci_set_drvdata(pdev, box);
 
-	raw_spin_lock(&uncore_box_lock);
-	if (list_empty(&pmu->box_list))
-		first_box = true;
-	list_add_tail(&box->list, &pmu->box_list);
-	raw_spin_unlock(&uncore_box_lock);
-
-	if (!first_box)
+	pmu->boxes[pkg] = box;
+	if (atomic_inc_return(&pmu->activeboxes) > 1)
 		return 0;
 
+	/* First active box registers the pmu */
 	ret = uncore_pmu_register(pmu);
 	if (ret) {
 		pci_set_drvdata(pdev, NULL);
-		raw_spin_lock(&uncore_box_lock);
-		list_del(&box->list);
-		raw_spin_unlock(&uncore_box_lock);
+		pmu->boxes[pkg] = NULL;
 		uncore_box_exit(box);
 		kfree(box);
 	}
@@ -938,15 +931,16 @@ static void uncore_pci_remove(struct pci
 {
 	struct intel_uncore_box *box = pci_get_drvdata(pdev);
 	struct intel_uncore_pmu *pmu;
-	int i, cpu, phys_id;
-	bool last_box = false;
+	int i, phys_id, pkg;
 
 	phys_id = uncore_pcibus_to_physid(pdev->bus);
+	pkg = topology_phys_to_logical_pkg(phys_id);
+
 	box = pci_get_drvdata(pdev);
 	if (!box) {
 		for (i = 0; i < UNCORE_EXTRA_PCI_DEV_MAX; i++) {
-			if (uncore_extra_pci_dev[phys_id][i] == pdev) {
-				uncore_extra_pci_dev[phys_id][i] = NULL;
+			if (uncore_extra_pci_dev[pkg].dev[i] == pdev) {
+				uncore_extra_pci_dev[pkg].dev[i] = NULL;
 				break;
 			}
 		}
@@ -955,34 +949,20 @@ static void uncore_pci_remove(struct pci
 	}
 
 	pmu = box->pmu;
-	if (WARN_ON_ONCE(phys_id != box->phys_id))
+	if (WARN_ON_ONCE(phys_id != box->pci_phys_id))
 		return;
 
 	pci_set_drvdata(pdev, NULL);
-
-	raw_spin_lock(&uncore_box_lock);
-	list_del(&box->list);
-	if (list_empty(&pmu->box_list))
-		last_box = true;
-	raw_spin_unlock(&uncore_box_lock);
-
-	for_each_possible_cpu(cpu) {
-		if (*per_cpu_ptr(pmu->box, cpu) == box) {
-			*per_cpu_ptr(pmu->box, cpu) = NULL;
-			atomic_dec(&box->refcnt);
-		}
-	}
-
-	WARN_ON_ONCE(atomic_read(&box->refcnt) != 1);
+	pmu->boxes[pkg] = NULL;
+	if (atomic_dec_return(&pmu->activeboxes) == 0)
+		uncore_pmu_unregister(pmu);
 	uncore_box_exit(box);
 	kfree(box);
-
-	if (last_box)
-		uncore_pmu_unregister(pmu);
 }
 
 static int __init uncore_pci_init(void)
 {
+	size_t size;
 	int ret;
 
 	switch (boot_cpu_data.x86_model) {
@@ -1025,24 +1005,34 @@ static int __init uncore_pci_init(void)
 	if (ret)
 		return ret;
 
-	ret = uncore_types_init(uncore_pci_uncores);
-	if (ret)
+	size = max_packages * sizeof(struct pci_extra_dev);
+	uncore_extra_pci_dev = kzalloc(size, GFP_KERNEL);
+	if (!uncore_extra_pci_dev) {
+		ret = -ENOMEM;
 		goto err;
+	}
+
+	ret = uncore_types_init(uncore_pci_uncores, false);
+	if (ret)
+		goto errtype;
 
 	uncore_pci_driver->probe = uncore_pci_probe;
 	uncore_pci_driver->remove = uncore_pci_remove;
 
 	ret = pci_register_driver(uncore_pci_driver);
 	if (ret)
-		goto err;
+		goto errtype;
 
 	pcidrv_registered = true;
 	return 0;
 
-err:
+errtype:
 	uncore_types_exit(uncore_pci_uncores);
-	uncore_pci_uncores = empty_uncore;
+	kfree(uncore_extra_pci_dev);
+	uncore_extra_pci_dev = NULL;
 	uncore_free_pcibus_map();
+err:
+	uncore_pci_uncores = empty_uncore;
 	return ret;
 }
 
@@ -1052,110 +1042,81 @@ static void __init uncore_pci_exit(void)
 		pcidrv_registered = false;
 		pci_unregister_driver(uncore_pci_driver);
 		uncore_types_exit(uncore_pci_uncores);
+		kfree(uncore_extra_pci_dev);
 		uncore_free_pcibus_map();
 	}
 }
 
-/* CPU hot plug/unplug are serialized by cpu_add_remove_lock mutex */
-static LIST_HEAD(boxes_to_free);
-
-static void uncore_kfree_boxes(void)
-{
-	struct intel_uncore_box *box;
-
-	while (!list_empty(&boxes_to_free)) {
-		box = list_entry(boxes_to_free.next,
-				 struct intel_uncore_box, list);
-		list_del(&box->list);
-		kfree(box);
-	}
-}
-
 static void uncore_cpu_dying(int cpu)
 {
-	struct intel_uncore_type *type;
+	struct intel_uncore_type *type, **types = uncore_msr_uncores;
 	struct intel_uncore_pmu *pmu;
 	struct intel_uncore_box *box;
-	int i, j;
+	int i, pkg;
 
-	for (i = 0; uncore_msr_uncores[i]; i++) {
-		type = uncore_msr_uncores[i];
-		for (j = 0; j < type->num_boxes; j++) {
-			pmu = &type->pmus[j];
-			box = *per_cpu_ptr(pmu->box, cpu);
-			*per_cpu_ptr(pmu->box, cpu) = NULL;
-			if (box && atomic_dec_and_test(&box->refcnt)) {
-				list_add(&box->list, &boxes_to_free);
+	pkg = topology_logical_package_id(cpu);
+	for (; *types; types++) {
+		type = *types;
+		pmu = type->pmus;
+		for (i = 0; i < type->num_boxes; i++, pmu++) {
+			box = pmu->boxes[pkg];
+			if (box && atomic_dec_return(&box->refcnt) == 0)
 				uncore_box_exit(box);
-			}
 		}
 	}
 }
 
-static int uncore_cpu_starting(int cpu)
+static void uncore_cpu_starting(int cpu, bool init)
 {
-	struct intel_uncore_type *type;
+	struct intel_uncore_type *type, **types = uncore_msr_uncores;
 	struct intel_uncore_pmu *pmu;
-	struct intel_uncore_box *box, *exist;
-	int i, j, k, phys_id;
+	struct intel_uncore_box *box;
+	int i, pkg, ncpus = 1;
 
-	phys_id = topology_physical_package_id(cpu);
+	if (init) {
+		/*
+		 * On init we get the number of online cpus in the package
+		 * and set refcount for all of them.
+		 */
+		ncpus = cpumask_weight(topology_core_cpumask(cpu));
+	}
 
-	for (i = 0; uncore_msr_uncores[i]; i++) {
-		type = uncore_msr_uncores[i];
-		for (j = 0; j < type->num_boxes; j++) {
-			pmu = &type->pmus[j];
-			box = *per_cpu_ptr(pmu->box, cpu);
-			/* called by uncore_cpu_init? */
-			if (box && box->phys_id >= 0) {
-				uncore_box_init(box);
+	pkg = topology_logical_package_id(cpu);
+	for (; *types; types++) {
+		type = *types;
+		pmu = type->pmus;
+		for (i = 0; i < type->num_boxes; i++, pmu++) {
+			box = pmu->boxes[pkg];
+			if (!box)
 				continue;
-			}
-
-			for_each_online_cpu(k) {
-				exist = *per_cpu_ptr(pmu->box, k);
-				if (exist && exist->phys_id == phys_id) {
-					atomic_inc(&exist->refcnt);
-					*per_cpu_ptr(pmu->box, cpu) = exist;
-					if (box) {
-						list_add(&box->list,
-							 &boxes_to_free);
-						box = NULL;
-					}
-					break;
-				}
-			}
-
-			if (box) {
-				box->phys_id = phys_id;
+			/* The first cpu on a package activates the box */
+			if (atomic_add_return(ncpus, &box->refcnt) == ncpus)
 				uncore_box_init(box);
-			}
 		}
 	}
-	return 0;
 }
 
-static int uncore_cpu_prepare(int cpu, int phys_id)
+static int uncore_cpu_prepare(int cpu)
 {
-	struct intel_uncore_type *type;
+	struct intel_uncore_type *type, **types = uncore_msr_uncores;
 	struct intel_uncore_pmu *pmu;
 	struct intel_uncore_box *box;
-	int i, j;
-
-	for (i = 0; uncore_msr_uncores[i]; i++) {
-		type = uncore_msr_uncores[i];
-		for (j = 0; j < type->num_boxes; j++) {
-			pmu = &type->pmus[j];
-			if (pmu->func_id < 0)
-				pmu->func_id = j;
+	int i, pkg;
 
+	pkg = topology_logical_package_id(cpu);
+	for (; *types; types++) {
+		type = *types;
+		pmu = type->pmus;
+		for (i = 0; i < type->num_boxes; i++, pmu++) {
+			if (pmu->boxes[pkg])
+				continue;
+			/* First cpu of a package allocates the box */
 			box = uncore_alloc_box(type, cpu_to_node(cpu));
 			if (!box)
 				return -ENOMEM;
-
 			box->pmu = pmu;
-			box->phys_id = phys_id;
-			*per_cpu_ptr(pmu->box, cpu) = box;
+			box->pkgid = pkg;
+			pmu->boxes[pkg] = box;
 		}
 	}
 	return 0;
@@ -1166,13 +1127,11 @@ static void uncore_change_type_ctx(struc
 {
 	struct intel_uncore_pmu *pmu = type->pmus;
 	struct intel_uncore_box *box;
-	int i;
+	int i, pkg;
 
+	pkg = topology_logical_package_id(old_cpu < 0 ? new_cpu : old_cpu);
 	for (i = 0; i < type->num_boxes; i++, pmu++) {
-		if (old_cpu < 0)
-			box = uncore_pmu_to_box(pmu, new_cpu);
-		else
-			box = uncore_pmu_to_box(pmu, old_cpu);
+		box = pmu->boxes[pkg];
 		if (!box)
 			continue;
 
@@ -1202,27 +1161,20 @@ static void uncore_change_context(struct
 
 static void uncore_event_exit_cpu(int cpu)
 {
-	int i, phys_id, target;
+	int target;
 
-	/* if exiting cpu is used for collecting uncore events */
+	/* Check if exiting cpu is used for collecting uncore events */
 	if (!cpumask_test_and_clear_cpu(cpu, &uncore_cpu_mask))
 		return;
 
-	/* find a new cpu to collect uncore events */
-	phys_id = topology_physical_package_id(cpu);
-	target = -1;
-	for_each_online_cpu(i) {
-		if (i == cpu)
-			continue;
-		if (phys_id == topology_physical_package_id(i)) {
-			target = i;
-			break;
-		}
-	}
+	/* Find a new cpu to collect uncore events */
+	target = cpumask_any_but(topology_core_cpumask(cpu), cpu);
 
-	/* migrate uncore events to the new cpu */
-	if (target >= 0)
+	/* Migrate uncore events to the new target */
+	if (target < nr_cpu_ids)
 		cpumask_set_cpu(target, &uncore_cpu_mask);
+	else
+		target = -1;
 
 	uncore_change_context(uncore_msr_uncores, cpu, target);
 	uncore_change_context(uncore_pci_uncores, cpu, target);
@@ -1230,13 +1182,15 @@ static void uncore_event_exit_cpu(int cp
 
 static void uncore_event_init_cpu(int cpu)
 {
-	int i, phys_id;
+	int target;
 
-	phys_id = topology_physical_package_id(cpu);
-	for_each_cpu(i, &uncore_cpu_mask) {
-		if (phys_id == topology_physical_package_id(i))
-			return;
-	}
+	/*
+	 * Check if there is an online cpu in the package
+	 * which collects uncore events already.
+	 */
+	target = cpumask_any_and(&uncore_cpu_mask, topology_core_cpumask(cpu));
+	if (target < nr_cpu_ids)
+		return;
 
 	cpumask_set_cpu(cpu, &uncore_cpu_mask);
 
@@ -1249,38 +1203,25 @@ static int uncore_cpu_notifier(struct no
 {
 	unsigned int cpu = (long)hcpu;
 
-	/* allocate/free data structure for uncore box */
 	switch (action & ~CPU_TASKS_FROZEN) {
 	case CPU_UP_PREPARE:
-		return notifier_from_errno(uncore_cpu_prepare(cpu, -1));
+		return notifier_from_errno(uncore_cpu_prepare(cpu));
+
 	case CPU_STARTING:
-		uncore_cpu_starting(cpu);
+		uncore_cpu_starting(cpu, false);
+	case CPU_DOWN_FAILED:
+		uncore_event_init_cpu(cpu);
 		break;
+
 	case CPU_UP_CANCELED:
 	case CPU_DYING:
 		uncore_cpu_dying(cpu);
 		break;
-	case CPU_ONLINE:
-	case CPU_DEAD:
-		uncore_kfree_boxes();
-		break;
-	default:
-		break;
-	}
 
-	/* select the cpu that collects uncore events */
-	switch (action & ~CPU_TASKS_FROZEN) {
-	case CPU_DOWN_FAILED:
-	case CPU_STARTING:
-		uncore_event_init_cpu(cpu);
-		break;
 	case CPU_DOWN_PREPARE:
 		uncore_event_exit_cpu(cpu);
 		break;
-	default:
-		break;
 	}
-
 	return NOTIFY_OK;
 }
 
@@ -1357,7 +1298,7 @@ static int __init uncore_cpu_init(void)
 		return 0;
 	}
 
-	ret = uncore_types_init(uncore_msr_uncores);
+	ret = uncore_types_init(uncore_msr_uncores, true);
 	if (ret)
 		goto err;
 
@@ -1373,39 +1314,34 @@ static int __init uncore_cpu_init(void)
 
 static void __init uncore_cpu_setup(void *dummy)
 {
-	uncore_cpu_starting(smp_processor_id());
+	uncore_cpu_starting(smp_processor_id(), true);
 }
 
+/* Lazy to avoid allocation of a few bytes for the normal case */
+static __initdata DECLARE_BITMAP(packages, MAX_LOCAL_APIC);
+
 static int __init uncore_cpumask_init(void)
 {
-	int cpu, ret = 0;
-
-	cpu_notifier_register_begin();
+	unsigned int cpu;
 
 	for_each_online_cpu(cpu) {
-		int i, phys_id = topology_physical_package_id(cpu);
+		unsigned int pkg = topology_logical_package_id(cpu);
+		int ret;
 
-		for_each_cpu(i, &uncore_cpu_mask) {
-			if (phys_id == topology_physical_package_id(i)) {
-				phys_id = -1;
-				break;
-			}
-		}
-		if (phys_id < 0)
+		if (test_and_set_bit(pkg, packages))
 			continue;
-
-		ret = uncore_cpu_prepare(cpu, phys_id);
+		/*
+		 * The first online cpu of each package takes the refcounts
+		 * for all other online cpus in that package.
+		 */
+		ret = uncore_cpu_prepare(cpu);
 		if (ret)
-			goto out;
+			return ret;
 		uncore_event_init_cpu(cpu);
+		smp_call_function_single(cpu, uncore_cpu_setup, NULL, 1);
 	}
-	on_each_cpu(uncore_cpu_setup, NULL, 1);
-
 	__register_cpu_notifier(&uncore_cpu_nb);
-
-out:
-	cpu_notifier_register_done();
-	return ret;
+	return 0;
 }
 
 static int __init intel_uncore_init(void)
@@ -1418,22 +1354,26 @@ static int __init intel_uncore_init(void
 	if (cpu_has_hypervisor)
 		return -ENODEV;
 
+	max_packages = topology_max_packages();
+
 	ret = uncore_pci_init();
 	if (ret)
 		return ret;
 	ret = uncore_cpu_init();
 	if (ret)
-		goto errpci;
+		goto err;
+
+	cpu_notifier_register_begin();
 	ret = uncore_cpumask_init();
 	if (ret)
-		goto errcpu;
-
+		goto err;
+	cpu_notifier_register_done();
 	return 0;
 
-errcpu:
+err:
 	uncore_types_exit(uncore_msr_uncores);
-errpci:
 	uncore_pci_exit();
+	cpu_notifier_register_done();
 	return ret;
 }
 device_initcall(intel_uncore_init);
--- a/arch/x86/kernel/cpu/perf_event_intel_uncore.h
+++ b/arch/x86/kernel/cpu/perf_event_intel_uncore.h
@@ -19,11 +19,12 @@
 #define UNCORE_EXTRA_PCI_DEV		0xff
 #define UNCORE_EXTRA_PCI_DEV_MAX	3
 
-/* support up to 8 sockets */
-#define UNCORE_SOCKET_MAX		8
-
 #define UNCORE_EVENT_CONSTRAINT(c, n) EVENT_CONSTRAINT(c, n, 0xff)
 
+struct pci_extra_dev {
+	struct pci_dev *dev[UNCORE_EXTRA_PCI_DEV_MAX];
+};
+
 struct intel_uncore_ops;
 struct intel_uncore_pmu;
 struct intel_uncore_box;
@@ -79,9 +80,9 @@ struct intel_uncore_pmu {
 	int				pmu_idx;
 	int				func_id;
 	bool				registered;
+	atomic_t			activeboxes;
 	struct intel_uncore_type	*type;
-	struct intel_uncore_box		** __percpu box;
-	struct list_head		box_list;
+	struct intel_uncore_box		**boxes;
 };
 
 struct intel_uncore_extra_reg {
@@ -91,7 +92,8 @@ struct intel_uncore_extra_reg {
 };
 
 struct intel_uncore_box {
-	int phys_id;
+	int pci_phys_id;
+	int pkgid;
 	int n_active;	/* number of active events */
 	int n_events;
 	int cpu;	/* cpu to collect events */
@@ -316,7 +318,7 @@ static inline void uncore_box_exit(struc
 
 static inline bool uncore_box_is_fake(struct intel_uncore_box *box)
 {
-	return (box->phys_id < 0);
+	return (box->pkgid < 0);
 }
 
 static inline struct intel_uncore_pmu *uncore_event_to_pmu(struct perf_event *event)
@@ -345,7 +347,7 @@ extern struct intel_uncore_type **uncore
 extern struct pci_driver *uncore_pci_driver;
 extern raw_spinlock_t pci2phy_map_lock;
 extern struct list_head pci2phy_map_head;
-extern struct pci_dev *uncore_extra_pci_dev[UNCORE_SOCKET_MAX][UNCORE_EXTRA_PCI_DEV_MAX];
+extern struct pci_extra_dev *uncore_extra_pci_dev;
 extern struct event_constraint uncore_constraint_empty;
 
 /* perf_event_intel_uncore_snb.c */
--- a/arch/x86/kernel/cpu/perf_event_intel_uncore_snbep.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_uncore_snbep.c
@@ -987,7 +987,9 @@ static void snbep_qpi_enable_event(struc
 
 	if (reg1->idx != EXTRA_REG_NONE) {
 		int idx = box->pmu->pmu_idx + SNBEP_PCI_QPI_PORT0_FILTER;
-		struct pci_dev *filter_pdev = uncore_extra_pci_dev[box->phys_id][idx];
+		int pkg = topology_phys_to_logical_pkg(box->pci_phys_id);
+		struct pci_dev *filter_pdev = uncore_extra_pci_dev[pkg].dev[idx];
+
 		if (filter_pdev) {
 			pci_write_config_dword(filter_pdev, reg1->reg,
 						(u32)reg1->config);
@@ -2521,14 +2523,16 @@ static struct intel_uncore_type *hswep_m
 
 void hswep_uncore_cpu_init(void)
 {
+	int pkg = topology_phys_to_logical_pkg(0);
+
 	if (hswep_uncore_cbox.num_boxes > boot_cpu_data.x86_max_cores)
 		hswep_uncore_cbox.num_boxes = boot_cpu_data.x86_max_cores;
 
 	/* Detect 6-8 core systems with only two SBOXes */
-	if (uncore_extra_pci_dev[0][HSWEP_PCI_PCU_3]) {
+	if (uncore_extra_pci_dev[pkg].dev[HSWEP_PCI_PCU_3]) {
 		u32 capid4;
 
-		pci_read_config_dword(uncore_extra_pci_dev[0][HSWEP_PCI_PCU_3],
+		pci_read_config_dword(uncore_extra_pci_dev[pkg].dev[HSWEP_PCI_PCU_3],
 				      0x94, &capid4);
 		if (((capid4 >> 6) & 0x3) == 0)
 			hswep_uncore_sbox.num_boxes = 2;

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [patch V3 13/28] x86/perf/intel_uncore: Clear all hardware state on exit
  2016-02-22 22:19 [patch V3 00/28] x86/perf/intel/uncore|rapl: Fix error handling and sanitize pmu management Thomas Gleixner
                   ` (11 preceding siblings ...)
  2016-02-22 22:19 ` [patch V3 12/28] x86/perf/uncore: Track packages not per cpu data Thomas Gleixner
@ 2016-02-22 22:19 ` Thomas Gleixner
  2016-02-29 11:08   ` [tip:perf/core] perf/x86/intel/uncore: " tip-bot for Thomas Gleixner
  2016-02-22 22:19 ` [patch V3 14/28] x86/perf/intel_uncore: Make PCI and MSR uncore independent Thomas Gleixner
                   ` (14 subsequent siblings)
  27 siblings, 1 reply; 58+ messages in thread
From: Thomas Gleixner @ 2016-02-22 22:19 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, x86, Borislav Petkov, Stephane Eranian,
	Harish Chegondi, Kan Liang, Andi Kleen, Jacob Pan

[-- Attachment #1: x86-perf-intel_uncore--Clear-all-hardware-state-on-exit.patch --]
[-- Type: text/plain, Size: 1488 bytes --]

The only missing bit is to completely clear the hardware state on failure
exit. This is now a pretty simple exercise.

Undo the box->init_box() setup on all packages which have been initialized so
far.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/cpu/perf_event_intel_uncore.c |   26 ++++++++++++++++++++++++++
 1 file changed, 26 insertions(+)

--- a/arch/x86/kernel/cpu/perf_event_intel_uncore.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
@@ -754,6 +754,30 @@ static void uncore_pmu_unregister(struct
 	pmu->registered = false;
 }
 
+static void __init __uncore_exit_boxes(struct intel_uncore_type *type, int cpu)
+{
+	struct intel_uncore_pmu *pmu = type->pmus;
+	struct intel_uncore_box *box;
+	int i, pkg;
+
+	if (pmu) {
+		pkg = topology_physical_package_id(cpu);
+		for (i = 0; i < type->num_boxes; i++, pmu++) {
+			box = pmu->boxes[pkg];
+			if (box)
+				uncore_box_exit(box);
+		}
+	}
+}
+
+static void __init uncore_exit_boxes(void *dummy)
+{
+	struct intel_uncore_type **types;
+
+	for (types = uncore_msr_uncores; *types; types++)
+		__uncore_exit_boxes(*types++, smp_processor_id());
+}
+
 static void uncore_free_boxes(struct intel_uncore_pmu *pmu)
 {
 	int pkg;
@@ -1371,6 +1395,8 @@ static int __init intel_uncore_init(void
 	return 0;
 
 err:
+	/* Undo box->init_box() */
+	on_each_cpu_mask(&uncore_cpu_mask, uncore_exit_boxes, NULL, 1);
 	uncore_types_exit(uncore_msr_uncores);
 	uncore_pci_exit();
 	cpu_notifier_register_done();

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [patch V3 14/28] x86/perf/intel_uncore: Make PCI and MSR uncore independent
  2016-02-22 22:19 [patch V3 00/28] x86/perf/intel/uncore|rapl: Fix error handling and sanitize pmu management Thomas Gleixner
                   ` (12 preceding siblings ...)
  2016-02-22 22:19 ` [patch V3 13/28] x86/perf/intel_uncore: Clear all hardware state on exit Thomas Gleixner
@ 2016-02-22 22:19 ` Thomas Gleixner
  2016-02-29 11:08   ` [tip:perf/core] perf/x86/intel/uncore: " tip-bot for Thomas Gleixner
  2016-02-22 22:19 ` [patch V3 15/28] cpumask: Export cpumask_any_but Thomas Gleixner
                   ` (13 subsequent siblings)
  27 siblings, 1 reply; 58+ messages in thread
From: Thomas Gleixner @ 2016-02-22 22:19 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, x86, Borislav Petkov, Stephane Eranian,
	Harish Chegondi, Kan Liang, Andi Kleen, Jacob Pan, Andi Kleen

[-- Attachment #1: x86-perf-intel_uncore--Make-PCI-and-MSR-uncore-independent.patch --]
[-- Type: text/plain, Size: 2608 bytes --]

Andi wanted to do this before, but the patch fell down the cracks. Implement
it with the proper error handling.

Requested-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/cpu/perf_event_intel_uncore.c |   34 +++++++++++++-------------
 1 file changed, 18 insertions(+), 16 deletions(-)

Index: b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
===================================================================
--- a/arch/x86/kernel/cpu/perf_event_intel_uncore.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
@@ -1023,7 +1023,7 @@ static int __init uncore_pci_init(void)
 		ret = skl_uncore_pci_init();
 		break;
 	default:
-		return 0;
+		return -ENODEV;
 	}
 
 	if (ret)
@@ -1319,7 +1319,7 @@ static int __init uncore_cpu_init(void)
 		knl_uncore_cpu_init();
 		break;
 	default:
-		return 0;
+		return -ENODEV;
 	}
 
 	ret = uncore_types_init(uncore_msr_uncores, true);
@@ -1344,7 +1344,7 @@ static void __init uncore_cpu_setup(void
 /* Lazy to avoid allocation of a few bytes for the normal case */
 static __initdata DECLARE_BITMAP(packages, MAX_LOCAL_APIC);
 
-static int __init uncore_cpumask_init(void)
+static int __init uncore_cpumask_init(bool msr)
 {
 	unsigned int cpu;
 
@@ -1355,12 +1355,15 @@ static int __init uncore_cpumask_init(vo
 		if (test_and_set_bit(pkg, packages))
 			continue;
 		/*
-		 * The first online cpu of each package takes the refcounts
-		 * for all other online cpus in that package.
+		 * The first online cpu of each package allocates and takes
+		 * the refcounts for all other online cpus in that package.
+		 * If msrs are not enabled no allocation is required.
 		 */
-		ret = uncore_cpu_prepare(cpu);
-		if (ret)
-			return ret;
+		if (msr) {
+			ret = uncore_cpu_prepare(cpu);
+			if (ret)
+				return ret;
+		}
 		uncore_event_init_cpu(cpu);
 		smp_call_function_single(cpu, uncore_cpu_setup, NULL, 1);
 	}
@@ -1370,7 +1373,7 @@ static int __init uncore_cpumask_init(vo
 
 static int __init intel_uncore_init(void)
 {
-	int ret;
+	int pret, cret, ret;
 
 	if (boot_cpu_data.x86_vendor != X86_VENDOR_INTEL)
 		return -ENODEV;
@@ -1380,15 +1383,14 @@ static int __init intel_uncore_init(void
 
 	max_packages = topology_max_packages();
 
-	ret = uncore_pci_init();
-	if (ret)
-		return ret;
-	ret = uncore_cpu_init();
-	if (ret)
-		goto err;
+	pret = uncore_pci_init();
+	cret = uncore_cpu_init();
+
+	if (cret && pret)
+		return -ENODEV;
 
 	cpu_notifier_register_begin();
-	ret = uncore_cpumask_init();
+	ret = uncore_cpumask_init(!cret);
 	if (ret)
 		goto err;
 	cpu_notifier_register_done();

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [patch V3 15/28] cpumask: Export cpumask_any_but
  2016-02-22 22:19 [patch V3 00/28] x86/perf/intel/uncore|rapl: Fix error handling and sanitize pmu management Thomas Gleixner
                   ` (13 preceding siblings ...)
  2016-02-22 22:19 ` [patch V3 14/28] x86/perf/intel_uncore: Make PCI and MSR uncore independent Thomas Gleixner
@ 2016-02-22 22:19 ` Thomas Gleixner
  2016-02-29 11:08   ` [tip:perf/core] cpumask: Export cpumask_any_but() tip-bot for Thomas Gleixner
  2016-02-22 22:19 ` [patch V3 16/28] x86/perf/intel_uncore: Make it modular Thomas Gleixner
                   ` (12 subsequent siblings)
  27 siblings, 1 reply; 58+ messages in thread
From: Thomas Gleixner @ 2016-02-22 22:19 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, x86, Borislav Petkov, Stephane Eranian,
	Harish Chegondi, Kan Liang, Andi Kleen, Jacob Pan

[-- Attachment #1: cpumask--Export-cpumask_any_but.patch --]
[-- Type: text/plain, Size: 560 bytes --]

Almost every cpumask function is exported, just not the one I need to make the
intel uncore driver modular.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 lib/cpumask.c |    1 +
 1 file changed, 1 insertion(+)

Index: b/lib/cpumask.c
===================================================================
--- a/lib/cpumask.c
+++ b/lib/cpumask.c
@@ -41,6 +41,7 @@ int cpumask_any_but(const struct cpumask
 			break;
 	return i;
 }
+EXPORT_SYMBOL(cpumask_any_but);
 
 /* These are not inline because of header tangles. */
 #ifdef CONFIG_CPUMASK_OFFSTACK

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [patch V3 16/28] x86/perf/intel_uncore: Make it modular
  2016-02-22 22:19 [patch V3 00/28] x86/perf/intel/uncore|rapl: Fix error handling and sanitize pmu management Thomas Gleixner
                   ` (14 preceding siblings ...)
  2016-02-22 22:19 ` [patch V3 15/28] cpumask: Export cpumask_any_but Thomas Gleixner
@ 2016-02-22 22:19 ` Thomas Gleixner
  2016-02-22 22:19 ` [patch V3 17/28] x86/perf/cqm: Get rid of the silly for_each_cpu lookups Thomas Gleixner
                   ` (11 subsequent siblings)
  27 siblings, 0 replies; 58+ messages in thread
From: Thomas Gleixner @ 2016-02-22 22:19 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, x86, Borislav Petkov, Stephane Eranian,
	Harish Chegondi, Kan Liang, Andi Kleen, Jacob Pan

[-- Attachment #1: x86-perf-intel_uncore--Make-it-modular.patch --]
[-- Type: text/plain, Size: 4225 bytes --]

Now that we have a proper cleanup all over the place, it's simple to make this
a modular driver.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/Kconfig                              |    6 ++----
 arch/x86/Kconfig.perf                         |   16 ++++++++++++++++
 arch/x86/kernel/cpu/Makefile                  |    3 ++-
 arch/x86/kernel/cpu/perf_event_intel_uncore.c |   24 ++++++++++++++++++------
 4 files changed, 38 insertions(+), 11 deletions(-)

--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -160,10 +160,6 @@ config INSTRUCTION_DECODER
 	def_bool y
 	depends on KPROBES || PERF_EVENTS || UPROBES
 
-config PERF_EVENTS_INTEL_UNCORE
-	def_bool y
-	depends on PERF_EVENTS && CPU_SUP_INTEL && PCI
-
 config OUTPUT_FORMAT
 	string
 	default "elf32-i386" if X86_32
@@ -1039,6 +1035,8 @@ config X86_THERMAL_VECTOR
 	def_bool y
 	depends on X86_MCE_INTEL
 
+source "arch/x86/Kconfig.perf"
+
 config X86_LEGACY_VM86
 	bool "Legacy VM86 support"
 	default n
--- /dev/null
+++ b/arch/x86/Kconfig.perf
@@ -0,0 +1,16 @@
+menu "Performance monitoring"
+
+config PERF_EVENTS_INTEL_UNCORE
+	tristate "Intel uncore performance events"
+	depends on PERF_EVENTS && CPU_SUP_INTEL && PCI
+	default y
+	---help---
+	  Include support for Intel uncore performance events. These are
+	  available on NehalemEX and more modern processors.
+
+	  Note: Selecting 'm' might break existing setups as the drivers
+	  lack the autoprobe/load magic. If you need them select: y.
+
+	  If unsure say y.
+
+endmenu
--- a/arch/x86/kernel/cpu/Makefile
+++ b/arch/x86/kernel/cpu/Makefile
@@ -43,7 +43,8 @@ obj-$(CONFIG_CPU_SUP_INTEL)		+= perf_eve
 obj-$(CONFIG_CPU_SUP_INTEL)		+= perf_event_intel_pt.o perf_event_intel_bts.o
 obj-$(CONFIG_CPU_SUP_INTEL)		+= perf_event_intel_cstate.o
 
-obj-$(CONFIG_PERF_EVENTS_INTEL_UNCORE)	+= perf_event_intel_uncore.o \
+obj-$(CONFIG_PERF_EVENTS_INTEL_UNCORE)	+= perf_event_intel_uncores.o
+perf_event_intel_uncores-objs		:= perf_event_intel_uncore.o \
 					   perf_event_intel_uncore_snb.o \
 					   perf_event_intel_uncore_snbep.o \
 					   perf_event_intel_uncore_nhmex.o
--- a/arch/x86/kernel/cpu/perf_event_intel_uncore.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
@@ -21,6 +21,8 @@ static struct event_constraint uncore_co
 struct event_constraint uncore_constraint_empty =
 	EVENT_CONSTRAINT(0, 0, 0);
 
+MODULE_LICENSE("GPL");
+
 static int uncore_pcibus_to_physid(struct pci_bus *bus)
 {
 	struct pci2phy_map *map;
@@ -754,7 +756,7 @@ static void uncore_pmu_unregister(struct
 	pmu->registered = false;
 }
 
-static void __init __uncore_exit_boxes(struct intel_uncore_type *type, int cpu)
+static void __uncore_exit_boxes(struct intel_uncore_type *type, int cpu)
 {
 	struct intel_uncore_pmu *pmu = type->pmus;
 	struct intel_uncore_box *box;
@@ -770,7 +772,7 @@ static void __init __uncore_exit_boxes(s
 	}
 }
 
-static void __init uncore_exit_boxes(void *dummy)
+static void uncore_exit_boxes(void *dummy)
 {
 	struct intel_uncore_type **types;
 
@@ -787,7 +789,7 @@ static void uncore_free_boxes(struct int
 	kfree(pmu->boxes);
 }
 
-static void __init uncore_type_exit(struct intel_uncore_type *type)
+static void uncore_type_exit(struct intel_uncore_type *type)
 {
 	struct intel_uncore_pmu *pmu = type->pmus;
 	int i;
@@ -804,7 +806,7 @@ static void __init uncore_type_exit(stru
 	type->events_group = NULL;
 }
 
-static void __init uncore_types_exit(struct intel_uncore_type **types)
+static void uncore_types_exit(struct intel_uncore_type **types)
 {
 	for (; *types; types++)
 		uncore_type_exit(*types);
@@ -1060,7 +1062,7 @@ static int __init uncore_pci_init(void)
 	return ret;
 }
 
-static void __init uncore_pci_exit(void)
+static void uncore_pci_exit(void)
 {
 	if (pcidrv_registered) {
 		pcidrv_registered = false;
@@ -1404,4 +1406,14 @@ static int __init intel_uncore_init(void
 	cpu_notifier_register_done();
 	return ret;
 }
-device_initcall(intel_uncore_init);
+module_init(intel_uncore_init);
+
+static void __exit intel_uncore_exit(void)
+{
+	cpu_notifier_register_done();
+	__unregister_cpu_notifier(&uncore_cpu_nb);
+	uncore_types_exit(uncore_msr_uncores);
+	uncore_pci_exit();
+	cpu_notifier_register_done();
+}
+module_exit(intel_uncore_exit);

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [patch V3 18/28] x86/perf/intel_rapl: Make Knights Landings support functional
  2016-02-22 22:19 [patch V3 00/28] x86/perf/intel/uncore|rapl: Fix error handling and sanitize pmu management Thomas Gleixner
                   ` (16 preceding siblings ...)
  2016-02-22 22:19 ` [patch V3 17/28] x86/perf/cqm: Get rid of the silly for_each_cpu lookups Thomas Gleixner
@ 2016-02-22 22:19 ` Thomas Gleixner
  2016-02-29 11:09   ` [tip:perf/core] perf/x86/intel/rapl: " tip-bot for Thomas Gleixner
  2016-02-22 22:19 ` [patch V3 19/28] x86/perf/intel/rapl: Add proper error handling Thomas Gleixner
                   ` (9 subsequent siblings)
  27 siblings, 1 reply; 58+ messages in thread
From: Thomas Gleixner @ 2016-02-22 22:19 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, x86, Borislav Petkov, Stephane Eranian,
	Harish Chegondi, Kan Liang, Andi Kleen, Jacob Pan,
	Dasaratharaman Chandramouli

[-- Attachment #1: x86-perf-intel_rapl--Make-Knights-Landings-support-functional.patch --]
[-- Type: text/plain, Size: 912 bytes --]

The Knights Landings support added the events and the detection case, but then
returns 0 without actually initializing the driver.

Fixes: 3a2a7797326a4 "perf/x86/intel/rapl: Add support for Knights Landing (KNL)"
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
---
 arch/x86/kernel/cpu/perf_event_intel_rapl.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: b/arch/x86/kernel/cpu/perf_event_intel_rapl.c
===================================================================
--- a/arch/x86/kernel/cpu/perf_event_intel_rapl.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_rapl.c
@@ -731,7 +731,7 @@ static int __init rapl_pmu_init(void)
 		rapl_add_quirk(rapl_hsw_server_quirk);
 		rapl_cntr_mask = RAPL_IDX_KNL;
 		rapl_pmu_events_group.attrs = rapl_events_knl_attr;
-
+		break;
 	default:
 		/* unsupported */
 		return 0;

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [patch V3 17/28] x86/perf/cqm: Get rid of the silly for_each_cpu lookups
  2016-02-22 22:19 [patch V3 00/28] x86/perf/intel/uncore|rapl: Fix error handling and sanitize pmu management Thomas Gleixner
                   ` (15 preceding siblings ...)
  2016-02-22 22:19 ` [patch V3 16/28] x86/perf/intel_uncore: Make it modular Thomas Gleixner
@ 2016-02-22 22:19 ` Thomas Gleixner
  2016-02-29 11:09   ` [tip:perf/core] perf/x86/intel/cqm: Get rid of the silly for_each_cpu() lookups tip-bot for Thomas Gleixner
  2016-02-22 22:19 ` [patch V3 18/28] x86/perf/intel_rapl: Make Knights Landings support functional Thomas Gleixner
                   ` (10 subsequent siblings)
  27 siblings, 1 reply; 58+ messages in thread
From: Thomas Gleixner @ 2016-02-22 22:19 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, x86, Borislav Petkov, Stephane Eranian,
	Harish Chegondi, Kan Liang, Andi Kleen, Jacob Pan

[-- Attachment #1: x86-perf-cqm--Get-rid-of-the-silly-for_each_cpu-lookups.patch --]
[-- Type: text/plain, Size: 1909 bytes --]

CQM is a strict per package facility. Use the proper cpumasks to lookup the
readers.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/cpu/perf_event_intel_cqm.c |   34 ++++++++++-------------------
 1 file changed, 12 insertions(+), 22 deletions(-)

Index: b/arch/x86/kernel/cpu/perf_event_intel_cqm.c
===================================================================
--- a/arch/x86/kernel/cpu/perf_event_intel_cqm.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_cqm.c
@@ -1244,15 +1244,12 @@ static struct pmu intel_cqm_pmu = {
 
 static inline void cqm_pick_event_reader(int cpu)
 {
-	int phys_id = topology_physical_package_id(cpu);
-	int i;
+	int reader;
 
-	for_each_cpu(i, &cqm_cpumask) {
-		if (phys_id == topology_physical_package_id(i))
-			return;	/* already got reader for this socket */
-	}
-
-	cpumask_set_cpu(cpu, &cqm_cpumask);
+	/* First online cpu in package becomes the reader */
+	reader = cpumask_any_and(&cqm_cpumask, topology_core_cpumask(cpu));
+	if (reader >= nr_cpu_ids)
+		cpumask_set_cpu(cpu, &cqm_cpumask);
 }
 
 static void intel_cqm_cpu_starting(unsigned int cpu)
@@ -1270,24 +1267,17 @@ static void intel_cqm_cpu_starting(unsig
 
 static void intel_cqm_cpu_exit(unsigned int cpu)
 {
-	int phys_id = topology_physical_package_id(cpu);
-	int i;
+	int target;
 
-	/*
-	 * Is @cpu a designated cqm reader?
-	 */
+	/* Is @cpu the current cqm reader for this package ? */
 	if (!cpumask_test_and_clear_cpu(cpu, &cqm_cpumask))
 		return;
 
-	for_each_online_cpu(i) {
-		if (i == cpu)
-			continue;
-
-		if (phys_id == topology_physical_package_id(i)) {
-			cpumask_set_cpu(i, &cqm_cpumask);
-			break;
-		}
-	}
+	/* Find another online reader in this package */
+	target = cpumask_any_but(topology_core_cpumask(cpu), cpu);
+
+	if (target < nr_cpu_ids)
+		cpumask_set_cpu(target, &cqm_cpumask);
 }
 
 static int intel_cqm_cpu_notifier(struct notifier_block *nb,

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [patch V3 19/28] x86/perf/intel/rapl: Add proper error handling
  2016-02-22 22:19 [patch V3 00/28] x86/perf/intel/uncore|rapl: Fix error handling and sanitize pmu management Thomas Gleixner
                   ` (17 preceding siblings ...)
  2016-02-22 22:19 ` [patch V3 18/28] x86/perf/intel_rapl: Make Knights Landings support functional Thomas Gleixner
@ 2016-02-22 22:19 ` Thomas Gleixner
  2016-02-29 11:10   ` [tip:perf/core] perf/x86/intel/rapl: " tip-bot for Thomas Gleixner
  2016-02-22 22:19 ` [patch V3 21/28] x86/perf/intel/rapl: Calculate timing once Thomas Gleixner
                   ` (8 subsequent siblings)
  27 siblings, 1 reply; 58+ messages in thread
From: Thomas Gleixner @ 2016-02-22 22:19 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, x86, Borislav Petkov, Stephane Eranian,
	Harish Chegondi, Kan Liang, Andi Kleen, Jacob Pan

[-- Attachment #1: x86-perf-intel-rapl--Add-proper-error-handling.patch --]
[-- Type: text/plain, Size: 2385 bytes --]

Like uncore the rapl driver lacks error handling. It leaks memory and leaves
the hotplug notifier registered.

Add the proper error checks, cleanup the memory and register the hotplug
notifier only on success.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/cpu/perf_event_intel_rapl.c |   29 ++++++++++++++++++++--------
 1 file changed, 21 insertions(+), 8 deletions(-)

Index: b/arch/x86/kernel/cpu/perf_event_intel_rapl.c
===================================================================
--- a/arch/x86/kernel/cpu/perf_event_intel_rapl.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_rapl.c
@@ -686,6 +686,14 @@ static int rapl_check_hw_unit(void)
 	return 0;
 }
 
+static void __init cleanup_rapl_pmus(void)
+{
+	int cpu;
+
+	for_each_online_cpu(cpu)
+		kfree(per_cpu(rapl_pmu, cpu));
+}
+
 static const struct x86_cpu_id rapl_cpu_match[] = {
 	[0] = { .vendor = X86_VENDOR_INTEL, .family = 6 },
 	[1] = {},
@@ -702,7 +710,7 @@ static int __init rapl_pmu_init(void)
 	 * check for Intel processor family 6
 	 */
 	if (!x86_match_cpu(rapl_cpu_match))
-		return 0;
+		return -ENODEV;
 
 	/* check supported CPU */
 	switch (boot_cpu_data.x86_model) {
@@ -734,8 +742,9 @@ static int __init rapl_pmu_init(void)
 		break;
 	default:
 		/* unsupported */
-		return 0;
+		return -ENODEV;
 	}
+
 	ret = rapl_check_hw_unit();
 	if (ret)
 		return ret;
@@ -743,6 +752,7 @@ static int __init rapl_pmu_init(void)
 	/* run cpu model quirks */
 	for (quirk = rapl_quirks; quirk; quirk = quirk->next)
 		quirk->func();
+
 	cpu_notifier_register_begin();
 
 	for_each_online_cpu(cpu) {
@@ -752,15 +762,14 @@ static int __init rapl_pmu_init(void)
 		rapl_cpu_init(cpu);
 	}
 
-	__perf_cpu_notifier(rapl_cpu_notifier);
-
 	ret = perf_pmu_register(&rapl_pmu_class, "power", -1);
 	if (WARN_ON(ret)) {
 		pr_info("RAPL PMU detected, registration failed (%d), RAPL PMU disabled\n", ret);
-		cpu_notifier_register_done();
-		return -1;
+		goto out;
 	}
 
+	__perf_cpu_notifier(rapl_cpu_notifier);
+
 	pmu = __this_cpu_read(rapl_pmu);
 
 	pr_info("RAPL PMU detected,"
@@ -775,9 +784,13 @@ static int __init rapl_pmu_init(void)
 				rapl_domain_names[i], rapl_hw_unit[i]);
 		}
 	}
-out:
-	cpu_notifier_register_done();
 
+	cpu_notifier_register_done();
 	return 0;
+
+out:
+	cleanup_rapl_pmus();
+	cpu_notifier_register_done();
+	return ret;
 }
 device_initcall(rapl_pmu_init);

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [patch V3 21/28] x86/perf/intel/rapl: Calculate timing once
  2016-02-22 22:19 [patch V3 00/28] x86/perf/intel/uncore|rapl: Fix error handling and sanitize pmu management Thomas Gleixner
                   ` (18 preceding siblings ...)
  2016-02-22 22:19 ` [patch V3 19/28] x86/perf/intel/rapl: Add proper error handling Thomas Gleixner
@ 2016-02-22 22:19 ` Thomas Gleixner
  2016-02-29 11:10   ` [tip:perf/core] perf/x86/intel/rapl: " tip-bot for Thomas Gleixner
  2016-02-22 22:19 ` [patch V3 20/28] x86/perf/intel/rapl: Sanitize the quirk handling Thomas Gleixner
                   ` (7 subsequent siblings)
  27 siblings, 1 reply; 58+ messages in thread
From: Thomas Gleixner @ 2016-02-22 22:19 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, x86, Borislav Petkov, Stephane Eranian,
	Harish Chegondi, Kan Liang, Andi Kleen, Jacob Pan

[-- Attachment #1: x86-perf-intel-rapl--Calculate-timing-once.patch --]
[-- Type: text/plain, Size: 2976 bytes --]

No point in doing the same calculation over and over. Do it once in
rapl_check_hw_unit().

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/cpu/perf_event_intel_rapl.c |   39 ++++++++++++----------------
 1 file changed, 18 insertions(+), 21 deletions(-)

Index: b/arch/x86/kernel/cpu/perf_event_intel_rapl.c
===================================================================
--- a/arch/x86/kernel/cpu/perf_event_intel_rapl.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_rapl.c
@@ -129,6 +129,7 @@ static int rapl_hw_unit[NR_RAPL_DOMAINS]
 static struct pmu rapl_pmu_class;
 static cpumask_t rapl_cpu_mask;
 static int rapl_cntr_mask;
+static u64 rapl_timer_ms;
 
 static DEFINE_PER_CPU(struct rapl_pmu *, rapl_pmu);
 static DEFINE_PER_CPU(struct rapl_pmu *, rapl_pmu_to_free);
@@ -558,7 +559,6 @@ static int rapl_cpu_prepare(int cpu)
 {
 	struct rapl_pmu *pmu = per_cpu(rapl_pmu, cpu);
 	int phys_id = topology_physical_package_id(cpu);
-	u64 ms;
 
 	if (pmu)
 		return 0;
@@ -575,19 +575,7 @@ static int rapl_cpu_prepare(int cpu)
 
 	pmu->pmu = &rapl_pmu_class;
 
-	/*
-	 * use reference of 200W for scaling the timeout
-	 * to avoid missing counter overflows.
-	 * 200W = 200 Joules/sec
-	 * divide interval by 2 to avoid lockstep (2 * 100)
-	 * if hw unit is 32, then we use 2 ms 1/200/2
-	 */
-	if (rapl_hw_unit[0] < 32)
-		ms = (1000 / (2 * 100)) * (1ULL << (32 - rapl_hw_unit[0] - 1));
-	else
-		ms = 2;
-
-	pmu->timer_interval = ms_to_ktime(ms);
+	pmu->timer_interval = ms_to_ktime(rapl_timer_ms);
 
 	rapl_hrtimer_init(pmu);
 
@@ -676,6 +664,19 @@ static int rapl_check_hw_unit(void (*qui
 	/* Apply cpu model quirk */
 	if (quirk)
 		quirk();
+
+	/*
+	 * Calculate the timer rate:
+	 * Use reference of 200W for scaling the timeout to avoid counter
+	 * overflows. 200W = 200 Joules/sec
+	 * Divide interval by 2 to avoid lockstep (2 * 100)
+	 * if hw unit is 32, then we use 2 ms 1/200/2
+	 */
+	rapl_timer_ms = 2;
+	if (rapl_hw_unit[0] < 32) {
+		rapl_timer_ms = (1000 / (2 * 100));
+		rapl_timer_ms *= (1ULL << (32 - rapl_hw_unit[0] - 1));
+	}
 	return 0;
 }
 
@@ -695,9 +696,7 @@ static const struct x86_cpu_id rapl_cpu_
 static int __init rapl_pmu_init(void)
 {
 	void (*quirk)(void) = NULL;
-	struct rapl_pmu *pmu;
-	int cpu, ret;
-	int i;
+	int cpu, ret, i;
 
 	/*
 	 * check for Intel processor family 6
@@ -758,15 +757,14 @@ static int __init rapl_pmu_init(void)
 	}
 
 	__perf_cpu_notifier(rapl_cpu_notifier);
-
-	pmu = __this_cpu_read(rapl_pmu);
+	cpu_notifier_register_done();
 
 	pr_info("RAPL PMU detected,"
 		" API unit is 2^-32 Joules,"
 		" %d fixed counters"
 		" %llu ms ovfl timer\n",
 		hweight32(rapl_cntr_mask),
-		ktime_to_ms(pmu->timer_interval));
+		rapl_timer_ms);
 	for (i = 0; i < NR_RAPL_DOMAINS; i++) {
 		if (rapl_cntr_mask & (1 << i)) {
 			pr_info("hw unit of domain %s 2^-%d Joules\n",
@@ -774,7 +772,6 @@ static int __init rapl_pmu_init(void)
 		}
 	}
 
-	cpu_notifier_register_done();
 	return 0;
 
 out:

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [patch V3 20/28] x86/perf/intel/rapl: Sanitize the quirk handling
  2016-02-22 22:19 [patch V3 00/28] x86/perf/intel/uncore|rapl: Fix error handling and sanitize pmu management Thomas Gleixner
                   ` (19 preceding siblings ...)
  2016-02-22 22:19 ` [patch V3 21/28] x86/perf/intel/rapl: Calculate timing once Thomas Gleixner
@ 2016-02-22 22:19 ` Thomas Gleixner
  2016-02-29 11:10   ` [tip:perf/core] perf/x86/intel/rapl: " tip-bot for Thomas Gleixner
  2016-02-22 22:19 ` [patch V3 22/28] x86/perf/intel/rapl: Cleanup the printk output Thomas Gleixner
                   ` (6 subsequent siblings)
  27 siblings, 1 reply; 58+ messages in thread
From: Thomas Gleixner @ 2016-02-22 22:19 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, x86, Borislav Petkov, Stephane Eranian,
	Harish Chegondi, Kan Liang, Andi Kleen, Jacob Pan

[-- Attachment #1: x86-perf-intel-rapl--Sanitize-the-quirk-handling.patch --]
[-- Type: text/plain, Size: 3850 bytes --]

There is no point in having a quirk machinery for a single possible
function. Get rid of it and move the quirk to a place where it makes actually
sense.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/cpu/perf_event_intel_rapl.c |   49 ++++++++++------------------
 1 file changed, 19 insertions(+), 30 deletions(-)

Index: b/arch/x86/kernel/cpu/perf_event_intel_rapl.c
===================================================================
--- a/arch/x86/kernel/cpu/perf_event_intel_rapl.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_rapl.c
@@ -133,7 +133,6 @@ static int rapl_cntr_mask;
 static DEFINE_PER_CPU(struct rapl_pmu *, rapl_pmu);
 static DEFINE_PER_CPU(struct rapl_pmu *, rapl_pmu_to_free);
 
-static struct x86_pmu_quirk *rapl_quirks;
 static inline u64 rapl_read_counter(struct perf_event *event)
 {
 	u64 raw;
@@ -141,15 +140,6 @@ static inline u64 rapl_read_counter(stru
 	return raw;
 }
 
-#define rapl_add_quirk(func_)						\
-do {									\
-	static struct x86_pmu_quirk __quirk __initdata = {		\
-		.func = func_,						\
-	};								\
-	__quirk.next = rapl_quirks;					\
-	rapl_quirks = &__quirk;						\
-} while (0)
-
 static inline u64 rapl_scale(u64 v, int cfg)
 {
 	if (cfg > NR_RAPL_DOMAINS) {
@@ -564,17 +554,6 @@ static void rapl_cpu_init(int cpu)
 	cpumask_set_cpu(cpu, &rapl_cpu_mask);
 }
 
-static __init void rapl_hsw_server_quirk(void)
-{
-	/*
-	 * DRAM domain on HSW server has fixed energy unit which can be
-	 * different than the unit from power unit MSR.
-	 * "Intel Xeon Processor E5-1600 and E5-2600 v3 Product Families, V2
-	 * of 2. Datasheet, September 2014, Reference Number: 330784-001 "
-	 */
-	rapl_hw_unit[RAPL_IDX_RAM_NRG_STAT] = 16;
-}
-
 static int rapl_cpu_prepare(int cpu)
 {
 	struct rapl_pmu *pmu = per_cpu(rapl_pmu, cpu);
@@ -672,7 +651,18 @@ static int rapl_cpu_notifier(struct noti
 	return NOTIFY_OK;
 }
 
-static int rapl_check_hw_unit(void)
+static __init void rapl_hsw_server_quirk(void)
+{
+	/*
+	 * DRAM domain on HSW server has fixed energy unit which can be
+	 * different than the unit from power unit MSR.
+	 * "Intel Xeon Processor E5-1600 and E5-2600 v3 Product Families, V2
+	 * of 2. Datasheet, September 2014, Reference Number: 330784-001 "
+	 */
+	rapl_hw_unit[RAPL_IDX_RAM_NRG_STAT] = 16;
+}
+
+static int rapl_check_hw_unit(void (*quirk)(void))
 {
 	u64 msr_rapl_power_unit_bits;
 	int i;
@@ -683,6 +673,9 @@ static int rapl_check_hw_unit(void)
 	for (i = 0; i < NR_RAPL_DOMAINS; i++)
 		rapl_hw_unit[i] = (msr_rapl_power_unit_bits >> 8) & 0x1FULL;
 
+	/* Apply cpu model quirk */
+	if (quirk)
+		quirk();
 	return 0;
 }
 
@@ -701,9 +694,9 @@ static const struct x86_cpu_id rapl_cpu_
 
 static int __init rapl_pmu_init(void)
 {
+	void (*quirk)(void) = NULL;
 	struct rapl_pmu *pmu;
 	int cpu, ret;
-	struct x86_pmu_quirk *quirk;
 	int i;
 
 	/*
@@ -720,7 +713,7 @@ static int __init rapl_pmu_init(void)
 		rapl_pmu_events_group.attrs = rapl_events_cln_attr;
 		break;
 	case 63: /* Haswell-Server */
-		rapl_add_quirk(rapl_hsw_server_quirk);
+		quirk = rapl_hsw_server_quirk;
 		rapl_cntr_mask = RAPL_IDX_SRV;
 		rapl_pmu_events_group.attrs = rapl_events_srv_attr;
 		break;
@@ -736,7 +729,7 @@ static int __init rapl_pmu_init(void)
 		rapl_pmu_events_group.attrs = rapl_events_srv_attr;
 		break;
 	case 87: /* Knights Landing */
-		rapl_add_quirk(rapl_hsw_server_quirk);
+		quirk = rapl_hsw_server_quirk;
 		rapl_cntr_mask = RAPL_IDX_KNL;
 		rapl_pmu_events_group.attrs = rapl_events_knl_attr;
 		break;
@@ -745,14 +738,10 @@ static int __init rapl_pmu_init(void)
 		return -ENODEV;
 	}
 
-	ret = rapl_check_hw_unit();
+	ret = rapl_check_hw_unit(quirk);
 	if (ret)
 		return ret;
 
-	/* run cpu model quirks */
-	for (quirk = rapl_quirks; quirk; quirk = quirk->next)
-		quirk->func();
-
 	cpu_notifier_register_begin();
 
 	for_each_online_cpu(cpu) {

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [patch V3 22/28] x86/perf/intel/rapl: Cleanup the printk output
  2016-02-22 22:19 [patch V3 00/28] x86/perf/intel/uncore|rapl: Fix error handling and sanitize pmu management Thomas Gleixner
                   ` (20 preceding siblings ...)
  2016-02-22 22:19 ` [patch V3 20/28] x86/perf/intel/rapl: Sanitize the quirk handling Thomas Gleixner
@ 2016-02-22 22:19 ` Thomas Gleixner
  2016-02-29 11:11   ` [tip:perf/core] perf/x86/intel/rapl: Clean up " tip-bot for Thomas Gleixner
  2016-02-22 22:19 ` [patch V3 23/28] x86/perf/intel/rapl: Refactor code some more Thomas Gleixner
                   ` (5 subsequent siblings)
  27 siblings, 1 reply; 58+ messages in thread
From: Thomas Gleixner @ 2016-02-22 22:19 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, x86, Borislav Petkov, Stephane Eranian,
	Harish Chegondi, Kan Liang, Andi Kleen, Jacob Pan

[-- Attachment #1: x86-perf-intel-rapl--Cleanup-the-printk-output.patch --]
[-- Type: text/plain, Size: 2722 bytes --]

The output is inconsistent. Use a proper pr_fmt prefix and split out the
advertisement into a seperate function.

Remove the WARN_ON in the failure case. It's pointless as we already know
where it failed.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/cpu/perf_event_intel_rapl.c |   42 +++++++++++++++-------------
 1 file changed, 23 insertions(+), 19 deletions(-)

Index: b/arch/x86/kernel/cpu/perf_event_intel_rapl.c
===================================================================
--- a/arch/x86/kernel/cpu/perf_event_intel_rapl.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_rapl.c
@@ -44,6 +44,9 @@
  * the duration of the measurement. Tools may use a function such as
  * ldexp(raw_count, -32);
  */
+
+#define pr_fmt(fmt) "RAPL PMU: " fmt
+
 #include <linux/module.h>
 #include <linux/slab.h>
 #include <linux/perf_event.h>
@@ -144,7 +147,7 @@ static inline u64 rapl_read_counter(stru
 static inline u64 rapl_scale(u64 v, int cfg)
 {
 	if (cfg > NR_RAPL_DOMAINS) {
-		pr_warn("invalid domain %d, failed to scale data\n", cfg);
+		pr_warn("Invalid domain %d, failed to scale data\n", cfg);
 		return v;
 	}
 	/*
@@ -680,6 +683,21 @@ static int rapl_check_hw_unit(void (*qui
 	return 0;
 }
 
+static void __init rapl_advertise(void)
+{
+	int i;
+
+	pr_info("API unit is 2^-32 Joules, %d fixed counters, %llu ms ovfl timer\n",
+		hweight32(rapl_cntr_mask), rapl_timer_ms);
+
+	for (i = 0; i < NR_RAPL_DOMAINS; i++) {
+		if (rapl_cntr_mask & (1 << i)) {
+			pr_info("hw unit of domain %s 2^-%d Joules\n",
+				rapl_domain_names[i], rapl_hw_unit[i]);
+		}
+	}
+}
+
 static void __init cleanup_rapl_pmus(void)
 {
 	int cpu;
@@ -696,7 +714,7 @@ static const struct x86_cpu_id rapl_cpu_
 static int __init rapl_pmu_init(void)
 {
 	void (*quirk)(void) = NULL;
-	int cpu, ret, i;
+	int cpu, ret;
 
 	/*
 	 * check for Intel processor family 6
@@ -751,30 +769,16 @@ static int __init rapl_pmu_init(void)
 	}
 
 	ret = perf_pmu_register(&rapl_pmu_class, "power", -1);
-	if (WARN_ON(ret)) {
-		pr_info("RAPL PMU detected, registration failed (%d), RAPL PMU disabled\n", ret);
+	if (ret)
 		goto out;
-	}
 
 	__perf_cpu_notifier(rapl_cpu_notifier);
 	cpu_notifier_register_done();
-
-	pr_info("RAPL PMU detected,"
-		" API unit is 2^-32 Joules,"
-		" %d fixed counters"
-		" %llu ms ovfl timer\n",
-		hweight32(rapl_cntr_mask),
-		rapl_timer_ms);
-	for (i = 0; i < NR_RAPL_DOMAINS; i++) {
-		if (rapl_cntr_mask & (1 << i)) {
-			pr_info("hw unit of domain %s 2^-%d Joules\n",
-				rapl_domain_names[i], rapl_hw_unit[i]);
-		}
-	}
-
+	rapl_advertise();
 	return 0;
 
 out:
+	pr_warn("Initialization failed (%d), disabled\n", ret);
 	cleanup_rapl_pmus();
 	cpu_notifier_register_done();
 	return ret;

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [patch V3 23/28] x86/perf/intel/rapl: Refactor code some more
  2016-02-22 22:19 [patch V3 00/28] x86/perf/intel/uncore|rapl: Fix error handling and sanitize pmu management Thomas Gleixner
                   ` (21 preceding siblings ...)
  2016-02-22 22:19 ` [patch V3 22/28] x86/perf/intel/rapl: Cleanup the printk output Thomas Gleixner
@ 2016-02-22 22:19 ` Thomas Gleixner
  2016-02-29 11:11   ` [tip:perf/core] perf/x86/intel/rapl: Refactor the " tip-bot for Thomas Gleixner
  2016-02-22 22:19 ` [patch V3 25/28] x86/perf/intel/rapl: Utilize event->pmu_private Thomas Gleixner
                   ` (4 subsequent siblings)
  27 siblings, 1 reply; 58+ messages in thread
From: Thomas Gleixner @ 2016-02-22 22:19 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, x86, Borislav Petkov, Stephane Eranian,
	Harish Chegondi, Kan Liang, Andi Kleen, Jacob Pan

[-- Attachment #1: x86-perf-intel-rapl--Refactor-code-some-more.patch --]
[-- Type: text/plain, Size: 4346 bytes --]

Split out code from init into seperate functions. Tidy up the code and get rid
of pointless comments. I wish there would be comments for code which is not
obvious....

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/cpu/perf_event_intel_rapl.c |   61 ++++++++++++++--------------
 1 file changed, 31 insertions(+), 30 deletions(-)

Index: b/arch/x86/kernel/cpu/perf_event_intel_rapl.c
===================================================================
--- a/arch/x86/kernel/cpu/perf_event_intel_rapl.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_rapl.c
@@ -110,7 +110,7 @@ static ssize_t __rapl_##_var##_show(stru
 static struct kobj_attribute format_attr_##_var =		\
 	__ATTR(_name, 0444, __rapl_##_var##_show, NULL)
 
-#define RAPL_CNTR_WIDTH 32 /* 32-bit rapl counters */
+#define RAPL_CNTR_WIDTH 32
 
 #define RAPL_EVENT_ATTR_STR(_name, v, str)					\
 static struct perf_pmu_events_attr event_attr_##v = {				\
@@ -120,15 +120,16 @@ static struct perf_pmu_events_attr event
 };
 
 struct rapl_pmu {
-	spinlock_t	 lock;
-	int		 n_active; /* number of active events */
-	struct list_head active_list;
-	struct pmu	 *pmu; /* pointer to rapl_pmu_class */
-	ktime_t		 timer_interval; /* in ktime_t unit */
-	struct hrtimer   hrtimer;
+	spinlock_t		lock;
+	int			n_active;
+	struct list_head	active_list;
+	struct pmu		*pmu;
+	ktime_t			timer_interval;
+	struct hrtimer		hrtimer;
 };
 
-static int rapl_hw_unit[NR_RAPL_DOMAINS] __read_mostly;  /* 1/2^hw_unit Joule */
+ /* 1/2^hw_unit Joule */
+static int rapl_hw_unit[NR_RAPL_DOMAINS] __read_mostly;
 static struct pmu rapl_pmu_class;
 static cpumask_t rapl_cpu_mask;
 static int rapl_cntr_mask;
@@ -200,11 +201,6 @@ static void rapl_start_hrtimer(struct ra
 		     HRTIMER_MODE_REL_PINNED);
 }
 
-static void rapl_stop_hrtimer(struct rapl_pmu *pmu)
-{
-	hrtimer_cancel(&pmu->hrtimer);
-}
-
 static enum hrtimer_restart rapl_hrtimer_handle(struct hrtimer *hrtimer)
 {
 	struct rapl_pmu *pmu = __this_cpu_read(rapl_pmu);
@@ -216,9 +212,8 @@ static enum hrtimer_restart rapl_hrtimer
 
 	spin_lock_irqsave(&pmu->lock, flags);
 
-	list_for_each_entry(event, &pmu->active_list, active_entry) {
+	list_for_each_entry(event, &pmu->active_list, active_entry)
 		rapl_event_update(event);
-	}
 
 	spin_unlock_irqrestore(&pmu->lock, flags);
 
@@ -275,7 +270,7 @@ static void rapl_pmu_event_stop(struct p
 		WARN_ON_ONCE(pmu->n_active <= 0);
 		pmu->n_active--;
 		if (pmu->n_active == 0)
-			rapl_stop_hrtimer(pmu);
+			hrtimer_cancel(&pmu->hrtimer);
 
 		list_del(&event->active_entry);
 
@@ -542,7 +537,7 @@ static void rapl_cpu_exit(int cpu)
 		perf_pmu_migrate_context(pmu->pmu, cpu, target);
 
 	/* cancel overflow polling timer for CPU */
-	rapl_stop_hrtimer(pmu);
+	hrtimer_cancel(&pmu->hrtimer);
 }
 
 static void rapl_cpu_init(int cpu)
@@ -698,6 +693,20 @@ static void __init rapl_advertise(void)
 	}
 }
 
+static int __init rapl_prepare_cpus(void)
+{
+	unsigned int cpu;
+	int ret;
+
+	for_each_online_cpu(cpu) {
+		ret = rapl_cpu_prepare(cpu);
+		if (ret)
+			return ret;
+		rapl_cpu_init(cpu);
+	}
+	return 0;
+}
+
 static void __init cleanup_rapl_pmus(void)
 {
 	int cpu;
@@ -706,7 +715,7 @@ static void __init cleanup_rapl_pmus(voi
 		kfree(per_cpu(rapl_pmu, cpu));
 }
 
-static const struct x86_cpu_id rapl_cpu_match[] = {
+static const struct x86_cpu_id rapl_cpu_match[] __initconst = {
 	[0] = { .vendor = X86_VENDOR_INTEL, .family = 6 },
 	[1] = {},
 };
@@ -714,15 +723,11 @@ static const struct x86_cpu_id rapl_cpu_
 static int __init rapl_pmu_init(void)
 {
 	void (*quirk)(void) = NULL;
-	int cpu, ret;
+	int ret;
 
-	/*
-	 * check for Intel processor family 6
-	 */
 	if (!x86_match_cpu(rapl_cpu_match))
 		return -ENODEV;
 
-	/* check supported CPU */
 	switch (boot_cpu_data.x86_model) {
 	case 42: /* Sandy Bridge */
 	case 58: /* Ivy Bridge */
@@ -751,7 +756,6 @@ static int __init rapl_pmu_init(void)
 		rapl_pmu_events_group.attrs = rapl_events_knl_attr;
 		break;
 	default:
-		/* unsupported */
 		return -ENODEV;
 	}
 
@@ -761,12 +765,9 @@ static int __init rapl_pmu_init(void)
 
 	cpu_notifier_register_begin();
 
-	for_each_online_cpu(cpu) {
-		ret = rapl_cpu_prepare(cpu);
-		if (ret)
-			goto out;
-		rapl_cpu_init(cpu);
-	}
+	ret = rapl_prepare_cpus();
+	if (ret)
+		goto out;
 
 	ret = perf_pmu_register(&rapl_pmu_class, "power", -1);
 	if (ret)

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [patch V3 24/28] x86/perf/intel/rapl: Make pmu lock raw
  2016-02-22 22:19 [patch V3 00/28] x86/perf/intel/uncore|rapl: Fix error handling and sanitize pmu management Thomas Gleixner
                   ` (23 preceding siblings ...)
  2016-02-22 22:19 ` [patch V3 25/28] x86/perf/intel/rapl: Utilize event->pmu_private Thomas Gleixner
@ 2016-02-22 22:19 ` Thomas Gleixner
  2016-02-29 11:12   ` [tip:perf/core] perf/x86/intel/rapl: Make PMU " tip-bot for Thomas Gleixner
  2016-02-22 22:19 ` [patch V3 26/28] x86/perf/intel/rapl: Convert it to a per package facility Thomas Gleixner
                   ` (2 subsequent siblings)
  27 siblings, 1 reply; 58+ messages in thread
From: Thomas Gleixner @ 2016-02-22 22:19 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, x86, Borislav Petkov, Stephane Eranian,
	Harish Chegondi, Kan Liang, Andi Kleen, Jacob Pan

[-- Attachment #1: x86-perf-intel-rapl--Make-pmu-lock-raw.patch --]
[-- Type: text/plain, Size: 2862 bytes --]

This lock is taken in hard interrupt context even on Preempt-RT. Make it raw
so RT does not have to patch it.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/cpu/perf_event_intel_rapl.c |   20 ++++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

Index: b/arch/x86/kernel/cpu/perf_event_intel_rapl.c
===================================================================
--- a/arch/x86/kernel/cpu/perf_event_intel_rapl.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_rapl.c
@@ -120,7 +120,7 @@ static struct perf_pmu_events_attr event
 };
 
 struct rapl_pmu {
-	spinlock_t		lock;
+	raw_spinlock_t		lock;
 	int			n_active;
 	struct list_head	active_list;
 	struct pmu		*pmu;
@@ -210,12 +210,12 @@ static enum hrtimer_restart rapl_hrtimer
 	if (!pmu->n_active)
 		return HRTIMER_NORESTART;
 
-	spin_lock_irqsave(&pmu->lock, flags);
+	raw_spin_lock_irqsave(&pmu->lock, flags);
 
 	list_for_each_entry(event, &pmu->active_list, active_entry)
 		rapl_event_update(event);
 
-	spin_unlock_irqrestore(&pmu->lock, flags);
+	raw_spin_unlock_irqrestore(&pmu->lock, flags);
 
 	hrtimer_forward_now(hrtimer, pmu->timer_interval);
 
@@ -252,9 +252,9 @@ static void rapl_pmu_event_start(struct
 	struct rapl_pmu *pmu = __this_cpu_read(rapl_pmu);
 	unsigned long flags;
 
-	spin_lock_irqsave(&pmu->lock, flags);
+	raw_spin_lock_irqsave(&pmu->lock, flags);
 	__rapl_pmu_event_start(pmu, event);
-	spin_unlock_irqrestore(&pmu->lock, flags);
+	raw_spin_unlock_irqrestore(&pmu->lock, flags);
 }
 
 static void rapl_pmu_event_stop(struct perf_event *event, int mode)
@@ -263,7 +263,7 @@ static void rapl_pmu_event_stop(struct p
 	struct hw_perf_event *hwc = &event->hw;
 	unsigned long flags;
 
-	spin_lock_irqsave(&pmu->lock, flags);
+	raw_spin_lock_irqsave(&pmu->lock, flags);
 
 	/* mark event as deactivated and stopped */
 	if (!(hwc->state & PERF_HES_STOPPED)) {
@@ -288,7 +288,7 @@ static void rapl_pmu_event_stop(struct p
 		hwc->state |= PERF_HES_UPTODATE;
 	}
 
-	spin_unlock_irqrestore(&pmu->lock, flags);
+	raw_spin_unlock_irqrestore(&pmu->lock, flags);
 }
 
 static int rapl_pmu_event_add(struct perf_event *event, int mode)
@@ -297,14 +297,14 @@ static int rapl_pmu_event_add(struct per
 	struct hw_perf_event *hwc = &event->hw;
 	unsigned long flags;
 
-	spin_lock_irqsave(&pmu->lock, flags);
+	raw_spin_lock_irqsave(&pmu->lock, flags);
 
 	hwc->state = PERF_HES_UPTODATE | PERF_HES_STOPPED;
 
 	if (mode & PERF_EF_START)
 		__rapl_pmu_event_start(pmu, event);
 
-	spin_unlock_irqrestore(&pmu->lock, flags);
+	raw_spin_unlock_irqrestore(&pmu->lock, flags);
 
 	return 0;
 }
@@ -567,7 +567,7 @@ static int rapl_cpu_prepare(int cpu)
 	pmu = kzalloc_node(sizeof(*pmu), GFP_KERNEL, cpu_to_node(cpu));
 	if (!pmu)
 		return -1;
-	spin_lock_init(&pmu->lock);
+	raw_spin_lock_init(&pmu->lock);
 
 	INIT_LIST_HEAD(&pmu->active_list);
 

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [patch V3 25/28] x86/perf/intel/rapl: Utilize event->pmu_private
  2016-02-22 22:19 [patch V3 00/28] x86/perf/intel/uncore|rapl: Fix error handling and sanitize pmu management Thomas Gleixner
                   ` (22 preceding siblings ...)
  2016-02-22 22:19 ` [patch V3 23/28] x86/perf/intel/rapl: Refactor code some more Thomas Gleixner
@ 2016-02-22 22:19 ` Thomas Gleixner
  2016-02-29 11:12   ` [tip:perf/core] perf/x86/intel/rapl: " tip-bot for Thomas Gleixner
  2016-02-22 22:19 ` [patch V3 24/28] x86/perf/intel/rapl: Make pmu lock raw Thomas Gleixner
                   ` (3 subsequent siblings)
  27 siblings, 1 reply; 58+ messages in thread
From: Thomas Gleixner @ 2016-02-22 22:19 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, x86, Borislav Petkov, Stephane Eranian,
	Harish Chegondi, Kan Liang, Andi Kleen, Jacob Pan

[-- Attachment #1: x86-perf-intel-rapl--Utilize-event->pmu_private.patch --]
[-- Type: text/plain, Size: 2966 bytes --]

Store the pmu in event->pmu_private and use it instead of the per cpu
data. Preparatory step to get rid of the per cpu allocations. The usage sites
are the perf fast path, so we keep that even after the conversion to per
package storage as a cpu to package lookup involves 3 loads versus 1 with the
pmu_private pointer.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/cpu/perf_event_intel_rapl.c |   16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

--- a/arch/x86/kernel/cpu/perf_event_intel_rapl.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_rapl.c
@@ -122,6 +122,7 @@ static struct perf_pmu_events_attr event
 struct rapl_pmu {
 	raw_spinlock_t		lock;
 	int			n_active;
+	int			cpu;
 	struct list_head	active_list;
 	struct pmu		*pmu;
 	ktime_t			timer_interval;
@@ -203,7 +204,7 @@ static void rapl_start_hrtimer(struct ra
 
 static enum hrtimer_restart rapl_hrtimer_handle(struct hrtimer *hrtimer)
 {
-	struct rapl_pmu *pmu = __this_cpu_read(rapl_pmu);
+	struct rapl_pmu *pmu = container_of(hrtimer, struct rapl_pmu, hrtimer);
 	struct perf_event *event;
 	unsigned long flags;
 
@@ -249,7 +250,7 @@ static void __rapl_pmu_event_start(struc
 
 static void rapl_pmu_event_start(struct perf_event *event, int mode)
 {
-	struct rapl_pmu *pmu = __this_cpu_read(rapl_pmu);
+	struct rapl_pmu *pmu = event->pmu_private;
 	unsigned long flags;
 
 	raw_spin_lock_irqsave(&pmu->lock, flags);
@@ -259,7 +260,7 @@ static void rapl_pmu_event_start(struct
 
 static void rapl_pmu_event_stop(struct perf_event *event, int mode)
 {
-	struct rapl_pmu *pmu = __this_cpu_read(rapl_pmu);
+	struct rapl_pmu *pmu = event->pmu_private;
 	struct hw_perf_event *hwc = &event->hw;
 	unsigned long flags;
 
@@ -293,7 +294,7 @@ static void rapl_pmu_event_stop(struct p
 
 static int rapl_pmu_event_add(struct perf_event *event, int mode)
 {
-	struct rapl_pmu *pmu = __this_cpu_read(rapl_pmu);
+	struct rapl_pmu *pmu = event->pmu_private;
 	struct hw_perf_event *hwc = &event->hw;
 	unsigned long flags;
 
@@ -316,6 +317,7 @@ static void rapl_pmu_event_del(struct pe
 
 static int rapl_pmu_event_init(struct perf_event *event)
 {
+	struct rapl_pmu *pmu = __this_cpu_read(rapl_pmu);
 	u64 cfg = event->attr.config & RAPL_EVENT_MASK;
 	int bit, msr, ret = 0;
 
@@ -327,6 +329,9 @@ static int rapl_pmu_event_init(struct pe
 	if (event->attr.config & ~RAPL_EVENT_MASK)
 		return -EINVAL;
 
+	if (event->cpu < 0)
+		return -EINVAL;
+
 	/*
 	 * check event is known (determines counter)
 	 */
@@ -365,6 +370,8 @@ static int rapl_pmu_event_init(struct pe
 		return -EINVAL;
 
 	/* must be done before validate_group */
+	event->cpu = pmu->cpu;
+	event->pmu_private = pmu;
 	event->hw.event_base = msr;
 	event->hw.config = cfg;
 	event->hw.idx = bit;
@@ -572,6 +579,7 @@ static int rapl_cpu_prepare(int cpu)
 	INIT_LIST_HEAD(&pmu->active_list);
 
 	pmu->pmu = &rapl_pmu_class;
+	pmu->cpu = cpu;
 
 	pmu->timer_interval = ms_to_ktime(rapl_timer_ms);
 

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [patch V3 26/28] x86/perf/intel/rapl: Convert it to a per package facility
  2016-02-22 22:19 [patch V3 00/28] x86/perf/intel/uncore|rapl: Fix error handling and sanitize pmu management Thomas Gleixner
                   ` (24 preceding siblings ...)
  2016-02-22 22:19 ` [patch V3 24/28] x86/perf/intel/rapl: Make pmu lock raw Thomas Gleixner
@ 2016-02-22 22:19 ` Thomas Gleixner
  2016-02-29 11:12   ` [tip:perf/core] perf/x86/intel/rapl: " tip-bot for Thomas Gleixner
  2016-02-22 22:19 ` [patch V3 27/28] perf: Export perf_event_sysfs_show Thomas Gleixner
  2016-02-22 22:19 ` [patch V3 28/28] x86/perf/intel/rapl: Make it modular Thomas Gleixner
  27 siblings, 1 reply; 58+ messages in thread
From: Thomas Gleixner @ 2016-02-22 22:19 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, x86, Borislav Petkov, Stephane Eranian,
	Harish Chegondi, Kan Liang, Andi Kleen, Jacob Pan

[-- Attachment #1: x86-perf-intel-rapl--Convert-it-to-a-per-package-facility.patch --]
[-- Type: text/plain, Size: 8306 bytes --]

RAPL is a per package facility and we already have a mechanism for a dedicated
per package reader. So there is no point to have multiple CPUs doing the
same. The current implementation actually starts two timers on two cpus if one
does:

	perf stat -C1,2 -e -e power/energy-pkg ....

which makes the whole concept of 1 reader per package moot.

What's worse is that the above returns the double of the actual energy
consumption, but that's a different problem to address and cannot be solved by
removing the pointless per cpuness of that mechanism.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/cpu/perf_event_intel_rapl.c |  196 ++++++++++++----------------
 1 file changed, 87 insertions(+), 109 deletions(-)

Index: b/arch/x86/kernel/cpu/perf_event_intel_rapl.c
===================================================================
--- a/arch/x86/kernel/cpu/perf_event_intel_rapl.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_rapl.c
@@ -129,15 +129,23 @@ struct rapl_pmu {
 	struct hrtimer		hrtimer;
 };
 
+struct rapl_pmus {
+	struct pmu		pmu;
+	unsigned int		maxpkg;
+	struct rapl_pmu		*pmus[];
+};
+
  /* 1/2^hw_unit Joule */
 static int rapl_hw_unit[NR_RAPL_DOMAINS] __read_mostly;
-static struct pmu rapl_pmu_class;
+static struct rapl_pmus *rapl_pmus;
 static cpumask_t rapl_cpu_mask;
-static int rapl_cntr_mask;
+static unsigned int rapl_cntr_mask;
 static u64 rapl_timer_ms;
 
-static DEFINE_PER_CPU(struct rapl_pmu *, rapl_pmu);
-static DEFINE_PER_CPU(struct rapl_pmu *, rapl_pmu_to_free);
+static inline struct rapl_pmu *cpu_to_rapl_pmu(unsigned int cpu)
+{
+	return rapl_pmus->pmus[topology_logical_package_id(cpu)];
+}
 
 static inline u64 rapl_read_counter(struct perf_event *event)
 {
@@ -317,12 +325,12 @@ static void rapl_pmu_event_del(struct pe
 
 static int rapl_pmu_event_init(struct perf_event *event)
 {
-	struct rapl_pmu *pmu = __this_cpu_read(rapl_pmu);
 	u64 cfg = event->attr.config & RAPL_EVENT_MASK;
 	int bit, msr, ret = 0;
+	struct rapl_pmu *pmu;
 
 	/* only look at RAPL events */
-	if (event->attr.type != rapl_pmu_class.type)
+	if (event->attr.type != rapl_pmus->pmu.type)
 		return -ENOENT;
 
 	/* check only supported bits are set */
@@ -370,6 +378,7 @@ static int rapl_pmu_event_init(struct pe
 		return -EINVAL;
 
 	/* must be done before validate_group */
+	pmu = cpu_to_rapl_pmu(event->cpu);
 	event->cpu = pmu->cpu;
 	event->pmu_private = pmu;
 	event->hw.event_base = msr;
@@ -502,116 +511,62 @@ const struct attribute_group *rapl_attr_
 	NULL,
 };
 
-static struct pmu rapl_pmu_class = {
-	.attr_groups	= rapl_attr_groups,
-	.task_ctx_nr	= perf_invalid_context, /* system-wide only */
-	.event_init	= rapl_pmu_event_init,
-	.add		= rapl_pmu_event_add, /* must have */
-	.del		= rapl_pmu_event_del, /* must have */
-	.start		= rapl_pmu_event_start,
-	.stop		= rapl_pmu_event_stop,
-	.read		= rapl_pmu_event_read,
-};
-
 static void rapl_cpu_exit(int cpu)
 {
-	struct rapl_pmu *pmu = per_cpu(rapl_pmu, cpu);
-	int i, phys_id = topology_physical_package_id(cpu);
-	int target = -1;
-
-	/* find a new cpu on same package */
-	for_each_online_cpu(i) {
-		if (i == cpu)
-			continue;
-		if (phys_id == topology_physical_package_id(i)) {
-			target = i;
-			break;
-		}
-	}
-	/*
-	 * clear cpu from cpumask
-	 * if was set in cpumask and still some cpu on package,
-	 * then move to new cpu
-	 */
-	if (cpumask_test_and_clear_cpu(cpu, &rapl_cpu_mask) && target >= 0)
-		cpumask_set_cpu(target, &rapl_cpu_mask);
+	struct rapl_pmu *pmu = cpu_to_rapl_pmu(cpu);
+	int target;
 
-	WARN_ON(cpumask_empty(&rapl_cpu_mask));
-	/*
-	 * migrate events and context to new cpu
-	 */
-	if (target >= 0)
-		perf_pmu_migrate_context(pmu->pmu, cpu, target);
+	/* Check if exiting cpu is used for collecting rapl events */
+	if (!cpumask_test_and_clear_cpu(cpu, &rapl_cpu_mask))
+		return;
+
+	pmu->cpu = -1;
+	/* Find a new cpu to collect rapl events */
+	target = cpumask_any_but(topology_core_cpumask(cpu), cpu);
 
-	/* cancel overflow polling timer for CPU */
-	hrtimer_cancel(&pmu->hrtimer);
+	/* Migrate rapl events to the new target */
+	if (target < nr_cpu_ids) {
+		cpumask_set_cpu(target, &rapl_cpu_mask);
+		pmu->cpu = target;
+		perf_pmu_migrate_context(pmu->pmu, cpu, target);
+	}
 }
 
 static void rapl_cpu_init(int cpu)
 {
-	int i, phys_id = topology_physical_package_id(cpu);
+	struct rapl_pmu *pmu = cpu_to_rapl_pmu(cpu);
+	int target;
+
+	/*
+	 * Check if there is an online cpu in the package which collects rapl
+	 * events already.
+	 */
+	target = cpumask_any_and(&rapl_cpu_mask, topology_core_cpumask(cpu));
+	if (target < nr_cpu_ids)
+		return;
 
-	/* check if phys_is is already covered */
-	for_each_cpu(i, &rapl_cpu_mask) {
-		if (phys_id == topology_physical_package_id(i))
-			return;
-	}
-	/* was not found, so add it */
 	cpumask_set_cpu(cpu, &rapl_cpu_mask);
+	pmu->cpu = cpu;
 }
 
 static int rapl_cpu_prepare(int cpu)
 {
-	struct rapl_pmu *pmu = per_cpu(rapl_pmu, cpu);
-	int phys_id = topology_physical_package_id(cpu);
+	struct rapl_pmu *pmu = cpu_to_rapl_pmu(cpu);
 
 	if (pmu)
 		return 0;
 
-	if (phys_id < 0)
-		return -1;
-
 	pmu = kzalloc_node(sizeof(*pmu), GFP_KERNEL, cpu_to_node(cpu));
 	if (!pmu)
-		return -1;
-	raw_spin_lock_init(&pmu->lock);
+		return -ENOMEM;
 
+	raw_spin_lock_init(&pmu->lock);
 	INIT_LIST_HEAD(&pmu->active_list);
-
-	pmu->pmu = &rapl_pmu_class;
-	pmu->cpu = cpu;
-
+	pmu->pmu = &rapl_pmus->pmu;
 	pmu->timer_interval = ms_to_ktime(rapl_timer_ms);
-
+	pmu->cpu = -1;
 	rapl_hrtimer_init(pmu);
-
-	/* set RAPL pmu for this cpu for now */
-	per_cpu(rapl_pmu, cpu) = pmu;
-	per_cpu(rapl_pmu_to_free, cpu) = NULL;
-
-	return 0;
-}
-
-static void rapl_cpu_kfree(int cpu)
-{
-	struct rapl_pmu *pmu = per_cpu(rapl_pmu_to_free, cpu);
-
-	kfree(pmu);
-
-	per_cpu(rapl_pmu_to_free, cpu) = NULL;
-}
-
-static int rapl_cpu_dying(int cpu)
-{
-	struct rapl_pmu *pmu = per_cpu(rapl_pmu, cpu);
-
-	if (!pmu)
-		return 0;
-
-	per_cpu(rapl_pmu, cpu) = NULL;
-
-	per_cpu(rapl_pmu_to_free, cpu) = pmu;
-
+	rapl_pmus->pmus[topology_logical_package_id(cpu)] = pmu;
 	return 0;
 }
 
@@ -624,24 +579,16 @@ static int rapl_cpu_notifier(struct noti
 	case CPU_UP_PREPARE:
 		rapl_cpu_prepare(cpu);
 		break;
-	case CPU_STARTING:
-		rapl_cpu_init(cpu);
-		break;
-	case CPU_UP_CANCELED:
-	case CPU_DYING:
-		rapl_cpu_dying(cpu);
-		break;
+
+	case CPU_DOWN_FAILED:
 	case CPU_ONLINE:
-	case CPU_DEAD:
-		rapl_cpu_kfree(cpu);
+		rapl_cpu_init(cpu);
 		break;
+
 	case CPU_DOWN_PREPARE:
 		rapl_cpu_exit(cpu);
 		break;
-	default:
-		break;
 	}
-
 	return NOTIFY_OK;
 }
 
@@ -703,10 +650,14 @@ static void __init rapl_advertise(void)
 
 static int __init rapl_prepare_cpus(void)
 {
-	unsigned int cpu;
+	unsigned int cpu, pkg;
 	int ret;
 
 	for_each_online_cpu(cpu) {
+		pkg = topology_logical_package_id(cpu);
+		if (rapl_pmus->pmus[pkg])
+			continue;
+
 		ret = rapl_cpu_prepare(cpu);
 		if (ret)
 			return ret;
@@ -717,10 +668,33 @@ static int __init rapl_prepare_cpus(void
 
 static void __init cleanup_rapl_pmus(void)
 {
-	int cpu;
+	int i;
 
-	for_each_online_cpu(cpu)
-		kfree(per_cpu(rapl_pmu, cpu));
+	for (i = 0; i < rapl_pmus->maxpkg; i++)
+		kfree(rapl_pmus->pmus + i);
+	kfree(rapl_pmus);
+}
+
+static int __init init_rapl_pmus(void)
+{
+	int maxpkg = topology_max_packages();
+	size_t size;
+
+	size = sizeof(*rapl_pmus) + maxpkg * sizeof(struct rapl_pmu *);
+	rapl_pmus = kzalloc(size, GFP_KERNEL);
+	if (!rapl_pmus)
+		return -ENOMEM;
+
+	rapl_pmus->maxpkg		= maxpkg;
+	rapl_pmus->pmu.attr_groups	= rapl_attr_groups;
+	rapl_pmus->pmu.task_ctx_nr	= perf_invalid_context;
+	rapl_pmus->pmu.event_init	= rapl_pmu_event_init;
+	rapl_pmus->pmu.add		= rapl_pmu_event_add;
+	rapl_pmus->pmu.del		= rapl_pmu_event_del;
+	rapl_pmus->pmu.start		= rapl_pmu_event_start;
+	rapl_pmus->pmu.stop		= rapl_pmu_event_stop;
+	rapl_pmus->pmu.read		= rapl_pmu_event_read;
+	return 0;
 }
 
 static const struct x86_cpu_id rapl_cpu_match[] __initconst = {
@@ -771,13 +745,17 @@ static int __init rapl_pmu_init(void)
 	if (ret)
 		return ret;
 
+	ret = init_rapl_pmus();
+	if (ret)
+		return ret;
+
 	cpu_notifier_register_begin();
 
 	ret = rapl_prepare_cpus();
 	if (ret)
 		goto out;
 
-	ret = perf_pmu_register(&rapl_pmu_class, "power", -1);
+	ret = perf_pmu_register(&rapl_pmus->pmu, "power", -1);
 	if (ret)
 		goto out;
 

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [patch V3 27/28] perf: Export perf_event_sysfs_show
  2016-02-22 22:19 [patch V3 00/28] x86/perf/intel/uncore|rapl: Fix error handling and sanitize pmu management Thomas Gleixner
                   ` (25 preceding siblings ...)
  2016-02-22 22:19 ` [patch V3 26/28] x86/perf/intel/rapl: Convert it to a per package facility Thomas Gleixner
@ 2016-02-22 22:19 ` Thomas Gleixner
  2016-02-29 11:13   ` [tip:perf/core] perf: Export perf_event_sysfs_show() tip-bot for Thomas Gleixner
  2016-02-22 22:19 ` [patch V3 28/28] x86/perf/intel/rapl: Make it modular Thomas Gleixner
  27 siblings, 1 reply; 58+ messages in thread
From: Thomas Gleixner @ 2016-02-22 22:19 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, x86, Borislav Petkov, Stephane Eranian,
	Harish Chegondi, Kan Liang, Andi Kleen, Jacob Pan

[-- Attachment #1: perf--Export-perf_event_sysfs_show.patch --]
[-- Type: text/plain, Size: 394 bytes --]

Required to use it in modular perf drivers.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 kernel/events/core.c |    1 +
 1 file changed, 1 insertion(+)

--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -9335,6 +9335,7 @@ ssize_t perf_event_sysfs_show(struct dev
 
 	return 0;
 }
+EXPORT_SYMBOL_GPL(perf_event_sysfs_show);
 
 static int __init perf_event_sysfs_init(void)
 {

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [patch V3 28/28] x86/perf/intel/rapl: Make it modular
  2016-02-22 22:19 [patch V3 00/28] x86/perf/intel/uncore|rapl: Fix error handling and sanitize pmu management Thomas Gleixner
                   ` (26 preceding siblings ...)
  2016-02-22 22:19 ` [patch V3 27/28] perf: Export perf_event_sysfs_show Thomas Gleixner
@ 2016-02-22 22:19 ` Thomas Gleixner
  27 siblings, 0 replies; 58+ messages in thread
From: Thomas Gleixner @ 2016-02-22 22:19 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, x86, Borislav Petkov, Stephane Eranian,
	Harish Chegondi, Kan Liang, Andi Kleen, Jacob Pan

[-- Attachment #1: x86-perf-intel-rapl--Make-it-modular.patch --]
[-- Type: text/plain, Size: 3054 bytes --]

Add the necessary exit functions so it can be built as a module.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/Kconfig.perf                       |   13 +++++++++++++
 arch/x86/kernel/cpu/Makefile                |    4 +++-
 arch/x86/kernel/cpu/perf_event_intel_rapl.c |   22 +++++++++++++++++++---
 3 files changed, 35 insertions(+), 4 deletions(-)

--- a/arch/x86/Kconfig.perf
+++ b/arch/x86/Kconfig.perf
@@ -13,4 +13,17 @@ config PERF_EVENTS_INTEL_UNCORE
 
 	  If unsure say y.
 
+config PERF_EVENTS_INTEL_RAPL
+	tristate "Intel rapl performance events"
+	depends on PERF_EVENTS && CPU_SUP_INTEL && PCI
+	default y
+	---help---
+	  Include support for Intel rapl performance events for power
+	  monitoring on modern processors.
+
+	  Note: Selecting 'm' might break existing setups as the drivers
+	  lack the autoprobe/load magic. If you need them select: y.
+
+	  If unsure say y.
+
 endmenu
--- a/arch/x86/kernel/cpu/Makefile
+++ b/arch/x86/kernel/cpu/Makefile
@@ -39,10 +39,12 @@ obj-$(CONFIG_CPU_SUP_AMD)		+= perf_event
 endif
 obj-$(CONFIG_CPU_SUP_INTEL)		+= perf_event_p6.o perf_event_knc.o perf_event_p4.o
 obj-$(CONFIG_CPU_SUP_INTEL)		+= perf_event_intel_lbr.o perf_event_intel_ds.o perf_event_intel.o
-obj-$(CONFIG_CPU_SUP_INTEL)		+= perf_event_intel_rapl.o perf_event_intel_cqm.o
+obj-$(CONFIG_CPU_SUP_INTEL)		+= perf_event_intel_cqm.o
 obj-$(CONFIG_CPU_SUP_INTEL)		+= perf_event_intel_pt.o perf_event_intel_bts.o
 obj-$(CONFIG_CPU_SUP_INTEL)		+= perf_event_intel_cstate.o
 
+obj-$(CONFIG_PERF_EVENTS_INTEL_RAPL)	+= perf_event_intel_rapl.o
+
 obj-$(CONFIG_PERF_EVENTS_INTEL_UNCORE)	+= perf_event_intel_uncores.o
 perf_event_intel_uncores-objs		:= perf_event_intel_uncore.o \
 					   perf_event_intel_uncore_snb.o \
--- a/arch/x86/kernel/cpu/perf_event_intel_rapl.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_rapl.c
@@ -53,6 +53,8 @@
 #include <asm/cpu_device_id.h>
 #include "perf_event.h"
 
+MODULE_LICENSE("GPL");
+
 /*
  * RAPL energy status counters
  */
@@ -592,6 +594,10 @@ static int rapl_cpu_notifier(struct noti
 	return NOTIFY_OK;
 }
 
+static struct notifier_block rapl_cpu_nb = {
+	.notifier_call	= rapl_cpu_notifier,
+};
+
 static __init void rapl_hsw_server_quirk(void)
 {
 	/*
@@ -666,7 +672,7 @@ static int __init rapl_prepare_cpus(void
 	return 0;
 }
 
-static void __init cleanup_rapl_pmus(void)
+static void cleanup_rapl_pmus(void)
 {
 	int i;
 
@@ -759,7 +765,7 @@ static int __init rapl_pmu_init(void)
 	if (ret)
 		goto out;
 
-	__perf_cpu_notifier(rapl_cpu_notifier);
+	__register_cpu_notifier(&rapl_cpu_nb);
 	cpu_notifier_register_done();
 	rapl_advertise();
 	return 0;
@@ -770,4 +776,14 @@ static int __init rapl_pmu_init(void)
 	cpu_notifier_register_done();
 	return ret;
 }
-device_initcall(rapl_pmu_init);
+module_init(rapl_pmu_init);
+
+static void __exit intel_rapl_exit(void)
+{
+	cpu_notifier_register_done();
+	__unregister_cpu_notifier(&rapl_cpu_nb);
+	perf_pmu_unregister(&rapl_pmus->pmu);
+	cleanup_rapl_pmus();
+	cpu_notifier_register_done();
+}
+module_exit(intel_rapl_exit);

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [tip:perf/core] perf/x86/intel/uncore: Remove pointless mask check
  2016-02-22 22:19 ` [patch V3 01/28] x86/perf/intel/uncore: Remove pointless mask check Thomas Gleixner
@ 2016-02-29 11:03   ` tip-bot for Thomas Gleixner
  0 siblings, 0 replies; 58+ messages in thread
From: tip-bot for Thomas Gleixner @ 2016-02-29 11:03 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: bp, harish.chegondi, linux-kernel, eranian, hpa, andi.kleen,
	kan.liang, torvalds, tglx, mingo, jolsa, peterz, vincent.weaver,
	acme, jacob.jun.pan

Commit-ID:  3ccca9eca67ec68461a5266832d02aee538599b2
Gitweb:     http://git.kernel.org/tip/3ccca9eca67ec68461a5266832d02aee538599b2
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Mon, 22 Feb 2016 22:19:08 +0000
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Mon, 29 Feb 2016 09:35:12 +0100

perf/x86/intel/uncore: Remove pointless mask check

uncore_cpumask_init() is only ever called from intel_uncore_init() where the
mask is guaranteed to be empty.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Harish Chegondi <harish.chegondi@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/20160222221010.657326866@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/events/intel/uncore.c | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/arch/x86/events/intel/uncore.c b/arch/x86/events/intel/uncore.c
index 91a18d6..c422d52e 100644
--- a/arch/x86/events/intel/uncore.c
+++ b/arch/x86/events/intel/uncore.c
@@ -1342,12 +1342,6 @@ static void __init uncore_cpumask_init(void)
 {
 	int cpu;
 
-	/*
-	 * ony invoke once from msr or pci init code
-	 */
-	if (!cpumask_empty(&uncore_cpu_mask))
-		return;
-
 	cpu_notifier_register_begin();
 
 	for_each_online_cpu(cpu) {

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [tip:perf/core] perf/x86/intel/uncore: Simplify error rollback
  2016-02-22 22:19 ` [patch V3 02/28] x86/perf/intel/uncore: Simplify error rollback Thomas Gleixner
@ 2016-02-29 11:03   ` tip-bot for Thomas Gleixner
  0 siblings, 0 replies; 58+ messages in thread
From: tip-bot for Thomas Gleixner @ 2016-02-29 11:03 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: kan.liang, hpa, linux-kernel, jacob.jun.pan, tglx, torvalds,
	mingo, jolsa, acme, harish.chegondi, bp, peterz, eranian,
	vincent.weaver, andi.kleen

Commit-ID:  ffeda003803213a8d0babefdd6a95fe424884c14
Gitweb:     http://git.kernel.org/tip/ffeda003803213a8d0babefdd6a95fe424884c14
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Mon, 22 Feb 2016 22:19:09 +0000
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Mon, 29 Feb 2016 09:35:13 +0100

perf/x86/intel/uncore: Simplify error rollback

No point in doing partial rollbacks. Robustify uncore_exit_type() so it does
not dereference type->pmus unconditionally and remove all the partial rollback
hackery.

Preparatory patch for proper error handling.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Harish Chegondi <harish.chegondi@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/20160222221010.751077467@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/events/intel/uncore.c | 45 ++++++++++++++++++++++--------------------
 1 file changed, 24 insertions(+), 21 deletions(-)

diff --git a/arch/x86/events/intel/uncore.c b/arch/x86/events/intel/uncore.c
index c422d52e..91facdc 100644
--- a/arch/x86/events/intel/uncore.c
+++ b/arch/x86/events/intel/uncore.c
@@ -767,10 +767,12 @@ static void __init uncore_type_exit(struct intel_uncore_type *type)
 {
 	int i;
 
-	for (i = 0; i < type->num_boxes; i++)
-		free_percpu(type->pmus[i].box);
-	kfree(type->pmus);
-	type->pmus = NULL;
+	if (type->pmus) {
+		for (i = 0; i < type->num_boxes; i++)
+			free_percpu(type->pmus[i].box);
+		kfree(type->pmus);
+		type->pmus = NULL;
+	}
 	kfree(type->events_group);
 	type->events_group = NULL;
 }
@@ -778,6 +780,7 @@ static void __init uncore_type_exit(struct intel_uncore_type *type)
 static void __init uncore_types_exit(struct intel_uncore_type **types)
 {
 	int i;
+
 	for (i = 0; types[i]; i++)
 		uncore_type_exit(types[i]);
 }
@@ -806,7 +809,7 @@ static int __init uncore_type_init(struct intel_uncore_type *type)
 		INIT_LIST_HEAD(&pmus[i].box_list);
 		pmus[i].box = alloc_percpu(struct intel_uncore_box *);
 		if (!pmus[i].box)
-			goto fail;
+			return -ENOMEM;
 	}
 
 	if (type->event_descs) {
@@ -817,7 +820,7 @@ static int __init uncore_type_init(struct intel_uncore_type *type)
 		attr_group = kzalloc(sizeof(struct attribute *) * (i + 1) +
 					sizeof(*attr_group), GFP_KERNEL);
 		if (!attr_group)
-			goto fail;
+			return -ENOMEM;
 
 		attrs = (struct attribute **)(attr_group + 1);
 		attr_group->name = "events";
@@ -831,9 +834,6 @@ static int __init uncore_type_init(struct intel_uncore_type *type)
 
 	type->pmu_group = &uncore_pmu_attr_group;
 	return 0;
-fail:
-	uncore_type_exit(type);
-	return -ENOMEM;
 }
 
 static int __init uncore_types_init(struct intel_uncore_type **types)
@@ -843,13 +843,9 @@ static int __init uncore_types_init(struct intel_uncore_type **types)
 	for (i = 0; types[i]; i++) {
 		ret = uncore_type_init(types[i]);
 		if (ret)
-			goto fail;
+			return ret;
 	}
 	return 0;
-fail:
-	while (--i >= 0)
-		uncore_type_exit(types[i]);
-	return ret;
 }
 
 /*
@@ -1007,17 +1003,21 @@ static int __init uncore_pci_init(void)
 
 	ret = uncore_types_init(uncore_pci_uncores);
 	if (ret)
-		return ret;
+		goto err;
 
 	uncore_pci_driver->probe = uncore_pci_probe;
 	uncore_pci_driver->remove = uncore_pci_remove;
 
 	ret = pci_register_driver(uncore_pci_driver);
-	if (ret == 0)
-		pcidrv_registered = true;
-	else
-		uncore_types_exit(uncore_pci_uncores);
+	if (ret)
+		goto err;
+
+	pcidrv_registered = true;
+	return 0;
 
+err:
+	uncore_types_exit(uncore_pci_uncores);
+	uncore_pci_uncores = empty_uncore;
 	return ret;
 }
 
@@ -1316,9 +1316,12 @@ static int __init uncore_cpu_init(void)
 
 	ret = uncore_types_init(uncore_msr_uncores);
 	if (ret)
-		return ret;
-
+		goto err;
 	return 0;
+err:
+	uncore_types_exit(uncore_msr_uncores);
+	uncore_msr_uncores = empty_uncore;
+	return ret;
 }
 
 static int __init uncore_pmus_register(void)

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [tip:perf/core] perf/x86/intel/uncore: Fix error handling
  2016-02-22 22:19 ` [patch V3 03/28] x86/perf/intel/uncore: Fix error handling Thomas Gleixner
@ 2016-02-29 11:04   ` tip-bot for Thomas Gleixner
  0 siblings, 0 replies; 58+ messages in thread
From: tip-bot for Thomas Gleixner @ 2016-02-29 11:04 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: tglx, peterz, kan.liang, linux-kernel, vincent.weaver, eranian,
	bp, jolsa, hpa, mingo, harish.chegondi, andi.kleen, acme,
	torvalds, jacob.jun.pan

Commit-ID:  4f089678d071781851c3b73c41e55a3765b6a5ee
Gitweb:     http://git.kernel.org/tip/4f089678d071781851c3b73c41e55a3765b6a5ee
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Mon, 22 Feb 2016 22:19:09 +0000
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Mon, 29 Feb 2016 09:35:14 +0100

perf/x86/intel/uncore: Fix error handling

This driver lacks any form of proper error handling. If initialization fails
or hotplug prepare fails, it lets the facility with half initialized stuff
around.

Fix the state and memory leaks in a first step. As a second step we need to
undo the hardware state which is set via uncore_box_init() on some of the
uncore implementations.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Harish Chegondi <harish.chegondi@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/20160222221010.848880559@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/events/intel/uncore.c | 122 +++++++++++++++++++++++++++++------------
 arch/x86/events/intel/uncore.h |  15 ++---
 2 files changed, 94 insertions(+), 43 deletions(-)

diff --git a/arch/x86/events/intel/uncore.c b/arch/x86/events/intel/uncore.c
index 91facdc..25e6203 100644
--- a/arch/x86/events/intel/uncore.c
+++ b/arch/x86/events/intel/uncore.c
@@ -38,6 +38,16 @@ int uncore_pcibus_to_physid(struct pci_bus *bus)
 	return phys_id;
 }
 
+static void uncore_free_pcibus_map(void)
+{
+	struct pci2phy_map *map, *tmp;
+
+	list_for_each_entry_safe(map, tmp, &pci2phy_map_head, list) {
+		list_del(&map->list);
+		kfree(map);
+	}
+}
+
 struct pci2phy_map *__find_pci2phy_map(int segment)
 {
 	struct pci2phy_map *map, *alloc = NULL;
@@ -760,16 +770,28 @@ static int uncore_pmu_register(struct intel_uncore_pmu *pmu)
 	}
 
 	ret = perf_pmu_register(&pmu->pmu, pmu->name, -1);
+	if (!ret)
+		pmu->registered = true;
 	return ret;
 }
 
+static void uncore_pmu_unregister(struct intel_uncore_pmu *pmu)
+{
+	if (!pmu->registered)
+		return;
+	perf_pmu_unregister(&pmu->pmu);
+	pmu->registered = false;
+}
+
 static void __init uncore_type_exit(struct intel_uncore_type *type)
 {
 	int i;
 
 	if (type->pmus) {
-		for (i = 0; i < type->num_boxes; i++)
+		for (i = 0; i < type->num_boxes; i++) {
+			uncore_pmu_unregister(&type->pmus[i]);
 			free_percpu(type->pmus[i].box);
+		}
 		kfree(type->pmus);
 		type->pmus = NULL;
 	}
@@ -856,8 +878,8 @@ static int uncore_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id
 	struct intel_uncore_pmu *pmu;
 	struct intel_uncore_box *box;
 	struct intel_uncore_type *type;
-	int phys_id;
 	bool first_box = false;
+	int phys_id, ret;
 
 	phys_id = uncore_pcibus_to_physid(pdev->bus);
 	if (phys_id < 0)
@@ -906,9 +928,18 @@ static int uncore_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id
 	list_add_tail(&box->list, &pmu->box_list);
 	raw_spin_unlock(&uncore_box_lock);
 
-	if (first_box)
-		uncore_pmu_register(pmu);
-	return 0;
+	if (!first_box)
+		return 0;
+
+	ret = uncore_pmu_register(pmu);
+	if (ret) {
+		pci_set_drvdata(pdev, NULL);
+		raw_spin_lock(&uncore_box_lock);
+		list_del(&box->list);
+		raw_spin_unlock(&uncore_box_lock);
+		kfree(box);
+	}
+	return ret;
 }
 
 static void uncore_pci_remove(struct pci_dev *pdev)
@@ -954,7 +985,7 @@ static void uncore_pci_remove(struct pci_dev *pdev)
 	kfree(box);
 
 	if (last_box)
-		perf_pmu_unregister(&pmu->pmu);
+		uncore_pmu_unregister(pmu);
 }
 
 static int __init uncore_pci_init(void)
@@ -1018,6 +1049,7 @@ static int __init uncore_pci_init(void)
 err:
 	uncore_types_exit(uncore_pci_uncores);
 	uncore_pci_uncores = empty_uncore;
+	uncore_free_pcibus_map();
 	return ret;
 }
 
@@ -1027,6 +1059,7 @@ static void __init uncore_pci_exit(void)
 		pcidrv_registered = false;
 		pci_unregister_driver(uncore_pci_driver);
 		uncore_types_exit(uncore_pci_uncores);
+		uncore_free_pcibus_map();
 	}
 }
 
@@ -1223,8 +1256,7 @@ static int uncore_cpu_notifier(struct notifier_block *self,
 	/* allocate/free data structure for uncore box */
 	switch (action & ~CPU_TASKS_FROZEN) {
 	case CPU_UP_PREPARE:
-		uncore_cpu_prepare(cpu, -1);
-		break;
+		return notifier_from_errno(uncore_cpu_prepare(cpu, -1));
 	case CPU_STARTING:
 		uncore_cpu_starting(cpu);
 		break;
@@ -1265,9 +1297,29 @@ static struct notifier_block uncore_cpu_nb = {
 	.priority	= CPU_PRI_PERF + 1,
 };
 
-static void __init uncore_cpu_setup(void *dummy)
+static int __init type_pmu_register(struct intel_uncore_type *type)
 {
-	uncore_cpu_starting(smp_processor_id());
+	int i, ret;
+
+	for (i = 0; i < type->num_boxes; i++) {
+		ret = uncore_pmu_register(&type->pmus[i]);
+		if (ret)
+			return ret;
+	}
+	return 0;
+}
+
+static int __init uncore_msr_pmus_register(void)
+{
+	struct intel_uncore_type **types = uncore_msr_uncores;
+	int ret;
+
+	while (*types) {
+		ret = type_pmu_register(*types++);
+		if (ret)
+			return ret;
+	}
+	return 0;
 }
 
 static int __init uncore_cpu_init(void)
@@ -1317,6 +1369,10 @@ static int __init uncore_cpu_init(void)
 	ret = uncore_types_init(uncore_msr_uncores);
 	if (ret)
 		goto err;
+
+	ret = uncore_msr_pmus_register();
+	if (ret)
+		goto err;
 	return 0;
 err:
 	uncore_types_exit(uncore_msr_uncores);
@@ -1324,26 +1380,14 @@ err:
 	return ret;
 }
 
-static int __init uncore_pmus_register(void)
+static void __init uncore_cpu_setup(void *dummy)
 {
-	struct intel_uncore_pmu *pmu;
-	struct intel_uncore_type *type;
-	int i, j;
-
-	for (i = 0; uncore_msr_uncores[i]; i++) {
-		type = uncore_msr_uncores[i];
-		for (j = 0; j < type->num_boxes; j++) {
-			pmu = &type->pmus[j];
-			uncore_pmu_register(pmu);
-		}
-	}
-
-	return 0;
+	uncore_cpu_starting(smp_processor_id());
 }
 
-static void __init uncore_cpumask_init(void)
+static int __init uncore_cpumask_init(void)
 {
-	int cpu;
+	int cpu, ret = 0;
 
 	cpu_notifier_register_begin();
 
@@ -1359,17 +1403,20 @@ static void __init uncore_cpumask_init(void)
 		if (phys_id < 0)
 			continue;
 
-		uncore_cpu_prepare(cpu, phys_id);
+		ret = uncore_cpu_prepare(cpu, phys_id);
+		if (ret)
+			goto out;
 		uncore_event_init_cpu(cpu);
 	}
 	on_each_cpu(uncore_cpu_setup, NULL, 1);
 
 	__register_cpu_notifier(&uncore_cpu_nb);
 
+out:
 	cpu_notifier_register_done();
+	return ret;
 }
 
-
 static int __init intel_uncore_init(void)
 {
 	int ret;
@@ -1382,17 +1429,20 @@ static int __init intel_uncore_init(void)
 
 	ret = uncore_pci_init();
 	if (ret)
-		goto fail;
+		return ret;
 	ret = uncore_cpu_init();
-	if (ret) {
-		uncore_pci_exit();
-		goto fail;
-	}
-	uncore_cpumask_init();
+	if (ret)
+		goto errpci;
+	ret = uncore_cpumask_init();
+	if (ret)
+		goto errcpu;
 
-	uncore_pmus_register();
 	return 0;
-fail:
+
+errcpu:
+	uncore_types_exit(uncore_msr_uncores);
+errpci:
+	uncore_pci_exit();
 	return ret;
 }
 device_initcall(intel_uncore_init);
diff --git a/arch/x86/events/intel/uncore.h b/arch/x86/events/intel/uncore.h
index 6a1340c..a18cc7f 100644
--- a/arch/x86/events/intel/uncore.h
+++ b/arch/x86/events/intel/uncore.h
@@ -73,13 +73,14 @@ struct intel_uncore_ops {
 };
 
 struct intel_uncore_pmu {
-	struct pmu pmu;
-	char name[UNCORE_PMU_NAME_LEN];
-	int pmu_idx;
-	int func_id;
-	struct intel_uncore_type *type;
-	struct intel_uncore_box ** __percpu box;
-	struct list_head box_list;
+	struct pmu			pmu;
+	char				name[UNCORE_PMU_NAME_LEN];
+	int				pmu_idx;
+	int				func_id;
+	bool				registered;
+	struct intel_uncore_type	*type;
+	struct intel_uncore_box		** __percpu box;
+	struct list_head		box_list;
 };
 
 struct intel_uncore_extra_reg {

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [tip:perf/core] perf/x86/intel/uncore: Add sanity checks for PCI dev package id
  2016-02-22 22:19 ` [patch V3 04/28] x86/perf/intel/uncore: Add sanity checks for pci dev package id Thomas Gleixner
@ 2016-02-29 11:04   ` tip-bot for Thomas Gleixner
  0 siblings, 0 replies; 58+ messages in thread
From: tip-bot for Thomas Gleixner @ 2016-02-29 11:04 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: bp, andi.kleen, jolsa, vincent.weaver, eranian, hpa, torvalds,
	linux-kernel, peterz, tglx, acme, kan.liang, jacob.jun.pan,
	mingo, harish.chegondi

Commit-ID:  83f8ebd2eb459b21e8e74d43c75e1be666ec8b97
Gitweb:     http://git.kernel.org/tip/83f8ebd2eb459b21e8e74d43c75e1be666ec8b97
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Mon, 22 Feb 2016 22:19:10 +0000
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Mon, 29 Feb 2016 09:35:14 +0100

perf/x86/intel/uncore: Add sanity checks for PCI dev package id

The storage array is size limited, but misses a sanity check

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Harish Chegondi <harish.chegondi@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/20160222221010.929967806@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/events/intel/uncore.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/events/intel/uncore.c b/arch/x86/events/intel/uncore.c
index 25e6203..42ea435 100644
--- a/arch/x86/events/intel/uncore.c
+++ b/arch/x86/events/intel/uncore.c
@@ -882,7 +882,7 @@ static int uncore_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id
 	int phys_id, ret;
 
 	phys_id = uncore_pcibus_to_physid(pdev->bus);
-	if (phys_id < 0)
+	if (phys_id < 0 || phys_id >= UNCORE_SOCKET_MAX)
 		return -ENODEV;
 
 	if (UNCORE_PCI_DEV_TYPE(id->driver_data) == UNCORE_EXTRA_PCI_DEV) {

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [tip:perf/core] perf/x86/intel/uncore: Clean up hardware on exit
  2016-02-22 22:19 ` [patch V3 05/28] x86/perf/intel_uncore: Cleanup hardware on exit Thomas Gleixner
@ 2016-02-29 11:05   ` tip-bot for Thomas Gleixner
  0 siblings, 0 replies; 58+ messages in thread
From: tip-bot for Thomas Gleixner @ 2016-02-29 11:05 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: eranian, kan.liang, torvalds, jolsa, hpa, acme, bp, mingo,
	vincent.weaver, andi.kleen, harish.chegondi, tglx, linux-kernel,
	jacob.jun.pan, peterz

Commit-ID:  a46195f1782e94ad07e7485fb15b80db7e60e7aa
Gitweb:     http://git.kernel.org/tip/a46195f1782e94ad07e7485fb15b80db7e60e7aa
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Mon, 22 Feb 2016 22:19:11 +0000
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Mon, 29 Feb 2016 09:35:15 +0100

perf/x86/intel/uncore: Clean up hardware on exit

When tearing down the boxes nothing undoes the hardware state which was setup
by box->init_box(). Add a box->exit_box() callback and implement it for the
uncores which have an init_box() callback.

This misses the cleanup in the error exit pathes, but I cannot be bothered to
implement it before cleaning up the rest of the driver, which makes that task
way simpler.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Harish Chegondi <harish.chegondi@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/20160222221011.023930023@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/events/intel/uncore.c       |  6 +++++-
 arch/x86/events/intel/uncore.h       |  9 +++++++++
 arch/x86/events/intel/uncore_nhmex.c |  6 ++++++
 arch/x86/events/intel/uncore_snb.c   | 13 +++++++++++++
 4 files changed, 33 insertions(+), 1 deletion(-)

diff --git a/arch/x86/events/intel/uncore.c b/arch/x86/events/intel/uncore.c
index 42ea435..c3d1cbb 100644
--- a/arch/x86/events/intel/uncore.c
+++ b/arch/x86/events/intel/uncore.c
@@ -937,6 +937,7 @@ static int uncore_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id
 		raw_spin_lock(&uncore_box_lock);
 		list_del(&box->list);
 		raw_spin_unlock(&uncore_box_lock);
+		uncore_box_exit(box);
 		kfree(box);
 	}
 	return ret;
@@ -982,6 +983,7 @@ static void uncore_pci_remove(struct pci_dev *pdev)
 	}
 
 	WARN_ON_ONCE(atomic_read(&box->refcnt) != 1);
+	uncore_box_exit(box);
 	kfree(box);
 
 	if (last_box)
@@ -1091,8 +1093,10 @@ static void uncore_cpu_dying(int cpu)
 			pmu = &type->pmus[j];
 			box = *per_cpu_ptr(pmu->box, cpu);
 			*per_cpu_ptr(pmu->box, cpu) = NULL;
-			if (box && atomic_dec_and_test(&box->refcnt))
+			if (box && atomic_dec_and_test(&box->refcnt)) {
 				list_add(&box->list, &boxes_to_free);
+				uncore_box_exit(box);
+			}
 		}
 	}
 }
diff --git a/arch/x86/events/intel/uncore.h b/arch/x86/events/intel/uncore.h
index a18cc7f..cf2cb1f 100644
--- a/arch/x86/events/intel/uncore.h
+++ b/arch/x86/events/intel/uncore.h
@@ -61,6 +61,7 @@ struct intel_uncore_type {
 
 struct intel_uncore_ops {
 	void (*init_box)(struct intel_uncore_box *);
+	void (*exit_box)(struct intel_uncore_box *);
 	void (*disable_box)(struct intel_uncore_box *);
 	void (*enable_box)(struct intel_uncore_box *);
 	void (*disable_event)(struct intel_uncore_box *, struct perf_event *);
@@ -306,6 +307,14 @@ static inline void uncore_box_init(struct intel_uncore_box *box)
 	}
 }
 
+static inline void uncore_box_exit(struct intel_uncore_box *box)
+{
+	if (test_and_clear_bit(UNCORE_BOX_FLAG_INITIATED, &box->flags)) {
+		if (box->pmu->type->ops->exit_box)
+			box->pmu->type->ops->exit_box(box);
+	}
+}
+
 static inline bool uncore_box_is_fake(struct intel_uncore_box *box)
 {
 	return (box->phys_id < 0);
diff --git a/arch/x86/events/intel/uncore_nhmex.c b/arch/x86/events/intel/uncore_nhmex.c
index e89bf5c..cda5693 100644
--- a/arch/x86/events/intel/uncore_nhmex.c
+++ b/arch/x86/events/intel/uncore_nhmex.c
@@ -201,6 +201,11 @@ static void nhmex_uncore_msr_init_box(struct intel_uncore_box *box)
 	wrmsrl(NHMEX_U_MSR_PMON_GLOBAL_CTL, NHMEX_U_PMON_GLOBAL_EN_ALL);
 }
 
+static void nhmex_uncore_msr_exit_box(struct intel_uncore_box *box)
+{
+	wrmsrl(NHMEX_U_MSR_PMON_GLOBAL_CTL, 0);
+}
+
 static void nhmex_uncore_msr_disable_box(struct intel_uncore_box *box)
 {
 	unsigned msr = uncore_msr_box_ctl(box);
@@ -250,6 +255,7 @@ static void nhmex_uncore_msr_enable_event(struct intel_uncore_box *box, struct p
 
 #define NHMEX_UNCORE_OPS_COMMON_INIT()				\
 	.init_box	= nhmex_uncore_msr_init_box,		\
+	.exit_box	= nhmex_uncore_msr_exit_box,		\
 	.disable_box	= nhmex_uncore_msr_disable_box,		\
 	.enable_box	= nhmex_uncore_msr_enable_box,		\
 	.disable_event	= nhmex_uncore_msr_disable_event,	\
diff --git a/arch/x86/events/intel/uncore_snb.c b/arch/x86/events/intel/uncore_snb.c
index 2049d26..120e106 100644
--- a/arch/x86/events/intel/uncore_snb.c
+++ b/arch/x86/events/intel/uncore_snb.c
@@ -95,6 +95,12 @@ static void snb_uncore_msr_init_box(struct intel_uncore_box *box)
 	}
 }
 
+static void snb_uncore_msr_exit_box(struct intel_uncore_box *box)
+{
+	if (box->pmu->pmu_idx == 0)
+		wrmsrl(SNB_UNC_PERF_GLOBAL_CTL, 0);
+}
+
 static struct uncore_event_desc snb_uncore_events[] = {
 	INTEL_UNCORE_EVENT_DESC(clockticks, "event=0xff,umask=0x00"),
 	{ /* end: all zeroes */ },
@@ -116,6 +122,7 @@ static struct attribute_group snb_uncore_format_group = {
 
 static struct intel_uncore_ops snb_uncore_msr_ops = {
 	.init_box	= snb_uncore_msr_init_box,
+	.exit_box	= snb_uncore_msr_exit_box,
 	.disable_event	= snb_uncore_msr_disable_event,
 	.enable_event	= snb_uncore_msr_enable_event,
 	.read_counter	= uncore_msr_read_counter,
@@ -231,6 +238,11 @@ static void snb_uncore_imc_init_box(struct intel_uncore_box *box)
 	box->hrtimer_duration = UNCORE_SNB_IMC_HRTIMER_INTERVAL;
 }
 
+static void snb_uncore_imc_exit_box(struct intel_uncore_box *box)
+{
+	iounmap(box->io_addr);
+}
+
 static void snb_uncore_imc_enable_box(struct intel_uncore_box *box)
 {}
 
@@ -458,6 +470,7 @@ static struct pmu snb_uncore_imc_pmu = {
 
 static struct intel_uncore_ops snb_uncore_imc_ops = {
 	.init_box	= snb_uncore_imc_init_box,
+	.exit_box	= snb_uncore_imc_exit_box,
 	.enable_box	= snb_uncore_imc_enable_box,
 	.disable_box	= snb_uncore_imc_disable_box,
 	.disable_event	= snb_uncore_imc_disable_event,

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [tip:perf/core] perf/x86/intel/uncore: Make code more readable
  2016-02-22 22:19 ` [patch V3 07/28] x86/perf/intel_uncore: Make code readable Thomas Gleixner
@ 2016-02-29 11:05   ` tip-bot for Thomas Gleixner
  0 siblings, 0 replies; 58+ messages in thread
From: tip-bot for Thomas Gleixner @ 2016-02-29 11:05 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: harish.chegondi, mingo, kan.liang, eranian, tglx, jolsa, bp, hpa,
	vincent.weaver, andi.kleen, peterz, torvalds, jacob.jun.pan,
	linux-kernel, acme

Commit-ID:  1229735b290d12003a1362ccfc442c690c93bc3a
Gitweb:     http://git.kernel.org/tip/1229735b290d12003a1362ccfc442c690c93bc3a
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Mon, 22 Feb 2016 22:19:12 +0000
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Mon, 29 Feb 2016 09:35:15 +0100

perf/x86/intel/uncore: Make code more readable

Clean up the code a bit before reworking it completely.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Harish Chegondi <harish.chegondi@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/20160222221011.204771538@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/events/intel/uncore.c | 95 ++++++++++++++++++++++--------------------
 1 file changed, 50 insertions(+), 45 deletions(-)

diff --git a/arch/x86/events/intel/uncore.c b/arch/x86/events/intel/uncore.c
index c3d1cbb..84055f2 100644
--- a/arch/x86/events/intel/uncore.c
+++ b/arch/x86/events/intel/uncore.c
@@ -217,7 +217,8 @@ u64 uncore_shared_reg_config(struct intel_uncore_box *box, int idx)
 	return config;
 }
 
-static void uncore_assign_hw_event(struct intel_uncore_box *box, struct perf_event *event, int idx)
+static void uncore_assign_hw_event(struct intel_uncore_box *box,
+				   struct perf_event *event, int idx)
 {
 	struct hw_perf_event *hwc = &event->hw;
 
@@ -312,18 +313,19 @@ static void uncore_pmu_init_hrtimer(struct intel_uncore_box *box)
 	box->hrtimer.function = uncore_pmu_hrtimer;
 }
 
-static struct intel_uncore_box *uncore_alloc_box(struct intel_uncore_type *type, int node)
+static struct intel_uncore_box *uncore_alloc_box(struct intel_uncore_type *type,
+						 int node)
 {
+	int i, size, numshared = type->num_shared_regs ;
 	struct intel_uncore_box *box;
-	int i, size;
 
-	size = sizeof(*box) + type->num_shared_regs * sizeof(struct intel_uncore_extra_reg);
+	size = sizeof(*box) + numshared * sizeof(struct intel_uncore_extra_reg);
 
 	box = kzalloc_node(size, GFP_KERNEL, node);
 	if (!box)
 		return NULL;
 
-	for (i = 0; i < type->num_shared_regs; i++)
+	for (i = 0; i < numshared; i++)
 		raw_spin_lock_init(&box->shared_regs[i].lock);
 
 	uncore_pmu_init_hrtimer(box);
@@ -351,7 +353,8 @@ static bool is_uncore_event(struct perf_event *event)
 }
 
 static int
-uncore_collect_events(struct intel_uncore_box *box, struct perf_event *leader, bool dogrp)
+uncore_collect_events(struct intel_uncore_box *box, struct perf_event *leader,
+		      bool dogrp)
 {
 	struct perf_event *event;
 	int n, max_count;
@@ -412,7 +415,8 @@ uncore_get_event_constraint(struct intel_uncore_box *box, struct perf_event *eve
 	return &type->unconstrainted;
 }
 
-static void uncore_put_event_constraint(struct intel_uncore_box *box, struct perf_event *event)
+static void uncore_put_event_constraint(struct intel_uncore_box *box,
+					struct perf_event *event)
 {
 	if (box->pmu->type->ops->put_constraint)
 		box->pmu->type->ops->put_constraint(box, event);
@@ -592,7 +596,7 @@ static void uncore_pmu_event_del(struct perf_event *event, int flags)
 		if (event == box->event_list[i]) {
 			uncore_put_event_constraint(box, event);
 
-			while (++i < box->n_events)
+			for (++i; i < box->n_events; i++)
 				box->event_list[i - 1] = box->event_list[i];
 
 			--box->n_events;
@@ -801,10 +805,8 @@ static void __init uncore_type_exit(struct intel_uncore_type *type)
 
 static void __init uncore_types_exit(struct intel_uncore_type **types)
 {
-	int i;
-
-	for (i = 0; types[i]; i++)
-		uncore_type_exit(types[i]);
+	for (; *types; types++)
+		uncore_type_exit(*types);
 }
 
 static int __init uncore_type_init(struct intel_uncore_type *type)
@@ -908,9 +910,11 @@ static int uncore_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id
 	 * some device types. Hence PCI device idx would be 0 for all devices.
 	 * So increment pmu pointer to point to an unused array element.
 	 */
-	if (boot_cpu_data.x86_model == 87)
+	if (boot_cpu_data.x86_model == 87) {
 		while (pmu->func_id >= 0)
 			pmu++;
+	}
+
 	if (pmu->func_id < 0)
 		pmu->func_id = pdev->devfn;
 	else
@@ -1170,44 +1174,45 @@ static int uncore_cpu_prepare(int cpu, int phys_id)
 	return 0;
 }
 
-static void
-uncore_change_context(struct intel_uncore_type **uncores, int old_cpu, int new_cpu)
+static void uncore_change_type_ctx(struct intel_uncore_type *type, int old_cpu,
+				   int new_cpu)
 {
-	struct intel_uncore_type *type;
-	struct intel_uncore_pmu *pmu;
+	struct intel_uncore_pmu *pmu = type->pmus;
 	struct intel_uncore_box *box;
-	int i, j;
-
-	for (i = 0; uncores[i]; i++) {
-		type = uncores[i];
-		for (j = 0; j < type->num_boxes; j++) {
-			pmu = &type->pmus[j];
-			if (old_cpu < 0)
-				box = uncore_pmu_to_box(pmu, new_cpu);
-			else
-				box = uncore_pmu_to_box(pmu, old_cpu);
-			if (!box)
-				continue;
+	int i;
 
-			if (old_cpu < 0) {
-				WARN_ON_ONCE(box->cpu != -1);
-				box->cpu = new_cpu;
-				continue;
-			}
+	for (i = 0; i < type->num_boxes; i++, pmu++) {
+		if (old_cpu < 0)
+			box = uncore_pmu_to_box(pmu, new_cpu);
+		else
+			box = uncore_pmu_to_box(pmu, old_cpu);
+		if (!box)
+			continue;
 
-			WARN_ON_ONCE(box->cpu != old_cpu);
-			if (new_cpu >= 0) {
-				uncore_pmu_cancel_hrtimer(box);
-				perf_pmu_migrate_context(&pmu->pmu,
-						old_cpu, new_cpu);
-				box->cpu = new_cpu;
-			} else {
-				box->cpu = -1;
-			}
+		if (old_cpu < 0) {
+			WARN_ON_ONCE(box->cpu != -1);
+			box->cpu = new_cpu;
+			continue;
 		}
+
+		WARN_ON_ONCE(box->cpu != old_cpu);
+		box->cpu = -1;
+		if (new_cpu < 0)
+			continue;
+
+		uncore_pmu_cancel_hrtimer(box);
+		perf_pmu_migrate_context(&pmu->pmu, old_cpu, new_cpu);
+		box->cpu = new_cpu;
 	}
 }
 
+static void uncore_change_context(struct intel_uncore_type **uncores,
+				  int old_cpu, int new_cpu)
+{
+	for (; *uncores; uncores++)
+		uncore_change_type_ctx(*uncores, old_cpu, new_cpu);
+}
+
 static void uncore_event_exit_cpu(int cpu)
 {
 	int i, phys_id, target;
@@ -1318,8 +1323,8 @@ static int __init uncore_msr_pmus_register(void)
 	struct intel_uncore_type **types = uncore_msr_uncores;
 	int ret;
 
-	while (*types) {
-		ret = type_pmu_register(*types++);
+	for (; *types; types++) {
+		ret = type_pmu_register(*types);
 		if (ret)
 			return ret;
 	}

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [tip:perf/core] perf/x86/uncore: Make uncore_pcibus_to_physid() static
  2016-02-22 22:19 ` [patch V3 08/28] x86/perf/uncore: Make uncore_pcibus_to_physid static Thomas Gleixner
@ 2016-02-29 11:06   ` tip-bot for Thomas Gleixner
  0 siblings, 0 replies; 58+ messages in thread
From: tip-bot for Thomas Gleixner @ 2016-02-29 11:06 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: eranian, bp, vincent.weaver, torvalds, hpa, kan.liang,
	andi.kleen, jacob.jun.pan, linux-kernel, harish.chegondi, jolsa,
	tglx, mingo, acme, peterz

Commit-ID:  1384c70442b71f882e7e7153df542794d468a15d
Gitweb:     http://git.kernel.org/tip/1384c70442b71f882e7e7153df542794d468a15d
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Mon, 22 Feb 2016 22:19:13 +0000
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Mon, 29 Feb 2016 09:35:16 +0100

perf/x86/uncore: Make uncore_pcibus_to_physid() static

No users outside of this file.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Harish Chegondi <harish.chegondi@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/20160222221011.285504825@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/events/intel/uncore.c | 2 +-
 arch/x86/events/intel/uncore.h | 1 -
 2 files changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/x86/events/intel/uncore.c b/arch/x86/events/intel/uncore.c
index 84055f2..80ee3d4 100644
--- a/arch/x86/events/intel/uncore.c
+++ b/arch/x86/events/intel/uncore.c
@@ -21,7 +21,7 @@ static struct event_constraint uncore_constraint_fixed =
 struct event_constraint uncore_constraint_empty =
 	EVENT_CONSTRAINT(0, 0, 0);
 
-int uncore_pcibus_to_physid(struct pci_bus *bus)
+static int uncore_pcibus_to_physid(struct pci_bus *bus)
 {
 	struct pci2phy_map *map;
 	int phys_id = -1;
diff --git a/arch/x86/events/intel/uncore.h b/arch/x86/events/intel/uncore.h
index cf2cb1f..f118c0d 100644
--- a/arch/x86/events/intel/uncore.h
+++ b/arch/x86/events/intel/uncore.h
@@ -125,7 +125,6 @@ struct pci2phy_map {
 	int pbus_to_physid[256];
 };
 
-int uncore_pcibus_to_physid(struct pci_bus *bus);
 struct pci2phy_map *__find_pci2phy_map(int segment);
 
 ssize_t uncore_event_show(struct kobject *kobj,

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [tip:perf/core] perf: Allow storage of PMU private data in event
  2016-02-22 22:19 ` [patch V3 09/28] perf: Allow storage of pmu private data in event Thomas Gleixner
@ 2016-02-29 11:06   ` tip-bot for Thomas Gleixner
  0 siblings, 0 replies; 58+ messages in thread
From: tip-bot for Thomas Gleixner @ 2016-02-29 11:06 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: vincent.weaver, acme, mingo, bp, peterz, eranian, linux-kernel,
	tglx, kan.liang, jolsa, harish.chegondi, andi.kleen, torvalds,
	hpa, jacob.jun.pan

Commit-ID:  54d751d4ad357c817907fe89db3222b97ff66db3
Gitweb:     http://git.kernel.org/tip/54d751d4ad357c817907fe89db3222b97ff66db3
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Mon, 22 Feb 2016 22:19:14 +0000
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Mon, 29 Feb 2016 09:35:16 +0100

perf: Allow storage of PMU private data in event

For PMUs which are not per CPU, but e.g. per package/socket, we want to be
able to store a reference to the underlying per package/socket facility in the
event at init time so we can avoid magic storage constructs in the PMU driver.

This allows us to get rid of the per CPU dance in the intel uncore and RAPL
drivers and avoids a lookup of the per package data in the perf hotpath.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Harish Chegondi <harish.chegondi@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/20160222221011.364140369@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 include/linux/perf_event.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index f5c5a3f..a9d8cab 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -468,6 +468,7 @@ struct perf_event {
 	int				group_flags;
 	struct perf_event		*group_leader;
 	struct pmu			*pmu;
+	void				*pmu_private;
 
 	enum perf_event_active_state	state;
 	unsigned int			attach_state;

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [tip:perf/core] perf/x86/intel/uncore: Store box in event->pmu_private
  2016-02-22 22:19 ` [patch V3 10/28] x86/perf/intel_uncore: Store box in event->pmu_private Thomas Gleixner
@ 2016-02-29 11:06   ` tip-bot for Thomas Gleixner
  0 siblings, 0 replies; 58+ messages in thread
From: tip-bot for Thomas Gleixner @ 2016-02-29 11:06 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: hpa, eranian, linux-kernel, peterz, vincent.weaver, kan.liang,
	bp, jacob.jun.pan, acme, tglx, andi.kleen, harish.chegondi,
	mingo, torvalds, jolsa

Commit-ID:  1f2569fac6c6dd5a6e3fba41c183c04273f05a58
Gitweb:     http://git.kernel.org/tip/1f2569fac6c6dd5a6e3fba41c183c04273f05a58
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Mon, 22 Feb 2016 22:19:14 +0000
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Mon, 29 Feb 2016 09:35:17 +0100

perf/x86/intel/uncore: Store box in event->pmu_private

Store the PMU pointer in event->pmu_private, so we can get rid of the
per CPU data storage.

We keep it after converting to per package data, because a CPU to
package lookup will be 3 loads versus one and these usage sites are
in the perf fast path.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Harish Chegondi <harish.chegondi@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/20160222221011.460851335@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/events/intel/uncore.c     | 15 +--------------
 arch/x86/events/intel/uncore.h     | 12 ++++++++++--
 arch/x86/events/intel/uncore_snb.c |  1 +
 3 files changed, 12 insertions(+), 16 deletions(-)

diff --git a/arch/x86/events/intel/uncore.c b/arch/x86/events/intel/uncore.c
index 80ee3d4..220f81c 100644
--- a/arch/x86/events/intel/uncore.c
+++ b/arch/x86/events/intel/uncore.c
@@ -92,11 +92,6 @@ ssize_t uncore_event_show(struct kobject *kobj,
 	return sprintf(buf, "%s", event->config);
 }
 
-struct intel_uncore_pmu *uncore_event_to_pmu(struct perf_event *event)
-{
-	return container_of(event->pmu, struct intel_uncore_pmu, pmu);
-}
-
 struct intel_uncore_box *uncore_pmu_to_box(struct intel_uncore_pmu *pmu, int cpu)
 {
 	struct intel_uncore_box *box;
@@ -122,15 +117,6 @@ out:
 	return *per_cpu_ptr(pmu->box, cpu);
 }
 
-struct intel_uncore_box *uncore_event_to_box(struct perf_event *event)
-{
-	/*
-	 * perf core schedules event on the basis of cpu, uncore events are
-	 * collected by one of the cpus inside a physical package.
-	 */
-	return uncore_pmu_to_box(uncore_event_to_pmu(event), smp_processor_id());
-}
-
 u64 uncore_msr_read_counter(struct intel_uncore_box *box, struct perf_event *event)
 {
 	u64 count;
@@ -690,6 +676,7 @@ static int uncore_pmu_event_init(struct perf_event *event)
 	if (!box || box->cpu < 0)
 		return -EINVAL;
 	event->cpu = box->cpu;
+	event->pmu_private = box;
 
 	event->hw.idx = -1;
 	event->hw.last_tag = ~0ULL;
diff --git a/arch/x86/events/intel/uncore.h b/arch/x86/events/intel/uncore.h
index f118c0d..479c1e8 100644
--- a/arch/x86/events/intel/uncore.h
+++ b/arch/x86/events/intel/uncore.h
@@ -319,9 +319,17 @@ static inline bool uncore_box_is_fake(struct intel_uncore_box *box)
 	return (box->phys_id < 0);
 }
 
-struct intel_uncore_pmu *uncore_event_to_pmu(struct perf_event *event);
+static inline struct intel_uncore_pmu *uncore_event_to_pmu(struct perf_event *event)
+{
+	return container_of(event->pmu, struct intel_uncore_pmu, pmu);
+}
+
+static inline struct intel_uncore_box *uncore_event_to_box(struct perf_event *event)
+{
+	return event->pmu_private;
+}
+
 struct intel_uncore_box *uncore_pmu_to_box(struct intel_uncore_pmu *pmu, int cpu);
-struct intel_uncore_box *uncore_event_to_box(struct perf_event *event);
 u64 uncore_msr_read_counter(struct intel_uncore_box *box, struct perf_event *event);
 void uncore_pmu_start_hrtimer(struct intel_uncore_box *box);
 void uncore_pmu_cancel_hrtimer(struct intel_uncore_box *box);
diff --git a/arch/x86/events/intel/uncore_snb.c b/arch/x86/events/intel/uncore_snb.c
index 120e106..96531d2 100644
--- a/arch/x86/events/intel/uncore_snb.c
+++ b/arch/x86/events/intel/uncore_snb.c
@@ -313,6 +313,7 @@ static int snb_uncore_imc_event_init(struct perf_event *event)
 		return -EINVAL;
 
 	event->cpu = box->cpu;
+	event->pmu_private = box;
 
 	event->hw.idx = -1;
 	event->hw.last_tag = ~0ULL;

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [tip:perf/core] x86/topology: Create logical package id
  2016-02-22 22:19 ` [patch V3 11/28] x86/topology: Create logical package id Thomas Gleixner
@ 2016-02-29 11:07   ` tip-bot for Thomas Gleixner
  0 siblings, 0 replies; 58+ messages in thread
From: tip-bot for Thomas Gleixner @ 2016-02-29 11:07 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: peterz, mingo, acme, harish.chegondi, kan.liang, mcgrof, jolsa,
	dvlasenk, bp, luto, torvalds, vincent.weaver, akpm, hpa, tglx,
	andi.kleen, toshi.kani, eranian, jacob.jun.pan, linux-kernel,
	brgerst

Commit-ID:  1f12e32f4cd5243ae46d8b933181be0d022c6793
Gitweb:     http://git.kernel.org/tip/1f12e32f4cd5243ae46d8b933181be0d022c6793
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Mon, 22 Feb 2016 22:19:15 +0000
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Mon, 29 Feb 2016 09:35:18 +0100

x86/topology: Create logical package id

For per package oriented services we must be able to rely on the number of CPU
packages to be within bounds. Create a tracking facility, which

- calculates the number of possible packages depending on nr_cpu_ids after boot

- makes sure that the package id is within the number of possible packages. If
  the apic id is outside we map it to a logical package id if there is enough
  space available.

Provide interfaces for drivers to query the mapping and do translations from
physcial to logical ids.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Harish Chegondi <harish.chegondi@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Luis R. Rodriguez <mcgrof@suse.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Toshi Kani <toshi.kani@hp.com>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/20160222221011.541071755@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/processor.h |   2 +
 arch/x86/include/asm/topology.h  |  11 +++++
 arch/x86/kernel/apic/apic.c      |  14 ++++++
 arch/x86/kernel/cpu/common.c     |   2 +
 arch/x86/kernel/cpu/intel.c      |  13 +++++
 arch/x86/kernel/smpboot.c        | 100 +++++++++++++++++++++++++++++++++++++++
 6 files changed, 142 insertions(+)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 20c11d1..813384e 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -129,6 +129,8 @@ struct cpuinfo_x86 {
 	u16			booted_cores;
 	/* Physical processor id: */
 	u16			phys_proc_id;
+	/* Logical processor id: */
+	u16			logical_proc_id;
 	/* Core id: */
 	u16			cpu_core_id;
 	/* Compute unit id */
diff --git a/arch/x86/include/asm/topology.h b/arch/x86/include/asm/topology.h
index 0fb4648..7f991bd5 100644
--- a/arch/x86/include/asm/topology.h
+++ b/arch/x86/include/asm/topology.h
@@ -119,12 +119,23 @@ static inline void setup_node_to_cpumask_map(void) { }
 
 extern const struct cpumask *cpu_coregroup_mask(int cpu);
 
+#define topology_logical_package_id(cpu)	(cpu_data(cpu).logical_proc_id)
 #define topology_physical_package_id(cpu)	(cpu_data(cpu).phys_proc_id)
 #define topology_core_id(cpu)			(cpu_data(cpu).cpu_core_id)
 
 #ifdef ENABLE_TOPO_DEFINES
 #define topology_core_cpumask(cpu)		(per_cpu(cpu_core_map, cpu))
 #define topology_sibling_cpumask(cpu)		(per_cpu(cpu_sibling_map, cpu))
+
+extern unsigned int __max_logical_packages;
+#define topology_max_packages()			(__max_logical_packages)
+int topology_update_package_map(unsigned int apicid, unsigned int cpu);
+extern int topology_phys_to_logical_pkg(unsigned int pkg);
+#else
+#define topology_max_packages()			(1)
+static inline int
+topology_update_package_map(unsigned int apicid, unsigned int cpu) { return 0; }
+static inline int topology_phys_to_logical_pkg(unsigned int pkg) { return 0; }
 #endif
 
 static inline void arch_fix_phys_package_id(int num, u32 slot)
diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
index 8a5cdda..531b961 100644
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -2078,6 +2078,20 @@ int generic_processor_info(int apicid, int version)
 		cpu = cpumask_next_zero(-1, cpu_present_mask);
 
 	/*
+	 * This can happen on physical hotplug. The sanity check at boot time
+	 * is done from native_smp_prepare_cpus() after num_possible_cpus() is
+	 * established.
+	 */
+	if (topology_update_package_map(apicid, cpu) < 0) {
+		int thiscpu = max + disabled_cpus;
+
+		pr_warning("ACPI: Package limit reached. Processor %d/0x%x ignored.\n",
+			   thiscpu, apicid);
+		disabled_cpus++;
+		return -ENOSPC;
+	}
+
+	/*
 	 * Validate version
 	 */
 	if (version == 0x0) {
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 68a80e9..81cf716 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -975,6 +975,8 @@ static void identify_cpu(struct cpuinfo_x86 *c)
 #ifdef CONFIG_NUMA
 	numa_add_cpu(smp_processor_id());
 #endif
+	/* The boot/hotplug time assigment got cleared, restore it */
+	c->logical_proc_id = topology_phys_to_logical_pkg(c->phys_proc_id);
 }
 
 /*
diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
index 05b9211..38766c2 100644
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -160,6 +160,19 @@ static void early_init_intel(struct cpuinfo_x86 *c)
 		pr_info("Disabling PGE capability bit\n");
 		setup_clear_cpu_cap(X86_FEATURE_PGE);
 	}
+
+	if (c->cpuid_level >= 0x00000001) {
+		u32 eax, ebx, ecx, edx;
+
+		cpuid(0x00000001, &eax, &ebx, &ecx, &edx);
+		/*
+		 * If HTT (EDX[28]) is set EBX[16:23] contain the number of
+		 * apicids which are reserved per package. Store the resulting
+		 * shift value for the package management code.
+		 */
+		if (edx & (1U << 28))
+			c->x86_coreid_bits = get_count_order((ebx >> 16) & 0xff);
+	}
 }
 
 #ifdef CONFIG_X86_32
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 24d57f7..3bf1e0b 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -97,6 +97,14 @@ DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_llc_shared_map);
 DEFINE_PER_CPU_READ_MOSTLY(struct cpuinfo_x86, cpu_info);
 EXPORT_PER_CPU_SYMBOL(cpu_info);
 
+/* Logical package management. We might want to allocate that dynamically */
+static int *physical_to_logical_pkg __read_mostly;
+static unsigned long *physical_package_map __read_mostly;;
+static unsigned long *logical_package_map  __read_mostly;
+static unsigned int max_physical_pkg_id __read_mostly;
+unsigned int __max_logical_packages __read_mostly;
+EXPORT_SYMBOL(__max_logical_packages);
+
 static inline void smpboot_setup_warm_reset_vector(unsigned long start_eip)
 {
 	unsigned long flags;
@@ -251,6 +259,97 @@ static void notrace start_secondary(void *unused)
 	cpu_startup_entry(CPUHP_ONLINE);
 }
 
+int topology_update_package_map(unsigned int apicid, unsigned int cpu)
+{
+	unsigned int new, pkg = apicid >> boot_cpu_data.x86_coreid_bits;
+
+	/* Called from early boot ? */
+	if (!physical_package_map)
+		return 0;
+
+	if (pkg >= max_physical_pkg_id)
+		return -EINVAL;
+
+	/* Set the logical package id */
+	if (test_and_set_bit(pkg, physical_package_map))
+		goto found;
+
+	if (pkg < __max_logical_packages) {
+		set_bit(pkg, logical_package_map);
+		physical_to_logical_pkg[pkg] = pkg;
+		goto found;
+	}
+	new = find_first_zero_bit(logical_package_map, __max_logical_packages);
+	if (new >= __max_logical_packages) {
+		physical_to_logical_pkg[pkg] = -1;
+		pr_warn("APIC(%x) Package %u exceeds logical package map\n",
+			apicid, pkg);
+		return -ENOSPC;
+	}
+	set_bit(new, logical_package_map);
+	pr_info("APIC(%x) Converting physical %u to logical package %u\n",
+		apicid, pkg, new);
+	physical_to_logical_pkg[pkg] = new;
+
+found:
+	cpu_data(cpu).logical_proc_id = physical_to_logical_pkg[pkg];
+	return 0;
+}
+
+/**
+ * topology_phys_to_logical_pkg - Map a physical package id to a logical
+ *
+ * Returns logical package id or -1 if not found
+ */
+int topology_phys_to_logical_pkg(unsigned int phys_pkg)
+{
+	if (phys_pkg >= max_physical_pkg_id)
+		return -1;
+	return physical_to_logical_pkg[phys_pkg];
+}
+EXPORT_SYMBOL(topology_phys_to_logical_pkg);
+
+static void __init smp_init_package_map(void)
+{
+	unsigned int ncpus, cpu;
+	size_t size;
+
+	/*
+	 * Today neither Intel nor AMD support heterogenous systems. That
+	 * might change in the future....
+	 */
+	ncpus = boot_cpu_data.x86_max_cores * smp_num_siblings;
+	__max_logical_packages = DIV_ROUND_UP(nr_cpu_ids, ncpus);
+
+	/*
+	 * Possibly larger than what we need as the number of apic ids per
+	 * package can be smaller than the actual used apic ids.
+	 */
+	max_physical_pkg_id = DIV_ROUND_UP(MAX_LOCAL_APIC, ncpus);
+	size = max_physical_pkg_id * sizeof(unsigned int);
+	physical_to_logical_pkg = kmalloc(size, GFP_KERNEL);
+	memset(physical_to_logical_pkg, 0xff, size);
+	size = BITS_TO_LONGS(max_physical_pkg_id) * sizeof(unsigned long);
+	physical_package_map = kzalloc(size, GFP_KERNEL);
+	size = BITS_TO_LONGS(__max_logical_packages) * sizeof(unsigned long);
+	logical_package_map = kzalloc(size, GFP_KERNEL);
+
+	pr_info("Max logical packages: %u\n", __max_logical_packages);
+
+	for_each_present_cpu(cpu) {
+		unsigned int apicid = apic->cpu_present_to_apicid(cpu);
+
+		if (apicid == BAD_APICID || !apic->apic_id_valid(apicid))
+			continue;
+		if (!topology_update_package_map(apicid, cpu))
+			continue;
+		pr_warn("CPU %u APICId %x disabled\n", cpu, apicid);
+		per_cpu(x86_bios_cpu_apicid, cpu) = BAD_APICID;
+		set_cpu_possible(cpu, false);
+		set_cpu_present(cpu, false);
+	}
+}
+
 void __init smp_store_boot_cpu_info(void)
 {
 	int id = 0; /* CPU 0 */
@@ -258,6 +357,7 @@ void __init smp_store_boot_cpu_info(void)
 
 	*c = boot_cpu_data;
 	c->cpu_index = id;
+	smp_init_package_map();
 }
 
 /*

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [tip:perf/core] perf/x86/uncore: Track packages, not per CPU data
  2016-02-22 22:19 ` [patch V3 12/28] x86/perf/uncore: Track packages not per cpu data Thomas Gleixner
@ 2016-02-29 11:07   ` tip-bot for Thomas Gleixner
  0 siblings, 0 replies; 58+ messages in thread
From: tip-bot for Thomas Gleixner @ 2016-02-29 11:07 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: kan.liang, linux-kernel, vincent.weaver, mingo, acme, peterz,
	torvalds, jacob.jun.pan, jolsa, eranian, harish.chegondi, hpa,
	andi.kleen, tglx, bp

Commit-ID:  cf6d445f68974d0b15a14cf6021be38a91f2b5d8
Gitweb:     http://git.kernel.org/tip/cf6d445f68974d0b15a14cf6021be38a91f2b5d8
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Mon, 22 Feb 2016 22:19:16 +0000
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Mon, 29 Feb 2016 09:35:18 +0100

perf/x86/uncore: Track packages, not per CPU data

Uncore is a per package facility, but the code tries to mimick a per CPU
facility with completely convoluted constructs.

Simplify the whole machinery by tracking per package information. While at it,
avoid the kfree/alloc dance when a CPU goes offline and online again. There is
no point in freeing the box after it was allocated. We just keep proper
refcounting and the first CPU which comes online in a package does the
initialization/activation of the box.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Harish Chegondi <harish.chegondi@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/20160222221011.622258933@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/events/intel/uncore.c       | 422 +++++++++++++++--------------------
 arch/x86/events/intel/uncore.h       |  18 +-
 arch/x86/events/intel/uncore_snbep.c |  10 +-
 3 files changed, 198 insertions(+), 252 deletions(-)

diff --git a/arch/x86/events/intel/uncore.c b/arch/x86/events/intel/uncore.c
index 220f81c..096fc0d 100644
--- a/arch/x86/events/intel/uncore.c
+++ b/arch/x86/events/intel/uncore.c
@@ -9,9 +9,9 @@ struct pci_driver *uncore_pci_driver;
 /* pci bus to socket mapping */
 DEFINE_RAW_SPINLOCK(pci2phy_map_lock);
 struct list_head pci2phy_map_head = LIST_HEAD_INIT(pci2phy_map_head);
-struct pci_dev *uncore_extra_pci_dev[UNCORE_SOCKET_MAX][UNCORE_EXTRA_PCI_DEV_MAX];
+struct pci_extra_dev *uncore_extra_pci_dev;
+static int max_packages;
 
-static DEFINE_RAW_SPINLOCK(uncore_box_lock);
 /* mask of cpus that collect uncore events */
 static cpumask_t uncore_cpu_mask;
 
@@ -94,27 +94,7 @@ ssize_t uncore_event_show(struct kobject *kobj,
 
 struct intel_uncore_box *uncore_pmu_to_box(struct intel_uncore_pmu *pmu, int cpu)
 {
-	struct intel_uncore_box *box;
-
-	box = *per_cpu_ptr(pmu->box, cpu);
-	if (box)
-		return box;
-
-	raw_spin_lock(&uncore_box_lock);
-	/* Recheck in lock to handle races. */
-	if (*per_cpu_ptr(pmu->box, cpu))
-		goto out;
-	list_for_each_entry(box, &pmu->box_list, list) {
-		if (box->phys_id == topology_physical_package_id(cpu)) {
-			atomic_inc(&box->refcnt);
-			*per_cpu_ptr(pmu->box, cpu) = box;
-			break;
-		}
-	}
-out:
-	raw_spin_unlock(&uncore_box_lock);
-
-	return *per_cpu_ptr(pmu->box, cpu);
+	return pmu->boxes[topology_logical_package_id(cpu)];
 }
 
 u64 uncore_msr_read_counter(struct intel_uncore_box *box, struct perf_event *event)
@@ -315,9 +295,9 @@ static struct intel_uncore_box *uncore_alloc_box(struct intel_uncore_type *type,
 		raw_spin_lock_init(&box->shared_regs[i].lock);
 
 	uncore_pmu_init_hrtimer(box);
-	atomic_set(&box->refcnt, 1);
 	box->cpu = -1;
-	box->phys_id = -1;
+	box->pci_phys_id = -1;
+	box->pkgid = -1;
 
 	/* set default hrtimer timeout */
 	box->hrtimer_duration = UNCORE_PMU_HRTIMER_INTERVAL;
@@ -774,14 +754,24 @@ static void uncore_pmu_unregister(struct intel_uncore_pmu *pmu)
 	pmu->registered = false;
 }
 
+static void uncore_free_boxes(struct intel_uncore_pmu *pmu)
+{
+	int pkg;
+
+	for (pkg = 0; pkg < max_packages; pkg++)
+		kfree(pmu->boxes[pkg]);
+	kfree(pmu->boxes);
+}
+
 static void __init uncore_type_exit(struct intel_uncore_type *type)
 {
+	struct intel_uncore_pmu *pmu = type->pmus;
 	int i;
 
-	if (type->pmus) {
-		for (i = 0; i < type->num_boxes; i++) {
-			uncore_pmu_unregister(&type->pmus[i]);
-			free_percpu(type->pmus[i].box);
+	if (pmu) {
+		for (i = 0; i < type->num_boxes; i++, pmu++) {
+			uncore_pmu_unregister(pmu);
+			uncore_free_boxes(pmu);
 		}
 		kfree(type->pmus);
 		type->pmus = NULL;
@@ -796,37 +786,36 @@ static void __init uncore_types_exit(struct intel_uncore_type **types)
 		uncore_type_exit(*types);
 }
 
-static int __init uncore_type_init(struct intel_uncore_type *type)
+static int __init uncore_type_init(struct intel_uncore_type *type, bool setid)
 {
 	struct intel_uncore_pmu *pmus;
 	struct attribute_group *attr_group;
 	struct attribute **attrs;
+	size_t size;
 	int i, j;
 
 	pmus = kzalloc(sizeof(*pmus) * type->num_boxes, GFP_KERNEL);
 	if (!pmus)
 		return -ENOMEM;
 
-	type->pmus = pmus;
-
-	type->unconstrainted = (struct event_constraint)
-		__EVENT_CONSTRAINT(0, (1ULL << type->num_counters) - 1,
-				0, type->num_counters, 0, 0);
+	size = max_packages * sizeof(struct intel_uncore_box *);
 
 	for (i = 0; i < type->num_boxes; i++) {
-		pmus[i].func_id = -1;
-		pmus[i].pmu_idx = i;
-		pmus[i].type = type;
-		INIT_LIST_HEAD(&pmus[i].box_list);
-		pmus[i].box = alloc_percpu(struct intel_uncore_box *);
-		if (!pmus[i].box)
+		pmus[i].func_id	= setid ? i : -1;
+		pmus[i].pmu_idx	= i;
+		pmus[i].type	= type;
+		pmus[i].boxes	= kzalloc(size, GFP_KERNEL);
+		if (!pmus[i].boxes)
 			return -ENOMEM;
 	}
 
+	type->pmus = pmus;
+	type->unconstrainted = (struct event_constraint)
+		__EVENT_CONSTRAINT(0, (1ULL << type->num_counters) - 1,
+				0, type->num_counters, 0, 0);
+
 	if (type->event_descs) {
-		i = 0;
-		while (type->event_descs[i].attr.attr.name)
-			i++;
+		for (i = 0; type->event_descs[i].attr.attr.name; i++);
 
 		attr_group = kzalloc(sizeof(struct attribute *) * (i + 1) +
 					sizeof(*attr_group), GFP_KERNEL);
@@ -847,12 +836,13 @@ static int __init uncore_type_init(struct intel_uncore_type *type)
 	return 0;
 }
 
-static int __init uncore_types_init(struct intel_uncore_type **types)
+static int __init
+uncore_types_init(struct intel_uncore_type **types, bool setid)
 {
-	int i, ret;
+	int ret;
 
-	for (i = 0; types[i]; i++) {
-		ret = uncore_type_init(types[i]);
+	for (; *types; types++) {
+		ret = uncore_type_init(*types, setid);
 		if (ret)
 			return ret;
 	}
@@ -864,28 +854,28 @@ static int __init uncore_types_init(struct intel_uncore_type **types)
  */
 static int uncore_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 {
+	struct intel_uncore_type *type;
 	struct intel_uncore_pmu *pmu;
 	struct intel_uncore_box *box;
-	struct intel_uncore_type *type;
-	bool first_box = false;
-	int phys_id, ret;
+	int phys_id, pkg, ret;
 
 	phys_id = uncore_pcibus_to_physid(pdev->bus);
-	if (phys_id < 0 || phys_id >= UNCORE_SOCKET_MAX)
+	if (phys_id < 0)
 		return -ENODEV;
 
+	pkg = topology_phys_to_logical_pkg(phys_id);
+	if (WARN_ON_ONCE(pkg < 0))
+		return -EINVAL;
+
 	if (UNCORE_PCI_DEV_TYPE(id->driver_data) == UNCORE_EXTRA_PCI_DEV) {
 		int idx = UNCORE_PCI_DEV_IDX(id->driver_data);
-		uncore_extra_pci_dev[phys_id][idx] = pdev;
+
+		uncore_extra_pci_dev[pkg].dev[idx] = pdev;
 		pci_set_drvdata(pdev, NULL);
 		return 0;
 	}
 
 	type = uncore_pci_uncores[UNCORE_PCI_DEV_TYPE(id->driver_data)];
-	box = uncore_alloc_box(type, NUMA_NO_NODE);
-	if (!box)
-		return -ENOMEM;
-
 	/*
 	 * for performance monitoring unit with multiple boxes,
 	 * each box has a different function id.
@@ -902,32 +892,35 @@ static int uncore_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id
 			pmu++;
 	}
 
+	if (WARN_ON_ONCE(pmu->boxes[pkg] != NULL))
+		return -EINVAL;
+
+	box = uncore_alloc_box(type, NUMA_NO_NODE);
+	if (!box)
+		return -ENOMEM;
+
 	if (pmu->func_id < 0)
 		pmu->func_id = pdev->devfn;
 	else
 		WARN_ON_ONCE(pmu->func_id != pdev->devfn);
 
-	box->phys_id = phys_id;
+	atomic_inc(&box->refcnt);
+	box->pci_phys_id = phys_id;
+	box->pkgid = pkg;
 	box->pci_dev = pdev;
 	box->pmu = pmu;
 	uncore_box_init(box);
 	pci_set_drvdata(pdev, box);
 
-	raw_spin_lock(&uncore_box_lock);
-	if (list_empty(&pmu->box_list))
-		first_box = true;
-	list_add_tail(&box->list, &pmu->box_list);
-	raw_spin_unlock(&uncore_box_lock);
-
-	if (!first_box)
+	pmu->boxes[pkg] = box;
+	if (atomic_inc_return(&pmu->activeboxes) > 1)
 		return 0;
 
+	/* First active box registers the pmu */
 	ret = uncore_pmu_register(pmu);
 	if (ret) {
 		pci_set_drvdata(pdev, NULL);
-		raw_spin_lock(&uncore_box_lock);
-		list_del(&box->list);
-		raw_spin_unlock(&uncore_box_lock);
+		pmu->boxes[pkg] = NULL;
 		uncore_box_exit(box);
 		kfree(box);
 	}
@@ -938,15 +931,16 @@ static void uncore_pci_remove(struct pci_dev *pdev)
 {
 	struct intel_uncore_box *box = pci_get_drvdata(pdev);
 	struct intel_uncore_pmu *pmu;
-	int i, cpu, phys_id;
-	bool last_box = false;
+	int i, phys_id, pkg;
 
 	phys_id = uncore_pcibus_to_physid(pdev->bus);
+	pkg = topology_phys_to_logical_pkg(phys_id);
+
 	box = pci_get_drvdata(pdev);
 	if (!box) {
 		for (i = 0; i < UNCORE_EXTRA_PCI_DEV_MAX; i++) {
-			if (uncore_extra_pci_dev[phys_id][i] == pdev) {
-				uncore_extra_pci_dev[phys_id][i] = NULL;
+			if (uncore_extra_pci_dev[pkg].dev[i] == pdev) {
+				uncore_extra_pci_dev[pkg].dev[i] = NULL;
 				break;
 			}
 		}
@@ -955,34 +949,20 @@ static void uncore_pci_remove(struct pci_dev *pdev)
 	}
 
 	pmu = box->pmu;
-	if (WARN_ON_ONCE(phys_id != box->phys_id))
+	if (WARN_ON_ONCE(phys_id != box->pci_phys_id))
 		return;
 
 	pci_set_drvdata(pdev, NULL);
-
-	raw_spin_lock(&uncore_box_lock);
-	list_del(&box->list);
-	if (list_empty(&pmu->box_list))
-		last_box = true;
-	raw_spin_unlock(&uncore_box_lock);
-
-	for_each_possible_cpu(cpu) {
-		if (*per_cpu_ptr(pmu->box, cpu) == box) {
-			*per_cpu_ptr(pmu->box, cpu) = NULL;
-			atomic_dec(&box->refcnt);
-		}
-	}
-
-	WARN_ON_ONCE(atomic_read(&box->refcnt) != 1);
+	pmu->boxes[pkg] = NULL;
+	if (atomic_dec_return(&pmu->activeboxes) == 0)
+		uncore_pmu_unregister(pmu);
 	uncore_box_exit(box);
 	kfree(box);
-
-	if (last_box)
-		uncore_pmu_unregister(pmu);
 }
 
 static int __init uncore_pci_init(void)
 {
+	size_t size;
 	int ret;
 
 	switch (boot_cpu_data.x86_model) {
@@ -1025,24 +1005,34 @@ static int __init uncore_pci_init(void)
 	if (ret)
 		return ret;
 
-	ret = uncore_types_init(uncore_pci_uncores);
-	if (ret)
+	size = max_packages * sizeof(struct pci_extra_dev);
+	uncore_extra_pci_dev = kzalloc(size, GFP_KERNEL);
+	if (!uncore_extra_pci_dev) {
+		ret = -ENOMEM;
 		goto err;
+	}
+
+	ret = uncore_types_init(uncore_pci_uncores, false);
+	if (ret)
+		goto errtype;
 
 	uncore_pci_driver->probe = uncore_pci_probe;
 	uncore_pci_driver->remove = uncore_pci_remove;
 
 	ret = pci_register_driver(uncore_pci_driver);
 	if (ret)
-		goto err;
+		goto errtype;
 
 	pcidrv_registered = true;
 	return 0;
 
-err:
+errtype:
 	uncore_types_exit(uncore_pci_uncores);
-	uncore_pci_uncores = empty_uncore;
+	kfree(uncore_extra_pci_dev);
+	uncore_extra_pci_dev = NULL;
 	uncore_free_pcibus_map();
+err:
+	uncore_pci_uncores = empty_uncore;
 	return ret;
 }
 
@@ -1052,110 +1042,81 @@ static void __init uncore_pci_exit(void)
 		pcidrv_registered = false;
 		pci_unregister_driver(uncore_pci_driver);
 		uncore_types_exit(uncore_pci_uncores);
+		kfree(uncore_extra_pci_dev);
 		uncore_free_pcibus_map();
 	}
 }
 
-/* CPU hot plug/unplug are serialized by cpu_add_remove_lock mutex */
-static LIST_HEAD(boxes_to_free);
-
-static void uncore_kfree_boxes(void)
-{
-	struct intel_uncore_box *box;
-
-	while (!list_empty(&boxes_to_free)) {
-		box = list_entry(boxes_to_free.next,
-				 struct intel_uncore_box, list);
-		list_del(&box->list);
-		kfree(box);
-	}
-}
-
 static void uncore_cpu_dying(int cpu)
 {
-	struct intel_uncore_type *type;
+	struct intel_uncore_type *type, **types = uncore_msr_uncores;
 	struct intel_uncore_pmu *pmu;
 	struct intel_uncore_box *box;
-	int i, j;
+	int i, pkg;
 
-	for (i = 0; uncore_msr_uncores[i]; i++) {
-		type = uncore_msr_uncores[i];
-		for (j = 0; j < type->num_boxes; j++) {
-			pmu = &type->pmus[j];
-			box = *per_cpu_ptr(pmu->box, cpu);
-			*per_cpu_ptr(pmu->box, cpu) = NULL;
-			if (box && atomic_dec_and_test(&box->refcnt)) {
-				list_add(&box->list, &boxes_to_free);
+	pkg = topology_logical_package_id(cpu);
+	for (; *types; types++) {
+		type = *types;
+		pmu = type->pmus;
+		for (i = 0; i < type->num_boxes; i++, pmu++) {
+			box = pmu->boxes[pkg];
+			if (box && atomic_dec_return(&box->refcnt) == 0)
 				uncore_box_exit(box);
-			}
 		}
 	}
 }
 
-static int uncore_cpu_starting(int cpu)
+static void uncore_cpu_starting(int cpu, bool init)
 {
-	struct intel_uncore_type *type;
+	struct intel_uncore_type *type, **types = uncore_msr_uncores;
 	struct intel_uncore_pmu *pmu;
-	struct intel_uncore_box *box, *exist;
-	int i, j, k, phys_id;
-
-	phys_id = topology_physical_package_id(cpu);
-
-	for (i = 0; uncore_msr_uncores[i]; i++) {
-		type = uncore_msr_uncores[i];
-		for (j = 0; j < type->num_boxes; j++) {
-			pmu = &type->pmus[j];
-			box = *per_cpu_ptr(pmu->box, cpu);
-			/* called by uncore_cpu_init? */
-			if (box && box->phys_id >= 0) {
-				uncore_box_init(box);
-				continue;
-			}
+	struct intel_uncore_box *box;
+	int i, pkg, ncpus = 1;
 
-			for_each_online_cpu(k) {
-				exist = *per_cpu_ptr(pmu->box, k);
-				if (exist && exist->phys_id == phys_id) {
-					atomic_inc(&exist->refcnt);
-					*per_cpu_ptr(pmu->box, cpu) = exist;
-					if (box) {
-						list_add(&box->list,
-							 &boxes_to_free);
-						box = NULL;
-					}
-					break;
-				}
-			}
+	if (init) {
+		/*
+		 * On init we get the number of online cpus in the package
+		 * and set refcount for all of them.
+		 */
+		ncpus = cpumask_weight(topology_core_cpumask(cpu));
+	}
 
-			if (box) {
-				box->phys_id = phys_id;
+	pkg = topology_logical_package_id(cpu);
+	for (; *types; types++) {
+		type = *types;
+		pmu = type->pmus;
+		for (i = 0; i < type->num_boxes; i++, pmu++) {
+			box = pmu->boxes[pkg];
+			if (!box)
+				continue;
+			/* The first cpu on a package activates the box */
+			if (atomic_add_return(ncpus, &box->refcnt) == ncpus)
 				uncore_box_init(box);
-			}
 		}
 	}
-	return 0;
 }
 
-static int uncore_cpu_prepare(int cpu, int phys_id)
+static int uncore_cpu_prepare(int cpu)
 {
-	struct intel_uncore_type *type;
+	struct intel_uncore_type *type, **types = uncore_msr_uncores;
 	struct intel_uncore_pmu *pmu;
 	struct intel_uncore_box *box;
-	int i, j;
-
-	for (i = 0; uncore_msr_uncores[i]; i++) {
-		type = uncore_msr_uncores[i];
-		for (j = 0; j < type->num_boxes; j++) {
-			pmu = &type->pmus[j];
-			if (pmu->func_id < 0)
-				pmu->func_id = j;
+	int i, pkg;
 
+	pkg = topology_logical_package_id(cpu);
+	for (; *types; types++) {
+		type = *types;
+		pmu = type->pmus;
+		for (i = 0; i < type->num_boxes; i++, pmu++) {
+			if (pmu->boxes[pkg])
+				continue;
+			/* First cpu of a package allocates the box */
 			box = uncore_alloc_box(type, cpu_to_node(cpu));
 			if (!box)
 				return -ENOMEM;
-
 			box->pmu = pmu;
-			box->phys_id = phys_id;
-			*per_cpu_ptr(pmu->box, cpu) = box;
+			box->pkgid = pkg;
+			pmu->boxes[pkg] = box;
 		}
 	}
 	return 0;
@@ -1166,13 +1127,11 @@ static void uncore_change_type_ctx(struct intel_uncore_type *type, int old_cpu,
 {
 	struct intel_uncore_pmu *pmu = type->pmus;
 	struct intel_uncore_box *box;
-	int i;
+	int i, pkg;
 
+	pkg = topology_logical_package_id(old_cpu < 0 ? new_cpu : old_cpu);
 	for (i = 0; i < type->num_boxes; i++, pmu++) {
-		if (old_cpu < 0)
-			box = uncore_pmu_to_box(pmu, new_cpu);
-		else
-			box = uncore_pmu_to_box(pmu, old_cpu);
+		box = pmu->boxes[pkg];
 		if (!box)
 			continue;
 
@@ -1202,27 +1161,20 @@ static void uncore_change_context(struct intel_uncore_type **uncores,
 
 static void uncore_event_exit_cpu(int cpu)
 {
-	int i, phys_id, target;
+	int target;
 
-	/* if exiting cpu is used for collecting uncore events */
+	/* Check if exiting cpu is used for collecting uncore events */
 	if (!cpumask_test_and_clear_cpu(cpu, &uncore_cpu_mask))
 		return;
 
-	/* find a new cpu to collect uncore events */
-	phys_id = topology_physical_package_id(cpu);
-	target = -1;
-	for_each_online_cpu(i) {
-		if (i == cpu)
-			continue;
-		if (phys_id == topology_physical_package_id(i)) {
-			target = i;
-			break;
-		}
-	}
+	/* Find a new cpu to collect uncore events */
+	target = cpumask_any_but(topology_core_cpumask(cpu), cpu);
 
-	/* migrate uncore events to the new cpu */
-	if (target >= 0)
+	/* Migrate uncore events to the new target */
+	if (target < nr_cpu_ids)
 		cpumask_set_cpu(target, &uncore_cpu_mask);
+	else
+		target = -1;
 
 	uncore_change_context(uncore_msr_uncores, cpu, target);
 	uncore_change_context(uncore_pci_uncores, cpu, target);
@@ -1230,13 +1182,15 @@ static void uncore_event_exit_cpu(int cpu)
 
 static void uncore_event_init_cpu(int cpu)
 {
-	int i, phys_id;
+	int target;
 
-	phys_id = topology_physical_package_id(cpu);
-	for_each_cpu(i, &uncore_cpu_mask) {
-		if (phys_id == topology_physical_package_id(i))
-			return;
-	}
+	/*
+	 * Check if there is an online cpu in the package
+	 * which collects uncore events already.
+	 */
+	target = cpumask_any_and(&uncore_cpu_mask, topology_core_cpumask(cpu));
+	if (target < nr_cpu_ids)
+		return;
 
 	cpumask_set_cpu(cpu, &uncore_cpu_mask);
 
@@ -1249,38 +1203,25 @@ static int uncore_cpu_notifier(struct notifier_block *self,
 {
 	unsigned int cpu = (long)hcpu;
 
-	/* allocate/free data structure for uncore box */
 	switch (action & ~CPU_TASKS_FROZEN) {
 	case CPU_UP_PREPARE:
-		return notifier_from_errno(uncore_cpu_prepare(cpu, -1));
+		return notifier_from_errno(uncore_cpu_prepare(cpu));
+
 	case CPU_STARTING:
-		uncore_cpu_starting(cpu);
+		uncore_cpu_starting(cpu, false);
+	case CPU_DOWN_FAILED:
+		uncore_event_init_cpu(cpu);
 		break;
+
 	case CPU_UP_CANCELED:
 	case CPU_DYING:
 		uncore_cpu_dying(cpu);
 		break;
-	case CPU_ONLINE:
-	case CPU_DEAD:
-		uncore_kfree_boxes();
-		break;
-	default:
-		break;
-	}
 
-	/* select the cpu that collects uncore events */
-	switch (action & ~CPU_TASKS_FROZEN) {
-	case CPU_DOWN_FAILED:
-	case CPU_STARTING:
-		uncore_event_init_cpu(cpu);
-		break;
 	case CPU_DOWN_PREPARE:
 		uncore_event_exit_cpu(cpu);
 		break;
-	default:
-		break;
 	}
-
 	return NOTIFY_OK;
 }
 
@@ -1362,7 +1303,7 @@ static int __init uncore_cpu_init(void)
 		return 0;
 	}
 
-	ret = uncore_types_init(uncore_msr_uncores);
+	ret = uncore_types_init(uncore_msr_uncores, true);
 	if (ret)
 		goto err;
 
@@ -1378,39 +1319,34 @@ err:
 
 static void __init uncore_cpu_setup(void *dummy)
 {
-	uncore_cpu_starting(smp_processor_id());
+	uncore_cpu_starting(smp_processor_id(), true);
 }
 
+/* Lazy to avoid allocation of a few bytes for the normal case */
+static __initdata DECLARE_BITMAP(packages, MAX_LOCAL_APIC);
+
 static int __init uncore_cpumask_init(void)
 {
-	int cpu, ret = 0;
-
-	cpu_notifier_register_begin();
+	unsigned int cpu;
 
 	for_each_online_cpu(cpu) {
-		int i, phys_id = topology_physical_package_id(cpu);
+		unsigned int pkg = topology_logical_package_id(cpu);
+		int ret;
 
-		for_each_cpu(i, &uncore_cpu_mask) {
-			if (phys_id == topology_physical_package_id(i)) {
-				phys_id = -1;
-				break;
-			}
-		}
-		if (phys_id < 0)
+		if (test_and_set_bit(pkg, packages))
 			continue;
-
-		ret = uncore_cpu_prepare(cpu, phys_id);
+		/*
+		 * The first online cpu of each package takes the refcounts
+		 * for all other online cpus in that package.
+		 */
+		ret = uncore_cpu_prepare(cpu);
 		if (ret)
-			goto out;
+			return ret;
 		uncore_event_init_cpu(cpu);
+		smp_call_function_single(cpu, uncore_cpu_setup, NULL, 1);
 	}
-	on_each_cpu(uncore_cpu_setup, NULL, 1);
-
 	__register_cpu_notifier(&uncore_cpu_nb);
-
-out:
-	cpu_notifier_register_done();
-	return ret;
+	return 0;
 }
 
 static int __init intel_uncore_init(void)
@@ -1423,22 +1359,26 @@ static int __init intel_uncore_init(void)
 	if (cpu_has_hypervisor)
 		return -ENODEV;
 
+	max_packages = topology_max_packages();
+
 	ret = uncore_pci_init();
 	if (ret)
 		return ret;
 	ret = uncore_cpu_init();
 	if (ret)
-		goto errpci;
+		goto err;
+
+	cpu_notifier_register_begin();
 	ret = uncore_cpumask_init();
 	if (ret)
-		goto errcpu;
-
+		goto err;
+	cpu_notifier_register_done();
 	return 0;
 
-errcpu:
+err:
 	uncore_types_exit(uncore_msr_uncores);
-errpci:
 	uncore_pci_exit();
+	cpu_notifier_register_done();
 	return ret;
 }
 device_initcall(intel_uncore_init);
diff --git a/arch/x86/events/intel/uncore.h b/arch/x86/events/intel/uncore.h
index 479c1e8..01c6793 100644
--- a/arch/x86/events/intel/uncore.h
+++ b/arch/x86/events/intel/uncore.h
@@ -19,11 +19,12 @@
 #define UNCORE_EXTRA_PCI_DEV		0xff
 #define UNCORE_EXTRA_PCI_DEV_MAX	3
 
-/* support up to 8 sockets */
-#define UNCORE_SOCKET_MAX		8
-
 #define UNCORE_EVENT_CONSTRAINT(c, n) EVENT_CONSTRAINT(c, n, 0xff)
 
+struct pci_extra_dev {
+	struct pci_dev *dev[UNCORE_EXTRA_PCI_DEV_MAX];
+};
+
 struct intel_uncore_ops;
 struct intel_uncore_pmu;
 struct intel_uncore_box;
@@ -79,9 +80,9 @@ struct intel_uncore_pmu {
 	int				pmu_idx;
 	int				func_id;
 	bool				registered;
+	atomic_t			activeboxes;
 	struct intel_uncore_type	*type;
-	struct intel_uncore_box		** __percpu box;
-	struct list_head		box_list;
+	struct intel_uncore_box		**boxes;
 };
 
 struct intel_uncore_extra_reg {
@@ -91,7 +92,8 @@ struct intel_uncore_extra_reg {
 };
 
 struct intel_uncore_box {
-	int phys_id;
+	int pci_phys_id;
+	int pkgid;
 	int n_active;	/* number of active events */
 	int n_events;
 	int cpu;	/* cpu to collect events */
@@ -316,7 +318,7 @@ static inline void uncore_box_exit(struct intel_uncore_box *box)
 
 static inline bool uncore_box_is_fake(struct intel_uncore_box *box)
 {
-	return (box->phys_id < 0);
+	return (box->pkgid < 0);
 }
 
 static inline struct intel_uncore_pmu *uncore_event_to_pmu(struct perf_event *event)
@@ -345,7 +347,7 @@ extern struct intel_uncore_type **uncore_pci_uncores;
 extern struct pci_driver *uncore_pci_driver;
 extern raw_spinlock_t pci2phy_map_lock;
 extern struct list_head pci2phy_map_head;
-extern struct pci_dev *uncore_extra_pci_dev[UNCORE_SOCKET_MAX][UNCORE_EXTRA_PCI_DEV_MAX];
+extern struct pci_extra_dev *uncore_extra_pci_dev;
 extern struct event_constraint uncore_constraint_empty;
 
 /* perf_event_intel_uncore_snb.c */
diff --git a/arch/x86/events/intel/uncore_snbep.c b/arch/x86/events/intel/uncore_snbep.c
index d967fcc..93f6bd9 100644
--- a/arch/x86/events/intel/uncore_snbep.c
+++ b/arch/x86/events/intel/uncore_snbep.c
@@ -986,7 +986,9 @@ static void snbep_qpi_enable_event(struct intel_uncore_box *box, struct perf_eve
 
 	if (reg1->idx != EXTRA_REG_NONE) {
 		int idx = box->pmu->pmu_idx + SNBEP_PCI_QPI_PORT0_FILTER;
-		struct pci_dev *filter_pdev = uncore_extra_pci_dev[box->phys_id][idx];
+		int pkg = topology_phys_to_logical_pkg(box->pci_phys_id);
+		struct pci_dev *filter_pdev = uncore_extra_pci_dev[pkg].dev[idx];
+
 		if (filter_pdev) {
 			pci_write_config_dword(filter_pdev, reg1->reg,
 						(u32)reg1->config);
@@ -2520,14 +2522,16 @@ static struct intel_uncore_type *hswep_msr_uncores[] = {
 
 void hswep_uncore_cpu_init(void)
 {
+	int pkg = topology_phys_to_logical_pkg(0);
+
 	if (hswep_uncore_cbox.num_boxes > boot_cpu_data.x86_max_cores)
 		hswep_uncore_cbox.num_boxes = boot_cpu_data.x86_max_cores;
 
 	/* Detect 6-8 core systems with only two SBOXes */
-	if (uncore_extra_pci_dev[0][HSWEP_PCI_PCU_3]) {
+	if (uncore_extra_pci_dev[pkg].dev[HSWEP_PCI_PCU_3]) {
 		u32 capid4;
 
-		pci_read_config_dword(uncore_extra_pci_dev[0][HSWEP_PCI_PCU_3],
+		pci_read_config_dword(uncore_extra_pci_dev[pkg].dev[HSWEP_PCI_PCU_3],
 				      0x94, &capid4);
 		if (((capid4 >> 6) & 0x3) == 0)
 			hswep_uncore_sbox.num_boxes = 2;

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [tip:perf/core] perf/x86/intel/uncore: Clear all hardware state on exit
  2016-02-22 22:19 ` [patch V3 13/28] x86/perf/intel_uncore: Clear all hardware state on exit Thomas Gleixner
@ 2016-02-29 11:08   ` tip-bot for Thomas Gleixner
  0 siblings, 0 replies; 58+ messages in thread
From: tip-bot for Thomas Gleixner @ 2016-02-29 11:08 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: vincent.weaver, kan.liang, acme, hpa, andi.kleen, tglx, mingo,
	linux-kernel, peterz, harish.chegondi, jacob.jun.pan, jolsa, bp,
	eranian, torvalds

Commit-ID:  7b672d6433f0f31c95be74b649090cd8b3a6388d
Gitweb:     http://git.kernel.org/tip/7b672d6433f0f31c95be74b649090cd8b3a6388d
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Mon, 22 Feb 2016 22:19:17 +0000
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Mon, 29 Feb 2016 09:35:19 +0100

perf/x86/intel/uncore: Clear all hardware state on exit

The only missing bit is to completely clear the hardware state on failure
exit. This is now a pretty simple exercise.

Undo the box->init_box() setup on all packages which have been initialized so
far.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Harish Chegondi <harish.chegondi@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/20160222221011.702452407@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/events/intel/uncore.c | 26 ++++++++++++++++++++++++++
 1 file changed, 26 insertions(+)

diff --git a/arch/x86/events/intel/uncore.c b/arch/x86/events/intel/uncore.c
index 096fc0d..5ad8e52 100644
--- a/arch/x86/events/intel/uncore.c
+++ b/arch/x86/events/intel/uncore.c
@@ -754,6 +754,30 @@ static void uncore_pmu_unregister(struct intel_uncore_pmu *pmu)
 	pmu->registered = false;
 }
 
+static void __init __uncore_exit_boxes(struct intel_uncore_type *type, int cpu)
+{
+	struct intel_uncore_pmu *pmu = type->pmus;
+	struct intel_uncore_box *box;
+	int i, pkg;
+
+	if (pmu) {
+		pkg = topology_physical_package_id(cpu);
+		for (i = 0; i < type->num_boxes; i++, pmu++) {
+			box = pmu->boxes[pkg];
+			if (box)
+				uncore_box_exit(box);
+		}
+	}
+}
+
+static void __init uncore_exit_boxes(void *dummy)
+{
+	struct intel_uncore_type **types;
+
+	for (types = uncore_msr_uncores; *types; types++)
+		__uncore_exit_boxes(*types++, smp_processor_id());
+}
+
 static void uncore_free_boxes(struct intel_uncore_pmu *pmu)
 {
 	int pkg;
@@ -1376,6 +1400,8 @@ static int __init intel_uncore_init(void)
 	return 0;
 
 err:
+	/* Undo box->init_box() */
+	on_each_cpu_mask(&uncore_cpu_mask, uncore_exit_boxes, NULL, 1);
 	uncore_types_exit(uncore_msr_uncores);
 	uncore_pci_exit();
 	cpu_notifier_register_done();

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [tip:perf/core] perf/x86/intel/uncore: Make PCI and MSR uncore independent
  2016-02-22 22:19 ` [patch V3 14/28] x86/perf/intel_uncore: Make PCI and MSR uncore independent Thomas Gleixner
@ 2016-02-29 11:08   ` tip-bot for Thomas Gleixner
  0 siblings, 0 replies; 58+ messages in thread
From: tip-bot for Thomas Gleixner @ 2016-02-29 11:08 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: jacob.jun.pan, eranian, ak, jolsa, hpa, bp, torvalds, tglx,
	mingo, andi.kleen, kan.liang, peterz, harish.chegondi,
	vincent.weaver, linux-kernel, acme

Commit-ID:  5485592c17954b2a91fbd12980af596dc666079d
Gitweb:     http://git.kernel.org/tip/5485592c17954b2a91fbd12980af596dc666079d
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Mon, 22 Feb 2016 22:19:17 +0000
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Mon, 29 Feb 2016 09:35:19 +0100

perf/x86/intel/uncore: Make PCI and MSR uncore independent

Andi wanted to do this before, but the patch fell down the cracks. Implement
it with the proper error handling.

Requested-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Harish Chegondi <harish.chegondi@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/20160222221011.799159968@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/events/intel/uncore.c | 34 ++++++++++++++++++----------------
 1 file changed, 18 insertions(+), 16 deletions(-)

diff --git a/arch/x86/events/intel/uncore.c b/arch/x86/events/intel/uncore.c
index 5ad8e52..7012d18 100644
--- a/arch/x86/events/intel/uncore.c
+++ b/arch/x86/events/intel/uncore.c
@@ -1023,7 +1023,7 @@ static int __init uncore_pci_init(void)
 		ret = skl_uncore_pci_init();
 		break;
 	default:
-		return 0;
+		return -ENODEV;
 	}
 
 	if (ret)
@@ -1324,7 +1324,7 @@ static int __init uncore_cpu_init(void)
 		knl_uncore_cpu_init();
 		break;
 	default:
-		return 0;
+		return -ENODEV;
 	}
 
 	ret = uncore_types_init(uncore_msr_uncores, true);
@@ -1349,7 +1349,7 @@ static void __init uncore_cpu_setup(void *dummy)
 /* Lazy to avoid allocation of a few bytes for the normal case */
 static __initdata DECLARE_BITMAP(packages, MAX_LOCAL_APIC);
 
-static int __init uncore_cpumask_init(void)
+static int __init uncore_cpumask_init(bool msr)
 {
 	unsigned int cpu;
 
@@ -1360,12 +1360,15 @@ static int __init uncore_cpumask_init(void)
 		if (test_and_set_bit(pkg, packages))
 			continue;
 		/*
-		 * The first online cpu of each package takes the refcounts
-		 * for all other online cpus in that package.
+		 * The first online cpu of each package allocates and takes
+		 * the refcounts for all other online cpus in that package.
+		 * If msrs are not enabled no allocation is required.
 		 */
-		ret = uncore_cpu_prepare(cpu);
-		if (ret)
-			return ret;
+		if (msr) {
+			ret = uncore_cpu_prepare(cpu);
+			if (ret)
+				return ret;
+		}
 		uncore_event_init_cpu(cpu);
 		smp_call_function_single(cpu, uncore_cpu_setup, NULL, 1);
 	}
@@ -1375,7 +1378,7 @@ static int __init uncore_cpumask_init(void)
 
 static int __init intel_uncore_init(void)
 {
-	int ret;
+	int pret, cret, ret;
 
 	if (boot_cpu_data.x86_vendor != X86_VENDOR_INTEL)
 		return -ENODEV;
@@ -1385,15 +1388,14 @@ static int __init intel_uncore_init(void)
 
 	max_packages = topology_max_packages();
 
-	ret = uncore_pci_init();
-	if (ret)
-		return ret;
-	ret = uncore_cpu_init();
-	if (ret)
-		goto err;
+	pret = uncore_pci_init();
+	cret = uncore_cpu_init();
+
+	if (cret && pret)
+		return -ENODEV;
 
 	cpu_notifier_register_begin();
-	ret = uncore_cpumask_init();
+	ret = uncore_cpumask_init(!cret);
 	if (ret)
 		goto err;
 	cpu_notifier_register_done();

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [tip:perf/core] cpumask: Export cpumask_any_but()
  2016-02-22 22:19 ` [patch V3 15/28] cpumask: Export cpumask_any_but Thomas Gleixner
@ 2016-02-29 11:08   ` tip-bot for Thomas Gleixner
  0 siblings, 0 replies; 58+ messages in thread
From: tip-bot for Thomas Gleixner @ 2016-02-29 11:08 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: acme, peterz, rusty, mingo, andi.kleen, eranian, davem, hpa,
	vincent.weaver, tglx, jacob.jun.pan, jolsa, harish.chegondi, bp,
	linux-kernel, kan.liang, torvalds

Commit-ID:  3712bba1a260ad851f3aa8ddea9cb7326f6aa0b3
Gitweb:     http://git.kernel.org/tip/3712bba1a260ad851f3aa8ddea9cb7326f6aa0b3
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Mon, 22 Feb 2016 22:19:18 +0000
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Mon, 29 Feb 2016 09:35:20 +0100

cpumask: Export cpumask_any_but()

Almost every cpumask function is exported, just not the one I need to make the
Intel uncore driver modular.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: David S. Miller <davem@davemloft.net>
Cc: Harish Chegondi <harish.chegondi@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Stephane Eranian <eranian@google.com>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/20160222221011.878299859@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 lib/cpumask.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/lib/cpumask.c b/lib/cpumask.c
index 5a70f61..81dedaa 100644
--- a/lib/cpumask.c
+++ b/lib/cpumask.c
@@ -41,6 +41,7 @@ int cpumask_any_but(const struct cpumask *mask, unsigned int cpu)
 			break;
 	return i;
 }
+EXPORT_SYMBOL(cpumask_any_but);
 
 /* These are not inline because of header tangles. */
 #ifdef CONFIG_CPUMASK_OFFSTACK

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [tip:perf/core] perf/x86/intel/cqm: Get rid of the silly for_each_cpu() lookups
  2016-02-22 22:19 ` [patch V3 17/28] x86/perf/cqm: Get rid of the silly for_each_cpu lookups Thomas Gleixner
@ 2016-02-29 11:09   ` tip-bot for Thomas Gleixner
  0 siblings, 0 replies; 58+ messages in thread
From: tip-bot for Thomas Gleixner @ 2016-02-29 11:09 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: hpa, acme, tglx, linux-kernel, peterz, bp, jolsa,
	harish.chegondi, eranian, kan.liang, andi.kleen, torvalds,
	jacob.jun.pan, mingo, vincent.weaver

Commit-ID:  827db839cd051cfde5a618fb4427e86dc91bc9aa
Gitweb:     http://git.kernel.org/tip/827db839cd051cfde5a618fb4427e86dc91bc9aa
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Mon, 22 Feb 2016 22:19:20 +0000
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Mon, 29 Feb 2016 09:35:21 +0100

perf/x86/intel/cqm: Get rid of the silly for_each_cpu() lookups

CQM is a strict per package facility. Use the proper cpumasks to lookup the
readers.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Harish Chegondi <harish.chegondi@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/20160222221012.054916179@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/events/intel/cqm.c | 32 +++++++++++---------------------
 1 file changed, 11 insertions(+), 21 deletions(-)

diff --git a/arch/x86/events/intel/cqm.c b/arch/x86/events/intel/cqm.c
index 1b064c4..93cb412 100644
--- a/arch/x86/events/intel/cqm.c
+++ b/arch/x86/events/intel/cqm.c
@@ -1244,15 +1244,12 @@ static struct pmu intel_cqm_pmu = {
 
 static inline void cqm_pick_event_reader(int cpu)
 {
-	int phys_id = topology_physical_package_id(cpu);
-	int i;
+	int reader;
 
-	for_each_cpu(i, &cqm_cpumask) {
-		if (phys_id == topology_physical_package_id(i))
-			return;	/* already got reader for this socket */
-	}
-
-	cpumask_set_cpu(cpu, &cqm_cpumask);
+	/* First online cpu in package becomes the reader */
+	reader = cpumask_any_and(&cqm_cpumask, topology_core_cpumask(cpu));
+	if (reader >= nr_cpu_ids)
+		cpumask_set_cpu(cpu, &cqm_cpumask);
 }
 
 static void intel_cqm_cpu_starting(unsigned int cpu)
@@ -1270,24 +1267,17 @@ static void intel_cqm_cpu_starting(unsigned int cpu)
 
 static void intel_cqm_cpu_exit(unsigned int cpu)
 {
-	int phys_id = topology_physical_package_id(cpu);
-	int i;
+	int target;
 
-	/*
-	 * Is @cpu a designated cqm reader?
-	 */
+	/* Is @cpu the current cqm reader for this package ? */
 	if (!cpumask_test_and_clear_cpu(cpu, &cqm_cpumask))
 		return;
 
-	for_each_online_cpu(i) {
-		if (i == cpu)
-			continue;
+	/* Find another online reader in this package */
+	target = cpumask_any_but(topology_core_cpumask(cpu), cpu);
 
-		if (phys_id == topology_physical_package_id(i)) {
-			cpumask_set_cpu(i, &cqm_cpumask);
-			break;
-		}
-	}
+	if (target < nr_cpu_ids)
+		cpumask_set_cpu(target, &cqm_cpumask);
 }
 
 static int intel_cqm_cpu_notifier(struct notifier_block *nb,

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [tip:perf/core] perf/x86/intel/rapl: Make Knights Landings support functional
  2016-02-22 22:19 ` [patch V3 18/28] x86/perf/intel_rapl: Make Knights Landings support functional Thomas Gleixner
@ 2016-02-29 11:09   ` tip-bot for Thomas Gleixner
  0 siblings, 0 replies; 58+ messages in thread
From: tip-bot for Thomas Gleixner @ 2016-02-29 11:09 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: tglx, vincent.weaver, linux-kernel, eranian, jolsa, mingo, acme,
	andi.kleen, peterz, dasaratharaman.chandramouli, jacob.jun.pan,
	kan.liang, harish.chegondi, bp, hpa, torvalds

Commit-ID:  4d120c535d638a952e644817ba3cbef30deedb2b
Gitweb:     http://git.kernel.org/tip/4d120c535d638a952e644817ba3cbef30deedb2b
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Mon, 22 Feb 2016 22:19:20 +0000
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Mon, 29 Feb 2016 09:35:21 +0100

perf/x86/intel/rapl: Make Knights Landings support functional

The Knights Landings support added the events and the detection case, but then
returns 0 without actually initializing the driver.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Cc: Harish Chegondi <harish.chegondi@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: linux-kernel@vger.kernel.org
Fixes: 3a2a7797326a4 "perf/x86/intel/rapl: Add support for Knights Landing (KNL)"
Link: http://lkml.kernel.org/r/20160222221012.149331888@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/events/intel/rapl.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/events/intel/rapl.c b/arch/x86/events/intel/rapl.c
index 580f504..536f0ce 100644
--- a/arch/x86/events/intel/rapl.c
+++ b/arch/x86/events/intel/rapl.c
@@ -731,7 +731,7 @@ static int __init rapl_pmu_init(void)
 		rapl_add_quirk(rapl_hsw_server_quirk);
 		rapl_cntr_mask = RAPL_IDX_KNL;
 		rapl_pmu_events_group.attrs = rapl_events_knl_attr;
-
+		break;
 	default:
 		/* unsupported */
 		return 0;

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [tip:perf/core] perf/x86/intel/rapl: Add proper error handling
  2016-02-22 22:19 ` [patch V3 19/28] x86/perf/intel/rapl: Add proper error handling Thomas Gleixner
@ 2016-02-29 11:10   ` tip-bot for Thomas Gleixner
  0 siblings, 0 replies; 58+ messages in thread
From: tip-bot for Thomas Gleixner @ 2016-02-29 11:10 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: harish.chegondi, torvalds, linux-kernel, acme, andi.kleen,
	peterz, eranian, kan.liang, jolsa, bp, mingo, vincent.weaver,
	tglx, hpa, jacob.jun.pan

Commit-ID:  55f2890f0726fe4a1f41a3a0e72ca1a263f095c3
Gitweb:     http://git.kernel.org/tip/55f2890f0726fe4a1f41a3a0e72ca1a263f095c3
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Mon, 22 Feb 2016 22:19:21 +0000
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Mon, 29 Feb 2016 09:35:22 +0100

perf/x86/intel/rapl: Add proper error handling

Like uncore the rapl driver lacks error handling. It leaks memory and leaves
the hotplug notifier registered.

Add the proper error checks, cleanup the memory and register the hotplug
notifier only on success.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Harish Chegondi <harish.chegondi@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/20160222221012.231222076@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/events/intel/rapl.c | 29 +++++++++++++++++++++--------
 1 file changed, 21 insertions(+), 8 deletions(-)

diff --git a/arch/x86/events/intel/rapl.c b/arch/x86/events/intel/rapl.c
index 536f0ce..98b04d2 100644
--- a/arch/x86/events/intel/rapl.c
+++ b/arch/x86/events/intel/rapl.c
@@ -686,6 +686,14 @@ static int rapl_check_hw_unit(void)
 	return 0;
 }
 
+static void __init cleanup_rapl_pmus(void)
+{
+	int cpu;
+
+	for_each_online_cpu(cpu)
+		kfree(per_cpu(rapl_pmu, cpu));
+}
+
 static const struct x86_cpu_id rapl_cpu_match[] = {
 	[0] = { .vendor = X86_VENDOR_INTEL, .family = 6 },
 	[1] = {},
@@ -702,7 +710,7 @@ static int __init rapl_pmu_init(void)
 	 * check for Intel processor family 6
 	 */
 	if (!x86_match_cpu(rapl_cpu_match))
-		return 0;
+		return -ENODEV;
 
 	/* check supported CPU */
 	switch (boot_cpu_data.x86_model) {
@@ -734,8 +742,9 @@ static int __init rapl_pmu_init(void)
 		break;
 	default:
 		/* unsupported */
-		return 0;
+		return -ENODEV;
 	}
+
 	ret = rapl_check_hw_unit();
 	if (ret)
 		return ret;
@@ -743,6 +752,7 @@ static int __init rapl_pmu_init(void)
 	/* run cpu model quirks */
 	for (quirk = rapl_quirks; quirk; quirk = quirk->next)
 		quirk->func();
+
 	cpu_notifier_register_begin();
 
 	for_each_online_cpu(cpu) {
@@ -752,15 +762,14 @@ static int __init rapl_pmu_init(void)
 		rapl_cpu_init(cpu);
 	}
 
-	__perf_cpu_notifier(rapl_cpu_notifier);
-
 	ret = perf_pmu_register(&rapl_pmu_class, "power", -1);
 	if (WARN_ON(ret)) {
 		pr_info("RAPL PMU detected, registration failed (%d), RAPL PMU disabled\n", ret);
-		cpu_notifier_register_done();
-		return -1;
+		goto out;
 	}
 
+	__perf_cpu_notifier(rapl_cpu_notifier);
+
 	pmu = __this_cpu_read(rapl_pmu);
 
 	pr_info("RAPL PMU detected,"
@@ -775,9 +784,13 @@ static int __init rapl_pmu_init(void)
 				rapl_domain_names[i], rapl_hw_unit[i]);
 		}
 	}
-out:
-	cpu_notifier_register_done();
 
+	cpu_notifier_register_done();
 	return 0;
+
+out:
+	cleanup_rapl_pmus();
+	cpu_notifier_register_done();
+	return ret;
 }
 device_initcall(rapl_pmu_init);

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [tip:perf/core] perf/x86/intel/rapl: Sanitize the quirk handling
  2016-02-22 22:19 ` [patch V3 20/28] x86/perf/intel/rapl: Sanitize the quirk handling Thomas Gleixner
@ 2016-02-29 11:10   ` tip-bot for Thomas Gleixner
  2016-03-04 17:49     ` Borislav Petkov
  0 siblings, 1 reply; 58+ messages in thread
From: tip-bot for Thomas Gleixner @ 2016-02-29 11:10 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: torvalds, mingo, eranian, harish.chegondi, acme, jolsa, tglx,
	peterz, hpa, andi.kleen, vincent.weaver, kan.liang,
	jacob.jun.pan, linux-kernel, bp

Commit-ID:  b8b3319a471b2df35ae0a8fe94223638468e9ca4
Gitweb:     http://git.kernel.org/tip/b8b3319a471b2df35ae0a8fe94223638468e9ca4
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Mon, 22 Feb 2016 22:19:22 +0000
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Mon, 29 Feb 2016 09:35:22 +0100

perf/x86/intel/rapl: Sanitize the quirk handling

There is no point in having a quirk machinery for a single possible
function. Get rid of it and move the quirk to a place where it actually
makes sense.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Harish Chegondi <harish.chegondi@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/20160222221012.311639465@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/events/intel/rapl.c | 49 +++++++++++++++++---------------------------
 1 file changed, 19 insertions(+), 30 deletions(-)

diff --git a/arch/x86/events/intel/rapl.c b/arch/x86/events/intel/rapl.c
index 98b04d2..976558a 100644
--- a/arch/x86/events/intel/rapl.c
+++ b/arch/x86/events/intel/rapl.c
@@ -133,7 +133,6 @@ static int rapl_cntr_mask;
 static DEFINE_PER_CPU(struct rapl_pmu *, rapl_pmu);
 static DEFINE_PER_CPU(struct rapl_pmu *, rapl_pmu_to_free);
 
-static struct x86_pmu_quirk *rapl_quirks;
 static inline u64 rapl_read_counter(struct perf_event *event)
 {
 	u64 raw;
@@ -141,15 +140,6 @@ static inline u64 rapl_read_counter(struct perf_event *event)
 	return raw;
 }
 
-#define rapl_add_quirk(func_)						\
-do {									\
-	static struct x86_pmu_quirk __quirk __initdata = {		\
-		.func = func_,						\
-	};								\
-	__quirk.next = rapl_quirks;					\
-	rapl_quirks = &__quirk;						\
-} while (0)
-
 static inline u64 rapl_scale(u64 v, int cfg)
 {
 	if (cfg > NR_RAPL_DOMAINS) {
@@ -564,17 +554,6 @@ static void rapl_cpu_init(int cpu)
 	cpumask_set_cpu(cpu, &rapl_cpu_mask);
 }
 
-static __init void rapl_hsw_server_quirk(void)
-{
-	/*
-	 * DRAM domain on HSW server has fixed energy unit which can be
-	 * different than the unit from power unit MSR.
-	 * "Intel Xeon Processor E5-1600 and E5-2600 v3 Product Families, V2
-	 * of 2. Datasheet, September 2014, Reference Number: 330784-001 "
-	 */
-	rapl_hw_unit[RAPL_IDX_RAM_NRG_STAT] = 16;
-}
-
 static int rapl_cpu_prepare(int cpu)
 {
 	struct rapl_pmu *pmu = per_cpu(rapl_pmu, cpu);
@@ -672,7 +651,18 @@ static int rapl_cpu_notifier(struct notifier_block *self,
 	return NOTIFY_OK;
 }
 
-static int rapl_check_hw_unit(void)
+static __init void rapl_hsw_server_quirk(void)
+{
+	/*
+	 * DRAM domain on HSW server has fixed energy unit which can be
+	 * different than the unit from power unit MSR.
+	 * "Intel Xeon Processor E5-1600 and E5-2600 v3 Product Families, V2
+	 * of 2. Datasheet, September 2014, Reference Number: 330784-001 "
+	 */
+	rapl_hw_unit[RAPL_IDX_RAM_NRG_STAT] = 16;
+}
+
+static int rapl_check_hw_unit(void (*quirk)(void))
 {
 	u64 msr_rapl_power_unit_bits;
 	int i;
@@ -683,6 +673,9 @@ static int rapl_check_hw_unit(void)
 	for (i = 0; i < NR_RAPL_DOMAINS; i++)
 		rapl_hw_unit[i] = (msr_rapl_power_unit_bits >> 8) & 0x1FULL;
 
+	/* Apply cpu model quirk */
+	if (quirk)
+		quirk();
 	return 0;
 }
 
@@ -701,9 +694,9 @@ static const struct x86_cpu_id rapl_cpu_match[] = {
 
 static int __init rapl_pmu_init(void)
 {
+	void (*quirk)(void) = NULL;
 	struct rapl_pmu *pmu;
 	int cpu, ret;
-	struct x86_pmu_quirk *quirk;
 	int i;
 
 	/*
@@ -720,7 +713,7 @@ static int __init rapl_pmu_init(void)
 		rapl_pmu_events_group.attrs = rapl_events_cln_attr;
 		break;
 	case 63: /* Haswell-Server */
-		rapl_add_quirk(rapl_hsw_server_quirk);
+		quirk = rapl_hsw_server_quirk;
 		rapl_cntr_mask = RAPL_IDX_SRV;
 		rapl_pmu_events_group.attrs = rapl_events_srv_attr;
 		break;
@@ -736,7 +729,7 @@ static int __init rapl_pmu_init(void)
 		rapl_pmu_events_group.attrs = rapl_events_srv_attr;
 		break;
 	case 87: /* Knights Landing */
-		rapl_add_quirk(rapl_hsw_server_quirk);
+		quirk = rapl_hsw_server_quirk;
 		rapl_cntr_mask = RAPL_IDX_KNL;
 		rapl_pmu_events_group.attrs = rapl_events_knl_attr;
 		break;
@@ -745,14 +738,10 @@ static int __init rapl_pmu_init(void)
 		return -ENODEV;
 	}
 
-	ret = rapl_check_hw_unit();
+	ret = rapl_check_hw_unit(quirk);
 	if (ret)
 		return ret;
 
-	/* run cpu model quirks */
-	for (quirk = rapl_quirks; quirk; quirk = quirk->next)
-		quirk->func();
-
 	cpu_notifier_register_begin();
 
 	for_each_online_cpu(cpu) {

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [tip:perf/core] perf/x86/intel/rapl: Calculate timing once
  2016-02-22 22:19 ` [patch V3 21/28] x86/perf/intel/rapl: Calculate timing once Thomas Gleixner
@ 2016-02-29 11:10   ` tip-bot for Thomas Gleixner
  0 siblings, 0 replies; 58+ messages in thread
From: tip-bot for Thomas Gleixner @ 2016-02-29 11:10 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: harish.chegondi, vincent.weaver, mingo, bp, hpa, eranian,
	torvalds, peterz, jolsa, linux-kernel, jacob.jun.pan, acme,
	andi.kleen, kan.liang, tglx

Commit-ID:  75c7003fbf419f05215d4ca09bf3964d0b1c99e9
Gitweb:     http://git.kernel.org/tip/75c7003fbf419f05215d4ca09bf3964d0b1c99e9
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Mon, 22 Feb 2016 22:19:22 +0000
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Mon, 29 Feb 2016 09:35:23 +0100

perf/x86/intel/rapl: Calculate timing once

No point in doing the same calculation over and over. Do it once in
rapl_check_hw_unit().

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Harish Chegondi <harish.chegondi@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/20160222221012.409238136@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/events/intel/rapl.c | 39 ++++++++++++++++++---------------------
 1 file changed, 18 insertions(+), 21 deletions(-)

diff --git a/arch/x86/events/intel/rapl.c b/arch/x86/events/intel/rapl.c
index 976558a..b1d4a2f 100644
--- a/arch/x86/events/intel/rapl.c
+++ b/arch/x86/events/intel/rapl.c
@@ -129,6 +129,7 @@ static int rapl_hw_unit[NR_RAPL_DOMAINS] __read_mostly;  /* 1/2^hw_unit Joule */
 static struct pmu rapl_pmu_class;
 static cpumask_t rapl_cpu_mask;
 static int rapl_cntr_mask;
+static u64 rapl_timer_ms;
 
 static DEFINE_PER_CPU(struct rapl_pmu *, rapl_pmu);
 static DEFINE_PER_CPU(struct rapl_pmu *, rapl_pmu_to_free);
@@ -558,7 +559,6 @@ static int rapl_cpu_prepare(int cpu)
 {
 	struct rapl_pmu *pmu = per_cpu(rapl_pmu, cpu);
 	int phys_id = topology_physical_package_id(cpu);
-	u64 ms;
 
 	if (pmu)
 		return 0;
@@ -575,19 +575,7 @@ static int rapl_cpu_prepare(int cpu)
 
 	pmu->pmu = &rapl_pmu_class;
 
-	/*
-	 * use reference of 200W for scaling the timeout
-	 * to avoid missing counter overflows.
-	 * 200W = 200 Joules/sec
-	 * divide interval by 2 to avoid lockstep (2 * 100)
-	 * if hw unit is 32, then we use 2 ms 1/200/2
-	 */
-	if (rapl_hw_unit[0] < 32)
-		ms = (1000 / (2 * 100)) * (1ULL << (32 - rapl_hw_unit[0] - 1));
-	else
-		ms = 2;
-
-	pmu->timer_interval = ms_to_ktime(ms);
+	pmu->timer_interval = ms_to_ktime(rapl_timer_ms);
 
 	rapl_hrtimer_init(pmu);
 
@@ -676,6 +664,19 @@ static int rapl_check_hw_unit(void (*quirk)(void))
 	/* Apply cpu model quirk */
 	if (quirk)
 		quirk();
+
+	/*
+	 * Calculate the timer rate:
+	 * Use reference of 200W for scaling the timeout to avoid counter
+	 * overflows. 200W = 200 Joules/sec
+	 * Divide interval by 2 to avoid lockstep (2 * 100)
+	 * if hw unit is 32, then we use 2 ms 1/200/2
+	 */
+	rapl_timer_ms = 2;
+	if (rapl_hw_unit[0] < 32) {
+		rapl_timer_ms = (1000 / (2 * 100));
+		rapl_timer_ms *= (1ULL << (32 - rapl_hw_unit[0] - 1));
+	}
 	return 0;
 }
 
@@ -695,9 +696,7 @@ static const struct x86_cpu_id rapl_cpu_match[] = {
 static int __init rapl_pmu_init(void)
 {
 	void (*quirk)(void) = NULL;
-	struct rapl_pmu *pmu;
-	int cpu, ret;
-	int i;
+	int cpu, ret, i;
 
 	/*
 	 * check for Intel processor family 6
@@ -758,15 +757,14 @@ static int __init rapl_pmu_init(void)
 	}
 
 	__perf_cpu_notifier(rapl_cpu_notifier);
-
-	pmu = __this_cpu_read(rapl_pmu);
+	cpu_notifier_register_done();
 
 	pr_info("RAPL PMU detected,"
 		" API unit is 2^-32 Joules,"
 		" %d fixed counters"
 		" %llu ms ovfl timer\n",
 		hweight32(rapl_cntr_mask),
-		ktime_to_ms(pmu->timer_interval));
+		rapl_timer_ms);
 	for (i = 0; i < NR_RAPL_DOMAINS; i++) {
 		if (rapl_cntr_mask & (1 << i)) {
 			pr_info("hw unit of domain %s 2^-%d Joules\n",
@@ -774,7 +772,6 @@ static int __init rapl_pmu_init(void)
 		}
 	}
 
-	cpu_notifier_register_done();
 	return 0;
 
 out:

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [tip:perf/core] perf/x86/intel/rapl: Clean up the printk output
  2016-02-22 22:19 ` [patch V3 22/28] x86/perf/intel/rapl: Cleanup the printk output Thomas Gleixner
@ 2016-02-29 11:11   ` tip-bot for Thomas Gleixner
  0 siblings, 0 replies; 58+ messages in thread
From: tip-bot for Thomas Gleixner @ 2016-02-29 11:11 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: eranian, jacob.jun.pan, kan.liang, torvalds, hpa, tglx, jolsa,
	vincent.weaver, bp, harish.chegondi, andi.kleen, mingo, acme,
	linux-kernel, peterz

Commit-ID:  512089d98457b7913d2e4762a44af52fbcd87470
Gitweb:     http://git.kernel.org/tip/512089d98457b7913d2e4762a44af52fbcd87470
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Mon, 22 Feb 2016 22:19:23 +0000
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Mon, 29 Feb 2016 09:35:23 +0100

perf/x86/intel/rapl: Clean up the printk output

The output is inconsistent. Use a proper pr_fmt prefix and split out the
advertisement into a seperate function.

Remove the WARN_ON() in the failure case. It's pointless as we already know
where it failed.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Harish Chegondi <harish.chegondi@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/20160222221012.504551295@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/events/intel/rapl.c | 42 +++++++++++++++++++++++-------------------
 1 file changed, 23 insertions(+), 19 deletions(-)

diff --git a/arch/x86/events/intel/rapl.c b/arch/x86/events/intel/rapl.c
index b1d4a2f..f31e4b4 100644
--- a/arch/x86/events/intel/rapl.c
+++ b/arch/x86/events/intel/rapl.c
@@ -44,6 +44,9 @@
  * the duration of the measurement. Tools may use a function such as
  * ldexp(raw_count, -32);
  */
+
+#define pr_fmt(fmt) "RAPL PMU: " fmt
+
 #include <linux/module.h>
 #include <linux/slab.h>
 #include <linux/perf_event.h>
@@ -144,7 +147,7 @@ static inline u64 rapl_read_counter(struct perf_event *event)
 static inline u64 rapl_scale(u64 v, int cfg)
 {
 	if (cfg > NR_RAPL_DOMAINS) {
-		pr_warn("invalid domain %d, failed to scale data\n", cfg);
+		pr_warn("Invalid domain %d, failed to scale data\n", cfg);
 		return v;
 	}
 	/*
@@ -680,6 +683,21 @@ static int rapl_check_hw_unit(void (*quirk)(void))
 	return 0;
 }
 
+static void __init rapl_advertise(void)
+{
+	int i;
+
+	pr_info("API unit is 2^-32 Joules, %d fixed counters, %llu ms ovfl timer\n",
+		hweight32(rapl_cntr_mask), rapl_timer_ms);
+
+	for (i = 0; i < NR_RAPL_DOMAINS; i++) {
+		if (rapl_cntr_mask & (1 << i)) {
+			pr_info("hw unit of domain %s 2^-%d Joules\n",
+				rapl_domain_names[i], rapl_hw_unit[i]);
+		}
+	}
+}
+
 static void __init cleanup_rapl_pmus(void)
 {
 	int cpu;
@@ -696,7 +714,7 @@ static const struct x86_cpu_id rapl_cpu_match[] = {
 static int __init rapl_pmu_init(void)
 {
 	void (*quirk)(void) = NULL;
-	int cpu, ret, i;
+	int cpu, ret;
 
 	/*
 	 * check for Intel processor family 6
@@ -751,30 +769,16 @@ static int __init rapl_pmu_init(void)
 	}
 
 	ret = perf_pmu_register(&rapl_pmu_class, "power", -1);
-	if (WARN_ON(ret)) {
-		pr_info("RAPL PMU detected, registration failed (%d), RAPL PMU disabled\n", ret);
+	if (ret)
 		goto out;
-	}
 
 	__perf_cpu_notifier(rapl_cpu_notifier);
 	cpu_notifier_register_done();
-
-	pr_info("RAPL PMU detected,"
-		" API unit is 2^-32 Joules,"
-		" %d fixed counters"
-		" %llu ms ovfl timer\n",
-		hweight32(rapl_cntr_mask),
-		rapl_timer_ms);
-	for (i = 0; i < NR_RAPL_DOMAINS; i++) {
-		if (rapl_cntr_mask & (1 << i)) {
-			pr_info("hw unit of domain %s 2^-%d Joules\n",
-				rapl_domain_names[i], rapl_hw_unit[i]);
-		}
-	}
-
+	rapl_advertise();
 	return 0;
 
 out:
+	pr_warn("Initialization failed (%d), disabled\n", ret);
 	cleanup_rapl_pmus();
 	cpu_notifier_register_done();
 	return ret;

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [tip:perf/core] perf/x86/intel/rapl: Refactor the code some more
  2016-02-22 22:19 ` [patch V3 23/28] x86/perf/intel/rapl: Refactor code some more Thomas Gleixner
@ 2016-02-29 11:11   ` tip-bot for Thomas Gleixner
  0 siblings, 0 replies; 58+ messages in thread
From: tip-bot for Thomas Gleixner @ 2016-02-29 11:11 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, jacob.jun.pan, tglx, andi.kleen, bp,
	harish.chegondi, torvalds, jolsa, eranian, mingo, vincent.weaver,
	hpa, acme, kan.liang, peterz

Commit-ID:  7162b8fea63061b6231bc5e8a0fed55167e71b4c
Gitweb:     http://git.kernel.org/tip/7162b8fea63061b6231bc5e8a0fed55167e71b4c
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Mon, 22 Feb 2016 22:19:24 +0000
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Mon, 29 Feb 2016 09:35:24 +0100

perf/x86/intel/rapl: Refactor the code some more

Split out code from init into seperate functions. Tidy up the code and get rid
of pointless comments. I wish there would be comments for code which is not
obvious....

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Harish Chegondi <harish.chegondi@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/20160222221012.588544679@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/events/intel/rapl.c | 61 ++++++++++++++++++++++----------------------
 1 file changed, 31 insertions(+), 30 deletions(-)

diff --git a/arch/x86/events/intel/rapl.c b/arch/x86/events/intel/rapl.c
index f31e4b4..ba5043b 100644
--- a/arch/x86/events/intel/rapl.c
+++ b/arch/x86/events/intel/rapl.c
@@ -110,7 +110,7 @@ static ssize_t __rapl_##_var##_show(struct kobject *kobj,	\
 static struct kobj_attribute format_attr_##_var =		\
 	__ATTR(_name, 0444, __rapl_##_var##_show, NULL)
 
-#define RAPL_CNTR_WIDTH 32 /* 32-bit rapl counters */
+#define RAPL_CNTR_WIDTH 32
 
 #define RAPL_EVENT_ATTR_STR(_name, v, str)					\
 static struct perf_pmu_events_attr event_attr_##v = {				\
@@ -120,15 +120,16 @@ static struct perf_pmu_events_attr event_attr_##v = {				\
 };
 
 struct rapl_pmu {
-	spinlock_t	 lock;
-	int		 n_active; /* number of active events */
-	struct list_head active_list;
-	struct pmu	 *pmu; /* pointer to rapl_pmu_class */
-	ktime_t		 timer_interval; /* in ktime_t unit */
-	struct hrtimer   hrtimer;
+	spinlock_t		lock;
+	int			n_active;
+	struct list_head	active_list;
+	struct pmu		*pmu;
+	ktime_t			timer_interval;
+	struct hrtimer		hrtimer;
 };
 
-static int rapl_hw_unit[NR_RAPL_DOMAINS] __read_mostly;  /* 1/2^hw_unit Joule */
+ /* 1/2^hw_unit Joule */
+static int rapl_hw_unit[NR_RAPL_DOMAINS] __read_mostly;
 static struct pmu rapl_pmu_class;
 static cpumask_t rapl_cpu_mask;
 static int rapl_cntr_mask;
@@ -200,11 +201,6 @@ static void rapl_start_hrtimer(struct rapl_pmu *pmu)
 		     HRTIMER_MODE_REL_PINNED);
 }
 
-static void rapl_stop_hrtimer(struct rapl_pmu *pmu)
-{
-	hrtimer_cancel(&pmu->hrtimer);
-}
-
 static enum hrtimer_restart rapl_hrtimer_handle(struct hrtimer *hrtimer)
 {
 	struct rapl_pmu *pmu = __this_cpu_read(rapl_pmu);
@@ -216,9 +212,8 @@ static enum hrtimer_restart rapl_hrtimer_handle(struct hrtimer *hrtimer)
 
 	spin_lock_irqsave(&pmu->lock, flags);
 
-	list_for_each_entry(event, &pmu->active_list, active_entry) {
+	list_for_each_entry(event, &pmu->active_list, active_entry)
 		rapl_event_update(event);
-	}
 
 	spin_unlock_irqrestore(&pmu->lock, flags);
 
@@ -275,7 +270,7 @@ static void rapl_pmu_event_stop(struct perf_event *event, int mode)
 		WARN_ON_ONCE(pmu->n_active <= 0);
 		pmu->n_active--;
 		if (pmu->n_active == 0)
-			rapl_stop_hrtimer(pmu);
+			hrtimer_cancel(&pmu->hrtimer);
 
 		list_del(&event->active_entry);
 
@@ -542,7 +537,7 @@ static void rapl_cpu_exit(int cpu)
 		perf_pmu_migrate_context(pmu->pmu, cpu, target);
 
 	/* cancel overflow polling timer for CPU */
-	rapl_stop_hrtimer(pmu);
+	hrtimer_cancel(&pmu->hrtimer);
 }
 
 static void rapl_cpu_init(int cpu)
@@ -698,6 +693,20 @@ static void __init rapl_advertise(void)
 	}
 }
 
+static int __init rapl_prepare_cpus(void)
+{
+	unsigned int cpu;
+	int ret;
+
+	for_each_online_cpu(cpu) {
+		ret = rapl_cpu_prepare(cpu);
+		if (ret)
+			return ret;
+		rapl_cpu_init(cpu);
+	}
+	return 0;
+}
+
 static void __init cleanup_rapl_pmus(void)
 {
 	int cpu;
@@ -706,7 +715,7 @@ static void __init cleanup_rapl_pmus(void)
 		kfree(per_cpu(rapl_pmu, cpu));
 }
 
-static const struct x86_cpu_id rapl_cpu_match[] = {
+static const struct x86_cpu_id rapl_cpu_match[] __initconst = {
 	[0] = { .vendor = X86_VENDOR_INTEL, .family = 6 },
 	[1] = {},
 };
@@ -714,15 +723,11 @@ static const struct x86_cpu_id rapl_cpu_match[] = {
 static int __init rapl_pmu_init(void)
 {
 	void (*quirk)(void) = NULL;
-	int cpu, ret;
+	int ret;
 
-	/*
-	 * check for Intel processor family 6
-	 */
 	if (!x86_match_cpu(rapl_cpu_match))
 		return -ENODEV;
 
-	/* check supported CPU */
 	switch (boot_cpu_data.x86_model) {
 	case 42: /* Sandy Bridge */
 	case 58: /* Ivy Bridge */
@@ -751,7 +756,6 @@ static int __init rapl_pmu_init(void)
 		rapl_pmu_events_group.attrs = rapl_events_knl_attr;
 		break;
 	default:
-		/* unsupported */
 		return -ENODEV;
 	}
 
@@ -761,12 +765,9 @@ static int __init rapl_pmu_init(void)
 
 	cpu_notifier_register_begin();
 
-	for_each_online_cpu(cpu) {
-		ret = rapl_cpu_prepare(cpu);
-		if (ret)
-			goto out;
-		rapl_cpu_init(cpu);
-	}
+	ret = rapl_prepare_cpus();
+	if (ret)
+		goto out;
 
 	ret = perf_pmu_register(&rapl_pmu_class, "power", -1);
 	if (ret)

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [tip:perf/core] perf/x86/intel/rapl: Make PMU lock raw
  2016-02-22 22:19 ` [patch V3 24/28] x86/perf/intel/rapl: Make pmu lock raw Thomas Gleixner
@ 2016-02-29 11:12   ` tip-bot for Thomas Gleixner
  0 siblings, 0 replies; 58+ messages in thread
From: tip-bot for Thomas Gleixner @ 2016-02-29 11:12 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: tglx, jolsa, vincent.weaver, harish.chegondi, peterz,
	linux-kernel, jacob.jun.pan, bp, kan.liang, mingo, hpa,
	andi.kleen, torvalds, acme, eranian

Commit-ID:  a208749c642618b6c106874153906b53f8e2ded9
Gitweb:     http://git.kernel.org/tip/a208749c642618b6c106874153906b53f8e2ded9
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Mon, 22 Feb 2016 22:19:25 +0000
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Mon, 29 Feb 2016 09:35:25 +0100

perf/x86/intel/rapl: Make PMU lock raw

This lock is taken in hard interrupt context even on Preempt-RT. Make it raw
so RT does not have to patch it.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Harish Chegondi <harish.chegondi@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/20160222221012.669411833@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/events/intel/rapl.c | 20 ++++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/arch/x86/events/intel/rapl.c b/arch/x86/events/intel/rapl.c
index ba5043b..2934999 100644
--- a/arch/x86/events/intel/rapl.c
+++ b/arch/x86/events/intel/rapl.c
@@ -120,7 +120,7 @@ static struct perf_pmu_events_attr event_attr_##v = {				\
 };
 
 struct rapl_pmu {
-	spinlock_t		lock;
+	raw_spinlock_t		lock;
 	int			n_active;
 	struct list_head	active_list;
 	struct pmu		*pmu;
@@ -210,12 +210,12 @@ static enum hrtimer_restart rapl_hrtimer_handle(struct hrtimer *hrtimer)
 	if (!pmu->n_active)
 		return HRTIMER_NORESTART;
 
-	spin_lock_irqsave(&pmu->lock, flags);
+	raw_spin_lock_irqsave(&pmu->lock, flags);
 
 	list_for_each_entry(event, &pmu->active_list, active_entry)
 		rapl_event_update(event);
 
-	spin_unlock_irqrestore(&pmu->lock, flags);
+	raw_spin_unlock_irqrestore(&pmu->lock, flags);
 
 	hrtimer_forward_now(hrtimer, pmu->timer_interval);
 
@@ -252,9 +252,9 @@ static void rapl_pmu_event_start(struct perf_event *event, int mode)
 	struct rapl_pmu *pmu = __this_cpu_read(rapl_pmu);
 	unsigned long flags;
 
-	spin_lock_irqsave(&pmu->lock, flags);
+	raw_spin_lock_irqsave(&pmu->lock, flags);
 	__rapl_pmu_event_start(pmu, event);
-	spin_unlock_irqrestore(&pmu->lock, flags);
+	raw_spin_unlock_irqrestore(&pmu->lock, flags);
 }
 
 static void rapl_pmu_event_stop(struct perf_event *event, int mode)
@@ -263,7 +263,7 @@ static void rapl_pmu_event_stop(struct perf_event *event, int mode)
 	struct hw_perf_event *hwc = &event->hw;
 	unsigned long flags;
 
-	spin_lock_irqsave(&pmu->lock, flags);
+	raw_spin_lock_irqsave(&pmu->lock, flags);
 
 	/* mark event as deactivated and stopped */
 	if (!(hwc->state & PERF_HES_STOPPED)) {
@@ -288,7 +288,7 @@ static void rapl_pmu_event_stop(struct perf_event *event, int mode)
 		hwc->state |= PERF_HES_UPTODATE;
 	}
 
-	spin_unlock_irqrestore(&pmu->lock, flags);
+	raw_spin_unlock_irqrestore(&pmu->lock, flags);
 }
 
 static int rapl_pmu_event_add(struct perf_event *event, int mode)
@@ -297,14 +297,14 @@ static int rapl_pmu_event_add(struct perf_event *event, int mode)
 	struct hw_perf_event *hwc = &event->hw;
 	unsigned long flags;
 
-	spin_lock_irqsave(&pmu->lock, flags);
+	raw_spin_lock_irqsave(&pmu->lock, flags);
 
 	hwc->state = PERF_HES_UPTODATE | PERF_HES_STOPPED;
 
 	if (mode & PERF_EF_START)
 		__rapl_pmu_event_start(pmu, event);
 
-	spin_unlock_irqrestore(&pmu->lock, flags);
+	raw_spin_unlock_irqrestore(&pmu->lock, flags);
 
 	return 0;
 }
@@ -567,7 +567,7 @@ static int rapl_cpu_prepare(int cpu)
 	pmu = kzalloc_node(sizeof(*pmu), GFP_KERNEL, cpu_to_node(cpu));
 	if (!pmu)
 		return -1;
-	spin_lock_init(&pmu->lock);
+	raw_spin_lock_init(&pmu->lock);
 
 	INIT_LIST_HEAD(&pmu->active_list);
 

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [tip:perf/core] perf/x86/intel/rapl: Utilize event->pmu_private
  2016-02-22 22:19 ` [patch V3 25/28] x86/perf/intel/rapl: Utilize event->pmu_private Thomas Gleixner
@ 2016-02-29 11:12   ` tip-bot for Thomas Gleixner
  0 siblings, 0 replies; 58+ messages in thread
From: tip-bot for Thomas Gleixner @ 2016-02-29 11:12 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: harish.chegondi, kan.liang, tglx, hpa, mingo, peterz, acme,
	jacob.jun.pan, bp, linux-kernel, torvalds, eranian, andi.kleen,
	vincent.weaver, jolsa

Commit-ID:  8a6d2f8f73caa8b8eb596a9e2d2e0b15d64751a4
Gitweb:     http://git.kernel.org/tip/8a6d2f8f73caa8b8eb596a9e2d2e0b15d64751a4
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Mon, 22 Feb 2016 22:19:25 +0000
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Mon, 29 Feb 2016 09:35:25 +0100

perf/x86/intel/rapl: Utilize event->pmu_private

Store the PMU pointer in event->pmu_private and use it instead of the per CPU
data. Preparatory step to get rid of the per CPU allocations. The usage sites
are the perf fast path, so we keep that even after the conversion to per
package storage as a CPU to package lookup involves 3 loads versus 1 with the
pmu_private pointer.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Harish Chegondi <harish.chegondi@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/20160222221012.748151799@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/events/intel/rapl.c | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/arch/x86/events/intel/rapl.c b/arch/x86/events/intel/rapl.c
index 2934999..753d90c 100644
--- a/arch/x86/events/intel/rapl.c
+++ b/arch/x86/events/intel/rapl.c
@@ -122,6 +122,7 @@ static struct perf_pmu_events_attr event_attr_##v = {				\
 struct rapl_pmu {
 	raw_spinlock_t		lock;
 	int			n_active;
+	int			cpu;
 	struct list_head	active_list;
 	struct pmu		*pmu;
 	ktime_t			timer_interval;
@@ -203,7 +204,7 @@ static void rapl_start_hrtimer(struct rapl_pmu *pmu)
 
 static enum hrtimer_restart rapl_hrtimer_handle(struct hrtimer *hrtimer)
 {
-	struct rapl_pmu *pmu = __this_cpu_read(rapl_pmu);
+	struct rapl_pmu *pmu = container_of(hrtimer, struct rapl_pmu, hrtimer);
 	struct perf_event *event;
 	unsigned long flags;
 
@@ -249,7 +250,7 @@ static void __rapl_pmu_event_start(struct rapl_pmu *pmu,
 
 static void rapl_pmu_event_start(struct perf_event *event, int mode)
 {
-	struct rapl_pmu *pmu = __this_cpu_read(rapl_pmu);
+	struct rapl_pmu *pmu = event->pmu_private;
 	unsigned long flags;
 
 	raw_spin_lock_irqsave(&pmu->lock, flags);
@@ -259,7 +260,7 @@ static void rapl_pmu_event_start(struct perf_event *event, int mode)
 
 static void rapl_pmu_event_stop(struct perf_event *event, int mode)
 {
-	struct rapl_pmu *pmu = __this_cpu_read(rapl_pmu);
+	struct rapl_pmu *pmu = event->pmu_private;
 	struct hw_perf_event *hwc = &event->hw;
 	unsigned long flags;
 
@@ -293,7 +294,7 @@ static void rapl_pmu_event_stop(struct perf_event *event, int mode)
 
 static int rapl_pmu_event_add(struct perf_event *event, int mode)
 {
-	struct rapl_pmu *pmu = __this_cpu_read(rapl_pmu);
+	struct rapl_pmu *pmu = event->pmu_private;
 	struct hw_perf_event *hwc = &event->hw;
 	unsigned long flags;
 
@@ -316,6 +317,7 @@ static void rapl_pmu_event_del(struct perf_event *event, int flags)
 
 static int rapl_pmu_event_init(struct perf_event *event)
 {
+	struct rapl_pmu *pmu = __this_cpu_read(rapl_pmu);
 	u64 cfg = event->attr.config & RAPL_EVENT_MASK;
 	int bit, msr, ret = 0;
 
@@ -327,6 +329,9 @@ static int rapl_pmu_event_init(struct perf_event *event)
 	if (event->attr.config & ~RAPL_EVENT_MASK)
 		return -EINVAL;
 
+	if (event->cpu < 0)
+		return -EINVAL;
+
 	/*
 	 * check event is known (determines counter)
 	 */
@@ -365,6 +370,8 @@ static int rapl_pmu_event_init(struct perf_event *event)
 		return -EINVAL;
 
 	/* must be done before validate_group */
+	event->cpu = pmu->cpu;
+	event->pmu_private = pmu;
 	event->hw.event_base = msr;
 	event->hw.config = cfg;
 	event->hw.idx = bit;
@@ -572,6 +579,7 @@ static int rapl_cpu_prepare(int cpu)
 	INIT_LIST_HEAD(&pmu->active_list);
 
 	pmu->pmu = &rapl_pmu_class;
+	pmu->cpu = cpu;
 
 	pmu->timer_interval = ms_to_ktime(rapl_timer_ms);
 

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [tip:perf/core] perf/x86/intel/rapl: Convert it to a per package facility
  2016-02-22 22:19 ` [patch V3 26/28] x86/perf/intel/rapl: Convert it to a per package facility Thomas Gleixner
@ 2016-02-29 11:12   ` tip-bot for Thomas Gleixner
  0 siblings, 0 replies; 58+ messages in thread
From: tip-bot for Thomas Gleixner @ 2016-02-29 11:12 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: harish.chegondi, andi.kleen, bp, tglx, torvalds, peterz,
	jacob.jun.pan, vincent.weaver, eranian, jolsa, kan.liang,
	linux-kernel, mingo, hpa, acme

Commit-ID:  9de8d686955b0e8e27847ed4edbbcd280f3fd853
Gitweb:     http://git.kernel.org/tip/9de8d686955b0e8e27847ed4edbbcd280f3fd853
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Mon, 22 Feb 2016 22:19:26 +0000
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Mon, 29 Feb 2016 09:35:26 +0100

perf/x86/intel/rapl: Convert it to a per package facility

RAPL is a per package facility and we already have a mechanism for a dedicated
per package reader. So there is no point to have multiple CPUs doing the
same. The current implementation actually starts two timers on two CPUs if one
does:

	perf stat -C1,2 -e -e power/energy-pkg ....

which makes the whole concept of 1 reader per package moot.

What's worse is that the above returns the double of the actual energy
consumption, but that's a different problem to address and cannot be solved by
removing the pointless per cpuness of that mechanism.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Harish Chegondi <harish.chegondi@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/20160222221012.845369524@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/events/intel/rapl.c | 194 +++++++++++++++++++------------------------
 1 file changed, 86 insertions(+), 108 deletions(-)

diff --git a/arch/x86/events/intel/rapl.c b/arch/x86/events/intel/rapl.c
index 753d90c..019e541 100644
--- a/arch/x86/events/intel/rapl.c
+++ b/arch/x86/events/intel/rapl.c
@@ -129,15 +129,23 @@ struct rapl_pmu {
 	struct hrtimer		hrtimer;
 };
 
+struct rapl_pmus {
+	struct pmu		pmu;
+	unsigned int		maxpkg;
+	struct rapl_pmu		*pmus[];
+};
+
  /* 1/2^hw_unit Joule */
 static int rapl_hw_unit[NR_RAPL_DOMAINS] __read_mostly;
-static struct pmu rapl_pmu_class;
+static struct rapl_pmus *rapl_pmus;
 static cpumask_t rapl_cpu_mask;
-static int rapl_cntr_mask;
+static unsigned int rapl_cntr_mask;
 static u64 rapl_timer_ms;
 
-static DEFINE_PER_CPU(struct rapl_pmu *, rapl_pmu);
-static DEFINE_PER_CPU(struct rapl_pmu *, rapl_pmu_to_free);
+static inline struct rapl_pmu *cpu_to_rapl_pmu(unsigned int cpu)
+{
+	return rapl_pmus->pmus[topology_logical_package_id(cpu)];
+}
 
 static inline u64 rapl_read_counter(struct perf_event *event)
 {
@@ -317,12 +325,12 @@ static void rapl_pmu_event_del(struct perf_event *event, int flags)
 
 static int rapl_pmu_event_init(struct perf_event *event)
 {
-	struct rapl_pmu *pmu = __this_cpu_read(rapl_pmu);
 	u64 cfg = event->attr.config & RAPL_EVENT_MASK;
 	int bit, msr, ret = 0;
+	struct rapl_pmu *pmu;
 
 	/* only look at RAPL events */
-	if (event->attr.type != rapl_pmu_class.type)
+	if (event->attr.type != rapl_pmus->pmu.type)
 		return -ENOENT;
 
 	/* check only supported bits are set */
@@ -370,6 +378,7 @@ static int rapl_pmu_event_init(struct perf_event *event)
 		return -EINVAL;
 
 	/* must be done before validate_group */
+	pmu = cpu_to_rapl_pmu(event->cpu);
 	event->cpu = pmu->cpu;
 	event->pmu_private = pmu;
 	event->hw.event_base = msr;
@@ -502,116 +511,62 @@ const struct attribute_group *rapl_attr_groups[] = {
 	NULL,
 };
 
-static struct pmu rapl_pmu_class = {
-	.attr_groups	= rapl_attr_groups,
-	.task_ctx_nr	= perf_invalid_context, /* system-wide only */
-	.event_init	= rapl_pmu_event_init,
-	.add		= rapl_pmu_event_add, /* must have */
-	.del		= rapl_pmu_event_del, /* must have */
-	.start		= rapl_pmu_event_start,
-	.stop		= rapl_pmu_event_stop,
-	.read		= rapl_pmu_event_read,
-};
-
 static void rapl_cpu_exit(int cpu)
 {
-	struct rapl_pmu *pmu = per_cpu(rapl_pmu, cpu);
-	int i, phys_id = topology_physical_package_id(cpu);
-	int target = -1;
+	struct rapl_pmu *pmu = cpu_to_rapl_pmu(cpu);
+	int target;
 
-	/* find a new cpu on same package */
-	for_each_online_cpu(i) {
-		if (i == cpu)
-			continue;
-		if (phys_id == topology_physical_package_id(i)) {
-			target = i;
-			break;
-		}
-	}
-	/*
-	 * clear cpu from cpumask
-	 * if was set in cpumask and still some cpu on package,
-	 * then move to new cpu
-	 */
-	if (cpumask_test_and_clear_cpu(cpu, &rapl_cpu_mask) && target >= 0)
-		cpumask_set_cpu(target, &rapl_cpu_mask);
+	/* Check if exiting cpu is used for collecting rapl events */
+	if (!cpumask_test_and_clear_cpu(cpu, &rapl_cpu_mask))
+		return;
 
-	WARN_ON(cpumask_empty(&rapl_cpu_mask));
-	/*
-	 * migrate events and context to new cpu
-	 */
-	if (target >= 0)
-		perf_pmu_migrate_context(pmu->pmu, cpu, target);
+	pmu->cpu = -1;
+	/* Find a new cpu to collect rapl events */
+	target = cpumask_any_but(topology_core_cpumask(cpu), cpu);
 
-	/* cancel overflow polling timer for CPU */
-	hrtimer_cancel(&pmu->hrtimer);
+	/* Migrate rapl events to the new target */
+	if (target < nr_cpu_ids) {
+		cpumask_set_cpu(target, &rapl_cpu_mask);
+		pmu->cpu = target;
+		perf_pmu_migrate_context(pmu->pmu, cpu, target);
+	}
 }
 
 static void rapl_cpu_init(int cpu)
 {
-	int i, phys_id = topology_physical_package_id(cpu);
+	struct rapl_pmu *pmu = cpu_to_rapl_pmu(cpu);
+	int target;
+
+	/*
+	 * Check if there is an online cpu in the package which collects rapl
+	 * events already.
+	 */
+	target = cpumask_any_and(&rapl_cpu_mask, topology_core_cpumask(cpu));
+	if (target < nr_cpu_ids)
+		return;
 
-	/* check if phys_is is already covered */
-	for_each_cpu(i, &rapl_cpu_mask) {
-		if (phys_id == topology_physical_package_id(i))
-			return;
-	}
-	/* was not found, so add it */
 	cpumask_set_cpu(cpu, &rapl_cpu_mask);
+	pmu->cpu = cpu;
 }
 
 static int rapl_cpu_prepare(int cpu)
 {
-	struct rapl_pmu *pmu = per_cpu(rapl_pmu, cpu);
-	int phys_id = topology_physical_package_id(cpu);
+	struct rapl_pmu *pmu = cpu_to_rapl_pmu(cpu);
 
 	if (pmu)
 		return 0;
 
-	if (phys_id < 0)
-		return -1;
-
 	pmu = kzalloc_node(sizeof(*pmu), GFP_KERNEL, cpu_to_node(cpu));
 	if (!pmu)
-		return -1;
-	raw_spin_lock_init(&pmu->lock);
+		return -ENOMEM;
 
+	raw_spin_lock_init(&pmu->lock);
 	INIT_LIST_HEAD(&pmu->active_list);
-
-	pmu->pmu = &rapl_pmu_class;
-	pmu->cpu = cpu;
-
+	pmu->pmu = &rapl_pmus->pmu;
 	pmu->timer_interval = ms_to_ktime(rapl_timer_ms);
-
+	pmu->cpu = -1;
 	rapl_hrtimer_init(pmu);
-
-	/* set RAPL pmu for this cpu for now */
-	per_cpu(rapl_pmu, cpu) = pmu;
-	per_cpu(rapl_pmu_to_free, cpu) = NULL;
-
-	return 0;
-}
-
-static void rapl_cpu_kfree(int cpu)
-{
-	struct rapl_pmu *pmu = per_cpu(rapl_pmu_to_free, cpu);
-
-	kfree(pmu);
-
-	per_cpu(rapl_pmu_to_free, cpu) = NULL;
-}
-
-static int rapl_cpu_dying(int cpu)
-{
-	struct rapl_pmu *pmu = per_cpu(rapl_pmu, cpu);
-
-	if (!pmu)
-		return 0;
-
-	per_cpu(rapl_pmu, cpu) = NULL;
-
-	per_cpu(rapl_pmu_to_free, cpu) = pmu;
-
+	rapl_pmus->pmus[topology_logical_package_id(cpu)] = pmu;
 	return 0;
 }
 
@@ -624,24 +579,16 @@ static int rapl_cpu_notifier(struct notifier_block *self,
 	case CPU_UP_PREPARE:
 		rapl_cpu_prepare(cpu);
 		break;
-	case CPU_STARTING:
-		rapl_cpu_init(cpu);
-		break;
-	case CPU_UP_CANCELED:
-	case CPU_DYING:
-		rapl_cpu_dying(cpu);
-		break;
+
+	case CPU_DOWN_FAILED:
 	case CPU_ONLINE:
-	case CPU_DEAD:
-		rapl_cpu_kfree(cpu);
+		rapl_cpu_init(cpu);
 		break;
+
 	case CPU_DOWN_PREPARE:
 		rapl_cpu_exit(cpu);
 		break;
-	default:
-		break;
 	}
-
 	return NOTIFY_OK;
 }
 
@@ -703,10 +650,14 @@ static void __init rapl_advertise(void)
 
 static int __init rapl_prepare_cpus(void)
 {
-	unsigned int cpu;
+	unsigned int cpu, pkg;
 	int ret;
 
 	for_each_online_cpu(cpu) {
+		pkg = topology_logical_package_id(cpu);
+		if (rapl_pmus->pmus[pkg])
+			continue;
+
 		ret = rapl_cpu_prepare(cpu);
 		if (ret)
 			return ret;
@@ -717,10 +668,33 @@ static int __init rapl_prepare_cpus(void)
 
 static void __init cleanup_rapl_pmus(void)
 {
-	int cpu;
+	int i;
+
+	for (i = 0; i < rapl_pmus->maxpkg; i++)
+		kfree(rapl_pmus->pmus + i);
+	kfree(rapl_pmus);
+}
 
-	for_each_online_cpu(cpu)
-		kfree(per_cpu(rapl_pmu, cpu));
+static int __init init_rapl_pmus(void)
+{
+	int maxpkg = topology_max_packages();
+	size_t size;
+
+	size = sizeof(*rapl_pmus) + maxpkg * sizeof(struct rapl_pmu *);
+	rapl_pmus = kzalloc(size, GFP_KERNEL);
+	if (!rapl_pmus)
+		return -ENOMEM;
+
+	rapl_pmus->maxpkg		= maxpkg;
+	rapl_pmus->pmu.attr_groups	= rapl_attr_groups;
+	rapl_pmus->pmu.task_ctx_nr	= perf_invalid_context;
+	rapl_pmus->pmu.event_init	= rapl_pmu_event_init;
+	rapl_pmus->pmu.add		= rapl_pmu_event_add;
+	rapl_pmus->pmu.del		= rapl_pmu_event_del;
+	rapl_pmus->pmu.start		= rapl_pmu_event_start;
+	rapl_pmus->pmu.stop		= rapl_pmu_event_stop;
+	rapl_pmus->pmu.read		= rapl_pmu_event_read;
+	return 0;
 }
 
 static const struct x86_cpu_id rapl_cpu_match[] __initconst = {
@@ -771,13 +745,17 @@ static int __init rapl_pmu_init(void)
 	if (ret)
 		return ret;
 
+	ret = init_rapl_pmus();
+	if (ret)
+		return ret;
+
 	cpu_notifier_register_begin();
 
 	ret = rapl_prepare_cpus();
 	if (ret)
 		goto out;
 
-	ret = perf_pmu_register(&rapl_pmu_class, "power", -1);
+	ret = perf_pmu_register(&rapl_pmus->pmu, "power", -1);
 	if (ret)
 		goto out;
 

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [tip:perf/core] perf: Export perf_event_sysfs_show()
  2016-02-22 22:19 ` [patch V3 27/28] perf: Export perf_event_sysfs_show Thomas Gleixner
@ 2016-02-29 11:13   ` tip-bot for Thomas Gleixner
  0 siblings, 0 replies; 58+ messages in thread
From: tip-bot for Thomas Gleixner @ 2016-02-29 11:13 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: hpa, harish.chegondi, jolsa, peterz, tglx, acme, vincent.weaver,
	andi.kleen, torvalds, eranian, bp, mingo, linux-kernel,
	kan.liang, jacob.jun.pan

Commit-ID:  675965b00d734c985e4285f5bec7e524d15fc4e1
Gitweb:     http://git.kernel.org/tip/675965b00d734c985e4285f5bec7e524d15fc4e1
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Mon, 22 Feb 2016 22:19:27 +0000
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Mon, 29 Feb 2016 09:35:27 +0100

perf: Export perf_event_sysfs_show()

Required to use it in modular perf drivers.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Harish Chegondi <harish.chegondi@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/20160222221012.930735780@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 kernel/events/core.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index df5589f..5dcc0bd 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -9446,6 +9446,7 @@ ssize_t perf_event_sysfs_show(struct device *dev, struct device_attribute *attr,
 
 	return 0;
 }
+EXPORT_SYMBOL_GPL(perf_event_sysfs_show);
 
 static int __init perf_event_sysfs_init(void)
 {

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [tip:perf/core] perf/x86/intel/rapl: Sanitize the quirk handling
  2016-02-29 11:10   ` [tip:perf/core] perf/x86/intel/rapl: " tip-bot for Thomas Gleixner
@ 2016-03-04 17:49     ` Borislav Petkov
  2016-03-08 16:04       ` Ingo Molnar
  0 siblings, 1 reply; 58+ messages in thread
From: Borislav Petkov @ 2016-03-04 17:49 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: linux-tip-commits, torvalds, mingo, eranian, harish.chegondi,
	acme, jolsa, tglx, peterz, hpa, andi.kleen, vincent.weaver,
	kan.liang, jacob.jun.pan, linux-kernel

On Mon, Feb 29, 2016 at 03:10:28AM -0800, tip-bot for Thomas Gleixner wrote:
> Commit-ID:  b8b3319a471b2df35ae0a8fe94223638468e9ca4
> Gitweb:     http://git.kernel.org/tip/b8b3319a471b2df35ae0a8fe94223638468e9ca4
> Author:     Thomas Gleixner <tglx@linutronix.de>
> AuthorDate: Mon, 22 Feb 2016 22:19:22 +0000
> Committer:  Ingo Molnar <mingo@kernel.org>
> CommitDate: Mon, 29 Feb 2016 09:35:22 +0100
> 
> perf/x86/intel/rapl: Sanitize the quirk handling
> 
> There is no point in having a quirk machinery for a single possible
> function. Get rid of it and move the quirk to a place where it actually
> makes sense.

I guess you can simplify this even more by even getting rid of that
quirk function pointer funkyness and simply tell rapl_check_hw_unit() to
apply the quirk.

Diff ontop of yours:

---
diff --git a/arch/x86/events/intel/rapl.c b/arch/x86/events/intel/rapl.c
index 019e541fa988..0b72976bab35 100644
--- a/arch/x86/events/intel/rapl.c
+++ b/arch/x86/events/intel/rapl.c
@@ -592,18 +592,7 @@ static int rapl_cpu_notifier(struct notifier_block *self,
 	return NOTIFY_OK;
 }
 
-static __init void rapl_hsw_server_quirk(void)
-{
-	/*
-	 * DRAM domain on HSW server has fixed energy unit which can be
-	 * different than the unit from power unit MSR.
-	 * "Intel Xeon Processor E5-1600 and E5-2600 v3 Product Families, V2
-	 * of 2. Datasheet, September 2014, Reference Number: 330784-001 "
-	 */
-	rapl_hw_unit[RAPL_IDX_RAM_NRG_STAT] = 16;
-}
-
-static int rapl_check_hw_unit(void (*quirk)(void))
+static int rapl_check_hw_unit(bool apply_quirk)
 {
 	u64 msr_rapl_power_unit_bits;
 	int i;
@@ -614,9 +603,14 @@ static int rapl_check_hw_unit(void (*quirk)(void))
 	for (i = 0; i < NR_RAPL_DOMAINS; i++)
 		rapl_hw_unit[i] = (msr_rapl_power_unit_bits >> 8) & 0x1FULL;
 
-	/* Apply cpu model quirk */
-	if (quirk)
-		quirk();
+	/*
+	 * DRAM domain on HSW server has fixed energy unit which can be
+	 * different than the unit from power unit MSR.
+	 * "Intel Xeon Processor E5-1600 and E5-2600 v3 Product Families, V2
+	 * of 2. Datasheet, September 2014, Reference Number: 330784-001 "
+	 */
+	if (apply_quirk)
+		rapl_hw_unit[RAPL_IDX_RAM_NRG_STAT] = 16;
 
 	/*
 	 * Calculate the timer rate:
@@ -704,7 +698,7 @@ static const struct x86_cpu_id rapl_cpu_match[] __initconst = {
 
 static int __init rapl_pmu_init(void)
 {
-	void (*quirk)(void) = NULL;
+	bool apply_quirk = false;
 	int ret;
 
 	if (!x86_match_cpu(rapl_cpu_match))
@@ -717,7 +711,7 @@ static int __init rapl_pmu_init(void)
 		rapl_pmu_events_group.attrs = rapl_events_cln_attr;
 		break;
 	case 63: /* Haswell-Server */
-		quirk = rapl_hsw_server_quirk;
+		apply_quirk = true;
 		rapl_cntr_mask = RAPL_IDX_SRV;
 		rapl_pmu_events_group.attrs = rapl_events_srv_attr;
 		break;
@@ -733,7 +727,7 @@ static int __init rapl_pmu_init(void)
 		rapl_pmu_events_group.attrs = rapl_events_srv_attr;
 		break;
 	case 87: /* Knights Landing */
-		quirk = rapl_hsw_server_quirk;
+		apply_quirk = true;
 		rapl_cntr_mask = RAPL_IDX_KNL;
 		rapl_pmu_events_group.attrs = rapl_events_knl_attr;
 		break;
@@ -741,7 +735,7 @@ static int __init rapl_pmu_init(void)
 		return -ENODEV;
 	}
 
-	ret = rapl_check_hw_unit(quirk);
+	ret = rapl_check_hw_unit(apply_quirk);
 	if (ret)
 		return ret;
 
-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [tip:perf/core] perf/x86/intel/rapl: Sanitize the quirk handling
  2016-03-04 17:49     ` Borislav Petkov
@ 2016-03-08 16:04       ` Ingo Molnar
  2016-03-08 16:40         ` [PATCH] perf/x86/intel/rapl: Simplify quirk handling even more Borislav Petkov
  0 siblings, 1 reply; 58+ messages in thread
From: Ingo Molnar @ 2016-03-08 16:04 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Thomas Gleixner, linux-tip-commits, torvalds, eranian,
	harish.chegondi, acme, jolsa, peterz, hpa, andi.kleen,
	vincent.weaver, kan.liang, jacob.jun.pan, linux-kernel


* Borislav Petkov <bp@alien8.de> wrote:

> On Mon, Feb 29, 2016 at 03:10:28AM -0800, tip-bot for Thomas Gleixner wrote:
> > Commit-ID:  b8b3319a471b2df35ae0a8fe94223638468e9ca4
> > Gitweb:     http://git.kernel.org/tip/b8b3319a471b2df35ae0a8fe94223638468e9ca4
> > Author:     Thomas Gleixner <tglx@linutronix.de>
> > AuthorDate: Mon, 22 Feb 2016 22:19:22 +0000
> > Committer:  Ingo Molnar <mingo@kernel.org>
> > CommitDate: Mon, 29 Feb 2016 09:35:22 +0100
> > 
> > perf/x86/intel/rapl: Sanitize the quirk handling
> > 
> > There is no point in having a quirk machinery for a single possible
> > function. Get rid of it and move the quirk to a place where it actually
> > makes sense.
> 
> I guess you can simplify this even more by even getting rid of that
> quirk function pointer funkyness and simply tell rapl_check_hw_unit() to
> apply the quirk.
> 
> Diff ontop of yours:
> 
> ---
> diff --git a/arch/x86/events/intel/rapl.c b/arch/x86/events/intel/rapl.c
> index 019e541fa988..0b72976bab35 100644
> --- a/arch/x86/events/intel/rapl.c
> +++ b/arch/x86/events/intel/rapl.c
> @@ -592,18 +592,7 @@ static int rapl_cpu_notifier(struct notifier_block *self,
>  	return NOTIFY_OK;
>  }
>  
> -static __init void rapl_hsw_server_quirk(void)
> -{
> -	/*
> -	 * DRAM domain on HSW server has fixed energy unit which can be
> -	 * different than the unit from power unit MSR.
> -	 * "Intel Xeon Processor E5-1600 and E5-2600 v3 Product Families, V2
> -	 * of 2. Datasheet, September 2014, Reference Number: 330784-001 "
> -	 */
> -	rapl_hw_unit[RAPL_IDX_RAM_NRG_STAT] = 16;
> -}
> -
> -static int rapl_check_hw_unit(void (*quirk)(void))
> +static int rapl_check_hw_unit(bool apply_quirk)
>  {
>  	u64 msr_rapl_power_unit_bits;
>  	int i;
> @@ -614,9 +603,14 @@ static int rapl_check_hw_unit(void (*quirk)(void))
>  	for (i = 0; i < NR_RAPL_DOMAINS; i++)
>  		rapl_hw_unit[i] = (msr_rapl_power_unit_bits >> 8) & 0x1FULL;
>  
> -	/* Apply cpu model quirk */
> -	if (quirk)
> -		quirk();
> +	/*
> +	 * DRAM domain on HSW server has fixed energy unit which can be
> +	 * different than the unit from power unit MSR.
> +	 * "Intel Xeon Processor E5-1600 and E5-2600 v3 Product Families, V2
> +	 * of 2. Datasheet, September 2014, Reference Number: 330784-001 "
> +	 */
> +	if (apply_quirk)
> +		rapl_hw_unit[RAPL_IDX_RAM_NRG_STAT] = 16;
>  
>  	/*
>  	 * Calculate the timer rate:
> @@ -704,7 +698,7 @@ static const struct x86_cpu_id rapl_cpu_match[] __initconst = {
>  
>  static int __init rapl_pmu_init(void)
>  {
> -	void (*quirk)(void) = NULL;
> +	bool apply_quirk = false;
>  	int ret;
>  
>  	if (!x86_match_cpu(rapl_cpu_match))
> @@ -717,7 +711,7 @@ static int __init rapl_pmu_init(void)
>  		rapl_pmu_events_group.attrs = rapl_events_cln_attr;
>  		break;
>  	case 63: /* Haswell-Server */
> -		quirk = rapl_hsw_server_quirk;
> +		apply_quirk = true;
>  		rapl_cntr_mask = RAPL_IDX_SRV;
>  		rapl_pmu_events_group.attrs = rapl_events_srv_attr;
>  		break;
> @@ -733,7 +727,7 @@ static int __init rapl_pmu_init(void)
>  		rapl_pmu_events_group.attrs = rapl_events_srv_attr;
>  		break;
>  	case 87: /* Knights Landing */
> -		quirk = rapl_hsw_server_quirk;
> +		apply_quirk = true;
>  		rapl_cntr_mask = RAPL_IDX_KNL;
>  		rapl_pmu_events_group.attrs = rapl_events_knl_attr;
>  		break;
> @@ -741,7 +735,7 @@ static int __init rapl_pmu_init(void)
>  		return -ENODEV;
>  	}
>  
> -	ret = rapl_check_hw_unit(quirk);
> +	ret = rapl_check_hw_unit(apply_quirk);
>  	if (ret)
>  		return ret;

Please send a patch with a changelog so it can be applied.

Thanks!

	Ingo

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [PATCH] perf/x86/intel/rapl: Simplify quirk handling even more
  2016-03-08 16:04       ` Ingo Molnar
@ 2016-03-08 16:40         ` Borislav Petkov
  2016-03-08 17:36           ` [tip:perf/core] " tip-bot for Borislav Petkov
  0 siblings, 1 reply; 58+ messages in thread
From: Borislav Petkov @ 2016-03-08 16:40 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Thomas Gleixner, linux-tip-commits, torvalds, eranian,
	harish.chegondi, acme, jolsa, peterz, hpa, andi.kleen,
	vincent.weaver, kan.liang, jacob.jun.pan, linux-kernel

Drop the quirk() function pointer in favor of a simple boolean which
says whether the quirk should be applied or not. Update comment while at
it.

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Harish Chegondi <harish.chegondi@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
---
 arch/x86/events/intel/rapl.c | 32 +++++++++++++-------------------
 1 file changed, 13 insertions(+), 19 deletions(-)

diff --git a/arch/x86/events/intel/rapl.c b/arch/x86/events/intel/rapl.c
index 019e541fa988..b834a3f55a01 100644
--- a/arch/x86/events/intel/rapl.c
+++ b/arch/x86/events/intel/rapl.c
@@ -592,18 +592,7 @@ static int rapl_cpu_notifier(struct notifier_block *self,
 	return NOTIFY_OK;
 }
 
-static __init void rapl_hsw_server_quirk(void)
-{
-	/*
-	 * DRAM domain on HSW server has fixed energy unit which can be
-	 * different than the unit from power unit MSR.
-	 * "Intel Xeon Processor E5-1600 and E5-2600 v3 Product Families, V2
-	 * of 2. Datasheet, September 2014, Reference Number: 330784-001 "
-	 */
-	rapl_hw_unit[RAPL_IDX_RAM_NRG_STAT] = 16;
-}
-
-static int rapl_check_hw_unit(void (*quirk)(void))
+static int rapl_check_hw_unit(bool apply_quirk)
 {
 	u64 msr_rapl_power_unit_bits;
 	int i;
@@ -614,9 +603,14 @@ static int rapl_check_hw_unit(void (*quirk)(void))
 	for (i = 0; i < NR_RAPL_DOMAINS; i++)
 		rapl_hw_unit[i] = (msr_rapl_power_unit_bits >> 8) & 0x1FULL;
 
-	/* Apply cpu model quirk */
-	if (quirk)
-		quirk();
+	/*
+	 * DRAM domain on HSW server and KNL has fixed energy unit which can be
+	 * different than the unit from power unit MSR. See
+	 * "Intel Xeon Processor E5-1600 and E5-2600 v3 Product Families, V2
+	 * of 2. Datasheet, September 2014, Reference Number: 330784-001 "
+	 */
+	if (apply_quirk)
+		rapl_hw_unit[RAPL_IDX_RAM_NRG_STAT] = 16;
 
 	/*
 	 * Calculate the timer rate:
@@ -704,7 +698,7 @@ static const struct x86_cpu_id rapl_cpu_match[] __initconst = {
 
 static int __init rapl_pmu_init(void)
 {
-	void (*quirk)(void) = NULL;
+	bool apply_quirk = false;
 	int ret;
 
 	if (!x86_match_cpu(rapl_cpu_match))
@@ -717,7 +711,7 @@ static int __init rapl_pmu_init(void)
 		rapl_pmu_events_group.attrs = rapl_events_cln_attr;
 		break;
 	case 63: /* Haswell-Server */
-		quirk = rapl_hsw_server_quirk;
+		apply_quirk = true;
 		rapl_cntr_mask = RAPL_IDX_SRV;
 		rapl_pmu_events_group.attrs = rapl_events_srv_attr;
 		break;
@@ -733,7 +727,7 @@ static int __init rapl_pmu_init(void)
 		rapl_pmu_events_group.attrs = rapl_events_srv_attr;
 		break;
 	case 87: /* Knights Landing */
-		quirk = rapl_hsw_server_quirk;
+		apply_quirk = true;
 		rapl_cntr_mask = RAPL_IDX_KNL;
 		rapl_pmu_events_group.attrs = rapl_events_knl_attr;
 		break;
@@ -741,7 +735,7 @@ static int __init rapl_pmu_init(void)
 		return -ENODEV;
 	}
 
-	ret = rapl_check_hw_unit(quirk);
+	ret = rapl_check_hw_unit(apply_quirk);
 	if (ret)
 		return ret;
 
-- 
2.3.5

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [tip:perf/core] perf/x86/intel/rapl: Simplify quirk handling even more
  2016-03-08 16:40         ` [PATCH] perf/x86/intel/rapl: Simplify quirk handling even more Borislav Petkov
@ 2016-03-08 17:36           ` tip-bot for Borislav Petkov
  0 siblings, 0 replies; 58+ messages in thread
From: tip-bot for Borislav Petkov @ 2016-03-08 17:36 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: andi.kleen, harish.chegondi, tglx, bp, mingo, peterz,
	linux-kernel, alexander.shishkin, eranian, torvalds, acme,
	jacob.jun.pan, bp, jolsa, kan.liang, hpa, vincent.weaver

Commit-ID:  7a8698058ae493ae53b1a8a2fa23d2e37000d73e
Gitweb:     http://git.kernel.org/tip/7a8698058ae493ae53b1a8a2fa23d2e37000d73e
Author:     Borislav Petkov <bp@alien8.de>
AuthorDate: Tue, 8 Mar 2016 17:40:41 +0100
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Tue, 8 Mar 2016 17:49:52 +0100

perf/x86/intel/rapl: Simplify quirk handling even more

Drop the quirk() function pointer in favor of a simple boolean which
says whether the quirk should be applied or not. Update comment while at
it.

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Harish Chegondi <harish.chegondi@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: linux-tip-commits@vger.kernel.org
Link: http://lkml.kernel.org/r/20160308164041.GF16568@pd.tnic
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/events/intel/rapl.c | 32 +++++++++++++-------------------
 1 file changed, 13 insertions(+), 19 deletions(-)

diff --git a/arch/x86/events/intel/rapl.c b/arch/x86/events/intel/rapl.c
index 019e541..b834a3f 100644
--- a/arch/x86/events/intel/rapl.c
+++ b/arch/x86/events/intel/rapl.c
@@ -592,18 +592,7 @@ static int rapl_cpu_notifier(struct notifier_block *self,
 	return NOTIFY_OK;
 }
 
-static __init void rapl_hsw_server_quirk(void)
-{
-	/*
-	 * DRAM domain on HSW server has fixed energy unit which can be
-	 * different than the unit from power unit MSR.
-	 * "Intel Xeon Processor E5-1600 and E5-2600 v3 Product Families, V2
-	 * of 2. Datasheet, September 2014, Reference Number: 330784-001 "
-	 */
-	rapl_hw_unit[RAPL_IDX_RAM_NRG_STAT] = 16;
-}
-
-static int rapl_check_hw_unit(void (*quirk)(void))
+static int rapl_check_hw_unit(bool apply_quirk)
 {
 	u64 msr_rapl_power_unit_bits;
 	int i;
@@ -614,9 +603,14 @@ static int rapl_check_hw_unit(void (*quirk)(void))
 	for (i = 0; i < NR_RAPL_DOMAINS; i++)
 		rapl_hw_unit[i] = (msr_rapl_power_unit_bits >> 8) & 0x1FULL;
 
-	/* Apply cpu model quirk */
-	if (quirk)
-		quirk();
+	/*
+	 * DRAM domain on HSW server and KNL has fixed energy unit which can be
+	 * different than the unit from power unit MSR. See
+	 * "Intel Xeon Processor E5-1600 and E5-2600 v3 Product Families, V2
+	 * of 2. Datasheet, September 2014, Reference Number: 330784-001 "
+	 */
+	if (apply_quirk)
+		rapl_hw_unit[RAPL_IDX_RAM_NRG_STAT] = 16;
 
 	/*
 	 * Calculate the timer rate:
@@ -704,7 +698,7 @@ static const struct x86_cpu_id rapl_cpu_match[] __initconst = {
 
 static int __init rapl_pmu_init(void)
 {
-	void (*quirk)(void) = NULL;
+	bool apply_quirk = false;
 	int ret;
 
 	if (!x86_match_cpu(rapl_cpu_match))
@@ -717,7 +711,7 @@ static int __init rapl_pmu_init(void)
 		rapl_pmu_events_group.attrs = rapl_events_cln_attr;
 		break;
 	case 63: /* Haswell-Server */
-		quirk = rapl_hsw_server_quirk;
+		apply_quirk = true;
 		rapl_cntr_mask = RAPL_IDX_SRV;
 		rapl_pmu_events_group.attrs = rapl_events_srv_attr;
 		break;
@@ -733,7 +727,7 @@ static int __init rapl_pmu_init(void)
 		rapl_pmu_events_group.attrs = rapl_events_srv_attr;
 		break;
 	case 87: /* Knights Landing */
-		quirk = rapl_hsw_server_quirk;
+		apply_quirk = true;
 		rapl_cntr_mask = RAPL_IDX_KNL;
 		rapl_pmu_events_group.attrs = rapl_events_knl_attr;
 		break;
@@ -741,7 +735,7 @@ static int __init rapl_pmu_init(void)
 		return -ENODEV;
 	}
 
-	ret = rapl_check_hw_unit(quirk);
+	ret = rapl_check_hw_unit(apply_quirk);
 	if (ret)
 		return ret;
 

^ permalink raw reply	[flat|nested] 58+ messages in thread

end of thread, other threads:[~2016-03-08 17:37 UTC | newest]

Thread overview: 58+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-02-22 22:19 [patch V3 00/28] x86/perf/intel/uncore|rapl: Fix error handling and sanitize pmu management Thomas Gleixner
2016-02-22 22:19 ` [patch V3 01/28] x86/perf/intel/uncore: Remove pointless mask check Thomas Gleixner
2016-02-29 11:03   ` [tip:perf/core] perf/x86/intel/uncore: " tip-bot for Thomas Gleixner
2016-02-22 22:19 ` [patch V3 02/28] x86/perf/intel/uncore: Simplify error rollback Thomas Gleixner
2016-02-29 11:03   ` [tip:perf/core] perf/x86/intel/uncore: " tip-bot for Thomas Gleixner
2016-02-22 22:19 ` [patch V3 03/28] x86/perf/intel/uncore: Fix error handling Thomas Gleixner
2016-02-29 11:04   ` [tip:perf/core] perf/x86/intel/uncore: " tip-bot for Thomas Gleixner
2016-02-22 22:19 ` [patch V3 04/28] x86/perf/intel/uncore: Add sanity checks for pci dev package id Thomas Gleixner
2016-02-29 11:04   ` [tip:perf/core] perf/x86/intel/uncore: Add sanity checks for PCI " tip-bot for Thomas Gleixner
2016-02-22 22:19 ` [patch V3 05/28] x86/perf/intel_uncore: Cleanup hardware on exit Thomas Gleixner
2016-02-29 11:05   ` [tip:perf/core] perf/x86/intel/uncore: Clean up " tip-bot for Thomas Gleixner
2016-02-22 22:19 ` [patch V3 06/28] x86/perf/intel/uncore: Drop pointless hotplug priority Thomas Gleixner
2016-02-22 22:19 ` [patch V3 07/28] x86/perf/intel_uncore: Make code readable Thomas Gleixner
2016-02-29 11:05   ` [tip:perf/core] perf/x86/intel/uncore: Make code more readable tip-bot for Thomas Gleixner
2016-02-22 22:19 ` [patch V3 08/28] x86/perf/uncore: Make uncore_pcibus_to_physid static Thomas Gleixner
2016-02-29 11:06   ` [tip:perf/core] perf/x86/uncore: Make uncore_pcibus_to_physid() static tip-bot for Thomas Gleixner
2016-02-22 22:19 ` [patch V3 09/28] perf: Allow storage of pmu private data in event Thomas Gleixner
2016-02-29 11:06   ` [tip:perf/core] perf: Allow storage of PMU " tip-bot for Thomas Gleixner
2016-02-22 22:19 ` [patch V3 10/28] x86/perf/intel_uncore: Store box in event->pmu_private Thomas Gleixner
2016-02-29 11:06   ` [tip:perf/core] perf/x86/intel/uncore: " tip-bot for Thomas Gleixner
2016-02-22 22:19 ` [patch V3 11/28] x86/topology: Create logical package id Thomas Gleixner
2016-02-29 11:07   ` [tip:perf/core] " tip-bot for Thomas Gleixner
2016-02-22 22:19 ` [patch V3 12/28] x86/perf/uncore: Track packages not per cpu data Thomas Gleixner
2016-02-29 11:07   ` [tip:perf/core] perf/x86/uncore: Track packages, not per CPU data tip-bot for Thomas Gleixner
2016-02-22 22:19 ` [patch V3 13/28] x86/perf/intel_uncore: Clear all hardware state on exit Thomas Gleixner
2016-02-29 11:08   ` [tip:perf/core] perf/x86/intel/uncore: " tip-bot for Thomas Gleixner
2016-02-22 22:19 ` [patch V3 14/28] x86/perf/intel_uncore: Make PCI and MSR uncore independent Thomas Gleixner
2016-02-29 11:08   ` [tip:perf/core] perf/x86/intel/uncore: " tip-bot for Thomas Gleixner
2016-02-22 22:19 ` [patch V3 15/28] cpumask: Export cpumask_any_but Thomas Gleixner
2016-02-29 11:08   ` [tip:perf/core] cpumask: Export cpumask_any_but() tip-bot for Thomas Gleixner
2016-02-22 22:19 ` [patch V3 16/28] x86/perf/intel_uncore: Make it modular Thomas Gleixner
2016-02-22 22:19 ` [patch V3 17/28] x86/perf/cqm: Get rid of the silly for_each_cpu lookups Thomas Gleixner
2016-02-29 11:09   ` [tip:perf/core] perf/x86/intel/cqm: Get rid of the silly for_each_cpu() lookups tip-bot for Thomas Gleixner
2016-02-22 22:19 ` [patch V3 18/28] x86/perf/intel_rapl: Make Knights Landings support functional Thomas Gleixner
2016-02-29 11:09   ` [tip:perf/core] perf/x86/intel/rapl: " tip-bot for Thomas Gleixner
2016-02-22 22:19 ` [patch V3 19/28] x86/perf/intel/rapl: Add proper error handling Thomas Gleixner
2016-02-29 11:10   ` [tip:perf/core] perf/x86/intel/rapl: " tip-bot for Thomas Gleixner
2016-02-22 22:19 ` [patch V3 21/28] x86/perf/intel/rapl: Calculate timing once Thomas Gleixner
2016-02-29 11:10   ` [tip:perf/core] perf/x86/intel/rapl: " tip-bot for Thomas Gleixner
2016-02-22 22:19 ` [patch V3 20/28] x86/perf/intel/rapl: Sanitize the quirk handling Thomas Gleixner
2016-02-29 11:10   ` [tip:perf/core] perf/x86/intel/rapl: " tip-bot for Thomas Gleixner
2016-03-04 17:49     ` Borislav Petkov
2016-03-08 16:04       ` Ingo Molnar
2016-03-08 16:40         ` [PATCH] perf/x86/intel/rapl: Simplify quirk handling even more Borislav Petkov
2016-03-08 17:36           ` [tip:perf/core] " tip-bot for Borislav Petkov
2016-02-22 22:19 ` [patch V3 22/28] x86/perf/intel/rapl: Cleanup the printk output Thomas Gleixner
2016-02-29 11:11   ` [tip:perf/core] perf/x86/intel/rapl: Clean up " tip-bot for Thomas Gleixner
2016-02-22 22:19 ` [patch V3 23/28] x86/perf/intel/rapl: Refactor code some more Thomas Gleixner
2016-02-29 11:11   ` [tip:perf/core] perf/x86/intel/rapl: Refactor the " tip-bot for Thomas Gleixner
2016-02-22 22:19 ` [patch V3 25/28] x86/perf/intel/rapl: Utilize event->pmu_private Thomas Gleixner
2016-02-29 11:12   ` [tip:perf/core] perf/x86/intel/rapl: " tip-bot for Thomas Gleixner
2016-02-22 22:19 ` [patch V3 24/28] x86/perf/intel/rapl: Make pmu lock raw Thomas Gleixner
2016-02-29 11:12   ` [tip:perf/core] perf/x86/intel/rapl: Make PMU " tip-bot for Thomas Gleixner
2016-02-22 22:19 ` [patch V3 26/28] x86/perf/intel/rapl: Convert it to a per package facility Thomas Gleixner
2016-02-29 11:12   ` [tip:perf/core] perf/x86/intel/rapl: " tip-bot for Thomas Gleixner
2016-02-22 22:19 ` [patch V3 27/28] perf: Export perf_event_sysfs_show Thomas Gleixner
2016-02-29 11:13   ` [tip:perf/core] perf: Export perf_event_sysfs_show() tip-bot for Thomas Gleixner
2016-02-22 22:19 ` [patch V3 28/28] x86/perf/intel/rapl: Make it modular Thomas Gleixner

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.