linux-kernel.vger.kernel.org archive mirror
* [RFC PATCH] perf/x86/intel/rapl: avoid access unallocate memory
From: Sebastian Andrzej Siewior @ 2016-11-02 12:25 UTC
  To: x86; +Cc: linux-kernel, Charles (Chas) Williams, M. Vefa Bicakci

After the hotplug rework, Charles Williams reported that his
VMware-virtualized system no longer boots and crashes in rapl_cpu_online().
As it turns out, topology_max_packages() reports four while
topology_logical_package_id() returns 65535 for CPUs two and three. That
means cpu_to_rapl_pmu() for those CPUs accesses unallocated memory beyond
the end of rapl_pmus->pmus[].
"M. Vefa Bicakci" reported the same problem on Xen.
This patch makes init_rapl_pmus() error out in such an invalid situation.

Reported-by: "Charles (Chas) Williams" <ciwillia@brocade.com>
Tested-by: "M. Vefa Bicakci" <m.v.b@runbox.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
I am not sure if this is a race with the new hotplug code or something that was
always there. Both M. Vefa Bicakci and Charles say that the box sometimes boots
fine without the patch. smp_store_boot_cpu_info() should have run before the
notifier and thus should have set the info properly. However, I got the
following boot log from Charles with this patch:

[    0.017110] smpboot: APIC(0) Converting physical 0 to logical package 0
[    0.017111] smpboot: APIC(1) Converting physical 1 to logical package 1
[    0.017113] smpboot: Max logical packages: 2
…
[    1.995494] RAPL PMU: rapl pmu error: max package: 2 but CPU1 belongs to 65535
[    1.995647] rapl pmu error: max package: 2 but CPU1 belongs to 65535
 
So it seems that the information got overwritten. I am not sure how to proceed
here. That memory corruption should be found and fixed, and a boot crash might
motivate one to do so… I can't reproduce this on bare metal.

Thread starts at
  d40f8e3c-b332-c331-38b9-11eb4f4aaaa7@brocade.com
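
To illustrate the failure mode outside the kernel, here is a minimal userspace
sketch of the check the patch adds. It is not the kernel code: the names
rapl_pmus_demo, validate_package_ids() and alloc_pmus() are hypothetical, the
per-CPU package IDs are passed in as a plain array instead of coming from
topology_logical_package_id(), and calloc() stands in for kzalloc(). The point
is only that a package ID of 65535 cannot index an array sized for maxpkg
entries, so the allocation is refused up front.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct rapl_pmu; the contents do not matter here. */
struct rapl_pmu_demo {
	int dummy;
};

/* Hypothetical stand-in for struct rapl_pmus: one pointer slot per package. */
struct rapl_pmus_demo {
	unsigned int maxpkg;
	struct rapl_pmu_demo *pmus[];	/* flexible array with maxpkg entries */
};

/*
 * Return 0 if every CPU's package ID is a valid index into pmus[],
 * -1 as soon as one CPU reports an ID at or above maxpkg.
 */
static int validate_package_ids(const unsigned short *pkg_of_cpu,
				unsigned int ncpus, unsigned int maxpkg)
{
	unsigned int cpu;

	for (cpu = 0; cpu < ncpus; cpu++) {
		if (pkg_of_cpu[cpu] >= maxpkg) {
			fprintf(stderr,
				"max package: %u but CPU%u belongs to %u\n",
				maxpkg, cpu, (unsigned int)pkg_of_cpu[cpu]);
			return -1;
		}
	}
	return 0;
}

/* Allocate the container only when all package IDs fit; NULL otherwise. */
static struct rapl_pmus_demo *alloc_pmus(const unsigned short *pkg_of_cpu,
					 unsigned int ncpus,
					 unsigned int maxpkg)
{
	struct rapl_pmus_demo *p;
	size_t size;

	if (validate_package_ids(pkg_of_cpu, ncpus, maxpkg))
		return NULL;

	/* Same sizing idea as in init_rapl_pmus(): header plus maxpkg slots. */
	size = sizeof(*p) + maxpkg * sizeof(struct rapl_pmu_demo *);
	p = calloc(1, size);
	if (p)
		p->maxpkg = maxpkg;
	return p;
}
```

With the values from the boot log above (two packages, CPU1 reporting 65535),
alloc_pmus() returns NULL instead of handing out an array that a later
pmus[65535] lookup would overrun.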

 arch/x86/events/intel/rapl.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/arch/x86/events/intel/rapl.c b/arch/x86/events/intel/rapl.c
index 0a535cea8ff3..f5d85f2853d7 100644
--- a/arch/x86/events/intel/rapl.c
+++ b/arch/x86/events/intel/rapl.c
@@ -682,6 +682,15 @@ static int __init init_rapl_pmus(void)
 {
 	int maxpkg = topology_max_packages();
 	size_t size;
+	unsigned int cpu;
+
+	for_each_possible_cpu(cpu) {
+		if (topology_logical_package_id(cpu) >= maxpkg) {
+			pr_err("rapl pmu error: max package: %u but CPU%d belongs to %u\n",
+			       maxpkg, cpu, topology_logical_package_id(cpu));
+			return -EINVAL;
+		}
+	}
 
 	size = sizeof(*rapl_pmus) + maxpkg * sizeof(struct rapl_pmu *);
 	rapl_pmus = kzalloc(size, GFP_KERNEL);
-- 
2.10.2

Thread overview: 43+ messages
2016-11-02 12:25 [RFC PATCH] perf/x86/intel/rapl: avoid access unallocate memory Sebastian Andrzej Siewior
2016-11-02 22:47 ` Charles (Chas) Williams
2016-11-03 17:47   ` Sebastian Andrzej Siewior
2016-11-04 12:20     ` Charles (Chas) Williams
2016-11-04 18:03       ` Sebastian Andrzej Siewior
2016-11-04 20:42         ` Charles (Chas) Williams
2016-11-04 20:57           ` Sebastian Andrzej Siewior
2016-11-07 16:19   ` Thomas Gleixner
2016-11-07 16:59     ` Charles (Chas) Williams
2016-11-07 20:20       ` Thomas Gleixner
2016-11-08 14:20         ` Charles (Chas) Williams
2016-11-08 14:31           ` Thomas Gleixner
2016-11-08 14:57             ` Charles (Chas) Williams
2016-11-08 16:22               ` Thomas Gleixner
2016-11-09 15:35                 ` [PATCH] x86/cpuid: Deal with broken firmware once more Thomas Gleixner
2016-11-09 15:37                   ` Thomas Gleixner
2016-11-09 16:03                   ` Peter Zijlstra
2016-11-09 16:34                     ` Charles (Chas) Williams
2016-11-09 18:37                       ` Thomas Gleixner
2016-11-09 18:15                   ` Charles (Chas) Williams
2016-11-09 20:27                   ` [tip:x86/urgent] x86/cpu: Deal with broken firmware (VMWare/XEN) tip-bot for Thomas Gleixner
2016-11-11  5:49                     ` Alok Kataria
2016-11-10  3:57                   ` [PATCH] x86/cpuid: Deal with broken firmware once more M. Vefa Bicakci
2016-11-10 10:50                     ` Charles (Chas) Williams
2016-11-10 11:14                       ` Thomas Gleixner
2016-11-12 22:05                       ` M. Vefa Bicakci
2016-11-10 11:13                     ` Thomas Gleixner
2016-11-10 11:39                       ` Peter Zijlstra
2016-11-10 14:02                       ` Boris Ostrovsky
2016-11-10 15:05                         ` Charles (Chas) Williams
2016-11-10 15:31                           ` Boris Ostrovsky
2016-11-10 15:54                             ` Sebastian Andrzej Siewior
2016-11-10 17:15                             ` Thomas Gleixner
2016-11-12 22:05                             ` M. Vefa Bicakci
2016-11-13 18:04                               ` Boris Ostrovsky
2016-11-13 23:42                                 ` M. Vefa Bicakci
2016-11-15  1:21                                   ` Boris Ostrovsky
2016-11-18 11:16                                     ` Thomas Gleixner
2016-11-18 14:22                                       ` Boris Ostrovsky
2016-11-10 15:12                         ` Thomas Gleixner
2016-11-10 15:38                           ` Boris Ostrovsky
2016-11-10 17:13                             ` Thomas Gleixner
2016-11-10 18:01                               ` Boris Ostrovsky
