From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1754883AbcKJD5p (ORCPT ); Wed, 9 Nov 2016 22:57:45 -0500
Received: from aibo.runbox.com ([91.220.196.211]:48588 "EHLO aibo.runbox.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1752831AbcKJD5n (ORCPT ); Wed, 9 Nov 2016 22:57:43 -0500
Subject: Re: [PATCH] x86/cpuid: Deal with broken firmware once more
To: Thomas Gleixner, Sebastian Andrzej Siewior
References: <20161102122557.qs4rl6mb7n7l7j7p@linutronix.de>
 <24e69019-60d0-29e7-e31f-c6f00f9ed98a@brocade.com>
 <58e229e2-91f4-a97f-1b9f-089f48ef994a@brocade.com>
 <86609338-2b45-ed7e-fb07-99421e43a2f1@brocade.com>
Cc: "Charles (Chas) Williams", "x86@kernel.org", LKML,
 Peter Zijlstra, Borislav Petkov
From: "M. Vefa Bicakci"
Message-ID: <49fe8cc5-0f0f-6cac-7a5c-803e81f5667d@runbox.com>
Date: Thu, 10 Nov 2016 06:57:00 +0300
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.4.0
MIME-Version: 1.0
In-Reply-To:
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 11/09/2016 06:35 PM, Thomas Gleixner wrote:
> Both ACPI and MP specifications require that the APIC id in the respective
> tables must be the same as the APIC id in CPUID.
>
> The kernel retrieves the physical package id from the APIC id during the
> ACPI/MP table scan and builds the physical to logical package map.
>
> There exist Virtualbox and Xen implementations which violate the spec. As a
> result the physical to logical package map, which relies on the ACPI/MP
> tables does not work on those systems, because the CPUID initialized
> physical package id does not match the firmware id. This causes system
> crashes and malfunction due to invalid package mappings.
>
> The only way to cure this is to sanitize the physical package id after the
> CPUID enumeration and yell when the APIC ids are different. If the physical
> package IDs differ use the package information from the ACPI/MP tables so
> the existing logical package map just works.
>
> Reported-by: "Charles (Chas) Williams" ,
> Reported-by: M. Vefa Bicakci
> Signed-off-by: Thomas Gleixner

Hello Thomas and Sebastian,

Sorry for the delay in reporting what I have found out and for the delay
in testing this patch, which has been due to my health.

I have found that your patch unfortunately does not improve the situation
for me. Here is an excerpt obtained from the dmesg of a kernel compiled
with this patch *as well as* Sebastian's patch:

=== 8< ===
[ 0.002561] CPU: Physical Processor ID: 0
[ 0.002566] CPU: Processor Core ID: 0
[ 0.002572] [Firmware Bug]: CPU0: APIC id mismatch. Firmware: ffff CPUID: 2
[ 0.002577] [Firmware Bug]: CPU0: Using firmware package id 4095 instead of 0
[ 0.002586] Last level iTLB entries: 4KB 512, 2MB 8, 4MB 8
[ 0.002591] Last level dTLB entries: 4KB 512, 2MB 32, 4MB 32, 1GB 0
[ 0.033319] ftrace: allocating 28753 entries in 113 pages
[ 0.040121] cpu 0 spinlock event irq 1
[ 0.040145] smpboot: Max logical packages: 1
[ 0.040155] VPMU disabled by hypervisor.
[ 0.040181] Performance Events: unsupported p6 CPU model 42 no PMU driver, software events only.
[ 0.047050] NMI watchdog: disabled (cpu0): hardware events not enabled
[ 0.047065] NMI watchdog: Shutting down hard lockup detector on all cpus
[ 0.052015] installing Xen timer for CPU 1
[ 0.052074] SMP alternatives: switching to SMP code
[ 0.002000] Disabled fast string operations
[ 0.002000] [Firmware Bug]: CPU1: APIC id mismatch. Firmware: ffff CPUID: 2
[ 0.002000] [Firmware Bug]: CPU1: Using firmware package id 4095 instead of 0
[ 0.002000] smpboot: APIC(ffff) Converting physical 4095 to logical package 0
[ 0.078061] cpu 1 spinlock event irq 13
...
[ 0.216404] Freeing initrd memory: 4340K (ffff880001fa7000 - ffff8800023e4000)
[ 0.216487] RAPL PMU: rapl pmu error: max package: 1 but CPU0 belongs to 65535
[ 0.217572] futex hash table entries: 512 (order: 3, 32768 bytes)
[ 0.218293] Initialise system trusted keyrings
...
[ 0.216404] Freeing initrd memory: 4340K (ffff880001fa7000 - ffff8800023e4000)
[ 0.216487] RAPL PMU: rapl pmu error: max package: 1 but CPU0 belongs to 65535
[ 0.217572] futex hash table entries: 512 (order: 3, 32768 bytes)
...
[ 2.974474] intel_rapl: Found RAPL domain package
[ 2.974489] intel_rapl: Found RAPL domain core
[ 2.974498] intel_rapl: Found RAPL domain uncore
[ 2.974518] intel_rapl: RAPL package 4095 domain package locked by BIOS
=== >8 ===

As you can see, your patch unfortunately does not correct the issue with
the virtual CPU package identifiers in Xen-based virtual machines using
para-virtualization.

In summary, the root cause of the issue for me appears to be that the
boot-up code in the init_apic_mappings function switches the APIC 'ops'
structure from Xen's 'ops' structure to the no-op ops structure. Due to
this, smp_init_package_map uses the no-op APIC ops structure's
cpu_present_to_apicid function, even though it should use the
corresponding method from Xen's APIC ops structure (i.e.,
xen_cpu_present_to_apicid).

Here is a dmesg excerpt with a kernel patched with Sebastian's patch (and
some debugging code), exhibiting the issue I have just explained:
(Note the 'switched to apic NOOP' line.)

=== 8< ===
(early) [ 0.000000] SFI: Simple Firmware Interface v0.81 http://simplefirmware.org
(early) [ 0.000000] smpboot: Allowing 2 CPUs, 0 hotplug CPUs
(early) [ 0.000000] No local APIC present
(early) [ 0.000000] APIC: disable apic facility
(early) [ 0.000000] APIC: switched to apic NOOP
(early) [ 0.000000] e820: [mem 0xfa000000-0xffffffff] available for PCI devices
(early) [ 0.000000] Booting paravirtualized kernel on Xen
...
[ 0.034082] mvb: kernel_init_freeable:1007: About to call smp_prepare_cpus...
[ 0.034123] cpu 0 spinlock event irq 1
[ 0.034138] smpboot: mvb: smp_init_package_map:372: max_physical_pkg_id, after set: 4096
[ 0.034146] mvb: __default_cpu_present_to_apicid:612: Returning 65535! mps_cpu: 0, nr_cpu_ids: 2, cpu_present(mps_cpu): 1
[ 0.034155] smpboot: mvb: smp_init_package_map:379: apicid: 65535, apic_id_valid(apicid):0
[ 0.034162] smpboot: Max logical packages: 1
[ 0.034169] VPMU disabled by hypervisor.
[ 0.034187] Performance Events: unsupported p6 CPU model 42 no PMU driver, software events only.
[ 0.041131] NMI watchdog: disabled (cpu0): hardware events not enabled
[ 0.041142] NMI watchdog: Shutting down hard lockup detector on all cpus
[ 0.046024] installing Xen timer for CPU 1
[ 0.046121] SMP alternatives: switching to SMP code
[ 0.002000] Disabled fast string operations
[ 0.002000] mvb: detect_ht: CPU has hyper-threading capability
[ 0.002000] mvb: CPU: Physical Processor ID: 0
[ 0.002000] mvb: CPU: Processor Core ID: 0
[ 0.002000] mvb: identify_cpu:1112: c: ffff880013b0a040, c->logical_proc_id: 65535
[ 0.002000] mvb: __default_cpu_present_to_apicid:612: Returning 65535! mps_cpu: 1, nr_cpu_ids: 2, cpu_present(mps_cpu): 1
[ 0.002000] smpboot: mvb: topology_update_package_map:270: cpu: 1, pkg: 4095
[ 0.002000] smpboot: APIC(ffff) Converting physical 4095 to logical package 0
[ 0.002000] smpboot: mvb: topology_update_package_map:305: cpu: 1, cpu_data(cpu).logical_proc_id: 0
...
[ 0.266540] RAPL PMU: mvb: init_rapl_pmus:686: rapl pmu: max package: 1
[ 0.266547] RAPL PMU: mvb: rapl pmu: CPU0 (ffff880013a0a040) belongs to package ID 65535.
[ 0.266559] RAPL PMU: mvb: rapl pmu: Package ID >= max package for CPU0 (ffff880013a0a040). This is an error.
[ 0.266569] RAPL PMU: mvb: rapl pmu: CPU1 (ffff880013b0a040) belongs to package ID 0.
=== >8 ===

Through some debugging, last week I came up with the patch at the end of
this e-mail, which does correct the issue for me.
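
To make the failure mode easier to follow, here is a minimal user-space
sketch of what the logs appear to show. It is only a model, not kernel
code: the struct layout, the model_* function names and the honour_flag
parameter are made up for illustration, and PKG_SHIFT is an assumption
inferred from the log (APIC id 0xffff turning into package 4095, i.e.
0xffff >> 4). The return values of the two lookup functions are taken
from the debugging output (65535 from the default lookup, 0 from
xen_cpu_present_to_apicid).

```c
/* Simplified user-space model; none of this is the real kernel code. */

#define BAD_APICID 0xFFFFu /* value logged as "Returning 65535!" */
#define PKG_SHIFT  4       /* assumption: 0xffff >> 4 == 4095, as in the log */

struct apic_ops {
	const char *name;
	unsigned int (*cpu_present_to_apicid)(int cpu);
};

/* Stand-in for the lookup used once the no-op ops are active. */
static unsigned int noop_cpu_present_to_apicid(int cpu)
{
	(void)cpu;
	return BAD_APICID;
}

/* Stand-in for xen_cpu_present_to_apicid; the log shows "ret: 0". */
static unsigned int xen_cpu_present_to_apicid(int cpu)
{
	(void)cpu;
	return 0;
}

static const struct apic_ops noop_ops = { "noop", noop_cpu_present_to_apicid };
static const struct apic_ops xen_ops  = { "xen",  xen_cpu_present_to_apicid };

static const struct apic_ops *apic = &noop_ops;
static int apic_set_by_hypervisor;

/* Models xen_init_apic() installing Xen's ops and setting the flag. */
static void model_xen_init_apic(void)
{
	apic = &xen_ops;
	apic_set_by_hypervisor = 1;
}

/*
 * Models init_apic_mappings() on a PV guest where no local APIC is found.
 * honour_flag == 0 is the current behaviour: the ops are always replaced.
 * honour_flag == 1 is the guarded behaviour: ops set by the hypervisor
 * are left alone.
 */
static void model_init_apic_mappings(int honour_flag)
{
	if (honour_flag && apic_set_by_hypervisor)
		return;
	apic = &noop_ops; /* stands in for apic_disable() */
}

/* Models the package lookup: APIC id shifted down to a package id. */
static unsigned int model_package_of(int cpu)
{
	return apic->cpu_present_to_apicid(cpu) >> PKG_SHIFT;
}

/* Runs the modelled boot sequence and returns the package id of CPU 0. */
static unsigned int model_boot(int honour_flag)
{
	apic = &noop_ops;
	apic_set_by_hypervisor = 0;
	model_xen_init_apic();
	model_init_apic_mappings(honour_flag);
	return model_package_of(0);
}
```

In this model, model_boot(0) yields 4095, the bogus package id seen in
the logs, while model_boot(1) yields 0 because Xen's lookup function is
still in place when the package map is built.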
Once again, I am sorry for reporting this too late.

With a kernel built with the patch below, Sebastian's patch and some
debugging code, I obtain the following dmesg output, which appears to be
correct and does not emit RAPL-related errors during boot-up:

=== 8< ===
[ 0.032083] mvb: kernel_init_freeable:1007: About to call smp_prepare_cpus...
[ 0.032106] cpu 0 spinlock event irq 1
[ 0.032119] smpboot: mvb: smp_init_package_map:372: max_physical_pkg_id, after set: 4096
[ 0.032127] mvb: xen_cpu_present_to_apicid:151: cpu: 0, ret: 0
[ 0.032132] smpboot: mvb: topology_update_package_map:270: cpu: 0, pkg: 0
[ 0.032137] smpboot: APIC(0) Converting physical 0 to logical package 0
[ 0.032142] smpboot: mvb: topology_update_package_map:305: cpu: 0, cpu_data(cpu).logical_proc_id: 0
[ 0.032149] smpboot: mvb: smp_init_package_map:385: topology_update_package_map successful.
[ 0.032155] smpboot: Max logical packages: 1
[ 0.032161] VPMU disabled by hypervisor.
[ 0.032177] Performance Events: unsupported p6 CPU model 42 no PMU driver, software events only.
[ 0.039061] NMI watchdog: disabled (cpu0): hardware events not enabled
[ 0.039102] NMI watchdog: Shutting down hard lockup detector on all cpus
[ 0.045048] installing Xen timer for CPU 1
[ 0.045145] SMP alternatives: switching to SMP code
[ 0.002000] Disabled fast string operations
[ 0.002000] mvb: detect_ht: CPU has hyper-threading capability
[ 0.002000] mvb: CPU: Physical Processor ID: 0
[ 0.002000] mvb: CPU: Processor Core ID: 1
[ 0.002000] mvb: identify_cpu:1112: c: ffff88001790a040, c->logical_proc_id: 0
[ 0.002000] mvb: xen_cpu_present_to_apicid:151: cpu: 1, ret: 0
[ 0.002000] smpboot: mvb: topology_update_package_map:270: cpu: 1, pkg: 0
[ 0.002000] smpboot: mvb: topology_update_package_map:305: cpu: 1, cpu_data(cpu).logical_proc_id: 0
[ 0.067149] cpu 1 spinlock event irq 13
...
[ 0.208395] RAPL PMU: mvb: init_rapl_pmus:686: rapl pmu: max package: 1
[ 0.208400] RAPL PMU: mvb: rapl pmu: CPU0 (ffff88001780a040) belongs to package ID 0.
[ 0.208409] RAPL PMU: mvb: rapl pmu: CPU1 (ffff88001790a040) belongs to package ID 0.
[ 0.208491] RAPL PMU: API unit is 2^-32 Joules, 3 fixed counters, 163840 ms ovfl timer
[ 0.208507] RAPL PMU: hw unit of domain pp0-core 2^-16 Joules
[ 0.208512] RAPL PMU: hw unit of domain package 2^-16 Joules
[ 0.208517] RAPL PMU: hw unit of domain pp1-gpu 2^-16 Joules
[ 0.209083] futex hash table entries: 512 (order: 3, 32768 bytes)
=== >8 ===

My patch may not be a very elegant correction for the issue at hand,
unfortunately (and it probably does not comply with the kernel patch
submission guidelines). In addition, I think Sebastian's patch should be
included in the mainline kernel, so that potential bugs do not bring down
the kernel with a RAPL-related oops at boot-up time with virtualization.

Please note that I will need to be offline for most of the day today (my
current timezone is UTC+03:00), and as a result my responses to your
replies will most likely be a bit late.

Thank you,

Vefa

From b7097820285ab6a8588879969d74c56d890d4fd4 Mon Sep 17 00:00:00 2001
From: "M. Vefa Bicakci"
Date: Fri, 4 Nov 2016 10:01:19 +0300
Subject: [PATCH] Do not reset apic to apic_noop with Xen

---
 arch/x86/include/asm/apic.h | 3 +++
 arch/x86/kernel/apic/apic.c | 7 +++++--
 arch/x86/xen/apic.c         | 4 +++-
 3 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h
index b3d4c042e610..8c37580b7eb7 100644
--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -359,6 +359,9 @@ struct apic {
  */
 extern struct apic *apic;
 
+/* Indicates whether a hypervisor has already set 'apic'. */
+extern int apic_set_by_hypervisor;
+
 /*
  * APIC drivers are probed based on how they are listed in the .apicdrivers
  * section. So the order is important and enforced by the ordering
diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
index 076c315cdf18..a3a1d4570acf 100644
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -172,6 +172,8 @@ unsigned long mp_lapic_addr;
 int disable_apic;
 /* Disable local APIC timer from the kernel commandline or via dmi quirk */
 static int disable_apic_timer __initdata;
+/* Indicates whether a hypervisor has already set 'apic'. */
+int apic_set_by_hypervisor;
 /* Local APIC timer works in C2 */
 int local_apic_timer_c2_ok;
 EXPORT_SYMBOL_GPL(local_apic_timer_c2_ok);
@@ -1788,8 +1790,9 @@ void __init init_apic_mappings(void)
 		return;
 	}
 
-	/* If no local APIC can be found return early */
-	if (!smp_found_config && detect_init_APIC()) {
+	if (apic_set_by_hypervisor) {
+		/* Hypervisor has already taken care of the APIC. */
+	} else if (!smp_found_config && detect_init_APIC()) {
 		/* lets NOP'ify apic operations */
 		pr_info("APIC: disable apic facility\n");
 		apic_disable();
diff --git a/arch/x86/xen/apic.c b/arch/x86/xen/apic.c
index 5ac792a1d419..b79a003c13a5 100644
--- a/arch/x86/xen/apic.c
+++ b/arch/x86/xen/apic.c
@@ -229,8 +229,10 @@ void __init xen_init_apic(void)
 	x86_io_apic_ops.read = xen_io_apic_read;
 	/* On PV guests the APIC CPUID bit is disabled so none of the
 	 * routines end up executing. */
-	if (!xen_initial_domain())
+	if (!xen_initial_domain()) {
 		apic = &xen_pv_apic;
+		apic_set_by_hypervisor = 1;
+	}
 
 	x86_platform.apic_post_init = xen_apic_check;
 }
-- 
2.5.5