From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1751561AbdFGMVQ (ORCPT ); Wed, 7 Jun 2017 08:21:16 -0400
Received: from imap.hinterhof.net ([80.69.43.37]:46193 "EHLO imap.hinterhof.net"
        rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
        id S1751413AbdFGMUA (ORCPT ); Wed, 7 Jun 2017 08:20:00 -0400
Date: Wed, 7 Jun 2017 14:19:55 +0200
From: Max Vozeler
To: Boris Ostrovsky
Cc: Thomas Gleixner, LKML, x86@kernel.org, Peter Zijlstra, Borislav Petkov,
        "Charles (Chas) Williams", "M. Vefa Bicakci", Alok Kataria, xen-devel,
        Juergen Groß, Thomas Weißschuh
Subject: Re: [PATCH v2] x86/smpboot: Make logical package management more robust
Message-ID: <20170607121955.GA26412@chaos.hinterhof.net>
References: <8aa33de4-db18-759b-d2cb-0e25d5ab9d88@oracle.com>
        <730d61ff-ff1e-df80-3446-7fceb25a6d63@oracle.com>
        <60e7a807-27fb-f666-270a-9512804deae8@oracle.com>
        <20170606133958.GA23069@chaos.hinterhof.net>
        <20c3f46a-3514-99cb-30c3-cec6e42211cf@oracle.com>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="sm4nu43k4a2Rpi4c"
Content-Disposition: inline
In-Reply-To: <20c3f46a-3514-99cb-30c3-cec6e42211cf@oracle.com>
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

--sm4nu43k4a2Rpi4c
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Tue, Jun 06, 2017 at 09:48:37PM -0400, Boris Ostrovsky wrote:
> On 06/06/2017 09:39 AM, Max Vozeler wrote:
> >there is a problem booting recent kernels on some Xen domUs hosted by
> >provider JiffyBox.
> >
> >The kernel seems to crash just after logging
> >[ 0.038700] SMP alternatives: switching to SMP code
>
> Do you have the crash splat? Stack trace and such.
>
> In fact, full boot log might be useful.

Unfortunately, we don't have much more information.
Just after "switching to SMP code" the console connection is lost and we
get a notification that the VM has crashed. I'm attaching the boot log up
to that point, just in case.

I have asked the hosting provider if they can provide XEN hypervisor logs.

> >We started seeing this with 4.9.2 and bisecting the 4.9 stable kernels
> >determined that this commit introduced the problem. Reverting it from 4.9.2
> >makes the kernel boot again.
> >
> >Older kernels (starting from 3.16 up to and including 4.9.1) were running
> >fine in this setup. But recent mainline (tested 4.12-rc3) and 4.9.x both
> >fail to boot there.
> >
> >Unfortunately we have no detailed information about the hypervisor or
> >setup and the provider is not very forthcoming with details. I'm attaching
> >dmesg of a successful boot (4.9.2 with this commit reverted).
> >
> >It shows a fairly old XEN version:
> >
> >[ 0.000000] Xen version: 3.1.2-416.el5 (preserve-AD)
>
> This is a 10 year old hypervisor so it's not especially surprising that
> newer kernels don't work. (If anything, I am surprised that you actually
> booted 4.9 at all).
>
> There have been a bunch of problems in this area (topology) on PV guests.

Thanks and kind regards,
Max

--sm4nu43k4a2Rpi4c
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="dmesg-crash.txt"

Filesystem type is ext2fs, using whole disk
kernel /boot/vmlinuz-4.9.2-bisect-00033-g2b95c93 root=/dev/xvda ro cgroup_enable=memory apparmor=1 security=apparmor
initrd /boot/initrd.img-4.9.2-bisect-00033-g2b95c93
============= Init TPM Front ================
Tpmfront:Error Unable to read device/vtpm/0/backend-id during tpmfront initialization! error = ENOENT
Tpmfront:Info Shutting down tpmfront
close blk: backend=/local/domain/0/backend/vbd/226/51712 node=device/vbd/51712
close blk: backend=/local/domain/0/backend/vbd/226/51728 node=device/vbd/51728
[ 0.000000] Linux version 4.9.2-bisect-00033-g2b95c93 (aaa@example.com) (gcc version 4.9.2 (Debian 4.9.2-10) ) #5 SMP Wed May 31 10:54:25 UTC 2017
[ 0.000000] Command line: root=/dev/xvda ro cgroup_enable=memory apparmor=1 security=apparmor
[ 0.000000] x86/fpu: Legacy x87 FPU detected.
[ 0.000000] x86/fpu: Using 'eager' FPU context switches.
[ 0.000000] ACPI in unprivileged domain disabled
[ 0.000000] Released 0 page(s)
[ 0.000000] e820: BIOS-provided physical RAM map:
[ 0.000000] Xen: [mem 0x0000000000000000-0x000000000009ffff] usable
[ 0.000000] Xen: [mem 0x00000000000a0000-0x00000000000fffff] reserved
[ 0.000000] Xen: [mem 0x0000000000100000-0x00000000807fffff] usable
[ 0.000000] NX (Execute Disable) protection: active
[ 0.000000] DMI not present or invalid.
[ 0.000000] Hypervisor detected: Xen
[ 0.000000] e820: last_pfn = 0x80800 max_arch_pfn = 0x400000000
[ 0.000000] MTRR: Disabled
[ 0.000000] x86/PAT: MTRRs disabled, skipping PAT initialization too.
[ 0.000000] x86/PAT: Configuration [0-7]: WB WT UC- UC WC WP UC UC
[ 0.000000] RAMDISK: [mem 0x01fe1000-0x0264afff]
[ 0.000000] NUMA turned off
[ 0.000000] Faking a node at [mem 0x0000000000000000-0x00000000807fffff]
[ 0.000000] NODE_DATA(0) allocated [mem 0x7fc17000-0x7fc1bfff]
[ 0.000000] Zone ranges:
[ 0.000000] DMA [mem 0x0000000000001000-0x0000000000ffffff]
[ 0.000000] DMA32 [mem 0x0000000001000000-0x00000000807fffff]
[ 0.000000] Normal empty
[ 0.000000] Device empty
[ 0.000000] Movable zone start for each node
[ 0.000000] Early memory node ranges
[ 0.000000] node 0: [mem 0x0000000000001000-0x000000000009ffff]
[ 0.000000] node 0: [mem 0x0000000000100000-0x00000000807fffff]
[ 0.000000] Initmem setup node 0 [mem 0x0000000000001000-0x00000000807fffff]
[ 0.000000] p2m virtual area at ffffc90000000000, size is 40000000
[ 0.000000] Remapped 0 page(s)
[ 0.000000] SFI: Simple Firmware Interface v0.81 http://simplefirmware.org
[ 0.000000] smpboot: Allowing 3 CPUs, 0 hotplug CPUs
[ 0.000000] PM: Registered nosave memory: [mem 0x00000000-0x00000fff]
[ 0.000000] PM: Registered nosave memory: [mem 0x000a0000-0x000fffff]
[ 0.000000] e820: [mem 0x80800000-0xffffffff] available for PCI devices
[ 0.000000] Booting paravirtualized kernel on Xen
[ 0.000000] Xen version: 3.1.2-416.el5 (preserve-AD)
[ 0.000000] clocksource: refined-jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645519600211568 ns
[ 0.000000] setup_percpu: NR_CPUS:512 nr_cpumask_bits:512 nr_cpu_ids:3 nr_node_ids:1
[ 0.000000] percpu: Embedded 35 pages/cpu @ffff88007d000000 s105240 r8192 d29928 u524288
[ 0.000000] PV qspinlock hash table entries: 256 (order: 0, 4096 bytes)
[ 0.000000] Built 1 zonelists in Node order, mobility grouping on. Total pages: 517994
[ 0.000000] Policy zone: DMA32
[ 0.000000] Kernel command line: root=/dev/xvda ro cgroup_enable=memory apparmor=1 security=apparmor
[ 0.000000] PID hash table entries: 4096 (order: 3, 32768 bytes)
[ 0.000000] Memory: 2030340K/2104956K available (7412K kernel code, 1426K rwdata, 3120K rodata, 1480K init, 848K bss, 74616K reserved, 0K cma-reserved)
[ 0.000000] Hierarchical RCU implementation.
[ 0.000000] Build-time adjustment of leaf fanout to 64.
[ 0.000000] RCU restricting CPUs from NR_CPUS=512 to nr_cpu_ids=3.
[ 0.000000] RCU: Adjusting geometry for rcu_fanout_leaf=64, nr_cpu_ids=3
[ 0.000000] Using NULL legacy PIC
[ 0.000000] NR_IRQS:33024 nr_irqs:64 0
[ 0.000000] xen:events: Using 2-level ABI
[ 0.000000] Console: colour dummy device 80x25
[ 0.000000] console [tty0] enabled
[ 0.000000] console [hvc0] enabled
[ 0.000000] clocksource: xen: mask: 0xffffffffffffffff max_cycles: 0x1cd42e4dffb, max_idle_ns: 881590591483 ns
[ 0.000000] installing Xen timer for CPU 0
[ 0.000000] tsc: Fast TSC calibration using PIT
[ 0.000000] tsc: Detected 2133.431 MHz processor
[ 0.008000] Calibrating delay loop (skipped), value calculated using timer frequency.. 4266.81 BogoMIPS (lpj=8533632)
[ 0.008000] pid_max: default: 32768 minimum: 301
[ 0.008000] Security Framework initialized
[ 0.008000] Yama: becoming mindful.
[ 0.008000] AppArmor: AppArmor initialized
[ 0.008000] Dentry cache hash table entries: 262144 (order: 9, 2097152 bytes)
[ 0.008000] Inode-cache hash table entries: 131072 (order: 8, 1048576 bytes)
[ 0.008000] Mount-cache hash table entries: 4096 (order: 3, 32768 bytes)
[ 0.008000] Mountpoint-cache hash table entries: 4096 (order: 3, 32768 bytes)
[ 0.008000] Last level iTLB entries: 4KB 512, 2MB 7, 4MB 7
[ 0.008000] Last level dTLB entries: 4KB 512, 2MB 32, 4MB 32, 1GB 0
[ 0.037256] ftrace: allocating 28843 entries in 113 pages
[ 0.044095] cpu 0 spinlock event irq 1
[ 0.044108] smpboot: Max logical packages: 1
[ 0.044114] smpboot: CPU 0 Converting physical 33 to logical package 0
[ 0.044123] VPMU disabled by hypervisor.
[ 0.044144] Performance Events: unsupported p6 CPU model 44 no PMU driver, software events only.
[ 0.044911] NMI watchdog: disabled (cpu0): hardware events not enabled
[ 0.044922] NMI watchdog: Shutting down hard lockup detector on all cpus
[ 0.045096] installing Xen timer for CPU 1
[ 0.045135] SMP alternatives: switching to SMP code
Disconnected from console

--sm4nu43k4a2Rpi4c--