* xen crash with 4.17 kernel on Fedora
From: Michael Young @ 2018-07-01 16:43 UTC
To: xen-devel

I am seeing crash on boot and DomU (pv) on Fedora with the 4.17 kernel
(eg. kernel-4.17.2-200.fc28.x86_64 and kernel-4.17.3-200.fc28.x86_64) which
didn't occur with 4.16 kernel (eg. kernel-4.16.16-300.fc28.x86_64)

The backtrace for a Dom0 boot of xen-4.10.1-5.fc28.x86_64 running
kernel-4.17.2-200.fc28.x86_64 is

(XEN) d0v0 Unhandled general protection fault fault/trap [#13, ec=0000]
(XEN) domain_crash_sync called from entry.S: fault at ffff82d08035557c x86_64/entry.S#create_bounce_frame+0x135/0x159
(XEN) Domain 0 (vcpu#0) crashed on cpu#0:
(XEN) ----[ Xen-4.10.1 x86_64 debug=n Not tainted ]----
(XEN) CPU: 0
(XEN) RIP: e033:[<ffffffff81062330>]
(XEN) RFLAGS: 0000000000000246 EM: 1 CONTEXT: pv guest (d0v0)
(XEN) rax: 0000000000000246 rbx: 00000000ffffffff rcx: 0000000000000000
(XEN) rdx: 0000000000000000 rsi: 00000000ffffffff rdi: 0000000000000000
(XEN) rbp: 0000000000000000 rsp: ffffffff82203d90 r8: ffffffff820bb698
(XEN) r9: ffffffff82203e38 r10: 0000000000000000 r11: 0000000000000000
(XEN) r12: 0000000000000000 r13: ffffffff820bb698 r14: ffffffff82203e38
(XEN) r15: 0000000000000000 cr0: 0000000080050033 cr4: 00000000000006e0
(XEN) cr3: 000000001aacf000 cr2: 0000000000000000
(XEN) fsb: 0000000000000000 gsb: ffffffff82731000 gss: 0000000000000000
(XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: e02b cs: e033
(XEN) Guest stack trace from rsp=ffffffff82203d90:
(XEN)   0000000000000000 0000000000000000 0000000000000000 ffffffff81062330
(XEN)   000000010000e030 0000000000010046 ffffffff82203dd8 000000000000e02b
(XEN)   0000000000000246 ffffffff8110e019 0000000000000000 0000000000000246
(XEN)   0000000000000000 0000000000000000 ffffffff820a6cd8 ffffffff82203e88
(XEN)   ffffffff82739000 8000000000000061 0000000000000000 0000000000000000
(XEN)   ffffffff8110ecb6 0000000000000008 ffffffff82203e98 ffffffff82203e58
(XEN)   0000000000000000 0000000000000000 8000000000000161 0000000000000100
(XEN)   fffffffffffffeff 0000000000000000 0000000000000000 ffffffff82203ef0
(XEN)   ffffffff810ac990 0000000000000000 0000000000000000 0000000000000000
(XEN)   0000000000000000 0000000000000000 8000000000000161 0000000000000100
(XEN)   fffffffffffffeff 0000000000000000 0000000000000000 0000000002739000
(XEN)   0000000000000080 ffffffff8275db62 000000000001a739 0000000000000000
(XEN)   0000000000000000 0000000000000000 0000000000000000 ffffffff81037c80
(XEN)   007fffff8275efe7 ffffffff82739000 ffffffff81037f18 ffffffff8102aaf0
(XEN)   ffffffff8275dc8c 0000000000000000 0000000000000000 0000000000000000
(XEN)   0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN)   0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN)   0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN)   0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN)   0000000000000000 0000000000000000 0f00000060c0c748 ccccccccccccc305

where
addr2line -f -e vmlinux ffffffff81062330
gives
native_irq_disable
/usr/src/debug/kernel-4.17.fc28/linux-4.17.2-200.fc28.x86_64/./arch/x86/include/asm/irqflags.h:44

What is the problem or how might it be debugged?

	Michael Young

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

* Re: xen crash with 4.17 kernel on Fedora
From: Andrew Cooper @ 2018-07-01 17:41 UTC
To: Michael Young, xen-devel

On 01/07/18 17:43, Michael Young wrote:
> I am seeing crash on boot and DomU (pv) on Fedora with the 4.17 kernel
> (eg. kernel-4.17.2-200.fc28.x86_64 and kernel-4.17.3-200.fc28.x86_64)
> which didn't occur with 4.16 kernel (eg. kernel-4.16.16-300.fc28.x86_64)
>
> The backtrace for a Dom0 boot of xen-4.10.1-5.fc28.x86_64 running
> kernel-4.17.2-200.fc28.x86_64 is
>
> (XEN) d0v0 Unhandled general protection fault fault/trap [#13, ec=0000]
> (XEN) domain_crash_sync called from entry.S: fault at ffff82d08035557c
> x86_64/entry.S#create_bounce_frame+0x135/0x159
> (XEN) Domain 0 (vcpu#0) crashed on cpu#0:
> (XEN) ----[ Xen-4.10.1 x86_64 debug=n Not tainted ]----
> (XEN) CPU: 0
> (XEN) RIP: e033:[<ffffffff81062330>]
> (XEN) RFLAGS: 0000000000000246 EM: 1 CONTEXT: pv guest (d0v0)
> [...]
>
> where
> addr2line -f -e vmlinux ffffffff81062330
> gives
> native_irq_disable
> /usr/src/debug/kernel-4.17.fc28/linux-4.17.2-200.fc28.x86_64/./arch/x86/include/asm/irqflags.h:44
>
> What is the problem or how might it be debugged?

The guest is executing a native `cli` instruction, which is privileged
and which we don't allow (we could trap & emulate, but we can't provide
proper STI-shadow behaviour, and such a guest might also expect popf to
work, which it very much doesn't). In Linux, that codepath should be
using a pvop rather than a native op.

It is either a subsystem which should be skipped when virtualised, or a
poorly coded subsystem, or a buggy setup path.

Can you see about trying to boot the old kernel as dom0, and the new
kernel as a domU with pause on crash configured?
/usr/libexec/xen/bin/xenctx should be able to pull a backtrace out of
the crashed domain state if you pass the appropriate symbol table in.

~Andrew
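
As an illustration of the native op versus pvop distinction described in the
reply above, here is a minimal user-space sketch. Every name in it
(irq_ops_sim, active_irq_ops, the *_sim functions) is hypothetical and is not
the kernel's definition; the privileged `cli` is modelled by setting a flag so
that the code runs unprivileged. It only shows why a call made before the ops
table has been switched ends up on the native path.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a paravirt ops table (not the kernel's struct). */
struct irq_ops_sim {
    void (*irq_disable)(void);
};

static bool would_fault;    /* the privileged (native) path was taken */
static bool events_masked;  /* the PV-friendly path was taken */

/* Models the native op: a real `cli` in a PV guest raises #GP. */
static void native_irq_disable_sim(void) { would_fault = true; }

/* Models a PV op: masks event delivery without a privileged instruction. */
static void xen_irq_disable_sim(void) { events_masked = true; }

/* The table starts out pointing at the native op. */
static struct irq_ops_sim active_irq_ops = {
    .irq_disable = native_irq_disable_sim,
};

/* Platform setup replaces the native op with the PV one. */
static void install_pv_irq_ops_sim(void)
{
    active_irq_ops.irq_disable = xen_irq_disable_sim;
}

/* Callers always go through the table, never the instruction itself. */
static void local_irq_disable_sim(void)
{
    active_irq_ops.irq_disable();
}

/* Walks through both orderings and checks the outcome of each. */
void pvop_demo(void)
{
    local_irq_disable_sim();            /* before setup: native path */
    assert(would_fault && !events_masked);

    would_fault = false;
    install_pv_irq_ops_sim();
    local_irq_disable_sim();            /* after setup: PV path */
    assert(!would_fault && events_masked);
}
```

Calling pvop_demo() from a small main() exercises both cases. The design point
is that the caller never changes; only the time at which the table is switched
decides which path runs.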

* Re: xen crash with 4.17 kernel on Fedora
From: M A Young @ 2018-07-01 18:09 UTC
To: Andrew Cooper; +Cc: xen-devel

On Sun, 1 Jul 2018, Andrew Cooper wrote:

> [...]
> Can you see about trying to boot the old kernel as dom0, and the new
> kernel as a domU with pause on crash configured?
> /usr/libexec/xen/bin/xenctx should be able to pull a backtrace out of
> the crashed domain state if you pass the appropriate symbol table in.

I get (with kernel-4.17.3-200.fc28.x86_64 which is a bit easier)

rip: ffffffff81062330 native_irq_disable
flags: 00000246 i z p
rsp: ffffffff82203d90
rax: 0000000000000246 rcx: 0000000000000000 rdx: 0000000000000000
rbx: 00000000ffffffff rsi: 00000000ffffffff rdi: 0000000000000000
rbp: 0000000000000000 r8: ffffffff820bb698 r9: ffffffff82203e38
r10: 0000000000000000 r11: 0000000000000000 r12: 0000000000000000
r13: ffffffff820bb698 r14: ffffffff82203e38 r15: 0000000000000000
cs: e033 ss: e02b ds: 0000 es: 0000
fs: 0000 @ 0000000000000000
gs: 0000 @ ffffffff82731000/0000000000000000 __init_begin/
Code (instr addr ffffffff81062330)
00 00 00 00 00 57 9d c3 0f 1f 00 66 2e 0f 1f 84 00 00 00 00 00 <fa> c3 0f 1f 40 00 66 2e 0f 1f 84

Stack:
 0000000000000000 0000000000000000 0000000000000000 ffffffff81062330
 000000010000e030 0000000000010046 ffffffff82203dd8 000000000000e02b
 0000000000000246 ffffffff8110dff9 0000000000000000 0000000000000246
 0000000000000000 0000000000000000 ffffffff820a6cd0 ffffffff82203e88
 ffffffff82739000 8000000000000061 0000000000000000 0000000000000000

Call Trace:
                  [<ffffffff81062330>] native_irq_disable <--
ffffffff82203da8: [<ffffffff81062330>] native_irq_disable
ffffffff82203dd8: [<ffffffff8110dff9>] vprintk_emit+0xe9
ffffffff82203e30: [<ffffffff8110ec96>] printk+0x58
ffffffff82203e90: [<ffffffff810ac970>] __warn_printk+0x46
ffffffff82203ef8: [<ffffffff8275db62>] xen_load_gdt_boot+0x108
ffffffff82203f28: [<ffffffff81037c70>] load_direct_gdt+0x30
ffffffff82203f40: [<ffffffff81037f08>] switch_to_new_gdt+0x8
ffffffff82203f48: [<ffffffff8102aae0>] x86_init_noop
ffffffff82203f50: [<ffffffff8275dc8c>] xen_start_kernel+0xed

	Michael Young
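
The byte marked <fa> in the "Code" line of the xenctx output is the one at the
faulting rip. As a rough aid for reading that line, the sketch below names a
few one-byte x86-64 opcodes. The function name opcode_name_sim is made up for
this note, and the table covers only the bytes of interest here; it is not a
disassembler and ignores multi-byte instructions such as the padding nops.

```c
#include <assert.h>
#include <string.h>

/*
 * Names a handful of one-byte x86-64 opcodes. This is a lookup for the
 * bytes of interest only, not a disassembler.
 */
const char *opcode_name_sim(unsigned char byte)
{
    switch (byte) {
    case 0x57: return "push %rdi";
    case 0x9d: return "popf";
    case 0xc3: return "ret";
    case 0xfa: return "cli";
    case 0xfb: return "sti";
    default:   return "(not in table)";
    }
}

/* Checks the two bytes at and after the faulting address. */
void opcode_demo(void)
{
    assert(strcmp(opcode_name_sim(0xfa), "cli") == 0);
    assert(strcmp(opcode_name_sim(0xc3), "ret") == 0);
}
```

Read this way, the bytes "fa c3" at the instruction address are `cli; ret`,
which agrees with the native_irq_disable symbol that xenctx reports.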

* Re: xen crash with 4.17 kernel on Fedora
From: Michael Young @ 2018-07-01 21:26 UTC
To: Andrew Cooper; +Cc: xen-devel

On Sun, 1 Jul 2018, M A Young wrote:

> I get (with kernel-4.17.3-200.fc28.x86_64 which is a bit easier)
>
> rip: ffffffff81062330 native_irq_disable
> [...]
>
> Call Trace:
>                   [<ffffffff81062330>] native_irq_disable <--
> ffffffff82203da8: [<ffffffff81062330>] native_irq_disable
> ffffffff82203dd8: [<ffffffff8110dff9>] vprintk_emit+0xe9
> ffffffff82203e30: [<ffffffff8110ec96>] printk+0x58
> ffffffff82203e90: [<ffffffff810ac970>] __warn_printk+0x46
> ffffffff82203ef8: [<ffffffff8275db62>] xen_load_gdt_boot+0x108
> ffffffff82203f28: [<ffffffff81037c70>] load_direct_gdt+0x30
> ffffffff82203f40: [<ffffffff81037f08>] switch_to_new_gdt+0x8
> ffffffff82203f48: [<ffffffff8102aae0>] x86_init_noop
> ffffffff82203f50: [<ffffffff8275dc8c>] xen_start_kernel+0xed

The xen_load_gdt_boot code is

0xffffffff8275da5a <xen_load_gdt_boot>:	callq 0xffffffff81a017a0 <__fentry__>
0xffffffff8275da5f <xen_load_gdt_boot+5>:	push %r13
0xffffffff8275da61 <xen_load_gdt_boot+7>:	push %r12
0xffffffff8275da63 <xen_load_gdt_boot+9>:	push %rbp
0xffffffff8275da64 <xen_load_gdt_boot+10>:	push %rbx
0xffffffff8275da65 <xen_load_gdt_boot+11>:	push %rdx
0xffffffff8275da66 <xen_load_gdt_boot+12>:	movzwl (%rdi),%ebp
0xffffffff8275da69 <xen_load_gdt_boot+15>:	mov 0x2(%rdi),%r12
0xffffffff8275da6d <xen_load_gdt_boot+19>:	inc %ebp
0xffffffff8275da6f <xen_load_gdt_boot+21>:	cmp $0x1000,%ebp
0xffffffff8275da75 <xen_load_gdt_boot+27>:	jle 0xffffffff8275da79 <xen_load_gdt_boot+31>
0xffffffff8275da77 <xen_load_gdt_boot+29>:	ud2
0xffffffff8275da79 <xen_load_gdt_boot+31>:	test $0xfff,%r12d
0xffffffff8275da80 <xen_load_gdt_boot+38>:	je 0xffffffff8275da84 <xen_load_gdt_boot+42>
0xffffffff8275da82 <xen_load_gdt_boot+40>:	ud2
0xffffffff8275da84 <xen_load_gdt_boot+42>:	mov $0x80000000,%ebx
0xffffffff8275da89 <xen_load_gdt_boot+47>:	mov -0x54ba80(%rip),%rax # 0xffffffff82212010
0xffffffff8275da90 <xen_load_gdt_boot+54>:	add %r12,%rbx
0xffffffff8275da93 <xen_load_gdt_boot+57>:	mov %rbx,%rdi
0xffffffff8275da96 <xen_load_gdt_boot+60>:	jb 0xffffffff8275daa9 <xen_load_gdt_boot+79>
0xffffffff8275da98 <xen_load_gdt_boot+62>:	mov $0xffffffff80000000,%rbx
0xffffffff8275da9f <xen_load_gdt_boot+69>:	mov %rbx,%rax
0xffffffff8275daa2 <xen_load_gdt_boot+72>:	sub -0x5dec19(%rip),%rax # 0xffffffff8217ee90 <page_offset_base>
0xffffffff8275daa9 <xen_load_gdt_boot+79>:	lea (%rdi,%rax,1),%rbx
0xffffffff8275daad <xen_load_gdt_boot+83>:	mov %rbx,%rdi
0xffffffff8275dab0 <xen_load_gdt_boot+86>:	shr $0xc,%rdi
0xffffffff8275dab4 <xen_load_gdt_boot+90>:	cmpb $0x0,-0x3d0459(%rip) # 0xffffffff8238d662 <xen_features+2>
0xffffffff8275dabb <xen_load_gdt_boot+97>:	mov %rdi,%rax
0xffffffff8275dabe <xen_load_gdt_boot+100>:	jne 0xffffffff8275db02 <xen_load_gdt_boot+168>
0xffffffff8275dac0 <xen_load_gdt_boot+102>:	cmp -0x3d9a67(%rip),%rdi # 0xffffffff82384060 <xen_p2m_size>
0xffffffff8275dac7 <xen_load_gdt_boot+109>:	jae 0xffffffff8275dadc <xen_load_gdt_boot+130>
0xffffffff8275dac9 <xen_load_gdt_boot+111>:	mov -0x3d9a68(%rip),%rdx # 0xffffffff82384068 <xen_p2m_addr>
0xffffffff8275dad0 <xen_load_gdt_boot+118>:	mov (%rdx,%rdi,8),%rax
0xffffffff8275dad4 <xen_load_gdt_boot+122>:	cmp $0xffffffffffffffff,%rax
0xffffffff8275dad8 <xen_load_gdt_boot+126>:	jne 0xffffffff8275daf5 <xen_load_gdt_boot+155>
0xffffffff8275dada <xen_load_gdt_boot+128>:	jmp 0xffffffff8275daea <xen_load_gdt_boot+144>
0xffffffff8275dadc <xen_load_gdt_boot+130>:	bts $0x3e,%rax
0xffffffff8275dae1 <xen_load_gdt_boot+135>:	cmp -0x3d9a90(%rip),%rdi # 0xffffffff82384058 <xen_max_p2m_pfn>
0xffffffff8275dae8 <xen_load_gdt_boot+142>:	jae 0xffffffff8275daf5 <xen_load_gdt_boot+155>
0xffffffff8275daea <xen_load_gdt_boot+144>:	callq 0xffffffff81017190 <get_phys_to_machine>
0xffffffff8275daef <xen_load_gdt_boot+149>:	cmp $0xffffffffffffffff,%rax
0xffffffff8275daf3 <xen_load_gdt_boot+153>:	je 0xffffffff8275db02 <xen_load_gdt_boot+168>
0xffffffff8275daf5 <xen_load_gdt_boot+155>:	movabs $0x3fffffffffffffff,%rdx
0xffffffff8275daff <xen_load_gdt_boot+165>:	and %rdx,%rax
0xffffffff8275db02 <xen_load_gdt_boot+168>:	movabs $0x8000000000000161,%rsi
0xffffffff8275db0c <xen_load_gdt_boot+178>:	or -0x523d53(%rip),%rsi # 0xffffffff82239dc0 <sme_me_mask>
0xffffffff8275db13 <xen_load_gdt_boot+185>:	and -0x3d847a(%rip),%rsi # 0xffffffff823856a0 <__default_kernel_pte_mask>
0xffffffff8275db1a <xen_load_gdt_boot+192>:	mov %rax,(%rsp)
0xffffffff8275db1e <xen_load_gdt_boot+196>:	and $0xfffffffffffff000,%rbx
0xffffffff8275db25 <xen_load_gdt_boot+203>:	mov %rsi,%r13
0xffffffff8275db28 <xen_load_gdt_boot+206>:	test $0x1,%sil
0xffffffff8275db2c <xen_load_gdt_boot+210>:	je 0xffffffff8275db64 <xen_load_gdt_boot+266>
0xffffffff8275db2e <xen_load_gdt_boot+212>:	mov -0x3d848d(%rip),%rcx # 0xffffffff823856a8 <__supported_pte_mask>
0xffffffff8275db35 <xen_load_gdt_boot+219>:	and %rcx,%r13
0xffffffff8275db38 <xen_load_gdt_boot+222>:	cmp %r13,%rsi
0xffffffff8275db3b <xen_load_gdt_boot+225>:	je 0xffffffff8275db64 <xen_load_gdt_boot+266>
0xffffffff8275db3d <xen_load_gdt_boot+227>:	cmpb $0x0,-0x424ea8(%rip) # 0xffffffff82338c9c <__warned.24604>
0xffffffff8275db44 <xen_load_gdt_boot+234>:	jne 0xffffffff8275db64 <xen_load_gdt_boot+266>
0xffffffff8275db46 <xen_load_gdt_boot+236>:	mov %rcx,%rdx
0xffffffff8275db49 <xen_load_gdt_boot+239>:	mov $0xffffffff820a6cd0,%rdi
0xffffffff8275db50 <xen_load_gdt_boot+246>:	movb $0x1,-0x424ebb(%rip) # 0xffffffff82338c9c <__warned.24604>
0xffffffff8275db57 <xen_load_gdt_boot+253>:	not %rdx
0xffffffff8275db5a <xen_load_gdt_boot+256>:	and %rsi,%rdx
0xffffffff8275db5d <xen_load_gdt_boot+259>:	callq 0xffffffff810ac92a <__warn_printk>
0xffffffff8275db62 <xen_load_gdt_boot+264>:	ud2
0xffffffff8275db64 <xen_load_gdt_boot+266>:	or %r13,%rbx
0xffffffff8275db67 <xen_load_gdt_boot+269>:	mov %rbx,%rdi
0xffffffff8275db6a <xen_load_gdt_boot+272>:	callq *0xffffffff82185fd8
0xffffffff8275db71 <xen_load_gdt_boot+279>:	xor %edx,%edx
0xffffffff8275db73 <xen_load_gdt_boot+281>:	mov %rax,%rsi
0xffffffff8275db76 <xen_load_gdt_boot+284>:	mov %r12,%rdi
0xffffffff8275db79 <xen_load_gdt_boot+287>:	callq 0xffffffff810011c0 <xen_hypercall_update_va_mapping>
0xffffffff8275db7e <xen_load_gdt_boot+292>:	test %eax,%eax
0xffffffff8275db80 <xen_load_gdt_boot+294>:	je 0xffffffff8275db84 <xen_load_gdt_boot+298>
0xffffffff8275db82 <xen_load_gdt_boot+296>:	ud2
0xffffffff8275db84 <xen_load_gdt_boot+298>:	shr $0x3,%ebp
0xffffffff8275db87 <xen_load_gdt_boot+301>:	mov %rsp,%rdi
0xffffffff8275db8a <xen_load_gdt_boot+304>:	mov %ebp,%esi
0xffffffff8275db8c <xen_load_gdt_boot+306>:	callq 0xffffffff81001040 <xen_hypercall_set_gdt>
0xffffffff8275db91 <xen_load_gdt_boot+311>:	test %eax,%eax
0xffffffff8275db93 <xen_load_gdt_boot+313>:	je 0xffffffff8275db97 <xen_load_gdt_boot+317>
0xffffffff8275db95 <xen_load_gdt_boot+315>:	ud2
0xffffffff8275db97 <xen_load_gdt_boot+317>:	pop %rax
0xffffffff8275db98 <xen_load_gdt_boot+318>:	pop %rbx
0xffffffff8275db99 <xen_load_gdt_boot+319>:	pop %rbp
0xffffffff8275db9a <xen_load_gdt_boot+320>:	pop %r12
0xffffffff8275db9c <xen_load_gdt_boot+322>:	pop %r13
0xffffffff8275db9e <xen_load_gdt_boot+324>:	retq

I think the crash is triggered by the code

static inline pgprotval_t check_pgprot(pgprot_t pgprot)
{
	pgprotval_t massaged_val = massage_pgprot(pgprot);

	/* mmdebug.h can not be included here because of dependencies */
#ifdef CONFIG_DEBUG_VM
	WARN_ONCE(pgprot_val(pgprot) != massaged_val,
		  "attempted to set unsupported pgprot: %016llx "
		  "bits: %016llx supported: %016llx\n",
		  (u64)pgprot_val(pgprot),
		  (u64)pgprot_val(pgprot) ^ massaged_val,
		  (u64)__supported_pte_mask);
#endif

	return massaged_val;
}

static inline pte_t pfn_pte(unsigned long page_nr, pgprot_t pgprot)
{
	return __pte(((phys_addr_t)page_nr << PAGE_SHIFT) |
		     check_pgprot(pgprot));
}

in arch/x86/include/asm/pgtable.h, which is inlined into
xen_load_gdt_boot via pfn_pte.

In 4.16 the equivalent code was

static inline pte_t pfn_pte(unsigned long page_nr, pgprot_t pgprot)
{
	return __pte(((phys_addr_t)page_nr << PAGE_SHIFT) |
		     massage_pgprot(pgprot));
}

	Michael Young
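
The WARN_ONCE() condition can be checked by hand against the values visible in
the stack dumps earlier in the thread (8000000000000161, 0000000000000100,
fffffffffffffeff and 8000000000000061). The following is a small user-space
sketch of that arithmetic, assuming the masking step behaves as the
disassembly suggests (present bit tested, then an AND with the supported
mask). The *_sim names and macros are invented for this note and are not the
kernel's.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_PRESENT_SIM 0x001ULL  /* bit 0, tested by `test $0x1,%sil` */
#define PAGE_GLOBAL_SIM  0x100ULL  /* bit 8 */

/* Masks a present entry down to the supported bits, as the disassembly does. */
uint64_t massage_pgprot_sim(uint64_t prot, uint64_t supported_mask)
{
    if (prot & PAGE_PRESENT_SIM)
        prot &= supported_mask;
    return prot;
}

/* Bits the masking step dropped; nonzero corresponds to the warning firing. */
uint64_t dropped_bits_sim(uint64_t prot, uint64_t supported_mask)
{
    return prot ^ massage_pgprot_sim(prot, supported_mask);
}

/* Replays the check with the two values taken from the stack dump. */
void check_pgprot_demo(void)
{
    const uint64_t prot = 0x8000000000000161ULL;
    const uint64_t supported = 0xfffffffffffffeffULL;

    assert(massage_pgprot_sim(prot, supported) == 0x8000000000000061ULL);
    assert(dropped_bits_sim(prot, supported) == PAGE_GLOBAL_SIM);
}
```

The single dropped bit is 0x100, which matches the second argument seen on the
stack and points at the global bit as the unsupported one.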

* Re: xen crash with 4.17 kernel on Fedora
From: Juergen Gross @ 2018-07-02 6:33 UTC
To: Michael Young, Andrew Cooper; +Cc: xen-devel

On 01/07/18 23:26, Michael Young wrote:
> On Sun, 1 Jul 2018, M A Young wrote:
>
>> Call Trace:
>>                   [<ffffffff81062330>] native_irq_disable <--
>> ffffffff82203da8: [<ffffffff81062330>] native_irq_disable
>> ffffffff82203dd8: [<ffffffff8110dff9>] vprintk_emit+0xe9
>> ffffffff82203e30: [<ffffffff8110ec96>] printk+0x58
>> ffffffff82203e90: [<ffffffff810ac970>] __warn_printk+0x46
>> ffffffff82203ef8: [<ffffffff8275db62>] xen_load_gdt_boot+0x108
>> ffffffff82203f28: [<ffffffff81037c70>] load_direct_gdt+0x30
>> ffffffff82203f40: [<ffffffff81037f08>] switch_to_new_gdt+0x8
>> ffffffff82203f48: [<ffffffff8102aae0>] x86_init_noop
>> ffffffff82203f50: [<ffffffff8275dc8c>] xen_start_kernel+0xed
>
> I think the crash is triggered by the code
>
> static inline pgprotval_t check_pgprot(pgprot_t pgprot)
> {
> 	pgprotval_t massaged_val = massage_pgprot(pgprot);
>
> 	/* mmdebug.h can not be included here because of dependencies */
> #ifdef CONFIG_DEBUG_VM
> 	WARN_ONCE(pgprot_val(pgprot) != massaged_val,
> 		  "attempted to set unsupported pgprot: %016llx "
> 		  "bits: %016llx supported: %016llx\n",
> 		  (u64)pgprot_val(pgprot),
> 		  (u64)pgprot_val(pgprot) ^ massaged_val,
> 		  (u64)__supported_pte_mask);
> #endif
>
> 	return massaged_val;
> }
> [...]

There are two problems here:

1. pv_irq_ops hasn't been set up early enough, so the printk() will use
   native_irq_disable() instead of the Xen variant.

2. For PV domains the default kernel pte should not include the global
   bit. Repairing this issue will avoid the WARN_ONCE() above.

I'll send two patches soon to fix the issues.

Juergen
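
The second problem can be sketched in the same user-space style: the kernel
protection value is first ANDed with the default kernel mask and then checked
against the supported mask, so the warning goes away once the global bit is
cleared from both. The constant 0x8000000000000161 is the one visible in the
disassembly; the function name, the macros, and the assumption that the
default mask starts with all bits set are illustrative only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_PRESENT_SIM 0x001ULL
#define PAGE_GLOBAL_SIM  0x100ULL
#define KERNEL_PROT_SIM  0x8000000000000161ULL  /* constant seen in the disassembly */

/*
 * Returns true when the masked protection value still contains bits that
 * the supported mask would strip, i.e. when the warning would fire.
 */
bool would_warn_sim(uint64_t default_kernel_mask, uint64_t supported_mask)
{
    uint64_t prot = KERNEL_PROT_SIM & default_kernel_mask;
    uint64_t massaged = prot;

    if (prot & PAGE_PRESENT_SIM)
        massaged &= supported_mask;
    return prot != massaged;
}

/* Compares the two mask configurations for a PV guest. */
void default_mask_demo(void)
{
    const uint64_t supported = ~PAGE_GLOBAL_SIM;  /* PV: global bit unsupported */

    /* Assumed all-ones default mask: the global bit survives and is flagged. */
    assert(would_warn_sim(~0ULL, supported));

    /* Global bit cleared from the default mask as well: nothing to flag. */
    assert(!would_warn_sim(~PAGE_GLOBAL_SIM, supported));
}
```

This only models the mask arithmetic; it says nothing about when the masks are
initialised during boot, which is the other half of the issue.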

* Re: xen crash with 4.17 kernel on Fedora
From: Juergen Gross @ 2018-07-02 8:07 UTC
To: Michael Young, xen-devel

On 01/07/18 18:43, Michael Young wrote:
> I am seeing crash on boot and DomU (pv) on Fedora with the 4.17 kernel
> (eg. kernel-4.17.2-200.fc28.x86_64 and kernel-4.17.3-200.fc28.x86_64) which
> didn't occur with 4.16 kernel (eg. kernel-4.16.16-300.fc28.x86_64)

Could you please try the attached patches? They apply to either 4.17
or 4.18-rc.

The first one should let the kernel survive the WARN_ONCE(), while
the second will avoid hitting the WARN_ONCE().

Juergen

[-- Attachment #2: 0001-xen-setup-pv-irq-ops-vector-earlier.patch --]
[-- Type: text/x-patch, Size: 1830 bytes --]

>From baa8db1bd97958cccc67f8e894847104c51c27ef Mon Sep 17 00:00:00 2001
From: Juergen Gross <jgross@suse.com>
Date: Mon, 2 Jul 2018 09:09:18 +0200
Subject: [PATCH] xen: setup pv irq ops vector earlier

Setting pv_irq_ops for Xen PV domains should be done as early as
possible in order to support e.g. very early printk() usage.

Remove the no longer necessary conditional in xen_init_irq_ops()
from PVH V1 times to make clear this is a PV only function.

Cc: <stable@vger.kernel.org> # 4.14
Signed-off-by: Juergen Gross <jgross@suse.com>
---
 arch/x86/xen/enlighten_pv.c | 3 +--
 arch/x86/xen/irq.c          | 4 +---
 2 files changed, 2 insertions(+), 5 deletions(-)

diff --git a/arch/x86/xen/enlighten_pv.c b/arch/x86/xen/enlighten_pv.c
index 8d4e2e1ae60b..0f4cd9e5bed4 100644
--- a/arch/x86/xen/enlighten_pv.c
+++ b/arch/x86/xen/enlighten_pv.c
@@ -1213,6 +1213,7 @@ asmlinkage __visible void __init xen_start_kernel(void)
 	pv_info = xen_info;
 	pv_init_ops.patch = paravirt_patch_default;
 	pv_cpu_ops = xen_cpu_ops;
+	xen_init_irq_ops();
 
 	x86_platform.get_nmi_reason = xen_get_nmi_reason;
 
@@ -1249,8 +1250,6 @@ asmlinkage __visible void __init xen_start_kernel(void)
 	get_cpu_cap(&boot_cpu_data);
 	x86_configure_nx();
 
-	xen_init_irq_ops();
-
 	/* Let's presume PV guests always boot on vCPU with id 0. */
 	per_cpu(xen_vcpu_id, 0) = 0;
 
diff --git a/arch/x86/xen/irq.c b/arch/x86/xen/irq.c
index 74179852e46c..7515a19fd324 100644
--- a/arch/x86/xen/irq.c
+++ b/arch/x86/xen/irq.c
@@ -128,8 +128,6 @@ static const struct pv_irq_ops xen_irq_ops __initconst = {
 
 void __init xen_init_irq_ops(void)
 {
-	/* For PVH we use default pv_irq_ops settings. */
-	if (!xen_feature(XENFEAT_hvm_callback_vector))
-		pv_irq_ops = xen_irq_ops;
+	pv_irq_ops = xen_irq_ops;
 	x86_init.irqs.intr_init = xen_init_IRQ;
 }
-- 
2.13.7

[-- Attachment #3: 0002-xen-remove-global-bit-from-__default_kernel_pte_mask.patch --]
[-- Type: text/x-patch, Size: 1140 bytes --]

>From 2ab1412c43762f27e65bd18d8c1ffde9133a56b1 Mon Sep 17 00:00:00 2001
From: Juergen Gross <jgross@suse.com>
Date: Mon, 2 Jul 2018 09:31:36 +0200
Subject: [PATCH] xen: remove global bit from __default_kernel_pte_mask for pv guests

When removing the global bit from __supported_pte_mask do the same for
__default_kernel_pte_mask in order to avoid the WARN_ONCE() in
check_pgprot() when setting a kernel pte before having called
init_mem_mapping().

Cc: <stable@vger.kernel.org> # 4.17
Reported-by: Michael Young <m.a.young@durham.ac.uk>
Signed-off-by: Juergen Gross <jgross@suse.com>
---
 arch/x86/xen/enlighten_pv.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/x86/xen/enlighten_pv.c b/arch/x86/xen/enlighten_pv.c
index 0f4cd9e5bed4..cf7b13d3e911 100644
--- a/arch/x86/xen/enlighten_pv.c
+++ b/arch/x86/xen/enlighten_pv.c
@@ -1230,6 +1230,7 @@ asmlinkage __visible void __init xen_start_kernel(void)
 
 	/* Prevent unwanted bits from being set in PTEs. */
 	__supported_pte_mask &= ~_PAGE_GLOBAL;
+	__default_kernel_pte_mask &= ~_PAGE_GLOBAL;
 
 	/*
 	 * Prevent page tables from being allocated in highmem, even
-- 
2.13.7

* Re: xen crash with 4.17 kernel on Fedora
From: M A Young @ 2018-07-02 9:31 UTC
To: Juergen Gross; +Cc: xen-devel

On Mon, 2 Jul 2018, Juergen Gross wrote:

> On 01/07/18 18:43, Michael Young wrote:
> > I am seeing crash on boot and DomU (pv) on Fedora with the 4.17 kernel
> > (eg. kernel-4.17.2-200.fc28.x86_64 and kernel-4.17.3-200.fc28.x86_64) which
> > didn't occur with 4.16 kernel (eg. kernel-4.16.16-300.fc28.x86_64)
>
> Could you please try the attached patches? They apply to either 4.17
> or 4.18-rc.
>
> The first one should let the kernel survive the WARN_ONCE(), while
> the second will avoid hitting the WARN_ONCE().

Yes, kernel-4.17.3-200.fc28 with these patches applied boots as a DomU and
I checked dmesg, /var/log/messages and journalctl for pgprot messages and
didn't find anything.

	Michael Young