* xen crash with 4.17 kernel on Fedora
@ 2018-07-01 16:43 Michael Young
2018-07-01 17:41 ` Andrew Cooper
2018-07-02 8:07 ` Juergen Gross
0 siblings, 2 replies; 7+ messages in thread
From: Michael Young @ 2018-07-01 16:43 UTC (permalink / raw)
To: xen-devel
I am seeing a crash on boot, both as Dom0 and as a DomU (pv), on Fedora with
the 4.17 kernel (e.g. kernel-4.17.2-200.fc28.x86_64 and
kernel-4.17.3-200.fc28.x86_64), which didn't occur with the 4.16 kernel
(e.g. kernel-4.16.16-300.fc28.x86_64).
The backtrace for a Dom0 boot of xen-4.10.1-5.fc28.x86_64 running
kernel-4.17.2-200.fc28.x86_64 is
(XEN) d0v0 Unhandled general protection fault fault/trap [#13, ec=0000]
(XEN) domain_crash_sync called from entry.S: fault at ffff82d08035557c x86_64/entry.S#create_bounce_frame+0x135/0x159
(XEN) Domain 0 (vcpu#0) crashed on cpu#0:
(XEN) ----[ Xen-4.10.1 x86_64 debug=n Not tainted ]----
(XEN) CPU: 0
(XEN) RIP: e033:[<ffffffff81062330>]
(XEN) RFLAGS: 0000000000000246 EM: 1 CONTEXT: pv guest (d0v0)
(XEN) rax: 0000000000000246 rbx: 00000000ffffffff rcx: 0000000000000000
(XEN) rdx: 0000000000000000 rsi: 00000000ffffffff rdi: 0000000000000000
(XEN) rbp: 0000000000000000 rsp: ffffffff82203d90 r8: ffffffff820bb698
(XEN) r9: ffffffff82203e38 r10: 0000000000000000 r11: 0000000000000000
(XEN) r12: 0000000000000000 r13: ffffffff820bb698 r14: ffffffff82203e38
(XEN) r15: 0000000000000000 cr0: 0000000080050033 cr4: 00000000000006e0
(XEN) cr3: 000000001aacf000 cr2: 0000000000000000
(XEN) fsb: 0000000000000000 gsb: ffffffff82731000 gss: 0000000000000000
(XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: e02b cs: e033
(XEN) Guest stack trace from rsp=ffffffff82203d90:
(XEN) 0000000000000000 0000000000000000 0000000000000000 ffffffff81062330
(XEN) 000000010000e030 0000000000010046 ffffffff82203dd8 000000000000e02b
(XEN) 0000000000000246 ffffffff8110e019 0000000000000000 0000000000000246
(XEN) 0000000000000000 0000000000000000 ffffffff820a6cd8 ffffffff82203e88
(XEN) ffffffff82739000 8000000000000061 0000000000000000 0000000000000000
(XEN) ffffffff8110ecb6 0000000000000008 ffffffff82203e98 ffffffff82203e58
(XEN) 0000000000000000 0000000000000000 8000000000000161 0000000000000100
(XEN) fffffffffffffeff 0000000000000000 0000000000000000 ffffffff82203ef0
(XEN) ffffffff810ac990 0000000000000000 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000000 8000000000000161 0000000000000100
(XEN) fffffffffffffeff 0000000000000000 0000000000000000 0000000002739000
(XEN) 0000000000000080 ffffffff8275db62 000000000001a739 0000000000000000
(XEN) 0000000000000000 0000000000000000 0000000000000000 ffffffff81037c80
(XEN) 007fffff8275efe7 ffffffff82739000 ffffffff81037f18 ffffffff8102aaf0
(XEN) ffffffff8275dc8c 0000000000000000 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000000 0f00000060c0c748 ccccccccccccc305
where
addr2line -f -e vmlinux ffffffff81062330
gives
native_irq_disable
/usr/src/debug/kernel-4.17.fc28/linux-4.17.2-200.fc28.x86_64/./arch/x86/include/asm/irqflags.h:44
What is the problem or how might it be debugged?
Michael Young
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
* Re: xen crash with 4.17 kernel on Fedora
2018-07-01 16:43 xen crash with 4.17 kernel on Fedora Michael Young
@ 2018-07-01 17:41 ` Andrew Cooper
2018-07-01 18:09 ` M A Young
2018-07-02 8:07 ` Juergen Gross
1 sibling, 1 reply; 7+ messages in thread
From: Andrew Cooper @ 2018-07-01 17:41 UTC (permalink / raw)
To: Michael Young, xen-devel
On 01/07/18 17:43, Michael Young wrote:
> I am seeing a crash on boot, both as Dom0 and as a DomU (pv), on Fedora with
> the 4.17 kernel (e.g. kernel-4.17.2-200.fc28.x86_64 and
> kernel-4.17.3-200.fc28.x86_64), which didn't occur with the 4.16 kernel
> (e.g. kernel-4.16.16-300.fc28.x86_64).
>
> What is the problem or how might it be debugged?
The guest is executing a native `cli` instruction, which is privileged
and which we don't allow (we could trap & emulate, but we can't provide
proper STI-shadow behaviour, and such a guest might also expect popf to
work, which it very much doesn't). In Linux, that codepath should be using
a pvop, rather than a native op.
It is either a subsystem which should be skipped when virtualised, or a
poorly coded subsystem, or a buggy setup path.
Can you see about trying to boot the old kernel as dom0, and the new
kernel as a domU with pause on crash configured?
/usr/libexec/xen/bin/xenctx should be able to pull a backtrace out of
the crashed domain state if you pass the appropriate symbol table in.
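As a rough illustration of the pvop indirection described above, here is a minimal user-space C sketch. It is not kernel code: `struct vcpu_model`, `struct irq_ops_model` and all `*_model` functions are hypothetical names invented for this example. The native variant only records that a privileged instruction would have been attempted (which, for a PV guest, ends in the #GP seen in the backtrace), while the paravirtual variant is assumed to mask event delivery by setting a flag in memory instead.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of per-vCPU state; not a real Xen or Linux structure. */
struct vcpu_model {
    bool events_masked;        /* PV equivalent of "interrupts disabled" */
    bool privileged_op_tried;  /* set when a native `cli` would have run */
};

struct vcpu_model vcpu;

/* Native variant: stands in for executing `cli`, which a PV guest may not do. */
void native_irq_disable_model(void)
{
    vcpu.privileged_op_tried = true;
}

/* PV variant: only flips a flag, no privileged instruction involved. */
void pv_irq_disable_model(void)
{
    vcpu.events_masked = true;
}

/* Ops table: callers go through the pointer instead of hard-coding `cli`. */
struct irq_ops_model {
    void (*irq_disable)(void);
};

/* Defaults to the native variant until a platform setup path replaces it. */
struct irq_ops_model irq_ops = { .irq_disable = native_irq_disable_model };

void local_irq_disable_model(void)
{
    irq_ops.irq_disable();
}
```

Calling `local_irq_disable_model()` before `irq_ops.irq_disable` has been reassigned takes the native path; in this model, that is what a buggy setup path would look like.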
~Andrew
* Re: xen crash with 4.17 kernel on Fedora
2018-07-01 17:41 ` Andrew Cooper
@ 2018-07-01 18:09 ` M A Young
2018-07-01 21:26 ` Michael Young
0 siblings, 1 reply; 7+ messages in thread
From: M A Young @ 2018-07-01 18:09 UTC (permalink / raw)
To: Andrew Cooper; +Cc: xen-devel
On Sun, 1 Jul 2018, Andrew Cooper wrote:
> On 01/07/18 17:43, Michael Young wrote:
> > I am seeing a crash on boot, both as Dom0 and as a DomU (pv), on Fedora
> > with the 4.17 kernel (e.g. kernel-4.17.2-200.fc28.x86_64 and
> > kernel-4.17.3-200.fc28.x86_64), which didn't occur with the 4.16 kernel
> > (e.g. kernel-4.16.16-300.fc28.x86_64).
> >
> > What is the problem or how might it be debugged?
>
> The guest is executing a native `cli` instruction, which is privileged
> and which we don't allow (we could trap & emulate, but we can't provide
> proper STI-shadow behaviour, and such a guest might also expect popf to
> work, which it very much doesn't). In Linux, that codepath should be using
> a pvop, rather than a native op.
>
> It is either a subsystem which should be skipped when virtualised, or a
> poorly coded subsystem, or a buggy setup path.
>
> Can you see about trying to boot the old kernel as dom0, and the new
> kernel as a domU with pause on crash configured?
> /usr/libexec/xen/bin/xenctx should be able to pull a backtrace out of
> the crashed domain state if you pass the appropriate symbol table in.
I get (with kernel-4.17.3-200.fc28.x86_64 which is a bit easier)
rip: ffffffff81062330 native_irq_disable
flags: 00000246 i z p
rsp: ffffffff82203d90
rax: 0000000000000246 rcx: 0000000000000000 rdx: 0000000000000000
rbx: 00000000ffffffff rsi: 00000000ffffffff rdi: 0000000000000000
rbp: 0000000000000000 r8: ffffffff820bb698 r9: ffffffff82203e38
r10: 0000000000000000 r11: 0000000000000000 r12: 0000000000000000
r13: ffffffff820bb698 r14: ffffffff82203e38 r15: 0000000000000000
cs: e033 ss: e02b ds: 0000 es: 0000
fs: 0000 @ 0000000000000000
gs: 0000 @ ffffffff82731000/0000000000000000 __init_begin/
Code (instr addr ffffffff81062330)
00 00 00 00 00 57 9d c3 0f 1f 00 66 2e 0f 1f 84 00 00 00 00 00 <fa> c3 0f
1f 40 00 66 2e 0f 1f 84
Stack:
0000000000000000 0000000000000000 0000000000000000 ffffffff81062330
000000010000e030 0000000000010046 ffffffff82203dd8 000000000000e02b
0000000000000246 ffffffff8110dff9 0000000000000000 0000000000000246
0000000000000000 0000000000000000 ffffffff820a6cd0 ffffffff82203e88
ffffffff82739000 8000000000000061 0000000000000000 0000000000000000
Call Trace:
[<ffffffff81062330>] native_irq_disable <--
ffffffff82203da8: [<ffffffff81062330>] native_irq_disable
ffffffff82203dd8: [<ffffffff8110dff9>] vprintk_emit+0xe9
ffffffff82203e30: [<ffffffff8110ec96>] printk+0x58
ffffffff82203e90: [<ffffffff810ac970>] __warn_printk+0x46
ffffffff82203ef8: [<ffffffff8275db62>] xen_load_gdt_boot+0x108
ffffffff82203f28: [<ffffffff81037c70>] load_direct_gdt+0x30
ffffffff82203f40: [<ffffffff81037f08>] switch_to_new_gdt+0x8
ffffffff82203f48: [<ffffffff8102aae0>] x86_init_noop
ffffffff82203f50: [<ffffffff8275dc8c>] xen_start_kernel+0xed
Michael Young
* Re: xen crash with 4.17 kernel on Fedora
2018-07-01 18:09 ` M A Young
@ 2018-07-01 21:26 ` Michael Young
2018-07-02 6:33 ` Juergen Gross
0 siblings, 1 reply; 7+ messages in thread
From: Michael Young @ 2018-07-01 21:26 UTC (permalink / raw)
To: Andrew Cooper; +Cc: xen-devel
On Sun, 1 Jul 2018, M A Young wrote:
> I get (with kernel-4.17.3-200.fc28.x86_64 which is a bit easier)
>
> rip: ffffffff81062330 native_irq_disable
> flags: 00000246 i z p
> rsp: ffffffff82203d90
> rax: 0000000000000246 rcx: 0000000000000000 rdx: 0000000000000000
> rbx: 00000000ffffffff rsi: 00000000ffffffff rdi: 0000000000000000
> rbp: 0000000000000000 r8: ffffffff820bb698 r9: ffffffff82203e38
> r10: 0000000000000000 r11: 0000000000000000 r12: 0000000000000000
> r13: ffffffff820bb698 r14: ffffffff82203e38 r15: 0000000000000000
> cs: e033 ss: e02b ds: 0000 es: 0000
> fs: 0000 @ 0000000000000000
> gs: 0000 @ ffffffff82731000/0000000000000000 __init_begin/
> Code (instr addr ffffffff81062330)
> 00 00 00 00 00 57 9d c3 0f 1f 00 66 2e 0f 1f 84 00 00 00 00 00 <fa> c3 0f
> 1f 40 00 66 2e 0f 1f 84
>
>
> Stack:
> 0000000000000000 0000000000000000 0000000000000000 ffffffff81062330
> 000000010000e030 0000000000010046 ffffffff82203dd8 000000000000e02b
> 0000000000000246 ffffffff8110dff9 0000000000000000 0000000000000246
> 0000000000000000 0000000000000000 ffffffff820a6cd0 ffffffff82203e88
> ffffffff82739000 8000000000000061 0000000000000000 0000000000000000
>
> Call Trace:
> [<ffffffff81062330>] native_irq_disable <--
> ffffffff82203da8: [<ffffffff81062330>] native_irq_disable
> ffffffff82203dd8: [<ffffffff8110dff9>] vprintk_emit+0xe9
> ffffffff82203e30: [<ffffffff8110ec96>] printk+0x58
> ffffffff82203e90: [<ffffffff810ac970>] __warn_printk+0x46
> ffffffff82203ef8: [<ffffffff8275db62>] xen_load_gdt_boot+0x108
> ffffffff82203f28: [<ffffffff81037c70>] load_direct_gdt+0x30
> ffffffff82203f40: [<ffffffff81037f08>] switch_to_new_gdt+0x8
> ffffffff82203f48: [<ffffffff8102aae0>] x86_init_noop
> ffffffff82203f50: [<ffffffff8275dc8c>] xen_start_kernel+0xed
The xen_load_gdt_boot code is
0xffffffff8275da5a <xen_load_gdt_boot>: callq 0xffffffff81a017a0 <__fentry__>
0xffffffff8275da5f <xen_load_gdt_boot+5>: push %r13
0xffffffff8275da61 <xen_load_gdt_boot+7>: push %r12
0xffffffff8275da63 <xen_load_gdt_boot+9>: push %rbp
0xffffffff8275da64 <xen_load_gdt_boot+10>: push %rbx
0xffffffff8275da65 <xen_load_gdt_boot+11>: push %rdx
0xffffffff8275da66 <xen_load_gdt_boot+12>: movzwl (%rdi),%ebp
0xffffffff8275da69 <xen_load_gdt_boot+15>: mov 0x2(%rdi),%r12
0xffffffff8275da6d <xen_load_gdt_boot+19>: inc %ebp
0xffffffff8275da6f <xen_load_gdt_boot+21>: cmp $0x1000,%ebp
0xffffffff8275da75 <xen_load_gdt_boot+27>: jle 0xffffffff8275da79 <xen_load_gdt_boot+31>
0xffffffff8275da77 <xen_load_gdt_boot+29>: ud2
0xffffffff8275da79 <xen_load_gdt_boot+31>: test $0xfff,%r12d
0xffffffff8275da80 <xen_load_gdt_boot+38>: je 0xffffffff8275da84 <xen_load_gdt_boot+42>
0xffffffff8275da82 <xen_load_gdt_boot+40>: ud2
0xffffffff8275da84 <xen_load_gdt_boot+42>: mov $0x80000000,%ebx
0xffffffff8275da89 <xen_load_gdt_boot+47>: mov -0x54ba80(%rip),%rax # 0xffffffff82212010
0xffffffff8275da90 <xen_load_gdt_boot+54>: add %r12,%rbx
0xffffffff8275da93 <xen_load_gdt_boot+57>: mov %rbx,%rdi
0xffffffff8275da96 <xen_load_gdt_boot+60>: jb 0xffffffff8275daa9 <xen_load_gdt_boot+79>
0xffffffff8275da98 <xen_load_gdt_boot+62>: mov $0xffffffff80000000,%rbx
0xffffffff8275da9f <xen_load_gdt_boot+69>: mov %rbx,%rax
0xffffffff8275daa2 <xen_load_gdt_boot+72>: sub -0x5dec19(%rip),%rax # 0xffffffff8217ee90 <page_offset_base>
0xffffffff8275daa9 <xen_load_gdt_boot+79>: lea (%rdi,%rax,1),%rbx
0xffffffff8275daad <xen_load_gdt_boot+83>: mov %rbx,%rdi
0xffffffff8275dab0 <xen_load_gdt_boot+86>: shr $0xc,%rdi
0xffffffff8275dab4 <xen_load_gdt_boot+90>: cmpb $0x0,-0x3d0459(%rip) # 0xffffffff8238d662 <xen_features+2>
0xffffffff8275dabb <xen_load_gdt_boot+97>: mov %rdi,%rax
0xffffffff8275dabe <xen_load_gdt_boot+100>: jne 0xffffffff8275db02 <xen_load_gdt_boot+168>
0xffffffff8275dac0 <xen_load_gdt_boot+102>: cmp -0x3d9a67(%rip),%rdi # 0xffffffff82384060 <xen_p2m_size>
0xffffffff8275dac7 <xen_load_gdt_boot+109>: jae 0xffffffff8275dadc <xen_load_gdt_boot+130>
0xffffffff8275dac9 <xen_load_gdt_boot+111>: mov -0x3d9a68(%rip),%rdx # 0xffffffff82384068 <xen_p2m_addr>
0xffffffff8275dad0 <xen_load_gdt_boot+118>: mov (%rdx,%rdi,8),%rax
0xffffffff8275dad4 <xen_load_gdt_boot+122>: cmp $0xffffffffffffffff,%rax
0xffffffff8275dad8 <xen_load_gdt_boot+126>: jne 0xffffffff8275daf5 <xen_load_gdt_boot+155>
0xffffffff8275dada <xen_load_gdt_boot+128>: jmp 0xffffffff8275daea <xen_load_gdt_boot+144>
0xffffffff8275dadc <xen_load_gdt_boot+130>: bts $0x3e,%rax
0xffffffff8275dae1 <xen_load_gdt_boot+135>: cmp -0x3d9a90(%rip),%rdi # 0xffffffff82384058 <xen_max_p2m_pfn>
0xffffffff8275dae8 <xen_load_gdt_boot+142>: jae 0xffffffff8275daf5 <xen_load_gdt_boot+155>
0xffffffff8275daea <xen_load_gdt_boot+144>: callq 0xffffffff81017190 <get_phys_to_machine>
0xffffffff8275daef <xen_load_gdt_boot+149>: cmp $0xffffffffffffffff,%rax
0xffffffff8275daf3 <xen_load_gdt_boot+153>: je 0xffffffff8275db02 <xen_load_gdt_boot+168>
0xffffffff8275daf5 <xen_load_gdt_boot+155>: movabs $0x3fffffffffffffff,%rdx
0xffffffff8275daff <xen_load_gdt_boot+165>: and %rdx,%rax
0xffffffff8275db02 <xen_load_gdt_boot+168>: movabs $0x8000000000000161,%rsi
0xffffffff8275db0c <xen_load_gdt_boot+178>: or -0x523d53(%rip),%rsi # 0xffffffff82239dc0 <sme_me_mask>
0xffffffff8275db13 <xen_load_gdt_boot+185>: and -0x3d847a(%rip),%rsi # 0xffffffff823856a0 <__default_kernel_pte_mask>
0xffffffff8275db1a <xen_load_gdt_boot+192>: mov %rax,(%rsp)
0xffffffff8275db1e <xen_load_gdt_boot+196>: and $0xfffffffffffff000,%rbx
0xffffffff8275db25 <xen_load_gdt_boot+203>: mov %rsi,%r13
0xffffffff8275db28 <xen_load_gdt_boot+206>: test $0x1,%sil
0xffffffff8275db2c <xen_load_gdt_boot+210>: je 0xffffffff8275db64 <xen_load_gdt_boot+266>
0xffffffff8275db2e <xen_load_gdt_boot+212>: mov -0x3d848d(%rip),%rcx # 0xffffffff823856a8 <__supported_pte_mask>
0xffffffff8275db35 <xen_load_gdt_boot+219>: and %rcx,%r13
0xffffffff8275db38 <xen_load_gdt_boot+222>: cmp %r13,%rsi
0xffffffff8275db3b <xen_load_gdt_boot+225>: je 0xffffffff8275db64 <xen_load_gdt_boot+266>
0xffffffff8275db3d <xen_load_gdt_boot+227>: cmpb $0x0,-0x424ea8(%rip) # 0xffffffff82338c9c <__warned.24604>
0xffffffff8275db44 <xen_load_gdt_boot+234>: jne 0xffffffff8275db64 <xen_load_gdt_boot+266>
0xffffffff8275db46 <xen_load_gdt_boot+236>: mov %rcx,%rdx
0xffffffff8275db49 <xen_load_gdt_boot+239>: mov $0xffffffff820a6cd0,%rdi
0xffffffff8275db50 <xen_load_gdt_boot+246>: movb $0x1,-0x424ebb(%rip) # 0xffffffff82338c9c <__warned.24604>
0xffffffff8275db57 <xen_load_gdt_boot+253>: not %rdx
0xffffffff8275db5a <xen_load_gdt_boot+256>: and %rsi,%rdx
0xffffffff8275db5d <xen_load_gdt_boot+259>: callq 0xffffffff810ac92a <__warn_printk>
0xffffffff8275db62 <xen_load_gdt_boot+264>: ud2
0xffffffff8275db64 <xen_load_gdt_boot+266>: or %r13,%rbx
0xffffffff8275db67 <xen_load_gdt_boot+269>: mov %rbx,%rdi
0xffffffff8275db6a <xen_load_gdt_boot+272>: callq *0xffffffff82185fd8
0xffffffff8275db71 <xen_load_gdt_boot+279>: xor %edx,%edx
0xffffffff8275db73 <xen_load_gdt_boot+281>: mov %rax,%rsi
0xffffffff8275db76 <xen_load_gdt_boot+284>: mov %r12,%rdi
0xffffffff8275db79 <xen_load_gdt_boot+287>: callq 0xffffffff810011c0 <xen_hypercall_update_va_mapping>
0xffffffff8275db7e <xen_load_gdt_boot+292>: test %eax,%eax
0xffffffff8275db80 <xen_load_gdt_boot+294>: je 0xffffffff8275db84 <xen_load_gdt_boot+298>
0xffffffff8275db82 <xen_load_gdt_boot+296>: ud2
0xffffffff8275db84 <xen_load_gdt_boot+298>: shr $0x3,%ebp
0xffffffff8275db87 <xen_load_gdt_boot+301>: mov %rsp,%rdi
0xffffffff8275db8a <xen_load_gdt_boot+304>: mov %ebp,%esi
0xffffffff8275db8c <xen_load_gdt_boot+306>: callq 0xffffffff81001040 <xen_hypercall_set_gdt>
0xffffffff8275db91 <xen_load_gdt_boot+311>: test %eax,%eax
0xffffffff8275db93 <xen_load_gdt_boot+313>: je 0xffffffff8275db97 <xen_load_gdt_boot+317>
0xffffffff8275db95 <xen_load_gdt_boot+315>: ud2
0xffffffff8275db97 <xen_load_gdt_boot+317>: pop %rax
0xffffffff8275db98 <xen_load_gdt_boot+318>: pop %rbx
0xffffffff8275db99 <xen_load_gdt_boot+319>: pop %rbp
0xffffffff8275db9a <xen_load_gdt_boot+320>: pop %r12
0xffffffff8275db9c <xen_load_gdt_boot+322>: pop %r13
0xffffffff8275db9e <xen_load_gdt_boot+324>: retq
I think the crash is triggered by the code

static inline pgprotval_t check_pgprot(pgprot_t pgprot)
{
	pgprotval_t massaged_val = massage_pgprot(pgprot);

	/* mmdebug.h can not be included here because of dependencies */
#ifdef CONFIG_DEBUG_VM
	WARN_ONCE(pgprot_val(pgprot) != massaged_val,
		  "attempted to set unsupported pgprot: %016llx "
		  "bits: %016llx supported: %016llx\n",
		  (u64)pgprot_val(pgprot),
		  (u64)pgprot_val(pgprot) ^ massaged_val,
		  (u64)__supported_pte_mask);
#endif

	return massaged_val;
}

static inline pte_t pfn_pte(unsigned long page_nr, pgprot_t pgprot)
{
	return __pte(((phys_addr_t)page_nr << PAGE_SHIFT) |
		     check_pgprot(pgprot));
}

in arch/x86/include/asm/pgtable.h, which is inlined into xen_load_gdt_boot
via pfn_pte.

In 4.16 the equivalent code was

static inline pte_t pfn_pte(unsigned long page_nr, pgprot_t pgprot)
{
	return __pte(((phys_addr_t)page_nr << PAGE_SHIFT) |
		     massage_pgprot(pgprot));
}
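The arithmetic behind that warning can be checked with a small stand-alone C sketch. It is a user-space model, not the kernel's code: `massage_model()` and `unsupported_bits()` are hypothetical names, and the masking rule (filter through the supported mask only when the present bit is set) is inferred from the `test $0x1,%sil` / `and %rcx,%r13` sequence in the disassembly. The values 0x8000000000000161, 0x100 and 0xfffffffffffffeff appear next to each other in the Dom0 stack dump in the first message; reading them as the three WARN_ONCE() arguments is an assumption.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t pgprotval_t;

#define PAGE_PRESENT_BIT 0x001ULL  /* x86 PTE bit 0 */
#define PAGE_GLOBAL_BIT  0x100ULL  /* x86 PTE bit 8 */

/* Models massage_pgprot(): present entries are filtered by the supported mask. */
pgprotval_t massage_model(pgprotval_t val, pgprotval_t supported)
{
    if (val & PAGE_PRESENT_BIT)
        val &= supported;
    return val;
}

/* Bits the check would report as unsupported; 0 means no warning fires. */
pgprotval_t unsupported_bits(pgprotval_t val, pgprotval_t supported)
{
    return val ^ massage_model(val, supported);
}
```

With a supported mask of 0xfffffffffffffeff (everything except the global bit), a request of 0x8000000000000161 is massaged to 0x8000000000000061 and the differing bit is 0x100, i.e. the global bit.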
Michael Young
* Re: xen crash with 4.17 kernel on Fedora
2018-07-01 21:26 ` Michael Young
@ 2018-07-02 6:33 ` Juergen Gross
0 siblings, 0 replies; 7+ messages in thread
From: Juergen Gross @ 2018-07-02 6:33 UTC (permalink / raw)
To: Michael Young, Andrew Cooper; +Cc: xen-devel
On 01/07/18 23:26, Michael Young wrote:
> On Sun, 1 Jul 2018, M A Young wrote:
>
>> I get (with kernel-4.17.3-200.fc28.x86_64 which is a bit easier)
>>
>> rip: ffffffff81062330 native_irq_disable
>> flags: 00000246 i z p
>> rsp: ffffffff82203d90
>> rax: 0000000000000246 rcx: 0000000000000000 rdx: 0000000000000000
>> rbx: 00000000ffffffff rsi: 00000000ffffffff rdi: 0000000000000000
>> rbp: 0000000000000000 r8: ffffffff820bb698 r9: ffffffff82203e38
>> r10: 0000000000000000 r11: 0000000000000000 r12: 0000000000000000
>> r13: ffffffff820bb698 r14: ffffffff82203e38 r15: 0000000000000000
>> cs: e033 ss: e02b ds: 0000 es: 0000
>> fs: 0000 @ 0000000000000000
>> gs: 0000 @ ffffffff82731000/0000000000000000 __init_begin/
>> Code (instr addr ffffffff81062330)
>> 00 00 00 00 00 57 9d c3 0f 1f 00 66 2e 0f 1f 84 00 00 00 00 00 <fa> c3 0f
>> 1f 40 00 66 2e 0f 1f 84
>>
>>
>> Stack:
>> 0000000000000000 0000000000000000 0000000000000000 ffffffff81062330
>> 000000010000e030 0000000000010046 ffffffff82203dd8 000000000000e02b
>> 0000000000000246 ffffffff8110dff9 0000000000000000 0000000000000246
>> 0000000000000000 0000000000000000 ffffffff820a6cd0 ffffffff82203e88
>> ffffffff82739000 8000000000000061 0000000000000000 0000000000000000
>>
>> Call Trace:
>> [<ffffffff81062330>] native_irq_disable <--
>> ffffffff82203da8: [<ffffffff81062330>] native_irq_disable
>> ffffffff82203dd8: [<ffffffff8110dff9>] vprintk_emit+0xe9
>> ffffffff82203e30: [<ffffffff8110ec96>] printk+0x58
>> ffffffff82203e90: [<ffffffff810ac970>] __warn_printk+0x46
>> ffffffff82203ef8: [<ffffffff8275db62>] xen_load_gdt_boot+0x108
>> ffffffff82203f28: [<ffffffff81037c70>] load_direct_gdt+0x30
>> ffffffff82203f40: [<ffffffff81037f08>] switch_to_new_gdt+0x8
>> ffffffff82203f48: [<ffffffff8102aae0>] x86_init_noop
>> ffffffff82203f50: [<ffffffff8275dc8c>] xen_start_kernel+0xed
>
> I think the crash is triggered by the code
>
> static inline pgprotval_t check_pgprot(pgprot_t pgprot)
> {
> pgprotval_t massaged_val = massage_pgprot(pgprot);
>
> /* mmdebug.h can not be included here because of dependencies */
> #ifdef CONFIG_DEBUG_VM
> WARN_ONCE(pgprot_val(pgprot) != massaged_val,
> "attempted to set unsupported pgprot: %016llx "
> "bits: %016llx supported: %016llx\n",
> (u64)pgprot_val(pgprot),
> (u64)pgprot_val(pgprot) ^ massaged_val,
> (u64)__supported_pte_mask);
> #endif
>
> return massaged_val;
> }
>
> static inline pte_t pfn_pte(unsigned long page_nr, pgprot_t pgprot)
> {
> return __pte(((phys_addr_t)page_nr << PAGE_SHIFT) |
> check_pgprot(pgprot));
> }
>
> in arch/x86/include/asm/pgtable.h, which is inlined into
> xen_load_gdt_boot via pfn_pte.
>
> In 4.16 the equivalent code was
>
> static inline pte_t pfn_pte(unsigned long page_nr, pgprot_t pgprot)
> {
> return __pte(((phys_addr_t)page_nr << PAGE_SHIFT) |
> massage_pgprot(pgprot));
> }
There are two problems here:
1. pv_irq_ops hasn't been set up early enough, so the printk() will use
native_irq_disable() instead of the Xen variant.
2. For PV domains the default kernel pte should not include the global
bit. Repairing this issue will avoid the WARN_ONCE() above.
I'll send two patches soon to fix the issues.
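The second point can be illustrated with a short stand-alone C sketch of how the two masks interact. It is only a model: `struct pte_masks_model`, `rejected_bits()` and the two `pv_setup_*` helpers are hypothetical names, and the sketch assumes a present PTE whose protection bits are first filtered by the default kernel mask and then compared against the supported mask.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_GLOBAL_BIT 0x100ULL  /* x86 PTE bit 8 */

/* Hypothetical container for the two masks discussed in this thread. */
struct pte_masks_model {
    uint64_t supported;       /* models __supported_pte_mask */
    uint64_t default_kernel;  /* models __default_kernel_pte_mask */
};

/* Bits a default kernel protection still carries that the supported mask rejects. */
uint64_t rejected_bits(const struct pte_masks_model *m, uint64_t requested)
{
    uint64_t prot = requested & m->default_kernel;
    return prot & ~m->supported;
}

/* PV setup that clears the global bit from the supported mask only. */
void pv_setup_supported_only(struct pte_masks_model *m)
{
    m->supported &= ~PAGE_GLOBAL_BIT;
}

/* PV setup that clears the global bit from both masks. */
void pv_setup_both(struct pte_masks_model *m)
{
    m->supported &= ~PAGE_GLOBAL_BIT;
    m->default_kernel &= ~PAGE_GLOBAL_BIT;
}
```

In this model, `pv_setup_supported_only()` leaves the global bit in the default kernel protection, so `rejected_bits()` is non-zero for a request such as 0x8000000000000161; after `pv_setup_both()` the two masks agree and nothing is rejected.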
Juergen
* Re: xen crash with 4.17 kernel on Fedora
2018-07-01 16:43 xen crash with 4.17 kernel on Fedora Michael Young
2018-07-01 17:41 ` Andrew Cooper
@ 2018-07-02 8:07 ` Juergen Gross
2018-07-02 9:31 ` M A Young
1 sibling, 1 reply; 7+ messages in thread
From: Juergen Gross @ 2018-07-02 8:07 UTC (permalink / raw)
To: Michael Young, xen-devel
On 01/07/18 18:43, Michael Young wrote:
> I am seeing a crash on boot, both as Dom0 and as a DomU (pv), on Fedora with
> the 4.17 kernel (e.g. kernel-4.17.2-200.fc28.x86_64 and
> kernel-4.17.3-200.fc28.x86_64), which didn't occur with the 4.16 kernel
> (e.g. kernel-4.16.16-300.fc28.x86_64).
Could you please try the attached patches? They apply to either 4.17
or 4.18-rc.
The first one should let the kernel survive the WARN_ONCE(), while
the second will avoid hitting the WARN_ONCE().
Juergen
[-- Attachment #2: 0001-xen-setup-pv-irq-ops-vector-earlier.patch --]
[-- Type: text/x-patch, Size: 1830 bytes --]
From baa8db1bd97958cccc67f8e894847104c51c27ef Mon Sep 17 00:00:00 2001
From: Juergen Gross <jgross@suse.com>
Date: Mon, 2 Jul 2018 09:09:18 +0200
Subject: [PATCH] xen: setup pv irq ops vector earlier
Setting pv_irq_ops for Xen PV domains should be done as early as
possible in order to support e.g. very early printk() usage.
Remove the no longer necessary conditional in xen_init_irq_ops()
from PVH V1 times to make clear this is a PV only function.
Cc: <stable@vger.kernel.org> # 4.14
Signed-off-by: Juergen Gross <jgross@suse.com>
---
arch/x86/xen/enlighten_pv.c | 3 +--
arch/x86/xen/irq.c | 4 +---
2 files changed, 2 insertions(+), 5 deletions(-)
diff --git a/arch/x86/xen/enlighten_pv.c b/arch/x86/xen/enlighten_pv.c
index 8d4e2e1ae60b..0f4cd9e5bed4 100644
--- a/arch/x86/xen/enlighten_pv.c
+++ b/arch/x86/xen/enlighten_pv.c
@@ -1213,6 +1213,7 @@ asmlinkage __visible void __init xen_start_kernel(void)
pv_info = xen_info;
pv_init_ops.patch = paravirt_patch_default;
pv_cpu_ops = xen_cpu_ops;
+ xen_init_irq_ops();
x86_platform.get_nmi_reason = xen_get_nmi_reason;
@@ -1249,8 +1250,6 @@ asmlinkage __visible void __init xen_start_kernel(void)
get_cpu_cap(&boot_cpu_data);
x86_configure_nx();
- xen_init_irq_ops();
-
/* Let's presume PV guests always boot on vCPU with id 0. */
per_cpu(xen_vcpu_id, 0) = 0;
diff --git a/arch/x86/xen/irq.c b/arch/x86/xen/irq.c
index 74179852e46c..7515a19fd324 100644
--- a/arch/x86/xen/irq.c
+++ b/arch/x86/xen/irq.c
@@ -128,8 +128,6 @@ static const struct pv_irq_ops xen_irq_ops __initconst = {
void __init xen_init_irq_ops(void)
{
- /* For PVH we use default pv_irq_ops settings. */
- if (!xen_feature(XENFEAT_hvm_callback_vector))
- pv_irq_ops = xen_irq_ops;
+ pv_irq_ops = xen_irq_ops;
x86_init.irqs.intr_init = xen_init_IRQ;
}
--
2.13.7
[-- Attachment #3: 0002-xen-remove-global-bit-from-__default_kernel_pte_mask.patch --]
[-- Type: text/x-patch, Size: 1140 bytes --]
From 2ab1412c43762f27e65bd18d8c1ffde9133a56b1 Mon Sep 17 00:00:00 2001
From: Juergen Gross <jgross@suse.com>
Date: Mon, 2 Jul 2018 09:31:36 +0200
Subject: [PATCH] xen: remove global bit from __default_kernel_pte_mask for
pv guests
When removing the global bit from __supported_pte_mask do the same for
__default_kernel_pte_mask in order to avoid the WARN_ONCE() in
check_pgprot() when setting a kernel pte before having called
init_mem_mapping().
Cc: <stable@vger.kernel.org> # 4.17
Reported-by: Michael Young <m.a.young@durham.ac.uk>
Signed-off-by: Juergen Gross <jgross@suse.com>
---
arch/x86/xen/enlighten_pv.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/arch/x86/xen/enlighten_pv.c b/arch/x86/xen/enlighten_pv.c
index 0f4cd9e5bed4..cf7b13d3e911 100644
--- a/arch/x86/xen/enlighten_pv.c
+++ b/arch/x86/xen/enlighten_pv.c
@@ -1230,6 +1230,7 @@ asmlinkage __visible void __init xen_start_kernel(void)
/* Prevent unwanted bits from being set in PTEs. */
__supported_pte_mask &= ~_PAGE_GLOBAL;
+ __default_kernel_pte_mask &= ~_PAGE_GLOBAL;
/*
* Prevent page tables from being allocated in highmem, even
--
2.13.7
* Re: xen crash with 4.17 kernel on Fedora
2018-07-02 8:07 ` Juergen Gross
@ 2018-07-02 9:31 ` M A Young
0 siblings, 0 replies; 7+ messages in thread
From: M A Young @ 2018-07-02 9:31 UTC (permalink / raw)
To: Juergen Gross; +Cc: xen-devel
On Mon, 2 Jul 2018, Juergen Gross wrote:
> On 01/07/18 18:43, Michael Young wrote:
> > I am seeing a crash on boot, both as Dom0 and as a DomU (pv), on Fedora
> > with the 4.17 kernel (e.g. kernel-4.17.2-200.fc28.x86_64 and
> > kernel-4.17.3-200.fc28.x86_64), which didn't occur with the 4.16 kernel
> > (e.g. kernel-4.16.16-300.fc28.x86_64).
>
> Could you please try the attached patches? They apply to either 4.17
> or 4.18-rc.
>
> The first one should let the kernel survive the WARN_ONCE(), while
> the second will avoid hitting the WARN_ONCE().
Yes, kernel-4.17.3-200.fc28 with these patches applied boots as a DomU and
I checked dmesg, /var/log/messages and journalctl for pgprot messages and
didn't find anything.
Michael Young