Hello,

Wei recently discovered an issue when running a Linux PVH Dom0 on a box with an Intel Family 6 (0x6), Model 158 (0x9e), Stepping 9 (raw 000906e9) CPU. We are not sure whether the issue is limited to a PVH Dom0 or just happens to be easier to trigger in this scenario.

The issue is caused by what seems to be the injection of an interrupt with the same or lower priority than the interrupt Xen is currently servicing, i.e. the earlier interrupt hasn't been EOI'ed and the ISR bit for its vector is still set. This injection always happens when returning from idle from ACPI_STATE_C3 or a deeper state.

Note that I haven't been able to reproduce this issue when using mwait-idle=0 or max_cstate=2 on the Xen command line, but without knowing the underlying cause it's impossible to tell whether that is relevant.

Andrew provided a debug patch, which I've expanded to also log power state transitions; it is attached to this email. Here is a trace of a crash, together with the debug info.

(XEN) *** Pending EOI error ***
(XEN) cpu #1, irq 30, vector 0x21, sp 1
(XEN) Peoi stack: sp 1
(XEN) [ 0] irq 30, vec 0x21, ready 0, ISR 1, TMR 0, IRR 0
(XEN) Peoi stack trace records:
(XEN) [22619] POP {sp 1, irq 30, vec 0x21}
(XEN) [22620] POWER TYPE 4
(XEN) [22621] IDLE PPR 0x00000010
(XEN)   IRR 0000000000000000000000000000000000000000000000000000000000000000
(XEN)   ISR 0000000000000000000000000000000000000000000000000000000000000000
(XEN) [22622] WAKE PPR 0x00000010
(XEN)   IRR 0000000000000000000000000000000000000000000000000000000000000004
(XEN)   ISR 0000000000000000000000000000000000000000000000000000000000000000
(XEN) [22623] ACK_PRE PPR 0x000000f0
(XEN)   IRR 0000000000000000000000000000000000000000000000000000000000000000
(XEN)   ISR 0000000000000000000000000000000000000000000000000000000000000004
(XEN) [22624] ACK_POST PPR 0x00000010
(XEN)   IRR 0000000000000000000000000000000000000000000000000000000000000000
(XEN)   ISR 0000000000000000000000000000000000000000000000000000000000000000
(XEN) [22625] POWER TYPE 5
(XEN) [22626] IDLE PPR 0x00000010
(XEN)   IRR 0000000000000000000000000000000000000000000000000000000000000000
(XEN)   ISR 0000000000000000000000000000000000000000000000000000000000000000
(XEN) [22627] WAKE PPR 0x00000010
(XEN)   IRR 0000000002000000000000000000000000000000000000000000000000000000
(XEN)   ISR 0000000000000000000000000000000000000000000000000000000000000000
(XEN) [22628] PUSH {sp 0, irq 30, vec 0x21}
(XEN) [22629] POWER TYPE 5
(XEN) [22630] IDLE PPR 0x00000020
(XEN)   IRR 0000000000000000000000000000000000000000000000000000000000000000
(XEN)   ISR 0000000002000000000000000000000000000000000000000000000000000000
(XEN) [22631] WAKE PPR 0x00000020
(XEN)   IRR 0000000002000000000000000000000000000000000000000000000000000000
(XEN)   ISR 0000000002000000000000000000000000000000000000000000000000000000
(XEN) [22632] POWER TYPE 5
(XEN) [22633] IDLE PPR 0x00000020
(XEN)   IRR 0000000002000000000000000000000000000000000000000000000000000000
(XEN)   ISR 0000000002000000000000000000000000000000000000000000000000000000
(XEN) [22634] WAKE PPR 0x00000020
(XEN)   IRR 0000000002000000000000000000000000000000000000000000000000000004
(XEN)   ISR 0000000002000000000000000000000000000000000000000000000000000000
(XEN) [22635] ACK_PRE PPR 0x000000f0
(XEN)   IRR 0000000002000000000000000000000000000000000000000000000000000000
(XEN)   ISR 0000000002000000000000000000000000000000000000000000000000000004
(XEN) [22636] ACK_POST PPR 0x00000020
(XEN)   IRR 0000000002000000000000000000000000000000000000000000000000000000
(XEN)   ISR 0000000002000000000000000000000000000000000000000000000000000000
(XEN) [22637] READY {sp 1, irq 30, vec 0x21}
(XEN) [22638] ACK_PRE PPR 0x00000020
(XEN)   IRR 0000000002000000000000000000000000000000000000000000000000000000
(XEN)   ISR 0000000002000000000000000000000000000000000000000000000000000000
(XEN) [22639] ACK_POST PPR 0x00000010
(XEN)   IRR 0000000002000000000000000000000000000000000000000000000000000000
(XEN)   ISR 0000000000000000000000000000000000000000000000000000000000000000
(XEN) [22640] POP {sp 1, irq 30, vec 0x21}
(XEN) [22641] PUSH {sp 0, irq 30, vec 0x21}
(XEN) [22642] POWER TYPE 4
(XEN) [22643] IDLE PPR 0x00000020
(XEN)   IRR 0000000000000000000000000000000000000000000000000000000000000000
(XEN)   ISR 0000000002000000000000000000000000000000000000000000000000000000
(XEN) [22644] WAKE PPR 0x00000020
(XEN)   IRR 0000000002000000000000000000000000000000000000000000000000000000
(XEN)   ISR 0000000002000000000000000000000000000000000000000000000000000000
(XEN) [22645] POWER TYPE 3
(XEN) [22646] IDLE PPR 0x00000020
(XEN)   IRR 0000000002000000000000000000000000000000000000000000000000000000
(XEN)   ISR 0000000002000000000000000000000000000000000000000000000000000000
(XEN) [22647] WAKE PPR 0x00000020
(XEN)   IRR 0000000002000000000000000000000000000000000000000000000000000000
(XEN)   ISR 0000000002000000000000000000000000000000000000000000000000000000
(XEN) [22648] POWER TYPE 3
(XEN) [22649] IDLE PPR 0x00000020
(XEN)   IRR 0000000002000000000000000000000000000000000000000000000000000000
(XEN)   ISR 0000000002000000000000000000000000000000000000000000000000000000
(XEN) [22650] WAKE PPR 0x00000020
(XEN)   IRR 0000000002000000000000000000000000000000000000000000000000000000
(XEN)   ISR 0000000002000000000000000000000000000000000000000000000000000000
(XEN) All LAPIC state:
(XEN) [vector]      ISR      TMR      IRR
(XEN) [1f:00]  00000000 00000000 00000000
(XEN) [3f:20]  00000002 00000000 00000000
(XEN) [5f:40]  00000000 00000000 00000000
(XEN) [7f:60]  00000000 00000000 00000000
(XEN) [9f:80]  00000000 00000000 00000000
(XEN) [bf:a0]  00000000 00000000 00000000
(XEN) [df:c0]  00000000 00000000 00000000
(XEN) [ff:e0]  00000000 00000000 04000000
(XEN) Assertion '(sp == 0) || (peoi[sp-1].vector < vector)' failed at irq.c:1340
(XEN) ----[ Xen-4.12-unstable x86_64 debug=y Tainted: C ]----
(XEN) CPU: 1
(XEN) RIP: e008:[] do_IRQ+0x8df/0xacb
(XEN) RFLAGS: 0000000000010002 CONTEXT: hypervisor
(XEN) rax: ffff83086c67202c rbx: 0000000000000180 rcx: 0000000000000000
(XEN) rdx: ffff83086c68ffff rsi: 000000000000000a rdi: ffff83086c601e24
(XEN) rbp: ffff83086c68fd98 rsp: ffff83086c68fd38 r8: ffff83086c690000
(XEN) r9: 0000000000000030 r10: 0000000004000000 r11: 0000000000000007
(XEN) r12: 000000000000011f r13: 00000000ffffffff r14: ffff83086c601e00
(XEN) r15: ffff82cfffffb100 cr0: 0000000080050033 cr4: 00000000003526e0
(XEN) cr3: 0000000855ba7000 cr2: 0000556bfa53c040
(XEN) fsb: 0000000000000000 gsb: 0000000000000000 gss: 0000000000000000
(XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: 0000 cs: e008
(XEN) Xen code around (do_IRQ+0x8df/0xacb):
(XEN)  8d 7e 24 e8 51 66 fb ff <0f> 0b 0f 0b 0f 0b 0f 0b b8 00 00 00 00 eb 4e 83
(XEN) Xen stack trace from rsp=ffff83086c68fd38:
(XEN)    ffff82d000000000 ffff83086c601e24 0000000000000000 ffff83086c6724e0
(XEN)    ffff82d08037b841 ffff82d08037b835 ffff82d08037b841 0000000000000000
(XEN)    0000000000000000 0000000000000000 ffff83086c68ffff 0000000000000000
(XEN)    00007cf793970237 ffff82d08037b8aa 00000003040712e5 0000000000000008
(XEN)    ffff83086c671448 ffff83086c671390 ffff83086c68fec0 00000003040b3015
(XEN)    ffff83086c672d08 ffff83086c6724e0 ffff83086c672d28 0000000000000180
(XEN)    ffff83086c67202c 0000000000000000 ffff83086c68ffff 0000000000002ccf
(XEN)    ffff83086c6713c0 0000002100000000 ffff82d0802e2403 000000000000e008
(XEN)    0000000000000202 ffff83086c68fe50 0000000000000000 ffff830088dd4000
(XEN)    00000020ffffffff 0000000000000000 ffff83086c68fee8 ffff82d08059bd00
(XEN)    0000000000000000 0000000000000000 000002d90000017f ffff82d0805a3c80
(XEN)    0000000000000001 ffff82d08059bd00 0000000000000001 0000000000000001
(XEN)    ffff830856085000 ffff83086c68fef0 ffff82d08027755d ffff83086c6a5000
(XEN)    ffff830088dd4000 ffff830088bfa000 ffff83086c6a5000 ffff83086c68fdb8
(XEN)    0000000000000000 0000000000000000 ffff880269a3bd00 ffff880269a3bd00
(XEN)    0000000000000005 0000000000000005 0000000000000000 0000000000000120
(XEN)    0000000000000000 000000002059d803 ffffffff816fe980 ffff88027335a7c0
(XEN)    ffffffff82049af8 ffff88027335a7c0 00000000dade4600 0000beef0000beef
(XEN)    ffffffff816fec52 000000bf0000beef 0000000000000246 ffffc90000d13e98
(XEN)    000000000000beef ffff83086c68beef 000000000000beef 000000000000beef
(XEN) Xen call trace:
(XEN)    [] do_IRQ+0x8df/0xacb
(XEN)    [] common_interrupt+0x10a/0x120
(XEN)    [] mwait-idle.c#mwait_idle+0x2a5/0x381
(XEN)    [] domain.c#idle_loop+0xb3/0xb5
(XEN)
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 1:
(XEN) Assertion '(sp == 0) || (peoi[sp-1].vector < vector)' failed at irq.c:1340
(XEN) ****************************************
(XEN)
(XEN) Manual reset required ('noreboot' specified)

Finally, I'm also providing the surrounding context of the instruction pointers in the trace above:

(XEN)    [] do_IRQ+0x8df/0xacb
xen/arch/x86/irq.c:1340:

  1325      if ( action->ack_type == ACKTYPE_EOI )
  1326      {
  1327          sp = pending_eoi_sp(peoi);
  1328          if ( !((sp == 0) || (peoi[sp-1].vector < vector)) )
  1329          {
  1330              printk("*** Pending EOI error ***\n");
  1331              printk(" cpu #%u, irq %d, vector 0x%x, sp %d\n",
  1332                     smp_processor_id(), irq, vector, sp);
  1333
  1334              dump_peoi_stack(sp);
  1335              dump_peoi_records();
  1336              dump_lapic();
  1337
  1338              spin_unlock(&desc->lock);
  1339
->1340              assert_failed("(sp == 0) || (peoi[sp-1].vector < vector)");
  1341          }
  1342
  1343          ASSERT(sp < (NR_DYNAMIC_VECTORS-1));
  1344          peoi[sp].irq = irq;
  1345          peoi[sp].vector = vector;
  1346          peoi[sp].ready = 0;
  1347          pending_eoi_sp(peoi) = sp+1;
  1348          cpumask_set_cpu(smp_processor_id(), action->cpu_eoi_map);

(XEN)    [] common_interrupt+0x10a/0x120
xen/arch/x86/x86_64/entry.S:58

    47          /* Inject exception if pending. */
    48          lea   VCPU_trap_bounce(%rbx), %rdx
    49          testb $TBF_EXCEPTION, TRAPBOUNCE_flags(%rdx)
    50          jnz   .Lprocess_trapbounce
    51
    52          cmpb  $0, VCPU_mce_pending(%rbx)
    53          jne   process_mce
    54  .Ltest_guest_nmi:
    55          cmpb  $0, VCPU_nmi_pending(%rbx)
    56          jne   process_nmi
    57  test_guest_events:
->  58          movq  VCPU_vcpu_info(%rbx), %rax
    59          movzwl VCPUINFO_upcall_pending(%rax), %eax
    60          decl  %eax
    61          cmpl  $0xfe, %eax
    62          ja    restore_all_guest
    63  /*process_guest_events:*/
    64          sti
    65          leaq  VCPU_trap_bounce(%rbx), %rdx
    66          movq  VCPU_event_addr(%rbx), %rax
    67          movq  %rax, TRAPBOUNCE_eip(%rdx)
    68          movb  $TBF_INTERRUPT, TRAPBOUNCE_flags(%rdx)
    69          call  create_bounce_frame
    70          jmp   test_all_events

(XEN)    [] mwait-idle.c#mwait_idle+0x2a5/0x381
xen/arch/x86/cpu/mwait-idle.c:802

   788          if (cpu_is_haltable(cpu))
   789                  mwait_idle_with_hints(eax, MWAIT_ECX_INTERRUPT_BREAK);
   790
   791          after = cpuidle_get_tick();
   792
   793          cstate_restore_tsc();
   794          trace_exit_reason(irq_traced);
   795          TRACE_6D(TRC_PM_IDLE_EXIT, cx->type, after,
   796                  irq_traced[0], irq_traced[1], irq_traced[2], irq_traced[3]);
   797
   798          /* Now back in C0. */
   799          update_idle_stats(power, cx, before, after);
   800          local_irq_enable();
   801
-> 802          if (!(lapic_timer_reliable_states & (1 << cstate)))
   803                  lapic_timer_on();
   804
   805          sched_tick_resume();
   806          cpufreq_dbs_timer_resume();

(XEN)    [] domain.c#idle_loop+0xb3/0xb5
xen/arch/x86/domain.c:144

   129      for ( ; ; )
   130      {
   131          if ( cpu_is_offline(cpu) )
   132              play_dead();
   133
   134          /* Are we here for running vcpu context tasklets, or for idling? */
   135          if ( unlikely(tasklet_work_to_do(cpu)) )
   136              do_tasklet();
   137          /*
   138           * Test softirqs twice --- first to see if should even try scrubbing
   139           * and then, after it is done, whether softirqs became pending
   140           * while we were scrubbing.
   141           */
   142          else if ( !softirq_pending(cpu) && !scrub_free_pages() &&
   143                    !softirq_pending(cpu) )
-> 144              pm_idle();
   145          do_softirq();
   146          /*
   147           * We MUST be last (or before pm_idle). Otherwise after we get the
   148           * softirq we would execute pm_idle (and sleep) and not patch.
   149           */
   150          check_for_livepatch_work();
   151      }