From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Thimo E."
Subject: Re: cpuidle and un-eoid interrupts at the local apic
Date: Wed, 31 Jul 2013 10:30:13 +0200
Message-ID: <51F8CB15.1070608@digithi.de>
References: <51A908CA.7050604@citrix.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; Format="flowed"
Content-Transfer-Encoding: 7bit
In-Reply-To: <51A908CA.7050604@citrix.com>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Andrew Cooper
Cc: Keir Fraser, Jan Beulich, Xen-devel List
List-Id: xen-devel@lists.xenproject.org

Hello all,

I also have a Haswell system. I am running XenServer 6.2 (with Xen 4.1.5) on it and I am experiencing the same issue. Do you already have a solution for this problem?

Best regards
Thimo

(XEN) Assertion '(sp == 0) || (peoi[sp-1].vector < vector)' failed at irq.c:1027
(XEN) ----[ Xen-4.1.5.debug x86_64 debug=y Not tainted ]----
(XEN) CPU: 1
(XEN) RIP: e008:[] do_IRQ+0x3ba/0x6d9
(XEN) RFLAGS: 0000000000010002 CONTEXT: hypervisor
(XEN) rax: 0000000000000001 rbx: ffff83081f080f00 rcx: ffff83081f05b340
(XEN) rdx: 0000000000000001 rsi: 000000000000002b rdi: 0000000000000001
(XEN) rbp: ffff83081f057d88 rsp: ffff83081f057d18 r8: ffff83081f05b63c
(XEN) r9: 000070044fb97100 r10: ffff8300b858c060 r11: 000020f3f5a4dea5
(XEN) r12: 000000000000002b r13: ffff83081f004e80 r14: 000000000000001d
(XEN) r15: 0000000000000002 cr0: 000000008005003b cr4: 00000000001026f0
(XEN) cr3: 000000045915f000 cr2: 0000000000150008
(XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: e010 cs: e008
(XEN) Xen stack trace from rsp=ffff83081f057d18:
(XEN) 000000000000001d 000000000000001d ffff83081f080f00 0000000000000000
(XEN) 00000000ffffffea ffff83081f080f00 0000000000000000 0000000000000000
(XEN) ffffffffffffffff ffff83081f057f18 ffff83081f06bb00 ffff83081f06bb90
(XEN) ffff8300b858c000 0000000000000002 00007cf7e0fa8247 ffff82c480161a66
(XEN) 0000000000000002 ffff8300b858c000 ffff83081f06bb90 ffff83081f06bb00
(XEN) ffff83081f057ef0 ffff83081f057f18 000020f3f5a4dea5 ffff8300b858c060
(XEN) 000070044fb97100 ffff83081f05bb80 0000000000007f40 0000000000000001
(XEN) 0000000000000000 000020f3c755a972 ffff83081f06bb90 0000002b00000000
(XEN) ffff82c4801a21f0 000000000000e008 0000000000000246 ffff83081f057e48
(XEN) 000000000000e010 ffff83081f057ef0 ffff82c4801a3dc4 000020f3f595c09c
(XEN) 000020f3f596987e ffff8306383e3010 ffff83081f05b100 ffffffffffffffff
(XEN) 0000000000000001 0000000000000001 ffffffffffffffff ffff83081f057f18
(XEN) 00000000802d4680 0000000000000000 0000000000000000 ffff82c4802d4680
(XEN) 000002a80000024b ffff8300b8586000 ffff83081f057f18 ffff8300b8586000
(XEN) ffff8300b858c000 ffff8300b858c000 0000000000000002 ffff83081f057f10
(XEN) ffff82c48015a261 ffff82c480126ccd 0000000000000001 ffff83081f057d18
(XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000000 0000000000000246 ffff88001a8093a0
(XEN) 0000000100885e0f 000000000000000f 0000000000000000 ffffffff802063aa
(XEN) 0000000000000001 00000000deadbeef 00000000deadbeef 0000010000000000
(XEN) Xen call trace:
(XEN) [] do_IRQ+0x3ba/0x6d9
(XEN) [] common_interrupt+0x26/0x30
(XEN) [] lapic_timer_nop+0x0/0x6
(XEN) [] idle_loop+0x48/0x59
(XEN)
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 1:
(XEN) Assertion '(sp == 0) || (peoi[sp-1].vector < vector)' failed at irq.c:1027
(XEN) ****************************************
(XEN)
(XEN) Reboot in five seconds...

On 31.05.2013 22:32, Andrew Cooper wrote:
> Recently our automated testing system has caught a curious assertion
> while testing Xen 4.1.5 on a HaswellDT system.
>
> (XEN) Assertion '(sp == 0) || (peoi[sp-1].vector < vector)' failed at irq.c:1030
> (XEN) ----[ Xen-4.1.5 x86_64 debug=n Not tainted ]----
> (XEN) CPU: 0
> (XEN) RIP: e008:[] do_IRQ+0x514/0x750
> (XEN) RFLAGS: 0000000000010093 CONTEXT: hypervisor
> (XEN) rax: 000000000000002f rbx: ffff830249841e80 rcx: ffff82c4803127c0
> (XEN) rdx: 0000000000000004 rsi: 0000000000000027 rdi: 0000000000000001
> (XEN) rbp: 0000000000001e00 rsp: ffff82c4802bfd48 r8: ffff82c480312abc
> (XEN) r9: ffff8302498a5948 r10: 0000000000000009 r11: ffff8302498c6c80
> (XEN) r12: ffff830243b07f50 r13: ffff8300a24f8000 r14: 00000af8373788e3
> (XEN) r15: ffff830249841e80 cr0: 000000008005003b cr4: 00000000001026f0
> (XEN) cr3: 00000002479e6000 cr2: 00000000e6d3c090
> (XEN) ds: 007b es: 007b fs: 00d8 gs: 0000 ss: 0000 cs: e008
> (XEN) Xen stack trace from rsp=ffff82c4802bfd48:
> (XEN) ffff830249841eb4 ffff82c480312ec0 000000000000001e 0000001e00000000
> (XEN) 0000000000000000 00000000498a5670 ffff830249841d80 ffff830249840080
> (XEN) ffff830249841db4 0000000000000000 ffff8302498a55e0 ffff8302498a5670
> (XEN) ffff8300a24f8000 00000af8373788e3 00000af83736b8ed ffff82c480162ca0
> (XEN) 00000af83736b8ed 00000af8373788e3 ffff8300a24f8000 ffff8302498a5670
> (XEN) ffff8302498a55e0 0000000000000000 ffff8302498c6c80 0000000000000009
> (XEN) ffff8302498a5948 ffff82c480313000 0000000000007f40 0000000000000001
> (XEN) 0000000000000000 0000000000000000 00000af80db652fd 0000002700000000
> (XEN) ffff82c4801a50a0 000000000000e008 0000000000000246 ffff82c4802bfe78
> (XEN) 0000000000000000 ffff8302498a5670 ffff82c4801a6a56 ffffffffffffffff
> (XEN) ffff830249818000 0000000000000000 ffff8300a24f8000 ffff82c480122c11
> (XEN) 00000af839021119 0000000000000000 0000000000000000 00000000802bff18
> (XEN) 0000025c0000013b ffff82c4802e7580 ffff82c4802bff18 ffff8300a2838000
> (XEN) ffff82c4802f61a0 ffff8300a24f8000 0000000000000002 00000af837304b45
> (XEN) ffff82c48015b67a 0000000000000000 0000000000000000 0000000000000000
> (XEN) 0000000000000000 0000000000000000 00000000ee8a3f8c 0000000000000001
> (XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> (XEN) 0000000000000000 0000000000000000 00000000ee8a3f74 0000000000000af8
> (XEN) 0000000000000001 0000010000000000 00000000c01013a7 0000000000000061
> (XEN) 0000000000000246 00000000ee8a3f70 0000000000000069 0000000000000000
> (XEN) Xen call trace:
> (XEN) [] do_IRQ+0x514/0x750
> (XEN) 15[] common_interrupt+0x20/0x30
> (XEN) 32[] lapic_timer_nop+0x0/0x10
> (XEN) 38[] acpi_processor_idle+0x376/0x740
> (XEN) 43[] do_block+0x71/0xd0
> (XEN) 56[] idle_loop+0x1a/0x50
> (XEN)
> (XEN)
> (XEN) ****************************************
> (XEN) Panic on CPU 0:
> (XEN) Assertion '(sp == 0) || (peoi[sp-1].vector < vector)' failed at irq.c:1030
> (XEN) ****************************************
>
> And the disassembly before the assertion:
>
> ffff82c48016b29f: 48 8d 14 85 00 00 00 lea 0x0(,%rax,4),%rdx
> ffff82c48016b2a6: 00
> ffff82c48016b2a7: 0f b6 44 11 ff movzbl -0x1(%rcx,%rdx,1),%eax
> ffff82c48016b2ac: 39 c6 cmp %eax,%esi
> ffff82c48016b2ae: 0f 8f 5c ff ff ff jg ffff82c48016b210
> ffff82c48016b2b4: 0f 0b ud2
>
>
> Xen has been woken up by an interrupt of vector 0x27, but has a vector
> 0x2f on the top of the pending EOI stack for the local APIC.
>
> I have put in more debugging to dump the LAPIC state of the two
> interesting vectors and the IOAPIC state, but I have no idea if/when the
> problem might reoccur.
>
> My understanding of LAPIC priority leads me to think that Xen really
> shouldn't be woken up by a lower priority vector if a higher priority
> one is still un-eoi'd. There is not yet sufficient information to tell
> whether this is truly the case, or whether Xen has simply gotten confused
> about which vectors it eoi'd.
>
> Having said that, we do keep line level interrupts un-eoi'd for extended
> periods while guests service the interrupt. Given that vectors are
> chosen at random, we could get into a situation where a line interrupt
> has vector 0xdf and stays pending for 150ms (which I measured as a
> not-overly-uncommon mean-time-till-eoi for line level interrupts). This
> would starve any other guest interrupts for an extended period.
>
> Given directed-eoi support in the past few generations of processor, the
> requirement for the pending EOI stack has disappeared as far as I am
> aware. Would it be a sensible idea in general to make use of the pending
> eoi stack conditional on not having/using directed EOI support?
>
> ~Andrew
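
For anyone following the thread, the ordering rule behind the failing assertion can be sketched in a few lines of C. This is only a simplified model and not the code from Xen's irq.c: the names `peoi_stack`, `peoi_order_ok`, `peoi_push`, `vector_class` and the size `PEOI_MAX` are made up for illustration. Only the asserted condition itself is taken from the panic message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified model; not the code from Xen's irq.c. */
#define PEOI_MAX 16             /* assumed capacity, for illustration only */

struct pending_eoi {
    uint8_t vector;             /* vector accepted but not yet EOI'd */
};

struct peoi_stack {
    struct pending_eoi peoi[PEOI_MAX];
    unsigned int sp;            /* number of entries awaiting EOI */
};

/* Priority class as the local APIC sees it: bits 7:4 of the vector. */
static inline unsigned int vector_class(uint8_t vector)
{
    return vector >> 4;
}

/*
 * The condition from the panic message: a new vector may only arrive
 * if the stack is empty or the vector is strictly greater than the
 * one on top of the stack.
 */
static inline bool peoi_order_ok(const struct peoi_stack *s, uint8_t vector)
{
    return (s->sp == 0) || (s->peoi[s->sp - 1].vector < vector);
}

/* Record a vector as accepted and still waiting for its EOI. */
static inline void peoi_push(struct peoi_stack *s, uint8_t vector)
{
    assert(s->sp < PEOI_MAX);
    s->peoi[s->sp++].vector = vector;
}
```

With 0x2f on top of the stack, `peoi_order_ok(&s, 0x27)` returns false, which is the situation in the first report. Note that 0x27 and 0x2f fall into the same priority class (2), so the check is stricter than a comparison of priority classes alone.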