From mboxrd@z Thu Jan 1 00:00:00 1970
From: Andrew Cooper
Subject: Re: S3 crash with VTD Queue Invalidation enabled
Date: Mon, 3 Jun 2013 20:22:19 +0100
Message-ID: <51ACECEB.9030904@citrix.com>
References:
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Return-path:
In-Reply-To:
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Ben Guthro
Cc: xen-devel
List-Id: xen-devel@lists.xenproject.org

On 03/06/13 19:29, Ben Guthro wrote:
> I am seeing a crash on some vPro systems in the S3 path -
> specifically a Lenovo ThinkPad x220t (Sandybridge)
>
> Once I managed to not suspend the console, I got a panic in
> queue_invalidate_wait()
> (I added a dump_execution_state() here, to get some more info)
>
> (XEN) Entering ACPI S3 state.
> (XEN) ----[ Xen-4.2.2 x86_64 debug=y Not tainted ]----
> (XEN) CPU: 0
> (XEN) RIP: e008:[] invalidate_sync+0x258/0x291
> (XEN) RFLAGS: 0000000000010086 CONTEXT: hypervisor
> (XEN) rax: 0000000000000000 rbx: ffff830137a665c0 rcx: 0000000000000000
> (XEN) rdx: ffff82c48030a0a0 rsi: 000000000000000a rdi: ffff82c4802766e0
> (XEN) rbp: ffff82c4802bfd30 rsp: ffff82c4802bfce0 r8: 0000000000000004
> (XEN) r9: 0000000000000002 r10: 0000000000000020 r11: 0000000000000010
> (XEN) r12: 0000000bf34a77bc r13: 0000000000000000 r14: ffff830137a665f8
> (XEN) r15: 0000000137a5c002 cr0: 000000008005003b cr4: 00000000000426f0
> (XEN) cr3: 00000000ba2cd000 cr2: ffff880024181ff0
> (XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: e010 cs: e008
> (XEN) Xen stack trace from rsp=ffff82c4802bfce0:
> (XEN) 0000000000000002 0000000000000002 0101010000000002 0000000000000082
> (XEN) 00000001802bfd30 ffff830137a665c0 0000000000000000 0000000000000000
> (XEN) 0000000000000000 1000000000000000 ffff82c4802bfd90 ffff82c48014919d
> (XEN) ffff82c400000000 0000000000000000 ffff82c4802bfd60 0000000000000000
> (XEN) ffff82c4802bfd90 ffff830137a665c0 ffff830137a66540 0000000000000000
> (XEN) ffff830137a66670 ffff82c4802679e0 ffff82c4802bfde0 ffff82c480145a60
> (XEN) 0000000000000000 ffff82c4802bfdc0 ffff82c480125d36 ffff82c3ffd7a00c
> (XEN) 0000000000000000 0000000000000003 0000000000000003 ffff82c48030a100
> (XEN) ffff82c4802bfe20 ffff82c480145b08 ffff830137a4e620 ffff82c3ffd7a00c
> (XEN) 0000000000000000 0000000000000003 0000000000000003 ffff82c48030a100
> (XEN) ffff82c4802bfe30 ffff82c480141e12 ffff82c4802bfe80 ffff82c48019f315
> (XEN) ffff82c4802bfe60 0000000000000282 0000000000000003 ffff83010cc0a010
> (XEN) ffff8300ba0fd000 0000000000000000 0000000000000003 ffff82c48030a100
> (XEN) ffff82c4802bfea0 ffff82c480105ed4 ffff8300ba0fd188 ffff82c48030a170
> (XEN) ffff82c4802bfec0 ffff82c480127a1e ffff82c480125b8a ffff82c48030a190
> (XEN) ffff82c4802bfef0 ffff82c480127d89 ffff82c4802bff18 ffff82c4802bff18
> (XEN) ffff82c4802bff18 00000000ffffffff ffff82c4802bff10 ffff82c48015a42f
> (XEN) ffff8300ba59a000 ffff8300ba0fd000 ffff82c4802bfda8 0000000000001403
> (XEN) 0000000000000003 0000000000003403 ffffffff81a6b278 ffff8800049f3d28
> (XEN) 0000000000000000 0000000000000246 0000000000000404 0000000000000003
> (XEN) Xen call trace:
> (XEN) [] invalidate_sync+0x258/0x291
> (XEN) [] flush_iotlb_qi+0xd3/0xef
> (XEN) [] iommu_flush_all+0xb5/0xde
> (XEN) [] vtd_suspend+0x23/0xf1
> (XEN) [] iommu_suspend+0x3c/0x3e
> (XEN) [] enter_state_helper+0x1a0/0x3cb
> (XEN) [] continue_hypercall_tasklet_handler+0x51/0xbf
> (XEN) [] do_tasklet_work+0x8d/0xc7
> (XEN) [] do_tasklet+0x6b/0x9b
> (XEN) [] idle_loop+0x67/0x6f
> (XEN)
> (XEN)
> (XEN) DMAR_IQA_REG = 137a5c002
> (XEN) DMAR_IQH_REG = 120
> (XEN) DMAR_IQT_REG = 140
> (XEN) ****************************************
> (XEN) Panic on CPU 0:
> (XEN) queue invalidate wait descriptor was not executed
> (XEN) ****************************************
> (XEN)
> (XEN) Reboot in five seconds...
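
For reference, the three DMAR_IQ* values at the end of the dump can be decoded by hand. The sketch below is only an illustration and assumes the register layout as I read it in the VT-d specification for 128-bit descriptors: queue base in the upper bits of IQA, queue size in its low 3 bits, and head/tail offsets in bits 18:4 of IQH and IQT. The function name decode_qi_regs is made up for this example and is not part of Xen.

```python
# Hypothetical helper, not Xen code: decode the invalidation queue registers
# printed in the panic output. The bit positions are an assumption based on
# the VT-d register descriptions for 128-bit (16-byte) descriptors.

def decode_qi_regs(iqa, iqh, iqt):
    base = iqa & ~0xFFF            # 4KB-aligned queue base address
    qs = iqa & 0x7                 # queue size field: 2**qs pages of 4KB
    entries = 1 << (qs + 8)        # 256 descriptors of 16 bytes per 4KB page
    head = (iqh >> 4) & 0x7FFF     # index of the next descriptor hardware fetches
    tail = (iqt >> 4) & 0x7FFF     # index of the next descriptor software writes
    pending = (tail - head) % entries
    return {"base": base, "entries": entries,
            "head": head, "tail": tail, "pending": pending}


if __name__ == "__main__":
    # Values taken from the dump: IQA = 137a5c002, IQH = 120, IQT = 140
    regs = decode_qi_regs(0x137a5c002, 0x120, 0x140)
    for name, value in regs.items():
        print(f"{name:8s} = {value:#x}")
```

Under those assumptions the head sits two descriptors behind the tail, so the hardware appears to have stopped fetching with two entries outstanding. That would be consistent with an IOTLB flush descriptor followed by the wait descriptor from the flush_iotlb_qi path in the call trace, though the dump alone does not confirm which descriptors they are.
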
>
>
>
> This particular dump was with Xen 4.2.2, and Linux 3.8.8
> I have tested the following other combinations, with no difference in behavior:
>
> Xen-unstable git cs da3bca931fbcf0cbdfec971aca234e7ec0f41e16, with
> Linux 3.10-rc3 cs 58f8bbd2e39c3732c55698494338ee19a92c53a0
>
> Xen-4.2.2 / linux-3.8.8
> Xen-4.2.2 / linux-3.8.13
> Xen-4.2.3-pre / linux-3.8.13
>
> Booting with iommu=no-qinval or iommu=off works around the problem,
> but I was wondering if there was a more elegant solution, possibly
> detecting, and disabling this feature if not working properly?
>
>
> Thanks in advance for any insight.
>
> Ben

This was likely broken by XSA-36

My fix for the crash path is:
http://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=53fd1d8458de01169dfb56feb315f02c2b521a86

You want to inspect the use of iommu_enabled and iommu_intremap.

~Andrew

>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel