From mboxrd@z Thu Jan 1 00:00:00 1970
From: Ben Guthro
Subject: Re: S3 crash with VTD Queue Invalidation enabled
Date: Tue, 4 Jun 2013 15:49:01 -0400
Message-ID:
References: <51ACECEB.9030904@citrix.com>
 <51ADC77402000078000DAF95@nat28.tlf.novell.com>
 <51AE0F6602000078000DB1F4@nat28.tlf.novell.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Return-path:
In-Reply-To:
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Jan Beulich
Cc: Andrew Cooper , xen-devel
List-Id: xen-devel@lists.xenproject.org

On Tue, Jun 4, 2013 at 3:20 PM, Ben Guthro wrote:
> On Tue, Jun 4, 2013 at 10:01 AM, Jan Beulich wrote:
>>>>> On 04.06.13 at 14:25, Ben Guthro wrote:
>>> On Tue, Jun 4, 2013 at 4:54 AM, Jan Beulich wrote:
>>>>>>> On 03.06.13 at 21:22, Andrew Cooper wrote:
>>>>> On 03/06/13 19:29, Ben Guthro wrote:
>>>>>> (XEN) Xen call trace:
>>>>>> (XEN) [] invalidate_sync+0x258/0x291
>>>>>> (XEN) [] flush_iotlb_qi+0xd3/0xef
>>>>>> (XEN) [] iommu_flush_all+0xb5/0xde
>>>>>> (XEN) [] vtd_suspend+0x23/0xf1
>>>>>> (XEN) [] iommu_suspend+0x3c/0x3e
>>>>>> (XEN) [] enter_state_helper+0x1a0/0x3cb
>>>>>> (XEN) [] continue_hypercall_tasklet_handler+0x51/0xbf
>>>>>> (XEN) [] do_tasklet_work+0x8d/0xc7
>>>>>> (XEN) [] do_tasklet+0x6b/0x9b
>>>>>> (XEN) [] idle_loop+0x67/0x6f
>>>>>
>>>>> This was likely broken by XSA-36
>>>>>
>>>>> My fix for the crash path is:
>>>>> http://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=53fd1d8458de01169dfb56feb315f02c2b521a86
>>>>>
>>>>> You want to inspect the use of iommu_enabled and iommu_intremap.
>>>>
>>>> According to the comment in vtd_suspend(),
>>>> iommu_disable_x2apic_IR() is supposed to run after
>>>> iommu_suspend() (and indeed lapic_suspend() gets called
>>>> immediately after iommu_suspend() by device_power_down()),
>>>> and hence that shouldn't be the reason here.
>>>> But, Ben, to be sure, dumping the state of the various IOMMU
>>>> related enabling variables would be a good idea.
>>>
>>> I assume you are referring to the variables below, defined at the top of
>>> iommu.c
>>> At the time of the crash, they look like this:
>>>
>>> (XEN) iommu_enabled = 1
>>> (XEN) force_iommu; = 0
>>> (XEN) iommu_verbose; = 0
>>> (XEN) iommu_workaround_bios_bug; = 0
>>> (XEN) iommu_passthrough; = 0
>>> (XEN) iommu_snoop = 0
>>> (XEN) iommu_qinval = 1
>>> (XEN) iommu_intremap = 1
>>> (XEN) iommu_hap_pt_share = 0
>>> (XEN) iommu_debug; = 0
>>> (XEN) amd_iommu_perdev_intremap = 1
>>>
>>> If that gives any additional insight, please let me know.
>>> I'm not sure I gleaned anything particularly significant from it though.
>>>
>>> Or - perhaps you are referring to other enabling variables?
>>
>> These were exactly the ones (or really you picked a superset of
>> what I wanted to know the state of). To me this pretty clearly
>> means that Andrew's original thought here is not applicable, as
>> at this point we can't possibly have shut down qinval yet.
>>
>>>> Is this perhaps having some similarity with
>>>> http://lists.xen.org/archives/html/xen-devel/2013-04/msg00343.html?
>>>> We're clearly running single-CPU only here and there...
>>>
>>> We certainly should be, as we have gone through the
>>> disable_nonboot_cpus() by this point - and I can verify that from the
>>> logs.
>>
>> I'm much more tending towards the connection here, noting that
>> Andrew's original thread didn't really lead anywhere (i.e. we still
>> don't know what the panic he saw was actually caused by).
>>
>
> I'm starting to think you're on to something here.

hmm - maybe not.
I get the same crash with "maxcpus=1"

> I've put a bunch of trace throughout the functions in qinval.c
>
> It seems that everything is functioning properly, up until we go
> through the disable_nonboot_cpus() path.
> Prior to this, I see the qinval.c functions being executed on all
> cpus, and both drhd units
> Afterward, it gets stuck in queue_invalidate_wait on the first drhd
> unit.. and eventually panics.
>
> I'm not exactly sure what to make of this yet.
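
For readers following the thread: the symptom described above (a wait that
never completes on one unit, followed by a panic) is what a bounded
completion poll looks like when the hardware has stopped processing its
queue. Below is a minimal sketch of that pattern in C. It is not the code in
qinval.c; struct fake_unit, fake_hw_step() and wait_for_completion() are
hypothetical names invented for illustration, and the poll budget is only a
stand-in for whatever timeout the real code uses.

```c
#include <stdbool.h>
#include <stdint.h>

#define STATUS_PENDING 0u
#define STATUS_DONE    1u

/* Hypothetical model of one invalidation unit; not Xen's struct iommu. */
struct fake_unit {
    bool processing;          /* is the "hardware" still draining its queue? */
    volatile uint32_t status; /* word a completed wait request would write   */
};

/* Stand-in for hardware progress: the wait completes only while the
 * unit is still processing requests. */
void fake_hw_step(struct fake_unit *u)
{
    if (u->processing)
        u->status = STATUS_DONE;
}

/* Submit a wait and poll for completion.
 * Returns 0 once the status word flips, -1 when the poll budget runs out
 * (the point where a caller might log the failure or panic). */
int wait_for_completion(struct fake_unit *u, unsigned int max_polls)
{
    u->status = STATUS_PENDING;

    for (unsigned int i = 0; i < max_polls; i++) {
        fake_hw_step(u);
        if (u->status == STATUS_DONE)
            return 0;
    }
    return -1;
}
```

With processing set to false the loop never sees the status change and
returns -1 after max_polls iterations, no matter how many CPUs are online.
That is consistent with the "maxcpus=1" result above pointing away from a
CPU-count problem, though the sketch itself says nothing about why the real
unit stops responding.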