From mboxrd@z Thu Jan 1 00:00:00 1970
From: Ben Guthro
Subject: Re: S3 crash with VTD Queue Invalidation enabled
Date: Fri, 14 Jun 2013 13:01:50 -0400
Message-ID: 
References: <51ACECEB.9030904@citrix.com>
 <51ADC77402000078000DAF95@nat28.tlf.novell.com>
 <51AE0F6602000078000DB1F4@nat28.tlf.novell.com>
 <51AF11F102000078000DB589@nat28.tlf.novell.com>
 <51AF71E902000078000DB8B7@nat28.tlf.novell.com>
 <51AF777F02000078000DB944@nat28.tlf.novell.com>
 <51BAF2B702000078000DE490@nat28.tlf.novell.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
In-Reply-To: <51BAF2B702000078000DE490@nat28.tlf.novell.com>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Jan Beulich
Cc: Andrew Cooper, xiantao.zhang@intel.com, xen-devel
List-Id: xen-devel@lists.xenproject.org

On Fri, Jun 14, 2013 at 4:38 AM, Jan Beulich wrote:
>>>> On 06.06.13 at 01:53, Ben Guthro wrote:
>>> Early in the boot process, I see queue_invalidate_wait() called for
>>> DRHD unit 0, and 1
>>> (unit 0 is wired up to the IGD, unit 1 is everything else)
>>>
>>> Up until i915 does the following, I see that unit being flushed with
>>> queue_invalidate_wait():
>>>
>>> [ 0.704537] ENERGY_PERF_BIAS: Set to 'normal', was 'performance'
>>> [ 0.704537] ENERGY_PERF_BIAS: View and update with x86_energy_p
>>> (XEN) XXX queue_invalidate_wait:282 CPU0 DRHD0 ret=0
>>> (XEN) XXX queue_invalidate_wait:282 CPU0 DRHD0 ret=0
>>> [ 1.983028] [drm] GMBUS [i915 gmbus dpb] timed out, falling back to
>>> bit banging on pin 5
>>> [ 2.253551] fbcon: inteldrmfb (fb0) is primary device
>>> [ 3.111838] Console: switching to colour frame buffer device 170x48
>>> [ 3.171631] i915 0000:00:02.0: fb0: inteldrmfb frame buffer device
>>> [ 3.171634] i915 0000:00:02.0: registered panic notifier
>>> [ 3.173339] acpi device:00: registered as cooling_device1
>>> [ 3.173401] ACPI: Video Device [VID]
>>> (multi-head: yes  rom: no  post: no)
>>> [ 3.173962] input: Video Bus as
>>> /devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A08:00/LNXVIDEO:00/input/input4
>>> [ 3.174232] [drm] Initialized i915 1.6.0 20080730 for 0000:00:02.0 on minor 0
>>> [ 3.174258] ahci 0000:00:1f.2: version 3.0
>>> [ 3.174270] xen: registering gsi 19 triggering 0 polarity 1
>>> [ 3.174274] Already setup the GSI :19
>>>
>>>
>>> After that - the unit never seems to be flushed.
>
> With queue_invalidate_wait() having a single caller -
> invalidate_sync() - and with invalidate_sync() being called from
> all interrupt setup (IO-APIC as well as MSI), it's quite odd for
> that to be the case. At least upon network driver load or
> interface-up, this should be getting called.
>
>>> ...until we enter the S3 hypercall, which loops over all DRHD
>>> units, and explicitly flushes all of them via iommu_flush_all().
>>>
>>> It is at that point that it hangs when talking to the unit that
>>> the IGD is plumbed up to.
>>>
>>> Does this point to something in the i915 driver doing something that
>>> is incompatible with Xen?
>>
>> I actually separated it from the S3 hypercall, adding a new debug key
>> 'F' to just call iommu_flush_all().
>> I can crash it on demand with this.
>>
>> Booting with "i915.modeset=0 single" (to prevent both KMS and Xorg),
>> it does not occur.
>> So, that pretty much narrows it down to the IGD, in my mind.
>
> Which reminds me of a change I made several weeks back to our kernel,
> but which isn't as easily done with pv-ops: there are a number of
> places in the AGP and DRM code that qualify upon CONFIG_INTEL_IOMMU
> and use intel_iommu_gfx_mapped. As you certainly know, Linux when
> running on Xen doesn't see any IOMMU, and hence the config option
> being enabled or disabled is completely unrelated to whether the
> driver actually runs on top of an enabled IOMMU.
> Similarly, the setting
> of intel_iommu_gfx_mapped cannot possibly happen when running on
> top of Xen, as it sits in code that never gets used in this case.
>
> A possibly simple, but rather hacky, solution might be to always set
> that variable when running on Xen. But that wouldn't cover the case
> of a kernel being built without CONFIG_INTEL_IOMMU, yet in that
> case the driver might still run with an IOMMU enabled underneath.
> (In our case I can simply always #define intel_iommu_gfx_mapped
> to 1, with the INTEL_IOMMU option getting forcibly disabled for the
> Xen kernel flavors anyway. Whether that's entirely correct when
> not running on an enabled IOMMU I can't tell yet, and don't know
> whom to ask.)
>
> And that wouldn't cover the IGD getting passed through to a DomU
> at all - obviously Xen's ability to properly drive all IOMMU operations
> (including qinval) must not depend on the owning guest's driver code.
>
> I have to admit, though, that it entirely escapes me why a graphics
> driver needs to peek into IOMMU code/state in the first place. This
> very much smells of bad design.

This all makes sense, and I agree with your assessment.

Unfortunately, I went and got the machine back from our QA department
to do some tests on this, and now I am unable to reproduce the issue
to prove your analysis correct.

It was 100% reproducible a week ago, and now I can't make it happen
using the same code base & build.

It is all very strange, and smells of a race condition or an
uninitialized variable.

I blame Alpha particles.