From: "Jan Beulich" <JBeulich@suse.com>
To: Ben Guthro <ben@guthro.net>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>,
	xiantao.zhang@intel.com, xen-devel <xen-devel@lists.xen.org>
Subject: Re: S3 crash with VTD Queue Invalidation enabled
Date: Fri, 14 Jun 2013 09:38:47 +0100
Message-ID: <51BAF2B702000078000DE490@nat28.tlf.novell.com>
In-Reply-To: <CAOvdn6UTHWbambEAfj4ii+cZqBaFpv+zYCCchH+eUceogth=pw@mail.gmail.com>

>>> On 06.06.13 at 01:53, Ben Guthro <ben@guthro.net> wrote:
>> Early in the boot process, I see queue_invalidate_wait() called for
>> DRHD unit 0, and 1
>> (unit 0 is wired up to the IGD, unit 1 is everything else)
>>
>> Up until i915 does the following, I see that unit being flushed with
>> queue_invalidate_wait() :
>>
>> [    0.704537] ENERGY_PERF_BIAS: Set to 'normal', was 'performance'
>> [    0.704537] ENERGY_PERF_BIAS: View and update with x86_energy_p
>> (XEN) XXX queue_invalidate_wait:282 CPU0 DRHD0 ret=0
>> (XEN) XXX queue_invalidate_wait:282 CPU0 DRHD0 ret=0
>> [    1.983028] [drm] GMBUS [i915 gmbus dpb] timed out, falling back to
>> bit banging on pin 5
>> [    2.253551] fbcon: inteldrmfb (fb0) is primary device
>> [    3.111838] Console: switching to colour frame buffer device 170x48
>> [    3.171631] i915 0000:00:02.0: fb0: inteldrmfb frame buffer device
>> [    3.171634] i915 0000:00:02.0: registered panic notifier
>> [    3.173339] acpi device:00: registered as cooling_device1
>> [    3.173401] ACPI: Video Device [VID] (multi-head: yes  rom: no  post: no)
>> [    3.173962] input: Video Bus as
>> /devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A08:00/LNXVIDEO:00/input/input4
>> [    3.174232] [drm] Initialized i915 1.6.0 20080730 for 0000:00:02.0 on minor 0
>> [    3.174258] ahci 0000:00:1f.2: version 3.0
>> [    3.174270] xen: registering gsi 19 triggering 0 polarity 1
>> [    3.174274] Already setup the GSI :19
>>
>>
>> After that - the unit never seems to be flushed.

queue_invalidate_wait() has a single caller, invalidate_sync(), and
invalidate_sync() is called from all interrupt setup paths (IO-APIC as
well as MSI), so it would be quite odd for that to be the case. At the
very least, loading a network driver or bringing an interface up
should result in it getting called.

>> ...until we enter into the S3 hypercall, which loops over all DRHD
>> units, and explicitly flushes all of them via iommu_flush_all()
>>
>> It is at that point that it hangs up when talking to the device that
>> the IGD is plumbed up to.
>>
>>
>> Does this point to something in the i915 driver doing something that
>> is incompatible with Xen?
> 
> I actually separated it from the S3 hypercall, adding a new debug key
> 'F' - to just call iommu_flush_all()
> I can crash it on demand with this.
> 
> Booting with "i915.modeset=0 single" (to prevent both KMS, and Xorg) -
> it does not occur.
> So, that pretty much narrows it down to the IGD, in my mind.

Which reminds me of a change I made several weeks back to our kernel,
but which isn't as easily done with pv-ops: There are a number of
places in the AGP and DRM code that are conditional upon
CONFIG_INTEL_IOMMU and use intel_iommu_gfx_mapped. As you certainly
know, Linux when running on Xen doesn't see any IOMMU, and hence
whether the config option is enabled or disabled is completely
unrelated to whether the driver actually runs on top of an enabled
IOMMU. Similarly, intel_iommu_gfx_mapped cannot possibly get set when
running on top of Xen, as the code setting it never gets used in this
case.
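
To make the mismatch concrete, here is a stand-alone model of that
pattern. It is an illustrative sketch only, not code taken from the
kernel tree: intel_iommu_gfx_mapped is re-declared locally as a
stand-in for the kernel variable, and driver_assumes_iommu() and
view_is_stale() are made-up helper names.

```c
#include <stdbool.h>

/*
 * Stand-in for the kernel variable. Natively, IOMMU setup code would
 * set it; in this model, as in Dom0 under Xen, nothing ever does.
 */
int intel_iommu_gfx_mapped;

/* Hypothetical helper: what a driver check of this kind boils down to. */
static bool driver_assumes_iommu(void)
{
#ifdef CONFIG_INTEL_IOMMU
    /* Option enabled: the answer still depends on a variable nobody sets. */
    return intel_iommu_gfx_mapped != 0;
#else
    /* Option disabled: the check is compiled out altogether. */
    return false;
#endif
}

/*
 * Hypothetical helper: true when an IOMMU is active underneath (e.g.
 * driven by the hypervisor) while the driver believes there is none.
 */
static bool view_is_stale(bool iommu_active_underneath)
{
    return iommu_active_underneath && !driver_assumes_iommu();
}
```

Whichever way CONFIG_INTEL_IOMMU is set at build time,
driver_assumes_iommu() returns false in this model, so
view_is_stale(true) holds: the driver's view and the actual hardware
state disagree.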

A possibly simple, but rather hacky solution might be to always set
that variable when running on Xen. But that wouldn't cover a kernel
built without CONFIG_INTEL_IOMMU, even though the driver might still
be running with an IOMMU enabled underneath.
(In our case I can simply always #define intel_iommu_gfx_mapped
to 1, with the INTEL_IOMMU option getting forcibly disabled for the
Xen kernel flavors anyway. Whether that's entirely correct when
not running on an enabled IOMMU I can't tell yet, and don't know
whom to ask.)
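
Roughly, the "always set it when running on Xen" variant would amount
to something like the following. This is an untested, stand-alone
sketch rather than a patch: the variable is again a local stand-in,
gfx_mapped_xen_fixup() and gfx_workarounds_enabled() are hypothetical
names, and the running_on_xen parameter stands in for whatever test
the real kernel would use to detect Xen.

```c
#include <stdbool.h>

/* Stand-in for the kernel variable that native IOMMU code would set. */
int intel_iommu_gfx_mapped;

/*
 * Hypothetical early-init hook. When running on Xen the kernel cannot
 * see the IOMMU, so pessimistically assume graphics is being remapped.
 */
static void gfx_mapped_xen_fixup(bool running_on_xen)
{
    if (running_on_xen)
        intel_iommu_gfx_mapped = 1;
}

/* What driver code keyed off the variable would see afterwards. */
static bool gfx_workarounds_enabled(void)
{
    return intel_iommu_gfx_mapped != 0;
}
```

As said, this only helps a kernel in which the variable exists in the
first place, i.e. one built with CONFIG_INTEL_IOMMU, and it turns the
workarounds on even when no IOMMU is actually enabled underneath.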

And that wouldn't cover the IGD getting passed through to a DomU
at all - obviously Xen's ability to properly drive all IOMMU operations
(including qinval) must not depend on the owning guest's driver code.

I have to admit though that it entirely escapes me why a graphics
driver needs to peek into IOMMU code/state in the first place. This
very much smells of bad design.

Jan
