From: "Jan Beulich" <JBeulich@suse.com>
To: Ben Guthro <ben@guthro.net>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>,
	xiantao.zhang@intel.com, xen-devel <xen-devel@lists.xen.org>
Subject: Re: S3 crash with VTD Queue Invalidation enabled
Date: Thu, 06 Jun 2013 07:58:30 +0100
Message-ID: <51B04F3602000078000DBC60@nat28.tlf.novell.com>
In-Reply-To: <CAOvdn6UTHWbambEAfj4ii+cZqBaFpv+zYCCchH+eUceogth=pw@mail.gmail.com>

>>> On 06.06.13 at 01:53, Ben Guthro <ben@guthro.net> wrote:
> On Wed, Jun 5, 2013 at 4:27 PM, Ben Guthro <ben@guthro.net> wrote:
>> On Wed, Jun 5, 2013 at 11:38 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>>>> On 05.06.13 at 17:25, Ben Guthro <ben@guthro.net> wrote:
>>>> On Wed, Jun 5, 2013 at 11:14 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>>> Depending on whether ATS is in use, more than one invalidation
>>>>> can be done in the processing here - could you therefore check
>>>>> whether there's any sign of ATS use ("iommu=verbose" should
>>>>> make you see respective messages), and if so see whether
>>>>> disabling it ("ats=off") makes a difference?
>>>>
>>>> ATS does not appear to be running:
>>>>
>>>> (XEN) [VT-D]dmar.c:737: Host address width 36
>>>> (XEN) [VT-D]dmar.c:751: found ACPI_DMAR_DRHD:
>>>> (XEN) [VT-D]dmar.c:412:   dmaru->address = fed90000
>>>> (XEN) [VT-D]iommu.c:1197: drhd->address = fed90000 iommu->reg = ffff82c3ffd57000
>>>> (XEN) [VT-D]iommu.c:1199: cap = c0000020e60262 ecap = f0101a
>>>> (XEN) [VT-D]dmar.c:338:  endpoint: 0000:00:02.0
>>>> (XEN) [VT-D]dmar.c:751: found ACPI_DMAR_DRHD:
>>>> (XEN) [VT-D]dmar.c:412:   dmaru->address = fed91000
>>>> (XEN) [VT-D]iommu.c:1197: drhd->address = fed91000 iommu->reg = ffff82c3ffd56000
>>>> (XEN) [VT-D]iommu.c:1199: cap = c9008020660262 ecap = f0105a
>>>> (XEN) [VT-D]dmar.c:354:  IOAPIC: 0000:f0:1f.0
>>>> (XEN) [VT-D]dmar.c:332:  MSI HPET: 0000:00:0f.0
>>>> (XEN) [VT-D]dmar.c:332:  MSI HPET: 0000:00:0f.1
>>>> (XEN) [VT-D]dmar.c:332:  MSI HPET: 0000:00:0f.2
>>>> (XEN) [VT-D]dmar.c:332:  MSI HPET: 0000:00:0f.3
>>>> (XEN) [VT-D]dmar.c:332:  MSI HPET: 0000:00:0f.4
>>>> (XEN) [VT-D]dmar.c:332:  MSI HPET: 0000:00:0f.5
>>>> (XEN) [VT-D]dmar.c:332:  MSI HPET: 0000:00:0f.6
>>>> (XEN) [VT-D]dmar.c:332:  MSI HPET: 0000:00:0f.7
>>>> (XEN) [VT-D]dmar.c:426:   flags: INCLUDE_ALL
>>>> (XEN) [VT-D]dmar.c:756: found ACPI_DMAR_RMRR:
>>>> (XEN) [VT-D]dmar.c:338:  endpoint: 0000:00:1d.0
>>>> (XEN) [VT-D]dmar.c:338:  endpoint: 0000:00:1a.0
>>>> (XEN) [VT-D]dmar.c:625:   RMRR region: base_addr ba8d5000 end_address ba8ebfff
>>>> (XEN) [VT-D]dmar.c:756: found ACPI_DMAR_RMRR:
>>>> (XEN) [VT-D]dmar.c:338:  endpoint: 0000:00:02.0
>>>> (XEN) [VT-D]dmar.c:625:   RMRR region: base_addr bb800000 end_address bf9fffff
>>>>
>>>> I would expect a line with "found ACPI_DMAR_ATSR" to be printed, if it
>>>> was found.
>>>
>>> Right. So one less variable.
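The check described above (no "found ACPI_DMAR_ATSR" line in the "iommu=verbose" output) can be automated, and the ecap values in the same log give a second hint. What follows is a minimal sketch, not part of the original exchange. It assumes the VT-d Extended Capability Register layout in which bit 1 is Queued Invalidation, bit 2 is Device-TLB (which ATS relies on), bit 3 is Interrupt Remapping and bit 6 is Pass Through; verify those positions against the VT-d specification before relying on them. The names decode_ecap and ats_evidence are hypothetical helpers.

```python
import re

# Assumed ECAP bit positions (check against the VT-d specification).
ECAP_BITS = {
    "queued_invalidation": 1,
    "device_tlb": 2,           # needed for ATS
    "interrupt_remapping": 3,
    "pass_through": 6,
}

# Matches the "ecap = <hex>" field of the iommu.c log lines, not "cap = ".
ECAP_RE = re.compile(r"\becap = ([0-9a-fA-F]+)")


def decode_ecap(ecap):
    """Return a dict of feature name -> bool for one ECAP value."""
    return {name: bool((ecap >> bit) & 1) for name, bit in ECAP_BITS.items()}


def ats_evidence(log_text):
    """Collect the two ATS hints available in an iommu=verbose log."""
    ecaps = [int(m.group(1), 16) for m in ECAP_RE.finditer(log_text)]
    return {
        # Xen prints this only when the DMAR table carries an ATSR entry.
        "atsr_found": "found ACPI_DMAR_ATSR" in log_text,
        # Indices (in log order) of units advertising Device-TLB support.
        "device_tlb_units": [
            i for i, e in enumerate(ecaps) if decode_ecap(e)["device_tlb"]
        ],
        "ecaps": ecaps,
    }


# Lines copied from the log quoted above.
SAMPLE = """\
(XEN) [VT-D]iommu.c:1199: cap = c0000020e60262 ecap = f0101a
(XEN) [VT-D]iommu.c:1199: cap = c9008020660262 ecap = f0105a
(XEN) [VT-D]dmar.c:756: found ACPI_DMAR_RMRR:
"""

if __name__ == "__main__":
    info = ats_evidence(SAMPLE)
    print("ATSR entry found:", info["atsr_found"])
    print("Units with Device-TLB:", info["device_tlb_units"])
    for ecap in info["ecaps"]:
        print(hex(ecap), decode_ecap(ecap))
```

Under the assumed bit layout, neither unit in this log advertises Device-TLB support, which is consistent with the missing ATSR line.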
>>
>> Some more info.
>> Ross Philipson provided me with a handy utility to dump a bunch more
>> info about the DMAR tables, and with some more trace, this appears to
>> be tied to the IGD.
>>
>> Early in the boot process, I see queue_invalidate_wait() called for
>> DRHD unit 0, and 1
>> (unit 0 is wired up to the IGD, unit 1 is everything else)
>>
>> Up until i915 does the following, I see that unit being flushed with
>> queue_invalidate_wait() :
>>
>> [    0.704537] ENERGY_PERF_BIAS: Set to 'normal', was 'performance'
>> [    0.704537] ENERGY_PERF_BIAS: View and update with x86_energy_p
>> (XEN) XXX queue_invalidate_wait:282 CPU0 DRHD0 ret=0
>> (XEN) XXX queue_invalidate_wait:282 CPU0 DRHD0 ret=0
>> [    1.983028] [drm] GMBUS [i915 gmbus dpb] timed out, falling back to bit banging on pin 5
>> [    2.253551] fbcon: inteldrmfb (fb0) is primary device
>> [    3.111838] Console: switching to colour frame buffer device 170x48
>> [    3.171631] i915 0000:00:02.0: fb0: inteldrmfb frame buffer device
>> [    3.171634] i915 0000:00:02.0: registered panic notifier
>> [    3.173339] acpi device:00: registered as cooling_device1
>> [    3.173401] ACPI: Video Device [VID] (multi-head: yes  rom: no  post: no)
>> [    3.173962] input: Video Bus as /devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A08:00/LNXVIDEO:00/input/input4
>> [    3.174232] [drm] Initialized i915 1.6.0 20080730 for 0000:00:02.0 on minor 0
>> [    3.174258] ahci 0000:00:1f.2: version 3.0
>> [    3.174270] xen: registering gsi 19 triggering 0 polarity 1
>> [    3.174274] Already setup the GSI :19
>>
>>
>> After that - the unit never seems to be flushed.
>>
>> ...until we enter into the S3 hypercall, which loops over all DRHD
>> units, and explicitly flushes all of them via iommu_flush_all()
>>
>> It is at that point that it hangs up when talking to the device that
>> the IGD is plumbed up to.
>>
>>
>> Does this point to something in the i915 driver doing something that
>> is incompatible with Xen?
> 
> I actually separated it from the S3 hypercall, adding a new debug key
> 'F' - to just call iommu_flush_all()
> I can crash it on demand with this.
> 
> Booting with "i915.modeset=0 single" (to prevent both KMS, and Xorg) -
> it does not occur.
> So, that pretty much narrows it down to the IGD, in my mind.

Indeed, I agree. Yet I can't in any way comment on what or why.
Xiantao (perhaps it would be good to Cc some graphics person
here too)?

Jan
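
The observation quoted above, that DRHD unit 0 stops being flushed once i915 has initialized, rests on reading the ad-hoc "XXX queue_invalidate_wait" trace lines by eye. As a minimal sketch (not part of the original exchange), such a trace could be tallied per DRHD unit. It assumes the debug line format stays exactly as in the quoted log ("XXX queue_invalidate_wait:<line> CPU<n> DRHD<n> ret=<n>"); tally_flushes is a hypothetical helper.

```python
import re
from collections import Counter

# Format of the temporary debug printk seen in the quoted trace (assumed stable).
TRACE_RE = re.compile(
    r"XXX queue_invalidate_wait:\d+ CPU(\d+) DRHD(\d+) ret=(-?\d+)"
)


def tally_flushes(log_text):
    """Count queue_invalidate_wait() calls per DRHD unit and collect failures.

    Returns (counts, failures): counts maps DRHD unit -> number of calls,
    failures is a list of (unit, cpu, ret) for every non-zero return value.
    """
    counts = Counter()
    failures = []
    for match in TRACE_RE.finditer(log_text):
        cpu, unit, ret = (int(group) for group in match.groups())
        counts[unit] += 1
        if ret != 0:
            failures.append((unit, cpu, ret))
    return counts, failures


# Lines copied from the trace quoted above.
SAMPLE = """\
[    0.704537] ENERGY_PERF_BIAS: Set to 'normal', was 'performance'
(XEN) XXX queue_invalidate_wait:282 CPU0 DRHD0 ret=0
(XEN) XXX queue_invalidate_wait:282 CPU0 DRHD0 ret=0
[    2.253551] fbcon: inteldrmfb (fb0) is primary device
"""

if __name__ == "__main__":
    counts, failures = tally_flushes(SAMPLE)
    for unit in sorted(counts):
        print(f"DRHD{unit}: {counts[unit]} flush(es)")
    print("failures:", failures)
```

Running this over a full boot log and again over a log captured after triggering iommu_flush_all() would show which unit the last successful wait belonged to.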
