* regression/bug introduced by commit [0e572fe7383a376992364914694c39aa7fe44c1d] drm/i915: runtime PM support for DPMS
From: Frank de Jong @ 2015-09-17 13:19 UTC
  To: intel-gfx; +Cc: Daniel Vetter

Hello,

A regression/bug was introduced by commit 
[0e572fe7383a376992364914694c39aa7fe44c1d] drm/i915: runtime PM support 
for DPMS.
It causes my Linux box to emit PATA and NIC errors (the PCI bus stops 
working); SATA is unaffected. The system freezes shortly after. Trouble 
starts within 12 hours, but not right away.
Git bisect was used to find the regression/bug. It was initially 
introduced somewhere around kernel v3.15.0-rc8 (code has moved around 
since).

*** for details and logs, please scroll down ***

Hardware details:
- MSI MS-7732/PH61A-P35 (MS-7732), BIOS V2.4 09/19/2014
- Sandy Bridge Intel(R) Pentium(R) CPU G620
- 8 GiB RAM
- CRT monitor without EDID (not "plug and play") hooked up to the i915 
CRT port
- 1x SSD connected to ASM1062 SATA controller
- 2x HDD connected to Intel SATA controller
- CDROM drive (ata9) and HDD (ata10) connected to Promise PDC20268
- Uncommon hardware configuration (old PCI PATA controller and PCI NIC)

Other:
- Running on LFS 7.1+ x86_64
- DPMS powersave enabled (setterm -powersave powerdown -powerdown 5)
- Text-based console, no X
- Has VMs (QEMU KVM) running on top

Testing revealed:
v3.18.21: NOT OK
v3.17.8 : NOT OK
v3.17.0 : NOT OK
v3.16.7 : OK
v3.16.0 : OK
v3.14.51: OK
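
The results above are consistent with a single good-to-bad transition 
between v3.16.7 and v3.17.0. As a rough illustration of the bisection 
idea only (git bisect itself walks the commit graph, not a list of 
release tags), here is a minimal sketch in C. The struct, the table and 
the function name first_bad() are made up for this example; the sketch 
assumes the entries are ordered oldest to newest and that every good 
entry precedes every bad one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Test results from the table above, ordered oldest to newest. */
struct result {
        const char *version;
        bool bad;
};

static const struct result results[] = {
        { "v3.14.51", false },
        { "v3.16.0",  false },
        { "v3.16.7",  false },
        { "v3.17.0",  true  },
        { "v3.17.8",  true  },
        { "v3.18.21", true  },
};

/*
 * Hypothetical helper: returns the index of the first bad entry,
 * or n if no entry is bad. Assumes all good entries come before
 * all bad ones, which is what makes a binary search valid.
 */
static size_t first_bad(const struct result *r, size_t n)
{
        size_t lo = 0, hi = n;  /* everything below lo is good, from hi on is bad */

        assert(r != NULL || n == 0);
        while (lo < hi) {
                size_t mid = lo + (hi - lo) / 2;

                if (r[mid].bad)
                        hi = mid;       /* first bad entry is at mid or earlier */
                else
                        lo = mid + 1;   /* first bad entry is after mid */
        }
        return lo;
}
```

Called on the table above, first_bad() lands on the v3.17.0 entry, so 
each step halves the range that still has to be tested.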

As a test, function intel_crtc_update_dpms() in 
v3.18.20/drivers/gpu/drm/i915/intel_display.c was replaced with an older 
revision (see below). The older revision works fine; it bypasses the 
function intel_crtc_control().

So what is going wrong here? Perhaps it has something to do with the 
power domains, or with the lack of EDID? Or might it just trigger some 
bug/regression hidden deep in the kernel?

I do not have the know-how to fix this. You guys do, of course! :-)


/**
 * Sets the power management mode of the pipe and plane.
 */
void intel_crtc_update_dpms(struct drm_crtc *crtc)
{
        struct drm_device *dev = crtc->dev;
        struct drm_i915_private *dev_priv = dev->dev_private;
        struct intel_encoder *intel_encoder;
        bool enable = false;

        /* Keep the CRTC on if any encoder on it has an active connector. */
        for_each_encoder_on_crtc(dev, crtc, intel_encoder)
                enable |= intel_encoder->connectors_active;

        /* Call the enable/disable hooks directly, not intel_crtc_control(). */
        if (enable)
                dev_priv->display.crtc_enable(crtc);
        else
                dev_priv->display.crtc_disable(crtc);

        intel_crtc_update_sarea(crtc, enable);
}


# /proc/cmdline
# noirqdebug: "ASM1083/1085 PCIe to PCI Bridge" has hardware issues, 
skips an IRQ beat now and then
root=LABEL=ROOTFS selinux=0 video=800x600-32@72 
acpi_enforce_resources=lax thermal.off=1 noirqdebug


# /proc/interrupts
   0:         14          0   IO-APIC-edge      timer
   1:         12          0   IO-APIC-edge      i8042
   4:      57380          0   IO-APIC-edge      serial
   5:          0          0   IO-APIC-edge      parport0
   8:        119          0   IO-APIC-edge      rtc0
   9:          3          0   IO-APIC-fasteoi   acpi
  12:        147          0   IO-APIC-edge      i8042
  16:        746          0   IO-APIC  16-fasteoi   ehci_hcd:usb1
  18:       7305          0   IO-APIC  18-fasteoi   0000:03:00.0
  19:      64979          0   IO-APIC  19-fasteoi   eth0
  23:         33          0   IO-APIC  23-fasteoi   ehci_hcd:usb3
  24:        762          0   PCI-MSI-edge      0000:00:1f.2
  25:      12926          0   PCI-MSI-edge      0000:06:00.0
  26:      16017          0   PCI-MSI-edge      eth1
  27:          0          0   PCI-MSI-edge      xhci_hcd
  28:          0          0   PCI-MSI-edge      xhci_hcd
  29:          0          0   PCI-MSI-edge      xhci_hcd
  30:        517          0   PCI-MSI-edge      snd_hda_intel
  31:          7          0   PCI-MSI-edge      i915
NMI:          0          0   Non-maskable interrupts
LOC:    1024537     205875   Local timer interrupts
SPU:          0          0   Spurious interrupts
PMI:          0          0   Performance monitoring interrupts
IWI:          0          0   IRQ work interrupts
RTR:          1          0   APIC ICR read retries
RES:      17909      21506   Rescheduling interrupts
CAL:       7297       7219   Function call interrupts
TLB:        346        209   TLB shootdowns
TRM:          0          0   Thermal event interrupts
THR:          0          0   Threshold APIC interrupts
MCE:          0          0   Machine check exceptions
MCP:         11         11   Machine check polls
ERR:          0
MIS:          0


# lspci
00:00.0 Host bridge: Intel Corporation 2nd Generation Core Processor 
Family DRAM Controller (rev 09)
00:02.0 VGA compatible controller: Intel Corporation 2nd Generation Core 
Processor Family Integrated Graphics Controller (rev 09)
00:16.0 Communication controller: Intel Corporation 6 Series/C200 Series 
Chipset Family MEI Controller #1 (rev 04)
00:1a.0 USB controller: Intel Corporation 6 Series/C200 Series Chipset 
Family USB Enhanced Host Controller #2 (rev 05)
00:1b.0 Audio device: Intel Corporation 6 Series/C200 Series Chipset 
Family High Definition Audio Controller (rev 05)
00:1c.0 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset 
Family PCI Express Root Port 1 (rev b5)
00:1c.2 PCI bridge: Intel Corporation 82801 PCI Bridge (rev b5)
00:1c.3 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset 
Family PCI Express Root Port 4 (rev b5)
00:1c.4 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset 
Family PCI Express Root Port 5 (rev b5)
00:1c.5 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset 
Family PCI Express Root Port 6 (rev b5)
00:1d.0 USB controller: Intel Corporation 6 Series/C200 Series Chipset 
Family USB Enhanced Host Controller #1 (rev 05)
00:1f.0 ISA bridge: Intel Corporation H61 Express Chipset Family LPC 
Controller (rev 05)
00:1f.2 SATA controller: Intel Corporation 6 Series/C200 Series Chipset 
Family SATA AHCI Controller (rev 05)
00:1f.3 SMBus: Intel Corporation 6 Series/C200 Series Chipset Family 
SMBus Controller (rev 05)
02:00.0 PCI bridge: ASMedia Technology Inc. ASM1083/1085 PCIe to PCI 
Bridge (rev 01)
03:00.0 Mass storage controller: Promise Technology, Inc. PDC20268 
[Ultra100 TX2] (rev 02)
03:01.0 Ethernet controller: 3Com Corporation 3c905C-TX/TX-M [Tornado] 
(rev 74)
04:00.0 USB controller: ASMedia Technology Inc. ASM1042 SuperSpeed USB 
Host Controller
05:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. 
RTL8111/8168B PCI Express Gigabit Ethernet controller (rev 06)
06:00.0 SATA controller: ASMedia Technology Inc. ASM1062 Serial ATA 
Controller (rev 01)


# logging: so it begins
Sep  8 04:13:06 syscore kernel: ata9: drained 2 bytes to clear DRQ
Sep  8 04:18:13 syscore kernel: ata9: drained 2 bytes to clear DRQ
Sep  8 04:19:55 syscore kernel: ata9: drained 2 bytes to clear DRQ
Sep  8 04:37:42 syscore kernel: ata9: drained 2 bytes to clear DRQ
Sep  8 04:41:58 syscore kernel: ata9: drained 2 bytes to clear DRQ
Sep  8 04:47:06 syscore kernel: ata9: drained 2 bytes to clear DRQ


# logging: and it ends like this (kernel lock-up if unlucky)
Jun 14 18:32:59 syscore kernel: ata9: drained 2 bytes to clear DRQ
Jun 14 18:33:04 syscore kernel: ata9.00: qc timeout (cmd 0xa0)
Jun 14 18:33:04 syscore kernel: ata9.00: exception Emask 0x0 SAct 0x0 
SErr 0x0 action 0x6 frozen
Jun 14 18:33:16 syscore kernel: sr 8:0:0:0: CDB:
Jun 14 18:33:16 syscore kernel: Get event status notification: 4a 01 00 
00 10 00 00 00 08 00
Jun 14 18:33:16 syscore kernel: ata9.00: cmd 
a0/00:00:00:08:00/00:00:00:00:00/a0 tag 0 pio 16392 in
Jun 14 18:33:16 syscore kernel: res 51/54:03:00:08:00/00:00:00:00:00/a0 
Emask 0x5 (timeout)
Jun 14 18:33:16 syscore kernel: ata9.00: status: { DRDY ERR }
Jun 14 18:33:16 syscore kernel: ata9: soft resetting link
Jun 14 18:33:16 syscore kernel: eth0: Transmit error, Tx status register ff.
Jun 14 18:33:16 syscore kernel: Flags; bus-master 1, dirty 51677(13) 
current 51677(13)
Jun 14 18:33:16 syscore kernel: Transmit list ffffffff vs. ffff8800d781e9b8.
Jun 14 18:33:16 syscore kernel: 0: @ffff8800d781e200  length 8000008d 
status 0001008d
Jun 14 18:33:16 syscore kernel: 1: @ffff8800d781e298  length 80000048 
status 00010048
Jun 14 18:33:16 syscore kernel: 2: @ffff8800d781e330  length 800000b7 
status 000100b7
Jun 14 18:33:16 syscore kernel: 3: @ffff8800d781e3c8  length 800000a7 
status 000100a7
Jun 14 18:33:16 syscore kernel: 4: @ffff8800d781e460  length 00000080 
status 00010102
Jun 14 18:33:16 syscore kernel: 5: @ffff8800d781e4f8  length 80000042 
status 00010042
Jun 14 18:33:16 syscore kernel: 6: @ffff8800d781e590  length 80000042 
status 00010042
Jun 14 18:33:16 syscore kernel: 7: @ffff8800d781e628  length 80000042 
status 00010042
Jun 14 18:33:16 syscore kernel: 8: @ffff8800d781e6c0  length 80000042 
status 00010042
Jun 14 18:33:16 syscore kernel: 9: @ffff8800d781e758  length 80000042 
status 00010042
Jun 14 18:33:16 syscore kernel: 10: @ffff8800d781e7f0  length 00000080 
status 0001014d
Jun 14 18:33:16 syscore kernel: 11: @ffff8800d781e888  length 80000048 
status 00010048
Jun 14 18:33:16 syscore kernel: 12: @ffff8800d781e920  length 800000a7 
status 800100a7
Jun 14 18:33:16 syscore kernel: 13: @ffff8800d781e9b8  length 80000042 
status 00010042
Jun 14 18:33:16 syscore kernel: 14: @ffff8800d781ea50  length 80000042 
status 00010042
Jun 14 18:33:16 syscore kernel: 15: @ffff8800d781eae8  length 00000080 
status 00010102
Jun 14 18:33:16 syscore kernel: eth0: Updating statistics failed, 
disabling stats as an interrupt source
Jun 14 18:33:16 syscore kernel: eth0: Host error, FIFO diagnostic 
register ffff.
Jun 14 18:33:16 syscore kernel: eth0: PCI bus error, bus status ffffffff
Jun 14 18:33:16 syscore kernel: eth0:  setting full-duplex.
Jun 14 18:33:16 syscore kernel: eth0: command 0x5800 did not complete! 
Status=0xffff
Jun 14 18:33:16 syscore kernel: sched: RT throttling activated
Jun 14 18:33:16 syscore kernel: ata9.00: qc timeout (cmd 0xa1)
Jun 14 18:33:16 syscore kernel: ata9.00: failed to IDENTIFY (I/O error, 
err_mask=0x5)
Jun 14 18:33:16 syscore kernel: ata9.00: revalidation failed (errno=-5)
Jun 14 18:33:16 syscore kernel: ata9: soft resetting link
Jun 14 18:33:16 syscore kernel: pata_pdc2027x: 40-conductor cable 
detected on port 0
Jun 14 18:33:16 syscore kernel: ata9.00: configured for MWDMA2
Jun 14 18:33:16 syscore kernel: ata9: EH complete


# logging: more of the same
Sep 10 15:34:21 syscore kernel: ata10.00: exception Emask 0x0 SAct 0x0 
SErr 0x0 action 0x0
Sep 10 15:34:21 syscore kernel: ata10.00: BMDMA stat 0xff
Sep 10 15:34:21 syscore kernel: ata10.00: failed command: READ DMA EXT
Sep 10 15:34:21 syscore kernel: ata10.00: cmd 
25/00:00:48:d6:fd/00:01:21:00:00/e0 tag 0 dma 131072 in
Sep 10 15:34:21 syscore kernel: res 50/00:00:47:d7:fd/00:00:21:00:00/e0 
Emask 0x20 (host bus error)
Sep 10 15:34:21 syscore kernel: ata10.00: status: { DRDY }
Sep 10 15:34:22 syscore kernel: ata10.00: configured for UDMA/100
Sep 10 15:34:22 syscore kernel: ata10: EH complete
Sep 10 15:34:23 syscore kernel: ata10: drained 202 bytes to clear DRQ
Sep 10 15:34:23 syscore kernel: ata10.00: exception Emask 0x0 SAct 0x0 
SErr 0x0 action 0x0
Sep 10 15:34:23 syscore kernel: ata10.00: BMDMA stat 0xff
Sep 10 15:34:23 syscore kernel: ata10.00: failed command: READ DMA EXT
Sep 10 15:34:23 syscore kernel: ata10.00: cmd 
25/00:00:48:d6:fd/00:01:21:00:00/e0 tag 0 dma 131072 in
Sep 10 15:34:23 syscore kernel: res 50/00:00:47:d7:fd/00:00:21:00:00/e0 
Emask 0x20 (host bus error)
Sep 10 15:34:23 syscore kernel: ata10.00: status: { DRDY }
Sep 10 15:34:23 syscore kernel: ata10.00: n_sectors mismatch 586114704 
!= 268435455
Sep 10 15:34:23 syscore kernel: ata10.00: old n_sectors matches native, 
probably late HPA lock, will try to unlock HPA
Sep 10 15:34:23 syscore kernel: ata10.00: revalidation failed (errno=-5)
Sep 10 15:34:23 syscore kernel: ata10: soft resetting link
Sep 10 15:34:33 syscore kernel: eth0: Transmit error, Tx status register ff.
Sep 10 15:34:33 syscore kernel: Flags; bus-master 1, dirty 36208(0) 
current 36208(0)
Sep 10 15:34:33 syscore kernel: Transmit list ffffffff vs. ffff88001fafc200.
Sep 10 15:34:33 syscore kernel: 0: @ffff88001fafc200  length 80000071 
status 00010071
Sep 10 15:34:33 syscore kernel: 1: @ffff88001fafc298  length 80000036 
status 00010036
Sep 10 15:34:33 syscore kernel: 2: @ffff88001fafc330  length 8000005c 
status 0001005c
Sep 10 15:34:33 syscore kernel: 3: @ffff88001fafc3c8  length 80000036 
status 00010036
Sep 10 15:34:33 syscore kernel: 4: @ffff88001fafc460  length 80000036 
status 00010036
Sep 10 15:34:33 syscore kernel: 5: @ffff88001fafc4f8  length 80000064 
status 00010064
Sep 10 15:34:33 syscore kernel: 6: @ffff88001fafc590  length 8000002a 
status 0001002a
Sep 10 15:34:33 syscore kernel: 7: @ffff88001fafc628  length 80000064 
status 00010064
Sep 10 15:34:33 syscore kernel: 8: @ffff88001fafc6c0  length 80000036 
status 00010036
Sep 10 15:34:33 syscore kernel: 9: @ffff88001fafc758  length 80000064 
status 00010064
Sep 10 15:34:33 syscore kernel: 10: @ffff88001fafc7f0  length 80000055 
status 00010055
Sep 10 15:34:33 syscore kernel: 11: @ffff88001fafc888  length 80000036 
status 00010036
Sep 10 15:34:33 syscore kernel: 12: @ffff88001fafc920  length 8000002a 
status 0001002a
Sep 10 15:34:33 syscore kernel: 13: @ffff88001fafc9b8  length 8000004a 
status 0c01004a
Sep 10 15:34:33 syscore kernel: 14: @ffff88001fafca50  length 8000004a 
status 0c01004a
Sep 10 15:34:33 syscore kernel: 15: @ffff88001fafcae8  length 8000004a 
status 8c01004a
Sep 10 15:34:33 syscore kernel: eth0: Updating statistics failed, 
disabling stats as an interrupt source.
Sep 10 15:34:33 syscore kernel: eth0: Host error, FIFO diagnostic 
register ffff.
Sep 10 15:34:33 syscore kernel: eth0: PCI bus error, bus status ffffffff
Sep 10 15:34:34 syscore kernel: eth0:  setting full-duplex.
Sep 10 15:34:34 syscore kernel: eth0: command 0x5800 did not complete! 
Status=0xffff
Sep 10 15:34:34 syscore kernel: [sched_delayed] sched: RT throttling 
activated
Sep 10 15:34:34 syscore kernel: ata10.00: configured for UDMA/100
Sep 10 15:34:34 syscore kernel: ata10: EH complete
Sep 10 15:34:34 syscore kernel: ata10: drained 292 bytes to clear DRQ
Sep 10 15:34:34 syscore kernel: ata10.00: exception Emask 0x0 SAct 0x0 
SErr 0x0 action 0x0
Sep 10 15:34:34 syscore kernel: ata10.00: BMDMA stat 0xff
Sep 10 15:34:34 syscore kernel: ata10.00: failed command: READ DMA EXT
Sep 10 15:34:34 syscore kernel: ata10.00: cmd 
25/00:00:48:d6:fd/00:01:21:00:00/e0 tag 0 dma 131072 in
Sep 10 15:34:34 syscore kernel: res 50/00:00:47:d7:fd/00:00:21:00:00/e0 
Emask 0x20 (host bus error)
Sep 10 15:34:34 syscore kernel: ata10.00: status: { DRDY }
Sep 10 15:34:34 syscore kernel: ata10.00: configured for UDMA/100
Sep 10 15:34:34 syscore kernel: ata10: EH complete
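
Several register values in the logs above read back as all ones 
("BMDMA stat 0xff", "Status=0xffff", "bus status ffffffff"). On 
conventional PCI, a read that no device answers typically returns all 
ones, so these values fit the observation that the devices behind the 
bridge stop responding, rather than each device reporting a real status. 
A minimal sketch of that check in C follows; the helper name 
reg_reads_all_ones() is hypothetical and is not a kernel API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel API): returns true when a register
 * value of the given width (8, 16 or 32 bits) reads back as all ones,
 * the pattern usually seen when nothing on the bus answers the read.
 */
static bool reg_reads_all_ones(uint32_t value, unsigned int width_bits)
{
        assert(width_bits == 8 || width_bits == 16 || width_bits == 32);

        /* Build a mask covering exactly width_bits low-order bits. */
        uint32_t mask = (width_bits == 32)
                ? UINT32_MAX
                : ((UINT32_C(1) << width_bits) - 1);

        return (value & mask) == mask;
}
```

With the values from the logs, the 8-bit BMDMA status 0xff, the 16-bit 
NIC status 0xffff and the 32-bit bus status 0xffffffff all match, while 
the ATA result status 0x50 (DRDY) does not.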


Kind regards,
Frank de Jong
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


* Re: regression/bug introduced by commit [0e572fe7383a376992364914694c39aa7fe44c1d] drm/i915: runtime PM support for DPMS
From: Daniel Vetter @ 2015-09-23 13:07 UTC
  To: Frank de Jong; +Cc: Daniel Vetter, intel-gfx

Another one for Jairo.
-Daniel


-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch