Re: pciehp is broken from 4.10-rc1

linux-kernel.vger.kernel.org archive mirror

* Re: pciehp is broken from 4.10-rc1
       [not found] <CAE9FiQVCMCa7iVyuwp9z6VrY0cE7V_xghuXip28Ft52=8QmTWw@mail.gmail.com>
@ 2017-02-03 15:09 ` Bjorn Helgaas
       [not found] ` <20170203055200.GA29413@wunner.de>
  1 sibling, 0 replies; 10+ messages in thread
From: Bjorn Helgaas @ 2017-02-03 15:09 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Bjorn Helgaas, Rafael J. Wysocki, Lukas Wunner, linux-pci,
	Mika Westerberg, linux-kernel

[+cc Mika, linux-kernel]

On Thu, Feb 02, 2017 at 08:11:48PM -0800, Yinghai Lu wrote:
> 4.9 is  working,
> ...

> After reverting
> 
> From 68db9bc814362e7f24371c27d12a4f34477d9356 Mon Sep 17 00:00:00 2001
> From: Lukas Wunner <lukas@wunner.de>
> Date: Fri, 28 Oct 2016 10:52:06 +0200
> Subject: PCI: pciehp: Add runtime PM support for PCIe hotplug ports
> 
> hotplug works again.

I provisionally reverted 68db9bc81436 ("PCI: pciehp: Add runtime PM
support for PCIe hotplug ports") on my for-linus branch while this is
being debugged.

To be clear, this revert is only a last resort to avoid releasing
v4.10 with a regression.  I hope and assume we'll have a real fix
before v4.10 and we'll be able to drop the revert in favor of the fix.

Can someone please open a kernel.org bugzilla, mark it as a
regression, and attach the complete dmesg log and "lspci -vv" output?
Please mention that bugzilla in the changelog of the fix.

Bjorn


* Re: pciehp is broken from 4.10-rc1
       [not found]           ` <20170204233443.GA234@wunner.de>
@ 2017-02-05  4:22             ` Yinghai Lu
  2017-02-05  5:20               ` Yinghai Lu
  2017-02-05  7:34               ` Lukas Wunner
  0 siblings, 2 replies; 10+ messages in thread
From: Yinghai Lu @ 2017-02-05  4:22 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: Bjorn Helgaas, Rafael J. Wysocki, linux-pci, Mika Westerberg,
	Linux Kernel Mailing List

On Sat, Feb 4, 2017 at 3:34 PM, Lukas Wunner <lukas@wunner.de> wrote:
> On Sat, Feb 04, 2017 at 01:44:34PM -0800, Yinghai Lu wrote:
>> On Sat, Feb 4, 2017 at 10:56 AM, Lukas Wunner <lukas@wunner.de> wrote:
>> > On Sat, Feb 04, 2017 at 09:12:54AM +0100, Lukas Wunner wrote:
>> > Section 6.7.3.4 of the PCIe Base spec seems to support the theory above,
>> > so here's a tentative patch.
>> >
>> >
>> > -- >8 --
>> > Subject: [PATCH] PCI: pciehp: Don't enable PME on runtime suspend
>>
>> it works:
>
> Thanks a lot for the report and for testing the patch!

Wait, commit 68db9bc still causes a problem on another server
(Skylake-based), and this patch does not help.


sca05-0a81fd8d:~ # echo 0 > /sys/bus/pci/slots/11/power
[  362.721197] pci_hotplug: power_write_file: power = 0
[  362.726887] pciehp 0000:b3:00.0:pcie004: pciehp_get_power_status: SLOTCTRL a8 value read 11f1
[  362.736431] pciehp 0000:b3:00.0:pcie004: pciehp_unconfigure_device: domain:bus:dev = 0000:b4:00
[  362.746160] mlx4_core 0000:b4:00.0: PME# disabled
[  364.494033] pcieport 0000:b3:00.0:   root_bridge ACPI_HANDLE ffff9e56b8811550 : pci0000:b3
[  364.503274] pcieport 0000:b3:00.0:  pciehp is native
[  364.508863] pci 0000:b4:00.0: freeing pci_dev info
[  364.514718] pciehp 0000:b3:00.0:pcie004: pending interrupts 0x0010 from Slot Status
[  364.523443] pciehp 0000:b3:00.0:pcie004: pciehp_power_off_slot: SLOTCTRL a8 write cmd 400
[  364.587047] pciehp 0000:b3:00.0:pcie004: pending interrupts 0x0108 from Slot Status
[  364.595592] pciehp 0000:b3:00.0:pcie004: Slot(11): Link Down
[  364.602325] pciehp 0000:b3:00.0:pcie004: Slot(11): Link Down event ignored; already powering off
[  365.568415] pciehp 0000:b3:00.0:pcie004: pciehp_green_led_off: SLOTCTRL a8 write cmd 300
[  365.569338] pciehp 0000:b3:00.0:pcie004: pending interrupts 0x0010 from Slot Status

sca05-0a81fd8d:~ # echo 1 > /sys/bus/pci/slots/11/power
[  375.376609] pci_hotplug: power_write_file: power = 1
[  375.382175] pciehp 0000:b3:00.0:pcie004: pciehp_get_power_status: SLOTCTRL a8 value read 17f1
[  375.392695] pciehp 0000:b3:00.0:pcie004: pending interrupts 0x0010 from Slot Status
[  375.401370] pciehp 0000:b3:00.0:pcie004: pciehp_power_on_slot: SLOTCTRL a8 write cmd 0
[  375.410231] pciehp 0000:b3:00.0:pcie004: pciehp_green_led_blink: SLOTCTRL a8 write cmd 200
[  375.411071] pciehp 0000:b3:00.0:pcie004: pending interrupts 0x0010 from Slot Status
[  375.445222] pciehp 0000:b3:00.0:pcie004: pending interrupts 0x0010 from Slot Status
[  377.444400] pciehp 0000:b3:00.0:pcie004: Data Link Layer Link Active not set in 1000 msec
[  378.960364] pci 0000:b4:00.0 id reading try 50 times with interval 20 ms to get ffffffff
[  378.969406] pciehp 0000:b3:00.0:pcie004: pciehp_check_link_status: lnk_status = 5001
[  378.978059] pciehp 0000:b3:00.0:pcie004: link training error: status 0x5001
[  378.985834] pciehp 0000:b3:00.0:pcie004: Failed to check link status
[  378.987185] pciehp 0000:b3:00.0:pcie004: pending interrupts 0x0010 from Slot Status
[  378.987253] pciehp 0000:b3:00.0:pcie004: pciehp_power_off_slot: SLOTCTRL a8 write cmd 400
[  380.000409] pciehp 0000:b3:00.0:pcie004: pciehp_green_led_off: SLOTCTRL a8 write cmd 300
[  380.000674] pciehp 0000:b3:00.0:pcie004: pending interrupts 0x0010 from Slot Status
[  380.018020] pciehp 0000:b3:00.0:pcie004: pciehp_set_attention_status: SLOTCTRL a8 write cmd 40
[  380.019053] pciehp 0000:b3:00.0:pcie004: pending interrupts 0x0010 from Slot Status
-bash: echo: write error: Operation not permitted

Reverting commit 68db9bc makes it work again on this machine as well.
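
The `write cmd` values in the log above are Slot Control register fields printed in hex. A minimal Python sketch for reading them, assuming the field encodings from the kernel's `include/uapi/linux/pci_regs.h`; the helper `describe_slotctrl_write` and the table layout are illustrative only and not part of pciehp:

```python
# Slot Control field encodings (values as in include/uapi/linux/pci_regs.h).
ATTENTION_INDICATOR = {
    0x0040: "attention indicator on",
    0x0080: "attention indicator blink",
    0x00C0: "attention indicator off",
}
POWER_INDICATOR = {
    0x0100: "power indicator on",
    0x0200: "power indicator blink",
    0x0300: "power indicator off",
}
POWER_CONTROLLER = {
    0x0000: "slot power on",
    0x0400: "slot power off",
}

# Which field each pciehp function seen in the log writes. The log prints
# only the command value, so the function name is needed to interpret it
# (for example, "0" means "power on" only for the power controller field).
FIELD_BY_FUNCTION = {
    "pciehp_power_on_slot": POWER_CONTROLLER,
    "pciehp_power_off_slot": POWER_CONTROLLER,
    "pciehp_green_led_on": POWER_INDICATOR,
    "pciehp_green_led_off": POWER_INDICATOR,
    "pciehp_green_led_blink": POWER_INDICATOR,
    "pciehp_set_attention_status": ATTENTION_INDICATOR,
}


def describe_slotctrl_write(function, cmd_hex):
    """Hypothetical helper: name the action behind a 'write cmd' log value."""
    return FIELD_BY_FUNCTION[function][int(cmd_hex, 16)]


# Slot Control writes of the failing "echo 1" attempt, in log order.
failing_power_on = [
    ("pciehp_power_on_slot", "0"),
    ("pciehp_green_led_blink", "200"),
    ("pciehp_power_off_slot", "400"),
    ("pciehp_green_led_off", "300"),
    ("pciehp_set_attention_status", "40"),
]
for function, cmd in failing_power_on:
    print(f"{function}: {describe_slotctrl_write(function, cmd)}")
```

Read this way, the failing attempt ends with the slot powered off again and the attention indicator switched on, which is consistent with the write error returned to the shell.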


Thanks


Yinghai


* Re: pciehp is broken from 4.10-rc1
  2017-02-05  4:22             ` Yinghai Lu
@ 2017-02-05  5:20               ` Yinghai Lu
  2017-02-05  7:34               ` Lukas Wunner
  1 sibling, 0 replies; 10+ messages in thread
From: Yinghai Lu @ 2017-02-05  5:20 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: Bjorn Helgaas, Rafael J. Wysocki, linux-pci, Mika Westerberg,
	Linux Kernel Mailing List

On Sat, Feb 4, 2017 at 8:22 PM, Yinghai Lu <yinghai@kernel.org> wrote:
> On Sat, Feb 4, 2017 at 3:34 PM, Lukas Wunner <lukas@wunner.de> wrote:
>> On Sat, Feb 04, 2017 at 01:44:34PM -0800, Yinghai Lu wrote:
>>> On Sat, Feb 4, 2017 at 10:56 AM, Lukas Wunner <lukas@wunner.de> wrote:
>>> > On Sat, Feb 04, 2017 at 09:12:54AM +0100, Lukas Wunner wrote:
>>> > Section 6.7.3.4 of the PCIe Base spec seems to support the theory above,
>>> > so here's a tentative patch.
>>> >
>>> >
>>> > -- >8 --
>>> > Subject: [PATCH] PCI: pciehp: Don't enable PME on runtime suspend
>>>
>>> it works:
>>
>> Thanks a lot for the report and for testing the patch!
>
> Wait, commit 68db9bc still causes a problem on another server
> (Skylake-based), and this patch does not help.
>
> [...]
>
> Reverting commit 68db9bc makes it work again on this machine as well.

Output after reverting 68db9bc:

sca05-0a81fd8d:~ # echo 0 > /sys/bus/pci/slots/11/power
[  359.966115] pci_hotplug: power_write_file: power = 0
[  359.971759] pciehp 0000:b3:00.0:pcie004: pciehp_get_power_status: SLOTCTRL a8 value read 11f1
[  359.981284] pciehp 0000:b3:00.0:pcie004: pciehp_unconfigure_device: domain:bus:dev = 0000:b4:00
[  359.991017] mlx4_core 0000:b4:00.0: PME# disabled
[  361.579571] pci 0000:b4:00.0: freeing pci_dev info
[  361.585390] pciehp 0000:b3:00.0:pcie004: pending interrupts 0x0010 from Slot Status
[  361.594116] pciehp 0000:b3:00.0:pcie004: pciehp_power_off_slot: SLOTCTRL a8 write cmd 400
[  361.657705] pciehp 0000:b3:00.0:pcie004: pending interrupts 0x0108 from Slot Status
[  361.666268] pciehp 0000:b3:00.0:pcie004: Slot(11): Link Down
[  361.673076] pciehp 0000:b3:00.0:pcie004: Slot(11): Link Down event ignored; already powering off
[  362.621894] pciehp 0000:b3:00.0:pcie004: pciehp_green_led_off: SLOTCTRL a8 write cmd 300
[  362.622499] pciehp 0000:b3:00.0:pcie004: pending interrupts 0x0010 from Slot Status
sca05-0a81fd8d:~ #
sca05-0a81fd8d:~ # echo 1 > /sys/bus/pci/slots/11/power
[  368.797970] pci_hotplug: power_write_file: power = 1
[  368.803544] pciehp 0000:b3:00.0:pcie004: pciehp_get_power_status: SLOTCTRL a8 value read 17f1
[  368.813743] pciehp 0000:b3:00.0:pcie004: pending interrupts 0x0010 from Slot Status
[  368.822410] pciehp 0000:b3:00.0:pcie004: pciehp_power_on_slot: SLOTCTRL a8 write cmd 0
[  368.831280] pciehp 0000:b3:00.0:pcie004: pciehp_green_led_blink: SLOTCTRL a8 write cmd 200
[  368.832115] pciehp 0000:b3:00.0:pcie004: pending interrupts 0x0010 from Slot Status
[  369.455188] pciehp 0000:b3:00.0:pcie004: pciehp_check_link_active: lnk_status = f083
[  369.463844] pciehp 0000:b3:00.0:pcie004: pending interrupts 0x0108 from Slot Status
[  369.465786] pciehp 0000:b3:00.0:pcie004: pciehp_check_link_active: lnk_status = f083
[  369.481042] pciehp 0000:b3:00.0:pcie004: Slot(11): Link Up
[  369.487219] pciehp 0000:b3:00.0:pcie004: Slot(11): Link Up event ignored; already powering on
[  369.573787] pciehp 0000:b3:00.0:pcie004: pciehp_check_link_status: lnk_status = f083
[  369.582664] pci 0000:b4:00.0: [15b3:1003] type 00 class 0x0c0600
[  369.589626] pci 0000:b4:00.0: reg 0x10: [mem 0x00000000-0x000fffff 64bit]
[  369.597359] pci 0000:b4:00.0: reg 0x18: [mem 0x00000000-0x07ffffff 64bit pref]
[  369.605749] pci_bus 0000:b4:   bridge ACPI_HANDLE ffff9c2fb8817780 : 0000:b3:00.0
[  369.615407] pci 0000:b4:00.0: reg 0x134: [mem 0x00000000-0x07ffffff 64bit pref]
[  369.623571] pci 0000:b4:00.0: VF(n) BAR2 space: [mem 0x00000000-0x1ffffffff 64bit pref] (contains BAR2 for 64 VFs)
[  369.638820] pci 0000:b4:00.0: on_all_pcie_path: 1
[  369.644445] pci 0000:b4:00.0: BAR 2: assigned [mem 0x396ff8000000-0x396fffffffff 64bit pref]
[  369.654012] pci 0000:b4:00.0: BAR 9: assigned [mem 0x396df8000000-0x396ff7ffffff 64bit pref]
[  369.663489] pci 0000:b4:00.0: BAR 0: [mem size 0x00100000 64bit] + pref
[  369.670879] pci 0000:b4:00.0: BAR 0: assigned [mem 0xddf00000-0xddffffff 64bit]
[  369.679171] pcieport 0000:b3:00.0: PCI bridge to [bus b4-b7]
[  369.685495] pcieport 0000:b3:00.0:   bridge window [io  0xf000-0xffff]
[  369.692791] pcieport 0000:b3:00.0:   bridge window [mem 0xdd000000-0xddffffff]
[  369.700857] pcieport 0000:b3:00.0:   bridge window [mem 0x396df8000000-0x396fffffffff 64bit pref]
[  369.710778] pcieport 0000:b3:00.0: Max Payload Size set to  256/ 256 (was  256), Max Read Rq  128
[  369.720776] pci 0000:b4:00.0: Max Payload Size set to  256/ 256 (was  128), Max Read Rq  512
[  369.730231] pci 0000:b4:00.0: calling mellanox_check_broken_intx_masking+0x0/0x130
[  369.738691] calling  mellanox_check_broken_intx_masking+0x0/0x130 @ 40613 for 0000:b4:00.0
[  369.747913] pci fixup mellanox_check_broken_intx_masking+0x0/0x130 returned after 0 usecs for 0000:b4:00.0
[  369.759192] mlx4_core: Initializing 0000:b4:00.0
[  369.764398] mlx4_core 0000:b4:00.0: enabling device (0000 -> 0002)
[  369.771854]   alloc irq_desc for 71 on node 5
[  369.776904] IOAPIC[31]: Set IRTE entry (P:1 FPD:0 Dst_Mode:1 Redir_hint:1 Trig_Mode:0 Dlvry_Mode:1 Avail:0 Vector:D7 Dest:00143FFF SID:B32C SQ:0 SVT:1)
[  369.792059] IOAPIC[24]: Set routing entry (31-0 -> 0xd7 -> IRQ 71 Mode:1 Active:1 Dest:1327103)
...

[  377.032574] pciehp 0000:b3:00.0:pcie004: pciehp_green_led_on: SLOTCTRL a8 write cmd 100
[  377.032802] pciehp 0000:b3:00.0:pcie004: pending interrupts 0x0010 from Slot Status
[  377.050076] pciehp 0000:b3:00.0:pcie004: pciehp_set_attention_status: SLOTCTRL a8 write cmd c0
[  377.050328] pciehp 0000:b3:00.0:pcie004: pending interrupts 0x0010 from Slot Status
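
The `pending interrupts` values in these logs are Slot Status event bits. A small Python sketch for decoding them, assuming the bit positions from the kernel's `include/uapi/linux/pci_regs.h`; `decode_slot_status` is a hypothetical helper name:

```python
# Slot Status event bits (values as in include/uapi/linux/pci_regs.h).
SLOT_STATUS_BITS = [
    (0x0001, "Attention Button Pressed"),
    (0x0002, "Power Fault Detected"),
    (0x0004, "MRL Sensor Changed"),
    (0x0008, "Presence Detect Changed"),
    (0x0010, "Command Completed"),
    (0x0100, "Data Link Layer State Changed"),
]


def decode_slot_status(value):
    """Return the names of all event bits set in a Slot Status value."""
    return [name for bit, name in SLOT_STATUS_BITS if value & bit]


# The two values that appear in the logs of this thread.
for value in (0x0010, 0x0108):
    print(f"{value:#06x}: {', '.join(decode_slot_status(value))}")
```

With the revert, `0x0108` (Presence Detect Changed plus Data Link Layer State Changed) is reported roughly 0.6 s after the power-on command and is followed by "Link Up". In the failing log, only `0x0010` (Command Completed) shows up after power-on.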


* Re: pciehp is broken from 4.10-rc1
  2017-02-05  4:22             ` Yinghai Lu
  2017-02-05  5:20               ` Yinghai Lu
@ 2017-02-05  7:34               ` Lukas Wunner
  2017-02-06 10:37                 ` Mika Westerberg
  2017-02-06 18:10                 ` Bjorn Helgaas
  1 sibling, 2 replies; 10+ messages in thread
From: Lukas Wunner @ 2017-02-05  7:34 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Bjorn Helgaas, Rafael J. Wysocki, linux-pci, Mika Westerberg,
	Linux Kernel Mailing List

On Sat, Feb 04, 2017 at 08:22:59PM -0800, Yinghai Lu wrote:
> On Sat, Feb 4, 2017 at 3:34 PM, Lukas Wunner <lukas@wunner.de> wrote:
> > On Sat, Feb 04, 2017 at 01:44:34PM -0800, Yinghai Lu wrote:
> >> On Sat, Feb 4, 2017 at 10:56 AM, Lukas Wunner <lukas@wunner.de> wrote:
> >> > On Sat, Feb 04, 2017 at 09:12:54AM +0100, Lukas Wunner wrote:
> >> > Section 6.7.3.4 of the PCIe Base spec seems to support the theory above,
> >> > so here's a tentative patch.
> >> >
> >> > -- >8 --
> >> > Subject: [PATCH] PCI: pciehp: Don't enable PME on runtime suspend
> >>
> >> it works:
> >
> > Thanks a lot for the report and for testing the patch!
> 
> Wait, commit 68db9bc still causes a problem on another server
> (Skylake-based), and this patch does not help.
[...]
> sca05-0a81fd8d:~ # echo 1 > /sys/bus/pci/slots/11/power
> [  375.376609] pci_hotplug: power_write_file: power = 1
> [  375.382175] pciehp 0000:b3:00.0:pcie004: pciehp_get_power_status: SLOTCTRL a8 value read 17f1
> [  375.392695] pciehp 0000:b3:00.0:pcie004: pending interrupts 0x0010 from Slot Status
> [  375.401370] pciehp 0000:b3:00.0:pcie004: pciehp_power_on_slot: SLOTCTRL a8 write cmd 0
> [  375.410231] pciehp 0000:b3:00.0:pcie004: pciehp_green_led_blink: SLOTCTRL a8 write cmd 200
> [  375.411071] pciehp 0000:b3:00.0:pcie004: pending interrupts 0x0010 from Slot Status
> [  375.445222] pciehp 0000:b3:00.0:pcie004: pending interrupts 0x0010 from Slot Status
> [  377.444400] pciehp 0000:b3:00.0:pcie004: Data Link Layer Link Active not set in 1000 msec
> [  378.960364] pci 0000:b4:00.0 id reading try 50 times with interval 20 ms to get ffffffff
> [  378.969406] pciehp 0000:b3:00.0:pcie004: pciehp_check_link_status: lnk_status = 5001
> [  378.978059] pciehp 0000:b3:00.0:pcie004: link training error: status 0x5001
> [  378.985834] pciehp 0000:b3:00.0:pcie004: Failed to check link status
> [  378.987185] pciehp 0000:b3:00.0:pcie004: pending interrupts 0x0010 from Slot Status
> [  378.987253] pciehp 0000:b3:00.0:pcie004: pciehp_power_off_slot: SLOTCTRL a8 write cmd 400
> [  380.000409] pciehp 0000:b3:00.0:pcie004: pciehp_green_led_off: SLOTCTRL a8 write cmd 300
> [  380.000674] pciehp 0000:b3:00.0:pcie004: pending interrupts 0x0010 from Slot Status
> [  380.018020] pciehp 0000:b3:00.0:pcie004: pciehp_set_attention_status: SLOTCTRL a8 write cmd 40
> [  380.019053] pciehp 0000:b3:00.0:pcie004: pending interrupts 0x0010 from Slot Status

So on this Skylake machine link training fails after resuming from D3hot
to D0.

One thing that's a bit fishy is that normally the Link Disable bit is
cleared when powering on the slot.  This results in a debug message
in dmesg containing the string "lnk_ctrl = ", and that line is missing
from the output you've pasted above, suggesting that the machine is
not running a stock v4.10 kernel after all but something else.  Could
you check why this message is not printed?  Could you check with lspci
if the Link Disable bit is set before you invoke "echo 1"?

This is the call stack:
pciehp_sysfs_enable_slot()
  pciehp_enable_slot()
    board_added()
      pciehp_power_on_slot()
        pciehp_link_enable()
          __pciehp_link_set()
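
The Link Disable check requested above could be scripted. A minimal sketch in Python, assuming `lspci -vv` prints a `LnkCtl:` line that carries a `Disabled+` or `Disabled-` flag (the exact layout varies between pciutils versions). The sample line is illustrative and not taken from the affected machine, and both helper names are hypothetical:

```python
import re

PCI_EXP_LNKCTL_LD = 0x0010  # Link Disable bit in the Link Control register


def link_disable_from_lspci(lspci_vv_output):
    """Return True/False for the Link Disable flag, or None if not found."""
    for line in lspci_vv_output.splitlines():
        stripped = line.strip()
        # "LnkCtl2:" lines do not start with "LnkCtl:", so they are skipped.
        if not stripped.startswith("LnkCtl:"):
            continue
        # "ASPM Disabled;" has no +/- suffix, so it does not match here.
        match = re.search(r"\bDisabled([+-])", stripped)
        if match:
            return match.group(1) == "+"
    return None


def link_disable_from_register(lnkctl):
    """Same check on a raw Link Control register value."""
    return bool(lnkctl & PCI_EXP_LNKCTL_LD)


# Illustrative sample line, not output from the machine in this report.
sample = "\t\tLnkCtl:\tASPM Disabled; RCB 64 bytes Disabled- CommClk+\n"
print(link_disable_from_lspci(sample))
```

On the machine from the report, the input would be the output of `lspci -vv -s b3:00.0`, captured before writing 1 to the slot's `power` file.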

Another theory is that the link is generally unreliable on this machine
since the Link Bandwidth Management Status bit is set in the Link Status
Register ("lnk_status = 5001"), which according to the spec means:

"Hardware has changed Link speed or width to attempt to correct unreliable
Link operation, either through an LTSSM timeout or a higher level process.
This bit must be set if the Physical Layer reports a speed or width change
was initiated by the Downstream component that was not indicated as an
autonomous change."

In this case it would be good to know which hardware exactly we're dealing
with so that we might quirk it to not runtime suspend the port.  To that
end, could you attach a full dmesg log to the bugzilla entry I've created?
https://bugzilla.kernel.org/show_bug.cgi?id=193951
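
To make the Link Status reading concrete, here is a minimal Python sketch that splits a `lnk_status` value into its fields, assuming the bit layout from the kernel's `include/uapi/linux/pci_regs.h`. `decode_link_status` is a hypothetical helper; the comparison value `f083` is the one logged when the link comes up with the commit reverted:

```python
def decode_link_status(lnk_status):
    """Split a PCIe Link Status register value into named fields."""
    return {
        "speed_code": lnk_status & 0x000F,            # Current Link Speed
        "width": (lnk_status & 0x03F0) >> 4,          # Negotiated Link Width
        "link_training": bool(lnk_status & 0x0800),
        "slot_clock_config": bool(lnk_status & 0x1000),
        "dll_link_active": bool(lnk_status & 0x2000),
        "bandwidth_mgmt_status": bool(lnk_status & 0x4000),
        "autonomous_bw_status": bool(lnk_status & 0x8000),
    }


# Values as printed in the logs (hex without a 0x prefix).
for raw in ("5001", "f083"):
    print(raw, decode_link_status(int(raw, 16)))
```

For `5001` the negotiated width is 0 and Data Link Layer Link Active is clear, while Link Bandwidth Management Status is set. `f083` decodes to speed code 3 and width x8 with the link active; Link Bandwidth Management Status is set in that value as well.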

@Mika, Rafael: Are you aware of Skylake machines with unreliable link
training, or perhaps errata of Skylake chips related to link training
on hotplug ports?

Thanks,

Lukas


* Re: pciehp is broken from 4.10-rc1
  2017-02-05  7:34               ` Lukas Wunner
@ 2017-02-06 10:37                 ` Mika Westerberg
  2017-02-06 11:49                   ` Rafael J. Wysocki
  2017-02-06 21:35                   ` Lukas Wunner
  2017-02-06 18:10                 ` Bjorn Helgaas
  1 sibling, 2 replies; 10+ messages in thread
From: Mika Westerberg @ 2017-02-06 10:37 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: Yinghai Lu, Bjorn Helgaas, Rafael J. Wysocki, linux-pci,
	Linux Kernel Mailing List

On Sun, Feb 05, 2017 at 08:34:54AM +0100, Lukas Wunner wrote:
> > sca05-0a81fd8d:~ # echo 1 > /sys/bus/pci/slots/11/power
> > [  375.376609] pci_hotplug: power_write_file: power = 1
> > [  375.382175] pciehp 0000:b3:00.0:pcie004: pciehp_get_power_status: SLOTCTRL a8 value read 17f1
> > [  375.392695] pciehp 0000:b3:00.0:pcie004: pending interrupts 0x0010 from Slot Status
> > [  375.401370] pciehp 0000:b3:00.0:pcie004: pciehp_power_on_slot: SLOTCTRL a8 write cmd 0
> > [  375.410231] pciehp 0000:b3:00.0:pcie004: pciehp_green_led_blink: SLOTCTRL a8 write cmd 200
> > [  375.411071] pciehp 0000:b3:00.0:pcie004: pending interrupts 0x0010 from Slot Status
> > [  375.445222] pciehp 0000:b3:00.0:pcie004: pending interrupts 0x0010 from Slot Status
> > [  377.444400] pciehp 0000:b3:00.0:pcie004: Data Link Layer Link Active not set in 1000 msec
> > [  378.960364] pci 0000:b4:00.0 id reading try 50 times with interval 20 ms to get ffffffff
> > [  378.969406] pciehp 0000:b3:00.0:pcie004: pciehp_check_link_status: lnk_status = 5001
> > [  378.978059] pciehp 0000:b3:00.0:pcie004: link training error: status 0x5001
> > [  378.985834] pciehp 0000:b3:00.0:pcie004: Failed to check link status
> > [  378.987185] pciehp 0000:b3:00.0:pcie004: pending interrupts 0x0010 from Slot Status
> > [  378.987253] pciehp 0000:b3:00.0:pcie004: pciehp_power_off_slot: SLOTCTRL a8 write cmd 400
> > [  380.000409] pciehp 0000:b3:00.0:pcie004: pciehp_green_led_off: SLOTCTRL a8 write cmd 300
> > [  380.000674] pciehp 0000:b3:00.0:pcie004: pending interrupts 0x0010 from Slot Status
> > [  380.018020] pciehp 0000:b3:00.0:pcie004: pciehp_set_attention_status: SLOTCTRL a8 write cmd 40
> > [  380.019053] pciehp 0000:b3:00.0:pcie004: pending interrupts 0x0010 from Slot Status

It would be good to see the output when 68db9bc is reverted. Yinghai,
can you attach that to the bugzilla bug as well?

> So on this Skylake machine link training fails after resuming from D3hot
> to D0.
> 
> One thing that's a bit fishy is that normally the Link Disable bit is
> cleared when powering on the slot.  This results in a debug message
> in dmesg containing the string "lnk_ctrl = ", and that line is missing
> from the output you've pasted above, suggesting that the machine is
> not running a stock v4.10 kernel after all but something else.  Could
> you check why this message is not printed?  Could you check with lspci
> if the Link Disable bit is set before you invoke "echo 1"?
> 
> This is the call stack:
> pciehp_sysfs_enable_slot()
>   pciehp_enable_slot()
>     board_added()
>       pciehp_power_on_slot()
>         pciehp_link_enable()
>           __pciehp_link_set()
> 
> Another theory is that the link is generally unreliable on this machine
> since the Link Bandwidth Management Status bit is set in the Link Status
> Register ("lnk_status = 5001"), which according to the spec means:
> 
> "Hardware has changed Link speed or width to attempt to correct unreliable
> Link operation, either through an LTSSM timeout or a higher level process.
> This bit must be set if the Physical Layer reports a speed or width change
> was initiated by the Downstream component that was not indicated as an
> autonomous change."
> 
> In this case it would be good to know which hardware exactly we're dealing
> with so that we might quirk it to not runtime suspend the port.  To that
> end, could you attach a full dmesg log to the bugzilla entry I've created?
> https://bugzilla.kernel.org/show_bug.cgi?id=193951
> 
> @Mika, Rafael: Are you aware of Skylake machines with unreliable link
> training, or perhaps errata of Skylake chips related to link training
> on hotplug ports?

According to the 100-series (the chipset used with Skylake) errata
below, I don't see any mentions related to PCIe link training issues.

http://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/100-series-chipset-spec-update.pdf


* Re: pciehp is broken from 4.10-rc1
  2017-02-06 10:37                 ` Mika Westerberg
@ 2017-02-06 11:49                   ` Rafael J. Wysocki
  2017-02-06 21:35                   ` Lukas Wunner
  1 sibling, 0 replies; 10+ messages in thread
From: Rafael J. Wysocki @ 2017-02-06 11:49 UTC (permalink / raw)
  To: Mika Westerberg
  Cc: Lukas Wunner, Yinghai Lu, Bjorn Helgaas, linux-pci,
	Linux Kernel Mailing List

On Monday, February 06, 2017 12:37:06 PM Mika Westerberg wrote:
> On Sun, Feb 05, 2017 at 08:34:54AM +0100, Lukas Wunner wrote:
> > > sca05-0a81fd8d:~ # echo 1 > /sys/bus/pci/slots/11/power
> > > [  375.376609] pci_hotplug: power_write_file: power = 1
> > > [  375.382175] pciehp 0000:b3:00.0:pcie004: pciehp_get_power_status: SLOTCTRL a8 value read 17f1
> > > [  375.392695] pciehp 0000:b3:00.0:pcie004: pending interrupts 0x0010 from Slot Status
> > > [  375.401370] pciehp 0000:b3:00.0:pcie004: pciehp_power_on_slot: SLOTCTRL a8 write cmd 0
> > > [  375.410231] pciehp 0000:b3:00.0:pcie004: pciehp_green_led_blink: SLOTCTRL a8 write cmd 200
> > > [  375.411071] pciehp 0000:b3:00.0:pcie004: pending interrupts 0x0010 from Slot Status
> > > [  375.445222] pciehp 0000:b3:00.0:pcie004: pending interrupts 0x0010 from Slot Status
> > > [  377.444400] pciehp 0000:b3:00.0:pcie004: Data Link Layer Link Active not set in 1000 msec
> > > [  378.960364] pci 0000:b4:00.0 id reading try 50 times with interval 20 ms to get ffffffff
> > > [  378.969406] pciehp 0000:b3:00.0:pcie004: pciehp_check_link_status: lnk_status = 5001
> > > [  378.978059] pciehp 0000:b3:00.0:pcie004: link training error: status 0x5001
> > > [  378.985834] pciehp 0000:b3:00.0:pcie004: Failed to check link status
> > > [  378.987185] pciehp 0000:b3:00.0:pcie004: pending interrupts 0x0010 from Slot Status
> > > [  378.987253] pciehp 0000:b3:00.0:pcie004: pciehp_power_off_slot: SLOTCTRL a8 write cmd 400
> > > [  380.000409] pciehp 0000:b3:00.0:pcie004: pciehp_green_led_off: SLOTCTRL a8 write cmd 300
> > > [  380.000674] pciehp 0000:b3:00.0:pcie004: pending interrupts 0x0010 from Slot Status
> > > [  380.018020] pciehp 0000:b3:00.0:pcie004: pciehp_set_attention_status: SLOTCTRL a8 write cmd 40
> > > [  380.019053] pciehp 0000:b3:00.0:pcie004: pending interrupts 0x0010 from Slot Status
> 
> It would be good to see the output when 68db9bc is reverted. Yinghai,
> can you attach that to the bugzilla bug as well?
> 
> > So on this Skylake machine link training fails after resuming from D3hot
> > to D0.
> > 
> > One thing that's a bit fishy is that normally the Link Disable bit is
> > cleared when powering on the slot.  This results in a debug message
> > in dmesg containing the string "lnk_ctrl = ", and that line is missing
> > from the output you've pasted above, suggesting that the machine is
> > not running a stock v4.10 kernel after all but something else.  Could
> > you check why this message is not printed?  Could you check with lspci
> > if the Link Disable bit is set before you invoke "echo 1"?
> > 
> > This is the call stack:
> > pciehp_sysfs_enable_slot()
> >   pciehp_enable_slot()
> >     board_added()
> >       pciehp_power_on_slot()
> >         pciehp_link_enable()
> >           __pciehp_link_set()
> > 
> > Another theory is that the link is generally unreliable on this machine
> > since the Link Bandwidth Management Status bit is set in the Link Status
> > Register ("lnk_status = 5001"), which according to the spec means:
> > 
> > "Hardware has changed Link speed or width to attempt to correct unreliable
> > Link operation, either through an LTSSM timeout or a higher level process.
> > This bit must be set if the Physical Layer reports a speed or width change
> > was initiated by the Downstream component that was not indicated as an
> > autonomous change."
> > 
> > In this case it would be good to know which hardware exactly we're dealing
> > with so that we might quirk it to not runtime suspend the port.  To that
> > end, could you attach a full dmesg log to the bugzilla entry I've created?
> > https://bugzilla.kernel.org/show_bug.cgi?id=193951
> > 
> > @Mika, Rafael: Are you aware of Skylake machines with unreliable link
> > training, or perhaps errata of Skylake chips related to link training
> > on hotplug ports?
> 
> According to the 100-series (the chipset used with Skylake) errata
> below, I don't see any mentions related to PCIe link training issues.
> 
> http://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/100-series-chipset-spec-update.pdf

Still, it does look like an erratum to me.

At least I don't see what can be done on the software side to prevent this from
happening, except for leaving the port(s) in question in D0.
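
From user space, keeping a port in D0 can be approximated by disabling runtime PM for it through sysfs. A minimal Python sketch, assuming the standard `power/control` device attribute (writing `on` keeps the device active). `keep_port_in_d0` and its `sysfs_root` parameter are hypothetical, and whether this avoids the link training failure on the affected machine is untested:

```python
from pathlib import Path


def keep_port_in_d0(bdf, sysfs_root="/sys/bus/pci/devices"):
    """Write 'on' to the device's power/control file to block runtime suspend.

    bdf is the full PCI address, e.g. "0000:b3:00.0". sysfs_root is a
    hypothetical parameter that lets the path be redirected for testing.
    """
    control = Path(sysfs_root) / bdf / "power" / "control"
    control.write_text("on")
    return control


# Example (needs root): the hotplug port from the log quoted above.
# keep_port_in_d0("0000:b3:00.0")
```

This is only a way to test whether D3hot is what breaks link training; a quirk in the kernel, as suggested earlier in the thread, would still be needed for affected systems to work without manual setup.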

Thanks,
Rafael


* Re: pciehp is broken from 4.10-rc1
  2017-02-06 10:37                 ` Mika Westerberg
  2017-02-06 11:49                   ` Rafael J. Wysocki
@ 2017-02-06 21:35                   ` Lukas Wunner
  2017-02-08 13:00                     ` Erik Veijola
  2017-02-08 17:25                     ` Bjorn Helgaas
  1 sibling, 2 replies; 10+ messages in thread
From: Lukas Wunner @ 2017-02-06 21:35 UTC (permalink / raw)
  To: Mika Westerberg
  Cc: Yinghai Lu, Bjorn Helgaas, Rafael J. Wysocki, linux-pci,
	Linux Kernel Mailing List

On Mon, Feb 06, 2017 at 12:37:06PM +0200, Mika Westerberg wrote:
> On Sun, Feb 05, 2017 at 08:34:54AM +0100, Lukas Wunner wrote:
> > @Mika, Rafael: Are you aware of Skylake machines with unreliable link
> > training, or perhaps errata of Skylake chips related to link training
> > on hotplug ports?
> 
> According to the 100-series (the chipset used with Skylake) errata
> below, I don't see any mentions related to PCIe link training issues.
> 
> http://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/100-series-chipset-spec-update.pdf

Yinghai Lu responded off-list that the hardware in question is an
unreleased / secret Intel product, so this particular issue cannot
be expected to be documented publicly at this point.

Of course this raises the question whether issues with unreleased
products can at all be considered valid regressions, given that the
final product may not regress.  It seems like a novelty to me that
patches would get reverted for something like this, but we'll see.

Lukas


* Re: pciehp is broken from 4.10-rc1
  2017-02-06 21:35                   ` Lukas Wunner
@ 2017-02-08 13:00                     ` Erik Veijola
  2017-02-08 17:25                     ` Bjorn Helgaas
  1 sibling, 0 replies; 10+ messages in thread
From: Erik Veijola @ 2017-02-08 13:00 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: Mika Westerberg, Yinghai Lu, Bjorn Helgaas, Rafael J. Wysocki,
	linux-pci, Linux Kernel Mailing List

On Mon, Feb 06, 2017 at 10:35:28PM +0100, Lukas Wunner wrote:
> On Mon, Feb 06, 2017 at 12:37:06PM +0200, Mika Westerberg wrote:
> > On Sun, Feb 05, 2017 at 08:34:54AM +0100, Lukas Wunner wrote:
> > > @Mika, Rafael: Are you aware of Skylake machines with unreliable link
> > > training, or perhaps errata of Skylake chips related to link training
> > > on hotplug ports?
> > 
> > According to the 100-series (the chipset used with Skylake) errata
> > below, I don't see any mentions related to PCIe link training issues.
> > 
> > http://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/100-series-chipset-spec-update.pdf
> 
> Yinghai Lu responded off-list that the hardware in question is an
> unreleased / secret Intel product, so this particular issue cannot
> be expected to be documented publicly at this point.
> 
> Of course this raises the question whether issues with unreleased
> products can at all be considered valid regressions, given that the
> final product may not regress.  It seems like a novelty to me that
> patches would get reverted for something like this, but we'll see.
> 
> Lukas
> 

Yinghai, we may have a similar system in our lab so we could test this
also. What is your setup for doing the test?

-Erik

* Re: pciehp is broken from 4.10-rc1
  2017-02-06 21:35                   ` Lukas Wunner
  2017-02-08 13:00                     ` Erik Veijola
@ 2017-02-08 17:25                     ` Bjorn Helgaas
  1 sibling, 0 replies; 10+ messages in thread
From: Bjorn Helgaas @ 2017-02-08 17:25 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: Mika Westerberg, Yinghai Lu, Bjorn Helgaas, Rafael J. Wysocki,
	linux-pci, Linux Kernel Mailing List

On Mon, Feb 06, 2017 at 10:35:28PM +0100, Lukas Wunner wrote:
> On Mon, Feb 06, 2017 at 12:37:06PM +0200, Mika Westerberg wrote:
> > On Sun, Feb 05, 2017 at 08:34:54AM +0100, Lukas Wunner wrote:
> > > @Mika, Rafael: Are you aware of Skylake machines with unreliable link
> > > training, or perhaps errata of Skylake chips related to link training
> > > on hotplug ports?
> > 
> > According to the 100-series (the chipset used with Skylake) errata
> > below, I don't see any mentions related to PCIe link training issues.
> > 
> > http://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/100-series-chipset-spec-update.pdf
> 
> Yinghai Lu responded off-list that the hardware in question is an
> unreleased / secret Intel product, so this particular issue cannot
> be expected to be documented publicly at this point.
> 
> Of course this raises the question whether issues with unreleased
> products can at all be considered valid regressions, given that the
> final product may not regress.  It seems like a novelty to me that
> patches would get reverted for something like this, but we'll see.

I assume the hardware will eventually be released, and I assume the
hardware will not be changed because of this issue.  I would like to
avoid the situation of having v4.9 but not v4.10 work on this
hardware.

* Re: pciehp is broken from 4.10-rc1
  2017-02-05  7:34               ` Lukas Wunner
  2017-02-06 10:37                 ` Mika Westerberg
@ 2017-02-06 18:10                 ` Bjorn Helgaas
  1 sibling, 0 replies; 10+ messages in thread
From: Bjorn Helgaas @ 2017-02-06 18:10 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: Yinghai Lu, Bjorn Helgaas, Rafael J. Wysocki, linux-pci,
	Mika Westerberg, Linux Kernel Mailing List

On Sun, Feb 05, 2017 at 08:34:54AM +0100, Lukas Wunner wrote:
> On Sat, Feb 04, 2017 at 08:22:59PM -0800, Yinghai Lu wrote:
> > On Sat, Feb 4, 2017 at 3:34 PM, Lukas Wunner <lukas@wunner.de> wrote:
> > > On Sat, Feb 04, 2017 at 01:44:34PM -0800, Yinghai Lu wrote:
> > >> On Sat, Feb 4, 2017 at 10:56 AM, Lukas Wunner <lukas@wunner.de> wrote:
> > >> > On Sat, Feb 04, 2017 at 09:12:54AM +0100, Lukas Wunner wrote:
> > >> > Section 6.7.3.4 of the PCIe Base spec seems to support the theory above,
> > >> > so here's a tentative patch.
> > >> >
> > >> > -- >8 --
> > >> > Subject: [PATCH] PCI: pciehp: Don't enable PME on runtime suspend
> > >>
> > >> it works:
> > >
> > > Thanks a lot for the report and for testing the patch!
> > 
> > Wait, commit 68db9bc still has a problem with another server
> > (Skylake-based), and this patch does not help.
> [...]
> > sca05-0a81fd8d:~ # echo 1 > /sys/bus/pci/slots/11/power
> > [  375.376609] pci_hotplug: power_write_file: power = 1
> > [  375.382175] pciehp 0000:b3:00.0:pcie004: pciehp_get_power_status: SLOTCTRL a8 value read 17f1
> > [  375.392695] pciehp 0000:b3:00.0:pcie004: pending interrupts 0x0010 from Slot Status
> > [  375.401370] pciehp 0000:b3:00.0:pcie004: pciehp_power_on_slot: SLOTCTRL a8 write cmd 0
> > [  375.410231] pciehp 0000:b3:00.0:pcie004: pciehp_green_led_blink: SLOTCTRL a8 write cmd 200
> > [  375.411071] pciehp 0000:b3:00.0:pcie004: pending interrupts 0x0010 from Slot Status
> > [  375.445222] pciehp 0000:b3:00.0:pcie004: pending interrupts 0x0010 from Slot Status
> > [  377.444400] pciehp 0000:b3:00.0:pcie004: Data Link Layer Link Active not set in 1000 msec
> > [  378.960364] pci 0000:b4:00.0 id reading try 50 times with interval 20 ms to get ffffffff
> > [  378.969406] pciehp 0000:b3:00.0:pcie004: pciehp_check_link_status: lnk_status = 5001
> > [  378.978059] pciehp 0000:b3:00.0:pcie004: link training error: status 0x5001
> > [  378.985834] pciehp 0000:b3:00.0:pcie004: Failed to check link status
> > [  378.987185] pciehp 0000:b3:00.0:pcie004: pending interrupts 0x0010 from Slot Status
> > [  378.987253] pciehp 0000:b3:00.0:pcie004: pciehp_power_off_slot: SLOTCTRL a8 write cmd 400
> > [  380.000409] pciehp 0000:b3:00.0:pcie004: pciehp_green_led_off: SLOTCTRL a8 write cmd 300
> > [  380.000674] pciehp 0000:b3:00.0:pcie004: pending interrupts 0x0010 from Slot Status
> > [  380.018020] pciehp 0000:b3:00.0:pcie004: pciehp_set_attention_status: SLOTCTRL a8 write cmd 40
> > [  380.019053] pciehp 0000:b3:00.0:pcie004: pending interrupts 0x0010 from Slot Status
> 
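
For readers following the log above, the raw register values can be decoded by hand. The following is a minimal sketch, assuming the bit layout of the PCIe Slot Control and Slot Status registers as defined in include/uapi/linux/pci_regs.h; the helper names decode_slot_control() and decode_slot_status() are made up for illustration and are not part of pciehp.

```python
# Bit masks copied from include/uapi/linux/pci_regs.h
# (PCI_EXP_SLTCTL_* and PCI_EXP_SLTSTA_*). The helper names
# below are hypothetical and exist only for this illustration.
SLTCTL_AIC = 0x00C0  # Attention Indicator Control
SLTCTL_PIC = 0x0300  # Power Indicator Control
SLTCTL_PCC = 0x0400  # Power Controller Control (1 = power off)

SLTCTL_ENABLES = [
    (0x0001, "ABPE"),    # Attention Button Pressed Enable
    (0x0002, "PFDE"),    # Power Fault Detected Enable
    (0x0004, "MRLSCE"),  # MRL Sensor Changed Enable
    (0x0008, "PDCE"),    # Presence Detect Changed Enable
    (0x0010, "CCIE"),    # Command Completed Interrupt Enable
    (0x0020, "HPIE"),    # Hot-Plug Interrupt Enable
    (0x1000, "DLLSCE"),  # Data Link Layer State Changed Enable
]

SLTSTA_EVENTS = [
    (0x0001, "Attention Button Pressed"),
    (0x0002, "Power Fault Detected"),
    (0x0004, "MRL Sensor Changed"),
    (0x0008, "Presence Detect Changed"),
    (0x0010, "Command Completed"),
    (0x0100, "Data Link Layer State Changed"),
]

# Encoding shared by the attention and power indicator fields.
INDICATOR = {0: "reserved", 1: "on", 2: "blink", 3: "off"}


def decode_slot_control(value):
    """Split a Slot Control register value into its fields."""
    return {
        "enables": [name for mask, name in SLTCTL_ENABLES if value & mask],
        "attention_indicator": INDICATOR[(value & SLTCTL_AIC) >> 6],
        "power_indicator": INDICATOR[(value & SLTCTL_PIC) >> 8],
        "power": "off" if value & SLTCTL_PCC else "on",
    }


def decode_slot_status(value):
    """List the event bits set in a Slot Status register value."""
    return [name for mask, name in SLTSTA_EVENTS if value & mask]


if __name__ == "__main__":
    print(decode_slot_control(0x17F1))  # "SLOTCTRL a8 value read 17f1"
    print(decode_slot_status(0x0010))   # "pending interrupts 0x0010"
```

Under these assumptions, 17f1 reads as slot power off with both indicators off, and the repeated "pending interrupts 0x0010" lines are Command Completed events.
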
> So on this Skylake machine link training fails after resuming from D3hot
> to D0.
> 
> One thing that's a bit fishy is that normally the Link Disable bit is
> cleared when powering on the slot.  This results in a debug message
> in dmesg containing the string "lnk_ctrl = ", and that line is missing
> from the output you've pasted above, suggesting that the machine is
> not running a stock v4.10 kernel after all but something else.  Could
> you check why this message is not printed?  Could you check with lspci
> if the Link Disable bit is set before you invoke "echo 1"?
> 
> This is the call stack:
> pciehp_sysfs_enable_slot()
>   pciehp_enable_slot()
>     board_added()
>       pciehp_power_on_slot()
>         pciehp_link_enable()
>           __pciehp_link_set()
> 
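
The Link Disable check requested above can also be done without lspci by reading the port's config space from sysfs. This is only a sketch under stated assumptions: the offsets and masks are taken from include/uapi/linux/pci_regs.h, the helper names are hypothetical, and the sysfs config file must be read as root to get more than the first 64 bytes. The "SLOTCTRL a8" in the log would put the PCI Express capability at 0x90 on this port (0xa8 minus the 0x18 Slot Control offset), but the code walks the capability list rather than relying on that.

```python
import struct

# Offsets and masks from include/uapi/linux/pci_regs.h.
PCI_STATUS = 0x06
PCI_STATUS_CAP_LIST = 0x10
PCI_CAPABILITY_LIST = 0x34
PCI_CAP_ID_EXP = 0x10
PCI_EXP_LNKCTL = 0x10      # Link Control, relative to the capability
PCI_EXP_LNKCTL_LD = 0x0010  # Link Disable


def find_pcie_cap(cfg):
    """Return the offset of the PCI Express capability, or 0 if absent."""
    (status,) = struct.unpack_from("<H", cfg, PCI_STATUS)
    if not status & PCI_STATUS_CAP_LIST:
        return 0
    pos = cfg[PCI_CAPABILITY_LIST] & 0xFC
    for _ in range(48):  # bound the walk in case the list is malformed
        if pos < 0x40 or pos + 1 >= len(cfg):
            break
        if cfg[pos] == PCI_CAP_ID_EXP:
            return pos
        pos = cfg[pos + 1] & 0xFC
    return 0


def link_disabled(cfg):
    """Report whether the Link Disable bit is set in Link Control."""
    cap = find_pcie_cap(cfg)
    if not cap:
        raise ValueError("no PCI Express capability found")
    (lnkctl,) = struct.unpack_from("<H", cfg, cap + PCI_EXP_LNKCTL)
    return bool(lnkctl & PCI_EXP_LNKCTL_LD)


def read_config(bdf):
    """Read the first 256 bytes of config space (needs root for > 64)."""
    with open(f"/sys/bus/pci/devices/{bdf}/config", "rb") as f:
        return f.read(256)
```

On the machine in question this would be invoked as link_disabled(read_config("0000:b3:00.0")) before the "echo 1".
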
> Another theory is that the link is generally unreliable on this machine
> since the Link Bandwidth Management Status bit is set in the Link Status
> Register ("lnk_status = 5001"), which according to the spec means:
> 
> "Hardware has changed Link speed or width to attempt to correct unreliable
> Link operation, either through an LTSSM timeout or a higher level process.
> This bit must be set if the Physical Layer reports a speed or width change
> was initiated by the Downstream component that was not indicated as an
> autonomous change."
> 
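
To make the "lnk_status = 5001" reading concrete, here is a small decoding sketch. It assumes the Link Status bit layout from include/uapi/linux/pci_regs.h (PCI_EXP_LNKSTA_*); decode_link_status() is a hypothetical helper, not a kernel function.

```python
# Masks copied from include/uapi/linux/pci_regs.h (PCI_EXP_LNKSTA_*).
LNKSTA_CLS = 0x000F    # Current Link Speed
LNKSTA_NLW = 0x03F0    # Negotiated Link Width
LNKSTA_LT = 0x0800     # Link Training
LNKSTA_SLC = 0x1000    # Slot Clock Configuration
LNKSTA_DLLLA = 0x2000  # Data Link Layer Link Active
LNKSTA_LBMS = 0x4000   # Link Bandwidth Management Status
LNKSTA_LABS = 0x8000   # Link Autonomous Bandwidth Status

SPEEDS = {1: "2.5 GT/s", 2: "5 GT/s", 3: "8 GT/s"}


def decode_link_status(value):
    """Split a Link Status register value into its fields."""
    return {
        "speed": SPEEDS.get(value & LNKSTA_CLS, "unknown"),
        "width": (value & LNKSTA_NLW) >> 4,
        "link_training": bool(value & LNKSTA_LT),
        "slot_clock": bool(value & LNKSTA_SLC),
        "dll_link_active": bool(value & LNKSTA_DLLLA),
        "bandwidth_mgmt_status": bool(value & LNKSTA_LBMS),
        "autonomous_bw_status": bool(value & LNKSTA_LABS),
    }


if __name__ == "__main__":
    for field, val in decode_link_status(0x5001).items():
        print(f"{field}: {val}")
```

With these definitions, 0x5001 has Link Bandwidth Management Status and Slot Clock Configuration set, a speed code of 2.5 GT/s, a negotiated width of 0 and Data Link Layer Link Active clear. The zero width appears consistent with the "link training error" line in the log, though that depends on how pciehp_check_link_status() evaluates the register.
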
> In this case it would be good to know which hardware exactly we're dealing
> with so that we might quirk it to not runtime suspend the port.  To that
> end, could you attach a full dmesg log to the bugzilla entry I've created?
> https://bugzilla.kernel.org/show_bug.cgi?id=193951
> 
> @Mika, Rafael: Are you aware of Skylake machines with unreliable link
> training, or perhaps errata of Skylake chips related to link training
> on hotplug ports?

I think we're prematurely leaping to the conclusion that this is a
hardware erratum.  I don't have nearly the confidence that pciehp is
handling this correctly that you seem to have.

If this is a hardware erratum, we should be able to turn off
CONFIG_HOTPLUG_PCI_PCIE and drive through this scenario manually with
setpci.  That sequence would be immensely helpful to any hardware
engineers who want to investigate this.

I'm hesitant to add a quirk until we have a better understanding of
what's going on.  Yinghai tripped over this one broken case, but I
don't have any reason to believe that's the only one.

Bjorn

end of thread, other threads:[~2017-02-08 17:36 UTC | newest]

Thread overview: 10+ messages
     [not found] <CAE9FiQVCMCa7iVyuwp9z6VrY0cE7V_xghuXip28Ft52=8QmTWw@mail.gmail.com>
2017-02-03 15:09 ` pciehp is broken from 4.10-rc1 Bjorn Helgaas
     [not found] ` <20170203055200.GA29413@wunner.de>
     [not found]   ` <CAE9FiQWs0H9vqEo2ZYnWWBW0Ao-hx4WYHQ69cyR32nFQ9yV9rw@mail.gmail.com>
     [not found]     ` <20170204081254.GA29595@wunner.de>
     [not found]       ` <20170204185607.GA29957@wunner.de>
     [not found]         ` <CAE9FiQUuFJHMScyFgnHbs5r-SzTiRiBZ2JcpUYJhg0ft75-OBQ@mail.gmail.com>
     [not found]           ` <20170204233443.GA234@wunner.de>
2017-02-05  4:22             ` Yinghai Lu
2017-02-05  5:20               ` Yinghai Lu
2017-02-05  7:34               ` Lukas Wunner
2017-02-06 10:37                 ` Mika Westerberg
2017-02-06 11:49                   ` Rafael J. Wysocki
2017-02-06 21:35                   ` Lukas Wunner
2017-02-08 13:00                     ` Erik Veijola
2017-02-08 17:25                     ` Bjorn Helgaas
2017-02-06 18:10                 ` Bjorn Helgaas
