Linux-PCI Archive on lore.kernel.org
* PCI trouble on mvebu (Turris Omnia)
@ 2020-10-27 15:43 Toke Høiland-Jørgensen
  2020-10-27 17:20 ` Bjorn Helgaas
  2020-10-27 18:03 ` Marek Behun
  0 siblings, 2 replies; 48+ messages in thread
From: Toke Høiland-Jørgensen @ 2020-10-27 15:43 UTC
  To: linux-pci, linux-arm-kernel, Rob Herring; +Cc: Ilias Apalodimas

Hi everyone

I'm trying to get a mainline kernel to run on my Turris Omnia, and am
having some trouble getting the PCI bus to work correctly. Specifically,
I'm running a 5.10-rc1 kernel (torvalds/master as of this moment), with
the resource request fix[0] applied on top.

The kernel boots fine, and the patch in [0] makes the PCI devices show
up. But I'm still getting initialisation errors like these:

[    1.632709] pci 0000:01:00.0: BAR 0: error updating (0xe0000004 != 0xffffffff)
[    1.632714] pci 0000:01:00.0: BAR 0: error updating (high 0x000000 != 0xffffffff)
[    1.632745] pci 0000:02:00.0: BAR 0: error updating (0xe0200004 != 0xffffffff)
[    1.632750] pci 0000:02:00.0: BAR 0: error updating (high 0x000000 != 0xffffffff)

and the WiFi drivers fail to initialise with what appears to me to be
errors related to the bus rather than to the drivers themselves:

[    3.509878] ath: phy0: Mac Chip Rev 0xfffc0.f is not supported by this driver
[    3.517049] ath: phy0: Unable to initialize hardware; initialization status: -95
[    3.524473] ath9k 0000:01:00.0: Failed to initialize device
[    3.530081] ath9k: probe of 0000:01:00.0 failed with error -95
[    3.536012] ath10k_pci 0000:02:00.0: of_irq_parse_pci: failed with rc=134
[    3.543049] pci 0000:00:02.0: enabling device (0140 -> 0142)
[    3.548735] ath10k_pci 0000:02:00.0: can't change power state from D3hot to D0 (config space inaccessible)
[    3.588592] ath10k_pci 0000:02:00.0: failed to wake up device : -110
[    3.595098] ath10k_pci: probe of 0000:02:00.0 failed with error -110
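
A note on reading the "BAR 0: error updating" lines quoted above: a value of 0xffffffff is what a PCI config read conventionally returns when nothing answers, so these lines are consistent with the devices not responding at that point. The sketch below decodes such a line. It assumes the first number is the value the kernel wrote and the second is the value it read back; the function name and the returned dictionary layout are made up for illustration and are not kernel or pciutils API.

```python
import re

# Matches "pci <dev>: BAR <n>: error updating (<a> != <b>)", with an optional
# "high " prefix for the upper dword of a 64-bit BAR.
BAR_ERR = re.compile(
    r"pci (?P<dev>[0-9a-f:.]+): BAR (?P<bar>\d+): error updating "
    r"\((?P<high>high )?(?P<wrote>0x[0-9a-f]+) != (?P<read>0x[0-9a-f]+)\)"
)


def decode_bar_error(line):
    """Decode one 'BAR n: error updating' line; return None if it does not match."""
    m = BAR_ERR.search(line)
    if m is None:
        return None
    wrote = int(m["wrote"], 16)   # assumed: value written to the BAR
    read = int(m["read"], 16)     # assumed: value read back afterwards
    info = {
        "device": m["dev"],
        "bar": int(m["bar"]),
        "high_dword": m["high"] is not None,
        "wrote": wrote,
        "read_back": read,
        "all_ones": read == 0xFFFFFFFF,
    }
    if not info["high_dword"]:
        # Low BAR bits per the PCI spec: bit 0 = I/O space, bits 2:1 = type
        # (0b10 means 64-bit), bit 3 = prefetchable.
        info["is_io"] = bool(wrote & 0x1)
        if not info["is_io"]:
            info["is_64bit"] = ((wrote >> 1) & 0x3) == 0x2
            info["prefetchable"] = bool(wrote & 0x8)
            info["address"] = wrote & ~0xF
    return info
```

For the first line quoted above, this reports a 64-bit, non-prefetchable memory BAR at 0xe0000000 with an all-ones readback, which matches the "64bit" flag in the corresponding "assigned" line of the full dmesg.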

lspci looks OK, though:

# lspci
00:01.0 PCI bridge: Marvell Technology Group Ltd. Device 6820 (rev 04)
00:02.0 PCI bridge: Marvell Technology Group Ltd. Device 6820 (rev 04)
00:03.0 PCI bridge: Marvell Technology Group Ltd. Device 6820 (rev 04)
01:00.0 Network controller: Qualcomm Atheros AR9287 Wireless Network Adapter (PCI-Express) (rev 01)
02:00.0 Network controller: Qualcomm Atheros QCA986x/988x 802.11ac Wireless Network Adapter (rev ff)

Does anyone have any clue what could be going on here? Is this a bug, or
did I miss something in my config or other initialisation? I've tried
with both the stock U-Boot distributed with the board, and with an
upstream U-Boot from latest master; it doesn't seem to make any difference.

Any pointers will be greatly appreciated!

Thanks,

-Toke


[0] https://lore.kernel.org/linux-pci/20201023145252.2691779-1-robh@kernel.org/

Full dmesg:

[    1.546457] pci 0000:00:02.0: [11ab:6820] type 01 class 0x060400
[    1.546469] pci 0000:00:02.0: reg 0x38: [mem 0x00000000-0x000007ff pref]
[    1.546615] pci 0000:00:03.0: [11ab:6820] type 01 class 0x060400
[    1.546627] pci 0000:00:03.0: reg 0x38: [mem 0x00000000-0x000007ff pref]
[    1.547341] PCI: bus0: Fast back to back transfers disabled
[    1.547349] pci 0000:00:01.0: bridge configuration invalid ([bus 00-00]), reconfiguring
[    1.547356] pci 0000:00:02.0: bridge configuration invalid ([bus 00-00]), reconfiguring
[    1.547363] pci 0000:00:03.0: bridge configuration invalid ([bus 00-00]), reconfiguring
[    1.547444] pci 0000:01:00.0: [168c:002e] type 00 class 0x028000
[    1.547466] pci 0000:01:00.0: reg 0x10: [mem 0xe8000000-0xe800ffff 64bit]
[    1.547576] pci 0000:01:00.0: supports D1
[    1.547581] pci 0000:01:00.0: PME# supported from D0 D1 D3hot
[    1.547692] pci 0000:00:01.0: ASPM: current common clock configuration is inconsistent, reconfiguring
[    1.601932] PCI: bus1: Fast back to back transfers enabled
[    1.601941] pci_bus 0000:01: busn_res: [bus 01-ff] end is updated to 01
[    1.602039] pci 0000:02:00.0: [168c:003c] type 00 class 0x028000
[    1.602063] pci 0000:02:00.0: reg 0x10: [mem 0xea000000-0xea1fffff 64bit]
[    1.602096] pci 0000:02:00.0: reg 0x30: [mem 0xea200000-0xea20ffff pref]
[    1.602174] pci 0000:02:00.0: supports D1 D2
[    1.602273] pci 0000:00:02.0: ASPM: current common clock configuration is inconsistent, reconfiguring
[    1.631918] PCI: bus2: Fast back to back transfers enabled
[    1.631926] pci_bus 0000:02: busn_res: [bus 02-ff] end is updated to 02
[    1.632623] PCI: bus3: Fast back to back transfers enabled
[    1.632630] pci_bus 0000:03: busn_res: [bus 03-ff] end is updated to 03
[    1.632663] pci 0000:00:01.0: BAR 8: assigned [mem 0xe0000000-0xe00fffff]
[    1.632671] pci 0000:00:02.0: BAR 8: assigned [mem 0xe0200000-0xe04fffff]
[    1.632679] pci 0000:00:01.0: BAR 6: assigned [mem 0xe0100000-0xe01007ff pref]
[    1.632687] pci 0000:00:02.0: BAR 6: assigned [mem 0xe0500000-0xe05007ff pref]
[    1.632694] pci 0000:00:03.0: BAR 6: assigned [mem 0xe0600000-0xe06007ff pref]
[    1.632701] pci 0000:01:00.0: BAR 0: assigned [mem 0xe0000000-0xe000ffff 64bit]
[    1.632709] pci 0000:01:00.0: BAR 0: error updating (0xe0000004 != 0xffffffff)
[    1.632714] pci 0000:01:00.0: BAR 0: error updating (high 0x000000 != 0xffffffff)
[    1.632720] pci 0000:00:01.0: PCI bridge to [bus 01]
[    1.632728] pci 0000:00:01.0:   bridge window [mem 0xe0000000-0xe00fffff]
[    1.632737] pci 0000:02:00.0: BAR 0: assigned [mem 0xe0200000-0xe03fffff 64bit]
[    1.632745] pci 0000:02:00.0: BAR 0: error updating (0xe0200004 != 0xffffffff)
[    1.632750] pci 0000:02:00.0: BAR 0: error updating (high 0x000000 != 0xffffffff)
[    1.632757] pci 0000:02:00.0: BAR 6: assigned [mem 0xe0400000-0xe040ffff pref]
[    1.632762] pci 0000:00:02.0: PCI bridge to [bus 02]
[    1.632768] pci 0000:00:02.0:   bridge window [mem 0xe0200000-0xe04fffff]
[    1.632774] pci 0000:00:03.0: PCI bridge to [bus 03]
[    1.633030] mv_xor f1060800.xor: Marvell shared XOR driver
[    1.691640] mv_xor f1060800.xor: Marvell XOR (Descriptor Mode): ( xor cpy intr )
[    1.691756] mv_xor f1060900.xor: Marvell shared XOR driver
[    1.751635] mv_xor f1060900.xor: Marvell XOR (Descriptor Mode): ( xor cpy intr )
[    1.769386] Serial: 8250/16550 driver, 4 ports, IRQ sharing disabled
[    1.770240] printk: console [ttyS0] disabled
[    1.790351] f1012000.serial: ttyS0 at MMIO 0xf1012000 (irq = 30, base_baud = 15625000) is a 16550A
[    3.040783] printk: console [ttyS0] enabled
[    3.065621] f1012100.serial: ttyS1 at MMIO 0xf1012100 (irq = 31, base_baud = 15625000) is a 16550A
[    3.075329] ahci-mvebu f10a8000.sata: supply ahci not found, using dummy regulator
[    3.082990] ahci-mvebu f10a8000.sata: supply phy not found, using dummy regulator
[    3.090499] ahci-mvebu f10a8000.sata: supply target not found, using dummy regulator
[    3.098335] ahci-mvebu f10a8000.sata: AHCI 0001.0000 32 slots 2 ports 6 Gbps 0x3 impl platform mode
[    3.107411] ahci-mvebu f10a8000.sata: flags: 64bit ncq sntf led only pmp fbs pio slum part sxs 
[    3.116657] scsi host0: ahci-mvebu
[    3.120302] scsi host1: ahci-mvebu
[    3.123825] ata1: SATA max UDMA/133 mmio [mem 0xf10a8000-0xf10a9fff] port 0x100 irq 53
[    3.131768] ata2: SATA max UDMA/133 mmio [mem 0xf10a8000-0xf10a9fff] port 0x180 irq 53
[    3.140560] spi-nor spi0.0: s25fl164k (8192 Kbytes)
[    3.145494] 2 fixed-partitions partitions found on MTD device spi0.0
[    3.151868] Creating 2 MTD partitions on "spi0.0":
[    3.156671] 0x000000000000-0x000000100000 : "U-Boot"
[    3.171461] 0x000000100000-0x000000800000 : "Rescue system"
[    3.191747] wireguard: WireGuard 1.0.0 loaded. See www.wireguard.com for information.
[    3.199597] wireguard: Copyright (C) 2015-2019 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
[    3.209859] libphy: Fixed MDIO Bus: probed
[    3.214141] tun: Universal TUN/TAP device driver, 1.6
[    3.219584] libphy: orion_mdio_bus: probed
[    3.224542] mv88e6085 f1072004.mdio-mii:10: switch 0x1760 detected: Marvell 88E6176, revision 1
[    3.450274] libphy: mv88e6xxx SMI: probed
[    3.461815] mvneta f1070000.ethernet eth0: Using hardware mac address d8:58:d7:00:4e:98
[    3.470606] mvneta f1030000.ethernet eth1: Using hardware mac address d8:58:d7:00:4e:96
[    3.479356] mvneta f1034000.ethernet eth2: Using hardware mac address d8:58:d7:00:4e:97
[    3.482630] ata1: SATA link down (SStatus 0 SControl 300)
[    3.487588] pci 0000:00:01.0: enabling device (0140 -> 0142)
[    3.492831] ata2: SATA link down (SStatus 0 SControl 300)
[    3.498496] ath9k 0000:01:00.0: enabling device (0000 -> 0002)
[    3.509878] ath: phy0: Mac Chip Rev 0xfffc0.f is not supported by this driver
[    3.517049] ath: phy0: Unable to initialize hardware; initialization status: -95
[    3.524473] ath9k 0000:01:00.0: Failed to initialize device
[    3.530081] ath9k: probe of 0000:01:00.0 failed with error -95
[    3.536012] ath10k_pci 0000:02:00.0: of_irq_parse_pci: failed with rc=134
[    3.543049] pci 0000:00:02.0: enabling device (0140 -> 0142)
[    3.548735] ath10k_pci 0000:02:00.0: can't change power state from D3hot to D0 (config space inaccessible)
[    3.588592] ath10k_pci 0000:02:00.0: failed to wake up device : -110
[    3.595098] ath10k_pci: probe of 0000:02:00.0 failed with error -110
[    3.601529] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
[    3.608072] ehci-pci: EHCI PCI platform driver
[    3.612553] ehci-orion: EHCI orion driver
[    3.616675] orion-ehci f1058000.usb: EHCI Host Controller
[    3.622105] orion-ehci f1058000.usb: new USB bus registered, assigned bus number 1
[    3.629733] orion-ehci f1058000.usb: irq 49, io mem 0xf1058000
[    3.661261] orion-ehci f1058000.usb: USB 2.0 started, EHCI 1.00
[    3.667530] hub 1-0:1.0: USB hub found
[    3.671321] hub 1-0:1.0: 1 port detected
[    3.675700] xhci-hcd f10f0000.usb3: xHCI Host Controller
[    3.681034] xhci-hcd f10f0000.usb3: new USB bus registered, assigned bus number 2
[    3.688599] xhci-hcd f10f0000.usb3: hcc params 0x0a000990 hci version 0x100 quirks 0x0000000000010010
[    3.697867] xhci-hcd f10f0000.usb3: irq 55, io mem 0xf10f0000
[    3.703905] hub 2-0:1.0: USB hub found
[    3.707678] hub 2-0:1.0: 1 port detected
[    3.711767] xhci-hcd f10f0000.usb3: xHCI Host Controller
[    3.717096] xhci-hcd f10f0000.usb3: new USB bus registered, assigned bus number 3
[    3.724621] xhci-hcd f10f0000.usb3: Host supports USB 3.0 SuperSpeed
[    3.731026] usb usb3: We don't know the algorithms for LPM for this host, disabling LPM.
[    3.739388] hub 3-0:1.0: USB hub found
[    3.743167] hub 3-0:1.0: 1 port detected
[    3.747339] xhci-hcd f10f8000.usb3: xHCI Host Controller
[    3.752684] xhci-hcd f10f8000.usb3: new USB bus registered, assigned bus number 4
[    3.760230] xhci-hcd f10f8000.usb3: hcc params 0x0a000990 hci version 0x100 quirks 0x0000000000010010
[    3.769502] xhci-hcd f10f8000.usb3: irq 56, io mem 0xf10f8000
[    3.775527] hub 4-0:1.0: USB hub found
[    3.779298] hub 4-0:1.0: 1 port detected
[    3.783756] xhci-hcd f10f8000.usb3: xHCI Host Controller
[    3.789086] xhci-hcd f10f8000.usb3: new USB bus registered, assigned bus number 5
[    3.796610] xhci-hcd f10f8000.usb3: Host supports USB 3.0 SuperSpeed
[    3.803012] usb usb5: We don't know the algorithms for LPM for this host, disabling LPM.
[    3.811375] hub 5-0:1.0: USB hub found
[    3.815147] hub 5-0:1.0: 1 port detected
[    3.819312] usbcore: registered new interface driver usb-storage
[    3.826044] armada38x-rtc f10a3800.rtc: registered as rtc0
[    3.831632] armada38x-rtc f10a3800.rtc: setting system clock to 2020-10-27T15:31:52 UTC (1603812712)
[    3.840905] i2c /dev entries driver
[    3.846565] orion_wdt: Initial timeout 171 sec
[    3.851350] sdhci: Secure Digital Host Controller Interface driver
[    3.857544] sdhci: Copyright(c) Pierre Ossman
[    3.862041] sdhci-pltfm: SDHCI platform and OF driver helper
[    3.868792] marvell-cesa f1090000.crypto: CESA device successfully registered
[    3.876106] usbcore: registered new interface driver usbhid
[    3.881715] usbhid: USB HID core driver
[    3.885678] GACT probability on
[    3.888837] Mirror/redirect action on
[    3.892589] Simple TC action Loaded
[    3.893793] mmc0: SDHCI controller on f10d8000.sdhci [f10d8000.sdhci] using ADMA
[    3.896117] u32 classifier
[    3.906258]     Performance counters on
[    3.910113]     input device check on
[    3.913812]     Actions configured
[    3.917606] NET: Registered protocol family 10
[    3.922867] Segment Routing with IPv6
[    3.926605] sit: IPv6, IPv4 and MPLS over IPv4 tunneling driver
[    3.932956] NET: Registered protocol family 17
[    3.937500] 8021q: 802.1Q VLAN Support v1.8
[    3.941850] ThumbEE CPU extension supported.
[    3.946133] Registering SWP/SWPB emulation handler
[    3.951024] Loading compiled-in X.509 certificates
[    3.956916] Btrfs loaded, crc32c=crc32c-generic
[    3.961872] mv88e6085 f1072004.mdio-mii:10: switch 0x1760 detected: Marvell 88E6176, revision 1
[    4.027817] mmc0: new high speed MMC card at address 0001
[    4.033650] mmcblk0: mmc0:0001 H8G4a\x92 7.28 GiB 
[    4.038323] mmcblk0boot0: mmc0:0001 H8G4a\x92 partition 1 4.00 MiB
[    4.044421] mmcblk0boot1: mmc0:0001 H8G4a\x92 partition 2 4.00 MiB
[    4.050457] mmcblk0rpmb: mmc0:0001 H8G4a\x92 partition 3 4.00 MiB, chardev (250:0)
[    4.059708]  mmcblk0: p1
[    4.081276] usb 2-1: new high-speed USB device number 2 using xhci-hcd
[    4.169488] libphy: mv88e6xxx SMI: probed
[    4.261911] usb-storage 2-1:1.0: USB Mass Storage device detected
[    4.268229] scsi host2: usb-storage 2-1:1.0
[    4.816096] mv88e6085 f1072004.mdio-mii:10 lan0 (uninitialized): PHY [mv88e6xxx-1:00] driver [Marvell 88E1540] (irq=70)
[    4.842702] mv88e6085 f1072004.mdio-mii:10 lan1 (uninitialized): PHY [mv88e6xxx-1:01] driver [Marvell 88E1540] (irq=71)
[    4.869246] mv88e6085 f1072004.mdio-mii:10 lan2 (uninitialized): PHY [mv88e6xxx-1:02] driver [Marvell 88E1540] (irq=72)
[    4.895772] mv88e6085 f1072004.mdio-mii:10 lan3 (uninitialized): PHY [mv88e6xxx-1:03] driver [Marvell 88E1540] (irq=73)
[    4.920733] mv88e6085 f1072004.mdio-mii:10 lan4 (uninitialized): PHY [mv88e6xxx-1:04] driver [Marvell 88E1540] (irq=74)
[    4.939701] mv88e6085 f1072004.mdio-mii:10: configuring for fixed/rgmii-id link mode
[    4.950089] mv88e6085 f1072004.mdio-mii:10: Link is Up - 1Gbps/Full - flow control off
[    4.958047] DSA: tree 0 setup
[    4.961339] cfg80211: Loading compiled-in X.509 certificates for regulatory database
[    4.970623] cfg80211: Loaded X.509 cert 'sforshee: 00b28ddf47aef9cea7'
[    4.977231] platform regulatory.0: Direct firmware load for regulatory.db failed with error -2
[    4.985879] cfg80211: failed to load regulatory.db
[    4.990987] Waiting 2 sec before mounting root device...
[    5.351539] scsi 2:0:0:0: Direct-Access     General  UDisk            5.00 PQ: 0 ANSI: 2
[    5.360060] sd 2:0:0:0: [sda] 7987200 512-byte logical blocks: (4.09 GB/3.81 GiB)
[    5.367691] sd 2:0:0:0: [sda] Write Protect is off
[    5.372503] sd 2:0:0:0: [sda] Mode Sense: 0b 00 00 08
[    5.372605] sd 2:0:0:0: [sda] No Caching mode page found
[    5.377931] sd 2:0:0:0: [sda] Assuming drive cache: write through
[    5.435076]  sda: sda1
[    5.438130] sd 2:0:0:0: [sda] Attached SCSI removable disk
[    7.047873] BTRFS: device fsid 448334b8-1b27-4738-8118-9e70b56b1e58 devid 1 transid 680 /dev/root scanned by swapper/0 (1)
[    7.059562] BTRFS info (device mmcblk0p1): disk space caching is enabled
[    7.066294] BTRFS info (device mmcblk0p1): has skinny extents
[    7.078585] BTRFS info (device mmcblk0p1): enabling ssd optimizations
[    7.087624] VFS: Mounted root (btrfs filesystem) on device 0:12.
[    7.094044] devtmpfs: mounted
[    7.097581] Freeing unused kernel memory: 1024K
[    7.131431] Run /sbin/init as init process
[    7.135536]   with arguments:
[    7.135539]     /sbin/init
[    7.135541]     earlyprintk
[    7.135543]   with environment:
[    7.135545]     HOME=/
[    7.135548]     TERM=linux
[    7.220335] random: fast init done
[    7.650974] systemd[1]: systemd 246.6-1.1-arch running in system mode. (+PAM +AUDIT -SELINUX -IMA -APPARMOR +SMACK -SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +ZSTD +SECCOMP +BLKID +ELFUTILS +KMOD +IDN2 -IDN +PCRE2 default-hierarchy=hybrid)
[    7.674141] systemd[1]: Detected architecture arm.
[    7.752534] systemd[1]: Set hostname to <omnia-arch>.
[    7.938493] systemd[164]: /usr/lib/systemd/system-generators/systemd-gpt-auto-generator failed with exit status 1.
[    8.148416] systemd[1]: Queued start job for default target Graphical Interface.
[    8.156570] random: systemd: uninitialized urandom read (16 bytes read)
[    8.164923] systemd[1]: Created slice system-getty.slice.
[    8.201373] random: systemd: uninitialized urandom read (16 bytes read)
[    8.208682] systemd[1]: Created slice system-modprobe.slice.
[    8.241347] random: systemd: uninitialized urandom read (16 bytes read)
[    8.248610] systemd[1]: Created slice system-serial\x2dgetty.slice.
[    8.281970] systemd[1]: Created slice User and Session Slice.
[    8.321507] systemd[1]: Started Dispatch Password Requests to Console Directory Watch.
[    8.371436] systemd[1]: Started Forward Password Requests to Wall Directory Watch.
[    8.421373] systemd[1]: Condition check resulted in Arbitrary Executable File Formats File System Automount Point being skipped.
[    8.433099] systemd[1]: Reached target Local Encrypted Volumes.
[    8.481453] systemd[1]: Reached target Paths.
[    8.521358] systemd[1]: Reached target Remote File Systems.
[    8.571330] systemd[1]: Reached target Slices.
[    8.611374] systemd[1]: Reached target Swap.
[    8.641568] systemd[1]: Listening on Device-mapper event daemon FIFOs.
[    8.693521] systemd[1]: Listening on Process Core Dump Socket.
[    8.745061] systemd[1]: Condition check resulted in Journal Audit Socket being skipped.
[    8.759882] systemd[1]: Listening on Journal Socket (/dev/log).
[    8.801664] systemd[1]: Listening on Journal Socket.
[    8.848051] systemd[1]: Listening on Network Service Netlink Socket.
[    8.892567] systemd[1]: Listening on udev Control Socket.
[    8.941553] systemd[1]: Listening on udev Kernel Socket.
[    8.981628] systemd[1]: Condition check resulted in Huge Pages File System being skipped.
[    8.990034] systemd[1]: Condition check resulted in POSIX Message Queue File System being skipped.
[    8.999279] systemd[1]: Condition check resulted in Kernel Debug File System being skipped.
[    9.010575] systemd[1]: Mounting Kernel Trace File System...
[    9.043918] systemd[1]: Mounting Temporary Directory (/tmp)...
[    9.081515] systemd[1]: Condition check resulted in Create list of static device nodes for the current kernel being skipped.
[    9.095686] systemd[1]: Starting Load Kernel Module drm...
[    9.138063] systemd[1]: Condition check resulted in Set Up Additional Binary Formats being skipped.
[    9.148885] systemd[1]: Condition check resulted in Load Kernel Modules being skipped.
[    9.157149] systemd[1]: Condition check resulted in FUSE Control File System being skipped.
[    9.165757] systemd[1]: Condition check resulted in Kernel Configuration File System being skipped.
[    9.177585] systemd[1]: Starting Remount Root and Kernel File Systems...
[    9.211505] systemd[1]: Condition check resulted in Repartition Root Disk being skipped.
[    9.222607] systemd[1]: Starting Apply Kernel Variables...
[    9.264359] systemd[1]: Starting Coldplug All udev Devices...
[    9.305343] systemd[1]: Mounted Kernel Trace File System.
[    9.352014] systemd[1]: Mounted Temporary Directory (/tmp).
[    9.391997] systemd[1]: modprobe@drm.service: Succeeded.
[    9.398342] systemd[1]: Finished Load Kernel Module drm.
[    9.436436] systemd[1]: Finished Remount Root and Kernel File Systems.
[    9.472894] systemd[1]: Finished Apply Kernel Variables.
[    9.514310] systemd[1]: Condition check resulted in First Boot Wizard being skipped.
[    9.529349] systemd[1]: Condition check resulted in Rebuild Hardware Database being skipped.
[    9.540514] systemd[1]: Starting Load/Save Random Seed...
[    9.561757] systemd[1]: Condition check resulted in Create System Users being skipped.
[    9.578340] systemd[1]: Starting Create Static Device Nodes in /dev...
[    9.639784] systemd[1]: Finished Create Static Device Nodes in /dev.
[    9.692025] systemd[1]: Reached target Local File Systems (Pre).
[    9.741485] systemd[1]: Condition check resulted in Virtual Machine and Container Storage (Compatibility) being skipped.
[    9.752637] systemd[1]: Reached target Local File Systems.
[    9.794672] systemd[1]: Started Entropy Daemon based on the HAVEGE algorithm.
[    9.831672] systemd[1]: Condition check resulted in Rebuild Dynamic Linker Cache being skipped.
[    9.844130] systemd[1]: Starting Journal Service...
[    9.861510] systemd[1]: Condition check resulted in Commit a transient machine-id on disk being skipped.
[    9.885260] systemd[1]: Starting Rule-based Manager for Device Events and Files...
[    9.932999] systemd[1]: Finished Coldplug All udev Devices.
[   10.175983] systemd[1]: Started Journal Service.
[   11.579842] mvneta f1070000.ethernet eth0: configuring for fixed/rgmii link mode
[   11.607754] mvneta f1070000.ethernet eth0: Link is Up - 1Gbps/Full - flow control off
[   11.787479] mvneta f1034000.ethernet eth2: PHY [f1072004.mdio-mii:01] driver [Marvell 88E1510] (irq=POLL)
[   11.817734] mvneta f1034000.ethernet eth2: configuring for phy/sgmii link mode
[   12.102369] BTRFS info (device mmcblk0p1): devid 1 device path /dev/root changed to /dev/mmcblk0p1 scanned by systemd-udevd (194)
[   13.131291] random: crng init done
[   13.134710] random: 7 urandom warning(s) missed due to ratelimiting
[   14.961639] mvneta f1034000.ethernet eth2: Link is Up - 1Gbps/Full - flow control rx/tx
[   14.969684] IPv6: ADDRCONF(NETDEV_CHANGE): eth2: link becomes ready
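
The numeric codes in the ath9k and ath10k probe failures in this log can be mapped to symbolic names. The sketch below hard-codes only the values that appear here: 95 and 110 are EOPNOTSUPP and ETIMEDOUT in the Linux generic errno table (include/uapi/asm-generic/errno.h), and 134 is 0x86, which equals PCIBIOS_DEVICE_NOT_FOUND in include/linux/pci.h. Treating rc=134 as a PCIBIOS code is an assumption about where that value originates; the table and function names are illustrative only.

```python
import re

# Linux generic errno values; only the two that occur in this log.
LINUX_ERRNO = {
    95: "EOPNOTSUPP",
    110: "ETIMEDOUT",
}

# Assumption: the positive "rc=" value is a PCIBIOS_* return code.
PCIBIOS_CODES = {
    0x86: "PCIBIOS_DEVICE_NOT_FOUND",
}


def explain_codes(line):
    """Return symbolic names for the '-N' and 'rc=N' codes found in a log line."""
    names = []
    # Negative codes such as "error -95" or ": -110".
    for num in re.findall(r"(?<!\w)-(\d+)\b", line):
        names.append(LINUX_ERRNO.get(int(num), f"errno {num} (not in table)"))
    # Positive return codes printed as "rc=N".
    for num in re.findall(r"\brc=(\d+)\b", line):
        names.append(PCIBIOS_CODES.get(int(num), f"rc {num} (not in table)"))
    return names
```

Under these assumptions, the ath9k failure reads as "operation not supported", the ath10k wake-up failure as a timeout, and the of_irq_parse_pci failure as a config read that found no device.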



* Re: PCI trouble on mvebu (Turris Omnia)
  2020-10-27 15:43 PCI trouble on mvebu (Turris Omnia) Toke Høiland-Jørgensen
@ 2020-10-27 17:20 ` Bjorn Helgaas
  2020-10-27 17:44   ` ™֟☻̭҇ Ѽ ҉ ®
  2020-10-27 18:56   ` Toke Høiland-Jørgensen
  2020-10-27 18:03 ` Marek Behun
  1 sibling, 2 replies; 48+ messages in thread
From: Bjorn Helgaas @ 2020-10-27 17:20 UTC
  To: Toke Høiland-Jørgensen
  Cc: linux-pci, linux-arm-kernel, Rob Herring, Ilias Apalodimas, vtolkm

[+cc vtolkm]

On Tue, Oct 27, 2020 at 04:43:20PM +0100, Toke Høiland-Jørgensen wrote:
> Hi everyone
> 
> I'm trying to get a mainline kernel to run on my Turris Omnia, and am
> having some trouble getting the PCI bus to work correctly. Specifically,
> I'm running a 5.10-rc1 kernel (torvalds/master as of this moment), with
> the resource request fix[0] applied on top.
> 
> The kernel boots fine, and the patch in [0] makes the PCI devices show
> up. But I'm still getting initialisation errors like these:
> 
> [    1.632709] pci 0000:01:00.0: BAR 0: error updating (0xe0000004 != 0xffffffff)
> [    1.632714] pci 0000:01:00.0: BAR 0: error updating (high 0x000000 != 0xffffffff)
> [    1.632745] pci 0000:02:00.0: BAR 0: error updating (0xe0200004 != 0xffffffff)
> [    1.632750] pci 0000:02:00.0: BAR 0: error updating (high 0x000000 != 0xffffffff)
> 
> and the WiFi drivers fail to initialise with what appears to me to be
> errors related to the bus rather than to the drivers themselves:
> 
> [    3.509878] ath: phy0: Mac Chip Rev 0xfffc0.f is not supported by this driver
> [    3.517049] ath: phy0: Unable to initialize hardware; initialization status: -95
> [    3.524473] ath9k 0000:01:00.0: Failed to initialize device
> [    3.530081] ath9k: probe of 0000:01:00.0 failed with error -95
> [    3.536012] ath10k_pci 0000:02:00.0: of_irq_parse_pci: failed with rc=134
> [    3.543049] pci 0000:00:02.0: enabling device (0140 -> 0142)
> [    3.548735] ath10k_pci 0000:02:00.0: can't change power state from D3hot to D0 (config space inaccessible)
> [    3.588592] ath10k_pci 0000:02:00.0: failed to wake up device : -110
> [    3.595098] ath10k_pci: probe of 0000:02:00.0 failed with error -110
> 
> lspci looks OK, though:
> 
> # lspci
> 00:01.0 PCI bridge: Marvell Technology Group Ltd. Device 6820 (rev 04)
> 00:02.0 PCI bridge: Marvell Technology Group Ltd. Device 6820 (rev 04)
> 00:03.0 PCI bridge: Marvell Technology Group Ltd. Device 6820 (rev 04)
> 01:00.0 Network controller: Qualcomm Atheros AR9287 Wireless Network Adapter (PCI-Express) (rev 01)
> 02:00.0 Network controller: Qualcomm Atheros QCA986x/988x 802.11ac Wireless Network Adapter (rev ff)
> 
> Does anyone have any clue what could be going on here? Is this a bug, or
> did I miss something in my config or other initialisation? I've tried
> with both the stock U-Boot distributed with the board, and with an
> upstream U-Boot from latest master; it doesn't seem to make any difference.

Can you try turning off CONFIG_PCIEASPM?  We had a similar recent
report at https://bugzilla.kernel.org/show_bug.cgi?id=209833 but I
don't think we have a fix yet.
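
To confirm whether a given build actually has CONFIG_PCIEASPM turned off, the kernel's .config text can be checked. This is a minimal sketch that assumes the usual .config syntax ("CONFIG_FOO=y" or "# CONFIG_FOO is not set"); the function name and the sample text are made up for illustration.

```python
def config_option_state(config_text, option):
    """Return the option's value ('y', 'm', ...) or None if it is unset or absent."""
    for raw in config_text.splitlines():
        line = raw.strip()
        if line == f"# {option} is not set":
            return None
        if line.startswith(option + "="):
            return line.split("=", 1)[1]
    return None


# Hypothetical excerpts from two builds, for illustration only.
with_aspm = "CONFIG_PCI=y\nCONFIG_PCIEASPM=y\n"
without_aspm = "CONFIG_PCI=y\n# CONFIG_PCIEASPM is not set\n"

print(config_option_state(with_aspm, "CONFIG_PCIEASPM"))
print(config_option_state(without_aspm, "CONFIG_PCIEASPM"))
```

On a running system the text could come from the build tree's .config, or from /proc/config.gz if the kernel was built with CONFIG_IKCONFIG_PROC.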

> [0] https://lore.kernel.org/linux-pci/20201023145252.2691779-1-robh@kernel.org/
> 
> Full dmesg:
> 
> [    1.546457] pci 0000:00:02.0: [11ab:6820] type 01 class 0x060400
> [    1.546469] pci 0000:00:02.0: reg 0x38: [mem 0x00000000-0x000007ff pref]
> [    1.546615] pci 0000:00:03.0: [11ab:6820] type 01 class 0x060400
> [    1.546627] pci 0000:00:03.0: reg 0x38: [mem 0x00000000-0x000007ff pref]
> [    1.547341] PCI: bus0: Fast back to back transfers disabled
> [    1.547349] pci 0000:00:01.0: bridge configuration invalid ([bus 00-00]), reconfiguring
> [    1.547356] pci 0000:00:02.0: bridge configuration invalid ([bus 00-00]), reconfiguring
> [    1.547363] pci 0000:00:03.0: bridge configuration invalid ([bus 00-00]), reconfiguring
> [    1.547444] pci 0000:01:00.0: [168c:002e] type 00 class 0x028000
> [    1.547466] pci 0000:01:00.0: reg 0x10: [mem 0xe8000000-0xe800ffff 64bit]
> [    1.547576] pci 0000:01:00.0: supports D1
> [    1.547581] pci 0000:01:00.0: PME# supported from D0 D1 D3hot
> [    1.547692] pci 0000:00:01.0: ASPM: current common clock configuration is inconsistent, reconfiguring
> [    1.601932] PCI: bus1: Fast back to back transfers enabled
> [    1.601941] pci_bus 0000:01: busn_res: [bus 01-ff] end is updated to 01
> [    1.602039] pci 0000:02:00.0: [168c:003c] type 00 class 0x028000
> [    1.602063] pci 0000:02:00.0: reg 0x10: [mem 0xea000000-0xea1fffff 64bit]
> [    1.602096] pci 0000:02:00.0: reg 0x30: [mem 0xea200000-0xea20ffff pref]
> [    1.602174] pci 0000:02:00.0: supports D1 D2
> [    1.602273] pci 0000:00:02.0: ASPM: current common clock configuration is inconsistent, reconfiguring
> [    1.631918] PCI: bus2: Fast back to back transfers enabled
> [    1.631926] pci_bus 0000:02: busn_res: [bus 02-ff] end is updated to 02
> [    1.632623] PCI: bus3: Fast back to back transfers enabled
> [    1.632630] pci_bus 0000:03: busn_res: [bus 03-ff] end is updated to 03
> [    1.632663] pci 0000:00:01.0: BAR 8: assigned [mem 0xe0000000-0xe00fffff]
> [    1.632671] pci 0000:00:02.0: BAR 8: assigned [mem 0xe0200000-0xe04fffff]
> [    1.632679] pci 0000:00:01.0: BAR 6: assigned [mem 0xe0100000-0xe01007ff pref]
> [    1.632687] pci 0000:00:02.0: BAR 6: assigned [mem 0xe0500000-0xe05007ff pref]
> [    1.632694] pci 0000:00:03.0: BAR 6: assigned [mem 0xe0600000-0xe06007ff pref]
> [    1.632701] pci 0000:01:00.0: BAR 0: assigned [mem 0xe0000000-0xe000ffff 64bit]
> [    1.632709] pci 0000:01:00.0: BAR 0: error updating (0xe0000004 != 0xffffffff)
> [    1.632714] pci 0000:01:00.0: BAR 0: error updating (high 0x000000 != 0xffffffff)
> [    1.632720] pci 0000:00:01.0: PCI bridge to [bus 01]
> [    1.632728] pci 0000:00:01.0:   bridge window [mem 0xe0000000-0xe00fffff]
> [    1.632737] pci 0000:02:00.0: BAR 0: assigned [mem 0xe0200000-0xe03fffff 64bit]
> [    1.632745] pci 0000:02:00.0: BAR 0: error updating (0xe0200004 != 0xffffffff)
> [    1.632750] pci 0000:02:00.0: BAR 0: error updating (high 0x000000 != 0xffffffff)
> [    1.632757] pci 0000:02:00.0: BAR 6: assigned [mem 0xe0400000-0xe040ffff pref]
> [    1.632762] pci 0000:00:02.0: PCI bridge to [bus 02]
> [    1.632768] pci 0000:00:02.0:   bridge window [mem 0xe0200000-0xe04fffff]
> [    1.632774] pci 0000:00:03.0: PCI bridge to [bus 03]
> [    1.633030] mv_xor f1060800.xor: Marvell shared XOR driver
> [    1.691640] mv_xor f1060800.xor: Marvell XOR (Descriptor Mode): ( xor cpy intr )
> [    1.691756] mv_xor f1060900.xor: Marvell shared XOR driver
> [    1.751635] mv_xor f1060900.xor: Marvell XOR (Descriptor Mode): ( xor cpy intr )
> [    1.769386] Serial: 8250/16550 driver, 4 ports, IRQ sharing disabled
> [    1.770240] printk: console [ttyS0] disabled
> [    1.790351] f1012000.serial: ttyS0 at MMIO 0xf1012000 (irq = 30, base_baud = 15625000) is a 16550A
> [    3.040783] printk: console [ttyS0] enabled
> [    3.065621] f1012100.serial: ttyS1 at MMIO 0xf1012100 (irq = 31, base_baud = 15625000) is a 16550A
> [    3.075329] ahci-mvebu f10a8000.sata: supply ahci not found, using dummy regulator
> [    3.082990] ahci-mvebu f10a8000.sata: supply phy not found, using dummy regulator
> [    3.090499] ahci-mvebu f10a8000.sata: supply target not found, using dummy regulator
> [    3.098335] ahci-mvebu f10a8000.sata: AHCI 0001.0000 32 slots 2 ports 6 Gbps 0x3 impl platform mode
> [    3.107411] ahci-mvebu f10a8000.sata: flags: 64bit ncq sntf led only pmp fbs pio slum part sxs 
> [    3.116657] scsi host0: ahci-mvebu
> [    3.120302] scsi host1: ahci-mvebu
> [    3.123825] ata1: SATA max UDMA/133 mmio [mem 0xf10a8000-0xf10a9fff] port 0x100 irq 53
> [    3.131768] ata2: SATA max UDMA/133 mmio [mem 0xf10a8000-0xf10a9fff] port 0x180 irq 53
> [    3.140560] spi-nor spi0.0: s25fl164k (8192 Kbytes)
> [    3.145494] 2 fixed-partitions partitions found on MTD device spi0.0
> [    3.151868] Creating 2 MTD partitions on "spi0.0":
> [    3.156671] 0x000000000000-0x000000100000 : "U-Boot"
> [    3.171461] 0x000000100000-0x000000800000 : "Rescue system"
> [    3.191747] wireguard: WireGuard 1.0.0 loaded. See www.wireguard.com for information.
> [    3.199597] wireguard: Copyright (C) 2015-2019 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
> [    3.209859] libphy: Fixed MDIO Bus: probed
> [    3.214141] tun: Universal TUN/TAP device driver, 1.6
> [    3.219584] libphy: orion_mdio_bus: probed
> [    3.224542] mv88e6085 f1072004.mdio-mii:10: switch 0x1760 detected: Marvell 88E6176, revision 1
> [    3.450274] libphy: mv88e6xxx SMI: probed
> [    3.461815] mvneta f1070000.ethernet eth0: Using hardware mac address d8:58:d7:00:4e:98
> [    3.470606] mvneta f1030000.ethernet eth1: Using hardware mac address d8:58:d7:00:4e:96
> [    3.479356] mvneta f1034000.ethernet eth2: Using hardware mac address d8:58:d7:00:4e:97
> [    3.482630] ata1: SATA link down (SStatus 0 SControl 300)
> [    3.487588] pci 0000:00:01.0: enabling device (0140 -> 0142)
> [    3.492831] ata2: SATA link down (SStatus 0 SControl 300)
> [    3.498496] ath9k 0000:01:00.0: enabling device (0000 -> 0002)
> [    3.509878] ath: phy0: Mac Chip Rev 0xfffc0.f is not supported by this driver
> [    3.517049] ath: phy0: Unable to initialize hardware; initialization status: -95
> [    3.524473] ath9k 0000:01:00.0: Failed to initialize device
> [    3.530081] ath9k: probe of 0000:01:00.0 failed with error -95
> [    3.536012] ath10k_pci 0000:02:00.0: of_irq_parse_pci: failed with rc=134
> [    3.543049] pci 0000:00:02.0: enabling device (0140 -> 0142)
> [    3.548735] ath10k_pci 0000:02:00.0: can't change power state from D3hot to D0 (config space inaccessible)
> [    3.588592] ath10k_pci 0000:02:00.0: failed to wake up device : -110
> [    3.595098] ath10k_pci: probe of 0000:02:00.0 failed with error -110
> [    3.601529] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
> [    3.608072] ehci-pci: EHCI PCI platform driver
> [    3.612553] ehci-orion: EHCI orion driver
> [    3.616675] orion-ehci f1058000.usb: EHCI Host Controller
> [    3.622105] orion-ehci f1058000.usb: new USB bus registered, assigned bus number 1
> [    3.629733] orion-ehci f1058000.usb: irq 49, io mem 0xf1058000
> [    3.661261] orion-ehci f1058000.usb: USB 2.0 started, EHCI 1.00
> [    3.667530] hub 1-0:1.0: USB hub found
> [    3.671321] hub 1-0:1.0: 1 port detected
> [    3.675700] xhci-hcd f10f0000.usb3: xHCI Host Controller
> [    3.681034] xhci-hcd f10f0000.usb3: new USB bus registered, assigned bus number 2
> [    3.688599] xhci-hcd f10f0000.usb3: hcc params 0x0a000990 hci version 0x100 quirks 0x0000000000010010
> [    3.697867] xhci-hcd f10f0000.usb3: irq 55, io mem 0xf10f0000
> [    3.703905] hub 2-0:1.0: USB hub found
> [    3.707678] hub 2-0:1.0: 1 port detected
> [    3.711767] xhci-hcd f10f0000.usb3: xHCI Host Controller
> [    3.717096] xhci-hcd f10f0000.usb3: new USB bus registered, assigned bus number 3
> [    3.724621] xhci-hcd f10f0000.usb3: Host supports USB 3.0 SuperSpeed
> [    3.731026] usb usb3: We don't know the algorithms for LPM for this host, disabling LPM.
> [    3.739388] hub 3-0:1.0: USB hub found
> [    3.743167] hub 3-0:1.0: 1 port detected
> [    3.747339] xhci-hcd f10f8000.usb3: xHCI Host Controller
> [    3.752684] xhci-hcd f10f8000.usb3: new USB bus registered, assigned bus number 4
> [    3.760230] xhci-hcd f10f8000.usb3: hcc params 0x0a000990 hci version 0x100 quirks 0x0000000000010010
> [    3.769502] xhci-hcd f10f8000.usb3: irq 56, io mem 0xf10f8000
> [    3.775527] hub 4-0:1.0: USB hub found
> [    3.779298] hub 4-0:1.0: 1 port detected
> [    3.783756] xhci-hcd f10f8000.usb3: xHCI Host Controller
> [    3.789086] xhci-hcd f10f8000.usb3: new USB bus registered, assigned bus number 5
> [    3.796610] xhci-hcd f10f8000.usb3: Host supports USB 3.0 SuperSpeed
> [    3.803012] usb usb5: We don't know the algorithms for LPM for this host, disabling LPM.
> [    3.811375] hub 5-0:1.0: USB hub found
> [    3.815147] hub 5-0:1.0: 1 port detected
> [    3.819312] usbcore: registered new interface driver usb-storage
> [    3.826044] armada38x-rtc f10a3800.rtc: registered as rtc0
> [    3.831632] armada38x-rtc f10a3800.rtc: setting system clock to 2020-10-27T15:31:52 UTC (1603812712)
> [    3.840905] i2c /dev entries driver
> [    3.846565] orion_wdt: Initial timeout 171 sec
> [    3.851350] sdhci: Secure Digital Host Controller Interface driver
> [    3.857544] sdhci: Copyright(c) Pierre Ossman
> [    3.862041] sdhci-pltfm: SDHCI platform and OF driver helper
> [    3.868792] marvell-cesa f1090000.crypto: CESA device successfully registered
> [    3.876106] usbcore: registered new interface driver usbhid
> [    3.881715] usbhid: USB HID core driver
> [    3.885678] GACT probability on
> [    3.888837] Mirror/redirect action on
> [    3.892589] Simple TC action Loaded
> [    3.893793] mmc0: SDHCI controller on f10d8000.sdhci [f10d8000.sdhci] using ADMA
> [    3.896117] u32 classifier
> [    3.906258]     Performance counters on
> [    3.910113]     input device check on
> [    3.913812]     Actions configured
> [    3.917606] NET: Registered protocol family 10
> [    3.922867] Segment Routing with IPv6
> [    3.926605] sit: IPv6, IPv4 and MPLS over IPv4 tunneling driver
> [    3.932956] NET: Registered protocol family 17
> [    3.937500] 8021q: 802.1Q VLAN Support v1.8
> [    3.941850] ThumbEE CPU extension supported.
> [    3.946133] Registering SWP/SWPB emulation handler
> [    3.951024] Loading compiled-in X.509 certificates
> [    3.956916] Btrfs loaded, crc32c=crc32c-generic
> [    3.961872] mv88e6085 f1072004.mdio-mii:10: switch 0x1760 detected: Marvell 88E6176, revision 1
> [    4.027817] mmc0: new high speed MMC card at address 0001
> [    4.033650] mmcblk0: mmc0:0001 H8G4a\x92 7.28 GiB 
> [    4.038323] mmcblk0boot0: mmc0:0001 H8G4a\x92 partition 1 4.00 MiB
> [    4.044421] mmcblk0boot1: mmc0:0001 H8G4a\x92 partition 2 4.00 MiB
> [    4.050457] mmcblk0rpmb: mmc0:0001 H8G4a\x92 partition 3 4.00 MiB, chardev (250:0)
> [    4.059708]  mmcblk0: p1
> [    4.081276] usb 2-1: new high-speed USB device number 2 using xhci-hcd
> [    4.169488] libphy: mv88e6xxx SMI: probed
> [    4.261911] usb-storage 2-1:1.0: USB Mass Storage device detected
> [    4.268229] scsi host2: usb-storage 2-1:1.0
> [    4.816096] mv88e6085 f1072004.mdio-mii:10 lan0 (uninitialized): PHY [mv88e6xxx-1:00] driver [Marvell 88E1540] (irq=70)
> [    4.842702] mv88e6085 f1072004.mdio-mii:10 lan1 (uninitialized): PHY [mv88e6xxx-1:01] driver [Marvell 88E1540] (irq=71)
> [    4.869246] mv88e6085 f1072004.mdio-mii:10 lan2 (uninitialized): PHY [mv88e6xxx-1:02] driver [Marvell 88E1540] (irq=72)
> [    4.895772] mv88e6085 f1072004.mdio-mii:10 lan3 (uninitialized): PHY [mv88e6xxx-1:03] driver [Marvell 88E1540] (irq=73)
> [    4.920733] mv88e6085 f1072004.mdio-mii:10 lan4 (uninitialized): PHY [mv88e6xxx-1:04] driver [Marvell 88E1540] (irq=74)
> [    4.939701] mv88e6085 f1072004.mdio-mii:10: configuring for fixed/rgmii-id link mode
> [    4.950089] mv88e6085 f1072004.mdio-mii:10: Link is Up - 1Gbps/Full - flow control off
> [    4.958047] DSA: tree 0 setup
> [    4.961339] cfg80211: Loading compiled-in X.509 certificates for regulatory database
> [    4.970623] cfg80211: Loaded X.509 cert 'sforshee: 00b28ddf47aef9cea7'
> [    4.977231] platform regulatory.0: Direct firmware load for regulatory.db failed with error -2
> [    4.985879] cfg80211: failed to load regulatory.db
> [    4.990987] Waiting 2 sec before mounting root device...
> [    5.351539] scsi 2:0:0:0: Direct-Access     General  UDisk            5.00 PQ: 0 ANSI: 2
> [    5.360060] sd 2:0:0:0: [sda] 7987200 512-byte logical blocks: (4.09 GB/3.81 GiB)
> [    5.367691] sd 2:0:0:0: [sda] Write Protect is off
> [    5.372503] sd 2:0:0:0: [sda] Mode Sense: 0b 00 00 08
> [    5.372605] sd 2:0:0:0: [sda] No Caching mode page found
> [    5.377931] sd 2:0:0:0: [sda] Assuming drive cache: write through
> [    5.435076]  sda: sda1
> [    5.438130] sd 2:0:0:0: [sda] Attached SCSI removable disk
> [    7.047873] BTRFS: device fsid 448334b8-1b27-4738-8118-9e70b56b1e58 devid 1 transid 680 /dev/root scanned by swapper/0 (1)
> [    7.059562] BTRFS info (device mmcblk0p1): disk space caching is enabled
> [    7.066294] BTRFS info (device mmcblk0p1): has skinny extents
> [    7.078585] BTRFS info (device mmcblk0p1): enabling ssd optimizations
> [    7.087624] VFS: Mounted root (btrfs filesystem) on device 0:12.
> [    7.094044] devtmpfs: mounted
> [    7.097581] Freeing unused kernel memory: 1024K
> [    7.131431] Run /sbin/init as init process
> [    7.135536]   with arguments:
> [    7.135539]     /sbin/init
> [    7.135541]     earlyprintk
> [    7.135543]   with environment:
> [    7.135545]     HOME=/
> [    7.135548]     TERM=linux
> [    7.220335] random: fast init done
> [    7.650974] systemd[1]: systemd 246.6-1.1-arch running in system mode. (+PAM +AUDIT -SELINUX -IMA -APPARMOR +SMACK -SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +ZSTD +SECCOMP +BLKID +ELFUTILS +KMOD +IDN2 -IDN +PCRE2 default-hierarchy=hybrid)
> [    7.674141] systemd[1]: Detected architecture arm.
> [    7.752534] systemd[1]: Set hostname to <omnia-arch>.
> [    7.938493] systemd[164]: /usr/lib/systemd/system-generators/systemd-gpt-auto-generator failed with exit status 1.
> [    8.148416] systemd[1]: Queued start job for default target Graphical Interface.
> [    8.156570] random: systemd: uninitialized urandom read (16 bytes read)
> [    8.164923] systemd[1]: Created slice system-getty.slice.
> [    8.201373] random: systemd: uninitialized urandom read (16 bytes read)
> [    8.208682] systemd[1]: Created slice system-modprobe.slice.
> [    8.241347] random: systemd: uninitialized urandom read (16 bytes read)
> [    8.248610] systemd[1]: Created slice system-serial\x2dgetty.slice.
> [    8.281970] systemd[1]: Created slice User and Session Slice.
> [    8.321507] systemd[1]: Started Dispatch Password Requests to Console Directory Watch.
> [    8.371436] systemd[1]: Started Forward Password Requests to Wall Directory Watch.
> [    8.421373] systemd[1]: Condition check resulted in Arbitrary Executable File Formats File System Automount Point being skipped.
> [    8.433099] systemd[1]: Reached target Local Encrypted Volumes.
> [    8.481453] systemd[1]: Reached target Paths.
> [    8.521358] systemd[1]: Reached target Remote File Systems.
> [    8.571330] systemd[1]: Reached target Slices.
> [    8.611374] systemd[1]: Reached target Swap.
> [    8.641568] systemd[1]: Listening on Device-mapper event daemon FIFOs.
> [    8.693521] systemd[1]: Listening on Process Core Dump Socket.
> [    8.745061] systemd[1]: Condition check resulted in Journal Audit Socket being skipped.
> [    8.759882] systemd[1]: Listening on Journal Socket (/dev/log).
> [    8.801664] systemd[1]: Listening on Journal Socket.
> [    8.848051] systemd[1]: Listening on Network Service Netlink Socket.
> [    8.892567] systemd[1]: Listening on udev Control Socket.
> [    8.941553] systemd[1]: Listening on udev Kernel Socket.
> [    8.981628] systemd[1]: Condition check resulted in Huge Pages File System being skipped.
> [    8.990034] systemd[1]: Condition check resulted in POSIX Message Queue File System being skipped.
> [    8.999279] systemd[1]: Condition check resulted in Kernel Debug File System being skipped.
> [    9.010575] systemd[1]: Mounting Kernel Trace File System...
> [    9.043918] systemd[1]: Mounting Temporary Directory (/tmp)...
> [    9.081515] systemd[1]: Condition check resulted in Create list of static device nodes for the current kernel being skipped.
> [    9.095686] systemd[1]: Starting Load Kernel Module drm...
> [    9.138063] systemd[1]: Condition check resulted in Set Up Additional Binary Formats being skipped.
> [    9.148885] systemd[1]: Condition check resulted in Load Kernel Modules being skipped.
> [    9.157149] systemd[1]: Condition check resulted in FUSE Control File System being skipped.
> [    9.165757] systemd[1]: Condition check resulted in Kernel Configuration File System being skipped.
> [    9.177585] systemd[1]: Starting Remount Root and Kernel File Systems...
> [    9.211505] systemd[1]: Condition check resulted in Repartition Root Disk being skipped.
> [    9.222607] systemd[1]: Starting Apply Kernel Variables...
> [    9.264359] systemd[1]: Starting Coldplug All udev Devices...
> [    9.305343] systemd[1]: Mounted Kernel Trace File System.
> [    9.352014] systemd[1]: Mounted Temporary Directory (/tmp).
> [    9.391997] systemd[1]: modprobe@drm.service: Succeeded.
> [    9.398342] systemd[1]: Finished Load Kernel Module drm.
> [    9.436436] systemd[1]: Finished Remount Root and Kernel File Systems.
> [    9.472894] systemd[1]: Finished Apply Kernel Variables.
> [    9.514310] systemd[1]: Condition check resulted in First Boot Wizard being skipped.
> [    9.529349] systemd[1]: Condition check resulted in Rebuild Hardware Database being skipped.
> [    9.540514] systemd[1]: Starting Load/Save Random Seed...
> [    9.561757] systemd[1]: Condition check resulted in Create System Users being skipped.
> [    9.578340] systemd[1]: Starting Create Static Device Nodes in /dev...
> [    9.639784] systemd[1]: Finished Create Static Device Nodes in /dev.
> [    9.692025] systemd[1]: Reached target Local File Systems (Pre).
> [    9.741485] systemd[1]: Condition check resulted in Virtual Machine and Container Storage (Compatibility) being skipped.
> [    9.752637] systemd[1]: Reached target Local File Systems.
> [    9.794672] systemd[1]: Started Entropy Daemon based on the HAVEGE algorithm.
> [    9.831672] systemd[1]: Condition check resulted in Rebuild Dynamic Linker Cache being skipped.
> [    9.844130] systemd[1]: Starting Journal Service...
> [    9.861510] systemd[1]: Condition check resulted in Commit a transient machine-id on disk being skipped.
> [    9.885260] systemd[1]: Starting Rule-based Manager for Device Events and Files...
> [    9.932999] systemd[1]: Finished Coldplug All udev Devices.
> [   10.175983] systemd[1]: Started Journal Service.
> [   11.579842] mvneta f1070000.ethernet eth0: configuring for fixed/rgmii link mode
> [   11.607754] mvneta f1070000.ethernet eth0: Link is Up - 1Gbps/Full - flow control off
> [   11.787479] mvneta f1034000.ethernet eth2: PHY [f1072004.mdio-mii:01] driver [Marvell 88E1510] (irq=POLL)
> [   11.817734] mvneta f1034000.ethernet eth2: configuring for phy/sgmii link mode
> [   12.102369] BTRFS info (device mmcblk0p1): devid 1 device path /dev/root changed to /dev/mmcblk0p1 scanned by systemd-udevd (194)
> [   13.131291] random: crng init done
> [   13.134710] random: 7 urandom warning(s) missed due to ratelimiting
> [   14.961639] mvneta f1034000.ethernet eth2: Link is Up - 1Gbps/Full - flow control rx/tx
> [   14.969684] IPv6: ADDRCONF(NETDEV_CHANGE): eth2: link becomes ready

* Re: PCI trouble on mvebu (Turris Omnia)
  2020-10-27 17:20 ` Bjorn Helgaas
@ 2020-10-27 17:44   ` ™֟☻̭҇ Ѽ ҉ ®
  2020-10-27 18:59     ` Toke Høiland-Jørgensen
  2020-10-27 18:56   ` Toke Høiland-Jørgensen
  1 sibling, 1 reply; 48+ messages in thread
From: ™֟☻̭҇ Ѽ ҉ ® @ 2020-10-27 17:44 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen
  Cc: linux-pci, linux-arm-kernel, Rob Herring, Ilias Apalodimas,
	vtolkm, Bjorn Helgaas

On 27/10/2020 18:20, Bjorn Helgaas wrote:
> [+cc vtolkm]
>
> On Tue, Oct 27, 2020 at 04:43:20PM +0100, Toke Høiland-Jørgensen wrote:
>> Hi everyone
>>
>> I'm trying to get a mainline kernel to run on my Turris Omnia, and am
>> having some trouble getting the PCI bus to work correctly. Specifically,
>> I'm running a 5.10-rc1 kernel (torvalds/master as of this moment), with
>> the resource request fix[0] applied on top.
>>
>> The kernel boots fine, and the patch in [0] makes the PCI devices show
>> up. But I'm still getting initialisation errors like these:
>>
>> [    1.632709] pci 0000:01:00.0: BAR 0: error updating (0xe0000004 != 0xffffffff)
>> [    1.632714] pci 0000:01:00.0: BAR 0: error updating (high 0x000000 != 0xffffffff)
>> [    1.632745] pci 0000:02:00.0: BAR 0: error updating (0xe0200004 != 0xffffffff)
>> [    1.632750] pci 0000:02:00.0: BAR 0: error updating (high 0x000000 != 0xffffffff)
>>
>> and the WiFi drivers fail to initialise with what appears to me to be
>> errors related to the bus rather than to the drivers themselves:
>>
>> [    3.509878] ath: phy0: Mac Chip Rev 0xfffc0.f is not supported by this driver
>> [    3.517049] ath: phy0: Unable to initialize hardware; initialization status: -95
>> [    3.524473] ath9k 0000:01:00.0: Failed to initialize device
>> [    3.530081] ath9k: probe of 0000:01:00.0 failed with error -95
>> [    3.536012] ath10k_pci 0000:02:00.0: of_irq_parse_pci: failed with rc=134
>> [    3.543049] pci 0000:00:02.0: enabling device (0140 -> 0142)
>> [    3.548735] ath10k_pci 0000:02:00.0: can't change power state from D3hot to D0 (config space inaccessible)
>> [    3.588592] ath10k_pci 0000:02:00.0: failed to wake up device : -110
>> [    3.595098] ath10k_pci: probe of 0000:02:00.0 failed with error -110
>>
>> lspci looks OK, though:
>>
>> # lspci
>> 00:01.0 PCI bridge: Marvell Technology Group Ltd. Device 6820 (rev 04)
>> 00:02.0 PCI bridge: Marvell Technology Group Ltd. Device 6820 (rev 04)
>> 00:03.0 PCI bridge: Marvell Technology Group Ltd. Device 6820 (rev 04)
>> 01:00.0 Network controller: Qualcomm Atheros AR9287 Wireless Network Adapter (PCI-Express) (rev 01)
>> 02:00.0 Network controller: Qualcomm Atheros QCA986x/988x 802.11ac Wireless Network Adapter (rev ff)
>>
>> Does anyone have any clue what could be going on here? Is this a bug, or
>> did I miss something in my config or other initialisation? I've tried
>> with both the stock u-boot distributed with the board, and with an
>> upstream u-boot from latest master; doesn't seem to make any difference.
> Can you try turning off CONFIG_PCIEASPM?  We had a similar recent
> report at https://bugzilla.kernel.org/show_bug.cgi?id=209833 but I
> don't think we have a fix yet.
>

Got the same device working with 5.10.0-rc1-next-20201027-to-dirty, but with
ASPM turned off, as mentioned in the cited bug report.
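
For anyone checking their own build, here is a minimal sketch (not part of the original report) of reading a kernel `.config` to see whether `CONFIG_PCIEASPM` is enabled. The helper name `kconfig_state` and the sample config text are made up for illustration; the only convention assumed is that Kconfig records a disabled symbol as a `# CONFIG_FOO is not set` comment.

```python
import re


def kconfig_state(config_text: str, symbol: str) -> str:
    """Return the value assigned to a Kconfig symbol, or 'unset'."""
    for line in config_text.splitlines():
        line = line.strip()
        # Kconfig writes disabled symbols as a comment line.
        if line == f"# {symbol} is not set":
            return "unset"
        match = re.fullmatch(rf"{re.escape(symbol)}=(.*)", line)
        if match:
            return match.group(1)
    # Symbol does not appear in the file at all.
    return "unset"


# Hypothetical .config excerpt, for illustration only.
sample = """\
CONFIG_PCI=y
# CONFIG_PCIEASPM is not set
CONFIG_PCI_MVEBU=y
"""

print(kconfig_state(sample, "CONFIG_PCIEASPM"))
```

With a real tree, the same function could be fed the contents of the build directory's `.config` instead of the sample string.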


  dmesg | grep ath

ath10k_pci 0000:02:00.0: enabling device (0140 -> 0142)
ath10k_pci 0000:02:00.0: pci irq msi oper_irq_mode 2 irq_mode 0 reset_mode 0
ath10k_pci 0000:02:00.0: qca988x hw2.0 target 0x4100016c chip_id 0x043202ff sub 0000:0000
ath9k 0000:03:00.0: enabling device (0140 -> 0142)
ath10k_pci 0000:02:00.0: kconfig debug 1 debugfs 0 tracing 1 dfs 0 testmode 0
ath10k_pci 0000:02:00.0: firmware ver 10.2.4-1.0-00047 api 5 features no-p2p,raw-mode,mfp,allows-mesh-bcast crc32 35bd9258
ath: EEPROM regdomain sanitized
ath: EEPROM regdomain: 0x64
ath: EEPROM indicates we should expect a direct regpair map
ath: Country alpha2 being used: 00
ath: Regpair used: 0x64
ath10k_pci 0000:02:00.0: board_file api 1 bmi_id N/A crc32 bebc7c08
ath10k_pci 0000:02:00.0: htt-ver 2.1 wmi-op 5 htt-op 2 cal otp max-sta 128 raw 0 hwcrypto 1
ath: EEPROM regdomain sanitized
ath: EEPROM regdomain: 0x64
ath: EEPROM indicates we should expect a direct regpair map
ath: Country alpha2 being used: 00
ath: Regpair used: 0x64
ath10k_pci 0000:02:00.0: pdev param 0 not supported by firmware

----

Note: related issues (workaround: compile ath and cfg80211 as modules):

(1) https://bugzilla.kernel.org/show_bug.cgi?id=209863
(2) https://bugzilla.kernel.org/show_bug.cgi?id=209855
(3) https://bugzilla.kernel.org/show_bug.cgi?id=209853

* Re: PCI trouble on mvebu (Turris Omnia)
  2020-10-27 15:43 PCI trouble on mvebu (Turris Omnia) Toke Høiland-Jørgensen
  2020-10-27 17:20 ` Bjorn Helgaas
@ 2020-10-27 18:03 ` Marek Behun
  2020-10-27 19:00   ` Toke Høiland-Jørgensen
  1 sibling, 1 reply; 48+ messages in thread
From: Marek Behun @ 2020-10-27 18:03 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen
  Cc: linux-pci, linux-arm-kernel, Rob Herring, Ilias Apalodimas

Are you using stock U-Boot in the Omnia?

Marek

On Tue, 27 Oct 2020 16:43:20 +0100
Toke Høiland-Jørgensen <toke@redhat.com> wrote:

> Hi everyone
> 
> I'm trying to get a mainline kernel to run on my Turris Omnia, and am
> having some trouble getting the PCI bus to work correctly. Specifically,
> I'm running a 5.10-rc1 kernel (torvalds/master as of this moment), with
> the resource request fix[0] applied on top.
> 
> The kernel boots fine, and the patch in [0] makes the PCI devices show
> up. But I'm still getting initialisation errors like these:
> 
> [    1.632709] pci 0000:01:00.0: BAR 0: error updating (0xe0000004 != 0xffffffff)
> [    1.632714] pci 0000:01:00.0: BAR 0: error updating (high 0x000000 != 0xffffffff)
> [    1.632745] pci 0000:02:00.0: BAR 0: error updating (0xe0200004 != 0xffffffff)
> [    1.632750] pci 0000:02:00.0: BAR 0: error updating (high 0x000000 != 0xffffffff)
> 
> and the WiFi drivers fail to initialise with what appears to me to be
> errors related to the bus rather than to the drivers themselves:
> 
> [    3.509878] ath: phy0: Mac Chip Rev 0xfffc0.f is not supported by this driver
> [    3.517049] ath: phy0: Unable to initialize hardware; initialization status: -95
> [    3.524473] ath9k 0000:01:00.0: Failed to initialize device
> [    3.530081] ath9k: probe of 0000:01:00.0 failed with error -95
> [    3.536012] ath10k_pci 0000:02:00.0: of_irq_parse_pci: failed with rc=134
> [    3.543049] pci 0000:00:02.0: enabling device (0140 -> 0142)
> [    3.548735] ath10k_pci 0000:02:00.0: can't change power state from D3hot to D0 (config space inaccessible)
> [    3.588592] ath10k_pci 0000:02:00.0: failed to wake up device : -110
> [    3.595098] ath10k_pci: probe of 0000:02:00.0 failed with error -110
> 
> lspci looks OK, though:
> 
> # lspci
> 00:01.0 PCI bridge: Marvell Technology Group Ltd. Device 6820 (rev 04)
> 00:02.0 PCI bridge: Marvell Technology Group Ltd. Device 6820 (rev 04)
> 00:03.0 PCI bridge: Marvell Technology Group Ltd. Device 6820 (rev 04)
> 01:00.0 Network controller: Qualcomm Atheros AR9287 Wireless Network Adapter (PCI-Express) (rev 01)
> 02:00.0 Network controller: Qualcomm Atheros QCA986x/988x 802.11ac Wireless Network Adapter (rev ff)
> 
> Does anyone have any clue what could be going on here? Is this a bug, or
> did I miss something in my config or other initialisation? I've tried
> with both the stock u-boot distributed with the board, and with an
> upstream u-boot from latest master; doesn't seem to make any difference.
> 
> Any pointers will be greatly appreciated!
> 
> Thanks,
> 
> -Toke
> 
> 
> [0] https://lore.kernel.org/linux-pci/20201023145252.2691779-1-robh@kernel.org/

* Re: PCI trouble on mvebu (Turris Omnia)
  2020-10-27 17:20 ` Bjorn Helgaas
  2020-10-27 17:44   ` ™֟☻̭҇ Ѽ ҉ ®
@ 2020-10-27 18:56   ` Toke Høiland-Jørgensen
  2020-10-28 13:36     ` Toke Høiland-Jørgensen
  1 sibling, 1 reply; 48+ messages in thread
From: Toke Høiland-Jørgensen @ 2020-10-27 18:56 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: linux-pci, linux-arm-kernel, Rob Herring, Ilias Apalodimas, vtolkm

Bjorn Helgaas <helgaas@kernel.org> writes:

> [+cc vtolkm]
>
> On Tue, Oct 27, 2020 at 04:43:20PM +0100, Toke Høiland-Jørgensen wrote:
>> Hi everyone
>> 
>> I'm trying to get a mainline kernel to run on my Turris Omnia, and am
>> having some trouble getting the PCI bus to work correctly. Specifically,
>> I'm running a 5.10-rc1 kernel (torvalds/master as of this moment), with
>> the resource request fix[0] applied on top.
>> 
>> The kernel boots fine, and the patch in [0] makes the PCI devices show
>> up. But I'm still getting initialisation errors like these:
>> 
>> [    1.632709] pci 0000:01:00.0: BAR 0: error updating (0xe0000004 != 0xffffffff)
>> [    1.632714] pci 0000:01:00.0: BAR 0: error updating (high 0x000000 != 0xffffffff)
>> [    1.632745] pci 0000:02:00.0: BAR 0: error updating (0xe0200004 != 0xffffffff)
>> [    1.632750] pci 0000:02:00.0: BAR 0: error updating (high 0x000000 != 0xffffffff)
>> 
>> and the WiFi drivers fail to initialise with what appears to me to be
>> errors related to the bus rather than to the drivers themselves:
>> 
>> [    3.509878] ath: phy0: Mac Chip Rev 0xfffc0.f is not supported by this driver
>> [    3.517049] ath: phy0: Unable to initialize hardware; initialization status: -95
>> [    3.524473] ath9k 0000:01:00.0: Failed to initialize device
>> [    3.530081] ath9k: probe of 0000:01:00.0 failed with error -95
>> [    3.536012] ath10k_pci 0000:02:00.0: of_irq_parse_pci: failed with rc=134
>> [    3.543049] pci 0000:00:02.0: enabling device (0140 -> 0142)
>> [    3.548735] ath10k_pci 0000:02:00.0: can't change power state from D3hot to D0 (config space inaccessible)
>> [    3.588592] ath10k_pci 0000:02:00.0: failed to wake up device : -110
>> [    3.595098] ath10k_pci: probe of 0000:02:00.0 failed with error -110
>> 
>> lspci looks OK, though:
>> 
>> # lspci
>> 00:01.0 PCI bridge: Marvell Technology Group Ltd. Device 6820 (rev 04)
>> 00:02.0 PCI bridge: Marvell Technology Group Ltd. Device 6820 (rev 04)
>> 00:03.0 PCI bridge: Marvell Technology Group Ltd. Device 6820 (rev 04)
>> 01:00.0 Network controller: Qualcomm Atheros AR9287 Wireless Network Adapter (PCI-Express) (rev 01)
>> 02:00.0 Network controller: Qualcomm Atheros QCA986x/988x 802.11ac Wireless Network Adapter (rev ff)
>> 
>> Does anyone have any clue what could be going on here? Is this a bug, or
>> did I miss something in my config or other initialisation? I've tried
>> with both the stock u-boot distributed with the board, and with an
>> upstream u-boot from latest master; doesn't seem to make any difference.
>
> Can you try turning off CONFIG_PCIEASPM?  We had a similar recent
> report at https://bugzilla.kernel.org/show_bug.cgi?id=209833 but I
> don't think we have a fix yet.

Yes! Turning that off does indeed help! Thanks a bunch :)

You mention that bisecting this would be helpful - I can try that
tomorrow; any idea when this was last working?

-Toke


* Re: PCI trouble on mvebu (Turris Omnia)
  2020-10-27 17:44   ` ™֟☻̭҇ Ѽ ҉ ®
@ 2020-10-27 18:59     ` Toke Høiland-Jørgensen
  2020-10-27 20:20       ` Toke Høiland-Jørgensen
  0 siblings, 1 reply; 48+ messages in thread
From: Toke Høiland-Jørgensen @ 2020-10-27 18:59 UTC (permalink / raw)
  To: vtolkm
  Cc: linux-pci, linux-arm-kernel, Rob Herring, Ilias Apalodimas,
	vtolkm, Bjorn Helgaas

"™֟☻̭҇ Ѽ ҉ ®" <vtolkm@googlemail.com> writes:

> On 27/10/2020 18:20, Bjorn Helgaas wrote:
>> [+cc vtolkm]
>>
>> On Tue, Oct 27, 2020 at 04:43:20PM +0100, Toke Høiland-Jørgensen wrote:
>>> Hi everyone
>>>
>>> I'm trying to get a mainline kernel to run on my Turris Omnia, and am
>>> having some trouble getting the PCI bus to work correctly. Specifically,
>>> I'm running a 5.10-rc1 kernel (torvalds/master as of this moment), with
>>> the resource request fix[0] applied on top.
>>>
>>> The kernel boots fine, and the patch in [0] makes the PCI devices show
>>> up. But I'm still getting initialisation errors like these:
>>>
>>> [    1.632709] pci 0000:01:00.0: BAR 0: error updating (0xe0000004 != 0xffffffff)
>>> [    1.632714] pci 0000:01:00.0: BAR 0: error updating (high 0x000000 != 0xffffffff)
>>> [    1.632745] pci 0000:02:00.0: BAR 0: error updating (0xe0200004 != 0xffffffff)
>>> [    1.632750] pci 0000:02:00.0: BAR 0: error updating (high 0x000000 != 0xffffffff)
>>>
>>> and the WiFi drivers fail to initialise with what appears to me to be
>>> errors related to the bus rather than to the drivers themselves:
>>>
>>> [    3.509878] ath: phy0: Mac Chip Rev 0xfffc0.f is not supported by this driver
>>> [    3.517049] ath: phy0: Unable to initialize hardware; initialization status: -95
>>> [    3.524473] ath9k 0000:01:00.0: Failed to initialize device
>>> [    3.530081] ath9k: probe of 0000:01:00.0 failed with error -95
>>> [    3.536012] ath10k_pci 0000:02:00.0: of_irq_parse_pci: failed with rc=134
>>> [    3.543049] pci 0000:00:02.0: enabling device (0140 -> 0142)
>>> [    3.548735] ath10k_pci 0000:02:00.0: can't change power state from D3hot to D0 (config space inaccessible)
>>> [    3.588592] ath10k_pci 0000:02:00.0: failed to wake up device : -110
>>> [    3.595098] ath10k_pci: probe of 0000:02:00.0 failed with error -110
>>>
>>> lspci looks OK, though:
>>>
>>> # lspci
>>> 00:01.0 PCI bridge: Marvell Technology Group Ltd. Device 6820 (rev 04)
>>> 00:02.0 PCI bridge: Marvell Technology Group Ltd. Device 6820 (rev 04)
>>> 00:03.0 PCI bridge: Marvell Technology Group Ltd. Device 6820 (rev 04)
>>> 01:00.0 Network controller: Qualcomm Atheros AR9287 Wireless Network Adapter (PCI-Express) (rev 01)
>>> 02:00.0 Network controller: Qualcomm Atheros QCA986x/988x 802.11ac Wireless Network Adapter (rev ff)
>>>
>>> Does anyone have any clue what could be going on here? Is this a bug, or
>>> did I miss something in my config or other initialisation? I've tried
>>> with both the stock u-boot distributed with the board, and with an
>>> upstream u-boot from latest master; doesn't seem to make any difference.
>> Can you try turning off CONFIG_PCIEASPM?  We had a similar recent
>> report at https://bugzilla.kernel.org/show_bug.cgi?id=209833 but I
>> don't think we have a fix yet.
>>
>
> Got the same device working with > 5.10.0-rc1-next-20201027-to-dirty < 
> but ASPM turned off, as mentioned in the cited bug report.

Yup, indeed that helped!

> Note: related issues - workaround compile ath and cfg80211 as modules
>
> (1) https://bugzilla.kernel.org/show_bug.cgi?id=209863
> (2) https://bugzilla.kernel.org/show_bug.cgi?id=209855
> (3) https://bugzilla.kernel.org/show_bug.cgi?id=209853

Yeah, I had noticed the regdb failure but put off debugging that until
the PCI issue was resolved. So I guess that's next on my list; thanks for
the pointer. I'd rather avoid the module approach, though, as booting
the kernel directly from my build box over TFTP is quite convenient.
Let's see if there isn't another way to fix this.

-Toke


* Re: PCI trouble on mvebu (Turris Omnia)
  2020-10-27 18:03 ` Marek Behun
@ 2020-10-27 19:00   ` Toke Høiland-Jørgensen
  2020-10-27 20:19     ` Marek Behun
  0 siblings, 1 reply; 48+ messages in thread
From: Toke Høiland-Jørgensen @ 2020-10-27 19:00 UTC (permalink / raw)
  To: Marek Behun; +Cc: linux-pci, linux-arm-kernel, Rob Herring, Ilias Apalodimas

Marek Behun <marek.behun@nic.cz> writes:

> Are you using stock U-Boot in the Omnia?

I've tried both that and the latest upstream - didn't make a difference
wrt the PCI issue. Only difference I've noticed other than that (apart
from being able to turn more things on when using upstream) is that the
upstream u-boot can't seem to find the eMMC chip on the Omnia. Any idea
why? It doesn't matter right now since I'm just tftp-booting, but it
would be kinda nice to get that fixed as well :)

-Toke


* Re: PCI trouble on mvebu (Turris Omnia)
  2020-10-27 19:00   ` Toke Høiland-Jørgensen
@ 2020-10-27 20:19     ` Marek Behun
  2020-10-27 20:49       ` Toke Høiland-Jørgensen
  0 siblings, 1 reply; 48+ messages in thread
From: Marek Behun @ 2020-10-27 20:19 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen
  Cc: linux-pci, linux-arm-kernel, Rob Herring, Ilias Apalodimas

On Tue, 27 Oct 2020 20:00:58 +0100
Toke Høiland-Jørgensen <toke@redhat.com> wrote:

> Marek Behun <marek.behun@nic.cz> writes:
> 
> > Are you using stock U-Boot in the Omnia?  
> 
> I've tried both that and the latest upstream - didn't make a difference
> wrt the PCI issue. Only difference I've noticed other than that (apart
> from being able to turn more things on when using upstream) is that the
> upstream u-boot can't seem to find the eMMC chip on the Omnia. Any idea
> why? It doesn't matter right now since I'm just tftp-booting, but it
> would be kinda nice to get that fixed as well :)
> 
> -Toke
> 

No idea, I will have to look into that.

* Re: PCI trouble on mvebu (Turris Omnia)
  2020-10-27 18:59     ` Toke Høiland-Jørgensen
@ 2020-10-27 20:20       ` Toke Høiland-Jørgensen
  2020-10-27 21:22         ` ™֟☻̭҇ Ѽ ҉ ®
  0 siblings, 1 reply; 48+ messages in thread
From: Toke Høiland-Jørgensen @ 2020-10-27 20:20 UTC (permalink / raw)
  To: vtolkm
  Cc: linux-pci, linux-arm-kernel, Rob Herring, Ilias Apalodimas,
	vtolkm, Bjorn Helgaas

Toke Høiland-Jørgensen <toke@redhat.com> writes:

>> Note: related issues - workaround compile ath and cfg80211 as modules
>>
>> (1) https://bugzilla.kernel.org/show_bug.cgi?id=209863
>> (2) https://bugzilla.kernel.org/show_bug.cgi?id=209855
>> (3) https://bugzilla.kernel.org/show_bug.cgi?id=209853
>
> Yeah, I had noticed the regdb failure but put off debugging that until
> the PCI issue was resolved. So guess that's next on my list - thanks for
> the pointer (although I'd rather avoid the module approach as booting
> the kernel directly from my build box over tftp is quite convenient...
> Let's see if there isn't another way to fix this)

To follow up on this, everything seems to work just fine (ath10k init at
boot + regulatory db load) if I simply set:

CONFIG_EXTRA_FIRMWARE="ath10k/QCA988X/hw2.0/board.bin ath10k/QCA988X/hw2.0/firmware-5.bin regulatory.db regulatory.db.p7s"
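
Since a built-in firmware list only helps if every named blob is present at build time, here is a rough sketch (an illustration, not something from the original setup) that checks which entries of a `CONFIG_EXTRA_FIRMWARE` string are missing from a firmware directory. It assumes the entries are paths relative to the directory given by `CONFIG_EXTRA_FIRMWARE_DIR` (which defaults to `/lib/firmware`); the function name `missing_firmware` and the temporary demo directory are hypothetical.

```python
import tempfile
from pathlib import Path
from typing import List


def missing_firmware(extra_firmware: str, firmware_dir: Path) -> List[str]:
    """Return the space-separated entries that are not files under firmware_dir."""
    names = extra_firmware.split()
    return [name for name in names if not (firmware_dir / name).is_file()]


# The value quoted in the message above.
EXTRA_FIRMWARE = (
    "ath10k/QCA988X/hw2.0/board.bin ath10k/QCA988X/hw2.0/firmware-5.bin "
    "regulatory.db regulatory.db.p7s"
)

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # Demo only: create an empty regulatory.db so one entry is found.
    (root / "regulatory.db").write_bytes(b"")
    for name in missing_firmware(EXTRA_FIRMWARE, root):
        print("missing:", name)
```

On a real build box, `root` would instead point at the actual firmware directory used by the kernel build.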

-Toke


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: PCI trouble on mvebu (Turris Omnia)
  2020-10-27 20:19     ` Marek Behun
@ 2020-10-27 20:49       ` Toke Høiland-Jørgensen
  0 siblings, 0 replies; 48+ messages in thread
From: Toke Høiland-Jørgensen @ 2020-10-27 20:49 UTC (permalink / raw)
  To: Marek Behun; +Cc: linux-pci, linux-arm-kernel, Rob Herring, Ilias Apalodimas

Marek Behun <marek.behun@nic.cz> writes:

> On Tue, 27 Oct 2020 20:00:58 +0100
> Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>
>> Marek Behun <marek.behun@nic.cz> writes:
>> 
>> > Are you using stock U-Boot in the Omnia?  
>> 
>> I've tried both that and the latest upstream - didn't make a difference
>> wrt the PCI issue. Only difference I've noticed other than that (apart
>> from being able to turn more things on when using upstream) is that the
>> upstream u-boot can't seem to find the eMMC chip on the Omnia. Any idea
>> why? It doesn't matter right now since I'm just tftp-booting, but it
>> would be kinda nice to get that fixed as well :)
>> 
>> -Toke
>> 
>
> No idea, I will have to look into that.

Please do! Would be awesome to get it working :)

-Toke


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: PCI trouble on mvebu (Turris Omnia)
  2020-10-27 20:20       ` Toke Høiland-Jørgensen
@ 2020-10-27 21:22         ` ™֟☻̭҇ Ѽ ҉ ®
  2020-10-27 21:31           ` Toke Høiland-Jørgensen
  0 siblings, 1 reply; 48+ messages in thread
From: ™֟☻̭҇ Ѽ ҉ ® @ 2020-10-27 21:22 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen
  Cc: linux-pci, linux-arm-kernel, Rob Herring, Ilias Apalodimas,
	Bjorn Helgaas


On 27/10/2020 21:20, Toke Høiland-Jørgensen wrote:
> Toke Høiland-Jørgensen <toke@redhat.com> writes:
>
>>> Note: related issues - workaround compile ath and cfg80211 as modules
>>>
>>> (1) https://bugzilla.kernel.org/show_bug.cgi?id=209863
>>> (2) https://bugzilla.kernel.org/show_bug.cgi?id=209855
>>> (3) https://bugzilla.kernel.org/show_bug.cgi?id=209853
>> Yeah, I had noticed the regdb failure but put off debugging that until
>> the PCI issue was resolved. So guess that's next on my list - thanks for
>> the pointer (although I'd rather avoid the module approach as booting
>> the kernel directly from my build box over tftp is quite convenient...
>> Let's see if there isn't another way to fix this)
> To follow up on this, everything seems to work just fine (ath10k init at
> boot + regulatory db load) if I simply set:
>
> CONFIG_EXTRA_FIRMWARE="ath10k/QCA988X/hw2.0/board.bin ath10k/QCA988X/hw2.0/firmware-5.bin regulatory.db regulatory.db.p7s"
>
> -Toke
>

On my node that works only for the regulatory files, not for the ath10k
firmware, with this kconfig:

  Symbol: EXTRA_FIRMWARE_DIR [=/srv/fw]
  Type  : string
  Defined at drivers/base/firmware_loader/Kconfig:63
    Prompt: Firmware blobs root directory
    Depends on: FW_LOADER [=y] && EXTRA_FIRMWARE [=regulatory.db regulatory.db.p7s board.bin firmware-5.bin]!=
    Location:
     -> Device Drivers
       -> Generic Driver Options
         -> Firmware loader
           -> Firmware loading facility (FW_LOADER [=y])
             -> Build named firmware blobs into the kernel binary (EXTRA_FIRMWARE [=regulatory.db regulatory.db.p7s board.bin firmware-5.bin])

But that is off-topic for this thread anyway, and a bug has been lodged at
https://bugzilla.kernel.org/show_bug.cgi?id=209855



^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: PCI trouble on mvebu (Turris Omnia)
  2020-10-27 21:22         ` ™֟☻̭҇ Ѽ ҉ ®
@ 2020-10-27 21:31           ` Toke Høiland-Jørgensen
  2020-10-27 22:01             ` ™֟☻̭҇ Ѽ ҉ ®
  0 siblings, 1 reply; 48+ messages in thread
From: Toke Høiland-Jørgensen @ 2020-10-27 21:31 UTC (permalink / raw)
  To: vtolkm
  Cc: linux-pci, linux-arm-kernel, Rob Herring, Ilias Apalodimas,
	Bjorn Helgaas

"™֟☻̭҇ Ѽ ҉ ®" <vtolkm@googlemail.com> writes:

> On 27/10/2020 21:20, Toke Høiland-Jørgensen wrote:
>> Toke Høiland-Jørgensen <toke@redhat.com> writes:
>>
>>>> Note: related issues - workaround compile ath and cfg80211 as modules
>>>>
>>>> (1) https://bugzilla.kernel.org/show_bug.cgi?id=209863
>>>> (2) https://bugzilla.kernel.org/show_bug.cgi?id=209855
>>>> (3) https://bugzilla.kernel.org/show_bug.cgi?id=209853
>>> Yeah, I had noticed the regdb failure but put off debugging that until
>>> the PCI issue was resolved. So guess that's next on my list - thanks for
>>> the pointer (although I'd rather avoid the module approach as booting
>>> the kernel directly from my build box over tftp is quite convenient...
>>> Let's see if there isn't another way to fix this)
>> To follow up on this, everything seems to work just fine (ath10k init at
>> boot + regulatory db load) if I simply set:
>>
>> CONFIG_EXTRA_FIRMWARE="ath10k/QCA988X/hw2.0/board.bin ath10k/QCA988X/hw2.0/firmware-5.bin regulatory.db regulatory.db.p7s"
>>
>> -Toke
>>
>
> That works on my node only for the regulatory files but not the ath10 
> firmware with kconfig:
>
>   Symbol: EXTRA_FIRMWARE_DIR [=/srv/fw]
>   Type  : string
>   Defined at drivers/base/firmware_loader/Kconfig:63
>     Prompt: Firmware blobs root directory
>     Depends on: FW_LOADER [=y] && EXTRA_FIRMWARE [=regulatory.db 
> regulatory.db.p7s board.bin firmware-5.bin]!=
>     Location:
>      -> Device Drivers
>        -> Generic Driver Options
>          -> Firmware loader
>            -> Firmware loading facility (FW_LOADER [=y])
>              -> Build named firmware blobs into the kernel binary 
> (EXTRA_FIRMWARE [=regulatory.db regulatory.db.p7s board.bin 
> firmware-5.bin])

I think that's because you're missing the path prefix
(ath10k/QCA988X/hw2.0/) from board.bin and firmware-5.bin?
request_firmware() uses the full path...
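
A quick way to spot this before building is to check every entry against
the firmware root directory. This is only a rough sketch: the function
name missing_firmware is made up for illustration, and it assumes the
entries are space-separated paths relative to CONFIG_EXTRA_FIRMWARE_DIR
(/srv/fw in your case).

```shell
#!/bin/sh
# Hypothetical helper: print each CONFIG_EXTRA_FIRMWARE entry that does
# not exist as a regular file under the firmware root directory.
# Usage: missing_firmware <fw_dir> "<space-separated entries>"
missing_firmware() {
    fw_dir=$1
    # Entries are split on whitespace; each one is relative to fw_dir.
    for entry in $2; do
        [ -f "$fw_dir/$entry" ] || printf '%s\n' "$entry"
    done
}
```

Calling it as missing_firmware /srv/fw "ath10k/QCA988X/hw2.0/board.bin
regulatory.db" prints nothing if both files are where the build expects
them, and prints the offending entries otherwise.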

-Toke


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: PCI trouble on mvebu (Turris Omnia)
  2020-10-27 21:31           ` Toke Høiland-Jørgensen
@ 2020-10-27 22:01             ` ™֟☻̭҇ Ѽ ҉ ®
  2020-10-27 22:12               ` Toke Høiland-Jørgensen
  0 siblings, 1 reply; 48+ messages in thread
From: ™֟☻̭҇ Ѽ ҉ ® @ 2020-10-27 22:01 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen
  Cc: linux-pci, linux-arm-kernel, Rob Herring, Ilias Apalodimas,
	Bjorn Helgaas


On 27/10/2020 22:31, Toke Høiland-Jørgensen wrote:
>>> To follow up on this, everything seems to work just fine (ath10k init at
>>> boot + regulatory db load) if I simply set:
>>>
>>> CONFIG_EXTRA_FIRMWARE="ath10k/QCA988X/hw2.0/board.bin ath10k/QCA988X/hw2.0/firmware-5.bin regulatory.db regulatory.db.p7s"
>>>
>>> -Toke
>>>
>> That works on my node only for the regulatory files but not the ath10
>> firmware with kconfig:
>>
>>    Symbol: EXTRA_FIRMWARE_DIR [=/srv/fw]
>>    Type  : string
>>    Defined at drivers/base/firmware_loader/Kconfig:63
>>      Prompt: Firmware blobs root directory
>>      Depends on: FW_LOADER [=y] && EXTRA_FIRMWARE [=regulatory.db
>> regulatory.db.p7s board.bin firmware-5.bin]!=
>>      Location:
>>       -> Device Drivers
>>         -> Generic Driver Options
>>           -> Firmware loader
>>             -> Firmware loading facility (FW_LOADER [=y])
>>               -> Build named firmware blobs into the kernel binary
>> (EXTRA_FIRMWARE [=regulatory.db regulatory.db.p7s board.bin
>> firmware-5.bin])
> I think that's because you're missing the path prefix
> (ath10k/QCA988X/hw2.0/) from board.bin and firmware-5.bin?
> request_firmware() uses the full path...
>
> -Toke

Well, it would be strange to have to specify the path prefix for
built-in firmware, considering:

  CONFIG_FW_LOADER:

  This enables the firmware loading facility in the kernel. The kernel
  will first look for built-in firmware, if it has any. Next, it will
  look for the requested firmware in a series of filesystem paths:

        o firmware_class path module parameter or kernel boot param
        o /lib/firmware/updates/UTS_RELEASE
        o /lib/firmware/updates
        o /lib/firmware/UTS_RELEASE
        o /lib/firmware

----

Nevertheless, I tried the same path prefix as in your kconfig, but the
compilation fails, which does not surprise me since the ath10k blobs are
not located at that path:

   UPD     drivers/base/firmware_loader/builtin/regulatory.db.gen.S
   UPD drivers/base/firmware_loader/builtin/regulatory.db.p7s.gen.S
make[4]: *** No rule to make target '/srv/fw/ath10k/QCA988X/hw2.0/board.bin', needed by 'drivers/base/firmware_loader/builtin/ath10k/QCA988X/hw2.0/board.bin.gen.o'. Stop.
make[4]: *** Waiting for unfinished jobs....
   UPD drivers/base/firmware_loader/builtin/ath10k/QCA988X/hw2.0/board.bin.gen.S
make[3]: *** [scripts/Makefile.build:500: drivers/base/firmware_loader/builtin] Error 2
make[2]: *** [scripts/Makefile.build:500: drivers/base/firmware_loader] Error 2
make[1]: *** [scripts/Makefile.build:500: drivers/base] Error 2
make[1]: *** Waiting for unfinished jobs....
make: *** [Makefile:1799: drivers] Error 2
make: *** Waiting for unfinished jobs....
I suspect that, since you are booting the kernel directly from your build
box over tftp, it accesses the ath10k firmware blobs on the build box.





^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: PCI trouble on mvebu (Turris Omnia)
  2020-10-27 22:01             ` ™֟☻̭҇ Ѽ ҉ ®
@ 2020-10-27 22:12               ` Toke Høiland-Jørgensen
  0 siblings, 0 replies; 48+ messages in thread
From: Toke Høiland-Jørgensen @ 2020-10-27 22:12 UTC (permalink / raw)
  To: vtolkm
  Cc: linux-pci, linux-arm-kernel, Rob Herring, Ilias Apalodimas,
	Bjorn Helgaas

"™֟☻̭҇ Ѽ ҉ ®" <vtolkm@googlemail.com> writes:

> On 27/10/2020 22:31, Toke Høiland-Jørgensen wrote:
>>>> To follow up on this, everything seems to work just fine (ath10k init at
>>>> boot + regulatory db load) if I simply set:
>>>>
>>>> CONFIG_EXTRA_FIRMWARE="ath10k/QCA988X/hw2.0/board.bin ath10k/QCA988X/hw2.0/firmware-5.bin regulatory.db regulatory.db.p7s"
>>>>
>>>> -Toke
>>>>
>>> That works on my node only for the regulatory files but not the ath10
>>> firmware with kconfig:
>>>
>>>    Symbol: EXTRA_FIRMWARE_DIR [=/srv/fw]
>>>    Type  : string
>>>    Defined at drivers/base/firmware_loader/Kconfig:63
>>>      Prompt: Firmware blobs root directory
>>>      Depends on: FW_LOADER [=y] && EXTRA_FIRMWARE [=regulatory.db
>>> regulatory.db.p7s board.bin firmware-5.bin]!=
>>>      Location:
>>>       -> Device Drivers
>>>         -> Generic Driver Options
>>>           -> Firmware loader
>>>             -> Firmware loading facility (FW_LOADER [=y])
>>>               -> Build named firmware blobs into the kernel binary
>>> (EXTRA_FIRMWARE [=regulatory.db regulatory.db.p7s board.bin
>>> firmware-5.bin])
>> I think that's because you're missing the path prefix
>> (ath10k/QCA988X/hw2.0/) from board.bin and firmware-5.bin?
>> request_firmware() uses the full path...
>>
>> -Toke
>
> Well, that would be weird/strange having to specify the path prefix for 
> build-in firmware,considering:
>
>   CONFIG_FW_LOADER:
>
>   This enables the firmware loading facility in the kernel. The kernel
>   will first look for built-in firmware, if it has any. Next, it will
>   look for the requested firmware in a series of filesystem paths:
>
>         o firmware_class path module parameter or kernel boot param
>         o /lib/firmware/updates/UTS_RELEASE
>         o /lib/firmware/updates
>         o /lib/firmware/UTS_RELEASE
>         o /lib/firmware

Why would that be weird? The driver is requesting firmware with a path
prefix, so the firmware location has to match... Doesn't matter if it's
in the filesystem or builtin.
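
To illustrate the matching, here is a toy model of the lookup. It is a
sketch only, not the kernel's code: builtin_has is a made-up name, and it
assumes (as far as I understand the firmware loader) that a built-in blob
is found by comparing the requested name with the stored name as whole
strings.

```shell
#!/bin/sh
# Toy model of a built-in firmware lookup (hypothetical helper, not the
# kernel implementation): succeed only if the requested name is
# identical to one of the built-in entries.
# Usage: builtin_has <requested name> "<space-separated built-in entries>"
builtin_has() {
    for entry in $2; do
        # Whole-string comparison: a bare "board.bin" does not satisfy a
        # request for "ath10k/QCA988X/hw2.0/board.bin".
        [ "$entry" = "$1" ] && return 0
    done
    return 1
}
```

With your current list, builtin_has ath10k/QCA988X/hw2.0/board.bin
"regulatory.db regulatory.db.p7s board.bin firmware-5.bin" fails, while
the same request against a list containing the prefixed name succeeds.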

> ----
>
> Nevertheless, I tried with same path prefix as per your kconfig but the 
> compilation fails, which I am not surprised since the ath10 blobs are 
> not located at that path

Well you'd need to fix that :)

>    UPD     drivers/base/firmware_loader/builtin/regulatory.db.gen.S
>    UPD drivers/base/firmware_loader/builtin/regulatory.db.p7s.gen.S
> make[4]: *** No rule to make target 
> '/srv/fw/ath10k/QCA988X/hw2.0/board.bin', needed by

Based on that error message, you'd need to do something like:

mkdir -p /srv/fw/ath10k/QCA988X/hw2.0
mv /srv/fw/{board.bin,firmware-5.bin} /srv/fw/ath10k/QCA988X/hw2.0

> 'drivers/base/firmware_loader/builtin/ath10k/QCA988X/hw2.0/board.bin.gen.o'. 
> Stop.
> make[4]: *** Waiting for unfinished jobs....
>    UPD 
> drivers/base/firmware_loader/builtin/ath10k/QCA988X/hw2.0/board.bin.gen.S
> make[3]: *** [scripts/Makefile.build:500: 
> drivers/base/firmware_loader/builtin] Error 2
> make[2]: *** [scripts/Makefile.build:500: drivers/base/firmware_loader] 
> Error 2
> make[1]: *** [scripts/Makefile.build:500: drivers/base] Error 2
> make[1]: *** Waiting for unfinished jobs....
> make: *** [Makefile:1799: drivers] Error 2
> make: *** Waiting for unfinished jobs....
>
> I suspect that since you are booting the kernel directly from my build 
> box over tftp it accesses the ath10 firmware blobs on the build box.

Yes, obviously it's reading the firmware blobs at build time from the
location on the build box, then embedding them in the kernel image,
which is then served over tftp to the Omnia. It's not loading anything
from the build box after that (how would that work?)
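
If you want to confirm that the blobs really went into the build, you can
look for the objects generated for them. A minimal sketch, assuming the
naming seen in your build log above
(drivers/base/firmware_loader/builtin/<entry>.gen.o); builtin_objects is
a made-up helper name, and other kernel versions may lay this out
differently.

```shell
#!/bin/sh
# Hypothetical helper: print the object file the build is expected to
# produce for each CONFIG_EXTRA_FIRMWARE entry. The path pattern is
# taken from the make output quoted in this thread.
# Usage: builtin_objects "<space-separated entries>"
builtin_objects() {
    for entry in $1; do
        printf 'drivers/base/firmware_loader/builtin/%s.gen.o\n' "$entry"
    done
}
```

Running ls on the printed paths from the top of the build tree should
then show whether each blob was picked up.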

-Toke


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: PCI trouble on mvebu (Turris Omnia)
  2020-10-27 18:56   ` Toke Høiland-Jørgensen
@ 2020-10-28 13:36     ` Toke Høiland-Jørgensen
  2020-10-28 14:42       ` Bjorn Helgaas
  0 siblings, 1 reply; 48+ messages in thread
From: Toke Høiland-Jørgensen @ 2020-10-28 13:36 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: linux-pci, linux-arm-kernel, Rob Herring, Ilias Apalodimas, vtolkm

Toke Høiland-Jørgensen <toke@redhat.com> writes:

> Bjorn Helgaas <helgaas@kernel.org> writes:
>
>> [+cc vtolkm]
>>
>> On Tue, Oct 27, 2020 at 04:43:20PM +0100, Toke Høiland-Jørgensen wrote:
>>> Hi everyone
>>> 
>>> I'm trying to get a mainline kernel to run on my Turris Omnia, and am
>>> having some trouble getting the PCI bus to work correctly. Specifically,
>>> I'm running a 5.10-rc1 kernel (torvalds/master as of this moment), with
>>> the resource request fix[0] applied on top.
>>> 
>>> The kernel boots fine, and the patch in [0] makes the PCI devices show
>>> up. But I'm still getting initialisation errors like these:
>>> 
>>> [    1.632709] pci 0000:01:00.0: BAR 0: error updating (0xe0000004 != 0xffffffff)
>>> [    1.632714] pci 0000:01:00.0: BAR 0: error updating (high 0x000000 != 0xffffffff)
>>> [    1.632745] pci 0000:02:00.0: BAR 0: error updating (0xe0200004 != 0xffffffff)
>>> [    1.632750] pci 0000:02:00.0: BAR 0: error updating (high 0x000000 != 0xffffffff)
>>> 
>>> and the WiFi drivers fail to initialise with what appears to me to be
>>> errors related to the bus rather than to the drivers themselves:
>>> 
>>> [    3.509878] ath: phy0: Mac Chip Rev 0xfffc0.f is not supported by this driver
>>> [    3.517049] ath: phy0: Unable to initialize hardware; initialization status: -95
>>> [    3.524473] ath9k 0000:01:00.0: Failed to initialize device
>>> [    3.530081] ath9k: probe of 0000:01:00.0 failed with error -95
>>> [    3.536012] ath10k_pci 0000:02:00.0: of_irq_parse_pci: failed with rc=134
>>> [    3.543049] pci 0000:00:02.0: enabling device (0140 -> 0142)
>>> [    3.548735] ath10k_pci 0000:02:00.0: can't change power state from D3hot to D0 (config space inaccessible)
>>> [    3.588592] ath10k_pci 0000:02:00.0: failed to wake up device : -110
>>> [    3.595098] ath10k_pci: probe of 0000:02:00.0 failed with error -110
>>> 
>>> lspci looks OK, though:
>>> 
>>> # lspci
>>> 00:01.0 PCI bridge: Marvell Technology Group Ltd. Device 6820 (rev 04)
>>> 00:02.0 PCI bridge: Marvell Technology Group Ltd. Device 6820 (rev 04)
>>> 00:03.0 PCI bridge: Marvell Technology Group Ltd. Device 6820 (rev 04)
>>> 01:00.0 Network controller: Qualcomm Atheros AR9287 Wireless Network Adapter (PCI-Express) (rev 01)
>>> 02:00.0 Network controller: Qualcomm Atheros QCA986x/988x 802.11ac Wireless Network Adapter (rev ff)
>>> 
>>> Does anyone have any clue what could be going on here? Is this a bug, or
>>> did I miss something in my config or other initialisation? I've tried
>>> with both the stock u-boot distributed with the board, and with an
>>> upstream u-boot from latest master; doesn't seem to make any different.
>>
>> Can you try turning off CONFIG_PCIEASPM?  We had a similar recent
>> report at https://bugzilla.kernel.org/show_bug.cgi?id=209833 but I
>> don't think we have a fix yet.
>
> Yes! Turning that off does indeed help! Thanks a bunch :)
>
> You mention that bisecting this would be helpful - I can try that
> tomorrow; any idea when this was last working?

OK, so I tried to bisect this, but, erm, I couldn't find a working
revision to start from? I went all the way back to 4.10 (which is the
first version to include the device tree file for the Omnia), and even
on that, the wireless cards were failing to initialise with ASPM
enabled...

Happy to run other tests, but I think I'm going to need some pointers -
the PCI subsystem is not my home turf :)

-Toke


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: PCI trouble on mvebu (Turris Omnia)
  2020-10-28 13:36     ` Toke Høiland-Jørgensen
@ 2020-10-28 14:42       ` Bjorn Helgaas
  2020-10-28 15:08         ` Toke Høiland-Jørgensen
  0 siblings, 1 reply; 48+ messages in thread
From: Bjorn Helgaas @ 2020-10-28 14:42 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen
  Cc: linux-pci, linux-arm-kernel, Rob Herring, Ilias Apalodimas, vtolkm

On Wed, Oct 28, 2020 at 02:36:13PM +0100, Toke Høiland-Jørgensen wrote:
> Toke Høiland-Jørgensen <toke@redhat.com> writes:
> 
> > Bjorn Helgaas <helgaas@kernel.org> writes:
> >
> >> [+cc vtolkm]
> >>
> >> On Tue, Oct 27, 2020 at 04:43:20PM +0100, Toke Høiland-Jørgensen wrote:
> >>> Hi everyone
> >>> 
> >>> I'm trying to get a mainline kernel to run on my Turris Omnia, and am
> >>> having some trouble getting the PCI bus to work correctly. Specifically,
> >>> I'm running a 5.10-rc1 kernel (torvalds/master as of this moment), with
> >>> the resource request fix[0] applied on top.
> >>> 
> >>> The kernel boots fine, and the patch in [0] makes the PCI devices show
> >>> up. But I'm still getting initialisation errors like these:
> >>> 
> >>> [    1.632709] pci 0000:01:00.0: BAR 0: error updating (0xe0000004 != 0xffffffff)
> >>> [    1.632714] pci 0000:01:00.0: BAR 0: error updating (high 0x000000 != 0xffffffff)
> >>> [    1.632745] pci 0000:02:00.0: BAR 0: error updating (0xe0200004 != 0xffffffff)
> >>> [    1.632750] pci 0000:02:00.0: BAR 0: error updating (high 0x000000 != 0xffffffff)
> >>> 
> >>> and the WiFi drivers fail to initialise with what appears to me to be
> >>> errors related to the bus rather than to the drivers themselves:
> >>> 
> >>> [    3.509878] ath: phy0: Mac Chip Rev 0xfffc0.f is not supported by this driver
> >>> [    3.517049] ath: phy0: Unable to initialize hardware; initialization status: -95
> >>> [    3.524473] ath9k 0000:01:00.0: Failed to initialize device
> >>> [    3.530081] ath9k: probe of 0000:01:00.0 failed with error -95
> >>> [    3.536012] ath10k_pci 0000:02:00.0: of_irq_parse_pci: failed with rc=134
> >>> [    3.543049] pci 0000:00:02.0: enabling device (0140 -> 0142)
> >>> [    3.548735] ath10k_pci 0000:02:00.0: can't change power state from D3hot to D0 (config space inaccessible)
> >>> [    3.588592] ath10k_pci 0000:02:00.0: failed to wake up device : -110
> >>> [    3.595098] ath10k_pci: probe of 0000:02:00.0 failed with error -110
> >>> 
> >>> lspci looks OK, though:
> >>> 
> >>> # lspci
> >>> 00:01.0 PCI bridge: Marvell Technology Group Ltd. Device 6820 (rev 04)
> >>> 00:02.0 PCI bridge: Marvell Technology Group Ltd. Device 6820 (rev 04)
> >>> 00:03.0 PCI bridge: Marvell Technology Group Ltd. Device 6820 (rev 04)
> >>> 01:00.0 Network controller: Qualcomm Atheros AR9287 Wireless Network Adapter (PCI-Express) (rev 01)
> >>> 02:00.0 Network controller: Qualcomm Atheros QCA986x/988x 802.11ac Wireless Network Adapter (rev ff)
> >>> 
> >>> Does anyone have any clue what could be going on here? Is this a bug, or
> >>> did I miss something in my config or other initialisation? I've tried
> >>> with both the stock u-boot distributed with the board, and with an
> >>> upstream u-boot from latest master; doesn't seem to make any different.
> >>
> >> Can you try turning off CONFIG_PCIEASPM?  We had a similar recent
> >> report at https://bugzilla.kernel.org/show_bug.cgi?id=209833 but I
> >> don't think we have a fix yet.
> >
> > Yes! Turning that off does indeed help! Thanks a bunch :)
> >
> > You mention that bisecting this would be helpful - I can try that
> > tomorrow; any idea when this was last working?
> 
> OK, so I tried to bisect this, but, erm, I couldn't find a working
> revision to start from? I went all the way back to 4.10 (which is the
> first version to include the device tree file for the Omnia), and even
> on that, the wireless cards were failing to initialise with ASPM
> enabled...

I have no personal experience with this device; all I know is that the
bugzilla suggests that it worked in v5.4, which isn't much help.

Possibly the apparent regression was really a .config change, i.e.,
CONFIG_PCIEASPM was disabled in the v5.4 kernel vtolkm@ tested and it
"worked" but got enabled later and it started failing?

Maybe the debug patch below would be worth trying to see if it makes
any difference?  If it *does* help, try omitting the first hunk to see
if we just need to apply the quirk_enable_clear_retrain_link() quirk.

diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
index ac0557a305af..afe7fa1d54d6 100644
--- a/drivers/pci/pcie/aspm.c
+++ b/drivers/pci/pcie/aspm.c
@@ -103,7 +103,7 @@ static const char *policy_str[] = {
 	[POLICY_POWER_SUPERSAVE] = "powersupersave"
 };
 
-#define LINK_RETRAIN_TIMEOUT HZ
+#define LINK_RETRAIN_TIMEOUT (10*HZ)
 
 static int policy_to_aspm_state(struct pcie_link_state *link)
 {
@@ -201,7 +201,7 @@ static bool pcie_retrain_link(struct pcie_link_state *link)
 	pcie_capability_read_word(parent, PCI_EXP_LNKCTL, &reg16);
 	reg16 |= PCI_EXP_LNKCTL_RL;
 	pcie_capability_write_word(parent, PCI_EXP_LNKCTL, reg16);
-	if (parent->clear_retrain_link) {
+	if (1 || parent->clear_retrain_link) {
 		/*
 		 * Due to an erratum in some devices the Retrain Link bit
 		 * needs to be cleared again manually to allow the link

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: PCI trouble on mvebu (Turris Omnia)
  2020-10-28 14:42       ` Bjorn Helgaas
@ 2020-10-28 15:08         ` Toke Høiland-Jørgensen
  2020-10-28 16:40           ` ™֟☻̭҇ Ѽ ҉ ®
  2020-10-29 15:12           ` Rob Herring
  0 siblings, 2 replies; 48+ messages in thread
From: Toke Høiland-Jørgensen @ 2020-10-28 15:08 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: linux-pci, linux-arm-kernel, Rob Herring, Ilias Apalodimas, vtolkm

Bjorn Helgaas <helgaas@kernel.org> writes:

> On Wed, Oct 28, 2020 at 02:36:13PM +0100, Toke Høiland-Jørgensen wrote:
>> Toke Høiland-Jørgensen <toke@redhat.com> writes:
>> 
>> > Bjorn Helgaas <helgaas@kernel.org> writes:
>> >
>> >> [+cc vtolkm]
>> >>
>> >> On Tue, Oct 27, 2020 at 04:43:20PM +0100, Toke Høiland-Jørgensen wrote:
>> >>> Hi everyone
>> >>> 
>> >>> I'm trying to get a mainline kernel to run on my Turris Omnia, and am
>> >>> having some trouble getting the PCI bus to work correctly. Specifically,
>> >>> I'm running a 5.10-rc1 kernel (torvalds/master as of this moment), with
>> >>> the resource request fix[0] applied on top.
>> >>> 
>> >>> The kernel boots fine, and the patch in [0] makes the PCI devices show
>> >>> up. But I'm still getting initialisation errors like these:
>> >>> 
>> >>> [    1.632709] pci 0000:01:00.0: BAR 0: error updating (0xe0000004 != 0xffffffff)
>> >>> [    1.632714] pci 0000:01:00.0: BAR 0: error updating (high 0x000000 != 0xffffffff)
>> >>> [    1.632745] pci 0000:02:00.0: BAR 0: error updating (0xe0200004 != 0xffffffff)
>> >>> [    1.632750] pci 0000:02:00.0: BAR 0: error updating (high 0x000000 != 0xffffffff)
>> >>> 
>> >>> and the WiFi drivers fail to initialise with what appears to me to be
>> >>> errors related to the bus rather than to the drivers themselves:
>> >>> 
>> >>> [    3.509878] ath: phy0: Mac Chip Rev 0xfffc0.f is not supported by this driver
>> >>> [    3.517049] ath: phy0: Unable to initialize hardware; initialization status: -95
>> >>> [    3.524473] ath9k 0000:01:00.0: Failed to initialize device
>> >>> [    3.530081] ath9k: probe of 0000:01:00.0 failed with error -95
>> >>> [    3.536012] ath10k_pci 0000:02:00.0: of_irq_parse_pci: failed with rc=134
>> >>> [    3.543049] pci 0000:00:02.0: enabling device (0140 -> 0142)
>> >>> [    3.548735] ath10k_pci 0000:02:00.0: can't change power state from D3hot to D0 (config space inaccessible)
>> >>> [    3.588592] ath10k_pci 0000:02:00.0: failed to wake up device : -110
>> >>> [    3.595098] ath10k_pci: probe of 0000:02:00.0 failed with error -110
>> >>> 
>> >>> lspci looks OK, though:
>> >>> 
>> >>> # lspci
>> >>> 00:01.0 PCI bridge: Marvell Technology Group Ltd. Device 6820 (rev 04)
>> >>> 00:02.0 PCI bridge: Marvell Technology Group Ltd. Device 6820 (rev 04)
>> >>> 00:03.0 PCI bridge: Marvell Technology Group Ltd. Device 6820 (rev 04)
>> >>> 01:00.0 Network controller: Qualcomm Atheros AR9287 Wireless Network Adapter (PCI-Express) (rev 01)
>> >>> 02:00.0 Network controller: Qualcomm Atheros QCA986x/988x 802.11ac Wireless Network Adapter (rev ff)
>> >>> 
>> >>> Does anyone have any clue what could be going on here? Is this a bug, or
>> >>> did I miss something in my config or other initialisation? I've tried
>> >>> with both the stock u-boot distributed with the board, and with an
>> >>> upstream u-boot from latest master; doesn't seem to make any different.
>> >>
>> >> Can you try turning off CONFIG_PCIEASPM?  We had a similar recent
>> >> report at https://bugzilla.kernel.org/show_bug.cgi?id=209833 but I
>> >> don't think we have a fix yet.
>> >
>> > Yes! Turning that off does indeed help! Thanks a bunch :)
>> >
>> > You mention that bisecting this would be helpful - I can try that
>> > tomorrow; any idea when this was last working?
>> 
>> OK, so I tried to bisect this, but, erm, I couldn't find a working
>> revision to start from? I went all the way back to 4.10 (which is the
>> first version to include the device tree file for the Omnia), and even
>> on that, the wireless cards were failing to initialise with ASPM
>> enabled...
>
> I have no personal experience with this device; all I know is that the
> bugzilla suggests that it worked in v5.4, which isn't much help.
>
> Possibly the apparent regression was really a .config change, i.e.,
> CONFIG_PCIEASPM was disabled in the v5.4 kernel vtolkm@ tested and it
> "worked" but got enabled later and it started failing?

Yeah, I suspect so. The OpenWrt config disables CONFIG_PCIEASPM by
default and only turns it on for specific targets. So I guess that it's
most likely that this has never worked...
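
For anyone comparing a "working" and a "failing" kernel, checking the
symbol in both .config files is quick. This is a sketch under the
assumption that the files use the usual .config syntax; config_state is a
made-up helper name.

```shell
#!/bin/sh
# Hypothetical helper: print "y", "m" or "n" for a Kconfig symbol in a
# .config-style file. "n" covers both "# CONFIG_FOO is not set" and a
# symbol that is absent from the file.
# Usage: config_state <config file> <symbol without the CONFIG_ prefix>
config_state() {
    if grep -q "^CONFIG_$2=y\$" "$1"; then
        echo y
    elif grep -q "^CONFIG_$2=m\$" "$1"; then
        echo m
    else
        echo n
    fi
}
```

Running config_state .config PCIEASPM against the v5.4 config that
"worked" and against a current one would show whether the difference is
just this option.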

> Maybe the debug patch below would be worth trying to see if it makes
> any difference?  If it *does* help, try omitting the first hunk to see
> if we just need to apply the quirk_enable_clear_retrain_link() quirk.

Tried, doesn't help...

-Toke


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: PCI trouble on mvebu (Turris Omnia)
  2020-10-28 15:08         ` Toke Høiland-Jørgensen
@ 2020-10-28 16:40           ` ™֟☻̭҇ Ѽ ҉ ®
  2020-10-28 23:16             ` Bjorn Helgaas
  2020-10-29  1:21             ` Marek Behun
  2020-10-29 15:12           ` Rob Herring
  1 sibling, 2 replies; 48+ messages in thread
From: ™֟☻̭҇ Ѽ ҉ ® @ 2020-10-28 16:40 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen, Bjorn Helgaas
  Cc: linux-pci, linux-arm-kernel, Rob Herring, Ilias Apalodimas



On 28/10/2020 16:08, Toke Høiland-Jørgensen wrote:
> Bjorn Helgaas <helgaas@kernel.org> writes:
>
>> On Wed, Oct 28, 2020 at 02:36:13PM +0100, Toke Høiland-Jørgensen wrote:
>>> Toke Høiland-Jørgensen <toke@redhat.com> writes:
>>>
>>>> Bjorn Helgaas <helgaas@kernel.org> writes:
>>>>
>>>>> [+cc vtolkm]
>>>>>
>>>>> On Tue, Oct 27, 2020 at 04:43:20PM +0100, Toke Høiland-Jørgensen wrote:
>>>>>> Hi everyone
>>>>>>
>>>>>> I'm trying to get a mainline kernel to run on my Turris Omnia, and am
>>>>>> having some trouble getting the PCI bus to work correctly. Specifically,
>>>>>> I'm running a 5.10-rc1 kernel (torvalds/master as of this moment), with
>>>>>> the resource request fix[0] applied on top.
>>>>>>
>>>>>> The kernel boots fine, and the patch in [0] makes the PCI devices show
>>>>>> up. But I'm still getting initialisation errors like these:
>>>>>>
>>>>>> [    1.632709] pci 0000:01:00.0: BAR 0: error updating (0xe0000004 != 0xffffffff)
>>>>>> [    1.632714] pci 0000:01:00.0: BAR 0: error updating (high 0x000000 != 0xffffffff)
>>>>>> [    1.632745] pci 0000:02:00.0: BAR 0: error updating (0xe0200004 != 0xffffffff)
>>>>>> [    1.632750] pci 0000:02:00.0: BAR 0: error updating (high 0x000000 != 0xffffffff)
>>>>>>
>>>>>> and the WiFi drivers fail to initialise with what appears to me to be
>>>>>> errors related to the bus rather than to the drivers themselves:
>>>>>>
>>>>>> [    3.509878] ath: phy0: Mac Chip Rev 0xfffc0.f is not supported by this driver
>>>>>> [    3.517049] ath: phy0: Unable to initialize hardware; initialization status: -95
>>>>>> [    3.524473] ath9k 0000:01:00.0: Failed to initialize device
>>>>>> [    3.530081] ath9k: probe of 0000:01:00.0 failed with error -95
>>>>>> [    3.536012] ath10k_pci 0000:02:00.0: of_irq_parse_pci: failed with rc=134
>>>>>> [    3.543049] pci 0000:00:02.0: enabling device (0140 -> 0142)
>>>>>> [    3.548735] ath10k_pci 0000:02:00.0: can't change power state from D3hot to D0 (config space inaccessible)
>>>>>> [    3.588592] ath10k_pci 0000:02:00.0: failed to wake up device : -110
>>>>>> [    3.595098] ath10k_pci: probe of 0000:02:00.0 failed with error -110
>>>>>>
>>>>>> lspci looks OK, though:
>>>>>>
>>>>>> # lspci
>>>>>> 00:01.0 PCI bridge: Marvell Technology Group Ltd. Device 6820 (rev 04)
>>>>>> 00:02.0 PCI bridge: Marvell Technology Group Ltd. Device 6820 (rev 04)
>>>>>> 00:03.0 PCI bridge: Marvell Technology Group Ltd. Device 6820 (rev 04)
>>>>>> 01:00.0 Network controller: Qualcomm Atheros AR9287 Wireless Network Adapter (PCI-Express) (rev 01)
>>>>>> 02:00.0 Network controller: Qualcomm Atheros QCA986x/988x 802.11ac Wireless Network Adapter (rev ff)
>>>>>>
>>>>>> Does anyone have any clue what could be going on here? Is this a bug, or
>>>>>> did I miss something in my config or other initialisation? I've tried
>>>>>> with both the stock u-boot distributed with the board, and with an
>>>>>> upstream u-boot from latest master; doesn't seem to make any different.
>>>>> Can you try turning off CONFIG_PCIEASPM?  We had a similar recent
>>>>> report at https://bugzilla.kernel.org/show_bug.cgi?id=209833 but I
>>>>> don't think we have a fix yet.
>>>> Yes! Turning that off does indeed help! Thanks a bunch :)
>>>>
>>>> You mention that bisecting this would be helpful - I can try that
>>>> tomorrow; any idea when this was last working?
>>> OK, so I tried to bisect this, but, erm, I couldn't find a working
>>> revision to start from? I went all the way back to 4.10 (which is the
>>> first version to include the device tree file for the Omnia), and even
>>> on that, the wireless cards were failing to initialise with ASPM
>>> enabled...
>> I have no personal experience with this device; all I know is that the
>> bugzilla suggests that it worked in v5.4, which isn't much help.
>>
>> Possibly the apparent regression was really a .config change, i.e.,
>> CONFIG_PCIEASPM was disabled in the v5.4 kernel vtolkm@ tested and it
>> "worked" but got enabled later and it started failing?
> Yeah, I suspect so. The OpenWrt config disables CONFIG_PCIEASPM by
> default and only turns it on for specific targets. So I guess that it's
> most likely that this has never worked...
>
>> Maybe the debug patch below would be worth trying to see if it makes
>> any difference?  If it *does* help, try omitting the first hunk to see
>> if we just need to apply the quirk_enable_clear_retrain_link() quirk.
> Tried, doesn't help...
>
> -Toke
>

Found this patch

https://github.com/openwrt/openwrt/blob/7c0496f29bed87326f1bf591ca25ace82373cfc7/target/linux/mvebu/patches-5.4/405-PCI-aardvark-Improve-link-training.patch 


that mentions the Compex WLE900VX card, which, reading the lspci verbose
output from the bugtracker, seems to be the device in trouble.






^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: PCI trouble on mvebu (Turris Omnia)
  2020-10-28 16:40           ` ™֟☻̭҇ Ѽ ҉ ®
@ 2020-10-28 23:16             ` Bjorn Helgaas
  2020-10-29 10:09               ` Pali Rohár
                                 ` (2 more replies)
  2020-10-29  1:21             ` Marek Behun
  1 sibling, 3 replies; 48+ messages in thread
From: Bjorn Helgaas @ 2020-10-28 23:16 UTC (permalink / raw)
  To: vtolkm, Toke Høiland-Jørgensen
  Cc: linux-pci, linux-arm-kernel, Rob Herring, Ilias Apalodimas,
	Pali Rohár, Marek Behún, Thomas Petazzoni,
	Jason Cooper

[+cc Pali, Marek, Thomas, Jason]

On Wed, Oct 28, 2020 at 04:40:00PM +0000, ™֟☻̭҇ Ѽ ҉ ® wrote:
> On 28/10/2020 16:08, Toke Høiland-Jørgensen wrote:
> > Bjorn Helgaas <helgaas@kernel.org> writes:
> > > On Wed, Oct 28, 2020 at 02:36:13PM +0100, Toke Høiland-Jørgensen wrote:
> > > > Toke Høiland-Jørgensen <toke@redhat.com> writes:
> > > > > Bjorn Helgaas <helgaas@kernel.org> writes:
> > > > > 
> > > > > > [+cc vtolkm]
> > > > > > 
> > > > > > On Tue, Oct 27, 2020 at 04:43:20PM +0100, Toke Høiland-Jørgensen wrote:
> > > > > > > Hi everyone
> > > > > > > 
> > > > > > > I'm trying to get a mainline kernel to run on my Turris Omnia, and am
> > > > > > > having some trouble getting the PCI bus to work correctly. Specifically,
> > > > > > > I'm running a 5.10-rc1 kernel (torvalds/master as of this moment), with
> > > > > > > the resource request fix[0] applied on top.
> > > > > > > 
> > > > > > > The kernel boots fine, and the patch in [0] makes the PCI devices show
> > > > > > > up. But I'm still getting initialisation errors like these:
> > > > > > > 
> > > > > > > [    1.632709] pci 0000:01:00.0: BAR 0: error updating (0xe0000004 != 0xffffffff)
> > > > > > > [    1.632714] pci 0000:01:00.0: BAR 0: error updating (high 0x000000 != 0xffffffff)
> > > > > > > [    1.632745] pci 0000:02:00.0: BAR 0: error updating (0xe0200004 != 0xffffffff)
> > > > > > > [    1.632750] pci 0000:02:00.0: BAR 0: error updating (high 0x000000 != 0xffffffff)
> > > > > > > 
> > > > > > > and the WiFi drivers fail to initialise with what appears to me to be
> > > > > > > errors related to the bus rather than to the drivers themselves:
> > > > > > > 
> > > > > > > [    3.509878] ath: phy0: Mac Chip Rev 0xfffc0.f is not supported by this driver
> > > > > > > [    3.517049] ath: phy0: Unable to initialize hardware; initialization status: -95
> > > > > > > [    3.524473] ath9k 0000:01:00.0: Failed to initialize device
> > > > > > > [    3.530081] ath9k: probe of 0000:01:00.0 failed with error -95
> > > > > > > [    3.536012] ath10k_pci 0000:02:00.0: of_irq_parse_pci: failed with rc=134
> > > > > > > [    3.543049] pci 0000:00:02.0: enabling device (0140 -> 0142)
> > > > > > > [    3.548735] ath10k_pci 0000:02:00.0: can't change power state from D3hot to D0 (config space inaccessible)
> > > > > > > [    3.588592] ath10k_pci 0000:02:00.0: failed to wake up device : -110
> > > > > > > [    3.595098] ath10k_pci: probe of 0000:02:00.0 failed with error -110
> > > > > > > 
> > > > > > > lspci looks OK, though:
> > > > > > > 
> > > > > > > # lspci
> > > > > > > 00:01.0 PCI bridge: Marvell Technology Group Ltd. Device 6820 (rev 04)
> > > > > > > 00:02.0 PCI bridge: Marvell Technology Group Ltd. Device 6820 (rev 04)
> > > > > > > 00:03.0 PCI bridge: Marvell Technology Group Ltd. Device 6820 (rev 04)
> > > > > > > 01:00.0 Network controller: Qualcomm Atheros AR9287 Wireless Network Adapter (PCI-Express) (rev 01)
> > > > > > > 02:00.0 Network controller: Qualcomm Atheros QCA986x/988x 802.11ac Wireless Network Adapter (rev ff)
> > > > > > > 
> > > > > > > Does anyone have any clue what could be going on here? Is this a bug, or
> > > > > > > did I miss something in my config or other initialisation? I've tried
> > > > > > > with both the stock u-boot distributed with the board, and with an
> > > > > > > > upstream u-boot from latest master; doesn't seem to make any difference.
> > > > > > Can you try turning off CONFIG_PCIEASPM?  We had a similar recent
> > > > > > report at https://bugzilla.kernel.org/show_bug.cgi?id=209833 but I
> > > > > > don't think we have a fix yet.
> > > > > Yes! Turning that off does indeed help! Thanks a bunch :)
> > > > > 
> > > > > You mention that bisecting this would be helpful - I can try that
> > > > > tomorrow; any idea when this was last working?
> > > > OK, so I tried to bisect this, but, erm, I couldn't find a working
> > > > revision to start from? I went all the way back to 4.10 (which is the
> > > > first version to include the device tree file for the Omnia), and even
> > > > on that, the wireless cards were failing to initialise with ASPM
> > > > enabled...
> > > I have no personal experience with this device; all I know is that the
> > > bugzilla suggests that it worked in v5.4, which isn't much help.
> > > 
> > > Possibly the apparent regression was really a .config change, i.e.,
> > > CONFIG_PCIEASPM was disabled in the v5.4 kernel vtolkm@ tested and it
> > > "worked" but got enabled later and it started failing?
> > Yeah, I suspect so. The OpenWrt config disables CONFIG_PCIEASPM by
> > default and only turns it on for specific targets. So I guess that it's
> > most likely that this has never worked...
> > 
> > > Maybe the debug patch below would be worth trying to see if it makes
> > > any difference?  If it *does* help, try omitting the first hunk to see
> > > if we just need to apply the quirk_enable_clear_retrain_link() quirk.
> > Tried, doesn't help...
> > 
> > -Toke
> 
> Found this patch
> 
> https://github.com/openwrt/openwrt/blob/7c0496f29bed87326f1bf591ca25ace82373cfc7/target/linux/mvebu/patches-5.4/405-PCI-aardvark-Improve-link-training.patch
> 
> that mentions the Compex WLE900VX card, which, reading the lspci verbose
> output from the bugtracker, seems to be the device in trouble.

Interesting.  Indeed, the Compex WLE900VX card seems to have the
Qualcomm Atheros QCA9880 on it, and it looks like Toke's system has
the same device in it.

The patch you mention (https://git.kernel.org/linus/43fc679ced18) is
for aardvark, so of course doesn't help mvebu.

PCIe hardware is supposed to automatically negotiate the highest link
speed supported by both ends.  But software *is* allowed to set an
upper limit (the Target Link Speed in Link Control 2).  If we initiate
a retrain and the link doesn't come back up, I wonder if we should try
to help the hardware out by using Target Link Speed to limit to a
lower speed and attempting another retrain, something like this hacky
patch: (please collect the dmesg log if you try this)

diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
index ac0557a305af..fb6e13532a2c 100644
--- a/drivers/pci/pcie/aspm.c
+++ b/drivers/pci/pcie/aspm.c
@@ -192,12 +192,42 @@ static void pcie_clkpm_cap_init(struct pcie_link_state *link, int blacklist)
 	link->clkpm_disable = blacklist ? 1 : 0;
 }
 
+#define PCI_EXP_LNKCAP2_SLS	0x000000fe
+
+static int decrease_tls(struct pci_dev *pdev)
+{
+	u32 lnkcap2;
+	u16 lnkctl2, tls;
+
+	pcie_capability_read_dword(pdev, PCI_EXP_LNKCAP2, &lnkcap2);
+
+	pcie_capability_read_word(pdev, PCI_EXP_LNKCTL2, &lnkctl2);
+	tls = lnkctl2 & PCI_EXP_LNKCTL2_TLS;
+
+	pci_info(pdev, "lnkcap2 %#010x sls %#04x lnkctl2 %#06x tls %#03x\n",
+		 lnkcap2, (lnkcap2 & PCI_EXP_LNKCAP2_SLS) >> 1,
+		 lnkctl2, tls);
+
+	if (tls < 2)
+		return -EINVAL;
+
+	tls--;
+	pcie_capability_clear_and_set_word(pdev, PCI_EXP_LNKCTL2,
+					   PCI_EXP_LNKCTL2_TLS, tls);
+	pcie_capability_read_word(pdev, PCI_EXP_LNKCTL2, &lnkctl2);
+	pci_info(pdev, "lnkctl2 %#010x new tls %#03x\n",
+		 lnkctl2, tls);
+
+	return 0;
+}
+
 static bool pcie_retrain_link(struct pcie_link_state *link)
 {
 	struct pci_dev *parent = link->pdev;
 	unsigned long end_jiffies;
 	u16 reg16;
 
+top:
 	pcie_capability_read_word(parent, PCI_EXP_LNKCTL, &reg16);
 	reg16 |= PCI_EXP_LNKCTL_RL;
 	pcie_capability_write_word(parent, PCI_EXP_LNKCTL, reg16);
@@ -216,10 +246,14 @@ static bool pcie_retrain_link(struct pcie_link_state *link)
 	do {
 		pcie_capability_read_word(parent, PCI_EXP_LNKSTA, &reg16);
 		if (!(reg16 & PCI_EXP_LNKSTA_LT))
-			break;
+			return true;	/* success */
 		msleep(1);
 	} while (time_before(jiffies, end_jiffies));
-	return !(reg16 & PCI_EXP_LNKSTA_LT);
+
+	if (decrease_tls(parent))
+		return false;	/* can't decrease any more */
+
+	goto top;
 }
 
 /*
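
The fallback idea in the patch above can be tried out in userspace without a
board. The following is only a minimal sketch under stated assumptions:
`struct fake_link`, `fake_retrain()` and `retrain_with_fallback()` are
hypothetical names that model a link which trains only at or below some
speed; they are not kernel APIs. Only the Target Link Speed mask value and
the "stop at TLS 1" rule are taken from the patch and the kernel headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Target Link Speed field of Link Control 2, same value as in pci_regs.h */
#define PCI_EXP_LNKCTL2_TLS 0x000f

/* Hypothetical model of a link: it trains only at or below max_working_tls. */
struct fake_link {
	uint16_t lnkctl2;		/* simulated Link Control 2 register */
	unsigned int max_working_tls;	/* assumption: highest speed that trains */
};

/* Models one retrain attempt; real code would poll LNKSTA.LT with a timeout. */
static bool fake_retrain(const struct fake_link *l)
{
	return (l->lnkctl2 & PCI_EXP_LNKCTL2_TLS) <= l->max_working_tls;
}

/* Lower TLS by one step; fails when already at 2.5GT/s (TLS == 1) or unset. */
static int decrease_tls(struct fake_link *l)
{
	uint16_t tls = l->lnkctl2 & PCI_EXP_LNKCTL2_TLS;

	if (tls < 2)
		return -1;

	tls--;
	l->lnkctl2 = (uint16_t)((l->lnkctl2 & ~PCI_EXP_LNKCTL2_TLS) | tls);
	return 0;
}

/* Retrain, and on failure cap the speed one step lower and try again. */
static bool retrain_with_fallback(struct fake_link *l)
{
	for (;;) {
		if (fake_retrain(l))
			return true;	/* link came up at the current TLS */
		if (decrease_tls(l))
			return false;	/* can't decrease any more */
	}
}
```

With `lnkctl2 = 0x0002` and `max_working_tls = 1`, the first attempt fails,
TLS is lowered to 1, and the second attempt succeeds. A link that never
trains ends with TLS 1 and a `false` return, which matches the "can't
decrease any more" exit in the patch.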

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: PCI trouble on mvebu (Turris Omnia)
  2020-10-28 16:40           ` ™֟☻̭҇ Ѽ ҉ ®
  2020-10-28 23:16             ` Bjorn Helgaas
@ 2020-10-29  1:21             ` Marek Behun
  1 sibling, 0 replies; 48+ messages in thread
From: Marek Behun @ 2020-10-29  1:21 UTC (permalink / raw)
  To: ™֟☻̭҇ Ѽ ҉ ®
  Cc: vtolkm, Toke Høiland-Jørgensen, Bjorn Helgaas,
	linux-pci, linux-arm-kernel, Rob Herring, Ilias Apalodimas

On Wed, 28 Oct 2020 16:40:00 +0000
"™֟☻̭҇ Ѽ ҉ ®" <vtolkm@googlemail.com> wrote:

> Found this patch
> 
> https://github.com/openwrt/openwrt/blob/7c0496f29bed87326f1bf591ca25ace82373cfc7/target/linux/mvebu/patches-5.4/405-PCI-aardvark-Improve-link-training.patch 
> 
> 
> that mentions the Compex WLE900VX card, which, reading the lspci verbose
> output from the bugtracker, seems to be the device in trouble.

It seems the mvebu driver in combination with the Compex card is broken in a
similar way to how aardvark was... :) Hopefully Pali will want to look into this.

Marek

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: PCI trouble on mvebu (Turris Omnia)
  2020-10-28 23:16             ` Bjorn Helgaas
@ 2020-10-29 10:09               ` Pali Rohár
  2020-10-29 10:56                 ` ™֟☻̭҇ Ѽ ҉ ®
  2020-10-29 11:12                 ` Toke Høiland-Jørgensen
  2020-10-29 10:41               ` Toke Høiland-Jørgensen
  2020-10-30 11:23               ` Pali Rohár
  2 siblings, 2 replies; 48+ messages in thread
From: Pali Rohár @ 2020-10-29 10:09 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: vtolkm, Toke Høiland-Jørgensen, linux-pci,
	linux-arm-kernel, Rob Herring, Ilias Apalodimas,
	Marek Behún, Thomas Petazzoni, Jason Cooper

Hello!

On Wednesday 28 October 2020 18:16:26 Bjorn Helgaas wrote:
> [+cc Pali, Marek, Thomas, Jason]
> 
> On Wed, Oct 28, 2020 at 04:40:00PM +0000, ™֟☻̭҇ Ѽ ҉ ® wrote:
> > On 28/10/2020 16:08, Toke Høiland-Jørgensen wrote:
> > > Bjorn Helgaas <helgaas@kernel.org> writes:
> > > > On Wed, Oct 28, 2020 at 02:36:13PM +0100, Toke Høiland-Jørgensen wrote:
> > > > > Toke Høiland-Jørgensen <toke@redhat.com> writes:
> > > > > > Bjorn Helgaas <helgaas@kernel.org> writes:
> > > > > > 
> > > > > > > [+cc vtolkm]
> > > > > > > 
> > > > > > > On Tue, Oct 27, 2020 at 04:43:20PM +0100, Toke Høiland-Jørgensen wrote:
> > > > > > > > Hi everyone
> > > > > > > > 
> > > > > > > > I'm trying to get a mainline kernel to run on my Turris Omnia, and am
> > > > > > > > having some trouble getting the PCI bus to work correctly. Specifically,
> > > > > > > > I'm running a 5.10-rc1 kernel (torvalds/master as of this moment), with
> > > > > > > > the resource request fix[0] applied on top.
> > > > > > > > 
> > > > > > > > The kernel boots fine, and the patch in [0] makes the PCI devices show
> > > > > > > > up. But I'm still getting initialisation errors like these:
> > > > > > > > 
> > > > > > > > [    1.632709] pci 0000:01:00.0: BAR 0: error updating (0xe0000004 != 0xffffffff)
> > > > > > > > [    1.632714] pci 0000:01:00.0: BAR 0: error updating (high 0x000000 != 0xffffffff)
> > > > > > > > [    1.632745] pci 0000:02:00.0: BAR 0: error updating (0xe0200004 != 0xffffffff)
> > > > > > > > [    1.632750] pci 0000:02:00.0: BAR 0: error updating (high 0x000000 != 0xffffffff)
> > > > > > > > 
> > > > > > > > and the WiFi drivers fail to initialise with what appears to me to be
> > > > > > > > errors related to the bus rather than to the drivers themselves:
> > > > > > > > 
> > > > > > > > [    3.509878] ath: phy0: Mac Chip Rev 0xfffc0.f is not supported by this driver
> > > > > > > > [    3.517049] ath: phy0: Unable to initialize hardware; initialization status: -95
> > > > > > > > [    3.524473] ath9k 0000:01:00.0: Failed to initialize device
> > > > > > > > [    3.530081] ath9k: probe of 0000:01:00.0 failed with error -95
> > > > > > > > [    3.536012] ath10k_pci 0000:02:00.0: of_irq_parse_pci: failed with rc=134
> > > > > > > > [    3.543049] pci 0000:00:02.0: enabling device (0140 -> 0142)
> > > > > > > > [    3.548735] ath10k_pci 0000:02:00.0: can't change power state from D3hot to D0 (config space inaccessible)
> > > > > > > > [    3.588592] ath10k_pci 0000:02:00.0: failed to wake up device : -110
> > > > > > > > [    3.595098] ath10k_pci: probe of 0000:02:00.0 failed with error -110
> > > > > > > > 
> > > > > > > > lspci looks OK, though:
> > > > > > > > 
> > > > > > > > # lspci
> > > > > > > > 00:01.0 PCI bridge: Marvell Technology Group Ltd. Device 6820 (rev 04)
> > > > > > > > 00:02.0 PCI bridge: Marvell Technology Group Ltd. Device 6820 (rev 04)
> > > > > > > > 00:03.0 PCI bridge: Marvell Technology Group Ltd. Device 6820 (rev 04)
> > > > > > > > 01:00.0 Network controller: Qualcomm Atheros AR9287 Wireless Network Adapter (PCI-Express) (rev 01)
> > > > > > > > 02:00.0 Network controller: Qualcomm Atheros QCA986x/988x 802.11ac Wireless Network Adapter (rev ff)
> > > > > > > > 
> > > > > > > > Does anyone have any clue what could be going on here? Is this a bug, or
> > > > > > > > did I miss something in my config or other initialisation? I've tried
> > > > > > > > with both the stock u-boot distributed with the board, and with an
> > > > > > > > > upstream u-boot from latest master; doesn't seem to make any difference.
> > > > > > > Can you try turning off CONFIG_PCIEASPM?  We had a similar recent
> > > > > > > report at https://bugzilla.kernel.org/show_bug.cgi?id=209833 but I
> > > > > > > don't think we have a fix yet.
> > > > > > Yes! Turning that off does indeed help! Thanks a bunch :)

I have been testing a mainline kernel on the Turris Omnia with the two
default PCIe cards (WLE200 and WLE900) and it worked fine. But I do not
know whether I had ASPM enabled.

So it works fine for you when CONFIG_PCIEASPM is disabled, and the whole
issue appears only when CONFIG_PCIEASPM is enabled?

> > > > > > 
> > > > > > You mention that bisecting this would be helpful - I can try that
> > > > > > tomorrow; any idea when this was last working?
> > > > > OK, so I tried to bisect this, but, erm, I couldn't find a working
> > > > > revision to start from? I went all the way back to 4.10 (which is the
> > > > > first version to include the device tree file for the Omnia), and even
> > > > > on that, the wireless cards were failing to initialise with ASPM
> > > > > enabled...
> > > > I have no personal experience with this device; all I know is that the
> > > > bugzilla suggests that it worked in v5.4, which isn't much help.
> > > > 
> > > > Possibly the apparent regression was really a .config change, i.e.,
> > > > CONFIG_PCIEASPM was disabled in the v5.4 kernel vtolkm@ tested and it
> > > > "worked" but got enabled later and it started failing?
> > > Yeah, I suspect so. The OpenWrt config disables CONFIG_PCIEASPM by
> > > default and only turns it on for specific targets. So I guess that it's
> > > most likely that this has never worked...
> > > 
> > > > Maybe the debug patch below would be worth trying to see if it makes
> > > > any difference?  If it *does* help, try omitting the first hunk to see
> > > > if we just need to apply the quirk_enable_clear_retrain_link() quirk.
> > > Tried, doesn't help...
> > > 
> > > -Toke
> > 
> > Found this patch
> > 
> > https://github.com/openwrt/openwrt/blob/7c0496f29bed87326f1bf591ca25ace82373cfc7/target/linux/mvebu/patches-5.4/405-PCI-aardvark-Improve-link-training.patch
> > 
> > that mentions the Compex WLE900VX card, which, reading the lspci verbose
> > output from the bugtracker, seems to be the device in trouble.
> 
> Interesting.  Indeed, the Compex WLE900VX card seems to have the
> Qualcomm Atheros QCA9880 on it, and it looks like Toke's system has
> the same device in it.
> 
> The patch you mention (https://git.kernel.org/linus/43fc679ced18) is
> for aardvark, so of course doesn't help mvebu.

That patch is for the aardvark driver, the PCI controller on the Armada
3720 SoC. We found that a lot of people were patching the aardvark driver
to explicitly set only PCIe gen 1 mode in an internal aardvark register,
because the default value (gen 2) did not work correctly with several
Compex cards. We then created the above patch, which forces PCIe gen 1
mode only for gen 1 cards, and it stabilized the Compex cards. I think
there is a HW bug in that SoC which causes the PCI controller to not work
correctly.

This patch is needed for the Espressobin and Turris MOX. I have been
testing it with CONFIG_PCIEASPM=y on both devices and basically all
tested cards worked fine.
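
To make "force gen 1 only for gen 1 cards" concrete, here is a rough
sketch of the selection step alone. It is not the aardvark code:
`max_gen()` and `pick_target_gen()` are hypothetical helpers, and the
sketch assumes both Link Capabilities values have already been read. The
only thing taken from the kernel headers is the Max Link Speed mask.

```c
#include <stdint.h>

/* Max Link Speed field of Link Capabilities, same value as in pci_regs.h */
#define PCI_EXP_LNKCAP_SLS 0x0000000f

/* Max Link Speed encoding: 1 = 2.5GT/s (gen 1), 2 = 5GT/s (gen 2), 3 = 8GT/s */
static unsigned int max_gen(uint32_t lnkcap)
{
	return lnkcap & PCI_EXP_LNKCAP_SLS;
}

/* Hypothetical helper: the highest generation that both ends advertise. */
static unsigned int pick_target_gen(uint32_t root_lnkcap, uint32_t card_lnkcap)
{
	unsigned int root = max_gen(root_lnkcap);
	unsigned int card = max_gen(card_lnkcap);

	return root < card ? root : card;
}
```

A gen 2 root port paired with a gen 1 card yields 1, so only such a card
would be pinned to gen 1, while gen 2 cards keep the default.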

> PCIe hardware is supposed to automatically negotiate the highest link
> speed supported by both ends.  But software *is* allowed to set an
> upper limit (the Target Link Speed in Link Control 2).  If we initiate
> a retrain and the link doesn't come back up, I wonder if we should try
> to help the hardware out by using Target Link Speed to limit to a
> lower speed and attempting another retrain, something like this hacky
> patch: (please collect the dmesg log if you try this)
> 
> diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
> index ac0557a305af..fb6e13532a2c 100644
> --- a/drivers/pci/pcie/aspm.c
> +++ b/drivers/pci/pcie/aspm.c
> @@ -192,12 +192,42 @@ static void pcie_clkpm_cap_init(struct pcie_link_state *link, int blacklist)
>  	link->clkpm_disable = blacklist ? 1 : 0;
>  }
>  
> +#define PCI_EXP_LNKCAP2_SLS	0x000000fe
> +
> +static int decrease_tls(struct pci_dev *pdev)
> +{
> +	u32 lnkcap2;
> +	u16 lnkctl2, tls;
> +
> +	pcie_capability_read_dword(pdev, PCI_EXP_LNKCAP2, &lnkcap2);
> +
> +	pcie_capability_read_word(pdev, PCI_EXP_LNKCTL2, &lnkctl2);
> +	tls = lnkctl2 & PCI_EXP_LNKCTL2_TLS;
> +
> +	pci_info(pdev, "lnkcap2 %#010x sls %#04x lnkctl2 %#06x tls %#03x\n",
> +		 lnkcap2, (lnkcap2 & PCI_EXP_LNKCAP2_SLS) >> 1,
> +		 lnkctl2, tls);
> +
> +	if (tls < 2)
> +		return -EINVAL;
> +
> +	tls--;
> +	pcie_capability_clear_and_set_word(pdev, PCI_EXP_LNKCTL2,
> +					   PCI_EXP_LNKCTL2_TLS, tls);
> +	pcie_capability_read_word(pdev, PCI_EXP_LNKCTL2, &lnkctl2);
> +	pci_info(pdev, "lnkctl2 %#010x new tls %#03x\n",
> +		 lnkctl2, tls);
> +
> +	return 0;
> +}
> +
>  static bool pcie_retrain_link(struct pcie_link_state *link)
>  {
>  	struct pci_dev *parent = link->pdev;
>  	unsigned long end_jiffies;
>  	u16 reg16;
>  
> +top:
>  	pcie_capability_read_word(parent, PCI_EXP_LNKCTL, &reg16);
>  	reg16 |= PCI_EXP_LNKCTL_RL;
>  	pcie_capability_write_word(parent, PCI_EXP_LNKCTL, reg16);
> @@ -216,10 +246,14 @@ static bool pcie_retrain_link(struct pcie_link_state *link)
>  	do {
>  		pcie_capability_read_word(parent, PCI_EXP_LNKSTA, &reg16);
>  		if (!(reg16 & PCI_EXP_LNKSTA_LT))
> -			break;
> +			return true;	/* success */
>  		msleep(1);
>  	} while (time_before(jiffies, end_jiffies));
> -	return !(reg16 & PCI_EXP_LNKSTA_LT);
> +
> +	if (decrease_tls(parent))
> +		return false;	/* can't decrease any more */
> +
> +	goto top;
>  }
>  
>  /*

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: PCI trouble on mvebu (Turris Omnia)
  2020-10-28 23:16             ` Bjorn Helgaas
  2020-10-29 10:09               ` Pali Rohár
@ 2020-10-29 10:41               ` Toke Høiland-Jørgensen
  2020-10-29 11:18                 ` ™֟☻̭҇ Ѽ ҉ ®
  2020-10-30 11:23               ` Pali Rohár
  2 siblings, 1 reply; 48+ messages in thread
From: Toke Høiland-Jørgensen @ 2020-10-29 10:41 UTC (permalink / raw)
  To: Bjorn Helgaas, vtolkm
  Cc: linux-pci, linux-arm-kernel, Rob Herring, Ilias Apalodimas,
	Pali Rohár, Marek Behún, Thomas Petazzoni,
	Jason Cooper

Bjorn Helgaas <helgaas@kernel.org> writes:

> [+cc Pali, Marek, Thomas, Jason]
>
> On Wed, Oct 28, 2020 at 04:40:00PM +0000, ™֟☻̭҇ Ѽ ҉ ® wrote:
>> On 28/10/2020 16:08, Toke Høiland-Jørgensen wrote:
>> > Bjorn Helgaas <helgaas@kernel.org> writes:
>> > > On Wed, Oct 28, 2020 at 02:36:13PM +0100, Toke Høiland-Jørgensen wrote:
>> > > > Toke Høiland-Jørgensen <toke@redhat.com> writes:
>> > > > > Bjorn Helgaas <helgaas@kernel.org> writes:
>> > > > > 
>> > > > > > [+cc vtolkm]
>> > > > > > 
>> > > > > > On Tue, Oct 27, 2020 at 04:43:20PM +0100, Toke Høiland-Jørgensen wrote:
>> > > > > > > Hi everyone
>> > > > > > > 
>> > > > > > > I'm trying to get a mainline kernel to run on my Turris Omnia, and am
>> > > > > > > having some trouble getting the PCI bus to work correctly. Specifically,
>> > > > > > > I'm running a 5.10-rc1 kernel (torvalds/master as of this moment), with
>> > > > > > > the resource request fix[0] applied on top.
>> > > > > > > 
>> > > > > > > The kernel boots fine, and the patch in [0] makes the PCI devices show
>> > > > > > > up. But I'm still getting initialisation errors like these:
>> > > > > > > 
>> > > > > > > [    1.632709] pci 0000:01:00.0: BAR 0: error updating (0xe0000004 != 0xffffffff)
>> > > > > > > [    1.632714] pci 0000:01:00.0: BAR 0: error updating (high 0x000000 != 0xffffffff)
>> > > > > > > [    1.632745] pci 0000:02:00.0: BAR 0: error updating (0xe0200004 != 0xffffffff)
>> > > > > > > [    1.632750] pci 0000:02:00.0: BAR 0: error updating (high 0x000000 != 0xffffffff)
>> > > > > > > 
>> > > > > > > and the WiFi drivers fail to initialise with what appears to me to be
>> > > > > > > errors related to the bus rather than to the drivers themselves:
>> > > > > > > 
>> > > > > > > [    3.509878] ath: phy0: Mac Chip Rev 0xfffc0.f is not supported by this driver
>> > > > > > > [    3.517049] ath: phy0: Unable to initialize hardware; initialization status: -95
>> > > > > > > [    3.524473] ath9k 0000:01:00.0: Failed to initialize device
>> > > > > > > [    3.530081] ath9k: probe of 0000:01:00.0 failed with error -95
>> > > > > > > [    3.536012] ath10k_pci 0000:02:00.0: of_irq_parse_pci: failed with rc=134
>> > > > > > > [    3.543049] pci 0000:00:02.0: enabling device (0140 -> 0142)
>> > > > > > > [    3.548735] ath10k_pci 0000:02:00.0: can't change power state from D3hot to D0 (config space inaccessible)
>> > > > > > > [    3.588592] ath10k_pci 0000:02:00.0: failed to wake up device : -110
>> > > > > > > [    3.595098] ath10k_pci: probe of 0000:02:00.0 failed with error -110
>> > > > > > > 
>> > > > > > > lspci looks OK, though:
>> > > > > > > 
>> > > > > > > # lspci
>> > > > > > > 00:01.0 PCI bridge: Marvell Technology Group Ltd. Device 6820 (rev 04)
>> > > > > > > 00:02.0 PCI bridge: Marvell Technology Group Ltd. Device 6820 (rev 04)
>> > > > > > > 00:03.0 PCI bridge: Marvell Technology Group Ltd. Device 6820 (rev 04)
>> > > > > > > 01:00.0 Network controller: Qualcomm Atheros AR9287 Wireless Network Adapter (PCI-Express) (rev 01)
>> > > > > > > 02:00.0 Network controller: Qualcomm Atheros QCA986x/988x 802.11ac Wireless Network Adapter (rev ff)
>> > > > > > > 
>> > > > > > > Does anyone have any clue what could be going on here? Is this a bug, or
>> > > > > > > did I miss something in my config or other initialisation? I've tried
>> > > > > > > with both the stock u-boot distributed with the board, and with an
>> > > > > > > upstream u-boot from latest master; doesn't seem to make any difference.
>> > > > > > Can you try turning off CONFIG_PCIEASPM?  We had a similar recent
>> > > > > > report at https://bugzilla.kernel.org/show_bug.cgi?id=209833 but I
>> > > > > > don't think we have a fix yet.
>> > > > > Yes! Turning that off does indeed help! Thanks a bunch :)
>> > > > > 
>> > > > > You mention that bisecting this would be helpful - I can try that
>> > > > > tomorrow; any idea when this was last working?
>> > > > OK, so I tried to bisect this, but, erm, I couldn't find a working
>> > > > revision to start from? I went all the way back to 4.10 (which is the
>> > > > first version to include the device tree file for the Omnia), and even
>> > > > on that, the wireless cards were failing to initialise with ASPM
>> > > > enabled...
>> > > I have no personal experience with this device; all I know is that the
>> > > bugzilla suggests that it worked in v5.4, which isn't much help.
>> > > 
>> > > Possibly the apparent regression was really a .config change, i.e.,
>> > > CONFIG_PCIEASPM was disabled in the v5.4 kernel vtolkm@ tested and it
>> > > "worked" but got enabled later and it started failing?
>> > Yeah, I suspect so. The OpenWrt config disables CONFIG_PCIEASPM by
>> > default and only turns it on for specific targets. So I guess that it's
>> > most likely that this has never worked...
>> > 
>> > > Maybe the debug patch below would be worth trying to see if it makes
>> > > any difference?  If it *does* help, try omitting the first hunk to see
>> > > if we just need to apply the quirk_enable_clear_retrain_link() quirk.
>> > Tried, doesn't help...
>> > 
>> > -Toke
>> 
>> Found this patch
>> 
>> https://github.com/openwrt/openwrt/blob/7c0496f29bed87326f1bf591ca25ace82373cfc7/target/linux/mvebu/patches-5.4/405-PCI-aardvark-Improve-link-training.patch
>> 
>> that mentions the Compex WLE900VX card, which, reading the lspci verbose
>> output from the bugtracker, seems to be the device in trouble.
>
> Interesting.  Indeed, the Compex WLE900VX card seems to have the
> Qualcomm Atheros QCA9880 on it, and it looks like Toke's system has
> the same device in it.
>
> The patch you mention (https://git.kernel.org/linus/43fc679ced18) is
> for aardvark, so of course doesn't help mvebu.
>
> PCIe hardware is supposed to automatically negotiate the highest link
> speed supported by both ends.  But software *is* allowed to set an
> upper limit (the Target Link Speed in Link Control 2).  If we initiate
> a retrain and the link doesn't come back up, I wonder if we should try
> to help the hardware out by using Target Link Speed to limit to a
> lower speed and attempting another retrain, something like this hacky
> patch: (please collect the dmesg log if you try this)

Well, I tried it, but don't see any of the 'lnkcap2' output from that
new function:

[    1.545853] mvebu-pcie soc:pcie: host bridge /soc/pcie ranges:
[    1.545878] mvebu-pcie soc:pcie:      MEM 0x00f1080000..0x00f1081fff -> 0x0000080000
[    1.545894] mvebu-pcie soc:pcie:      MEM 0x00f1040000..0x00f1041fff -> 0x0000040000
[    1.545907] mvebu-pcie soc:pcie:      MEM 0x00f1044000..0x00f1045fff -> 0x0000044000
[    1.545920] mvebu-pcie soc:pcie:      MEM 0x00f1048000..0x00f1049fff -> 0x0000048000
[    1.545933] mvebu-pcie soc:pcie:      MEM 0xffffffffffffffff..0x00fffffffe -> 0x0100000000
[    1.545945] mvebu-pcie soc:pcie:       IO 0xffffffffffffffff..0x00fffffffe -> 0x0100000000
[    1.545958] mvebu-pcie soc:pcie:      MEM 0xffffffffffffffff..0x00fffffffe -> 0x0200000000
[    1.545970] mvebu-pcie soc:pcie:       IO 0xffffffffffffffff..0x00fffffffe -> 0x0200000000
[    1.545982] mvebu-pcie soc:pcie:      MEM 0xffffffffffffffff..0x00fffffffe -> 0x0300000000
[    1.545994] mvebu-pcie soc:pcie:       IO 0xffffffffffffffff..0x00fffffffe -> 0x0300000000
[    1.546006] mvebu-pcie soc:pcie:      MEM 0xffffffffffffffff..0x00fffffffe -> 0x0400000000
[    1.546014] mvebu-pcie soc:pcie:       IO 0xffffffffffffffff..0x00fffffffe -> 0x0400000000
[    1.546181] mvebu-pcie soc:pcie: PCI host bridge to bus 0000:00
[    1.546190] pci_bus 0000:00: root bus resource [bus 00-ff]
[    1.546197] pci_bus 0000:00: root bus resource [mem 0xf1080000-0xf1081fff] (bus address [0x00080000-0x00081fff])
[    1.546204] pci_bus 0000:00: root bus resource [mem 0xf1040000-0xf1041fff] (bus address [0x00040000-0x00041fff])
[    1.546210] pci_bus 0000:00: root bus resource [mem 0xf1044000-0xf1045fff] (bus address [0x00044000-0x00045fff])
[    1.546216] pci_bus 0000:00: root bus resource [mem 0xf1048000-0xf1049fff] (bus address [0x00048000-0x00049fff])
[    1.546220] pci_bus 0000:00: root bus resource [mem 0xe0000000-0xe7ffffff]
[    1.546225] pci_bus 0000:00: root bus resource [io  0x1000-0xeffff]
[    1.546294] pci 0000:00:01.0: [11ab:6820] type 01 class 0x060400
[    1.546308] pci 0000:00:01.0: reg 0x38: [mem 0x00000000-0x000007ff pref]
[    1.546482] pci 0000:00:02.0: [11ab:6820] type 01 class 0x060400
[    1.546495] pci 0000:00:02.0: reg 0x38: [mem 0x00000000-0x000007ff pref]
[    1.546643] pci 0000:00:03.0: [11ab:6820] type 01 class 0x060400
[    1.546656] pci 0000:00:03.0: reg 0x38: [mem 0x00000000-0x000007ff pref]
[    1.547379] PCI: bus0: Fast back to back transfers disabled
[    1.547387] pci 0000:00:01.0: bridge configuration invalid ([bus 00-00]), reconfiguring
[    1.547394] pci 0000:00:02.0: bridge configuration invalid ([bus 00-00]), reconfiguring
[    1.547402] pci 0000:00:03.0: bridge configuration invalid ([bus 00-00]), reconfiguring
[    1.547484] pci 0000:01:00.0: [168c:002e] type 00 class 0x028000
[    1.547507] pci 0000:01:00.0: reg 0x10: [mem 0xe8000000-0xe800ffff 64bit]
[    1.547615] pci 0000:01:00.0: supports D1
[    1.547620] pci 0000:01:00.0: PME# supported from D0 D1 D3hot
[    1.547730] pci 0000:00:01.0: ASPM: current common clock configuration is inconsistent, reconfiguring
[    1.631937] PCI: bus2: Fast back to back transfers enabled
[    1.631945] pci_bus 0000:02: busn_res: [bus 02-ff] end is updated to 02
[    1.632655] PCI: bus3: Fast back to back transfers enabled
[    1.632662] pci_bus 0000:03: busn_res: [bus 03-ff] end is updated to 03
[    1.632694] pci 0000:00:01.0: BAR 8: assigned [mem 0xe0000000-0xe00fffff]
[    1.632702] pci 0000:00:02.0: BAR 8: assigned [mem 0xe0200000-0xe04fffff]
[    1.632710] pci 0000:00:01.0: BAR 6: assigned [mem 0xe0100000-0xe01007ff pref]
[    1.632718] pci 0000:00:02.0: BAR 6: assigned [mem 0xe0500000-0xe05007ff pref]
[    1.632726] pci 0000:00:03.0: BAR 6: assigned [mem 0xe0600000-0xe06007ff pref]
[    1.632734] pci 0000:01:00.0: BAR 0: assigned [mem 0xe0000000-0xe000ffff 64bit]
[    1.632741] pci 0000:01:00.0: BAR 0: error updating (0xe0000004 != 0xffffffff)
[    1.632746] pci 0000:01:00.0: BAR 0: error updating (high 0x000000 != 0xffffffff)
[    1.632752] pci 0000:00:01.0: PCI bridge to [bus 01]
[    1.632760] pci 0000:00:01.0:   bridge window [mem 0xe0000000-0xe00fffff]
[    1.632769] pci 0000:02:00.0: BAR 0: assigned [mem 0xe0200000-0xe03fffff 64bit]
[    1.632776] pci 0000:02:00.0: BAR 0: error updating (0xe0200004 != 0xffffffff)
[    1.632782] pci 0000:02:00.0: BAR 0: error updating (high 0x000000 != 0xffffffff)
[    1.632788] pci 0000:02:00.0: BAR 6: assigned [mem 0xe0400000-0xe040ffff pref]
[    1.632793] pci 0000:00:02.0: PCI bridge to [bus 02]
[    1.632800] pci 0000:00:02.0:   bridge window [mem 0xe0200000-0xe04fffff]
[    1.632807] pci 0000:00:03.0: PCI bridge to [bus 03]

(and then later, still):
[    3.476364] pci 0000:00:01.0: enabling device (0140 -> 0142)
[    3.477542] ata1: SATA link down (SStatus 0 SControl 300)
[    3.482126] ath9k 0000:01:00.0: enabling device (0000 -> 0002)
[    3.487487] ata2: SATA link down (SStatus 0 SControl 300)
[    3.493379] ath: phy0: Mac Chip Rev 0xfffc0.f is not supported by this driver
[    3.505891] ath: phy0: Unable to initialize hardware; initialization status: -95
[    3.513325] ath9k 0000:01:00.0: Failed to initialize device
[    3.518933] ath9k: probe of 0000:01:00.0 failed with error -95
[    3.524862] ath10k_pci 0000:02:00.0: of_irq_parse_pci: failed with rc=134
[    3.531904] pci 0000:00:02.0: enabling device (0140 -> 0142)
[    3.537590] ath10k_pci 0000:02:00.0: can't change power state from D3hot to D0 (config space inaccessible)
[    3.577436] ath10k_pci 0000:02:00.0: failed to wake up device : -110
[    3.583948] ath10k_pci: probe of 0000:02:00.0 failed with error -110


-Toke


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: PCI trouble on mvebu (Turris Omnia)
  2020-10-29 10:09               ` Pali Rohár
@ 2020-10-29 10:56                 ` ™֟☻̭҇ Ѽ ҉ ®
  2020-10-29 11:12                 ` Toke Høiland-Jørgensen
  1 sibling, 0 replies; 48+ messages in thread
From: ™֟☻̭҇ Ѽ ҉ ® @ 2020-10-29 10:56 UTC (permalink / raw)
  To: Pali Rohár, Bjorn Helgaas
  Cc: Toke Høiland-Jørgensen, linux-pci, linux-arm-kernel,
	Rob Herring, Ilias Apalodimas, Marek Behún,
	Thomas Petazzoni, Jason Cooper

On 29/10/2020 11:09, Pali Rohár wrote:
> Hello!
>
> On Wednesday 28 October 2020 18:16:26 Bjorn Helgaas wrote:
>> [+cc Pali, Marek, Thomas, Jason]
>>
>> On Wed, Oct 28, 2020 at 04:40:00PM +0000, ™֟☻̭҇ Ѽ ҉ ® wrote:
>>> On 28/10/2020 16:08, Toke Høiland-Jørgensen wrote:
>>>> Bjorn Helgaas <helgaas@kernel.org> writes:
>>>>> On Wed, Oct 28, 2020 at 02:36:13PM +0100, Toke Høiland-Jørgensen wrote:
>>>>>> Toke Høiland-Jørgensen <toke@redhat.com> writes:
>>>>>>> Bjorn Helgaas <helgaas@kernel.org> writes:
>>>>>>>
>>>>>>>> [+cc vtolkm]
>>>>>>>>
>>>>>>>> On Tue, Oct 27, 2020 at 04:43:20PM +0100, Toke Høiland-Jørgensen wrote:
>>>>>>>>> Hi everyone
>>>>>>>>>
>>>>>>>>> I'm trying to get a mainline kernel to run on my Turris Omnia, and am
>>>>>>>>> having some trouble getting the PCI bus to work correctly. Specifically,
>>>>>>>>> I'm running a 5.10-rc1 kernel (torvalds/master as of this moment), with
>>>>>>>>> the resource request fix[0] applied on top.
>>>>>>>>>
>>>>>>>>> The kernel boots fine, and the patch in [0] makes the PCI devices show
>>>>>>>>> up. But I'm still getting initialisation errors like these:
>>>>>>>>>
>>>>>>>>> [    1.632709] pci 0000:01:00.0: BAR 0: error updating (0xe0000004 != 0xffffffff)
>>>>>>>>> [    1.632714] pci 0000:01:00.0: BAR 0: error updating (high 0x000000 != 0xffffffff)
>>>>>>>>> [    1.632745] pci 0000:02:00.0: BAR 0: error updating (0xe0200004 != 0xffffffff)
>>>>>>>>> [    1.632750] pci 0000:02:00.0: BAR 0: error updating (high 0x000000 != 0xffffffff)
>>>>>>>>>
>>>>>>>>> and the WiFi drivers fail to initialise with what appears to me to be
>>>>>>>>> errors related to the bus rather than to the drivers themselves:
>>>>>>>>>
>>>>>>>>> [    3.509878] ath: phy0: Mac Chip Rev 0xfffc0.f is not supported by this driver
>>>>>>>>> [    3.517049] ath: phy0: Unable to initialize hardware; initialization status: -95
>>>>>>>>> [    3.524473] ath9k 0000:01:00.0: Failed to initialize device
>>>>>>>>> [    3.530081] ath9k: probe of 0000:01:00.0 failed with error -95
>>>>>>>>> [    3.536012] ath10k_pci 0000:02:00.0: of_irq_parse_pci: failed with rc=134
>>>>>>>>> [    3.543049] pci 0000:00:02.0: enabling device (0140 -> 0142)
>>>>>>>>> [    3.548735] ath10k_pci 0000:02:00.0: can't change power state from D3hot to D0 (config space inaccessible)
>>>>>>>>> [    3.588592] ath10k_pci 0000:02:00.0: failed to wake up device : -110
>>>>>>>>> [    3.595098] ath10k_pci: probe of 0000:02:00.0 failed with error -110
>>>>>>>>>
>>>>>>>>> lspci looks OK, though:
>>>>>>>>>
>>>>>>>>> # lspci
>>>>>>>>> 00:01.0 PCI bridge: Marvell Technology Group Ltd. Device 6820 (rev 04)
>>>>>>>>> 00:02.0 PCI bridge: Marvell Technology Group Ltd. Device 6820 (rev 04)
>>>>>>>>> 00:03.0 PCI bridge: Marvell Technology Group Ltd. Device 6820 (rev 04)
>>>>>>>>> 01:00.0 Network controller: Qualcomm Atheros AR9287 Wireless Network Adapter (PCI-Express) (rev 01)
>>>>>>>>> 02:00.0 Network controller: Qualcomm Atheros QCA986x/988x 802.11ac Wireless Network Adapter (rev ff)
>>>>>>>>>
>>>>>>>>> Does anyone have any clue what could be going on here? Is this a bug, or
>>>>>>>>> did I miss something in my config or other initialisation? I've tried
>>>>>>>>> with both the stock u-boot distributed with the board, and with an
>>>>>>>>> upstream u-boot from latest master; doesn't seem to make any difference.
>>>>>>>> Can you try turning off CONFIG_PCIEASPM?  We had a similar recent
>>>>>>>> report at https://bugzilla.kernel.org/show_bug.cgi?id=209833 but I
>>>>>>>> don't think we have a fix yet.
>>>>>>> Yes! Turning that off does indeed help! Thanks a bunch :)
> I have been testing mainline kernel on Turris Omnia with two PCIe
> default cards (WLE200 and WLE900) and it worked fine. But I do not know
> if I had ASPM enabled or not.
>
> So it is working fine for you when CONFIG_PCIEASPM is disabled and whole
> issue is only when CONFIG_PCIEASPM is enabled?

Yes, that is the gist of it.


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: PCI trouble on mvebu (Turris Omnia)
  2020-10-29 10:09               ` Pali Rohár
  2020-10-29 10:56                 ` ™֟☻̭҇ Ѽ ҉ ®
@ 2020-10-29 11:12                 ` Toke Høiland-Jørgensen
  2020-10-29 19:30                   ` Bjorn Helgaas
  1 sibling, 1 reply; 48+ messages in thread
From: Toke Høiland-Jørgensen @ 2020-10-29 11:12 UTC (permalink / raw)
  To: Pali Rohár, Bjorn Helgaas
  Cc: vtolkm, linux-pci, linux-arm-kernel, Rob Herring,
	Ilias Apalodimas, Marek Behún, Thomas Petazzoni,
	Jason Cooper

Pali Rohár <pali@kernel.org> writes:

> Hello!
>
> On Wednesday 28 October 2020 18:16:26 Bjorn Helgaas wrote:
>> [+cc Pali, Marek, Thomas, Jason]
>> 
>> On Wed, Oct 28, 2020 at 04:40:00PM +0000, ™֟☻̭҇ Ѽ ҉ ® wrote:
>> > On 28/10/2020 16:08, Toke Høiland-Jørgensen wrote:
>> > > Bjorn Helgaas <helgaas@kernel.org> writes:
>> > > > On Wed, Oct 28, 2020 at 02:36:13PM +0100, Toke Høiland-Jørgensen wrote:
>> > > > > Toke Høiland-Jørgensen <toke@redhat.com> writes:
>> > > > > > Bjorn Helgaas <helgaas@kernel.org> writes:
>> > > > > > 
>> > > > > > > [+cc vtolkm]
>> > > > > > > 
>> > > > > > > On Tue, Oct 27, 2020 at 04:43:20PM +0100, Toke Høiland-Jørgensen wrote:
>> > > > > > > > Hi everyone
>> > > > > > > > 
>> > > > > > > > I'm trying to get a mainline kernel to run on my Turris Omnia, and am
>> > > > > > > > having some trouble getting the PCI bus to work correctly. Specifically,
>> > > > > > > > I'm running a 5.10-rc1 kernel (torvalds/master as of this moment), with
>> > > > > > > > the resource request fix[0] applied on top.
>> > > > > > > > 
>> > > > > > > > The kernel boots fine, and the patch in [0] makes the PCI devices show
>> > > > > > > > up. But I'm still getting initialisation errors like these:
>> > > > > > > > 
>> > > > > > > > [    1.632709] pci 0000:01:00.0: BAR 0: error updating (0xe0000004 != 0xffffffff)
>> > > > > > > > [    1.632714] pci 0000:01:00.0: BAR 0: error updating (high 0x000000 != 0xffffffff)
>> > > > > > > > [    1.632745] pci 0000:02:00.0: BAR 0: error updating (0xe0200004 != 0xffffffff)
>> > > > > > > > [    1.632750] pci 0000:02:00.0: BAR 0: error updating (high 0x000000 != 0xffffffff)
>> > > > > > > > 
>> > > > > > > > and the WiFi drivers fail to initialise with what appears to me to be
>> > > > > > > > errors related to the bus rather than to the drivers themselves:
>> > > > > > > > 
>> > > > > > > > [    3.509878] ath: phy0: Mac Chip Rev 0xfffc0.f is not supported by this driver
>> > > > > > > > [    3.517049] ath: phy0: Unable to initialize hardware; initialization status: -95
>> > > > > > > > [    3.524473] ath9k 0000:01:00.0: Failed to initialize device
>> > > > > > > > [    3.530081] ath9k: probe of 0000:01:00.0 failed with error -95
>> > > > > > > > [    3.536012] ath10k_pci 0000:02:00.0: of_irq_parse_pci: failed with rc=134
>> > > > > > > > [    3.543049] pci 0000:00:02.0: enabling device (0140 -> 0142)
>> > > > > > > > [    3.548735] ath10k_pci 0000:02:00.0: can't change power state from D3hot to D0 (config space inaccessible)
>> > > > > > > > [    3.588592] ath10k_pci 0000:02:00.0: failed to wake up device : -110
>> > > > > > > > [    3.595098] ath10k_pci: probe of 0000:02:00.0 failed with error -110
>> > > > > > > > 
>> > > > > > > > lspci looks OK, though:
>> > > > > > > > 
>> > > > > > > > # lspci
>> > > > > > > > 00:01.0 PCI bridge: Marvell Technology Group Ltd. Device 6820 (rev 04)
>> > > > > > > > 00:02.0 PCI bridge: Marvell Technology Group Ltd. Device 6820 (rev 04)
>> > > > > > > > 00:03.0 PCI bridge: Marvell Technology Group Ltd. Device 6820 (rev 04)
>> > > > > > > > 01:00.0 Network controller: Qualcomm Atheros AR9287 Wireless Network Adapter (PCI-Express) (rev 01)
>> > > > > > > > 02:00.0 Network controller: Qualcomm Atheros QCA986x/988x 802.11ac Wireless Network Adapter (rev ff)
>> > > > > > > > 
>> > > > > > > > Does anyone have any clue what could be going on here? Is this a bug, or
>> > > > > > > > did I miss something in my config or other initialisation? I've tried
>> > > > > > > > with both the stock u-boot distributed with the board, and with an
>> > > > > > > > upstream u-boot from latest master; doesn't seem to make any difference.
>> > > > > > > Can you try turning off CONFIG_PCIEASPM?  We had a similar recent
>> > > > > > > report at https://bugzilla.kernel.org/show_bug.cgi?id=209833 but I
>> > > > > > > don't think we have a fix yet.
>> > > > > > Yes! Turning that off does indeed help! Thanks a bunch :)
>
> I have been testing mainline kernel on Turris Omnia with two PCIe
> default cards (WLE200 and WLE900) and it worked fine. But I do not know
> if I had ASPM enabled or not.
>
> So it is working fine for you when CONFIG_PCIEASPM is disabled and whole
> issue is only when CONFIG_PCIEASPM is enabled?

Yup, exactly. And I'm also currently testing with the default WLE200/900
cards... I just tried sticking an MT76-based WiFi card into the third
PCI slot, and that doesn't come up either when I enable PCIEASPM.

-Toke


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: PCI trouble on mvebu (Turris Omnia)
  2020-10-29 10:41               ` Toke Høiland-Jørgensen
@ 2020-10-29 11:18                 ` ™֟☻̭҇ Ѽ ҉ ®
  0 siblings, 0 replies; 48+ messages in thread
From: ™֟☻̭҇ Ѽ ҉ ® @ 2020-10-29 11:18 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen, Bjorn Helgaas
  Cc: linux-pci, linux-arm-kernel, Rob Herring, Ilias Apalodimas,
	Pali Rohár, Marek Behún, Thomas Petazzoni,
	Jason Cooper


On 29/10/2020 11:41, Toke Høiland-Jørgensen wrote:
> Bjorn Helgaas <helgaas@kernel.org> writes:
>
>> [+cc Pali, Marek, Thomas, Jason]
>>
>> On Wed, Oct 28, 2020 at 04:40:00PM +0000, ™֟☻̭҇ Ѽ ҉ ® wrote:
>>> On 28/10/2020 16:08, Toke Høiland-Jørgensen wrote:
>>>> Bjorn Helgaas <helgaas@kernel.org> writes:
>>>>> On Wed, Oct 28, 2020 at 02:36:13PM +0100, Toke Høiland-Jørgensen wrote:
>>>>>> Toke Høiland-Jørgensen <toke@redhat.com> writes:
>>>>>>> Bjorn Helgaas <helgaas@kernel.org> writes:
>>>>>>>
>>>>>>>> [+cc vtolkm]
>>>>>>>>
>>>>>>>> On Tue, Oct 27, 2020 at 04:43:20PM +0100, Toke Høiland-Jørgensen wrote:
>>>>>>>>> Hi everyone
>>>>>>>>>
>>>>>>>>> I'm trying to get a mainline kernel to run on my Turris Omnia, and am
>>>>>>>>> having some trouble getting the PCI bus to work correctly. Specifically,
>>>>>>>>> I'm running a 5.10-rc1 kernel (torvalds/master as of this moment), with
>>>>>>>>> the resource request fix[0] applied on top.
>>>>>>>>>
>>>>>>>>> The kernel boots fine, and the patch in [0] makes the PCI devices show
>>>>>>>>> up. But I'm still getting initialisation errors like these:
>>>>>>>>>
>>>>>>>>> [    1.632709] pci 0000:01:00.0: BAR 0: error updating (0xe0000004 != 0xffffffff)
>>>>>>>>> [    1.632714] pci 0000:01:00.0: BAR 0: error updating (high 0x000000 != 0xffffffff)
>>>>>>>>> [    1.632745] pci 0000:02:00.0: BAR 0: error updating (0xe0200004 != 0xffffffff)
>>>>>>>>> [    1.632750] pci 0000:02:00.0: BAR 0: error updating (high 0x000000 != 0xffffffff)
>>>>>>>>>
>>>>>>>>> and the WiFi drivers fail to initialise with what appears to me to be
>>>>>>>>> errors related to the bus rather than to the drivers themselves:
>>>>>>>>>
>>>>>>>>> [    3.509878] ath: phy0: Mac Chip Rev 0xfffc0.f is not supported by this driver
>>>>>>>>> [    3.517049] ath: phy0: Unable to initialize hardware; initialization status: -95
>>>>>>>>> [    3.524473] ath9k 0000:01:00.0: Failed to initialize device
>>>>>>>>> [    3.530081] ath9k: probe of 0000:01:00.0 failed with error -95
>>>>>>>>> [    3.536012] ath10k_pci 0000:02:00.0: of_irq_parse_pci: failed with rc=134
>>>>>>>>> [    3.543049] pci 0000:00:02.0: enabling device (0140 -> 0142)
>>>>>>>>> [    3.548735] ath10k_pci 0000:02:00.0: can't change power state from D3hot to D0 (config space inaccessible)
>>>>>>>>> [    3.588592] ath10k_pci 0000:02:00.0: failed to wake up device : -110
>>>>>>>>> [    3.595098] ath10k_pci: probe of 0000:02:00.0 failed with error -110
>>>>>>>>>
>>>>>>>>> lspci looks OK, though:
>>>>>>>>>
>>>>>>>>> # lspci
>>>>>>>>> 00:01.0 PCI bridge: Marvell Technology Group Ltd. Device 6820 (rev 04)
>>>>>>>>> 00:02.0 PCI bridge: Marvell Technology Group Ltd. Device 6820 (rev 04)
>>>>>>>>> 00:03.0 PCI bridge: Marvell Technology Group Ltd. Device 6820 (rev 04)
>>>>>>>>> 01:00.0 Network controller: Qualcomm Atheros AR9287 Wireless Network Adapter (PCI-Express) (rev 01)
>>>>>>>>> 02:00.0 Network controller: Qualcomm Atheros QCA986x/988x 802.11ac Wireless Network Adapter (rev ff)
>>>>>>>>>
>>>>>>>>> Does anyone have any clue what could be going on here? Is this a bug, or
>>>>>>>>> did I miss something in my config or other initialisation? I've tried
>>>>>>>>> with both the stock u-boot distributed with the board, and with an
>>>>>>>>> upstream u-boot from latest master; doesn't seem to make any difference.
>>>>>>>> Can you try turning off CONFIG_PCIEASPM?  We had a similar recent
>>>>>>>> report at https://bugzilla.kernel.org/show_bug.cgi?id=209833 but I
>>>>>>>> don't think we have a fix yet.
>>>>>>> Yes! Turning that off does indeed help! Thanks a bunch :)
>>>>>>>
>>>>>>> You mention that bisecting this would be helpful - I can try that
>>>>>>> tomorrow; any idea when this was last working?
>>>>>> OK, so I tried to bisect this, but, erm, I couldn't find a working
>>>>>> revision to start from? I went all the way back to 4.10 (which is the
>>>>>> first version to include the device tree file for the Omnia), and even
>>>>>> on that, the wireless cards were failing to initialise with ASPM
>>>>>> enabled...
>>>>> I have no personal experience with this device; all I know is that the
>>>>> bugzilla suggests that it worked in v5.4, which isn't much help.
>>>>>
>>>>> Possibly the apparent regression was really a .config change, i.e.,
>>>>> CONFIG_PCIEASPM was disabled in the v5.4 kernel vtolkm@ tested and it
>>>>> "worked" but got enabled later and it started failing?
>>>> Yeah, I suspect so. The OpenWrt config disables CONFIG_PCIEASPM by
>>>> default and only turns it on for specific targets. So I guess that it's
>>>> most likely that this has never worked...
>>>>
>>>>> Maybe the debug patch below would be worth trying to see if it makes
>>>>> any difference?  If it *does* help, try omitting the first hunk to see
>>>>> if we just need to apply the quirk_enable_clear_retrain_link() quirk.
>>>> Tried, doesn't help...
>>>>
>>>> -Toke
>>> Found this patch
>>>
>>> https://github.com/openwrt/openwrt/blob/7c0496f29bed87326f1bf591ca25ace82373cfc7/target/linux/mvebu/patches-5.4/405-PCI-aardvark-Improve-link-training.patch
>>>
>>> that mentions the Compex WLE900VX card which, reading the lspci verbose
>>> output from the bugtracker, seems to be the device having trouble.
>> Interesting.  Indeed, the Compex WLE900VX card seems to have the
>> Qualcomm Atheros QCA9880 on it, and it looks like Toke's system has
>> the same device in it.
>>
>> The patch you mention (https://git.kernel.org/linus/43fc679ced18) is
>> for aardvark, so of course doesn't help mvebu.
>>
>> PCIe hardware is supposed to automatically negotiate the highest link
>> speed supported by both ends.  But software *is* allowed to set an
>> upper limit (the Target Link Speed in Link Control 2).  If we initiate
>> a retrain and the link doesn't come back up, I wonder if we should try
>> to help the hardware out by using Target Link Speed to limit to a
>> lower speed and attempting another retrain, something like this hacky
>> patch: (please collect the dmesg log if you try this)
> Well, I tried it, but don't see any of the 'lnkcap2' output from that
> new function:
>
> [    1.545853] mvebu-pcie soc:pcie: host bridge /soc/pcie ranges:
> [    1.545878] mvebu-pcie soc:pcie:      MEM 0x00f1080000..0x00f1081fff -> 0x0000080000
> [    1.545894] mvebu-pcie soc:pcie:      MEM 0x00f1040000..0x00f1041fff -> 0x0000040000
> [    1.545907] mvebu-pcie soc:pcie:      MEM 0x00f1044000..0x00f1045fff -> 0x0000044000
> [    1.545920] mvebu-pcie soc:pcie:      MEM 0x00f1048000..0x00f1049fff -> 0x0000048000
> [    1.545933] mvebu-pcie soc:pcie:      MEM 0xffffffffffffffff..0x00fffffffe -> 0x0100000000
> [    1.545945] mvebu-pcie soc:pcie:       IO 0xffffffffffffffff..0x00fffffffe -> 0x0100000000
> [    1.545958] mvebu-pcie soc:pcie:      MEM 0xffffffffffffffff..0x00fffffffe -> 0x0200000000
> [    1.545970] mvebu-pcie soc:pcie:       IO 0xffffffffffffffff..0x00fffffffe -> 0x0200000000
> [    1.545982] mvebu-pcie soc:pcie:      MEM 0xffffffffffffffff..0x00fffffffe -> 0x0300000000
> [    1.545994] mvebu-pcie soc:pcie:       IO 0xffffffffffffffff..0x00fffffffe -> 0x0300000000
> [    1.546006] mvebu-pcie soc:pcie:      MEM 0xffffffffffffffff..0x00fffffffe -> 0x0400000000
> [    1.546014] mvebu-pcie soc:pcie:       IO 0xffffffffffffffff..0x00fffffffe -> 0x0400000000
> [    1.546181] mvebu-pcie soc:pcie: PCI host bridge to bus 0000:00
> [    1.546190] pci_bus 0000:00: root bus resource [bus 00-ff]
> [    1.546197] pci_bus 0000:00: root bus resource [mem 0xf1080000-0xf1081fff] (bus address [0x00080000-0x00081fff])
> [    1.546204] pci_bus 0000:00: root bus resource [mem 0xf1040000-0xf1041fff] (bus address [0x00040000-0x00041fff])
> [    1.546210] pci_bus 0000:00: root bus resource [mem 0xf1044000-0xf1045fff] (bus address [0x00044000-0x00045fff])
> [    1.546216] pci_bus 0000:00: root bus resource [mem 0xf1048000-0xf1049fff] (bus address [0x00048000-0x00049fff])
> [    1.546220] pci_bus 0000:00: root bus resource [mem 0xe0000000-0xe7ffffff]
> [    1.546225] pci_bus 0000:00: root bus resource [io  0x1000-0xeffff]
> [    1.546294] pci 0000:00:01.0: [11ab:6820] type 01 class 0x060400
> [    1.546308] pci 0000:00:01.0: reg 0x38: [mem 0x00000000-0x000007ff pref]
> [    1.546482] pci 0000:00:02.0: [11ab:6820] type 01 class 0x060400
> [    1.546495] pci 0000:00:02.0: reg 0x38: [mem 0x00000000-0x000007ff pref]
> [    1.546643] pci 0000:00:03.0: [11ab:6820] type 01 class 0x060400
> [    1.546656] pci 0000:00:03.0: reg 0x38: [mem 0x00000000-0x000007ff pref]
> [    1.547379] PCI: bus0: Fast back to back transfers disabled
> [    1.547387] pci 0000:00:01.0: bridge configuration invalid ([bus 00-00]), reconfiguring
> [    1.547394] pci 0000:00:02.0: bridge configuration invalid ([bus 00-00]), reconfiguring
> [    1.547402] pci 0000:00:03.0: bridge configuration invalid ([bus 00-00]), reconfiguring
> [    1.547484] pci 0000:01:00.0: [168c:002e] type 00 class 0x028000
> [    1.547507] pci 0000:01:00.0: reg 0x10: [mem 0xe8000000-0xe800ffff 64bit]
> [    1.547615] pci 0000:01:00.0: supports D1
> [    1.547620] pci 0000:01:00.0: PME# supported from D0 D1 D3hot
> [    1.547730] pci 0000:00:01.0: ASPM: current common clock configuration is inconsistent, reconfiguring
> [    1.631937] PCI: bus2: Fast back to back transfers enabled
> [    1.631945] pci_bus 0000:02: busn_res: [bus 02-ff] end is updated to 02
> [    1.632655] PCI: bus3: Fast back to back transfers enabled
> [    1.632662] pci_bus 0000:03: busn_res: [bus 03-ff] end is updated to 03
> [    1.632694] pci 0000:00:01.0: BAR 8: assigned [mem 0xe0000000-0xe00fffff]
> [    1.632702] pci 0000:00:02.0: BAR 8: assigned [mem 0xe0200000-0xe04fffff]
> [    1.632710] pci 0000:00:01.0: BAR 6: assigned [mem 0xe0100000-0xe01007ff pref]
> [    1.632718] pci 0000:00:02.0: BAR 6: assigned [mem 0xe0500000-0xe05007ff pref]
> [    1.632726] pci 0000:00:03.0: BAR 6: assigned [mem 0xe0600000-0xe06007ff pref]
> [    1.632734] pci 0000:01:00.0: BAR 0: assigned [mem 0xe0000000-0xe000ffff 64bit]
> [    1.632741] pci 0000:01:00.0: BAR 0: error updating (0xe0000004 != 0xffffffff)
> [    1.632746] pci 0000:01:00.0: BAR 0: error updating (high 0x000000 != 0xffffffff)
> [    1.632752] pci 0000:00:01.0: PCI bridge to [bus 01]
> [    1.632760] pci 0000:00:01.0:   bridge window [mem 0xe0000000-0xe00fffff]
> [    1.632769] pci 0000:02:00.0: BAR 0: assigned [mem 0xe0200000-0xe03fffff 64bit]
> [    1.632776] pci 0000:02:00.0: BAR 0: error updating (0xe0200004 != 0xffffffff)
> [    1.632782] pci 0000:02:00.0: BAR 0: error updating (high 0x000000 != 0xffffffff)
> [    1.632788] pci 0000:02:00.0: BAR 6: assigned [mem 0xe0400000-0xe040ffff pref]
> [    1.632793] pci 0000:00:02.0: PCI bridge to [bus 02]
> [    1.632800] pci 0000:00:02.0:   bridge window [mem 0xe0200000-0xe04fffff]
> [    1.632807] pci 0000:00:03.0: PCI bridge to [bus 03]
>
> (and then later, still):
> [    3.476364] pci 0000:00:01.0: enabling device (0140 -> 0142)
> [    3.477542] ata1: SATA link down (SStatus 0 SControl 300)
> [    3.482126] ath9k 0000:01:00.0: enabling device (0000 -> 0002)
> [    3.487487] ata2: SATA link down (SStatus 0 SControl 300)
> [    3.493379] ath: phy0: Mac Chip Rev 0xfffc0.f is not supported by this driver
> [    3.505891] ath: phy0: Unable to initialize hardware; initialization status: -95
> [    3.513325] ath9k 0000:01:00.0: Failed to initialize device
> [    3.518933] ath9k: probe of 0000:01:00.0 failed with error -95
> [    3.524862] ath10k_pci 0000:02:00.0: of_irq_parse_pci: failed with rc=134
> [    3.531904] pci 0000:00:02.0: enabling device (0140 -> 0142)
> [    3.537590] ath10k_pci 0000:02:00.0: can't change power state from D3hot to D0 (config space inaccessible)
> [    3.577436] ath10k_pci 0000:02:00.0: failed to wake up device : -110
> [    3.583948] ath10k_pci: probe of 0000:02:00.0 failed with error -110
>
>
> -Toke
>

Same result on my end; run-tested with next-20201027.

N.B. The node no longer boots with next-20201028, but that is
independent of this patch and apparently a separate issue.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: PCI trouble on mvebu (Turris Omnia)
  2020-10-28 15:08         ` Toke Høiland-Jørgensen
  2020-10-28 16:40           ` ™֟☻̭҇ Ѽ ҉ ®
@ 2020-10-29 15:12           ` Rob Herring
  1 sibling, 0 replies; 48+ messages in thread
From: Rob Herring @ 2020-10-29 15:12 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen
  Cc: Bjorn Helgaas, PCI, linux-arm-kernel, Ilias Apalodimas, vtolkm

On Wed, Oct 28, 2020 at 10:08 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>
> Bjorn Helgaas <helgaas@kernel.org> writes:
>
> > On Wed, Oct 28, 2020 at 02:36:13PM +0100, Toke Høiland-Jørgensen wrote:
> >> Toke Høiland-Jørgensen <toke@redhat.com> writes:
> >>
> >> > Bjorn Helgaas <helgaas@kernel.org> writes:
> >> >
> >> >> [+cc vtolkm]
> >> >>
> >> >> On Tue, Oct 27, 2020 at 04:43:20PM +0100, Toke Høiland-Jørgensen wrote:
> >> >>> Hi everyone
> >> >>>
> >> >>> I'm trying to get a mainline kernel to run on my Turris Omnia, and am
> >> >>> having some trouble getting the PCI bus to work correctly. Specifically,
> >> >>> I'm running a 5.10-rc1 kernel (torvalds/master as of this moment), with
> >> >>> the resource request fix[0] applied on top.
> >> >>>
> >> >>> The kernel boots fine, and the patch in [0] makes the PCI devices show
> >> >>> up. But I'm still getting initialisation errors like these:
> >> >>>
> >> >>> [    1.632709] pci 0000:01:00.0: BAR 0: error updating (0xe0000004 != 0xffffffff)
> >> >>> [    1.632714] pci 0000:01:00.0: BAR 0: error updating (high 0x000000 != 0xffffffff)
> >> >>> [    1.632745] pci 0000:02:00.0: BAR 0: error updating (0xe0200004 != 0xffffffff)
> >> >>> [    1.632750] pci 0000:02:00.0: BAR 0: error updating (high 0x000000 != 0xffffffff)
> >> >>>
> >> >>> and the WiFi drivers fail to initialise with what appears to me to be
> >> >>> errors related to the bus rather than to the drivers themselves:
> >> >>>
> >> >>> [    3.509878] ath: phy0: Mac Chip Rev 0xfffc0.f is not supported by this driver
> >> >>> [    3.517049] ath: phy0: Unable to initialize hardware; initialization status: -95
> >> >>> [    3.524473] ath9k 0000:01:00.0: Failed to initialize device
> >> >>> [    3.530081] ath9k: probe of 0000:01:00.0 failed with error -95
> >> >>> [    3.536012] ath10k_pci 0000:02:00.0: of_irq_parse_pci: failed with rc=134
> >> >>> [    3.543049] pci 0000:00:02.0: enabling device (0140 -> 0142)
> >> >>> [    3.548735] ath10k_pci 0000:02:00.0: can't change power state from D3hot to D0 (config space inaccessible)
> >> >>> [    3.588592] ath10k_pci 0000:02:00.0: failed to wake up device : -110
> >> >>> [    3.595098] ath10k_pci: probe of 0000:02:00.0 failed with error -110
> >> >>>
> >> >>> lspci looks OK, though:
> >> >>>
> >> >>> # lspci
> >> >>> 00:01.0 PCI bridge: Marvell Technology Group Ltd. Device 6820 (rev 04)
> >> >>> 00:02.0 PCI bridge: Marvell Technology Group Ltd. Device 6820 (rev 04)
> >> >>> 00:03.0 PCI bridge: Marvell Technology Group Ltd. Device 6820 (rev 04)
> >> >>> 01:00.0 Network controller: Qualcomm Atheros AR9287 Wireless Network Adapter (PCI-Express) (rev 01)
> >> >>> 02:00.0 Network controller: Qualcomm Atheros QCA986x/988x 802.11ac Wireless Network Adapter (rev ff)
> >> >>>
> >> >>> Does anyone have any clue what could be going on here? Is this a bug, or
> >> >>> did I miss something in my config or other initialisation? I've tried
> >> >>> with both the stock u-boot distributed with the board, and with an
> >> >>> upstream u-boot from latest master; doesn't seem to make any difference.
> >> >>
> >> >> Can you try turning off CONFIG_PCIEASPM?  We had a similar recent
> >> >> report at https://bugzilla.kernel.org/show_bug.cgi?id=209833 but I
> >> >> don't think we have a fix yet.
> >> >
> >> > Yes! Turning that off does indeed help! Thanks a bunch :)
> >> >
> >> > You mention that bisecting this would be helpful - I can try that
> >> > tomorrow; any idea when this was last working?
> >>
> >> OK, so I tried to bisect this, but, erm, I couldn't find a working
> >> revision to start from? I went all the way back to 4.10 (which is the
> >> first version to include the device tree file for the Omnia), and even
> >> on that, the wireless cards were failing to initialise with ASPM
> >> enabled...
> >
> > I have no personal experience with this device; all I know is that the
> > bugzilla suggests that it worked in v5.4, which isn't much help.
> >
> > Possibly the apparent regression was really a .config change, i.e.,
> > CONFIG_PCIEASPM was disabled in the v5.4 kernel vtolkm@ tested and it
> > "worked" but got enabled later and it started failing?
>
> Yeah, I suspect so. The OpenWrt config disables CONFIG_PCIEASPM by
> default and only turns it on for specific targets. So I guess that it's
> most likely that this has never worked...

FYI, there's a bugzilla for this:

https://bugzilla.kernel.org/show_bug.cgi?id=209833

Rob

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: PCI trouble on mvebu (Turris Omnia)
  2020-10-29 11:12                 ` Toke Høiland-Jørgensen
@ 2020-10-29 19:30                   ` Bjorn Helgaas
  2020-10-29 19:56                     ` ™֟☻̭҇ Ѽ ҉ ®
                                       ` (4 more replies)
  0 siblings, 5 replies; 48+ messages in thread
From: Bjorn Helgaas @ 2020-10-29 19:30 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen
  Cc: Pali Rohár, vtolkm, linux-pci, linux-arm-kernel,
	Rob Herring, Ilias Apalodimas, Marek Behún,
	Thomas Petazzoni, Jason Cooper

On Thu, Oct 29, 2020 at 12:12:21PM +0100, Toke Høiland-Jørgensen wrote:
> Pali Rohár <pali@kernel.org> writes:

> > I have been testing mainline kernel on Turris Omnia with two PCIe
> > default cards (WLE200 and WLE900) and it worked fine. But I do not know
> > if I had ASPM enabled or not.
> >
> > So it is working fine for you when CONFIG_PCIEASPM is disabled and whole
> > issue is only when CONFIG_PCIEASPM is enabled?
> 
> Yup, exactly. And I'm also currently testing with the default WLE200/900
> cards... I just tried sticking an MT76-based WiFi card into the third
> PCI slot, and that doesn't come up either when I enable PCIEASPM.

Huh.  So IIUC, the following cases all try to retrain the link and it
fails to come up again:

  - aardvark + WLE900VX (see commit 43fc679ced18)
  - mvebu + WLE200
  - mvebu + WLE900
  - mvebu + MT76

In all these cases, Linux was able to enumerate the NIC, which means
the link was up when firmware handed it off.

I think Linux decided the Common Clock Configuration was wrong, so it
tried to fix it and retrain the link, and the link didn't come back
up.

I don't have "lspci -vv" output from all of them, but in vtolkm's
case, the firmware handed off with:

  00:02.0 Root Port to [bus 02]  SlotClk+ CommClk+
  02:00.0 QCA986x/988x NIC       SlotClk+ CommClk-

Per spec (PCIe r5, sec 7.5.3.7), SlotClk is HwInit and CommClk is RW
and should power up as 0.  If I'm reading the implementation note
correctly, if SlotClk is set on both ends of the link, software should
set CommClk, so the config above *does* look wrong, and CommClk+ on
the Root Port suggests that firmware set it.
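
As a rough illustration of the rule just described (this is not taken
from the kernel or from anything posted in this thread), the check
could be sketched in POSIX shell as below.  The function names are made
up for the example.  The bit positions assumed are Slot Clock
Configuration (Link Status bit 12, 0x1000) and Common Clock
Configuration (Link Control bit 6, 0x0040).  Inputs are 16-bit hex
words like the ones setpci prints.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of any tool.

SLOT_CLK=0x1000   # Link Status bit 12: Slot Clock Configuration
COMM_CLK=0x0040   # Link Control bit 6: Common Clock Configuration

# has_bit HEXVALUE MASK
# Succeeds (exit 0) when (HEXVALUE & MASK) is non-zero.
# HEXVALUE may be given with or without a leading "0x".
has_bit() {
    [ $(( 0x${1#0x} & $2 )) -ne 0 ]
}

# commclk_wanted ROOT_LNKSTA NIC_LNKSTA
# Prints "yes" when both ends of the link report SlotClk+,
# i.e. software should set CommClk on both ends; otherwise "no".
commclk_wanted() {
    if has_bit "$1" "$SLOT_CLK" && has_bit "$2" "$SLOT_CLK"; then
        echo yes
    else
        echo no
    fi
}

# commclk_consistent ROOT_LNKCTL NIC_LNKCTL
# Prints "yes" when both ends have the same CommClk setting,
# "no" when one end has it set and the other does not.
commclk_consistent() {
    a=$(( 0x${1#0x} & $COMM_CLK ))
    b=$(( 0x${2#0x} & $COMM_CLK ))
    if [ "$a" -eq "$b" ]; then
        echo yes
    else
        echo no
    fi
}
```

Using illustrative register words with only the relevant bits set to
mirror the handoff state above (both ends SlotClk+, Root Port CommClk+,
NIC CommClk-), "commclk_wanted 1000 1000" prints "yes" and
"commclk_consistent 0040 0000" prints "no".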

I think both the aardvark and mvebu systems probably use U-Boot.  I
don't know U-Boot at all, but I don't see anything in it that touches
Link Control.  I'm curious what happens if you put one of these cards
in a PC.  If anybody tries it, please collect the "sudo lspci -vv" and
dmesg output.

We could quirk these NICs to avoid the retrain, but since aardvark and
mvebu have no obvious connection and WLE200/WLE900 and MT76 have no
obvious connection, I doubt there's a simple hardware defect that
explains all these.  

Maybe we're doing something wrong in the retrain, but obviously the
link came up in the first place.  AFAIK the only thing we're changing
is the CommClk setting, and that looks legitimate per spec.

Another experiment: build kernel without CONFIG_PCIEASPM, set $ROOT
and $NIC appropriately, and try the following:

  # Set $ROOT and $NIC (update to match your system):

    # ROOT=00:02.0
    # NIC=02:00.0

  # Dump the Root Port and NIC Link registers:

    # setpci -s$ROOT CAP_EXP+0xc.l              # Link Capabilities
    # setpci -s$ROOT CAP_EXP+0x10.w             # Link Control
    # setpci -s$ROOT CAP_EXP+0x12.w             # Link Status

    # setpci -s$NIC  CAP_EXP+0xc.l              # Link Capabilities
    # setpci -s$NIC  CAP_EXP+0x10.w             # Link Control
    # setpci -s$NIC  CAP_EXP+0x12.w             # Link Status

  # Retrain the link:

    # setpci -s$ROOT CAP_EXP+0x10.w=0x0020      # Link Control Retrain Link
    # sleep 1
    # setpci -s$ROOT CAP_EXP+0x12.w             # Link Status
    # setpci -s$NIC  CAP_EXP+0x12.w             # Link Status

  # Set CommClk+ and retrain the link:

    # setpci -s$NIC  CAP_EXP+0x10.w=0x0040      # Link Control Common Clock
    # setpci -s$ROOT CAP_EXP+0x10.w=0x0040      # Link Control Common Clock
    # setpci -s$ROOT CAP_EXP+0x10.w=0x0060      # Link Control RL + CC
    # sleep 1
    # setpci -s$ROOT CAP_EXP+0x12.w             # Link Status
    # setpci -s$NIC  CAP_EXP+0x12.w             # Link Status

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: PCI trouble on mvebu (Turris Omnia)
  2020-10-29 19:30                   ` Bjorn Helgaas
@ 2020-10-29 19:56                     ` ™֟☻̭҇ Ѽ ҉ ®
  2020-10-29 19:57                     ` Andrew Lunn
                                       ` (3 subsequent siblings)
  4 siblings, 0 replies; 48+ messages in thread
From: ™֟☻̭҇ Ѽ ҉ ® @ 2020-10-29 19:56 UTC (permalink / raw)
  To: Bjorn Helgaas, Toke Høiland-Jørgensen
  Cc: Pali Rohár, linux-pci, linux-arm-kernel, Rob Herring,
	Ilias Apalodimas, Marek Behún, Thomas Petazzoni,
	Jason Cooper

On 29/10/2020 20:30, Bjorn Helgaas wrote:
> On Thu, Oct 29, 2020 at 12:12:21PM +0100, Toke Høiland-Jørgensen wrote:
>> Pali Rohár <pali@kernel.org> writes:
>>> I have been testing mainline kernel on Turris Omnia with two PCIe
>>> default cards (WLE200 and WLE900) and it worked fine. But I do not know
>>> if I had ASPM enabled or not.
>>>
>>> So it is working fine for you when CONFIG_PCIEASPM is disabled and whole
>>> issue is only when CONFIG_PCIEASPM is enabled?
>> Yup, exactly. And I'm also currently testing with the default WLE200/900
>> cards... I just tried sticking an MT76-based WiFi card into the third
>> PCI slot, and that doesn't come up either when I enable PCIEASPM.
> Huh.  So IIUC, the following cases all try to retrain the link and it
> fails to come up again:
>
>    - aardvark + WLE900VX (see commit 43fc679ced18)
>    - mvebu + WLE200
>    - mvebu + WLE900
>    - mvebu + MT76
>
> In all these cases, Linux was able to enumerate the NIC, which means
> the link was up when firmware handed it off.
>
> I think Linux decided the Common Clock Configuration was wrong, so it
> tried to fix it and retrain the link, and the link didn't come back
> up.
>
> I don't have "lspci -vv" output from all of them, but in vtolkm's
> case, the firmware handed off with:
>
>    00:02.0 Root Port to [bus 02]  SlotClk+ CommClk+
>    02:00.0 QCA986x/988x NIC       SlotClk+ CommClk-
>
> Per spec (PCIe r5, sec 7.5.3.7), SlotClk is HwInit and CommClk is RW
> and should power up as 0.  If I'm reading the implementation note
> correctly, if SlotClk is set on both ends of the link, software should
> set CommClk, so the config above *does* look wrong, and CommClk+ on
> the Root Port suggests that firmware set it.
>
> I think both the aardvark and mvebu systems probably use U-Boot.  I
> don't know U-Boot at all, but I don't see anything in it that touches
> Link Control.  I'm curious what happens if you put one of these cards
> in a PC.  If anybody tries it, please collect the "sudo lspci -vv" and
> dmesg output.
>
> We could quirk these NICs to avoid the retrain, but since aardvark and
> mvebu have no obvious connection and WLE200/WLE900 and MT76 have no
> obvious connection, I doubt there's a simple hardware defect that
> explains all these.
>
> Maybe we're doing something wrong in the retrain, but obviously the
> link came up in the first place.  AFAIK the only thing we're changing
> is the CommClk setting, and that looks legitimate per spec.
>
> Another experiment: build kernel without CONFIG_PCIEASPM, set $ROOT
> and $NIC appropriately, and try the following:
>
>    # Set $ROOT and $NIC (update to match your system):
>
>      # ROOT=00:02.0
>      # NIC=02:00.0
>
>    # Dump the Root Port and NIC Link registers:
>
>      # setpci -s$ROOT CAP_EXP+0xc.l              # Link Capabilities
>      # setpci -s$ROOT CAP_EXP+0x10.w             # Link Control
>      # setpci -s$ROOT CAP_EXP+0x12.w             # Link Status
>
>      # setpci -s$NIC  CAP_EXP+0xc.l              # Link Capabilities
>      # setpci -s$NIC  CAP_EXP+0x10.w             # Link Control
>      # setpci -s$NIC  CAP_EXP+0x12.w             # Link Status
>
>    # Retrain the link:
>
>      # setpci -s$ROOT CAP_EXP+0x10.w=0x0020      # Link Control Retrain Link
>      # sleep 1
>      # setpci -s$ROOT CAP_EXP+0x12.w             # Link Status
>      # setpci -s$NIC  CAP_EXP+0x12.w             # Link Status
>
>    # Set CommClk+ and retrain the link:
>
>      # setpci -s$NIC  CAP_EXP+0x10.w=0x0040      # Link Control Common Clock
>      # setpci -s$ROOT CAP_EXP+0x10.w=0x0040      # Link Control Common Clock
>      # setpci -s$ROOT CAP_EXP+0x10.w=0x0060      # Link Control RL + CC
>      # sleep 1
>      # setpci -s$ROOT CAP_EXP+0x12.w             # Link Status
>      # setpci -s$NIC  CAP_EXP+0x12.w             # Link Status

ROOT=00:02.0
NIC=02:00.0
setpci -s$ROOT CAP_EXP+0xc.l
0003ac12
setpci -s$ROOT CAP_EXP+0x10.w
0040
setpci -s$ROOT CAP_EXP+0x12.w
1011
setpci -s$NIC  CAP_EXP+0xc.l
00036c11
setpci -s$NIC  CAP_EXP+0x10.w
0000
setpci -s$NIC  CAP_EXP+0x12.w
1011
setpci -s$ROOT CAP_EXP+0x10.w=0x0020
sleep 1
setpci -s$ROOT CAP_EXP+0x12.w
1011
setpci -s$NIC  CAP_EXP+0x12.w
setpci: 0000:02:00.0: Instance #0 of Capability 0010 not found - there are no capabilities with that id.
setpci -s$NIC  CAP_EXP+0x10.w=0x0040
setpci: 0000:02:00.0: Instance #0 of Capability 0010 not found - there are no capabilities with that id.
setpci -s$ROOT CAP_EXP+0x10.w=0x0040
setpci -s$ROOT CAP_EXP+0x10.w=0x0060
sleep 1
setpci -s$ROOT CAP_EXP+0x12.w
1811
setpci -s$NIC  CAP_EXP+0x12.w
setpci: 0000:02:00.0: Instance #0 of Capability 0010 not found - there are no capabilities with that id.
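
Reading aid for the Link Status words above (an annotation, not part of
the original message): a minimal sketch, assuming the standard PCIe Link
Status layout, that splits a value into current speed, negotiated width,
the Link Training bit (bit 11) and Slot Clock Configuration (bit 12).
The function name decode_lnksta is made up for illustration.

```shell
#!/bin/sh
# Decode a Link Status word as printed by "setpci ... CAP_EXP+0x12.w".
# Assumed layout (PCIe Link Status register):
#   [3:0]  current link speed code (1 = 2.5 GT/s, 2 = 5 GT/s)
#   [9:4]  negotiated link width
#   bit 11 Link Training in progress
#   bit 12 Slot Clock Configuration (SlotClk)
decode_lnksta() {
    v=$((0x$1))    # setpci prints bare hex, so add the 0x prefix here
    printf 'speed=%d width=x%d training=%d slotclk=%d\n' \
        $(( v & 0xf )) \
        $(( (v >> 4) & 0x3f )) \
        $(( (v >> 11) & 1 )) \
        $(( (v >> 12) & 1 ))
}

decode_lnksta 1011    # Root Port before and after the plain retrain
decode_lnksta 1811    # Root Port after the RL + CC write
```

By this decoding, 1011 is a x1 link at speed code 1 with SlotClk set and
training finished, while the 1811 read after the RL + CC write still has
Link Training set one second later.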


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: PCI trouble on mvebu (Turris Omnia)
  2020-10-29 19:30                   ` Bjorn Helgaas
  2020-10-29 19:56                     ` ™֟☻̭҇ Ѽ ҉ ®
@ 2020-10-29 19:57                     ` Andrew Lunn
  2020-10-29 21:55                       ` Thomas Petazzoni
  2020-10-29 20:18                     ` Toke Høiland-Jørgensen
                                       ` (2 subsequent siblings)
  4 siblings, 1 reply; 48+ messages in thread
From: Andrew Lunn @ 2020-10-29 19:57 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Toke Høiland-Jørgensen, Rob Herring, Jason Cooper,
	Pali Rohár, Ilias Apalodimas, Marek Behún,
	Thomas Petazzoni, linux-pci, vtolkm, linux-arm-kernel

> We could quirk these NICs to avoid the retrain, but since aardvark and
> mvebu have no obvious connection

Both are Marvell. There could be some shared IP.

     Andrew

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: PCI trouble on mvebu (Turris Omnia)
  2020-10-29 19:30                   ` Bjorn Helgaas
  2020-10-29 19:56                     ` ™֟☻̭҇ Ѽ ҉ ®
  2020-10-29 19:57                     ` Andrew Lunn
@ 2020-10-29 20:18                     ` Toke Høiland-Jørgensen
  2020-10-29 22:09                       ` Toke Høiland-Jørgensen
  2020-10-29 20:58                     ` Marek Behun
  2020-10-29 21:54                     ` Thomas Petazzoni
  4 siblings, 1 reply; 48+ messages in thread
From: Toke Høiland-Jørgensen @ 2020-10-29 20:18 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Pali Rohár, vtolkm, linux-pci, linux-arm-kernel,
	Rob Herring, Ilias Apalodimas, Marek Behún,
	Thomas Petazzoni, Jason Cooper

Bjorn Helgaas <helgaas@kernel.org> writes:

> Another experiment: build kernel without CONFIG_PCIEASPM, set $ROOT
> and $NIC appropriately, and try the following:
>
>   # Set $ROOT and $NIC (update to match your system):
>
>     # ROOT=00:02.0
>     # NIC=02:00.0

(these matched the ath10k card, so just went with that)

>   # Dump the Root Port and NIC Link registers:
>
>     # setpci -s$ROOT CAP_EXP+0xc.l              # Link Capabilities
>     # setpci -s$ROOT CAP_EXP+0x10.w             # Link Control
>     # setpci -s$ROOT CAP_EXP+0x12.w             # Link Status

# setpci -s$ROOT CAP_EXP+0xc.l
0003ac12
# setpci -s$ROOT CAP_EXP+0x10.w
0040
# setpci -s$ROOT CAP_EXP+0x12.w
1011

>     # setpci -s$NIC  CAP_EXP+0xc.l              # Link Capabilities
>     # setpci -s$NIC  CAP_EXP+0x10.w             # Link Control
>     # setpci -s$NIC  CAP_EXP+0x12.w             # Link Status

# setpci -s$NIC CAP_EXP+0xc.l
00036c11
# setpci -s$NIC CAP_EXP+0x10.w
0000
# setpci -s$NIC CAP_EXP+0x12.w
1011

>   # Retrain the link:
>
>     # setpci -s$ROOT CAP_EXP+0x10.w=0x0020      # Link Control Retrain Link
>     # sleep 1
>     # setpci -s$ROOT CAP_EXP+0x12.w             # Link Status
>     # setpci -s$NIC  CAP_EXP+0x12.w             # Link Status

# setpci -s$ROOT CAP_EXP+0x10.w=0x0020
# sleep 1
# setpci -s$ROOT CAP_EXP+0x12.w
1011
# setpci -s$NIC CAP_EXP+0x12.w
setpci: 0000:02:00.0: Instance #0 of Capability 0010 not found - there are no capabilities with that id.
# setpci -s$ROOT CAP_EXP+0x10.w
0000

(nothing in the dmesg either) - rebooted before trying the below:

>   # Set CommClk+ and retrain the link:
>
>     # setpci -s$NIC  CAP_EXP+0x10.w=0x0040      # Link Control Common Clock
>     # setpci -s$ROOT CAP_EXP+0x10.w=0x0040      # Link Control Common Clock
>     # setpci -s$ROOT CAP_EXP+0x10.w=0x0060      # Link Control RL + CC
>     # sleep 1
>     # setpci -s$ROOT CAP_EXP+0x12.w             # Link Status
>     # setpci -s$NIC  CAP_EXP+0x12.w             # Link Status

# setpci -s$NIC CAP_EXP+0x10.w=0x0040
# setpci -s$ROOT CAP_EXP+0x10.w=0x0040
# setpci -s$ROOT CAP_EXP+0x10.w=0x0060
# sleep 1
# setpci -s$ROOT CAP_EXP+0x12.w
1011
# setpci -s$NIC CAP_EXP+0x12.w
setpci: 0000:02:00.0: Instance #0 of Capability 0010 not found - there are no capabilities with that id.
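
A side note on the Link Control read of 0000 after the 0x0020 write
above (an annotation, not part of the original message): Retrain Link
always reads back as 0, and writing the whole word as 0x0020 also clears
the Common Clock bit (0x0040) that was set before the write. A minimal
sketch, assuming one wants to keep the bits that are already set, of
computing a read-modify-write value. lnkctl_set is a hypothetical
helper, and the setpci lines in the comments are untested.

```shell
#!/bin/sh
# Print a new Link Control value: the current value (bare hex, as setpci
# prints it) OR-ed with the extra bits to set (given with a 0x prefix).
lnkctl_set() {
    printf '%04x\n' $(( 0x$1 | $2 ))
}

# Current value 0040 (CommClk set), request Retrain Link (bit 5):
lnkctl_set 0040 0x0020

# Possible use on hardware (assumption, not run here):
#   cur=$(setpci -s$ROOT CAP_EXP+0x10.w)
#   setpci -s$ROOT CAP_EXP+0x10.w=$(lnkctl_set "$cur" 0x0020)
```

With the values from this run the helper prints 0060, i.e. Retrain Link
plus the Common Clock bit that was already there.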

# lspci -v
00:01.0 PCI bridge: Marvell Technology Group Ltd. Device 6820 (rev 04) (prog-if 00 [Normal decode])
        Device tree node: /sys/firmware/devicetree/base/soc/pcie/pcie@1,0
        Flags: bus master, fast devsel, latency 0
        Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
        I/O behind bridge: [disabled]
        Memory behind bridge: e0000000-e00fffff [size=1M]
        Prefetchable memory behind bridge: 00000000-000fffff [size=1M]
        Expansion ROM at e0100000 [virtual] [disabled] [size=2K]
        Capabilities: [40] Express Root Port (Slot+), MSI 00
lspci: Unable to load libkmod resources: error -12

00:02.0 PCI bridge: Marvell Technology Group Ltd. Device 6820 (rev 04) (prog-if 00 [Normal decode])
        Device tree node: /sys/firmware/devicetree/base/soc/pcie/pcie@2,0
        Flags: bus master, fast devsel, latency 0
        Bus: primary=00, secondary=02, subordinate=02, sec-latency=0
        I/O behind bridge: [disabled]
        Memory behind bridge: e0200000-e04fffff [size=3M]
        Prefetchable memory behind bridge: 00000000-000fffff [size=1M]
        Expansion ROM at e0500000 [virtual] [disabled] [size=2K]
        Capabilities: [40] Express Root Port (Slot+), MSI 00

00:03.0 PCI bridge: Marvell Technology Group Ltd. Device 6820 (rev 04) (prog-if 00 [Normal decode])
        Device tree node: /sys/firmware/devicetree/base/soc/pcie/pcie@3,0
        Flags: bus master, fast devsel, latency 0
        Bus: primary=00, secondary=03, subordinate=03, sec-latency=0
        I/O behind bridge: [disabled]
        Memory behind bridge: e0600000-e07fffff [size=2M]
        Prefetchable memory behind bridge: 00000000-000fffff [size=1M]
        Expansion ROM at e0800000 [virtual] [disabled] [size=2K]
        Capabilities: [40] Express Root Port (Slot+), MSI 00

01:00.0 Network controller: Qualcomm Atheros AR9287 Wireless Network Adapter (PCI-Express) (rev 01)
        Subsystem: Qualcomm Atheros AR9287 Wireless Network Adapter (PCI-Express)
        Flags: bus master, fast devsel, latency 0, IRQ 60
        Memory at e0000000 (64-bit, non-prefetchable) [size=64K]
        Capabilities: [40] Power Management version 3
        Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit-
        Capabilities: [60] Express Legacy Endpoint, MSI 00
        Capabilities: [100] Advanced Error Reporting
        Capabilities: [140] Virtual Channel
        Capabilities: [160] Device Serial Number 00-15-17-ff-ff-24-14-12
        Capabilities: [170] Power Budgeting <?>
        Kernel driver in use: ath9k

02:00.0 Network controller: Qualcomm Atheros QCA986x/988x 802.11ac Wireless Network Adapter (rev ff) (prog-if ff)
        !!! Unknown header type 7f
        Kernel driver in use: ath10k_pci

03:00.0 Network controller: MEDIATEK Corp. Device 7612
        Subsystem: MEDIATEK Corp. Device 7612
        Flags: bus master, fast devsel, latency 0, IRQ 63
        Memory at e0600000 (64-bit, non-prefetchable) [size=1M]
        Expansion ROM at e0700000 [disabled] [size=64K]
        Capabilities: [40] Power Management version 3
        Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit+
        Capabilities: [70] Express Endpoint, MSI 00
        Capabilities: [100] Advanced Error Reporting
        Capabilities: [148] Device Serial Number 00-00-00-00-00-00-00-00
        Capabilities: [158] Latency Tolerance Reporting
        Capabilities: [160] L1 PM Substates
        Kernel driver in use: mt76x2e


-Toke


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: PCI trouble on mvebu (Turris Omnia)
  2020-10-29 19:30                   ` Bjorn Helgaas
                                       ` (2 preceding siblings ...)
  2020-10-29 20:18                     ` Toke Høiland-Jørgensen
@ 2020-10-29 20:58                     ` Marek Behun
  2020-10-30 10:08                       ` Pali Rohár
  2020-10-29 21:54                     ` Thomas Petazzoni
  4 siblings, 1 reply; 48+ messages in thread
From: Marek Behun @ 2020-10-29 20:58 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Toke Høiland-Jørgensen, Pali Rohár, vtolkm,
	linux-pci, linux-arm-kernel, Rob Herring, Ilias Apalodimas,
	Thomas Petazzoni, Jason Cooper

On Thu, 29 Oct 2020 14:30:22 -0500
Bjorn Helgaas <helgaas@kernel.org> wrote:

> On Thu, Oct 29, 2020 at 12:12:21PM +0100, Toke Høiland-Jørgensen wrote:
> > Pali Rohár <pali@kernel.org> writes:  
> 
> > > I have been testing mainline kernel on Turris Omnia with two PCIe
> > > default cards (WLE200 and WLE900) and it worked fine. But I do not know
> > > if I had ASPM enabled or not.
> > >
> > > So it is working fine for you when CONFIG_PCIEASPM is disabled and whole
> > > issue is only when CONFIG_PCIEASPM is enabled?  
> > 
> > Yup, exactly. And I'm also currently testing with the default WLE200/900
> > cards... I just tried sticking an MT76-based WiFi card into the third
> > PCI slot, and that doesn't come up either when I enable PCIEASPM.  
> 
> Huh.  So IIUC, the following cases all try to retrain the link and it
> fails to come up again:
> 
>   - aardvark + WLE900VX (see commit 43fc679ced18)
>   - mvebu + WLE200
>   - mvebu + WLE900
>   - mvebu + MT76

Bjorn, IIRC Pali's patches fix the WLE900VX card for Aardvark (both in
kernel and in U-Boot).
IMO mvebu has similar issues. Both these drivers handle the PCIe reset
signal incorrectly (or at least Aardvark did before Pali's work).

mvebu is used on Turris Omnia, and our HW guys first solved the WLE900VX
not working issue by using different capacitors for the SerDeses (this
was 5 years ago). But after Pali's work on Aardvark I think this could
also be solved for mvebu driver in software.

BTW the WLE900VX card has problems on many systems, it won't work for
example on Thinkpad X230. There is a bug on kernel bugzilla reported
for this.

My opinion is that many drivers do not follow the PCIe specification
for reset and link training entirely correctly (Pali was talking about
this when he was looking at Aardvark) and that WLE900VX has a bug that
in combination with those drivers causes the failure. If you look at the
drivers, they are incompatible in how they handle the reset signal and
link training.

I am curious what Pali will tell us, he said that he will look into the
mvebu driver.

Marek

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: PCI trouble on mvebu (Turris Omnia)
  2020-10-29 19:30                   ` Bjorn Helgaas
                                       ` (3 preceding siblings ...)
  2020-10-29 20:58                     ` Marek Behun
@ 2020-10-29 21:54                     ` Thomas Petazzoni
  2020-10-29 23:15                       ` Toke Høiland-Jørgensen
  4 siblings, 1 reply; 48+ messages in thread
From: Thomas Petazzoni @ 2020-10-29 21:54 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Toke Høiland-Jørgensen, Pali Rohár, vtolkm,
	linux-pci, linux-arm-kernel, Rob Herring, Ilias Apalodimas,
	Marek Behún, Jason Cooper

Hello,

On Thu, 29 Oct 2020 14:30:22 -0500
Bjorn Helgaas <helgaas@kernel.org> wrote:

> We could quirk these NICs to avoid the retrain, but since aardvark and
> mvebu have no obvious connection and WLE200/WLE900 and MT76 have no
> obvious connection, I doubt there's a simple hardware defect that
> explains all these.  

aardvark and mvebu have one very strong connection: they are the only
two drivers making use of the PCI Bridge emulation logic in
drivers/pci/pci-bridge-emul.c:

drivers/pci$ git grep pci-bridge-emul
Makefile:obj-$(CONFIG_PCI_BRIDGE_EMUL)  += pci-bridge-emul.o
controller/pci-aardvark.c:#include "../pci-bridge-emul.h"
controller/pci-mvebu.c:#include "../pci-bridge-emul.h"
pci-bridge-emul.c:#include "pci-bridge-emul.h"

I haven't read the whole thread, but it is important to keep in mind
that on those two platforms, the PCI Bridge seen by Linux is *not* a
real HW bridge. It is faked by the pci-bridge-emul code. So if this
code has defects/bugs in how it emulates a PCI Bridge behavior, you
might see weird things.

Best regards,

Thomas
-- 
Thomas Petazzoni, CTO, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: PCI trouble on mvebu (Turris Omnia)
  2020-10-29 19:57                     ` Andrew Lunn
@ 2020-10-29 21:55                       ` Thomas Petazzoni
  0 siblings, 0 replies; 48+ messages in thread
From: Thomas Petazzoni @ 2020-10-29 21:55 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Bjorn Helgaas, Toke Høiland-Jørgensen, Rob Herring,
	Jason Cooper, Pali Rohár, Ilias Apalodimas,
	Marek Behún, linux-pci, vtolkm, linux-arm-kernel

On Thu, 29 Oct 2020 20:57:31 +0100
Andrew Lunn <andrew@lunn.ch> wrote:

> > We could quirk these NICs to avoid the retrain, but since aardvark and
> > mvebu have no obvious connection  
> 
> Both are Marvell. There could be some shared IP.

From my experience, even though both are from Marvell, they are really
different IP blocks, made by different teams, used in different SoCs.

However, as I replied to Bjorn, both use the PCI Bridge emulation logic.

Best regards,

Thomas
-- 
Thomas Petazzoni, CTO, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: PCI trouble on mvebu (Turris Omnia)
  2020-10-29 20:18                     ` Toke Høiland-Jørgensen
@ 2020-10-29 22:09                       ` Toke Høiland-Jørgensen
  0 siblings, 0 replies; 48+ messages in thread
From: Toke Høiland-Jørgensen @ 2020-10-29 22:09 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Pali Rohár, vtolkm, linux-pci, linux-arm-kernel,
	Rob Herring, Ilias Apalodimas, Marek Behún,
	Thomas Petazzoni, Jason Cooper

Toke Høiland-Jørgensen <toke@redhat.com> writes:

> Bjorn Helgaas <helgaas@kernel.org> writes:
>
>> Another experiment: build kernel without CONFIG_PCIEASPM, set $ROOT
>> and $NIC appropriately, and try the following:
>>
>>   # Set $ROOT and $NIC (update to match your system):
>>
>>     # ROOT=00:02.0
>>     # NIC=02:00.0
>
> (these matched the ath10k card, so just went with that)

And since Marek's latest email mentioned that the WLE900 is especially
problematic, I also tried with the other slot that has the mt76 in it:

# ROOT=00:03.0
# NIC=03:00.0
# setpci -s$ROOT CAP_EXP+0xc.l
0003ac12
# setpci -s$ROOT CAP_EXP+0x10.w
0040
# setpci -s$ROOT CAP_EXP+0x12.w
1011
# setpci -s$NIC CAP_EXP+0xc.l
0047dc11
# setpci -s$NIC CAP_EXP+0x10.w
0000
# setpci -s$NIC CAP_EXP+0x12.w
1011

# setpci -s$ROOT CAP_EXP+0x10.w=0x0020
# sleep 1
# setpci -s$ROOT CAP_EXP+0x12.w
1011
# setpci -s$NIC CAP_EXP+0x12.w
1011

# setpci -s$NIC CAP_EXP+0x10.w=0x0040
# setpci -s$ROOT CAP_EXP+0x10.w=0x0040
# setpci -s$ROOT CAP_EXP+0x10.w=0x0060
# sleep 1
# setpci -s$ROOT CAP_EXP+0x12.w
1011
# setpci -s$NIC CAP_EXP+0x12.w
1011

And based on this I went back and rebuilt the kernel with PCIEASPM
enabled, and now both the WLE200 and the MT76 work with this output:

[    1.544429] mvebu-pcie soc:pcie: host bridge /soc/pcie ranges:
[    1.544455] mvebu-pcie soc:pcie:      MEM 0x00f1080000..0x00f1081fff -> 0x0000080000
[    1.544471] mvebu-pcie soc:pcie:      MEM 0x00f1040000..0x00f1041fff -> 0x0000040000
[    1.544485] mvebu-pcie soc:pcie:      MEM 0x00f1044000..0x00f1045fff -> 0x0000044000
[    1.544500] mvebu-pcie soc:pcie:      MEM 0x00f1048000..0x00f1049fff -> 0x0000048000
[    1.544513] mvebu-pcie soc:pcie:      MEM 0xffffffffffffffff..0x00fffffffe -> 0x0100000000
[    1.544527] mvebu-pcie soc:pcie:       IO 0xffffffffffffffff..0x00fffffffe -> 0x0100000000
[    1.544540] mvebu-pcie soc:pcie:      MEM 0xffffffffffffffff..0x00fffffffe -> 0x0200000000
[    1.544552] mvebu-pcie soc:pcie:       IO 0xffffffffffffffff..0x00fffffffe -> 0x0200000000
[    1.544565] mvebu-pcie soc:pcie:      MEM 0xffffffffffffffff..0x00fffffffe -> 0x0300000000
[    1.544577] mvebu-pcie soc:pcie:       IO 0xffffffffffffffff..0x00fffffffe -> 0x0300000000
[    1.544590] mvebu-pcie soc:pcie:      MEM 0xffffffffffffffff..0x00fffffffe -> 0x0400000000
[    1.544599] mvebu-pcie soc:pcie:       IO 0xffffffffffffffff..0x00fffffffe -> 0x0400000000
[    1.544768] mvebu-pcie soc:pcie: PCI host bridge to bus 0000:00
[    1.544776] pci_bus 0000:00: root bus resource [bus 00-ff]
[    1.544783] pci_bus 0000:00: root bus resource [mem 0xf1080000-0xf1081fff] (bus address [0x00080000-0x00081fff])
[    1.544789] pci_bus 0000:00: root bus resource [mem 0xf1040000-0xf1041fff] (bus address [0x00040000-0x00041fff])
[    1.544795] pci_bus 0000:00: root bus resource [mem 0xf1044000-0xf1045fff] (bus address [0x00044000-0x00045fff])
[    1.544801] pci_bus 0000:00: root bus resource [mem 0xf1048000-0xf1049fff] (bus address [0x00048000-0x00049fff])
[    1.544806] pci_bus 0000:00: root bus resource [mem 0xe0000000-0xe7ffffff]
[    1.544811] pci_bus 0000:00: root bus resource [io  0x1000-0xeffff]
[    1.544882] pci 0000:00:01.0: [11ab:6820] type 01 class 0x060400
[    1.544896] pci 0000:00:01.0: reg 0x38: [mem 0x00000000-0x000007ff pref]
[    1.545073] pci 0000:00:02.0: [11ab:6820] type 01 class 0x060400
[    1.545085] pci 0000:00:02.0: reg 0x38: [mem 0x00000000-0x000007ff pref]
[    1.545237] pci 0000:00:03.0: [11ab:6820] type 01 class 0x060400
[    1.545250] pci 0000:00:03.0: reg 0x38: [mem 0x00000000-0x000007ff pref]
[    1.546030] PCI: bus0: Fast back to back transfers disabled
[    1.546037] pci 0000:00:01.0: bridge configuration invalid ([bus 00-00]), reconfiguring
[    1.546045] pci 0000:00:02.0: bridge configuration invalid ([bus 00-00]), reconfiguring
[    1.546052] pci 0000:00:03.0: bridge configuration invalid ([bus 00-00]), reconfiguring
[    1.546132] pci 0000:01:00.0: [168c:002e] type 00 class 0x028000
[    1.546154] pci 0000:01:00.0: reg 0x10: [mem 0xe8000000-0xe800ffff 64bit]
[    1.546263] pci 0000:01:00.0: supports D1
[    1.546268] pci 0000:01:00.0: PME# supported from D0 D1 D3hot
[    1.546377] pci 0000:00:01.0: ASPM: current common clock configuration is inconsistent, reconfiguring
[    1.602042] PCI: bus1: Fast back to back transfers enabled
[    1.602052] pci_bus 0000:01: busn_res: [bus 01-ff] end is updated to 01
[    1.602146] pci 0000:02:00.0: [168c:003c] type 00 class 0x028000
[    1.602169] pci 0000:02:00.0: reg 0x10: [mem 0xea000000-0xea1fffff 64bit]
[    1.602201] pci 0000:02:00.0: reg 0x30: [mem 0xea200000-0xea20ffff pref]
[    1.602280] pci 0000:02:00.0: supports D1 D2
[    1.602377] pci 0000:00:02.0: ASPM: current common clock configuration is inconsistent, reconfiguring
[    1.632025] PCI: bus2: Fast back to back transfers enabled
[    1.632033] pci_bus 0000:02: busn_res: [bus 02-ff] end is updated to 02
[    1.632117] pci 0000:03:00.0: [14c3:7612] type 00 class 0x028000
[    1.632141] pci 0000:03:00.0: reg 0x10: [mem 0xec000000-0xec0fffff 64bit]
[    1.632175] pci 0000:03:00.0: reg 0x30: [mem 0xec100000-0xec10ffff pref]
[    1.632262] pci 0000:03:00.0: PME# supported from D0 D3hot D3cold
[    1.632373] pci 0000:00:03.0: ASPM: current common clock configuration is inconsistent, reconfiguring
[    1.662037] PCI: bus3: Fast back to back transfers disabled
[    1.662045] pci_bus 0000:03: busn_res: [bus 03-ff] end is updated to 03
[    1.662078] pci 0000:00:01.0: BAR 8: assigned [mem 0xe0000000-0xe00fffff]
[    1.662086] pci 0000:00:02.0: BAR 8: assigned [mem 0xe0200000-0xe04fffff]
[    1.662093] pci 0000:00:03.0: BAR 8: assigned [mem 0xe0600000-0xe07fffff]
[    1.662101] pci 0000:00:01.0: BAR 6: assigned [mem 0xe0100000-0xe01007ff pref]
[    1.662109] pci 0000:00:02.0: BAR 6: assigned [mem 0xe0500000-0xe05007ff pref]
[    1.662116] pci 0000:00:03.0: BAR 6: assigned [mem 0xe0800000-0xe08007ff pref]
[    1.662124] pci 0000:01:00.0: BAR 0: assigned [mem 0xe0000000-0xe000ffff 64bit]
[    1.662135] pci 0000:00:01.0: PCI bridge to [bus 01]
[    1.662142] pci 0000:00:01.0:   bridge window [mem 0xe0000000-0xe00fffff]
[    1.662151] pci 0000:02:00.0: BAR 0: assigned [mem 0xe0200000-0xe03fffff 64bit]
[    1.662158] pci 0000:02:00.0: BAR 0: error updating (0xe0200004 != 0xffffffff)
[    1.662164] pci 0000:02:00.0: BAR 0: error updating (high 0x000000 != 0xffffffff)
[    1.662170] pci 0000:02:00.0: BAR 6: assigned [mem 0xe0400000-0xe040ffff pref]
[    1.662176] pci 0000:00:02.0: PCI bridge to [bus 02]
[    1.662182] pci 0000:00:02.0:   bridge window [mem 0xe0200000-0xe04fffff]
[    1.662190] pci 0000:03:00.0: BAR 0: assigned [mem 0xe0600000-0xe06fffff 64bit]
[    1.662202] pci 0000:03:00.0: BAR 6: assigned [mem 0xe0700000-0xe070ffff pref]
[    1.662207] pci 0000:00:03.0: PCI bridge to [bus 03]
[    1.662212] pci 0000:00:03.0:   bridge window [mem 0xe0600000-0xe07fffff]


This has me somewhat puzzled. Investigating further, it turns out that
if I *remove* the MT76 card, the WLE200 starts failing again. So with
just the WLE* cards plugged in, I went back and tried the setpci
sequence again with the WLE200 (with PCIEASPM disabled):

# ROOT=00:01.0
# NIC=01:00.0
# setpci -s$ROOT CAP_EXP+0xc.l
0003ac12
# setpci -s$ROOT CAP_EXP+0x10.w
0040
# setpci -s$ROOT CAP_EXP+0x12.w
1011
# setpci -s$NIC CAP_EXP+0xc.l
00033c11
# setpci -s$NIC CAP_EXP+0x10.w
0000
# setpci -s$NIC CAP_EXP+0x12.w
1011
# setpci -s$ROOT CAP_EXP+0x10.w=0x0020
# sleep 1
# setpci -s$ROOT CAP_EXP+0x12.w
1011
# setpci -s$NIC CAP_EXP+0x12.w
1011
# setpci -s$NIC CAP_EXP+0x10.w=0x0040
# setpci -s$ROOT CAP_EXP+0x10.w=0x0040
# setpci -s$ROOT CAP_EXP+0x10.w=0x0060
# sleep 1
# setpci -s$ROOT CAP_EXP+0x12.w
1011
# setpci -s$NIC CAP_EXP+0x12.w
1011
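
For comparing the Link Capabilities words collected in this message and
the previous one (an annotation, not part of the original message): a
minimal sketch, assuming the standard PCIe Link Capabilities layout,
that pulls out the maximum speed code, maximum width, ASPM support field
and the Clock Power Management bit. decode_lnkcap is a made-up name.

```shell
#!/bin/sh
# Decode a Link Capabilities dword as printed by "setpci ... CAP_EXP+0xc.l".
# Assumed layout (PCIe Link Capabilities register):
#   [3:0]   max link speed code (1 = 2.5 GT/s, 2 = 5 GT/s)
#   [9:4]   max link width
#   [11:10] ASPM support (1 = L0s, 2 = L1, 3 = L0s and L1)
#   bit 18  Clock Power Management
decode_lnkcap() {
    v=$((0x$1))
    printf 'maxspeed=%d maxwidth=x%d aspm=%d clkpm=%d\n' \
        $(( v & 0xf )) \
        $(( (v >> 4) & 0x3f )) \
        $(( (v >> 10) & 3 )) \
        $(( (v >> 18) & 1 ))
}

# Root Port, WLE900 (ath10k), MT76, WLE200 (ath9k), in that order:
for cap in 0003ac12 00036c11 0047dc11 00033c11; do
    printf '%s: ' "$cap"
    decode_lnkcap "$cap"
done
```

By this decoding all four report ASPM support code 3 (L0s and L1), only
the MT76 (0047dc11) advertises Clock Power Management, and only the Root
Port reports max speed code 2.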

-Toke


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: PCI trouble on mvebu (Turris Omnia)
  2020-10-29 21:54                     ` Thomas Petazzoni
@ 2020-10-29 23:15                       ` Toke Høiland-Jørgensen
  2020-10-30  8:23                         ` Thomas Petazzoni
  2020-10-30 10:15                         ` Pali Rohár
  0 siblings, 2 replies; 48+ messages in thread
From: Toke Høiland-Jørgensen @ 2020-10-29 23:15 UTC (permalink / raw)
  To: Thomas Petazzoni, Bjorn Helgaas
  Cc: Pali Rohár, vtolkm, linux-pci, linux-arm-kernel,
	Rob Herring, Ilias Apalodimas, Marek Behún, Jason Cooper

Thomas Petazzoni <thomas.petazzoni@bootlin.com> writes:

> Hello,
>
> On Thu, 29 Oct 2020 14:30:22 -0500
> Bjorn Helgaas <helgaas@kernel.org> wrote:
>
>> We could quirk these NICs to avoid the retrain, but since aardvark and
>> mvebu have no obvious connection and WLE200/WLE900 and MT76 have no
>> obvious connection, I doubt there's a simple hardware defect that
>> explains all these.  
>
> aardvark and mvebu have one very strong connection: they are the only
> two drivers making use of the PCI Bridge emulation logic in
> drivers/pci/pci-bridge-emul.c:
>
> drivers/pci$ git grep pci-bridge-emul
> Makefile:obj-$(CONFIG_PCI_BRIDGE_EMUL)  += pci-bridge-emul.o
> controller/pci-aardvark.c:#include "../pci-bridge-emul.h"
> controller/pci-mvebu.c:#include "../pci-bridge-emul.h"
> pci-bridge-emul.c:#include "pci-bridge-emul.h"
>
> I haven't read the whole thread, but it is important to keep in mind
> that on those two platforms, the PCI Bridge seen by Linux is *not* a
> real HW bridge. It is faked by the pci-bridge-emul code. So if this
> code has defects/bugs in how it emulates a PCI Bridge behavior, you
> might see weird things.

Ohh, that's interesting. Why does it need to emulate it?

And could this cause weird interactions like what I'm seeing,
where a somewhat buggy device in slot 2 affects the ability to retrain
the link also in slot 1, but only if there's no device in slot 3?

-Toke


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: PCI trouble on mvebu (Turris Omnia)
  2020-10-29 23:15                       ` Toke Høiland-Jørgensen
@ 2020-10-30  8:23                         ` Thomas Petazzoni
  2020-10-30 10:15                         ` Pali Rohár
  1 sibling, 0 replies; 48+ messages in thread
From: Thomas Petazzoni @ 2020-10-30  8:23 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen
  Cc: Bjorn Helgaas, Pali Rohár, vtolkm, linux-pci,
	linux-arm-kernel, Rob Herring, Ilias Apalodimas,
	Marek Behún, Jason Cooper

On Fri, 30 Oct 2020 00:15:57 +0100
Toke Høiland-Jørgensen <toke@redhat.com> wrote:

> > I haven't read the whole thread, but it is important to keep in mind
> > that on those two platforms, the PCI Bridge seen by Linux is *not* a
> > real HW bridge. It is faked by the pci-bridge-emul code. So if this
> > code has defects/bugs in how it emulates a PCI Bridge behavior, you
> > might see weird things.  
> 
> Ohh, that's interesting. Why does it need to emulate it?

Because the HW doesn't expose a standard PCI Bridge. On mvebu, the main
initial motivation was to be able to configure MBus windows dynamically
depending on PCI endpoints that are connected.

For Aardvark, the rationale is documented in commit
8a3ebd8de328301aacbe328650a59253be2ac82c:

commit 8a3ebd8de328301aacbe328650a59253be2ac82c
Author: Zachary Zhang <zhangzg@marvell.com>
Date:   Thu Oct 18 17:37:19 2018 +0200

    PCI: aardvark: Implement emulated root PCI bridge config space
    
    The PCI controller in the Marvell Armada 3720 does not implement a
    software-accessible root port PCI bridge configuration space. This
    causes a number of problems when using PCIe switches or when the Max
    Payload size needs to be aligned between the root complex and the
    endpoint.
    
    Implementing an emulated root PCI bridge, like is already done in the
    pci-mvebu driver for older Marvell platforms allows to solve those
    issues, and also to support features such as ASR, PME, VC, HP.
    
    Signed-off-by: Zachary Zhang <zhangzg@marvell.com>
    [Thomas: convert to the common emulated PCI bridge logic.]
    Signed-off-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
    Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>

Best regards,

Thomas
-- 
Thomas Petazzoni, CTO, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: PCI trouble on mvebu (Turris Omnia)
  2020-10-29 20:58                     ` Marek Behun
@ 2020-10-30 10:08                       ` Pali Rohár
  2020-10-30 10:45                         ` Marek Behun
  0 siblings, 1 reply; 48+ messages in thread
From: Pali Rohár @ 2020-10-30 10:08 UTC (permalink / raw)
  To: Marek Behun
  Cc: Bjorn Helgaas, Toke Høiland-Jørgensen, vtolkm,
	linux-pci, linux-arm-kernel, Rob Herring, Ilias Apalodimas,
	Thomas Petazzoni, Jason Cooper

Hello!

On Thursday 29 October 2020 21:58:53 Marek Behun wrote:
> On Thu, 29 Oct 2020 14:30:22 -0500
> Bjorn Helgaas <helgaas@kernel.org> wrote:
> 
> > On Thu, Oct 29, 2020 at 12:12:21PM +0100, Toke Høiland-Jørgensen wrote:
> > > Pali Rohár <pali@kernel.org> writes:  
> > 
> > > > I have been testing mainline kernel on Turris Omnia with two PCIe
> > > > default cards (WLE200 and WLE900) and it worked fine. But I do not know
> > > > if I had ASPM enabled or not.
> > > >
> > > > So it is working fine for you when CONFIG_PCIEASPM is disabled and whole
> > > > issue is only when CONFIG_PCIEASPM is enabled?  
> > > 
> > > Yup, exactly. And I'm also currently testing with the default WLE200/900
> > > cards... I just tried sticking an MT76-based WiFi card into the third
> > > PCI slot, and that doesn't come up either when I enable PCIEASPM.  
> > 
> > Huh.  So IIUC, the following cases all try to retrain the link and it
> > fails to come up again:
> > 
> >   - aardvark + WLE900VX (see commit 43fc679ced18)

Just to note: aardvark + WLE200 worked fine whatever I did. No
workaround and no patch was needed.

> >   - mvebu + WLE200
> >   - mvebu + WLE900
> >   - mvebu + MT76
> 
> Bjorn, IIRC Pali's patches fix the WLE900VX card for Aardvark (both in
> kernel and in U-Boot).
> IMO mvebu has similar issues. Both these drivers handle the PCIe reset
> signal incorrectly (or at least Aardvark did before Pali's work).
> 
> mvebu is used on Turris Omnia, and our HW guys first solved the WLE900VX
> not working issue by using different capacitors for the SerDeses (this
> was 5 years ago). But after Pali's work on Aardvark I think this could
> also be solved for mvebu driver in software.

Apparently not :-( See below, we cannot control the PERST# pin from
software on Turris Omnia.

> BTW the WLE900VX card has problems on many systems, it won't work for
> example on Thinkpad X230. There is a bug on kernel bugzilla reported
> for this.

WLE900VX is a really buggy card. During its initialization/reset,
W_DISABLE# (pin 20) must be in the correct state, otherwise the system
will never see the card. This is the reason why it does not work in
laptops; sometimes a double reboot and playing with the rfkill state
prior to reboot helps. See the reported issue:

https://bugzilla.kernel.org/show_bug.cgi?id=84821#c53

> My opinion is that many drivers do not respect the PCIe specification
> for reset and link training totally correctly (Pali was talking about
> this when he was looking at Aardvark) and that WLE900VX has a bug that
> in combination with those drivers causes the fail. If you look at the
> drivers, they are incompatible in how they handle the reset signal and
> link training.

It seems that aardvark or the WLE900VX card (not only this one, but
basically every ath10k card I tested, also non-Compex ones) has a
problem: when the Linux kernel boots, the cards are in some totally
strange state, and whatever I did I was not able to detect them or make
link training succeed. The only thing which helped was to reset the
card via the out-of-band PERST# signal.

And here is the main issue with the PERST# signal in the Linux kernel:
basically every driver asserts PERST# for a different amount of time,
although this should be independent of driver and card, and is probably
already documented in the PCIe specification. See my email:

https://lore.kernel.org/linux-pci/20200424092546.25p3hdtkehohe3xw@pali/

I tried to find the minimal reset time in the specifications, but I was
not able to understand all the details and timeouts defined in the
various diagrams. I'm not a HW guy. See what I was able to find out:

https://lore.kernel.org/linux-pci/20200507212002.GA32182@bogus/

And my conclusion is here:

https://lore.kernel.org/linux-pci/20200513115940.fiemtnxfqcyqo6ik@pali/

So to finally fix the issues with card reset, we need somebody who
understands the hardware documents and PCIe specifications and can
figure out the correct minimal delay needed for a proper card reset via
the PERST# signal. And then fix all PCI controller drivers to use this
value.

In aardvark we have a timeout which was enough for the cards I tested
on Espressobin and Turris MOX.


And the second issue is with link training. What helped me to finally
fix link training for PCIe cards on A3720 with the aardvark driver, in
both U-Boot and the Linux kernel, was the comment in the following commit:

https://git.kernel.org/linus/f4c7d053d7f7

    As required by PCI Express spec a delay for at least 100ms after
    such a reset [fundamental reset by asserted PERST# signal] before
    link training is needed.

In the aardvark control register I forcibly disabled the link training
bit prior to issuing the reset via the PERST# signal, and then
re-enabled it 100ms after the reset was completed.
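
A rough sketch of this sequence, written as a user-space C model rather
than driver code. The struct, the helper names and the 10 ms PERST# hold
time are assumptions made up for illustration (the correct hold time is
exactly the open question above); only the 100 ms delay comes from the
commit quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a root port; none of these names exist in a real driver. */
struct fake_port {
	bool link_training_enabled;		/* models the "link training" control bit */
	bool perst_asserted;			/* models the PERST# GPIO (true = card in reset) */
	unsigned int elapsed_ms;		/* simulated clock, advanced by fake_msleep() */
	unsigned int perst_released_at_ms;	/* when PERST# was deasserted */
	unsigned int training_enabled_at_ms;	/* when link training was re-enabled */
};

#define PERST_HOLD_MS		10	/* assumption, not a value from any spec */
#define POST_RESET_DELAY_MS	100	/* delay before link training, per the quoted commit */

static void fake_msleep(struct fake_port *p, unsigned int ms)
{
	p->elapsed_ms += ms;
}

static void reset_card(struct fake_port *p)
{
	assert(p != NULL);

	/* 1. Keep the controller from training while the card is in reset. */
	p->link_training_enabled = false;

	/* 2. Fundamental reset: assert PERST#, hold it, release it. */
	p->perst_asserted = true;
	fake_msleep(p, PERST_HOLD_MS);
	p->perst_asserted = false;
	p->perst_released_at_ms = p->elapsed_ms;

	/* 3. Wait at least 100 ms after the reset has completed. */
	fake_msleep(p, POST_RESET_DELAY_MS);

	/* 4. Only now allow link training to start. */
	p->link_training_enabled = true;
	p->training_enabled_at_ms = p->elapsed_ms;
}
```

The point of the ordering is that the controller never attempts to
train while the card is still held in, or just coming out of, reset.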

I have sent an aardvark patch which updates the comment for the above requirement:
https://lore.kernel.org/linux-pci/20200924084618.12442-1-pali@kernel.org/

> I am curious what Pali will tell us, he said that he will look into the
> mvebu driver.

If the same problem with WLE900 cards also exists on the A38x SoC (with
the pci-mvebu driver), then it will be hard to fix on Turris Omnia.

On Turris MOX (with aardvark) the PERST# pin of the card is connected to
an MPP pin on the A3720 SoC, which we can control via GPIO. In DTS we
have it configured as "reset-gpios", and therefore the aardvark driver
can assert/deassert PERST# for the card when needed.

On Turris Omnia (with pci-mvebu) the PERST# pin of the wifi card is
connected to the MCU, which asserts/deasserts it only after a board
reset. It is also a shared line across all mPCIe slots and other
peripherals.

So we cannot issue a reset via the PERST# signal on Turris Omnia. But
there are other ways to issue a fundamental reset, via in-band
signaling.

But IIRC a fundamental reset via in-band PCIe signaling is issued by the
PCIe bridge to which the card is connected. So the second problem: we do
not have a PCIe bridge on mvebu platforms, it is just emulated by the
kernel. Unless there is some "special" register for issuing a
fundamental reset, we will not be able to emulate this reset.

Aardvark does not have a PCIe bridge either, but its internal registers
contain bits for different types of reset. When I tried to use them,
nothing happened, nothing helped. Only an external reset via the PERST#
signal was able to initialize the card.

I will look into the A38x PCI registers to see if there is something
which could help us. But without access to the PERST# pin I'm sceptical
that we can do anything... I'm just hoping that there is a bug in the
PCIe ASPM retraining code which can be fixed...

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: PCI trouble on mvebu (Turris Omnia)
  2020-10-29 23:15                       ` Toke Høiland-Jørgensen
  2020-10-30  8:23                         ` Thomas Petazzoni
@ 2020-10-30 10:15                         ` Pali Rohár
  1 sibling, 0 replies; 48+ messages in thread
From: Pali Rohár @ 2020-10-30 10:15 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen
  Cc: Thomas Petazzoni, Bjorn Helgaas, vtolkm, linux-pci,
	linux-arm-kernel, Rob Herring, Ilias Apalodimas,
	Marek Behún, Jason Cooper

On Friday 30 October 2020 00:15:57 Toke Høiland-Jørgensen wrote:
> Thomas Petazzoni <thomas.petazzoni@bootlin.com> writes:
> 
> > Hello,
> >
> > On Thu, 29 Oct 2020 14:30:22 -0500
> > Bjorn Helgaas <helgaas@kernel.org> wrote:
> >
> >> We could quirk these NICs to avoid the retrain, but since aardvark and
> >> mvebu have no obvious connection and WLE200/WLE900 and MT76 have no
> >> obvious connection, I doubt there's a simple hardware defect that
> >> explains all these.  
> >
> > aardvark and mvebu have one very strong connection: they are the only
> > two drivers making use of the PCI Bridge emulation logic in
> > drivers/pci/pci-bridge-emul.c:
> >
> > drivers/pci$ git grep pci-bridge-emul
> > > Makefile:obj-$(CONFIG_PCI_BRIDGE_EMUL)  += pci-bridge-emul.o
> > controller/pci-aardvark.c:#include "../pci-bridge-emul.h"
> > controller/pci-mvebu.c:#include "../pci-bridge-emul.h"
> > pci-bridge-emul.c:#include "pci-bridge-emul.h"
> >
> > I haven't read the whole thread, but it is important to keep in mind
> > that on those two platforms, the PCI Bridge seen by Linux is *not* a
> > > real HW bridge. It is faked by the pci-bridge-emul code. So if this
> > code has defects/bugs in how it emulates a PCI Bridge behavior, you
> > might see weird things.
> 
> Ohh, that's interesting. Why does it need to emulate it?

I can only speculate: they wanted to decrease the cost of the HW, so
they did not include a bridge in the HW and let the user emulate it (if
needed).

> And could this cause things weird interactions like what I'm seeing,
> where a somewhat buggy device in slot 2 affects the ability to retrain
> the link also in slot 1, but only if there's no device in slot 3?

I doubt it; slots and registers are independent. Every slot/card has its
own (emulated) bridge.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: PCI trouble on mvebu (Turris Omnia)
  2020-10-30 10:08                       ` Pali Rohár
@ 2020-10-30 10:45                         ` Marek Behun
  0 siblings, 0 replies; 48+ messages in thread
From: Marek Behun @ 2020-10-30 10:45 UTC (permalink / raw)
  To: Pali Rohár
  Cc: Bjorn Helgaas, Toke Høiland-Jørgensen, vtolkm,
	linux-pci, linux-arm-kernel, Rob Herring, Ilias Apalodimas,
	Thomas Petazzoni, Jason Cooper

On Fri, 30 Oct 2020 11:08:07 +0100
Pali Rohár <pali@kernel.org> wrote:

> On Turris Omnia (with pci-mvebu) PERST# pin from wifi card is connected
> to MCU and it asserts/deasserts this pin only after board reset. Also it
> is shared line across all mPCIe slots and also with other peripherals.
> 
> So we cannot issue reset via PERST# signal on Turris Omnia. But there
> are other ways how to issue fundamental reset, via in band signaling.

We can code this into MCU code, AFAIK it is upgradable from main CPU
via I2C :) I wanted to try this because of LEDs anyway...

But I think that all 3 PCIe slots have their PERST# signal connected to
just one GPIO on the MCU...

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: PCI trouble on mvebu (Turris Omnia)
  2020-10-28 23:16             ` Bjorn Helgaas
  2020-10-29 10:09               ` Pali Rohár
  2020-10-29 10:41               ` Toke Høiland-Jørgensen
@ 2020-10-30 11:23               ` Pali Rohár
  2020-10-30 13:02                 ` Toke Høiland-Jørgensen
  2 siblings, 1 reply; 48+ messages in thread
From: Pali Rohár @ 2020-10-30 11:23 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: vtolkm, Toke Høiland-Jørgensen, linux-pci,
	linux-arm-kernel, Rob Herring, Ilias Apalodimas,
	Marek Behún, Thomas Petazzoni, Jason Cooper

On Wednesday 28 October 2020 18:16:26 Bjorn Helgaas wrote:
> [+cc Pali, Marek, Thomas, Jason]
> 
> On Wed, Oct 28, 2020 at 04:40:00PM +0000, ™֟☻̭҇ Ѽ ҉ ® wrote:
> > On 28/10/2020 16:08, Toke Høiland-Jørgensen wrote:
> > > Bjorn Helgaas <helgaas@kernel.org> writes:
> > > > On Wed, Oct 28, 2020 at 02:36:13PM +0100, Toke Høiland-Jørgensen wrote:
> > > > > Toke Høiland-Jørgensen <toke@redhat.com> writes:
> > > > > > Bjorn Helgaas <helgaas@kernel.org> writes:
> > > > > > 
> > > > > > > [+cc vtolkm]
> > > > > > > 
> > > > > > > On Tue, Oct 27, 2020 at 04:43:20PM +0100, Toke Høiland-Jørgensen wrote:
> > > > > > > > Hi everyone
> > > > > > > > 
> > > > > > > > I'm trying to get a mainline kernel to run on my Turris Omnia, and am
> > > > > > > > having some trouble getting the PCI bus to work correctly. Specifically,
> > > > > > > > I'm running a 5.10-rc1 kernel (torvalds/master as of this moment), with
> > > > > > > > the resource request fix[0] applied on top.
> > > > > > > > 
> > > > > > > > The kernel boots fine, and the patch in [0] makes the PCI devices show
> > > > > > > > up. But I'm still getting initialisation errors like these:
> > > > > > > > 
> > > > > > > > [    1.632709] pci 0000:01:00.0: BAR 0: error updating (0xe0000004 != 0xffffffff)
> > > > > > > > [    1.632714] pci 0000:01:00.0: BAR 0: error updating (high 0x000000 != 0xffffffff)
> > > > > > > > [    1.632745] pci 0000:02:00.0: BAR 0: error updating (0xe0200004 != 0xffffffff)
> > > > > > > > [    1.632750] pci 0000:02:00.0: BAR 0: error updating (high 0x000000 != 0xffffffff)
> > > > > > > > 
> > > > > > > > and the WiFi drivers fail to initialise with what appears to me to be
> > > > > > > > errors related to the bus rather than to the drivers themselves:
> > > > > > > > 
> > > > > > > > [    3.509878] ath: phy0: Mac Chip Rev 0xfffc0.f is not supported by this driver
> > > > > > > > [    3.517049] ath: phy0: Unable to initialize hardware; initialization status: -95
> > > > > > > > [    3.524473] ath9k 0000:01:00.0: Failed to initialize device
> > > > > > > > [    3.530081] ath9k: probe of 0000:01:00.0 failed with error -95
> > > > > > > > [    3.536012] ath10k_pci 0000:02:00.0: of_irq_parse_pci: failed with rc=134
> > > > > > > > [    3.543049] pci 0000:00:02.0: enabling device (0140 -> 0142)
> > > > > > > > [    3.548735] ath10k_pci 0000:02:00.0: can't change power state from D3hot to D0 (config space inaccessible)
> > > > > > > > [    3.588592] ath10k_pci 0000:02:00.0: failed to wake up device : -110
> > > > > > > > [    3.595098] ath10k_pci: probe of 0000:02:00.0 failed with error -110
> > > > > > > > 
> > > > > > > > lspci looks OK, though:
> > > > > > > > 
> > > > > > > > # lspci
> > > > > > > > 00:01.0 PCI bridge: Marvell Technology Group Ltd. Device 6820 (rev 04)
> > > > > > > > 00:02.0 PCI bridge: Marvell Technology Group Ltd. Device 6820 (rev 04)
> > > > > > > > 00:03.0 PCI bridge: Marvell Technology Group Ltd. Device 6820 (rev 04)
> > > > > > > > 01:00.0 Network controller: Qualcomm Atheros AR9287 Wireless Network Adapter (PCI-Express) (rev 01)
> > > > > > > > 02:00.0 Network controller: Qualcomm Atheros QCA986x/988x 802.11ac Wireless Network Adapter (rev ff)
> > > > > > > > 
> > > > > > > > Does anyone have any clue what could be going on here? Is this a bug, or
> > > > > > > > did I miss something in my config or other initialisation? I've tried
> > > > > > > > with both the stock u-boot distributed with the board, and with an
> > > > > > > > > upstream u-boot from latest master; doesn't seem to make any difference.
> > > > > > > Can you try turning off CONFIG_PCIEASPM?  We had a similar recent
> > > > > > > report at https://bugzilla.kernel.org/show_bug.cgi?id=209833 but I
> > > > > > > don't think we have a fix yet.
> > > > > > Yes! Turning that off does indeed help! Thanks a bunch :)
> > > > > > 
> > > > > > You mention that bisecting this would be helpful - I can try that
> > > > > > tomorrow; any idea when this was last working?
> > > > > OK, so I tried to bisect this, but, erm, I couldn't find a working
> > > > > revision to start from? I went all the way back to 4.10 (which is the
> > > > > first version to include the device tree file for the Omnia), and even
> > > > > on that, the wireless cards were failing to initialise with ASPM
> > > > > enabled...
> > > > I have no personal experience with this device; all I know is that the
> > > > bugzilla suggests that it worked in v5.4, which isn't much help.
> > > > 
> > > > Possibly the apparent regression was really a .config change, i.e.,
> > > > CONFIG_PCIEASPM was disabled in the v5.4 kernel vtolkm@ tested and it
> > > > "worked" but got enabled later and it started failing?
> > > Yeah, I suspect so. The OpenWrt config disables CONFIG_PCIEASPM by
> > > default and only turns it on for specific targets. So I guess that it's
> > > most likely that this has never worked...
> > > 
> > > > Maybe the debug patch below would be worth trying to see if it makes
> > > > any difference?  If it *does* help, try omitting the first hunk to see
> > > > if we just need to apply the quirk_enable_clear_retrain_link() quirk.
> > > Tried, doesn't help...
> > > 
> > > -Toke
> > 
> > Found this patch
> > 
> > https://github.com/openwrt/openwrt/blob/7c0496f29bed87326f1bf591ca25ace82373cfc7/target/linux/mvebu/patches-5.4/405-PCI-aardvark-Improve-link-training.patch
> > 
> > that mentions the Compex WLE900VX card, which reading the lspci verbose
> > > output from the bugtracker seems to be the device being troubled.
> 
> Interesting.  Indeed, the Compex WLE900VX card seems to have the
> Qualcomm Atheros QCA9880 on it, and it looks like Toke's system has
> the same device in it.
> 
> The patch you mention (https://git.kernel.org/linus/43fc679ced18) is
> for aardvark, so of course doesn't help mvebu.
> 
> PCIe hardware is supposed to automatically negotiate the highest link
> speed supported by both ends.  But software *is* allowed to set an
> upper limit (the Target Link Speed in Link Control 2).  If we initiate
> a retrain and the link doesn't come back up, I wonder if we should try
> to help the hardware out by using Target Link Speed to limit to a
> lower speed and attempting another retrain, something like this hacky
> patch: (please collect the dmesg log if you try this)

My experience with that WLE900VX card, aardvark driver and aspm code:

Link training in GEN2 mode succeeds for this card only once after
reset. Repeated link retraining fails, and it fails even when aardvark
is reconfigured to GEN1 mode. A reset via the PERST# signal is required
to get link training working again.

What I did in the aardvark driver: set the mode to GEN2 and do link
training. On success, read the "negotiated link speed" from the "Link
Control Status Register" (for WLE900VX it is 0x1 - GEN1) and set it in
aardvark. Then retrain the link again (for WLE900VX it will now be at
GEN1). After that the card is stable and all future retraining (e.g.
from aspm.c) also passes.

If I do not change the aardvark mode from GEN2 to GEN1, the second link
training fails. And if I change the mode to GEN1 after this failed link
training, nothing happens; link training does not succeed.
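
To make the ordering explicit, here is a small user-space C model of
the behaviour described above. It is only a sketch: the struct and
function names are invented for illustration, and fake_train() merely
encodes what I observed with the WLE900VX (one training with the
controller at GEN2 works after reset, a second one kills the link until
the next PERST# reset). It is not aardvark driver code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model; all names are made up for illustration. */
struct fake_link {
	int controller_gen;	/* speed the controller is configured for (1 or 2) */
	int negotiated_gen;	/* speed the last successful training ended at */
	int trainings_at_gen2;	/* trainings done with the controller at GEN2 */
	bool dead;		/* link lost until the next PERST# reset */
};

/* Models the observed card behaviour, not a real register interface. */
static bool fake_train(struct fake_link *l)
{
	assert(l != NULL);

	if (l->dead)
		return false;

	/* A second training with the controller still at GEN2 fails for good. */
	if (l->controller_gen == 2 && ++l->trainings_at_gen2 > 1) {
		l->dead = true;
		return false;
	}

	l->negotiated_gen = 1;	/* the WLE900VX always ended up at GEN1 */
	return true;
}

/* The workaround: train once at GEN2, then pin the controller to the result. */
static bool bring_up_link(struct fake_link *l)
{
	l->controller_gen = 2;
	if (!fake_train(l))
		return false;

	/* Use the negotiated speed as the controller speed before retraining. */
	l->controller_gen = l->negotiated_gen;
	return fake_train(l);
}
```

With bring_up_link() every later fake_train() call passes, while two
trainings in a row at GEN2 leave the link dead even after switching the
controller to GEN1 afterwards.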

So just speculation now... In the current setup, initialization of the
card does one link training at GEN2. Then aspm.c is called, which does a
second link retraining at GEN2. If that fails, the patch below issues a
third link retraining at GEN1. If A38x/pci-mvebu has the same problem as
aardvark, then the second link retraining must already be at GEN1 (not
GEN2) to work around this issue.

Bjorn, Toke: what about trying to hack the aspm.c code to never do link
retraining at GEN2 speed, and always force GEN1 speed prior to link
training?

> diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
> index ac0557a305af..fb6e13532a2c 100644
> --- a/drivers/pci/pcie/aspm.c
> +++ b/drivers/pci/pcie/aspm.c
> @@ -192,12 +192,42 @@ static void pcie_clkpm_cap_init(struct pcie_link_state *link, int blacklist)
>  	link->clkpm_disable = blacklist ? 1 : 0;
>  }
>  
> +#define PCI_EXP_LNKCAP2_SLS	0x000000fe
> +
> +static int decrease_tls(struct pci_dev *pdev)
> +{
> +	u32 lnkcap2;
> +	u16 lnkctl2, tls;
> +
> +	pcie_capability_read_dword(pdev, PCI_EXP_LNKCAP2, &lnkcap2);
> +
> +	pcie_capability_read_word(pdev, PCI_EXP_LNKCTL2, &lnkctl2);
> +	tls = lnkctl2 & PCI_EXP_LNKCTL2_TLS;
> +
> +	pci_info(pdev, "lnkcap2 %#010x sls %#04x lnkctl2 %#06x tls %#03x\n",
> +		 lnkcap2, (lnkcap2 & PCI_EXP_LNKCAP2_SLS) >> 1,
> +		 lnkctl2, tls);
> +
> +	if (tls < 2)
> +		return -EINVAL;
> +
> +	tls--;
> +	pcie_capability_clear_and_set_word(pdev, PCI_EXP_LNKCTL2,
> +					   PCI_EXP_LNKCTL2_TLS, tls);
> +	pcie_capability_read_word(pdev, PCI_EXP_LNKCTL2, &lnkctl2);
> +	pci_info(pdev, "lnkctl2 %#010x new tls %#03x\n",
> +		 lnkctl2, tls);
> +
> +	return 0;
> +}
> +
>  static bool pcie_retrain_link(struct pcie_link_state *link)
>  {
>  	struct pci_dev *parent = link->pdev;
>  	unsigned long end_jiffies;
>  	u16 reg16;
>  
> +top:
>  	pcie_capability_read_word(parent, PCI_EXP_LNKCTL, &reg16);
>  	reg16 |= PCI_EXP_LNKCTL_RL;
>  	pcie_capability_write_word(parent, PCI_EXP_LNKCTL, reg16);
> @@ -216,10 +246,14 @@ static bool pcie_retrain_link(struct pcie_link_state *link)
>  	do {
>  		pcie_capability_read_word(parent, PCI_EXP_LNKSTA, &reg16);
>  		if (!(reg16 & PCI_EXP_LNKSTA_LT))
> -			break;
> +			return true;	/* success */
>  		msleep(1);
>  	} while (time_before(jiffies, end_jiffies));
> -	return !(reg16 & PCI_EXP_LNKSTA_LT);
> +
> +	if (decrease_tls(parent))
> +		return false;	/* can't decrease any more */
> +
> +	goto top;
>  }
>  
>  /*

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: PCI trouble on mvebu (Turris Omnia)
  2020-10-30 11:23               ` Pali Rohár
@ 2020-10-30 13:02                 ` Toke Høiland-Jørgensen
  2020-10-30 14:23                   ` Pali Rohár
  0 siblings, 1 reply; 48+ messages in thread
From: Toke Høiland-Jørgensen @ 2020-10-30 13:02 UTC (permalink / raw)
  To: Pali Rohár, Bjorn Helgaas
  Cc: vtolkm, linux-pci, linux-arm-kernel, Rob Herring,
	Ilias Apalodimas, Marek Behún, Thomas Petazzoni,
	Jason Cooper

Pali Rohár <pali@kernel.org> writes:

> My experience with that WLE900VX card, aardvark driver and aspm code:
>
> Link training in GEN2 mode for this card succeed only once after reset.
> Repeated link retraining fails and it fails even when aardvark is
> reconfigured to GEN1 mode. Reset via PERST# signal is required to have
> working link training.
>
> What I did in aardvark driver: Set mode to GEN2, do link training. If
> success read "negotiated link speed" from "Link Control Status Register"
> (for WLE900VX it is 0x1 - GEN1) and set it into aardvark. And then
> retrain link again (for WLE900VX now it would be at GEN1). After that
> card is stable and all future retraining (e.g. from aspm.c) also passes.
>
> If I do not change aardvark mode from GEN2 to GEN1 the second link
> training fails. And if I change mode to GEN1 after this failed link
> training then nothing happen, link training do not success.
>
> So just speculation now... In current setup initialization of card does
> one link training at GEN2. Then aspm.c is called which is doing second
> link retraining at GEN2. And if it fails then below patch issue third
> link retraining at GEN1. If A38x/pci-mvebu has same problem as aardvark
> then second link retraining must be at GEN1 (not GEN2) to workaround
> this issue.
>
> Bjorn, Toke: what about trying to hack aspm.c code to never do link
> retraining at GEN2 speed? And always force GEN1 speed prior link
> training?

Sounds like a plan. I poked around in aspm.c and must confess to being a
bit lost in the soup of registers ;)

So if one of you can cook up a patch, that would be most helpful!

-Toke


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: PCI trouble on mvebu (Turris Omnia)
  2020-10-30 13:02                 ` Toke Høiland-Jørgensen
@ 2020-10-30 14:23                   ` Pali Rohár
  2020-10-30 14:54                     ` ™֟☻̭҇ Ѽ ҉ ®
  0 siblings, 1 reply; 48+ messages in thread
From: Pali Rohár @ 2020-10-30 14:23 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen
  Cc: Bjorn Helgaas, vtolkm, linux-pci, linux-arm-kernel, Rob Herring,
	Ilias Apalodimas, Marek Behún, Thomas Petazzoni,
	Jason Cooper

On Friday 30 October 2020 14:02:22 Toke Høiland-Jørgensen wrote:
> Sounds like a plan. I poked around in aspm.c and must confess to being a
> bit lost in the soup of registers ;)
> 
> So if one of you can cook up a patch, that would be most helpful!

I modified Bjorn's patch to explicitly set tls to 1 and added debug info
about cls (current link speed, which is what aardvark uses). It is
untested; I only checked that it compiles.

Can you try it?

diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
index 253c30cc1967..f934c0b52f41 100644
--- a/drivers/pci/pcie/aspm.c
+++ b/drivers/pci/pcie/aspm.c
@@ -206,6 +206,27 @@ static bool pcie_retrain_link(struct pcie_link_state *link)
 	unsigned long end_jiffies;
 	u16 reg16;
 
+	u32 lnkcap2;
+	u16 lnksta, lnkctl2, cls, tls;
+
+	pcie_capability_read_dword(parent, PCI_EXP_LNKCAP2, &lnkcap2);
+	pcie_capability_read_word(parent, PCI_EXP_LNKSTA, &lnksta);
+	pcie_capability_read_word(parent, PCI_EXP_LNKCTL2, &lnkctl2);
+	cls = lnksta & PCI_EXP_LNKSTA_CLS;
+	tls = lnkctl2 & PCI_EXP_LNKCTL2_TLS;
+
+	pci_info(parent, "lnkcap2 %#010x sls %#04x lnksta %#06x cls %#03x lnkctl2 %#06x tls %#03x\n",
+		lnkcap2, (lnkcap2 & 0x3F) >> 1,
+		lnksta, cls,
+		lnkctl2, tls);
+
+	tls = 1;
+	pcie_capability_clear_and_set_word(parent, PCI_EXP_LNKCTL2,
+					PCI_EXP_LNKCTL2_TLS, tls);
+	pcie_capability_read_word(parent, PCI_EXP_LNKCTL2, &lnkctl2);
+	pci_info(parent, "lnkctl2 %#010x new tls %#03x\n",
+		lnkctl2, tls);
+
 	pcie_capability_read_word(parent, PCI_EXP_LNKCTL, &reg16);
 	reg16 |= PCI_EXP_LNKCTL_RL;
 	pcie_capability_write_word(parent, PCI_EXP_LNKCTL, reg16);
@@ -227,6 +248,8 @@ static bool pcie_retrain_link(struct pcie_link_state *link)
 			break;
 		msleep(1);
 	} while (time_before(jiffies, end_jiffies));
+	pci_info(parent, "lnksta %#06x new cls %#03x\n",
+		lnksta, (cls & PCI_EXP_LNKSTA_CLS));
 	return !(reg16 & PCI_EXP_LNKSTA_LT);
 }
 

* Re: PCI trouble on mvebu (Turris Omnia)
  2020-10-30 14:23                   ` Pali Rohár
@ 2020-10-30 14:54                     ` ™֟☻̭҇ Ѽ ҉ ®
  2020-10-31 12:49                       ` Toke Høiland-Jørgensen
  0 siblings, 1 reply; 48+ messages in thread
From: ™֟☻̭҇ Ѽ ҉ ® @ 2020-10-30 14:54 UTC (permalink / raw)
  To: Pali Rohár, Toke Høiland-Jørgensen
  Cc: Bjorn Helgaas, linux-pci, linux-arm-kernel, Rob Herring,
	Ilias Apalodimas, Marek Behún, Thomas Petazzoni,
	Jason Cooper

On 30/10/2020 15:23, Pali Rohár wrote:
> I modified Bjorn's patch, explicitly set tls to 1 and added debug info
> about cls (current link speed, that what is used by aardvark). It is
> untested, I just tried to compile it.
>
> Can try it?

Still exhibiting the BAR update error; run-tested with next-20201030.


[    0.396182] mvebu-pcie soc:pcie: host bridge /soc/pcie ranges:
[    0.396205] mvebu-pcie soc:pcie: Parsing ranges property...
[    0.396222] mvebu-pcie soc:pcie:      MEM 0x00f1080000..0x00f1081fff -> 0x0000080000
[    0.396251] mvebu-pcie soc:pcie:      MEM 0x00f1040000..0x00f1041fff -> 0x0000040000
[    0.396278] mvebu-pcie soc:pcie:      MEM 0x00f1044000..0x00f1045fff -> 0x0000044000
[    0.396303] mvebu-pcie soc:pcie:      MEM 0x00f1048000..0x00f1049fff -> 0x0000048000
[    0.396329] mvebu-pcie soc:pcie:      MEM 0xffffffffffffffff..0x00fffffffe -> 0x0100000000
[    0.396340] mvebu-pcie soc:pcie:       IO 0xffffffffffffffff..0x00fffffffe -> 0x0100000000
[    0.396351] mvebu-pcie soc:pcie:      MEM 0xffffffffffffffff..0x00fffffffe -> 0x0200000000
[    0.396361] mvebu-pcie soc:pcie:       IO 0xffffffffffffffff..0x00fffffffe -> 0x0200000000
[    0.396372] mvebu-pcie soc:pcie:      MEM 0xffffffffffffffff..0x00fffffffe -> 0x0300000000
[    0.396382] mvebu-pcie soc:pcie:       IO 0xffffffffffffffff..0x00fffffffe -> 0x0300000000
[    0.396393] mvebu-pcie soc:pcie:      MEM 0xffffffffffffffff..0x00fffffffe -> 0x0400000000
[    0.396400] mvebu-pcie soc:pcie:       IO 0xffffffffffffffff..0x00fffffffe -> 0x0400000000
[    0.397280] mvebu-pcie soc:pcie: PCI host bridge to bus 0000:00
[    0.397299] pci_bus 0000:00: root bus resource [bus 00-ff]
[    0.397314] pci_bus 0000:00: root bus resource [mem 0xf1080000-0xf1081fff] (bus address [0x00080000-0x00081fff])
[    0.397327] pci_bus 0000:00: root bus resource [mem 0xf1040000-0xf1041fff] (bus address [0x00040000-0x00041fff])
[    0.397348] pci_bus 0000:00: root bus resource [mem 0xf1044000-0xf1045fff] (bus address [0x00044000-0x00045fff])
[    0.397360] pci_bus 0000:00: root bus resource [mem 0xf1048000-0xf1049fff] (bus address [0x00048000-0x00049fff])
[    0.397371] pci_bus 0000:00: root bus resource [mem 0xe0000000-0xe7ffffff]
[    0.397383] pci_bus 0000:00: root bus resource [io  0x1000-0xeffff]
[    0.397388] pci_bus 0000:00: scanning bus
[    0.397495] pci 0000:00:01.0: [11ab:6820] type 01 class 0x060400
[    0.397509] pci 0000:00:01.0: reg 0x38: [mem 0x00000000-0x000007ff pref]
[    0.398052] pci 0000:00:02.0: [11ab:6820] type 01 class 0x060400
[    0.398064] pci 0000:00:02.0: reg 0x38: [mem 0x00000000-0x000007ff pref]
[    0.398585] pci 0000:00:03.0: [11ab:6820] type 01 class 0x060400
[    0.398597] pci 0000:00:03.0: reg 0x38: [mem 0x00000000-0x000007ff pref]
[    0.399755] pci_bus 0000:00: fixups for bus
[    0.399773] pci 0000:00:01.0: scanning [bus 00-00] behind bridge, pass 0
[    0.399777] pci 0000:00:01.0: bridge configuration invalid ([bus 00-00]), reconfiguring
[    0.399784] pci 0000:00:02.0: scanning [bus 00-00] behind bridge, pass 0
[    0.399787] pci 0000:00:02.0: bridge configuration invalid ([bus 00-00]), reconfiguring
[    0.399794] pci 0000:00:03.0: scanning [bus 00-00] behind bridge, pass 0
[    0.399797] pci 0000:00:03.0: bridge configuration invalid ([bus 00-00]), reconfiguring
[    0.399803] pci 0000:00:01.0: scanning [bus 00-00] behind bridge, pass 1
[    0.400032] pci_bus 0000:01: scanning bus
[    0.400784] pci_bus 0000:01: fixups for bus
[    0.400794] pci_bus 0000:01: bus scan returning with max=01
[    0.400800] pci_bus 0000:01: busn_res: [bus 01-ff] end is updated to 01
[    0.400808] pci 0000:00:02.0: scanning [bus 00-00] behind bridge, pass 1
[    0.401032] pci_bus 0000:02: scanning bus
[    0.401078] pci 0000:02:00.0: [168c:003c] type 00 class 0x028000
[    0.401098] pci 0000:02:00.0: reg 0x10: [mem 0x00000000-0x001fffff 64bit]
[    0.401125] pci 0000:02:00.0: reg 0x30: [mem 0x00000000-0x0000ffff pref]
[    0.401217] pci 0000:02:00.0: supports D1 D2
[    0.401614] pci 0000:00:02.0: ASPM: current common clock configuration is inconsistent, reconfiguring
[    0.401626] pci 0000:00:02.0: lnkcap2 0x00000000 sls 0x00 lnksta 0x1011 cls 0x1 lnkctl2 0x0000 tls 0x0
[    0.401632] pci 0000:00:02.0: lnkctl2 0x00000000 new tls 0x1
[    0.428701] pci 0000:00:02.0: lnksta 0x1011 new cls 0x1
[    0.429486] pci_bus 0000:02: fixups for bus
[    0.429498] pci_bus 0000:02: bus scan returning with max=02
[    0.429504] pci_bus 0000:02: busn_res: [bus 02-ff] end is updated to 02
[    0.429514] pci 0000:00:03.0: scanning [bus 00-00] behind bridge, pass 1
[    0.429778] pci_bus 0000:03: scanning bus
[    0.429831] pci 0000:03:00.0: [168c:002e] type 00 class 0x028000
[    0.429854] pci 0000:03:00.0: reg 0x10: [mem 0x00000000-0x0000ffff 64bit]
[    0.429978] pci 0000:03:00.0: supports D1
[    0.429985] pci 0000:03:00.0: PME# supported from D0 D1 D3hot
[    0.429992] pci 0000:03:00.0: PME# disabled
[    0.430403] pci 0000:00:03.0: ASPM: current common clock configuration is inconsistent, reconfiguring
[    0.430416] pci 0000:00:03.0: lnkcap2 0x00000000 sls 0x00 lnksta 0x1011 cls 0x1 lnkctl2 0x0000 tls 0x0
[    0.430421] pci 0000:00:03.0: lnkctl2 0x00000000 new tls 0x1
[    0.460692] pci 0000:00:03.0: lnksta 0x1011 new cls 0x1
[    0.461459] pci_bus 0000:03: fixups for bus
[    0.461470] pci_bus 0000:03: bus scan returning with max=03
[    0.461476] pci_bus 0000:03: busn_res: [bus 03-ff] end is updated to 03
[    0.461482] pci_bus 0000:00: bus scan returning with max=03
[    0.461552] pci 0000:00:02.0: BAR 8: assigned [mem 0xe0000000-0xe02fffff]
[    0.461561] pci 0000:00:03.0: BAR 8: assigned [mem 0xe0300000-0xe03fffff]
[    0.461568] pci 0000:00:01.0: BAR 6: assigned [mem 0xe0400000-0xe04007ff pref]
[    0.461576] pci 0000:00:02.0: BAR 6: assigned [mem 0xe0500000-0xe05007ff pref]
[    0.461583] pci 0000:00:03.0: BAR 6: assigned [mem 0xe0600000-0xe06007ff pref]
[    0.461593] pci 0000:00:01.0: PCI bridge to [bus 01]
[    0.461620] pci 0000:02:00.0: BAR 0: assigned [mem 0xe0000000-0xe01fffff 64bit]
[    0.461627] pci 0000:02:00.0: BAR 0: error updating (0xe0000004 != 0xffffffff)
[    0.461633] pci 0000:02:00.0: BAR 0: error updating (high 0x000000 != 0xffffffff)
[    0.461639] pci 0000:02:00.0: BAR 6: assigned [mem 0xe0200000-0xe020ffff pref]
[    0.461645] pci 0000:00:02.0: PCI bridge to [bus 02]
[    0.461651] pci 0000:00:02.0:   bridge window [mem 0xe0000000-0xe02fffff]
[    0.461666] pci 0000:03:00.0: BAR 0: assigned [mem 0xe0300000-0xe030ffff 64bit]
[    0.461673] pci 0000:03:00.0: BAR 0: error updating (0xe0300004 != 0xffffffff)
[    0.461678] pci 0000:03:00.0: BAR 0: error updating (high 0x000000 != 0xffffffff)
[    0.461683] pci 0000:00:03.0: PCI bridge to [bus 03]
[    0.461689] pci 0000:00:03.0:   bridge window [mem 0xe0300000-0xe03fffff]
[    0.461701] pci 0000:00:01.0: Max Payload Size set to  128/ 128 (was 128), Max Read Rq  128
[    0.461710] pci 0000:00:02.0: Max Payload Size set to  128/ 128 (was 128), Max Read Rq  128
[    0.461715] pci 0000:02:00.0: Failed attempting to set the MPS
[    0.461721] pci 0000:02:00.0: Max Payload Size set to  128/ 256 (was 128), Max Read Rq  128
[    0.461729] pci 0000:00:03.0: Max Payload Size set to  128/ 128 (was 128), Max Read Rq  128
[    0.461734] pci 0000:03:00.0: Failed attempting to set the MPS
[    0.461740] pci 0000:03:00.0: Max Payload Size set to  128/ 128 (was 128), Max Read Rq  128
[    0.461855] pcieport 0000:00:01.0: assign IRQ: got 0
[    0.461866] pcieport 0000:00:01.0: enabling bus mastering
[    0.461959] pcieport 0000:00:02.0: assign IRQ: got 0
[    0.461966] pcieport 0000:00:02.0: enabling device (0140 -> 0142)
[    0.461980] pcieport 0000:00:02.0: enabling bus mastering
[    0.462065] pcieport 0000:00:03.0: assign IRQ: got 0
[    0.462070] pcieport 0000:00:03.0: enabling device (0140 -> 0142)
[    0.462080] pcieport 0000:00:03.0: enabling bus mastering
[    2.467153] pci 0000:00:03.0: enabling bus mastering
[    2.519024] ath10k_pci 0000:02:00.0: of_irq_parse_pci: failed with rc=134
[    2.531459] ath10k_pci 0000:02:00.0: assign IRQ: got 0
[    2.536915] pci 0000:00:02.0: enabling bus mastering
[    2.540553] ath10k_pci 0000:02:00.0: can't change power state from D3hot to D0 (config space inaccessible)
[    2.580450] ath10k_pci 0000:02:00.0: failed to wake up device : -110
[    2.586973] ath10k_pci 0000:02:00.0: disabling bus mastering
[    2.587220] ath10k_pci: probe of 0000:02:00.0 failed with error -110
[    2.605598] ehci-pci: EHCI PCI platform driver
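
For reference, the register values in this log can be decoded by hand. The
sketch below is a standalone userspace model, not kernel code: the masks
follow the PCIe Link Status layout (the same bit positions as the kernel's
PCI_EXP_LNKSTA_* constants), and the helper names decode_lnksta(),
config_read_failed(), bar_address() and bar_is_64bit() are hypothetical.
It assumes a 32-bit memory BAR low dword with flags in bits 3:0.

```c
#include <stdbool.h>
#include <stdint.h>

/* Link Status register fields (PCIe capability). */
#define LNKSTA_CLS 0x000fu /* Current Link Speed, bits 3:0 */
#define LNKSTA_NLW 0x03f0u /* Negotiated Link Width, bits 9:4 */
#define LNKSTA_LT  0x0800u /* Link Training in progress, bit 11 */
#define LNKSTA_SLC 0x1000u /* Slot Clock Configuration, bit 12 */

struct lnksta_fields {
	unsigned speed;   /* 1 = 2.5 GT/s (GEN1), 2 = 5 GT/s (GEN2) */
	unsigned width;   /* number of lanes */
	bool training;    /* link training still in progress */
	bool slot_clock;  /* slot uses the platform reference clock */
};

/* Split a raw Link Status value into its fields. */
static struct lnksta_fields decode_lnksta(uint16_t lnksta)
{
	struct lnksta_fields f;

	f.speed = lnksta & LNKSTA_CLS;
	f.width = (lnksta & LNKSTA_NLW) >> 4;
	f.training = (lnksta & LNKSTA_LT) != 0;
	f.slot_clock = (lnksta & LNKSTA_SLC) != 0;
	return f;
}

/* An all-ones config read usually means the device did not respond. */
static bool config_read_failed(uint32_t value)
{
	return value == 0xffffffffu;
}

/* Memory BAR low dword: address in bits 31:4, flags in bits 3:0. */
static uint32_t bar_address(uint32_t bar)
{
	return bar & ~0xfu;
}

/* Memory BAR type field (bits 2:1) equal to 10b means a 64-bit BAR. */
static bool bar_is_64bit(uint32_t bar)
{
	return (bar & 0x6u) == 0x4u;
}
```

Applied to the values above, lnksta 0x1011 decodes to speed 1 (GEN1),
width x1, no training in progress, and 0xe0000004 is a 64-bit BAR at
0xe0000000 whose read-back came back as all-ones. Two caveats when reading
the debug lines: lnkctl2 still reads 0 after tls was set to 1, which
suggests the write did not stick on this bridge, and the "new cls" line
appears to print the lnksta value captured before retraining, so it may not
reflect the post-retrain state.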



* Re: PCI trouble on mvebu (Turris Omnia)
  2020-10-30 14:54                     ` ™֟☻̭҇ Ѽ ҉ ®
@ 2020-10-31 12:49                       ` Toke Høiland-Jørgensen
  2020-11-02 15:24                         ` Pali Rohár
  0 siblings, 1 reply; 48+ messages in thread
From: Toke Høiland-Jørgensen @ 2020-10-31 12:49 UTC (permalink / raw)
  To: vtolkm, Pali Rohár
  Cc: Bjorn Helgaas, linux-pci, linux-arm-kernel, Rob Herring,
	Ilias Apalodimas, Marek Behún, Thomas Petazzoni,
	Jason Cooper

"™֟☻̭҇ Ѽ ҉ ®" <vtolkm@googlemail.com> writes:

> Still exhibiting the BAR update error, run tested with next--20201030

Yup, same for me :(

-Toke


* Re: PCI trouble on mvebu (Turris Omnia)
  2020-10-31 12:49                       ` Toke Høiland-Jørgensen
@ 2020-11-02 15:24                         ` Pali Rohár
  2020-11-02 15:54                           ` Toke Høiland-Jørgensen
  0 siblings, 1 reply; 48+ messages in thread
From: Pali Rohár @ 2020-11-02 15:24 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen
  Cc: vtolkm, Bjorn Helgaas, linux-pci, linux-arm-kernel, Rob Herring,
	Ilias Apalodimas, Marek Behún, Thomas Petazzoni,
	Jason Cooper

On Saturday 31 October 2020 13:49:49 Toke Høiland-Jørgensen wrote:
> "™֟☻̭҇ Ѽ ҉ ®" <vtolkm@googlemail.com> writes:
> 
> > Still exhibiting the BAR update error, run tested with next--20201030
> 
> Yup, same for me :(

So then it is a different issue, not similar to the aardvark one.

Anyway, was ASPM working on some previous kernel version? Or was it
always broken on Turris Omnia?

And does anybody have another Armada 385 device with mPCIe slots to test
whether ASPM is working? Or any other 32-bit Marvell Armada SoC?

I would like to know if this is an issue only on Turris Omnia, or also on
other Armada 385 SoC devices, or even on any other device which uses the
pci-mvebu.c driver.

* Re: PCI trouble on mvebu (Turris Omnia)
  2020-11-02 15:24                         ` Pali Rohár
@ 2020-11-02 15:54                           ` Toke Høiland-Jørgensen
  2020-11-02 16:18                             ` ™֟☻̭҇ Ѽ ҉ ®
  0 siblings, 1 reply; 48+ messages in thread
From: Toke Høiland-Jørgensen @ 2020-11-02 15:54 UTC (permalink / raw)
  To: Pali Rohár
  Cc: vtolkm, Bjorn Helgaas, linux-pci, linux-arm-kernel, Rob Herring,
	Ilias Apalodimas, Marek Behún, Thomas Petazzoni,
	Jason Cooper

Pali Rohár <pali@kernel.org> writes:

> On Saturday 31 October 2020 13:49:49 Toke Høiland-Jørgensen wrote:
>> "™֟☻̭҇ Ѽ ҉ ®" <vtolkm@googlemail.com> writes:
>> 
>> > Still exhibiting the BAR update error, run tested with next--20201030
>> 
>> Yup, same for me :(
>
> So then it is different issue and not similar to aardvark one.
>
> Anyway, was ASPM working on some previous kernel version? Or was it
> always broken on Turris Omnia?

I tried bisecting and couldn't find a commit that worked. And OpenWrt by
default builds with ASPM off, so my best guess is that it was always
broken.

However, the two other PCI slots *do* work with ASPM on, as long as
they're both occupied when booting. If I only have one card installed
apart from the dodgy WLE900, both of them fail...

> And has somebody other Armada 385 device with mPCIe slots to test if
> ASPM is working? Or any other 32bit Marvell Armada SOC?
>
> I would like to know if this is issue only on Turris Omnia or also on
> other Armada 385 SOC device or even on any other device which uses
> pci-mvebu.c driver.

See above: It does partly work on my Omnia. Is it possible to define a
quirk to just disable it on a per-slot basis for the WLE900 card? Maybe
just doing that and calling it a day would be enough...
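
A rough idea of what such a quirk would key on, as a standalone userspace
sketch rather than kernel code. It assumes the WLE900 is the 168c:003c
device that ath10k_pci binds to in the boot logs in this thread; the table,
the ASPM_* bit values and aspm_states_to_disable() are hypothetical names
for illustration. In the kernel, the equivalent would presumably be a PCI
fixup declared with one of the DECLARE_PCI_FIXUP_* macros that calls
pci_disable_link_state().

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical ASPM state bits for this sketch only. */
#define ASPM_L0S 0x1u
#define ASPM_L1  0x2u

struct aspm_quirk {
	uint16_t vendor;
	uint16_t device;
	unsigned disable; /* ASPM states to switch off for this device */
};

/* Assumption: 168c:003c is the WLE900 card discussed in this thread. */
static const struct aspm_quirk aspm_quirks[] = {
	{ 0x168c, 0x003c, ASPM_L0S | ASPM_L1 },
};

/* Return the ASPM states to disable for a device, or 0 if none match. */
static unsigned aspm_states_to_disable(uint16_t vendor, uint16_t device)
{
	size_t i;

	for (i = 0; i < sizeof(aspm_quirks) / sizeof(aspm_quirks[0]); i++) {
		if (aspm_quirks[i].vendor == vendor &&
		    aspm_quirks[i].device == device)
			return aspm_quirks[i].disable;
	}
	return 0;
}
```

Note that matching on vendor/device ID makes this a per-card rule, not a
per-slot one: the card would have ASPM disabled in whichever slot it sits.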

-Toke


* Re: PCI trouble on mvebu (Turris Omnia)
  2020-11-02 15:54                           ` Toke Høiland-Jørgensen
@ 2020-11-02 16:18                             ` ™֟☻̭҇ Ѽ ҉ ®
  2020-11-02 16:33                               ` Toke Høiland-Jørgensen
  0 siblings, 1 reply; 48+ messages in thread
From: ™֟☻̭҇ Ѽ ҉ ® @ 2020-11-02 16:18 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen, Pali Rohár
  Cc: Bjorn Helgaas, linux-pci, linux-arm-kernel, Rob Herring,
	Ilias Apalodimas, Marek Behún, Thomas Petazzoni,
	Jason Cooper

On 02/11/2020 16:54, Toke Høiland-Jørgensen wrote:
> Pali Rohár <pali@kernel.org> writes:
>
>> On Saturday 31 October 2020 13:49:49 Toke Høiland-Jørgensen wrote:
>>> "™֟☻̭҇ Ѽ ҉ ®" <vtolkm@googlemail.com> writes:
>>>
>>>> On 30/10/2020 15:23, Pali Rohár wrote:
>>>>> On Friday 30 October 2020 14:02:22 Toke Høiland-Jørgensen wrote:
>>>>>> Pali Rohár <pali@kernel.org> writes:
>>>>>>> My experience with that WLE900VX card, aardvark driver and aspm code:
>>>>>>>
>>>>>>> Link training in GEN2 mode for this card succeed only once after reset.
>>>>>>> Repeated link retraining fails and it fails even when aardvark is
>>>>>>> reconfigured to GEN1 mode. Reset via PERST# signal is required to have
>>>>>>> working link training.
>>>>>>>
>>>>>>> What I did in aardvark driver: Set mode to GEN2, do link training. If
>>>>>>> success read "negotiated link speed" from "Link Control Status Register"
>>>>>>> (for WLE900VX it is 0x1 - GEN1) and set it into aardvark. And then
>>>>>>> retrain link again (for WLE900VX now it would be at GEN1). After that
>>>>>>> card is stable and all future retraining (e.g. from aspm.c) also passes.
>>>>>>>
>>>>>>> If I do not change aardvark mode from GEN2 to GEN1 the second link
>>>>>>> training fails. And if I change mode to GEN1 after this failed link
>>>>>>> training then nothing happen, link training do not success.
>>>>>>>
>>>>>>> So just speculation now... In current setup initialization of card does
>>>>>>> one link training at GEN2. Then aspm.c is called which is doing second
>>>>>>> link retraining at GEN2. And if it fails then below patch issue third
>>>>>>> link retraining at GEN1. If A38x/pci-mvebu has same problem as aardvark
>>>>>>> then second link retraining must be at GEN1 (not GEN2) to workaround
>>>>>>> this issue.
>>>>>>>
>>>>>>> Bjorn, Toke: what about trying to hack aspm.c code to never do link
>>>>>>> retraining at GEN2 speed? And always force GEN1 speed prior link
>>>>>>> training?
>>>>>> Sounds like a plan. I poked around in aspm.c and must confess to being a
>>>>>> bit lost in the soup of registers ;)
>>>>>>
>>>>>> So if one of you can cook up a patch, that would be most helpful!
>>>>> I modified Bjorn's patch, explicitly set tls to 1 and added debug info
>>>>> about cls (current link speed, that what is used by aardvark). It is
>>>>> untested, I just tried to compile it.
>>>>>
>>>>> Can try it?
>>>>>
>>>>> diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
>>>>> index 253c30cc1967..f934c0b52f41 100644
>>>>> --- a/drivers/pci/pcie/aspm.c
>>>>> +++ b/drivers/pci/pcie/aspm.c
>>>>> @@ -206,6 +206,27 @@ static bool pcie_retrain_link(struct pcie_link_state *link)
>>>>>    	unsigned long end_jiffies;
>>>>>    	u16 reg16;
>>>>>    
>>>>> +	u32 lnkcap2;
>>>>> +	u16 lnksta, lnkctl2, cls, tls;
>>>>> +
>>>>> +	pcie_capability_read_dword(parent, PCI_EXP_LNKCAP2, &lnkcap2);
>>>>> +	pcie_capability_read_word(parent, PCI_EXP_LNKSTA, &lnksta);
>>>>> +	pcie_capability_read_word(parent, PCI_EXP_LNKCTL2, &lnkctl2);
>>>>> +	cls = lnksta & PCI_EXP_LNKSTA_CLS;
>>>>> +	tls = lnkctl2 & PCI_EXP_LNKCTL2_TLS;
>>>>> +
>>>>> +	pci_info(parent, "lnkcap2 %#010x sls %#04x lnksta %#06x cls %#03x lnkctl2 %#06x tls %#03x\n",
>>>>> +		lnkcap2, (lnkcap2 & 0x3F) >> 1,
>>>>> +		lnksta, cls,
>>>>> +		lnkctl2, tls);
>>>>> +
>>>>> +	tls = 1;
>>>>> +	pcie_capability_clear_and_set_word(parent, PCI_EXP_LNKCTL2,
>>>>> +					PCI_EXP_LNKCTL2_TLS, tls);
>>>>> +	pcie_capability_read_word(parent, PCI_EXP_LNKCTL2, &lnkctl2);
>>>>> +	pci_info(parent, "lnkctl2 %#010x new tls %#03x\n",
>>>>> +		lnkctl2, tls);
>>>>> +
>>>>>    	pcie_capability_read_word(parent, PCI_EXP_LNKCTL, &reg16);
>>>>>    	reg16 |= PCI_EXP_LNKCTL_RL;
>>>>>    	pcie_capability_write_word(parent, PCI_EXP_LNKCTL, reg16);
>>>>> @@ -227,6 +248,8 @@ static bool pcie_retrain_link(struct pcie_link_state *link)
>>>>>    			break;
>>>>>    		msleep(1);
>>>>>    	} while (time_before(jiffies, end_jiffies));
>>>>> +	pci_info(parent, "lnksta %#06x new cls %#03x\n",
>>>>> +		reg16, (reg16 & PCI_EXP_LNKSTA_CLS));
>>>>>    	return !(reg16 & PCI_EXP_LNKSTA_LT);
>>>>>    }
>>>>>    
>>>> Still exhibiting the BAR update error; run-tested with next-20201030
>>> Yup, same for me :(
>> So then it is different issue and not similar to aardvark one.
>>
>> Anyway, was ASPM working on some previous kernel version? Or was it
>> always broken on Turris Omnia?
> I tried bisecting and couldn't find a commit that worked. And OpenWrt by
> default builds with ASPM off, so my best guess is that it was always
> broken.
>
> However, the two other PCI slots *do* work with ASPM on, as long as
> they're both occupied when booting. If I only have one card installed
> apart from the dodgy WLE900, both of them fail...

Just to be sure it is not a (particular) mPCIe slot issue on the TO - 
did you change the device order in the mPCIe slots?

On my node:

- right slot (next to the CPU) hosts a SSD
- centre slot hosts WLE900VX
- left slot (over the SIM card slot) hosts the WLE200N2



* Re: PCI trouble on mvebu (Turris Omnia)
  2020-11-02 16:18                             ` ™֟☻̭҇ Ѽ ҉ ®
@ 2020-11-02 16:33                               ` Toke Høiland-Jørgensen
  0 siblings, 0 replies; 48+ messages in thread
From: Toke Høiland-Jørgensen @ 2020-11-02 16:33 UTC (permalink / raw)
  To: vtolkm, Pali Rohár
  Cc: Bjorn Helgaas, linux-pci, linux-arm-kernel, Rob Herring,
	Ilias Apalodimas, Marek Behún, Thomas Petazzoni,
	Jason Cooper

"™֟☻̭҇ Ѽ ҉ ®" <vtolkm@googlemail.com> writes:

> On 02/11/2020 16:54, Toke Høiland-Jørgensen wrote:
>> Pali Rohár <pali@kernel.org> writes:
>>
>>> On Saturday 31 October 2020 13:49:49 Toke Høiland-Jørgensen wrote:
>>>> "™֟☻̭҇ Ѽ ҉ ®" <vtolkm@googlemail.com> writes:
>>>>
>>>>> On 30/10/2020 15:23, Pali Rohár wrote:
>>>>>> On Friday 30 October 2020 14:02:22 Toke Høiland-Jørgensen wrote:
>>>>>>> Pali Rohár <pali@kernel.org> writes:
>>>>>>>> My experience with that WLE900VX card, aardvark driver and aspm code:
>>>>>>>>
>>>>>>>> Link training in GEN2 mode for this card succeeds only once after
>>>>>>>> reset. Repeated link retraining fails, and it fails even when aardvark
>>>>>>>> is reconfigured to GEN1 mode. A reset via the PERST# signal is required
>>>>>>>> to get link training working again.
>>>>>>>>
>>>>>>>> What I did in the aardvark driver: set the mode to GEN2 and do link
>>>>>>>> training. If it succeeds, read the "negotiated link speed" from the
>>>>>>>> "Link Control Status Register" (for WLE900VX it is 0x1 - GEN1) and set
>>>>>>>> it in aardvark. Then retrain the link again (for WLE900VX it would now
>>>>>>>> be at GEN1). After that the card is stable and all future retraining
>>>>>>>> (e.g. from aspm.c) also passes.
>>>>>>>>
>>>>>>>> If I do not change the aardvark mode from GEN2 to GEN1, the second
>>>>>>>> link training fails. And if I change the mode to GEN1 after this
>>>>>>>> failed link training, nothing happens: link training still does not
>>>>>>>> succeed.
>>>>>>>>
>>>>>>>> So just speculation now... In the current setup, initialization of the
>>>>>>>> card does one link training at GEN2. Then aspm.c is called, which does
>>>>>>>> a second link retraining at GEN2. If that fails, the patch below
>>>>>>>> issues a third link retraining at GEN1. If A38x/pci-mvebu has the same
>>>>>>>> problem as aardvark, then the second link retraining must be at GEN1
>>>>>>>> (not GEN2) to work around this issue.
>>>>>>>>
>>>>>>>> Bjorn, Toke: what about trying to hack the aspm.c code to never do link
>>>>>>>> retraining at GEN2 speed, and to always force GEN1 speed prior to link
>>>>>>>> training?
>>>>>>> Sounds like a plan. I poked around in aspm.c and must confess to being a
>>>>>>> bit lost in the soup of registers ;)
>>>>>>>
>>>>>>> So if one of you can cook up a patch, that would be most helpful!
>>>>>> I modified Bjorn's patch to explicitly set tls to 1 and added debug info
>>>>>> about cls (current link speed, which is what aardvark uses). It is
>>>>>> untested; I only checked that it compiles.
>>>>>>
>>>>>> Can you try it?
>>>>>>
>>>>>> diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
>>>>>> index 253c30cc1967..f934c0b52f41 100644
>>>>>> --- a/drivers/pci/pcie/aspm.c
>>>>>> +++ b/drivers/pci/pcie/aspm.c
>>>>>> @@ -206,6 +206,27 @@ static bool pcie_retrain_link(struct pcie_link_state *link)
>>>>>>    	unsigned long end_jiffies;
>>>>>>    	u16 reg16;
>>>>>>    
>>>>>> +	u32 lnkcap2;
>>>>>> +	u16 lnksta, lnkctl2, cls, tls;
>>>>>> +
>>>>>> +	pcie_capability_read_dword(parent, PCI_EXP_LNKCAP2, &lnkcap2);
>>>>>> +	pcie_capability_read_word(parent, PCI_EXP_LNKSTA, &lnksta);
>>>>>> +	pcie_capability_read_word(parent, PCI_EXP_LNKCTL2, &lnkctl2);
>>>>>> +	cls = lnksta & PCI_EXP_LNKSTA_CLS;
>>>>>> +	tls = lnkctl2 & PCI_EXP_LNKCTL2_TLS;
>>>>>> +
>>>>>> +	pci_info(parent, "lnkcap2 %#010x sls %#04x lnksta %#06x cls %#03x lnkctl2 %#06x tls %#03x\n",
>>>>>> +		lnkcap2, (lnkcap2 & 0x3F) >> 1,
>>>>>> +		lnksta, cls,
>>>>>> +		lnkctl2, tls);
>>>>>> +
>>>>>> +	tls = 1;
>>>>>> +	pcie_capability_clear_and_set_word(parent, PCI_EXP_LNKCTL2,
>>>>>> +					PCI_EXP_LNKCTL2_TLS, tls);
>>>>>> +	pcie_capability_read_word(parent, PCI_EXP_LNKCTL2, &lnkctl2);
>>>>>> +	pci_info(parent, "lnkctl2 %#010x new tls %#03x\n",
>>>>>> +		lnkctl2, tls);
>>>>>> +
>>>>>>    	pcie_capability_read_word(parent, PCI_EXP_LNKCTL, &reg16);
>>>>>>    	reg16 |= PCI_EXP_LNKCTL_RL;
>>>>>>    	pcie_capability_write_word(parent, PCI_EXP_LNKCTL, reg16);
>>>>>> @@ -227,6 +248,8 @@ static bool pcie_retrain_link(struct pcie_link_state *link)
>>>>>>    			break;
>>>>>>    		msleep(1);
>>>>>>    	} while (time_before(jiffies, end_jiffies));
>>>>>> +	pci_info(parent, "lnksta %#06x new cls %#03x\n",
>>>>>> +		reg16, (reg16 & PCI_EXP_LNKSTA_CLS));
>>>>>>    	return !(reg16 & PCI_EXP_LNKSTA_LT);
>>>>>>    }
>>>>>>    
>>>>> Still exhibiting the BAR update error; run-tested with next-20201030
>>>> Yup, same for me :(
>>> So then it is different issue and not similar to aardvark one.
>>>
>>> Anyway, was ASPM working on some previous kernel version? Or was it
>>> always broken on Turris Omnia?
>> I tried bisecting and couldn't find a commit that worked. And OpenWrt by
>> default builds with ASPM off, so my best guess is that it was always
>> broken.
>>
>> However, the two other PCI slots *do* work with ASPM on, as long as
>> they're both occupied when booting. If I only have one card installed
>> apart from the dodgy WLE900, both of them fail...
>
> Just to be sure it is not a (particular) mPCIe slot issue on the TO - 
> did you change the device order in the mPCIe slots?

No, I didn't.

> On my node:
>
> - right slot (next to the CPU) hosts a SSD
> - centre slot hosts WLE900VX
> - left slot (over the SIM card slot) hosts the WLE200N2

That's the same order as the PCI subsystem enumerates the slots (on my
machine at least). I have WLE200/WLE900/MT76 in those three slots, which
makes slot 1 and 3 work, while slot 2 craps out. If I remove the MT76
card (as it was originally), neither of slots 1 and 2 work...

-Toke
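
For readers who, like me, get lost in the registers touched by the quoted
debug patch, here is a minimal user-space sketch of the bit manipulation it
performs. It works on plain integers only and does not touch hardware. The
mask values are the ones I believe include/uapi/linux/pci_regs.h uses for
PCI_EXP_LNKSTA_CLS, PCI_EXP_LNKSTA_LT and PCI_EXP_LNKCTL2_TLS; all helper
names are hypothetical and exist only for illustration.

```c
#include <stdint.h>

/* Field masks, assumed to match include/uapi/linux/pci_regs.h. */
#define LNKSTA_CLS  0x000f /* Link Status: Current Link Speed */
#define LNKSTA_LT   0x0800 /* Link Status: Link Training in progress */
#define LNKCTL2_TLS 0x000f /* Link Control 2: Target Link Speed */

/* Hypothetical helper: speed code from a Link Status value
 * (1 = 2.5 GT/s "GEN1", 2 = 5 GT/s "GEN2"). */
static inline uint16_t lnksta_cls(uint16_t lnksta)
{
	return lnksta & LNKSTA_CLS;
}

/* Hypothetical helper: integer-only analogue of what
 * pcie_capability_clear_and_set_word() does to the register value. */
static inline uint16_t clear_and_set16(uint16_t val, uint16_t clear,
				       uint16_t set)
{
	return (uint16_t)((val & ~clear) | set);
}

/* Hypothetical helper: set Target Link Speed to GEN1 (tls = 1 in the
 * patch) while leaving every other Link Control 2 bit untouched. */
static inline uint16_t lnkctl2_force_gen1(uint16_t lnkctl2)
{
	return clear_and_set16(lnkctl2, LNKCTL2_TLS, 1);
}

/* Hypothetical helper: retraining has finished once the LT bit clears,
 * which is the condition the polling loop in pcie_retrain_link() waits for. */
static inline int lnksta_training_done(uint16_t lnksta)
{
	return !(lnksta & LNKSTA_LT);
}
```

For example, a Link Control 2 value of 0x0042 (target speed GEN2) becomes
0x0041 after lnkctl2_force_gen1(), and a Link Status value of 0x0811 still
reports training in progress.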



Thread overview: 48+ messages
2020-10-27 15:43 PCI trouble on mvebu (Turris Omnia) Toke Høiland-Jørgensen
2020-10-27 17:20 ` Bjorn Helgaas
2020-10-27 17:44   ` ™֟☻̭҇ Ѽ ҉ ®
2020-10-27 18:59     ` Toke Høiland-Jørgensen
2020-10-27 20:20       ` Toke Høiland-Jørgensen
2020-10-27 21:22         ` ™֟☻̭҇ Ѽ ҉ ®
2020-10-27 21:31           ` Toke Høiland-Jørgensen
2020-10-27 22:01             ` ™֟☻̭҇ Ѽ ҉ ®
2020-10-27 22:12               ` Toke Høiland-Jørgensen
2020-10-27 18:56   ` Toke Høiland-Jørgensen
2020-10-28 13:36     ` Toke Høiland-Jørgensen
2020-10-28 14:42       ` Bjorn Helgaas
2020-10-28 15:08         ` Toke Høiland-Jørgensen
2020-10-28 16:40           ` ™֟☻̭҇ Ѽ ҉ ®
2020-10-28 23:16             ` Bjorn Helgaas
2020-10-29 10:09               ` Pali Rohár
2020-10-29 10:56                 ` ™֟☻̭҇ Ѽ ҉ ®
2020-10-29 11:12                 ` Toke Høiland-Jørgensen
2020-10-29 19:30                   ` Bjorn Helgaas
2020-10-29 19:56                     ` ™֟☻̭҇ Ѽ ҉ ®
2020-10-29 19:57                     ` Andrew Lunn
2020-10-29 21:55                       ` Thomas Petazzoni
2020-10-29 20:18                     ` Toke Høiland-Jørgensen
2020-10-29 22:09                       ` Toke Høiland-Jørgensen
2020-10-29 20:58                     ` Marek Behun
2020-10-30 10:08                       ` Pali Rohár
2020-10-30 10:45                         ` Marek Behun
2020-10-29 21:54                     ` Thomas Petazzoni
2020-10-29 23:15                       ` Toke Høiland-Jørgensen
2020-10-30  8:23                         ` Thomas Petazzoni
2020-10-30 10:15                         ` Pali Rohár
2020-10-29 10:41               ` Toke Høiland-Jørgensen
2020-10-29 11:18                 ` ™֟☻̭҇ Ѽ ҉ ®
2020-10-30 11:23               ` Pali Rohár
2020-10-30 13:02                 ` Toke Høiland-Jørgensen
2020-10-30 14:23                   ` Pali Rohár
2020-10-30 14:54                     ` ™֟☻̭҇ Ѽ ҉ ®
2020-10-31 12:49                       ` Toke Høiland-Jørgensen
2020-11-02 15:24                         ` Pali Rohár
2020-11-02 15:54                           ` Toke Høiland-Jørgensen
2020-11-02 16:18                             ` ™֟☻̭҇ Ѽ ҉ ®
2020-11-02 16:33                               ` Toke Høiland-Jørgensen
2020-10-29  1:21             ` Marek Behun
2020-10-29 15:12           ` Rob Herring
2020-10-27 18:03 ` Marek Behun
2020-10-27 19:00   ` Toke Høiland-Jørgensen
2020-10-27 20:19     ` Marek Behun
2020-10-27 20:49       ` Toke Høiland-Jørgensen
