linux-pci.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* (no subject)
@ 2018-05-04 20:06 Bjorn Helgaas
  2018-05-07 21:56 ` [PATCH] PCI: Check whether bridges allow access to extended config space Bjorn Helgaas
  0 siblings, 1 reply; 6+ messages in thread
From: Bjorn Helgaas @ 2018-05-04 20:06 UTC (permalink / raw)
  To: Gilles Buloz
  Cc: Bjorn Helgaas, linux-pci, Minghuan.Lian, linux-arm-kernel,
	Ard Biesheuvel, linux-kernel, Frederick Lawler, Sinan Kaya

Bcc: 
Subject: Re: [PATCH] PCI: Check whether bridges allow access to extended
 config space
Reply-To: 
In-Reply-To: <5AEC8002.5000309@kontron.com>

[+cc Fred, Sinan]

On Fri, May 04, 2018 at 03:45:07PM +0000, Gilles Buloz wrote:
> Le 04/05/2018 00:31, Bjorn Helgaas a écrit :
> > [+cc LKML]
> >
> > On Thu, May 03, 2018 at 12:40:27PM +0000, Gilles Buloz wrote:
> >> Subject:    [PATCH] For exception at PCI probe due to bridge reporting UR
> >>
> >> Even if a device supports extended config access, no such access must be
> >> done to this device If there's a bridge not supporting that in the path
> >> to this device. Doing such access with UR reporting enabled on the root
> >> bridge leads to an exception.
> >>
> >> This is the case on a LS1043A CPU (NXP QorIQ Layerscape) platform with
> >> the following bus topology :
> >>    LS1043 PCIe root
> >>      -> PEX8112 PCIe-to-PCI bridge (not supporting ext cfg on PCI side)
> >>        -> PMC slot connector (for legacy PMC modules)
> >> With a PMC module topology as follows :
> >>    PMC connector
> >>      -> PCI-to-PCIe bridge
> >>        -> PCIe switch (4 ports)
> >>          -> 4 PCIe devices (one on each port)
> >> In this case all devices behind the PEX8112 are supporting extended config
> >> access but this is prohibited by the PEX8112. Without this patch, an
> >> exception (synchronous abort) occurs in pci_cfg_space_size_ext().
> >>
> >> This patch checks the parent bridge of each allocated child bus to know if
> >> extended config access is supported on the child bus, and sets a flag in
> >> child->bus_flags if not supported. This  flag is inherited by all children
> >> buses of this child bus and then is checked to avoid this unsupported
> >> accesses to every device on these buses.
> > Hi Gilles,
> >
> > Thanks for the patch!  I reworked it a little bit to simplify the code
> > in pci_alloc_child_bus().  Can you test it and make sure I didn't
> > break anything?
> >
> Hi Bjorn,
> 
> Your rework works as expected. Tested on LS1043A platform with kernel 4.17-rc1, and with some backport on kernel 4.1.35
> 
> Suggestion : maybe change the pci_info() string to have :
>      pci_bus 0000:xx: extended config space not accessible
> instead of
>      pci_bus 0000:xx: extended config space not accessible on secondary bus
> as xx is already the number of the secondary bus

Oops, when I wrote that I was thinking it would be printed for the
bridge device (not the bus).  I changed it as you suggest.

Interesting, I didn't think about the fact that pci_info() would work
on a struct pci_bus * as well as on a struct pci_dev *, since it's a
macro and they both have a "dev" member.

> Info : with kernel 4.17-rc1, it turns out I need pcie_aspm=off to
> have the PMC devices behind the PCI-to-PCIe bridge of the PMC safely
> detected/configured. But this is not caused by the patch.

> Without pcie_aspm=off I saw this at one boot :
>     "pci 0000:02:0e.0: ASPM: Could not configure common clock" for this bridge, but devices
>     correctly detected/configured
> but at most boots I get :
>     no ASPM message but "pci 0000:04:02.0: bridge configuration invalid ([bus ff-ff]), reconfiguring "
>     instead, and some devices are missing. Also lspci show "rev ff" for some devices.
> I don't see this problem on 4.1.35 with the same backported patch.

This is interesting, especially since you have this unusual topology
of a path to the device that is PCIe, then conventional PCI, then PCIe
again.  We *should* be able to use ASPM on the PCIe links, but it's
definitely not a well-tested scenario.

Can you tell if something is actually broken?  Sinan's recent change,
04875177dbe0 ("PCI/ASPM: Don't warn if already in common clock mode"),
which appeared in v4.17-rc1, turns off the message in some cases.

The "bridge configuration invalid" message just means the firmware
didn't configure the bridge.  We *should* still set it up correctly,
but please report a bug if we don't.

lspci showing "ff" for some devices might be a symptom of the devices
being powered off.  In that case config reads normally return ~0 data
(though on your platform maybe it would cause exceptions).  I've seen
this in other situations and wondered if it would be worth adding a
hint to lspci so it could say "device may be powered off".

Anyway, if you are seeing something broken (more than just the
messages), please start a new thread about each one.  If you do, could
you please:

  - open a report at https://bugzilla.kernel.org/, in the Drivers/PCI
    component (open a separate bug for each issue you see)

  - use kernel version 4.17-rc1 and mark it as a regression if
    appropriate

  - attach (don't paste inline) the complete dmesg log and "lspci -vv"
    output (as root) to the bug

  - post a note to linux-pci@vger.kernel.org, cc Fred, Sinan, and me,
    and include the link to the bugzilla

Bjorn

^ permalink raw reply	[flat|nested] 6+ messages in thread
* Re: LS1043A : "synchronous abort" at boot due to PCI config read
@ 2018-04-27 16:56 Bjorn Helgaas
  2018-04-30  8:46 ` Gilles Buloz
  0 siblings, 1 reply; 6+ messages in thread
From: Bjorn Helgaas @ 2018-04-27 16:56 UTC (permalink / raw)
  To: Gilles Buloz
  Cc: Ard Biesheuvel, Bjorn Helgaas, linux-pci, linux-arm-kernel,
	minghuan.Lian

On Fri, Apr 27, 2018 at 12:29:32PM +0000, Gilles Buloz wrote:
> Le 27/04/2018 10:43, Ard Biesheuvel a écrit :
> > (add Bjorn and linux-pci)
> >
> > On 13 April 2018 at 19:32, Gilles Buloz <Gilles.Buloz@kontron.com> wrote:
> >> Dear developers,
> >>
> >> I currently have two functional workarounds for this issue but
> >> would like to know which one you would recommend, if any :-) I'm
> >> using a LS1043A CPU (NXP QorIQ Layerscape) and get a "synchronous
> >> external abort" when booting because of a PCI config read during
> >> PCI scan.
> >>
> >> I'm using a custom hardware (based on LS1043ARDB) having a
> >> PEX8112 PCIe-to-PCI bridge connected to the LS1043A to have a PCI
> >> slot for legacy devices. This bridge only supports PCI-Compatible
> >> config accesses (offset 0x00-0xFF).

I would guess the PEX8112 itself has 4K of config space, but it only
forwards 256 bytes of config space to the conventional PCI secondary
bus.

> >> On this PCI slot I connect a PCI module made of a PCI-to-PCIe
> >> bridge plus PCIe devices behind.
> >>
> >> The problem occurs when the kernel probes the PCIe devices : as
> >> they are PCIe devices, the kernel does a PCI config read access
> >> at offset 0x100 to check if "PCIe extended capability registers"
> >> are accessible (see drivers/pci/probe.c, function
> >> pci_cfg_space_size_ext()). Unfortunately the PEX8112 PCIe-to-PCI
> >> bridge that is in the path reports an error to the CPU for this
> >> access, and it seems there's no way to disable that on this
> >> bridge.
> >>
> >> The first workaround I found was to patch
> >> drivers/pci/host/pci-layerscape.c to have PCIE_ABSERR_SETTING set
> >> to 0x9400 instead of 0x9401 (for PCIE_ABSERR register) to disable
> >> error reporting. This only impacts an NXP part of the Linux
> >> kernel code, but I'm not sure this is a good idea (however it
> >> seems to be like that on Intel platforms where even MEM accesses
> >> to a no-device address return FF without any error).
> >>
> >> I've also tried another workaround that works : patch
> >> drivers/pci/probe.c to use bus_flags to remember if a bus is
> >> behind a bridge without extended address capability, to avoid PCi
> >> config read accesses at offset 0x100 in pci_cfg_space_size() /
> >> pci_cfg_space_size_ext(). But this patch impacts the generic PCI
> >> probe method of Linux.
> >>
> >> Any Idea to properly handle that issue ?
> >>
> > This seems like a rather unusual configuration, but I guess that
> > if the first bridge/switch advertises its inability to support
> > extended config space accesses, we should not be performing them
> > on any of its subordinate buses. How does the PEX8112 advertise
> > this limitation?
> >
> > That said, I wonder if it is reasonable in the first place to
> > expect that a PCIe device works as expected passing through a
> > legacy PCI layer like that.
> >
> The PEX8112 PCIe-to-PCI bridge has capability PCI_CAP_ID_EXP, but
> has no PCI_CAP_ID_PCIX capability.  As I understand the lack of
> PCI_CAP_ID_PCIX is advertising this limitation on the PCI side (no
> support for PCI config offset >=0x100).

Sounds right to me.

> Also I guess in the case of a bridge having PCI_CAP_ID_PCIX, this
> limitation would be advertised by the lack of PCI_X_STATUS_266MHZ
> and PCI_X_STATUS_533MHZ (as done in drivers/pci/probe.c at
> pci_cfg_space_size())

Also sounds right.  Per the PCI-X spec, checking for PCI_X_STATUS_266MHZ
should be enough, but it shouldn't hurt to check for either
PCI_X_STATUS_266MHZ or PCI_X_STATUS_533MHZ.

> I'm currently using the attached patch (for kernel 4.1.35-rt41 from
> NXP Yocto BSP). It uses bus_flags to remember if a bus is behind a
> bridge without extended address capability to avoid PCi config
> accesses at offset >= 0x100. Thanks to this patch I now have a
> functional system with functional PCI/PCIe devices.

The patch seems like it's looking at the right things, but I don't
want to build it into pci_scan_bridge_extend() because that function
is much too complicated already.

I'd rather build it into pci_cfg_space_size() or
pci_cfg_space_size_ext() somehow.  Maybe something along these lines?
This doesn't account for the case of a PCIe-to-PCI-X Mode 2 bridge; in
that case, I think all 4K would be accessible on the PCI-X side.

diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index ac91b6fd0bcd..d8b091f0bcd1 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1367,7 +1367,7 @@ static bool pci_ext_cfg_is_aliased(struct pci_dev *dev)
  * pci_cfg_space_size - Get the configuration space size of the PCI device
  * @dev: PCI device
  *
- * Regular PCI devices have 256 bytes, but PCI-X 2 and PCI Express devices
+ * Regular PCI devices have 256 bytes, but PCI-X Mode 2 and PCI Express devices
  * have 4096 bytes.  Even if the device is capable, that doesn't mean we can
  * access it.  Maybe we don't have a way to generate extended config space
  * accesses, or the device is behind a reverse Express bridge.  So we try
@@ -1376,9 +1376,14 @@ static bool pci_ext_cfg_is_aliased(struct pci_dev *dev)
  */
 static int pci_cfg_space_size_ext(struct pci_dev *dev)
 {
+	struct pci_dev *bridge = pci_upstream_bridge(dev);
 	u32 status;
 	int pos = PCI_CFG_SPACE_SIZE;
 
+	if (bridge && pci_is_pcie(bridge) &&
+	    pci_pcie_type(bridge) == PCI_EXP_TYPE_PCI_BRIDGE)
+		return PCI_CFG_SPACE_SIZE;
+
 	if (pci_read_config_dword(dev, pos, &status) != PCIBIOS_SUCCESSFUL)
 		return PCI_CFG_SPACE_SIZE;
 	if (status == 0xffffffff || pci_ext_cfg_is_aliased(dev))

> --- include/linux/pci.h.orig	2018-03-26 16:51:18.050000000 +0000
> +++ include/linux/pci.h	2018-03-26 16:51:27.660000000 +0000
> @@ -193,6 +193,7 @@
>  enum pci_bus_flags {
>  	PCI_BUS_FLAGS_NO_MSI   = (__force pci_bus_flags_t) 1,
>  	PCI_BUS_FLAGS_NO_MMRBC = (__force pci_bus_flags_t) 2,
> +	PCI_BUS_FLAGS_COMPAT_CFG_SPACE = (__force pci_bus_flags_t) 4,
>  };
>  
>  /* These values come from the PCI Express Spec */
> --- drivers/pci/probe.c.orig	2018-01-22 09:29:52.000000000 +0000
> +++ drivers/pci/probe.c	2018-03-26 16:54:30.830000000 +0000
> @@ -827,6 +827,28 @@
>  			child->primary = primary;
>  			pci_bus_insert_busn_res(child, secondary, subordinate);
>  			child->bridge_ctl = bctl;
> +
> +			{
> +				int pos;
> +				u32 status;
> +				bool pci_compat_cfg_space = false;
> +
> +				if (!pci_is_pcie(dev) || (pci_pcie_type(dev) == PCI_EXP_TYPE_PCIE_BRIDGE) || (pci_pcie_type(dev) == PCI_EXP_TYPE_PCI_BRIDGE)) {
> +					/* for PCI/PCI bridges, or PCIe/PCI bridge in forward or reverse mode, we have to check for PCI-X capabilities */
> +					pos = pci_find_capability(dev, PCI_CAP_ID_PCIX);
> +					if (pos) {
> +						pci_read_config_dword(dev, pos + PCI_X_STATUS, &status);
> +						if (!(status & (PCI_X_STATUS_266MHZ | PCI_X_STATUS_533MHZ)))
> +							pci_compat_cfg_space = true;
> +					} else {
> +						pci_compat_cfg_space = true;
> +					}
> +					if (pci_compat_cfg_space) {
> +						dev_info(&dev->dev, "[%04x:%04x] Child bus limited to PCI-Compatible config space\n", dev->vendor, dev->device);
> +						child->bus_flags |= PCI_BUS_FLAGS_COMPAT_CFG_SPACE;
> +					}
> +				}
> +			}
>  		}
>  
>  		cmax = pci_scan_child_bus(child);
> @@ -1098,6 +1120,11 @@
>  			goto fail;
>  	}
>  
> +	if (dev->bus->bus_flags & PCI_BUS_FLAGS_COMPAT_CFG_SPACE) {
> +		dev_info(&dev->dev, "[%04x:%04x] PCI-Compatible config space only due to parent bus(es)\n", dev->vendor, dev->device);
> +		return PCI_CFG_SPACE_SIZE;
> +	}
> +
>  	return pci_cfg_space_size_ext(dev);
>  
>   fail:

> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2018-05-10  2:44 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-05-04 20:06 Bjorn Helgaas
2018-05-07 21:56 ` [PATCH] PCI: Check whether bridges allow access to extended config space Bjorn Helgaas
2018-05-09 12:27   ` Gilles Buloz
2018-05-10  2:44     ` Frederick Lawler
  -- strict thread matches above, loose matches on Subject: below --
2018-04-27 16:56 LS1043A : "synchronous abort" at boot due to PCI config read Bjorn Helgaas
2018-04-30  8:46 ` Gilles Buloz
2018-04-30 13:36   ` Gilles Buloz
2018-04-30 17:04     ` Bjorn Helgaas
2018-04-30 17:53       ` Gilles Buloz
2018-05-02 12:57         ` Gilles Buloz
2018-05-02 13:26           ` Bjorn Helgaas
2018-05-02 13:48             ` Gilles Buloz
2018-05-02 17:23               ` Bjorn Helgaas
2018-05-03 12:40                 ` Gilles Buloz
2018-05-03 22:31                   ` [PATCH] PCI: Check whether bridges allow access to extended config space Bjorn Helgaas
2018-05-04 15:45                     ` Gilles Buloz

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).