* [PATCH] PCI: Add information about describing PCI in ACPI
@ 2016-11-17 17:59 ` Bjorn Helgaas
From: Bjorn Helgaas @ 2016-11-17 17:59 UTC
  To: linux-pci; +Cc: linux-acpi, linux-kernel, linux-arm-kernel, linaro-acpi

Add a writeup about how PCI host bridges should be described in ACPI
using PNP0A03/PNP0A08 devices, PNP0C02 devices, and the MCFG table.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
---
 Documentation/PCI/00-INDEX      |    2 +
 Documentation/PCI/acpi-info.txt |  136 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 138 insertions(+)
 create mode 100644 Documentation/PCI/acpi-info.txt

diff --git a/Documentation/PCI/00-INDEX b/Documentation/PCI/00-INDEX
index 147231f..0780280 100644
--- a/Documentation/PCI/00-INDEX
+++ b/Documentation/PCI/00-INDEX
@@ -1,5 +1,7 @@
 00-INDEX
 	- this file
+acpi-info.txt
+	- info on how PCI host bridges are represented in ACPI
 MSI-HOWTO.txt
 	- the Message Signaled Interrupts (MSI) Driver Guide HOWTO and FAQ.
 PCIEBUS-HOWTO.txt
diff --git a/Documentation/PCI/acpi-info.txt b/Documentation/PCI/acpi-info.txt
new file mode 100644
index 0000000..ccbcfda
--- /dev/null
+++ b/Documentation/PCI/acpi-info.txt
@@ -0,0 +1,136 @@
+	    ACPI considerations for PCI host bridges
+
+The basic requirement is that the ACPI namespace should describe
+*everything* that consumes address space unless there's another
+standard way for the OS to find it [1, 2].  For example, windows that
+are forwarded to PCI by a PCI host bridge should be described via ACPI
+devices, since the OS can't locate the host bridge by itself.  PCI
+devices *below* the host bridge do not need to be described via ACPI,
+because the resources they consume are inside the host bridge windows,
+and the OS can discover them via the standard PCI enumeration
+mechanism (using config accesses to read and size the BARs).
+
+This ACPI resource description is done via _CRS methods of devices in
+the ACPI namespace [2].  _CRS methods are like generalized PCI BARs:
+the OS can read _CRS and figure out what resource is being consumed
+even if it doesn't have a driver for the device [3].  That's important
+because it means an old OS can work correctly even on a system with
+new devices unknown to the OS.  The new devices won't do anything, but
+the OS can at least make sure no resources conflict with them.
+
+Static tables like MCFG, HPET, ECDT, etc., are *not* mechanisms for
+reserving address space!  The static tables are for things the OS
+needs to know early in boot, before it can parse the ACPI namespace.
+If a new table is defined, an old OS needs to operate correctly even
+though it ignores the table.  _CRS allows that because it is generic
+and understood by the old OS; a static table does not.
+
+If the OS is expected to manage an ACPI device, that device will have
+a specific _HID/_CID that tells the OS what driver to bind to it, and
+the _CRS tells the OS and the driver where the device's registers are.
+
+PNP0C02 "motherboard" devices are basically a catch-all.  There's no
+programming model for them other than "don't use these resources for
+anything else."  So any address space that is (1) not claimed by some
+other ACPI device and (2) should not be assigned by the OS to
+something else, should be claimed by a PNP0C02 _CRS method.
+
+PCI host bridges are PNP0A03 or PNP0A08 devices.  Their _CRS should
+describe all the address space they consume.  In principle, this would
+be all the windows they forward down to the PCI bus, as well as the
+bridge registers themselves.  The bridge registers include things like
+secondary/subordinate bus registers that determine the bus range below
+the bridge, window registers that describe the apertures, etc.  These
+are all device-specific, non-architected things, so the only way a
+PNP0A03/PNP0A08 driver can manage them is via _PRS/_CRS/_SRS, which
+contain the device-specific details.  These bridge registers also
+include ECAM space, since it is consumed by the bridge.
+
+ACPI defined a Producer/Consumer bit that was intended to distinguish
+the bridge apertures from the bridge registers [4, 5].  However,
+BIOSes didn't use that bit correctly, and the result is that OSes have
+to assume that everything in a PCI host bridge _CRS is a window.  That
+leaves no way to describe the bridge registers in the PNP0A03/PNP0A08
+device itself.
+
+The workaround is to describe the bridge registers (including ECAM
+space) in PNP0C02 catch-all devices [6].  With the exception of ECAM,
+the bridge register space is device-specific anyway, so the generic
+PNP0A03/PNP0A08 driver (pci_root.c) has no need to know about it.  For
+ECAM, pci_root.c learns about the space from either MCFG or the _CBA
+method.
+
+Note that the PCIe spec actually does require ECAM unless there's a
+standard firmware interface for config access, e.g., the ia64 SAL
+interface [7].  One reason is that we want a generic host bridge
+driver (pci_root.c), and a generic driver requires a generic way to
+access config space.
+
+
+[1] ACPI 6.0, sec 6.1:
+    For any device that is on a non-enumerable type of bus (for
+    example, an ISA bus), OSPM enumerates the devices' identifier(s)
+    and the ACPI system firmware must supply an _HID object ... for
+    each device to enable OSPM to do that.
+
+[2] ACPI 6.0, sec 3.7:
+    The OS enumerates motherboard devices simply by reading through
+    the ACPI Namespace looking for devices with hardware IDs.
+
+    Each device enumerated by ACPI includes ACPI-defined objects in
+    the ACPI Namespace that report the hardware resources the device
+    could occupy [_PRS], an object that reports the resources that are
+    currently used by the device [_CRS], and objects for configuring
+    those resources [_SRS].  The information is used by the Plug and
+    Play OS (OSPM) to configure the devices.
+
+[3] ACPI 6.0, sec 6.2:
+    OSPM uses device configuration objects to configure hardware
+    resources for devices enumerated via ACPI.  Device configuration
+    objects provide information about current and possible resource
+    requirements, the relationship between shared resources, and
+    methods for configuring hardware resources.
+
+    When OSPM enumerates a device, it calls _PRS to determine the
+    resource requirements of the device.  It may also call _CRS to
+    find the current resource settings for the device.  Using this
+    information, the Plug and Play system determines what resources
+    the device should consume and sets those resources by calling the
+    device’s _SRS control method.
+
+    In ACPI, devices can consume resources (for example, legacy
+    keyboards), provide resources (for example, a proprietary PCI
+    bridge), or do both.  Unless otherwise specified, resources for a
+    device are assumed to be taken from the nearest matching resource
+    above the device in the device hierarchy.
+
+[4] ACPI 6.0, sec 6.4.3.5.4:
+    Extended Address Space Descriptor
+    General Flags: Bit [0] Consumer/Producer:
+	1–This device consumes this resource
+	0–This device produces and consumes this resource
+
+[5] ACPI 6.0, sec 19.6.43:
+    ResourceUsage specifies whether the Memory range is consumed by
+    this device (ResourceConsumer) or passed on to child devices
+    (ResourceProducer).  If nothing is specified, then
+    ResourceConsumer is assumed.
+
+[6] PCI Firmware 3.0, sec 4.1.2:
+    If the operating system does not natively comprehend reserving the
+    MMCFG region, the MMCFG region must be reserved by firmware.  The
+    address range reported in the MCFG table or by _CBA method (see
+    Section 4.1.3) must be reserved by declaring a motherboard
+    resource.  For most systems, the motherboard resource would appear
+    at the root of the ACPI namespace (under \_SB) in a node with a
+    _HID of EISAID (PNP0C02), and the resources in this case should
+    not be claimed in the root PCI bus’s _CRS.  The resources can
+    optionally be returned in Int15 E820 or EFIGetMemoryMap as
+    reserved memory but must always be reported through ACPI as a
+    motherboard resource.
+
+[7] PCI Express 3.0, sec 7.2.2:
+    For systems that are PC-compatible, or that do not implement a
+    processor-architecture-specific firmware interface standard that
+    allows access to the Configuration Space, the ECAM is required as
+    defined in this section.

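As an illustration of why ECAM gives a generic driver a generic way to
reach config space, here is a minimal sketch of the ECAM address
arithmetic.  It assumes `base` is the address that corresponds to bus 0
of the segment; the function names ecam_offset() and ecam_address() are
hypothetical and are not kernel interfaces.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Sketch only: byte offset of a config register inside an ECAM region.
 * ECAM gives each function 4 KiB of config space, so the offset is
 * bus[27:20] | device[19:15] | function[14:12] | register[11:0].
 * ecam_offset() is a hypothetical name, not a kernel interface.
 */
static uint64_t ecam_offset(uint8_t bus, uint8_t dev, uint8_t fn, uint16_t reg)
{
	assert(dev < 32);	/* 5-bit device number */
	assert(fn < 8);		/* 3-bit function number */
	assert(reg < 4096);	/* 4 KiB of config space per function */

	return ((uint64_t)bus << 20) | ((uint64_t)dev << 15) |
	       ((uint64_t)fn << 12) | reg;
}

/* Assumption: base is the ECAM address that corresponds to bus 0. */
static uint64_t ecam_address(uint64_t base, uint8_t bus, uint8_t dev,
			     uint8_t fn, uint16_t reg)
{
	return base + ecam_offset(bus, dev, fn, reg);
}
```

With a made-up base of 0xE0000000, register 0x10 of device 02:03.1
would sit at 0xE0219010.  Every such address lies inside the ECAM
region, which is why that region has to be reserved somewhere the OS
will see it.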

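The Consumer/Producer discussion can be sketched in a few lines of C.
This is only a model of the reasoning in the writeup, using the bit
definition quoted in [4]; the macro and function names are hypothetical
and the code is not taken from any kernel source.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 0 of the General Flags byte quoted in [4] (hypothetical macro name). */
#define ADDR_SPACE_FLAG_CONSUMER 0x01u

/* What the descriptor itself claims: true if the range is only consumed. */
static bool descriptor_says_consumer(uint8_t general_flags)
{
	return (general_flags & ADDR_SPACE_FLAG_CONSUMER) != 0;
}

/*
 * What an OS can rely on, per the writeup: in a PCI host bridge _CRS
 * the bit is not trustworthy, so every range is treated as a window.
 * For other devices this sketch simply believes the bit.
 */
static bool treat_as_window(uint8_t general_flags, bool is_pci_host_bridge)
{
	if (is_pci_host_bridge)
		return true;
	return !descriptor_says_consumer(general_flags);
}
```

Because treat_as_window() ignores the flag for host bridges, a range
meant to describe bridge registers would be mistaken for an aperture,
which is the reason those registers end up in PNP0C02 devices instead.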

* RE: [PATCH] PCI: Add information about describing PCI in ACPI
  2016-11-17 17:59 ` Bjorn Helgaas
@ 2016-11-18 17:17   ` Gabriele Paoloni
From: Gabriele Paoloni @ 2016-11-18 17:17 UTC
  To: Bjorn Helgaas, linux-pci
  Cc: linux-acpi, linux-kernel, linux-arm-kernel, linaro-acpi

Hi Bjorn

Many thanks for putting this together, it really helps!

One thing below.

> -----Original Message-----
> From: linux-kernel-owner@vger.kernel.org [mailto:linux-kernel-
> owner@vger.kernel.org] On Behalf Of Bjorn Helgaas
> Sent: 17 November 2016 18:00
> To: linux-pci@vger.kernel.org
> Cc: linux-acpi@vger.kernel.org; linux-kernel@vger.kernel.org; linux-
> arm-kernel@lists.infradead.org; linaro-acpi@lists.linaro.org
> Subject: [PATCH] PCI: Add information about describing PCI in ACPI
> 
> Add a writeup about how PCI host bridges should be described in ACPI
> using PNP0A03/PNP0A08 devices, PNP0C02 devices, and the MCFG table.
> 
> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
> ---
>  Documentation/PCI/00-INDEX      |    2 +
>  Documentation/PCI/acpi-info.txt |  136
> +++++++++++++++++++++++++++++++++++++++
>  2 files changed, 138 insertions(+)
>  create mode 100644 Documentation/PCI/acpi-info.txt
> 
> diff --git a/Documentation/PCI/00-INDEX b/Documentation/PCI/00-INDEX
> index 147231f..0780280 100644
> --- a/Documentation/PCI/00-INDEX
> +++ b/Documentation/PCI/00-INDEX
> @@ -1,5 +1,7 @@
>  00-INDEX
>  	- this file
> +acpi-info.txt
> +	- info on how PCI host bridges are represented in ACPI
>  MSI-HOWTO.txt
>  	- the Message Signaled Interrupts (MSI) Driver Guide HOWTO and
> FAQ.
>  PCIEBUS-HOWTO.txt
> diff --git a/Documentation/PCI/acpi-info.txt b/Documentation/PCI/acpi-
> info.txt
> new file mode 100644
> index 0000000..ccbcfda
> --- /dev/null
> +++ b/Documentation/PCI/acpi-info.txt
> @@ -0,0 +1,136 @@
> +	    ACPI considerations for PCI host bridges
> +
> +The basic requirement is that the ACPI namespace should describe
> +*everything* that consumes address space unless there's another
> +standard way for the OS to find it [1, 2].  For example, windows that
> +are forwarded to PCI by a PCI host bridge should be described via ACPI
> +devices, since the OS can't locate the host bridge by itself.  PCI
> +devices *below* the host bridge do not need to be described via ACPI,
> +because the resources they consume are inside the host bridge windows,
> +and the OS can discover them via the standard PCI enumeration
> +mechanism (using config accesses to read and size the BARs).
> +
> +This ACPI resource description is done via _CRS methods of devices in
> +the ACPI namespace [2].  _CRS methods are like generalized PCI BARs:
> +the OS can read _CRS and figure out what resource is being consumed
> +even if it doesn't have a driver for the device [3].  That's important
> +because it means an old OS can work correctly even on a system with
> +new devices unknown to the OS.  The new devices won't do anything, but
> +the OS can at least make sure no resources conflict with them.
> +
> +Static tables like MCFG, HPET, ECDT, etc., are *not* mechanisms for
> +reserving address space!  The static tables are for things the OS
> +needs to know early in boot, before it can parse the ACPI namespace.
> +If a new table is defined, an old OS needs to operate correctly even
> +though it ignores the table.  _CRS allows that because it is generic
> +and understood by the old OS; a static table does not.

Right, so if my understanding is correct, you are saying that resources
described in the MCFG table should also be declared in PNP0C02 devices
so that the PNP driver can reserve them.

On the other hand, the PCI root bridge driver should not reserve such
resources.

If that is correct, I think we have a problem here:
http://lxr.free-electrons.com/source/drivers/pci/ecam.c#L74

As you can see, pci_ecam_create() will conflict with the PNP driver,
because it tries to reserve the resources from the MCFG table.

Maybe we need to rework pci_ecam_create()?
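
The suspected conflict can be pictured with a toy userspace model.
This is only a sketch under simplifying assumptions: it is not the
kernel's resource code, it has no nesting or busy flags, and the names
claim_range() and ranges_overlap() are made up for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct range {
	uint64_t start;
	uint64_t end;		/* inclusive end address */
};

#define MAX_CLAIMS 8

static struct range claims[MAX_CLAIMS];
static size_t nr_claims;

/* Two inclusive ranges overlap if each starts before the other ends. */
static bool ranges_overlap(const struct range *a, const struct range *b)
{
	return a->start <= b->end && b->start <= a->end;
}

/* Returns true if claimed, false on a conflict or when the table is full. */
static bool claim_range(uint64_t start, uint64_t end)
{
	struct range r = { start, end };
	size_t i;

	for (i = 0; i < nr_claims; i++)
		if (ranges_overlap(&claims[i], &r))
			return false;
	if (nr_claims == MAX_CLAIMS)
		return false;
	claims[nr_claims++] = r;
	return true;
}
```

In this model, if one party claims the MCFG range first, a second
claim of the same range by another party fails, which is the kind of
clash I am worried about.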

Thanks

Gab



^ permalink raw reply	[flat|nested] 66+ messages in thread

* RE: [PATCH] PCI: Add information about describing PCI in ACPI
@ 2016-11-18 17:17   ` Gabriele Paoloni
  0 siblings, 0 replies; 66+ messages in thread
From: Gabriele Paoloni @ 2016-11-18 17:17 UTC (permalink / raw)
  To: Bjorn Helgaas, linux-pci
  Cc: linux-acpi, linux-kernel, linux-arm-kernel, linaro-acpi

Hi Bjorn

Many thanks for putting this together, it really helps!

One thing below..

> -----Original Message-----
> From: linux-kernel-owner@vger.kernel.org [mailto:linux-kernel-
> owner@vger.kernel.org] On Behalf Of Bjorn Helgaas
> Sent: 17 November 2016 18:00
> To: linux-pci@vger.kernel.org
> Cc: linux-acpi@vger.kernel.org; linux-kernel@vger.kernel.org; linux-
> arm-kernel@lists.infradead.org; linaro-acpi@lists.linaro.org
> Subject: [PATCH] PCI: Add information about describing PCI in ACPI
> 
> Add a writeup about how PCI host bridges should be described in ACPI
> using PNP0A03/PNP0A08 devices, PNP0C02 devices, and the MCFG table.
> 
> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
> ---
>  Documentation/PCI/00-INDEX      |    2 +
>  Documentation/PCI/acpi-info.txt |  136
> +++++++++++++++++++++++++++++++++++++++
>  2 files changed, 138 insertions(+)
>  create mode 100644 Documentation/PCI/acpi-info.txt
> 
> diff --git a/Documentation/PCI/00-INDEX b/Documentation/PCI/00-INDEX
> index 147231f..0780280 100644
> --- a/Documentation/PCI/00-INDEX
> +++ b/Documentation/PCI/00-INDEX
> @@ -1,5 +1,7 @@
>  00-INDEX
>  	- this file
> +acpi-info.txt
> +	- info on how PCI host bridges are represented in ACPI
>  MSI-HOWTO.txt
>  	- the Message Signaled Interrupts (MSI) Driver Guide HOWTO and
> FAQ.
>  PCIEBUS-HOWTO.txt
> diff --git a/Documentation/PCI/acpi-info.txt b/Documentation/PCI/acpi-
> info.txt
> new file mode 100644
> index 0000000..ccbcfda
> --- /dev/null
> +++ b/Documentation/PCI/acpi-info.txt
> @@ -0,0 +1,136 @@
> +	    ACPI considerations for PCI host bridges
> +
> +The basic requirement is that the ACPI namespace should describe
> +*everything* that consumes address space unless there's another
> +standard way for the OS to find it [1, 2].  For example, windows that
> +are forwarded to PCI by a PCI host bridge should be described via ACPI
> +devices, since the OS can't locate the host bridge by itself.  PCI
> +devices *below* the host bridge do not need to be described via ACPI,
> +because the resources they consume are inside the host bridge windows,
> +and the OS can discover them via the standard PCI enumeration
> +mechanism (using config accesses to read and size the BARs).
> +
> +This ACPI resource description is done via _CRS methods of devices in
> +the ACPI namespace [2].   _CRS methods are like generalized PCI BARs:
> +the OS can read _CRS and figure out what resource is being consumed
> +even if it doesn't have a driver for the device [3].  That's important
> +because it means an old OS can work correctly even on a system with
> +new devices unknown to the OS.  The new devices won't do anything, but
> +the OS can at least make sure no resources conflict with them.
> +
> +Static tables like MCFG, HPET, ECDT, etc., are *not* mechanisms for
> +reserving address space!  The static tables are for things the OS
> +needs to know early in boot, before it can parse the ACPI namespace.
> +If a new table is defined, an old OS needs to operate correctly even
> +though it ignores the table.  _CRS allows that because it is generic
> +and understood by the old OS; a static table does not.

Right so if my understanding is correct you are saying that resources
described in the MCFG table should also be declared in PNP0C02 devices
so that the PNP driver can reserve these resources.

On the other side the PCI Root bridge driver should not reserve such
resources.

Well if my understanding is correct I think we have a problem here:
http://lxr.free-electrons.com/source/drivers/pci/ecam.c#L74

As you can see pci_ecam_create() will conflict with the pnp driver
as it will try to reserve the resources from the MCFG table...

Maybe we need to rework pci_ecam_create() ?

Thanks

Gab

> +
> +If the OS is expected to manage an ACPI device, that device will have
> +a specific _HID/_CID that tells the OS what driver to bind to it, and
> +the _CRS tells the OS and the driver where the device's registers are.
> +
> +PNP0C02 "motherboard" devices are basically a catch-all.  There's no
> +programming model for them other than "don't use these resources for
> +anything else."  So any address space that is (1) not claimed by some
> +other ACPI device and (2) should not be assigned by the OS to
> +something else, should be claimed by a PNP0C02 _CRS method.
> +
> +PCI host bridges are PNP0A03 or PNP0A08 devices.  Their _CRS should
> +describe all the address space they consume.  In principle, this would
> +be all the windows they forward down to the PCI bus, as well as the
> +bridge registers themselves.  The bridge registers include things like
> +secondary/subordinate bus registers that determine the bus range below
> +the bridge, window registers that describe the apertures, etc.  These
> +are all device-specific, non-architected things, so the only way a
> +PNP0A03/PNP0A08 driver can manage them is via _PRS/_CRS/_SRS, which
> +contain the device-specific details.  These bridge registers also
> +include ECAM space, since it is consumed by the bridge.
> +
> +ACPI defined a Producer/Consumer bit that was intended to distinguish
> +the bridge apertures from the bridge registers [4, 5].  However,
> +BIOSes didn't use that bit correctly, and the result is that OSes have
> +to assume that everything in a PCI host bridge _CRS is a window.  That
> +leaves no way to describe the bridge registers in the PNP0A03/PNP0A08
> +device itself.
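
For illustration only, the difference between what the bit was meant to
convey and what an OS ends up assuming can be sketched in C as below.
This is a toy under stated assumptions, not code from pci_root.c or
ACPICA: every toy_* name is made up here, and only the bit meaning (bit
0 set means "consumes", clear means "produces and consumes") comes from
the ACPI text quoted in [4].

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 0 of General Flags, per the ACPI text quoted in [4] (name is hypothetical). */
#define TOY_FLAG_CONSUMER 0x01u

enum toy_usage {
	TOY_WINDOW,	/* range forwarded to the PCI bus (aperture) */
	TOY_REGISTERS	/* range consumed by the bridge itself */
};

/* What the spec intended: trust the Consumer/Producer bit. */
static enum toy_usage toy_intended_usage(uint8_t general_flags)
{
	return (general_flags & TOY_FLAG_CONSUMER) ? TOY_REGISTERS : TOY_WINDOW;
}

/*
 * What an OS has to do for a PNP0A03/PNP0A08 _CRS, because firmware
 * did not set the bit reliably: ignore it and treat every range as a
 * window.
 */
static enum toy_usage toy_host_bridge_usage(uint8_t general_flags)
{
	(void)general_flags;	/* deliberately ignored */
	return TOY_WINDOW;
}
```

With the second function there is no flag value that yields
TOY_REGISTERS, which is the "no way to describe the bridge registers"
problem in the paragraph above.
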
> +
> +The workaround is to describe the bridge registers (including ECAM
> +space) in PNP0C02 catch-all devices [6].  With the exception of ECAM,
> +the bridge register space is device-specific anyway, so the generic
> +PNP0A03/PNP0A08 driver (pci_root.c) has no need to know about it.  For
> +ECAM, pci_root.c learns about the space from either MCFG or the _CBA
> +method.
> +
> +Note that the PCIe spec actually does require ECAM unless there's a
> +standard firmware interface for config access, e.g., the ia64 SAL
> +interface [7].  One reason is that we want a generic host bridge
> +driver (pci_root.c), and a generic driver requires a generic way to
> +access config space.
> +
> +
> +[1] ACPI 6.0, sec 6.1:
> +    For any device that is on a non-enumerable type of bus (for
> +    example, an ISA bus), OSPM enumerates the devices' identifier(s)
> +    and the ACPI system firmware must supply an _HID object ... for
> +    each device to enable OSPM to do that.
> +
> +[2] ACPI 6.0, sec 3.7:
> +    The OS enumerates motherboard devices simply by reading through
> +    the ACPI Namespace looking for devices with hardware IDs.
> +
> +    Each device enumerated by ACPI includes ACPI-defined objects in
> +    the ACPI Namespace that report the hardware resources the device
> +    could occupy [_PRS], an object that reports the resources that are
> +    currently used by the device [_CRS], and objects for configuring
> +    those resources [_SRS].  The information is used by the Plug and
> +    Play OS (OSPM) to configure the devices.
> +
> +[3] ACPI 6.0, sec 6.2:
> +    OSPM uses device configuration objects to configure hardware
> +    resources for devices enumerated via ACPI.  Device configuration
> +    objects provide information about current and possible resource
> +    requirements, the relationship between shared resources, and
> +    methods for configuring hardware resources.
> +
> +    When OSPM enumerates a device, it calls _PRS to determine the
> +    resource requirements of the device.  It may also call _CRS to
> +    find the current resource settings for the device.  Using this
> +    information, the Plug and Play system determines what resources
> +    the device should consume and sets those resources by calling the
> +    device’s _SRS control method.
> +
> +    In ACPI, devices can consume resources (for example, legacy
> +    keyboards), provide resources (for example, a proprietary PCI
> +    bridge), or do both.  Unless otherwise specified, resources for a
> +    device are assumed to be taken from the nearest matching resource
> +    above the device in the device hierarchy.
> +
> +[4] ACPI 6.0, sec 6.4.3.5.4:
> +    Extended Address Space Descriptor
> +    General Flags: Bit [0] Consumer/Producer:
> +	1–This device consumes this resource
> +	0–This device produces and consumes this resource
> +
> +[5] ACPI 6.0, sec 19.6.43:
> +    ResourceUsage specifies whether the Memory range is consumed by
> +    this device (ResourceConsumer) or passed on to child devices
> +    (ResourceProducer).  If nothing is specified, then
> +    ResourceConsumer is assumed.
> +
> +[6] PCI Firmware 3.0, sec 4.1.2:
> +    If the operating system does not natively comprehend reserving the
> +    MMCFG region, the MMCFG region must be reserved by firmware.  The
> +    address range reported in the MCFG table or by _CBA method (see
> +    Section 4.1.3) must be reserved by declaring a motherboard
> +    resource.  For most systems, the motherboard resource would appear
> +    at the root of the ACPI namespace (under \_SB) in a node with a
> +    _HID of EISAID (PNP0C02), and the resources in this case should
> +    not be claimed in the root PCI bus’s _CRS.  The resources can
> +    optionally be returned in Int15 E820 or EFIGetMemoryMap as
> +    reserved memory but must always be reported through ACPI as a
> +    motherboard resource.
> +
> +[7] PCI Express 3.0, sec 7.2.2:
> +    For systems that are PC-compatible, or that do not implement a
> +    processor-architecture-specific firmware interface standard that
> +    allows access to the Configuration Space, the ECAM is required as
> +    defined in this section.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* [PATCH] PCI: Add information about describing PCI in ACPI
@ 2016-11-18 17:17   ` Gabriele Paoloni
  0 siblings, 0 replies; 66+ messages in thread
From: Gabriele Paoloni @ 2016-11-18 17:17 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Bjorn

Many thanks for putting this together, it really helps!

One thing below..

> -----Original Message-----
> From: linux-kernel-owner at vger.kernel.org [mailto:linux-kernel-
> owner at vger.kernel.org] On Behalf Of Bjorn Helgaas
> Sent: 17 November 2016 18:00
> To: linux-pci at vger.kernel.org
> Cc: linux-acpi at vger.kernel.org; linux-kernel at vger.kernel.org; linux-
> arm-kernel at lists.infradead.org; linaro-acpi at lists.linaro.org
> Subject: [PATCH] PCI: Add information about describing PCI in ACPI
> 
> Add a writeup about how PCI host bridges should be described in ACPI
> using PNP0A03/PNP0A08 devices, PNP0C02 devices, and the MCFG table.
> 
> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
> ---
>  Documentation/PCI/00-INDEX      |    2 +
>  Documentation/PCI/acpi-info.txt |  136
> +++++++++++++++++++++++++++++++++++++++
>  2 files changed, 138 insertions(+)
>  create mode 100644 Documentation/PCI/acpi-info.txt
> 
> diff --git a/Documentation/PCI/00-INDEX b/Documentation/PCI/00-INDEX
> index 147231f..0780280 100644
> --- a/Documentation/PCI/00-INDEX
> +++ b/Documentation/PCI/00-INDEX
> @@ -1,5 +1,7 @@
>  00-INDEX
>  	- this file
> +acpi-info.txt
> +	- info on how PCI host bridges are represented in ACPI
>  MSI-HOWTO.txt
>  	- the Message Signaled Interrupts (MSI) Driver Guide HOWTO and
> FAQ.
>  PCIEBUS-HOWTO.txt
> diff --git a/Documentation/PCI/acpi-info.txt b/Documentation/PCI/acpi-
> info.txt
> new file mode 100644
> index 0000000..ccbcfda
> --- /dev/null
> +++ b/Documentation/PCI/acpi-info.txt
> @@ -0,0 +1,136 @@
> +	    ACPI considerations for PCI host bridges
> +
> +The basic requirement is that the ACPI namespace should describe
> +*everything* that consumes address space unless there's another
> +standard way for the OS to find it [1, 2].  For example, windows that
> +are forwarded to PCI by a PCI host bridge should be described via ACPI
> +devices, since the OS can't locate the host bridge by itself.  PCI
> +devices *below* the host bridge do not need to be described via ACPI,
> +because the resources they consume are inside the host bridge windows,
> +and the OS can discover them via the standard PCI enumeration
> +mechanism (using config accesses to read and size the BARs).
> +
> +This ACPI resource description is done via _CRS methods of devices in
> +the ACPI namespace [2].  _CRS methods are like generalized PCI BARs:
> +the OS can read _CRS and figure out what resource is being consumed
> +even if it doesn't have a driver for the device [3].  That's important
> +because it means an old OS can work correctly even on a system with
> +new devices unknown to the OS.  The new devices won't do anything, but
> +the OS can at least make sure no resources conflict with them.
> +
> +Static tables like MCFG, HPET, ECDT, etc., are *not* mechanisms for
> +reserving address space!  The static tables are for things the OS
> +needs to know early in boot, before it can parse the ACPI namespace.
> +If a new table is defined, an old OS needs to operate correctly even
> +though it ignores the table.  _CRS allows that because it is generic
> +and understood by the old OS; a static table does not.

Right so if my understanding is correct you are saying that resources
described in the MCFG table should also be declared in PNP0C02 devices
so that the PNP driver can reserve these resources.

On the other side the PCI Root bridge driver should not reserve such
resources.

Well if my understanding is correct I think we have a problem here:
http://lxr.free-electrons.com/source/drivers/pci/ecam.c#L74

As you can see pci_ecam_create() will conflict with the pnp driver
as it will try to reserve the resources from the MCFG table...

Maybe we need to rework pci_ecam_create() ?

Thanks

Gab

> +
> +If the OS is expected to manage an ACPI device, that device will have
> +a specific _HID/_CID that tells the OS what driver to bind to it, and
> +the _CRS tells the OS and the driver where the device's registers are.
> +
> +PNP0C02 "motherboard" devices are basically a catch-all.  There's no
> +programming model for them other than "don't use these resources for
> +anything else."  So any address space that is (1) not claimed by some
> +other ACPI device and (2) should not be assigned by the OS to
> +something else, should be claimed by a PNP0C02 _CRS method.
> +
> +PCI host bridges are PNP0A03 or PNP0A08 devices.  Their _CRS should
> +describe all the address space they consume.  In principle, this would
> +be all the windows they forward down to the PCI bus, as well as the
> +bridge registers themselves.  The bridge registers include things like
> +secondary/subordinate bus registers that determine the bus range below
> +the bridge, window registers that describe the apertures, etc.  These
> +are all device-specific, non-architected things, so the only way a
> +PNP0A03/PNP0A08 driver can manage them is via _PRS/_CRS/_SRS, which
> +contain the device-specific details.  These bridge registers also
> +include ECAM space, since it is consumed by the bridge.
> +
> +ACPI defined a Producer/Consumer bit that was intended to distinguish
> +the bridge apertures from the bridge registers [4, 5].  However,
> +BIOSes didn't use that bit correctly, and the result is that OSes have
> +to assume that everything in a PCI host bridge _CRS is a window.  That
> +leaves no way to describe the bridge registers in the PNP0A03/PNP0A08
> +device itself.
> +
> +The workaround is to describe the bridge registers (including ECAM
> +space) in PNP0C02 catch-all devices [6].  With the exception of ECAM,
> +the bridge register space is device-specific anyway, so the generic
> +PNP0A03/PNP0A08 driver (pci_root.c) has no need to know about it.  For
> +ECAM, pci_root.c learns about the space from either MCFG or the _CBA
> +method.
> +
> +Note that the PCIe spec actually does require ECAM unless there's a
> +standard firmware interface for config access, e.g., the ia64 SAL
> +interface [7].  One reason is that we want a generic host bridge
> +driver (pci_root.c), and a generic driver requires a generic way to
> +access config space.
> +
> +
> +[1] ACPI 6.0, sec 6.1:
> +    For any device that is on a non-enumerable type of bus (for
> +    example, an ISA bus), OSPM enumerates the devices' identifier(s)
> +    and the ACPI system firmware must supply an _HID object ... for
> +    each device to enable OSPM to do that.
> +
> +[2] ACPI 6.0, sec 3.7:
> +    The OS enumerates motherboard devices simply by reading through
> +    the ACPI Namespace looking for devices with hardware IDs.
> +
> +    Each device enumerated by ACPI includes ACPI-defined objects in
> +    the ACPI Namespace that report the hardware resources the device
> +    could occupy [_PRS], an object that reports the resources that are
> +    currently used by the device [_CRS], and objects for configuring
> +    those resources [_SRS].  The information is used by the Plug and
> +    Play OS (OSPM) to configure the devices.
> +
> +[3] ACPI 6.0, sec 6.2:
> +    OSPM uses device configuration objects to configure hardware
> +    resources for devices enumerated via ACPI.  Device configuration
> +    objects provide information about current and possible resource
> +    requirements, the relationship between shared resources, and
> +    methods for configuring hardware resources.
> +
> +    When OSPM enumerates a device, it calls _PRS to determine the
> +    resource requirements of the device.  It may also call _CRS to
> +    find the current resource settings for the device.  Using this
> +    information, the Plug and Play system determines what resources
> +    the device should consume and sets those resources by calling the
> +    device’s _SRS control method.
> +
> +    In ACPI, devices can consume resources (for example, legacy
> +    keyboards), provide resources (for example, a proprietary PCI
> +    bridge), or do both.  Unless otherwise specified, resources for a
> +    device are assumed to be taken from the nearest matching resource
> +    above the device in the device hierarchy.
> +
> +[4] ACPI 6.0, sec 6.4.3.5.4:
> +    Extended Address Space Descriptor
> +    General Flags: Bit [0] Consumer/Producer:
> +	1–This device consumes this resource
> +	0–This device produces and consumes this resource
> +
> +[5] ACPI 6.0, sec 19.6.43:
> +    ResourceUsage specifies whether the Memory range is consumed by
> +    this device (ResourceConsumer) or passed on to child devices
> +    (ResourceProducer).  If nothing is specified, then
> +    ResourceConsumer is assumed.
> +
> +[6] PCI Firmware 3.0, sec 4.1.2:
> +    If the operating system does not natively comprehend reserving the
> +    MMCFG region, the MMCFG region must be reserved by firmware.  The
> +    address range reported in the MCFG table or by _CBA method (see
> +    Section 4.1.3) must be reserved by declaring a motherboard
> +    resource.  For most systems, the motherboard resource would appear
> +    at the root of the ACPI namespace (under \_SB) in a node with a
> +    _HID of EISAID (PNP0C02), and the resources in this case should
> +    not be claimed in the root PCI bus’s _CRS.  The resources can
> +    optionally be returned in Int15 E820 or EFIGetMemoryMap as
> +    reserved memory but must always be reported through ACPI as a
> +    motherboard resource.
> +
> +[7] PCI Express 3.0, sec 7.2.2:
> +    For systems that are PC-compatible, or that do not implement a
> +    processor-architecture-specific firmware interface standard that
> +    allows access to the Configuration Space, the ECAM is required as
> +    defined in this section.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH] PCI: Add information about describing PCI in ACPI
  2016-11-18 17:17   ` Gabriele Paoloni
  (?)
@ 2016-11-18 17:54     ` Bjorn Helgaas
  -1 siblings, 0 replies; 66+ messages in thread
From: Bjorn Helgaas @ 2016-11-18 17:54 UTC (permalink / raw)
  To: Gabriele Paoloni
  Cc: Bjorn Helgaas, linux-pci, linux-acpi, linux-kernel,
	linux-arm-kernel, linaro-acpi

On Fri, Nov 18, 2016 at 05:17:34PM +0000, Gabriele Paoloni wrote:
> > -----Original Message-----
> > From: linux-kernel-owner@vger.kernel.org [mailto:linux-kernel-
> > owner@vger.kernel.org] On Behalf Of Bjorn Helgaas
> > Sent: 17 November 2016 18:00

> > +Static tables like MCFG, HPET, ECDT, etc., are *not* mechanisms for
> > +reserving address space!  The static tables are for things the OS
> > +needs to know early in boot, before it can parse the ACPI namespace.
> > +If a new table is defined, an old OS needs to operate correctly even
> > +though it ignores the table.  _CRS allows that because it is generic
> > +and understood by the old OS; a static table does not.
> 
> Right so if my understanding is correct you are saying that resources
> described in the MCFG table should also be declared in PNP0C02 devices
> so that the PNP driver can reserve these resources.

Yes.

> On the other side the PCI Root bridge driver should not reserve such
> resources.
> 
> Well if my understanding is correct I think we have a problem here:
> http://lxr.free-electrons.com/source/drivers/pci/ecam.c#L74
> 
> As you can see pci_ecam_create() will conflict with the pnp driver
> as it will try to reserve the resources from the MCFG table...
> 
> Maybe we need to rework pci_ecam_create() ?

I think it's OK as it is.

The pnp/system.c driver does try to reserve PNP0C02 resources, and it
marks them as "not busy".  That way they appear in /proc/iomem and
won't be allocated for anything else, but they can still be requested
by drivers, e.g., pci/ecam.c, which will mark them "busy".

This is analogous to what the PCI core does in pci_claim_resource().
This is really a function of the ACPI/PNP *core*, which should reserve
all _CRS resources for all devices (not just PNP0C02 devices).  But
it's done by pnp/system.c, and only for PNP0C02, because there's a
bunch of historical baggage there.

You'll also notice that in this case, things are out of order:
logically the pnp/system.c reservation should happen first, but in
fact the pci/ecam.c request happens *before* the pnp/system.c one.
That means the pnp/system.c one might fail and complain "[mem ...]
could not be reserved".
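
To make the busy/not-busy interplay concrete, here is a toy userspace
model in C.  It is a sketch under stated assumptions, not the kernel's
resource code: the toy_* names, the flat array, and the address range
in the note below are all made up for illustration.  The only behavior
it tries to capture is the one described above: a "not busy"
reservation still allows a later request for the same range, while a
"busy" one does not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy resource: an inclusive address range plus a busy flag. */
struct toy_res {
	const char *name;
	unsigned long long start, end;
	bool busy;
	struct toy_res *parent;	/* non-busy range this one nests under */
};

#define TOY_MAX_RES 8
static struct toy_res *toy_claimed[TOY_MAX_RES];
static size_t toy_nclaimed;

/* Forget all claims so another ordering can be tried. */
static void toy_reset(void)
{
	toy_nclaimed = 0;
}

static bool toy_overlaps(const struct toy_res *a, const struct toy_res *b)
{
	return a->start <= b->end && b->start <= a->end;
}

static bool toy_contains(const struct toy_res *outer, const struct toy_res *inner)
{
	return outer->start <= inner->start && inner->end <= outer->end;
}

/*
 * Try to claim 'res'.  Any overlap with a busy range fails.  A
 * non-busy range is acceptable only if it fully contains 'res', in
 * which case it becomes the parent.  'busy' is the state 'res' is
 * left in: false models a pnp/system.c style reservation, true models
 * a driver request.
 */
static bool toy_request(struct toy_res *res, bool busy)
{
	struct toy_res *parent = NULL;
	size_t i;

	assert(res->start <= res->end);

	for (i = 0; i < toy_nclaimed; i++) {
		struct toy_res *c = toy_claimed[i];

		if (!toy_overlaps(c, res))
			continue;
		if (c->busy || !toy_contains(c, res))
			return false;	/* genuine conflict */
		parent = c;		/* nest under the reservation */
	}

	if (toy_nclaimed == TOY_MAX_RES)
		return false;

	res->busy = busy;
	res->parent = parent;
	toy_claimed[toy_nclaimed++] = res;
	return true;
}
```

With two toy resources covering the same made-up range (say
0xe0000000-0xefffffff), claiming the PNP0C02 one first with busy=false
and the ECAM one second with busy=true succeeds for both, and the ECAM
entry nests under the PNP0C02 one.  After toy_reset(), doing it in the
opposite order lets the ECAM request succeed but makes the PNP0C02
reservation fail, which corresponds to the "could not be reserved"
complaint.
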

Bjorn

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH] PCI: Add information about describing PCI in ACPI
  2016-11-17 17:59 ` Bjorn Helgaas
  (?)
  (?)
@ 2016-11-18 23:02   ` Rafael J. Wysocki
  -1 siblings, 0 replies; 66+ messages in thread
From: Rafael J. Wysocki @ 2016-11-18 23:02 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Linux PCI, ACPI Devel Maling List, Linux Kernel Mailing List,
	linux-arm-kernel, linaro-acpi

On Thu, Nov 17, 2016 at 6:59 PM, Bjorn Helgaas <bhelgaas@google.com> wrote:
> Add a writeup about how PCI host bridges should be described in ACPI
> using PNP0A03/PNP0A08 devices, PNP0C02 devices, and the MCFG table.
>
> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

Looks great overall, but I have a few comments (below).

> ---
>  Documentation/PCI/00-INDEX      |    2 +
>  Documentation/PCI/acpi-info.txt |  136 +++++++++++++++++++++++++++++++++++++++
>  2 files changed, 138 insertions(+)
>  create mode 100644 Documentation/PCI/acpi-info.txt
>
> diff --git a/Documentation/PCI/00-INDEX b/Documentation/PCI/00-INDEX
> index 147231f..0780280 100644
> --- a/Documentation/PCI/00-INDEX
> +++ b/Documentation/PCI/00-INDEX
> @@ -1,5 +1,7 @@
>  00-INDEX
>         - this file
> +acpi-info.txt
> +       - info on how PCI host bridges are represented in ACPI
>  MSI-HOWTO.txt
>         - the Message Signaled Interrupts (MSI) Driver Guide HOWTO and FAQ.
>  PCIEBUS-HOWTO.txt
> diff --git a/Documentation/PCI/acpi-info.txt b/Documentation/PCI/acpi-info.txt
> new file mode 100644
> index 0000000..ccbcfda
> --- /dev/null
> +++ b/Documentation/PCI/acpi-info.txt
> @@ -0,0 +1,136 @@
> +           ACPI considerations for PCI host bridges
> +
> +The basic requirement is that the ACPI namespace should describe
> +*everything* that consumes address space unless there's another
> +standard way for the OS to find it [1, 2].  For example, windows that
> +are forwarded to PCI by a PCI host bridge should be described via ACPI
> +devices, since the OS can't locate the host bridge by itself.  PCI
> +devices *below* the host bridge do not need to be described via ACPI,
> +because the resources they consume are inside the host bridge windows,
> +and the OS can discover them via the standard PCI enumeration
> +mechanism (using config accesses to read and size the BARs).
> +
> +This ACPI resource description is done via _CRS methods of devices in

To be painfully precise, those need not be methods.  They may be
static objects too.

> +the ACPI namespace [2].   _CRS methods are like generalized PCI BARs:
> +the OS can read _CRS and figure out what resource is being consumed
> +even if it doesn't have a driver for the device [3].  That's important
> +because it means an old OS can work correctly even on a system with
> +new devices unknown to the OS.  The new devices won't do anything, but
> +the OS can at least make sure no resources conflict with them.
> +
> +Static tables like MCFG, HPET, ECDT, etc., are *not* mechanisms for
> +reserving address space!  The static tables are for things the OS
> +needs to know early in boot, before it can parse the ACPI namespace.
> +If a new table is defined, an old OS needs to operate correctly even
> +though it ignores the table.  _CRS allows that because it is generic
> +and understood by the old OS; a static table does not.
> +
> +If the OS is expected to manage an ACPI device, that device will have

I'm not very keen on using the term "ACPI device" in documentation as
it is not particularly well defined.  As a rule, I prefer to talk
about "non-discoverable devices described via ACPI" or similar.

Accordingly, I'd change the above line to something like "If the OS is
expected to manage a non-discoverable device described via ACPI, that
device will have".

> +a specific _HID/_CID that tells the OS what driver to bind to it, and
> +the _CRS tells the OS and the driver where the device's registers are.
> +
> +PNP0C02 "motherboard" devices are basically a catch-all.  There's no
> +programming model for them other than "don't use these resources for
> +anything else."  So any address space that is (1) not claimed by some
> +other ACPI device and (2) should not be assigned by the OS to

Here I'd say "any address space that is (1) not claimed by _CRS under
any other device object in the ACPI namespace and (2) ...".

> +something else, should be claimed by a PNP0C02 _CRS method.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 66+ messages in thread

* RE: [PATCH] PCI: Add information about describing PCI in ACPI
  2016-11-18 17:54     ` Bjorn Helgaas
  (?)
  (?)
@ 2016-11-21  8:52       ` Gabriele Paoloni
  -1 siblings, 0 replies; 66+ messages in thread
From: Gabriele Paoloni @ 2016-11-21  8:52 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Bjorn Helgaas, linux-pci, linux-acpi, linux-kernel,
	linux-arm-kernel, linaro-acpi

Hi Bjorn

> -----Original Message-----
> From: Bjorn Helgaas [mailto:helgaas@kernel.org]
> Sent: 18 November 2016 17:54
> To: Gabriele Paoloni
> Cc: Bjorn Helgaas; linux-pci@vger.kernel.org; linux-
> acpi@vger.kernel.org; linux-kernel@vger.kernel.org; linux-arm-
> kernel@lists.infradead.org; linaro-acpi@lists.linaro.org
> Subject: Re: [PATCH] PCI: Add information about describing PCI in ACPI
> 
> On Fri, Nov 18, 2016 at 05:17:34PM +0000, Gabriele Paoloni wrote:
> > > -----Original Message-----
> > > From: linux-kernel-owner@vger.kernel.org [mailto:linux-kernel-
> > > owner@vger.kernel.org] On Behalf Of Bjorn Helgaas
> > > Sent: 17 November 2016 18:00
> 
> > > +Static tables like MCFG, HPET, ECDT, etc., are *not* mechanisms for
> > > +reserving address space!  The static tables are for things the OS
> > > +needs to know early in boot, before it can parse the ACPI namespace.
> > > +If a new table is defined, an old OS needs to operate correctly even
> > > +though it ignores the table.  _CRS allows that because it is generic
> > > +and understood by the old OS; a static table does not.
> >
> > Right so if my understanding is correct you are saying that resources
> > described in the MCFG table should also be declared in PNP0C02 devices
> > so that the PNP driver can reserve these resources.
> 
> Yes.
> 
> > On the other side the PCI Root bridge driver should not reserve such
> > resources.
> >
> > Well if my understanding is correct I think we have a problem here:
> > http://lxr.free-electrons.com/source/drivers/pci/ecam.c#L74
> >
> > As you can see pci_ecam_create() will conflict with the pnp driver
> > as it will try to reserve the resources from the MCFG table...
> >
> > Maybe we need to rework pci_ecam_create() ?
> 
> I think it's OK as it is.
> 
> The pnp/system.c driver does try to reserve PNP0C02 resources, and it
> marks them as "not busy".  That way they appear in /proc/iomem and
> won't be allocated for anything else, but they can still be requested
> by drivers, e.g., pci/ecam.c, which will mark them "busy".
> 
> This is analogous to what the PCI core does in pci_claim_resource().
> This is really a function of the ACPI/PNP *core*, which should reserve
> all _CRS resources for all devices (not just PNP0C02 devices).  But
> it's done by pnp/system.c, and only for PNP0C02, because there's a
> bunch of historical baggage there.
> 
> You'll also notice that in this case, things are out of order:
> logically the pnp/system.c reservation should happen first, but in
> fact the pci/ecam.c request happens *before* the pnp/system.c one.
> That means the pnp/system.c one might fail and complain "[mem ...]
> could not be reserved".

Correct me if I am wrong...

So currently we are relying on the fact that pci_ecam_create() is called
before the pnp driver.
If the pnp driver came first we would end up in pci_ecam_create() failing
here:
http://lxr.free-electrons.com/source/drivers/pci/ecam.c#L76

I am not sure, but that ordering seems like a rather weak condition to
rely on.  What about removing the error condition in pci_ecam_create()
and just logging a dev_info() instead?

Thanks

Gab


> 
> Bjorn

^ permalink raw reply	[flat|nested] 66+ messages in thread

* RE: [PATCH] PCI: Add information about describing PCI in ACPI
@ 2016-11-21  8:52       ` Gabriele Paoloni
  0 siblings, 0 replies; 66+ messages in thread
From: Gabriele Paoloni @ 2016-11-21  8:52 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Bjorn Helgaas, linux-pci, linux-acpi, linux-kernel,
	linux-arm-kernel, linaro-acpi

Hi Bjorn

> -----Original Message-----
> From: Bjorn Helgaas [mailto:helgaas@kernel.org]
> Sent: 18 November 2016 17:54
> To: Gabriele Paoloni
> Cc: Bjorn Helgaas; linux-pci@vger.kernel.org; linux-
> acpi@vger.kernel.org; linux-kernel@vger.kernel.org; linux-arm-
> kernel@lists.infradead.org; linaro-acpi@lists.linaro.org
> Subject: Re: [PATCH] PCI: Add information about describing PCI in ACPI
>=20
> On Fri, Nov 18, 2016 at 05:17:34PM +0000, Gabriele Paoloni wrote:
> > > -----Original Message-----
> > > From: linux-kernel-owner@vger.kernel.org [mailto:linux-kernel-
> > > owner@vger.kernel.org] On Behalf Of Bjorn Helgaas
> > > Sent: 17 November 2016 18:00
>=20
> > > +Static tables like MCFG, HPET, ECDT, etc., are *not* mechanisms
> for
> > > +reserving address space!  The static tables are for things the OS
> > > +needs to know early in boot, before it can parse the ACPI
> namespace.
> > > +If a new table is defined, an old OS needs to operate correctly
> even
> > > +though it ignores the table.  _CRS allows that because it is
> generic
> > > +and understood by the old OS; a static table does not.
> >
> > Right so if my understanding is correct you are saying that resources
> > described in the MCFG table should also be declared in PNP0C02
> devices
> > so that the PNP driver can reserve these resources.
>=20
> Yes.
>=20
> > On the other side the PCI Root bridge driver should not reserve such
> > resources.
> >
> > Well if my understanding is correct I think we have a problem here:
> > http://lxr.free-electrons.com/source/drivers/pci/ecam.c#L74
> >
> > As you can see pci_ecam_create() will conflict with the pnp driver
> > as it will try to reserve the resources from the MCFG table...
> >
> > Maybe we need to rework pci_ecam_create() ?
>=20
> I think it's OK as it is.
>=20
> The pnp/system.c driver does try to reserve PNP0C02 resources, and it
> marks them as "not busy".  That way they appear in /proc/iomem and
> won't be allocated for anything else, but they can still be requested
> by drivers, e.g., pci/ecam.c, which will mark them "busy".
>=20
> This is analogous to what the PCI core does in pci_claim_resource().
> This is really a function of the ACPI/PNP *core*, which should reserve
> all _CRS resources for all devices (not just PNP0C02 devices).  But
> it's done by pnp/system.c, and only for PNP0C02, because there's a
> bunch of historical baggage there.
>=20
> You'll also notice that in this case, things are out of order:
> logically the pnp/system.c reservation should happen first, but in
> fact the pci/ecam.c request happens *before* the pnp/system.c one.
> That means the pnp/system.c one might fail and complain "[mem ...]
> could not be reserved".

Correct me if I am wrong...

So currently we are relying on the fact that pci_ecam_create() is called
before the pnp driver.
If the pnp driver came first we would end up in pci_ecam_create() failing
here:
http://lxr.free-electrons.com/source/drivers/pci/ecam.c#L76

I am not sure but it seems to me like a bit weak condition to rely on...
what about removing the error condition in pci_ecam_create() and logging
just a dev_info()?

Thanks

Gab


>
> Bjorn

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH] PCI: Add information about describing PCI in ACPI
  2016-11-18 23:02   ` Rafael J. Wysocki
  (?)
@ 2016-11-21 13:58     ` Bjorn Helgaas
  -1 siblings, 0 replies; 66+ messages in thread
From: Bjorn Helgaas @ 2016-11-21 13:58 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Bjorn Helgaas, Linux PCI, linaro-acpi, Linux Kernel Mailing List,
	linux-arm-kernel, ACPI Devel Maling List

On Sat, Nov 19, 2016 at 12:02:24AM +0100, Rafael J. Wysocki wrote:
> On Thu, Nov 17, 2016 at 6:59 PM, Bjorn Helgaas <bhelgaas@google.com> wrote:
> > Add a writeup about how PCI host bridges should be described in ACPI
> > using PNP0A03/PNP0A08 devices, PNP0C02 devices, and the MCFG table.
> >
> > Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
> 
> Looks great overall, but I have a few comments (below).

Thanks a lot for taking a look, Rafael!  I applied all your suggestions.

Bjorn

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH] PCI: Add information about describing PCI in ACPI
  2016-11-21  8:52       ` Gabriele Paoloni
  (?)
@ 2016-11-21 16:47         ` Bjorn Helgaas
  -1 siblings, 0 replies; 66+ messages in thread
From: Bjorn Helgaas @ 2016-11-21 16:47 UTC (permalink / raw)
  To: Gabriele Paoloni
  Cc: Bjorn Helgaas, linux-pci, linux-acpi, linux-kernel,
	linux-arm-kernel, linaro-acpi

On Mon, Nov 21, 2016 at 08:52:52AM +0000, Gabriele Paoloni wrote:
> Hi Bjorn
> 
> > -----Original Message-----
> > From: Bjorn Helgaas [mailto:helgaas@kernel.org]
> > Sent: 18 November 2016 17:54
> > To: Gabriele Paoloni
> > Cc: Bjorn Helgaas; linux-pci@vger.kernel.org; linux-
> > acpi@vger.kernel.org; linux-kernel@vger.kernel.org; linux-arm-
> > kernel@lists.infradead.org; linaro-acpi@lists.linaro.org
> > Subject: Re: [PATCH] PCI: Add information about describing PCI in ACPI
> > 
> > On Fri, Nov 18, 2016 at 05:17:34PM +0000, Gabriele Paoloni wrote:
> > > > -----Original Message-----
> > > > From: linux-kernel-owner@vger.kernel.org [mailto:linux-kernel-
> > > > owner@vger.kernel.org] On Behalf Of Bjorn Helgaas
> > > > Sent: 17 November 2016 18:00
> > 
> > > > +Static tables like MCFG, HPET, ECDT, etc., are *not* mechanisms for
> > > > +reserving address space!  The static tables are for things the OS
> > > > +needs to know early in boot, before it can parse the ACPI namespace.
> > > > +If a new table is defined, an old OS needs to operate correctly even
> > > > +though it ignores the table.  _CRS allows that because it is generic
> > > > +and understood by the old OS; a static table does not.
> > >
> > > Right so if my understanding is correct you are saying that resources
> > > described in the MCFG table should also be declared in PNP0C02 devices
> > > so that the PNP driver can reserve these resources.
> > 
> > Yes.
> > 
> > > On the other side the PCI Root bridge driver should not reserve such
> > > resources.
> > >
> > > Well if my understanding is correct I think we have a problem here:
> > > http://lxr.free-electrons.com/source/drivers/pci/ecam.c#L74
> > >
> > > As you can see pci_ecam_create() will conflict with the pnp driver
> > > as it will try to reserve the resources from the MCFG table...
> > >
> > > Maybe we need to rework pci_ecam_create() ?
> > 
> > I think it's OK as it is.
> > 
> > The pnp/system.c driver does try to reserve PNP0C02 resources, and it
> > marks them as "not busy".  That way they appear in /proc/iomem and
> > won't be allocated for anything else, but they can still be requested
> > by drivers, e.g., pci/ecam.c, which will mark them "busy".
> > 
> > This is analogous to what the PCI core does in pci_claim_resource().
> > This is really a function of the ACPI/PNP *core*, which should reserve
> > all _CRS resources for all devices (not just PNP0C02 devices).  But
> > it's done by pnp/system.c, and only for PNP0C02, because there's a
> > bunch of historical baggage there.
> > 
> > You'll also notice that in this case, things are out of order:
> > logically the pnp/system.c reservation should happen first, but in
> > fact the pci/ecam.c request happens *before* the pnp/system.c one.
> > That means the pnp/system.c one might fail and complain "[mem ...]
> > could not be reserved".
> 
> Correct me if I am wrong...
> 
> So currently we are relying on the fact that pci_ecam_create() is called
> before the pnp driver.
> If the pnp driver came first we would end up in pci_ecam_create() failing
> here:
> http://lxr.free-electrons.com/source/drivers/pci/ecam.c#L76
> 
> I am not sure but it seems to me like a bit weak condition to rely on...
> what about removing the error condition in pci_ecam_create() and logging
> just a dev_info()?

Huh.  I'm confused.  I *thought* it would be safe to reverse the
order, which would effectively be this:

  system_pnp_probe
    reserve_resources_of_dev
      reserve_range
        request_mem_region([mem 0xb0000000-0xb1ffffff])
  ...
  pci_ecam_create
    request_resource_conflict([mem 0xb0000000-0xb1ffffff])


but I experimented with the patch below on qemu, and it failed as you
predicted:

  ** res test **
  requested [mem 0xa0000000-0xafffffff]
  can't claim ECAM area [mem 0xa0000000-0xafffffff]: conflict with ECAM PNP [mem 0xa0000000-0xafffffff]

I expected the request_resource_conflict() to succeed since it's
completely contained in the "ECAM PNP" region.  But I guess I don't
understand kernel/resource.c well enough.

I'm not sure we need to fix anything yet, since we currently do the
ecam.c request before the system.c one, and any change there would be
a long ways off.  If/when that *does* change, I think the correct fix
would be to change ecam.c so its request succeeds (by changing the way
it does the request, fixing kernel/resource.c, or whatever) rather
than to reduce the log level and ignore the failure.
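
For reference, the difference between the two request paths can be
modelled in userspace.  This is only a sketch under stated assumptions:
struct res, insert_flat(), insert_nested() and ecam_claim_after_pnp()
are made-up names, and the two insert helpers are simplified from how
__request_resource() and the retry loop in __request_region() appear to
work in kernel/resource.c.  It is not kernel code.

```c
#include <assert.h>
#include <stddef.h>

#define BUSY 0x1u	/* stands in for IORESOURCE_BUSY (assumption) */

/* Hypothetical, stripped-down stand-in for struct resource. */
struct res {
	unsigned long start, end;
	unsigned int flags;
	const char *name;
	struct res *parent, *sibling, *child;
};

/*
 * Flat insert, modelled on __request_resource(): link "new" as a direct
 * child of "root", or return whatever it collides with.  The BUSY flag
 * of the conflicting entry is never consulted.
 */
static struct res *insert_flat(struct res *root, struct res *new)
{
	struct res *tmp, **p;

	assert(root && new);
	if (new->end < new->start || new->start < root->start ||
	    new->end > root->end)
		return root;

	for (p = &root->child; ; p = &tmp->sibling) {
		tmp = *p;
		if (!tmp || tmp->start > new->end) {
			new->sibling = tmp;
			*p = new;
			new->parent = root;
			return NULL;
		}
		if (tmp->end >= new->start)
			return tmp;	/* overlap: report the conflict */
	}
}

/*
 * Nested insert, modelled on the retry loop in __request_region():
 * when the conflicting entry is not busy, descend into it and retry.
 */
static struct res *insert_nested(struct res *root, struct res *new)
{
	struct res *parent = root, *conflict;

	for (;;) {
		conflict = insert_flat(parent, new);
		if (!conflict)
			return NULL;
		if (conflict != parent && !(conflict->flags & BUSY)) {
			parent = conflict;
			continue;
		}
		return conflict;
	}
}

/* Replays the experiment: returns 1 if the ECAM claim succeeds. */
int ecam_claim_after_pnp(int nested)
{
	struct res iomem = { .start = 0, .end = ~0UL, .name = "iomem" };
	struct res pnp = { .start = 0xa0000000UL, .end = 0xafffffffUL,
			   .name = "ECAM PNP" };	/* not busy */
	struct res cfg = { .start = 0xa0000000UL, .end = 0xafffffffUL,
			   .flags = BUSY, .name = "PCI ECAM" };
	struct res *conflict;

	/* pnp/system.c goes first and leaves its region "not busy" */
	if (insert_flat(&iomem, &pnp))
		return 0;

	conflict = nested ? insert_nested(&iomem, &cfg)
			  : insert_flat(&iomem, &cfg);
	return conflict == NULL;
}
```

In this model, ecam_claim_after_pnp(0) takes the flat path and hits a
conflict with "ECAM PNP", as in the log above, while
ecam_claim_after_pnp(1) nests "PCI ECAM" inside the non-busy region.
Whether ecam.c should request its window that way is a separate
question.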

Bjorn


diff --git a/arch/x86/pci/init.c b/arch/x86/pci/init.c
index adb62aa..5a35638 100644
--- a/arch/x86/pci/init.c
+++ b/arch/x86/pci/init.c
@@ -7,6 +7,8 @@
    in the right sequence from here. */
 static __init int pci_arch_init(void)
 {
+	struct resource *res, *conflict;
+	static struct resource cfg;
 #ifdef CONFIG_PCI_DIRECT
 	int type = 0;
 
@@ -39,6 +41,26 @@ static __init int pci_arch_init(void)
 
 	dmi_check_skip_isa_align();
 
+	printk("\n** res test **\n");
+
+	res = request_mem_region(0xa0000000, 0x10000000, "ECAM PNP");
+	printk("requested %pR\n", res);
+	if (!res)
+		return 0;
+	res->flags &= ~IORESOURCE_BUSY;
+
+	cfg.start = 0xa0000000;
+	cfg.end = 0xafffffff;
+	cfg.flags = IORESOURCE_MEM | IORESOURCE_BUSY;
+	cfg.name = "PCI ECAM";
+
+	conflict = request_resource_conflict(&iomem_resource, &cfg);
+	if (conflict)
+		printk("can't claim ECAM area %pR: conflict with %s %pR\n",
+		    &cfg, conflict->name, conflict);
+
+	printk("\n");
+
 	return 0;
 }
 arch_initcall(pci_arch_init);

^ permalink raw reply related	[flat|nested] 66+ messages in thread

* RE: [PATCH] PCI: Add information about describing PCI in ACPI
  2016-11-21 16:47         ` Bjorn Helgaas
  (?)
  (?)
@ 2016-11-21 17:23           ` Gabriele Paoloni
  -1 siblings, 0 replies; 66+ messages in thread
From: Gabriele Paoloni @ 2016-11-21 17:23 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Bjorn Helgaas, linux-pci, linux-acpi, linux-kernel,
	linux-arm-kernel, linaro-acpi

Hi Bjorn

> -----Original Message-----
> From: linux-pci-owner@vger.kernel.org [mailto:linux-pci-
> owner@vger.kernel.org] On Behalf Of Bjorn Helgaas
> Sent: 21 November 2016 16:47
> To: Gabriele Paoloni
> Cc: Bjorn Helgaas; linux-pci@vger.kernel.org; linux-
> acpi@vger.kernel.org; linux-kernel@vger.kernel.org; linux-arm-
> kernel@lists.infradead.org; linaro-acpi@lists.linaro.org
> Subject: Re: [PATCH] PCI: Add information about describing PCI in ACPI
> 
> On Mon, Nov 21, 2016 at 08:52:52AM +0000, Gabriele Paoloni wrote:
> > Hi Bjorn
> >
> > > -----Original Message-----
> > > From: Bjorn Helgaas [mailto:helgaas@kernel.org]
> > > Sent: 18 November 2016 17:54
> > > To: Gabriele Paoloni
> > > Cc: Bjorn Helgaas; linux-pci@vger.kernel.org; linux-
> > > acpi@vger.kernel.org; linux-kernel@vger.kernel.org; linux-arm-
> > > kernel@lists.infradead.org; linaro-acpi@lists.linaro.org
> > > Subject: Re: [PATCH] PCI: Add information about describing PCI in ACPI
> > >
> > > On Fri, Nov 18, 2016 at 05:17:34PM +0000, Gabriele Paoloni wrote:
> > > > > -----Original Message-----
> > > > > From: linux-kernel-owner@vger.kernel.org [mailto:linux-kernel-
> > > > > owner@vger.kernel.org] On Behalf Of Bjorn Helgaas
> > > > > Sent: 17 November 2016 18:00
> > >
> > > > > +Static tables like MCFG, HPET, ECDT, etc., are *not* mechanisms for
> > > > > +reserving address space!  The static tables are for things the OS
> > > > > +needs to know early in boot, before it can parse the ACPI namespace.
> > > > > +If a new table is defined, an old OS needs to operate correctly even
> > > > > +though it ignores the table.  _CRS allows that because it is generic
> > > > > +and understood by the old OS; a static table does not.
> > > >
> > > > Right so if my understanding is correct you are saying that resources
> > > > described in the MCFG table should also be declared in PNP0C02 devices
> > > > so that the PNP driver can reserve these resources.
> > >
> > > Yes.
> > >
> > > > On the other side the PCI Root bridge driver should not reserve such
> > > > resources.
> > > >
> > > > Well if my understanding is correct I think we have a problem here:
> > > > http://lxr.free-electrons.com/source/drivers/pci/ecam.c#L74
> > > >
> > > > As you can see pci_ecam_create() will conflict with the pnp driver
> > > > as it will try to reserve the resources from the MCFG table...
> > > >
> > > > Maybe we need to rework pci_ecam_create() ?
> > >
> > > I think it's OK as it is.
> > >
> > > The pnp/system.c driver does try to reserve PNP0C02 resources, and it
> > > marks them as "not busy".  That way they appear in /proc/iomem and
> > > won't be allocated for anything else, but they can still be requested
> > > by drivers, e.g., pci/ecam.c, which will mark them "busy".
> > >
> > > This is analogous to what the PCI core does in pci_claim_resource().
> > > This is really a function of the ACPI/PNP *core*, which should reserve
> > > all _CRS resources for all devices (not just PNP0C02 devices).  But
> > > it's done by pnp/system.c, and only for PNP0C02, because there's a
> > > bunch of historical baggage there.
> > >
> > > You'll also notice that in this case, things are out of order:
> > > logically the pnp/system.c reservation should happen first, but in
> > > fact the pci/ecam.c request happens *before* the pnp/system.c one.
> > > That means the pnp/system.c one might fail and complain "[mem ...]
> > > could not be reserved".
> >
> > Correct me if I am wrong...
> >
> > So currently we are relying on the fact that pci_ecam_create() is called
> > before the pnp driver.
> > If the pnp driver came first we would end up in pci_ecam_create() failing
> > here:
> > http://lxr.free-electrons.com/source/drivers/pci/ecam.c#L76
> >
> > I am not sure but it seems to me like a bit weak condition to rely on...
> > what about removing the error condition in pci_ecam_create() and logging
> > just a dev_info()?
> 
> Huh.  I'm confused.  I *thought* it would be safe to reverse the
> order, which would effectively be this:
> 
>   system_pnp_probe
>     reserve_resources_of_dev
>       reserve_range
>         request_mem_region([mem 0xb0000000-0xb1ffffff])
>   ...
>   pci_ecam_create
>     request_resource_conflict([mem 0xb0000000-0xb1ffffff])
> 
> 
> but I experimented with the patch below on qemu, and it failed as you
> predicted:
> 
>   ** res test **
>   requested [mem 0xa0000000-0xafffffff]
>   can't claim ECAM area [mem 0xa0000000-0xafffffff]: conflict with ECAM PNP [mem 0xa0000000-0xafffffff]
> 
> I expected the request_resource_conflict() to succeed since it's
> completely contained in the "ECAM PNP" region.  But I guess I don't
> understand kernel/resource.c well enough.

I think it fails because the PNP driver has already inserted that range
into the iomem_resource tree, so pci_ecam_create() cannot add the cfg
resource at the same level of the hierarchy: the range is already there.

> 
> I'm not sure we need to fix anything yet, since we currently do the
> ecam.c request before the system.c one, and any change there would be
> a long ways off.  If/when that *does* change, I think the correct fix
> would be to change ecam.c so its request succeeds (by changing the way
> it does the request, fixing kernel/resource.c, or whatever) rather
> than to reduce the log level and ignore the failure.

Well, in my mind I didn't just want to make the error disappear...
If all the resources should be reserved by the PNP driver, then ideally
we could remove request_resource_conflict() from pci_ecam_create(),
but that would break systems with an already shipped BIOS that rely on
the pci_ecam_create() reservation rather than the PNP reservation.

Just removing the error condition and converting dev_err() into
dev_info() seems to me a way to accommodate already shipped BIOS images
while still flagging a reservation that was already made by somebody
else, without compromising the functionality of the PCI Root bridge
driver.  (So far the only reason I can see for the error condition is
to catch a buggy MCFG with overlapping addresses; if that is the case,
maybe we need a different diagnostic check to make sure that the MCFG
table is alright.)
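
As a rough idea of what such a diagnostic could look like, here is a
userspace sketch that checks a set of MCFG-style entries for
overlapping ECAM windows.  Everything in it is an assumption made for
illustration: struct mcfg_entry, ECAM_BUS_SHIFT and mcfg_has_overlap()
are hypothetical names, not existing kernel interfaces, and the window
arithmetic assumes the usual 1 MB of config space per bus with the base
address referring to bus 0.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of one MCFG allocation entry. */
struct mcfg_entry {
	uint64_t base;		/* ECAM base address for bus 0 of the segment */
	uint16_t segment;
	uint8_t start_bus;
	uint8_t end_bus;
};

#define ECAM_BUS_SHIFT 20	/* assumed: 1 MB of config space per bus */

/* First byte of the window covering start_bus..end_bus. */
static uint64_t ecam_start(const struct mcfg_entry *e)
{
	return e->base + ((uint64_t)e->start_bus << ECAM_BUS_SHIFT);
}

/* Last byte of the window covering start_bus..end_bus. */
static uint64_t ecam_end(const struct mcfg_entry *e)
{
	return e->base + (((uint64_t)e->end_bus + 1) << ECAM_BUS_SHIFT) - 1;
}

/* Returns 1 if any two entries describe overlapping windows, else 0. */
int mcfg_has_overlap(const struct mcfg_entry *tbl, size_t n)
{
	size_t i, j;

	for (i = 0; i < n; i++)
		for (j = i + 1; j < n; j++)
			if (ecam_start(&tbl[i]) <= ecam_end(&tbl[j]) &&
			    ecam_start(&tbl[j]) <= ecam_end(&tbl[i]))
				return 1;
	return 0;
}
```

A check along these lines looks only at the table itself, so it would
not depend on whether the PNP driver or pci_ecam_create() claims the
range first.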

BTW, if you think we can keep this as it is for now, I am OK with that.

Many Thanks

Gab

> 
> Bjorn
> 
> 
> diff --git a/arch/x86/pci/init.c b/arch/x86/pci/init.c
> index adb62aa..5a35638 100644
> --- a/arch/x86/pci/init.c
> +++ b/arch/x86/pci/init.c
> @@ -7,6 +7,8 @@
>     in the right sequence from here. */
>  static __init int pci_arch_init(void)
>  {
> +	struct resource *res, *conflict;
> +	static struct resource cfg;
>  #ifdef CONFIG_PCI_DIRECT
>  	int type = 0;
> 
> @@ -39,6 +41,26 @@ static __init int pci_arch_init(void)
> 
>  	dmi_check_skip_isa_align();
> 
> +	printk("\n** res test **\n");
> +
> +	res = request_mem_region(0xa0000000, 0x10000000, "ECAM PNP");
> +	printk("requested %pR\n", res);
> +	if (!res)
> +		return 0;
> +	res->flags &= ~IORESOURCE_BUSY;
> +
> +	cfg.start = 0xa0000000;
> +	cfg.end = 0xafffffff;
> +	cfg.flags = IORESOURCE_MEM | IORESOURCE_BUSY;
> +	cfg.name = "PCI ECAM";
> +
> +	conflict = request_resource_conflict(&iomem_resource, &cfg);
> +	if (conflict)
> +		printk("can't claim ECAM area %pR: conflict with %s %pR\n",
> +		    &cfg, conflict->name, conflict);
> +
> +	printk("\n");
> +
>  	return 0;
>  }
>  arch_initcall(pci_arch_init);
> 


* Re: [PATCH] PCI: Add information about describing PCI in ACPI
  2016-11-21 17:23           ` Gabriele Paoloni
  (?)
@ 2016-11-21 20:10             ` Bjorn Helgaas
  -1 siblings, 0 replies; 66+ messages in thread
From: Bjorn Helgaas @ 2016-11-21 20:10 UTC (permalink / raw)
  To: Gabriele Paoloni
  Cc: Bjorn Helgaas, linux-pci, linux-acpi, linux-kernel,
	linux-arm-kernel, linaro-acpi

On Mon, Nov 21, 2016 at 05:23:11PM +0000, Gabriele Paoloni wrote:
> Hi Bjorn
> 
> > -----Original Message-----
> > From: linux-pci-owner@vger.kernel.org [mailto:linux-pci-
> > owner@vger.kernel.org] On Behalf Of Bjorn Helgaas
> > Sent: 21 November 2016 16:47
> > To: Gabriele Paoloni
> > Cc: Bjorn Helgaas; linux-pci@vger.kernel.org; linux-
> > acpi@vger.kernel.org; linux-kernel@vger.kernel.org; linux-arm-
> > kernel@lists.infradead.org; linaro-acpi@lists.linaro.org
> > Subject: Re: [PATCH] PCI: Add information about describing PCI in ACPI
> > 
> > On Mon, Nov 21, 2016 at 08:52:52AM +0000, Gabriele Paoloni wrote:
> > > Hi Bjorn
> > >
> > > > -----Original Message-----
> > > > From: Bjorn Helgaas [mailto:helgaas@kernel.org]
> > > > Sent: 18 November 2016 17:54
> > > > To: Gabriele Paoloni
> > > > Cc: Bjorn Helgaas; linux-pci@vger.kernel.org; linux-
> > > > acpi@vger.kernel.org; linux-kernel@vger.kernel.org; linux-arm-
> > > > kernel@lists.infradead.org; linaro-acpi@lists.linaro.org
> > > > Subject: Re: [PATCH] PCI: Add information about describing PCI in
> > ACPI
> > > >
> > > > On Fri, Nov 18, 2016 at 05:17:34PM +0000, Gabriele Paoloni wrote:
> > > > > > -----Original Message-----
> > > > > > From: linux-kernel-owner@vger.kernel.org [mailto:linux-kernel-
> > > > > > owner@vger.kernel.org] On Behalf Of Bjorn Helgaas
> > > > > > Sent: 17 November 2016 18:00
> > > >
> > > > > > > +Static tables like MCFG, HPET, ECDT, etc., are *not* mechanisms for
> > > > > > > +reserving address space!  The static tables are for things the OS
> > > > > > > +needs to know early in boot, before it can parse the ACPI namespace.
> > > > > > > +If a new table is defined, an old OS needs to operate correctly even
> > > > > > > +though it ignores the table.  _CRS allows that because it is generic
> > > > > > > +and understood by the old OS; a static table does not.
> > > > > >
> > > > > > Right so if my understanding is correct you are saying that resources
> > > > > > described in the MCFG table should also be declared in PNP0C02 devices
> > > > > > so that the PNP driver can reserve these resources.
> > > > >
> > > > > Yes.
> > > > >
> > > > > > On the other side the PCI Root bridge driver should not reserve such
> > > > > > resources.
> > > > > >
> > > > > > Well if my understanding is correct I think we have a problem here:
> > > > > > http://lxr.free-electrons.com/source/drivers/pci/ecam.c#L74
> > > > > >
> > > > > > As you can see pci_ecam_create() will conflict with the pnp driver
> > > > > > as it will try to reserve the resources from the MCFG table...
> > > > > >
> > > > > > Maybe we need to rework pci_ecam_create() ?
> > > > >
> > > > > I think it's OK as it is.
> > > > >
> > > > > The pnp/system.c driver does try to reserve PNP0C02 resources, and it
> > > > > marks them as "not busy".  That way they appear in /proc/iomem and
> > > > > won't be allocated for anything else, but they can still be requested
> > > > > by drivers, e.g., pci/ecam.c, which will mark them "busy".
> > > > >
> > > > > This is analogous to what the PCI core does in pci_claim_resource().
> > > > > This is really a function of the ACPI/PNP *core*, which should reserve
> > > > > all _CRS resources for all devices (not just PNP0C02 devices).  But
> > > > > it's done by pnp/system.c, and only for PNP0C02, because there's a
> > > > > bunch of historical baggage there.
> > > > >
> > > > > You'll also notice that in this case, things are out of order:
> > > > > logically the pnp/system.c reservation should happen first, but in
> > > > > fact the pci/ecam.c request happens *before* the pnp/system.c one.
> > > > > That means the pnp/system.c one might fail and complain "[mem ...]
> > > > > could not be reserved".
> > > >
> > > > Correct me if I am wrong...
> > > >
> > > > So currently we are relying on the fact that pci_ecam_create() is called
> > > > before the pnp driver.
> > > > If the pnp driver came first we would end up in pci_ecam_create() failing
> > > > here:
> > > > http://lxr.free-electrons.com/source/drivers/pci/ecam.c#L76
> > > >
> > > > I am not sure but it seems to me like a bit weak condition to rely on...
> > > > what about removing the error condition in pci_ecam_create() and logging
> > > > just a dev_info()?
> > 
> > Huh.  I'm confused.  I *thought* it would be safe to reverse the
> > order, which would effectively be this:
> > 
> >   system_pnp_probe
> >     reserve_resources_of_dev
> >       reserve_range
> >         request_mem_region([mem 0xb0000000-0xb1ffffff])
> >   ...
> >   pci_ecam_create
> >     request_resource_conflict([mem 0xb0000000-0xb1ffffff])
> > 
> > 
> > but I experimented with the patch below on qemu, and it failed as you
> > predicted:
> > 
> >   ** res test **
> >   requested [mem 0xa0000000-0xafffffff]
> > >   can't claim ECAM area [mem 0xa0000000-0xafffffff]: conflict with ECAM PNP [mem 0xa0000000-0xafffffff]
> > 
> > I expected the request_resource_conflict() to succeed since it's
> > completely contained in the "ECAM PNP" region.  But I guess I don't
> > understand kernel/resource.c well enough.
> 
> I think it fails because effectively the PNP driver is populating the
> iomem_resource resource tree and therefore pci_ecam_create() finds that
> it cannot add the cfg resource to the same hierarchy as it is already
> there... 

Right.  I'm just surprised because the PNP reservation is marked
"not busy", and a driver (e.g., ECAM) should still be able to request
the resource.

> > I'm not sure we need to fix anything yet, since we currently do the
> > ecam.c request before the system.c one, and any change there would be
> > a long ways off.  If/when that *does* change, I think the correct fix
> > would be to change ecam.c so its request succeeds (by changing the way
> > it does the request, fixing kernel/resource.c, or whatever) rather
> > than to reduce the log level and ignore the failure.
> 
> Well in my mind I didn't want just to make the error disappear...
> If all the resources should be reserved by the PNP driver then ideally
> we could take away request_resource_conflict() from pci_ecam_create(),
> but this would make buggy some systems with an already shipped BIOS
> that relied on pci_ecam_create() reservation rather than PNP reservation.

I don't want to remove the request from ecam.c.  Ideally, there should be
TWO lines in /proc/iomem: one from system.c for "pnp 00:01" or
whatever it is, and a second from ecam.c.  The first is the generic
one saying "this region is consumed by a piece of hardware, so don't
put anything else here."  The second is the driver-specific one saying
"PCI ECAM owns this region, nobody else can use it."

This is the same way we handle PCI BAR resources.  Here are two
examples from my laptop.  The first (00:08.0) only has one line:
it has a BAR that consumes address space, but I don't have a driver
for it loaded.  The second (00:16.0) does have a driver loaded, so it
has a second line showing that the driver owns the space:

  f124a000-f124afff : 0000:00:08.0         # from PCI core

  f124d000-f124dfff : 0000:00:16.0         # from PCI core
    f124d000-f124dfff : mei_me             # from mei_me driver

> Just removing the error condition and converting dev_err() into
> dev_info() seems to me like accommodating already shipped BIOS images
> and flagging a reservation that is already done by somebody else
> without compromising the functionality of the PCI Root bridge driver
> (so far the only reason why I can see the error condition there is
> to catch a buggy MCFG with overlapping addresses; so if this is the
> case maybe we need to have a different diagnostic check to make sure
> that the MCFG table is alright)

Ideally I think we should end up with this:

  a0000000-afffffff : pnp 00:01
    a0000000-afffffff : PCI ECAM

Realistically right now we'll probably end up with only the "PCI ECAM"
line in /proc/iomem and a warning from system.c about not being able
to reserve the space.

If we ever change things to do the generic PNP reservation first, then
we should fix things so ecam.c can claim the space without an error.
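
One possible explanation for the failure above, as far as I can tell
from kernel/resource.c: request_resource_conflict() only tries to link
the new resource as a direct child of the root it is given and reports
any overlapping sibling as a conflict, whether or not that sibling is
busy, while the request_mem_region() path descends into a non-busy
conflict and retries underneath it.  Below is a simplified user-space
model of that difference.  It is a sketch, not the kernel source, and
struct res, insert_under(), and request_region_model() are made-up
names:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct res {
	uint64_t start, end;
	const char *name;
	bool busy;			/* models IORESOURCE_BUSY */
	struct res *parent, *sibling, *child;
};

/*
 * Models the flat insert: link 'new' as a child of 'root', or return the
 * resource it conflicts with.  The busy flag is never consulted here.
 */
static struct res *insert_under(struct res *root, struct res *new)
{
	struct res *tmp, **p;

	assert(root != NULL && new != NULL);
	if (new->end < new->start ||
	    new->start < root->start || new->end > root->end)
		return root;

	p = &root->child;
	for (;;) {
		tmp = *p;
		if (!tmp || tmp->start > new->end) {
			new->sibling = tmp;
			*p = new;
			new->parent = root;
			return NULL;
		}
		p = &tmp->sibling;
		if (tmp->end < new->start)
			continue;
		return tmp;		/* overlapping sibling */
	}
}

/*
 * Models the region-style request: on a conflict with a non-busy
 * resource, descend and retry beneath it; otherwise report the conflict.
 */
static struct res *request_region_model(struct res *parent, struct res *new)
{
	for (;;) {
		struct res *conflict = insert_under(parent, new);

		if (!conflict) {
			new->busy = true;
			return NULL;
		}
		if (conflict != parent && !conflict->busy) {
			parent = conflict;
			continue;
		}
		return conflict;
	}
}
```

With a non-busy "pnp 00:01" entry already under the root,
insert_under(&root, &ecam) returns the pnp entry as a conflict, while
request_region_model(&root, &ecam) links "PCI ECAM" beneath it, which
is the two-line layout shown above.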


* Re: [PATCH] PCI: Add information about describing PCI in ACPI
  2016-11-17 17:59 ` Bjorn Helgaas
  (?)
  (?)
@ 2016-11-22 10:09   ` Ard Biesheuvel
  -1 siblings, 0 replies; 66+ messages in thread
From: Ard Biesheuvel @ 2016-11-22 10:09 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: linux-pci, linux-acpi, linux-kernel, linux-arm-kernel, linaro-acpi

On 17 November 2016 at 17:59, Bjorn Helgaas <bhelgaas@google.com> wrote:
> Add a writeup about how PCI host bridges should be described in ACPI
> using PNP0A03/PNP0A08 devices, PNP0C02 devices, and the MCFG table.
>
> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
> ---
>  Documentation/PCI/00-INDEX      |    2 +
>  Documentation/PCI/acpi-info.txt |  136 +++++++++++++++++++++++++++++++++++++++
>  2 files changed, 138 insertions(+)
>  create mode 100644 Documentation/PCI/acpi-info.txt
>
> diff --git a/Documentation/PCI/00-INDEX b/Documentation/PCI/00-INDEX
> index 147231f..0780280 100644
> --- a/Documentation/PCI/00-INDEX
> +++ b/Documentation/PCI/00-INDEX
> @@ -1,5 +1,7 @@
>  00-INDEX
>         - this file
> +acpi-info.txt
> +       - info on how PCI host bridges are represented in ACPI
>  MSI-HOWTO.txt
>         - the Message Signaled Interrupts (MSI) Driver Guide HOWTO and FAQ.
>  PCIEBUS-HOWTO.txt
> diff --git a/Documentation/PCI/acpi-info.txt b/Documentation/PCI/acpi-info.txt
> new file mode 100644
> index 0000000..ccbcfda
> --- /dev/null
> +++ b/Documentation/PCI/acpi-info.txt
> @@ -0,0 +1,136 @@
> +           ACPI considerations for PCI host bridges
> +
> +The basic requirement is that the ACPI namespace should describe
> +*everything* that consumes address space unless there's another
> +standard way for the OS to find it [1, 2].  For example, windows that
> +are forwarded to PCI by a PCI host bridge should be described via ACPI
> +devices, since the OS can't locate the host bridge by itself.  PCI
> +devices *below* the host bridge do not need to be described via ACPI,
> +because the resources they consume are inside the host bridge windows,
> +and the OS can discover them via the standard PCI enumeration
> +mechanism (using config accesses to read and size the BARs).
> +
> +This ACPI resource description is done via _CRS methods of devices in
> +the ACPI namespace [2].   _CRS methods are like generalized PCI BARs:
> +the OS can read _CRS and figure out what resource is being consumed
> +even if it doesn't have a driver for the device [3].  That's important
> +because it means an old OS can work correctly even on a system with
> +new devices unknown to the OS.  The new devices won't do anything, but
> +the OS can at least make sure no resources conflict with them.
> +
> +Static tables like MCFG, HPET, ECDT, etc., are *not* mechanisms for
> +reserving address space!  The static tables are for things the OS
> +needs to know early in boot, before it can parse the ACPI namespace.
> +If a new table is defined, an old OS needs to operate correctly even
> +though it ignores the table.  _CRS allows that because it is generic
> +and understood by the old OS; a static table does not.
> +
> +If the OS is expected to manage an ACPI device, that device will have
> +a specific _HID/_CID that tells the OS what driver to bind to it, and
> +the _CRS tells the OS and the driver where the device's registers are.
> +
> +PNP0C02 "motherboard" devices are basically a catch-all.  There's no
> +programming model for them other than "don't use these resources for
> +anything else."  So any address space that is (1) not claimed by some
> +other ACPI device and (2) should not be assigned by the OS to
> +something else, should be claimed by a PNP0C02 _CRS method.
> +
> +PCI host bridges are PNP0A03 or PNP0A08 devices.  Their _CRS should
> +describe all the address space they consume.  In principle, this would
> +be all the windows they forward down to the PCI bus, as well as the
> +bridge registers themselves.  The bridge registers include things like
> +secondary/subordinate bus registers that determine the bus range below
> +the bridge, window registers that describe the apertures, etc.  These
> +are all device-specific, non-architected things, so the only way a
> +PNP0A03/PNP0A08 driver can manage them is via _PRS/_CRS/_SRS, which
> +contain the device-specific details.  These bridge registers also
> +include ECAM space, since it is consumed by the bridge.
> +
> +ACPI defined a Producer/Consumer bit that was intended to distinguish
> +the bridge apertures from the bridge registers [4, 5].  However,
> +BIOSes didn't use that bit correctly, and the result is that OSes have
> +to assume that everything in a PCI host bridge _CRS is a window.  That
> +leaves no way to describe the bridge registers in the PNP0A03/PNP0A08
> +device itself.
> +

Is that universally true? Or is it still possible to do the right
thing here on new ACPI architectures such as arm64?

> +The workaround is to describe the bridge registers (including ECAM
> +space) in PNP0C02 catch-all devices [6].  With the exception of ECAM,
> +the bridge register space is device-specific anyway, so the generic
> +PNP0A03/PNP0A08 driver (pci_root.c) has no need to know about it.  For
> +ECAM, pci_root.c learns about the space from either MCFG or the _CBA
> +method.
> +
> +Note that the PCIe spec actually does require ECAM unless there's a
> +standard firmware interface for config access, e.g., the ia64 SAL
> +interface [7].  One reason is that we want a generic host bridge
> +driver (pci_root.c), and a generic driver requires a generic way to
> +access config space.
> +
> +
> +[1] ACPI 6.0, sec 6.1:
> +    For any device that is on a non-enumerable type of bus (for
> +    example, an ISA bus), OSPM enumerates the devices' identifier(s)
> +    and the ACPI system firmware must supply an _HID object ... for
> +    each device to enable OSPM to do that.
> +
> +[2] ACPI 6.0, sec 3.7:
> +    The OS enumerates motherboard devices simply by reading through
> +    the ACPI Namespace looking for devices with hardware IDs.
> +
> +    Each device enumerated by ACPI includes ACPI-defined objects in
> +    the ACPI Namespace that report the hardware resources the device
> +    could occupy [_PRS], an object that reports the resources that are
> +    currently used by the device [_CRS], and objects for configuring
> +    those resources [_SRS].  The information is used by the Plug and
> +    Play OS (OSPM) to configure the devices.
> +
> +[3] ACPI 6.0, sec 6.2:
> +    OSPM uses device configuration objects to configure hardware
> +    resources for devices enumerated via ACPI.  Device configuration
> +    objects provide information about current and possible resource
> +    requirements, the relationship between shared resources, and
> +    methods for configuring hardware resources.
> +
> +    When OSPM enumerates a device, it calls _PRS to determine the
> +    resource requirements of the device.  It may also call _CRS to
> +    find the current resource settings for the device.  Using this
> +    information, the Plug and Play system determines what resources
> +    the device should consume and sets those resources by calling the
> +    device’s _SRS control method.
> +
> +    In ACPI, devices can consume resources (for example, legacy
> +    keyboards), provide resources (for example, a proprietary PCI
> +    bridge), or do both.  Unless otherwise specified, resources for a
> +    device are assumed to be taken from the nearest matching resource
> +    above the device in the device hierarchy.
> +
> +[4] ACPI 6.0, sec 6.4.3.5.4:
> +    Extended Address Space Descriptor
> +    General Flags: Bit [0] Consumer/Producer:
> +       1–This device consumes this resource
> +       0–This device produces and consumes this resource
> +
> +[5] ACPI 6.0, sec 19.6.43:
> +    ResourceUsage specifies whether the Memory range is consumed by
> +    this device (ResourceConsumer) or passed on to child devices
> +    (ResourceProducer).  If nothing is specified, then
> +    ResourceConsumer is assumed.
> +
> +[6] PCI Firmware 3.0, sec 4.1.2:
> +    If the operating system does not natively comprehend reserving the
> +    MMCFG region, the MMCFG region must be reserved by firmware.  The
> +    address range reported in the MCFG table or by _CBA method (see
> +    Section 4.1.3) must be reserved by declaring a motherboard
> +    resource.  For most systems, the motherboard resource would appear
> +    at the root of the ACPI namespace (under \_SB) in a node with a
> +    _HID of EISAID (PNP0C02), and the resources in this case should
> +    not be claimed in the root PCI bus’s _CRS.  The resources can
> +    optionally be returned in Int15 E820 or EFIGetMemoryMap as
> +    reserved memory but must always be reported through ACPI as a
> +    motherboard resource.
> +
> +[7] PCI Express 3.0, sec 7.2.2:
> +    For systems that are PC-compatible, or that do not implement a
> +    processor-architecture-specific firmware interface standard that
> +    allows access to the Configuration Space, the ECAM is required as
> +    defined in this section.


* Re: [PATCH] PCI: Add information about describing PCI in ACPI
@ 2016-11-22 10:09   ` Ard Biesheuvel
  0 siblings, 0 replies; 66+ messages in thread
From: Ard Biesheuvel @ 2016-11-22 10:09 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: linux-pci, linux-acpi, linux-kernel, linux-arm-kernel, linaro-acpi

On 17 November 2016 at 17:59, Bjorn Helgaas <bhelgaas@google.com> wrote:
> Add a writeup about how PCI host bridges should be described in ACPI
> using PNP0A03/PNP0A08 devices, PNP0C02 devices, and the MCFG table.
>
> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
> ---
>  Documentation/PCI/00-INDEX      |    2 +
>  Documentation/PCI/acpi-info.txt |  136 +++++++++++++++++++++++++++++++++++++++
>  2 files changed, 138 insertions(+)
>  create mode 100644 Documentation/PCI/acpi-info.txt
>
> diff --git a/Documentation/PCI/00-INDEX b/Documentation/PCI/00-INDEX
> index 147231f..0780280 100644
> --- a/Documentation/PCI/00-INDEX
> +++ b/Documentation/PCI/00-INDEX
> @@ -1,5 +1,7 @@
>  00-INDEX
>         - this file
> +acpi-info.txt
> +       - info on how PCI host bridges are represented in ACPI
>  MSI-HOWTO.txt
>         - the Message Signaled Interrupts (MSI) Driver Guide HOWTO and FAQ.
>  PCIEBUS-HOWTO.txt
> diff --git a/Documentation/PCI/acpi-info.txt b/Documentation/PCI/acpi-info.txt
> new file mode 100644
> index 0000000..ccbcfda
> --- /dev/null
> +++ b/Documentation/PCI/acpi-info.txt
> @@ -0,0 +1,136 @@
> +           ACPI considerations for PCI host bridges
> +
> +The basic requirement is that the ACPI namespace should describe
> +*everything* that consumes address space unless there's another
> +standard way for the OS to find it [1, 2].  For example, windows that
> +are forwarded to PCI by a PCI host bridge should be described via ACPI
> +devices, since the OS can't locate the host bridge by itself.  PCI
> +devices *below* the host bridge do not need to be described via ACPI,
> +because the resources they consume are inside the host bridge windows,
> +and the OS can discover them via the standard PCI enumeration
> +mechanism (using config accesses to read and size the BARs).
> +
> +This ACPI resource description is done via _CRS methods of devices in
> +the ACPI namespace [2].   _CRS methods are like generalized PCI BARs:
> +the OS can read _CRS and figure out what resource is being consumed
> +even if it doesn't have a driver for the device [3].  That's important
> +because it means an old OS can work correctly even on a system with
> +new devices unknown to the OS.  The new devices won't do anything, but
> +the OS can at least make sure no resources conflict with them.
> +
> +Static tables like MCFG, HPET, ECDT, etc., are *not* mechanisms for
> +reserving address space!  The static tables are for things the OS
> +needs to know early in boot, before it can parse the ACPI namespace.
> +If a new table is defined, an old OS needs to operate correctly even
> +though it ignores the table.  _CRS allows that because it is generic
> +and understood by the old OS; a static table does not.
> +
> +If the OS is expected to manage an ACPI device, that device will have
> +a specific _HID/_CID that tells the OS what driver to bind to it, and
> +the _CRS tells the OS and the driver where the device's registers are.
> +
> +PNP0C02 "motherboard" devices are basically a catch-all.  There's no
> +programming model for them other than "don't use these resources for
> +anything else."  So any address space that is (1) not claimed by some
> +other ACPI device and (2) should not be assigned by the OS to
> +something else, should be claimed by a PNP0C02 _CRS method.
> +
> +PCI host bridges are PNP0A03 or PNP0A08 devices.  Their _CRS should
> +describe all the address space they consume.  In principle, this would
> +be all the windows they forward down to the PCI bus, as well as the
> +bridge registers themselves.  The bridge registers include things like
> +secondary/subordinate bus registers that determine the bus range below
> +the bridge, window registers that describe the apertures, etc.  These
> +are all device-specific, non-architected things, so the only way a
> +PNP0A03/PNP0A08 driver can manage them is via _PRS/_CRS/_SRS, which
> +contain the device-specific details.  These bridge registers also
> +include ECAM space, since it is consumed by the bridge.
> +
> +ACPI defined a Producer/Consumer bit that was intended to distinguish
> +the bridge apertures from the bridge registers [4, 5].  However,
> +BIOSes didn't use that bit correctly, and the result is that OSes have
> +to assume that everything in a PCI host bridge _CRS is a window.  That
> +leaves no way to describe the bridge registers in the PNP0A03/PNP0A08
> +device itself.
> +

Is that universally true? Or is it still possible to do the right
thing here on new ACPI architectures such as arm64?

> +The workaround is to describe the bridge registers (including ECAM
> +space) in PNP0C02 catch-all devices [6].  With the exception of ECAM,
> +the bridge register space is device-specific anyway, so the generic
> +PNP0A03/PNP0A08 driver (pci_root.c) has no need to know about it.  For
> +ECAM, pci_root.c learns about the space from either MCFG or the _CBA
> +method.
> +
> +Note that the PCIe spec actually does require ECAM unless there's a
> +standard firmware interface for config access, e.g., the ia64 SAL
> +interface [7].  One reason is that we want a generic host bridge
> +driver (pci_root.c), and a generic driver requires a generic way to
> +access config space.
> +
> +
> +[1] ACPI 6.0, sec 6.1:
> +    For any device that is on a non-enumerable type of bus (for
> +    example, an ISA bus), OSPM enumerates the devices' identifier(s)
> +    and the ACPI system firmware must supply an _HID object ... for
> +    each device to enable OSPM to do that.
> +
> +[2] ACPI 6.0, sec 3.7:
> +    The OS enumerates motherboard devices simply by reading through
> +    the ACPI Namespace looking for devices with hardware IDs.
> +
> +    Each device enumerated by ACPI includes ACPI-defined objects in
> +    the ACPI Namespace that report the hardware resources the device
> +    could occupy [_PRS], an object that reports the resources that are
> +    currently used by the device [_CRS], and objects for configuring
> +    those resources [_SRS].  The information is used by the Plug and
> +    Play OS (OSPM) to configure the devices.
> +
> +[3] ACPI 6.0, sec 6.2:
> +    OSPM uses device configuration objects to configure hardware
> +    resources for devices enumerated via ACPI.  Device configuration
> +    objects provide information about current and possible resource
> +    requirements, the relationship between shared resources, and
> +    methods for configuring hardware resources.
> +
> +    When OSPM enumerates a device, it calls _PRS to determine the
> +    resource requirements of the device.  It may also call _CRS to
> +    find the current resource settings for the device.  Using this
> +    information, the Plug and Play system determines what resources
> +    the device should consume and sets those resources by calling the
> +    device’s _SRS control method.
> +
> +    In ACPI, devices can consume resources (for example, legacy
> +    keyboards), provide resources (for example, a proprietary PCI
> +    bridge), or do both.  Unless otherwise specified, resources for a
> +    device are assumed to be taken from the nearest matching resource
> +    above the device in the device hierarchy.
> +
> +[4] ACPI 6.0, sec 6.4.3.5.4:
> +    Extended Address Space Descriptor
> +    General Flags: Bit [0] Consumer/Producer:
> +       1–This device consumes this resource
> +       0–This device produces and consumes this resource
> +
> +[5] ACPI 6.0, sec 19.6.43:
> +    ResourceUsage specifies whether the Memory range is consumed by
> +    this device (ResourceConsumer) or passed on to child devices
> +    (ResourceProducer).  If nothing is specified, then
> +    ResourceConsumer is assumed.
> +
> +[6] PCI Firmware 3.0, sec 4.1.2:
> +    If the operating system does not natively comprehend reserving the
> +    MMCFG region, the MMCFG region must be reserved by firmware.  The
> +    address range reported in the MCFG table or by _CBA method (see
> +    Section 4.1.3) must be reserved by declaring a motherboard
> +    resource.  For most systems, the motherboard resource would appear
> +    at the root of the ACPI namespace (under \_SB) in a node with a
> +    _HID of EISAID (PNP0C02), and the resources in this case should
> +    not be claimed in the root PCI bus’s _CRS.  The resources can
> +    optionally be returned in Int15 E820 or EFIGetMemoryMap as
> +    reserved memory but must always be reported through ACPI as a
> +    motherboard resource.
> +
> +[7] PCI Express 3.0, sec 7.2.2:
> +    For systems that are PC-compatible, or that do not implement a
> +    processor-architecture-specific firmware interface standard that
> +    allows access to the Configuration Space, the ECAM is required as
> +    defined in this section.
>
>
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH] PCI: Add information about describing PCI in ACPI
@ 2016-11-22 10:09   ` Ard Biesheuvel
  0 siblings, 0 replies; 66+ messages in thread
From: Ard Biesheuvel @ 2016-11-22 10:09 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: linux-pci, linux-acpi, linux-kernel, linux-arm-kernel, linaro-acpi

On 17 November 2016 at 17:59, Bjorn Helgaas <bhelgaas@google.com> wrote:
> Add a writeup about how PCI host bridges should be described in ACPI
> using PNP0A03/PNP0A08 devices, PNP0C02 devices, and the MCFG table.
>
> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
> ---
>  Documentation/PCI/00-INDEX      |    2 +
>  Documentation/PCI/acpi-info.txt |  136 +++++++++++++++++++++++++++++++++++++++
>  2 files changed, 138 insertions(+)
>  create mode 100644 Documentation/PCI/acpi-info.txt
>
> diff --git a/Documentation/PCI/00-INDEX b/Documentation/PCI/00-INDEX
> index 147231f..0780280 100644
> --- a/Documentation/PCI/00-INDEX
> +++ b/Documentation/PCI/00-INDEX
> @@ -1,5 +1,7 @@
>  00-INDEX
>         - this file
> +acpi-info.txt
> +       - info on how PCI host bridges are represented in ACPI
>  MSI-HOWTO.txt
>         - the Message Signaled Interrupts (MSI) Driver Guide HOWTO and FAQ.
>  PCIEBUS-HOWTO.txt
> diff --git a/Documentation/PCI/acpi-info.txt b/Documentation/PCI/acpi-info.txt
> new file mode 100644
> index 0000000..ccbcfda
> --- /dev/null
> +++ b/Documentation/PCI/acpi-info.txt
> @@ -0,0 +1,136 @@
> +           ACPI considerations for PCI host bridges
> +
> +The basic requirement is that the ACPI namespace should describe
> +*everything* that consumes address space unless there's another
> +standard way for the OS to find it [1, 2].  For example, windows that
> +are forwarded to PCI by a PCI host bridge should be described via ACPI
> +devices, since the OS can't locate the host bridge by itself.  PCI
> +devices *below* the host bridge do not need to be described via ACPI,
> +because the resources they consume are inside the host bridge windows,
> +and the OS can discover them via the standard PCI enumeration
> +mechanism (using config accesses to read and size the BARs).
> +
> +This ACPI resource description is done via _CRS methods of devices in
> +the ACPI namespace [2].   _CRS methods are like generalized PCI BARs:
> +the OS can read _CRS and figure out what resource is being consumed
> +even if it doesn't have a driver for the device [3].  That's important
> +because it means an old OS can work correctly even on a system with
> +new devices unknown to the OS.  The new devices won't do anything, but
> +the OS can at least make sure no resources conflict with them.
> +
> +Static tables like MCFG, HPET, ECDT, etc., are *not* mechanisms for
> +reserving address space!  The static tables are for things the OS
> +needs to know early in boot, before it can parse the ACPI namespace.
> +If a new table is defined, an old OS needs to operate correctly even
> +though it ignores the table.  _CRS allows that because it is generic
> +and understood by the old OS; a static table does not.
> +
> +If the OS is expected to manage an ACPI device, that device will have
> +a specific _HID/_CID that tells the OS what driver to bind to it, and
> +the _CRS tells the OS and the driver where the device's registers are.
> +
> +PNP0C02 "motherboard" devices are basically a catch-all.  There's no
> +programming model for them other than "don't use these resources for
> +anything else."  So any address space that is (1) not claimed by some
> +other ACPI device and (2) should not be assigned by the OS to
> +something else, should be claimed by a PNP0C02 _CRS method.
> +
> +PCI host bridges are PNP0A03 or PNP0A08 devices.  Their _CRS should
> +describe all the address space they consume.  In principle, this would
> +be all the windows they forward down to the PCI bus, as well as the
> +bridge registers themselves.  The bridge registers include things like
> +secondary/subordinate bus registers that determine the bus range below
> +the bridge, window registers that describe the apertures, etc.  These
> +are all device-specific, non-architected things, so the only way a
> +PNP0A03/PNP0A08 driver can manage them is via _PRS/_CRS/_SRS, which
> +contain the device-specific details.  These bridge registers also
> +include ECAM space, since it is consumed by the bridge.
> +
> +ACPI defined a Producer/Consumer bit that was intended to distinguish
> +the bridge apertures from the bridge registers [4, 5].  However,
> +BIOSes didn't use that bit correctly, and the result is that OSes have
> +to assume that everything in a PCI host bridge _CRS is a window.  That
> +leaves no way to describe the bridge registers in the PNP0A03/PNP0A08
> +device itself.
> +

Is that universally true? Or is it still possible to do the right
thing here on new ACPI architectures such as arm64?

> +The workaround is to describe the bridge registers (including ECAM
> +space) in PNP0C02 catch-all devices [6].  With the exception of ECAM,
> +the bridge register space is device-specific anyway, so the generic
> +PNP0A03/PNP0A08 driver (pci_root.c) has no need to know about it.  For
> +ECAM, pci_root.c learns about the space from either MCFG or the _CBA
> +method.
> +
> +Note that the PCIe spec actually does require ECAM unless there's a
> +standard firmware interface for config access, e.g., the ia64 SAL
> +interface [7].  One reason is that we want a generic host bridge
> +driver (pci_root.c), and a generic driver requires a generic way to
> +access config space.
> +
> +
> +[1] ACPI 6.0, sec 6.1:
> +    For any device that is on a non-enumerable type of bus (for
> +    example, an ISA bus), OSPM enumerates the devices' identifier(s)
> +    and the ACPI system firmware must supply an _HID object ... for
> +    each device to enable OSPM to do that.
> +
> +[2] ACPI 6.0, sec 3.7:
> +    The OS enumerates motherboard devices simply by reading through
> +    the ACPI Namespace looking for devices with hardware IDs.
> +
> +    Each device enumerated by ACPI includes ACPI-defined objects in
> +    the ACPI Namespace that report the hardware resources the device
> +    could occupy [_PRS], an object that reports the resources that are
> +    currently used by the device [_CRS], and objects for configuring
> +    those resources [_SRS].  The information is used by the Plug and
> +    Play OS (OSPM) to configure the devices.
> +
> +[3] ACPI 6.0, sec 6.2:
> +    OSPM uses device configuration objects to configure hardware
> +    resources for devices enumerated via ACPI.  Device configuration
> +    objects provide information about current and possible resource
> +    requirements, the relationship between shared resources, and
> +    methods for configuring hardware resources.
> +
> +    When OSPM enumerates a device, it calls _PRS to determine the
> +    resource requirements of the device.  It may also call _CRS to
> +    find the current resource settings for the device.  Using this
> +    information, the Plug and Play system determines what resources
> +    the device should consume and sets those resources by calling the
> +    device’s _SRS control method.
> +
> +    In ACPI, devices can consume resources (for example, legacy
> +    keyboards), provide resources (for example, a proprietary PCI
> +    bridge), or do both.  Unless otherwise specified, resources for a
> +    device are assumed to be taken from the nearest matching resource
> +    above the device in the device hierarchy.
> +
> +[4] ACPI 6.0, sec 6.4.3.5.4:
> +    Extended Address Space Descriptor
> +    General Flags: Bit [0] Consumer/Producer:
> +       1–This device consumes this resource
> +       0–This device produces and consumes this resource
> +
> +[5] ACPI 6.0, sec 19.6.43:
> +    ResourceUsage specifies whether the Memory range is consumed by
> +    this device (ResourceConsumer) or passed on to child devices
> +    (ResourceProducer).  If nothing is specified, then
> +    ResourceConsumer is assumed.
> +
> +[6] PCI Firmware 3.0, sec 4.1.2:
> +    If the operating system does not natively comprehend reserving the
> +    MMCFG region, the MMCFG region must be reserved by firmware.  The
> +    address range reported in the MCFG table or by _CBA method (see
> +    Section 4.1.3) must be reserved by declaring a motherboard
> +    resource.  For most systems, the motherboard resource would appear
> +    at the root of the ACPI namespace (under \_SB) in a node with a
> +    _HID of EISAID (PNP0C02), and the resources in this case should
> +    not be claimed in the root PCI bus’s _CRS.  The resources can
> +    optionally be returned in Int15 E820 or EFIGetMemoryMap as
> +    reserved memory but must always be reported through ACPI as a
> +    motherboard resource.
> +
> +[7] PCI Express 3.0, sec 7.2.2:
> +    For systems that are PC-compatible, or that do not implement a
> +    processor-architecture-specific firmware interface standard that
> +    allows access to the Configuration Space, the ECAM is required as
> +    defined in this section.
>
>
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 66+ messages in thread

* [PATCH] PCI: Add information about describing PCI in ACPI
@ 2016-11-22 10:09   ` Ard Biesheuvel
  0 siblings, 0 replies; 66+ messages in thread
From: Ard Biesheuvel @ 2016-11-22 10:09 UTC (permalink / raw)
  To: linux-arm-kernel

On 17 November 2016 at 17:59, Bjorn Helgaas <bhelgaas@google.com> wrote:
> Add a writeup about how PCI host bridges should be described in ACPI
> using PNP0A03/PNP0A08 devices, PNP0C02 devices, and the MCFG table.
>
> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
> ---
>  Documentation/PCI/00-INDEX      |    2 +
>  Documentation/PCI/acpi-info.txt |  136 +++++++++++++++++++++++++++++++++++++++
>  2 files changed, 138 insertions(+)
>  create mode 100644 Documentation/PCI/acpi-info.txt
>
> diff --git a/Documentation/PCI/00-INDEX b/Documentation/PCI/00-INDEX
> index 147231f..0780280 100644
> --- a/Documentation/PCI/00-INDEX
> +++ b/Documentation/PCI/00-INDEX
> @@ -1,5 +1,7 @@
>  00-INDEX
>         - this file
> +acpi-info.txt
> +       - info on how PCI host bridges are represented in ACPI
>  MSI-HOWTO.txt
>         - the Message Signaled Interrupts (MSI) Driver Guide HOWTO and FAQ.
>  PCIEBUS-HOWTO.txt
> diff --git a/Documentation/PCI/acpi-info.txt b/Documentation/PCI/acpi-info.txt
> new file mode 100644
> index 0000000..ccbcfda
> --- /dev/null
> +++ b/Documentation/PCI/acpi-info.txt
> @@ -0,0 +1,136 @@
> +           ACPI considerations for PCI host bridges
> +
> +The basic requirement is that the ACPI namespace should describe
> +*everything* that consumes address space unless there's another
> +standard way for the OS to find it [1, 2].  For example, windows that
> +are forwarded to PCI by a PCI host bridge should be described via ACPI
> +devices, since the OS can't locate the host bridge by itself.  PCI
> +devices *below* the host bridge do not need to be described via ACPI,
> +because the resources they consume are inside the host bridge windows,
> +and the OS can discover them via the standard PCI enumeration
> +mechanism (using config accesses to read and size the BARs).
> +
> +This ACPI resource description is done via _CRS methods of devices in
> +the ACPI namespace [2].   _CRS methods are like generalized PCI BARs:
> +the OS can read _CRS and figure out what resource is being consumed
> +even if it doesn't have a driver for the device [3].  That's important
> +because it means an old OS can work correctly even on a system with
> +new devices unknown to the OS.  The new devices won't do anything, but
> +the OS can at least make sure no resources conflict with them.
> +
> +Static tables like MCFG, HPET, ECDT, etc., are *not* mechanisms for
> +reserving address space!  The static tables are for things the OS
> +needs to know early in boot, before it can parse the ACPI namespace.
> +If a new table is defined, an old OS needs to operate correctly even
> +though it ignores the table.  _CRS allows that because it is generic
> +and understood by the old OS; a static table does not.
> +
> +If the OS is expected to manage an ACPI device, that device will have
> +a specific _HID/_CID that tells the OS what driver to bind to it, and
> +the _CRS tells the OS and the driver where the device's registers are.
> +
> +PNP0C02 "motherboard" devices are basically a catch-all.  There's no
> +programming model for them other than "don't use these resources for
> +anything else."  So any address space that is (1) not claimed by some
> +other ACPI device and (2) should not be assigned by the OS to
> +something else, should be claimed by a PNP0C02 _CRS method.
> +
> +PCI host bridges are PNP0A03 or PNP0A08 devices.  Their _CRS should
> +describe all the address space they consume.  In principle, this would
> +be all the windows they forward down to the PCI bus, as well as the
> +bridge registers themselves.  The bridge registers include things like
> +secondary/subordinate bus registers that determine the bus range below
> +the bridge, window registers that describe the apertures, etc.  These
> +are all device-specific, non-architected things, so the only way a
> +PNP0A03/PNP0A08 driver can manage them is via _PRS/_CRS/_SRS, which
> +contain the device-specific details.  These bridge registers also
> +include ECAM space, since it is consumed by the bridge.
> +
> +ACPI defined a Producer/Consumer bit that was intended to distinguish
> +the bridge apertures from the bridge registers [4, 5].  However,
> +BIOSes didn't use that bit correctly, and the result is that OSes have
> +to assume that everything in a PCI host bridge _CRS is a window.  That
> +leaves no way to describe the bridge registers in the PNP0A03/PNP0A08
> +device itself.
> +

Is that universally true? Or is it still possible to do the right
thing here on new ACPI architectures such as arm64?

> +The workaround is to describe the bridge registers (including ECAM
> +space) in PNP0C02 catch-all devices [6].  With the exception of ECAM,
> +the bridge register space is device-specific anyway, so the generic
> +PNP0A03/PNP0A08 driver (pci_root.c) has no need to know about it.  For
> +ECAM, pci_root.c learns about the space from either MCFG or the _CBA
> +method.
> +
> +Note that the PCIe spec actually does require ECAM unless there's a
> +standard firmware interface for config access, e.g., the ia64 SAL
> +interface [7].  One reason is that we want a generic host bridge
> +driver (pci_root.c), and a generic driver requires a generic way to
> +access config space.
> +
> +
> +[1] ACPI 6.0, sec 6.1:
> +    For any device that is on a non-enumerable type of bus (for
> +    example, an ISA bus), OSPM enumerates the devices' identifier(s)
> +    and the ACPI system firmware must supply an _HID object ... for
> +    each device to enable OSPM to do that.
> +
> +[2] ACPI 6.0, sec 3.7:
> +    The OS enumerates motherboard devices simply by reading through
> +    the ACPI Namespace looking for devices with hardware IDs.
> +
> +    Each device enumerated by ACPI includes ACPI-defined objects in
> +    the ACPI Namespace that report the hardware resources the device
> +    could occupy [_PRS], an object that reports the resources that are
> +    currently used by the device [_CRS], and objects for configuring
> +    those resources [_SRS].  The information is used by the Plug and
> +    Play OS (OSPM) to configure the devices.
> +
> +[3] ACPI 6.0, sec 6.2:
> +    OSPM uses device configuration objects to configure hardware
> +    resources for devices enumerated via ACPI.  Device configuration
> +    objects provide information about current and possible resource
> +    requirements, the relationship between shared resources, and
> +    methods for configuring hardware resources.
> +
> +    When OSPM enumerates a device, it calls _PRS to determine the
> +    resource requirements of the device.  It may also call _CRS to
> +    find the current resource settings for the device.  Using this
> +    information, the Plug and Play system determines what resources
> +    the device should consume and sets those resources by calling the
> +    device’s _SRS control method.
> +
> +    In ACPI, devices can consume resources (for example, legacy
> +    keyboards), provide resources (for example, a proprietary PCI
> +    bridge), or do both.  Unless otherwise specified, resources for a
> +    device are assumed to be taken from the nearest matching resource
> +    above the device in the device hierarchy.
> +
> +[4] ACPI 6.0, sec 6.4.3.5.4:
> +    Extended Address Space Descriptor
> +    General Flags: Bit [0] Consumer/Producer:
> +       1–This device consumes this resource
> +       0–This device produces and consumes this resource
> +
> +[5] ACPI 6.0, sec 19.6.43:
> +    ResourceUsage specifies whether the Memory range is consumed by
> +    this device (ResourceConsumer) or passed on to child devices
> +    (ResourceProducer).  If nothing is specified, then
> +    ResourceConsumer is assumed.
> +
> +[6] PCI Firmware 3.0, sec 4.1.2:
> +    If the operating system does not natively comprehend reserving the
> +    MMCFG region, the MMCFG region must be reserved by firmware.  The
> +    address range reported in the MCFG table or by _CBA method (see
> +    Section 4.1.3) must be reserved by declaring a motherboard
> +    resource.  For most systems, the motherboard resource would appear
> +    at the root of the ACPI namespace (under \_SB) in a node with a
> +    _HID of EISAID (PNP0C02), and the resources in this case should
> +    not be claimed in the root PCI bus’s _CRS.  The resources can
> +    optionally be returned in Int15 E820 or EFIGetMemoryMap as
> +    reserved memory but must always be reported through ACPI as a
> +    motherboard resource.
> +
> +[7] PCI Express 3.0, sec 7.2.2:
> +    For systems that are PC-compatible, or that do not implement a
> +    processor-architecture-specific firmware interface standard that
> +    allows access to the Configuration Space, the ECAM is required as
> +    defined in this section.
>
>
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 66+ messages in thread

* RE: [PATCH] PCI: Add information about describing PCI in ACPI
  2016-11-21 20:10             ` Bjorn Helgaas
  (?)
  (?)
@ 2016-11-22 13:13               ` Gabriele Paoloni
  -1 siblings, 0 replies; 66+ messages in thread
From: Gabriele Paoloni @ 2016-11-22 13:13 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Bjorn Helgaas, linux-pci, linux-acpi, linux-kernel,
	linux-arm-kernel, linaro-acpi

Hi Bjorn

> -----Original Message-----
> From: Bjorn Helgaas [mailto:helgaas@kernel.org]
> Sent: 21 November 2016 20:10
> To: Gabriele Paoloni
> Cc: Bjorn Helgaas; linux-pci@vger.kernel.org; linux-acpi@vger.kernel.org;
>     linux-kernel@vger.kernel.org; linux-arm-kernel@lists.infradead.org;
>     linaro-acpi@lists.linaro.org
> Subject: Re: [PATCH] PCI: Add information about describing PCI in ACPI
> 
> On Mon, Nov 21, 2016 at 05:23:11PM +0000, Gabriele Paoloni wrote:
> > Hi Bjorn
> >
> > > -----Original Message-----
> > > From: linux-pci-owner@vger.kernel.org
> > >     [mailto:linux-pci-owner@vger.kernel.org] On Behalf Of Bjorn Helgaas
> > > Sent: 21 November 2016 16:47
> > > To: Gabriele Paoloni
> > > Cc: Bjorn Helgaas; linux-pci@vger.kernel.org; linux-acpi@vger.kernel.org;
> > >     linux-kernel@vger.kernel.org; linux-arm-kernel@lists.infradead.org;
> > >     linaro-acpi@lists.linaro.org
> > > Subject: Re: [PATCH] PCI: Add information about describing PCI in ACPI
> > >
> > > On Mon, Nov 21, 2016 at 08:52:52AM +0000, Gabriele Paoloni wrote:
> > > > Hi Bjorn
> > > >
> > > > > -----Original Message-----
> > > > > From: Bjorn Helgaas [mailto:helgaas@kernel.org]
> > > > > Sent: 18 November 2016 17:54
> > > > > To: Gabriele Paoloni
> > > > > Cc: Bjorn Helgaas; linux-pci@vger.kernel.org; linux-acpi@vger.kernel.org;
> > > > >     linux-kernel@vger.kernel.org; linux-arm-kernel@lists.infradead.org;
> > > > >     linaro-acpi@lists.linaro.org
> > > > > Subject: Re: [PATCH] PCI: Add information about describing PCI in ACPI
> > > > >
> > > > > On Fri, Nov 18, 2016 at 05:17:34PM +0000, Gabriele Paoloni wrote:
> > > > > > > -----Original Message-----
> > > > > > > From: linux-kernel-owner@vger.kernel.org
> > > > > > >     [mailto:linux-kernel-owner@vger.kernel.org] On Behalf Of Bjorn Helgaas
> > > > > > > Sent: 17 November 2016 18:00
> > > > >
> > > > > > > +Static tables like MCFG, HPET, ECDT, etc., are *not* mechanisms for
> > > > > > > +reserving address space!  The static tables are for things the OS
> > > > > > > +needs to know early in boot, before it can parse the ACPI namespace.
> > > > > > > +If a new table is defined, an old OS needs to operate correctly even
> > > > > > > +though it ignores the table.  _CRS allows that because it is generic
> > > > > > > +and understood by the old OS; a static table does not.
> > > > > >
> > > > > > Right so if my understanding is correct you are saying that resources
> > > > > > described in the MCFG table should also be declared in PNP0C02 devices
> > > > > > so that the PNP driver can reserve these resources.
> > > > >
> > > > > Yes.
> > > > >
> > > > > > On the other side the PCI Root bridge driver should not reserve such
> > > > > > resources.
> > > > > >
> > > > > > Well if my understanding is correct I think we have a problem here:
> > > > > > http://lxr.free-electrons.com/source/drivers/pci/ecam.c#L74
> > > > > >
> > > > > > As you can see pci_ecam_create() will conflict with the pnp driver
> > > > > > as it will try to reserve the resources from the MCFG table...
> > > > > >
> > > > > > Maybe we need to rework pci_ecam_create() ?
> > > > >
> > > > > I think it's OK as it is.
> > > > >
> > > > > The pnp/system.c driver does try to reserve PNP0C02 resources, and it
> > > > > marks them as "not busy".  That way they appear in /proc/iomem and
> > > > > won't be allocated for anything else, but they can still be requested
> > > > > by drivers, e.g., pci/ecam.c, which will mark them "busy".
> > > > >
> > > > > This is analogous to what the PCI core does in pci_claim_resource().
> > > > > This is really a function of the ACPI/PNP *core*, which should reserve
> > > > > all _CRS resources for all devices (not just PNP0C02 devices).  But
> > > > > it's done by pnp/system.c, and only for PNP0C02, because there's a
> > > > > bunch of historical baggage there.
> > > > >
> > > > > You'll also notice that in this case, things are out of order:
> > > > > logically the pnp/system.c reservation should happen first, but in
> > > > > fact the pci/ecam.c request happens *before* the pnp/system.c one.
> > > > > That means the pnp/system.c one might fail and complain "[mem ...]
> > > > > could not be reserved".
> > > >
> > > > Correct me if I am wrong...
> > > >
> > > > So currently we are relying on the fact that pci_ecam_create() is called
> > > > before the pnp driver.
> > > > If the pnp driver came first we would end up in pci_ecam_create() failing
> > > > here:
> > > > http://lxr.free-electrons.com/source/drivers/pci/ecam.c#L76
> > > >
> > > > I am not sure but it seems to me like a bit weak condition to rely on...
> > > > what about removing the error condition in pci_ecam_create() and logging
> > > > just a dev_info()?
> > >
> > > Huh.  I'm confused.  I *thought* it would be safe to reverse the
> > > order, which would effectively be this:
> > >
> > >   system_pnp_probe
> > >     reserve_resources_of_dev
> > >       reserve_range
> > >         request_mem_region([mem 0xb0000000-0xb1ffffff])
> > >   ...
> > >   pci_ecam_create
> > >     request_resource_conflict([mem 0xb0000000-0xb1ffffff])
> > >
> > >
> > > but I experimented with the patch below on qemu, and it failed as you
> > > predicted:
> > >
> > >   ** res test **
> > >   requested [mem 0xa0000000-0xafffffff]
> > >   can't claim ECAM area [mem 0xa0000000-0xafffffff]: conflict with ECAM PNP [mem 0xa0000000-0xafffffff]
> > >
> > > I expected the request_resource_conflict() to succeed since it's
> > > completely contained in the "ECAM PNP" region.  But I guess I don't
> > > understand kernel/resource.c well enough.
> >
> > I think it fails because effectively the PNP driver is populating the
> > iomem_resource resource tree and therefore pci_ecam_create() finds that
> > it cannot add the cfg resource to the same hierarchy as it is already
> > there...
> 
> Right.  I'm just surprised because the PNP reservation is marked
> "not busy", and a driver (e.g., ECAM) should still be able to request
> the resource.

Yes, unfortunately pci_ecam_create() is not as flexible about the conflict
as pci_request_regions():
http://lxr.free-electrons.com/source/kernel/resource.c#L1155
If the conflicting resource is not busy, pci_request_regions() will create
a child resource under the conflicting sibling and mark it as busy...

or at least this is my understanding...
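
To make the difference concrete, here is a small userspace model of the two request styles.  It is not kernel code and is heavily simplified (siblings are not kept sorted, no locking); struct res, RES_BUSY, strict_request() and nesting_request() are names invented for this sketch.  strict_request() stands in for what request_resource_conflict() does against iomem_resource, and nesting_request() for what I understand the region-request path behind request_mem_region() to do.

```c
#include <stddef.h>

#define RES_BUSY 0x1u	/* stands in for IORESOURCE_BUSY */

/* Simplified stand-in for struct resource. */
struct res {
	unsigned long start, end;	/* inclusive range */
	unsigned int flags;
	const char *name;
	struct res *parent, *child, *sibling;
};

/* Return the first direct child of 'parent' overlapping 'new', or NULL. */
static struct res *find_conflict(struct res *parent, const struct res *new)
{
	struct res *c;

	for (c = parent->child; c; c = c->sibling)
		if (c->start <= new->end && c->end >= new->start)
			return c;
	return NULL;
}

static void link_child(struct res *parent, struct res *new)
{
	new->parent = parent;
	new->sibling = parent->child;
	parent->child = new;
}

/* Strict style: any overlap directly under 'root' is a conflict,
 * whether or not the existing region is busy. */
static struct res *strict_request(struct res *root, struct res *new)
{
	struct res *conflict = find_conflict(root, new);

	if (!conflict)
		link_child(root, new);
	return conflict;
}

/* Nesting style: descend into a non-busy region that fully contains
 * 'new' and link it there.  Returns 0 on success, -1 if the overlap
 * is with a busy region or is only partial. */
static int nesting_request(struct res *root, struct res *new)
{
	struct res *parent = root;
	struct res *c;

	while ((c = find_conflict(parent, new)) != NULL) {
		if ((c->flags & RES_BUSY) ||
		    c->start > new->start || c->end < new->end)
			return -1;
		parent = c;
	}
	link_child(parent, new);
	return 0;
}
```

With a non-busy "pnp 00:01" region [mem 0xa0000000-0xafffffff] already in the tree, strict_request() for a busy "PCI ECAM" region of the same range hands back the PNP region as the conflict, while nesting_request() links "PCI ECAM" as a child of it.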

> 
> > > I'm not sure we need to fix anything yet, since we currently do the
> > > ecam.c request before the system.c one, and any change there would be
> > > a long ways off.  If/when that *does* change, I think the correct fix
> > > would be to change ecam.c so its request succeeds (by changing the way
> > > it does the request, fixing kernel/resource.c, or whatever) rather
> > > than to reduce the log level and ignore the failure.
> >
> > Well in my mind I didn't want just to make the error disappear...
> > If all the resources should be reserved by the PNP driver then ideally
> > we could take away request_resource_conflict() from pci_ecam_create(),
> > but this would make buggy some systems with an already shipped BIOS
> > that relied on pci_ecam_create() reservation rather than PNP reservation.
> 
> I don't want to remove the request from ecam.c.  Ideally, there should be
> TWO lines in /proc/iomem: one from system.c for "pnp 00:01" or
> whatever it is, and a second from ecam.c.  The first is the generic
> one saying "this region is consumed by a piece of hardware, so don't
> put anything else here."  The second is the driver-specific one saying
> "PCI ECAM owns this region, nobody else can use it."
> 
> This is the same way we handle PCI BAR resources.  Here are two
> examples from my laptop.  The first (00:08.0) only has one line:
> it has a BAR that consumes address space, but I don't have a driver
> for it loaded.  The second (00:16.0) does have a driver loaded, so it
> has a second line showing that the driver owns the space:
> 
>   f124a000-f124afff : 0000:00:08.0         # from PCI core
> 
>   f124d000-f124dfff : 0000:00:16.0         # from PCI core
>     f124d000-f124dfff : mei_me             # from mei_me driver
> 
> > Just removing the error condition and converting dev_err() into
> > dev_info() seems to me like accommodating already shipped BIOS images
> > and flagging a reservation that is already done by somebody else
> > without compromising the functionality of the PCI Root bridge driver
> > (so far the only reason why I can see the error condition there is
> > to catch a buggy MCFG with overlapping addresses; so if this is the
> > case maybe we need to have a different diagnostic check to make sure
> > that the MCFG table is alright)
> 
> Ideally I think we should end up with this:
> 
>   a0000000-afffffff : pnp 00:01
>     a0000000-afffffff : PCI ECAM

I think that for PCIe device drivers it works OK because their own
pci_request_regions() is guaranteed to be called after
pci_claim_resource() of the bridge that is on top of them...
I.e. pci_claim_resource() reserves the resources as not busy and
pci_request_regions() will then create a busy child resource.

> 
> Realistically right now we'll probably end up with only the "PCI ECAM"
> line in /proc/iomem and a warning from system.c about not being able
> to reserve the space.
> 
> If we ever change things to do the generic PNP reservation first, then
> we should fix things so ecam.c can claim the space without an error.

Maybe the patch below could be a sort of solution... effectively
pci_ecam_create() should succeed in reserving a busy resource under the
conflicting resource when the PNP driver has reserved a non-busy resource
first...

---
 drivers/pci/ecam.c                  | 16 +++++-----------
 drivers/pci/host/pci-thunder-ecam.c |  2 +-
 include/linux/pci-ecam.h            |  2 +-
 3 files changed, 7 insertions(+), 13 deletions(-)

diff --git a/drivers/pci/ecam.c b/drivers/pci/ecam.c
index 43ed08d..999b6ef 100644
--- a/drivers/pci/ecam.c
+++ b/drivers/pci/ecam.c
@@ -66,16 +66,10 @@ struct pci_config_window *pci_ecam_create(struct device *dev,
 	}
 	bsz = 1 << ops->bus_shift;
 
-	cfg->res.start = cfgres->start;
-	cfg->res.end = cfgres->end;
-	cfg->res.flags = IORESOURCE_MEM | IORESOURCE_BUSY;
-	cfg->res.name = "PCI ECAM";
-
-	conflict = request_resource_conflict(&iomem_resource, &cfg->res);
-	if (conflict) {
+	cfg->res = request_mem_region(cfgres->start, resource_size(cfgres), "PCI ECAM");
+	if (!cfg->res) {
 		err = -EBUSY;
-		dev_err(dev, "can't claim ECAM area %pR: address conflict with %s %pR\n",
-			&cfg->res, conflict->name, conflict);
+		dev_err(dev, "can't claim ECAM area %pR\n", cfgres);
 		goto err_exit;
 	}
 
@@ -126,8 +120,8 @@ void pci_ecam_free(struct pci_config_window *cfg)
 		if (cfg->win)
 			iounmap(cfg->win);
 	}
-	if (cfg->res.parent)
-		release_resource(&cfg->res);
+	if (cfg->res)
+		release_region(cfg->res->start, resource_size(cfg->res));
 	kfree(cfg);
 }
 
diff --git a/drivers/pci/host/pci-thunder-ecam.c b/drivers/pci/host/pci-thunder-ecam.c
index d50a3dc..2e48d9d 100644
--- a/drivers/pci/host/pci-thunder-ecam.c
+++ b/drivers/pci/host/pci-thunder-ecam.c
@@ -117,7 +117,7 @@ static int thunder_ecam_p2_config_read(struct pci_bus *bus, unsigned int devfn,
 	 * the config space access window.  Since we are working with
 	 * the high-order 32 bits, shift everything down by 32 bits.
 	 */
-	node_bits = (cfg->res.start >> 32) & (1 << 12);
+	node_bits = (cfg->res->start >> 32) & (1 << 12);
 
 	v |= node_bits;
 	set_val(v, where, size, val);
diff --git a/include/linux/pci-ecam.h b/include/linux/pci-ecam.h
index 7adad20..f30a4ea 100644
--- a/include/linux/pci-ecam.h
+++ b/include/linux/pci-ecam.h
@@ -36,7 +36,7 @@ struct pci_ecam_ops {
  * use ECAM.
  */
 struct pci_config_window {
-	struct resource			res;
+	struct resource			*res;
 	struct resource			busr;
 	void				*priv;
 	struct pci_ecam_ops		*ops;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 66+ messages in thread

* RE: [PATCH] PCI: Add information about describing PCI in ACPI
@ 2016-11-22 13:13               ` Gabriele Paoloni
  0 siblings, 0 replies; 66+ messages in thread
From: Gabriele Paoloni @ 2016-11-22 13:13 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Bjorn Helgaas, linux-pci, linux-acpi, linux-kernel,
	linux-arm-kernel, linaro-acpi

Hi Bjorn

> -----Original Message-----
> From: Bjorn Helgaas [mailto:helgaas@kernel.org]
> Sent: 21 November 2016 20:10
> To: Gabriele Paoloni
> Cc: Bjorn Helgaas; linux-pci@vger.kernel.org; linux-
> acpi@vger.kernel.org; linux-kernel@vger.kernel.org; linux-arm-
> kernel@lists.infradead.org; linaro-acpi@lists.linaro.org
> Subject: Re: [PATCH] PCI: Add information about describing PCI in ACPI
> 
> On Mon, Nov 21, 2016 at 05:23:11PM +0000, Gabriele Paoloni wrote:
> > Hi Bjorn
> >
> > > -----Original Message-----
> > > From: linux-pci-owner@vger.kernel.org [mailto:linux-pci-
> > > owner@vger.kernel.org] On Behalf Of Bjorn Helgaas
> > > Sent: 21 November 2016 16:47
> > > To: Gabriele Paoloni
> > > Cc: Bjorn Helgaas; linux-pci@vger.kernel.org; linux-
> > > acpi@vger.kernel.org; linux-kernel@vger.kernel.org; linux-arm-
> > > kernel@lists.infradead.org; linaro-acpi@lists.linaro.org
> > > Subject: Re: [PATCH] PCI: Add information about describing PCI in ACPI
> > >
> > > On Mon, Nov 21, 2016 at 08:52:52AM +0000, Gabriele Paoloni wrote:
> > > > Hi Bjorn
> > > >
> > > > > -----Original Message-----
> > > > > From: Bjorn Helgaas [mailto:helgaas@kernel.org]
> > > > > Sent: 18 November 2016 17:54
> > > > > To: Gabriele Paoloni
> > > > > Cc: Bjorn Helgaas; linux-pci@vger.kernel.org; linux-
> > > > > acpi@vger.kernel.org; linux-kernel@vger.kernel.org; linux-arm-
> > > > > kernel@lists.infradead.org; linaro-acpi@lists.linaro.org
> > > > > Subject: Re: [PATCH] PCI: Add information about describing PCI in ACPI
> > > > >
> > > > > On Fri, Nov 18, 2016 at 05:17:34PM +0000, Gabriele Paoloni wrote:
> > > > > > > -----Original Message-----
> > > > > > > From: linux-kernel-owner@vger.kernel.org [mailto:linux-kernel-
> > > > > > > owner@vger.kernel.org] On Behalf Of Bjorn Helgaas
> > > > > > > Sent: 17 November 2016 18:00
> > > > >
> > > > > > > +Static tables like MCFG, HPET, ECDT, etc., are *not* mechanisms for
> > > > > > > +reserving address space!  The static tables are for things the OS
> > > > > > > +needs to know early in boot, before it can parse the ACPI namespace.
> > > > > > > +If a new table is defined, an old OS needs to operate correctly even
> > > > > > > +though it ignores the table.  _CRS allows that because it is generic
> > > > > > > +and understood by the old OS; a static table does not.
> > > > > >
> > > > > > Right so if my understanding is correct you are saying that resources
> > > > > > described in the MCFG table should also be declared in PNP0C02 devices
> > > > > > so that the PNP driver can reserve these resources.
> > > > >
> > > > > Yes.
> > > > >
> > > > > > On the other side the PCI Root bridge driver should not reserve such
> > > > > > resources.
> > > > > >
> > > > > > Well if my understanding is correct I think we have a problem here:
> > > > > > http://lxr.free-electrons.com/source/drivers/pci/ecam.c#L74
> > > > > >
> > > > > > As you can see pci_ecam_create() will conflict with the pnp driver
> > > > > > as it will try to reserve the resources from the MCFG table...
> > > > > >
> > > > > > Maybe we need to rework pci_ecam_create() ?
> > > > >
> > > > > I think it's OK as it is.
> > > > >
> > > > > The pnp/system.c driver does try to reserve PNP0C02 resources, and it
> > > > > marks them as "not busy".  That way they appear in /proc/iomem and
> > > > > won't be allocated for anything else, but they can still be requested
> > > > > by drivers, e.g., pci/ecam.c, which will mark them "busy".
> > > > >
> > > > > This is analogous to what the PCI core does in pci_claim_resource().
> > > > > This is really a function of the ACPI/PNP *core*, which should reserve
> > > > > all _CRS resources for all devices (not just PNP0C02 devices).  But
> > > > > it's done by pnp/system.c, and only for PNP0C02, because there's a
> > > > > bunch of historical baggage there.
> > > > >
> > > > > You'll also notice that in this case, things are out of order:
> > > > > logically the pnp/system.c reservation should happen first, but in
> > > > > fact the pci/ecam.c request happens *before* the pnp/system.c one.
> > > > > That means the pnp/system.c one might fail and complain "[mem ...]
> > > > > could not be reserved".
> > > >
> > > > Correct me if I am wrong...
> > > >
> > > > So currently we are relying on the fact that pci_ecam_create() is called
> > > > before the pnp driver.
> > > > If the pnp driver came first we would end up in pci_ecam_create() failing
> > > > here:
> > > > http://lxr.free-electrons.com/source/drivers/pci/ecam.c#L76
> > > >
> > > > I am not sure but it seems to me like a bit weak condition to rely on...
> > > > what about removing the error condition in pci_ecam_create() and logging
> > > > just a dev_info()?
> > >
> > > Huh.  I'm confused.  I *thought* it would be safe to reverse the
> > > order, which would effectively be this:
> > >
> > >   system_pnp_probe
> > >     reserve_resources_of_dev
> > >       reserve_range
> > >         request_mem_region([mem 0xb0000000-0xb1ffffff])
> > >   ...
> > >   pci_ecam_create
> > >     request_resource_conflict([mem 0xb0000000-0xb1ffffff])
> > >
> > >
> > > but I experimented with the patch below on qemu, and it failed as you
> > > predicted:
> > >
> > >   ** res test **
> > >   requested [mem 0xa0000000-0xafffffff]
> > >   can't claim ECAM area [mem 0xa0000000-0xafffffff]: conflict with ECAM PNP [mem 0xa0000000-0xafffffff]
> > >
> > > I expected the request_resource_conflict() to succeed since it's
> > > completely contained in the "ECAM PNP" region.  But I guess I don't
> > > understand kernel/resource.c well enough.
> >
> > I think it fails because effectively the PNP driver is populating the
> > iomem_resource resource tree and therefore pci_ecam_create() finds that
> > it cannot add the cfg resource to the same hierarchy as it is already
> > there...
> 
> Right.  I'm just surprised because the PNP reservation is marked
> "not busy", and a driver (e.g., ECAM) should still be able to request
> the resource.

Yes, unfortunately pci_ecam_create() is not as flexible about the
conflict as pci_request_regions():
http://lxr.free-electrons.com/source/kernel/resource.c#L1155
If the conflicting resource is not busy, pci_request_regions() will create
a child resource under the conflicting sibling and mark it as busy...

or at least this is my understanding...
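
As a rough illustration of the difference (a toy user-space model, not the
kernel code: every toy_* name below is made up for this sketch, and only
the "walk the siblings" and "descend into a non-busy conflict" logic is
modelled, loosely following __request_resource() and __request_region()
in kernel/resource.c):

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_BUSY 0x1u	/* stands in for IORESOURCE_BUSY */

/* Hypothetical, stripped-down stand-in for struct resource. */
struct toy_res {
	unsigned long start, end;
	unsigned int flags;
	const char *name;
	struct toy_res *parent, *sibling, *child;
};

/*
 * Try to insert "res" directly below "root".  Returns NULL on success,
 * or the resource it collides with ("root" itself if "res" does not fit
 * inside "root").  It never looks below a colliding sibling.
 */
struct toy_res *toy_request_conflict(struct toy_res *root, struct toy_res *res)
{
	struct toy_res **p;

	assert(root && res);
	if (res->end < res->start || res->start < root->start ||
	    res->end > root->end)
		return root;

	p = &root->child;
	for (;;) {
		struct toy_res *tmp = *p;

		if (!tmp || tmp->start > res->end) {
			res->sibling = tmp;
			res->parent = root;
			*p = res;
			return NULL;
		}
		p = &tmp->sibling;
		if (tmp->end < res->start)
			continue;	/* entirely below "res", keep walking */
		return tmp;		/* overlap */
	}
}

/*
 * Allocate a busy region.  On a conflict with a resource that is NOT
 * busy, retry one level down with that resource as the new parent.
 * Returns NULL if the range is already busy or does not fit.
 */
struct toy_res *toy_request_region(struct toy_res *parent, unsigned long start,
				   unsigned long n, const char *name)
{
	struct toy_res *res = calloc(1, sizeof(*res));

	if (!res)
		return NULL;
	res->start = start;
	res->end = start + n - 1;
	res->flags = TOY_BUSY;
	res->name = name;

	for (;;) {
		struct toy_res *conflict = toy_request_conflict(parent, res);

		if (!conflict)
			return res;
		if (conflict != parent && !(conflict->flags & TOY_BUSY)) {
			parent = conflict;	/* descend and try again */
			continue;
		}
		free(res);
		return NULL;
	}
}
```

With a non-busy "pnp 00:01" node registered first, toy_request_conflict()
for the same range returns that node as the conflict, while
toy_request_region() ends up with a busy child below it.  In the opposite
order (busy ECAM node first), a later toy_request_region() for the same
range returns NULL, which corresponds to the "could not be reserved"
warning from system.c.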

> 
> > > I'm not sure we need to fix anything yet, since we currently do the
> > > ecam.c request before the system.c one, and any change there would be
> > > a long ways off.  If/when that *does* change, I think the correct fix
> > > would be to change ecam.c so its request succeeds (by changing the way
> > > it does the request, fixing kernel/resource.c, or whatever) rather
> > > than to reduce the log level and ignore the failure.
> >
> > Well in my mind I didn't want just to make the error disappear...
> > If all the resources should be reserved by the PNP driver then ideally
> > we could take away request_resource_conflict() from pci_ecam_create(),
> > but this would make buggy some systems with an already shipped BIOS
> > that relied on pci_ecam_create() reservation rather than PNP reservation.
> 
> I don't want to remove the request from ecam.c.  Ideally, there should be
> TWO lines in /proc/iomem: one from system.c for "pnp 00:01" or
> whatever it is, and a second from ecam.c.  The first is the generic
> one saying "this region is consumed by a piece of hardware, so don't
> put anything else here."  The second is the driver-specific one saying
> "PCI ECAM owns this region, nobody else can use it."
> 
> This is the same way we handle PCI BAR resources.  Here are two
> examples from my laptop.  The first (00:08.0) only has one line:
> it has a BAR that consumes address space, but I don't have a driver
> for it loaded.  The second (00:16.0) does have a driver loaded, so it
> has a second line showing that the driver owns the space:
> 
>   f124a000-f124afff : 0000:00:08.0         # from PCI core
> 
>   f124d000-f124dfff : 0000:00:16.0         # from PCI core
>     f124d000-f124dfff : mei_me             # from mei_me driver
> 
> > Just removing the error condition and converting dev_err() into
> > dev_info() seems to me like accommodating already shipped BIOS images
> > and flagging a reservation that is already done by somebody else
> > without compromising the functionality of the PCI Root bridge driver
> > (so far the only reason why I can see the error condition there is
> > to catch a buggy MCFG with overlapping addresses; so if this is the
> > case maybe we need to have a different diagnostic check to make sure
> > that the MCFG table is alright)
> 
> Ideally I think we should end up with this:
> 
>   a0000000-afffffff : pnp 00:01
>     a0000000-afffffff : PCI ECAM

I think that for PCIe device drivers it works OK because their own
pci_request_regions() is guaranteed to be called after
pci_claim_resource() of the bridge that is on top of them...
I.e. pci_claim_resource() reserves the resources as not busy and
pci_request_regions() will then create a busy child resource.

> 
> Realistically right now we'll probably end up with only the "PCI ECAM"
> line in /proc/iomem and a warning from system.c about not being able
> to reserve the space.
> 
> If we ever change things to do the generic PNP reservation first, then
> we should fix things so ecam.c can claim the space without an error.

Maybe the patch below could be a sort of solution... effectively
pci_ecam_create() should succeed in reserving a busy resource under the
conflicting resource when the PNP driver has reserved a non-busy resource
first...

---
 drivers/pci/ecam.c                  | 16 +++++-----------
 drivers/pci/host/pci-thunder-ecam.c |  2 +-
 include/linux/pci-ecam.h            |  2 +-
 3 files changed, 7 insertions(+), 13 deletions(-)

diff --git a/drivers/pci/ecam.c b/drivers/pci/ecam.c
index 43ed08d..999b6ef 100644
--- a/drivers/pci/ecam.c
+++ b/drivers/pci/ecam.c
@@ -66,16 +66,10 @@ struct pci_config_window *pci_ecam_create(struct device *dev,
 	}
 	bsz = 1 << ops->bus_shift;
 
-	cfg->res.start = cfgres->start;
-	cfg->res.end = cfgres->end;
-	cfg->res.flags = IORESOURCE_MEM | IORESOURCE_BUSY;
-	cfg->res.name = "PCI ECAM";
-
-	conflict = request_resource_conflict(&iomem_resource, &cfg->res);
-	if (conflict) {
+	cfg->res = request_mem_region(cfgres->start, resource_size(cfgres), "PCI ECAM");
+	if (!cfg->res) {
 		err = -EBUSY;
-		dev_err(dev, "can't claim ECAM area %pR: address conflict with %s %pR\n",
-			&cfg->res, conflict->name, conflict);
+		dev_err(dev, "can't claim ECAM area %pR\n", cfgres);
 		goto err_exit;
 	}
 
@@ -126,8 +120,8 @@ void pci_ecam_free(struct pci_config_window *cfg)
 		if (cfg->win)
 			iounmap(cfg->win);
 	}
-	if (cfg->res.parent)
-		release_resource(&cfg->res);
+	if (cfg->res)
+		release_region(cfg->res->start, resource_size(cfg->res));
 	kfree(cfg);
 }
 
diff --git a/drivers/pci/host/pci-thunder-ecam.c b/drivers/pci/host/pci-thunder-ecam.c
index d50a3dc..2e48d9d 100644
--- a/drivers/pci/host/pci-thunder-ecam.c
+++ b/drivers/pci/host/pci-thunder-ecam.c
@@ -117,7 +117,7 @@ static int thunder_ecam_p2_config_read(struct pci_bus *bus, unsigned int devfn,
 	 * the config space access window.  Since we are working with
 	 * the high-order 32 bits, shift everything down by 32 bits.
 	 */
-	node_bits = (cfg->res.start >> 32) & (1 << 12);
+	node_bits = (cfg->res->start >> 32) & (1 << 12);
 
 	v |= node_bits;
 	set_val(v, where, size, val);
diff --git a/include/linux/pci-ecam.h b/include/linux/pci-ecam.h
index 7adad20..f30a4ea 100644
--- a/include/linux/pci-ecam.h
+++ b/include/linux/pci-ecam.h
@@ -36,7 +36,7 @@ struct pci_ecam_ops {
  * use ECAM.
  */
 struct pci_config_window {
-	struct resource			res;
+	struct resource			*res;
 	struct resource			busr;
 	void				*priv;
 	struct pci_ecam_ops		*ops;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 66+ messages in thread

* [PATCH] PCI: Add information about describing PCI in ACPI
@ 2016-11-22 13:13               ` Gabriele Paoloni
  0 siblings, 0 replies; 66+ messages in thread
From: Gabriele Paoloni @ 2016-11-22 13:13 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Bjorn

> -----Original Message-----
> From: Bjorn Helgaas [mailto:helgaas at kernel.org]
> Sent: 21 November 2016 20:10
> To: Gabriele Paoloni
> Cc: Bjorn Helgaas; linux-pci at vger.kernel.org; linux-
> acpi at vger.kernel.org; linux-kernel at vger.kernel.org; linux-arm-
> kernel at lists.infradead.org; linaro-acpi at lists.linaro.org
> Subject: Re: [PATCH] PCI: Add information about describing PCI in ACPI
> 
> On Mon, Nov 21, 2016 at 05:23:11PM +0000, Gabriele Paoloni wrote:
> > Hi Bjorn
> >
> > > -----Original Message-----
> > > From: linux-pci-owner at vger.kernel.org [mailto:linux-pci-
> > > owner at vger.kernel.org] On Behalf Of Bjorn Helgaas
> > > Sent: 21 November 2016 16:47
> > > To: Gabriele Paoloni
> > > Cc: Bjorn Helgaas; linux-pci at vger.kernel.org; linux-
> > > acpi at vger.kernel.org; linux-kernel at vger.kernel.org; linux-arm-
> > > kernel at lists.infradead.org; linaro-acpi at lists.linaro.org
> > > Subject: Re: [PATCH] PCI: Add information about describing PCI in ACPI
> > >
> > > On Mon, Nov 21, 2016 at 08:52:52AM +0000, Gabriele Paoloni wrote:
> > > > Hi Bjorn
> > > >
> > > > > -----Original Message-----
> > > > > From: Bjorn Helgaas [mailto:helgaas at kernel.org]
> > > > > Sent: 18 November 2016 17:54
> > > > > To: Gabriele Paoloni
> > > > > Cc: Bjorn Helgaas; linux-pci at vger.kernel.org; linux-
> > > > > acpi at vger.kernel.org; linux-kernel at vger.kernel.org; linux-arm-
> > > > > kernel at lists.infradead.org; linaro-acpi at lists.linaro.org
> > > > > Subject: Re: [PATCH] PCI: Add information about describing PCI in ACPI
> > > > >
> > > > > On Fri, Nov 18, 2016 at 05:17:34PM +0000, Gabriele Paoloni wrote:
> > > > > > > -----Original Message-----
> > > > > > > From: linux-kernel-owner at vger.kernel.org [mailto:linux-kernel-
> > > > > > > owner at vger.kernel.org] On Behalf Of Bjorn Helgaas
> > > > > > > Sent: 17 November 2016 18:00
> > > > >
> > > > > > > +Static tables like MCFG, HPET, ECDT, etc., are *not* mechanisms for
> > > > > > > +reserving address space!  The static tables are for things the OS
> > > > > > > +needs to know early in boot, before it can parse the ACPI namespace.
> > > > > > > +If a new table is defined, an old OS needs to operate correctly even
> > > > > > > +though it ignores the table.  _CRS allows that because it is generic
> > > > > > > +and understood by the old OS; a static table does not.
> > > > > >
> > > > > > Right so if my understanding is correct you are saying that resources
> > > > > > described in the MCFG table should also be declared in PNP0C02 devices
> > > > > > so that the PNP driver can reserve these resources.
> > > > >
> > > > > Yes.
> > > > >
> > > > > > On the other side the PCI Root bridge driver should not reserve such
> > > > > > resources.
> > > > > >
> > > > > > Well if my understanding is correct I think we have a problem here:
> > > > > > http://lxr.free-electrons.com/source/drivers/pci/ecam.c#L74
> > > > > >
> > > > > > As you can see pci_ecam_create() will conflict with the pnp driver
> > > > > > as it will try to reserve the resources from the MCFG table...
> > > > > >
> > > > > > Maybe we need to rework pci_ecam_create() ?
> > > > >
> > > > > I think it's OK as it is.
> > > > >
> > > > > The pnp/system.c driver does try to reserve PNP0C02 resources, and it
> > > > > marks them as "not busy".  That way they appear in /proc/iomem and
> > > > > won't be allocated for anything else, but they can still be requested
> > > > > by drivers, e.g., pci/ecam.c, which will mark them "busy".
> > > > >
> > > > > This is analogous to what the PCI core does in pci_claim_resource().
> > > > > This is really a function of the ACPI/PNP *core*, which should reserve
> > > > > all _CRS resources for all devices (not just PNP0C02 devices).  But
> > > > > it's done by pnp/system.c, and only for PNP0C02, because there's a
> > > > > bunch of historical baggage there.
> > > > >
> > > > > You'll also notice that in this case, things are out of order:
> > > > > logically the pnp/system.c reservation should happen first, but in
> > > > > fact the pci/ecam.c request happens *before* the pnp/system.c one.
> > > > > That means the pnp/system.c one might fail and complain "[mem ...]
> > > > > could not be reserved".
> > > >
> > > > > Correct me if I am wrong...
> > > > >
> > > > > So currently we are relying on the fact that pci_ecam_create() is called
> > > > > before the pnp driver.
> > > > > If the pnp driver came first we would end up in pci_ecam_create() failing
> > > > > here:
> > > > > http://lxr.free-electrons.com/source/drivers/pci/ecam.c#L76
> > > > >
> > > > > I am not sure but it seems to me like a bit weak condition to rely on...
> > > > > what about removing the error condition in pci_ecam_create() and logging
> > > > > just a dev_info()?
> > >
> > > Huh.  I'm confused.  I *thought* it would be safe to reverse the
> > > order, which would effectively be this:
> > >
> > >   system_pnp_probe
> > >     reserve_resources_of_dev
> > >       reserve_range
> > >         request_mem_region([mem 0xb0000000-0xb1ffffff])
> > >   ...
> > >   pci_ecam_create
> > >     request_resource_conflict([mem 0xb0000000-0xb1ffffff])
> > >
> > >
> > > > but I experimented with the patch below on qemu, and it failed as you
> > > > predicted:
> > > >
> > > >   ** res test **
> > > >   requested [mem 0xa0000000-0xafffffff]
> > > >   can't claim ECAM area [mem 0xa0000000-0xafffffff]: conflict with ECAM PNP [mem 0xa0000000-0xafffffff]
> > >
> > > I expected the request_resource_conflict() to succeed since it's
> > > completely contained in the "ECAM PNP" region.  But I guess I don't
> > > understand kernel/resource.c well enough.
> >
> > > I think it fails because effectively the PNP driver is populating the
> > > iomem_resource resource tree and therefore pci_ecam_create() finds that
> > > it cannot add the cfg resource to the same hierarchy as it is already
> > > there...
> 
> Right.  I'm just surprised because the PNP reservation is marked
> "not busy", and a driver (e.g., ECAM) should still be able to request
> the resource.

Yes, unfortunately pci_ecam_create() is not as flexible about conflicts as
pci_request_regions():
http://lxr.free-electrons.com/source/kernel/resource.c#L1155
If the conflicting resource is not busy, pci_request_regions() will create
a child resource under the conflicting resource and mark it as busy...

or at least this is my understanding...
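
To make the difference concrete, here is a minimal userspace sketch in C.
It is a toy model and not kernel code: struct toy_res, toy_request_strict()
and toy_request_nested() are hypothetical names invented for illustration,
and the sibling list is unsorted, unlike the real resource tree.  The strict
variant rejects any overlap with an existing entry, roughly the behaviour
seen from pci_ecam_create() in this thread; the nested variant descends into
a non-busy entry that fully contains the request, which is my understanding
of what the region-request path does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for the kernel's struct resource (hypothetical, simplified). */
struct toy_res {
	uint64_t start, end;		/* inclusive address range */
	bool busy;			/* models IORESOURCE_BUSY */
	const char *name;
	struct toy_res *parent, *child, *sibling;
};

static bool toy_overlaps(const struct toy_res *a, const struct toy_res *b)
{
	return a->start <= b->end && b->start <= a->end;
}

static bool toy_contains(const struct toy_res *outer, const struct toy_res *inner)
{
	return outer->start <= inner->start && inner->end <= outer->end;
}

/* Attach new as a child of parent (unsorted prepend, unlike the kernel). */
static void toy_link(struct toy_res *parent, struct toy_res *new)
{
	assert(toy_contains(parent, new));
	new->parent = parent;
	new->sibling = parent->child;
	parent->child = new;
}

/* Strict policy: any overlap with a direct child of root is a conflict.
 * Returns the conflicting entry, or NULL on success. */
static struct toy_res *toy_request_strict(struct toy_res *root, struct toy_res *new)
{
	for (struct toy_res *c = root->child; c; c = c->sibling)
		if (toy_overlaps(c, new))
			return c;
	toy_link(root, new);
	return NULL;
}

/* Nesting policy: descend into a non-busy entry that fully contains the
 * request; a busy or only partially overlapping entry is still a conflict. */
static struct toy_res *toy_request_nested(struct toy_res *root, struct toy_res *new)
{
	struct toy_res *parent = root;
	struct toy_res *c = parent->child;

	while (c) {
		if (!toy_overlaps(c, new)) {
			c = c->sibling;
		} else if (!c->busy && toy_contains(c, new)) {
			parent = c;		/* retry one level down */
			c = parent->child;
		} else {
			return c;
		}
	}
	toy_link(parent, new);
	return NULL;
}
```

With a non-busy "pnp 00:01" entry already in the tree, a strict request for
the same range reports the PNP entry as the conflict, while the nested
request succeeds and ends up as a busy child of it, which matches the
two-line /proc/iomem layout Bjorn describes.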

> 
> > > > I'm not sure we need to fix anything yet, since we currently do the
> > > > ecam.c request before the system.c one, and any change there would be
> > > > a long ways off.  If/when that *does* change, I think the correct fix
> > > > would be to change ecam.c so its request succeeds (by changing the way
> > > > it does the request, fixing kernel/resource.c, or whatever) rather
> > > > than to reduce the log level and ignore the failure.
> > >
> > > Well in my mind I didn't want just to make the error disappear...
> > > If all the resources should be reserved by the PNP driver then ideally
> > > we could take away request_resource_conflict() from pci_ecam_create(),
> > > but this would make buggy some systems with an already shipped BIOS
> > > that relied on pci_ecam_create() reservation rather than PNP reservation.
> 
> I don't want to remove the request from ecam.c.  Ideally, there should be
> TWO lines in /proc/iomem: one from system.c for "pnp 00:01" or
> whatever it is, and a second from ecam.c.  The first is the generic
> one saying "this region is consumed by a piece of hardware, so don't
> put anything else here."  The second is the driver-specific one saying
> "PCI ECAM owns this region, nobody else can use it."
> 
> This is the same way we handle PCI BAR resources.  Here are two
> examples from my laptop.  The first (00:08.0) only has one line:
> it has a BAR that consumes address space, but I don't have a driver
> for it loaded.  The second (00:16.0) does have a driver loaded, so it
> has a second line showing that the driver owns the space:
> 
>   f124a000-f124afff : 0000:00:08.0         # from PCI core
> 
>   f124d000-f124dfff : 0000:00:16.0         # from PCI core
>     f124d000-f124dfff : mei_me             # from mei_me driver
> 
> > Just removing the error condition and converting dev_err() into
> > dev_info() seems to me like accommodating already shipped BIOS images
> > and flagging a reservation that is already done by somebody else
> > without compromising the functionality of the PCI Root bridge driver
> > (so far the only reason why I can see the error condition there is
> > to catch a buggy MCFG with overlapping addresses; so if this is the
> > case maybe we need to have a different diagnostic check to make sure
> > that the MCFG table is alright)
> 
> Ideally I think we should end up with this:
> 
>   a0000000-afffffff : pnp 00:01
>     a0000000-afffffff : PCI ECAM

I think that for PCIe device drivers it works OK because their own
pci_request_regions() is guaranteed to be called after the
pci_claim_resource() of the bridge that is on top of them...
I.e. pci_claim_resource() reserves the resources as not busy and
pci_request_regions() then creates a busy child resource.

> 
> Realistically right now we'll probably end up with only the "PCI ECAM"
> line in /proc/iomem and a warning from system.c about not being able
> to reserve the space.
> 
> If we ever change things to do the generic PNP reservation first, then
> we should fix things so ecam.c can claim the space without an error.

Maybe the patch below could be a sort of solution... effectively pci_ecam
should succeed in reserving a busy resource under the conflicting resource
when the PNP driver has already reserved a non-busy resource...

---
 drivers/pci/ecam.c                  | 16 +++++-----------
 drivers/pci/host/pci-thunder-ecam.c |  2 +-
 include/linux/pci-ecam.h            |  2 +-
 3 files changed, 7 insertions(+), 13 deletions(-)

diff --git a/drivers/pci/ecam.c b/drivers/pci/ecam.c
index 43ed08d..999b6ef 100644
--- a/drivers/pci/ecam.c
+++ b/drivers/pci/ecam.c
@@ -66,16 +66,10 @@ struct pci_config_window *pci_ecam_create(struct device *dev,
 	}
 	bsz = 1 << ops->bus_shift;
 
-	cfg->res.start = cfgres->start;
-	cfg->res.end = cfgres->end;
-	cfg->res.flags = IORESOURCE_MEM | IORESOURCE_BUSY;
-	cfg->res.name = "PCI ECAM";
-
-	conflict = request_resource_conflict(&iomem_resource, &cfg->res);
-	if (conflict) {
+	cfg->res = request_mem_region(cfgres->start, resource_size(cfgres), "PCI ECAM");
+	if (!cfg->res) {
 		err = -EBUSY;
-		dev_err(dev, "can't claim ECAM area %pR: address conflict with %s %pR\n",
-			&cfg->res, conflict->name, conflict);
+		dev_err(dev, "can't claim ECAM area %pR\n", cfgres);
 		goto err_exit;
 	}
 
@@ -126,8 +120,8 @@ void pci_ecam_free(struct pci_config_window *cfg)
 		if (cfg->win)
 			iounmap(cfg->win);
 	}
-	if (cfg->res.parent)
-		release_resource(&cfg->res);
+	if (cfg->res)
+		release_region(cfg->res->start, resource_size(cfg->res));
 	kfree(cfg);
 }
 
diff --git a/drivers/pci/host/pci-thunder-ecam.c b/drivers/pci/host/pci-thunder-ecam.c
index d50a3dc..2e48d9d 100644
--- a/drivers/pci/host/pci-thunder-ecam.c
+++ b/drivers/pci/host/pci-thunder-ecam.c
@@ -117,7 +117,7 @@ static int thunder_ecam_p2_config_read(struct pci_bus *bus, unsigned int devfn,
 	 * the config space access window.  Since we are working with
 	 * the high-order 32 bits, shift everything down by 32 bits.
 	 */
-	node_bits = (cfg->res.start >> 32) & (1 << 12);
+	node_bits = (cfg->res->start >> 32) & (1 << 12);
 
 	v |= node_bits;
 	set_val(v, where, size, val);
diff --git a/include/linux/pci-ecam.h b/include/linux/pci-ecam.h
index 7adad20..f30a4ea 100644
--- a/include/linux/pci-ecam.h
+++ b/include/linux/pci-ecam.h
@@ -36,7 +36,7 @@ struct pci_ecam_ops {
  * use ECAM.
  */
 struct pci_config_window {
-	struct resource			res;
+	struct resource			*res;
 	struct resource			busr;
 	void				*priv;
 	struct pci_ecam_ops		*ops;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 66+ messages in thread

* Re: [PATCH] PCI: Add information about describing PCI in ACPI
  2016-11-22 10:09   ` Ard Biesheuvel
  (?)
@ 2016-11-23  1:06     ` Bjorn Helgaas
  -1 siblings, 0 replies; 66+ messages in thread
From: Bjorn Helgaas @ 2016-11-23  1:06 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Bjorn Helgaas, linux-pci, linux-acpi, linux-kernel,
	linux-arm-kernel, linaro-acpi

On Tue, Nov 22, 2016 at 10:09:50AM +0000, Ard Biesheuvel wrote:
> On 17 November 2016 at 17:59, Bjorn Helgaas <bhelgaas@google.com> wrote:

> > +PCI host bridges are PNP0A03 or PNP0A08 devices.  Their _CRS should
> > +describe all the address space they consume.  In principle, this would
> > +be all the windows they forward down to the PCI bus, as well as the
> > +bridge registers themselves.  The bridge registers include things like
> > +secondary/subordinate bus registers that determine the bus range below
> > +the bridge, window registers that describe the apertures, etc.  These
> > +are all device-specific, non-architected things, so the only way a
> > +PNP0A03/PNP0A08 driver can manage them is via _PRS/_CRS/_SRS, which
> > +contain the device-specific details.  These bridge registers also
> > +include ECAM space, since it is consumed by the bridge.
> > +
> > +ACPI defined a Producer/Consumer bit that was intended to distinguish
> > +the bridge apertures from the bridge registers [4, 5].  However,
> > +BIOSes didn't use that bit correctly, and the result is that OSes have
> > +to assume that everything in a PCI host bridge _CRS is a window.  That
> > +leaves no way to describe the bridge registers in the PNP0A03/PNP0A08
> > +device itself.
> 
> Is that universally true? Or is it still possible to do the right
> thing here on new ACPI architectures such as arm64?

That's a very good question.  I had thought that the ACPI spec had
given up on Consumer/Producer completely, but I was wrong.  In the 6.0
spec, the Consumer/Producer bit is still documented in the Extended
Address Space Descriptor (sec 6.4.3.5.4).  It is documented as
"ignored" in the QWord, DWord, and Word descriptors (sec 6.4.3.5.1,2,3).

Linux looks at the producer_consumer bit in acpi_decode_space(), which
I think is used for all these descriptors (QWord, DWord, Word, and
Extended).  This doesn't quite follow the spec -- we probably should
ignore it except for Extended.  In any event, acpi_decode_space() sets
IORESOURCE_WINDOW for Producer descriptors, but we don't test
IORESOURCE_WINDOW in the PCI host bridge code.

x86 and ia64 supply their own pci_acpi_root_prepare_resources()
functions that call acpi_pci_probe_root_resources(), which parses _CRS
and looks at producer_consumer.  Then they do a little arch-specific
stuff on the result.

On arm64 we use acpi_pci_probe_root_resources() directly, with no
arch-specific stuff.

On all three arches, we ignore the Consumer/Producer bit, so all the
resources are treated as Producers, i.e., as bridge windows.

I think we *could* implement an arm64 version of
pci_acpi_root_prepare_resources() that would pay attention to the
Consumer/Producer bit by checking IORESOURCE_WINDOW.  To be spec
compliant, we would have to use Extended descriptors for all bridge
windows, even if they would fit in a DWord or QWord.
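
As a rough illustration of that idea, here is a small userspace C sketch of
the classification step.  It is not kernel or ACPICA code: the toy_* types
and functions are hypothetical, and it assumes that when the bit is honored,
Word/DWord/QWord descriptors (where the spec documents the bit as "ignored")
are treated as consumed-only, so only Extended descriptors with bit 0 clear
count as windows.  That policy is one reading of the paragraph above, not
something the spec or the current code spells out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor kinds for this sketch; not the ACPICA types. */
enum toy_desc_type { TOY_WORD, TOY_DWORD, TOY_QWORD, TOY_EXTENDED };

/* General Flags bit 0: 1 = device consumes the resource,
 * 0 = device produces and consumes it (i.e. a bridge window). */
#define TOY_CONSUMER_BIT 0x01

struct toy_addr_desc {
	enum toy_desc_type type;
	uint8_t general_flags;
	uint64_t min, max;	/* inclusive address range */
};

/* Decide whether a _CRS entry should be treated as a bridge window. */
static bool toy_is_window(const struct toy_addr_desc *d, bool honor_bit)
{
	/* Behaviour described for x86/ia64/arm64: everything is a window. */
	if (!honor_bit)
		return true;

	/* Assumption: the bit is only meaningful in Extended descriptors. */
	if (d->type != TOY_EXTENDED)
		return false;

	return (d->general_flags & TOY_CONSUMER_BIT) == 0;
}

/* Count how many entries of a _CRS-like array would become windows. */
static size_t toy_count_windows(const struct toy_addr_desc *crs, size_t n,
				bool honor_bit)
{
	size_t windows = 0;

	for (size_t i = 0; i < n; i++)
		if (toy_is_window(&crs[i], honor_bit))
			windows++;
	return windows;
}
```

Under this policy a _CRS that mixes an Extended producer range, an Extended
consumer range (e.g. bridge registers) and a QWord range would yield three
windows today but only one if the bit were honored, which shows why firmware
would need to use Extended descriptors for every window.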

Should we do that?  I dunno.  I'd like to hear your opinion(s).

It *would* be nice to have bridge registers in the bridge _CRS.  That
would eliminate the need for looking up the HISI0081/PNP0C02 devices
to find the bridge registers.  Avoiding that lookup is only a
temporary advantage -- the next round of bridges are supposed to fully
implement ECAM, and then we won't need to know where the registers
are.

Apart from the lookup, there's still some advantage in describing the
registers in the PNP0A03 device instead of an unrelated PNP0C02
device, because it makes /proc/iomem more accurate and potentially
makes host bridge hotplug cleaner.  We would have to enhance the host
bridge driver to do the reservations currently done by pnp/system.c.

There's some value in doing it the same way as on x86, even though
that way is somewhat broken.

Whatever we decide, I think it's very important to get it figured out
ASAP because it affects the ECAM quirks that we're trying to merge in
v4.10.

> > +The workaround is to describe the bridge registers (including ECAM
> > +space) in PNP0C02 catch-all devices [6].  With the exception of ECAM,
> > +the bridge register space is device-specific anyway, so the generic
> > +PNP0A03/PNP0A08 driver (pci_root.c) has no need to know about it.  For
> > +ECAM, pci_root.c learns about the space from either MCFG or the _CBA
> > +method.
> > +
> > +Note that the PCIe spec actually does require ECAM unless there's a
> > +standard firmware interface for config access, e.g., the ia64 SAL
> > +interface [7].  One reason is that we want a generic host bridge
> > +driver (pci_root.c), and a generic driver requires a generic way to
> > +access config space.
> > +
> > +
> > +[1] ACPI 6.0, sec 6.1:
> > +    For any device that is on a non-enumerable type of bus (for
> > +    example, an ISA bus), OSPM enumerates the devices' identifier(s)
> > +    and the ACPI system firmware must supply an _HID object ... for
> > +    each device to enable OSPM to do that.
> > +
> > +[2] ACPI 6.0, sec 3.7:
> > +    The OS enumerates motherboard devices simply by reading through
> > +    the ACPI Namespace looking for devices with hardware IDs.
> > +
> > +    Each device enumerated by ACPI includes ACPI-defined objects in
> > +    the ACPI Namespace that report the hardware resources the device
> > +    could occupy [_PRS], an object that reports the resources that are
> > +    currently used by the device [_CRS], and objects for configuring
> > +    those resources [_SRS].  The information is used by the Plug and
> > +    Play OS (OSPM) to configure the devices.
> > +
> > +[3] ACPI 6.0, sec 6.2:
> > +    OSPM uses device configuration objects to configure hardware
> > +    resources for devices enumerated via ACPI.  Device configuration
> > +    objects provide information about current and possible resource
> > +    requirements, the relationship between shared resources, and
> > +    methods for configuring hardware resources.
> > +
> > +    When OSPM enumerates a device, it calls _PRS to determine the
> > +    resource requirements of the device.  It may also call _CRS to
> > +    find the current resource settings for the device.  Using this
> > +    information, the Plug and Play system determines what resources
> > +    the device should consume and sets those resources by calling the
> > +    device’s _SRS control method.
> > +
> > +    In ACPI, devices can consume resources (for example, legacy
> > +    keyboards), provide resources (for example, a proprietary PCI
> > +    bridge), or do both.  Unless otherwise specified, resources for a
> > +    device are assumed to be taken from the nearest matching resource
> > +    above the device in the device hierarchy.
> > +
> > +[4] ACPI 6.0, sec 6.4.3.5.4:
> > +    Extended Address Space Descriptor
> > +    General Flags: Bit [0] Consumer/Producer:
> > +       1–This device consumes this resource
> > +       0–This device produces and consumes this resource
> > +
> > +[5] ACPI 6.0, sec 19.6.43:
> > +    ResourceUsage specifies whether the Memory range is consumed by
> > +    this device (ResourceConsumer) or passed on to child devices
> > +    (ResourceProducer).  If nothing is specified, then
> > +    ResourceConsumer is assumed.
> > +
> > +[6] PCI Firmware 3.0, sec 4.1.2:
> > +    If the operating system does not natively comprehend reserving the
> > +    MMCFG region, the MMCFG region must be reserved by firmware.  The
> > +    address range reported in the MCFG table or by _CBA method (see
> > +    Section 4.1.3) must be reserved by declaring a motherboard
> > +    resource.  For most systems, the motherboard resource would appear
> > +    at the root of the ACPI namespace (under \_SB) in a node with a
> > +    _HID of EISAID (PNP0C02), and the resources in this case should
> > +    not be claimed in the root PCI bus’s _CRS.  The resources can
> > +    optionally be returned in Int15 E820 or EFIGetMemoryMap as
> > +    reserved memory but must always be reported through ACPI as a
> > +    motherboard resource.
> > +
> > +[7] PCI Express 3.0, sec 7.2.2:
> > +    For systems that are PC-compatible, or that do not implement a
> > +    processor-architecture-specific firmware interface standard that
> > +    allows access to the Configuration Space, the ECAM is required as
> > +    defined in this section.
> >
> >
> > _______________________________________________
> > linux-arm-kernel mailing list
> > linux-arm-kernel@lists.infradead.org
> > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
> --
> To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 66+ messages in thread

* [PATCH] PCI: Add information about describing PCI in ACPI
@ 2016-11-23  1:06     ` Bjorn Helgaas
  0 siblings, 0 replies; 66+ messages in thread
From: Bjorn Helgaas @ 2016-11-23  1:06 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Nov 22, 2016 at 10:09:50AM +0000, Ard Biesheuvel wrote:
> On 17 November 2016 at 17:59, Bjorn Helgaas <bhelgaas@google.com> wrote:

> > +PCI host bridges are PNP0A03 or PNP0A08 devices.  Their _CRS should
> > +describe all the address space they consume.  In principle, this would
> > +be all the windows they forward down to the PCI bus, as well as the
> > +bridge registers themselves.  The bridge registers include things like
> > +secondary/subordinate bus registers that determine the bus range below
> > +the bridge, window registers that describe the apertures, etc.  These
> > +are all device-specific, non-architected things, so the only way a
> > +PNP0A03/PNP0A08 driver can manage them is via _PRS/_CRS/_SRS, which
> > +contain the device-specific details.  These bridge registers also
> > +include ECAM space, since it is consumed by the bridge.
> > +
> > +ACPI defined a Producer/Consumer bit that was intended to distinguish
> > +the bridge apertures from the bridge registers [4, 5].  However,
> > +BIOSes didn't use that bit correctly, and the result is that OSes have
> > +to assume that everything in a PCI host bridge _CRS is a window.  That
> > +leaves no way to describe the bridge registers in the PNP0A03/PNP0A08
> > +device itself.
> 
> Is that universally true? Or is it still possible to do the right
> thing here on new ACPI architectures such as arm64?

That's a very good question.  I had thought that the ACPI spec had
given up on Consumer/Producer completely, but I was wrong.  In the 6.0
spec, the Consumer/Producer bit is still documented in the Extended
Address Space Descriptor (sec 6.4.3.5.4).  It is documented as
"ignored" in the QWord, DWord, and Word descriptors (sec 6.4.3.5.1,2,3).

Linux looks at the producer_consumer bit in acpi_decode_space(), which
I think is used for all these descriptors (QWord, DWord, Word, and
Extended).  This doesn't quite follow the spec -- we probably should
ignore it except for Extended.  In any event, acpi_decode_space() sets
IORESOURCE_WINDOW for Producer descriptors, but we don't test
IORESOURCE_WINDOW in the PCI host bridge code.

x86 and ia64 supply their own pci_acpi_root_prepare_resources()
functions that call acpi_pci_probe_root_resources(), which parses _CRS
and looks at producer_consumer.  Then they do a little arch-specific
stuff on the result.

On arm64 we use acpi_pci_probe_root_resources() directly, with no
arch-specific stuff.

On all three arches, we ignore the Consumer/Producer bit, so all the
resources are treated as Producers, e.g., as bridge windows.

I think we *could* implement an arm64 version of
pci_acpi_root_prepare_resources() that would pay attention to the
Consumer/Producer bit by checking IORESOURCE_WINDOW.  To be spec
compliant, we would have to use Extended descriptors for all bridge
windows, even if they would fit in a DWord or QWord.

Should we do that?  I dunno.  I'd like to hear your opinion(s).

It *would* be nice to have bridge registers in the bridge _CRS.  That
would eliminate the need for looking up the HISI0081/PNP0C02 devices
to find the bridge registers.  Avoiding that lookup is only a
temporary advantage -- the next round of bridges are supposed to fully
implement ECAM, and then we won't need to know where the registers
are.

Apart from the lookup, there's still some advantage in describing the
registers in the PNP0A03 device instead of an unrelated PNP0C02
device, because it makes /proc/iomem more accurate and potentially
makes host bridge hotplug cleaner.  We would have to enhance the host
bridge driver to do the reservations currently done by pnp/system.c.

There's some value in doing it the same way as on x86, even though
that way is somewhat broken.

Whatever we decide, I think it's very important to get it figured out
ASAP because it affects the ECAM quirks that we're trying to merge in
v4.10.

> > +The workaround is to describe the bridge registers (including ECAM
> > +space) in PNP0C02 catch-all devices [6].  With the exception of ECAM,
> > +the bridge register space is device-specific anyway, so the generic
> > +PNP0A03/PNP0A08 driver (pci_root.c) has no need to know about it.  For
> > +ECAM, pci_root.c learns about the space from either MCFG or the _CBA
> > +method.
> > +
> > +Note that the PCIe spec actually does require ECAM unless there's a
> > +standard firmware interface for config access, e.g., the ia64 SAL
> > +interface [7].  One reason is that we want a generic host bridge
> > +driver (pci_root.c), and a generic driver requires a generic way to
> > +access config space.
> > +
> > +
> > +[1] ACPI 6.0, sec 6.1:
> > +    For any device that is on a non-enumerable type of bus (for
> > +    example, an ISA bus), OSPM enumerates the devices' identifier(s)
> > +    and the ACPI system firmware must supply an _HID object ... for
> > +    each device to enable OSPM to do that.
> > +
> > +[2] ACPI 6.0, sec 3.7:
> > +    The OS enumerates motherboard devices simply by reading through
> > +    the ACPI Namespace looking for devices with hardware IDs.
> > +
> > +    Each device enumerated by ACPI includes ACPI-defined objects in
> > +    the ACPI Namespace that report the hardware resources the device
> > +    could occupy [_PRS], an object that reports the resources that are
> > +    currently used by the device [_CRS], and objects for configuring
> > +    those resources [_SRS].  The information is used by the Plug and
> > +    Play OS (OSPM) to configure the devices.
> > +
> > +[3] ACPI 6.0, sec 6.2:
> > +    OSPM uses device configuration objects to configure hardware
> > +    resources for devices enumerated via ACPI.  Device configuration
> > +    objects provide information about current and possible resource
> > +    requirements, the relationship between shared resources, and
> > +    methods for configuring hardware resources.
> > +
> > +    When OSPM enumerates a device, it calls _PRS to determine the
> > +    resource requirements of the device.  It may also call _CRS to
> > +    find the current resource settings for the device.  Using this
> > +    information, the Plug and Play system determines what resources
> > +    the device should consume and sets those resources by calling the
> > +    device’s _SRS control method.
> > +
> > +    In ACPI, devices can consume resources (for example, legacy
> > +    keyboards), provide resources (for example, a proprietary PCI
> > +    bridge), or do both.  Unless otherwise specified, resources for a
> > +    device are assumed to be taken from the nearest matching resource
> > +    above the device in the device hierarchy.
> > +
> > +[4] ACPI 6.0, sec 6.4.3.5.4:
> > +    Extended Address Space Descriptor
> > +    General Flags: Bit [0] Consumer/Producer:
> > +       1–This device consumes this resource
> > +       0–This device produces and consumes this resource
> > +
> > +[5] ACPI 6.0, sec 19.6.43:
> > +    ResourceUsage specifies whether the Memory range is consumed by
> > +    this device (ResourceConsumer) or passed on to child devices
> > +    (ResourceProducer).  If nothing is specified, then
> > +    ResourceConsumer is assumed.
> > +
> > +[6] PCI Firmware 3.0, sec 4.1.2:
> > +    If the operating system does not natively comprehend reserving the
> > +    MMCFG region, the MMCFG region must be reserved by firmware.  The
> > +    address range reported in the MCFG table or by _CBA method (see
> > +    Section 4.1.3) must be reserved by declaring a motherboard
> > +    resource.  For most systems, the motherboard resource would appear
> > +    at the root of the ACPI namespace (under \_SB) in a node with a
> > +    _HID of EISAID (PNP0C02), and the resources in this case should
> > +    not be claimed in the root PCI bus’s _CRS.  The resources can
> > +    optionally be returned in Int15 E820 or EFIGetMemoryMap as
> > +    reserved memory but must always be reported through ACPI as a
> > +    motherboard resource.
> > +
> > +[7] PCI Express 3.0, sec 7.2.2:
> > +    For systems that are PC-compatible, or that do not implement a
> > +    processor-architecture-specific firmware interface standard that
> > +    allows access to the Configuration Space, the ECAM is required as
> > +    defined in this section.
> >
> >
> > _______________________________________________
> > linux-arm-kernel mailing list
> > linux-arm-kernel at lists.infradead.org
> > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
> --
> To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
> the body of a message to majordomo at vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 66+ messages in thread

* RE: [PATCH] PCI: Add information about describing PCI in ACPI
  2016-11-17 17:59 ` Bjorn Helgaas
  (?)
  (?)
@ 2016-11-23  3:23   ` Zheng, Lv
  -1 siblings, 0 replies; 66+ messages in thread
From: Zheng, Lv @ 2016-11-23  3:23 UTC (permalink / raw)
  To: Bjorn Helgaas, linux-pci
  Cc: linux-acpi, linux-kernel, linux-arm-kernel, linaro-acpi

Hi, Bjorn

Thanks for the documentation.
It really helps!

However, I have a question below.

> From: linux-acpi-owner@vger.kernel.org [mailto:linux-acpi-owner@vger.kernel.org] On Behalf Of Bjorn
> Helgaas
> Subject: [PATCH] PCI: Add information about describing PCI in ACPI
> 
> Add a writeup about how PCI host bridges should be described in ACPI
> using PNP0A03/PNP0A08 devices, PNP0C02 devices, and the MCFG table.
> 
> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
> ---
>  Documentation/PCI/00-INDEX      |    2 +
>  Documentation/PCI/acpi-info.txt |  136 +++++++++++++++++++++++++++++++++++++++
>  2 files changed, 138 insertions(+)
>  create mode 100644 Documentation/PCI/acpi-info.txt
> 
> diff --git a/Documentation/PCI/00-INDEX b/Documentation/PCI/00-INDEX
> index 147231f..0780280 100644
> --- a/Documentation/PCI/00-INDEX
> +++ b/Documentation/PCI/00-INDEX
> @@ -1,5 +1,7 @@
>  00-INDEX
>  	- this file
> +acpi-info.txt
> +	- info on how PCI host bridges are represented in ACPI
>  MSI-HOWTO.txt
>  	- the Message Signaled Interrupts (MSI) Driver Guide HOWTO and FAQ.
>  PCIEBUS-HOWTO.txt
> diff --git a/Documentation/PCI/acpi-info.txt b/Documentation/PCI/acpi-info.txt
> new file mode 100644
> index 0000000..ccbcfda
> --- /dev/null
> +++ b/Documentation/PCI/acpi-info.txt
> @@ -0,0 +1,136 @@
> +	    ACPI considerations for PCI host bridges
> +
> +The basic requirement is that the ACPI namespace should describe
> +*everything* that consumes address space unless there's another
> +standard way for the OS to find it [1, 2].  For example, windows that
> +are forwarded to PCI by a PCI host bridge should be described via ACPI
> +devices, since the OS can't locate the host bridge by itself.  PCI
> +devices *below* the host bridge do not need to be described via ACPI,
> +because the resources they consume are inside the host bridge windows,
> +and the OS can discover them via the standard PCI enumeration
> +mechanism (using config accesses to read and size the BARs).
> +
> +This ACPI resource description is done via _CRS methods of devices in
> +the ACPI namespace [2].   _CRS methods are like generalized PCI BARs:
> +the OS can read _CRS and figure out what resource is being consumed
> +even if it doesn't have a driver for the device [3].  That's important
> +because it means an old OS can work correctly even on a system with
> +new devices unknown to the OS.  The new devices won't do anything, but
> +the OS can at least make sure no resources conflict with them.
> +
> +Static tables like MCFG, HPET, ECDT, etc., are *not* mechanisms for
> +reserving address space!  The static tables are for things the OS
> +needs to know early in boot, before it can parse the ACPI namespace.
> +If a new table is defined, an old OS needs to operate correctly even
> +though it ignores the table.  _CRS allows that because it is generic
> +and understood by the old OS; a static table does not.

The document doesn't go into the details of _CBA.
Only one line below mentions _CBA, and only as an example.

> +
> +If the OS is expected to manage an ACPI device, that device will have
> +a specific _HID/_CID that tells the OS what driver to bind to it, and
> +the _CRS tells the OS and the driver where the device's registers are.
> +
> +PNP0C02 "motherboard" devices are basically a catch-all.  There's no
> +programming model for them other than "don't use these resources for
> +anything else."  So any address space that is (1) not claimed by some
> +other ACPI device and (2) should not be assigned by the OS to
> +something else, should be claimed by a PNP0C02 _CRS method.
> +
> +PCI host bridges are PNP0A03 or PNP0A08 devices.  Their _CRS should
> +describe all the address space they consume.  In principle, this would
> +be all the windows they forward down to the PCI bus, as well as the
> +bridge registers themselves.  The bridge registers include things like
> +secondary/subordinate bus registers that determine the bus range below
> +the bridge, window registers that describe the apertures, etc.  These
> +are all device-specific, non-architected things, so the only way a
> +PNP0A03/PNP0A08 driver can manage them is via _PRS/_CRS/_SRS, which
> +contain the device-specific details.  These bridge registers also
> +include ECAM space, since it is consumed by the bridge.
> +
> +ACPI defined a Producer/Consumer bit that was intended to distinguish
> +the bridge apertures from the bridge registers [4, 5].  However,
> +BIOSes didn't use that bit correctly, and the result is that OSes have
> +to assume that everything in a PCI host bridge _CRS is a window.  That
> +leaves no way to describe the bridge registers in the PNP0A03/PNP0A08
> +device itself.
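
To make the consequence concrete, here is a minimal Python sketch of
how an OS might classify one address space descriptor from a host
bridge _CRS.  The bit position follows the General Flags definition
quoted in [4]; the trust_bit knob and both function names are
hypothetical, added only to contrast what the spec intended with what
OSes have to do in practice.

```python
CONSUMER_BIT = 0x01  # General Flags bit [0]: 1 = consumer, 0 = producer and consumer


def declared_as_consumer(general_flags: int) -> bool:
    """True if the descriptor says the device itself consumes the range."""
    return bool(general_flags & CONSUMER_BIT)


def treated_as_window(general_flags: int, trust_bit: bool = False) -> bool:
    """Decide whether a host bridge _CRS entry is handled as a forwarded window.

    With trust_bit=False (the practical case described above) every entry is
    treated as a window, because firmware did not set the bit reliably.
    With trust_bit=True the entry is a window only if it is marked as a producer.
    """
    if not trust_bit:
        return True
    return not declared_as_consumer(general_flags)
```

With trust_bit left at its default, a range the firmware marked as
consumed by the bridge still comes out as a window, which is why the
bridge registers cannot live in the PNP0A03/PNP0A08 _CRS.
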
> +
> +The workaround is to describe the bridge registers (including ECAM
> +space) in PNP0C02 catch-all devices [6].  With the exception of ECAM,
> +the bridge register space is device-specific anyway, so the generic
> +PNP0A03/PNP0A08 driver (pci_root.c) has no need to know about it.  For
> +ECAM, pci_root.c learns about the space from either MCFG or the _CBA
> +method.

Should the relationship of MCFG and _CBA be covered in this document?

Thanks and best regards
Lv

> +
> +Note that the PCIe spec actually does require ECAM unless there's a
> +standard firmware interface for config access, e.g., the ia64 SAL
> +interface [7].  One reason is that we want a generic host bridge
> +driver (pci_root.c), and a generic driver requires a generic way to
> +access config space.
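
For reference, ECAM gives each function 4KB of config space, with the
bus number in address bits 20 and up, the device number in bits 15-19,
and the function number in bits 12-14.  A minimal Python sketch of
that arithmetic follows.  It assumes the base address corresponds to
bus 0 of the segment, and ecam_address is a hypothetical helper, not a
kernel function.

```python
def ecam_address(base: int, bus: int, device: int, function: int, offset: int) -> int:
    """Compute the memory address of a config register under the ECAM layout."""
    if not 0 <= bus <= 255:
        raise ValueError("bus must be in 0..255")
    if not 0 <= device <= 31:
        raise ValueError("device must be in 0..31")
    if not 0 <= function <= 7:
        raise ValueError("function must be in 0..7")
    if not 0 <= offset <= 4095:
        raise ValueError("offset must be in 0..4095")
    # Each field occupies its own bit range, so OR-ing them cannot overlap.
    return base + ((bus << 20) | (device << 15) | (function << 12) | offset)
```

Because the layout is fixed, a generic driver needs only the base
address to reach any function's config space, with no bridge-specific
register knowledge.
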
> +
> +
> +[1] ACPI 6.0, sec 6.1:
> +    For any device that is on a non-enumerable type of bus (for
> +    example, an ISA bus), OSPM enumerates the devices' identifier(s)
> +    and the ACPI system firmware must supply an _HID object ... for
> +    each device to enable OSPM to do that.
> +
> +[2] ACPI 6.0, sec 3.7:
> +    The OS enumerates motherboard devices simply by reading through
> +    the ACPI Namespace looking for devices with hardware IDs.
> +
> +    Each device enumerated by ACPI includes ACPI-defined objects in
> +    the ACPI Namespace that report the hardware resources the device
> +    could occupy [_PRS], an object that reports the resources that are
> +    currently used by the device [_CRS], and objects for configuring
> +    those resources [_SRS].  The information is used by the Plug and
> +    Play OS (OSPM) to configure the devices.
> +
> +[3] ACPI 6.0, sec 6.2:
> +    OSPM uses device configuration objects to configure hardware
> +    resources for devices enumerated via ACPI.  Device configuration
> +    objects provide information about current and possible resource
> +    requirements, the relationship between shared resources, and
> +    methods for configuring hardware resources.
> +
> +    When OSPM enumerates a device, it calls _PRS to determine the
> +    resource requirements of the device.  It may also call _CRS to
> +    find the current resource settings for the device.  Using this
> +    information, the Plug and Play system determines what resources
> +    the device should consume and sets those resources by calling the
> +    device’s _SRS control method.
> +
> +    In ACPI, devices can consume resources (for example, legacy
> +    keyboards), provide resources (for example, a proprietary PCI
> +    bridge), or do both.  Unless otherwise specified, resources for a
> +    device are assumed to be taken from the nearest matching resource
> +    above the device in the device hierarchy.
> +
> +[4] ACPI 6.0, sec 6.4.3.5.4:
> +    Extended Address Space Descriptor
> +    General Flags: Bit [0] Consumer/Producer:
> +	1–This device consumes this resource
> +	0–This device produces and consumes this resource
> +
> +[5] ACPI 6.0, sec 19.6.43:
> +    ResourceUsage specifies whether the Memory range is consumed by
> +    this device (ResourceConsumer) or passed on to child devices
> +    (ResourceProducer).  If nothing is specified, then
> +    ResourceConsumer is assumed.
> +
> +[6] PCI Firmware 3.0, sec 4.1.2:
> +    If the operating system does not natively comprehend reserving the
> +    MMCFG region, the MMCFG region must be reserved by firmware.  The
> +    address range reported in the MCFG table or by _CBA method (see
> +    Section 4.1.3) must be reserved by declaring a motherboard
> +    resource.  For most systems, the motherboard resource would appear
> +    at the root of the ACPI namespace (under \_SB) in a node with a
> +    _HID of EISAID (PNP0C02), and the resources in this case should
> +    not be claimed in the root PCI bus’s _CRS.  The resources can
> +    optionally be returned in Int15 E820 or EFIGetMemoryMap as
> +    reserved memory but must always be reported through ACPI as a
> +    motherboard resource.
> +
> +[7] PCI Express 3.0, sec 7.2.2:
> +    For systems that are PC-compatible, or that do not implement a
> +    processor-architecture-specific firmware interface standard that
> +    allows access to the Configuration Space, the ECAM is required as
> +    defined in this section.
> 

^ permalink raw reply	[flat|nested] 66+ messages in thread

bHkgY29tcHJlaGVuZCByZXNlcnZpbmcgdGhlDQo+ICsgICAgTU1DRkcgcmVnaW9uLCB0aGUgTU1D
RkcgcmVnaW9uIG11c3QgYmUgcmVzZXJ2ZWQgYnkgZmlybXdhcmUuICBUaGUNCj4gKyAgICBhZGRy
ZXNzIHJhbmdlIHJlcG9ydGVkIGluIHRoZSBNQ0ZHIHRhYmxlIG9yIGJ5IF9DQkEgbWV0aG9kIChz
ZWUNCj4gKyAgICBTZWN0aW9uIDQuMS4zKSBtdXN0IGJlIHJlc2VydmVkIGJ5IGRlY2xhcmluZyBh
IG1vdGhlcmJvYXJkDQo+ICsgICAgcmVzb3VyY2UuICBGb3IgbW9zdCBzeXN0ZW1zLCB0aGUgbW90
aGVyYm9hcmQgcmVzb3VyY2Ugd291bGQgYXBwZWFyDQo+ICsgICAgYXQgdGhlIHJvb3Qgb2YgdGhl
IEFDUEkgbmFtZXNwYWNlICh1bmRlciBcX1NCKSBpbiBhIG5vZGUgd2l0aCBhDQo+ICsgICAgX0hJ
RCBvZiBFSVNBSUQgKFBOUDBDMDIpLCBhbmQgdGhlIHJlc291cmNlcyBpbiB0aGlzIGNhc2Ugc2hv
dWxkDQo+ICsgICAgbm90IGJlIGNsYWltZWQgaW4gdGhlIHJvb3QgUENJIGJ1c+KAmXMgX0NSUy4g
IFRoZSByZXNvdXJjZXMgY2FuDQo+ICsgICAgb3B0aW9uYWxseSBiZSByZXR1cm5lZCBpbiBJbnQx
NSBFODIwIG9yIEVGSUdldE1lbW9yeU1hcCBhcw0KPiArICAgIHJlc2VydmVkIG1lbW9yeSBidXQg
bXVzdCBhbHdheXMgYmUgcmVwb3J0ZWQgdGhyb3VnaCBBQ1BJIGFzIGENCj4gKyAgICBtb3RoZXJi
b2FyZCByZXNvdXJjZS4NCj4gKw0KPiArWzddIFBDSSBFeHByZXNzIDMuMCwgc2VjIDcuMi4yOg0K
PiArICAgIEZvciBzeXN0ZW1zIHRoYXQgYXJlIFBDLWNvbXBhdGlibGUsIG9yIHRoYXQgZG8gbm90
IGltcGxlbWVudCBhDQo+ICsgICAgcHJvY2Vzc29yLWFyY2hpdGVjdHVyZS1zcGVjaWZpYyBmaXJt
d2FyZSBpbnRlcmZhY2Ugc3RhbmRhcmQgdGhhdA0KPiArICAgIGFsbG93cyBhY2Nlc3MgdG8gdGhl
IENvbmZpZ3VyYXRpb24gU3BhY2UsIHRoZSBFQ0FNIGlzIHJlcXVpcmVkIGFzDQo+ICsgICAgZGVm
aW5lZCBpbiB0aGlzIHNlY3Rpb24uDQo+IA0KPiAtLQ0KPiBUbyB1bnN1YnNjcmliZSBmcm9tIHRo
aXMgbGlzdDogc2VuZCB0aGUgbGluZSAidW5zdWJzY3JpYmUgbGludXgtYWNwaSIgaW4NCj4gdGhl
IGJvZHkgb2YgYSBtZXNzYWdlIHRvIG1ham9yZG9tb0B2Z2VyLmtlcm5lbC5vcmcNCj4gTW9yZSBt
YWpvcmRvbW8gaW5mbyBhdCAgaHR0cDovL3ZnZXIua2VybmVsLm9yZy9tYWpvcmRvbW8taW5mby5o
dG1sDQo=

^ permalink raw reply	[flat|nested] 66+ messages in thread

* [PATCH] PCI: Add information about describing PCI in ACPI
@ 2016-11-23  3:23   ` Zheng, Lv
  0 siblings, 0 replies; 66+ messages in thread
From: Zheng, Lv @ 2016-11-23  3:23 UTC (permalink / raw)
  To: linux-arm-kernel

Hi, Bjorn

Thanks for the documentation.
It really helps!

However, I have a question below.

> From: linux-acpi-owner at vger.kernel.org [mailto:linux-acpi-owner at vger.kernel.org] On Behalf Of Bjorn
> Helgaas
> Subject: [PATCH] PCI: Add information about describing PCI in ACPI
> 
> Add a writeup about how PCI host bridges should be described in ACPI
> using PNP0A03/PNP0A08 devices, PNP0C02 devices, and the MCFG table.
> 
> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
> ---
>  Documentation/PCI/00-INDEX      |    2 +
>  Documentation/PCI/acpi-info.txt |  136 +++++++++++++++++++++++++++++++++++++++
>  2 files changed, 138 insertions(+)
>  create mode 100644 Documentation/PCI/acpi-info.txt
> 
> diff --git a/Documentation/PCI/00-INDEX b/Documentation/PCI/00-INDEX
> index 147231f..0780280 100644
> --- a/Documentation/PCI/00-INDEX
> +++ b/Documentation/PCI/00-INDEX
> @@ -1,5 +1,7 @@
>  00-INDEX
>  	- this file
> +acpi-info.txt
> +	- info on how PCI host bridges are represented in ACPI
>  MSI-HOWTO.txt
>  	- the Message Signaled Interrupts (MSI) Driver Guide HOWTO and FAQ.
>  PCIEBUS-HOWTO.txt
> diff --git a/Documentation/PCI/acpi-info.txt b/Documentation/PCI/acpi-info.txt
> new file mode 100644
> index 0000000..ccbcfda
> --- /dev/null
> +++ b/Documentation/PCI/acpi-info.txt
> @@ -0,0 +1,136 @@
> +	    ACPI considerations for PCI host bridges
> +
> +The basic requirement is that the ACPI namespace should describe
> +*everything* that consumes address space unless there's another
> +standard way for the OS to find it [1, 2].  For example, windows that
> +are forwarded to PCI by a PCI host bridge should be described via ACPI
> +devices, since the OS can't locate the host bridge by itself.  PCI
> +devices *below* the host bridge do not need to be described via ACPI,
> +because the resources they consume are inside the host bridge windows,
> +and the OS can discover them via the standard PCI enumeration
> +mechanism (using config accesses to read and size the BARs).
> +
> +This ACPI resource description is done via _CRS methods of devices in
> +the ACPI namespace [2].  _CRS methods are like generalized PCI BARs:
> +the OS can read _CRS and figure out what resource is being consumed
> +even if it doesn't have a driver for the device [3].  That's important
> +because it means an old OS can work correctly even on a system with
> +new devices unknown to the OS.  The new devices won't do anything, but
> +the OS can at least make sure no resources conflict with them.
> +
> +Static tables like MCFG, HPET, ECDT, etc., are *not* mechanisms for
> +reserving address space!  The static tables are for things the OS
> +needs to know early in boot, before it can parse the ACPI namespace.
> +If a new table is defined, an old OS needs to operate correctly even
> +though it ignores the table.  _CRS allows that because it is generic
> +and understood by the old OS; a static table does not.

The document doesn't discuss the details of _CBA.
Only one line below mentions _CBA, and only as an example.

> +
> +If the OS is expected to manage an ACPI device, that device will have
> +a specific _HID/_CID that tells the OS what driver to bind to it, and
> +the _CRS tells the OS and the driver where the device's registers are.
> +
> +PNP0C02 "motherboard" devices are basically a catch-all.  There's no
> +programming model for them other than "don't use these resources for
> +anything else."  So any address space that is (1) not claimed by some
> +other ACPI device and (2) should not be assigned by the OS to
> +something else, should be claimed by a PNP0C02 _CRS method.
> +
> +PCI host bridges are PNP0A03 or PNP0A08 devices.  Their _CRS should
> +describe all the address space they consume.  In principle, this would
> +be all the windows they forward down to the PCI bus, as well as the
> +bridge registers themselves.  The bridge registers include things like
> +secondary/subordinate bus registers that determine the bus range below
> +the bridge, window registers that describe the apertures, etc.  These
> +are all device-specific, non-architected things, so the only way a
> +PNP0A03/PNP0A08 driver can manage them is via _PRS/_CRS/_SRS, which
> +contain the device-specific details.  These bridge registers also
> +include ECAM space, since it is consumed by the bridge.
> +
> +ACPI defined a Producer/Consumer bit that was intended to distinguish
> +the bridge apertures from the bridge registers [4, 5].  However,
> +BIOSes didn't use that bit correctly, and the result is that OSes have
> +to assume that everything in a PCI host bridge _CRS is a window.  That
> +leaves no way to describe the bridge registers in the PNP0A03/PNP0A08
> +device itself.
> +
> +The workaround is to describe the bridge registers (including ECAM
> +space) in PNP0C02 catch-all devices [6].  With the exception of ECAM,
> +the bridge register space is device-specific anyway, so the generic
> +PNP0A03/PNP0A08 driver (pci_root.c) has no need to know about it.  For
> +ECAM, pci_root.c learns about the space from either MCFG or the _CBA
> +method.

Should the relationship of MCFG and _CBA be covered in this document?
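
For readers less familiar with what the MCFG base (or the value returned by
_CBA) is used for, here is a minimal sketch of the standard ECAM address
layout: bus in bits [27:20], device in [19:15], function in [14:12], register
offset in [11:0]. The struct and function names are hypothetical and are not
kernel APIs; the sketch also assumes the reported base corresponds to bus 0
of the segment.

```c
#include <stdint.h>

/* Hypothetical model of one MCFG allocation entry; names are illustrative. */
struct demo_mcfg_entry {
	uint64_t base;       /* assumed: ECAM base address for bus 0 */
	uint16_t segment;    /* PCI segment group number */
	uint8_t  start_bus;  /* first bus decoded by this entry */
	uint8_t  end_bus;    /* last bus decoded by this entry */
};

/*
 * Return the address of a config register, or 0 if the bus, device,
 * function, or offset is outside what the entry can describe.
 */
static uint64_t demo_ecam_addr(const struct demo_mcfg_entry *e,
			       unsigned int bus, unsigned int dev,
			       unsigned int fn, unsigned int off)
{
	if (bus < e->start_bus || bus > e->end_bus)
		return 0;
	if (dev > 31 || fn > 7 || off > 4095)
		return 0;

	/* 1 MB per bus, 32 KB per device, 4 KB per function */
	return e->base + (((uint64_t)bus << 20) | ((uint64_t)dev << 15) |
			  ((uint64_t)fn << 12) | off);
}
```

Whichever of MCFG or _CBA supplies the base, the same arithmetic applies,
which is why the whole range has to be reserved somewhere in the namespace.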

Thanks and best regards
Lv

> +
> +Note that the PCIe spec actually does require ECAM unless there's a
> +standard firmware interface for config access, e.g., the ia64 SAL
> +interface [7].  One reason is that we want a generic host bridge
> +driver (pci_root.c), and a generic driver requires a generic way to
> +access config space.
> +
> +
> +[1] ACPI 6.0, sec 6.1:
> +    For any device that is on a non-enumerable type of bus (for
> +    example, an ISA bus), OSPM enumerates the devices' identifier(s)
> +    and the ACPI system firmware must supply an _HID object ... for
> +    each device to enable OSPM to do that.
> +
> +[2] ACPI 6.0, sec 3.7:
> +    The OS enumerates motherboard devices simply by reading through
> +    the ACPI Namespace looking for devices with hardware IDs.
> +
> +    Each device enumerated by ACPI includes ACPI-defined objects in
> +    the ACPI Namespace that report the hardware resources the device
> +    could occupy [_PRS], an object that reports the resources that are
> +    currently used by the device [_CRS], and objects for configuring
> +    those resources [_SRS].  The information is used by the Plug and
> +    Play OS (OSPM) to configure the devices.
> +
> +[3] ACPI 6.0, sec 6.2:
> +    OSPM uses device configuration objects to configure hardware
> +    resources for devices enumerated via ACPI.  Device configuration
> +    objects provide information about current and possible resource
> +    requirements, the relationship between shared resources, and
> +    methods for configuring hardware resources.
> +
> +    When OSPM enumerates a device, it calls _PRS to determine the
> +    resource requirements of the device.  It may also call _CRS to
> +    find the current resource settings for the device.  Using this
> +    information, the Plug and Play system determines what resources
> +    the device should consume and sets those resources by calling the
> +    device’s _SRS control method.
> +
> +    In ACPI, devices can consume resources (for example, legacy
> +    keyboards), provide resources (for example, a proprietary PCI
> +    bridge), or do both.  Unless otherwise specified, resources for a
> +    device are assumed to be taken from the nearest matching resource
> +    above the device in the device hierarchy.
> +
> +[4] ACPI 6.0, sec 6.4.3.5.4:
> +    Extended Address Space Descriptor
> +    General Flags: Bit [0] Consumer/Producer:
> +	1–This device consumes this resource
> +	0–This device produces and consumes this resource
> +
> +[5] ACPI 6.0, sec 19.6.43:
> +    ResourceUsage specifies whether the Memory range is consumed by
> +    this device (ResourceConsumer) or passed on to child devices
> +    (ResourceProducer).  If nothing is specified, then
> +    ResourceConsumer is assumed.
> +
> +[6] PCI Firmware 3.0, sec 4.1.2:
> +    If the operating system does not natively comprehend reserving the
> +    MMCFG region, the MMCFG region must be reserved by firmware.  The
> +    address range reported in the MCFG table or by _CBA method (see
> +    Section 4.1.3) must be reserved by declaring a motherboard
> +    resource.  For most systems, the motherboard resource would appear
> +    at the root of the ACPI namespace (under \_SB) in a node with a
> +    _HID of EISAID (PNP0C02), and the resources in this case should
> +    not be claimed in the root PCI bus’s _CRS.  The resources can
> +    optionally be returned in Int15 E820 or EFIGetMemoryMap as
> +    reserved memory but must always be reported through ACPI as a
> +    motherboard resource.
> +
> +[7] PCI Express 3.0, sec 7.2.2:
> +    For systems that are PC-compatible, or that do not implement a
> +    processor-architecture-specific firmware interface standard that
> +    allows access to the Configuration Space, the ECAM is required as
> +    defined in this section.
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
> the body of a message to majordomo at vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH] PCI: Add information about describing PCI in ACPI
  2016-11-23  1:06     ` Bjorn Helgaas
  (?)
@ 2016-11-23  7:28       ` Ard Biesheuvel
  -1 siblings, 0 replies; 66+ messages in thread
From: Ard Biesheuvel @ 2016-11-23  7:28 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Bjorn Helgaas, linux-pci, linux-acpi, linux-kernel,
	linux-arm-kernel, linaro-acpi

On 23 November 2016 at 01:06, Bjorn Helgaas <helgaas@kernel.org> wrote:
> On Tue, Nov 22, 2016 at 10:09:50AM +0000, Ard Biesheuvel wrote:
>> On 17 November 2016 at 17:59, Bjorn Helgaas <bhelgaas@google.com> wrote:
>
>> > +PCI host bridges are PNP0A03 or PNP0A08 devices.  Their _CRS should
>> > +describe all the address space they consume.  In principle, this would
>> > +be all the windows they forward down to the PCI bus, as well as the
>> > +bridge registers themselves.  The bridge registers include things like
>> > +secondary/subordinate bus registers that determine the bus range below
>> > +the bridge, window registers that describe the apertures, etc.  These
>> > +are all device-specific, non-architected things, so the only way a
>> > +PNP0A03/PNP0A08 driver can manage them is via _PRS/_CRS/_SRS, which
>> > +contain the device-specific details.  These bridge registers also
>> > +include ECAM space, since it is consumed by the bridge.
>> > +
>> > +ACPI defined a Producer/Consumer bit that was intended to distinguish
>> > +the bridge apertures from the bridge registers [4, 5].  However,
>> > +BIOSes didn't use that bit correctly, and the result is that OSes have
>> > +to assume that everything in a PCI host bridge _CRS is a window.  That
>> > +leaves no way to describe the bridge registers in the PNP0A03/PNP0A08
>> > +device itself.
>>
>> Is that universally true? Or is it still possible to do the right
>> thing here on new ACPI architectures such as arm64?
>
> That's a very good question.  I had thought that the ACPI spec had
> given up on Consumer/Producer completely, but I was wrong.  In the 6.0
> spec, the Consumer/Producer bit is still documented in the Extended
> Address Space Descriptor (sec 6.4.3.5.4).  It is documented as
> "ignored" in the QWord, DWord, and Word descriptors (sec 6.4.3.5.1,2,3).
>
> Linux looks at the producer_consumer bit in acpi_decode_space(), which
> I think is used for all these descriptors (QWord, DWord, Word, and
> Extended).  This doesn't quite follow the spec -- we probably should
> ignore it except for Extended.  In any event, acpi_decode_space() sets
> IORESOURCE_WINDOW for Producer descriptors, but we don't test
> IORESOURCE_WINDOW in the PCI host bridge code.
>
> x86 and ia64 supply their own pci_acpi_root_prepare_resources()
> functions that call acpi_pci_probe_root_resources(), which parses _CRS
> and looks at producer_consumer.  Then they do a little arch-specific
> stuff on the result.
>
> On arm64 we use acpi_pci_probe_root_resources() directly, with no
> arch-specific stuff.
>
> On all three arches, we ignore the Consumer/Producer bit, so all the
> resources are treated as Producers, e.g., as bridge windows.
>
> I think we *could* implement an arm64 version of
> pci_acpi_root_prepare_resources() that would pay attention to the
> Consumer/Producer bit by checking IORESOURCE_WINDOW.  To be spec
> compliant, we would have to use Extended descriptors for all bridge
> windows, even if they would fit in a DWord or QWord.
>
> Should we do that?  I dunno.  I'd like to hear your opinion(s).
>

Yes, I think we should. If the spec allows for a way for a PNP0A03
device to describe all of its resources unambiguously, we should not
be relying on workarounds that were designed for another architecture
in another decade (for, presumably, another OS).

Just for my understanding, we will need to use extended descriptors
for all consumed *and* produced regions, even though dword/qword are
implicitly produced-only, due to the fact that the bit is ignored?
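
To make the question concrete, here is a minimal sketch of the reading of the
6.0 spec described above: bit 0 of General Flags is only meaningful in the
Extended descriptor and is "ignored" in Word/DWord/QWord. All names are
hypothetical; this is not the code in acpi_decode_space(), which (per the
quoted text) looks at the bit for every descriptor type.

```c
#include <stdint.h>

/* Hypothetical descriptor kinds, for illustration only. */
enum demo_desc_type {
	DEMO_DESC_WORD,
	DEMO_DESC_DWORD,
	DEMO_DESC_QWORD,
	DEMO_DESC_EXTENDED,
};

enum demo_usage {
	DEMO_USAGE_UNSPECIFIED, /* bit documented as "ignored" */
	DEMO_USAGE_PRODUCER,    /* bit 0 clear: produces and consumes */
	DEMO_USAGE_CONSUMER,    /* bit 0 set: consumes */
};

/* Decode General Flags bit [0] under the strict reading sketched above. */
static enum demo_usage demo_decode_usage(enum demo_desc_type type,
					 uint8_t general_flags)
{
	if (type != DEMO_DESC_EXTENDED)
		return DEMO_USAGE_UNSPECIFIED;

	return (general_flags & 0x01) ? DEMO_USAGE_CONSUMER
				      : DEMO_USAGE_PRODUCER;
}
```

Under this reading a DWord or QWord descriptor tells the OS nothing either
way, which is what prompts the question about needing Extended descriptors.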

> It *would* be nice to have bridge registers in the bridge _CRS.  That
> would eliminate the need for looking up the HISI0081/PNP0C02 devices
> to find the bridge registers.  Avoiding that lookup is only a
> temporary advantage -- the next round of bridges are supposed to fully
> implement ECAM, and then we won't need to know where the registers
> are.
>
> Apart from the lookup, there's still some advantage in describing the
> registers in the PNP0A03 device instead of an unrelated PNP0C02
> device, because it makes /proc/iomem more accurate and potentially
> makes host bridge hotplug cleaner.  We would have to enhance the host
> bridge driver to do the reservations currently done by pnp/system.c.
>
> There's some value in doing it the same way as on x86, even though
> that way is somewhat broken.
>
> Whatever we decide, I think it's very important to get it figured out
> ASAP because it affects the ECAM quirks that we're trying to merge in
> v4.10.
>

I agree. What exactly is the impact for the quirks mechanism as proposed?

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH] PCI: Add information about describing PCI in ACPI
  2016-11-23  7:28       ` Ard Biesheuvel
  (?)
@ 2016-11-23 12:30         ` Lorenzo Pieralisi
  -1 siblings, 0 replies; 66+ messages in thread
From: Lorenzo Pieralisi @ 2016-11-23 12:30 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Bjorn Helgaas, Bjorn Helgaas, linux-pci, linux-acpi,
	linux-kernel, linux-arm-kernel, linaro-acpi

On Wed, Nov 23, 2016 at 07:28:12AM +0000, Ard Biesheuvel wrote:
> On 23 November 2016 at 01:06, Bjorn Helgaas <helgaas@kernel.org> wrote:
> > On Tue, Nov 22, 2016 at 10:09:50AM +0000, Ard Biesheuvel wrote:
> >> On 17 November 2016 at 17:59, Bjorn Helgaas <bhelgaas@google.com> wrote:
> >
> >> > +PCI host bridges are PNP0A03 or PNP0A08 devices.  Their _CRS should
> >> > +describe all the address space they consume.  In principle, this would
> >> > +be all the windows they forward down to the PCI bus, as well as the
> >> > +bridge registers themselves.  The bridge registers include things like
> >> > +secondary/subordinate bus registers that determine the bus range below
> >> > +the bridge, window registers that describe the apertures, etc.  These
> >> > +are all device-specific, non-architected things, so the only way a
> >> > +PNP0A03/PNP0A08 driver can manage them is via _PRS/_CRS/_SRS, which
> >> > +contain the device-specific details.  These bridge registers also
> >> > +include ECAM space, since it is consumed by the bridge.
> >> > +
> >> > +ACPI defined a Producer/Consumer bit that was intended to distinguish
> >> > +the bridge apertures from the bridge registers [4, 5].  However,
> >> > +BIOSes didn't use that bit correctly, and the result is that OSes have
> >> > +to assume that everything in a PCI host bridge _CRS is a window.  That
> >> > +leaves no way to describe the bridge registers in the PNP0A03/PNP0A08
> >> > +device itself.
> >>
> >> Is that universally true? Or is it still possible to do the right
> >> thing here on new ACPI architectures such as arm64?
> >
> > That's a very good question.  I had thought that the ACPI spec had
> > given up on Consumer/Producer completely, but I was wrong.  In the 6.0
> > spec, the Consumer/Producer bit is still documented in the Extended
> > Address Space Descriptor (sec 6.4.3.5.4).  It is documented as
> > "ignored" in the QWord, DWord, and Word descriptors (sec 6.4.3.5.1,2,3).
> >
> > Linux looks at the producer_consumer bit in acpi_decode_space(), which
> > I think is used for all these descriptors (QWord, DWord, Word, and
> > Extended).  This doesn't quite follow the spec -- we probably should
> > ignore it except for Extended.  In any event, acpi_decode_space() sets
> > IORESOURCE_WINDOW for Producer descriptors, but we don't test
> > IORESOURCE_WINDOW in the PCI host bridge code.
> >
> > x86 and ia64 supply their own pci_acpi_root_prepare_resources()
> > functions that call acpi_pci_probe_root_resources(), which parses _CRS
> > and looks at producer_consumer.  Then they do a little arch-specific
> > stuff on the result.
> >
> > On arm64 we use acpi_pci_probe_root_resources() directly, with no
> > arch-specific stuff.
> >
> > On all three arches, we ignore the Consumer/Producer bit, so all the
> > resources are treated as Producers, e.g., as bridge windows.
> >
> > I think we *could* implement an arm64 version of
> > pci_acpi_root_prepare_resources() that would pay attention to the
> > Consumer/Producer bit by checking IORESOURCE_WINDOW.  To be spec
> > compliant, we would have to use Extended descriptors for all bridge
> > windows, even if they would fit in a DWord or QWord.
> >
> > Should we do that?  I dunno.  I'd like to hear your opinion(s).
> >
> 
> Yes, I think we should. If the spec allows for a way for a PNP0A03
> device to describe all of its resources unambiguously, we should not
> be relying on workarounds that were designed for another architecture
> in another decade (for, presumably, another OS)

That was the idea I floated at LPC16. We can override the
acpi_pci_root_ops prepare_resources() function pointer with a function
that checks IORESOURCE_WINDOW and filters resources accordingly
(specific quirk "drivers" may know how to interpret resources that
aren't IORESOURCE_WINDOW, i.e. they can use them to describe the PCI
ECAM config space quirk region in their _CRS).
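
The filtering idea can be sketched in plain user-space C. This is only
an illustration under stated assumptions, not kernel code: the struct,
the flag value and the name filter_bridge_windows() are made up here,
and only the IORESOURCE_WINDOW concept comes from the discussion.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative value only; the real flag is defined in linux/ioport.h. */
#define SKETCH_IORESOURCE_WINDOW 0x1u

/* Hypothetical stand-in for one decoded _CRS entry. */
struct sketch_res {
	unsigned long long start;
	unsigned long long end;
	unsigned int flags;
};

/*
 * Split decoded _CRS entries into bridge windows (producers) and
 * consumed regions (e.g. bridge registers or a quirk ECAM area).
 * Both output arrays must have room for 'n' entries.
 * Returns the number of windows; *n_consumed gets the consumer count.
 */
size_t filter_bridge_windows(const struct sketch_res *crs, size_t n,
			     struct sketch_res *windows,
			     struct sketch_res *consumed,
			     size_t *n_consumed)
{
	size_t i, nw = 0, nc = 0;

	assert(crs && windows && consumed && n_consumed);

	for (i = 0; i < n; i++) {
		if (crs[i].flags & SKETCH_IORESOURCE_WINDOW)
			windows[nw++] = crs[i];   /* forwarded to PCI */
		else
			consumed[nc++] = crs[i];  /* used by the bridge itself */
	}
	*n_consumed = nc;
	return nw;
}
```

In this sketch a quirk "driver" would look through the consumed list
for its ECAM region, while only the windows list would be handed on as
host bridge apertures.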

That makes sense anyway: we are starting from a clean slate on ARM64,
and treating resources that are not IORESOURCE_WINDOW as host bridge
windows is just something we are inheriting from x86. It is not really
ACPI spec compliant (is it?).

> Just for my understanding, we will need to use extended descriptors
> for all consumed *and* produced regions, even though dword/qword are
> implicitly produced-only, due to the fact that the bit is ignored?

That's something that has to be clarified within the ASWG, i.e. why the
consumer bit is ignored for *some* descriptors and not for others.

As things stand, unfortunately, the answer seems to be yes (I do not
know why).

> > It *would* be nice to have bridge registers in the bridge _CRS.  That
> > would eliminate the need for looking up the HISI0081/PNP0C02 devices
> > to find the bridge registers.  Avoiding that lookup is only a
> > temporary advantage -- the next round of bridges are supposed to fully
> > implement ECAM, and then we won't need to know where the registers
> > are.
> >
> > Apart from the lookup, there's still some advantage in describing the
> > registers in the PNP0A03 device instead of an unrelated PNP0C02
> > device, because it makes /proc/iomem more accurate and potentially
> > makes host bridge hotplug cleaner.  We would have to enhance the host
> > bridge driver to do the reservations currently done by pnp/system.c.
> >
> > There's some value in doing it the same way as on x86, even though
> > that way is somewhat broken.
> >
> > Whatever we decide, I think it's very important to get it figured out
> > ASAP because it affects the ECAM quirks that we're trying to merge in
> > v4.10.
> >
> 
> I agree. What exactly is the impact for the quirks mechanism as proposed?

The impact is that we could just use the PNP0A03 _CRS to report the PCI
ECAM config space quirk region through a consumer resource, keeping in
mind what I said above (actually, I think that's what was done on APM
firmware initially, for the record).

Lorenzo

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH] PCI: Add information about describing PCI in ACPI
  2016-11-23  7:28       ` Ard Biesheuvel
  (?)
  (?)
@ 2016-11-23 15:06         ` Bjorn Helgaas
  -1 siblings, 0 replies; 66+ messages in thread
From: Bjorn Helgaas @ 2016-11-23 15:06 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linaro-acpi, linux-pci, linux-kernel, linux-acpi, Bjorn Helgaas,
	linux-arm-kernel

On Wed, Nov 23, 2016 at 07:28:12AM +0000, Ard Biesheuvel wrote:
> On 23 November 2016 at 01:06, Bjorn Helgaas <helgaas@kernel.org> wrote:
> > On Tue, Nov 22, 2016 at 10:09:50AM +0000, Ard Biesheuvel wrote:
> >> On 17 November 2016 at 17:59, Bjorn Helgaas <bhelgaas@google.com> wrote:
> >
> >> > +PCI host bridges are PNP0A03 or PNP0A08 devices.  Their _CRS should
> >> > +describe all the address space they consume.  In principle, this would
> >> > +be all the windows they forward down to the PCI bus, as well as the
> >> > +bridge registers themselves.  The bridge registers include things like
> >> > +secondary/subordinate bus registers that determine the bus range below
> >> > +the bridge, window registers that describe the apertures, etc.  These
> >> > +are all device-specific, non-architected things, so the only way a
> >> > +PNP0A03/PNP0A08 driver can manage them is via _PRS/_CRS/_SRS, which
> >> > +contain the device-specific details.  These bridge registers also
> >> > +include ECAM space, since it is consumed by the bridge.
> >> > +
> >> > +ACPI defined a Producer/Consumer bit that was intended to distinguish
> >> > +the bridge apertures from the bridge registers [4, 5].  However,
> >> > +BIOSes didn't use that bit correctly, and the result is that OSes have
> >> > +to assume that everything in a PCI host bridge _CRS is a window.  That
> >> > +leaves no way to describe the bridge registers in the PNP0A03/PNP0A08
> >> > +device itself.
> >>
> >> Is that universally true? Or is it still possible to do the right
> >> thing here on new ACPI architectures such as arm64?
> >
> > That's a very good question.  I had thought that the ACPI spec had
> > given up on Consumer/Producer completely, but I was wrong.  In the 6.0
> > spec, the Consumer/Producer bit is still documented in the Extended
> > Address Space Descriptor (sec 6.4.3.5.4).  It is documented as
> > "ignored" in the QWord, DWord, and Word descriptors (sec 6.4.3.5.1,2,3).
> >
> > Linux looks at the producer_consumer bit in acpi_decode_space(), which
> > I think is used for all these descriptors (QWord, DWord, Word, and
> > Extended).  This doesn't quite follow the spec -- we probably should
> > ignore it except for Extended.  In any event, acpi_decode_space() sets
> > IORESOURCE_WINDOW for Producer descriptors, but we don't test
> > IORESOURCE_WINDOW in the PCI host bridge code.
> >
> > x86 and ia64 supply their own pci_acpi_root_prepare_resources()
> > functions that call acpi_pci_probe_root_resources(), which parses _CRS
> > and looks at producer_consumer.  Then they do a little arch-specific
> > stuff on the result.
> >
> > On arm64 we use acpi_pci_probe_root_resources() directly, with no
> > arch-specific stuff.
> >
> > On all three arches, we ignore the Consumer/Producer bit, so all the
> > resources are treated as Producers, e.g., as bridge windows.
> >
> > I think we *could* implement an arm64 version of
> > pci_acpi_root_prepare_resources() that would pay attention to the
> > Consumer/Producer bit by checking IORESOURCE_WINDOW.  To be spec
> > compliant, we would have to use Extended descriptors for all bridge
> > windows, even if they would fit in a DWord or QWord.
> >
> > Should we do that?  I dunno.  I'd like to hear your opinion(s).
> >
> 
> Yes, I think we should. If the spec allows for a way for a PNP0A03
> device to describe all of its resources unambiguously, we should not
> be relying on workarounds that were designed for another architecture
> in another decade (for, presumably, another OS)
> 
> Just for my understanding, we will need to use extended descriptors
> for all consumed *and* produced regions, even though dword/qword are
> implicitly produced-only, due to the fact that the bit is ignored?

From an ACPI spec point of view, I would say QWord/DWord/Word
descriptors are implicitly *consumer*-only because ResourceConsumer
is the default and they don't have a bit to indicate otherwise.

The current code assumes all PNP0A03 resources are producers.  If we
implement an arm64 pci_acpi_root_prepare_resources() that pays
attention to the Consumer/Producer bit, we would have to:

  - Reserve all producer regions in the iomem/ioport trees.  This is
    already done via pci_acpi_root_add_resources(), but we might need
    a new check to handle consumers differently.

  - Reserve all consumer regions.  This corresponds to what
    pnp/system.c does for PNP0C02 devices.  This is similar to the
    producer regions, but I think the consumer ones should be marked
    IORESOURCE_BUSY.

  - Use every producer (IORESOURCE_WINDOW) as a host bridge window.

I think it's a bug that acpi_decode_space() looks at producer_consumer
for QWord/DWord/Word descriptors, but I think QWord/DWord/Word
descriptors for consumed regions should be safe, as long as they don't
set the Consumer/Producer bit.
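
A rough user-space sketch of that classification, assuming the reading
of the spec given above (only Extended descriptors carry a meaningful
Consumer/Producer indication; QWord/DWord/Word are taken as consumers).
The enum, struct, flag values and sketch_classify() are hypothetical
names for illustration and do not reflect current Linux behavior.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical descriptor kinds mirroring the ones named above. */
enum sketch_desc_type {
	SKETCH_WORD,
	SKETCH_DWORD,
	SKETCH_QWORD,
	SKETCH_EXTENDED
};

/* Illustrative values only; the real flags live in linux/ioport.h. */
#define SKETCH_IORESOURCE_WINDOW 0x1u
#define SKETCH_IORESOURCE_BUSY   0x2u

struct sketch_desc {
	enum sketch_desc_type type;
	bool says_producer;	/* what the descriptor's bit claims */
};

/*
 * Return WINDOW for a region the bridge forwards to PCI, or BUSY for
 * a region the bridge consumes itself.  The producer indication is
 * honored only for Extended descriptors; in the other descriptor
 * types it is ignored and the region counts as consumed.
 */
unsigned int sketch_classify(const struct sketch_desc *d)
{
	assert(d);

	if (d->type == SKETCH_EXTENDED && d->says_producer)
		return SKETCH_IORESOURCE_WINDOW;
	return SKETCH_IORESOURCE_BUSY;
}
```

Under these assumptions every bridge window would have to be an
Extended descriptor, while consumed regions could use any of the four
descriptor types.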

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [Linaro-acpi] [PATCH] PCI: Add information about describing PCI in ACPI
  2016-11-23 12:30         ` Lorenzo Pieralisi
  (?)
@ 2016-11-23 20:52           ` Duc Dang
  -1 siblings, 0 replies; 66+ messages in thread
From: Duc Dang @ 2016-11-23 20:52 UTC (permalink / raw)
  To: Lorenzo Pieralisi
  Cc: Ard Biesheuvel, linaro-acpi, linux-pci, linux-kernel, linux-acpi,
	Bjorn Helgaas, Bjorn Helgaas, linux-arm-kernel

On Wed, Nov 23, 2016 at 4:30 AM, Lorenzo Pieralisi
<lorenzo.pieralisi@arm.com> wrote:
> On Wed, Nov 23, 2016 at 07:28:12AM +0000, Ard Biesheuvel wrote:
>> On 23 November 2016 at 01:06, Bjorn Helgaas <helgaas@kernel.org> wrote:
>> > On Tue, Nov 22, 2016 at 10:09:50AM +0000, Ard Biesheuvel wrote:
>> >> On 17 November 2016 at 17:59, Bjorn Helgaas <bhelgaas@google.com> wrote:
>> >
>> >> > +PCI host bridges are PNP0A03 or PNP0A08 devices.  Their _CRS should
>> >> > +describe all the address space they consume.  In principle, this would
>> >> > +be all the windows they forward down to the PCI bus, as well as the
>> >> > +bridge registers themselves.  The bridge registers include things like
>> >> > +secondary/subordinate bus registers that determine the bus range below
>> >> > +the bridge, window registers that describe the apertures, etc.  These
>> >> > +are all device-specific, non-architected things, so the only way a
>> >> > +PNP0A03/PNP0A08 driver can manage them is via _PRS/_CRS/_SRS, which
>> >> > +contain the device-specific details.  These bridge registers also
>> >> > +include ECAM space, since it is consumed by the bridge.
>> >> > +
>> >> > +ACPI defined a Producer/Consumer bit that was intended to distinguish
>> >> > +the bridge apertures from the bridge registers [4, 5].  However,
>> >> > +BIOSes didn't use that bit correctly, and the result is that OSes have
>> >> > +to assume that everything in a PCI host bridge _CRS is a window.  That
>> >> > +leaves no way to describe the bridge registers in the PNP0A03/PNP0A08
>> >> > +device itself.
>> >>
>> >> Is that universally true? Or is it still possible to do the right
>> >> thing here on new ACPI architectures such as arm64?
>> >
>> > That's a very good question.  I had thought that the ACPI spec had
>> > given up on Consumer/Producer completely, but I was wrong.  In the 6.0
>> > spec, the Consumer/Producer bit is still documented in the Extended
>> > Address Space Descriptor (sec 6.4.3.5.4).  It is documented as
>> > "ignored" in the QWord, DWord, and Word descriptors (sec 6.4.3.5.1,2,3).
>> >
>> > Linux looks at the producer_consumer bit in acpi_decode_space(), which
>> > I think is used for all these descriptors (QWord, DWord, Word, and
>> > Extended).  This doesn't quite follow the spec -- we probably should
>> > ignore it except for Extended.  In any event, acpi_decode_space() sets
>> > IORESOURCE_WINDOW for Producer descriptors, but we don't test
>> > IORESOURCE_WINDOW in the PCI host bridge code.
>> >
>> > x86 and ia64 supply their own pci_acpi_root_prepare_resources()
>> > functions that call acpi_pci_probe_root_resources(), which parses _CRS
>> > and looks at producer_consumer.  Then they do a little arch-specific
>> > stuff on the result.
>> >
>> > On arm64 we use acpi_pci_probe_root_resources() directly, with no
>> > arch-specific stuff.
>> >
>> > On all three arches, we ignore the Consumer/Producer bit, so all the
>> > resources are treated as Producers, i.e., as bridge windows.
>> >
>> > I think we *could* implement an arm64 version of
>> > pci_acpi_root_prepare_resources() that would pay attention to the
>> > Consumer/Producer bit by checking IORESOURCE_WINDOW.  To be spec
>> > compliant, we would have to use Extended descriptors for all bridge
>> > windows, even if they would fit in a DWord or QWord.
>> >
>> > Should we do that?  I dunno.  I'd like to hear your opinion(s).
>> >
>>
>> Yes, I think we should. If the spec allows for a way for a PNP0A03
>> device to describe all of its resources unambiguously, we should not
>> be relying on workarounds that were designed for another architecture
>> in another decade (for, presumably, another OS)
>
> That was the idea I floated at LPC16. We can override the
> acpi_pci_root_ops prepare_resources() function pointer with a function
> that checks IORESOURCE_WINDOW and filters resources accordingly (and
> specific quirk "drivers" may know how to interpret resources that aren't
> IORESOURCE_WINDOW, i.e., they can use them to describe the PCI ECAM
> config space quirk region in their _CRS).
>
> In a way that makes sense anyway: we are starting from a clean slate
> on ARM64, and treating resources that are not IORESOURCE_WINDOW as
> host bridge windows is just something we are inheriting from x86. It
> is not really ACPI spec compliant (is it?).
>
>> Just for my understanding, we will need to use extended descriptors
>> for all consumed *and* produced regions, even though dword/qword are
>> implicitly produced-only, due to the fact that the bit is ignored?
>
> That's something that has to be clarified within the ASWG, i.e., why
> the consumer bit is ignored for *some* descriptors and not for others.
>
> As things stand, unfortunately, the answer seems to be yes (I do not
> know why).
>
>> > It *would* be nice to have bridge registers in the bridge _CRS.  That
>> > would eliminate the need for looking up the HISI0081/PNP0C02 devices
>> > to find the bridge registers.  Avoiding that lookup is only a
>> > temporary advantage -- the next round of bridges are supposed to fully
>> > implement ECAM, and then we won't need to know where the registers
>> > are.
>> >
>> > Apart from the lookup, there's still some advantage in describing the
>> > registers in the PNP0A03 device instead of an unrelated PNP0C02
>> > device, because it makes /proc/iomem more accurate and potentially
>> > makes host bridge hotplug cleaner.  We would have to enhance the host
>> > bridge driver to do the reservations currently done by pnp/system.c.
>> >
>> > There's some value in doing it the same way as on x86, even though
>> > that way is somewhat broken.
>> >
>> > Whatever we decide, I think it's very important to get it figured out
>> > ASAP because it affects the ECAM quirks that we're trying to merge in
>> > v4.10.
>> >
>>
>> I agree. What exactly is the impact for the quirks mechanism as proposed?
> The impact is that we could just use the PNP0A03 _CRS to report the PCI
> ECAM config space quirk region through a consumer resource, keeping in
> mind what I said above (actually I think that's what was done on APM
> firmware initially, for the record).

Just to clarify: APM firmware initially had a _CRS region declaring the
controller register region. We did not know that we also needed to
declare the reserved space for ECAM until Bjorn pointed it out recently
(with the usage of PNP0C02).

I really like the idea of declaring the ECAM space, and any additional
space required for the ECAM quirk, inside the PNP0A03 _CRS. For
firmware that has already shipped, the quirk will need to add the
additional resources (for ECAM and other needed regions) to the root
bus. If we decide to go with this, do we still have time to make
additional adjustments to the current ECAM quirk and the foundation
patches before v4.10-rc1?

Regards,
Duc Dang.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH] PCI: Add information about describing PCI in ACPI
  2016-11-23 15:06         ` Bjorn Helgaas
  (?)
  (?)
@ 2016-11-29 18:19           ` Bjorn Helgaas
  -1 siblings, 0 replies; 66+ messages in thread
From: Bjorn Helgaas @ 2016-11-29 18:19 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Bjorn Helgaas, linux-pci, linux-acpi, linux-kernel,
	linux-arm-kernel, linaro-acpi

On Wed, Nov 23, 2016 at 09:06:33AM -0600, Bjorn Helgaas wrote:
> On Wed, Nov 23, 2016 at 07:28:12AM +0000, Ard Biesheuvel wrote:
> > On 23 November 2016 at 01:06, Bjorn Helgaas <helgaas@kernel.org> wrote:
> > > On Tue, Nov 22, 2016 at 10:09:50AM +0000, Ard Biesheuvel wrote:
> > >> On 17 November 2016 at 17:59, Bjorn Helgaas <bhelgaas@google.com> wrote:
> > >
> > >> > +PCI host bridges are PNP0A03 or PNP0A08 devices.  Their _CRS should
> > >> > +describe all the address space they consume.  In principle, this would
> > >> > +be all the windows they forward down to the PCI bus, as well as the
> > >> > +bridge registers themselves.  The bridge registers include things like
> > >> > +secondary/subordinate bus registers that determine the bus range below
> > >> > +the bridge, window registers that describe the apertures, etc.  These
> > >> > +are all device-specific, non-architected things, so the only way a
> > >> > +PNP0A03/PNP0A08 driver can manage them is via _PRS/_CRS/_SRS, which
> > >> > +contain the device-specific details.  These bridge registers also
> > >> > +include ECAM space, since it is consumed by the bridge.
> > >> > +
> > >> > +ACPI defined a Producer/Consumer bit that was intended to distinguish
> > >> > +the bridge apertures from the bridge registers [4, 5].  However,
> > >> > +BIOSes didn't use that bit correctly, and the result is that OSes have
> > >> > +to assume that everything in a PCI host bridge _CRS is a window.  That
> > >> > +leaves no way to describe the bridge registers in the PNP0A03/PNP0A08
> > >> > +device itself.
> > >>
> > >> Is that universally true? Or is it still possible to do the right
> > >> thing here on new ACPI architectures such as arm64?
> > >
> > > That's a very good question.  I had thought that the ACPI spec had
> > > given up on Consumer/Producer completely, but I was wrong.  In the 6.0
> > > spec, the Consumer/Producer bit is still documented in the Extended
> > > Address Space Descriptor (sec 6.4.3.5.4).  It is documented as
> > > "ignored" in the QWord, DWord, and Word descriptors (sec 6.4.3.5.1,2,3).
> > >
> > > Linux looks at the producer_consumer bit in acpi_decode_space(), which
> > > I think is used for all these descriptors (QWord, DWord, Word, and
> > > Extended).  This doesn't quite follow the spec -- we probably should
> > > ignore it except for Extended.  In any event, acpi_decode_space() sets
> > > IORESOURCE_WINDOW for Producer descriptors, but we don't test
> > > IORESOURCE_WINDOW in the PCI host bridge code.
> > >
> > > x86 and ia64 supply their own pci_acpi_root_prepare_resources()
> > > functions that call acpi_pci_probe_root_resources(), which parses _CRS
> > > and looks at producer_consumer.  Then they do a little arch-specific
> > > stuff on the result.
> > >
> > > On arm64 we use acpi_pci_probe_root_resources() directly, with no
> > > arch-specific stuff.
> > >
> > > On all three arches, we ignore the Consumer/Producer bit, so all the
> > > resources are treated as Producers, i.e., as bridge windows.
> > >
> > > I think we *could* implement an arm64 version of
> > > pci_acpi_root_prepare_resources() that would pay attention to the
> > > Consumer/Producer bit by checking IORESOURCE_WINDOW.  To be spec
> > > compliant, we would have to use Extended descriptors for all bridge
> > > windows, even if they would fit in a DWord or QWord.
> > >
> > > Should we do that?  I dunno.  I'd like to hear your opinion(s).
> > >
> > 
> > Yes, I think we should. If the spec allows for a way for a PNP0A03
> > device to describe all of its resources unambiguously, we should not
> > be relying on workarounds that were designed for another architecture
> > in another decade (for, presumably, another OS)
> > 
> > Just for my understanding, we will need to use extended descriptors
> > for all consumed *and* produced regions, even though dword/qword are
> > implicitly produced-only, due to the fact that the bit is ignored?
> 
> From an ACPI spec point of view, I would say QWord/DWord/Word
> descriptors are implicitly *consumer*-only because ResourceConsumer
> is the default and they don't have a bit to indicate otherwise.
> 
> The current code assumes all PNP0A03 resources are producers.  If we
> implement an arm64 pci_acpi_root_prepare_resources() that pays
> attention to the Consumer/Producer bit, we would have to:
> 
>   - Reserve all producer regions in the iomem/ioport trees.  This is
>     already done via pci_acpi_root_add_resources(), but we might need
>     a new check to handle consumers differently.
> 
>   - Reserve all consumer regions.  This corresponds to what
>     pnp/system.c does for PNP0C02 devices.  This is similar to the
>     producer regions, but I think the consumer ones should be marked
>     IORESOURCE_BUSY.
> 
>   - Use every producer (IORESOURCE_WINDOW) as a host bridge window.
> 
> I think it's a bug that acpi_decode_space() looks at producer_consumer
> for QWord/DWord/Word descriptors, but I think QWord/DWord/Word
> descriptors for consumed regions should be safe, as long as they don't
> set the Consumer/Producer bit.

I'm going to post a couple very lightly-tested patches that should
make us ignore the Consumer/Producer bit for QWord/DWord/Word.  I'd
appreciate any discussion about whether that's the right approach.
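
To make the question concrete, here is a minimal standalone sketch of
one way "ignore" could be read.  It is not the patches themselves and
not kernel code: the sketch_* names are hypothetical, and the flag
value is a stand-in modeled on IORESOURCE_WINDOW (an assumption).  The
bit decides window vs. non-window only for Extended descriptors; for
QWord/DWord/Word the bit is never consulted and the caller supplies
the policy, since which default is right is exactly what is under
discussion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in flag value, assumed to mirror include/linux/ioport.h. */
#define SKETCH_IORESOURCE_WINDOW 0x00200000u

enum sketch_desc_type {
	SKETCH_WORD,
	SKETCH_DWORD,
	SKETCH_QWORD,
	SKETCH_EXTENDED,
};

/* Hypothetical, simplified view of a decoded address space descriptor. */
struct sketch_addr_desc {
	enum sketch_desc_type type;
	bool is_producer;	/* decoded Consumer/Producer bit */
	uint64_t start;
	uint64_t end;
};

/*
 * Return the window flag implied by one descriptor.  Only Extended
 * descriptors honor the Consumer/Producer bit.  For QWord/DWord/Word
 * the bit is ignored and default_is_window decides.
 */
static uint32_t sketch_window_flag(const struct sketch_addr_desc *d,
				   bool default_is_window)
{
	assert(d != NULL);
	assert(d->start <= d->end);

	if (d->type == SKETCH_EXTENDED)
		return d->is_producer ? SKETCH_IORESOURCE_WINDOW : 0;

	return default_is_window ? SKETCH_IORESOURCE_WINDOW : 0;
}
```

Passing default_is_window = true for a PNP0A03 device would keep
today's behavior for existing firmware that describes its windows with
QWord/DWord/Word descriptors.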

Bjorn

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH] PCI: Add information about describing PCI in ACPI
  2016-11-23  3:23   ` Zheng, Lv
  (?)
  (?)
@ 2016-11-29 18:20     ` Bjorn Helgaas
  -1 siblings, 0 replies; 66+ messages in thread
From: Bjorn Helgaas @ 2016-11-29 18:20 UTC (permalink / raw)
  To: Zheng, Lv
  Cc: Bjorn Helgaas, linux-pci, linux-acpi, linux-kernel,
	linux-arm-kernel, linaro-acpi

On Wed, Nov 23, 2016 at 03:23:35AM +0000, Zheng, Lv wrote:
> Hi, Bjorn
> 
> Thanks for the documentation.
> It really helps!
> 
> However I have a question below.
> 
> > From: linux-acpi-owner@vger.kernel.org [mailto:linux-acpi-owner@vger.kernel.org] On Behalf Of Bjorn
> > Helgaas
> > Subject: [PATCH] PCI: Add information about describing PCI in ACPI
> > 
> > Add a writeup about how PCI host bridges should be described in ACPI
> > using PNP0A03/PNP0A08 devices, PNP0C02 devices, and the MCFG table.
> > 
> > Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
> > ---
> >  Documentation/PCI/00-INDEX      |    2 +
> >  Documentation/PCI/acpi-info.txt |  136 +++++++++++++++++++++++++++++++++++++++
> >  2 files changed, 138 insertions(+)
> >  create mode 100644 Documentation/PCI/acpi-info.txt
> > 
> > diff --git a/Documentation/PCI/00-INDEX b/Documentation/PCI/00-INDEX
> > index 147231f..0780280 100644
> > --- a/Documentation/PCI/00-INDEX
> > +++ b/Documentation/PCI/00-INDEX
> > @@ -1,5 +1,7 @@
> >  00-INDEX
> >  	- this file
> > +acpi-info.txt
> > +	- info on how PCI host bridges are represented in ACPI
> >  MSI-HOWTO.txt
> >  	- the Message Signaled Interrupts (MSI) Driver Guide HOWTO and FAQ.
> >  PCIEBUS-HOWTO.txt
> > diff --git a/Documentation/PCI/acpi-info.txt b/Documentation/PCI/acpi-info.txt
> > new file mode 100644
> > index 0000000..ccbcfda
> > --- /dev/null
> > +++ b/Documentation/PCI/acpi-info.txt
> > @@ -0,0 +1,136 @@
> > +	    ACPI considerations for PCI host bridges
> > +
> > +The basic requirement is that the ACPI namespace should describe
> > +*everything* that consumes address space unless there's another
> > +standard way for the OS to find it [1, 2].  For example, windows that
> > +are forwarded to PCI by a PCI host bridge should be described via ACPI
> > +devices, since the OS can't locate the host bridge by itself.  PCI
> > +devices *below* the host bridge do not need to be described via ACPI,
> > +because the resources they consume are inside the host bridge windows,
> > +and the OS can discover them via the standard PCI enumeration
> > +mechanism (using config accesses to read and size the BARs).
> > +
> > +This ACPI resource description is done via _CRS methods of devices in
> > +the ACPI namespace [2].   _CRS methods are like generalized PCI BARs:
> > +the OS can read _CRS and figure out what resource is being consumed
> > +even if it doesn't have a driver for the device [3].  That's important
> > +because it means an old OS can work correctly even on a system with
> > +new devices unknown to the OS.  The new devices won't do anything, but
> > +the OS can at least make sure no resources conflict with them.
> > +
> > +Static tables like MCFG, HPET, ECDT, etc., are *not* mechanisms for
> > +reserving address space!  The static tables are for things the OS
> > +needs to know early in boot, before it can parse the ACPI namespace.
> > +If a new table is defined, an old OS needs to operate correctly even
> > +though it ignores the table.  _CRS allows that because it is generic
> > +and understood by the old OS; a static table does not.
> 
> The document doesn't discuss the details of _CBA.
> Only one line below mentions _CBA, and only as an example.

Yes, that's a good point.  I'll add some more details about MCFG
and _CBA.
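
On the _CRS paragraph quoted above: the point that an OS can avoid
conflicts with a device it has no driver for comes down to a plain
range-overlap test against whatever _CRS reported.  The sketch below
is standalone C, not kernel code; struct crs_range and
conflicts_with_crs() are hypothetical names used only for this
example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of one address range reported by a device's _CRS. */
struct crs_range {
	uint64_t start;
	uint64_t end;  /* inclusive */
};

/* True if [start, end] overlaps any range already claimed via _CRS. */
static bool conflicts_with_crs(const struct crs_range *claimed, size_t n,
			       uint64_t start, uint64_t end)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (start <= claimed[i].end && claimed[i].start <= end)
			return true;
	}
	return false;
}
```

A region described only in a static table would never show up in the
claimed list on an OS that does not know the table, which is why the
document asks for it to be in _CRS as well.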

Bjorn

end of thread, other threads:[~2016-11-29 18:20 UTC | newest]

Thread overview: 66+ messages
2016-11-17 17:59 [PATCH] PCI: Add information about describing PCI in ACPI Bjorn Helgaas
2016-11-18 17:17 ` Gabriele Paoloni
2016-11-18 17:54   ` Bjorn Helgaas
2016-11-21  8:52     ` Gabriele Paoloni
2016-11-21 16:47       ` Bjorn Helgaas
2016-11-21 17:23         ` Gabriele Paoloni
2016-11-21 20:10           ` Bjorn Helgaas
2016-11-22 13:13             ` Gabriele Paoloni
2016-11-18 23:02 ` Rafael J. Wysocki
2016-11-21 13:58   ` Bjorn Helgaas
2016-11-22 10:09 ` Ard Biesheuvel
2016-11-23  1:06   ` Bjorn Helgaas
2016-11-23  7:28     ` Ard Biesheuvel
2016-11-23 12:30       ` Lorenzo Pieralisi
2016-11-23 20:52         ` [Linaro-acpi] " Duc Dang
2016-11-23 15:06       ` Bjorn Helgaas
2016-11-29 18:19         ` Bjorn Helgaas
2016-11-23  3:23 ` Zheng, Lv
2016-11-29 18:20   ` Bjorn Helgaas