linux-kernel.vger.kernel.org archive mirror
* [PATCH v2] x86/PCI: Mark Power Control Unit as having non-compliant BARs
@ 2020-05-15 10:07 Xiaochun Lee
  2020-05-15 19:23 ` Bjorn Helgaas
  0 siblings, 1 reply; 2+ messages in thread
From: Xiaochun Lee @ 2020-05-15 10:07 UTC (permalink / raw)
  To: bhelgaas, linux-pci; +Cc: tglx, mingo, bp, x86, hpa, linux-kernel, Xiaochun Lee

From: Xiaochun Lee <lixc17@lenovo.com>

The device [8086:a26c] is the Power Control Unit of the
Intel Ice Lake Server Processor, and devices [8086:a1ec,a1ed]
are the Power Control Units of the Intel Xeon Scalable Processor.
The kernel treats their PCI BARs as standard base address registers,
which leads to a boot failure like:
"pci 0000:00:11.0: [Firmware Bug]: reg 0x30: invalid BAR (can't size)".

The affected Ice Lake processor is:
"QU99 ICE LAKE ES1 HCC 24C 185W 3200 L-0"

Information for device [8086:a26c] is listed below:
00:11.0 Unassigned class [ff00]: Intel Corporation Device a26c (rev 03)
        Subsystem: Lenovo Device 7811
        Flags: fast devsel, NUMA node 0
        Expansion ROM at <ignored> [disabled]
        Capabilities: [80] Power Management version 3

The affected Xeon Scalable processors are:
"Intel(R) Xeon(R) Gold 5117 CPU @ 2.00GHz"
"Intel(R) Xeon(R) Gold 6252 CPU @ 2.00GHz"

Information for device [8086:a1ec] is listed below:
00:11.0 Unassigned class [ff00]: Intel Corporation C620 Series Chipset Family MROM 0 [8086:a1ec] (rev 09)
        Subsystem: Lenovo Device [17aa:7805]
        Latency: 0, Cache Line Size: 64 bytes
        NUMA node: 0
        Expansion ROM at <ignored> [disabled]
        Capabilities: [80] Power Management version 3

There are no other BARs on these devices, so mark the PCU as having
non-compliant BARs so that we don't try to probe any of them.

Signed-off-by: Xiaochun Lee <lixc17@lenovo.com>
---
 arch/x86/pci/fixup.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/arch/x86/pci/fixup.c b/arch/x86/pci/fixup.c
index e723559..d9abc67 100644
--- a/arch/x86/pci/fixup.c
+++ b/arch/x86/pci/fixup.c
@@ -563,6 +563,9 @@ static void twinhead_reserve_killing_zone(struct pci_dev *dev)
  * Erratum BDF2
  * PCI BARs in the Home Agent Will Return Non-Zero Values During Enumeration
  * http://www.intel.com/content/www/us/en/processors/xeon/xeon-e5-v4-spec-update.html
+ *
+ * Device [8086:a26c]
+ * Devices [8086:a1ec,a1ed]
  */
 static void pci_invalid_bar(struct pci_dev *dev)
 {
@@ -572,6 +575,9 @@ static void pci_invalid_bar(struct pci_dev *dev)
 DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x6f60, pci_invalid_bar);
 DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x6fa0, pci_invalid_bar);
 DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x6fc0, pci_invalid_bar);
+DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0xa1ec, pci_invalid_bar);
+DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0xa1ed, pci_invalid_bar);
+DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0xa26c, pci_invalid_bar);
 
 /*
  * Device [1022:7808]
-- 
1.8.3.1
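
For readers unfamiliar with the "can't size" message: a BAR is normally
sized by writing all 1s to the register, reading the value back, and
taking the lowest writable address bit as the decode size. The following
is a minimal user-space sketch of that idea, loosely modeled on the
kernel's sizing logic; it is not the kernel code itself, and the name
bar_size() is hypothetical. What the affected devices actually return
from these registers is not stated in this thread, so the values in the
note after the code are purely illustrative.

```c
#include <stdint.h>

/*
 * Hypothetical helper, for illustration only.
 *
 * base:    value read from the register before sizing
 * maxbase: value read back after writing all 1s
 * mask:    which bits are address bits (e.g. 0xfffff800 for a ROM BAR)
 *
 * Returns the decode size in bytes, or 0 if the register cannot be sized.
 */
static uint64_t bar_size(uint64_t base, uint64_t maxbase, uint64_t mask)
{
	uint64_t size = mask & maxbase;	/* keep only the address bits */

	if (!size)
		return 0;		/* no writable address bits at all */

	size &= ~(size - 1);		/* lowest set bit is the decode size */

	/*
	 * If the register did not change when all 1s were written, that is
	 * only plausible when it already held all 1s in its address bits.
	 * Anything else suggests the register is not behaving like a BAR.
	 */
	if (base == maxbase && ((base | (size - 1)) & mask) != mask)
		return 0;

	return size;
}
```

With mask 0xfffff800, a well-behaved register that reads 0 and then
0xffff0000 sizes to 0x10000 (64 KiB). A register that ignores the write
and keeps returning some unrelated value such as 0x12345678 yields 0,
which is the kind of result that gets reported as an invalid BAR.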



* Re: [PATCH v2] x86/PCI: Mark Power Control Unit as having non-compliant BARs
  2020-05-15 10:07 [PATCH v2] x86/PCI: Mark Power Control Unit as having non-compliant BARs Xiaochun Lee
@ 2020-05-15 19:23 ` Bjorn Helgaas
  0 siblings, 0 replies; 2+ messages in thread
From: Bjorn Helgaas @ 2020-05-15 19:23 UTC (permalink / raw)
  To: Xiaochun Lee
  Cc: bhelgaas, linux-pci, tglx, mingo, bp, x86, hpa, linux-kernel,
	Xiaochun Lee

On Fri, May 15, 2020 at 06:07:51AM -0400, Xiaochun Lee wrote:
> From: Xiaochun Lee <lixc17@lenovo.com>
> 
> The device [8086:a26c] is the Power Control Unit of the
> Intel Ice Lake Server Processor, and devices [8086:a1ec,a1ed]
> are the Power Control Units of the Intel Xeon Scalable Processor.
> The kernel treats their PCI BARs as standard base address registers,
> which leads to a boot failure like:
> "pci 0000:00:11.0: [Firmware Bug]: reg 0x30: invalid BAR (can't size)".

Do you have a spec that says these are Power Control Units?  The spec
I found for the C620 PCH claims these are all "MROM" devices related
to "Enterprise Value Add", "Intel Management Engine", and "Innovation
Engine" configuration.

I updated the commit log, added [8086:a26d] as mentioned in that spec,
added a stable tag, and applied the patch below to pci/misc for v5.8.
Let me know if that doesn't look right.

> The affected Ice Lake processor is:
> "QU99 ICE LAKE ES1 HCC 24C 185W 3200 L-0"
> 
> Information for device [8086:a26c] is listed below:
> 00:11.0 Unassigned class [ff00]: Intel Corporation Device a26c (rev 03)
>         Subsystem: Lenovo Device 7811
>         Flags: fast devsel, NUMA node 0
>         Expansion ROM at <ignored> [disabled]
>         Capabilities: [80] Power Management version 3
> 
> The affected Xeon Scalable processors are:
> "Intel(R) Xeon(R) Gold 5117 CPU @ 2.00GHz"
> "Intel(R) Xeon(R) Gold 6252 CPU @ 2.00GHz"
> 
> Information for device [8086:a1ec] is listed below:
> 00:11.0 Unassigned class [ff00]: Intel Corporation C620 Series Chipset Family MROM 0 [8086:a1ec] (rev 09)
>         Subsystem: Lenovo Device [17aa:7805]
>         Latency: 0, Cache Line Size: 64 bytes
>         NUMA node: 0
>         Expansion ROM at <ignored> [disabled]
>         Capabilities: [80] Power Management version 3
> 
> There are no other BARs on these devices, so mark the PCU as having
> non-compliant BARs so that we don't try to probe any of them.
> 
> Signed-off-by: Xiaochun Lee <lixc17@lenovo.com>
> ---
>  arch/x86/pci/fixup.c | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/arch/x86/pci/fixup.c b/arch/x86/pci/fixup.c
> index e723559..d9abc67 100644
> --- a/arch/x86/pci/fixup.c
> +++ b/arch/x86/pci/fixup.c
> @@ -563,6 +563,9 @@ static void twinhead_reserve_killing_zone(struct pci_dev *dev)
>   * Erratum BDF2
>   * PCI BARs in the Home Agent Will Return Non-Zero Values During Enumeration
>   * http://www.intel.com/content/www/us/en/processors/xeon/xeon-e5-v4-spec-update.html
> + *
> + * Device [8086:a26c]
> + * Devices [8086:a1ec,a1ed]
>   */
>  static void pci_invalid_bar(struct pci_dev *dev)
>  {
> @@ -572,6 +575,9 @@ static void pci_invalid_bar(struct pci_dev *dev)
>  DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x6f60, pci_invalid_bar);
>  DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x6fa0, pci_invalid_bar);
>  DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x6fc0, pci_invalid_bar);
> +DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0xa1ec, pci_invalid_bar);
> +DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0xa1ed, pci_invalid_bar);
> +DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0xa26c, pci_invalid_bar);
>  
>  /*
>   * Device [1022:7808]
> -- 
> 1.8.3.1

commit 1574051e52cb ("x86/PCI: Mark Intel C620 MROMs as having non-compliant BARs")
Author: Xiaochun Lee <lixc17@lenovo.com>
Date:   Thu May 14 23:31:07 2020 -0400

    x86/PCI: Mark Intel C620 MROMs as having non-compliant BARs
    
    The Intel C620 Platform Controller Hub has MROM functions that have non-PCI
    registers (undocumented in the public spec) where BAR 0 is supposed to be,
    which results in messages like this:
    
      pci 0000:00:11.0: [Firmware Bug]: reg 0x30: invalid BAR (can't size)
    
    Mark these MROM functions as having non-compliant BARs so we don't try to
    probe any of them.  There are no other BARs on these devices.
    
    See the Intel C620 Series Chipset Platform Controller Hub Datasheet,
    May 2019, Document Number 336067-007US, sec 2.1, 35.5, 35.6.
    
    [bhelgaas: commit log, add 0xa26d]
    Link: https://lore.kernel.org/r/1589513467-17070-1-git-send-email-lixiaochun.2888@163.com
    Signed-off-by: Xiaochun Lee <lixc17@lenovo.com>
    Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
    Cc: stable@vger.kernel.org

diff --git a/arch/x86/pci/fixup.c b/arch/x86/pci/fixup.c
index e723559c386a..0c67a5a94de3 100644
--- a/arch/x86/pci/fixup.c
+++ b/arch/x86/pci/fixup.c
@@ -572,6 +572,10 @@ DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x2fc0, pci_invalid_bar);
 DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x6f60, pci_invalid_bar);
 DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x6fa0, pci_invalid_bar);
 DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x6fc0, pci_invalid_bar);
+DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0xa1ec, pci_invalid_bar);
+DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0xa1ed, pci_invalid_bar);
+DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0xa26c, pci_invalid_bar);
+DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0xa26d, pci_invalid_bar);
 
 /*
  * Device [1022:7808]
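
To illustrate what the added DECLARE_PCI_FIXUP_EARLY() entries achieve,
here is a minimal user-space sketch of the mechanism: an early fixup
matched on vendor/device ID sets a per-device flag, and the BAR probing
step returns immediately when that flag is set. This is a simplified
model under stated assumptions, not the kernel implementation; all names
below (fake_pci_dev, fake_fixup, mark_invalid_bar, run_early_fixups,
read_bases, setup_device) are hypothetical stand-ins.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define VENDOR_INTEL 0x8086

/* Simplified stand-in for struct pci_dev; only what the sketch needs. */
struct fake_pci_dev {
	uint16_t vendor;
	uint16_t device;
	bool non_compliant_bars;	/* set by the fixup */
	int bars_probed;		/* counts BARs the "core" tried to size */
};

/* Simplified stand-in for one early fixup table entry. */
struct fake_fixup {
	uint16_t vendor;
	uint16_t device;
	void (*hook)(struct fake_pci_dev *dev);
};

/* Plays the role of pci_invalid_bar(): only marks the device. */
static void mark_invalid_bar(struct fake_pci_dev *dev)
{
	dev->non_compliant_bars = true;
}

/* The four device IDs from the applied patch. */
static const struct fake_fixup early_fixups[] = {
	{ VENDOR_INTEL, 0xa1ec, mark_invalid_bar },
	{ VENDOR_INTEL, 0xa1ed, mark_invalid_bar },
	{ VENDOR_INTEL, 0xa26c, mark_invalid_bar },
	{ VENDOR_INTEL, 0xa26d, mark_invalid_bar },
};

/* Run every fixup whose vendor/device pair matches this device. */
static void run_early_fixups(struct fake_pci_dev *dev)
{
	size_t n = sizeof(early_fixups) / sizeof(early_fixups[0]);

	for (size_t i = 0; i < n; i++) {
		const struct fake_fixup *f = &early_fixups[i];

		if (f->vendor == dev->vendor && f->device == dev->device)
			f->hook(dev);
	}
}

/* BAR probing is skipped entirely for devices marked non-compliant. */
static void read_bases(struct fake_pci_dev *dev, int howmany)
{
	if (dev->non_compliant_bars)
		return;

	dev->bars_probed += howmany;	/* placeholder for real sizing work */
}

/* Early fixups run before any BAR is touched. */
static void setup_device(struct fake_pci_dev *dev)
{
	run_early_fixups(dev);
	read_bases(dev, 6);
}
```

In this model, calling setup_device() on a device with ID 8086:a26d
leaves bars_probed at 0, while an unlisted device has all six BARs
probed. The ordering matters: because the fixup runs at the "early"
stage, the flag is already set before any sizing write reaches the
device.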
