linux-pci.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH V13 0/6] PCI: Loongson pci improvements and quirks
@ 2022-04-30  8:48 Huacai Chen
  2022-04-30  8:48 ` [PATCH V13 1/6] PCI: loongson: Use generic 8/16/32-bit config ops on LS2K/LS7A Huacai Chen
                   ` (5 more replies)
  0 siblings, 6 replies; 35+ messages in thread
From: Huacai Chen @ 2022-04-30  8:48 UTC (permalink / raw)
  To: Bjorn Helgaas, Lorenzo Pieralisi, Rob Herring, Krzysztof Wilczyński
  Cc: linux-pci, Xuefeng Li, Huacai Chen, Jiaxun Yang, Huacai Chen,
	Jianmin Lv, Tiezhu Yang

This patchset improves Loongson PCI controller driver and resolves some
problems: LS2K/LS7A's PCI config space supports 1/2/4-bytes access, so
the first patch use pci_generic_config_read()/pci_generic_config_write()
for them; the second patch add ACPI init support which will be used by
LoongArch; the third patch improves the mrrs quirk for LS7A chipset; The
fourth patch add a new quirk for LS7A chipset to avoid poweroff/reboot
failure, and the fifth patch add a new quirk for LS7A chipset to fix the
multifunction devices' irq pin mappings.

V1 -> V2:
1, Rework the 4th patch;
2, Improve commit messages;
3, Remove the last patch since there is better solutions.

V2 -> V3:
1, Add more affected device ids for the 4th patch;
2, Improve commit messages to describe root causes.

V3 -> V4:
1, Rework the MRRS quirk patch;
2, Improve commit messages to describe root causes, again.

V4 -> V5:
1, Improve the MRRS quirk patch;
2, Change the order of 2nd and 3rd patch;
3, Improve commit messages to describe root causes, again.

V5 -> V6:
1, Rework the 1st patch;
2, Adjust the order of the series.

V6 -> V7:
1, Use correct pci config access operations;
2, Add ACPI init support for LoongArch;
3, Don't move to quirks.c since the driver has ACPI support;
4, Some other minor improvements.

V7 -> V8:
1, Use CFG1 method for LS2K/LS7A pci config.

V8 -> V9:
1, Use pci_controller_data for the first patch. 

V9 -> V10:
1, Add a patch to avoid LS2K/LS7A access unexisting devices.

V10 -> V11:
1, Rebased on 5.16-rc4.

V11 -> V12:
1, Rebased on 5.17-rc5.

V12 -> V13:
1, Rebased on 5.18-rc2;
2, Some minor improvements (adopt Rob Herring's suggestions).

Huacai Chen, Tiezhu Yang and Jianmin Lv(6):
 PCI: loongson: Use generic 8/16/32-bit config ops on LS2K/LS7A.
 PCI: loongson: Add ACPI init support.
 PCI: loongson: Don't access unexisting devices.
 PCI: loongson: Improve the MRRS quirk for LS7A.
 PCI: Add quirk for LS7A to avoid reboot failure.
 PCI: Add quirk for multifunction devices of LS7A.

Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Jianmin Lv <lvjianmin@loongson.cn> 
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
---
 drivers/acpi/pci_mcfg.c               |  13 +++
 drivers/pci/controller/Kconfig        |   2 +-
 drivers/pci/controller/pci-loongson.c | 184 ++++++++++++++++++++++++++--------
 drivers/pci/pci.c                     |   6 ++
 drivers/pci/pcie/portdrv_core.c       |   6 +-
 include/linux/pci-ecam.h              |   1 +
 include/linux/pci.h                   |   2 +
 7 files changed, 169 insertions(+), 45 deletions(-)
--
2.27.0


^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH V13 1/6] PCI: loongson: Use generic 8/16/32-bit config ops on LS2K/LS7A
  2022-04-30  8:48 [PATCH V13 0/6] PCI: Loongson pci improvements and quirks Huacai Chen
@ 2022-04-30  8:48 ` Huacai Chen
  2022-06-01  2:08   ` Bjorn Helgaas
  2022-04-30  8:48 ` [PATCH V13 2/6] PCI: loongson: Add ACPI init support Huacai Chen
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 35+ messages in thread
From: Huacai Chen @ 2022-04-30  8:48 UTC (permalink / raw)
  To: Bjorn Helgaas, Lorenzo Pieralisi, Rob Herring, Krzysztof Wilczyński
  Cc: linux-pci, Xuefeng Li, Huacai Chen, Jiaxun Yang, Huacai Chen

LS2K/LS7A support 8/16/32-bits PCI config access operations via CFG1, so
we can disable CFG0 for them and safely use pci_generic_config_read()/
pci_generic_config_write() instead of pci_generic_config_read32()/pci_
generic_config_write32().

Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
---
 drivers/pci/controller/pci-loongson.c | 65 +++++++++++++++++++--------
 1 file changed, 46 insertions(+), 19 deletions(-)

diff --git a/drivers/pci/controller/pci-loongson.c b/drivers/pci/controller/pci-loongson.c
index 50a8e1d6f70a..0a947ace9478 100644
--- a/drivers/pci/controller/pci-loongson.c
+++ b/drivers/pci/controller/pci-loongson.c
@@ -25,11 +25,16 @@
 #define FLAG_CFG1	BIT(1)
 #define FLAG_DEV_FIX	BIT(2)
 
+struct loongson_pci_data {
+	u32 flags;
+	struct pci_ops *ops;
+};
+
 struct loongson_pci {
 	void __iomem *cfg0_base;
 	void __iomem *cfg1_base;
 	struct platform_device *pdev;
-	u32 flags;
+	struct loongson_pci_data *data;
 };
 
 /* Fixup wrong class code in PCIe bridges */
@@ -126,8 +131,8 @@ static void __iomem *pci_loongson_map_bus(struct pci_bus *bus, unsigned int devf
 	 * Do not read more than one device on the bus other than
 	 * the host bus. For our hardware the root bus is always bus 0.
 	 */
-	if (priv->flags & FLAG_DEV_FIX && busnum != 0 &&
-		PCI_SLOT(devfn) > 0)
+	if (priv->data->flags & FLAG_DEV_FIX &&
+			!pci_is_root_bus(bus) && PCI_SLOT(devfn) > 0)
 		return NULL;
 
 	/* CFG0 can only access standard space */
@@ -159,20 +164,42 @@ static int loongson_map_irq(const struct pci_dev *dev, u8 slot, u8 pin)
 	return val;
 }
 
-/* H/w only accept 32-bit PCI operations */
+/* LS2K/LS7A accept 8/16/32-bit PCI config operations */
 static struct pci_ops loongson_pci_ops = {
+	.map_bus = pci_loongson_map_bus,
+	.read	= pci_generic_config_read,
+	.write	= pci_generic_config_write,
+};
+
+/* RS780/SR5690 only accept 32-bit PCI config operations */
+static struct pci_ops loongson_pci_ops32 = {
 	.map_bus = pci_loongson_map_bus,
 	.read	= pci_generic_config_read32,
 	.write	= pci_generic_config_write32,
 };
 
+static const struct loongson_pci_data ls2k_pci_data = {
+	.flags = FLAG_CFG1 | FLAG_DEV_FIX,
+	.ops = &loongson_pci_ops,
+};
+
+static const struct loongson_pci_data ls7a_pci_data = {
+	.flags = FLAG_CFG1 | FLAG_DEV_FIX,
+	.ops = &loongson_pci_ops,
+};
+
+static const struct loongson_pci_data rs780e_pci_data = {
+	.flags = FLAG_CFG0,
+	.ops = &loongson_pci_ops32,
+};
+
 static const struct of_device_id loongson_pci_of_match[] = {
 	{ .compatible = "loongson,ls2k-pci",
-		.data = (void *)(FLAG_CFG0 | FLAG_CFG1 | FLAG_DEV_FIX), },
+		.data = &ls2k_pci_data, },
 	{ .compatible = "loongson,ls7a-pci",
-		.data = (void *)(FLAG_CFG0 | FLAG_CFG1 | FLAG_DEV_FIX), },
+		.data = &ls7a_pci_data, },
 	{ .compatible = "loongson,rs780e-pci",
-		.data = (void *)(FLAG_CFG0), },
+		.data = &rs780e_pci_data, },
 	{}
 };
 
@@ -193,20 +220,20 @@ static int loongson_pci_probe(struct platform_device *pdev)
 
 	priv = pci_host_bridge_priv(bridge);
 	priv->pdev = pdev;
-	priv->flags = (unsigned long)of_device_get_match_data(dev);
+	priv->data = (struct loongson_pci_data *)of_device_get_match_data(dev);
 
-	regs = platform_get_resource(pdev, IORESOURCE_MEM, 0);
-	if (!regs) {
-		dev_err(dev, "missing mem resources for cfg0\n");
-		return -EINVAL;
+	if (priv->data->flags & FLAG_CFG0) {
+		regs = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+		if (!regs)
+			dev_err(dev, "missing mem resources for cfg0\n");
+		else {
+			priv->cfg0_base = devm_pci_remap_cfg_resource(dev, regs);
+			if (IS_ERR(priv->cfg0_base))
+				return PTR_ERR(priv->cfg0_base);
+		}
 	}
 
-	priv->cfg0_base = devm_pci_remap_cfg_resource(dev, regs);
-	if (IS_ERR(priv->cfg0_base))
-		return PTR_ERR(priv->cfg0_base);
-
-	/* CFG1 is optional */
-	if (priv->flags & FLAG_CFG1) {
+	if (priv->data->flags & FLAG_CFG1) {
 		regs = platform_get_resource(pdev, IORESOURCE_MEM, 1);
 		if (!regs)
 			dev_info(dev, "missing mem resource for cfg1\n");
@@ -218,7 +245,7 @@ static int loongson_pci_probe(struct platform_device *pdev)
 	}
 
 	bridge->sysdata = priv;
-	bridge->ops = &loongson_pci_ops;
+	bridge->ops = priv->data->ops;
 	bridge->map_irq = loongson_map_irq;
 
 	return pci_host_probe(bridge);
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH V13 2/6] PCI: loongson: Add ACPI init support
  2022-04-30  8:48 [PATCH V13 0/6] PCI: Loongson pci improvements and quirks Huacai Chen
  2022-04-30  8:48 ` [PATCH V13 1/6] PCI: loongson: Use generic 8/16/32-bit config ops on LS2K/LS7A Huacai Chen
@ 2022-04-30  8:48 ` Huacai Chen
  2022-05-31 23:04   ` Bjorn Helgaas
  2022-04-30  8:48 ` [PATCH V13 3/6] PCI: loongson: Don't access unexisting devices Huacai Chen
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 35+ messages in thread
From: Huacai Chen @ 2022-04-30  8:48 UTC (permalink / raw)
  To: Bjorn Helgaas, Lorenzo Pieralisi, Rob Herring, Krzysztof Wilczyński
  Cc: linux-pci, Xuefeng Li, Huacai Chen, Jiaxun Yang, Huacai Chen

Loongson PCH (LS7A chipset) will be used by both MIPS-based and
LoongArch-based Loongson processors. MIPS-based Loongson uses FDT
while LoongArch-base Loongson uses ACPI, this patch add ACPI init
support for the driver in drivers/pci/controller/pci-loongson.c
because it is currently FDT-only.

LoongArch is a new RISC ISA, mainline support will come soon, and
documentations are here (in translation):

https://github.com/loongson/LoongArch-Documentation

Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
---
 drivers/acpi/pci_mcfg.c               | 13 ++++++
 drivers/pci/controller/Kconfig        |  2 +-
 drivers/pci/controller/pci-loongson.c | 60 ++++++++++++++++++++++++++-
 include/linux/pci-ecam.h              |  1 +
 4 files changed, 73 insertions(+), 3 deletions(-)

diff --git a/drivers/acpi/pci_mcfg.c b/drivers/acpi/pci_mcfg.c
index 53cab975f612..860014b89b8e 100644
--- a/drivers/acpi/pci_mcfg.c
+++ b/drivers/acpi/pci_mcfg.c
@@ -41,6 +41,8 @@ struct mcfg_fixup {
 static struct mcfg_fixup mcfg_quirks[] = {
 /*	{ OEM_ID, OEM_TABLE_ID, REV, SEGMENT, BUS_RANGE, ops, cfgres }, */
 
+#ifdef CONFIG_ARM64
+
 #define AL_ECAM(table_id, rev, seg, ops) \
 	{ "AMAZON", table_id, rev, seg, MCFG_BUS_ANY, ops }
 
@@ -169,6 +171,17 @@ static struct mcfg_fixup mcfg_quirks[] = {
 	ALTRA_ECAM_QUIRK(1, 13),
 	ALTRA_ECAM_QUIRK(1, 14),
 	ALTRA_ECAM_QUIRK(1, 15),
+#endif /* ARM64 */
+
+#ifdef CONFIG_LOONGARCH
+#define LOONGSON_ECAM_MCFG(table_id, seg) \
+	{ "LOONGS", table_id, 1, seg, MCFG_BUS_ANY, &loongson_pci_ecam_ops }
+
+	LOONGSON_ECAM_MCFG("\0", 0),
+	LOONGSON_ECAM_MCFG("LOONGSON", 0),
+	LOONGSON_ECAM_MCFG("\0", 1),
+	LOONGSON_ECAM_MCFG("LOONGSON", 1),
+#endif /* LOONGARCH */
 };
 
 static char mcfg_oem_id[ACPI_OEM_ID_SIZE];
diff --git a/drivers/pci/controller/Kconfig b/drivers/pci/controller/Kconfig
index b8d96d38064d..9dbd73898b47 100644
--- a/drivers/pci/controller/Kconfig
+++ b/drivers/pci/controller/Kconfig
@@ -293,7 +293,7 @@ config PCI_HYPERV_INTERFACE
 config PCI_LOONGSON
 	bool "LOONGSON PCI Controller"
 	depends on MACH_LOONGSON64 || COMPILE_TEST
-	depends on OF
+	depends on OF || ACPI
 	depends on PCI_QUIRKS
 	default MACH_LOONGSON64
 	help
diff --git a/drivers/pci/controller/pci-loongson.c b/drivers/pci/controller/pci-loongson.c
index 0a947ace9478..adbfa4a2330f 100644
--- a/drivers/pci/controller/pci-loongson.c
+++ b/drivers/pci/controller/pci-loongson.c
@@ -9,6 +9,8 @@
 #include <linux/of_pci.h>
 #include <linux/pci.h>
 #include <linux/pci_ids.h>
+#include <linux/pci-acpi.h>
+#include <linux/pci-ecam.h>
 
 #include "../pci.h"
 
@@ -97,6 +99,18 @@ static void loongson_mrrs_quirk(struct pci_dev *dev)
 }
 DECLARE_PCI_FIXUP_ENABLE(PCI_ANY_ID, PCI_ANY_ID, loongson_mrrs_quirk);
 
+static struct loongson_pci *pci_bus_to_loongson_pci(struct pci_bus *bus)
+{
+	struct pci_config_window *cfg;
+
+	if (acpi_disabled)
+		return (struct loongson_pci *)(bus->sysdata);
+	else {
+		cfg = bus->sysdata;
+		return (struct loongson_pci *)(cfg->priv);
+	}
+}
+
 static void __iomem *cfg1_map(struct loongson_pci *priv, int bus,
 				unsigned int devfn, int where)
 {
@@ -124,8 +138,10 @@ static void __iomem *pci_loongson_map_bus(struct pci_bus *bus, unsigned int devf
 			       int where)
 {
 	unsigned char busnum = bus->number;
-	struct pci_host_bridge *bridge = pci_find_host_bridge(bus);
-	struct loongson_pci *priv =  pci_host_bridge_priv(bridge);
+	struct loongson_pci *priv = pci_bus_to_loongson_pci(bus);
+
+	if (pci_is_root_bus(bus))
+		busnum = 0;
 
 	/*
 	 * Do not read more than one device on the bus other than
@@ -146,6 +162,8 @@ static void __iomem *pci_loongson_map_bus(struct pci_bus *bus, unsigned int devf
 	return NULL;
 }
 
+#ifdef CONFIG_OF
+
 static int loongson_map_irq(const struct pci_dev *dev, u8 slot, u8 pin)
 {
 	int irq;
@@ -259,3 +277,41 @@ static struct platform_driver loongson_pci_driver = {
 	.probe = loongson_pci_probe,
 };
 builtin_platform_driver(loongson_pci_driver);
+
+#endif
+
+#ifdef CONFIG_ACPI
+
+static int loongson_pci_ecam_init(struct pci_config_window *cfg)
+{
+	struct device *dev = cfg->parent;
+	struct loongson_pci *priv;
+	struct loongson_pci_data *data;
+
+	priv = devm_kzalloc(dev, sizeof(*priv), GFP_KERNEL);
+	if (!priv)
+		return -ENOMEM;
+
+	data = devm_kzalloc(dev, sizeof(*data), GFP_KERNEL);
+	if (!data)
+		return -ENOMEM;
+
+	cfg->priv = priv;
+	data->flags = FLAG_CFG1;
+	priv->data = data;
+	priv->cfg1_base = cfg->win - (cfg->busr.start << 16);
+
+	return 0;
+}
+
+const struct pci_ecam_ops loongson_pci_ecam_ops = {
+	.bus_shift = 16,
+	.init	   = loongson_pci_ecam_init,
+	.pci_ops   = {
+		.map_bus = pci_loongson_map_bus,
+		.read	 = pci_generic_config_read,
+		.write	 = pci_generic_config_write,
+	}
+};
+
+#endif
diff --git a/include/linux/pci-ecam.h b/include/linux/pci-ecam.h
index adea5a4771cf..6b1301e2498e 100644
--- a/include/linux/pci-ecam.h
+++ b/include/linux/pci-ecam.h
@@ -87,6 +87,7 @@ extern const struct pci_ecam_ops xgene_v1_pcie_ecam_ops; /* APM X-Gene PCIe v1 *
 extern const struct pci_ecam_ops xgene_v2_pcie_ecam_ops; /* APM X-Gene PCIe v2.x */
 extern const struct pci_ecam_ops al_pcie_ops;	/* Amazon Annapurna Labs PCIe */
 extern const struct pci_ecam_ops tegra194_pcie_ops; /* Tegra194 PCIe */
+extern const struct pci_ecam_ops loongson_pci_ecam_ops; /* Loongson PCIe */
 #endif
 
 #if IS_ENABLED(CONFIG_PCI_HOST_COMMON)
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH V13 3/6] PCI: loongson: Don't access unexisting devices
  2022-04-30  8:48 [PATCH V13 0/6] PCI: Loongson pci improvements and quirks Huacai Chen
  2022-04-30  8:48 ` [PATCH V13 1/6] PCI: loongson: Use generic 8/16/32-bit config ops on LS2K/LS7A Huacai Chen
  2022-04-30  8:48 ` [PATCH V13 2/6] PCI: loongson: Add ACPI init support Huacai Chen
@ 2022-04-30  8:48 ` Huacai Chen
  2022-05-31 23:14   ` Bjorn Helgaas
  2022-04-30  8:48 ` [PATCH V13 4/6] PCI: loongson: Improve the MRRS quirk for LS7A Huacai Chen
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 35+ messages in thread
From: Huacai Chen @ 2022-04-30  8:48 UTC (permalink / raw)
  To: Bjorn Helgaas, Lorenzo Pieralisi, Rob Herring, Krzysztof Wilczyński
  Cc: linux-pci, Xuefeng Li, Huacai Chen, Jiaxun Yang, Huacai Chen

On LS2K/LS7A, some unexisting devices don't return 0xffffffff when
scanning. This is a hardware flaw but we can only avoid it by software
now.

Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
---
 drivers/pci/controller/pci-loongson.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/pci/controller/pci-loongson.c b/drivers/pci/controller/pci-loongson.c
index adbfa4a2330f..48316daa1f23 100644
--- a/drivers/pci/controller/pci-loongson.c
+++ b/drivers/pci/controller/pci-loongson.c
@@ -138,6 +138,8 @@ static void __iomem *pci_loongson_map_bus(struct pci_bus *bus, unsigned int devf
 			       int where)
 {
 	unsigned char busnum = bus->number;
+	unsigned int device = PCI_SLOT(devfn);
+	unsigned int function = PCI_FUNC(devfn);
 	struct loongson_pci *priv = pci_bus_to_loongson_pci(bus);
 
 	if (pci_is_root_bus(bus))
@@ -147,8 +149,13 @@ static void __iomem *pci_loongson_map_bus(struct pci_bus *bus, unsigned int devf
 	 * Do not read more than one device on the bus other than
 	 * the host bus. For our hardware the root bus is always bus 0.
 	 */
-	if (priv->data->flags & FLAG_DEV_FIX &&
-			!pci_is_root_bus(bus) && PCI_SLOT(devfn) > 0)
+	if ((priv->data->flags & FLAG_DEV_FIX) && bus->self) {
+		if (!pci_is_root_bus(bus) && (device > 0))
+			return NULL;
+	}
+
+	/* Don't access unexisting devices */
+	if (pci_is_root_bus(bus) && (device >= 9 && device <= 20 && function > 0))
 		return NULL;
 
 	/* CFG0 can only access standard space */
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH V13 4/6] PCI: loongson: Improve the MRRS quirk for LS7A
  2022-04-30  8:48 [PATCH V13 0/6] PCI: Loongson pci improvements and quirks Huacai Chen
                   ` (2 preceding siblings ...)
  2022-04-30  8:48 ` [PATCH V13 3/6] PCI: loongson: Don't access unexisting devices Huacai Chen
@ 2022-04-30  8:48 ` Huacai Chen
  2022-06-01  2:22   ` Bjorn Helgaas
  2022-04-30  8:48 ` [PATCH V13 5/6] PCI: Add quirk for LS7A to avoid reboot failure Huacai Chen
  2022-04-30  8:48 ` [PATCH V13 6/6] PCI: Add quirk for multifunction devices of LS7A Huacai Chen
  5 siblings, 1 reply; 35+ messages in thread
From: Huacai Chen @ 2022-04-30  8:48 UTC (permalink / raw)
  To: Bjorn Helgaas, Lorenzo Pieralisi, Rob Herring, Krzysztof Wilczyński
  Cc: linux-pci, Xuefeng Li, Huacai Chen, Jiaxun Yang, Huacai Chen

In new revision of LS7A, some PCIe ports support larger value than 256,
but their maximum supported MRRS values are not detectable. Moreover,
the current loongson_mrrs_quirk() cannot avoid devices increasing its
MRRS after pci_enable_device(), and some devices (e.g. Realtek 8169)
will actually set a big value in its driver. So the only possible way
is configure MRRS of all devices in BIOS, and add a pci host bridge bit
flag (i.e., no_inc_mrrs) to stop the increasing MRRS operations.

However, according to PCIe Spec, it is legal for an OS to program any
value for MRRS, and it is also legal for an endpoint to generate a Read
Request with any size up to its MRRS. As the hardware engineers say, the
root cause here is LS7A doesn't break up large read requests. In detail,
LS7A PCIe port reports CA (Completer Abort) if it receives a Memory Read
request with a size that's "too big" ("too big" means larger than the
PCIe ports can handle, which means 256 for some ports and 4096 for the
others, and of course this is a problem in the LS7A's hardware design).

Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
---
 drivers/pci/controller/pci-loongson.c | 44 +++++++++------------------
 drivers/pci/pci.c                     |  6 ++++
 include/linux/pci.h                   |  1 +
 3 files changed, 22 insertions(+), 29 deletions(-)

diff --git a/drivers/pci/controller/pci-loongson.c b/drivers/pci/controller/pci-loongson.c
index 48316daa1f23..83447264048a 100644
--- a/drivers/pci/controller/pci-loongson.c
+++ b/drivers/pci/controller/pci-loongson.c
@@ -67,37 +67,23 @@ DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_LOONGSON,
 DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_LOONGSON,
 			DEV_LS7A_LPC, system_bus_quirk);
 
-static void loongson_mrrs_quirk(struct pci_dev *dev)
+static void loongson_mrrs_quirk(struct pci_dev *pdev)
 {
-	struct pci_bus *bus = dev->bus;
-	struct pci_dev *bridge;
-	static const struct pci_device_id bridge_devids[] = {
-		{ PCI_VDEVICE(LOONGSON, DEV_PCIE_PORT_0) },
-		{ PCI_VDEVICE(LOONGSON, DEV_PCIE_PORT_1) },
-		{ PCI_VDEVICE(LOONGSON, DEV_PCIE_PORT_2) },
-		{ 0, },
-	};
-
-	/* look for the matching bridge */
-	while (!pci_is_root_bus(bus)) {
-		bridge = bus->self;
-		bus = bus->parent;
-		/*
-		 * Some Loongson PCIe ports have a h/w limitation of
-		 * 256 bytes maximum read request size. They can't handle
-		 * anything larger than this. So force this limit on
-		 * any devices attached under these ports.
-		 */
-		if (pci_match_id(bridge_devids, bridge)) {
-			if (pcie_get_readrq(dev) > 256) {
-				pci_info(dev, "limiting MRRS to 256\n");
-				pcie_set_readrq(dev, 256);
-			}
-			break;
-		}
-	}
+	/*
+	 * Some Loongson PCIe ports have h/w limitations of maximum read
+	 * request size. They can't handle anything larger than this. So
+	 * force this limit on any devices attached under these ports.
+	 */
+	struct pci_host_bridge *bridge = pci_find_host_bridge(pdev->bus);
+
+	bridge->no_inc_mrrs = 1;
 }
-DECLARE_PCI_FIXUP_ENABLE(PCI_ANY_ID, PCI_ANY_ID, loongson_mrrs_quirk);
+DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_LOONGSON,
+			DEV_PCIE_PORT_0, loongson_mrrs_quirk);
+DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_LOONGSON,
+			DEV_PCIE_PORT_1, loongson_mrrs_quirk);
+DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_LOONGSON,
+			DEV_PCIE_PORT_2, loongson_mrrs_quirk);
 
 static struct loongson_pci *pci_bus_to_loongson_pci(struct pci_bus *bus)
 {
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 9ecce435fb3f..72a15bf9eee8 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -5993,6 +5993,7 @@ int pcie_set_readrq(struct pci_dev *dev, int rq)
 {
 	u16 v;
 	int ret;
+	struct pci_host_bridge *bridge = pci_find_host_bridge(dev->bus);
 
 	if (rq < 128 || rq > 4096 || !is_power_of_2(rq))
 		return -EINVAL;
@@ -6011,6 +6012,11 @@ int pcie_set_readrq(struct pci_dev *dev, int rq)
 
 	v = (ffs(rq) - 8) << 12;
 
+	if (bridge->no_inc_mrrs) {
+		if (rq > pcie_get_readrq(dev))
+			return -EINVAL;
+	}
+
 	ret = pcie_capability_clear_and_set_word(dev, PCI_EXP_DEVCTL,
 						  PCI_EXP_DEVCTL_READRQ, v);
 
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 60adf42460ab..d146eb28e6da 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -569,6 +569,7 @@ struct pci_host_bridge {
 	void		*release_data;
 	unsigned int	ignore_reset_delay:1;	/* For entire hierarchy */
 	unsigned int	no_ext_tags:1;		/* No Extended Tags */
+	unsigned int	no_inc_mrrs:1;		/* No Increase MRRS */
 	unsigned int	native_aer:1;		/* OS may use PCIe AER */
 	unsigned int	native_pcie_hotplug:1;	/* OS may use PCIe hotplug */
 	unsigned int	native_shpc_hotplug:1;	/* OS may use SHPC hotplug */
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH V13 5/6] PCI: Add quirk for LS7A to avoid reboot failure
  2022-04-30  8:48 [PATCH V13 0/6] PCI: Loongson pci improvements and quirks Huacai Chen
                   ` (3 preceding siblings ...)
  2022-04-30  8:48 ` [PATCH V13 4/6] PCI: loongson: Improve the MRRS quirk for LS7A Huacai Chen
@ 2022-04-30  8:48 ` Huacai Chen
  2022-05-31 23:35   ` Bjorn Helgaas
  2022-04-30  8:48 ` [PATCH V13 6/6] PCI: Add quirk for multifunction devices of LS7A Huacai Chen
  5 siblings, 1 reply; 35+ messages in thread
From: Huacai Chen @ 2022-04-30  8:48 UTC (permalink / raw)
  To: Bjorn Helgaas, Lorenzo Pieralisi, Rob Herring, Krzysztof Wilczyński
  Cc: linux-pci, Xuefeng Li, Huacai Chen, Jiaxun Yang, Huacai Chen

Commit cc27b735ad3a75574a ("PCI/portdrv: Turn off PCIe services during
shutdown") causes poweroff/reboot failure on systems with LS7A chipset.
We found that if we remove "pci_command &= ~PCI_COMMAND_MASTER;" in
do_pci_disable_device(), it can work well. The hardware engineer says
that the root cause is that CPU is still accessing PCIe devices while
poweroff/reboot, and if we disable the Bus Master Bit at this time, the
PCIe controller doesn't forward requests to downstream devices, and also
doesn't send TIMEOUT to CPU, which causes CPU wait forever (hardware
deadlock). This behavior is a PCIe protocol violation (Bus Master should
not be involved in CPU MMIO transactions), and it will be fixed in new
revisions of hardware (add timeout mechanism for CPU read request,
whether or not Bus Master bit is cleared).

On some x86 platforms, radeon/amdgpu devices can cause similar problems
[1][2]. Once before I wanted to make a single patch to solve "all of
these problems" together, but it seems unreasonable because maybe they
are not exactly the same problem. So, this patch just add a quirk for
LS7A to avoid clearing Bus Master bit in pcie_port_device_remove(), and
leave other platforms as is.

[1] https://bugs.freedesktop.org/show_bug.cgi?id=97980
[2] https://bugs.freedesktop.org/show_bug.cgi?id=98638

Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
---
 drivers/pci/controller/pci-loongson.c | 17 +++++++++++++++++
 drivers/pci/pcie/portdrv_core.c       |  6 +++++-
 include/linux/pci.h                   |  1 +
 3 files changed, 23 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/controller/pci-loongson.c b/drivers/pci/controller/pci-loongson.c
index 83447264048a..49d8b8c24ffb 100644
--- a/drivers/pci/controller/pci-loongson.c
+++ b/drivers/pci/controller/pci-loongson.c
@@ -85,6 +85,23 @@ DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_LOONGSON,
 DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_LOONGSON,
 			DEV_PCIE_PORT_2, loongson_mrrs_quirk);
 
+static void loongson_bmaster_quirk(struct pci_dev *pdev)
+{
+	/*
+	 * Some Loongson PCIe ports will cause CPU deadlock if disable
+	 * the Bus Master bit during poweroff/reboot.
+	 */
+	struct pci_host_bridge *bridge = pci_find_host_bridge(pdev->bus);
+
+	bridge->no_dis_bmaster = 1;
+}
+DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_LOONGSON,
+			DEV_PCIE_PORT_0, loongson_bmaster_quirk);
+DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_LOONGSON,
+			DEV_PCIE_PORT_1, loongson_bmaster_quirk);
+DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_LOONGSON,
+			DEV_PCIE_PORT_2, loongson_bmaster_quirk);
+
 static struct loongson_pci *pci_bus_to_loongson_pci(struct pci_bus *bus)
 {
 	struct pci_config_window *cfg;
diff --git a/drivers/pci/pcie/portdrv_core.c b/drivers/pci/pcie/portdrv_core.c
index 604feeb84ee4..23f41e31a6c6 100644
--- a/drivers/pci/pcie/portdrv_core.c
+++ b/drivers/pci/pcie/portdrv_core.c
@@ -491,9 +491,13 @@ EXPORT_SYMBOL_GPL(pcie_port_find_device);
  */
 void pcie_port_device_remove(struct pci_dev *dev)
 {
+	struct pci_host_bridge *bridge = pci_find_host_bridge(dev->bus);
+
 	device_for_each_child(&dev->dev, NULL, remove_iter);
 	pci_free_irq_vectors(dev);
-	pci_disable_device(dev);
+
+	if (!bridge->no_dis_bmaster)
+		pci_disable_device(dev);
 }
 
 /**
diff --git a/include/linux/pci.h b/include/linux/pci.h
index d146eb28e6da..c52d6486ff99 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -570,6 +570,7 @@ struct pci_host_bridge {
 	unsigned int	ignore_reset_delay:1;	/* For entire hierarchy */
 	unsigned int	no_ext_tags:1;		/* No Extended Tags */
 	unsigned int	no_inc_mrrs:1;		/* No Increase MRRS */
+	unsigned int	no_dis_bmaster:1;	/* No Disable Bus Master */
 	unsigned int	native_aer:1;		/* OS may use PCIe AER */
 	unsigned int	native_pcie_hotplug:1;	/* OS may use PCIe hotplug */
 	unsigned int	native_shpc_hotplug:1;	/* OS may use SHPC hotplug */
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH V13 6/6] PCI: Add quirk for multifunction devices of LS7A
  2022-04-30  8:48 [PATCH V13 0/6] PCI: Loongson pci improvements and quirks Huacai Chen
                   ` (4 preceding siblings ...)
  2022-04-30  8:48 ` [PATCH V13 5/6] PCI: Add quirk for LS7A to avoid reboot failure Huacai Chen
@ 2022-04-30  8:48 ` Huacai Chen
  2022-06-01  2:07   ` Bjorn Helgaas
  5 siblings, 1 reply; 35+ messages in thread
From: Huacai Chen @ 2022-04-30  8:48 UTC (permalink / raw)
  To: Bjorn Helgaas, Lorenzo Pieralisi, Rob Herring, Krzysztof Wilczyński
  Cc: linux-pci, Xuefeng Li, Huacai Chen, Jiaxun Yang, Jianmin Lv, Huacai Chen

From: Jianmin Lv <lvjianmin@loongson.cn>

In LS7A, multifunction device use same PCI PIN (because the PIN register
report the same INTx value to each function) but we need different IRQ
for different functions, so add a quirk to fix it for standard PCI PIN
usage.

This patch only affect ACPI based systems (and only needed by ACPI based
systems, too). For DT based systems, the irq mappings is defined in .dts
files and be handled by of_irq_parse_pci().

Signed-off-by: Jianmin Lv <lvjianmin@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
---
 drivers/pci/controller/pci-loongson.c | 32 +++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)

diff --git a/drivers/pci/controller/pci-loongson.c b/drivers/pci/controller/pci-loongson.c
index 49d8b8c24ffb..024542a31a8c 100644
--- a/drivers/pci/controller/pci-loongson.c
+++ b/drivers/pci/controller/pci-loongson.c
@@ -22,6 +22,13 @@
 #define DEV_LS2K_APB	0x7a02
 #define DEV_LS7A_CONF	0x7a10
 #define DEV_LS7A_LPC	0x7a0c
+#define DEV_LS7A_GMAC	0x7a03
+#define DEV_LS7A_DC1	0x7a06
+#define DEV_LS7A_DC2	0x7a36
+#define DEV_LS7A_GPU	0x7a15
+#define DEV_LS7A_AHCI	0x7a08
+#define DEV_LS7A_EHCI	0x7a14
+#define DEV_LS7A_OHCI	0x7a24
 
 #define FLAG_CFG0	BIT(0)
 #define FLAG_CFG1	BIT(1)
@@ -102,6 +109,31 @@ DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_LOONGSON,
 DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_LOONGSON,
 			DEV_PCIE_PORT_2, loongson_bmaster_quirk);
 
+static void loongson_pci_pin_quirk(struct pci_dev *pdev)
+{
+	pdev->pin = 1 + (PCI_FUNC(pdev->devfn) & 3);
+}
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LOONGSON,
+			DEV_LS7A_DC1, loongson_pci_pin_quirk);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LOONGSON,
+			DEV_LS7A_DC2, loongson_pci_pin_quirk);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LOONGSON,
+			DEV_LS7A_GPU, loongson_pci_pin_quirk);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LOONGSON,
+			DEV_LS7A_GMAC, loongson_pci_pin_quirk);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LOONGSON,
+			DEV_LS7A_AHCI, loongson_pci_pin_quirk);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LOONGSON,
+			DEV_LS7A_EHCI, loongson_pci_pin_quirk);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LOONGSON,
+			DEV_LS7A_OHCI, loongson_pci_pin_quirk);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LOONGSON,
+			DEV_PCIE_PORT_0, loongson_pci_pin_quirk);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LOONGSON,
+			DEV_PCIE_PORT_1, loongson_pci_pin_quirk);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LOONGSON,
+			DEV_PCIE_PORT_2, loongson_pci_pin_quirk);
+
 static struct loongson_pci *pci_bus_to_loongson_pci(struct pci_bus *bus)
 {
 	struct pci_config_window *cfg;
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: [PATCH V13 2/6] PCI: loongson: Add ACPI init support
  2022-04-30  8:48 ` [PATCH V13 2/6] PCI: loongson: Add ACPI init support Huacai Chen
@ 2022-05-31 23:04   ` Bjorn Helgaas
  2022-06-02  7:09     ` Huacai Chen
  0 siblings, 1 reply; 35+ messages in thread
From: Bjorn Helgaas @ 2022-05-31 23:04 UTC (permalink / raw)
  To: Huacai Chen
  Cc: Bjorn Helgaas, Lorenzo Pieralisi, Rob Herring,
	Krzysztof Wilczyński, linux-pci, Xuefeng Li, Huacai Chen,
	Jiaxun Yang

On Sat, Apr 30, 2022 at 04:48:42PM +0800, Huacai Chen wrote:
> Loongson PCH (LS7A chipset) will be used by both MIPS-based and
> LoongArch-based Loongson processors. MIPS-based Loongson uses FDT
> while LoongArch-base Loongson uses ACPI, this patch add ACPI init
> support for the driver in drivers/pci/controller/pci-loongson.c
> because it is currently FDT-only.
> 
> LoongArch is a new RISC ISA, mainline support will come soon, and
> documentations are here (in translation):
> 
> https://github.com/loongson/LoongArch-Documentation
> 
> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
> ---
>  drivers/acpi/pci_mcfg.c               | 13 ++++++
>  drivers/pci/controller/Kconfig        |  2 +-
>  drivers/pci/controller/pci-loongson.c | 60 ++++++++++++++++++++++++++-
>  include/linux/pci-ecam.h              |  1 +
>  4 files changed, 73 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/acpi/pci_mcfg.c b/drivers/acpi/pci_mcfg.c
> index 53cab975f612..860014b89b8e 100644
> --- a/drivers/acpi/pci_mcfg.c
> +++ b/drivers/acpi/pci_mcfg.c
> @@ -41,6 +41,8 @@ struct mcfg_fixup {
>  static struct mcfg_fixup mcfg_quirks[] = {
>  /*	{ OEM_ID, OEM_TABLE_ID, REV, SEGMENT, BUS_RANGE, ops, cfgres }, */
>  
> +#ifdef CONFIG_ARM64
> +
>  #define AL_ECAM(table_id, rev, seg, ops) \
>  	{ "AMAZON", table_id, rev, seg, MCFG_BUS_ANY, ops }
>  
> @@ -169,6 +171,17 @@ static struct mcfg_fixup mcfg_quirks[] = {
>  	ALTRA_ECAM_QUIRK(1, 13),
>  	ALTRA_ECAM_QUIRK(1, 14),
>  	ALTRA_ECAM_QUIRK(1, 15),
> +#endif /* ARM64 */

The addition of the CONFIG_ARM64 #ifdefs should be its own separate
patch so it's not buried in this Loongson one.

> +#ifdef CONFIG_LOONGARCH
> +#define LOONGSON_ECAM_MCFG(table_id, seg) \
> +	{ "LOONGS", table_id, 1, seg, MCFG_BUS_ANY, &loongson_pci_ecam_ops }
> +
> +	LOONGSON_ECAM_MCFG("\0", 0),
> +	LOONGSON_ECAM_MCFG("LOONGSON", 0),
> +	LOONGSON_ECAM_MCFG("\0", 1),
> +	LOONGSON_ECAM_MCFG("LOONGSON", 1),
> +#endif /* LOONGARCH */
>  };
>  
>  static char mcfg_oem_id[ACPI_OEM_ID_SIZE];
> diff --git a/drivers/pci/controller/Kconfig b/drivers/pci/controller/Kconfig
> index b8d96d38064d..9dbd73898b47 100644
> --- a/drivers/pci/controller/Kconfig
> +++ b/drivers/pci/controller/Kconfig
> @@ -293,7 +293,7 @@ config PCI_HYPERV_INTERFACE
>  config PCI_LOONGSON
>  	bool "LOONGSON PCI Controller"
>  	depends on MACH_LOONGSON64 || COMPILE_TEST
> -	depends on OF
> +	depends on OF || ACPI
>  	depends on PCI_QUIRKS
>  	default MACH_LOONGSON64
>  	help
> diff --git a/drivers/pci/controller/pci-loongson.c b/drivers/pci/controller/pci-loongson.c
> index 0a947ace9478..adbfa4a2330f 100644
> --- a/drivers/pci/controller/pci-loongson.c
> +++ b/drivers/pci/controller/pci-loongson.c
> @@ -9,6 +9,8 @@
>  #include <linux/of_pci.h>
>  #include <linux/pci.h>
>  #include <linux/pci_ids.h>
> +#include <linux/pci-acpi.h>
> +#include <linux/pci-ecam.h>
>  
>  #include "../pci.h"
>  
> @@ -97,6 +99,18 @@ static void loongson_mrrs_quirk(struct pci_dev *dev)
>  }
>  DECLARE_PCI_FIXUP_ENABLE(PCI_ANY_ID, PCI_ANY_ID, loongson_mrrs_quirk);
>  
> +static struct loongson_pci *pci_bus_to_loongson_pci(struct pci_bus *bus)
> +{
> +	struct pci_config_window *cfg;
> +
> +	if (acpi_disabled)
> +		return (struct loongson_pci *)(bus->sysdata);
> +	else {
> +		cfg = bus->sysdata;
> +		return (struct loongson_pci *)(cfg->priv);
> +	}
> +}
> +
>  static void __iomem *cfg1_map(struct loongson_pci *priv, int bus,
>  				unsigned int devfn, int where)
>  {
> @@ -124,8 +138,10 @@ static void __iomem *pci_loongson_map_bus(struct pci_bus *bus, unsigned int devf
>  			       int where)
>  {
>  	unsigned char busnum = bus->number;
> -	struct pci_host_bridge *bridge = pci_find_host_bridge(bus);
> -	struct loongson_pci *priv =  pci_host_bridge_priv(bridge);
> +	struct loongson_pci *priv = pci_bus_to_loongson_pci(bus);
> +
> +	if (pci_is_root_bus(bus))
> +		busnum = 0;

This is weird.  The comment just below says "For our hardware the root
bus is always bus 0."  Is that not true any more?  Why would you
override the bus number?

>  	/*
>  	 * Do not read more than one device on the bus other than
> @@ -146,6 +162,8 @@ static void __iomem *pci_loongson_map_bus(struct pci_bus *bus, unsigned int devf
>  	return NULL;
>  }
>  
> +#ifdef CONFIG_OF
> +
>  static int loongson_map_irq(const struct pci_dev *dev, u8 slot, u8 pin)
>  {
>  	int irq;
> @@ -259,3 +277,41 @@ static struct platform_driver loongson_pci_driver = {
>  	.probe = loongson_pci_probe,
>  };
>  builtin_platform_driver(loongson_pci_driver);
> +
> +#endif
> +
> +#ifdef CONFIG_ACPI
> +
> +static int loongson_pci_ecam_init(struct pci_config_window *cfg)
> +{
> +	struct device *dev = cfg->parent;
> +	struct loongson_pci *priv;
> +	struct loongson_pci_data *data;
> +
> +	priv = devm_kzalloc(dev, sizeof(*priv), GFP_KERNEL);
> +	if (!priv)
> +		return -ENOMEM;
> +
> +	data = devm_kzalloc(dev, sizeof(*data), GFP_KERNEL);
> +	if (!data)
> +		return -ENOMEM;
> +
> +	cfg->priv = priv;
> +	data->flags = FLAG_CFG1;
> +	priv->data = data;
> +	priv->cfg1_base = cfg->win - (cfg->busr.start << 16);
> +
> +	return 0;
> +}
> +
> +const struct pci_ecam_ops loongson_pci_ecam_ops = {
> +	.bus_shift = 16,

I can't remember the details of how this works.  The standard PCIe
ECAM has:

  A[11:00] 12 bits of Register number/alignment
  A[14:12]  3 bits of Function number
  A[19:15]  5 bits of Device number
  A[27:20]  8 bits of Bus number

Doesn't a bus_shift of 16 mean there are only 16 bits available for
the register number, function, and device?  So that would limit us to
8 bits of register number?  Do we enforce that somewhere?

It seems like a limit on "where" should be enforced in
pci_ecam_map_bus() and other .map_bus() functions like
pci_loongson_map_bus().  Otherwise a config read at offset
0x100 would read config space of the wrong device.

Krzysztof, do you remember how this works?

> +	.init	   = loongson_pci_ecam_init,
> +	.pci_ops   = {
> +		.map_bus = pci_loongson_map_bus,
> +		.read	 = pci_generic_config_read,
> +		.write	 = pci_generic_config_write,
> +	}
> +};

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH V13 3/6] PCI: loongson: Don't access unexisting devices
  2022-04-30  8:48 ` [PATCH V13 3/6] PCI: loongson: Don't access unexisting devices Huacai Chen
@ 2022-05-31 23:14   ` Bjorn Helgaas
  2022-06-02  4:28     ` Huacai Chen
  0 siblings, 1 reply; 35+ messages in thread
From: Bjorn Helgaas @ 2022-05-31 23:14 UTC (permalink / raw)
  To: Huacai Chen
  Cc: Bjorn Helgaas, Lorenzo Pieralisi, Rob Herring,
	Krzysztof Wilczyński, linux-pci, Xuefeng Li, Huacai Chen,
	Jiaxun Yang

On Sat, Apr 30, 2022 at 04:48:43PM +0800, Huacai Chen wrote:
> On LS2K/LS7A, some unexisting devices don't return 0xffffffff when
> scanning. This is a hardware flaw but we can only avoid it by software
> now.

s/unexisting/non-existant/ (many occurrences: subject line, commit
log, comments below)

What happens in other situations that normally cause Unsupported
Request or similar errors?  For example, memory reads/writes to a
device in D3hot should cause an Unsupported Request error.  I'm
wondering whether other error handling assumptions might be broken
on LS2K/LS7A.

> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
> ---
>  drivers/pci/controller/pci-loongson.c | 11 +++++++++--
>  1 file changed, 9 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/pci/controller/pci-loongson.c b/drivers/pci/controller/pci-loongson.c
> index adbfa4a2330f..48316daa1f23 100644
> --- a/drivers/pci/controller/pci-loongson.c
> +++ b/drivers/pci/controller/pci-loongson.c
> @@ -138,6 +138,8 @@ static void __iomem *pci_loongson_map_bus(struct pci_bus *bus, unsigned int devf
>  			       int where)
>  {
>  	unsigned char busnum = bus->number;
> +	unsigned int device = PCI_SLOT(devfn);
> +	unsigned int function = PCI_FUNC(devfn);
>  	struct loongson_pci *priv = pci_bus_to_loongson_pci(bus);
>  
>  	if (pci_is_root_bus(bus))
> @@ -147,8 +149,13 @@ static void __iomem *pci_loongson_map_bus(struct pci_bus *bus, unsigned int devf
>  	 * Do not read more than one device on the bus other than
>  	 * the host bus. For our hardware the root bus is always bus 0.
>  	 */
> -	if (priv->data->flags & FLAG_DEV_FIX &&
> -			!pci_is_root_bus(bus) && PCI_SLOT(devfn) > 0)
> +	if ((priv->data->flags & FLAG_DEV_FIX) && bus->self) {
> +		if (!pci_is_root_bus(bus) && (device > 0))
> +			return NULL;
> +	}
> +
> +	/* Don't access unexisting devices */
> +	if (pci_is_root_bus(bus) && (device >= 9 && device <= 20 && function > 0))

Yuck.  This is pretty nasty magic.  If this is something that might be
fixed in future versions of the hardware, maybe you should factor this
out into a function pointer in loongson_pci_data or something.

>  		return NULL;
>  
>  	/* CFG0 can only access standard space */
> -- 
> 2.27.0
> 

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH V13 5/6] PCI: Add quirk for LS7A to avoid reboot failure
  2022-04-30  8:48 ` [PATCH V13 5/6] PCI: Add quirk for LS7A to avoid reboot failure Huacai Chen
@ 2022-05-31 23:35   ` Bjorn Helgaas
  2022-06-02 12:48     ` Huacai Chen
  0 siblings, 1 reply; 35+ messages in thread
From: Bjorn Helgaas @ 2022-05-31 23:35 UTC (permalink / raw)
  To: Huacai Chen
  Cc: Bjorn Helgaas, Lorenzo Pieralisi, Rob Herring,
	Krzysztof Wilczyński, linux-pci, Xuefeng Li, Huacai Chen,
	Jiaxun Yang

On Sat, Apr 30, 2022 at 04:48:45PM +0800, Huacai Chen wrote:
> Commit cc27b735ad3a75574a ("PCI/portdrv: Turn off PCIe services during
> shutdown") 

Usual quoting style would be 

  cc27b735ad3a ("PCI/portdrv: Turn off PCIe services during shutdown")
  causes poweroff/reboot ...

> causes poweroff/reboot failure on systems with LS7A chipset.
> We found that if we remove "pci_command &= ~PCI_COMMAND_MASTER;" in
> do_pci_disable_device(), it can work well. The hardware engineer says
> that the root cause is that CPU is still accessing PCIe devices while
> poweroff/reboot, and if we disable the Bus Master Bit at this time, the
> PCIe controller doesn't forward requests to downstream devices, and also
> doesn't send TIMEOUT to CPU, which causes CPU wait forever (hardware
> deadlock). This behavior is a PCIe protocol violation (Bus Master should
> not be involved in CPU MMIO transactions), and it will be fixed in new
> revisions of hardware (add timeout mechanism for CPU read request,
> whether or not Bus Master bit is cleared).

LS7A might have bugs in that clearing Bus Master Enable prevents the
root port from forwarding Memory or I/O requests in the downstream
direction.

But this feels like a bit of a band-aid because we don't know exactly
what those requests are.  If we're removing the Root Port, I assume we
think we no longer need any devices *below* the Root Port.

If that's not the case, e.g., if we still need to produce console
output or save state to a device, we probably should not be removing
the Root Port at all.

> On some x86 platforms, radeon/amdgpu devices can cause similar problems
> [1][2]. Once before I wanted to make a single patch to solve "all of
> these problems" together, but it seems unreasonable because maybe they
> are not exactly the same problem. So, this patch just add a quirk for
> LS7A to avoid clearing Bus Master bit in pcie_port_device_remove(), and
> leave other platforms as is.
> 
> [1] https://bugs.freedesktop.org/show_bug.cgi?id=97980
> [2] https://bugs.freedesktop.org/show_bug.cgi?id=98638
> 
> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
> ---
>  drivers/pci/controller/pci-loongson.c | 17 +++++++++++++++++
>  drivers/pci/pcie/portdrv_core.c       |  6 +++++-
>  include/linux/pci.h                   |  1 +
>  3 files changed, 23 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/pci/controller/pci-loongson.c b/drivers/pci/controller/pci-loongson.c
> index 83447264048a..49d8b8c24ffb 100644
> --- a/drivers/pci/controller/pci-loongson.c
> +++ b/drivers/pci/controller/pci-loongson.c
> @@ -85,6 +85,23 @@ DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_LOONGSON,
>  DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_LOONGSON,
>  			DEV_PCIE_PORT_2, loongson_mrrs_quirk);
>  
> +static void loongson_bmaster_quirk(struct pci_dev *pdev)
> +{
> +	/*
> +	 * Some Loongson PCIe ports will cause CPU deadlock if disable
> +	 * the Bus Master bit during poweroff/reboot.
> +	 */
> +	struct pci_host_bridge *bridge = pci_find_host_bridge(pdev->bus);
> +
> +	bridge->no_dis_bmaster = 1;
> +}
> +DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_LOONGSON,
> +			DEV_PCIE_PORT_0, loongson_bmaster_quirk);
> +DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_LOONGSON,
> +			DEV_PCIE_PORT_1, loongson_bmaster_quirk);
> +DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_LOONGSON,
> +			DEV_PCIE_PORT_2, loongson_bmaster_quirk);
> +
>  static struct loongson_pci *pci_bus_to_loongson_pci(struct pci_bus *bus)
>  {
>  	struct pci_config_window *cfg;
> diff --git a/drivers/pci/pcie/portdrv_core.c b/drivers/pci/pcie/portdrv_core.c
> index 604feeb84ee4..23f41e31a6c6 100644
> --- a/drivers/pci/pcie/portdrv_core.c
> +++ b/drivers/pci/pcie/portdrv_core.c
> @@ -491,9 +491,13 @@ EXPORT_SYMBOL_GPL(pcie_port_find_device);
>   */
>  void pcie_port_device_remove(struct pci_dev *dev)
>  {
> +	struct pci_host_bridge *bridge = pci_find_host_bridge(dev->bus);
> +
>  	device_for_each_child(&dev->dev, NULL, remove_iter);
>  	pci_free_irq_vectors(dev);
> -	pci_disable_device(dev);
> +
> +	if (!bridge->no_dis_bmaster)
> +		pci_disable_device(dev);
>  }
>  
>  /**
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index d146eb28e6da..c52d6486ff99 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -570,6 +570,7 @@ struct pci_host_bridge {
>  	unsigned int	ignore_reset_delay:1;	/* For entire hierarchy */
>  	unsigned int	no_ext_tags:1;		/* No Extended Tags */
>  	unsigned int	no_inc_mrrs:1;		/* No Increase MRRS */
> +	unsigned int	no_dis_bmaster:1;	/* No Disable Bus Master */
>  	unsigned int	native_aer:1;		/* OS may use PCIe AER */
>  	unsigned int	native_pcie_hotplug:1;	/* OS may use PCIe hotplug */
>  	unsigned int	native_shpc_hotplug:1;	/* OS may use SHPC hotplug */
> -- 
> 2.27.0
> 

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH V13 6/6] PCI: Add quirk for multifunction devices of LS7A
  2022-04-30  8:48 ` [PATCH V13 6/6] PCI: Add quirk for multifunction devices of LS7A Huacai Chen
@ 2022-06-01  2:07   ` Bjorn Helgaas
  2022-06-01  7:36     ` Jianmin Lv
  0 siblings, 1 reply; 35+ messages in thread
From: Bjorn Helgaas @ 2022-06-01  2:07 UTC (permalink / raw)
  To: Huacai Chen
  Cc: Bjorn Helgaas, Lorenzo Pieralisi, Rob Herring,
	Krzysztof Wilczyński, linux-pci, Xuefeng Li, Huacai Chen,
	Jiaxun Yang, Jianmin Lv

On Sat, Apr 30, 2022 at 04:48:46PM +0800, Huacai Chen wrote:
> From: Jianmin Lv <lvjianmin@loongson.cn>
> 
> In LS7A, multifunction device use same PCI PIN (because the PIN register
> report the same INTx value to each function) but we need different IRQ
> for different functions, so add a quirk to fix it for standard PCI PIN
> usage.

Is this because your _PRT is missing or broken?  of_irq_parse_pci()
reads and relies on PCI_INTERRUPT_PIN, too, so it seems like the _PRT
could contain the same routing information you're getting from DT.

> This patch only affect ACPI based systems (and only needed by ACPI based
> systems, too). For DT based systems, the irq mappings is defined in .dts
> files and be handled by of_irq_parse_pci().
>
> Signed-off-by: Jianmin Lv <lvjianmin@loongson.cn>
> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
> ---
>  drivers/pci/controller/pci-loongson.c | 32 +++++++++++++++++++++++++++
>  1 file changed, 32 insertions(+)
> 
> diff --git a/drivers/pci/controller/pci-loongson.c b/drivers/pci/controller/pci-loongson.c
> index 49d8b8c24ffb..024542a31a8c 100644
> --- a/drivers/pci/controller/pci-loongson.c
> +++ b/drivers/pci/controller/pci-loongson.c
> @@ -22,6 +22,13 @@
>  #define DEV_LS2K_APB	0x7a02
>  #define DEV_LS7A_CONF	0x7a10
>  #define DEV_LS7A_LPC	0x7a0c
> +#define DEV_LS7A_GMAC	0x7a03
> +#define DEV_LS7A_DC1	0x7a06
> +#define DEV_LS7A_DC2	0x7a36
> +#define DEV_LS7A_GPU	0x7a15
> +#define DEV_LS7A_AHCI	0x7a08
> +#define DEV_LS7A_EHCI	0x7a14
> +#define DEV_LS7A_OHCI	0x7a24
>  
>  #define FLAG_CFG0	BIT(0)
>  #define FLAG_CFG1	BIT(1)
> @@ -102,6 +109,31 @@ DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_LOONGSON,
>  DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_LOONGSON,
>  			DEV_PCIE_PORT_2, loongson_bmaster_quirk);
>  
> +static void loongson_pci_pin_quirk(struct pci_dev *pdev)
> +{
> +	pdev->pin = 1 + (PCI_FUNC(pdev->devfn) & 3);
> +}
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LOONGSON,
> +			DEV_LS7A_DC1, loongson_pci_pin_quirk);
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LOONGSON,
> +			DEV_LS7A_DC2, loongson_pci_pin_quirk);
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LOONGSON,
> +			DEV_LS7A_GPU, loongson_pci_pin_quirk);
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LOONGSON,
> +			DEV_LS7A_GMAC, loongson_pci_pin_quirk);
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LOONGSON,
> +			DEV_LS7A_AHCI, loongson_pci_pin_quirk);
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LOONGSON,
> +			DEV_LS7A_EHCI, loongson_pci_pin_quirk);
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LOONGSON,
> +			DEV_LS7A_OHCI, loongson_pci_pin_quirk);
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LOONGSON,
> +			DEV_PCIE_PORT_0, loongson_pci_pin_quirk);
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LOONGSON,
> +			DEV_PCIE_PORT_1, loongson_pci_pin_quirk);
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LOONGSON,
> +			DEV_PCIE_PORT_2, loongson_pci_pin_quirk);
> +
>  static struct loongson_pci *pci_bus_to_loongson_pci(struct pci_bus *bus)
>  {
>  	struct pci_config_window *cfg;
> -- 
> 2.27.0
> 

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH V13 1/6] PCI: loongson: Use generic 8/16/32-bit config ops on LS2K/LS7A
  2022-04-30  8:48 ` [PATCH V13 1/6] PCI: loongson: Use generic 8/16/32-bit config ops on LS2K/LS7A Huacai Chen
@ 2022-06-01  2:08   ` Bjorn Helgaas
  2022-06-02  4:18     ` Huacai Chen
  0 siblings, 1 reply; 35+ messages in thread
From: Bjorn Helgaas @ 2022-06-01  2:08 UTC (permalink / raw)
  To: Huacai Chen
  Cc: Bjorn Helgaas, Lorenzo Pieralisi, Rob Herring,
	Krzysztof Wilczyński, linux-pci, Xuefeng Li, Huacai Chen,
	Jiaxun Yang

On Sat, Apr 30, 2022 at 04:48:41PM +0800, Huacai Chen wrote:
> LS2K/LS7A support 8/16/32-bits PCI config access operations via CFG1, so
> we can disable CFG0 for them and safely use pci_generic_config_read()/
> pci_generic_config_write() instead of pci_generic_config_read32()/pci_
> generic_config_write32().
> 
> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>

After removing cast below,

Acked-by: Bjorn Helgaas <bhelgaas@google.com>

> @@ -193,20 +220,20 @@ static int loongson_pci_probe(struct platform_device *pdev)
>  
>  	priv = pci_host_bridge_priv(bridge);
>  	priv->pdev = pdev;
> -	priv->flags = (unsigned long)of_device_get_match_data(dev);
> +	priv->data = (struct loongson_pci_data *)of_device_get_match_data(dev);

No cast needed.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH V13 4/6] PCI: loongson: Improve the MRRS quirk for LS7A
  2022-04-30  8:48 ` [PATCH V13 4/6] PCI: loongson: Improve the MRRS quirk for LS7A Huacai Chen
@ 2022-06-01  2:22   ` Bjorn Helgaas
  2022-06-01 11:59     ` Jiaxun Yang
  0 siblings, 1 reply; 35+ messages in thread
From: Bjorn Helgaas @ 2022-06-01  2:22 UTC (permalink / raw)
  To: Huacai Chen
  Cc: Bjorn Helgaas, Lorenzo Pieralisi, Rob Herring,
	Krzysztof Wilczyński, linux-pci, Xuefeng Li, Huacai Chen,
	Jiaxun Yang

On Sat, Apr 30, 2022 at 04:48:44PM +0800, Huacai Chen wrote:
> In new revision of LS7A, some PCIe ports support larger value than 256,
> but their maximum supported MRRS values are not detectable. Moreover,
> the current loongson_mrrs_quirk() cannot avoid devices increasing its
> MRRS after pci_enable_device(), and some devices (e.g. Realtek 8169)
> will actually set a big value in its driver. So the only possible way
> is configure MRRS of all devices in BIOS, and add a pci host bridge bit
> flag (i.e., no_inc_mrrs) to stop the increasing MRRS operations.
> 
> However, according to PCIe Spec, it is legal for an OS to program any
> value for MRRS, and it is also legal for an endpoint to generate a Read
> Request with any size up to its MRRS. As the hardware engineers say, the
> root cause here is LS7A doesn't break up large read requests. In detail,
> LS7A PCIe port reports CA (Completer Abort) if it receives a Memory Read
> request with a size that's "too big" ("too big" means larger than the
> PCIe ports can handle, which means 256 for some ports and 4096 for the
> others, and of course this is a problem in the LS7A's hardware design).

This seems essentially similar to ks_pcie_quirk() [1].  Why are they
different, and why do you need no_inc_mrrs, when keystone doesn't?

Or *does* keystone need it and we just haven't figured that out yet?
Are all callers of pcie_set_readrq() vulnerable to issues there?

Whatever we do should be as uniform as possible across host
controllers.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/pci/controller/dwc/pci-keystone.c?id=v5.18#n528

> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
> ---
>  drivers/pci/controller/pci-loongson.c | 44 +++++++++------------------
>  drivers/pci/pci.c                     |  6 ++++
>  include/linux/pci.h                   |  1 +
>  3 files changed, 22 insertions(+), 29 deletions(-)
> 
> diff --git a/drivers/pci/controller/pci-loongson.c b/drivers/pci/controller/pci-loongson.c
> index 48316daa1f23..83447264048a 100644
> --- a/drivers/pci/controller/pci-loongson.c
> +++ b/drivers/pci/controller/pci-loongson.c
> @@ -67,37 +67,23 @@ DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_LOONGSON,
>  DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_LOONGSON,
>  			DEV_LS7A_LPC, system_bus_quirk);
>  
> -static void loongson_mrrs_quirk(struct pci_dev *dev)
> +static void loongson_mrrs_quirk(struct pci_dev *pdev)
>  {
> -	struct pci_bus *bus = dev->bus;
> -	struct pci_dev *bridge;
> -	static const struct pci_device_id bridge_devids[] = {
> -		{ PCI_VDEVICE(LOONGSON, DEV_PCIE_PORT_0) },
> -		{ PCI_VDEVICE(LOONGSON, DEV_PCIE_PORT_1) },
> -		{ PCI_VDEVICE(LOONGSON, DEV_PCIE_PORT_2) },
> -		{ 0, },
> -	};
> -
> -	/* look for the matching bridge */
> -	while (!pci_is_root_bus(bus)) {
> -		bridge = bus->self;
> -		bus = bus->parent;
> -		/*
> -		 * Some Loongson PCIe ports have a h/w limitation of
> -		 * 256 bytes maximum read request size. They can't handle
> -		 * anything larger than this. So force this limit on
> -		 * any devices attached under these ports.
> -		 */
> -		if (pci_match_id(bridge_devids, bridge)) {
> -			if (pcie_get_readrq(dev) > 256) {
> -				pci_info(dev, "limiting MRRS to 256\n");
> -				pcie_set_readrq(dev, 256);
> -			}
> -			break;
> -		}
> -	}
> +	/*
> +	 * Some Loongson PCIe ports have h/w limitations of maximum read
> +	 * request size. They can't handle anything larger than this. So
> +	 * force this limit on any devices attached under these ports.
> +	 */
> +	struct pci_host_bridge *bridge = pci_find_host_bridge(pdev->bus);
> +
> +	bridge->no_inc_mrrs = 1;
>  }
> -DECLARE_PCI_FIXUP_ENABLE(PCI_ANY_ID, PCI_ANY_ID, loongson_mrrs_quirk);
> +DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_LOONGSON,
> +			DEV_PCIE_PORT_0, loongson_mrrs_quirk);
> +DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_LOONGSON,
> +			DEV_PCIE_PORT_1, loongson_mrrs_quirk);
> +DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_LOONGSON,
> +			DEV_PCIE_PORT_2, loongson_mrrs_quirk);
>  
>  static struct loongson_pci *pci_bus_to_loongson_pci(struct pci_bus *bus)
>  {
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 9ecce435fb3f..72a15bf9eee8 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -5993,6 +5993,7 @@ int pcie_set_readrq(struct pci_dev *dev, int rq)
>  {
>  	u16 v;
>  	int ret;
> +	struct pci_host_bridge *bridge = pci_find_host_bridge(dev->bus);
>  
>  	if (rq < 128 || rq > 4096 || !is_power_of_2(rq))
>  		return -EINVAL;
> @@ -6011,6 +6012,11 @@ int pcie_set_readrq(struct pci_dev *dev, int rq)
>  
>  	v = (ffs(rq) - 8) << 12;
>  
> +	if (bridge->no_inc_mrrs) {
> +		if (rq > pcie_get_readrq(dev))
> +			return -EINVAL;
> +	}
> +
>  	ret = pcie_capability_clear_and_set_word(dev, PCI_EXP_DEVCTL,
>  						  PCI_EXP_DEVCTL_READRQ, v);
>  
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 60adf42460ab..d146eb28e6da 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -569,6 +569,7 @@ struct pci_host_bridge {
>  	void		*release_data;
>  	unsigned int	ignore_reset_delay:1;	/* For entire hierarchy */
>  	unsigned int	no_ext_tags:1;		/* No Extended Tags */
> +	unsigned int	no_inc_mrrs:1;		/* No Increase MRRS */
>  	unsigned int	native_aer:1;		/* OS may use PCIe AER */
>  	unsigned int	native_pcie_hotplug:1;	/* OS may use PCIe hotplug */
>  	unsigned int	native_shpc_hotplug:1;	/* OS may use SHPC hotplug */
> -- 
> 2.27.0
> 

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH V13 6/6] PCI: Add quirk for multifunction devices of LS7A
  2022-06-01  2:07   ` Bjorn Helgaas
@ 2022-06-01  7:36     ` Jianmin Lv
  0 siblings, 0 replies; 35+ messages in thread
From: Jianmin Lv @ 2022-06-01  7:36 UTC (permalink / raw)
  To: Bjorn Helgaas, Huacai Chen
  Cc: Bjorn Helgaas, Lorenzo Pieralisi, Rob Herring,
	Krzysztof Wilczyński, linux-pci, Xuefeng Li, Huacai Chen,
	Jiaxun Yang


On 2022/6/1 上午10:07, Bjorn Helgaas wrote:
> On Sat, Apr 30, 2022 at 04:48:46PM +0800, Huacai Chen wrote:
>> From: Jianmin Lv <lvjianmin@loongson.cn>
>>
>> In LS7A, multifunction device use same PCI PIN (because the PIN register
>> report the same INTx value to each function) but we need different IRQ
>> for different functions, so add a quirk to fix it for standard PCI PIN
>> usage.
> Is this because your _PRT is missing or broken?  of_irq_parse_pci()
> reads and relies on PCI_INTERRUPT_PIN, too, so it seems like the _PRT
> could contain the same routing information you're getting from DT.
>

Thanks for your reply first.

In _PRT, we have following packages for pci device 6(a multi-fun device),


Package (0x04)
{
     0x0006FFFF,
     Zero,
     Zero,
     0x5D
     },

Package (0x04)
{
     0x0006FFFF,
     One,
     Zero,
     0x5C
},

the 'Pin' in two packages must be diffirent(0-INTA, 1-INTB, 2-INTC, 
3-INTD) because that the 'Pin' of the _PRT entry will be compared with 
the pin in the PIN register of pci config space for a pci device in 
following code path:

pci_device_probe
  ->pcibios_alloc_irq
    ->acpi_pci_irq_enable
      ->acpi_pci_irq_lookup
        ->acpi_pci_irq_find_prt_entry
          ->acpi_pci_irq_check_entry

and in acpi_pci_irq_check_entry(), there is following code:

         if (((prt->address >> 16) & 0xffff) != device ||
             prt->pin + 1 != pin)
                 return -ENODEV;

In LS7A, the PIN register returns same value(0) for all the different 
function, so, we have to fix it first.

I don't know if there is any other way to address it.


>> This patch only affect ACPI based systems (and only needed by ACPI based
>> systems, too). For DT based systems, the irq mappings is defined in .dts
>> files and be handled by of_irq_parse_pci().
>>
>> Signed-off-by: Jianmin Lv <lvjianmin@loongson.cn>
>> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
>> ---
>>   drivers/pci/controller/pci-loongson.c | 32 +++++++++++++++++++++++++++
>>   1 file changed, 32 insertions(+)
>>
>> diff --git a/drivers/pci/controller/pci-loongson.c b/drivers/pci/controller/pci-loongson.c
>> index 49d8b8c24ffb..024542a31a8c 100644
>> --- a/drivers/pci/controller/pci-loongson.c
>> +++ b/drivers/pci/controller/pci-loongson.c
>> @@ -22,6 +22,13 @@
>>   #define DEV_LS2K_APB	0x7a02
>>   #define DEV_LS7A_CONF	0x7a10
>>   #define DEV_LS7A_LPC	0x7a0c
>> +#define DEV_LS7A_GMAC	0x7a03
>> +#define DEV_LS7A_DC1	0x7a06
>> +#define DEV_LS7A_DC2	0x7a36
>> +#define DEV_LS7A_GPU	0x7a15
>> +#define DEV_LS7A_AHCI	0x7a08
>> +#define DEV_LS7A_EHCI	0x7a14
>> +#define DEV_LS7A_OHCI	0x7a24
>>   
>>   #define FLAG_CFG0	BIT(0)
>>   #define FLAG_CFG1	BIT(1)
>> @@ -102,6 +109,31 @@ DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_LOONGSON,
>>   DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_LOONGSON,
>>   			DEV_PCIE_PORT_2, loongson_bmaster_quirk);
>>   
>> +static void loongson_pci_pin_quirk(struct pci_dev *pdev)
>> +{
>> +	pdev->pin = 1 + (PCI_FUNC(pdev->devfn) & 3);
>> +}
>> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LOONGSON,
>> +			DEV_LS7A_DC1, loongson_pci_pin_quirk);
>> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LOONGSON,
>> +			DEV_LS7A_DC2, loongson_pci_pin_quirk);
>> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LOONGSON,
>> +			DEV_LS7A_GPU, loongson_pci_pin_quirk);
>> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LOONGSON,
>> +			DEV_LS7A_GMAC, loongson_pci_pin_quirk);
>> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LOONGSON,
>> +			DEV_LS7A_AHCI, loongson_pci_pin_quirk);
>> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LOONGSON,
>> +			DEV_LS7A_EHCI, loongson_pci_pin_quirk);
>> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LOONGSON,
>> +			DEV_LS7A_OHCI, loongson_pci_pin_quirk);
>> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LOONGSON,
>> +			DEV_PCIE_PORT_0, loongson_pci_pin_quirk);
>> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LOONGSON,
>> +			DEV_PCIE_PORT_1, loongson_pci_pin_quirk);
>> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LOONGSON,
>> +			DEV_PCIE_PORT_2, loongson_pci_pin_quirk);
>> +
>>   static struct loongson_pci *pci_bus_to_loongson_pci(struct pci_bus *bus)
>>   {
>>   	struct pci_config_window *cfg;
>> -- 
>> 2.27.0
>>


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH V13 4/6] PCI: loongson: Improve the MRRS quirk for LS7A
  2022-06-01  2:22   ` Bjorn Helgaas
@ 2022-06-01 11:59     ` Jiaxun Yang
  2022-06-02  4:17       ` Huacai Chen
  2022-06-02 16:20       ` Bjorn Helgaas
  0 siblings, 2 replies; 35+ messages in thread
From: Jiaxun Yang @ 2022-06-01 11:59 UTC (permalink / raw)
  To: Bjorn Helgaas, Huacai Chen
  Cc: Bjorn Helgaas, Lorenzo Pieralisi, Rob Herring,
	Krzysztof Wilczyński, linux-pci, Xuefeng Li, Huacai Chen



在2022年6月1日六月 上午3:22,Bjorn Helgaas写道:
> On Sat, Apr 30, 2022 at 04:48:44PM +0800, Huacai Chen wrote:
>> In new revision of LS7A, some PCIe ports support larger value than 256,
>> but their maximum supported MRRS values are not detectable. Moreover,
>> the current loongson_mrrs_quirk() cannot avoid devices increasing its
>> MRRS after pci_enable_device(), and some devices (e.g. Realtek 8169)
>> will actually set a big value in its driver. So the only possible way
>> is configure MRRS of all devices in BIOS, and add a pci host bridge bit
>> flag (i.e., no_inc_mrrs) to stop the increasing MRRS operations.
>> 
>> However, according to PCIe Spec, it is legal for an OS to program any
>> value for MRRS, and it is also legal for an endpoint to generate a Read
>> Request with any size up to its MRRS. As the hardware engineers say, the
>> root cause here is LS7A doesn't break up large read requests. In detail,
>> LS7A PCIe port reports CA (Completer Abort) if it receives a Memory Read
>> request with a size that's "too big" ("too big" means larger than the
>> PCIe ports can handle, which means 256 for some ports and 4096 for the
>> others, and of course this is a problem in the LS7A's hardware design).
>
> This seems essentially similar to ks_pcie_quirk() [1].  Why are they
> different, and why do you need no_inc_mrrs, when keystone doesn't?
>
> Or *does* keystone need it and we just haven't figured that out yet?
> Are all callers of pcie_set_readrq() vulnerable to issues there?

Yes actually keystone may need to set this flag as well.

I think Huacai missed a point in his commit message about why he removed
the process of walking through the bus and set proper MRRS. That’s
because Loongson’s firmware will set proper MRRS and the only thing
that Kernel needs to do is leave it as is. no_inc_mrrs is introduced for
this purpose.

In keystone’s case it’s likely that their firmware won’t do such thing, so
their workaround shouldn’t be removed.
And  no_inc_mrrs should be set for them to prevent device drivers modifying
MRRS afterwards.

Thanks
- Jiaxun

>
> Whatever we do should be as uniform as possible across host
> controllers.
>
> [1] 
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/pci/controller/dwc/pci-keystone.c?id=v5.18#n528
>
-- 
- Jiaxun

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH V13 4/6] PCI: loongson: Improve the MRRS quirk for LS7A
  2022-06-01 11:59     ` Jiaxun Yang
@ 2022-06-02  4:17       ` Huacai Chen
  2022-06-02 16:20       ` Bjorn Helgaas
  1 sibling, 0 replies; 35+ messages in thread
From: Huacai Chen @ 2022-06-02  4:17 UTC (permalink / raw)
  To: Jiaxun Yang
  Cc: Bjorn Helgaas, Huacai Chen, Bjorn Helgaas, Lorenzo Pieralisi,
	Rob Herring, Krzysztof Wilczyński, linux-pci, Xuefeng Li

Hi, Bjorn,

On Wed, Jun 1, 2022 at 8:00 PM Jiaxun Yang <jiaxun.yang@flygoat.com> wrote:
>
>
>
> 在2022年6月1日六月 上午3:22,Bjorn Helgaas写道:
> > On Sat, Apr 30, 2022 at 04:48:44PM +0800, Huacai Chen wrote:
> >> In new revision of LS7A, some PCIe ports support larger value than 256,
> >> but their maximum supported MRRS values are not detectable. Moreover,
> >> the current loongson_mrrs_quirk() cannot avoid devices increasing its
> >> MRRS after pci_enable_device(), and some devices (e.g. Realtek 8169)
> >> will actually set a big value in its driver. So the only possible way
> >> is configure MRRS of all devices in BIOS, and add a pci host bridge bit
> >> flag (i.e., no_inc_mrrs) to stop the increasing MRRS operations.
> >>
> >> However, according to PCIe Spec, it is legal for an OS to program any
> >> value for MRRS, and it is also legal for an endpoint to generate a Read
> >> Request with any size up to its MRRS. As the hardware engineers say, the
> >> root cause here is LS7A doesn't break up large read requests. In detail,
> >> LS7A PCIe port reports CA (Completer Abort) if it receives a Memory Read
> >> request with a size that's "too big" ("too big" means larger than the
> >> PCIe ports can handle, which means 256 for some ports and 4096 for the
> >> others, and of course this is a problem in the LS7A's hardware design).
> >
> > This seems essentially similar to ks_pcie_quirk() [1].  Why are they
> > different, and why do you need no_inc_mrrs, when keystone doesn't?
> >
> > Or *does* keystone need it and we just haven't figured that out yet?
> > Are all callers of pcie_set_readrq() vulnerable to issues there?
>
> Yes actually keystone may need to set this flag as well.
>
> I think Huacai missed a point in his commit message about why he removed
> the process of walking through the bus and set proper MRRS. That’s
> because Loongson’s firmware will set proper MRRS and the only thing
> that Kernel needs to do is leave it as is. no_inc_mrrs is introduced for
> this purpose.
>
> In keystone’s case it’s likely that their firmware won’t do such thing, so
> their workaround shouldn’t be removed.
> And  no_inc_mrrs should be set for them to prevent device drivers modifying
> MRRS afterwards.
Yes, the fact is the same as Jiaxun says.

Huacai
>
> Thanks
> - Jiaxun
>
> >
> > Whatever we do should be as uniform as possible across host
> > controllers.
> >
> > [1]
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/pci/controller/dwc/pci-keystone.c?id=v5.18#n528
> >
> --
> - Jiaxun

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH V13 1/6] PCI: loongson: Use generic 8/16/32-bit config ops on LS2K/LS7A
  2022-06-01  2:08   ` Bjorn Helgaas
@ 2022-06-02  4:18     ` Huacai Chen
  0 siblings, 0 replies; 35+ messages in thread
From: Huacai Chen @ 2022-06-02  4:18 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Huacai Chen, Bjorn Helgaas, Lorenzo Pieralisi, Rob Herring,
	Krzysztof Wilczyński, linux-pci, Xuefeng Li, Jiaxun Yang

Hi, Bjorn,

On Wed, Jun 1, 2022 at 10:08 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
>
> On Sat, Apr 30, 2022 at 04:48:41PM +0800, Huacai Chen wrote:
> > LS2K/LS7A support 8/16/32-bits PCI config access operations via CFG1, so
> > we can disable CFG0 for them and safely use pci_generic_config_read()/
> > pci_generic_config_write() instead of pci_generic_config_read32()/pci_
> > generic_config_write32().
> >
> > Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
>
> After removing cast below,
OK, thanks.

Huacai
>
> Acked-by: Bjorn Helgaas <bhelgaas@google.com>
>
> > @@ -193,20 +220,20 @@ static int loongson_pci_probe(struct platform_device *pdev)
> >
> >       priv = pci_host_bridge_priv(bridge);
> >       priv->pdev = pdev;
> > -     priv->flags = (unsigned long)of_device_get_match_data(dev);
> > +     priv->data = (struct loongson_pci_data *)of_device_get_match_data(dev);
>
> No cast needed.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH V13 3/6] PCI: loongson: Don't access unexisting devices
  2022-05-31 23:14   ` Bjorn Helgaas
@ 2022-06-02  4:28     ` Huacai Chen
  2022-06-02 16:23       ` Bjorn Helgaas
  0 siblings, 1 reply; 35+ messages in thread
From: Huacai Chen @ 2022-06-02  4:28 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Huacai Chen, Bjorn Helgaas, Lorenzo Pieralisi, Rob Herring,
	Krzysztof Wilczyński, linux-pci, Xuefeng Li, Jiaxun Yang

Hi, Bjorn,

On Wed, Jun 1, 2022 at 7:14 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
>
> On Sat, Apr 30, 2022 at 04:48:43PM +0800, Huacai Chen wrote:
> > On LS2K/LS7A, some unexisting devices don't return 0xffffffff when
> > scanning. This is a hardware flaw but we can only avoid it by software
> > now.
>
> s/unexisting/non-existant/ (many occurrences: subject line, commit
> log, comments below)
OK, thanks.

>
> What happens in other situations that normally cause Unsupported
> Request or similar errors?  For example, memory reads/writes to a
> device in D3hot should cause an Unsupported Request error.  I'm
> wondering whether other error handling assumptions might be broken
> on LS2K/LS7A.
Hardware engineers told me that the problem is due to pin
multiplexing, under some configurations, a PCI device is unusable but
the read request doesn't return 0xffffffff.

>
> > Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
> > ---
> >  drivers/pci/controller/pci-loongson.c | 11 +++++++++--
> >  1 file changed, 9 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/pci/controller/pci-loongson.c b/drivers/pci/controller/pci-loongson.c
> > index adbfa4a2330f..48316daa1f23 100644
> > --- a/drivers/pci/controller/pci-loongson.c
> > +++ b/drivers/pci/controller/pci-loongson.c
> > @@ -138,6 +138,8 @@ static void __iomem *pci_loongson_map_bus(struct pci_bus *bus, unsigned int devf
> >                              int where)
> >  {
> >       unsigned char busnum = bus->number;
> > +     unsigned int device = PCI_SLOT(devfn);
> > +     unsigned int function = PCI_FUNC(devfn);
> >       struct loongson_pci *priv = pci_bus_to_loongson_pci(bus);
> >
> >       if (pci_is_root_bus(bus))
> > @@ -147,8 +149,13 @@ static void __iomem *pci_loongson_map_bus(struct pci_bus *bus, unsigned int devf
> >        * Do not read more than one device on the bus other than
> >        * the host bus. For our hardware the root bus is always bus 0.
> >        */
> > -     if (priv->data->flags & FLAG_DEV_FIX &&
> > -                     !pci_is_root_bus(bus) && PCI_SLOT(devfn) > 0)
> > +     if ((priv->data->flags & FLAG_DEV_FIX) && bus->self) {
> > +             if (!pci_is_root_bus(bus) && (device > 0))
> > +                     return NULL;
> > +     }
> > +
> > +     /* Don't access unexisting devices */
> > +     if (pci_is_root_bus(bus) && (device >= 9 && device <= 20 && function > 0))
>
> Yuck.  This is pretty nasty magic.  If this is something that might be
> fixed in future versions of the hardware, maybe you should factor this
> out into a function pointer in loongson_pci_data or something.
OK, seems providing a pdev_is_existant() is better.

Huacai
>
> >               return NULL;
> >
> >       /* CFG0 can only access standard space */
> > --
> > 2.27.0
> >

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH V13 2/6] PCI: loongson: Add ACPI init support
  2022-05-31 23:04   ` Bjorn Helgaas
@ 2022-06-02  7:09     ` Huacai Chen
  0 siblings, 0 replies; 35+ messages in thread
From: Huacai Chen @ 2022-06-02  7:09 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Huacai Chen, Bjorn Helgaas, Lorenzo Pieralisi, Rob Herring,
	Krzysztof Wilczyński, linux-pci, Xuefeng Li, Jiaxun Yang

Hi, Bjorn,

On Wed, Jun 1, 2022 at 7:04 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
>
> On Sat, Apr 30, 2022 at 04:48:42PM +0800, Huacai Chen wrote:
> > Loongson PCH (LS7A chipset) will be used by both MIPS-based and
> > LoongArch-based Loongson processors. MIPS-based Loongson uses FDT
> > while LoongArch-base Loongson uses ACPI, this patch add ACPI init
> > support for the driver in drivers/pci/controller/pci-loongson.c
> > because it is currently FDT-only.
> >
> > LoongArch is a new RISC ISA, mainline support will come soon, and
> > documentations are here (in translation):
> >
> > https://github.com/loongson/LoongArch-Documentation
> >
> > Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
> > ---
> >  drivers/acpi/pci_mcfg.c               | 13 ++++++
> >  drivers/pci/controller/Kconfig        |  2 +-
> >  drivers/pci/controller/pci-loongson.c | 60 ++++++++++++++++++++++++++-
> >  include/linux/pci-ecam.h              |  1 +
> >  4 files changed, 73 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/acpi/pci_mcfg.c b/drivers/acpi/pci_mcfg.c
> > index 53cab975f612..860014b89b8e 100644
> > --- a/drivers/acpi/pci_mcfg.c
> > +++ b/drivers/acpi/pci_mcfg.c
> > @@ -41,6 +41,8 @@ struct mcfg_fixup {
> >  static struct mcfg_fixup mcfg_quirks[] = {
> >  /*   { OEM_ID, OEM_TABLE_ID, REV, SEGMENT, BUS_RANGE, ops, cfgres }, */
> >
> > +#ifdef CONFIG_ARM64
> > +
> >  #define AL_ECAM(table_id, rev, seg, ops) \
> >       { "AMAZON", table_id, rev, seg, MCFG_BUS_ANY, ops }
> >
> > @@ -169,6 +171,17 @@ static struct mcfg_fixup mcfg_quirks[] = {
> >       ALTRA_ECAM_QUIRK(1, 13),
> >       ALTRA_ECAM_QUIRK(1, 14),
> >       ALTRA_ECAM_QUIRK(1, 15),
> > +#endif /* ARM64 */
>
> The addition of the CONFIG_ARM64 #ifdefs should be its own separate
> patch so it's not buried in this Loongson one.
OK, thanks.

>
> > +#ifdef CONFIG_LOONGARCH
> > +#define LOONGSON_ECAM_MCFG(table_id, seg) \
> > +     { "LOONGS", table_id, 1, seg, MCFG_BUS_ANY, &loongson_pci_ecam_ops }
> > +
> > +     LOONGSON_ECAM_MCFG("\0", 0),
> > +     LOONGSON_ECAM_MCFG("LOONGSON", 0),
> > +     LOONGSON_ECAM_MCFG("\0", 1),
> > +     LOONGSON_ECAM_MCFG("LOONGSON", 1),
> > +#endif /* LOONGARCH */
> >  };
> >
> >  static char mcfg_oem_id[ACPI_OEM_ID_SIZE];
> > diff --git a/drivers/pci/controller/Kconfig b/drivers/pci/controller/Kconfig
> > index b8d96d38064d..9dbd73898b47 100644
> > --- a/drivers/pci/controller/Kconfig
> > +++ b/drivers/pci/controller/Kconfig
> > @@ -293,7 +293,7 @@ config PCI_HYPERV_INTERFACE
> >  config PCI_LOONGSON
> >       bool "LOONGSON PCI Controller"
> >       depends on MACH_LOONGSON64 || COMPILE_TEST
> > -     depends on OF
> > +     depends on OF || ACPI
> >       depends on PCI_QUIRKS
> >       default MACH_LOONGSON64
> >       help
> > diff --git a/drivers/pci/controller/pci-loongson.c b/drivers/pci/controller/pci-loongson.c
> > index 0a947ace9478..adbfa4a2330f 100644
> > --- a/drivers/pci/controller/pci-loongson.c
> > +++ b/drivers/pci/controller/pci-loongson.c
> > @@ -9,6 +9,8 @@
> >  #include <linux/of_pci.h>
> >  #include <linux/pci.h>
> >  #include <linux/pci_ids.h>
> > +#include <linux/pci-acpi.h>
> > +#include <linux/pci-ecam.h>
> >
> >  #include "../pci.h"
> >
> > @@ -97,6 +99,18 @@ static void loongson_mrrs_quirk(struct pci_dev *dev)
> >  }
> >  DECLARE_PCI_FIXUP_ENABLE(PCI_ANY_ID, PCI_ANY_ID, loongson_mrrs_quirk);
> >
> > +static struct loongson_pci *pci_bus_to_loongson_pci(struct pci_bus *bus)
> > +{
> > +     struct pci_config_window *cfg;
> > +
> > +     if (acpi_disabled)
> > +             return (struct loongson_pci *)(bus->sysdata);
> > +     else {
> > +             cfg = bus->sysdata;
> > +             return (struct loongson_pci *)(cfg->priv);
> > +     }
> > +}
> > +
> >  static void __iomem *cfg1_map(struct loongson_pci *priv, int bus,
> >                               unsigned int devfn, int where)
> >  {
> > @@ -124,8 +138,10 @@ static void __iomem *pci_loongson_map_bus(struct pci_bus *bus, unsigned int devf
> >                              int where)
> >  {
> >       unsigned char busnum = bus->number;
> > -     struct pci_host_bridge *bridge = pci_find_host_bridge(bus);
> > -     struct loongson_pci *priv =  pci_host_bridge_priv(bridge);
> > +     struct loongson_pci *priv = pci_bus_to_loongson_pci(bus);
> > +
> > +     if (pci_is_root_bus(bus))
> > +             busnum = 0;
>
> This is weird.  The comment just below says "For our hardware the root
> bus is always bus 0."  Is that not true any more?  Why would you
> override the bus number?
The below comment should be fixed. For multi pci domain machines, the
root bus number isn't always 0, so we should override it.

>
> >       /*
> >        * Do not read more than one device on the bus other than
> > @@ -146,6 +162,8 @@ static void __iomem *pci_loongson_map_bus(struct pci_bus *bus, unsigned int devf
> >       return NULL;
> >  }
> >
> > +#ifdef CONFIG_OF
> > +
> >  static int loongson_map_irq(const struct pci_dev *dev, u8 slot, u8 pin)
> >  {
> >       int irq;
> > @@ -259,3 +277,41 @@ static struct platform_driver loongson_pci_driver = {
> >       .probe = loongson_pci_probe,
> >  };
> >  builtin_platform_driver(loongson_pci_driver);
> > +
> > +#endif
> > +
> > +#ifdef CONFIG_ACPI
> > +
> > +static int loongson_pci_ecam_init(struct pci_config_window *cfg)
> > +{
> > +     struct device *dev = cfg->parent;
> > +     struct loongson_pci *priv;
> > +     struct loongson_pci_data *data;
> > +
> > +     priv = devm_kzalloc(dev, sizeof(*priv), GFP_KERNEL);
> > +     if (!priv)
> > +             return -ENOMEM;
> > +
> > +     data = devm_kzalloc(dev, sizeof(*data), GFP_KERNEL);
> > +     if (!data)
> > +             return -ENOMEM;
> > +
> > +     cfg->priv = priv;
> > +     data->flags = FLAG_CFG1;
> > +     priv->data = data;
> > +     priv->cfg1_base = cfg->win - (cfg->busr.start << 16);
> > +
> > +     return 0;
> > +}
> > +
> > +const struct pci_ecam_ops loongson_pci_ecam_ops = {
> > +     .bus_shift = 16,
>
> I can't remember the details of how this works.  The standard PCIe
> ECAM has:
>
>   A[11:00] 12 bits of Register number/alignment
>   A[14:12]  3 bits of Function number
>   A[19:15]  5 bits of Device number
>   A[27:20]  8 bits of Bus number
>
> Doesn't a bus_shift of 16 mean there are only 16 bits available for
> the register number, function, and device?  So that would limit us to
> 8 bits of register number?  Do we enforce that somewhere?
>
> It seems like a limit on "where" should be enforced in
> pci_ecam_map_bus() and other .map_bus() functions like
> pci_loongson_map_bus().  Otherwise a config read at offset
> 0x100 would read config space of the wrong device.
>
> Krzysztof, do you remember how this works?
After a quick glance, it seems pci_ecam_map_bus() should be changed. :)

Huacai
>
> > +     .init      = loongson_pci_ecam_init,
> > +     .pci_ops   = {
> > +             .map_bus = pci_loongson_map_bus,
> > +             .read    = pci_generic_config_read,
> > +             .write   = pci_generic_config_write,
> > +     }
> > +};

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH V13 5/6] PCI: Add quirk for LS7A to avoid reboot failure
  2022-05-31 23:35   ` Bjorn Helgaas
@ 2022-06-02 12:48     ` Huacai Chen
  2022-06-02 16:29       ` Bjorn Helgaas
  0 siblings, 1 reply; 35+ messages in thread
From: Huacai Chen @ 2022-06-02 12:48 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Huacai Chen, Bjorn Helgaas, Lorenzo Pieralisi, Rob Herring,
	Krzysztof Wilczyński, linux-pci, Xuefeng Li, Jiaxun Yang

Hi, Bjorn,

On Wed, Jun 1, 2022 at 7:35 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
>
> On Sat, Apr 30, 2022 at 04:48:45PM +0800, Huacai Chen wrote:
> > Commit cc27b735ad3a75574a ("PCI/portdrv: Turn off PCIe services during
> > shutdown")
>
> Usual quoting style would be
>
>   cc27b735ad3a ("PCI/portdrv: Turn off PCIe services during shutdown")
>   causes poweroff/reboot ...
OK, thanks.

>
> > causes poweroff/reboot failure on systems with LS7A chipset.
> > We found that if we remove "pci_command &= ~PCI_COMMAND_MASTER;" in
> > do_pci_disable_device(), it can work well. The hardware engineer says
> > that the root cause is that CPU is still accessing PCIe devices while
> > poweroff/reboot, and if we disable the Bus Master Bit at this time, the
> > PCIe controller doesn't forward requests to downstream devices, and also
> > doesn't send TIMEOUT to CPU, which causes CPU wait forever (hardware
> > deadlock). This behavior is a PCIe protocol violation (Bus Master should
> > not be involved in CPU MMIO transactions), and it will be fixed in new
> > revisions of hardware (add timeout mechanism for CPU read request,
> > whether or not Bus Master bit is cleared).
>
> LS7A might have bugs in that clearing Bus Master Enable prevents the
> root port from forwarding Memory or I/O requests in the downstream
> direction.
>
> But this feels like a bit of a band-aid because we don't know exactly
> what those requests are.  If we're removing the Root Port, I assume we
> think we no longer need any devices *below* the Root Port.
>
> If that's not the case, e.g., if we still need to produce console
> output or save state to a device, we probably should not be removing
> the Root Port at all.
Do you mean it is better to skip the whole pcie_port_device_remove()
instead of just removing the "clear bus master" operation for the
buggy hardware?

Huacai
>
> > On some x86 platforms, radeon/amdgpu devices can cause similar problems
> > [1][2]. Once before I wanted to make a single patch to solve "all of
> > these problems" together, but it seems unreasonable because maybe they
> > are not exactly the same problem. So, this patch just add a quirk for
> > LS7A to avoid clearing Bus Master bit in pcie_port_device_remove(), and
> > leave other platforms as is.
> >
> > [1] https://bugs.freedesktop.org/show_bug.cgi?id=97980
> > [2] https://bugs.freedesktop.org/show_bug.cgi?id=98638
> >
> > Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
> > ---
> >  drivers/pci/controller/pci-loongson.c | 17 +++++++++++++++++
> >  drivers/pci/pcie/portdrv_core.c       |  6 +++++-
> >  include/linux/pci.h                   |  1 +
> >  3 files changed, 23 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/pci/controller/pci-loongson.c b/drivers/pci/controller/pci-loongson.c
> > index 83447264048a..49d8b8c24ffb 100644
> > --- a/drivers/pci/controller/pci-loongson.c
> > +++ b/drivers/pci/controller/pci-loongson.c
> > @@ -85,6 +85,23 @@ DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_LOONGSON,
> >  DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_LOONGSON,
> >                       DEV_PCIE_PORT_2, loongson_mrrs_quirk);
> >
> > +static void loongson_bmaster_quirk(struct pci_dev *pdev)
> > +{
> > +     /*
> > +      * Some Loongson PCIe ports will cause CPU deadlock if disable
> > +      * the Bus Master bit during poweroff/reboot.
> > +      */
> > +     struct pci_host_bridge *bridge = pci_find_host_bridge(pdev->bus);
> > +
> > +     bridge->no_dis_bmaster = 1;
> > +}
> > +DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_LOONGSON,
> > +                     DEV_PCIE_PORT_0, loongson_bmaster_quirk);
> > +DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_LOONGSON,
> > +                     DEV_PCIE_PORT_1, loongson_bmaster_quirk);
> > +DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_LOONGSON,
> > +                     DEV_PCIE_PORT_2, loongson_bmaster_quirk);
> > +
> >  static struct loongson_pci *pci_bus_to_loongson_pci(struct pci_bus *bus)
> >  {
> >       struct pci_config_window *cfg;
> > diff --git a/drivers/pci/pcie/portdrv_core.c b/drivers/pci/pcie/portdrv_core.c
> > index 604feeb84ee4..23f41e31a6c6 100644
> > --- a/drivers/pci/pcie/portdrv_core.c
> > +++ b/drivers/pci/pcie/portdrv_core.c
> > @@ -491,9 +491,13 @@ EXPORT_SYMBOL_GPL(pcie_port_find_device);
> >   */
> >  void pcie_port_device_remove(struct pci_dev *dev)
> >  {
> > +     struct pci_host_bridge *bridge = pci_find_host_bridge(dev->bus);
> > +
> >       device_for_each_child(&dev->dev, NULL, remove_iter);
> >       pci_free_irq_vectors(dev);
> > -     pci_disable_device(dev);
> > +
> > +     if (!bridge->no_dis_bmaster)
> > +             pci_disable_device(dev);
> >  }
> >
> >  /**
> > diff --git a/include/linux/pci.h b/include/linux/pci.h
> > index d146eb28e6da..c52d6486ff99 100644
> > --- a/include/linux/pci.h
> > +++ b/include/linux/pci.h
> > @@ -570,6 +570,7 @@ struct pci_host_bridge {
> >       unsigned int    ignore_reset_delay:1;   /* For entire hierarchy */
> >       unsigned int    no_ext_tags:1;          /* No Extended Tags */
> >       unsigned int    no_inc_mrrs:1;          /* No Increase MRRS */
> > +     unsigned int    no_dis_bmaster:1;       /* No Disable Bus Master */
> >       unsigned int    native_aer:1;           /* OS may use PCIe AER */
> >       unsigned int    native_pcie_hotplug:1;  /* OS may use PCIe hotplug */
> >       unsigned int    native_shpc_hotplug:1;  /* OS may use SHPC hotplug */
> > --
> > 2.27.0
> >

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH V13 4/6] PCI: loongson: Improve the MRRS quirk for LS7A
  2022-06-01 11:59     ` Jiaxun Yang
  2022-06-02  4:17       ` Huacai Chen
@ 2022-06-02 16:20       ` Bjorn Helgaas
  2022-06-03 12:13         ` Krzysztof Hałasa
  2022-06-03 22:57         ` Jiaxun Yang
  1 sibling, 2 replies; 35+ messages in thread
From: Bjorn Helgaas @ 2022-06-02 16:20 UTC (permalink / raw)
  To: Jiaxun Yang
  Cc: Huacai Chen, Bjorn Helgaas, Lorenzo Pieralisi, Rob Herring,
	Krzysztof Wilczyński, linux-pci, Xuefeng Li, Huacai Chen,
	Jingoo Han, Gustavo Pimentel, Kishon Vijay Abraham I,
	Krzysztof Hałasa

[+cc Jingoo, Gustavo, Kishon, Krzysztof]

On Wed, Jun 01, 2022 at 12:59:50PM +0100, Jiaxun Yang wrote:
> 在2022年6月1日六月 上午3:22,Bjorn Helgaas写道:
> > On Sat, Apr 30, 2022 at 04:48:44PM +0800, Huacai Chen wrote:
> >> In new revision of LS7A, some PCIe ports support larger value than 256,
> >> but their maximum supported MRRS values are not detectable. Moreover,
> >> the current loongson_mrrs_quirk() cannot avoid devices increasing its
> >> MRRS after pci_enable_device(), and some devices (e.g. Realtek 8169)
> >> will actually set a big value in its driver. So the only possible way
> >> is configure MRRS of all devices in BIOS, and add a pci host bridge bit
> >> flag (i.e., no_inc_mrrs) to stop the increasing MRRS operations.
> >> 
> >> However, according to PCIe Spec, it is legal for an OS to program any
> >> value for MRRS, and it is also legal for an endpoint to generate a Read
> >> Request with any size up to its MRRS. As the hardware engineers say, the
> >> root cause here is LS7A doesn't break up large read requests. In detail,
> >> LS7A PCIe port reports CA (Completer Abort) if it receives a Memory Read
> >> request with a size that's "too big" ("too big" means larger than the
> >> PCIe ports can handle, which means 256 for some ports and 4096 for the
> >> others, and of course this is a problem in the LS7A's hardware design).
> >
> > This seems essentially similar to ks_pcie_quirk() [1].  Why are they
> > different, and why do you need no_inc_mrrs, when keystone doesn't?
> >
> > Or *does* keystone need it and we just haven't figured that out yet?
> > Are all callers of pcie_set_readrq() vulnerable to issues there?
> 
> Yes actually keystone may need to set this flag as well.
> 
> I think Huacai missed a point in his commit message about why he removed
> the process of walking through the bus and set proper MRRS. That’s
> because Loongson’s firmware will set proper MRRS and the only thing
> that Kernel needs to do is leave it as is. no_inc_mrrs is introduced for
> this purpose.

I'd really like to have a single implementation of whatever quirk
works around this.  I don't think we should have multiple copies just
because we assume some firmware takes care of part of this for us.

> In keystone’s case it’s likely that their firmware won’t do such thing, so
> their workaround shouldn’t be removed.
> And  no_inc_mrrs should be set for them to prevent device drivers modifying
> MRRS afterwards.

I have the vague impression that this issue is related to an arm64 AXI
bus property [2] or maybe a DesignWare controller property [3], so
this might affect several PCIe controller drivers.

> > Whatever we do should be as uniform as possible across host
> > controllers.
> >
> > [1] 
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/pci/controller/dwc/pci-keystone.c?id=v5.18#n528

[2] https://lore.kernel.org/all/20211126083119.16570-4-kishon@ti.com/
[3] https://lore.kernel.org/all/m3r1f08p83.fsf@t19.piap.pl/

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH V13 3/6] PCI: loongson: Don't access unexisting devices
  2022-06-02  4:28     ` Huacai Chen
@ 2022-06-02 16:23       ` Bjorn Helgaas
  2022-06-02 20:00         ` Jiaxun Yang
  0 siblings, 1 reply; 35+ messages in thread
From: Bjorn Helgaas @ 2022-06-02 16:23 UTC (permalink / raw)
  To: Huacai Chen
  Cc: Huacai Chen, Bjorn Helgaas, Lorenzo Pieralisi, Rob Herring,
	Krzysztof Wilczyński, linux-pci, Xuefeng Li, Jiaxun Yang

On Thu, Jun 02, 2022 at 12:28:40PM +0800, Huacai Chen wrote:
> Hi, Bjorn,
> 
> On Wed, Jun 1, 2022 at 7:14 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
> > On Sat, Apr 30, 2022 at 04:48:43PM +0800, Huacai Chen wrote:
> > > On LS2K/LS7A, some unexisting devices don't return 0xffffffff when
> > > scanning. This is a hardware flaw but we can only avoid it by software
> > > now.
> >
> > What happens in other situations that normally cause Unsupported
> > Request or similar errors?  For example, memory reads/writes to a
> > device in D3hot should cause an Unsupported Request error.  I'm
> > wondering whether other error handling assumptions might be broken
> > on LS2K/LS7A.
>
> Hardware engineers told me that the problem is due to pin
> multiplexing, under some configurations, a PCI device is unusable but
> the read request doesn't return 0xffffffff.

What happens if a driver does a mem read to a device that's in D3hot?

> > > Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
> > > ---
> > >  drivers/pci/controller/pci-loongson.c | 11 +++++++++--
> > >  1 file changed, 9 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/drivers/pci/controller/pci-loongson.c b/drivers/pci/controller/pci-loongson.c
> > > index adbfa4a2330f..48316daa1f23 100644
> > > --- a/drivers/pci/controller/pci-loongson.c
> > > +++ b/drivers/pci/controller/pci-loongson.c
> > > @@ -138,6 +138,8 @@ static void __iomem *pci_loongson_map_bus(struct pci_bus *bus, unsigned int devf
> > >                              int where)
> > >  {
> > >       unsigned char busnum = bus->number;
> > > +     unsigned int device = PCI_SLOT(devfn);
> > > +     unsigned int function = PCI_FUNC(devfn);
> > >       struct loongson_pci *priv = pci_bus_to_loongson_pci(bus);
> > >
> > >       if (pci_is_root_bus(bus))
> > > @@ -147,8 +149,13 @@ static void __iomem *pci_loongson_map_bus(struct pci_bus *bus, unsigned int devf
> > >        * Do not read more than one device on the bus other than
> > >        * the host bus. For our hardware the root bus is always bus 0.
> > >        */
> > > -     if (priv->data->flags & FLAG_DEV_FIX &&
> > > -                     !pci_is_root_bus(bus) && PCI_SLOT(devfn) > 0)
> > > +     if ((priv->data->flags & FLAG_DEV_FIX) && bus->self) {
> > > +             if (!pci_is_root_bus(bus) && (device > 0))
> > > +                     return NULL;
> > > +     }
> > > +
> > > +     /* Don't access unexisting devices */
> > > +     if (pci_is_root_bus(bus) && (device >= 9 && device <= 20 && function > 0))
> >
> > Yuck.  This is pretty nasty magic.  If this is something that might be
> > fixed in future versions of the hardware, maybe you should factor this
> > out into a function pointer in loongson_pci_data or something.
> OK, seems providing a pdev_is_existant() is better.
> 
> Huacai
> >
> > >               return NULL;
> > >
> > >       /* CFG0 can only access standard space */
> > > --
> > > 2.27.0
> > >

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH V13 5/6] PCI: Add quirk for LS7A to avoid reboot failure
  2022-06-02 12:48     ` Huacai Chen
@ 2022-06-02 16:29       ` Bjorn Helgaas
  2022-06-08  9:34         ` Huacai Chen
  0 siblings, 1 reply; 35+ messages in thread
From: Bjorn Helgaas @ 2022-06-02 16:29 UTC (permalink / raw)
  To: Huacai Chen
  Cc: Huacai Chen, Bjorn Helgaas, Lorenzo Pieralisi, Rob Herring,
	Krzysztof Wilczyński, linux-pci, Xuefeng Li, Jiaxun Yang

On Thu, Jun 02, 2022 at 08:48:20PM +0800, Huacai Chen wrote:
> Hi, Bjorn,
> 
> On Wed, Jun 1, 2022 at 7:35 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
> >
> > On Sat, Apr 30, 2022 at 04:48:45PM +0800, Huacai Chen wrote:
> > > Commit cc27b735ad3a75574a ("PCI/portdrv: Turn off PCIe services
> > > during shutdown") causes poweroff/reboot failure on systems with
> > > LS7A chipset.  We found that if we remove "pci_command &=
> > > ~PCI_COMMAND_MASTER;" in do_pci_disable_device(), it can work
> > > well. The hardware engineer says that the root cause is that CPU
> > > is still accessing PCIe devices while poweroff/reboot, and if we
> > > disable the Bus Master Bit at this time, the PCIe controller
> > > doesn't forward requests to downstream devices, and also doesn't
> > > send TIMEOUT to CPU, which causes CPU wait forever (hardware
> > > deadlock). This behavior is a PCIe protocol violation (Bus
> > > Master should not be involved in CPU MMIO transactions), and it
> > > will be fixed in new revisions of hardware (add timeout
> > > mechanism for CPU read request, whether or not Bus Master bit is
> > > cleared).
> >
> > LS7A might have bugs in that clearing Bus Master Enable prevents the
> > root port from forwarding Memory or I/O requests in the downstream
> > direction.
> >
> > But this feels like a bit of a band-aid because we don't know exactly
> > what those requests are.  If we're removing the Root Port, I assume we
> > think we no longer need any devices *below* the Root Port.
> >
> > If that's not the case, e.g., if we still need to produce console
> > output or save state to a device, we probably should not be removing
> > the Root Port at all.
>
> Do you mean it is better to skip the whole pcie_port_device_remove()
> instead of just removing the "clear bus master" operation for the
> buggy hardware?

No, that's not what I want at all.  That's just another band-aid to
avoid a problem without understanding what the problem is.

My point is that apparently we remove a Root Port (which means we've
already removed any devices under it), and then we try to use a device
below the Root Port.  That seems broken.  I want to understand why we
try to use a device after we've removed it.

If the scenario ends up being legitimate and unavoidable, fine -- we
can figure out a quirk to work around the fact the LS7A doesn't allow
that access after we clear Bus Master Enable.  But right now the
scenario smells like a latent bug, and leaving bus mastering enabled 
just avoids it without fixing it.

Bjorn

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH V13 3/6] PCI: loongson: Don't access unexisting devices
  2022-06-02 16:23       ` Bjorn Helgaas
@ 2022-06-02 20:00         ` Jiaxun Yang
  0 siblings, 0 replies; 35+ messages in thread
From: Jiaxun Yang @ 2022-06-02 20:00 UTC (permalink / raw)
  To: Bjorn Helgaas, Huacai Chen
  Cc: Huacai Chen, Bjorn Helgaas, Lorenzo Pieralisi, Rob Herring,
	Krzysztof Wilczyński, linux-pci, Xuefeng Li


在 2022/6/2 17:23, Bjorn Helgaas 写道:
> On Thu, Jun 02, 2022 at 12:28:40PM +0800, Huacai Chen wrote:
>> Hi, Bjorn,
>>
>> On Wed, Jun 1, 2022 at 7:14 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
>>> On Sat, Apr 30, 2022 at 04:48:43PM +0800, Huacai Chen wrote:
>>>> On LS2K/LS7A, some unexisting devices don't return 0xffffffff when
>>>> scanning. This is a hardware flaw but we can only avoid it by software
>>>> now.
>>> What happens in other situations that normally cause Unsupported
>>> Request or similar errors?  For example, memory reads/writes to a
>>> device in D3hot should cause an Unsupported Request error.  I'm
>>> wondering whether other error handling assumptions might be broken
>>> on LS2K/LS7A.
>> Hardware engineers told me that the problem is due to pin
>> multiplexing, under some configurations, a PCI device is unusable but
>> the read request doesn't return 0xffffffff.
> What happens if a driver does a mem read to a device that's in D3hot?
Just did experiment on my hardware (which is a MIPS-era LS3A4000 with
LS7A).

It turns out that the hardware simply returns 0xffffffff for all reads and
ignore writes.

The PCIe controller of LS7A is a customized dwc and it should response
with a SLVERR transaction on LS7A's internalAXI bus. However the bus
we used to connect LS7A with CPU is incapable to pass this SLVERR
information to CPU and thus it just provides a false result.

All LS7A's on-chip devices connected on PCI bus don't expose any
PCI power management capability. Though they have power management
registers elsewhere  and generally we won't touch them when kernel
is running.

Thanks
- Jiaxun


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH V13 4/6] PCI: loongson: Improve the MRRS quirk for LS7A
  2022-06-02 16:20       ` Bjorn Helgaas
@ 2022-06-03 12:13         ` Krzysztof Hałasa
  2022-06-03 22:57         ` Jiaxun Yang
  1 sibling, 0 replies; 35+ messages in thread
From: Krzysztof Hałasa @ 2022-06-03 12:13 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Jiaxun Yang, Huacai Chen, Bjorn Helgaas, Lorenzo Pieralisi,
	Rob Herring, Krzysztof Wilczyński, linux-pci, Xuefeng Li,
	Huacai Chen, Jingoo Han, Gustavo Pimentel,
	Kishon Vijay Abraham I

Hi Bjorn et al.,

Bjorn Helgaas <helgaas@kernel.org> writes:

> I'd really like to have a single implementation of whatever quirk
> works around this.  I don't think we should have multiple copies just
> because we assume some firmware takes care of part of this for us.

I second this.
I think it should work this way:

MPS affects the whole buses, i.e., packets are not fragmented by PCIe
bridges. MPS works for both RX and TX. This means the CPU MPS (if any)
must be enforced (set in the registers) over the whole bus (system).

The system may use different (smaller) MPSes for different devices,
though. Perhaps the user should be able to ask for smaller value
(currently it's done using enum pcie_bus_config_types).

MRRS can be larger than MPS (a single read causes multiple packets of
response), and can be different for different devices.
Still, all devices must be programmed the system's limit at most (or
less if the user wishes to).

IMHO this means we should use max_mps and max_mrrs for the whole system,
and then e.g. platform PCIe controller driver or a device driver could
lower them, triggering writes to the PCI config registers down the
buses.
Individual devices/drivers could use smaller values without changing the
global variables.

> I have the vague impression that this issue is related to an arm64 AXI
> bus property [2] or maybe a DesignWare controller property [3], so
> this might affect several PCIe controller drivers.

[2] seems like a bug in TI specific SoC and revision only.
[3] it seems all DWC PCIe hosts (and maybe devices) need a limit (two
limits).

E.g.:
- i.MX6 needs MRRS = 512 (or lower at user's discretion) and MPS = 128.
- CNS3xxx needs MRRS = MPS = 128 IIRC.
-- 
Krzysztof "Chris" Hałasa

Sieć Badawcza Łukasiewicz
Przemysłowy Instytut Automatyki i Pomiarów PIAP
Al. Jerozolimskie 202, 02-486 Warszawa

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH V13 4/6] PCI: loongson: Improve the MRRS quirk for LS7A
  2022-06-02 16:20       ` Bjorn Helgaas
  2022-06-03 12:13         ` Krzysztof Hałasa
@ 2022-06-03 22:57         ` Jiaxun Yang
  2022-06-04  0:07           ` Bjorn Helgaas
  1 sibling, 1 reply; 35+ messages in thread
From: Jiaxun Yang @ 2022-06-03 22:57 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Huacai Chen, Bjorn Helgaas, Lorenzo Pieralisi, Rob Herring,
	Krzysztof Wilczyński, linux-pci, Xuefeng Li, Huacai Chen,
	Jingoo Han, Gustavo Pimentel, Kishon Vijay Abraham I,
	Krzysztof Hałasa



在2022年6月2日六月 下午5:20,Bjorn Helgaas写道:.
>
> I'd really like to have a single implementation of whatever quirk
> works around this.  I don't think we should have multiple copies just
> because we assume some firmware takes care of part of this for us.
>
Yeah that was my idea when I was writing the present version of workaround.
However in later LS7A revisions Loongson somehow raised MRRS for several
PCIe controllers on chip to 1024 and other ports remains to be 256. Kernel
have no way to aware of this change and we can only rely on firmware to set
proper value.

I have no idea how Loongson achieved this in hardware. All those PCIe controllers
are attached under the same AXI bus should share the same AXI to HyperTransport
bridge as AXI slave behind a bus matrix. Perhaps instead of fixing error handling of
their AXI protocol implementation they just increased the buffer size in AXI bridge
so it can accomplish larger requests at one time.

>> In keystone’s case it’s likely that their firmware won’t do such thing, so
>> their workaround shouldn’t be removed.
>> And  no_inc_mrrs should be set for them to prevent device drivers modifying
>> MRRS afterwards.
>
> I have the vague impression that this issue is related to an arm64 AXI
> bus property [2] or maybe a DesignWare controller property [3], so
> this might affect several PCIe controller drivers.
In my understanding it’s likely to be a AXI implementation issue.

Thanks
>
>> > Whatever we do should be as uniform as possible across host
>> > controllers.
>> >
>> > [1] 
>> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/pci/controller/dwc/pci-keystone.c?id=v5.18#n528
>
> [2] https://lore.kernel.org/all/20211126083119.16570-4-kishon@ti.com/
> [3] https://lore.kernel.org/all/m3r1f08p83.fsf@t19.piap.pl/

-- 
- Jiaxun

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH V13 4/6] PCI: loongson: Improve the MRRS quirk for LS7A
  2022-06-03 22:57         ` Jiaxun Yang
@ 2022-06-04  0:07           ` Bjorn Helgaas
  2022-06-08  8:29             ` Huacai Chen
  0 siblings, 1 reply; 35+ messages in thread
From: Bjorn Helgaas @ 2022-06-04  0:07 UTC (permalink / raw)
  To: Jiaxun Yang
  Cc: Huacai Chen, Bjorn Helgaas, Lorenzo Pieralisi, Rob Herring,
	Krzysztof Wilczyński, linux-pci, Xuefeng Li, Huacai Chen,
	Jingoo Han, Gustavo Pimentel, Kishon Vijay Abraham I,
	Krzysztof Hałasa

On Fri, Jun 03, 2022 at 11:57:47PM +0100, Jiaxun Yang wrote:
> 在2022年6月2日六月 下午5:20,Bjorn Helgaas写道:.
> >
> > I'd really like to have a single implementation of whatever quirk
> > works around this.  I don't think we should have multiple copies
> > just because we assume some firmware takes care of part of this
> > for us.
> >
> Yeah that was my idea when I was writing the present version of
> workaround.  However in later LS7A revisions Loongson somehow raised
> MRRS for several PCIe controllers on chip to 1024 and other ports
> remains to be 256. Kernel have no way to aware of this change and we
> can only rely on firmware to set proper value.

That's fine; we need a controller-specific way to find the limit
(whether it's fixed for all versions or discovered from firmware
settings or whatever).

My hope is that given that controller-specific value, we can have a
single quirk that works on keystone, loongson, etc. to enforce the
limit on all relevant devices.  Some platform firmware might do that
configuration already, but it's OK if a generic quirk re-does it.

I don't think it's worth having two quirks, one that does the
configuration, and another that relies on firmware having done it.

> I have no idea how Loongson achieved this in hardware. All those
> PCIe controllers are attached under the same AXI bus should share
> the same AXI to HyperTransport bridge as AXI slave behind a bus
> matrix. Perhaps instead of fixing error handling of their AXI
> protocol implementation they just increased the buffer size in AXI
> bridge so it can accomplish larger requests at one time.

> >> In keystone’s case it’s likely that their firmware won’t do such thing, so
> >> their workaround shouldn’t be removed.
> >> And  no_inc_mrrs should be set for them to prevent device drivers modifying
> >> MRRS afterwards.
> >
> > I have the vague impression that this issue is related to an arm64 AXI
> > bus property [2] or maybe a DesignWare controller property [3], so
> > this might affect several PCIe controller drivers.
>
> In my understanding it’s likely to be a AXI implementation issue.

I know almost nothing about AXI, but this concerns me because it
sounds like other drivers could be affected.

Bjorn

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH V13 4/6] PCI: loongson: Improve the MRRS quirk for LS7A
  2022-06-04  0:07           ` Bjorn Helgaas
@ 2022-06-08  8:29             ` Huacai Chen
  0 siblings, 0 replies; 35+ messages in thread
From: Huacai Chen @ 2022-06-08  8:29 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Jiaxun Yang, Huacai Chen, Bjorn Helgaas, Lorenzo Pieralisi,
	Rob Herring, Krzysztof Wilczyński, linux-pci, Xuefeng Li,
	Jingoo Han, Gustavo Pimentel, Kishon Vijay Abraham I,
	Krzysztof Hałasa

Hi, Bjorn, Jiaxun,

On Sat, Jun 4, 2022 at 8:07 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
>
> On Fri, Jun 03, 2022 at 11:57:47PM +0100, Jiaxun Yang wrote:
> > 在2022年6月2日六月 下午5:20,Bjorn Helgaas写道:.
> > >
> > > I'd really like to have a single implementation of whatever quirk
> > > works around this.  I don't think we should have multiple copies
> > > just because we assume some firmware takes care of part of this
> > > for us.
> > >
> > Yeah that was my idea when I was writing the present version of
> > workaround.  However in later LS7A revisions Loongson somehow raised
> > MRRS for several PCIe controllers on chip to 1024 and other ports
> > remains to be 256. Kernel have no way to aware of this change and we
> > can only rely on firmware to set proper value.
>
> That's fine; we need a controller-specific way to find the limit
> (whether it's fixed for all versions or discovered from firmware
> settings or whatever).
>
> My hope is that given that controller-specific value, we can have a
> single quirk that works on keystone, loongson, etc. to enforce the
> limit on all relevant devices.  Some platform firmware might do that
> configuration already, but it's OK if a generic quirk re-does it.
>
> I don't think it's worth having two quirks, one that does the
> configuration, and another that relies on firmware having done it.
I think it is better to let keystone and loongson to both use the
no_inc_mrrs quirk.

Huacai

>
> > I have no idea how Loongson achieved this in hardware. All those
> > PCIe controllers are attached under the same AXI bus should share
> > the same AXI to HyperTransport bridge as AXI slave behind a bus
> > matrix. Perhaps instead of fixing error handling of their AXI
> > protocol implementation they just increased the buffer size in AXI
> > bridge so it can accomplish larger requests at one time.
>
> > >> In keystone’s case it’s likely that their firmware won’t do such thing, so
> > >> their workaround shouldn’t be removed.
> > >> And  no_inc_mrrs should be set for them to prevent device drivers modifying
> > >> MRRS afterwards.
> > >
> > > I have the vague impression that this issue is related to an arm64 AXI
> > > bus property [2] or maybe a DesignWare controller property [3], so
> > > this might affect several PCIe controller drivers.
> >
> > In my understanding it’s likely to be a AXI implementation issue.
>
> I know almost nothing about AXI, but this concerns me because it
> sounds like other drivers could be affected.
>
> Bjorn

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH V13 5/6] PCI: Add quirk for LS7A to avoid reboot failure
  2022-06-02 16:29       ` Bjorn Helgaas
@ 2022-06-08  9:34         ` Huacai Chen
  2022-06-08 19:31           ` Bjorn Helgaas
  0 siblings, 1 reply; 35+ messages in thread
From: Huacai Chen @ 2022-06-08  9:34 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Huacai Chen, Bjorn Helgaas, Lorenzo Pieralisi, Rob Herring,
	Krzysztof Wilczyński, linux-pci, Xuefeng Li, Jiaxun Yang

Hi, Bjorn,

On Fri, Jun 3, 2022 at 12:29 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
>
> On Thu, Jun 02, 2022 at 08:48:20PM +0800, Huacai Chen wrote:
> > Hi, Bjorn,
> >
> > On Wed, Jun 1, 2022 at 7:35 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
> > >
> > > On Sat, Apr 30, 2022 at 04:48:45PM +0800, Huacai Chen wrote:
> > > > Commit cc27b735ad3a75574a ("PCI/portdrv: Turn off PCIe services
> > > > during shutdown") causes poweroff/reboot failure on systems with
> > > > LS7A chipset.  We found that if we remove "pci_command &=
> > > > ~PCI_COMMAND_MASTER;" in do_pci_disable_device(), it can work
> > > > well. The hardware engineer says that the root cause is that CPU
> > > > is still accessing PCIe devices while poweroff/reboot, and if we
> > > > disable the Bus Master Bit at this time, the PCIe controller
> > > > doesn't forward requests to downstream devices, and also doesn't
> > > > send TIMEOUT to CPU, which causes CPU wait forever (hardware
> > > > deadlock). This behavior is a PCIe protocol violation (Bus
> > > > Master should not be involved in CPU MMIO transactions), and it
> > > > will be fixed in new revisions of hardware (add timeout
> > > > mechanism for CPU read request, whether or not Bus Master bit is
> > > > cleared).
> > >
> > > LS7A might have bugs in that clearing Bus Master Enable prevents the
> > > root port from forwarding Memory or I/O requests in the downstream
> > > direction.
> > >
> > > But this feels like a bit of a band-aid because we don't know exactly
> > > what those requests are.  If we're removing the Root Port, I assume we
> > > think we no longer need any devices *below* the Root Port.
> > >
> > > If that's not the case, e.g., if we still need to produce console
> > > output or save state to a device, we probably should not be removing
> > > the Root Port at all.
> >
> > Do you mean it is better to skip the whole pcie_port_device_remove()
> > instead of just removing the "clear bus master" operation for the
> > buggy hardware?
>
> No, that's not what I want at all.  That's just another band-aid to
> avoid a problem without understanding what the problem is.
>
> My point is that apparently we remove a Root Port (which means we've
> already removed any devices under it), and then we try to use a device
> below the Root Port.  That seems broken.  I want to understand why we
> try to use a device after we've removed it.
I agree, and I think "why we try to use a device after remove it" is
because the userspace programs don't know whether a device is
"usable", so they just use it, at any time. Then it seems it is the
responsibility of the device drivers to avoid the problem.

Take radeon/amdgpu driver as an example, the .shutdown() of the
callback can call suspend() to fix.

But..., another problem is: There are many drivers "broken", not just
radeon/amdgpu drivers (at least, the ahci driver is also "broken").
Implementing the driver's .shutdown() correctly is nearly impossible
for us, because we don't know the hardware details of so many devices.
On the other hand, those drivers "just works" on most platforms, so
the driver authors don't think they are "broken".

So, very sadly, we can only use a band-aid now. :(

Huacai

>
> If the scenario ends up being legitimate and unavoidable, fine -- we
> can figure out a quirk to work around the fact the LS7A doesn't allow
> that access after we clear Bus Master Enable.  But right now the
> scenario smells like a latent bug, and leaving bus mastering enabled
> just avoids it without fixing it.
>
> Bjorn

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH V13 5/6] PCI: Add quirk for LS7A to avoid reboot failure
  2022-06-08  9:34         ` Huacai Chen
@ 2022-06-08 19:31           ` Bjorn Helgaas
  2022-06-16  8:39             ` Huacai Chen
  0 siblings, 1 reply; 35+ messages in thread
From: Bjorn Helgaas @ 2022-06-08 19:31 UTC (permalink / raw)
  To: Huacai Chen
  Cc: Huacai Chen, Bjorn Helgaas, Lorenzo Pieralisi, Rob Herring,
	Krzysztof Wilczyński, linux-pci, Xuefeng Li, Jiaxun Yang

On Wed, Jun 08, 2022 at 05:34:21PM +0800, Huacai Chen wrote:
> On Fri, Jun 3, 2022 at 12:29 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
> > On Thu, Jun 02, 2022 at 08:48:20PM +0800, Huacai Chen wrote:
> > > On Wed, Jun 1, 2022 at 7:35 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
> > > > On Sat, Apr 30, 2022 at 04:48:45PM +0800, Huacai Chen wrote:
> > > > > Commit cc27b735ad3a75574a ("PCI/portdrv: Turn off PCIe services
> > > > > during shutdown") causes poweroff/reboot failure on systems with
> > > > > LS7A chipset.  We found that if we remove "pci_command &=
> > > > > ~PCI_COMMAND_MASTER;" in do_pci_disable_device(), it can work
> > > > > well. The hardware engineer says that the root cause is that CPU
> > > > > is still accessing PCIe devices while poweroff/reboot, and if we
> > > > > disable the Bus Master Bit at this time, the PCIe controller
> > > > > doesn't forward requests to downstream devices, and also doesn't
> > > > > send TIMEOUT to CPU, which causes CPU wait forever (hardware
> > > > > deadlock). This behavior is a PCIe protocol violation (Bus
> > > > > Master should not be involved in CPU MMIO transactions), and it
> > > > > will be fixed in new revisions of hardware (add timeout
> > > > > mechanism for CPU read request, whether or not Bus Master bit is
> > > > > cleared).
> > > >
> > > > LS7A might have bugs in that clearing Bus Master Enable prevents the
> > > > root port from forwarding Memory or I/O requests in the downstream
> > > > direction.
> > > >
> > > > But this feels like a bit of a band-aid because we don't know exactly
> > > > what those requests are.  If we're removing the Root Port, I assume we
> > > > think we no longer need any devices *below* the Root Port.
> > > >
> > > > If that's not the case, e.g., if we still need to produce console
> > > > output or save state to a device, we probably should not be removing
> > > > the Root Port at all.
> > >
> > > Do you mean it is better to skip the whole pcie_port_device_remove()
> > > instead of just removing the "clear bus master" operation for the
> > > buggy hardware?
> >
> > No, that's not what I want at all.  That's just another band-aid to
> > avoid a problem without understanding what the problem is.
> >
> > My point is that apparently we remove a Root Port (which means we've
> > already removed any devices under it), and then we try to use a device
> > below the Root Port.  That seems broken.  I want to understand why we
> > try to use a device after we've removed it.
>
> I agree, and I think "why we try to use a device after remove it" is
> because the userspace programs don't know whether a device is
> "usable", so they just use it, at any time. Then it seems it is the
> responsibility of the device drivers to avoid the problem.

How is userspace able to use a device after the device is removed?
E.g., if userspace does a read/write to a device that has been
removed, the syscall should return error, not touch the missing
device.  If userspace mmaps a device, an access after the device has
been removed should fault, not do MMIO to the missing device.

> Take radeon/amdgpu driver as an example, the .shutdown() of the
> callback can call suspend() to fix.
> 
> But..., another problem is: There are many drivers "broken", not just
> radeon/amdgpu drivers (at least, the ahci driver is also "broken").
> Implementing the driver's .shutdown() correctly is nearly impossible
> for us, because we don't know the hardware details of so many devices.
> On the other hand, those drivers "just works" on most platforms, so
> the driver authors don't think they are "broken".

It sounds like you have analyzed this and have more details about
exactly how this happens.  Can you please share those details?  There
are a lot of steps in the middle that I don't understand yet.

Without those details, "userspace programs don't know whether a device
is 'usable', so they just use it, at any time" is kind of hand-wavey
and not very convincing.

> So, very sadly, we can only use a band-aid now. :(
> 
> > If the scenario ends up being legitimate and unavoidable, fine -- we
> > can figure out a quirk to work around the fact the LS7A doesn't allow
> > that access after we clear Bus Master Enable.  But right now the
> > scenario smells like a latent bug, and leaving bus mastering enabled
> > just avoids it without fixing it.
> >
> > Bjorn

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH V13 5/6] PCI: Add quirk for LS7A to avoid reboot failure
  2022-06-08 19:31           ` Bjorn Helgaas
@ 2022-06-16  8:39             ` Huacai Chen
  2022-06-16 22:57               ` Bjorn Helgaas
  0 siblings, 1 reply; 35+ messages in thread
From: Huacai Chen @ 2022-06-16  8:39 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Huacai Chen, Bjorn Helgaas, Lorenzo Pieralisi, Rob Herring,
	Krzysztof Wilczyński, linux-pci, Xuefeng Li, Jiaxun Yang

Hi, Bjorn,

On Thu, Jun 9, 2022 at 3:31 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
>
> On Wed, Jun 08, 2022 at 05:34:21PM +0800, Huacai Chen wrote:
> > On Fri, Jun 3, 2022 at 12:29 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
> > > On Thu, Jun 02, 2022 at 08:48:20PM +0800, Huacai Chen wrote:
> > > > On Wed, Jun 1, 2022 at 7:35 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
> > > > > On Sat, Apr 30, 2022 at 04:48:45PM +0800, Huacai Chen wrote:
> > > > > > Commit cc27b735ad3a75574a ("PCI/portdrv: Turn off PCIe services
> > > > > > during shutdown") causes poweroff/reboot failure on systems with
> > > > > > LS7A chipset.  We found that if we remove "pci_command &=
> > > > > > ~PCI_COMMAND_MASTER;" in do_pci_disable_device(), it can work
> > > > > > well. The hardware engineer says that the root cause is that CPU
> > > > > > is still accessing PCIe devices while poweroff/reboot, and if we
> > > > > > disable the Bus Master Bit at this time, the PCIe controller
> > > > > > doesn't forward requests to downstream devices, and also doesn't
> > > > > > send TIMEOUT to CPU, which causes CPU wait forever (hardware
> > > > > > deadlock). This behavior is a PCIe protocol violation (Bus
> > > > > > Master should not be involved in CPU MMIO transactions), and it
> > > > > > will be fixed in new revisions of hardware (add timeout
> > > > > > mechanism for CPU read request, whether or not Bus Master bit is
> > > > > > cleared).
> > > > >
> > > > > LS7A might have bugs in that clearing Bus Master Enable prevents the
> > > > > root port from forwarding Memory or I/O requests in the downstream
> > > > > direction.
> > > > >
> > > > > But this feels like a bit of a band-aid because we don't know exactly
> > > > > what those requests are.  If we're removing the Root Port, I assume we
> > > > > think we no longer need any devices *below* the Root Port.
> > > > >
> > > > > If that's not the case, e.g., if we still need to produce console
> > > > > output or save state to a device, we probably should not be removing
> > > > > the Root Port at all.
> > > >
> > > > Do you mean it is better to skip the whole pcie_port_device_remove()
> > > > instead of just removing the "clear bus master" operation for the
> > > > buggy hardware?
> > >
> > > No, that's not what I want at all.  That's just another band-aid to
> > > avoid a problem without understanding what the problem is.
> > >
> > > My point is that apparently we remove a Root Port (which means we've
> > > already removed any devices under it), and then we try to use a device
> > > below the Root Port.  That seems broken.  I want to understand why we
> > > try to use a device after we've removed it.
> >
> > I agree, and I think "why we try to use a device after remove it" is
> > because the userspace programs don't know whether a device is
> > "usable", so they just use it, at any time. Then it seems it is the
> > responsibility of the device drivers to avoid the problem.
>
> How is userspace able to use a device after the device is removed?
> E.g., if userspace does a read/write to a device that has been
> removed, the syscall should return error, not touch the missing
> device.  If userspace mmaps a device, an access after the device has
> been removed should fault, not do MMIO to the missing device.
To give more details, let's take the graphics driver (e.g. amdgpu) as
an example again. The userspace programs call printf() to display
"shutting down xxx service" during shutdown/reboot. Or we can even
simplify further, the kernel calls printk() to display something
during shutdown/reboot. You know, printk() can happen at any time,
even after we call pcie_port_device_remove() to disable the pcie port
on the graphic card.

The call stack is: printk() --> call_console_drivers() -->
con->write() --> vt_console_print() --> fbcon_putcs()

I think this is a scenario of "userspace programs (or kernel itself)
don't know whether a device is 'usable', so they just use it, at any
time"

And why does the graphic driver call .suspend() in its .shutdown() can
fix the problem?
One of the key operations in .suspend() is drm_fb_helper_set_suspend()
--> fb_set_suspend() --> info->state = FBINFO_STATE_SUSPENDED;
This operation will cause fbcon_is_inactive() to return true, then
refuse fbcon_putcs() to really write to the framebuffer.

Huacai


>
> > Take radeon/amdgpu driver as an example, the .shutdown() of the
> > callback can call suspend() to fix.
> >
> > But..., another problem is: There are many drivers "broken", not just
> > radeon/amdgpu drivers (at least, the ahci driver is also "broken").
> > Implementing the driver's .shutdown() correctly is nearly impossible
> > for us, because we don't know the hardware details of so many devices.
> > On the other hand, those drivers "just works" on most platforms, so
> > the driver authors don't think they are "broken".
>
> It sounds like you have analyzed this and have more details about
> exactly how this happens.  Can you please share those details?  There
> are a lot of steps in the middle that I don't understand yet.
>
> Without those details, "userspace programs don't know whether a device
> is 'usable', so they just use it, at any time" is kind of hand-wavey
> and not very convincing.
>
> > So, very sadly, we can only use a band-aid now. :(
> >
> > > If the scenario ends up being legitimate and unavoidable, fine -- we
> > > can figure out a quirk to work around the fact the LS7A doesn't allow
> > > that access after we clear Bus Master Enable.  But right now the
> > > scenario smells like a latent bug, and leaving bus mastering enabled
> > > just avoids it without fixing it.
> > >
> > > Bjorn

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH V13 5/6] PCI: Add quirk for LS7A to avoid reboot failure
  2022-06-16  8:39             ` Huacai Chen
@ 2022-06-16 22:57               ` Bjorn Helgaas
  2022-06-17  2:21                 ` Huacai Chen
  0 siblings, 1 reply; 35+ messages in thread
From: Bjorn Helgaas @ 2022-06-16 22:57 UTC (permalink / raw)
  To: Huacai Chen
  Cc: Huacai Chen, Bjorn Helgaas, Lorenzo Pieralisi, Rob Herring,
	Krzysztof Wilczyński, linux-pci, Xuefeng Li, Jiaxun Yang

On Thu, Jun 16, 2022 at 04:39:46PM +0800, Huacai Chen wrote:
> On Thu, Jun 9, 2022 at 3:31 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
> > On Wed, Jun 08, 2022 at 05:34:21PM +0800, Huacai Chen wrote:
> > > On Fri, Jun 3, 2022 at 12:29 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
> > > > On Thu, Jun 02, 2022 at 08:48:20PM +0800, Huacai Chen wrote:
> > > > > On Wed, Jun 1, 2022 at 7:35 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
> > > > > > On Sat, Apr 30, 2022 at 04:48:45PM +0800, Huacai Chen wrote:
> > > > > > > Commit cc27b735ad3a75574a ("PCI/portdrv: Turn off PCIe
> > > > > > > services during shutdown") causes poweroff/reboot
> > > > > > > failure on systems with LS7A chipset.  We found that if
> > > > > > > we remove "pci_command &= ~PCI_COMMAND_MASTER;" in
> > > > > > > do_pci_disable_device(), it can work well. The hardware
> > > > > > > engineer says that the root cause is that CPU is still
> > > > > > > accessing PCIe devices while poweroff/reboot, and if we
> > > > > > > disable the Bus Master Bit at this time, the PCIe
> > > > > > > controller doesn't forward requests to downstream
> > > > > > > devices, and also doesn't send TIMEOUT to CPU, which
> > > > > > > causes CPU wait forever (hardware deadlock). This
> > > > > > > behavior is a PCIe protocol violation (Bus Master should
> > > > > > > not be involved in CPU MMIO transactions), and it will
> > > > > > > be fixed in new revisions of hardware (add timeout
> > > > > > > mechanism for CPU read request, whether or not Bus
> > > > > > > Master bit is cleared).
> > > > > >
> > > > > > LS7A might have bugs in that clearing Bus Master Enable
> > > > > > prevents the root port from forwarding Memory or I/O
> > > > > > requests in the downstream direction.
> > > > > >
> > > > > > But this feels like a bit of a band-aid because we don't
> > > > > > know exactly what those requests are.  If we're removing
> > > > > > the Root Port, I assume we think we no longer need any
> > > > > > devices *below* the Root Port.
> > > > > >
> > > > > > If that's not the case, e.g., if we still need to produce
> > > > > > console output or save state to a device, we probably
> > > > > > should not be removing the Root Port at all.
> > > > >
> > > > > Do you mean it is better to skip the whole
> > > > > pcie_port_device_remove() instead of just removing the
> > > > > "clear bus master" operation for the buggy hardware?
> > > >
> > > > No, that's not what I want at all.  That's just another
> > > > band-aid to avoid a problem without understanding what the
> > > > problem is.
> > > >
> > > > My point is that apparently we remove a Root Port (which means
> > > > we've already removed any devices under it), and then we try
> > > > to use a device below the Root Port.  That seems broken.  I
> > > > want to understand why we try to use a device after we've
> > > > removed it.
> > >
> > > I agree, and I think "why we try to use a device after remove
> > > it" is because the userspace programs don't know whether a
> > > device is "usable", so they just use it, at any time. Then it
> > > seems it is the responsibility of the device drivers to avoid
> > > the problem.
> >
> > How is userspace able to use a device after the device is removed?
> > E.g., if userspace does a read/write to a device that has been
> > removed, the syscall should return error, not touch the missing
> > device.  If userspace mmaps a device, an access after the device
> > has been removed should fault, not do MMIO to the missing device.
>
> To give more details, let's take the graphics driver (e.g. amdgpu)
> as an example again. The userspace programs call printf() to display
> "shutting down xxx service" during shutdown/reboot. Or we can even
> simplify further, the kernel calls printk() to display something
> during shutdown/reboot. You know, printk() can happen at any time,
> even after we call pcie_port_device_remove() to disable the pcie
> port on the graphic card.

I've been focusing on the *remove* path, but you said the problem
you're solving is with *poweroff/reboot*.  pcie_portdrv_remove() is
used for both paths, but if there's a reason we need those paths to be
different, we might be able to split them.

For remove, we have to assume accesses to the device may already or
will soon fail.  A driver that touches the device, or a device that
performs DMA, after its drv->remove() method has been called would be
seriously broken.  The remove operation also unbinds the driver from
the device.

The semantics of device_shutdown(), pci_device_shutdown(), and any
drv->shutdown() methods are not documented very well.  This path is
only used for halt/poweroff/reboot, so my guess is that not very much
is actually required, and it doesn't do anything at all with respect
to driver binding.

I think we mainly want to disable things that might interfere with
halt/poweroff/reboot, like interrupts and maybe DMA.  We disable DMA
in this path to prevent devices from corrupting memory, but I'm more
open to a quirk in the shutdown path than in the remove path.  So how
about if you clone pcie_port_device_remove() to make a separate
pcie_port_device_shutdown() and put the quirk there?

> The call stack is: printk() --> call_console_drivers() -->
> con->write() --> vt_console_print() --> fbcon_putcs()
> 
> I think this is a scenario of "userspace programs (or kernel itself)
> don't know whether a device is 'usable', so they just use it, at any
> time"
> 
> And why does the graphic driver call .suspend() in its .shutdown() can
> fix the problem?
>
> One of the key operations in .suspend() is drm_fb_helper_set_suspend()
> --> fb_set_suspend() --> info->state = FBINFO_STATE_SUSPENDED;
>
> This operation will cause fbcon_is_inactive() to return true, then
> refuse fbcon_putcs() to really write to the framebuffer.

> > > Take radeon/amdgpu driver as an example, the .shutdown() of the
> > > callback can call suspend() to fix.
> > >
> > > But..., another problem is: There are many drivers "broken", not just
> > > radeon/amdgpu drivers (at least, the ahci driver is also "broken").
> > > Implementing the driver's .shutdown() correctly is nearly impossible
> > > for us, because we don't know the hardware details of so many devices.
> > > On the other hand, those drivers "just works" on most platforms, so
> > > the driver authors don't think they are "broken".
> >
> > It sounds like you have analyzed this and have more details about
> > exactly how this happens.  Can you please share those details?  There
> > are a lot of steps in the middle that I don't understand yet.
> >
> > Without those details, "userspace programs don't know whether a device
> > is 'usable', so they just use it, at any time" is kind of hand-wavey
> > and not very convincing.
> >
> > > So, very sadly, we can only use a band-aid now. :(
> > >
> > > > If the scenario ends up being legitimate and unavoidable, fine -- we
> > > > can figure out a quirk to work around the fact the LS7A doesn't allow
> > > > that access after we clear Bus Master Enable.  But right now the
> > > > scenario smells like a latent bug, and leaving bus mastering enabled
> > > > just avoids it without fixing it.
> > > >
> > > > Bjorn

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH V13 5/6] PCI: Add quirk for LS7A to avoid reboot failure
  2022-06-16 22:57               ` Bjorn Helgaas
@ 2022-06-17  2:21                 ` Huacai Chen
  2022-06-17 11:37                   ` Bjorn Helgaas
  0 siblings, 1 reply; 35+ messages in thread
From: Huacai Chen @ 2022-06-17  2:21 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Huacai Chen, Bjorn Helgaas, Lorenzo Pieralisi, Rob Herring,
	Krzysztof Wilczyński, linux-pci, Xuefeng Li, Jiaxun Yang

Hi, Bjorn,

On Fri, Jun 17, 2022 at 6:57 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
>
> On Thu, Jun 16, 2022 at 04:39:46PM +0800, Huacai Chen wrote:
> > On Thu, Jun 9, 2022 at 3:31 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
> > > On Wed, Jun 08, 2022 at 05:34:21PM +0800, Huacai Chen wrote:
> > > > On Fri, Jun 3, 2022 at 12:29 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
> > > > > On Thu, Jun 02, 2022 at 08:48:20PM +0800, Huacai Chen wrote:
> > > > > > On Wed, Jun 1, 2022 at 7:35 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
> > > > > > > On Sat, Apr 30, 2022 at 04:48:45PM +0800, Huacai Chen wrote:
> > > > > > > > Commit cc27b735ad3a75574a ("PCI/portdrv: Turn off PCIe
> > > > > > > > services during shutdown") causes poweroff/reboot
> > > > > > > > failure on systems with LS7A chipset.  We found that if
> > > > > > > > we remove "pci_command &= ~PCI_COMMAND_MASTER;" in
> > > > > > > > do_pci_disable_device(), it can work well. The hardware
> > > > > > > > engineer says that the root cause is that CPU is still
> > > > > > > > accessing PCIe devices while poweroff/reboot, and if we
> > > > > > > > disable the Bus Master Bit at this time, the PCIe
> > > > > > > > controller doesn't forward requests to downstream
> > > > > > > > devices, and also doesn't send TIMEOUT to CPU, which
> > > > > > > > causes CPU wait forever (hardware deadlock). This
> > > > > > > > behavior is a PCIe protocol violation (Bus Master should
> > > > > > > > not be involved in CPU MMIO transactions), and it will
> > > > > > > > be fixed in new revisions of hardware (add timeout
> > > > > > > > mechanism for CPU read request, whether or not Bus
> > > > > > > > Master bit is cleared).
> > > > > > >
> > > > > > > LS7A might have bugs in that clearing Bus Master Enable
> > > > > > > prevents the root port from forwarding Memory or I/O
> > > > > > > requests in the downstream direction.
> > > > > > >
> > > > > > > But this feels like a bit of a band-aid because we don't
> > > > > > > know exactly what those requests are.  If we're removing
> > > > > > > the Root Port, I assume we think we no longer need any
> > > > > > > devices *below* the Root Port.
> > > > > > >
> > > > > > > If that's not the case, e.g., if we still need to produce
> > > > > > > console output or save state to a device, we probably
> > > > > > > should not be removing the Root Port at all.
> > > > > >
> > > > > > Do you mean it is better to skip the whole
> > > > > > pcie_port_device_remove() instead of just removing the
> > > > > > "clear bus master" operation for the buggy hardware?
> > > > >
> > > > > No, that's not what I want at all.  That's just another
> > > > > band-aid to avoid a problem without understanding what the
> > > > > problem is.
> > > > >
> > > > > My point is that apparently we remove a Root Port (which means
> > > > > we've already removed any devices under it), and then we try
> > > > > to use a device below the Root Port.  That seems broken.  I
> > > > > want to understand why we try to use a device after we've
> > > > > removed it.
> > > >
> > > > I agree, and I think "why we try to use a device after remove
> > > > it" is because the userspace programs don't know whether a
> > > > device is "usable", so they just use it, at any time. Then it
> > > > seems it is the responsibility of the device drivers to avoid
> > > > the problem.
> > >
> > > How is userspace able to use a device after the device is removed?
> > > E.g., if userspace does a read/write to a device that has been
> > > removed, the syscall should return error, not touch the missing
> > > device.  If userspace mmaps a device, an access after the device
> > > has been removed should fault, not do MMIO to the missing device.
> >
> > To give more details, let's take the graphics driver (e.g. amdgpu)
> > as an example again. The userspace programs call printf() to display
> > "shutting down xxx service" during shutdown/reboot. Or we can even
> > simplify further, the kernel calls printk() to display something
> > during shutdown/reboot. You know, printk() can happen at any time,
> > even after we call pcie_port_device_remove() to disable the pcie
> > port on the graphic card.
>
> I've been focusing on the *remove* path, but you said the problem
> you're solving is with *poweroff/reboot*.  pcie_portdrv_remove() is
> used for both paths, but if there's a reason we need those paths to be
> different, we might be able to split them.
I'm very sorry for that. I have misunderstood before because I suppose
the "remove path" is the pcie_portdrv_remove() function, but your
meaning is the .remove() callback in pcie_portdriver. Am I right this
time?

>
> For remove, we have to assume accesses to the device may already or
> will soon fail.  A driver that touches the device, or a device that
> performs DMA, after its drv->remove() method has been called would be
> seriously broken.  The remove operation also unbinds the driver from
> the device.
Then what will happen about the "remove path"? If we still take the
graphics driver as an example, "rmmod amdgpu" always fails with
"device is busy" because the graphics card is always be used once
after the driver is loaded. So the "remove path" has no chance to be
executed. But if we take a NIC driver as an example, "rmmod igb" can
mostly succeed, and there will be no access on the device after
removing, at least in our observation. I think there is nothing broken
about the "remove path".

>
> The semantics of device_shutdown(), pci_device_shutdown(), and any
> drv->shutdown() methods are not documented very well.  This path is
> only used for halt/poweroff/reboot, so my guess is that not very much
> is actually required, and it doesn't do anything at all with respect
> to driver binding.
>
> I think we mainly want to disable things that might interfere with
> halt/poweroff/reboot, like interrupts and maybe DMA.  We disable DMA
> in this path to prevent devices from corrupting memory, but I'm more
> open to a quirk in the shutdown path than in the remove path.  So how
> about if you clone pcie_port_device_remove() to make a separate
> pcie_port_device_shutdown() and put the quirk there?
Hmm, I think this is better. So I will clone
pcie_portdrv_remove()/pcie_port_device_remove() to make a separate
pcie_portdrv_shutdown()/pcie_port_device_shutdown() and only apply the
quirk on the shutdown path.

Huacai
>
> > The call stack is: printk() --> call_console_drivers() -->
> > con->write() --> vt_console_print() --> fbcon_putcs()
> >
> > I think this is a scenario of "userspace programs (or kernel itself)
> > don't know whether a device is 'usable', so they just use it, at any
> > time"
> >
> > And why does the graphic driver call .suspend() in its .shutdown() can
> > fix the problem?
> >
> > One of the key operations in .suspend() is drm_fb_helper_set_suspend()
> > --> fb_set_suspend() --> info->state = FBINFO_STATE_SUSPENDED;
> >
> > This operation will cause fbcon_is_inactive() to return true, then
> > refuse fbcon_putcs() to really write to the framebuffer.
>
> > > > Take radeon/amdgpu driver as an example, the .shutdown() of the
> > > > callback can call suspend() to fix.
> > > >
> > > > But..., another problem is: There are many drivers "broken", not just
> > > > radeon/amdgpu drivers (at least, the ahci driver is also "broken").
> > > > Implementing the driver's .shutdown() correctly is nearly impossible
> > > > for us, because we don't know the hardware details of so many devices.
> > > > On the other hand, those drivers "just works" on most platforms, so
> > > > the driver authors don't think they are "broken".
> > >
> > > It sounds like you have analyzed this and have more details about
> > > exactly how this happens.  Can you please share those details?  There
> > > are a lot of steps in the middle that I don't understand yet.
> > >
> > > Without those details, "userspace programs don't know whether a device
> > > is 'usable', so they just use it, at any time" is kind of hand-wavey
> > > and not very convincing.
> > >
> > > > So, very sadly, we can only use a band-aid now. :(
> > > >
> > > > > If the scenario ends up being legitimate and unavoidable, fine -- we
> > > > > can figure out a quirk to work around the fact the LS7A doesn't allow
> > > > > that access after we clear Bus Master Enable.  But right now the
> > > > > scenario smells like a latent bug, and leaving bus mastering enabled
> > > > > just avoids it without fixing it.
> > > > >
> > > > > Bjorn

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH V13 5/6] PCI: Add quirk for LS7A to avoid reboot failure
  2022-06-17  2:21                 ` Huacai Chen
@ 2022-06-17 11:37                   ` Bjorn Helgaas
  2022-06-17 12:14                     ` Huacai Chen
  0 siblings, 1 reply; 35+ messages in thread
From: Bjorn Helgaas @ 2022-06-17 11:37 UTC (permalink / raw)
  To: Huacai Chen
  Cc: Huacai Chen, Bjorn Helgaas, Lorenzo Pieralisi, Rob Herring,
	Krzysztof Wilczyński, linux-pci, Xuefeng Li, Jiaxun Yang

On Fri, Jun 17, 2022 at 10:21:14AM +0800, Huacai Chen wrote:
> On Fri, Jun 17, 2022 at 6:57 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
> > On Thu, Jun 16, 2022 at 04:39:46PM +0800, Huacai Chen wrote:
> > > On Thu, Jun 9, 2022 at 3:31 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
> > > > On Wed, Jun 08, 2022 at 05:34:21PM +0800, Huacai Chen wrote:
> > > > > On Fri, Jun 3, 2022 at 12:29 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
> > > > > > On Thu, Jun 02, 2022 at 08:48:20PM +0800, Huacai Chen wrote:
> > > > > > > On Wed, Jun 1, 2022 at 7:35 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
> > > > > > > > On Sat, Apr 30, 2022 at 04:48:45PM +0800, Huacai Chen wrote:
> > > > > > > > > Commit cc27b735ad3a75574a ("PCI/portdrv: Turn off PCIe
> > > > > > > > > services during shutdown") causes poweroff/reboot
> > > > > > > > > failure on systems with LS7A chipset.  We found that if
> > > > > > > > > we remove "pci_command &= ~PCI_COMMAND_MASTER;" in
> > > > > > > > > do_pci_disable_device(), it can work well. The hardware
> > > > > > > > > engineer says that the root cause is that CPU is still
> > > > > > > > > accessing PCIe devices while poweroff/reboot, and if we
> > > > > > > > > disable the Bus Master Bit at this time, the PCIe
> > > > > > > > > controller doesn't forward requests to downstream
> > > > > > > > > devices, and also doesn't send TIMEOUT to CPU, which
> > > > > > > > > causes CPU wait forever (hardware deadlock). This
> > > > > > > > > behavior is a PCIe protocol violation (Bus Master should
> > > > > > > > > not be involved in CPU MMIO transactions), and it will
> > > > > > > > > be fixed in new revisions of hardware (add timeout
> > > > > > > > > mechanism for CPU read request, whether or not Bus
> > > > > > > > > Master bit is cleared).
> > > > > > > >
> > > > > > > > LS7A might have bugs in that clearing Bus Master Enable
> > > > > > > > prevents the root port from forwarding Memory or I/O
> > > > > > > > requests in the downstream direction.
> > > > > > > >
> > > > > > > > But this feels like a bit of a band-aid because we don't
> > > > > > > > know exactly what those requests are.  If we're removing
> > > > > > > > the Root Port, I assume we think we no longer need any
> > > > > > > > devices *below* the Root Port.
> > > > > > > >
> > > > > > > > If that's not the case, e.g., if we still need to produce
> > > > > > > > console output or save state to a device, we probably
> > > > > > > > should not be removing the Root Port at all.
> > > > > > >
> > > > > > > Do you mean it is better to skip the whole
> > > > > > > pcie_port_device_remove() instead of just removing the
> > > > > > > "clear bus master" operation for the buggy hardware?
> > > > > >
> > > > > > No, that's not what I want at all.  That's just another
> > > > > > band-aid to avoid a problem without understanding what the
> > > > > > problem is.
> > > > > >
> > > > > > My point is that apparently we remove a Root Port (which means
> > > > > > we've already removed any devices under it), and then we try
> > > > > > to use a device below the Root Port.  That seems broken.  I
> > > > > > want to understand why we try to use a device after we've
> > > > > > removed it.
> > > > >
> > > > > I agree, and I think "why we try to use a device after remove
> > > > > it" is because the userspace programs don't know whether a
> > > > > device is "usable", so they just use it, at any time. Then it
> > > > > seems it is the responsibility of the device drivers to avoid
> > > > > the problem.
> > > >
> > > > How is userspace able to use a device after the device is removed?
> > > > E.g., if userspace does a read/write to a device that has been
> > > > removed, the syscall should return error, not touch the missing
> > > > device.  If userspace mmaps a device, an access after the device
> > > > has been removed should fault, not do MMIO to the missing device.
> > >
> > > To give more details, let's take the graphics driver (e.g. amdgpu)
> > > as an example again. The userspace programs call printf() to display
> > > "shutting down xxx service" during shutdown/reboot. Or we can even
> > > simplify further, the kernel calls printk() to display something
> > > during shutdown/reboot. You know, printk() can happen at any time,
> > > even after we call pcie_port_device_remove() to disable the pcie
> > > port on the graphic card.
> >
> > I've been focusing on the *remove* path, but you said the problem
> > you're solving is with *poweroff/reboot*.  pcie_portdrv_remove() is
> > used for both paths, but if there's a reason we need those paths to be
> > different, we might be able to split them.
>
> I'm very sorry for that. I have misunderstood before because I suppose
> the "remove path" is the pcie_portdrv_remove() function, but your
> meaning is the .remove() callback in pcie_portdriver. Am I right this
> time?

No need to be sorry, you clearly said from the beginning that this was
a shutdown issue, not a remove issue!  I was just confused because the
.remove() and the .shutdown() callbacks are both
pcie_portdrv_remove(), so I was thinking "remove" even though you said
"poweroff".

> > For remove, we have to assume accesses to the device may already or
> > will soon fail.  A driver that touches the device, or a device that
> > performs DMA, after its drv->remove() method has been called would be
> > seriously broken.  The remove operation also unbinds the driver from
> > the device.
>
> Then what will happen about the "remove path"? If we still take the
> graphics driver as an example, "rmmod amdgpu" always fails with
> "device is busy" because the graphics card is always be used once
> after the driver is loaded. So the "remove path" has no chance to be
> executed.

Do you think this is a problem?  It doesn't sound like a problem to
me, but I don't know anything about graphics drivers.  I assume that
if a device is in use, the expected behavior is that we can't remove
the driver.

> But if we take a NIC driver as an example, "rmmod igb" can
> mostly succeed, and there will be no access on the device after
> removing, at least in our observation. I think there is nothing broken
> about the "remove path".

I agree.

Bjorn

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH V13 5/6] PCI: Add quirk for LS7A to avoid reboot failure
  2022-06-17 11:37                   ` Bjorn Helgaas
@ 2022-06-17 12:14                     ` Huacai Chen
  0 siblings, 0 replies; 35+ messages in thread
From: Huacai Chen @ 2022-06-17 12:14 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Huacai Chen, Bjorn Helgaas, Lorenzo Pieralisi, Rob Herring,
	Krzysztof Wilczyński, linux-pci, Xuefeng Li, Jiaxun Yang

Hi, Bjorn,

On Fri, Jun 17, 2022 at 7:37 PM Bjorn Helgaas <helgaas@kernel.org> wrote:
>
> On Fri, Jun 17, 2022 at 10:21:14AM +0800, Huacai Chen wrote:
> > On Fri, Jun 17, 2022 at 6:57 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
> > > On Thu, Jun 16, 2022 at 04:39:46PM +0800, Huacai Chen wrote:
> > > > On Thu, Jun 9, 2022 at 3:31 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
> > > > > On Wed, Jun 08, 2022 at 05:34:21PM +0800, Huacai Chen wrote:
> > > > > > On Fri, Jun 3, 2022 at 12:29 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
> > > > > > > On Thu, Jun 02, 2022 at 08:48:20PM +0800, Huacai Chen wrote:
> > > > > > > > On Wed, Jun 1, 2022 at 7:35 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
> > > > > > > > > On Sat, Apr 30, 2022 at 04:48:45PM +0800, Huacai Chen wrote:
> > > > > > > > > > Commit cc27b735ad3a75574a ("PCI/portdrv: Turn off PCIe
> > > > > > > > > > services during shutdown") causes poweroff/reboot
> > > > > > > > > > failure on systems with LS7A chipset.  We found that if
> > > > > > > > > > we remove "pci_command &= ~PCI_COMMAND_MASTER;" in
> > > > > > > > > > do_pci_disable_device(), it can work well. The hardware
> > > > > > > > > > engineer says that the root cause is that CPU is still
> > > > > > > > > > accessing PCIe devices while poweroff/reboot, and if we
> > > > > > > > > > disable the Bus Master Bit at this time, the PCIe
> > > > > > > > > > controller doesn't forward requests to downstream
> > > > > > > > > > devices, and also doesn't send TIMEOUT to CPU, which
> > > > > > > > > > causes CPU wait forever (hardware deadlock). This
> > > > > > > > > > behavior is a PCIe protocol violation (Bus Master should
> > > > > > > > > > not be involved in CPU MMIO transactions), and it will
> > > > > > > > > > be fixed in new revisions of hardware (add timeout
> > > > > > > > > > mechanism for CPU read request, whether or not Bus
> > > > > > > > > > Master bit is cleared).
> > > > > > > > >
> > > > > > > > > LS7A might have bugs in that clearing Bus Master Enable
> > > > > > > > > prevents the root port from forwarding Memory or I/O
> > > > > > > > > requests in the downstream direction.
> > > > > > > > >
> > > > > > > > > But this feels like a bit of a band-aid because we don't
> > > > > > > > > know exactly what those requests are.  If we're removing
> > > > > > > > > the Root Port, I assume we think we no longer need any
> > > > > > > > > devices *below* the Root Port.
> > > > > > > > >
> > > > > > > > > If that's not the case, e.g., if we still need to produce
> > > > > > > > > console output or save state to a device, we probably
> > > > > > > > > should not be removing the Root Port at all.
> > > > > > > >
> > > > > > > > Do you mean it is better to skip the whole
> > > > > > > > pcie_port_device_remove() instead of just removing the
> > > > > > > > "clear bus master" operation for the buggy hardware?
> > > > > > >
> > > > > > > No, that's not what I want at all.  That's just another
> > > > > > > band-aid to avoid a problem without understanding what the
> > > > > > > problem is.
> > > > > > >
> > > > > > > My point is that apparently we remove a Root Port (which means
> > > > > > > we've already removed any devices under it), and then we try
> > > > > > > to use a device below the Root Port.  That seems broken.  I
> > > > > > > want to understand why we try to use a device after we've
> > > > > > > removed it.
> > > > > >
> > > > > > I agree, and I think "why we try to use a device after remove
> > > > > > it" is because the userspace programs don't know whether a
> > > > > > device is "usable", so they just use it, at any time. Then it
> > > > > > seems it is the responsibility of the device drivers to avoid
> > > > > > the problem.
> > > > >
> > > > > How is userspace able to use a device after the device is removed?
> > > > > E.g., if userspace does a read/write to a device that has been
> > > > > removed, the syscall should return error, not touch the missing
> > > > > device.  If userspace mmaps a device, an access after the device
> > > > > has been removed should fault, not do MMIO to the missing device.
> > > >
> > > > To give more details, let's take the graphics driver (e.g. amdgpu)
> > > > as an example again. The userspace programs call printf() to display
> > > > "shutting down xxx service" during shutdown/reboot. Or we can even
> > > > simplify further, the kernel calls printk() to display something
> > > > during shutdown/reboot. You know, printk() can happen at any time,
> > > > even after we call pcie_port_device_remove() to disable the pcie
> > > > port on the graphic card.
> > >
> > > I've been focusing on the *remove* path, but you said the problem
> > > you're solving is with *poweroff/reboot*.  pcie_portdrv_remove() is
> > > used for both paths, but if there's a reason we need those paths to be
> > > different, we might be able to split them.
> >
> > I'm very sorry for that. I have misunderstood before because I suppose
> > the "remove path" is the pcie_portdrv_remove() function, but your
> > meaning is the .remove() callback in pcie_portdriver. Am I right this
> > time?
>
> No need to be sorry, you clearly said from the beginning that this was
> a shutdown issue, not a remove issue!  I was just confused because the
> .remove() and the .shutdown() callbacks are both
> pcie_portdrv_remove(), so I was thinking "remove" even though you said
> "poweroff".
>
> > > For remove, we have to assume accesses to the device may already or
> > > will soon fail.  A driver that touches the device, or a device that
> > > performs DMA, after its drv->remove() method has been called would be
> > > seriously broken.  The remove operation also unbinds the driver from
> > > the device.
> >
> > Then what will happen about the "remove path"? If we still take the
> > graphics driver as an example, "rmmod amdgpu" always fails with
> > "device is busy" because the graphics card is always be used once
> > after the driver is loaded. So the "remove path" has no chance to be
> > executed.
>
> Do you think this is a problem?  It doesn't sound like a problem to
> me, but I don't know anything about graphics drivers.  I assume that
> if a device is in use, the expected behavior is that we can't remove
> the driver.
This isn't a problem, and I've sent V14, which only modifies the shutdown logic.

Huacai
>
> > But if we take a NIC driver as an example, "rmmod igb" can
> > mostly succeed, and there will be no access on the device after
> > removing, at least in our observation. I think there is nothing broken
> > about the "remove path".
>
> I agree.
>
> Bjorn

^ permalink raw reply	[flat|nested] 35+ messages in thread

end of thread, other threads:[~2022-06-17 12:15 UTC | newest]

Thread overview: 35+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-04-30  8:48 [PATCH V13 0/6] PCI: Loongson pci improvements and quirks Huacai Chen
2022-04-30  8:48 ` [PATCH V13 1/6] PCI: loongson: Use generic 8/16/32-bit config ops on LS2K/LS7A Huacai Chen
2022-06-01  2:08   ` Bjorn Helgaas
2022-06-02  4:18     ` Huacai Chen
2022-04-30  8:48 ` [PATCH V13 2/6] PCI: loongson: Add ACPI init support Huacai Chen
2022-05-31 23:04   ` Bjorn Helgaas
2022-06-02  7:09     ` Huacai Chen
2022-04-30  8:48 ` [PATCH V13 3/6] PCI: loongson: Don't access unexisting devices Huacai Chen
2022-05-31 23:14   ` Bjorn Helgaas
2022-06-02  4:28     ` Huacai Chen
2022-06-02 16:23       ` Bjorn Helgaas
2022-06-02 20:00         ` Jiaxun Yang
2022-04-30  8:48 ` [PATCH V13 4/6] PCI: loongson: Improve the MRRS quirk for LS7A Huacai Chen
2022-06-01  2:22   ` Bjorn Helgaas
2022-06-01 11:59     ` Jiaxun Yang
2022-06-02  4:17       ` Huacai Chen
2022-06-02 16:20       ` Bjorn Helgaas
2022-06-03 12:13         ` Krzysztof Hałasa
2022-06-03 22:57         ` Jiaxun Yang
2022-06-04  0:07           ` Bjorn Helgaas
2022-06-08  8:29             ` Huacai Chen
2022-04-30  8:48 ` [PATCH V13 5/6] PCI: Add quirk for LS7A to avoid reboot failure Huacai Chen
2022-05-31 23:35   ` Bjorn Helgaas
2022-06-02 12:48     ` Huacai Chen
2022-06-02 16:29       ` Bjorn Helgaas
2022-06-08  9:34         ` Huacai Chen
2022-06-08 19:31           ` Bjorn Helgaas
2022-06-16  8:39             ` Huacai Chen
2022-06-16 22:57               ` Bjorn Helgaas
2022-06-17  2:21                 ` Huacai Chen
2022-06-17 11:37                   ` Bjorn Helgaas
2022-06-17 12:14                     ` Huacai Chen
2022-04-30  8:48 ` [PATCH V13 6/6] PCI: Add quirk for multifunction devices of LS7A Huacai Chen
2022-06-01  2:07   ` Bjorn Helgaas
2022-06-01  7:36     ` Jianmin Lv

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).