linux-pci.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH V2 0/4] PCI: Loongson-related pci quirks
@ 2021-05-28  7:14 Huacai Chen
  2021-05-28  7:15 ` [PATCH V2 1/4] PCI/portdrv: Don't disable device during shutdown Huacai Chen
                   ` (3 more replies)
  0 siblings, 4 replies; 20+ messages in thread
From: Huacai Chen @ 2021-05-28  7:14 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: linux-pci, Xuefeng Li, Huacai Chen, Jiaxun Yang, Huacai Chen,
	Tiezhu Yang, Jianmin Lv

This patchset resolve some Loongson-related pci problems (however, some
of them affect not only Loongon platform). The first patch has been sent
before, and nearly all reviewers' questions had been answered at that
time [1]. The second patch move some LS7A quirks to quirks.c, where can
be shared by multi architectures. The third patch improve the mrrs quirk
for LS7A chipset, and the fourth patch add a new quirk for LS7A chipset.

[1] http://patchwork.ozlabs.org/project/linux-pci/patch/1600680138-10949-1-git-send-email-chenhc@lemote.com/

V1 -> V2:
1, Rework the 4th patch;
2, Improve commit messages;
3, Remove the last patch since there is better solutions.

Huacai Chen, Tiezhu Yang and Jianmin Lv(4):
 PCI/portdrv: Don't disable device during shutdown.
 PCI: Move loongson pci quirks to quirks.c.
 PCI: Improve the MRRS quirk for LS7A.
 PCI: Add quirk for multifunction devices of LS7A.

Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Jianmin Lv <lvjianmin@loongson.cn> 
---
 drivers/pci/controller/pci-loongson.c |  69 -----------------
 drivers/pci/pci.c                     |   5 ++
 drivers/pci/pcie/portdrv.h            |   2 +-
 drivers/pci/pcie/portdrv_core.c       |   6 +-
 drivers/pci/pcie/portdrv_pci.c        |  15 +++-
 drivers/pci/quirks.c                  | 139 ++++++++++++++++++++++++++++++++++
 include/linux/pci.h                   |   2 +
 7 files changed, 164 insertions(+), 74 deletions(-)
--
2.27.0


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH V2 1/4] PCI/portdrv: Don't disable device during shutdown
  2021-05-28  7:14 [PATCH V2 0/4] PCI: Loongson-related pci quirks Huacai Chen
@ 2021-05-28  7:15 ` Huacai Chen
  2021-05-28 21:43   ` Bjorn Helgaas
  2021-05-28  7:15 ` [PATCH V2 2/4] PCI: Move loongson pci quirks to quirks.c Huacai Chen
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 20+ messages in thread
From: Huacai Chen @ 2021-05-28  7:15 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: linux-pci, Xuefeng Li, Huacai Chen, Jiaxun Yang, Huacai Chen,
	Tiezhu Yang

Use separate remove()/shutdown() callback, and don't disable PCI device
during shutdown. This can avoid some poweroff/reboot failures.

The poweroff/reboot failures could easily be reproduced on Loongson
platforms. I think this is not a Loongson-specific problem, instead, is
a problem related to some specific PCI hosts. On some x86 platforms,
radeon/amdgpu devices can cause the same problem [1][2], and commit
faefba95c9e8ca3a ("drm/amdgpu: just suspend the hw on pci shutdown")
can resolve it.

As Tiezhu said, this occasionally shutdown or reboot failure is due to
clear PCI_COMMAND_MASTER on the device in do_pci_disable_device() [3].

static void do_pci_disable_device(struct pci_dev *dev)
{
        u16 pci_command;

        pci_read_config_word(dev, PCI_COMMAND, &pci_command);
        if (pci_command & PCI_COMMAND_MASTER) {
                pci_command &= ~PCI_COMMAND_MASTER;
                pci_write_config_word(dev, PCI_COMMAND, pci_command);
        }

        pcibios_disable_device(dev);
}

When remove "pci_command &= ~PCI_COMMAND_MASTER;", it can work well when
shutdown or reboot. This may implies that there are DMA activities on the
device while shutdown.

Radeon driver is more difficult than amdgpu due to its confusing symbol
names, and I have maintained an out-of-tree patch for a long time [4].
Recently, we found more and more devices can cause the same problem, and
it is very difficult to modify all problematic drivers as radeon/amdgpu
does (the .shutdown callback should make sure there is no DMA activity).
So, I think modify the PCIe port driver is a simple and effective way.
Because there is no poweroff/reboot problems before cc27b735ad3a75574a6a
("PCI/portdrv: Turn off PCIe services during shutdown"). And as early
discussed, kexec can still work fine after this patch [5].

[1] https://bugs.freedesktop.org/show_bug.cgi?id=97980
[2] https://bugs.freedesktop.org/show_bug.cgi?id=98638
[3] https://lore.kernel.org/patchwork/patch/1305067/
[4] https://github.com/chenhuacai/linux/commit/8da06f9b669831829416a3e9f4d1c57f217a42f0
[5] http://patchwork.ozlabs.org/project/linux-pci/patch/1600680138-10949-1-git-send-email-chenhc@lemote.com/

Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
---
 drivers/pci/pcie/portdrv.h      |  2 +-
 drivers/pci/pcie/portdrv_core.c |  6 ++++--
 drivers/pci/pcie/portdrv_pci.c  | 15 +++++++++++++--
 3 files changed, 18 insertions(+), 5 deletions(-)

diff --git a/drivers/pci/pcie/portdrv.h b/drivers/pci/pcie/portdrv.h
index 2ff5724b8f13..358d7281f6e8 100644
--- a/drivers/pci/pcie/portdrv.h
+++ b/drivers/pci/pcie/portdrv.h
@@ -117,7 +117,7 @@ int pcie_port_device_resume(struct device *dev);
 int pcie_port_device_runtime_suspend(struct device *dev);
 int pcie_port_device_runtime_resume(struct device *dev);
 #endif
-void pcie_port_device_remove(struct pci_dev *dev);
+void pcie_port_device_remove(struct pci_dev *dev, bool disable);
 int __must_check pcie_port_bus_register(void);
 void pcie_port_bus_unregister(void);
 
diff --git a/drivers/pci/pcie/portdrv_core.c b/drivers/pci/pcie/portdrv_core.c
index e1fed6649c41..98c0a99a41d6 100644
--- a/drivers/pci/pcie/portdrv_core.c
+++ b/drivers/pci/pcie/portdrv_core.c
@@ -484,11 +484,13 @@ EXPORT_SYMBOL_GPL(pcie_port_find_device);
  * Remove PCI Express port service devices associated with given port and
  * disable MSI-X or MSI for the port.
  */
-void pcie_port_device_remove(struct pci_dev *dev)
+void pcie_port_device_remove(struct pci_dev *dev, bool disable)
 {
 	device_for_each_child(&dev->dev, NULL, remove_iter);
 	pci_free_irq_vectors(dev);
-	pci_disable_device(dev);
+
+	if (disable)
+		pci_disable_device(dev);
 }
 
 /**
diff --git a/drivers/pci/pcie/portdrv_pci.c b/drivers/pci/pcie/portdrv_pci.c
index c7ff1eea225a..562fbf3c1ea9 100644
--- a/drivers/pci/pcie/portdrv_pci.c
+++ b/drivers/pci/pcie/portdrv_pci.c
@@ -147,7 +147,18 @@ static void pcie_portdrv_remove(struct pci_dev *dev)
 		pm_runtime_dont_use_autosuspend(&dev->dev);
 	}
 
-	pcie_port_device_remove(dev);
+	pcie_port_device_remove(dev, true);
+}
+
+static void pcie_portdrv_shutdown(struct pci_dev *dev)
+{
+	if (pci_bridge_d3_possible(dev)) {
+		pm_runtime_forbid(&dev->dev);
+		pm_runtime_get_noresume(&dev->dev);
+		pm_runtime_dont_use_autosuspend(&dev->dev);
+	}
+
+	pcie_port_device_remove(dev, false);
 }
 
 static pci_ers_result_t pcie_portdrv_error_detected(struct pci_dev *dev,
@@ -219,7 +230,7 @@ static struct pci_driver pcie_portdriver = {
 
 	.probe		= pcie_portdrv_probe,
 	.remove		= pcie_portdrv_remove,
-	.shutdown	= pcie_portdrv_remove,
+	.shutdown	= pcie_portdrv_shutdown,
 
 	.err_handler	= &pcie_portdrv_err_handler,
 
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH V2 2/4] PCI: Move loongson pci quirks to quirks.c
  2021-05-28  7:14 [PATCH V2 0/4] PCI: Loongson-related pci quirks Huacai Chen
  2021-05-28  7:15 ` [PATCH V2 1/4] PCI/portdrv: Don't disable device during shutdown Huacai Chen
@ 2021-05-28  7:15 ` Huacai Chen
  2021-06-05 21:15   ` Bjorn Helgaas
  2021-05-28  7:15 ` [PATCH V2 3/4] PCI: Improve the MRRS quirk for LS7A Huacai Chen
  2021-05-28  7:15 ` [PATCH V2 4/4] PCI: Add quirk for multifunction devices of LS7A Huacai Chen
  3 siblings, 1 reply; 20+ messages in thread
From: Huacai Chen @ 2021-05-28  7:15 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: linux-pci, Xuefeng Li, Huacai Chen, Jiaxun Yang, Huacai Chen

Loongson PCH (LS7A chipset) will be used by both MIPS-based and
LoongArch-based Loongson processors. MIPS-based Loongson uses FDT
but LoongArch-base Loongson uses ACPI, but the driver in drivers/
pci/controller/pci-loongson.c is FDT-only. So move the quirks to
quirks.c where can be shared by all architectures.

Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
---
 drivers/pci/controller/pci-loongson.c | 69 ---------------------------
 drivers/pci/quirks.c                  | 69 +++++++++++++++++++++++++++
 2 files changed, 69 insertions(+), 69 deletions(-)

diff --git a/drivers/pci/controller/pci-loongson.c b/drivers/pci/controller/pci-loongson.c
index 48169b1e3817..88066e9db69e 100644
--- a/drivers/pci/controller/pci-loongson.c
+++ b/drivers/pci/controller/pci-loongson.c
@@ -12,15 +12,6 @@
 
 #include "../pci.h"
 
-/* Device IDs */
-#define DEV_PCIE_PORT_0	0x7a09
-#define DEV_PCIE_PORT_1	0x7a19
-#define DEV_PCIE_PORT_2	0x7a29
-
-#define DEV_LS2K_APB	0x7a02
-#define DEV_LS7A_CONF	0x7a10
-#define DEV_LS7A_LPC	0x7a0c
-
 #define FLAG_CFG0	BIT(0)
 #define FLAG_CFG1	BIT(1)
 #define FLAG_DEV_FIX	BIT(2)
@@ -32,66 +23,6 @@ struct loongson_pci {
 	u32 flags;
 };
 
-/* Fixup wrong class code in PCIe bridges */
-static void bridge_class_quirk(struct pci_dev *dev)
-{
-	dev->class = PCI_CLASS_BRIDGE_PCI << 8;
-}
-DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_LOONGSON,
-			DEV_PCIE_PORT_0, bridge_class_quirk);
-DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_LOONGSON,
-			DEV_PCIE_PORT_1, bridge_class_quirk);
-DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_LOONGSON,
-			DEV_PCIE_PORT_2, bridge_class_quirk);
-
-static void system_bus_quirk(struct pci_dev *pdev)
-{
-	/*
-	 * The address space consumed by these devices is outside the
-	 * resources of the host bridge.
-	 */
-	pdev->mmio_always_on = 1;
-	pdev->non_compliant_bars = 1;
-}
-DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_LOONGSON,
-			DEV_LS2K_APB, system_bus_quirk);
-DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_LOONGSON,
-			DEV_LS7A_CONF, system_bus_quirk);
-DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_LOONGSON,
-			DEV_LS7A_LPC, system_bus_quirk);
-
-static void loongson_mrrs_quirk(struct pci_dev *dev)
-{
-	struct pci_bus *bus = dev->bus;
-	struct pci_dev *bridge;
-	static const struct pci_device_id bridge_devids[] = {
-		{ PCI_VDEVICE(LOONGSON, DEV_PCIE_PORT_0) },
-		{ PCI_VDEVICE(LOONGSON, DEV_PCIE_PORT_1) },
-		{ PCI_VDEVICE(LOONGSON, DEV_PCIE_PORT_2) },
-		{ 0, },
-	};
-
-	/* look for the matching bridge */
-	while (!pci_is_root_bus(bus)) {
-		bridge = bus->self;
-		bus = bus->parent;
-		/*
-		 * Some Loongson PCIe ports have a h/w limitation of
-		 * 256 bytes maximum read request size. They can't handle
-		 * anything larger than this. So force this limit on
-		 * any devices attached under these ports.
-		 */
-		if (pci_match_id(bridge_devids, bridge)) {
-			if (pcie_get_readrq(dev) > 256) {
-				pci_info(dev, "limiting MRRS to 256\n");
-				pcie_set_readrq(dev, 256);
-			}
-			break;
-		}
-	}
-}
-DECLARE_PCI_FIXUP_ENABLE(PCI_ANY_ID, PCI_ANY_ID, loongson_mrrs_quirk);
-
 static void __iomem *cfg1_map(struct loongson_pci *priv, int bus,
 				unsigned int devfn, int where)
 {
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index dcb229de1acb..66e4bea69431 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -205,6 +205,75 @@ static void quirk_mmio_always_on(struct pci_dev *dev)
 DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_ANY_ID, PCI_ANY_ID,
 				PCI_CLASS_BRIDGE_HOST, 8, quirk_mmio_always_on);
 
+/* Loongson-related quirks */
+#define DEV_PCIE_PORT_0	0x7a09
+#define DEV_PCIE_PORT_1	0x7a19
+#define DEV_PCIE_PORT_2	0x7a29
+
+#define DEV_LS2K_APB	0x7a02
+#define DEV_LS7A_CONF	0x7a10
+#define DEV_LS7A_LPC	0x7a0c
+
+/* Fixup wrong class code in PCIe bridges */
+static void loongson_bridge_class_quirk(struct pci_dev *dev)
+{
+	dev->class = PCI_CLASS_BRIDGE_PCI << 8;
+}
+DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_LOONGSON,
+			DEV_PCIE_PORT_0, loongson_bridge_class_quirk);
+DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_LOONGSON,
+			DEV_PCIE_PORT_1, loongson_bridge_class_quirk);
+DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_LOONGSON,
+			DEV_PCIE_PORT_2, loongson_bridge_class_quirk);
+
+static void loongson_system_bus_quirk(struct pci_dev *pdev)
+{
+	/*
+	 * The address space consumed by these devices is outside the
+	 * resources of the host bridge.
+	 */
+	pdev->mmio_always_on = 1;
+	pdev->non_compliant_bars = 1;
+}
+DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_LOONGSON,
+			DEV_LS2K_APB, loongson_system_bus_quirk);
+DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_LOONGSON,
+			DEV_LS7A_CONF, loongson_system_bus_quirk);
+DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_LOONGSON,
+			DEV_LS7A_LPC, loongson_system_bus_quirk);
+
+static void loongson_mrrs_quirk(struct pci_dev *dev)
+{
+	struct pci_bus *bus = dev->bus;
+	struct pci_dev *bridge;
+	static const struct pci_device_id bridge_devids[] = {
+		{ PCI_VDEVICE(LOONGSON, DEV_PCIE_PORT_0) },
+		{ PCI_VDEVICE(LOONGSON, DEV_PCIE_PORT_1) },
+		{ PCI_VDEVICE(LOONGSON, DEV_PCIE_PORT_2) },
+		{ 0, },
+	};
+
+	/* look for the matching bridge */
+	while (!pci_is_root_bus(bus)) {
+		bridge = bus->self;
+		bus = bus->parent;
+		/*
+		 * Some Loongson PCIe ports have a h/w limitation of
+		 * 256 bytes maximum read request size. They can't handle
+		 * anything larger than this. So force this limit on
+		 * any devices attached under these ports.
+		 */
+		if (pci_match_id(bridge_devids, bridge)) {
+			if (pcie_get_readrq(dev) > 256) {
+				pci_info(dev, "limiting MRRS to 256\n");
+				pcie_set_readrq(dev, 256);
+			}
+			break;
+		}
+	}
+}
+DECLARE_PCI_FIXUP_ENABLE(PCI_ANY_ID, PCI_ANY_ID, loongson_mrrs_quirk);
+
 /*
  * The Mellanox Tavor device gives false positive parity errors.  Disable
  * parity error reporting.
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH V2 3/4] PCI: Improve the MRRS quirk for LS7A
  2021-05-28  7:14 [PATCH V2 0/4] PCI: Loongson-related pci quirks Huacai Chen
  2021-05-28  7:15 ` [PATCH V2 1/4] PCI/portdrv: Don't disable device during shutdown Huacai Chen
  2021-05-28  7:15 ` [PATCH V2 2/4] PCI: Move loongson pci quirks to quirks.c Huacai Chen
@ 2021-05-28  7:15 ` Huacai Chen
  2021-05-28 20:32   ` Bjorn Helgaas
  2021-05-28  7:15 ` [PATCH V2 4/4] PCI: Add quirk for multifunction devices of LS7A Huacai Chen
  3 siblings, 1 reply; 20+ messages in thread
From: Huacai Chen @ 2021-05-28  7:15 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: linux-pci, Xuefeng Li, Huacai Chen, Jiaxun Yang, Huacai Chen

In new revision of LS7A, some PCIe ports support larger value than 256,
but their maximum supported MRRS values are not detectable. Moreover,
the current loongson_mrrs_quirk() cannot avoid devices increasing its
MRRS after pci_enable_device(). So the only possible way is configure
MRRS of all devices in BIOS, and add a PCI device flag (PCI_DEV_FLAGS_
NO_INCREASE_MRRS) to stop the increasing MRRS operations.

Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
---
 drivers/pci/pci.c    | 5 +++++
 drivers/pci/quirks.c | 6 ++++++
 include/linux/pci.h  | 2 ++
 3 files changed, 13 insertions(+)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index b717680377a9..6f0d2f5b6f30 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -5802,6 +5802,11 @@ int pcie_set_readrq(struct pci_dev *dev, int rq)
 
 	v = (ffs(rq) - 8) << 12;
 
+	if (dev->dev_flags & PCI_DEV_FLAGS_NO_INCREASE_MRRS) {
+		if (rq > pcie_get_readrq(dev))
+			return -EINVAL;
+	}
+
 	ret = pcie_capability_clear_and_set_word(dev, PCI_EXP_DEVCTL,
 						  PCI_EXP_DEVCTL_READRQ, v);
 
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 66e4bea69431..10b3b2057940 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -264,6 +264,12 @@ static void loongson_mrrs_quirk(struct pci_dev *dev)
 		 * any devices attached under these ports.
 		 */
 		if (pci_match_id(bridge_devids, bridge)) {
+			dev->dev_flags |= PCI_DEV_FLAGS_NO_INCREASE_MRRS;
+
+			if (pcie_bus_config == PCIE_BUS_DEFAULT ||
+			    pcie_bus_config == PCIE_BUS_TUNE_OFF)
+				break;
+
 			if (pcie_get_readrq(dev) > 256) {
 				pci_info(dev, "limiting MRRS to 256\n");
 				pcie_set_readrq(dev, 256);
diff --git a/include/linux/pci.h b/include/linux/pci.h
index c20211e59a57..7fb2072a83b8 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -227,6 +227,8 @@ enum pci_dev_flags {
 	PCI_DEV_FLAGS_NO_FLR_RESET = (__force pci_dev_flags_t) (1 << 10),
 	/* Don't use Relaxed Ordering for TLPs directed at this device */
 	PCI_DEV_FLAGS_NO_RELAXED_ORDERING = (__force pci_dev_flags_t) (1 << 11),
+	/* Don't increase BIOS's MRRS configuration */
+	PCI_DEV_FLAGS_NO_INCREASE_MRRS = (__force pci_dev_flags_t) (1 << 12),
 };
 
 enum pci_irq_reroute_variant {
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH V2 4/4] PCI: Add quirk for multifunction devices of LS7A
  2021-05-28  7:14 [PATCH V2 0/4] PCI: Loongson-related pci quirks Huacai Chen
                   ` (2 preceding siblings ...)
  2021-05-28  7:15 ` [PATCH V2 3/4] PCI: Improve the MRRS quirk for LS7A Huacai Chen
@ 2021-05-28  7:15 ` Huacai Chen
  2021-05-28 20:51   ` Bjorn Helgaas
  3 siblings, 1 reply; 20+ messages in thread
From: Huacai Chen @ 2021-05-28  7:15 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: linux-pci, Xuefeng Li, Huacai Chen, Jiaxun Yang, Jianmin Lv, Huacai Chen

From: Jianmin Lv <lvjianmin@loongson.cn>

In LS7A, multifunction device use same PCI PIN (because the PIN register
report the same INTx value to each function) but we need different IRQ
for different functions, so add a quirk to fix it for standard PCI PIN
usage.

Signed-off-by: Jianmin Lv <lvjianmin@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
---
 drivers/pci/quirks.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 10b3b2057940..6ab4b3bba36b 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -242,6 +242,14 @@ DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_LOONGSON,
 DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_LOONGSON,
 			DEV_LS7A_LPC, loongson_system_bus_quirk);
 
+static void loongson_pci_pin_quirk(struct pci_dev *dev)
+{
+	dev->pin = 1 + (PCI_FUNC(dev->devfn) & 3);
+}
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LOONGSON, DEV_PCIE_PORT_0, loongson_pci_pin_quirk);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LOONGSON, DEV_PCIE_PORT_1, loongson_pci_pin_quirk);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LOONGSON, DEV_PCIE_PORT_2, loongson_pci_pin_quirk);
+
 static void loongson_mrrs_quirk(struct pci_dev *dev)
 {
 	struct pci_bus *bus = dev->bus;
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH V2 3/4] PCI: Improve the MRRS quirk for LS7A
  2021-05-28  7:15 ` [PATCH V2 3/4] PCI: Improve the MRRS quirk for LS7A Huacai Chen
@ 2021-05-28 20:32   ` Bjorn Helgaas
  2021-06-04  9:43     ` Huacai Chen
  0 siblings, 1 reply; 20+ messages in thread
From: Bjorn Helgaas @ 2021-05-28 20:32 UTC (permalink / raw)
  To: Huacai Chen
  Cc: Bjorn Helgaas, linux-pci, Xuefeng Li, Huacai Chen, Jiaxun Yang

On Fri, May 28, 2021 at 03:15:02PM +0800, Huacai Chen wrote:
> In new revision of LS7A, some PCIe ports support larger value than 256,
> but their maximum supported MRRS values are not detectable. Moreover,
> the current loongson_mrrs_quirk() cannot avoid devices increasing its
> MRRS after pci_enable_device(). So the only possible way is configure
> MRRS of all devices in BIOS, and add a PCI device flag (PCI_DEV_FLAGS_
> NO_INCREASE_MRRS) to stop the increasing MRRS operations.

It's still not clear what the problem is.

As far as I can tell from the PCIe spec, it is legal for an OS to
program any value for MRRS, and it is legal for an endpoint to
generate a Read Request with any size up to its MRRS.  If you
disagree, please cite the relevant section in the spec.

There is no requirement for the OS to limit the MRRS based on a
restriction elsewhere in the system.  There is no mechanism for the OS
to even discover such a restriction.

Of course, there is also no requirement that the PCIe Completions
carrying the read data be the same size as the MRRS.  If the non-PCIe
part of the system has a restriction on read burst size, that part of
the system can certainly break up the read and respond with several
PCIe Completions.

If LS7A can't break up read requests, that sounds like a problem in
the LS7A design.  We should have a description of this erratum.  And
we should have some statement about this being fixed in future
designs, because we don't want to have to update the fixup with the
PCI vendor/device IDs every time new versions come out.

I also don't want to rely on some value left in MRRS by BIOS.  If
certain bridges have specific limits on what MRRS can be, the fixup
should have those limits in it.

loongson_mrrs_quirk() doesn't look efficient.  We should not have to
run the fixup for *every* PCI device in the system.  Also, we should
not mark every *device* below an LS7A, because it's not the devices
that are defective.

If it's the root port or the host bridge that's defective, we should
mark *that*, e.g., something along the lines of how quirk_no_ext_tags()
works.

> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
> ---
>  drivers/pci/pci.c    | 5 +++++
>  drivers/pci/quirks.c | 6 ++++++
>  include/linux/pci.h  | 2 ++
>  3 files changed, 13 insertions(+)
> 
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index b717680377a9..6f0d2f5b6f30 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -5802,6 +5802,11 @@ int pcie_set_readrq(struct pci_dev *dev, int rq)
>  
>  	v = (ffs(rq) - 8) << 12;
>  
> +	if (dev->dev_flags & PCI_DEV_FLAGS_NO_INCREASE_MRRS) {
> +		if (rq > pcie_get_readrq(dev))
> +			return -EINVAL;
> +	}
> +
>  	ret = pcie_capability_clear_and_set_word(dev, PCI_EXP_DEVCTL,
>  						  PCI_EXP_DEVCTL_READRQ, v);
>  
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index 66e4bea69431..10b3b2057940 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -264,6 +264,12 @@ static void loongson_mrrs_quirk(struct pci_dev *dev)
>  		 * any devices attached under these ports.
>  		 */
>  		if (pci_match_id(bridge_devids, bridge)) {
> +			dev->dev_flags |= PCI_DEV_FLAGS_NO_INCREASE_MRRS;
> +
> +			if (pcie_bus_config == PCIE_BUS_DEFAULT ||
> +			    pcie_bus_config == PCIE_BUS_TUNE_OFF)
> +				break;
> +
>  			if (pcie_get_readrq(dev) > 256) {
>  				pci_info(dev, "limiting MRRS to 256\n");
>  				pcie_set_readrq(dev, 256);
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index c20211e59a57..7fb2072a83b8 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -227,6 +227,8 @@ enum pci_dev_flags {
>  	PCI_DEV_FLAGS_NO_FLR_RESET = (__force pci_dev_flags_t) (1 << 10),
>  	/* Don't use Relaxed Ordering for TLPs directed at this device */
>  	PCI_DEV_FLAGS_NO_RELAXED_ORDERING = (__force pci_dev_flags_t) (1 << 11),
> +	/* Don't increase BIOS's MRRS configuration */
> +	PCI_DEV_FLAGS_NO_INCREASE_MRRS = (__force pci_dev_flags_t) (1 << 12),
>  };
>  
>  enum pci_irq_reroute_variant {
> -- 
> 2.27.0
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH V2 4/4] PCI: Add quirk for multifunction devices of LS7A
  2021-05-28  7:15 ` [PATCH V2 4/4] PCI: Add quirk for multifunction devices of LS7A Huacai Chen
@ 2021-05-28 20:51   ` Bjorn Helgaas
  2021-06-04  9:59     ` Huacai Chen
  0 siblings, 1 reply; 20+ messages in thread
From: Bjorn Helgaas @ 2021-05-28 20:51 UTC (permalink / raw)
  To: Huacai Chen
  Cc: Bjorn Helgaas, linux-pci, Xuefeng Li, Huacai Chen, Jiaxun Yang,
	Jianmin Lv, Rob Herring

[+cc Rob, previous discussion at
https://lore.kernel.org/r/20210514080025.1828197-5-chenhuacai@loongson.cn]

On Fri, May 28, 2021 at 03:15:03PM +0800, Huacai Chen wrote:
> From: Jianmin Lv <lvjianmin@loongson.cn>
> 
> In LS7A, multifunction device use same PCI PIN (because the PIN register
> report the same INTx value to each function) but we need different IRQ
> for different functions, so add a quirk to fix it for standard PCI PIN
> usage.

This seems to say that PCI_INTERRUPT_PIN reports the same value for
all functions, but the functions actually use *different* IRQs.

That would be a hardware defect, and of course, we can work around
such things.  It's always better if you can assure us that the defect
will be fixed in future designs so we don't have to update the
workaround with more device IDs.

But Jiaxun suggests [1] that the FDT says all the interrupts go to the
same IRQ.

So I don't know what's going on here.  We can certainly work around a
problem, but of course, this quirk would apply for both FDT and
ACPI-based systems, and the FDT systems seem to work without it.

[1] https://lore.kernel.org/r/933330cb-9d86-2b22-9bed-64becefbe2d1@flygoat.com

> Signed-off-by: Jianmin Lv <lvjianmin@loongson.cn>
> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
> ---
>  drivers/pci/quirks.c | 17 +++++++++++++++++
>  1 file changed, 17 insertions(+)
> 
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index 10b3b2057940..6ab4b3bba36b 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -242,6 +242,14 @@ DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_LOONGSON,
>  DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_LOONGSON,
>  			DEV_LS7A_LPC, loongson_system_bus_quirk);
>  
> +static void loongson_pci_pin_quirk(struct pci_dev *dev)
> +{
> +	dev->pin = 1 + (PCI_FUNC(dev->devfn) & 3);
> +}
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LOONGSON, DEV_PCIE_PORT_0, loongson_pci_pin_quirk);
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LOONGSON, DEV_PCIE_PORT_1, loongson_pci_pin_quirk);
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LOONGSON, DEV_PCIE_PORT_2, loongson_pci_pin_quirk);
> +
>  static void loongson_mrrs_quirk(struct pci_dev *dev)
>  {
>  	struct pci_bus *bus = dev->bus;
> -- 
> 2.27.0
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH V2 1/4] PCI/portdrv: Don't disable device during shutdown
  2021-05-28  7:15 ` [PATCH V2 1/4] PCI/portdrv: Don't disable device during shutdown Huacai Chen
@ 2021-05-28 21:43   ` Bjorn Helgaas
  2021-06-04  9:24     ` Huacai Chen
  0 siblings, 1 reply; 20+ messages in thread
From: Bjorn Helgaas @ 2021-05-28 21:43 UTC (permalink / raw)
  To: Huacai Chen
  Cc: Bjorn Helgaas, linux-pci, Xuefeng Li, Huacai Chen, Jiaxun Yang,
	Tiezhu Yang, Sinan Kaya

[+cc Sinan]

On Fri, May 28, 2021 at 03:15:00PM +0800, Huacai Chen wrote:
> Use separate remove()/shutdown() callback, and don't disable PCI device
> during shutdown. This can avoid some poweroff/reboot failures.
>
> The poweroff/reboot failures could easily be reproduced on Loongson
> platforms. I think this is not a Loongson-specific problem, instead, is
> a problem related to some specific PCI hosts. On some x86 platforms,
> radeon/amdgpu devices can cause the same problem [1][2], and commit
> faefba95c9e8ca3a ("drm/amdgpu: just suspend the hw on pci shutdown")
> can resolve it.
> 
> As Tiezhu said, this occasionally shutdown or reboot failure is due to
> clear PCI_COMMAND_MASTER on the device in do_pci_disable_device() [3].
> 
> static void do_pci_disable_device(struct pci_dev *dev)
> {
>         u16 pci_command;
> 
>         pci_read_config_word(dev, PCI_COMMAND, &pci_command);
>         if (pci_command & PCI_COMMAND_MASTER) {
>                 pci_command &= ~PCI_COMMAND_MASTER;
>                 pci_write_config_word(dev, PCI_COMMAND, pci_command);
>         }
> 
>         pcibios_disable_device(dev);
> }
> 
> When remove "pci_command &= ~PCI_COMMAND_MASTER;", it can work well when
> shutdown or reboot. This may implies that there are DMA activities on the
> device while shutdown.
> 
> Radeon driver is more difficult than amdgpu due to its confusing symbol
> names, and I have maintained an out-of-tree patch for a long time [4].
> Recently, we found more and more devices can cause the same problem, and
> it is very difficult to modify all problematic drivers as radeon/amdgpu
> does (the .shutdown callback should make sure there is no DMA activity).
> So, I think modify the PCIe port driver is a simple and effective way.
> Because there is no poweroff/reboot problems before cc27b735ad3a75574a6a
> ("PCI/portdrv: Turn off PCIe services during shutdown"). And as early
> discussed, kexec can still work fine after this patch [5].

This needs to say *what* the failure is, and *why* the failure occurs.
There's a lot of hand-waving here but nothing really specific.

I'm getting the impression that:

  - Whatever the problem is, it didn't happen before cc27b735ad3a
    ("PCI/portdrv: Turn off PCIe services during shutdown").

  - cc27b735ad3a added a .shutdown() method for portdrv that calls
    pci_disable_device().

  - pci_disable_device() turns off bus mastering for the PCIe port,
    which means DMA from devices below the port will stop.

  - If you change the portdrv .shutdown() so DMA from devices below
    the port can continue, shutdown and reboot start working again.

So you need to explain why we need to allow DMA from those devices
even after we shutdown the port.  "It makes reboot work" is not a
sufficient explanation.

When you suggest that commit X introduced a problem, please cc the
author of X.  I added Sinan for you in this case.

A patch that fixes a problem with X should also include a "Fixes: X"
tag to help people connect problems with fixes.

> [1] https://bugs.freedesktop.org/show_bug.cgi?id=97980
> [2] https://bugs.freedesktop.org/show_bug.cgi?id=98638
> [3] https://lore.kernel.org/patchwork/patch/1305067/
> [4] https://github.com/chenhuacai/linux/commit/8da06f9b669831829416a3e9f4d1c57f217a42f0
> [5] http://patchwork.ozlabs.org/project/linux-pci/patch/1600680138-10949-1-git-send-email-chenhc@lemote.com/
> 
> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
> Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
> ---
>  drivers/pci/pcie/portdrv.h      |  2 +-
>  drivers/pci/pcie/portdrv_core.c |  6 ++++--
>  drivers/pci/pcie/portdrv_pci.c  | 15 +++++++++++++--
>  3 files changed, 18 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/pci/pcie/portdrv.h b/drivers/pci/pcie/portdrv.h
> index 2ff5724b8f13..358d7281f6e8 100644
> --- a/drivers/pci/pcie/portdrv.h
> +++ b/drivers/pci/pcie/portdrv.h
> @@ -117,7 +117,7 @@ int pcie_port_device_resume(struct device *dev);
>  int pcie_port_device_runtime_suspend(struct device *dev);
>  int pcie_port_device_runtime_resume(struct device *dev);
>  #endif
> -void pcie_port_device_remove(struct pci_dev *dev);
> +void pcie_port_device_remove(struct pci_dev *dev, bool disable);
>  int __must_check pcie_port_bus_register(void);
>  void pcie_port_bus_unregister(void);
>  
> diff --git a/drivers/pci/pcie/portdrv_core.c b/drivers/pci/pcie/portdrv_core.c
> index e1fed6649c41..98c0a99a41d6 100644
> --- a/drivers/pci/pcie/portdrv_core.c
> +++ b/drivers/pci/pcie/portdrv_core.c
> @@ -484,11 +484,13 @@ EXPORT_SYMBOL_GPL(pcie_port_find_device);
>   * Remove PCI Express port service devices associated with given port and
>   * disable MSI-X or MSI for the port.
>   */
> -void pcie_port_device_remove(struct pci_dev *dev)
> +void pcie_port_device_remove(struct pci_dev *dev, bool disable)
>  {
>  	device_for_each_child(&dev->dev, NULL, remove_iter);
>  	pci_free_irq_vectors(dev);
> -	pci_disable_device(dev);
> +
> +	if (disable)
> +		pci_disable_device(dev);
>  }
>  
>  /**
> diff --git a/drivers/pci/pcie/portdrv_pci.c b/drivers/pci/pcie/portdrv_pci.c
> index c7ff1eea225a..562fbf3c1ea9 100644
> --- a/drivers/pci/pcie/portdrv_pci.c
> +++ b/drivers/pci/pcie/portdrv_pci.c
> @@ -147,7 +147,18 @@ static void pcie_portdrv_remove(struct pci_dev *dev)
>  		pm_runtime_dont_use_autosuspend(&dev->dev);
>  	}
>  
> -	pcie_port_device_remove(dev);
> +	pcie_port_device_remove(dev, true);
> +}
> +
> +static void pcie_portdrv_shutdown(struct pci_dev *dev)
> +{
> +	if (pci_bridge_d3_possible(dev)) {
> +		pm_runtime_forbid(&dev->dev);
> +		pm_runtime_get_noresume(&dev->dev);
> +		pm_runtime_dont_use_autosuspend(&dev->dev);
> +	}
> +
> +	pcie_port_device_remove(dev, false);
>  }
>  
>  static pci_ers_result_t pcie_portdrv_error_detected(struct pci_dev *dev,
> @@ -219,7 +230,7 @@ static struct pci_driver pcie_portdriver = {
>  
>  	.probe		= pcie_portdrv_probe,
>  	.remove		= pcie_portdrv_remove,
> -	.shutdown	= pcie_portdrv_remove,
> +	.shutdown	= pcie_portdrv_shutdown,
>  
>  	.err_handler	= &pcie_portdrv_err_handler,
>  
> -- 
> 2.27.0
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH V2 1/4] PCI/portdrv: Don't disable device during shutdown
  2021-05-28 21:43   ` Bjorn Helgaas
@ 2021-06-04  9:24     ` Huacai Chen
  2021-06-07 20:43       ` Sinan Kaya
  0 siblings, 1 reply; 20+ messages in thread
From: Huacai Chen @ 2021-06-04  9:24 UTC (permalink / raw)
  To: Bjorn Helgaas, huangshuai
  Cc: Huacai Chen, Bjorn Helgaas, linux-pci, Xuefeng Li, Jiaxun Yang,
	Tiezhu Yang, Sinan Kaya

Hi, Bjorn,

On Sat, May 29, 2021 at 5:43 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
>
> [+cc Sinan]
>
> On Fri, May 28, 2021 at 03:15:00PM +0800, Huacai Chen wrote:
> > Use separate remove()/shutdown() callback, and don't disable PCI device
> > during shutdown. This can avoid some poweroff/reboot failures.
> >
> > The poweroff/reboot failures could easily be reproduced on Loongson
> > platforms. I think this is not a Loongson-specific problem, instead, is
> > a problem related to some specific PCI hosts. On some x86 platforms,
> > radeon/amdgpu devices can cause the same problem [1][2], and commit
> > faefba95c9e8ca3a ("drm/amdgpu: just suspend the hw on pci shutdown")
> > can resolve it.
> >
> > As Tiezhu said, this occasionally shutdown or reboot failure is due to
> > clear PCI_COMMAND_MASTER on the device in do_pci_disable_device() [3].
> >
> > static void do_pci_disable_device(struct pci_dev *dev)
> > {
> >         u16 pci_command;
> >
> >         pci_read_config_word(dev, PCI_COMMAND, &pci_command);
> >         if (pci_command & PCI_COMMAND_MASTER) {
> >                 pci_command &= ~PCI_COMMAND_MASTER;
> >                 pci_write_config_word(dev, PCI_COMMAND, pci_command);
> >         }
> >
> >         pcibios_disable_device(dev);
> > }
> >
> > When remove "pci_command &= ~PCI_COMMAND_MASTER;", it can work well when
> > shutdown or reboot. This may implies that there are DMA activities on the
> > device while shutdown.
> >
> > Radeon driver is more difficult than amdgpu due to its confusing symbol
> > names, and I have maintained an out-of-tree patch for a long time [4].
> > Recently, we found more and more devices can cause the same problem, and
> > it is very difficult to modify all problematic drivers as radeon/amdgpu
> > does (the .shutdown callback should make sure there is no DMA activity).
> > So, I think modify the PCIe port driver is a simple and effective way.
> > Because there is no poweroff/reboot problems before cc27b735ad3a75574a6a
> > ("PCI/portdrv: Turn off PCIe services during shutdown"). And as early
> > discussed, kexec can still work fine after this patch [5].
>
> This needs to say *what* the failure is, and *why* the failure occurs.
> There's a lot of hand-waving here but nothing really specific.
>
> I'm getting the impression that:
>
>   - Whatever the problem is, it didn't happen before cc27b735ad3a
>     ("PCI/portdrv: Turn off PCIe services during shutdown").
>
>   - cc27b735ad3a added a .shutdown() method for portdrv that calls
>     pci_disable_device().
>
>   - pci_disable_device() turns off bus mastering for the PCIe port,
>     which means DMA from devices below the port will stop.
>
>   - If you change the portdrv .shutdown() so DMA from devices below
>     the port can continue, shutdown and reboot start working again.
>
Yes, that all.

> So you need to explain why we need to allow DMA from those devices
> even after we shutdown the port.  "It makes reboot work" is not a
> sufficient explanation.
I think only the designer of LS7A can tell us why. So, Mr. Shuai
Huang, could you please explain this?

>
> When you suggest that commit X introduced a problem, please cc the
> author of X.  I added Sinan for you in this case.
>
> A patch that fixes a problem with X should also include a "Fixes: X"
> tag to help people connect problems with fixes.
OK, I'll add a Fixes tag in the next version.

Huacai
>
> > [1] https://bugs.freedesktop.org/show_bug.cgi?id=97980
> > [2] https://bugs.freedesktop.org/show_bug.cgi?id=98638
> > [3] https://lore.kernel.org/patchwork/patch/1305067/
> > [4] https://github.com/chenhuacai/linux/commit/8da06f9b669831829416a3e9f4d1c57f217a42f0
> > [5] http://patchwork.ozlabs.org/project/linux-pci/patch/1600680138-10949-1-git-send-email-chenhc@lemote.com/
> >
> > Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
> > Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
> > ---
> >  drivers/pci/pcie/portdrv.h      |  2 +-
> >  drivers/pci/pcie/portdrv_core.c |  6 ++++--
> >  drivers/pci/pcie/portdrv_pci.c  | 15 +++++++++++++--
> >  3 files changed, 18 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/pci/pcie/portdrv.h b/drivers/pci/pcie/portdrv.h
> > index 2ff5724b8f13..358d7281f6e8 100644
> > --- a/drivers/pci/pcie/portdrv.h
> > +++ b/drivers/pci/pcie/portdrv.h
> > @@ -117,7 +117,7 @@ int pcie_port_device_resume(struct device *dev);
> >  int pcie_port_device_runtime_suspend(struct device *dev);
> >  int pcie_port_device_runtime_resume(struct device *dev);
> >  #endif
> > -void pcie_port_device_remove(struct pci_dev *dev);
> > +void pcie_port_device_remove(struct pci_dev *dev, bool disable);
> >  int __must_check pcie_port_bus_register(void);
> >  void pcie_port_bus_unregister(void);
> >
> > diff --git a/drivers/pci/pcie/portdrv_core.c b/drivers/pci/pcie/portdrv_core.c
> > index e1fed6649c41..98c0a99a41d6 100644
> > --- a/drivers/pci/pcie/portdrv_core.c
> > +++ b/drivers/pci/pcie/portdrv_core.c
> > @@ -484,11 +484,13 @@ EXPORT_SYMBOL_GPL(pcie_port_find_device);
> >   * Remove PCI Express port service devices associated with given port and
> >   * disable MSI-X or MSI for the port.
> >   */
> > -void pcie_port_device_remove(struct pci_dev *dev)
> > +void pcie_port_device_remove(struct pci_dev *dev, bool disable)
> >  {
> >       device_for_each_child(&dev->dev, NULL, remove_iter);
> >       pci_free_irq_vectors(dev);
> > -     pci_disable_device(dev);
> > +
> > +     if (disable)
> > +             pci_disable_device(dev);
> >  }
> >
> >  /**
> > diff --git a/drivers/pci/pcie/portdrv_pci.c b/drivers/pci/pcie/portdrv_pci.c
> > index c7ff1eea225a..562fbf3c1ea9 100644
> > --- a/drivers/pci/pcie/portdrv_pci.c
> > +++ b/drivers/pci/pcie/portdrv_pci.c
> > @@ -147,7 +147,18 @@ static void pcie_portdrv_remove(struct pci_dev *dev)
> >               pm_runtime_dont_use_autosuspend(&dev->dev);
> >       }
> >
> > -     pcie_port_device_remove(dev);
> > +     pcie_port_device_remove(dev, true);
> > +}
> > +
> > +static void pcie_portdrv_shutdown(struct pci_dev *dev)
> > +{
> > +     if (pci_bridge_d3_possible(dev)) {
> > +             pm_runtime_forbid(&dev->dev);
> > +             pm_runtime_get_noresume(&dev->dev);
> > +             pm_runtime_dont_use_autosuspend(&dev->dev);
> > +     }
> > +
> > +     pcie_port_device_remove(dev, false);
> >  }
> >
> >  static pci_ers_result_t pcie_portdrv_error_detected(struct pci_dev *dev,
> > @@ -219,7 +230,7 @@ static struct pci_driver pcie_portdriver = {
> >
> >       .probe          = pcie_portdrv_probe,
> >       .remove         = pcie_portdrv_remove,
> > -     .shutdown       = pcie_portdrv_remove,
> > +     .shutdown       = pcie_portdrv_shutdown,
> >
> >       .err_handler    = &pcie_portdrv_err_handler,
> >
> > --
> > 2.27.0
> >

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH V2 3/4] PCI: Improve the MRRS quirk for LS7A
  2021-05-28 20:32   ` Bjorn Helgaas
@ 2021-06-04  9:43     ` Huacai Chen
  2021-06-05 21:34       ` Bjorn Helgaas
  0 siblings, 1 reply; 20+ messages in thread
From: Huacai Chen @ 2021-06-04  9:43 UTC (permalink / raw)
  To: Bjorn Helgaas, huangshuai
  Cc: Huacai Chen, Bjorn Helgaas, linux-pci, Xuefeng Li, Jiaxun Yang

Hi, Bjorn,

On Sat, May 29, 2021 at 4:32 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
>
> On Fri, May 28, 2021 at 03:15:02PM +0800, Huacai Chen wrote:
> > In new revision of LS7A, some PCIe ports support larger value than 256,
> > but their maximum supported MRRS values are not detectable. Moreover,
> > the current loongson_mrrs_quirk() cannot avoid devices increasing its
> > MRRS after pci_enable_device(). So the only possible way is configure
> > MRRS of all devices in BIOS, and add a PCI device flag (PCI_DEV_FLAGS_
> > NO_INCREASE_MRRS) to stop the increasing MRRS operations.
>
> It's still not clear what the problem is.
>
> As far as I can tell from the PCIe spec, it is legal for an OS to
> program any value for MRRS, and it is legal for an endpoint to
> generate a Read Request with any size up to its MRRS.  If you
> disagree, please cite the relevant section in the spec.
>
> There is no requirement for the OS to limit the MRRS based on a
> restriction elsewhere in the system.  There is no mechanism for the OS
> to even discover such a restriction.
>
> Of course, there is also no requirement that the PCIe Completions
> carrying the read data be the same size as the MRRS.  If the non-PCIe
> part of the system has a restriction on read burst size, that part of
> the system can certainly break up the read and respond with several
> PCIe Completions.
>
> If LS7A can't break up read requests, that sounds like a problem in
> the LS7A design.  We should have a description of this erratum.  And
> we should have some statement about this being fixed in future
> designs, because we don't want to have to update the fixup with the
> PCI vendor/device IDs every time new versions come out.
Thanks for your information, but I think only Mr. Shuai Huang can
explain the root cause, too.

>
> I also don't want to rely on some value left in MRRS by BIOS.  If
> certain bridges have specific limits on what MRRS can be, the fixup
> should have those limits in it.
As I know, each port of LS7A has a different maximum MRRS value (Yes,
as you said, this is unreasonable in PCIe spec. but it is a fact in
LS7A), and also different between hardware revisions. So, the kernel
cannot configure it, and relying on BIOS is the only way.

>
> loongson_mrrs_quirk() doesn't look efficient.  We should not have to
> run the fixup for *every* PCI device in the system.  Also, we should
> not mark every *device* below an LS7A, because it's not the devices
> that are defective.
>
> If it's the root port or the host bridge that's defective, we should
> mark *that*, e.g., something along the lines of how quirk_no_ext_tags()
> works.
OK, I'll improve my code.

Huacai
>
> > Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
> > ---
> >  drivers/pci/pci.c    | 5 +++++
> >  drivers/pci/quirks.c | 6 ++++++
> >  include/linux/pci.h  | 2 ++
> >  3 files changed, 13 insertions(+)
> >
> > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> > index b717680377a9..6f0d2f5b6f30 100644
> > --- a/drivers/pci/pci.c
> > +++ b/drivers/pci/pci.c
> > @@ -5802,6 +5802,11 @@ int pcie_set_readrq(struct pci_dev *dev, int rq)
> >
> >       v = (ffs(rq) - 8) << 12;
> >
> > +     if (dev->dev_flags & PCI_DEV_FLAGS_NO_INCREASE_MRRS) {
> > +             if (rq > pcie_get_readrq(dev))
> > +                     return -EINVAL;
> > +     }
> > +
> >       ret = pcie_capability_clear_and_set_word(dev, PCI_EXP_DEVCTL,
> >                                                 PCI_EXP_DEVCTL_READRQ, v);
> >
> > diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> > index 66e4bea69431..10b3b2057940 100644
> > --- a/drivers/pci/quirks.c
> > +++ b/drivers/pci/quirks.c
> > @@ -264,6 +264,12 @@ static void loongson_mrrs_quirk(struct pci_dev *dev)
> >                * any devices attached under these ports.
> >                */
> >               if (pci_match_id(bridge_devids, bridge)) {
> > +                     dev->dev_flags |= PCI_DEV_FLAGS_NO_INCREASE_MRRS;
> > +
> > +                     if (pcie_bus_config == PCIE_BUS_DEFAULT ||
> > +                         pcie_bus_config == PCIE_BUS_TUNE_OFF)
> > +                             break;
> > +
> >                       if (pcie_get_readrq(dev) > 256) {
> >                               pci_info(dev, "limiting MRRS to 256\n");
> >                               pcie_set_readrq(dev, 256);
> > diff --git a/include/linux/pci.h b/include/linux/pci.h
> > index c20211e59a57..7fb2072a83b8 100644
> > --- a/include/linux/pci.h
> > +++ b/include/linux/pci.h
> > @@ -227,6 +227,8 @@ enum pci_dev_flags {
> >       PCI_DEV_FLAGS_NO_FLR_RESET = (__force pci_dev_flags_t) (1 << 10),
> >       /* Don't use Relaxed Ordering for TLPs directed at this device */
> >       PCI_DEV_FLAGS_NO_RELAXED_ORDERING = (__force pci_dev_flags_t) (1 << 11),
> > +     /* Don't increase BIOS's MRRS configuration */
> > +     PCI_DEV_FLAGS_NO_INCREASE_MRRS = (__force pci_dev_flags_t) (1 << 12),
> >  };
> >
> >  enum pci_irq_reroute_variant {
> > --
> > 2.27.0
> >

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH V2 4/4] PCI: Add quirk for multifunction devices of LS7A
  2021-05-28 20:51   ` Bjorn Helgaas
@ 2021-06-04  9:59     ` Huacai Chen
  2021-06-05 21:28       ` Bjorn Helgaas
  0 siblings, 1 reply; 20+ messages in thread
From: Huacai Chen @ 2021-06-04  9:59 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Huacai Chen, Bjorn Helgaas, linux-pci, Xuefeng Li, Jiaxun Yang,
	Jianmin Lv, Rob Herring

Hi, Bjorn,

On Sat, May 29, 2021 at 4:51 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
>
> [+cc Rob, previous discussion at
> https://lore.kernel.org/r/20210514080025.1828197-5-chenhuacai@loongson.cn]
>
> On Fri, May 28, 2021 at 03:15:03PM +0800, Huacai Chen wrote:
> > From: Jianmin Lv <lvjianmin@loongson.cn>
> >
> > In LS7A, multifunction device use same PCI PIN (because the PIN register
> > report the same INTx value to each function) but we need different IRQ
> > for different functions, so add a quirk to fix it for standard PCI PIN
> > usage.
>
> This seems to say that PCI_INTERRUPT_PIN reports the same value for
> all functions, but the functions actually use *different* IRQs.
>
> That would be a hardware defect, and of course, we can work around
> such things.  It's always better if you can assure us that the defect
> will be fixed in future designs so we don't have to update the
> workaround with more device IDs.
Yes, you are right, and new hardware will not need this workaround.

>
> But Jiaxun suggests [1] that the FDT says all the interrupts go to the
> same IRQ.
>
> So I don't know what's going on here.  We can certainly work around a
> problem, but of course, this quirk would apply for both FDT and
> ACPI-based systems, and the FDT systems seem to work without it.
Emmm, I have discussed with Jiaxun, and maybe you missed something. He
means that ACPI systems need this workaround, and FDT systems don't
need this. But this workaround doesn't break FDT systems, because FDT
systems simply ignore the workaround (interrupts are specified in .dts
file).

Huacai
>
> [1] https://lore.kernel.org/r/933330cb-9d86-2b22-9bed-64becefbe2d1@flygoat.com
>
> > Signed-off-by: Jianmin Lv <lvjianmin@loongson.cn>
> > Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
> > ---
> >  drivers/pci/quirks.c | 17 +++++++++++++++++
> >  1 file changed, 17 insertions(+)
> >
> > diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> > index 10b3b2057940..6ab4b3bba36b 100644
> > --- a/drivers/pci/quirks.c
> > +++ b/drivers/pci/quirks.c
> > @@ -242,6 +242,14 @@ DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_LOONGSON,
> >  DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_LOONGSON,
> >                       DEV_LS7A_LPC, loongson_system_bus_quirk);
> >
> > +static void loongson_pci_pin_quirk(struct pci_dev *dev)
> > +{
> > +     dev->pin = 1 + (PCI_FUNC(dev->devfn) & 3);
> > +}
> > +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LOONGSON, DEV_PCIE_PORT_0, loongson_pci_pin_quirk);
> > +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LOONGSON, DEV_PCIE_PORT_1, loongson_pci_pin_quirk);
> > +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LOONGSON, DEV_PCIE_PORT_2, loongson_pci_pin_quirk);
> > +
> >  static void loongson_mrrs_quirk(struct pci_dev *dev)
> >  {
> >       struct pci_bus *bus = dev->bus;
> > --
> > 2.27.0
> >

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH V2 2/4] PCI: Move loongson pci quirks to quirks.c
  2021-05-28  7:15 ` [PATCH V2 2/4] PCI: Move loongson pci quirks to quirks.c Huacai Chen
@ 2021-06-05 21:15   ` Bjorn Helgaas
  2021-06-06  8:14     ` LoongArch (was: Re: [PATCH V2 2/4] PCI: Move loongson pci quirks to quirks.c) Jiaxun Yang
  0 siblings, 1 reply; 20+ messages in thread
From: Bjorn Helgaas @ 2021-06-05 21:15 UTC (permalink / raw)
  To: Huacai Chen
  Cc: Bjorn Helgaas, linux-pci, Xuefeng Li, Huacai Chen, Jiaxun Yang

On Fri, May 28, 2021 at 03:15:01PM +0800, Huacai Chen wrote:
> Loongson PCH (LS7A chipset) will be used by both MIPS-based and
> LoongArch-based Loongson processors. MIPS-based Loongson uses FDT
> but LoongArch-base Loongson uses ACPI, but the driver in drivers/
> pci/controller/pci-loongson.c is FDT-only. So move the quirks to
> quirks.c where can be shared by all architectures.
> 
> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
> ---
>  drivers/pci/controller/pci-loongson.c | 69 ---------------------------
>  drivers/pci/quirks.c                  | 69 +++++++++++++++++++++++++++
>  2 files changed, 69 insertions(+), 69 deletions(-)
> 
> diff --git a/drivers/pci/controller/pci-loongson.c b/drivers/pci/controller/pci-loongson.c
> index 48169b1e3817..88066e9db69e 100644
> --- a/drivers/pci/controller/pci-loongson.c
> +++ b/drivers/pci/controller/pci-loongson.c
> @@ -12,15 +12,6 @@
>  
>  #include "../pci.h"
>  
> -/* Device IDs */
> -#define DEV_PCIE_PORT_0	0x7a09
> -#define DEV_PCIE_PORT_1	0x7a19
> -#define DEV_PCIE_PORT_2	0x7a29
> -
> -#define DEV_LS2K_APB	0x7a02
> -#define DEV_LS7A_CONF	0x7a10
> -#define DEV_LS7A_LPC	0x7a0c
> -
>  #define FLAG_CFG0	BIT(0)
>  #define FLAG_CFG1	BIT(1)
>  #define FLAG_DEV_FIX	BIT(2)
> @@ -32,66 +23,6 @@ struct loongson_pci {
>  	u32 flags;
>  };
>  
> -/* Fixup wrong class code in PCIe bridges */
> -static void bridge_class_quirk(struct pci_dev *dev)
> -{
> -	dev->class = PCI_CLASS_BRIDGE_PCI << 8;
> -}
> -DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_LOONGSON,
> -			DEV_PCIE_PORT_0, bridge_class_quirk);
> -DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_LOONGSON,
> -			DEV_PCIE_PORT_1, bridge_class_quirk);
> -DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_LOONGSON,
> -			DEV_PCIE_PORT_2, bridge_class_quirk);
> -
> -static void system_bus_quirk(struct pci_dev *pdev)
> -{
> -	/*
> -	 * The address space consumed by these devices is outside the
> -	 * resources of the host bridge.
> -	 */
> -	pdev->mmio_always_on = 1;
> -	pdev->non_compliant_bars = 1;
> -}
> -DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_LOONGSON,
> -			DEV_LS2K_APB, system_bus_quirk);
> -DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_LOONGSON,
> -			DEV_LS7A_CONF, system_bus_quirk);
> -DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_LOONGSON,
> -			DEV_LS7A_LPC, system_bus_quirk);
> -
> -static void loongson_mrrs_quirk(struct pci_dev *dev)
> -{
> -	struct pci_bus *bus = dev->bus;
> -	struct pci_dev *bridge;
> -	static const struct pci_device_id bridge_devids[] = {
> -		{ PCI_VDEVICE(LOONGSON, DEV_PCIE_PORT_0) },
> -		{ PCI_VDEVICE(LOONGSON, DEV_PCIE_PORT_1) },
> -		{ PCI_VDEVICE(LOONGSON, DEV_PCIE_PORT_2) },
> -		{ 0, },
> -	};
> -
> -	/* look for the matching bridge */
> -	while (!pci_is_root_bus(bus)) {
> -		bridge = bus->self;
> -		bus = bus->parent;
> -		/*
> -		 * Some Loongson PCIe ports have a h/w limitation of
> -		 * 256 bytes maximum read request size. They can't handle
> -		 * anything larger than this. So force this limit on
> -		 * any devices attached under these ports.
> -		 */
> -		if (pci_match_id(bridge_devids, bridge)) {
> -			if (pcie_get_readrq(dev) > 256) {
> -				pci_info(dev, "limiting MRRS to 256\n");
> -				pcie_set_readrq(dev, 256);
> -			}
> -			break;
> -		}
> -	}
> -}
> -DECLARE_PCI_FIXUP_ENABLE(PCI_ANY_ID, PCI_ANY_ID, loongson_mrrs_quirk);
> -
>  static void __iomem *cfg1_map(struct loongson_pci *priv, int bus,
>  				unsigned int devfn, int where)
>  {
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index dcb229de1acb..66e4bea69431 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -205,6 +205,75 @@ static void quirk_mmio_always_on(struct pci_dev *dev)
>  DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_ANY_ID, PCI_ANY_ID,
>  				PCI_CLASS_BRIDGE_HOST, 8, quirk_mmio_always_on);
>  
> +/* Loongson-related quirks */
> +#define DEV_PCIE_PORT_0	0x7a09
> +#define DEV_PCIE_PORT_1	0x7a19
> +#define DEV_PCIE_PORT_2	0x7a29
> +
> +#define DEV_LS2K_APB	0x7a02
> +#define DEV_LS7A_CONF	0x7a10
> +#define DEV_LS7A_LPC	0x7a0c

If you're moving these from device-specific file to a generic file,
these #defines now need to have device-specific names.

But these appear to be for built-in hardware that can only be present
in Loongson (I assume mips?) systems.  If that's the case, maybe they
should go to a mips-specific file like arch/mips/pci/quirks.c?

But I see you see you mention LoongArch above, so I don't know if
that's part of arch/mips, or if there's an arch/loongson coming, or
what.

> +/* Fixup wrong class code in PCIe bridges */
> +static void loongson_bridge_class_quirk(struct pci_dev *dev)
> +{
> +	dev->class = PCI_CLASS_BRIDGE_PCI << 8;
> +}
> +DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_LOONGSON,
> +			DEV_PCIE_PORT_0, loongson_bridge_class_quirk);
> +DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_LOONGSON,
> +			DEV_PCIE_PORT_1, loongson_bridge_class_quirk);
> +DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_LOONGSON,
> +			DEV_PCIE_PORT_2, loongson_bridge_class_quirk);
> +
> +static void loongson_system_bus_quirk(struct pci_dev *pdev)
> +{
> +	/*
> +	 * The address space consumed by these devices is outside the
> +	 * resources of the host bridge.
> +	 */
> +	pdev->mmio_always_on = 1;
> +	pdev->non_compliant_bars = 1;
> +}
> +DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_LOONGSON,
> +			DEV_LS2K_APB, loongson_system_bus_quirk);
> +DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_LOONGSON,
> +			DEV_LS7A_CONF, loongson_system_bus_quirk);
> +DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_LOONGSON,
> +			DEV_LS7A_LPC, loongson_system_bus_quirk);
> +
> +static void loongson_mrrs_quirk(struct pci_dev *dev)
> +{
> +	struct pci_bus *bus = dev->bus;
> +	struct pci_dev *bridge;
> +	static const struct pci_device_id bridge_devids[] = {
> +		{ PCI_VDEVICE(LOONGSON, DEV_PCIE_PORT_0) },
> +		{ PCI_VDEVICE(LOONGSON, DEV_PCIE_PORT_1) },
> +		{ PCI_VDEVICE(LOONGSON, DEV_PCIE_PORT_2) },
> +		{ 0, },
> +	};
> +
> +	/* look for the matching bridge */
> +	while (!pci_is_root_bus(bus)) {
> +		bridge = bus->self;
> +		bus = bus->parent;
> +		/*
> +		 * Some Loongson PCIe ports have a h/w limitation of
> +		 * 256 bytes maximum read request size. They can't handle
> +		 * anything larger than this. So force this limit on
> +		 * any devices attached under these ports.
> +		 */
> +		if (pci_match_id(bridge_devids, bridge)) {
> +			if (pcie_get_readrq(dev) > 256) {
> +				pci_info(dev, "limiting MRRS to 256\n");
> +				pcie_set_readrq(dev, 256);
> +			}
> +			break;
> +		}
> +	}
> +}
> +DECLARE_PCI_FIXUP_ENABLE(PCI_ANY_ID, PCI_ANY_ID, loongson_mrrs_quirk);
> +
>  /*
>   * The Mellanox Tavor device gives false positive parity errors.  Disable
>   * parity error reporting.
> -- 
> 2.27.0
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH V2 4/4] PCI: Add quirk for multifunction devices of LS7A
  2021-06-04  9:59     ` Huacai Chen
@ 2021-06-05 21:28       ` Bjorn Helgaas
  2021-06-06  8:01         ` Jiaxun Yang
  0 siblings, 1 reply; 20+ messages in thread
From: Bjorn Helgaas @ 2021-06-05 21:28 UTC (permalink / raw)
  To: Huacai Chen
  Cc: Huacai Chen, Bjorn Helgaas, linux-pci, Xuefeng Li, Jiaxun Yang,
	Jianmin Lv, Rob Herring

On Fri, Jun 04, 2021 at 05:59:35PM +0800, Huacai Chen wrote:
> On Sat, May 29, 2021 at 4:51 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
> >
> > [+cc Rob, previous discussion at
> > https://lore.kernel.org/r/20210514080025.1828197-5-chenhuacai@loongson.cn]
> >
> > On Fri, May 28, 2021 at 03:15:03PM +0800, Huacai Chen wrote:
> > > From: Jianmin Lv <lvjianmin@loongson.cn>
> > >
> > > In LS7A, multifunction device use same PCI PIN (because the PIN register
> > > report the same INTx value to each function) but we need different IRQ
> > > for different functions, so add a quirk to fix it for standard PCI PIN
> > > usage.
> >
> > This seems to say that PCI_INTERRUPT_PIN reports the same value for
> > all functions, but the functions actually use *different* IRQs.
> >
> > That would be a hardware defect, and of course, we can work around
> > such things.  It's always better if you can assure us that the defect
> > will be fixed in future designs so we don't have to update the
> > workaround with more device IDs.
>
> Yes, you are right, and new hardware will not need this workaround.
> 
> > But Jiaxun suggests [1] that the FDT says all the interrupts go to the
> > same IRQ.
> >
> > So I don't know what's going on here.  We can certainly work around a
> > problem, but of course, this quirk would apply for both FDT and
> > ACPI-based systems, and the FDT systems seem to work without it.
>
> Emmm, I have discussed with Jiaxun, and maybe you missed something. He
> means that ACPI systems need this workaround, and FDT systems don't
> need this. But this workaround doesn't break FDT systems, because FDT
> systems simply ignore the workaround (interrupts are specified in .dts
> file).

I'm definitely missing something :)

Part of my confusion is because generally both ACPI and DT describe
fixed parts of the platform.  Usually neither describes PCI devices
because PCI includes ways to discover devices and discover the
resources (memory, IRQs, etc) they use.

So the general picture is that ACPI and DT describe the parts of
interrupt routing that can not be discovered from the hardware itself.
The PCI IRQ pin *can* be discovered from the hardware, so I expected
both ACPI and DT to rely on it.

But this quirk applies to [0014:7a09], [0014:7a19], and [0014:7a29],
which look like they are PCIe Root Ports, and your DT ([2]) *does*
seem to describe them with interrupt descriptions.  So I assume the
reason DT systems don't care about this quirk is because they use this
IRQ info from DT and ignore the PCI_INTERRUPT_PIN?

If DT systems ignore the quirk, as you said above, I assume that means
that DT overwrites dev->pin sometime *after* the quirk executes?  Or
there's some DT check that means we ignore dev->pin?  Can you point me
to whatever this mechanism is?

The quirk is not specific to ACPI or DT, so it's not clear to me how
it stays out of DT's way.

[2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/mips/boot/dts/loongson/ls7a-pch.dtsi?id=v5.12#n229

> > [1] https://lore.kernel.org/r/933330cb-9d86-2b22-9bed-64becefbe2d1@flygoat.com
> >
> > > Signed-off-by: Jianmin Lv <lvjianmin@loongson.cn>
> > > Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
> > > ---
> > >  drivers/pci/quirks.c | 17 +++++++++++++++++
> > >  1 file changed, 17 insertions(+)
> > >
> > > diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> > > index 10b3b2057940..6ab4b3bba36b 100644
> > > --- a/drivers/pci/quirks.c
> > > +++ b/drivers/pci/quirks.c
> > > @@ -242,6 +242,14 @@ DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_LOONGSON,
> > >  DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_LOONGSON,
> > >                       DEV_LS7A_LPC, loongson_system_bus_quirk);
> > >
> > > +static void loongson_pci_pin_quirk(struct pci_dev *dev)
> > > +{
> > > +     dev->pin = 1 + (PCI_FUNC(dev->devfn) & 3);
> > > +}
> > > +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LOONGSON, DEV_PCIE_PORT_0, loongson_pci_pin_quirk);
> > > +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LOONGSON, DEV_PCIE_PORT_1, loongson_pci_pin_quirk);
> > > +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LOONGSON, DEV_PCIE_PORT_2, loongson_pci_pin_quirk);
> > > +
> > >  static void loongson_mrrs_quirk(struct pci_dev *dev)
> > >  {
> > >       struct pci_bus *bus = dev->bus;
> > > --
> > > 2.27.0
> > >

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH V2 3/4] PCI: Improve the MRRS quirk for LS7A
  2021-06-04  9:43     ` Huacai Chen
@ 2021-06-05 21:34       ` Bjorn Helgaas
  2021-06-12  4:16         ` Huacai Chen
  0 siblings, 1 reply; 20+ messages in thread
From: Bjorn Helgaas @ 2021-06-05 21:34 UTC (permalink / raw)
  To: Huacai Chen
  Cc: huangshuai, Huacai Chen, Bjorn Helgaas, linux-pci, Xuefeng Li,
	Jiaxun Yang

On Fri, Jun 04, 2021 at 05:43:36PM +0800, Huacai Chen wrote:
> Hi, Bjorn,
> 
> On Sat, May 29, 2021 at 4:32 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
> >
> > On Fri, May 28, 2021 at 03:15:02PM +0800, Huacai Chen wrote:
> > > In new revision of LS7A, some PCIe ports support larger value than 256,
> > > but their maximum supported MRRS values are not detectable. Moreover,
> > > the current loongson_mrrs_quirk() cannot avoid devices increasing its
> > > MRRS after pci_enable_device(). So the only possible way is configure
> > > MRRS of all devices in BIOS, and add a PCI device flag (PCI_DEV_FLAGS_
> > > NO_INCREASE_MRRS) to stop the increasing MRRS operations.
> >
> > It's still not clear what the problem is.
> >
> > As far as I can tell from the PCIe spec, it is legal for an OS to
> > program any value for MRRS, and it is legal for an endpoint to
> > generate a Read Request with any size up to its MRRS.  If you
> > disagree, please cite the relevant section in the spec.
> >
> > There is no requirement for the OS to limit the MRRS based on a
> > restriction elsewhere in the system.  There is no mechanism for the OS
> > to even discover such a restriction.
> >
> > Of course, there is also no requirement that the PCIe Completions
> > carrying the read data be the same size as the MRRS.  If the non-PCIe
> > part of the system has a restriction on read burst size, that part of
> > the system can certainly break up the read and respond with several
> > PCIe Completions.
> >
> > If LS7A can't break up read requests, that sounds like a problem in
> > the LS7A design.  We should have a description of this erratum.  And
> > we should have some statement about this being fixed in future
> > designs, because we don't want to have to update the fixup with the
> > PCI vendor/device IDs every time new versions come out.
>
> Thanks for your information, but I think only Mr. Shuai Huang can
> explain the root cause, too.
> 
> > I also don't want to rely on some value left in MRRS by BIOS.  If
> > certain bridges have specific limits on what MRRS can be, the fixup
> > should have those limits in it.
>
> As I know, each port of LS7A has a different maximum MRRS value (Yes,
> as you said, this is unreasonable in PCIe spec. but it is a fact in
> LS7A), and also different between hardware revisions. So, the kernel
> cannot configure it, and relying on BIOS is the only way.

Maybe we should just set MRRS to the minimum (128 bytes) for
everything on this platform.

The generic MPS/MRRS config is messy enough already, and I'm hesitant
to add much complication for what seems to be a fairly broken PCIe
controller.

> > loongson_mrrs_quirk() doesn't look efficient.  We should not have to
> > run the fixup for *every* PCI device in the system.  Also, we should
> > not mark every *device* below an LS7A, because it's not the devices
> > that are defective.
> >
> > If it's the root port or the host bridge that's defective, we should
> > mark *that*, e.g., something along the lines of how quirk_no_ext_tags()
> > works.
> OK, I'll improve my code.
> 
> Huacai
> >
> > > Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
> > > ---
> > >  drivers/pci/pci.c    | 5 +++++
> > >  drivers/pci/quirks.c | 6 ++++++
> > >  include/linux/pci.h  | 2 ++
> > >  3 files changed, 13 insertions(+)
> > >
> > > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> > > index b717680377a9..6f0d2f5b6f30 100644
> > > --- a/drivers/pci/pci.c
> > > +++ b/drivers/pci/pci.c
> > > @@ -5802,6 +5802,11 @@ int pcie_set_readrq(struct pci_dev *dev, int rq)
> > >
> > >       v = (ffs(rq) - 8) << 12;
> > >
> > > +     if (dev->dev_flags & PCI_DEV_FLAGS_NO_INCREASE_MRRS) {
> > > +             if (rq > pcie_get_readrq(dev))
> > > +                     return -EINVAL;
> > > +     }
> > > +
> > >       ret = pcie_capability_clear_and_set_word(dev, PCI_EXP_DEVCTL,
> > >                                                 PCI_EXP_DEVCTL_READRQ, v);
> > >
> > > diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> > > index 66e4bea69431..10b3b2057940 100644
> > > --- a/drivers/pci/quirks.c
> > > +++ b/drivers/pci/quirks.c
> > > @@ -264,6 +264,12 @@ static void loongson_mrrs_quirk(struct pci_dev *dev)
> > >                * any devices attached under these ports.
> > >                */
> > >               if (pci_match_id(bridge_devids, bridge)) {
> > > +                     dev->dev_flags |= PCI_DEV_FLAGS_NO_INCREASE_MRRS;
> > > +
> > > +                     if (pcie_bus_config == PCIE_BUS_DEFAULT ||
> > > +                         pcie_bus_config == PCIE_BUS_TUNE_OFF)
> > > +                             break;
> > > +
> > >                       if (pcie_get_readrq(dev) > 256) {
> > >                               pci_info(dev, "limiting MRRS to 256\n");
> > >                               pcie_set_readrq(dev, 256);
> > > diff --git a/include/linux/pci.h b/include/linux/pci.h
> > > index c20211e59a57..7fb2072a83b8 100644
> > > --- a/include/linux/pci.h
> > > +++ b/include/linux/pci.h
> > > @@ -227,6 +227,8 @@ enum pci_dev_flags {
> > >       PCI_DEV_FLAGS_NO_FLR_RESET = (__force pci_dev_flags_t) (1 << 10),
> > >       /* Don't use Relaxed Ordering for TLPs directed at this device */
> > >       PCI_DEV_FLAGS_NO_RELAXED_ORDERING = (__force pci_dev_flags_t) (1 << 11),
> > > +     /* Don't increase BIOS's MRRS configuration */
> > > +     PCI_DEV_FLAGS_NO_INCREASE_MRRS = (__force pci_dev_flags_t) (1 << 12),
> > >  };
> > >
> > >  enum pci_irq_reroute_variant {
> > > --
> > > 2.27.0
> > >

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH V2 4/4] PCI: Add quirk for multifunction devices of LS7A
  2021-06-05 21:28       ` Bjorn Helgaas
@ 2021-06-06  8:01         ` Jiaxun Yang
  0 siblings, 0 replies; 20+ messages in thread
From: Jiaxun Yang @ 2021-06-06  8:01 UTC (permalink / raw)
  To: Bjorn Helgaas, Huacai Chen
  Cc: Huacai Chen, Bjorn Helgaas, linux-pci, Xuefeng Li, Jianmin Lv,
	Rob Herring



在2021年6月6日六月 上午5:28,Bjorn Helgaas写道:
> both ACPI and DT to rely on it.
> 
> But this quirk applies to [0014:7a09], [0014:7a19], and [0014:7a29],
> which look like they are PCIe Root Ports, and your DT ([2]) *does*
> seem to describe them with interrupt descriptions.  So I assume the
> reason DT systems don't care about this quirk is because they use this
> IRQ info from DT and ignore the PCI_INTERRUPT_PIN?
> 
> If DT systems ignore the quirk, as you said above, I assume that means
> that DT overwrites dev->pin sometime *after* the quirk executes?  Or
> there's some DT check that means we ignore dev->pin?  Can you point me
> to whatever this mechanism is?

dev->pin won't affect hardware behavior. The only place that kernel uses
dev->pin is mapping them to the actual IRQ number. For DT based system, this
is handled by of_irq_parse_pci.
For the present system, all four INTA,B,C,D are routed to the same upstream IRQ,
thus wrong dev->pin will still get correct mapping.

I'm unable to get my hand on an ACPI system but I guess Loongson uses different
mapping scheme on these systems so dev->pin will matter.

Thanks.
-- 
- Jiaxun

^ permalink raw reply	[flat|nested] 20+ messages in thread

* LoongArch (was: Re: [PATCH V2 2/4] PCI: Move loongson pci quirks to quirks.c)
  2021-06-05 21:15   ` Bjorn Helgaas
@ 2021-06-06  8:14     ` Jiaxun Yang
  2021-06-07  0:51       ` Huacai Chen
  0 siblings, 1 reply; 20+ messages in thread
From: Jiaxun Yang @ 2021-06-06  8:14 UTC (permalink / raw)
  To: Bjorn Helgaas, Huacai Chen, arnd
  Cc: Bjorn Helgaas, linux-pci, Xuefeng Li, Huacai Chen, linux-arch



在2021年6月6日六月 上午5:15,Bjorn Helgaas写道:

+linux-arch and Arnd for arch related discussions

> 
> If you're moving these from device-specific file to a generic file,
> these #defines now need to have device-specific names.
> 
> But these appear to be for built-in hardware that can only be present
> in Loongson (I assume mips?) systems.  If that's the case, maybe they
> should go to a mips-specific file like arch/mips/pci/quirks.c?
> 
> But I see you see you mention LoongArch above, so I don't know if
> that's part of arch/mips, or if there's an arch/loongson coming, or
> what.

As far as I read LoongArch should be a brand new RISC architecture.
I saw Loongson release some documents[1] and code[2] regarding this
new architecture.

Huacai, as you are submitting these code, does it mean Loongson intends
to mainline LoongArch kernel?
If so, I'm certain that there is a lot of drivers and other code can be
reused between MIPS part and LoongArch part for Loongson chips.
Could you please make an announcement about your plans?

Thanks.

[1]: https://github.com/loongson/LoongArch-Documentation
[2]: https://github.com/loongarch64
-- 
- Jiaxun

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: LoongArch (was: Re: [PATCH V2 2/4] PCI: Move loongson pci quirks to quirks.c)
  2021-06-06  8:14     ` LoongArch (was: Re: [PATCH V2 2/4] PCI: Move loongson pci quirks to quirks.c) Jiaxun Yang
@ 2021-06-07  0:51       ` Huacai Chen
  0 siblings, 0 replies; 20+ messages in thread
From: Huacai Chen @ 2021-06-07  0:51 UTC (permalink / raw)
  To: Jiaxun Yang
  Cc: Bjorn Helgaas, Huacai Chen, arnd, Bjorn Helgaas, linux-pci,
	Xuefeng Li, linux-arch

Hi, all,

On Sun, Jun 6, 2021 at 4:14 PM Jiaxun Yang <jiaxun.yang@flygoat.com> wrote:
>
>
>
> 在2021年6月6日六月 上午5:15,Bjorn Helgaas写道:
>
> +linux-arch and Arnd for arch related discussions
>
> >
> > If you're moving these from device-specific file to a generic file,
> > these #defines now need to have device-specific names.
> >
> > But these appear to be for built-in hardware that can only be present
> > in Loongson (I assume mips?) systems.  If that's the case, maybe they
> > should go to a mips-specific file like arch/mips/pci/quirks.c?
> >
> > But I see you see you mention LoongArch above, so I don't know if
> > that's part of arch/mips, or if there's an arch/loongson coming, or
> > what.
>
> As far as I read LoongArch should be a brand new RISC architecture.
> I saw Loongson release some documents[1] and code[2] regarding this
> new architecture.
>
> Huacai, as you are submitting these code, does it mean Loongson intends
> to mainline LoongArch kernel?
> If so, I'm certain that there is a lot of drivers and other code can be
> reused between MIPS part and LoongArch part for Loongson chips.
> Could you please make an announcement about your plans?
>
> Thanks.
>
> [1]: https://github.com/loongson/LoongArch-Documentation
> [2]: https://github.com/loongarch64
Yes, LoongArch is a new RISC ISA, it is not compatible with MIPS.
Loongson processors before 3A4000 is compatible with MIPS, and 3A5000
is based on LoongArch. Part of LoongArch documentations are available
in English (as listed above by Jiaxun), others are being translated.
There will be a new directoy arch/loongarch, and I hope it can be
coming soon.

Huacai
> --
> - Jiaxun

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH V2 1/4] PCI/portdrv: Don't disable device during shutdown
  2021-06-04  9:24     ` Huacai Chen
@ 2021-06-07 20:43       ` Sinan Kaya
  2021-06-12  4:31         ` Huacai Chen
  0 siblings, 1 reply; 20+ messages in thread
From: Sinan Kaya @ 2021-06-07 20:43 UTC (permalink / raw)
  To: Huacai Chen, Bjorn Helgaas, huangshuai
  Cc: Huacai Chen, Bjorn Helgaas, linux-pci, Xuefeng Li, Jiaxun Yang,
	Tiezhu Yang

On 6/4/2021 12:24 PM, Huacai Chen wrote:
>> So you need to explain why we need to allow DMA from those devices
>> even after we shutdown the port.  "It makes reboot work" is not a
>> sufficient explanation.
> I think only the designer of LS7A can tell us why. So, Mr. Shuai
> Huang, could you please explain this?
> 

Could there be some kind of a shutdown/init problem on your graphics
card driver?

During shutdown, remove() callback of all endpoints get called. This is
the opportunity for your graphics driver to put hardware into safe
state.

If there is a problem with the hardware/driver, it should be a quirk as
opposed to changing the default safe behavior for all devices.

The port driver here prevents memory from getting corrupted by rogue
hardware. There is a window during kexec where hardware can write to
system memory addresses if IOVA addresses and system memory addresses
overlap.


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH V2 3/4] PCI: Improve the MRRS quirk for LS7A
  2021-06-05 21:34       ` Bjorn Helgaas
@ 2021-06-12  4:16         ` Huacai Chen
  0 siblings, 0 replies; 20+ messages in thread
From: Huacai Chen @ 2021-06-12  4:16 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: huangshuai, Huacai Chen, Bjorn Helgaas, linux-pci, Xuefeng Li,
	Jiaxun Yang

Hi, Bjorn,

On Sun, Jun 6, 2021 at 5:34 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
>
> On Fri, Jun 04, 2021 at 05:43:36PM +0800, Huacai Chen wrote:
> > Hi, Bjorn,
> >
> > On Sat, May 29, 2021 at 4:32 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
> > >
> > > On Fri, May 28, 2021 at 03:15:02PM +0800, Huacai Chen wrote:
> > > > In new revision of LS7A, some PCIe ports support larger value than 256,
> > > > but their maximum supported MRRS values are not detectable. Moreover,
> > > > the current loongson_mrrs_quirk() cannot avoid devices increasing its
> > > > MRRS after pci_enable_device(). So the only possible way is configure
> > > > MRRS of all devices in BIOS, and add a PCI device flag (PCI_DEV_FLAGS_
> > > > NO_INCREASE_MRRS) to stop the increasing MRRS operations.
> > >
> > > It's still not clear what the problem is.
> > >
> > > As far as I can tell from the PCIe spec, it is legal for an OS to
> > > program any value for MRRS, and it is legal for an endpoint to
> > > generate a Read Request with any size up to its MRRS.  If you
> > > disagree, please cite the relevant section in the spec.
> > >
> > > There is no requirement for the OS to limit the MRRS based on a
> > > restriction elsewhere in the system.  There is no mechanism for the OS
> > > to even discover such a restriction.
> > >
> > > Of course, there is also no requirement that the PCIe Completions
> > > carrying the read data be the same size as the MRRS.  If the non-PCIe
> > > part of the system has a restriction on read burst size, that part of
> > > the system can certainly break up the read and respond with several
> > > PCIe Completions.
> > >
> > > If LS7A can't break up read requests, that sounds like a problem in
> > > the LS7A design.  We should have a description of this erratum.  And
> > > we should have some statement about this being fixed in future
> > > designs, because we don't want to have to update the fixup with the
> > > PCI vendor/device IDs every time new versions come out.
> >
> > Thanks for your information, but I think only Mr. Shuai Huang can
> > explain the root cause, too.
> >
> > > I also don't want to rely on some value left in MRRS by BIOS.  If
> > > certain bridges have specific limits on what MRRS can be, the fixup
> > > should have those limits in it.
> >
> > As I know, each port of LS7A has a different maximum MRRS value (Yes,
> > as you said, this is unreasonable in PCIe spec. but it is a fact in
> > LS7A), and also different between hardware revisions. So, the kernel
> > cannot configure it, and relying on BIOS is the only way.
>
> Maybe we should just set MRRS to the minimum (128 bytes) for
> everything on this platform.
>
> The generic MPS/MRRS config is messy enough already, and I'm hesitant
> to add much complication for what seems to be a fairly broken PCIe
> controller.
I have had an offline discussion with Mr. Shuai Huang, and he told me
that the root cause is LS7A doesn't break up large read requests (Yes,
that is a problem in the LS7A design). But I think set MRRS to 128
bytes in the quirk is not enough, because as I said before, some
devices (e.g. Realtek 8169) set a big value in its driver. We cannot
block such operations if we don't touch the PCIe core.

Huacai
>
> > > loongson_mrrs_quirk() doesn't look efficient.  We should not have to
> > > run the fixup for *every* PCI device in the system.  Also, we should
> > > not mark every *device* below an LS7A, because it's not the devices
> > > that are defective.
> > >
> > > If it's the root port or the host bridge that's defective, we should
> > > mark *that*, e.g., something along the lines of how quirk_no_ext_tags()
> > > works.
> > OK, I'll improve my code.
> >
> > Huacai
> > >
> > > > Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
> > > > ---
> > > >  drivers/pci/pci.c    | 5 +++++
> > > >  drivers/pci/quirks.c | 6 ++++++
> > > >  include/linux/pci.h  | 2 ++
> > > >  3 files changed, 13 insertions(+)
> > > >
> > > > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> > > > index b717680377a9..6f0d2f5b6f30 100644
> > > > --- a/drivers/pci/pci.c
> > > > +++ b/drivers/pci/pci.c
> > > > @@ -5802,6 +5802,11 @@ int pcie_set_readrq(struct pci_dev *dev, int rq)
> > > >
> > > >       v = (ffs(rq) - 8) << 12;
> > > >
> > > > +     if (dev->dev_flags & PCI_DEV_FLAGS_NO_INCREASE_MRRS) {
> > > > +             if (rq > pcie_get_readrq(dev))
> > > > +                     return -EINVAL;
> > > > +     }
> > > > +
> > > >       ret = pcie_capability_clear_and_set_word(dev, PCI_EXP_DEVCTL,
> > > >                                                 PCI_EXP_DEVCTL_READRQ, v);
> > > >
> > > > diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> > > > index 66e4bea69431..10b3b2057940 100644
> > > > --- a/drivers/pci/quirks.c
> > > > +++ b/drivers/pci/quirks.c
> > > > @@ -264,6 +264,12 @@ static void loongson_mrrs_quirk(struct pci_dev *dev)
> > > >                * any devices attached under these ports.
> > > >                */
> > > >               if (pci_match_id(bridge_devids, bridge)) {
> > > > +                     dev->dev_flags |= PCI_DEV_FLAGS_NO_INCREASE_MRRS;
> > > > +
> > > > +                     if (pcie_bus_config == PCIE_BUS_DEFAULT ||
> > > > +                         pcie_bus_config == PCIE_BUS_TUNE_OFF)
> > > > +                             break;
> > > > +
> > > >                       if (pcie_get_readrq(dev) > 256) {
> > > >                               pci_info(dev, "limiting MRRS to 256\n");
> > > >                               pcie_set_readrq(dev, 256);
> > > > diff --git a/include/linux/pci.h b/include/linux/pci.h
> > > > index c20211e59a57..7fb2072a83b8 100644
> > > > --- a/include/linux/pci.h
> > > > +++ b/include/linux/pci.h
> > > > @@ -227,6 +227,8 @@ enum pci_dev_flags {
> > > >       PCI_DEV_FLAGS_NO_FLR_RESET = (__force pci_dev_flags_t) (1 << 10),
> > > >       /* Don't use Relaxed Ordering for TLPs directed at this device */
> > > >       PCI_DEV_FLAGS_NO_RELAXED_ORDERING = (__force pci_dev_flags_t) (1 << 11),
> > > > +     /* Don't increase BIOS's MRRS configuration */
> > > > +     PCI_DEV_FLAGS_NO_INCREASE_MRRS = (__force pci_dev_flags_t) (1 << 12),
> > > >  };
> > > >
> > > >  enum pci_irq_reroute_variant {
> > > > --
> > > > 2.27.0
> > > >

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH V2 1/4] PCI/portdrv: Don't disable device during shutdown
  2021-06-07 20:43       ` Sinan Kaya
@ 2021-06-12  4:31         ` Huacai Chen
  0 siblings, 0 replies; 20+ messages in thread
From: Huacai Chen @ 2021-06-12  4:31 UTC (permalink / raw)
  To: Sinan Kaya
  Cc: Bjorn Helgaas, huangshuai, Huacai Chen, Bjorn Helgaas, linux-pci,
	Xuefeng Li, Jiaxun Yang, Tiezhu Yang

Hi, Bjorn and Sinan,

On Tue, Jun 8, 2021 at 4:43 AM Sinan Kaya <okaya@kernel.org> wrote:
>
> On 6/4/2021 12:24 PM, Huacai Chen wrote:
> >> So you need to explain why we need to allow DMA from those devices
> >> even after we shutdown the port.  "It makes reboot work" is not a
> >> sufficient explanation.
> > I think only the designer of LS7A can tell us why. So, Mr. Shuai
> > Huang, could you please explain this?
> >
>
> Could there be some kind of a shutdown/init problem on your graphics
> card driver?
>
> During shutdown, remove() callback of all endpoints get called. This is
> the opportunity for your graphics driver to put hardware into safe
> state.
>
> If there is a problem with the hardware/driver, it should be a quirk as
> opposed to changing the default safe behavior for all devices.
I have had an offline discussion with Mr. Shuai Huang, he told me that
CPU is still writing data to framebuffer while poweroff/reboot, and if
we clear Bus Master Bit at this time, CPU will wait ack from device,
but never return, so deadlock.

More or less, we can modify the GPU driver to avoid this, as I said in
the commit message:

"The poweroff/reboot failures could easily be reproduced on Loongson
platforms. I think this is not a Loongson-specific problem, instead, is
a problem related to some specific PCI hosts. On some x86 platforms,
radeon/amdgpu devices can cause the same problem [1][2], and commit
faefba95c9e8ca3a ("drm/amdgpu: just suspend the hw on pci shutdown")
can resolve it.

Radeon driver is more difficult than amdgpu due to its confusing symbol
names, and I have maintained an out-of-tree patch for a long time [4]."

[1] https://bugs.freedesktop.org/show_bug.cgi?id=97980
[2] https://bugs.freedesktop.org/show_bug.cgi?id=98638
[4] https://github.com/chenhuacai/linux/commit/8da06f9b669831829416a3e9f4d1c57f217a42f0

Modifing every GPU driver is impossible for me, and I found some RAID
controllers have problems, too.

>
> The port driver here prevents memory from getting corrupted by rogue
> hardware. There is a window during kexec where hardware can write to
> system memory addresses if IOVA addresses and system memory addresses
> overlap.
>
KEXEC has no problems, as discussed before:
http://patchwork.ozlabs.org/project/linux-pci/patch/1600680138-10949-1-git-send-email-chenhc@lemote.com/

static void pci_device_shutdown(struct device *dev)
{
  ...
         if (drv && drv->shutdown)
                 drv->shutdown(pci_dev);

         /*
          * If this is a kexec reboot, turn off Bus Master bit on the
          * device to tell it to not continue to do DMA. Don't touch
          * devices in D3cold or unknown states.
          * If it is not a kexec reboot, firmware will hit the PCI
          * devices with big hammer and stop their DMA any way.
          */
         if (kexec_in_progress && (pci_dev->current_state <= PCI_D3hot))
                 pci_clear_master(pci_dev);
}

Huacai

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2021-06-12  4:31 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-05-28  7:14 [PATCH V2 0/4] PCI: Loongson-related pci quirks Huacai Chen
2021-05-28  7:15 ` [PATCH V2 1/4] PCI/portdrv: Don't disable device during shutdown Huacai Chen
2021-05-28 21:43   ` Bjorn Helgaas
2021-06-04  9:24     ` Huacai Chen
2021-06-07 20:43       ` Sinan Kaya
2021-06-12  4:31         ` Huacai Chen
2021-05-28  7:15 ` [PATCH V2 2/4] PCI: Move loongson pci quirks to quirks.c Huacai Chen
2021-06-05 21:15   ` Bjorn Helgaas
2021-06-06  8:14     ` LoongArch (was: Re: [PATCH V2 2/4] PCI: Move loongson pci quirks to quirks.c) Jiaxun Yang
2021-06-07  0:51       ` Huacai Chen
2021-05-28  7:15 ` [PATCH V2 3/4] PCI: Improve the MRRS quirk for LS7A Huacai Chen
2021-05-28 20:32   ` Bjorn Helgaas
2021-06-04  9:43     ` Huacai Chen
2021-06-05 21:34       ` Bjorn Helgaas
2021-06-12  4:16         ` Huacai Chen
2021-05-28  7:15 ` [PATCH V2 4/4] PCI: Add quirk for multifunction devices of LS7A Huacai Chen
2021-05-28 20:51   ` Bjorn Helgaas
2021-06-04  9:59     ` Huacai Chen
2021-06-05 21:28       ` Bjorn Helgaas
2021-06-06  8:01         ` Jiaxun Yang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).