linux-pci.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH AUTOSEL 5.0 054/262] PCI: mediatek: Fix memory mapped IO range size computation
       [not found] <20190327180158.10245-1-sashal@kernel.org>
@ 2019-03-27 17:58 ` Sasha Levin
  2019-03-27 17:58 ` [PATCH AUTOSEL 5.0 062/262] PCI/PME: Fix hotplug/sysfs remove deadlock in pcie_pme_remove() Sasha Levin
  2019-03-27 18:01 ` [PATCH AUTOSEL 5.0 237/262] PCI: pciehp: Assign ctrl->slot_ctrl before writing it to hardware Sasha Levin
  2 siblings, 0 replies; 3+ messages in thread
From: Sasha Levin @ 2019-03-27 17:58 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Honghui Zhang, Lorenzo Pieralisi, Sasha Levin, linux-pci, linux-mediatek

From: Honghui Zhang <honghui.zhang@mediatek.com>

[ Upstream commit c61df57343bf05743f8abbb31eec9a6f05820dd1 ]

Mediatek's HW assigns a MMIO address range (typically starts from
0x20000000 to 0x2fffffff for both mt2712 and mt7622) for PCI usage.

This MMIO address space represents the address space that can
be allocated to PCI devices through Base Address Registers.

Even though the full MMIO address range is available to be allocated, it
should be enabled by the PCIE_AHB_TRANS_BASE register in the host
controller and the size that is enabled is determined by AHB2PCIE_SIZE
bits in this register.

Owing to a bug in the MMIO window size computation, current code does
not enable the full size of the available MMIO address range in the
PCI host controller; if the PCI devices BARs requested size exceeds the
size enabled through the PCIE_AHB_TRANS_BASE register the requests
targeting the disabled address address space will be blocked by the root
complex causing a system error.

Existing code has never run into a system error in production because
even half of the enabled MMIO range (128MB) is big enough for typical
devices BAR requests (4MB) but the full MMIO address range should
be enabled regardless.

Fix the MMIO window size computation by using resource_size(mem) instead
of mem->end - mem->start.

Since the MMIO window size for both MT2712 and MT7622 is 0x10000000,
this change will update the parameter passed to fls() from 0xfffffff to
0x10000000 and calculate the whole memory mapped IO range size
correctly.

Detected through coccinelle semantic patch (and related warning):

scripts/coccinelle/api/resource_size.cocci:

pcie-mediatek.c:720:13-16: WARNING: Suspicious code. resource_size is maybe missing with mem

Signed-off-by: Honghui Zhang <honghui.zhang@mediatek.com>
[lorenzo.pieralisi@arm.com: rewrote the commit log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/pci/controller/pcie-mediatek.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/pci/controller/pcie-mediatek.c b/drivers/pci/controller/pcie-mediatek.c
index 55e471c18e8d..c42fe5c4319f 100644
--- a/drivers/pci/controller/pcie-mediatek.c
+++ b/drivers/pci/controller/pcie-mediatek.c
@@ -654,7 +654,6 @@ static int mtk_pcie_startup_port_v2(struct mtk_pcie_port *port)
 	struct resource *mem = &pcie->mem;
 	const struct mtk_pcie_soc *soc = port->pcie->soc;
 	u32 val;
-	size_t size;
 	int err;
 
 	/* MT7622 platforms need to enable LTSSM and ASPM from PCIe subsys */
@@ -706,8 +705,8 @@ static int mtk_pcie_startup_port_v2(struct mtk_pcie_port *port)
 		mtk_pcie_enable_msi(port);
 
 	/* Set AHB to PCIe translation windows */
-	size = mem->end - mem->start;
-	val = lower_32_bits(mem->start) | AHB2PCIE_SIZE(fls(size));
+	val = lower_32_bits(mem->start) |
+	      AHB2PCIE_SIZE(fls(resource_size(mem)));
 	writel(val, port->base + PCIE_AHB_TRANS_BASE0_L);
 
 	val = upper_32_bits(mem->start);
-- 
2.19.1


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* [PATCH AUTOSEL 5.0 062/262] PCI/PME: Fix hotplug/sysfs remove deadlock in pcie_pme_remove()
       [not found] <20190327180158.10245-1-sashal@kernel.org>
  2019-03-27 17:58 ` [PATCH AUTOSEL 5.0 054/262] PCI: mediatek: Fix memory mapped IO range size computation Sasha Levin
@ 2019-03-27 17:58 ` Sasha Levin
  2019-03-27 18:01 ` [PATCH AUTOSEL 5.0 237/262] PCI: pciehp: Assign ctrl->slot_ctrl before writing it to hardware Sasha Levin
  2 siblings, 0 replies; 3+ messages in thread
From: Sasha Levin @ 2019-03-27 17:58 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Rafael J. Wysocki, Bjorn Helgaas, Sasha Levin, linux-pci

From: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>

[ Upstream commit 95c80bc6952b6a5badc7b702d23e5bf14d251e7c ]

Dongdong reported a deadlock triggered by a hotplug event during a sysfs
"remove" operation:

  pciehp 0000:00:0c.0:pcie004: Slot(0-1): Link Up
  # echo 1 > 0000:00:0c.0/remove

  PME and hotplug share an MSI/MSI-X vector.  The sysfs "remove" side is:

    remove_store
       pci_stop_and_remove_bus_device_locked
	 pci_lock_rescan_remove
	 pci_stop_and_remove_bus_device
	   ...
	   pcie_pme_remove
	     pcie_pme_suspend
	       synchronize_irq        # wait for hotplug IRQ handler
	 pci_unlock_rescan_remove

  The hotplug side is:

    pciehp_ist
       pciehp_handle_presence_or_link_change
	 pciehp_configure_device
	   pci_lock_rescan_remove     # wait for pci_unlock_rescan_remove()

  INFO: task bash:10913 blocked for more than 120 seconds.

  # ps -ax |grep D
   PID TTY      STAT   TIME COMMAND
  10913 ttyAMA0  Ds+    0:00 -bash
  14022 ?        D      0:00 [irq/745-pciehp]

  # cat /proc/14022/stack
  __switch_to+0x94/0xd8
  pci_lock_rescan_remove+0x20/0x28
  pciehp_configure_device+0x30/0x140
  pciehp_handle_presence_or_link_change+0x324/0x458
  pciehp_ist+0x1dc/0x1e0

  # cat /proc/10913/stack
  __switch_to+0x94/0xd8
  synchronize_irq+0x8c/0xc0
  pcie_pme_suspend+0xa4/0x118
  pcie_pme_remove+0x20/0x40
  pcie_port_remove_service+0x3c/0x58
  ...
  pcie_port_device_remove+0x2c/0x48
  pcie_portdrv_remove+0x68/0x78
  pci_device_remove+0x48/0x120
  ...
  pci_stop_bus_device+0x84/0xc0
  pci_stop_and_remove_bus_device_locked+0x24/0x40
  remove_store+0xa4/0xb8
  dev_attr_store+0x44/0x60
  sysfs_kf_write+0x58/0x80

It is incorrect to call pcie_pme_suspend() from pcie_pme_remove() for two
reasons.

First, pcie_pme_suspend() calls synchronize_irq(), which will wait for the
native hotplug interrupt handler as well as for the PME one, because they
share one IRQ (as per the spec).  That may deadlock if hotplug is signaled
while pcie_pme_remove() is running and the latter calls
pci_lock_rescan_remove() before the former.

Second, if pcie_pme_suspend() figures out that wakeup needs to be enabled
for the port, it will return without disabling the interrupt as expected by
pcie_pme_remove() which was overlooked by commit c7b5a4e6e8fb ("PCI / PM:
Fix native PME handling during system suspend/resume").

To fix that, rework pcie_pme_remove() to disable the PME interrupt, clear
its status and prevent the PME worker function from re-enabling it before
calling free_irq() on it, which should be sufficient.

Fixes: c7b5a4e6e8fb ("PCI / PM: Fix native PME handling during system suspend/resume")
Link: https://lore.kernel.org/linux-pci/c7697e7c-e1af-13e4-8491-0a3996e6ab5d@huawei.com
Reported-by: Dongdong Liu <liudongdong3@huawei.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
[bhelgaas: add URL and deadlock details from Dongdong]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/pci/pcie/pme.c | 22 +++++++++++++++-------
 1 file changed, 15 insertions(+), 7 deletions(-)

diff --git a/drivers/pci/pcie/pme.c b/drivers/pci/pcie/pme.c
index 1a8b85051b1b..efa5b552914b 100644
--- a/drivers/pci/pcie/pme.c
+++ b/drivers/pci/pcie/pme.c
@@ -363,6 +363,16 @@ static bool pcie_pme_check_wakeup(struct pci_bus *bus)
 	return false;
 }
 
+static void pcie_pme_disable_interrupt(struct pci_dev *port,
+				       struct pcie_pme_service_data *data)
+{
+	spin_lock_irq(&data->lock);
+	pcie_pme_interrupt_enable(port, false);
+	pcie_clear_root_pme_status(port);
+	data->noirq = true;
+	spin_unlock_irq(&data->lock);
+}
+
 /**
  * pcie_pme_suspend - Suspend PCIe PME service device.
  * @srv: PCIe service device to suspend.
@@ -387,11 +397,7 @@ static int pcie_pme_suspend(struct pcie_device *srv)
 			return 0;
 	}
 
-	spin_lock_irq(&data->lock);
-	pcie_pme_interrupt_enable(port, false);
-	pcie_clear_root_pme_status(port);
-	data->noirq = true;
-	spin_unlock_irq(&data->lock);
+	pcie_pme_disable_interrupt(port, data);
 
 	synchronize_irq(srv->irq);
 
@@ -427,9 +433,11 @@ static int pcie_pme_resume(struct pcie_device *srv)
  */
 static void pcie_pme_remove(struct pcie_device *srv)
 {
-	pcie_pme_suspend(srv);
+	struct pcie_pme_service_data *data = get_service_data(srv);
+
+	pcie_pme_disable_interrupt(srv->port, data);
 	free_irq(srv->irq, srv);
-	kfree(get_service_data(srv));
+	kfree(data);
 }
 
 static struct pcie_port_service_driver pcie_pme_driver = {
-- 
2.19.1


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* [PATCH AUTOSEL 5.0 237/262] PCI: pciehp: Assign ctrl->slot_ctrl before writing it to hardware
       [not found] <20190327180158.10245-1-sashal@kernel.org>
  2019-03-27 17:58 ` [PATCH AUTOSEL 5.0 054/262] PCI: mediatek: Fix memory mapped IO range size computation Sasha Levin
  2019-03-27 17:58 ` [PATCH AUTOSEL 5.0 062/262] PCI/PME: Fix hotplug/sysfs remove deadlock in pcie_pme_remove() Sasha Levin
@ 2019-03-27 18:01 ` Sasha Levin
  2 siblings, 0 replies; 3+ messages in thread
From: Sasha Levin @ 2019-03-27 18:01 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Mika Westerberg, Bjorn Helgaas, Sasha Levin, linux-pci

From: Mika Westerberg <mika.westerberg@linux.intel.com>

[ Upstream commit 25bd879ec16ad3b83a5b1c3f16faa55e696bfccb ]

Shameerali reported that running v4.20-rc1 as QEMU guest, the PCIe hotplug
port times out during boot:

  pciehp 0000:00:01.0:pcie004: Timeout on hotplug command 0x03f1 (issued 1016 msec ago)
  pciehp 0000:00:01.0:pcie004: Timeout on hotplug command 0x03f1 (issued 1024 msec ago)
  pciehp 0000:00:01.0:pcie004: Failed to check link status
  pciehp 0000:00:01.0:pcie004: Timeout on hotplug command 0x02f1 (issued 2520 msec ago)

The issue was bisected down to commit 720d6a671a6e ("PCI: pciehp: Do not
handle events if interrupts are masked") and was further analyzed by the
reporter to be caused by the fact that pciehp first updates the hardware
and only then cache the ctrl->slot_ctrl in pcie_do_write_cmd().  If the
interrupt happens before we cache the value, pciehp_isr() reads value 0 and
decides that the interrupt was not meant for it causing the above timeout
to trigger.

Fix by moving ctrl->slot_ctrl assignment to happen before it is written to
the hardware.

Fixes: 720d6a671a6e ("PCI: pciehp: Do not handle events if interrupts are masked")
Link: https://lore.kernel.org/linux-pci/5FC3163CFD30C246ABAA99954A238FA8387DD344@FRAEML521-MBX.china.huawei.com
Reported-by: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/pci/hotplug/pciehp_hpc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/pci/hotplug/pciehp_hpc.c b/drivers/pci/hotplug/pciehp_hpc.c
index c0fb64ace05a..8bfcb8cd0900 100644
--- a/drivers/pci/hotplug/pciehp_hpc.c
+++ b/drivers/pci/hotplug/pciehp_hpc.c
@@ -156,9 +156,9 @@ static void pcie_do_write_cmd(struct controller *ctrl, u16 cmd,
 	slot_ctrl |= (cmd & mask);
 	ctrl->cmd_busy = 1;
 	smp_mb();
+	ctrl->slot_ctrl = slot_ctrl;
 	pcie_capability_write_word(pdev, PCI_EXP_SLTCTL, slot_ctrl);
 	ctrl->cmd_started = jiffies;
-	ctrl->slot_ctrl = slot_ctrl;
 
 	/*
 	 * Controllers with the Intel CF118 and similar errata advertise
-- 
2.19.1


^ permalink raw reply related	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2019-03-27 19:29 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <20190327180158.10245-1-sashal@kernel.org>
2019-03-27 17:58 ` [PATCH AUTOSEL 5.0 054/262] PCI: mediatek: Fix memory mapped IO range size computation Sasha Levin
2019-03-27 17:58 ` [PATCH AUTOSEL 5.0 062/262] PCI/PME: Fix hotplug/sysfs remove deadlock in pcie_pme_remove() Sasha Levin
2019-03-27 18:01 ` [PATCH AUTOSEL 5.0 237/262] PCI: pciehp: Assign ctrl->slot_ctrl before writing it to hardware Sasha Levin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).