Linux-PCI Archive on lore.kernel.org
* [PATCH v5 00/23] PCI: Allow BAR movement during hotplug
@ 2019-08-16 16:50 Sergey Miroshnichenko
  2019-08-16 16:50 ` [PATCH v5 01/23] PCI: Fix race condition in pci_enable/disable_device() Sergey Miroshnichenko
                   ` (22 more replies)
  0 siblings, 23 replies; 30+ messages in thread
From: Sergey Miroshnichenko @ 2019-08-16 16:50 UTC (permalink / raw)
  To: linux-pci, linuxppc-dev; +Cc: Bjorn Helgaas, linux, Sergey Miroshnichenko

If the firmware or kernel has arranged memory for PCIe devices in a way
that doesn't provide enough space for BARs of a new hotplugged device, the
kernel can pause the drivers of the "obstructing" devices and move their
BARs, so the new BARs can fit into the freed spaces.

To rearrange the BARs and bridge windows, these patches release all of
them after a rescan and re-assign them in the same way as during the
initial PCIe topology scan at system boot.

When a driver is un-paused by the kernel after the PCIe rescan, it should
check whether its BARs have moved and, if so, ioremap() them again.

Drivers indicate their support of the feature by implementing the new hooks
.rescan_prepare() and .rescan_done() in the struct pci_driver. If a driver
doesn't yet support the feature, BARs of its devices will be considered as
immovable (by checking the pci_dev_movable_bars_supported(dev)) and handled
in the same way as resources with the IORESOURCE_PCI_FIXED flag.

If a driver doesn't yet support the feature, its devices are guaranteed to
have their BARs remaining untouched.
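
As a rough illustration of the rule above, here is a userspace sketch of
how a device could be classified as movable or immovable. The types
model_dev and model_driver, the is_vga field and model_noop_hook() are
hypothetical stand-ins and not kernel definitions; the sketch only assumes
the decision order described in this series (driver hook first, then the
VGA exception, then "no driver bound").

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_dev;

/* Hypothetical stand-in for the two new hooks in struct pci_driver. */
struct model_driver {
	void (*rescan_prepare)(struct model_dev *dev);
	void (*rescan_done)(struct model_dev *dev);
};

/* Hypothetical stand-in for struct pci_dev. */
struct model_dev {
	const struct model_driver *driver;	/* NULL if no driver is bound */
	bool is_vga;				/* models the VGA class check */
};

/* Placeholder hook: a real one would pause or resume the device. */
void model_noop_hook(struct model_dev *dev)
{
	assert(dev != NULL);
}

/* Returns true if the BARs of this device may be moved during a rescan. */
bool model_movable_bars_supported(const struct model_dev *dev)
{
	if (!dev)
		return false;

	/* A driver that implements the hook has opted in. */
	if (dev->driver && dev->driver->rescan_prepare)
		return true;

	/* VGA devices are treated as immovable. */
	if (dev->is_vga)
		return false;

	/* Otherwise only devices without a bound driver may move. */
	return dev->driver == NULL;
}
```

A device bound to a driver without the hooks is therefore reported as
immovable, which matches the guarantee given to unconverted drivers.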

Tested on:
 - x86_64 with "pci=realloc,assign-busses,use_crs,pcie_bus_peer2peer";
 - POWER8 PowerNV+OPAL+PHB3 ppc64le with [1] applied and the following:
   "pci=realloc,pcie_bus_peer2peer";
 - both platforms [with extra patches (yet to be submitted) for movable bus
   numbers]: manually initiated (via sysfs) rescan has found and turned on
   a hotplugged bridge.

Only a few platforms and test cases have been covered so far, so everyone
interested is welcome to test on their own setups - the more exotic the
better!

This patchset is a part of our work on adding support for hotplugging
bridges full of other bridges, NVME drives, SAS HBAs and GPUs without
special requirements such as Hot-Plug Controller, reservation of bus
numbers or memory regions by firmware, etc. The next patchset to submit
will implement the movable bus numbers.

[1] https://lists.ozlabs.org/pipermail/linuxppc-dev/2019-August/195272.html
    [PATCH v6 0/5] powerpc/powernv/pci: Make hotplug self-sufficient, independent of FW and DT

Changes since v4:
 - Feature is enabled by default (turned on by one of the latest patches);
 - Add pci_dev_movable_bars_supported(dev) instead of marking the immovable
   BARs with the IORESOURCE_PCI_FIXED flag;
 - Set up PCIe bridges during rescan via sysfs, so MPS settings are now
   configured not only during system boot or pcihp events;
 - Allow movement of switch's BARs if claimed by portdrv;
 - Update EEH address caches after rescan for powerpc;
 - Don't disable completely hot-added devices which can't have BARs being
   fit - just disable their BARs, so they are still visible in lspci etc;
 - Clearer names: fixed_range_hard -> immovable_range, fixed_range_soft ->
   realloc_range;
 - Drop the patch for pci_restore_config_space() - fixed by properly using
   the runtime PM.

Changes since v3:
 - Rebased to the upstream, so the patches apply cleanly again.

Changes since v2:
 - Fixed double-assignment of bridge windows;
 - Fixed assignment of fixed prefetched resources;
 - Fixed releasing of fixed resources;
 - Fixed a debug message;
 - Removed auto-enabling the movable BARs for x86 - let's rely on the
   "pcie_movable_bars=force" option for now;
 - Reordered the patches - bugfixes first.

Changes since v1:
 - Add a "pcie_movable_bars={ off | force }" command line argument;
 - Handle the IORESOURCE_PCI_FIXED flag properly;
 - Don't move BARs of devices which don't support the feature;
 - Guarantee that new hotplugged devices will not steal memory from working
   devices by ignoring the failing new devices with the new PCI_DEV_IGNORE
   flag;
 - Add rescan_prepare()+rescan_done() to the struct pci_driver instead of
   using the reset_prepare()+reset_done() from struct pci_error_handlers;
 - Add a bugfix of a race condition;
 - Fixed hotplug in a non-pre-enabled (by BIOS/firmware) bridge;
 - Fix the compatibility of the feature with pm_runtime and D3-state;
 - Hotplug events from pciehp also can move BARs;
 - Add support of the feature to the NVME driver.

Sergey Miroshnichenko (23):
  PCI: Fix race condition in pci_enable/disable_device()
  PCI: Enable bridge's I/O and MEM access for hotplugged devices
  PCI: hotplug: Add a flag for the movable BARs feature
  PCI: Define PCI-specific version of the release_child_resources()
  PCI: hotplug: movable BARs: Fix reassigning the released bridge
    windows
  PCI: hotplug: movable BARs: Recalculate all bridge windows during
    rescan
  PCI: hotplug: movable BARs: Don't allow added devices to steal
    resources
  PCI: Include fixed and immovable BARs into the bus size calculating
  PCI: Prohibit assigning BARs and bridge windows to non-direct parents
  PCI: hotplug: movable BARs: Try to assign unassigned resources only
    once
  PCI: hotplug: movable BARs: Calculate immovable parts of bridge
    windows
  PCI: hotplug: movable BARs: Compute limits for relocated bridge
    windows
  PCI: Make sure bridge windows include their fixed BARs
  PCI: Fix assigning the fixed prefetchable resources
  PCI: hotplug: movable BARs: Assign fixed and immovable BARs before
    others
  PCI: hotplug: movable BARs: Don't reserve IO/mem bus space
  powerpc/pci: Fix crash with enabled movable BARs
  powerpc/pci: Handle BAR movement
  PCI: hotplug: Configure MPS for hot-added bridges during bus rescan
  PCI: hotplug: movable BARs: Enable the feature by default
  nvme-pci: Handle movable BARs
  PCI/portdrv: Declare support of movable BARs
  PCI: pciehp: movable BARs: Trigger a domain rescan on hp events

 .../admin-guide/kernel-parameters.txt         |   7 +
 arch/powerpc/kernel/pci-hotplug.c             |  10 +
 arch/powerpc/platforms/powernv/pci-ioda.c     |   3 +-
 drivers/nvme/host/pci.c                       |  21 +-
 drivers/pci/bus.c                             |   2 +-
 drivers/pci/hotplug/pciehp_pci.c              |   5 +
 drivers/pci/pci.c                             |  58 +++-
 drivers/pci/pci.h                             |  30 ++
 drivers/pci/pcie/portdrv_pci.c                |  11 +
 drivers/pci/probe.c                           | 295 +++++++++++++++++-
 drivers/pci/setup-bus.c                       | 276 +++++++++++++---
 drivers/pci/setup-res.c                       |  48 ++-
 include/linux/pci.h                           |  21 ++
 13 files changed, 739 insertions(+), 48 deletions(-)

-- 
2.21.0



* [PATCH v5 01/23] PCI: Fix race condition in pci_enable/disable_device()
  2019-08-16 16:50 [PATCH v5 00/23] PCI: Allow BAR movement during hotplug Sergey Miroshnichenko
@ 2019-08-16 16:50 ` Sergey Miroshnichenko
  2019-08-22 12:37   ` Marta Rybczynska
  2019-08-16 16:50 ` [PATCH v5 02/23] PCI: Enable bridge's I/O and MEM access for hotplugged devices Sergey Miroshnichenko
                   ` (21 subsequent siblings)
  22 siblings, 1 reply; 30+ messages in thread
From: Sergey Miroshnichenko @ 2019-08-16 16:50 UTC (permalink / raw)
  To: linux-pci, linuxppc-dev
  Cc: Bjorn Helgaas, linux, Sergey Miroshnichenko, Srinath Mannam,
	Marta Rybczynska

This is yet another approach to fix an old [1-2] concurrency issue, when:
 - two or more devices are being hot-added into a bridge which was
   initially empty;
 - a bridge with two or more devices is being hot-added;
 - during boot, if BIOS/bootloader/firmware doesn't pre-enable bridges.

The problem is that a bridge is reported as enabled before the MEM/IO bits
are actually written to the PCI_COMMAND register, so another driver thread
starts memory requests through the not-yet-enabled bridge:

 CPU0                                        CPU1

 pci_enable_device_mem()                     pci_enable_device_mem()
   pci_enable_bridge()                         pci_enable_bridge()
     pci_is_enabled()
       return false;
     atomic_inc_return(enable_cnt)
     Start actual enabling the bridge
     ...                                         pci_is_enabled()
     ...                                           return true;
     ...                                     Start memory requests <-- FAIL
     ...
     Set the PCI_COMMAND_MEMORY bit <-- Must wait for this

Protect pci_enable/disable_device() and pci_enable_bridge() in a way
similar to the previous solution from commit 40f11adc7cd9 ("PCI: Avoid race
while enabling upstream bridges"), but add per-device mutexes and prevent
dev->enable_cnt from being incremented early.
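
The ordering problem can be sketched in a small single-threaded userspace
model. This is only an illustration: struct model_bridge and all model_*()
helpers are hypothetical, the mutex is deliberately left out, and the two
"steps" stand for the points at which another CPU could observe the bridge.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the two pieces of bridge state involved. */
struct model_bridge {
	int enable_cnt;		/* models dev->enable_cnt */
	bool mem_bit_set;	/* models PCI_COMMAND_MEMORY being written */
};

/* What a second thread effectively checks via pci_is_enabled(). */
bool model_is_enabled(const struct model_bridge *b)
{
	assert(b != NULL);
	return b->enable_cnt > 0;
}

/* Old ordering: the counter is bumped first, the register written later. */
void model_old_step1(struct model_bridge *b) { b->enable_cnt++; }
void model_old_step2(struct model_bridge *b) { b->mem_bit_set = true; }

/* New ordering: the register is written first, the counter bumped last. */
void model_new_step1(struct model_bridge *b) { b->mem_bit_set = true; }
void model_new_step2(struct model_bridge *b) { b->enable_cnt++; }

/*
 * True if an observer at this point would treat the bridge as enabled
 * although memory decoding is still off (the "FAIL" case in the diagram).
 */
bool model_window_is_unsafe(const struct model_bridge *b)
{
	return model_is_enabled(b) && !b->mem_bit_set;
}
```

With the old ordering the state between the two steps is unsafe; with the
new ordering it never is, because the counter only becomes non-zero after
the register write.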

CC: Srinath Mannam <srinath.mannam@broadcom.com>
CC: Marta Rybczynska <mrybczyn@kalray.eu>
Signed-off-by: Sergey Miroshnichenko <s.miroshnichenko@yadro.com>

[1] https://lore.kernel.org/linux-pci/1501858648-22228-1-git-send-email-srinath.mannam@broadcom.com/T/#u
    [RFC PATCH v3] pci: Concurrency issue during pci enable bridge

[2] https://lore.kernel.org/linux-pci/744877924.5841545.1521630049567.JavaMail.zimbra@kalray.eu/T/#u
    [RFC PATCH] nvme: avoid race-conditions when enabling devices
---
 drivers/pci/pci.c   | 26 ++++++++++++++++++++++----
 drivers/pci/probe.c |  1 +
 include/linux/pci.h |  1 +
 3 files changed, 24 insertions(+), 4 deletions(-)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 1b27b5af3d55..e7f8c354e644 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -1645,6 +1645,8 @@ static void pci_enable_bridge(struct pci_dev *dev)
 	struct pci_dev *bridge;
 	int retval;
 
+	mutex_lock(&dev->enable_mutex);
+
 	bridge = pci_upstream_bridge(dev);
 	if (bridge)
 		pci_enable_bridge(bridge);
@@ -1652,6 +1654,7 @@ static void pci_enable_bridge(struct pci_dev *dev)
 	if (pci_is_enabled(dev)) {
 		if (!dev->is_busmaster)
 			pci_set_master(dev);
+		mutex_unlock(&dev->enable_mutex);
 		return;
 	}
 
@@ -1660,11 +1663,14 @@ static void pci_enable_bridge(struct pci_dev *dev)
 		pci_err(dev, "Error enabling bridge (%d), continuing\n",
 			retval);
 	pci_set_master(dev);
+	mutex_unlock(&dev->enable_mutex);
 }
 
 static int pci_enable_device_flags(struct pci_dev *dev, unsigned long flags)
 {
 	struct pci_dev *bridge;
+	/* Enable-locking of bridges is performed within the pci_enable_bridge() */
+	bool need_lock = !dev->subordinate;
 	int err;
 	int i, bars = 0;
 
@@ -1680,8 +1686,13 @@ static int pci_enable_device_flags(struct pci_dev *dev, unsigned long flags)
 		dev->current_state = (pmcsr & PCI_PM_CTRL_STATE_MASK);
 	}
 
-	if (atomic_inc_return(&dev->enable_cnt) > 1)
+	if (need_lock)
+		mutex_lock(&dev->enable_mutex);
+	if (pci_is_enabled(dev)) {
+		if (need_lock)
+			mutex_unlock(&dev->enable_mutex);
 		return 0;		/* already enabled */
+	}
 
 	bridge = pci_upstream_bridge(dev);
 	if (bridge)
@@ -1696,8 +1707,10 @@ static int pci_enable_device_flags(struct pci_dev *dev, unsigned long flags)
 			bars |= (1 << i);
 
 	err = do_pci_enable_device(dev, bars);
-	if (err < 0)
-		atomic_dec(&dev->enable_cnt);
+	if (err >= 0)
+		atomic_inc(&dev->enable_cnt);
+	if (need_lock)
+		mutex_unlock(&dev->enable_mutex);
 	return err;
 }
 
@@ -1941,15 +1954,20 @@ void pci_disable_device(struct pci_dev *dev)
 	if (dr)
 		dr->enabled = 0;
 
+	mutex_lock(&dev->enable_mutex);
 	dev_WARN_ONCE(&dev->dev, atomic_read(&dev->enable_cnt) <= 0,
 		      "disabling already-disabled device");
 
-	if (atomic_dec_return(&dev->enable_cnt) != 0)
+	if (atomic_dec_return(&dev->enable_cnt) != 0) {
+		mutex_unlock(&dev->enable_mutex);
 		return;
+	}
 
 	do_pci_disable_device(dev);
 
 	dev->is_busmaster = 0;
+
+	mutex_unlock(&dev->enable_mutex);
 }
 EXPORT_SYMBOL(pci_disable_device);
 
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index a3c7338fad86..2e58ece820e8 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -2427,6 +2427,7 @@ struct pci_dev *pci_alloc_dev(struct pci_bus *bus)
 	INIT_LIST_HEAD(&dev->bus_list);
 	dev->dev.type = &pci_dev_type;
 	dev->bus = pci_bus_get(bus);
+	mutex_init(&dev->enable_mutex);
 
 	return dev;
 }
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 9e700d9f9f28..d3a72159722d 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -425,6 +425,7 @@ struct pci_dev {
 	unsigned int	no_vf_scan:1;		/* Don't scan for VFs after IOV enablement */
 	pci_dev_flags_t dev_flags;
 	atomic_t	enable_cnt;	/* pci_enable_device has been called */
+	struct mutex	enable_mutex;
 
 	u32		saved_config_space[16]; /* Config space saved at suspend time */
 	struct hlist_head saved_cap_space;
-- 
2.21.0



* [PATCH v5 02/23] PCI: Enable bridge's I/O and MEM access for hotplugged devices
  2019-08-16 16:50 [PATCH v5 00/23] PCI: Allow BAR movement during hotplug Sergey Miroshnichenko
  2019-08-16 16:50 ` [PATCH v5 01/23] PCI: Fix race condition in pci_enable/disable_device() Sergey Miroshnichenko
@ 2019-08-16 16:50 ` Sergey Miroshnichenko
  2019-08-16 16:50 ` [PATCH v5 03/23] PCI: hotplug: Add a flag for the movable BARs feature Sergey Miroshnichenko
                   ` (20 subsequent siblings)
  22 siblings, 0 replies; 30+ messages in thread
From: Sergey Miroshnichenko @ 2019-08-16 16:50 UTC (permalink / raw)
  To: linux-pci, linuxppc-dev; +Cc: Bjorn Helgaas, linux, Sergey Miroshnichenko

The PCI_COMMAND_IO and PCI_COMMAND_MEMORY bits of the bridge must be
updated not only when enabling the bridge for the first time, but also if a
hotplugged device requests these types of resources.

Originally these bits were set by pci_enable_device_flags() only, which
exits early if the bridge is already pci_is_enabled(). So if the bridge was
initially empty (an edge case), I/O and MEM accesses to hotplugged devices
behind it fail.
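
A minimal userspace sketch of the mask computation this implies: walk the
bridge window slots and set one bit per slot that carries an I/O or MEM
resource. The MODEL_RES_* flag values and the function name are
hypothetical, chosen only for the example; they are not the kernel's
IORESOURCE_* values.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag values standing in for IORESOURCE_IO / IORESOURCE_MEM. */
#define MODEL_RES_IO	0x1u
#define MODEL_RES_MEM	0x2u

/*
 * Build a bitmask with bit i set for every resource slot i in
 * [first, count) that has an I/O or MEM resource.
 */
unsigned int model_bridge_window_mask(const unsigned int *flags,
				      size_t first, size_t count)
{
	unsigned int bars = 0;
	size_t i;

	assert(count <= 32);	/* one bit per slot must fit in the mask */

	for (i = first; i < count; i++) {
		if (flags[i] & (MODEL_RES_IO | MODEL_RES_MEM))
			bars |= 1u << i;
	}

	return bars;
}
```

Slots below `first` are skipped on purpose, mirroring the idea that only
the bridge windows, not the bridge's own BARs, are considered here.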

Signed-off-by: Sergey Miroshnichenko <s.miroshnichenko@yadro.com>
---
 drivers/pci/pci.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index e7f8c354e644..61d951766087 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -1652,6 +1652,14 @@ static void pci_enable_bridge(struct pci_dev *dev)
 		pci_enable_bridge(bridge);
 
 	if (pci_is_enabled(dev)) {
+		int i, bars = 0;
+
+		for (i = PCI_BRIDGE_RESOURCES; i < DEVICE_COUNT_RESOURCE; i++) {
+			if (dev->resource[i].flags & (IORESOURCE_MEM | IORESOURCE_IO))
+				bars |= (1 << i);
+		}
+		do_pci_enable_device(dev, bars);
+
 		if (!dev->is_busmaster)
 			pci_set_master(dev);
 		mutex_unlock(&dev->enable_mutex);
-- 
2.21.0



* [PATCH v5 03/23] PCI: hotplug: Add a flag for the movable BARs feature
  2019-08-16 16:50 [PATCH v5 00/23] PCI: Allow BAR movement during hotplug Sergey Miroshnichenko
  2019-08-16 16:50 ` [PATCH v5 01/23] PCI: Fix race condition in pci_enable/disable_device() Sergey Miroshnichenko
  2019-08-16 16:50 ` [PATCH v5 02/23] PCI: Enable bridge's I/O and MEM access for hotplugged devices Sergey Miroshnichenko
@ 2019-08-16 16:50 ` Sergey Miroshnichenko
  2019-08-16 16:50 ` [PATCH v5 04/23] PCI: Define PCI-specific version of the release_child_resources() Sergey Miroshnichenko
                   ` (19 subsequent siblings)
  22 siblings, 0 replies; 30+ messages in thread
From: Sergey Miroshnichenko @ 2019-08-16 16:50 UTC (permalink / raw)
  To: linux-pci, linuxppc-dev
  Cc: Bjorn Helgaas, linux, Sergey Miroshnichenko, Sam Bobroff,
	Rajat Jain, Lukas Wunner, Oliver O'Halloran, David Laight

When hot-adding a device, the bridge may have windows that are not big
enough (or are too fragmented) for the newly requested BARs to fit in.
Expanding these bridge windows may be impossible because they are blocked
by "neighboring" BARs and bridge windows.

Still, it may be possible to allocate a memory region for new BARs with the
following procedure:

1) notify all the drivers which support movable BARs to pause and release
   the BARs; the rest of the drivers are guaranteed that their devices will
   not get BARs moved;

2) release all the bridge windows except those of root bridges;

3) try to recalculate new bridge windows that will fit all the BAR types:
   - fixed;
   - immovable;
   - movable;
   - newly requested by hot-added devices;

4) if the previous step fails, disable BARs for one of the hot-added
   devices and retry from step 3;

5) notify the drivers, so they remap BARs and resume.
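
Steps 3 and 4 form a retry loop, which can be sketched in userspace as
follows. This is a simplified model, not the allocator from this series:
it ignores alignment and fragmentation, treats the root window as a single
capacity value, and arbitrarily picks the last still-enabled hot-added
device as the one to disable. All names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one hot-added device and its total BAR demand. */
struct model_hot_dev {
	unsigned long bar_size;
	bool bars_disabled;
};

/*
 * Try to fit the existing BARs plus all hot-added ones into `capacity`.
 * On failure, disable the BARs of one hot-added device and retry.
 * Returns how many hot-added devices ended up with disabled BARs.
 */
size_t model_fit_hot_added(unsigned long capacity, unsigned long existing,
			   struct model_hot_dev *devs, size_t n)
{
	size_t disabled = 0;

	assert(devs != NULL || n == 0);

	for (;;) {
		unsigned long need = existing;
		size_t victim = n;	/* n means "nobody left to disable" */
		size_t i;

		for (i = 0; i < n; i++) {
			if (devs[i].bars_disabled)
				continue;
			need += devs[i].bar_size;
			victim = i;	/* last still-enabled device */
		}

		/* Step 3 succeeded, or there is nothing left to give up. */
		if (need <= capacity || victim == n)
			return disabled;

		/* Step 4: drop one hot-added device and retry. */
		devs[victim].bars_disabled = true;
		disabled++;
	}
}
```

Already working devices are never touched by the loop, only the demand of
the newly added ones is reduced.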

This makes prior reservation of memory by the BIOS/bootloader/firmware
unnecessary for PCI hotplug.

Drivers indicate their support of movable BARs by implementing the new
.rescan_prepare() and .rescan_done() hooks in the struct pci_driver. All
device activity must be paused during a rescan, and iounmap()+ioremap()
must be applied to every used BAR.
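
What a driver is expected to do in the two hooks can be sketched with a
userspace model. Nothing here is real driver code: struct model_nic and
its fields are hypothetical, and the assignments marked in the comments
merely stand in for iounmap() and ioremap().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-device driver state. */
struct model_nic {
	uint64_t bar0_start;	/* BAR address currently assigned to the device */
	uint64_t mapped_start;	/* address the current mapping was created for */
	bool mapped;		/* true while a mapping exists */
	bool paused;		/* true while device activity is stopped */
};

/* Models .rescan_prepare(): stop all activity, then drop the mapping. */
void model_rescan_prepare(struct model_nic *nic)
{
	nic->paused = true;
	nic->mapped = false;	/* stands in for iounmap() */
}

/* Models .rescan_done(): map the (possibly moved) BAR again, then resume. */
void model_rescan_done(struct model_nic *nic)
{
	assert(nic->paused);	/* only valid after model_rescan_prepare() */

	nic->mapped_start = nic->bar0_start;	/* stands in for ioremap() */
	nic->mapped = true;
	nic->paused = false;
}
```

The important property is the ordering: the device is paused before its
mapping goes away, and it resumes only after the new mapping exists.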

The platform may also need to prepare for BAR movement, so new hooks are
added: pcibios_rescan_prepare(pci_dev) and pcibios_rescan_done(pci_dev).

This patch is a preparation for future patches with actual implementation,
and for now it just does the following:
 - declares the feature;
 - defines pci_movable_bars_enabled(), pci_dev_movable_bars_supported(dev);
 - invokes the .rescan_prepare() and .rescan_done() driver notifiers;
 - declares and invokes the pcibios_rescan_prepare()/_done() hooks;
 - adds the PCI_IMMOVABLE_BARS flag.

The feature is disabled by default (via PCI_IMMOVABLE_BARS) until the final
patch of the series. It can be overridden per-arch using this flag or by
the following command line option:

    pcie_movable_bars={ off | force }
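
The precedence between the command line option and the per-arch flag can
be sketched in userspace as follows. struct model_cfg and the model_*()
functions are hypothetical; the sketch assumes only the order described
here: "off" wins over "force", and both win over the flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical container for the three inputs to the decision. */
struct model_cfg {
	bool off;		/* pcie_movable_bars=off was given */
	bool force;		/* pcie_movable_bars=force was given */
	bool immovable_flag;	/* models the PCI_IMMOVABLE_BARS flag */
};

/* Models parsing the value of the pcie_movable_bars= option. */
void model_parse_option(struct model_cfg *cfg, const char *value)
{
	assert(value != NULL);

	if (!strcmp(value, "off"))
		cfg->off = true;
	else if (!strcmp(value, "force"))
		cfg->force = true;
	/* Unknown values are silently ignored in this model. */
}

/* Returns true if BAR movement should be attempted during a rescan. */
bool model_movable_bars_enabled(const struct model_cfg *cfg)
{
	if (cfg->off)
		return false;

	if (cfg->force)
		return true;

	return !cfg->immovable_flag;
}
```

With the flag set and no option given, the feature stays off, which is the
state this patch leaves the kernel in.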

CC: Sam Bobroff <sbobroff@linux.ibm.com>
CC: Rajat Jain <rajatja@google.com>
CC: Lukas Wunner <lukas@wunner.de>
CC: Oliver O'Halloran <oohall@gmail.com>
CC: David Laight <David.Laight@ACULAB.COM>
Signed-off-by: Sergey Miroshnichenko <s.miroshnichenko@yadro.com>
---
 .../admin-guide/kernel-parameters.txt         |  7 ++
 drivers/pci/pci-driver.c                      |  2 +
 drivers/pci/pci.c                             | 24 ++++++
 drivers/pci/pci.h                             |  2 +
 drivers/pci/probe.c                           | 86 ++++++++++++++++++-
 include/linux/pci.h                           |  7 ++
 6 files changed, 126 insertions(+), 2 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 47d981a86e2f..e2274ee87a35 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -3526,6 +3526,13 @@
 		nomsi	Do not use MSI for native PCIe PME signaling (this makes
 			all PCIe root ports use INTx for all services).
 
+	pcie_movable_bars=[PCIE]
+			Override the movable BARs support detection:
+		off
+			Disable even if supported by the platform
+		force
+			Enable even if not explicitly declared as supported
+
 	pcmv=		[HW,PCMCIA] BadgePAD 4
 
 	pd_ignore_unused
diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index a8124e47bf6e..d11909e79263 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -1688,6 +1688,8 @@ static int __init pci_driver_init(void)
 {
 	int ret;
 
+	pci_add_flags(PCI_IMMOVABLE_BARS);
+
 	ret = bus_register(&pci_bus_type);
 	if (ret)
 		return ret;
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 61d951766087..3a504f58ac60 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -139,6 +139,30 @@ static int __init pcie_port_pm_setup(char *str)
 }
 __setup("pcie_port_pm=", pcie_port_pm_setup);
 
+static bool pcie_movable_bars_off;
+static bool pcie_movable_bars_force;
+static int __init pcie_movable_bars_setup(char *str)
+{
+	if (!strcmp(str, "off"))
+		pcie_movable_bars_off = true;
+	else if (!strcmp(str, "force"))
+		pcie_movable_bars_force = true;
+	return 1;
+}
+__setup("pcie_movable_bars=", pcie_movable_bars_setup);
+
+bool pci_movable_bars_enabled(void)
+{
+	if (pcie_movable_bars_off)
+		return false;
+
+	if (pcie_movable_bars_force)
+		return true;
+
+	return !pci_has_flag(PCI_IMMOVABLE_BARS);
+}
+EXPORT_SYMBOL(pci_movable_bars_enabled);
+
 /* Time to wait after a reset for device to become responsive */
 #define PCIE_RESET_READY_POLL_MS 60000
 
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index d22d1b807701..be7acc477c64 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -257,6 +257,8 @@ bool pci_bus_clip_resource(struct pci_dev *dev, int idx);
 void pci_reassigndev_resource_alignment(struct pci_dev *dev);
 void pci_disable_bridge_window(struct pci_dev *dev);
 
+bool pci_dev_movable_bars_supported(struct pci_dev *dev);
+
 /* PCIe link information */
 #define PCIE_SPEED2STR(speed) \
 	((speed) == PCIE_SPEED_16_0GT ? "16 GT/s" : \
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 2e58ece820e8..60e3b48d2251 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -3406,6 +3406,74 @@ unsigned int pci_rescan_bus_bridge_resize(struct pci_dev *bridge)
 	return max;
 }
 
+bool pci_dev_movable_bars_supported(struct pci_dev *dev)
+{
+	if (!dev)
+		return false;
+
+	if (dev->driver && dev->driver->rescan_prepare)
+		return true;
+
+	if ((dev->class >> 8) == PCI_CLASS_DISPLAY_VGA)
+		return false;
+
+	return !dev->driver;
+}
+
+void __weak pcibios_rescan_prepare(struct pci_dev *dev)
+{
+}
+
+void __weak pcibios_rescan_done(struct pci_dev *dev)
+{
+}
+
+static void pci_bus_rescan_prepare(struct pci_bus *bus)
+{
+	struct pci_dev *dev;
+
+	if (bus->self)
+		pci_config_pm_runtime_get(bus->self);
+
+	list_for_each_entry(dev, &bus->devices, bus_list) {
+		struct pci_bus *child = dev->subordinate;
+
+		if (child)
+			pci_bus_rescan_prepare(child);
+
+		if (dev->driver &&
+		    dev->driver->rescan_prepare) {
+			dev->driver->rescan_prepare(dev);
+			pcibios_rescan_prepare(dev);
+		} else if (pci_dev_movable_bars_supported(dev)) {
+			pcibios_rescan_prepare(dev);
+		}
+	}
+}
+
+static void pci_bus_rescan_done(struct pci_bus *bus)
+{
+	struct pci_dev *dev;
+
+	list_for_each_entry(dev, &bus->devices, bus_list) {
+		struct pci_bus *child = dev->subordinate;
+
+		if (dev->driver &&
+		    dev->driver->rescan_done) {
+			pcibios_rescan_done(dev);
+			dev->driver->rescan_done(dev);
+		} else if (pci_dev_movable_bars_supported(dev)) {
+			pcibios_rescan_done(dev);
+		}
+
+		if (child)
+			pci_bus_rescan_done(child);
+	}
+
+	if (bus->self)
+		pci_config_pm_runtime_put(bus->self);
+}
+
 /**
  * pci_rescan_bus - Scan a PCI bus for devices
  * @bus: PCI bus to scan
@@ -3418,9 +3486,23 @@ unsigned int pci_rescan_bus_bridge_resize(struct pci_dev *bridge)
 unsigned int pci_rescan_bus(struct pci_bus *bus)
 {
 	unsigned int max;
+	struct pci_bus *root = bus;
+
+	while (!pci_is_root_bus(root))
+		root = root->parent;
+
+	if (pci_movable_bars_enabled()) {
+		pci_bus_rescan_prepare(root);
+
+		max = pci_scan_child_bus(root);
+		pci_assign_unassigned_root_bus_resources(root);
+
+		pci_bus_rescan_done(root);
+	} else {
+		max = pci_scan_child_bus(bus);
+		pci_assign_unassigned_bus_resources(bus);
+	}
 
-	max = pci_scan_child_bus(bus);
-	pci_assign_unassigned_bus_resources(bus);
 	pci_bus_add_devices(bus);
 
 	return max;
diff --git a/include/linux/pci.h b/include/linux/pci.h
index d3a72159722d..e5b5eff05744 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -838,6 +838,8 @@ struct pci_driver {
 	int  (*resume)(struct pci_dev *dev);	/* Device woken up */
 	void (*shutdown)(struct pci_dev *dev);
 	int  (*sriov_configure)(struct pci_dev *dev, int num_vfs); /* On PF */
+	void (*rescan_prepare)(struct pci_dev *dev);
+	void (*rescan_done)(struct pci_dev *dev);
 	const struct pci_error_handlers *err_handler;
 	const struct attribute_group **groups;
 	struct device_driver	driver;
@@ -924,6 +926,7 @@ enum {
 	PCI_ENABLE_PROC_DOMAINS	= 0x00000010,	/* Enable domains in /proc */
 	PCI_COMPAT_DOMAIN_0	= 0x00000020,	/* ... except domain 0 */
 	PCI_SCAN_ALL_PCIE_DEVS	= 0x00000040,	/* Scan all, not just dev 0 */
+	PCI_IMMOVABLE_BARS	= 0x00000080,	/* Disable runtime BAR reassign */
 };
 
 /* These external functions are only available when PCI support is enabled */
@@ -1266,6 +1269,9 @@ unsigned int pci_rescan_bus(struct pci_bus *bus);
 void pci_lock_rescan_remove(void);
 void pci_unlock_rescan_remove(void);
 
+void pcibios_rescan_prepare(struct pci_dev *dev);
+void pcibios_rescan_done(struct pci_dev *dev);
+
 /* Vital Product Data routines */
 ssize_t pci_read_vpd(struct pci_dev *dev, loff_t pos, size_t count, void *buf);
 ssize_t pci_write_vpd(struct pci_dev *dev, loff_t pos, size_t count, const void *buf);
@@ -1402,6 +1408,7 @@ unsigned char pci_bus_max_busnr(struct pci_bus *bus);
 void pci_setup_bridge(struct pci_bus *bus);
 resource_size_t pcibios_window_alignment(struct pci_bus *bus,
 					 unsigned long type);
+bool pci_movable_bars_enabled(void);
 
 #define PCI_VGA_STATE_CHANGE_BRIDGE (1 << 0)
 #define PCI_VGA_STATE_CHANGE_DECODES (1 << 1)
-- 
2.21.0



* [PATCH v5 04/23] PCI: Define PCI-specific version of the release_child_resources()
  2019-08-16 16:50 [PATCH v5 00/23] PCI: Allow BAR movement during hotplug Sergey Miroshnichenko
                   ` (2 preceding siblings ...)
  2019-08-16 16:50 ` [PATCH v5 03/23] PCI: hotplug: Add a flag for the movable BARs feature Sergey Miroshnichenko
@ 2019-08-16 16:50 ` Sergey Miroshnichenko
  2019-08-16 16:50 ` [PATCH v5 05/23] PCI: hotplug: movable BARs: Fix reassigning the released bridge windows Sergey Miroshnichenko
                   ` (18 subsequent siblings)
  22 siblings, 0 replies; 30+ messages in thread
From: Sergey Miroshnichenko @ 2019-08-16 16:50 UTC (permalink / raw)
  To: linux-pci, linuxppc-dev; +Cc: Bjorn Helgaas, linux, Sergey Miroshnichenko

If the bridge resources are released with the standard
release_child_resources(), it drops the .start field of the children's
BARs to zero but leaves the STARTALIGN flag set, which makes the resource
invalid for reassignment.

Some resources must preserve their offset and size: those marked with the
PCI_FIXED and the immovable ones - which are bound by drivers without
support of the movable BARs feature.

Add the pci_release_child_resources() to replace release_child_resources()
in handling the described PCI-specific cases.
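
The per-BAR part of this can be sketched in userspace. The struct, the
MODEL_* flag values and the function name are hypothetical and do not
match the kernel's definitions; the sketch only shows the intended effect:
fixed and immovable BARs keep their range, everything else is reset to
offset zero with its size preserved and the alignment flags swapped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag values, not the kernel's IORESOURCE_* bits. */
#define MODEL_STARTALIGN	0x1u
#define MODEL_SIZEALIGN		0x2u
#define MODEL_FIXED		0x4u

/* Hypothetical stand-in for struct resource. */
struct model_res {
	uint64_t start;
	uint64_t end;		/* inclusive */
	unsigned int flags;
};

/*
 * Release one BAR. Returns true if it was reset for reassignment,
 * false if it had to be kept in place.
 */
bool model_release_bar(struct model_res *r, bool driver_supports_movable)
{
	uint64_t size;

	assert(r->end >= r->start);
	size = r->end - r->start + 1;

	/* Fixed and immovable BARs keep flags, base and size. */
	if ((r->flags & MODEL_FIXED) || !driver_supports_movable)
		return false;

	/* Keep the size, drop the old position. */
	r->start = 0;
	r->end = size - 1;

	/* Alignment now follows from the size, not from the old start. */
	r->flags &= ~MODEL_STARTALIGN;
	r->flags |= MODEL_SIZEALIGN;

	return true;
}
```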

Signed-off-by: Sergey Miroshnichenko <s.miroshnichenko@yadro.com>
---
 drivers/pci/setup-bus.c | 54 ++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 53 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index 79b1fa6519be..6cb8b293c576 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -1482,6 +1482,55 @@ static void __pci_bridge_assign_resources(const struct pci_dev *bridge,
 	(IORESOURCE_IO | IORESOURCE_MEM | IORESOURCE_PREFETCH |\
 	 IORESOURCE_MEM_64)
 
+/*
+ * Similar to generic release_child_resources(), but aware of immovable BARs and
+ * PCI_FIXED and STARTALIGN flags
+ */
+static void pci_release_child_resources(struct pci_bus *bus, struct resource *r)
+{
+	struct pci_dev *dev;
+
+	if (!bus || !r)
+		return;
+
+	if (r->flags & IORESOURCE_PCI_FIXED)
+		return;
+
+	r->child = NULL;
+
+	list_for_each_entry(dev, &bus->devices, bus_list) {
+		int i;
+
+		for (i = 0; i < PCI_NUM_RESOURCES; i++) {
+			struct resource *tmp = &dev->resource[i];
+			resource_size_t size = resource_size(tmp);
+
+			if (!tmp->flags || tmp->parent != r)
+				continue;
+
+			tmp->parent = NULL;
+			tmp->sibling = NULL;
+
+			pci_release_child_resources(dev->subordinate, tmp);
+
+			if ((tmp->flags & IORESOURCE_PCI_FIXED) ||
+			    !pci_dev_movable_bars_supported(dev)) {
+				pci_dbg(dev, "release immovable %pR (%s), keep its flags, base and size\n",
+					tmp, tmp->name);
+				continue;
+			}
+
+			pci_dbg(dev, "release %pR (%s)\n", tmp, tmp->name);
+
+			tmp->start = 0;
+			tmp->end = size - 1;
+
+			tmp->flags &= ~IORESOURCE_STARTALIGN;
+			tmp->flags |= IORESOURCE_SIZEALIGN;
+		}
+	}
+}
+
 static void pci_bridge_release_resources(struct pci_bus *bus,
 					 unsigned long type)
 {
@@ -1522,7 +1571,10 @@ static void pci_bridge_release_resources(struct pci_bus *bus,
 		return;
 
 	/* If there are children, release them all */
-	release_child_resources(r);
+	if (pci_movable_bars_enabled())
+		pci_release_child_resources(bus, r);
+	else
+		release_child_resources(r);
 	if (!release_resource(r)) {
 		type = old_flags = r->flags & PCI_RES_TYPE_MASK;
 		pci_info(dev, "resource %d %pR released\n",
-- 
2.21.0



* [PATCH v5 05/23] PCI: hotplug: movable BARs: Fix reassigning the released bridge windows
  2019-08-16 16:50 [PATCH v5 00/23] PCI: Allow BAR movement during hotplug Sergey Miroshnichenko
                   ` (3 preceding siblings ...)
  2019-08-16 16:50 ` [PATCH v5 04/23] PCI: Define PCI-specific version of the release_child_resources() Sergey Miroshnichenko
@ 2019-08-16 16:50 ` Sergey Miroshnichenko
  2019-08-16 16:50 ` [PATCH v5 06/23] PCI: hotplug: movable BARs: Recalculate all bridge windows during rescan Sergey Miroshnichenko
                   ` (17 subsequent siblings)
  22 siblings, 0 replies; 30+ messages in thread
From: Sergey Miroshnichenko @ 2019-08-16 16:50 UTC (permalink / raw)
  To: linux-pci, linuxppc-dev; +Cc: Bjorn Helgaas, linux, Sergey Miroshnichenko

When a bridge window is temporarily released during the rescan, its old
size is not relevant anymore - it will be recreated from pbus_size_*(), so
its start value should be zero.

If such window can't be reassigned, don't apply reset_resource(), so the
next retry may succeed.

Signed-off-by: Sergey Miroshnichenko <s.miroshnichenko@yadro.com>
---
 drivers/pci/setup-bus.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index 6cb8b293c576..7c2c57f77c6f 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -295,7 +295,8 @@ static void assign_requested_resources_sorted(struct list_head *head,
 						    0 /* don't care */,
 						    0 /* don't care */);
 			}
-			reset_resource(res);
+			if (!pci_movable_bars_enabled())
+				reset_resource(res);
 		}
 	}
 }
@@ -1579,8 +1580,8 @@ static void pci_bridge_release_resources(struct pci_bus *bus,
 		type = old_flags = r->flags & PCI_RES_TYPE_MASK;
 		pci_info(dev, "resource %d %pR released\n",
 			 PCI_BRIDGE_RESOURCES + idx, r);
-		/* Keep the old size */
-		r->end = resource_size(r) - 1;
+		/* Don't keep the old size if the bridge will be recalculated */
+		r->end = pci_movable_bars_enabled() ? 0 : (resource_size(r) - 1);
 		r->start = 0;
 		r->flags = 0;
 
-- 
2.21.0



* [PATCH v5 06/23] PCI: hotplug: movable BARs: Recalculate all bridge windows during rescan
  2019-08-16 16:50 [PATCH v5 00/23] PCI: Allow BAR movement during hotplug Sergey Miroshnichenko
                   ` (4 preceding siblings ...)
  2019-08-16 16:50 ` [PATCH v5 05/23] PCI: hotplug: movable BARs: Fix reassigning the released bridge windows Sergey Miroshnichenko
@ 2019-08-16 16:50 ` Sergey Miroshnichenko
  2019-08-16 16:50 ` [PATCH v5 07/23] PCI: hotplug: movable BARs: Don't allow added devices to steal resources Sergey Miroshnichenko
                   ` (16 subsequent siblings)
  22 siblings, 0 replies; 30+ messages in thread
From: Sergey Miroshnichenko @ 2019-08-16 16:50 UTC (permalink / raw)
  To: linux-pci, linuxppc-dev; +Cc: Bjorn Helgaas, linux, Sergey Miroshnichenko

When the movable BARs feature is enabled and a rescan has been requested,
release all the bridge windows and recalculate them from scratch, taking
into account all kinds of BARs: fixed, immovable, movable, new.

This increases the chances of finding a memory space to fit BARs for newly
hotplugged devices, especially if no gaps, or not enough of them, were
reserved by the BIOS/bootloader/firmware.

The last step of writing the recalculated windows to the bridges is done
by the new pci_setup_bridges() function.
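
As a rough user-space illustration of the traversal order (not kernel code),
the sketch below walks a toy bus tree depth-first and "programs" each bridge
only after everything beneath it has been handled. The names struct sim_bus
and sim_setup_bridges() are hypothetical, and the checks for not-yet-added
devices done by the real function are left out.

```c
#include <assert.h>
#include <stddef.h>

#define SIM_MAX_CHILDREN 4

/* Toy model of a bus; has_bridge == 0 stands for a root bus (no bus->self) */
struct sim_bus {
	int id;
	int has_bridge;
	struct sim_bus *children[SIM_MAX_CHILDREN];
	size_t nr_children;
};

/*
 * Visit child buses first, then record this bus's bridge as "programmed".
 * Returns the updated number of entries stored in out[].
 */
size_t sim_setup_bridges(const struct sim_bus *bus, int *out,
			 size_t count, size_t cap)
{
	size_t i;

	for (i = 0; i < bus->nr_children; i++)
		count = sim_setup_bridges(bus->children[i], out, count, cap);

	if (bus->has_bridge) {
		assert(count < cap);
		out[count++] = bus->id;
	}

	return count;
}
```

For a root with bridge 1 (which has bridge 2 below it) and bridge 3, the
recorded order is 2, 1, 3, and the root itself is never written.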

Signed-off-by: Sergey Miroshnichenko <s.miroshnichenko@yadro.com>
---
 drivers/pci/pci.h       |  1 +
 drivers/pci/probe.c     | 22 ++++++++++++++++++++++
 drivers/pci/setup-bus.c | 16 ++++++++++++++++
 3 files changed, 39 insertions(+)

diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index be7acc477c64..a0ec696512eb 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -253,6 +253,7 @@ void __pci_bus_assign_resources(const struct pci_bus *bus,
 				struct list_head *realloc_head,
 				struct list_head *fail_head);
 bool pci_bus_clip_resource(struct pci_dev *dev, int idx);
+void pci_bus_release_root_bridge_resources(struct pci_bus *bus);
 
 void pci_reassigndev_resource_alignment(struct pci_dev *dev);
 void pci_disable_bridge_window(struct pci_dev *dev);
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 60e3b48d2251..a26bf740e9ab 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -3474,6 +3474,25 @@ static void pci_bus_rescan_done(struct pci_bus *bus)
 		pci_config_pm_runtime_put(bus->self);
 }
 
+static void pci_setup_bridges(struct pci_bus *bus)
+{
+	struct pci_dev *dev;
+
+	list_for_each_entry(dev, &bus->devices, bus_list) {
+		struct pci_bus *child;
+
+		if (!pci_dev_is_added(dev) || pci_dev_is_ignored(dev))
+			continue;
+
+		child = dev->subordinate;
+		if (child)
+			pci_setup_bridges(child);
+	}
+
+	if (bus->self)
+		pci_setup_bridge(bus);
+}
+
 /**
  * pci_rescan_bus - Scan a PCI bus for devices
  * @bus: PCI bus to scan
@@ -3495,8 +3514,11 @@ unsigned int pci_rescan_bus(struct pci_bus *bus)
 		pci_bus_rescan_prepare(root);
 
 		max = pci_scan_child_bus(root);
+
+		pci_bus_release_root_bridge_resources(root);
 		pci_assign_unassigned_root_bus_resources(root);
 
+		pci_setup_bridges(root);
 		pci_bus_rescan_done(root);
 	} else {
 		max = pci_scan_child_bus(bus);
diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index 7c2c57f77c6f..04f626e1ac18 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -1635,6 +1635,22 @@ static void pci_bus_release_bridge_resources(struct pci_bus *bus,
 		pci_bridge_release_resources(bus, type);
 }
 
+void pci_bus_release_root_bridge_resources(struct pci_bus *root_bus)
+{
+	int i;
+	struct resource *r;
+
+	pci_bus_release_bridge_resources(root_bus, IORESOURCE_IO, whole_subtree);
+	pci_bus_release_bridge_resources(root_bus, IORESOURCE_MEM, whole_subtree);
+	pci_bus_release_bridge_resources(root_bus,
+					 IORESOURCE_MEM_64 | IORESOURCE_PREFETCH,
+					 whole_subtree);
+
+	pci_bus_for_each_resource(root_bus, r, i) {
+		pci_release_child_resources(root_bus, r);
+	}
+}
+
 static void pci_bus_dump_res(struct pci_bus *bus)
 {
 	struct resource *res;
-- 
2.21.0



* [PATCH v5 07/23] PCI: hotplug: movable BARs: Don't allow added devices to steal resources
  2019-08-16 16:50 [PATCH v5 00/23] PCI: Allow BAR movement during hotplug Sergey Miroshnichenko
                   ` (5 preceding siblings ...)
  2019-08-16 16:50 ` [PATCH v5 06/23] PCI: hotplug: movable BARs: Recalculate all bridge windows during rescan Sergey Miroshnichenko
@ 2019-08-16 16:50 ` Sergey Miroshnichenko
  2019-08-16 16:50 ` [PATCH v5 08/23] PCI: Include fixed and immovable BARs into the bus size calculating Sergey Miroshnichenko
                   ` (15 subsequent siblings)
  22 siblings, 0 replies; 30+ messages in thread
From: Sergey Miroshnichenko @ 2019-08-16 16:50 UTC (permalink / raw)
  To: linux-pci, linuxppc-dev; +Cc: Bjorn Helgaas, linux, Sergey Miroshnichenko

When movable BARs are enabled, the PCI subsystem first releases all the
bridge windows and then attempts to assign resources both to previously
working devices and to the newly hotplugged ones, with the same priority.

If a hotplugged device gets its BARs first, this may lead to lack of space
for already working devices, which is unacceptable. If that happens, mark
one of the new devices with the newly introduced flag PCI_DEV_DISABLED_BARS
(if it is not yet marked) and retry the BAR recalculation.

The worst case would be no BARs for hotplugged devices, while all the rest
just continue working.

The algorithm is simple and doesn't retry different subsets of hot-added
devices in case of a failure. E.g. if there is not enough space to allocate
BARs for both hotplugged devices A and B, but there is enough for just A,
then A will be marked with PCI_DEV_DISABLED_BARS first, and then (after the
next failure) B. As a result, A will not get BARs although it could. This
issue is only relevant when hotplugging two or more devices simultaneously.
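
The retry loop and its limitation can be modelled in user space roughly as
follows. This is only a sketch under simplifying assumptions: "assignment"
succeeds when the summed sizes of all enabled devices fit into one pool,
whereas the kernel works with real addresses and alignments. The names
struct sim_dev, sim_assign() and sim_reassign() are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy device: one size stands for all of its BARs */
struct sim_dev {
	unsigned int size;
	bool is_new;		/* hot-added during this rescan */
	bool bars_disabled;	/* models PCI_DEV_DISABLED_BARS */
};

/* Simplified assignment pass: everything that is enabled must fit */
static bool sim_assign(const struct sim_dev *devs, size_t n,
		       unsigned int capacity)
{
	unsigned int used = 0;
	size_t i;

	for (i = 0; i < n; i++)
		if (!devs[i].bars_disabled)
			used += devs[i].size;

	return used <= capacity;
}

/*
 * Retry until assignment succeeds, disabling one new device per failure.
 * Returns the number of disabled devices, or -1 if even the previously
 * working devices alone do not fit.
 */
int sim_reassign(struct sim_dev *devs, size_t n, unsigned int capacity)
{
	int disabled = 0;

	assert(devs || n == 0);

	for (;;) {
		size_t i;

		if (sim_assign(devs, n, capacity))
			return disabled;

		/* Pick the first new device that still has BARs enabled */
		for (i = 0; i < n; i++)
			if (devs[i].is_new && !devs[i].bars_disabled)
				break;

		if (i == n)
			return -1;

		devs[i].bars_disabled = true;
		disabled++;
	}
}
```

With a pool of 100, an old device of size 60 and new devices A (30) and
B (50), the loop disables A and then B, although the old device plus A
would have fit. This is the behaviour described in the paragraph above.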

Add a new res_mask bitmask to the struct pci_dev for storing the indices of
assigned BARs.
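
The comparison of the saved mask with the mask after reassignment boils
down to a single bitwise test. The sketch below shows it in isolation; the
helper names and the limit of six BARs are assumptions made for the example
(the patch itself iterates up to PCI_BRIDGE_RESOURCES).

```c
#include <assert.h>

#define SIM_NUM_BARS 6	/* assumption of this sketch: standard BARs 0..5 */

/* Mark BAR number 'bar' as assigned in the mask */
unsigned int sim_mask_set(unsigned int mask, unsigned int bar)
{
	assert(bar < SIM_NUM_BARS);
	return mask | (1u << bar);
}

/*
 * Bits set in 'before' but cleared in 'after': BARs that were assigned
 * prior to the rescan and failed to get space again. BARs gained during
 * the rescan do not count as a failure.
 */
unsigned int sim_lost_bars(unsigned int before, unsigned int after)
{
	return before & ~after;
}
```

A non-zero result corresponds to the "Non-re-enabled resources found"
case, which triggers another retry with one more new device disabled.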

Signed-off-by: Sergey Miroshnichenko <s.miroshnichenko@yadro.com>
---
 drivers/pci/pci.h       |  11 +++++
 drivers/pci/probe.c     | 101 ++++++++++++++++++++++++++++++++++++++--
 drivers/pci/setup-bus.c |  15 ++++++
 include/linux/pci.h     |   1 +
 4 files changed, 125 insertions(+), 3 deletions(-)

diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index a0ec696512eb..53249cbc21b6 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -373,6 +373,7 @@ static inline bool pci_dev_is_disconnected(const struct pci_dev *dev)
 
 /* pci_dev priv_flags */
 #define PCI_DEV_ADDED 0
+#define PCI_DEV_DISABLED_BARS 1
 
 static inline void pci_dev_assign_added(struct pci_dev *dev, bool added)
 {
@@ -384,6 +385,16 @@ static inline bool pci_dev_is_added(const struct pci_dev *dev)
 	return test_bit(PCI_DEV_ADDED, &dev->priv_flags);
 }
 
+static inline void pci_dev_disable_bars(struct pci_dev *dev)
+{
+	assign_bit(PCI_DEV_DISABLED_BARS, &dev->priv_flags, true);
+}
+
+static inline bool pci_dev_bars_enabled(const struct pci_dev *dev)
+{
+	return !test_bit(PCI_DEV_DISABLED_BARS, &dev->priv_flags);
+}
+
 #ifdef CONFIG_PCIEAER
 #include <linux/aer.h>
 
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index a26bf740e9ab..bf0a7d1c5d09 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -3428,6 +3428,23 @@ void __weak pcibios_rescan_done(struct pci_dev *dev)
 {
 }
 
+static unsigned int pci_dev_count_res_mask(struct pci_dev *dev)
+{
+	unsigned int res_mask = 0;
+	int i;
+
+	for (i = 0; i < PCI_BRIDGE_RESOURCES; i++) {
+		struct resource *r = &dev->resource[i];
+
+		if (!r->flags || (r->flags & IORESOURCE_UNSET) || !r->parent)
+			continue;
+
+		res_mask |= (1 << i);
+	}
+
+	return res_mask;
+}
+
 static void pci_bus_rescan_prepare(struct pci_bus *bus)
 {
 	struct pci_dev *dev;
@@ -3438,6 +3455,8 @@ static void pci_bus_rescan_prepare(struct pci_bus *bus)
 	list_for_each_entry(dev, &bus->devices, bus_list) {
 		struct pci_bus *child = dev->subordinate;
 
+		dev->res_mask = pci_dev_count_res_mask(dev);
+
 		if (child)
 			pci_bus_rescan_prepare(child);
 
@@ -3481,7 +3500,7 @@ static void pci_setup_bridges(struct pci_bus *bus)
 	list_for_each_entry(dev, &bus->devices, bus_list) {
 		struct pci_bus *child;
 
-		if (!pci_dev_is_added(dev) || pci_dev_is_ignored(dev))
+		if (!pci_dev_is_added(dev) || !pci_dev_bars_enabled(dev))
 			continue;
 
 		child = dev->subordinate;
@@ -3493,6 +3512,83 @@ static void pci_setup_bridges(struct pci_bus *bus)
 		pci_setup_bridge(bus);
 }
 
+static struct pci_dev *pci_find_next_new_device(struct pci_bus *bus)
+{
+	struct pci_dev *dev;
+
+	if (!bus)
+		return NULL;
+
+	list_for_each_entry(dev, &bus->devices, bus_list) {
+		struct pci_bus *child_bus = dev->subordinate;
+
+		if (!pci_dev_is_added(dev) && pci_dev_bars_enabled(dev))
+			return dev;
+
+		if (child_bus) {
+			struct pci_dev *next_new_dev;
+
+			next_new_dev = pci_find_next_new_device(child_bus);
+			if (next_new_dev)
+				return next_new_dev;
+		}
+	}
+
+	return NULL;
+}
+
+static bool pci_bus_check_all_bars_reassigned(struct pci_bus *bus)
+{
+	struct pci_dev *dev;
+	bool ret = true;
+
+	if (!bus)
+		return false;
+
+	list_for_each_entry(dev, &bus->devices, bus_list) {
+		struct pci_bus *child = dev->subordinate;
+		unsigned int res_mask = pci_dev_count_res_mask(dev);
+
+		if (!pci_dev_bars_enabled(dev))
+			continue;
+
+		if (dev->res_mask & ~res_mask) {
+			pci_err(dev, "Non-re-enabled resources found: 0x%x -> 0x%x\n",
+				dev->res_mask, res_mask);
+			ret = false;
+		}
+
+		if (child && !pci_bus_check_all_bars_reassigned(child))
+			ret = false;
+	}
+
+	return ret;
+}
+
+static void pci_reassign_root_bus_resources(struct pci_bus *root)
+{
+	do {
+		struct pci_dev *next_new_dev;
+
+		pci_bus_release_root_bridge_resources(root);
+		pci_assign_unassigned_root_bus_resources(root);
+
+		if (pci_bus_check_all_bars_reassigned(root))
+			break;
+
+		next_new_dev = pci_find_next_new_device(root);
+		if (!next_new_dev) {
+			dev_err(&root->dev, "failed to re-assign resources even after ignoring all the hotplugged devices\n");
+			break;
+		}
+
+		dev_warn(&root->dev, "failed to re-assign resources, disable the next hotplugged device %s and retry\n",
+			 dev_name(&next_new_dev->dev));
+
+		pci_dev_disable_bars(next_new_dev);
+	} while (true);
+}
+
 /**
  * pci_rescan_bus - Scan a PCI bus for devices
  * @bus: PCI bus to scan
@@ -3515,8 +3611,7 @@ unsigned int pci_rescan_bus(struct pci_bus *bus)
 
 		max = pci_scan_child_bus(root);
 
-		pci_bus_release_root_bridge_resources(root);
-		pci_assign_unassigned_root_bus_resources(root);
+		pci_reassign_root_bus_resources(root);
 
 		pci_setup_bridges(root);
 		pci_bus_rescan_done(root);
diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index 04f626e1ac18..1a731002ce18 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -128,6 +128,9 @@ static void pdev_sort_resources(struct pci_dev *dev, struct list_head *head)
 {
 	int i;
 
+	if (!pci_dev_bars_enabled(dev))
+		return;
+
 	for (i = 0; i < PCI_NUM_RESOURCES; i++) {
 		struct resource *r;
 		struct pci_dev_resource *dev_res, *tmp;
@@ -177,6 +180,9 @@ static void __dev_sort_resources(struct pci_dev *dev, struct list_head *head)
 {
 	u16 class = dev->class >> 8;
 
+	if (!pci_dev_bars_enabled(dev))
+		return;
+
 	/* Don't touch classless devices or host bridges or IOAPICs */
 	if (class == PCI_CLASS_NOT_DEFINED || class == PCI_CLASS_BRIDGE_HOST)
 		return;
@@ -278,6 +284,9 @@ static void assign_requested_resources_sorted(struct list_head *head,
 	int idx;
 
 	list_for_each_entry(dev_res, head, list) {
+		if (!pci_dev_bars_enabled(dev_res->dev))
+			continue;
+
 		res = dev_res->res;
 		idx = res - &dev_res->dev->resource[0];
 		if (resource_size(res) &&
@@ -995,6 +1004,9 @@ static int pbus_size_mem(struct pci_bus *bus, unsigned long mask,
 	list_for_each_entry(dev, &bus->devices, bus_list) {
 		int i;
 
+		if (!pci_dev_bars_enabled(dev))
+			continue;
+
 		for (i = 0; i < PCI_NUM_RESOURCES; i++) {
 			struct resource *r = &dev->resource[i];
 			resource_size_t r_size;
@@ -1349,6 +1361,9 @@ void __pci_bus_assign_resources(const struct pci_bus *bus,
 	pbus_assign_resources_sorted(bus, realloc_head, fail_head);
 
 	list_for_each_entry(dev, &bus->devices, bus_list) {
+		if (!pci_dev_bars_enabled(dev))
+			continue;
+
 		pdev_assign_fixed_resources(dev);
 
 		b = dev->subordinate;
diff --git a/include/linux/pci.h b/include/linux/pci.h
index e5b5eff05744..95a8113c2157 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -375,6 +375,7 @@ struct pci_dev {
 	 */
 	unsigned int	irq;
 	struct resource resource[DEVICE_COUNT_RESOURCE]; /* I/O and memory regions + expansion ROMs */
+	unsigned int	res_mask;		/* Bitmask of assigned resources */
 
 	bool		match_driver;		/* Skip attaching driver */
 
-- 
2.21.0



* [PATCH v5 08/23] PCI: Include fixed and immovable BARs into the bus size calculating
  2019-08-16 16:50 [PATCH v5 00/23] PCI: Allow BAR movement during hotplug Sergey Miroshnichenko
                   ` (6 preceding siblings ...)
  2019-08-16 16:50 ` [PATCH v5 07/23] PCI: hotplug: movable BARs: Don't allow added devices to steal resources Sergey Miroshnichenko
@ 2019-08-16 16:50 ` Sergey Miroshnichenko
  2019-08-16 16:50 ` [PATCH v5 09/23] PCI: Prohibit assigning BARs and bridge windows to non-direct parents Sergey Miroshnichenko
                   ` (14 subsequent siblings)
  22 siblings, 0 replies; 30+ messages in thread
From: Sergey Miroshnichenko @ 2019-08-16 16:50 UTC (permalink / raw)
  To: linux-pci, linuxppc-dev; +Cc: Bjorn Helgaas, linux, Sergey Miroshnichenko

The only difference between fixed/immovable and movable BARs is that the
former preserve their size and offset after they are released (the
corresponding struct resource* is detached from a bridge window for a while
during a bus rescan).

Include fixed/immovable BARs in the result of pbus_size_mem() and prohibit
assigning them to non-direct parents.

Signed-off-by: Sergey Miroshnichenko <s.miroshnichenko@yadro.com>
---
 drivers/pci/setup-bus.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index 1a731002ce18..2c250efca512 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -1011,12 +1011,21 @@ static int pbus_size_mem(struct pci_bus *bus, unsigned long mask,
 			struct resource *r = &dev->resource[i];
 			resource_size_t r_size;
 
-			if (r->parent || (r->flags & IORESOURCE_PCI_FIXED) ||
+			if (r->parent ||
 			    ((r->flags & mask) != type &&
 			     (r->flags & mask) != type2 &&
 			     (r->flags & mask) != type3))
 				continue;
 			r_size = resource_size(r);
+
+			if ((r->flags & IORESOURCE_PCI_FIXED) ||
+			    !pci_dev_movable_bars_supported(dev)) {
+				if (pci_movable_bars_enabled())
+					size += r_size;
+
+				continue;
+			}
+
 #ifdef CONFIG_PCI_IOV
 			/* Put SRIOV requested res to the optional list */
 			if (realloc_head && i >= PCI_IOV_RESOURCES &&
-- 
2.21.0



* [PATCH v5 09/23] PCI: Prohibit assigning BARs and bridge windows to non-direct parents
  2019-08-16 16:50 [PATCH v5 00/23] PCI: Allow BAR movement during hotplug Sergey Miroshnichenko
                   ` (7 preceding siblings ...)
  2019-08-16 16:50 ` [PATCH v5 08/23] PCI: Include fixed and immovable BARs into the bus size calculating Sergey Miroshnichenko
@ 2019-08-16 16:50 ` Sergey Miroshnichenko
  2019-08-16 16:50 ` [PATCH v5 10/23] PCI: hotplug: movable BARs: Try to assign unassigned resources only once Sergey Miroshnichenko
                   ` (13 subsequent siblings)
  22 siblings, 0 replies; 30+ messages in thread
From: Sergey Miroshnichenko @ 2019-08-16 16:50 UTC (permalink / raw)
  To: linux-pci, linuxppc-dev; +Cc: Bjorn Helgaas, linux, Sergey Miroshnichenko

When movable BARs are enabled, the resource relocation feature from
commit 2bbc6942273b5 ("PCI : ability to relocate assigned pci-resources")
is not used. Instead, inability to assign a resource is used as a signal
to retry BAR assignment with another configuration of bridge windows.

Signed-off-by: Sergey Miroshnichenko <s.miroshnichenko@yadro.com>
---
 drivers/pci/setup-bus.c |  2 ++
 drivers/pci/setup-res.c | 12 ++++++++++++
 2 files changed, 14 insertions(+)

diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index 2c250efca512..aee330047121 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -1356,6 +1356,8 @@ static void pdev_assign_fixed_resources(struct pci_dev *dev)
 		while (b && !r->parent) {
 			assign_fixed_resource_on_bus(b, r);
 			b = b->parent;
+			if (!r->parent && pci_movable_bars_enabled())
+				break;
 		}
 	}
 }
diff --git a/drivers/pci/setup-res.c b/drivers/pci/setup-res.c
index d8ca40a97693..732d18f60f1b 100644
--- a/drivers/pci/setup-res.c
+++ b/drivers/pci/setup-res.c
@@ -298,6 +298,18 @@ static int _pci_assign_resource(struct pci_dev *dev, int resno,
 
 	bus = dev->bus;
 	while ((ret = __pci_assign_resource(bus, dev, resno, size, min_align))) {
+		if (pci_movable_bars_enabled()) {
+			if (resno >= PCI_BRIDGE_RESOURCES &&
+			    resno <= PCI_BRIDGE_RESOURCE_END) {
+				struct resource *res = dev->resource + resno;
+
+				res->start = 0;
+				res->end = 0;
+				res->flags = 0;
+			}
+			break;
+		}
+
 		if (!bus->parent || !bus->self->transparent)
 			break;
 		bus = bus->parent;
-- 
2.21.0



* [PATCH v5 10/23] PCI: hotplug: movable BARs: Try to assign unassigned resources only once
  2019-08-16 16:50 [PATCH v5 00/23] PCI: Allow BAR movement during hotplug Sergey Miroshnichenko
                   ` (8 preceding siblings ...)
  2019-08-16 16:50 ` [PATCH v5 09/23] PCI: Prohibit assigning BARs and bridge windows to non-direct parents Sergey Miroshnichenko
@ 2019-08-16 16:50 ` Sergey Miroshnichenko
  2019-08-16 16:50 ` [PATCH v5 11/23] PCI: hotplug: movable BARs: Calculate immovable parts of bridge windows Sergey Miroshnichenko
                   ` (12 subsequent siblings)
  22 siblings, 0 replies; 30+ messages in thread
From: Sergey Miroshnichenko @ 2019-08-16 16:50 UTC (permalink / raw)
  To: linux-pci, linuxppc-dev; +Cc: Bjorn Helgaas, linux, Sergey Miroshnichenko

With BAR movement enabled, BARs and bridge windows can only be assigned to
their direct parents, so there can be only one variant of the resource tree.
Thus every retry within pci_assign_unassigned_root_bus_resources() will
result in the same tree, and it is enough to try just once.

In case of failures the pci_reassign_root_bus_resources() disables BARs for
one of the hotplugged devices and tries the assignment again.

Signed-off-by: Sergey Miroshnichenko <s.miroshnichenko@yadro.com>
---
 drivers/pci/setup-bus.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index aee330047121..33f709095675 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -1819,6 +1819,13 @@ void pci_assign_unassigned_root_bus_resources(struct pci_bus *bus)
 	int pci_try_num = 1;
 	enum enable_type enable_local;
 
+	if (pci_movable_bars_enabled()) {
+		__pci_bus_size_bridges(bus, NULL);
+		__pci_bus_assign_resources(bus, NULL, NULL);
+
+		goto dump;
+	}
+
 	/* Don't realloc if asked to do so */
 	enable_local = pci_realloc_detect(bus, pci_realloc_enable);
 	if (pci_realloc_enabled(enable_local)) {
-- 
2.21.0



* [PATCH v5 11/23] PCI: hotplug: movable BARs: Calculate immovable parts of bridge windows
  2019-08-16 16:50 [PATCH v5 00/23] PCI: Allow BAR movement during hotplug Sergey Miroshnichenko
                   ` (9 preceding siblings ...)
  2019-08-16 16:50 ` [PATCH v5 10/23] PCI: hotplug: movable BARs: Try to assign unassigned resources only once Sergey Miroshnichenko
@ 2019-08-16 16:50 ` Sergey Miroshnichenko
  2019-08-16 16:50 ` [PATCH v5 12/23] PCI: hotplug: movable BARs: Compute limits for relocated " Sergey Miroshnichenko
                   ` (11 subsequent siblings)
  22 siblings, 0 replies; 30+ messages in thread
From: Sergey Miroshnichenko @ 2019-08-16 16:50 UTC (permalink / raw)
  To: linux-pci, linuxppc-dev; +Cc: Bjorn Helgaas, linux, Sergey Miroshnichenko

When movable BARs are enabled, and if a bridge contains a device with fixed
(IORESOURCE_PCI_FIXED) or immovable BARs, the corresponding windows can't be
moved too far away from their original positions - they must still contain
all the fixed/immovable BARs, like this:

  1) Window position before a bus rescan:

  | <--                    root bridge window                        --> |
  |                                                                      |
  | | <--     bridge window    --> |                                     |
  | | movable BARs | **fixed BAR** |                                     |

  2) Possible valid outcome after rescan and move:

  | <--                    root bridge window                        --> |
  |                                                                      |
  |                | <--     bridge window    --> |                      |
  |                | **fixed BAR** | movable BARs |                      |

An immovable area of a bridge (separate for IO, MEM and MEM64 window types)
is a range that covers all the fixed and immovable BARs of direct children,
and the fixed areas of all child bridges:

  | <--                    root bridge window                        --> |
  |                                                                      |
  |  | <--                  bridge window level 1                --> |   |
  |  | ******** immovable area of this bridge window ********        |   |
  |  |                                                               |   |
  |  | **fixed BAR**  | <--      bridge window level 2    --> | BARs |   |
  |  |                | ***** fixed area of this bridge ***** |      |   |
  |  |                |                                       |      |   |
  |  |                | ***fixed BAR*** |   | ***fixed BAR*** |      |   |

To store these areas, the .immovable_range field has been added to struct
pci_bus. It is filled recursively from leaves to the root before a rescan.

Also make pbus_size_io() and pbus_size_mem() return their usual result OR
the size of the immovable range of the corresponding type, depending on
which one is larger.
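
The accumulation of an immovable range and its effect on the minimal window
size can be sketched in user space like this. The names struct imm_range,
imm_range_expand(), imm_range_size() and imm_min_window_size() are
hypothetical. As in the patch, a range with start >= end is treated as
"nothing fixed here"; unlike the patch, emptiness is tested explicitly
instead of using start == 0 as the "unset" marker.

```c
#include <assert.h>
#include <stdint.h>

/* Inclusive address range; start >= end means "empty" */
struct imm_range {
	uint64_t start;
	uint64_t end;
};

static int imm_range_empty(const struct imm_range *r)
{
	return r->start >= r->end;
}

/* Grow 'acc' so that it also covers 'r' (a fixed BAR or a child's range) */
void imm_range_expand(struct imm_range *acc, const struct imm_range *r)
{
	assert(acc && r);

	if (imm_range_empty(r))
		return;

	if (imm_range_empty(acc)) {
		*acc = *r;
		return;
	}

	if (r->start < acc->start)
		acc->start = r->start;
	if (r->end > acc->end)
		acc->end = r->end;
}

/* Size of the range in bytes, 0 if empty */
uint64_t imm_range_size(const struct imm_range *r)
{
	return imm_range_empty(r) ? 0 : r->end - r->start + 1;
}

/* A window can never be smaller than the immovable area it must contain */
uint64_t imm_min_window_size(uint64_t min_size, const struct imm_range *imm)
{
	uint64_t fixed_size = imm_range_size(imm);

	return min_size < fixed_size ? fixed_size : min_size;
}
```

For fixed BARs at 0x1000-0x1fff and 0x4000-0x4fff under one bridge, the
accumulated range is 0x1000-0x4fff, so the window must be at least 0x4000
bytes even though the BARs themselves add up to only 0x2000.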

Signed-off-by: Sergey Miroshnichenko <s.miroshnichenko@yadro.com>
---
 drivers/pci/pci.h       | 14 +++++++
 drivers/pci/probe.c     | 88 +++++++++++++++++++++++++++++++++++++++++
 drivers/pci/setup-bus.c | 17 ++++++++
 include/linux/pci.h     |  6 +++
 4 files changed, 125 insertions(+)

diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 53249cbc21b6..12add575faf1 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -371,6 +371,20 @@ static inline bool pci_dev_is_disconnected(const struct pci_dev *dev)
 	return dev->error_state == pci_channel_io_perm_failure;
 }
 
+static inline int pci_get_bridge_resource_idx(struct resource *r)
+{
+	int idx = 1;
+
+	if (r->flags & IORESOURCE_IO)
+		idx = 0;
+	else if (!(r->flags & IORESOURCE_PREFETCH))
+		idx = 1;
+	else if (r->flags & IORESOURCE_MEM_64)
+		idx = 2;
+
+	return idx;
+}
+
 /* pci_dev priv_flags */
 #define PCI_DEV_ADDED 0
 #define PCI_DEV_DISABLED_BARS 1
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index bf0a7d1c5d09..5f52a19738aa 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -550,6 +550,7 @@ void pci_read_bridge_bases(struct pci_bus *child)
 static struct pci_bus *pci_alloc_bus(struct pci_bus *parent)
 {
 	struct pci_bus *b;
+	int idx;
 
 	b = kzalloc(sizeof(*b), GFP_KERNEL);
 	if (!b)
@@ -566,6 +567,11 @@ static struct pci_bus *pci_alloc_bus(struct pci_bus *parent)
 	if (parent)
 		b->domain_nr = parent->domain_nr;
 #endif
+	for (idx = 0; idx < PCI_BRIDGE_RESOURCE_NUM; ++idx) {
+		b->immovable_range[idx].start = 0;
+		b->immovable_range[idx].end = 0;
+	}
+
 	return b;
 }
 
@@ -3512,6 +3518,87 @@ static void pci_setup_bridges(struct pci_bus *bus)
 		pci_setup_bridge(bus);
 }
 
+static void pci_bus_update_immovable_range(struct pci_bus *bus)
+{
+	struct pci_dev *dev;
+	int idx;
+	resource_size_t start, end;
+
+	for (idx = 0; idx < PCI_BRIDGE_RESOURCE_NUM; ++idx) {
+		bus->immovable_range[idx].start = 0;
+		bus->immovable_range[idx].end = 0;
+	}
+
+	list_for_each_entry(dev, &bus->devices, bus_list)
+		if (dev->subordinate)
+			pci_bus_update_immovable_range(dev->subordinate);
+
+	list_for_each_entry(dev, &bus->devices, bus_list) {
+		int i;
+		bool dev_is_movable = pci_dev_movable_bars_supported(dev);
+		struct pci_bus *child = dev->subordinate;
+
+		for (i = 0; i < PCI_BRIDGE_RESOURCES; ++i) {
+			struct resource *r = &dev->resource[i];
+
+			if (!r->flags || (r->flags & IORESOURCE_UNSET) || !r->parent)
+				continue;
+
+			if (!dev_is_movable || (r->flags & IORESOURCE_PCI_FIXED)) {
+				idx = pci_get_bridge_resource_idx(r);
+				start = bus->immovable_range[idx].start;
+				end = bus->immovable_range[idx].end;
+
+				if (!start || start > r->start)
+					start = r->start;
+				if (end < r->end)
+					end = r->end;
+
+				if (bus->immovable_range[idx].start != start ||
+				    bus->immovable_range[idx].end != end) {
+					dev_dbg(&bus->dev, "Found fixed 0x%llx-0x%llx in %s, expand the fixed bridge window %d to 0x%llx-0x%llx\n",
+						(unsigned long long)r->start,
+						(unsigned long long)r->end,
+						dev_name(&dev->dev), idx,
+						(unsigned long long)start,
+						(unsigned long long)end);
+					bus->immovable_range[idx].start = start;
+					bus->immovable_range[idx].end = end;
+				}
+			}
+		}
+
+		if (child) {
+			for (idx = 0; idx < PCI_BRIDGE_RESOURCE_NUM; ++idx) {
+				struct resource *child_immovable_range =
+					&child->immovable_range[idx];
+
+				if (child_immovable_range->start >=
+				    child_immovable_range->end)
+					continue;
+
+				start = bus->immovable_range[idx].start;
+				end = bus->immovable_range[idx].end;
+
+				if (!start || start > child_immovable_range->start)
+					start = child_immovable_range->start;
+				if (end < child_immovable_range->end)
+					end = child_immovable_range->end;
+
+				if (start < bus->immovable_range[idx].start ||
+				    end > bus->immovable_range[idx].end) {
+					dev_dbg(&bus->dev, "Expand the fixed bridge window %d from %s to 0x%llx-0x%llx\n",
+						idx, dev_name(&child->dev),
+						(unsigned long long)start,
+						(unsigned long long)end);
+					bus->immovable_range[idx].start = start;
+					bus->immovable_range[idx].end = end;
+				}
+			}
+		}
+	}
+}
+
 static struct pci_dev *pci_find_next_new_device(struct pci_bus *bus)
 {
 	struct pci_dev *dev;
@@ -3610,6 +3697,7 @@ unsigned int pci_rescan_bus(struct pci_bus *bus)
 		pci_bus_rescan_prepare(root);
 
 		max = pci_scan_child_bus(root);
+		pci_bus_update_immovable_range(root);
 
 		pci_reassign_root_bus_resources(root);
 
diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index 33f709095675..420510a1a257 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -882,9 +882,17 @@ static void pbus_size_io(struct pci_bus *bus, resource_size_t min_size,
 	resource_size_t children_add_size = 0;
 	resource_size_t min_align, align;
 
+	resource_size_t fixed_start = bus->immovable_range[0].start;
+	resource_size_t fixed_end = bus->immovable_range[0].end;
+	resource_size_t fixed_size = (fixed_start < fixed_end) ?
+		(fixed_end - fixed_start + 1) : 0;
+
 	if (!b_res)
 		return;
 
+	if (min_size < fixed_size)
+		min_size = fixed_size;
+
 	min_align = window_alignment(bus, IORESOURCE_IO);
 	list_for_each_entry(dev, &bus->devices, bus_list) {
 		int i;
@@ -993,6 +1001,15 @@ static int pbus_size_mem(struct pci_bus *bus, unsigned long mask,
 	resource_size_t children_add_size = 0;
 	resource_size_t children_add_align = 0;
 	resource_size_t add_align = 0;
+	bool is_mem64 = (mask & IORESOURCE_MEM_64);
+
+	resource_size_t fixed_start = bus->immovable_range[is_mem64 ? 2 : 1].start;
+	resource_size_t fixed_end = bus->immovable_range[is_mem64 ? 2 : 1].end;
+	resource_size_t fixed_size = (fixed_start < fixed_end) ?
+		(fixed_end - fixed_start + 1) : 0;
+
+	if (min_size < fixed_size)
+		min_size = fixed_size;
 
 	if (!b_res)
 		return -ENOSPC;
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 95a8113c2157..efafbf816fe6 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -581,6 +581,12 @@ struct pci_bus {
 	struct list_head resources;	/* Address space routed to this bus */
 	struct resource busn_res;	/* Bus numbers routed to this bus */
 
+	/*
+	 * If there are fixed or immovable resources in the bridge window, this range
+	 * contains the lowest start address and highest end address of them.
+	 */
+	struct resource immovable_range[PCI_BRIDGE_RESOURCE_NUM];
+
 	struct pci_ops	*ops;		/* Configuration access functions */
 	struct msi_controller *msi;	/* MSI controller */
 	void		*sysdata;	/* Hook for sys-specific extension */
-- 
2.21.0



* [PATCH v5 12/23] PCI: hotplug: movable BARs: Compute limits for relocated bridge windows
  2019-08-16 16:50 [PATCH v5 00/23] PCI: Allow BAR movement during hotplug Sergey Miroshnichenko
                   ` (10 preceding siblings ...)
  2019-08-16 16:50 ` [PATCH v5 11/23] PCI: hotplug: movable BARs: Calculate immovable parts of bridge windows Sergey Miroshnichenko
@ 2019-08-16 16:50 ` " Sergey Miroshnichenko
  2019-08-16 16:50 ` [PATCH v5 13/23] PCI: Make sure bridge windows include their fixed BARs Sergey Miroshnichenko
                   ` (10 subsequent siblings)
  22 siblings, 0 replies; 30+ messages in thread
From: Sergey Miroshnichenko @ 2019-08-16 16:50 UTC (permalink / raw)
  To: linux-pci, linuxppc-dev; +Cc: Bjorn Helgaas, linux, Sergey Miroshnichenko

With movable BARs enabled, bridge windows are recalculated during each PCI
rescan. Some of the BARs below the bridge may be fixed/immovable: these
areas are represented by the .immovable_range field in struct pci_bus.

If a bridge window size is equal to its immovable range, it can only be
assigned to the start of this range. But if a bridge window size is larger,
and this difference in size is denoted as "delta", the window can start
from (immovable_range.start - delta) to (immovable_range.start), and it can
end from (immovable_range.end) to (immovable_range.end + delta). This range
(the new .realloc_range field in struct pci_bus) must then be compared with
immovable ranges of neighbouring bridges to guarantee no intersections.

This patch only calculates valid ranges for reallocated bridges during a PCI
rescan, and the next one will make use of these values during allocation.
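
A worked example of the arithmetic, as a user-space sketch: the names
struct addr_range and sim_realloc_range() are hypothetical, and only a
single neighbour located below the bridge is considered. The clamping
follows the logic of pci_bus_update_realloc_range() in this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Inclusive address range; start >= end means "empty" */
struct addr_range {
	uint64_t start;
	uint64_t end;
};

/*
 * Compute where a window of 'window_size' bytes may be placed so that it
 * still covers 'imm'. 'lower' is the immovable range of a neighbouring
 * bridge located below 'imm', or NULL if there is none.
 */
struct addr_range sim_realloc_range(const struct addr_range *imm,
				    uint64_t window_size,
				    const struct addr_range *lower)
{
	struct addr_range out;

	assert(imm->start < imm->end);

	/* Lowest possible start and highest possible end of the window */
	out.start = imm->end - window_size + 1;
	out.end = imm->start + window_size - 1;

	/* A window smaller than the immovable area can't shrink the range */
	if (out.start > imm->start)
		out.start = imm->start;
	if (out.end < imm->end)
		out.end = imm->end;

	/* Don't run into the fixed area of the neighbour below */
	if (lower && lower->start < lower->end &&
	    lower->end < imm->start && lower->end > out.start)
		out.start = lower->end;

	return out;
}
```

With an immovable range of 0x2000-0x2fff and a window size of 0x3000, delta
is 0x2000: the window may start anywhere from 0x0 to 0x2000 and end anywhere
from 0x2fff to 0x4fff. If a neighbouring bridge has fixed BARs ending at
0xfff, the lower limit is raised to that address.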

Signed-off-by: Sergey Miroshnichenko <s.miroshnichenko@yadro.com>
---
 drivers/pci/setup-bus.c | 67 +++++++++++++++++++++++++++++++++++++++++
 include/linux/pci.h     |  6 ++++
 2 files changed, 73 insertions(+)

diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index 420510a1a257..586aaa9578b2 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -1819,6 +1819,72 @@ static enum enable_type pci_realloc_detect(struct pci_bus *bus,
 }
 #endif
 
+/*
+ * Calculate the address margins where the bridge windows may be allocated to fit all
+ * the fixed and immovable BARs beneath.
+ */
+static void pci_bus_update_realloc_range(struct pci_bus *bus)
+{
+	struct pci_dev *dev;
+	struct pci_bus *parent = bus->parent;
+	int idx;
+
+	list_for_each_entry(dev, &bus->devices, bus_list)
+		if (dev->subordinate)
+			pci_bus_update_realloc_range(dev->subordinate);
+
+	if (!parent || !bus->self)
+		return;
+
+	for (idx = 0; idx < PCI_BRIDGE_RESOURCE_NUM; ++idx) {
+		struct resource *immovable_range = &bus->immovable_range[idx];
+		resource_size_t window_size = resource_size(bus->resource[idx]);
+		resource_size_t realloc_start, realloc_end;
+
+		bus->realloc_range[idx].start = 0;
+		bus->realloc_range[idx].end = 0;
+
+		/* Check if there any immovable BARs under the bridge */
+		if (immovable_range->start >= immovable_range->end)
+			continue;
+
+		/* The lowest possible address where the bridge window can start */
+		realloc_start = immovable_range->end - window_size + 1;
+		/* The highest possible address where the bridge window can end */
+		realloc_end = immovable_range->start + window_size - 1;
+
+		if (realloc_start > immovable_range->start)
+			realloc_start = immovable_range->start;
+
+		if (realloc_end < immovable_range->end)
+			realloc_end = immovable_range->end;
+
+		/*
+		 * Check that realloc range doesn't intersect with hard fixed ranges
+		 * of neighboring bridges
+		 */
+		list_for_each_entry(dev, &parent->devices, bus_list) {
+			struct pci_bus *neighbor = dev->subordinate;
+			struct resource *n_imm_range;
+
+			if (!neighbor || neighbor == bus)
+				continue;
+
+			n_imm_range = &neighbor->immovable_range[idx];
+
+			if (n_imm_range->start >= n_imm_range->end)
+				continue;
+
+			if (n_imm_range->end < immovable_range->start &&
+			    n_imm_range->end > realloc_start)
+				realloc_start = n_imm_range->end;
+		}
+
+		bus->realloc_range[idx].start = realloc_start;
+		bus->realloc_range[idx].end = realloc_end;
+	}
+}
+
 /*
  * First try will not touch PCI bridge res.
  * Second and later try will clear small leaf bridge res.
@@ -1838,6 +1904,7 @@ void pci_assign_unassigned_root_bus_resources(struct pci_bus *bus)
 
 	if (pci_movable_bars_enabled()) {
 		__pci_bus_size_bridges(bus, NULL);
+		pci_bus_update_realloc_range(bus);
 		__pci_bus_assign_resources(bus, NULL, NULL);
 
 		goto dump;
diff --git a/include/linux/pci.h b/include/linux/pci.h
index efafbf816fe6..bf6638cf2525 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -587,6 +587,12 @@ struct pci_bus {
 	 */
 	struct resource immovable_range[PCI_BRIDGE_RESOURCE_NUM];
 
+	/*
+	 * Acceptable address range, where the bridge window may reside, considering its
+	 * size, so it will cover all the fixed and immovable BARs below.
+	 */
+	struct resource realloc_range[PCI_BRIDGE_RESOURCE_NUM];
+
 	struct pci_ops	*ops;		/* Configuration access functions */
 	struct msi_controller *msi;	/* MSI controller */
 	void		*sysdata;	/* Hook for sys-specific extension */
-- 
2.21.0


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v5 13/23] PCI: Make sure bridge windows include their fixed BARs
  2019-08-16 16:50 [PATCH v5 00/23] PCI: Allow BAR movement during hotplug Sergey Miroshnichenko
                   ` (11 preceding siblings ...)
  2019-08-16 16:50 ` [PATCH v5 12/23] PCI: hotplug: movable BARs: Compute limits for relocated " Sergey Miroshnichenko
@ 2019-08-16 16:50 ` Sergey Miroshnichenko
  2019-08-16 16:50 ` [PATCH v5 14/23] PCI: Fix assigning the fixed prefetchable resources Sergey Miroshnichenko
                   ` (9 subsequent siblings)
  22 siblings, 0 replies; 30+ messages in thread
From: Sergey Miroshnichenko @ 2019-08-16 16:50 UTC (permalink / raw)
  To: linux-pci, linuxppc-dev; +Cc: Bjorn Helgaas, linux, Sergey Miroshnichenko

When the time comes to select a start address for the bridge window during
the root bus rescan, the lowest possible address is not always acceptable:
the window must cover all the underlying fixed and immovable BARs. The
lowest address that satisfies this requirement is the start of the
.realloc_range field of struct pci_bus, which is calculated while preparing
for the rescan.
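
As a rough userspace illustration of the constraint described above (not
the kernel code itself), the sketch below derives the placement limits of a
window from the span of immovable BARs and the window size, and checks
whether a chosen window covers that span. The names struct addr_range,
window_placement_limits() and window_covers() are hypothetical; clamping
against neighboring bridges is left out, and overflow at the top of the
address space is assumed not to occur.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Inclusive address range, similar in spirit to start/end of struct resource. */
struct addr_range {
	uint64_t start;
	uint64_t end;
};

/*
 * Hypothetical helper: return the range of addresses inside which a window
 * of window_size bytes may be placed while still covering "immovable".
 */
static struct addr_range window_placement_limits(struct addr_range immovable,
						 uint64_t window_size)
{
	struct addr_range limits;

	assert(immovable.start < immovable.end);
	assert(window_size > 0);

	/* Lowest start: the window ends exactly at the last immovable byte. */
	if (window_size > immovable.end)
		limits.start = 0;	/* avoid unsigned wrap-around */
	else
		limits.start = immovable.end - window_size + 1;

	/* Highest end: the window starts exactly at the first immovable byte. */
	limits.end = immovable.start + window_size - 1;

	/* A window smaller than the span cannot slide; clamp to the span. */
	if (limits.start > immovable.start)
		limits.start = immovable.start;
	if (limits.end < immovable.end)
		limits.end = immovable.end;

	return limits;
}

/* True if "window" fully contains the fixed area. */
static bool window_covers(struct addr_range window, struct addr_range fixed)
{
	return window.start <= fixed.start && window.end >= fixed.end;
}
```

With an immovable span of 0x10000-0x10fff and a 0x4000-byte window, the
sketch yields limits of 0xd000-0x13fff, so 0xd000 would be the lowest
acceptable start address for that window.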

Signed-off-by: Sergey Miroshnichenko <s.miroshnichenko@yadro.com>
---
 drivers/pci/bus.c       |  2 +-
 drivers/pci/setup-res.c | 28 ++++++++++++++++++++++++++--
 2 files changed, 27 insertions(+), 3 deletions(-)

diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c
index 495059d923f7..7aae830751e9 100644
--- a/drivers/pci/bus.c
+++ b/drivers/pci/bus.c
@@ -192,7 +192,7 @@ static int pci_bus_alloc_from_region(struct pci_bus *bus, struct resource *res,
 		 * this is an already-configured bridge window, its start
 		 * overrides "min".
 		 */
-		if (avail.start)
+		if (min_used < avail.start)
 			min_used = avail.start;
 
 		max = avail.end;
diff --git a/drivers/pci/setup-res.c b/drivers/pci/setup-res.c
index 732d18f60f1b..7357bcc12a53 100644
--- a/drivers/pci/setup-res.c
+++ b/drivers/pci/setup-res.c
@@ -248,9 +248,20 @@ static int __pci_assign_resource(struct pci_bus *bus, struct pci_dev *dev,
 	struct resource *res = dev->resource + resno;
 	resource_size_t min;
 	int ret;
+	resource_size_t start = (resource_size_t)-1;
+	resource_size_t end = 0;
 
 	min = (res->flags & IORESOURCE_IO) ? PCIBIOS_MIN_IO : PCIBIOS_MIN_MEM;
 
+	if (dev->subordinate && resno >= PCI_BRIDGE_RESOURCES) {
+		struct pci_bus *child_bus = dev->subordinate;
+		int b_resno = resno - PCI_BRIDGE_RESOURCES;
+		struct resource *immovable_range = &child_bus->immovable_range[b_resno];
+
+		if (immovable_range->start < immovable_range->end)
+			min = child_bus->realloc_range[b_resno].start;
+	}
+
 	/*
 	 * First, try exact prefetching match.  Even if a 64-bit
 	 * prefetchable bridge window is below 4GB, we can't put a 32-bit
@@ -262,7 +273,7 @@ static int __pci_assign_resource(struct pci_bus *bus, struct pci_dev *dev,
 				     IORESOURCE_PREFETCH | IORESOURCE_MEM_64,
 				     pcibios_align_resource, dev);
 	if (ret == 0)
-		return 0;
+		goto check_fixed;
 
 	/*
 	 * If the prefetchable window is only 32 bits wide, we can put
@@ -274,7 +285,7 @@ static int __pci_assign_resource(struct pci_bus *bus, struct pci_dev *dev,
 					     IORESOURCE_PREFETCH,
 					     pcibios_align_resource, dev);
 		if (ret == 0)
-			return 0;
+			goto check_fixed;
 	}
 
 	/*
@@ -287,6 +298,19 @@ static int __pci_assign_resource(struct pci_bus *bus, struct pci_dev *dev,
 		ret = pci_bus_alloc_resource(bus, res, size, align, min, 0,
 					     pcibios_align_resource, dev);
 
+check_fixed:
+	if (ret == 0 && start < end) {
+		if (res->start > start || res->end < end) {
+			dev_err(&bus->dev, "fixed area 0x%llx-0x%llx for %s doesn't fit in the allocated %pR (0x%llx-0x%llx)\n",
+				(unsigned long long)start, (unsigned long long)end,
+				dev_name(&dev->dev),
+				res, (unsigned long long)res->start,
+				(unsigned long long)res->end);
+			release_resource(res);
+			return -1;
+		}
+	}
+
 	return ret;
 }
 
-- 
2.21.0


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v5 14/23] PCI: Fix assigning the fixed prefetchable resources
  2019-08-16 16:50 [PATCH v5 00/23] PCI: Allow BAR movement during hotplug Sergey Miroshnichenko
                   ` (12 preceding siblings ...)
  2019-08-16 16:50 ` [PATCH v5 13/23] PCI: Make sure bridge windows include their fixed BARs Sergey Miroshnichenko
@ 2019-08-16 16:50 ` Sergey Miroshnichenko
  2019-08-16 16:50 ` [PATCH v5 15/23] PCI: hotplug: movable BARs: Assign fixed and immovable BARs before others Sergey Miroshnichenko
                   ` (8 subsequent siblings)
  22 siblings, 0 replies; 30+ messages in thread
From: Sergey Miroshnichenko @ 2019-08-16 16:50 UTC (permalink / raw)
  To: linux-pci, linuxppc-dev; +Cc: Bjorn Helgaas, linux, Sergey Miroshnichenko

Allow matching IORESOURCE_PCI_FIXED prefetchable BARs to non-prefetchable
windows, so they follow the same rules as immovable BARs.
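
The matching rule can be sketched in plain C as below. This is only an
illustration under stated assumptions: the RES_* flag values and the
window_accepts_fixed_bar() helper are hypothetical stand-ins, not the
kernel's IORESOURCE_* definitions, and the containment check done by
resource_contains() is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative flag values only; the kernel defines its own flags. */
#define RES_IO		0x1u
#define RES_MEM		0x2u
#define RES_TYPE_MASK	(RES_IO | RES_MEM)
#define RES_PREFETCH	0x4u

/*
 * Hypothetical helper: may a fixed BAR with bar_flags be placed into a
 * bridge window with window_flags?
 */
static bool window_accepts_fixed_bar(unsigned int window_flags,
				     unsigned int bar_flags)
{
	assert(bar_flags & RES_TYPE_MASK);

	/* The resource type (IO vs. MEM) must always match. */
	if ((window_flags & RES_TYPE_MASK) != (bar_flags & RES_TYPE_MASK))
		return false;

	/* A non-prefetchable BAR must not land in a prefetchable window. */
	if ((window_flags & RES_PREFETCH) && !(bar_flags & RES_PREFETCH))
		return false;

	/* A prefetchable BAR in a non-prefetchable window is acceptable. */
	return true;
}
```

In this sketch only one direction of the prefetchable mismatch is
rejected, which is the asymmetry the patch introduces.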

Signed-off-by: Sergey Miroshnichenko <s.miroshnichenko@yadro.com>
---
 drivers/pci/setup-bus.c | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index 586aaa9578b2..6f12411357f3 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -1340,15 +1340,20 @@ static void assign_fixed_resource_on_bus(struct pci_bus *b, struct resource *r)
 {
 	int i;
 	struct resource *parent_r;
-	unsigned long mask = IORESOURCE_IO | IORESOURCE_MEM |
-			     IORESOURCE_PREFETCH;
+	unsigned long mask = IORESOURCE_TYPE_BITS;
 
 	pci_bus_for_each_resource(b, parent_r, i) {
 		if (!parent_r)
 			continue;
 
-		if ((r->flags & mask) == (parent_r->flags & mask) &&
-		    resource_contains(parent_r, r))
+		if ((r->flags & mask) != (parent_r->flags & mask))
+			continue;
+
+		if (parent_r->flags & IORESOURCE_PREFETCH &&
+		    !(r->flags & IORESOURCE_PREFETCH))
+			continue;
+
+		if (resource_contains(parent_r, r))
 			request_resource(parent_r, r);
 	}
 }
-- 
2.21.0


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v5 15/23] PCI: hotplug: movable BARs: Assign fixed and immovable BARs before others
  2019-08-16 16:50 [PATCH v5 00/23] PCI: Allow BAR movement during hotplug Sergey Miroshnichenko
                   ` (13 preceding siblings ...)
  2019-08-16 16:50 ` [PATCH v5 14/23] PCI: Fix assigning the fixed prefetchable resources Sergey Miroshnichenko
@ 2019-08-16 16:50 ` Sergey Miroshnichenko
  2019-08-16 16:50 ` [PATCH v5 16/23] PCI: hotplug: movable BARs: Don't reserve IO/mem bus space Sergey Miroshnichenko
                   ` (7 subsequent siblings)
  22 siblings, 0 replies; 30+ messages in thread
From: Sergey Miroshnichenko @ 2019-08-16 16:50 UTC (permalink / raw)
  To: linux-pci, linuxppc-dev; +Cc: Bjorn Helgaas, linux, Sergey Miroshnichenko

Reassign resources during rescan in two steps: first the fixed/immovable
BARs and the bridge windows that contain fixed areas, so the movable ones
cannot steal these reserved areas; then everything else, so the movable
BARs divide the remaining space.

With this change, pci_assign_resource() is able to assign all types of
BARs, so pdev_assign_fixed_resources() becomes unused and is removed.
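
The ordering can be illustrated with a small userspace sketch that walks
the same list twice. The struct request type and the assign_pass() and
assign_in_two_steps() helpers are hypothetical; the real code calls
pci_assign_resource() where this sketch merely records the order in which
a request was served.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum assign_step {
	ASSIGN_FIXED,
	ASSIGN_FLOAT,
};

/* Hypothetical stand-in for an entry of the sorted resource list. */
struct request {
	const char *name;
	bool is_fixed;
	int order;	/* 1-based position in which it was served; 0 = not yet */
};

/* Serve only the requests that belong to the given step. */
static int assign_pass(struct request *reqs, size_t n,
		       enum assign_step step, int next)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if ((step == ASSIGN_FIXED) != reqs[i].is_fixed)
			continue;
		reqs[i].order = next++;
	}

	return next;
}

/* Fixed requests first, then the movable ones share what is left. */
static void assign_in_two_steps(struct request *reqs, size_t n)
{
	int next;

	assert(reqs || n == 0);

	next = assign_pass(reqs, n, ASSIGN_FIXED, 1);
	assign_pass(reqs, n, ASSIGN_FLOAT, next);
}
```

The list order is kept within each pass, so the existing sorting by
alignment still applies to the fixed and to the movable group separately.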

Signed-off-by: Sergey Miroshnichenko <s.miroshnichenko@yadro.com>
---
 drivers/pci/pci.h       |  2 ++
 drivers/pci/setup-bus.c | 79 ++++++++++++++++++++++++-----------------
 drivers/pci/setup-res.c |  8 +++--
 3 files changed, 55 insertions(+), 34 deletions(-)

diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 12add575faf1..e1fcc46f9c40 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -260,6 +260,8 @@ void pci_disable_bridge_window(struct pci_dev *dev);
 
 bool pci_dev_movable_bars_supported(struct pci_dev *dev);
 
+int assign_fixed_resource_on_bus(struct pci_bus *b, struct resource *r);
+
 /* PCIe link information */
 #define PCIE_SPEED2STR(speed) \
 	((speed) == PCIE_SPEED_16_0GT ? "16 GT/s" : \
diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index 6f12411357f3..c7b7e30c6284 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -38,6 +38,15 @@ struct pci_dev_resource {
 	unsigned long flags;
 };
 
+enum assign_step {
+	assign_fixed_resources,
+	assign_float_resources,
+};
+
+static void _assign_requested_resources_sorted(struct list_head *head,
+					       struct list_head *fail_head,
+					       enum assign_step step);
+
 static void free_list(struct list_head *head)
 {
 	struct pci_dev_resource *dev_res, *tmp;
@@ -278,19 +287,48 @@ static void reassign_resources_sorted(struct list_head *realloc_head,
  */
 static void assign_requested_resources_sorted(struct list_head *head,
 				 struct list_head *fail_head)
+{
+	_assign_requested_resources_sorted(head, fail_head, assign_fixed_resources);
+	_assign_requested_resources_sorted(head, fail_head, assign_float_resources);
+}
+
+static void _assign_requested_resources_sorted(struct list_head *head,
+					       struct list_head *fail_head,
+					       enum assign_step step)
 {
 	struct resource *res;
 	struct pci_dev_resource *dev_res;
 	int idx;
 
 	list_for_each_entry(dev_res, head, list) {
+		bool is_fixed = false;
+
 		if (!pci_dev_bars_enabled(dev_res->dev))
 			continue;
 
 		res = dev_res->res;
+		if (!resource_size(res))
+			continue;
+
 		idx = res - &dev_res->dev->resource[0];
-		if (resource_size(res) &&
-		    pci_assign_resource(dev_res->dev, idx)) {
+
+		if (idx < PCI_BRIDGE_RESOURCES) {
+			is_fixed = (res->flags & IORESOURCE_PCI_FIXED) ||
+				!pci_dev_movable_bars_supported(dev_res->dev);
+		} else {
+			int b_res_idx = pci_get_bridge_resource_idx(res);
+			struct resource *fixed_res =
+				&dev_res->dev->subordinate->immovable_range[b_res_idx];
+
+			is_fixed = (fixed_res->start < fixed_res->end);
+		}
+
+		if (assign_fixed_resources == step && !is_fixed)
+			continue;
+		else if (assign_float_resources == step && is_fixed)
+			continue;
+
+		if (pci_assign_resource(dev_res->dev, idx)) {
 			if (fail_head) {
 				/*
 				 * If the failed resource is a ROM BAR and
@@ -1336,7 +1374,7 @@ void pci_bus_size_bridges(struct pci_bus *bus)
 }
 EXPORT_SYMBOL(pci_bus_size_bridges);
 
-static void assign_fixed_resource_on_bus(struct pci_bus *b, struct resource *r)
+int assign_fixed_resource_on_bus(struct pci_bus *b, struct resource *r)
 {
 	int i;
 	struct resource *parent_r;
@@ -1353,35 +1391,14 @@ static void assign_fixed_resource_on_bus(struct pci_bus *b, struct resource *r)
 		    !(r->flags & IORESOURCE_PREFETCH))
 			continue;
 
-		if (resource_contains(parent_r, r))
-			request_resource(parent_r, r);
-	}
-}
-
-/*
- * Try to assign any resources marked as IORESOURCE_PCI_FIXED, as they are
- * skipped by pbus_assign_resources_sorted().
- */
-static void pdev_assign_fixed_resources(struct pci_dev *dev)
-{
-	int i;
-
-	for (i = 0; i <  PCI_NUM_RESOURCES; i++) {
-		struct pci_bus *b;
-		struct resource *r = &dev->resource[i];
-
-		if (r->parent || !(r->flags & IORESOURCE_PCI_FIXED) ||
-		    !(r->flags & (IORESOURCE_IO | IORESOURCE_MEM)))
-			continue;
-
-		b = dev->bus;
-		while (b && !r->parent) {
-			assign_fixed_resource_on_bus(b, r);
-			b = b->parent;
-			if (!r->parent && pci_movable_bars_enabled())
-				break;
+		if (resource_contains(parent_r, r)) {
+			if (!request_resource(parent_r, r))
+				return 0;
 		}
 	}
+
+	dev_err(&b->dev, "failed to assign immovable %pR\n", r);
+	return -EBUSY;
 }
 
 void __pci_bus_assign_resources(const struct pci_bus *bus,
@@ -1397,8 +1414,6 @@ void __pci_bus_assign_resources(const struct pci_bus *bus,
 		if (!pci_dev_bars_enabled(dev))
 			continue;
 
-		pdev_assign_fixed_resources(dev);
-
 		b = dev->subordinate;
 		if (!b)
 			continue;
diff --git a/drivers/pci/setup-res.c b/drivers/pci/setup-res.c
index 7357bcc12a53..02620cfac0c7 100644
--- a/drivers/pci/setup-res.c
+++ b/drivers/pci/setup-res.c
@@ -348,8 +348,12 @@ int pci_assign_resource(struct pci_dev *dev, int resno)
 	resource_size_t align, size;
 	int ret;
 
-	if (res->flags & IORESOURCE_PCI_FIXED)
-		return 0;
+	if ((res->flags & IORESOURCE_PCI_FIXED) ||
+	    (resno < PCI_BRIDGE_RESOURCES &&
+	     !pci_dev_movable_bars_supported(dev) &&
+	     res->start)) {
+		return assign_fixed_resource_on_bus(dev->bus, res);
+	}
 
 	res->flags |= IORESOURCE_UNSET;
 	align = pci_resource_alignment(dev, res);
-- 
2.21.0


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v5 16/23] PCI: hotplug: movable BARs: Don't reserve IO/mem bus space
  2019-08-16 16:50 [PATCH v5 00/23] PCI: Allow BAR movement during hotplug Sergey Miroshnichenko
                   ` (14 preceding siblings ...)
  2019-08-16 16:50 ` [PATCH v5 15/23] PCI: hotplug: movable BARs: Assign fixed and immovable BARs before others Sergey Miroshnichenko
@ 2019-08-16 16:50 ` Sergey Miroshnichenko
  2019-09-04  5:42   ` Oliver O'Halloran
  2019-08-16 16:50 ` [PATCH v5 17/23] powerpc/pci: Fix crash with enabled movable BARs Sergey Miroshnichenko
                   ` (6 subsequent siblings)
  22 siblings, 1 reply; 30+ messages in thread
From: Sergey Miroshnichenko @ 2019-08-16 16:50 UTC (permalink / raw)
  To: linux-pci, linuxppc-dev; +Cc: Bjorn Helgaas, linux, Sergey Miroshnichenko

A hotplugged bridge with many hotplug-capable ports may request
reserving more IO space than the machine has. This could be overridden
with the "hpiosize=" kernel argument though.

But when BARs are movable, there is no need to reserve space anymore:
new BARs are allocated not from reserved gaps, but by rearranging the
existing BARs. Requesting a precise amount of space for bridge windows
increases the chances of adding the new bridge successfully.

Signed-off-by: Sergey Miroshnichenko <s.miroshnichenko@yadro.com>
---
 drivers/pci/setup-bus.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index c7b7e30c6284..7d64ec8e7088 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -1287,7 +1287,7 @@ void __pci_bus_size_bridges(struct pci_bus *bus, struct list_head *realloc_head)
 
 	case PCI_HEADER_TYPE_BRIDGE:
 		pci_bridge_check_ranges(bus);
-		if (bus->self->is_hotplug_bridge) {
+		if (bus->self->is_hotplug_bridge && !pci_movable_bars_enabled()) {
 			additional_io_size  = pci_hotplug_io_size;
 			additional_mem_size = pci_hotplug_mem_size;
 		}
-- 
2.21.0


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v5 17/23] powerpc/pci: Fix crash with enabled movable BARs
  2019-08-16 16:50 [PATCH v5 00/23] PCI: Allow BAR movement during hotplug Sergey Miroshnichenko
                   ` (15 preceding siblings ...)
  2019-08-16 16:50 ` [PATCH v5 16/23] PCI: hotplug: movable BARs: Don't reserve IO/mem bus space Sergey Miroshnichenko
@ 2019-08-16 16:50 ` Sergey Miroshnichenko
  2019-08-16 16:50 ` [PATCH v5 18/23] powerpc/pci: Handle BAR movement Sergey Miroshnichenko
                   ` (5 subsequent siblings)
  22 siblings, 0 replies; 30+ messages in thread
From: Sergey Miroshnichenko @ 2019-08-16 16:50 UTC (permalink / raw)
  To: linux-pci, linuxppc-dev
  Cc: Bjorn Helgaas, linux, Sergey Miroshnichenko, Alexey Kardashevskiy

Add a check for the UNSET resource flag to skip the released BARs.

CC: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Sergey Miroshnichenko <s.miroshnichenko@yadro.com>
---
 arch/powerpc/platforms/powernv/pci-ioda.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index d8080558d020..362eac42f463 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -2986,7 +2986,8 @@ static void pnv_ioda_setup_pe_res(struct pnv_ioda_pe *pe,
 	int index;
 	int64_t rc;
 
-	if (!res || !res->flags || res->start > res->end)
+	if (!res || !res->flags || res->start > res->end ||
+	    (res->flags & IORESOURCE_UNSET))
 		return;
 
 	if (res->flags & IORESOURCE_IO) {
-- 
2.21.0


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v5 18/23] powerpc/pci: Handle BAR movement
  2019-08-16 16:50 [PATCH v5 00/23] PCI: Allow BAR movement during hotplug Sergey Miroshnichenko
                   ` (16 preceding siblings ...)
  2019-08-16 16:50 ` [PATCH v5 17/23] powerpc/pci: Fix crash with enabled movable BARs Sergey Miroshnichenko
@ 2019-08-16 16:50 ` Sergey Miroshnichenko
  2019-09-04  5:37   ` Oliver O'Halloran
  2019-08-16 16:50 ` [PATCH v5 19/23] PCI: hotplug: Configure MPS for hot-added bridges during bus rescan Sergey Miroshnichenko
                   ` (4 subsequent siblings)
  22 siblings, 1 reply; 30+ messages in thread
From: Sergey Miroshnichenko @ 2019-08-16 16:50 UTC (permalink / raw)
  To: linux-pci, linuxppc-dev
  Cc: Bjorn Helgaas, linux, Sergey Miroshnichenko, Oliver O'Halloran

Add pcibios_rescan_prepare()/_done() hooks for the powerpc platform. Now if
the device's driver supports movable BARs, pcibios_rescan_prepare() will be
called after the device is stopped, and pcibios_rescan_done() before it
resumes. There are no memory requests to this device between the hooks, so
it is safe to rebuild the EEH address cache in that interval.

CC: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Sergey Miroshnichenko <s.miroshnichenko@yadro.com>
---
 arch/powerpc/kernel/pci-hotplug.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/arch/powerpc/kernel/pci-hotplug.c b/arch/powerpc/kernel/pci-hotplug.c
index 0b0cf8168b47..18cf13bba228 100644
--- a/arch/powerpc/kernel/pci-hotplug.c
+++ b/arch/powerpc/kernel/pci-hotplug.c
@@ -144,3 +144,13 @@ void pci_hp_add_devices(struct pci_bus *bus)
 	pcibios_finish_adding_to_bus(bus);
 }
 EXPORT_SYMBOL_GPL(pci_hp_add_devices);
+
+void pcibios_rescan_prepare(struct pci_dev *pdev)
+{
+	eeh_addr_cache_rmv_dev(pdev);
+}
+
+void pcibios_rescan_done(struct pci_dev *pdev)
+{
+	eeh_addr_cache_insert_dev(pdev);
+}
-- 
2.21.0


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v5 19/23] PCI: hotplug: Configure MPS for hot-added bridges during bus rescan
  2019-08-16 16:50 [PATCH v5 00/23] PCI: Allow BAR movement during hotplug Sergey Miroshnichenko
                   ` (17 preceding siblings ...)
  2019-08-16 16:50 ` [PATCH v5 18/23] powerpc/pci: Handle BAR movement Sergey Miroshnichenko
@ 2019-08-16 16:50 ` Sergey Miroshnichenko
  2019-08-16 16:50 ` [PATCH v5 20/23] PCI: hotplug: movable BARs: Enable the feature by default Sergey Miroshnichenko
                   ` (3 subsequent siblings)
  22 siblings, 0 replies; 30+ messages in thread
From: Sergey Miroshnichenko @ 2019-08-16 16:50 UTC (permalink / raw)
  To: linux-pci, linuxppc-dev; +Cc: Bjorn Helgaas, linux, Sergey Miroshnichenko

Ensure that MPS settings are set up for bridges which are discovered
during a manually triggered rescan via sysfs. This sequence of bridge
init (using pci_rescan_bus()) will be used for pciehp hot-add events
when BARs are movable.

Signed-off-by: Sergey Miroshnichenko <s.miroshnichenko@yadro.com>
---
 drivers/pci/probe.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 5f52a19738aa..4bb10d27cb3a 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -3688,7 +3688,7 @@ static void pci_reassign_root_bus_resources(struct pci_bus *root)
 unsigned int pci_rescan_bus(struct pci_bus *bus)
 {
 	unsigned int max;
-	struct pci_bus *root = bus;
+	struct pci_bus *root = bus, *child;
 
 	while (!pci_is_root_bus(root))
 		root = root->parent;
@@ -3708,6 +3708,9 @@ unsigned int pci_rescan_bus(struct pci_bus *bus)
 		pci_assign_unassigned_bus_resources(bus);
 	}
 
+	list_for_each_entry(child, &root->children, node)
+		pcie_bus_configure_settings(child);
+
 	pci_bus_add_devices(bus);
 
 	return max;
-- 
2.21.0


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v5 20/23] PCI: hotplug: movable BARs: Enable the feature by default
  2019-08-16 16:50 [PATCH v5 00/23] PCI: Allow BAR movement during hotplug Sergey Miroshnichenko
                   ` (18 preceding siblings ...)
  2019-08-16 16:50 ` [PATCH v5 19/23] PCI: hotplug: Configure MPS for hot-added bridges during bus rescan Sergey Miroshnichenko
@ 2019-08-16 16:50 ` Sergey Miroshnichenko
  2019-08-16 16:50 ` [PATCH v5 21/23] nvme-pci: Handle movable BARs Sergey Miroshnichenko
                   ` (2 subsequent siblings)
  22 siblings, 0 replies; 30+ messages in thread
From: Sergey Miroshnichenko @ 2019-08-16 16:50 UTC (permalink / raw)
  To: linux-pci, linuxppc-dev; +Cc: Bjorn Helgaas, linux, Sergey Miroshnichenko

This is the last patch in the series which implements the essentials of the
Movable BARs feature, so it is turned on by default now. Tested on:

 - x86_64 with "pci=realloc,assign-busses,use_crs,pcie_bus_peer2peer"
   command line argument;
 - POWER8 PowerNV+PHB3 ppc64le with "pci=realloc,pcie_bus_peer2peer".

In case of problems it can still be overridden by the following command
line option:

    pcie_movable_bars=off

Signed-off-by: Sergey Miroshnichenko <s.miroshnichenko@yadro.com>
---
 drivers/pci/pci-driver.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index d11909e79263..a8124e47bf6e 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -1688,8 +1688,6 @@ static int __init pci_driver_init(void)
 {
 	int ret;
 
-	pci_add_flags(PCI_IMMOVABLE_BARS);
-
 	ret = bus_register(&pci_bus_type);
 	if (ret)
 		return ret;
-- 
2.21.0


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v5 21/23] nvme-pci: Handle movable BARs
  2019-08-16 16:50 [PATCH v5 00/23] PCI: Allow BAR movement during hotplug Sergey Miroshnichenko
                   ` (19 preceding siblings ...)
  2019-08-16 16:50 ` [PATCH v5 20/23] PCI: hotplug: movable BARs: Enable the feature by default Sergey Miroshnichenko
@ 2019-08-16 16:50 ` Sergey Miroshnichenko
  2019-08-16 16:51 ` [PATCH v5 22/23] PCI/portdrv: Declare support of " Sergey Miroshnichenko
  2019-08-16 16:51 ` [PATCH v5 23/23] PCI: pciehp: movable BARs: Trigger a domain rescan on hp events Sergey Miroshnichenko
  22 siblings, 0 replies; 30+ messages in thread
From: Sergey Miroshnichenko @ 2019-08-16 16:50 UTC (permalink / raw)
  To: linux-pci, linuxppc-dev
  Cc: Bjorn Helgaas, linux, Sergey Miroshnichenko, linux-nvme,
	Christoph Hellwig

Hotplugged devices can affect the existing ones by moving their BARs. The
PCI subsystem will inform the NVME driver about this by invoking the
.rescan_prepare() and .rescan_done() hooks, so the BARs can be re-mapped.
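
The contract between the two hooks can be pictured with a userspace mock.
Everything here is a hypothetical stand-in (struct mock_pdev, struct
mock_drv and both functions); a real driver would quiesce its queues and
use iounmap()/ioremap() where this sketch only updates plain fields.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for a PCI device: only the BAR 0 bus address matters here. */
struct mock_pdev {
	uint64_t bar0_start;
};

/* Stand-in for driver state. */
struct mock_drv {
	struct mock_pdev *pdev;
	uint64_t mapped_at;	/* address currently mapped, 0 = not mapped */
	int paused;		/* non-zero while no I/O may be issued */
};

/* Called before the rescan: stop I/O and drop the mapping of the old BAR. */
static void mock_rescan_prepare(struct mock_drv *drv)
{
	assert(drv && drv->pdev);

	drv->paused = 1;
	drv->mapped_at = 0;
}

/* Called after the rescan: map the BAR wherever it is now and resume I/O. */
static void mock_rescan_done(struct mock_drv *drv)
{
	assert(drv && drv->pdev);

	drv->mapped_at = drv->pdev->bar0_start;
	drv->paused = 0;
}
```

Between the two calls the mock holds no mapping at all, which is why the
PCI core is free to change bar0_start in that window.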

Tested under the "randrw" mode of the fio tool. Before the hotplugging:

  % sudo cat /proc/iomem
  ...
                3fe800000000-3fe8007fffff : PCI Bus 0020:0b
                  3fe800000000-3fe8007fffff : PCI Bus 0020:18
                    3fe800000000-3fe8000fffff : 0020:18:00.0
                      3fe800000000-3fe8000fffff : nvme
                    3fe800100000-3fe80017ffff : 0020:18:00.0
  ...

Then another NVME drive was hot-added, so the BARs of 0020:18:00.0 were
moved:

  % sudo cat /proc/iomem
    ...
                3fe800000000-3fe800ffffff : PCI Bus 0020:0b
                  3fe800000000-3fe8007fffff : PCI Bus 0020:10
                    3fe800000000-3fe800003fff : 0020:10:00.0
                      3fe800000000-3fe800003fff : nvme
                    3fe800010000-3fe80001ffff : 0020:10:00.0
                  3fe800800000-3fe800ffffff : PCI Bus 0020:18
                    3fe800800000-3fe8008fffff : 0020:18:00.0
                      3fe800800000-3fe8008fffff : nvme
                    3fe800900000-3fe80097ffff : 0020:18:00.0
    ...

During the rescan, both READ and WRITE speeds drop to zero for a while
because the driver is paused, then recover.

Cc: linux-nvme@lists.infradead.org
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sergey Miroshnichenko <s.miroshnichenko@yadro.com>
---
 drivers/nvme/host/pci.c | 21 ++++++++++++++++++++-
 1 file changed, 20 insertions(+), 1 deletion(-)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index db160cee42ad..a805d80082ca 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -1645,7 +1645,7 @@ static int nvme_remap_bar(struct nvme_dev *dev, unsigned long size)
 {
 	struct pci_dev *pdev = to_pci_dev(dev->dev);
 
-	if (size <= dev->bar_mapped_size)
+	if (dev->bar && size <= dev->bar_mapped_size)
 		return 0;
 	if (size > pci_resource_len(pdev, 0))
 		return -ENOMEM;
@@ -2980,6 +2980,23 @@ static void nvme_error_resume(struct pci_dev *pdev)
 	flush_work(&dev->ctrl.reset_work);
 }
 
+static void nvme_rescan_prepare(struct pci_dev *pdev)
+{
+	struct nvme_dev *dev = pci_get_drvdata(pdev);
+
+	nvme_dev_disable(dev, false);
+	nvme_dev_unmap(dev);
+	dev->bar = NULL;
+}
+
+static void nvme_rescan_done(struct pci_dev *pdev)
+{
+	struct nvme_dev *dev = pci_get_drvdata(pdev);
+
+	nvme_dev_map(dev);
+	nvme_reset_ctrl_sync(&dev->ctrl);
+}
+
 static const struct pci_error_handlers nvme_err_handler = {
 	.error_detected	= nvme_error_detected,
 	.slot_reset	= nvme_slot_reset,
@@ -3049,6 +3066,8 @@ static struct pci_driver nvme_driver = {
 #endif
 	.sriov_configure = pci_sriov_configure_simple,
 	.err_handler	= &nvme_err_handler,
+	.rescan_prepare	= nvme_rescan_prepare,
+	.rescan_done	= nvme_rescan_done,
 };
 
 static int __init nvme_init(void)
-- 
2.21.0


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v5 22/23] PCI/portdrv: Declare support of movable BARs
  2019-08-16 16:50 [PATCH v5 00/23] PCI: Allow BAR movement during hotplug Sergey Miroshnichenko
                   ` (20 preceding siblings ...)
  2019-08-16 16:50 ` [PATCH v5 21/23] nvme-pci: Handle movable BARs Sergey Miroshnichenko
@ 2019-08-16 16:51 ` " Sergey Miroshnichenko
  2019-08-16 16:51 ` [PATCH v5 23/23] PCI: pciehp: movable BARs: Trigger a domain rescan on hp events Sergey Miroshnichenko
  22 siblings, 0 replies; 30+ messages in thread
From: Sergey Miroshnichenko @ 2019-08-16 16:51 UTC (permalink / raw)
  To: linux-pci, linuxppc-dev; +Cc: Bjorn Helgaas, linux, Sergey Miroshnichenko

A switch's BARs are not used by the portdrv driver, but they are still
considered immovable until the .rescan_prepare() and .rescan_done() hooks
are added. Add these hooks to increase the chances of allocating new BARs.

Signed-off-by: Sergey Miroshnichenko <s.miroshnichenko@yadro.com>
---
 drivers/pci/pcie/portdrv_pci.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/drivers/pci/pcie/portdrv_pci.c b/drivers/pci/pcie/portdrv_pci.c
index 0a87091a0800..9dbddc7faaa7 100644
--- a/drivers/pci/pcie/portdrv_pci.c
+++ b/drivers/pci/pcie/portdrv_pci.c
@@ -197,6 +197,14 @@ static const struct pci_error_handlers pcie_portdrv_err_handler = {
 	.resume = pcie_portdrv_err_resume,
 };
 
+static void pcie_portdrv_rescan_prepare(struct pci_dev *pdev)
+{
+}
+
+static void pcie_portdrv_rescan_done(struct pci_dev *pdev)
+{
+}
+
 static struct pci_driver pcie_portdriver = {
 	.name		= "pcieport",
 	.id_table	= &port_pci_ids[0],
@@ -207,6 +215,9 @@ static struct pci_driver pcie_portdriver = {
 
 	.err_handler	= &pcie_portdrv_err_handler,
 
+	.rescan_prepare	= pcie_portdrv_rescan_prepare,
+	.rescan_done	= pcie_portdrv_rescan_done,
+
 	.driver.pm	= PCIE_PORTDRV_PM_OPS,
 };
 
-- 
2.21.0


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v5 23/23] PCI: pciehp: movable BARs: Trigger a domain rescan on hp events
  2019-08-16 16:50 [PATCH v5 00/23] PCI: Allow BAR movement during hotplug Sergey Miroshnichenko
                   ` (21 preceding siblings ...)
  2019-08-16 16:51 ` [PATCH v5 22/23] PCI/portdrv: Declare support of " Sergey Miroshnichenko
@ 2019-08-16 16:51 ` Sergey Miroshnichenko
  22 siblings, 0 replies; 30+ messages in thread
From: Sergey Miroshnichenko @ 2019-08-16 16:51 UTC (permalink / raw)
  To: linux-pci, linuxppc-dev
  Cc: Bjorn Helgaas, linux, Sergey Miroshnichenko, Lukas Wunner

With movable BARs, adding a hotplugged device is not local to its bridge
anymore, but it affects the whole domain: BARs, bridge windows and bus
numbers can be substantially rearranged. So instead of trying to fit the
new devices into preallocated reserved gaps, initiate a full domain rescan.

pci_rescan_bus() covers all the operations of the replaced functions:
 - assigning new bus numbers, as pci_hp_add_bridge() does;
 - allocating BARs (pci_assign_unassigned_bridge_resources());
 - configuring MPS settings (pcie_bus_configure_settings());
 - binding devices to their drivers (pci_bus_add_devices()).

CC: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Sergey Miroshnichenko <s.miroshnichenko@yadro.com>
---
 drivers/pci/hotplug/pciehp_pci.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/pci/hotplug/pciehp_pci.c b/drivers/pci/hotplug/pciehp_pci.c
index d17f3bf36f70..66c4e6d88fe3 100644
--- a/drivers/pci/hotplug/pciehp_pci.c
+++ b/drivers/pci/hotplug/pciehp_pci.c
@@ -58,6 +58,11 @@ int pciehp_configure_device(struct controller *ctrl)
 		goto out;
 	}
 
+	if (pci_movable_bars_enabled()) {
+		pci_rescan_bus(parent);
+		goto out;
+	}
+
 	for_each_pci_bridge(dev, parent)
 		pci_hp_add_bridge(dev);
 
-- 
2.21.0


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v5 01/23] PCI: Fix race condition in pci_enable/disable_device()
  2019-08-16 16:50 ` [PATCH v5 01/23] PCI: Fix race condition in pci_enable/disable_device() Sergey Miroshnichenko
@ 2019-08-22 12:37   ` Marta Rybczynska
  0 siblings, 0 replies; 30+ messages in thread
From: Marta Rybczynska @ 2019-08-22 12:37 UTC (permalink / raw)
  To: Sergey Miroshnichenko
  Cc: linux-pci, linuxppc-dev, Bjorn Helgaas, linux, Srinath Mannam



----- On 16 Aug, 2019, at 18:50, Sergey Miroshnichenko s.miroshnichenko@yadro.com wrote:

> This is yet another approach to fix an old [1-2] concurrency issue, when:
> - two or more devices are being hot-added into a bridge which was
>   initially empty;
> - a bridge with two or more devices is being hot-added;
> - during boot, if BIOS/bootloader/firmware doesn't pre-enable bridges.
> 
> The problem is that a bridge is reported as enabled before the MEM/IO bits
> are actually written to the PCI_COMMAND register, so another driver thread
> starts memory requests through the not-yet-enabled bridge:
> 
> CPU0                                        CPU1
> 
> pci_enable_device_mem()                     pci_enable_device_mem()
>   pci_enable_bridge()                         pci_enable_bridge()
>     pci_is_enabled()
>       return false;
>     atomic_inc_return(enable_cnt)
>     Start actual enabling the bridge
>     ...                                         pci_is_enabled()
>     ...                                           return true;
>     ...                                     Start memory requests <-- FAIL
>     ...
>     Set the PCI_COMMAND_MEMORY bit <-- Must wait for this
> 
> Protect pci_enable/disable_device() and pci_enable_bridge(), which is
> similar to the previous solution from commit 40f11adc7cd9 ("PCI: Avoid race
> while enabling upstream bridges"), but adds per-device mutexes and
> prevents dev->enable_cnt from incrementing early.
> 
> CC: Srinath Mannam <srinath.mannam@broadcom.com>
> CC: Marta Rybczynska <mrybczyn@kalray.eu>
> Signed-off-by: Sergey Miroshnichenko <s.miroshnichenko@yadro.com>
> 
> [1]
> https://lore.kernel.org/linux-pci/1501858648-22228-1-git-send-email-srinath.mannam@broadcom.com/T/#u
>    [RFC PATCH v3] pci: Concurrency issue during pci enable bridge
> 
> [2]
> https://lore.kernel.org/linux-pci/744877924.5841545.1521630049567.JavaMail.zimbra@kalray.eu/T/#u
>    [RFC PATCH] nvme: avoid race-conditions when enabling devices
> ---
> drivers/pci/pci.c   | 26 ++++++++++++++++++++++----
> drivers/pci/probe.c |  1 +
> include/linux/pci.h |  1 +
> 3 files changed, 24 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 1b27b5af3d55..e7f8c354e644 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -1645,6 +1645,8 @@ static void pci_enable_bridge(struct pci_dev *dev)
> 	struct pci_dev *bridge;
> 	int retval;
> 
> +	mutex_lock(&dev->enable_mutex);
> +
> 	bridge = pci_upstream_bridge(dev);
> 	if (bridge)
> 		pci_enable_bridge(bridge);
> @@ -1652,6 +1654,7 @@ static void pci_enable_bridge(struct pci_dev *dev)
> 	if (pci_is_enabled(dev)) {
> 		if (!dev->is_busmaster)
> 			pci_set_master(dev);
> +		mutex_unlock(&dev->enable_mutex);
> 		return;
> 	}
> 

This code is used by numerous drivers, and when we saw that issue I wondered
whether there are use cases where this (or pci_disable_device()) is called with
interrupts disabled. It seems that it shouldn't be, but a BUG_ON or an error when
someone calls it this way would be helpful when debugging.

Marta
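
For readers following the race diagram quoted above, here is a minimal
userspace sketch of the locking scheme the patch describes. It is not the
kernel code: struct fake_bridge, fake_enable_bridge() and run_two_enablers()
are hypothetical names, and a pthread mutex stands in for the per-device
enable_mutex. The point it illustrates is that the enable count is raised
only after the (simulated) PCI_COMMAND_MEMORY write, so a second caller can
never observe "enabled" too early.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct pci_dev; not the kernel structure. */
struct fake_bridge {
	pthread_mutex_t enable_mutex;	/* per-device lock, as in the patch */
	int enable_cnt;			/* models dev->enable_cnt */
	bool memory_bit;		/* models PCI_COMMAND_MEMORY */
};

/* Enable the bridge once; later callers only take a reference. */
static void fake_enable_bridge(struct fake_bridge *b)
{
	pthread_mutex_lock(&b->enable_mutex);
	if (b->enable_cnt == 0) {
		/* Pretend that the register write takes a while. */
		for (volatile int spin = 0; spin < 100000; spin++)
			;
		b->memory_bit = true;
	}
	/* Invariant: the count never rises while the bit is still clear. */
	assert(b->memory_bit);
	b->enable_cnt++;
	pthread_mutex_unlock(&b->enable_mutex);
}

struct worker_arg {
	struct fake_bridge *bridge;
	bool saw_disabled_bridge;
};

static void *worker(void *p)
{
	struct worker_arg *arg = p;

	fake_enable_bridge(arg->bridge);
	/* A real driver would start its memory requests at this point. */
	pthread_mutex_lock(&arg->bridge->enable_mutex);
	arg->saw_disabled_bridge = !arg->bridge->memory_bit;
	pthread_mutex_unlock(&arg->bridge->enable_mutex);
	return NULL;
}

/* Returns how many of two concurrent callers saw a disabled bridge. */
static int run_two_enablers(void)
{
	struct fake_bridge bridge = { .enable_cnt = 0, .memory_bit = false };
	struct worker_arg args[2] = { { &bridge, false }, { &bridge, false } };
	pthread_t threads[2];
	int i, failures = 0;

	pthread_mutex_init(&bridge.enable_mutex, NULL);
	for (i = 0; i < 2; i++)
		pthread_create(&threads[i], NULL, worker, &args[i]);
	for (i = 0; i < 2; i++) {
		pthread_join(threads[i], NULL);
		failures += args[i].saw_disabled_bridge;
	}
	pthread_mutex_destroy(&bridge.enable_mutex);
	return failures;
}
```

Calling run_two_enablers() should return 0 with this ordering. Because a
mutex may sleep, the scheme presumes that callers run in a context where
sleeping is allowed, which is the concern raised in the reply above.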


* Re: [PATCH v5 18/23] powerpc/pci: Handle BAR movement
  2019-08-16 16:50 ` [PATCH v5 18/23] powerpc/pci: Handle BAR movement Sergey Miroshnichenko
@ 2019-09-04  5:37   ` Oliver O'Halloran
  2019-09-06 16:24     ` Sergey Miroshnichenko
  0 siblings, 1 reply; 30+ messages in thread
From: Oliver O'Halloran @ 2019-09-04  5:37 UTC (permalink / raw)
  To: Sergey Miroshnichenko, linux-pci, linuxppc-dev; +Cc: Bjorn Helgaas, linux

On Fri, 2019-08-16 at 19:50 +0300, Sergey Miroshnichenko wrote:
> Add pcibios_rescan_prepare()/_done() hooks for the powerpc platform. Now if
> the device's driver supports movable BARs, pcibios_rescan_prepare() will be
> called after the device is stopped, and pcibios_rescan_done() - before it
> resumes. There are no memory requests to this device between the hooks, so
> it is safe to rebuild the EEH address cache during that.
> 
> CC: Oliver O'Halloran <oohall@gmail.com>
> Signed-off-by: Sergey Miroshnichenko <s.miroshnichenko@yadro.com>
> ---
>  arch/powerpc/kernel/pci-hotplug.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/arch/powerpc/kernel/pci-hotplug.c b/arch/powerpc/kernel/pci-hotplug.c
> index 0b0cf8168b47..18cf13bba228 100644
> --- a/arch/powerpc/kernel/pci-hotplug.c
> +++ b/arch/powerpc/kernel/pci-hotplug.c
> @@ -144,3 +144,13 @@ void pci_hp_add_devices(struct pci_bus *bus)
>  	pcibios_finish_adding_to_bus(bus);
>  }
>  EXPORT_SYMBOL_GPL(pci_hp_add_devices);
> +
> +void pcibios_rescan_prepare(struct pci_dev *pdev)
> +{
> +	eeh_addr_cache_rmv_dev(pdev);
> +}
> +
> +void pcibios_rescan_done(struct pci_dev *pdev)
> +{
> +	eeh_addr_cache_insert_dev(pdev);
> +}

Is this actually sufficient? The PE number for a device is largely
determined by the location of the MMIO BARs. If you move a BAR far
enough the PE number stored in the eeh_pe would need to be updated as
well.



* Re: [PATCH v5 16/23] PCI: hotplug: movable BARs: Don't reserve IO/mem bus space
  2019-08-16 16:50 ` [PATCH v5 16/23] PCI: hotplug: movable BARs: Don't reserve IO/mem bus space Sergey Miroshnichenko
@ 2019-09-04  5:42   ` Oliver O'Halloran
  2019-09-04 11:22     ` Sergey Miroshnichenko
  0 siblings, 1 reply; 30+ messages in thread
From: Oliver O'Halloran @ 2019-09-04  5:42 UTC (permalink / raw)
  To: Sergey Miroshnichenko, linux-pci, linuxppc-dev; +Cc: Bjorn Helgaas, linux

On Fri, 2019-08-16 at 19:50 +0300, Sergey Miroshnichenko wrote:
> A hotplugged bridge with many hotplug-capable ports may request
> reserving more IO space than the machine has. This could be overridden
> with the "hpiosize=" kernel argument though.
> 
> But when BARs are movable, there is no need to reserve space anymore:
> new BARs are allocated not from reserved gaps, but via rearranging the
> existing BARs. Requesting a precise amount of space for bridge windows
> increases the chances of adding the new bridge successfully.

It wouldn't hurt to reserve some memory space to prevent unnecessary
BAR shuffling at runtime. If it turns out that we need more space then
we can always fall back to re-assigning the whole tree.

> Signed-off-by: Sergey Miroshnichenko <s.miroshnichenko@yadro.com>
> ---
>  drivers/pci/setup-bus.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
> index c7b7e30c6284..7d64ec8e7088 100644
> --- a/drivers/pci/setup-bus.c
> +++ b/drivers/pci/setup-bus.c
> @@ -1287,7 +1287,7 @@ void __pci_bus_size_bridges(struct pci_bus *bus, struct list_head *realloc_head)
>  
>  	case PCI_HEADER_TYPE_BRIDGE:
>  		pci_bridge_check_ranges(bus);
> -		if (bus->self->is_hotplug_bridge) {
> +		if (bus->self->is_hotplug_bridge && !pci_movable_bars_enabled()) {
>  			additional_io_size  = pci_hotplug_io_size;
>  			additional_mem_size = pci_hotplug_mem_size;
>  		}



* Re: [PATCH v5 16/23] PCI: hotplug: movable BARs: Don't reserve IO/mem bus space
  2019-09-04  5:42   ` Oliver O'Halloran
@ 2019-09-04 11:22     ` Sergey Miroshnichenko
  0 siblings, 0 replies; 30+ messages in thread
From: Sergey Miroshnichenko @ 2019-09-04 11:22 UTC (permalink / raw)
  To: Oliver O'Halloran; +Cc: linux-pci, linuxppc-dev, Bjorn Helgaas, linux

On 9/4/19 8:42 AM, Oliver O'Halloran wrote:
> On Fri, 2019-08-16 at 19:50 +0300, Sergey Miroshnichenko wrote:
>> A hotplugged bridge with many hotplug-capable ports may request
>> reserving more IO space than the machine has. This could be overridden
>> with the "hpiosize=" kernel argument though.
>>
>> But when BARs are movable, there is no need to reserve space anymore:
>> new BARs are allocated not from reserved gaps, but via rearranging the
>> existing BARs. Requesting a precise amount of space for bridge windows
>> increases the chances of adding the new bridge successfully.
> 
> It wouldn't hurt to reserve some memory space to prevent unnecessary
> BAR shuffling at runtime. If it turns out that we need more space then
> we can always fall back to re-assigning the whole tree.
> 

Hi Oliver,

Thank you for your comments!

We had an issue on an x86_64 PC with a small amount of IO space: after
hotplugging an empty bridge with 32 ports, even DEFAULT_HOTPLUG_IO_SIZE
(which is 256) was enough to exhaust the space. So another patch of
this series ("Don't allow added devices to steal resources") disabled
BAR allocation for this bridge. It took me some time to work out that
"hpiosize=0" solves that.
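
A rough worked example of how such a small reservation can exhaust the IO
space. This is a sketch under stated assumptions, not a trace from the
machine in question: it assumes the standard 4 KiB granularity of
PCI-to-PCI bridge IO windows and the 64 KiB x86 port IO space, and the
helper names (align_up(), io_needed()) are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define HOTPLUG_IO_SIZE	256u	/* DEFAULT_HOTPLUG_IO_SIZE, per the text above */
#define BRIDGE_IO_ALIGN	0x1000u	/* assumed 4 KiB IO window granularity */
#define X86_IO_SPACE	0x10000u	/* 64 KiB of x86 port IO space */

/* Round size up to the next multiple of align (align is a power of two). */
static uint32_t align_up(uint32_t size, uint32_t align)
{
	return (size + align - 1) & ~(align - 1);
}

/* IO space consumed when each of `ports` bridges reserves `per_port` bytes. */
static uint32_t io_needed(unsigned int ports, uint32_t per_port)
{
	if (per_port == 0)
		return 0;	/* nothing reserved, no window needed */
	return ports * align_up(per_port, BRIDGE_IO_ALIGN);
}
```

Under these assumptions io_needed(32, HOTPLUG_IO_SIZE) is 32 * 4 KiB =
128 KiB, twice the whole X86_IO_SPACE, while io_needed(32, 0) is 0, which
matches the effect of "hpiosize=0".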

For the MEM and MEM64 spaces the same will be harder to reproduce, but a
similar problem can arise when fitting a window between two immovable BARs.

Implementing a fallback would require a flag indicating that allocating
this bridge with reserved space has failed, so that its windows are
recalculated without reserved space and the allocation is retried. Maybe
even two kinds of retries: with and without the full re-assignment.
We've tried to avoid adding execution paths and code complexity.
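
To make the discussed fallback concrete, here is a minimal sketch of the
extra path it would introduce. It is a toy model, not the setup-bus.c
logic: struct window_request, fits() and assign_with_fallback() are
hypothetical, and a single free gap stands in for the real resource tree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: a bridge window that must fit into one free gap. */
struct window_request {
	uint64_t required;	/* space the present BARs actually need */
	uint64_t reserved;	/* optional extra space for future hotplug */
};

enum assign_result {
	ASSIGNED_WITH_RESERVE,
	ASSIGNED_EXACT,
	ASSIGN_FAILED,
};

static bool fits(uint64_t size, uint64_t gap)
{
	return size <= gap;
}

/* Try the generous size first, then retry with the precise size. */
static enum assign_result
assign_with_fallback(const struct window_request *req, uint64_t gap)
{
	if (fits(req->required + req->reserved, gap))
		return ASSIGNED_WITH_RESERVE;
	if (fits(req->required, gap))
		return ASSIGNED_EXACT;
	return ASSIGN_FAILED;
}
```

Even in this toy form the second attempt is a separate execution path; in
the real allocator each attempt would also have to undo a partial
assignment, which is the added complexity mentioned above.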

Serge

>> Signed-off-by: Sergey Miroshnichenko <s.miroshnichenko@yadro.com>
>> ---
>>  drivers/pci/setup-bus.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
>> index c7b7e30c6284..7d64ec8e7088 100644
>> --- a/drivers/pci/setup-bus.c
>> +++ b/drivers/pci/setup-bus.c
>> @@ -1287,7 +1287,7 @@ void __pci_bus_size_bridges(struct pci_bus *bus, struct list_head *realloc_head)
>>  
>>  	case PCI_HEADER_TYPE_BRIDGE:
>>  		pci_bridge_check_ranges(bus);
>> -		if (bus->self->is_hotplug_bridge) {
>> +		if (bus->self->is_hotplug_bridge && !pci_movable_bars_enabled()) {
>>  			additional_io_size  = pci_hotplug_io_size;
>>  			additional_mem_size = pci_hotplug_mem_size;
>>  		}
> 


* Re: [PATCH v5 18/23] powerpc/pci: Handle BAR movement
  2019-09-04  5:37   ` Oliver O'Halloran
@ 2019-09-06 16:24     ` Sergey Miroshnichenko
  2019-09-09 14:02       ` Oliver O'Halloran
  0 siblings, 1 reply; 30+ messages in thread
From: Sergey Miroshnichenko @ 2019-09-06 16:24 UTC (permalink / raw)
  To: Oliver O'Halloran; +Cc: linux-pci, linuxppc-dev, Bjorn Helgaas, linux

Hi Oliver,

On 9/4/19 8:37 AM, Oliver O'Halloran wrote:
> On Fri, 2019-08-16 at 19:50 +0300, Sergey Miroshnichenko wrote:
>> Add pcibios_rescan_prepare()/_done() hooks for the powerpc platform. Now if
>> the device's driver supports movable BARs, pcibios_rescan_prepare() will be
>> called after the device is stopped, and pcibios_rescan_done() - before it
>> resumes. There are no memory requests to this device between the hooks, so
>> it is safe to rebuild the EEH address cache during that.
>>
>> CC: Oliver O'Halloran <oohall@gmail.com>
>> Signed-off-by: Sergey Miroshnichenko <s.miroshnichenko@yadro.com>
>> ---
>>  arch/powerpc/kernel/pci-hotplug.c | 10 ++++++++++
>>  1 file changed, 10 insertions(+)
>>
>> diff --git a/arch/powerpc/kernel/pci-hotplug.c b/arch/powerpc/kernel/pci-hotplug.c
>> index 0b0cf8168b47..18cf13bba228 100644
>> --- a/arch/powerpc/kernel/pci-hotplug.c
>> +++ b/arch/powerpc/kernel/pci-hotplug.c
>> @@ -144,3 +144,13 @@ void pci_hp_add_devices(struct pci_bus *bus)
>>  	pcibios_finish_adding_to_bus(bus);
>>  }
>>  EXPORT_SYMBOL_GPL(pci_hp_add_devices);
>> +
>> +void pcibios_rescan_prepare(struct pci_dev *pdev)
>> +{
>> +	eeh_addr_cache_rmv_dev(pdev);
>> +}
>> +
>> +void pcibios_rescan_done(struct pci_dev *pdev)
>> +{
>> +	eeh_addr_cache_insert_dev(pdev);
>> +}
> 
> Is this actually sufficient? The PE number for a device is largely
> determined by the location of the MMIO BARs. If you move a BAR far
> enough the PE number stored in the eeh_pe would need to be updated as
> well.
> 

Thanks for the hint! I've checked on our PowerNV: for bridges with MEM
only it allocates PE numbers starting from 0xff down, and when there
are MEM64 - starting from 0 up, one PE number per 4GiB.

PEs are allocated during the call to pnv_pci_setup_bridge(), and I've
added an invocation of pci_setup_bridge() after a hotplug event in the
"Recalculate all bridge windows during rescan" patch of this series.

Currently, if a bus already has a PE, pnv_ioda_setup_bus_PE() takes it
and returns. I can see two ways to change it, neither of which is
difficult to implement:

 a.1) check if MEM64 BARs appeared below the bus - allocate and assign
      a new master PE with required number of slave PEs;

 a.2) if the bus now has more MEM64 than before - check if more slave
      PEs must be reserved;

 b) release all the PEs before a PCI rescan and allocate+assign them
    again after - with this approach the "Hook up the writes to
    PCI_SECONDARY_BUS register" patch may be eliminated.

Do you find any of these suitable?

Serge



* Re: [PATCH v5 18/23] powerpc/pci: Handle BAR movement
  2019-09-06 16:24     ` Sergey Miroshnichenko
@ 2019-09-09 14:02       ` Oliver O'Halloran
  0 siblings, 0 replies; 30+ messages in thread
From: Oliver O'Halloran @ 2019-09-09 14:02 UTC (permalink / raw)
  To: Sergey Miroshnichenko; +Cc: linux-pci, linuxppc-dev, Bjorn Helgaas, linux

On Sat, Sep 7, 2019 at 2:25 AM Sergey Miroshnichenko
<s.miroshnichenko@yadro.com> wrote:
>
> Hi Oliver,
>
> On 9/4/19 8:37 AM, Oliver O'Halloran wrote:
> > On Fri, 2019-08-16 at 19:50 +0300, Sergey Miroshnichenko wrote:
> >> Add pcibios_rescan_prepare()/_done() hooks for the powerpc platform. Now if
> >> the device's driver supports movable BARs, pcibios_rescan_prepare() will be
> >> called after the device is stopped, and pcibios_rescan_done() - before it
> >> resumes. There are no memory requests to this device between the hooks, so
> >> it is safe to rebuild the EEH address cache during that.
> >>
> >> CC: Oliver O'Halloran <oohall@gmail.com>
> >> Signed-off-by: Sergey Miroshnichenko <s.miroshnichenko@yadro.com>
> >> ---
> >>  arch/powerpc/kernel/pci-hotplug.c | 10 ++++++++++
> >>  1 file changed, 10 insertions(+)
> >>
> >> diff --git a/arch/powerpc/kernel/pci-hotplug.c b/arch/powerpc/kernel/pci-hotplug.c
> >> index 0b0cf8168b47..18cf13bba228 100644
> >> --- a/arch/powerpc/kernel/pci-hotplug.c
> >> +++ b/arch/powerpc/kernel/pci-hotplug.c
> >> @@ -144,3 +144,13 @@ void pci_hp_add_devices(struct pci_bus *bus)
> >>      pcibios_finish_adding_to_bus(bus);
> >>  }
> >>  EXPORT_SYMBOL_GPL(pci_hp_add_devices);
> >> +
> >> +void pcibios_rescan_prepare(struct pci_dev *pdev)
> >> +{
> >> +    eeh_addr_cache_rmv_dev(pdev);
> >> +}
> >> +
> >> +void pcibios_rescan_done(struct pci_dev *pdev)
> >> +{
> >> +    eeh_addr_cache_insert_dev(pdev);
> >> +}
> >
> > Is this actually sufficient? The PE number for a device is largely
> > determined by the location of the MMIO BARs. If you move a BAR far
> > enough the PE number stored in the eeh_pe would need to be updated as
> > well.
> >
>
> Thanks for the hint! I've checked on our PowerNV: for bridges with MEM
> only it allocates PE numbers starting from 0xff down, and when there
> are MEM64 - starting from 0 up, one PE number per 4GiB.
>
> PEs are allocated during the call to pnv_pci_setup_bridge(), and I've
> added an invocation of pci_setup_bridge() after a hotplug event in the
> "Recalculate all bridge windows during rescan" patch of this series.

Sort of.

On PHB3 both the 32bit and the 64bit MMIO windows are split into 256
segments, each of which is mapped to a PE number. For the 32bit space
there's a remapping table in hardware that allows arbitrary mapping of
segments to PE numbers, but in the 64bit space the mapping is fixed
with the first segment being PE0, etc. If there's a 64 bit BAR under a
bridge the PE is really "allocated" during the BAR assignment process,
and the setup_bridge() step sets up the EEH state based on that.

It's worth pointing out that this is why the 64bit window is usually
4GB. Bridge windows need to be aligned to a segment boundary to ensure
the devices under them are placed into a unique PE.
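
The fixed 64bit mapping described above can be sketched as a small
calculation. This only illustrates the arithmetic and is not the powernv
code: struct m64_window, addr_to_pe() and move_changes_pe() are
hypothetical names, and the window base and size are parameters rather
than real PHB3 values.

```c
#include <assert.h>
#include <stdint.h>

#define M64_SEGMENTS 256u	/* segments per MMIO window, per the text above */

/* Hypothetical description of a 64bit MMIO window; not a kernel structure. */
struct m64_window {
	uint64_t base;
	uint64_t size;	/* assumed to be a non-zero multiple of M64_SEGMENTS */
};

static uint64_t segment_size(const struct m64_window *w)
{
	return w->size / M64_SEGMENTS;
}

/*
 * Fixed mapping: segment N of the 64bit window belongs to PE N.
 * Returns -1 if the address lies outside the window.
 */
static int addr_to_pe(const struct m64_window *w, uint64_t addr)
{
	if (addr < w->base || addr - w->base >= w->size)
		return -1;
	return (int)((addr - w->base) / segment_size(w));
}

/* Non-zero if moving a BAR from old_addr to new_addr changes its PE number. */
static int move_changes_pe(const struct m64_window *w,
			   uint64_t old_addr, uint64_t new_addr)
{
	return addr_to_pe(w, old_addr) != addr_to_pe(w, new_addr);
}
```

With a 4GB window each segment is 16MB, so in this model a BAR that moves
across a 16MB boundary ends up in a different PE, while a move within one
segment leaves the PE number unchanged.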

> Currently, if a bus already has a PE, pnv_ioda_setup_bus_PE() takes it
> and returns. I can see two ways to change it, neither of which is
> difficult to implement:
>
>  a.1) check if MEM64 BARs appeared below the bus - allocate and assign
>       a new master PE with required number of slave PEs;
>
>  a.2) if the bus now has more MEM64 than before - check if more slave
>       PEs must be reserved;
>
>  b) release all the PEs before a PCI rescan and allocate+assign them
>     again after - with this approach the "Hook up the writes to
>     PCI_SECONDARY_BUS register" patch may be eliminated.
>
> Do you find any of these suitable?

I'm not sure a) would work, but even if it does b) is preferable.
There's a lot of strangeness in the powerpc PCI code as-is without
adding extra code paths to deal with. Keeping what happens at hotplug
consistent with what happens at boot will help keep things sane.

FYI in the next few days I'm going to post a series that rips out the
use of pci_dn in powernv and the generic parts of EEH (pseries still
uses it). Assuming Bjorn isn't picking this up for 5.4 you might want
to wait for that before getting too deep into this.

Oliver



Thread overview: 30+ messages
2019-08-16 16:50 [PATCH v5 00/23] PCI: Allow BAR movement during hotplug Sergey Miroshnichenko
2019-08-16 16:50 ` [PATCH v5 01/23] PCI: Fix race condition in pci_enable/disable_device() Sergey Miroshnichenko
2019-08-22 12:37   ` Marta Rybczynska
2019-08-16 16:50 ` [PATCH v5 02/23] PCI: Enable bridge's I/O and MEM access for hotplugged devices Sergey Miroshnichenko
2019-08-16 16:50 ` [PATCH v5 03/23] PCI: hotplug: Add a flag for the movable BARs feature Sergey Miroshnichenko
2019-08-16 16:50 ` [PATCH v5 04/23] PCI: Define PCI-specific version of the release_child_resources() Sergey Miroshnichenko
2019-08-16 16:50 ` [PATCH v5 05/23] PCI: hotplug: movable BARs: Fix reassigning the released bridge windows Sergey Miroshnichenko
2019-08-16 16:50 ` [PATCH v5 06/23] PCI: hotplug: movable BARs: Recalculate all bridge windows during rescan Sergey Miroshnichenko
2019-08-16 16:50 ` [PATCH v5 07/23] PCI: hotplug: movable BARs: Don't allow added devices to steal resources Sergey Miroshnichenko
2019-08-16 16:50 ` [PATCH v5 08/23] PCI: Include fixed and immovable BARs into the bus size calculating Sergey Miroshnichenko
2019-08-16 16:50 ` [PATCH v5 09/23] PCI: Prohibit assigning BARs and bridge windows to non-direct parents Sergey Miroshnichenko
2019-08-16 16:50 ` [PATCH v5 10/23] PCI: hotplug: movable BARs: Try to assign unassigned resources only once Sergey Miroshnichenko
2019-08-16 16:50 ` [PATCH v5 11/23] PCI: hotplug: movable BARs: Calculate immovable parts of bridge windows Sergey Miroshnichenko
2019-08-16 16:50 ` [PATCH v5 12/23] PCI: hotplug: movable BARs: Compute limits for relocated " Sergey Miroshnichenko
2019-08-16 16:50 ` [PATCH v5 13/23] PCI: Make sure bridge windows include their fixed BARs Sergey Miroshnichenko
2019-08-16 16:50 ` [PATCH v5 14/23] PCI: Fix assigning the fixed prefetchable resources Sergey Miroshnichenko
2019-08-16 16:50 ` [PATCH v5 15/23] PCI: hotplug: movable BARs: Assign fixed and immovable BARs before others Sergey Miroshnichenko
2019-08-16 16:50 ` [PATCH v5 16/23] PCI: hotplug: movable BARs: Don't reserve IO/mem bus space Sergey Miroshnichenko
2019-09-04  5:42   ` Oliver O'Halloran
2019-09-04 11:22     ` Sergey Miroshnichenko
2019-08-16 16:50 ` [PATCH v5 17/23] powerpc/pci: Fix crash with enabled movable BARs Sergey Miroshnichenko
2019-08-16 16:50 ` [PATCH v5 18/23] powerpc/pci: Handle BAR movement Sergey Miroshnichenko
2019-09-04  5:37   ` Oliver O'Halloran
2019-09-06 16:24     ` Sergey Miroshnichenko
2019-09-09 14:02       ` Oliver O'Halloran
2019-08-16 16:50 ` [PATCH v5 19/23] PCI: hotplug: Configure MPS for hot-added bridges during bus rescan Sergey Miroshnichenko
2019-08-16 16:50 ` [PATCH v5 20/23] PCI: hotplug: movable BARs: Enable the feature by default Sergey Miroshnichenko
2019-08-16 16:50 ` [PATCH v5 21/23] nvme-pci: Handle movable BARs Sergey Miroshnichenko
2019-08-16 16:51 ` [PATCH v5 22/23] PCI/portdrv: Declare support of " Sergey Miroshnichenko
2019-08-16 16:51 ` [PATCH v5 23/23] PCI: pciehp: movable BARs: Trigger a domain rescan on hp events Sergey Miroshnichenko
