linux-pci.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Sergey Miroshnichenko <s.miroshnichenko@yadro.com>
To: <linux-pci@vger.kernel.org>, <linuxppc-dev@lists.ozlabs.org>
Cc: Bjorn Helgaas <helgaas@kernel.org>, <linux@yadro.com>,
	Sergey Miroshnichenko <s.miroshnichenko@yadro.com>,
	Sam Bobroff <sbobroff@linux.ibm.com>,
	Rajat Jain <rajatja@google.com>, Lukas Wunner <lukas@wunner.de>,
	Oliver O'Halloran <oohall@gmail.com>,
	David Laight <David.Laight@ACULAB.COM>
Subject: [PATCH v6 03/30] PCI: hotplug: Add a flag for the movable BARs feature
Date: Thu, 24 Oct 2019 20:12:01 +0300	[thread overview]
Message-ID: <20191024171228.877974-4-s.miroshnichenko@yadro.com> (raw)
In-Reply-To: <20191024171228.877974-1-s.miroshnichenko@yadro.com>

When hot-adding a device, the bridge may have windows not big enough (or
fragmented too much) for newly requested BARs to fit in. And expanding
these bridge windows may be impossible because blocked by "neighboring"
BARs and bridge windows.

Still, it may be possible to allocate a memory region for new BARs with the
following procedure:

1) notify all the drivers which support movable BARs to pause and release
   the BARs; the rest of the drivers are guaranteed that their devices will
   not get BARs moved;

2) release all the bridge windows and movable BARs;

3) try to recalculate new bridge windows that will fit all the BAR types:
   - fixed;
   - immovable;
   - movable;
   - newly requested by hot-added devices;

4) if the previous step fails, disable BARs for one of the hot-added
   devices and retry from step 3;

5) notify the drivers, so they remap BARs and resume.

If bridge calculation and BAR assignment fails with a hot-added devices,
BARs of these devices will be disabled, falling back to the same amount and
size of BARs as they were before the hotplug event. The kernel succeeded in
assigning then, so the same algorithm will provide the same results again.

This makes the prior reservation of memory by BIOS/bootloader/firmware not
required anymore for the PCI hotplug.

Drivers indicate their support of movable BARs by implementing the new
.rescan_prepare() and .rescan_done() hooks in the struct pci_driver. All
device's activity must be paused during a rescan, and iounmap()+ioremap()
must be applied to every used BAR.

If a device is not bound to a driver, its BARs are considered movable.

For a higher probability of the successful BAR reassignment, all the BARs
and bridge windows should be released before the rescan, not only those
with higher addresses.

One example when it is needed, BAR(I) is moved to free a gap for the new
BAR(II):

  Before:
 ==================== parent bridge window ===============
               ---- hotplug bridge window ----
|   BAR(I)    |   fixed BAR   |   fixed BAR   | fixed BAR |
                           ^
                           |
                       new BAR(II)

  After:
 ==================== parent bridge window =========================
 ----------- hotplug bridge window -----------
| new BAR(II) |   fixed BAR   |   fixed BAR   | fixed BAR | BAR(I)  |

Another example is a fragmented bridge window jammed between fixed BARs:

  Before:
 ===================== parent bridge window ========================
             ---------- hotplug bridge window ----------
| fixed BAR |   | BAR(I) |    | BAR(II) |    | BAR(III) | fixed BAR |
                           ^
                           |
                       new BAR(IV)

 After:
 ==================== parent bridge window =========================
             ---------- hotplug bridge window ----------
| fixed BAR | BAR(I) | BAR(II) | BAR(III) | new BAR(IV) | fixed BAR |

This patch is a preparation for future patches with actual implementation,
and for now it just does the following:
 - declares the feature;
 - defines the bool pci_can_move_bars and bool pci_dev_bar_movable(dev);
 - invokes the .rescan_prepare() and .rescan_done() driver notifiers;
 - disables the feature for the powerpc/pseries.

The feature is disabled by default until the final patch of the series.
It can be overridden per-arch using the pci_can_move_bars=false flag or by
the following command line option:

    pci=no_movable_bars

CC: Sam Bobroff <sbobroff@linux.ibm.com>
CC: Rajat Jain <rajatja@google.com>
CC: Lukas Wunner <lukas@wunner.de>
CC: Oliver O'Halloran <oohall@gmail.com>
CC: David Laight <David.Laight@ACULAB.COM>
Signed-off-by: Sergey Miroshnichenko <s.miroshnichenko@yadro.com>
---
 .../admin-guide/kernel-parameters.txt         |  1 +
 arch/powerpc/platforms/pseries/setup.c        |  2 +
 drivers/pci/pci.c                             |  4 +
 drivers/pci/pci.h                             |  2 +
 drivers/pci/probe.c                           | 85 ++++++++++++++++++-
 include/linux/pci.h                           |  4 +
 6 files changed, 96 insertions(+), 2 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index a84a83f8881e..c6243aaed0c9 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -3528,6 +3528,7 @@
 				may put more devices in an IOMMU group.
 		force_floating	[S390] Force usage of floating interrupts.
 		nomio		[S390] Do not use MIO instructions.
+		no_movable_bars	Don't allow BARs to be moved during hotplug
 
 	pcie_aspm=	[PCIE] Forcibly enable or disable PCIe Active State Power
 			Management.
diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c
index 0a40201f315f..7cd12c5a2deb 100644
--- a/arch/powerpc/platforms/pseries/setup.c
+++ b/arch/powerpc/platforms/pseries/setup.c
@@ -920,6 +920,8 @@ static void __init pseries_init(void)
 {
 	pr_debug(" -> pseries_init()\n");
 
+	pci_can_move_bars = false;
+
 #ifdef CONFIG_HVC_CONSOLE
 	if (firmware_has_feature(FW_FEATURE_LPAR))
 		hvc_vio_init_early();
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index e85dc63c73fd..85014c6b2817 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -78,6 +78,8 @@ static void pci_dev_d3_sleep(struct pci_dev *dev)
 int pci_domains_supported = 1;
 #endif
 
+bool pci_can_move_bars;
+
 #define DEFAULT_CARDBUS_IO_SIZE		(256)
 #define DEFAULT_CARDBUS_MEM_SIZE	(64*1024*1024)
 /* pci=cbmemsize=nnM,cbiosize=nn can override this */
@@ -6331,6 +6333,8 @@ static int __init pci_setup(char *str)
 				pci_add_flags(PCI_SCAN_ALL_PCIE_DEVS);
 			} else if (!strncmp(str, "disable_acs_redir=", 18)) {
 				disable_acs_redir_param = str + 18;
+			} else if (!strncmp(str, "no_movable_bars", 15)) {
+				pci_can_move_bars = false;
 			} else {
 				pr_err("PCI: Unknown option `%s'\n", str);
 			}
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 3f6947ee3324..19bc50597d12 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -286,6 +286,8 @@ void pci_disable_bridge_window(struct pci_dev *dev);
 struct pci_bus *pci_bus_get(struct pci_bus *bus);
 void pci_bus_put(struct pci_bus *bus);
 
+bool pci_dev_bar_movable(struct pci_dev *dev, struct resource *res);
+
 /* PCIe link information */
 #define PCIE_SPEED2STR(speed) \
 	((speed) == PCIE_SPEED_16_0GT ? "16 GT/s" : \
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index d4f21e413638..3d8c0f653378 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -3133,6 +3133,73 @@ unsigned int pci_rescan_bus_bridge_resize(struct pci_dev *bridge)
 	return max;
 }
 
+static bool pci_dev_movable(struct pci_dev *dev, bool res_has_children)
+{
+	if (!pci_can_move_bars)
+		return false;
+
+	if (dev->driver && dev->driver->rescan_prepare)
+		return true;
+
+	if (!dev->driver && !res_has_children)
+		return true;
+
+	return false;
+}
+
+bool pci_dev_bar_movable(struct pci_dev *dev, struct resource *res)
+{
+	struct pci_bus_region region;
+
+	if (res->flags & IORESOURCE_PCI_FIXED)
+		return false;
+
+	/* Workaround for the legacy VGA memory 0xa0000-0xbffff */
+	pcibios_resource_to_bus(dev->bus, &region, res);
+	if (region.start == 0xa0000)
+		return false;
+
+	return pci_dev_movable(dev, res->child);
+}
+
+static void pci_bus_rescan_prepare(struct pci_bus *bus)
+{
+	struct pci_dev *dev;
+
+	if (bus->self)
+		pci_config_pm_runtime_get(bus->self);
+
+	list_for_each_entry(dev, &bus->devices, bus_list) {
+		struct pci_bus *child = dev->subordinate;
+
+		if (child)
+			pci_bus_rescan_prepare(child);
+
+		if (dev->driver &&
+		    dev->driver->rescan_prepare)
+			dev->driver->rescan_prepare(dev);
+	}
+}
+
+static void pci_bus_rescan_done(struct pci_bus *bus)
+{
+	struct pci_dev *dev;
+
+	list_for_each_entry(dev, &bus->devices, bus_list) {
+		struct pci_bus *child = dev->subordinate;
+
+		if (dev->driver &&
+		    dev->driver->rescan_done)
+			dev->driver->rescan_done(dev);
+
+		if (child)
+			pci_bus_rescan_done(child);
+	}
+
+	if (bus->self)
+		pci_config_pm_runtime_put(bus->self);
+}
+
 /**
  * pci_rescan_bus - Scan a PCI bus for devices
  * @bus: PCI bus to scan
@@ -3145,9 +3212,23 @@ unsigned int pci_rescan_bus_bridge_resize(struct pci_dev *bridge)
 unsigned int pci_rescan_bus(struct pci_bus *bus)
 {
 	unsigned int max;
+	struct pci_bus *root = bus;
+
+	while (!pci_is_root_bus(root))
+		root = root->parent;
+
+	if (pci_can_move_bars) {
+		pci_bus_rescan_prepare(root);
+
+		max = pci_scan_child_bus(root);
+		pci_assign_unassigned_root_bus_resources(root);
+
+		pci_bus_rescan_done(root);
+	} else {
+		max = pci_scan_child_bus(bus);
+		pci_assign_unassigned_bus_resources(bus);
+	}
 
-	max = pci_scan_child_bus(bus);
-	pci_assign_unassigned_bus_resources(bus);
 	pci_bus_add_devices(bus);
 
 	return max;
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 525140e3a460..b981e67c8a13 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -835,6 +835,8 @@ struct pci_driver {
 	int  (*resume)(struct pci_dev *dev);	/* Device woken up */
 	void (*shutdown)(struct pci_dev *dev);
 	int  (*sriov_configure)(struct pci_dev *dev, int num_vfs); /* On PF */
+	void (*rescan_prepare)(struct pci_dev *dev);
+	void (*rescan_done)(struct pci_dev *dev);
 	const struct pci_error_handlers *err_handler;
 	const struct attribute_group **groups;
 	struct device_driver	driver;
@@ -1395,6 +1397,8 @@ void pci_setup_bridge(struct pci_bus *bus);
 resource_size_t pcibios_window_alignment(struct pci_bus *bus,
 					 unsigned long type);
 
+extern bool pci_can_move_bars;
+
 #define PCI_VGA_STATE_CHANGE_BRIDGE (1 << 0)
 #define PCI_VGA_STATE_CHANGE_DECODES (1 << 1)
 
-- 
2.23.0


  parent reply	other threads:[~2019-10-24 17:12 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-10-24 17:11 [PATCH v6 00/30] PCI: Allow BAR movement during hotplug Sergey Miroshnichenko
2019-10-24 17:11 ` [PATCH v6 01/30] PCI: Fix race condition in pci_enable/disable_device() Sergey Miroshnichenko
2019-10-25 14:33   ` Oxford Semiconductor Ltd OX16PCI954 - weird dmesg Carlo Pisani
2019-10-25 16:37     ` Bjorn Helgaas
2019-10-24 17:12 ` [PATCH v6 02/30] PCI: Enable bridge's I/O and MEM access for hotplugged devices Sergey Miroshnichenko
2019-10-24 17:12 ` Sergey Miroshnichenko [this message]
2019-10-24 17:12 ` [PATCH v6 04/30] PCI: Define PCI-specific version of the release_child_resources() Sergey Miroshnichenko
2019-10-24 17:12 ` [PATCH v6 05/30] PCI: hotplug: movable BARs: Fix reassigning the released bridge windows Sergey Miroshnichenko
2019-10-24 17:12 ` [PATCH v6 06/30] PCI: hotplug: movable BARs: Recalculate all bridge windows during rescan Sergey Miroshnichenko
2019-10-24 17:12 ` [PATCH v6 07/30] PCI: hotplug: movable BARs: Don't disable the released bridge windows Sergey Miroshnichenko
2019-10-24 17:12 ` [PATCH v6 08/30] PCI: hotplug: movable BARs: Don't allow added devices to steal resources Sergey Miroshnichenko
2019-10-24 17:12 ` [PATCH v6 09/30] PCI: Include fixed and immovable BARs into the bus size calculating Sergey Miroshnichenko
2019-10-24 17:12 ` [PATCH v6 10/30] PCI: Prohibit assigning BARs and bridge windows to non-direct parents Sergey Miroshnichenko
2019-10-24 17:12 ` [PATCH v6 11/30] PCI: hotplug: movable BARs: Try to assign unassigned resources only once Sergey Miroshnichenko
2019-10-24 17:12 ` [PATCH v6 12/30] PCI: hotplug: movable BARs: Calculate immovable parts of bridge windows Sergey Miroshnichenko
2019-10-24 17:12 ` [PATCH v6 13/30] PCI: hotplug: movable BARs: Compute limits for relocated " Sergey Miroshnichenko
2019-10-24 17:12 ` [PATCH v6 14/30] PCI: Make sure bridge windows include their fixed BARs Sergey Miroshnichenko
2019-10-24 17:12 ` [PATCH v6 15/30] PCI: Fix assigning the fixed prefetchable resources Sergey Miroshnichenko
2019-10-24 17:12 ` [PATCH v6 16/30] PCI: hotplug: movable BARs: Assign fixed and immovable BARs before others Sergey Miroshnichenko
2019-10-24 17:12 ` [PATCH v6 17/30] PCI: hotplug: movable BARs: Don't reserve IO/mem bus space Sergey Miroshnichenko
2019-10-24 17:12 ` [PATCH v6 18/30] PCI: hotplug: Configure MPS for hot-added bridges during bus rescan Sergey Miroshnichenko
2019-10-24 17:12 ` [PATCH v6 19/30] PCI: hotplug: movable BARs: Ignore the MEM BAR offsets from bootloader Sergey Miroshnichenko
2019-10-24 17:12 ` [PATCH v6 20/30] powerpc/pci: Fix crash with enabled movable BARs Sergey Miroshnichenko
2019-10-25  1:22   ` Alexey Kardashevskiy
2019-10-24 17:12 ` [PATCH v6 21/30] powerpc/pci: Access PCI config space directly w/o pci_dn Sergey Miroshnichenko
2019-10-24 17:12 ` [PATCH v6 22/30] powerpc/pci: Create pci_dn on demand Sergey Miroshnichenko
2019-10-24 17:12 ` [PATCH v6 23/30] powerpc/pci: hotplug: Add support for movable BARs Sergey Miroshnichenko
2019-10-24 17:12 ` [PATCH v6 24/30] powerpc/powernv/pci: Suppress an EEH error when reading an empty slot Sergey Miroshnichenko
2019-10-24 17:12 ` [PATCH v6 25/30] PNP: Don't reserve BARs for PCI when enabled movable BARs Sergey Miroshnichenko
2019-10-27 17:40   ` kbuild test robot
2019-10-24 17:12 ` [PATCH v6 26/30] PCI: hotplug: movable BARs: Enable the feature by default Sergey Miroshnichenko
2019-10-24 17:12 ` [PATCH v6 27/30] nvme-pci: Handle movable BARs Sergey Miroshnichenko
2019-10-24 17:12 ` [PATCH v6 28/30] PCI/portdrv: Declare support of " Sergey Miroshnichenko
2019-10-24 17:12 ` [PATCH v6 29/30] PCI: pciehp: movable BARs: Trigger a domain rescan on hp events Sergey Miroshnichenko
2019-10-24 17:12 ` [PATCH v6 30/30] Revert "powerpc/powernv/pci: Work around races in PCI bridge enabling" Sergey Miroshnichenko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20191024171228.877974-4-s.miroshnichenko@yadro.com \
    --to=s.miroshnichenko@yadro.com \
    --cc=David.Laight@ACULAB.COM \
    --cc=helgaas@kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=linux@yadro.com \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=lukas@wunner.de \
    --cc=oohall@gmail.com \
    --cc=rajatja@google.com \
    --cc=sbobroff@linux.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).