linux-pci.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Sergei Miroshnichenko <s.miroshnichenko@yadro.com>
To: <linux-pci@vger.kernel.org>
Cc: Bjorn Helgaas <helgaas@kernel.org>, Stefan Roese <sr@denx.de>,
	<linux@yadro.com>,
	Sergei Miroshnichenko <s.miroshnichenko@yadro.com>,
	Sam Bobroff <sbobroff@linux.ibm.com>,
	Rajat Jain <rajatja@google.com>, Lukas Wunner <lukas@wunner.de>,
	Oliver O'Halloran <oohall@gmail.com>,
	David Laight <David.Laight@ACULAB.COM>
Subject: [PATCH v7 03/26] PCI: hotplug: Initial support of the movable BARs feature
Date: Wed, 29 Jan 2020 18:29:14 +0300	[thread overview]
Message-ID: <20200129152937.311162-4-s.miroshnichenko@yadro.com> (raw)
In-Reply-To: <20200129152937.311162-1-s.miroshnichenko@yadro.com>

When hot-adding a device, the involved bridge may have windows not big
enough (or fragmented too much) for newly requested BARs to fit in. But
expanding these bridge windows may be impossible if they are wedged between
"neighboring" BARs.

Still, it may be possible to allocate a memory region for new BARs, if at
least some utilized BARs can be moved, with the following procedure:

1) notify all the drivers which support movable BARs to pause and release
   the BARs; the rest of the drivers are guaranteed that their devices will
   not get BARs moved;

2) release all the bridge windows and movable BARs;

3) try to recalculate new bridge windows that will fit all the BAR types:
   - immovable (including those marked with PCI_FIXED);
   - movable;
   - newly requested by hot-added devices;

4) if the previous step fails, disable BARs for one of the hot-added
   devices and retry from step 3;

5) notify the drivers, so they remap BARs and resume.

If bridge calculation and BAR assignment fail with hot-added devices, this
patchset disables their BARs, falling back to the same amount and size of
BARs as they were before the hotplug event. The kernel succeeded then - so
the same BAR layout will be reproduced again.

This makes the prior reservation of memory by BIOS/bootloader/firmware not
required anymore for the PCI hotplug.

Drivers indicate their support of movable BARs by implementing the new
.rescan_prepare() and .rescan_done() hooks in the struct pci_driver. All
device's activity must be paused during a rescan, and iounmap()+ioremap()
must be applied to every used BAR.

If a device is not bound to a driver, its BARs are considered movable.

For a higher probability of the successful BAR reassignment, all the BARs
and bridge windows should be released during a rescan, not only those with
higher addresses.

One example when it is needed, BAR(I) is moved to free a gap for the new
BAR(II):

Before:

    ==================== parent bridge window ===============
                   ---- hotplug bridge window ----
    |   BAR(I)    |   fixed BAR   |   fixed BAR   | fixed BAR |
        ^^^^^^                 ^
                               |
                           new BAR(II)

After:

    ==================== parent bridge window =========================
     ----------- hotplug bridge window -----------
    | new BAR(II) |   fixed BAR   |   fixed BAR   | fixed BAR | BAR(I)  |
      ^^^^^^^^^^^                                               ^^^^^^

Another example is a fragmented bridge window jammed between fixed BARs:

Before:

     ===================== parent bridge window ========================
                 ---------- hotplug bridge window ----------
    | fixed BAR |   | BAR(I) |    | BAR(II) |    | BAR(III) | fixed BAR |
                      ^^^^^^   ^    ^^^^^^^        ^^^^^^^^
                               |
                           new BAR(IV)

After:

     ==================== parent bridge window =========================
                 ---------- hotplug bridge window ----------
    | fixed BAR | BAR(I) | BAR(II) | BAR(III) | new BAR(IV) | fixed BAR |
                  ^^^^^^   ^^^^^^^   ^^^^^^^^   ^^^^^^^^^^^

This patch is a preparation for future patches with actual implementation,
and for now it just does the following:
 - declares the feature;
 - defines bool pci_can_move_bars and bool pci_dev_bar_movable(dev, bar);
 - invokes the new .rescan_prepare() and .rescan_done() driver notifiers;
 - disables the feature for the powerpc (support will be added later, in
   another series).

This is disabled by default in this first patch of the series, until a
patch in the middle, which finalizes the actual implementation. Then it
can be overridden per-arch using the pci_can_move_bars=false flag or by the
following command line option:

    pci=no_movable_bars

CC: Sam Bobroff <sbobroff@linux.ibm.com>
CC: Rajat Jain <rajatja@google.com>
CC: Lukas Wunner <lukas@wunner.de>
CC: Oliver O'Halloran <oohall@gmail.com>
CC: David Laight <David.Laight@ACULAB.COM>
Signed-off-by: Sergei Miroshnichenko <s.miroshnichenko@yadro.com>
---
 .../admin-guide/kernel-parameters.txt         |  1 +
 arch/powerpc/platforms/powernv/pci.c          |  2 +
 arch/powerpc/platforms/pseries/setup.c        |  2 +
 drivers/pci/pci.c                             |  4 +
 drivers/pci/pci.h                             |  2 +
 drivers/pci/probe.c                           | 81 ++++++++++++++++++-
 include/linux/pci.h                           |  6 ++
 7 files changed, 96 insertions(+), 2 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index ec92120a7952..9f42cf43bf08 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -3610,6 +3610,7 @@
 				may put more devices in an IOMMU group.
 		force_floating	[S390] Force usage of floating interrupts.
 		nomio		[S390] Do not use MIO instructions.
+		no_movable_bars	Don't allow BARs to be moved during hotplug
 
 	pcie_aspm=	[PCIE] Forcibly enable or disable PCIe Active State Power
 			Management.
diff --git a/arch/powerpc/platforms/powernv/pci.c b/arch/powerpc/platforms/powernv/pci.c
index c0bea75ac27b..ec772aa31ccf 100644
--- a/arch/powerpc/platforms/powernv/pci.c
+++ b/arch/powerpc/platforms/powernv/pci.c
@@ -939,6 +939,8 @@ void __init pnv_pci_init(void)
 {
 	struct device_node *np;
 
+	pci_can_move_bars = false;
+
 	pci_add_flags(PCI_CAN_SKIP_ISA_ALIGN);
 
 	/* If we don't have OPAL, eg. in sim, just skip PCI probe */
diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c
index 0c8421dd01ab..2edb41f55237 100644
--- a/arch/powerpc/platforms/pseries/setup.c
+++ b/arch/powerpc/platforms/pseries/setup.c
@@ -927,6 +927,8 @@ static void __init pseries_init(void)
 {
 	pr_debug(" -> pseries_init()\n");
 
+	pci_can_move_bars = false;
+
 #ifdef CONFIG_HVC_CONSOLE
 	if (firmware_has_feature(FW_FEATURE_LPAR))
 		hvc_vio_init_early();
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index cb0042f28e6a..6741c7f111ac 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -79,6 +79,8 @@ static void pci_dev_d3_sleep(struct pci_dev *dev)
 int pci_domains_supported = 1;
 #endif
 
+bool pci_can_move_bars;
+
 #define DEFAULT_CARDBUS_IO_SIZE		(256)
 #define DEFAULT_CARDBUS_MEM_SIZE	(64*1024*1024)
 /* pci=cbmemsize=nnM,cbiosize=nn can override this */
@@ -6477,6 +6479,8 @@ static int __init pci_setup(char *str)
 				pci_add_flags(PCI_SCAN_ALL_PCIE_DEVS);
 			} else if (!strncmp(str, "disable_acs_redir=", 18)) {
 				disable_acs_redir_param = str + 18;
+			} else if (!strncmp(str, "no_movable_bars", 15)) {
+				pci_can_move_bars = false;
 			} else {
 				pr_err("PCI: Unknown option `%s'\n", str);
 			}
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index a0a53bd05a0b..ce8c53ca7a1e 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -289,6 +289,8 @@ void pci_disable_bridge_window(struct pci_dev *dev);
 struct pci_bus *pci_bus_get(struct pci_bus *bus);
 void pci_bus_put(struct pci_bus *bus);
 
+bool pci_dev_bar_movable(struct pci_dev *dev, struct resource *res);
+
 /* PCIe link information */
 #define PCIE_SPEED2STR(speed) \
 	((speed) == PCIE_SPEED_16_0GT ? "16 GT/s" : \
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 2a7e81f146e4..9b6d15587f5d 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -2935,6 +2935,7 @@ int pci_host_probe(struct pci_host_bridge *bridge)
 	 * or pci_bus_assign_resources().
 	 */
 	if (pci_has_flag(PCI_PROBE_ONLY)) {
+		pci_can_move_bars = false;
 		pci_bus_claim_resources(bus);
 	} else {
 		pci_bus_size_bridges(bus);
@@ -3127,6 +3128,68 @@ unsigned int pci_rescan_bus_bridge_resize(struct pci_dev *bridge)
 	return max;
 }
 
+bool pci_dev_bar_movable(struct pci_dev *dev, struct resource *res)
+{
+	struct pci_bus_region region;
+
+	if (!pci_can_move_bars)
+		return false;
+
+	if (res->flags & IORESOURCE_PCI_FIXED)
+		return false;
+
+	if (dev->driver && dev->driver->rescan_prepare)
+		return true;
+
+	/* Workaround for the legacy VGA memory 0xa0000-0xbffff */
+	pcibios_resource_to_bus(dev->bus, &region, res);
+	if (region.start == 0xa0000)
+		return false;
+
+	if (!dev->driver && !res->child)
+		return true;
+
+	return false;
+}
+
+static void pci_bus_rescan_prepare(struct pci_bus *bus)
+{
+	struct pci_dev *dev;
+
+	if (bus->self)
+		pci_config_pm_runtime_get(bus->self);
+
+	list_for_each_entry(dev, &bus->devices, bus_list) {
+		struct pci_bus *child = dev->subordinate;
+
+		if (child)
+			pci_bus_rescan_prepare(child);
+
+		if (dev->driver &&
+		    dev->driver->rescan_prepare)
+			dev->driver->rescan_prepare(dev);
+	}
+}
+
+static void pci_bus_rescan_done(struct pci_bus *bus)
+{
+	struct pci_dev *dev;
+
+	list_for_each_entry(dev, &bus->devices, bus_list) {
+		struct pci_bus *child = dev->subordinate;
+
+		if (dev->driver &&
+		    dev->driver->rescan_done)
+			dev->driver->rescan_done(dev);
+
+		if (child)
+			pci_bus_rescan_done(child);
+	}
+
+	if (bus->self)
+		pci_config_pm_runtime_put(bus->self);
+}
+
 /**
  * pci_rescan_bus - Scan a PCI bus for devices
  * @bus: PCI bus to scan
@@ -3139,9 +3202,23 @@ unsigned int pci_rescan_bus_bridge_resize(struct pci_dev *bridge)
 unsigned int pci_rescan_bus(struct pci_bus *bus)
 {
 	unsigned int max;
+	struct pci_bus *root = bus;
+
+	while (!pci_is_root_bus(root))
+		root = root->parent;
+
+	if (pci_can_move_bars) {
+		pci_bus_rescan_prepare(root);
+
+		max = pci_scan_child_bus(root);
+		pci_assign_unassigned_root_bus_resources(root);
+
+		pci_bus_rescan_done(root);
+	} else {
+		max = pci_scan_child_bus(bus);
+		pci_assign_unassigned_bus_resources(bus);
+	}
 
-	max = pci_scan_child_bus(bus);
-	pci_assign_unassigned_bus_resources(bus);
 	pci_bus_add_devices(bus);
 
 	return max;
diff --git a/include/linux/pci.h b/include/linux/pci.h
index e473e70834b5..a543f647d337 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -817,6 +817,8 @@ struct module;
  *		e.g. drivers/net/e100.c.
  * @sriov_configure: Optional driver callback to allow configuration of
  *		number of VFs to enable via sysfs "sriov_numvfs" file.
+ * @rescan_prepare: Prepare to BAR movement - called before PCI rescan.
+ * @rescan_done: Remap BARs and restore after PCI rescan.
  * @err_handler: See Documentation/PCI/pci-error-recovery.rst
  * @groups:	Sysfs attribute groups.
  * @driver:	Driver model structure.
@@ -832,6 +834,8 @@ struct pci_driver {
 	int  (*resume)(struct pci_dev *dev);	/* Device woken up */
 	void (*shutdown)(struct pci_dev *dev);
 	int  (*sriov_configure)(struct pci_dev *dev, int num_vfs); /* On PF */
+	void (*rescan_prepare)(struct pci_dev *dev);
+	void (*rescan_done)(struct pci_dev *dev);
 	const struct pci_error_handlers *err_handler;
 	const struct attribute_group **groups;
 	struct device_driver	driver;
@@ -1392,6 +1396,8 @@ void pci_setup_bridge(struct pci_bus *bus);
 resource_size_t pcibios_window_alignment(struct pci_bus *bus,
 					 unsigned long type);
 
+extern bool pci_can_move_bars;
+
 #define PCI_VGA_STATE_CHANGE_BRIDGE (1 << 0)
 #define PCI_VGA_STATE_CHANGE_DECODES (1 << 1)
 
-- 
2.24.1


  parent reply	other threads:[~2020-01-29 15:29 UTC|newest]

Thread overview: 48+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-01-29 15:29 [PATCH v7 00/26] PCI: Allow BAR movement during boot and hotplug Sergei Miroshnichenko
2020-01-29 15:29 ` [PATCH v7 01/26] PCI: Fix race condition in pci_enable/disable_device() Sergei Miroshnichenko
2020-01-29 15:29 ` [PATCH v7 02/26] PCI: Enable bridge's I/O and MEM access for hotplugged devices Sergei Miroshnichenko
2020-01-30 23:12   ` Bjorn Helgaas
2020-01-29 15:29 ` Sergei Miroshnichenko [this message]
2020-01-29 15:29 ` [PATCH v7 04/26] PCI: Add version of release_child_resources() aware of immovable BARs Sergei Miroshnichenko
2020-01-29 15:29 ` [PATCH v7 05/26] PCI: hotplug: Fix reassigning the released BARs Sergei Miroshnichenko
2020-01-29 15:29 ` [PATCH v7 06/26] PCI: hotplug: Recalculate every bridge window during rescan Sergei Miroshnichenko
2020-01-29 15:29 ` [PATCH v7 07/26] PCI: hotplug: Don't allow hot-added devices to steal resources Sergei Miroshnichenko
2020-01-29 15:29 ` [PATCH v7 08/26] PCI: hotplug: Try to reassign movable BARs only once Sergei Miroshnichenko
2020-01-29 15:29 ` [PATCH v7 09/26] PCI: hotplug: Calculate immovable parts of bridge windows Sergei Miroshnichenko
2020-01-30 21:26   ` Bjorn Helgaas
2020-01-31 16:59     ` Sergei Miroshnichenko
2020-01-31 20:10       ` Bjorn Helgaas
2020-01-29 15:29 ` [PATCH v7 10/26] PCI: Include fixed and immovable BARs into the bus size calculating Sergei Miroshnichenko
2020-01-29 15:29 ` [PATCH v7 11/26] PCI: hotplug: movable BARs: Compute limits for relocated bridge windows Sergei Miroshnichenko
2020-01-29 15:29 ` [PATCH v7 12/26] PCI: Make sure bridge windows include their fixed BARs Sergei Miroshnichenko
2020-01-29 15:29 ` [PATCH v7 13/26] PCI: hotplug: Add support of immovable BARs to pci_assign_resource() Sergei Miroshnichenko
2020-01-29 15:29 ` [PATCH v7 14/26] PCI: hotplug: Sort immovable BARs before assignment Sergei Miroshnichenko
2020-01-29 15:29 ` [PATCH v7 15/26] PCI: hotplug: Enable the movable BARs feature by default Sergei Miroshnichenko
2020-01-29 15:29 ` [PATCH v7 16/26] PCI: Ignore PCIBIOS_MIN_MEM Sergei Miroshnichenko
2020-01-30 23:52   ` Bjorn Helgaas
2020-01-31 18:19     ` Sergei Miroshnichenko
2020-01-31 20:27       ` Bjorn Helgaas
2020-02-05 13:04         ` Sergei Miroshnichenko
2020-02-05 16:32           ` Bjorn Helgaas
2020-02-12 12:16             ` Sergei Miroshnichenko
2020-02-12 14:13               ` Bjorn Helgaas
2020-01-29 15:29 ` [PATCH v7 17/26] PCI: hotplug: Ignore the MEM BAR offsets from BIOS/bootloader Sergei Miroshnichenko
2020-01-31 20:31   ` Bjorn Helgaas
2020-02-05 11:01     ` Sergei Miroshnichenko
2020-02-05 16:42       ` Bjorn Helgaas
2020-02-12 12:29         ` Sergei Miroshnichenko
2020-01-29 15:29 ` [PATCH v7 18/26] PCI: Treat VGA BARs as immovable Sergei Miroshnichenko
2020-01-29 15:29 ` [PATCH v7 19/26] PCI: hotplug: Configure MPS for hot-added bridges during bus rescan Sergei Miroshnichenko
2020-01-29 15:29 ` [PATCH v7 20/26] PNP: Don't reserve BARs for PCI when enabled movable BARs Sergei Miroshnichenko
2020-01-30 14:39   ` Rafael J. Wysocki
2020-01-30 21:14   ` Bjorn Helgaas
2020-01-31 15:48     ` Sergei Miroshnichenko
2020-01-31 19:39       ` Bjorn Helgaas
2020-01-29 15:29 ` [PATCH v7 21/26] PCI: hotplug: Don't disable the released bridge windows immediately Sergei Miroshnichenko
2020-01-29 15:29 ` [PATCH v7 22/26] PCI: pciehp: Trigger a domain rescan on hp events when enabled movable BARs Sergei Miroshnichenko
2020-01-29 15:29 ` [PATCH v7 23/26] PCI: Don't claim immovable BARs Sergei Miroshnichenko
2020-01-29 15:29 ` [PATCH v7 24/26] PCI: hotplug: Don't reserve bus space when enabled movable BARs Sergei Miroshnichenko
2020-01-29 15:29 ` [PATCH v7 25/26] nvme-pci: Handle " Sergei Miroshnichenko
2020-01-29 15:29 ` [PATCH v7 26/26] PCI/portdrv: Declare support of " Sergei Miroshnichenko
2020-01-30 23:37 ` [PATCH v7 00/26] PCI: Allow BAR movement during boot and hotplug Bjorn Helgaas
2020-02-03  4:56   ` Oliver O'Halloran

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200129152937.311162-4-s.miroshnichenko@yadro.com \
    --to=s.miroshnichenko@yadro.com \
    --cc=David.Laight@ACULAB.COM \
    --cc=helgaas@kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=linux@yadro.com \
    --cc=lukas@wunner.de \
    --cc=oohall@gmail.com \
    --cc=rajatja@google.com \
    --cc=sbobroff@linux.ibm.com \
    --cc=sr@denx.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).