linux-pci.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v9 00/26] PCI: Allow BAR movement during boot and hotplug
@ 2020-12-18 17:39 Sergei Miroshnichenko
  2020-12-18 17:39 ` [PATCH v9 01/26] PCI: Fix race condition in pci_enable/disable_device() Sergei Miroshnichenko
                   ` (26 more replies)
  0 siblings, 27 replies; 44+ messages in thread
From: Sergei Miroshnichenko @ 2020-12-18 17:39 UTC (permalink / raw)
  To: linux-pci
  Cc: Bjorn Helgaas, Lukas Wunner, Stefan Roese, Andy Lavr,
	Christian König, David Laight, Rajat Jain, linux,
	Sergei Miroshnichenko

Currently PCI hotplug works on top of resources which are usually reserved:
by BIOS, bootloader, firmware, or by the kernel (pci=hpmemsize=XM). These
resources are gaps in the address space where BARs of new devices may fit,
and extra bus number per port, so bridges can be hot-added. This series aim
the BARs problem: it shows the kernel how to redistribute them on the run,
so the hotplug becomes predictable and cross-platform. A follow-up patchset
will propose a solution for bus numbers. And another one -- for the powerpc
arch-specific problems.

To arrange a space for BARs of new hot-added devices, the kernel can pause
the drivers of working PCI devices and reshuffle the assigned BARs. When a
driver is un-paused by the kernel, it should ioremap() the new addresses of
its BARs.

Drivers indicate their support of the feature by implementing the new hooks
.bar_fixed(), .rescan_prepare(), .rescan_done() in the struct pci_driver.
If a driver doesn't yet support the feature, BARs of its devices are seen
as immovable and handled in the same way as resources with the PCI_FIXED
flag: they are guaranteed to remain untouched.

This approached was discussed during LPC 2020 [1].

The feature is usable not only for hotplug, but also for booting with a
complete set of BARs assigned, and for Resizable BARs.

Tested on a number of x86_64 machines without any special kernel command
line arguments (some of them -- with older v8 revision of this patchset):
 - PC: i7-5930K + ASUS X99-A;
 - PC: i5-8500 + ASUS Z370-F;
 - PC: AMD Ryzen 7 3700X + ASRock X570 + AMD 5700 XT (Resizable BARs);
 - Supermicro Super Server/H11SSL-i: AMD EPYC 7251;
 - HP ProLiant DL380 G5: Xeon X5460;
 - Dell Inspiron N5010: i5 M 480;
 - Dell Precision M6600: i7-2920XM.

Also tested on a Power8 (Vesnin) and Power9 (Nicole) ppc64le machines, but
with extra patchset, which is to be sent upstream a bit later.

The set contains:
 01-02: Independent bugfixes, both are not related directly to the movable
        BARs feature, but without them the rest of this series will not
	work as expected;

 03-14: Implement the essentials of the feature;

 15-21: Performance improvements for movable BARs and pciehp;

 22: Enable the feature by default;

 23-24: Add movable BARs support to nvme and portdrv;

 25-26: Debugging helpers.

This patchset is a part of our work on adding support for hot-adding
chains of chassis full of other bridges, NVME drives, SAS HBAs, GPUs, etc.
without special requirements such as Hot-Plug Controller, reservation of
bus numbers or memory regions by firmware, etc.

[1] https://linuxplumbersconf.org/event/7/contributions/847/attachments/584/1035/lpc2020_sergmir.pdf
    LPC 2020: VFIO/IOMMU/PCI MC: PCI hotplug: movable BARs and bus numbers

Changes since v8:
 - Fixed support of Resizable BARs, with amdgpu (no changes in the driver);
 - Added a new hook .bar_fixed(), so part of device's BARs can be movable,
   and this new call makes the API a bit cleaner in usage;
 - Memory reservation (via pci=hpmemsize, etc) now works as well;
 - Fix for a race condition (patch 01) has been simplified;
 - Simplified: the .realloc_range field is not needed anymore;
 - Added some debug messages;
 - Compilation warnings, NULL dereference fixed.

Changes since v7:
 - Added some documentation;
 - Replaced every occurrence of the word "immovable" with "fixed";
 - Don't touch PNP, ACPI resources anymore;
 - Replaced double rescan with triple rescan:
   * first try every BAR;
   * if that failed, retry without BARs which weren't assigned before;
   * if that failed, retry without BARs of hotplugged devices;
 - Reassign BARs during boot only if BIOS assigned not all requested BARs;
 - Fixed up PCIBIOS_MIN_MEM instead of ignoring it;
 - Now the feature auto-disables in presence of a transparent bridge;
 - Improved support of runtime PM;
 - Fixed issues with incorrectly released bridge windows;
 - Fixed calculating bridge window size.
 
Changes since v6:
 - Added a fix for hotplug on AMD Epyc + Supermicro H11SSL-i by ignoring
   PCIBIOS_MIN_MEM;
 - Fixed a workaround which marks VGA BARs as immovables;
 - Fixed misleading "can't claim BAR ... no compatible bridge window" error
   messages;
 - Refactored the code, reduced the amount of patches;
 - Exclude PowerPC-specific arch patches, they will be sent separately;
 - Disabled for PowerNV by default - waiting for the PCIPOCALYPSE patchset.
 - Fixed reports from the kbuild test robot.

Changes since v5:
 - Simplified the disable flag, now it is "pci=no_movable_buses";
 - More deliberate marking the BARs as immovable;
 - Mark as immovable BARs which are used by unbound drivers;
 - Ignoring BAR assignment by non-kernel program components, so the kernel
   is able now to distribute BARs in optimal and predictable way;
 - Move here PowerNV-specific patches from the older "powerpc/powernv/pci:
   Make hotplug self-sufficient, independent of FW and DT" series;
 - Fix EEH cache rebuilding and PE allocation for PowerNV during rescan.

Changes since v4:
 - Feature is enabled by default (turned on by one of the latest patches);
 - Add pci_dev_movable_bars_supported(dev) instead of marking the immovable
   BARs with the IORESOURCE_PCI_FIXED flag;
 - Set up PCIe bridges during rescan via sysfs, so MPS settings are now
   configured not only during system boot or pcihp events;
 - Allow movement of switch's BARs if claimed by portdrv;
 - Update EEH address caches after rescan for powerpc;
 - Don't disable completely hot-added devices which can't have BARs being
   fit - just disable their BARs, so they are still visible in lspci etc;
 - Clearer names: fixed_range_hard -> immovable_range, fixed_range_soft ->
   realloc_range;
 - Drop the patch for pci_restore_config_space() - fixed by properly using
   the runtime PM.

Changes since v3:
 - Rebased to the upstream, so the patches apply cleanly again.

Changes since v2:
 - Fixed double-assignment of bridge windows;
 - Fixed assignment of fixed prefetched resources;
 - Fixed releasing of fixed resources;
 - Fixed a debug message;
 - Removed auto-enabling the movable BARs for x86 - let's rely on the
   "pcie_movable_bars=force" option for now;
 - Reordered the patches - bugfixes first.

Changes since v1:
 - Add a "pcie_movable_bars={ off | force }" command line argument;
 - Handle the IORESOURCE_PCI_FIXED flag properly;
 - Don't move BARs of devices which don't support the feature;
 - Guarantee that new hotplugged devices will not steal memory from working
   devices by ignoring the failing new devices with the new PCI_DEV_IGNORE
   flag;
 - Add rescan_prepare()+rescan_done() to the struct pci_driver instead of
   using the reset_prepare()+reset_done() from struct pci_error_handlers;
 - Add a bugfix of a race condition;
 - Fixed hotplug in a non-pre-enabled (by BIOS/firmware) bridge;
 - Fix the compatibility of the feature with pm_runtime and D3-state;
 - Hotplug events from pciehp also can move BARs;
 - Add support of the feature to the NVME driver.

Sergei Miroshnichenko (26):
  PCI: Fix race condition in pci_enable/disable_device()
  PCI: Ensure a bridge has I/O and MEM access for hot-added devices
  PCI: hotplug: Initial support of the movable BARs feature
  PCI: Add version of release_child_resources() aware of fixed BARs
  PCI: hotplug: Fix reassigning the released BARs
  PCI: hotplug: Recalculate every bridge window during rescan
  PCI: hotplug: Don't allow hot-added devices to steal resources
  PCI: Reassign BARs if BIOS/bootloader had assigned not all of them
  PCI: Movable BARs: Make just a single attempt to assign bridge
    resources
  PCI: hotplug: Calculate fixed parts of bridge windows
  PCI: Include fixed BARs into the bus size calculating
  PCI: Make sure bridge windows include their fixed BARs
  PCI: hotplug: Add support of fixed BARs to pci_assign_resource()
  PCI: hotplug: Sort fixed BARs before assignment
  x86/PCI/ACPI: Fix up PCIBIOS_MIN_MEM if value computed from e820 is
    invalid
  PCI: hotplug: Configure MPS after manual bus rescan
  PCI: hotplug: Don't disable the released bridge windows immediately
  PCI: pciehp: Trigger a domain rescan on hp events when enabled movable
    BARs
  PCI: Don't claim fixed BARs
  PCI: hotplug: Retry bus assignment without reserved space
  PCI: Rescan the bus to resize a BAR
  PCI: hotplug: Enable the movable BARs feature by default
  PCI/portdrv: Declare support of movable BARs
  nvme-pci: Handle movable BARs
  PCI: Add a message for updating BARs
  resource: increase max nesting level for /proc/iomem

 Documentation/PCI/pci.rst                     |  63 +++
 .../admin-guide/kernel-parameters.txt         |   1 +
 arch/powerpc/platforms/powernv/pci.c          |   2 +
 arch/powerpc/platforms/pseries/setup.c        |   2 +
 arch/x86/pci/acpi.c                           |  15 +
 drivers/nvme/host/pci.c                       |  27 +-
 drivers/pci/bus.c                             |   2 +-
 drivers/pci/hotplug/pciehp_pci.c              |   5 +
 drivers/pci/iov.c                             |   2 +
 drivers/pci/pci.c                             |  24 ++
 drivers/pci/pci.h                             |  34 ++
 drivers/pci/pcie/portdrv_pci.c                |   7 +
 drivers/pci/probe.c                           | 401 +++++++++++++++++-
 drivers/pci/setup-bus.c                       | 227 +++++++---
 drivers/pci/setup-res.c                       | 105 ++++-
 include/linux/pci.h                           |  16 +
 kernel/resource.c                             |   2 +-
 17 files changed, 863 insertions(+), 72 deletions(-)


base-commit: a409ed156a90093a03fe6a93721ddf4c591eac87
-- 
2.24.1


^ permalink raw reply	[flat|nested] 44+ messages in thread

* [PATCH v9 01/26] PCI: Fix race condition in pci_enable/disable_device()
  2020-12-18 17:39 [PATCH v9 00/26] PCI: Allow BAR movement during boot and hotplug Sergei Miroshnichenko
@ 2020-12-18 17:39 ` Sergei Miroshnichenko
  2020-12-28 15:37   ` Sergei Miroshnichenko
  2020-12-18 17:39 ` [PATCH v9 02/26] PCI: Ensure a bridge has I/O and MEM access for hot-added devices Sergei Miroshnichenko
                   ` (25 subsequent siblings)
  26 siblings, 1 reply; 44+ messages in thread
From: Sergei Miroshnichenko @ 2020-12-18 17:39 UTC (permalink / raw)
  To: linux-pci
  Cc: Bjorn Helgaas, Lukas Wunner, Stefan Roese, Andy Lavr,
	Christian König, David Laight, Rajat Jain, linux,
	Sergei Miroshnichenko, Srinath Mannam, Marta Rybczynska

This is a yet another approach to fix an old [1-2] concurrency issue, when:
 - two or more devices are being hot-added into a bridge which was
   initially empty;
 - a bridge with two or more devices is being hot-added;
 - the BIOS/bootloader/firmware doesn't pre-enable bridges during boot;

The problem is that a bridge is reported as enabled before the MEM/IO bits
are actually written to the PCI_COMMAND register, so another driver thread
starts memory requests through the not-yet-enabled bridge:

 CPU0                                        CPU1

 pci_enable_device_mem()                     pci_enable_device_mem()
   pci_enable_bridge()                         pci_enable_bridge()
     pci_is_enabled()
       return false;
     atomic_inc_return(enable_cnt)
     Start actual enabling the bridge
     ...                                         pci_is_enabled()
     ...                                           return true;
     ...                                     Start memory requests <-- FAIL
     ...
     Set the PCI_COMMAND_MEMORY bit <-- Must wait for this

Protect the pci_enable_bridge(), similarly to the previous solution from
commit 40f11adc7cd9 ("PCI: Avoid race while enabling upstream bridges"),
but adding per-device mutexes.

To prevent false positives from the lockdep, use a lock_nested() with a
"depth" of a bridge within the PCI topology.

CC: Srinath Mannam <srinath.mannam@broadcom.com>
CC: Marta Rybczynska <mrybczyn@kalray.eu>
Signed-off-by: Sergei Miroshnichenko <s.miroshnichenko@yadro.com>

[1] https://lore.kernel.org/linux-pci/1501858648-22228-1-git-send-email-srinath.mannam@broadcom.com/T/#u
    [RFC PATCH v3] pci: Concurrency issue during pci enable bridge

[2] https://lore.kernel.org/linux-pci/744877924.5841545.1521630049567.JavaMail.zimbra@kalray.eu/T/#u
    [RFC PATCH] nvme: avoid race-conditions when enabling devices
---
 drivers/pci/pci.c   | 16 ++++++++++++++++
 drivers/pci/probe.c |  1 +
 include/linux/pci.h |  1 +
 3 files changed, 18 insertions(+)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index b9fecc25d213..076b908127fe 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -1844,11 +1844,25 @@ int pci_reenable_device(struct pci_dev *dev)
 }
 EXPORT_SYMBOL(pci_reenable_device);
 
+#ifdef CONFIG_PROVE_LOCKING
+static int pci_bridge_depth(struct pci_dev *dev)
+{
+	struct pci_dev *bridge = pci_upstream_bridge(dev);
+
+	if (!bridge)
+		return 0;
+
+	return 1 + pci_bridge_depth(bridge);
+}
+#endif /* CONFIG_PROVE_LOCKING */
+
 static void pci_enable_bridge(struct pci_dev *dev)
 {
 	struct pci_dev *bridge;
 	int retval;
 
+	mutex_lock_nested(&dev->enable_mutex, pci_bridge_depth(dev));
+
 	bridge = pci_upstream_bridge(dev);
 	if (bridge)
 		pci_enable_bridge(bridge);
@@ -1856,6 +1870,7 @@ static void pci_enable_bridge(struct pci_dev *dev)
 	if (pci_is_enabled(dev)) {
 		if (!dev->is_busmaster)
 			pci_set_master(dev);
+		mutex_unlock(&dev->enable_mutex);
 		return;
 	}
 
@@ -1864,6 +1879,7 @@ static void pci_enable_bridge(struct pci_dev *dev)
 		pci_err(dev, "Error enabling bridge (%d), continuing\n",
 			retval);
 	pci_set_master(dev);
+	mutex_unlock(&dev->enable_mutex);
 }
 
 static int pci_enable_device_flags(struct pci_dev *dev, unsigned long flags)
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 953f15abc850..2f9631287719 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -2240,6 +2240,7 @@ struct pci_dev *pci_alloc_dev(struct pci_bus *bus)
 	INIT_LIST_HEAD(&dev->bus_list);
 	dev->dev.type = &pci_dev_type;
 	dev->bus = pci_bus_get(bus);
+	mutex_init(&dev->enable_mutex);
 
 	return dev;
 }
diff --git a/include/linux/pci.h b/include/linux/pci.h
index b32126d26997..81d54889bd51 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -455,6 +455,7 @@ struct pci_dev {
 	unsigned int	no_command_memory:1;	/* No PCI_COMMAND_MEMORY */
 	pci_dev_flags_t dev_flags;
 	atomic_t	enable_cnt;	/* pci_enable_device has been called */
+	struct mutex	enable_mutex;
 
 	u32		saved_config_space[16]; /* Config space saved at suspend time */
 	struct hlist_head saved_cap_space;
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v9 02/26] PCI: Ensure a bridge has I/O and MEM access for hot-added devices
  2020-12-18 17:39 [PATCH v9 00/26] PCI: Allow BAR movement during boot and hotplug Sergei Miroshnichenko
  2020-12-18 17:39 ` [PATCH v9 01/26] PCI: Fix race condition in pci_enable/disable_device() Sergei Miroshnichenko
@ 2020-12-18 17:39 ` Sergei Miroshnichenko
  2020-12-18 17:39 ` [PATCH v9 03/26] PCI: hotplug: Initial support of the movable BARs feature Sergei Miroshnichenko
                   ` (24 subsequent siblings)
  26 siblings, 0 replies; 44+ messages in thread
From: Sergei Miroshnichenko @ 2020-12-18 17:39 UTC (permalink / raw)
  To: linux-pci
  Cc: Bjorn Helgaas, Lukas Wunner, Stefan Roese, Andy Lavr,
	Christian König, David Laight, Rajat Jain, linux,
	Sergei Miroshnichenko

During hot-adding a device, its bridge may be already pci_is_enabled(), but
without the I/O and/or MEM access bits, which may be required by this new
device: these bits are set during first enabling the bridge, and they must
be checked again.

When hot-adding to the following bridge:

  +-[0020:00]---00.0-[01-0d]----00.0-[02-0d]----04.0-[03-0d]--   <-   00.0

this patch sets up the MEM bit in the downstream port 0020:02:04.0, needed
for 0020:08:00.0:

  [ 1037.698206] pci 0020:00:00.0: PCI bridge to [bus 01-0d]
  [ 1037.698785] pci 0020:00:00.0:   bridge window [mem 0x3fe800000000-0x3fe8017fffff]
  [ 1037.698874] pci 0020:00:00.0:   bridge window [mem 0x240000000000-0x2400ffffffff 64bit pref]
  [ 1037.699002] pcieport 0020:02:04.0: enabling device (0545 -> 0547)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  [ 1037.699114] pcieport 0020:03:00.0: enabling device (0540 -> 0542)
  [ 1037.699198] pciehp 0020:04:09.0:pcie204: Slot #41 AttnBtn+ PwrCtrl+ MRL- AttnInd+ PwrInd+ HotPlug+ Surprise- Interlock- NoCompl- LLActRep+
  [ 1037.699285] pciehp 0020:04:09.0:pcie204: Slot(41): Card present
  [ 1037.699346] pciehp 0020:04:09.0:pcie204: Slot(41): Already enabled

Signed-off-by: Sergei Miroshnichenko <s.miroshnichenko@yadro.com>
---
 drivers/pci/pci.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 076b908127fe..b7bbc462a0b3 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -1868,6 +1868,10 @@ static void pci_enable_bridge(struct pci_dev *dev)
 		pci_enable_bridge(bridge);
 
 	if (pci_is_enabled(dev)) {
+		retval = pci_reenable_device(dev);
+		if (retval)
+			pci_err(dev, "Error reenabling bridge, continuing\n");
+
 		if (!dev->is_busmaster)
 			pci_set_master(dev);
 		mutex_unlock(&dev->enable_mutex);
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v9 03/26] PCI: hotplug: Initial support of the movable BARs feature
  2020-12-18 17:39 [PATCH v9 00/26] PCI: Allow BAR movement during boot and hotplug Sergei Miroshnichenko
  2020-12-18 17:39 ` [PATCH v9 01/26] PCI: Fix race condition in pci_enable/disable_device() Sergei Miroshnichenko
  2020-12-18 17:39 ` [PATCH v9 02/26] PCI: Ensure a bridge has I/O and MEM access for hot-added devices Sergei Miroshnichenko
@ 2020-12-18 17:39 ` Sergei Miroshnichenko
  2020-12-18 17:39 ` [PATCH v9 04/26] PCI: Add version of release_child_resources() aware of fixed BARs Sergei Miroshnichenko
                   ` (23 subsequent siblings)
  26 siblings, 0 replies; 44+ messages in thread
From: Sergei Miroshnichenko @ 2020-12-18 17:39 UTC (permalink / raw)
  To: linux-pci
  Cc: Bjorn Helgaas, Lukas Wunner, Stefan Roese, Andy Lavr,
	Christian König, David Laight, Rajat Jain, linux,
	Sergei Miroshnichenko

When hot-adding a device, the involved bridge may have windows not big
enough (or fragmented too much) for newly requested BARs to fit in. But
expanding these bridge windows may be impossible if they are wedged between
existing neighboring BARs.

Still, it may be possible to allocate a memory region for new BARs, if at
least some working BARs can be moved, using the following procedure:

1) Notify all the drivers which support movable BARs to pause and release
   the BARs; the rest of the drivers are guaranteed that their devices will
   not get BARs moved;

2) Release all the bridge windows and movable BARs;

3) Try to recalculate new bridge windows that will fit all the BAR types:
   - fixed (those marked with PCI_FIXED or bound by not-updated drivers);
   - movable;
   - newly requested by hot-added devices;

4) If that failed, disable BARs for one of the hot-added devices and repeat
   the step 3;

5) Notify the drivers, so they remap BARs and resume.

If bridge calculation and BAR assignment fail with hot-added devices, this
patchset disables their BARs, falling back to the same amount and size of
BARs as they were before the hotplug event. The kernel succeeded then, so
the same BAR layout will be reproduced again.

This makes the prior reservation of memory by BIOS/bootloader/firmware not
required anymore for the PCI hotplug.

Drivers indicate support of movable BARs by implementing a new .bar_fixed()
hook and two optional ones: .rescan_prepare() and .rescan_done() in the
struct pci_driver. All device's activity must be paused during a rescan,
and iounmap()+ioremap() must be applied to every used BAR.

If a device is not bound to a driver, its BARs are considered movable,
except VGA devices, due to sometimes indirect usage of their framebuffers.

For a higher probability of the successful BAR reassignment, all the BARs
and bridge windows should be released during a rescan, not only those with
higher addresses.

One example when it is needed, BAR(I) is moved to free a gap for the new
BAR(II):

Before:

    ==================== parent bridge window ===============
                   ---- hotplug bridge window ----
    |   BAR(I)    |   fixed BAR   |   fixed BAR   | fixed BAR |
        ^^^^^^                 ^
                               |
                           new BAR(II)

After:

    ==================== parent bridge window =========================
     ----------- hotplug bridge window -----------
    | new BAR(II) |   fixed BAR   |   fixed BAR   | fixed BAR | BAR(I)  |
      ^^^^^^^^^^^                                               ^^^^^^

Another example is a fragmented bridge window jammed between fixed BARs:

Before:

     ===================== parent bridge window ========================
                 ---------- hotplug bridge window ----------
    | fixed BAR |   | BAR(I) |    | BAR(II) |    | BAR(III) | fixed BAR |
                      ^^^^^^   ^    ^^^^^^^        ^^^^^^^^
                               |
                           new BAR(IV)

After:

     ==================== parent bridge window =========================
                 ---------- hotplug bridge window ----------
    | fixed BAR | BAR(I) | BAR(II) | BAR(III) | new BAR(IV) | fixed BAR |
                  ^^^^^^   ^^^^^^^   ^^^^^^^^   ^^^^^^^^^^^

This patch is a preparation for future patches with actual implementation,
and for now it just does the following:
 - declares the feature;
 - defines bool pci_can_move_bars and bool pci_dev_bar_fixed(dev, bar);
 - invokes the new .rescan_prepare() and .rescan_done() driver notifiers;
 - disables the feature for the powerpc (support will be added later, in
   another series).

The feature is disabled in this first patch of the series, until the actual
implementation if finalized by the following patches. It can be overridden
per-arch using the pci_can_move_bars=false flag or by the following command
line option:

    pci=no_movable_bars

Current approach doesn't support calculating windows for subtractive decode
bridges, so disable BAR movement if such bridge is present.

Signed-off-by: Sergei Miroshnichenko <s.miroshnichenko@yadro.com>
---
 Documentation/PCI/pci.rst                     |  63 +++++++++++
 .../admin-guide/kernel-parameters.txt         |   1 +
 arch/powerpc/platforms/powernv/pci.c          |   2 +
 arch/powerpc/platforms/pseries/setup.c        |   2 +
 drivers/pci/pci.c                             |   4 +
 drivers/pci/pci.h                             |   2 +
 drivers/pci/probe.c                           | 101 +++++++++++++++++-
 include/linux/pci.h                           |   8 ++
 8 files changed, 181 insertions(+), 2 deletions(-)

diff --git a/Documentation/PCI/pci.rst b/Documentation/PCI/pci.rst
index 814b40f8360b..48b729873ddb 100644
--- a/Documentation/PCI/pci.rst
+++ b/Documentation/PCI/pci.rst
@@ -575,3 +575,66 @@ handle the PCI master abort on all platforms if the PCI device is
 expected to not respond to a readl().  Most x86 platforms will allow
 MMIO reads to master abort (a.k.a. "Soft Fail") and return garbage
 (e.g. ~0). But many RISC platforms will crash (a.k.a."Hard Fail").
+
+
+Movable BARs
+============
+
+To increase the probability of finding a space for BARs of hot-added devices,
+the kernel requests the drivers to release used BARs, so they can be moved
+to free a gap for new BARs.
+
+This ability can be added to a driver by implementing the
+:c:type:`rescan_prepare()`, :c:type:`bar_fixed()` and :c:type:`rescan_done()`
+hooks from the :c:type:`struct pci_driver`.
+
+Before a PCI bus rescan the driver must pause its activity and unmap its
+BARs, here is an example of how the NVMe driver can perform this::
+
+    static struct pci_driver nvme_driver = {
+            ...
+            .bar_fixed      = nvme_bar_fixed,
+            .rescan_prepare = nvme_rescan_prepare,
+            .rescan_done    = nvme_rescan_done,
+    };
+
+    static bool nvme_bar_fixed(struct pci_dev *pdev, int resno)
+    {
+            return false;
+    }
+
+    static void nvme_rescan_prepare(struct pci_dev *pdev)
+    {
+            struct nvme_dev *dev = pci_get_drvdata(pdev);
+
+            nvme_dev_disable(dev, true);
+            nvme_dev_unmap(dev);
+            dev->bar = NULL;
+    }
+
+The NVMe driver uses a single BAR, which is movable.  After a PCI rescan,
+the driver must re-read new addresses of BARs, remap them and resume::
+
+    static void nvme_rescan_done(struct pci_dev *pdev)
+    {
+            struct nvme_dev *dev = pci_get_drvdata(pdev);
+
+            nvme_dev_map(dev);
+            nvme_reset_ctrl(&dev->ctrl);
+    }
+
+Currently there are no reliable way to determine if a driver uses BARs of
+its devices or not (their :c:type:`struct resource` don't always have a child),
+so if it doesn't explicitly support movable BARs, they are considered fixed.
+To let the PCI subsystem move unused BARs, a driver have to implement one hook::
+
+    static bool pcie_portdrv_bar_fixed(struct pci_dev *pdev, int resno)
+    {
+            return false;
+    }
+
+    static struct pci_driver pcie_portdriver = {
+    {
+            ...
+            .bar_fixed      = pcie_portdrv_bar_fixed,
+    }
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index d24336109c2e..ee215b0e231c 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -3788,6 +3788,7 @@
 		nomio		[S390] Do not use MIO instructions.
 		norid		[S390] ignore the RID field and force use of
 				one PCI domain per PCI function
+		no_movable_bars	Don't allow BARs to be moved during hotplug
 
 	pcie_aspm=	[PCIE] Forcibly enable or disable PCIe Active State Power
 			Management.
diff --git a/arch/powerpc/platforms/powernv/pci.c b/arch/powerpc/platforms/powernv/pci.c
index 9b9bca169275..6a1c31a4e2ca 100644
--- a/arch/powerpc/platforms/powernv/pci.c
+++ b/arch/powerpc/platforms/powernv/pci.c
@@ -890,6 +890,8 @@ void __init pnv_pci_init(void)
 {
 	struct device_node *np;
 
+	pci_can_move_bars = false;
+
 	pci_add_flags(PCI_CAN_SKIP_ISA_ALIGN);
 
 	/* If we don't have OPAL, eg. in sim, just skip PCI probe */
diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c
index 090c13f6c881..cf1804529b48 100644
--- a/arch/powerpc/platforms/pseries/setup.c
+++ b/arch/powerpc/platforms/pseries/setup.c
@@ -964,6 +964,8 @@ static void __init pseries_init(void)
 {
 	pr_debug(" -> pseries_init()\n");
 
+	pci_can_move_bars = false;
+
 #ifdef CONFIG_HVC_CONSOLE
 	if (firmware_has_feature(FW_FEATURE_LPAR))
 		hvc_vio_init_early();
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index b7bbc462a0b3..f393f0bc8ec4 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -76,6 +76,8 @@ static void pci_dev_d3_sleep(struct pci_dev *dev)
 int pci_domains_supported = 1;
 #endif
 
+bool pci_can_move_bars;
+
 #define DEFAULT_CARDBUS_IO_SIZE		(256)
 #define DEFAULT_CARDBUS_MEM_SIZE	(64*1024*1024)
 /* pci=cbmemsize=nnM,cbiosize=nn can override this */
@@ -6602,6 +6604,8 @@ static int __init pci_setup(char *str)
 				pci_add_flags(PCI_SCAN_ALL_PCIE_DEVS);
 			} else if (!strncmp(str, "disable_acs_redir=", 18)) {
 				disable_acs_redir_param = str + 18;
+			} else if (!strncmp(str, "no_movable_bars", 15)) {
+				pci_can_move_bars = false;
 			} else {
 				pr_err("PCI: Unknown option `%s'\n", str);
 			}
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 5c59365092fa..c962d0375074 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -273,6 +273,8 @@ void pci_disable_bridge_window(struct pci_dev *dev);
 struct pci_bus *pci_bus_get(struct pci_bus *bus);
 void pci_bus_put(struct pci_bus *bus);
 
+bool pci_dev_bar_fixed(struct pci_dev *dev, struct resource *res);
+
 /* PCIe link information from Link Capabilities 2 */
 #define PCIE_LNKCAP2_SLS2SPEED(lnkcap2) \
 	((lnkcap2) & PCI_EXP_LNKCAP2_SLS_64_0GB ? PCIE_SPEED_64_0GT : \
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 2f9631287719..17dd1fa4a05a 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -2473,6 +2473,11 @@ void pci_device_add(struct pci_dev *dev, struct pci_bus *bus)
 	/* Fix up broken headers */
 	pci_fixup_device(pci_fixup_header, dev);
 
+	if (dev->transparent && pci_can_move_bars) {
+		pci_info(dev, "Disable movable BARs in presence of a transparent bridge\n");
+		pci_can_move_bars = false;
+	}
+
 	pci_reassigndev_resource_alignment(dev);
 
 	dev->state_saved = false;
@@ -3007,6 +3012,7 @@ int pci_host_probe(struct pci_host_bridge *bridge)
 	 * or pci_bus_assign_resources().
 	 */
 	if (pci_has_flag(PCI_PROBE_ONLY)) {
+		pci_can_move_bars = false;
 		pci_bus_claim_resources(bus);
 	} else {
 		pci_bus_size_bridges(bus);
@@ -3200,6 +3206,83 @@ unsigned int pci_rescan_bus_bridge_resize(struct pci_dev *bridge)
 	return max;
 }
 
+bool pci_dev_bar_fixed(struct pci_dev *dev, struct resource *res)
+{
+	int resno = res - dev->resource;
+
+	/* Bridge windows are never fixed */
+	if (resno >= PCI_BRIDGE_RESOURCES)
+		return false;
+
+	if (res->flags & IORESOURCE_PCI_FIXED)
+		return true;
+
+	if (res->flags & IORESOURCE_UNSET)
+		return false;
+
+	if (!pci_can_move_bars)
+		return false;
+
+	if (dev->driver && dev->driver->bar_fixed)
+		return dev->driver->bar_fixed(dev, resno);
+
+	if (res->start &&
+	    !(res->flags & IORESOURCE_IO) &&
+	    (dev->class >> 8) == PCI_CLASS_DISPLAY_VGA)
+		return true;
+
+	if (!dev->driver && !res->child)
+		return false;
+
+	return true;
+}
+
+static void pci_bus_rescan_prepare(struct pci_bus *bus)
+{
+	struct pci_dev *dev;
+
+	if (bus->self) {
+		pci_config_pm_runtime_get(bus->self);
+		pm_runtime_get_sync(&bus->self->dev);
+	}
+
+	list_for_each_entry(dev, &bus->devices, bus_list) {
+		struct pci_bus *child = dev->subordinate;
+
+		if (child)
+			pci_bus_rescan_prepare(child);
+
+		if (dev->driver &&
+		    dev->driver->rescan_prepare)
+			dev->driver->rescan_prepare(dev);
+	}
+}
+
+static void pci_bus_rescan_done(struct pci_bus *bus)
+{
+	struct pci_dev *dev;
+
+	if (bus->self && !pci_dev_is_added(bus->self))
+		return;
+
+	list_for_each_entry(dev, &bus->devices, bus_list) {
+		struct pci_bus *child = dev->subordinate;
+
+		if (dev->driver &&
+		    dev->driver->rescan_done)
+			dev->driver->rescan_done(dev);
+
+		if (child)
+			pci_bus_rescan_done(child);
+	}
+
+	if (bus->self) {
+		pci_save_state(bus->self);
+		pm_runtime_put(&bus->self->dev);
+		pci_config_pm_runtime_put(bus->self);
+	}
+}
+
 /**
  * pci_rescan_bus - Scan a PCI bus for devices
  * @bus: PCI bus to scan
@@ -3212,9 +3295,23 @@ unsigned int pci_rescan_bus_bridge_resize(struct pci_dev *bridge)
 unsigned int pci_rescan_bus(struct pci_bus *bus)
 {
 	unsigned int max;
+	struct pci_bus *root = bus;
+
+	while (!pci_is_root_bus(root))
+		root = root->parent;
+
+	if (pci_can_move_bars) {
+		pci_bus_rescan_prepare(root);
+
+		max = pci_scan_child_bus(root);
+		pci_assign_unassigned_root_bus_resources(root);
+
+		pci_bus_rescan_done(root);
+	} else {
+		max = pci_scan_child_bus(bus);
+		pci_assign_unassigned_bus_resources(bus);
+	}
 
-	max = pci_scan_child_bus(bus);
-	pci_assign_unassigned_bus_resources(bus);
 	pci_bus_add_devices(bus);
 
 	return max;
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 81d54889bd51..8a7033b240f1 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -857,6 +857,9 @@ struct module;
  *		e.g. drivers/net/e100.c.
  * @sriov_configure: Optional driver callback to allow configuration of
  *		number of VFs to enable via sysfs "sriov_numvfs" file.
+ * @bar_fixed:	Returns true if the driver doesn't allow to move this BAR.
+ * @rescan_prepare: Prepare to BAR movement - called before PCI rescan.
+ * @rescan_done: Remap BARs and restore after PCI rescan.
  * @err_handler: See Documentation/PCI/pci-error-recovery.rst
  * @groups:	Sysfs attribute groups.
  * @driver:	Driver model structure.
@@ -872,6 +875,9 @@ struct pci_driver {
 	int  (*resume)(struct pci_dev *dev);	/* Device woken up */
 	void (*shutdown)(struct pci_dev *dev);
 	int  (*sriov_configure)(struct pci_dev *dev, int num_vfs); /* On PF */
+	bool (*bar_fixed)(struct pci_dev *dev, int resno);
+	void (*rescan_prepare)(struct pci_dev *dev);
+	void (*rescan_done)(struct pci_dev *dev);
 	const struct pci_error_handlers *err_handler;
 	const struct attribute_group **groups;
 	struct device_driver	driver;
@@ -1426,6 +1432,8 @@ void pci_setup_bridge(struct pci_bus *bus);
 resource_size_t pcibios_window_alignment(struct pci_bus *bus,
 					 unsigned long type);
 
+extern bool pci_can_move_bars;
+
 #define PCI_VGA_STATE_CHANGE_BRIDGE (1 << 0)
 #define PCI_VGA_STATE_CHANGE_DECODES (1 << 1)
 
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v9 04/26] PCI: Add version of release_child_resources() aware of fixed BARs
  2020-12-18 17:39 [PATCH v9 00/26] PCI: Allow BAR movement during boot and hotplug Sergei Miroshnichenko
                   ` (2 preceding siblings ...)
  2020-12-18 17:39 ` [PATCH v9 03/26] PCI: hotplug: Initial support of the movable BARs feature Sergei Miroshnichenko
@ 2020-12-18 17:39 ` Sergei Miroshnichenko
  2020-12-18 17:39 ` [PATCH v9 05/26] PCI: hotplug: Fix reassigning the released BARs Sergei Miroshnichenko
                   ` (22 subsequent siblings)
  26 siblings, 0 replies; 44+ messages in thread
From: Sergei Miroshnichenko @ 2020-12-18 17:39 UTC (permalink / raw)
  To: linux-pci
  Cc: Bjorn Helgaas, Lukas Wunner, Stefan Roese, Andy Lavr,
	Christian König, David Laight, Rajat Jain, linux,
	Sergei Miroshnichenko

When release_child_resources() is applied to a bridge window, it zeroes the
.start field of children BARs, but the STARTALIGN flag remains set, leaving
the resource in a state not valid for later re-assignment. So replace this
flag with SIZEALIGN.

Fixed BARs must preserve their offset and size: those marked with PCI_FIXED
or bound by drivers without support of the movable BARs feature.

Add the pci_release_child_resources() to replace release_child_resources()
in handling the described PCI-specific cases.

Signed-off-by: Sergei Miroshnichenko <s.miroshnichenko@yadro.com>
---
 drivers/pci/setup-bus.c | 53 ++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 52 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index 2ce636937c6e..bc342eaa98e8 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -1520,6 +1520,56 @@ static void __pci_bridge_assign_resources(const struct pci_dev *bridge,
 	(IORESOURCE_IO | IORESOURCE_MEM | IORESOURCE_PREFETCH |\
 	 IORESOURCE_MEM_64)
 
+/*
+ * Similar to generic release_child_resources(), but aware of fixed BARs and the
+ * STARTALIGN flag.
+ */
+static void pci_release_child_resources(struct pci_bus *bus, struct resource *r)
+{
+	struct pci_dev *dev;
+
+	if (!pci_can_move_bars)
+		return release_child_resources(r);
+
+	if (!bus || !r)
+		return;
+
+	r->child = NULL;
+
+	list_for_each_entry(dev, &bus->devices, bus_list) {
+		int i;
+
+		for (i = 0; i < PCI_NUM_RESOURCES; i++) {
+			struct resource *tmp = &dev->resource[i];
+			resource_size_t size = resource_size(tmp);
+
+			if (!tmp->flags || !tmp->parent)
+				continue;
+
+			tmp->parent = NULL;
+			tmp->sibling = NULL;
+
+			pci_release_child_resources(dev->subordinate, tmp);
+
+			tmp->flags &= ~IORESOURCE_STARTALIGN;
+			tmp->flags |= IORESOURCE_SIZEALIGN;
+
+			if (pci_dev_bar_fixed(dev, tmp)) {
+				pci_dbg(dev, "release fixed BAR %d %pR (%s), keep its flags, base and size\n",
+					i, tmp, tmp->name);
+				continue;
+			}
+
+			pci_dbg(dev, "release BAR %d %pR (%s)\n", i, tmp, tmp->name);
+
+			if (i >= PCI_BRIDGE_RESOURCES)
+				tmp->flags = 0;
+			tmp->start = 0;
+			tmp->end = size - 1;
+		}
+	}
+}
+
 static void pci_bridge_release_resources(struct pci_bus *bus,
 					 unsigned long type)
 {
@@ -1560,7 +1610,8 @@ static void pci_bridge_release_resources(struct pci_bus *bus,
 		return;
 
 	/* If there are children, release them all */
-	release_child_resources(r);
+	pci_release_child_resources(bus, r);
+
 	if (!release_resource(r)) {
 		type = old_flags = r->flags & PCI_RES_TYPE_MASK;
 		pci_info(dev, "resource %d %pR released\n",
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v9 05/26] PCI: hotplug: Fix reassigning the released BARs
  2020-12-18 17:39 [PATCH v9 00/26] PCI: Allow BAR movement during boot and hotplug Sergei Miroshnichenko
                   ` (3 preceding siblings ...)
  2020-12-18 17:39 ` [PATCH v9 04/26] PCI: Add version of release_child_resources() aware of fixed BARs Sergei Miroshnichenko
@ 2020-12-18 17:39 ` Sergei Miroshnichenko
  2020-12-18 17:39 ` [PATCH v9 06/26] PCI: hotplug: Recalculate every bridge window during rescan Sergei Miroshnichenko
                   ` (21 subsequent siblings)
  26 siblings, 0 replies; 44+ messages in thread
From: Sergei Miroshnichenko @ 2020-12-18 17:39 UTC (permalink / raw)
  To: linux-pci
  Cc: Bjorn Helgaas, Lukas Wunner, Stefan Roese, Andy Lavr,
	Christian König, David Laight, Rajat Jain, linux,
	Sergei Miroshnichenko

Bridge windows are temporarily released during a PCI rescan, and their old
sizes are not relevant anymore as they will be recreated in pbus_size_*()
from scratch. So don't keep it, but instead set to zero after release.

If BAR assignment fails after a PCI hotplug event, the kernel will retry
with a fall-back reduced configuration, so don't apply reset_resource() for
non-window BARs to keep them valid for the next attempt.

Signed-off-by: Sergei Miroshnichenko <s.miroshnichenko@yadro.com>
---
 drivers/pci/setup-bus.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index bc342eaa98e8..a956f0179064 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -297,7 +297,9 @@ static void assign_requested_resources_sorted(struct list_head *head,
 						    0 /* don't care */,
 						    0 /* don't care */);
 			}
-			reset_resource(res);
+			if (!pci_can_move_bars ||
+			    idx >= PCI_BRIDGE_RESOURCES)
+				reset_resource(res);
 		}
 	}
 }
@@ -1616,8 +1618,8 @@ static void pci_bridge_release_resources(struct pci_bus *bus,
 		type = old_flags = r->flags & PCI_RES_TYPE_MASK;
 		pci_info(dev, "resource %d %pR released\n",
 			 PCI_BRIDGE_RESOURCES + idx, r);
-		/* Keep the old size */
-		r->end = resource_size(r) - 1;
+		/* Don't keep the old size if the bridge will be recalculated */
+		r->end = pci_can_move_bars ? 0 : (resource_size(r) - 1);
 		r->start = 0;
 		r->flags = 0;
 
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v9 06/26] PCI: hotplug: Recalculate every bridge window during rescan
  2020-12-18 17:39 [PATCH v9 00/26] PCI: Allow BAR movement during boot and hotplug Sergei Miroshnichenko
                   ` (4 preceding siblings ...)
  2020-12-18 17:39 ` [PATCH v9 05/26] PCI: hotplug: Fix reassigning the released BARs Sergei Miroshnichenko
@ 2020-12-18 17:39 ` Sergei Miroshnichenko
  2020-12-18 17:39 ` [PATCH v9 07/26] PCI: hotplug: Don't allow hot-added devices to steal resources Sergei Miroshnichenko
                   ` (20 subsequent siblings)
  26 siblings, 0 replies; 44+ messages in thread
From: Sergei Miroshnichenko @ 2020-12-18 17:39 UTC (permalink / raw)
  To: linux-pci
  Cc: Bjorn Helgaas, Lukas Wunner, Stefan Roese, Andy Lavr,
	Christian König, David Laight, Rajat Jain, linux,
	Sergei Miroshnichenko

When the movable BARs feature is enabled and a rescan has been requested,
release all bridge windows and recalculate them from scratch, taking into
account all kinds of BARs: fixed, movable, new.

Comparing to simply trying to expand bridge windows, this also employs the
PCI ability to shuffle BARs within a bridge, increasing the chances to find
a memory space to fit BARs of newly hot-added devices, especially if no (or
not enough) gaps were reserved.

The last step of writing the recalculated windows to the bridges is done by
the new pci_setup_bridges() function.

Signed-off-by: Sergei Miroshnichenko <s.miroshnichenko@yadro.com>
---
 drivers/pci/pci.h       |  1 +
 drivers/pci/probe.c     | 22 ++++++++++++++++++++++
 drivers/pci/setup-bus.c |  9 +++++++++
 3 files changed, 32 insertions(+)

diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index c962d0375074..dc7f40b42fa7 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -267,6 +267,7 @@ void __pci_bus_assign_resources(const struct pci_bus *bus,
 				struct list_head *realloc_head,
 				struct list_head *fail_head);
 bool pci_bus_clip_resource(struct pci_dev *dev, int idx);
+void pci_bus_release_root_bridge_resources(struct pci_bus *bus);
 
 void pci_reassigndev_resource_alignment(struct pci_dev *dev);
 void pci_disable_bridge_window(struct pci_dev *dev);
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 17dd1fa4a05a..47d28761339b 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -3283,6 +3283,25 @@ static void pci_bus_rescan_done(struct pci_bus *bus)
 	}
 }
 
+static void pci_setup_bridges(struct pci_bus *bus)
+{
+	struct pci_dev *dev;
+
+	list_for_each_entry(dev, &bus->devices, bus_list) {
+		struct pci_bus *child;
+
+		if (!pci_dev_is_added(dev))
+			continue;
+
+		child = dev->subordinate;
+		if (child)
+			pci_setup_bridges(child);
+	}
+
+	if (bus->self)
+		pci_setup_bridge(bus);
+}
+
 /**
  * pci_rescan_bus - Scan a PCI bus for devices
  * @bus: PCI bus to scan
@@ -3304,8 +3323,11 @@ unsigned int pci_rescan_bus(struct pci_bus *bus)
 		pci_bus_rescan_prepare(root);
 
 		max = pci_scan_child_bus(root);
+
+		pci_bus_release_root_bridge_resources(root);
 		pci_assign_unassigned_root_bus_resources(root);
 
+		pci_setup_bridges(root);
 		pci_bus_rescan_done(root);
 	} else {
 		max = pci_scan_child_bus(bus);
diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index a956f0179064..9eb982196422 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -1673,6 +1673,15 @@ static void pci_bus_release_bridge_resources(struct pci_bus *bus,
 		pci_bridge_release_resources(bus, type);
 }
 
+void pci_bus_release_root_bridge_resources(struct pci_bus *root_bus)
+{
+	pci_bus_release_bridge_resources(root_bus, IORESOURCE_IO, whole_subtree);
+	pci_bus_release_bridge_resources(root_bus, IORESOURCE_MEM, whole_subtree);
+	pci_bus_release_bridge_resources(root_bus,
+					 IORESOURCE_MEM_64 | IORESOURCE_PREFETCH,
+					 whole_subtree);
+}
+
 static void pci_bus_dump_res(struct pci_bus *bus)
 {
 	struct resource *res;
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v9 07/26] PCI: hotplug: Don't allow hot-added devices to steal resources
  2020-12-18 17:39 [PATCH v9 00/26] PCI: Allow BAR movement during boot and hotplug Sergei Miroshnichenko
                   ` (5 preceding siblings ...)
  2020-12-18 17:39 ` [PATCH v9 06/26] PCI: hotplug: Recalculate every bridge window during rescan Sergei Miroshnichenko
@ 2020-12-18 17:39 ` Sergei Miroshnichenko
  2020-12-18 17:39 ` [PATCH v9 08/26] PCI: Reassign BARs if BIOS/bootloader had assigned not all of them Sergei Miroshnichenko
                   ` (19 subsequent siblings)
  26 siblings, 0 replies; 44+ messages in thread
From: Sergei Miroshnichenko @ 2020-12-18 17:39 UTC (permalink / raw)
  To: linux-pci
  Cc: Bjorn Helgaas, Lukas Wunner, Stefan Roese, Andy Lavr,
	Christian König, David Laight, Rajat Jain, linux,
	Sergei Miroshnichenko

When movable BARs are enabled, the PCI subsystem at first releases all the
bridge windows and then attempts to assign resources both to previously
working devices and to the newly hot-added ones, with the same priority.

If a hot-added device gets its BARs first, this may lead to lack of space
for already working devices, which is unacceptable. If that happens, mark
one of the new devices with the newly introduced flag PCI_DEV_DISABLED_BARS
(if it is not yet marked) and retry the BAR recalculation.

The worst case would be no BARs for hot-added devices, while all the rest
just continue working.

The algorithm is simple and it doesn't retry different subsets of hot-added
devices in case of a failure, e.g. if there are no space to allocate BARs
for both hot-added devices A and B, but is enough for just A, the A will be
marked with PCI_DEV_DISABLED_BARS first, then (after the next failure) - B.
As a result, A will not get BARs while it could. This issue is only
relevant when hot-adding two and more devices simultaneously.

Add a new res_mask bitmask to the struct pci_dev for storing the indices of
assigned BARs.

When preparing to the next rescan, all PCI_DEV_DISABLED_BARS marks are
unset, so the kernel can retry to assign previously failed BARs.

Before a rescan, some working devices may have assigned only part of their
BARs - for example, if BIOS didn't allocate them. With this patch, the
kernel assigns BARs in three steps:
  - first try every BAR, even those that weren't assigned before;
  - if that fails, retry without those failed BARs;
  - if that fails, retry without one of hotplugged devices.

Signed-off-by: Sergei Miroshnichenko <s.miroshnichenko@yadro.com>
---
 drivers/pci/pci.h       |   3 +
 drivers/pci/probe.c     | 177 +++++++++++++++++++++++++++++++++++++++-
 drivers/pci/setup-bus.c |   6 +-
 drivers/pci/setup-res.c |   2 +
 include/linux/pci.h     |   1 +
 5 files changed, 186 insertions(+), 3 deletions(-)

diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index dc7f40b42fa7..1668b1f45133 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -275,6 +275,8 @@ struct pci_bus *pci_bus_get(struct pci_bus *bus);
 void pci_bus_put(struct pci_bus *bus);
 
 bool pci_dev_bar_fixed(struct pci_dev *dev, struct resource *res);
+bool pci_dev_bar_enabled(const struct pci_dev *dev, int idx);
+bool pci_bus_check_bars_assigned(struct pci_bus *bus, bool complete_set);
 
 /* PCIe link information from Link Capabilities 2 */
 #define PCIE_LNKCAP2_SLS2SPEED(lnkcap2) \
@@ -395,6 +397,7 @@ static inline bool pci_dev_is_disconnected(const struct pci_dev *dev)
 
 /* pci_dev priv_flags */
 #define PCI_DEV_ADDED 0
+#define PCI_DEV_DISABLED_BARS 1
 
 static inline void pci_dev_assign_added(struct pci_dev *dev, bool added)
 {
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 47d28761339b..294e8f262c7f 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -35,6 +35,13 @@ static struct resource busn_resource = {
 LIST_HEAD(pci_root_buses);
 EXPORT_SYMBOL(pci_root_buses);
 
+/*
+ * This flag is used during pci_rescan_bus(), protected by pci_rescan_remove_lock:
+ * it indicates which BARs should be reassigned: every one, or only those which
+ * were assigned before the rescan.
+ */
+static bool pci_try_failed_bars = true;
+
 static LIST_HEAD(pci_domain_busn_res_list);
 
 struct pci_domain_busn_res {
@@ -43,6 +50,41 @@ struct pci_domain_busn_res {
 	int domain_nr;
 };
 
+static void pci_dev_disable_bars(struct pci_dev *dev)
+{
+	assign_bit(PCI_DEV_DISABLED_BARS, &dev->priv_flags, true);
+}
+
+static void pci_dev_enable_bars(struct pci_dev *dev)
+{
+	assign_bit(PCI_DEV_DISABLED_BARS, &dev->priv_flags, false);
+}
+
+static bool pci_dev_bars_enabled(const struct pci_dev *dev)
+{
+	if (pci_try_failed_bars)
+		return true;
+
+	return !(test_bit(PCI_DEV_DISABLED_BARS, &dev->priv_flags));
+}
+
+bool pci_dev_bar_enabled(const struct pci_dev *dev, int idx)
+{
+	if (idx >= PCI_BRIDGE_RESOURCES)
+		return true;
+
+	if (pci_try_failed_bars)
+		return true;
+
+	if (test_bit(PCI_DEV_DISABLED_BARS, &dev->priv_flags))
+		return false;
+
+	if (!pci_dev_is_added(dev))
+		return true;
+
+	return dev->res_mask & (1 << idx);
+}
+
 static struct resource *get_pci_domain_busn_res(int domain_nr)
 {
 	struct pci_domain_busn_res *r;
@@ -3237,6 +3279,24 @@ bool pci_dev_bar_fixed(struct pci_dev *dev, struct resource *res)
 	return true;
 }
 
+static unsigned int pci_dev_count_res_mask(struct pci_dev *dev)
+{
+	unsigned int res_mask = 0;
+	int i;
+
+	for (i = 0; i < PCI_BRIDGE_RESOURCES; i++) {
+		struct resource *r = &dev->resource[i];
+
+		if (!r->flags || !r->parent ||
+		    (r->flags & IORESOURCE_UNSET))
+			continue;
+
+		res_mask |= (1 << i);
+	}
+
+	return res_mask;
+}
+
 static void pci_bus_rescan_prepare(struct pci_bus *bus)
 {
 	struct pci_dev *dev;
@@ -3249,6 +3309,9 @@ static void pci_bus_rescan_prepare(struct pci_bus *bus)
 	list_for_each_entry(dev, &bus->devices, bus_list) {
 		struct pci_bus *child = dev->subordinate;
 
+		dev->res_mask = pci_dev_count_res_mask(dev);
+		pci_dev_enable_bars(dev);
+
 		if (child)
 			pci_bus_rescan_prepare(child);
 
@@ -3302,6 +3365,118 @@ static void pci_setup_bridges(struct pci_bus *bus)
 		pci_setup_bridge(bus);
 }
 
+static struct pci_dev *pci_find_next_new_device(struct pci_bus *bus)
+{
+	struct pci_dev *dev;
+
+	if (!bus)
+		return NULL;
+
+	list_for_each_entry(dev, &bus->devices, bus_list) {
+		struct pci_bus *child_bus = dev->subordinate;
+
+		if (child_bus) {
+			struct pci_dev *next_new_dev;
+
+			next_new_dev = pci_find_next_new_device(child_bus);
+			if (next_new_dev)
+				return next_new_dev;
+		}
+
+		if (!pci_dev_is_added(dev) && pci_dev_bars_enabled(dev))
+			return dev;
+	}
+
+	return NULL;
+}
+
+/**
+ * pci_bus_check_bars_assigned - check BARs under the bridge
+ * @bus: Parent PCI bus
+ * @complete_set: check every BAR, otherwise only those assigned before
+ *
+ * Returns true if every BAR is assigned.
+ */
+bool pci_bus_check_bars_assigned(struct pci_bus *bus, bool complete_set)
+{
+	struct pci_dev *dev;
+	bool good = true;
+
+	if (!bus)
+		return false;
+
+	list_for_each_entry(dev, &bus->devices, bus_list) {
+		struct pci_bus *child = dev->subordinate;
+
+		if (complete_set) {
+			int i;
+
+			for (i = 0; i < PCI_BRIDGE_RESOURCES; ++i) {
+				struct resource *r = &dev->resource[i];
+
+				if (!(r->flags & IORESOURCE_UNSET))
+					continue;
+
+				pci_warn(dev, "BAR %d: requested but not assigned: %pR\n",
+					 i, r);
+				good = false;
+			}
+		} else {
+			unsigned int res_mask;
+
+			if (!pci_dev_bars_enabled(dev))
+				continue;
+
+			res_mask = pci_dev_count_res_mask(dev);
+
+			if (dev->res_mask & ~res_mask) {
+				pci_err(dev, "Non-re-enabled resources found: 0x%x -> 0x%x\n",
+					dev->res_mask, res_mask);
+				good = false;
+			}
+		}
+
+		if (child && !pci_bus_check_bars_assigned(child, complete_set))
+			good = false;
+	}
+
+	return good;
+}
+
+static void pci_reassign_root_bus_resources(struct pci_bus *root)
+{
+	do {
+		struct pci_dev *next_new_dev;
+
+		pci_assign_unassigned_root_bus_resources(root);
+
+		if (pci_bus_check_bars_assigned(root, pci_try_failed_bars))
+			break;
+
+		if (pci_try_failed_bars) {
+			dev_warn(&root->dev, "failed to assign all BARs, retry without those failed before\n");
+
+			pci_bus_release_root_bridge_resources(root);
+			pci_try_failed_bars = false;
+			continue;
+		}
+
+		next_new_dev = pci_find_next_new_device(root);
+		if (!next_new_dev) {
+			dev_err(&root->dev, "failed to reassign BARs even after ignoring all the hot-added devices, reload the kernel with pci=no_movable_bars\n");
+			break;
+		}
+
+		dev_warn(&root->dev, "failed to reassign BARs, disable the next hot-added device %s and retry\n",
+			 dev_name(&next_new_dev->dev));
+
+		pci_dev_disable_bars(next_new_dev);
+		pci_bus_release_root_bridge_resources(root);
+	} while (true);
+
+	pci_try_failed_bars = true;
+}
+
 /**
  * pci_rescan_bus - Scan a PCI bus for devices
  * @bus: PCI bus to scan
@@ -3325,7 +3500,7 @@ unsigned int pci_rescan_bus(struct pci_bus *bus)
 		max = pci_scan_child_bus(root);
 
 		pci_bus_release_root_bridge_resources(root);
-		pci_assign_unassigned_root_bus_resources(root);
+		pci_reassign_root_bus_resources(root);
 
 		pci_setup_bridges(root);
 		pci_bus_rescan_done(root);
diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index 9eb982196422..e2e253815454 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -141,7 +141,7 @@ static void pdev_sort_resources(struct pci_dev *dev, struct list_head *head)
 		if (r->flags & IORESOURCE_PCI_FIXED)
 			continue;
 
-		if (!(r->flags) || r->parent)
+		if (!(r->flags) || r->parent || !pci_dev_bar_enabled(dev, i))
 			continue;
 
 		r_align = pci_resource_alignment(dev, r);
@@ -903,7 +903,8 @@ static void pbus_size_io(struct pci_bus *bus, resource_size_t min_size,
 			struct resource *r = &dev->resource[i];
 			unsigned long r_size;
 
-			if (r->parent || !(r->flags & IORESOURCE_IO))
+			if (r->parent || !(r->flags & IORESOURCE_IO) ||
+			    !pci_dev_bar_enabled(dev, i))
 				continue;
 			r_size = resource_size(r);
 
@@ -1023,6 +1024,7 @@ static int pbus_size_mem(struct pci_bus *bus, unsigned long mask,
 			resource_size_t r_size;
 
 			if (r->parent || (r->flags & IORESOURCE_PCI_FIXED) ||
+			    !pci_dev_bar_enabled(dev, i) ||
 			    ((r->flags & mask) != type &&
 			     (r->flags & mask) != type2 &&
 			     (r->flags & mask) != type3))
diff --git a/drivers/pci/setup-res.c b/drivers/pci/setup-res.c
index 43eda101fcf4..432f3b084f94 100644
--- a/drivers/pci/setup-res.c
+++ b/drivers/pci/setup-res.c
@@ -474,6 +474,8 @@ int pci_enable_resources(struct pci_dev *dev, int mask)
 		if ((i == PCI_ROM_RESOURCE) &&
 				(!(r->flags & IORESOURCE_ROM_ENABLE)))
 			continue;
+		if (!pci_dev_bar_enabled(dev, i))
+			continue;
 
 		if (r->flags & IORESOURCE_UNSET) {
 			pci_err(dev, "can't enable device: BAR %d %pR not assigned\n",
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 8a7033b240f1..29310f026eb7 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -401,6 +401,7 @@ struct pci_dev {
 	 */
 	unsigned int	irq;
 	struct resource resource[DEVICE_COUNT_RESOURCE]; /* I/O and memory regions + expansion ROMs */
+	unsigned int	res_mask;		/* Bitmask of assigned resources */
 
 	bool		match_driver;		/* Skip attaching driver */
 
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v9 08/26] PCI: Reassign BARs if BIOS/bootloader had assigned not all of them
  2020-12-18 17:39 [PATCH v9 00/26] PCI: Allow BAR movement during boot and hotplug Sergei Miroshnichenko
                   ` (6 preceding siblings ...)
  2020-12-18 17:39 ` [PATCH v9 07/26] PCI: hotplug: Don't allow hot-added devices to steal resources Sergei Miroshnichenko
@ 2020-12-18 17:39 ` Sergei Miroshnichenko
  2020-12-18 17:39 ` [PATCH v9 09/26] PCI: Movable BARs: Make just a single attempt to assign bridge resources Sergei Miroshnichenko
                   ` (18 subsequent siblings)
  26 siblings, 0 replies; 44+ messages in thread
From: Sergei Miroshnichenko @ 2020-12-18 17:39 UTC (permalink / raw)
  To: linux-pci
  Cc: Bjorn Helgaas, Lukas Wunner, Stefan Roese, Andy Lavr,
	Christian König, David Laight, Rajat Jain, linux,
	Sergei Miroshnichenko

Some BIOSes don't allocate all requested BARs, leaving some (for example,
SR_IOV) unassigned, without gaps for bridge windows to extend.

If that happens, let the kernel use its own methods of BAR allocating on an
early init stage, when drivers aren't yet bound to their devices, and it is
safe to shuffle BARs that are not yet used.

If the reassignment fails, retry without BARs omitted by BIOS, they have
the IORESOURCE_UNSET flag being set. To use this property, a new bool was
introduced: pci_init_done.

Signed-off-by: Sergei Miroshnichenko <s.miroshnichenko@yadro.com>
---
 drivers/pci/pci.h       | 2 ++
 drivers/pci/probe.c     | 3 ++-
 drivers/pci/setup-bus.c | 7 +++++++
 3 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 1668b1f45133..62c3eb146348 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -278,6 +278,8 @@ bool pci_dev_bar_fixed(struct pci_dev *dev, struct resource *res);
 bool pci_dev_bar_enabled(const struct pci_dev *dev, int idx);
 bool pci_bus_check_bars_assigned(struct pci_bus *bus, bool complete_set);
 
+extern bool pci_init_done;
+
 /* PCIe link information from Link Capabilities 2 */
 #define PCIE_LNKCAP2_SLS2SPEED(lnkcap2) \
 	((lnkcap2) & PCI_EXP_LNKCAP2_SLS_64_0GB ? PCIE_SPEED_64_0GT : \
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 294e8f262c7f..f6e2216ce996 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -41,6 +41,7 @@ EXPORT_SYMBOL(pci_root_buses);
  * were assigned before the rescan.
  */
 static bool pci_try_failed_bars = true;
+bool pci_init_done;
 
 static LIST_HEAD(pci_domain_busn_res_list);
 
@@ -3288,7 +3289,7 @@ static unsigned int pci_dev_count_res_mask(struct pci_dev *dev)
 		struct resource *r = &dev->resource[i];
 
 		if (!r->flags || !r->parent ||
-		    (r->flags & IORESOURCE_UNSET))
+		    (pci_init_done && (r->flags & IORESOURCE_UNSET)))
 			continue;
 
 		res_mask |= (1 << i);
diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index e2e253815454..a69415348684 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -1919,7 +1919,14 @@ void __init pci_assign_unassigned_resources(void)
 		/* Make sure the root bridge has a companion ACPI device */
 		if (ACPI_HANDLE(root_bus->bridge))
 			acpi_ioapic_add(ACPI_HANDLE(root_bus->bridge));
+
+		if (pci_can_move_bars && !pci_bus_check_bars_assigned(root_bus, true)) {
+			dev_err(&root_bus->dev, "Not all requested BARs are assigned, triggering a rescan with movable BARs");
+			pci_rescan_bus(root_bus);
+		}
 	}
+
+	pci_init_done = true;
 }
 
 static void adjust_bridge_window(struct pci_dev *bridge, struct resource *res,
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v9 09/26] PCI: Movable BARs: Make just a single attempt to assign bridge resources
  2020-12-18 17:39 [PATCH v9 00/26] PCI: Allow BAR movement during boot and hotplug Sergei Miroshnichenko
                   ` (7 preceding siblings ...)
  2020-12-18 17:39 ` [PATCH v9 08/26] PCI: Reassign BARs if BIOS/bootloader had assigned not all of them Sergei Miroshnichenko
@ 2020-12-18 17:39 ` Sergei Miroshnichenko
  2020-12-18 17:39 ` [PATCH v9 10/26] PCI: hotplug: Calculate fixed parts of bridge windows Sergei Miroshnichenko
                   ` (17 subsequent siblings)
  26 siblings, 0 replies; 44+ messages in thread
From: Sergei Miroshnichenko @ 2020-12-18 17:39 UTC (permalink / raw)
  To: linux-pci
  Cc: Bjorn Helgaas, Lukas Wunner, Stefan Roese, Andy Lavr,
	Christian König, David Laight, Rajat Jain, linux,
	Sergei Miroshnichenko

With enabled BAR movement, all BARs and bridge windows are released at the
beginning of a reassign procedure, not just part of them. In this case only
one pass is enough, no need for several traverses.

In case of failures the pci_reassign_root_bus_resources() disables BARs for
one of the hot-added devices and retries the single-step assignment.

Signed-off-by: Sergei Miroshnichenko <s.miroshnichenko@yadro.com>
---
 drivers/pci/setup-bus.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index a69415348684..640090046ffd 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -1826,6 +1826,13 @@ void pci_assign_unassigned_root_bus_resources(struct pci_bus *bus)
 	int pci_try_num = 1;
 	enum enable_type enable_local;
 
+	if (pci_can_move_bars) {
+		__pci_bus_size_bridges(bus, NULL);
+		__pci_bus_assign_resources(bus, NULL, NULL);
+
+		goto dump;
+	}
+
 	/* Don't realloc if asked to do so */
 	enable_local = pci_realloc_detect(bus, pci_realloc_enable);
 	if (pci_realloc_enabled(enable_local)) {
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v9 10/26] PCI: hotplug: Calculate fixed parts of bridge windows
  2020-12-18 17:39 [PATCH v9 00/26] PCI: Allow BAR movement during boot and hotplug Sergei Miroshnichenko
                   ` (8 preceding siblings ...)
  2020-12-18 17:39 ` [PATCH v9 09/26] PCI: Movable BARs: Make just a single attempt to assign bridge resources Sergei Miroshnichenko
@ 2020-12-18 17:39 ` Sergei Miroshnichenko
  2020-12-18 17:39 ` [PATCH v9 11/26] PCI: Include fixed BARs into the bus size calculating Sergei Miroshnichenko
                   ` (16 subsequent siblings)
  26 siblings, 0 replies; 44+ messages in thread
From: Sergei Miroshnichenko @ 2020-12-18 17:39 UTC (permalink / raw)
  To: linux-pci
  Cc: Bjorn Helgaas, Lukas Wunner, Stefan Roese, Andy Lavr,
	Christian König, David Laight, Rajat Jain, linux,
	Sergei Miroshnichenko

When movable BARs are enabled, and if a bridge contains a device with fixed
BARs, the corresponding windows can't be moved too far away from their
original positions - they must still contain all the fixed BARs, like that:

1) Window position before a bus rescan:

   | <------                root bridge window                    ------> |
   |                                                                      |
   | | <--     bridge window    --> |                                     |
   | | movable BARs | **fixed BAR** |                                     |
       ^^^^^^^^^^^^

2) A possible valid outcome after a rescan and being moved:

   | <------                root bridge window                    ------> |
   |                                                                      |
   |                | <--     bridge window    --> |                      |
   |                | **fixed BAR** | Movable BARs |                      |
                                      ^^^^^^^^^^^^

A fixed area of a bridge window is a range that covers all the fixed BARs
of direct children, and all the fixed area of children bridges:

   | <------                root bridge window                    ------> |
   |                                                                      |
   |  | <------              bridge window level 1            ------> |   |
   |  | ********************** fixed area ********************        |   |
   |  |                                                               |   |
   |  | **fixed BAR** | <---     bridge window level 2    ---> | BARs |   |
   |  |               | ************* fixed area ************* |      |   |
   |  |               |                                        |      |   |
   |  |               | **fixed BAR** |  BARs  | **fixed BAR** |      |   |
                                         ^^^^

To store these areas, the .fixed_range field has been added to the struct
pci_bus for every bridge window type: IO, MEM and PREFETCH. They are filled
recursively from leaves to the root before a rescan.

Signed-off-by: Sergei Miroshnichenko <s.miroshnichenko@yadro.com>
---
 drivers/pci/pci.h   | 23 ++++++++++++
 drivers/pci/probe.c | 89 +++++++++++++++++++++++++++++++++++++++++++++
 include/linux/pci.h |  6 +++
 3 files changed, 118 insertions(+)

diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 62c3eb146348..f47f80b6a620 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -278,6 +278,17 @@ bool pci_dev_bar_fixed(struct pci_dev *dev, struct resource *res);
 bool pci_dev_bar_enabled(const struct pci_dev *dev, int idx);
 bool pci_bus_check_bars_assigned(struct pci_bus *bus, bool complete_set);
 
+static inline void pci_set_fixed_range(struct resource *res)
+{
+	res->start = (resource_size_t)-1;
+	res->end = 0;
+}
+
+static inline bool pci_fixed_range_valid(struct resource *res)
+{
+	return res->start <= res->end;
+}
+
 extern bool pci_init_done;
 
 /* PCIe link information from Link Capabilities 2 */
@@ -397,6 +408,18 @@ static inline bool pci_dev_is_disconnected(const struct pci_dev *dev)
 	return dev->error_state == pci_channel_io_perm_failure;
 }
 
+static inline int pci_get_bridge_resource_idx(struct resource *r)
+{
+	if (r->flags & IORESOURCE_IO)
+		return 0;
+	else if (!(r->flags & IORESOURCE_PREFETCH))
+		return 1;
+	else if (r->flags & IORESOURCE_MEM_64)
+		return 2;
+
+	return 1;
+}
+
 /* pci_dev priv_flags */
 #define PCI_DEV_ADDED 0
 #define PCI_DEV_DISABLED_BARS 1
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index f6e2216ce996..964dfa71af22 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -589,6 +589,7 @@ void pci_read_bridge_bases(struct pci_bus *child)
 static struct pci_bus *pci_alloc_bus(struct pci_bus *parent)
 {
 	struct pci_bus *b;
+	int idx;
 
 	b = kzalloc(sizeof(*b), GFP_KERNEL);
 	if (!b)
@@ -605,6 +606,13 @@ static struct pci_bus *pci_alloc_bus(struct pci_bus *parent)
 	if (parent)
 		b->domain_nr = parent->domain_nr;
 #endif
+	for (idx = 0; idx < PCI_BRIDGE_RESOURCE_NUM; ++idx)
+		pci_set_fixed_range(&b->fixed_range[idx]);
+
+	b->fixed_range[0].flags = IORESOURCE_IO;
+	b->fixed_range[1].flags = IORESOURCE_MEM;
+	b->fixed_range[2].flags = IORESOURCE_MEM | IORESOURCE_PREFETCH;
+
 	return b;
 }
 
@@ -3366,6 +3374,86 @@ static void pci_setup_bridges(struct pci_bus *bus)
 		pci_setup_bridge(bus);
 }
 
+static void pci_bus_update_fixed_range(struct pci_bus *bus)
+{
+	struct pci_dev *dev;
+	int idx;
+	resource_size_t start, end;
+
+	for (idx = 0; idx < PCI_BRIDGE_RESOURCE_NUM; ++idx)
+		pci_set_fixed_range(&bus->fixed_range[idx]);
+
+	list_for_each_entry(dev, &bus->devices, bus_list)
+		if (dev->subordinate)
+			pci_bus_update_fixed_range(dev->subordinate);
+
+	list_for_each_entry(dev, &bus->devices, bus_list) {
+		int i;
+		struct pci_bus *child = dev->subordinate;
+
+		for (i = 0; i < PCI_BRIDGE_RESOURCES; ++i) {
+			struct resource *r = &dev->resource[i];
+			struct resource *fixed_range;
+
+			if (!r->flags || (r->flags & IORESOURCE_UNSET) ||
+			    !r->parent || !pci_dev_bar_fixed(dev, r))
+				continue;
+
+			idx = pci_get_bridge_resource_idx(r);
+			fixed_range = &bus->fixed_range[idx];
+			start = fixed_range->start;
+			end = fixed_range->end;
+
+			if (!pci_fixed_range_valid(fixed_range) ||
+			    start > r->start)
+				start = r->start;
+
+			if (end < r->end)
+				end = r->end;
+
+			if (fixed_range->start != start ||
+			    fixed_range->end != end) {
+				fixed_range->start = start;
+				fixed_range->end = end;
+				dev_dbg(&bus->dev, "Found fixed BAR %d %pR in %s, expand the fixed bridge window %d to %pR\n",
+					i, r, dev_name(&dev->dev), idx,
+					fixed_range);
+			}
+		}
+
+		if (child) {
+			for (idx = 0; idx < PCI_BRIDGE_RESOURCE_NUM; ++idx) {
+				struct resource *fixed_range = &bus->fixed_range[idx];
+				struct resource *child_fixed_range =
+					&child->fixed_range[idx];
+
+				if (!pci_fixed_range_valid(child_fixed_range))
+					continue;
+
+				start = fixed_range->start;
+				end = fixed_range->end;
+
+				if (!pci_fixed_range_valid(fixed_range) ||
+				    start > child_fixed_range->start)
+					start = child_fixed_range->start;
+
+				if (end < child_fixed_range->end)
+					end = child_fixed_range->end;
+
+				if (start < fixed_range->start ||
+				    end > fixed_range->end) {
+					dev_dbg(&bus->dev, "Expand the fixed bridge window %d from %s to 0x%llx-0x%llx\n",
+						idx, dev_name(&child->dev),
+						(unsigned long long)start,
+						(unsigned long long)end);
+					fixed_range->start = start;
+					fixed_range->end = end;
+				}
+			}
+		}
+	}
+}
+
 static struct pci_dev *pci_find_next_new_device(struct pci_bus *bus)
 {
 	struct pci_dev *dev;
@@ -3497,6 +3585,7 @@ unsigned int pci_rescan_bus(struct pci_bus *bus)
 
 	if (pci_can_move_bars) {
 		pci_bus_rescan_prepare(root);
+		pci_bus_update_fixed_range(root);
 
 		max = pci_scan_child_bus(root);
 
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 29310f026eb7..def6b275d5ad 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -622,6 +622,12 @@ struct pci_bus {
 	struct list_head resources;	/* Address space routed to this bus */
 	struct resource busn_res;	/* Bus numbers routed to this bus */
 
+	/*
+	 * If there are fixed resources in the bridge window, this range contains
+	 * the lowest start address and the highest end address of them.
+	 */
+	struct resource fixed_range[PCI_BRIDGE_RESOURCE_NUM];
+
 	struct pci_ops	*ops;		/* Configuration access functions */
 	struct msi_controller *msi;	/* MSI controller */
 	void		*sysdata;	/* Hook for sys-specific extension */
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v9 11/26] PCI: Include fixed BARs into the bus size calculating
  2020-12-18 17:39 [PATCH v9 00/26] PCI: Allow BAR movement during boot and hotplug Sergei Miroshnichenko
                   ` (9 preceding siblings ...)
  2020-12-18 17:39 ` [PATCH v9 10/26] PCI: hotplug: Calculate fixed parts of bridge windows Sergei Miroshnichenko
@ 2020-12-18 17:39 ` Sergei Miroshnichenko
  2020-12-18 17:39 ` [PATCH v9 12/26] PCI: Make sure bridge windows include their fixed BARs Sergei Miroshnichenko
                   ` (15 subsequent siblings)
  26 siblings, 0 replies; 44+ messages in thread
From: Sergei Miroshnichenko @ 2020-12-18 17:39 UTC (permalink / raw)
  To: linux-pci
  Cc: Bjorn Helgaas, Lukas Wunner, Stefan Roese, Andy Lavr,
	Christian König, David Laight, Rajat Jain, linux,
	Sergei Miroshnichenko

The only difference between fixed and movable BARs is a size and offset
preservation after they are released (the corresponding struct resource* is
detached from a bridge window for a while during a bus rescan). So fixed
BARs should not be skipped in pbus_size_{mem,io}().

Bridge window size calculation uses pci_{,sriov_}resource_alignment(), that
are applicable only to not yet assigned BARs and don't make sense for fixed
ones. Original alignment of a fixed BAR is lost after assignment, so return
1 in this case as a neutral value.

A window should be additionally extended if it has distant fixed BARs on
its edges:

    | <--          bridge window          --> |
    | **fixed BAR** |         | **fixed BAR** |

Signed-off-by: Sergei Miroshnichenko <s.miroshnichenko@yadro.com>
---
 drivers/pci/iov.c       |  2 ++
 drivers/pci/pci.h       |  2 ++
 drivers/pci/setup-bus.c | 21 ++++++++++++++++++++-
 3 files changed, 24 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
index 4afd4ee4f7f0..f4f7c2702579 100644
--- a/drivers/pci/iov.c
+++ b/drivers/pci/iov.c
@@ -872,6 +872,8 @@ resource_size_t __weak pcibios_iov_resource_alignment(struct pci_dev *dev,
  */
 resource_size_t pci_sriov_resource_alignment(struct pci_dev *dev, int resno)
 {
+	if (pci_dev_bar_fixed(dev, dev->resource + resno))
+		return 1;
 	return pcibios_iov_resource_alignment(dev, resno);
 }
 
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index f47f80b6a620..f7460ddd196a 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -576,6 +576,8 @@ static inline resource_size_t pci_resource_alignment(struct pci_dev *dev,
 	if (resno >= PCI_IOV_RESOURCES && resno <= PCI_IOV_RESOURCE_END)
 		return pci_sriov_resource_alignment(dev, resno);
 #endif
+	if (pci_dev_bar_fixed(dev, res))
+		return 1;
 	if (dev->class >> 8 == PCI_CLASS_BRIDGE_CARDBUS)
 		return pci_cardbus_resource_alignment(res);
 	return resource_alignment(res);
diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index 640090046ffd..3dadadea10d6 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -888,9 +888,15 @@ static void pbus_size_io(struct pci_bus *bus, resource_size_t min_size,
 	resource_size_t children_add_size = 0;
 	resource_size_t min_align, align;
 
+	struct resource *fixed_range;
+	resource_size_t fixed_size;
+
 	if (!b_res)
 		return;
 
+	fixed_range = &bus->fixed_range[0];
+	fixed_size = pci_fixed_range_valid(fixed_range) ? resource_size(fixed_range) : 0;
+
 	/* If resource is already assigned, nothing more to do */
 	if (b_res->parent)
 		return;
@@ -923,6 +929,9 @@ static void pbus_size_io(struct pci_bus *bus, resource_size_t min_size,
 		}
 	}
 
+	if (size1 < fixed_size)
+		size1 = fixed_size;
+
 	size0 = calculate_iosize(size, min_size, size1, 0, 0,
 			resource_size(b_res), min_align);
 	size1 = (!realloc_head || (realloc_head && !add_size && !children_add_size)) ? size0 :
@@ -1004,6 +1013,9 @@ static int pbus_size_mem(struct pci_bus *bus, unsigned long mask,
 	resource_size_t children_add_size = 0;
 	resource_size_t children_add_align = 0;
 	resource_size_t add_align = 0;
+	int idx;
+	struct resource *fixed_range;
+	resource_size_t fixed_size;
 
 	if (!b_res)
 		return -ENOSPC;
@@ -1012,6 +1024,13 @@ static int pbus_size_mem(struct pci_bus *bus, unsigned long mask,
 	if (b_res->parent)
 		return 0;
 
+	idx = pci_get_bridge_resource_idx(b_res);
+	fixed_range = &bus->fixed_range[idx];
+	fixed_size = pci_fixed_range_valid(fixed_range) ? resource_size(fixed_range) : 0;
+
+	if (min_size < fixed_size)
+		min_size = fixed_size;
+
 	memset(aligns, 0, sizeof(aligns));
 	max_order = 0;
 	size = 0;
@@ -1023,7 +1042,7 @@ static int pbus_size_mem(struct pci_bus *bus, unsigned long mask,
 			struct resource *r = &dev->resource[i];
 			resource_size_t r_size;
 
-			if (r->parent || (r->flags & IORESOURCE_PCI_FIXED) ||
+			if (r->parent ||
 			    !pci_dev_bar_enabled(dev, i) ||
 			    ((r->flags & mask) != type &&
 			     (r->flags & mask) != type2 &&
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v9 12/26] PCI: Make sure bridge windows include their fixed BARs
  2020-12-18 17:39 [PATCH v9 00/26] PCI: Allow BAR movement during boot and hotplug Sergei Miroshnichenko
                   ` (10 preceding siblings ...)
  2020-12-18 17:39 ` [PATCH v9 11/26] PCI: Include fixed BARs into the bus size calculating Sergei Miroshnichenko
@ 2020-12-18 17:39 ` Sergei Miroshnichenko
  2020-12-18 17:39 ` [PATCH v9 13/26] PCI: hotplug: Add support of fixed BARs to pci_assign_resource() Sergei Miroshnichenko
                   ` (14 subsequent siblings)
  26 siblings, 0 replies; 44+ messages in thread
From: Sergei Miroshnichenko @ 2020-12-18 17:39 UTC (permalink / raw)
  To: linux-pci
  Cc: Bjorn Helgaas, Lukas Wunner, Stefan Roese, Andy Lavr,
	Christian König, David Laight, Rajat Jain, linux,
	Sergei Miroshnichenko

With enabled movable BARs, bridge windows are recalculated during each PCI
rescan. There may be fixed BARs below a bridge: these areas are represented
by the .fixed_range field in struct pci_bus.

When choosing a start address for a bridge window, it should be not just a
lowest possible address: this window must cover every underlying fixed BAR.

If a size of the bridge window is equal to its fixed range, it can only be
assigned to the start of this range. But if a bridge window size is larger,
its lowest possible address equals to (fixed_range.end - bus_size + 1).

Signed-off-by: Sergei Miroshnichenko <s.miroshnichenko@yadro.com>
---
 drivers/pci/bus.c       |  2 +-
 drivers/pci/setup-res.c | 30 ++++++++++++++++++++++++++++--
 2 files changed, 29 insertions(+), 3 deletions(-)

diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c
index 3cef835b375f..c716210b62c5 100644
--- a/drivers/pci/bus.c
+++ b/drivers/pci/bus.c
@@ -192,7 +192,7 @@ static int pci_bus_alloc_from_region(struct pci_bus *bus, struct resource *res,
 		 * this is an already-configured bridge window, its start
 		 * overrides "min".
 		 */
-		if (avail.start)
+		if (min_used < avail.start)
 			min_used = avail.start;
 
 		max = avail.end;
diff --git a/drivers/pci/setup-res.c b/drivers/pci/setup-res.c
index 432f3b084f94..ee192e731119 100644
--- a/drivers/pci/setup-res.c
+++ b/drivers/pci/setup-res.c
@@ -247,11 +247,27 @@ static int __pci_assign_resource(struct pci_bus *bus, struct pci_dev *dev,
 		int resno, resource_size_t size, resource_size_t align)
 {
 	struct resource *res = dev->resource + resno;
+	struct resource *fixed_range = NULL;
 	resource_size_t min;
 	int ret;
 
 	min = (res->flags & IORESOURCE_IO) ? PCIBIOS_MIN_IO : PCIBIOS_MIN_MEM;
 
+	if (pci_can_move_bars && dev->subordinate && resno >= PCI_BRIDGE_RESOURCES) {
+		struct pci_bus *child_bus = dev->subordinate;
+		int win_no = resno - PCI_BRIDGE_RESOURCES;
+
+		fixed_range = &child_bus->fixed_range[win_no];
+		if (pci_fixed_range_valid(fixed_range)) {
+			if (size <= resource_size(fixed_range))
+				min = fixed_range->start;
+			else
+				min = fixed_range->end - size + 1;
+		} else {
+			fixed_range = NULL;
+		}
+	}
+
 	/*
 	 * First, try exact prefetching match.  Even if a 64-bit
 	 * prefetchable bridge window is below 4GB, we can't put a 32-bit
@@ -263,7 +279,7 @@ static int __pci_assign_resource(struct pci_bus *bus, struct pci_dev *dev,
 				     IORESOURCE_PREFETCH | IORESOURCE_MEM_64,
 				     pcibios_align_resource, dev);
 	if (ret == 0)
-		return 0;
+		goto check_fixed;
 
 	/*
 	 * If the prefetchable window is only 32 bits wide, we can put
@@ -275,7 +291,7 @@ static int __pci_assign_resource(struct pci_bus *bus, struct pci_dev *dev,
 					     IORESOURCE_PREFETCH,
 					     pcibios_align_resource, dev);
 		if (ret == 0)
-			return 0;
+			goto check_fixed;
 	}
 
 	/*
@@ -288,6 +304,16 @@ static int __pci_assign_resource(struct pci_bus *bus, struct pci_dev *dev,
 		ret = pci_bus_alloc_resource(bus, res, size, align, min, 0,
 					     pcibios_align_resource, dev);
 
+check_fixed:
+	if (ret == 0 && fixed_range &&
+	    (res->start > fixed_range->start ||
+	     res->end < fixed_range->end)) {
+		dev_err(&bus->dev, "fixed area %pR for %s doesn't fit in the allocated %pR",
+			fixed_range, dev_name(&dev->dev), res);
+		release_resource(res);
+		return -1;
+	}
+
 	return ret;
 }
 
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v9 13/26] PCI: hotplug: Add support of fixed BARs to pci_assign_resource()
  2020-12-18 17:39 [PATCH v9 00/26] PCI: Allow BAR movement during boot and hotplug Sergei Miroshnichenko
                   ` (11 preceding siblings ...)
  2020-12-18 17:39 ` [PATCH v9 12/26] PCI: Make sure bridge windows include their fixed BARs Sergei Miroshnichenko
@ 2020-12-18 17:39 ` Sergei Miroshnichenko
  2020-12-18 17:39 ` [PATCH v9 14/26] PCI: hotplug: Sort fixed BARs before assignment Sergei Miroshnichenko
                   ` (13 subsequent siblings)
  26 siblings, 0 replies; 44+ messages in thread
From: Sergei Miroshnichenko @ 2020-12-18 17:39 UTC (permalink / raw)
  To: linux-pci
  Cc: Bjorn Helgaas, Lukas Wunner, Stefan Roese, Andy Lavr,
	Christian König, David Laight, Rajat Jain, linux,
	Sergei Miroshnichenko

Fixed BARs must be assigned within a bridge window first, before movable
BARs and neighboring bridge windows. Currently they are assigned last by
pdev_assign_fixed_resources().

Let the fixed BARs be handled by pci_assign_resource() in the same way as
it does for movable ones, assigning them in correct order, unifying the
code.

Allow matching IORESOURCE_PCI_FIXED prefetchable BARs to non-prefetchable
windows, so they follow the same rules as non-flagged fixed BARs.

Signed-off-by: Sergei Miroshnichenko <s.miroshnichenko@yadro.com>
---
 drivers/pci/setup-bus.c | 43 -----------------------------------------
 drivers/pci/setup-res.c | 41 +++++++++++++++++++++++++++++++++++++--
 2 files changed, 39 insertions(+), 45 deletions(-)

diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index 3dadadea10d6..6055f15e3ac3 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -1358,47 +1358,6 @@ void pci_bus_size_bridges(struct pci_bus *bus)
 }
 EXPORT_SYMBOL(pci_bus_size_bridges);
 
-static void assign_fixed_resource_on_bus(struct pci_bus *b, struct resource *r)
-{
-	int i;
-	struct resource *parent_r;
-	unsigned long mask = IORESOURCE_IO | IORESOURCE_MEM |
-			     IORESOURCE_PREFETCH;
-
-	pci_bus_for_each_resource(b, parent_r, i) {
-		if (!parent_r)
-			continue;
-
-		if ((r->flags & mask) == (parent_r->flags & mask) &&
-		    resource_contains(parent_r, r))
-			request_resource(parent_r, r);
-	}
-}
-
-/*
- * Try to assign any resources marked as IORESOURCE_PCI_FIXED, as they are
- * skipped by pbus_assign_resources_sorted().
- */
-static void pdev_assign_fixed_resources(struct pci_dev *dev)
-{
-	int i;
-
-	for (i = 0; i <  PCI_NUM_RESOURCES; i++) {
-		struct pci_bus *b;
-		struct resource *r = &dev->resource[i];
-
-		if (r->parent || !(r->flags & IORESOURCE_PCI_FIXED) ||
-		    !(r->flags & (IORESOURCE_IO | IORESOURCE_MEM)))
-			continue;
-
-		b = dev->bus;
-		while (b && !r->parent) {
-			assign_fixed_resource_on_bus(b, r);
-			b = b->parent;
-		}
-	}
-}
-
 void __pci_bus_assign_resources(const struct pci_bus *bus,
 				struct list_head *realloc_head,
 				struct list_head *fail_head)
@@ -1409,8 +1368,6 @@ void __pci_bus_assign_resources(const struct pci_bus *bus,
 	pbus_assign_resources_sorted(bus, realloc_head, fail_head);
 
 	list_for_each_entry(dev, &bus->devices, bus_list) {
-		pdev_assign_fixed_resources(dev);
-
 		b = dev->subordinate;
 		if (!b)
 			continue;
diff --git a/drivers/pci/setup-res.c b/drivers/pci/setup-res.c
index ee192e731119..6ffda8b94c52 100644
--- a/drivers/pci/setup-res.c
+++ b/drivers/pci/setup-res.c
@@ -333,14 +333,51 @@ static int _pci_assign_resource(struct pci_dev *dev, int resno,
 	return ret;
 }
 
+static int assign_fixed_resource_on_bus(struct pci_dev *dev, int resno)
+{
+	int i;
+	struct resource *parent_r;
+	unsigned long mask = IORESOURCE_TYPE_BITS;
+	struct resource *r = dev->resource + resno;
+
+	/*
+	 * If we have a shadow copy in RAM, the PCI device doesn't respond
+	 * to the shadow range
+	 */
+	if (r->flags & IORESOURCE_ROM_SHADOW)
+		return 0;
+
+	pci_bus_for_each_resource(dev->bus, parent_r, i) {
+		if (!parent_r)
+			continue;
+
+		if ((r->flags & mask) != (parent_r->flags & mask))
+			continue;
+
+		if (parent_r->flags & IORESOURCE_PREFETCH &&
+		    !(r->flags & IORESOURCE_PREFETCH))
+			continue;
+
+		if (resource_contains(parent_r, r)) {
+			if (!request_resource(parent_r, r)) {
+				pci_info(dev, "BAR %d: assigned fixed %pR\n", resno, r);
+				return 0;
+			}
+		}
+	}
+
+	pci_err(dev, "BAR %d: failed to assign fixed %pR\n", resno, r);
+	return -ENOSPC;
+}
+
 int pci_assign_resource(struct pci_dev *dev, int resno)
 {
 	struct resource *res = dev->resource + resno;
 	resource_size_t align, size;
 	int ret;
 
-	if (res->flags & IORESOURCE_PCI_FIXED)
-		return 0;
+	if (res->flags && pci_dev_bar_fixed(dev, res))
+		return assign_fixed_resource_on_bus(dev, resno);
 
 	res->flags |= IORESOURCE_UNSET;
 	align = pci_resource_alignment(dev, res);
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v9 14/26] PCI: hotplug: Sort fixed BARs before assignment
  2020-12-18 17:39 [PATCH v9 00/26] PCI: Allow BAR movement during boot and hotplug Sergei Miroshnichenko
                   ` (12 preceding siblings ...)
  2020-12-18 17:39 ` [PATCH v9 13/26] PCI: hotplug: Add support of fixed BARs to pci_assign_resource() Sergei Miroshnichenko
@ 2020-12-18 17:39 ` Sergei Miroshnichenko
  2020-12-18 17:40 ` [PATCH v9 15/26] x86/PCI/ACPI: Fix up PCIBIOS_MIN_MEM if value computed from e820 is invalid Sergei Miroshnichenko
                   ` (12 subsequent siblings)
  26 siblings, 0 replies; 44+ messages in thread
From: Sergei Miroshnichenko @ 2020-12-18 17:39 UTC (permalink / raw)
  To: linux-pci
  Cc: Bjorn Helgaas, Lukas Wunner, Stefan Roese, Andy Lavr,
	Christian König, David Laight, Rajat Jain, linux,
	Sergei Miroshnichenko

Fixed BARs and bridge windows containing fixed BARs must be assigned before
the movable ones.

When assigning a fixed BAR/bridge window, its start address is chosen to be
the lowest possible. To prevent conflicts, sort such resources based on the
start address of their fixed areas.

Add support of fixed BARs and bridge windows to pdev_sort_resources().

Signed-off-by: Sergei Miroshnichenko <s.miroshnichenko@yadro.com>
---
 drivers/pci/setup-bus.c | 63 +++++++++++++++++++++++++++++++++++------
 1 file changed, 54 insertions(+), 9 deletions(-)

diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index 6055f15e3ac3..ab58b999ac6d 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -126,7 +126,8 @@ static resource_size_t get_res_add_align(struct list_head *head,
 
 
 /* Sort resources by alignment */
-static void pdev_sort_resources(struct pci_dev *dev, struct list_head *head)
+static void pdev_sort_resources(struct pci_dev *dev, struct list_head *head,
+				struct list_head *head_fixed)
 {
 	int i;
 
@@ -135,17 +136,27 @@ static void pdev_sort_resources(struct pci_dev *dev, struct list_head *head)
 		struct pci_dev_resource *dev_res, *tmp;
 		resource_size_t r_align;
 		struct list_head *n;
+		struct resource *fixed_res = NULL;
 
 		r = &dev->resource[i];
 
-		if (r->flags & IORESOURCE_PCI_FIXED)
-			continue;
-
 		if (!(r->flags) || r->parent || !pci_dev_bar_enabled(dev, i))
 			continue;
 
+		if (i >= PCI_BRIDGE_RESOURCES &&
+		    dev->subordinate) {
+			int idx = i - PCI_BRIDGE_RESOURCES;
+
+			fixed_res = &dev->subordinate->fixed_range[idx];
+		} else if (pci_dev_bar_fixed(dev, r)) {
+			fixed_res = r;
+		}
+
+		if (fixed_res && !pci_fixed_range_valid(fixed_res))
+			fixed_res = NULL;
+
 		r_align = pci_resource_alignment(dev, r);
-		if (!r_align) {
+		if (!r_align && !fixed_res) {
 			pci_warn(dev, "BAR %d: %pR has bogus alignment\n",
 				 i, r);
 			continue;
@@ -157,6 +168,30 @@ static void pdev_sort_resources(struct pci_dev *dev, struct list_head *head)
 		tmp->res = r;
 		tmp->dev = dev;
 
+		if (fixed_res) {
+			n = head_fixed;
+			list_for_each_entry(dev_res, head_fixed, list) {
+				struct resource *c_fixed_res = NULL;
+				int c_resno = dev_res->res - dev_res->dev->resource;
+				int br_idx = c_resno - PCI_BRIDGE_RESOURCES;
+				struct pci_bus *cbus = dev_res->dev->subordinate;
+
+				if (br_idx >= 0)
+					c_fixed_res = &cbus->fixed_range[br_idx];
+				else
+					c_fixed_res = dev_res->res;
+
+				if (fixed_res->start < c_fixed_res->start) {
+					n = &dev_res->list;
+					break;
+				}
+			}
+			/* Insert it just before n */
+			list_add_tail(&tmp->list, n);
+
+			continue;
+		}
+
 		/* Fallback is smallest one or list is empty */
 		n = head;
 		list_for_each_entry(dev_res, head, list) {
@@ -175,7 +210,8 @@ static void pdev_sort_resources(struct pci_dev *dev, struct list_head *head)
 	}
 }
 
-static void __dev_sort_resources(struct pci_dev *dev, struct list_head *head)
+static void __dev_sort_resources(struct pci_dev *dev, struct list_head *head,
+				 struct list_head *head_fixed)
 {
 	u16 class = dev->class >> 8;
 
@@ -191,7 +227,7 @@ static void __dev_sort_resources(struct pci_dev *dev, struct list_head *head)
 			return;
 	}
 
-	pdev_sort_resources(dev, head);
+	pdev_sort_resources(dev, head, head_fixed);
 }
 
 static inline void reset_resource(struct resource *res)
@@ -486,8 +522,13 @@ static void pdev_assign_resources_sorted(struct pci_dev *dev,
 					 struct list_head *fail_head)
 {
 	LIST_HEAD(head);
+	LIST_HEAD(head_fixed);
+
+	__dev_sort_resources(dev, &head, &head_fixed);
+
+	if (!list_empty(&head_fixed))
+		__assign_resources_sorted(&head_fixed, NULL, NULL);
 
-	__dev_sort_resources(dev, &head);
 	__assign_resources_sorted(&head, add_head, fail_head);
 
 }
@@ -498,9 +539,13 @@ static void pbus_assign_resources_sorted(const struct pci_bus *bus,
 {
 	struct pci_dev *dev;
 	LIST_HEAD(head);
+	LIST_HEAD(head_fixed);
 
 	list_for_each_entry(dev, &bus->devices, bus_list)
-		__dev_sort_resources(dev, &head);
+		__dev_sort_resources(dev, &head, &head_fixed);
+
+	if (!list_empty(&head_fixed))
+		__assign_resources_sorted(&head_fixed, NULL, NULL);
 
 	__assign_resources_sorted(&head, realloc_head, fail_head);
 }
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v9 15/26] x86/PCI/ACPI: Fix up PCIBIOS_MIN_MEM if value computed from e820 is invalid
  2020-12-18 17:39 [PATCH v9 00/26] PCI: Allow BAR movement during boot and hotplug Sergei Miroshnichenko
                   ` (13 preceding siblings ...)
  2020-12-18 17:39 ` [PATCH v9 14/26] PCI: hotplug: Sort fixed BARs before assignment Sergei Miroshnichenko
@ 2020-12-18 17:40 ` Sergei Miroshnichenko
  2020-12-18 17:40 ` [PATCH v9 16/26] PCI: hotplug: Configure MPS after manual bus rescan Sergei Miroshnichenko
                   ` (11 subsequent siblings)
  26 siblings, 0 replies; 44+ messages in thread
From: Sergei Miroshnichenko @ 2020-12-18 17:40 UTC (permalink / raw)
  To: linux-pci
  Cc: Bjorn Helgaas, Lukas Wunner, Stefan Roese, Andy Lavr,
	Christian König, David Laight, Rajat Jain, linux,
	Sergei Miroshnichenko, Thomas Gleixner

The value of PCIBIOS_MIN_MEM reported by BIOS 1.3 on Supermicro H11SSL-i
via e820__setup_pci_gap():

  [mem 0xebff1000-0xfe9fffff] available for PCI devices

is only suitable for a single root complex out of four (0000:00):

  pci_bus 0000:00: root bus resource [mem 0xec000000-0xefffffff window]
  pci_bus 0000:20: root bus resource [mem 0xeb800000-0xebefffff window]
  pci_bus 0000:40: root bus resource [mem 0xeb200000-0xeb5fffff window]
  pci_bus 0000:60: root bus resource [mem 0xe8b00000-0xeaffffff window]

That makes the AMD EPYC 7251 unable to assign BARs of devices hot-added in
those three unlucky RCs (0000:20, 0000:40 and 0000:60).

If there are apertures that end below the current PCIBIOS_MIN_MEM (which is
a variable pci_mem_start on x86), adjust it to the aperture's start.

CC: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Sergei Miroshnichenko <s.miroshnichenko@yadro.com>
---
 arch/x86/pci/acpi.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/arch/x86/pci/acpi.c b/arch/x86/pci/acpi.c
index 948656069cdd..9eccb26d0bf3 100644
--- a/arch/x86/pci/acpi.c
+++ b/arch/x86/pci/acpi.c
@@ -299,6 +299,21 @@ static int pci_acpi_root_prepare_resources(struct acpi_pci_root_info *ci)
 	int status;
 
 	status = acpi_pci_probe_root_resources(ci);
+
+	resource_list_for_each_entry(entry, &ci->resources) {
+		struct resource *res = entry->res;
+
+		if (!(res->flags & IORESOURCE_MEM) ||
+		    res->end > pci_mem_start ||
+		    res->start == 0xa0000)
+			continue;
+
+		dev_warn(&ci->root->device->dev, "Fix up PCI start address: %lx -> %llx\n",
+			 pci_mem_start, res->start);
+
+		pci_mem_start = res->start;
+	}
+
 	if (pci_use_crs) {
 		resource_list_for_each_entry_safe(entry, tmp, &ci->resources)
 			if (resource_is_pcicfg_ioport(entry->res))
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v9 16/26] PCI: hotplug: Configure MPS after manual bus rescan
  2020-12-18 17:39 [PATCH v9 00/26] PCI: Allow BAR movement during boot and hotplug Sergei Miroshnichenko
                   ` (14 preceding siblings ...)
  2020-12-18 17:40 ` [PATCH v9 15/26] x86/PCI/ACPI: Fix up PCIBIOS_MIN_MEM if value computed from e820 is invalid Sergei Miroshnichenko
@ 2020-12-18 17:40 ` Sergei Miroshnichenko
  2020-12-18 17:40 ` [PATCH v9 17/26] PCI: hotplug: Don't disable the released bridge windows immediately Sergei Miroshnichenko
                   ` (10 subsequent siblings)
  26 siblings, 0 replies; 44+ messages in thread
From: Sergei Miroshnichenko @ 2020-12-18 17:40 UTC (permalink / raw)
  To: linux-pci
  Cc: Bjorn Helgaas, Lukas Wunner, Stefan Roese, Andy Lavr,
	Christian König, David Laight, Rajat Jain, linux,
	Sergei Miroshnichenko

Assure that MPS settings are set up for bridges which are discovered during
manually triggered rescan via sysfs. This sequence of bridge init (using
pci_rescan_bus()) later will be used for pciehp hot-add events when BAR
movement is enabled.

Signed-off-by: Sergei Miroshnichenko <s.miroshnichenko@yadro.com>
---
 drivers/pci/probe.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 964dfa71af22..b8873ee82a4b 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -3578,7 +3578,7 @@ static void pci_reassign_root_bus_resources(struct pci_bus *root)
 unsigned int pci_rescan_bus(struct pci_bus *bus)
 {
 	unsigned int max;
-	struct pci_bus *root = bus;
+	struct pci_bus *root = bus, *child;
 
 	while (!pci_is_root_bus(root))
 		root = root->parent;
@@ -3599,6 +3599,9 @@ unsigned int pci_rescan_bus(struct pci_bus *bus)
 		pci_assign_unassigned_bus_resources(bus);
 	}
 
+	list_for_each_entry(child, &root->children, node)
+		pcie_bus_configure_settings(child);
+
 	pci_bus_add_devices(bus);
 
 	return max;
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v9 17/26] PCI: hotplug: Don't disable the released bridge windows immediately
  2020-12-18 17:39 [PATCH v9 00/26] PCI: Allow BAR movement during boot and hotplug Sergei Miroshnichenko
                   ` (15 preceding siblings ...)
  2020-12-18 17:40 ` [PATCH v9 16/26] PCI: hotplug: Configure MPS after manual bus rescan Sergei Miroshnichenko
@ 2020-12-18 17:40 ` Sergei Miroshnichenko
  2020-12-18 17:40 ` [PATCH v9 18/26] PCI: pciehp: Trigger a domain rescan on hp events when enabled movable BARs Sergei Miroshnichenko
                   ` (9 subsequent siblings)
  26 siblings, 0 replies; 44+ messages in thread
From: Sergei Miroshnichenko @ 2020-12-18 17:40 UTC (permalink / raw)
  To: linux-pci
  Cc: Bjorn Helgaas, Lukas Wunner, Stefan Roese, Andy Lavr,
	Christian König, David Laight, Rajat Jain, linux,
	Sergei Miroshnichenko

On a hotplug event with enabled BAR movement, calculating new BAR layout
and bridge windows takes some time. During this procedure, the structures
representing these windows are released: marked for a recalculation.

When the new bridge window values are ready, they are written to the bridge
registers via pci_setup_bridges().

Currently, bridge's registers are updated immediately after releasing a
window to disable it. But if a driver doesn't yet support movable BARs, it
doesn't stop MEM transactions during the hotplug, so disabled bridge
windows will break them.

Let the bridge windows remain operating after releasing, as they will be
updated to the new values in the end of a hotplug event.

Signed-off-by: Sergei Miroshnichenko <s.miroshnichenko@yadro.com>
---
 drivers/pci/setup-bus.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index ab58b999ac6d..f98772474421 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -1649,7 +1649,8 @@ static void pci_bridge_release_resources(struct pci_bus *bus,
 		/* Avoiding touch the one without PREF */
 		if (type & IORESOURCE_PREFETCH)
 			type = IORESOURCE_PREFETCH;
-		__pci_setup_bridge(bus, type);
+		if (!pci_can_move_bars)
+			__pci_setup_bridge(bus, type);
 		/* For next child res under same bridge */
 		r->flags = old_flags;
 	}
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v9 18/26] PCI: pciehp: Trigger a domain rescan on hp events when enabled movable BARs
  2020-12-18 17:39 [PATCH v9 00/26] PCI: Allow BAR movement during boot and hotplug Sergei Miroshnichenko
                   ` (16 preceding siblings ...)
  2020-12-18 17:40 ` [PATCH v9 17/26] PCI: hotplug: Don't disable the released bridge windows immediately Sergei Miroshnichenko
@ 2020-12-18 17:40 ` Sergei Miroshnichenko
  2020-12-18 17:40 ` [PATCH v9 19/26] PCI: Don't claim fixed BARs Sergei Miroshnichenko
                   ` (8 subsequent siblings)
  26 siblings, 0 replies; 44+ messages in thread
From: Sergei Miroshnichenko @ 2020-12-18 17:40 UTC (permalink / raw)
  To: linux-pci
  Cc: Bjorn Helgaas, Lukas Wunner, Stefan Roese, Andy Lavr,
	Christian König, David Laight, Rajat Jain, linux,
	Sergei Miroshnichenko

With movable BARs, hot-adding a device is not a local event for its bridge,
but it affects the whole RC: BARs and bridge windows can be substantially
rearranged. So instead of trying to fit the new devices into pre-allocated
reserved gaps, initiate a full domain rescan.

The pci_rescan_bus() covers all the operations of the replaced functions:
 - assigning new bus numbers, as the pci_hp_add_bridge() does it;
 - allocating BARs -- pci_assign_unassigned_bridge_resources();
 - configuring MPS settings -- pcie_bus_configure_settings();
 - binding devices to their drivers -- pci_bus_add_devices().

Signed-off-by: Sergei Miroshnichenko <s.miroshnichenko@yadro.com>
---
 drivers/pci/hotplug/pciehp_pci.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/pci/hotplug/pciehp_pci.c b/drivers/pci/hotplug/pciehp_pci.c
index d17f3bf36f70..6d4c1ef38210 100644
--- a/drivers/pci/hotplug/pciehp_pci.c
+++ b/drivers/pci/hotplug/pciehp_pci.c
@@ -58,6 +58,11 @@ int pciehp_configure_device(struct controller *ctrl)
 		goto out;
 	}
 
+	if (pci_can_move_bars) {
+		pci_rescan_bus(parent);
+		goto out;
+	}
+
 	for_each_pci_bridge(dev, parent)
 		pci_hp_add_bridge(dev);
 
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v9 19/26] PCI: Don't claim fixed BARs
  2020-12-18 17:39 [PATCH v9 00/26] PCI: Allow BAR movement during boot and hotplug Sergei Miroshnichenko
                   ` (17 preceding siblings ...)
  2020-12-18 17:40 ` [PATCH v9 18/26] PCI: pciehp: Trigger a domain rescan on hp events when enabled movable BARs Sergei Miroshnichenko
@ 2020-12-18 17:40 ` Sergei Miroshnichenko
  2020-12-18 17:40 ` [PATCH v9 20/26] PCI: hotplug: Retry bus assignment without reserved space Sergei Miroshnichenko
                   ` (7 subsequent siblings)
  26 siblings, 0 replies; 44+ messages in thread
From: Sergei Miroshnichenko @ 2020-12-18 17:40 UTC (permalink / raw)
  To: linux-pci
  Cc: Bjorn Helgaas, Lukas Wunner, Stefan Roese, Andy Lavr,
	Christian König, David Laight, Rajat Jain, linux,
	Sergei Miroshnichenko

A fixed BAR always has an address, but its parent bridge window can be not
yet calculated (during boot) or temporarily released for re-calculation
(during PCI rescan) when pci_claim_resource() is called.

Apart from that, fixed BARs now have separate guaranteed mechanism of
assigning comparing to usual BARs, so claiming them is not needed.

Return immediately from pci_claim_resource() to prevent a misleading "can't
claim BAR ... no compatible bridge window" error message.

Signed-off-by: Sergei Miroshnichenko <s.miroshnichenko@yadro.com>
---
 drivers/pci/setup-res.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/pci/setup-res.c b/drivers/pci/setup-res.c
index 6ffda8b94c52..28ec3d8c79c8 100644
--- a/drivers/pci/setup-res.c
+++ b/drivers/pci/setup-res.c
@@ -139,6 +139,9 @@ int pci_claim_resource(struct pci_dev *dev, int resource)
 		return -EINVAL;
 	}
 
+	if (pci_dev_bar_fixed(dev, res))
+		return 0;
+
 	/*
 	 * If we have a shadow copy in RAM, the PCI device doesn't respond
 	 * to the shadow range, so we don't need to claim it, and upstream
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v9 20/26] PCI: hotplug: Retry bus assignment without reserved space
  2020-12-18 17:39 [PATCH v9 00/26] PCI: Allow BAR movement during boot and hotplug Sergei Miroshnichenko
                   ` (18 preceding siblings ...)
  2020-12-18 17:40 ` [PATCH v9 19/26] PCI: Don't claim fixed BARs Sergei Miroshnichenko
@ 2020-12-18 17:40 ` Sergei Miroshnichenko
  2020-12-18 17:40 ` [PATCH v9 21/26] PCI: Rescan the bus to resize a BAR Sergei Miroshnichenko
                   ` (6 subsequent siblings)
  26 siblings, 0 replies; 44+ messages in thread
From: Sergei Miroshnichenko @ 2020-12-18 17:40 UTC (permalink / raw)
  To: linux-pci
  Cc: Bjorn Helgaas, Lukas Wunner, Stefan Roese, Andy Lavr,
	Christian König, David Laight, Rajat Jain, linux,
	Sergei Miroshnichenko

A hot-added bridge with many hotplug-capable ports may request reserving
more IO space than the machine has. This could be overridden with the
"hpiosize=" kernel argument though.

But when BARs are movable, no need to reserve space anymore: new BARs are
allocated not from reserved gaps, but via rearranging the existing BARs.
Requesting a precise amount of space for bridge windows increases the
chances of adding the new bridge successfully.

Still, fixed BARs may interfere with an allocation, preventing it from
happening. So it makes sense to reserve some space when it is possible, so
movable BARs can be moved a bit more effectively later.

If a BAR allocation fails with additional bridge size, retry without them.

Signed-off-by: Sergei Miroshnichenko <s.miroshnichenko@yadro.com>
---
 drivers/pci/pci.h       | 1 +
 drivers/pci/probe.c     | 9 +++++++++
 drivers/pci/setup-bus.c | 9 +++++++--
 3 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index f7460ddd196a..685284c57a28 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -207,6 +207,7 @@ extern unsigned long pci_hotplug_io_size;
 extern unsigned long pci_hotplug_mmio_size;
 extern unsigned long pci_hotplug_mmio_pref_size;
 extern unsigned long pci_hotplug_bus_size;
+extern bool pci_hotplug_expand;
 
 /**
  * pci_match_one_device - Tell if a PCI device structure has a matching
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index b8873ee82a4b..24793766b4b7 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -41,6 +41,7 @@ EXPORT_SYMBOL(pci_root_buses);
  * were assigned before the rescan.
  */
 static bool pci_try_failed_bars = true;
+bool pci_hotplug_expand = true;
 bool pci_init_done;
 
 static LIST_HEAD(pci_domain_busn_res_list);
@@ -3542,6 +3543,13 @@ static void pci_reassign_root_bus_resources(struct pci_bus *root)
 		if (pci_bus_check_bars_assigned(root, pci_try_failed_bars))
 			break;
 
+		if (pci_hotplug_expand) {
+			dev_warn(&root->dev, "failed to assign all BARs, retry without additional window size\n");
+
+			pci_hotplug_expand = false;
+			continue;
+		}
+
 		if (pci_try_failed_bars) {
 			dev_warn(&root->dev, "failed to assign all BARs, retry without those failed before\n");
 
@@ -3564,6 +3572,7 @@ static void pci_reassign_root_bus_resources(struct pci_bus *root)
 	} while (true);
 
 	pci_try_failed_bars = true;
+	pci_hotplug_expand = true;
 }
 
 /**
diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index f98772474421..4b37815a9fcf 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -1058,6 +1058,7 @@ static int pbus_size_mem(struct pci_bus *bus, unsigned long mask,
 	resource_size_t children_add_size = 0;
 	resource_size_t children_add_align = 0;
 	resource_size_t add_align = 0;
+	bool size0_can_add = pci_can_move_bars && !realloc_head;
 	int idx;
 	struct resource *fixed_range;
 	resource_size_t fixed_size;
@@ -1141,7 +1142,11 @@ static int pbus_size_mem(struct pci_bus *bus, unsigned long mask,
 
 	min_align = calculate_mem_align(aligns, max_order);
 	min_align = max(min_align, window_alignment(bus, b_res->flags));
-	size0 = calculate_memsize(size, min_size, 0, 0, resource_size(b_res), min_align);
+	size0 = calculate_memsize(size, min_size,
+				  size0_can_add ? add_size : 0,
+				  size0_can_add ? children_add_size : 0,
+				  resource_size(b_res), min_align);
+
 	add_align = max(min_align, add_align);
 	size1 = (!realloc_head || (realloc_head && !add_size && !children_add_size)) ? size0 :
 		calculate_memsize(size, min_size, add_size, children_add_size,
@@ -1316,7 +1321,7 @@ void __pci_bus_size_bridges(struct pci_bus *bus, struct list_head *realloc_head)
 
 	case PCI_HEADER_TYPE_BRIDGE:
 		pci_bridge_check_ranges(bus);
-		if (bus->self->is_hotplug_bridge) {
+		if (bus->self->is_hotplug_bridge && pci_hotplug_expand) {
 			additional_io_size  = pci_hotplug_io_size;
 			additional_mmio_size = pci_hotplug_mmio_size;
 			additional_mmio_pref_size = pci_hotplug_mmio_pref_size;
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v9 21/26] PCI: Rescan the bus to resize a BAR
  2020-12-18 17:39 [PATCH v9 00/26] PCI: Allow BAR movement during boot and hotplug Sergei Miroshnichenko
                   ` (19 preceding siblings ...)
  2020-12-18 17:40 ` [PATCH v9 20/26] PCI: hotplug: Retry bus assignment without reserved space Sergei Miroshnichenko
@ 2020-12-18 17:40 ` Sergei Miroshnichenko
  2020-12-18 17:40 ` [PATCH v9 22/26] PCI: hotplug: Enable the movable BARs feature by default Sergei Miroshnichenko
                   ` (5 subsequent siblings)
  26 siblings, 0 replies; 44+ messages in thread
From: Sergei Miroshnichenko @ 2020-12-18 17:40 UTC (permalink / raw)
  To: linux-pci
  Cc: Bjorn Helgaas, Lukas Wunner, Stefan Roese, Andy Lavr,
	Christian König, David Laight, Rajat Jain, linux,
	Sergei Miroshnichenko

BAR resizing can be blocked by another BAR, so trigger a bus rescan to be
able to move BARs, increasing the probability of finding a good layout.

Signed-off-by: Sergei Miroshnichenko <s.miroshnichenko@yadro.com>
---
 drivers/pci/setup-res.c | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/setup-res.c b/drivers/pci/setup-res.c
index 28ec3d8c79c8..83a491f6a2c2 100644
--- a/drivers/pci/setup-res.c
+++ b/drivers/pci/setup-res.c
@@ -480,6 +480,9 @@ int pci_resize_resource(struct pci_dev *dev, int resno, int size)
 	u32 sizes;
 	u16 cmd;
 
+	if (pci_dev_bar_fixed(dev, res))
+		return -EOPNOTSUPP;
+
 	/* Make sure the resource isn't assigned before resizing it. */
 	if (!(res->flags & IORESOURCE_UNSET))
 		return -EBUSY;
@@ -506,7 +509,15 @@ int pci_resize_resource(struct pci_dev *dev, int resno, int size)
 	res->end = res->start + pci_rebar_size_to_bytes(size) - 1;
 
 	/* Check if the new config works by trying to assign everything. */
-	if (dev->bus->self) {
+	if (pci_can_move_bars) {
+		pci_rescan_bus(dev->bus);
+
+		if (!res->flags || (res->flags & IORESOURCE_UNSET) || !res->parent) {
+			pci_err(dev, "BAR %d resize failed\n", resno);
+			ret = -1;
+			goto error_resize;
+		}
+	} else if (dev->bus->self) {
 		ret = pci_reassign_bridge_resources(dev->bus->self, res->flags);
 		if (ret)
 			goto error_resize;
@@ -516,6 +527,8 @@ int pci_resize_resource(struct pci_dev *dev, int resno, int size)
 error_resize:
 	pci_rebar_set_size(dev, resno, old);
 	res->end = res->start + pci_rebar_size_to_bytes(old) - 1;
+	if (pci_can_move_bars)
+		pci_rescan_bus(dev->bus);
 	return ret;
 }
 EXPORT_SYMBOL(pci_resize_resource);
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v9 22/26] PCI: hotplug: Enable the movable BARs feature by default
  2020-12-18 17:39 [PATCH v9 00/26] PCI: Allow BAR movement during boot and hotplug Sergei Miroshnichenko
                   ` (20 preceding siblings ...)
  2020-12-18 17:40 ` [PATCH v9 21/26] PCI: Rescan the bus to resize a BAR Sergei Miroshnichenko
@ 2020-12-18 17:40 ` Sergei Miroshnichenko
  2020-12-18 17:40 ` [PATCH v9 23/26] PCI/portdrv: Declare support of movable BARs Sergei Miroshnichenko
                   ` (4 subsequent siblings)
  26 siblings, 0 replies; 44+ messages in thread
From: Sergei Miroshnichenko @ 2020-12-18 17:40 UTC (permalink / raw)
  To: linux-pci
  Cc: Bjorn Helgaas, Lukas Wunner, Stefan Roese, Andy Lavr,
	Christian König, David Laight, Rajat Jain, linux,
	Sergei Miroshnichenko

This is the last patch in the series which implements the essentials of the
movable BARs feature, so it is turned on by default now.

Tested on x86_64; and with extra patches applied, it also works on ppc64le
(PowerNV).

In case of problems it is still can be overridden by the following command
line option:

  pcie_movable_bars=off

Signed-off-by: Sergei Miroshnichenko <s.miroshnichenko@yadro.com>
---
 drivers/pci/pci.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index f393f0bc8ec4..98fabff81028 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -76,7 +76,7 @@ static void pci_dev_d3_sleep(struct pci_dev *dev)
 int pci_domains_supported = 1;
 #endif
 
-bool pci_can_move_bars;
+bool pci_can_move_bars = true;
 
 #define DEFAULT_CARDBUS_IO_SIZE		(256)
 #define DEFAULT_CARDBUS_MEM_SIZE	(64*1024*1024)
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v9 23/26] PCI/portdrv: Declare support of movable BARs
  2020-12-18 17:39 [PATCH v9 00/26] PCI: Allow BAR movement during boot and hotplug Sergei Miroshnichenko
                   ` (21 preceding siblings ...)
  2020-12-18 17:40 ` [PATCH v9 22/26] PCI: hotplug: Enable the movable BARs feature by default Sergei Miroshnichenko
@ 2020-12-18 17:40 ` Sergei Miroshnichenko
  2020-12-18 17:40 ` [PATCH v9 24/26] nvme-pci: Handle " Sergei Miroshnichenko
                   ` (3 subsequent siblings)
  26 siblings, 0 replies; 44+ messages in thread
From: Sergei Miroshnichenko @ 2020-12-18 17:40 UTC (permalink / raw)
  To: linux-pci
  Cc: Bjorn Helgaas, Lukas Wunner, Stefan Roese, Andy Lavr,
	Christian König, David Laight, Rajat Jain, linux,
	Sergei Miroshnichenko

Currently there are no reliable way to determine if a driver uses BARs of
its devices (their struct resource don't always have a child), so if it
doesn't explicitly support movable BARs, they are considered fixed.

The portdrv driver for PCI switches don't use BARs, so add a new hook
.bar_fixed() { return false; } to increase the chances to allocate new
BARs for new devices.

Signed-off-by: Sergei Miroshnichenko <s.miroshnichenko@yadro.com>
---
 drivers/pci/pcie/portdrv_pci.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/pci/pcie/portdrv_pci.c b/drivers/pci/pcie/portdrv_pci.c
index 0b250bc5f405..3043f7e4d3c1 100644
--- a/drivers/pci/pcie/portdrv_pci.c
+++ b/drivers/pci/pcie/portdrv_pci.c
@@ -212,6 +212,11 @@ static const struct pci_error_handlers pcie_portdrv_err_handler = {
 	.resume = pcie_portdrv_err_resume,
 };
 
+static bool pcie_portdrv_bar_fixed(struct pci_dev *pdev, int resno)
+{
+	return false;
+}
+
 static struct pci_driver pcie_portdriver = {
 	.name		= "pcieport",
 	.id_table	= &port_pci_ids[0],
@@ -222,6 +227,8 @@ static struct pci_driver pcie_portdriver = {
 
 	.err_handler	= &pcie_portdrv_err_handler,
 
+	.bar_fixed	= pcie_portdrv_bar_fixed,
+
 	.driver.pm	= PCIE_PORTDRV_PM_OPS,
 };
 
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v9 24/26] nvme-pci: Handle movable BARs
  2020-12-18 17:39 [PATCH v9 00/26] PCI: Allow BAR movement during boot and hotplug Sergei Miroshnichenko
                   ` (22 preceding siblings ...)
  2020-12-18 17:40 ` [PATCH v9 23/26] PCI/portdrv: Declare support of movable BARs Sergei Miroshnichenko
@ 2020-12-18 17:40 ` Sergei Miroshnichenko
  2020-12-18 17:40 ` [PATCH v9 25/26] PCI: Add a message for updating BARs Sergei Miroshnichenko
                   ` (2 subsequent siblings)
  26 siblings, 0 replies; 44+ messages in thread
From: Sergei Miroshnichenko @ 2020-12-18 17:40 UTC (permalink / raw)
  To: linux-pci
  Cc: Bjorn Helgaas, Lukas Wunner, Stefan Roese, Andy Lavr,
	Christian König, David Laight, Rajat Jain, linux,
	Sergei Miroshnichenko, linux-nvme, Christoph Hellwig

Hot-added devices can affect the existing ones by moving their BARs. The
PCI subsystem will inform the NVME driver about this by invoking the
.rescan_prepare() and .rescan_done() hooks, so the BARs can by re-mapped.

Tested under the "randrw" mode of the fio tool, and when using an NVME
drive as a root filesystem storage. Before the hot-adding:

  % sudo cat /proc/iomem
  ...
                3fe800000000-3fe8007fffff : PCI Bus 0020:0b
                  3fe800000000-3fe8007fffff : PCI Bus 0020:18
                    3fe800000000-3fe8000fffff : 0020:18:00.0
                      3fe800000000-3fe8000fffff : nvme
                      ^^^^^^^^^^^^^^^^^^^^^^^^^
                    3fe800100000-3fe80017ffff : 0020:18:00.0
  ...

Then another NVME drive was hot-added, so BARs of the 0020:18:00.0 are
moved:

  % sudo cat /proc/iomem
    ...
                3fe800000000-3fe800ffffff : PCI Bus 0020:0b
                  3fe800000000-3fe8007fffff : PCI Bus 0020:10
                    3fe800000000-3fe800003fff : 0020:10:00.0
                      3fe800000000-3fe800003fff : nvme
                    3fe800010000-3fe80001ffff : 0020:10:00.0
                  3fe800800000-3fe800ffffff : PCI Bus 0020:18
                    3fe800800000-3fe8008fffff : 0020:18:00.0
                      3fe800800000-3fe8008fffff : nvme
                      ^^^^^^^^^^^^^^^^^^^^^^^^^
                    3fe800900000-3fe80097ffff : 0020:18:00.0
    ...

During the rescanning, both READ and WRITE speeds drop to zero for a while
due to driver's pause, then restore.

Cc: linux-nvme@lists.infradead.org
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sergei Miroshnichenko <s.miroshnichenko@yadro.com>
---
 drivers/nvme/host/pci.c | 27 ++++++++++++++++++++++++++-
 1 file changed, 26 insertions(+), 1 deletion(-)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index b4385cb0ff60..7993062b11b6 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -1656,7 +1656,7 @@ static int nvme_remap_bar(struct nvme_dev *dev, unsigned long size)
 {
 	struct pci_dev *pdev = to_pci_dev(dev->dev);
 
-	if (size <= dev->bar_mapped_size)
+	if (dev->bar && size <= dev->bar_mapped_size)
 		return 0;
 	if (size > pci_resource_len(pdev, 0))
 		return -ENOMEM;
@@ -3152,6 +3152,28 @@ static void nvme_error_resume(struct pci_dev *pdev)
 	flush_work(&dev->ctrl.reset_work);
 }
 
+static bool nvme_bar_fixed(struct pci_dev *pdev, int resno)
+{
+	return false;
+}
+
+static void nvme_rescan_prepare(struct pci_dev *pdev)
+{
+	struct nvme_dev *dev = pci_get_drvdata(pdev);
+
+	nvme_dev_disable(dev, true);
+	nvme_dev_unmap(dev);
+	dev->bar = NULL;
+}
+
+static void nvme_rescan_done(struct pci_dev *pdev)
+{
+	struct nvme_dev *dev = pci_get_drvdata(pdev);
+
+	nvme_dev_map(dev);
+	nvme_reset_ctrl(&dev->ctrl);
+}
+
 static const struct pci_error_handlers nvme_err_handler = {
 	.error_detected	= nvme_error_detected,
 	.slot_reset	= nvme_slot_reset,
@@ -3238,6 +3260,9 @@ static struct pci_driver nvme_driver = {
 #endif
 	.sriov_configure = pci_sriov_configure_simple,
 	.err_handler	= &nvme_err_handler,
+	.rescan_prepare	= nvme_rescan_prepare,
+	.rescan_done	= nvme_rescan_done,
+	.bar_fixed	= nvme_bar_fixed,
 };
 
 static int __init nvme_init(void)
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v9 25/26] PCI: Add a message for updating BARs
  2020-12-18 17:39 [PATCH v9 00/26] PCI: Allow BAR movement during boot and hotplug Sergei Miroshnichenko
                   ` (23 preceding siblings ...)
  2020-12-18 17:40 ` [PATCH v9 24/26] nvme-pci: Handle " Sergei Miroshnichenko
@ 2020-12-18 17:40 ` Sergei Miroshnichenko
  2020-12-18 17:40 ` [PATCH v9 26/26] resource: increase max nesting level for /proc/iomem Sergei Miroshnichenko
  2021-01-28 14:53 ` [PATCH v9 00/26] PCI: Allow BAR movement during boot and hotplug Bjorn Helgaas
  26 siblings, 0 replies; 44+ messages in thread
From: Sergei Miroshnichenko @ 2020-12-18 17:40 UTC (permalink / raw)
  To: linux-pci
  Cc: Bjorn Helgaas, Lukas Wunner, Stefan Roese, Andy Lavr,
	Christian König, David Laight, Rajat Jain, linux,
	Sergei Miroshnichenko

Add a new debug message for changing a BAR value of a device:

[    1.851161] pci 0003:0a:00.1: BAR 0 updated: 0x60200c2000000 -> 0x6020142000000

Signed-off-by: Sergei Miroshnichenko <s.miroshnichenko@yadro.com>
---
 drivers/pci/setup-res.c | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/setup-res.c b/drivers/pci/setup-res.c
index 83a491f6a2c2..5fed21aed9b8 100644
--- a/drivers/pci/setup-res.c
+++ b/drivers/pci/setup-res.c
@@ -27,7 +27,8 @@ static void pci_std_update_resource(struct pci_dev *dev, int resno)
 	struct pci_bus_region region;
 	bool disable;
 	u16 cmd;
-	u32 new, check, mask;
+	u32 new, check, mask, old;
+	u64 old_start;
 	int reg;
 	struct resource *res = dev->resource + resno;
 
@@ -96,6 +97,9 @@ static void pci_std_update_resource(struct pci_dev *dev, int resno)
 				      cmd & ~PCI_COMMAND_MEMORY);
 	}
 
+	pci_read_config_dword(dev, reg, &old);
+	old_start = old & mask;
+
 	pci_write_config_dword(dev, reg, new);
 	pci_read_config_dword(dev, reg, &check);
 
@@ -105,6 +109,9 @@ static void pci_std_update_resource(struct pci_dev *dev, int resno)
 	}
 
 	if (res->flags & IORESOURCE_MEM_64) {
+		pci_read_config_dword(dev, reg + 4, &old);
+		old_start |= (u64)old << 32;
+
 		new = region.start >> 16 >> 16;
 		pci_write_config_dword(dev, reg + 4, new);
 		pci_read_config_dword(dev, reg + 4, &check);
@@ -116,6 +123,11 @@ static void pci_std_update_resource(struct pci_dev *dev, int resno)
 
 	if (disable)
 		pci_write_config_word(dev, PCI_COMMAND, cmd);
+
+	if (old_start != region.start)
+		pci_info(dev, "BAR %d updated: %#llx -> %#llx\n", resno,
+			 (unsigned long long)old_start,
+			 (unsigned long long)region.start);
 }
 
 void pci_update_resource(struct pci_dev *dev, int resno)
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v9 26/26] resource: increase max nesting level for /proc/iomem
  2020-12-18 17:39 [PATCH v9 00/26] PCI: Allow BAR movement during boot and hotplug Sergei Miroshnichenko
                   ` (24 preceding siblings ...)
  2020-12-18 17:40 ` [PATCH v9 25/26] PCI: Add a message for updating BARs Sergei Miroshnichenko
@ 2020-12-18 17:40 ` Sergei Miroshnichenko
  2021-01-28 14:53 ` [PATCH v9 00/26] PCI: Allow BAR movement during boot and hotplug Bjorn Helgaas
  26 siblings, 0 replies; 44+ messages in thread
From: Sergei Miroshnichenko @ 2020-12-18 17:40 UTC (permalink / raw)
  To: linux-pci
  Cc: Bjorn Helgaas, Lukas Wunner, Stefan Roese, Andy Lavr,
	Christian König, David Laight, Rajat Jain, linux,
	Sergei Miroshnichenko

Current value of maximum indentation for /proc/iomem and /proc/ioports is
not enough for comfortable PCI resources debugging when several bridges are
nested, so increase it from 5 to 15:

 % lspci -tv
 +-[0000:80]-+-00.0-[81]--
 |           +-01.0-[82]--
 |           +-03.0-[83-99]----00.0-[84-99]----04.0-[85-99]----00.0-[86-99]--+-00.0-[87]--
 |           |                                                               +-04.0-[88]--
 |           |                                                               +-08.0-[89]--
 |           |                                                               +-09.0-[8a]--
 |           |                                                               +-0c.0-[8b-95]----00.0-[8c-95]--+-00.0-[8d]--
 |           |                                                               |                               +-04.0-[8e]--
 |           |                                                               |                               +-08.0-[8f]--
 |           |                                                               |                               +-09.0-[90]--
 |           |                                                               |                               +-0c.0-[91]--
 |           |                                                               |                               +-10.0-[92]--+-00.0
 |           |                                                               |                               |            \-00.1

 % sudo cat /proc/iomem
 ...
 f9000000000-f907fffffff : PCI Bus 0000:80
   f9000100000-f90010fffff : PCI Bus 0000:83
     f9000100000-f90010fffff : PCI Bus 0000:84
       f9000100000-f90010fffff : PCI Bus 0000:85
         f9000100000-f90010fffff : PCI Bus 0000:86
           f9000100000-f90009fffff : PCI Bus 0000:8b
             f9000100000-f90009fffff : PCI Bus 0000:8c
               f9000100000-f90003fffff : PCI Bus 0000:92
                 f9000100000-f90001fffff : 0000:92:00.0
                   f9000100000-f90001fffff : qla2xx
                 f9000200000-f9002ffffff : 0000:92:00.1
                   f9000200000-f90002fffff : qla2xxx
                 f9000300000-f9000301fff : 0000:92:00.0
                   f9000300000-f9000301fff : qla2xxx
                 f9000302000-f9000303fff : 0000:92:00.1
                   f9000302000-f9000303fff : qla2xxx
                 f9000304000-f9000304fff : 0000:92:00.0
                   f9000304000-f9000304fff : qla2xxx
                 f9000305000-f9000305fff : 0000:92:00.1
                   f9000305000-f9000305fff : qla2xxx

Signed-off-by: Sergei Miroshnichenko <s.miroshnichenko@yadro.com>
---
 kernel/resource.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/resource.c b/kernel/resource.c
index 833394f9c608..4b3b378fed19 100644
--- a/kernel/resource.c
+++ b/kernel/resource.c
@@ -83,7 +83,7 @@ static void *r_next(struct seq_file *m, void *v, loff_t *pos)
 
 #ifdef CONFIG_PROC_FS
 
-enum { MAX_IORES_LEVEL = 5 };
+enum { MAX_IORES_LEVEL = 15 };
 
 static void *r_start(struct seq_file *m, loff_t *pos)
 	__acquires(resource_lock)
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* Re: [PATCH v9 01/26] PCI: Fix race condition in pci_enable/disable_device()
  2020-12-18 17:39 ` [PATCH v9 01/26] PCI: Fix race condition in pci_enable/disable_device() Sergei Miroshnichenko
@ 2020-12-28 15:37   ` Sergei Miroshnichenko
  0 siblings, 0 replies; 44+ messages in thread
From: Sergei Miroshnichenko @ 2020-12-28 15:37 UTC (permalink / raw)
  To: linux-pci

Hi,

The kbuild test robot has reported a build error on some configs,
pointing out that another #ifdef should be used. I've checked, the
DEBUG_LOCK_ALLOC seems to suit better here:

On Fri, 2020-12-18 at 20:39 +0300, Sergei Miroshnichenko wrote:
> This is a yet another approach to fix an old [1-2] concurrency issue,
> when:
>  - two or more devices are being hot-added into a bridge which was
>    initially empty;
>  - a bridge with two or more devices is being hot-added;
>  - the BIOS/bootloader/firmware doesn't pre-enable bridges during
> boot;
> 
> ...
>  
> +#ifdef CONFIG_PROVE_LOCKING

+#ifdef CONFIG_DEBUG_LOCK_ALLOC

> +static int pci_bridge_depth(struct pci_dev *dev)
> +{
> +	struct pci_dev *bridge = pci_upstream_bridge(dev);
> +
> +	if (!bridge)
> +		return 0;
> +
> +	return 1 + pci_bridge_depth(bridge);
> +}
> +#endif /* CONFIG_PROVE_LOCKING */

+#endif /* CONFIG_DEBUG_LOCK_ALLOC */

> +
>  static void pci_enable_bridge(struct pci_dev *dev)
>  {
>  	struct pci_dev *bridge;
>  	int retval;
>  
> +	mutex_lock_nested(&dev->enable_mutex, pci_bridge_depth(dev));
> +
>  	bridge = pci_upstream_bridge(dev);
>  	if (bridge)
>  		pci_enable_bridge(bridge);

Is there a proper way to send a "hotfix" for a single patch of the
series of 26, without resending them all?

Best regards,
Serge

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v9 00/26] PCI: Allow BAR movement during boot and hotplug
  2020-12-18 17:39 [PATCH v9 00/26] PCI: Allow BAR movement during boot and hotplug Sergei Miroshnichenko
                   ` (25 preceding siblings ...)
  2020-12-18 17:40 ` [PATCH v9 26/26] resource: increase max nesting level for /proc/iomem Sergei Miroshnichenko
@ 2021-01-28 14:53 ` Bjorn Helgaas
  2021-01-28 20:39   ` Lukas Wunner
  2021-02-03 19:59   ` Sergei Miroshnichenko
  26 siblings, 2 replies; 44+ messages in thread
From: Bjorn Helgaas @ 2021-01-28 14:53 UTC (permalink / raw)
  To: Sergei Miroshnichenko
  Cc: linux-pci, Lukas Wunner, Stefan Roese, Andy Lavr,
	Christian König, David Laight, Rajat Jain, linux

On Fri, Dec 18, 2020 at 08:39:45PM +0300, Sergei Miroshnichenko wrote:
> Currently PCI hotplug works on top of resources which are usually reserved:
> by BIOS, bootloader, firmware, or by the kernel (pci=hpmemsize=XM). These
> resources are gaps in the address space where BARs of new devices may fit,
> and extra bus number per port, so bridges can be hot-added. This series aim
> the BARs problem: it shows the kernel how to redistribute them on the run,
> so the hotplug becomes predictable and cross-platform. A follow-up patchset
> will propose a solution for bus numbers. And another one -- for the powerpc
> arch-specific problems.

Hi Sergei,

It looks like there's a lot of good work here, but I don't really know
how to move forward with it.

I can certainly see scenarios where this functionality will be useful,
but the series currently doesn't mention bug reports that it fixes.  I
suspect there *are* some related bug reports, e.g., for Thunderbolt
hotplug.  We should dig them up, include pointers to them, and get the
reporters to test the series and provide feedback.

We had some review traffic on earlier versions, but as far as I can
see, nobody has stepped up with any actual signs of approval, e.g.,
Tested-by, Reviewed-by, or Acked-by tags.  That's a problem because
this touches a lot of sensitive code and I don't want to be stuck
fixing issues all by myself.  When issues crop up, I look to the
author and reviewers to help out.

Along that line, I'm a little concerned about how we will be able to
reproduce and debug problems.  Issues will likely depend on the
topology, the resources of the specific devices involved, and a
specific sequence of hotplug operations.  Those may be hard to
reproduce even for the reporter.  Maybe this is nothing more than a
grep pattern to extract relevant events from dmesg.  Ideal, but maybe
impractical, would be a way to reproduce things in a VM using qemu and
a script to simulate hotplugs.

> To arrange a space for BARs of new hot-added devices, the kernel can pause
> the drivers of working PCI devices and reshuffle the assigned BARs. When a
> driver is un-paused by the kernel, it should ioremap() the new addresses of
> its BARs.
> 
> Drivers indicate their support of the feature by implementing the new hooks
> .bar_fixed(), .rescan_prepare(), .rescan_done() in the struct pci_driver.
> If a driver doesn't yet support the feature, BARs of its devices are seen
> as immovable and handled in the same way as resources with the PCI_FIXED
> flag: they are guaranteed to remain untouched.
> 
> This approached was discussed during LPC 2020 [1].
> 
> The feature is usable not only for hotplug, but also for booting with a
> complete set of BARs assigned, and for Resizable BARs.

I'm interested in more details about both of these.  What does "a
complete set of BARs assigned" mean?  On x86, the BIOS usually assigns
all the BARs ahead of time, but I know that's not the case for all
arches.

For Resizable BARs, is the point that this allows more flexibility in
resizing BARs because we can now move things out of the way?

> Tested on a number of x86_64 machines without any special kernel command
> line arguments (some of them -- with older v8 revision of this patchset):
>  - PC: i7-5930K + ASUS X99-A;
>  - PC: i5-8500 + ASUS Z370-F;
>  - PC: AMD Ryzen 7 3700X + ASRock X570 + AMD 5700 XT (Resizable BARs);
>  - Supermicro Super Server/H11SSL-i: AMD EPYC 7251;
>  - HP ProLiant DL380 G5: Xeon X5460;
>  - Dell Inspiron N5010: i5 M 480;
>  - Dell Precision M6600: i7-2920XM.

Does this testing show no change in behavior and no regressions, or
does it show that this series fixes cases that previously did not
work?  If the latter, some bugzilla reports with before/after dmesg
logs would give some insight into how this works and also some good
justification.

> Also tested on a Power8 (Vesnin) and Power9 (Nicole) ppc64le machines, but
> with extra patchset, which is to be sent upstream a bit later.
> 
> The set contains:
>  01-02: Independent bugfixes, both are not related directly to the movable
>         BARs feature, but without them the rest of this series will not
> 	work as expected;
> 
>  03-14: Implement the essentials of the feature;
> 
>  15-21: Performance improvements for movable BARs and pciehp;
> 
>  22: Enable the feature by default;
> 
>  23-24: Add movable BARs support to nvme and portdrv;
> 
>  25-26: Debugging helpers.
> 
> This patchset is a part of our work on adding support for hot-adding
> chains of chassis full of other bridges, NVME drives, SAS HBAs, GPUs, etc.
> without special requirements such as Hot-Plug Controller, reservation of
> bus numbers or memory regions by firmware, etc.

This is a little bit of a chicken and egg situation.  I suspect many
people would like this functionality, but currently we either avoid it
because it's "known not to work" or we hack around it with the
firmware reservations you mention.

Bjorn

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v9 00/26] PCI: Allow BAR movement during boot and hotplug
  2021-01-28 14:53 ` [PATCH v9 00/26] PCI: Allow BAR movement during boot and hotplug Bjorn Helgaas
@ 2021-01-28 20:39   ` Lukas Wunner
  2021-02-01 12:55     ` Mika Westerberg
  2021-02-03 20:01     ` Sergei Miroshnichenko
  2021-02-03 19:59   ` Sergei Miroshnichenko
  1 sibling, 2 replies; 44+ messages in thread
From: Lukas Wunner @ 2021-01-28 20:39 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Sergei Miroshnichenko, linux-pci, Stefan Roese, Andy Lavr,
	Christian König, David Laight, Rajat Jain, linux,
	Yehezkel Bernat, Mario Limonciello, Christian Kellner,
	Mika Westerberg

On Thu, Jan 28, 2021 at 08:53:16AM -0600, Bjorn Helgaas wrote:
> On Fri, Dec 18, 2020 at 08:39:45PM +0300, Sergei Miroshnichenko wrote:
> > Currently PCI hotplug works on top of resources which are usually reserved:
> > by BIOS, bootloader, firmware, or by the kernel (pci=hpmemsize=XM). These
> > resources are gaps in the address space where BARs of new devices may fit,
> > and extra bus number per port, so bridges can be hot-added. This series aim
> > the BARs problem: it shows the kernel how to redistribute them on the run,
> > so the hotplug becomes predictable and cross-platform. A follow-up patchset
> > will propose a solution for bus numbers. And another one -- for the powerpc
> > arch-specific problems.
> 
> I can certainly see scenarios where this functionality will be useful,
> but the series currently doesn't mention bug reports that it fixes.  I
> suspect there *are* some related bug reports, e.g., for Thunderbolt
> hotplug.  We should dig them up, include pointers to them, and get the
> reporters to test the series and provide feedback.

In case it helps, an earlier version of the series was referenced
in this LWN article more than 2 years ago (scroll down to the
"Moving BARs" section at the end of the article):

https://lwn.net/Articles/767885/

The article provides some context:  Specifically, macOS is capable
of resizing and moving BARs, so this series sort of helps us catch
up with the competition.

With Thunderbolt, this series is particularly useful if
(a) PCIe cards are hot-added with large BARs (such as GPUs) and/or
(b) the Thunderbolt daisy-chain is very long.

Thunderbolt is essentially a cascade of nested hotplug ports,
so if more and more devices are added, it's easy to see that
the top-level hotplug port's BAR window may run out of space.

My understanding is that Sergei's use case doesn't involve
Thunderbolt at all but rather hotplugging of GPUs and network
cards in PowerPC servers in a datacenter, which may have the
same kinds of issues.

I intended to review and test this iteration of the series more
closely, but haven't been able to carve out the required time.
I'm adding some Thunderbolt folks to cc in the hope that they
can at least test the series on their development branch.
Getting this upstreamed should really be in the best interest
of Intel and other promulgators of Thunderbolt.

Thanks,

Lukas

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v9 00/26] PCI: Allow BAR movement during boot and hotplug
  2021-01-28 20:39   ` Lukas Wunner
@ 2021-02-01 12:55     ` Mika Westerberg
  2021-02-03 20:17       ` Sergei Miroshnichenko
  2021-02-03 20:01     ` Sergei Miroshnichenko
  1 sibling, 1 reply; 44+ messages in thread
From: Mika Westerberg @ 2021-02-01 12:55 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: Bjorn Helgaas, Sergei Miroshnichenko, linux-pci, Stefan Roese,
	Andy Lavr, Christian König, David Laight, Rajat Jain, linux,
	Yehezkel Bernat, Mario Limonciello, Christian Kellner

Hi,

On Thu, Jan 28, 2021 at 09:39:29PM +0100, Lukas Wunner wrote:
> On Thu, Jan 28, 2021 at 08:53:16AM -0600, Bjorn Helgaas wrote:
> > On Fri, Dec 18, 2020 at 08:39:45PM +0300, Sergei Miroshnichenko wrote:
> > > Currently PCI hotplug works on top of resources which are usually reserved:
> > > by BIOS, bootloader, firmware, or by the kernel (pci=hpmemsize=XM). These
> > > resources are gaps in the address space where BARs of new devices may fit,
> > > and extra bus number per port, so bridges can be hot-added. This series aim
> > > the BARs problem: it shows the kernel how to redistribute them on the run,
> > > so the hotplug becomes predictable and cross-platform. A follow-up patchset
> > > will propose a solution for bus numbers. And another one -- for the powerpc
> > > arch-specific problems.
> > 
> > I can certainly see scenarios where this functionality will be useful,
> > but the series currently doesn't mention bug reports that it fixes.  I
> > suspect there *are* some related bug reports, e.g., for Thunderbolt
> > hotplug.  We should dig them up, include pointers to them, and get the
> > reporters to test the series and provide feedback.
> 
> In case it helps, an earlier version of the series was referenced
> in this LWN article more than 2 years ago (scroll down to the
> "Moving BARs" section at the end of the article):
> 
> https://lwn.net/Articles/767885/
> 
> The article provides some context:  Specifically, macOS is capable
> of resizing and moving BARs, so this series sort of helps us catch
> up with the competition.
> 
> With Thunderbolt, this series is particularly useful if
> (a) PCIe cards are hot-added with large BARs (such as GPUs) and/or
> (b) the Thunderbolt daisy-chain is very long.
> 
> Thunderbolt is essentially a cascade of nested hotplug ports,
> so if more and more devices are added, it's easy to see that
> the top-level hotplug port's BAR window may run out of space.
> 
> My understanding is that Sergei's use case doesn't involve
> Thunderbolt at all but rather hotplugging of GPUs and network
> cards in PowerPC servers in a datacenter, which may have the
> same kinds of issues.
> 
> I intended to review and test this iteration of the series more
> closely, but haven't been able to carve out the required time.
> I'm adding some Thunderbolt folks to cc in the hope that they
> can at least test the series on their development branch.
> Getting this upstreamed should really be in the best interest
> of Intel and other promulgators of Thunderbolt.

Sure. It seems that this series was submitted in December so probably
not applicable to the pci.git/next anymore. Anyways, I can give it a try
on a TBT capable system if someone tells me what exactly to test ;-)
Probably at least that the existing functionality still works but
something else maybe too?

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v9 00/26] PCI: Allow BAR movement during boot and hotplug
  2021-01-28 14:53 ` [PATCH v9 00/26] PCI: Allow BAR movement during boot and hotplug Bjorn Helgaas
  2021-01-28 20:39   ` Lukas Wunner
@ 2021-02-03 19:59   ` Sergei Miroshnichenko
  2021-02-04  8:26     ` Hinko Kocevar
  1 sibling, 1 reply; 44+ messages in thread
From: Sergei Miroshnichenko @ 2021-02-03 19:59 UTC (permalink / raw)
  To: helgaas
  Cc: David.Laight, rajatja, christian.koenig, andy.lavr, linux-pci,
	sr, lukas, linux

Hi Bjorn,

On Thu, 2021-01-28 at 08:53 -0600, Bjorn Helgaas wrote:
> I can certainly see scenarios where this functionality will be
> useful,
> but the series currently doesn't mention bug reports that it
> fixes.  I
> suspect there *are* some related bug reports, e.g., for Thunderbolt
> hotplug.  We should dig them up, include pointers to them, and get
> the
> reporters to test the series and provide feedback.

It never occurred to me to actually search for reports, thanks.

> We had some review traffic on earlier versions, but as far as I can
> see, nobody has stepped up with any actual signs of approval, e.g.,
> Tested-by, Reviewed-by, or Acked-by tags.  That's a problem because
> this touches a lot of sensitive code and I don't want to be stuck
> fixing issues all by myself.  When issues crop up, I look to the
> author and reviewers to help out.

Indeed, I had only a few informal personal responses [from outside of
our company]:
 - some machines have no regressions with the kernel built with Clang-
12+LTO+IAS;
 - NVMe hotplug started working for AMD Epyc with a PEX switch;
 - Another system with several PEX switches is under ongoing
experiments.

Providing an ability to quickly return to the old BARs allocating is
the reason why I keep wrapping the most intrusive code with "if
(pci_can_move_bars)". Originally I wanted it to be disabled by default,
available for those truly desperate. But there was an objection that
the code will not be ever tested this way.

> Along that line, I'm a little concerned about how we will be able to
> reproduce and debug problems.  Issues will likely depend on the
> topology, the resources of the specific devices involved, and a
> specific sequence of hotplug operations.  Those may be hard to
> reproduce even for the reporter.  Maybe this is nothing more than a
> grep pattern to extract relevant events from dmesg.  Ideal, but maybe
> impractical, would be a way to reproduce things in a VM using qemu
> and
> a script to simulate hotplugs.

I haven't yet tried the qemu for PCI debugging, definitely need to take
a look. To help with that, the patchset is supplied with additional
prints, but CONFIG_PCI_DEBUG=y is still preferred. A simulation will
reveal if additional debug messages are needed.

> > The feature is usable not only for hotplug, but also for booting
> > with a
> > complete set of BARs assigned, and for Resizable BARs.
> 
> I'm interested in more details about both of these.  What does "a
> complete set of BARs assigned" mean?  On x86, the BIOS usually
> assigns
> all the BARs ahead of time, but I know that's not the case for all
> arches.

Usually yes, but we have at least one x86 PC that skips some BARs (need
to check out again its particular model and version, IIRC that was
Z370-F) -- not the crucial ones like BAR0, though. That really got me
by surprise, so now it is one more case covered.

> For Resizable BARs, is the point that this allows more flexibility in
> resizing BARs because we can now move things out of the way?

Not only things out of the way, but most importantly the bridge windows
are now floating as well. So it's more freedom for a new BAR to
"choose" a place.

An awfully synthetic example:

|                       RC Address Space                       |
| orig GPU BAR | fixed BAR |                                   |
|              | fixed BAR |     Resized GPU BAR        |      |

> > Tested on a number of x86_64 machines without any special kernel
> > command
> > line arguments (some of them -- with older v8 revision of this
> > patchset):
> >  - PC: i7-5930K + ASUS X99-A;
> >  - PC: i5-8500 + ASUS Z370-F;
> >  - PC: AMD Ryzen 7 3700X + ASRock X570 + AMD 5700 XT (Resizable
> > BARs);
> >  - Supermicro Super Server/H11SSL-i: AMD EPYC 7251;
> >  - HP ProLiant DL380 G5: Xeon X5460;
> >  - Dell Inspiron N5010: i5 M 480;
> >  - Dell Precision M6600: i7-2920XM.
> 
> Does this testing show no change in behavior and no regressions, or
> does it show that this series fixes cases that previously did not
> work?  If the latter, some bugzilla reports with before/after dmesg
> logs would give some insight into how this works and also some good
> justification.

Both, actually, but the dmesg logs were never published, so they will
be the next step -- combined with qemu scripts, if it works.

> > This patchset is a part of our work on adding support for hot-
> > adding
> > chains of chassis full of other bridges, NVME drives, SAS HBAs,
> > GPUs, etc.
> > without special requirements such as Hot-Plug Controller,
> > reservation of
> > bus numbers or memory regions by firmware, etc.
> 
> This is a little bit of a chicken and egg situation.  I suspect many
> people would like this functionality, but currently we either avoid
> it
> because it's "known not to work" or we hack around it with the
> firmware reservations you mention.

The current (v9) patchset became more symbiotic with other hotplug
facilities: it started to respect pci=hpmemsize etc, and only drops it
if such allocation is impossible.

Attention Button for Assisted Hotplug (pciehp driver) was always
supported. It is also fully compatible with DPC.

The goal is to squeeze as much as possible even without special
hw+fw+sw support, when the only available tool is a manual rescan via
sysfs.

Serge

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v9 00/26] PCI: Allow BAR movement during boot and hotplug
  2021-01-28 20:39   ` Lukas Wunner
  2021-02-01 12:55     ` Mika Westerberg
@ 2021-02-03 20:01     ` Sergei Miroshnichenko
  2021-02-04  9:34       ` David Laight
  1 sibling, 1 reply; 44+ messages in thread
From: Sergei Miroshnichenko @ 2021-02-03 20:01 UTC (permalink / raw)
  To: lukas, helgaas
  Cc: David.Laight, christian, christian.koenig, YehezkelShB, rajatja,
	andy.lavr, linux, linux-pci, sr, mika.westerberg,
	mario.limonciello

On Thu, 2021-01-28 at 21:39 +0100, Lukas Wunner wrote:
> With Thunderbolt, this series is particularly useful if
> (a) PCIe cards are hot-added with large BARs (such as GPUs) and/or
> (b) the Thunderbolt daisy-chain is very long.
> 
> Thunderbolt is essentially a cascade of nested hotplug ports,
> so if more and more devices are added, it's easy to see that
> the top-level hotplug port's BAR window may run out of space.
> 
> My understanding is that Sergei's use case doesn't involve
> Thunderbolt at all but rather hotplugging of GPUs and network
> cards in PowerPC servers in a datacenter, which may have the
> same kinds of issues.

Hi Lukas,

I have yet to find some Thunderbolt hardware and try it.

Actually, we are hotplugging not only endpoints, but nested PCIe
switches as well: those are PCIe JBOD chassis (with NVMes and SAS
drives).

But to deal with them we have to use an additional patchset "Movable
bus numbers", that was also described during LPC2020 (including its
problematic points), here is its RFC, it haven't changed much since
then:

https://lore.kernel.org/linux-pci/20191024172157.878735-5-s.miroshnichenko@yadro.com/T/

Can Thunderbolt have a chain of switches deep enough to have problems
with bus numbers, when reserving is not the best option?

Serge

> I intended to review and test this iteration of the series more
> closely, but haven't been able to carve out the required time.
> I'm adding some Thunderbolt folks to cc in the hope that they
> can at least test the series on their development branch.
> Getting this upstreamed should really be in the best interest
> of Intel and other promulgators of Thunderbolt.
> 
> Thanks,
> 
> Lukas

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v9 00/26] PCI: Allow BAR movement during boot and hotplug
  2021-02-01 12:55     ` Mika Westerberg
@ 2021-02-03 20:17       ` Sergei Miroshnichenko
  2021-02-04 10:49         ` mika.westerberg
  0 siblings, 1 reply; 44+ messages in thread
From: Sergei Miroshnichenko @ 2021-02-03 20:17 UTC (permalink / raw)
  To: mika.westerberg, lukas
  Cc: David.Laight, christian, christian.koenig, rajatja, YehezkelShB,
	mario.limonciello, helgaas, linux, linux-pci, sr, andy.lavr

Hi Mika,

On Mon, 2021-02-01 at 14:55 +0200, Mika Westerberg wrote:
> On Thu, Jan 28, 2021 at 09:39:29PM +0100, Lukas Wunner wrote:
> > On Thu, Jan 28, 2021 at 08:53:16AM -0600, Bjorn Helgaas wrote:
> > > On Fri, Dec 18, 2020 at 08:39:45PM +0300, Sergei Miroshnichenko
> > > wrote:
> > > > ...
> > 
> > I intended to review and test this iteration of the series more
> > closely, but haven't been able to carve out the required time.
> > I'm adding some Thunderbolt folks to cc in the hope that they
> > can at least test the series on their development branch.
> > Getting this upstreamed should really be in the best interest
> > of Intel and other promulgators of Thunderbolt.
> 
> Sure. It seems that this series was submitted in December so probably
> not applicable to the pci.git/next anymore. Anyways, I can give it a
> try
> on a TBT capable system if someone tells me what exactly to test ;-)
> Probably at least that the existing functionality still works but
> something else maybe too?

For setups that worked fine, the only expected change is a possible
little different BAR layout (in /proc/iomem), and there should the same
quantity (or more) of BARs assigned than before.

But if there are any problematic setups, which weren't able to arrange
new BARs, this patchset may push a bit further.

In a few days I'll provide an updated branch for our mirror of the
kernel on Github, with a complete and bumped set of patches, reducing
the steps required to test them.

Thanks!

Serge

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v9 00/26] PCI: Allow BAR movement during boot and hotplug
  2021-02-03 19:59   ` Sergei Miroshnichenko
@ 2021-02-04  8:26     ` Hinko Kocevar
  0 siblings, 0 replies; 44+ messages in thread
From: Hinko Kocevar @ 2021-02-04  8:26 UTC (permalink / raw)
  To: Sergei Miroshnichenko, helgaas
  Cc: David.Laight, rajatja, christian.koenig, andy.lavr, linux-pci,
	sr, lukas, linux

Dear,

On 2/3/21 8:59 PM, Sergei Miroshnichenko wrote:
> Hi Bjorn,
> 
> On Thu, 2021-01-28 at 08:53 -0600, Bjorn Helgaas wrote:
>> I can certainly see scenarios where this functionality will be
>> useful,
>> but the series currently doesn't mention bug reports that it
>> fixes.  I
>> suspect there *are* some related bug reports, e.g., for Thunderbolt
>> hotplug.  We should dig them up, include pointers to them, and get
>> the
>> reporters to test the series and provide feedback.
> 
> It never occurred to me to actually search for reports, thanks.
> 
>> We had some review traffic on earlier versions, but as far as I can
>> see, nobody has stepped up with any actual signs of approval, e.g.,
>> Tested-by, Reviewed-by, or Acked-by tags.  That's a problem because
>> this touches a lot of sensitive code and I don't want to be stuck
>> fixing issues all by myself.  When issues crop up, I look to the
>> author and reviewers to help out.
> 
> Indeed, I had only a few informal personal responses [from outside of
> our company]:
>   - some machines have no regressions with the kernel built with Clang-
> 12+LTO+IAS;
>   - NVMe hotplug started working for AMD Epyc with a PEX switch;
>   - Another system with several PEX switches is under ongoing
> experiments.
> 


I reached out to Sergei a while ago about these patches on systems I 
work with. They involve a PEX switch on a separate card (MCH).

FWIW, I find this patchset really interesting and useful in the modular 
systems such as microTCA [1]. In those, up to 12 PCIe hotswappable 
mezzanine cards (AMCs) can be added/removed at anytime. Each can be a 
PCIe endpoint. The infrastructure usually involves a CPU card with a 
PCIe root complex. In addition, PCIe switch is located on the carrier 
hub (MCH) card which handles the PCIe links for AMCs.

At this time, if the system is booted without a certain slot being 
populated at CPU/Linux startup adding a card requires system power 
cycle. Today there are already (minor?) issues with the hotplug/hotswap 
of the cards that are present at Linux bootup hence folks that use these 
systems usually do not rely on any these features; as a result boxes are 
power cycled. I'm not saying it is all Linux;s fault, as the microTCA is 
inherently complex due to modularity and versatility of configurations 
one can have, even without hotplugging. It is also a matter of 
understanding how to configure and use the hotplugging that can be a 
challenge on its own.

Some preliminary tests of this patchset, I managed to run so far, 
suggest that newly added cards are nicely being recognized by the Linux 
and are ready for use without CPU reboot. I currently do not have access 
to the microTCA boxes I use. I hope to be in a better place in couple of 
weeks and look into these patches again!

Thanks!
//hinko

[1] 
https://www.picmg.org/wp-content/uploads/MicroTCA_Short_Form_Sept_2006.pdf

> Providing an ability to quickly return to the old BARs allocating is
> the reason why I keep wrapping the most intrusive code with "if
> (pci_can_move_bars)". Originally I wanted it to be disabled by default,
> available for those truly desperate. But there was an objection that
> the code will not be ever tested this way.
> 
>> Along that line, I'm a little concerned about how we will be able to
>> reproduce and debug problems.  Issues will likely depend on the
>> topology, the resources of the specific devices involved, and a
>> specific sequence of hotplug operations.  Those may be hard to
>> reproduce even for the reporter.  Maybe this is nothing more than a
>> grep pattern to extract relevant events from dmesg.  Ideal, but maybe
>> impractical, would be a way to reproduce things in a VM using qemu
>> and
>> a script to simulate hotplugs.
> 
> I haven't yet tried the qemu for PCI debugging, definitely need to take
> a look. To help with that, the patchset is supplied with additional
> prints, but CONFIG_PCI_DEBUG=y is still preferred. A simulation will
> reveal if additional debug messages are needed.
> 
>>> The feature is usable not only for hotplug, but also for booting
>>> with a
>>> complete set of BARs assigned, and for Resizable BARs.
>>
>> I'm interested in more details about both of these.  What does "a
>> complete set of BARs assigned" mean?  On x86, the BIOS usually
>> assigns
>> all the BARs ahead of time, but I know that's not the case for all
>> arches.
> 
> Usually yes, but we have at least one x86 PC that skips some BARs (need
> to check out again its particular model and version, IIRC that was
> Z370-F) -- not the crucial ones like BAR0, though. That really got me
> by surprise, so now it is one more case covered.
> 
>> For Resizable BARs, is the point that this allows more flexibility in
>> resizing BARs because we can now move things out of the way?
> 
> Not only things out of the way, but most importantly the bridge windows
> are now floating as well. So it's more freedom for a new BAR to
> "choose" a place.
> 
> An awfully synthetic example:
> 
> |                       RC Address Space                       |
> | orig GPU BAR | fixed BAR |                                   |
> |              | fixed BAR |     Resized GPU BAR        |      |
> 
>>> Tested on a number of x86_64 machines without any special kernel
>>> command
>>> line arguments (some of them -- with older v8 revision of this
>>> patchset):
>>>   - PC: i7-5930K + ASUS X99-A;
>>>   - PC: i5-8500 + ASUS Z370-F;
>>>   - PC: AMD Ryzen 7 3700X + ASRock X570 + AMD 5700 XT (Resizable
>>> BARs);
>>>   - Supermicro Super Server/H11SSL-i: AMD EPYC 7251;
>>>   - HP ProLiant DL380 G5: Xeon X5460;
>>>   - Dell Inspiron N5010: i5 M 480;
>>>   - Dell Precision M6600: i7-2920XM.
>>
>> Does this testing show no change in behavior and no regressions, or
>> does it show that this series fixes cases that previously did not
>> work?  If the latter, some bugzilla reports with before/after dmesg
>> logs would give some insight into how this works and also some good
>> justification.
> 
> Both, actually, but the dmesg logs were never published, so they will
> be the next step -- combined with qemu scripts, if it works.
> 
>>> This patchset is a part of our work on adding support for hot-
>>> adding
>>> chains of chassis full of other bridges, NVME drives, SAS HBAs,
>>> GPUs, etc.
>>> without special requirements such as Hot-Plug Controller,
>>> reservation of
>>> bus numbers or memory regions by firmware, etc.
>>
>> This is a little bit of a chicken and egg situation.  I suspect many
>> people would like this functionality, but currently we either avoid
>> it
>> because it's "known not to work" or we hack around it with the
>> firmware reservations you mention.
> 
> The current (v9) patchset became more symbiotic with other hotplug
> facilities: it started to respect pci=hpmemsize etc, and only drops it
> if such allocation is impossible.
> 
> Attention Button for Assisted Hotplug (pciehp driver) was always
> supported. It is also fully compatible with DPC.
> 
> The goal is to squeeze as much as possible even without special
> hw+fw+sw support, when the only available tool is a manual rescan via
> sysfs.
> 
> Serge
> 

^ permalink raw reply	[flat|nested] 44+ messages in thread

* RE: [PATCH v9 00/26] PCI: Allow BAR movement during boot and hotplug
  2021-02-03 20:01     ` Sergei Miroshnichenko
@ 2021-02-04  9:34       ` David Laight
  0 siblings, 0 replies; 44+ messages in thread
From: David Laight @ 2021-02-04  9:34 UTC (permalink / raw)
  To: 'Sergei Miroshnichenko', lukas, helgaas
  Cc: christian, christian.koenig, YehezkelShB, rajatja, andy.lavr,
	linux, linux-pci, sr, mika.westerberg, mario.limonciello

> Actually, we are hotplugging not only endpoints, but nested PCIe
> switches as well: those are PCIe JBOD chassis (with NVMes and SAS
> drives).

From what I remember of playing with some PCI hot-plug hardware
and cardbus extenders to PCI chassis many years ago bus numbers
were actually a big problem.

A surprising number of io cards contain a bridge - so need a
bus number if hot-plugged.

(In spite of the marketing hype hot-plug basically let you remove
a working card and replace it with an identical one!
Modern drivers and OS are likely to handle the errors from
faulty cards better.)

The initial allocation of bus-numbers could easily spread out
the unused bus numbers.
Doing that for memory resources may have other problems
(You probably don't want to allocate the entire range to the
root hub?)

Are the bus numbers exposed as keys/filename in /sys ?
In which case changing them after boot is problematic.
You'd need a map of virtual bus numbers to physical ones.

As well as your 'suspend/resume' sequence, it should
be possible to send a card-remove/insert sequence to
an idle driver.

There is another case were BARs might need moving.
The PCIe spec doesn't allow very long (200ms?) from
reset removal (which might be close to power on) and
the requirement for endpoints to answer config cycles.
If you have to load a large FPGA from a serial EEPROM
this is actually a real problem.
So if the OS does a full probe of the PCI devices it may
find endpoints (or even bridges) that weren't actually there
when the BIOS (or equivalent) did its earlier probe.
Finding space in the middle of the pci devices for an endpoint
that wants two 1MB BARs is unlikely to suceed!

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v9 00/26] PCI: Allow BAR movement during boot and hotplug
  2021-02-03 20:17       ` Sergei Miroshnichenko
@ 2021-02-04 10:49         ` mika.westerberg
  2021-02-10 19:40           ` Sergei Miroshnichenko
  0 siblings, 1 reply; 44+ messages in thread
From: mika.westerberg @ 2021-02-04 10:49 UTC (permalink / raw)
  To: Sergei Miroshnichenko
  Cc: lukas, David.Laight, christian, christian.koenig, rajatja,
	YehezkelShB, mario.limonciello, helgaas, linux, linux-pci, sr,
	andy.lavr

On Wed, Feb 03, 2021 at 08:17:14PM +0000, Sergei Miroshnichenko wrote:
> Hi Mika,
> 
> On Mon, 2021-02-01 at 14:55 +0200, Mika Westerberg wrote:
> > On Thu, Jan 28, 2021 at 09:39:29PM +0100, Lukas Wunner wrote:
> > > On Thu, Jan 28, 2021 at 08:53:16AM -0600, Bjorn Helgaas wrote:
> > > > On Fri, Dec 18, 2020 at 08:39:45PM +0300, Sergei Miroshnichenko
> > > > wrote:
> > > > > ...
> > > 
> > > I intended to review and test this iteration of the series more
> > > closely, but haven't been able to carve out the required time.
> > > I'm adding some Thunderbolt folks to cc in the hope that they
> > > can at least test the series on their development branch.
> > > Getting this upstreamed should really be in the best interest
> > > of Intel and other promulgators of Thunderbolt.
> > 
> > Sure. It seems that this series was submitted in December so probably
> > not applicable to the pci.git/next anymore. Anyways, I can give it a
> > try
> > on a TBT capable system if someone tells me what exactly to test ;-)
> > Probably at least that the existing functionality still works but
> > something else maybe too?
> 
> For setups that worked fine, the only expected change is a possible
> little different BAR layout (in /proc/iomem), and there should the same
> quantity (or more) of BARs assigned than before.
> 
> But if there are any problematic setups, which weren't able to arrange
> new BARs, this patchset may push a bit further.

Got it.

> In a few days I'll provide an updated branch for our mirror of the
> kernel on Github, with a complete and bumped set of patches, reducing
> the steps required to test them.

Sounds good, thanks!

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v9 00/26] PCI: Allow BAR movement during boot and hotplug
  2021-02-04 10:49         ` mika.westerberg
@ 2021-02-10 19:40           ` Sergei Miroshnichenko
  2021-02-10 21:46             ` Lukas Wunner
  2021-02-11 11:39             ` mika.westerberg
  0 siblings, 2 replies; 44+ messages in thread
From: Sergei Miroshnichenko @ 2021-02-10 19:40 UTC (permalink / raw)
  To: mika.westerberg
  Cc: David.Laight, christian, christian.koenig, rajatja, YehezkelShB,
	mario.limonciello, helgaas, linux, linux-pci, sr, lukas,
	andy.lavr

On Thu, 2021-02-04 at 12:49 +0200, Mika Westerberg
wrote:
> On Wed, Feb 03, 2021 at 08:17:14PM +0000, Sergei Miroshnichenko
> wrote:
> > On Mon, 2021-02-01 at 14:55 +0200, Mika Westerberg wrote:
> > > On Thu, Jan 28, 2021 at 09:39:29PM +0100, Lukas Wunner wrote:
> > > > On Thu, Jan 28, 2021 at 08:53:16AM -0600, Bjorn Helgaas wrote:
> > > > > On Fri, Dec 18, 2020 at 08:39:45PM +0300, Sergei
> > > > > Miroshnichenko
> > > > > wrote:
> > > > > > ...
> > > > 
> > > > I intended to review and test this iteration of the series more
> > > > closely, but haven't been able to carve out the required time.
> > > > I'm adding some Thunderbolt folks to cc in the hope that they
> > > > can at least test the series on their development branch.
> > > > Getting this upstreamed should really be in the best interest
> > > > of Intel and other promulgators of Thunderbolt.
> > > 
> > > Sure. It seems that this series was submitted in December so
> > > probably
> > > not applicable to the pci.git/next anymore. Anyways, I can give
> > > it a
> > > try
> > > on a TBT capable system if someone tells me what exactly to test
> > > ;-)
> > > Probably at least that the existing functionality still works but
> > > something else maybe too?
> > 
> > For setups that worked fine, the only expected change is a possible
> > little different BAR layout (in /proc/iomem), and there should the
> > same
> > quantity (or more) of BARs assigned than before.
> > 
> > But if there are any problematic setups, which weren't able to
> > arrange
> > new BARs, this patchset may push a bit further.
> 
> Got it.
> 
> > In a few days I'll provide an updated branch for our mirror of the
> > kernel on Github, with a complete and bumped set of patches,
> > reducing
> > the steps required to test them.
> 
> Sounds good, thanks!

Hi Mika,

The branch is finally ready, so if you still have time for that, please
take a look:

https://github.com/YADRO-KNS/linux/tree/yadro/pcie_hotplug/movable_bars_v9.1


And a bit off topic: I've also updated the branch with a complete set
of our hotplug-related patches, including an RFC series of movable bus
numbers for hot-adding large and/or nested switches (at the cost of
breaking the sysfs+procsf ABI). In case one is ever going to try that,
the pci=realloc,movable_buses kernel command line arguments are needed:

https://github.com/YADRO-KNS/linux/commits/yadro/pcie_hotplug/movable_buses_v9.1

Thanks!

Serge

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v9 00/26] PCI: Allow BAR movement during boot and hotplug
  2021-02-10 19:40           ` Sergei Miroshnichenko
@ 2021-02-10 21:46             ` Lukas Wunner
  2021-02-11 11:39             ` mika.westerberg
  1 sibling, 0 replies; 44+ messages in thread
From: Lukas Wunner @ 2021-02-10 21:46 UTC (permalink / raw)
  To: Sergei Miroshnichenko
  Cc: mika.westerberg, David.Laight, christian, christian.koenig,
	rajatja, YehezkelShB, mario.limonciello, helgaas, linux,
	linux-pci, sr, andy.lavr

On Wed, Feb 10, 2021 at 07:40:06PM +0000, Sergei Miroshnichenko wrote:
> The branch is finally ready, so if you still have time for that, please
> take a look:
> 
> https://github.com/YADRO-KNS/linux/tree/yadro/pcie_hotplug/movable_bars_v9.1

Just a quick drive-by comment, in commit

    scsi: mpt3sas: Handle a surprise PCI unplug faster

you probably want to call pci_dev_is_disconnected() instead of
adding a new flag.

Thanks,

Lukas

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v9 00/26] PCI: Allow BAR movement during boot and hotplug
  2021-02-10 19:40           ` Sergei Miroshnichenko
  2021-02-10 21:46             ` Lukas Wunner
@ 2021-02-11 11:39             ` mika.westerberg
  2021-02-11 17:45               ` Sergei Miroshnichenko
  1 sibling, 1 reply; 44+ messages in thread
From: mika.westerberg @ 2021-02-11 11:39 UTC (permalink / raw)
  To: Sergei Miroshnichenko
  Cc: David.Laight, christian, christian.koenig, rajatja, YehezkelShB,
	mario.limonciello, helgaas, linux, linux-pci, sr, lukas,
	andy.lavr

Hi,

On Wed, Feb 10, 2021 at 07:40:06PM +0000, Sergei Miroshnichenko wrote:
> On Thu, 2021-02-04 at 12:49 +0200, Mika Westerberg
> wrote:
> > On Wed, Feb 03, 2021 at 08:17:14PM +0000, Sergei Miroshnichenko
> > wrote:
> > > On Mon, 2021-02-01 at 14:55 +0200, Mika Westerberg wrote:
> > > > On Thu, Jan 28, 2021 at 09:39:29PM +0100, Lukas Wunner wrote:
> > > > > On Thu, Jan 28, 2021 at 08:53:16AM -0600, Bjorn Helgaas wrote:
> > > > > > On Fri, Dec 18, 2020 at 08:39:45PM +0300, Sergei
> > > > > > Miroshnichenko
> > > > > > wrote:
> > > > > > > ...
> > > > > 
> > > > > I intended to review and test this iteration of the series more
> > > > > closely, but haven't been able to carve out the required time.
> > > > > I'm adding some Thunderbolt folks to cc in the hope that they
> > > > > can at least test the series on their development branch.
> > > > > Getting this upstreamed should really be in the best interest
> > > > > of Intel and other promulgators of Thunderbolt.
> > > > 
> > > > Sure. It seems that this series was submitted in December so
> > > > probably
> > > > not applicable to the pci.git/next anymore. Anyways, I can give
> > > > it a
> > > > try
> > > > on a TBT capable system if someone tells me what exactly to test
> > > > ;-)
> > > > Probably at least that the existing functionality still works but
> > > > something else maybe too?
> > > 
> > > For setups that worked fine, the only expected change is a possible
> > > little different BAR layout (in /proc/iomem), and there should the
> > > same
> > > quantity (or more) of BARs assigned than before.
> > > 
> > > But if there are any problematic setups, which weren't able to
> > > arrange
> > > new BARs, this patchset may push a bit further.
> > 
> > Got it.
> > 
> > > In a few days I'll provide an updated branch for our mirror of the
> > > kernel on Github, with a complete and bumped set of patches,
> > > reducing
> > > the steps required to test them.
> > 
> > Sounds good, thanks!
> 
> Hi Mika,
> 
> The branch is finally ready, so if you still have time for that, please
> take a look:
> 
> https://github.com/YADRO-KNS/linux/tree/yadro/pcie_hotplug/movable_bars_v9.1

Thanks for sharing!

I tried this series on top of v5.11-rc7 on a Dell XPS 9380 so that I
have two TBT3 devices connected. Each device includes PCIe switch and a
xHCI endpoint.

What I see that the hotplug downstream port does not have enough bus
numbers (and other resources allocated) so when attaching the second
device it does not fit there anymore. The resulting 'lspci -t' output
looks like below:

-[0000:00]-+-00.0
           +-02.0
           +-04.0
           +-08.0
           +-12.0
           +-14.0
           +-14.2
           +-15.0
           +-15.1
           +-16.0
           +-1c.0-[01]----00.0
           +-1c.6-[02]----00.0
           +-1d.0-[03-3b]----00.0-[04-3b]--+-00.0-[05]----00.0
           |                               +-01.0-[06-1f]----00.0-[07-09]--+-02.0-[08]----00.0
           |                               |                               \-04.0-[09]----00.0-[0a]--
           |                               +-02.0-[20]----00.0
           |                               \-04.0-[21-3b]--
           +-1d.4-[3c]----00.0
           +-1f.0
           +-1f.3
           +-1f.4
           \-1f.5

So the last PCIE switch is not visible anymore, and the xHCI on the
second TBT device is not functional either.

On the mainline kernel I get this:

-[0000:00]-+-00.0
           +-02.0
           +-04.0
           +-08.0
           +-12.0
           +-14.0
           +-14.2
           +-15.0
           +-15.1
           +-16.0
           +-1c.0-[01]----00.0
           +-1c.6-[02]----00.0
           +-1d.0-[03-3b]----00.0-[04-3b]--+-00.0-[05]----00.0
           |                               +-01.0-[06-1f]----00.0-[07-1f]--+-02.0-[08]----00.0
           |                               |                               \-04.0-[09-1f]----00.0-[0a-1f]--+-02.0-[0b]----00.0
           |                               |                                                               \-04.0-[0c-1f]--
           |                               +-02.0-[20]----00.0
           |                               \-04.0-[21-3b]--
           +-1d.4-[3c]----00.0
           +-1f.0
           +-1f.3
           +-1f.4
           \-1f.5

In this topology I can add yet another TBT device and there are still
resources available and all the endpoints are functional too.

I can send you the full dmesg and lspci -vv output if needed.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v9 00/26] PCI: Allow BAR movement during boot and hotplug
  2021-02-11 11:39             ` mika.westerberg
@ 2021-02-11 17:45               ` Sergei Miroshnichenko
  2021-02-12 12:52                 ` mika.westerberg
  0 siblings, 1 reply; 44+ messages in thread
From: Sergei Miroshnichenko @ 2021-02-11 17:45 UTC (permalink / raw)
  To: mika.westerberg
  Cc: David.Laight, christian, christian.koenig, rajatja, YehezkelShB,
	mario.limonciello, helgaas, linux, linux-pci, sr, lukas,
	andy.lavr

On Thu, 2021-02-11 at 13:39 +0200, mika.westerberg@linux.intel.com
wrote:
> On Wed, Feb 10, 2021 at 07:40:06PM +0000, Sergei Miroshnichenko
> wrote:
> > On Thu, 2021-02-04 at 12:49 +0200, Mika Westerberg
> > wrote:
> > > ...
> > The branch is finally ready, so if you still have time for that,
> > please
> > take a look:
> > 
> > https://github.com/YADRO-KNS/linux/tree/yadro/pcie_hotplug/movable_bars_v9.1
> 
> Thanks for sharing!
> 
> I tried this series on top of v5.11-rc7 on a Dell XPS 9380 so that I
> have two TBT3 devices connected. Each device includes PCIe switch and
> a
> xHCI endpoint.
> 
> What I see that the hotplug downstream port does not have enough bus
> numbers (and other resources allocated) so when attaching the second
> device it does not fit there anymore. The resulting 'lspci -t' output
> looks like below:
> 
> -[0000:00]-+-00.0
>            +-02.0
>            +-04.0
>            +-08.0
>            +-12.0
>            +-14.0
>            +-14.2
>            +-15.0
>            +-15.1
>            +-16.0
>            +-1c.0-[01]----00.0
>            +-1c.6-[02]----00.0
>            +-1d.0-[03-3b]----00.0-[04-3b]--+-00.0-[05]----00.0
>            |                               +-01.0-[06-1f]----00.0-
> [07-09]--+-02.0-[08]----00.0
>            |                               |                         
>       \-04.0-[09]----00.0-[0a]--
>            |                               +-02.0-[20]----00.0
>            |                               \-04.0-[21-3b]--
>            +-1d.4-[3c]----00.0
>            +-1f.0
>            +-1f.3
>            +-1f.4
>            \-1f.5
> 
> So the last PCIE switch is not visible anymore, and the xHCI on the
> second TBT device is not functional either.
> 
> On the mainline kernel I get this:
> 
> -[0000:00]-+-00.0
>            +-02.0
>            +-04.0
>            +-08.0
>            +-12.0
>            +-14.0
>            +-14.2
>            +-15.0
>            +-15.1
>            +-16.0
>            +-1c.0-[01]----00.0
>            +-1c.6-[02]----00.0
>            +-1d.0-[03-3b]----00.0-[04-3b]--+-00.0-[05]----00.0
>            |                               +-01.0-[06-1f]----00.0-
> [07-1f]--+-02.0-[08]----00.0
>            |                               |                         
>       \-04.0-[09-1f]----00.0-[0a-1f]--+-02.0-[0b]----00.0
>            |                               |                         
>                                       \-04.0-[0c-1f]--
>            |                               +-02.0-[20]----00.0
>            |                               \-04.0-[21-3b]--
>            +-1d.4-[3c]----00.0
>            +-1f.0
>            +-1f.3
>            +-1f.4
>            \-1f.5
> 
> In this topology I can add yet another TBT device and there are still
> resources available and all the endpoints are functional too.
> 
> I can send you the full dmesg and lspci -vv output if needed.


What a pity. Yes, please, I would of course like to take a look why
that happened, and compare, what went wrong (before and after the
hotplug: lspci -tv, dmesg -t and /proc/iomem with /proc/ioports, if it
wouldn't be too much trouble).

The first patchset wasn't intended to change the bus numbers behavior,
so I also have to check that locally. And despite of lack of bus
numbers, new endpoints still should be visible, this is for me to check
as well.

Thanks a lot for testing!

Serge

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v9 00/26] PCI: Allow BAR movement during boot and hotplug
  2021-02-11 17:45               ` Sergei Miroshnichenko
@ 2021-02-12 12:52                 ` mika.westerberg
  2021-02-12 20:54                   ` Sergei Miroshnichenko
  0 siblings, 1 reply; 44+ messages in thread
From: mika.westerberg @ 2021-02-12 12:52 UTC (permalink / raw)
  To: Sergei Miroshnichenko
  Cc: David.Laight, christian, christian.koenig, rajatja, YehezkelShB,
	mario.limonciello, helgaas, linux, linux-pci, sr, lukas,
	andy.lavr

Hi,

On Thu, Feb 11, 2021 at 05:45:20PM +0000, Sergei Miroshnichenko wrote:
> What a pity. Yes, please, I would of course like to take a look why
> that happened, and compare, what went wrong (before and after the
> hotplug: lspci -tv, dmesg -t and /proc/iomem with /proc/ioports, if it
> wouldn't be too much trouble).

I just sent these logs to you in a separate email. Let me know if you
need more.

Thanks!

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v9 00/26] PCI: Allow BAR movement during boot and hotplug
  2021-02-12 12:52                 ` mika.westerberg
@ 2021-02-12 20:54                   ` Sergei Miroshnichenko
  2021-02-15 14:56                     ` mika.westerberg
  0 siblings, 1 reply; 44+ messages in thread
From: Sergei Miroshnichenko @ 2021-02-12 20:54 UTC (permalink / raw)
  To: mika.westerberg
  Cc: David.Laight, christian, christian.koenig, rajatja, YehezkelShB,
	mario.limonciello, helgaas, linux, linux-pci, sr, lukas,
	andy.lavr

On Fri, 2021-02-12 at 14:52 +0200, mika.westerberg@linux.intel.com
wrote:
> Hi,
> 
> On Thu, Feb 11, 2021 at 05:45:20PM +0000, Sergei Miroshnichenko
> wrote:
> > What a pity. Yes, please, I would of course like to take a look why
> > that happened, and compare, what went wrong (before and after the
> > hotplug: lspci -tv, dmesg -t and /proc/iomem with /proc/ioports, if
> > it
> > wouldn't be too much trouble).
> 
> I just sent these logs to you in a separate email. Let me know if you
> need more.

Thanks, from them it's clear that the "full rescan" apprach currently
doesn't involve the pci_bus_distribute_available_resources(), that's
why hot-adding a second nested switch breaks: because of non-
distributed free bus numbers. The first switch seems was hot-added just
fine, with BARs being moved a bit.

This is to be fixed in v10, along with the
mpt3sas+pci_dev_is_disconnected() moment Lukas had found (thanks
Lukas!), CONFIG_DEBUG_LOCK_ALLOC macro, and a more useful debug
messages.

Serge

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v9 00/26] PCI: Allow BAR movement during boot and hotplug
  2021-02-12 20:54                   ` Sergei Miroshnichenko
@ 2021-02-15 14:56                     ` mika.westerberg
  0 siblings, 0 replies; 44+ messages in thread
From: mika.westerberg @ 2021-02-15 14:56 UTC (permalink / raw)
  To: Sergei Miroshnichenko
  Cc: David.Laight, christian, christian.koenig, rajatja, YehezkelShB,
	mario.limonciello, helgaas, linux, linux-pci, sr, lukas,
	andy.lavr

On Fri, Feb 12, 2021 at 08:54:18PM +0000, Sergei Miroshnichenko wrote:
> On Fri, 2021-02-12 at 14:52 +0200, mika.westerberg@linux.intel.com
> wrote:
> > Hi,
> > 
> > On Thu, Feb 11, 2021 at 05:45:20PM +0000, Sergei Miroshnichenko
> > wrote:
> > > What a pity. Yes, please, I would of course like to take a look why
> > > that happened, and compare, what went wrong (before and after the
> > > hotplug: lspci -tv, dmesg -t and /proc/iomem with /proc/ioports, if
> > > it
> > > wouldn't be too much trouble).
> > 
> > I just sent these logs to you in a separate email. Let me know if you
> > need more.
> 
> Thanks, from them it's clear that the "full rescan" apprach currently
> doesn't involve the pci_bus_distribute_available_resources(), that's
> why hot-adding a second nested switch breaks: because of non-
> distributed free bus numbers. The first switch seems was hot-added just
> fine, with BARs being moved a bit.
> 
> This is to be fixed in v10, along with the
> mpt3sas+pci_dev_is_disconnected() moment Lukas had found (thanks
> Lukas!), CONFIG_DEBUG_LOCK_ALLOC macro, and a more useful debug
> messages.

Great, thanks! :)

^ permalink raw reply	[flat|nested] 44+ messages in thread

end of thread, other threads:[~2021-02-15 14:58 UTC | newest]

Thread overview: 44+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-12-18 17:39 [PATCH v9 00/26] PCI: Allow BAR movement during boot and hotplug Sergei Miroshnichenko
2020-12-18 17:39 ` [PATCH v9 01/26] PCI: Fix race condition in pci_enable/disable_device() Sergei Miroshnichenko
2020-12-28 15:37   ` Sergei Miroshnichenko
2020-12-18 17:39 ` [PATCH v9 02/26] PCI: Ensure a bridge has I/O and MEM access for hot-added devices Sergei Miroshnichenko
2020-12-18 17:39 ` [PATCH v9 03/26] PCI: hotplug: Initial support of the movable BARs feature Sergei Miroshnichenko
2020-12-18 17:39 ` [PATCH v9 04/26] PCI: Add version of release_child_resources() aware of fixed BARs Sergei Miroshnichenko
2020-12-18 17:39 ` [PATCH v9 05/26] PCI: hotplug: Fix reassigning the released BARs Sergei Miroshnichenko
2020-12-18 17:39 ` [PATCH v9 06/26] PCI: hotplug: Recalculate every bridge window during rescan Sergei Miroshnichenko
2020-12-18 17:39 ` [PATCH v9 07/26] PCI: hotplug: Don't allow hot-added devices to steal resources Sergei Miroshnichenko
2020-12-18 17:39 ` [PATCH v9 08/26] PCI: Reassign BARs if BIOS/bootloader had assigned not all of them Sergei Miroshnichenko
2020-12-18 17:39 ` [PATCH v9 09/26] PCI: Movable BARs: Make just a single attempt to assign bridge resources Sergei Miroshnichenko
2020-12-18 17:39 ` [PATCH v9 10/26] PCI: hotplug: Calculate fixed parts of bridge windows Sergei Miroshnichenko
2020-12-18 17:39 ` [PATCH v9 11/26] PCI: Include fixed BARs into the bus size calculating Sergei Miroshnichenko
2020-12-18 17:39 ` [PATCH v9 12/26] PCI: Make sure bridge windows include their fixed BARs Sergei Miroshnichenko
2020-12-18 17:39 ` [PATCH v9 13/26] PCI: hotplug: Add support of fixed BARs to pci_assign_resource() Sergei Miroshnichenko
2020-12-18 17:39 ` [PATCH v9 14/26] PCI: hotplug: Sort fixed BARs before assignment Sergei Miroshnichenko
2020-12-18 17:40 ` [PATCH v9 15/26] x86/PCI/ACPI: Fix up PCIBIOS_MIN_MEM if value computed from e820 is invalid Sergei Miroshnichenko
2020-12-18 17:40 ` [PATCH v9 16/26] PCI: hotplug: Configure MPS after manual bus rescan Sergei Miroshnichenko
2020-12-18 17:40 ` [PATCH v9 17/26] PCI: hotplug: Don't disable the released bridge windows immediately Sergei Miroshnichenko
2020-12-18 17:40 ` [PATCH v9 18/26] PCI: pciehp: Trigger a domain rescan on hp events when enabled movable BARs Sergei Miroshnichenko
2020-12-18 17:40 ` [PATCH v9 19/26] PCI: Don't claim fixed BARs Sergei Miroshnichenko
2020-12-18 17:40 ` [PATCH v9 20/26] PCI: hotplug: Retry bus assignment without reserved space Sergei Miroshnichenko
2020-12-18 17:40 ` [PATCH v9 21/26] PCI: Rescan the bus to resize a BAR Sergei Miroshnichenko
2020-12-18 17:40 ` [PATCH v9 22/26] PCI: hotplug: Enable the movable BARs feature by default Sergei Miroshnichenko
2020-12-18 17:40 ` [PATCH v9 23/26] PCI/portdrv: Declare support of movable BARs Sergei Miroshnichenko
2020-12-18 17:40 ` [PATCH v9 24/26] nvme-pci: Handle " Sergei Miroshnichenko
2020-12-18 17:40 ` [PATCH v9 25/26] PCI: Add a message for updating BARs Sergei Miroshnichenko
2020-12-18 17:40 ` [PATCH v9 26/26] resource: increase max nesting level for /proc/iomem Sergei Miroshnichenko
2021-01-28 14:53 ` [PATCH v9 00/26] PCI: Allow BAR movement during boot and hotplug Bjorn Helgaas
2021-01-28 20:39   ` Lukas Wunner
2021-02-01 12:55     ` Mika Westerberg
2021-02-03 20:17       ` Sergei Miroshnichenko
2021-02-04 10:49         ` mika.westerberg
2021-02-10 19:40           ` Sergei Miroshnichenko
2021-02-10 21:46             ` Lukas Wunner
2021-02-11 11:39             ` mika.westerberg
2021-02-11 17:45               ` Sergei Miroshnichenko
2021-02-12 12:52                 ` mika.westerberg
2021-02-12 20:54                   ` Sergei Miroshnichenko
2021-02-15 14:56                     ` mika.westerberg
2021-02-03 20:01     ` Sergei Miroshnichenko
2021-02-04  9:34       ` David Laight
2021-02-03 19:59   ` Sergei Miroshnichenko
2021-02-04  8:26     ` Hinko Kocevar

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).