linux-pci.vger.kernel.org archive mirror
* [PATCH RFC 00/11] PCI: hotplug: Movable bus numbers
@ 2019-10-24 17:21 Sergey Miroshnichenko
  2019-10-24 17:21 ` [PATCH RFC 01/11] PCI: sysfs: Nullify freed pointers Sergey Miroshnichenko
                   ` (10 more replies)
  0 siblings, 11 replies; 12+ messages in thread
From: Sergey Miroshnichenko @ 2019-10-24 17:21 UTC (permalink / raw)
  To: linux-pci, linuxppc-dev; +Cc: Bjorn Helgaas, linux, Sergey Miroshnichenko

To allow hotplugging bridges, the kernel or BIOS/bootloader/firmware
reserves extra bus numbers per slot, but this range may not be enough for
a large bridge and/or nested bridges when hot-adding a chassis full of
devices.

This patchset proposes an approach similar to movable BARs: bus numbers are
no longer reserved; instead, the kernel moves the "tail" of the PCI tree
by one whenever a new bus is needed.

When something like this is going to happen:
                                                                   *LARGE*
 +-[0020:00]---00.0-[01-20]--+-00.0-[02-08]--+-00.0-[03]--   <--  *NESTED*
 |                           |               +-01.0-[04]--        *BRIDGE*
 |                           |               +-02.0-[05]--
 |                           |               +-03.0-[06]--
 |                           |               +-04.0-[07]--
 |                           |               \-05.0-[08]--
 ...

the result will be the following:

 +-[0020:00]---00.0-[01-22]--+-00.0-[02-22]--+-00.0-[03-1d]----04.0-[04-1d]--+-00.0-[05]--
 |                           |               |                               +-04.0-[06]--
 |                           |               |                               +-09.0-[07]--
 |                           |               |                               +-0c.0-[08-19]----00.0-[09-19]--+-01.0-[0a]--
 |                           |               |                               |                               ...
 |                           |               |                               |                               \-11.0-[19]--
 |                           |               |                               ...
 |                           |               |                               \-15.0-[1d]--
 |                           |               +-01.0-[1e]--  <-- Renamed from 04
 |                           |               +-02.0-[1f]--  <-- Renamed from 05
 |                           |               +-03.0-[20]--  <-- Renamed from 06
 |                           |               +-04.0-[21]--  <-- Renamed from 07
 |                           |               \-05.0-[22]--  <-- Renamed from 08
 ...
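
The renumbering shown in the diagrams can be modelled in user space. The
following is only a minimal sketch, not code from this series: the
struct bridge_model type and the move_tail() helper are hypothetical names
introduced for illustration. It shifts every bridge whose secondary bus is
at or above a given number, and leaves widening the subordinate numbers of
the ancestors to a separate step.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space model of one bridge's bus-number triple. */
struct bridge_model {
	int primary;     /* bus the bridge sits on */
	int secondary;   /* bus directly below the bridge */
	int subordinate; /* highest bus number below the bridge */
};

/*
 * Shift the "tail" of the tree: every bridge whose secondary bus is
 * first_moved or higher moves by delta. A bridge's primary number follows
 * only if its parent bus was moved too. Ancestors of the insertion point
 * are not touched here; their subordinate numbers must be widened
 * separately.
 */
void move_tail(struct bridge_model *b, size_t n, int first_moved, int delta)
{
	assert(b != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (b[i].secondary < first_moved)
			continue;	/* not part of the tail */

		b[i].secondary += delta;
		b[i].subordinate += delta;
		if (b[i].primary >= first_moved)
			b[i].primary += delta;
	}
}
```

With the topology above, calling move_tail() with first_moved = 0x04 and
delta = 0x1a turns the sibling [04] into [1e] and [08] into [22], while
[03] and the upstream port stay where they are.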


This looks to be safe in the kernel, because drivers don't use the raw PCI
BDF ID, and we've tested that on our x86 and PowerNV machines: mass storage
with root filesystems and network adapters just continue working after
their bus numbers have moved.

But here comes the userspace:

 - procfs entries:

    % ls -la /proc/bus/pci/*
    /proc/bus/pci/00:
    00.0
    02.0
    ...
    1f.4
    1f.6

    /proc/bus/pci/04:
    00.0

    /proc/bus/pci/40:
    00.0

 - sysfs entries:

    % ls -la /sys/devices/pci0000:00/
    0000:00:00.0
    0000:00:02.0
    ...
    0000:00:1f.3
    0000:00:1f.4
    0000:00:1f.6

    % ls -la /sys/devices/pci0000:00/0000:00:1c.6/0000:04:00.0/driver
    driver -> ../../../../bus/pci/drivers/iwlwifi

 - sysfs symlinks:

    % ls -la /sys/bus/pci/devices
    0000:00:00.0 -> ../../../devices/pci0000:00/0000:00:00.0
    0000:00:02.0 -> ../../../devices/pci0000:00/0000:00:02.0
    ...
    0000:04:00.0 -> ../../../devices/pci0000:00/0000:00:1c.6/0000:04:00.0
    0000:40:00.0 -> ../../../devices/pci0000:00/0000:00:1d.2/0000:40:00.0
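
All of these names embed the bus number, which is why they go stale when a
bus moves. As a rough illustration (format_bdf() is a hypothetical helper,
not a kernel function), the domain:bus:device.function name seen in the
listings above can be produced like this:

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: build a "dddd:bb:dd.f" name of the kind shown in
 * the sysfs listings. Returns the number of characters written, as
 * snprintf() does.
 */
int format_bdf(char *buf, size_t len, unsigned int domain,
	       unsigned int bus, unsigned int dev, unsigned int fn)
{
	assert(buf != NULL && len > 0);

	return snprintf(buf, len, "%04x:%02x:%02x.%x",
			domain & 0xffff, bus & 0xff, dev & 0x1f, fn & 0x7);
}
```

Formatting the same device before and after its bus moves from 04 to 05
yields "0000:04:00.0" and then "0000:05:00.0", so every path or symlink
built from the old string has to be removed and recreated.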


These patches alter the kernel public API and some internals to be able to
remove these files before changing a bus number, and create new versions
of them after a device has changed its BDF.

On the one hand, this makes hotplug predictable, cross-platform and
independent of non-kernel components (BIOS, bootloader, etc.); on the
other hand, it is a severe ABI violation.

Probably udev should get a new action such as "rename", in addition to
"add" and "remove".

Is it feasible to have this feature disabled by default, with an option to
enable it via a kernel command line argument like this:

  pci=realloc,movable_buses

?

This code is a follow-up to the "PCI: Allow BAR movement during hotplug"
series (v6).

Sergey Miroshnichenko (11):
  PCI: sysfs: Nullify freed pointers
  PCI: proc: Nullify a freed pointer
  drivers: base: Make bus_add_device() public
  drivers: base: Make device_{add|remove}_class_symlinks() public
  drivers: base: Add bus_disconnect_device()
  powerpc/pci: Enable assigning bus numbers instead of reading them from
    DT
  powerpc/pci: Don't reduce the host bridge bus range
  PCI: Allow expanding the bridges
  PCI: hotplug: Add initial support for movable bus numbers
  PCI: hotplug: movable bus numbers: rename proc and sysfs entries
  PCI: hotplug: movable bus numbers: compact the gaps in numbering

 .../admin-guide/kernel-parameters.txt         |   3 +
 arch/powerpc/kernel/pci-common.c              |   1 -
 arch/powerpc/kernel/pci_dn.c                  |   5 +
 arch/powerpc/platforms/powernv/eeh-powernv.c  |   3 +-
 drivers/base/base.h                           |   1 -
 drivers/base/bus.c                            |  37 +++
 drivers/base/core.c                           |   6 +-
 drivers/pci/pci-sysfs.c                       |   7 +-
 drivers/pci/pci.c                             |   3 +
 drivers/pci/pci.h                             |   2 +
 drivers/pci/probe.c                           | 291 +++++++++++++++++-
 drivers/pci/proc.c                            |   1 +
 include/linux/device.h                        |   5 +
 13 files changed, 351 insertions(+), 14 deletions(-)

-- 
2.23.0



* [PATCH RFC 01/11] PCI: sysfs: Nullify freed pointers
  2019-10-24 17:21 [PATCH RFC 00/11] PCI: hotplug: Movable bus numbers Sergey Miroshnichenko
@ 2019-10-24 17:21 ` Sergey Miroshnichenko
  2019-10-24 17:21 ` [PATCH RFC 02/11] PCI: proc: Nullify a freed pointer Sergey Miroshnichenko
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Sergey Miroshnichenko @ 2019-10-24 17:21 UTC (permalink / raw)
  To: linux-pci, linuxppc-dev; +Cc: Bjorn Helgaas, linux, Sergey Miroshnichenko

After hotplugging a bridge, the PCI topology changes: buses may have their
numbers changed. In this case all the affected sysfs entries/symlinks must
be recreated, because they have the BDF address in their names.

Set the freed pointers to NULL, so the !NULL checks will be satisfied when
it's time to recreate the sysfs entries.
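
The idiom can be shown outside the kernel. This is a simplified user-space
sketch under the assumption that the create path treats a non-NULL pointer
as "already present"; struct attr_owner, slot_create() and slot_remove()
are hypothetical names, and the real pci-sysfs code differs in detail.

```c
#include <assert.h>
#include <stdlib.h>

#define NUM_SLOTS 6	/* hypothetical; stands in for the per-device array size */

struct attr_owner {
	int *slot[NUM_SLOTS];	/* stands in for pdev->res_attr[] */
};

/* Allocate a slot only if it is currently empty (assumed create-path rule). */
int slot_create(struct attr_owner *o, int i)
{
	assert(o != NULL && i >= 0 && i < NUM_SLOTS);

	if (o->slot[i])
		return -1;	/* a stale, already freed pointer would end up here too */

	o->slot[i] = calloc(1, sizeof(*o->slot[i]));
	return o->slot[i] ? 0 : -1;
}

/* Free a slot and clear the pointer so that it can be created again. */
void slot_remove(struct attr_owner *o, int i)
{
	assert(o != NULL && i >= 0 && i < NUM_SLOTS);

	free(o->slot[i]);
	o->slot[i] = NULL;	/* without this, slot_create() would refuse */
}
```

Dropping the assignment in slot_remove() would leave a dangling pointer
behind, and a later slot_create() for the same index would be rejected even
though nothing valid is stored there.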

Signed-off-by: Sergey Miroshnichenko <s.miroshnichenko@yadro.com>
---
 drivers/pci/pci-sysfs.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index 793412954529..a238935c1193 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -1129,12 +1129,14 @@ static void pci_remove_resource_files(struct pci_dev *pdev)
 		if (res_attr) {
 			sysfs_remove_bin_file(&pdev->dev.kobj, res_attr);
 			kfree(res_attr);
+			pdev->res_attr[i] = NULL;
 		}
 
 		res_attr = pdev->res_attr_wc[i];
 		if (res_attr) {
 			sysfs_remove_bin_file(&pdev->dev.kobj, res_attr);
 			kfree(res_attr);
+			pdev->res_attr_wc[i] = NULL;
 		}
 	}
 }
@@ -1175,8 +1177,11 @@ static int pci_create_attr(struct pci_dev *pdev, int num, int write_combine)
 	res_attr->size = pci_resource_len(pdev, num);
 	res_attr->private = (void *)(unsigned long)num;
 	retval = sysfs_create_bin_file(&pdev->dev.kobj, res_attr);
-	if (retval)
+	if (retval) {
 		kfree(res_attr);
+		if (pdev->res_attr[num] == res_attr)
+			pdev->res_attr[num] = NULL;
+	}
 
 	return retval;
 }
-- 
2.23.0



* [PATCH RFC 02/11] PCI: proc: Nullify a freed pointer
  2019-10-24 17:21 [PATCH RFC 00/11] PCI: hotplug: Movable bus numbers Sergey Miroshnichenko
  2019-10-24 17:21 ` [PATCH RFC 01/11] PCI: sysfs: Nullify freed pointers Sergey Miroshnichenko
@ 2019-10-24 17:21 ` Sergey Miroshnichenko
  2019-10-24 17:21 ` [PATCH RFC 03/11] drivers: base: Make bus_add_device() public Sergey Miroshnichenko
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Sergey Miroshnichenko @ 2019-10-24 17:21 UTC (permalink / raw)
  To: linux-pci, linuxppc-dev; +Cc: Bjorn Helgaas, linux, Sergey Miroshnichenko

A PCI device may be detached from /proc/bus/pci/devices not only when it is
removed, but also when its bus number has changed; in this case the proc
entry must be recreated to reflect the new PCI topology.

Nullify freed pointers to mark them as valid for allocating again.

Signed-off-by: Sergey Miroshnichenko <s.miroshnichenko@yadro.com>
---
 drivers/pci/proc.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/pci/proc.c b/drivers/pci/proc.c
index 5495537c60c2..c85654dd315b 100644
--- a/drivers/pci/proc.c
+++ b/drivers/pci/proc.c
@@ -443,6 +443,7 @@ int pci_proc_detach_device(struct pci_dev *dev)
 int pci_proc_detach_bus(struct pci_bus *bus)
 {
 	proc_remove(bus->procdir);
+	bus->procdir = NULL;
 	return 0;
 }
 
-- 
2.23.0



* [PATCH RFC 03/11] drivers: base: Make bus_add_device() public
  2019-10-24 17:21 [PATCH RFC 00/11] PCI: hotplug: Movable bus numbers Sergey Miroshnichenko
  2019-10-24 17:21 ` [PATCH RFC 01/11] PCI: sysfs: Nullify freed pointers Sergey Miroshnichenko
  2019-10-24 17:21 ` [PATCH RFC 02/11] PCI: proc: Nullify a freed pointer Sergey Miroshnichenko
@ 2019-10-24 17:21 ` Sergey Miroshnichenko
  2019-10-24 17:21 ` [PATCH RFC 04/11] drivers: base: Make device_{add|remove}_class_symlinks() public Sergey Miroshnichenko
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Sergey Miroshnichenko @ 2019-10-24 17:21 UTC (permalink / raw)
  To: linux-pci, linuxppc-dev; +Cc: Bjorn Helgaas, linux, Sergey Miroshnichenko

Make bus_add_device() part of the public API, so it can be applied to
devices which are temporarily detached from their buses without being
destroyed.

This will be used when the PCI topology changes after hotplugging a bridge:
buses may get their numbers changed, so their child devices must be
reattached and their sysfs and proc files recreated.

Signed-off-by: Sergey Miroshnichenko <s.miroshnichenko@yadro.com>
---
 drivers/base/base.h    | 1 -
 drivers/base/bus.c     | 1 +
 include/linux/device.h | 2 ++
 3 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/base/base.h b/drivers/base/base.h
index 0d32544b6f91..c93d302e6345 100644
--- a/drivers/base/base.h
+++ b/drivers/base/base.h
@@ -110,7 +110,6 @@ extern void container_dev_init(void);
 
 struct kobject *virtual_device_parent(struct device *dev);
 
-extern int bus_add_device(struct device *dev);
 extern void bus_probe_device(struct device *dev);
 extern void bus_remove_device(struct device *dev);
 
diff --git a/drivers/base/bus.c b/drivers/base/bus.c
index a1d1e8256324..8f3445cc533e 100644
--- a/drivers/base/bus.c
+++ b/drivers/base/bus.c
@@ -471,6 +471,7 @@ int bus_add_device(struct device *dev)
 	bus_put(dev->bus);
 	return error;
 }
+EXPORT_SYMBOL_GPL(bus_add_device);
 
 /**
  * bus_probe_device - probe drivers for a new device
diff --git a/include/linux/device.h b/include/linux/device.h
index 297239a08bb7..4d8bbc8ae73d 100644
--- a/include/linux/device.h
+++ b/include/linux/device.h
@@ -267,6 +267,8 @@ int bus_for_each_drv(struct bus_type *bus, struct device_driver *start,
 void bus_sort_breadthfirst(struct bus_type *bus,
 			   int (*compare)(const struct device *a,
 					  const struct device *b));
+extern int bus_add_device(struct device *dev);
+
 /*
  * Bus notifiers: Get notified of addition/removal of devices
  * and binding/unbinding of drivers to devices.
-- 
2.23.0



* [PATCH RFC 04/11] drivers: base: Make device_{add|remove}_class_symlinks() public
  2019-10-24 17:21 [PATCH RFC 00/11] PCI: hotplug: Movable bus numbers Sergey Miroshnichenko
                   ` (2 preceding siblings ...)
  2019-10-24 17:21 ` [PATCH RFC 03/11] drivers: base: Make bus_add_device() public Sergey Miroshnichenko
@ 2019-10-24 17:21 ` Sergey Miroshnichenko
  2019-10-24 17:21 ` [PATCH RFC 05/11] drivers: base: Add bus_disconnect_device() Sergey Miroshnichenko
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Sergey Miroshnichenko @ 2019-10-24 17:21 UTC (permalink / raw)
  To: linux-pci, linuxppc-dev; +Cc: Bjorn Helgaas, linux, Sergey Miroshnichenko

When updating the /sys/devices/pci* entries affected by changes in the PCI
topology, their symlinks in /sys/bus/pci/devices/* must also be rebuilt.

Moving device_add_class_symlinks() and device_remove_class_symlinks() to a
public API allows the PCI subsystem to update the sysfs without destroying
the working affected devices.

Signed-off-by: Sergey Miroshnichenko <s.miroshnichenko@yadro.com>
---
 drivers/base/core.c    | 6 ++++--
 include/linux/device.h | 2 ++
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/base/core.c b/drivers/base/core.c
index 7bd9cd366d41..23e689fc8478 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -1922,7 +1922,7 @@ static void cleanup_glue_dir(struct device *dev, struct kobject *glue_dir)
 	mutex_unlock(&gdp_mutex);
 }
 
-static int device_add_class_symlinks(struct device *dev)
+int device_add_class_symlinks(struct device *dev)
 {
 	struct device_node *of_node = dev_of_node(dev);
 	int error;
@@ -1973,8 +1973,9 @@ static int device_add_class_symlinks(struct device *dev)
 	sysfs_remove_link(&dev->kobj, "of_node");
 	return error;
 }
+EXPORT_SYMBOL_GPL(device_add_class_symlinks);
 
-static void device_remove_class_symlinks(struct device *dev)
+void device_remove_class_symlinks(struct device *dev)
 {
 	if (dev_of_node(dev))
 		sysfs_remove_link(&dev->kobj, "of_node");
@@ -1991,6 +1992,7 @@ static void device_remove_class_symlinks(struct device *dev)
 #endif
 	sysfs_delete_link(&dev->class->p->subsys.kobj, &dev->kobj, dev_name(dev));
 }
+EXPORT_SYMBOL_GPL(device_remove_class_symlinks);
 
 /**
  * dev_set_name - set a device name
diff --git a/include/linux/device.h b/include/linux/device.h
index 4d8bbc8ae73d..420228ab9c4b 100644
--- a/include/linux/device.h
+++ b/include/linux/device.h
@@ -268,6 +268,8 @@ void bus_sort_breadthfirst(struct bus_type *bus,
 			   int (*compare)(const struct device *a,
 					  const struct device *b));
 extern int bus_add_device(struct device *dev);
+extern int device_add_class_symlinks(struct device *dev);
+extern void device_remove_class_symlinks(struct device *dev);
 
 /*
  * Bus notifiers: Get notified of addition/removal of devices
-- 
2.23.0



* [PATCH RFC 05/11] drivers: base: Add bus_disconnect_device()
  2019-10-24 17:21 [PATCH RFC 00/11] PCI: hotplug: Movable bus numbers Sergey Miroshnichenko
                   ` (3 preceding siblings ...)
  2019-10-24 17:21 ` [PATCH RFC 04/11] drivers: base: Make device_{add|remove}_class_symlinks() public Sergey Miroshnichenko
@ 2019-10-24 17:21 ` Sergey Miroshnichenko
  2019-10-24 17:21 ` [PATCH RFC 06/11] powerpc/pci: Enable assigning bus numbers instead of reading them from DT Sergey Miroshnichenko
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Sergey Miroshnichenko @ 2019-10-24 17:21 UTC (permalink / raw)
  To: linux-pci, linuxppc-dev; +Cc: Bjorn Helgaas, linux, Sergey Miroshnichenko

Add bus_disconnect_device(), which is similar to bus_remove_device(), but
it doesn't detach the device from its driver, so it can be reconnected to
the same or another bus later.

This is yet another preparation for hotplugging large PCIe bridges, which
may entail changes in BDF addresses of working devices due to movable bus
numbers. Changed addresses require rebuilding the affected entries in
/sys/bus/pci and /proc/bus/pci.

Using bus_disconnect_device()+bus_add_device() during PCI rescan allows the
drivers to work with their devices uninterruptedly, regardless of changes
in PCI addresses.

Signed-off-by: Sergey Miroshnichenko <s.miroshnichenko@yadro.com>
---
 drivers/base/bus.c     | 36 ++++++++++++++++++++++++++++++++++++
 include/linux/device.h |  1 +
 2 files changed, 37 insertions(+)

diff --git a/drivers/base/bus.c b/drivers/base/bus.c
index 8f3445cc533e..52d77fb90218 100644
--- a/drivers/base/bus.c
+++ b/drivers/base/bus.c
@@ -497,6 +497,42 @@ void bus_probe_device(struct device *dev)
 	mutex_unlock(&bus->p->mutex);
 }
 
+/**
+ * bus_disconnect_device - disconnect device from bus,
+ * but don't detach it from driver
+ * @dev: device to be disconnected
+ *
+ * - Remove device from all interfaces.
+ * - Remove symlink from bus' directory.
+ * - Delete device from bus's list.
+ */
+void bus_disconnect_device(struct device *dev)
+{
+	struct bus_type *bus = dev->bus;
+	struct subsys_interface *sif;
+
+	if (!bus)
+		return;
+
+	mutex_lock(&bus->p->mutex);
+	list_for_each_entry(sif, &bus->p->interfaces, node)
+		if (sif->remove_dev)
+			sif->remove_dev(dev, sif);
+	mutex_unlock(&bus->p->mutex);
+
+	sysfs_remove_link(&dev->kobj, "subsystem");
+	sysfs_remove_link(&dev->bus->p->devices_kset->kobj,
+			  dev_name(dev));
+	device_remove_groups(dev, dev->bus->dev_groups);
+	if (klist_node_attached(&dev->p->knode_bus))
+		klist_del(&dev->p->knode_bus);
+
+	pr_debug("bus: '%s': remove device %s\n",
+		 dev->bus->name, dev_name(dev));
+	bus_put(dev->bus);
+}
+EXPORT_SYMBOL_GPL(bus_disconnect_device);
+
 /**
  * bus_remove_device - remove device from bus
  * @dev: device to be removed
diff --git a/include/linux/device.h b/include/linux/device.h
index 420228ab9c4b..9f098c32a4ad 100644
--- a/include/linux/device.h
+++ b/include/linux/device.h
@@ -268,6 +268,7 @@ void bus_sort_breadthfirst(struct bus_type *bus,
 			   int (*compare)(const struct device *a,
 					  const struct device *b));
 extern int bus_add_device(struct device *dev);
+extern void bus_disconnect_device(struct device *dev);
 extern int device_add_class_symlinks(struct device *dev);
 extern void device_remove_class_symlinks(struct device *dev);
 
-- 
2.23.0



* [PATCH RFC 06/11] powerpc/pci: Enable assigning bus numbers instead of reading them from DT
  2019-10-24 17:21 [PATCH RFC 00/11] PCI: hotplug: Movable bus numbers Sergey Miroshnichenko
                   ` (4 preceding siblings ...)
  2019-10-24 17:21 ` [PATCH RFC 05/11] drivers: base: Add bus_disconnect_device() Sergey Miroshnichenko
@ 2019-10-24 17:21 ` Sergey Miroshnichenko
  2019-10-24 17:21 ` [PATCH RFC 07/11] powerpc/pci: Don't reduce the host bridge bus range Sergey Miroshnichenko
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Sergey Miroshnichenko @ 2019-10-24 17:21 UTC (permalink / raw)
  To: linux-pci, linuxppc-dev; +Cc: Bjorn Helgaas, linux, Sergey Miroshnichenko

If the firmware indicates support of reassigning bus numbers via the PHB's
"ibm,supported-movable-bdfs" property in DT, PowerNV will not depend on PCI
topology info from DT anymore.

This makes it possible to re-enumerate the fabric, assign the new bus
numbers and switch from the pnv_php module to the standard pciehp driver
for PCI hotplug functionality.

Signed-off-by: Sergey Miroshnichenko <s.miroshnichenko@yadro.com>
---
 arch/powerpc/kernel/pci_dn.c                 | 5 +++++
 arch/powerpc/platforms/powernv/eeh-powernv.c | 3 ++-
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/pci_dn.c b/arch/powerpc/kernel/pci_dn.c
index ad0ecf48e943..b9b7518eb2b4 100644
--- a/arch/powerpc/kernel/pci_dn.c
+++ b/arch/powerpc/kernel/pci_dn.c
@@ -559,6 +559,11 @@ void pci_devs_phb_init_dynamic(struct pci_controller *phb)
 		phb->pci_data = pdn;
 	}
 
+	if (of_get_property(dn, "ibm,supported-movable-bdfs", NULL)) {
+		pci_add_flags(PCI_REASSIGN_ALL_BUS);
+		return;
+	}
+
 	/* Update dn->phb ptrs for new phb and children devices */
 	pci_traverse_device_nodes(dn, add_pdn, phb);
 }
diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
index 6bc24a47e9ef..6c126aa2a6b7 100644
--- a/arch/powerpc/platforms/powernv/eeh-powernv.c
+++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
@@ -42,7 +42,8 @@ void pnv_pcibios_bus_add_device(struct pci_dev *pdev)
 {
 	struct pci_dn *pdn = pci_get_pdn(pdev);
 
-	if (eeh_has_flag(EEH_FORCE_DISABLED))
+	if (eeh_has_flag(EEH_FORCE_DISABLED) ||
+	    !pci_has_flag(PCI_REASSIGN_ALL_BUS))
 		return;
 
 	dev_dbg(&pdev->dev, "EEH: Setting up device\n");
-- 
2.23.0



* [PATCH RFC 07/11] powerpc/pci: Don't reduce the host bridge bus range
  2019-10-24 17:21 [PATCH RFC 00/11] PCI: hotplug: Movable bus numbers Sergey Miroshnichenko
                   ` (5 preceding siblings ...)
  2019-10-24 17:21 ` [PATCH RFC 06/11] powerpc/pci: Enable assigning bus numbers instead of reading them from DT Sergey Miroshnichenko
@ 2019-10-24 17:21 ` Sergey Miroshnichenko
  2019-10-24 17:21 ` [PATCH RFC 08/11] PCI: Allow expanding the bridges Sergey Miroshnichenko
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Sergey Miroshnichenko @ 2019-10-24 17:21 UTC (permalink / raw)
  To: linux-pci, linuxppc-dev; +Cc: Bjorn Helgaas, linux, Sergey Miroshnichenko

Currently the last possible bus number of the PHB is set to the last bus
number used during boot. So when hotplugging a bridge later, no new buses
can be allocated because they are limited by this value.

Let the host bridge contain any number of buses up to 255.

Signed-off-by: Sergey Miroshnichenko <s.miroshnichenko@yadro.com>
---
 arch/powerpc/kernel/pci-common.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/powerpc/kernel/pci-common.c b/arch/powerpc/kernel/pci-common.c
index 1c448cf25506..5877ef7a39a0 100644
--- a/arch/powerpc/kernel/pci-common.c
+++ b/arch/powerpc/kernel/pci-common.c
@@ -1631,7 +1631,6 @@ void pcibios_scan_phb(struct pci_controller *hose)
 	if (mode == PCI_PROBE_NORMAL) {
 		pci_bus_update_busn_res_end(bus, 255);
 		hose->last_busno = pci_scan_child_bus(bus);
-		pci_bus_update_busn_res_end(bus, hose->last_busno);
 	}
 
 	/* Platform gets a chance to do some global fixups before
-- 
2.23.0



* [PATCH RFC 08/11] PCI: Allow expanding the bridges
  2019-10-24 17:21 [PATCH RFC 00/11] PCI: hotplug: Movable bus numbers Sergey Miroshnichenko
                   ` (6 preceding siblings ...)
  2019-10-24 17:21 ` [PATCH RFC 07/11] powerpc/pci: Don't reduce the host bridge bus range Sergey Miroshnichenko
@ 2019-10-24 17:21 ` Sergey Miroshnichenko
  2019-10-24 17:21 ` [PATCH RFC 09/11] PCI: hotplug: Add initial support for movable bus numbers Sergey Miroshnichenko
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Sergey Miroshnichenko @ 2019-10-24 17:21 UTC (permalink / raw)
  To: linux-pci, linuxppc-dev; +Cc: Bjorn Helgaas, linux, Sergey Miroshnichenko

When hotplugging a bridge, the parent bus may not have [enough] reserved
bus numbers. So before rescanning the bus, set its subordinate number to
the maximum possible value: it is 255 when there is only one root bridge
in the domain.

During the PCI rescan, the subordinate bus number of every bus will be
contracted to the actual value.

Signed-off-by: Sergey Miroshnichenko <s.miroshnichenko@yadro.com>
---
 drivers/pci/probe.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 539f5d39bb6d..3494b5d265d5 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -3195,20 +3195,22 @@ static unsigned int pci_dev_count_res_mask(struct pci_dev *dev)
 	return res_mask;
 }
 
-static void pci_bus_rescan_prepare(struct pci_bus *bus)
+static void pci_bus_rescan_prepare(struct pci_bus *bus, int last_bus_number)
 {
 	struct pci_dev *dev;
 
 	if (bus->self)
 		pci_config_pm_runtime_get(bus->self);
 
+	bus->busn_res.end = last_bus_number;
+
 	list_for_each_entry(dev, &bus->devices, bus_list) {
 		struct pci_bus *child = dev->subordinate;
 
 		dev->res_mask = pci_dev_count_res_mask(dev);
 
 		if (child)
-			pci_bus_rescan_prepare(child);
+			pci_bus_rescan_prepare(child, last_bus_number);
 
 		if (dev->driver &&
 		    dev->driver->rescan_prepare)
@@ -3439,7 +3441,7 @@ unsigned int pci_rescan_bus(struct pci_bus *bus)
 
 	if (pci_can_move_bars) {
 		pcibios_root_bus_rescan_prepare(root);
-		pci_bus_rescan_prepare(root);
+		pci_bus_rescan_prepare(root, root->busn_res.end);
 		pci_bus_update_immovable_range(root);
 		pci_bus_release_root_bridge_resources(root);
 
-- 
2.23.0



* [PATCH RFC 09/11] PCI: hotplug: Add initial support for movable bus numbers
  2019-10-24 17:21 [PATCH RFC 00/11] PCI: hotplug: Movable bus numbers Sergey Miroshnichenko
                   ` (7 preceding siblings ...)
  2019-10-24 17:21 ` [PATCH RFC 08/11] PCI: Allow expanding the bridges Sergey Miroshnichenko
@ 2019-10-24 17:21 ` Sergey Miroshnichenko
  2019-10-24 17:21 ` [PATCH RFC 10/11] PCI: hotplug: movable bus numbers: rename proc and sysfs entries Sergey Miroshnichenko
  2019-10-24 17:21 ` [PATCH RFC 11/11] PCI: hotplug: movable bus numbers: compact the gaps in numbering Sergey Miroshnichenko
  10 siblings, 0 replies; 12+ messages in thread
From: Sergey Miroshnichenko @ 2019-10-24 17:21 UTC (permalink / raw)
  To: linux-pci, linuxppc-dev; +Cc: Bjorn Helgaas, linux, Sergey Miroshnichenko

Currently, hot-adding a bridge requires enough bus numbers to be reserved
on the slot. Choosing a favorable number of reserved buses per slot is
relatively simple for predictable cases, but it gets trickier when bridges
can be hot-plugged into hot-plugged bridges: there may be either not enough
buses in a slot for a new big bridge, or all the 255 possible numbers will
be depleted. So a hot-add may fail even though unused bus numbers remain
elsewhere in the PCI topology.

Instead of being reserved, the bus numbers can be allocated contiguously,
and when a bridge is hot-added in the middle of the PCI tree, the
conflicting buses can increment their numbers, creating a gap for the new
bridge.
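
For a single bridge, incrementing its numbers means rewriting the 32-bit
word at config offset 0x18, which holds the primary, secondary and
subordinate bus numbers in its three low bytes and the secondary latency
timer in the top byte. The sketch below is a user-space illustration only;
shift_bus_word() is a hypothetical name, and it operates on a plain integer
rather than on real config space.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Apply a bus-number shift to the bus-number word of a PCI-to-PCI bridge:
 * bits 7:0 primary, 15:8 secondary, 23:16 subordinate, 31:24 secondary
 * latency timer (preserved as is).
 *
 * The primary number is shifted only if the parent bus itself is part of
 * the moved range, i.e. numbered first_moved or higher.
 */
uint32_t shift_bus_word(uint32_t word, int first_moved, int delta)
{
	int primary = (int)(word & 0xff);
	int secondary = (int)((word >> 8) & 0xff);
	int subordinate = (int)((word >> 16) & 0xff);

	/* The shifted numbers must still fit into one byte each. */
	assert(secondary + delta >= 0 && subordinate + delta <= 0xff);

	secondary += delta;
	subordinate += delta;
	if (primary >= first_moved)
		primary += delta;
	assert(primary >= 0 && primary <= 0xff);

	return (word & 0xff000000u) | (uint32_t)primary |
	       ((uint32_t)secondary << 8) | ((uint32_t)subordinate << 16);
}
```

For example, a bridge on bus 02 that leads to bus 04 and is moved by 0x1a
keeps its primary number and gets 1e as both secondary and subordinate.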

Before moving, ensure there is enough space to move into and that there
will be no conflicts with other buses, taking into account that there may
be more than one root bridge in the domain (e.g. on some Intel Xeons one
root has buses 00-7f, and the second one has 80-ff).
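
The check for an upward move can be sketched as follows. This is an
illustration with hypothetical names (struct bus_range, can_move_up()) and
a plain array instead of the kernel's bus lookup; it is also slightly
stricter than strictly needed, because it inspects every number at the top
of the range instead of relying on the buses being contiguous.

```c
#include <assert.h>
#include <stdbool.h>

/* Inclusive range of bus numbers owned by one root bridge. */
struct bus_range {
	int start;
	int end;
};

/*
 * used[n] is true if bus n currently exists in the domain.
 * Return true if the buses numbered busnr and above can be moved up by
 * delta without leaving the root bridge's range.
 */
bool can_move_up(const bool used[256], struct bus_range r, int busnr, int delta)
{
	assert(delta > 0);
	assert(r.start >= 0 && r.start <= r.end && r.end <= 255);

	if (busnr < r.start || busnr > r.end)
		return false;	/* bus belongs to another root bridge */

	if (busnr + delta > r.end)
		return false;	/* the first moved bus would leave the range */

	/* The top delta numbers must be free, or the tail is pushed out. */
	for (int n = r.end - delta + 1; n <= r.end; n++)
		if (used[n])
			return false;

	return true;
}
```

On a two-root system like the Xeon example, a move below the first root is
checked against 00-7f only, so a bus at 7f blocks it even if 80-ff is
mostly empty.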

The feature is disabled by default to not break the ABI, and can be enabled
by the "pci=movable_buses" command line argument, if all risks are accepted.

The following set of parameters provides a safe activation of the feature:

  pci=realloc,pcie_bus_peer2peer,movable_buses

On x86, the "pci=assign-busses" is also required:

  pci=realloc,pcie_bus_peer2peer,movable_buses,assign-busses

This series is the second half of the work started by the "Movable BARs"
patches, and relies on fixes made there.

The following patches will resolve the issues introduced here:
 - fix desynchronization in /sys/devices/pci*, /sys/bus/pci/devices/* and
   /proc/bus/pci/* after changes in PCI topology;
 - compact gaps in numbering, which may appear after removing a bridge, to
   maintain the number continuity.

Signed-off-by: Sergey Miroshnichenko <s.miroshnichenko@yadro.com>
---
 .../admin-guide/kernel-parameters.txt         |   3 +
 drivers/pci/pci.c                             |   3 +
 drivers/pci/pci.h                             |   2 +
 drivers/pci/probe.c                           | 153 +++++++++++++++++-
 4 files changed, 156 insertions(+), 5 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index c6243aaed0c9..1bf8dea1f08a 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -3529,6 +3529,9 @@
 		force_floating	[S390] Force usage of floating interrupts.
 		nomio		[S390] Do not use MIO instructions.
 		no_movable_bars	Don't allow BARs to be moved during hotplug
+		movable_buses	Prefer renumbering buses over reserving bus numbers.
+				This entails deleting and recreating sysfs and procfs
+				entries.
 
 	pcie_aspm=	[PCIE] Forcibly enable or disable PCIe Active State Power
 			Management.
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 6ec1b70e4a96..9b2dcaa268e8 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -79,6 +79,7 @@ int pci_domains_supported = 1;
 #endif
 
 bool pci_can_move_bars = true;
+bool pci_movable_buses;
 
 #define DEFAULT_CARDBUS_IO_SIZE		(256)
 #define DEFAULT_CARDBUS_MEM_SIZE	(64*1024*1024)
@@ -6335,6 +6336,8 @@ static int __init pci_setup(char *str)
 				disable_acs_redir_param = str + 18;
 			} else if (!strncmp(str, "no_movable_bars", 15)) {
 				pci_can_move_bars = false;
+			} else if (!strncmp(str, "movable_buses", 13)) {
+				pci_movable_buses = true;
 			} else {
 				pr_err("PCI: Unknown option `%s'\n", str);
 			}
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 9b5164d10499..804176bb1d1b 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -289,6 +289,8 @@ void pci_bus_put(struct pci_bus *bus);
 
 bool pci_dev_bar_movable(struct pci_dev *dev, struct resource *res);
 
+extern bool pci_movable_buses;
+
 int assign_fixed_resource_on_bus(struct pci_bus *b, struct resource *r);
 
 /* PCIe link information */
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 3494b5d265d5..be9e5754cac7 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1096,6 +1096,126 @@ static void pci_enable_crs(struct pci_dev *pdev)
 					 PCI_EXP_RTCTL_CRSSVE);
 }
 
+static void pci_do_move_buses(const int domain, int busnr, int first_moved_busnr,
+			      int delta, const struct resource *valid_range)
+{
+	struct pci_bus *bus;
+	int subordinate;
+	u32 old_buses, buses;
+
+	if (busnr < valid_range->start || busnr > valid_range->end)
+		return;
+
+	bus = pci_find_bus(domain, busnr);
+	if (!bus)
+		return;
+
+	if (delta > 0) {
+		pci_do_move_buses(domain, busnr + 1, first_moved_busnr,
+				  delta, valid_range);
+	}
+
+	bus->number += delta;
+	bus->busn_res.start += delta;
+
+	/* Children of moved buses must update their primary bus */
+	if (bus->primary >= first_moved_busnr)
+		bus->primary += delta;
+
+	pci_read_config_dword(bus->self, PCI_PRIMARY_BUS, &buses);
+	old_buses = buses;
+	subordinate = (old_buses >> 16) & 0xff;
+	subordinate += delta;
+	buses &= 0xff000000;
+	buses |= (unsigned int)bus->primary;
+	buses |= (unsigned int)(bus->number << 8);
+	buses |= (unsigned int)(subordinate << 16);
+	pci_write_config_dword(bus->self, PCI_PRIMARY_BUS, buses);
+
+	if (delta < 0)
+		pci_do_move_buses(domain, busnr + 1, first_moved_busnr,
+				  delta, valid_range);
+}
+
+/*
+ * Buses can only be moved if they are distributed contiguously, with neither gaps nor
+ * reserved bus numbers.
+ *
+ * Secondary bus of every bridge is expanded to the maximum possible value allowed by the
+ * root bridge.
+ */
+static int pci_move_buses(int domain, int busnr, int delta,
+			  const struct resource *valid_range)
+{
+	if (!pci_movable_buses)
+		return 0;
+
+	if (!delta)
+		return 0;
+
+	/* Return immediately for the root bus */
+	if (!busnr)
+		return 0;
+
+	if (busnr < valid_range->start || busnr > valid_range->end) {
+		pr_err("Bus number %02x is outside of valid range %pR\n",
+		       busnr, valid_range);
+		return -EINVAL;
+	}
+
+	if (((busnr + delta) < valid_range->start) ||
+	    ((busnr + delta) > valid_range->end)) {
+		pr_err("Can't move bus %02x by %d outside of valid range %pR\n",
+		       busnr, delta, valid_range);
+		return -ENOSPC;
+	}
+
+	if (delta > 0) {
+		struct pci_bus *bus = pci_find_bus(domain, valid_range->end - delta + 1);
+
+		if (bus) {
+			pr_err("Not enough space for bus movement - blocked by %s\n",
+			       dev_name(&bus->dev));
+			return -ENOSPC;
+		}
+	} else {
+		int check_busnr;
+
+		for (check_busnr = busnr + delta; check_busnr < busnr; ++check_busnr) {
+			struct pci_bus *bus = pci_find_bus(domain, check_busnr);
+
+			if (bus) {
+				pr_err("Not enough space for bus movement - blocked by %s\n",
+				       dev_name(&bus->dev));
+				return -ENOSPC;
+			}
+		}
+	}
+
+	pci_do_move_buses(domain, busnr, busnr,
+			  delta, valid_range);
+
+	return 0;
+}
+
+static bool pci_new_bus_needed(struct pci_bus *bus, const struct pci_dev *self)
+{
+	if (!bus)
+		return true;
+
+	if (!pci_movable_buses)
+		return false;
+
+	if (pci_is_root_bus(bus))
+		return false;
+
+	/* Check if the downstream port already has the requested bus number */
+	if (bus->self == self)
+		return false;
+
+	return true;
+}
+
 static unsigned int pci_scan_child_bus_extend(struct pci_bus *bus,
 					      unsigned int available_buses);
 /**
@@ -1165,6 +1285,10 @@ static int pci_scan_bridge_extend(struct pci_bus *bus, struct pci_dev *dev,
 	bool fixed_buses;
 	u8 fixed_sec, fixed_sub;
 	int next_busnr;
+	struct pci_bus *root = bus;
+
+	while (!pci_is_root_bus(root))
+		root = root->parent;
 
 	/*
 	 * Make sure the bridge is powered on to be able to access config
@@ -1277,7 +1401,11 @@ static int pci_scan_bridge_extend(struct pci_bus *bus, struct pci_dev *dev,
 		 * case we only re-scan this bus.
 		 */
 		child = pci_find_bus(pci_domain_nr(bus), next_busnr);
-		if (!child) {
+		if (pci_new_bus_needed(child, dev)) {
+			if (child && pci_move_buses(pci_domain_nr(child), next_busnr,
+						    1, &root->busn_res))
+				goto out;
+
 			child = pci_add_new_bus(bus, dev, next_busnr);
 			if (!child)
 				goto out;
@@ -2771,9 +2899,13 @@ static unsigned int pci_scan_child_bus_extend(struct pci_bus *bus,
 		}
 	}
 
-	/* Reserve buses for SR-IOV capability */
-	used_buses = pci_iov_bus_range(bus);
-	max += used_buses;
+	if (!pci_movable_buses) {
+		/* Reserve buses for SR-IOV capability */
+		used_buses = pci_iov_bus_range(bus);
+		max += used_buses;
+	} else {
+		used_buses = 0;
+	}
 
 	/*
 	 * After performing arch-dependent fixup of the bus, look behind
@@ -2806,6 +2938,11 @@ static unsigned int pci_scan_child_bus_extend(struct pci_bus *bus,
 		cmax = max;
 		max = pci_scan_bridge_extend(bus, dev, max, 0, 0);
 
+		if (pci_movable_buses) {
+			used_buses += cmax - max;
+			continue;
+		}
+
 		/*
 		 * Reserve one bus for each bridge now to avoid extending
 		 * hotplug bridges too much during the second scan below.
@@ -2835,11 +2972,17 @@ static unsigned int pci_scan_child_bus_extend(struct pci_bus *bus,
 			 * bridges if any.
 			 */
 			buses = available_buses / hotplug_bridges;
-			buses = min(buses, available_buses - used_buses + 1);
+			buses = min(buses, available_buses - used_buses +
+				    (pci_movable_buses ? 0 : 1));
 		}
 
 		cmax = max;
 		max = pci_scan_bridge_extend(bus, dev, cmax, buses, 1);
+		if (pci_movable_buses) {
+			used_buses += max - cmax;
+			continue;
+		}
+
 		/* One bus is already accounted so don't add it again */
 		if (max - cmax > 1)
 			used_buses += max - cmax - 1;
-- 
2.23.0
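
The register update done by pci_do_move_buses() above can be illustrated
outside the kernel. The following is only a minimal user-space sketch, not
part of the patch: shift_bus_reg() is a hypothetical helper, and it takes
the primary bus number from the register value itself, whereas the patch
uses the cached bus->primary. It assumes the standard layout of the dword
at PCI_PRIMARY_BUS (primary, secondary, subordinate, then the secondary
latency timer in the top byte).

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not in the patch): shift the bus numbers packed
 * into the PCI_PRIMARY_BUS dword by 'delta'.
 *
 * Layout: bits 0-7 primary, 8-15 secondary, 16-23 subordinate,
 * 24-31 secondary latency timer (preserved as-is).
 */
static uint32_t shift_bus_reg(uint32_t reg, int first_moved, int delta)
{
	int primary = reg & 0xff;
	int secondary = (reg >> 8) & 0xff;
	int subordinate = (reg >> 16) & 0xff;

	/* Only a primary bus that itself lies in the moved tail changes */
	if (primary >= first_moved)
		primary += delta;
	secondary += delta;
	subordinate += delta;

	/* Each bus number must still fit into its 8-bit field */
	assert(primary >= 0 && primary <= 0xff);
	assert(secondary >= 0 && secondary <= 0xff);
	assert(subordinate >= 0 && subordinate <= 0xff);

	return (reg & 0xff000000u) | (uint32_t)primary |
	       ((uint32_t)secondary << 8) | ((uint32_t)subordinate << 16);
}
```

For example, a bridge with primary 02, secondary 04 and subordinate 08 that
is moved by one starting from bus 04 keeps its primary bus, while its
secondary and subordinate buses become 05 and 09.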



* [PATCH RFC 10/11] PCI: hotplug: movable bus numbers: rename proc and sysfs entries
  2019-10-24 17:21 [PATCH RFC 00/11] PCI: hotplug: Movable bus numbers Sergey Miroshnichenko
                   ` (8 preceding siblings ...)
  2019-10-24 17:21 ` [PATCH RFC 09/11] PCI: hotplug: Add initial support for movable bus numbers Sergey Miroshnichenko
@ 2019-10-24 17:21 ` Sergey Miroshnichenko
  2019-10-24 17:21 ` [PATCH RFC 11/11] PCI: hotplug: movable bus numbers: compact the gaps in numbering Sergey Miroshnichenko
  10 siblings, 0 replies; 12+ messages in thread
From: Sergey Miroshnichenko @ 2019-10-24 17:21 UTC (permalink / raw)
  To: linux-pci, linuxppc-dev; +Cc: Bjorn Helgaas, linux, Sergey Miroshnichenko

Changing the number of a bus (and therefore the addresses of this bus, of
its children and of all the subsequent buses in the tree) invalidates the
entries in /sys/devices/pci*, /proc/bus/pci/* and the symlinks in
/sys/bus/pci/devices/* for all the renamed devices and buses.

Remove the affected proc and sysfs entries and symlinks before renaming the
bus, then create them again.

Signed-off-by: Sergey Miroshnichenko <s.miroshnichenko@yadro.com>
---
 drivers/pci/probe.c | 105 +++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 104 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index be9e5754cac7..fe9bf012ef33 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1096,12 +1096,99 @@ static void pci_enable_crs(struct pci_dev *pdev)
 					 PCI_EXP_RTCTL_CRSSVE);
 }
 
+static void pci_buses_remove_sysfs(int domain, int busnr, int max_bus_number)
+{
+	struct pci_bus *bus;
+	struct pci_dev *dev = NULL;
+
+	bus = pci_find_bus(domain, busnr);
+	if (!bus)
+		return;
+
+	if (busnr < max_bus_number)
+		pci_buses_remove_sysfs(domain, busnr + 1, max_bus_number);
+
+	list_for_each_entry(dev, &bus->devices, bus_list) {
+		device_remove_class_symlinks(&dev->dev);
+		pci_remove_sysfs_dev_files(dev);
+		pci_proc_detach_device(dev);
+		bus_disconnect_device(&dev->dev);
+	}
+
+	device_remove_class_symlinks(&bus->dev);
+	pci_proc_detach_bus(bus);
+}
+
+static void pci_buses_create_sysfs(int domain, int busnr, int max_bus_number)
+{
+	struct pci_bus *bus;
+	struct pci_dev *dev = NULL;
+
+	bus = pci_find_bus(domain, busnr);
+	if (!bus)
+		return;
+
+	device_add_class_symlinks(&bus->dev);
+
+	list_for_each_entry(dev, &bus->devices, bus_list) {
+		bus_add_device(&dev->dev);
+		if (pci_dev_is_added(dev)) {
+			pci_proc_attach_device(dev);
+			pci_create_sysfs_dev_files(dev);
+			device_add_class_symlinks(&dev->dev);
+		}
+	}
+
+	if (busnr < max_bus_number)
+		pci_buses_create_sysfs(domain, busnr + 1, max_bus_number);
+}
+
+static void pci_rename_bus(struct pci_bus *bus, const char *new_bus_name)
+{
+	struct class *class;
+	int err;
+
+	class = bus->dev.class;
+	bus->dev.class = NULL;
+	err = device_rename(&bus->dev, new_bus_name);
+	bus->dev.class = class;
+}
+
+static void pci_rename_bus_devices(struct pci_bus *bus, const int domain,
+				   const int new_busnr)
+{
+	struct pci_dev *dev = NULL;
+
+	list_for_each_entry(dev, &bus->devices, bus_list) {
+		char old_name[64];
+		char new_name[64];
+		struct class *class;
+		int err;
+		int i;
+
+		strscpy(old_name, dev_name(&dev->dev), sizeof(old_name));
+		sprintf(new_name, "%04x:%02x:%02x.%d", domain, new_busnr,
+			PCI_SLOT(dev->devfn), PCI_FUNC(dev->devfn));
+		class = dev->dev.class;
+		dev->dev.class = NULL;
+		err = device_rename(&dev->dev, new_name);
+		dev->dev.class = class;
+
+		for (i = 0; i < PCI_BRIDGE_RESOURCES; i++)
+			dev->resource[i].name = pci_name(dev);
+	}
+}
+
 static void pci_do_move_buses(const int domain, int busnr, int first_moved_busnr,
 			      int delta, const struct resource *valid_range)
 {
 	struct pci_bus *bus;
-	int subordinate;
+	int subordinate, old_primary;
 	u32 old_buses, buses;
+	char old_bus_name[64];
+	char new_bus_name[64];
+	struct resource old_res;
+	int new_busnr = busnr + delta;
 
 	if (busnr < valid_range->start || busnr > valid_range->end)
 		return;
@@ -1110,11 +1197,21 @@ static void pci_do_move_buses(const int domain, int busnr, int first_moved_busnr
 	if (!bus)
 		return;
 
+	old_primary = bus->primary;
+	strscpy(old_bus_name, dev_name(&bus->dev), sizeof(old_bus_name));
+	sprintf(new_bus_name, "%04x:%02x", domain, new_busnr);
+
 	if (delta > 0) {
 		pci_do_move_buses(domain, busnr + 1, first_moved_busnr,
 				  delta, valid_range);
+		pci_rename_bus_devices(bus, domain, new_busnr);
+		pci_rename_bus(bus, new_bus_name);
+	} else {
+		pci_rename_bus(bus, new_bus_name);
+		pci_rename_bus_devices(bus, domain, new_busnr);
 	}
 
+	memcpy(&old_res, &bus->busn_res, sizeof(old_res));
 	bus->number += delta;
 	bus->busn_res.start += delta;
 
@@ -1132,6 +1229,10 @@ static void pci_do_move_buses(const int domain, int busnr, int first_moved_busnr
 	buses |= (unsigned int)(subordinate << 16);
 	pci_write_config_dword(bus->self, PCI_PRIMARY_BUS, buses);
 
+	dev_warn(&bus->dev, "Renamed bus %s (%02x-%pR) to %s (%02x-%pR)\n",
+		 old_bus_name, old_primary, &old_res,
+		 new_bus_name, bus->primary, &bus->busn_res);
+
 	if (delta < 0)
 		pci_do_move_buses(domain, busnr + 1, first_moved_busnr,
 				  delta, valid_range);
@@ -1192,8 +1293,10 @@ static int pci_move_buses(int domain, int busnr, int delta,
 		}
 	}
 
+	pci_buses_remove_sysfs(domain, busnr, valid_range->end);
 	pci_do_move_buses(domain, busnr, busnr,
 			  delta, valid_range);
+	pci_buses_create_sysfs(domain, busnr + delta, valid_range->end);
 
 	return 0;
 }
-- 
2.23.0
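
The reason the sysfs and proc entries go stale is that a PCI device name
embeds its bus number. The following is a minimal user-space sketch, not
part of the patch: format_dev_name() and name_changes_on_move() are
hypothetical helpers, and SLOT()/FUNC() mirror what PCI_SLOT()/PCI_FUNC()
do in the kernel. It uses the same "%04x:%02x:%02x.%d" format as
pci_rename_bus_devices() above, but with a bounded snprintf().

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Stand-ins for the kernel's PCI_SLOT() and PCI_FUNC() */
#define SLOT(devfn)	(((devfn) >> 3) & 0x1f)
#define FUNC(devfn)	((devfn) & 0x07)

/*
 * Hypothetical helper: build "dddd:bb:ss.f" for a device.
 * Returns 0 on success, -1 if the name did not fit into 'buf'.
 */
static int format_dev_name(char *buf, size_t len, int domain, int busnr,
			   unsigned int devfn)
{
	int n;

	assert(buf && len > 0);

	n = snprintf(buf, len, "%04x:%02x:%02x.%d", (unsigned int)domain,
		     (unsigned int)busnr, SLOT(devfn), (int)FUNC(devfn));

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Hypothetical helper: does moving the bus by 'delta' change the name? */
static int name_changes_on_move(int domain, int busnr, int delta,
				unsigned int devfn)
{
	char old_name[64];
	char new_name[64];

	if (format_dev_name(old_name, sizeof(old_name), domain, busnr, devfn) ||
	    format_dev_name(new_name, sizeof(new_name), domain, busnr + delta,
			    devfn))
		return -1;

	return strcmp(old_name, new_name) != 0;
}
```

Any non-zero delta changes the name of every device on the moved bus, so
every entry keyed by that name has to be removed and recreated.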



* [PATCH RFC 11/11] PCI: hotplug: movable bus numbers: compact the gaps in numbering
  2019-10-24 17:21 [PATCH RFC 00/11] PCI: hotplug: Movable bus numbers Sergey Miroshnichenko
                   ` (9 preceding siblings ...)
  2019-10-24 17:21 ` [PATCH RFC 10/11] PCI: hotplug: movable bus numbers: rename proc and sysfs entries Sergey Miroshnichenko
@ 2019-10-24 17:21 ` Sergey Miroshnichenko
  10 siblings, 0 replies; 12+ messages in thread
From: Sergey Miroshnichenko @ 2019-10-24 17:21 UTC (permalink / raw)
  To: linux-pci, linuxppc-dev; +Cc: Bjorn Helgaas, linux, Sergey Miroshnichenko

If bus numbers are distributed sparsely and there are a lot of devices in
the tree, hotplugging a bridge at the end of the tree may fail even if it
has fewer slots than the total number of unused bus numbers.

The bus renaming feature relies on the bus numbers being contiguous, so if
a bridge has been unplugged, the resulting gap in bus numbers must be
compacted.

Let's densify the bus numbering at the beginning of the next PCI rescan.

Signed-off-by: Sergey Miroshnichenko <s.miroshnichenko@yadro.com>
---
 drivers/pci/probe.c | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index fe9bf012ef33..0c91b9d453dd 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1319,6 +1319,30 @@ static bool pci_new_bus_needed(struct pci_bus *bus, const struct pci_dev *self)
 	return true;
 }
 
+static void pci_compact_bus_numbers(const int domain, const struct resource *valid_range)
+{
+	int busnr_p1 = valid_range->start;
+
+	while (busnr_p1 < valid_range->end) {
+		int busnr_p2 = busnr_p1 + 1;
+		struct pci_bus *bus_p2;
+		int delta;
+
+		while (busnr_p2 <= valid_range->end &&
+		       !(bus_p2 = pci_find_bus(domain, busnr_p2)))
+			++busnr_p2;
+
+		if (!bus_p2 || busnr_p2 > valid_range->end)
+			break;
+
+		delta = busnr_p1 - busnr_p2 + 1;
+		if (delta)
+			pci_move_buses(domain, busnr_p2, delta, valid_range);
+
+		++busnr_p1;
+	}
+}
+
 static unsigned int pci_scan_child_bus_extend(struct pci_bus *bus,
 					      unsigned int available_buses);
 /**
@@ -3691,6 +3715,9 @@ unsigned int pci_rescan_bus(struct pci_bus *bus)
 		pci_bus_update_immovable_range(root);
 		pci_bus_release_root_bridge_resources(root);
 
+		pci_compact_bus_numbers(pci_domain_nr(bus),
+					&root->busn_res);
+
 		max = pci_scan_child_bus(root);
 
 		pci_reassign_root_bus_resources(root);
-- 
2.23.0
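
The two-pointer walk in pci_compact_bus_numbers() above can be tried out on
a plain occupancy array. The following is a minimal user-space sketch, not
part of the patch: occupied[] stands in for pci_find_bus(), move_run() is a
hypothetical stand-in for pci_move_buses() with a negative delta, and the
returned move count is added only for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical stand-in for pci_move_buses() with delta < 0: shift the
 * contiguous run of occupied numbers starting at 'busnr' down by -delta.
 */
static void move_run(bool *occupied, int busnr, int delta, int end)
{
	for (int b = busnr; b <= end && occupied[b]; b++) {
		occupied[b] = false;
		occupied[b + delta] = true;
	}
}

/*
 * Close the gaps in [start, end] so that occupied numbers become
 * contiguous. Returns the number of moves that were needed.
 */
static int compact_bus_numbers(bool *occupied, int start, int end)
{
	int moves = 0;
	int p1 = start;

	assert(start >= 0 && start <= end);

	while (p1 < end) {
		int p2 = p1 + 1;
		int delta;

		/* Find the next occupied number above p1 */
		while (p2 <= end && !occupied[p2])
			p2++;
		if (p2 > end)
			break;

		/* Negative when there is a gap between p1 and p2 */
		delta = p1 - p2 + 1;
		if (delta) {
			move_run(occupied, p2, delta, end);
			moves++;
		}

		p1++;
	}

	return moves;
}
```

With buses 00, 01, 02, 05, 06 and 09 present in a range of 00-0f, the first
move shifts 05-06 down to 03-04 and the second one shifts 09 down to 05.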



Thread overview: 12+ messages
2019-10-24 17:21 [PATCH RFC 00/11] PCI: hotplug: Movable bus numbers Sergey Miroshnichenko
2019-10-24 17:21 ` [PATCH RFC 01/11] PCI: sysfs: Nullify freed pointers Sergey Miroshnichenko
2019-10-24 17:21 ` [PATCH RFC 02/11] PCI: proc: Nullify a freed pointer Sergey Miroshnichenko
2019-10-24 17:21 ` [PATCH RFC 03/11] drivers: base: Make bus_add_device() public Sergey Miroshnichenko
2019-10-24 17:21 ` [PATCH RFC 04/11] drivers: base: Make device_{add|remove}_class_symlinks() public Sergey Miroshnichenko
2019-10-24 17:21 ` [PATCH RFC 05/11] drivers: base: Add bus_disconnect_device() Sergey Miroshnichenko
2019-10-24 17:21 ` [PATCH RFC 06/11] powerpc/pci: Enable assigning bus numbers instead of reading them from DT Sergey Miroshnichenko
2019-10-24 17:21 ` [PATCH RFC 07/11] powerpc/pci: Don't reduce the host bridge bus range Sergey Miroshnichenko
2019-10-24 17:21 ` [PATCH RFC 08/11] PCI: Allow expanding the bridges Sergey Miroshnichenko
2019-10-24 17:21 ` [PATCH RFC 09/11] PCI: hotplug: Add initial support for movable bus numbers Sergey Miroshnichenko
2019-10-24 17:21 ` [PATCH RFC 10/11] PCI: hotplug: movable bus numbers: rename proc and sysfs entries Sergey Miroshnichenko
2019-10-24 17:21 ` [PATCH RFC 11/11] PCI: hotplug: movable bus numbers: compact the gaps in numbering Sergey Miroshnichenko
