* [PATCH RFC 00/11] PCI: hotplug: Movable bus numbers
@ 2019-10-24 17:21 Sergey Miroshnichenko
From: Sergey Miroshnichenko @ 2019-10-24 17:21 UTC
  To: linux-pci, linuxppc-dev; +Cc: Bjorn Helgaas, linux, Sergey Miroshnichenko

To allow hotplugging bridges, the kernel or BIOS/bootloader/firmware reserves
extra bus numbers per slot, but this range may not be enough for a large
bridge and/or nested bridges when hot-adding a chassis full of devices.

This patchset proposes an approach similar to movable BARs: bus numbers are
no longer reserved; instead, the kernel moves the "tail" of the PCI tree
by one whenever a new bus is needed.

When something like this is going to happen:
                                                                   *LARGE*
 +-[0020:00]---00.0-[01-20]--+-00.0-[02-08]--+-00.0-[03]--   <--  *NESTED*
 |                           |               +-01.0-[04]--        *BRIDGE*
 |                           |               +-02.0-[05]--
 |                           |               +-03.0-[06]--
 |                           |               +-04.0-[07]--
 |                           |               \-05.0-[08]--
 ...

, this will result in the following:

 +-[0020:00]---00.0-[01-22]--+-00.0-[02-22]--+-00.0-[03-1d]----04.0-[04-1d]--+-00.0-[05]--
 |                           |               |                               +-04.0-[06]--
 |                           |               |                               +-09.0-[07]--
 |                           |               |                               +-0c.0-[08-19]----00.0-[09-19]--+-01.0-[0a]--
 |                           |               |                               |                               ...
 |                           |               |                               |                               \-11.0-[19]--
 |                           |               |                               ...
 |                           |               |                               \-15.0-[1d]--
 |                           |               +-01.0-[1e]--  <-- Renamed from 04
 |                           |               +-02.0-[1f]--  <-- Renamed from 05
 |                           |               +-03.0-[20]--  <-- Renamed from 06
 |                           |               +-04.0-[21]--  <-- Renamed from 07
 |                           |               \-05.0-[22]--  <-- Renamed from 08
 ...
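
As a rough illustration of the renumbering in the diagrams above, here is a
toy Python model. It is not the patchset's code: shift_tail() and its
parameters are hypothetical names, and it assumes all the room is made in a
single step, whereas the patchset describes moving the tail by one each time
a new bus is needed.

```python
def shift_tail(buses_in_use, hotplug_bus, extra):
    """Map each old bus number to its new one after `extra` new buses
    are inserted directly behind `hotplug_bus` (toy model, not kernel code)."""
    if extra < 0:
        raise ValueError("extra must be non-negative")
    renames = {}
    for bus in sorted(buses_in_use):
        # Buses up to and including the hotplug point keep their numbers;
        # everything after it (the "tail") moves up by `extra`.
        new_bus = bus + extra if bus > hotplug_bus else bus
        if new_bus > 0xFF:
            raise ValueError("ran out of bus numbers (8-bit limit)")
        renames[bus] = new_bus
    return renames


# Buses 01..08 are in use; the nested bridge behind bus 03 needs 0x1a more.
renames = shift_tail(range(0x01, 0x09), hotplug_bus=0x03, extra=0x1A)
for old, new in renames.items():
    if old != new:
        print(f"{old:02x} -> {new:02x}")  # 04 -> 1e ... 08 -> 22
```

The mapping matches the "Renamed from" annotations in the second diagram.
Widening the subordinate ranges of the parent bridges is left out of this
sketch.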


This appears to be safe in the kernel, because drivers don't use the raw PCI
BDF ID. We've tested this on our x86 and PowerNV machines: mass storage
devices holding root filesystems and network adapters just continue working
after their bus numbers have moved.

But here comes the userspace:

 - procfs entries:

    % ls -la /proc/bus/pci/*
    /proc/bus/pci/00:
    00.0
    02.0
    ...
    1f.4
    1f.6

    /proc/bus/pci/04:
    00.0

    /proc/bus/pci/40:
    00.0

 - sysfs entries:

    % ls -la /sys/devices/pci0000:00/
    0000:00:00.0
    0000:00:02.0
    ...
    0000:00:1f.3
    0000:00:1f.4
    0000:00:1f.6

    % ls -la /sys/devices/pci0000:00/0000:00:1c.6/0000:04:00.0/driver
    driver -> ../../../../bus/pci/drivers/iwlwifi

 - sysfs symlinks:

    % ls -la /sys/bus/pci/devices
    0000:00:00.0 -> ../../../devices/pci0000:00/0000:00:00.0
    0000:00:02.0 -> ../../../devices/pci0000:00/0000:00:02.0
    ...
    0000:04:00.0 -> ../../../devices/pci0000:00/0000:00:1c.6/0000:04:00.0
    0000:40:00.0 -> ../../../devices/pci0000:00/0000:00:1d.2/0000:40:00.0


These patches alter the kernel public API and some internals so that these
files can be removed before a bus number changes and recreated after the
device has changed its BDF.
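
To make the userspace impact concrete, the sketch below derives the
BDF-based names seen in the listings above and shows which entries would
become stale when a bus number changes. The helper names are hypothetical,
the procfs form assumes a single PCI domain (as in the listing), and the
move from bus 04 to bus 05 is an invented example, not a result from the
patchset.

```python
def sysfs_name(domain, bus, dev, fn):
    """Device name as it appears under /sys/bus/pci/devices."""
    return f"{domain:04x}:{bus:02x}:{dev:02x}.{fn:d}"


def procfs_path(bus, dev, fn):
    """Per-device file under /proc/bus/pci (single-domain layout)."""
    return f"/proc/bus/pci/{bus:02x}/{dev:02x}.{fn:x}"


def entries_for(domain, bus, dev, fn):
    """BDF-derived userspace entries for one device (illustrative subset)."""
    return [
        procfs_path(bus, dev, fn),
        "/sys/bus/pci/devices/" + sysfs_name(domain, bus, dev, fn),
    ]


def rename_plan(domain, old_bus, new_bus, dev, fn):
    """Entries to remove before the move and to create after it."""
    return {
        "remove": entries_for(domain, old_bus, dev, fn),
        "create": entries_for(domain, new_bus, dev, fn),
    }


# Hypothetical move of device 0000:04:00.0 from bus 04 to bus 05.
plan = rename_plan(0x0000, old_bus=0x04, new_bus=0x05, dev=0x00, fn=0)
for action, paths in plan.items():
    for path in paths:
        print(action, path)
```

Any userspace program that cached one of the "remove" paths would be left
holding a name that no longer exists, which is the ABI concern raised below.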

On the one hand, this makes hotplug predictable, cross-platform, and
independent of non-kernel components (BIOS, bootloader, etc.); on the other
hand, it is a severe ABI violation.

udev should probably gain a new action such as "rename", in addition to
"add" and "remove".

Would it be feasible to have this feature disabled by default, with the
option to enable it via a kernel command-line argument like this:

  pci=realloc,movable_buses

?
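
If the feature were opt-in this way, a userspace tool that relies on stable
bus numbers could check whether it was requested. A minimal sketch, assuming
the proposed "movable_buses" keyword is passed inside the pci= option exactly
as shown above; the function name is hypothetical:

```python
def movable_buses_requested(cmdline):
    """Return True if a pci= option on the command line lists movable_buses."""
    for token in cmdline.split():
        if token.startswith("pci="):
            options = token[len("pci="):].split(",")
            if "movable_buses" in options:
                return True
    return False


# A real tool would read the string from /proc/cmdline; a literal is used
# here so the example runs anywhere.
example = "root=/dev/sda1 ro pci=realloc,movable_buses quiet"
print(movable_buses_requested(example))
```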

This code is a follow-up to the "PCI: Allow BAR movement during hotplug"
series (v6).

Sergey Miroshnichenko (11):
  PCI: sysfs: Nullify freed pointers
  PCI: proc: Nullify a freed pointer
  drivers: base: Make bus_add_device() public
  drivers: base: Make device_{add|remove}_class_symlinks() public
  drivers: base: Add bus_disconnect_device()
  powerpc/pci: Enable assigning bus numbers instead of reading them from
    DT
  powerpc/pci: Don't reduce the host bridge bus range
  PCI: Allow expanding the bridges
  PCI: hotplug: Add initial support for movable bus numbers
  PCI: hotplug: movable bus numbers: rename proc and sysfs entries
  PCI: hotplug: movable bus numbers: compact the gaps in numbering

 .../admin-guide/kernel-parameters.txt         |   3 +
 arch/powerpc/kernel/pci-common.c              |   1 -
 arch/powerpc/kernel/pci_dn.c                  |   5 +
 arch/powerpc/platforms/powernv/eeh-powernv.c  |   3 +-
 drivers/base/base.h                           |   1 -
 drivers/base/bus.c                            |  37 +++
 drivers/base/core.c                           |   6 +-
 drivers/pci/pci-sysfs.c                       |   7 +-
 drivers/pci/pci.c                             |   3 +
 drivers/pci/pci.h                             |   2 +
 drivers/pci/probe.c                           | 291 +++++++++++++++++-
 drivers/pci/proc.c                            |   1 +
 include/linux/device.h                        |   5 +
 13 files changed, 351 insertions(+), 14 deletions(-)

-- 
2.23.0

