* [PATCH v3 0/5] PCI: Fixes for native PCIe and ACPI hotplug
@ 2018-02-26 13:21 Mika Westerberg
  2018-02-26 13:21 ` [PATCH v3 1/5] PCI: Make sure all bridges reserve at least one bus number Mika Westerberg
                   ` (6 more replies)
  0 siblings, 7 replies; 34+ messages in thread
From: Mika Westerberg @ 2018-02-26 13:21 UTC (permalink / raw)
  To: Bjorn Helgaas, Rafael J. Wysocki
  Cc: Len Brown, Mario.Limonciello, Michael Jamet, Yehezkel Bernat,
	Mika Westerberg, Andy Shevchenko, linux-pci, linux-acpi

Hi,

When a Thunderbolt controller is configured for native enumeration mode,
there are actually two slightly different modes. The first is non-RTD3
mode, where the Thunderbolt host controller is only present when a device
is connected. The second is RTD3 mode, where the controller is always
present.

In non-RTD3 mode the Thunderbolt host controller (NHI) and the USB host
controller (xHCI) are not hotplugged using native PCIe hotplug; instead
they are hotplugged via a BIOS-triggered ACPI Notify() to the root port.
This is done to preserve resources, since the NHI and xHCI only need 1 MB
of MMIO space and no additional buses. Currently Linux does not support
this very well and ends up failing the hotplug in one way or another. A
more detailed explanation is in the changelog of patch [4/5].

This series fixes this issue and in addition includes fixes for a few
other issues found during testing on a system that has a Thunderbolt
controller in non-RTD3 native PCIe enumeration mode. However, the fixes
here are not in any way Thunderbolt specific and should be applicable to
other systems as well.

The previous versions of the patch series can be found here:

  v2: https://www.spinics.net/lists/linux-pci/msg69186.html
  v1: https://www.spinics.net/lists/linux-acpi/msg80607.html

Changes from v2:

  - Added Rafael's tag to patch [1/5].
  - Updated changelog of patch [1/5] to include more details about how the
    problem can be observed from dmesg and lspci output.

Changes from v1:

  - Drop 'cmax - max ?: 1' and use a construct similar to the one used in
    the second pass loop in patch [1/5].
  - Drop unnecessary parentheses in patch [1/5].
  - Added Rafael's tag to patches [2-5/5].

Mika Westerberg (5):
  PCI: Make sure all bridges reserve at least one bus number
  PCI: Take bridge window alignment into account when distributing resources
  PCI: pciehp: Clear Presence Detect and Data Link Layer Status Changed on resume
  ACPI / hotplug / PCI: Do not scan all bridges when native PCIe hotplug is used
  ACPI / hotplug / PCI: Mark stale PCI devices disconnected

 drivers/pci/hotplug/acpiphp.h      |  1 +
 drivers/pci/hotplug/acpiphp_glue.c | 78 ++++++++++++++++++++++++++++++--------
 drivers/pci/hotplug/pciehp.h       |  2 +-
 drivers/pci/hotplug/pciehp_core.c  |  2 +-
 drivers/pci/hotplug/pciehp_hpc.c   | 13 ++++++-
 drivers/pci/probe.c                | 11 ++++--
 drivers/pci/setup-bus.c            | 41 +++++++++++++++++++-
 7 files changed, 126 insertions(+), 22 deletions(-)

-- 
2.16.1


^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH v3 1/5] PCI: Make sure all bridges reserve at least one bus number
  2018-02-26 13:21 [PATCH v3 0/5] PCI: Fixes for native PCIe and ACPI hotplug Mika Westerberg
@ 2018-02-26 13:21 ` Mika Westerberg
  2018-03-27 18:57   ` Bjorn Helgaas
  2018-02-26 13:21 ` [PATCH v3 2/5] PCI: Take bridge window alignment into account when distributing resources Mika Westerberg
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 34+ messages in thread
From: Mika Westerberg @ 2018-02-26 13:21 UTC (permalink / raw)
  To: Bjorn Helgaas, Rafael J. Wysocki
  Cc: Len Brown, Mario.Limonciello, Michael Jamet, Yehezkel Bernat,
	Mika Westerberg, Andy Shevchenko, linux-pci, linux-acpi

When distributing extra buses between hotplug bridges we need to make
sure each bridge reserves at least one bus number, even if there is
currently nothing connected to it. For instance, ACPI hotplug may bring
in additional devices below non-hotplug bridges later on.

Here is what happens on one system when a Thunderbolt device is plugged in:

  pci 0000:01:00.0: PCI bridge to [bus 02-39]
  ...
  pci_bus 0000:04: [bus 04-39] extended by 0x35
  pci_bus 0000:04: bus scan returning with max=39
  pci_bus 0000:04: busn_res: [bus 04-39] end is updated to 39
  pci 0000:02:02.0: scanning [bus 00-00] behind bridge, pass 1
  pci_bus 0000:3a: scanning bus
  pci_bus 0000:3a: bus scan returning with max=3a
  pci_bus 0000:3a: busn_res: [bus 3a] end is updated to 3a
  pci_bus 0000:3a: [bus 3a] partially hidden behind bridge 0000:02 [bus 02-39]
  pci_bus 0000:3a: [bus 3a] partially hidden behind bridge 0000:01 [bus 01-39]
  pci_bus 0000:02: bus scan returning with max=3a
  pci_bus 0000:02: busn_res: [bus 02-39] end can not be updated to 3a

The resulting 'lspci -t' output looks like this:

  +-1b.0-[01-39]----00.0-[02-3a]--+-00.0-[03]----00.0
                                  +-01.0-[04-39]--
                                  \-02.0-[3a]----00.0

The device behind the downstream port at 02:02 is the integrated xHCI
(USB 3 host controller) and is not fully accessible because the hotplug
bridge is reserving too many bus numbers.

To make sure we don't run out of bus numbers for non-hotplug bridges,
reserve one bus number for them upfront before distributing buses to
hotplug bridges.
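The accounting change can be sketched outside the kernel like this (a toy
model of the intent, not the actual pci_scan_child_bus_extend() code; the
helper name is made up):

```c
#include <assert.h>

/*
 * Toy model of the bus-number accounting this patch introduces: a
 * bridge that currently has nothing behind it still consumes one bus
 * number, so always count at least one per bridge.
 */
static int buses_used_by_bridge(int buses_behind)
{
	int used = 1;			/* reserve one bus per bridge */

	if (buses_behind > 1)		/* one is already accounted above */
		used += buses_behind - 1;
	return used;
}
```

An empty bridge thus still reserves a bus number, which is what keeps a
later ACPI-hotplugged device below it reachable.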

Fixes: 1c02ea810065 ("PCI: Distribute available buses to hotplug-capable bridges")
Reported-by: Mario Limonciello <mario.limonciello@dell.com>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: stable@vger.kernel.org
---
 drivers/pci/probe.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index ef5377438a1e..6cefd47556e3 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -2561,7 +2561,10 @@ static unsigned int pci_scan_child_bus_extend(struct pci_bus *bus,
 	for_each_pci_bridge(dev, bus) {
 		cmax = max;
 		max = pci_scan_bridge_extend(bus, dev, max, 0, 0);
-		used_buses += cmax - max;
+		/* Reserve one bus for each bridge */
+		used_buses++;
+		if (cmax - max > 1)
+			used_buses += cmax - max - 1;
 	}
 
 	/* Scan bridges that need to be reconfigured */
@@ -2584,12 +2587,14 @@ static unsigned int pci_scan_child_bus_extend(struct pci_bus *bus,
 			 * bridges if any.
 			 */
 			buses = available_buses / hotplug_bridges;
-			buses = min(buses, available_buses - used_buses);
+			buses = min(buses, available_buses - used_buses + 1);
 		}
 
 		cmax = max;
 		max = pci_scan_bridge_extend(bus, dev, cmax, buses, 1);
-		used_buses += max - cmax;
+		/* One bus is already accounted so don't add it again */
+		if (max - cmax > 1)
+			used_buses += max - cmax - 1;
 	}
 
 	/*
-- 
2.16.1



* [PATCH v3 2/5] PCI: Take bridge window alignment into account when distributing resources
  2018-02-26 13:21 [PATCH v3 0/5] PCI: Fixes for native PCIe and ACPI hotplug Mika Westerberg
  2018-02-26 13:21 ` [PATCH v3 1/5] PCI: Make sure all bridges reserve at least one bus number Mika Westerberg
@ 2018-02-26 13:21 ` Mika Westerberg
  2018-03-27 19:23   ` Bjorn Helgaas
  2018-02-26 13:21 ` [PATCH v3 3/5] PCI: pciehp: Clear Presence Detect and Data Link Layer Status Changed on resume Mika Westerberg
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 34+ messages in thread
From: Mika Westerberg @ 2018-02-26 13:21 UTC (permalink / raw)
  To: Bjorn Helgaas, Rafael J. Wysocki
  Cc: Len Brown, Mario.Limonciello, Michael Jamet, Yehezkel Bernat,
	Mika Westerberg, Andy Shevchenko, linux-pci, linux-acpi

When connecting a Thunderbolt endpoint that itself contains a PCIe
switch (in addition to the integrated one), the way we currently
distribute resources does not always work well, because devices below
non-hotplug bridges might need their MMIO resources aligned to something
other than a 1 MB boundary. For example, connecting a server-grade
4-port GbE add-in card adds another PCIe switch leading to GbE devices
that want their MMIO resources aligned to a 2 MB boundary instead.

The current resource distribution code does not take this alignment into
account and might try to add too many resources to the extension hotplug
bridges. The resulting bridge window is then too big, which makes Linux
re-allocate a minimal amount of resources, making future extension
impossible.

Make this work better by subtracting the properly aligned non-hotplug
downstream bridge window size from the remaining resources.
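The fix boils down to accounting for the padding that alignment inserts
before each window. A standalone sketch (kernel-style ALIGN(), but a
hypothetical helper, not the setup-bus.c code):

```c
#include <assert.h>
#include <stdint.h>

/* Round x up to the next multiple of the power-of-two a (as in the kernel). */
#define ALIGN(x, a)	(((x) + ((uint64_t)(a) - 1)) & ~((uint64_t)(a) - 1))

/*
 * Walk past one non-hotplug bridge window: advance the running start
 * address by the window size and subtract any alignment padding from
 * the space still distributable to hotplug bridges.
 */
static uint64_t account_window(uint64_t *start, uint64_t win_size,
			       uint64_t align, uint64_t remaining)
{
	uint64_t aligned;

	*start += win_size;
	aligned = ALIGN(*start, align);
	if (aligned > *start)
		remaining -= aligned - *start;	/* padding is lost space */
	return remaining;
}
```

With a 1 MB window starting at 1 MB, a following 2 MB-aligned window
costs nothing extra, but a 4 MB-aligned one eats 2 MB of padding.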

Fixes: 1a5767725cec ("PCI: Distribute available resources to hotplug-capable bridges")
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: stable@vger.kernel.org
---
 drivers/pci/setup-bus.c | 41 ++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 40 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index 3cce29a069e6..f1e6172734f0 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -1882,6 +1882,7 @@ static void pci_bus_distribute_available_resources(struct pci_bus *bus,
 	resource_size_t available_mmio, resource_size_t available_mmio_pref)
 {
 	resource_size_t remaining_io, remaining_mmio, remaining_mmio_pref;
+	resource_size_t io_start, mmio_start, mmio_pref_start;
 	unsigned int normal_bridges = 0, hotplug_bridges = 0;
 	struct resource *io_res, *mmio_res, *mmio_pref_res;
 	struct pci_dev *dev, *bridge = bus->self;
@@ -1946,11 +1947,16 @@ static void pci_bus_distribute_available_resources(struct pci_bus *bus,
 			remaining_mmio_pref -= resource_size(res);
 	}
 
+	io_start = io_res->start;
+	mmio_start = mmio_res->start;
+	mmio_pref_start = mmio_pref_res->start;
+
 	/*
 	 * Go over devices on this bus and distribute the remaining
 	 * resource space between hotplug bridges.
 	 */
 	for_each_pci_bridge(dev, bus) {
+		resource_size_t align;
 		struct pci_bus *b;
 
 		b = dev->subordinate;
@@ -1968,7 +1974,7 @@ static void pci_bus_distribute_available_resources(struct pci_bus *bus,
 				available_io, available_mmio,
 				available_mmio_pref);
 		} else if (dev->is_hotplug_bridge) {
-			resource_size_t align, io, mmio, mmio_pref;
+			resource_size_t io, mmio, mmio_pref;
 
 			/*
 			 * Distribute available extra resources equally
@@ -1981,11 +1987,13 @@ static void pci_bus_distribute_available_resources(struct pci_bus *bus,
 			io = div64_ul(available_io, hotplug_bridges);
 			io = min(ALIGN(io, align), remaining_io);
 			remaining_io -= io;
+			io_start += io;
 
 			align = pci_resource_alignment(bridge, mmio_res);
 			mmio = div64_ul(available_mmio, hotplug_bridges);
 			mmio = min(ALIGN(mmio, align), remaining_mmio);
 			remaining_mmio -= mmio;
+			mmio_start += mmio;
 
 			align = pci_resource_alignment(bridge, mmio_pref_res);
 			mmio_pref = div64_ul(available_mmio_pref,
@@ -1993,9 +2001,40 @@ static void pci_bus_distribute_available_resources(struct pci_bus *bus,
 			mmio_pref = min(ALIGN(mmio_pref, align),
 					remaining_mmio_pref);
 			remaining_mmio_pref -= mmio_pref;
+			mmio_pref_start += mmio_pref;
 
 			pci_bus_distribute_available_resources(b, add_list, io,
 							       mmio, mmio_pref);
+		} else {
+			/*
+			 * For normal bridges, track start of the parent
+			 * bridge window to make sure we align the
+			 * remaining space which is distributed to the
+			 * hotplug bridges properly.
+			 */
+			resource_size_t aligned;
+			struct resource *res;
+
+			res = &dev->resource[PCI_BRIDGE_RESOURCES + 0];
+			io_start += resource_size(res);
+			aligned = ALIGN(io_start,
+					pci_resource_alignment(dev, res));
+			if (aligned > io_start)
+				remaining_io -= aligned - io_start;
+
+			res = &dev->resource[PCI_BRIDGE_RESOURCES + 1];
+			mmio_start += resource_size(res);
+			aligned = ALIGN(mmio_start,
+					pci_resource_alignment(dev, res));
+			if (aligned > mmio_start)
+				remaining_mmio -= aligned - mmio_start;
+
+			res = &dev->resource[PCI_BRIDGE_RESOURCES + 2];
+			mmio_pref_start += resource_size(res);
+			aligned = ALIGN(mmio_pref_start,
+					pci_resource_alignment(dev, res));
+			if (aligned > mmio_pref_start)
+				remaining_mmio_pref -= aligned - mmio_pref_start;
 		}
 	}
 }
-- 
2.16.1



* [PATCH v3 3/5] PCI: pciehp: Clear Presence Detect and Data Link Layer Status Changed on resume
  2018-02-26 13:21 [PATCH v3 0/5] PCI: Fixes for native PCIe and ACPI hotplug Mika Westerberg
  2018-02-26 13:21 ` [PATCH v3 1/5] PCI: Make sure all bridges reserve at least one bus number Mika Westerberg
  2018-02-26 13:21 ` [PATCH v3 2/5] PCI: Take bridge window alignment into account when distributing resources Mika Westerberg
@ 2018-02-26 13:21 ` Mika Westerberg
  2018-02-26 13:21 ` [PATCH v3 4/5] ACPI / hotplug / PCI: Do not scan all bridges when native PCIe hotplug is used Mika Westerberg
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 34+ messages in thread
From: Mika Westerberg @ 2018-02-26 13:21 UTC (permalink / raw)
  To: Bjorn Helgaas, Rafael J. Wysocki
  Cc: Len Brown, Mario.Limonciello, Michael Jamet, Yehezkel Bernat,
	Mika Westerberg, Andy Shevchenko, linux-pci, linux-acpi

After a suspend/resume cycle the Presence Detect or Data Link Layer
Status Changed bits might be set, and if we don't clear them those events
will not fire anymore and nothing happens when, for instance, a device is
hot-unplugged.

Fix this by clearing those bits in a newly introduced function,
pcie_reenable_notification(). This should be fine because immediately
afterwards we check whether the adapter is still present by reading
directly from the status register.
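The change bits in the Slot Status register are RW1C (write-1-to-clear).
A toy register model (not real config-space access) of what the new
write does; the bit values match the PCIe specification:

```c
#include <assert.h>
#include <stdint.h>

#define PCI_EXP_SLTSTA_PDC	0x0008	/* Presence Detect Changed */
#define PCI_EXP_SLTSTA_DLLSC	0x0100	/* Data Link Layer State Changed */

/*
 * RW1C semantics: every 1 bit written to the Slot Status register
 * clears the corresponding event bit; 0 bits are left untouched.
 */
static void sltsta_write(uint16_t *sltsta, uint16_t val)
{
	*sltsta &= ~val;
}
```

Writing PDC | DLLSC therefore discards the stale events from resume
while leaving the other status bits alone.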

Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: stable@vger.kernel.org
---
 drivers/pci/hotplug/pciehp.h      |  2 +-
 drivers/pci/hotplug/pciehp_core.c |  2 +-
 drivers/pci/hotplug/pciehp_hpc.c  | 13 ++++++++++++-
 3 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/drivers/pci/hotplug/pciehp.h b/drivers/pci/hotplug/pciehp.h
index 636ed8f4b869..5abf864eae35 100644
--- a/drivers/pci/hotplug/pciehp.h
+++ b/drivers/pci/hotplug/pciehp.h
@@ -120,7 +120,7 @@ struct controller *pcie_init(struct pcie_device *dev);
 int pcie_init_notification(struct controller *ctrl);
 int pciehp_enable_slot(struct slot *p_slot);
 int pciehp_disable_slot(struct slot *p_slot);
-void pcie_enable_notification(struct controller *ctrl);
+void pcie_reenable_notification(struct controller *ctrl);
 int pciehp_power_on_slot(struct slot *slot);
 void pciehp_power_off_slot(struct slot *slot);
 void pciehp_get_power_status(struct slot *slot, u8 *status);
diff --git a/drivers/pci/hotplug/pciehp_core.c b/drivers/pci/hotplug/pciehp_core.c
index 332b723ff9e6..44a6a63802d5 100644
--- a/drivers/pci/hotplug/pciehp_core.c
+++ b/drivers/pci/hotplug/pciehp_core.c
@@ -283,7 +283,7 @@ static int pciehp_resume(struct pcie_device *dev)
 	ctrl = get_service_data(dev);
 
 	/* reinitialize the chipset's event detection logic */
-	pcie_enable_notification(ctrl);
+	pcie_reenable_notification(ctrl);
 
 	slot = ctrl->slot;
 
diff --git a/drivers/pci/hotplug/pciehp_hpc.c b/drivers/pci/hotplug/pciehp_hpc.c
index 18a42f8f5dc5..98ea75aa32c7 100644
--- a/drivers/pci/hotplug/pciehp_hpc.c
+++ b/drivers/pci/hotplug/pciehp_hpc.c
@@ -659,7 +659,7 @@ static irqreturn_t pcie_isr(int irq, void *dev_id)
 	return handled;
 }
 
-void pcie_enable_notification(struct controller *ctrl)
+static void pcie_enable_notification(struct controller *ctrl)
 {
 	u16 cmd, mask;
 
@@ -697,6 +697,17 @@ void pcie_enable_notification(struct controller *ctrl)
 		 pci_pcie_cap(ctrl->pcie->port) + PCI_EXP_SLTCTL, cmd);
 }
 
+void pcie_reenable_notification(struct controller *ctrl)
+{
+	/*
+	 * Clear both Presence and Data Link Layer Changed to make sure
+	 * those events still fire after we have re-enabled them.
+	 */
+	pcie_capability_write_word(ctrl->pcie->port, PCI_EXP_SLTSTA,
+				   PCI_EXP_SLTSTA_PDC | PCI_EXP_SLTSTA_DLLSC);
+	pcie_enable_notification(ctrl);
+}
+
 static void pcie_disable_notification(struct controller *ctrl)
 {
 	u16 mask;
-- 
2.16.1



* [PATCH v3 4/5] ACPI / hotplug / PCI: Do not scan all bridges when native PCIe hotplug is used
  2018-02-26 13:21 [PATCH v3 0/5] PCI: Fixes for native PCIe and ACPI hotplug Mika Westerberg
                   ` (2 preceding siblings ...)
  2018-02-26 13:21 ` [PATCH v3 3/5] PCI: pciehp: Clear Presence Detect and Data Link Layer Status Changed on resume Mika Westerberg
@ 2018-02-26 13:21 ` Mika Westerberg
  2018-02-26 13:21 ` [PATCH v3 5/5] ACPI / hotplug / PCI: Mark stale PCI devices disconnected Mika Westerberg
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 34+ messages in thread
From: Mika Westerberg @ 2018-02-26 13:21 UTC (permalink / raw)
  To: Bjorn Helgaas, Rafael J. Wysocki
  Cc: Len Brown, Mario.Limonciello, Michael Jamet, Yehezkel Bernat,
	Mika Westerberg, Andy Shevchenko, linux-pci, linux-acpi

When a system uses native PCIe hotplug for Thunderbolt and the
controller is not enabled for full RTD3 (runtime D3), the controller is
only present in the system when a device is connected. This pretty much
follows the BIOS-assisted hotplug behaviour.

The PCIe switch integrated into the Thunderbolt host controller has two
additional PCIe downstream ports that lead to the NHI (Thunderbolt host
controller) and the xHCI (USB 3 host controller) respectively. These
downstream ports are not marked as hotplug capable. The reason is to
preserve resources: otherwise the OS would distribute the remaining
resources between all downstream ports, making these two ports consume
more resources than they really need (both the NHI and the xHCI only
need 1 MB of MMIO space, and no extra buses).

Now, because these ports are not marked as hotplug capable, the OS will
not enable a hotplug interrupt for them either, and will not receive an
interrupt when devices below them are hotplugged. The solution is that
the BIOS sends an ACPI Notify() to the root port to tell the OS it needs
to rescan for added devices.

However, the current ACPI hotplug implementation scans the whole slot
regardless of whether native PCIe hotplug is used or not, and it expects
that the BIOS has configured bridge resources upfront. If that's not the
case it assigns resources using minimal allocations (everything
currently found just barely fits), preventing future extension. In
addition, if another native PCIe hotplug operation is going on at the
same time, we may find the new PCIe switch only partially ready (not all
links fully trained yet), confusing pciehp when it finally starts to
enumerate the new devices.

To make this work better with the native PCIe hotplug driver (pciehp),
we let it handle all slot management and resource allocation for hotplug
bridges and restrict ACPI hotplug to non-hotplug bridges.
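The resulting policy can be summarized as a predicate (a simplification
of the enable_slot() change in this patch; the function name is made
up):

```c
#include <assert.h>
#include <stdbool.h>

#define SLOT_IS_NATIVE	0x00000004	/* slot flag added by this patch */

/*
 * Simplified decision: on a native PCIe hotplug slot, ACPI hotplug
 * rescans only non-hotplug bridges and leaves hotplug bridges (and
 * their resources) to pciehp; otherwise it scans everything as before.
 */
static bool acpiphp_should_scan(unsigned int slot_flags,
				bool is_hotplug_bridge)
{
	if (slot_flags & SLOT_IS_NATIVE)
		return !is_hotplug_bridge;
	return true;
}
```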

Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: stable@vger.kernel.org
---
 drivers/pci/hotplug/acpiphp.h      |  1 +
 drivers/pci/hotplug/acpiphp_glue.c | 73 ++++++++++++++++++++++++++++++--------
 2 files changed, 59 insertions(+), 15 deletions(-)

diff --git a/drivers/pci/hotplug/acpiphp.h b/drivers/pci/hotplug/acpiphp.h
index e438a2d734f2..8dcfd623ef1d 100644
--- a/drivers/pci/hotplug/acpiphp.h
+++ b/drivers/pci/hotplug/acpiphp.h
@@ -158,6 +158,7 @@ struct acpiphp_attention_info
 
 #define SLOT_ENABLED		(0x00000001)
 #define SLOT_IS_GOING_AWAY	(0x00000002)
+#define SLOT_IS_NATIVE		(0x00000004)
 
 /* function flags */
 
diff --git a/drivers/pci/hotplug/acpiphp_glue.c b/drivers/pci/hotplug/acpiphp_glue.c
index b45b375c0e6c..5efa21cdddc9 100644
--- a/drivers/pci/hotplug/acpiphp_glue.c
+++ b/drivers/pci/hotplug/acpiphp_glue.c
@@ -282,6 +282,9 @@ static acpi_status acpiphp_add_context(acpi_handle handle, u32 lvl, void *data,
 	slot->device = device;
 	INIT_LIST_HEAD(&slot->funcs);
 
+	if (pdev && pciehp_is_native(pdev))
+		slot->flags |= SLOT_IS_NATIVE;
+
 	list_add_tail(&slot->node, &bridge->slots);
 
 	/*
@@ -291,7 +294,7 @@ static acpi_status acpiphp_add_context(acpi_handle handle, u32 lvl, void *data,
 	 * expose slots to user space in those cases.
 	 */
 	if ((acpi_pci_check_ejectable(pbus, handle) || is_dock_device(adev))
-	    && !(pdev && pdev->is_hotplug_bridge && pciehp_is_native(pdev))) {
+	    && !(slot->flags & SLOT_IS_NATIVE && pdev->is_hotplug_bridge)) {
 		unsigned long long sun;
 		int retval;
 
@@ -430,6 +433,29 @@ static int acpiphp_rescan_slot(struct acpiphp_slot *slot)
 	return pci_scan_slot(slot->bus, PCI_DEVFN(slot->device, 0));
 }
 
+static void acpiphp_native_scan_bridge(struct pci_dev *bridge)
+{
+	struct pci_bus *bus = bridge->subordinate;
+	struct pci_dev *dev;
+	int max;
+
+	if (!bus)
+		return;
+
+	max = bus->busn_res.start;
+	/* Scan already configured non-hotplug bridges */
+	for_each_pci_bridge(dev, bus) {
+		if (!dev->is_hotplug_bridge)
+			max = pci_scan_bridge(bus, dev, max, 0);
+	}
+
+	/* Scan non-hotplug bridges that need to be reconfigured */
+	for_each_pci_bridge(dev, bus) {
+		if (!dev->is_hotplug_bridge)
+			max = pci_scan_bridge(bus, dev, max, 1);
+	}
+}
+
 /**
  * enable_slot - enable, configure a slot
  * @slot: slot to be enabled
@@ -442,25 +468,42 @@ static void enable_slot(struct acpiphp_slot *slot)
 	struct pci_dev *dev;
 	struct pci_bus *bus = slot->bus;
 	struct acpiphp_func *func;
-	int max, pass;
-	LIST_HEAD(add_list);
 
-	acpiphp_rescan_slot(slot);
-	max = acpiphp_max_busnr(bus);
-	for (pass = 0; pass < 2; pass++) {
+	if (slot->flags & SLOT_IS_NATIVE) {
+		/*
+		 * If native PCIe hotplug is used, it will take care of
+		 * hotplug slot management and resource allocation for
+		 * hotplug bridges. However, ACPI hotplug may still be
+		 * used for non-hotplug bridges to bring in additional
+		 * devices such as Thunderbolt host controller.
+		 */
 		for_each_pci_bridge(dev, bus) {
-			if (PCI_SLOT(dev->devfn) != slot->device)
-				continue;
-
-			max = pci_scan_bridge(bus, dev, max, pass);
-			if (pass && dev->subordinate) {
-				check_hotplug_bridge(slot, dev);
-				pcibios_resource_survey_bus(dev->subordinate);
-				__pci_bus_size_bridges(dev->subordinate, &add_list);
+			if (PCI_SLOT(dev->devfn) == slot->device)
+				acpiphp_native_scan_bridge(dev);
+		}
+		pci_assign_unassigned_bridge_resources(bus->self);
+	} else {
+		LIST_HEAD(add_list);
+		int max, pass;
+
+		acpiphp_rescan_slot(slot);
+		max = acpiphp_max_busnr(bus);
+		for (pass = 0; pass < 2; pass++) {
+			for_each_pci_bridge(dev, bus) {
+				if (PCI_SLOT(dev->devfn) != slot->device)
+					continue;
+
+				max = pci_scan_bridge(bus, dev, max, pass);
+				if (pass && dev->subordinate) {
+					check_hotplug_bridge(slot, dev);
+					pcibios_resource_survey_bus(dev->subordinate);
+					__pci_bus_size_bridges(dev->subordinate,
+							       &add_list);
+				}
 			}
 		}
+		__pci_bus_assign_resources(bus, &add_list, NULL);
 	}
-	__pci_bus_assign_resources(bus, &add_list, NULL);
 
 	acpiphp_sanitize_bus(bus);
 	pcie_bus_configure_settings(bus);
-- 
2.16.1



* [PATCH v3 5/5] ACPI / hotplug / PCI: Mark stale PCI devices disconnected
  2018-02-26 13:21 [PATCH v3 0/5] PCI: Fixes for native PCIe and ACPI hotplug Mika Westerberg
                   ` (3 preceding siblings ...)
  2018-02-26 13:21 ` [PATCH v3 4/5] ACPI / hotplug / PCI: Do not scan all bridges when native PCIe hotplug is used Mika Westerberg
@ 2018-02-26 13:21 ` Mika Westerberg
  2018-03-27 19:45   ` Bjorn Helgaas
  2018-02-26 15:16 ` [PATCH v3 0/5] PCI: Fixes for native PCIe and ACPI hotplug Andy Shevchenko
  2018-03-15  6:00 ` Mika Westerberg
  6 siblings, 1 reply; 34+ messages in thread
From: Mika Westerberg @ 2018-02-26 13:21 UTC (permalink / raw)
  To: Bjorn Helgaas, Rafael J. Wysocki
  Cc: Len Brown, Mario.Limonciello, Michael Jamet, Yehezkel Bernat,
	Mika Westerberg, Andy Shevchenko, linux-pci, linux-acpi

Following pciehp, mark unplugged PCI devices disconnected. This makes
sure PCI core code leaves the now-missing hardware registers alone.
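The marking covers the whole subtree below the unplugged device. A toy
tree model (not the kernel's struct pci_dev or pci_walk_bus()) of that
recursive walk:

```c
#include <assert.h>
#include <stddef.h>

/* Toy device node: first child on the subordinate bus plus a sibling list. */
struct toy_dev {
	int disconnected;
	struct toy_dev *child;
	struct toy_dev *sibling;
};

/* Mark a device and everything below it as gone, like the patch does. */
static void mark_disconnected(struct toy_dev *d)
{
	for (; d; d = d->sibling) {
		d->disconnected = 1;
		mark_disconnected(d->child);
	}
}
```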

Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/pci/hotplug/acpiphp_glue.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/pci/hotplug/acpiphp_glue.c b/drivers/pci/hotplug/acpiphp_glue.c
index 5efa21cdddc9..0aef35ee665a 100644
--- a/drivers/pci/hotplug/acpiphp_glue.c
+++ b/drivers/pci/hotplug/acpiphp_glue.c
@@ -651,6 +651,11 @@ static void trim_stale_devices(struct pci_dev *dev)
 		alive = pci_device_is_present(dev);
 
 	if (!alive) {
+		pci_dev_set_disconnected(dev, NULL);
+		if (pci_has_subordinate(dev))
+			pci_walk_bus(dev->subordinate, pci_dev_set_disconnected,
+				     NULL);
+
 		pci_stop_and_remove_bus_device(dev);
 		if (adev)
 			acpi_bus_trim(adev);
-- 
2.16.1



* Re: [PATCH v3 0/5] PCI: Fixes for native PCIe and ACPI hotplug
  2018-02-26 13:21 [PATCH v3 0/5] PCI: Fixes for native PCIe and ACPI hotplug Mika Westerberg
                   ` (4 preceding siblings ...)
  2018-02-26 13:21 ` [PATCH v3 5/5] ACPI / hotplug / PCI: Mark stale PCI devices disconnected Mika Westerberg
@ 2018-02-26 15:16 ` Andy Shevchenko
  2018-03-15  6:00 ` Mika Westerberg
  6 siblings, 0 replies; 34+ messages in thread
From: Andy Shevchenko @ 2018-02-26 15:16 UTC (permalink / raw)
  To: Mika Westerberg, Bjorn Helgaas, Rafael J. Wysocki
  Cc: Len Brown, Mario.Limonciello, Michael Jamet, Yehezkel Bernat,
	linux-pci, linux-acpi

On Mon, 2018-02-26 at 16:21 +0300, Mika Westerberg wrote:
> Hi,
> 
> When Thunderbolt controller is configured to be in native enumeration mode
> it actually includes two slightly different modes. First there is non-RTD3
> mode where the Thunderbolt host controller is only present when there is a
> device connected. The second one is RTD3 mode where the controller is
> always present.
> 
> In non-RTD3 mode the Thunderbolt host controller (NHI) and USB host (xHCI)
> controller are not hotplugged using native PCIe hotplug but instead they
> will be hotplugged via BIOS triggered ACPI Notify() to the root port. This
> is done to preserve resources since the NHI and xHCI only need 1 MB of MMIO
> space and no additional buses. Currently Linux does not support this very
> well and ends up failing the hotplug in one way or another. More detailed
> explanation is in changelog of patch [4/5].
> 
> This series fixes this issue and in addition includes fixes for few other
> issues found during testing on a system that has Thunderbolt controller in
> non-RTD3 native PCIe enumeration mode. However, the fixes here are not in
> any way Thunderbolt specific and should be applicable to other systems as
> well.
> 

FWIW, 
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>

> The previous versions of the patch series can be found here:
> 
>   v2: https://www.spinics.net/lists/linux-pci/msg69186.html
>   v1: https://www.spinics.net/lists/linux-acpi/msg80607.html
> 
> Changes from v2:
> 
>   - Added Rafael's tag to patch [1/5].
>   - Updated changelog of patch [1/5] to include more details about how the
>     problem can be observed from dmesg and lspci output.
> 
> Changes from v1:
> 
>   - Drop 'cmax - max ?: 1' and use similar construct than we use in second
>     pass loop in patch [1/5].
>   - Drop unnecessary parentheses in patch [1/5].
>   - Added Rafael's tag to patches [2-5/5].
> 
> Mika Westerberg (5):
>   PCI: Make sure all bridges reserve at least one bus number
>   PCI: Take bridge window alignment into account when distributing resources
>   PCI: pciehp: Clear Presence Detect and Data Link Layer Status Changed on resume
>   ACPI / hotplug / PCI: Do not scan all bridges when native PCIe hotplug is used
>   ACPI / hotplug / PCI: Mark stale PCI devices disconnected
> 
>  drivers/pci/hotplug/acpiphp.h      |  1 +
>  drivers/pci/hotplug/acpiphp_glue.c | 78 ++++++++++++++++++++++++++++++--------
>  drivers/pci/hotplug/pciehp.h       |  2 +-
>  drivers/pci/hotplug/pciehp_core.c  |  2 +-
>  drivers/pci/hotplug/pciehp_hpc.c   | 13 ++++++-
>  drivers/pci/probe.c                | 11 ++++--
>  drivers/pci/setup-bus.c            | 41 +++++++++++++++++++-
>  7 files changed, 126 insertions(+), 22 deletions(-)
> 

-- 
Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Intel Finland Oy


* Re: [PATCH v3 0/5] PCI: Fixes for native PCIe and ACPI hotplug
  2018-02-26 13:21 [PATCH v3 0/5] PCI: Fixes for native PCIe and ACPI hotplug Mika Westerberg
                   ` (5 preceding siblings ...)
  2018-02-26 15:16 ` [PATCH v3 0/5] PCI: Fixes for native PCIe and ACPI hotplug Andy Shevchenko
@ 2018-03-15  6:00 ` Mika Westerberg
  2018-03-26 10:01   ` Mika Westerberg
  6 siblings, 1 reply; 34+ messages in thread
From: Mika Westerberg @ 2018-03-15  6:00 UTC (permalink / raw)
  To: Bjorn Helgaas, Rafael J. Wysocki
  Cc: Len Brown, Mario.Limonciello, Michael Jamet, Yehezkel Bernat,
	Andy Shevchenko, linux-pci, linux-acpi

On Mon, Feb 26, 2018 at 04:21:07PM +0300, Mika Westerberg wrote:
> Hi,
> 
> When Thunderbolt controller is configured to be in native enumeration mode
> it actually includes two slightly different modes. First there is non-RTD3
> mode where the Thunderbolt host controller is only present when there is a
> device connected. The second one is RTD3 mode where the controller is
> always present.
> 
> In non-RTD3 mode the Thunderbolt host controller (NHI) and USB host (xHCI)
> controller are not hotplugged using native PCIe hotplug but instead they
> will be hotplugged via BIOS triggered ACPI Notify() to the root port. This
> is done to preserve resources since the NHI and xHCI only need 1 MB of MMIO
> space and no additional buses. Currently Linux does not support this very
> well and ends up failing the hotplug in one way or another. More detailed
> explanation is in changelog of patch [4/5].
> 
> This series fixes this issue and in addition includes fixes for few other
> issues found during testing on a system that has Thunderbolt controller in
> non-RTD3 native PCIe enumeration mode. However, the fixes here are not in
> any way Thunderbolt specific and should be applicable to other systems as
> well.
> 
> The previous versions of the patch series can be found here:
> 
>   v2: https://www.spinics.net/lists/linux-pci/msg69186.html
>   v1: https://www.spinics.net/lists/linux-acpi/msg80607.html
> 
> Changes from v2:
> 
>   - Added Rafael's tag to patch [1/5].
>   - Updated changelog of patch [1/5] to include more details about how the
>     problem can be observed from dmesg and lspci output.
> 
> Changes from v1:
> 
>   - Drop 'cmax - max ?: 1' and use similar construct than we use in second
>     pass loop in patch [1/5].
>   - Drop unnecessary parentheses in patch [1/5].
>   - Added Rafael's tag to patches [2-5/5].

Hi Bjorn,

Do you have any comments on the series? It would be nice to get it
merged for v4.17 if no objections.

Thanks!


* Re: [PATCH v3 0/5] PCI: Fixes for native PCIe and ACPI hotplug
  2018-03-15  6:00 ` Mika Westerberg
@ 2018-03-26 10:01   ` Mika Westerberg
  0 siblings, 0 replies; 34+ messages in thread
From: Mika Westerberg @ 2018-03-26 10:01 UTC (permalink / raw)
  To: Bjorn Helgaas, Rafael J. Wysocki
  Cc: Len Brown, Mario.Limonciello, Michael Jamet, Yehezkel Bernat,
	Andy Shevchenko, linux-pci, linux-acpi

On Thu, Mar 15, 2018 at 08:00:05AM +0200, Mika Westerberg wrote:
> Do you have any comments on the series? It would be nice to get it
> merged for v4.17 if no objections.

Hi Bjorn,

Sorry for pressing you, but this series is needed for any system that has
Thunderbolt in native PCIe enumeration mode. This mode is already used in
some shipping systems, which effectively means that Thunderbolt hotplug
does not currently work properly on them.

I would really appreciate it if you could consider merging this for
v4.17-rc1.

Thanks!

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v3 1/5] PCI: Make sure all bridges reserve at least one bus number
  2018-02-26 13:21 ` [PATCH v3 1/5] PCI: Make sure all bridges reserve at least one bus number Mika Westerberg
@ 2018-03-27 18:57   ` Bjorn Helgaas
  2018-03-28 11:43     ` Mika Westerberg
  0 siblings, 1 reply; 34+ messages in thread
From: Bjorn Helgaas @ 2018-03-27 18:57 UTC (permalink / raw)
  To: Mika Westerberg
  Cc: Bjorn Helgaas, Rafael J. Wysocki, Len Brown, Mario.Limonciello,
	Michael Jamet, Yehezkel Bernat, Andy Shevchenko, linux-pci,
	linux-acpi

On Mon, Feb 26, 2018 at 04:21:08PM +0300, Mika Westerberg wrote:
> When distributing extra buses between hotplug bridges we need to make
> sure each bridge reserves at least one bus number, even if there is
> currently nothing connected to it. For instance, ACPI hotplug may bring
> in additional devices to non-hotplug bridges later on.

I guess you mean ACPI hotplug can add devices below bridges that have
"bridge->is_hotplug_bridge == 0"?  Why don't we set is_hotplug_bridge
in that case?  I do see that acpiphp sets it in *some* cases (see
check_hotplug_bridge()).  Are we missing some case?

> Here is what happens on one system when a Thunderbolt device is plugged in:
> 
>   pci 0000:01:00.0: PCI bridge to [bus 02-39]
>   ...
>   pci_bus 0000:04: [bus 04-39] extended by 0x35
>   pci_bus 0000:04: bus scan returning with max=39
>   pci_bus 0000:04: busn_res: [bus 04-39] end is updated to 39
>   pci 0000:02:02.0: scanning [bus 00-00] behind bridge, pass 1
>   pci_bus 0000:3a: scanning bus
>   pci_bus 0000:3a: bus scan returning with max=3a
>   pci_bus 0000:3a: busn_res: [bus 3a] end is updated to 3a
>   pci_bus 0000:3a: [bus 3a] partially hidden behind bridge 0000:02 [bus 02-39]
>   pci_bus 0000:3a: [bus 3a] partially hidden behind bridge 0000:01 [bus 01-39]
>   pci_bus 0000:02: bus scan returning with max=3a
>   pci_bus 0000:02: busn_res: [bus 02-39] end can not be updated to 3a
> 
> Resulting 'lspci -t' output looks like this:
> 
>   +-1b.0-[01-39]----00.0-[02-3a]--+-00.0-[03]----00.0
>                                   +-01.0-[04-39]--
>                                   \-02.0-[3a]----00.0
> 
> The device behind downstream port at 02:02 is the integrated xHCI (USB 3
> host controller) and is not fully accessible because the hotplug bridge
> is reserving too many bus numbers.

Thanks for the details here, but I can't tell what happened before and
was broken, vs. what happens now.  Which is the hotplug bridge?  Which
is the Thunderbolt controller?

I guess 02:01.0 must be the bridge consuming too many bus numbers
([bus 04-39])?

And 02:02.0 might be the Thunderbolt controller that wants to use bus
3a?  But obviously that won't work because 1b.0 doesn't route things
to bus 3a, since it only consumes [bus 01-39].

(The device behind 02:02.0 is more than just "not fully accessible" --
it's not accessible via config space *at all*.)

I guess the 'lspci -t' above must be without this patch, and with this
patch, we'd have

  pci 0000:02:00.0: PCI bridge to [bus 03]
  pci 0000:02:01.0: PCI bridge to [bus 04-38]
  pci 0000:02:02.0: PCI bridge to [bus 39]

This patch might fix the situation for simple hot-added devices, but
won't we have the same problem again if we hot-add a bridge?  It seems
like we need a more comprehensive solution.  I don't mean we need to
go whole hog and reassign everybody's bus numbers dynamically, but we
ought to at least be able to notice the situation, decline to enable
the bridge leading to devices we can't reach, and give a meaningful
error message.

Nit unrelated to this patch: "bridge 0000:02" is not a bridge, it's a
bus.  Apparently bus 3a is hidden because 1b.0's subordinate bus is
39.

> To make sure we don't run out of bus numbers for non-hotplug bridges,
> reserve one bus number for them upfront before distributing buses for
> hotplug bridges.
> 
> Fixes: 1c02ea810065 ("PCI: Distribute available buses to hotplug-capable bridges")
> Reported-by: Mario Limonciello <mario.limonciello@dell.com>
> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> Cc: stable@vger.kernel.org
> ---
>  drivers/pci/probe.c | 11 ++++++++---
>  1 file changed, 8 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> index ef5377438a1e..6cefd47556e3 100644
> --- a/drivers/pci/probe.c
> +++ b/drivers/pci/probe.c
> @@ -2561,7 +2561,10 @@ static unsigned int pci_scan_child_bus_extend(struct pci_bus *bus,
>  	for_each_pci_bridge(dev, bus) {
>  		cmax = max;
>  		max = pci_scan_bridge_extend(bus, dev, max, 0, 0);
> -		used_buses += cmax - max;
> +		/* Reserve one bus for each bridge */
> +		used_buses++;
> +		if (cmax - max > 1)
> +			used_buses += cmax - max - 1;

Sorry, this should be trivial, but I'm having a hard time wrapping my
mind around it.

AFAICT, "cmax" is the highest known bus number below this bus, "max"
is the highest bus number below "dev" (one of the bridges on "bus").

I assume "used_buses++" accounts for the fact that every enabled
bridge must consume one bus number for its secondary side.

And I guess "used_buses += cmax - max - 1" adds in the bus numbers
downstream from "dev" (subtracting the one used for its secondary
bus)?

pci_scan_bridge_extend() seems to return something related to the
number of bus numbers used below "dev".  Why doesn't *it* account for
the secondary bus number of "dev"?

It might help if the pci_scan_bridge_extend() function comment were
extended to say what it actually returns.

>  	}
>  
>  	/* Scan bridges that need to be reconfigured */
> @@ -2584,12 +2587,14 @@ static unsigned int pci_scan_child_bus_extend(struct pci_bus *bus,
>  			 * bridges if any.
>  			 */
>  			buses = available_buses / hotplug_bridges;
> -			buses = min(buses, available_buses - used_buses);
> +			buses = min(buses, available_buses - used_buses + 1);
>  		}
>  
>  		cmax = max;
>  		max = pci_scan_bridge_extend(bus, dev, cmax, buses, 1);
> -		used_buses += max - cmax;
> +		/* One bus is already accounted so don't add it again */
> +		if (max - cmax > 1)
> +			used_buses += max - cmax - 1;
>  	}
>  
>  	/*
> -- 
> 2.16.1
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v3 2/5] PCI: Take bridge window alignment into account when distributing resources
  2018-02-26 13:21 ` [PATCH v3 2/5] PCI: Take bridge window alignment into account when distributing resources Mika Westerberg
@ 2018-03-27 19:23   ` Bjorn Helgaas
  2018-03-28 12:01     ` Mika Westerberg
  0 siblings, 1 reply; 34+ messages in thread
From: Bjorn Helgaas @ 2018-03-27 19:23 UTC (permalink / raw)
  To: Mika Westerberg
  Cc: Bjorn Helgaas, Rafael J. Wysocki, Len Brown, Mario.Limonciello,
	Michael Jamet, Yehezkel Bernat, Andy Shevchenko, linux-pci,
	linux-acpi

On Mon, Feb 26, 2018 at 04:21:09PM +0300, Mika Westerberg wrote:
> When connecting a Thunderbolt endpoint which itself has an internal PCIe
> switch (not the integrated one), the way we currently distribute

I don't understand this distinction between "internal PCIe switch" and
"integrated one".

From a PCI point of view, I assume you're adding a switch with 4 NIC
endpoints below it.

> resources does not always work well because devices below non-hotplug
> bridges might need to have their MMIO resources aligned to something
> other than a 1 MB boundary. As an example, connecting a server-grade
> 4-port GbE add-in card adds another PCIe switch leading to GbE devices
> that want their MMIO resources aligned to a 2 MB boundary instead.

Is there something unique about Thunderbolt here?  I assume from a PCI
point of view, these NIC ports have BARs that are 2 MB or larger and
hence need 2 MB or larger alignment?

> Current resource distribution code does not take this alignment into
> account and might try to add too many resources to the extension
> hotplug bridges. The resulting bridge window is then too big, which makes
> Linux re-allocate a minimal amount of resources, making future
> extension impossible.

Maybe a concrete example would make this clearer to me.

> Make this work better by subtracting the properly aligned non-hotplug
> downstream bridge window size from the remaining resources.
> 
> Fixes: 1a5767725cec ("PCI: Distribute available resources to hotplug-capable bridges")
> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> Cc: stable@vger.kernel.org
> ---
>  drivers/pci/setup-bus.c | 41 ++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 40 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
> index 3cce29a069e6..f1e6172734f0 100644
> --- a/drivers/pci/setup-bus.c
> +++ b/drivers/pci/setup-bus.c
> @@ -1882,6 +1882,7 @@ static void pci_bus_distribute_available_resources(struct pci_bus *bus,
>  	resource_size_t available_mmio, resource_size_t available_mmio_pref)
>  {
>  	resource_size_t remaining_io, remaining_mmio, remaining_mmio_pref;
> +	resource_size_t io_start, mmio_start, mmio_pref_start;
>  	unsigned int normal_bridges = 0, hotplug_bridges = 0;
>  	struct resource *io_res, *mmio_res, *mmio_pref_res;
>  	struct pci_dev *dev, *bridge = bus->self;
> @@ -1946,11 +1947,16 @@ static void pci_bus_distribute_available_resources(struct pci_bus *bus,
>  			remaining_mmio_pref -= resource_size(res);
>  	}
>  
> +	io_start = io_res->start;
> +	mmio_start = mmio_res->start;
> +	mmio_pref_start = mmio_pref_res->start;
> +
>  	/*
>  	 * Go over devices on this bus and distribute the remaining
>  	 * resource space between hotplug bridges.
>  	 */
>  	for_each_pci_bridge(dev, bus) {
> +		resource_size_t align;
>  		struct pci_bus *b;
>  
>  		b = dev->subordinate;
> @@ -1968,7 +1974,7 @@ static void pci_bus_distribute_available_resources(struct pci_bus *bus,
>  				available_io, available_mmio,
>  				available_mmio_pref);
>  		} else if (dev->is_hotplug_bridge) {
> -			resource_size_t align, io, mmio, mmio_pref;
> +			resource_size_t io, mmio, mmio_pref;
>  
>  			/*
>  			 * Distribute available extra resources equally
> @@ -1981,11 +1987,13 @@ static void pci_bus_distribute_available_resources(struct pci_bus *bus,
>  			io = div64_ul(available_io, hotplug_bridges);
>  			io = min(ALIGN(io, align), remaining_io);
>  			remaining_io -= io;
> +			io_start += io;
>  
>  			align = pci_resource_alignment(bridge, mmio_res);
>  			mmio = div64_ul(available_mmio, hotplug_bridges);
>  			mmio = min(ALIGN(mmio, align), remaining_mmio);
>  			remaining_mmio -= mmio;
> +			mmio_start += mmio;
>  
>  			align = pci_resource_alignment(bridge, mmio_pref_res);
>  			mmio_pref = div64_ul(available_mmio_pref,
> @@ -1993,9 +2001,40 @@ static void pci_bus_distribute_available_resources(struct pci_bus *bus,
>  			mmio_pref = min(ALIGN(mmio_pref, align),
>  					remaining_mmio_pref);
>  			remaining_mmio_pref -= mmio_pref;
> +			mmio_pref_start += mmio_pref;
>  
>  			pci_bus_distribute_available_resources(b, add_list, io,
>  							       mmio, mmio_pref);
> +		} else {
> +			/*
> +			 * For normal bridges, track start of the parent
> +			 * bridge window to make sure we align the
> +			 * remaining space which is distributed to the
> +			 * hotplug bridges properly.
> +			 */
> +			resource_size_t aligned;
> +			struct resource *res;
> +
> +			res = &dev->resource[PCI_BRIDGE_RESOURCES + 0];
> +			io_start += resource_size(res);
> +			aligned = ALIGN(io_start,
> +					pci_resource_alignment(dev, res));
> +			if (aligned > io_start)
> +				remaining_io -= aligned - io_start;
> +
> +			res = &dev->resource[PCI_BRIDGE_RESOURCES + 1];
> +			mmio_start += resource_size(res);
> +			aligned = ALIGN(mmio_start,
> +					pci_resource_alignment(dev, res));
> +			if (aligned > mmio_start)
> +				remaining_mmio -= aligned - mmio_start;
> +
> +			res = &dev->resource[PCI_BRIDGE_RESOURCES + 2];
> +			mmio_pref_start += resource_size(res);
> +			aligned = ALIGN(mmio_pref_start,
> +					pci_resource_alignment(dev, res));
> +			if (aligned > mmio_pref_start)
> +				remaining_mmio_pref -= aligned - mmio_pref_start;

I don't understand how this can work.  You have this:

  for_each_pci_bridge(dev, bus) {
    if (!hotplug_bridges && normal_bridges == 1) {
      pci_bus_distribute_available_resources(...);   # block 1
    } else if (dev->is_hotplug_bridge) {
      ...                                            # block 2
      pci_bus_distribute_available_resources(...);
    } else {
      ...                                            # block 3
    }
  }

This patch adds block 3.  Block 3 doesn't do anything with side
effects, i.e., it only updates local variables like io_start,
remaining_io, mmio_start, remaining_mmio, etc.

Those may be used in subsequent iterations, but it's only useful if
there *are* subsequent iterations, so this is dependent on the order
we iterate through the bridges.  It seems like we want the same
results no matter what the order.

>  		}
>  	}
>  }
> -- 
> 2.16.1
> 

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v3 5/5] ACPI / hotplug / PCI: Mark stale PCI devices disconnected
  2018-02-26 13:21 ` [PATCH v3 5/5] ACPI / hotplug / PCI: Mark stale PCI devices disconnected Mika Westerberg
@ 2018-03-27 19:45   ` Bjorn Helgaas
  2018-03-27 20:52     ` Lukas Wunner
  0 siblings, 1 reply; 34+ messages in thread
From: Bjorn Helgaas @ 2018-03-27 19:45 UTC (permalink / raw)
  To: Mika Westerberg
  Cc: Bjorn Helgaas, Rafael J. Wysocki, Len Brown, Mario.Limonciello,
	Michael Jamet, Yehezkel Bernat, Andy Shevchenko, linux-pci,
	linux-acpi

On Mon, Feb 26, 2018 at 04:21:12PM +0300, Mika Westerberg wrote:
> Following pciehp, mark unplugged PCI devices disconnected. This makes
> sure the PCI core code leaves the now-missing hardware registers alone.
> 
> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
>  drivers/pci/hotplug/acpiphp_glue.c | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/drivers/pci/hotplug/acpiphp_glue.c b/drivers/pci/hotplug/acpiphp_glue.c
> index 5efa21cdddc9..0aef35ee665a 100644
> --- a/drivers/pci/hotplug/acpiphp_glue.c
> +++ b/drivers/pci/hotplug/acpiphp_glue.c
> @@ -651,6 +651,11 @@ static void trim_stale_devices(struct pci_dev *dev)
>  		alive = pci_device_is_present(dev);
>  
>  	if (!alive) {
> +		pci_dev_set_disconnected(dev, NULL);
> +		if (pci_has_subordinate(dev))
> +			pci_walk_bus(dev->subordinate, pci_dev_set_disconnected,
> +				     NULL);

I hate this pci_dev_set_disconnected() thing and hope to remove it
someday.  It's racy and adds additional states that are hard to
manage, e.g., "device has been removed and will not respond to PCI
accesses, but pci_dev_set_disconnected() hasn't been called yet".  If
we can handle that case (and we should), we shouldn't need
pci_dev_set_disconnected().

But I guess I accepted it other places, so now I have to accept it
here.

I do wonder whether it should be factored into a
"pci_disconnect_dev()" that does both the device itself and the bus
walk.

And I wonder whether that should be called from
pci_stop_and_remove_bus_device(), since some callers do this and
others don't.

> +
>  		pci_stop_and_remove_bus_device(dev);
>  		if (adev)
>  			acpi_bus_trim(adev);
> -- 
> 2.16.1
> 

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v3 5/5] ACPI / hotplug / PCI: Mark stale PCI devices disconnected
  2018-03-27 19:45   ` Bjorn Helgaas
@ 2018-03-27 20:52     ` Lukas Wunner
  2018-03-27 22:13       ` Bjorn Helgaas
  0 siblings, 1 reply; 34+ messages in thread
From: Lukas Wunner @ 2018-03-27 20:52 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Mika Westerberg, Bjorn Helgaas, Rafael J. Wysocki, Len Brown,
	Mario.Limonciello, Michael Jamet, Yehezkel Bernat,
	Andy Shevchenko, linux-pci, linux-acpi

On Tue, Mar 27, 2018 at 02:45:12PM -0500, Bjorn Helgaas wrote:
> I hate this pci_dev_set_disconnected() thing and hope to remove it
> someday.  It's racy and adds additional states that are hard to
> manage, e.g., "device has been removed and will not respond to PCI
> accesses, but pci_dev_set_disconnected() hasn't been called yet".  If
> we can handle that case (and we should), we shouldn't need
> pci_dev_set_disconnected().

Absent pci_dev_is_disconnected(), how can the PCI core detect if
a device is still there (versus has been hot-removed)?  The only
valid method I know of is reading the vendor ID from config space.
However, reading config space takes time.  To properly handle
surprise removal, we ought to check presence of the device frequently
lest we clutter the logs with errors, unnecessarily occupy the CPU
and bus or in some cases hang the machine.  Ideally, we should check
presence of the device every time we access it.

Now, imagine every time we access config or mmio space, we read the
vendor ID from config space, so we essentially double the number of
accesses to the device.  Does that make sense, performance-wise?

If pci_dev_is_disconnected() returns true, the device is either there
or was until very recently, so in general checking that instead of
reading the vendor ID should be sufficient and should result in
acceptable performance.

I keep hearing these complaints about the PCI_DEV_DISCONNECTED flag,
from Greg but now also from you, but I'm not seeing constructive
criticism how checking presence of a device should be handled instead,
in a way that doesn't negatively impact performance.

IMO it was a mistake to constrain visibility of the flag to the PCI core
(at Greg's behest), it should have been made available to drivers so
they're afforded a better method to detect surprise removal than just
reading the vendor ID every time they access the device.

Thanks,

Lukas

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v3 5/5] ACPI / hotplug / PCI: Mark stale PCI devices disconnected
  2018-03-27 20:52     ` Lukas Wunner
@ 2018-03-27 22:13       ` Bjorn Helgaas
  2018-03-28  6:25         ` Greg Kroah-Hartman
  0 siblings, 1 reply; 34+ messages in thread
From: Bjorn Helgaas @ 2018-03-27 22:13 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: Mika Westerberg, Bjorn Helgaas, Rafael J. Wysocki, Len Brown,
	Mario.Limonciello, Michael Jamet, Yehezkel Bernat,
	Andy Shevchenko, linux-pci, linux-acpi, Greg Kroah-Hartman

[+cc Greg]

On Tue, Mar 27, 2018 at 10:52:36PM +0200, Lukas Wunner wrote:
> On Tue, Mar 27, 2018 at 02:45:12PM -0500, Bjorn Helgaas wrote:
> > I hate this pci_dev_set_disconnected() thing and hope to remove it
> > someday.  It's racy and adds additional states that are hard to
> > manage, e.g., "device has been removed and will not respond to PCI
> > accesses, but pci_dev_set_disconnected() hasn't been called yet".  If
> > we can handle that case (and we should), we shouldn't need
> > pci_dev_set_disconnected().
> 
> Absent pci_dev_is_disconnected(), how can the PCI core detect if
> a device is still there (versus has been hot-removed)?  The only
> valid method I know of is reading the vendor ID from config space.
> However, reading config space takes time.  To properly handle
> surprise removal, we ought to check presence of the device frequently
> lest we clutter the logs with errors, unnecessarily occupy the CPU
> and bus or in some cases hang the machine.  Ideally, we should check
> presence of the device every time we access it.

I'm not proposing a change in pci_dev_is_disconnected() for *this*
patch, so we can have a discussion here, but I don't think resolving
this is a precondition for this patch.

I was just expressing remorse, which I agree is not constructive, so I
should have kept quiet.  I know you understand the problem, but to try
to make amends, I'll recap it here for any onlookers:

  - driver calls pci_dev_is_disconnected(), learns device is present
  - device is removed
  - driver reads register, believing device is still present

Obviously the register read fails, because the device is gone.  So
the driver has to be prepared for that failure regardless of whether
we have pci_dev_is_disconnected().

> Now, imagine every time we access config or mmio space, we read the
> vendor ID from config space, so we essentially double the number of
> accesses to the device.  Does that make sense, performance-wise?

No, but I wouldn't advocate that strategy.

> If pci_dev_is_disconnected() returns true, the device is either there
> or was until very recently, so in general checking that instead of
> reading the vendor ID should be sufficient and should result in
> acceptable performance.
> 
> I keep hearing these complaints about the PCI_DEV_DISCONNECTED flag,
> from Greg but now also from you, but I'm not seeing constructive
> criticism how checking presence of a device should be handled instead,
> in a way that doesn't negatively impact performance.
> 
> IMO it was a mistake to constrain visibility of the flag to the PCI core
> (at Greg's behest), it should have been made available to drivers so
> they're afforded a better method to detect surprise removal than just
> reading the vendor ID every time they access the device.

To avoid the race, drivers have to check for a valid value anyway.  If
they *also* check pci_dev_is_disconnected(), that clutters the code
and gives a false sense of security.

When a read fails, it normally returns ~0 as the data to the driver. 
The driver knows the semantics of the registers it reads, and often it
can tell immediately that a successful read of the register could
never return that value, so it doesn't need to try any more accesses
to Vendor ID or anything else.

If ~0 *is* a possible valid value, the driver can read some other
register (maybe Vendor ID, or maybe an MMIO register that can be read
faster).

Bjorn

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v3 5/5] ACPI / hotplug / PCI: Mark stale PCI devices disconnected
  2018-03-27 22:13       ` Bjorn Helgaas
@ 2018-03-28  6:25         ` Greg Kroah-Hartman
  0 siblings, 0 replies; 34+ messages in thread
From: Greg Kroah-Hartman @ 2018-03-28  6:25 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Lukas Wunner, Mika Westerberg, Bjorn Helgaas, Rafael J. Wysocki,
	Len Brown, Mario.Limonciello, Michael Jamet, Yehezkel Bernat,
	Andy Shevchenko, linux-pci, linux-acpi

On Tue, Mar 27, 2018 at 05:13:17PM -0500, Bjorn Helgaas wrote:
> > If pci_dev_is_disconnected() returns true, the device is either there
> > or was until very recently, so in general checking that instead of
> > reading the vendor ID should be sufficient and should result in
> > acceptable performance.
> > 
> > I keep hearing these complaints about the PCI_DEV_DISCONNECTED flag,
> > from Greg but now also from you, but I'm not seeing constructive
> > criticism how checking presence of a device should be handled instead,
> > in a way that doesn't negatively impact performance.

You can't have things both ways.  If you are worried about your device
going away (and you have to), then just check all reads and be fine with
it.  If you have values that can be all 0xff, then just accept that as a
valid value and move to the next read where it can't be valid.

Don't try to be smart and constantly read other registers like the
vendor id, because you will still race with reading them as well (and
some devices might not like that, as this is not a codepath that any
device expects to be "fast", so odds are the silicon isn't optimised for
that at all).

> > IMO it was a mistake to constrain visibility of the flag to the PCI core
> > (at Greg's behest), it should have been made available to drivers so
> > they're afforded a better method to detect surprise removal than just
> > reading the vendor ID every time they access the device.

No, just check the value you read and move on.  This is something
we have had to handle for 20 years now, it's not a new thing at all, why
is this even something to be argued about?  Is it that difficult for
your driver to check this and handle the removal problem gracefully?

> To avoid the race, drivers have to check for a valid value anyway.  If
> they *also* check pci_dev_is_disconnected(), that clutters the code
> and gives a false sense of security.
> 
> When a read fails, it normally returns ~0 as the data to the driver. 
> The driver knows the semantics of the registers it reads, and often it
> can tell immediately that a successful read of the register could
> never return that value, so it doesn't need to try any more accesses
> to Vendor ID or anything else.
> 
> If ~0 *is* a possible valid value, the driver can read some other
> register (maybe Vendor ID, or maybe an MMIO register that can be read
> faster).

I totally agree with Bjorn here.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v3 1/5] PCI: Make sure all bridges reserve at least one bus number
  2018-03-27 18:57   ` Bjorn Helgaas
@ 2018-03-28 11:43     ` Mika Westerberg
  2018-03-28 18:09       ` Bjorn Helgaas
  0 siblings, 1 reply; 34+ messages in thread
From: Mika Westerberg @ 2018-03-28 11:43 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Bjorn Helgaas, Rafael J. Wysocki, Len Brown, Mario.Limonciello,
	Michael Jamet, Yehezkel Bernat, Andy Shevchenko, linux-pci,
	linux-acpi

On Tue, Mar 27, 2018 at 01:57:42PM -0500, Bjorn Helgaas wrote:
> On Mon, Feb 26, 2018 at 04:21:08PM +0300, Mika Westerberg wrote:
> > When distributing extra buses between hotplug bridges we need to make
> > sure each bridge reserve at least one bus number, even if there is
> > currently nothing connected to it. For instance ACPI hotplug may bring
> > in additional devices to non-hotplug bridges later on.
> 
> I guess you mean ACPI hotplug can add devices below bridges that have
> "bridge->is_hotplug_bridge == 0"?  Why don't we set is_hotplug_bridge
> in that case?  I do see that acpiphp sets it in *some* cases (see
> check_hotplug_bridge()).  Are we missing some case?

We don't know upfront that these ports are going to get devices below
them. The only thing that happens in these cases is that we may get an
ACPI Notify() to the root port leading to these ports.

Also, the allocation strategy we use is based on ->is_hotplug_bridge ==
1.  Those bridges will be assigned all the remaining bus space and
resources. If we somehow set ->is_hotplug_bridge == 1 for these
non-hotplug ports, it means that we now include those ports when
resources are distributed, which defeats the reason ACPI Notify() is
used there in the first place (to preserve bus numbers).

> > Here is what happens on one system when a Thunderbolt device is plugged in:
> > 
> >   pci 0000:01:00.0: PCI bridge to [bus 02-39]
> >   ...
> >   pci_bus 0000:04: [bus 04-39] extended by 0x35
> >   pci_bus 0000:04: bus scan returning with max=39
> >   pci_bus 0000:04: busn_res: [bus 04-39] end is updated to 39
> >   pci 0000:02:02.0: scanning [bus 00-00] behind bridge, pass 1
> >   pci_bus 0000:3a: scanning bus
> >   pci_bus 0000:3a: bus scan returning with max=3a
> >   pci_bus 0000:3a: busn_res: [bus 3a] end is updated to 3a
> >   pci_bus 0000:3a: [bus 3a] partially hidden behind bridge 0000:02 [bus 02-39]
> >   pci_bus 0000:3a: [bus 3a] partially hidden behind bridge 0000:01 [bus 01-39]
> >   pci_bus 0000:02: bus scan returning with max=3a
> >   pci_bus 0000:02: busn_res: [bus 02-39] end can not be updated to 3a
> > 
> > Resulting 'lspci -t' output looks like this:
> > 
> >   +-1b.0-[01-39]----00.0-[02-3a]--+-00.0-[03]----00.0
> >                                   +-01.0-[04-39]--
> >                                   \-02.0-[3a]----00.0
> > 
> > The device behind downstream port at 02:02 is the integrated xHCI (USB 3
> > host controller) and is not fully accessible because the hotplug bridge
> > is reserving too many bus numbers.
> 
> Thanks for the details here, but I can't tell what happened before and
> was broken, vs. what happens now.  Which is the hotplug bridge?  Which
> is the Thunderbolt controller?
> 
> I guess 02:01.0 must be the bridge consuming too many bus numbers
> ([bus 04-39])?

Yes, that's correct.

> And 02:02.0 might be the Thunderbolt controller that wants to use bus
> 3a?  But obviously that won't work because 1b.0 doesn't route things
> to bus 3a, since it only consumes [bus 01-39].

In fact 02:02 leads to the xHCI controller, which in this case is
inaccessible. This means that currently, when you plug a USB 3 device
into a system with this setup, it won't work.

> (The device behind 02:02.0 is more than just "not fully accessible" --
> it's not accessible via config space *at all*.)

Right.

> I guess the 'lspci -t' above must be without this patch, and with this
> patch, we'd have
> 
>   pci 0000:02:00.0: PCI bridge to [bus 03]
>   pci 0000:02:01.0: PCI bridge to [bus 04-38]
>   pci 0000:02:02.0: PCI bridge to [bus 39]

That's correct.

Do you want me to amend the changelog to include this information as
well?

> This patch might fix the situation for simple hot-added devices, but
> won't we have the same problem again if we hot-add a bridge?  It seems
> like we need a more comprehensive solution.  I don't mean we need to
> go whole hog and reassign everybody's bus numbers dynamically, but we
> ought to at least be able to notice the situation, decline to enable
> the bridge leading to devices we can't reach, and give a meaningful
> error message.

The problem is that you don't know upfront what is going to be
hotplugged, so it is hard to guess how many buses to reserve
there. Doing it afterwards is not going to work either, given how we
scan and add devices, short of rewriting the whole scanning logic.

The strategy we use here is the same one Windows uses: reserve one
bus for these bridges, just in case ACPI Notify() brings in a new
device. This is a kind of special case used to hotplug the TBT and xHCI
controllers (not bridges). It will not work properly if you hotplug a
bridge, but at least it works better than just failing miserably.

If you have better ideas how to handle this, I'm all ears :)

The sanity check at the end of pci_scan_bridge_extend() already detects
cases where things went wrong:

   pci_bus 0000:3a: [bus 3a] partially hidden behind bridge 0000:02 [bus 02-39]
   pci_bus 0000:3a: [bus 3a] partially hidden behind bridge 0000:01 [bus 01-39]

I'm not sure whether we want to add more error prints to confuse users.

(We actually look for these strings in our test automation for native
 PCIe hotplug).

> Nit unrelated to this patch: "bridge 0000:02" is not a bridge, it's a
> bus.  Apparently bus 3a is hidden because 1b.0's subordinate bus is
> 39.

Indeed.

> > To make sure we don't run out of bus numbers for non-hotplug bridges reserve
> > one bus number for them upfront before distributing buses for hotplug bridges.
> > 
> > Fixes: 1c02ea810065 ("PCI: Distribute available buses to hotplug-capable bridges")
> > Reported-by: Mario Limonciello <mario.limonciello@dell.com>
> > Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
> > Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > Cc: stable@vger.kernel.org
> > ---
> >  drivers/pci/probe.c | 11 ++++++++---
> >  1 file changed, 8 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> > index ef5377438a1e..6cefd47556e3 100644
> > --- a/drivers/pci/probe.c
> > +++ b/drivers/pci/probe.c
> > @@ -2561,7 +2561,10 @@ static unsigned int pci_scan_child_bus_extend(struct pci_bus *bus,
> >  	for_each_pci_bridge(dev, bus) {
> >  		cmax = max;
> >  		max = pci_scan_bridge_extend(bus, dev, max, 0, 0);
> > -		used_buses += cmax - max;
> > +		/* Reserve one bus for each bridge */
> > +		used_buses++;
> > +		if (cmax - max > 1)
> > +			used_buses += cmax - max - 1;
> 
> Sorry, this should be trivial, but I'm having a hard time wrapping my
> mind around it.

I don't blame you, it is getting quite complex. That's why I added the
comments there, hopefully making it easier to understand.

> AFAICT, "cmax" is the highest known bus number below this bus, "max"
> is the highest bus number below "dev" (one of the bridges on "bus").
> 
> I assume "used_buses++" accounts for the fact that every enabled
> bridge must consume one bus number for its secondary side.

Exactly.

> And I guess "used_buses += cmax - max - 1" adds in the bus numbers
> downstream from "dev" (subtracting the one used for its secondary
> bus)?

That's also correct.

> pci_scan_bridge_extend() seems to return something related to the
> number of bus numbers used below "dev".  Why doesn't *it* account for
> the secondary bus number of "dev"?

It returns the new "max", i.e. the maximum subordinate bus number the bridge occupies.

The reason we handle this one in pci_scan_child_bus_extend() instead is
that the bus number distribution logic then stays pretty much in a
single function, making it easier to understand what happens (well, it is
getting quite complex but I still think it makes sense that way). If you
insist, I can move it to pci_scan_bridge_extend() instead.

> It might help if the pci_scan_bridge_extend() function comment were
> extended to say what it actually returns.

I can make a separate patch adding a comment about the return value. These
functions all return the new "max", but I suppose it makes sense to document it.

> >  	}
> >  
> >  	/* Scan bridges that need to be reconfigured */
> > @@ -2584,12 +2587,14 @@ static unsigned int pci_scan_child_bus_extend(struct pci_bus *bus,
> >  			 * bridges if any.
> >  			 */
> >  			buses = available_buses / hotplug_bridges;
> > -			buses = min(buses, available_buses - used_buses);
> > +			buses = min(buses, available_buses - used_buses + 1);
> >  		}
> >  
> >  		cmax = max;
> >  		max = pci_scan_bridge_extend(bus, dev, cmax, buses, 1);
> > -		used_buses += max - cmax;
> > +		/* One bus is already accounted so don't add it again */
> > +		if (max - cmax > 1)
> > +			used_buses += max - cmax - 1;
> >  	}
> >  
> >  	/*
> > -- 
> > 2.16.1
> > 

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v3 2/5] PCI: Take bridge window alignment into account when distributing resources
  2018-03-27 19:23   ` Bjorn Helgaas
@ 2018-03-28 12:01     ` Mika Westerberg
  2018-03-29 21:29       ` Bjorn Helgaas
  0 siblings, 1 reply; 34+ messages in thread
From: Mika Westerberg @ 2018-03-28 12:01 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Bjorn Helgaas, Rafael J. Wysocki, Len Brown, Mario.Limonciello,
	Michael Jamet, Yehezkel Bernat, Andy Shevchenko, linux-pci,
	linux-acpi

On Tue, Mar 27, 2018 at 02:23:18PM -0500, Bjorn Helgaas wrote:
> On Mon, Feb 26, 2018 at 04:21:09PM +0300, Mika Westerberg wrote:
> > When connecting a Thunderbolt endpoint which itself has internal PCIe
> > switch (not the integrated one) the way we currently distribute
> 
> I don't understand this distinction between "internal PCIe switch" and
> "integrated one".

I tried to explain here that the Thunderbolt endpoint itself has an
internal (integrated) PCIe switch.

> >From a PCI point of view, I assume you're adding a switch with 4 NIC
> endpoints below it.

Yes, exactly.

The PCIe device in question is Intel Gigabit ET2 Quad Port Server
adapter.

> > resources does not always work well because devices below non-hotplug
> > bridges might need to have their MMIO resources aligned to something
> > other than a 1 MB boundary. As an example, connecting a server-grade
> > 4-port GbE add-in card adds another PCIe switch leading to GbE devices
> > that want to have their MMIO resources aligned to a 2 MB boundary instead.
> 
> Is there something unique about Thunderbolt here?  I assume from a PCI
> point of view, these NIC ports have BARs that are 2 MB or larger and
> hence need 2 MB or larger alignment?

No, this is not unique to Thunderbolt in any way.

> > The current resource distribution code does not take this alignment into
> > account and might try to add too many resources for the extension
> > hotplug bridges. The resulting bridge window is then too big, which makes
> > Linux re-allocate a minimal amount of resources, making future
> > extension impossible.
> 
> Maybe a concrete example would make this clearer to me.

OK.

> > Make this work better by subtracting the properly aligned non-hotplug
> > downstream bridge window size from the remaining resources.
> > 
> > Fixes: 1a5767725cec ("PCI: Distribute available resources to hotplug-capable bridges")
> > Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
> > Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > Cc: stable@vger.kernel.org
> > ---
> >  drivers/pci/setup-bus.c | 41 ++++++++++++++++++++++++++++++++++++++++-
> >  1 file changed, 40 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
> > index 3cce29a069e6..f1e6172734f0 100644
> > --- a/drivers/pci/setup-bus.c
> > +++ b/drivers/pci/setup-bus.c
> > @@ -1882,6 +1882,7 @@ static void pci_bus_distribute_available_resources(struct pci_bus *bus,
> >  	resource_size_t available_mmio, resource_size_t available_mmio_pref)
> >  {
> >  	resource_size_t remaining_io, remaining_mmio, remaining_mmio_pref;
> > +	resource_size_t io_start, mmio_start, mmio_pref_start;
> >  	unsigned int normal_bridges = 0, hotplug_bridges = 0;
> >  	struct resource *io_res, *mmio_res, *mmio_pref_res;
> >  	struct pci_dev *dev, *bridge = bus->self;
> > @@ -1946,11 +1947,16 @@ static void pci_bus_distribute_available_resources(struct pci_bus *bus,
> >  			remaining_mmio_pref -= resource_size(res);
> >  	}
> >  
> > +	io_start = io_res->start;
> > +	mmio_start = mmio_res->start;
> > +	mmio_pref_start = mmio_pref_res->start;
> > +
> >  	/*
> >  	 * Go over devices on this bus and distribute the remaining
> >  	 * resource space between hotplug bridges.
> >  	 */
> >  	for_each_pci_bridge(dev, bus) {
> > +		resource_size_t align;
> >  		struct pci_bus *b;
> >  
> >  		b = dev->subordinate;
> > @@ -1968,7 +1974,7 @@ static void pci_bus_distribute_available_resources(struct pci_bus *bus,
> >  				available_io, available_mmio,
> >  				available_mmio_pref);
> >  		} else if (dev->is_hotplug_bridge) {
> > -			resource_size_t align, io, mmio, mmio_pref;
> > +			resource_size_t io, mmio, mmio_pref;
> >  
> >  			/*
> >  			 * Distribute available extra resources equally
> > @@ -1981,11 +1987,13 @@ static void pci_bus_distribute_available_resources(struct pci_bus *bus,
> >  			io = div64_ul(available_io, hotplug_bridges);
> >  			io = min(ALIGN(io, align), remaining_io);
> >  			remaining_io -= io;
> > +			io_start += io;
> >  
> >  			align = pci_resource_alignment(bridge, mmio_res);
> >  			mmio = div64_ul(available_mmio, hotplug_bridges);
> >  			mmio = min(ALIGN(mmio, align), remaining_mmio);
> >  			remaining_mmio -= mmio;
> > +			mmio_start += mmio;
> >  
> >  			align = pci_resource_alignment(bridge, mmio_pref_res);
> >  			mmio_pref = div64_ul(available_mmio_pref,
> > @@ -1993,9 +2001,40 @@ static void pci_bus_distribute_available_resources(struct pci_bus *bus,
> >  			mmio_pref = min(ALIGN(mmio_pref, align),
> >  					remaining_mmio_pref);
> >  			remaining_mmio_pref -= mmio_pref;
> > +			mmio_pref_start += mmio_pref;
> >  
> >  			pci_bus_distribute_available_resources(b, add_list, io,
> >  							       mmio, mmio_pref);
> > +		} else {
> > +			/*
> > +			 * For normal bridges, track start of the parent
> > +			 * bridge window to make sure we align the
> > +			 * remaining space which is distributed to the
> > +			 * hotplug bridges properly.
> > +			 */
> > +			resource_size_t aligned;
> > +			struct resource *res;
> > +
> > +			res = &dev->resource[PCI_BRIDGE_RESOURCES + 0];
> > +			io_start += resource_size(res);
> > +			aligned = ALIGN(io_start,
> > +					pci_resource_alignment(dev, res));
> > +			if (aligned > io_start)
> > +				remaining_io -= aligned - io_start;
> > +
> > +			res = &dev->resource[PCI_BRIDGE_RESOURCES + 1];
> > +			mmio_start += resource_size(res);
> > +			aligned = ALIGN(mmio_start,
> > +					pci_resource_alignment(dev, res));
> > +			if (aligned > mmio_start)
> > +				remaining_mmio -= aligned - mmio_start;
> > +
> > +			res = &dev->resource[PCI_BRIDGE_RESOURCES + 2];
> > +			mmio_pref_start += resource_size(res);
> > +			aligned = ALIGN(mmio_pref_start,
> > +					pci_resource_alignment(dev, res));
> > +			if (aligned > mmio_pref_start)
> > +				remaining_mmio_pref -= aligned - mmio_pref_start;
> 
> I don't understand how this can work.  You have this:
> 
>   for_each_pci_bridge(dev, bus) {
>     if (!hotplug_bridges && normal_bridges == 1) {
>       pci_bus_distribute_available_resources(...);   # block 1
>     } else if (dev->is_hotplug_bridge) {
>       ...                                            # block 2
>       pci_bus_distribute_available_resources(...);
>     } else {
>       ...                                            # block 3
>     }
>   }
> 
> This patch adds block 3.  Block 3 doesn't do anything with side
> effects, i.e., it only updates local variables like io_start,
> remaining_io, mmio_start, remaining_mmio, etc.

It updates remaining_* so that the next bridge resource is aligned
properly (if there is going to be one).

> Those may be used in subsequent iterations, but it's only useful if
> there *are* subsequent iterations, so this is dependent on the order
> we iterate through the bridges.  It seems like we want the same
> results no matter what the order.

Here again we walk over the bridges using recursive calls, so we need to
know the alignment upfront, before we make the next call to
pci_bus_distribute_available_resources(), and that information is only
available once the previous bridge has been handled. I suppose this could
be done in a better way but unfortunately I have no idea how :-( Can you
suggest something concrete I could do instead if this is not sufficient?


* Re: [PATCH v3 1/5] PCI: Make sure all bridges reserve at least one bus number
  2018-03-28 11:43     ` Mika Westerberg
@ 2018-03-28 18:09       ` Bjorn Helgaas
  2018-03-29 11:59         ` Mika Westerberg
  0 siblings, 1 reply; 34+ messages in thread
From: Bjorn Helgaas @ 2018-03-28 18:09 UTC (permalink / raw)
  To: Mika Westerberg
  Cc: Bjorn Helgaas, Rafael J. Wysocki, Len Brown, Mario.Limonciello,
	Michael Jamet, Yehezkel Bernat, Andy Shevchenko, linux-pci,
	linux-acpi

On Wed, Mar 28, 2018 at 02:43:46PM +0300, Mika Westerberg wrote:
> On Tue, Mar 27, 2018 at 01:57:42PM -0500, Bjorn Helgaas wrote:
> > On Mon, Feb 26, 2018 at 04:21:08PM +0300, Mika Westerberg wrote:
> > > When distributing extra buses between hotplug bridges we need to make
> > > sure each bridge reserves at least one bus number, even if there is
> > > currently nothing connected to it. For instance ACPI hotplug may bring
> > > in additional devices to non-hotplug bridges later on.
> > 
> > I guess you mean ACPI hotplug can add devices below bridges that have
> > "bridge->is_hotplug_bridge == 0"?  Why don't we set is_hotplug_bridge
> > in that case?  I do see that acpiphp sets it in *some* cases (see
> > check_hotplug_bridge()).  Are we missing some case?
> 
> We don't know upfront that these ports are going to get devices below
> them. The only thing that happens in these cases is that we may get an
> ACPI Notify() to the root port leading to these ports.

Right, it makes sense that we can't tell in advance which devices
might receive ACPI Notify() events.

We set "is_hotplug_bridge" in these cases:

  1. PCIe bridge with PCI_EXP_SLTCAP_HPC (hot-plug capable) bit set.

     Possibly this could be conditional on CONFIG_HOTPLUG_PCI_PCIE
     because we can't really handle hotplug anyway if that's not set.
     The "manual hotplug" scenario where the user initiates a rescan
     with /sys/bus/pci/rescan or similar might suffer resource
     problems, but that's sort of a corner case that doesn't feel
     super important to me.

  2. acpiphp in check_hotplug_bridge().  bbd34fcdd1b2 ("ACPI / hotplug
     / PCI: Register all devices under the given bridge") suggests
     that we treat "all PCI bridges represented in the ACPI namespace
     are now considered 'hotplug' bridges".  I guess this is basically a
     hint that hotplug is more likely if we have ACPI objects and
     possibly things like _RMV, _EJx, etc.

  3. The PLX 6254/HINT HB6 quirk, which I think we can disregard.

> Also the allocation strategy we use is based on ->is_hotplug_bridge
> == 1.  Those bridges will be assigned all the remaining bus space
> and resources. If we somehow set ->is_hotplug_bridge == 1 for these
> non-hotplug ports it means that we now include those ports also when
> resources are distributed which defeats the reason why ACPI Notify()
> is used there in the first place (to preserve bus numbers).

So as a general rule, can we say that we currently distribute
resources across bridges that have PCI_EXP_SLTCAP_HPC or are described
in the ACPI namespace, and we don't reserve anything for other
bridges?

And the point of this patch is that we want to reserve at least one
bus number for *every* bridge because acpiphp may add something below
it?  Presumably we should reserve some I/O and MMIO space, too?  Is
that already covered elsewhere?  Maybe we only need to do this if
acpiphp is actually compiled in?

> > > Here is what happens on one system when a Thunderbolt device is plugged in:
> > > 
> > >   pci 0000:01:00.0: PCI bridge to [bus 02-39]
> > >   ...
> > >   pci_bus 0000:04: [bus 04-39] extended by 0x35
> > >   pci_bus 0000:04: bus scan returning with max=39
> > >   pci_bus 0000:04: busn_res: [bus 04-39] end is updated to 39
> > >   pci 0000:02:02.0: scanning [bus 00-00] behind bridge, pass 1
> > >   pci_bus 0000:3a: scanning bus
> > >   pci_bus 0000:3a: bus scan returning with max=3a
> > >   pci_bus 0000:3a: busn_res: [bus 3a] end is updated to 3a
> > >   pci_bus 0000:3a: [bus 3a] partially hidden behind bridge 0000:02 [bus 02-39]
> > >   pci_bus 0000:3a: [bus 3a] partially hidden behind bridge 0000:01 [bus 01-39]
> > >   pci_bus 0000:02: bus scan returning with max=3a
> > >   pci_bus 0000:02: busn_res: [bus 02-39] end can not be updated to 3a
> > > 
> > > Resulting 'lspci -t' output looks like this:
> > > 
> > >   +-1b.0-[01-39]----00.0-[02-3a]--+-00.0-[03]----00.0
> > >                                   +-01.0-[04-39]--
> > >                                   \-02.0-[3a]----00.0
> > > 
> > > The device behind downstream port at 02:02 is the integrated xHCI (USB 3
> > > host controller) and is not fully accessible because the hotplug bridge
> > > is reserving too many bus numbers.
> > 
> > Thanks for the details here, but I can't tell what happened before and
> > was broken, vs. what happens now.  Which is the hotplug bridge?  Which
> > is the Thunderbolt controller?
> > 
> > I guess 02:01.0 must be the bridge consuming too many bus numbers
> > ([bus 04-39])?
> 
> Yes, that's correct.
> 
> > And 02:02.0 might be the Thunderbolt controller that wants to use bus
> > 3a?  But obviously that won't work because 1b.0 doesn't route things
> > to bus 3a, since it only consumes [bus 01-39].
> 
> In fact 02:02 leads to the xHCI controller, which in this case is
> inaccessible, meaning that currently, when you plug a USB 3 device into
> systems with this setup, it won't work.
> 
> > (The device behind 02:02.0 is more than just "not fully accessible" --
> > it's not accessible via config space *at all*.)
> 
> Right.
> 
> > I guess the 'lspci -t' above must be without this patch, and with this
> > patch, we'd have
> > 
> >   pci 0000:02:00.0: PCI bridge to [bus 03]
> >   pci 0000:02:01.0: PCI bridge to [bus 04-38]
> >   pci 0000:02:02.0: PCI bridge to [bus 39]
> 
> That's correct.
> 
> Do you want me to amend the changelog to include this information as
> well?

Yes, please; I think it's very helpful to understand exactly what
changes as a result of this patch.

> > This patch might fix the situation for simple hot-added devices,
> > but won't we have the same problem again if we hot-add a bridge?
> > It seems like we need a more comprehensive solution.  I don't mean
> > we need to go whole hog and reassign everybody's bus numbers
> > dynamically, but we ought to at least be able to notice the
> > situation, decline to enable the bridge leading to devices we
> > can't reach, and give a meaningful error message.
> 
> The problem is that you don't know upfront what is going to be
> hotplugged, so it is hard to guess how many buses you want to
> reserve there. Doing that afterwards is not going to work either,
> because of the way we scan and add devices, short of rewriting the
> whole scanning logic.

That's what I meant about "not going whole hog" -- it's impractical
right now to solve the general problem.  But we should be clear in
*this* patch that we're adding a special case to handle a common
situation: hot-adding a single endpoint.

And if we encounter the more difficult cases like hot-adding a bridge
where we don't have resources (bus numbers, I/O, or MMIO) for the
hierarchy, we can fail with a clear message.  The "partially hidden
behind bridge" message does not count as clear :)

> The strategy we use here is the same as Windows uses (i.e. reserve
> one bus for these bridges, just in case ACPI Notify() brings in a
> new device). This is a special case used to hotplug the TBT and
> xHCI controller (not bridges). It will not work properly if you
> hotplug a bridge, but at least it works better than just failing
> miserably.

The same issue could happen on any system where we use acpiphp, so I
don't think Thunderbolt is really relevant here, and it's easy to
confuse things by mentioning it.  IIUC, the problem we're trying to
solve here is simply this:

  A hot-added endpoint is useless unless we can assign a bus number
  for it.

I think if this changelog mentioned that fact, what we plan to do
(assign one bus number for every bridge, and potentially more for
hotplug bridges), and a simple example of the previous and new bus
number assignments, that would be about right.

> The sanity check at the end of pci_scan_bridge_extend() already detects
> cases where things went wrong:
> 
>    pci_bus 0000:3a: [bus 3a] partially hidden behind bridge 0000:02 [bus 02-39]
>    pci_bus 0000:3a: [bus 3a] partially hidden behind bridge 0000:01 [bus 01-39]
> 
> I'm not sure whether we want to add more error prints to confuse users.
> 
> (We actually look for these strings in our test automation for native
>  PCIe hotplug).

I think we can make things less confusing for users if we improve or
replace that message.  Maybe something along the lines of "devices
behind bridge X are unusable because we can't assign a bus number for
them."

We can't be very specific about what the new devices *are*, because
without a bus number, we can't even read their Vendor/Device IDs.

We should make sure that we don't create pci_devs for any of these
hidden devices (I suspect this is already the case).

Also, the "Has only triggered on CardBus, fixup is in yenta_socket"
comment above this block looks obsolete.  We should do some
archaeology on this and probably remove it, because you're seeing
this, and I don't think you're using CardBus or yenta_socket.

> > Nit unrelated to this patch: "bridge 0000:02" is not a bridge, it's a
> > bus.  Apparently bus 3a is hidden because 1b.0's subordinate bus is
> > 39.
> 
> Indeed.
> 
> > > To make sure we don't run out of bus numbers for non-hotplug bridges reserve
> > > one bus number for them upfront before distributing buses for hotplug bridges.
> > > 
> > > Fixes: 1c02ea810065 ("PCI: Distribute available buses to hotplug-capable bridges")
> > > Reported-by: Mario Limonciello <mario.limonciello@dell.com>
> > > Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
> > > Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > > Cc: stable@vger.kernel.org
> > > ---
> > >  drivers/pci/probe.c | 11 ++++++++---
> > >  1 file changed, 8 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> > > index ef5377438a1e..6cefd47556e3 100644
> > > --- a/drivers/pci/probe.c
> > > +++ b/drivers/pci/probe.c
> > > @@ -2561,7 +2561,10 @@ static unsigned int pci_scan_child_bus_extend(struct pci_bus *bus,
> > >  	for_each_pci_bridge(dev, bus) {
> > >  		cmax = max;
> > >  		max = pci_scan_bridge_extend(bus, dev, max, 0, 0);
> > > -		used_buses += cmax - max;
> > > +		/* Reserve one bus for each bridge */
> > > +		used_buses++;
> > > +		if (cmax - max > 1)
> > > +			used_buses += cmax - max - 1;
> > 
> > Sorry, this should be trivial, but I'm having a hard time wrapping my
> > mind around it.
> 
> I don't blame you, it is getting quite complex. That's why I added the
> comments there, hopefully makeing it easier to understand.
> 
> > AFAICT, "cmax" is the highest known bus number below this bus, "max"
> > is the highest bus number below "dev" (one of the bridges on "bus").
> > 
> > I assume "used_buses++" accounts for the fact that every enabled
> > bridge must consume one bus number for its secondary side.
> 
> Exactly.
> 
> > And I guess "used_buses += cmax - max - 1" adds in the bus numbers
> > downstream from "dev" (subtracting the one used for its secondary
> > bus)?
> 
> That's also correct.
> 
> > pci_scan_bridge_extend() seems to return something related to the
> > number of bus numbers used below "dev".  Why doesn't *it* account for
> > the secondary bus number of "dev"?
> 
> It returns the new "max", i.e. the maximum subordinate bus number the bridge occupies.
> 
> The reason we handle this one in pci_scan_child_bus_extend() instead is
> that the bus number distribution logic then stays pretty much in a
> single function, making it easier to understand what happens (well, it is
> getting quite complex but I still think it makes sense that way). If you
> insist, I can move it to pci_scan_bridge_extend() instead.

I could be wrong, but I think the code would make more sense if all
the bus number consumption computation were in one place.  It looks
like pci_scan_bridge_extend() already figures out the [secondary + 1
.. subordinate] interval, so it seems natural to me to have it figure
out the entire [secondary .. subordinate] interval.

> > It might help if the pci_scan_bridge_extend() function comment were
> > extended to say what it actually returns.
> 
> I can make a separate patch adding a comment about the return value.
> These functions all return the new "max", but I suppose it makes sense
> to document it.

Yes, please!  I think we could really benefit from some trivial
cleanup and consolidation patches in this area, apart from adding new
functionality.  Simple things like fixing the misleading "bridge
<bus-number>" message above, factoring out PCI_BRIDGE_CTL_MASTER_ABORT
/ PCI_STATUS handling (IIRC this was mentioned recently along with
related PCIe functionality), factoring out some of the CardBus code,
factoring out PCI_PRIMARY_BUS updates (the "blast all three values"
comment could then be expanded and maybe reconciled with the
PCI_SUBORDINATE_BUS update), etc.


* Re: [PATCH v3 1/5] PCI: Make sure all bridges reserve at least one bus number
  2018-03-28 18:09       ` Bjorn Helgaas
@ 2018-03-29 11:59         ` Mika Westerberg
  2018-03-31  8:29           ` Lukas Wunner
  0 siblings, 1 reply; 34+ messages in thread
From: Mika Westerberg @ 2018-03-29 11:59 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Bjorn Helgaas, Rafael J. Wysocki, Len Brown, Mario.Limonciello,
	Michael Jamet, Yehezkel Bernat, Andy Shevchenko, linux-pci,
	linux-acpi

On Wed, Mar 28, 2018 at 01:09:06PM -0500, Bjorn Helgaas wrote:
> On Wed, Mar 28, 2018 at 02:43:46PM +0300, Mika Westerberg wrote:
> > On Tue, Mar 27, 2018 at 01:57:42PM -0500, Bjorn Helgaas wrote:
> > > On Mon, Feb 26, 2018 at 04:21:08PM +0300, Mika Westerberg wrote:
> > > > When distributing extra buses between hotplug bridges we need to make
> > > > sure each bridge reserves at least one bus number, even if there is
> > > > currently nothing connected to it. For instance ACPI hotplug may bring
> > > > in additional devices to non-hotplug bridges later on.
> > > 
> > > I guess you mean ACPI hotplug can add devices below bridges that have
> > > "bridge->is_hotplug_bridge == 0"?  Why don't we set is_hotplug_bridge
> > > in that case?  I do see that acpiphp sets it in *some* cases (see
> > > check_hotplug_bridge()).  Are we missing some case?
> > 
> > We don't know upfront that these ports are going to get devices below
> > them. Only thing that happens in these cases is that we may get ACPI
> > Notify() to the root port leading to these ports.
> 
> Right, it makes sense that we can't tell in advance which devices
> might receive ACPI Notify() events.
> 
> We set "is_hotplug_bridge" in these cases:
> 
>   1. PCIe bridge with PCI_EXP_SLTCAP_HPC (hot-plug capable) bit set.
> 
>      Possibly this could be conditional on CONFIG_HOTPLUG_PCI_PCIE
>      because we can't really handle hotplug anyway if that's not set.
>      The "manual hotplug" scenario where the user initiates a rescan
>      with /sys/bus/pci/rescan or similar might suffer resource
>      problems, but that's sort of a corner case that doesn't feel
>      super important to me.
> 
>   2. acpiphp in check_hotplug_bridge().  bbd34fcdd1b2 ("ACPI / hotplug
>      / PCI: Register all devices under the given bridge") suggests
>      that we treat "all PCI bridges represented in the ACPI namespace
>      are now considered 'hotplug' bridges".  I guess this is basically a
>      hint that hotplug is more likely if we have ACPI objects and
>      possibly things like _RMV, _EJx, etc.
> 
>   3. The PLX 6254/HINT HB6 quirk, which I think we can disregard.
> 
> > Also the allocation strategy we use is based on ->is_hotplug_bridge
> > == 1.  Those bridges will be assigned all the remaining bus space
> > and resources. If we somehow set ->is_hotplug_bridge == 1 for these
> > non-hotplug ports it means that we now include those ports also when
> > resources are distributed which defeats the reason why ACPI Notify()
> > is used there in the first place (to preserve bus numbers).
> 
> So as a general rule, can we say that we currently distribute
> resources across bridges that have PCI_EXP_SLTCAP_HPC or are described
> in the ACPI namespace, and we don't reserve anything for other
> bridges?

More like, we distribute the remaining resources to downstream bridges if
they have is_hotplug_bridge == 1 (PCI_EXP_SLTCAP_HPC) and need to be
reconfigured (i.e. we don't do any resource allocation if the bridge is
already configured by the BIOS).

For other bridges, we only reserve the amount of resources they need for
the devices below them.

> And the point of this patch is that we want to reserve at least one
> bus number for *every* bridge because acpiphp may add something below
> it?

The point is to reserve at least one bus number for non-hotplug bridges
that currently don't have a device below them, to keep the bus number
distribution code from using those numbers for extension. There is a bug
now that makes it allocate too much if you have non-hotplug bridges
after the hotplug bridge.

The topology is like:

  US ---+--- DS for Thunderbolt NHI
        |
        +--- DS hotplug for extension
        |
        +--- DS for xHCI

Because all the DS ports need to be reconfigured, the code in
pci_scan_child_bus_extend() loops over the DS bridges, but since the DS
port leading to the xHCI comes after the DS hotplug port, it is not
accounted for when the extension bus number space is distributed.

> Presumably we should reserve some I/O and MMIO space, too?  Is
> that already covered elsewhere?  Maybe we only need to do this if
> acpiphp is actually compiled in?

AFAICT the resource allocation code in drivers/pci/setup-bus.c makes
sure bridge I/O and MMIO windows reserve at least the minimum (4k for
I/O, 1M for MMIO).

It is not just acpiphp; this could also happen when a PCIe switch is
hotplugged using native PCIe hotplug and has a similar configuration
(non-hotplug bridges following hotplug bridges).

> > > > Here is what happens on one system when a Thunderbolt device is plugged in:
> > > > 
> > > >   pci 0000:01:00.0: PCI bridge to [bus 02-39]
> > > >   ...
> > > >   pci_bus 0000:04: [bus 04-39] extended by 0x35
> > > >   pci_bus 0000:04: bus scan returning with max=39
> > > >   pci_bus 0000:04: busn_res: [bus 04-39] end is updated to 39
> > > >   pci 0000:02:02.0: scanning [bus 00-00] behind bridge, pass 1
> > > >   pci_bus 0000:3a: scanning bus
> > > >   pci_bus 0000:3a: bus scan returning with max=3a
> > > >   pci_bus 0000:3a: busn_res: [bus 3a] end is updated to 3a
> > > >   pci_bus 0000:3a: [bus 3a] partially hidden behind bridge 0000:02 [bus 02-39]
> > > >   pci_bus 0000:3a: [bus 3a] partially hidden behind bridge 0000:01 [bus 01-39]
> > > >   pci_bus 0000:02: bus scan returning with max=3a
> > > >   pci_bus 0000:02: busn_res: [bus 02-39] end can not be updated to 3a
> > > > 
> > > > Resulting 'lspci -t' output looks like this:
> > > > 
> > > >   +-1b.0-[01-39]----00.0-[02-3a]--+-00.0-[03]----00.0
> > > >                                   +-01.0-[04-39]--
> > > >                                   \-02.0-[3a]----00.0
> > > > 
> > > > The device behind downstream port at 02:02 is the integrated xHCI (USB 3
> > > > host controller) and is not fully accessible because the hotplug bridge
> > > > is reserving too many bus numbers.
> > > 
> > > Thanks for the details here, but I can't tell what happened before and
> > > was broken, vs. what happens now.  Which is the hotplug bridge?  Which
> > > is the Thunderbolt controller?
> > > 
> > > I guess 02:01.0 must be the bridge consuming too many bus numbers
> > > ([bus 04-39])?
> > 
> > Yes, that's correct.
> > 
> > > And 02:02.0 might be the Thunderbolt controller that wants to use bus
> > > 3a?  But obviously that won't work because 1b.0 doesn't route things
> > > to bus 3a, since it only consumes [bus 01-39].
> > 
> > In fact 02:02 leads to the xHCI controller, which in this case is
> > inaccessible, meaning that currently, when you plug a USB 3 device into
> > systems with this setup, it won't work.
> > 
> > > (The device behind 02:02.0 is more than just "not fully accessible" --
> > > it's not accessible via config space *at all*.)
> > 
> > Right.
> > 
> > > I guess the 'lspci -t' above must be without this patch, and with this
> > > patch, we'd have
> > > 
> > >   pci 0000:02:00.0: PCI bridge to [bus 03]
> > >   pci 0000:02:01.0: PCI bridge to [bus 04-38]
> > >   pci 0000:02:02.0: PCI bridge to [bus 39]
> > 
> > That's correct.
> > 
> > Do you want me to amend the changelog to include this information as
> > well?
> 
> Yes, please; I think it's very helpful to understand exactly what
> changes as a result of this patch.

OK, I'll add that to the changelog then.

> > > This patch might fix the situation for simple hot-added devices,
> > > but won't we have the same problem again if we hot-add a bridge?
> > > It seems like we need a more comprehensive solution.  I don't mean
> > > we need to go whole hog and reassign everybody's bus numbers
> > > dynamically, but we ought to at least be able to notice the
> > > situation, decline to enable the bridge leading to devices we
> > > can't reach, and give a meaningful error message.
> > 
> > The problem is that you don't know upfront what is going to be
> > hotplugged. Thus it is hard to guess how many buses you want to
> > reserve there. Doing that afterwards is not going to work with the
> > way we currently scan and add devices, short of rewriting the whole
> > scanning logic.
> 
> That's what I meant about "not going whole hog" -- it's impractical
> right now to solve the general problem.  But we should be clear in
> *this* patch that we're adding a special case to handle a common
> situation: hot-adding a single endpoint.
> 
> And if we encounter the more difficult cases like hot-adding a bridge
> where we don't have resources (bus numbers, I/O, or MMIO) for the
> hierarchy, we can fail with a clear message.  The "partially hidden
> behind bridge" message does not count as clear :)

OK

> > The strategy we use here is the same as Windows uses (e.g. reserve
> > one bus for these bridges, just in case ACPI Notify() brings in a
> > new device). This is a kind of special case used to hotplug TBT and
> > xHCI controllers (not bridges). It will not work properly if you
> > hotplug a bridge but at least it works better than just failing
> > miserably.
> 
> The same issue could happen on any system where we use acpiphp, so I
> don't think Thunderbolt is really relevant here, and it's easy to
> confuse things by mentioning it.

This issue can happen regardless of whether acpiphp is used or not. I
agree that mentioning Thunderbolt might confuse things.

> IIUC, the problem we're trying to solve here is simply this:
> 
>   A hot-added endpoint is useless unless we can assign a bus number
>   for it.
> 
> I think if this changelog mentioned that fact, what we plan to do
> (assign one bus number for every bridge, and potentially more for
> hotplug bridges), and a simple example of the previous and new bus
> number assignments, that would be about right.

I'll update the changelog to be clearer about the actual problem we
are trying to solve with this patch.

> > The sanity check at the end of pci_scan_bridge_extend() already detects
> > cases where things went wrong:
> > 
> >    pci_bus 0000:3a: [bus 3a] partially hidden behind bridge 0000:02 [bus 02-39]
> >    pci_bus 0000:3a: [bus 3a] partially hidden behind bridge 0000:01 [bus 01-39]
> > 
> > I'm not sure whether we want to add more error prints to confuse users.
> > 
> > (We actually look for these strings in our test automation for native
> >  PCIe hotplug).
> 
> I think we can make things less confusing for users if we improve or
> replace that message.  Maybe something along the lines of "devices
> behind bridge X are unusable because we can't assign a bus number for
> them."
> 
> We can't be very specific about what the new devices *are*, because
> without a bus number, we can't even read their Vendor/Device IDs.
> 
> We should make sure that we don't create pci_devs for any of these
> hidden devices (I suspect this is already the case).

Yes, that's the case already.

> Also, the "Has only triggered on CardBus, fixup is in yenta_socket"
> comment above this block looks obsolete.  We should do some
> archaeology on this and probably remove it, because you're seeing
> this, and I don't think you're using CardBus or yenta_socket.

But since this is a fix, should we do all the cleanup and message
improvement in separate patches to make sure backporting the actual
fix to stable releases is easy enough?

> > > Nit unrelated to this patch: "bridge 0000:02" is not a bridge, it's a
> > > bus.  Apparently bus 3a is hidden because 1b.0's subordinate bus is
> > > 39.
> > 
> > Indeed.
> > 
> > > > To make sure we don't run out of bus numbers for non-hotplug bridges reserve
> > > > one bus number for them upfront before distributing buses for hotplug bridges.
> > > > 
> > > > Fixes: 1c02ea810065 ("PCI: Distribute available buses to hotplug-capable bridges")
> > > > Reported-by: Mario Limonciello <mario.limonciello@dell.com>
> > > > Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
> > > > Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > > > Cc: stable@vger.kernel.org
> > > > ---
> > > >  drivers/pci/probe.c | 11 ++++++++---
> > > >  1 file changed, 8 insertions(+), 3 deletions(-)
> > > > 
> > > > diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> > > > index ef5377438a1e..6cefd47556e3 100644
> > > > --- a/drivers/pci/probe.c
> > > > +++ b/drivers/pci/probe.c
> > > > @@ -2561,7 +2561,10 @@ static unsigned int pci_scan_child_bus_extend(struct pci_bus *bus,
> > > >  	for_each_pci_bridge(dev, bus) {
> > > >  		cmax = max;
> > > >  		max = pci_scan_bridge_extend(bus, dev, max, 0, 0);
> > > > -		used_buses += cmax - max;
> > > > +		/* Reserve one bus for each bridge */
> > > > +		used_buses++;
> > > > +		if (cmax - max > 1)
> > > > +			used_buses += cmax - max - 1;
> > > 
> > > Sorry, this should be trivial, but I'm having a hard time wrapping my
> > > mind around it.
> > 
> > I don't blame you, it is getting quite complex. That's why I added the
> > comments there, hopefully making it easier to understand.
> > 
> > > AFAICT, "cmax" is the highest known bus number below this bus, "max"
> > > is the highest bus number below "dev" (one of the bridges on "bus").
> > > 
> > > I assume "used_buses++" accounts for the fact that every enabled
> > > bridge must consume one bus number for its secondary side.
> > 
> > Exactly.
> > 
> > > And I guess "used_buses += cmax - max - 1" adds in the bus numbers
> > > downstream from "dev" (subtracting the one used for its secondary
> > > bus)?
> > 
> > That's also correct.
> > 
> > > pci_scan_bridge_extend() seems to return something related to the
> > > number of bus numbers used below "dev".  Why doesn't *it* account for
> > > the secondary bus number of "dev"?
> > 
> > It returns the new "max", i.e. the maximum subordinate bus number the
> > bridge occupies.
> > 
> > The reason we handle this in pci_scan_child_bus_extend() instead is
> > that it keeps the bus number distribution logic pretty much in a
> > single function, making it easier to understand what happens (well,
> > it is getting quite complex but I still think it makes sense that
> > way). If you insist, I can move it to pci_scan_bridge_extend() instead.
> 
> I could be wrong, but I think the code would make more sense if all
> the bus number consumption computation were in one place.  It looks
> like pci_scan_bridge_extend() already figures out the [secondary + 1
> .. subordinate] interval, so it seems natural to me to have it figure
> out the entire [secondary .. subordinate] interval.

I looked at how we could move this to pci_scan_bridge_extend() instead
but it turns out not to be so simple after all. The issue is that
pci_scan_bridge_extend() returns "max", not the number of buses the
bridge consumes. We would need to return max + 1 whenever the bridge
matches the criteria (a non-hotplug downstream bridge that needs to be
reconfigured), but the caller scans the bridges in two passes:

	for_each_pci_bridge(dev, bus) {
		...
		max = pci_scan_bridge_extend(bus, dev, max, 0, 0);
	}

	for_each_pci_bridge(dev, bus) {
		...
		max = pci_scan_bridge_extend(bus, dev, max, buses, 1);
	}

In the first pass we end up increasing max by two (on this particular
system), which shifts the starting subordinate bus number of the second
pass by two as well, so we would need to account for that there too,
spreading the complexity over two functions instead of just one.

Since this particular patch adds a total of 5 new lines to fix the
problem, and they are contained in a single function, I would rather go
with that.

It could be that I'm just missing something obvious, though - this whole
scanning logic is not the most understandable code I've ever
encountered.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v3 2/5] PCI: Take bridge window alignment into account when distributing resources
  2018-03-28 12:01     ` Mika Westerberg
@ 2018-03-29 21:29       ` Bjorn Helgaas
  2018-03-31  9:18         ` Mika Westerberg
  0 siblings, 1 reply; 34+ messages in thread
From: Bjorn Helgaas @ 2018-03-29 21:29 UTC (permalink / raw)
  To: Mika Westerberg
  Cc: Bjorn Helgaas, Rafael J. Wysocki, Len Brown, Mario.Limonciello,
	Michael Jamet, Yehezkel Bernat, Andy Shevchenko, linux-pci,
	linux-acpi

On Wed, Mar 28, 2018 at 03:01:20PM +0300, Mika Westerberg wrote:
> On Tue, Mar 27, 2018 at 02:23:18PM -0500, Bjorn Helgaas wrote:
> > On Mon, Feb 26, 2018 at 04:21:09PM +0300, Mika Westerberg wrote:
> > > When connecting a Thunderbolt endpoint which itself has internal PCIe
> > > switch (not the integrated one) the way we currently distribute
> > 
> > I don't understand this distinction between "internal PCIe switch" and
> > "integrated one".
> 
> I tried to explain here that Thunderbolt endpoint itself has an internal
> (integrated) PCIe switch.

I was confused by this because I thought you meant "Endpoint" in the
sense used in the PCIe spec, i.e., something with a type 0 config
header.  But you're using "endpoint" to mean "some collection of
devices with a Thunderbolt interface on one side".

> > From a PCI point of view, I assume you're adding a switch with 4 NIC
> > endpoints below it.
> 
> Yes, exactly.
> 
> The PCIe device in question is Intel Gigabit ET2 Quad Port Server
> adapter.
> 
> > > resources does not always work well because devices below
> > > non-hotplug bridges might need to have their MMIO resources
> > > aligned to something else than 1 MB boundary. As an example
> > > connecting a server grade 4-port GbE add in card adds another
> > > PCIe switch leading to GbE devices that want to have their MMIO
> > > resources aligned to 2 MB boundary instead.
> > 
> > Is there something unique about Thunderbolt here?  I assume from a
> > PCI point of view, these NIC ports have BARs that are 2 MB or
> > larger and hence need 2 MB or larger alignment?
> 
> No, this is not unique to Thunderbolt in any way.

I think this is a case of confusing things by mentioning irrelevant
details, e.g., leading off with "When connecting a Thunderbolt
endpoint" when the problem can occur when hot-adding any PCIe device
(or maybe only when adding a bridge?  I don't know).

It's good to mention a specific example, but I think it's best if you
can describe the situation as generically as possible first, then cite
the case at hand in a way that it's obvious this is just one of many
possibilities.

> > I don't understand how this can work.  You have this:
> > 
> >   for_each_pci_bridge(dev, bus) {
> >     if (!hotplug_bridges && normal_bridges == 1) {
> >       pci_bus_distribute_available_resources(...);   # block 1
> >     } else if (dev->is_hotplug_bridge) {
> >       ...                                            # block 2
> >       pci_bus_distribute_available_resources(...);
> >     } else {
> >       ...                                            # block 3
> >     }
> >   }
> > 
> > This patch adds block 3.  Block 3 doesn't do anything with side
> > effects, i.e., it only updates local variables like io_start,
> > remaining_io, mmio_start, remaining_mmio, etc.
> 
> It updates remaining_* so that the next bridge resource is aligned
> properly (if there is going to be one).

What do you mean by "aligned properly"?  The spec requires that bridge
memory windows be 1MB aligned and I/O windows 4KB aligned.  Does the
current code do that wrong?

> > Those may be used in subsequent iterations, but it's only useful
> > if there *are* subsequent iterations, so this is dependent on the
> > order we iterate through the bridges.  It seems like we want the
> > same results no matter what the order.
> 
> > Here again we walk over the bridges using recursive calls, so we need
> > to know the alignment upfront, before the next call to
> > pci_bus_distribute_available_resources(), and that information is
> > only available once the previous bridge has been handled. I suppose
> > this can be done in a better way but unfortunately I have no idea
> > how :-( Can you suggest something concrete I could do instead if
> > this is not sufficient?

I don't understand this well enough to have any great suggestions.

I did notice that the "!hotplug_bridges && normal_bridges == 1" case
doesn't look like it needs to be in the for_each_pci_bridge() loop
because there's only one bridge (I think "hotplug_bridges == 0 &&
normal_bridges == 1" would be slightly easier to read).

And I suspect you might want to handle the "hotplug_bridges == 1 &&
normal_bridges == 0" case identically.  That would mean you could pull
these "list_is_singular()" cases out of the loop, which makes sense
because if there's only one bridge, it should get everything.


* Re: [PATCH v3 1/5] PCI: Make sure all bridges reserve at least one bus number
  2018-03-29 11:59         ` Mika Westerberg
@ 2018-03-31  8:29           ` Lukas Wunner
  2018-03-31  8:58             ` Mika Westerberg
  0 siblings, 1 reply; 34+ messages in thread
From: Lukas Wunner @ 2018-03-31  8:29 UTC (permalink / raw)
  To: Mika Westerberg
  Cc: Bjorn Helgaas, Bjorn Helgaas, Rafael J. Wysocki, Len Brown,
	Mario.Limonciello, Michael Jamet, Yehezkel Bernat,
	Andy Shevchenko, linux-pci, linux-acpi

On Thu, Mar 29, 2018 at 02:59:11PM +0300, Mika Westerberg wrote:
> On Wed, Mar 28, 2018 at 01:09:06PM -0500, Bjorn Helgaas wrote:
> > The same issue could happen on any system where we use acpiphp, so I
> > don't think Thunderbolt is really relevant here, and it's easy to
> > confuse things by mentioning it.
> 
> This issue can happen regardless whether acpiphp is used or not.

If the platform has yielded hotplug control to the OS via _OSC,
I don't see how the platform could hot-add devices.  So surely
reserving a bus number for a bridge without anything below it
can be constrained to the !pciehp_is_native(bridge) case?

Thanks,

Lukas


* Re: [PATCH v3 1/5] PCI: Make sure all bridges reserve at least one bus number
  2018-03-31  8:29           ` Lukas Wunner
@ 2018-03-31  8:58             ` Mika Westerberg
  2018-03-31  9:12               ` Lukas Wunner
  0 siblings, 1 reply; 34+ messages in thread
From: Mika Westerberg @ 2018-03-31  8:58 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: Bjorn Helgaas, Bjorn Helgaas, Rafael J. Wysocki, Len Brown,
	Mario.Limonciello, Michael Jamet, Yehezkel Bernat,
	Andy Shevchenko, linux-pci, linux-acpi

On Sat, Mar 31, 2018 at 10:29:03AM +0200, Lukas Wunner wrote:
> On Thu, Mar 29, 2018 at 02:59:11PM +0300, Mika Westerberg wrote:
> > On Wed, Mar 28, 2018 at 01:09:06PM -0500, Bjorn Helgaas wrote:
> > > The same issue could happen on any system where we use acpiphp, so I
> > > don't think Thunderbolt is really relevant here, and it's easy to
> > > confuse things by mentioning it.
> > 
> > This issue can happen regardless whether acpiphp is used or not.
> 
> If the platform has yielded hotplug control to the OS via _OSC,
> I don't see how the platform could hot-add devices.  So surely
> reserving a bus number for a bridge without anything below it
> can be constrained to the !pciehp_is_native(bridge) case?

Nothing prevents ACPI Notify() happening while native PCIe hotplug is
used on non-hotplug ports (the ones not controlled by pciehp). And it
cannot be constrained to !pciehp_is_native(bridge) because it is the
root port that has the _OSC but below it can be non-hotplug ports where
ACPI Notify() is used to bring in additional devices.


* Re: [PATCH v3 1/5] PCI: Make sure all bridges reserve at least one bus number
  2018-03-31  8:58             ` Mika Westerberg
@ 2018-03-31  9:12               ` Lukas Wunner
  2018-03-31  9:19                 ` Rafael J. Wysocki
  2018-03-31  9:20                 ` Mika Westerberg
  0 siblings, 2 replies; 34+ messages in thread
From: Lukas Wunner @ 2018-03-31  9:12 UTC (permalink / raw)
  To: Mika Westerberg
  Cc: Bjorn Helgaas, Bjorn Helgaas, Rafael J. Wysocki, Len Brown,
	Mario.Limonciello, Michael Jamet, Yehezkel Bernat,
	Andy Shevchenko, linux-pci, linux-acpi

On Sat, Mar 31, 2018 at 11:58:52AM +0300, Mika Westerberg wrote:
> On Sat, Mar 31, 2018 at 10:29:03AM +0200, Lukas Wunner wrote:
> > On Thu, Mar 29, 2018 at 02:59:11PM +0300, Mika Westerberg wrote:
> > > On Wed, Mar 28, 2018 at 01:09:06PM -0500, Bjorn Helgaas wrote:
> > > > The same issue could happen on any system where we use acpiphp, so I
> > > > don't think Thunderbolt is really relevant here, and it's easy to
> > > > confuse things by mentioning it.
> > > 
> > > This issue can happen regardless whether acpiphp is used or not.
> > 
> > If the platform has yielded hotplug control to the OS via _OSC,
> > I don't see how the platform could hot-add devices.  So surely
> > reserving a bus number for a bridge without anything below it
> > can be constrained to the !pciehp_is_native(bridge) case?
> 
> Nothing prevents ACPI Notify() happening while native PCIe hotplug is
> used on non-hotplug ports (the ones not controlled by pciehp). And it
> cannot be constrained to !pciehp_is_native(bridge) because it is the
> root port that has the _OSC but below it can be non-hotplug ports where
> ACPI Notify() is used to bring in additional devices.

That sounds like a violation of the spec to me.

ACPI 6.1 table 6-178 says if OS is granted control over PCIe hotplug,
the firmware "must ensure that all hot plug events are routed to device
interrupts", which wouldn't be the case for Notify() because the
interrupt generated is an SCI, not an MSI or INTx interrupt for the
hotplug port itself.

Moreover, "after control is transferred to the OS, firmware must not
update the state of hot plug slots, including the state of the
indicators and power controller."

Maybe I've misunderstood the spec all the time, my understanding was
that if OS is granted control, the firmware won't do anything with
hotplug ports below the host bridge, period.

Thanks,

Lukas


* Re: [PATCH v3 2/5] PCI: Take bridge window alignment into account when distributing resources
  2018-03-29 21:29       ` Bjorn Helgaas
@ 2018-03-31  9:18         ` Mika Westerberg
  0 siblings, 0 replies; 34+ messages in thread
From: Mika Westerberg @ 2018-03-31  9:18 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Bjorn Helgaas, Rafael J. Wysocki, Len Brown, Mario.Limonciello,
	Michael Jamet, Yehezkel Bernat, Andy Shevchenko, linux-pci,
	linux-acpi

On Thu, Mar 29, 2018 at 04:29:21PM -0500, Bjorn Helgaas wrote:
> On Wed, Mar 28, 2018 at 03:01:20PM +0300, Mika Westerberg wrote:
> > On Tue, Mar 27, 2018 at 02:23:18PM -0500, Bjorn Helgaas wrote:
> > > On Mon, Feb 26, 2018 at 04:21:09PM +0300, Mika Westerberg wrote:
> > > > When connecting a Thunderbolt endpoint which itself has internal PCIe
> > > > switch (not the integrated one) the way we currently distribute
> > > 
> > > I don't understand this distinction between "internal PCIe switch" and
> > > "integrated one".
> > 
> > I tried to explain here that Thunderbolt endpoint itself has an internal
> > (integrated) PCIe switch.
> 
> I was confused by this because I thought you meant "Endpoint" in the
> sense used in the PCIe spec, i.e., something with a type 0 config
> header.  But you're using "endpoint" to mean "some collection of
> devices with a Thunderbolt interface on one side".

Yeah, I think the official term is "Thunderbolt endpoint" but I admit
it is confusing here.

> > > From a PCI point of view, I assume you're adding a switch with 4 NIC
> > > endpoints below it.
> > 
> > Yes, exactly.
> > 
> > The PCIe device in question is Intel Gigabit ET2 Quad Port Server
> > adapter.
> > 
> > > > resources does not always work well because devices below
> > > > non-hotplug bridges might need to have their MMIO resources
> > > > aligned to something else than 1 MB boundary. As an example
> > > > connecting a server grade 4-port GbE add in card adds another
> > > > PCIe switch leading to GbE devices that want to have their MMIO
> > > > resources aligned to 2 MB boundary instead.
> > > 
> > > Is there something unique about Thunderbolt here?  I assume from a
> > > PCI point of view, these NIC ports have BARs that are 2 MB or
> > > larger and hence need 2 MB or larger alignment?
> > 
> > No, this is not unique to Thunderbolt in any way.
> 
> I think this is a case of confusing things by mentioning irrelevant
> details, e.g., leading off with "When connecting a Thunderbolt
> endpoint" when the problem can occur when hot-adding any PCIe device
> (or maybe only when adding a bridge?  I don't know).
> 
> It's good to mention a specific example, but I think it's best if you
> can describe the situation as generically as possible first, then cite
> the case at hand in a way that it's obvious this is just one of many
> possibilities.

OK, I'll rework the changelog accordingly.

> > > I don't understand how this can work.  You have this:
> > > 
> > >   for_each_pci_bridge(dev, bus) {
> > >     if (!hotplug_bridges && normal_bridges == 1) {
> > >       pci_bus_distribute_available_resources(...);   # block 1
> > >     } else if (dev->is_hotplug_bridge) {
> > >       ...                                            # block 2
> > >       pci_bus_distribute_available_resources(...);
> > >     } else {
> > >       ...                                            # block 3
> > >     }
> > >   }
> > > 
> > > This patch adds block 3.  Block 3 doesn't do anything with side
> > > effects, i.e., it only updates local variables like io_start,
> > > remaining_io, mmio_start, remaining_mmio, etc.
> > 
> > It updates remaining_* so that the next bridge resource is aligned
> > properly (if there is going to be one).
> 
> What do you mean by "aligned properly"?  The spec requires that bridge
> memory windows be 1MB aligned and I/O windows 4KB aligned.  Does the
> current code do that wrong?

Yes it does. I'll put this in the changelog, but in short, here are
the two bridges where you can see the alignment:

  pci 0000:39:00.0: BAR 14: assigned [mem 0x53300000-0x6a0fffff]
                                          ^^^^^^^^^^
  pci 0000:3a:01.0: BAR 14: assigned [mem 0x53400000-0x547fffff]
                                          ^^^^^^^^^^

The 3a:01.0 (downstream port) is behind 39:00.0 (upstream port) and
leads to the other PCIe switch with 4 NICs. The starting address is
aligned to 2 MB instead of 1 MB. Next we try to assign remaining
resources to the hotplug downstream port 3a:04.0:

   pci 0000:3a:04.0: BAR 14: no space for [mem size 0x15a00000]
   pci 0000:3a:04.0: BAR 14: failed to assign [mem size 0x15a00000]

but since we calculated the remaining MMIO space (0x15a00000) without
taking this alignment into account, it does not fit into the [mem
0x53300000-0x6a0fffff] bridge window:

  start = 0x54800000
  end = 0x54800000 + 0x15a00000 - 1 = 0x6a1fffff

And 0x6a1fffff > 0x6a0fffff which leads to the failure.

> > > Those may be used in subsequent iterations, but it's only useful
> > > if there *are* subsequent iterations, so this is dependent on the
> > > order we iterate through the bridges.  It seems like we want the
> > > same results no matter what the order.
> > 
> > Here again we walk over the bridges using recursive calls, so we need
> > to know the alignment upfront, before the next call to
> > pci_bus_distribute_available_resources(), and that information is
> > only available once the previous bridge has been handled. I suppose
> > this can be done in a better way but unfortunately I have no idea
> > how :-( Can you suggest something concrete I could do instead if
> > this is not sufficient?
> 
> I don't understand this well enough to have any great suggestions.

OK, thanks anyway.

> I did notice that the "!hotplug_bridges && normal_bridges == 1" case
> doesn't look like it needs to be in the for_each_pci_bridge() loop
> because there's only one bridge (I think "hotplug_bridges == 0 &&
> normal_bridges == 1" would be slightly easier to read).
> 
> And I suspect you might want to handle the "hotplug_bridges == 1 &&
> normal_bridges == 0" case identically.  That would mean you could pull
> these "list_is_singular()" cases out of the loop, which makes sense
> because if there's only one bridge, it should get everything.

That's a good idea. I'll do that in the next revision.


* Re: [PATCH v3 1/5] PCI: Make sure all bridges reserve at least one bus number
  2018-03-31  9:12               ` Lukas Wunner
@ 2018-03-31  9:19                 ` Rafael J. Wysocki
  2018-03-31  9:20                 ` Mika Westerberg
  1 sibling, 0 replies; 34+ messages in thread
From: Rafael J. Wysocki @ 2018-03-31  9:19 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: Mika Westerberg, Bjorn Helgaas, Bjorn Helgaas, Rafael J. Wysocki,
	Len Brown, Mario Limonciello, Michael Jamet, Yehezkel Bernat,
	Andy Shevchenko, Linux PCI, ACPI Devel Maling List

On Sat, Mar 31, 2018 at 11:12 AM, Lukas Wunner <lukas@wunner.de> wrote:
> On Sat, Mar 31, 2018 at 11:58:52AM +0300, Mika Westerberg wrote:
>> On Sat, Mar 31, 2018 at 10:29:03AM +0200, Lukas Wunner wrote:
>> > On Thu, Mar 29, 2018 at 02:59:11PM +0300, Mika Westerberg wrote:
>> > > On Wed, Mar 28, 2018 at 01:09:06PM -0500, Bjorn Helgaas wrote:
>> > > > The same issue could happen on any system where we use acpiphp, so I
>> > > > don't think Thunderbolt is really relevant here, and it's easy to
>> > > > confuse things by mentioning it.
>> > >
>> > > This issue can happen regardless whether acpiphp is used or not.
>> >
>> > If the platform has yielded hotplug control to the OS via _OSC,
>> > I don't see how the platform could hot-add devices.  So surely
>> > reserving a bus number for a bridge without anything below it
>> > can be constrained to the !pciehp_is_native(bridge) case?
>>
>> Nothing prevents ACPI Notify() happening while native PCIe hotplug is
>> used on non-hotplug ports (the ones not controlled by pciehp). And it
>> cannot be constrained to !pciehp_is_native(bridge) because it is the
>> root port that has the _OSC but below it can be non-hotplug ports where
>> ACPI Notify() is used to bring in additional devices.
>
> That sounds like a violation of the spec to me.
>
> ACPI 6.1 table 6-178 says if OS is granted control over PCIe hotplug,
> the firmware "must ensure that all hot plug events are routed to device
> interrupts", which wouldn't be the case for Notify() because the
> interrupt generated is an SCI, not an MSI or INTx interrupt for the
> hotplug port itself.
>
> Moreover, "after control is transferred to the OS, firmware must not
> update the state of hot plug slots, including the state of the
> indicators and power controller."
>
> Maybe I've misunderstood the spec all the time, my understanding was
> that if OS is granted control, the firmware won't do anything with
> hotplug ports below the host bridge, period.

This is in agreement with my understanding of it.

The _OSC at a root port should cover the entire hierarchy below that port.


* Re: [PATCH v3 1/5] PCI: Make sure all bridges reserve at least one bus number
  2018-03-31  9:12               ` Lukas Wunner
  2018-03-31  9:19                 ` Rafael J. Wysocki
@ 2018-03-31  9:20                 ` Mika Westerberg
  2018-03-31  9:30                   ` Lukas Wunner
  2018-03-31  9:30                   ` Rafael J. Wysocki
  1 sibling, 2 replies; 34+ messages in thread
From: Mika Westerberg @ 2018-03-31  9:20 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: Bjorn Helgaas, Bjorn Helgaas, Rafael J. Wysocki, Len Brown,
	Mario.Limonciello, Michael Jamet, Yehezkel Bernat,
	Andy Shevchenko, linux-pci, linux-acpi

On Sat, Mar 31, 2018 at 11:12:45AM +0200, Lukas Wunner wrote:
> On Sat, Mar 31, 2018 at 11:58:52AM +0300, Mika Westerberg wrote:
> > On Sat, Mar 31, 2018 at 10:29:03AM +0200, Lukas Wunner wrote:
> > > On Thu, Mar 29, 2018 at 02:59:11PM +0300, Mika Westerberg wrote:
> > > > On Wed, Mar 28, 2018 at 01:09:06PM -0500, Bjorn Helgaas wrote:
> > > > > The same issue could happen on any system where we use acpiphp, so I
> > > > > don't think Thunderbolt is really relevant here, and it's easy to
> > > > > confuse things by mentioning it.
> > > > 
> > > > This issue can happen regardless whether acpiphp is used or not.
> > > 
> > > If the platform has yielded hotplug control to the OS via _OSC,
> > > I don't see how the platform could hot-add devices.  So surely
> > > reserving a bus number for a bridge without anything below it
> > > can be constrained to the !pciehp_is_native(bridge) case?
> > 
> > Nothing prevents ACPI Notify() happening while native PCIe hotplug is
> > used on non-hotplug ports (the ones not controlled by pciehp). And it
> > cannot be constrained to !pciehp_is_native(bridge) because it is the
> > root port that has the _OSC but below it can be non-hotplug ports where
> > ACPI Notify() is used to bring in additional devices.
> 
> That sounds like a violation of the spec to me.
> 
> ACPI 6.1 table 6-178 says if OS is granted control over PCIe hotplug,
> the firmware "must ensure that all hot plug events are routed to device
> interrupts", which wouldn't be the case for Notify() because the
> interrupt generated is an SCI, not an MSI or INTx interrupt for the
> hotplug port itself.
> 
> Moreover, "after control is transferred to the OS, firmware must not
> update the state of hot plug slots, including the state of the
> indicators and power controller."
> 
> Maybe I've misunderstood the spec all the time, my understanding was
> that if OS is granted control, the firmware won't do anything with
> hotplug ports below the host bridge, period.

The whole point here is that those are *not* hotplug slots, just
regular downstream ports.


* Re: [PATCH v3 1/5] PCI: Make sure all bridges reserve at least one bus number
  2018-03-31  9:20                 ` Mika Westerberg
@ 2018-03-31  9:30                   ` Lukas Wunner
  2018-03-31  9:58                     ` Mika Westerberg
  2018-03-31  9:30                   ` Rafael J. Wysocki
  1 sibling, 1 reply; 34+ messages in thread
From: Lukas Wunner @ 2018-03-31  9:30 UTC (permalink / raw)
  To: Mika Westerberg
  Cc: Bjorn Helgaas, Bjorn Helgaas, Rafael J. Wysocki, Len Brown,
	Mario.Limonciello, Michael Jamet, Yehezkel Bernat,
	Andy Shevchenko, linux-pci, linux-acpi

On Sat, Mar 31, 2018 at 12:20:17PM +0300, Mika Westerberg wrote:
> On Sat, Mar 31, 2018 at 11:12:45AM +0200, Lukas Wunner wrote:
> > On Sat, Mar 31, 2018 at 11:58:52AM +0300, Mika Westerberg wrote:
> > > On Sat, Mar 31, 2018 at 10:29:03AM +0200, Lukas Wunner wrote:
> > > > On Thu, Mar 29, 2018 at 02:59:11PM +0300, Mika Westerberg wrote:
> > > > > On Wed, Mar 28, 2018 at 01:09:06PM -0500, Bjorn Helgaas wrote:
> > > > > > The same issue could happen on any system where we use acpiphp, so I
> > > > > > don't think Thunderbolt is really relevant here, and it's easy to
> > > > > > confuse things by mentioning it.
> > > > > 
> > > > > This issue can happen regardless whether acpiphp is used or not.
> > > > 
> > > > If the platform has yielded hotplug control to the OS via _OSC,
> > > > I don't see how the platform could hot-add devices.  So surely
> > > > reserving a bus number for a bridge without anything below it
> > > > can be constrained to the !pciehp_is_native(bridge) case?
> > > 
> > > Nothing prevents ACPI Notify() happening while native PCIe hotplug is
> > > used on non-hotplug ports (the ones not controlled by pciehp). And it
> > > cannot be constrained to !pciehp_is_native(bridge) because it is the
> > > root port that has the _OSC but below it can be non-hotplug ports where
> > > ACPI Notify() is used to bring in additional devices.
> > 
> > That sounds like a violation of the spec to me.
> > 
> > ACPI 6.1 table 6-178 says if OS is granted control over PCIe hotplug,
> > the firmware "must ensure that all hot plug events are routed to device
> > interrupts", which wouldn't be the case for Notify() because the
> > interrupt generated is an SCI, not an MSI or INTx interrupt for the
> > hotplug port itself.
> > 
> > Moreover, "after control is transferred to the OS, firmware must not
> > update the state of hot plug slots, including the state of the
> > indicators and power controller."
> > 
> > Maybe I've misunderstood the spec all the time, my understanding was
> > that if OS is granted control, the firmware won't do anything with
> > hotplug ports below the host bridge, period.
> 
> The whole point here is that those are *not* hotplug slots, just
> regular downstream ports.

Okay, understood.  Is this about the NHI or XHCI?  Because at least
on Alpine Ridge (C step), the bridge above the XHCI *is* a hotplug
bridge.  Only the bridge above the NHI is not.

Thanks,

Lukas


* Re: [PATCH v3 1/5] PCI: Make sure all bridges reserve at least one bus number
  2018-03-31  9:20                 ` Mika Westerberg
  2018-03-31  9:30                   ` Lukas Wunner
@ 2018-03-31  9:30                   ` Rafael J. Wysocki
  2018-03-31  9:56                     ` Mika Westerberg
  1 sibling, 1 reply; 34+ messages in thread
From: Rafael J. Wysocki @ 2018-03-31  9:30 UTC (permalink / raw)
  To: Mika Westerberg
  Cc: Lukas Wunner, Bjorn Helgaas, Bjorn Helgaas, Rafael J. Wysocki,
	Len Brown, Mario Limonciello, Michael Jamet, Yehezkel Bernat,
	Andy Shevchenko, Linux PCI, ACPI Devel Mailing List

On Sat, Mar 31, 2018 at 11:20 AM, Mika Westerberg
<mika.westerberg@linux.intel.com> wrote:
> On Sat, Mar 31, 2018 at 11:12:45AM +0200, Lukas Wunner wrote:
>> On Sat, Mar 31, 2018 at 11:58:52AM +0300, Mika Westerberg wrote:
>> > On Sat, Mar 31, 2018 at 10:29:03AM +0200, Lukas Wunner wrote:
>> > > On Thu, Mar 29, 2018 at 02:59:11PM +0300, Mika Westerberg wrote:
>> > > > On Wed, Mar 28, 2018 at 01:09:06PM -0500, Bjorn Helgaas wrote:
>> > > > > The same issue could happen on any system where we use acpiphp, so I
>> > > > > don't think Thunderbolt is really relevant here, and it's easy to
>> > > > > confuse things by mentioning it.
>> > > >
>> > > > This issue can happen regardless of whether acpiphp is used or not.
>> > >
>> > > If the platform has yielded hotplug control to the OS via _OSC,
>> > > I don't see how the platform could hot-add devices.  So surely
>> > > reserving a bus number for a bridge without anything below it
>> > > can be constrained to the !pciehp_is_native(bridge) case?
>> >
>> > Nothing prevents ACPI Notify() happening while native PCIe hotplug is
>> > used on non-hotplug ports (the ones not controlled by pciehp). And it
>> > cannot be constrained to !pciehp_is_native(bridge) because it is the
>> > root port that has the _OSC but below it can be non-hotplug ports where
>> > ACPI Notify() is used to bring in additional devices.
>>
>> That sounds like a violation of the spec to me.
>>
>> ACPI 6.1 table 6-178 says if OS is granted control over PCIe hotplug,
>> the firmware "must ensure that all hot plug events are routed to device
>> interrupts", which wouldn't be the case for Notify() because the
>> interrupt generated is an SCI, not an MSI or INTx interrupt for the
>> hotplug port itself.
>>
>> Moreover, "after control is transferred to the OS, firmware must not
>> update the state of hot plug slots, including the state of the
>> indicators and power controller."
>>
>> Maybe I've misunderstood the spec all the time, my understanding was
>> that if OS is granted control, the firmware won't do anything with
>> hotplug ports below the host bridge, period.
>
> The whole point here is that those are *not* hotplug slots, just
> regular downstream ports.

I'm not sure what scenario exactly you are referring to, to be honest.

Something related to Thunderbolt I suppose?


* Re: [PATCH v3 1/5] PCI: Make sure all bridges reserve at least one bus number
  2018-03-31  9:30                   ` Rafael J. Wysocki
@ 2018-03-31  9:56                     ` Mika Westerberg
  2018-03-31 10:05                       ` Rafael J. Wysocki
  0 siblings, 1 reply; 34+ messages in thread
From: Mika Westerberg @ 2018-03-31  9:56 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Lukas Wunner, Bjorn Helgaas, Bjorn Helgaas, Rafael J. Wysocki,
	Len Brown, Mario Limonciello, Michael Jamet, Yehezkel Bernat,
	Andy Shevchenko, Linux PCI, ACPI Devel Mailing List

On Sat, Mar 31, 2018 at 11:30:44AM +0200, Rafael J. Wysocki wrote:
> > The whole point here is that those are *not* hotplug slots, just
> > regular downstream ports.
> 
> I'm not sure what scenario exactly you are referring to, to be honest.
> 
> Something related to Thunderbolt I suppose?

Here is an example that hopefully clarifies. This example is from a
system using Thunderbolt in "native" mode, but I think it is not
specific to Thunderbolt. The idea is that when you don't have anything
connected to the Thunderbolt ports you have the following PCI topology:

  00:1b.0 --

So a root port that is hotplug capable and handled by pciehp.

Next, when you plug in a Thunderbolt endpoint, you get a native PCIe
hotplug event, and after it is handled the topology looks like:

  00:1b.0 --- 01:00.0 --+- 02:00.0 --
                        +- 02:01.0 (hotplug) --
                        \- 02:02.0 --

In other words, there is a PCIe switch with one hotplug port (02:01.0)
that is again handled by pciehp (this is used to daisy-chain further
devices). However, downstream ports 02:00.0 and 02:02.0 are not marked
as hotplug capable, so pciehp is not controlling them.

To bring in the xHCI and/or Thunderbolt host controller, we get an
ACPI Notify() to the root port 00:1b.0, which should result in the
following topology after it is handled by acpiphp:

  00:1b.0 --- 01:00.0 --+- 02:00.0 -- Thunderbolt host controller
                        +- 02:01.0 (hotplug) --
                        \- 02:02.0 -- xHCI host controller

In other words, ACPI Notify() is used to populate devices connected to
non-hotplug downstream ports. It is also used to "hot-unplug" them in
the same way (for example, if you only connect a standard USB-C device
to the port, the Thunderbolt host controller is hot-unplugged using
this mechanism).

The rest of the devices in the chain are hotplugged using standard
native PCIe hotplug, so pciehp will be controlling them.
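To make the pciehp/acpiphp split above concrete: whether pciehp claims
a port boils down to the Hot-Plug Capable bit in that port's Slot
Capabilities register. A minimal sketch (the helper name and sample
values are made up; the bit masks match the PCIe spec and Linux's
PCI_EXP_SLTCAP_* constants):

```python
# Slot Capabilities bit masks (per the PCIe spec / PCI_EXP_SLTCAP_*)
PCI_EXP_SLTCAP_HPS = 0x0020  # Hot-Plug Surprise
PCI_EXP_SLTCAP_HPC = 0x0040  # Hot-Plug Capable

def is_hotplug_port(sltcap):
    # Ports with HPC set (like 02:01.0 above) are claimed by pciehp;
    # ports without it (02:00.0, 02:02.0) are left alone, which is why
    # ACPI Notify() + acpiphp has to bring their devices in.
    return bool(sltcap & PCI_EXP_SLTCAP_HPC)

print(is_hotplug_port(PCI_EXP_SLTCAP_HPC | PCI_EXP_SLTCAP_HPS))  # True
print(is_hotplug_port(0))                                        # False
```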


* Re: [PATCH v3 1/5] PCI: Make sure all bridges reserve at least one bus number
  2018-03-31  9:30                   ` Lukas Wunner
@ 2018-03-31  9:58                     ` Mika Westerberg
  2018-03-31 10:18                       ` Lukas Wunner
  0 siblings, 1 reply; 34+ messages in thread
From: Mika Westerberg @ 2018-03-31  9:58 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: Bjorn Helgaas, Bjorn Helgaas, Rafael J. Wysocki, Len Brown,
	Mario.Limonciello, Michael Jamet, Yehezkel Bernat,
	Andy Shevchenko, linux-pci, linux-acpi

On Sat, Mar 31, 2018 at 11:30:17AM +0200, Lukas Wunner wrote:
> > The whole point here is that those are *not* hotplug slots, just
> > regular downstream ports.
> 
> Okay, understood.  Is this about the NHI or XHCI?  Because at least
> on Alpine Ridge (C step), the bridge above the XHCI *is* a hotplug
> bridge.  Only the bridge above the NHI is not.

Yes, exactly. I tried to clarify this mechanism a bit better in the
other email I just sent.


* Re: [PATCH v3 1/5] PCI: Make sure all bridges reserve at least one bus number
  2018-03-31  9:56                     ` Mika Westerberg
@ 2018-03-31 10:05                       ` Rafael J. Wysocki
  2018-03-31 10:12                         ` Mika Westerberg
  0 siblings, 1 reply; 34+ messages in thread
From: Rafael J. Wysocki @ 2018-03-31 10:05 UTC (permalink / raw)
  To: Mika Westerberg
  Cc: Rafael J. Wysocki, Lukas Wunner, Bjorn Helgaas, Bjorn Helgaas,
	Rafael J. Wysocki, Len Brown, Mario Limonciello, Michael Jamet,
	Yehezkel Bernat, Andy Shevchenko, Linux PCI,
	ACPI Devel Mailing List

On Sat, Mar 31, 2018 at 11:56 AM, Mika Westerberg
<mika.westerberg@linux.intel.com> wrote:
> On Sat, Mar 31, 2018 at 11:30:44AM +0200, Rafael J. Wysocki wrote:
>> > The whole point here is that those are *not* hotplug slots, just
>> > regular downstream ports.
>>
>> I'm not sure what scenario exactly you are referring to, to be honest.
>>
>> Something related to Thunderbolt I suppose?
>
> Here is an example that hopefully clarifies. This example is from a
> system using Thunderbolt in "native" mode, but I think it is not
> specific to Thunderbolt. The idea is that when you don't have
> anything connected to the Thunderbolt ports you have the following
> PCI topology:
>
>   00:1b.0 --
>
> So a root port that is hotplug capable and handled by pciehp.
>
> Next, when you plug in a Thunderbolt endpoint, you get a native PCIe
> hotplug event, and after it is handled the topology looks like:
>
>   00:1b.0 --- 01:00.0 --+- 02:00.0 --
>                         +- 02:01.0 (hotplug) --
>                         \- 02:02.0 --
>
> In other words, there is a PCIe switch with one hotplug port
> (02:01.0) that is again handled by pciehp (this is used to
> daisy-chain further devices). However, downstream ports 02:00.0 and
> 02:02.0 are not marked as hotplug capable, so pciehp is not
> controlling them.
>
> To bring in the xHCI and/or Thunderbolt host controller, we get an
> ACPI Notify() to the root port 00:1b.0, which should result in the
> following topology after it is handled by acpiphp:
>
>   00:1b.0 --- 01:00.0 --+- 02:00.0 -- Thunderbolt host controller
>                         +- 02:01.0 (hotplug) --
>                         \- 02:02.0 -- xHCI host controller
>
> In other words, ACPI Notify() is used to populate devices connected
> to non-hotplug downstream ports. It is also used to "hot-unplug" them
> in the same way (for example, if you only connect a standard USB-C
> device to the port, the Thunderbolt host controller is hot-unplugged
> using this mechanism).
>
> The rest of the devices in the chain are hotplugged using standard
> native PCIe hotplug, so pciehp will be controlling them.

OK, thanks!

I thought it would be something like this. :-)

So the mechanism is not TBT-specific in principle, but I don't think
that it is used in practice anywhere outside of Thunderbolt.

I also think that it would be good to put the above example somewhere
like a git commit changelog or even a comment in the code, so people
in the future don't have to wonder what this is all about.


* Re: [PATCH v3 1/5] PCI: Make sure all bridges reserve at least one bus number
  2018-03-31 10:05                       ` Rafael J. Wysocki
@ 2018-03-31 10:12                         ` Mika Westerberg
  0 siblings, 0 replies; 34+ messages in thread
From: Mika Westerberg @ 2018-03-31 10:12 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Lukas Wunner, Bjorn Helgaas, Bjorn Helgaas, Rafael J. Wysocki,
	Len Brown, Mario Limonciello, Michael Jamet, Yehezkel Bernat,
	Andy Shevchenko, Linux PCI, ACPI Devel Mailing List

On Sat, Mar 31, 2018 at 12:05:50PM +0200, Rafael J. Wysocki wrote:
> I also think that it would be good to put the above example somewhere
> like a git commit changelog or even a comment in the code, so people
> in the future don't have to wonder what this is all about.

Sure, I'll add this information to the changelogs of the next revision
of the patch series.


* Re: [PATCH v3 1/5] PCI: Make sure all bridges reserve at least one bus number
  2018-03-31  9:58                     ` Mika Westerberg
@ 2018-03-31 10:18                       ` Lukas Wunner
  2018-03-31 10:37                         ` Mika Westerberg
  0 siblings, 1 reply; 34+ messages in thread
From: Lukas Wunner @ 2018-03-31 10:18 UTC (permalink / raw)
  To: Mika Westerberg
  Cc: Bjorn Helgaas, Bjorn Helgaas, Rafael J. Wysocki, Len Brown,
	Mario.Limonciello, Michael Jamet, Yehezkel Bernat,
	Andy Shevchenko, linux-pci, linux-acpi

On Sat, Mar 31, 2018 at 12:58:04PM +0300, Mika Westerberg wrote:
> On Sat, Mar 31, 2018 at 11:30:17AM +0200, Lukas Wunner wrote:
> > > The whole point here is that those are *not* hotplug slots, just
> > > regular downstream ports.
> > 
> > Okay, understood.  Is this about the NHI or XHCI?  Because at least
> > on Alpine Ridge (C step), the bridge above the XHCI *is* a hotplug
> > bridge.  Only the bridge above the NHI is not.
> 
> Yes, exactly. I tried to clarify this mechanism a bit better in the
> other email I just sent.

But in the e-mail you just sent, the bridge above the XHCI is not a
hotplug bridge and according to the lspci output of a MacBookPro13,3
I have here, it *is* a hotplug bridge on Alpine Ridge (C step).

FWIW, the HDA controller integrated into Nvidia GPUs can be made
visible or hidden in a similar fashion by setting a bit in the
GPU's config space.  Some laptop DSDTs use this to hide the HDA
controller on boot (and resume from system sleep) if no HDMI cable
is plugged in.  I think this behavior is geared towards Windows'
driver model.  On Linux it's mostly an annoyance and we're considering
un-hiding the HDA controller unconditionally from a PCI quirk:
https://bugs.freedesktop.org/show_bug.cgi?id=75985
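For the curious, that hide/unhide trick is just a read-modify-write of
one config-space dword. A rough sketch, with the offset and bit used
as placeholders rather than the real NVIDIA register:

```python
HDA_CFG_OFFSET = 0x488    # placeholder offset, illustrative only
HDA_ENABLE_BIT = 1 << 25  # placeholder bit, illustrative only

def set_hda_visible(read_cfg, write_cfg, visible):
    # read_cfg/write_cfg stand in for 32-bit config-space accessors
    # (pci_read_config_dword()-style); only the one bit changes.
    val = read_cfg(HDA_CFG_OFFSET)
    val = val | HDA_ENABLE_BIT if visible else val & ~HDA_ENABLE_BIT
    write_cfg(HDA_CFG_OFFSET, val)
    return val

# fake config space backed by a dict, just to exercise the helper
cfg = {HDA_CFG_OFFSET: 0}
set_hda_visible(cfg.get, cfg.__setitem__, True)
print(hex(cfg[HDA_CFG_OFFSET]))   # 0x2000000
set_hda_visible(cfg.get, cfg.__setitem__, False)
print(hex(cfg[HDA_CFG_OFFSET]))   # 0x0
```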

Thanks,

Lukas


* Re: [PATCH v3 1/5] PCI: Make sure all bridges reserve at least one bus number
  2018-03-31 10:18                       ` Lukas Wunner
@ 2018-03-31 10:37                         ` Mika Westerberg
  0 siblings, 0 replies; 34+ messages in thread
From: Mika Westerberg @ 2018-03-31 10:37 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: Bjorn Helgaas, Bjorn Helgaas, Rafael J. Wysocki, Len Brown,
	Mario.Limonciello, Michael Jamet, Yehezkel Bernat,
	Andy Shevchenko, linux-pci, linux-acpi

On Sat, Mar 31, 2018 at 12:18:16PM +0200, Lukas Wunner wrote:
> On Sat, Mar 31, 2018 at 12:58:04PM +0300, Mika Westerberg wrote:
> > On Sat, Mar 31, 2018 at 11:30:17AM +0200, Lukas Wunner wrote:
> > > > The whole point here is that those are *not* hotplug slots, just
> > > > regular downstream ports.
> > > 
> > > Okay, understood.  Is this about the NHI or XHCI?  Because at least
> > > on Alpine Ridge (C step), the bridge above the XHCI *is* a hotplug
> > > bridge.  Only the bridge above the NHI is not.
> > 
> > Yes, exactly. I tried to clarify this mechanism a bit better in the
> > other email I just sent.
> 
> But in the e-mail you just sent, the bridge above the XHCI is not a
> hotplug bridge and according to the lspci output of a MacBookPro13,3
> I have here, it *is* a hotplug bridge on Alpine Ridge (C step).

They are different because on Apple systems the Thunderbolt host
router (switch + xHCI + NHI) is always there and thus configured by
the boot firmware initially. So there is no need for this ACPI
Notify() mechanism at all, and the downstream port leading to the xHCI
can be marked as hotplug capable (even though that is not actually
needed).

The systems I'm referring to are in "native" mode, which means that
they use native PCIe hotplug also for the Thunderbolt host router, and
thus it is up to the OS to configure everything. The reason those two
ports are not marked as hotplug-capable is that the OS then does not
try to distribute all remaining resources to them, which leaves more
space for the extension hotplug port.
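A toy model of that trade-off (not the actual kernel code; the names
and numbers are invented for illustration): give every bridge at least
one spare bus number, then hand the remaining spare buses only to the
hotplug-capable ports.

```python
def reserve_buses(ports, spare):
    """ports: list of (name, is_hotplug) tuples; spare: free bus numbers.
    Every bridge keeps at least one bus so a device hot-added later
    (e.g. via ACPI Notify()) can still be enumerated; only the hotplug
    ports compete for what is left."""
    reservation = {name: 1 for name, _ in ports}   # at least one bus each
    hotplug = [name for name, hp in ports if hp]
    spare -= len(ports)                            # already handed out
    if hotplug and spare > 0:
        for name in hotplug:
            reservation[name] += spare // len(hotplug)
    return reservation

ports = [("02:00.0", False), ("02:01.0", True), ("02:02.0", False)]
print(reserve_buses(ports, 10))
# the non-hotplug ports keep one bus each; the extension hotplug
# port 02:01.0 gets the remaining seven on top of its own
```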


Thread overview: 34+ messages
2018-02-26 13:21 [PATCH v3 0/5] PCI: Fixes for native PCIe and ACPI hotplug Mika Westerberg
2018-02-26 13:21 ` [PATCH v3 1/5] PCI: Make sure all bridges reserve at least one bus number Mika Westerberg
2018-03-27 18:57   ` Bjorn Helgaas
2018-03-28 11:43     ` Mika Westerberg
2018-03-28 18:09       ` Bjorn Helgaas
2018-03-29 11:59         ` Mika Westerberg
2018-03-31  8:29           ` Lukas Wunner
2018-03-31  8:58             ` Mika Westerberg
2018-03-31  9:12               ` Lukas Wunner
2018-03-31  9:19                 ` Rafael J. Wysocki
2018-03-31  9:20                 ` Mika Westerberg
2018-03-31  9:30                   ` Lukas Wunner
2018-03-31  9:58                     ` Mika Westerberg
2018-03-31 10:18                       ` Lukas Wunner
2018-03-31 10:37                         ` Mika Westerberg
2018-03-31  9:30                   ` Rafael J. Wysocki
2018-03-31  9:56                     ` Mika Westerberg
2018-03-31 10:05                       ` Rafael J. Wysocki
2018-03-31 10:12                         ` Mika Westerberg
2018-02-26 13:21 ` [PATCH v3 2/5] PCI: Take bridge window alignment into account when distributing resources Mika Westerberg
2018-03-27 19:23   ` Bjorn Helgaas
2018-03-28 12:01     ` Mika Westerberg
2018-03-29 21:29       ` Bjorn Helgaas
2018-03-31  9:18         ` Mika Westerberg
2018-02-26 13:21 ` [PATCH v3 3/5] PCI: pciehp: Clear Presence Detect and Data Link Layer Status Changed on resume Mika Westerberg
2018-02-26 13:21 ` [PATCH v3 4/5] ACPI / hotplug / PCI: Do not scan all bridges when native PCIe hotplug is used Mika Westerberg
2018-02-26 13:21 ` [PATCH v3 5/5] ACPI / hotplug / PCI: Mark stale PCI devices disconnected Mika Westerberg
2018-03-27 19:45   ` Bjorn Helgaas
2018-03-27 20:52     ` Lukas Wunner
2018-03-27 22:13       ` Bjorn Helgaas
2018-03-28  6:25         ` Greg Kroah-Hartman
2018-02-26 15:16 ` [PATCH v3 0/5] PCI: Fixes for native PCIe and ACPI hotplug Andy Shevchenko
2018-03-15  6:00 ` Mika Westerberg
2018-03-26 10:01   ` Mika Westerberg
