linux-pci.vger.kernel.org archive mirror
* [PATCH 0/3] PCI: Minimizing resource assignment algorithm
@ 2020-09-28  1:06 Jon Derrick
  2020-09-28  1:06 ` [PATCH 1/3] PCI: Create helper to release/restore bridge resources Jon Derrick
                   ` (3 more replies)
  0 siblings, 4 replies; 8+ messages in thread
From: Jon Derrick @ 2020-09-28  1:06 UTC (permalink / raw)
  To: linux-pci
  Cc: Lorenzo Pieralisi, Bjorn Helgaas, Andrzej Jakowski, Dave Fugate,
	Jon Derrick

This set adds a minimizing resource assignment algorithm. VMD domains
frequently have issues with default hotplug settings and large arrays of
drives such as those in JBOFs. This algorithm uses the default or
user-specified hotplug resource settings, then tries with minimal
settings using 256 for IO, and 1MB for MMIO and Prefetch, and finally
tries without additional hotplug resources as if the bridge were not
hotplug capable.

This set allows a resource-constrained domain to at the very least
enumerate and attach drivers to devices, though it may not result in a
supportable hotplug slot if the slot is not already occupied by a
device.
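
The fallback order described above can be sketched as a small user-space
model. This is only an illustration under stated assumptions: every name
here (toy_hp_sizes, toy_domain, toy_assign_with_fallback) is hypothetical,
the 2MB default MMIO values are an assumption about the kernel defaults
(they are user-tunable), and the fit check ignores alignment and I/O space.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model; none of these names exist in the kernel. */
struct toy_hp_sizes {
	unsigned long io;	/* extra I/O bytes per hotplug bridge */
	unsigned long mmio;	/* extra non-prefetchable MMIO bytes */
	unsigned long pref;	/* extra prefetchable MMIO bytes */
};

/* Attempts in the order the cover letter gives: default, minimal, none. */
static const struct toy_hp_sizes toy_attempts[] = {
	{ 256, 2UL << 20, 2UL << 20 },	/* assumed defaults (user-tunable) */
	{ 256, 1UL << 20, 1UL << 20 },	/* minimal settings */
	{ 0, 0, 0 },			/* as if not hotplug capable */
};

struct toy_domain {
	unsigned long mmio_window;	/* total MMIO the domain can decode */
	unsigned long dev_mmio;		/* MMIO the present devices need */
	unsigned int nr_hp_bridges;	/* number of hotplug bridges */
};

/* Toy fit check: device needs plus per-bridge reservations must fit. */
static bool toy_try_assign(const struct toy_domain *d,
			   const struct toy_hp_sizes *s)
{
	unsigned long extra = d->nr_hp_bridges * (s->mmio + s->pref);

	return d->dev_mmio + extra <= d->mmio_window;
}

/* Returns the index of the first attempt that fits, or -1 if none does. */
static int toy_assign_with_fallback(const struct toy_domain *d)
{
	size_t i;

	for (i = 0; i < sizeof(toy_attempts) / sizeof(toy_attempts[0]); i++)
		if (toy_try_assign(d, &toy_attempts[i]))
			return (int)i;

	return -1;
}
```

In this model a return value of 2 corresponds to the last case above:
devices are assigned, but no bridge keeps spare room for a later hot-add.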

Jon Derrick (3):
  PCI: Create helper to release/restore bridge resources
  PCI: Introduce a minimizing assignment algorithm
  PCI: vmd: Wire up VMD for fallback resource assignment

 drivers/pci/controller/vmd.c |   2 +-
 drivers/pci/setup-bus.c      | 147 ++++++++++++++++++++++++++++-------
 include/linux/pci.h          |   2 +
 3 files changed, 124 insertions(+), 27 deletions(-)

-- 
2.18.1



* [PATCH 1/3] PCI: Create helper to release/restore bridge resources
  2020-09-28  1:06 [PATCH 0/3] PCI: Minimizing resource assignment algorithm Jon Derrick
@ 2020-09-28  1:06 ` Jon Derrick
  2020-09-28  1:06 ` [PATCH 2/3] PCI: Introduce a minimizing assignment algorithm Jon Derrick
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 8+ messages in thread
From: Jon Derrick @ 2020-09-28  1:06 UTC (permalink / raw)
  To: linux-pci
  Cc: Lorenzo Pieralisi, Bjorn Helgaas, Andrzej Jakowski, Dave Fugate,
	Jon Derrick

Moves bridge release and restore code into a common helper. No
functional changes.
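
The restore half of the helper can be pictured with a small user-space
model: each failed resource has a snapshot of its start, end and flags,
and the snapshot is copied back before the next attempt. The types and
function names below are hypothetical stand-ins, not kernel API; the real
helper also releases bridge windows and clears the flags of bridge window
resources, which this sketch leaves out.

```c
#include <stddef.h>

/* Hypothetical stand-in for the fields of a resource that get restored. */
struct toy_resource {
	unsigned long start;
	unsigned long end;
	unsigned long flags;
};

/* Hypothetical snapshot entry, one per resource that failed assignment. */
struct toy_saved {
	struct toy_resource *res;	/* resource the snapshot belongs to */
	unsigned long start;		/* values recorded before assignment */
	unsigned long end;
	unsigned long flags;
};

/* Record the current start, end and flags of @res in @s. */
static void toy_save(struct toy_saved *s, struct toy_resource *res)
{
	s->res = res;
	s->start = res->start;
	s->end = res->end;
	s->flags = res->flags;
}

/* Copy every recorded snapshot back into its resource. */
static void toy_restore_all(const struct toy_saved *saved, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		struct toy_resource *res = saved[i].res;

		res->start = saved[i].start;
		res->end = saved[i].end;
		res->flags = saved[i].flags;
	}
}
```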

Signed-off-by: Jon Derrick <jonathan.derrick@intel.com>
---
 drivers/pci/setup-bus.c | 49 +++++++++++++++++++++++------------------
 1 file changed, 28 insertions(+), 21 deletions(-)

diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index 3951e02b7ded..f22502e8e6e6 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -2047,6 +2047,33 @@ static void pci_bridge_distribute_available_resources(struct pci_dev *bridge,
 					       available_mmio_pref);
 }
 
+static void release_and_restore_resources(struct list_head *head)
+{
+	struct pci_dev_resource *dev_res;
+
+	list_for_each_entry(dev_res, head, list)
+		pci_bus_release_bridge_resources(dev_res->dev->bus,
+						 dev_res->flags & PCI_RES_TYPE_MASK,
+						 whole_subtree);
+
+	/* Restore size and flags */
+	list_for_each_entry(dev_res, head, list) {
+		struct resource *res = dev_res->res;
+		int idx;
+
+		res->start = dev_res->start;
+		res->end = dev_res->end;
+		res->flags = dev_res->flags;
+
+		if (pci_is_bridge(dev_res->dev)) {
+			idx = res - &dev_res->dev->resource[0];
+			if (idx >= PCI_BRIDGE_RESOURCES &&
+			    idx <= PCI_BRIDGE_RESOURCE_END)
+				res->flags = 0;
+		}
+	}
+}
+
 void pci_assign_unassigned_bridge_resources(struct pci_dev *bridge)
 {
 	struct pci_bus *parent = bridge->subordinate;
@@ -2088,27 +2115,7 @@ void pci_assign_unassigned_bridge_resources(struct pci_dev *bridge)
 	 * Try to release leaf bridge's resources that aren't big enough
 	 * to contain child device resources.
 	 */
-	list_for_each_entry(fail_res, &fail_head, list)
-		pci_bus_release_bridge_resources(fail_res->dev->bus,
-						 fail_res->flags & PCI_RES_TYPE_MASK,
-						 whole_subtree);
-
-	/* Restore size and flags */
-	list_for_each_entry(fail_res, &fail_head, list) {
-		struct resource *res = fail_res->res;
-		int idx;
-
-		res->start = fail_res->start;
-		res->end = fail_res->end;
-		res->flags = fail_res->flags;
-
-		if (pci_is_bridge(fail_res->dev)) {
-			idx = res - &fail_res->dev->resource[0];
-			if (idx >= PCI_BRIDGE_RESOURCES &&
-			    idx <= PCI_BRIDGE_RESOURCE_END)
-				res->flags = 0;
-		}
-	}
+	release_and_restore_resources(&fail_head);
 	free_list(&fail_head);
 
 	goto again;
-- 
2.18.1



* [PATCH 2/3] PCI: Introduce a minimizing assignment algorithm
  2020-09-28  1:06 [PATCH 0/3] PCI: Minimizing resource assignment algorithm Jon Derrick
  2020-09-28  1:06 ` [PATCH 1/3] PCI: Create helper to release/restore bridge resources Jon Derrick
@ 2020-09-28  1:06 ` Jon Derrick
  2020-09-28  7:17   ` Christoph Hellwig
  2020-09-28  1:06 ` [PATCH 3/3] PCI: vmd: Wire up VMD for fallback resource assignment Jon Derrick
  2020-09-28  7:16 ` [PATCH 0/3] PCI: Minimizing resource assignment algorithm Christoph Hellwig
  3 siblings, 1 reply; 8+ messages in thread
From: Jon Derrick @ 2020-09-28  1:06 UTC (permalink / raw)
  To: linux-pci
  Cc: Lorenzo Pieralisi, Bjorn Helgaas, Andrzej Jakowski, Dave Fugate,
	Jon Derrick

Some PCI domains have limited resources that get exhausted by hotplug
resource reservations. VMD subdevice domains, for example, tend to support
only 32MB MMIO, of which the decodable address space is split between
prefetchable and non-prefetchable windows using existing resource
assignment algorithms. In addition to these limitations, hotplug bridges
require additional resource reservations as specified by default or by the
kernel parameters "pci=hp{io,mmio,mmiopref}size", further exhausting the
domain resources prior to full domain assignment.
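
As a rough worked example of that exhaustion, the helper below computes
how much of an MMIO window is left for device BARs after every hotplug
bridge takes its reservation. It is a sketch only: the function name and
the bridge count are hypothetical, the 2MB figures are an assumed default
reservation, and alignment and window granularity are ignored.

```c
#include <stdint.h>

#define MiB(x) ((uint64_t)(x) << 20)

/*
 * Bytes of MMIO left for device BARs once each hotplug bridge has taken
 * its non-prefetchable and prefetchable reservation. Returns 0 when the
 * reservations alone meet or exceed the window.
 */
static uint64_t toy_mmio_left(uint64_t window, unsigned int nr_hp_bridges,
			      uint64_t hp_mmio, uint64_t hp_pref)
{
	uint64_t reserved = (uint64_t)nr_hp_bridges * (hp_mmio + hp_pref);

	return reserved >= window ? 0 : window - reserved;
}
```

With a 32MB window and eight hotplug bridges, an assumed 2MB + 2MB
reservation per bridge leaves nothing for the devices themselves, 1MB +
1MB leaves 16MB, and no reservation leaves the full 32MB.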

Introduce a minimizing assignment algorithm which starts with the
default or user-requested hotplug resource values, tries with minimal
hotplug resource values, and lastly tries no hotplug resource values.

Signed-off-by: Jon Derrick <jonathan.derrick@intel.com>
---
 drivers/pci/setup-bus.c | 98 ++++++++++++++++++++++++++++++++++++++---
 include/linux/pci.h     |  2 +
 2 files changed, 95 insertions(+), 5 deletions(-)

diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index f22502e8e6e6..7beb4f37660b 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -1200,6 +1200,35 @@ static void pci_bus_size_cardbus(struct pci_bus *bus,
 	;
 }
 
+enum {
+	PCI_SIZING_VARIANT_DEFAULT,
+	PCI_SIZING_VARIANT_MINIMUM,
+	PCI_SIZING_VARIANT_NOHOTPLUG,
+	PCI_NUM_SIZING_VARIANTS,
+};
+
+static void hotplug_sizes(int sizing_variant, resource_size_t *io,
+			  resource_size_t *mmio, resource_size_t *pref)
+{
+	switch (sizing_variant) {
+	case PCI_SIZING_VARIANT_NOHOTPLUG:
+		*io = 0;
+		*mmio = 0;
+		*pref = 0;
+		break;
+	case PCI_SIZING_VARIANT_MINIMUM:
+		*io = 256;
+		*mmio = 1 << 20;
+		*pref = 1 << 20;
+		break;
+	case PCI_SIZING_VARIANT_DEFAULT:
+	default:
+		*io = pci_hotplug_io_size;
+		*mmio = pci_hotplug_mmio_size;
+		*pref = pci_hotplug_mmio_pref_size;
+	}
+}
+
 void __pci_bus_size_bridges(struct pci_bus *bus, struct list_head *realloc_head)
 {
 	struct pci_dev *dev;
@@ -1248,11 +1277,11 @@ void __pci_bus_size_bridges(struct pci_bus *bus, struct list_head *realloc_head)
 
 	case PCI_HEADER_TYPE_BRIDGE:
 		pci_bridge_check_ranges(bus);
-		if (bus->self->is_hotplug_bridge) {
-			additional_io_size  = pci_hotplug_io_size;
-			additional_mmio_size = pci_hotplug_mmio_size;
-			additional_mmio_pref_size = pci_hotplug_mmio_pref_size;
-		}
+		if (bus->self->is_hotplug_bridge)
+			hotplug_sizes(bus->self->sizing_variant,
+				      &additional_io_size,
+				      &additional_mmio_size,
+				      &additional_mmio_pref_size);
 		/* Fall through */
 	default:
 		pbus_size_io(bus, realloc_head ? 0 : additional_io_size,
@@ -2247,3 +2276,62 @@ void pci_assign_unassigned_bus_resources(struct pci_bus *bus)
 	BUG_ON(!list_empty(&add_list));
 }
 EXPORT_SYMBOL_GPL(pci_assign_unassigned_bus_resources);
+
+static int __set_sizing_variant(struct pci_dev *dev, void *data)
+{
+	if (dev->is_hotplug_bridge)
+		dev->sizing_variant = *((int *) data);
+
+	return 0;
+}
+
+static void release_bridge_resources(struct pci_bus *bus)
+{
+	struct resource *res;
+	struct pci_dev *dev;
+	int i;
+
+	list_for_each_entry(dev, &bus->devices, bus_list) {
+		if (dev->subordinate) {
+			for (i = PCI_BRIDGE_RESOURCES; i < PCI_BRIDGE_RESOURCE_END; i++)
+				reset_resource(&dev->resource[i]);
+
+			release_bridge_resources(dev->subordinate);
+		}
+
+		if (pci_is_root_bus(bus))
+			continue;
+
+		pci_bus_for_each_resource(bus, res, i)
+			reset_resource(res);
+	}
+}
+
+void pci_bus_assign_resources_fallback_sizing(struct pci_bus *bus)
+{
+	LIST_HEAD(fail_head);
+	int i = 0;
+
+	pci_walk_bus(bus, __set_sizing_variant, &i);
+	__pci_bus_assign_resources(bus, NULL, &fail_head);
+
+	if (list_empty(&fail_head))
+		return;
+
+	for (i = 0; i < PCI_NUM_SIZING_VARIANTS; i++) {
+		pci_walk_bus(bus, __set_sizing_variant, &i);
+
+		down_read(&pci_bus_sem);
+		__pci_bus_size_bridges(bus, NULL);
+		up_read(&pci_bus_sem);
+
+		__pci_bus_assign_resources(bus, NULL, &fail_head);
+		if (list_empty(&fail_head))
+			return;
+
+		release_and_restore_resources(&fail_head);
+		release_bridge_resources(bus);
+		free_list(&fail_head);
+	}
+}
+EXPORT_SYMBOL(pci_bus_assign_resources_fallback_sizing);
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 801e9ad0d57e..72ae11d3b5ea 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -424,6 +424,7 @@ struct pci_dev {
 	unsigned int	is_hotplug_bridge:1;
 	unsigned int	shpc_managed:1;		/* SHPC owned by shpchp */
 	unsigned int	is_thunderbolt:1;	/* Thunderbolt controller */
+	unsigned int	sizing_variant:2;	/* normal, minimum, no hotplug */
 	/*
 	 * Devices marked being untrusted are the ones that can potentially
 	 * execute DMA attacks and similar. They are typically connected
@@ -1299,6 +1300,7 @@ void pci_assign_unassigned_resources(void);
 void pci_assign_unassigned_bridge_resources(struct pci_dev *bridge);
 void pci_assign_unassigned_bus_resources(struct pci_bus *bus);
 void pci_assign_unassigned_root_bus_resources(struct pci_bus *bus);
+void pci_bus_assign_resources_fallback_sizing(struct pci_bus *bus);
 int pci_reassign_bridge_resources(struct pci_dev *bridge, unsigned long type);
 void pdev_enable_device(struct pci_dev *);
 int pci_enable_resources(struct pci_dev *, int mask);
-- 
2.18.1



* [PATCH 3/3] PCI: vmd: Wire up VMD for fallback resource assignment
  2020-09-28  1:06 [PATCH 0/3] PCI: Minimizing resource assignment algorithm Jon Derrick
  2020-09-28  1:06 ` [PATCH 1/3] PCI: Create helper to release/restore bridge resources Jon Derrick
  2020-09-28  1:06 ` [PATCH 2/3] PCI: Introduce a minimizing assignment algorithm Jon Derrick
@ 2020-09-28  1:06 ` Jon Derrick
  2020-09-28  7:16 ` [PATCH 0/3] PCI: Minimizing resource assignment algorithm Christoph Hellwig
  3 siblings, 0 replies; 8+ messages in thread
From: Jon Derrick @ 2020-09-28  1:06 UTC (permalink / raw)
  To: linux-pci
  Cc: Lorenzo Pieralisi, Bjorn Helgaas, Andrzej Jakowski, Dave Fugate,
	Jon Derrick

The VMD subdevice domain would prefer that all devices be assigned
resources and work, rather than having only a few or none assigned but
with valid hotplug bridge resources. The resource assignment fallback
algorithm works best for these requirements.

Signed-off-by: Jon Derrick <jonathan.derrick@intel.com>
---
 drivers/pci/controller/vmd.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/pci/controller/vmd.c b/drivers/pci/controller/vmd.c
index fdc1a206f73e..4debc547c813 100644
--- a/drivers/pci/controller/vmd.c
+++ b/drivers/pci/controller/vmd.c
@@ -777,7 +777,7 @@ static int vmd_enable_domain(struct vmd_dev *vmd, unsigned long features)
 		dev_set_msi_domain(&vmd->bus->dev, vmd->irq_domain);
 
 	pci_scan_child_bus(vmd->bus);
-	pci_assign_unassigned_bus_resources(vmd->bus);
+	pci_bus_assign_resources_fallback_sizing(vmd->bus);
 
 	/*
 	 * VMD root buses are virtual and don't return true on pci_is_pcie()
-- 
2.18.1



* Re: [PATCH 0/3] PCI: Minimizing resource assignment algorithm
  2020-09-28  1:06 [PATCH 0/3] PCI: Minimizing resource assignment algorithm Jon Derrick
                   ` (2 preceding siblings ...)
  2020-09-28  1:06 ` [PATCH 3/3] PCI: vmd: Wire up VMD for fallback resource assignment Jon Derrick
@ 2020-09-28  7:16 ` Christoph Hellwig
  3 siblings, 0 replies; 8+ messages in thread
From: Christoph Hellwig @ 2020-09-28  7:16 UTC (permalink / raw)
  To: Jon Derrick
  Cc: linux-pci, Lorenzo Pieralisi, Bjorn Helgaas, Andrzej Jakowski,
	Dave Fugate

<broken record>
Intel, please just allow us to opt-out of using VMD as VMD just makes
life miserable.  Thanks!
</broken record>


* Re: [PATCH 2/3] PCI: Introduce a minimizing assignment algorithm
  2020-09-28  1:06 ` [PATCH 2/3] PCI: Introduce a minimizing assignment algorithm Jon Derrick
@ 2020-09-28  7:17   ` Christoph Hellwig
  2020-09-28 13:34     ` Derrick, Jonathan
  0 siblings, 1 reply; 8+ messages in thread
From: Christoph Hellwig @ 2020-09-28  7:17 UTC (permalink / raw)
  To: Jon Derrick
  Cc: linux-pci, Lorenzo Pieralisi, Bjorn Helgaas, Andrzej Jakowski,
	Dave Fugate

Please keep this code in VMD if we really have to do it (although I'd
be perfectly fine to let people dumb enough to enable VMD devices to
live with the problems).  You are adding lots of code that gets
compiled into every Linux kernel that supports PCI just to work around
a completely idiotic invention from Intel that makes life painful for
us for no good reason.


* Re: [PATCH 2/3] PCI: Introduce a minimizing assignment algorithm
  2020-09-28  7:17   ` Christoph Hellwig
@ 2020-09-28 13:34     ` Derrick, Jonathan
  2020-09-29 17:48       ` hch
  0 siblings, 1 reply; 8+ messages in thread
From: Derrick, Jonathan @ 2020-09-28 13:34 UTC (permalink / raw)
  To: hch
  Cc: linux-pci, lorenzo.pieralisi, helgaas, andrzej.jakowski, Fugate, David

Hi Christoph,

Thanks for your valuable feedback, as always.

On Mon, 2020-09-28 at 08:17 +0100, Christoph Hellwig wrote:
> Please keep this code in VMD if we really have to do it (although I'd
> be perfectly fine to let people dumb enough to enable VMD devices to
> live with the problems).
Great! Sounds like you're more open to us working openly within vmd.c
then?

>   You are adding lots of code that gets
> compiled into every Linux kernel that supports PCI just to work around
> a completely idiotic invention from Intel that makes life painful for
> us for no good reason.
Well, this fix in particular may not be needed once the dynamic hotplug
resource resizing set is in and we can build on that. But frankly the
generic resource assignment code itself is very difficult to work within
and has been discussed at several LPCs over the years. I don't see a
problem with another algorithm which could be relied upon by other host
bridge controller drivers if they want it.

I also spent a good deal of time trying to get the minimizing algorithm
into pci_assign_unassigned_root_bus_resources, where the only instance
of pci=realloc detection takes place (who knew there were so many
different originating paths for resource assignment?). I couldn't make
headway there, so I started fresh. Maybe someone talented could refactor
mine into it and save a few instruction bytes.


* Re: [PATCH 2/3] PCI: Introduce a minimizing assignment algorithm
  2020-09-28 13:34     ` Derrick, Jonathan
@ 2020-09-29 17:48       ` hch
  0 siblings, 0 replies; 8+ messages in thread
From: hch @ 2020-09-29 17:48 UTC (permalink / raw)
  To: Derrick, Jonathan
  Cc: hch, linux-pci, lorenzo.pieralisi, helgaas, andrzej.jakowski,
	Fugate, David

On Mon, Sep 28, 2020 at 01:34:50PM +0000, Derrick, Jonathan wrote:
> Well, this fix in particular may not be needed once the dynamic hotplug
> resource resizing set is in and we can build on that. But frankly the
> generic resource assignment code itself is very difficult to work within
> and has been discussed at several LPCs over the years. I don't see a
> problem with another algorithm which could be relied upon by other host
> bridge controller drivers if they want it.
> 
> I also spent a good deal of time trying to get the minimizing algorithm
> into pci_assign_unassigned_root_bus_resources, where the only instance
> of pci=realloc detection takes place (who knew there were so many
> different originating paths for resource assignment?). I couldn't make
> headway there, so I started fresh. Maybe someone talented could refactor
> mine into it and save a few instruction bytes.

If the maintainers think there might be other use cases we could
also just make it conditional and let VMD select it.  I'm just a little
worried about all kinds of cruft slipping into core code to work around
the various problems vmd creates.
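
One way to picture the "conditional" option is the usual pattern of a
build-time symbol with an inline stub. The sketch below is a user-space
illustration only: CONFIG_TOY_PCI_FALLBACK_SIZING and all toy_* names are
hypothetical, and nothing here is taken from the posted patches.

```c
/* Which assignment path a call ended up taking (hypothetical model). */
enum toy_assign_path {
	TOY_ASSIGN_PATH_DEFAULT,
	TOY_ASSIGN_PATH_FALLBACK,
};

/* Hypothetical stand-in for a bus. */
struct toy_bus {
	int nr;
};

/* Stand-in for the existing, always-built assignment path. */
static enum toy_assign_path toy_assign_default(struct toy_bus *bus)
{
	(void)bus;
	return TOY_ASSIGN_PATH_DEFAULT;
}

#ifdef CONFIG_TOY_PCI_FALLBACK_SIZING
/* Built only when a driver that wants the fallback selects the symbol. */
static enum toy_assign_path toy_assign_fallback(struct toy_bus *bus)
{
	(void)bus;
	return TOY_ASSIGN_PATH_FALLBACK;
}
#else
/* Without the symbol, callers silently get the default path. */
static inline enum toy_assign_path toy_assign_fallback(struct toy_bus *bus)
{
	return toy_assign_default(bus);
}
#endif
```

Built without the symbol defined, a call to toy_assign_fallback() costs
nothing beyond the default path, which is the property asked for above.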

