linux-pci.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Nicholas Johnson <nicholas.johnson-opensource@outlook.com.au>
To: "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Cc: "linux-pci@vger.kernel.org" <linux-pci@vger.kernel.org>,
	"bhelgaas@google.com" <bhelgaas@google.com>,
	"mika.westerberg@linux.intel.com"
	<mika.westerberg@linux.intel.com>,
	"corbet@lwn.net" <corbet@lwn.net>,
	Nicholas Johnson <nicholas.johnson-opensource@outlook.com.au>
Subject: [PATCH v6 3/4] PCI: Fix bug resulting in double hpmemsize being assigned to MMIO window
Date: Wed, 22 May 2019 14:31:11 +0000	[thread overview]
Message-ID: <PS2P216MB0642EEC485C82AF74AEEB89280000@PS2P216MB0642.KORP216.PROD.OUTLOOK.COM> (raw)
In-Reply-To: <20190522222928.2964-1-nicholas.johnson-opensource@outlook.com.au>

Background
==========================================================================

Solve bug report:
https://bugzilla.kernel.org/show_bug.cgi?id=203243

Currently, the kernel can sometimes assign the MMIO_PREF window
additional size into the MMIO window, resulting in double the MMIO
additional size, even if the MMIO_PREF window was successful.

This happens if in the first pass, the MMIO_PREF succeeds but the MMIO
fails. In the next pass, because MMIO_PREF is already assigned, the
attempt to assign MMIO_PREF returns an error code instead of success
(nothing more to do, already allocated).

Example of problem (more context can be found in the bug report URL):

Mainline kernel:
pci 0000:06:01.0: BAR 14: assigned [mem 0x90100000-0xa00fffff] = 256M
pci 0000:06:04.0: BAR 14: assigned [mem 0xa0200000-0xb01fffff] = 256M

Patched kernel:
pci 0000:06:01.0: BAR 14: assigned [mem 0x90100000-0x980fffff] = 128M
pci 0000:06:04.0: BAR 14: assigned [mem 0x98200000-0xa01fffff] = 128M

This was using pci=realloc,hpmemsize=128M,nocrs - on the same machine
with the same configuration, with a Ubuntu mainline kernel and a kernel
patched with this patch series.

This patch is vital for the next patch in the series. The next patch
allows the user to specify MMIO and MMIO_PREF independently. If the
MMIO_PREF is set to be very large, this bug will end up more than
doubling the MMIO size. The bug results in the MMIO_PREF being added to
the MMIO window, which means doubling if MMIO_PREF size == MMIO size.
With a large MMIO_PREF, without this patch, the MMIO window will likely
fail to be assigned altogether due to lack of 32-bit address space.

Patch notes
==========================================================================

Change find_free_bus_resource() to not skip assigned resources with
non-null parent.

Add checks in pbus_size_io() and pbus_size_mem() to return success if
resource returned from find_free_bus_resource() is already allocated.

This avoids pbus_size_io() and pbus_size_mem() returning error code to
__pci_bus_size_bridges() when a resource has been successfully assigned
in a previous pass. This fixes the existing behaviour where space for a
resource could be reserved multiple times in different parent bridge
windows. This also greatly reduces the number of failed BAR messages in
dmesg when Linux assigns resources.

Signed-off-by: Nicholas Johnson <nicholas.johnson-opensource@outlook.com.au>
---
 drivers/pci/setup-bus.c | 29 +++++++++++++++++++----------
 1 file changed, 19 insertions(+), 10 deletions(-)

diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index 5675254fa..8e1bc7ee7 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -752,13 +752,18 @@ static void pci_bridge_check_ranges(struct pci_bus *bus)
 }
 
 /*
- * Helper function for sizing routines: find first available bus resource
- * of a given type.  Note: we intentionally skip the bus resources which
- * have already been assigned (that is, have non-NULL parent resource).
+ * Helper function for sizing routines: find first bus resource of a given
+ * type. Note: we do not skip the bus resources which have already been
+ * assigned (r->parent != NULL). This is because a resource that is already
+ * assigned (nothing more to be done) will be indistinguishable from one that
+ * failed due to lack of space if we skip assigned resources. If the caller
+ * function cannot tell the difference then it might try to place the
+ * resources in a different window, doubling up on resources or causing
+ * unforeseeable issues.
  */
-static struct resource *find_free_bus_resource(struct pci_bus *bus,
-					       unsigned long type_mask,
-					       unsigned long type)
+struct resource *find_bus_resource_of_type(struct pci_bus *bus,
+						  unsigned long type_mask,
+						  unsigned long type)
 {
 	int i;
 	struct resource *r;
@@ -766,7 +771,7 @@ static struct resource *find_free_bus_resource(struct pci_bus *bus,
 	pci_bus_for_each_resource(bus, r, i) {
 		if (r == &ioport_resource || r == &iomem_resource)
 			continue;
-		if (r && (r->flags & type_mask) == type && !r->parent)
+		if (r && (r->flags & type_mask) == type)
 			return r;
 	}
 	return NULL;
@@ -866,14 +871,16 @@ static void pbus_size_io(struct pci_bus *bus, resource_size_t min_size,
 			 struct list_head *realloc_head)
 {
 	struct pci_dev *dev;
-	struct resource *b_res = find_free_bus_resource(bus, IORESOURCE_IO,
-							IORESOURCE_IO);
+	struct resource *b_res = find_bus_resource_of_type(bus, IORESOURCE_IO,
+					IORESOURCE_IO);
 	resource_size_t size = 0, size0 = 0, size1 = 0;
 	resource_size_t children_add_size = 0;
 	resource_size_t min_align, align;
 
 	if (!b_res)
 		return;
+	if (b_res->parent)
+		return;
 
 	min_align = window_alignment(bus, IORESOURCE_IO);
 	list_for_each_entry(dev, &bus->devices, bus_list) {
@@ -978,7 +985,7 @@ static int pbus_size_mem(struct pci_bus *bus, unsigned long mask,
 	resource_size_t min_align, align, size, size0, size1;
 	resource_size_t aligns[18]; /* Alignments from 1MB to 128GB */
 	int order, max_order;
-	struct resource *b_res = find_free_bus_resource(bus,
+	struct resource *b_res = find_bus_resource_of_type(bus,
 					mask | IORESOURCE_PREFETCH, type);
 	resource_size_t children_add_size = 0;
 	resource_size_t children_add_align = 0;
@@ -986,6 +993,8 @@ static int pbus_size_mem(struct pci_bus *bus, unsigned long mask,
 
 	if (!b_res)
 		return -ENOSPC;
+	if (b_res->parent)
+		return 0;
 
 	memset(aligns, 0, sizeof(aligns));
 	max_order = 0;
-- 
2.20.1


  parent reply	other threads:[~2019-05-22 14:31 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20190522222928.2964-1-nicholas.johnson-opensource@outlook.com.au>
2019-05-22 14:30 ` [PATCH v6 1/4] PCI: Consider alignment of hot-added bridges when distributing available resources Nicholas Johnson
2019-06-15 19:56   ` Bjorn Helgaas
2019-06-17  9:21     ` mika.westerberg
2019-06-19 13:40     ` Nicholas Johnson
2019-06-19 21:53       ` Bjorn Helgaas
2019-06-17  9:35   ` mika.westerberg
2019-06-17 18:22     ` Bjorn Helgaas
2019-06-19 13:47     ` Nicholas Johnson
2019-05-22 14:30 ` [PATCH v6 2/4] PCI: Modify extend_bridge_window() to set resource size directly Nicholas Johnson
2019-06-15 19:57   ` Bjorn Helgaas
2019-05-22 14:31 ` Nicholas Johnson [this message]
2019-05-22 14:31 ` [PATCH v6 4/4] PCI: Add pci=hpmemprefsize parameter to set MMIO_PREF size independently Nicholas Johnson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=PS2P216MB0642EEC485C82AF74AEEB89280000@PS2P216MB0642.KORP216.PROD.OUTLOOK.COM \
    --to=nicholas.johnson-opensource@outlook.com.au \
    --cc=bhelgaas@google.com \
    --cc=corbet@lwn.net \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=mika.westerberg@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).