All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v5 0/9] PCI: allocate space top-down, not bottom-up
@ 2010-10-26 21:41 Bjorn Helgaas
  2010-10-26 21:41 ` [PATCH v5 1/9] resources: add a default alignf to simplify find_resource() Bjorn Helgaas
                   ` (8 more replies)
  0 siblings, 9 replies; 12+ messages in thread
From: Bjorn Helgaas @ 2010-10-26 21:41 UTC (permalink / raw)
  To: Jesse Barnes
  Cc: Bob Picco, Brian Bloniarz, Charles Butterfield, Denys Vlasenko,
	Ingo Molnar, linux-pci, Horst H. von Brand, H. Peter Anvin,
	linux-kernel, Stefan Becker, Chuck Ebbert, Fabrice Bellet,
	Yinghai Lu, Leann Ogasawara, Linus Torvalds, Thomas Gleixner

When we move PCI devices, we currently allocate space bottom-up, i.e., we look
at PCI bus resources in the order we found them, we look at gaps between child
resources bottom-up, and we align the new space at the bottom of an available
region.

On x86, we move PCI devices more than we used to because we now pay attention
to the PCI host bridge windows from ACPI.  For example, when we find a device
that's outside all the known host bridge windows, we try to move it into a
window, and we look for space starting at the bottom.

Windows does similar device moves, but it looks for space top-down rather than
bottom-up.  Since most machines are better-tested with Windows than Linux, this
difference means that Linux is more likely to trip over BIOS bugs in the PCI
host bridge window descriptions than Windows is.

We've had several reports of Dell machines where the BIOS leaves the AHCI
controller outside the host bridge windows (BIOS bug #1), *and* the lowest
host bridge window includes an area that doesn't actually reach PCI (BIOS
bug #2).  The result is that Windows (which moves AHCI to the top of a window)
works fine, while Linux (which moves AHCI to the bottom, buggy, area) doesn't
work.

These patches change Linux to allocate space more like Windows does:

    1) The x86 pcibios_align_resource() will choose space from the
       end of an available area, not the beginning.

    2) In the generic allocate_resource() path, we'll look for space
       between existing children from the top, not from the bottom.

    3) When pci_bus_alloc_resource() looks for available space, it
       will start from the highest window, not the first one we found.

This series fixes a 2.6.34 regression that prevents many Dell Precision
workstations from booting:

    https://bugzilla.kernel.org/show_bug.cgi?id=16228

Changes from v4 to v5:
    - Fix for https://bugzilla.redhat.com/show_bug.cgi?id=646027
      reported by Fabrice Bellet <fabrice@bellet.info>
    - Declutter find_resource().
    - Catch overflow of "ALIGN(tmp.start)".

Changes from v3 to v4:
    - Use round_down() rather than adding ALIGN_DOWN().
    - Replace ARCH_HAS_TOP_DOWN_ALLOC #define with a boot-time architecture
      choice and add a "resource_alloc_from_bottom" command line option to
      revert to the old behavior (NOTE: this only affects allocate_resource(),
      not pcibios_align_resource() or pci_bus_alloc_resource()).
    - Fixed find_resource_from_top() again; it still didn't handle a child
      that ended at the parent's end correctly.

Changes from v2 to v3:
    - Fixes for https://bugzilla.redhat.com/show_bug.cgi?id=637647
      reported by Horst H. von Brand <vonbrand@inf.utfsm.cl>
    - Updated iomem_resource.end to reflect the end of usable physical address
      space.  Otherwise, we might allocate right up to 0xffffffff_ffffffff,
      which isn't usable.
    - Make allocate_resource() change conditional on ARCH_HAS_TOP_DOWN_ALLOC.
      Without arch-specific changes like the above, it's too dangerous to
      make this change for everybody at once.
    - Fix 64-bit wraparound in find_resource().  If the last child happened
      to end at ~0, we computed the highest available space as [child->end + 1,
      root->end], which makes us think the available space started at 0,
      which makes us return space that may already be allocated.

Changes from v1 to v2:
    - Moved check for allocating before the available area from
      pcibios_align_resource() to find_resource().  Better to do it
      after the alignment callback is done, and make it generic.
    - Fixed pcibios_align_resource() alignment.  If we start from the
      end of the available area, we must align *downward*, not upward.
    - Fixed pcibios_align_resource() ISA alias avoidance.  Again, since
      the starting point is the end of the area, we must align downward
      when we avoid aliased areas.
---

Bjorn Helgaas (9):
      resources: add a default alignf to simplify find_resource()
      resources: factor out resource_clip() to simplify find_resource()
      resources: ensure callback doesn't allocate outside available space
      resources: handle overflow when aligning start of available area
      resources: support allocating space within a region from the top down
      PCI: allocate bus resources from the top down
      x86/PCI: allocate space from the end of a region, not the beginning
      x86: update iomem_resource end based on CPU physical address capabilities
      x86: allocate space within a region top-down


 Documentation/kernel-parameters.txt |    5 +
 arch/x86/kernel/setup.c             |    2 
 arch/x86/pci/i386.c                 |   17 +++-
 drivers/pci/bus.c                   |   53 +++++++++++-
 include/linux/ioport.h              |    1 
 kernel/resource.c                   |  151 +++++++++++++++++++++++++++++++----
 6 files changed, 202 insertions(+), 27 deletions(-)

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH v5 1/9] resources: add a default alignf to simplify find_resource()
  2010-10-26 21:41 [PATCH v5 0/9] PCI: allocate space top-down, not bottom-up Bjorn Helgaas
@ 2010-10-26 21:41 ` Bjorn Helgaas
  2010-10-26 21:41 ` [PATCH v5 2/9] resources: factor out resource_clip() " Bjorn Helgaas
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 12+ messages in thread
From: Bjorn Helgaas @ 2010-10-26 21:41 UTC (permalink / raw)
  To: Jesse Barnes
  Cc: Bob Picco, Brian Bloniarz, Charles Butterfield, Denys Vlasenko,
	Ingo Molnar, linux-pci, Horst H. von Brand, H. Peter Anvin,
	linux-kernel, Stefan Becker, Chuck Ebbert, Fabrice Bellet,
	Yinghai Lu, Leann Ogasawara, Linus Torvalds, Thomas Gleixner


This removes a test from find_resource(), which is getting cluttered.
No functional change.

Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
---

 kernel/resource.c |   15 +++++++++++++--
 1 files changed, 13 insertions(+), 2 deletions(-)


diff --git a/kernel/resource.c b/kernel/resource.c
index 7b36976..7dc8ad2 100644
--- a/kernel/resource.c
+++ b/kernel/resource.c
@@ -357,6 +357,14 @@ int __weak page_is_ram(unsigned long pfn)
 	return walk_system_ram_range(pfn, 1, NULL, __is_ram) == 1;
 }
 
+static resource_size_t simple_align_resource(void *data,
+					     const struct resource *avail,
+					     resource_size_t size,
+					     resource_size_t align)
+{
+	return avail->start;
+}
+
 /*
  * Find empty slot in the resource tree given range and alignment.
  */
@@ -391,8 +399,8 @@ static int find_resource(struct resource *root, struct resource *new,
 		if (tmp.end > max)
 			tmp.end = max;
 		tmp.start = ALIGN(tmp.start, align);
-		if (alignf)
-			tmp.start = alignf(alignf_data, &tmp, size, align);
+
+		tmp.start = alignf(alignf_data, &tmp, size, align);
 		if (tmp.start < tmp.end && tmp.end - tmp.start >= size - 1) {
 			new->start = tmp.start;
 			new->end = tmp.start + size - 1;
@@ -428,6 +436,9 @@ int allocate_resource(struct resource *root, struct resource *new,
 {
 	int err;
 
+	if (!alignf)
+		alignf = simple_align_resource;
+
 	write_lock(&resource_lock);
 	err = find_resource(root, new, size, min, max, align, alignf, alignf_data);
 	if (err >= 0 && __request_resource(root, new))


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v5 2/9] resources: factor out resource_clip() to simplify find_resource()
  2010-10-26 21:41 [PATCH v5 0/9] PCI: allocate space top-down, not bottom-up Bjorn Helgaas
  2010-10-26 21:41 ` [PATCH v5 1/9] resources: add a default alignf to simplify find_resource() Bjorn Helgaas
@ 2010-10-26 21:41 ` Bjorn Helgaas
  2010-10-26 21:41 ` [PATCH v5 3/9] resources: ensure callback doesn't allocate outside available space Bjorn Helgaas
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 12+ messages in thread
From: Bjorn Helgaas @ 2010-10-26 21:41 UTC (permalink / raw)
  To: Jesse Barnes
  Cc: Bob Picco, Brian Bloniarz, Charles Butterfield, Denys Vlasenko,
	Ingo Molnar, linux-pci, Horst H. von Brand, H. Peter Anvin,
	linux-kernel, Stefan Becker, Chuck Ebbert, Fabrice Bellet,
	Yinghai Lu, Leann Ogasawara, Linus Torvalds, Thomas Gleixner


This factors out the min/max clipping to simplify find_resource().
No functional change.

Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
---

 kernel/resource.c |   15 +++++++++++----
 1 files changed, 11 insertions(+), 4 deletions(-)


diff --git a/kernel/resource.c b/kernel/resource.c
index 7dc8ad2..26e9f25 100644
--- a/kernel/resource.c
+++ b/kernel/resource.c
@@ -365,6 +365,15 @@ static resource_size_t simple_align_resource(void *data,
 	return avail->start;
 }
 
+static void resource_clip(struct resource *res, resource_size_t min,
+			  resource_size_t max)
+{
+	if (res->start < min)
+		res->start = min;
+	if (res->end > max)
+		res->end = max;
+}
+
 /*
  * Find empty slot in the resource tree given range and alignment.
  */
@@ -394,10 +403,8 @@ static int find_resource(struct resource *root, struct resource *new,
 			tmp.end = this->start - 1;
 		else
 			tmp.end = root->end;
-		if (tmp.start < min)
-			tmp.start = min;
-		if (tmp.end > max)
-			tmp.end = max;
+
+		resource_clip(&tmp, min, max);
 		tmp.start = ALIGN(tmp.start, align);
 
 		tmp.start = alignf(alignf_data, &tmp, size, align);


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v5 3/9] resources: ensure callback doesn't allocate outside available space
  2010-10-26 21:41 [PATCH v5 0/9] PCI: allocate space top-down, not bottom-up Bjorn Helgaas
  2010-10-26 21:41 ` [PATCH v5 1/9] resources: add a default alignf to simplify find_resource() Bjorn Helgaas
  2010-10-26 21:41 ` [PATCH v5 2/9] resources: factor out resource_clip() " Bjorn Helgaas
@ 2010-10-26 21:41 ` Bjorn Helgaas
  2010-10-26 21:41 ` [PATCH v5 4/9] resources: handle overflow when aligning start of available area Bjorn Helgaas
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 12+ messages in thread
From: Bjorn Helgaas @ 2010-10-26 21:41 UTC (permalink / raw)
  To: Jesse Barnes
  Cc: Bob Picco, Brian Bloniarz, Charles Butterfield, Denys Vlasenko,
	Ingo Molnar, linux-pci, Horst H. von Brand, H. Peter Anvin,
	linux-kernel, Stefan Becker, Chuck Ebbert, Fabrice Bellet,
	Yinghai Lu, Leann Ogasawara, Linus Torvalds, Thomas Gleixner


The alignment callback returns a proposed location, which may have been
adjusted to avoid ISA aliases or for other architecture-specific reasons.

We already had a check ("tmp.start < tmp.end") to make sure the callback
doesn't return an area that extends past the available area.  This patch
reworks the check to make sure it doesn't return an area that extends
either below or above the available area.

Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
---

 kernel/resource.c |   16 +++++++++++-----
 1 files changed, 11 insertions(+), 5 deletions(-)


diff --git a/kernel/resource.c b/kernel/resource.c
index 26e9f25..89d5041 100644
--- a/kernel/resource.c
+++ b/kernel/resource.c
@@ -374,6 +374,11 @@ static void resource_clip(struct resource *res, resource_size_t min,
 		res->end = max;
 }
 
+static bool resource_contains(struct resource *res1, struct resource *res2)
+{
+	return res1->start <= res2->start && res1->end >= res2->end;
+}
+
 /*
  * Find empty slot in the resource tree given range and alignment.
  */
@@ -387,7 +392,7 @@ static int find_resource(struct resource *root, struct resource *new,
 			 void *alignf_data)
 {
 	struct resource *this = root->child;
-	struct resource tmp = *new;
+	struct resource tmp = *new, alloc;
 
 	tmp.start = root->start;
 	/*
@@ -407,10 +412,11 @@ static int find_resource(struct resource *root, struct resource *new,
 		resource_clip(&tmp, min, max);
 		tmp.start = ALIGN(tmp.start, align);
 
-		tmp.start = alignf(alignf_data, &tmp, size, align);
-		if (tmp.start < tmp.end && tmp.end - tmp.start >= size - 1) {
-			new->start = tmp.start;
-			new->end = tmp.start + size - 1;
+		alloc.start = alignf(alignf_data, &tmp, size, align);
+		alloc.end = alloc.start + size - 1;
+		if (resource_contains(&tmp, &alloc)) {
+			new->start = alloc.start;
+			new->end = alloc.end;
 			return 0;
 		}
 		if (!this)


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v5 4/9] resources: handle overflow when aligning start of available area
  2010-10-26 21:41 [PATCH v5 0/9] PCI: allocate space top-down, not bottom-up Bjorn Helgaas
                   ` (2 preceding siblings ...)
  2010-10-26 21:41 ` [PATCH v5 3/9] resources: ensure callback doesn't allocate outside available space Bjorn Helgaas
@ 2010-10-26 21:41 ` Bjorn Helgaas
  2010-10-26 21:41 ` [PATCH v5 5/9] resources: support allocating space within a region from the top down Bjorn Helgaas
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 12+ messages in thread
From: Bjorn Helgaas @ 2010-10-26 21:41 UTC (permalink / raw)
  To: Jesse Barnes
  Cc: Bob Picco, Brian Bloniarz, Charles Butterfield, Denys Vlasenko,
	Ingo Molnar, linux-pci, Horst H. von Brand, H. Peter Anvin,
	linux-kernel, Stefan Becker, Chuck Ebbert, Fabrice Bellet,
	Yinghai Lu, Leann Ogasawara, Linus Torvalds, Thomas Gleixner


If tmp.start is near ~0, ALIGN(tmp.start) may overflow, which would
make us think there's more available space than there really is.  We
would likely return something that conflicts with a previous resource,
which would cause a failure when allocate_resource() requests the newly-
allocated region.

Reference: https://bugzilla.redhat.com/show_bug.cgi?id=646027
Reported-by: Fabrice Bellet <fabrice@bellet.info>
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
---

 kernel/resource.c |   21 +++++++++++++--------
 1 files changed, 13 insertions(+), 8 deletions(-)


diff --git a/kernel/resource.c b/kernel/resource.c
index 89d5041..e15b922 100644
--- a/kernel/resource.c
+++ b/kernel/resource.c
@@ -392,7 +392,7 @@ static int find_resource(struct resource *root, struct resource *new,
 			 void *alignf_data)
 {
 	struct resource *this = root->child;
-	struct resource tmp = *new, alloc;
+	struct resource tmp = *new, avail, alloc;
 
 	tmp.start = root->start;
 	/*
@@ -410,14 +410,19 @@ static int find_resource(struct resource *root, struct resource *new,
 			tmp.end = root->end;
 
 		resource_clip(&tmp, min, max);
-		tmp.start = ALIGN(tmp.start, align);
 
-		alloc.start = alignf(alignf_data, &tmp, size, align);
-		alloc.end = alloc.start + size - 1;
-		if (resource_contains(&tmp, &alloc)) {
-			new->start = alloc.start;
-			new->end = alloc.end;
-			return 0;
+		/* Check for overflow after ALIGN() */
+		avail = *new;
+		avail.start = ALIGN(tmp.start, align);
+		avail.end = tmp.end;
+		if (avail.start >= tmp.start) {
+			alloc.start = alignf(alignf_data, &avail, size, align);
+			alloc.end = alloc.start + size - 1;
+			if (resource_contains(&avail, &alloc)) {
+				new->start = alloc.start;
+				new->end = alloc.end;
+				return 0;
+			}
 		}
 		if (!this)
 			break;


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v5 5/9] resources: support allocating space within a region from the top down
  2010-10-26 21:41 [PATCH v5 0/9] PCI: allocate space top-down, not bottom-up Bjorn Helgaas
                   ` (3 preceding siblings ...)
  2010-10-26 21:41 ` [PATCH v5 4/9] resources: handle overflow when aligning start of available area Bjorn Helgaas
@ 2010-10-26 21:41 ` Bjorn Helgaas
  2010-10-26 21:41 ` [PATCH v5 6/9] PCI: allocate bus resources " Bjorn Helgaas
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 12+ messages in thread
From: Bjorn Helgaas @ 2010-10-26 21:41 UTC (permalink / raw)
  To: Jesse Barnes
  Cc: Bob Picco, Brian Bloniarz, Charles Butterfield, Denys Vlasenko,
	Ingo Molnar, linux-pci, Horst H. von Brand, H. Peter Anvin,
	linux-kernel, Stefan Becker, Chuck Ebbert, Fabrice Bellet,
	Yinghai Lu, Leann Ogasawara, Linus Torvalds, Thomas Gleixner


Allocate space from the top of a region first, then work downward,
if an architecture desires this.

When we allocate space from a resource, we look for gaps between children
of the resource.  Previously, we always looked at gaps from the bottom up.
For example, given this:

    [mem 0xbff00000-0xf7ffffff] PCI Bus 0000:00
      [mem 0xbff00000-0xbfffffff] gap -- available
      [mem 0xc0000000-0xdfffffff] PCI Bus 0000:02
      [mem 0xe0000000-0xf7ffffff] gap -- available

we attempted to allocate from the [mem 0xbff00000-0xbfffffff] gap first,
then the [mem 0xe0000000-0xf7ffffff] gap.

With this patch an architecture can choose to allocate from the top gap
[mem 0xe0000000-0xf7ffffff] first.

We can't do this across the board because iomem_resource.end is initialized
to 0xffffffff_ffffffff on 64-bit architectures, and most machines can't
address the entire 64-bit physical address space.  Therefore, we only
allocate top-down if the arch requests it by clearing
"resource_alloc_from_bottom".

Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
---

 Documentation/kernel-parameters.txt |    5 ++
 include/linux/ioport.h              |    1 
 kernel/resource.c                   |   98 ++++++++++++++++++++++++++++++++++-
 3 files changed, 100 insertions(+), 4 deletions(-)


diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 4bc2f3c..ed45e98 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -2175,6 +2175,11 @@ and is between 256 and 4096 characters. It is defined in the file
 	reset_devices	[KNL] Force drivers to reset the underlying device
 			during initialization.
 
+	resource_alloc_from_bottom
+			Allocate new resources from the beginning of available
+			space, not the end.  If you need to use this, please
+			report a bug.
+
 	resume=		[SWSUSP]
 			Specify the partition device for software suspend
 
diff --git a/include/linux/ioport.h b/include/linux/ioport.h
index b227902..d377ea8 100644
--- a/include/linux/ioport.h
+++ b/include/linux/ioport.h
@@ -112,6 +112,7 @@ struct resource_list {
 /* PC/ISA/whatever - the normal PC address spaces: IO and memory */
 extern struct resource ioport_resource;
 extern struct resource iomem_resource;
+extern int resource_alloc_from_bottom;
 
 extern struct resource *request_resource_conflict(struct resource *root, struct resource *new);
 extern int request_resource(struct resource *root, struct resource *new);
diff --git a/kernel/resource.c b/kernel/resource.c
index e15b922..716b680 100644
--- a/kernel/resource.c
+++ b/kernel/resource.c
@@ -40,6 +40,23 @@ EXPORT_SYMBOL(iomem_resource);
 
 static DEFINE_RWLOCK(resource_lock);
 
+/*
+ * By default, we allocate free space bottom-up.  The architecture can request
+ * top-down by clearing this flag.  The user can override the architecture's
+ * choice with the "resource_alloc_from_bottom" kernel boot option, but that
+ * should only be a debugging tool.
+ */
+int resource_alloc_from_bottom = 1;
+
+static __init int setup_alloc_from_bottom(char *s)
+{
+	printk(KERN_INFO
+	       "resource: allocating from bottom-up; please report a bug\n");
+	resource_alloc_from_bottom = 1;
+	return 0;
+}
+early_param("resource_alloc_from_bottom", setup_alloc_from_bottom);
+
 static void *r_next(struct seq_file *m, void *v, loff_t *pos)
 {
 	struct resource *p = v;
@@ -380,7 +397,74 @@ static bool resource_contains(struct resource *res1, struct resource *res2)
 }
 
 /*
+ * Find the resource before "child" in the sibling list of "root" children.
+ */
+static struct resource *find_sibling_prev(struct resource *root, struct resource *child)
+{
+	struct resource *this;
+
+	for (this = root->child; this; this = this->sibling)
+		if (this->sibling == child)
+			return this;
+
+	return NULL;
+}
+
+/*
+ * Find empty slot in the resource tree given range and alignment.
+ * This version allocates from the end of the root resource first.
+ */
+static int find_resource_from_top(struct resource *root, struct resource *new,
+				  resource_size_t size, resource_size_t min,
+				  resource_size_t max, resource_size_t align,
+				  resource_size_t (*alignf)(void *,
+						   const struct resource *,
+						   resource_size_t,
+						   resource_size_t),
+				  void *alignf_data)
+{
+	struct resource *this;
+	struct resource tmp, avail, alloc;
+
+	tmp.start = root->end;
+	tmp.end = root->end;
+
+	this = find_sibling_prev(root, NULL);
+	for (;;) {
+		if (this) {
+			if (this->end < root->end)
+				tmp.start = this->end + 1;
+		} else
+			tmp.start = root->start;
+
+		resource_clip(&tmp, min, max);
+
+		/* Check for overflow after ALIGN() */
+		avail = *new;
+		avail.start = ALIGN(tmp.start, align);
+		avail.end = tmp.end;
+		if (avail.start >= tmp.start) {
+			alloc.start = alignf(alignf_data, &avail, size, align);
+			alloc.end = alloc.start + size - 1;
+			if (resource_contains(&avail, &alloc)) {
+				new->start = alloc.start;
+				new->end = alloc.end;
+				return 0;
+			}
+		}
+
+		if (!this || this->start == root->start)
+			break;
+
+		tmp.end = this->start - 1;
+		this = find_sibling_prev(root, this);
+	}
+	return -EBUSY;
+}
+
+/*
  * Find empty slot in the resource tree given range and alignment.
+ * This version allocates from the beginning of the root resource first.
  */
 static int find_resource(struct resource *root, struct resource *new,
 			 resource_size_t size, resource_size_t min,
@@ -396,14 +480,15 @@ static int find_resource(struct resource *root, struct resource *new,
 
 	tmp.start = root->start;
 	/*
-	 * Skip past an allocated resource that starts at 0, since the assignment
-	 * of this->start - 1 to tmp->end below would cause an underflow.
+	 * Skip past an allocated resource that starts at 0, since the
+	 * assignment of this->start - 1 to tmp->end below would cause an
+	 * underflow.
 	 */
 	if (this && this->start == 0) {
 		tmp.start = this->end + 1;
 		this = this->sibling;
 	}
-	for(;;) {
+	for (;;) {
 		if (this)
 			tmp.end = this->start - 1;
 		else
@@ -424,8 +509,10 @@ static int find_resource(struct resource *root, struct resource *new,
 				return 0;
 			}
 		}
+
 		if (!this)
 			break;
+
 		tmp.start = this->end + 1;
 		this = this->sibling;
 	}
@@ -458,7 +545,10 @@ int allocate_resource(struct resource *root, struct resource *new,
 		alignf = simple_align_resource;
 
 	write_lock(&resource_lock);
-	err = find_resource(root, new, size, min, max, align, alignf, alignf_data);
+	if (resource_alloc_from_bottom)
+		err = find_resource(root, new, size, min, max, align, alignf, alignf_data);
+	else
+		err = find_resource_from_top(root, new, size, min, max, align, alignf, alignf_data);
 	if (err >= 0 && __request_resource(root, new))
 		err = -EBUSY;
 	write_unlock(&resource_lock);


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v5 6/9] PCI: allocate bus resources from the top down
  2010-10-26 21:41 [PATCH v5 0/9] PCI: allocate space top-down, not bottom-up Bjorn Helgaas
                   ` (4 preceding siblings ...)
  2010-10-26 21:41 ` [PATCH v5 5/9] resources: support allocating space within a region from the top down Bjorn Helgaas
@ 2010-10-26 21:41 ` Bjorn Helgaas
  2010-10-26 21:41 ` [PATCH v5 7/9] x86/PCI: allocate space from the end of a region, not the beginning Bjorn Helgaas
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 12+ messages in thread
From: Bjorn Helgaas @ 2010-10-26 21:41 UTC (permalink / raw)
  To: Jesse Barnes
  Cc: Bob Picco, Brian Bloniarz, Charles Butterfield, Denys Vlasenko,
	Ingo Molnar, linux-pci, Horst H. von Brand, H. Peter Anvin,
	linux-kernel, Stefan Becker, Chuck Ebbert, Fabrice Bellet,
	Yinghai Lu, Leann Ogasawara, Linus Torvalds, Thomas Gleixner


Allocate space from the highest-address PCI bus resource first, then work
downward.

Previously, we looked for space in PCI host bridge windows in the order
we discovered the windows.  For example, given the following windows
(discovered via an ACPI _CRS method):

    pci_root PNP0A03:00: host bridge window [mem 0x000a0000-0x000bffff]
    pci_root PNP0A03:00: host bridge window [mem 0x000c0000-0x000effff]
    pci_root PNP0A03:00: host bridge window [mem 0x000f0000-0x000fffff]
    pci_root PNP0A03:00: host bridge window [mem 0xbff00000-0xf7ffffff]
    pci_root PNP0A03:00: host bridge window [mem 0xff980000-0xff980fff]
    pci_root PNP0A03:00: host bridge window [mem 0xff97c000-0xff97ffff]
    pci_root PNP0A03:00: host bridge window [mem 0xfed20000-0xfed9ffff]

we attempted to allocate from [mem 0x000a0000-0x000bffff] first, then
[mem 0x000c0000-0x000effff], and so on.

With this patch, we allocate from [mem 0xff980000-0xff980fff] first, then
[mem 0xff97c000-0xff97ffff], [mem 0xfed20000-0xfed9ffff], etc.

Allocating top-down follows Windows practice, so we're less likely to
trip over BIOS defects in the _CRS description.

On the machine above (a Dell T3500), the [mem 0xbff00000-0xbfffffff] region
doesn't actually work and is likely a BIOS defect.  The symptom is that we
move the AHCI controller to 0xbff00000, which leads to "Boot has failed,
sleeping forever," a BUG in ahci_stop_engine(), or some other boot failure.

Reference: https://bugzilla.kernel.org/show_bug.cgi?id=16228#c43
Reference: https://bugzilla.redhat.com/show_bug.cgi?id=620313
Reference: https://bugzilla.redhat.com/show_bug.cgi?id=629933
Reported-by: Brian Bloniarz <phunge0@hotmail.com>
Reported-and-tested-by: Stefan Becker <chemobejk@gmail.com>
Reported-by: Denys Vlasenko <dvlasenk@redhat.com>
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
---

 drivers/pci/bus.c |   53 ++++++++++++++++++++++++++++++++++++++++++++++++-----
 1 files changed, 48 insertions(+), 5 deletions(-)


diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c
index 7f0af0e..172bf26 100644
--- a/drivers/pci/bus.c
+++ b/drivers/pci/bus.c
@@ -64,6 +64,49 @@ void pci_bus_remove_resources(struct pci_bus *bus)
 	}
 }
 
+/*
+ * Find the highest-address bus resource below the cursor "res".  If the
+ * cursor is NULL, return the highest resource.
+ */
+static struct resource *pci_bus_find_resource_prev(struct pci_bus *bus,
+						   unsigned int type,
+						   struct resource *res)
+{
+	struct resource *r, *prev = NULL;
+	int i;
+
+	pci_bus_for_each_resource(bus, r, i) {
+		if (!r)
+			continue;
+
+		if ((r->flags & IORESOURCE_TYPE_BITS) != type)
+			continue;
+
+		/* If this resource is at or past the cursor, skip it */
+		if (res) {
+			if (r == res)
+				continue;
+			if (r->end > res->end)
+				continue;
+			if (r->end == res->end && r->start > res->start)
+				continue;
+		}
+
+		if (!prev)
+			prev = r;
+
+		/*
+		 * A small resource is higher than a large one that ends at
+		 * the same address.
+		 */
+		if (r->end > prev->end ||
+		    (r->end == prev->end && r->start > prev->start))
+			prev = r;
+	}
+
+	return prev;
+}
+
 /**
  * pci_bus_alloc_resource - allocate a resource from a parent bus
  * @bus: PCI bus
@@ -89,9 +132,10 @@ pci_bus_alloc_resource(struct pci_bus *bus, struct resource *res,
 					  resource_size_t),
 		void *alignf_data)
 {
-	int i, ret = -ENOMEM;
+	int ret = -ENOMEM;
 	struct resource *r;
 	resource_size_t max = -1;
+	unsigned int type = res->flags & IORESOURCE_TYPE_BITS;
 
 	type_mask |= IORESOURCE_IO | IORESOURCE_MEM;
 
@@ -99,10 +143,9 @@ pci_bus_alloc_resource(struct pci_bus *bus, struct resource *res,
 	if (!(res->flags & IORESOURCE_MEM_64))
 		max = PCIBIOS_MAX_MEM_32;
 
-	pci_bus_for_each_resource(bus, r, i) {
-		if (!r)
-			continue;
-
+	/* Look for space at highest addresses first */
+	r = pci_bus_find_resource_prev(bus, type, NULL);
+	for ( ; r; r = pci_bus_find_resource_prev(bus, type, r)) {
 		/* type_mask must match */
 		if ((res->flags ^ r->flags) & type_mask)
 			continue;


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v5 7/9] x86/PCI: allocate space from the end of a region, not the beginning
  2010-10-26 21:41 [PATCH v5 0/9] PCI: allocate space top-down, not bottom-up Bjorn Helgaas
                   ` (5 preceding siblings ...)
  2010-10-26 21:41 ` [PATCH v5 6/9] PCI: allocate bus resources " Bjorn Helgaas
@ 2010-10-26 21:41 ` Bjorn Helgaas
  2010-10-26 21:41 ` [PATCH v5 8/9] x86: update iomem_resource end based on CPU physical address capabilities Bjorn Helgaas
  2010-10-26 21:41 ` [PATCH v5 9/9] x86: allocate space within a region top-down Bjorn Helgaas
  8 siblings, 0 replies; 12+ messages in thread
From: Bjorn Helgaas @ 2010-10-26 21:41 UTC (permalink / raw)
  To: Jesse Barnes
  Cc: Bob Picco, Brian Bloniarz, Charles Butterfield, Denys Vlasenko,
	Ingo Molnar, linux-pci, Horst H. von Brand, H. Peter Anvin,
	linux-kernel, Stefan Becker, Chuck Ebbert, Fabrice Bellet,
	Yinghai Lu, Leann Ogasawara, Linus Torvalds, Thomas Gleixner


Allocate from the end of a region, not the beginning.

For example, if we need to allocate 0x800 bytes for a device on bus
0000:00 given these resources:

    [mem 0xbff00000-0xdfffffff] PCI Bus 0000:00
      [mem 0xc0000000-0xdfffffff] PCI Bus 0000:02

the available space at [mem 0xbff00000-0xbfffffff] is passed to the
alignment callback (pcibios_align_resource()).  Prior to this patch, we
would put the new 0x800 byte resource at the beginning of that available
space, i.e., at [mem 0xbff00000-0xbff007ff].

With this patch, we put it at the end, at [mem 0xbffff800-0xbfffffff].

Reference: https://bugzilla.kernel.org/show_bug.cgi?id=16228#c41
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
---

 arch/x86/pci/i386.c |   17 +++++++++++------
 1 files changed, 11 insertions(+), 6 deletions(-)


diff --git a/arch/x86/pci/i386.c b/arch/x86/pci/i386.c
index 5525309..826140a 100644
--- a/arch/x86/pci/i386.c
+++ b/arch/x86/pci/i386.c
@@ -65,16 +65,21 @@ pcibios_align_resource(void *data, const struct resource *res,
 			resource_size_t size, resource_size_t align)
 {
 	struct pci_dev *dev = data;
-	resource_size_t start = res->start;
+	resource_size_t start = round_down(res->end - size + 1, align);
 
 	if (res->flags & IORESOURCE_IO) {
-		if (skip_isa_ioresource_align(dev))
-			return start;
-		if (start & 0x300)
-			start = (start + 0x3ff) & ~0x3ff;
+
+		/*
+		 * If we're avoiding ISA aliases, the largest contiguous I/O
+		 * port space is 256 bytes.  Clearing bits 9 and 10 preserves
+		 * all 256-byte and smaller alignments, so the result will
+		 * still be correctly aligned.
+		 */
+		if (!skip_isa_ioresource_align(dev))
+			start &= ~0x300;
 	} else if (res->flags & IORESOURCE_MEM) {
 		if (start < BIOS_END)
-			start = BIOS_END;
+			start = res->end;	/* fail; no space */
 	}
 	return start;
 }


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v5 8/9] x86: update iomem_resource end based on CPU physical address capabilities
  2010-10-26 21:41 [PATCH v5 0/9] PCI: allocate space top-down, not bottom-up Bjorn Helgaas
                   ` (6 preceding siblings ...)
  2010-10-26 21:41 ` [PATCH v5 7/9] x86/PCI: allocate space from the end of a region, not the beginning Bjorn Helgaas
@ 2010-10-26 21:41 ` Bjorn Helgaas
  2010-10-26 21:41 ` [PATCH v5 9/9] x86: allocate space within a region top-down Bjorn Helgaas
  8 siblings, 0 replies; 12+ messages in thread
From: Bjorn Helgaas @ 2010-10-26 21:41 UTC (permalink / raw)
  To: Jesse Barnes
  Cc: Bob Picco, Brian Bloniarz, Charles Butterfield, Denys Vlasenko,
	Ingo Molnar, linux-pci, Horst H. von Brand, H. Peter Anvin,
	linux-kernel, Stefan Becker, Chuck Ebbert, Fabrice Bellet,
	Yinghai Lu, Leann Ogasawara, Linus Torvalds, Thomas Gleixner


The iomem_resource map reflects the available physical address space.
We statically initialize the end to -1, i.e., 0xffffffff_ffffffff, but
of course we can only use as much as the CPU can address.

This patch updates the end based on the CPU capabilities, so we don't
mistakenly allocate space that isn't usable, as we're likely to do when
allocating from the top-down.

Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
---

 arch/x86/kernel/setup.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)


diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 95a3274..85268f8 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -769,6 +769,7 @@ void __init setup_arch(char **cmdline_p)
 
 	x86_init.oem.arch_setup();
 
+	iomem_resource.end = (1ULL << boot_cpu_data.x86_phys_bits) - 1;
 	setup_memory_map();
 	parse_setup_data();
 	/* update the e820_saved too */


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v5 9/9] x86: allocate space within a region top-down
  2010-10-26 21:41 [PATCH v5 0/9] PCI: allocate space top-down, not bottom-up Bjorn Helgaas
                   ` (7 preceding siblings ...)
  2010-10-26 21:41 ` [PATCH v5 8/9] x86: update iomem_resource end based on CPU physical address capabilities Bjorn Helgaas
@ 2010-10-26 21:41 ` Bjorn Helgaas
  2010-10-26 22:34   ` Jesse Barnes
  8 siblings, 1 reply; 12+ messages in thread
From: Bjorn Helgaas @ 2010-10-26 21:41 UTC (permalink / raw)
  To: Jesse Barnes
  Cc: Bob Picco, Brian Bloniarz, Charles Butterfield, Denys Vlasenko,
	Ingo Molnar, linux-pci, Horst H. von Brand, H. Peter Anvin,
	linux-kernel, Stefan Becker, Chuck Ebbert, Fabrice Bellet,
	Yinghai Lu, Leann Ogasawara, Linus Torvalds, Thomas Gleixner


Request that allocate_resource() use available space from high addresses
first, rather than the default of using low addresses first.

The most common place this makes a difference is when we move or assign
new PCI device resources.  Low addresses are generally scarce, so it's
better to use high addresses when possible.  This follows Windows practice
for PCI allocation.

Reference: https://bugzilla.kernel.org/show_bug.cgi?id=16228#c42
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
---

 arch/x86/kernel/setup.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)


diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 85268f8..21c6746 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -769,6 +769,7 @@ void __init setup_arch(char **cmdline_p)
 
 	x86_init.oem.arch_setup();
 
+	resource_alloc_from_bottom = 0;
 	iomem_resource.end = (1ULL << boot_cpu_data.x86_phys_bits) - 1;
 	setup_memory_map();
 	parse_setup_data();


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH v5 9/9] x86: allocate space within a region top-down
  2010-10-26 21:41 ` [PATCH v5 9/9] x86: allocate space within a region top-down Bjorn Helgaas
@ 2010-10-26 22:34   ` Jesse Barnes
  2010-10-27  6:23     ` Ingo Molnar
  0 siblings, 1 reply; 12+ messages in thread
From: Jesse Barnes @ 2010-10-26 22:34 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Bob Picco, Brian Bloniarz, Charles Butterfield, Denys Vlasenko,
	Ingo Molnar, linux-pci, Horst H. von Brand, H. Peter Anvin,
	linux-kernel, Stefan Becker, Chuck Ebbert, Fabrice Bellet,
	Yinghai Lu, Leann Ogasawara, Linus Torvalds, Thomas Gleixner

On Tue, 26 Oct 2010 15:41:54 -0600
Bjorn Helgaas <bjorn.helgaas@hp.com> wrote:

> 
> Request that allocate_resource() use available space from high addresses
> first, rather than the default of using low addresses first.
> 
> The most common place this makes a difference is when we move or assign
> new PCI device resources.  Low addresses are generally scarce, so it's
> better to use high addresses when possible.  This follows Windows practice
> for PCI allocation.
> 
> Reference: https://bugzilla.kernel.org/show_bug.cgi?id=16228#c42
> Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
> ---

Ok, applied this version to -next.  It's been going through distros and
lots of testing despite a lack of recent coverage in PCI -next, so I'm
planning to send it to Linus this week.  We probably won't catch much
more until it hits his tree anyway...

Thanks,
-- 
Jesse Barnes, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v5 9/9] x86: allocate space within a region top-down
  2010-10-26 22:34   ` Jesse Barnes
@ 2010-10-27  6:23     ` Ingo Molnar
  0 siblings, 0 replies; 12+ messages in thread
From: Ingo Molnar @ 2010-10-27  6:23 UTC (permalink / raw)
  To: Jesse Barnes
  Cc: Bjorn Helgaas, Bob Picco, Brian Bloniarz, Charles Butterfield,
	Denys Vlasenko, linux-pci, Horst H. von Brand, H. Peter Anvin,
	linux-kernel, Stefan Becker, Chuck Ebbert, Fabrice Bellet,
	Yinghai Lu, Leann Ogasawara, Linus Torvalds, Thomas Gleixner


* Jesse Barnes <jbarnes@virtuousgeek.org> wrote:

> On Tue, 26 Oct 2010 15:41:54 -0600
> Bjorn Helgaas <bjorn.helgaas@hp.com> wrote:
> 
> > Request that allocate_resource() use available space from high addresses first, 
> > rather than the default of using low addresses first.
> > 
> > The most common place this makes a difference is when we move or assign new PCI 
> > device resources.  Low addresses are generally scarce, so it's better to use 
> > high addresses when possible.  This follows Windows practice for PCI allocation.
> > 
> > Reference: https://bugzilla.kernel.org/show_bug.cgi?id=16228#c42
> > Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
> > ---
> 
> Ok, applied this version to -next.  It's been going through distros and lots of 
> testing despite a lack of recent coverage in PCI -next, so I'm planning to send it 
> to Linus this week.  We probably won't catch much more until it hits his tree 
> anyway...

It also appears to fix a serious compatibility bug (and a v2.6.34 regression), so 
even without extensive distro testing it's wise to push this upstream ASAP IMHO ...

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2010-10-27  6:24 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-10-26 21:41 [PATCH v5 0/9] PCI: allocate space top-down, not bottom-up Bjorn Helgaas
2010-10-26 21:41 ` [PATCH v5 1/9] resources: add a default alignf to simplify find_resource() Bjorn Helgaas
2010-10-26 21:41 ` [PATCH v5 2/9] resources: factor out resource_clip() " Bjorn Helgaas
2010-10-26 21:41 ` [PATCH v5 3/9] resources: ensure callback doesn't allocate outside available space Bjorn Helgaas
2010-10-26 21:41 ` [PATCH v5 4/9] resources: handle overflow when aligning start of available area Bjorn Helgaas
2010-10-26 21:41 ` [PATCH v5 5/9] resources: support allocating space within a region from the top down Bjorn Helgaas
2010-10-26 21:41 ` [PATCH v5 6/9] PCI: allocate bus resources " Bjorn Helgaas
2010-10-26 21:41 ` [PATCH v5 7/9] x86/PCI: allocate space from the end of a region, not the beginning Bjorn Helgaas
2010-10-26 21:41 ` [PATCH v5 8/9] x86: update iomem_resource end based on CPU physical address capabilities Bjorn Helgaas
2010-10-26 21:41 ` [PATCH v5 9/9] x86: allocate space within a region top-down Bjorn Helgaas
2010-10-26 22:34   ` Jesse Barnes
2010-10-27  6:23     ` Ingo Molnar

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.