linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] Intel Alder IOAPIC fix
@ 2004-01-12  2:55 James Bottomley
  2004-01-12 21:24 ` Linus Torvalds
  0 siblings, 1 reply; 16+ messages in thread
From: James Bottomley @ 2004-01-12  2:55 UTC (permalink / raw)
  To: Andrew Morton, Linus Torvalds; +Cc: Linux Kernel

The intel alder motherboard really dislikes the way the current kernel
reassigns all PCI resources.  It exports 6 memory bars from its Extended
Express System Support Controller, but if the system touches any of
them, it disables the secondary IO-APIC.

The system is bootable if you disable all IO-APICs apart from the
primary, but it does become a bit crowded in interrupt space.  The patch
fixes the problem by adding a quirk to clear the first six memory bars.

James

# This is a BitKeeper generated patch for the following project:
# Project Name: Linux kernel tree
# This patch format is intended for GNU patch command version 2.5 or higher.
# This patch includes the following deltas:
#	           ChangeSet	1.1530  -> 1.1531 
#	drivers/pci/quirks.c	1.38    -> 1.39   
#	include/linux/pci_ids.h	1.130   -> 1.131  
#
# The following is the BitKeeper ChangeSet Log
# --------------------------------------------
# 04/01/11	root@claymoor.il.steeleye.com	1.1531
# Fix intel alder boot failure
# 
# The alder has an intel Extended Express System Support Controller
# which presents apparently spurious BARs.  When the pci resource
# code tries to reassign these BARs, the second IO-APIC gets disabled
# (with disastrous consequences).
# 
# The patch adds a quirk to clear these BARs.
# --------------------------------------------
#
diff -Nru a/drivers/pci/quirks.c b/drivers/pci/quirks.c
--- a/drivers/pci/quirks.c	Sun Jan 11 20:51:06 2004
+++ b/drivers/pci/quirks.c	Sun Jan 11 20:51:06 2004
@@ -786,6 +786,28 @@
 	sis_96x_compatible = 1;
 }
 
+#ifdef CONFIG_X86_IO_APIC
+static void __init quirk_alder_ioapic(struct pci_dev *pdev)
+{
+	struct resource *res;
+	int i;
+	
+	if ((pdev->class >> 8) != 0xff00)
+		return;
+	
+	res = &pdev->resource[0];
+	/* The first six bars must be cleared otherwise the IO-APIC
+	 * will throw a wobbly when the pci resource allocation tries
+	 * to reassign them */
+	for(i=0; i < 6; i++) {
+		res[i].start = 0;
+		res[i].end = 0;
+		res[i].flags = 0;
+	}
+	
+}
+#endif
+
 #ifdef CONFIG_SCSI_SATA
 static void __init quirk_intel_ide_combined(struct pci_dev *pdev)
 {
@@ -910,6 +932,7 @@
 	{ PCI_FIXUP_FINAL,	PCI_VENDOR_ID_SI,	PCI_ANY_ID,			quirk_ioapic_rmw },
         { PCI_FIXUP_FINAL,      PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_8131_APIC,
           quirk_amd_8131_ioapic }, 
+	{ PCI_FIXUP_HEADER,	PCI_VENDOR_ID_INTEL,	PCI_DEVICE_ID_INTEL_EESSC,	quirk_alder_ioapic },
 #endif
 	{ PCI_FIXUP_HEADER,	PCI_VENDOR_ID_VIA,	PCI_DEVICE_ID_VIA_82C586_3,	quirk_via_acpi },
 	{ PCI_FIXUP_HEADER,	PCI_VENDOR_ID_VIA,	PCI_DEVICE_ID_VIA_82C686_4,	quirk_via_acpi },
diff -Nru a/include/linux/pci_ids.h b/include/linux/pci_ids.h
--- a/include/linux/pci_ids.h	Sun Jan 11 20:51:06 2004
+++ b/include/linux/pci_ids.h	Sun Jan 11 20:51:06 2004
@@ -1901,6 +1901,7 @@
 #define PCI_DEVICE_ID_GENROCO_HFP832	0x0003
 
 #define PCI_VENDOR_ID_INTEL		0x8086
+#define PCI_DEVICE_ID_INTEL_EESSC	0x0008
 #define PCI_DEVICE_ID_INTEL_21145	0x0039
 #define PCI_DEVICE_ID_INTEL_82375	0x0482
 #define PCI_DEVICE_ID_INTEL_82424	0x0483


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] Intel Alder IOAPIC fix
  2004-01-12  2:55 [PATCH] Intel Alder IOAPIC fix James Bottomley
@ 2004-01-12 21:24 ` Linus Torvalds
  2004-01-12 22:13   ` James Bottomley
  2004-01-12 23:04   ` James Bottomley
  0 siblings, 2 replies; 16+ messages in thread
From: Linus Torvalds @ 2004-01-12 21:24 UTC (permalink / raw)
  To: James Bottomley; +Cc: Andrew Morton, Linux Kernel



On Sun, 11 Jan 2004, James Bottomley wrote:
>
> The intel alder motherboard really dislikes the way the current kernel
> reassigns all PCI resources.  It exports 6 memory bars from its Extended
> Express System Support Controller, but if the system touches any of
> them, it disables the secondary IO-APIC.
> 
> The system is bootable if you disable all IO-APICs apart from the
> primary, but it does become a bit crowded in interrupt space.  The patch
> fixes the problem by adding a quirk to clear the first six memory bars.

What are the BAR contents? In particular, maybe the right fix is to add a 
flag to say "don't touch" - but leave the BAR contents there, so that 
the resource manager can actually see what the resources actually are..

		Linus

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] Intel Alder IOAPIC fix
  2004-01-12 21:24 ` Linus Torvalds
@ 2004-01-12 22:13   ` James Bottomley
  2004-01-12 23:04   ` James Bottomley
  1 sibling, 0 replies; 16+ messages in thread
From: James Bottomley @ 2004-01-12 22:13 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Andrew Morton, Linux Kernel

On Mon, 2004-01-12 at 16:24, Linus Torvalds wrote:
> What are the BAR contents? In particular, maybe the right fix is to add a 
> flag to say "don't touch" - but leave the BAR contents there, so that 
> the resource manager can actually see what the resources actually are..

The bar contents seem to be real:

00:0f.0 Class ff00: Intel Corp. Extended Express System Support
Controller
        Subsystem: Unknown device 0008:2000
        Flags: fast devsel
        Memory at 20000000 (32-bit, prefetchable) [size=1K]
        Memory at 20000400 (32-bit, prefetchable) [size=1K]
        Memory at 20000800 (32-bit, prefetchable) [size=1K]
        Memory at 20000c00 (32-bit, prefetchable) [size=1K]
        Memory at 20001000 (32-bit, prefetchable) [size=1K]
        Memory at 20001400 (32-bit, prefetchable) [size=1K]
        Expansion ROM at fffff800 [disabled] [size=2K]


That sits just on top of my available memory.

However, the resource allocation code produces this on boot (this is
what kills the second IO APIC)

PCI: Cannot allocate resource region 0 of device 00:0f.0
PCI: Cannot allocate resource region 1 of device 00:0f.0
PCI: Cannot allocate resource region 2 of device 00:0f.0
PCI: Cannot allocate resource region 3 of device 00:0f.0
PCI: Cannot allocate resource region 4 of device 00:0f.0
PCI: Cannot allocate resource region 5 of device 00:0f.0
PCI: Error while updating region 00:0f.0/1 (20000408 != 20000008)
PCI: Error while updating region 00:0f.0/2 (20000808 != 20000008)
PCI: Error while updating region 00:0f.0/3 (20000c08 != 20000008)
PCI: Error while updating region 00:0f.0/4 (20001008 != 20000008)
PCI: Error while updating region 00:0f.0/5 (20001408 != 20000008)

I'll poke and find a flag to keep the range.

James



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] Intel Alder IOAPIC fix
  2004-01-12 21:24 ` Linus Torvalds
  2004-01-12 22:13   ` James Bottomley
@ 2004-01-12 23:04   ` James Bottomley
  2004-01-12 23:04     ` Linus Torvalds
  1 sibling, 1 reply; 16+ messages in thread
From: James Bottomley @ 2004-01-12 23:04 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Andrew Morton, Linux Kernel

OK, the bars in the previous mail are rubbish.  I put debugging into the
start up, so these are the correct bars:



PCI: Dev 0000:00:0f.0 Resource(0) fec01000-fec013ff (f=10001208, d=0,
p=0)
PCI: Cannot allocate resource region 0 of device 0000:00:0f.0
PCI: Dev 0000:00:0f.0 Resource(1) fffffc00-ffffffff (f=10001208, d=0,
p=0)
PCI: Cannot allocate resource region 1 of device 0000:00:0f.0
PCI: Dev 0000:00:0f.0 Resource(2) fffffc00-ffffffff (f=10001208, d=0,
p=0)
PCI: Cannot allocate resource region 2 of device 0000:00:0f.0
PCI: Dev 0000:00:0f.0 Resource(3) fffffc00-ffffffff (f=10001208, d=0,
p=0)
PCI: Cannot allocate resource region 3 of device 0000:00:0f.0
PCI: Dev 0000:00:0f.0 Resource(4) fffffc00-ffffffff (f=10001208, d=0,
p=0)
PCI: Cannot allocate resource region 4 of device 0000:00:0f.0
PCI: Dev 0000:00:0f.0 Resource(5) fffffc00-ffffffff (f=10001208, d=0,
p=0)
PCI: Cannot allocate resource region 5 of device 0000:00:0f.0

So BAR0 is actually the location of the second I/O APIC's mapped address
range (which we've already covered with a fixmap from the MP TABLE). 
I've no idea what the other four BARs all with addresses at 0xfffffc00
are doing.

The only way to prevent the current code (in arch/i386/pci/i386.c) from
reassigning this range seems to be to set the resource start to zero.

James





^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] Intel Alder IOAPIC fix
  2004-01-12 23:04   ` James Bottomley
@ 2004-01-12 23:04     ` Linus Torvalds
  2004-01-13  0:25       ` James Bottomley
  2004-01-13  0:45       ` James Bottomley
  0 siblings, 2 replies; 16+ messages in thread
From: Linus Torvalds @ 2004-01-12 23:04 UTC (permalink / raw)
  To: James Bottomley; +Cc: Andrew Morton, Linux Kernel



On Mon, 12 Jan 2004, James Bottomley wrote:
> 
> So BAR0 is actually the location of the second I/O APIC's mapped address
> range (which we've already covered with a fixmap from the MP TABLE). 
> I've no idea what the other four BARs all with addresses at 0xfffffc00
> are doing.
> 
> The only way to prevent the current code (in arch/i386/pci/i386.c) from
> reassigning this range seems to be to set the resource start to zero.

Ok. What I think we should do is to have a special quirk for chips like 
this that just _force_ the BAR values into the resource allocation table, 
and ignore things like the existing BIOS-marked allocations - because 
obviously the BIOS-marked ones are going to overlap.

This is where that "insert_resource()" thing comes in. Something like this 
might just work as a quirk:

	adler_quirk(struct pci_dev *dev)
	{
		int i;

		for (i = 0; i < 6; i++) {
			if (!pci_resource_start(dev, i))
				continue;
			if (!pci_resource_len(dev, i))
				continue;
			insert_resource(&iomem_resource, dev->resource + i);
		}
	}

and then make sure that the PCI layer doesn't try to re-allocate the 
resource some other way (I forget - maybe it already notices when the 
resource has already been inserted into the resource tree - otherwise you 
might need to add the code that says "if this resource is already 
inserted, don't try to reallocate it").

		Linus

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] Intel Alder IOAPIC fix
  2004-01-12 23:04     ` Linus Torvalds
@ 2004-01-13  0:25       ` James Bottomley
  2004-01-13  0:45       ` James Bottomley
  1 sibling, 0 replies; 16+ messages in thread
From: James Bottomley @ 2004-01-13  0:25 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Andrew Morton, Linux Kernel

On Mon, 2004-01-12 at 18:04, Linus Torvalds wrote:
> 		for (i = 0; i < 6; i++) {
> 			if (!pci_resource_start(dev, i))
> 				continue;
> 			if (!pci_resource_len(dev, i))
> 				continue;

Unfortunately this won't work because of the properties of insert
resource.  The BAR covers the second IO APIC at fec01000-fec013ff. 
However, this sits right in the middle of the fixmap region:

fec00000-fec08fff : reserved

This check in insert_resource makes sure that the resource being
inserted has to end beyond the resource it is replacing:

	/* existing resource overlaps end of new resource */
	if (next->end > new->end)
		goto out;

I could hack up another insert resource function that would put the
resource *under* anything else it finds (i.e. the reserved region).

Otherwise, everything will work since the i386 pci code assumes that if
the resource already has a parent, it has already been correctly
assigned, so won't try to reassign it.

James



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] Intel Alder IOAPIC fix
  2004-01-13  0:45       ` James Bottomley
@ 2004-01-13  0:25         ` Linus Torvalds
  2004-01-13 16:52           ` James Bottomley
  0 siblings, 1 reply; 16+ messages in thread
From: Linus Torvalds @ 2004-01-13  0:25 UTC (permalink / raw)
  To: James Bottomley; +Cc: Andrew Morton, Linux Kernel



On Mon, 12 Jan 2004, James Bottomley wrote:
>
> OK, with the patch below (to insert_resource) I know get the IO APIC
> successfully inserted under the reserved fixmap resources:
> 
> /proc/iomem still looks very odd:
> 
> fec00000-fec08fff : reserved
>   fec01000-fec013ff : 0000:00:0f.0
> fffffc00-ffffffff : 0000:00:0f.0
>   fffffc00-ffffffff : 0000:00:0f.0
>     fffffc00-ffffffff : 0000:00:0f.0
>       fffffc00-ffffffff : 0000:00:0f.0
>         fffffc00-ffffffff : 0000:00:0f.0
>           ffe80000-ffffffff : reserved
> 
> unfortunately, because BARs 1-5 cover the same region.

I think BARs 1-5 don't exist at all. Being set to all ones is common for
"unused" (it ends up being a normal result of a lazy probe - you set all 
bits to 1 to check for the size of the region, and if you decide not to 
map it and leave it there, you'll get the above behaviour).

I suspect only BAR0 is actually real.

What's in that ffe80000-ffffffff region that the BIOS has allocated,
anyway?

		Linus

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] Intel Alder IOAPIC fix
  2004-01-12 23:04     ` Linus Torvalds
  2004-01-13  0:25       ` James Bottomley
@ 2004-01-13  0:45       ` James Bottomley
  2004-01-13  0:25         ` Linus Torvalds
  1 sibling, 1 reply; 16+ messages in thread
From: James Bottomley @ 2004-01-13  0:45 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Andrew Morton, Linux Kernel

OK, with the patch below (to insert_resource) I know get the IO APIC
successfully inserted under the reserved fixmap resources:

/proc/iomem still looks very odd:

fec00000-fec08fff : reserved
  fec01000-fec013ff : 0000:00:0f.0
fffffc00-ffffffff : 0000:00:0f.0
  fffffc00-ffffffff : 0000:00:0f.0
    fffffc00-ffffffff : 0000:00:0f.0
      fffffc00-ffffffff : 0000:00:0f.0
        fffffc00-ffffffff : 0000:00:0f.0
          ffe80000-ffffffff : reserved

unfortunately, because BARs 1-5 cover the same region.

I also think insert_resource needs to be fixed to insert these resources
*under* the reserved resource, since it's larger than they are.

I can make these changes and send them to you, if you think this is the
correct thing to do?

James

===== kernel/resource.c 1.18 vs edited =====
--- 1.18/kernel/resource.c	Wed Nov 19 01:40:49 2003
+++ edited/kernel/resource.c	Mon Jan 12 18:34:00 2004
@@ -318,6 +318,7 @@
 	struct resource *first, *next;
 
 	write_lock(&resource_lock);
+ begin:
 	first = __request_resource(parent, new);
 	if (!first)
 		goto out;
@@ -331,8 +332,10 @@
 			break;
 
 	/* existing resource overlaps end of new resource */
-	if (next->end > new->end)
-		goto out;
+	if (next->end > new->end) {
+		parent = next;
+		goto begin;
+	}
 
 	result = 0;
 


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] Intel Alder IOAPIC fix
  2004-01-13  0:25         ` Linus Torvalds
@ 2004-01-13 16:52           ` James Bottomley
  2004-01-15  5:18             ` Eric W. Biederman
  0 siblings, 1 reply; 16+ messages in thread
From: James Bottomley @ 2004-01-13 16:52 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Andrew Morton, Linux Kernel

On Mon, 2004-01-12 at 19:25, Linus Torvalds wrote:
> I think BARs 1-5 don't exist at all. Being set to all ones is common for
> "unused" (it ends up being a normal result of a lazy probe - you set all 
> bits to 1 to check for the size of the region, and if you decide not to 
> map it and leave it there, you'll get the above behaviour).
> 
> I suspect only BAR0 is actually real.

OK, I cleaned up the patch to forcibly insert BAR0 and clear BARs 1-5
(it still requires changes to insert_resource to work, though).

> What's in that ffe80000-ffffffff region that the BIOS has allocated,
> anyway?

I don't know...it comes from the BIOS-e820 map:

BIOS-provided physical RAM map:
 BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
 BIOS-e820: 0000000000100000 - 0000000020000000 (usable)
 BIOS-e820: 00000000fec00000 - 00000000fec09000 (reserved)
 BIOS-e820: 00000000ffe80000 - 0000000100000000 (reserved)

James

# This is a BitKeeper generated patch for the following project:
# Project Name: Linux kernel tree
# This patch format is intended for GNU patch command version 2.5 or higher.
# This patch includes the following deltas:
#	           ChangeSet	1.1530  -> 1.1532 
#	drivers/pci/quirks.c	1.38    -> 1.40   
#	include/linux/pci_ids.h	1.130   -> 1.131  
#
# The following is the BitKeeper ChangeSet Log
# --------------------------------------------
# 04/01/11	root@claymoor.il.steeleye.com	1.1531
# Fix intel alder boot failure
# 
# The alder has an intel Extended Express System Support Controller
# which presents apparently spurious BARs.  When the pci resource
# code tries to reassign these BARs, the second IO-APIC gets disabled
# (with disastrous consequences).
# 
# The patch adds a quirk to clear these BARs.
# --------------------------------------------
# 04/01/13	root@claymoor.il.steeleye.com	1.1532
# update the alder quirk
# --------------------------------------------
#
diff -Nru a/drivers/pci/quirks.c b/drivers/pci/quirks.c
--- a/drivers/pci/quirks.c	Tue Jan 13 10:49:13 2004
+++ b/drivers/pci/quirks.c	Tue Jan 13 10:49:13 2004
@@ -786,6 +786,29 @@
 	sis_96x_compatible = 1;
 }
 
+#ifdef CONFIG_X86_IO_APIC
+static void __init quirk_alder_ioapic(struct pci_dev *pdev)
+{
+	int i;
+	
+	if ((pdev->class >> 8) != 0xff00)
+		return;
+	
+	/* the first BAR is the location of the IO APIC...we must
+	 * not touch this (and it's already covered by the fixmap), so
+	 * forcibly insert it into the resource tree */
+	if(pci_resource_start(pdev, 0) && pci_resource_len(pdev, 0))
+		insert_resource(&iomem_resource, &pdev->resource[0]);
+
+	/* The next five BARs all seem to be rubbish, so just clean
+	 * them out */
+	for(i=1; i < 6; i++) {
+		memset(&pdev->resource[i], 0, sizeof(pdev->resource[i]));
+	}
+	
+}
+#endif
+
 #ifdef CONFIG_SCSI_SATA
 static void __init quirk_intel_ide_combined(struct pci_dev *pdev)
 {
@@ -910,6 +933,7 @@
 	{ PCI_FIXUP_FINAL,	PCI_VENDOR_ID_SI,	PCI_ANY_ID,			quirk_ioapic_rmw },
         { PCI_FIXUP_FINAL,      PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_8131_APIC,
           quirk_amd_8131_ioapic }, 
+	{ PCI_FIXUP_HEADER,	PCI_VENDOR_ID_INTEL,	PCI_DEVICE_ID_INTEL_EESSC,	quirk_alder_ioapic },
 #endif
 	{ PCI_FIXUP_HEADER,	PCI_VENDOR_ID_VIA,	PCI_DEVICE_ID_VIA_82C586_3,	quirk_via_acpi },
 	{ PCI_FIXUP_HEADER,	PCI_VENDOR_ID_VIA,	PCI_DEVICE_ID_VIA_82C686_4,	quirk_via_acpi },
diff -Nru a/include/linux/pci_ids.h b/include/linux/pci_ids.h
--- a/include/linux/pci_ids.h	Tue Jan 13 10:49:13 2004
+++ b/include/linux/pci_ids.h	Tue Jan 13 10:49:13 2004
@@ -1901,6 +1901,7 @@
 #define PCI_DEVICE_ID_GENROCO_HFP832	0x0003
 
 #define PCI_VENDOR_ID_INTEL		0x8086
+#define PCI_DEVICE_ID_INTEL_EESSC	0x0008
 #define PCI_DEVICE_ID_INTEL_21145	0x0039
 #define PCI_DEVICE_ID_INTEL_82375	0x0482
 #define PCI_DEVICE_ID_INTEL_82424	0x0483


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] Intel Alder IOAPIC fix
  2004-01-13 16:52           ` James Bottomley
@ 2004-01-15  5:18             ` Eric W. Biederman
  2004-01-15 16:58               ` James Bottomley
  0 siblings, 1 reply; 16+ messages in thread
From: Eric W. Biederman @ 2004-01-15  5:18 UTC (permalink / raw)
  To: James Bottomley; +Cc: Linus Torvalds, Andrew Morton, Linux Kernel

James Bottomley <James.Bottomley@steeleye.com> writes:

> On Mon, 2004-01-12 at 19:25, Linus Torvalds wrote:
> > I think BARs 1-5 don't exist at all. Being set to all ones is common for
> > "unused" (it ends up being a normal result of a lazy probe - you set all 
> > bits to 1 to check for the size of the region, and if you decide not to 
> > map it and leave it there, you'll get the above behaviour).
> > 
> > I suspect only BAR0 is actually real.
> 
> OK, I cleaned up the patch to forcibly insert BAR0 and clear BARs 1-5
> (it still requires changes to insert_resource to work, though).

When I looked at the ia64 code that uses insert_resource (and I admit I am
reading between the lines a little) it seems to come along after potentially
allocating some resources behind some kind of bridge and then realize a bridge
is there.

Which is totally something different from this case where we just want
to ignore the BIOS, because we know better.  I have seen a number of
boxes that reserver the area where apics or ioapics live.  So I think
we need an IORESOURCE_TENTATIVE thing.  This is the third flavor of
thing that has shown up, lately.

Want me to code up a patch?

Eric

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] Intel Alder IOAPIC fix
  2004-01-15  5:18             ` Eric W. Biederman
@ 2004-01-15 16:58               ` James Bottomley
  2004-01-15 19:26                 ` Eric W. Biederman
  0 siblings, 1 reply; 16+ messages in thread
From: James Bottomley @ 2004-01-15 16:58 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: Linus Torvalds, Andrew Morton, Linux Kernel

On Thu, 2004-01-15 at 00:18, Eric W. Biederman wrote:
> James Bottomley <James.Bottomley@steeleye.com> writes:
> 
> > On Mon, 2004-01-12 at 19:25, Linus Torvalds wrote:
> > > I think BARs 1-5 don't exist at all. Being set to all ones is common for
> > > "unused" (it ends up being a normal result of a lazy probe - you set all 
> > > bits to 1 to check for the size of the region, and if you decide not to 
> > > map it and leave it there, you'll get the above behaviour).
> > > 
> > > I suspect only BAR0 is actually real.
> > 
> > OK, I cleaned up the patch to forcibly insert BAR0 and clear BARs 1-5
> > (it still requires changes to insert_resource to work, though).
> 
> When I looked at the ia64 code that uses insert_resource (and I admit I am
> reading between the lines a little) it seems to come along after potentially
> allocating some resources behind some kind of bridge and then realize a bridge
> is there.

Ah, that explains why it's expecting to find the new resource covering
the old one.

> Which is totally something different from this case where we just want
> to ignore the BIOS, because we know better.  I have seen a number of
> boxes that reserver the area where apics or ioapics live.  So I think
> we need an IORESOURCE_TENTATIVE thing.  This is the third flavor of
> thing that has shown up, lately.
> 
> Want me to code up a patch?

Well, I'm not sure there's a need for it.  It seems to me that all
insert_resource is supposed to be doing is saying "I've got this
resource here that should have been placed into the tree...please do it
now".

The ia64 I forgot this bridge, and the Alder IO-APIC this PCI BAR is
actually the IO-APIC and thus part of the reserved BIOS area look to be
similar aspects of the same problem.

The only difference (which is what I needed the patch for) was that the
Alder resource needs to go underneath the bios reserved area.

How are you proposing that IORESOURCE_TENTATIVE should work?

James



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] Intel Alder IOAPIC fix
  2004-01-15 16:58               ` James Bottomley
@ 2004-01-15 19:26                 ` Eric W. Biederman
  2004-01-15 19:54                   ` James Bottomley
  0 siblings, 1 reply; 16+ messages in thread
From: Eric W. Biederman @ 2004-01-15 19:26 UTC (permalink / raw)
  To: James Bottomley; +Cc: Linus Torvalds, Andrew Morton, Linux Kernel

James Bottomley <James.Bottomley@steeleye.com> writes:


> Ah, that explains why it's expecting to find the new resource covering
> the old one.

And that is why I believe it tucks the old resource under the new one.

> > Which is totally something different from this case where we just want
> > to ignore the BIOS, because we know better.  I have seen a number of
> > boxes that reserver the area where apics or ioapics live.  So I think
> > we need an IORESOURCE_TENTATIVE thing.  This is the third flavor of
> > thing that has shown up, lately.
> > 
> > Want me to code up a patch?
> 
> Well, I'm not sure there's a need for it.  It seems to me that all
> insert_resource is supposed to be doing is saying "I've got this
> resource here that should have been placed into the tree...please do it
> now".
> 
> The ia64 I forgot this bridge, and the Alder IO-APIC this PCI BAR is
> actually the IO-APIC and thus part of the reserved BIOS area look to be
> similar aspects of the same problem.

The problem of you have an incorrect device tree what do you do yes.
The difference that I see is in how the tree gets wrong, and what
parts of it you want to keep.
 
> The only difference (which is what I needed the patch for) was that the
> Alder resource needs to go underneath the bios reserved area.
> 
> How are you proposing that IORESOURCE_TENTATIVE should work?

The solution I keep am thinking of is to simply push an IORESOURCE_TENTATIVE
thing out of the way.

What I am thinking is that /proc/iomem would start out looking link:
fec00000-fec08fff : reserved
ffe80000-ffffffff : reserved

And end up looking like:
fec00000-fec00fff : reserved
fec01000-fec013ff : 0000:00:0f.0
fec01400-fec08fff : reserved
ffe80000-ffffffff : reserved

And either put the code to do that in request_resource, or have
a demand_resource thing that does it.  The only thing worth preserving
in the mixed up BIOS case is that find_resource does not allocate that
range for something else.

Eric

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] Intel Alder IOAPIC fix
  2004-01-15 19:26                 ` Eric W. Biederman
@ 2004-01-15 19:54                   ` James Bottomley
  2004-01-16  5:32                     ` Eric W. Biederman
  0 siblings, 1 reply; 16+ messages in thread
From: James Bottomley @ 2004-01-15 19:54 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: Linus Torvalds, Andrew Morton, Linux Kernel

On Thu, 2004-01-15 at 14:26, Eric W. Biederman wrote:
> And end up looking like:
> fec00000-fec00fff : reserved
> fec01000-fec013ff : 0000:00:0f.0
> fec01400-fec08fff : reserved

Oh, I see you're splitting an existing resource around it.

So the e820 map requests reserved regions with tentative and
insert_resource is allowed to place resources into tentative regions. 
That works for me, but I don't see how it works for the bridge
case...there you really want to insert the bridge resource over
everything else.

James



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] Intel Alder IOAPIC fix
  2004-01-15 19:54                   ` James Bottomley
@ 2004-01-16  5:32                     ` Eric W. Biederman
  2004-01-17 15:18                       ` James Bottomley
  0 siblings, 1 reply; 16+ messages in thread
From: Eric W. Biederman @ 2004-01-16  5:32 UTC (permalink / raw)
  To: James Bottomley; +Cc: Linus Torvalds, Andrew Morton, Linux Kernel

James Bottomley <James.Bottomley@steeleye.com> writes:

> On Thu, 2004-01-15 at 14:26, Eric W. Biederman wrote:
> > And end up looking like:
> > fec00000-fec00fff : reserved
> > fec01000-fec013ff : 0000:00:0f.0
> > fec01400-fec08fff : reserved
> 
> Oh, I see you're splitting an existing resource around it.

Yes, this is the extreme case.  In normal cases I would just
expect to push to one side and probably shrink it to 0.  I guess
I have something against implying a hierarchal relationship that
does not exist.
 
> So the e820 map requests reserved regions with tentative and
> insert_resource is allowed to place resources into tentative regions. 
> That works for me, but I don't see how it works for the bridge
> case...there you really want to insert the bridge resource over
> everything else.

Right.  To me it looks like separate cases.  What I keep envisioning
scanning the PCI devices and then realizing they are behind
a bridge.  Before I go to far I guess I should ask.

The splitting/pushing aside looks especially useful for those
cases where you subdivide the resource again.

As for the bridge case I think that is something different.  

Eric

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] Intel Alder IOAPIC fix
  2004-01-16  5:32                     ` Eric W. Biederman
@ 2004-01-17 15:18                       ` James Bottomley
  2004-01-17 19:43                         ` Eric W. Biederman
  0 siblings, 1 reply; 16+ messages in thread
From: James Bottomley @ 2004-01-17 15:18 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: Linus Torvalds, Andrew Morton, Linux Kernel

On Fri, 2004-01-16 at 00:32, Eric W. Biederman wrote:
> Yes, this is the extreme case.  In normal cases I would just
> expect to push to one side and probably shrink it to 0.  I guess
> I have something against implying a hierarchal relationship that
> does not exist.

Well, it makes sense to me that the resource would be a child of the
reserved area, because the reserved area covers the APICs and this
rather annoying PCI device has one of the IO APICs tied to BAR0.

In this case, we have a PCI device claiming something we already
discovered and made use of long ago in bootup.

> Right.  To me it looks like separate cases.  What I keep envisioning
> scanning the PCI devices and then realizing they are behind
> a bridge.  Before I go to far I guess I should ask.
> 
> The splitting/pushing aside looks especially useful for those
> cases where you subdivide the resource again.
> 
> As for the bridge case I think that is something different.  

The pragmatist in me says we can handle them all as a single case. 
Simply put, it means insert_resource() says "I know this belongs in the
resource tree, just put it in where it should go, please".  As long as
we make sure we only use it for the exception cases, it should all work
fine.

All I really want is to get the alter 4 and 8 way boxes working again,
I'm happy to go with whatever people decide about resources.  What other
uses are there for the TENTATIVE regions?

James



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] Intel Alder IOAPIC fix
  2004-01-17 15:18                       ` James Bottomley
@ 2004-01-17 19:43                         ` Eric W. Biederman
  0 siblings, 0 replies; 16+ messages in thread
From: Eric W. Biederman @ 2004-01-17 19:43 UTC (permalink / raw)
  To: James Bottomley; +Cc: Linus Torvalds, Andrew Morton, Linux Kernel

James Bottomley <James.Bottomley@steeleye.com> writes:

> On Fri, 2004-01-16 at 00:32, Eric W. Biederman wrote:
> > Yes, this is the extreme case.  In normal cases I would just
> > expect to push to one side and probably shrink it to 0.  I guess
> > I have something against implying a hierarchal relationship that
> > does not exist.
> 
> Well, it makes sense to me that the resource would be a child of the
> reserved area, because the reserved area covers the APICs and this
> rather annoying PCI device has one of the IO APICs tied to BAR0.
> 
> In this case, we have a PCI device claiming something we already
> discovered and made use of long ago in bootup.

Yes, when the BIOS is trustworthy that sounds reasonable. 
The nasty theoretical case I can think of is what happens when that
annoying PCI device is behind a bridge?  How do we reserve the
bridge resources and become a child of them?
 
> > Right.  To me it looks like separate cases.  What I keep envisioning
> > scanning the PCI devices and then realizing they are behind
> > a bridge.  Before I go to far I guess I should ask.
> > 
> > The splitting/pushing aside looks especially useful for those
> > cases where you subdivide the resource again.
> > 
> > As for the bridge case I think that is something different.  
> 
> The pragmatist in me says we can handle them all as a single case. 
> Simply put, it means insert_resource() says "I know this belongs in the
> resource tree, just put it in where it should go, please".  As long as
> we make sure we only use it for the exception cases, it should all work
> fine.
> 
> All I really want is to get the alter 4 and 8 way boxes working again,
> I'm happy to go with whatever people decide about resources.  What other
> uses are there for the TENTATIVE regions?

The case I care about is BIOS ROMS.  That case is fun because frequently
the reserved region is smaller than region the ROM sits in.  But it
really comes down to trusting the BIOS.  And anytime we trust the BIOS
to do the right thing some BIOS will do it wrong.  So when we have a
conflict and we know we are right I would much rather throw away what
the BIOS has told me.

If we are somehow scanning the busses and stop and oh wait there is a
bridge above everything that I had not noticed before.  And in a root
complex (the pci express term) where the resources are non standard, I
can really see this happening.  I know on Opteron systems we currently
omit the resources on the cpu/HT chain.  That is what I think
insert_resource was designed for.

And there was another case a few days ago where someone was having
similar BIOS problems with the AGP GART window.


Eric

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2004-01-17 19:49 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2004-01-12  2:55 [PATCH] Intel Alder IOAPIC fix James Bottomley
2004-01-12 21:24 ` Linus Torvalds
2004-01-12 22:13   ` James Bottomley
2004-01-12 23:04   ` James Bottomley
2004-01-12 23:04     ` Linus Torvalds
2004-01-13  0:25       ` James Bottomley
2004-01-13  0:45       ` James Bottomley
2004-01-13  0:25         ` Linus Torvalds
2004-01-13 16:52           ` James Bottomley
2004-01-15  5:18             ` Eric W. Biederman
2004-01-15 16:58               ` James Bottomley
2004-01-15 19:26                 ` Eric W. Biederman
2004-01-15 19:54                   ` James Bottomley
2004-01-16  5:32                     ` Eric W. Biederman
2004-01-17 15:18                       ` James Bottomley
2004-01-17 19:43                         ` Eric W. Biederman

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).