All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] x86/sfi: fix ioapic gsi range
@ 2010-06-07 23:07 Jacob Pan
  2010-06-08  0:01 ` jacob pan
  2010-06-08  0:24 ` Eric W. Biederman
  0 siblings, 2 replies; 26+ messages in thread
From: Jacob Pan @ 2010-06-07 23:07 UTC (permalink / raw)
  To: Alan Cox, Arjan van de Ven, LKML, H. Peter Anvin, Ingo Molnar,
	Feng Tang, Len Brown, Eric W. Biederman
  Cc: Jacob Pan

SFI based platforms should have zero based gsi_base for IOAPICs found in SFI
tables. The current code sets gsi_base starting from 1 when registering ioapic.
The result is that Moorestown platform would have wrong mp_gsi_routing for each
ioapic.

Background:
In Moorestown/Medfield platforms, there is no legacy IRQs, all gsis and irqs
are one to one mapped, including those < 16. Specifically, IRQ0 and IRQ1 are
used for per-cpu timers. So without this patch, IOAPIC pin to IRQ mapping is
off by one.

Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
---
 arch/x86/kernel/sfi.c |    5 ++++-
 1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kernel/sfi.c b/arch/x86/kernel/sfi.c
index 7ded578..8d31950 100644
--- a/arch/x86/kernel/sfi.c
+++ b/arch/x86/kernel/sfi.c
@@ -87,14 +87,17 @@ static int __init sfi_parse_ioapic(struct sfi_table_header *table)
 	struct sfi_table_simple *sb;
 	struct sfi_apic_table_entry *pentry;
 	int i, num;
+	u32 gsi_base;
 
 	sb = (struct sfi_table_simple *)table;
 	num = SFI_GET_NUM_ENTRIES(sb, struct sfi_apic_table_entry);
 	pentry = (struct sfi_apic_table_entry *)sb->pentry;
 
+	gsi_base = gsi_end;
 	for (i = 0; i < num; i++) {
-		mp_register_ioapic(i, pentry->phys_addr, gsi_end + 1);
+		mp_register_ioapic(i, pentry->phys_addr, gsi_base);
 		pentry++;
+		gsi_base = gsi_end + 1;
 	}
 
 	WARN(pic_mode, KERN_WARNING
-- 
1.6.3.3


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [PATCH] x86/sfi: fix ioapic gsi range
  2010-06-07 23:07 [PATCH] x86/sfi: fix ioapic gsi range Jacob Pan
@ 2010-06-08  0:01 ` jacob pan
  2010-06-08  0:24 ` Eric W. Biederman
  1 sibling, 0 replies; 26+ messages in thread
From: jacob pan @ 2010-06-08  0:01 UTC (permalink / raw)
  To: Jacob Pan
  Cc: Alan Cox, Arjan van de Ven, LKML, H. Peter Anvin, Ingo Molnar,
	Feng Tang, Len Brown, Eric W. Biederman

Jacob Pan Mon,  7 Jun 2010 16:07:24 -0700
>SFI based platforms should have zero based gsi_base for IOAPICs found in SFI
>tables. The current code sets gsi_base starting from 1 when registering ioapic.
>The result is that Moorestown platform would have wrong mp_gsi_routing for each
>ioapic.
>
>Background:
>In Moorestown/Medfield platforms, there is no legacy IRQs, all gsis and irqs
>are one to one mapped, including those < 16. Specifically, IRQ0 and IRQ1 are
>used for per-cpu timers. So without this patch, IOAPIC pin to IRQ mapping is
>off by one.
>
Clarifiction/correction about IRQ0,1. I refer to IOAPIC IRQ #, which is the
IOAPIC RTE entry #. Not in the sense of kernel IRQ# which can be assigned
differently on Moorestown.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] x86/sfi: fix ioapic gsi range
  2010-06-07 23:07 [PATCH] x86/sfi: fix ioapic gsi range Jacob Pan
  2010-06-08  0:01 ` jacob pan
@ 2010-06-08  0:24 ` Eric W. Biederman
  2010-06-08  0:30   ` H. Peter Anvin
  2010-06-08  5:50   ` [PATCH] x86/sfi: fix ioapic gsi range jacob pan
  1 sibling, 2 replies; 26+ messages in thread
From: Eric W. Biederman @ 2010-06-08  0:24 UTC (permalink / raw)
  To: Jacob Pan
  Cc: Alan Cox, Arjan van de Ven, LKML, H. Peter Anvin, Ingo Molnar,
	Feng Tang, Len Brown

Jacob Pan <jacob.jun.pan@linux.intel.com> writes:

> SFI based platforms should have zero based gsi_base for IOAPICs found in SFI
> tables. The current code sets gsi_base starting from 1 when registering ioapic.
> The result is that Moorestown platform would have wrong mp_gsi_routing for each
> ioapic.

Yes starting at 1 is a bug.

> Background:
> In Moorestown/Medfield platforms, there is no legacy IRQs, all gsis and irqs
> are one to one mapped, including those < 16. Specifically, IRQ0 and IRQ1 are
> used for per-cpu timers. So without this patch, IOAPIC pin to IRQ mapping is
> off by one.

The patch looks mostly reasonable the comment is wrong.

You may not use a 1-1 mapping if you don't have legacy irqs.  Linux
irqs 0-15 are the ISA irqs you may not use those irq numbers for
something different on any architecture, but especially not on x86.
The gsi numbers are firmware specific and you may treat however you want.

Does the following patch work for you?

It appears I goofed when it was pointed out that gsi_end was inclusive and
didn't change the initialize.

Eric
---

diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
index 33f3563..5de84e5 100644
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -90,7 +90,7 @@ int nr_ioapics;
 struct mp_ioapic_gsi  mp_gsi_routing[MAX_IO_APICS];
 
 /* The last gsi number used */
-u32 gsi_end;
+u32 gsi_end = -1; 
 


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [PATCH] x86/sfi: fix ioapic gsi range
  2010-06-08  0:24 ` Eric W. Biederman
@ 2010-06-08  0:30   ` H. Peter Anvin
  2010-06-08  1:10     ` Eric W. Biederman
                       ` (2 more replies)
  2010-06-08  5:50   ` [PATCH] x86/sfi: fix ioapic gsi range jacob pan
  1 sibling, 3 replies; 26+ messages in thread
From: H. Peter Anvin @ 2010-06-08  0:30 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Jacob Pan, Alan Cox, Arjan van de Ven, LKML, Ingo Molnar,
	Feng Tang, Len Brown

On 06/07/2010 05:24 PM, Eric W. Biederman wrote:
> Jacob Pan <jacob.jun.pan@linux.intel.com> writes:
> 
>> SFI based platforms should have zero based gsi_base for IOAPICs found in SFI
>> tables. The current code sets gsi_base starting from 1 when registering ioapic.
>> The result is that Moorestown platform would have wrong mp_gsi_routing for each
>> ioapic.
> 
> Yes starting at 1 is a bug.
> 
>> Background:
>> In Moorestown/Medfield platforms, there is no legacy IRQs, all gsis and irqs
>> are one to one mapped, including those < 16. Specifically, IRQ0 and IRQ1 are
>> used for per-cpu timers. So without this patch, IOAPIC pin to IRQ mapping is
>> off by one.
> 
> The patch looks mostly reasonable the comment is wrong.
> 
> You may not use a 1-1 mapping if you don't have legacy irqs.  Linux
> irqs 0-15 are the ISA irqs you may not use those irq numbers for
> something different on any architecture, but especially not on x86.
> The gsi numbers are firmware specific and you may treat however you want.
> 
> Does the following patch work for you?
> 
> It appears I goofed when it was pointed out that gsi_end was inclusive and
> didn't change the initialize.
> 
> diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
> index 33f3563..5de84e5 100644
> --- a/arch/x86/kernel/apic/io_apic.c
> +++ b/arch/x86/kernel/apic/io_apic.c
> @@ -90,7 +90,7 @@ int nr_ioapics;
>  struct mp_ioapic_gsi  mp_gsi_routing[MAX_IO_APICS];
>  
>  /* The last gsi number used */
> -u32 gsi_end;
> +u32 gsi_end = -1; 
>  

This seems like asking for signedness problems, especially since this is
used in range compares all the time.  The real problem here is that
gsi_end is inclusive, which is almost always the wrong thing for the
endpoint of a range.  Instead we should have the last number used plus
one; perhaps it should be called gsi_next or gsi_free.

	-hpa


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] x86/sfi: fix ioapic gsi range
  2010-06-08  0:30   ` H. Peter Anvin
@ 2010-06-08  1:10     ` Eric W. Biederman
  2010-06-08  8:10     ` Alan Cox
  2010-06-08 18:44     ` [PATCH] x86/irq: Rename gsi_end gsi_top, and fix off by one errors Eric W. Biederman
  2 siblings, 0 replies; 26+ messages in thread
From: Eric W. Biederman @ 2010-06-08  1:10 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Jacob Pan, Alan Cox, Arjan van de Ven, LKML, Ingo Molnar,
	Feng Tang, Len Brown

"H. Peter Anvin" <hpa@zytor.com> writes:

> On 06/07/2010 05:24 PM, Eric W. Biederman wrote:
>> Jacob Pan <jacob.jun.pan@linux.intel.com> writes:
>> 
>>> SFI based platforms should have zero based gsi_base for IOAPICs found in SFI
>>> tables. The current code sets gsi_base starting from 1 when registering ioapic.
>>> The result is that Moorestown platform would have wrong mp_gsi_routing for each
>>> ioapic.
>> 
>> Yes starting at 1 is a bug.
>> 
>>> Background:
>>> In Moorestown/Medfield platforms, there is no legacy IRQs, all gsis and irqs
>>> are one to one mapped, including those < 16. Specifically, IRQ0 and IRQ1 are
>>> used for per-cpu timers. So without this patch, IOAPIC pin to IRQ mapping is
>>> off by one.
>> 
>> The patch looks mostly reasonable the comment is wrong.
>> 
>> You may not use a 1-1 mapping if you don't have legacy irqs.  Linux
>> irqs 0-15 are the ISA irqs you may not use those irq numbers for
>> something different on any architecture, but especially not on x86.
>> The gsi numbers are firmware specific and you may treat however you want.
>> 
>> Does the following patch work for you?
>> 
>> It appears I goofed when it was pointed out that gsi_end was inclusive and
>> didn't change the initialize.
>> 
>> diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
>> index 33f3563..5de84e5 100644
>> --- a/arch/x86/kernel/apic/io_apic.c
>> +++ b/arch/x86/kernel/apic/io_apic.c
>> @@ -90,7 +90,7 @@ int nr_ioapics;
>>  struct mp_ioapic_gsi  mp_gsi_routing[MAX_IO_APICS];
>>  
>>  /* The last gsi number used */
>> -u32 gsi_end;
>> +u32 gsi_end = -1; 
>>  
>
> This seems like asking for signedness problems, especially since this is
> used in range compares all the time.  The real problem here is that
> gsi_end is inclusive, which is almost always the wrong thing for the
> endpoint of a range.  Instead we should have the last number used plus
> one; perhaps it should be called gsi_next or gsi_free.

That does sound better.  Let me see if I can find a few minutes to implement
it that way.

Eric

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] x86/sfi: fix ioapic gsi range
  2010-06-08  0:24 ` Eric W. Biederman
  2010-06-08  0:30   ` H. Peter Anvin
@ 2010-06-08  5:50   ` jacob pan
  2010-06-08 19:41     ` Eric W. Biederman
  1 sibling, 1 reply; 26+ messages in thread
From: jacob pan @ 2010-06-08  5:50 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Alan Cox, Arjan van de Ven, LKML, H. Peter Anvin, Ingo Molnar,
	Feng Tang, Len Brown


> > Background:
> > In Moorestown/Medfield platforms, there is no legacy IRQs, all gsis
> > and irqs are one to one mapped, including those < 16. Specifically,
> > IRQ0 and IRQ1 are used for per-cpu timers. So without this patch,
> > IOAPIC pin to IRQ mapping is off by one.
> 
> The patch looks mostly reasonable the comment is wrong.
> 
> You may not use a 1-1 mapping if you don't have legacy irqs.  Linux
> irqs 0-15 are the ISA irqs you may not use those irq numbers for
> something different on any architecture, but especially not on x86.
> The gsi numbers are firmware specific and you may treat however you
> want.

[jacob pan] If we don't have ISA irqs, why can't we have gsi# = irq#
for the legacy IRQ range? On Moorestown, we are re-using legacy irqs.
e.g. 
sh-4.0# cat /proc/interrupts
          CPU0       CPU1
  0:       1512          0   IO-APIC-edge      apbt0
  1:          0       1482   IO-APIC-edge      apbt1
  9:          0          0   IO-APIC-fasteoi   dw_spi
 10:          0          0   IO-APIC-fasteoi   mrst_i2c
 11:          0          0   IO-APIC-fasteoi   mrst_i2c
 12:          0          0   IO-APIC-fasteoi   mrst_i2c
 23:          0          0   IO-APIC-fasteoi   intel_scu_ipc
 27:         21          0   IO-APIC-fasteoi


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] x86/sfi: fix ioapic gsi range
  2010-06-08  0:30   ` H. Peter Anvin
  2010-06-08  1:10     ` Eric W. Biederman
@ 2010-06-08  8:10     ` Alan Cox
  2010-06-08 18:11       ` H. Peter Anvin
  2010-06-08 18:44     ` [PATCH] x86/irq: Rename gsi_end gsi_top, and fix off by one errors Eric W. Biederman
  2 siblings, 1 reply; 26+ messages in thread
From: Alan Cox @ 2010-06-08  8:10 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Eric W. Biederman, Jacob Pan, Alan Cox, Arjan van de Ven, LKML,
	Ingo Molnar, Feng Tang, Len Brown

> > You may not use a 1-1 mapping if you don't have legacy irqs.  Linux
> > irqs 0-15 are the ISA irqs you may not use those irq numbers for

Linux IRQ 0 is "no IRQ assigned", except buried in certain bits of arch
specific historical knowledge.

Also calling 1-15 ISA IRQ lines is also somewhat misleading given they
are almost certainly routing for PCI devices. "PIC IRQ routing" maybe -
but even that is not really true on a lot of PC hardware today except by
convention.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] x86/sfi: fix ioapic gsi range
  2010-06-08  8:10     ` Alan Cox
@ 2010-06-08 18:11       ` H. Peter Anvin
  2010-06-08 20:04         ` Eric W. Biederman
  0 siblings, 1 reply; 26+ messages in thread
From: H. Peter Anvin @ 2010-06-08 18:11 UTC (permalink / raw)
  To: Alan Cox
  Cc: Eric W. Biederman, Jacob Pan, Alan Cox, Arjan van de Ven, LKML,
	Ingo Molnar, Feng Tang, Len Brown

On 06/08/2010 01:10 AM, Alan Cox wrote:
>>> You may not use a 1-1 mapping if you don't have legacy irqs.  Linux
>>> irqs 0-15 are the ISA irqs you may not use those irq numbers for
> 
> Linux IRQ 0 is "no IRQ assigned", except buried in certain bits of arch
> specific historical knowledge.
> 
> Also calling 1-15 ISA IRQ lines is also somewhat misleading given they
> are almost certainly routing for PCI devices. "PIC IRQ routing" maybe -
> but even that is not really true on a lot of PC hardware today except by
> convention.

Yes, but I gather IRQ/GSI 0 is an early-acquire primary timer on MRST on
Moorestown just as on PC/AT... just a different one.  Hence "special" in
the same sort of way.  I don't really care, personally, though.

	-hpa


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH] x86/irq: Rename gsi_end gsi_top, and fix off by one errors
  2010-06-08  0:30   ` H. Peter Anvin
  2010-06-08  1:10     ` Eric W. Biederman
  2010-06-08  8:10     ` Alan Cox
@ 2010-06-08 18:44     ` Eric W. Biederman
  2010-06-09 22:06       ` [tip:x86/urgent] x86, irq: " tip-bot for Eric W. Biederman
  2 siblings, 1 reply; 26+ messages in thread
From: Eric W. Biederman @ 2010-06-08 18:44 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Jacob Pan, Alan Cox, Arjan van de Ven, LKML, Ingo Molnar,
	Feng Tang, Len Brown


When I introduced the global variable gsi_end I thought gsi_end on
io_apics was one past the end of the gsi range for the io_apic.  After
it was pointed out the the range on io_apics was inclusive I changed
my global variable to match.  That was a big mistake.  Inclusive
semantics without a range start cannot describe the case when no gsi's
are allocated.  Describing the case where no gsi's are allocated is
important in sfi.c and mpparse.c so that we can assign gsi numbers
instead of blindly copying the gsi assignments the BIOS has done as we
do in the acpi case.

To keep from getting the global variable confused with the gsi range
end rename it gsi_top.

To allow describing the case where no gsi's are allocated have gsi_top
be one place the highest gsi number seen in the system.

This fixes an off by one bug in sfi.c:
Reported-by: jacob pan <jacob.jun.pan@linux.intel.com>

This fixes the same off by one bug in mpparse.c:

This fixes an off unreachable by one bug in acpi/boot.c:irq_to_gsi
Reported-by: Yinghai <yinghai.lu@oracle.com>

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
---
 arch/x86/include/asm/io_apic.h |    4 ++--
 arch/x86/kernel/acpi/boot.c    |    8 ++++----
 arch/x86/kernel/apic/io_apic.c |   12 ++++++------
 arch/x86/kernel/mpparse.c      |    2 +-
 arch/x86/kernel/sfi.c          |    2 +-
 5 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/arch/x86/include/asm/io_apic.h b/arch/x86/include/asm/io_apic.h
index 63cb409..9cb2edb 100644
--- a/arch/x86/include/asm/io_apic.h
+++ b/arch/x86/include/asm/io_apic.h
@@ -183,7 +183,7 @@ struct mp_ioapic_gsi{
 	u32 gsi_end;
 };
 extern struct mp_ioapic_gsi  mp_gsi_routing[];
-extern u32 gsi_end;
+extern u32 gsi_top;
 int mp_find_ioapic(u32 gsi);
 int mp_find_ioapic_pin(int ioapic, u32 gsi);
 void __init mp_register_ioapic(int id, u32 address, u32 gsi_base);
@@ -197,7 +197,7 @@ static const int timer_through_8259 = 0;
 static inline void ioapic_init_mappings(void)	{ }
 static inline void ioapic_insert_resources(void) { }
 static inline void probe_nr_irqs_gsi(void)	{ }
-#define gsi_end (NR_IRQS_LEGACY - 1)
+#define gsi_top (NR_IRQS_LEGACY)
 static inline int mp_find_ioapic(u32 gsi) { return 0; }
 
 struct io_apic_irq_attr;
diff --git a/arch/x86/kernel/acpi/boot.c b/arch/x86/kernel/acpi/boot.c
index 60cc405..c05872a 100644
--- a/arch/x86/kernel/acpi/boot.c
+++ b/arch/x86/kernel/acpi/boot.c
@@ -118,7 +118,7 @@ static unsigned int gsi_to_irq(unsigned int gsi)
 	if (gsi >= NR_IRQS_LEGACY)
 		irq = gsi;
 	else
-		irq = gsi_end + 1 + gsi;
+		irq = gsi_top + gsi;
 
 	return irq;
 }
@@ -129,10 +129,10 @@ static u32 irq_to_gsi(int irq)
 
 	if (irq < NR_IRQS_LEGACY)
 		gsi = isa_irq_to_gsi[irq];
-	else if (irq <= gsi_end)
+	else if (irq < gsi_top)
 		gsi = irq;
-	else if (irq <= (gsi_end + NR_IRQS_LEGACY))
-		gsi = irq - gsi_end;
+	else if (irq < (gsi_top + NR_IRQS_LEGACY))
+		gsi = irq - gsi_top;
 	else
 		gsi = 0xffffffff;
 
diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
index 33f3563..e41ed24 100644
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -89,8 +89,8 @@ int nr_ioapics;
 /* IO APIC gsi routing info */
 struct mp_ioapic_gsi  mp_gsi_routing[MAX_IO_APICS];
 
-/* The last gsi number used */
-u32 gsi_end;
+/* The one past the highest gsi number used */
+u32 gsi_top;
 
 /* MP IRQ source entries */
 struct mpc_intsrc mp_irqs[MAX_IRQ_SOURCES];
@@ -1035,7 +1035,7 @@ static int pin_2_irq(int idx, int apic, int pin)
 		if (gsi >= NR_IRQS_LEGACY)
 			irq = gsi;
 		else
-			irq = gsi_end + 1 + gsi;
+			irq = gsi_top + gsi;
 	}
 
 #ifdef CONFIG_X86_32
@@ -3853,7 +3853,7 @@ void __init probe_nr_irqs_gsi(void)
 {
 	int nr;
 
-	nr = gsi_end + 1 + NR_IRQS_LEGACY;
+	nr = gsi_top + NR_IRQS_LEGACY;
 	if (nr > nr_irqs_gsi)
 		nr_irqs_gsi = nr;
 
@@ -4294,8 +4294,8 @@ void __init mp_register_ioapic(int id, u32 address, u32 gsi_base)
 	 */
 	nr_ioapic_registers[idx] = entries;
 
-	if (mp_gsi_routing[idx].gsi_end > gsi_end)
-		gsi_end = mp_gsi_routing[idx].gsi_end;
+	if (mp_gsi_routing[idx].gsi_end >= gsi_top)
+		gsi_top = mp_gsi_routing[idx].gsi_end + 1;
 
 	printk(KERN_INFO "IOAPIC[%d]: apic_id %d, version %d, address 0x%x, "
 	       "GSI %d-%d\n", idx, mp_ioapics[idx].apicid,
diff --git a/arch/x86/kernel/mpparse.c b/arch/x86/kernel/mpparse.c
index 5ae5d24..d86dbf7 100644
--- a/arch/x86/kernel/mpparse.c
+++ b/arch/x86/kernel/mpparse.c
@@ -123,7 +123,7 @@ static void __init MP_ioapic_info(struct mpc_ioapic *m)
 	printk(KERN_INFO "I/O APIC #%d Version %d at 0x%X.\n",
 	       m->apicid, m->apicver, m->apicaddr);
 
-	mp_register_ioapic(m->apicid, m->apicaddr, gsi_end + 1);
+	mp_register_ioapic(m->apicid, m->apicaddr, gsi_top);
 }
 
 static void print_MP_intsrc_info(struct mpc_intsrc *m)
diff --git a/arch/x86/kernel/sfi.c b/arch/x86/kernel/sfi.c
index 7ded578..cb22acf 100644
--- a/arch/x86/kernel/sfi.c
+++ b/arch/x86/kernel/sfi.c
@@ -93,7 +93,7 @@ static int __init sfi_parse_ioapic(struct sfi_table_header *table)
 	pentry = (struct sfi_apic_table_entry *)sb->pentry;
 
 	for (i = 0; i < num; i++) {
-		mp_register_ioapic(i, pentry->phys_addr, gsi_end + 1);
+		mp_register_ioapic(i, pentry->phys_addr, gsi_top);
 		pentry++;
 	}
 
-- 
1.6.5.2.143.g8cc62


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [PATCH] x86/sfi: fix ioapic gsi range
  2010-06-08 19:41     ` Eric W. Biederman
@ 2010-06-08 19:12       ` Alan Cox
  2010-06-08 20:56         ` Yuhong Bao
  2010-06-08 20:36       ` H. Peter Anvin
  2010-06-08 20:41       ` jacob pan
  2 siblings, 1 reply; 26+ messages in thread
From: Alan Cox @ 2010-06-08 19:12 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: jacob pan, Arjan van de Ven, LKML, H. Peter Anvin, Ingo Molnar,
	Feng Tang, Len Brown

> The issue is what acpi calls bus 0 irqs, and how drivers deal with
> them.  We wind up having well know irqs:  irqs 3 and 4 for serial
> ports, irq 7 for parallel ports. irqs 14, and 15 for ide.

Only we don't.

IRQ 3/4 for serial is not true on many boxes today that have serial -
in fact its been iffy since about the Thinkpad 600 !
IRQ 7 for parallel is rarely used (and in fact we usually poll)
IRQ 14/15 is wrong for ATA today as its AHCI based on modern boxes

And all the drivers you list are *cross platform* already.

> A bunch of these hardware devices we can get if someone connects up a
> lpc superio chip.

To an x86 PC class system using some very traditional (and no longer
valid) bits of behaviour.

> Even if sfi is never implemented on a platform where that kind of
> hardware exists, the current sfi code is setup to coexist
> simultaneously in the kernel with all of the infrastructure of other
> platforms where those kinds of devices exist.  Which means there can
> be drivers compiled into your kernel that make assumptions about
> special properties of the irqs 0-15.

That would be a driver bug. It would be bite other systems beyond the
legacy PC. In the PC world its been unsafe since PnP BIOS let alone
ACPI.

> With the current code you should get all of the remapping of the
> gsi's out of the legacy irq space without needing to lift a finger,
> and if someone later decides we need an irq override so we can have
> an isa irq present on a weird embedded system on a chip the code will
> be able to handle that easily.

There is only one reason to care about this - that is ISA bus devices
with software IRQ steering registers for the ISA lines. Now that might
just about be a real reason, but as former maintainer of both serial
and IDE (and part time fixer of parport) I'd say the other reasons are
bunkum.

Alan
--

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] x86/sfi: fix ioapic gsi range
  2010-06-08  5:50   ` [PATCH] x86/sfi: fix ioapic gsi range jacob pan
@ 2010-06-08 19:41     ` Eric W. Biederman
  2010-06-08 19:12       ` Alan Cox
                         ` (2 more replies)
  0 siblings, 3 replies; 26+ messages in thread
From: Eric W. Biederman @ 2010-06-08 19:41 UTC (permalink / raw)
  To: jacob pan
  Cc: Alan Cox, Arjan van de Ven, LKML, H. Peter Anvin, Ingo Molnar,
	Feng Tang, Len Brown

jacob pan <jacob.jun.pan@linux.intel.com> writes:

>> > Background:
>> > In Moorestown/Medfield platforms, there is no legacy IRQs, all gsis
>> > and irqs are one to one mapped, including those < 16. Specifically,
>> > IRQ0 and IRQ1 are used for per-cpu timers. So without this patch,
>> > IOAPIC pin to IRQ mapping is off by one.
>> 
>> The patch looks mostly reasonable the comment is wrong.
>> 
>> You may not use a 1-1 mapping if you don't have legacy irqs.  Linux
>> irqs 0-15 are the ISA irqs you may not use those irq numbers for
>> something different on any architecture, but especially not on x86.
>> The gsi numbers are firmware specific and you may treat however you
>> want.
>
> [jacob pan] If we don't have ISA irqs, why can't we have gsi# = irq#
> for the legacy IRQ range? On Moorestown, we are re-using legacy irqs.
> e.g. 
> sh-4.0# cat /proc/interrupts
>           CPU0       CPU1
>   0:       1512          0   IO-APIC-edge      apbt0
>   1:          0       1482   IO-APIC-edge      apbt1
>   9:          0          0   IO-APIC-fasteoi   dw_spi
>  10:          0          0   IO-APIC-fasteoi   mrst_i2c
>  11:          0          0   IO-APIC-fasteoi   mrst_i2c
>  12:          0          0   IO-APIC-fasteoi   mrst_i2c
>  23:          0          0   IO-APIC-fasteoi   intel_scu_ipc
>  27:         21          0   IO-APIC-fasteoi

At the ioapic and gsi level, and in your firmware interface reusing the
numbers is fine.

The issue is what acpi calls bus 0 irqs, and how drivers deal with
them.  We wind up having well know irqs:  irqs 3 and 4 for serial ports,
irq 7 for parallel ports. irqs 14, and 15 for ide.

A bunch of these hardware devices we can get if someone connects up a
lpc superio chip.

Even if sfi is never implemented on a platform where that kind of
hardware exists, the current sfi code is setup to coexist
simultaneously in the kernel with all of the infrastructure of other
platforms where those kinds of devices exist.  Which means there can
be drivers compiled into your kernel that make assumptions about special
properties of the irqs 0-15.

I have seen a lot of weird hard to track issues, because of a conflict in
assumptions over the ISA irqs.  It is easiest and safest just to let the
first 16 linux irq numbers be reserved for the legacy oddness, so code can
make assumptions and we don't have to worry about it.

As for the question about using legacy_pic to detect the absence of an irq
controller that Peter raised.  We can't do that because it should be possible
for an acpi system with all of the legacy hardware to exist without needing
to implement an i8259, or ever run in the historical interrupt delivery mode
of pcs.

With the current code you should get all of the remapping of the gsi's out
of the legacy irq space without needing to lift a finger, and if someone later
decides we need an irq override so we can have an isa irq present on a weird
embedded system on a chip the code will be able to handle that easily.

Eric

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] x86/sfi: fix ioapic gsi range
  2010-06-08 18:11       ` H. Peter Anvin
@ 2010-06-08 20:04         ` Eric W. Biederman
  0 siblings, 0 replies; 26+ messages in thread
From: Eric W. Biederman @ 2010-06-08 20:04 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Alan Cox, Jacob Pan, Alan Cox, Arjan van de Ven, LKML,
	Ingo Molnar, Feng Tang, Len Brown

"H. Peter Anvin" <hpa@zytor.com> writes:

> On 06/08/2010 01:10 AM, Alan Cox wrote:
>>>> You may not use a 1-1 mapping if you don't have legacy irqs.  Linux
>>>> irqs 0-15 are the ISA irqs you may not use those irq numbers for
>> 
>> Linux IRQ 0 is "no IRQ assigned", except buried in certain bits of arch
>> specific historical knowledge.
>> 
>> Also calling 1-15 ISA IRQ lines is also somewhat misleading given they
>> are almost certainly routing for PCI devices. "PIC IRQ routing" maybe -
>> but even that is not really true on a lot of PC hardware today except by
>> convention.
>
> Yes, but I gather IRQ/GSI 0 is an early-acquire primary timer on MRST on
> Moorestown just as on PC/AT... just a different one.  Hence "special" in
> the same sort of way.  I don't really care, personally, though.

Right.  I have to admit I was stunned when I realized that request_irq
works and has worked for a long time with irq 0.  I think that might
actually be a bug.  setup_irq is traditionally used for irq 0.

Eric

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] x86/sfi: fix ioapic gsi range
  2010-06-08 19:41     ` Eric W. Biederman
  2010-06-08 19:12       ` Alan Cox
@ 2010-06-08 20:36       ` H. Peter Anvin
  2010-06-08 20:59         ` Eric W. Biederman
  2010-06-08 20:41       ` jacob pan
  2 siblings, 1 reply; 26+ messages in thread
From: H. Peter Anvin @ 2010-06-08 20:36 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: jacob pan, Alan Cox, Arjan van de Ven, LKML, Ingo Molnar,
	Feng Tang, Len Brown

On 06/08/2010 12:41 PM, Eric W. Biederman wrote:
> 
> At the ioapic and gsi level, and in your firmware interface reusing the
> numbers is fine.
> 
> The issue is what acpi calls bus 0 irqs, and how drivers deal with
> them.  We wind up having well know irqs:  irqs 3 and 4 for serial ports,
> irq 7 for parallel ports. irqs 14, and 15 for ide.
> 
> A bunch of these hardware devices we can get if someone connects up a
> lpc superio chip.
> 
> Even if sfi is never implemented on a platform where that kind of
> hardware exists, the current sfi code is setup to coexist
> simultaneously in the kernel with all of the infrastructure of other
> platforms where those kinds of devices exist.  Which means there can
> be drivers compiled into your kernel that make assumptions about special
> properties of the irqs 0-15.
> 
> I have seen a lot of weird hard to track issues, because of a conflict in
> assumptions over the ISA irqs.  It is easiest and safest just to let the
> first 16 linux irq numbers be reserved for the legacy oddness, so code can
> make assumptions and we don't have to worry about it.
> 
> As for the question about using legacy_pic to detect the absence of an irq
> controller that Peter raised.  We can't do that because it should be possible
> for an acpi system with all of the legacy hardware to exist without needing
> to implement an i8259, or ever run in the historical interrupt delivery mode
> of pcs.
> 
> With the current code you should get all of the remapping of the gsi's out
> of the legacy irq space without needing to lift a finger, and if someone later
> decides we need an irq override so we can have an isa irq present on a weird
> embedded system on a chip the code will be able to handle that easily.
> 

OK, let me ask this, then: if we do that, and we hardcode this as 16
magic IRQs indefinitely, does that mean we lose 16 IDT entries
indefinitely as well?  We are already shy on interrupt vectors, and that
would make me unhappy.

	-hpa

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] x86/sfi: fix ioapic gsi range
  2010-06-08 19:41     ` Eric W. Biederman
  2010-06-08 19:12       ` Alan Cox
  2010-06-08 20:36       ` H. Peter Anvin
@ 2010-06-08 20:41       ` jacob pan
  2010-06-08 21:22         ` Eric W. Biederman
  2 siblings, 1 reply; 26+ messages in thread
From: jacob pan @ 2010-06-08 20:41 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Alan Cox, Arjan van de Ven, LKML, H. Peter Anvin, Ingo Molnar,
	Feng Tang, Len Brown

On Tue, 08 Jun 2010 12:41:45 -0700
ebiederm@xmission.com (Eric W. Biederman) wrote:


> Even if sfi is never implemented on a platform where that kind of
> hardware exists, the current sfi code is setup to coexist
> simultaneously in the kernel with all of the infrastructure of other
> platforms where those kinds of devices exist.  Which means there can
> be drivers compiled into your kernel that make assumptions about
> special properties of the irqs 0-15.
> 
SFI code can be compiled in with ACPI at the same time but at runtime
there is only one used, ACPI take precedence. So there wouldn't be any
additional conflict caused by SFI added APIC tables.

> As for the question about using legacy_pic to detect the absence of
> an irq controller that Peter raised.  We can't do that because it
> should be possible for an acpi system with all of the legacy hardware
> to exist without needing to implement an i8259, or ever run in the
> historical interrupt delivery mode of pcs.
In your case, I don't understand how would it change the calculation of
irq mapping. Even if you don't use i8259 on a x86 PC platform, you
still have NR_LEGACY_IRQS=legacy_pic->nr_legacy_irqs.

On the other side, use NR_LEGACY_IRQS breaks the existing code for
Moorestown in terms of irq-gsi lookup and nr_irqs_gsi.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* RE: [PATCH] x86/sfi: fix ioapic gsi range
  2010-06-08 19:12       ` Alan Cox
@ 2010-06-08 20:56         ` Yuhong Bao
  2010-06-08 22:16           ` Eric W. Biederman
  0 siblings, 1 reply; 26+ messages in thread
From: Yuhong Bao @ 2010-06-08 20:56 UTC (permalink / raw)
  To: alan, ebiederm
  Cc: jacob.jun.pan, arjan, linux-kernel, hpa, mingo, feng.tang, len.brown


> IRQ 14/15 is wrong for ATA today as its AHCI based on modern boxes
Not to mention even before that there was native mode IDE!In fact, XP SP1 and later support switching to native mode IDE on BIOSes supporting it as described in this document:http://www.microsoft.com/whdc/device/storage/Native-modeATA.mspx
Yuhong Bao 		 	   		  
_________________________________________________________________
The New Busy is not the too busy. Combine all your e-mail accounts with Hotmail.
http://www.windowslive.com/campaign/thenewbusy?tile=multiaccount&ocid=PID28326::T:WLMTAGL:ON:WL:en-US:WM_HMP:042010_4

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] x86/sfi: fix ioapic gsi range
  2010-06-08 20:36       ` H. Peter Anvin
@ 2010-06-08 20:59         ` Eric W. Biederman
  2010-06-08 21:08           ` H. Peter Anvin
  0 siblings, 1 reply; 26+ messages in thread
From: Eric W. Biederman @ 2010-06-08 20:59 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: jacob pan, Alan Cox, Arjan van de Ven, LKML, Ingo Molnar,
	Feng Tang, Len Brown

"H. Peter Anvin" <hpa@zytor.com> writes:

> On 06/08/2010 12:41 PM, Eric W. Biederman wrote:
>> 
>> At the ioapic and gsi level, and in your firmware interface reusing the
>> numbers is fine.
>> 
>> The issue is what acpi calls bus 0 irqs, and how drivers deal with
>> them.  We wind up having well know irqs:  irqs 3 and 4 for serial ports,
>> irq 7 for parallel ports. irqs 14, and 15 for ide.
>> 
>> A bunch of these hardware devices we can get if someone connects up a
>> lpc superio chip.
>> 
>> Even if sfi is never implemented on a platform where that kind of
>> hardware exists, the current sfi code is setup to coexist
>> simultaneously in the kernel with all of the infrastructure of other
>> platforms where those kinds of devices exist.  Which means there can
>> be drivers compiled into your kernel that make assumptions about special
>> properties of the irqs 0-15.
>> 
>> I have seen a lot of weird hard to track issues, because of a conflict in
>> assumptions over the ISA irqs.  It is easiest and safest just to let the
>> first 16 linux irq numbers be reserved for the legacy oddness, so code can
>> make assumptions and we don't have to worry about it.
>> 
>> As for the question about using legacy_pic to detect the absence of an irq
>> controller that Peter raised.  We can't do that because it should be possible
>> for an acpi system with all of the legacy hardware to exist without needing
>> to implement an i8259, or ever run in the historical interrupt delivery mode
>> of pcs.
>> 
>> With the current code you should get all of the remapping of the gsi's out
>> of the legacy irq space without needing to lift a finger, and if someone later
>> decides we need an irq override so we can have an isa irq present on a weird
>> embedded system on a chip the code will be able to handle that easily.
>> 
>
> OK, let me ask this, then: if we do that, and we hardcode this as 16
> magic IRQs indefinitely, does that mean we lose 16 IDT entries
> indefinitely as well?  We are already shy on interrupt vectors, and that
> would make me unhappy.

No.  There is no reason to loose 16 IDT entries indefinitely.  We may
need a boot time allocation when we see we have isa irqs, to replace
the static allocation that we have.  But for the most part we dynamically
idt entries aka vector numbers today, and there is no reason we can't
generalize that in the future.

Eric

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] x86/sfi: fix ioapic gsi range
  2010-06-08 20:59         ` Eric W. Biederman
@ 2010-06-08 21:08           ` H. Peter Anvin
  2010-06-08 21:51             ` Eric W. Biederman
  0 siblings, 1 reply; 26+ messages in thread
From: H. Peter Anvin @ 2010-06-08 21:08 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: jacob pan, Alan Cox, Arjan van de Ven, LKML, Ingo Molnar,
	Feng Tang, Len Brown

On 06/08/2010 01:59 PM, Eric W. Biederman wrote:
> 
> No.  There is no reason to loose 16 IDT entries indefinitely.  We may
> need a boot time allocation when we see we have isa irqs, to replace
> the static allocation that we have.  But for the most part we dynamically
> idt entries aka vector numbers today, and there is no reason we can't
> generalize that in the future.
> 

Well, that boot time allocation is one of the things
legacy_pic->nr_legacy_irq is used for, and it really makes sense, I
think.  I would really like to move away from a compile-time allocation,
and I still find it hard to believe it has a reason to exist.

	-hpa


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] x86/sfi: fix ioapic gsi range
  2010-06-08 20:41       ` jacob pan
@ 2010-06-08 21:22         ` Eric W. Biederman
  2010-06-08 22:17           ` jacob pan
  0 siblings, 1 reply; 26+ messages in thread
From: Eric W. Biederman @ 2010-06-08 21:22 UTC (permalink / raw)
  To: jacob pan
  Cc: Alan Cox, Arjan van de Ven, LKML, H. Peter Anvin, Ingo Molnar,
	Feng Tang, Len Brown

jacob pan <jacob.jun.pan@linux.intel.com> writes:

> On Tue, 08 Jun 2010 12:41:45 -0700
> ebiederm@xmission.com (Eric W. Biederman) wrote:
>
>
>> Even if sfi is never implemented on a platform where that kind of
>> hardware exists, the current sfi code is setup to coexist
>> simultaneously in the kernel with all of the infrastructure of other
>> platforms where those kinds of devices exist.  Which means there can
>> be drivers compiled into your kernel that make assumptions about
>> special properties of the irqs 0-15.
>> 
> SFI code can be compiled in with ACPI at the same time but at runtime
> there is only one used, ACPI take precedence. So there wouldn't be any
> additional conflict caused by SFI added APIC tables.
>
>> As for the question about using legacy_pic to detect the absence of
>> an irq controller that Peter raised.  We can't do that because it
>> should be possible for an acpi system with all of the legacy hardware
>> to exist without needing to implement an i8259, or ever run in the
>> historical interrupt delivery mode of pcs.
> In your case, I don't understand how would it change the calculation of
> irq mapping. Even if you don't use i8259 on a x86 PC platform, you
> still have NR_LEGACY_IRQS=legacy_pic->nr_legacy_irqs.
>
> On the other side, use NR_LEGACY_IRQS breaks the existing code for
> Moorestown in terms of irq-gsi lookup and nr_irqs_gsi.

Is this code merged where I should have fixed it in my patchset?

We went through this with acpi having an identity mapping of irq to
gsi mapping and the result is that we (a) developed  weird platform
specific hooks for things that should have been handled by generic
code, and on other systems we lost access to 16 irqs.

It took probably 10 years to sort the acpi irq handling out.  What
I have learned along the way is:
- Sharing irq in software is madness, so a one to one mapping with
  hardware irq is required.
- An identity mapping with gsis is nice but we can't count on the hardware
  designers or the spec designers always doing sane and reasonable things
  so not guaranteeing a particular mapping is important.

If I have actually broken any sfi drivers because you assumed a
particular we are back where we were with ISA, and still haven't
completely escaped.  The abstraction layer should provide all of
the mapping so drivers only see linux irq numbers.

Eric


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] x86/sfi: fix ioapic gsi range
  2010-06-08 21:08           ` H. Peter Anvin
@ 2010-06-08 21:51             ` Eric W. Biederman
  0 siblings, 0 replies; 26+ messages in thread
From: Eric W. Biederman @ 2010-06-08 21:51 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: jacob pan, Alan Cox, Arjan van de Ven, LKML, Ingo Molnar,
	Feng Tang, Len Brown

"H. Peter Anvin" <hpa@zytor.com> writes:

> On 06/08/2010 01:59 PM, Eric W. Biederman wrote:
>> 
>> No.  There is no reason to loose 16 IDT entries indefinitely.  We may
>> need a boot time allocation when we see we have isa irqs, to replace
>> the static allocation that we have.  But for the most part we dynamically
>> idt entries aka vector numbers today, and there is no reason we can't
>> generalize that in the future.
>> 
>
> Well, that boot time allocation is one of the things
> legacy_pic->nr_legacy_irq is used for, and it really makes sense, I
> think.  I would really like to move away from a compile-time allocation,
> and I still find it hard to believe it has a reason to exist.

Interesting.  Using legacy_pic->nr_legacy_irqs certainly isn't the
right way to handle that.  We should just have an init method for the
legacy_pic that just allocates what it needs when it is initialized.

I think we can now run in either just pic mode or just apic mode and
so can kill any code for switching from one mode to another.  That led
to all kinds of complexity.

As time and priorities permit I will send/review patches cleaning up the linux
irq code. 

Eric

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] x86/sfi: fix ioapic gsi range
  2010-06-08 20:56         ` Yuhong Bao
@ 2010-06-08 22:16           ` Eric W. Biederman
  2010-06-08 22:29             ` Alan Cox
  0 siblings, 1 reply; 26+ messages in thread
From: Eric W. Biederman @ 2010-06-08 22:16 UTC (permalink / raw)
  To: Yuhong Bao
  Cc: alan, jacob.jun.pan, arjan, linux-kernel, hpa, mingo, feng.tang,
	len.brown

Yuhong Bao <yuhongbao_386@hotmail.com> writes:

>> IRQ 14/15 is wrong for ATA today as its AHCI based on modern boxes
> Not to mention even before that there was native mode IDE!In fact, XP SP1 and later support switching to native mode IDE on BIOSes supporting it as described in this document:http://www.microsoft.com/whdc/device/storage/Native-modeATA.mspx

One of these days we may even get a system where the designers really
care for being simple and easy to use by software.  Where all devices
will be discoverable pci devices, and all irqs will be msi or msi-x.  No
ioapics, no irq routing tables, just nice simple standards conformant
hardware that we already support.

Until then I guess we get things like Moorestown which are effectively
a reinvention of ISA based systems, with different firmware, and
different non-standard ISA devices.  It isn't particularly fun to
smash yet another incompatible idea into the existing infrastructure.
The cleanups that introduce modularity, flexibility, and
maintainability for irq handling are barely keeping ahead of new
poorly integrated features that make the code brittle again.

Eric

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] x86/sfi: fix ioapic gsi range
  2010-06-08 21:22         ` Eric W. Biederman
@ 2010-06-08 22:17           ` jacob pan
  2010-06-09 23:44             ` Eric W. Biederman
  0 siblings, 1 reply; 26+ messages in thread
From: jacob pan @ 2010-06-08 22:17 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Alan Cox, Arjan van de Ven, LKML, H. Peter Anvin, Ingo Molnar,
	Feng Tang, Len Brown

On Tue, 08 Jun 2010 14:22:18 -0700
ebiederm@xmission.com (Eric W. Biederman) wrote:

> jacob pan <jacob.jun.pan@linux.intel.com> writes:
> 
> > On Tue, 08 Jun 2010 12:41:45 -0700
> > ebiederm@xmission.com (Eric W. Biederman) wrote:
> >
> >
> >> Even if sfi is never implemented on a platform where that kind of
> >> hardware exists, the current sfi code is setup to coexist
> >> simultaneously in the kernel with all of the infrastructure of
> >> other platforms where those kinds of devices exist.  Which means
> >> there can be drivers compiled into your kernel that make
> >> assumptions about special properties of the irqs 0-15.
> >> 
> > SFI code can be compiled in with ACPI at the same time but at
> > runtime there is only one used, ACPI take precedence. So there
> > wouldn't be any additional conflict caused by SFI added APIC tables.
> >
> >> As for the question about using legacy_pic to detect the absence of
> >> an irq controller that Peter raised.  We can't do that because it
> >> should be possible for an acpi system with all of the legacy
> >> hardware to exist without needing to implement an i8259, or ever
> >> run in the historical interrupt delivery mode of pcs.
> > In your case, I don't understand how would it change the
> > calculation of irq mapping. Even if you don't use i8259 on a x86 PC
> > platform, you still have NR_LEGACY_IRQS=legacy_pic->nr_legacy_irqs.
> >
> > On the other side, use NR_LEGACY_IRQS breaks the existing code for
> > Moorestown in terms of irq-gsi lookup and nr_irqs_gsi.
> 
> Is this code merged where I should have fixed it in my patchset?
> 
yes, merged.
> We went through this with acpi having an identity mapping of irq to
> gsi mapping and the result is that we (a) developed  weird platform
> specific hooks for things that should have been handled by generic
> code, and on other systems we lost access to 16 irqs.
> 
> It took probably 10 years to sort the acpi irq handling out.  What
> I have learned along the way is:
> - Sharing irq in software is madness, so a one to one mapping with
>   hardware irq is required.
> - An identity mapping with gsis is nice but we can't count on the
> hardware designers or the spec designers always doing sane and
> reasonable things so not guaranteeing a particular mapping is
> important.
> 
> If I have actually broken any sfi drivers because you assumed a
> particular we are back where we were with ISA, and still haven't
> completely escaped.  The abstraction layer should provide all of
> the mapping so drivers only see linux irq numbers.
> 
> Eric
> 

[jacob pan] 

In arch/x86/kernel/mrst.c we parse SFI MTMR table then
add timer irqs to mp_irqs. what is broken by this patch is
pin_2_irq() lookup for the legacy irq range since we want
NR_IRQS_LEGACY to be 0 on Moorestown. We do have the assumption that
mp_irqs from SFI is 1:1 mapped to IRQs.

Doing this can fix the problem, but you mentioned you have to use
NR_IRQS_LEGACY, which i still don't understand.

--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -1032,7 +1032,7 @@ static int pin_2_irq(int idx, int apic, int pin)
 	} else {
 		u32 gsi = mp_gsi_routing[apic].gsi_base + pin;
 
-		if (gsi >= NR_IRQS_LEGACY)
+		if (gsi >= legacy_pic->nr_legacy_irqs)
 			irq = gsi;


The second problem is nr_irqs_gsi gets an extra 16 for Moorestown.
Similarly, we need this in probe_nr_irqs_gsi:

-	nr = gsi_top + NR_IRQS_LEGACY;
+	nr = gsi_top + legacy_pic->nr_legacy_irqs;

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] x86/sfi: fix ioapic gsi range
  2010-06-08 22:16           ` Eric W. Biederman
@ 2010-06-08 22:29             ` Alan Cox
  0 siblings, 0 replies; 26+ messages in thread
From: Alan Cox @ 2010-06-08 22:29 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Yuhong Bao, alan, jacob.jun.pan, arjan, linux-kernel, hpa, mingo,
	feng.tang, len.brown

> Until then I guess we get things like Moorestown which are effectively
> a reinvention of ISA based systems, with different firmware, and

I'm not quite sure where you get that idea from ?


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [tip:x86/urgent] x86, irq: Rename gsi_end gsi_top, and fix off by one errors
  2010-06-08 18:44     ` [PATCH] x86/irq: Rename gsi_end gsi_top, and fix off by one errors Eric W. Biederman
@ 2010-06-09 22:06       ` tip-bot for Eric W. Biederman
  0 siblings, 0 replies; 26+ messages in thread
From: tip-bot for Eric W. Biederman @ 2010-06-09 22:06 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, hpa, mingo, yinghai.lu, ebiederm, tglx, jacob.jun.pan, hpa

Commit-ID:  a4384df3e24579d6292a1b3b41d500349948f30b
Gitweb:     http://git.kernel.org/tip/a4384df3e24579d6292a1b3b41d500349948f30b
Author:     Eric W. Biederman <ebiederm@xmission.com>
AuthorDate: Tue, 8 Jun 2010 11:44:32 -0700
Committer:  H. Peter Anvin <hpa@linux.intel.com>
CommitDate: Wed, 9 Jun 2010 13:34:06 -0700

x86, irq: Rename gsi_end gsi_top, and fix off by one errors

When I introduced the global variable gsi_end I thought gsi_end on
io_apics was one past the end of the gsi range for the io_apic.  After
it was pointed out the the range on io_apics was inclusive I changed
my global variable to match.  That was a big mistake.  Inclusive
semantics without a range start cannot describe the case when no gsi's
are allocated.  Describing the case where no gsi's are allocated is
important in sfi.c and mpparse.c so that we can assign gsi numbers
instead of blindly copying the gsi assignments the BIOS has done as we
do in the acpi case.

To keep from getting the global variable confused with the gsi range
end rename it gsi_top.

To allow describing the case where no gsi's are allocated have gsi_top
be one place the highest gsi number seen in the system.

This fixes an off by one bug in sfi.c:
Reported-by: jacob pan <jacob.jun.pan@linux.intel.com>

This fixes the same off by one bug in mpparse.c:

This fixes an off unreachable by one bug in acpi/boot.c:irq_to_gsi
Reported-by: Yinghai <yinghai.lu@oracle.com>

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
LKML-Reference: <m17hm9jre7.fsf_-_@fess.ebiederm.org>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
---
 arch/x86/include/asm/io_apic.h |    4 ++--
 arch/x86/kernel/acpi/boot.c    |    8 ++++----
 arch/x86/kernel/apic/io_apic.c |   12 ++++++------
 arch/x86/kernel/mpparse.c      |    2 +-
 arch/x86/kernel/sfi.c          |    2 +-
 5 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/arch/x86/include/asm/io_apic.h b/arch/x86/include/asm/io_apic.h
index 63cb409..9cb2edb 100644
--- a/arch/x86/include/asm/io_apic.h
+++ b/arch/x86/include/asm/io_apic.h
@@ -183,7 +183,7 @@ struct mp_ioapic_gsi{
 	u32 gsi_end;
 };
 extern struct mp_ioapic_gsi  mp_gsi_routing[];
-extern u32 gsi_end;
+extern u32 gsi_top;
 int mp_find_ioapic(u32 gsi);
 int mp_find_ioapic_pin(int ioapic, u32 gsi);
 void __init mp_register_ioapic(int id, u32 address, u32 gsi_base);
@@ -197,7 +197,7 @@ static const int timer_through_8259 = 0;
 static inline void ioapic_init_mappings(void)	{ }
 static inline void ioapic_insert_resources(void) { }
 static inline void probe_nr_irqs_gsi(void)	{ }
-#define gsi_end (NR_IRQS_LEGACY - 1)
+#define gsi_top (NR_IRQS_LEGACY)
 static inline int mp_find_ioapic(u32 gsi) { return 0; }
 
 struct io_apic_irq_attr;
diff --git a/arch/x86/kernel/acpi/boot.c b/arch/x86/kernel/acpi/boot.c
index 60cc405..c05872a 100644
--- a/arch/x86/kernel/acpi/boot.c
+++ b/arch/x86/kernel/acpi/boot.c
@@ -118,7 +118,7 @@ static unsigned int gsi_to_irq(unsigned int gsi)
 	if (gsi >= NR_IRQS_LEGACY)
 		irq = gsi;
 	else
-		irq = gsi_end + 1 + gsi;
+		irq = gsi_top + gsi;
 
 	return irq;
 }
@@ -129,10 +129,10 @@ static u32 irq_to_gsi(int irq)
 
 	if (irq < NR_IRQS_LEGACY)
 		gsi = isa_irq_to_gsi[irq];
-	else if (irq <= gsi_end)
+	else if (irq < gsi_top)
 		gsi = irq;
-	else if (irq <= (gsi_end + NR_IRQS_LEGACY))
-		gsi = irq - gsi_end;
+	else if (irq < (gsi_top + NR_IRQS_LEGACY))
+		gsi = irq - gsi_top;
 	else
 		gsi = 0xffffffff;
 
diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
index 33f3563..e41ed24 100644
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -89,8 +89,8 @@ int nr_ioapics;
 /* IO APIC gsi routing info */
 struct mp_ioapic_gsi  mp_gsi_routing[MAX_IO_APICS];
 
-/* The last gsi number used */
-u32 gsi_end;
+/* The one past the highest gsi number used */
+u32 gsi_top;
 
 /* MP IRQ source entries */
 struct mpc_intsrc mp_irqs[MAX_IRQ_SOURCES];
@@ -1035,7 +1035,7 @@ static int pin_2_irq(int idx, int apic, int pin)
 		if (gsi >= NR_IRQS_LEGACY)
 			irq = gsi;
 		else
-			irq = gsi_end + 1 + gsi;
+			irq = gsi_top + gsi;
 	}
 
 #ifdef CONFIG_X86_32
@@ -3853,7 +3853,7 @@ void __init probe_nr_irqs_gsi(void)
 {
 	int nr;
 
-	nr = gsi_end + 1 + NR_IRQS_LEGACY;
+	nr = gsi_top + NR_IRQS_LEGACY;
 	if (nr > nr_irqs_gsi)
 		nr_irqs_gsi = nr;
 
@@ -4294,8 +4294,8 @@ void __init mp_register_ioapic(int id, u32 address, u32 gsi_base)
 	 */
 	nr_ioapic_registers[idx] = entries;
 
-	if (mp_gsi_routing[idx].gsi_end > gsi_end)
-		gsi_end = mp_gsi_routing[idx].gsi_end;
+	if (mp_gsi_routing[idx].gsi_end >= gsi_top)
+		gsi_top = mp_gsi_routing[idx].gsi_end + 1;
 
 	printk(KERN_INFO "IOAPIC[%d]: apic_id %d, version %d, address 0x%x, "
 	       "GSI %d-%d\n", idx, mp_ioapics[idx].apicid,
diff --git a/arch/x86/kernel/mpparse.c b/arch/x86/kernel/mpparse.c
index 5ae5d24..d86dbf7 100644
--- a/arch/x86/kernel/mpparse.c
+++ b/arch/x86/kernel/mpparse.c
@@ -123,7 +123,7 @@ static void __init MP_ioapic_info(struct mpc_ioapic *m)
 	printk(KERN_INFO "I/O APIC #%d Version %d at 0x%X.\n",
 	       m->apicid, m->apicver, m->apicaddr);
 
-	mp_register_ioapic(m->apicid, m->apicaddr, gsi_end + 1);
+	mp_register_ioapic(m->apicid, m->apicaddr, gsi_top);
 }
 
 static void print_MP_intsrc_info(struct mpc_intsrc *m)
diff --git a/arch/x86/kernel/sfi.c b/arch/x86/kernel/sfi.c
index 7ded578..cb22acf 100644
--- a/arch/x86/kernel/sfi.c
+++ b/arch/x86/kernel/sfi.c
@@ -93,7 +93,7 @@ static int __init sfi_parse_ioapic(struct sfi_table_header *table)
 	pentry = (struct sfi_apic_table_entry *)sb->pentry;
 
 	for (i = 0; i < num; i++) {
-		mp_register_ioapic(i, pentry->phys_addr, gsi_end + 1);
+		mp_register_ioapic(i, pentry->phys_addr, gsi_top);
 		pentry++;
 	}
 

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [PATCH] x86/sfi: fix ioapic gsi range
  2010-06-08 22:17           ` jacob pan
@ 2010-06-09 23:44             ` Eric W. Biederman
  2010-06-10  8:40               ` jacob pan
  0 siblings, 1 reply; 26+ messages in thread
From: Eric W. Biederman @ 2010-06-09 23:44 UTC (permalink / raw)
  To: jacob pan
  Cc: Alan Cox, Arjan van de Ven, LKML, H. Peter Anvin, Ingo Molnar,
	Feng Tang, Len Brown

jacob pan <jacob.jun.pan@linux.intel.com> writes:

> [jacob pan] 
>
> In arch/x86/kernel/mrst.c we parse SFI MTMR table then
> add timer irqs to mp_irqs. what is broken by this patch is
> pin_2_irq() lookup for the legacy irq range since we want
> NR_IRQS_LEGACY to be 0 on Moorestown. We do have the assumption that
> mp_irqs from SFI is 1:1 mapped to IRQs.

NR_IRQS_LEGACY is a constant of 16.

> Doing this can fix the problem, but you mentioned you have to use
> NR_IRQS_LEGACY, which i still don't understand.

Looking at the code in io_apic.c there is a relatively clean way to
handle this.  The actual concept in io_apic.c today is mp_bus_not_pci.

So you just need to do:

diff --git a/arch/x86/kernel/mrst.c b/arch/x86/kernel/mrst.c
index e796448..9377fda 100644
--- a/arch/x86/kernel/mrst.c
+++ b/arch/x86/kernel/mrst.c
@@ -242,4 +242,5 @@ void __init x86_mrst_early_setup(void)
 	x86_init.mpparse.find_smp_config = x86_init_noop;
 	x86_init.mpparse.get_smp_config = x86_init_uint_noop;
 
+	set_bit(0, mp_bus_not_pci);
 }


Then you get treated as ISA for purposes of pin_2_irq, and everything
is a pass through.  Since that seems to be what you want anyway I don't
see a problem with that for now.

Using legacy pic to even talk about this behavior is wrong as that is
hardware abstraction and the presence or absence of an i8259 has
nothing to do with the presence of ISA irqs and their descendants.

Eric

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [PATCH] x86/sfi: fix ioapic gsi range
  2010-06-09 23:44             ` Eric W. Biederman
@ 2010-06-10  8:40               ` jacob pan
  2010-06-10 14:39                 ` Eric W. Biederman
  0 siblings, 1 reply; 26+ messages in thread
From: jacob pan @ 2010-06-10  8:40 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Alan Cox, Arjan van de Ven, LKML, H. Peter Anvin, Ingo Molnar,
	Feng Tang, Len Brown

Eric W. Biederman Wed, 09 Jun 2010 16:44:11 -0700
>jacob pan <jacob.jun.pan@linux.intel.com> writes:
>
>> [jacob pan] 
>>
>> In arch/x86/kernel/mrst.c we parse SFI MTMR table then
>> add timer irqs to mp_irqs. what is broken by this patch is
>> pin_2_irq() lookup for the legacy irq range since we want
>> NR_IRQS_LEGACY to be 0 on Moorestown. We do have the assumption that
>> mp_irqs from SFI is 1:1 mapped to IRQs.
>
>NR_IRQS_LEGACY is a constant of 16.
i know, what i meant is we want this to be a variable.
>
>> Doing this can fix the problem, but you mentioned you have to use
>> NR_IRQS_LEGACY, which i still don't understand.
>
>Looking at the code in io_apic.c there is a relatively clean way to
>handle this.  The actual concept in io_apic.c today is mp_bus_not_pci.
>
>So you just need to do:
>
>diff --git a/arch/x86/kernel/mrst.c b/arch/x86/kernel/mrst.c
>index e796448..9377fda 100644
>--- a/arch/x86/kernel/mrst.c
>+++ b/arch/x86/kernel/mrst.c
>@@ -242,4 +242,5 @@ void __init x86_mrst_early_setup(void)
> 	x86_init.mpparse.find_smp_config = x86_init_noop;
> 	x86_init.mpparse.get_smp_config = x86_init_uint_noop;
> 
>+	set_bit(0, mp_bus_not_pci);
> }
>
>
>Then you get treated as ISA for purposes of pin_2_irq, and everything
>is a pass through.  Since that seems to be what you want anyway I don't
>see a problem with that for now.
>
I think this may work better. Since we already assign srcbus = MP_ISA_BUS to 
use the mp_irqs, this may be more inline with the current code.
Thanks for the pointer.

But the nr_irqs_gsi would still be wrong for Moorestown if don't use legacy_pic
or some sort of runtime detection.

>Using legacy pic to even talk about this behavior is wrong as that is
>hardware abstraction and the presence or absence of an i8259 has
>nothing to do with the presence of ISA irqs and their descendants.
>
>Eric

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] x86/sfi: fix ioapic gsi range
  2010-06-10  8:40               ` jacob pan
@ 2010-06-10 14:39                 ` Eric W. Biederman
  0 siblings, 0 replies; 26+ messages in thread
From: Eric W. Biederman @ 2010-06-10 14:39 UTC (permalink / raw)
  To: jacob pan
  Cc: Alan Cox, Arjan van de Ven, LKML, H. Peter Anvin, Ingo Molnar,
	Feng Tang, Len Brown

jacob pan <jacob.jun.pan@linux.intel.com> writes:

> Eric W. Biederman Wed, 09 Jun 2010 16:44:11 -0700
>>jacob pan <jacob.jun.pan@linux.intel.com> writes:
>>
>>> [jacob pan] 
>>>
>>> In arch/x86/kernel/mrst.c we parse SFI MTMR table then
>>> add timer irqs to mp_irqs. what is broken by this patch is
>>> pin_2_irq() lookup for the legacy irq range since we want
>>> NR_IRQS_LEGACY to be 0 on Moorestown. We do have the assumption that
>>> mp_irqs from SFI is 1:1 mapped to IRQs.
>>
>>NR_IRQS_LEGACY is a constant of 16.
> i know, what i meant is we want this to be a variable.

No.  Not on that code path.  As that code path has nothing
to do what what hardware is present.

There are a bunch of other very awkward usages of
legacy_pic->nr_legacy_irqs that should be removed with cleanups.

Things like EISA_ELCR are very wrong to have a variable in there.
We already have a probe for ISA being present so we don't need an
extra one.

>>> Doing this can fix the problem, but you mentioned you have to use
>>> NR_IRQS_LEGACY, which i still don't understand.
>>
>>Looking at the code in io_apic.c there is a relatively clean way to
>>handle this.  The actual concept in io_apic.c today is mp_bus_not_pci.
>>
>>So you just need to do:
>>
>>diff --git a/arch/x86/kernel/mrst.c b/arch/x86/kernel/mrst.c
>>index e796448..9377fda 100644
>>--- a/arch/x86/kernel/mrst.c
>>+++ b/arch/x86/kernel/mrst.c
>>@@ -242,4 +242,5 @@ void __init x86_mrst_early_setup(void)
>> 	x86_init.mpparse.find_smp_config = x86_init_noop;
>> 	x86_init.mpparse.get_smp_config = x86_init_uint_noop;
>> 
>>+	set_bit(0, mp_bus_not_pci);
>> }
>>
>>
>>Then you get treated as ISA for purposes of pin_2_irq, and everything
>>is a pass through.  Since that seems to be what you want anyway I don't
>>see a problem with that for now.
>>
> I think this may work better. Since we already assign srcbus = MP_ISA_BUS to 
> use the mp_irqs, this may be more inline with the current code.
> Thanks for the pointer.
>
> But the nr_irqs_gsi would still be wrong for Moorestown if don't use legacy_pic
> or some sort of runtime detection.

nr_irqs_gsi has a possibly confusing name.

nr_irqs_gsi is the number of linux irqs that we reserve for handling
the gsis.  All of the rest of the irqs are allowed to be dynamically
allocated for MSIs and the like.  As long as the number is >= the number
of gsis you are fine. 

Unless you have patches that do strange and use nr_irqs_gsi for something
else I don't see why you would have a problem with nr_irqs_gsi.

Eric

^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2010-06-10 14:40 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-06-07 23:07 [PATCH] x86/sfi: fix ioapic gsi range Jacob Pan
2010-06-08  0:01 ` jacob pan
2010-06-08  0:24 ` Eric W. Biederman
2010-06-08  0:30   ` H. Peter Anvin
2010-06-08  1:10     ` Eric W. Biederman
2010-06-08  8:10     ` Alan Cox
2010-06-08 18:11       ` H. Peter Anvin
2010-06-08 20:04         ` Eric W. Biederman
2010-06-08 18:44     ` [PATCH] x86/irq: Rename gsi_end gsi_top, and fix off by one errors Eric W. Biederman
2010-06-09 22:06       ` [tip:x86/urgent] x86, irq: " tip-bot for Eric W. Biederman
2010-06-08  5:50   ` [PATCH] x86/sfi: fix ioapic gsi range jacob pan
2010-06-08 19:41     ` Eric W. Biederman
2010-06-08 19:12       ` Alan Cox
2010-06-08 20:56         ` Yuhong Bao
2010-06-08 22:16           ` Eric W. Biederman
2010-06-08 22:29             ` Alan Cox
2010-06-08 20:36       ` H. Peter Anvin
2010-06-08 20:59         ` Eric W. Biederman
2010-06-08 21:08           ` H. Peter Anvin
2010-06-08 21:51             ` Eric W. Biederman
2010-06-08 20:41       ` jacob pan
2010-06-08 21:22         ` Eric W. Biederman
2010-06-08 22:17           ` jacob pan
2010-06-09 23:44             ` Eric W. Biederman
2010-06-10  8:40               ` jacob pan
2010-06-10 14:39                 ` Eric W. Biederman

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.