linux-kernel.vger.kernel.org archive mirror
* 2.6.x BUGs at boot time (APIC related)
@ 2004-12-22 17:31 Denis Vlasenko
  2004-12-23 11:02 ` Denis Vlasenko
  0 siblings, 1 reply; 8+ messages in thread
From: Denis Vlasenko @ 2004-12-22 17:31 UTC
  To: mingo; +Cc: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 907 bytes --]

This is happening on a "HP Compaq dc7100 CMT"
(I believe it is a model number - taken from the label on the box).

Both 2.6.9 and 2.6.10-rc3 are dying this way:

[top of visible screen]
I/O APIC #1 Version 17 at 0xFEC00000
Enabled APIC mode: Flat. Using 1 I/O APICs
...
[unrelated stuff (dentry cache size etc...)]
...
Enabling fast FPU save & restore ...done
Enabling unmasked SIMD FPU exception support ...done
Checking 'hlt' instruction ..OK
------------------------
kernel BUG at arch/i386/kernel/apic.c:388! [:366! for 2.6.9]


Code:
....
void __init setup_local_APIC (void)
{
        ...
        /*
         * Double-check whether this APIC is really registered.
         */
        if (!apic_id_registered())
                BUG();   <=========================

2.4.27-rc3 boots just fine on the same hardware.
The oops, lspci output and .configs are in the attached tarball.
--
vda

[-- Attachment #2: BUG_APIC.tar.bz2 --]
[-- Type: application/x-tbz, Size: 21634 bytes --]


* Re: 2.6.x BUGs at boot time (APIC related)
  2004-12-23 11:02 ` Denis Vlasenko
@ 2004-12-23  9:12   ` William Lee Irwin III
  2004-12-23 14:57     ` Denis Vlasenko
  2004-12-23  9:33   ` Arnaud Patard
  1 sibling, 1 reply; 8+ messages in thread
From: William Lee Irwin III @ 2004-12-23  9:12 UTC
  To: Denis Vlasenko; +Cc: mingo, linux-kernel

On Wednesday 22 December 2004 17:31, Denis Vlasenko wrote:
>>         if (!apic_id_registered())
>>                 BUG();   <=========================

On Thu, Dec 23, 2004 at 11:02:09AM +0000, Denis Vlasenko wrote:
> Tested with noapic nolapic boot params. Still happens.
> Call chain is init() -> APIC_init_uniprocessor() ->
> ->  setup_local_APIC(). I am a bit suspicious why
> APIC_init_uniprocessor() does not bail out
> if enable_local_apic<0 (i.e. if I boot with "nolapic"):
> int __init APIC_init_uniprocessor (void)
> {
>         if (enable_local_apic < 0)
>                 clear_bit(X86_FEATURE_APIC, boot_cpu_data.x86_capability);
> 		<===== missing "return -1"?
> 
>         if (!smp_found_config && !cpu_has_apic)
>                 return -1;
> ...

Sounds pretty serious. What happens if you add the missing return -1?

-- wli


* Re: 2.6.x BUGs at boot time (APIC related)
  2004-12-23 11:02 ` Denis Vlasenko
  2004-12-23  9:12   ` William Lee Irwin III
@ 2004-12-23  9:33   ` Arnaud Patard
  1 sibling, 0 replies; 8+ messages in thread
From: Arnaud Patard @ 2004-12-23  9:33 UTC
  To: Denis Vlasenko; +Cc: mingo, linux-kernel, wli

Denis Vlasenko <vda@port.imtp.ilyichevsk.odessa.ua> writes:

> On Wednesday 22 December 2004 17:31, Denis Vlasenko wrote:
>> This is happening on a "HP Compaq dc7100 CMT"
>> (I believe it is a model number - taken from the label on the box).

iirc, dc7100 is the name of the minitower box, not of the computer :)

>>
>> Both 2.6.9 and 2.6.10-rc3 are dying this way:
>>
>> [top of visible screen]
>> I/O APIC #1 Version 17 at 0xFEC00000
>> Enabled APIC mode: Flat. Using 1 I/O APICs
>> ...
>> [unrelated stuff (dentry cache size etc...)]
>> ...
>> Enabling fast FPU save & restore ...done
>> Enabling unmasked SIMD FPU exception support ...done
>> Checking 'hlt' instruction ..OK
>> ------------------------
>> kernel BUG at arch/i386/kernel/apic.c:388! [:366! for 2.6.9]

I've taken a look at your .config for 2.6.10 and you don't have ACPI
enabled (see below). Please try with ACPI enabled.



* Re: 2.6.x BUGs at boot time (APIC related)
  2004-12-22 17:31 2.6.x BUGs at boot time (APIC related) Denis Vlasenko
@ 2004-12-23 11:02 ` Denis Vlasenko
  2004-12-23  9:12   ` William Lee Irwin III
  2004-12-23  9:33   ` Arnaud Patard
  0 siblings, 2 replies; 8+ messages in thread
From: Denis Vlasenko @ 2004-12-23 11:02 UTC
  To: mingo; +Cc: linux-kernel, wli

On Wednesday 22 December 2004 17:31, Denis Vlasenko wrote:
> This is happening on a "HP Compaq dc7100 CMT"
> (I believe it is a model number - taken from the label on the box).
>
> Both 2.6.9 and 2.6.10-rc3 are dying this way:
>
> [top of visible screen]
> I/O APIC #1 Version 17 at 0xFEC00000
> Enabled APIC mode: Flat. Using 1 I/O APICs
> ...
> [unrelated stuff (dentry cache size etc...)]
> ...
> Enabling fast FPU save & restore ...done
> Enabling unmasked SIMD FPU exception support ...done
> Checking 'hlt' instruction ..OK
> ------------------------
> kernel BUG at arch/i386/kernel/apic.c:388! [:366! for 2.6.9]
>
>
> Code:
> ....
> void __init setup_local_APIC (void)
> {
>         ...
>         /*
>          * Double-check whether this APIC is really registered.
>          */
>         if (!apic_id_registered())
>                 BUG();   <=========================

Tested with the noapic and nolapic boot params. It still happens.

The call chain is init() -> APIC_init_uniprocessor() ->
setup_local_APIC(). I am a bit suspicious about why
APIC_init_uniprocessor() does not bail out
if enable_local_apic < 0 (i.e. if I boot with "nolapic"):

int __init APIC_init_uniprocessor (void)
{
        if (enable_local_apic < 0)
                clear_bit(X86_FEATURE_APIC, boot_cpu_data.x86_capability);
		<===== missing "return -1"?

        if (!smp_found_config && !cpu_has_apic)
                return -1;
...
        verify_local_APIC();

        connect_bsp_APIC();

        phys_cpu_present_map = physid_mask_of_physid(boot_cpu_physical_apicid);

        setup_local_APIC();  <=== will die there

--
vda
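
The control flow described above can be modelled in user space. This is
only a sketch, not kernel code: struct boot_state, enum outcome and
model_init_uniprocessor() are hypothetical names, and the assumption that
this machine has smp_found_config set while apic_id_registered() fails is
inferred from the report, not confirmed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space stand-ins for the kernel state discussed above. */
struct boot_state {
	int  enable_local_apic;   /* < 0 models booting with "nolapic" */
	bool smp_found_config;    /* an MP configuration table was found */
	bool cpu_has_apic;        /* models the X86_FEATURE_APIC bit */
	bool apic_id_registered;  /* result of the check in setup_local_APIC() */
};

enum outcome { RETURNED_EARLY, HIT_BUG, APIC_SET_UP };

/*
 * Models APIC_init_uniprocessor(). With early_return == false this follows
 * the code as quoted; with early_return == true it adds the "return -1".
 */
enum outcome model_init_uniprocessor(struct boot_state s, bool early_return)
{
	if (s.enable_local_apic < 0) {
		s.cpu_has_apic = false;        /* clear_bit(X86_FEATURE_APIC, ...) */
		if (early_return)
			return RETURNED_EARLY; /* the proposed "return -1" */
	}

	if (!s.smp_found_config && !s.cpu_has_apic)
		return RETURNED_EARLY;

	/* verify_local_APIC(), connect_bsp_APIC() etc. are not modelled. */

	if (!s.apic_id_registered)
		return HIT_BUG;                /* BUG() in setup_local_APIC() */
	return APIC_SET_UP;
}

/* Assumed state of the failing box when booted with "nolapic". */
void model_self_test(void)
{
	struct boot_state nolapic = { -1, true, true, false };

	assert(model_init_uniprocessor(nolapic, false) == HIT_BUG);
	assert(model_init_uniprocessor(nolapic, true) == RETURNED_EARLY);
}
```

Under these assumptions, clearing the feature bit alone does not stop the
function, because the second test only returns when smp_found_config is
also false.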


* Re: 2.6.x BUGs at boot time (APIC related)
  2004-12-23  9:12   ` William Lee Irwin III
@ 2004-12-23 14:57     ` Denis Vlasenko
  0 siblings, 0 replies; 8+ messages in thread
From: Denis Vlasenko @ 2004-12-23 14:57 UTC
  To: William Lee Irwin III; +Cc: mingo, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1026 bytes --]

On Thursday 23 December 2004 09:12, William Lee Irwin III wrote:
> On Wednesday 22 December 2004 17:31, Denis Vlasenko wrote:
> >>         if (!apic_id_registered())
> >>                 BUG();   <=========================
>
> On Thu, Dec 23, 2004 at 11:02:09AM +0000, Denis Vlasenko wrote:
> > Tested with noapic nolapic boot params. Still happens.
> > Call chain is init() -> APIC_init_uniprocessor() ->
> > ->  setup_local_APIC(). I am a bit suspicious why
> > APIC_init_uniprocessor() does not bail out
> > if enable_local_apic<0 (i.e. if I boot with "nolapic"):
> > int __init APIC_init_uniprocessor (void)
> > {
> >         if (enable_local_apic < 0)
> > >                 clear_bit(X86_FEATURE_APIC, boot_cpu_data.x86_capability);
> > > 		<===== missing "return -1"?
> >
> >         if (!smp_found_config && !cpu_has_apic)
> >                 return -1;
> > ...
>
> Sounds pretty serious. What happens if you add the missing return -1?

Just tested that. It booted ok. Patch is in attachment.
-- 
vda

[-- Attachment #2: apic.c.diff --]
[-- Type: text/x-diff, Size: 423 bytes --]

--- linux-2.6.10-rc3.src/arch/i386/kernel/apic.c.old	Mon Dec 20 14:13:59 2004
+++ linux-2.6.10-rc3.src/arch/i386/kernel/apic.c	Thu Dec 23 08:54:13 2004
@@ -1250,8 +1250,10 @@
  */
 int __init APIC_init_uniprocessor (void)
 {
-	if (enable_local_apic < 0)
+	if (enable_local_apic < 0) {
 		clear_bit(X86_FEATURE_APIC, boot_cpu_data.x86_capability);
+		return -1;
+	}
 
 	if (!smp_found_config && !cpu_has_apic)
 		return -1;


* Re: 2.6.x BUGs at boot time (APIC related)
@ 2004-12-23 16:59 Chuck Ebbert
  0 siblings, 0 replies; 8+ messages in thread
From: Chuck Ebbert @ 2004-12-23 16:59 UTC
  To: Mikael Pettersson
  Cc: Ingo Molnar, linux-kernel, William Lee Irwin III, Denis Vlasenko

vda wrote:

> --- linux-2.6.10-rc3.src/arch/i386/kernel/apic.c.old  Mon Dec 20 14:13:59 2004
> +++ linux-2.6.10-rc3.src/arch/i386/kernel/apic.c      Thu Dec 23 08:54:13 2004
> @@ -1250,8 +1250,10 @@
>   */
>  int __init APIC_init_uniprocessor (void)
>  {
> -     if (enable_local_apic < 0)
> +     if (enable_local_apic < 0) {
>               clear_bit(X86_FEATURE_APIC, boot_cpu_data.x86_capability);
> +             return -1;
> +     }
>  
>       if (!smp_found_config && !cpu_has_apic)
>               return -1;


Mikael Pettersson wrote:

> The early return just hides the real bug, whatever it is.


How about this:

-       if (!smp_found_config && !cpu_has_apic)
+       if (!smp_found_config || !cpu_has_apic)

--
Please take it as a sign of my infinite respect for you,
that I insist on you doing all the work.
                                        -- Rusty Russell
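
To see what the one-character change above would do, the two conditions
can be compared in a user-space sketch. bail_and() and bail_or() are
hypothetical helper names, not kernel functions; the two forms differ
only when exactly one of smp_found_config and cpu_has_apic is false.

```c
#include <assert.h>
#include <stdbool.h>

/* Current test: return -1 only when BOTH the MP table and the APIC bit are missing. */
bool bail_and(bool smp_found_config, bool cpu_has_apic)
{
	return !smp_found_config && !cpu_has_apic;
}

/* Proposed test: return -1 when EITHER the MP table or the APIC bit is missing. */
bool bail_or(bool smp_found_config, bool cpu_has_apic)
{
	return !smp_found_config || !cpu_has_apic;
}

/* "nolapic" with an MP table present: feature bit cleared, config found. */
void bail_self_test(void)
{
	assert(!bail_and(true, false)); /* falls through to setup_local_APIC() */
	assert(bail_or(true, false));   /* returns -1 before setup_local_APIC() */
}
```

The || form would also return early on a machine that has a local APIC
but no MP table, which the && form does not do.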


* Re: 2.6.x BUGs at boot time (APIC related)
  2004-12-23 16:11 Mikael Pettersson
@ 2004-12-23 16:22 ` William Lee Irwin III
  0 siblings, 0 replies; 8+ messages in thread
From: William Lee Irwin III @ 2004-12-23 16:22 UTC
  To: Mikael Pettersson; +Cc: vda, linux-kernel, mingo

At some point in the past, I wrote:
>>> Sounds pretty serious. What happens if you add the missing return -1?

On Thu, 23 Dec 2004 14:57:25 +0000, Denis Vlasenko wrote:
>> Just tested that. It booted ok. Patch is in attachment.

On Thu, Dec 23, 2004 at 05:11:39PM +0100, Mikael Pettersson wrote:
> The early return just hides the real bug, whatever it is.
> I'm suspecting some bogosity with boot_cpu_physical_apicid,
> or possibly smp_found_config. Please remove the early return
> and try the patch below instead.

Dropping the early return means nolapic is not honored in this
codepath. I realize it doesn't have much impact on the bug that
happens while nolapic is not passed. Thanks for fixing that.

Also, it should probably not have to clear X86_FEATURE_APIC from
boot_cpu_data.x86_capability, because lapic_disable() already did
so. Tracking down where that is being set (if it indeed is) when
enable_local_apic < 0 may be useful.


-- wli


* Re: 2.6.x BUGs at boot time (APIC related)
@ 2004-12-23 16:11 Mikael Pettersson
  2004-12-23 16:22 ` William Lee Irwin III
  0 siblings, 1 reply; 8+ messages in thread
From: Mikael Pettersson @ 2004-12-23 16:11 UTC
  To: vda, wli; +Cc: linux-kernel, mingo

On Thu, 23 Dec 2004 14:57:25 +0000, Denis Vlasenko wrote:
>> On Thu, Dec 23, 2004 at 11:02:09AM +0000, Denis Vlasenko wrote:
>> > Tested with noapic nolapic boot params. Still happens.
>> > Call chain is init() -> APIC_init_uniprocessor() ->
>> > ->  setup_local_APIC(). I am a bit suspicious why
>> > APIC_init_uniprocessor() does not bail out
>> > if enable_local_apic<0 (i.e. if I boot with "nolapic"):
>> > int __init APIC_init_uniprocessor (void)
>> > {
>> >         if (enable_local_apic < 0)
>> >                 clear_bit(X86_FEATURE_APIC, boot_cpu_data.x86_capability);
>> > 		<===== missing "return -1"?
>> >
>> >         if (!smp_found_config && !cpu_has_apic)
>> >                 return -1;
>> > ...
>>
>> Sounds pretty serious. What happens if you add the missing return -1?
>
>Just tested that. It booted ok. Patch is in attachment.

The early return just hides the real bug, whatever it is.
I'm suspecting some bogosity with boot_cpu_physical_apicid,
or possibly smp_found_config. Please remove the early return
and try the patch below instead.

/Mikael

--- linux-2.6.10-rc3/arch/i386/kernel/apic.c.~1~	2004-12-23 15:44:26.000000000 +0100
+++ linux-2.6.10-rc3/arch/i386/kernel/apic.c	2004-12-23 16:38:27.000000000 +0100
@@ -363,7 +363,7 @@ void __init init_bsp_APIC(void)
 	apic_write_around(APIC_LVT1, value);
 }
 
-void __init setup_local_APIC (void)
+int __init setup_local_APIC (void)
 {
 	unsigned long oldvalue, value, ver, maxlvt;
 
@@ -384,8 +384,10 @@ void __init setup_local_APIC (void)
 	/*
 	 * Double-check whether this APIC is really registered.
 	 */
+	printk("%s: boot_cpu_physical_apicid == %#x\n", __FUNCTION__, boot_cpu_physical_apicid);
+	printk("%s: apic_read(APIC_ID) == %#lx\n", __FUNCTION__, apic_read(APIC_ID));
 	if (!apic_id_registered())
-		BUG();
+		return -1;
 
 	/*
 	 * Intel recommends to set DFR, LDR and TPR before enabling
@@ -511,6 +513,7 @@ void __init setup_local_APIC (void)
 	if (nmi_watchdog == NMI_LOCAL_APIC)
 		setup_apic_nmi_watchdog();
 	apic_pm_activate();
+	return 0;
 }
 
 /*
@@ -809,6 +812,8 @@ void __init init_apic_mappings(void)
 	 */
 	if (boot_cpu_physical_apicid == -1U)
 		boot_cpu_physical_apicid = GET_APIC_ID(apic_read(APIC_ID));
+	printk("%s: boot_cpu_physical_apicid == %#x\n", __FUNCTION__, boot_cpu_physical_apicid);
+	printk("%s: apic_read(APIC_ID) == %#lx\n", __FUNCTION__, apic_read(APIC_ID));
 
 #ifdef CONFIG_X86_IO_APIC
 	{
@@ -1271,7 +1276,8 @@ int __init APIC_init_uniprocessor (void)
 
 	phys_cpu_present_map = physid_mask_of_physid(boot_cpu_physical_apicid);
 
-	setup_local_APIC();
+	if (setup_local_APIC() < 0)
+		return -1;
 
 	if (nmi_watchdog == NMI_LOCAL_APIC)
 		check_nmi_watchdog();
--- linux-2.6.10-rc3/arch/i386/kernel/mpparse.c.~1~	2004-12-23 15:44:26.000000000 +0100
+++ linux-2.6.10-rc3/arch/i386/kernel/mpparse.c	2004-12-23 16:32:16.000000000 +0100
@@ -180,6 +180,7 @@ void __init MP_processor_info (struct mp
 	if (m->mpc_cpuflag & CPU_BOOTPROCESSOR) {
 		Dprintk("    Bootup CPU\n");
 		boot_cpu_physical_apicid = m->mpc_apicid;
+		printk("%s: boot_cpu_physical_apicid == %#x\n", __FUNCTION__, boot_cpu_physical_apicid);
 		boot_cpu_logical_apicid = apicid;
 	}
 
@@ -823,6 +824,7 @@ void __init mp_register_lapic_address (
 
 	if (boot_cpu_physical_apicid == -1U)
 		boot_cpu_physical_apicid = GET_APIC_ID(apic_read(APIC_ID));
+	printk("%s: boot_cpu_physical_apicid == %#x\n", __FUNCTION__, boot_cpu_physical_apicid);
 
 	Dprintk("Boot CPU = %d\n", boot_cpu_physical_apicid);
 }
--- linux-2.6.10-rc3/arch/i386/kernel/smpboot.c.~1~	2004-10-19 13:01:17.000000000 +0200
+++ linux-2.6.10-rc3/arch/i386/kernel/smpboot.c	2004-12-23 16:33:33.000000000 +0100
@@ -913,6 +913,7 @@ static void __init smp_boot_cpus(unsigne
 	print_cpu_info(&cpu_data[0]);
 
 	boot_cpu_physical_apicid = GET_APIC_ID(apic_read(APIC_ID));
+	printk("%s: boot_cpu_physical_apicid == %#x\n", __FUNCTION__, boot_cpu_physical_apicid);
 	boot_cpu_logical_apicid = logical_smp_processor_id();
 	x86_cpu_to_apicid[0] = boot_cpu_physical_apicid;
 
--- linux-2.6.10-rc3/include/asm-i386/apic.h.~1~	2004-12-23 15:44:30.000000000 +0100
+++ linux-2.6.10-rc3/include/asm-i386/apic.h	2004-12-23 16:37:53.000000000 +0100
@@ -94,7 +94,7 @@ extern int verify_local_APIC (void);
 extern void cache_APIC_registers (void);
 extern void sync_Arb_IDs (void);
 extern void init_bsp_APIC (void);
-extern void setup_local_APIC (void);
+extern int setup_local_APIC (void);
 extern void init_apic_mappings (void);
 extern void smp_local_timer_interrupt (struct pt_regs * regs);
 extern void setup_boot_APIC_clock (void);
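
When reading the output of the printk() calls added above, note that
apic_read(APIC_ID) is printed raw, while boot_cpu_physical_apicid holds an
already decoded ID. A minimal sketch of the decoding follows, assuming the
default i386 subarchitecture where GET_APIC_ID() takes the 4 bits starting
at bit 24. decode_apic_id() is a hypothetical user-space stand-in, and
other subarchitectures may use a wider mask.

```c
#include <assert.h>

/* Assumed layout: local APIC ID in bits 24..27 of the APIC_ID register. */
unsigned int decode_apic_id(unsigned long reg)
{
	return (unsigned int)((reg >> 24) & 0xF);
}

/* A raw register value of 0x01000000 would correspond to APIC ID 1. */
void decode_self_test(void)
{
	assert(decode_apic_id(0x00000000UL) == 0);
	assert(decode_apic_id(0x01000000UL) == 1);
	assert(decode_apic_id(0x0F000000UL) == 15);
}
```

So the two printed values should be compared after shifting the raw
register value, not digit for digit.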


Thread overview: 8+ messages
2004-12-22 17:31 2.6.x BUGs at boot time (APIC related) Denis Vlasenko
2004-12-23 11:02 ` Denis Vlasenko
2004-12-23  9:12   ` William Lee Irwin III
2004-12-23 14:57     ` Denis Vlasenko
2004-12-23  9:33   ` Arnaud Patard
2004-12-23 16:11 Mikael Pettersson
2004-12-23 16:22 ` William Lee Irwin III
2004-12-23 16:59 Chuck Ebbert
