* 2.6.x BUGs at boot time (APIC related)
@ 2004-12-22 17:31 Denis Vlasenko
2004-12-23 11:02 ` Denis Vlasenko
0 siblings, 1 reply; 8+ messages in thread
From: Denis Vlasenko @ 2004-12-22 17:31 UTC
To: mingo; +Cc: linux-kernel
[-- Attachment #1: Type: text/plain, Size: 907 bytes --]
This is happening on a "HP Compaq dc7100 CMT"
(I believe it is a model number - taken from the label on the box).
Both 2.6.9 and 2.6.10-rc3 are dying this way:
[top of visible screen]
I/O APIC #1 Version 17 at 0xFEC00000
Enabled APIC mode: Flat. Using 1 I/O APICs
...
[unrelated stuff (dentry cache size etc...)]
...
Enabling fast FPU save & restore ...done
Enabling unmasked SIMD FPU exception support ...done
Checking 'hlt' instruction ..OK
------------------------
kernel BUG at arch/i386/kernel/apic.c:388! [:366! for 2.6.9]
Code:
....
void __init setup_local_APIC (void)
{
...
        /*
         * Double-check whether this APIC is really registered.
         */
        if (!apic_id_registered())
                BUG(); <=========================
2.4.27-rc3 boots just fine on the same hardware.
The oops, lspci output and .configs are in the attached tarball.
--
vda
[-- Attachment #2: BUG_APIC.tar.bz2 --]
[-- Type: application/x-tbz, Size: 21634 bytes --]
* Re: 2.6.x BUGs at boot time (APIC related)
2004-12-23 11:02 ` Denis Vlasenko
@ 2004-12-23 9:12 ` William Lee Irwin III
2004-12-23 14:57 ` Denis Vlasenko
2004-12-23 9:33 ` Arnaud Patard
1 sibling, 1 reply; 8+ messages in thread
From: William Lee Irwin III @ 2004-12-23 9:12 UTC
To: Denis Vlasenko; +Cc: mingo, linux-kernel
On Wednesday 22 December 2004 17:31, Denis Vlasenko wrote:
>> if (!apic_id_registered())
>> BUG(); <=========================
On Thu, Dec 23, 2004 at 11:02:09AM +0000, Denis Vlasenko wrote:
> Tested with noapic nolapic boot params. Still happens.
> Call chain is init() -> APIC_init_uniprocessor() ->
> -> setup_local_APIC(). I am a bit suspicious why
> APIC_init_uniprocessor() does not bail out
> if enable_local_apic<0 (i.e. if I boot with "nolapic"):
> int __init APIC_init_uniprocessor (void)
> {
> if (enable_local_apic < 0)
> clear_bit(X86_FEATURE_APIC, boot_cpu_data.x86_capability);
> <===== missing "return -1"?
>
> if (!smp_found_config && !cpu_has_apic)
> return -1;
> ...
Sounds pretty serious. What happens if you add the missing return -1?
-- wli
* Re: 2.6.x BUGs at boot time (APIC related)
2004-12-23 11:02 ` Denis Vlasenko
2004-12-23 9:12 ` William Lee Irwin III
@ 2004-12-23 9:33 ` Arnaud Patard
1 sibling, 0 replies; 8+ messages in thread
From: Arnaud Patard @ 2004-12-23 9:33 UTC
To: Denis Vlasenko; +Cc: mingo, linux-kernel, wli
Denis Vlasenko <vda@port.imtp.ilyichevsk.odessa.ua> writes:
> On Wednesday 22 December 2004 17:31, Denis Vlasenko wrote:
>> This is happening on a "HP Compaq dc7100 CMT"
>> (I believe it is a model number - taken from the label on the box).
iirc, dc7100 is the name of the minitower box, not of the computer :)
>>
>> Both 2.6.9 and 2.6.10-rc3 are dying this way:
>>
>> [top of visible screen]
>> I/O APIC #1 Version 17 at 0xFEC00000
>> Enabled APIC mode: Flat. Using 1 I/O APICs
>> ...
>> [unrelated stuff (dentry cache size etc...)]
>> ...
>> Enabling fast FPU save & restore ...done
>> Enabling unmasked SIMD FPU exception support ...done
>> Checking 'hlt' instruction ..OK
>> ------------------------
>> kernel BUG at arch/i386/kernel/apic.c:388! [:366! for 2.6.9]
I've taken a look at your .config for 2.6.10 and you don't have ACPI
enabled (see below). Please try with ACPI enabled.
* Re: 2.6.x BUGs at boot time (APIC related)
2004-12-22 17:31 2.6.x BUGs at boot time (APIC related) Denis Vlasenko
@ 2004-12-23 11:02 ` Denis Vlasenko
2004-12-23 9:12 ` William Lee Irwin III
2004-12-23 9:33 ` Arnaud Patard
0 siblings, 2 replies; 8+ messages in thread
From: Denis Vlasenko @ 2004-12-23 11:02 UTC
To: mingo; +Cc: linux-kernel, wli
On Wednesday 22 December 2004 17:31, Denis Vlasenko wrote:
> This is happening on a "HP Compaq dc7100 CMT"
> (I believe it is a model number - taken from the label on the box).
>
> Both 2.6.9 and 2.6.10-rc3 are dying this way:
>
> [top of visible screen]
> I/O APIC #1 Version 17 at 0xFEC00000
> Enabled APIC mode: Flat. Using 1 I/O APICs
> ...
> [unrelated stuff (dentry cache size etc...)]
> ...
> Enabling fast FPU save & restore ...done
> Enabling unmasked SIMD FPU exception support ...done
> Checking 'hlt' instruction ..OK
> ------------------------
> kernel BUG at arch/i386/kernel/apic.c:388! [:366! for 2.6.9]
>
>
> Code:
> ....
> void __init setup_local_APIC (void)
> {
> ...
> /*
> * Double-check whether this APIC is really registered.
> */
> if (!apic_id_registered())
> BUG(); <=========================
Tested with noapic nolapic boot params. Still happens.
Call chain is init() -> APIC_init_uniprocessor() ->
-> setup_local_APIC(). I am a bit suspicious why
APIC_init_uniprocessor() does not bail out
if enable_local_apic<0 (i.e. if I boot with "nolapic"):
int __init APIC_init_uniprocessor (void)
{
        if (enable_local_apic < 0)
                clear_bit(X86_FEATURE_APIC, boot_cpu_data.x86_capability);
                <===== missing "return -1"?
        if (!smp_found_config && !cpu_has_apic)
                return -1;
        ...
        verify_local_APIC();
        connect_bsp_APIC();
        phys_cpu_present_map = physid_mask_of_physid(boot_cpu_physical_apicid);
        setup_local_APIC(); <=== will die there
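
For readers following the control flow, here is a minimal user-space sketch of
the branch described above. It is not kernel code: `struct boot_state`, the
`enum outcome` values and both function names are hypothetical stand-ins, and
the assumption that smp_found_config is set on this machine is only a guess
from the "Using 1 I/O APICs" line in the boot log.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel state discussed in this thread. */
struct boot_state {
        int  enable_local_apic;  /* < 0 models booting with "nolapic" */
        bool smp_found_config;   /* models "an MP configuration was found" */
        bool cpu_has_apic;       /* models the X86_FEATURE_APIC capability bit */
};

enum outcome { BAILED_OUT = -1, REACHED_SETUP_LOCAL_APIC = 0 };

/* Models the flow quoted above: the feature bit is cleared, but nothing returns. */
static enum outcome init_uniprocessor_original(struct boot_state s)
{
        if (s.enable_local_apic < 0)
                s.cpu_has_apic = false;  /* clear_bit(...), no return */

        if (!s.smp_found_config && !s.cpu_has_apic)
                return BAILED_OUT;

        return REACHED_SETUP_LOCAL_APIC; /* the call that hits BUG() */
}

/* Same flow with the "return -1" that the message asks about. */
static enum outcome init_uniprocessor_early_return(struct boot_state s)
{
        if (s.enable_local_apic < 0) {
                s.cpu_has_apic = false;
                return BAILED_OUT;
        }

        if (!s.smp_found_config && !s.cpu_has_apic)
                return BAILED_OUT;

        return REACHED_SETUP_LOCAL_APIC;
}
```

Under that assumption, a state of { -1, true, true } reaches the
setup_local_APIC() step in the first function and bails out in the second.
The first function bails out with "nolapic" only when smp_found_config is
also clear.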
--
vda
* Re: 2.6.x BUGs at boot time (APIC related)
2004-12-23 9:12 ` William Lee Irwin III
@ 2004-12-23 14:57 ` Denis Vlasenko
0 siblings, 0 replies; 8+ messages in thread
From: Denis Vlasenko @ 2004-12-23 14:57 UTC
To: William Lee Irwin III; +Cc: mingo, linux-kernel
[-- Attachment #1: Type: text/plain, Size: 1026 bytes --]
On Thursday 23 December 2004 09:12, William Lee Irwin III wrote:
> On Wednesday 22 December 2004 17:31, Denis Vlasenko wrote:
> >> if (!apic_id_registered())
> >> BUG(); <=========================
>
> On Thu, Dec 23, 2004 at 11:02:09AM +0000, Denis Vlasenko wrote:
> > Tested with noapic nolapic boot params. Still happens.
> > Call chain is init() -> APIC_init_uniprocessor() ->
> > -> setup_local_APIC(). I am a bit suspicious why
> > APIC_init_uniprocessor() does not bail out
> > if enable_local_apic<0 (i.e. if I boot with "nolapic"):
> > int __init APIC_init_uniprocessor (void)
> > {
> > if (enable_local_apic < 0)
> > clear_bit(X86_FEATURE_APIC,
> > boot_cpu_data.x86_capability); <===== missing "return -1"?
> >
> > if (!smp_found_config && !cpu_has_apic)
> > return -1;
> > ...
>
> Sounds pretty serious. What happens if you add the missing return -1?
Just tested that. It booted ok. Patch is in attachment.
--
vda
[-- Attachment #2: apic.c.diff --]
[-- Type: text/x-diff, Size: 423 bytes --]
--- linux-2.6.10-rc3.src/arch/i386/kernel/apic.c.old Mon Dec 20 14:13:59 2004
+++ linux-2.6.10-rc3.src/arch/i386/kernel/apic.c Thu Dec 23 08:54:13 2004
@@ -1250,8 +1250,10 @@
*/
int __init APIC_init_uniprocessor (void)
{
- if (enable_local_apic < 0)
+ if (enable_local_apic < 0) {
clear_bit(X86_FEATURE_APIC, boot_cpu_data.x86_capability);
+ return -1;
+ }
if (!smp_found_config && !cpu_has_apic)
return -1;
* Re: 2.6.x BUGs at boot time (APIC related)
@ 2004-12-23 16:59 Chuck Ebbert
0 siblings, 0 replies; 8+ messages in thread
From: Chuck Ebbert @ 2004-12-23 16:59 UTC
To: Mikael Pettersson
Cc: Ingo Molnar, linux-kernel, William Lee Irwin III, Denis Vlasenko
vda wrote:
> --- linux-2.6.10-rc3.src/arch/i386/kernel/apic.c.old Mon Dec 20 14:13:59 2004
> +++ linux-2.6.10-rc3.src/arch/i386/kernel/apic.c Thu Dec 23 08:54:13 2004
> @@ -1250,8 +1250,10 @@
> */
> int __init APIC_init_uniprocessor (void)
> {
> - if (enable_local_apic < 0)
> + if (enable_local_apic < 0) {
> clear_bit(X86_FEATURE_APIC, boot_cpu_data.x86_capability);
> + return -1;
> + }
>
> if (!smp_found_config && !cpu_has_apic)
> return -1;
Mikael Pettersson wrote:
> The early return just hides the real bug, whatever it is.
How about this:
- if (!smp_found_config && !cpu_has_apic)
+ if (!smp_found_config || !cpu_has_apic)
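
To make the difference between the two conditions concrete, the following
stand-alone sketch enumerates all four input combinations. The helper names
are hypothetical and the booleans only model smp_found_config and
cpu_has_apic; this is an illustration, not a tested kernel change.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Existing check: bail out only when both are missing. */
static bool bails_with_and(bool smp_found_config, bool cpu_has_apic)
{
        return !smp_found_config && !cpu_has_apic;
}

/* Suggested check: bail out when either one is missing. */
static bool bails_with_or(bool smp_found_config, bool cpu_has_apic)
{
        return !smp_found_config || !cpu_has_apic;
}

/* Print one row per combination so the two checks can be compared. */
static void print_truth_table(void)
{
        printf("smp_found_config cpu_has_apic  &&-bails  ||-bails\n");
        for (int smp = 0; smp <= 1; smp++) {
                for (int apic = 0; apic <= 1; apic++) {
                        printf("%16d %12d %9d %9d\n", smp, apic,
                               bails_with_and(smp != 0, apic != 0),
                               bails_with_or(smp != 0, apic != 0));
                }
        }
}
```

The two checks disagree exactly when one of the inputs is set and the other
is not. That includes the case where an MP configuration was found but the
APIC feature bit has been cleared, which is the "nolapic" situation discussed
earlier in the thread.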
--
Please take it as a sign of my infinite respect for you,
that I insist on you doing all the work.
-- Rusty Russell
* Re: 2.6.x BUGs at boot time (APIC related)
2004-12-23 16:11 Mikael Pettersson
@ 2004-12-23 16:22 ` William Lee Irwin III
0 siblings, 0 replies; 8+ messages in thread
From: William Lee Irwin III @ 2004-12-23 16:22 UTC
To: Mikael Pettersson; +Cc: vda, linux-kernel, mingo
At some point in the past, I wrote:
>>> Sounds pretty serious. What happens if you add the missing return -1?
On Thu, 23 Dec 2004 14:57:25 +0000, Denis Vlasenko wrote:
>> Just tested that. It booted ok. Patch is in attachment.
On Thu, Dec 23, 2004 at 05:11:39PM +0100, Mikael Pettersson wrote:
> The early return just hides the real bug, whatever it is.
> I'm suspecting some bogosity with boot_cpu_physical_apicid,
> or possibly smp_found_config. Please remove the early return
> and try the patch below instead.
Dropping the early return means nolapic is not honored in this
codepath. I realize it doesn't have much impact on the bug that
happens while nolapic is not passed. Thanks for fixing that.
Also, it should probably not have to clear X86_FEATURE_APIC from
boot_cpu_data.x86_capability, because lapic_disable() already did
so. Tracking down where that is being set (if it indeed is) when
enable_local_apic < 0 may be useful.
-- wli
* Re: 2.6.x BUGs at boot time (APIC related)
@ 2004-12-23 16:11 Mikael Pettersson
2004-12-23 16:22 ` William Lee Irwin III
0 siblings, 1 reply; 8+ messages in thread
From: Mikael Pettersson @ 2004-12-23 16:11 UTC
To: vda, wli; +Cc: linux-kernel, mingo
On Thu, 23 Dec 2004 14:57:25 +0000, Denis Vlasenko wrote:
>> On Thu, Dec 23, 2004 at 11:02:09AM +0000, Denis Vlasenko wrote:
>> > Tested with noapic nolapic boot params. Still happens.
>> > Call chain is init() -> APIC_init_uniprocessor() ->
>> > -> setup_local_APIC(). I am a bit suspicious why
>> > APIC_init_uniprocessor() does not bail out
>> > if enable_local_apic<0 (i.e. if I boot with "nolapic"):
>> > int __init APIC_init_uniprocessor (void)
>> > {
>> > if (enable_local_apic < 0)
>> > clear_bit(X86_FEATURE_APIC,
>> > boot_cpu_data.x86_capability); <===== missing "return -1"?
>> >
>> > if (!smp_found_config && !cpu_has_apic)
>> > return -1;
>> > ...
>>
>> Sounds pretty serious. What happens if you add the missing return -1?
>
>Just tested that. It booted ok. Patch is in attachment.
The early return just hides the real bug, whatever it is.
I'm suspecting some bogosity with boot_cpu_physical_apicid,
or possibly smp_found_config. Please remove the early return
and try the patch below instead.
/Mikael
--- linux-2.6.10-rc3/arch/i386/kernel/apic.c.~1~ 2004-12-23 15:44:26.000000000 +0100
+++ linux-2.6.10-rc3/arch/i386/kernel/apic.c 2004-12-23 16:38:27.000000000 +0100
@@ -363,7 +363,7 @@ void __init init_bsp_APIC(void)
apic_write_around(APIC_LVT1, value);
}
-void __init setup_local_APIC (void)
+int __init setup_local_APIC (void)
{
unsigned long oldvalue, value, ver, maxlvt;
@@ -384,8 +384,10 @@ void __init setup_local_APIC (void)
/*
* Double-check whether this APIC is really registered.
*/
+ printk("%s: boot_cpu_physical_apicid == %#x\n", __FUNCTION__, boot_cpu_physical_apicid);
+ printk("%s: apic_read(APIC_ID) == %#lx\n", __FUNCTION__, apic_read(APIC_ID));
if (!apic_id_registered())
- BUG();
+ return -1;
/*
* Intel recommends to set DFR, LDR and TPR before enabling
@@ -511,6 +513,7 @@ void __init setup_local_APIC (void)
if (nmi_watchdog == NMI_LOCAL_APIC)
setup_apic_nmi_watchdog();
apic_pm_activate();
+ return 0;
}
/*
@@ -809,6 +812,8 @@ void __init init_apic_mappings(void)
*/
if (boot_cpu_physical_apicid == -1U)
boot_cpu_physical_apicid = GET_APIC_ID(apic_read(APIC_ID));
+ printk("%s: boot_cpu_physical_apicid == %#x\n", __FUNCTION__, boot_cpu_physical_apicid);
+ printk("%s: apic_read(APIC_ID) == %#lx\n", __FUNCTION__, apic_read(APIC_ID));
#ifdef CONFIG_X86_IO_APIC
{
@@ -1271,7 +1276,8 @@ int __init APIC_init_uniprocessor (void)
phys_cpu_present_map = physid_mask_of_physid(boot_cpu_physical_apicid);
- setup_local_APIC();
+ if (setup_local_APIC() < 0)
+ return -1;
if (nmi_watchdog == NMI_LOCAL_APIC)
check_nmi_watchdog();
--- linux-2.6.10-rc3/arch/i386/kernel/mpparse.c.~1~ 2004-12-23 15:44:26.000000000 +0100
+++ linux-2.6.10-rc3/arch/i386/kernel/mpparse.c 2004-12-23 16:32:16.000000000 +0100
@@ -180,6 +180,7 @@ void __init MP_processor_info (struct mp
if (m->mpc_cpuflag & CPU_BOOTPROCESSOR) {
Dprintk(" Bootup CPU\n");
boot_cpu_physical_apicid = m->mpc_apicid;
+ printk("%s: boot_cpu_physical_apicid == %#x\n", __FUNCTION__, boot_cpu_physical_apicid);
boot_cpu_logical_apicid = apicid;
}
@@ -823,6 +824,7 @@ void __init mp_register_lapic_address (
if (boot_cpu_physical_apicid == -1U)
boot_cpu_physical_apicid = GET_APIC_ID(apic_read(APIC_ID));
+ printk("%s: boot_cpu_physical_apicid == %#x\n", __FUNCTION__, boot_cpu_physical_apicid);
Dprintk("Boot CPU = %d\n", boot_cpu_physical_apicid);
}
--- linux-2.6.10-rc3/arch/i386/kernel/smpboot.c.~1~ 2004-10-19 13:01:17.000000000 +0200
+++ linux-2.6.10-rc3/arch/i386/kernel/smpboot.c 2004-12-23 16:33:33.000000000 +0100
@@ -913,6 +913,7 @@ static void __init smp_boot_cpus(unsigne
print_cpu_info(&cpu_data[0]);
boot_cpu_physical_apicid = GET_APIC_ID(apic_read(APIC_ID));
+ printk("%s: boot_cpu_physical_apicid == %#x\n", __FUNCTION__, boot_cpu_physical_apicid);
boot_cpu_logical_apicid = logical_smp_processor_id();
x86_cpu_to_apicid[0] = boot_cpu_physical_apicid;
--- linux-2.6.10-rc3/include/asm-i386/apic.h.~1~ 2004-12-23 15:44:30.000000000 +0100
+++ linux-2.6.10-rc3/include/asm-i386/apic.h 2004-12-23 16:37:53.000000000 +0100
@@ -94,7 +94,7 @@ extern int verify_local_APIC (void);
extern void cache_APIC_registers (void);
extern void sync_Arb_IDs (void);
extern void init_bsp_APIC (void);
-extern void setup_local_APIC (void);
+extern int setup_local_APIC (void);
extern void init_apic_mappings (void);
extern void smp_local_timer_interrupt (struct pt_regs * regs);
extern void setup_boot_APIC_clock (void);
Thread overview: 8+ messages
2004-12-22 17:31 2.6.x BUGs at boot time (APIC related) Denis Vlasenko
2004-12-23 11:02 ` Denis Vlasenko
2004-12-23 9:12 ` William Lee Irwin III
2004-12-23 14:57 ` Denis Vlasenko
2004-12-23 9:33 ` Arnaud Patard
2004-12-23 16:11 Mikael Pettersson
2004-12-23 16:22 ` William Lee Irwin III
2004-12-23 16:59 Chuck Ebbert