* Re: PROBLEM: APIC on a Pentium Classic SMP, 2.4.21-pre2 and 2.4.21-pre3 ksymoops
@ 2003-09-09 20:31 Mikael Pettersson
2003-09-10 10:26 ` Maciej W. Rozycki
0 siblings, 1 reply; 5+ messages in thread
From: Mikael Pettersson @ 2003-09-09 20:31 UTC
To: mathieu.desnoyers; +Cc: linux-kernel, mingo
On Mon, 08 Sep 2003 19:22:17 -0400, Mathieu Desnoyers wrote:
>> >On kernel 2.4.21-pre2, there is a kernel oops before this, with a
>> >"Dereferencing NULL pointer".
>>
>> You didn't run that through ksymoops and post it, so how is anyone
>> supposed to be able to debug it?
>
>As only the 2.4.21-pre2 and 2.4.21-pre3 kernels show this problem, I thought
>it had been corrected in 2.4.21-pre4. But since it may be useful in
>finding the problem, here is the ksymoops output for the 2.4.21-pre2 and
>2.4.21-pre3 kernels; the two are nearly identical.
...
>Code; c0115da7 <IO_APIC_get_PCI_irq_vector+17/130>
>00000000 <_EIP>:
>Code; c0115da7 <IO_APIC_get_PCI_irq_vector+17/130> <=====
> 0: 83 3c 90 ff cmpl $0xffffffff,(%eax,%edx,4) <=====
Ok, that one is line 295 in io_apic.c. It bombs in 2.4.21-pre{2,3}
because mp_bus_id_to_pci_bus was changed from a static array to
a dynamically allocated array. On your machine, smp_read_mpc() in
mpparse.c doesn't get to the point where it allocates that array,
so the array is NULL in io_apic.c and you get an oops.
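To see why the oops reports address 00000000: the faulting instruction uses a (base,index,scale) operand, so the address it touches is base + index*scale. Below is a minimal sketch of that arithmetic, using eax and edx as they appear in the register dump of the report (both zero). The function name is made up for illustration, and reading eax as the table pointer and edx as the bus number is an interpretation, not something the dump states.

```c
#include <assert.h>
#include <stdint.h>

/* Effective address of an x86 (base,index,scale) memory operand:
 * base + index * scale, wrapping at 32 bits as on i386.
 * Hypothetical helper, for illustration only. */
uint32_t effective_address(uint32_t base, uint32_t index, uint32_t scale)
{
	/* x86 only encodes these four scale factors */
	assert(scale == 1 || scale == 2 || scale == 4 || scale == 8);
	return base + index * scale;
}
```

With the values from the dump, effective_address(0, 0, 4) evaluates to 0, which matches the "virtual address 00000000" in the oops.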
Fixing the oops is easy (see below), but the real problem is
that 2.4.21-pre2 apparently broke MP table parsing on your HW.
I suggest you sprinkle tracing printk()s in setup/smpboot/mpparse
and compare 2.4.20 (good) and later (bad) to see where things
start to diverge.
/Mikael
--- linux-2.4.21-pre2/arch/i386/kernel/io_apic.c.~1~ 2003-09-09 21:27:39.000000000 +0200
+++ linux-2.4.21-pre2/arch/i386/kernel/io_apic.c 2003-09-09 22:17:02.464082064 +0200
@@ -292,7 +292,7 @@
 	Dprintk("querying PCI -> IRQ mapping bus:%d, slot:%d, pin:%d.\n",
 		bus, slot, pin);
-	if (mp_bus_id_to_pci_bus[bus] == -1) {
+	if ((mp_bus_id_to_pci_bus==NULL) || mp_bus_id_to_pci_bus[bus] == -1) {
 		printk(KERN_WARNING "PCI BIOS passed nonexistent PCI bus %d!\n", bus);
 		return -1;
 	}
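The guard in the patch above can be tried outside the kernel. What follows is a userspace sketch, not kernel code: bus_to_pci_bus, bus_to_pci_bus_len and lookup_pci_bus() are hypothetical stand-ins, and the length check is an extra assumption that the patch itself does not make.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel table: NULL until setup code
 * allocates it, with -1 meaning "this MP bus is not a PCI bus". */
int *bus_to_pci_bus = NULL;
size_t bus_to_pci_bus_len = 0;	/* not in the patch; added for the sketch */

/* Returns the PCI bus number for an MP bus id, or -1 if the table was
 * never allocated, the id is out of range, or the entry is unused.
 * The NULL test comes first so that || short-circuits before the
 * array is ever indexed. */
int lookup_pci_bus(int bus)
{
	/* an unallocated table must have length zero */
	assert(bus_to_pci_bus != NULL || bus_to_pci_bus_len == 0);

	if (bus_to_pci_bus == NULL ||
	    bus < 0 || (size_t)bus >= bus_to_pci_bus_len ||
	    bus_to_pci_bus[bus] == -1)
		return -1;
	return bus_to_pci_bus[bus];
}
```

As in the patch, this only turns the oops into a clean failure; it does not make the table appear, so the IRQ lookup still fails on the affected machine.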
* Re: PROBLEM: APIC on a Pentium Classic SMP, 2.4.21-pre2 and 2.4.21-pre3 ksymoops
2003-09-09 20:31 PROBLEM: APIC on a Pentium Classic SMP, 2.4.21-pre2 and 2.4.21-pre3 ksymoops Mikael Pettersson
@ 2003-09-10 10:26 ` Maciej W. Rozycki
2003-09-10 16:18 ` Mikael Pettersson
0 siblings, 1 reply; 5+ messages in thread
From: Maciej W. Rozycki @ 2003-09-10 10:26 UTC
To: Mikael Pettersson; +Cc: mathieu.desnoyers, linux-kernel, mingo
On Tue, 9 Sep 2003, Mikael Pettersson wrote:
> Ok, that one is line 295 in io_apic.c. It bombs in 2.4.21-pre{2,3}
> because mp_bus_id_to_pci_bus was changed from a static array to
> a dynamically allocated array. On your machine, smp_read_mpc() in
> mpparse.c doesn't get to the point where it allocates that array,
> so the array is NULL in io_apic.c and you get an oops.
As I have already written, the system uses a default MP configuration.
smp_read_mpc() isn't called at all. construct_default_ISA_mptable() is
used instead.
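The two paths can be modelled in a few lines of userspace C. This is a sketch under assumptions, not the kernel's code: all names are invented, the table size is arbitrary, and the rule "a non-zero default configuration type selects the default path" is taken from the MP specification's feature byte rather than from anything quoted in this thread.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_IRQ_SOURCES 16	/* illustrative size, not the kernel's value */

int *irq_table = NULL;		/* stands in for the now-dynamic MP tables */

/* Models the table-parsing path: it sizes and allocates the table. */
void read_mp_table(void)
{
	irq_table = calloc(MAX_IRQ_SOURCES, sizeof(*irq_table));
}

/* Models the default-configuration path: it fills in entries but
 * never allocates the table itself. */
int construct_default_table(void)
{
	/* Guard added for the sketch only, so the model fails cleanly
	 * instead of writing through an unset pointer. */
	if (irq_table == NULL)
		return -1;
	irq_table[0] = 1;
	return 0;
}

/* default_type != 0 selects a default configuration; 0 means a full
 * MP table is present and gets parsed. */
int get_config(int default_type)
{
	assert(default_type >= 0);
	if (default_type != 0)
		return construct_default_table();
	read_mp_table();
	return irq_table ? 0 : -1;
}
```

In this model get_config(6) fails because nothing allocated irq_table, while get_config(0) succeeds, which mirrors the split between machines with a default configuration and machines with a full MP table.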
> Fixing the oops is easy (see below), but the real problem is
> that 2.4.21-pre2 apparently broke MP table parsing on your HW.
> I suggest you sprinkle tracing printk()s in setup/smpboot/mpparse
> and compare 2.4.20 (good) and later (bad) to see where things
> start to diverge.
There is no need to -- the problem is already known. Mikael, if you need
additional details on how default MP configurations work in our code, feel
free to ask. Unfortunately, I likely won't be able to do any coding
and/or testing in this area before October.
--
+ Maciej W. Rozycki, Technical University of Gdansk, Poland +
+--------------------------------------------------------------+
+ e-mail: macro@ds2.pg.gda.pl, PGP key available +
* Re: PROBLEM: APIC on a Pentium Classic SMP, 2.4.21-pre2 and 2.4.21-pre3 ksymoops
2003-09-10 10:26 ` Maciej W. Rozycki
@ 2003-09-10 16:18 ` Mikael Pettersson
2003-09-10 16:58 ` Maciej W. Rozycki
0 siblings, 1 reply; 5+ messages in thread
From: Mikael Pettersson @ 2003-09-10 16:18 UTC
To: Maciej W. Rozycki; +Cc: mathieu.desnoyers, linux-kernel, mingo
Maciej W. Rozycki writes:
> > Fixing the oops is easy (see below), but the real problem is
> > that 2.4.21-pre2 apparently broke MP table parsing on your HW.
> > I suggest you sprinkle tracing printk()s in setup/smpboot/mpparse
> > and compare 2.4.20 (good) and later (bad) to see where things
> > start to diverge.
>
> There is no need to -- the problem is already known. Mikael, if you need
> additional details on how default MP configurations work in our code, feel
> free to ask.
I think I nailed it.
First I found one very strange thing in Mathieu's boot log:
--- mpbug-2.4.20 Wed Sep 10 17:19:05 2003
+++ mpbug-2.4.23-pre3 Wed Sep 10 17:18:44 2003
...
+DMI not present.
Intel MultiProcessor Specification v1.1
Virtual Wire compatibility mode.
Default MP configuration #6
This means construct_default_ISA_mptable() still gets called.
Ok so far.
...
ENABLING IO-APIC IRQs
Setting 2 in the phys_id_present_map
...changing IO-APIC physical APIC ID to 2 ... ok.
smp_found_config is true, we're now in setup_IO_APIC()
and have completed setup_ioapic_ids_from_mpc(). Ok so far.
-init IO_APIC IRQs
-IO-APIC (apicid-pin) 2-0 not connected.
THIS IS BAD. setup_IO_APIC() calls setup_IO_APIC_IRQs(),
which starts by printk()ing the first line above.
This line is missing from the 2.4.23-pre3 dmesg log, which
seems like an impossibility.
At this point I was thinking "memory corruption",
and the following struck me:
What used to be arrays (mp_irqs[] etc) are now pointers to
memory which is sized and allocated by smp_read_mpc().
In the case when construct_default_ISA_mptable() is called,
smp_read_mpc() is _not_ called, the pointers never get initialised,
and reads and writes of these arrays end up in la-la land.
The fix would be to add allocation and initialisation of
these pointers at the start of construct_default_ISA_mptable().
I'll prepare a patch doing this sometime tomorrow.
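As a rough illustration of that idea (and explicitly not the patch mentioned above), here is a userspace sketch in which the default-configuration path makes sure the tables exist before touching them. Every name and size is hypothetical; initialising the bus table to -1 follows the "-1 means not a PCI bus" convention visible in the io_apic.c check.

```c
#include <assert.h>
#include <stdlib.h>

#define DEFAULT_MAX_BUSSES 4	/* illustrative sizes only */
#define DEFAULT_MAX_IRQS 16

int *bus_to_pci_bus = NULL;
int *irq_entries = NULL;

/* Allocate and initialise any table that does not exist yet.
 * Safe to call from either setup path; returns 0 on success. */
int ensure_tables(size_t nbus, size_t nirq)
{
	size_t i;

	assert(nbus > 0 && nirq > 0);
	if (bus_to_pci_bus == NULL) {
		bus_to_pci_bus = malloc(nbus * sizeof(*bus_to_pci_bus));
		if (bus_to_pci_bus == NULL)
			return -1;
		for (i = 0; i < nbus; i++)
			bus_to_pci_bus[i] = -1;	/* no PCI bus known yet */
	}
	if (irq_entries == NULL) {
		irq_entries = calloc(nirq, sizeof(*irq_entries));
		if (irq_entries == NULL)
			return -1;
	}
	return 0;
}

/* Default-configuration path: allocate first, then fill in entries. */
int construct_default_table(int type)
{
	if (ensure_tables(DEFAULT_MAX_BUSSES, DEFAULT_MAX_IRQS) != 0)
		return -1;
	irq_entries[0] = type;	/* placeholder for real entry construction */
	return 0;
}
```

Making the allocation idempotent means the table-parsing path could call the same helper with sizes derived from the table, so neither path depends on the other having run first.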
/Mikael
* Re: PROBLEM: APIC on a Pentium Classic SMP, 2.4.21-pre2 and 2.4.21-pre3 ksymoops
2003-09-10 16:18 ` Mikael Pettersson
@ 2003-09-10 16:58 ` Maciej W. Rozycki
0 siblings, 0 replies; 5+ messages in thread
From: Maciej W. Rozycki @ 2003-09-10 16:58 UTC
To: Mikael Pettersson; +Cc: mathieu.desnoyers, linux-kernel, mingo
On Wed, 10 Sep 2003, Mikael Pettersson wrote:
> First I found one very strange thing in Mathieu's boot log:
>
> --- mpbug-2.4.20 Wed Sep 10 17:19:05 2003
> +++ mpbug-2.4.23-pre3 Wed Sep 10 17:18:44 2003
> ...
> +DMI not present.
> Intel MultiProcessor Specification v1.1
> Virtual Wire compatibility mode.
> Default MP configuration #6
>
> This means construct_default_ISA_mptable() still gets called.
> Ok so far.
Yep -- I've been aware of this.
> At this point I was thinking "memory corruption",
> and the following struck me:
>
> What used to be arrays (mp_irqs[] etc) are now pointers to
> memory which is sized and allocated by smp_read_mpc().
> In the case when construct_default_ISA_mptable() is called,
> smp_read_mpc() is _not_ called, the pointers never get initialised,
> and reads and writes of these arrays end up in la-la land.
Exactly.
> The fix would be to add allocation and initialisation of
> these pointers at the start of construct_default_ISA_mptable().
Possibly -- I haven't thought about how to fix it yet.
> I'll prepare a patch doing this sometime tomorrow.
Thanks a lot for taking care of this.
--
+ Maciej W. Rozycki, Technical University of Gdansk, Poland +
+--------------------------------------------------------------+
+ e-mail: macro@ds2.pg.gda.pl, PGP key available +
* Re: PROBLEM: APIC on a Pentium Classic SMP, 2.4.21-pre2 and 2.4.21-pre3 ksymoops
2003-09-08 9:33 PROBLEM: APIC on a Pentium Classic SMP, kernel 2.4.21-pre5 to 2.4.23-pre3 Mikael Pettersson
@ 2003-09-08 23:22 ` Mathieu Desnoyers
0 siblings, 0 replies; 5+ messages in thread
From: Mathieu Desnoyers @ 2003-09-08 23:22 UTC
To: Mikael Pettersson; +Cc: linux-kernel, mathieu.desnoyers, mingo
> >On kernel 2.4.21-pre2, there is a kernel oops before this, with a
> >"Dereferencing NULL pointer".
>
> You didn't run that through ksymoops and post it, so how is anyone
> supposed to be able to debug it?
As only the 2.4.21-pre2 and 2.4.21-pre3 kernels show this problem, I thought
it had been corrected in 2.4.21-pre4. But since it may be useful in
finding the problem, here is the ksymoops output for the 2.4.21-pre2 and
2.4.21-pre3 kernels; the two are nearly identical.
-------------------------------------------------------------------------------
2.4.21-pre2 ksymoops
Unable to handle kernel NULL pointer dereference at virtual address 00000000
c0115da7
*pde = 00000000
Oops: 0000
CPU: 0
EIP: 0010:[<c0115da7>] Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010246
eax: 00000000 ebx: c1163400 ecx: 00000000 edx: 00000000
esi: 00000010 edi: c116bfbb ebp: 0008e000 esp: c116bf90
ds: 0018 es: 0018 ss: 0018
Process swapper (pid: 1, stackpage=c116b000)
Stack: 00000010 ffffffff c1163400 00000010 c116bfbb 0008e000 c02fe4c9 00000000
00000001 00000000 0008e000 c116a000 c02f7fc8 c0105000 0008e000 c02fdef9
c030c7c6 c116a000 c02f87fb c0105078 00010f00 c02f7fc8 c0105000 0008e000
Call Trace: [<c0105000>] [<c0105078>] [<c0105000>] [<c0107406>] [<c0105050>]
Code: 83 3c 90 ff 0f 84 f1 00 00 00 a1 c0 80 34 c0 31 ff 89 04 24
>>EIP; c0115da7 <IO_APIC_get_PCI_irq_vector+17/130> <=====
Trace; c0105000 <_stext+0/0>
Trace; c0105078 <init+28/180>
Trace; c0105000 <_stext+0/0>
Trace; c0107406 <kernel_thread+26/30>
Trace; c0105050 <init+0/180>
Code; c0115da7 <IO_APIC_get_PCI_irq_vector+17/130>
00000000 <_EIP>:
Code; c0115da7 <IO_APIC_get_PCI_irq_vector+17/130> <=====
0: 83 3c 90 ff cmpl $0xffffffff,(%eax,%edx,4) <=====
Code; c0115dab <IO_APIC_get_PCI_irq_vector+1b/130>
4: 0f 84 f1 00 00 00 je fb <_EIP+0xfb>
Code; c0115db1 <IO_APIC_get_PCI_irq_vector+21/130>
a: a1 c0 80 34 c0 mov 0xc03480c0,%eax
Code; c0115db6 <IO_APIC_get_PCI_irq_vector+26/130>
f: 31 ff xor %edi,%edi
Code; c0115db8 <IO_APIC_get_PCI_irq_vector+28/130>
11: 89 04 24 mov %eax,(%esp,1)
<0>Kernel panic: Attempted to kill init!
-------------------------------------------------------------------------------
2.4.21-pre3 ksymoops
Unable to handle kernel NULL pointer dereference at virtual address 000000
c0115da7
*pde = 00000000
Oops: 0000
CPU: 0
EIP: 0010:[<c0115da7>] Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010246
eax: 00000000 ebx: c1163400 ecx: 00000000 edx: 00000000
esi: 00000010 edi: c116bfbb ebp: 0008e000 esp: c116bf90
ds: 0018 es: 0018 ss: 0018
Process swapper (pid: 1, stackpage=c116b000)
Stack: 00000010 ffffffff c1163400 00000010 c116bfbb 0008e000 c02fe549 000
00000001 00000000 0008e000 c116a000 c02f7fc8 c0105000 0008e000 c02
c030c846 c116a000 c02f87fb c0105078 00010f00 c02f7fc8 c0105000 000
Call Trace: [<c0105000>] [<c0105078>] [<c0105000>] [<c0107406>] [<c010]
Code: 83 3c 90 ff 0f 84 f1 00 00 00 a1 c0 80 34 c0 31 ff 89 04 24
>>EIP; c0115da7 <IO_APIC_get_PCI_irq_vector+17/130> <=====
Trace; c0105000 <_stext+0/0>
Trace; c0105078 <init+28/180>
Trace; c0105000 <_stext+0/0>
Trace; c0107406 <kernel_thread+26/30>
Code; c0115da7 <IO_APIC_get_PCI_irq_vector+17/130>
00000000 <_EIP>:
Code; c0115da7 <IO_APIC_get_PCI_irq_vector+17/130> <=====
0: 83 3c 90 ff cmpl $0xffffffff,(%eax,%edx,4) <=====
Code; c0115dab <IO_APIC_get_PCI_irq_vector+1b/130>
4: 0f 84 f1 00 00 00 je fb <_EIP+0xfb>
Code; c0115db1 <IO_APIC_get_PCI_irq_vector+21/130>
a: a1 c0 80 34 c0 mov 0xc03480c0,%eax
Code; c0115db6 <IO_APIC_get_PCI_irq_vector+26/130>
f: 31 ff xor %edi,%edi
Code; c0115db8 <IO_APIC_get_PCI_irq_vector+28/130>
11: 89 04 24 mov %eax,(%esp,1)
<0>Kernel panic: Attempted to kill init!
-------------------------------------------------------------------------------
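Both dumps fault on the same four bytes, 83 3c 90 ff. For readers who want to check the disassembly by hand, here is a small sketch that splits the ModRM and SIB bytes into their fields. The struct and function names are invented for the example; the field layout is the standard IA-32 encoding.

```c
#include <assert.h>
#include <stdint.h>

struct mem_operand {
	unsigned mod, reg, rm;		/* ModRM fields */
	unsigned scale, index, base;	/* SIB fields; scale is the multiplier */
};

/* Split a ModRM byte and the SIB byte that follows it. A SIB byte is
 * only present when rm is 4 (binary 100) and mod is not 3, so the
 * function insists on that. */
struct mem_operand decode_modrm_sib(uint8_t modrm, uint8_t sib)
{
	struct mem_operand op;

	op.mod = modrm >> 6;
	op.reg = (modrm >> 3) & 7;
	op.rm = modrm & 7;
	assert(op.rm == 4 && op.mod != 3);

	op.scale = 1u << (sib >> 6);	/* 2-bit field encodes 1, 2, 4 or 8 */
	op.index = (sib >> 3) & 7;
	op.base = sib & 7;
	return op;
}
```

For decode_modrm_sib(0x3c, 0x90) the fields come out as mod 0, reg 7, scale 4, index 2 and base 0. With opcode 0x83, reg 7 selects CMP with a sign-extended 8-bit immediate (0xff, i.e. -1), register 2 is edx and register 0 is eax, which agrees with the "cmpl $0xffffffff,(%eax,%edx,4)" line that ksymoops printed.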
OpenPGP public key: http://krystal.dyndns.org:8080/key/compudj.gpg
Key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68