* [parisc-linux] 64-bit kernel crashes on my c3600
@ 2004-10-19 17:54 Matthew Wilcox
2004-10-20 15:24 ` Carlos O'Donell
2004-10-31 6:29 ` Randolph Chung
0 siblings, 2 replies; 3+ messages in thread
From: Matthew Wilcox @ 2004-10-19 17:54 UTC
To: parisc-linux
One of the problems with this crash is that enabling EARLY_CONSOLE
doesn't help. The exact same configuration boots fine in 32-bit mode.
I'm building from the same tree (with O=) so there's no question of patch
skew. Turning on DISCONTIGMEM does not help. The HPMC points inside
the code generated by the save_general macro, just past skip_save_ior,
inside the intr_save function in entry.S.
I'm not even sure how to start debugging. My initial thought is that r29
seems awfully high to be a good memory address.
Here's the HPMC if it's useful. BTW, the "system responder address" is
MEM_CONTROL_0 inside the memory controller block of Astro's config space.
Service Menu: Enter command > pim hpmc
PROCESSOR PIM INFORMATION
----------------- Processor 0 HPMC Information ------------------
Timestamp =
Tue Oct 19 15:58:28 GMT 2004 (20:04:10:19:15:58:28)
HPMC Chassis Codes = 2cbf0 2500b 2cbf4 2cbfc
General Registers 0 - 31
00-03 0000000000000000 0000000000000080 000000000010012c fffffff0f0000018
04-07 00000000004cd000 00000000004cf220 00000000fffffff0 00000000f0002f68
08-11 0000000000000006 00000001ffffff80 000000000804000e 000000001062c564
12-15 0000000000000000 00000000ffffffff 0000000000000000 00000000f0400004
16-19 0000000000000000 00000000f000017c 00000000f0000174 0000000000000000
20-23 0000000000000000 00000000fee003f8 00000000fee003fd 0000000000000000
24-27 0000000000000000 0000000000000000 0000000000000006 0000000010612ac0
28-31 0000000000000000 000000020ffffc40 000000020fffff80 0300000000802204
Control Registers 0 - 31
00-03 0000000000000000 0000000000000000 0000000000000000 0000000000000000
04-07 0000000000000000 0000000000000000 0000000000000000 0000000000000000
08-11 0000000000000000 0000000000000000 0000000000000000 000000000000001f
12-15 0000000000000000 0000000000000000 0000000000106000 0000000000000000
16-19 0000000a99eb6986 0000000000000000 0000000010107678 0000000043ffff80
20-23 0000000000000000 0000000000000000 000000ff08007f00 8000000000000000
24-27 00000000004cd000 00000000004cd000 000000007fffffff 000000007fdfffff
28-31 000000007fffffff 000000007fffffff 00000000105c8000 00000000105cc000
Space Registers 0 - 7
00-03 00000000 00000000 00000000 00000000
04-07 00000000 00000000 00000000 00000000
IIA Space = 0x0000000000000000
IIA Offset = 0x000000001010767c
Check Type = 0x20000000
CPU State = 0x9e000004
Cache Check = 0x00000000
TLB Check = 0x00000000
Bus Check = 0x003010bb
Assists Check = 0x00000000
Assist State = 0x00000000
Path Info = 0x00031800
System Responder Address = 0xfffffffffed10200
System Requestor Address = 0xfffffffffffa0000
Floating-Point Registers 0 - 31
00-03 0000001f00000000 0000000000000000 0000000000000000 0000000000000000
04-07 00001e84000f41fa 0000007810179ac8 00000000000e4de0 104270101052b810
08-11 12ae1e4000000002 eff1700000000002 0000000030433480 000f41fa10425000
12-15 1052bcb400000002 eff1700000000002 0000000000000001 12b1414000000000
16-19 f00008c41052b810 104270103b9aca00 104251601052bc80 30433480000f41fa
20-23 104250001052bcb4 1052bc801016533c 08a00000052d8e00 00000000431bde83
24-27 20e6da0000000000 0000008000000000 eff6a9d400000000 12ad5c40effc18dc
28-31 eff8cbc0ffffffff ffffffff10176990 ffffffff7fffffff fffffb7dffffffff
'9000/785 B,C,J Workstation Unarchitected (per-CPU)', rev 1, 140 bytes:
Check Summary = 0xcb81041000000000
Available Memory = 0x0000000200000000
CPU Diagnose Register 2 = 0x0300000000802204
CPU Status Register 0 = 0x2420c20000000000
CPU Status Register 1 = 0x8080000000000000
SADD LOG = 0x0000000000000000
Read Short LOG = 0xc13ff0f0f000a1b8
ERROR_STATUS = 0x0000000000000010
MEM_ADDR = 0x000001ff3fffffff
MEM_SYND = 0x0000000000000000
MEM_ADDR_CORR = 0x000001ff3fffffff
MEM_SYND_CORR = 0x0000000000000000
RUN_DATA_HIGH = 0xc1bff0fffed08040
RUN_DATA_LOW = 0xc1bff0fffed08040
RUN_CTRL = 0x0000021c00001418
RUN_ADDR = 0xc1bff0fffed08040
System Responder Path = 0x00ffffffffffffff
HPMC PIM Analysis Information:
Timestamp =
Tue Oct 19 15:58:28 GMT 2004 (20:04:10:19:15:58:28)
'9000/785 B,C,J Workstation HPMC PIM Analysis (per-CPU)', rev 0, 1304 bytes:
A Data Miss Timeout occurred while CPU 0 was
requesting information.
Memory/IO Controller Error Analysis Information:
The Memory/IO Controller only observed the Broadcast Error. It did not log
any additional information about the HPMC.
Memory Error Log Information:
Timestamp =
Tue Oct 19 15:58:28 GMT 2004 (20:04:10:19:15:58:28)
'9000/785 B,C,J Workstation Memory Error Log', rev 0, 64 bytes:
No memory errors logged
I/O Module Error Log Information:
Timestamp =
Tue Oct 19 15:58:28 GMT 2004 (20:04:10:19:15:58:28)
'9000/785 B,C,J Workstation IO Error Log', rev 0, 228 bytes:
Rope  Word1       Word2       Word3
----  ----------  ----------  ------------------
0     0x00000000  0x0e0cc009  0x00000000fed30048
1     0x00000000  0x1e0cc009  0x00000000fed32048
2     ----------  0x2e0cc009  ------------------
3     ----------  0x3e0cc009  ------------------
4     0x00000000  0x4e0cc009  0x00000000fed38048
5     ----------  0x5e0cc009  ------------------
6     0x00000000  0x6e0cc009  0x00000000fed3c048
7     ----------  0x7e0cc009  ------------------
--
"Next the statesmen will invent cheap lies, putting the blame upon
the nation that is attacked, and every man will be glad of those
conscience-soothing falsities, and will diligently study them, and refuse
to examine any refutations of them; and thus he will by and by convince
himself that the war is just, and will thank God for the better sleep
he enjoys after this process of grotesque self-deception." -- Mark Twain
_______________________________________________
parisc-linux mailing list
parisc-linux@lists.parisc-linux.org
http://lists.parisc-linux.org/mailman/listinfo/parisc-linux
* Re: [parisc-linux] 64-bit kernel crashes on my c3600
From: Carlos O'Donell @ 2004-10-20 15:24 UTC
To: Matthew Wilcox; +Cc: parisc-linux
On Tue, Oct 19, 2004 at 06:54:40PM +0100, Matthew Wilcox wrote:
> One of the problems with this crash is that enabling EARLY_CONSOLE
> doesn't help. The exact same configuration boots fine in 32-bit mode.
> I'm building from the same tree (with O=) so there's no question of patch
> skew. Turning on DISCONTIGMEM does not help. The HPMC points inside
> the code generated by the save_general macro just past skip_save_ior
> inside the intr_save function in entry.S
This is just before calling handle_interruption, so it looks like you
took an interrupt before something was set up properly?
These sorts of problems are very messy to debug if they are
non-deterministic. Just stick an infinite loop in a portion of code you
expect to run before the HPMC, boot, TOC the machine, check where it
stopped, and move the loop. That was my normal procedure when I had to
debug similar stuff to prove some lws code.
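If the crash is deterministic, moving the loop can be done as a binary
search rather than one step at a time. The sketch below is only an
illustration of that idea, not the procedure described above verbatim:
the function name and the reaches_loop_at callback are hypothetical, and
each callback call stands for one manual build/boot/TOC cycle.

```python
def last_point_reached(num_points, reaches_loop_at):
    """Binary-search the last candidate point that executes before the crash.

    Candidate points are numbered 0..num_points-1 in execution order.
    reaches_loop_at(i) is a hypothetical callback: it returns True if a
    kernel with an infinite loop placed at point i spins there (as seen
    after a TOC), and False if the machine HPMCs before getting there.
    Returns -1 if even point 0 is never reached.
    Assumes the crash is deterministic.
    """
    lo, hi = -1, num_points  # point lo is reached, point hi is not
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if reaches_loop_at(mid):
            lo = mid
        else:
            hi = mid
    return lo
```

With N candidate points this needs roughly log2(N) boots instead of up
to N.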
> I'm not even sure how to start debugging. My initial thought is that r29
> seems awfully high to be a good memory address.
Why do you think that?
I'm interested in r2 which is a userspace address. Did this box make it
to userspace?
c.
* Re: [parisc-linux] 64-bit kernel crashes on my c3600
From: Randolph Chung @ 2004-10-31 6:29 UTC
To: Matthew Wilcox; +Cc: parisc-linux
> One of the problems with this crash is that enabling EARLY_CONSOLE
> doesn't help. The exact same configuration boots fine in 32-bit mode.
> I'm building from the same tree (with O=) so there's no question of patch
> skew. Turning on DISCONTIGMEM does not help. The HPMC points inside
> the code generated by the save_general macro just past skip_save_ior
> inside the intr_save function in entry.S
i've found out some more info about this problem, but still no clue why
it's happening....
at the end of head.S, when we branch to virtual space, the first virtual
insn access (to start_kernel) causes an itlb miss fault (as expected).
For some reason, the itlb handler is not able to find the page for
start_kernel in the page table, so it attempts to call the fault handler
(handle_interruption, via intr_save). However, in intr_save, as soon as
we switch to virtual space (virt_map, right before the save_general
macro call), we immediately cause another itlb miss fault, which fails,
and calls intr_save again. Each time intr_save is called, we create a
new stack frame. Eventually, the stack pointer points past valid phys
addr space, and the machine HPMCs.
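As a rough cross-check of that theory against the PIM dump in the first
message: r30 (the stack pointer on PA-RISC, where the stack grows
upward) and r29 both sit above the reported Available Memory value. The
sketch below just does that comparison. Treating Available Memory as a
flat upper bound on valid RAM addresses is an assumption, and the
function names and any frame size passed in are hypothetical.

```python
# Values copied from the PIM dump in the first message of the thread.
R29 = 0x000000020FFFFC40
R30 = 0x000000020FFFFF80            # r30 is the stack pointer on PA-RISC
AVAILABLE_MEMORY = 0x0000000200000000


def beyond_ram(addr, ram_size=AVAILABLE_MEMORY):
    """True if addr is at or above ram_size.

    Assumes RAM is one flat range starting at 0, which may not match
    the real memory map of the machine.
    """
    return addr >= ram_size


def frames_until_overflow(start_sp, frame_size, ram_size=AVAILABLE_MEMORY):
    """Number of frames of frame_size that can be pushed, starting at
    start_sp and growing upward, before the stack pointer reaches ram_size."""
    if start_sp >= ram_size:
        return 0
    remaining = ram_size - start_sp
    return (remaining + frame_size - 1) // frame_size  # ceiling division


print(beyond_ram(R29), beyond_ram(R30))
```

frames_until_overflow gives a feel for how many nested intr_save calls
it would take to run off the end of memory for a given starting stack
pointer and frame size. The real frame size is not stated in this
thread.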
The question is, why does the itlb miss handler fail to find the mapping
for start_kernel? On my kernel, start_kernel is at 0x1056xxxx, which is
well within the 16MB initially mapped in head.S. I went through the code
in head.S several times and it seems to be correct. I also don't quite
understand how this part of the code, which is all in assembly, can
behave differently between gcc-3.3 and gcc-3.4. I tried to move the
initial-VM initialization code in head.S much closer to the rfi (with
the hypothesis that some intervening code had trashed the page table)
but that doesn't seem to help. I also had a theory that perhaps the
different gcc versions were expanding the #define's differently for
offsets.h, but that doesn't seem to be the case either... so i'm out of
ideas :(
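For what it's worth, the "well within the 16MB" claim can be checked
with a couple of lines. This is a minimal sketch: KERNEL_BASE is an
assumed value for the start of the kernel's virtual mapping (the kernel
addresses in the PIM dump all look like 0x10xxxxxx), and since
start_kernel is only given as 0x1056xxxx, both ends of that range are
tested rather than guessing the low bits.

```python
KERNEL_BASE = 0x10000000        # assumed base of the kernel virtual mapping
INITIAL_MAP_SIZE = 16 << 20     # the 16MB initially mapped in head.S


def in_initial_map(vaddr, base=KERNEL_BASE, size=INITIAL_MAP_SIZE):
    """True if vaddr falls inside [base, base + size)."""
    return base <= vaddr < base + size


# start_kernel is only known as 0x1056xxxx, so check both extremes.
START_KERNEL_LOW = 0x10560000
START_KERNEL_HIGH = 0x1056FFFF

print(in_initial_map(START_KERNEL_LOW), in_initial_map(START_KERNEL_HIGH))
```

Under that assumption the address is about 5.4MB into the mapping, so
the missing translation is not explained by start_kernel falling
outside the initial 16MB.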
if it helps, what i see is that in L2_ptep,
    ldw,s \index(\pmd),\pmd
is returning with \pmd == 0
weird....
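To spell out what that load does: ldw,s is a scaled-index word load, so
it reads the 32-bit word at base + 4 * index. The sketch below models
only that one step over a flat byte buffer. The function names are
hypothetical, and treating a zero entry as "not present" is how the
symptom above is read here, not a statement about the real macro.

```python
import struct


def ldw_scaled(memory, base, index):
    """Model of a scaled-index word load: return the big-endian 32-bit
    word stored at byte offset base + 4 * index in memory."""
    (word,) = struct.unpack_from(">I", memory, base + 4 * index)
    return word


def lookup_entry(memory, table_base, index):
    """Return the table entry, or None when the slot is zero, i.e. the
    case that would send the miss handler down the fault path."""
    entry = ldw_scaled(memory, table_base, index)
    return entry if entry != 0 else None


# Tiny example table: slot 0 is empty, slot 1 holds an arbitrary value.
table = struct.pack(">II", 0, 0x004CF220)
print(lookup_entry(table, 0, 0), hex(lookup_entry(table, 0, 1)))
```

If the slot for start_kernel reads back as zero, either the entry was
never written, it was overwritten later, or the base/index used for the
load is not the one the setup code used.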
randolph
--
Randolph Chung
Debian GNU/Linux Developer, hppa/ia64 ports
http://www.tausq.org/