crash in entry.S restore_all, 2.6.12-rc2, x86, PAGEALLOC

* crash in entry.S restore_all, 2.6.12-rc2, x86, PAGEALLOC
@ 2005-04-05  6:55 Ingo Molnar
  2005-04-05  7:03 ` Andrew Morton
                   ` (2 more replies)
  0 siblings, 3 replies; 31+ messages in thread
From: Ingo Molnar @ 2005-04-05  6:55 UTC
  To: linux-kernel; +Cc: Linus Torvalds, stsp, Andrew Morton

the crashes below happen when PAGEALLOC is enabled. The faulting
instruction is:

        movb OLDSS(%esp), %ah

OLDSS is 0x38, esp is f4f83fc8, OLDSS(%esp) is thus f4f84000, which
correctly triggers the PAGEALLOC pagefault. Is esp off by 4 bytes?
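
The address arithmetic can be checked with a small sketch. It uses only the
OLDSS offset and the esp / fault-address values from the two oopses below;
the 4 KiB page size and the helper names are assumptions made for
illustration, not anything taken from entry.S.

```python
PAGE_SIZE = 0x1000   # assumption: ordinary 4 KiB x86 pages
OLDSS = 0x38         # offset of the saved SS slot, as quoted in the mail


def oldss_address(esp):
    """Address read by `movb OLDSS(%esp), %ah` on a 32-bit machine."""
    return (esp + OLDSS) & 0xFFFFFFFF


def crosses_page(esp):
    """True if OLDSS(%esp) lies on a different page than esp itself."""
    return esp // PAGE_SIZE != oldss_address(esp) // PAGE_SIZE


# (esp, reported fault address) pairs copied from the two oopses
REPORTS = ((0xF4F83FC8, 0xF4F84000), (0xF57BBFC8, 0xF57BC000))

for esp, fault in REPORTS:
    addr = oldss_address(esp)
    print(f"esp={esp:08x} OLDSS(%esp)={addr:08x} "
          f"matches_oops={addr == fault} crosses_page={crosses_page(esp)}")
```

In both reports the computed address equals the faulting address and is the
first byte of the following page. With esp 4 bytes lower, the same access
would land on the last word of the page that esp points into.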

it could be the ESP-16-bit-corruption patch causing this, or it could be
an already existing latent bug getting triggered now: normally only iret
accesses OLDSS, and we fix up any iret faults, but now that we
explicitly access OLDSS(%esp), the esp bug shows up.

so it would be nice to understand why this triggers. It seems to be a
sporadic event - first it hit hotplug, then input.agent. If i disable
PAGEALLOC the system boots up fine. In any case, the ESP-corruption
patch is not safe until this bug is understood, as right now it may read
a random byte off the next page and possibly make bogus calls to the
16-bit-fixup code.
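
To see which slot the read falls on, the stack words from the first oops
can be lined up against the saved-register frame. This is only a sketch:
the slot order is an assumption based on the usual i386 entry.S layout
(only OLDSS = 0x38 is stated in this mail), and the function name is made
up for illustration.

```python
# Assumed slot order of the saved-register frame, 4 bytes per slot.
# Only the OLDSS offset (0x38) is confirmed by the mail itself.
SLOTS = ["EBX", "ECX", "EDX", "ESI", "EDI", "EBP", "EAX", "DS", "ES",
         "ORIG_EAX", "EIP", "CS", "EFLAGS", "OLDESP", "OLDSS"]

ESP = 0xF4F83FC8
# The 14 words printed on the "Stack:" lines of the first oops
STACK = [0x00000000, 0xBFA71DD0, 0x009C0FFC, 0x00000000, 0x009B63F9,
         0xBFA71D44, 0x000000C5, 0x0000007B, 0x0000007B, 0xFFFFFFEF,
         0xC01027BA, 0x00000060, 0x00000206, 0x0000007B]


def label_frame(esp, words):
    """Pair each assumed slot with its address and dumped word (or None)."""
    rows = []
    for i, name in enumerate(SLOTS):
        value = words[i] if i < len(words) else None
        rows.append((name, esp + 4 * i, value))
    return rows


for name, addr, value in label_frame(ESP, STACK):
    shown = "(not in dump)" if value is None else f"{value:08x}"
    print(f"{name:<9} @ {addr:08x}: {shown}")
```

Under that assumed layout the dump supplies words for the first 14 slots
only, and the 15th slot, OLDSS, sits at f4f84000, the address the oops
reports as faulting.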

	Ingo

-------------

BUG: Unable to handle kernel paging request at virtual address f4f84000
 printing eip:
c010287c
*pde = 00527067
*pte = 34f84000
Oops: 0000 [#1]
PREEMPT DEBUG_PAGEALLOC
Modules linked in:
CPU:    0
EIP:    0060:[<c010287c>]    Not tainted VLI
EFLAGS: 00010046   (2.6.12-rc2-RT-V0.7.43-09) 
EIP is at restore_all+0x4/0x18
eax: 00000206   ebx: 00000000   ecx: 00000000   edx: 00000001
esi: 00000000   edi: 009b63f9   ebp: f4f82000   esp: f4f83fc8
ds: 007b   es: 007b   ss: 0068   preempt: 00000001
Process 10-udev.hotplug (pid: 1264, threadinfo=f4f82000 task=f5034a10)
Stack: 00000000 bfa71dd0 009c0ffc 00000000 009b63f9 bfa71d44 000000c5 0000007b 
       0000007b ffffffef c01027ba 00000060 00000206 0000007b 
Call Trace:
 [<c01036ac>] show_stack+0x7a/0x90 (32)
 [<c0103835>] show_registers+0x15a/0x1d2 (56)
 [<c0103a30>] die+0xf4/0x17e (68)
 [<c010f444>] do_page_fault+0x3de/0x60a (212)
 [<c01032eb>] error_code+0x4f/0x54 (-8076)

---------------------

BUG: Unable to handle kernel paging request at virtual address f57bc000
 printing eip:
c010287c
*pde = 00529067
*pte = 357bc000
Oops: 0000 [#1]
PREEMPT DEBUG_PAGEALLOC
Modules linked in:
CPU:    0
EIP:    0060:[<c010287c>]    Not tainted VLI
EFLAGS: 00010046   (2.6.12-rc2-RT-V0.7.43-09) 
EIP is at restore_all+0x4/0x18
eax: 00000206   ebx: b7f11000   ecx: 00000000   edx: 00000000
esi: 080e4f28   edi: 00000000   ebp: f57ba000   esp: f57bbfc8
ds: 007b   es: 007b   ss: 0068   preempt: 00000001
Process input.agent (pid: 1131, threadinfo=f57ba000 task=f57b9a10)
Stack: b7f11000 00001000 009c0ffc 080e4f28 00000000 bfc112c0 0000005b 0000007b 
       0000007b ffffff00 c01027ba 00000060 00000206 0000007b 
Call Trace:
 [<c01036ac>] show_stack+0x7a/0x90 (32)
 [<c0103835>] show_registers+0x15a/0x1d2 (56)
 [<c0103a30>] die+0xf4/0x17e (68)
 [<c010f474>] do_page_fault+0x3de/0x60a (212)
 [<c01032eb>] error_code+0x4f/0x54 (-8076)
