linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [3.2.0-RC7] BUG: unable to handle kernel NULL pointer dereference at 0000000000000598 [    1.478005] IP: [<ffffffff8107a6c4>] queue_work_on+0x4/0x30
@ 2012-01-03 16:13 Sander Eikelenboom
  2012-01-03 19:07 ` Regression: ONE CPU fails bootup at " Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 26+ messages in thread
From: Sander Eikelenboom @ 2012-01-03 16:13 UTC (permalink / raw)
  To: Thomas Gleixner, Konrad Rzeszutek Wilk; +Cc: linux-kernel

Hi all,

While trying a vanilla 3.2.0-rc7+ kernel (commit 115e8e705e4be071b9e06ff72578e3b603f2ba65) as host and guest kernels under Xen:

The kernels only boot when a guest has MORE than 1 cpu, with ONE CPU it gives this stacktrace:


[    1.074218] i8042: No controller found
[    1.074510] mousedev: PS/2 mouse device common for all mice
[    1.233365] BUG: unable to handle kernel NULL pointer dereference at 0000000000000598
[    1.233382] IP: [<ffffffff8107a6c4>] queue_work_on+0x4/0x30
[    1.233394] PGD 0
[    1.233399] Oops: 0002 [#1] SMP
[    1.233406] CPU 0
[    1.233409] Modules linked in:
[    1.233415]
[    1.233419] Pid: 586, comm: kworker/0:1 Not tainted 3.2.0-rc7+ #1
[    1.233427] RIP: e030:[<ffffffff8107a6c4>]  [<ffffffff8107a6c4>] queue_work_on+0x4/0x30
[    1.233436] RSP: e02b:ffff88000ee07b20  EFLAGS: 00010002
[    1.233441] RAX: ffff88000ecea000 RBX: ffffffff82729c80 RCX: 00005684b0256000
[    1.233447] RDX: 0000000000000598 RSI: ffff88000ecea000 RDI: 0000000000000000
[    1.233452] RBP: ffff88000ee07b20 R08: 0000000000000000 R09: 0000000000000001
[    1.233458] R10: 0000000000000000 R11: 0000000000000000 R12: 00000000ffffffd0
[    1.233464] R13: 00000000000000ff R14: 0000000000000023 R15: 0000000000000014
[    1.233472] FS:  0000000000000000(0000) GS:ffff88000ffd5000(0000) knlGS:0000000000000000
[    1.233479] CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
[    1.233484] CR2: 0000000000000598 CR3: 0000000001e05000 CR4: 0000000000000660
[    1.233490] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[    1.233496] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[    1.233502] Process kworker/0:1 (pid: 586, threadinfo ffff88000ee06000, task ffff88000edbbe80)
[    1.233508] Stack:
[    1.233511]  ffff88000ee07b30 ffffffff8107a72a ffff88000ee07b40 ffffffff8107a743
[    1.233522]  ffff88000ee07b50 ffffffff81575250 ffff88000ee07b80 ffffffff815779c7
[    1.233533]  ffffffff81e10500 00000000000000df 0000000000000020 ffffffff82729c80
[    1.233545] Call Trace:
[    1.233550]  [<ffffffff8107a72a>] queue_work+0x1a/0x20
[    1.233556]  [<ffffffff8107a743>] schedule_work+0x13/0x20
[    1.233564]  [<ffffffff81575250>] rtc_update_irq+0x10/0x20
[    1.233571]  [<ffffffff815779c7>] cmos_checkintr+0x67/0x70
[    1.233577]  [<ffffffff81577a1d>] cmos_irq_disable+0x4d/0x60
[    1.233583]  [<ffffffff81578ad1>] ? cmos_set_alarm+0xc1/0x220
[    1.234342]  [<ffffffff81578ade>] cmos_set_alarm+0xce/0x220
[    1.234342]  [<ffffffff81574c43>] ? rtc_time_to_tm+0xe3/0x1b0
[    1.234342]  [<ffffffff8157541b>] __rtc_set_alarm+0x9b/0xa0
[    1.234342]  [<ffffffff81575899>] rtc_timer_do_work+0x1c9/0x1e0
[    1.234342]  [<ffffffff81096127>] ? lock_acquire+0x97/0xb0
[    1.234342]  [<ffffffff81079d20>] process_one_work+0x190/0x450
[    1.234342]  [<ffffffff81079cbf>] ? process_one_work+0x12f/0x450
[    1.234342]  [<ffffffff815756d0>] ? rtc_timer_start+0x80/0x80
[    1.234342]  [<ffffffff8107cb21>] worker_thread+0x171/0x3a0
[    1.234342]  [<ffffffff8107c9b0>] ? manage_workers+0x210/0x210
[    1.234342]  [<ffffffff81081526>] kthread+0x96/0xa0
[    1.234342]  [<ffffffff818ed774>] kernel_thread_helper+0x4/0x10
[    1.234342]  [<ffffffff818eb7f8>] ? int_ret_from_sys_call+0x7/0x1b
[    1.234342]  [<ffffffff818e4e45>] ? retint_restore_args+0x5/0x6
[    1.234342]  [<ffffffff818ed770>] ? gs_change+0x13/0x13
[    1.234342] Code: 48 89 e5 48 89 ce 40 80 e6 00 83 e1 04 48 0f 45 c6 48 8b 70 08 65 8b 3c 25 b0 d9 00 00 e8 65 fc ff ff c9 c3 0f 1f 00 55 48 89 e5 <3e> 0f ba 2a 00 19 c9 31 c0 85 c9 74 07 c9 c3 0f 1f 44 00 00 e8
[    1.234342] RIP  [<ffffffff8107a6c4>] queue_work_on+0x4/0x30
[    1.234342]  RSP <ffff88000ee07b20>
[    1.234342] CR2: 0000000000000598
[    1.234342] ---[ end trace e13f105b060373ec ]---
[    1.277121] BUG: unable to handle kernel paging request at fffffffffffffff8
[    1.277130] IP: [<ffffffff81080f8b>] kthread_data+0xb/0x20
[    1.277138] PGD 1e07067 PUD 1e08067 PMD 0
[    1.277147] Oops: 0000 [#2] SMP
[    1.277153] CPU 0
[    1.277156] Modules linked in:
[    1.277162]
[    1.277166] Pid: 586, comm: kworker/0:1 Tainted: G      D      3.2.0-rc7+ #1
[    1.277175] RIP: e030:[<ffffffff81080f8b>]  [<ffffffff81080f8b>] kthread_data+0xb/0x20
[    1.277184] RSP: e02b:ffff88000ee07708  EFLAGS: 00010096
[    1.277189] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[    1.278053] RDX: ffff88000ffe7100 RSI: 0000000000000000 RDI: ffff88000edbbe80
[    1.278053] RBP: ffff88000ee07708 R08: ffff88000edbbef0 R09: 0000000000000001
[    1.278053] R10: 0000000000000800 R11: 0000000000000000 R12: ffff88000edbc1f0
[    1.278053] R13: 0000000000000000 R14: 0000000000000000 R15: ffff88000ee07840
[    1.278053] FS:  0000000000000000(0000) GS:ffff88000ffd5000(0000) knlGS:0000000000000000
[    1.278053] CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
[    1.278053] CR2: fffffffffffffff8 CR3: 0000000001e05000 CR4: 0000000000000660
[    1.278053] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[    1.278053] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[    1.278053] Process kworker/0:1 (pid: 586, threadinfo ffff88000ee06000, task ffff88000edbbe80)
[    1.278053] Stack:
[    1.278053]  ffff88000ee07728 ffffffff8107b050 ffff88000ee07728 ffff88000ffe7100
[    1.278053]  ffff88000ee077c8 ffffffff818e1d7c ffffffff81066d79 0000000000000000
[    1.278053]  ffff88000edbbe80 0000000000012100 ffff88000ee07fd8 ffff88000ee06010
[    1.278053] Call Trace:
[    1.278053]  [<ffffffff8107b050>] wq_worker_sleeping+0x10/0xa0
[    1.278053]  [<ffffffff818e1d7c>] __schedule+0x54c/0x8b0
[    1.278053]  [<ffffffff81066d79>] ? do_exit+0x519/0x850
[    1.278053]  [<ffffffff810087af>] ? xen_restore_fl_direct_reloc+0x4/0x4
[    1.278053]  [<ffffffff818e23ea>] schedule+0x3a/0x60
[    1.278053]  [<ffffffff81066def>] do_exit+0x58f/0x850
[    1.278053]  [<ffffffff81063d3d>] ? kmsg_dump+0xfd/0x140
[    1.278053]  [<ffffffff818e5b67>] oops_end+0xc7/0x120
[    1.278053]  [<ffffffff810640ff>] ? console_unlock+0x21f/0x290
[    1.278053]  [<ffffffff81036285>] no_context+0xf5/0x270
[    1.278053]  [<ffffffff8103654d>] __bad_area_nosemaphore+0x14d/0x220
[    1.278053]  [<ffffffff8103662e>] bad_area_nosemaphore+0xe/0x10
[    1.278053]  [<ffffffff818e8826>] do_page_fault+0x336/0x490
[    1.278053]  [<ffffffff81007fed>] ? xen_force_evtchn_callback+0xd/0x10
[    1.278053]  [<ffffffff810087c2>] ? check_events+0x12/0x20
[    1.278053]  [<ffffffff818e50b5>] page_fault+0x25/0x30
[    1.278053]  [<ffffffff8107a6c4>] ? queue_work_on+0x4/0x30
[    1.278053]  [<ffffffff8107a72a>] queue_work+0x1a/0x20
[    1.278053]  [<ffffffff8107a743>] schedule_work+0x13/0x20
[    1.278053]  [<ffffffff81575250>] rtc_update_irq+0x10/0x20
[    1.278053]  [<ffffffff815779c7>] cmos_checkintr+0x67/0x70
[    1.278053]  [<ffffffff81577a1d>] cmos_irq_disable+0x4d/0x60
[    1.278053]  [<ffffffff81578ad1>] ? cmos_set_alarm+0xc1/0x220
[    1.278053]  [<ffffffff81578ade>] cmos_set_alarm+0xce/0x220
[    1.278053]  [<ffffffff81574c43>] ? rtc_time_to_tm+0xe3/0x1b0
[    1.278053]  [<ffffffff8157541b>] __rtc_set_alarm+0x9b/0xa0
[    1.278053]  [<ffffffff81575899>] rtc_timer_do_work+0x1c9/0x1e0
[    1.278053]  [<ffffffff81096127>] ? lock_acquire+0x97/0xb0
[    1.278053]  [<ffffffff81079d20>] process_one_work+0x190/0x450
[    1.278053]  [<ffffffff81079cbf>] ? process_one_work+0x12f/0x450
[    1.278053]  [<ffffffff815756d0>] ? rtc_timer_start+0x80/0x80
[    1.278053]  [<ffffffff8107cb21>] worker_thread+0x171/0x3a0
[    1.278053]  [<ffffffff8107c9b0>] ? manage_workers+0x210/0x210
[    1.278053]  [<ffffffff81081526>] kthread+0x96/0xa0
[    1.278053]  [<ffffffff818ed774>] kernel_thread_helper+0x4/0x10
[    1.278053]  [<ffffffff818eb7f8>] ? int_ret_from_sys_call+0x7/0x1b
[    1.278053]  [<ffffffff818e4e45>] ? retint_restore_args+0x5/0x6
[    1.278053]  [<ffffffff818ed770>] ? gs_change+0x13/0x13
[    1.278053] Code: 55 65 48 8b 04 25 40 c4 00 00 48 8b 80 18 03 00 00 48 89 e5 8b 40 f0 c9 c3 0f 1f 80 00 00 00 00 48 8b 87 18 03 00 00 55 48 89 e5 <48> 8b 40 f8 c9 c3 66 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00
[    1.278053] RIP  [<ffffffff81080f8b>] kthread_data+0xb/0x20
[    1.278053]  RSP <ffff88000ee07708>
[    1.278053] CR2: fffffffffffffff8
[    1.278053] ---[ end trace e13f105b060373ed ]---
[    1.278053] Fixing recursive fault but reboot is needed!

--
Sander


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Regression: ONE CPU fails bootup at Re: [3.2.0-RC7] BUG: unable to handle kernel NULL pointer dereference at 0000000000000598 [    1.478005] IP: [<ffffffff8107a6c4>] queue_work_on+0x4/0x30
  2012-01-03 16:13 [3.2.0-RC7] BUG: unable to handle kernel NULL pointer dereference at 0000000000000598 [ 1.478005] IP: [<ffffffff8107a6c4>] queue_work_on+0x4/0x30 Sander Eikelenboom
@ 2012-01-03 19:07 ` Konrad Rzeszutek Wilk
  2012-01-03 19:17   ` Sander Eikelenboom
                     ` (3 more replies)
  0 siblings, 4 replies; 26+ messages in thread
From: Konrad Rzeszutek Wilk @ 2012-01-03 19:07 UTC (permalink / raw)
  To: Sander Eikelenboom, neilb, john.stultz, stefan.bader, rjw
  Cc: Thomas Gleixner, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 9729 bytes --]

On Tue, Jan 03, 2012 at 05:13:51PM +0100, Sander Eikelenboom wrote:
> Hi all,
> 
> While trying a vanilla 3.2.0-rc7+ kernel (commit 115e8e705e4be071b9e06ff72578e3b603f2ba65) as host and guest kernels under Xen:
> 
> The kernels only boot when a guest has MORE than 1 cpu, with ONE CPU it gives this stacktrace:

Yikes. So without the 115e8e705e4be071b9e06ff72578e3b603f2ba65 it boots right? So
regression? Lets CC Rafeal.

But the git commit:

ommit 115e8e705e4be071b9e06ff72578e3b603f2ba65
Merge: 733bbb7 f88e1ae
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date:   Mon Jan 2 12:34:03 2012 -0800

    Merge branch 'devicetree/merge' of git://git.secretlab.ca/git/linux-2.6
    
    * 'devicetree/merge' of git://git.secretlab.ca/git/linux-2.6:
      dt/device: Fix auxdata matching to handle entries without a name override

Looks to be unrelated? Or is there some other bug?

Is this by any chance related to "rtc: Expire alarms after the time is set."
(93b2ec0128c431148b216b8f7337c1a52131ef03) which breaks Amazon EC2 instances?

I think Stefan had a patch for this..

Ah, please see attached file.

Sander, could you please do two tests:

 1). Revert the 93b2ec0128c431148b216b8f7337c1a52131ef03 and see if that fixes it
 2). Use the latest linus/master and try the attached patch?

Thanks!

> 
> [    1.074218] i8042: No controller found
> [    1.074510] mousedev: PS/2 mouse device common for all mice
> [    1.233365] BUG: unable to handle kernel NULL pointer dereference at 0000000000000598
> [    1.233382] IP: [<ffffffff8107a6c4>] queue_work_on+0x4/0x30
> [    1.233394] PGD 0
> [    1.233399] Oops: 0002 [#1] SMP
> [    1.233406] CPU 0
> [    1.233409] Modules linked in:
> [    1.233415]
> [    1.233419] Pid: 586, comm: kworker/0:1 Not tainted 3.2.0-rc7+ #1
> [    1.233427] RIP: e030:[<ffffffff8107a6c4>]  [<ffffffff8107a6c4>] queue_work_on+0x4/0x30
> [    1.233436] RSP: e02b:ffff88000ee07b20  EFLAGS: 00010002
> [    1.233441] RAX: ffff88000ecea000 RBX: ffffffff82729c80 RCX: 00005684b0256000
> [    1.233447] RDX: 0000000000000598 RSI: ffff88000ecea000 RDI: 0000000000000000
> [    1.233452] RBP: ffff88000ee07b20 R08: 0000000000000000 R09: 0000000000000001
> [    1.233458] R10: 0000000000000000 R11: 0000000000000000 R12: 00000000ffffffd0
> [    1.233464] R13: 00000000000000ff R14: 0000000000000023 R15: 0000000000000014
> [    1.233472] FS:  0000000000000000(0000) GS:ffff88000ffd5000(0000) knlGS:0000000000000000
> [    1.233479] CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
> [    1.233484] CR2: 0000000000000598 CR3: 0000000001e05000 CR4: 0000000000000660
> [    1.233490] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [    1.233496] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> [    1.233502] Process kworker/0:1 (pid: 586, threadinfo ffff88000ee06000, task ffff88000edbbe80)
> [    1.233508] Stack:
> [    1.233511]  ffff88000ee07b30 ffffffff8107a72a ffff88000ee07b40 ffffffff8107a743
> [    1.233522]  ffff88000ee07b50 ffffffff81575250 ffff88000ee07b80 ffffffff815779c7
> [    1.233533]  ffffffff81e10500 00000000000000df 0000000000000020 ffffffff82729c80
> [    1.233545] Call Trace:
> [    1.233550]  [<ffffffff8107a72a>] queue_work+0x1a/0x20
> [    1.233556]  [<ffffffff8107a743>] schedule_work+0x13/0x20
> [    1.233564]  [<ffffffff81575250>] rtc_update_irq+0x10/0x20
> [    1.233571]  [<ffffffff815779c7>] cmos_checkintr+0x67/0x70
> [    1.233577]  [<ffffffff81577a1d>] cmos_irq_disable+0x4d/0x60
> [    1.233583]  [<ffffffff81578ad1>] ? cmos_set_alarm+0xc1/0x220
> [    1.234342]  [<ffffffff81578ade>] cmos_set_alarm+0xce/0x220
> [    1.234342]  [<ffffffff81574c43>] ? rtc_time_to_tm+0xe3/0x1b0
> [    1.234342]  [<ffffffff8157541b>] __rtc_set_alarm+0x9b/0xa0
> [    1.234342]  [<ffffffff81575899>] rtc_timer_do_work+0x1c9/0x1e0
> [    1.234342]  [<ffffffff81096127>] ? lock_acquire+0x97/0xb0
> [    1.234342]  [<ffffffff81079d20>] process_one_work+0x190/0x450
> [    1.234342]  [<ffffffff81079cbf>] ? process_one_work+0x12f/0x450
> [    1.234342]  [<ffffffff815756d0>] ? rtc_timer_start+0x80/0x80
> [    1.234342]  [<ffffffff8107cb21>] worker_thread+0x171/0x3a0
> [    1.234342]  [<ffffffff8107c9b0>] ? manage_workers+0x210/0x210
> [    1.234342]  [<ffffffff81081526>] kthread+0x96/0xa0
> [    1.234342]  [<ffffffff818ed774>] kernel_thread_helper+0x4/0x10
> [    1.234342]  [<ffffffff818eb7f8>] ? int_ret_from_sys_call+0x7/0x1b
> [    1.234342]  [<ffffffff818e4e45>] ? retint_restore_args+0x5/0x6
> [    1.234342]  [<ffffffff818ed770>] ? gs_change+0x13/0x13
> [    1.234342] Code: 48 89 e5 48 89 ce 40 80 e6 00 83 e1 04 48 0f 45 c6 48 8b 70 08 65 8b 3c 25 b0 d9 00 00 e8 65 fc ff ff c9 c3 0f 1f 00 55 48 89 e5 <3e> 0f ba 2a 00 19 c9 31 c0 85 c9 74 07 c9 c3 0f 1f 44 00 00 e8
> [    1.234342] RIP  [<ffffffff8107a6c4>] queue_work_on+0x4/0x30
> [    1.234342]  RSP <ffff88000ee07b20>
> [    1.234342] CR2: 0000000000000598
> [    1.234342] ---[ end trace e13f105b060373ec ]---
> [    1.277121] BUG: unable to handle kernel paging request at fffffffffffffff8
> [    1.277130] IP: [<ffffffff81080f8b>] kthread_data+0xb/0x20
> [    1.277138] PGD 1e07067 PUD 1e08067 PMD 0
> [    1.277147] Oops: 0000 [#2] SMP
> [    1.277153] CPU 0
> [    1.277156] Modules linked in:
> [    1.277162]
> [    1.277166] Pid: 586, comm: kworker/0:1 Tainted: G      D      3.2.0-rc7+ #1
> [    1.277175] RIP: e030:[<ffffffff81080f8b>]  [<ffffffff81080f8b>] kthread_data+0xb/0x20
> [    1.277184] RSP: e02b:ffff88000ee07708  EFLAGS: 00010096
> [    1.277189] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
> [    1.278053] RDX: ffff88000ffe7100 RSI: 0000000000000000 RDI: ffff88000edbbe80
> [    1.278053] RBP: ffff88000ee07708 R08: ffff88000edbbef0 R09: 0000000000000001
> [    1.278053] R10: 0000000000000800 R11: 0000000000000000 R12: ffff88000edbc1f0
> [    1.278053] R13: 0000000000000000 R14: 0000000000000000 R15: ffff88000ee07840
> [    1.278053] FS:  0000000000000000(0000) GS:ffff88000ffd5000(0000) knlGS:0000000000000000
> [    1.278053] CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
> [    1.278053] CR2: fffffffffffffff8 CR3: 0000000001e05000 CR4: 0000000000000660
> [    1.278053] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [    1.278053] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> [    1.278053] Process kworker/0:1 (pid: 586, threadinfo ffff88000ee06000, task ffff88000edbbe80)
> [    1.278053] Stack:
> [    1.278053]  ffff88000ee07728 ffffffff8107b050 ffff88000ee07728 ffff88000ffe7100
> [    1.278053]  ffff88000ee077c8 ffffffff818e1d7c ffffffff81066d79 0000000000000000
> [    1.278053]  ffff88000edbbe80 0000000000012100 ffff88000ee07fd8 ffff88000ee06010
> [    1.278053] Call Trace:
> [    1.278053]  [<ffffffff8107b050>] wq_worker_sleeping+0x10/0xa0
> [    1.278053]  [<ffffffff818e1d7c>] __schedule+0x54c/0x8b0
> [    1.278053]  [<ffffffff81066d79>] ? do_exit+0x519/0x850
> [    1.278053]  [<ffffffff810087af>] ? xen_restore_fl_direct_reloc+0x4/0x4
> [    1.278053]  [<ffffffff818e23ea>] schedule+0x3a/0x60
> [    1.278053]  [<ffffffff81066def>] do_exit+0x58f/0x850
> [    1.278053]  [<ffffffff81063d3d>] ? kmsg_dump+0xfd/0x140
> [    1.278053]  [<ffffffff818e5b67>] oops_end+0xc7/0x120
> [    1.278053]  [<ffffffff810640ff>] ? console_unlock+0x21f/0x290
> [    1.278053]  [<ffffffff81036285>] no_context+0xf5/0x270
> [    1.278053]  [<ffffffff8103654d>] __bad_area_nosemaphore+0x14d/0x220
> [    1.278053]  [<ffffffff8103662e>] bad_area_nosemaphore+0xe/0x10
> [    1.278053]  [<ffffffff818e8826>] do_page_fault+0x336/0x490
> [    1.278053]  [<ffffffff81007fed>] ? xen_force_evtchn_callback+0xd/0x10
> [    1.278053]  [<ffffffff810087c2>] ? check_events+0x12/0x20
> [    1.278053]  [<ffffffff818e50b5>] page_fault+0x25/0x30
> [    1.278053]  [<ffffffff8107a6c4>] ? queue_work_on+0x4/0x30
> [    1.278053]  [<ffffffff8107a72a>] queue_work+0x1a/0x20
> [    1.278053]  [<ffffffff8107a743>] schedule_work+0x13/0x20
> [    1.278053]  [<ffffffff81575250>] rtc_update_irq+0x10/0x20
> [    1.278053]  [<ffffffff815779c7>] cmos_checkintr+0x67/0x70
> [    1.278053]  [<ffffffff81577a1d>] cmos_irq_disable+0x4d/0x60
> [    1.278053]  [<ffffffff81578ad1>] ? cmos_set_alarm+0xc1/0x220
> [    1.278053]  [<ffffffff81578ade>] cmos_set_alarm+0xce/0x220
> [    1.278053]  [<ffffffff81574c43>] ? rtc_time_to_tm+0xe3/0x1b0
> [    1.278053]  [<ffffffff8157541b>] __rtc_set_alarm+0x9b/0xa0
> [    1.278053]  [<ffffffff81575899>] rtc_timer_do_work+0x1c9/0x1e0
> [    1.278053]  [<ffffffff81096127>] ? lock_acquire+0x97/0xb0
> [    1.278053]  [<ffffffff81079d20>] process_one_work+0x190/0x450
> [    1.278053]  [<ffffffff81079cbf>] ? process_one_work+0x12f/0x450
> [    1.278053]  [<ffffffff815756d0>] ? rtc_timer_start+0x80/0x80
> [    1.278053]  [<ffffffff8107cb21>] worker_thread+0x171/0x3a0
> [    1.278053]  [<ffffffff8107c9b0>] ? manage_workers+0x210/0x210
> [    1.278053]  [<ffffffff81081526>] kthread+0x96/0xa0
> [    1.278053]  [<ffffffff818ed774>] kernel_thread_helper+0x4/0x10
> [    1.278053]  [<ffffffff818eb7f8>] ? int_ret_from_sys_call+0x7/0x1b
> [    1.278053]  [<ffffffff818e4e45>] ? retint_restore_args+0x5/0x6
> [    1.278053]  [<ffffffff818ed770>] ? gs_change+0x13/0x13
> [    1.278053] Code: 55 65 48 8b 04 25 40 c4 00 00 48 8b 80 18 03 00 00 48 89 e5 8b 40 f0 c9 c3 0f 1f 80 00 00 00 00 48 8b 87 18 03 00 00 55 48 89 e5 <48> 8b 40 f8 c9 c3 66 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00
> [    1.278053] RIP  [<ffffffff81080f8b>] kthread_data+0xb/0x20
> [    1.278053]  RSP <ffff88000ee07708>
> [    1.278053] CR2: fffffffffffffff8
> [    1.278053] ---[ end trace e13f105b060373ed ]---
> [    1.278053] Fixing recursive fault but reboot is needed!
> 
> --
> Sander

[-- Attachment #2: 0001-rtc-Delay-calling-rtc_initialize_alarm.patch --]
[-- Type: text/plain, Size: 2417 bytes --]

>From 67329417c32b6e4f9f0253fe1976d8cb2ca67996 Mon Sep 17 00:00:00 2001
From: Stefan Bader <stefan.bader@canonical.com>
Date: Tue, 3 Jan 2012 17:53:14 +0100
Subject: [PATCH] rtc: Delay calling rtc_initialize_alarm

Commit 93b2ec0128c431148b216b8f7337c1a52131ef03 added a schedule_work
call to rtc_initialize_alarm. However this seems to be called before
having completed the setup. At least running under Xen, I could see
the following crash:

BUG: unable to handle kernel NULL pointer dereference at 00000294
IP: [<c0172cd3>] queue_work_on+0x13/0x40
...
Call Trace:
[<c0172d3a>] queue_work+0x1a/0x20
[<c0172d54>] schedule_work+0x14/0x20
[<c054b252>] rtc_update_irq+0x12/0x20
[<c054dd0c>] cmos_checkintr.isra.2+0x5c/0x70
[<c054dd6a>] cmos_irq_disable+0x4a/0x60
[<c054e5ec>] cmos_set_alarm+0xdc/0x190
[<c054b9a7>] __rtc_set_alarm+0x87/0xa0
[<c054c7e0>] rtc_timer_do_work+0x160/0x200
[<c01748c1>] process_one_work+0x101/0x3a0
[<c0175384>] worker_thread+0x124/0x2d0
[<c01791ad>] kthread+0x6d/0x80

Moving the call farther down the function seems like avoiding this.
That still may be only because the timing is better, but on the other
hand it sounds better to run any handler really after all the setup is
complete.

Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Cc: stable@vger.kernel.org
---
 drivers/rtc/class.c |   12 ++++++------
 1 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/rtc/class.c b/drivers/rtc/class.c
index dc4c274..28b8b44 100644
--- a/drivers/rtc/class.c
+++ b/drivers/rtc/class.c
@@ -181,12 +181,6 @@ struct rtc_device *rtc_device_register(const char *name, struct device *dev,
 	rtc->pie_timer.function = rtc_pie_update_irq;
 	rtc->pie_enabled = 0;
 
-	/* Check to see if there is an ALARM already set in hw */
-	err = __rtc_read_alarm(rtc, &alrm);
-
-	if (!err && !rtc_valid_tm(&alrm.time))
-		rtc_initialize_alarm(rtc, &alrm);
-
 	strlcpy(rtc->name, name, RTC_DEVICE_NAME_SIZE);
 	dev_set_name(&rtc->dev, "rtc%d", id);
 
@@ -202,6 +196,12 @@ struct rtc_device *rtc_device_register(const char *name, struct device *dev,
 	rtc_sysfs_add_device(rtc);
 	rtc_proc_add_device(rtc);
 
+	/* Check to see if there is an ALARM already set in hw */
+	err = __rtc_read_alarm(rtc, &alrm);
+
+	if (!err && !rtc_valid_tm(&alrm.time))
+		rtc_initialize_alarm(rtc, &alrm);
+
 	dev_info(dev, "rtc core: registered %s as %s\n",
 			rtc->name, dev_name(&rtc->dev));
 
-- 
1.7.5.4



^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: Regression: ONE CPU fails bootup at Re: [3.2.0-RC7] BUG: unable to handle kernel NULL pointer dereference at 0000000000000598 [    1.478005] IP: [<ffffffff8107a6c4>] queue_work_on+0x4/0x30
  2012-01-03 19:07 ` Regression: ONE CPU fails bootup at " Konrad Rzeszutek Wilk
@ 2012-01-03 19:17   ` Sander Eikelenboom
  2012-01-03 19:26   ` Stefan Bader
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 26+ messages in thread
From: Sander Eikelenboom @ 2012-01-03 19:17 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: neilb, john.stultz, stefan.bader, rjw, Thomas Gleixner, linux-kernel

Tuesday, January 3, 2012, 8:07:54 PM, you wrote:

> On Tue, Jan 03, 2012 at 05:13:51PM +0100, Sander Eikelenboom wrote:
>> Hi all,
>> 
>> While trying a vanilla 3.2.0-rc7+ kernel (commit 115e8e705e4be071b9e06ff72578e3b603f2ba65) as host and guest kernels under Xen:
>> 
>> The kernels only boot when a guest has MORE than 1 cpu, with ONE CPU it gives this stacktrace:

> Yikes. So without the 115e8e705e4be071b9e06ff72578e3b603f2ba65 it boots right? So
> regression? Lets CC Rafeal.

That's not what i meant, sorry for being unclear, it's the latest commit to the tested kernel (it's later than the tagged 3.2.0-rc7).
The previous kernel that i know works ok is 3.1.5.


> But the git commit:

> ommit 115e8e705e4be071b9e06ff72578e3b603f2ba65
> Merge: 733bbb7 f88e1ae
> Author: Linus Torvalds <torvalds@linux-foundation.org>
> Date:   Mon Jan 2 12:34:03 2012 -0800

>     Merge branch 'devicetree/merge' of git://git.secretlab.ca/git/linux-2.6
>     
>     * 'devicetree/merge' of git://git.secretlab.ca/git/linux-2.6:
>       dt/device: Fix auxdata matching to handle entries without a name override

> Looks to be unrelated? Or is there some other bug?

yes it's unrelated, see above


> Is this by any chance related to "rtc: Expire alarms after the time is set."
> (93b2ec0128c431148b216b8f7337c1a52131ef03) which breaks Amazon EC2 instances?

> I think Stefan had a patch for this..

> Ah, please see attached file.

> Sander, could you please do two tests:

>  1). Revert the 93b2ec0128c431148b216b8f7337c1a52131ef03 and see if that fixes it
>  2). Use the latest linus/master and try the attached patch?

Will give both a try !


> Thanks!

>> 
>> [    1.074218] i8042: No controller found
>> [    1.074510] mousedev: PS/2 mouse device common for all mice
>> [    1.233365] BUG: unable to handle kernel NULL pointer dereference at 0000000000000598
>> [    1.233382] IP: [<ffffffff8107a6c4>] queue_work_on+0x4/0x30
>> [    1.233394] PGD 0
>> [    1.233399] Oops: 0002 [#1] SMP
>> [    1.233406] CPU 0
>> [    1.233409] Modules linked in:
>> [    1.233415]
>> [    1.233419] Pid: 586, comm: kworker/0:1 Not tainted 3.2.0-rc7+ #1
>> [    1.233427] RIP: e030:[<ffffffff8107a6c4>]  [<ffffffff8107a6c4>] queue_work_on+0x4/0x30
>> [    1.233436] RSP: e02b:ffff88000ee07b20  EFLAGS: 00010002
>> [    1.233441] RAX: ffff88000ecea000 RBX: ffffffff82729c80 RCX: 00005684b0256000
>> [    1.233447] RDX: 0000000000000598 RSI: ffff88000ecea000 RDI: 0000000000000000
>> [    1.233452] RBP: ffff88000ee07b20 R08: 0000000000000000 R09: 0000000000000001
>> [    1.233458] R10: 0000000000000000 R11: 0000000000000000 R12: 00000000ffffffd0
>> [    1.233464] R13: 00000000000000ff R14: 0000000000000023 R15: 0000000000000014
>> [    1.233472] FS:  0000000000000000(0000) GS:ffff88000ffd5000(0000) knlGS:0000000000000000
>> [    1.233479] CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
>> [    1.233484] CR2: 0000000000000598 CR3: 0000000001e05000 CR4: 0000000000000660
>> [    1.233490] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> [    1.233496] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>> [    1.233502] Process kworker/0:1 (pid: 586, threadinfo ffff88000ee06000, task ffff88000edbbe80)
>> [    1.233508] Stack:
>> [    1.233511]  ffff88000ee07b30 ffffffff8107a72a ffff88000ee07b40 ffffffff8107a743
>> [    1.233522]  ffff88000ee07b50 ffffffff81575250 ffff88000ee07b80 ffffffff815779c7
>> [    1.233533]  ffffffff81e10500 00000000000000df 0000000000000020 ffffffff82729c80
>> [    1.233545] Call Trace:
>> [    1.233550]  [<ffffffff8107a72a>] queue_work+0x1a/0x20
>> [    1.233556]  [<ffffffff8107a743>] schedule_work+0x13/0x20
>> [    1.233564]  [<ffffffff81575250>] rtc_update_irq+0x10/0x20
>> [    1.233571]  [<ffffffff815779c7>] cmos_checkintr+0x67/0x70
>> [    1.233577]  [<ffffffff81577a1d>] cmos_irq_disable+0x4d/0x60
>> [    1.233583]  [<ffffffff81578ad1>] ? cmos_set_alarm+0xc1/0x220
>> [    1.234342]  [<ffffffff81578ade>] cmos_set_alarm+0xce/0x220
>> [    1.234342]  [<ffffffff81574c43>] ? rtc_time_to_tm+0xe3/0x1b0
>> [    1.234342]  [<ffffffff8157541b>] __rtc_set_alarm+0x9b/0xa0
>> [    1.234342]  [<ffffffff81575899>] rtc_timer_do_work+0x1c9/0x1e0
>> [    1.234342]  [<ffffffff81096127>] ? lock_acquire+0x97/0xb0
>> [    1.234342]  [<ffffffff81079d20>] process_one_work+0x190/0x450
>> [    1.234342]  [<ffffffff81079cbf>] ? process_one_work+0x12f/0x450
>> [    1.234342]  [<ffffffff815756d0>] ? rtc_timer_start+0x80/0x80
>> [    1.234342]  [<ffffffff8107cb21>] worker_thread+0x171/0x3a0
>> [    1.234342]  [<ffffffff8107c9b0>] ? manage_workers+0x210/0x210
>> [    1.234342]  [<ffffffff81081526>] kthread+0x96/0xa0
>> [    1.234342]  [<ffffffff818ed774>] kernel_thread_helper+0x4/0x10
>> [    1.234342]  [<ffffffff818eb7f8>] ? int_ret_from_sys_call+0x7/0x1b
>> [    1.234342]  [<ffffffff818e4e45>] ? retint_restore_args+0x5/0x6
>> [    1.234342]  [<ffffffff818ed770>] ? gs_change+0x13/0x13
>> [    1.234342] Code: 48 89 e5 48 89 ce 40 80 e6 00 83 e1 04 48 0f 45 c6 48 8b 70 08 65 8b 3c 25 b0 d9 00 00 e8 65 fc ff ff c9 c3 0f 1f 00 55 48 89 e5 <3e> 0f ba 2a 00 19 c9 31 c0 85 c9 74 07 c9 c3 0f 1f 44 00 00 e8
>> [    1.234342] RIP  [<ffffffff8107a6c4>] queue_work_on+0x4/0x30
>> [    1.234342]  RSP <ffff88000ee07b20>
>> [    1.234342] CR2: 0000000000000598
>> [    1.234342] ---[ end trace e13f105b060373ec ]---
>> [    1.277121] BUG: unable to handle kernel paging request at fffffffffffffff8
>> [    1.277130] IP: [<ffffffff81080f8b>] kthread_data+0xb/0x20
>> [    1.277138] PGD 1e07067 PUD 1e08067 PMD 0
>> [    1.277147] Oops: 0000 [#2] SMP
>> [    1.277153] CPU 0
>> [    1.277156] Modules linked in:
>> [    1.277162]
>> [    1.277166] Pid: 586, comm: kworker/0:1 Tainted: G      D      3.2.0-rc7+ #1
>> [    1.277175] RIP: e030:[<ffffffff81080f8b>]  [<ffffffff81080f8b>] kthread_data+0xb/0x20
>> [    1.277184] RSP: e02b:ffff88000ee07708  EFLAGS: 00010096
>> [    1.277189] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
>> [    1.278053] RDX: ffff88000ffe7100 RSI: 0000000000000000 RDI: ffff88000edbbe80
>> [    1.278053] RBP: ffff88000ee07708 R08: ffff88000edbbef0 R09: 0000000000000001
>> [    1.278053] R10: 0000000000000800 R11: 0000000000000000 R12: ffff88000edbc1f0
>> [    1.278053] R13: 0000000000000000 R14: 0000000000000000 R15: ffff88000ee07840
>> [    1.278053] FS:  0000000000000000(0000) GS:ffff88000ffd5000(0000) knlGS:0000000000000000
>> [    1.278053] CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
>> [    1.278053] CR2: fffffffffffffff8 CR3: 0000000001e05000 CR4: 0000000000000660
>> [    1.278053] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> [    1.278053] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>> [    1.278053] Process kworker/0:1 (pid: 586, threadinfo ffff88000ee06000, task ffff88000edbbe80)
>> [    1.278053] Stack:
>> [    1.278053]  ffff88000ee07728 ffffffff8107b050 ffff88000ee07728 ffff88000ffe7100
>> [    1.278053]  ffff88000ee077c8 ffffffff818e1d7c ffffffff81066d79 0000000000000000
>> [    1.278053]  ffff88000edbbe80 0000000000012100 ffff88000ee07fd8 ffff88000ee06010
>> [    1.278053] Call Trace:
>> [    1.278053]  [<ffffffff8107b050>] wq_worker_sleeping+0x10/0xa0
>> [    1.278053]  [<ffffffff818e1d7c>] __schedule+0x54c/0x8b0
>> [    1.278053]  [<ffffffff81066d79>] ? do_exit+0x519/0x850
>> [    1.278053]  [<ffffffff810087af>] ? xen_restore_fl_direct_reloc+0x4/0x4
>> [    1.278053]  [<ffffffff818e23ea>] schedule+0x3a/0x60
>> [    1.278053]  [<ffffffff81066def>] do_exit+0x58f/0x850
>> [    1.278053]  [<ffffffff81063d3d>] ? kmsg_dump+0xfd/0x140
>> [    1.278053]  [<ffffffff818e5b67>] oops_end+0xc7/0x120
>> [    1.278053]  [<ffffffff810640ff>] ? console_unlock+0x21f/0x290
>> [    1.278053]  [<ffffffff81036285>] no_context+0xf5/0x270
>> [    1.278053]  [<ffffffff8103654d>] __bad_area_nosemaphore+0x14d/0x220
>> [    1.278053]  [<ffffffff8103662e>] bad_area_nosemaphore+0xe/0x10
>> [    1.278053]  [<ffffffff818e8826>] do_page_fault+0x336/0x490
>> [    1.278053]  [<ffffffff81007fed>] ? xen_force_evtchn_callback+0xd/0x10
>> [    1.278053]  [<ffffffff810087c2>] ? check_events+0x12/0x20
>> [    1.278053]  [<ffffffff818e50b5>] page_fault+0x25/0x30
>> [    1.278053]  [<ffffffff8107a6c4>] ? queue_work_on+0x4/0x30
>> [    1.278053]  [<ffffffff8107a72a>] queue_work+0x1a/0x20
>> [    1.278053]  [<ffffffff8107a743>] schedule_work+0x13/0x20
>> [    1.278053]  [<ffffffff81575250>] rtc_update_irq+0x10/0x20
>> [    1.278053]  [<ffffffff815779c7>] cmos_checkintr+0x67/0x70
>> [    1.278053]  [<ffffffff81577a1d>] cmos_irq_disable+0x4d/0x60
>> [    1.278053]  [<ffffffff81578ad1>] ? cmos_set_alarm+0xc1/0x220
>> [    1.278053]  [<ffffffff81578ade>] cmos_set_alarm+0xce/0x220
>> [    1.278053]  [<ffffffff81574c43>] ? rtc_time_to_tm+0xe3/0x1b0
>> [    1.278053]  [<ffffffff8157541b>] __rtc_set_alarm+0x9b/0xa0
>> [    1.278053]  [<ffffffff81575899>] rtc_timer_do_work+0x1c9/0x1e0
>> [    1.278053]  [<ffffffff81096127>] ? lock_acquire+0x97/0xb0
>> [    1.278053]  [<ffffffff81079d20>] process_one_work+0x190/0x450
>> [    1.278053]  [<ffffffff81079cbf>] ? process_one_work+0x12f/0x450
>> [    1.278053]  [<ffffffff815756d0>] ? rtc_timer_start+0x80/0x80
>> [    1.278053]  [<ffffffff8107cb21>] worker_thread+0x171/0x3a0
>> [    1.278053]  [<ffffffff8107c9b0>] ? manage_workers+0x210/0x210
>> [    1.278053]  [<ffffffff81081526>] kthread+0x96/0xa0
>> [    1.278053]  [<ffffffff818ed774>] kernel_thread_helper+0x4/0x10
>> [    1.278053]  [<ffffffff818eb7f8>] ? int_ret_from_sys_call+0x7/0x1b
>> [    1.278053]  [<ffffffff818e4e45>] ? retint_restore_args+0x5/0x6
>> [    1.278053]  [<ffffffff818ed770>] ? gs_change+0x13/0x13
>> [    1.278053] Code: 55 65 48 8b 04 25 40 c4 00 00 48 8b 80 18 03 00 00 48 89 e5 8b 40 f0 c9 c3 0f 1f 80 00 00 00 00 48 8b 87 18 03 00 00 55 48 89 e5 <48> 8b 40 f8 c9 c3 66 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00
>> [    1.278053] RIP  [<ffffffff81080f8b>] kthread_data+0xb/0x20
>> [    1.278053]  RSP <ffff88000ee07708>
>> [    1.278053] CR2: fffffffffffffff8
>> [    1.278053] ---[ end trace e13f105b060373ed ]---
>> [    1.278053] Fixing recursive fault but reboot is needed!
>> 
>> --
>> Sander



-- 
Best regards,
 Sander                            mailto:linux@eikelenboom.it


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Regression: ONE CPU fails bootup at Re: [3.2.0-RC7] BUG: unable to handle kernel NULL pointer dereference at 0000000000000598 [    1.478005] IP: [<ffffffff8107a6c4>] queue_work_on+0x4/0x30
  2012-01-03 19:07 ` Regression: ONE CPU fails bootup at " Konrad Rzeszutek Wilk
  2012-01-03 19:17   ` Sander Eikelenboom
@ 2012-01-03 19:26   ` Stefan Bader
  2012-01-03 20:11     ` Sander Eikelenboom
  2012-01-03 20:10   ` Sander Eikelenboom
  2012-01-03 23:09   ` John Stultz
  3 siblings, 1 reply; 26+ messages in thread
From: Stefan Bader @ 2012-01-03 19:26 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: Sander Eikelenboom, neilb, john.stultz, rjw, Thomas Gleixner,
	linux-kernel

On 03.01.2012 20:07, Konrad Rzeszutek Wilk wrote:
> On Tue, Jan 03, 2012 at 05:13:51PM +0100, Sander Eikelenboom wrote:
>> Hi all,
>>
>> While trying a vanilla 3.2.0-rc7+ kernel (commit 115e8e705e4be071b9e06ff72578e3b603f2ba65) as host and guest kernels under Xen:
>>
>> The kernels only boot when a guest has MORE than 1 cpu, with ONE CPU it gives this stacktrace:
> 

One detail I noticed when booting with more than 1 cpu for the guest (64bit) was
that there seemed to be a relative huge delay (about 90s) between the message on
registering rtc0 and the one announcing the probe failed with -ENOSYS (which it
does always). So while not crashing, something seemed wrong in that case as well.

-Stefan


> Yikes. So without the 115e8e705e4be071b9e06ff72578e3b603f2ba65 it boots right? So
> regression? Lets CC Rafeal.
> 
> But the git commit:
> 
> ommit 115e8e705e4be071b9e06ff72578e3b603f2ba65
> Merge: 733bbb7 f88e1ae
> Author: Linus Torvalds <torvalds@linux-foundation.org>
> Date:   Mon Jan 2 12:34:03 2012 -0800
> 
>     Merge branch 'devicetree/merge' of git://git.secretlab.ca/git/linux-2.6
>     
>     * 'devicetree/merge' of git://git.secretlab.ca/git/linux-2.6:
>       dt/device: Fix auxdata matching to handle entries without a name override
> 
> Looks to be unrelated? Or is there some other bug?
> 
> Is this by any chance related to "rtc: Expire alarms after the time is set."
> (93b2ec0128c431148b216b8f7337c1a52131ef03) which breaks Amazon EC2 instances?
> 
> I think Stefan had a patch for this..
> 
> Ah, please see attached file.
> 
> Sander, could you please do two tests:
> 
>  1). Revert the 93b2ec0128c431148b216b8f7337c1a52131ef03 and see if that fixes it
>  2). Use the latest linus/master and try the attached patch?
> 
> Thanks!
> 
>>
>> [    1.074218] i8042: No controller found
>> [    1.074510] mousedev: PS/2 mouse device common for all mice
>> [    1.233365] BUG: unable to handle kernel NULL pointer dereference at 0000000000000598
>> [    1.233382] IP: [<ffffffff8107a6c4>] queue_work_on+0x4/0x30
>> [    1.233394] PGD 0
>> [    1.233399] Oops: 0002 [#1] SMP
>> [    1.233406] CPU 0
>> [    1.233409] Modules linked in:
>> [    1.233415]
>> [    1.233419] Pid: 586, comm: kworker/0:1 Not tainted 3.2.0-rc7+ #1
>> [    1.233427] RIP: e030:[<ffffffff8107a6c4>]  [<ffffffff8107a6c4>] queue_work_on+0x4/0x30
>> [    1.233436] RSP: e02b:ffff88000ee07b20  EFLAGS: 00010002
>> [    1.233441] RAX: ffff88000ecea000 RBX: ffffffff82729c80 RCX: 00005684b0256000
>> [    1.233447] RDX: 0000000000000598 RSI: ffff88000ecea000 RDI: 0000000000000000
>> [    1.233452] RBP: ffff88000ee07b20 R08: 0000000000000000 R09: 0000000000000001
>> [    1.233458] R10: 0000000000000000 R11: 0000000000000000 R12: 00000000ffffffd0
>> [    1.233464] R13: 00000000000000ff R14: 0000000000000023 R15: 0000000000000014
>> [    1.233472] FS:  0000000000000000(0000) GS:ffff88000ffd5000(0000) knlGS:0000000000000000
>> [    1.233479] CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
>> [    1.233484] CR2: 0000000000000598 CR3: 0000000001e05000 CR4: 0000000000000660
>> [    1.233490] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> [    1.233496] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>> [    1.233502] Process kworker/0:1 (pid: 586, threadinfo ffff88000ee06000, task ffff88000edbbe80)
>> [    1.233508] Stack:
>> [    1.233511]  ffff88000ee07b30 ffffffff8107a72a ffff88000ee07b40 ffffffff8107a743
>> [    1.233522]  ffff88000ee07b50 ffffffff81575250 ffff88000ee07b80 ffffffff815779c7
>> [    1.233533]  ffffffff81e10500 00000000000000df 0000000000000020 ffffffff82729c80
>> [    1.233545] Call Trace:
>> [    1.233550]  [<ffffffff8107a72a>] queue_work+0x1a/0x20
>> [    1.233556]  [<ffffffff8107a743>] schedule_work+0x13/0x20
>> [    1.233564]  [<ffffffff81575250>] rtc_update_irq+0x10/0x20
>> [    1.233571]  [<ffffffff815779c7>] cmos_checkintr+0x67/0x70
>> [    1.233577]  [<ffffffff81577a1d>] cmos_irq_disable+0x4d/0x60
>> [    1.233583]  [<ffffffff81578ad1>] ? cmos_set_alarm+0xc1/0x220
>> [    1.234342]  [<ffffffff81578ade>] cmos_set_alarm+0xce/0x220
>> [    1.234342]  [<ffffffff81574c43>] ? rtc_time_to_tm+0xe3/0x1b0
>> [    1.234342]  [<ffffffff8157541b>] __rtc_set_alarm+0x9b/0xa0
>> [    1.234342]  [<ffffffff81575899>] rtc_timer_do_work+0x1c9/0x1e0
>> [    1.234342]  [<ffffffff81096127>] ? lock_acquire+0x97/0xb0
>> [    1.234342]  [<ffffffff81079d20>] process_one_work+0x190/0x450
>> [    1.234342]  [<ffffffff81079cbf>] ? process_one_work+0x12f/0x450
>> [    1.234342]  [<ffffffff815756d0>] ? rtc_timer_start+0x80/0x80
>> [    1.234342]  [<ffffffff8107cb21>] worker_thread+0x171/0x3a0
>> [    1.234342]  [<ffffffff8107c9b0>] ? manage_workers+0x210/0x210
>> [    1.234342]  [<ffffffff81081526>] kthread+0x96/0xa0
>> [    1.234342]  [<ffffffff818ed774>] kernel_thread_helper+0x4/0x10
>> [    1.234342]  [<ffffffff818eb7f8>] ? int_ret_from_sys_call+0x7/0x1b
>> [    1.234342]  [<ffffffff818e4e45>] ? retint_restore_args+0x5/0x6
>> [    1.234342]  [<ffffffff818ed770>] ? gs_change+0x13/0x13
>> [    1.234342] Code: 48 89 e5 48 89 ce 40 80 e6 00 83 e1 04 48 0f 45 c6 48 8b 70 08 65 8b 3c 25 b0 d9 00 00 e8 65 fc ff ff c9 c3 0f 1f 00 55 48 89 e5 <3e> 0f ba 2a 00 19 c9 31 c0 85 c9 74 07 c9 c3 0f 1f 44 00 00 e8
>> [    1.234342] RIP  [<ffffffff8107a6c4>] queue_work_on+0x4/0x30
>> [    1.234342]  RSP <ffff88000ee07b20>
>> [    1.234342] CR2: 0000000000000598
>> [    1.234342] ---[ end trace e13f105b060373ec ]---
>> [    1.277121] BUG: unable to handle kernel paging request at fffffffffffffff8
>> [    1.277130] IP: [<ffffffff81080f8b>] kthread_data+0xb/0x20
>> [    1.277138] PGD 1e07067 PUD 1e08067 PMD 0
>> [    1.277147] Oops: 0000 [#2] SMP
>> [    1.277153] CPU 0
>> [    1.277156] Modules linked in:
>> [    1.277162]
>> [    1.277166] Pid: 586, comm: kworker/0:1 Tainted: G      D      3.2.0-rc7+ #1
>> [    1.277175] RIP: e030:[<ffffffff81080f8b>]  [<ffffffff81080f8b>] kthread_data+0xb/0x20
>> [    1.277184] RSP: e02b:ffff88000ee07708  EFLAGS: 00010096
>> [    1.277189] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
>> [    1.278053] RDX: ffff88000ffe7100 RSI: 0000000000000000 RDI: ffff88000edbbe80
>> [    1.278053] RBP: ffff88000ee07708 R08: ffff88000edbbef0 R09: 0000000000000001
>> [    1.278053] R10: 0000000000000800 R11: 0000000000000000 R12: ffff88000edbc1f0
>> [    1.278053] R13: 0000000000000000 R14: 0000000000000000 R15: ffff88000ee07840
>> [    1.278053] FS:  0000000000000000(0000) GS:ffff88000ffd5000(0000) knlGS:0000000000000000
>> [    1.278053] CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
>> [    1.278053] CR2: fffffffffffffff8 CR3: 0000000001e05000 CR4: 0000000000000660
>> [    1.278053] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> [    1.278053] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>> [    1.278053] Process kworker/0:1 (pid: 586, threadinfo ffff88000ee06000, task ffff88000edbbe80)
>> [    1.278053] Stack:
>> [    1.278053]  ffff88000ee07728 ffffffff8107b050 ffff88000ee07728 ffff88000ffe7100
>> [    1.278053]  ffff88000ee077c8 ffffffff818e1d7c ffffffff81066d79 0000000000000000
>> [    1.278053]  ffff88000edbbe80 0000000000012100 ffff88000ee07fd8 ffff88000ee06010
>> [    1.278053] Call Trace:
>> [    1.278053]  [<ffffffff8107b050>] wq_worker_sleeping+0x10/0xa0
>> [    1.278053]  [<ffffffff818e1d7c>] __schedule+0x54c/0x8b0
>> [    1.278053]  [<ffffffff81066d79>] ? do_exit+0x519/0x850
>> [    1.278053]  [<ffffffff810087af>] ? xen_restore_fl_direct_reloc+0x4/0x4
>> [    1.278053]  [<ffffffff818e23ea>] schedule+0x3a/0x60
>> [    1.278053]  [<ffffffff81066def>] do_exit+0x58f/0x850
>> [    1.278053]  [<ffffffff81063d3d>] ? kmsg_dump+0xfd/0x140
>> [    1.278053]  [<ffffffff818e5b67>] oops_end+0xc7/0x120
>> [    1.278053]  [<ffffffff810640ff>] ? console_unlock+0x21f/0x290
>> [    1.278053]  [<ffffffff81036285>] no_context+0xf5/0x270
>> [    1.278053]  [<ffffffff8103654d>] __bad_area_nosemaphore+0x14d/0x220
>> [    1.278053]  [<ffffffff8103662e>] bad_area_nosemaphore+0xe/0x10
>> [    1.278053]  [<ffffffff818e8826>] do_page_fault+0x336/0x490
>> [    1.278053]  [<ffffffff81007fed>] ? xen_force_evtchn_callback+0xd/0x10
>> [    1.278053]  [<ffffffff810087c2>] ? check_events+0x12/0x20
>> [    1.278053]  [<ffffffff818e50b5>] page_fault+0x25/0x30
>> [    1.278053]  [<ffffffff8107a6c4>] ? queue_work_on+0x4/0x30
>> [    1.278053]  [<ffffffff8107a72a>] queue_work+0x1a/0x20
>> [    1.278053]  [<ffffffff8107a743>] schedule_work+0x13/0x20
>> [    1.278053]  [<ffffffff81575250>] rtc_update_irq+0x10/0x20
>> [    1.278053]  [<ffffffff815779c7>] cmos_checkintr+0x67/0x70
>> [    1.278053]  [<ffffffff81577a1d>] cmos_irq_disable+0x4d/0x60
>> [    1.278053]  [<ffffffff81578ad1>] ? cmos_set_alarm+0xc1/0x220
>> [    1.278053]  [<ffffffff81578ade>] cmos_set_alarm+0xce/0x220
>> [    1.278053]  [<ffffffff81574c43>] ? rtc_time_to_tm+0xe3/0x1b0
>> [    1.278053]  [<ffffffff8157541b>] __rtc_set_alarm+0x9b/0xa0
>> [    1.278053]  [<ffffffff81575899>] rtc_timer_do_work+0x1c9/0x1e0
>> [    1.278053]  [<ffffffff81096127>] ? lock_acquire+0x97/0xb0
>> [    1.278053]  [<ffffffff81079d20>] process_one_work+0x190/0x450
>> [    1.278053]  [<ffffffff81079cbf>] ? process_one_work+0x12f/0x450
>> [    1.278053]  [<ffffffff815756d0>] ? rtc_timer_start+0x80/0x80
>> [    1.278053]  [<ffffffff8107cb21>] worker_thread+0x171/0x3a0
>> [    1.278053]  [<ffffffff8107c9b0>] ? manage_workers+0x210/0x210
>> [    1.278053]  [<ffffffff81081526>] kthread+0x96/0xa0
>> [    1.278053]  [<ffffffff818ed774>] kernel_thread_helper+0x4/0x10
>> [    1.278053]  [<ffffffff818eb7f8>] ? int_ret_from_sys_call+0x7/0x1b
>> [    1.278053]  [<ffffffff818e4e45>] ? retint_restore_args+0x5/0x6
>> [    1.278053]  [<ffffffff818ed770>] ? gs_change+0x13/0x13
>> [    1.278053] Code: 55 65 48 8b 04 25 40 c4 00 00 48 8b 80 18 03 00 00 48 89 e5 8b 40 f0 c9 c3 0f 1f 80 00 00 00 00 48 8b 87 18 03 00 00 55 48 89 e5 <48> 8b 40 f8 c9 c3 66 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00
>> [    1.278053] RIP  [<ffffffff81080f8b>] kthread_data+0xb/0x20
>> [    1.278053]  RSP <ffff88000ee07708>
>> [    1.278053] CR2: fffffffffffffff8
>> [    1.278053] ---[ end trace e13f105b060373ed ]---
>> [    1.278053] Fixing recursive fault but reboot is needed!
>>
>> --
>> Sander


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Regression: ONE CPU fails bootup at Re: [3.2.0-RC7] BUG: unable to handle kernel NULL pointer dereference at 0000000000000598 [    1.478005] IP: [<ffffffff8107a6c4>] queue_work_on+0x4/0x30
  2012-01-03 19:07 ` Regression: ONE CPU fails bootup at " Konrad Rzeszutek Wilk
  2012-01-03 19:17   ` Sander Eikelenboom
  2012-01-03 19:26   ` Stefan Bader
@ 2012-01-03 20:10   ` Sander Eikelenboom
  2012-01-03 22:33     ` Konrad Rzeszutek Wilk
  2012-01-03 23:09   ` John Stultz
  3 siblings, 1 reply; 26+ messages in thread
From: Sander Eikelenboom @ 2012-01-03 20:10 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: neilb, john.stultz, stefan.bader, rjw, Thomas Gleixner, linux-kernel

Tuesday, January 3, 2012, 8:07:54 PM, you wrote:

> On Tue, Jan 03, 2012 at 05:13:51PM +0100, Sander Eikelenboom wrote:
>> Hi all,
>> 
>> While trying a vanilla 3.2.0-rc7+ kernel (commit 115e8e705e4be071b9e06ff72578e3b603f2ba65) as host and guest kernels under Xen:
>> 
>> The kernels only boot when a guest has MORE than 1 cpu, with ONE CPU it gives this stacktrace:

> Yikes. So without the 115e8e705e4be071b9e06ff72578e3b603f2ba65 it boots right? So
> regression? Lets CC Rafeal.

> But the git commit:

> ommit 115e8e705e4be071b9e06ff72578e3b603f2ba65
> Merge: 733bbb7 f88e1ae
> Author: Linus Torvalds <torvalds@linux-foundation.org>
> Date:   Mon Jan 2 12:34:03 2012 -0800

>     Merge branch 'devicetree/merge' of git://git.secretlab.ca/git/linux-2.6
>     
>     * 'devicetree/merge' of git://git.secretlab.ca/git/linux-2.6:
>       dt/device: Fix auxdata matching to handle entries without a name override

> Looks to be unrelated? Or is there some other bug?

> Is this by any chance related to "rtc: Expire alarms after the time is set."
> (93b2ec0128c431148b216b8f7337c1a52131ef03) which breaks Amazon EC2 instances?

> I think Stefan had a patch for this..

> Ah, please see attached file.

> Sander, could you please do two tests:

>  1). Revert the 93b2ec0128c431148b216b8f7337c1a52131ef03 and see if that fixes it
>  2). Use the latest linus/master and try the attached patch?

Both 1 and 2 make the problem disappear !
(no panic and no long delay on boot seen on 64bit with 1 and multiple cpu's)

Thanks !

--
Sander

> Thanks!

>> 
>> [    1.074218] i8042: No controller found
>> [    1.074510] mousedev: PS/2 mouse device common for all mice
>> [    1.233365] BUG: unable to handle kernel NULL pointer dereference at 0000000000000598
>> [    1.233382] IP: [<ffffffff8107a6c4>] queue_work_on+0x4/0x30
>> [    1.233394] PGD 0
>> [    1.233399] Oops: 0002 [#1] SMP
>> [    1.233406] CPU 0
>> [    1.233409] Modules linked in:
>> [    1.233415]
>> [    1.233419] Pid: 586, comm: kworker/0:1 Not tainted 3.2.0-rc7+ #1
>> [    1.233427] RIP: e030:[<ffffffff8107a6c4>]  [<ffffffff8107a6c4>] queue_work_on+0x4/0x30
>> [    1.233436] RSP: e02b:ffff88000ee07b20  EFLAGS: 00010002
>> [    1.233441] RAX: ffff88000ecea000 RBX: ffffffff82729c80 RCX: 00005684b0256000
>> [    1.233447] RDX: 0000000000000598 RSI: ffff88000ecea000 RDI: 0000000000000000
>> [    1.233452] RBP: ffff88000ee07b20 R08: 0000000000000000 R09: 0000000000000001
>> [    1.233458] R10: 0000000000000000 R11: 0000000000000000 R12: 00000000ffffffd0
>> [    1.233464] R13: 00000000000000ff R14: 0000000000000023 R15: 0000000000000014
>> [    1.233472] FS:  0000000000000000(0000) GS:ffff88000ffd5000(0000) knlGS:0000000000000000
>> [    1.233479] CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
>> [    1.233484] CR2: 0000000000000598 CR3: 0000000001e05000 CR4: 0000000000000660
>> [    1.233490] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> [    1.233496] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>> [    1.233502] Process kworker/0:1 (pid: 586, threadinfo ffff88000ee06000, task ffff88000edbbe80)
>> [    1.233508] Stack:
>> [    1.233511]  ffff88000ee07b30 ffffffff8107a72a ffff88000ee07b40 ffffffff8107a743
>> [    1.233522]  ffff88000ee07b50 ffffffff81575250 ffff88000ee07b80 ffffffff815779c7
>> [    1.233533]  ffffffff81e10500 00000000000000df 0000000000000020 ffffffff82729c80
>> [    1.233545] Call Trace:
>> [    1.233550]  [<ffffffff8107a72a>] queue_work+0x1a/0x20
>> [    1.233556]  [<ffffffff8107a743>] schedule_work+0x13/0x20
>> [    1.233564]  [<ffffffff81575250>] rtc_update_irq+0x10/0x20
>> [    1.233571]  [<ffffffff815779c7>] cmos_checkintr+0x67/0x70
>> [    1.233577]  [<ffffffff81577a1d>] cmos_irq_disable+0x4d/0x60
>> [    1.233583]  [<ffffffff81578ad1>] ? cmos_set_alarm+0xc1/0x220
>> [    1.234342]  [<ffffffff81578ade>] cmos_set_alarm+0xce/0x220
>> [    1.234342]  [<ffffffff81574c43>] ? rtc_time_to_tm+0xe3/0x1b0
>> [    1.234342]  [<ffffffff8157541b>] __rtc_set_alarm+0x9b/0xa0
>> [    1.234342]  [<ffffffff81575899>] rtc_timer_do_work+0x1c9/0x1e0
>> [    1.234342]  [<ffffffff81096127>] ? lock_acquire+0x97/0xb0
>> [    1.234342]  [<ffffffff81079d20>] process_one_work+0x190/0x450
>> [    1.234342]  [<ffffffff81079cbf>] ? process_one_work+0x12f/0x450
>> [    1.234342]  [<ffffffff815756d0>] ? rtc_timer_start+0x80/0x80
>> [    1.234342]  [<ffffffff8107cb21>] worker_thread+0x171/0x3a0
>> [    1.234342]  [<ffffffff8107c9b0>] ? manage_workers+0x210/0x210
>> [    1.234342]  [<ffffffff81081526>] kthread+0x96/0xa0
>> [    1.234342]  [<ffffffff818ed774>] kernel_thread_helper+0x4/0x10
>> [    1.234342]  [<ffffffff818eb7f8>] ? int_ret_from_sys_call+0x7/0x1b
>> [    1.234342]  [<ffffffff818e4e45>] ? retint_restore_args+0x5/0x6
>> [    1.234342]  [<ffffffff818ed770>] ? gs_change+0x13/0x13
>> [    1.234342] Code: 48 89 e5 48 89 ce 40 80 e6 00 83 e1 04 48 0f 45 c6 48 8b 70 08 65 8b 3c 25 b0 d9 00 00 e8 65 fc ff ff c9 c3 0f 1f 00 55 48 89 e5 <3e> 0f ba 2a 00 19 c9 31 c0 85 c9 74 07 c9 c3 0f 1f 44 00 00 e8
>> [    1.234342] RIP  [<ffffffff8107a6c4>] queue_work_on+0x4/0x30
>> [    1.234342]  RSP <ffff88000ee07b20>
>> [    1.234342] CR2: 0000000000000598
>> [    1.234342] ---[ end trace e13f105b060373ec ]---
>> [    1.277121] BUG: unable to handle kernel paging request at fffffffffffffff8
>> [    1.277130] IP: [<ffffffff81080f8b>] kthread_data+0xb/0x20
>> [    1.277138] PGD 1e07067 PUD 1e08067 PMD 0
>> [    1.277147] Oops: 0000 [#2] SMP
>> [    1.277153] CPU 0
>> [    1.277156] Modules linked in:
>> [    1.277162]
>> [    1.277166] Pid: 586, comm: kworker/0:1 Tainted: G      D      3.2.0-rc7+ #1
>> [    1.277175] RIP: e030:[<ffffffff81080f8b>]  [<ffffffff81080f8b>] kthread_data+0xb/0x20
>> [    1.277184] RSP: e02b:ffff88000ee07708  EFLAGS: 00010096
>> [    1.277189] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
>> [    1.278053] RDX: ffff88000ffe7100 RSI: 0000000000000000 RDI: ffff88000edbbe80
>> [    1.278053] RBP: ffff88000ee07708 R08: ffff88000edbbef0 R09: 0000000000000001
>> [    1.278053] R10: 0000000000000800 R11: 0000000000000000 R12: ffff88000edbc1f0
>> [    1.278053] R13: 0000000000000000 R14: 0000000000000000 R15: ffff88000ee07840
>> [    1.278053] FS:  0000000000000000(0000) GS:ffff88000ffd5000(0000) knlGS:0000000000000000
>> [    1.278053] CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
>> [    1.278053] CR2: fffffffffffffff8 CR3: 0000000001e05000 CR4: 0000000000000660
>> [    1.278053] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> [    1.278053] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>> [    1.278053] Process kworker/0:1 (pid: 586, threadinfo ffff88000ee06000, task ffff88000edbbe80)
>> [    1.278053] Stack:
>> [    1.278053]  ffff88000ee07728 ffffffff8107b050 ffff88000ee07728 ffff88000ffe7100
>> [    1.278053]  ffff88000ee077c8 ffffffff818e1d7c ffffffff81066d79 0000000000000000
>> [    1.278053]  ffff88000edbbe80 0000000000012100 ffff88000ee07fd8 ffff88000ee06010
>> [    1.278053] Call Trace:
>> [    1.278053]  [<ffffffff8107b050>] wq_worker_sleeping+0x10/0xa0
>> [    1.278053]  [<ffffffff818e1d7c>] __schedule+0x54c/0x8b0
>> [    1.278053]  [<ffffffff81066d79>] ? do_exit+0x519/0x850
>> [    1.278053]  [<ffffffff810087af>] ? xen_restore_fl_direct_reloc+0x4/0x4
>> [    1.278053]  [<ffffffff818e23ea>] schedule+0x3a/0x60
>> [    1.278053]  [<ffffffff81066def>] do_exit+0x58f/0x850
>> [    1.278053]  [<ffffffff81063d3d>] ? kmsg_dump+0xfd/0x140
>> [    1.278053]  [<ffffffff818e5b67>] oops_end+0xc7/0x120
>> [    1.278053]  [<ffffffff810640ff>] ? console_unlock+0x21f/0x290
>> [    1.278053]  [<ffffffff81036285>] no_context+0xf5/0x270
>> [    1.278053]  [<ffffffff8103654d>] __bad_area_nosemaphore+0x14d/0x220
>> [    1.278053]  [<ffffffff8103662e>] bad_area_nosemaphore+0xe/0x10
>> [    1.278053]  [<ffffffff818e8826>] do_page_fault+0x336/0x490
>> [    1.278053]  [<ffffffff81007fed>] ? xen_force_evtchn_callback+0xd/0x10
>> [    1.278053]  [<ffffffff810087c2>] ? check_events+0x12/0x20
>> [    1.278053]  [<ffffffff818e50b5>] page_fault+0x25/0x30
>> [    1.278053]  [<ffffffff8107a6c4>] ? queue_work_on+0x4/0x30
>> [    1.278053]  [<ffffffff8107a72a>] queue_work+0x1a/0x20
>> [    1.278053]  [<ffffffff8107a743>] schedule_work+0x13/0x20
>> [    1.278053]  [<ffffffff81575250>] rtc_update_irq+0x10/0x20
>> [    1.278053]  [<ffffffff815779c7>] cmos_checkintr+0x67/0x70
>> [    1.278053]  [<ffffffff81577a1d>] cmos_irq_disable+0x4d/0x60
>> [    1.278053]  [<ffffffff81578ad1>] ? cmos_set_alarm+0xc1/0x220
>> [    1.278053]  [<ffffffff81578ade>] cmos_set_alarm+0xce/0x220
>> [    1.278053]  [<ffffffff81574c43>] ? rtc_time_to_tm+0xe3/0x1b0
>> [    1.278053]  [<ffffffff8157541b>] __rtc_set_alarm+0x9b/0xa0
>> [    1.278053]  [<ffffffff81575899>] rtc_timer_do_work+0x1c9/0x1e0
>> [    1.278053]  [<ffffffff81096127>] ? lock_acquire+0x97/0xb0
>> [    1.278053]  [<ffffffff81079d20>] process_one_work+0x190/0x450
>> [    1.278053]  [<ffffffff81079cbf>] ? process_one_work+0x12f/0x450
>> [    1.278053]  [<ffffffff815756d0>] ? rtc_timer_start+0x80/0x80
>> [    1.278053]  [<ffffffff8107cb21>] worker_thread+0x171/0x3a0
>> [    1.278053]  [<ffffffff8107c9b0>] ? manage_workers+0x210/0x210
>> [    1.278053]  [<ffffffff81081526>] kthread+0x96/0xa0
>> [    1.278053]  [<ffffffff818ed774>] kernel_thread_helper+0x4/0x10
>> [    1.278053]  [<ffffffff818eb7f8>] ? int_ret_from_sys_call+0x7/0x1b
>> [    1.278053]  [<ffffffff818e4e45>] ? retint_restore_args+0x5/0x6
>> [    1.278053]  [<ffffffff818ed770>] ? gs_change+0x13/0x13
>> [    1.278053] Code: 55 65 48 8b 04 25 40 c4 00 00 48 8b 80 18 03 00 00 48 89 e5 8b 40 f0 c9 c3 0f 1f 80 00 00 00 00 48 8b 87 18 03 00 00 55 48 89 e5 <48> 8b 40 f8 c9 c3 66 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00
>> [    1.278053] RIP  [<ffffffff81080f8b>] kthread_data+0xb/0x20
>> [    1.278053]  RSP <ffff88000ee07708>
>> [    1.278053] CR2: fffffffffffffff8
>> [    1.278053] ---[ end trace e13f105b060373ed ]---
>> [    1.278053] Fixing recursive fault but reboot is needed!
>> 
>> --
>> Sander



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Regression: ONE CPU fails bootup at Re: [3.2.0-RC7] BUG: unable to handle kernel NULL pointer dereference at 0000000000000598 [    1.478005] IP: [<ffffffff8107a6c4>] queue_work_on+0x4/0x30
  2012-01-03 19:26   ` Stefan Bader
@ 2012-01-03 20:11     ` Sander Eikelenboom
  0 siblings, 0 replies; 26+ messages in thread
From: Sander Eikelenboom @ 2012-01-03 20:11 UTC (permalink / raw)
  To: Stefan Bader
  Cc: Konrad Rzeszutek Wilk, neilb, john.stultz, rjw, Thomas Gleixner,
	linux-kernel

Tuesday, January 3, 2012, 8:26:26 PM, you wrote:

> On 03.01.2012 20:07, Konrad Rzeszutek Wilk wrote:
>> On Tue, Jan 03, 2012 at 05:13:51PM +0100, Sander Eikelenboom wrote:
>>> Hi all,
>>>
>>> While trying a vanilla 3.2.0-rc7+ kernel (commit 115e8e705e4be071b9e06ff72578e3b603f2ba65) as host and guest kernels under Xen:
>>>
>>> The kernels only boot when a guest has MORE than 1 cpu, with ONE CPU it gives this stacktrace:
>> 

> One detail I noticed when booting with more than 1 cpu for the guest (64bit) was
> that there seemed to be a relative huge delay (about 90s) between the message on
> registering rtc0 and the one announcing the probe failed with -ENOSYS (which it
> does always). So while not crashing, something seemed wrong in that case as well.

> -Stefan

Ah ok, missed seeing that, thanks for the patch !

--
Sander



>> Yikes. So without the 115e8e705e4be071b9e06ff72578e3b603f2ba65 it boots right? So
>> regression? Lets CC Rafeal.
>> 
>> But the git commit:
>> 
>> ommit 115e8e705e4be071b9e06ff72578e3b603f2ba65
>> Merge: 733bbb7 f88e1ae
>> Author: Linus Torvalds <torvalds@linux-foundation.org>
>> Date:   Mon Jan 2 12:34:03 2012 -0800
>> 
>>     Merge branch 'devicetree/merge' of git://git.secretlab.ca/git/linux-2.6
>>     
>>     * 'devicetree/merge' of git://git.secretlab.ca/git/linux-2.6:
>>       dt/device: Fix auxdata matching to handle entries without a name override
>> 
>> Looks to be unrelated? Or is there some other bug?
>> 
>> Is this by any chance related to "rtc: Expire alarms after the time is set."
>> (93b2ec0128c431148b216b8f7337c1a52131ef03) which breaks Amazon EC2 instances?
>> 
>> I think Stefan had a patch for this..
>> 
>> Ah, please see attached file.
>> 
>> Sander, could you please do two tests:
>> 
>>  1). Revert the 93b2ec0128c431148b216b8f7337c1a52131ef03 and see if that fixes it
>>  2). Use the latest linus/master and try the attached patch?
>> 
>> Thanks!
>> 
>>>
>>> [    1.074218] i8042: No controller found
>>> [    1.074510] mousedev: PS/2 mouse device common for all mice
>>> [    1.233365] BUG: unable to handle kernel NULL pointer dereference at 0000000000000598
>>> [    1.233382] IP: [<ffffffff8107a6c4>] queue_work_on+0x4/0x30
>>> [    1.233394] PGD 0
>>> [    1.233399] Oops: 0002 [#1] SMP
>>> [    1.233406] CPU 0
>>> [    1.233409] Modules linked in:
>>> [    1.233415]
>>> [    1.233419] Pid: 586, comm: kworker/0:1 Not tainted 3.2.0-rc7+ #1
>>> [    1.233427] RIP: e030:[<ffffffff8107a6c4>]  [<ffffffff8107a6c4>] queue_work_on+0x4/0x30
>>> [    1.233436] RSP: e02b:ffff88000ee07b20  EFLAGS: 00010002
>>> [    1.233441] RAX: ffff88000ecea000 RBX: ffffffff82729c80 RCX: 00005684b0256000
>>> [    1.233447] RDX: 0000000000000598 RSI: ffff88000ecea000 RDI: 0000000000000000
>>> [    1.233452] RBP: ffff88000ee07b20 R08: 0000000000000000 R09: 0000000000000001
>>> [    1.233458] R10: 0000000000000000 R11: 0000000000000000 R12: 00000000ffffffd0
>>> [    1.233464] R13: 00000000000000ff R14: 0000000000000023 R15: 0000000000000014
>>> [    1.233472] FS:  0000000000000000(0000) GS:ffff88000ffd5000(0000) knlGS:0000000000000000
>>> [    1.233479] CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
>>> [    1.233484] CR2: 0000000000000598 CR3: 0000000001e05000 CR4: 0000000000000660
>>> [    1.233490] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>>> [    1.233496] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>>> [    1.233502] Process kworker/0:1 (pid: 586, threadinfo ffff88000ee06000, task ffff88000edbbe80)
>>> [    1.233508] Stack:
>>> [    1.233511]  ffff88000ee07b30 ffffffff8107a72a ffff88000ee07b40 ffffffff8107a743
>>> [    1.233522]  ffff88000ee07b50 ffffffff81575250 ffff88000ee07b80 ffffffff815779c7
>>> [    1.233533]  ffffffff81e10500 00000000000000df 0000000000000020 ffffffff82729c80
>>> [    1.233545] Call Trace:
>>> [    1.233550]  [<ffffffff8107a72a>] queue_work+0x1a/0x20
>>> [    1.233556]  [<ffffffff8107a743>] schedule_work+0x13/0x20
>>> [    1.233564]  [<ffffffff81575250>] rtc_update_irq+0x10/0x20
>>> [    1.233571]  [<ffffffff815779c7>] cmos_checkintr+0x67/0x70
>>> [    1.233577]  [<ffffffff81577a1d>] cmos_irq_disable+0x4d/0x60
>>> [    1.233583]  [<ffffffff81578ad1>] ? cmos_set_alarm+0xc1/0x220
>>> [    1.234342]  [<ffffffff81578ade>] cmos_set_alarm+0xce/0x220
>>> [    1.234342]  [<ffffffff81574c43>] ? rtc_time_to_tm+0xe3/0x1b0
>>> [    1.234342]  [<ffffffff8157541b>] __rtc_set_alarm+0x9b/0xa0
>>> [    1.234342]  [<ffffffff81575899>] rtc_timer_do_work+0x1c9/0x1e0
>>> [    1.234342]  [<ffffffff81096127>] ? lock_acquire+0x97/0xb0
>>> [    1.234342]  [<ffffffff81079d20>] process_one_work+0x190/0x450
>>> [    1.234342]  [<ffffffff81079cbf>] ? process_one_work+0x12f/0x450
>>> [    1.234342]  [<ffffffff815756d0>] ? rtc_timer_start+0x80/0x80
>>> [    1.234342]  [<ffffffff8107cb21>] worker_thread+0x171/0x3a0
>>> [    1.234342]  [<ffffffff8107c9b0>] ? manage_workers+0x210/0x210
>>> [    1.234342]  [<ffffffff81081526>] kthread+0x96/0xa0
>>> [    1.234342]  [<ffffffff818ed774>] kernel_thread_helper+0x4/0x10
>>> [    1.234342]  [<ffffffff818eb7f8>] ? int_ret_from_sys_call+0x7/0x1b
>>> [    1.234342]  [<ffffffff818e4e45>] ? retint_restore_args+0x5/0x6
>>> [    1.234342]  [<ffffffff818ed770>] ? gs_change+0x13/0x13
>>> [    1.234342] Code: 48 89 e5 48 89 ce 40 80 e6 00 83 e1 04 48 0f 45 c6 48 8b 70 08 65 8b 3c 25 b0 d9 00 00 e8 65 fc ff ff c9 c3 0f 1f 00 55 48 89 e5 <3e> 0f ba 2a 00 19 c9 31 c0 85 c9 74 07 c9 c3 0f 1f 44 00 00 e8
>>> [    1.234342] RIP  [<ffffffff8107a6c4>] queue_work_on+0x4/0x30
>>> [    1.234342]  RSP <ffff88000ee07b20>
>>> [    1.234342] CR2: 0000000000000598
>>> [    1.234342] ---[ end trace e13f105b060373ec ]---
>>> [    1.277121] BUG: unable to handle kernel paging request at fffffffffffffff8
>>> [    1.277130] IP: [<ffffffff81080f8b>] kthread_data+0xb/0x20
>>> [    1.277138] PGD 1e07067 PUD 1e08067 PMD 0
>>> [    1.277147] Oops: 0000 [#2] SMP
>>> [    1.277153] CPU 0
>>> [    1.277156] Modules linked in:
>>> [    1.277162]
>>> [    1.277166] Pid: 586, comm: kworker/0:1 Tainted: G      D      3.2.0-rc7+ #1
>>> [    1.277175] RIP: e030:[<ffffffff81080f8b>]  [<ffffffff81080f8b>] kthread_data+0xb/0x20
>>> [    1.277184] RSP: e02b:ffff88000ee07708  EFLAGS: 00010096
>>> [    1.277189] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
>>> [    1.278053] RDX: ffff88000ffe7100 RSI: 0000000000000000 RDI: ffff88000edbbe80
>>> [    1.278053] RBP: ffff88000ee07708 R08: ffff88000edbbef0 R09: 0000000000000001
>>> [    1.278053] R10: 0000000000000800 R11: 0000000000000000 R12: ffff88000edbc1f0
>>> [    1.278053] R13: 0000000000000000 R14: 0000000000000000 R15: ffff88000ee07840
>>> [    1.278053] FS:  0000000000000000(0000) GS:ffff88000ffd5000(0000) knlGS:0000000000000000
>>> [    1.278053] CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
>>> [    1.278053] CR2: fffffffffffffff8 CR3: 0000000001e05000 CR4: 0000000000000660
>>> [    1.278053] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>>> [    1.278053] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>>> [    1.278053] Process kworker/0:1 (pid: 586, threadinfo ffff88000ee06000, task ffff88000edbbe80)
>>> [    1.278053] Stack:
>>> [    1.278053]  ffff88000ee07728 ffffffff8107b050 ffff88000ee07728 ffff88000ffe7100
>>> [    1.278053]  ffff88000ee077c8 ffffffff818e1d7c ffffffff81066d79 0000000000000000
>>> [    1.278053]  ffff88000edbbe80 0000000000012100 ffff88000ee07fd8 ffff88000ee06010
>>> [    1.278053] Call Trace:
>>> [    1.278053]  [<ffffffff8107b050>] wq_worker_sleeping+0x10/0xa0
>>> [    1.278053]  [<ffffffff818e1d7c>] __schedule+0x54c/0x8b0
>>> [    1.278053]  [<ffffffff81066d79>] ? do_exit+0x519/0x850
>>> [    1.278053]  [<ffffffff810087af>] ? xen_restore_fl_direct_reloc+0x4/0x4
>>> [    1.278053]  [<ffffffff818e23ea>] schedule+0x3a/0x60
>>> [    1.278053]  [<ffffffff81066def>] do_exit+0x58f/0x850
>>> [    1.278053]  [<ffffffff81063d3d>] ? kmsg_dump+0xfd/0x140
>>> [    1.278053]  [<ffffffff818e5b67>] oops_end+0xc7/0x120
>>> [    1.278053]  [<ffffffff810640ff>] ? console_unlock+0x21f/0x290
>>> [    1.278053]  [<ffffffff81036285>] no_context+0xf5/0x270
>>> [    1.278053]  [<ffffffff8103654d>] __bad_area_nosemaphore+0x14d/0x220
>>> [    1.278053]  [<ffffffff8103662e>] bad_area_nosemaphore+0xe/0x10
>>> [    1.278053]  [<ffffffff818e8826>] do_page_fault+0x336/0x490
>>> [    1.278053]  [<ffffffff81007fed>] ? xen_force_evtchn_callback+0xd/0x10
>>> [    1.278053]  [<ffffffff810087c2>] ? check_events+0x12/0x20
>>> [    1.278053]  [<ffffffff818e50b5>] page_fault+0x25/0x30
>>> [    1.278053]  [<ffffffff8107a6c4>] ? queue_work_on+0x4/0x30
>>> [    1.278053]  [<ffffffff8107a72a>] queue_work+0x1a/0x20
>>> [    1.278053]  [<ffffffff8107a743>] schedule_work+0x13/0x20
>>> [    1.278053]  [<ffffffff81575250>] rtc_update_irq+0x10/0x20
>>> [    1.278053]  [<ffffffff815779c7>] cmos_checkintr+0x67/0x70
>>> [    1.278053]  [<ffffffff81577a1d>] cmos_irq_disable+0x4d/0x60
>>> [    1.278053]  [<ffffffff81578ad1>] ? cmos_set_alarm+0xc1/0x220
>>> [    1.278053]  [<ffffffff81578ade>] cmos_set_alarm+0xce/0x220
>>> [    1.278053]  [<ffffffff81574c43>] ? rtc_time_to_tm+0xe3/0x1b0
>>> [    1.278053]  [<ffffffff8157541b>] __rtc_set_alarm+0x9b/0xa0
>>> [    1.278053]  [<ffffffff81575899>] rtc_timer_do_work+0x1c9/0x1e0
>>> [    1.278053]  [<ffffffff81096127>] ? lock_acquire+0x97/0xb0
>>> [    1.278053]  [<ffffffff81079d20>] process_one_work+0x190/0x450
>>> [    1.278053]  [<ffffffff81079cbf>] ? process_one_work+0x12f/0x450
>>> [    1.278053]  [<ffffffff815756d0>] ? rtc_timer_start+0x80/0x80
>>> [    1.278053]  [<ffffffff8107cb21>] worker_thread+0x171/0x3a0
>>> [    1.278053]  [<ffffffff8107c9b0>] ? manage_workers+0x210/0x210
>>> [    1.278053]  [<ffffffff81081526>] kthread+0x96/0xa0
>>> [    1.278053]  [<ffffffff818ed774>] kernel_thread_helper+0x4/0x10
>>> [    1.278053]  [<ffffffff818eb7f8>] ? int_ret_from_sys_call+0x7/0x1b
>>> [    1.278053]  [<ffffffff818e4e45>] ? retint_restore_args+0x5/0x6
>>> [    1.278053]  [<ffffffff818ed770>] ? gs_change+0x13/0x13
>>> [    1.278053] Code: 55 65 48 8b 04 25 40 c4 00 00 48 8b 80 18 03 00 00 48 89 e5 8b 40 f0 c9 c3 0f 1f 80 00 00 00 00 48 8b 87 18 03 00 00 55 48 89 e5 <48> 8b 40 f8 c9 c3 66 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00
>>> [    1.278053] RIP  [<ffffffff81080f8b>] kthread_data+0xb/0x20
>>> [    1.278053]  RSP <ffff88000ee07708>
>>> [    1.278053] CR2: fffffffffffffff8
>>> [    1.278053] ---[ end trace e13f105b060373ed ]---
>>> [    1.278053] Fixing recursive fault but reboot is needed!
>>>
>>> --
>>> Sander



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Regression: ONE CPU fails bootup at Re: [3.2.0-RC7] BUG: unable to handle kernel NULL pointer dereference at 0000000000000598 [    1.478005] IP: [<ffffffff8107a6c4>] queue_work_on+0x4/0x30
  2012-01-03 20:10   ` Sander Eikelenboom
@ 2012-01-03 22:33     ` Konrad Rzeszutek Wilk
  0 siblings, 0 replies; 26+ messages in thread
From: Konrad Rzeszutek Wilk @ 2012-01-03 22:33 UTC (permalink / raw)
  To: Sander Eikelenboom
  Cc: neilb, john.stultz, stefan.bader, rjw, Thomas Gleixner, linux-kernel

On Tue, Jan 03, 2012 at 09:10:05PM +0100, Sander Eikelenboom wrote:
> Tuesday, January 3, 2012, 8:07:54 PM, you wrote:
> 
> > On Tue, Jan 03, 2012 at 05:13:51PM +0100, Sander Eikelenboom wrote:
> >> Hi all,
> >> 
> >> While trying a vanilla 3.2.0-rc7+ kernel (commit 115e8e705e4be071b9e06ff72578e3b603f2ba65) as host and guest kernels under Xen:
> >> 
> >> The kernels only boot when a guest has MORE than 1 cpu, with ONE CPU it gives this stacktrace:
> 
> > Yikes. So without the 115e8e705e4be071b9e06ff72578e3b603f2ba65 it boots right? So
> > regression? Lets CC Rafeal.
> 
> > But the git commit:
> 
> > ommit 115e8e705e4be071b9e06ff72578e3b603f2ba65
> > Merge: 733bbb7 f88e1ae
> > Author: Linus Torvalds <torvalds@linux-foundation.org>
> > Date:   Mon Jan 2 12:34:03 2012 -0800
> 
> >     Merge branch 'devicetree/merge' of git://git.secretlab.ca/git/linux-2.6
> >     
> >     * 'devicetree/merge' of git://git.secretlab.ca/git/linux-2.6:
> >       dt/device: Fix auxdata matching to handle entries without a name override
> 
> > Looks to be unrelated? Or is there some other bug?
> 
> > Is this by any chance related to "rtc: Expire alarms after the time is set."
> > (93b2ec0128c431148b216b8f7337c1a52131ef03) which breaks Amazon EC2 instances?
> 
> > I think Stefan had a patch for this..
> 
> > Ah, please see attached file.
> 
> > Sander, could you please do two tests:
> 
> >  1). Revert the 93b2ec0128c431148b216b8f7337c1a52131ef03 and see if that fixes it
> >  2). Use the latest linus/master and try the attached patch?
> 
> Both 1 and 2 make the problem disappear !
> (no panic and no long delay on boot seen on 64bit with 1 and multiple cpu's)

Excellent. Can you also send me your guest config please? I am not able to reproduce this myself :-(

> 
> Thanks !
> 
> --
> Sander
> 
> > Thanks!
> 
> >> 
> >> [    1.074218] i8042: No controller found
> >> [    1.074510] mousedev: PS/2 mouse device common for all mice
> >> [    1.233365] BUG: unable to handle kernel NULL pointer dereference at 0000000000000598
> >> [    1.233382] IP: [<ffffffff8107a6c4>] queue_work_on+0x4/0x30
> >> [    1.233394] PGD 0
> >> [    1.233399] Oops: 0002 [#1] SMP
> >> [    1.233406] CPU 0
> >> [    1.233409] Modules linked in:
> >> [    1.233415]
> >> [    1.233419] Pid: 586, comm: kworker/0:1 Not tainted 3.2.0-rc7+ #1
> >> [    1.233427] RIP: e030:[<ffffffff8107a6c4>]  [<ffffffff8107a6c4>] queue_work_on+0x4/0x30
> >> [    1.233436] RSP: e02b:ffff88000ee07b20  EFLAGS: 00010002
> >> [    1.233441] RAX: ffff88000ecea000 RBX: ffffffff82729c80 RCX: 00005684b0256000
> >> [    1.233447] RDX: 0000000000000598 RSI: ffff88000ecea000 RDI: 0000000000000000
> >> [    1.233452] RBP: ffff88000ee07b20 R08: 0000000000000000 R09: 0000000000000001
> >> [    1.233458] R10: 0000000000000000 R11: 0000000000000000 R12: 00000000ffffffd0
> >> [    1.233464] R13: 00000000000000ff R14: 0000000000000023 R15: 0000000000000014
> >> [    1.233472] FS:  0000000000000000(0000) GS:ffff88000ffd5000(0000) knlGS:0000000000000000
> >> [    1.233479] CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
> >> [    1.233484] CR2: 0000000000000598 CR3: 0000000001e05000 CR4: 0000000000000660
> >> [    1.233490] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> >> [    1.233496] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> >> [    1.233502] Process kworker/0:1 (pid: 586, threadinfo ffff88000ee06000, task ffff88000edbbe80)
> >> [    1.233508] Stack:
> >> [    1.233511]  ffff88000ee07b30 ffffffff8107a72a ffff88000ee07b40 ffffffff8107a743
> >> [    1.233522]  ffff88000ee07b50 ffffffff81575250 ffff88000ee07b80 ffffffff815779c7
> >> [    1.233533]  ffffffff81e10500 00000000000000df 0000000000000020 ffffffff82729c80
> >> [    1.233545] Call Trace:
> >> [    1.233550]  [<ffffffff8107a72a>] queue_work+0x1a/0x20
> >> [    1.233556]  [<ffffffff8107a743>] schedule_work+0x13/0x20
> >> [    1.233564]  [<ffffffff81575250>] rtc_update_irq+0x10/0x20
> >> [    1.233571]  [<ffffffff815779c7>] cmos_checkintr+0x67/0x70
> >> [    1.233577]  [<ffffffff81577a1d>] cmos_irq_disable+0x4d/0x60
> >> [    1.233583]  [<ffffffff81578ad1>] ? cmos_set_alarm+0xc1/0x220
> >> [    1.234342]  [<ffffffff81578ade>] cmos_set_alarm+0xce/0x220
> >> [    1.234342]  [<ffffffff81574c43>] ? rtc_time_to_tm+0xe3/0x1b0
> >> [    1.234342]  [<ffffffff8157541b>] __rtc_set_alarm+0x9b/0xa0
> >> [    1.234342]  [<ffffffff81575899>] rtc_timer_do_work+0x1c9/0x1e0
> >> [    1.234342]  [<ffffffff81096127>] ? lock_acquire+0x97/0xb0
> >> [    1.234342]  [<ffffffff81079d20>] process_one_work+0x190/0x450
> >> [    1.234342]  [<ffffffff81079cbf>] ? process_one_work+0x12f/0x450
> >> [    1.234342]  [<ffffffff815756d0>] ? rtc_timer_start+0x80/0x80
> >> [    1.234342]  [<ffffffff8107cb21>] worker_thread+0x171/0x3a0
> >> [    1.234342]  [<ffffffff8107c9b0>] ? manage_workers+0x210/0x210
> >> [    1.234342]  [<ffffffff81081526>] kthread+0x96/0xa0
> >> [    1.234342]  [<ffffffff818ed774>] kernel_thread_helper+0x4/0x10
> >> [    1.234342]  [<ffffffff818eb7f8>] ? int_ret_from_sys_call+0x7/0x1b
> >> [    1.234342]  [<ffffffff818e4e45>] ? retint_restore_args+0x5/0x6
> >> [    1.234342]  [<ffffffff818ed770>] ? gs_change+0x13/0x13
> >> [    1.234342] Code: 48 89 e5 48 89 ce 40 80 e6 00 83 e1 04 48 0f 45 c6 48 8b 70 08 65 8b 3c 25 b0 d9 00 00 e8 65 fc ff ff c9 c3 0f 1f 00 55 48 89 e5 <3e> 0f ba 2a 00 19 c9 31 c0 85 c9 74 07 c9 c3 0f 1f 44 00 00 e8
> >> [    1.234342] RIP  [<ffffffff8107a6c4>] queue_work_on+0x4/0x30
> >> [    1.234342]  RSP <ffff88000ee07b20>
> >> [    1.234342] CR2: 0000000000000598
> >> [    1.234342] ---[ end trace e13f105b060373ec ]---
> >> [    1.277121] BUG: unable to handle kernel paging request at fffffffffffffff8
> >> [    1.277130] IP: [<ffffffff81080f8b>] kthread_data+0xb/0x20
> >> [    1.277138] PGD 1e07067 PUD 1e08067 PMD 0
> >> [    1.277147] Oops: 0000 [#2] SMP
> >> [    1.277153] CPU 0
> >> [    1.277156] Modules linked in:
> >> [    1.277162]
> >> [    1.277166] Pid: 586, comm: kworker/0:1 Tainted: G      D      3.2.0-rc7+ #1
> >> [    1.277175] RIP: e030:[<ffffffff81080f8b>]  [<ffffffff81080f8b>] kthread_data+0xb/0x20
> >> [    1.277184] RSP: e02b:ffff88000ee07708  EFLAGS: 00010096
> >> [    1.277189] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
> >> [    1.278053] RDX: ffff88000ffe7100 RSI: 0000000000000000 RDI: ffff88000edbbe80
> >> [    1.278053] RBP: ffff88000ee07708 R08: ffff88000edbbef0 R09: 0000000000000001
> >> [    1.278053] R10: 0000000000000800 R11: 0000000000000000 R12: ffff88000edbc1f0
> >> [    1.278053] R13: 0000000000000000 R14: 0000000000000000 R15: ffff88000ee07840
> >> [    1.278053] FS:  0000000000000000(0000) GS:ffff88000ffd5000(0000) knlGS:0000000000000000
> >> [    1.278053] CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
> >> [    1.278053] CR2: fffffffffffffff8 CR3: 0000000001e05000 CR4: 0000000000000660
> >> [    1.278053] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> >> [    1.278053] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> >> [    1.278053] Process kworker/0:1 (pid: 586, threadinfo ffff88000ee06000, task ffff88000edbbe80)
> >> [    1.278053] Stack:
> >> [    1.278053]  ffff88000ee07728 ffffffff8107b050 ffff88000ee07728 ffff88000ffe7100
> >> [    1.278053]  ffff88000ee077c8 ffffffff818e1d7c ffffffff81066d79 0000000000000000
> >> [    1.278053]  ffff88000edbbe80 0000000000012100 ffff88000ee07fd8 ffff88000ee06010
> >> [    1.278053] Call Trace:
> >> [    1.278053]  [<ffffffff8107b050>] wq_worker_sleeping+0x10/0xa0
> >> [    1.278053]  [<ffffffff818e1d7c>] __schedule+0x54c/0x8b0
> >> [    1.278053]  [<ffffffff81066d79>] ? do_exit+0x519/0x850
> >> [    1.278053]  [<ffffffff810087af>] ? xen_restore_fl_direct_reloc+0x4/0x4
> >> [    1.278053]  [<ffffffff818e23ea>] schedule+0x3a/0x60
> >> [    1.278053]  [<ffffffff81066def>] do_exit+0x58f/0x850
> >> [    1.278053]  [<ffffffff81063d3d>] ? kmsg_dump+0xfd/0x140
> >> [    1.278053]  [<ffffffff818e5b67>] oops_end+0xc7/0x120
> >> [    1.278053]  [<ffffffff810640ff>] ? console_unlock+0x21f/0x290
> >> [    1.278053]  [<ffffffff81036285>] no_context+0xf5/0x270
> >> [    1.278053]  [<ffffffff8103654d>] __bad_area_nosemaphore+0x14d/0x220
> >> [    1.278053]  [<ffffffff8103662e>] bad_area_nosemaphore+0xe/0x10
> >> [    1.278053]  [<ffffffff818e8826>] do_page_fault+0x336/0x490
> >> [    1.278053]  [<ffffffff81007fed>] ? xen_force_evtchn_callback+0xd/0x10
> >> [    1.278053]  [<ffffffff810087c2>] ? check_events+0x12/0x20
> >> [    1.278053]  [<ffffffff818e50b5>] page_fault+0x25/0x30
> >> [    1.278053]  [<ffffffff8107a6c4>] ? queue_work_on+0x4/0x30
> >> [    1.278053]  [<ffffffff8107a72a>] queue_work+0x1a/0x20
> >> [    1.278053]  [<ffffffff8107a743>] schedule_work+0x13/0x20
> >> [    1.278053]  [<ffffffff81575250>] rtc_update_irq+0x10/0x20
> >> [    1.278053]  [<ffffffff815779c7>] cmos_checkintr+0x67/0x70
> >> [    1.278053]  [<ffffffff81577a1d>] cmos_irq_disable+0x4d/0x60
> >> [    1.278053]  [<ffffffff81578ad1>] ? cmos_set_alarm+0xc1/0x220
> >> [    1.278053]  [<ffffffff81578ade>] cmos_set_alarm+0xce/0x220
> >> [    1.278053]  [<ffffffff81574c43>] ? rtc_time_to_tm+0xe3/0x1b0
> >> [    1.278053]  [<ffffffff8157541b>] __rtc_set_alarm+0x9b/0xa0
> >> [    1.278053]  [<ffffffff81575899>] rtc_timer_do_work+0x1c9/0x1e0
> >> [    1.278053]  [<ffffffff81096127>] ? lock_acquire+0x97/0xb0
> >> [    1.278053]  [<ffffffff81079d20>] process_one_work+0x190/0x450
> >> [    1.278053]  [<ffffffff81079cbf>] ? process_one_work+0x12f/0x450
> >> [    1.278053]  [<ffffffff815756d0>] ? rtc_timer_start+0x80/0x80
> >> [    1.278053]  [<ffffffff8107cb21>] worker_thread+0x171/0x3a0
> >> [    1.278053]  [<ffffffff8107c9b0>] ? manage_workers+0x210/0x210
> >> [    1.278053]  [<ffffffff81081526>] kthread+0x96/0xa0
> >> [    1.278053]  [<ffffffff818ed774>] kernel_thread_helper+0x4/0x10
> >> [    1.278053]  [<ffffffff818eb7f8>] ? int_ret_from_sys_call+0x7/0x1b
> >> [    1.278053]  [<ffffffff818e4e45>] ? retint_restore_args+0x5/0x6
> >> [    1.278053]  [<ffffffff818ed770>] ? gs_change+0x13/0x13
> >> [    1.278053] Code: 55 65 48 8b 04 25 40 c4 00 00 48 8b 80 18 03 00 00 48 89 e5 8b 40 f0 c9 c3 0f 1f 80 00 00 00 00 48 8b 87 18 03 00 00 55 48 89 e5 <48> 8b 40 f8 c9 c3 66 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00
> >> [    1.278053] RIP  [<ffffffff81080f8b>] kthread_data+0xb/0x20
> >> [    1.278053]  RSP <ffff88000ee07708>
> >> [    1.278053] CR2: fffffffffffffff8
> >> [    1.278053] ---[ end trace e13f105b060373ed ]---
> >> [    1.278053] Fixing recursive fault but reboot is needed!
> >> 
> >> --
> >> Sander
> 

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Regression: ONE CPU fails bootup at Re: [3.2.0-RC7] BUG: unable to handle kernel NULL pointer dereference at 0000000000000598 [    1.478005] IP: [<ffffffff8107a6c4>] queue_work_on+0x4/0x30
  2012-01-03 19:07 ` Regression: ONE CPU fails bootup at " Konrad Rzeszutek Wilk
                     ` (2 preceding siblings ...)
  2012-01-03 20:10   ` Sander Eikelenboom
@ 2012-01-03 23:09   ` John Stultz
  2012-01-04  0:31     ` NeilBrown
  3 siblings, 1 reply; 26+ messages in thread
From: John Stultz @ 2012-01-03 23:09 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: Sander Eikelenboom, neilb, stefan.bader, rjw, Thomas Gleixner,
	linux-kernel

On Tue, 2012-01-03 at 14:07 -0500, Konrad Rzeszutek Wilk wrote:
> On Tue, Jan 03, 2012 at 05:13:51PM +0100, Sander Eikelenboom wrote:
> > Hi all,
> > 
> > While trying a vanilla 3.2.0-rc7+ kernel (commit 115e8e705e4be071b9e06ff72578e3b603f2ba65) as host and guest kernels under Xen:
> > 
> > The kernels only boot when a guest has MORE than 1 cpu, with ONE CPU it gives this stacktrace:
> 
[snip]
> Is this by any chance related to "rtc: Expire alarms after the time is set."
> (93b2ec0128c431148b216b8f7337c1a52131ef03) which breaks Amazon EC2 instances?
> 
[snip]
> > 
> > [    1.074218] i8042: No controller found
> > [    1.074510] mousedev: PS/2 mouse device common for all mice
> > [    1.233365] BUG: unable to handle kernel NULL pointer dereference at 0000000000000598
> > [    1.233382] IP: [<ffffffff8107a6c4>] queue_work_on+0x4/0x30
> > [    1.233394] PGD 0
> > [    1.233399] Oops: 0002 [#1] SMP
> > [    1.233406] CPU 0
> > [    1.233409] Modules linked in:
> > [    1.233415]
> > [    1.233419] Pid: 586, comm: kworker/0:1 Not tainted 3.2.0-rc7+ #1
> > [    1.233427] RIP: e030:[<ffffffff8107a6c4>]  [<ffffffff8107a6c4>] queue_work_on+0x4/0x30
> > [    1.233436] RSP: e02b:ffff88000ee07b20  EFLAGS: 00010002
> > [    1.233441] RAX: ffff88000ecea000 RBX: ffffffff82729c80 RCX: 00005684b0256000
> > [    1.233447] RDX: 0000000000000598 RSI: ffff88000ecea000 RDI: 0000000000000000
> > [    1.233452] RBP: ffff88000ee07b20 R08: 0000000000000000 R09: 0000000000000001
> > [    1.233458] R10: 0000000000000000 R11: 0000000000000000 R12: 00000000ffffffd0
> > [    1.233464] R13: 00000000000000ff R14: 0000000000000023 R15: 0000000000000014
> > [    1.233472] FS:  0000000000000000(0000) GS:ffff88000ffd5000(0000) knlGS:0000000000000000
> > [    1.233479] CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
> > [    1.233484] CR2: 0000000000000598 CR3: 0000000001e05000 CR4: 0000000000000660
> > [    1.233490] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > [    1.233496] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> > [    1.233502] Process kworker/0:1 (pid: 586, threadinfo ffff88000ee06000, task ffff88000edbbe80)
> > [    1.233508] Stack:
> > [    1.233511]  ffff88000ee07b30 ffffffff8107a72a ffff88000ee07b40 ffffffff8107a743
> > [    1.233522]  ffff88000ee07b50 ffffffff81575250 ffff88000ee07b80 ffffffff815779c7
> > [    1.233533]  ffffffff81e10500 00000000000000df 0000000000000020 ffffffff82729c80
> > [    1.233545] Call Trace:
> > [    1.233550]  [<ffffffff8107a72a>] queue_work+0x1a/0x20
> > [    1.233556]  [<ffffffff8107a743>] schedule_work+0x13/0x20
> > [    1.233564]  [<ffffffff81575250>] rtc_update_irq+0x10/0x20
> > [    1.233571]  [<ffffffff815779c7>] cmos_checkintr+0x67/0x70
> > [    1.233577]  [<ffffffff81577a1d>] cmos_irq_disable+0x4d/0x60
> > [    1.233583]  [<ffffffff81578ad1>] ? cmos_set_alarm+0xc1/0x220
> > [    1.234342]  [<ffffffff81578ade>] cmos_set_alarm+0xce/0x220
> > [    1.234342]  [<ffffffff81574c43>] ? rtc_time_to_tm+0xe3/0x1b0
> > [    1.234342]  [<ffffffff8157541b>] __rtc_set_alarm+0x9b/0xa0
> > [    1.234342]  [<ffffffff81575899>] rtc_timer_do_work+0x1c9/0x1e0
> > [    1.234342]  [<ffffffff81096127>] ? lock_acquire+0x97/0xb0
> > [    1.234342]  [<ffffffff81079d20>] process_one_work+0x190/0x450
> > [    1.234342]  [<ffffffff81079cbf>] ? process_one_work+0x12f/0x450
> > [    1.234342]  [<ffffffff815756d0>] ? rtc_timer_start+0x80/0x80
> > [    1.234342]  [<ffffffff8107cb21>] worker_thread+0x171/0x3a0
> > [    1.234342]  [<ffffffff8107c9b0>] ? manage_workers+0x210/0x210
> > [    1.234342]  [<ffffffff81081526>] kthread+0x96/0xa0
> > [    1.234342]  [<ffffffff818ed774>] kernel_thread_helper+0x4/0x10
> > [    1.234342]  [<ffffffff818eb7f8>] ? int_ret_from_sys_call+0x7/0x1b
> > [    1.234342]  [<ffffffff818e4e45>] ? retint_restore_args+0x5/0x6
> > [    1.234342]  [<ffffffff818ed770>] ? gs_change+0x13/0x13

Hey Konrad,
	So I'm looking at the patch, and its not obviously sticking out to me
why the patch you posted should resolve the issue.

Specifically, what is being done in the following that it fails before
and not after?

	strlcpy(rtc->name, name, RTC_DEVICE_NAME_SIZE);
	dev_set_name(&rtc->dev, "rtc%d", id);

	rtc_dev_prepare(rtc);

	err = device_register(&rtc->dev);
	if (err) {
		put_device(&rtc->dev);
		goto exit_kfree;
	}

	rtc_dev_add_device(rtc);
	rtc_sysfs_add_device(rtc);
	rtc_proc_add_device(rtc);


>From the stack trace, we've kicked off a rtc_timer_do_work, probably
from the rtc_initialize_alarm() schedule_work call added in Neil's
patch.  From there, we call __rtc_set_alarm -> cmos_set_alarm ->
cmos_rq_disable -> cmos_checkintr -> rtc_update_irq -> schedule_work.

So, what it looks to me is that in cmos_checkintr, we grab the cmos->rtc
and pass that along. Unfortunately, since the cmos->rtc value isn't set
until after rtc_device_register() returns its null at that point. So
your patch isn't really fixing the issue, but just reducing the race
window for the second cpu to schedule the work.

Sigh. I'd guess dropping the schedule_work call from
rtc_initialize_alarm() is the right approach (see below). When reviewing
Neil's patch it seemed like a good idea there, but it seems off to me
now.

Neil, any thoughts on the following? Can you expand on the condition you
were worried about in around that call?

thanks
-john

Drop schedule_work() call in rtc_initialize_alarm, as we're not finished
setting things up and can't handle triggering the alarm yet.

Signed-off-by: John Stultz <john.stultz@linaro.org>

diff --git a/drivers/rtc/interface.c b/drivers/rtc/interface.c
index 3bcc7cf..084a137 100644
--- a/drivers/rtc/interface.c
+++ b/drivers/rtc/interface.c
@@ -407,8 +407,6 @@ int rtc_initialize_alarm(struct rtc_device *rtc, struct rtc_wkalrm *alarm)
 		timerqueue_add(&rtc->timerqueue, &rtc->aie_timer.node);
 	}
 	mutex_unlock(&rtc->ops_lock);
-	/* maybe that was in the past.*/
-	schedule_work(&rtc->irqwork);
 	return err;
 }
 EXPORT_SYMBOL_GPL(rtc_initialize_alarm);





^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: Regression: ONE CPU fails bootup at Re: [3.2.0-RC7] BUG: unable to handle kernel NULL pointer dereference at 0000000000000598 [    1.478005] IP: [<ffffffff8107a6c4>] queue_work_on+0x4/0x30
  2012-01-03 23:09   ` John Stultz
@ 2012-01-04  0:31     ` NeilBrown
  2012-01-04  0:53       ` John Stultz
  0 siblings, 1 reply; 26+ messages in thread
From: NeilBrown @ 2012-01-04  0:31 UTC (permalink / raw)
  To: John Stultz
  Cc: Konrad Rzeszutek Wilk, Sander Eikelenboom, stefan.bader, rjw,
	Thomas Gleixner, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 7197 bytes --]

On Tue, 03 Jan 2012 15:09:48 -0800 John Stultz <john.stultz@linaro.org> wrote:

> On Tue, 2012-01-03 at 14:07 -0500, Konrad Rzeszutek Wilk wrote:
> > On Tue, Jan 03, 2012 at 05:13:51PM +0100, Sander Eikelenboom wrote:
> > > Hi all,
> > > 
> > > While trying a vanilla 3.2.0-rc7+ kernel (commit 115e8e705e4be071b9e06ff72578e3b603f2ba65) as host and guest kernels under Xen:
> > > 
> > > The kernels only boot when a guest has MORE than 1 cpu, with ONE CPU it gives this stacktrace:
> > 
> [snip]
> > Is this by any chance related to "rtc: Expire alarms after the time is set."
> > (93b2ec0128c431148b216b8f7337c1a52131ef03) which breaks Amazon EC2 instances?
> > 
> [snip]
> > > 
> > > [    1.074218] i8042: No controller found
> > > [    1.074510] mousedev: PS/2 mouse device common for all mice
> > > [    1.233365] BUG: unable to handle kernel NULL pointer dereference at 0000000000000598
> > > [    1.233382] IP: [<ffffffff8107a6c4>] queue_work_on+0x4/0x30
> > > [    1.233394] PGD 0
> > > [    1.233399] Oops: 0002 [#1] SMP
> > > [    1.233406] CPU 0
> > > [    1.233409] Modules linked in:
> > > [    1.233415]
> > > [    1.233419] Pid: 586, comm: kworker/0:1 Not tainted 3.2.0-rc7+ #1
> > > [    1.233427] RIP: e030:[<ffffffff8107a6c4>]  [<ffffffff8107a6c4>] queue_work_on+0x4/0x30
> > > [    1.233436] RSP: e02b:ffff88000ee07b20  EFLAGS: 00010002
> > > [    1.233441] RAX: ffff88000ecea000 RBX: ffffffff82729c80 RCX: 00005684b0256000
> > > [    1.233447] RDX: 0000000000000598 RSI: ffff88000ecea000 RDI: 0000000000000000
> > > [    1.233452] RBP: ffff88000ee07b20 R08: 0000000000000000 R09: 0000000000000001
> > > [    1.233458] R10: 0000000000000000 R11: 0000000000000000 R12: 00000000ffffffd0
> > > [    1.233464] R13: 00000000000000ff R14: 0000000000000023 R15: 0000000000000014
> > > [    1.233472] FS:  0000000000000000(0000) GS:ffff88000ffd5000(0000) knlGS:0000000000000000
> > > [    1.233479] CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
> > > [    1.233484] CR2: 0000000000000598 CR3: 0000000001e05000 CR4: 0000000000000660
> > > [    1.233490] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > > [    1.233496] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> > > [    1.233502] Process kworker/0:1 (pid: 586, threadinfo ffff88000ee06000, task ffff88000edbbe80)
> > > [    1.233508] Stack:
> > > [    1.233511]  ffff88000ee07b30 ffffffff8107a72a ffff88000ee07b40 ffffffff8107a743
> > > [    1.233522]  ffff88000ee07b50 ffffffff81575250 ffff88000ee07b80 ffffffff815779c7
> > > [    1.233533]  ffffffff81e10500 00000000000000df 0000000000000020 ffffffff82729c80
> > > [    1.233545] Call Trace:
> > > [    1.233550]  [<ffffffff8107a72a>] queue_work+0x1a/0x20
> > > [    1.233556]  [<ffffffff8107a743>] schedule_work+0x13/0x20
> > > [    1.233564]  [<ffffffff81575250>] rtc_update_irq+0x10/0x20
> > > [    1.233571]  [<ffffffff815779c7>] cmos_checkintr+0x67/0x70
> > > [    1.233577]  [<ffffffff81577a1d>] cmos_irq_disable+0x4d/0x60
> > > [    1.233583]  [<ffffffff81578ad1>] ? cmos_set_alarm+0xc1/0x220
> > > [    1.234342]  [<ffffffff81578ade>] cmos_set_alarm+0xce/0x220
> > > [    1.234342]  [<ffffffff81574c43>] ? rtc_time_to_tm+0xe3/0x1b0
> > > [    1.234342]  [<ffffffff8157541b>] __rtc_set_alarm+0x9b/0xa0
> > > [    1.234342]  [<ffffffff81575899>] rtc_timer_do_work+0x1c9/0x1e0
> > > [    1.234342]  [<ffffffff81096127>] ? lock_acquire+0x97/0xb0
> > > [    1.234342]  [<ffffffff81079d20>] process_one_work+0x190/0x450
> > > [    1.234342]  [<ffffffff81079cbf>] ? process_one_work+0x12f/0x450
> > > [    1.234342]  [<ffffffff815756d0>] ? rtc_timer_start+0x80/0x80
> > > [    1.234342]  [<ffffffff8107cb21>] worker_thread+0x171/0x3a0
> > > [    1.234342]  [<ffffffff8107c9b0>] ? manage_workers+0x210/0x210
> > > [    1.234342]  [<ffffffff81081526>] kthread+0x96/0xa0
> > > [    1.234342]  [<ffffffff818ed774>] kernel_thread_helper+0x4/0x10
> > > [    1.234342]  [<ffffffff818eb7f8>] ? int_ret_from_sys_call+0x7/0x1b
> > > [    1.234342]  [<ffffffff818e4e45>] ? retint_restore_args+0x5/0x6
> > > [    1.234342]  [<ffffffff818ed770>] ? gs_change+0x13/0x13
> 
> Hey Konrad,
> 	So I'm looking at the patch, and its not obviously sticking out to me
> why the patch you posted should resolve the issue.
> 
> Specifically, what is being done in the following that it fails before
> and not after?
> 
> 	strlcpy(rtc->name, name, RTC_DEVICE_NAME_SIZE);
> 	dev_set_name(&rtc->dev, "rtc%d", id);
> 
> 	rtc_dev_prepare(rtc);
> 
> 	err = device_register(&rtc->dev);
> 	if (err) {
> 		put_device(&rtc->dev);
> 		goto exit_kfree;
> 	}
> 
> 	rtc_dev_add_device(rtc);
> 	rtc_sysfs_add_device(rtc);
> 	rtc_proc_add_device(rtc);
> 
> 
> >From the stack trace, we've kicked off a rtc_timer_do_work, probably
> from the rtc_initialize_alarm() schedule_work call added in Neil's
> patch.  From there, we call __rtc_set_alarm -> cmos_set_alarm ->
> cmos_rq_disable -> cmos_checkintr -> rtc_update_irq -> schedule_work.
> 
> So, what it looks to me is that in cmos_checkintr, we grab the cmos->rtc
> and pass that along. Unfortunately, since the cmos->rtc value isn't set
> until after rtc_device_register() returns its null at that point. So
> your patch isn't really fixing the issue, but just reducing the race
> window for the second cpu to schedule the work.
> 
> Sigh. I'd guess dropping the schedule_work call from
> rtc_initialize_alarm() is the right approach (see below). When reviewing
> Neil's patch it seemed like a good idea there, but it seems off to me
> now.
> 
> Neil, any thoughts on the following? Can you expand on the condition you
> were worried about in around that call?

If you set an alarm in the future, then shutdown and boot again after that
time, then you will end up with a timer_queue node which is in the past.

When this happens the queue gets stuck.  That entry-in-the-past won't get
removed until and interrupt happens and an interrupt won't happen because the
RTC only triggers an interrupt when the alarm is "now".

So you'll find that e.g. "hwclock" will always tell you that 'select' timed
out.

So we force the interrupt work to happen at the start just in case.

Did you see my proposed patch which converted those calls to do the work
in-process rather than passing it to a worker-thread?  I think that is a
clean fix.

NeilBrown



> 
> thanks
> -john
> 
> Drop schedule_work() call in rtc_initialize_alarm, as we're not finished
> setting things up and can't handle triggering the alarm yet.
> 
> Signed-off-by: John Stultz <john.stultz@linaro.org>
> 
> diff --git a/drivers/rtc/interface.c b/drivers/rtc/interface.c
> index 3bcc7cf..084a137 100644
> --- a/drivers/rtc/interface.c
> +++ b/drivers/rtc/interface.c
> @@ -407,8 +407,6 @@ int rtc_initialize_alarm(struct rtc_device *rtc, struct rtc_wkalrm *alarm)
>  		timerqueue_add(&rtc->timerqueue, &rtc->aie_timer.node);
>  	}
>  	mutex_unlock(&rtc->ops_lock);
> -	/* maybe that was in the past.*/
> -	schedule_work(&rtc->irqwork);
>  	return err;
>  }
>  EXPORT_SYMBOL_GPL(rtc_initialize_alarm);
> 
> 
> 


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Regression: ONE CPU fails bootup at Re: [3.2.0-RC7] BUG: unable to handle kernel NULL pointer dereference at 0000000000000598 [    1.478005] IP: [<ffffffff8107a6c4>] queue_work_on+0x4/0x30
  2012-01-04  0:31     ` NeilBrown
@ 2012-01-04  0:53       ` John Stultz
  2012-01-04  1:20         ` NeilBrown
  2012-01-04  8:17         ` Stefan Bader
  0 siblings, 2 replies; 26+ messages in thread
From: John Stultz @ 2012-01-04  0:53 UTC (permalink / raw)
  To: NeilBrown
  Cc: Konrad Rzeszutek Wilk, Sander Eikelenboom, stefan.bader, rjw,
	Thomas Gleixner, linux-kernel

On Wed, 2012-01-04 at 11:31 +1100, NeilBrown wrote:
> On Tue, 03 Jan 2012 15:09:48 -0800 John Stultz <john.stultz@linaro.org> wrote:
> > >From the stack trace, we've kicked off a rtc_timer_do_work, probably
> > from the rtc_initialize_alarm() schedule_work call added in Neil's
> > patch.  From there, we call __rtc_set_alarm -> cmos_set_alarm ->
> > cmos_rq_disable -> cmos_checkintr -> rtc_update_irq -> schedule_work.
> > 
> > So, what it looks to me is that in cmos_checkintr, we grab the cmos->rtc
> > and pass that along. Unfortunately, since the cmos->rtc value isn't set
> > until after rtc_device_register() returns its null at that point. So
> > your patch isn't really fixing the issue, but just reducing the race
> > window for the second cpu to schedule the work.
> > 
> > Sigh. I'd guess dropping the schedule_work call from
> > rtc_initialize_alarm() is the right approach (see below). When reviewing
> > Neil's patch it seemed like a good idea there, but it seems off to me
> > now.
> > 
> > Neil, any thoughts on the following? Can you expand on the condition you
> > were worried about in around that call?
> 
> If you set an alarm in the future, then shutdown and boot again after that
> time, then you will end up with a timer_queue node which is in the past.

Thanks for explaining this again.

Hrm. It seems the easy answer is to simply not add alarms that are in
the past.  Further, I'm a bit perplexed, as if they are in the past, the
enabled flag shouldn't be set.   __rtc_read_alarm() does check the
current time, so maybe we can make sure we don't return old values? I
guess I assumed __rtc_read_alarm() avoided returning stale values, but
apparently not.

> When this happens the queue gets stuck.  That entry-in-the-past won't get
> removed until and interrupt happens and an interrupt won't happen because the
> RTC only triggers an interrupt when the alarm is "now".
> 
> So you'll find that e.g. "hwclock" will always tell you that 'select' timed
> out.
> 
> So we force the interrupt work to happen at the start just in case.

Unfortunately its too early. 

> Did you see my proposed patch which converted those calls to do the work
> in-process rather than passing it to a worker-thread?  I think that is a
> clean fix.

I don't think I saw it today. Was it from before the holidays?

Even so, at this point, I don't know if we have enough time for testing,
so I'm thinking we either just drop the problematic sched_work call or
revert the whole thing and try again for 3.3

thanks
-john



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Regression: ONE CPU fails bootup at Re: [3.2.0-RC7] BUG: unable to handle kernel NULL pointer dereference at 0000000000000598 [    1.478005] IP: [<ffffffff8107a6c4>] queue_work_on+0x4/0x30
  2012-01-04  0:53       ` John Stultz
@ 2012-01-04  1:20         ` NeilBrown
  2012-01-04 14:46           ` Konrad Rzeszutek Wilk
  2012-01-04 15:12           ` Regression: ONE CPU fails bootup at Re: [3.2.0-RC7] BUG: unable to handle kernel NULL pointer dereference at 0000000000000598 " Stefan Bader
  2012-01-04  8:17         ` Stefan Bader
  1 sibling, 2 replies; 26+ messages in thread
From: NeilBrown @ 2012-01-04  1:20 UTC (permalink / raw)
  To: John Stultz
  Cc: Konrad Rzeszutek Wilk, Sander Eikelenboom, stefan.bader, rjw,
	Thomas Gleixner, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 3215 bytes --]

On Tue, 03 Jan 2012 16:53:00 -0800 John Stultz <john.stultz@linaro.org> wrote:

> On Wed, 2012-01-04 at 11:31 +1100, NeilBrown wrote:
> > On Tue, 03 Jan 2012 15:09:48 -0800 John Stultz <john.stultz@linaro.org> wrote:
> > > >From the stack trace, we've kicked off a rtc_timer_do_work, probably
> > > from the rtc_initialize_alarm() schedule_work call added in Neil's
> > > patch.  From there, we call __rtc_set_alarm -> cmos_set_alarm ->
> > > cmos_rq_disable -> cmos_checkintr -> rtc_update_irq -> schedule_work.
> > > 
> > > So, what it looks to me is that in cmos_checkintr, we grab the cmos->rtc
> > > and pass that along. Unfortunately, since the cmos->rtc value isn't set
> > > until after rtc_device_register() returns its null at that point. So
> > > your patch isn't really fixing the issue, but just reducing the race
> > > window for the second cpu to schedule the work.
> > > 
> > > Sigh. I'd guess dropping the schedule_work call from
> > > rtc_initialize_alarm() is the right approach (see below). When reviewing
> > > Neil's patch it seemed like a good idea there, but it seems off to me
> > > now.
> > > 
> > > Neil, any thoughts on the following? Can you expand on the condition you
> > > were worried about in around that call?
> > 
> > If you set an alarm in the future, then shutdown and boot again after that
> > time, then you will end up with a timer_queue node which is in the past.
> 
> Thanks for explaining this again.
> 
> Hrm. It seems the easy answer is to simply not add alarms that are in
> the past.  Further, I'm a bit perplexed, as if they are in the past, the
> enabled flag shouldn't be set.   __rtc_read_alarm() does check the
> current time, so maybe we can make sure we don't return old values? I
> guess I assumed __rtc_read_alarm() avoided returning stale values, but
> apparently not.

That would probably be a more robust approach.  Also it might make sense to
clean out old alarms whenever we are about to add a new one.

> 
> > When this happens the queue gets stuck.  That entry-in-the-past won't get
> > removed until and interrupt happens and an interrupt won't happen because the
> > RTC only triggers an interrupt when the alarm is "now".
> > 
> > So you'll find that e.g. "hwclock" will always tell you that 'select' timed
> > out.
> > 
> > So we force the interrupt work to happen at the start just in case.
> 
> Unfortunately its too early. 
> 
> > Did you see my proposed patch which converted those calls to do the work
> > in-process rather than passing it to a worker-thread?  I think that is a
> > clean fix.
> 
> I don't think I saw it today. Was it from before the holidays?

About 4 hours ago:
  Subject: Re: Patch Upstream: rtc: Expire alarms after the time is set.

> 
> Even so, at this point, I don't know if we have enough time for testing,
> so I'm thinking we either just drop the problematic sched_work call or
> revert the whole thing and try again for 3.3

I wouldn't object to that.  The bug only triggers in unusual circumstances
and is quite easy to work around so it is safer  to wait until we have a
really good fix.

Thanks,
NeilBrown


> 
> thanks
> -john
> 


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Regression: ONE CPU fails bootup at Re: [3.2.0-RC7] BUG: unable to handle kernel NULL pointer dereference at 0000000000000598  1.478005] IP: [<ffffffff8107a6c4>] queue_work_on+0x4/0x30
  2012-01-04  0:53       ` John Stultz
  2012-01-04  1:20         ` NeilBrown
@ 2012-01-04  8:17         ` Stefan Bader
  2012-01-04 12:25           ` Stefan Bader
  2012-01-04 18:36           ` John Stultz
  1 sibling, 2 replies; 26+ messages in thread
From: Stefan Bader @ 2012-01-04  8:17 UTC (permalink / raw)
  To: John Stultz
  Cc: NeilBrown, Konrad Rzeszutek Wilk, Sander Eikelenboom, rjw,
	Thomas Gleixner, linux-kernel

On 04.01.2012 01:53, John Stultz wrote:
> On Wed, 2012-01-04 at 11:31 +1100, NeilBrown wrote:
>> On Tue, 03 Jan 2012 15:09:48 -0800 John Stultz <john.stultz@linaro.org> wrote:
>>> >From the stack trace, we've kicked off a rtc_timer_do_work, probably
>>> from the rtc_initialize_alarm() schedule_work call added in Neil's
>>> patch.  From there, we call __rtc_set_alarm -> cmos_set_alarm ->
>>> cmos_rq_disable -> cmos_checkintr -> rtc_update_irq -> schedule_work.

Sorry, I was off for the evening a while after sending this out. And I just
started, so a few thing I will be doing later but have not yet had time.

Over night I had still be thinking on this and maybe one important fact I had
been ignoring. This really has only been observed on paravirt guests on Xen as
far as I know. And one thing that I should have pointed out is that

[    0.792634] rtc_cmos rtc_cmos: rtc core: registered rtc_cmos as rtc0
[    0.792725] rtc_cmos: probe of rtc_cmos failed with error -38

So first the registration is done and the first line is the last thing printed
in the registration function. Then, and that line always comes after, the probe,
which looks like being done asynchronously, detects that the rtc is not
implemented. I would assume that this causes the rtc to be unregistered again
and that is probably the point where, under the right circumstances, the worker
triggered by the initialize alarm is trying to set another alarm. Probably while
some of the elements of the structure started to be torn down. I need to check
on that code path, yet. So right now its more a guess.

>>>
>>> So, what it looks to me is that in cmos_checkintr, we grab the cmos->rtc
>>> and pass that along. Unfortunately, since the cmos->rtc value isn't set
>>> until after rtc_device_register() returns its null at that point. So
>>> your patch isn't really fixing the issue, but just reducing the race
>>> window for the second cpu to schedule the work.
>>>
>>> Sigh. I'd guess dropping the schedule_work call from
>>> rtc_initialize_alarm() is the right approach (see below). When reviewing
>>> Neil's patch it seemed like a good idea there, but it seems off to me
>>> now.
>>>
>>> Neil, any thoughts on the following? Can you expand on the condition you
>>> were worried about in around that call?
>>
>> If you set an alarm in the future, then shutdown and boot again after that
>> time, then you will end up with a timer_queue node which is in the past.
> 
> Thanks for explaining this again.
> 
> Hrm. It seems the easy answer is to simply not add alarms that are in
> the past.  Further, I'm a bit perplexed, as if they are in the past, the
> enabled flag shouldn't be set.   __rtc_read_alarm() does check the
> current time, so maybe we can make sure we don't return old values? I
> guess I assumed __rtc_read_alarm() avoided returning stale values, but
> apparently not.
> 
>> When this happens the queue gets stuck.  That entry-in-the-past won't get
>> removed until and interrupt happens and an interrupt won't happen because the
>> RTC only triggers an interrupt when the alarm is "now".
>>
>> So you'll find that e.g. "hwclock" will always tell you that 'select' timed
>> out.
>>
>> So we force the interrupt work to happen at the start just in case.
> 
> Unfortunately its too early. 
> 
>> Did you see my proposed patch which converted those calls to do the work
>> in-process rather than passing it to a worker-thread?  I think that is a
>> clean fix.
> 
> I don't think I saw it today. Was it from before the holidays?
> 

I fear I caused a bit of confusion there. Neil responded to my initial mail
which was done as a reply to the mail announcing this patch for stable (which
just was the first thread I could get hold of).
I will try Neil's patch as well. And in parallel try to see whether the theory I
had this night makes sense. If it does, then it is only indirectly that the work
is scheduled too early. In that case just the teardown needs to make sure that
no work is being run while removal. Well, maybe the question is whether there
should be a delay in running the irq work until the device really, really is
completely set up... But that sounds a bit more complicated.

> Even so, at this point, I don't know if we have enough time for testing,
> so I'm thinking we either just drop the problematic sched_work call or
> revert the whole thing and try again for 3.3

That was the reason I was in a bit of hurry to get this back to you. Especially
since this patch had been marked as stable material and sooner or later will or
would be added to all the stable releases it applies to.

Thanks,
Stefan

> thanks
> -john
> 
> 


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Regression: ONE CPU fails bootup at Re: [3.2.0-RC7] BUG: unable to handle kernel NULL pointer dereference at 0000000000000598  1.478005] IP: [<ffffffff8107a6c4>] queue_work_on+0x4/0x30
  2012-01-04  8:17         ` Stefan Bader
@ 2012-01-04 12:25           ` Stefan Bader
  2012-01-04 13:17             ` Sander Eikelenboom
                               ` (2 more replies)
  2012-01-04 18:36           ` John Stultz
  1 sibling, 3 replies; 26+ messages in thread
From: Stefan Bader @ 2012-01-04 12:25 UTC (permalink / raw)
  To: John Stultz
  Cc: NeilBrown, Konrad Rzeszutek Wilk, Sander Eikelenboom, rjw,
	Thomas Gleixner, linux-kernel

On 04.01.2012 09:17, Stefan Bader wrote:
> On 04.01.2012 01:53, John Stultz wrote:
>> On Wed, 2012-01-04 at 11:31 +1100, NeilBrown wrote:
>>> On Tue, 03 Jan 2012 15:09:48 -0800 John Stultz <john.stultz@linaro.org> wrote:
>>>> >From the stack trace, we've kicked off a rtc_timer_do_work, probably
>>>> from the rtc_initialize_alarm() schedule_work call added in Neil's
>>>> patch.  From there, we call __rtc_set_alarm -> cmos_set_alarm ->
>>>> cmos_rq_disable -> cmos_checkintr -> rtc_update_irq -> schedule_work.
> 
> Sorry, I was off for the evening a while after sending this out. And I just
> started, so a few thing I will be doing later but have not yet had time.
> 
> Over night I had still be thinking on this and maybe one important fact I had
> been ignoring. This really has only been observed on paravirt guests on Xen as
> far as I know. And one thing that I should have pointed out is that
> 
> [    0.792634] rtc_cmos rtc_cmos: rtc core: registered rtc_cmos as rtc0
> [    0.792725] rtc_cmos: probe of rtc_cmos failed with error -38
> 
> So first the registration is done and the first line is the last thing printed
> in the registration function. Then, and that line always comes after, the probe,
> which looks like being done asynchronously, detects that the rtc is not
> implemented. I would assume that this causes the rtc to be unregistered again
> and that is probably the point where, under the right circumstances, the worker
> triggered by the initialize alarm is trying to set another alarm. Probably while
> some of the elements of the structure started to be torn down. I need to check
> on that code path, yet. So right now its more a guess.
> 
>>>>
>>>> So, what it looks to me is that in cmos_checkintr, we grab the cmos->rtc
>>>> and pass that along. Unfortunately, since the cmos->rtc value isn't set
>>>> until after rtc_device_register() returns its null at that point. So
>>>> your patch isn't really fixing the issue, but just reducing the race
>>>> window for the second cpu to schedule the work.
>>>>
>>>> Sigh. I'd guess dropping the schedule_work call from
>>>> rtc_initialize_alarm() is the right approach (see below). When reviewing
>>>> Neil's patch it seemed like a good idea there, but it seems off to me
>>>> now.
>>>>
>>>> Neil, any thoughts on the following? Can you expand on the condition you
>>>> were worried about in around that call?
>>>
>>> If you set an alarm in the future, then shutdown and boot again after that
>>> time, then you will end up with a timer_queue node which is in the past.
>>
>> Thanks for explaining this again.
>>
>> Hrm. It seems the easy answer is to simply not add alarms that are in
>> the past.  Further, I'm a bit perplexed, as if they are in the past, the
>> enabled flag shouldn't be set.   __rtc_read_alarm() does check the
>> current time, so maybe we can make sure we don't return old values? I
>> guess I assumed __rtc_read_alarm() avoided returning stale values, but
>> apparently not.
>>
>>> When this happens the queue gets stuck.  That entry-in-the-past won't get
>>> removed until and interrupt happens and an interrupt won't happen because the
>>> RTC only triggers an interrupt when the alarm is "now".
>>>
>>> So you'll find that e.g. "hwclock" will always tell you that 'select' timed
>>> out.
>>>
>>> So we force the interrupt work to happen at the start just in case.
>>
>> Unfortunately its too early. 
>>
>>> Did you see my proposed patch which converted those calls to do the work
>>> in-process rather than passing it to a worker-thread?  I think that is a
>>> clean fix.
>>
>> I don't think I saw it today. Was it from before the holidays?
>>
> 
> I fear I caused a bit of confusion there. Neil responded to my initial mail
> which was done as a reply to the mail announcing this patch for stable (which
> just was the first thread I could get hold of).
> I will try Neil's patch as well. And in parallel try to see whether the theory I
> had this night makes sense. If it does, then it is only indirectly that the work
> is scheduled too early. In that case just the teardown needs to make sure that
> no work is being run while removal. Well, maybe the question is whether there
> should be a delay in running the irq work until the device really, really is
> completely set up... But that sounds a bit more complicated.

By now I tried Neil's proposed patch and unfortunately that makes things rather
worse. I also played around with the idea of the unregistration race. Maybe
there also is one (that cancel_work_sync should be called before unregistering
the device) but definitely it is not what happens at least in the one CPU case.
I added some more printk's and the crash happens before even the rtc core class
has been fully registered. And no unregister is call has been made either.

Which may point to execution of the irq worker (including a schedule_work)
before the rtc-cmos parts are finished... Would explain why moving the
initialize call further down does at least narrow the window for it to happen...
The only thing I do not understand then is why that seems only to happen on Xen
guests...

-Stefan

>> Even so, at this point, I don't know if we have enough time for testing,
>> so I'm thinking we either just drop the problematic sched_work call or
>> revert the whole thing and try again for 3.3
> 
> That was the reason I was in a bit of hurry to get this back to you. Especially
> since this patch had been marked as stable material and sooner or later will or
> would be added to all the stable releases it applies to.
> 
> Thanks,
> Stefan
> 
>> thanks
>> -john
>>
>>
> 


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Regression: ONE CPU fails bootup at Re: [3.2.0-RC7] BUG: unable to handle kernel NULL pointer dereference at 0000000000000598  1.478005] IP: [<ffffffff8107a6c4>] queue_work_on+0x4/0x30
  2012-01-04 12:25           ` Stefan Bader
@ 2012-01-04 13:17             ` Sander Eikelenboom
  2012-01-04 18:33               ` John Stultz
  2012-01-04 14:13             ` Stefan Bader
  2012-01-04 18:35             ` John Stultz
  2 siblings, 1 reply; 26+ messages in thread
From: Sander Eikelenboom @ 2012-01-04 13:17 UTC (permalink / raw)
  To: Stefan Bader
  Cc: John Stultz, NeilBrown, Konrad Rzeszutek Wilk, rjw,
	Thomas Gleixner, linux-kernel

Hello Stefan,

Wednesday, January 4, 2012, 1:25:16 PM, you wrote:

> On 04.01.2012 09:17, Stefan Bader wrote:
>> On 04.01.2012 01:53, John Stultz wrote:
>>> On Wed, 2012-01-04 at 11:31 +1100, NeilBrown wrote:
>>>> On Tue, 03 Jan 2012 15:09:48 -0800 John Stultz <john.stultz@linaro.org> wrote:
>>>>> >From the stack trace, we've kicked off a rtc_timer_do_work, probably
>>>>> from the rtc_initialize_alarm() schedule_work call added in Neil's
>>>>> patch.  From there, we call __rtc_set_alarm -> cmos_set_alarm ->
>>>>> cmos_rq_disable -> cmos_checkintr -> rtc_update_irq -> schedule_work.
>> 
>> Sorry, I was off for the evening a while after sending this out. And I just
>> started, so a few thing I will be doing later but have not yet had time.
>> 
>> Over night I had still be thinking on this and maybe one important fact I had
>> been ignoring. This really has only been observed on paravirt guests on Xen as
>> far as I know. And one thing that I should have pointed out is that
>> 
>> [    0.792634] rtc_cmos rtc_cmos: rtc core: registered rtc_cmos as rtc0
>> [    0.792725] rtc_cmos: probe of rtc_cmos failed with error -38
>> 
>> So first the registration is done and the first line is the last thing printed
>> in the registration function. Then, and that line always comes after, the probe,
>> which looks like being done asynchronously, detects that the rtc is not
>> implemented. I would assume that this causes the rtc to be unregistered again
>> and that is probably the point where, under the right circumstances, the worker
>> triggered by the initialize alarm is trying to set another alarm. Probably while
>> some of the elements of the structure started to be torn down. I need to check
>> on that code path, yet. So right now its more a guess.
>> 
>>>>>
>>>>> So, what it looks to me is that in cmos_checkintr, we grab the cmos->rtc
>>>>> and pass that along. Unfortunately, since the cmos->rtc value isn't set
>>>>> until after rtc_device_register() returns its null at that point. So
>>>>> your patch isn't really fixing the issue, but just reducing the race
>>>>> window for the second cpu to schedule the work.
>>>>>
>>>>> Sigh. I'd guess dropping the schedule_work call from
>>>>> rtc_initialize_alarm() is the right approach (see below). When reviewing
>>>>> Neil's patch it seemed like a good idea there, but it seems off to me
>>>>> now.
>>>>>
>>>>> Neil, any thoughts on the following? Can you expand on the condition you
>>>>> were worried about in around that call?
>>>>
>>>> If you set an alarm in the future, then shutdown and boot again after that
>>>> time, then you will end up with a timer_queue node which is in the past.
>>>
>>> Thanks for explaining this again.
>>>
>>> Hrm. It seems the easy answer is to simply not add alarms that are in
>>> the past.  Further, I'm a bit perplexed, as if they are in the past, the
>>> enabled flag shouldn't be set.   __rtc_read_alarm() does check the
>>> current time, so maybe we can make sure we don't return old values? I
>>> guess I assumed __rtc_read_alarm() avoided returning stale values, but
>>> apparently not.
>>>
>>>> When this happens the queue gets stuck.  That entry-in-the-past won't get
>>>> removed until and interrupt happens and an interrupt won't happen because the
>>>> RTC only triggers an interrupt when the alarm is "now".
>>>>
>>>> So you'll find that e.g. "hwclock" will always tell you that 'select' timed
>>>> out.
>>>>
>>>> So we force the interrupt work to happen at the start just in case.
>>>
>>> Unfortunately its too early. 
>>>
>>>> Did you see my proposed patch which converted those calls to do the work
>>>> in-process rather than passing it to a worker-thread?  I think that is a
>>>> clean fix.
>>>
>>> I don't think I saw it today. Was it from before the holidays?
>>>
>> 
>> I fear I caused a bit of confusion there. Neil responded to my initial mail
>> which was done as a reply to the mail announcing this patch for stable (which
>> just was the first thread I could get hold of).
>> I will try Neil's patch as well. And in parallel try to see whether the theory I
>> had this night makes sense. If it does, then it is only indirectly that the work
>> is scheduled too early. In that case just the teardown needs to make sure that
>> no work is being run while removal. Well, maybe the question is whether there
>> should be a delay in running the irq work until the device really, really is
>> completely set up... But that sounds a bit more complicated.

> By now I tried Neil's proposed patch and unfortunately that makes things rather
> worse. I also played around with the idea of the unregistration race. Maybe
> there also is one (that cancel_work_sync should be called before unregistering
> the device) but definitely it is not what happens at least in the one CPU case.
> I added some more printk's and the crash happens before even the rtc core class
> has been fully registered. And no unregister is call has been made either.

> Which may point to execution of the irq worker (including a schedule_work)
> before the rtc-cmos parts are finished... Would explain why moving the
> initialize call further down does at least narrow the window for it to happen...
> The only thing I do not understand then is why that seems only to happen on Xen
> guests...

> -Stefan

Has there been a request to revert this for 3.2 final gone to Linus ?


>>> Even so, at this point, I don't know if we have enough time for testing,
>>> so I'm thinking we either just drop the problematic sched_work call or
>>> revert the whole thing and try again for 3.3
>> 
>> That was the reason I was in a bit of hurry to get this back to you. Especially
>> since this patch had been marked as stable material and sooner or later will or
>> would be added to all the stable releases it applies to.
>> 
>> Thanks,
>> Stefan
>> 
>>> thanks
>>> -john
>>>
>>>
>> 




-- 
Best regards,
 Sander                            mailto:linux@eikelenboom.it


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Regression: ONE CPU fails bootup at Re: [3.2.0-RC7] BUG: unable to handle kernel NULL pointer dereference at 0000000000000598  1.478005] IP: [<ffffffff8107a6c4>] queue_work_on+0x4/0x30
  2012-01-04 12:25           ` Stefan Bader
  2012-01-04 13:17             ` Sander Eikelenboom
@ 2012-01-04 14:13             ` Stefan Bader
  2012-01-06 20:41               ` John Stultz
  2012-01-04 18:35             ` John Stultz
  2 siblings, 1 reply; 26+ messages in thread
From: Stefan Bader @ 2012-01-04 14:13 UTC (permalink / raw)
  To: John Stultz
  Cc: NeilBrown, Konrad Rzeszutek Wilk, Sander Eikelenboom, rjw,
	Thomas Gleixner, linux-kernel

On 04.01.2012 13:25, Stefan Bader wrote:
> On 04.01.2012 09:17, Stefan Bader wrote:
>> On 04.01.2012 01:53, John Stultz wrote:
>>> On Wed, 2012-01-04 at 11:31 +1100, NeilBrown wrote:
>>>> On Tue, 03 Jan 2012 15:09:48 -0800 John Stultz <john.stultz@linaro.org> wrote:
>>>>> >From the stack trace, we've kicked off a rtc_timer_do_work, probably
>>>>> from the rtc_initialize_alarm() schedule_work call added in Neil's
>>>>> patch.  From there, we call __rtc_set_alarm -> cmos_set_alarm ->
>>>>> cmos_rq_disable -> cmos_checkintr -> rtc_update_irq -> schedule_work.
>>
>> Sorry, I was off for the evening a while after sending this out. And I just
>> started, so a few thing I will be doing later but have not yet had time.
>>
>> Over night I had still be thinking on this and maybe one important fact I had
>> been ignoring. This really has only been observed on paravirt guests on Xen as
>> far as I know. And one thing that I should have pointed out is that
>>
>> [    0.792634] rtc_cmos rtc_cmos: rtc core: registered rtc_cmos as rtc0
>> [    0.792725] rtc_cmos: probe of rtc_cmos failed with error -38
>>
>> So first the registration is done and the first line is the last thing printed
>> in the registration function. Then, and that line always comes after, the probe,
>> which looks like being done asynchronously, detects that the rtc is not
>> implemented. I would assume that this causes the rtc to be unregistered again
>> and that is probably the point where, under the right circumstances, the worker
>> triggered by the initialize alarm is trying to set another alarm. Probably while
>> some of the elements of the structure started to be torn down. I need to check
>> on that code path, yet. So right now its more a guess.
>>
>>>>>
>>>>> So, what it looks to me is that in cmos_checkintr, we grab the cmos->rtc
>>>>> and pass that along. Unfortunately, since the cmos->rtc value isn't set
>>>>> until after rtc_device_register() returns its null at that point. So
>>>>> your patch isn't really fixing the issue, but just reducing the race
>>>>> window for the second cpu to schedule the work.
>>>>>
>>>>> Sigh. I'd guess dropping the schedule_work call from
>>>>> rtc_initialize_alarm() is the right approach (see below). When reviewing
>>>>> Neil's patch it seemed like a good idea there, but it seems off to me
>>>>> now.
>>>>>
>>>>> Neil, any thoughts on the following? Can you expand on the condition you
>>>>> were worried about in around that call?
>>>>
>>>> If you set an alarm in the future, then shutdown and boot again after that
>>>> time, then you will end up with a timer_queue node which is in the past.
>>>
>>> Thanks for explaining this again.
>>>
>>> Hrm. It seems the easy answer is to simply not add alarms that are in
>>> the past.  Further, I'm a bit perplexed, as if they are in the past, the
>>> enabled flag shouldn't be set.   __rtc_read_alarm() does check the
>>> current time, so maybe we can make sure we don't return old values? I
>>> guess I assumed __rtc_read_alarm() avoided returning stale values, but
>>> apparently not.
>>>
>>>> When this happens the queue gets stuck.  That entry-in-the-past won't get
>>>> removed until and interrupt happens and an interrupt won't happen because the
>>>> RTC only triggers an interrupt when the alarm is "now".
>>>>
>>>> So you'll find that e.g. "hwclock" will always tell you that 'select' timed
>>>> out.
>>>>
>>>> So we force the interrupt work to happen at the start just in case.
>>>
>>> Unfortunately its too early. 
>>>
>>>> Did you see my proposed patch which converted those calls to do the work
>>>> in-process rather than passing it to a worker-thread?  I think that is a
>>>> clean fix.
>>>
>>> I don't think I saw it today. Was it from before the holidays?
>>>
>>
>> I fear I caused a bit of confusion there. Neil responded to my initial mail
>> which was done as a reply to the mail announcing this patch for stable (which
>> just was the first thread I could get hold of).
>> I will try Neil's patch as well. And in parallel try to see whether the theory I
>> had this night makes sense. If it does, then it is only indirectly that the work
>> is scheduled too early. In that case just the teardown needs to make sure that
>> no work is being run while removal. Well, maybe the question is whether there
>> should be a delay in running the irq work until the device really, really is
>> completely set up... But that sounds a bit more complicated.
> 
> By now I tried Neil's proposed patch and unfortunately that makes things rather
> worse. I also played around with the idea of the unregistration race. Maybe
> there also is one (that cancel_work_sync should be called before unregistering
> the device) but definitely it is not what happens at least in the one CPU case.
> I added some more printk's and the crash happens before even the rtc core class
> has been fully registered. And no unregister is call has been made either.
> 
> Which may point to execution of the irq worker (including a schedule_work)
> before the rtc-cmos parts are finished... Would explain why moving the
> initialize call further down does at least narrow the window for it to happen...
> The only thing I do not understand then is why that seems only to happen on Xen
> guests...
> 

Darn, guess I understand now... So cmos-rtc will call rtc_device_register from
within do_cmos_probe and set cmos_rtc.rtc with the pointer that
rtc_device_register returns.
But when having rtc_initialize_alarm being called earlier in
rtc_device_register, and that queues the work, there is a chance it will fire
even before that function returns.
And then cmos_checkintr will dereference the null pointer still stored in
cmos_rtc.rtc for calling rtc_update_irq...

-Stefan
> -Stefan
> 
>>> Even so, at this point, I don't know if we have enough time for testing,
>>> so I'm thinking we either just drop the problematic sched_work call or
>>> revert the whole thing and try again for 3.3
>>
>> That was the reason I was in a bit of hurry to get this back to you. Especially
>> since this patch had been marked as stable material and sooner or later will or
>> would be added to all the stable releases it applies to.
>>
>> Thanks,
>> Stefan
>>
>>> thanks
>>> -john
>>>
>>>
>>
> 


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Regression: ONE CPU fails bootup at Re: [3.2.0-RC7] BUG: unable to handle kernel NULL pointer dereference at 0000000000000598 [ 1.478005] IP: [<ffffffff8107a6c4>] queue_work_on+0x4/0x30
  2012-01-04  1:20         ` NeilBrown
@ 2012-01-04 14:46           ` Konrad Rzeszutek Wilk
  2012-01-04 15:12           ` Regression: ONE CPU fails bootup at Re: [3.2.0-RC7] BUG: unable to handle kernel NULL pointer dereference at 0000000000000598 " Stefan Bader
  1 sibling, 0 replies; 26+ messages in thread
From: Konrad Rzeszutek Wilk @ 2012-01-04 14:46 UTC (permalink / raw)
  To: NeilBrown, Linus Torvalds
  Cc: John Stultz, Sander Eikelenboom, stefan.bader, rjw,
	Thomas Gleixner, linux-kernel

On Wed, Jan 04, 2012 at 12:20:40PM +1100, NeilBrown wrote:
> On Tue, 03 Jan 2012 16:53:00 -0800 John Stultz <john.stultz@linaro.org> wrote:
> 
> > On Wed, 2012-01-04 at 11:31 +1100, NeilBrown wrote:
> > > On Tue, 03 Jan 2012 15:09:48 -0800 John Stultz <john.stultz@linaro.org> wrote:
> > > > >From the stack trace, we've kicked off a rtc_timer_do_work, probably
> > > > from the rtc_initialize_alarm() schedule_work call added in Neil's
> > > > patch.  From there, we call __rtc_set_alarm -> cmos_set_alarm ->
> > > > cmos_rq_disable -> cmos_checkintr -> rtc_update_irq -> schedule_work.
> > > > 
> > > > So, what it looks to me is that in cmos_checkintr, we grab the cmos->rtc
> > > > and pass that along. Unfortunately, since the cmos->rtc value isn't set
> > > > until after rtc_device_register() returns its null at that point. So
> > > > your patch isn't really fixing the issue, but just reducing the race
> > > > window for the second cpu to schedule the work.
> > > > 
> > > > Sigh. I'd guess dropping the schedule_work call from
> > > > rtc_initialize_alarm() is the right approach (see below). When reviewing
> > > > Neil's patch it seemed like a good idea there, but it seems off to me
> > > > now.
> > > > 
> > > > Neil, any thoughts on the following? Can you expand on the condition you
> > > > were worried about in around that call?
> > > 
> > > If you set an alarm in the future, then shutdown and boot again after that
> > > time, then you will end up with a timer_queue node which is in the past.
> > 
> > Thanks for explaining this again.
> > 
> > Hrm. It seems the easy answer is to simply not add alarms that are in
> > the past.  Further, I'm a bit perplexed, as if they are in the past, the
> > enabled flag shouldn't be set.   __rtc_read_alarm() does check the
> > current time, so maybe we can make sure we don't return old values? I
> > guess I assumed __rtc_read_alarm() avoided returning stale values, but
> > apparently not.
> 
> That would probably be a more robust approach.  Also it might make sense to
> clean out old alarms whenever we are about to add a new one.
> 
> > 
> > > When this happens the queue gets stuck.  That entry-in-the-past won't get
> > > removed until and interrupt happens and an interrupt won't happen because the
> > > RTC only triggers an interrupt when the alarm is "now".
> > > 
> > > So you'll find that e.g. "hwclock" will always tell you that 'select' timed
> > > out.
> > > 
> > > So we force the interrupt work to happen at the start just in case.
> > 
> > Unfortunately its too early. 
> > 
> > > Did you see my proposed patch which converted those calls to do the work
> > > in-process rather than passing it to a worker-thread?  I think that is a
> > > clean fix.
> > 
> > I don't think I saw it today. Was it from before the holidays?
> 
> About 4 hours ago:
>   Subject: Re: Patch Upstream: rtc: Expire alarms after the time is set.
> 
> > 
> > Even so, at this point, I don't know if we have enough time for testing,
> > so I'm thinking we either just drop the problematic sched_work call or
> > revert the whole thing and try again for 3.3
> 
> I wouldn't object to that.  The bug only triggers in unusual circumstances
> and is quite easy to work around so it is safer  to wait until we have a
> really good fix.

Linus,

Sorry for getting you in this loop so late-ish. Would it be possible to revert
93b2ec0128c431148b216b8f7337c1a52131ef03 before 3.2 is released? If there are a
couple of days to work this out we can probably come up with a proper patch but
we don't know when 3.2 is going out (presumarily today?).

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Regression: ONE CPU fails bootup at Re: [3.2.0-RC7] BUG: unable to handle kernel NULL pointer dereference at 0000000000000598  1.478005] IP: [<ffffffff8107a6c4>] queue_work_on+0x4/0x30
  2012-01-04  1:20         ` NeilBrown
  2012-01-04 14:46           ` Konrad Rzeszutek Wilk
@ 2012-01-04 15:12           ` Stefan Bader
  2012-01-05 22:03             ` NeilBrown
  1 sibling, 1 reply; 26+ messages in thread
From: Stefan Bader @ 2012-01-04 15:12 UTC (permalink / raw)
  To: NeilBrown
  Cc: John Stultz, Konrad Rzeszutek Wilk, Sander Eikelenboom, rjw,
	Thomas Gleixner, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 4321 bytes --]

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

On 04.01.2012 02:20, NeilBrown wrote:
> On Tue, 03 Jan 2012 16:53:00 -0800 John Stultz <john.stultz@linaro.org>
> wrote:
> 
>> On Wed, 2012-01-04 at 11:31 +1100, NeilBrown wrote:
>>> On Tue, 03 Jan 2012 15:09:48 -0800 John Stultz <john.stultz@linaro.org>
>>> wrote:
>>>>> From the stack trace, we've kicked off a rtc_timer_do_work,
>>>>> probably
>>>> from the rtc_initialize_alarm() schedule_work call added in Neil's 
>>>> patch.  From there, we call __rtc_set_alarm -> cmos_set_alarm -> 
>>>> cmos_rq_disable -> cmos_checkintr -> rtc_update_irq ->
>>>> schedule_work.
>>>> 
>>>> So, what it looks to me is that in cmos_checkintr, we grab the
>>>> cmos->rtc and pass that along. Unfortunately, since the cmos->rtc
>>>> value isn't set until after rtc_device_register() returns its null at
>>>> that point. So your patch isn't really fixing the issue, but just
>>>> reducing the race window for the second cpu to schedule the work.
>>>> 
>>>> Sigh. I'd guess dropping the schedule_work call from 
>>>> rtc_initialize_alarm() is the right approach (see below). When
>>>> reviewing Neil's patch it seemed like a good idea there, but it seems
>>>> off to me now.
>>>> 
>>>> Neil, any thoughts on the following? Can you expand on the condition
>>>> you were worried about in around that call?
>>> 
>>> If you set an alarm in the future, then shutdown and boot again after
>>> that time, then you will end up with a timer_queue node which is in the
>>> past.
>> 
>> Thanks for explaining this again.
>> 
>> Hrm. It seems the easy answer is to simply not add alarms that are in the
>> past.  Further, I'm a bit perplexed, as if they are in the past, the 
>> enabled flag shouldn't be set.   __rtc_read_alarm() does check the 
>> current time, so maybe we can make sure we don't return old values? I 
>> guess I assumed __rtc_read_alarm() avoided returning stale values, but 
>> apparently not.
> 
> That would probably be a more robust approach.  Also it might make sense
> to clean out old alarms whenever we are about to add a new one.
> 
>> 
>>> When this happens the queue gets stuck.  That entry-in-the-past won't
>>> get removed until and interrupt happens and an interrupt won't happen
>>> because the RTC only triggers an interrupt when the alarm is "now".
>>> 
>>> So you'll find that e.g. "hwclock" will always tell you that 'select'
>>> timed out.
>>> 
>>> So we force the interrupt work to happen at the start just in case.
>> 
>> Unfortunately its too early.
>> 
>>> Did you see my proposed patch which converted those calls to do the
>>> work in-process rather than passing it to a worker-thread?  I think
>>> that is a clean fix.
>> 
>> I don't think I saw it today. Was it from before the holidays?
> 
> About 4 hours ago: Subject: Re: Patch Upstream: rtc: Expire alarms after
> the time is set.
> 
>> 
>> Even so, at this point, I don't know if we have enough time for testing, 
>> so I'm thinking we either just drop the problematic sched_work call or 
>> revert the whole thing and try again for 3.3
> 
> I wouldn't object to that.  The bug only triggers in unusual circumstances 
> and is quite easy to work around so it is safer  to wait until we have a 
> really good fix.
> 
> Thanks, NeilBrown
> 
Neil, this variation would also keep the kernel from crashing in my test
environment. What do you think of that?

- -Stefan

> 
>> 
>> thanks -john
>> 
> 

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iQIcBAEBCgAGBQJPBGw4AAoJEOhnXe7L7s6jg5YQAJ1PrNGhuXJSAevurIorc+l5
Xbyj96MRk2Z7y7c0f/OQCCKzmReNMOQ91pgok2FIamHBh55JtbgdldyTWG80Jr0M
N1jwkus1SJTIcLAeIrzoXSLU5uTvmcCcW0GBUuK98m0LDwMMJr0P1a5ilsyorRFY
ekdFIL732Cs6hn3BAlg6zVucG/kMOi3OEOtHlbKdw83VdxJ1OmjIkW9tlj/6268j
ZImuXh1XZmCJiTuCwdvdi3RVU3qZvQJi1AW0G0zsQcc2r13OQkSSoi/Wo5+6dNit
7WxSbYqeZ4hPmk67yDKjOK67kDU094end++ePtQNke0d9IePB3fRvDIi+eEkrjc5
0rJi2gLkpSyE8MJ3pBO9Bt8JCVjpY1QZnQsDT68fIHUApMnrwRK2N371JzbSim52
8VYgTfubRRdwiLlMAL7ACilFV2FKc9PcTf7FrrR2j+GZfhKTD5iy+bNvqpWaJVVT
jUrfCS79z+Z9v9/bjZ55n23gIJQn3jLk3B79plnFlwmxp9CDaAITmfzKsSB0EoZB
RF0ndDkd1Oii/XuFdqaVaMeZojPCT2rVHas2H26mH+Ihb0n9EvSBgx2L64DbRrAt
duLngf+IAO6T9ngaqnhmHwJXdwKiKGxUhWVdo92YTd16jdANd/711/B0kzHHWw4H
FT46+Os2UQJR+HsedKZg
=Kgpc
-----END PGP SIGNATURE-----

[-- Attachment #2: 0001-rtc-Do-not-schedule-work-in-rtc_initialize_alarm.patch --]
[-- Type: text/x-diff, Size: 2311 bytes --]

>From 0c91dc5c02c828b6b57066f99d9d2b853ced5c97 Mon Sep 17 00:00:00 2001
From: Stefan Bader <stefan.bader@canonical.com>
Date: Wed, 4 Jan 2012 15:41:16 +0100
Subject: [PATCH] rtc: Do not schedule work in rtc_initialize_alarm

commit 93b2ec0128c431148b216b8f7337c1a52131ef03
  rtc: Expire alarms after the time is set.

added a schedule_work to rtc_initialize_alarm in order to
get alarm processing running, even in case of an alarm that
had been in the past.

However, this is called from rtc_device_register that in turn
is called from cmos_do_probe while rtc_cmos.rtc is still NULL.
If the worker fires off and tries to rtc_update_irq it will
dereference that potentially NULL pointer (this only was
observer running a Xen paravirt guest, but I think if the
timing is right could happen anywhere). So it seems better to
tickle the interrupt worker just in any case when the probe
really is complete. Worst case it should just confirm the
fact that there is no alarm.

Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
---
 drivers/rtc/interface.c |    2 --
 drivers/rtc/rtc-cmos.c  |    7 +++++++
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/rtc/interface.c b/drivers/rtc/interface.c
index 3bcc7cf..084a137 100644
--- a/drivers/rtc/interface.c
+++ b/drivers/rtc/interface.c
@@ -407,8 +407,6 @@ int rtc_initialize_alarm(struct rtc_device *rtc, struct rtc_wkalrm *alarm)
 		timerqueue_add(&rtc->timerqueue, &rtc->aie_timer.node);
 	}
 	mutex_unlock(&rtc->ops_lock);
-	/* maybe that was in the past.*/
-	schedule_work(&rtc->irqwork);
 	return err;
 }
 EXPORT_SYMBOL_GPL(rtc_initialize_alarm);
diff --git a/drivers/rtc/rtc-cmos.c b/drivers/rtc/rtc-cmos.c
index 05beb6c..21fcb31 100644
--- a/drivers/rtc/rtc-cmos.c
+++ b/drivers/rtc/rtc-cmos.c
@@ -731,6 +731,13 @@ cmos_do_probe(struct device *dev, struct resource *ports, int rtc_irq)
 		goto cleanup2;
 	}
 
+	/* I hope it will not matter much if there was not actually
+	 * some interrupt work that needed to be done. The worker
+	 * thread just should do nothing (beside of disabling the
+	 * alarm again. At least by now all other setup is done.
+	 */
+	schedule_work(&cmos_rtc.rtc->irqwork);
+
 	pr_info("%s: %s%s, %zd bytes nvram%s\n",
 		dev_name(&cmos_rtc.rtc->dev),
 		!is_valid_irq(rtc_irq) ? "no alarms" :
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: Regression: ONE CPU fails bootup at Re: [3.2.0-RC7] BUG: unable to handle kernel NULL pointer dereference at 0000000000000598  1.478005] IP: [<ffffffff8107a6c4>] queue_work_on+0x4/0x30
  2012-01-04 13:17             ` Sander Eikelenboom
@ 2012-01-04 18:33               ` John Stultz
  0 siblings, 0 replies; 26+ messages in thread
From: John Stultz @ 2012-01-04 18:33 UTC (permalink / raw)
  To: Sander Eikelenboom
  Cc: Stefan Bader, NeilBrown, Konrad Rzeszutek Wilk, rjw,
	Thomas Gleixner, linux-kernel

On Wed, 2012-01-04 at 14:17 +0100, Sander Eikelenboom wrote:

> Has there been a request to revert this for 3.2 final gone to Linus ?

Yep. I just sent a changelog for the revert commit to Linus.

thanks
-john



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Regression: ONE CPU fails bootup at Re: [3.2.0-RC7] BUG: unable to handle kernel NULL pointer dereference at 0000000000000598  1.478005] IP: [<ffffffff8107a6c4>] queue_work_on+0x4/0x30
  2012-01-04 12:25           ` Stefan Bader
  2012-01-04 13:17             ` Sander Eikelenboom
  2012-01-04 14:13             ` Stefan Bader
@ 2012-01-04 18:35             ` John Stultz
  2 siblings, 0 replies; 26+ messages in thread
From: John Stultz @ 2012-01-04 18:35 UTC (permalink / raw)
  To: Stefan Bader
  Cc: NeilBrown, Konrad Rzeszutek Wilk, Sander Eikelenboom, rjw,
	Thomas Gleixner, linux-kernel

On Wed, 2012-01-04 at 13:25 +0100, Stefan Bader wrote:
> Which may point to execution of the irq worker (including a schedule_work)
> before the rtc-cmos parts are finished... Would explain why moving the
> initialize call further down does at least narrow the window for it to happen...
> The only thing I do not understand then is why that seems only to happen on Xen
> guests...

Its just likely with Xen SMP guests that the race window is larger. I
suspect similar behavior could be tripped on other slower SMP boxes as
well.

thanks
-john



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Regression: ONE CPU fails bootup at Re: [3.2.0-RC7] BUG: unable to handle kernel NULL pointer dereference at 0000000000000598  1.478005] IP: [<ffffffff8107a6c4>] queue_work_on+0x4/0x30
  2012-01-04  8:17         ` Stefan Bader
  2012-01-04 12:25           ` Stefan Bader
@ 2012-01-04 18:36           ` John Stultz
  2012-01-04 18:50             ` Stefan Bader
  2012-01-04 19:47             ` Konrad Rzeszutek Wilk
  1 sibling, 2 replies; 26+ messages in thread
From: John Stultz @ 2012-01-04 18:36 UTC (permalink / raw)
  To: Stefan Bader
  Cc: NeilBrown, Konrad Rzeszutek Wilk, Sander Eikelenboom, rjw,
	Thomas Gleixner, linux-kernel

On Wed, 2012-01-04 at 09:17 +0100, Stefan Bader wrote:
> Over night I had still be thinking on this and maybe one important fact I had
> been ignoring. This really has only been observed on paravirt guests on Xen as
> far as I know. And one thing that I should have pointed out is that
> 
> [    0.792634] rtc_cmos rtc_cmos: rtc core: registered rtc_cmos as rtc0
> [    0.792725] rtc_cmos: probe of rtc_cmos failed with error -38
> 
> So first the registration is done and the first line is the last thing printed
> in the registration function. Then, and that line always comes after, the probe,
> which looks like being done asynchronously, detects that the rtc is not
> implemented. I would assume that this causes the rtc to be unregistered again
> and that is probably the point where, under the right circumstances, the worker
> triggered by the initialize alarm is trying to set another alarm. Probably while
> some of the elements of the structure started to be torn down. I need to check
> on that code path, yet. So right now its more a guess.

Hrm. Do you see the same probe error with 3.1 kernels as well?  

Konrad: Is the probe failure a known issue on Xen? Any clues on whats
going on there?

thanks
-john



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Regression: ONE CPU fails bootup at Re: [3.2.0-RC7] BUG: unable to handle kernel NULL pointer dereference at 0000000000000598  1.478005] IP: [<ffffffff8107a6c4>] queue_work_on+0x4/0x30
  2012-01-04 18:36           ` John Stultz
@ 2012-01-04 18:50             ` Stefan Bader
  2012-01-04 19:47             ` Konrad Rzeszutek Wilk
  1 sibling, 0 replies; 26+ messages in thread
From: Stefan Bader @ 2012-01-04 18:50 UTC (permalink / raw)
  To: John Stultz
  Cc: NeilBrown, Konrad Rzeszutek Wilk, Sander Eikelenboom, rjw,
	Thomas Gleixner, linux-kernel

On 04.01.2012 19:36, John Stultz wrote:
> On Wed, 2012-01-04 at 09:17 +0100, Stefan Bader wrote:
>> Over night I had still be thinking on this and maybe one important fact I had
>> been ignoring. This really has only been observed on paravirt guests on Xen as
>> far as I know. And one thing that I should have pointed out is that
>>
>> [    0.792634] rtc_cmos rtc_cmos: rtc core: registered rtc_cmos as rtc0
>> [    0.792725] rtc_cmos: probe of rtc_cmos failed with error -38
>>
>> So first the registration is done and the first line is the last thing printed
>> in the registration function. Then, and that line always comes after, the probe,
>> which looks like being done asynchronously, detects that the rtc is not
>> implemented. I would assume that this causes the rtc to be unregistered again
>> and that is probably the point where, under the right circumstances, the worker
>> triggered by the initialize alarm is trying to set another alarm. Probably while
>> some of the elements of the structure started to be torn down. I need to check
>> on that code path, yet. So right now its more a guess.
> 
> Hrm. Do you see the same probe error with 3.1 kernels as well?  
> 
Yes. But I think this is not really wrong. Remember this is in a paravirtualized
guest. Which does not necessarily have access to all physical hardware. Maybe
something usually not expected. And having the class device registered before
the probe finally finds out about it not being present could be a bit suboptimal
in that case.

-Stefan

> Konrad: Is the probe failure a known issue on Xen? Any clues on whats
> going on there?
> 
> thanks
> -john
> 
> 


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Regression: ONE CPU fails bootup at Re: [3.2.0-RC7] BUG: unable to handle kernel NULL pointer dereference at 0000000000000598  1.478005] IP: [<ffffffff8107a6c4>] queue_work_on+0x4/0x30
  2012-01-04 18:36           ` John Stultz
  2012-01-04 18:50             ` Stefan Bader
@ 2012-01-04 19:47             ` Konrad Rzeszutek Wilk
  1 sibling, 0 replies; 26+ messages in thread
From: Konrad Rzeszutek Wilk @ 2012-01-04 19:47 UTC (permalink / raw)
  To: John Stultz
  Cc: Stefan Bader, NeilBrown, Sander Eikelenboom, rjw,
	Thomas Gleixner, linux-kernel

On Wed, Jan 04, 2012 at 10:36:52AM -0800, John Stultz wrote:
> On Wed, 2012-01-04 at 09:17 +0100, Stefan Bader wrote:
> > Over night I had still be thinking on this and maybe one important fact I had
> > been ignoring. This really has only been observed on paravirt guests on Xen as
> > far as I know. And one thing that I should have pointed out is that
> > 
> > [    0.792634] rtc_cmos rtc_cmos: rtc core: registered rtc_cmos as rtc0
> > [    0.792725] rtc_cmos: probe of rtc_cmos failed with error -38
> > 
> > So first the registration is done and the first line is the last thing printed
> > in the registration function. Then, and that line always comes after, the probe,
> > which looks like being done asynchronously, detects that the rtc is not
> > implemented. I would assume that this causes the rtc to be unregistered again
> > and that is probably the point where, under the right circumstances, the worker
> > triggered by the initialize alarm is trying to set another alarm. Probably while
> > some of the elements of the structure started to be torn down. I need to check
> > on that code path, yet. So right now its more a guess.
> 
> Hrm. Do you see the same probe error with 3.1 kernels as well?  
> 
> Konrad: Is the probe failure a known issue on Xen? Any clues on whats
> going on there?

Hey John,

Stefan kind of summarized it - the paravirtualized guests do not have access
to the CMOS. In fact they have no access to any legacy device (except if one does
PCI passthrough) - so the rtc_core returning -38 is correct.

We have our own timer - which is the Xen hypervisor stamps the the nanosecond
resolution data in a per-cpu field that the timer API uses.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Regression: ONE CPU fails bootup at Re: [3.2.0-RC7] BUG: unable to handle kernel NULL pointer dereference at 0000000000000598  1.478005] IP: [<ffffffff8107a6c4>] queue_work_on+0x4/0x30
  2012-01-04 15:12           ` Regression: ONE CPU fails bootup at Re: [3.2.0-RC7] BUG: unable to handle kernel NULL pointer dereference at 0000000000000598 " Stefan Bader
@ 2012-01-05 22:03             ` NeilBrown
  0 siblings, 0 replies; 26+ messages in thread
From: NeilBrown @ 2012-01-05 22:03 UTC (permalink / raw)
  To: Stefan Bader
  Cc: John Stultz, Konrad Rzeszutek Wilk, Sander Eikelenboom, rjw,
	Thomas Gleixner, linux-kernel

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset=US-ASCII, Size: 1519 bytes --]

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Wed, 04 Jan 2012 16:12:02 +0100 Stefan Bader <stefan.bader@canonical.com>
wrote:


> Neil, this variation would also keep the kernel from crashing in my test
> environment. What do you think of that?

Hi Stefan,

That patch moves the call out of common code into device-specific code.
As my device doesn't have a 'cmos' rtc it wouldn't run at all for me.

I think we need to get __rtc_read_alarm() to return an error if the alarm time
is before the clock time.

Thanks,
NeilBrown

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.18 (GNU/Linux)

iQIVAwUBTwYeIjnsnt1WYoG5AQKZpw/9GAU05e5BPzZRF8ShqzEJyUUmYnHHQF7A
pZ2S1v7ihnWaEdK6bSdVfzwJpmdQOzNlJMjlLgh3BSZJRdHJtWKgPsZB46lyekYU
3j2KoBtXuzC5jY1HVLi8SMtwpPNJ3s4IcFTa2a5qhqfYCugzv7vWbxA/JTKpJtnB
2jYoBQyJ3W/dM4It2CXx/vouFUka/VtZ9AaMxCo+2c5LZaBXwgaodP9p6WQjyUTj
WASbNcIlUMEuI+fIYMh1SztWoBh4S0x8e7v1wIQluwCtb2zSEz9iIaf+oM1n/9jM
adMsrBhC6E+WZinijw/qTBWhLSR0uwfh16DLloqPh1BRq8PqLjXMt/zoAWgBh1uP
AKAZTYBzU4k8H0ies1F8Pc2LaX8b5py3I12Yk9u24F2G+RsF2GPJ3zUlnoD/YNaS
iBvoVbLlJLJ7l5OyoO1w6NnkWYzocXkV4K5NMdjABsbZATWuYZHMwgv+AAS2zXME
pffr6iyI4jI2TPFBRYIsbxIR0YjjqCVlwAGktVXAcYqppAO9itfwDtMKPTpmpvcd
rINnMbA4YGDEKo7htPAT5NiXZmGOIRvfAY2yd1dQdFlUhzGEA8uvYV8c1QScWqor
Gk0GThP5MSLJclg0kuWeRsVjeMKmKfzV+lcnw3Wx9s55RULff0HXM51Eyl4L+xtd
H8hLwRsN96I=
=UCTt
-----END PGP SIGNATURE-----
ÿôèº{.nÇ+‰·Ÿ®‰­†+%ŠËÿ±éݶ\x17¥Šwÿº{.nÇ+‰·¥Š{±þG«éÿŠ{ayº\x1dʇڙë,j\a­¢f£¢·hšïêÿ‘êçz_è®\x03(­éšŽŠÝ¢j"ú\x1a¶^[m§ÿÿ¾\a«þG«éÿ¢¸?™¨è­Ú&£ø§~á¶iO•æ¬z·švØ^\x14\x04\x1a¶^[m§ÿÿÃ\fÿ¶ìÿ¢¸?–I¥

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Regression: ONE CPU fails bootup at Re: [3.2.0-RC7] BUG: unable to handle kernel NULL pointer dereference at 0000000000000598  1.478005] IP: [<ffffffff8107a6c4>] queue_work_on+0x4/0x30
  2012-01-04 14:13             ` Stefan Bader
@ 2012-01-06 20:41               ` John Stultz
  2012-01-08 20:48                 ` Sander Eikelenboom
  2012-01-09 13:26                 ` Stefan Bader
  0 siblings, 2 replies; 26+ messages in thread
From: John Stultz @ 2012-01-06 20:41 UTC (permalink / raw)
  To: Stefan Bader
  Cc: NeilBrown, Konrad Rzeszutek Wilk, Sander Eikelenboom, rjw,
	Thomas Gleixner, linux-kernel

On Wed, 2012-01-04 at 15:13 +0100, Stefan Bader wrote:
> On 04.01.2012 13:25, Stefan Bader wrote:
> > On 04.01.2012 09:17, Stefan Bader wrote:
> >> On 04.01.2012 01:53, John Stultz wrote:
> >>> On Wed, 2012-01-04 at 11:31 +1100, NeilBrown wrote:
> >>>> On Tue, 03 Jan 2012 15:09:48 -0800 John Stultz <john.stultz@linaro.org> wrote:
> >>>>> >From the stack trace, we've kicked off a rtc_timer_do_work, probably
> >>>>> from the rtc_initialize_alarm() schedule_work call added in Neil's
> >>>>> patch.  From there, we call __rtc_set_alarm -> cmos_set_alarm ->
> >>>>> cmos_rq_disable -> cmos_checkintr -> rtc_update_irq -> schedule_work.
> >>
> >> Sorry, I was off for the evening a while after sending this out. And I just
> >> started, so a few thing I will be doing later but have not yet had time.
> >>
> >> Over night I had still be thinking on this and maybe one important fact I had
> >> been ignoring. This really has only been observed on paravirt guests on Xen as
> >> far as I know. And one thing that I should have pointed out is that
> >>
> >> [    0.792634] rtc_cmos rtc_cmos: rtc core: registered rtc_cmos as rtc0
> >> [    0.792725] rtc_cmos: probe of rtc_cmos failed with error -38
> >>
> >> So first the registration is done and the first line is the last thing printed
> >> in the registration function. Then, and that line always comes after, the probe,
> >> which looks like being done asynchronously, detects that the rtc is not
> >> implemented. I would assume that this causes the rtc to be unregistered again
> >> and that is probably the point where, under the right circumstances, the worker
> >> triggered by the initialize alarm is trying to set another alarm. Probably while
> >> some of the elements of the structure started to be torn down. I need to check
> >> on that code path, yet. So right now its more a guess.
> >>
> >>>>>
> >>>>> So, what it looks to me is that in cmos_checkintr, we grab the cmos->rtc
> >>>>> and pass that along. Unfortunately, since the cmos->rtc value isn't set
> >>>>> until after rtc_device_register() returns its null at that point. So
> >>>>> your patch isn't really fixing the issue, but just reducing the race
> >>>>> window for the second cpu to schedule the work.
> >>>>>
> >>>>> Sigh. I'd guess dropping the schedule_work call from
> >>>>> rtc_initialize_alarm() is the right approach (see below). When reviewing
> >>>>> Neil's patch it seemed like a good idea there, but it seems off to me
> >>>>> now.
> >>>>>
> >>>>> Neil, any thoughts on the following? Can you expand on the condition you
> >>>>> were worried about in around that call?
> >>>>
> >>>> If you set an alarm in the future, then shutdown and boot again after that
> >>>> time, then you will end up with a timer_queue node which is in the past.
> >>>
> >>> Thanks for explaining this again.
> >>>
> >>> Hrm. It seems the easy answer is to simply not add alarms that are in
> >>> the past.  Further, I'm a bit perplexed, as if they are in the past, the
> >>> enabled flag shouldn't be set.   __rtc_read_alarm() does check the
> >>> current time, so maybe we can make sure we don't return old values? I
> >>> guess I assumed __rtc_read_alarm() avoided returning stale values, but
> >>> apparently not.
> >>>
> >>>> When this happens the queue gets stuck.  That entry-in-the-past won't get
> >>>> removed until and interrupt happens and an interrupt won't happen because the
> >>>> RTC only triggers an interrupt when the alarm is "now".
> >>>>
> >>>> So you'll find that e.g. "hwclock" will always tell you that 'select' timed
> >>>> out.
> >>>>
> >>>> So we force the interrupt work to happen at the start just in case.
> >>>
> >>> Unfortunately its too early. 
> >>>
> >>>> Did you see my proposed patch which converted those calls to do the work
> >>>> in-process rather than passing it to a worker-thread?  I think that is a
> >>>> clean fix.
> >>>
> >>> I don't think I saw it today. Was it from before the holidays?
> >>>
> >>
> >> I fear I caused a bit of confusion there. Neil responded to my initial mail
> >> which was done as a reply to the mail announcing this patch for stable (which
> >> just was the first thread I could get hold of).
> >> I will try Neil's patch as well. And in parallel try to see whether the theory I
> >> had this night makes sense. If it does, then it is only indirectly that the work
> >> is scheduled too early. In that case just the teardown needs to make sure that
> >> no work is being run while removal. Well, maybe the question is whether there
> >> should be a delay in running the irq work until the device really, really is
> >> completely set up... But that sounds a bit more complicated.
> > 
> > By now I tried Neil's proposed patch and unfortunately that makes things rather
> > worse. I also played around with the idea of the unregistration race. Maybe
> > there also is one (that cancel_work_sync should be called before unregistering
> > the device) but definitely it is not what happens at least in the one CPU case.
> > I added some more printk's and the crash happens before even the rtc core class
> > has been fully registered. And no unregister is call has been made either.
> > 
> > Which may point to execution of the irq worker (including a schedule_work)
> > before the rtc-cmos parts are finished... Would explain why moving the
> > initialize call further down does at least narrow the window for it to happen...
> > The only thing I do not understand then is why that seems only to happen on Xen
> > guests...
> > 
> 
> Darn, guess I understand now... So cmos-rtc will call rtc_device_register from
> within do_cmos_probe and set cmos_rtc.rtc with the pointer that
> rtc_device_register returns.
> But when having rtc_initialize_alarm being called earlier in
> rtc_device_register, and that queues the work, there is a chance it will fire
> even before that function returns.
> And then cmos_checkintr will dereference the null pointer still stored in
> cmos_rtc.rtc for calling rtc_update_irq...

Hey Stefan, Konrad, Sander, Neil,
	Thanks again for the bug reporting and sorry for the earlier trouble.
So after the problematic patch was reverted for 3.2, I worked with Neil
to try to address the issue in a better way. It looks like its working
for Neil, so I was wondering if any of you could look over and maybe
give the following tree a whirl:
	git://git.linaro.org/people/jstultz/linux.git dev/rtc-fixups

The shortlog is here:
http://git.linaro.org/gitweb?p=people/jstultz/linux.git;a=shortlog;h=refs/heads/dev/rtc-fixups

If this avoids the issue for you, and no one sees any other problems,
I'll queue these for 3.3

thanks
-john


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Regression: ONE CPU fails bootup at Re: [3.2.0-RC7] BUG: unable to handle kernel NULL pointer dereference at 0000000000000598  1.478005] IP: [<ffffffff8107a6c4>] queue_work_on+0x4/0x30
  2012-01-06 20:41               ` John Stultz
@ 2012-01-08 20:48                 ` Sander Eikelenboom
  2012-01-09 13:26                 ` Stefan Bader
  1 sibling, 0 replies; 26+ messages in thread
From: Sander Eikelenboom @ 2012-01-08 20:48 UTC (permalink / raw)
  To: John Stultz
  Cc: Stefan Bader, NeilBrown, Konrad Rzeszutek Wilk, rjw,
	Thomas Gleixner, linux-kernel


Friday, January 6, 2012, 9:41:15 PM, you wrote:

> On Wed, 2012-01-04 at 15:13 +0100, Stefan Bader wrote:
>> On 04.01.2012 13:25, Stefan Bader wrote:
>> > On 04.01.2012 09:17, Stefan Bader wrote:
>> >> On 04.01.2012 01:53, John Stultz wrote:
>> >>> On Wed, 2012-01-04 at 11:31 +1100, NeilBrown wrote:
>> >>>> On Tue, 03 Jan 2012 15:09:48 -0800 John Stultz <john.stultz@linaro.org> wrote:
>> >>>>> >From the stack trace, we've kicked off a rtc_timer_do_work, probably
>> >>>>> from the rtc_initialize_alarm() schedule_work call added in Neil's
>> >>>>> patch.  From there, we call __rtc_set_alarm -> cmos_set_alarm ->
>> >>>>> cmos_rq_disable -> cmos_checkintr -> rtc_update_irq -> schedule_work.
>> >>
>> >> Sorry, I was off for the evening a while after sending this out. And I just
>> >> started, so a few thing I will be doing later but have not yet had time.
>> >>
>> >> Over night I had still be thinking on this and maybe one important fact I had
>> >> been ignoring. This really has only been observed on paravirt guests on Xen as
>> >> far as I know. And one thing that I should have pointed out is that
>> >>
>> >> [    0.792634] rtc_cmos rtc_cmos: rtc core: registered rtc_cmos as rtc0
>> >> [    0.792725] rtc_cmos: probe of rtc_cmos failed with error -38
>> >>
>> >> So first the registration is done and the first line is the last thing printed
>> >> in the registration function. Then, and that line always comes after, the probe,
>> >> which looks like being done asynchronously, detects that the rtc is not
>> >> implemented. I would assume that this causes the rtc to be unregistered again
>> >> and that is probably the point where, under the right circumstances, the worker
>> >> triggered by the initialize alarm is trying to set another alarm. Probably while
>> >> some of the elements of the structure started to be torn down. I need to check
>> >> on that code path, yet. So right now its more a guess.
>> >>
>> >>>>>
>> >>>>> So, what it looks to me is that in cmos_checkintr, we grab the cmos->rtc
>> >>>>> and pass that along. Unfortunately, since the cmos->rtc value isn't set
>> >>>>> until after rtc_device_register() returns its null at that point. So
>> >>>>> your patch isn't really fixing the issue, but just reducing the race
>> >>>>> window for the second cpu to schedule the work.
>> >>>>>
>> >>>>> Sigh. I'd guess dropping the schedule_work call from
>> >>>>> rtc_initialize_alarm() is the right approach (see below). When reviewing
>> >>>>> Neil's patch it seemed like a good idea there, but it seems off to me
>> >>>>> now.
>> >>>>>
>> >>>>> Neil, any thoughts on the following? Can you expand on the condition you
>> >>>>> were worried about in around that call?
>> >>>>
>> >>>> If you set an alarm in the future, then shutdown and boot again after that
>> >>>> time, then you will end up with a timer_queue node which is in the past.
>> >>>
>> >>> Thanks for explaining this again.
>> >>>
>> >>> Hrm. It seems the easy answer is to simply not add alarms that are in
>> >>> the past.  Further, I'm a bit perplexed, as if they are in the past, the
>> >>> enabled flag shouldn't be set.   __rtc_read_alarm() does check the
>> >>> current time, so maybe we can make sure we don't return old values? I
>> >>> guess I assumed __rtc_read_alarm() avoided returning stale values, but
>> >>> apparently not.
>> >>>
>> >>>> When this happens the queue gets stuck.  That entry-in-the-past won't get
>> >>>> removed until and interrupt happens and an interrupt won't happen because the
>> >>>> RTC only triggers an interrupt when the alarm is "now".
>> >>>>
>> >>>> So you'll find that e.g. "hwclock" will always tell you that 'select' timed
>> >>>> out.
>> >>>>
>> >>>> So we force the interrupt work to happen at the start just in case.
>> >>>
>> >>> Unfortunately its too early. 
>> >>>
>> >>>> Did you see my proposed patch which converted those calls to do the work
>> >>>> in-process rather than passing it to a worker-thread?  I think that is a
>> >>>> clean fix.
>> >>>
>> >>> I don't think I saw it today. Was it from before the holidays?
>> >>>
>> >>
>> >> I fear I caused a bit of confusion there. Neil responded to my initial mail
>> >> which was done as a reply to the mail announcing this patch for stable (which
>> >> just was the first thread I could get hold of).
>> >> I will try Neil's patch as well. And in parallel try to see whether the theory I
>> >> had this night makes sense. If it does, then it is only indirectly that the work
>> >> is scheduled too early. In that case just the teardown needs to make sure that
>> >> no work is being run while removal. Well, maybe the question is whether there
>> >> should be a delay in running the irq work until the device really, really is
>> >> completely set up... But that sounds a bit more complicated.
>> > 
>> > By now I tried Neil's proposed patch and unfortunately that makes things rather
>> > worse. I also played around with the idea of the unregistration race. Maybe
>> > there also is one (that cancel_work_sync should be called before unregistering
>> > the device) but definitely it is not what happens at least in the one CPU case.
>> > I added some more printk's and the crash happens before even the rtc core class
>> > has been fully registered. And no unregister is call has been made either.
>> > 
>> > Which may point to execution of the irq worker (including a schedule_work)
>> > before the rtc-cmos parts are finished... Would explain why moving the
>> > initialize call further down does at least narrow the window for it to happen...
>> > The only thing I do not understand then is why that seems only to happen on Xen
>> > guests...
>> > 
>> 
>> Darn, guess I understand now... So cmos-rtc will call rtc_device_register from
>> within do_cmos_probe and set cmos_rtc.rtc with the pointer that
>> rtc_device_register returns.
>> But when having rtc_initialize_alarm being called earlier in
>> rtc_device_register, and that queues the work, there is a chance it will fire
>> even before that function returns.
>> And then cmos_checkintr will dereference the null pointer still stored in
>> cmos_rtc.rtc for calling rtc_update_irq...

> Hey Stefan, Konrad, Sander, Neil,
>         Thanks again for the bug reporting and sorry for the earlier trouble.
> So after the problematic patch was reverted for 3.2, I worked with Neil
> to try to address the issue in a better way. It looks like its working
> for Neil, so I was wondering if any of you could look over and maybe
> give the following tree a whirl:
>         git://git.linaro.org/people/jstultz/linux.git dev/rtc-fixups

> The shortlog is here:
> http://git.linaro.org/gitweb?p=people/jstultz/linux.git;a=shortlog;h=refs/heads/dev/rtc-fixups

> If this avoids the issue for you, and no one sees any other problems,
> I'll queue these for 3.3

> thanks
> -john

Boots both with single as with multiple CPU Xen guests

Tested-by: Sander Eikelenboom <linux@eikelenboom.it>

thanks,

--
Sander


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Regression: ONE CPU fails bootup at Re: [3.2.0-RC7] BUG: unable to handle kernel NULL pointer dereference at 0000000000000598  1.478005] IP: [<ffffffff8107a6c4>] queue_work_on+0x4/0x30
  2012-01-06 20:41               ` John Stultz
  2012-01-08 20:48                 ` Sander Eikelenboom
@ 2012-01-09 13:26                 ` Stefan Bader
  1 sibling, 0 replies; 26+ messages in thread
From: Stefan Bader @ 2012-01-09 13:26 UTC (permalink / raw)
  To: John Stultz
  Cc: NeilBrown, Konrad Rzeszutek Wilk, Sander Eikelenboom, rjw,
	Thomas Gleixner, linux-kernel

On 06.01.2012 21:41, John Stultz wrote:
> On Wed, 2012-01-04 at 15:13 +0100, Stefan Bader wrote:
>> On 04.01.2012 13:25, Stefan Bader wrote:
>>> On 04.01.2012 09:17, Stefan Bader wrote:
>>>> On 04.01.2012 01:53, John Stultz wrote:
>>>>> On Wed, 2012-01-04 at 11:31 +1100, NeilBrown wrote:
>>>>>> On Tue, 03 Jan 2012 15:09:48 -0800 John Stultz <john.stultz@linaro.org> wrote:
>>>>>>> >From the stack trace, we've kicked off a rtc_timer_do_work, probably
>>>>>>> from the rtc_initialize_alarm() schedule_work call added in Neil's
>>>>>>> patch.  From there, we call __rtc_set_alarm -> cmos_set_alarm ->
>>>>>>> cmos_rq_disable -> cmos_checkintr -> rtc_update_irq -> schedule_work.
>>>>
>>>> Sorry, I was off for the evening a while after sending this out. And I just
>>>> started, so a few thing I will be doing later but have not yet had time.
>>>>
>>>> Over night I had still be thinking on this and maybe one important fact I had
>>>> been ignoring. This really has only been observed on paravirt guests on Xen as
>>>> far as I know. And one thing that I should have pointed out is that
>>>>
>>>> [    0.792634] rtc_cmos rtc_cmos: rtc core: registered rtc_cmos as rtc0
>>>> [    0.792725] rtc_cmos: probe of rtc_cmos failed with error -38
>>>>
>>>> So first the registration is done and the first line is the last thing printed
>>>> in the registration function. Then, and that line always comes after, the probe,
>>>> which looks like being done asynchronously, detects that the rtc is not
>>>> implemented. I would assume that this causes the rtc to be unregistered again
>>>> and that is probably the point where, under the right circumstances, the worker
>>>> triggered by the initialize alarm is trying to set another alarm. Probably while
>>>> some of the elements of the structure started to be torn down. I need to check
>>>> on that code path, yet. So right now its more a guess.
>>>>
>>>>>>>
>>>>>>> So, what it looks to me is that in cmos_checkintr, we grab the cmos->rtc
>>>>>>> and pass that along. Unfortunately, since the cmos->rtc value isn't set
>>>>>>> until after rtc_device_register() returns its null at that point. So
>>>>>>> your patch isn't really fixing the issue, but just reducing the race
>>>>>>> window for the second cpu to schedule the work.
>>>>>>>
>>>>>>> Sigh. I'd guess dropping the schedule_work call from
>>>>>>> rtc_initialize_alarm() is the right approach (see below). When reviewing
>>>>>>> Neil's patch it seemed like a good idea there, but it seems off to me
>>>>>>> now.
>>>>>>>
>>>>>>> Neil, any thoughts on the following? Can you expand on the condition you
>>>>>>> were worried about in around that call?
>>>>>>
>>>>>> If you set an alarm in the future, then shutdown and boot again after that
>>>>>> time, then you will end up with a timer_queue node which is in the past.
>>>>>
>>>>> Thanks for explaining this again.
>>>>>
>>>>> Hrm. It seems the easy answer is to simply not add alarms that are in
>>>>> the past.  Further, I'm a bit perplexed, as if they are in the past, the
>>>>> enabled flag shouldn't be set.   __rtc_read_alarm() does check the
>>>>> current time, so maybe we can make sure we don't return old values? I
>>>>> guess I assumed __rtc_read_alarm() avoided returning stale values, but
>>>>> apparently not.
>>>>>
>>>>>> When this happens the queue gets stuck.  That entry-in-the-past won't get
>>>>>> removed until and interrupt happens and an interrupt won't happen because the
>>>>>> RTC only triggers an interrupt when the alarm is "now".
>>>>>>
>>>>>> So you'll find that e.g. "hwclock" will always tell you that 'select' timed
>>>>>> out.
>>>>>>
>>>>>> So we force the interrupt work to happen at the start just in case.
>>>>>
>>>>> Unfortunately its too early. 
>>>>>
>>>>>> Did you see my proposed patch which converted those calls to do the work
>>>>>> in-process rather than passing it to a worker-thread?  I think that is a
>>>>>> clean fix.
>>>>>
>>>>> I don't think I saw it today. Was it from before the holidays?
>>>>>
>>>>
>>>> I fear I caused a bit of confusion there. Neil responded to my initial mail
>>>> which was done as a reply to the mail announcing this patch for stable (which
>>>> just was the first thread I could get hold of).
>>>> I will try Neil's patch as well. And in parallel try to see whether the theory I
>>>> had this night makes sense. If it does, then it is only indirectly that the work
>>>> is scheduled too early. In that case just the teardown needs to make sure that
>>>> no work is being run while removal. Well, maybe the question is whether there
>>>> should be a delay in running the irq work until the device really, really is
>>>> completely set up... But that sounds a bit more complicated.
>>>
>>> By now I tried Neil's proposed patch and unfortunately that makes things rather
>>> worse. I also played around with the idea of the unregistration race. Maybe
>>> there also is one (that cancel_work_sync should be called before unregistering
>>> the device) but definitely it is not what happens at least in the one CPU case.
>>> I added some more printk's and the crash happens before even the rtc core class
>>> has been fully registered. And no unregister is call has been made either.
>>>
>>> Which may point to execution of the irq worker (including a schedule_work)
>>> before the rtc-cmos parts are finished... Would explain why moving the
>>> initialize call further down does at least narrow the window for it to happen...
>>> The only thing I do not understand then is why that seems only to happen on Xen
>>> guests...
>>>
>>
>> Darn, guess I understand now... So cmos-rtc will call rtc_device_register from
>> within do_cmos_probe and set cmos_rtc.rtc with the pointer that
>> rtc_device_register returns.
>> But when having rtc_initialize_alarm being called earlier in
>> rtc_device_register, and that queues the work, there is a chance it will fire
>> even before that function returns.
>> And then cmos_checkintr will dereference the null pointer still stored in
>> cmos_rtc.rtc for calling rtc_update_irq...
> 
> Hey Stefan, Konrad, Sander, Neil,
> 	Thanks again for the bug reporting and sorry for the earlier trouble.
> So after the problematic patch was reverted for 3.2, I worked with Neil
> to try to address the issue in a better way. It looks like its working
> for Neil, so I was wondering if any of you could look over and maybe
> give the following tree a whirl:
> 	git://git.linaro.org/people/jstultz/linux.git dev/rtc-fixups
> 

Had a go with the patches applied with one HVM setup and two PVM setups (one and
4 vcpus) and have not seen any bad side effects.

Tested-by: Stefan Bader <stefan.bader@canonical.com>

> The shortlog is here:
> http://git.linaro.org/gitweb?p=people/jstultz/linux.git;a=shortlog;h=refs/heads/dev/rtc-fixups
> 
> If this avoids the issue for you, and no one sees any other problems,
> I'll queue these for 3.3
> 
> thanks
> -john
> 


^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2012-01-09 13:26 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-01-03 16:13 [3.2.0-RC7] BUG: unable to handle kernel NULL pointer dereference at 0000000000000598 [ 1.478005] IP: [<ffffffff8107a6c4>] queue_work_on+0x4/0x30 Sander Eikelenboom
2012-01-03 19:07 ` Regression: ONE CPU fails bootup at " Konrad Rzeszutek Wilk
2012-01-03 19:17   ` Sander Eikelenboom
2012-01-03 19:26   ` Stefan Bader
2012-01-03 20:11     ` Sander Eikelenboom
2012-01-03 20:10   ` Sander Eikelenboom
2012-01-03 22:33     ` Konrad Rzeszutek Wilk
2012-01-03 23:09   ` John Stultz
2012-01-04  0:31     ` NeilBrown
2012-01-04  0:53       ` John Stultz
2012-01-04  1:20         ` NeilBrown
2012-01-04 14:46           ` Konrad Rzeszutek Wilk
2012-01-04 15:12           ` Regression: ONE CPU fails bootup at Re: [3.2.0-RC7] BUG: unable to handle kernel NULL pointer dereference at 0000000000000598 " Stefan Bader
2012-01-05 22:03             ` NeilBrown
2012-01-04  8:17         ` Stefan Bader
2012-01-04 12:25           ` Stefan Bader
2012-01-04 13:17             ` Sander Eikelenboom
2012-01-04 18:33               ` John Stultz
2012-01-04 14:13             ` Stefan Bader
2012-01-06 20:41               ` John Stultz
2012-01-08 20:48                 ` Sander Eikelenboom
2012-01-09 13:26                 ` Stefan Bader
2012-01-04 18:35             ` John Stultz
2012-01-04 18:36           ` John Stultz
2012-01-04 18:50             ` Stefan Bader
2012-01-04 19:47             ` Konrad Rzeszutek Wilk

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).