linux-kernel.vger.kernel.org archive mirror
* WARNING: at kernel/lockdep.c:690 __lock_acquire+0x168/0x164b()
@ 2011-10-15 20:12 Sergey Senozhatsky
  2011-10-15 21:42 ` David Rientjes
  0 siblings, 1 reply; 26+ messages in thread
From: Sergey Senozhatsky @ 2011-10-15 20:12 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Ingo Molnar, linux-kernel, Andrew Morton

Hello,

3.1-rc9

[10172.218213] ------------[ cut here ]------------
[10172.218233] WARNING: at kernel/lockdep.c:690 __lock_acquire+0x168/0x164b()
[10172.218242] Hardware name: Aspire 5741G    
[10172.218248] Modules linked in: ipv6 usb_storage uas microcode snd_hda_codec_hdmi snd_hda_codec_realtek broadcom tg3 snd_hda_intel snd_hda_codec snd_pcm snd_timer snd rndis_host cdc_ether usbnet evdev psmouse soundcore pcspkr mii
snd_page_alloc libphy ac battery wmi button ehci_hcd sr_mod cdrom usbcore sd_mod ahci
[10172.218330] Pid: 22953, comm: kworker/0:2 Not tainted 3.1.0-rc9-dbg-00681-gec325b2 #730
[10172.218335] Call Trace:
[10172.218346]  [<ffffffff8103e7c8>] warn_slowpath_common+0x7e/0x96
[10172.218353]  [<ffffffff8103e7f5>] warn_slowpath_null+0x15/0x17
[10172.218361]  [<ffffffff8106fee5>] __lock_acquire+0x168/0x164b
[10172.218370]  [<ffffffff81034645>] ? find_busiest_group+0x7b6/0x941
[10172.218381]  [<ffffffff8102a5e3>] ? double_rq_lock+0x4d/0x52
[10172.218389]  [<ffffffff8107197e>] lock_acquire+0x138/0x1ac
[10172.218397]  [<ffffffff8102a5e3>] ? double_rq_lock+0x4d/0x52
[10172.218404]  [<ffffffff8102a5c4>] ? double_rq_lock+0x2e/0x52
[10172.218414]  [<ffffffff8148fb49>] _raw_spin_lock_nested+0x3a/0x49
[10172.218421]  [<ffffffff8102a5e3>] ? double_rq_lock+0x4d/0x52
[10172.218428]  [<ffffffff8148fabe>] ? _raw_spin_lock+0x3e/0x45
[10172.218435]  [<ffffffff8102a5c4>] ? double_rq_lock+0x2e/0x52
[10172.218442]  [<ffffffff8102a5e3>] double_rq_lock+0x4d/0x52
[10172.218449]  [<ffffffff810349cc>] load_balance+0x1fc/0x769
[10172.218458]  [<ffffffff810075c5>] ? native_sched_clock+0x38/0x65
[10172.218466]  [<ffffffff8148ca17>] ? __schedule+0x2f5/0xa2d
[10172.218474]  [<ffffffff8148caf5>] __schedule+0x3d3/0xa2d
[10172.218480]  [<ffffffff8148ca17>] ? __schedule+0x2f5/0xa2d
[10172.218490]  [<ffffffff8104db06>] ? add_timer_on+0xd/0x196
[10172.218497]  [<ffffffff8148fc02>] ? _raw_spin_lock_irq+0x4a/0x51
[10172.218505]  [<ffffffff8105907b>] ? process_one_work+0x3ed/0x54c
[10172.218512]  [<ffffffff81059126>] ? process_one_work+0x498/0x54c
[10172.218518]  [<ffffffff81058e1b>] ? process_one_work+0x18d/0x54c
[10172.218526]  [<ffffffff814902d0>] ? _raw_spin_unlock_irq+0x28/0x56
[10172.218533]  [<ffffffff81033950>] ? get_parent_ip+0xe/0x3e
[10172.218540]  [<ffffffff8148d26e>] schedule+0x55/0x57
[10172.218547]  [<ffffffff8105970f>] worker_thread+0x217/0x21c
[10172.218554]  [<ffffffff810594f8>] ? manage_workers.isra.21+0x16c/0x16c
[10172.218564]  [<ffffffff8105d4de>] kthread+0x9a/0xa2
[10172.218573]  [<ffffffff81497984>] kernel_thread_helper+0x4/0x10
[10172.218580]  [<ffffffff8102d6d2>] ? finish_task_switch+0x76/0xf3
[10172.218587]  [<ffffffff81490778>] ? retint_restore_args+0x13/0x13
[10172.218595]  [<ffffffff8105d444>] ? __init_kthread_worker+0x53/0x53
[10172.218602]  [<ffffffff81497980>] ? gs_change+0x13/0x13
[10172.218607] ---[ end trace 9d11d6b5e4b96730 ]---



Really not sure what's going on and how to reproduce.


	Sergey

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: WARNING: at kernel/lockdep.c:690 __lock_acquire+0x168/0x164b()
  2011-10-15 20:12 WARNING: at kernel/lockdep.c:690 __lock_acquire+0x168/0x164b() Sergey Senozhatsky
@ 2011-10-15 21:42 ` David Rientjes
  2011-10-15 22:23   ` Borislav Petkov
  0 siblings, 1 reply; 26+ messages in thread
From: David Rientjes @ 2011-10-15 21:42 UTC (permalink / raw)
  To: Sergey Senozhatsky, Tejun Heo, Tejun Heo
  Cc: Peter Zijlstra, Ingo Molnar, linux-kernel, Andrew Morton

On Sat, 15 Oct 2011, Sergey Senozhatsky wrote:

> [10172.218213] ------------[ cut here ]------------
> [10172.218233] WARNING: at kernel/lockdep.c:690 __lock_acquire+0x168/0x164b()
> [10172.218242] Hardware name: Aspire 5741G    
> [10172.218248] Modules linked in: ipv6 usb_storage uas microcode snd_hda_codec_hdmi snd_hda_codec_realtek broadcom tg3 snd_hda_intel snd_hda_codec snd_pcm snd_timer snd rndis_host cdc_ether usbnet evdev psmouse soundcore pcspkr mii
> snd_page_alloc libphy ac battery wmi button ehci_hcd sr_mod cdrom usbcore sd_mod ahci
> [10172.218330] Pid: 22953, comm: kworker/0:2 Not tainted 3.1.0-rc9-dbg-00681-gec325b2 #730
> [10172.218335] Call Trace:
> [10172.218346]  [<ffffffff8103e7c8>] warn_slowpath_common+0x7e/0x96
> [10172.218353]  [<ffffffff8103e7f5>] warn_slowpath_null+0x15/0x17
> [10172.218361]  [<ffffffff8106fee5>] __lock_acquire+0x168/0x164b
> [10172.218370]  [<ffffffff81034645>] ? find_busiest_group+0x7b6/0x941
> [10172.218381]  [<ffffffff8102a5e3>] ? double_rq_lock+0x4d/0x52
> [10172.218389]  [<ffffffff8107197e>] lock_acquire+0x138/0x1ac
> [10172.218397]  [<ffffffff8102a5e3>] ? double_rq_lock+0x4d/0x52
> [10172.218404]  [<ffffffff8102a5c4>] ? double_rq_lock+0x2e/0x52
> [10172.218414]  [<ffffffff8148fb49>] _raw_spin_lock_nested+0x3a/0x49
> [10172.218421]  [<ffffffff8102a5e3>] ? double_rq_lock+0x4d/0x52
> [10172.218428]  [<ffffffff8148fabe>] ? _raw_spin_lock+0x3e/0x45
> [10172.218435]  [<ffffffff8102a5c4>] ? double_rq_lock+0x2e/0x52
> [10172.218442]  [<ffffffff8102a5e3>] double_rq_lock+0x4d/0x52
> [10172.218449]  [<ffffffff810349cc>] load_balance+0x1fc/0x769
> [10172.218458]  [<ffffffff810075c5>] ? native_sched_clock+0x38/0x65
> [10172.218466]  [<ffffffff8148ca17>] ? __schedule+0x2f5/0xa2d
> [10172.218474]  [<ffffffff8148caf5>] __schedule+0x3d3/0xa2d
> [10172.218480]  [<ffffffff8148ca17>] ? __schedule+0x2f5/0xa2d
> [10172.218490]  [<ffffffff8104db06>] ? add_timer_on+0xd/0x196
> [10172.218497]  [<ffffffff8148fc02>] ? _raw_spin_lock_irq+0x4a/0x51
> [10172.218505]  [<ffffffff8105907b>] ? process_one_work+0x3ed/0x54c
> [10172.218512]  [<ffffffff81059126>] ? process_one_work+0x498/0x54c
> [10172.218518]  [<ffffffff81058e1b>] ? process_one_work+0x18d/0x54c
> [10172.218526]  [<ffffffff814902d0>] ? _raw_spin_unlock_irq+0x28/0x56
> [10172.218533]  [<ffffffff81033950>] ? get_parent_ip+0xe/0x3e
> [10172.218540]  [<ffffffff8148d26e>] schedule+0x55/0x57
> [10172.218547]  [<ffffffff8105970f>] worker_thread+0x217/0x21c
> [10172.218554]  [<ffffffff810594f8>] ? manage_workers.isra.21+0x16c/0x16c
> [10172.218564]  [<ffffffff8105d4de>] kthread+0x9a/0xa2
> [10172.218573]  [<ffffffff81497984>] kernel_thread_helper+0x4/0x10
> [10172.218580]  [<ffffffff8102d6d2>] ? finish_task_switch+0x76/0xf3
> [10172.218587]  [<ffffffff81490778>] ? retint_restore_args+0x13/0x13
> [10172.218595]  [<ffffffff8105d444>] ? __init_kthread_worker+0x53/0x53
> [10172.218602]  [<ffffffff81497980>] ? gs_change+0x13/0x13
> [10172.218607] ---[ end trace 9d11d6b5e4b96730 ]---

I think this is a problem with lockdep itself. Could you try reverting 
f59de8992aa6 ("lockdep: Clear whole lockdep_map on initialization") if 
this reliably happens every time you reboot? (Lockdep will only emit this 
warning once and then suppress future warnings until the next boot.)

I think the new memset() is inadvertently clearing the name for 
double_unlock_balance().
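
The once-per-boot behaviour mentioned above can be modelled in userspace
roughly as follows. This is only a sketch under assumptions, not the
kernel's macro: warn_once_model() and warnings_emitted are hypothetical
names, and a single flag is used here where the kernel would keep one per
call site.

```c
#include <assert.h>
#include <stdio.h>

/* Counts how many warnings were actually printed (illustration only). */
static int warnings_emitted;

/*
 * Hypothetical userspace model of a warn-once check: the first time the
 * condition is true a message is printed; later hits stay silent until
 * the process (standing in for the machine) is restarted.
 * Returns the condition so it can be used inside an if ().
 */
static int warn_once_model(int condition, const char *what)
{
	static int already_warned;

	if (condition && !already_warned) {
		already_warned = 1;
		warnings_emitted++;
		fprintf(stderr, "WARNING: %s\n", what);
	}
	return condition;
}
```

With this model, hitting the same condition twice prints one message, which
is why a single trace per boot says little about how often the problem
occurs.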

* Re: WARNING: at kernel/lockdep.c:690 __lock_acquire+0x168/0x164b()
  2011-10-15 21:42 ` David Rientjes
@ 2011-10-15 22:23   ` Borislav Petkov
  2011-10-15 22:32     ` David Rientjes
  0 siblings, 1 reply; 26+ messages in thread
From: Borislav Petkov @ 2011-10-15 22:23 UTC (permalink / raw)
  To: David Rientjes
  Cc: Sergey Senozhatsky, Tejun Heo, Tejun Heo, Peter Zijlstra,
	Ingo Molnar, linux-kernel, Andrew Morton

On Sat, Oct 15, 2011 at 02:42:14PM -0700, David Rientjes wrote:
> On Sat, 15 Oct 2011, Sergey Senozhatsky wrote:
> 
> > [10172.218213] ------------[ cut here ]------------
> > [10172.218233] WARNING: at kernel/lockdep.c:690 __lock_acquire+0x168/0x164b()
> > [10172.218242] Hardware name: Aspire 5741G    
> > [10172.218248] Modules linked in: ipv6 usb_storage uas microcode snd_hda_codec_hdmi snd_hda_codec_realtek broadcom tg3 snd_hda_intel snd_hda_codec snd_pcm snd_timer snd rndis_host cdc_ether usbnet evdev psmouse soundcore pcspkr mii
> > snd_page_alloc libphy ac battery wmi button ehci_hcd sr_mod cdrom usbcore sd_mod ahci
> > [10172.218330] Pid: 22953, comm: kworker/0:2 Not tainted 3.1.0-rc9-dbg-00681-gec325b2 #730
> > [10172.218335] Call Trace:
> > [10172.218346]  [<ffffffff8103e7c8>] warn_slowpath_common+0x7e/0x96
> > [10172.218353]  [<ffffffff8103e7f5>] warn_slowpath_null+0x15/0x17
> > [10172.218361]  [<ffffffff8106fee5>] __lock_acquire+0x168/0x164b
> > [10172.218370]  [<ffffffff81034645>] ? find_busiest_group+0x7b6/0x941
> > [10172.218381]  [<ffffffff8102a5e3>] ? double_rq_lock+0x4d/0x52
> > [10172.218389]  [<ffffffff8107197e>] lock_acquire+0x138/0x1ac
> > [10172.218397]  [<ffffffff8102a5e3>] ? double_rq_lock+0x4d/0x52
> > [10172.218404]  [<ffffffff8102a5c4>] ? double_rq_lock+0x2e/0x52
> > [10172.218414]  [<ffffffff8148fb49>] _raw_spin_lock_nested+0x3a/0x49
> > [10172.218421]  [<ffffffff8102a5e3>] ? double_rq_lock+0x4d/0x52
> > [10172.218428]  [<ffffffff8148fabe>] ? _raw_spin_lock+0x3e/0x45
> > [10172.218435]  [<ffffffff8102a5c4>] ? double_rq_lock+0x2e/0x52
> > [10172.218442]  [<ffffffff8102a5e3>] double_rq_lock+0x4d/0x52
> > [10172.218449]  [<ffffffff810349cc>] load_balance+0x1fc/0x769
> > [10172.218458]  [<ffffffff810075c5>] ? native_sched_clock+0x38/0x65
> > [10172.218466]  [<ffffffff8148ca17>] ? __schedule+0x2f5/0xa2d
> > [10172.218474]  [<ffffffff8148caf5>] __schedule+0x3d3/0xa2d
> > [10172.218480]  [<ffffffff8148ca17>] ? __schedule+0x2f5/0xa2d
> > [10172.218490]  [<ffffffff8104db06>] ? add_timer_on+0xd/0x196
> > [10172.218497]  [<ffffffff8148fc02>] ? _raw_spin_lock_irq+0x4a/0x51
> > [10172.218505]  [<ffffffff8105907b>] ? process_one_work+0x3ed/0x54c
> > [10172.218512]  [<ffffffff81059126>] ? process_one_work+0x498/0x54c
> > [10172.218518]  [<ffffffff81058e1b>] ? process_one_work+0x18d/0x54c
> > [10172.218526]  [<ffffffff814902d0>] ? _raw_spin_unlock_irq+0x28/0x56
> > [10172.218533]  [<ffffffff81033950>] ? get_parent_ip+0xe/0x3e
> > [10172.218540]  [<ffffffff8148d26e>] schedule+0x55/0x57
> > [10172.218547]  [<ffffffff8105970f>] worker_thread+0x217/0x21c
> > [10172.218554]  [<ffffffff810594f8>] ? manage_workers.isra.21+0x16c/0x16c
> > [10172.218564]  [<ffffffff8105d4de>] kthread+0x9a/0xa2
> > [10172.218573]  [<ffffffff81497984>] kernel_thread_helper+0x4/0x10
> > [10172.218580]  [<ffffffff8102d6d2>] ? finish_task_switch+0x76/0xf3
> > [10172.218587]  [<ffffffff81490778>] ? retint_restore_args+0x13/0x13
> > [10172.218595]  [<ffffffff8105d444>] ? __init_kthread_worker+0x53/0x53
> > [10172.218602]  [<ffffffff81497980>] ? gs_change+0x13/0x13
> > [10172.218607] ---[ end trace 9d11d6b5e4b96730 ]---
> 
> I think this is a problem with lockdep itself. Could you try reverting 
> f59de8992aa6 ("lockdep: Clear whole lockdep_map on initialization") if 
> this reliably happens every time you reboot? (Lockdep will only emit this 
> warning once and then suppress future warnings until the next boot.)
> 
> I think the new memset() is inadvertently clearing the name for 
> double_unlock_balance().

Great,

so I'm not the only one seeing the above:
http://marc.info/?l=linux-kernel&m=131468805610527

Due to it being very hard to reproduce, we dismissed it then as a
possible hw corruption.

But yeah, it looks like I have triggered it on -rc9 too, just the
other day. Oh, and I see -rc6 and -rc8 warnings in the logs too. Ok,
correction, not that hard to trigger.

Oct 11 09:08:11 liondog kernel: [15367.473110] ------------[ cut here ]------------
Oct 11 09:08:11 liondog kernel: [15367.473135] WARNING: at kernel/lockdep.c:690 __lock_acquire+0x173/0x17b5()
Oct 11 09:08:11 liondog kernel: [15367.473145] Hardware name: System Product Name
Oct 11 09:08:11 liondog kernel: [15367.473152] Modules linked in: cryptd aes_x86_64 aes_generic nls_iso8859_15 nls_cp437 tun cpufreq_powersave cpufreq_userspace cpufreq_conservative powernow_k8 mperf cpufreq_stats binfmt_misc fuse dm_crypt dm_mod ipv6 kvm_amd kvm vfat fat radeon 8250_pnp 8250 ttm drm_kms_helper cfbcopyarea edac_core serial_core cfbimgblt cfbfillrect k10temp
Oct 11 09:08:11 liondog kernel: [15367.473256] Pid: 0, comm: kworker/0:1 Not tainted 3.1.0-rc9-00005-g538d2882213e #5
Oct 11 09:08:11 liondog kernel: [15367.473264] Call Trace:
Oct 11 09:08:11 liondog kernel: [15367.473270]  <IRQ>  [<ffffffff810367ff>] warn_slowpath_common+0x83/0x9b
Oct 11 09:08:11 liondog kernel: [15367.473298]  [<ffffffff81036831>] warn_slowpath_null+0x1a/0x1c
Oct 11 09:08:11 liondog kernel: [15367.473309]  [<ffffffff810691fc>] __lock_acquire+0x173/0x17b5
Oct 11 09:08:11 liondog kernel: [15367.473321]  [<ffffffff8106a82c>] ? __lock_acquire+0x17a3/0x17b5
Oct 11 09:08:11 liondog kernel: [15367.473334]  [<ffffffff81029c53>] ? double_rq_lock+0x4d/0x52
Oct 11 09:08:11 liondog kernel: [15367.473346]  [<ffffffff8106ae67>] lock_acquire+0x154/0x198
Oct 11 09:08:11 liondog kernel: [15367.473356]  [<ffffffff81029c53>] ? double_rq_lock+0x4d/0x52
Oct 11 09:08:11 liondog kernel: [15367.473368]  [<ffffffff810659de>] ? put_lock_stats.isra.15+0xe/0x29
Oct 11 09:08:11 liondog kernel: [15367.473382]  [<ffffffff813c1c49>] _raw_spin_lock_nested+0x44/0x79
Oct 11 09:08:11 liondog kernel: [15367.473392]  [<ffffffff81029c53>] ? double_rq_lock+0x4d/0x52
Oct 11 09:08:11 liondog kernel: [15367.473403]  [<ffffffff813c1b84>] ? _raw_spin_lock+0x6c/0x73
Oct 11 09:08:11 liondog kernel: [15367.473413]  [<ffffffff81029c34>] ? double_rq_lock+0x2e/0x52
Oct 11 09:08:11 liondog kernel: [15367.473423]  [<ffffffff81029c53>] double_rq_lock+0x4d/0x52
Oct 11 09:08:11 liondog kernel: [15367.473434]  [<ffffffff8102e7d5>] load_balance+0x1b7/0x4f7
Oct 11 09:08:11 liondog kernel: [15367.473447]  [<ffffffff8102ec79>] rebalance_domains+0x164/0x1f9
Oct 11 09:08:11 liondog kernel: [15367.473458]  [<ffffffff8102eb15>] ? load_balance+0x4f7/0x4f7
Oct 11 09:08:11 liondog kernel: [15367.473470]  [<ffffffff8102edcb>] run_rebalance_domains+0xbd/0x12a
Oct 11 09:08:11 liondog kernel: [15367.473487]  [<ffffffff8103d180>] __do_softirq+0x165/0x2eb
Oct 11 09:08:11 liondog kernel: [15367.473499]  [<ffffffff81070ff4>] ? generic_smp_call_function_single_interrupt+0x9f/0xd8
Oct 11 09:08:11 liondog kernel: [15367.473512]  [<ffffffff813c472c>] call_softirq+0x1c/0x30
Oct 11 09:08:11 liondog kernel: [15367.473525]  [<ffffffff810036bb>] do_softirq+0x3d/0x86
Oct 11 09:08:11 liondog kernel: [15367.473535]  [<ffffffff8103d561>] irq_exit+0x53/0xbd
Oct 11 09:08:11 liondog kernel: [15367.473548]  [<ffffffff81018155>] smp_call_function_single_interrupt+0x34/0x37
Oct 11 09:08:11 liondog kernel: [15367.473560]  [<ffffffff813c41b0>] call_function_single_interrupt+0x70/0x80
Oct 11 09:08:11 liondog kernel: [15367.473567]  <EOI>  [<ffffffff8105b7f3>] ? local_clock+0xf/0x3b
Oct 11 09:08:11 liondog kernel: [15367.473586]  [<ffffffff8105b7f3>] ? local_clock+0xf/0x3b
Oct 11 09:08:11 liondog kernel: [15367.473598]  [<ffffffff810090b7>] ? default_idle+0xf1/0x1fd
Oct 11 09:08:11 liondog kernel: [15367.473610]  [<ffffffff810090b5>] ? default_idle+0xef/0x1fd
Oct 11 09:08:11 liondog kernel: [15367.473621]  [<ffffffff8100931a>] amd_e400_idle+0xc4/0xe7
Oct 11 09:08:11 liondog kernel: [15367.473632]  [<ffffffff8100074c>] cpu_idle+0x67/0xbe
Oct 11 09:08:11 liondog kernel: [15367.473645]  [<ffffffff813b54af>] start_secondary+0x1ad/0x1b2
Oct 11 09:08:11 liondog kernel: [15367.473655] ---[ end trace 63070f7e22365bb6 ]---

-- 
Regards/Gruss,
    Boris.

* Re: WARNING: at kernel/lockdep.c:690 __lock_acquire+0x168/0x164b()
  2011-10-15 22:23   ` Borislav Petkov
@ 2011-10-15 22:32     ` David Rientjes
  2011-10-16  5:09       ` Sergey Senozhatsky
  2011-10-20 18:39       ` Borislav Petkov
  0 siblings, 2 replies; 26+ messages in thread
From: David Rientjes @ 2011-10-15 22:32 UTC (permalink / raw)
  To: Borislav Petkov, Sergey Senozhatsky, Tejun Heo, Tejun Heo,
	Peter Zijlstra, Ingo Molnar, linux-kernel, Andrew Morton

On Sun, 16 Oct 2011, Borislav Petkov wrote:

> > I think this is a problem with lockdep itself. Could you try reverting 
> > f59de8992aa6 ("lockdep: Clear whole lockdep_map on initialization") if 
> > this reliably happens every time you reboot? (Lockdep will only emit this 
> > warning once and then suppress future warnings until the next boot.)
> > 
> > I think the new memset() is inadvertently clearing the name for 
> > double_unlock_balance().
> 
> Great,
> 
> so I'm not the only one seeing the above:
> http://marc.info/?l=linux-kernel&m=131468805610527
> 
> Due to it being very hard to reproduce, we dismissed it then as a
> possible hw corruption.
> 
> But yeah, it looks like I have triggered it on -rc9 too, just the
> other day. Oh, and I see -rc6 and -rc8 warnings in the logs too. Ok,
> correction, not that hard to trigger.
> 

Could you try to revert f59de8992aa6 ("lockdep: Clear whole lockdep_map on 
initialization") with this patch and see if it helps?  Thanks.
---
diff --git a/kernel/lockdep.c b/kernel/lockdep.c
--- a/kernel/lockdep.c
+++ b/kernel/lockdep.c
@@ -2874,7 +2874,10 @@ static int mark_lock(struct task_struct *curr, struct held_lock *this,
 void lockdep_init_map(struct lockdep_map *lock, const char *name,
 		      struct lock_class_key *key, int subclass)
 {
-	memset(lock, 0, sizeof(*lock));
+	int i;
+
+	for (i = 0; i < NR_LOCKDEP_CACHING_CLASSES; i++)
+		lock->class_cache[i] = NULL;
 
 #ifdef CONFIG_LOCK_STAT
 	lock->cpu = raw_smp_processor_id();
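
For reference, the practical difference between the two variants in the
patch is whether the name pointer is touched at all. A minimal userspace
sketch, assuming a simplified stand-in for struct lockdep_map (struct
map_sketch, its field layout, and the helper names are hypothetical, not the
kernel's definitions):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NR_CACHING_CLASSES_SKETCH 2	/* assumed size, for illustration */

/* Hypothetical, simplified stand-in for struct lockdep_map. */
struct map_sketch {
	void *key;
	void *class_cache[NR_CACHING_CLASSES_SKETCH];
	const char *name;
};

/* Models the memset() variant: every field, including name, is zeroed. */
static void init_clear_all(struct map_sketch *m)
{
	memset(m, 0, sizeof(*m));
}

/* Models the loop variant from the patch: only the cache is reset. */
static void init_clear_cache(struct map_sketch *m)
{
	int i;

	for (i = 0; i < NR_CACHING_CLASSES_SKETCH; i++)
		m->class_cache[i] = NULL;
}

/* Returns 1 if name survived the given clearing step, 0 otherwise. */
static int name_survives(void (*clear)(struct map_sketch *))
{
	static const char example_name[] = "example_lock";
	int dummy;
	struct map_sketch m = { &dummy, { &dummy, &dummy }, example_name };

	clear(&m);
	return m.name == example_name;
}
```

Calling name_survives(init_clear_all) reports that the name was wiped, while
name_survives(init_clear_cache) reports that it was left intact until the
later assignment overwrites it.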

* Re: WARNING: at kernel/lockdep.c:690 __lock_acquire+0x168/0x164b()
  2011-10-15 22:32     ` David Rientjes
@ 2011-10-16  5:09       ` Sergey Senozhatsky
  2011-10-20 18:39       ` Borislav Petkov
  1 sibling, 0 replies; 26+ messages in thread
From: Sergey Senozhatsky @ 2011-10-16  5:09 UTC (permalink / raw)
  To: David Rientjes
  Cc: Borislav Petkov, Tejun Heo, Tejun Heo, Peter Zijlstra,
	Ingo Molnar, linux-kernel, Andrew Morton

On (10/15/11 15:32), David Rientjes wrote:
> > > I think this is a problem with lockdep itself. Could you try reverting 
> > > f59de8992aa6 ("lockdep: Clear whole lockdep_map on initialization") if 
> > > this reliably happens every time you reboot? (Lockdep will only emit this 
> > > warning once and then suppress future warnings until the next boot.)
> > > 
> > > I think the new memset() is inadvertently clearing the name for 
> > > double_unlock_balance().
> > 
> > Great,
> > 
> > so I'm not the only one seeing the above:
> > http://marc.info/?l=linux-kernel&m=131468805610527
> > 
> > Due to it being very hard to reproduce, we dismissed it then as a
> > possible hw corruption.
> > 
> > But yeah, it looks like I have triggered it on -rc9 too, just the
> > other day. Oh, and I see -rc6 and -rc8 warnings in the logs too. Ok,
> > correction, not that hard to trigger.
> > 
> 
> Could you try to revert f59de8992aa6 ("lockdep: Clear whole lockdep_map on 
> initialization") with this patch and see if it helps?  Thanks.


Sure, I'd love to and will do, it's just I'm not sure I can easily reproduce it.



> ---
> diff --git a/kernel/lockdep.c b/kernel/lockdep.c
> --- a/kernel/lockdep.c
> +++ b/kernel/lockdep.c
> @@ -2874,7 +2874,10 @@ static int mark_lock(struct task_struct *curr, struct held_lock *this,
>  void lockdep_init_map(struct lockdep_map *lock, const char *name,
>  		      struct lock_class_key *key, int subclass)
>  {
> -	memset(lock, 0, sizeof(*lock));
> +	int i;
> +
> +	for (i = 0; i < NR_LOCKDEP_CACHING_CLASSES; i++)
> +		lock->class_cache[i] = NULL;
>  
>  #ifdef CONFIG_LOCK_STAT
>  	lock->cpu = raw_smp_processor_id();
> 

* Re: WARNING: at kernel/lockdep.c:690 __lock_acquire+0x168/0x164b()
  2011-10-15 22:32     ` David Rientjes
  2011-10-16  5:09       ` Sergey Senozhatsky
@ 2011-10-20 18:39       ` Borislav Petkov
  2011-10-20 18:53         ` Sergey Senozhatsky
  1 sibling, 1 reply; 26+ messages in thread
From: Borislav Petkov @ 2011-10-20 18:39 UTC (permalink / raw)
  To: David Rientjes
  Cc: Sergey Senozhatsky, Tejun Heo, Tejun Heo, Peter Zijlstra,
	Ingo Molnar, linux-kernel, Andrew Morton

On Sat, Oct 15, 2011 at 03:32:32PM -0700, David Rientjes wrote:
> Could you try to revert f59de8992aa6 ("lockdep: Clear whole lockdep_map on 
> initialization") with this patch and see if it helps?  Thanks.
> ---
> diff --git a/kernel/lockdep.c b/kernel/lockdep.c
> --- a/kernel/lockdep.c
> +++ b/kernel/lockdep.c
> @@ -2874,7 +2874,10 @@ static int mark_lock(struct task_struct *curr, struct held_lock *this,
>  void lockdep_init_map(struct lockdep_map *lock, const char *name,
>  		      struct lock_class_key *key, int subclass)
>  {
> -	memset(lock, 0, sizeof(*lock));
> +	int i;
> +
> +	for (i = 0; i < NR_LOCKDEP_CACHING_CLASSES; i++)
> +		lock->class_cache[i] = NULL;
>  
>  #ifdef CONFIG_LOCK_STAT
>  	lock->cpu = raw_smp_processor_id();

FWIW,

the box has been running here with f59de8992aa6 reverted for a couple of
days now and no sign of the warning. I'll keep watching it but it looks
ok so far, so David, you could've nailed it.

Thanks.

-- 
Regards/Gruss,
    Boris.

* Re: WARNING: at kernel/lockdep.c:690 __lock_acquire+0x168/0x164b()
  2011-10-20 18:39       ` Borislav Petkov
@ 2011-10-20 18:53         ` Sergey Senozhatsky
  2011-10-20 19:07           ` Sergey Senozhatsky
  0 siblings, 1 reply; 26+ messages in thread
From: Sergey Senozhatsky @ 2011-10-20 18:53 UTC (permalink / raw)
  To: Borislav Petkov, David Rientjes, Tejun Heo, Tejun Heo,
	Peter Zijlstra, Ingo Molnar, linux-kernel, Andrew Morton

On (10/20/11 20:39), Borislav Petkov wrote:
> On Sat, Oct 15, 2011 at 03:32:32PM -0700, David Rientjes wrote:
> > Could you try to revert f59de8992aa6 ("lockdep: Clear whole lockdep_map on 
> > initialization") with this patch and see if it helps?  Thanks.
> > ---
> > diff --git a/kernel/lockdep.c b/kernel/lockdep.c
> > --- a/kernel/lockdep.c
> > +++ b/kernel/lockdep.c
> > @@ -2874,7 +2874,10 @@ static int mark_lock(struct task_struct *curr, struct held_lock *this,
> >  void lockdep_init_map(struct lockdep_map *lock, const char *name,
> >  		      struct lock_class_key *key, int subclass)
> >  {
> > -	memset(lock, 0, sizeof(*lock));
> > +	int i;
> > +
> > +	for (i = 0; i < NR_LOCKDEP_CACHING_CLASSES; i++)
> > +		lock->class_cache[i] = NULL;
> >  
> >  #ifdef CONFIG_LOCK_STAT
> >  	lock->cpu = raw_smp_processor_id();
> 
> FWIW,
> 
> the box has been running here with f59de8992aa6 reverted for a couple of
> days now and no sign of the warning. I'll keep watching it but it looks
> ok so far, so David, you could've nailed it.
> 

Hello,
Well, the same with me. My laptop has been running with reverted f59de8992aa6 without any
problems so far. Yet, I'm not sure I understand how memset() and loop could
produce different results.

The commit in question (f59de8992aa6dc85e81aadc26b0f69e17809721d) was merged on
Jul 14 15:19:09 2011 +0200, so, Borislav, you probably should have seen it
not only on 3.1-rc5, 3.1-rc6,..., but even on 3.0.


	Sergey

* Re: WARNING: at kernel/lockdep.c:690 __lock_acquire+0x168/0x164b()
  2011-10-20 18:53         ` Sergey Senozhatsky
@ 2011-10-20 19:07           ` Sergey Senozhatsky
  2011-10-20 21:17             ` David Rientjes
  0 siblings, 1 reply; 26+ messages in thread
From: Sergey Senozhatsky @ 2011-10-20 19:07 UTC (permalink / raw)
  To: Borislav Petkov, David Rientjes, Tejun Heo, Tejun Heo,
	Peter Zijlstra, Ingo Molnar, linux-kernel, Andrew Morton

On (10/20/11 21:53), Sergey Senozhatsky wrote:
> Date: Thu, 20 Oct 2011 21:53:29 +0300
> From: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
> To: Borislav Petkov <bp@alien8.de>, David Rientjes <rientjes@google.com>,
>  Tejun Heo <tj@kernel.org>, Tejun Heo <htejun@gmail.com>, Peter Zijlstra
>  <peterz@infradead.org>, Ingo Molnar <mingo@elte.hu>,
>  linux-kernel@vger.kernel.org, Andrew Morton <akpm@linux-foundation.org>
> Subject: Re: WARNING: at kernel/lockdep.c:690 __lock_acquire+0x168/0x164b()
> User-Agent: Mutt/1.5.21 (2010-09-15)
> 
> On (10/20/11 20:39), Borislav Petkov wrote:
> > On Sat, Oct 15, 2011 at 03:32:32PM -0700, David Rientjes wrote:
> > > Could you try to revert f59de8992aa6 ("lockdep: Clear whole lockdep_map on 
> > > initialization") with this patch and see if it helps?  Thanks.
> > > ---
> > > diff --git a/kernel/lockdep.c b/kernel/lockdep.c
> > > --- a/kernel/lockdep.c
> > > +++ b/kernel/lockdep.c
> > > @@ -2874,7 +2874,10 @@ static int mark_lock(struct task_struct *curr, struct held_lock *this,
> > >  void lockdep_init_map(struct lockdep_map *lock, const char *name,
> > >  		      struct lock_class_key *key, int subclass)
> > >  {
> > > -	memset(lock, 0, sizeof(*lock));
> > > +	int i;
> > > +
> > > +	for (i = 0; i < NR_LOCKDEP_CACHING_CLASSES; i++)
> > > +		lock->class_cache[i] = NULL;
> > >  
> > >  #ifdef CONFIG_LOCK_STAT
> > >  	lock->cpu = raw_smp_processor_id();
> > 
> > FWIW,
> > 
> > the box has been running here with f59de8992aa6 reverted for a couple of
> > days now and no sign of the warning. I'll keep watching it but it looks
> > ok so far, so David, you could've nailed it.
> > 
> 
> Hello,
> Well, the same with me. My laptop has been running with reverted f59de8992aa6 without any
> problems so far. Yet, I'm not sure I understand how memset() and loop could
> produce different results.
> 

Oh, well, never mind, I think I get it.

Reverting opens https://bugzilla.kernel.org/show_bug.cgi?id=35532 again.  


> The commit in question (f59de8992aa6dc85e81aadc26b0f69e17809721d) was merged on
> Jul 14 15:19:09 2011 +0200, so, Borislav, you probably should have seen it
> not only on 3.1-rc5, 3.1-rc6,..., but even on 3.0.
> 
> 


	Sergey

* Re: WARNING: at kernel/lockdep.c:690 __lock_acquire+0x168/0x164b()
  2011-10-20 19:07           ` Sergey Senozhatsky
@ 2011-10-20 21:17             ` David Rientjes
  2011-10-20 21:23               ` Tejun Heo
  0 siblings, 1 reply; 26+ messages in thread
From: David Rientjes @ 2011-10-20 21:17 UTC (permalink / raw)
  To: Sergey Senozhatsky, Ingo Molnar, Tejun Heo, Tejun Heo
  Cc: Borislav Petkov, Peter Zijlstra, linux-kernel, Andrew Morton

On Thu, 20 Oct 2011, Sergey Senozhatsky wrote:

> > > FWIW,
> > > 
> > > the box has been running here with f59de8992aa6 reverted for a couple of
> > > days now and no sign of the warning. I'll keep watching it but it looks
> > > ok so far, so David, you could've nailed it.
> > > 
> > 
> > Hello,
> > Well, the same with me. My laptop has been running with reverted f59de8992aa6 without any
> > problems so far. Yet, I'm not sure I understand how memset() and loop could
> > produce different results.
> > 
> 
> Oh, well, never mind, I think I get it.
> 
> Reverting opens https://bugzilla.kernel.org/show_bug.cgi?id=35532 again.  
> 

I don't know what that is since bugzilla.kernel.org is down :)  The 
problem is that the memset(), in addition to all the other fields in 
lockdep_map, clears the "name" field, which is what the scheduler uses 
via lock_set_subclass() to prevent this lockdep warning.  My initial 
speculation seems to be confirmed since neither you nor Borislav has been 
able to reproduce the warning since removing the memset().

Tejun, would you like to revert f59de8992aa6 ("lockdep: Clear whole 
lockdep_map on initialization") since it fixes this lockdep warning?

* Re: WARNING: at kernel/lockdep.c:690 __lock_acquire+0x168/0x164b()
  2011-10-20 21:17             ` David Rientjes
@ 2011-10-20 21:23               ` Tejun Heo
  2011-10-20 21:31                 ` David Rientjes
  0 siblings, 1 reply; 26+ messages in thread
From: Tejun Heo @ 2011-10-20 21:23 UTC (permalink / raw)
  To: David Rientjes
  Cc: Sergey Senozhatsky, Ingo Molnar, Borislav Petkov, Peter Zijlstra,
	linux-kernel, Andrew Morton

Hello,

On Thu, Oct 20, 2011 at 02:17:29PM -0700, David Rientjes wrote:
> On Thu, 20 Oct 2011, Sergey Senozhatsky wrote:
> 
> > > > FWIW,
> > > > 
> > > > the box has been running here with f59de8992aa6 reverted for a couple of
> > > > days now and no sign of the warning. I'll keep watching it but it looks
> > > > ok so far, so David, you could've nailed it.
> > > > 
> > > 
> > > Hello,
> > > Well, the same with me. My laptop has been running with reverted f59de8992aa6 without any
> > > problems so far. Yet, I'm not sure I understand how memset() and loop could
> > > produce different results.
> > > 
> > 
> > Oh, well, never mind, I think I get it.
> > 
> > Reverting opens https://bugzilla.kernel.org/show_bug.cgi?id=35532 again.  
> > 
> 
> I don't know what that is since bugzilla.kernel.org is down :)  The 
> problem is that the memset(), in addition to all the other fields in 
> lockdep_map, clears the "name" field, which is what the scheduler uses 
> via lock_set_subclass() to prevent this lockdep warning.  My initial 
> speculation seems to be confirmed since neither you nor Borislav has been 
> able to reproduce the warning since removing the memset().
> 
> Tejun, would you like to revert f59de8992aa6 ("lockdep: Clear whole 
> lockdep_map on initialization") since it fixes this lockdep warning?

Hmmm... the issue was that kmemcheck noticed that memory regions in
lockdep_map are accessed before being set to any value.  I'm feeling
dim as usual and don't understand what's going on here.  The function
looks like the following.


 void lockdep_init_map(struct lockdep_map *lock, const char *name,
		       struct lock_class_key *key, int subclass)
 {
	 memset(lock, 0, sizeof(*lock));

 #ifdef CONFIG_LOCK_STAT
	 lock->cpu = raw_smp_processor_id();
 #endif
	 if (DEBUG_LOCKS_WARN_ON(!name)) {
		 lock->name = "NULL";
		 return;
	 }

	 lock->name = name;


So, according to this thread, the problem is that the memset() clears
lock->name field, right?  But how can that be a problem?  lock->name
is always set to either "NULL" or @name.  Why would clearing it before
setting make any difference?  What am I missing?

Thanks.

-- 
tejun

* Re: WARNING: at kernel/lockdep.c:690 __lock_acquire+0x168/0x164b()
  2011-10-20 21:23               ` Tejun Heo
@ 2011-10-20 21:31                 ` David Rientjes
  2011-10-20 21:36                   ` Tejun Heo
  0 siblings, 1 reply; 26+ messages in thread
From: David Rientjes @ 2011-10-20 21:31 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Sergey Senozhatsky, Ingo Molnar, Borislav Petkov, Peter Zijlstra,
	linux-kernel, Andrew Morton

On Thu, 20 Oct 2011, Tejun Heo wrote:

> > Tejun, would you like to revert f59de8992aa6 ("lockdep: Clear whole 
> > lockdep_map on initialization") since it fixes this lockdep warning?
> 
> Hmmm... the issue was that kmemcheck noticed that memory regions in
> lockdep_map are accessed before being set to any value.  I'm feeling
> dim as usual and don't understand what's going on here.  The function
> looks like the following.
> 
> 
>  void lockdep_init_map(struct lockdep_map *lock, const char *name,
> 		       struct lock_class_key *key, int subclass)
>  {
> 	 memset(lock, 0, sizeof(*lock));
> 
>  #ifdef CONFIG_LOCK_STAT
> 	 lock->cpu = raw_smp_processor_id();
>  #endif
> 	 if (DEBUG_LOCKS_WARN_ON(!name)) {
> 		 lock->name = "NULL";
> 		 return;
> 	 }
> 
> 	 lock->name = name;
> 
> 
> So, according to this thread, the problem is that the memset() clears
> lock->name field, right?

Right, and reverting f59de8992aa6 ("lockdep: Clear whole lockdep_map on 
initialization") seems to fix the lockdep warning.

> But how can that be a problem?  lock->name
> is always set to either "NULL" or @name.  Why would clearing it before
> setting make any difference?  What am I missing?
> 

The scheduler (in sched_fair and sched_rt) calls lock_set_subclass() from 
double_unlock_balance(), which sets the name again, but there's a race 
between the memset() clearing lock->name and the store that sets it, 
during which lockdep can find that the names do not match.
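
To make the window concrete, here is a minimal single-threaded userspace
sketch, not the kernel code. The struct layouts, both init_map_*() helpers
and observe() (which stands in for another CPU doing the name check in the
middle of initialization) are assumptions made up for illustration.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for lock_class and lockdep_map. */
struct fake_class { const void *key; const char *name; };
struct fake_map   { const void *key; const char *name; };

static struct fake_class the_class;
static int mismatches;	/* times the observer saw differing names */

/* Plays the role of the other CPU comparing class and lock names. */
static void observe(const struct fake_map *lock)
{
	if (the_class.name != lock->name)
		mismatches++;
}

/* Clear-then-store: the name is NULL between memset() and the store. */
static void init_map_clearing(struct fake_map *lock, const char *name,
			      const void *key)
{
	memset(lock, 0, sizeof(*lock));
	observe(lock);
	lock->name = name;
	lock->key = key;
}

/* Store-only: the old name stays visible until it is overwritten. */
static void init_map_storing(struct fake_map *lock, const char *name,
			     const void *key)
{
	observe(lock);
	lock->name = name;
	lock->key = key;
}

/* Re-initialize an already named lock and report observed mismatches. */
static int race_window_demo(int clearing)
{
	static int key;
	static const char name[] = "&rq->lock";
	struct fake_map lock = { &key, name };

	the_class.key = &key;
	the_class.name = name;
	mismatches = 0;

	if (clearing)
		init_map_clearing(&lock, name, &key);
	else
		init_map_storing(&lock, name, &key);

	/* Once init has finished, the names agree again in both variants. */
	assert(lock.name == the_class.name);
	return mismatches;
}
```

In this sketch only the clearing variant lets the observer see a
mismatch, and only in the middle of the re-initialization.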

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: WARNING: at kernel/lockdep.c:690 __lock_acquire+0x168/0x164b()
  2011-10-20 21:31                 ` David Rientjes
@ 2011-10-20 21:36                   ` Tejun Heo
  2011-10-20 23:00                     ` Sergey Senozhatsky
  0 siblings, 1 reply; 26+ messages in thread
From: Tejun Heo @ 2011-10-20 21:36 UTC (permalink / raw)
  To: David Rientjes
  Cc: Sergey Senozhatsky, Ingo Molnar, Borislav Petkov, Peter Zijlstra,
	linux-kernel, Andrew Morton

Hello,

On Thu, Oct 20, 2011 at 02:31:39PM -0700, David Rientjes wrote:
> > So, according to this thread, the problem is that the memset() clears
> > lock->name field, right?
> 
> Right, and reverting f59de8992aa6 ("lockdep: Clear whole lockdep_map on 
> initialization") seems to fix the lockdep warning.
> 
> > But how can that be a problem?  lock->name
> > is always set to either "NULL" or @name.  Why would clearing it before
> > setting make any difference?  What am I missing?
> > 
> 
> The scheduler (in sched_fair and sched_rt) calls lock_set_subclass() from 
> double_unlock_balance(), which sets the name again, but there's a race 
> between the memset() clearing lock->name and the store that sets it, 
> during which lockdep can find that the names do not match.

Hmmm... so lock_set_subclass() is racing against lockdep_init_map()?  That
sounds very fishy and probably needs a better fix.  Anyways, if no one
can come up with a proper solution, please feel free to revert the
commit.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: WARNING: at kernel/lockdep.c:690 __lock_acquire+0x168/0x164b()
  2011-10-20 21:36                   ` Tejun Heo
@ 2011-10-20 23:00                     ` Sergey Senozhatsky
  2011-10-21  9:14                       ` David Rientjes
  0 siblings, 1 reply; 26+ messages in thread
From: Sergey Senozhatsky @ 2011-10-20 23:00 UTC (permalink / raw)
  To: Tejun Heo
  Cc: David Rientjes, Ingo Molnar, Borislav Petkov, Peter Zijlstra,
	linux-kernel, Andrew Morton

On (10/20/11 14:36), Tejun Heo wrote:
> Hello,
> 
> On Thu, Oct 20, 2011 at 02:31:39PM -0700, David Rientjes wrote:
> > > So, according to this thread, the problem is that the memset() clears
> > > lock->name field, right?
> > 
> > Right, and reverting f59de8992aa6 ("lockdep: Clear whole lockdep_map on 
> > initialization") seems to fix the lockdep warning.
> > 
> > > But how can that be a problem?  lock->name
> > > is always set to either "NULL" or @name.  Why would clearing it before
> > > setting make any difference?  What am I missing?
> > > 
> > 
> > The scheduler (in sched_fair and sched_rt) calls lock_set_subclass() from 
> > double_unlock_balance(), which sets the name again, but there's a race 
> > between the memset() clearing lock->name and the store that sets it, 
> > during which lockdep can find that the names do not match.
> 
> Hmmm... so lock_set_subclass() is racing against lockdep_init_map()?  That
> sounds very fishy and probably needs a better fix.  Anyways, if no one
> can come up with a proper solution, please feel free to revert the
> commit.
> 

I thought I had started to understand this, but that feeling was wrong.

The error is indeed that the class name and the lock name mismatch:

 689                 if (class->key == key) {                                                                                                                                                                                      
 690                         WARN_ON_ONCE(class->name != lock->name);                                            
 691                         return class;                                                                       
 692                 }  
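
As a rough userspace sketch of what that check does (not the kernel
implementation): struct fake_class, fake_look_up() and lookup_demo() are
hypothetical names, and a plain linked list stands in for the real hash
chain. Note that the names are compared as pointers, not as strings.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct lock_class. */
struct fake_class {
	const void *key;
	const char *name;
	struct fake_class *next;
};

/* Number of times the name check would have warned. */
static int name_warnings;

/* Walk the list; a key match returns the class, counting name mismatches. */
static struct fake_class *fake_look_up(struct fake_class *head,
				       const void *key, const char *lock_name)
{
	struct fake_class *cls;

	for (cls = head; cls; cls = cls->next) {
		if (cls->key == key) {
			/* Pointer comparison, like the check at line 690. */
			if (cls->name != lock_name)
				name_warnings++;
			return cls;
		}
	}
	return NULL;
}

/* One lookup with a matching name, one with a cleared (NULL) name. */
static int lookup_demo(void)
{
	static int key;
	static const char name[] = "&rq->lock";
	struct fake_class cls = { &key, name, NULL };

	name_warnings = 0;
	assert(fake_look_up(&cls, &key, name) == &cls);	/* names agree */
	assert(fake_look_up(&cls, &key, NULL) == &cls);	/* name cleared */
	return name_warnings;
}
```

Both lookups find the class by key; only the one that sees a NULL lock
name counts a warning.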

And the problem, as far as I understand, only shows up when active_load_balance_cpu_stop() gets
called on an rq with active_balance.

double_unlock_balance() is called with the busiest_rq spin lock held, and I don't see anyone
calling lockdep_init_map() on busiest_rq around there. work_struct has its
own lockdep_map, which is touched after __queue_work(cpu, wq, work).

I'm not sure that reverting is the best option we have, since it doesn't fix
the possible race condition; it just masks it.


I haven't had much luck reproducing the issue; in fact, I have had only one trace so far.

[10172.218213] ------------[ cut here ]------------
[10172.218233] WARNING: at kernel/lockdep.c:690 __lock_acquire+0x168/0x164b()
[10172.218346]  [<ffffffff8103e7c8>] warn_slowpath_common+0x7e/0x96
[10172.218353]  [<ffffffff8103e7f5>] warn_slowpath_null+0x15/0x17
[10172.218361]  [<ffffffff8106fee5>] __lock_acquire+0x168/0x164b
[10172.218370]  [<ffffffff81034645>] ? find_busiest_group+0x7b6/0x941
[10172.218381]  [<ffffffff8102a5e3>] ? double_rq_lock+0x4d/0x52
[10172.218389]  [<ffffffff8107197e>] lock_acquire+0x138/0x1ac
[10172.218397]  [<ffffffff8102a5e3>] ? double_rq_lock+0x4d/0x52
[10172.218404]  [<ffffffff8102a5c4>] ? double_rq_lock+0x2e/0x52
[10172.218414]  [<ffffffff8148fb49>] _raw_spin_lock_nested+0x3a/0x49
[10172.218421]  [<ffffffff8102a5e3>] ? double_rq_lock+0x4d/0x52
[10172.218428]  [<ffffffff8148fabe>] ? _raw_spin_lock+0x3e/0x45
[10172.218435]  [<ffffffff8102a5c4>] ? double_rq_lock+0x2e/0x52
[10172.218442]  [<ffffffff8102a5e3>] double_rq_lock+0x4d/0x52
[10172.218449]  [<ffffffff810349cc>] load_balance+0x1fc/0x769
[10172.218458]  [<ffffffff810075c5>] ? native_sched_clock+0x38/0x65
[10172.218466]  [<ffffffff8148ca17>] ? __schedule+0x2f5/0xa2d
[10172.218474]  [<ffffffff8148caf5>] __schedule+0x3d3/0xa2d
[10172.218480]  [<ffffffff8148ca17>] ? __schedule+0x2f5/0xa2d
[10172.218490]  [<ffffffff8104db06>] ? add_timer_on+0xd/0x196
[10172.218497]  [<ffffffff8148fc02>] ? _raw_spin_lock_irq+0x4a/0x51
[10172.218505]  [<ffffffff8105907b>] ? process_one_work+0x3ed/0x54c
[10172.218512]  [<ffffffff81059126>] ? process_one_work+0x498/0x54c
[10172.218518]  [<ffffffff81058e1b>] ? process_one_work+0x18d/0x54c
[10172.218526]  [<ffffffff814902d0>] ? _raw_spin_unlock_irq+0x28/0x56
[10172.218533]  [<ffffffff81033950>] ? get_parent_ip+0xe/0x3e
[10172.218540]  [<ffffffff8148d26e>] schedule+0x55/0x57
[10172.218547]  [<ffffffff8105970f>] worker_thread+0x217/0x21c
[10172.218554]  [<ffffffff810594f8>] ? manage_workers.isra.21+0x16c/0x16c
[10172.218564]  [<ffffffff8105d4de>] kthread+0x9a/0xa2
[10172.218573]  [<ffffffff81497984>] kernel_thread_helper+0x4/0x10
[10172.218580]  [<ffffffff8102d6d2>] ? finish_task_switch+0x76/0xf3
[10172.218587]  [<ffffffff81490778>] ? retint_restore_args+0x13/0x13
[10172.218595]  [<ffffffff8105d444>] ? __init_kthread_worker+0x53/0x53
[10172.218602]  [<ffffffff81497980>] ? gs_change+0x13/0x13
[10172.218607] ---[ end trace 9d11d6b5e4b96730 ]---



	Sergey

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: WARNING: at kernel/lockdep.c:690 __lock_acquire+0x168/0x164b()
  2011-10-20 23:00                     ` Sergey Senozhatsky
@ 2011-10-21  9:14                       ` David Rientjes
  2011-10-21  9:26                         ` Sergey Senozhatsky
  2011-10-21  9:45                         ` Yong Zhang
  0 siblings, 2 replies; 26+ messages in thread
From: David Rientjes @ 2011-10-21  9:14 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Tejun Heo, Ingo Molnar, Borislav Petkov, Peter Zijlstra,
	linux-kernel, Andrew Morton

On Fri, 21 Oct 2011, Sergey Senozhatsky wrote:

> I thought I had started to understand this, but that feeling was wrong.
> 
> The error is indeed that the class name and the lock name mismatch:
> 
>  689                 if (class->key == key) {                                                                                                                                                                                      
>  690                         WARN_ON_ONCE(class->name != lock->name);                                            
>  691                         return class;                                                                       
>  692                 }  
> 
> And the problem, as far as I understand, only shows up when active_load_balance_cpu_stop() gets
> called on an rq with active_balance.
> 
> double_unlock_balance() is called with the busiest_rq spin lock held, and I don't see anyone
> calling lockdep_init_map() on busiest_rq around there. work_struct has its
> own lockdep_map, which is touched after __queue_work(cpu, wq, work).
> 
> I'm not sure that reverting is the best option we have, since it doesn't fix
> the possible race condition; it just masks it.
> 

How does it mask the race condition?  Before the memset(), the ->name 
field was never _cleared_ in lockdep_init_map() like it is now, it was 
only stored.

> I haven't had much luck reproducing the issue; in fact, I have had only one trace so far.
> 
> [10172.218213] ------------[ cut here ]------------
> [10172.218233] WARNING: at kernel/lockdep.c:690 __lock_acquire+0x168/0x164b()
> [10172.218346]  [<ffffffff8103e7c8>] warn_slowpath_common+0x7e/0x96
> [10172.218353]  [<ffffffff8103e7f5>] warn_slowpath_null+0x15/0x17
> [10172.218361]  [<ffffffff8106fee5>] __lock_acquire+0x168/0x164b
> [10172.218370]  [<ffffffff81034645>] ? find_busiest_group+0x7b6/0x941
> [10172.218381]  [<ffffffff8102a5e3>] ? double_rq_lock+0x4d/0x52
> [10172.218389]  [<ffffffff8107197e>] lock_acquire+0x138/0x1ac
> [10172.218397]  [<ffffffff8102a5e3>] ? double_rq_lock+0x4d/0x52
> [10172.218404]  [<ffffffff8102a5c4>] ? double_rq_lock+0x2e/0x52
> [10172.218414]  [<ffffffff8148fb49>] _raw_spin_lock_nested+0x3a/0x49
> [10172.218421]  [<ffffffff8102a5e3>] ? double_rq_lock+0x4d/0x52
> [10172.218428]  [<ffffffff8148fabe>] ? _raw_spin_lock+0x3e/0x45
> [10172.218435]  [<ffffffff8102a5c4>] ? double_rq_lock+0x2e/0x52
> [10172.218442]  [<ffffffff8102a5e3>] double_rq_lock+0x4d/0x52
> [10172.218449]  [<ffffffff810349cc>] load_balance+0x1fc/0x769
> [10172.218458]  [<ffffffff810075c5>] ? native_sched_clock+0x38/0x65
> [10172.218466]  [<ffffffff8148ca17>] ? __schedule+0x2f5/0xa2d
> [10172.218474]  [<ffffffff8148caf5>] __schedule+0x3d3/0xa2d
> [10172.218480]  [<ffffffff8148ca17>] ? __schedule+0x2f5/0xa2d
> [10172.218490]  [<ffffffff8104db06>] ? add_timer_on+0xd/0x196
> [10172.218497]  [<ffffffff8148fc02>] ? _raw_spin_lock_irq+0x4a/0x51
> [10172.218505]  [<ffffffff8105907b>] ? process_one_work+0x3ed/0x54c
> [10172.218512]  [<ffffffff81059126>] ? process_one_work+0x498/0x54c
> [10172.218518]  [<ffffffff81058e1b>] ? process_one_work+0x18d/0x54c
> [10172.218526]  [<ffffffff814902d0>] ? _raw_spin_unlock_irq+0x28/0x56
> [10172.218533]  [<ffffffff81033950>] ? get_parent_ip+0xe/0x3e
> [10172.218540]  [<ffffffff8148d26e>] schedule+0x55/0x57
> [10172.218547]  [<ffffffff8105970f>] worker_thread+0x217/0x21c
> [10172.218554]  [<ffffffff810594f8>] ? manage_workers.isra.21+0x16c/0x16c
> [10172.218564]  [<ffffffff8105d4de>] kthread+0x9a/0xa2
> [10172.218573]  [<ffffffff81497984>] kernel_thread_helper+0x4/0x10
> [10172.218580]  [<ffffffff8102d6d2>] ? finish_task_switch+0x76/0xf3
> [10172.218587]  [<ffffffff81490778>] ? retint_restore_args+0x13/0x13
> [10172.218595]  [<ffffffff8105d444>] ? __init_kthread_worker+0x53/0x53
> [10172.218602]  [<ffffffff81497980>] ? gs_change+0x13/0x13
> [10172.218607] ---[ end trace 9d11d6b5e4b96730 ]---
> 

This is with the revert?

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: WARNING: at kernel/lockdep.c:690 __lock_acquire+0x168/0x164b()
  2011-10-21  9:14                       ` David Rientjes
@ 2011-10-21  9:26                         ` Sergey Senozhatsky
  2011-10-21  9:45                         ` Yong Zhang
  1 sibling, 0 replies; 26+ messages in thread
From: Sergey Senozhatsky @ 2011-10-21  9:26 UTC (permalink / raw)
  To: David Rientjes
  Cc: Tejun Heo, Ingo Molnar, Borislav Petkov, Peter Zijlstra,
	linux-kernel, Andrew Morton

On (10/21/11 02:14), David Rientjes wrote:
> > I thought I had started to understand this, but that feeling was wrong.
> > 
> > The error is indeed that the class name and the lock name mismatch:
> > 
> >  689                 if (class->key == key) {                                                                                                                                                                                      
> >  690                         WARN_ON_ONCE(class->name != lock->name);                                            
> >  691                         return class;                                                                       
> >  692                 }  
> > 
> > And the problem, as far as I understand, only shows up when active_load_balance_cpu_stop() gets
> > called on an rq with active_balance.
> > 
> > double_unlock_balance() is called with the busiest_rq spin lock held, and I don't see anyone
> > calling lockdep_init_map() on busiest_rq around there. work_struct has its
> > own lockdep_map, which is touched after __queue_work(cpu, wq, work).
> > 
> > I'm not sure that reverting is the best option we have, since it doesn't fix
> > the possible race condition; it just masks it.
> > 
> 
> How does it mask the race condition?  Before the memset(), the ->name 
> field was never _cleared_ in lockdep_init_map() like it is now, it was 
> only stored.
>

Well, if we have a race condition between `reader' and `writer', then we are just lucky that we only hit it
with the ->name modification. It could just as well be `->cpu = raw_smp_processor_id()' or the loop that iterates
through `class_cache' to NULL it. The current implementation may only race on `->name', but in theory there is
a whole bunch of opportunities. Of course, I may be wrong.

 
> > I haven't had much luck reproducing the issue; in fact, I have had only one trace so far.
> > 
> > [10172.218213] ------------[ cut here ]------------
> > [10172.218233] WARNING: at kernel/lockdep.c:690 __lock_acquire+0x168/0x164b()
> > [10172.218346]  [<ffffffff8103e7c8>] warn_slowpath_common+0x7e/0x96
> > [10172.218353]  [<ffffffff8103e7f5>] warn_slowpath_null+0x15/0x17
> > [10172.218361]  [<ffffffff8106fee5>] __lock_acquire+0x168/0x164b
> > [10172.218370]  [<ffffffff81034645>] ? find_busiest_group+0x7b6/0x941
> > [10172.218381]  [<ffffffff8102a5e3>] ? double_rq_lock+0x4d/0x52
> > [10172.218389]  [<ffffffff8107197e>] lock_acquire+0x138/0x1ac
> > [10172.218397]  [<ffffffff8102a5e3>] ? double_rq_lock+0x4d/0x52
> > [10172.218404]  [<ffffffff8102a5c4>] ? double_rq_lock+0x2e/0x52
> > [10172.218414]  [<ffffffff8148fb49>] _raw_spin_lock_nested+0x3a/0x49
> > [10172.218421]  [<ffffffff8102a5e3>] ? double_rq_lock+0x4d/0x52
> > [10172.218428]  [<ffffffff8148fabe>] ? _raw_spin_lock+0x3e/0x45
> > [10172.218435]  [<ffffffff8102a5c4>] ? double_rq_lock+0x2e/0x52
> > [10172.218442]  [<ffffffff8102a5e3>] double_rq_lock+0x4d/0x52
> > [10172.218449]  [<ffffffff810349cc>] load_balance+0x1fc/0x769
> > [10172.218458]  [<ffffffff810075c5>] ? native_sched_clock+0x38/0x65
> > [10172.218466]  [<ffffffff8148ca17>] ? __schedule+0x2f5/0xa2d
> > [10172.218474]  [<ffffffff8148caf5>] __schedule+0x3d3/0xa2d
> > [10172.218480]  [<ffffffff8148ca17>] ? __schedule+0x2f5/0xa2d
> > [10172.218490]  [<ffffffff8104db06>] ? add_timer_on+0xd/0x196
> > [10172.218497]  [<ffffffff8148fc02>] ? _raw_spin_lock_irq+0x4a/0x51
> > [10172.218505]  [<ffffffff8105907b>] ? process_one_work+0x3ed/0x54c
> > [10172.218512]  [<ffffffff81059126>] ? process_one_work+0x498/0x54c
> > [10172.218518]  [<ffffffff81058e1b>] ? process_one_work+0x18d/0x54c
> > [10172.218526]  [<ffffffff814902d0>] ? _raw_spin_unlock_irq+0x28/0x56
> > [10172.218533]  [<ffffffff81033950>] ? get_parent_ip+0xe/0x3e
> > [10172.218540]  [<ffffffff8148d26e>] schedule+0x55/0x57
> > [10172.218547]  [<ffffffff8105970f>] worker_thread+0x217/0x21c
> > [10172.218554]  [<ffffffff810594f8>] ? manage_workers.isra.21+0x16c/0x16c
> > [10172.218564]  [<ffffffff8105d4de>] kthread+0x9a/0xa2
> > [10172.218573]  [<ffffffff81497984>] kernel_thread_helper+0x4/0x10
> > [10172.218580]  [<ffffffff8102d6d2>] ? finish_task_switch+0x76/0xf3
> > [10172.218587]  [<ffffffff81490778>] ? retint_restore_args+0x13/0x13
> > [10172.218595]  [<ffffffff8105d444>] ? __init_kthread_worker+0x53/0x53
> > [10172.218602]  [<ffffffff81497980>] ? gs_change+0x13/0x13
> > [10172.218607] ---[ end trace 9d11d6b5e4b96730 ]---
> > 
> 
> This is with the revert?
> 

Nope, sorry for being unclear, this is the only trace I got.

	Sergey

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: WARNING: at kernel/lockdep.c:690 __lock_acquire+0x168/0x164b()
  2011-10-21  9:14                       ` David Rientjes
  2011-10-21  9:26                         ` Sergey Senozhatsky
@ 2011-10-21  9:45                         ` Yong Zhang
  2011-11-03  7:17                           ` Sergey Senozhatsky
  1 sibling, 1 reply; 26+ messages in thread
From: Yong Zhang @ 2011-10-21  9:45 UTC (permalink / raw)
  To: David Rientjes
  Cc: Sergey Senozhatsky, Tejun Heo, Ingo Molnar, Borislav Petkov,
	Peter Zijlstra, linux-kernel, Andrew Morton

On Fri, Oct 21, 2011 at 02:14:34AM -0700, David Rientjes wrote:
> How does it mask the race condition?  Before the memset(), the ->name 
> field was never _cleared_ in lockdep_init_map() like it is now, it was 
> only stored.

A typical race condition looks like this:

	CPU A					CPU B
lock_set_subclass(lockA);
  lock_set_class(lockA);
    lockdep_init_map(lockA);
      /* lockA->name is cleared */
      memset(lockA);
					__lock_acquire(lockA);
					  /* lockA->class_cache[] is cleared */
					  register_lock_class(lockA);
					    look_up_lock_class(lockA);
					      WARN_ON_ONCE(class->name !=
							lock->name);

      lock->name = name;

An untested patch is below.
BTW, I guess the patch cures the specific issue reported
in this thread.
But it doesn't cover the case where the key changes and the relevant
lock_class already exists; I haven't figured out how to fix that yet :)
The fact is that we have no such caller yet: the only call site of
lock_set_subclass() is double_unlock_balance().
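
As a rough userspace illustration of the on-demand idea and of the case
it leaves open (a sketch, not the kernel code): struct fake_map,
fake_init_map(), fake_set_class() and windows_opened() are made-up names,
and the counter simply records each time the name is left NULL for
another CPU to see.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical, simplified stand-in for struct lockdep_map. */
struct fake_map { const void *key; const char *name; };

/* Times the name was left cleared where another CPU could observe it. */
static int null_name_windows;

/* Clear-then-store initialization, modelled on the memset() variant. */
static void fake_init_map(struct fake_map *lock, const char *name,
			  const void *key)
{
	memset(lock, 0, sizeof(*lock));
	null_name_windows++;	/* lock->name is NULL until the store below */
	lock->name = name;
	lock->key = key;
}

/* On-demand variant: re-initialize only when the key really changes. */
static void fake_set_class(struct fake_map *lock, const char *name,
			   const void *key)
{
	if (lock->key != key)
		fake_init_map(lock, name, key);
}

/* Count the windows opened by one set-class call on a live lock. */
static int windows_opened(int change_key)
{
	static int key_a, key_b;
	static const char name[] = "&rq->lock";
	struct fake_map lock = { &key_a, name };

	null_name_windows = 0;
	fake_set_class(&lock, name, change_key ? &key_b : &key_a);
	assert(lock.name == name);
	return null_name_windows;
}
```

With the same key (the subclass-only case) no window opens; with a new
key the clear-then-store window is still there.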

Thanks,
Yong

---
From: Yong Zhang <yong.zhang0@gmail.com>
Subject: [PATCH] lockdep: On-demand initialization for lock_set_class()

Since commit f59de89 [lockdep: Clear whole lockdep_map on initialization],
lockdep_init_map() clears the whole struct, which breaks
lock_set_class()/lock_set_subclass(). A typical race condition
looks like this:

	CPU A					CPU B
lock_set_subclass(lockA);
  lock_set_class(lockA);
    lockdep_init_map(lockA);
      /* lockA->name is cleared */
      memset(lockA);
					__lock_acquire(lockA);
					  /* lockA->class_cache[] is cleared */
					  register_lock_class(lockA);
					    look_up_lock_class(lockA);
					      WARN_ON_ONCE(class->name !=
							lock->name);

      lock->name = name;

Reported-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Reported-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Yong Zhang <yong.zhang0@gmail.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
---
 kernel/lockdep.c |    5 ++++-
 1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/kernel/lockdep.c b/kernel/lockdep.c
index 91d67ce..bc7dd1e 100644
--- a/kernel/lockdep.c
+++ b/kernel/lockdep.c
@@ -3160,7 +3160,10 @@ __lock_set_class(struct lockdep_map *lock, const char *name,
 	return print_unlock_inbalance_bug(curr, lock, ip);
 
 found_it:
-	lockdep_init_map(lock, name, key, 0);
+	/* only changing lock->name make no sense */
+	WARN_ON(lock->key == key && lock->name != name);
+	if (lock->key != key)
+		lockdep_init_map(lock, name, key, 0);
 	class = register_lock_class(lock, subclass, 0);
 	hlock->class_idx = class - lock_classes + 1;
 
-- 
1.7.5.4



^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: WARNING: at kernel/lockdep.c:690 __lock_acquire+0x168/0x164b()
  2011-10-21  9:45                         ` Yong Zhang
@ 2011-11-03  7:17                           ` Sergey Senozhatsky
  2011-11-03  7:27                             ` Yong Zhang
  0 siblings, 1 reply; 26+ messages in thread
From: Sergey Senozhatsky @ 2011-11-03  7:17 UTC (permalink / raw)
  To: Yong Zhang
  Cc: David Rientjes, Tejun Heo, Ingo Molnar, Borislav Petkov,
	Peter Zijlstra, linux-kernel, Andrew Morton

On (10/21/11 17:45), Yong Zhang wrote:
> On Fri, Oct 21, 2011 at 02:14:34AM -0700, David Rientjes wrote:
> > How does it mask the race condition?  Before the memset(), the ->name 
> > field was never _cleared_ in lockdep_init_map() like it is now, it was 
> > only stored.
> 
> A typical race condition looks like this:
> 
> 	CPU A					CPU B
> lock_set_subclass(lockA);
>   lock_set_class(lockA);
>     lockdep_init_map(lockA);
>       /* lockA->name is cleared */
>       memset(lockA);
> 					__lock_acquire(lockA);
> 					  /* lockA->class_cache[] is cleared */
> 					  register_lock_class(lockA);
> 					    look_up_lock_class(lockA);
> 					      WARN_ON_ONCE(class->name !=
> 							lock->name);
> 
>       lock->name = name;
> 
> An untested patch is below.
> BTW, I guess the patch cures the specific issue reported
> in this thread.
> But it doesn't cover the case where the key changes and the relevant
> lock_class already exists; I haven't figured out how to fix that yet :)
> The fact is that we have no such caller yet: the only call site of
> lock_set_subclass() is double_unlock_balance().
> 

Hello,
Any news on this patch? Do you like it or hate it? With recent kernels
I'm able to hit this problem more often (several times a day), so if any
testing is required I'm willing to help.


	Sergey

> 
> ---
> From: Yong Zhang <yong.zhang0@gmail.com>
> Subject: [PATCH] lockdep: On-demand initialization for lock_set_class()
> 
> Since commit f59de89 [lockdep: Clear whole lockdep_map on initialization],
> lockdep_init_map() clears the whole struct, which breaks
> lock_set_class()/lock_set_subclass(). A typical race condition
> looks like this:
> 
> 	CPU A					CPU B
> lock_set_subclass(lockA);
>   lock_set_class(lockA);
>     lockdep_init_map(lockA);
>       /* lockA->name is cleared */
>       memset(lockA);
> 					__lock_acquire(lockA);
> 					  /* lockA->class_cache[] is cleared */
> 					  register_lock_class(lockA);
> 					    look_up_lock_class(lockA);
> 					      WARN_ON_ONCE(class->name !=
> 							lock->name);
> 
>       lock->name = name;
> 
> Reported-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
> Reported-by: Borislav Petkov <bp@alien8.de>
> Signed-off-by: Yong Zhang <yong.zhang0@gmail.com>
> Cc: Tejun Heo <tj@kernel.org>
> Cc: David Rientjes <rientjes@google.com>
> Cc: Ingo Molnar <mingo@elte.hu>
> Cc: Peter Zijlstra <peterz@infradead.org>
> ---
>  kernel/lockdep.c |    5 ++++-
>  1 files changed, 4 insertions(+), 1 deletions(-)
> 
> diff --git a/kernel/lockdep.c b/kernel/lockdep.c
> index 91d67ce..bc7dd1e 100644
> --- a/kernel/lockdep.c
> +++ b/kernel/lockdep.c
> @@ -3160,7 +3160,10 @@ __lock_set_class(struct lockdep_map *lock, const char *name,
>  	return print_unlock_inbalance_bug(curr, lock, ip);
>  
>  found_it:
> -	lockdep_init_map(lock, name, key, 0);
> +	/* only changing lock->name make no sense */
> +	WARN_ON(lock->key == key && lock->name != name);
> +	if (lock->key != key)
> +		lockdep_init_map(lock, name, key, 0);
>  	class = register_lock_class(lock, subclass, 0);
>  	hlock->class_idx = class - lock_classes + 1;
>  
> -- 
> 1.7.5.4
> 
> 

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: WARNING: at kernel/lockdep.c:690 __lock_acquire+0x168/0x164b()
  2011-11-03  7:17                           ` Sergey Senozhatsky
@ 2011-11-03  7:27                             ` Yong Zhang
  2011-11-03  7:45                               ` Sergey Senozhatsky
  0 siblings, 1 reply; 26+ messages in thread
From: Yong Zhang @ 2011-11-03  7:27 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: David Rientjes, Tejun Heo, Ingo Molnar, Borislav Petkov,
	Peter Zijlstra, linux-kernel, Andrew Morton

On Thu, Nov 03, 2011 at 10:17:36AM +0300, Sergey Senozhatsky wrote:
> On (10/21/11 17:45), Yong Zhang wrote:
> > On Fri, Oct 21, 2011 at 02:14:34AM -0700, David Rientjes wrote:
> > > How does it mask the race condition?  Before the memset(), the ->name 
> > > field was never _cleared_ in lockdep_init_map() like it is now, it was 
> > > only stored.
> > 
> > A typical race condition looks like this:
> > 
> > 	CPU A					CPU B
> > lock_set_subclass(lockA);
> >   lock_set_class(lockA);
> >     lockdep_init_map(lockA);
> >       /* lockA->name is cleared */
> >       memset(lockA);
> > 					__lock_acquire(lockA);
> > 					  /* lockA->class_cache[] is cleared */
> > 					  register_lock_class(lockA);
> > 					    look_up_lock_class(lockA);
> > 					      WARN_ON_ONCE(class->name !=
> > 							lock->name);
> > 
> >       lock->name = name;
> > 
> > An untested patch is below.
> > BTW, I guess the patch cures the specific issue reported
> > in this thread.
> > But it doesn't cover the case where the key changes and the relevant
> > lock_class already exists; I haven't figured out how to fix that yet :)
> > The fact is that we have no such caller yet: the only call site of
> > lock_set_subclass() is double_unlock_balance().
> > 
> 
> Hello,
> Any news on this patch? Do you like it or hate it? With recent kernels
> I'm able to hit this problem more often (several times a day), so if any
> testing is required I'm willing to help.

Have you tried it? I haven't found time to polish it yet, but
I think it will address your concern.

Thanks,
Yong

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: WARNING: at kernel/lockdep.c:690 __lock_acquire+0x168/0x164b()
  2011-11-03  7:27                             ` Yong Zhang
@ 2011-11-03  7:45                               ` Sergey Senozhatsky
  2011-11-03  7:53                                 ` Yong Zhang
  0 siblings, 1 reply; 26+ messages in thread
From: Sergey Senozhatsky @ 2011-11-03  7:45 UTC (permalink / raw)
  To: Yong Zhang
  Cc: David Rientjes, Tejun Heo, Ingo Molnar, Borislav Petkov,
	Peter Zijlstra, linux-kernel, Andrew Morton

On (11/03/11 15:27), Yong Zhang wrote:
> > > A typical race condition looks like this:
> > > 
> > > 	CPU A					CPU B
> > > lock_set_subclass(lockA);
> > >   lock_set_class(lockA);
> > >     lockdep_init_map(lockA);
> > >       /* lockA->name is cleared */
> > >       memset(lockA);
> > > 					__lock_acquire(lockA);
> > > 					  /* lockA->class_cache[] is cleared */
> > > 					  register_lock_class(lockA);
> > > 					    look_up_lock_class(lockA);
> > > 					      WARN_ON_ONCE(class->name !=
> > > 							lock->name);
> > > 
> > >       lock->name = name;
> > > 
> > > An untested patch is below.
> > > BTW, I guess the patch cures the specific issue reported
> > > in this thread.
> > > But it doesn't cover the case where the key changes and the relevant
> > > lock_class already exists; I haven't figured out how to fix that yet :)
> > > The fact is that we have no such caller yet: the only call site of
> > > lock_set_subclass() is double_unlock_balance().
> > > 
> > 
> > Hello,
> > Any news on this patch? Do you like it or hate it? With recent kernels
> > I'm able to hit this problem more often (several times a day), so if any
> > testing is required I'm willing to help.
> 
> Have you tried it? I haven't found time to polish it yet, but
> I think it will address your concern.
> 

I'm compiling the kernel with your patch right now. I was only asking in
case someone has a different approach.


	Sergey

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: WARNING: at kernel/lockdep.c:690 __lock_acquire+0x168/0x164b()
  2011-11-03  7:45                               ` Sergey Senozhatsky
@ 2011-11-03  7:53                                 ` Yong Zhang
  2011-11-04  9:25                                   ` Borislav Petkov
  0 siblings, 1 reply; 26+ messages in thread
From: Yong Zhang @ 2011-11-03  7:53 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: David Rientjes, Tejun Heo, Ingo Molnar, Borislav Petkov,
	Peter Zijlstra, linux-kernel, Andrew Morton

On Thu, Nov 03, 2011 at 10:45:06AM +0300, Sergey Senozhatsky wrote:
> On (11/03/11 15:27), Yong Zhang wrote:
> > Have you tried it? I haven't found time to polish it yet, but
> > I think it will address your concern.
> > 
> 
> I'm compiling the kernel with your patch right now.
> I was only asking in
> case someone has a different approach.

Understood. If someone can come up with a simple patch which could
cover the case I mentioned before, that would be great.
/me goes to poke at it.

Thanks,
Yong

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: WARNING: at kernel/lockdep.c:690 __lock_acquire+0x168/0x164b()
  2011-11-03  7:53                                 ` Yong Zhang
@ 2011-11-04  9:25                                   ` Borislav Petkov
  2011-11-04  9:31                                     ` Sergey Senozhatsky
  2011-11-04  9:34                                     ` Yong Zhang
  0 siblings, 2 replies; 26+ messages in thread
From: Borislav Petkov @ 2011-11-04  9:25 UTC (permalink / raw)
  To: Yong Zhang
  Cc: Sergey Senozhatsky, David Rientjes, Tejun Heo, Ingo Molnar,
	Borislav Petkov, Peter Zijlstra, linux-kernel, Andrew Morton

On Thu, Nov 03, 2011 at 03:53:54PM +0800, Yong Zhang wrote:
> On Thu, Nov 03, 2011 at 10:45:06AM +0300, Sergey Senozhatsky wrote:
> > On (11/03/11 15:27), Yong Zhang wrote:
> > > Have you tried it? I haven't found time to polish it yet, but
> > > I think it will address your concern.
> > > 
> > 
> > I'm compiling the kernel with your patch right now.
> > I was only asking in
> > case someone has a different approach.
> 
> Understood. If someone can come up with a simple patch which could
> cover the case I mentioned before, that would be great.
> /me goes to poke at it.

I dunno whether this is related but I get the following on 3.1:

[ 5499.537074] INFO: trying to register non-static key.
[ 5499.537080] the code is fine but needs lockdep annotation.
[ 5499.537083] turning off the locking correctness validator.
[ 5499.537088] Pid: 0, comm: kworker/0:1 Not tainted 3.1.0 #1
[ 5499.537091] Call Trace:
[ 5499.537094]  <IRQ>  [<ffffffff8107beed>] __lock_acquire+0x165d/0x1e30
[ 5499.537109]  [<ffffffff810321fc>] ? double_rq_lock+0x2c/0x80
[ 5499.537115]  [<ffffffff8107ccd3>] lock_acquire+0x93/0x160
[ 5499.537120]  [<ffffffff810321fc>] ? double_rq_lock+0x2c/0x80
[ 5499.537126]  [<ffffffff814d9866>] _raw_spin_lock+0x36/0x50
[ 5499.537130]  [<ffffffff810321fc>] ? double_rq_lock+0x2c/0x80
[ 5499.537135]  [<ffffffff810321fc>] double_rq_lock+0x2c/0x80
[ 5499.537140]  [<ffffffff81039195>] load_balance+0x215/0x6c0
[ 5499.537146]  [<ffffffff81039640>] ? load_balance+0x6c0/0x6c0
[ 5499.537151]  [<ffffffff810396fd>] rebalance_domains+0xbd/0x1d0
[ 5499.537155]  [<ffffffff81039640>] ? load_balance+0x6c0/0x6c0
[ 5499.537161]  [<ffffffff810398ec>] run_rebalance_domains+0xdc/0x130
[ 5499.537166]  [<ffffffff81048dcd>] __do_softirq+0xbd/0x290
[ 5499.537173]  [<ffffffff814dc42c>] call_softirq+0x1c/0x30
[ 5499.537178]  [<ffffffff81003eb5>] do_softirq+0x85/0xc0
[ 5499.537183]  [<ffffffff810492ce>] irq_exit+0x9e/0xc0
[ 5499.537189]  [<ffffffff8101ca9f>] smp_call_function_single_interrupt+0x2f/0x40
[ 5499.537195]  [<ffffffff814dbeb0>] call_function_single_interrupt+0x70/0x80
[ 5499.537199]  <EOI>  [<ffffffff810096e6>] ? native_sched_clock+0x26/0x70
[ 5499.537212]  [<ffffffffa0038e1a>] ? acpi_idle_enter_simple+0xee/0x11f [processor]
[ 5499.537221]  [<ffffffffa0038e15>] ? acpi_idle_enter_simple+0xe9/0x11f [processor]
[ 5499.537227]  [<ffffffff813f8b1d>] cpuidle_idle_call+0xdd/0x350
[ 5499.537233]  [<ffffffff8100081f>] cpu_idle+0x6f/0xd0
[ 5499.537238]  [<ffffffff814cc665>] start_secondary+0x1ae/0x1b3

-- 
Regards/Gruss,
Boris.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: WARNING: at kernel/lockdep.c:690 __lock_acquire+0x168/0x164b()
  2011-11-04  9:25                                   ` Borislav Petkov
@ 2011-11-04  9:31                                     ` Sergey Senozhatsky
  2011-11-07  4:54                                       ` Yong Zhang
  2011-11-04  9:34                                     ` Yong Zhang
  1 sibling, 1 reply; 26+ messages in thread
From: Sergey Senozhatsky @ 2011-11-04  9:31 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Yong Zhang, David Rientjes, Tejun Heo, Ingo Molnar,
	Peter Zijlstra, linux-kernel, Andrew Morton

On (11/04/11 10:25), Borislav Petkov wrote:
> > Understood. If someone can come up with a simple patch which could
> > cover the case I mentioned before, that would be great.
> > /me goes to poke at it.
> 
> I dunno whether this is related but I get the following on 3.1:
>

I think this is a different problem: the check that the lockdep key is a `static' object failed.
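
To illustrate what "static" means for that check, here is a minimal user-space sketch. It is not the kernel's static_obj(); `fake_image` and `is_static_obj()` are hypothetical names, and the only assumption is that a key counts as static when its address falls inside statically allocated storage.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the statically allocated part of the image. */
static char fake_image[4096];

/*
 * Sketch of the idea behind the check: accept only addresses that lie
 * inside static storage, reject everything else (heap, stack, garbage).
 */
bool is_static_obj(const void *obj)
{
	uintptr_t addr  = (uintptr_t)obj;
	uintptr_t start = (uintptr_t)fake_image;
	uintptr_t end   = start + sizeof(fake_image);

	return addr >= start && addr < end;
}
```

With this sketch, an address inside fake_image passes, while a pointer returned by malloc() does not.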

	Sergey
 
> [ 5499.537074] INFO: trying to register non-static key.
> [ 5499.537080] the code is fine but needs lockdep annotation.
> [ 5499.537083] turning off the locking correctness validator.
> [ 5499.537088] Pid: 0, comm: kworker/0:1 Not tainted 3.1.0 #1
> [ 5499.537091] Call Trace:
> [ 5499.537094]  <IRQ>  [<ffffffff8107beed>] __lock_acquire+0x165d/0x1e30
> [ 5499.537109]  [<ffffffff810321fc>] ? double_rq_lock+0x2c/0x80
> [ 5499.537115]  [<ffffffff8107ccd3>] lock_acquire+0x93/0x160
> [ 5499.537120]  [<ffffffff810321fc>] ? double_rq_lock+0x2c/0x80
> [ 5499.537126]  [<ffffffff814d9866>] _raw_spin_lock+0x36/0x50
> [ 5499.537130]  [<ffffffff810321fc>] ? double_rq_lock+0x2c/0x80
> [ 5499.537135]  [<ffffffff810321fc>] double_rq_lock+0x2c/0x80
> [ 5499.537140]  [<ffffffff81039195>] load_balance+0x215/0x6c0
> [ 5499.537146]  [<ffffffff81039640>] ? load_balance+0x6c0/0x6c0
> [ 5499.537151]  [<ffffffff810396fd>] rebalance_domains+0xbd/0x1d0
> [ 5499.537155]  [<ffffffff81039640>] ? load_balance+0x6c0/0x6c0
> [ 5499.537161]  [<ffffffff810398ec>] run_rebalance_domains+0xdc/0x130
> [ 5499.537166]  [<ffffffff81048dcd>] __do_softirq+0xbd/0x290
> [ 5499.537173]  [<ffffffff814dc42c>] call_softirq+0x1c/0x30
> [ 5499.537178]  [<ffffffff81003eb5>] do_softirq+0x85/0xc0
> [ 5499.537183]  [<ffffffff810492ce>] irq_exit+0x9e/0xc0
> [ 5499.537189]  [<ffffffff8101ca9f>] smp_call_function_single_interrupt+0x2f/0x40
> [ 5499.537195]  [<ffffffff814dbeb0>] call_function_single_interrupt+0x70/0x80
> [ 5499.537199]  <EOI>  [<ffffffff810096e6>] ? native_sched_clock+0x26/0x70
> [ 5499.537212]  [<ffffffffa0038e1a>] ? acpi_idle_enter_simple+0xee/0x11f [processor]
> [ 5499.537221]  [<ffffffffa0038e15>] ? acpi_idle_enter_simple+0xe9/0x11f [processor]
> [ 5499.537227]  [<ffffffff813f8b1d>] cpuidle_idle_call+0xdd/0x350
> [ 5499.537233]  [<ffffffff8100081f>] cpu_idle+0x6f/0xd0
> [ 5499.537238]  [<ffffffff814cc665>] start_secondary+0x1ae/0x1b3
> 
> -- 
> Regards/Gruss,
> Boris.
> 

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: WARNING: at kernel/lockdep.c:690 __lock_acquire+0x168/0x164b()
  2011-11-04  9:25                                   ` Borislav Petkov
  2011-11-04  9:31                                     ` Sergey Senozhatsky
@ 2011-11-04  9:34                                     ` Yong Zhang
  2011-11-04  9:51                                       ` Sergey Senozhatsky
  1 sibling, 1 reply; 26+ messages in thread
From: Yong Zhang @ 2011-11-04  9:34 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Sergey Senozhatsky, David Rientjes, Tejun Heo, Ingo Molnar,
	Peter Zijlstra, linux-kernel, Andrew Morton

On Fri, Nov 04, 2011 at 10:25:20AM +0100, Borislav Petkov wrote:
> On Thu, Nov 03, 2011 at 03:53:54PM +0800, Yong Zhang wrote:
> > On Thu, Nov 03, 2011 at 10:45:06AM +0300, Sergey Senozhatsky wrote:
> > > On (11/03/11 15:27), Yong Zhang wrote:
> > > > Have you tried it? I haven't found time to polish it yet, but
> > > > I think it will address your concern.
> > > > 
> > > 
> > > I'm compiling the kernel with your patch right now.
> > > The whole point was just in
> > > case someone has a different approach.
> > 
> > Understood. If someone can come up with a simple patch which could
> > cover the case I mentioned before, that would be great.
> > /me goes to poke at it.
> 
> I dunno whether this is related but I get the following on 3.1:

Maybe, so could you try my patches just sent out?
http://marc.info/?l=linux-kernel&m=132039886826672

Thanks,
Yong

> 
> [ 5499.537074] INFO: trying to register non-static key.
> [ 5499.537080] the code is fine but needs lockdep annotation.
> [ 5499.537083] turning off the locking correctness validator.
> [ 5499.537088] Pid: 0, comm: kworker/0:1 Not tainted 3.1.0 #1
> [ 5499.537091] Call Trace:
> [ 5499.537094]  <IRQ>  [<ffffffff8107beed>] __lock_acquire+0x165d/0x1e30
> [ 5499.537109]  [<ffffffff810321fc>] ? double_rq_lock+0x2c/0x80
> [ 5499.537115]  [<ffffffff8107ccd3>] lock_acquire+0x93/0x160
> [ 5499.537120]  [<ffffffff810321fc>] ? double_rq_lock+0x2c/0x80
> [ 5499.537126]  [<ffffffff814d9866>] _raw_spin_lock+0x36/0x50
> [ 5499.537130]  [<ffffffff810321fc>] ? double_rq_lock+0x2c/0x80
> [ 5499.537135]  [<ffffffff810321fc>] double_rq_lock+0x2c/0x80
> [ 5499.537140]  [<ffffffff81039195>] load_balance+0x215/0x6c0
> [ 5499.537146]  [<ffffffff81039640>] ? load_balance+0x6c0/0x6c0
> [ 5499.537151]  [<ffffffff810396fd>] rebalance_domains+0xbd/0x1d0
> [ 5499.537155]  [<ffffffff81039640>] ? load_balance+0x6c0/0x6c0
> [ 5499.537161]  [<ffffffff810398ec>] run_rebalance_domains+0xdc/0x130
> [ 5499.537166]  [<ffffffff81048dcd>] __do_softirq+0xbd/0x290
> [ 5499.537173]  [<ffffffff814dc42c>] call_softirq+0x1c/0x30
> [ 5499.537178]  [<ffffffff81003eb5>] do_softirq+0x85/0xc0
> [ 5499.537183]  [<ffffffff810492ce>] irq_exit+0x9e/0xc0
> [ 5499.537189]  [<ffffffff8101ca9f>] smp_call_function_single_interrupt+0x2f/0x40
> [ 5499.537195]  [<ffffffff814dbeb0>] call_function_single_interrupt+0x70/0x80
> [ 5499.537199]  <EOI>  [<ffffffff810096e6>] ? native_sched_clock+0x26/0x70
> [ 5499.537212]  [<ffffffffa0038e1a>] ? acpi_idle_enter_simple+0xee/0x11f [processor]
> [ 5499.537221]  [<ffffffffa0038e15>] ? acpi_idle_enter_simple+0xe9/0x11f [processor]
> [ 5499.537227]  [<ffffffff813f8b1d>] cpuidle_idle_call+0xdd/0x350
> [ 5499.537233]  [<ffffffff8100081f>] cpu_idle+0x6f/0xd0
> [ 5499.537238]  [<ffffffff814cc665>] start_secondary+0x1ae/0x1b3
> 
> -- 
> Regards/Gruss,
> Boris.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

-- 
Only stand for myself

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: WARNING: at kernel/lockdep.c:690 __lock_acquire+0x168/0x164b()
  2011-11-04  9:34                                     ` Yong Zhang
@ 2011-11-04  9:51                                       ` Sergey Senozhatsky
  0 siblings, 0 replies; 26+ messages in thread
From: Sergey Senozhatsky @ 2011-11-04  9:51 UTC (permalink / raw)
  To: Yong Zhang
  Cc: Borislav Petkov, David Rientjes, Tejun Heo, Ingo Molnar,
	Peter Zijlstra, linux-kernel, Andrew Morton

On (11/04/11 17:34), Yong Zhang wrote:
> > > > I'm compiling the kernel with your patch right now.
> > > > The whole point was just in
> > > > case someone has a different approach.
> > > 
> > > Understood. If someone can come up with a simple patch which could
> > > cover the case I mentioned before, that would be great.
> > > /me goes to poke at it.
> > 
> > I dunno whether this is related but I get the following on 3.1:
> 
> Maybe, so could you try my patches just sent out?
> http://marc.info/?l=linux-kernel&m=132039886826672
>

Sure, I'll try your patches, later today or (most likely) over the upcoming
weekend, since I'm extremely busy at work right now.

Thanks,
	Sergey

 
> Thanks,
> Yong
> 
> > 
> > [ 5499.537074] INFO: trying to register non-static key.
> > [ 5499.537080] the code is fine but needs lockdep annotation.
> > [ 5499.537083] turning off the locking correctness validator.
> > [ 5499.537088] Pid: 0, comm: kworker/0:1 Not tainted 3.1.0 #1
> > [ 5499.537091] Call Trace:
> > [ 5499.537094]  <IRQ>  [<ffffffff8107beed>] __lock_acquire+0x165d/0x1e30
> > [ 5499.537109]  [<ffffffff810321fc>] ? double_rq_lock+0x2c/0x80
> > [ 5499.537115]  [<ffffffff8107ccd3>] lock_acquire+0x93/0x160
> > [ 5499.537120]  [<ffffffff810321fc>] ? double_rq_lock+0x2c/0x80
> > [ 5499.537126]  [<ffffffff814d9866>] _raw_spin_lock+0x36/0x50
> > [ 5499.537130]  [<ffffffff810321fc>] ? double_rq_lock+0x2c/0x80
> > [ 5499.537135]  [<ffffffff810321fc>] double_rq_lock+0x2c/0x80
> > [ 5499.537140]  [<ffffffff81039195>] load_balance+0x215/0x6c0
> > [ 5499.537146]  [<ffffffff81039640>] ? load_balance+0x6c0/0x6c0
> > [ 5499.537151]  [<ffffffff810396fd>] rebalance_domains+0xbd/0x1d0
> > [ 5499.537155]  [<ffffffff81039640>] ? load_balance+0x6c0/0x6c0
> > [ 5499.537161]  [<ffffffff810398ec>] run_rebalance_domains+0xdc/0x130
> > [ 5499.537166]  [<ffffffff81048dcd>] __do_softirq+0xbd/0x290
> > [ 5499.537173]  [<ffffffff814dc42c>] call_softirq+0x1c/0x30
> > [ 5499.537178]  [<ffffffff81003eb5>] do_softirq+0x85/0xc0
> > [ 5499.537183]  [<ffffffff810492ce>] irq_exit+0x9e/0xc0
> > [ 5499.537189]  [<ffffffff8101ca9f>] smp_call_function_single_interrupt+0x2f/0x40
> > [ 5499.537195]  [<ffffffff814dbeb0>] call_function_single_interrupt+0x70/0x80
> > [ 5499.537199]  <EOI>  [<ffffffff810096e6>] ? native_sched_clock+0x26/0x70
> > [ 5499.537212]  [<ffffffffa0038e1a>] ? acpi_idle_enter_simple+0xee/0x11f [processor]
> > [ 5499.537221]  [<ffffffffa0038e15>] ? acpi_idle_enter_simple+0xe9/0x11f [processor]
> > [ 5499.537227]  [<ffffffff813f8b1d>] cpuidle_idle_call+0xdd/0x350
> > [ 5499.537233]  [<ffffffff8100081f>] cpu_idle+0x6f/0xd0
> > [ 5499.537238]  [<ffffffff814cc665>] start_secondary+0x1ae/0x1b3
> > 
> > -- 
> > Regards/Gruss,
> > Boris.
> 
> -- 
> Only stand for myself
> 

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: WARNING: at kernel/lockdep.c:690 __lock_acquire+0x168/0x164b()
  2011-11-04  9:31                                     ` Sergey Senozhatsky
@ 2011-11-07  4:54                                       ` Yong Zhang
  2011-11-07  8:43                                         ` Sergey Senozhatsky
  0 siblings, 1 reply; 26+ messages in thread
From: Yong Zhang @ 2011-11-07  4:54 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Borislav Petkov, David Rientjes, Tejun Heo, Ingo Molnar,
	Peter Zijlstra, linux-kernel, Andrew Morton

On Fri, Nov 04, 2011 at 12:31:24PM +0300, Sergey Senozhatsky wrote:
> On (11/04/11 10:25), Borislav Petkov wrote:
> > > Understood. If someone can come up with a simple patch which could
> > > cover the case I mentioned before, that would be great.
> > > /me goes to poke at it.
> > 
> > I dunno whether this is related but I get the following on 3.1:
> >
> 
> I think this is a different problem: the check that the lockdep key is a `static' object failed.

Actually, the lockdep_init_map() call in __lock_set_class() could lead to
more problems. For example, a given rq->lock could end up with a 'key'
different from the one we gave it in sched_init(), because rq is defined
statically.

Given that, we could have another typical race:

          CPU A                                   CPU B
    lock_set_subclass(lockA);
      lock_set_class(lockA);
                                            /* lockA->class_cache[] is not set */
                                            register_lock_class(lockA);
                                              look_up_lock_class(lockA); /* returns NULL */
        lockdep_init_map(lockA);
          /* lockA->name is cleared */
          memset(lockA);
                                            if (!static_obj(lock->key))
                                              /* we get the warning here */
          lock->name = name;
    

So memset() in lockdep_init_map() is still the culprit IMHO.
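
The window in the diagram above can be replayed deterministically in user space. This is only a sketch under stated assumptions: `struct fake_map`, `reinit_wipe()`, `reinit_fill()` and `reader_sees_key()` are hypothetical names, not lockdep code, and the two CPUs are modelled as calls made in a fixed order from a single thread.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for struct lockdep_map. */
struct fake_map {
	const void *key;
	const char *name;
};

/* CPU A, first half of the re-initialisation: wipe the whole map. */
void reinit_wipe(struct fake_map *map)
{
	memset(map, 0, sizeof(*map));
}

/* CPU A, second half: put the fields back one by one. */
void reinit_fill(struct fake_map *map, const void *key, const char *name)
{
	map->name = name;
	map->key = key;
}

/* CPU B: does the map still carry the key it was initialised with? */
bool reader_sees_key(const struct fake_map *map, const void *expected_key)
{
	return map->key == expected_key;
}
```

Calling reader_sees_key() between reinit_wipe() and reinit_fill() returns false even though the map is valid both before and after the re-initialisation. That is the interval in which the reader on CPU B runs its check.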

Thanks,
Yong

> 
> 	Sergey
>  
> > [ 5499.537074] INFO: trying to register non-static key.
> > [ 5499.537080] the code is fine but needs lockdep annotation.
> > [ 5499.537083] turning off the locking correctness validator.
> > [ 5499.537088] Pid: 0, comm: kworker/0:1 Not tainted 3.1.0 #1
> > [ 5499.537091] Call Trace:
> > [ 5499.537094]  <IRQ>  [<ffffffff8107beed>] __lock_acquire+0x165d/0x1e30
> > [ 5499.537109]  [<ffffffff810321fc>] ? double_rq_lock+0x2c/0x80
> > [ 5499.537115]  [<ffffffff8107ccd3>] lock_acquire+0x93/0x160
> > [ 5499.537120]  [<ffffffff810321fc>] ? double_rq_lock+0x2c/0x80
> > [ 5499.537126]  [<ffffffff814d9866>] _raw_spin_lock+0x36/0x50
> > [ 5499.537130]  [<ffffffff810321fc>] ? double_rq_lock+0x2c/0x80
> > [ 5499.537135]  [<ffffffff810321fc>] double_rq_lock+0x2c/0x80
> > [ 5499.537140]  [<ffffffff81039195>] load_balance+0x215/0x6c0
> > [ 5499.537146]  [<ffffffff81039640>] ? load_balance+0x6c0/0x6c0
> > [ 5499.537151]  [<ffffffff810396fd>] rebalance_domains+0xbd/0x1d0
> > [ 5499.537155]  [<ffffffff81039640>] ? load_balance+0x6c0/0x6c0
> > [ 5499.537161]  [<ffffffff810398ec>] run_rebalance_domains+0xdc/0x130
> > [ 5499.537166]  [<ffffffff81048dcd>] __do_softirq+0xbd/0x290
> > [ 5499.537173]  [<ffffffff814dc42c>] call_softirq+0x1c/0x30
> > [ 5499.537178]  [<ffffffff81003eb5>] do_softirq+0x85/0xc0
> > [ 5499.537183]  [<ffffffff810492ce>] irq_exit+0x9e/0xc0
> > [ 5499.537189]  [<ffffffff8101ca9f>] smp_call_function_single_interrupt+0x2f/0x40
> > [ 5499.537195]  [<ffffffff814dbeb0>] call_function_single_interrupt+0x70/0x80
> > [ 5499.537199]  <EOI>  [<ffffffff810096e6>] ? native_sched_clock+0x26/0x70
> > [ 5499.537212]  [<ffffffffa0038e1a>] ? acpi_idle_enter_simple+0xee/0x11f [processor]
> > [ 5499.537221]  [<ffffffffa0038e15>] ? acpi_idle_enter_simple+0xe9/0x11f [processor]
> > [ 5499.537227]  [<ffffffff813f8b1d>] cpuidle_idle_call+0xdd/0x350
> > [ 5499.537233]  [<ffffffff8100081f>] cpu_idle+0x6f/0xd0
> > [ 5499.537238]  [<ffffffff814cc665>] start_secondary+0x1ae/0x1b3
> > 
> > -- 
> > Regards/Gruss,
> > Boris.
> > 

-- 
Only stand for myself

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: WARNING: at kernel/lockdep.c:690 __lock_acquire+0x168/0x164b()
  2011-11-07  4:54                                       ` Yong Zhang
@ 2011-11-07  8:43                                         ` Sergey Senozhatsky
  0 siblings, 0 replies; 26+ messages in thread
From: Sergey Senozhatsky @ 2011-11-07  8:43 UTC (permalink / raw)
  To: Yong Zhang
  Cc: Borislav Petkov, David Rientjes, Tejun Heo, Ingo Molnar,
	Peter Zijlstra, linux-kernel, Andrew Morton

On (11/07/11 12:54), Yong Zhang wrote:
> > > > Understood. If someone can come up with a simple patch which could
> > > > cover the case I mentioned before, that would be great.
> > > > /me goes to poke at it.
> > > 
> > > I dunno whether this is related but I get the following on 3.1:
> > >
> > 
> > I think this is a different problem: the check that the lockdep key is a `static' object failed.
> 
> Actually, the lockdep_init_map() call in __lock_set_class() could lead to
> more problems. For example, a given rq->lock could end up with a 'key'
> different from the one we gave it in sched_init(), because rq is defined
> statically.
> 
> Given that, we could have another typical race:
> 
>           CPU A                                   CPU B
>     lock_set_subclass(lockA);
>       lock_set_class(lockA);
>                                             /* lockA->class_cache[] is not set */
>                                             register_lock_class(lockA);
>                                               look_up_lock_class(lockA); /* returns NULL */
>         lockdep_init_map(lockA);
>           /* lockA->name is cleared */
>           memset(lockA);
>                                             if (!static_obj(lock->key))
>                                               /* we get the warning here */
>           lock->name = name;
>     
> 
> So memset() in lockdep_init_map() is still the culprit IMHO.
>

Hm, agreed, that could still be the reason.
I guess a little more information would help in some cases.

---
Print the key address when an attempt to register a non-static lock key is detected.

Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>

---

diff --git a/kernel/lockdep.c b/kernel/lockdep.c
index e69434b..de8a996 100644
--- a/kernel/lockdep.c
+++ b/kernel/lockdep.c
@@ -729,7 +729,7 @@ register_lock_class(struct lockdep_map *lock, unsigned int subclass, int force)
  	 */
 	if (!static_obj(lock->key)) {
 		debug_locks_off();
-		printk("INFO: trying to register non-static key.\n");
+		printk("INFO: trying to register non-static key at address %p.\n", lock->key);
 		printk("the code is fine but needs lockdep annotation.\n");
 		printk("turning off the locking correctness validator.\n");
 		dump_stack();


^ permalink raw reply related	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2011-11-07  8:46 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-10-15 20:12 WARNING: at kernel/lockdep.c:690 __lock_acquire+0x168/0x164b() Sergey Senozhatsky
2011-10-15 21:42 ` David Rientjes
2011-10-15 22:23   ` Borislav Petkov
2011-10-15 22:32     ` David Rientjes
2011-10-16  5:09       ` Sergey Senozhatsky
2011-10-20 18:39       ` Borislav Petkov
2011-10-20 18:53         ` Sergey Senozhatsky
2011-10-20 19:07           ` Sergey Senozhatsky
2011-10-20 21:17             ` David Rientjes
2011-10-20 21:23               ` Tejun Heo
2011-10-20 21:31                 ` David Rientjes
2011-10-20 21:36                   ` Tejun Heo
2011-10-20 23:00                     ` Sergey Senozhatsky
2011-10-21  9:14                       ` David Rientjes
2011-10-21  9:26                         ` Sergey Senozhatsky
2011-10-21  9:45                         ` Yong Zhang
2011-11-03  7:17                           ` Sergey Senozhatsky
2011-11-03  7:27                             ` Yong Zhang
2011-11-03  7:45                               ` Sergey Senozhatsky
2011-11-03  7:53                                 ` Yong Zhang
2011-11-04  9:25                                   ` Borislav Petkov
2011-11-04  9:31                                     ` Sergey Senozhatsky
2011-11-07  4:54                                       ` Yong Zhang
2011-11-07  8:43                                         ` Sergey Senozhatsky
2011-11-04  9:34                                     ` Yong Zhang
2011-11-04  9:51                                       ` Sergey Senozhatsky
