recursive locking (coredump/vfs

linux-kernel.vger.kernel.org archive mirror

* recursive locking (coredump/vfs_write)
@ 2013-11-13 21:11 Dave Jones
  2013-11-15 10:18 ` Peter Wu
  2013-11-22 21:11 ` Jan Kara
  0 siblings, 2 replies; 5+ messages in thread
From: Dave Jones @ 2013-11-13 21:11 UTC (permalink / raw)
  To: Al Viro; +Cc: Linux Kernel

Hey Al,

here's another one..


=============================================
[ INFO: possible recursive locking detected ]
3.12.0+ #2 Not tainted
---------------------------------------------
trinity-child3/13302 is trying to acquire lock:
 (sb_writers#5){.+.+.+}, at: [<ffffffff811b7013>] vfs_write+0x173/0x1f0

but task is already holding lock:
 (sb_writers#5){.+.+.+}, at: [<ffffffff8122006d>] do_coredump+0xf1d/0x1070

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(sb_writers#5);
  lock(sb_writers#5);

 *** DEADLOCK ***

 May be due to missing lock nesting notation

1 lock held by trinity-child3/13302:
 #0:  (sb_writers#5){.+.+.+}, at: [<ffffffff8122006d>] do_coredump+0xf1d/0x1070

stack backtrace:
CPU: 3 PID: 13302 Comm: trinity-child3 Not tainted 3.12.0+ #2 
 ffffffff82526e10 ffff8801b54af820 ffffffff8171b3dc ffffffff82526e10
 ffff8801b54af8e0 ffffffff810d722b 00007f93d6ce5000 0000000000000000
 ffff880154b3f200 ffff880100000000 00000000004da26d ffffffff821b3901
Call Trace:
 [<ffffffff8171b3dc>] dump_stack+0x4e/0x7a
 [<ffffffff810d722b>] __lock_acquire+0x19ab/0x19f0
 [<ffffffff81729334>] ? __do_page_fault+0x264/0x610
 [<ffffffff8100b144>] ? native_sched_clock+0x24/0x80
 [<ffffffff810d1d1f>] ? trace_hardirqs_off_caller+0x1f/0xc0
 [<ffffffff810d7a23>] lock_acquire+0x93/0x1c0
 [<ffffffff811b7013>] ? vfs_write+0x173/0x1f0
 [<ffffffff811b97f9>] __sb_start_write+0xc9/0x1a0
 [<ffffffff811b7013>] ? vfs_write+0x173/0x1f0
 [<ffffffff811b7013>] ? vfs_write+0x173/0x1f0
 [<ffffffff812cc303>] ? security_file_permission+0x23/0xa0
 [<ffffffff811b7013>] vfs_write+0x173/0x1f0
 [<ffffffff8121ef02>] dump_emit+0x92/0xd0
 [<ffffffff81218d50>] elf_core_dump+0xde0/0x1740
 [<ffffffff81218832>] ? elf_core_dump+0x8c2/0x1740
 [<ffffffff8121fdee>] do_coredump+0xc9e/0x1070
 [<ffffffff81719d9b>] ? __slab_free+0x191/0x35d
 [<ffffffff8106a9b8>] get_signal_to_deliver+0x2c8/0x930
 [<ffffffff810024b8>] do_signal+0x48/0x610
 [<ffffffff810d1e39>] ? get_lock_stats+0x19/0x60
 [<ffffffff810d25ae>] ? put_lock_stats.isra.28+0xe/0x30
 [<ffffffff81715e86>] ? pagefault_enable+0xe/0x21
 [<ffffffff8114b86e>] ? context_tracking_user_exit+0x4e/0x190
 [<ffffffff810d54c5>] ? trace_hardirqs_on_caller+0x115/0x1e0
 [<ffffffff81002adc>] do_notify_resume+0x5c/0xa0
 [<ffffffff81725f86>] retint_signal+0x46/0x90



* Re: recursive locking (coredump/vfs_write)
  2013-11-13 21:11 recursive locking (coredump/vfs_write) Dave Jones
@ 2013-11-15 10:18 ` Peter Wu
  2013-11-22 21:11 ` Jan Kara
  1 sibling, 0 replies; 5+ messages in thread
From: Peter Wu @ 2013-11-15 10:18 UTC (permalink / raw)
  To: Dave Jones; +Cc: Al Viro, Linux Kernel

Hi,

On Wednesday 13 November 2013 16:11:47 Dave Jones wrote:
> Hey Al,
> 
> here's another one..
> 
> [..]

I also saw this warning with a slightly different path.
Kernel is v3.12-7033-g42a2d92, the out-of-tree module below is unrelated
(bbswitch).

This was triggered when trying to get a coredump by running
/lib/libGL.so after manually changing kernel.core_pattern back to
"core" as systemd hijacked this setting.

libGL.so[23053]: segfault at 1 ip 0000000000000001 sp 00007fffc8829418 error 14 in mesa-libGL.so.1.2.0[7f7d9fa9c000+5a000]

=============================================
[ INFO: possible recursive locking detected ]
3.12.0-1-custom #1 Tainted: G           O
---------------------------------------------
libGL.so/23053 is trying to acquire lock:
 (sb_writers#3){.+.+.+}, at: [<ffffffff81175203>] vfs_write+0x173/0x1f0

but task is already holding lock:
 (sb_writers#3){.+.+.+}, at: [<ffffffff811d5c15>] do_coredump+0xdd5/0xf20

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(sb_writers#3);
  lock(sb_writers#3);

 *** DEADLOCK ***

 May be due to missing lock nesting notation

1 lock held by libGL.so/23053:
 #0:  (sb_writers#3){.+.+.+}, at: [<ffffffff811d5c15>] do_coredump+0xdd5/0xf20

stack backtrace:
CPU: 2 PID: 23053 Comm: libGL.so Tainted: G           O 3.12.0-1-custom #1
Hardware name: CLEVO CO.                        B7130                           /B7130                           , BIOS 6.00 08/27/2010
 ffffffff8207ac50 ffff8800ae1a1828 ffffffff815be016 ffffffff8207ac50
 ffff8800ae1a18f0 ffffffff810af393 000202d200000000 0000000000000246
 ffff88023bff4740 0000000000000001 0000000000000000 00000000003aa1d5
Call Trace:
 [<ffffffff815be016>] dump_stack+0x4d/0x66
 [<ffffffff810af393>] __lock_acquire+0x16c3/0x1a60
 [<ffffffff810adf8d>] ? __lock_acquire+0x2bd/0x1a60
 [<ffffffff810afec3>] lock_acquire+0x93/0x120
 [<ffffffff81175203>] ? vfs_write+0x173/0x1f0
 [<ffffffff811777e1>] __sb_start_write+0xc1/0x190
 [<ffffffff81175203>] ? vfs_write+0x173/0x1f0
 [<ffffffff81175203>] ? vfs_write+0x173/0x1f0
 [<ffffffff812a8ad3>] ? security_file_permission+0x23/0xa0
 [<ffffffff81175203>] vfs_write+0x173/0x1f0
 [<ffffffff811d4d07>] dump_emit+0x87/0xc0
 [<ffffffff811cd848>] elf_core_dump+0xcd8/0x14b0
 [<ffffffff811cd37e>] ? elf_core_dump+0x80e/0x14b0
 [<ffffffff810ad790>] ? mark_held_locks+0xb0/0x130
 [<ffffffff811d5a1a>] do_coredump+0xbda/0xf20
 [<ffffffff81056363>] ? __sigqueue_free.part.15+0x33/0x40
 [<ffffffff810ab1ed>] ? trace_hardirqs_off+0xd/0x10
 [<ffffffff81059cea>] get_signal_to_deliver+0x2aa/0x6a0
 [<ffffffff810023c8>] do_signal+0x48/0x960
 [<ffffffff81165fc5>] ? kmem_cache_free+0x95/0x1d0
 [<ffffffff81180a72>] ? final_putname+0x22/0x50
 [<ffffffff8116609e>] ? kmem_cache_free+0x16e/0x1d0
 [<ffffffff815c608d>] ? retint_signal+0x11/0x84
 [<ffffffff81002d45>] do_notify_resume+0x65/0x80
 [<ffffffff815c60c2>] retint_signal+0x46/0x84


* Re: recursive locking (coredump/vfs_write)
  2013-11-13 21:11 recursive locking (coredump/vfs_write) Dave Jones
  2013-11-15 10:18 ` Peter Wu
@ 2013-11-22 21:11 ` Jan Kara
  2013-11-22 23:52   ` Al Viro
  1 sibling, 1 reply; 5+ messages in thread
From: Jan Kara @ 2013-11-22 21:11 UTC (permalink / raw)
  To: Dave Jones; +Cc: Al Viro, Linux Kernel

[-- Attachment #1: Type: text/plain, Size: 2918 bytes --]

  Hi,

On Wed 13-11-13 16:11:47, Dave Jones wrote:
> here's another one..
> 
> 
> =============================================
> [ INFO: possible recursive locking detected ]
> 3.12.0+ #2 Not tainted
> ---------------------------------------------
> trinity-child3/13302 is trying to acquire lock:
>  (sb_writers#5){.+.+.+}, at: [<ffffffff811b7013>] vfs_write+0x173/0x1f0
> 
> but task is already holding lock:
>  (sb_writers#5){.+.+.+}, at: [<ffffffff8122006d>] do_coredump+0xf1d/0x1070
  Thanks for the report. The attached patch should fix this. Al, can you
please merge it?

								Honza

> 
> other info that might help us debug this:
>  Possible unsafe locking scenario:
> 
>        CPU0
>        ----
>   lock(sb_writers#5);
>   lock(sb_writers#5);
> 
>  *** DEADLOCK ***
> 
>  May be due to missing lock nesting notation
> 
> 1 lock held by trinity-child3/13302:
>  #0:  (sb_writers#5){.+.+.+}, at: [<ffffffff8122006d>] do_coredump+0xf1d/0x1070
> 
> stack backtrace:
> CPU: 3 PID: 13302 Comm: trinity-child3 Not tainted 3.12.0+ #2 
>  ffffffff82526e10 ffff8801b54af820 ffffffff8171b3dc ffffffff82526e10
>  ffff8801b54af8e0 ffffffff810d722b 00007f93d6ce5000 0000000000000000
>  ffff880154b3f200 ffff880100000000 00000000004da26d ffffffff821b3901
> Call Trace:
>  [<ffffffff8171b3dc>] dump_stack+0x4e/0x7a
>  [<ffffffff810d722b>] __lock_acquire+0x19ab/0x19f0
>  [<ffffffff81729334>] ? __do_page_fault+0x264/0x610
>  [<ffffffff8100b144>] ? native_sched_clock+0x24/0x80
>  [<ffffffff810d1d1f>] ? trace_hardirqs_off_caller+0x1f/0xc0
>  [<ffffffff810d7a23>] lock_acquire+0x93/0x1c0
>  [<ffffffff811b7013>] ? vfs_write+0x173/0x1f0
>  [<ffffffff811b97f9>] __sb_start_write+0xc9/0x1a0
>  [<ffffffff811b7013>] ? vfs_write+0x173/0x1f0
>  [<ffffffff811b7013>] ? vfs_write+0x173/0x1f0
>  [<ffffffff812cc303>] ? security_file_permission+0x23/0xa0
>  [<ffffffff811b7013>] vfs_write+0x173/0x1f0
>  [<ffffffff8121ef02>] dump_emit+0x92/0xd0
>  [<ffffffff81218d50>] elf_core_dump+0xde0/0x1740
>  [<ffffffff81218832>] ? elf_core_dump+0x8c2/0x1740
>  [<ffffffff8121fdee>] do_coredump+0xc9e/0x1070
>  [<ffffffff81719d9b>] ? __slab_free+0x191/0x35d
>  [<ffffffff8106a9b8>] get_signal_to_deliver+0x2c8/0x930
>  [<ffffffff810024b8>] do_signal+0x48/0x610
>  [<ffffffff810d1e39>] ? get_lock_stats+0x19/0x60
>  [<ffffffff810d25ae>] ? put_lock_stats.isra.28+0xe/0x30
>  [<ffffffff81715e86>] ? pagefault_enable+0xe/0x21
>  [<ffffffff8114b86e>] ? context_tracking_user_exit+0x4e/0x190
>  [<ffffffff810d54c5>] ? trace_hardirqs_on_caller+0x115/0x1e0
>  [<ffffffff81002adc>] do_notify_resume+0x5c/0xa0
>  [<ffffffff81725f86>] retint_signal+0x46/0x90
> 
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

[-- Attachment #2: 0001-coredump-Avoid-fs-freezing-deadlock-when-dumping-cor.patch --]
[-- Type: text/x-patch, Size: 1291 bytes --]

From b7d1b0a12722eb6a5cb25cc614fae26ddf652c02 Mon Sep 17 00:00:00 2001
From: Jan Kara <jack@suse.cz>
Date: Fri, 22 Nov 2013 21:59:24 +0100
Subject: [PATCH] coredump: Avoid fs freezing deadlock when dumping core

Commit 2507a4fbd48a96bc4236e584252635f8539079df (make dump_emit() use
vfs_write() instead of banging at ->f_op->write directly) introduced a
possible deadlock when dumping core while filesystem is being frozen. We
already acquired freeze protection in do_coredump() and after this patch
we also acquire it in vfs_write(). Fix the problem by removing now
unnecessary protection in do_coredump().

Fixes: 2507a4fbd48a96bc4236e584252635f8539079df
Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/coredump.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/fs/coredump.c b/fs/coredump.c
index 62406b6959b6..bdb9052744d8 100644
--- a/fs/coredump.c
+++ b/fs/coredump.c
@@ -657,11 +657,8 @@ void do_coredump(const siginfo_t *siginfo)
 		goto close_fail;
 	if (displaced)
 		put_files_struct(displaced);
-	if (!dump_interrupted()) {
-		file_start_write(cprm.file);
+	if (!dump_interrupted())
 		core_dumped = binfmt->core_dump(&cprm);
-		file_end_write(cprm.file);
-	}
 	if (ispipe && core_pipe_limit)
 		wait_for_dump_helpers(cprm.file);
 close_fail:
-- 
1.8.1.4



* Re: recursive locking (coredump/vfs_write)
  2013-11-22 21:11 ` Jan Kara
@ 2013-11-22 23:52   ` Al Viro
  2013-11-24  9:31     ` Jan Kara
  0 siblings, 1 reply; 5+ messages in thread
From: Al Viro @ 2013-11-22 23:52 UTC (permalink / raw)
  To: Jan Kara; +Cc: Dave Jones, Linux Kernel

On Fri, Nov 22, 2013 at 10:11:56PM +0100, Jan Kara wrote:
>   Hi,
> 
> On Wed 13-11-13 16:11:47, Dave Jones wrote:
> > here's another one..
> > 
> > 
> > =============================================
> > [ INFO: possible recursive locking detected ]
> > 3.12.0+ #2 Not tainted
> > ---------------------------------------------
> > trinity-child3/13302 is trying to acquire lock:
> >  (sb_writers#5){.+.+.+}, at: [<ffffffff811b7013>] vfs_write+0x173/0x1f0
> > 
> > but task is already holding lock:
> >  (sb_writers#5){.+.+.+}, at: [<ffffffff8122006d>] do_coredump+0xf1d/0x1070
>   Thanks for report. Attached patch should fix this. Al, can you please
> merge it?

No.  It's already fixed in mainline by commit 52da40.


* Re: recursive locking (coredump/vfs_write)
  2013-11-22 23:52   ` Al Viro
@ 2013-11-24  9:31     ` Jan Kara
  0 siblings, 0 replies; 5+ messages in thread
From: Jan Kara @ 2013-11-24  9:31 UTC (permalink / raw)
  To: Al Viro; +Cc: Jan Kara, Dave Jones, Linux Kernel

On Fri 22-11-13 23:52:18, Al Viro wrote:
> On Fri, Nov 22, 2013 at 10:11:56PM +0100, Jan Kara wrote:
> >   Hi,
> > 
> > On Wed 13-11-13 16:11:47, Dave Jones wrote:
> > > here's another one..
> > > 
> > > 
> > > =============================================
> > > [ INFO: possible recursive locking detected ]
> > > 3.12.0+ #2 Not tainted
> > > ---------------------------------------------
> > > trinity-child3/13302 is trying to acquire lock:
> > >  (sb_writers#5){.+.+.+}, at: [<ffffffff811b7013>] vfs_write+0x173/0x1f0
> > > 
> > > but task is already holding lock:
> > >  (sb_writers#5){.+.+.+}, at: [<ffffffff8122006d>] do_coredump+0xf1d/0x1070
> >   Thanks for report. Attached patch should fix this. Al, can you please
> > merge it?
> 
> No.  It's already fixed in mainline by commit 52da40.
  Ah, right. After pulling I can see that change. Sorry for the noise.

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

