* vfs: lockdep splat with prepare_bprm_creds
@ 2013-03-15  4:07 Sasha Levin
  2013-03-15  4:26 ` Al Viro
  0 siblings, 1 reply; 5+ messages in thread
From: Sasha Levin @ 2013-03-15  4:07 UTC (permalink / raw)
  To: Al Viro, Dave Jones, Oleg Nesterov, Andrew Morton, Eric W. Biederman
  Cc: Dave Jones, linux-kernel

Hi all,

While fuzzing with trinity inside a KVM tools guest running latest -next kernel
I've stumbled on the following.

Dave Jones reported something similar, but that seemed to involve cgroup's mutex
and didn't seem like it was the same issue as this one.


[  549.400084] ======================================================
[  549.400084] [ INFO: possible circular locking dependency detected ]
[  549.400084] 3.9.0-rc2-next-20130314-sasha-00046-g3897511 #295 Tainted: G        W
[  549.400084] -------------------------------------------------------
[  549.420729] can: request_module (can-proto-0) failed.
[  549.400084] trinity-child20/12048 is trying to acquire lock:
[  549.400084]  (&p->lock){+.+.+.}, at: [<ffffffff8129f11a>] seq_read+0x3a/0x3d0
[  549.400084]
[  549.400084] but task is already holding lock:
[  549.400084]  (&sig->cred_guard_mutex){+.+.+.}, at: [<ffffffff812813d1>] prepare_bprm_creds+0x31/0x80
[  549.400084]
[  549.400084] which lock already depends on the new lock.
[  549.400084]
[  549.400084]
[  549.400084] the existing dependency chain (in reverse order) is:
[  549.400084]
-> #1 (&sig->cred_guard_mutex){+.+.+.}:
[  549.400084]        [<ffffffff8117994a>] check_prevs_add+0xba/0x1a0
[  549.400084]        [<ffffffff8117a100>] validate_chain.isra.21+0x6d0/0x800
[  549.400084]        [<ffffffff8117d523>] __lock_acquire+0xa23/0xb10
[  549.400084]        [<ffffffff8117df3a>] lock_acquire+0x1ca/0x270
[  549.400084]        [<ffffffff83d8786a>] __mutex_lock_common+0x5a/0x5a0
[  549.400084]        [<ffffffff83d87e3f>] mutex_lock_killable_nested+0x3f/0x50
[  549.400084]        [<ffffffff812e8858>] lock_trace+0x28/0x70
[  549.400084]        [<ffffffff812e8a45>] proc_pid_stack+0x65/0xf0
[  549.400084]        [<ffffffff812e9dfa>] proc_single_show+0x5a/0xa0
[  549.400084]        [<ffffffff8129f28f>] seq_read+0x1af/0x3d0
[  549.400084]        [<ffffffff8127a22b>] do_loop_readv_writev+0x4b/0x90
[  549.400084]        [<ffffffff8127a4a6>] do_readv_writev+0xf6/0x1d0
[  549.400084]        [<ffffffff8127a61e>] vfs_readv+0x3e/0x60
[  549.400084]        [<ffffffff8127a690>] SyS_readv+0x50/0xd0
[  549.400084]        [<ffffffff83d946d8>] tracesys+0xe1/0xe6
[  549.400084]
-> #0 (&p->lock){+.+.+.}:
[  549.400084]        [<ffffffff811792c5>] check_prev_add+0x145/0x710
[  549.400084]        [<ffffffff8117994a>] check_prevs_add+0xba/0x1a0
[  549.400084]        [<ffffffff8117a100>] validate_chain.isra.21+0x6d0/0x800
[  549.400084]        [<ffffffff8117d523>] __lock_acquire+0xa23/0xb10
[  549.400084]        [<ffffffff8117df3a>] lock_acquire+0x1ca/0x270
[  549.400084]        [<ffffffff83d8786a>] __mutex_lock_common+0x5a/0x5a0
[  549.400084]        [<ffffffff83d87edf>] mutex_lock_nested+0x3f/0x50
[  549.400084]        [<ffffffff8129f11a>] seq_read+0x3a/0x3d0
[  549.400084]        [<ffffffff812e7b61>] proc_reg_read+0x201/0x230
[  549.400084]        [<ffffffff81279e05>] vfs_read+0xb5/0x180
[  549.400084]        [<ffffffff8127fc61>] kernel_read+0x41/0x60
[  549.400084]        [<ffffffff8127fe0d>] prepare_binprm+0x18d/0x1b0
[  549.400084]        [<ffffffff81281646>] do_execve_common.isra.21+0x1b6/0x380
[  549.400084]        [<ffffffff81281823>] do_execve+0x13/0x20
[  549.400084]        [<ffffffff81281b0e>] SyS_execve+0x3e/0x60
[  549.400084]        [<ffffffff83d94ae9>] stub_execve+0x69/0xa0
[  549.400084]
[  549.400084] other info that might help us debug this:
[  549.400084]
[  549.400084]  Possible unsafe locking scenario:
[  549.400084]
[  549.400084]        CPU0                    CPU1
[  549.400084]        ----                    ----
[  549.400084]   lock(&sig->cred_guard_mutex);
[  549.400084]                                lock(&p->lock);
[  549.400084]                                lock(&sig->cred_guard_mutex);
[  549.400084]   lock(&p->lock);
[  549.400084]
[  549.400084]  *** DEADLOCK ***
[  549.400084]
[  549.400084] 1 lock held by trinity-child20/12048:
[  549.400084]  #0:  (&sig->cred_guard_mutex){+.+.+.}, at: [<ffffffff812813d1>] prepare_bprm_creds+0x31/0x80
[  549.400084]
[  549.400084] stack backtrace:
[  549.400084] Pid: 12048, comm: trinity-child20 Tainted: G        W    3.9.0-rc2-next-20130314-sasha-00046-g3897511 #295
[  549.400084] Call Trace:
[  549.400084]  [<ffffffff83cfd02e>] print_circular_bug+0xd3/0xe4
[  549.400084]  [<ffffffff811792c5>] check_prev_add+0x145/0x710
[  549.400084]  [<ffffffff8117994a>] check_prevs_add+0xba/0x1a0
[  549.400084]  [<ffffffff8106ffa5>] ? sched_clock+0x15/0x20
[  549.400084]  [<ffffffff8117a100>] validate_chain.isra.21+0x6d0/0x800
[  549.400084]  [<ffffffff8117d523>] __lock_acquire+0xa23/0xb10
[  549.400084]  [<ffffffff8109b1c8>] ? kvm_clock_read+0x38/0x70
[  549.400084]  [<ffffffff811776ae>] ? lock_release_holdtime+0x12e/0x140
[  549.400084]  [<ffffffff8106ffa5>] ? sched_clock+0x15/0x20
[  549.400084]  [<ffffffff8114d2c5>] ? sched_clock_local+0x25/0x90
[  549.400084]  [<ffffffff81259396>] ? deactivate_slab+0x7d6/0x820
[  549.400084]  [<ffffffff8117df3a>] lock_acquire+0x1ca/0x270
[  549.400084]  [<ffffffff8129f11a>] ? seq_read+0x3a/0x3d0
[  549.400084]  [<ffffffff8114d2c5>] ? sched_clock_local+0x25/0x90
[  549.400084]  [<ffffffff8129f0e0>] ? seq_lseek+0x110/0x110
[  549.400084]  [<ffffffff83d8786a>] __mutex_lock_common+0x5a/0x5a0
[  549.400084]  [<ffffffff8129f11a>] ? seq_read+0x3a/0x3d0
[  549.400084]  [<ffffffff8117be32>] ? __lock_is_held+0x52/0x80
[  549.400084]  [<ffffffff8129f11a>] ? seq_read+0x3a/0x3d0
[  549.400084]  [<ffffffff8129f0e0>] ? seq_lseek+0x110/0x110
[  549.400084]  [<ffffffff83d87edf>] mutex_lock_nested+0x3f/0x50
[  549.400084]  [<ffffffff8129f11a>] seq_read+0x3a/0x3d0
[  549.400084]  [<ffffffff819f19cd>] ? delay_tsc+0xdd/0x110
[  549.400084]  [<ffffffff8129f0e0>] ? seq_lseek+0x110/0x110
[  549.400084]  [<ffffffff8129f0e0>] ? seq_lseek+0x110/0x110
[  549.400084]  [<ffffffff812e7b61>] proc_reg_read+0x201/0x230
[  549.400084]  [<ffffffff812e7960>] ? proc_reg_write+0x230/0x230
[  549.400084]  [<ffffffff81279e05>] vfs_read+0xb5/0x180
[  549.400084]  [<ffffffff8127fc61>] kernel_read+0x41/0x60
[  549.400084]  [<ffffffff8127fe0d>] prepare_binprm+0x18d/0x1b0
[  549.400084]  [<ffffffff81281646>] do_execve_common.isra.21+0x1b6/0x380
[  549.400084]  [<ffffffff81281823>] do_execve+0x13/0x20
[  549.400084]  [<ffffffff81281b0e>] SyS_execve+0x3e/0x60
[  549.400084]  [<ffffffff83d94ae9>] stub_execve+0x69/0xa0


Thanks,
Sasha


* Re: vfs: lockdep splat with prepare_bprm_creds
  2013-03-15  4:07 vfs: lockdep splat with prepare_bprm_creds Sasha Levin
@ 2013-03-15  4:26 ` Al Viro
  2013-03-15 18:19   ` Oleg Nesterov
  0 siblings, 1 reply; 5+ messages in thread
From: Al Viro @ 2013-03-15  4:26 UTC (permalink / raw)
  To: Sasha Levin
  Cc: Dave Jones, Oleg Nesterov, Andrew Morton, Eric W. Biederman,
	linux-kernel

On Fri, Mar 15, 2013 at 12:07:14AM -0400, Sasha Levin wrote:
> Hi all,
> 
> While fuzzing with trinity inside a KVM tools guest running latest -next kernel
> I've stumbled on the following.
> 
> Dave Jones reported something similar, but that seemed to involve cgroup's mutex
> and didn't seem like it was the same issue as this one.

Lovely...  It's an execve() attempt on a "binary" that is, in fact, a procfs
file (/proc/<pid>/stack), with its ->read() trying to grab ->cred_guard_mutex.
The fact that it's seq_file-based is irrelevant here - all that matters is
that we have ->read() for some file trying to grab ->cred_guard_mutex.

It's not *quite* a deadlock, though - all these guys are using
mutex_lock_killable()...
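
For reference, lock_trace() in fs/proc/base.c around this kernel looks
roughly like the sketch below (reconstructed from the 3.9-era source, so
details may differ); the killable acquisition of ->cred_guard_mutex is what
keeps a stuck task killable rather than hard-deadlocked:

	static int lock_trace(struct task_struct *task)
	{
		int err = mutex_lock_killable(&task->signal->cred_guard_mutex);
		if (err)
			return err;
		if (!ptrace_may_access(task, PTRACE_MODE_ATTACH)) {
			mutex_unlock(&task->signal->cred_guard_mutex);
			return -EPERM;
		}
		return 0;
	}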


* Re: vfs: lockdep splat with prepare_bprm_creds
  2013-03-15  4:26 ` Al Viro
@ 2013-03-15 18:19   ` Oleg Nesterov
  2013-03-16 19:41     ` Al Viro
  0 siblings, 1 reply; 5+ messages in thread
From: Oleg Nesterov @ 2013-03-15 18:19 UTC (permalink / raw)
  To: Al Viro
  Cc: Sasha Levin, Dave Jones, Andrew Morton, Eric W. Biederman, linux-kernel

On 03/15, Al Viro wrote:
>
> On Fri, Mar 15, 2013 at 12:07:14AM -0400, Sasha Levin wrote:
> > Hi all,
> >
> > While fuzzing with trinity inside a KVM tools guest running latest -next kernel
> > I've stumbled on the following.
> >
> > Dave Jones reported something similar, but that seemed to involve cgroup's mutex
> > and didn't seem like it was the same issue as this one.
>
> Lovely...  It's an execve() attempt on a "binary" that is, in fact, a procfs
> file (/proc/<pid>/stack),

probably... other lock_trace() callers can't generate this lockdep output
afaics.

> with its ->read() trying to grab ->cred_guard_mutex.
> The fact that it's seq_file-based is irrelevant here - all that matters is
> that we have ->read() for some file trying to grab ->cred_guard_mutex.

Yes, perhaps the patch below makes sense anyway as a cleanup, but obviously
it can't help.

Cough... I am shy to disclose my ignorance, but could you explain how
open_exec()->do_filp_open(MAY_EXEC) can succeed in this case? At least
acl_permission_check() looks as if open_exec() should fail...
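
(For illustration only -- a hypothetical userspace check, not something that
was actually run for this report: since /proc/<pid>/stack carries no exec
bits, execve() on it is expected to fail the MAY_EXEC permission check with
EACCES.)

	/* Hypothetical demo: execve() of a procfs file with no exec bit set. */
	#include <errno.h>
	#include <stdio.h>
	#include <string.h>
	#include <unistd.h>

	int main(void)
	{
		char *argv[] = { "/proc/self/stack", NULL };
		char *envp[] = { NULL };

		/* execve() only returns on failure; EACCES is the expected errno. */
		execve(argv[0], argv, envp);
		printf("execve: %s\n", strerror(errno));
		return 1;
	}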

Just curious, thanks in advance.

Oleg.

--- x/fs/proc/base.c
+++ x/fs/proc/base.c
@@ -317,12 +317,12 @@ static int proc_pid_stack(struct seq_fil
 	err = lock_trace(task);
 	if (!err) {
 		save_stack_trace_tsk(task, &trace);
+		unlock_trace(task);
 
 		for (i = 0; i < trace.nr_entries; i++) {
 			seq_printf(m, "[<%pK>] %pS\n",
 				   (void *)entries[i], (void *)entries[i]);
 		}
-		unlock_trace(task);
 	}
 	kfree(entries);
 



* Re: vfs: lockdep splat with prepare_bprm_creds
  2013-03-15 18:19   ` Oleg Nesterov
@ 2013-03-16 19:41     ` Al Viro
  2013-03-17 17:07       ` Oleg Nesterov
  0 siblings, 1 reply; 5+ messages in thread
From: Al Viro @ 2013-03-16 19:41 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Sasha Levin, Dave Jones, Andrew Morton, Eric W. Biederman, linux-kernel

On Fri, Mar 15, 2013 at 07:19:56PM +0100, Oleg Nesterov wrote:
> On 03/15, Al Viro wrote:
> >
> > On Fri, Mar 15, 2013 at 12:07:14AM -0400, Sasha Levin wrote:
> > > Hi all,
> > >
> > > While fuzzing with trinity inside a KVM tools guest running latest -next kernel
> > > I've stumbled on the following.
> > >
> > > Dave Jones reported something similar, but that seemed to involve cgroup's mutex
> > > and didn't seem like it was the same issue as this one.
> >
> > Lovely...  It's an execve() attempt on a "binary" that is, in fact, a procfs
> > file (/proc/<pid>/stack),

> Cough... I am shy to disclose my ignorance, but could you explain how
> open_exec()->do_filp_open(MAY_EXEC) can succeed in this case? At least
> acl_permission_check() looks as if open_exec() should fail...

Umm... point.  It might be a false positive, actually - some other
seq_file-based sucker (while chmod +x /proc/self/stack will fail,
chmod +x /proc/vmstat won't) that could be fed to execve(), leading to
	1) kernel_read() from execve() can grab m.lock for *some* seq_file m,
while holding ->cred_guard_mutex
	2) read() on /proc/self/stack tries to grab ->cred_guard_mutex,
while holding m.lock for a different seq_file m
... with lockdep having no idea that there's a reason why (1) and (2) can't
have the same seq_file involved, said reason being that all files with ->read()
trying to grab ->cred_guard_mutex don't have exec bit set *and* are impossible
to chmod.
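
For completeness, the two chains lockdep is comparing map to the splat above
roughly as follows (schematic only; every seq_file's ->lock is initialized by
the same mutex_init() in seq_open(), so they all land in the one "&p->lock"
class and lockdep cannot tell m1 from m2):

	execve() on some +x seq_file (m1):     read() of /proc/<pid>/stack (m2):
	  prepare_bprm_creds()                   seq_read()
	    lock(&sig->cred_guard_mutex)           lock(&m2->lock)
	  prepare_binprm() -> kernel_read()      proc_pid_stack() -> lock_trace()
	    seq_read()                              lock(&sig->cred_guard_mutex)
	      lock(&m1->lock)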


* Re: vfs: lockdep splat with prepare_bprm_creds
  2013-03-16 19:41     ` Al Viro
@ 2013-03-17 17:07       ` Oleg Nesterov
  0 siblings, 0 replies; 5+ messages in thread
From: Oleg Nesterov @ 2013-03-17 17:07 UTC (permalink / raw)
  To: Al Viro
  Cc: Sasha Levin, Dave Jones, Andrew Morton, Eric W. Biederman, linux-kernel

On 03/16, Al Viro wrote:
>
> On Fri, Mar 15, 2013 at 07:19:56PM +0100, Oleg Nesterov wrote:
>
> > Cough... I am shy to disclose my ignorance, but could you explain how
> > open_exec()->do_filp_open(MAY_EXEC) can succeed in this case? At least
> > acl_permission_check() looks as if open_exec() should fail...
>
> Umm... point.  It might be a false positive, actually - some other
> seq_file-based sucker (while chmod +x /proc/self/stack will fail,
> chmod +x /proc/vmstat won't) that could be fed to execve(), leading to
> 	1) kernel_read() from execve() can grab m.lock for *some* seq_file m,
> while holding ->cred_guard_mutex

Yes, thanks.

I am wondering if lock_trace() is really useful...

Let's ignore proc_pid_syscall() and proc_pid_personality(). Suppose we
change proc_pid_stack()

	int proc_pid_stack(...)
	{
		...

		save_stack_trace_tsk(task, &trace);
		if (!ptrace_may_access(task, PTRACE_MODE_ATTACH))
			return -EPERM;

		for (i = 0; i < trace.nr_entries; i++)
			seq_printf(...);

		return 0;
	}

Sure, without cred_guard_mutex we can race with install_exec_creds(). But
is it a problem in practice? In any case, lock_trace() can't protect against
commit_creds()...

We can even do

		task_lock(task);
		err = __ptrace_may_access(task, mode);
		if (!err)
			save_stack_trace_tsk(...);
		task_unlock(task);

This way task_lock() protects us against exec_mmap(). And even if exec_mmap()
was already called and the task is going to do install_exec_creds(), we can't
show the stack of this process "after" exec.

Oleg.


