[PATCH v6 01/16] exec: Only compute current once in flush_old

linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH v6 01/16] exec: Only compute current once in flush_old_exec
       [not found] <077b63b7-6f5e-aa8e-bf96-a586b481cc46@hotmail.de>
@ 2020-03-20 20:24 ` Bernd Edlinger
  2020-03-20 20:24 ` [PATCH v6 02/16] exec: Factor unshare_sighand out of de_thread and call it separately Bernd Edlinger
                   ` (14 subsequent siblings)
  15 siblings, 0 replies; 23+ messages in thread
From: Bernd Edlinger @ 2020-03-20 20:24 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Kirill Tkhai, Eric W. Biederman,
	Christian Brauner, Kees Cook, Jann Horn, Jonathan Corbet,
	Alexander Viro, Andrew Morton, Alexey Dobriyan, Thomas Gleixner,
	Oleg Nesterov, Frederic Weisbecker, Andrei Vagin, Ingo Molnar,
	Peter Zijlstra (Intel),
	Yuyang Du, David Hildenbrand, Sebastian Andrzej Siewior,
	Anshuman Khandual, David Howells, James Morris, Shakeel Butt,
	Jason Gunthorpe, Christian Kellner, Andrea Arcangeli,
	Aleksa Sarai, Dmitry V. Levin, linux-doc, linux-kernel,
	linux-fsdevel, linux-mm, stable, linux-api

Make it clear that current only needs to be computed once in
flush_old_exec.  This may have some efficiency improvements and it
makes the code easier to change.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Reviewed-by: Bernd Edlinger <bernd.edlinger@hotmail.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Reviewed-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 fs/exec.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/fs/exec.c b/fs/exec.c
index db17be5..c3f3479 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1260,13 +1260,14 @@ void __set_task_comm(struct task_struct *tsk, const char *buf, bool exec)
  */
 int flush_old_exec(struct linux_binprm * bprm)
 {
+	struct task_struct *me = current;
 	int retval;
 
 	/*
 	 * Make sure we have a private signal table and that
 	 * we are unassociated from the previous thread group.
 	 */
-	retval = de_thread(current);
+	retval = de_thread(me);
 	if (retval)
 		goto out;
 
@@ -1294,10 +1295,10 @@ int flush_old_exec(struct linux_binprm * bprm)
 	bprm->mm = NULL;
 
 	set_fs(USER_DS);
-	current->flags &= ~(PF_RANDOMIZE | PF_FORKNOEXEC | PF_KTHREAD |
+	me->flags &= ~(PF_RANDOMIZE | PF_FORKNOEXEC | PF_KTHREAD |
 					PF_NOFREEZE | PF_NO_SETAFFINITY);
 	flush_thread();
-	current->personality &= ~bprm->per_clear;
+	me->personality &= ~bprm->per_clear;
 
 	/*
 	 * We have to apply CLOEXEC before we change whether the process is
@@ -1305,7 +1306,7 @@ int flush_old_exec(struct linux_binprm * bprm)
 	 * trying to access the should-be-closed file descriptors of a process
 	 * undergoing exec(2).
 	 */
-	do_close_on_exec(current->files);
+	do_close_on_exec(me->files);
 	return 0;
 
 out:
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v6 02/16] exec: Factor unshare_sighand out of de_thread and call it separately
       [not found] <077b63b7-6f5e-aa8e-bf96-a586b481cc46@hotmail.de>
  2020-03-20 20:24 ` [PATCH v6 01/16] exec: Only compute current once in flush_old_exec Bernd Edlinger
@ 2020-03-20 20:24 ` Bernd Edlinger
  2020-03-20 20:25 ` [PATCH v6 03/16] exec: Move cleanup of posix timers on exec out of de_thread Bernd Edlinger
                   ` (13 subsequent siblings)
  15 siblings, 0 replies; 23+ messages in thread
From: Bernd Edlinger @ 2020-03-20 20:24 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Kirill Tkhai, Eric W. Biederman,
	Christian Brauner, Kees Cook, Jann Horn, Jonathan Corbet,
	Alexander Viro, Andrew Morton, Alexey Dobriyan, Thomas Gleixner,
	Oleg Nesterov, Frederic Weisbecker, Andrei Vagin, Ingo Molnar,
	Peter Zijlstra (Intel),
	Yuyang Du, David Hildenbrand, Sebastian Andrzej Siewior,
	Anshuman Khandual, David Howells, James Morris, Shakeel Butt,
	Jason Gunthorpe, Christian Kellner, Andrea Arcangeli,
	Aleksa Sarai, Dmitry V. Levin, linux-doc, linux-kernel,
	linux-fsdevel, linux-mm, stable, linux-api

This makes the code clearer and makes it easier to implement a mutex
that is not taken over any locations that may block indefinitely waiting
for userspace.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Reviewed-by: Bernd Edlinger <bernd.edlinger@hotmail.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Reviewed-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 fs/exec.c | 39 ++++++++++++++++++++++++++-------------
 1 file changed, 26 insertions(+), 13 deletions(-)

diff --git a/fs/exec.c b/fs/exec.c
index c3f3479..ff74b9a 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1194,6 +1194,23 @@ static int de_thread(struct task_struct *tsk)
 	flush_itimer_signals();
 #endif
 
+	BUG_ON(!thread_group_leader(tsk));
+	return 0;
+
+killed:
+	/* protects against exit_notify() and __exit_signal() */
+	read_lock(&tasklist_lock);
+	sig->group_exit_task = NULL;
+	sig->notify_count = 0;
+	read_unlock(&tasklist_lock);
+	return -EAGAIN;
+}
+
+
+static int unshare_sighand(struct task_struct *me)
+{
+	struct sighand_struct *oldsighand = me->sighand;
+
 	if (refcount_read(&oldsighand->count) != 1) {
 		struct sighand_struct *newsighand;
 		/*
@@ -1210,23 +1227,13 @@ static int de_thread(struct task_struct *tsk)
 
 		write_lock_irq(&tasklist_lock);
 		spin_lock(&oldsighand->siglock);
-		rcu_assign_pointer(tsk->sighand, newsighand);
+		rcu_assign_pointer(me->sighand, newsighand);
 		spin_unlock(&oldsighand->siglock);
 		write_unlock_irq(&tasklist_lock);
 
 		__cleanup_sighand(oldsighand);
 	}
-
-	BUG_ON(!thread_group_leader(tsk));
 	return 0;
-
-killed:
-	/* protects against exit_notify() and __exit_signal() */
-	read_lock(&tasklist_lock);
-	sig->group_exit_task = NULL;
-	sig->notify_count = 0;
-	read_unlock(&tasklist_lock);
-	return -EAGAIN;
 }
 
 char *__get_task_comm(char *buf, size_t buf_size, struct task_struct *tsk)
@@ -1264,14 +1271,20 @@ int flush_old_exec(struct linux_binprm * bprm)
 	int retval;
 
 	/*
-	 * Make sure we have a private signal table and that
-	 * we are unassociated from the previous thread group.
+	 * Make this the only thread in the thread group.
 	 */
 	retval = de_thread(me);
 	if (retval)
 		goto out;
 
 	/*
+	 * Make the signal table private.
+	 */
+	retval = unshare_sighand(me);
+	if (retval)
+		goto out;
+
+	/*
 	 * Must be called _before_ exec_mmap() as bprm->mm is
 	 * not visibile until then. This also enables the update
 	 * to be lockless.
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v6 03/16] exec: Move cleanup of posix timers on exec out of de_thread
       [not found] <077b63b7-6f5e-aa8e-bf96-a586b481cc46@hotmail.de>
  2020-03-20 20:24 ` [PATCH v6 01/16] exec: Only compute current once in flush_old_exec Bernd Edlinger
  2020-03-20 20:24 ` [PATCH v6 02/16] exec: Factor unshare_sighand out of de_thread and call it separately Bernd Edlinger
@ 2020-03-20 20:25 ` Bernd Edlinger
  2020-03-20 20:25 ` [PATCH v6 04/16] exec: Move exec_mmap right after de_thread in flush_old_exec Bernd Edlinger
                   ` (12 subsequent siblings)
  15 siblings, 0 replies; 23+ messages in thread
From: Bernd Edlinger @ 2020-03-20 20:25 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Kirill Tkhai, Eric W. Biederman,
	Christian Brauner, Kees Cook, Jann Horn, Jonathan Corbet,
	Alexander Viro, Andrew Morton, Alexey Dobriyan, Thomas Gleixner,
	Oleg Nesterov, Frederic Weisbecker, Andrei Vagin, Ingo Molnar,
	Peter Zijlstra (Intel),
	Yuyang Du, David Hildenbrand, Sebastian Andrzej Siewior,
	Anshuman Khandual, David Howells, James Morris, Shakeel Butt,
	Jason Gunthorpe, Christian Kellner, Andrea Arcangeli,
	Aleksa Sarai, Dmitry V. Levin, linux-doc, linux-kernel,
	linux-fsdevel, linux-mm, stable, linux-api

These functions have very little to do with de_thread move them out
of de_thread an into flush_old_exec proper so it can be more clearly
seen what flush_old_exec is doing.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Reviewed-by: Bernd Edlinger <bernd.edlinger@hotmail.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Reviewed-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 fs/exec.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/fs/exec.c b/fs/exec.c
index ff74b9a..215d86f7 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1189,11 +1189,6 @@ static int de_thread(struct task_struct *tsk)
 	/* we have changed execution domain */
 	tsk->exit_signal = SIGCHLD;
 
-#ifdef CONFIG_POSIX_TIMERS
-	exit_itimers(sig);
-	flush_itimer_signals();
-#endif
-
 	BUG_ON(!thread_group_leader(tsk));
 	return 0;
 
@@ -1277,6 +1272,11 @@ int flush_old_exec(struct linux_binprm * bprm)
 	if (retval)
 		goto out;
 
+#ifdef CONFIG_POSIX_TIMERS
+	exit_itimers(me->signal);
+	flush_itimer_signals();
+#endif
+
 	/*
 	 * Make the signal table private.
 	 */
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v6 04/16] exec: Move exec_mmap right after de_thread in flush_old_exec
       [not found] <077b63b7-6f5e-aa8e-bf96-a586b481cc46@hotmail.de>
                   ` (2 preceding siblings ...)
  2020-03-20 20:25 ` [PATCH v6 03/16] exec: Move cleanup of posix timers on exec out of de_thread Bernd Edlinger
@ 2020-03-20 20:25 ` Bernd Edlinger
  2020-03-20 20:25 ` [PATCH v6 05/16] exec: Add exec_update_mutex to replace cred_guard_mutex Bernd Edlinger
                   ` (11 subsequent siblings)
  15 siblings, 0 replies; 23+ messages in thread
From: Bernd Edlinger @ 2020-03-20 20:25 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Kirill Tkhai, Eric W. Biederman,
	Christian Brauner, Kees Cook, Jann Horn, Jonathan Corbet,
	Alexander Viro, Andrew Morton, Alexey Dobriyan, Thomas Gleixner,
	Oleg Nesterov, Frederic Weisbecker, Andrei Vagin, Ingo Molnar,
	Peter Zijlstra (Intel),
	Yuyang Du, David Hildenbrand, Sebastian Andrzej Siewior,
	Anshuman Khandual, David Howells, James Morris, Shakeel Butt,
	Jason Gunthorpe, Christian Kellner, Andrea Arcangeli,
	Aleksa Sarai, Dmitry V. Levin, linux-doc, linux-kernel,
	linux-fsdevel, linux-mm, stable, linux-api

I have read through the code in exec_mmap and I do not see anything
that depends on sighand or the sighand lock, or on signals in anyway
so this should be safe.

This rearrangement of code has two significant benefits.  It makes
the determination of passing the point of no return by testing bprm->mm
accurate.  All failures prior to that point in flush_old_exec are
either truly recoverable or they are fatal.

Further this consolidates all of the possible indefinite waits for
userspace together at the top of flush_old_exec.  The possible wait
for a ptracer on PTRACE_EVENT_EXIT, the possible wait for a page fault
to be resolved in clear_child_tid, and the possible wait for a page
fault in exit_robust_list.

This consolidation allows the creation of a mutex to replace
cred_guard_mutex that is not held over possible indefinite userspace
waits.  Which will allow removing deadlock scenarios from the kernel.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Reviewed-by: Bernd Edlinger <bernd.edlinger@hotmail.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 fs/exec.c | 24 ++++++++++++------------
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/fs/exec.c b/fs/exec.c
index 215d86f7..d820a72 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1272,18 +1272,6 @@ int flush_old_exec(struct linux_binprm * bprm)
 	if (retval)
 		goto out;
 
-#ifdef CONFIG_POSIX_TIMERS
-	exit_itimers(me->signal);
-	flush_itimer_signals();
-#endif
-
-	/*
-	 * Make the signal table private.
-	 */
-	retval = unshare_sighand(me);
-	if (retval)
-		goto out;
-
 	/*
 	 * Must be called _before_ exec_mmap() as bprm->mm is
 	 * not visibile until then. This also enables the update
@@ -1307,6 +1295,18 @@ int flush_old_exec(struct linux_binprm * bprm)
 	 */
 	bprm->mm = NULL;
 
+#ifdef CONFIG_POSIX_TIMERS
+	exit_itimers(me->signal);
+	flush_itimer_signals();
+#endif
+
+	/*
+	 * Make the signal table private.
+	 */
+	retval = unshare_sighand(me);
+	if (retval)
+		goto out;
+
 	set_fs(USER_DS);
 	me->flags &= ~(PF_RANDOMIZE | PF_FORKNOEXEC | PF_KTHREAD |
 					PF_NOFREEZE | PF_NO_SETAFFINITY);
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v6 05/16] exec: Add exec_update_mutex to replace cred_guard_mutex
       [not found] <077b63b7-6f5e-aa8e-bf96-a586b481cc46@hotmail.de>
                   ` (3 preceding siblings ...)
  2020-03-20 20:25 ` [PATCH v6 04/16] exec: Move exec_mmap right after de_thread in flush_old_exec Bernd Edlinger
@ 2020-03-20 20:25 ` Bernd Edlinger
  2020-03-23 10:51   ` Kirill Tkhai
  2020-03-20 20:26 ` [PATCH v6 06/16] exec: Fix a deadlock in strace Bernd Edlinger
                   ` (10 subsequent siblings)
  15 siblings, 1 reply; 23+ messages in thread
From: Bernd Edlinger @ 2020-03-20 20:25 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Kirill Tkhai, Eric W. Biederman,
	Christian Brauner, Kees Cook, Jann Horn, Jonathan Corbet,
	Alexander Viro, Andrew Morton, Alexey Dobriyan, Thomas Gleixner,
	Oleg Nesterov, Frederic Weisbecker, Andrei Vagin, Ingo Molnar,
	Peter Zijlstra (Intel),
	Yuyang Du, David Hildenbrand, Sebastian Andrzej Siewior,
	Anshuman Khandual, David Howells, James Morris, Shakeel Butt,
	Jason Gunthorpe, Christian Kellner, Andrea Arcangeli,
	Aleksa Sarai, Dmitry V. Levin, linux-doc, linux-kernel,
	linux-fsdevel, linux-mm, stable, linux-api

The cred_guard_mutex is problematic as it is held over possibly
indefinite waits for userspace.  The possible indefinite waits for
userspace that I have identified are: The cred_guard_mutex is held in
PTRACE_EVENT_EXIT waiting for the tracer.  The cred_guard_mutex is
held over "put_user(0, tsk->clear_child_tid)" in exit_mm().  The
cred_guard_mutex is held over "get_user(futex_offset, ...")  in
exit_robust_list.  The cred_guard_mutex held over copy_strings.

The functions get_user and put_user can trigger a page fault which can
potentially wait indefinitely in the case of userfaultfd or if
userspace implements part of the page fault path.

In any of those cases the userspace process that the kernel is waiting
for might make a different system call that winds up taking the
cred_guard_mutex and result in deadlock.

Holding a mutex over any of those possibly indefinite waits for
userspace does not appear necessary.  Add exec_update_mutex that will
just cover updating the process during exec where the permissions and
the objects pointed to by the task struct may be out of sync.

The plan is to switch the users of cred_guard_mutex to
exec_update_mutex one by one.  This lets us move forward while still
being careful and not introducing any regressions.

Link: https://lore.kernel.org/lkml/20160921152946.GA24210@dhcp22.suse.cz/
Link: https://lore.kernel.org/lkml/AM6PR03MB5170B06F3A2B75EFB98D071AE4E60@AM6PR03MB5170.eurprd03.prod.outlook.com/
Link: https://lore.kernel.org/linux-fsdevel/20161102181806.GB1112@redhat.com/
Link: https://lore.kernel.org/lkml/20160923095031.GA14923@redhat.com/
Link: https://lore.kernel.org/lkml/20170213141452.GA30203@redhat.com/
Ref: 45c1a159b85b ("Add PTRACE_O_TRACEVFORKDONE and PTRACE_O_TRACEEXIT facilities.")
Ref: 456f17cd1a28 ("[PATCH] user-vm-unlock-2.5.31-A2")
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Bernd Edlinger <bernd.edlinger@hotmail.de>
---
 fs/exec.c                    | 22 +++++++++++++++++++---
 include/linux/binfmts.h      |  8 +++++++-
 include/linux/sched/signal.h |  9 ++++++++-
 init/init_task.c             |  1 +
 kernel/fork.c                |  1 +
 5 files changed, 36 insertions(+), 5 deletions(-)

diff --git a/fs/exec.c b/fs/exec.c
index d820a72..0e46ec5 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1010,16 +1010,26 @@ ssize_t read_code(struct file *file, unsigned long addr, loff_t pos, size_t len)
 }
 EXPORT_SYMBOL(read_code);
 
+/*
+ * Maps the mm_struct mm into the current task struct.
+ * On success, this function returns with the mutex
+ * exec_update_mutex locked.
+ */
 static int exec_mmap(struct mm_struct *mm)
 {
 	struct task_struct *tsk;
 	struct mm_struct *old_mm, *active_mm;
+	int ret;
 
 	/* Notify parent that we're no longer interested in the old VM */
 	tsk = current;
 	old_mm = current->mm;
 	exec_mm_release(tsk, old_mm);
 
+	ret = mutex_lock_killable(&tsk->signal->exec_update_mutex);
+	if (ret)
+		return ret;
+
 	if (old_mm) {
 		sync_mm_rss(old_mm);
 		/*
@@ -1031,9 +1041,11 @@ static int exec_mmap(struct mm_struct *mm)
 		down_read(&old_mm->mmap_sem);
 		if (unlikely(old_mm->core_state)) {
 			up_read(&old_mm->mmap_sem);
+			mutex_unlock(&tsk->signal->exec_update_mutex);
 			return -EINTR;
 		}
 	}
+
 	task_lock(tsk);
 	active_mm = tsk->active_mm;
 	membarrier_exec_mmap(mm);
@@ -1288,11 +1300,12 @@ int flush_old_exec(struct linux_binprm * bprm)
 		goto out;
 
 	/*
-	 * After clearing bprm->mm (to mark that current is using the
-	 * prepared mm now), we have nothing left of the original
+	 * After setting bprm->called_exec_mmap (to mark that current is
+	 * using the prepared mm now), we have nothing left of the original
 	 * process. If anything from here on returns an error, the check
 	 * in search_binary_handler() will SEGV current.
 	 */
+	bprm->called_exec_mmap = 1;
 	bprm->mm = NULL;
 
 #ifdef CONFIG_POSIX_TIMERS
@@ -1438,6 +1451,8 @@ static void free_bprm(struct linux_binprm *bprm)
 {
 	free_arg_pages(bprm);
 	if (bprm->cred) {
+		if (bprm->called_exec_mmap)
+			mutex_unlock(&current->signal->exec_update_mutex);
 		mutex_unlock(&current->signal->cred_guard_mutex);
 		abort_creds(bprm->cred);
 	}
@@ -1487,6 +1502,7 @@ void install_exec_creds(struct linux_binprm *bprm)
 	 * credentials; any time after this it may be unlocked.
 	 */
 	security_bprm_committed_creds(bprm);
+	mutex_unlock(&current->signal->exec_update_mutex);
 	mutex_unlock(&current->signal->cred_guard_mutex);
 }
 EXPORT_SYMBOL(install_exec_creds);
@@ -1678,7 +1694,7 @@ int search_binary_handler(struct linux_binprm *bprm)
 
 		read_lock(&binfmt_lock);
 		put_binfmt(fmt);
-		if (retval < 0 && !bprm->mm) {
+		if (retval < 0 && bprm->called_exec_mmap) {
 			/* we got to flush_old_exec() and failed after it */
 			read_unlock(&binfmt_lock);
 			force_sigsegv(SIGSEGV);
diff --git a/include/linux/binfmts.h b/include/linux/binfmts.h
index b40fc63..a345d9f 100644
--- a/include/linux/binfmts.h
+++ b/include/linux/binfmts.h
@@ -44,7 +44,13 @@ struct linux_binprm {
 		 * exec has happened. Used to sanitize execution environment
 		 * and to set AT_SECURE auxv for glibc.
 		 */
-		secureexec:1;
+		secureexec:1,
+		/*
+		 * Set by flush_old_exec, when exec_mmap has been called.
+		 * This is past the point of no return, when the
+		 * exec_update_mutex has been taken.
+		 */
+		called_exec_mmap:1;
 #ifdef __alpha__
 	unsigned int taso:1;
 #endif
diff --git a/include/linux/sched/signal.h b/include/linux/sched/signal.h
index 8805025..a29df79 100644
--- a/include/linux/sched/signal.h
+++ b/include/linux/sched/signal.h
@@ -224,7 +224,14 @@ struct signal_struct {
 
 	struct mutex cred_guard_mutex;	/* guard against foreign influences on
 					 * credential calculations
-					 * (notably. ptrace) */
+					 * (notably. ptrace)
+					 * Deprecated do not use in new code.
+					 * Use exec_update_mutex instead.
+					 */
+	struct mutex exec_update_mutex;	/* Held while task_struct is being
+					 * updated during exec, and may have
+					 * inconsistent permissions.
+					 */
 } __randomize_layout;
 
 /*
diff --git a/init/init_task.c b/init/init_task.c
index 9e5cbe5..bd403ed 100644
--- a/init/init_task.c
+++ b/init/init_task.c
@@ -26,6 +26,7 @@
 	.multiprocess	= HLIST_HEAD_INIT,
 	.rlim		= INIT_RLIMITS,
 	.cred_guard_mutex = __MUTEX_INITIALIZER(init_signals.cred_guard_mutex),
+	.exec_update_mutex = __MUTEX_INITIALIZER(init_signals.exec_update_mutex),
 #ifdef CONFIG_POSIX_TIMERS
 	.posix_timers = LIST_HEAD_INIT(init_signals.posix_timers),
 	.cputimer	= {
diff --git a/kernel/fork.c b/kernel/fork.c
index 8642530..036b692 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1594,6 +1594,7 @@ static int copy_signal(unsigned long clone_flags, struct task_struct *tsk)
 	sig->oom_score_adj_min = current->signal->oom_score_adj_min;
 
 	mutex_init(&sig->cred_guard_mutex);
+	mutex_init(&sig->exec_update_mutex);
 
 	return 0;
 }
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v6 06/16] exec: Fix a deadlock in strace
       [not found] <077b63b7-6f5e-aa8e-bf96-a586b481cc46@hotmail.de>
                   ` (4 preceding siblings ...)
  2020-03-20 20:25 ` [PATCH v6 05/16] exec: Add exec_update_mutex to replace cred_guard_mutex Bernd Edlinger
@ 2020-03-20 20:26 ` Bernd Edlinger
  2020-03-20 20:26 ` [PATCH v6 07/16] selftests/ptrace: add test cases for dead-locks Bernd Edlinger
                   ` (9 subsequent siblings)
  15 siblings, 0 replies; 23+ messages in thread
From: Bernd Edlinger @ 2020-03-20 20:26 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Kirill Tkhai, Eric W. Biederman,
	Christian Brauner, Kees Cook, Jann Horn, Jonathan Corbet,
	Alexander Viro, Andrew Morton, Alexey Dobriyan, Thomas Gleixner,
	Oleg Nesterov, Frederic Weisbecker, Andrei Vagin, Ingo Molnar,
	Peter Zijlstra (Intel),
	Yuyang Du, David Hildenbrand, Sebastian Andrzej Siewior,
	Anshuman Khandual, David Howells, James Morris, Shakeel Butt,
	Jason Gunthorpe, Christian Kellner, Andrea Arcangeli,
	Aleksa Sarai, Dmitry V. Levin, linux-doc, linux-kernel,
	linux-fsdevel, linux-mm, stable, linux-api

This fixes a deadlock in the tracer when tracing a multi-threaded
application that calls execve while more than one thread are running.

I observed that when running strace on the gcc test suite, it always
blocks after a while, when expect calls execve, because other threads
have to be terminated.  They send ptrace events, but the strace is no
longer able to respond, since it is blocked in vm_access.

The deadlock is always happening when strace needs to access the
tracees process mmap, while another thread in the tracee starts to
execve a child process, but that cannot continue until the
PTRACE_EVENT_EXIT is handled and the WIFEXITED event is received:

strace          D    0 30614  30584 0x00000000
Call Trace:
__schedule+0x3ce/0x6e0
schedule+0x5c/0xd0
schedule_preempt_disabled+0x15/0x20
__mutex_lock.isra.13+0x1ec/0x520
__mutex_lock_killable_slowpath+0x13/0x20
mutex_lock_killable+0x28/0x30
mm_access+0x27/0xa0
process_vm_rw_core.isra.3+0xff/0x550
process_vm_rw+0xdd/0xf0
__x64_sys_process_vm_readv+0x31/0x40
do_syscall_64+0x64/0x220
entry_SYSCALL_64_after_hwframe+0x44/0xa9

expect          D    0 31933  30876 0x80004003
Call Trace:
__schedule+0x3ce/0x6e0
schedule+0x5c/0xd0
flush_old_exec+0xc4/0x770
load_elf_binary+0x35a/0x16c0
search_binary_handler+0x97/0x1d0
__do_execve_file.isra.40+0x5d4/0x8a0
__x64_sys_execve+0x49/0x60
do_syscall_64+0x64/0x220
entry_SYSCALL_64_after_hwframe+0x44/0xa9

This changes mm_access to use the new exec_update_mutex
instead of cred_guard_mutex.

This patch is based on the following patch by Eric W. Biederman:
"[PATCH 0/5] Infrastructure to allow fixing exec deadlocks"
Link: https://lore.kernel.org/lkml/87v9ne5y4y.fsf_-_@x220.int.ebiederm.org/

Signed-off-by: Bernd Edlinger <bernd.edlinger@hotmail.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
---
 kernel/fork.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/fork.c b/kernel/fork.c
index 036b692..e23ccac 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1224,7 +1224,7 @@ struct mm_struct *mm_access(struct task_struct *task, unsigned int mode)
 	struct mm_struct *mm;
 	int err;
 
-	err =  mutex_lock_killable(&task->signal->cred_guard_mutex);
+	err =  mutex_lock_killable(&task->signal->exec_update_mutex);
 	if (err)
 		return ERR_PTR(err);
 
@@ -1234,7 +1234,7 @@ struct mm_struct *mm_access(struct task_struct *task, unsigned int mode)
 		mmput(mm);
 		mm = ERR_PTR(-EACCES);
 	}
-	mutex_unlock(&task->signal->cred_guard_mutex);
+	mutex_unlock(&task->signal->exec_update_mutex);
 
 	return mm;
 }
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v6 07/16] selftests/ptrace: add test cases for dead-locks
       [not found] <077b63b7-6f5e-aa8e-bf96-a586b481cc46@hotmail.de>
                   ` (5 preceding siblings ...)
  2020-03-20 20:26 ` [PATCH v6 06/16] exec: Fix a deadlock in strace Bernd Edlinger
@ 2020-03-20 20:26 ` Bernd Edlinger
  2020-03-20 20:26 ` [PATCH v6 08/16] mm: docs: Fix a comment in process_vm_rw_core Bernd Edlinger
                   ` (8 subsequent siblings)
  15 siblings, 0 replies; 23+ messages in thread
From: Bernd Edlinger @ 2020-03-20 20:26 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Kirill Tkhai, Eric W. Biederman,
	Christian Brauner, Kees Cook, Jann Horn, Jonathan Corbet,
	Alexander Viro, Andrew Morton, Alexey Dobriyan, Thomas Gleixner,
	Oleg Nesterov, Frederic Weisbecker, Andrei Vagin, Ingo Molnar,
	Peter Zijlstra (Intel),
	Yuyang Du, David Hildenbrand, Sebastian Andrzej Siewior,
	Anshuman Khandual, David Howells, James Morris, Shakeel Butt,
	Jason Gunthorpe, Christian Kellner, Andrea Arcangeli,
	Aleksa Sarai, Dmitry V. Levin, linux-doc, linux-kernel,
	linux-fsdevel, linux-mm, stable, linux-api

This adds test cases for ptrace deadlocks.

Additionally fixes a compile problem in get_syscall_info.c,
observed with gcc-4.8.4:

get_syscall_info.c: In function 'get_syscall_info':
get_syscall_info.c:93:3: error: 'for' loop initial declarations are only
                                 allowed in C99 mode
   for (unsigned int i = 0; i < ARRAY_SIZE(args); ++i) {
   ^
get_syscall_info.c:93:3: note: use option -std=c99 or -std=gnu99 to compile
                               your code

Signed-off-by: Bernd Edlinger <bernd.edlinger@hotmail.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
---
 tools/testing/selftests/ptrace/Makefile   |  4 +-
 tools/testing/selftests/ptrace/vmaccess.c | 86 +++++++++++++++++++++++++++++++
 2 files changed, 88 insertions(+), 2 deletions(-)
 create mode 100644 tools/testing/selftests/ptrace/vmaccess.c

diff --git a/tools/testing/selftests/ptrace/Makefile b/tools/testing/selftests/ptrace/Makefile
index c0b7f89..2f1f532 100644
--- a/tools/testing/selftests/ptrace/Makefile
+++ b/tools/testing/selftests/ptrace/Makefile
@@ -1,6 +1,6 @@
 # SPDX-License-Identifier: GPL-2.0-only
-CFLAGS += -iquote../../../../include/uapi -Wall
+CFLAGS += -std=c99 -pthread -iquote../../../../include/uapi -Wall
 
-TEST_GEN_PROGS := get_syscall_info peeksiginfo
+TEST_GEN_PROGS := get_syscall_info peeksiginfo vmaccess
 
 include ../lib.mk
diff --git a/tools/testing/selftests/ptrace/vmaccess.c b/tools/testing/selftests/ptrace/vmaccess.c
new file mode 100644
index 0000000..4db327b
--- /dev/null
+++ b/tools/testing/selftests/ptrace/vmaccess.c
@@ -0,0 +1,86 @@
+// SPDX-License-Identifier: GPL-2.0+
+/*
+ * Copyright (c) 2020 Bernd Edlinger <bernd.edlinger@hotmail.de>
+ * All rights reserved.
+ *
+ * Check whether /proc/$pid/mem can be accessed without causing deadlocks
+ * when de_thread is blocked with ->cred_guard_mutex held.
+ */
+
+#include "../kselftest_harness.h"
+#include <stdio.h>
+#include <fcntl.h>
+#include <pthread.h>
+#include <signal.h>
+#include <unistd.h>
+#include <sys/ptrace.h>
+
+static void *thread(void *arg)
+{
+	ptrace(PTRACE_TRACEME, 0, 0L, 0L);
+	return NULL;
+}
+
+TEST(vmaccess)
+{
+	int f, pid = fork();
+	char mm[64];
+
+	if (!pid) {
+		pthread_t pt;
+
+		pthread_create(&pt, NULL, thread, NULL);
+		pthread_join(pt, NULL);
+		execlp("true", "true", NULL);
+	}
+
+	sleep(1);
+	sprintf(mm, "/proc/%d/mem", pid);
+	f = open(mm, O_RDONLY);
+	ASSERT_GE(f, 0);
+	close(f);
+	f = kill(pid, SIGCONT);
+	ASSERT_EQ(f, 0);
+}
+
+TEST(attach)
+{
+	int s, k, pid = fork();
+
+	if (!pid) {
+		pthread_t pt;
+
+		pthread_create(&pt, NULL, thread, NULL);
+		pthread_join(pt, NULL);
+		execlp("sleep", "sleep", "2", NULL);
+	}
+
+	sleep(1);
+	k = ptrace(PTRACE_ATTACH, pid, 0L, 0L);
+	ASSERT_EQ(errno, EAGAIN);
+	ASSERT_EQ(k, -1);
+	k = waitpid(-1, &s, WNOHANG);
+	ASSERT_NE(k, -1);
+	ASSERT_NE(k, 0);
+	ASSERT_NE(k, pid);
+	ASSERT_EQ(WIFEXITED(s), 1);
+	ASSERT_EQ(WEXITSTATUS(s), 0);
+	sleep(1);
+	k = ptrace(PTRACE_ATTACH, pid, 0L, 0L);
+	ASSERT_EQ(k, 0);
+	k = waitpid(-1, &s, 0);
+	ASSERT_EQ(k, pid);
+	ASSERT_EQ(WIFSTOPPED(s), 1);
+	ASSERT_EQ(WSTOPSIG(s), SIGSTOP);
+	k = ptrace(PTRACE_DETACH, pid, 0L, 0L);
+	ASSERT_EQ(k, 0);
+	k = waitpid(-1, &s, 0);
+	ASSERT_EQ(k, pid);
+	ASSERT_EQ(WIFEXITED(s), 1);
+	ASSERT_EQ(WEXITSTATUS(s), 0);
+	k = waitpid(-1, NULL, 0);
+	ASSERT_EQ(k, -1);
+	ASSERT_EQ(errno, ECHILD);
+}
+
+TEST_HARNESS_MAIN
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v6 08/16] mm: docs: Fix a comment in process_vm_rw_core
       [not found] <077b63b7-6f5e-aa8e-bf96-a586b481cc46@hotmail.de>
                   ` (6 preceding siblings ...)
  2020-03-20 20:26 ` [PATCH v6 07/16] selftests/ptrace: add test cases for dead-locks Bernd Edlinger
@ 2020-03-20 20:26 ` Bernd Edlinger
  2020-03-20 20:26 ` [PATCH v6 09/16] kernel: doc: remove outdated comment cred.c Bernd Edlinger
                   ` (7 subsequent siblings)
  15 siblings, 0 replies; 23+ messages in thread
From: Bernd Edlinger @ 2020-03-20 20:26 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Kirill Tkhai, Eric W. Biederman,
	Christian Brauner, Kees Cook, Jann Horn, Jonathan Corbet,
	Alexander Viro, Andrew Morton, Alexey Dobriyan, Thomas Gleixner,
	Oleg Nesterov, Frederic Weisbecker, Andrei Vagin, Ingo Molnar,
	Peter Zijlstra (Intel),
	Yuyang Du, David Hildenbrand, Sebastian Andrzej Siewior,
	Anshuman Khandual, David Howells, James Morris, Shakeel Butt,
	Jason Gunthorpe, Christian Kellner, Andrea Arcangeli,
	Aleksa Sarai, Dmitry V. Levin, linux-doc, linux-kernel,
	linux-fsdevel, linux-mm, stable, linux-api

This removes a duplicate "a" in the comment in process_vm_rw_core.

Signed-off-by: Bernd Edlinger <bernd.edlinger@hotmail.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
---
 mm/process_vm_access.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/process_vm_access.c b/mm/process_vm_access.c
index de41e83..74e957e 100644
--- a/mm/process_vm_access.c
+++ b/mm/process_vm_access.c
@@ -206,7 +206,7 @@ static ssize_t process_vm_rw_core(pid_t pid, struct iov_iter *iter,
 	if (!mm || IS_ERR(mm)) {
 		rc = IS_ERR(mm) ? PTR_ERR(mm) : -ESRCH;
 		/*
-		 * Explicitly map EACCES to EPERM as EPERM is a more a
+		 * Explicitly map EACCES to EPERM as EPERM is a more
 		 * appropriate error code for process_vw_readv/writev
 		 */
 		if (rc == -EACCES)
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v6 09/16] kernel: doc: remove outdated comment cred.c
       [not found] <077b63b7-6f5e-aa8e-bf96-a586b481cc46@hotmail.de>
                   ` (7 preceding siblings ...)
  2020-03-20 20:26 ` [PATCH v6 08/16] mm: docs: Fix a comment in process_vm_rw_core Bernd Edlinger
@ 2020-03-20 20:26 ` Bernd Edlinger
  2020-03-20 20:27 ` [PATCH v6 10/16] kernel/kcmp.c: Use new infrastructure to fix deadlocks in execve Bernd Edlinger
                   ` (6 subsequent siblings)
  15 siblings, 0 replies; 23+ messages in thread
From: Bernd Edlinger @ 2020-03-20 20:26 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Kirill Tkhai, Eric W. Biederman,
	Christian Brauner, Kees Cook, Jann Horn, Jonathan Corbet,
	Alexander Viro, Andrew Morton, Alexey Dobriyan, Thomas Gleixner,
	Oleg Nesterov, Frederic Weisbecker, Andrei Vagin, Ingo Molnar,
	Peter Zijlstra (Intel),
	Yuyang Du, David Hildenbrand, Sebastian Andrzej Siewior,
	Anshuman Khandual, David Howells, James Morris, Shakeel Butt,
	Jason Gunthorpe, Christian Kellner, Andrea Arcangeli,
	Aleksa Sarai, Dmitry V. Levin, linux-doc, linux-kernel,
	linux-fsdevel, linux-mm, stable, linux-api

This removes an outdated comment in prepare_kernel_cred.

There is no "cred_replace_mutex" any more, so the comment must
go away.

Signed-off-by: Bernd Edlinger <bernd.edlinger@hotmail.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
---
 kernel/cred.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/kernel/cred.c b/kernel/cred.c
index 809a985..71a7926 100644
--- a/kernel/cred.c
+++ b/kernel/cred.c
@@ -675,8 +675,6 @@ void __init cred_init(void)
  * The caller may change these controls afterwards if desired.
  *
  * Returns the new credentials or NULL if out of memory.
- *
- * Does not take, and does not return holding current->cred_replace_mutex.
  */
 struct cred *prepare_kernel_cred(struct task_struct *daemon)
 {
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v6 10/16] kernel/kcmp.c: Use new infrastructure to fix deadlocks in execve
       [not found] <077b63b7-6f5e-aa8e-bf96-a586b481cc46@hotmail.de>
                   ` (8 preceding siblings ...)
  2020-03-20 20:26 ` [PATCH v6 09/16] kernel: doc: remove outdated comment cred.c Bernd Edlinger
@ 2020-03-20 20:27 ` Bernd Edlinger
  2020-03-25 15:41   ` Christian Brauner
  2020-03-20 20:27 ` [PATCH v6 11/16] proc: " Bernd Edlinger
                   ` (5 subsequent siblings)
  15 siblings, 1 reply; 23+ messages in thread
From: Bernd Edlinger @ 2020-03-20 20:27 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Kirill Tkhai, Eric W. Biederman,
	Christian Brauner, Kees Cook, Jann Horn, Jonathan Corbet,
	Alexander Viro, Andrew Morton, Alexey Dobriyan, Thomas Gleixner,
	Oleg Nesterov, Frederic Weisbecker, Andrei Vagin, Ingo Molnar,
	Peter Zijlstra (Intel),
	Yuyang Du, David Hildenbrand, Sebastian Andrzej Siewior,
	Anshuman Khandual, David Howells, James Morris, Shakeel Butt,
	Jason Gunthorpe, Christian Kellner, Andrea Arcangeli,
	Aleksa Sarai, Dmitry V. Levin, linux-doc, linux-kernel,
	linux-fsdevel, linux-mm, stable, linux-api

This changes kcmp_epoll_target to use the new exec_update_mutex
instead of cred_guard_mutex.

This should be safe, as the credentials are only used for reading,
and furthermore ->mm and ->sighand are updated on execve,
but only under the new exec_update_mutex.

Signed-off-by: Bernd Edlinger <bernd.edlinger@hotmail.de>
---
 kernel/kcmp.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/kernel/kcmp.c b/kernel/kcmp.c
index a0e3d7a..b3ff928 100644
--- a/kernel/kcmp.c
+++ b/kernel/kcmp.c
@@ -173,8 +173,8 @@ static int kcmp_epoll_target(struct task_struct *task1,
 	/*
 	 * One should have enough rights to inspect task details.
 	 */
-	ret = kcmp_lock(&task1->signal->cred_guard_mutex,
-			&task2->signal->cred_guard_mutex);
+	ret = kcmp_lock(&task1->signal->exec_update_mutex,
+			&task2->signal->exec_update_mutex);
 	if (ret)
 		goto err;
 	if (!ptrace_may_access(task1, PTRACE_MODE_READ_REALCREDS) ||
@@ -229,8 +229,8 @@ static int kcmp_epoll_target(struct task_struct *task1,
 	}
 
 err_unlock:
-	kcmp_unlock(&task1->signal->cred_guard_mutex,
-		    &task2->signal->cred_guard_mutex);
+	kcmp_unlock(&task1->signal->exec_update_mutex,
+		    &task2->signal->exec_update_mutex);
 err:
 	put_task_struct(task1);
 	put_task_struct(task2);
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v6 11/16] proc: Use new infrastructure to fix deadlocks in execve
       [not found] <077b63b7-6f5e-aa8e-bf96-a586b481cc46@hotmail.de>
                   ` (9 preceding siblings ...)
  2020-03-20 20:27 ` [PATCH v6 10/16] kernel/kcmp.c: Use new infrastructure to fix deadlocks in execve Bernd Edlinger
@ 2020-03-20 20:27 ` Bernd Edlinger
  2020-03-20 20:27 ` [PATCH v6 12/16] proc: io_accounting: " Bernd Edlinger
                   ` (4 subsequent siblings)
  15 siblings, 0 replies; 23+ messages in thread
From: Bernd Edlinger @ 2020-03-20 20:27 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Kirill Tkhai, Eric W. Biederman,
	Christian Brauner, Kees Cook, Jann Horn, Jonathan Corbet,
	Alexander Viro, Andrew Morton, Alexey Dobriyan, Thomas Gleixner,
	Oleg Nesterov, Frederic Weisbecker, Andrei Vagin, Ingo Molnar,
	Peter Zijlstra (Intel),
	Yuyang Du, David Hildenbrand, Sebastian Andrzej Siewior,
	Anshuman Khandual, David Howells, James Morris, Shakeel Butt,
	Jason Gunthorpe, Christian Kellner, Andrea Arcangeli,
	Aleksa Sarai, Dmitry V. Levin, linux-doc, linux-kernel,
	linux-fsdevel, linux-mm, stable, linux-api

This changes lock_trace to use the new exec_update_mutex
instead of cred_guard_mutex.

This fixes possible deadlocks when the trace is accessing
/proc/$pid/stack for instance.

This should be safe, as the credentials are only used for reading,
and task->mm is updated on execve under the new exec_update_mutex.

Signed-off-by: Bernd Edlinger <bernd.edlinger@hotmail.de>
---
 fs/proc/base.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/fs/proc/base.c b/fs/proc/base.c
index c7c6427..fed76abf 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -405,11 +405,11 @@ static int proc_pid_wchan(struct seq_file *m, struct pid_namespace *ns,
 
 static int lock_trace(struct task_struct *task)
 {
-	int err = mutex_lock_killable(&task->signal->cred_guard_mutex);
+	int err = mutex_lock_killable(&task->signal->exec_update_mutex);
 	if (err)
 		return err;
 	if (!ptrace_may_access(task, PTRACE_MODE_ATTACH_FSCREDS)) {
-		mutex_unlock(&task->signal->cred_guard_mutex);
+		mutex_unlock(&task->signal->exec_update_mutex);
 		return -EPERM;
 	}
 	return 0;
@@ -417,7 +417,7 @@ static int lock_trace(struct task_struct *task)
 
 static void unlock_trace(struct task_struct *task)
 {
-	mutex_unlock(&task->signal->cred_guard_mutex);
+	mutex_unlock(&task->signal->exec_update_mutex);
 }
 
 #ifdef CONFIG_STACKTRACE
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v6 12/16] proc: io_accounting: Use new infrastructure to fix deadlocks in execve
       [not found] <077b63b7-6f5e-aa8e-bf96-a586b481cc46@hotmail.de>
                   ` (10 preceding siblings ...)
  2020-03-20 20:27 ` [PATCH v6 11/16] proc: " Bernd Edlinger
@ 2020-03-20 20:27 ` Bernd Edlinger
  2020-03-20 20:27 ` [PATCH v6 13/16] perf: " Bernd Edlinger
                   ` (3 subsequent siblings)
  15 siblings, 0 replies; 23+ messages in thread
From: Bernd Edlinger @ 2020-03-20 20:27 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Kirill Tkhai, Eric W. Biederman,
	Christian Brauner, Kees Cook, Jann Horn, Jonathan Corbet,
	Alexander Viro, Andrew Morton, Alexey Dobriyan, Thomas Gleixner,
	Oleg Nesterov, Frederic Weisbecker, Andrei Vagin, Ingo Molnar,
	Peter Zijlstra (Intel),
	Yuyang Du, David Hildenbrand, Sebastian Andrzej Siewior,
	Anshuman Khandual, David Howells, James Morris, Shakeel Butt,
	Jason Gunthorpe, Christian Kellner, Andrea Arcangeli,
	Aleksa Sarai, Dmitry V. Levin, linux-doc, linux-kernel,
	linux-fsdevel, linux-mm, stable, linux-api

This changes do_io_accounting to use the new exec_update_mutex
instead of cred_guard_mutex.

This fixes possible deadlocks when the trace is accessing
/proc/$pid/io for instance.

This should be safe, as the credentials are only used for reading.

Signed-off-by: Bernd Edlinger <bernd.edlinger@hotmail.de>
---
 fs/proc/base.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/proc/base.c b/fs/proc/base.c
index fed76abf..6b13fc4 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -2861,7 +2861,7 @@ static int do_io_accounting(struct task_struct *task, struct seq_file *m, int wh
 	unsigned long flags;
 	int result;
 
-	result = mutex_lock_killable(&task->signal->cred_guard_mutex);
+	result = mutex_lock_killable(&task->signal->exec_update_mutex);
 	if (result)
 		return result;
 
@@ -2897,7 +2897,7 @@ static int do_io_accounting(struct task_struct *task, struct seq_file *m, int wh
 	result = 0;
 
 out_unlock:
-	mutex_unlock(&task->signal->cred_guard_mutex);
+	mutex_unlock(&task->signal->exec_update_mutex);
 	return result;
 }
 
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v6 13/16] perf: Use new infrastructure to fix deadlocks in execve
       [not found] <077b63b7-6f5e-aa8e-bf96-a586b481cc46@hotmail.de>
                   ` (11 preceding siblings ...)
  2020-03-20 20:27 ` [PATCH v6 12/16] proc: io_accounting: " Bernd Edlinger
@ 2020-03-20 20:27 ` Bernd Edlinger
  2020-03-21  2:46 ` [PATCH v6 14/16] pidfd: " Bernd Edlinger
                   ` (2 subsequent siblings)
  15 siblings, 0 replies; 23+ messages in thread
From: Bernd Edlinger @ 2020-03-20 20:27 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Kirill Tkhai, Eric W. Biederman,
	Christian Brauner, Kees Cook, Jann Horn, Jonathan Corbet,
	Alexander Viro, Andrew Morton, Alexey Dobriyan, Thomas Gleixner,
	Oleg Nesterov, Frederic Weisbecker, Andrei Vagin, Ingo Molnar,
	Peter Zijlstra (Intel),
	Yuyang Du, David Hildenbrand, Sebastian Andrzej Siewior,
	Anshuman Khandual, David Howells, James Morris, Shakeel Butt,
	Jason Gunthorpe, Christian Kellner, Andrea Arcangeli,
	Aleksa Sarai, Dmitry V. Levin, linux-doc, linux-kernel,
	linux-fsdevel, linux-mm, stable, linux-api

This changes perf_event_set_clock to use the new exec_update_mutex
instead of cred_guard_mutex.

This should be safe, as the credentials are only used for reading.

Signed-off-by: Bernd Edlinger <bernd.edlinger@hotmail.de>
---
 kernel/events/core.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index e453589..71cba8c 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -1249,7 +1249,7 @@ static void put_ctx(struct perf_event_context *ctx)
  * function.
  *
  * Lock order:
- *    cred_guard_mutex
+ *    exec_update_mutex
  *	task_struct::perf_event_mutex
  *	  perf_event_context::mutex
  *	    perf_event::child_mutex;
@@ -11263,14 +11263,14 @@ static int perf_event_set_clock(struct perf_event *event, clockid_t clk_id)
 	}
 
 	if (task) {
-		err = mutex_lock_interruptible(&task->signal->cred_guard_mutex);
+		err = mutex_lock_interruptible(&task->signal->exec_update_mutex);
 		if (err)
 			goto err_task;
 
 		/*
 		 * Reuse ptrace permission checks for now.
 		 *
-		 * We must hold cred_guard_mutex across this and any potential
+		 * We must hold exec_update_mutex across this and any potential
 		 * perf_install_in_context() call for this new event to
 		 * serialize against exec() altering our credentials (and the
 		 * perf_event_exit_task() that could imply).
@@ -11559,7 +11559,7 @@ static int perf_event_set_clock(struct perf_event *event, clockid_t clk_id)
 	mutex_unlock(&ctx->mutex);
 
 	if (task) {
-		mutex_unlock(&task->signal->cred_guard_mutex);
+		mutex_unlock(&task->signal->exec_update_mutex);
 		put_task_struct(task);
 	}
 
@@ -11595,7 +11595,7 @@ static int perf_event_set_clock(struct perf_event *event, clockid_t clk_id)
 		free_event(event);
 err_cred:
 	if (task)
-		mutex_unlock(&task->signal->cred_guard_mutex);
+		mutex_unlock(&task->signal->exec_update_mutex);
 err_task:
 	if (task)
 		put_task_struct(task);
@@ -11900,7 +11900,7 @@ static void perf_event_exit_task_context(struct task_struct *child, int ctxn)
 /*
  * When a child task exits, feed back event values to parent events.
  *
- * Can be called with cred_guard_mutex held when called from
+ * Can be called with exec_update_mutex held when called from
  * install_exec_creds().
  */
 void perf_event_exit_task(struct task_struct *child)
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v6 14/16] pidfd: Use new infrastructure to fix deadlocks in execve
       [not found] <077b63b7-6f5e-aa8e-bf96-a586b481cc46@hotmail.de>
                   ` (12 preceding siblings ...)
  2020-03-20 20:27 ` [PATCH v6 13/16] perf: " Bernd Edlinger
@ 2020-03-21  2:46 ` Bernd Edlinger
  2020-03-25 15:40   ` Christian Brauner
  2020-03-21  2:46 ` [PATCH v6 15/16] exec: Fix dead-lock in de_thread with ptrace_attach Bernd Edlinger
  2020-03-21  2:47 ` [PATCH v6 16/16] doc: Update documentation of ->exec_*_mutex Bernd Edlinger
  15 siblings, 1 reply; 23+ messages in thread
From: Bernd Edlinger @ 2020-03-21  2:46 UTC (permalink / raw)
  To: gregkh, Kirill Tkhai, Eric W. Biederman, Christian Brauner,
	Kees Cook, jannh, Jonathan Corbet, Alexander Viro, Andrew Morton,
	adobriyan, Thomas Gleixner, Oleg Nesterov, Frederic Weisbecker,
	avagin, Ingo Molnar, Peter Zijlstra (Intel),
	duyuyang, David Hildenbrand, Sebastian Andrzej Siewior,
	Anshuman Khandual, David Howells, James Morris, Shakeel Butt,
	Jason Gunthorpe, christian, Andrea Arcangeli, Aleksa Sarai,
	Dmitry V. Levin, linux-doc, linux-kernel, linux-fsdevel,
	linux-mm, stable, linux-api

This changes __pidfd_fget to use the new exec_update_mutex
instead of cred_guard_mutex.

This should be safe, as the credentials do not change
before exec_update_mutex is locked.  Therefore whatever
file access is possible with holding the cred_guard_mutex
here is also possbile with the exec_update_mutex.

Signed-off-by: Bernd Edlinger <bernd.edlinger@hotmail.de>
---
 kernel/pid.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/pid.c b/kernel/pid.c
index 0f4ecb5..04821f4 100644
--- a/kernel/pid.c
+++ b/kernel/pid.c
@@ -584,7 +584,7 @@ static struct file *__pidfd_fget(struct task_struct *task, int fd)
 	struct file *file;
 	int ret;
 
-	ret = mutex_lock_killable(&task->signal->cred_guard_mutex);
+	ret = mutex_lock_killable(&task->signal->exec_update_mutex);
 	if (ret)
 		return ERR_PTR(ret);
 
@@ -593,7 +593,7 @@ static struct file *__pidfd_fget(struct task_struct *task, int fd)
 	else
 		file = ERR_PTR(-EPERM);
 
-	mutex_unlock(&task->signal->cred_guard_mutex);
+	mutex_unlock(&task->signal->exec_update_mutex);
 
 	return file ?: ERR_PTR(-EBADF);
 }
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v6 15/16] exec: Fix dead-lock in de_thread with ptrace_attach
       [not found] <077b63b7-6f5e-aa8e-bf96-a586b481cc46@hotmail.de>
                   ` (13 preceding siblings ...)
  2020-03-21  2:46 ` [PATCH v6 14/16] pidfd: " Bernd Edlinger
@ 2020-03-21  2:46 ` Bernd Edlinger
  2020-03-25 14:27   ` Eric W. Biederman
  2020-03-21  2:47 ` [PATCH v6 16/16] doc: Update documentation of ->exec_*_mutex Bernd Edlinger
  15 siblings, 1 reply; 23+ messages in thread
From: Bernd Edlinger @ 2020-03-21  2:46 UTC (permalink / raw)
  To: gregkh, Kirill Tkhai, Eric W. Biederman, Christian Brauner,
	Kees Cook, jannh, Jonathan Corbet, Alexander Viro, Andrew Morton,
	adobriyan, Thomas Gleixner, Oleg Nesterov, Frederic Weisbecker,
	avagin, Ingo Molnar, Peter Zijlstra (Intel),
	duyuyang, David Hildenbrand, Sebastian Andrzej Siewior,
	Anshuman Khandual, David Howells, James Morris, Shakeel Butt,
	Jason Gunthorpe, christian, Andrea Arcangeli, Aleksa Sarai,
	Dmitry V. Levin, linux-doc, linux-kernel, linux-fsdevel,
	linux-mm, stable, linux-api

This removes the last users of cred_guard_mutex
and replaces it with a new mutex exec_guard_mutex,
and a boolean unsafe_execve_in_progress.

This addresses the case when at least one of the
sibling threads is traced, and therefore the trace
process may dead-lock in ptrace_attach, but de_thread
will need to wait for the tracer to continue execution.

The solution is to detect this situation and make
ptrace_attach and similar functions return -EAGAIN,
but only in a situation where a dead-lock is imminent.

This means this is an API change, but only when the
process is traced while execve happens in a
multi-threaded application.

See tools/testing/selftests/ptrace/vmaccess.c
for a test case that gets fixed by this change.

Signed-off-by: Bernd Edlinger <bernd.edlinger@hotmail.de>
---
 fs/exec.c                    | 44 +++++++++++++++++++++++++++++++++++---------
 fs/proc/base.c               | 13 ++++++++-----
 include/linux/sched/signal.h | 14 +++++++++-----
 init/init_task.c             |  2 +-
 kernel/cred.c                |  2 +-
 kernel/fork.c                |  2 +-
 kernel/ptrace.c              | 20 +++++++++++++++++---
 kernel/seccomp.c             | 15 +++++++++------
 8 files changed, 81 insertions(+), 31 deletions(-)

diff --git a/fs/exec.c b/fs/exec.c
index 0e46ec5..2056562 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1078,14 +1078,26 @@ static int de_thread(struct task_struct *tsk)
 	struct signal_struct *sig = tsk->signal;
 	struct sighand_struct *oldsighand = tsk->sighand;
 	spinlock_t *lock = &oldsighand->siglock;
+	struct task_struct *t = tsk;
 
 	if (thread_group_empty(tsk))
 		goto no_thread_group;
 
+	spin_lock_irq(lock);
+	while_each_thread(tsk, t) {
+		if (unlikely(t->ptrace))
+			sig->unsafe_execve_in_progress = true;
+	}
+
+	if (unlikely(sig->unsafe_execve_in_progress)) {
+		spin_unlock_irq(lock);
+		mutex_unlock(&sig->exec_guard_mutex);
+		spin_lock_irq(lock);
+	}
+
 	/*
 	 * Kill all other threads in the thread group.
 	 */
-	spin_lock_irq(lock);
 	if (signal_group_exit(sig)) {
 		/*
 		 * Another group action in progress, just
@@ -1429,22 +1441,30 @@ void finalize_exec(struct linux_binprm *bprm)
 EXPORT_SYMBOL(finalize_exec);
 
 /*
- * Prepare credentials and lock ->cred_guard_mutex.
+ * Prepare credentials and lock ->exec_guard_mutex.
  * install_exec_creds() commits the new creds and drops the lock.
  * Or, if exec fails before, free_bprm() should release ->cred and
  * and unlock.
  */
 static int prepare_bprm_creds(struct linux_binprm *bprm)
 {
-	if (mutex_lock_interruptible(&current->signal->cred_guard_mutex))
+	int ret;
+
+	if (mutex_lock_interruptible(&current->signal->exec_guard_mutex))
 		return -ERESTARTNOINTR;
 
+	ret = -EAGAIN;
+	if (unlikely(current->signal->unsafe_execve_in_progress))
+		goto out;
+
 	bprm->cred = prepare_exec_creds();
 	if (likely(bprm->cred))
 		return 0;
 
-	mutex_unlock(&current->signal->cred_guard_mutex);
-	return -ENOMEM;
+	ret = -ENOMEM;
+out:
+	mutex_unlock(&current->signal->exec_guard_mutex);
+	return ret;
 }
 
 static void free_bprm(struct linux_binprm *bprm)
@@ -1453,7 +1473,10 @@ static void free_bprm(struct linux_binprm *bprm)
 	if (bprm->cred) {
 		if (bprm->called_exec_mmap)
 			mutex_unlock(&current->signal->exec_update_mutex);
-		mutex_unlock(&current->signal->cred_guard_mutex);
+		if (unlikely(current->signal->unsafe_execve_in_progress))
+			mutex_lock(&current->signal->exec_guard_mutex);
+		current->signal->unsafe_execve_in_progress = false;
+		mutex_unlock(&current->signal->exec_guard_mutex);
 		abort_creds(bprm->cred);
 	}
 	if (bprm->file) {
@@ -1497,19 +1520,22 @@ void install_exec_creds(struct linux_binprm *bprm)
 	if (get_dumpable(current->mm) != SUID_DUMP_USER)
 		perf_event_exit_task(current);
 	/*
-	 * cred_guard_mutex must be held at least to this point to prevent
+	 * exec_guard_mutex must be held at least to this point to prevent
 	 * ptrace_attach() from altering our determination of the task's
 	 * credentials; any time after this it may be unlocked.
 	 */
 	security_bprm_committed_creds(bprm);
 	mutex_unlock(&current->signal->exec_update_mutex);
-	mutex_unlock(&current->signal->cred_guard_mutex);
+	if (unlikely(current->signal->unsafe_execve_in_progress))
+		mutex_lock(&current->signal->exec_guard_mutex);
+	current->signal->unsafe_execve_in_progress = false;
+	mutex_unlock(&current->signal->exec_guard_mutex);
 }
 EXPORT_SYMBOL(install_exec_creds);
 
 /*
  * determine how safe it is to execute the proposed program
- * - the caller must hold ->cred_guard_mutex to protect against
+ * - the caller must hold ->exec_guard_mutex to protect against
  *   PTRACE_ATTACH or seccomp thread-sync
  */
 static void check_unsafe_exec(struct linux_binprm *bprm)
diff --git a/fs/proc/base.c b/fs/proc/base.c
index 6b13fc4..a428536 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -2680,14 +2680,17 @@ static ssize_t proc_pid_attr_write(struct file * file, const char __user * buf,
 	}
 
 	/* Guard against adverse ptrace interaction */
-	rv = mutex_lock_interruptible(&current->signal->cred_guard_mutex);
+	rv = mutex_lock_interruptible(&current->signal->exec_guard_mutex);
 	if (rv < 0)
 		goto out_free;
 
-	rv = security_setprocattr(PROC_I(inode)->op.lsm,
-				  file->f_path.dentry->d_name.name, page,
-				  count);
-	mutex_unlock(&current->signal->cred_guard_mutex);
+	if (unlikely(current->signal->unsafe_execve_in_progress))
+		rv = -EAGAIN;
+	else
+		rv = security_setprocattr(PROC_I(inode)->op.lsm,
+					  file->f_path.dentry->d_name.name,
+					  page, count);
+	mutex_unlock(&current->signal->exec_guard_mutex);
 out_free:
 	kfree(page);
 out:
diff --git a/include/linux/sched/signal.h b/include/linux/sched/signal.h
index a29df79..e83cef2 100644
--- a/include/linux/sched/signal.h
+++ b/include/linux/sched/signal.h
@@ -212,6 +212,13 @@ struct signal_struct {
 #endif
 
 	/*
+	 * Set while execve is executing but is *not* holding
+	 * exec_guard_mutex to avoid possible dead-locks.
+	 * Only valid when exec_guard_mutex is held.
+	 */
+	bool unsafe_execve_in_progress;
+
+	/*
 	 * Thread is the potential origin of an oom condition; kill first on
 	 * oom
 	 */
@@ -222,11 +229,8 @@ struct signal_struct {
 	struct mm_struct *oom_mm;	/* recorded mm when the thread group got
 					 * killed by the oom killer */
 
-	struct mutex cred_guard_mutex;	/* guard against foreign influences on
-					 * credential calculations
-					 * (notably. ptrace)
-					 * Deprecated do not use in new code.
-					 * Use exec_update_mutex instead.
+	struct mutex exec_guard_mutex;	/* Held while execve runs, except when
+					 * a sibling thread is being traced.
 					 */
 	struct mutex exec_update_mutex;	/* Held while task_struct is being
 					 * updated during exec, and may have
diff --git a/init/init_task.c b/init/init_task.c
index bd403ed..6f96327 100644
--- a/init/init_task.c
+++ b/init/init_task.c
@@ -25,7 +25,7 @@
 	},
 	.multiprocess	= HLIST_HEAD_INIT,
 	.rlim		= INIT_RLIMITS,
-	.cred_guard_mutex = __MUTEX_INITIALIZER(init_signals.cred_guard_mutex),
+	.exec_guard_mutex = __MUTEX_INITIALIZER(init_signals.exec_guard_mutex),
 	.exec_update_mutex = __MUTEX_INITIALIZER(init_signals.exec_update_mutex),
 #ifdef CONFIG_POSIX_TIMERS
 	.posix_timers = LIST_HEAD_INIT(init_signals.posix_timers),
diff --git a/kernel/cred.c b/kernel/cred.c
index 71a7926..341ca59 100644
--- a/kernel/cred.c
+++ b/kernel/cred.c
@@ -295,7 +295,7 @@ struct cred *prepare_creds(void)
 
 /*
  * Prepare credentials for current to perform an execve()
- * - The caller must hold ->cred_guard_mutex
+ * - The caller must hold ->exec_guard_mutex
  */
 struct cred *prepare_exec_creds(void)
 {
diff --git a/kernel/fork.c b/kernel/fork.c
index e23ccac..98012f7 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1593,7 +1593,7 @@ static int copy_signal(unsigned long clone_flags, struct task_struct *tsk)
 	sig->oom_score_adj = current->signal->oom_score_adj;
 	sig->oom_score_adj_min = current->signal->oom_score_adj_min;
 
-	mutex_init(&sig->cred_guard_mutex);
+	mutex_init(&sig->exec_guard_mutex);
 	mutex_init(&sig->exec_update_mutex);
 
 	return 0;
diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index 43d6179..221759e 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -392,9 +392,13 @@ static int ptrace_attach(struct task_struct *task, long request,
 	 * under ptrace.
 	 */
 	retval = -ERESTARTNOINTR;
-	if (mutex_lock_interruptible(&task->signal->cred_guard_mutex))
+	if (mutex_lock_interruptible(&task->signal->exec_guard_mutex))
 		goto out;
 
+	retval = -EAGAIN;
+	if (unlikely(task->signal->unsafe_execve_in_progress))
+		goto unlock_creds;
+
 	task_lock(task);
 	retval = __ptrace_may_access(task, PTRACE_MODE_ATTACH_REALCREDS);
 	task_unlock(task);
@@ -447,7 +451,7 @@ static int ptrace_attach(struct task_struct *task, long request,
 unlock_tasklist:
 	write_unlock_irq(&tasklist_lock);
 unlock_creds:
-	mutex_unlock(&task->signal->cred_guard_mutex);
+	mutex_unlock(&task->signal->exec_guard_mutex);
 out:
 	if (!retval) {
 		/*
@@ -472,10 +476,18 @@ static int ptrace_attach(struct task_struct *task, long request,
  */
 static int ptrace_traceme(void)
 {
-	int ret = -EPERM;
+	int ret;
+
+	if (mutex_lock_interruptible(&current->signal->exec_guard_mutex))
+		return -ERESTARTNOINTR;
+
+	ret = -EAGAIN;
+	if (unlikely(current->signal->unsafe_execve_in_progress))
+		goto unlock_creds;
 
 	write_lock_irq(&tasklist_lock);
 	/* Are we already being traced? */
+	ret = -EPERM;
 	if (!current->ptrace) {
 		ret = security_ptrace_traceme(current->parent);
 		/*
@@ -490,6 +502,8 @@ static int ptrace_traceme(void)
 	}
 	write_unlock_irq(&tasklist_lock);
 
+unlock_creds:
+	mutex_unlock(&current->signal->exec_guard_mutex);
 	return ret;
 }
 
diff --git a/kernel/seccomp.c b/kernel/seccomp.c
index b6ea3dc..acd6960 100644
--- a/kernel/seccomp.c
+++ b/kernel/seccomp.c
@@ -329,7 +329,7 @@ static int is_ancestor(struct seccomp_filter *parent,
 /**
  * seccomp_can_sync_threads: checks if all threads can be synchronized
  *
- * Expects sighand and cred_guard_mutex locks to be held.
+ * Expects sighand and exec_guard_mutex locks to be held.
  *
  * Returns 0 on success, -ve on error, or the pid of a thread which was
  * either not in the correct seccomp mode or did not have an ancestral
@@ -339,9 +339,12 @@ static inline pid_t seccomp_can_sync_threads(void)
 {
 	struct task_struct *thread, *caller;
 
-	BUG_ON(!mutex_is_locked(&current->signal->cred_guard_mutex));
+	BUG_ON(!mutex_is_locked(&current->signal->exec_guard_mutex));
 	assert_spin_locked(&current->sighand->siglock);
 
+	if (unlikely(current->signal->unsafe_execve_in_progress))
+		return -EAGAIN;
+
 	/* Validate all threads being eligible for synchronization. */
 	caller = current;
 	for_each_thread(caller, thread) {
@@ -371,7 +374,7 @@ static inline pid_t seccomp_can_sync_threads(void)
 /**
  * seccomp_sync_threads: sets all threads to use current's filter
  *
- * Expects sighand and cred_guard_mutex locks to be held, and for
+ * Expects sighand and exec_guard_mutex locks to be held, and for
  * seccomp_can_sync_threads() to have returned success already
  * without dropping the locks.
  *
@@ -380,7 +383,7 @@ static inline void seccomp_sync_threads(unsigned long flags)
 {
 	struct task_struct *thread, *caller;
 
-	BUG_ON(!mutex_is_locked(&current->signal->cred_guard_mutex));
+	BUG_ON(!mutex_is_locked(&current->signal->exec_guard_mutex));
 	assert_spin_locked(&current->sighand->siglock);
 
 	/* Synchronize all threads. */
@@ -1319,7 +1322,7 @@ static long seccomp_set_mode_filter(unsigned int flags,
 	 * while another thread is in the middle of calling exec.
 	 */
 	if (flags & SECCOMP_FILTER_FLAG_TSYNC &&
-	    mutex_lock_killable(&current->signal->cred_guard_mutex))
+	    mutex_lock_killable(&current->signal->exec_guard_mutex))
 		goto out_put_fd;
 
 	spin_lock_irq(&current->sighand->siglock);
@@ -1337,7 +1340,7 @@ static long seccomp_set_mode_filter(unsigned int flags,
 out:
 	spin_unlock_irq(&current->sighand->siglock);
 	if (flags & SECCOMP_FILTER_FLAG_TSYNC)
-		mutex_unlock(&current->signal->cred_guard_mutex);
+		mutex_unlock(&current->signal->exec_guard_mutex);
 out_put_fd:
 	if (flags & SECCOMP_FILTER_FLAG_NEW_LISTENER) {
 		if (ret) {
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v6 16/16] doc: Update documentation of ->exec_*_mutex
       [not found] <077b63b7-6f5e-aa8e-bf96-a586b481cc46@hotmail.de>
                   ` (14 preceding siblings ...)
  2020-03-21  2:46 ` [PATCH v6 15/16] exec: Fix dead-lock in de_thread with ptrace_attach Bernd Edlinger
@ 2020-03-21  2:47 ` Bernd Edlinger
  15 siblings, 0 replies; 23+ messages in thread
From: Bernd Edlinger @ 2020-03-21  2:47 UTC (permalink / raw)
  To: gregkh, Kirill Tkhai, Eric W. Biederman, Christian Brauner,
	Kees Cook, jannh, Jonathan Corbet, Alexander Viro, Andrew Morton,
	adobriyan, Thomas Gleixner, Oleg Nesterov, Frederic Weisbecker,
	avagin, Ingo Molnar, Peter Zijlstra (Intel),
	duyuyang, David Hildenbrand, Sebastian Andrzej Siewior,
	Anshuman Khandual, David Howells, James Morris, Shakeel Butt,
	Jason Gunthorpe, christian, Andrea Arcangeli, Aleksa Sarai,
	Dmitry V. Levin, linux-doc, linux-kernel, linux-fsdevel,
	linux-mm, stable, linux-api

This brings the outdated Documentation/security/credentials.rst
back in line with the current implementation, and describes the
purpose of current->signal->exec_update_mutex,
current->signal->exec_guard_mutex and
current->signal->unsafe_execve_in_progress.

Signed-off-by: Bernd Edlinger <bernd.edlinger@hotmail.de>
---
 Documentation/security/credentials.rst | 29 +++++++++++++++++++++--------
 1 file changed, 21 insertions(+), 8 deletions(-)

diff --git a/Documentation/security/credentials.rst b/Documentation/security/credentials.rst
index 282e79f..fe4cd76 100644
--- a/Documentation/security/credentials.rst
+++ b/Documentation/security/credentials.rst
@@ -437,15 +437,30 @@ new set of credentials by calling::
 
 	struct cred *prepare_creds(void);
 
-this locks current->cred_replace_mutex and then allocates and constructs a
-duplicate of the current process's credentials, returning with the mutex still
-held if successful.  It returns NULL if not successful (out of memory).
+this allocates and constructs a duplicate of the current process's credentials.
+It returns NULL if not successful (out of memory).
+
+If called from __do_execve_file, the mutex current->signal->exec_guard_mutex
+is acquired before this function gets called, and usually released after
+the new process mmap and credentials are installed.  However if one of the
+sibling threads are being traced when the execve is invoked, there is no
+guarantee how long it takes to terminate all sibling threads, and therefore
+the variable current->signal->unsafe_execve_in_progress is set, and the
+exec_guard_mutex is released immediately.  Functions that may have effect
+on the credentials of a different thread need to lock the exec_guard_mutex
+and additionally check the unsafe_execve_in_progress status, and fail with
+-EAGAIN if that variable is set.
 
 The mutex prevents ``ptrace()`` from altering the ptrace state of a process
 while security checks on credentials construction and changing is taking place
 as the ptrace state may alter the outcome, particularly in the case of
 ``execve()``.
 
+The mutex current->signal->exec_update_mutex is acquired when only a single
+thread is remaining, and the credentials and the process mmap are actually
+changed.  Functions that only need to access to a consistent state of the
+credentials and the process mmap do only need to aquire this mutex.
+
 The new credentials set should be altered appropriately, and any security
 checks and hooks done.  Both the current and the proposed sets of credentials
 are available for this purpose as current_cred() will return the current set
@@ -466,9 +481,8 @@ by calling::
 
 This will alter various aspects of the credentials and the process, giving the
 LSM a chance to do likewise, then it will use ``rcu_assign_pointer()`` to
-actually commit the new credentials to ``current->cred``, it will release
-``current->cred_replace_mutex`` to allow ``ptrace()`` to take place, and it
-will notify the scheduler and others of the changes.
+actually commit the new credentials to ``current->cred``, and it will notify
+the scheduler and others of the changes.
 
 This function is guaranteed to return 0, so that it can be tail-called at the
 end of such functions as ``sys_setresuid()``.
@@ -486,8 +500,7 @@ invoked::
 
 	void abort_creds(struct cred *new);
 
-This releases the lock on ``current->cred_replace_mutex`` that
-``prepare_creds()`` got and then releases the new credentials.
+This releases the new credentials.
 
 
 A typical credentials alteration function would look something like this::
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH v6 05/16] exec: Add exec_update_mutex to replace cred_guard_mutex
  2020-03-20 20:25 ` [PATCH v6 05/16] exec: Add exec_update_mutex to replace cred_guard_mutex Bernd Edlinger
@ 2020-03-23 10:51   ` Kirill Tkhai
  0 siblings, 0 replies; 23+ messages in thread
From: Kirill Tkhai @ 2020-03-23 10:51 UTC (permalink / raw)
  To: Bernd Edlinger, Greg Kroah-Hartman, Eric W. Biederman,
	Christian Brauner, Kees Cook, Jann Horn, Jonathan Corbet,
	Alexander Viro, Andrew Morton, Alexey Dobriyan, Thomas Gleixner,
	Oleg Nesterov, Frederic Weisbecker, Andrei Vagin, Ingo Molnar,
	Peter Zijlstra (Intel),
	Yuyang Du, David Hildenbrand, Sebastian Andrzej Siewior,
	Anshuman Khandual, David Howells, James Morris, Shakeel Butt,
	Jason Gunthorpe, Christian Kellner, Andrea Arcangeli,
	Aleksa Sarai, Dmitry V. Levin, linux-doc, linux-kernel,
	linux-fsdevel, linux-mm, stable, linux-api

On 20.03.2020 23:25, Bernd Edlinger wrote:
> The cred_guard_mutex is problematic as it is held over possibly
> indefinite waits for userspace.  The possible indefinite waits for
> userspace that I have identified are: The cred_guard_mutex is held in
> PTRACE_EVENT_EXIT waiting for the tracer.  The cred_guard_mutex is
> held over "put_user(0, tsk->clear_child_tid)" in exit_mm().  The
> cred_guard_mutex is held over "get_user(futex_offset, ...")  in
> exit_robust_list.  The cred_guard_mutex held over copy_strings.
> 
> The functions get_user and put_user can trigger a page fault which can
> potentially wait indefinitely in the case of userfaultfd or if
> userspace implements part of the page fault path.
> 
> In any of those cases the userspace process that the kernel is waiting
> for might make a different system call that winds up taking the
> cred_guard_mutex and result in deadlock.
> 
> Holding a mutex over any of those possibly indefinite waits for
> userspace does not appear necessary.  Add exec_update_mutex that will
> just cover updating the process during exec where the permissions and
> the objects pointed to by the task struct may be out of sync.
> 
> The plan is to switch the users of cred_guard_mutex to
> exec_update_mutex one by one.  This lets us move forward while still
> being careful and not introducing any regressions.
>
> Link: https://lore.kernel.org/lkml/20160921152946.GA24210@dhcp22.suse.cz/
> Link: https://lore.kernel.org/lkml/AM6PR03MB5170B06F3A2B75EFB98D071AE4E60@AM6PR03MB5170.eurprd03.prod.outlook.com/
> Link: https://lore.kernel.org/linux-fsdevel/20161102181806.GB1112@redhat.com/
> Link: https://lore.kernel.org/lkml/20160923095031.GA14923@redhat.com/
> Link: https://lore.kernel.org/lkml/20170213141452.GA30203@redhat.com/
> Ref: 45c1a159b85b ("Add PTRACE_O_TRACEVFORKDONE and PTRACE_O_TRACEEXIT facilities.")
> Ref: 456f17cd1a28 ("[PATCH] user-vm-unlock-2.5.31-A2")
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> Signed-off-by: Bernd Edlinger <bernd.edlinger@hotmail.de>

Reviewed-by: Kirill Tkhai <ktkhai@virtuozzo.com>

> ---
>  fs/exec.c                    | 22 +++++++++++++++++++---
>  include/linux/binfmts.h      |  8 +++++++-
>  include/linux/sched/signal.h |  9 ++++++++-
>  init/init_task.c             |  1 +
>  kernel/fork.c                |  1 +
>  5 files changed, 36 insertions(+), 5 deletions(-)
> 
> diff --git a/fs/exec.c b/fs/exec.c
> index d820a72..0e46ec5 100644
> --- a/fs/exec.c
> +++ b/fs/exec.c
> @@ -1010,16 +1010,26 @@ ssize_t read_code(struct file *file, unsigned long addr, loff_t pos, size_t len)
>  }
>  EXPORT_SYMBOL(read_code);
>  
> +/*
> + * Maps the mm_struct mm into the current task struct.
> + * On success, this function returns with the mutex
> + * exec_update_mutex locked.
> + */
>  static int exec_mmap(struct mm_struct *mm)
>  {
>  	struct task_struct *tsk;
>  	struct mm_struct *old_mm, *active_mm;
> +	int ret;
>  
>  	/* Notify parent that we're no longer interested in the old VM */
>  	tsk = current;
>  	old_mm = current->mm;
>  	exec_mm_release(tsk, old_mm);
>  
> +	ret = mutex_lock_killable(&tsk->signal->exec_update_mutex);
> +	if (ret)
> +		return ret;
> +
>  	if (old_mm) {
>  		sync_mm_rss(old_mm);
>  		/*
> @@ -1031,9 +1041,11 @@ static int exec_mmap(struct mm_struct *mm)
>  		down_read(&old_mm->mmap_sem);
>  		if (unlikely(old_mm->core_state)) {
>  			up_read(&old_mm->mmap_sem);
> +			mutex_unlock(&tsk->signal->exec_update_mutex);
>  			return -EINTR;
>  		}
>  	}
> +
>  	task_lock(tsk);
>  	active_mm = tsk->active_mm;
>  	membarrier_exec_mmap(mm);
> @@ -1288,11 +1300,12 @@ int flush_old_exec(struct linux_binprm * bprm)
>  		goto out;
>  
>  	/*
> -	 * After clearing bprm->mm (to mark that current is using the
> -	 * prepared mm now), we have nothing left of the original
> +	 * After setting bprm->called_exec_mmap (to mark that current is
> +	 * using the prepared mm now), we have nothing left of the original
>  	 * process. If anything from here on returns an error, the check
>  	 * in search_binary_handler() will SEGV current.
>  	 */
> +	bprm->called_exec_mmap = 1;
>  	bprm->mm = NULL;
>  
>  #ifdef CONFIG_POSIX_TIMERS
> @@ -1438,6 +1451,8 @@ static void free_bprm(struct linux_binprm *bprm)
>  {
>  	free_arg_pages(bprm);
>  	if (bprm->cred) {
> +		if (bprm->called_exec_mmap)
> +			mutex_unlock(&current->signal->exec_update_mutex);
>  		mutex_unlock(&current->signal->cred_guard_mutex);
>  		abort_creds(bprm->cred);
>  	}
> @@ -1487,6 +1502,7 @@ void install_exec_creds(struct linux_binprm *bprm)
>  	 * credentials; any time after this it may be unlocked.
>  	 */
>  	security_bprm_committed_creds(bprm);
> +	mutex_unlock(&current->signal->exec_update_mutex);
>  	mutex_unlock(&current->signal->cred_guard_mutex);
>  }
>  EXPORT_SYMBOL(install_exec_creds);
> @@ -1678,7 +1694,7 @@ int search_binary_handler(struct linux_binprm *bprm)
>  
>  		read_lock(&binfmt_lock);
>  		put_binfmt(fmt);
> -		if (retval < 0 && !bprm->mm) {
> +		if (retval < 0 && bprm->called_exec_mmap) {
>  			/* we got to flush_old_exec() and failed after it */
>  			read_unlock(&binfmt_lock);
>  			force_sigsegv(SIGSEGV);
> diff --git a/include/linux/binfmts.h b/include/linux/binfmts.h
> index b40fc63..a345d9f 100644
> --- a/include/linux/binfmts.h
> +++ b/include/linux/binfmts.h
> @@ -44,7 +44,13 @@ struct linux_binprm {
>  		 * exec has happened. Used to sanitize execution environment
>  		 * and to set AT_SECURE auxv for glibc.
>  		 */
> -		secureexec:1;
> +		secureexec:1,
> +		/*
> +		 * Set by flush_old_exec, when exec_mmap has been called.
> +		 * This is past the point of no return, when the
> +		 * exec_update_mutex has been taken.
> +		 */
> +		called_exec_mmap:1;
>  #ifdef __alpha__
>  	unsigned int taso:1;
>  #endif
> diff --git a/include/linux/sched/signal.h b/include/linux/sched/signal.h
> index 8805025..a29df79 100644
> --- a/include/linux/sched/signal.h
> +++ b/include/linux/sched/signal.h
> @@ -224,7 +224,14 @@ struct signal_struct {
>  
>  	struct mutex cred_guard_mutex;	/* guard against foreign influences on
>  					 * credential calculations
> -					 * (notably. ptrace) */
> +					 * (notably. ptrace)
> +					 * Deprecated do not use in new code.
> +					 * Use exec_update_mutex instead.
> +					 */
> +	struct mutex exec_update_mutex;	/* Held while task_struct is being
> +					 * updated during exec, and may have
> +					 * inconsistent permissions.
> +					 */
>  } __randomize_layout;
>  
>  /*
> diff --git a/init/init_task.c b/init/init_task.c
> index 9e5cbe5..bd403ed 100644
> --- a/init/init_task.c
> +++ b/init/init_task.c
> @@ -26,6 +26,7 @@
>  	.multiprocess	= HLIST_HEAD_INIT,
>  	.rlim		= INIT_RLIMITS,
>  	.cred_guard_mutex = __MUTEX_INITIALIZER(init_signals.cred_guard_mutex),
> +	.exec_update_mutex = __MUTEX_INITIALIZER(init_signals.exec_update_mutex),
>  #ifdef CONFIG_POSIX_TIMERS
>  	.posix_timers = LIST_HEAD_INIT(init_signals.posix_timers),
>  	.cputimer	= {
> diff --git a/kernel/fork.c b/kernel/fork.c
> index 8642530..036b692 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -1594,6 +1594,7 @@ static int copy_signal(unsigned long clone_flags, struct task_struct *tsk)
>  	sig->oom_score_adj_min = current->signal->oom_score_adj_min;
>  
>  	mutex_init(&sig->cred_guard_mutex);
> +	mutex_init(&sig->exec_update_mutex);
>  
>  	return 0;
>  }
> 


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v6 15/16] exec: Fix dead-lock in de_thread with ptrace_attach
  2020-03-21  2:46 ` [PATCH v6 15/16] exec: Fix dead-lock in de_thread with ptrace_attach Bernd Edlinger
@ 2020-03-25 14:27   ` Eric W. Biederman
  2020-03-29  4:31     ` Bernd Edlinger
  0 siblings, 1 reply; 23+ messages in thread
From: Eric W. Biederman @ 2020-03-25 14:27 UTC (permalink / raw)
  To: Bernd Edlinger
  Cc: gregkh, Kirill Tkhai, Christian Brauner, Kees Cook, jannh,
	Jonathan Corbet, Alexander Viro, Andrew Morton, adobriyan,
	Thomas Gleixner, Oleg Nesterov, Frederic Weisbecker, avagin,
	Ingo Molnar, Peter Zijlstra (Intel),
	duyuyang, David Hildenbrand, Sebastian Andrzej Siewior,
	Anshuman Khandual, David Howells, James Morris, Shakeel Butt,
	Jason Gunthorpe, christian, Andrea Arcangeli, Aleksa Sarai,
	Dmitry V. Levin, linux-doc, linux-kernel, linux-fsdevel,
	linux-mm, stable, linux-api

Bernd Edlinger <bernd.edlinger@hotmail.de> writes:

> This removes the last users of cred_guard_mutex
> and replaces it with a new mutex exec_guard_mutex,
> and a boolean unsafe_execve_in_progress.
>
> This addresses the case when at least one of the
> sibling threads is traced, and therefore the trace
> process may dead-lock in ptrace_attach, but de_thread
> will need to wait for the tracer to continue execution.
>
> The solution is to detect this situation and make
> ptrace_attach and similar functions return -EAGAIN,
> but only in a situation where a dead-lock is imminent.
>
> This means this is an API change, but only when the
> process is traced while execve happens in a
> multi-threaded application.
>
> See tools/testing/selftests/ptrace/vmaccess.c
> for a test case that gets fixed by this change.

Hmm.  The logic with unsafe_execve_in_progress is interesting.
I think I see what you are aiming for.

So far as you have hit what you are aiming for I think this is
a safe change as the only cases that will change are the cases
that would deadlock today.

At a minimum the code is subtle and I don't see big fat
warning comments that subtle code needs to keep people
from using it wrong.

Further while the change below to proc_pid_attr_write looks
like it is being treated the same as ptrace_attach.  When in
fact proc_pid_attr_write needs the no_new_privs and ptrace_attach
protection the same as exec.  As the updated cred won't be used in an
ongoing exec, exec does not need protection from proc_pid_attr_write,
other than deadlock protection.

Having the relevant lock be per task_struct lock would probably be a
better way to avoid deadlock with a concurrent proc_pid_attr_write.


So I am going to pass on these last two patches for now, and apply the
rest and get them into linux-next.

Eric


> diff --git a/fs/proc/base.c b/fs/proc/base.c
> index 6b13fc4..a428536 100644
> --- a/fs/proc/base.c
> +++ b/fs/proc/base.c
> @@ -2680,14 +2680,17 @@ static ssize_t proc_pid_attr_write(struct file * file, const char __user * buf,
>  	}
>  
>  	/* Guard against adverse ptrace interaction */
> -	rv = mutex_lock_interruptible(&current->signal->cred_guard_mutex);
> +	rv = mutex_lock_interruptible(&current->signal->exec_guard_mutex);
>  	if (rv < 0)
>  		goto out_free;
>  
> -	rv = security_setprocattr(PROC_I(inode)->op.lsm,
> -				  file->f_path.dentry->d_name.name, page,
> -				  count);
> -	mutex_unlock(&current->signal->cred_guard_mutex);
> +	if (unlikely(current->signal->unsafe_execve_in_progress))
> +		rv = -EAGAIN;
> +	else
> +		rv = security_setprocattr(PROC_I(inode)->op.lsm,
> +					  file->f_path.dentry->d_name.name,
> +					  page, count);
> +	mutex_unlock(&current->signal->exec_guard_mutex);
>  out_free:
>  	kfree(page);
>  out:

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v6 14/16] pidfd: Use new infrastructure to fix deadlocks in execve
  2020-03-21  2:46 ` [PATCH v6 14/16] pidfd: " Bernd Edlinger
@ 2020-03-25 15:40   ` Christian Brauner
  0 siblings, 0 replies; 23+ messages in thread
From: Christian Brauner @ 2020-03-25 15:40 UTC (permalink / raw)
  To: Bernd Edlinger
  Cc: gregkh, Kirill Tkhai, Eric W. Biederman, Kees Cook, jannh,
	Jonathan Corbet, Alexander Viro, Andrew Morton, adobriyan,
	Thomas Gleixner, Oleg Nesterov, Frederic Weisbecker, avagin,
	Ingo Molnar, Peter Zijlstra (Intel),
	duyuyang, David Hildenbrand, Sebastian Andrzej Siewior,
	Anshuman Khandual, David Howells, James Morris, Shakeel Butt,
	Jason Gunthorpe, christian, Andrea Arcangeli, Aleksa Sarai,
	Dmitry V. Levin, linux-doc, linux-kernel, linux-fsdevel,
	linux-mm, stable, linux-api

On Sat, Mar 21, 2020 at 02:46:16AM +0000, Bernd Edlinger wrote:
> This changes __pidfd_fget to use the new exec_update_mutex
> instead of cred_guard_mutex.
> 
> This should be safe, as the credentials do not change
> before exec_update_mutex is locked.  Therefore whatever
> file access is possible with holding the cred_guard_mutex
> here is also possbile with the exec_update_mutex.
> 
> Signed-off-by: Bernd Edlinger <bernd.edlinger@hotmail.de>

Reviewed-by: Christian Brauner <christian.brauner@ubuntu.com>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v6 10/16] kernel/kcmp.c: Use new infrastructure to fix deadlocks in execve
  2020-03-20 20:27 ` [PATCH v6 10/16] kernel/kcmp.c: Use new infrastructure to fix deadlocks in execve Bernd Edlinger
@ 2020-03-25 15:41   ` Christian Brauner
  0 siblings, 0 replies; 23+ messages in thread
From: Christian Brauner @ 2020-03-25 15:41 UTC (permalink / raw)
  To: Bernd Edlinger
  Cc: Greg Kroah-Hartman, Kirill Tkhai, Eric W. Biederman, Kees Cook,
	Jann Horn, Jonathan Corbet, Alexander Viro, Andrew Morton,
	Alexey Dobriyan, Thomas Gleixner, Oleg Nesterov,
	Frederic Weisbecker, Andrei Vagin, Ingo Molnar,
	Peter Zijlstra (Intel),
	Yuyang Du, David Hildenbrand, Sebastian Andrzej Siewior,
	Anshuman Khandual, David Howells, James Morris, Shakeel Butt,
	Jason Gunthorpe, Christian Kellner, Andrea Arcangeli,
	Aleksa Sarai, Dmitry V. Levin, linux-doc, linux-kernel,
	linux-fsdevel, linux-mm, stable, linux-api

On Fri, Mar 20, 2020 at 09:27:05PM +0100, Bernd Edlinger wrote:
> This changes kcmp_epoll_target to use the new exec_update_mutex
> instead of cred_guard_mutex.
> 
> This should be safe, as the credentials are only used for reading,
> and furthermore ->mm and ->sighand are updated on execve,
> but only under the new exec_update_mutex.
> 
> Signed-off-by: Bernd Edlinger <bernd.edlinger@hotmail.de>

Acked-by: Christian Brauner <christian.brauner@ubuntu.com>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v6 15/16] exec: Fix dead-lock in de_thread with ptrace_attach
  2020-03-25 14:27   ` Eric W. Biederman
@ 2020-03-29  4:31     ` Bernd Edlinger
  2020-03-29  6:36       ` Bernd Edlinger
  0 siblings, 1 reply; 23+ messages in thread
From: Bernd Edlinger @ 2020-03-29  4:31 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: gregkh, Kirill Tkhai, Christian Brauner, Kees Cook, jannh,
	Jonathan Corbet, Alexander Viro, Andrew Morton, adobriyan,
	Thomas Gleixner, Oleg Nesterov, Frederic Weisbecker, avagin,
	Ingo Molnar, Peter Zijlstra (Intel),
	duyuyang, David Hildenbrand, Sebastian Andrzej Siewior,
	Anshuman Khandual, David Howells, James Morris, Shakeel Butt,
	Jason Gunthorpe, christian, Andrea Arcangeli, Aleksa Sarai,
	Dmitry V. Levin, linux-doc, linux-kernel, linux-fsdevel,
	linux-mm, stable, linux-api

On 3/25/20 3:27 PM, Eric W. Biederman wrote:
> Bernd Edlinger <bernd.edlinger@hotmail.de> writes:
> 
>> This removes the last users of cred_guard_mutex
>> and replaces it with a new mutex exec_guard_mutex,
>> and a boolean unsafe_execve_in_progress.
>>
>> This addresses the case when at least one of the
>> sibling threads is traced, and therefore the trace
>> process may dead-lock in ptrace_attach, but de_thread
>> will need to wait for the tracer to continue execution.
>>
>> The solution is to detect this situation and make
>> ptrace_attach and similar functions return -EAGAIN,
>> but only in a situation where a dead-lock is imminent.
>>
>> This means this is an API change, but only when the
>> process is traced while execve happens in a
>> multi-threaded application.
>>
>> See tools/testing/selftests/ptrace/vmaccess.c
>> for a test case that gets fixed by this change.
> 
> Hmm.  The logic with unsafe_execve_in_progress is interesting.
> I think I see what you are aiming for.
> 
> So far as you have hit what you are aiming for I think this is
> a safe change as the only cases that will change are the cases
> that would deadlock today.
> 
> At a minimum the code is subtle and I don't see big fat
> warning comments that subtle code needs to keep people
> from using it wrong.
> 

Okay, I can add big fat warning comments, yeah.

> Further while the change below to proc_pid_attr_write looks
> like it is being treated the same as ptrace_attach.  When in
> fact proc_pid_attr_write needs the no_new_privs and ptrace_attach
> protection the same as exec.  As the updated cred won't be used in an
> ongoing exec, exec does not need protection from proc_pid_attr_write,
> other than deadlock protection.
> 

Not sure I understand this comment correct.
You refer to this block here:

> @@ -2680,14 +2680,17 @@ static ssize_t proc_pid_attr_write(struct file * file, const char __user * buf,
>         }
> 
>         /* Guard against adverse ptrace interaction */
> -       rv = mutex_lock_interruptible(&current->signal->cred_guard_mutex);
> +       rv = mutex_lock_interruptible(&current->signal->exec_guard_mutex);
>         if (rv < 0)
>                 goto out_free;
> 
> -       rv = security_setprocattr(PROC_I(inode)->op.lsm,
> -                                 file->f_path.dentry->d_name.name, page,
> -                                 count);
> -       mutex_unlock(&current->signal->cred_guard_mutex);
> +       if (unlikely(current->signal->unsafe_execve_in_progress))
> +               rv = -EAGAIN;
> +       else
> +               rv = security_setprocattr(PROC_I(inode)->op.lsm,
> +                                         file->f_path.dentry->d_name.name,
> +                                         page, count);
> +       mutex_unlock(&current->signal->exec_guard_mutex);
>  out_free:
>         kfree(page);

I think the logic is correct, but instead if an if-then-else,
I need the big-fat-warning-comment followed by if-unsafe-goto-mutex-unlock
kind of thing, so it looks more like the other places, right?


> Having the relevant lock be per task_struct lock would probably be a
> better way to avoid deadlock with a concurrent proc_pid_attr_write.
> 

Please elaborate your idea a bit.

> 
> So I am going to pass on these last two patches for now, and apply the
> rest and get them into linux-next.
> 

No problem, I can update this patch and if you like take it to your tree,
otherwise it is of course not the most important issue in the world ;-)


Thanks
Bernd.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v6 15/16] exec: Fix dead-lock in de_thread with ptrace_attach
  2020-03-29  4:31     ` Bernd Edlinger
@ 2020-03-29  6:36       ` Bernd Edlinger
  2020-03-30 18:26         ` [PATCH v7 " Bernd Edlinger
  0 siblings, 1 reply; 23+ messages in thread
From: Bernd Edlinger @ 2020-03-29  6:36 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: gregkh, Kirill Tkhai, Christian Brauner, Kees Cook, jannh,
	Jonathan Corbet, Alexander Viro, Andrew Morton, adobriyan,
	Thomas Gleixner, Oleg Nesterov, Frederic Weisbecker, avagin,
	Ingo Molnar, Peter Zijlstra (Intel),
	duyuyang, David Hildenbrand, Sebastian Andrzej Siewior,
	Anshuman Khandual, David Howells, James Morris, Shakeel Butt,
	Jason Gunthorpe, christian, Andrea Arcangeli, Aleksa Sarai,
	Dmitry V. Levin, linux-doc, linux-kernel, linux-fsdevel,
	linux-mm, stable, linux-api

On 3/29/20 6:31 AM, Bernd Edlinger wrote:
> On 3/25/20 3:27 PM, Eric W. Biederman wrote:
>> Bernd Edlinger <bernd.edlinger@hotmail.de> writes:
>> At a minimum the code is subtle and I don't see big fat
>> warning comments that subtle code needs to keep people
>> from using it wrong.
>>
> 
> Okay, I can add big fat warning comments, yeah.
> 

So how about that:

diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index 221759e..2d6b5cd 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -395,6 +395,17 @@ static int ptrace_attach(struct task_struct *task, long req
        if (mutex_lock_interruptible(&task->signal->exec_guard_mutex))
                goto out;
 
+       /*
+        * BIG FAT WARNING - Fragile code ahead.
+        * Please do not insert any code between these two
+        * if statements.  It may happen that execve has to
+        * release the exec_guard_mutex in order to prevent
+        * deadlocks.  In that case unsafe_execve_in_progress
+        * will be set.  If that happens you cannot assume that
+        * the usual guarantees implied by exec_guard_mutex
+        * are valid.  Just return -EAGAIN in that case and
+        * unlock the mutex immediately.
+        */
        retval = -EAGAIN;
        if (unlikely(task->signal->unsafe_execve_in_progress))
                goto unlock_creds;

Is that cool enough :-)


Thanks
Bernd.

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v7 15/16] exec: Fix dead-lock in de_thread with ptrace_attach
  2020-03-29  6:36       ` Bernd Edlinger
@ 2020-03-30 18:26         ` Bernd Edlinger
  0 siblings, 0 replies; 23+ messages in thread
From: Bernd Edlinger @ 2020-03-30 18:26 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: gregkh, Kirill Tkhai, Christian Brauner, Kees Cook, jannh,
	Jonathan Corbet, Alexander Viro, Andrew Morton, adobriyan,
	Thomas Gleixner, Oleg Nesterov, Frederic Weisbecker, avagin,
	Ingo Molnar, Peter Zijlstra (Intel),
	duyuyang, David Hildenbrand, Sebastian Andrzej Siewior,
	Anshuman Khandual, David Howells, James Morris, Shakeel Butt,
	Jason Gunthorpe, christian, Andrea Arcangeli, Aleksa Sarai,
	Dmitry V. Levin, linux-doc, linux-kernel, linux-fsdevel,
	linux-mm, stable, linux-api

This removes the last users of cred_guard_mutex
and replaces it with a new mutex exec_guard_mutex,
and a boolean unsafe_execve_in_progress.

This addresses the case when at least one of the
sibling threads is traced, and therefore the trace
process may dead-lock in ptrace_attach, but de_thread
will need to wait for the tracer to continue execution.

The solution is to detect this situation and make
ptrace_attach and similar functions return -EAGAIN,
but only in a situation where a dead-lock is imminent.

This means this is an API change, but only when the
process is traced while execve happens in a
multi-threaded application.

See tools/testing/selftests/ptrace/vmaccess.c
for a test case that gets fixed by this change.

Signed-off-by: Bernd Edlinger <bernd.edlinger@hotmail.de>
---
 fs/exec.c                    | 44 +++++++++++++++++++++++++++++++++++---------
 fs/proc/base.c               | 20 ++++++++++++++++++--
 include/linux/sched/signal.h | 14 +++++++++-----
 init/init_task.c             |  2 +-
 kernel/cred.c                |  2 +-
 kernel/fork.c                |  2 +-
 kernel/ptrace.c              | 42 +++++++++++++++++++++++++++++++++++++++---
 kernel/seccomp.c             | 25 +++++++++++++++++++------
 8 files changed, 123 insertions(+), 28 deletions(-)

v7: Added "big fat" warning comments, made the change in
proc_pid_attr_write a bit more readable.

diff --git a/fs/exec.c b/fs/exec.c
index 0e46ec5..2056562 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1078,14 +1078,26 @@ static int de_thread(struct task_struct *tsk)
 	struct signal_struct *sig = tsk->signal;
 	struct sighand_struct *oldsighand = tsk->sighand;
 	spinlock_t *lock = &oldsighand->siglock;
+	struct task_struct *t = tsk;
 
 	if (thread_group_empty(tsk))
 		goto no_thread_group;
 
+	spin_lock_irq(lock);
+	while_each_thread(tsk, t) {
+		if (unlikely(t->ptrace))
+			sig->unsafe_execve_in_progress = true;
+	}
+
+	if (unlikely(sig->unsafe_execve_in_progress)) {
+		spin_unlock_irq(lock);
+		mutex_unlock(&sig->exec_guard_mutex);
+		spin_lock_irq(lock);
+	}
+
 	/*
 	 * Kill all other threads in the thread group.
 	 */
-	spin_lock_irq(lock);
 	if (signal_group_exit(sig)) {
 		/*
 		 * Another group action in progress, just
@@ -1429,22 +1441,30 @@ void finalize_exec(struct linux_binprm *bprm)
 EXPORT_SYMBOL(finalize_exec);
 
 /*
- * Prepare credentials and lock ->cred_guard_mutex.
+ * Prepare credentials and lock ->exec_guard_mutex.
  * install_exec_creds() commits the new creds and drops the lock.
  * Or, if exec fails before, free_bprm() should release ->cred and
  * and unlock.
  */
 static int prepare_bprm_creds(struct linux_binprm *bprm)
 {
-	if (mutex_lock_interruptible(&current->signal->cred_guard_mutex))
+	int ret;
+
+	if (mutex_lock_interruptible(&current->signal->exec_guard_mutex))
 		return -ERESTARTNOINTR;
 
+	ret = -EAGAIN;
+	if (unlikely(current->signal->unsafe_execve_in_progress))
+		goto out;
+
 	bprm->cred = prepare_exec_creds();
 	if (likely(bprm->cred))
 		return 0;
 
-	mutex_unlock(&current->signal->cred_guard_mutex);
-	return -ENOMEM;
+	ret = -ENOMEM;
+out:
+	mutex_unlock(&current->signal->exec_guard_mutex);
+	return ret;
 }
 
 static void free_bprm(struct linux_binprm *bprm)
@@ -1453,7 +1473,10 @@ static void free_bprm(struct linux_binprm *bprm)
 	if (bprm->cred) {
 		if (bprm->called_exec_mmap)
 			mutex_unlock(&current->signal->exec_update_mutex);
-		mutex_unlock(&current->signal->cred_guard_mutex);
+		if (unlikely(current->signal->unsafe_execve_in_progress))
+			mutex_lock(&current->signal->exec_guard_mutex);
+		current->signal->unsafe_execve_in_progress = false;
+		mutex_unlock(&current->signal->exec_guard_mutex);
 		abort_creds(bprm->cred);
 	}
 	if (bprm->file) {
@@ -1497,19 +1520,22 @@ void install_exec_creds(struct linux_binprm *bprm)
 	if (get_dumpable(current->mm) != SUID_DUMP_USER)
 		perf_event_exit_task(current);
 	/*
-	 * cred_guard_mutex must be held at least to this point to prevent
+	 * exec_guard_mutex must be held at least to this point to prevent
 	 * ptrace_attach() from altering our determination of the task's
 	 * credentials; any time after this it may be unlocked.
 	 */
 	security_bprm_committed_creds(bprm);
 	mutex_unlock(&current->signal->exec_update_mutex);
-	mutex_unlock(&current->signal->cred_guard_mutex);
+	if (unlikely(current->signal->unsafe_execve_in_progress))
+		mutex_lock(&current->signal->exec_guard_mutex);
+	current->signal->unsafe_execve_in_progress = false;
+	mutex_unlock(&current->signal->exec_guard_mutex);
 }
 EXPORT_SYMBOL(install_exec_creds);
 
 /*
  * determine how safe it is to execute the proposed program
- * - the caller must hold ->cred_guard_mutex to protect against
+ * - the caller must hold ->exec_guard_mutex to protect against
  *   PTRACE_ATTACH or seccomp thread-sync
  */
 static void check_unsafe_exec(struct linux_binprm *bprm)
diff --git a/fs/proc/base.c b/fs/proc/base.c
index 6b13fc4..eaca36e 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -2680,14 +2680,30 @@ static ssize_t proc_pid_attr_write(struct file * file, const char __user * buf,
 	}
 
 	/* Guard against adverse ptrace interaction */
-	rv = mutex_lock_interruptible(&current->signal->cred_guard_mutex);
+	rv = mutex_lock_interruptible(&current->signal->exec_guard_mutex);
 	if (rv < 0)
 		goto out_free;
 
+	/*
+	 * BIG FAT WARNING - Fragile code ahead.
+	 * Please do not insert any code between these two
+	 * if statements.  It may happen that execve has to
+	 * release the exec_guard_mutex in order to prevent
+	 * deadlocks.  In that case unsafe_execve_in_progress
+	 * will be set.  If that happens you cannot assume that
+	 * the usual guarantees implied by exec_guard_mutex
+	 * are valid.  Just return -EAGAIN in that case and
+	 * unlock the mutex immediately.
+	 */
+	rv = -EAGAIN;
+	if (unlikely(current->signal->unsafe_execve_in_progress))
+		goto out_unlock;
+
 	rv = security_setprocattr(PROC_I(inode)->op.lsm,
 				  file->f_path.dentry->d_name.name, page,
 				  count);
-	mutex_unlock(&current->signal->cred_guard_mutex);
+out_unlock:
+	mutex_unlock(&current->signal->exec_guard_mutex);
 out_free:
 	kfree(page);
 out:
diff --git a/include/linux/sched/signal.h b/include/linux/sched/signal.h
index a29df79..e83cef2 100644
--- a/include/linux/sched/signal.h
+++ b/include/linux/sched/signal.h
@@ -212,6 +212,13 @@ struct signal_struct {
 #endif
 
 	/*
+	 * Set while execve is executing but is *not* holding
+	 * exec_guard_mutex to avoid possible dead-locks.
+	 * Only valid when exec_guard_mutex is held.
+	 */
+	bool unsafe_execve_in_progress;
+
+	/*
 	 * Thread is the potential origin of an oom condition; kill first on
 	 * oom
 	 */
@@ -222,11 +229,8 @@ struct signal_struct {
 	struct mm_struct *oom_mm;	/* recorded mm when the thread group got
 					 * killed by the oom killer */
 
-	struct mutex cred_guard_mutex;	/* guard against foreign influences on
-					 * credential calculations
-					 * (notably. ptrace)
-					 * Deprecated do not use in new code.
-					 * Use exec_update_mutex instead.
+	struct mutex exec_guard_mutex;	/* Held while execve runs, except when
+					 * a sibling thread is being traced.
 					 */
 	struct mutex exec_update_mutex;	/* Held while task_struct is being
 					 * updated during exec, and may have
diff --git a/init/init_task.c b/init/init_task.c
index bd403ed..6f96327 100644
--- a/init/init_task.c
+++ b/init/init_task.c
@@ -25,7 +25,7 @@
 	},
 	.multiprocess	= HLIST_HEAD_INIT,
 	.rlim		= INIT_RLIMITS,
-	.cred_guard_mutex = __MUTEX_INITIALIZER(init_signals.cred_guard_mutex),
+	.exec_guard_mutex = __MUTEX_INITIALIZER(init_signals.exec_guard_mutex),
 	.exec_update_mutex = __MUTEX_INITIALIZER(init_signals.exec_update_mutex),
 #ifdef CONFIG_POSIX_TIMERS
 	.posix_timers = LIST_HEAD_INIT(init_signals.posix_timers),
diff --git a/kernel/cred.c b/kernel/cred.c
index 71a7926..341ca59 100644
--- a/kernel/cred.c
+++ b/kernel/cred.c
@@ -295,7 +295,7 @@ struct cred *prepare_creds(void)
 
 /*
  * Prepare credentials for current to perform an execve()
- * - The caller must hold ->cred_guard_mutex
+ * - The caller must hold ->exec_guard_mutex
  */
 struct cred *prepare_exec_creds(void)
 {
diff --git a/kernel/fork.c b/kernel/fork.c
index e23ccac..98012f7 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1593,7 +1593,7 @@ static int copy_signal(unsigned long clone_flags, struct task_struct *tsk)
 	sig->oom_score_adj = current->signal->oom_score_adj;
 	sig->oom_score_adj_min = current->signal->oom_score_adj_min;
 
-	mutex_init(&sig->cred_guard_mutex);
+	mutex_init(&sig->exec_guard_mutex);
 	mutex_init(&sig->exec_update_mutex);
 
 	return 0;
diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index 43d6179..19bf69f 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -392,9 +392,24 @@ static int ptrace_attach(struct task_struct *task, long request,
 	 * under ptrace.
 	 */
 	retval = -ERESTARTNOINTR;
-	if (mutex_lock_interruptible(&task->signal->cred_guard_mutex))
+	if (mutex_lock_interruptible(&task->signal->exec_guard_mutex))
 		goto out;
 
+	/*
+	 * BIG FAT WARNING - Fragile code ahead.
+	 * Please do not insert any code between these two
+	 * if statements.  It may happen that execve has to
+	 * release the exec_guard_mutex in order to prevent
+	 * deadlocks.  In that case unsafe_execve_in_progress
+	 * will be set.  If that happens you cannot assume that
+	 * the usual guarantees implied by exec_guard_mutex
+	 * are valid.  Just return -EAGAIN in that case and
+	 * unlock the mutex immediately.
+	 */
+	retval = -EAGAIN;
+	if (unlikely(task->signal->unsafe_execve_in_progress))
+		goto unlock_creds;
+
 	task_lock(task);
 	retval = __ptrace_may_access(task, PTRACE_MODE_ATTACH_REALCREDS);
 	task_unlock(task);
@@ -447,7 +462,7 @@ static int ptrace_attach(struct task_struct *task, long request,
 unlock_tasklist:
 	write_unlock_irq(&tasklist_lock);
 unlock_creds:
-	mutex_unlock(&task->signal->cred_guard_mutex);
+	mutex_unlock(&task->signal->exec_guard_mutex);
 out:
 	if (!retval) {
 		/*
@@ -472,10 +487,29 @@ static int ptrace_attach(struct task_struct *task, long request,
  */
 static int ptrace_traceme(void)
 {
-	int ret = -EPERM;
+	int ret;
+
+	if (mutex_lock_interruptible(&current->signal->exec_guard_mutex))
+		return -ERESTARTNOINTR;
+
+	/*
+	 * BIG FAT WARNING - Fragile code ahead.
+	 * Please do not insert any code between these two
+	 * if statements.  It may happen that execve has to
+	 * release the exec_guard_mutex in order to prevent
+	 * deadlocks.  In that case unsafe_execve_in_progress
+	 * will be set.  If that happens you cannot assume that
+	 * the usual guarantees implied by exec_guard_mutex
+	 * are valid.  Just return -EAGAIN in that case and
+	 * unlock the mutex immediately.
+	 */
+	ret = -EAGAIN;
+	if (unlikely(current->signal->unsafe_execve_in_progress))
+		goto unlock_creds;
 
 	write_lock_irq(&tasklist_lock);
 	/* Are we already being traced? */
+	ret = -EPERM;
 	if (!current->ptrace) {
 		ret = security_ptrace_traceme(current->parent);
 		/*
@@ -490,6 +524,8 @@ static int ptrace_traceme(void)
 	}
 	write_unlock_irq(&tasklist_lock);
 
+unlock_creds:
+	mutex_unlock(&current->signal->exec_guard_mutex);
 	return ret;
 }
 
diff --git a/kernel/seccomp.c b/kernel/seccomp.c
index b6ea3dc..7ebb194 100644
--- a/kernel/seccomp.c
+++ b/kernel/seccomp.c
@@ -329,7 +329,7 @@ static int is_ancestor(struct seccomp_filter *parent,
 /**
  * seccomp_can_sync_threads: checks if all threads can be synchronized
  *
- * Expects sighand and cred_guard_mutex locks to be held.
+ * Expects sighand and exec_guard_mutex locks to be held.
  *
  * Returns 0 on success, -ve on error, or the pid of a thread which was
  * either not in the correct seccomp mode or did not have an ancestral
@@ -339,9 +339,22 @@ static inline pid_t seccomp_can_sync_threads(void)
 {
 	struct task_struct *thread, *caller;
 
-	BUG_ON(!mutex_is_locked(&current->signal->cred_guard_mutex));
+	BUG_ON(!mutex_is_locked(&current->signal->exec_guard_mutex));
 	assert_spin_locked(&current->sighand->siglock);
 
+	/*
+	 * BIG FAT WARNING - Fragile code ahead.
+	 * It may happen that execve has to release the
+	 * exec_guard_mutex in order to prevent deadlocks.
+	 * In that case unsafe_execve_in_progress will be set.
+	 * If that happens you cannot assume that the usual
+	 * guarantees implied by exec_guard_mutex are valid.
+	 * Just return -EAGAIN in that case and unlock the mutex
+	 * immediately.
+	 */
+	if (unlikely(current->signal->unsafe_execve_in_progress))
+		return -EAGAIN;
+
 	/* Validate all threads being eligible for synchronization. */
 	caller = current;
 	for_each_thread(caller, thread) {
@@ -371,7 +384,7 @@ static inline pid_t seccomp_can_sync_threads(void)
 /**
  * seccomp_sync_threads: sets all threads to use current's filter
  *
- * Expects sighand and cred_guard_mutex locks to be held, and for
+ * Expects sighand and exec_guard_mutex locks to be held, and for
  * seccomp_can_sync_threads() to have returned success already
  * without dropping the locks.
  *
@@ -380,7 +393,7 @@ static inline void seccomp_sync_threads(unsigned long flags)
 {
 	struct task_struct *thread, *caller;
 
-	BUG_ON(!mutex_is_locked(&current->signal->cred_guard_mutex));
+	BUG_ON(!mutex_is_locked(&current->signal->exec_guard_mutex));
 	assert_spin_locked(&current->sighand->siglock);
 
 	/* Synchronize all threads. */
@@ -1319,7 +1332,7 @@ static long seccomp_set_mode_filter(unsigned int flags,
 	 * while another thread is in the middle of calling exec.
 	 */
 	if (flags & SECCOMP_FILTER_FLAG_TSYNC &&
-	    mutex_lock_killable(&current->signal->cred_guard_mutex))
+	    mutex_lock_killable(&current->signal->exec_guard_mutex))
 		goto out_put_fd;
 
 	spin_lock_irq(&current->sighand->siglock);
@@ -1337,7 +1350,7 @@ static long seccomp_set_mode_filter(unsigned int flags,
 out:
 	spin_unlock_irq(&current->sighand->siglock);
 	if (flags & SECCOMP_FILTER_FLAG_TSYNC)
-		mutex_unlock(&current->signal->cred_guard_mutex);
+		mutex_unlock(&current->signal->exec_guard_mutex);
 out_put_fd:
 	if (flags & SECCOMP_FILTER_FLAG_NEW_LISTENER) {
 		if (ret) {
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2020-03-30 18:26 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <077b63b7-6f5e-aa8e-bf96-a586b481cc46@hotmail.de>
2020-03-20 20:24 ` [PATCH v6 01/16] exec: Only compute current once in flush_old_exec Bernd Edlinger
2020-03-20 20:24 ` [PATCH v6 02/16] exec: Factor unshare_sighand out of de_thread and call it separately Bernd Edlinger
2020-03-20 20:25 ` [PATCH v6 03/16] exec: Move cleanup of posix timers on exec out of de_thread Bernd Edlinger
2020-03-20 20:25 ` [PATCH v6 04/16] exec: Move exec_mmap right after de_thread in flush_old_exec Bernd Edlinger
2020-03-20 20:25 ` [PATCH v6 05/16] exec: Add exec_update_mutex to replace cred_guard_mutex Bernd Edlinger
2020-03-23 10:51   ` Kirill Tkhai
2020-03-20 20:26 ` [PATCH v6 06/16] exec: Fix a deadlock in strace Bernd Edlinger
2020-03-20 20:26 ` [PATCH v6 07/16] selftests/ptrace: add test cases for dead-locks Bernd Edlinger
2020-03-20 20:26 ` [PATCH v6 08/16] mm: docs: Fix a comment in process_vm_rw_core Bernd Edlinger
2020-03-20 20:26 ` [PATCH v6 09/16] kernel: doc: remove outdated comment cred.c Bernd Edlinger
2020-03-20 20:27 ` [PATCH v6 10/16] kernel/kcmp.c: Use new infrastructure to fix deadlocks in execve Bernd Edlinger
2020-03-25 15:41   ` Christian Brauner
2020-03-20 20:27 ` [PATCH v6 11/16] proc: " Bernd Edlinger
2020-03-20 20:27 ` [PATCH v6 12/16] proc: io_accounting: " Bernd Edlinger
2020-03-20 20:27 ` [PATCH v6 13/16] perf: " Bernd Edlinger
2020-03-21  2:46 ` [PATCH v6 14/16] pidfd: " Bernd Edlinger
2020-03-25 15:40   ` Christian Brauner
2020-03-21  2:46 ` [PATCH v6 15/16] exec: Fix dead-lock in de_thread with ptrace_attach Bernd Edlinger
2020-03-25 14:27   ` Eric W. Biederman
2020-03-29  4:31     ` Bernd Edlinger
2020-03-29  6:36       ` Bernd Edlinger
2020-03-30 18:26         ` [PATCH v7 " Bernd Edlinger
2020-03-21  2:47 ` [PATCH v6 16/16] doc: Update documentation of ->exec_*_mutex Bernd Edlinger

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).