From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1751507AbdDBW4j (ORCPT <rfc822;w@1wt.eu>);
        Sun, 2 Apr 2017 18:56:39 -0400
Received: from out02.mta.xmission.com ([166.70.13.232]:53657 "EHLO
        out02.mta.xmission.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1750983AbdDBW4h (ORCPT
        <rfc822;linux-kernel@vger.kernel.org>);
        Sun, 2 Apr 2017 18:56:37 -0400
From: ebiederm@xmission.com (Eric W. Biederman)
To: Oleg Nesterov <oleg@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
        Aleksa Sarai <asarai@suse.com>, Andy Lutomirski <luto@amacapital.net>,
        Attila Fazekas <afazekas@redhat.com>, Jann Horn <jann@thejh.net>,
        Kees Cook <keescook@chromium.org>, Michal Hocko <mhocko@kernel.org>,
        Ulrich Obergfell <uobergfe@redhat.com>, linux-kernel@vger.kernel.org,
        linux-api@vger.kernel.org
References: <20170213141452.GA30203@redhat.com>
        <20170224160354.GA845@redhat.com> <87shmv6ufl.fsf@xmission.com>
        <20170303173326.GA17899@redhat.com> <87tw7axlr0.fsf@xmission.com>
        <87d1dyw5iw.fsf@xmission.com> <87tw7aunuh.fsf@xmission.com>
        <87lgsmunmj.fsf_-_@xmission.com> <20170304170312.GB13131@redhat.com>
        <8760ir192p.fsf@xmission.com> <878tnkpv8h.fsf_-_@xmission.com>
        <874ly6a0h1.fsf_-_@xmission.com>
Date: Sun, 02 Apr 2017 17:51:16 -0500
In-Reply-To: <874ly6a0h1.fsf_-_@xmission.com> (Eric W. Biederman's message of
        "Sun, 02 Apr 2017 17:50:02 -0500")
Message-ID: <87wpb28luj.fsf_-_@xmission.com>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.1 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain
X-XM-SPF: eid=1cuoQA-0001Bh-Ow;;;mid=<87wpb28luj.fsf_-_@xmission.com>;;;hst=in01.mta.xmission.com;;;ip=67.3.234.240;;;frm=ebiederm@xmission.com;;;spf=neutral
X-XM-AID: U2FsdGVkX1/IDxI38GwLxzrSPrXCbP7nAKX0ZSvO0Zo=
X-SA-Exim-Connect-IP: 67.3.234.240
X-SA-Exim-Mail-From: ebiederm@xmission.com
X-Spam-Report: * -1.0 ALL_TRUSTED Passed through trusted hosts only via SMTP
        *  0.0 TVD_RCVD_IP Message was received from an IP address
        *  1.5 TR_Symld_Words too many words that have symbols inside
        *  0.7 XMSubLong Long Subject
        *  0.8 BAYES_50 BODY: Bayes spam probability is 40 to 60%
        *      [score: 0.5000]
        * -0.0 DCC_CHECK_NEGATIVE Not listed in DCC
        *      [sa06 1397; Body=1 Fuz1=1 Fuz2=1]
        *  0.0 T_TooManySym_02 5+ unique symbols in subject
        *  1.0 T_XMHurry_00 Hurry and Do Something
        *  0.0 T_TooManySym_01 4+ unique symbols in subject
X-Spam-DCC: XMission; sa06 1397; Body=1 Fuz1=1 Fuz2=1 
X-Spam-Combo: ***;Oleg Nesterov <oleg@redhat.com>
X-Spam-Relay-Country: 
X-Spam-Timing: total 5541 ms - load_scoreonly_sql: 0.05 (0.0%),
        signal_user_changed: 2.8 (0.0%), b_tie_ro: 1.86 (0.0%), parse: 0.92 (0.0%),
        extract_message_metadata: 13 (0.2%), get_uri_detail_list: 2.8 (0.1%),
        tests_pri_-1000: 6 (0.1%), tests_pri_-950: 1.16 (0.0%), tests_pri_-900: 0.97
        (0.0%), tests_pri_-400: 29 (0.5%), check_bayes: 28 (0.5%), b_tokenize: 11
        (0.2%), b_tok_get_all: 10 (0.2%), b_comp_prob: 2.6 (0.0%), b_tok_touch_all:
        2.9 (0.1%), b_finish: 0.56 (0.0%), tests_pri_0: 313 (5.6%),
        check_dkim_signature: 0.56 (0.0%), check_dkim_adsp: 2.5 (0.0%),
        tests_pri_500: 5172 (93.3%), poll_dns_idle: 5166 (93.2%), rewrite_mail: 0.00
        (0.0%)
Subject: [RFC][PATCH v2 1/5] ptrace: Don't wait in PTRACE_O_TRACEEXIT for exec or coredump
X-Spam-Flag: No
X-SA-Exim-Version: 4.2.1 (built Thu, 05 May 2016 13:38:54 -0600)
X-SA-Exim-Scanned: Yes (on in01.mta.xmission.com)
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org


Take advantage of the fact that no ptrace stop will wait for
a debugger if another process sends SIGKILL to the waiting task.

In the case of exec and coredump which have many interesting deadlock
opportunities and no one tests the what happens if you are ptraced
during exec or coredump act like another SIGKILL was immediately
sent to the process and don't stop.

Keep sending the signal to the tracer so that this appears like
the worst case where someone else sent the process a SIGKILL before
the tracer could react.  So all non-buggy tracers must support
this case.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
---
 kernel/signal.c | 103 +++++++++++++++++++++++---------------------------------
 1 file changed, 42 insertions(+), 61 deletions(-)

diff --git a/kernel/signal.c b/kernel/signal.c
index 7e59ebc2c25e..11fa736eb2ae 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -1740,30 +1740,6 @@ static void do_notify_parent_cldstop(struct task_struct *tsk,
 	spin_unlock_irqrestore(&sighand->siglock, flags);
 }
 
-static inline int may_ptrace_stop(void)
-{
-	if (!likely(current->ptrace))
-		return 0;
-	/*
-	 * Are we in the middle of do_coredump?
-	 * If so and our tracer is also part of the coredump stopping
-	 * is a deadlock situation, and pointless because our tracer
-	 * is dead so don't allow us to stop.
-	 * If SIGKILL was already sent before the caller unlocked
-	 * ->siglock we must see ->core_state != NULL. Otherwise it
-	 * is safe to enter schedule().
-	 *
-	 * This is almost outdated, a task with the pending SIGKILL can't
-	 * block in TASK_TRACED. But PTRACE_EVENT_EXIT can be reported
-	 * after SIGKILL was already dequeued.
-	 */
-	if (unlikely(current->mm->core_state) &&
-	    unlikely(current->mm == current->parent->mm))
-		return 0;
-
-	return 1;
-}
-
 /*
  * Return non-zero if there is a SIGKILL that should be waking us up.
  * Called with the siglock held.
@@ -1789,6 +1765,8 @@ static void ptrace_stop(int exit_code, int why, int clear_code, siginfo_t *info)
 	__releases(&current->sighand->siglock)
 	__acquires(&current->sighand->siglock)
 {
+	bool ptrace_parent;
+	bool ptrace_wait;
 	bool gstop_done = false;
 
 	if (arch_ptrace_stop_needed(exit_code, info)) {
@@ -1842,53 +1820,56 @@ static void ptrace_stop(int exit_code, int why, int clear_code, siginfo_t *info)
 
 	spin_unlock_irq(&current->sighand->siglock);
 	read_lock(&tasklist_lock);
-	if (may_ptrace_stop()) {
-		/*
-		 * Notify parents of the stop.
-		 *
-		 * While ptraced, there are two parents - the ptracer and
-		 * the real_parent of the group_leader.  The ptracer should
-		 * know about every stop while the real parent is only
-		 * interested in the completion of group stop.  The states
-		 * for the two don't interact with each other.  Notify
-		 * separately unless they're gonna be duplicates.
-		 */
+	ptrace_parent = likely(current->ptrace);
+	/*
+	 * Notify parents of the stop.
+	 *
+	 * While ptraced, there are two parents - the ptracer and
+	 * the real_parent of the group_leader.  The ptracer should
+	 * know about every stop while the real parent is only
+	 * interested in the completion of group stop.  The states
+	 * for the two don't interact with each other.  Notify
+	 * separately unless they're gonna be duplicates.
+	 */
+	if (ptrace_parent)
 		do_notify_parent_cldstop(current, true, why);
-		if (gstop_done && ptrace_reparented(current))
-			do_notify_parent_cldstop(current, false, why);
+	if (gstop_done && (ptrace_reparented(current) || !ptrace_parent))
+		do_notify_parent_cldstop(current, false, why);
 
-		/*
-		 * Don't want to allow preemption here, because
-		 * sys_ptrace() needs this task to be inactive.
-		 *
-		 * XXX: implement read_unlock_no_resched().
-		 */
-		preempt_disable();
-		read_unlock(&tasklist_lock);
-		preempt_enable_no_resched();
-		freezable_schedule();
-	} else {
-		/*
-		 * By the time we got the lock, our tracer went away.
-		 * Don't drop the lock yet, another tracer may come.
-		 *
-		 * If @gstop_done, the ptracer went away between group stop
-		 * completion and here.  During detach, it would have set
-		 * JOBCTL_STOP_PENDING on us and we'll re-enter
-		 * TASK_STOPPED in do_signal_stop() on return, so notifying
-		 * the real parent of the group stop completion is enough.
-		 */
-		if (gstop_done)
-			do_notify_parent_cldstop(current, false, why);
+	/*
+	 * Always wait for the traceer if the process is traced unless
+	 * the process is in the middle of an exec or core dump.
+	 *
+	 * In exec we risk a deadlock with de_thread waiting for the
+	 * thread to be killed, the tracer, and acquring cred_guard_mutex.
+	 *
+	 * In coredump we risk a deadlock if the tracer is also part
+	 * of the coredump stopping.
+	 */
+	ptrace_wait = ptrace_parent &&
+		!current->signal->group_exit_task &&
+		!current->mm->core_state;
 
+	if (!ptrace_wait) {
 		/* tasklist protects us from ptrace_freeze_traced() */
 		__set_current_state(TASK_RUNNING);
 		if (clear_code)
 			current->exit_code = 0;
-		read_unlock(&tasklist_lock);
 	}
 
 	/*
+	 * Don't want to allow preemption here, because
+	 * sys_ptrace() needs this task to be inactive.
+	 *
+	 * XXX: implement read_unlock_no_resched().
+	 */
+	preempt_disable();
+	read_unlock(&tasklist_lock);
+	preempt_enable_no_resched();
+	if (ptrace_wait)
+		freezable_schedule();
+
+	/*
 	 * We are back.  Now reacquire the siglock before touching
 	 * last_siginfo, so that we are sure to have synchronized with
 	 * any signal-sending on another CPU that wants to examine it.
-- 
2.10.1

From mboxrd@z Thu Jan  1 00:00:00 1970
From: ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org (Eric W. Biederman)
Subject: [RFC][PATCH v2 1/5] ptrace: Don't wait in PTRACE_O_TRACEEXIT for exec or coredump
Date: Sun, 02 Apr 2017 17:51:16 -0500
Message-ID: <87wpb28luj.fsf_-_@xmission.com>
References: <20170213141452.GA30203@redhat.com>
        <20170224160354.GA845@redhat.com> <87shmv6ufl.fsf@xmission.com>
        <20170303173326.GA17899@redhat.com> <87tw7axlr0.fsf@xmission.com>
        <87d1dyw5iw.fsf@xmission.com> <87tw7aunuh.fsf@xmission.com>
        <87lgsmunmj.fsf_-_@xmission.com> <20170304170312.GB13131@redhat.com>
        <8760ir192p.fsf@xmission.com> <878tnkpv8h.fsf_-_@xmission.com>
        <874ly6a0h1.fsf_-_@xmission.com>
Mime-Version: 1.0
Content-Type: text/plain
Return-path: <linux-api-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>
In-Reply-To: <874ly6a0h1.fsf_-_-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org> (Eric W. Biederman's message of
        "Sun, 02 Apr 2017 17:50:02 -0500")
Sender: linux-api-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
To: Oleg Nesterov <oleg-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Cc: Andrew Morton <akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>, Aleksa Sarai <asarai-IBi9RG/b67k@public.gmane.org>, Andy Lutomirski <luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org>, Attila Fazekas <afazekas-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>, Jann Horn <jann-XZ1E9jl8jIdeoWH0uzbU5w@public.gmane.org>, Kees Cook <keescook-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>, Michal Hocko <mhocko-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>, Ulrich Obergfell <uobergfe-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>, linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, linux-api-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
List-Id: linux-api@vger.kernel.org


Take advantage of the fact that no ptrace stop will wait for
a debugger if another process sends SIGKILL to the waiting task.

In the case of exec and coredump which have many interesting deadlock
opportunities and no one tests the what happens if you are ptraced
during exec or coredump act like another SIGKILL was immediately
sent to the process and don't stop.

Keep sending the signal to the tracer so that this appears like
the worst case where someone else sent the process a SIGKILL before
the tracer could react.  So all non-buggy tracers must support
this case.

Signed-off-by: Eric W. Biederman <ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
---
 kernel/signal.c | 103 +++++++++++++++++++++++---------------------------------
 1 file changed, 42 insertions(+), 61 deletions(-)

diff --git a/kernel/signal.c b/kernel/signal.c
index 7e59ebc2c25e..11fa736eb2ae 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -1740,30 +1740,6 @@ static void do_notify_parent_cldstop(struct task_struct *tsk,
 	spin_unlock_irqrestore(&sighand->siglock, flags);
 }
 
-static inline int may_ptrace_stop(void)
-{
-	if (!likely(current->ptrace))
-		return 0;
-	/*
-	 * Are we in the middle of do_coredump?
-	 * If so and our tracer is also part of the coredump stopping
-	 * is a deadlock situation, and pointless because our tracer
-	 * is dead so don't allow us to stop.
-	 * If SIGKILL was already sent before the caller unlocked
-	 * ->siglock we must see ->core_state != NULL. Otherwise it
-	 * is safe to enter schedule().
-	 *
-	 * This is almost outdated, a task with the pending SIGKILL can't
-	 * block in TASK_TRACED. But PTRACE_EVENT_EXIT can be reported
-	 * after SIGKILL was already dequeued.
-	 */
-	if (unlikely(current->mm->core_state) &&
-	    unlikely(current->mm == current->parent->mm))
-		return 0;
-
-	return 1;
-}
-
 /*
  * Return non-zero if there is a SIGKILL that should be waking us up.
  * Called with the siglock held.
@@ -1789,6 +1765,8 @@ static void ptrace_stop(int exit_code, int why, int clear_code, siginfo_t *info)
 	__releases(&current->sighand->siglock)
 	__acquires(&current->sighand->siglock)
 {
+	bool ptrace_parent;
+	bool ptrace_wait;
 	bool gstop_done = false;
 
 	if (arch_ptrace_stop_needed(exit_code, info)) {
@@ -1842,53 +1820,56 @@ static void ptrace_stop(int exit_code, int why, int clear_code, siginfo_t *info)
 
 	spin_unlock_irq(&current->sighand->siglock);
 	read_lock(&tasklist_lock);
-	if (may_ptrace_stop()) {
-		/*
-		 * Notify parents of the stop.
-		 *
-		 * While ptraced, there are two parents - the ptracer and
-		 * the real_parent of the group_leader.  The ptracer should
-		 * know about every stop while the real parent is only
-		 * interested in the completion of group stop.  The states
-		 * for the two don't interact with each other.  Notify
-		 * separately unless they're gonna be duplicates.
-		 */
+	ptrace_parent = likely(current->ptrace);
+	/*
+	 * Notify parents of the stop.
+	 *
+	 * While ptraced, there are two parents - the ptracer and
+	 * the real_parent of the group_leader.  The ptracer should
+	 * know about every stop while the real parent is only
+	 * interested in the completion of group stop.  The states
+	 * for the two don't interact with each other.  Notify
+	 * separately unless they're gonna be duplicates.
+	 */
+	if (ptrace_parent)
 		do_notify_parent_cldstop(current, true, why);
-		if (gstop_done && ptrace_reparented(current))
-			do_notify_parent_cldstop(current, false, why);
+	if (gstop_done && (ptrace_reparented(current) || !ptrace_parent))
+		do_notify_parent_cldstop(current, false, why);
 
-		/*
-		 * Don't want to allow preemption here, because
-		 * sys_ptrace() needs this task to be inactive.
-		 *
-		 * XXX: implement read_unlock_no_resched().
-		 */
-		preempt_disable();
-		read_unlock(&tasklist_lock);
-		preempt_enable_no_resched();
-		freezable_schedule();
-	} else {
-		/*
-		 * By the time we got the lock, our tracer went away.
-		 * Don't drop the lock yet, another tracer may come.
-		 *
-		 * If @gstop_done, the ptracer went away between group stop
-		 * completion and here.  During detach, it would have set
-		 * JOBCTL_STOP_PENDING on us and we'll re-enter
-		 * TASK_STOPPED in do_signal_stop() on return, so notifying
-		 * the real parent of the group stop completion is enough.
-		 */
-		if (gstop_done)
-			do_notify_parent_cldstop(current, false, why);
+	/*
+	 * Always wait for the traceer if the process is traced unless
+	 * the process is in the middle of an exec or core dump.
+	 *
+	 * In exec we risk a deadlock with de_thread waiting for the
+	 * thread to be killed, the tracer, and acquring cred_guard_mutex.
+	 *
+	 * In coredump we risk a deadlock if the tracer is also part
+	 * of the coredump stopping.
+	 */
+	ptrace_wait = ptrace_parent &&
+		!current->signal->group_exit_task &&
+		!current->mm->core_state;
 
+	if (!ptrace_wait) {
 		/* tasklist protects us from ptrace_freeze_traced() */
 		__set_current_state(TASK_RUNNING);
 		if (clear_code)
 			current->exit_code = 0;
-		read_unlock(&tasklist_lock);
 	}
 
 	/*
+	 * Don't want to allow preemption here, because
+	 * sys_ptrace() needs this task to be inactive.
+	 *
+	 * XXX: implement read_unlock_no_resched().
+	 */
+	preempt_disable();
+	read_unlock(&tasklist_lock);
+	preempt_enable_no_resched();
+	if (ptrace_wait)
+		freezable_schedule();
+
+	/*
 	 * We are back.  Now reacquire the siglock before touching
 	 * last_siginfo, so that we are sure to have synchronized with
 	 * any signal-sending on another CPU that wants to examine it.
-- 
2.10.1